|dc.description.abstract||xml has become a standard for representing, storing and exchanging data,
so eectively and eciently retrieving data from xml data sources becomes
increasingly important. In this thesis, we mainly study two types of xml
queries: xml structured query and xml keyword search.
For xml structured query, we focus on twig pattern matching, which lies
in the center of most xml query languages (e.g., XPath, XQuery). Existing
twig pattern matching algorithms can be classied into two-phase
algorithms and one-phase algorithms. We rst propose two novel one-phase
holistic twig matching algorithms, TwigMix and TwigFast, which combine
the ecient selection of useful elements (introduced in TwigStack) with the
simple lists for storing nal solutions (introduced in TwigList). TwigMix
simply introduces the element selection function getNext of TwigStack into
TwigList to avoid manipulation of useless elements in the stack and lists.
TwigFast further improves this by introducing some pointers in the lists to
completely avoid the use of stacks. On the other hand, previous twig pattern
matching algorithms may incur other redundant computation, so we
propose two approaches, namely re-test checking and forward-to-end, which
can reduce the redundant computation and can be easily applied to both
holistic one-phase and two-phase algorithms.
Improving the eectiveness of xml keyword search remains an open problem.
In this thesis, we rst present XKMis, which divides the nodes into
meaningful and self-containing information segments, called minimal information
segments (MISs), and return MIS-subtrees which consist of MISs
that are logically connected by the keywords. The MIS-subtrees are closer
to what the user wants. XReal  utilizes the statistics of underlying data
to resolve keyword ambiguity problems.||en_US