Authors: Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze
Chapter 10 - XML Retrieval
Page 178
Relational databases involve searching structured data
Information retrieval (IR) traditionally searches unstructured “raw” text, that is, text without markup or tagging
Table 10.1 summarizes differences in searching structured data vs. unstructured data
XQuery - Page 197 - a good candidate to become the standard for structured queries
Structured data can be represented as structured documents and searched with structured retrieval - good for searching “digital libraries, patent databases, blogs, text with persons and entities tagged… and files from office suites saving as marked up text”
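A minimal sketch of structured retrieval over a structured document, using Python's standard xml.etree.ElementTree. The sample XML, its element names (play, act, scene, title, verse) and the helper find_scenes are assumptions made up for illustration; they are not taken from these notes.

```python
import xml.etree.ElementTree as ET

# Hypothetical structured document; element names are illustrative only.
SAMPLE_XML = """
<play>
  <author>Shakespeare</author>
  <title>Macbeth</title>
  <act number="I">
    <scene number="vii">
      <title>Macbeth's castle</title>
      <verse>Will I with wine and wassail ...</verse>
    </scene>
  </act>
</play>
""".strip()


def find_scenes(root, term):
    """Return (scene number, scene title) for scenes whose title contains term."""
    term = term.lower()
    hits = []
    for scene in root.iter("scene"):
        # Look only at the scene's own <title> child, not the play's title.
        title = scene.findtext("title", default="")
        if term in title.lower():
            hits.append((scene.get("number"), title))
    return hits


root = ET.fromstring(SAMPLE_XML)
print(find_scenes(root, "castle"))
```

The point of the sketch is that the query names a structural unit (a scene title) as well as a term, which plain text search over unstructured text cannot express.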
Structured queries work well for questions that don't work well with unranked retrieval
Boolean queries return lots of results without ranking the most relevant first
Users may not know which elements are structured and can be used in queries (example: country:Vatican)
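The contrast in the last three notes can be sketched in a few lines of Python. Everything here is an assumption for illustration: the three toy documents, the field name country, the function names, and the use of raw term frequency as a stand-in for a real ranking function.

```python
from collections import Counter

# Hypothetical toy collection: each document has one structured field and free text.
DOCS = {
    "d1": {"country": "Italy", "text": "pope visits rome"},
    "d2": {"country": "Vatican", "text": "pope greets pope emeritus"},
    "d3": {"country": "Italy", "text": "rome travel guide"},
}


def tokenize(text):
    return text.lower().split()


def boolean_and(terms):
    """Unranked Boolean AND: an unordered set of every document containing all terms."""
    return {
        doc_id
        for doc_id, doc in DOCS.items()
        if all(t in tokenize(doc["text"]) for t in terms)
    }


def ranked(terms):
    """Ranked retrieval, scored by raw term frequency (a deliberately crude stand-in)."""
    scores = {}
    for doc_id, doc in DOCS.items():
        counts = Counter(tokenize(doc["text"]))
        score = sum(counts[t] for t in terms)
        if score > 0:
            scores[doc_id] = score
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


def field_match(field, value):
    """Fielded query such as country:Vatican; only works if the user knows the field exists."""
    return sorted(d for d, doc in DOCS.items() if doc.get(field) == value)


print(boolean_and(["pope"]))
print(ranked(["pope"]))
print(field_match("country", "Vatican"))
```

The Boolean query returns both matching documents with no ordering, the ranked query puts the document that mentions the term more often first, and the fielded query depends on the user knowing that a country field can be queried at all.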