2 Background

This thesis presents novel models and methods to improve information access using various information sources, including relevance assessments, pseudo-relevant documents, (structured) knowledge sources, and Wikipedia. The guiding intuition is that knowledge captured in the concepts of a concept language can be successfully employed to improve information access. This chapter introduces related work and provides the foundation upon which the thesis is built. Related work specific to individual chapters is introduced in those chapters.

We begin this chapter by recalling basic facts about IR, starting with a brief history of the field in Section 2.1. We then take a closer look at Generative Language Modeling for IR in Section 2.2. In Section 2.3 we zoom in on query modeling, a form of query transformation that is used frequently in this thesis. Finally, in Section 2.5 we discuss typical approaches used to link text to concept languages.

We postpone until Chapter 3 a discussion of the evaluation methodology employed in IR in general and in various places in this thesis in particular; that chapter will also introduce the test collections that will be used in later experiments.

2.1 Information Retrieval
2.2 Generative Language Modeling for IR
  2.2.1 Query Likelihood
  2.2.2 KL divergence
  2.2.3 Relation to Probabilistic Approaches
2.3 Query Modeling
  2.3.1 Translation Model
  2.3.2 Relevance Feedback
  2.3.3 Term Dependence Models
2.4 Language Modeling Variations
  2.4.1 Topic Models
  2.4.2 Concept Models
  2.4.3 Cluster-based Language Models
2.5 Linking Free Text to Concepts
  2.5.1 Natural Language Interfaces to Databases
  2.5.2 Ontology Matching
  2.5.3 Ontology Learning, Ontology Population, and Semantic Annotation
2.6 Summary
