### 2.2 Generative Language Modeling for IR

The success of statistical language models (LMs) in improving automatic speech
recognition (ASR), together with the practical challenges of applying the PRP
model, inspired several IR researchers to recast IR in a generative
probabilistic framework by representing documents as generative probabilistic
models.

The main task of automatic speech recognition is the transcription of spoken utterances.
An effective and theoretically well-founded way of approaching this task is by estimating a
probabilistic model based on the occurrences of word sequences in a particular
language [147, 271]. Such models are distributions over term sequences (or $n$-grams,
where $n$ indicates the length of each sequence) and can be used to compute the
probability of observing a sequence of terms as the product of the (conditional)
probabilities of the individual terms. Then, when a new piece of audio material $A$
needs to be transcribed, each possible interpretation of the observation is
scored against this probabilistic model (the LM) and the most likely candidate
$S^{\ast}$ is returned:
$$S^{\ast} = \underset{S}{\arg\max}\; P(S \mid A) = \underset{S}{\arg\max}\; P(A \mid S)\, P(S),$$

where the language model supplies the prior $P(S)$ over candidate transcriptions.
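As a minimal sketch of this decoding idea, the following toy example estimates a bigram LM from a small corpus and uses it to pick the most likely of two candidate transcriptions. The corpus, the candidates, and the interpolation weight are all hypothetical, and the acoustic score $P(A \mid S)$ is assumed uniform so that the LM alone decides:

```python
import math
from collections import Counter

# Toy corpus; in practice the LM is estimated from a large text collection.
corpus = "the cat sat on the mat the cat ran".split()

# Count statistics for a bigram LM: P(w_i | w_{i-1}) ~ c(w_{i-1}, w_i) / c(w_{i-1}).
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
total = sum(unigrams.values())

def log_prob(sequence, alpha=0.5):
    """Log-probability of a term sequence under the bigram LM, linearly
    interpolated with unigram estimates to avoid zero probabilities
    for bigrams unseen in the corpus."""
    lp = math.log(unigrams[sequence[0]] / total)  # first term: unigram prob
    for prev, cur in zip(sequence, sequence[1:]):
        p_bi = bigrams[(prev, cur)] / unigrams[prev] if unigrams[prev] else 0.0
        p_uni = unigrams[cur] / total
        lp += math.log(alpha * p_bi + (1 - alpha) * p_uni)
    return lp

# Candidate interpretations S of an (imaginary) audio signal A; with a
# uniform acoustic score, arg max over P(S) selects the transcription.
candidates = [["the", "cat", "sat"], ["the", "mat", "sat"]]
best = max(candidates, key=log_prob)
```

Because the bigram "the cat" occurs in the corpus while "the mat sat" contains unseen transitions, the first candidate receives the higher LM score and is returned as $S^{\ast}$.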