Bibliography

[1]    Agrawal, R., Gollapudi, S., Halverson, A., and Ieong, S. (2009). Diversifying search results. In WSDM ’09: Proceedings of the Second ACM International Conference on Web Search and Data Mining, pages 5–14, New York, NY, USA. ACM.

[2]    Agrawal, S., Chaudhuri, S., and Das, G. (2002). DBXplorer: A system for keyword-based search over relational databases. In Proceedings of the 18th International Conference on Data Engineering, pages 5–16.

[3]    Aleksovski, Z., Klein, M. C. A., ten Kate, W., and van Harmelen, F. (2006). Matching unstructured vocabularies using a background ontology. In Managing Knowledge in a World of Networks, 15th International Conference, EKAW 2006, pages 182–197.

[4]    Allan, J., Carterette, B., Dachev, B., Aslam, J. A., Pavlu, V., and Kanoulas, E. (2007). Million query track 2007 overview. In E. M. Voorhees and L. P. Buckland, editors, TREC, volume Special Publication 500-274. National Institute of Standards and Technology (NIST).

[5]    Alonso, O. and Mizzaro, S. (2009). Can we get rid of TREC assessors? Using Mechanical Turk for relevance assessment. In SIGIR 2009 Workshop on The Future of IR Evaluation.

[6]    Alonso, O. and Zaragoza, H. (2010). Special issue on semantic annotations in information retrieval. Information Processing and Management, 46(4), 381–382.

[7]    Alonso, O., Rose, D. E., and Stewart, B. (2008). Crowdsourcing for relevance evaluation. SIGIR Forum, 42(2), 9–15.

[8]    Amati, G. and Van Rijsbergen, C. J. (2002). Probabilistic models of information retrieval based on measuring the divergence from randomness. ACM Trans. Inf. Syst., 20(4), 357–389.

[9]    Anick, P. (2003). Using terminological feedback for web search refinement: a log-based study. In SIGIR ’03: Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval, pages 88–95, New York, NY, USA. ACM.

[10]    Anick, P. and Kantamneni, R. G. (2008). A longitudinal study of real-time search assistance adoption. In SIGIR ’08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, pages 701–702.

[11]    Arguello, J., Diaz, F., Callan, J., and Crespo, J.-F. (2009). Sources of evidence for vertical selection. In SIGIR ’09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, pages 315–322, New York, NY, USA. ACM.

[12]    Aronson, A. R. (1994). Exploiting a large thesaurus for information retrieval. In J.-L. Funck-Brentano and F. Seitz, editors, RIAO, pages 197–217. CID.

[13]    Artstein, R. and Poesio, M. (2008). Inter-coder agreement for computational linguistics. Comput. Linguist., 34(4), 555–596.

[14]    Aslam, J. A., Pavlu, V., and Yilmaz, E. (2006). A statistical method for system evaluation using incomplete judgments. In SIGIR ’06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, pages 541–548, New York, NY, USA. ACM.

[15]    Auer, S., Bizer, C., Kobilarov, G., Lehmann, J., Cyganiak, R., and Ives, Z. (2007). DBpedia: A nucleus for a web of open data. In Proceedings of 6th International Semantic Web Conference, 2nd Asian Semantic Web Conference (ISWC+ASWC 2007), pages 722–735.

[16]    Azzopardi, L. and Roelleke, T. (2007). Explicitly considering relevance within the language modeling framework. In ICTIR ’07: Proceedings of the 1st International Conference on Theory of Information Retrieval, pages 125–134.

[17]    Azzopardi, L., Kazai, G., Robertson, S. E., Rüger, S. M., Shokouhi, M., Song, D., and Yilmaz, E., editors (2009). Advances in Information Retrieval Theory, Second International Conference on the Theory of Information Retrieval, ICTIR 2009, volume 5766 of Lecture Notes in Computer Science. Springer.

[18]    Baeza-Yates, R. and Ribeiro-Neto, B. (1999). Modern Information Retrieval. Addison Wesley.

[19]    Baeza-Yates, R., Broder, A., Maarek, Y., and Raghavan, P. (2010). The new frontiers of web search: going beyond the 10 blue links. In D. Harper and P. Schäuble, editors, 33rd Annual ACM SIGIR Conference: SIGIR 2010 Industry Track.

[20]    Bai, J. and Nie, J.-Y. (2008). Adapting information retrieval to query contexts. Information Processing and Management, 44(6), 1901–1922.

[21]    Bai, J., Song, D., Bruza, P., Nie, J.-Y., and Cao, G. (2005). Query expansion using term relationships in language models for information retrieval. In CIKM ’05: Proceedings of the 14th ACM international conference on Information and knowledge management, pages 688–695, New York, NY, USA. ACM Press.

[22]    Bailey, P., Craswell, N., White, R., Chen, L., Satyanarayana, A., and Tahaghoghi, S. M. M. (2010). Evaluating search systems using result page context. In IIIX ’10: Proceedings of the fourth international symposium on Information interaction in context.

[23]    Balog, K. (2008). People Search in the Enterprise. Ph.D. thesis, University of Amsterdam.

[24]    Balog, K., Weerkamp, W., and de Rijke, M. (2008). A few examples go a long way: constructing query models from elaborate query formulations. In SIGIR ’08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, pages 371–378, New York, NY, USA. ACM.

[25]    Balog, K., de Vries, A. P., Serdyukov, P., Thomas, P., and Westerveld, T. (2009). Overview of the TREC 2009 entity track. In [331].

[26]    Balog, K., Meij, E., and de Rijke, M. (2010). Entity search: Building bridges between two worlds. In Proceedings of the Workshop on Semantic Search (SemSearch 2010) at the 19th International World Wide Web Conference (WWW 2010).

[27]    Banko, M., Cafarella, M. J., Soderland, S., Broadhead, M., and Etzioni, O. (2007). Open information extraction from the web. In M. M. Veloso, editor, IJCAI, pages 2670–2676.

[28]    Beitzel, S. M., Jensen, E. C., Lewis, D. D., Chowdhury, A., and Frieder, O. (2007). Automatic classification of web queries using very large unlabeled query logs. ACM Trans. Inf. Syst., 25(2), 9.

[29]    Bendersky, M. and Croft, W. B. (2008). Discovering key concepts in verbose queries. In SIGIR ’08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, pages 491–498, New York, NY, USA. ACM.

[30]    Bennett, G., Scholer, F., and Uitdenbogerd, A. (2007). A comparative study of probabilistic and language models for information retrieval. In ADC ’08: Proceedings of the nineteenth conference on Australasian database, pages 65–74, Darlinghurst, Australia. Australian Computer Society, Inc.

[31]    Berger, A. and Lafferty, J. (1999). Information retrieval as statistical translation. In SIGIR ’99: Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, pages 222–229, New York, NY, USA. ACM.

[32]    Berners-Lee, T. (2009). Linked Data – Design Issues. http://www.w3.org/DesignIssues/LinkedData.html [Online; accessed August 2010].

[33]    Berners-Lee, T., Hendler, J., and Lassila, O. (2001). The semantic web. Scientific American.

[34]    Bhalotia, G., Hulgeri, A., Nakhe, C., Chakrabarti, S., and Sudarshan, S. (2002). Keyword searching and browsing in databases using BANKS. In Proceedings of the 18th International Conference on Data Engineering, pages 431–440.

[35]    Bhogal, J., Macfarlane, A., and Smith, P. (2007). A review of ontology based query expansion. Information Processing and Management, 43(4), 866–886.

[36]    Bizer, C., Heath, T., Idehen, K., and Berners-Lee, T. (2008). Linked data on the web. In WWW ’08: Proceeding of the 17th international conference on World Wide Web, pages 1265–1266.

[37]    Bizer, C., Heath, T., and Berners-Lee, T. (2009). Linked data - the story so far. International Journal on Semantic Web and Information Systems (IJSWIS), 5(3), 1–22.

[38]    Blair, D. C. (2003). Information retrieval and the philosophy of language. Annual Review of Information Science and Technology, 37, 3–50.

[39]    Blei, D. M. and McAuliffe, J. D. (2007). Supervised topic models. In Advances in Neural Information Processing Systems 21.

[40]    Blei, D. M., Ng, A. Y., and Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993–1022.

[41]    Blocks, D., Binding, C., Cunliffe, D., and Tudhope, D. (2002). Qualitative evaluation of thesaurus-based retrieval. In ECDL ’02: Proceedings of the 6th European Conference on Research and Advanced Technology for Digital Libraries, pages 346–361.

[42]    Bordino, I., Castillo, C., Donato, D., and Gionis, A. (2010). Query similarity by projecting the query-flow graph. In SIGIR ’10: Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval, pages 515–522, New York, NY, USA. ACM.

[43]    Boscarino, C. and de Vries, A. P. (2009). Prior information and the determination of event spaces in probabilistic information retrieval models. In [17], pages 257–264.

[44]    Boyd-Graber, J. L., Blei, D. M., and Zhu, X. (2007). A topic model for word sense disambiguation. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 1024–1033.

[45]    Brin, S. and Page, L. (1998). The anatomy of a large-scale hypertextual web search engine. Comput. Netw. ISDN Syst., 30(1-7), 107–117.

[46]    Broder, A. (2002). A taxonomy of web search. SIGIR Forum, 36(2), 3–10.

[47]    Broder, A. Z., Fontoura, M., Gabrilovich, E., Joshi, A., Josifovski, V., and Zhang, T. (2007). Robust classification of rare queries using web knowledge. In SIGIR ’07.

[48]    Buckley, C. and Robertson, S. (2008). Relevance feedback track overview: TREC 2008. In [331].

[49]    Buckley, C. and Voorhees, E. M. (2000). Evaluating evaluation measure stability. In SIGIR ’00: Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval, pages 33–40, New York, NY, USA. ACM.

[50]    Buckley, C. and Voorhees, E. M. (2004). Retrieval evaluation with incomplete information. In SIGIR ’04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, pages 25–32, New York, NY, USA. ACM.

[51]    Buckley, C., Salton, G., and Allan, J. (1994). The effect of adding relevance information in a relevance feedback environment. In SIGIR ’94: Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval, pages 292–300, New York, NY, USA. Springer-Verlag New York, Inc.

[52]    Buckley, C., Dimmick, D., Soboroff, I., and Voorhees, E. (2007). Bias and the limits of pooling for large collections. Information Retrieval, 10(6), 491–508.

[53]    Buitelaar, P., Cimiano, P., and Magnini, B. (2005). Ontology Learning from Text: Methods, Evaluation and Applications. IOS Press.

[54]    Burges, C. J. C., Ragno, R., and Le, Q. V. (2006). Learning to rank with nonsmooth cost functions. In B. Schölkopf, J. C. Platt, and T. Hoffman, editors, NIPS, pages 193–200. MIT Press.

[55]    Büttcher, S., Clarke, C. L. A., and Soboroff, I. (2006). The TREC 2006 terabyte track. In [330].

[56]    Camous, F., Blott, S., and Smeaton, A. F. (2006). On combining MeSH and text searches to improve the retrieval of Medline documents. In Proceedings of the Third Conference en Recherche d’Informations et Applications (CORIA).

[57]    Cao, G., Nie, J.-Y., and Bai, J. (2005). Integrating word relationships into language models. In SIGIR ’05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, pages 298–305, New York, NY, USA. ACM.

[58]    Cao, G., Nie, J.-Y., Gao, J., and Robertson, S. (2008). Selecting good expansion terms for pseudo-relevance feedback. In SIGIR ’08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, pages 243–250, New York, NY, USA. ACM.

[59]    Caracciolo, C., Euzenat, J., Hollink, L., Ichise, R., Isaac, A., Malaisé, V., Meilicke, C., Pane, J., Shvaiko, P., Stuckenschmidt, H., Šváb, O., and Svátek, V. (2008). Results of the ontology alignment evaluation initiative 2008. In The Third International Workshop on Ontology Matching at ISWC, pages 73–120.

[60]    Carpineto, C., de Mori, R., Romano, G., and Bigi, B. (2001). An information-theoretic approach to automatic query expansion. ACM Trans. Inf. Syst., 19(1), 1–27.

[61]    Carterette, B., Allan, J., and Sitaraman, R. (2006). Minimal test collections for retrieval evaluation. In SIGIR ’06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, pages 268–275, New York, NY, USA. ACM.

[62]    Carterette, B., Pavlu, V., Kanoulas, E., Aslam, J. A., and Allan, J. (2008). Evaluation over thousands of queries. In SIGIR ’08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, pages 651–658, New York, NY, USA. ACM.

[63]    Chelba, C. and Jelinek, F. (1998). Exploiting syntactic structure for language modeling. In ACL-36: Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, pages 225–231.

[64]    Chemudugunta, C., Holloway, A., Smyth, P., and Steyvers, M. (2008). Modeling documents by combining semantic concepts with unsupervised statistical learning. In ISWC ’08: Proceedings of the 7th International Semantic Web Conference, pages 229–244.

[65]    Chen, S. F. and Goodman, J. (1996). An empirical study of smoothing techniques for language modeling. In Proceedings of the 34th annual meeting on Association for Computational Linguistics, pages 310–318, Morristown, NJ, USA. Association for Computational Linguistics.

[66]    Chen, Y., Xue, G.-R., and Yu, Y. (2008). Advertising keyword suggestion based on concept hierarchy. In WSDM ’08: Proceedings of the international conference on Web search and web data mining, pages 251–260, New York, NY, USA. ACM.

[67]    Chung, Y. (2004). Optimization of some factors affecting the performance of query expansion. Information Processing and Management, 40(6), 891–917.

[68]    Church, K. W. and Gale, W. A. (1995). Inverse document frequency (IDF): A measure of deviations from Poisson. In Proc. Third Workshop on Very Large Corpora, pages 121–130.

[69]    Cimiano, P., Schultz, A., Sizov, S., Sorg, P., and Staab, S. (2009). Explicit versus latent concept models for cross-language information retrieval. In IJCAI’09: Proceedings of the 21st international joint conference on Artificial intelligence, pages 1513–1518, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.

[70]    Clarke, C., Cormack, G., Lynam, T., Buckley, C., and Harman, D. (2009). Swapping documents and terms. Information Retrieval, 12(6), 680–694.

[71]    Clarke, C. L., Kolla, M., Cormack, G. V., Vechtomova, O., Ashkan, A., Büttcher, S., and MacKinnon, I. (2008). Novelty and diversity in information retrieval evaluation. In SIGIR ’08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, pages 659–666, New York, NY, USA. ACM.

[72]    Clarke, C. L. A., Craswell, N., and Soboroff, I. (2010). Overview of the TREC 2009 web track. In [331].

[73]    Clements, M., de Vries, A. P., and Reinders, M. J. T. (2010). The influence of personalization on tag query length in social media search. Information Processing and Management, 46(4), 403–412.

[74]    Cleverdon, C. W. (1966). The effect of variations in relevance assessments in comparative experimental tests of index languages. Technical Report 3, Cranfield Institute of Technology, UK.

[75]    Cleverdon, C. W., Mills, J., and Keen, M. (1966). Factors determining the performance of indexing systems. In ASLIB Cranfield project, Cranfield.

[76]    Clough, P., Müller, H., Deselaers, T., Grubinger, M., Lehmann, T., Jensen, J., and Hersh, W. (2005). The CLEF 2005 Cross-Language Image Retrieval Track. In CLEF 2005 Working Notes.

[77]    Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37–46.

[78]    Cool, C., Belkin, N., Frieder, O., and Kantor, P. (1993). Characteristics of texts affecting relevance judgments. In Proceedings of the 14th National Online Meeting, pages 77–84.

[79]    Cooper, W. S. (1973). On selecting a measure of retrieval effectiveness. Part I. The “subjective” philosophy of evaluation; Part II. Implementation of the philosophy. Journal of the American Society for Information Science, 24, 87–100; 413–424.

[80]    Coursey, K., Mihalcea, R., and Moen, W. (2009). Using encyclopedic knowledge for automatic topic identification. In CoNLL ’09: Proceedings of the Thirteenth Conference on Computational Natural Language Learning, pages 210–218.

[81]    Croft, W. B. and Harper, D. J. (1979). Using probabilistic models of document retrieval without relevance information. Journal of Documentation, 35(4), 285–295.

[82]    Croft, W. B. and Lafferty, J., editors (2003). Language Modeling for Information Retrieval, volume 1. Kluwer.

[83]    Croft, W. B., Callan, J., and Lafferty, J. (2001). Workshop on language modeling and information retrieval. SIGIR Forum, 35(1), 4–6.

[84]    Cronen-Townsend, S., Zhou, Y., and Croft, W. B. (2002). Predicting query performance. In SIGIR ’02: Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval, pages 299–306, New York, NY, USA. ACM.

[85]    Cui, H., Wen, J.-R., Nie, J.-Y., and Ma, W.-Y. (2002). Probabilistic query expansion using query logs. In WWW ’02: Proceedings of the 11th international conference on World Wide Web, pages 325–332.

[86]    Dang, V. and Croft, W. B. (2010). Query reformulation using anchor text. In WSDM ’10: Proceedings of the third ACM international conference on Web search and data mining, pages 41–50, New York, NY, USA. ACM.

[87]    de Vries, A. P., Vercoustre, A.-M., Thom, J. A., Craswell, N., and Lalmas, M. (2007). Overview of the INEX 2007 Entity Ranking Track. In Focused Access to XML Documents, 6th International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX, pages 245–251.

[88]    Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., and Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6), 391–407.

[89]    Demidova, E., Fankhauser, P., Zhou, X., and Nejdl, W. (2010). DivQ: diversification for keyword search over structured databases. In SIGIR ’10: Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval, pages 331–338.

[90]    Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 39(1), 1–38.

[91]    Deselaers, T., Weyand, T., Keysers, D., Macherey, W., and Ney, H. (2005). FIRE in ImageCLEF 2005: Combining Content-based Image Retrieval with Textual Information Retrieval. In CLEF 2005 Working Notes.

[92]    Diaz, F. and Metzler, D. (2006). Improving the estimation of relevance models using large external corpora. In SIGIR ’06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, pages 154–161, New York, NY, USA. ACM Press.

[93]    Diemert, E. and Vandelle, G. (2009). Unsupervised query categorization using automatically-built concept graphs. In WWW ’09: Proceedings of the 18th international conference on World wide web, pages 461–470, New York, NY, USA. ACM.

[94]    Dill, S., Eiron, N., Gibson, D., Gruhl, D., Guha, R., Jhingran, A., Kanungo, T., Rajagopalan, S., Tomkins, A., Tomlin, J., and Zien, J. (2003). SemTag and Seeker: Bootstrapping the semantic web via automated semantic annotation. In Proceedings of the 12th international conference on World Wide Web, pages 178–186.

[95]    Doyle, L. (1962). Indexing and abstracting by association. American Documentation, 13(4), 378–390.

[96]    Efron, M. (2010). Hashtag retrieval in a microblogging environment. In SIGIR ’10: Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval, pages 787–788.

[97]    Efthimiadis, E. N. (1996). Query expansion. Annual Review of Information Science and Technology (ARIST), 31, 121–187.

[98]    Eguchi, K. and Croft, W. B. (2006). Boosting relevance model performance with query term dependence. In CIKM ’06: Proceedings of the 15th ACM international conference on Information and knowledge management, pages 792–793, New York, NY, USA. ACM.

[99]    Elbassuoni, S., Ramanath, M., Schenkel, R., Sydow, M., and Weikum, G. (2009). Language-model-based ranking for queries on RDF-graphs. In CIKM ’09: Proceeding of the 18th ACM conference on Information and knowledge management, pages 977–986. ACM.

[100]    Fang, H., Tao, T., and Zhai, C. (2004). A formal study of information retrieval heuristics. In SIGIR ’04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, pages 49–56, New York, NY, USA. ACM.

[101]    Fellbaum, C., Palmer, M., Dang, H. T., Delfs, L., and Wolf, S. (2001). Manual and automatic semantic annotation with WordNet. In WordNet and Other Lexical Resources, pages 3–10.

[102]    Finkelstein, L., Gabrilovich, E., Matias, Y., Rivlin, E., Solan, Z., Wolfman, G., and Ruppin, E. (2002). Placing search in context: the concept revisited. ACM Transactions on Information Systems, 20(1), 116–131.

[103]    Fortuna, B., Grobelnik, M., and Mladenic, D. (2007). OntoGen: semi-automatic ontology editor. In Proceedings of the 2007 conference on Human interface, pages 309–318.

[104]    French, J. C., Powell, A. L., Gey, F., and Perelman, N. (2002). Exploiting manual indexing to improve collection selection and retrieval effectiveness. Information Retrieval, 5(4), 323–351.

[105]    Furnas, G. W., Landauer, T. K., Gomez, L. M., and Dumais, S. T. (1987). The vocabulary problem in human-system communication. Commun. ACM, 30(11), 964–971.

[106]    Gabrilovich, E. and Markovitch, S. (2007). Computing semantic relatedness using Wikipedia-based explicit semantic analysis. In IJCAI’07: Proceedings of the 20th international joint conference on Artificial intelligence, pages 1606–1611, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.

[107]    Gabrilovich, E. and Markovitch, S. (2009). Wikipedia-based semantic interpretation for natural language processing. J. Artif. Intell. Res. (JAIR), 34, 443–498.

[108]    Gao, J., Qi, H., Xia, X., and Nie, J.-Y. (2005). Linear discriminant model for information retrieval. In SIGIR ’05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, pages 290–297, New York, NY, USA. ACM.

[109]    Gey, F., Buckland, M., Chen, A., and Larson, R. (2001). Entry vocabulary: a technology to enhance digital search. In HLT ’01: Proceedings of the first international conference on Human language technology research, pages 1–5, Morristown, NJ, USA. Association for Computational Linguistics.

[110]    Ghani, R., Jones, R., Mladenic, D., Nigam, K., and Slattery, S. (2000). Data mining on symbolic knowledge extracted from the web. In Proceedings of the Sixth International Conference on Knowledge Discovery and Data Mining (KDD-2000), Workshop on Text Mining.

[111]    Giger, H. P. (1988). Concept based retrieval in classical IR systems. In SIGIR ’88: Proceedings of the 11th annual international ACM SIGIR conference on Research and development in information retrieval, pages 275–289, New York, NY, USA. ACM.

[112]    Girolami, M. and Kaban, A. (2003). On an equivalence between PLSI and LDA. In SIGIR ’03: Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval, pages 433–434, New York, NY, USA. ACM Press.

[113]    Google (2010). Google search basics: Advanced Search. http://www.google.com/support/websearch/bin/answer.py?answer=35890&&hl=en [Online; accessed August 2010].

[114]    Gray, A. J. G., Gray, N., Hall, C. W., and Ounis, I. (2010). Finding the right term: Retrieving and exploring semantic concepts in astronomical vocabularies. Information Processing and Management, 46(4), 470–478.

[115]    Greiff, W. R. (2001). Is it the language model in language modeling? In J. Callan, W. B. Croft, and J. Lafferty, editors, Workshop on Language Modeling and Information Retrieval.

[116]    Grineva, M., Grinev, M., and Lizorkin, D. (2009). Extracting key terms from noisy and multitheme documents. In WWW ’09: Proceedings of the 18th international conference on World wide web, pages 661–670.

[117]    Guo, J., Xu, G., Cheng, X., and Li, H. (2009). Named entity recognition in query. In SIGIR ’09: 32nd annual international ACM SIGIR conference on Research and development in information retrieval, pages 267–274.

[118]    Hagen, M., Potthast, M., Stein, B., and Braeutigam, C. (2010). The power of naive query segmentation. In SIGIR ’10: Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval, pages 797–798.

[119]    Harman, D. (1988). Towards interactive query expansion. In SIGIR ’88: Proceedings of the 11th annual international ACM SIGIR conference on Research and development in information retrieval, pages 321–331, New York, NY, USA. ACM.

[120]    Harman, D. (1992). Evaluation issues in information retrieval. Information Processing and Management, 28(4), 439–440.

[121]    Harman, D. (1993). Overview of the First Text REtrieval Conference. In R. Korfhage, E. M. Rasmussen, and P. Willett, editors, SIGIR, pages 36–47, Pittsburgh, PA. ACM.

[122]    Harter, S. (1975). A probabilistic approach to automatic keyword indexing. Journal of the American Society for Information Science, 26(5).

[123]    Hayes, A. and Krippendorff, K. (2007). Answering the call for a standard reliability measure for coding data. Communication Methods and Measures, 1(1), 77–89.

[124]    He, B. and Ounis, I. (2009a). Finding good feedback documents. In CIKM ’09: Proceeding of the 18th ACM conference on Information and knowledge management, pages 2011–2014, New York, NY, USA. ACM.

[125]    He, B. and Ounis, I. (2009b). Studying query expansion effectiveness. In ECIR ’09: Proceedings of the 31st European Conference on IR Research on Advances in Information Retrieval, pages 611–619, Berlin, Heidelberg. Springer-Verlag.

[126]    He, J., Meij, E., and de Rijke, M. (In Press, Accepted Manuscript). Result diversification based on query-specific cluster ranking. Journal of the American Society for Information Science and Technology.

[127]    Hersh, W. R., Hickam, D., and Leone, T. (1992). Words, concepts, or both: Optimal indexing units for automated information retrieval. In Proc. 16th Annu. Symp. Comput. Appl. Med. Care, pages 644–848.

[128]    Hersh, W. R., Hickam, D. H., Haynes, R. B., and McKibbon, K. A. (1994). A performance and failure analysis of SAPHIRE with a MEDLINE test collection. Journal of the American Medical Informatics Association: JAMIA, 1(1), 51–60.

[129]    Hersh, W. R., Bhupatiraju, R. T., Ross, L., Cohen, A. M., Kraemer, D., and Johnson, P. (2004). TREC 2004 genomics track overview. In E. M. Voorhees and L. P. Buckland, editors, TREC, volume Special Publication 500-261. National Institute of Standards and Technology (NIST).

[130]    Hersh, W. R., Cohen, A. M., Yang, J., Bhupatiraju, R. T., Roberts, P. M., and Hearst, M. A. (2005). TREC 2005 genomics track overview. In E. M. Voorhees and L. P. Buckland, editors, TREC, volume Special Publication 500-266. National Institute of Standards and Technology (NIST).

[131]    Hersh, W. R., Cohen, A. M., Roberts, P. M., and Rekapalli, H. K. (2006). TREC 2006 genomics track overview. In [330].

[132]    Herskovic, J. R., Tanaka, L. Y., Hersh, W., and Bernstam, E. V. (2007). A day in the life of PubMed: analysis of a typical day’s query log. J Am Med Inform Assoc, 14(2), 212–220.

[133]    Hewins, E. T. (1990). Information need and use studies. Annual Review of Information Science and Technology, 25, 145–172.

[134]    Hiemstra, D. (1998). A linguistically motivated probabilistic model of information retrieval. In ECDL ’98: Proceedings of the Second European Conference on Research and Advanced Technology for Digital Libraries, pages 569–584, London, UK. Springer-Verlag.

[135]    Hiemstra, D. and de Vries, A. P. (2000). Relating the new language models of information retrieval to the traditional retrieval models. Technical Report TR-CTIT-00-0, Centre for Telematics and Information Technology, University of Twente.

[136]    Hiemstra, D., Robertson, S., and Zaragoza, H. (2004). Parsimonious language models for information retrieval. In SIGIR ’04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, pages 178–185, New York, NY, USA. ACM.

[137]    Hoenkamp, E., Bruza, P., Song, D., and Huang, Q. (2009). An effective approach to verbose queries using a limited dependencies language model. In [17], pages 116–127.

[138]    Hofmann, K., Tsagkias, M., Meij, E., and de Rijke, M. (2009). The impact of document structure on keyphrase extraction. In CIKM ’09: Proceeding of the 18th ACM conference on Information and knowledge management, pages 1725–1728, New York, NY, USA. ACM.

[139]    Hofmann, T. (1999). Probabilistic latent semantic indexing. In SIGIR ’99: Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, pages 50–57. ACM Press.

[140]    Hristidis, V. and Papakonstantinou, Y. (2002). Discover: Keyword search in relational databases. In VLDB, pages 670–681. Morgan Kaufmann.

[141]    Hull, D. (1993). Using statistical testing in the evaluation of retrieval experiments. In SIGIR ’93: Proceedings of the 16th annual international ACM SIGIR conference on Research and development in information retrieval.

[142]    Huurnink, B., Hollink, L., van den Heuvel, W., and de Rijke, M. (2010). Search behavior of media professionals at an audiovisual archive: A transaction log analysis. Journal of the American Society for Information Science and Technology, 61(6), 1180–1197.

[143]    Jansen, B. J. and Spink, A. (2006). How are we searching the world wide web? A comparison of nine search engine transaction logs. Information Processing and Management, 42(1), 248–263. Formal Methods for Information Retrieval.

[144]    Jansen, B. J., Spink, A., and Saracevic, T. (2000). Real life, real users, and real needs: a study and analysis of user queries on the web. Information Processing and Management, 36(2), 207–227.

[145]    Jardine, N. and van Rijsbergen, C. J. (1971). The use of hierarchic clustering in information retrieval. Information Storage and Retrieval, 7(5), 217–240.

[146]    Järvelin, K. and Kekäläinen, J. (2002). Cumulated gain-based evaluation of IR techniques. ACM Trans. Inf. Syst., 20(4), 422–446.

[147]    Jelinek, F. (1990). Self-organized language modeling for speech recognition. Readings in speech recognition, pages 450–506.

[148]    Jelinek, F. and Mercer, R. L. (1980). Interpolated estimation of Markov source parameters from sparse data. In Workshop Pattern Recognition in Practice.

[149]    Jimeno-Yepes, A., Berlanga-Llavori, R., and Rebholz-Schuhmann, D. (2010). Ontology refinement for improved information retrieval. Information Processing and Management, 46(4), 426–435.

[150]    Jin, R., Hauptmann, A. G., and Zhai, C. X. (2002). Title language model for information retrieval. In SIGIR ’02: Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval.

[151]    Jing, Y. and Croft, W. B. (1994). An association thesaurus for information retrieval. In Proceedings of RIAO ’94.

[152]    Joachims, T. (2002). Optimizing search engines using clickthrough data. In KDD ’02: Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 133–142, New York, NY, USA. ACM Press.

[153]    Joachims, T., Granka, L., Pan, B., Hembrooke, H., Radlinski, F., and Gay, G. (2007). Evaluating the accuracy of implicit feedback from clicks and query reformulations in web search. ACM Trans. Inf. Syst., 25(2), 7.

[154]    John, G. H. and Langley, P. (1995). Estimating continuous distributions in Bayesian classifiers. In UAI ’95: Proceedings of the Eleventh Annual Conference on Uncertainty in Artificial Intelligence, pages 338–345.

[155]    Jones, K. S. (2004). A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation, 60(5), 493–502.

[156]    Jones, K. S., Walker, S., and Robertson, S. E. (2000). A probabilistic model of information retrieval: development and comparative experiments. Information Processing and Management, 36(6), 779–808.

[157]    Joyce, T. and Needham, R. M. (1958). The thesaurus approach to information retrieval. American Documentation, 9(3), 192–197.

[158]    Kalt, T. (1996). A new probabilistic model of text classification and retrieval. Technical Report UM-CS-1998-018, University of Massachusetts, Amherst, Massachusetts.

[159]    Kamps, J., Lalmas, M., and Larsen, B. (2009). Evaluation in context. In M. Agosti, J. L. Borbinha, S. Kapidakis, C. Papatheodorou, and G. Tsakonas, editors, ECDL, volume 5714 of Lecture Notes in Computer Science, pages 339–351. Springer.

[160]    Kasneci, G., Suchanek, F. M., Ifrim, G., Ramanath, M., and Weikum, G. (2008). Naga: Searching and ranking knowledge. In ICDE, pages 953–962. IEEE.

[161]    Kaufmann, E. and Bernstein, A. (In Press, Accepted Manuscript). Evaluating the usability of natural language query languages and interfaces to semantic web knowledge bases. Web Semantics: Science, Services and Agents on the World Wide Web.

[162]    Kelly, D. and Belkin, N. J. (2001). Reading time, scrolling and interaction: exploring implicit sources of user preferences for relevance feedback. In SIGIR ’01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, pages 408–409.

[163]    Kelly, D., Fu, X., and Shah, C. (2010). Effects of position and number of relevant documents retrieved on users’ evaluations of system performance. ACM Trans. Inf. Syst., 28(2), 1–29.

[164]    Kent, A., Berry, M. M., Luehrs, F. U., and Perry, J. W. (1955). Machine literature searching VIII, operational criteria for designing information retrieval systems. American Documentation, 6(2), 93–101.

[165]    Keskustalo, H., Järvelin, K., and Pirkola, A. (2008). Evaluating the effectiveness of relevance feedback based on a user simulation model: effects of a user scenario on cumulated gain value. Information Retrieval, 11(3), 209–228.

[166]    Kiryakov, A., Popov, B., Terziev, I., Manov, D., and Ognyanoff, D. (2004). Semantic annotation, indexing, and retrieval. Web Semantics: Science, Services and Agents on the World Wide Web, 2(1), 49–79.

[167]    Koolen, M. and Kamps, J. (2010). The importance of anchor text for ad hoc search revisited. In SIGIR ’10: Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval, pages 122–129, New York, NY, USA. ACM.

[168]    Korfhage, R. R. (1984). Query enhancement by user profiles. In SIGIR ’84: Proceedings of the 7th annual international ACM SIGIR conference on Research and development in information retrieval, pages 111–121, Swinton, UK. British Computer Society.

[169]    Kraaij, W. and de Jong, F. (2004). Transitive probabilistic CLIR models. In Proceedings of RIAO ’04.

[170]    Kurland, O. (2008). The opposite of smoothing: a language model approach to ranking query-specific document clusters. In SIGIR ’08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, pages 171–178, New York, NY, USA. ACM.

[171]    Kurland, O. and Lee, L. (2004). Corpus structure, language models, and ad hoc information retrieval. In SIGIR ’04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, pages 194–201, New York, NY, USA. ACM.

[172]    Kurland, O., Lee, L., and Domshlak, C. (2005). Better than the real thing?: iterative pseudo-query processing using cluster-based language models. In SIGIR ’05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, pages 19–26, New York, NY, USA. ACM.

[173]    Lafferty, J. and Zhai, C. (2001). Document language models, query models, and risk minimization for information retrieval. In SIGIR ’01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, pages 111–119, New York, NY, USA. ACM.

[174]    Lafferty, J. and Zhai, C. (2003a). Probabilistic relevance models based on document and query generation. Language Modeling for Information Retrieval.

[175]    Lafferty, J. and Zhai, C. (2003b). Probabilistic relevance models based on document and query generation. In Language Modeling for Information Retrieval. Springer.

[176]    Lalmas, M., MacFarlane, A., Rüger, S. M., Tombros, A., Tsikrika, T., and Yavlinsky, A., editors (2006). Advances in Information Retrieval, 28th European Conference on IR Research, ECIR 2006, London, UK, April 10-12, 2006, Proceedings, volume 3936 of Lecture Notes in Computer Science. Springer.

[177]    Lancaster, F. W. (1969). MEDLARS: report on the evaluation of its operating efficiency. American Documentation, 20(2), 119–148.

[178]    Lancaster, F. W. (1982). Information Retrieval Systems: Characteristics, Testing and Evaluation. Wiley Interscience.

[179]    Landis, R. J. and Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159–174.

[180]    Lavrenko, V. (2004). A Generative Theory of Relevance. Ph.D. thesis, University of Massachusetts.

[181]    Lavrenko, V. (2008). A Generative Theory of Relevance. Springer Publishing Company, Incorporated.

[182]    Lavrenko, V. and Croft, W. B. (2003). Relevance models in information retrieval. In [82], pages 11–54.

[183]    Lavrenko, V. and Croft, W. B. (2001). Relevance based language models. In SIGIR ’01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, pages 120–127, New York, NY, USA. ACM.

[184]    Lease, M., Allan, J., and Croft, W. B. (2009). Regression rank: Learning to meet the opportunity of descriptive queries. In M. Boughanem, C. Berrut, J. Mothe, and C. Soulé-Dupuy, editors, ECIR, volume 5478 of Lecture Notes in Computer Science, pages 90–101. Springer.

[185]    Lee, K. S., Croft, W. B., and Allan, J. (2008). A cluster-based resampling method for pseudo-relevance feedback. In SIGIR ’08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, pages 235–242, New York, NY, USA. ACM.

[186]    Lesk, M. and Salton, G. (1968). Relevance assessments and retrieval system evaluation. Information Storage and Retrieval, 4, 343–359.

[187]    Lewis, D. D. (1998). Naive (Bayes) at forty: The independence assumption in information retrieval. In ECML ’98: Proceedings of the 10th European Conference on Machine Learning, pages 4–15.

[188]    Li, X. (2008). A new robust relevance model in the language model framework. Information Processing and Management, 44(3), 991–1007.

[189]    Liu, X. and Croft, W. B. (2004). Cluster-based retrieval using language models. In SIGIR ’04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, pages 186–193, New York, NY, USA. ACM.

[190]    Liu, Y.-H. (2009). The impact of MeSH (Medical Subject Headings) terms on information seeking effectiveness. Ph.D. thesis, Rutgers, The State University of New Jersey.

[191]    Losada, D. and Azzopardi, L. (2008a). An analysis on document length retrieval trends in language modeling smoothing. Information Retrieval, 11(2), 109–138.

[192]    Losada, D. E. and Azzopardi, L. (2008b). Assessing multivariate Bernoulli models for information retrieval. ACM Trans. Inf. Syst., 26(3), 1–46.

[193]    Lu, Y., Mei, Q., and Zhai, C. (2010). Investigating task performance of probabilistic topic models: an empirical study of PLSA and LDA. Information Retrieval, pages 1–26.

[194]    Luhn, H. P. (1961). The automatic derivation of information retrieval encodements from machine-readable texts. Information Retrieval and Machine Translation, 3(1), 1021–1028.

[195]    Luk, R. W. (2008). On event space and rank equivalence between probabilistic retrieval models. Information Retrieval, 11(6), 539–561.

[196]    Lundquist, C., Grossman, D. A., and Frieder, O. (1997). Improving relevance feedback in the vector space model. In CIKM ’97: Proceedings of the sixth international conference on Information and knowledge management, pages 16–23, New York, NY, USA. ACM.

[197]    Lv, Y. and Zhai, C. (2009). A comparative study of methods for estimating query language models with pseudo feedback. In CIKM ’09: Proceeding of the 18th ACM conference on Information and knowledge management, pages 1895–1898, New York, NY, USA. ACM.

[198]    Madsen, R. E., Kauchak, D., and Elkan, C. (2005). Modeling word burstiness using the Dirichlet distribution. In ICML ’05: Proceedings of the 22nd international conference on Machine learning, pages 545–552, New York, NY, USA. ACM.

[199]    Maedche, A. and Volz, R. (2001). The ontology extraction maintenance framework Text-To-Onto. Proceedings of the IEEE International Conference on Data Mining.

[200]    Malaisé, V., Gazendam, L., and Brugman, H. (2007). Disambiguating automatic semantic annotation based on a thesaurus structure. TALN 2007: Actes de la 14e conférence sur le Traitement Automatique des Langues Naturelles.

[201]    Manning, C. D. and Schütze, H. (1999). Foundations of Statistical Natural Language Processing. The MIT Press.

[202]    Manning, C. D., Raghavan, P., and Schütze, H. (2008). Introduction to Information Retrieval. Cambridge University Press.

[203]    Maron, M. E. and Kuhns, J. L. (1960). On relevance, probabilistic indexing and information retrieval. J. ACM, 7(3), 216–244.

[204]    McCallum, A. and Nigam, K. (1998). A comparison of event models for naive Bayes text classification. In Proc. AAAI-98 Workshop on Learning for Text Categorization, pages 41–48.

[205]    Medelyan, O., Milne, D., Legg, C., and Witten, I. H. (2009). Mining meaning from Wikipedia. International Journal of Human-Computer Studies, 67(9), 716–754.

[206]    Mei, Q., Zhang, D., and Zhai, C. (2008). A general optimization framework for smoothing language models on graph structures. In SIGIR ’08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, pages 611–618, New York, NY, USA. ACM.

[207]    Meij, E. (2008). Towards a combined model for search and navigation of annotated documents. In SIGIR ’08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, page 898, New York, NY, USA. ACM.

[208]    Meij, E. and de Rijke, M. (2007a). Integrating Conceptual Knowledge into Relevance Models: A Model and Estimation Method. In ICTIR ’07: Proceedings of the 1st International Conference on Theory of Information Retrieval.

[209]    Meij, E. and de Rijke, M. (2007b). Thesaurus-based feedback to support mixed search and browsing environments. In L. Kovács, N. Fuhr, and C. Meghini, editors, ECDL, volume 4675 of Lecture Notes in Computer Science, pages 247–258. Springer.

[210]    Meij, E. and de Rijke, M. (2007c). Using prior information derived from citations in literature search. In D. Evans, S. Furui, and C. Soulé-Dupuy, editors, RIAO. CID.

[211]    Meij, E. and de Rijke, M. (2008). The University of Amsterdam at the CLEF 2008 Domain Specific Track - parsimonious relevance and concept models. In CLEF ’08 Working Notes.

[212]    Meij, E. and de Rijke, M. (2009). Concept models for domain-specific search. In CLEF’08: Proceedings of the 9th Cross-language evaluation forum conference on Evaluating systems for multilingual and multimodal information access, pages 207–214, Berlin, Heidelberg. Springer-Verlag.

[213]    Meij, E. and de Rijke, M. (2010). Supervised query modeling using Wikipedia. In SIGIR ’10: Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval, pages 875–876, New York, NY, USA. ACM.

[214]    Meij, E. and de Rijke, M. (Submitted). A comparative study of relevance feedback methods for query modeling. Information Retrieval.

[215]    Meij, E., Trieschnigg, D., de Rijke, M., and Kraaij, W. (2008a). Parsimonious concept modeling. In SIGIR ’08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, pages 815–816, New York, NY, USA. ACM.

[216]    Meij, E., Weerkamp, W., Balog, K., and de Rijke, M. (2008b). Parsimonious relevance models. In SIGIR ’08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, pages 817–818, New York, NY, USA. ACM.

[217]    Meij, E., Mika, P., and Zaragoza, H. (2009a). An evaluation of entity and frequency based query completion methods. In SIGIR ’09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, pages 678–679, New York, NY, USA. ACM.

[218]    Meij, E., Mika, P., and Zaragoza, H. (2009b). Investigating the demand side of semantic search through query log analysis. In Proceedings of the Workshop on Semantic Search (SemSearch 2009) at the 18th International World Wide Web Conference (WWW 2009), pages 2–5.

[219]    Meij, E., Bron, M., Huurnink, B., Hollink, L., and de Rijke, M. (2009c). Learning semantic query suggestions. In ISWC ’09: Proceedings of the 8th International Conference on The Semantic Web, pages 424–440.

[220]    Meij, E., Weerkamp, W., and de Rijke, M. (2009d). A query model based on normalized log-likelihood. In CIKM ’09: Proceeding of the 18th ACM conference on Information and knowledge management, pages 1903–1906, New York, NY, USA. ACM.

[221]    Meij, E., Trieschnigg, D., de Rijke, M., and Kraaij, W. (2010). Conceptual language models for domain-specific retrieval. Information Processing and Management, 46(4), 448–469.

[222]    Meij, E., Bron, M., Hollink, L., Huurnink, B., and de Rijke, M. (Accepted subject to revisions). Mapping queries to the linked open data cloud: A case study using DBpedia. Web Semantics: Science, Services and Agents on the World Wide Web.

[223]    Metzler, D. (2005). Direct maximization of rank-based metrics. Technical report, University of Massachusetts, Amherst.

[224]    Metzler, D. and Croft, W. B. (2005). A Markov random field model for term dependencies. In SIGIR ’05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, pages 472–479, New York, NY, USA. ACM.

[225]    Metzler, D. and Croft, W. B. (2007). Latent concept expansion using Markov random fields. In SIGIR ’07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 311–318, New York, NY, USA. ACM.

[226]    Mihalcea, R. and Csomai, A. (2007). Wikify!: Linking documents to encyclopedic knowledge. In CIKM ’07: Proceedings of the sixteenth ACM conference on information and knowledge management, pages 233–242.

[227]    Mika, P., Meij, E., and Zaragoza, H. (2009). Investigating the semantic gap through query log analysis. In ISWC ’09: Proceedings of the 8th International Semantic Web Conference, pages 441–455.

[228]    Miller, D. R. H., Leek, T., and Schwartz, R. M. (1999a). BBN at TREC-7: Using hidden Markov models for information retrieval. In TREC ’99.

[229]    Miller, D. R. H., Leek, T., and Schwartz, R. M. (1999b). A hidden Markov model information retrieval system. In SIGIR ’99.

[230]    Milne, D. and Witten, I. H. (2008). Learning to link with Wikipedia. In CIKM ’08: Proceedings of the 17th ACM conference on Information and knowledge management, pages 509–518.

[231]    Minker, J., Wilson, G. A., and Zimmerman, B. H. (1972). An evaluation of query expansion by the addition of clustered terms for a document retrieval system. Information Storage and Retrieval, 8(6), 329–348.

[232]    Mishne, G. and de Rijke, M. (2005). Boosting web retrieval through query operations. In D. E. Losada and J. M. Fernández-Luna, editors, ECIR, volume 3408 of Lecture Notes in Computer Science, pages 502–516. Springer.

[233]    Mishne, G. and de Rijke, M. (2006). A study of blog search. In ECIR ’06: Proceedings of the 28th European Conference on Information Retrieval, pages 289–301.

[234]    Mitchell, J. and Lapata, M. (2009). Language models based on semantic composition. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, EMNLP 2009, pages 430–439.

[235]    Mitra, M., Singhal, A., and Buckley, C. (1998). Improving automatic query expansion. In SIGIR ’98: Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, pages 206–214, New York, NY, USA. ACM.

[236]    Momtazi, S. and Klakow, D. (2010). Hierarchical Pitman-Yor language model for information retrieval. In SIGIR ’10: Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval, pages 793–794.

[237]    Mooers, C. N. (1952). Information retrieval viewed as temporal signaling. In Proceedings of the International Congress of Mathematicians, pages 572–573.

[238]    Morrison, P. J. (2008). Tagging and searching: Search retrieval effectiveness of folksonomies on the World Wide Web. Information Processing and Management, 44(4), 1562–1579.

[239]    Nallapati, R., Croft, B., and Allan, J. (2003). Relevant query feedback in statistical language modeling. In CIKM ’03: Proceedings of the twelfth international conference on Information and knowledge management, pages 560–563, New York, NY, USA. ACM.

[240]    Ng, K. (2001). A maximum likelihood ratio information retrieval model. In Proceedings of the 9th Text Retrieval Conference (TREC 2000).

[241]    Nie, J.-Y., Cao, G., and Bai, J. (2006). Inferential language models for information retrieval. ACM Transactions on Asian Language Information Processing (TALIP), 5(4), 296–322.

[242]    Ogilvie, P., Voorhees, E., and Callan, J. (2009). On the number of terms used in automatic query expansion. Information Retrieval, 12(6), 666–679.

[243]    Peat, H. J. and Willett, P. (1991). The limitations of term co-occurrence data for query expansion in document retrieval systems. Journal of the American Society for Information Science, 42(5), 378–383.

[244]    Petras, V. and Baerisch, S. (2008). The domain-specific track at CLEF 2008. In CLEF ’08 Working Notes.

[245]    Petras, V., Baerisch, S., and Stempfhuber, M. (2007). The domain-specific track at CLEF 2007. In CLEF ’07.

[246]    Platt, J. C. (1999). Fast training of support vector machines using sequential minimal optimization. In Advances in kernel methods: support vector learning, pages 185–208. MIT Press.

[247]    Ponte, J. (2000). Language models for relevance feedback. In Advances in Information Retrieval, pages 73–95. Kluwer Academic.

[248]    Ponte, J. M. and Croft, W. B. (1998). A language modeling approach to information retrieval. In SIGIR ’98: Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, pages 275–281, New York, NY, USA. ACM.

[249]    Popov, B., Kiryakov, A., Manov, D., Kirilov, A., Ognyanoff, D., and Goranov, M. (2003). Towards semantic web information extraction. In Human Language Technologies Workshop at the 2nd International Semantic Web Conference (ISWC2003), pages 2–22.

[250]    Pu, Q. and He, D. (2009). Pseudo relevance feedback using semantic clustering in relevance language model. In CIKM ’09: Proceeding of the 18th ACM conference on Information and knowledge management, pages 1931–1934, New York, NY, USA. ACM.

[251]    Qi, X. and Davison, B. D. (2009). Web page classification: Features and algorithms. ACM Comput. Surv., 41(2), 1–31.

[252]    Qiu, Y. and Frei, H.-P. (1993). Concept based query expansion. In SIGIR ’93: Proceedings of the 16th annual international ACM SIGIR conference on Research and development in information retrieval, pages 160–169, New York, NY, USA. ACM.

[253]    Quinlan, R. J. (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann.

[254]    Rajashekar, T. B. and Croft, W. B. (1995). Combining automatic and manual index representations in probabilistic retrieval. J. Am. Soc. Inf. Sci., 46(4), 272–283.

[255]    Ramage, D., Hall, D., Nallapati, R., and Manning, C. D. (2009). Labeled LDA: a supervised topic model for credit attribution in multi-labeled corpora. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, EMNLP 2009, pages 248–256, Morristown, NJ, USA. Association for Computational Linguistics.

[256]    Rennie, J. D. M., Teevan, J., and Karger, D. R. (2003). Tackling the poor assumptions of naive Bayes text classifiers. In ICML ’03: Proceedings of the 20th International Conference on Machine Learning, pages 616–623.

[257]    Roberts, N. (1984). The pre-history of the information retrieval thesaurus. Journal of Documentation, 40, 271–285.

[258]    Robertson, S. (2004). Understanding inverse document frequency: On theoretical arguments for IDF. Journal of Documentation, 60(5), 503–520.

[259]    Robertson, S. (2005). On event spaces and probabilistic models in information retrieval. Information Retrieval, 8(2), 319–329.

[260]    Robertson, S. (2008). On the history of evaluation in IR. Journal of Information Science, 34(4), 439–456.

[261]    Robertson, S. and Belkin, N. (1978). Ranking in principle. Journal of Documentation, 34(2), 93–100.

[262]    Robertson, S. and Zaragoza, H. (2007). On rank-based effectiveness measures and optimization. Information Retrieval, 10(3), 321–339.

[263]    Robertson, S. E. (1977). The probability ranking principle in IR. Journal of Documentation, 33(4), 294–304.

[264]    Robertson, S. E. and Jones, K. S. (1976). Relevance weighting of search terms. Journal of the American Society for Information Science, 27(3), 129–146.

[265]    Robertson, S. E. and Walker, S. (1994). Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval. In SIGIR ’94: Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval, pages 232–241.

[266]    Robertson, S. E., van Rijsbergen, C. J., and Porter, M. F. (1981). Probabilistic models of indexing and searching. In SIGIR ’80: Proceedings of the 3rd annual ACM conference on Research and development in information retrieval, pages 35–56.

[267]    Rocchio, J. (1971). Relevance feedback in information retrieval. In [274].

[268]    Rocha, C., Schwabe, D., and Aragao, M. P. (2004). A hybrid approach for searching in the semantic web. In WWW ’04.

[269]    Rorissa, A. (2010). A comparative study of Flickr tags and index terms in a general image collection. J. Am. Soc. Inf. Sci., 61(11), 2230–2242.

[270]    Rose, D. E. and Levinson, D. (2004). Understanding user goals in web search. In WWW ’04: Proceedings of the 13th international conference on World Wide Web, pages 13–19, New York, NY, USA. ACM.

[271]    Rosenfeld, R. (2000). Two decades of statistical language modeling: Where do we go from here? Proc. IEEE, 88(8), 1270–1278.

[272]    Ruthven, I. and Lalmas, M. (2003). A survey on the use of relevance feedback for information access systems. Knowl. Eng. Rev., 18(2), 95–145.

[273]    Salton, G. (1971a). Information analysis and dictionary construction. In [274].

[274]    Salton, G., editor (1971b). The SMART Retrieval System: Experiments in Automatic Document Processing. Prentice-Hall, Englewood Cliffs, NJ.

[275]    Salton, G. (1996). A new horizon for information science. Journal of the American Society for Information Science, 47(4).

[276]    Salton, G. and Buckley, C. (1990). Improving retrieval performance by relevance feedback. Journal of the American Society for Information Science, 41(4), 288–297.

[277]    Sanderson, M. (2010). Test collection based evaluation of information retrieval systems. Foundations and Trends in Information Retrieval, 4(4), 247–375.

[278]    Sanderson, M. and Zobel, J. (2005). Information retrieval system evaluation: effort, sensitivity, and reliability. In SIGIR ’05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, pages 162–169, New York, NY, USA. ACM.

[279]    Saracevic, T. (1975). Relevance: A review of and a framework for the thinking on the notion in information science. Journal of the American Society for Information Science, 26(6), 321–343.

[280]    Savoy, J. (2005). Bibliographic database access using free-text and controlled vocabulary: an evaluation. Information Processing and Management, 41(4), 873–890.

[281]    Shakery, A. and Zhai, C. (2008). Smoothing document language models with probabilistic term count propagation. Information Retrieval, 11(2), 139–164.

[282]    Shen, D., Sun, J.-T., Yang, Q., and Chen, Z. (2006). Building bridges for web query classification. In SIGIR ’06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, pages 131–138, New York, NY, USA. ACM.

[283]    Shen, X., Tan, B., and Zhai, C. (2005). Context-sensitive information retrieval using implicit feedback. In SIGIR ’05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, pages 43–50, New York, NY, USA. ACM.

[284]    Shvaiko, P. and Euzenat, J. (2005). A survey of schema-based matching approaches. Journal on Data Semantics, 4(3730), 146–171.

[285]    Silveira, M. L. and Ribeiro-Neto, B. (2004). Concept-based ranking: a case study in the juridical domain. Information Processing and Management, 40(5), 791–805.

[286]    Smucker, M. D. and Jethani, C. P. (2010). Impact of retrieval precision on perceived difficulty and other user measures. In HCIR 2010: the fourth international workshop on human-computer interaction and information retrieval (HCIR ’10).

[287]    Smucker, M. D., Allan, J., and Carterette, B. (2007). A comparison of statistical significance tests for information retrieval evaluation. In CIKM ’07: Proceedings of the sixteenth ACM conference on information and knowledge management.

[288]    Soboroff, I. (2007). A comparison of pooled and sampled relevance judgments. In SIGIR ’07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 785–786, New York, NY, USA. ACM.

[289]    Soergel, D. (1976). Is user satisfaction a hobgoblin? Journal of the American Society for Information Science, 27(4), 256–259.

[290]    Song, F. and Croft, W. B. (1999). A general language model for information retrieval. In CIKM ’99: Proceedings of the eighth international conference on Information and knowledge management.

[291]    Sparck Jones, K. (1971). Automatic keyword classification for information retrieval. Archon Books.

[292]    Sparck Jones, K. (2004). What’s new about the semantic web?: some questions. SIGIR Forum, 38(2), 18–23.

[293]    Sparck Jones, K. and Jackson, D. M. (1970). The use of automatically-obtained keyword classifications for information retrieval. Information Processing and Management, 5(1), 175–201.

[294]    Sparck Jones, K. and Needham, R. M. (1968). Automatic term classification and retrieval. Information Processing and Management, 4(1), 91–100.

[295]    Sparck Jones, K. and Robertson, S. (2001). LM vs. PM: Where is the relevance? In J. Callan, W. B. Croft, and J. Lafferty, editors, Workshop on Language Modeling and Information Retrieval.

[296]    Sparck Jones, K. and Willett, P., editors (1997). Readings in information retrieval. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.

[297]    Sparck Jones, K., Robertson, S. E., and Hiemstra, D. (2003). Language modeling and relevance, pages 57–71. Volume 1 of [82].

[298]    Spiegel, J. and Bennett, E. (1964). A modified statistical association procedure for automatic document content analysis and retrieval. In M. Stevens, V. Giuliano, and L. Heilprin, editors, Statistical Association Methods For Mechanized Documentation.

[299]    Spink, A., Jansen, B. J., and Ozmultu, C. H. (2000). Use of query reformulation and relevance feedback by Excite users. Internet Research: Electronic Networking Applications and Policy, 10(4), 317–328.

[300]    Spink, A., Jansen, B. J., Wolfram, D., and Saracevic, T. (2002). From e-sex to e-commerce: Web search changes. IEEE Computer, 35(3), 107–109.

[301]    Srikanth, M. and Srihari, R. (2002). Biterm language models for document retrieval. In SIGIR ’02: Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval, pages 425–426.

[302]    Srinivasan, P. (1996). Query expansion and MEDLINE. Information Processing and Management, 32(4), 431–443.

[303]    Steyvers, M. and Griffiths, T. (2007). Probabilistic topic models. In T. K. Landauer, D. S. McNamara, S. Dennis, and W. Kintsch, editors, Handbook of Latent Semantic Analysis, pages 427–448, Mahwah, NJ. Lawrence Erlbaum Associates.

[304]    Stoilos, G., Stamou, G. B., and Kollias, S. D. (2005). A string metric for ontology alignment. In ISWC ’05: Proceedings of the 4th International Semantic Web Conference, pages 624–637.

[305]    Stokes, N., Li, Y., Cavedon, L., and Zobel, J. (2009). Exploring criteria for successful query expansion in the genomic domain. Information Retrieval, 12(1), 17–50.

[306]    Stumme, G., Hotho, A., and Berendt, B. (2006). Semantic web mining: State of the art and future directions. Web Semantics: Science, Services and Agents on the World Wide Web, 4(2), 124–143.

[307]    Suchanek, F. M., Kasneci, G., and Weikum, G. (2008). Yago: A large ontology from Wikipedia and WordNet. Web Semantics: Science, Services and Agents on the World Wide Web, 6(3), 203–217.

[308]    Sunehag, P. (2007). Using two-stage conditional word frequency models to model word burstiness and motivating TF-IDF. In M. Meila and X. Shen, editors, Conference for Artificial Intelligence and Statistics, pages 8–16.

[309]    Tague-Sutcliffe, J. M. (1996). Some perspectives on the evaluation of information retrieval systems. J. Am. Soc. Inf. Sci., 47(1), 1–3.

[310]    Tao, T. and Zhai, C. (2006). Regularized estimation of mixture models for robust pseudo-relevance feedback. In SIGIR ’06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, pages 162–169, New York, NY, USA. ACM.

[311]    Tao, T., Wang, X., Mei, Q., and Zhai, C. (2006). Language model information retrieval with document expansion. In Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, pages 407–414, Morristown, NJ, USA. Association for Computational Linguistics.

[312]    Tata, S. and Lohman, G. M. (2008). SQAK: doing more with keywords. In SIGMOD Conference, pages 889–902. ACM.

[313]    Thompson, P. (2008). Looking back: On relevance, probabilistic indexing and information retrieval. Information Processing and Management, 44(2), 963–970.

[314]    Trajkova, J. and Gauch, S. (2004). Improving ontology-based user profiles. In Proceedings of RIAO ’04.

[315]    Trieschnigg, D., Kraaij, W., and Schuemie, M. (2006). Concept based passage retrieval for genomics literature. In [330].

[316]    Trieschnigg, D., Kraaij, W., and de Jong, F. (2007). The influence of basic tokenization on biomedical document retrieval. In SIGIR ’07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 803–804, New York, NY, USA. ACM.

[317]    Trieschnigg, D., Meij, E., de Rijke, M., and Kraaij, W. (2008). Measuring concept relatedness using language models. In SIGIR ’08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, pages 823–824, New York, NY, USA. ACM.

[318]    Trieschnigg, D., Pezik, P., Lee, V., Kraaij, W., de Jong, F., and Rebholz-Schuhmann, D. (2009). MeSH Up: Effective MeSH text classification and improved document retrieval. Bioinformatics, 25(11), 1412–1418.

[319]    Troncy, R. (2008). Bringing the IPTC News Architecture into the Semantic Web. In A. P. Sheth, S. Staab, M. Dean, M. Paolucci, D. Maynard, T. W. Finin, and K. Thirunarayan, editors, International Semantic Web Conference, volume 5318 of Lecture Notes in Computer Science, pages 483–498. Springer.

[320]    Tsikrika, T., Diou, C., de Vries, A., and Delopoulos, A. (2009). Image annotation using clickthrough data. In CIVR ’09: Proceeding of the ACM International Conference on Image and Video Retrieval, pages 1–8.

[321]    Tudhope, D., Binding, C., Blocks, D., and Cunliffe, D. (2006). Query expansion via conceptual distance in thesaurus indexed collections. Journal of Documentation, 62(4), 509–533.

[322]    Turney, P. and Pantel, P. (2010). From frequency to meaning: Vector space models of semantics. Journal of Artificial Intelligence Research, 37, 141–188.

[323]    Vakkari, P., Jones, S., Macfarlane, A., and Sormunen, E. (2004). Query exhaustivity, relevance feedback and search success in automatic and interactive query expansion. Journal of Documentation, 60(2), 109–127.

[324]    van Hage, W. R., de Rijke, M., and Marx, M. (2004). Information retrieval support for ontology construction and use. In ISWC ’04: Proceedings of the 3rd International Semantic Web Conference, pages 518–533.

[325]    Van Rijsbergen, C. J. (1979). Information Retrieval, 2nd edition. Dept. of Computer Science, University of Glasgow.

[326]    Vapnik, V. N. (1995). The nature of statistical learning theory. Springer-Verlag.

[327]    Voorhees, E. M. (1994). Query expansion using lexical-semantic relations. In SIGIR ’94: Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval, pages 61–69, New York, NY, USA. Springer-Verlag New York, Inc.

[328]    Voorhees, E. M. (2000). Variations in relevance judgments and the measurement of retrieval effectiveness. Information Processing and Management, 36(5), 697–716.

[329]    Voorhees, E. M. (2005). The TREC Robust retrieval track. SIGIR Forum, 39(1), 11–20.

[330]    Voorhees, E. M. and Buckland, L. P., editors (2006). Proceedings of the Fifteenth Text REtrieval Conference, TREC 2006, Gaithersburg, Maryland, November 14-17, 2006, volume Special Publication 500-272. National Institute of Standards and Technology (NIST).

[331]    Voorhees, E. M. and Buckland, L. P., editors (2009). Proceedings of The Eighteenth Text REtrieval Conference, TREC 2009, Gaithersburg, Maryland, USA, November 2009. National Institute of Standards and Technology (NIST).

[332]    Voorhees, E. M. and Harman, D. K. (2005). TREC: Experiment and Evaluation in Information Retrieval. MIT Press.

[333]    Wang, K., Li, X., and Gao, J. (2010). Multi-style language model for web scale information retrieval. In SIGIR ’10: Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval, pages 467–474, New York, NY, USA. ACM.

[334]    Wang, S., Englebienne, G., and Schlobach, S. (2008). Learning concept mappings from instance similarity. In ISWC ’08: Proceedings of the 7th International Conference on The Semantic Web, pages 339–355.

[335]    Wang, X. and Zhai, C. (2008). Mining term association patterns from search logs for effective query reformulation. In CIKM ’08: Proceeding of the 17th ACM conference on Information and knowledge management, pages 479–488, New York, NY, USA. ACM.

[336]    Weerkamp, W. and de Rijke, M. (2008). Credibility improves topical blog post retrieval. In ACL, pages 923–931. The Association for Computational Linguistics.

[337]    Weerkamp, W., Balog, K., and de Rijke, M. (2009a). A generative blog post retrieval model that uses query expansion based on external collections. In ACL-IJCNLP 2009.

[338]    Weerkamp, W., Balog, K., and Meij, E. J. (2009b). A generative language modeling approach for ranking entities. In Advances in Focused Retrieval.

[339]    Wei, X. (2007). Topic Models in Information Retrieval. Ph.D. thesis, University of Massachusetts.

[340]    Wei, X. and Croft, W. B. (2006). LDA-based document models for ad-hoc retrieval. In SIGIR ’06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, pages 178–185, New York, NY, USA. ACM.

[341]    White, R. W., Ruthven, I., Jose, J. M., and Van Rijsbergen, C. J. (2005). Evaluating implicit feedback models using searcher simulations. ACM Trans. Inf. Syst., 23(3), 325–361.

[342]    Wikipedia (2010). Wikipedia:Manual of Style (lead section). http://en.wikipedia.org/wiki/wikipedia:Lead_section [Online; accessed August 2010].

[343]    Wilcoxon, F. (1945). Individual comparisons by ranking methods. Biometrics Bulletin, 1(6), 80–83.

[344]    Witten, I. H. and Frank, E. (2005). Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann.

[345]    Xu, J. and Croft, W. B. (1996). Query expansion using local and global document analysis. In SIGIR ’96: Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval, pages 4–11, New York, NY, USA. ACM.

[346]    Xu, J. and Croft, W. B. (1999). Cluster-based language models for distributed retrieval. In SIGIR ’99: Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, pages 254–261, New York, NY, USA. ACM.

[347]    Xu, Y., Jones, G. J., and Wang, B. (2009). Query dependent pseudo-relevance feedback based on Wikipedia. In SIGIR ’09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, pages 59–66, New York, NY, USA. ACM.

[348]    Xu, Z. and Akella, R. (2010). Improving probabilistic information retrieval by modeling burstiness of words. Information Processing and Management, 46(2), 143–158.

[349]    Yang, Y. and Chute, C. G. (1993). Words or concepts: the features of indexing units and their optimal use in information retrieval. Proc. 17th Annu. Symp. Comput. Appl. Med. Care, pages 685–689.

[350]    Yang, Y. and Pedersen, J. O. (1997). A comparative study on feature selection in text categorization. In ICML ’97: Proceedings of the Fourteenth International Conference on Machine Learning, pages 412–420.

[351]    Yu, J. X., Qin, L., and Chang, L. (2010). Keyword search in relational databases: A survey. IEEE Data Eng. Bull. Special Issue on Keyword Search, 33(1), 67–78.

[352]    Zaragoza, H., Hiemstra, D., and Tipping, M. (2003). Bayesian extension to the language model for ad hoc information retrieval. In SIGIR ’03: Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval, pages 4–9, New York, NY, USA. ACM Press.

[353]    Zhai, C. (2002). Risk Minimization and Language Modeling in Text Retrieval. Ph.D. thesis, Carnegie Mellon University.

[354]    Zhai, C. and Lafferty, J. (2001). Model-based feedback in the language modeling approach to information retrieval. In CIKM ’01: Proceedings of the tenth international conference on Information and knowledge management, pages 403–410, New York, NY, USA. ACM.

[355]    Zhai, C. and Lafferty, J. (2002). Two-stage language models for information retrieval. In SIGIR ’02: Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval, pages 49–56, New York, NY, USA. ACM.

[356]    Zhai, C. and Lafferty, J. (2004). A study of smoothing methods for language models applied to information retrieval. ACM Trans. Inf. Syst., 22(2), 179–214.

[357]    Zhou, X., Hu, X., Zhang, X., Lin, X., and Song, I.-Y. (2006). Context-sensitive semantic smoothing for the language modeling approach to genomic IR. In SIGIR ’06.

[358]    Zhou, X., Hu, X., and Zhang, X. (2007). Topic signature language models for ad hoc retrieval. IEEE Transactions on Knowledge and Data Engineering, 19(9), 1276–1287.

[359]    Zhou, Y. and Croft, W. B. (2007). Query performance prediction in web search environments. In SIGIR ’07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 543–550.

[360]    Zipf, G. K. (1929). Relative frequency as a determinant of phonetic change. Harvard Studies in Classical Philology, 40, 1–95.

[361]    Zipf, G. K. (1932). Selective Studies and the Principle of Relative Frequency in Language. Harvard University Press.

[362]    Zobel, J. (1998). How reliable are the results of large-scale information retrieval experiments? In SIGIR ’98: Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, pages 307–314, New York, NY, USA. ACM.