University of Tampere, Faculty of Information Sciences
Department of Information Studies

Project: Multigrade CLIR – Direct and Transitive Cross-Language IR Evaluated by Graded Relevance Assessments

Description

 

Research on cross-language information retrieval (CLIR) has typically been restricted to settings that use binary relevance assessments. In this project, we present evaluation results for dictionary-based CLIR using graded relevance assessments in a best-match retrieval environment. We use text databases containing newspaper articles, and test topics with relevance assessments graded from 0 (non-relevant) to 3 (highly relevant). We have such collections in Finnish and English, which therefore serve as the target languages in our experiments. The source languages are Finnish, English, German and Swedish. We study both direct translation from the source languages to the target languages and transitive translation via pivot languages. Monolingual baseline queries are also considered. In our tests we employ the dictionary-based UTACLIR query translation system together with query expansion based on pseudo-relevance feedback. In general, we use target queries structured by synonym sets, which have been shown to yield better performance than bag-of-words target queries. CLIR performance is evaluated at three relevance thresholds (stringent, regular, and liberal) and with generalized recall and precision (Kekäläinen & Järvelin 2002).
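As an illustration of the evaluation measures named above, the following minimal sketch computes precision and recall at the three relevance thresholds, and generalized precision and recall, for one ranked result list. It is not the project's evaluation code. The threshold mapping (liberal: grade 1 or higher, regular: grade 2 or higher, stringent: grade 3 only), the linear scaling of grades to the range 0 to 1, and all function and document names are assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Assumed mapping from threshold name to the minimum grade counted as relevant.
THRESHOLDS: Dict[str, int] = {"liberal": 1, "regular": 2, "stringent": 3}


def binary_precision_recall(
    ranked: List[str], grades: Dict[str, int], min_grade: int
) -> Tuple[float, float]:
    """Precision and recall after binarizing graded judgments at min_grade."""
    relevant = {doc for doc, grade in grades.items() if grade >= min_grade}
    hits = sum(1 for doc in ranked if doc in relevant)
    precision = hits / len(ranked) if ranked else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def generalized_precision_recall(
    ranked: List[str], grades: Dict[str, int], max_grade: int = 3
) -> Tuple[float, float]:
    """Generalized precision and recall with grades scaled linearly to [0, 1].

    Linear scaling (grade / max_grade) is an assumption of this sketch;
    other weightings of the grades are possible.
    """
    retrieved_gain = sum(grades.get(doc, 0) for doc in ranked) / max_grade
    total_gain = sum(grades.values()) / max_grade
    g_precision = retrieved_gain / len(ranked) if ranked else 0.0
    g_recall = retrieved_gain / total_gain if total_gain else 0.0
    return g_precision, g_recall


# Hypothetical judgments for one topic and a hypothetical result list.
example_grades = {"d1": 3, "d2": 2, "d3": 1, "d4": 0, "d5": 3}
example_ranked = ["d1", "d4", "d3"]

if __name__ == "__main__":
    for name, min_grade in THRESHOLDS.items():
        p, r = binary_precision_recall(example_ranked, example_grades, min_grade)
        print(f"{name}: precision={p:.3f} recall={r:.3f}")
    gp, gr = generalized_precision_recall(example_ranked, example_grades)
    print(f"generalized: precision={gp:.3f} recall={gr:.3f}")
```

Under these assumptions, the same result list scores differently at each threshold, while the generalized measures give partial credit to marginally relevant documents instead of forcing a binary cut-off.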

 

Duration

 

2003 – 2007. Project finished.

 

Researchers

Mrs. Raija Lehtokangas – supervised by Prof. Kal Järvelin
Mr. Heikki Keskustalo – supervised by Dr. Ari Pirkola and Prof. Kal Järvelin

Publications


  1. Lehtokangas, R., Keskustalo, H. & Järvelin, K. (2008). Experiments with Transitive Dictionary Translation and Pseudo-Relevance Feedback Using Graded Relevance Assessments. Journal of the American Society for Information Science and Technology (JASIST) 59(3): 476-488.
  2. Lehtokangas, R., Keskustalo, H. & Järvelin, K. (2006). Highly Relevant Documents Lost in CLIR: Experiments with Dictionary Translation and Pseudo-Relevance Feedback. Information Retrieval 10(xxx): xxx-xxx, to appear.
  3. Lehtokangas, R., Keskustalo, H. & Järvelin, K. (2005). Dictionary-Based CLIR Loses Highly Relevant Documents. In: Losada, D. & Fernandez-Luna, J. (eds.), Advances in Information Retrieval, Proceedings of the 27th European Conference on IR Research, ECIR 2005, Santiago de Compostela, Spain. Lecture Notes in Computer Science, 3408. Berlin: Springer, 421-432.
  4. Lehtokangas, R., Airio, E. & Järvelin, K. (2004). Transitive dictionary translation challenges direct dictionary translation in CLIR. Information Processing & Management 40(6): 973-988.

Updated 11.03.2008 Responsibility for updating: KJ

