Project:
Semantic Information Retrieval in Unannotated Document Collections
Description
Ontologies and other
conceptual models describe the structure of delimited topical areas.
While conceptual models are traditional tools in Information Science,
e.g., in the form of thesauri, they have recently attracted much attention
in several research disciplines because of the Semantic Web. While
ontologies greatly resemble traditional thesauri, they can be richer
in structural relationships. The most important difference, however, is
the aim toward computational semantics (or inference), which may
support more “intelligent” applications and interoperability
of IR systems. Through the use of ontologies, the information searcher
can avoid (or at least greatly reduce) the complexity of natural language
when searching, e.g., on the web. Annotations based on ontologies
are supposed to capture the semantic content of documents in a nutshell.
As the history of indexing research shows, however, annotation has
problems of cost, quality (consistency), exhaustiveness and specificity.
For example, the most popular metadata format for
the Web, the Dublin Core format, was in 2002 employed in 0.3% of
web documents.
Our approach is therefore
different: we investigate ontology-based access to unannotated document
collections. This line of research began more than 10 years
ago and has produced several academic degrees and research articles
(see the FIRE archive). We have shown that structured queries, based
on ontologies, greatly improve performance in ontology-based IR,
at least in news article collections. Research problems for the
five-year period include ontology-based query formulation in varying
types of unannotated document collections (news, research articles,
legal documents, image collections) in various languages.
Further research problems
include methods for building ontologies and integrating publicly
available semantic sources such as Semantic Web ontologies, professional
terminologies, dictionaries and resources such as WordNet (http://www.cogsci.princeton.edu/~wn/).
Yet another research theme is the design of search interfaces based
on ontologies. An underlying theme is the evaluation of the effectiveness
of each method or tool regarding the quality of the response.
We have previously developed
a principle of abstraction levels (Järvelin et al., 1996;
2001), which systematically aligns the ontology level with the
corresponding linguistic level (NL expressions for ontological concepts)
and the string matching level (patterns for matching expressions
in text, including in inflectional and compounding languages). This supports
information retrieval in varying environments without requiring
the user to master the details (document indexing, query languages)
of the environments. Based on this principle, we have developed
the search ontology editor ShOE, which supports semiautomatic construction
of search ontologies, and the QUCCOO query constructor, which is
based on such ontologies.
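The three abstraction levels can be illustrated with a minimal Python sketch. It is not the ShOE or QUCCOO implementation: the toy ontology, the `EXPRESSIONS` table and all function names are hypothetical, and simple right truncation stands in for the real handling of inflection and compounds.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Ontology level: a concept with its narrower concepts."""
    name: str
    narrower: list[Concept] = field(default_factory=list)


# Linguistic level (hypothetical data): NL expressions for each concept.
EXPRESSIONS = {
    "vehicle": ["vehicle"],
    "car": ["car", "automobile"],
    "bicycle": ["bicycle", "bike"],
    "accident": ["accident", "crash"],
}


def expand(concept: Concept) -> list[Concept]:
    """Return the concept followed by all of its narrower concepts."""
    found = [concept]
    for child in concept.narrower:
        found.extend(expand(child))
    return found


def expressions_for(concept: Concept) -> list[str]:
    """Map a concept and its narrower concepts to NL expressions."""
    return [e for c in expand(concept) for e in EXPRESSIONS.get(c.name, [c.name])]


def pattern_for(expression: str) -> re.Pattern:
    """String matching level: crude right truncation (an assumption).

    This also over-matches, e.g. 'car' matches 'carpet'; real
    morphological processing would be needed in practice.
    """
    return re.compile(r"\b" + re.escape(expression) + r"\w*", re.IGNORECASE)


def build_query(concepts: list[Concept]) -> list[list[re.Pattern]]:
    """Structured query: one facet (OR group) per concept, facets ANDed."""
    return [[pattern_for(e) for e in expressions_for(c)] for c in concepts]


def matches(query: list[list[re.Pattern]], text: str) -> bool:
    """A text matches if every facet has at least one matching pattern."""
    return all(any(p.search(text) for p in facet) for facet in query)


vehicle = Concept("vehicle", [Concept("car"), Concept("bicycle")])
accident = Concept("accident")
query = build_query([vehicle, accident])
```

With this sketch, the searcher selects only the concepts `vehicle` and `accident`; the expansion to expressions and matching patterns happens without the searcher seeing the lower levels.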
Duration
2003 - 2009
Researchers
Mr. Feza Baskaya –
supervisor Prof. Kalervo Järvelin
Mrs. Sari Suomela – supervisor Prof. Jaana Kekäläinen
Publications
- Järvelin, K. & Kekäläinen, J. & Niemi, T. (2001). ExpansionTool:
Concept-based query expansion and construction. Information Retrieval
4(3/4): 231-255.
- Airio, E. & Järvelin,
K. & Saatsi, P. & Kekäläinen, J. & Suomela,
S. (2004). CIRI - An ontology-based query interface for text retrieval.
In: Hyvönen, E. et al. (Eds.) Web Intelligence: STeP 2004,
The 11th Finnish Artificial Intelligence Conference. Helsinki,
Finland: Finnish Artificial Intelligence Society, Publications
20, pp. 73-82.
- Suomela, S. (2005).
User test on multi-lingual ontology interface. In: Bailey, A,
Ruthven, I, Azzopardi, L, eds. Proceedings of the Workshop on
Evaluating User Studies in Information Access at CoLIS 5, Glasgow,
Scotland, June 2005.
- Suomela, S. (2005).
User study on ontology as query construction tool. In: Bailey,
A, Ruthven, I, Azzopardi, L, eds. Proceedings of the Workshop
on Evaluating User Studies in Information Access at CoLIS 5, Glasgow,
Scotland, June 2005.
- Suomela, S. & Kekäläinen,
J. (2005). Ontology as a search tool: A study of real users' query
formulation with and without conceptual support. In: Losada, DE
& Fernandez Luna, JM, eds. 27th European Conference on Information
Retrieval ECIR05, Santiago de Compostela Spain, March 2005. Heidelberg:
Springer, Lecture Notes in Computer Science 3408, 315-329.
- Suomela, S. &
Kekäläinen, J.: User Study on Ontology as a Query Construction
Tool. Information Retrieval 9(xxx): xxx-xxx. Accepted for publication,
October 2005.
Relevant links
The project ShOE – search ontology editor
The project QUCCOO – ontology-based search interface
The project OntolA – evaluation of interactive use of QUCCOO
Updated
29.12.2005
Responsibility for updating: KJ