2009 | 197 Pages | ISBN: 3540893636 | PDF | 3 MB
A modern information retrieval system must have the capability to find, organize and present very different manifestations of information – such as text, pictures, videos or database records – any of which may be of relevance to the user. However, the concept of relevance, while seemingly intuitive, is actually hard to define, and it is even harder to model in a formal way.

Lavrenko does not attempt to bring forth a new definition of relevance, nor does he provide arguments as to why any particular definition might be theoretically superior or more complete. Instead, he takes a widely accepted, albeit somewhat conservative definition, makes several assumptions, and from them develops a new probabilistic model that explicitly captures that notion of relevance. With this book, he makes two major contributions to the field of information retrieval: first, a new way to look at topical relevance that complements the two dominant models (the classical probabilistic model and the language modeling approach) and explicitly combines documents, queries, and relevance in a single formalism; second, a new method for modeling exchangeable sequences of discrete random variables that makes no structural assumptions about the data and can also handle rare events.

His book is thus of major interest to researchers and graduate students in information retrieval who specialize in relevance modeling, ranking algorithms, and language modeling.