EEMCS EPrints Service


Query-Based Sampling using Snippets

Tigelaar, A.S. and Hiemstra, D. (2010) Query-Based Sampling using Snippets. In: Eighth Workshop on Large-Scale Distributed Systems for Information Retrieval, 23 Jul 2010, Geneva, Switzerland. pp. 9-14. CEUR Workshop Proceedings 630. CEUR-WS. ISSN 1613-0073

Full text available as: PDF (460 KB)

Official URL: http://ceur-ws.org/Vol-630/lsdsir1.pdf

Exported to Metis

Abstract

Query-based sampling is a commonly used approach to model the content of servers. Conventionally, queries are sent to a server and the documents in the returned search results are downloaded in full as a representation of the server’s content. We present an approach that uses the document snippets in the search results as samples instead of downloading the entire documents. We show that this yields equal or better modeling performance for the same bandwidth consumption, depending on collection characteristics such as document length distribution and homogeneity. Query-based sampling using snippets is a useful approach for real-world systems, since it requires no extra operations beyond exchanging queries and search results.
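
The difference between the two sampling modes described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `ToyServer`, `sample`, `snippet_words`, and the rule of picking the next query term at random from the vocabulary learned so far are all assumptions made for illustration, and the "server" is just an in-memory list of documents.

```python
import random
from collections import Counter


class ToyServer:
    """Hypothetical in-memory stand-in for a remote search server."""

    def __init__(self, docs, snippet_words=5):
        self.docs = [d.split() for d in docs]
        self.snippet_words = snippet_words  # assumed snippet window size

    def search(self, term, top_k=2):
        """Return up to top_k (doc_id, snippet) pairs for docs containing term."""
        results = []
        for doc_id, words in enumerate(self.docs):
            if term in words:
                pos = words.index(term)
                start = max(0, pos - self.snippet_words // 2)
                snippet = tuple(words[start:start + self.snippet_words])
                results.append((doc_id, snippet))
        return results[:top_k]

    def fetch(self, doc_id):
        """Simulate downloading a full document (the bandwidth-heavy step)."""
        return self.docs[doc_id]


def sample(server, seed_term, n_queries, use_snippets, rng):
    """Build a term-frequency model of the server from query results.

    use_snippets=True  -> model is built from result snippets only.
    use_snippets=False -> each newly seen result document is fetched in full.
    """
    model = Counter()
    seen = set()  # avoids counting the same document or snippet twice
    term = seed_term
    for _ in range(n_queries):
        for doc_id, snippet in server.search(term):
            key = (doc_id, snippet) if use_snippets else doc_id
            if key in seen:
                continue
            seen.add(key)
            model.update(snippet if use_snippets else server.fetch(doc_id))
        if model:
            # Assumed strategy: next query is a random term learned so far.
            term = rng.choice(sorted(model))
    return model


if __name__ == "__main__":
    docs = [
        "query based sampling models the content of a search server",
        "snippets are short summaries shown in search results",
        "downloading full documents costs more bandwidth than snippets",
    ]
    rng = random.Random(0)
    print(sample(ToyServer(docs), "search", 5, True, rng).most_common(5))
```

In snippet mode the sketch never calls `fetch`, which mirrors the abstract's point that only queries and search results need to be exchanged.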

Item Type: Conference or Workshop Paper (Full Paper, Talk)
Research Group: EWI-DB: Databases
Research Program: CTIT-NICE: Natural Interaction in Computer-mediated Environments
Research Project: DIRKA: Distributed Information Retrieval by means of Keyword Auctions
Uncontrolled Keywords: distributed information retrieval, query-based sampling
ID Code: 18164
Status: Published
Deposited On: 27 July 2010
Refereed: Yes
International: Yes