EEMCS EPrints Service


Near-real time statistics gathered from a continuous and voluminous data mutation stream

Lavooij, K. (2010) Near-real time statistics gathered from a continuous and voluminous data mutation stream. Master's thesis, University of Twente.

Full text available as: PDF (330 Kb)

Abstract

The amount of digital data is growing fast [1]. Given the amount of information available, providing that information as a service is not enough [2]. To help users find information, supporting systems have been developed that extract specific information from a large amount of stored data.
Finding or extracting interesting information is at least as important as providing the original data. The “collective intelligence” of a large number of users can be used to order the information. Ordered information is of much greater value than unordered information, because it gives the user an overview of what is more and less interesting.
Current database systems are not able to provide ranked information by analyzing a massive amount of user feedback (e.g. clicks) within a short period of time. Instead, these systems update their answers periodically.
In this thesis, a Stream Processing Engine (SPE) [3, 4, 5, 6] is adapted. The modified SPE accepts a stream of mutations to a virtual data storage, as opposed to a stream of tuples. The newly created system exploits the properties of statistical functions to efficiently aggregate live statistics over a large stream of mutations.
The newly created system is able to provide answers to a small set of continuous queries. The answers to these queries are maintained continuously instead of being recalculated. Therefore, the system is able to provide the answers to the continuous queries with low latency for a large number of users.
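
The idea of maintaining an answer instead of recalculating it can be illustrated with a minimal Python sketch. It is not the implementation from the thesis: the `LiveStats` class, the operation names `insert`, `update` and `delete`, and the choice of mean and variance as the statistics are all assumptions made for illustration. The sketch relies on the fact that count, sum and sum of squares can each be adjusted in constant time when a single value is added or retracted.

```python
from dataclasses import dataclass, field


@dataclass
class LiveStats:
    """Hypothetical example: keeps mean and variance current under mutations."""

    values: dict = field(default_factory=dict)  # stands in for the virtual data storage
    n: int = 0
    total: float = 0.0
    total_sq: float = 0.0

    def apply(self, op, key, value=None):
        """Apply one mutation; op is 'insert', 'update' or 'delete' (assumed names)."""
        if op not in ("insert", "update", "delete"):
            raise ValueError(f"unknown mutation: {op}")
        # Retract the contribution of the value currently stored under key, if any.
        if key in self.values:
            old = self.values.pop(key)
            self.n -= 1
            self.total -= old
            self.total_sq -= old * old
        # For inserts and updates, add the contribution of the new value.
        if op != "delete":
            self.values[key] = value
            self.n += 1
            self.total += value
            self.total_sq += value * value

    def mean(self):
        return self.total / self.n if self.n else None

    def variance(self):
        """Population variance, computed from the maintained sums."""
        if not self.n:
            return None
        m = self.total / self.n
        return self.total_sq / self.n - m * m


stats = LiveStats()
for mutation in [("insert", "a", 2.0), ("insert", "b", 4.0), ("update", "a", 6.0)]:
    stats.apply(*mutation)
print(stats.mean(), stats.variance())
```

Each mutation costs constant time regardless of how many values are stored, so a query for the mean or variance never has to scan the data. The sum-of-squares formula is used here only for brevity; it can lose precision when the variance is small relative to the mean, and a production system would likely prefer a numerically more stable update rule.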

Item Type: Master's Thesis
Research Group: EWI-DB: Databases
Research Program: CTIT-NICE: Natural Interaction in Computer-mediated Environments
ID Code: 17523
Deposited On: 21 February 2010
