Knowledge and Data Management White Papers

Fast Mining of Massive Tabular Data Via Approximate Distance Computations

Overview

Tabular data abound in many data stores: traditional relational databases store tables, and new applications also generate massive tabular datasets. Examples include the geographic distribution of cell phone traffic at base stations across the country and the evolution of traffic at Internet routers over time. Detecting similarity patterns in such datasets is of great importance. Identifying these patterns poses conceptual challenges (what is a suitable similarity distance function for two "regions"?) as well as technical challenges (how can similarity computations be performed efficiently as massive tables accumulate over time?). This paper addresses both and presents methods for determining similar regions in massive tabular data.
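To make the idea of an approximate distance between two table regions concrete, here is a minimal sketch. The overview does not describe the paper's actual technique, so this uses one common approach as an assumption: flatten each rectangular region into a vector, compress it with a shared Gaussian random projection, and estimate the Euclidean (L2) distance from the compressed sketches. All function names, the toy table, and the sketch size are hypothetical.

```python
import math
import random


def region_to_vector(table, top, left, height, width):
    """Flatten a rectangular region of a table (list of rows) into one vector."""
    return [table[r][c]
            for r in range(top, top + height)
            for c in range(left, left + width)]


def l2_distance(u, v):
    """Exact Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def make_projection(dim, k, seed=0):
    """Build k random Gaussian rows of length dim (shared by all regions)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(k)]


def sketch(vec, projection):
    """Compress a vector to one dot product per projection row."""
    return [sum(p * x for p, x in zip(row, vec)) for row in projection]


def approx_l2_distance(sketch_u, sketch_v):
    """Estimate the L2 distance from two sketches made with the same projection.

    Each coordinate difference is Gaussian with variance equal to the squared
    true distance, so the mean of the squared differences estimates it.
    """
    k = len(sketch_u)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(sketch_u, sketch_v)) / k)


if __name__ == "__main__":
    # Hypothetical 8x8 table of counts; two 4x4 regions are compared.
    table = [[(r * 7 + c * 3) % 11 for c in range(8)] for r in range(8)]
    u = region_to_vector(table, 0, 0, 4, 4)
    v = region_to_vector(table, 4, 4, 4, 4)
    projection = make_projection(dim=len(u), k=400)
    print("exact :", l2_distance(u, v))
    print("approx:", approx_l2_distance(sketch(u, projection), sketch(v, projection)))
```

In this toy case the sketch is larger than the region, so nothing is saved; the approach only pays off when regions contain far more cells than the sketch has entries, since each region's sketch can then be stored once and reused for every comparison.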

Further White Paper Details
Publisher: University of Warwick
File Format: PDF
Date Published: July 2002
Format: White Papers