Inconsistent Data Management

Overview

 

Managing data quality, and data inconsistency in particular, has long been one of the major challenges in the research and practice of database management. Sources of inconsistency include imprecise data-generation processes, such as mistakes in manual form filling and noisy sensing equipment, as well as data integration, where different source databases may contain conflicting information. The problem has become even more central to data management as data repositories increasingly rely on imprecise processes (e.g., crowdsourcing and information extraction from natural language) and on the integration of repositories with varying levels of reliability. In our research, we aim to develop fundamental approaches to managing data quality, including ways to clean and query inconsistent databases and to measure their error level.

People

Selected Publications

Ester Livshits, Benny Kimelfeld, Sudeepa Roy, "Computing Optimal Repairs for Functional Dependencies", PODS 2018: 225-237
Christopher De Sa, Ihab F. Ilyas, Benny Kimelfeld, Christopher Ré, Theodoros Rekatsinas, "A Formal Framework For Probabilistic Unclean Databases", CoRR abs/1801.06750, 2018
Ester Livshits, Benny Kimelfeld, "Counting and Enumerating (Preferred) Database Repairs", PODS 2017: 289-301
Benny Kimelfeld, Ester Livshits, Liat Peterfreund, "Detecting Ambiguity in Prioritized Database Repairing", ICDT 2017: 17:1-17:20