Text Analysis

Overview

The amount of available textual data is growing rapidly, and so is the value that organizations can gain from analyzing it. Yet data is typically extracted from textual resources outside the database, by programs that are fundamentally different from traditional declarative query languages. We develop foundations of declarative languages for analyzing text alongside structured data. Among our main efforts are the design of efficient algorithms for evaluating programs in these languages and the compilation of such programs into efficient execution plans.

People

Collaborators

Ron Fagin, IBM, USA

Selected Publications

Oren Mishali, Benny Kimelfeld, "Towards Linked Data of Bible Quotations in Jewish Texts", DH 2018: 455-456
Dominik D. Freydenberger, Benny Kimelfeld, Liat Peterfreund, "Joining Extractions of Regular Expressions", PODS 2018: 137-149
Liat Peterfreund, Balder ten Cate, Ronald Fagin, Benny Kimelfeld, "Recursive Programs for Document Spanners", CoRR abs/1712.08198 (2017). To appear in ICDT 2019
Ronald Fagin, Benny Kimelfeld, Frederick Reiss, Stijn Vansummeren, "Declarative Cleaning of Inconsistencies in Information Extraction", ACM Trans. Database Syst. 41(1): 6:1-6:44 (2016)
Ronald Fagin, Benny Kimelfeld, Frederick Reiss, Stijn Vansummeren, "Document Spanners: A Formal Approach to Information Extraction", J. ACM 62(2): 12:1-12:51 (2015)