My topics of interest span several fields, including databases, data mining, machine learning and statistics. My publications give a good picture of my research interests. Some specific problems and projects on which I have worked are listed below.
- World Wide Tables: The goal of this project is to answer table queries by tapping partially structured sources like tables and lists on the web.
- Information Extraction and data integration: Recently, I have been interested in graphical models and their use for various extraction and integration problems. As part of this effort, I have developed a package for Conditional Random Fields (CRF) that can be downloaded from SourceForge.
- ALIAS: This is a prototype of a fairly compelling application of machine learning techniques like Active Learning to ease the duplicate elimination task that arises in data cleaning.
- DATAMOLD: This is a tool for Information Extraction (more like text segmentation) using learning based on Hidden Markov Models. This software has been licensed by a data cleaning consulting company to solve real-life address cleaning tasks.
- ICube: This is a project on which I worked actively between 1999 and 2001. It is about enhanced mining of multidimensional OLAP products. A web demo of ICube is available.
- New data mining operations: I have worked on temporal data mining. I am currently interested in various multi-class, multi-label and multi-taxonomy learning problems.
- Database mining integration: I have worked on two different aspects of this problem: first, on algorithmic and architectural issues related to expressing association rule mining algorithms in a relational engine; second, on deploying learnt models within a relational engine so as to allow close integration with SQL querying and optimization.
- Some past projects (pre-1996): In the past I have worked on various problems related to multidimensional OLAP indexing and aggregation computation. My PhD thesis was on query optimization and scheduling for tertiary memory databases.
- Ancient projects (pre-1991): I got my first glimpse of research in computer science theory through search problems arising in rectangle cutting and packing.
WWT: A system for query-driven relation extraction from the semi-structured web
as author at 1st Workshop on Automated Knowledge Base Construction (AKBC), Grenoble 2010
Accurate Max-margin Training for Structured Output Spaces
as author at 25th International Conference on Machine Learning (ICML), Helsinki 2008
Open-domain Quantity Queries on Web Tables: Annotation, Response, and Consensus Models
as author at Research Sessions,