Mining the Web to Facilitate Fast and Accurate Approximate Match
author: Venkatesh Ganti, Microsoft Research
author: Dong Xin, Microsoft Research
published: May 20, 2009, recorded: April 2009, views: 4047
Description
Tasks that rely on recognizing entities have recently received significant attention in the literature. Many such tasks assume the existence of reference entity tables. In this paper, we consider the problem of determining whether a candidate string approximately matches a reference entity. This problem is important for extracting named entities, such as products or locations, using a reference entity table, and for matching entity entries across heterogeneous sources. Prior approaches have relied on string-based similarity measures, which compare only the candidate string with the entity it may match. In this paper, we observe that considering evidence for the match across multiple documents significantly improves matching accuracy. We develop efficient techniques that exploit web search engines to facilitate approximate matching in the context of our proposed similarity functions. In an extensive experimental evaluation, we demonstrate the accuracy and efficiency of our techniques.
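To make the contrast in the description concrete, here is a minimal sketch in Python. It is not the authors' similarity function, which this page does not specify. It assumes a token-level Jaccard score as a stand-in for "string-based similarity", and a hypothetical `document_evidence` score that counts how many documents mentioning the candidate also contain the entity tokens the candidate lacks. The function names, the tokenizer, and the sample documents are all illustrative assumptions; in the setting described above, the documents would come from a web search engine rather than a hard-coded list.

```python
import re


def tokens(text):
    """Lowercase alphanumeric tokens of a string, as a set (illustrative tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(candidate, entity):
    """String-only baseline: token overlap between the candidate and the entity."""
    a, b = tokens(candidate), tokens(entity)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def document_evidence(candidate, entity, documents):
    """Hypothetical evidence score (an assumption, not the paper's definition).

    Among documents that contain every candidate token, return the fraction
    that also contain every entity token missing from the candidate.
    """
    cand = tokens(candidate)
    missing = tokens(entity) - cand
    mentioning = [tokens(d) for d in documents if cand <= tokens(d)]
    if not mentioning:
        return 0.0
    supported = sum(1 for doc in mentioning if missing <= doc)
    return supported / len(mentioning)


if __name__ == "__main__":
    # Made-up sample documents, standing in for search results.
    docs = [
        "The Xbox 360 is a console made by Microsoft.",
        "Microsoft announced a new Xbox 360 console bundle.",
        "I bought an Xbox 360 game yesterday.",
    ]
    entity = "Microsoft Xbox 360 Console"
    print(jaccard("Xbox 360", entity))
    print(document_evidence("Xbox 360", entity, docs))
```

In this toy setup the string-only score for "Xbox 360" is modest because half of the entity's tokens are absent from the candidate, while the document-based score can be higher when the surrounding text supplies the missing tokens. That is the general intuition behind using evidence across documents, under the assumptions stated above.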
Reviews and comments:
Good content, but his English is sucking.