19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), Chicago 2013

Machine Learning in Health Informatics: Making Better Use of Domain Experts

author: Byron C. Wallace, Brown Laboratory for Linguistic Information Processing, Brown University
published: Sept. 27, 2013, recorded: August 2013

Description

We present novel machine learning and data mining methods that make real-world learning systems more efficient. We focus on the domain of clinical informatics, an archetypical example of a field overwhelmed with information. Due to properties inherent to clinical informatics tasks – and indeed, to many tasks that require specialized domain knowledge – ‘off-the-shelf’ machine learning technologies generally perform poorly in this domain.

If machine learning is to be successful in clinical science, novel methods must be developed to: mitigate the effects of class imbalance during model induction; exploit the wealth of domain knowledge that highly skilled experts bring to the task; and induce better models with less effort (fewer labels). We present new machine learning methods that address each of these issues, and demonstrate their efficacy in the task of abstract screening. In particular, we develop new theoretical perspectives on class imbalance, novel methods for exploiting dual supervision (i.e., labels on both instances and features), and new active learning techniques that address issues inherent to real-world applications (e.g., exploiting multiple experts in tandem). Each of these contributions aims to squeeze better classification performance out of fewer labels, thereby making better use of domain experts’ time and expertise.
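For readers unfamiliar with the terms, the sketch below illustrates two standard textbook baselines that the ideas of class imbalance and active learning build on: inverse-frequency class weights and uncertainty sampling. These are not the new methods presented in the lecture; the function names and the 0.5 decision threshold are assumptions chosen for illustration only.

```python
def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count),
    so that rare classes (e.g. relevant abstracts) count for more."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


def most_uncertain(probabilities, batch_size):
    """Uncertainty sampling: return the indices of the unlabeled items
    whose predicted probability of relevance is closest to 0.5.
    These are the items an expert would be asked to label next."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: abs(probabilities[i] - 0.5))
    return ranked[:batch_size]


# Hypothetical toy data: one relevant abstract among four labeled ones,
# and model scores for four unlabeled abstracts.
weights = inverse_frequency_weights([1, 0, 0, 0])
to_label = most_uncertain([0.9, 0.45, 0.1, 0.6], batch_size=2)
```

In this toy setting the single relevant example receives a larger weight than each irrelevant one, and the two abstracts with scores nearest 0.5 are the ones sent to the expert.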

The immediate aim of this work is to reduce the workload involved in conducting systematic reviews, and to this end we demonstrate that the developed methods can reduce reviewer workload by more than half, without sacrificing the comprehensiveness of reviews (i.e., without missing any relevant published evidence). But this is only an exemplary task; the approaches presented here have wider application to many real-world learning problems, i.e., those that require specialized expertise, exhibit class imbalance (and asymmetric costs) and for which limited human resources are available. We show that the methods we have developed bring substantial improvements over existing machine learning approaches in terms of inducing better models with less effort.
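The two quantities in this claim, workload reduction and comprehensiveness, can be made concrete with a small calculation. The sketch below assumes one plausible reading: workload reduction is the fraction of citations a reviewer never has to screen, and comprehensiveness is recall over the truly relevant citations. The function name, argument names, and the toy data are hypothetical, and the lecture may define these measures differently.

```python
def screening_summary(is_relevant, screened_by_human):
    """Summarize a semi-automated screening run.

    is_relevant[i]       -- True if citation i is truly relevant
    screened_by_human[i] -- True if a reviewer manually screened citation i
    Returns (workload_reduction, recall).
    """
    total = len(is_relevant)
    n_screened = sum(screened_by_human)
    n_relevant = sum(is_relevant)
    # A relevant citation is only found if a human actually screened it.
    found = sum(1 for r, s in zip(is_relevant, screened_by_human) if r and s)
    workload_reduction = 1 - n_screened / total
    recall = found / n_relevant if n_relevant else 1.0
    return workload_reduction, recall


# Hypothetical toy review: 10 citations, 2 relevant, 4 screened by hand.
relevant = [True, False, False, True, False,
            False, False, False, False, False]
screened = [True, True, False, True, True,
            False, False, False, False, False]
reduction, recall = screening_summary(relevant, screened)
```

Under these assumptions, a run that meets the stated goal would show a workload reduction above 0.5 together with a recall of 1.0, as the toy data does.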
