Moving from Anomalies to Known Phenomena
published: Nov. 7, 2016, recorded: August 2016, views: 982
Basic anomaly detection finds data points that are unusual relative to an expected distribution, and there are good methods for defining expected distributions and quantifying deviation from them. However, using anomaly detectors in operational systems has proven to be a challenge. Naive flagging of anomalous points can produce an overwhelming number of false positives, and can result in detection thresholds being adjusted to the point of effectively turning off the anomaly detector.
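To make the false-positive problem concrete, here is a minimal Python sketch, not taken from the talk. It assumes the expected distribution is a standard Gaussian and that a point is flagged when its absolute z-score exceeds a cutoff; the function name `flag_points` and the cutoffs are illustrative choices. Because the data are drawn from the expected distribution itself, every flagged point is a false positive.

```python
import random

def flag_points(values, mean, std, z_threshold):
    """Return the indices of values whose |z-score| exceeds z_threshold."""
    return [i for i, v in enumerate(values)
            if abs(v - mean) / std > z_threshold]

rng = random.Random(0)
# Data drawn from the expected distribution: nothing here is a real anomaly.
values = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

for z in (2.0, 3.0, 4.0):
    flags = flag_points(values, 0.0, 1.0, z)
    print(f"|z| > {z}: {len(flags)} of {len(values)} points flagged")
```

For a Gaussian, about 4.6% of points exceed two standard deviations and about 0.27% exceed three, so a stream of this size yields hundreds of alerts at the three-sigma cutoff even when nothing is wrong. Tightening the cutoff suppresses those alerts but also hides moderate real deviations.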
I will discuss two improvements for anomaly detection systems. The first is to observe that we are often not interested in individually anomalous data points. Instead, there are underlying phenomena we want to identify, and those phenomena affect large groups of data points. Anomaly detection over sets of data can produce more significant results. Second, anomaly detection should be part of a closed-loop process that generates additional labeled data, updated models, and ever fewer detections, thus reducing load on the user. I will illustrate these ideas with some scientific and commercial use cases.
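The first idea, detection over sets rather than single points, can be sketched in Python as follows. This is an illustration under stated assumptions and not the speaker's method: the expected distribution is taken to be a standard Gaussian, the underlying phenomenon is modeled as a small mean shift of 0.5 standard deviations affecting a whole group, and the group statistic is the z-score of the group mean. The name `group_z_score` and all numeric values are hypothetical.

```python
import math
import random

def group_z_score(group, mean, std):
    """z-score of the group mean under the expected distribution."""
    n = len(group)
    group_mean = sum(group) / n
    # The standard error of the mean shrinks as 1 / sqrt(n).
    return (group_mean - mean) / (std / math.sqrt(n))

rng = random.Random(1)
# A group of 400 points affected by one phenomenon: a 0.5-sigma mean shift.
shifted = [rng.gauss(0.5, 1.0) for _ in range(400)]

# Point-wise test: how many members look anomalous on their own?
point_flags = sum(1 for v in shifted if abs(v) > 3.0)
# Set-wise test: how unusual is the group as a whole?
z_group = group_z_score(shifted, 0.0, 1.0)

print(f"points beyond 3 sigma: {point_flags} of {len(shifted)}")
print(f"group z-score: {z_group:.1f}")
```

Individually, almost none of the shifted points cross a three-sigma cutoff, yet the group mean sits many standard errors from its expected value, since the evidence accumulates with the square root of the group size. In a closed-loop setting, a group confirmed by the user could then be stored as a labeled example and used to update the model.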