Which Supervised Learning Method Works Best for What? An Empirical Comparison of Learning Methods and Metrics

author: Rich Caruana, Cornell University
published: Feb. 25, 2007, recorded: May 2006, views: 31314

Description

Decision trees are intelligible, but do they perform well enough that you should use them? Have SVMs replaced neural nets, or are neural nets still best for regression and SVMs best for classification? Boosting maximizes margins much as SVMs do, but can boosting compete with SVMs? And if it can, is it better to boost weak models, as theory might suggest, or to boost stronger models? Bagging is simpler than boosting, so how well does bagging stack up against boosting? Breiman said Random Forests are better than bagging and as good as boosting. Was he right? And what about old friends like logistic regression, KNN, and naive Bayes? Should they be relegated to the history books, or do they still fill important niches?
In this talk we compare the performance of ten supervised learning methods on nine criteria: Accuracy, F-score, Lift, Precision/Recall Break-Even Point, Area under the ROC Curve, Average Precision, Squared Error, Cross-Entropy, and Probability Calibration. The results show that no single learning method does it all, but some methods can be "repaired" so that they do very well across all performance metrics. In particular, we show how to obtain the best probabilities from max-margin methods such as SVMs and boosting via Platt's Method and isotonic regression. We then describe a new ensemble method that combines selected models from these ten learning methods to yield much better performance. Although these ensembles perform extremely well, they are too complex for many applications. We'll describe what we're doing to try to fix that. Finally, if time permits, we'll discuss how the nine performance metrics relate to each other, and which of them you probably should (or shouldn't) use.
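As a rough illustration of the two calibration methods named above, here is a minimal pure-Python sketch. It is not the code used in the talk. The function names, the toy scores, and the use of plain gradient descent are assumptions made for illustration; Platt's original procedure uses a different optimizer and smoothed targets, and the sign convention of the sigmoid here is chosen for readability. Isotonic regression is fitted with the pool-adjacent-violators algorithm and returned as a step function.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_platt(scores, labels, lr=0.1, steps=2000):
    """Fit p = sigmoid(a * score + b) by gradient descent on log loss.

    Simplification (assumption): no target smoothing, fixed step size.
    """
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def fit_isotonic(scores, labels):
    """Pool-adjacent-violators: returns [(max_score_in_block, probability)]."""
    blocks = []  # each block is [sum_of_labels, count, max_score]
    for s, y in sorted(zip(scores, labels)):
        blocks.append([float(y), 1, s])
        # Merge backwards while the block means are not strictly increasing.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] >= blocks[-1][0] / blocks[-1][1]):
            total, count, top = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = top
    return [(top, total / count) for total, count, top in blocks]

def predict_isotonic(model, score):
    # Step function: use the first block whose upper score bound covers `score`.
    for top, value in model:
        if score <= top:
            return value
    return model[-1][1]

# Hypothetical uncalibrated margins (e.g. from an SVM) and true 0/1 labels.
scores = [-3, -2, -1, -0.5, 0.5, 1, 2, 3]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

a, b = fit_platt(scores, labels)
iso = fit_isotonic(scores, labels)
print(sigmoid(a * 2 + b), predict_isotonic(iso, 2))
```

In practice both mappings would be fitted on a held-out calibration set rather than on the data used to train the model. Platt's Method assumes a sigmoid-shaped distortion, while isotonic regression only assumes the mapping is monotonic.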
During this talk I'll briefly describe the learning methods and performance metrics to help make the lecture accessible to non-specialists in machine learning.
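For readers new to the metrics, the sketch below computes four of the nine criteria on a tiny made-up example. The function names, the 0.5 threshold, and the data are assumptions for illustration only, not the definitions or code from the talk. It shows one reason the metrics can disagree: accuracy looks only at which side of a threshold each prediction falls, ROC area looks only at the ordering of the predictions, and squared error and cross-entropy look at the predicted values themselves.

```python
import math

def accuracy(probs, labels, threshold=0.5):
    # Fraction of cases where the thresholded prediction matches the label.
    hits = sum((p >= threshold) == bool(y) for p, y in zip(probs, labels))
    return hits / len(labels)

def squared_error(probs, labels):
    # Mean squared difference between predicted probability and 0/1 label.
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)

def cross_entropy(probs, labels, eps=1e-12):
    # Mean negative log-likelihood; eps guards against log(0).
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)

def roc_area(probs, labels):
    # Probability that a random positive is ranked above a random negative
    # (ties count one half). Quadratic time, which is fine for a sketch.
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predictions and true labels.
probs = [0.9, 0.8, 0.3, 0.6, 0.2]
labels = [1, 1, 0, 0, 1]

print(accuracy(probs, labels))
print(squared_error(probs, labels))
print(cross_entropy(probs, labels))
print(roc_area(probs, labels))
```

Applying any strictly increasing transformation to `probs` leaves `roc_area` unchanged but can change the other three values.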

Download slides: solomon_caruana_wslmw_01.pdf (1.1 MB)


Reviews and comments:

Comment1 Dan, July 11, 2007 at 10:48 p.m.:

Very well-delivered talk and very useful, combined with the slides. The inclusion of background on some of the techniques is useful and helps contextualise the comparisons nicely.

(The "View slides" option makes it a bit difficult to read the text in the tables, by the way.)


Comment2 KEBI_Weiwei, May 27, 2008 at 8:26 p.m.:

Nice talk!

The link to the slides is wrong, btw.


Comment3 Jonathan, March 13, 2009 at 9:12 a.m.:

The correct slides link is: http://www.cs.cornell.edu/courses/cs6...


Comment4 LIAR, July 14, 2009 at 11:01 p.m.:

*Probably* interesting...

Unfortunately, I'm not on Windows, *like many other people*, so neither the slides nor the video can be viewed without using billions of tricks... So, I haven't watched anything yet, since I don't have time to play with the site's sources...

It's a pity that a site that promotes free diffusion of knowledge is unable to use free formats, so that anyone can simply have access to the information...


Comment5 Jamie Olson, July 31, 2009 at 6:10 p.m.:

Great show. It seems like some variation on "decision trees FTW!" is a recurring theme when people do these large-scale experiments.

@LIAR I'm not sure what your problem is. The link to the slides was provided as a pdf. Even if it were a .ppt you shouldn't have trouble viewing that unless you're CLI-only using lynx, in which case you're crazy.


Comment6 NAIVE LIAR, August 8, 2014 at 1:34 p.m.:

Hello LIAR, I use a Mac and it works fine with Firefox if it does not work with Safari.


Comment7 James, November 18, 2014 at 12:24 a.m.:

Confirmed broken on Safari, but working on Chrome.


Comment8 Kasey Sk, August 22, 2020 at 10:31 a.m.:

Really opened my thinking horizons. Thanks for the lecture https://vidmate.bet/


