Two architectures for one-shot learning
published: Oct. 6, 2014, recorded: December 2013, views: 186
People can learn a new visual concept almost perfectly from just a single example, yet machine learning algorithms typically require hundreds of examples to perform similarly. Humans can also use their learned concepts in richer ways than conventional machine learning systems: to parse objects into parts, generate new instances, recognize abstract types of concepts and even imagine novel concepts of a given type. I will discuss two computational approaches we have explored that aim to capture these human learning abilities. The first (with Ruslan Salakhutdinov and Antonio Torralba) is the hierarchical-deep or "HD" architecture, which learns a hierarchical nonparametric Bayesian model (specifically, a hierarchical Dirichlet process topic model) on top of a deep Boltzmann machine feature extractor. This approach is appealingly general, but I will argue it is insufficiently structured to capture the conceptual knowledge humans learn in real-world domains. I will then present a new architecture (with Brendan Lake and Ruslan Salakhutdinov) that represents concepts as simple programs, embodying basic principles of hierarchy, compositionality and causality, and constructs programs that best explain observed examples under a Bayesian criterion. This approach requires more domain-specific engineering in choosing the form of the programs to be learned, but it is still very flexible. On a challenging one-shot classification task, the Bayesian program learner is the first to achieve human-level performance, substantially outperforming a range of other approaches including both deep Boltzmann machines and our HD models. I will also illustrate several "visual Turing test" experiments probing the program learning model's more creative parsing, generalization and generation abilities, and show that in many cases its performance is indistinguishable from that of humans.
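To make the one-shot classification setting concrete, here is a minimal sketch of choosing a label by a Bayesian score when each class has exactly one training example. It is an illustration only, not the program-learning model from the talk: the isotropic Gaussian likelihood stands in for the learned generative model, and the names `log_likelihood`, `one_shot_classify`, `sigma` and the toy feature vectors are all assumptions made for this example.

```python
import math


def log_likelihood(test, exemplar, sigma=1.0):
    """Log density of `test` under an isotropic Gaussian centred on `exemplar`.

    Assumption: a stand-in likelihood. The talk's model instead scores how well
    a program inferred from the exemplar explains the test item.
    """
    sq_dist = sum((t - e) ** 2 for t, e in zip(test, exemplar))
    dim = len(test)
    return -0.5 * sq_dist / sigma**2 - 0.5 * dim * math.log(2 * math.pi * sigma**2)


def one_shot_classify(test, exemplars, log_prior=None):
    """Return (best_label, scores), maximising log P(test | exemplar) + log P(label).

    `exemplars` maps each label to its single training example.
    `log_prior` optionally maps each label to a log prior; uniform if omitted.
    """
    scores = {}
    for label, exemplar in exemplars.items():
        prior = 0.0 if log_prior is None else log_prior[label]
        scores[label] = log_likelihood(test, exemplar) + prior
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Hypothetical toy data: one 2-D feature vector per class.
exemplars = {"A": [0.0, 0.0], "B": [3.0, 3.0]}
label, scores = one_shot_classify([0.5, -0.2], exemplars)
print(label)
```

With a uniform prior this reduces to picking the nearest exemplar; the point of the Bayesian framing is that a richer likelihood or an informative prior can be swapped in without changing the decision rule.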