Experiment Databases for Machine Learning / BenchMarking Via Weka
published: Dec. 20, 2008, recorded: December 2008, views: 6244
Description
Experiment Databases for Machine Learning
Experiment Databases for Machine Learning is a large public repository of machine learning experiments, as
well as a framework for producing similar databases for specific goals. This project aims to bring together the
information contained in many machine learning experiments and organize it in a way that allows everyone
to investigate how learning algorithms have performed in previous studies. To share such information with
the world, a common language is proposed, dubbed ExpML, which captures the basic structure of a large range
of machine learning experiments while remaining open to future extensions. This language also enforces
reproducibility by requiring links to the datasets and algorithms used and by storing all details of the
experiment setup. All stored information can then be accessed by querying the database, which provides a powerful
way to collect and reorganize the data and allows a thorough examination of the stored results.
The current publicly available database contains over 500,000 classification and regression experiments and
offers both an online interface, at http://expdb.cs.kuleuven.be, and a stand-alone explorer tool with
various visualization techniques. The framework can also be integrated into machine learning toolboxes to
automatically stream results to a global (or local) experiment database, or to download experiments that
have been run before.
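To make the idea of querying stored experiments concrete, here is a minimal sketch in Python using an in-memory SQLite database. The table layout, column names, algorithm and dataset labels, and accuracy values are all hypothetical and chosen only for illustration; they are not the schema of the actual experiment database, and the sketch does not use ExpML.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical 'experiment' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE experiment ("
        " id INTEGER PRIMARY KEY,"
        " algorithm TEXT NOT NULL,"
        " dataset TEXT NOT NULL,"
        " accuracy REAL NOT NULL)"
    )
    # Made-up results, purely illustrative.
    rows = [
        ("algo_a", "data_1", 0.80),
        ("algo_a", "data_2", 0.75),
        ("algo_b", "data_1", 0.90),
        ("algo_b", "data_2", 0.70),
    ]
    conn.executemany(
        "INSERT INTO experiment (algorithm, dataset, accuracy) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn


def mean_accuracy_by_algorithm(conn):
    """Return {algorithm: mean accuracy over all stored experiments}."""
    cur = conn.execute(
        "SELECT algorithm, AVG(accuracy) FROM experiment "
        "GROUP BY algorithm ORDER BY algorithm"
    )
    return {name: avg for name, avg in cur.fetchall()}


def best_algorithm_per_dataset(conn):
    """Return {dataset: algorithm with the highest stored accuracy}."""
    cur = conn.execute(
        "SELECT e.dataset, e.algorithm FROM experiment AS e "
        "WHERE e.accuracy = (SELECT MAX(accuracy) FROM experiment "
        "                    WHERE dataset = e.dataset) "
        "ORDER BY e.dataset"
    )
    return {dataset: algorithm for dataset, algorithm in cur.fetchall()}


if __name__ == "__main__":
    conn = build_demo_db()
    print(mean_accuracy_by_algorithm(conn))
    print(best_algorithm_per_dataset(conn))
```

The point of the sketch is that once results sit in a queryable store, the same data can be regrouped by algorithm, by dataset, or by any other recorded property without rerunning the experiments.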
BenchMarking Via Weka
BenchMarking Via Weka is a client-server architecture that supports interoperability between different machine
learning systems. Machine learning systems need to provide mechanisms for processing data and
evaluating generated models. In our system, the server hosts all the data and performs all the statistical
analyses, while the client performs all the pre-processing and model building. This separation of tasks
opens up the possibility of offering a cross-platform and cross-language framework. By performing statistical
analyses on the host, we avoid unnecessary exchange and conversion of generated results.
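The division of labour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, method names, toy dataset, and the majority-class "model" are hypothetical, the client and server run in the same process with no network layer, and nothing here reflects the actual BenchMarking Via Weka protocol or the Weka API.

```python
from collections import Counter
from statistics import mean


class BenchmarkServer:
    """Hypothetical server: hosts the data and performs the evaluation."""

    def __init__(self, datasets):
        # datasets: name -> {"train": [(features, label)], "test": [(features, label)]}
        self._datasets = datasets
        self._accuracies = {}  # client_id -> list of accuracies

    def get_training_data(self, name):
        return list(self._datasets[name]["train"])

    def get_test_features(self, name):
        # Test labels stay on the server; the client only sees features.
        return [features for features, _ in self._datasets[name]["test"]]

    def submit_predictions(self, client_id, name, predictions):
        labels = [label for _, label in self._datasets[name]["test"]]
        if len(predictions) != len(labels):
            raise ValueError("one prediction per test instance is required")
        correct = sum(p == y for p, y in zip(predictions, labels))
        accuracy = correct / len(labels)
        self._accuracies.setdefault(client_id, []).append(accuracy)
        return accuracy

    def mean_accuracy(self, client_id):
        return mean(self._accuracies[client_id])


class MajorityClassClient:
    """Hypothetical client: builds a model locally and returns predictions."""

    def __init__(self, client_id):
        self.client_id = client_id

    def run(self, server, name):
        train = server.get_training_data(name)
        # "Model building": always predict the most frequent training label.
        majority = Counter(label for _, label in train).most_common(1)[0][0]
        predictions = [majority for _ in server.get_test_features(name)]
        return server.submit_predictions(self.client_id, name, predictions)


if __name__ == "__main__":
    toy = {
        "toy": {
            "train": [((0,), "a"), ((1,), "a"), ((2,), "b")],
            "test": [((3,), "a"), ((4,), "b")],
        }
    }
    server = BenchmarkServer(toy)
    client = MajorityClassClient("client_1")
    print(client.run(server, "toy"))
```

Because the server alone computes the accuracy, a client written in any language would only need to exchange data and predictions, which is the property the abstract attributes to the separation of tasks.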
See Also:
Download slides: mloss08_reutemann_experiment_databases.pdf (1.2 MB)
Download slides: mloss08_reutemann_weka.pdf (302.6 KB)