Brought to you by our collaborators ...
... with all websites still alive for post-challenge submissions!
Feature selection (NIPS 2003)
Seventy-five participants competed on five classification problems to make the best predictions and select the smallest possible subset of relevant input variables (features). The tasks include cancer diagnosis from mass-spectrometry data, handwritten digit recognition, text classification, and drug discovery.
[www] Challenge web site (data available) [Wsp] Workshop page [Resu] Result page [Code] Matlab software and course material [JMLR] Special issue on feature selection [Springer] Book edited (+data CD & code)
Performance prediction (WCCI 2006)
[www] Challenge web site (data available) [Wsp] WCCI 2006 wshop; NIPS 2006 wshop [Resu] Result page [Code] Matlab software [JMLR] Special topic on model selection [CiML] Book edited (free PDF of CiML vol 1)
Agnostic learning vs. prior knowledge, ALvsPK (IJCNN 2007)
This challenge had two tracks: the agnostic learning track and the prior knowledge track, corresponding to two versions of five datasets. The “agnostic track” data was preprocessed in a feature-based representation suitable for off-the-shelf machine learning packages. The “prior knowledge track” had raw data, not always in a feature representation, coming with information about the nature and source of the data. Can you do better with the raw data and prior knowledge about the task? How far can you get with pure “black box learning”?
[www] Challenge web site (data available) [Wsp] IJCNN 2007 workshop page [Resu] Results [Code] Matlab software (CLOP) [JMLR] Special topic on model selection [CiML] Book edited (free PDF of CiML vol 1)
Learning causal relationships (WCCI 2008 and NIPS 2008)
What affects your health? What affects the economy? What affects climate change? And which actions will have beneficial effects? This series of competitions challenged the participants to discover the causes of given effects, based on observational data. The datasets include re-simulation data from models closely resembling real systems, and real data for which the causal dependencies are known from experimental evidence. A first challenge on "causation and prediction" featuring 4 datasets (Genomics, Pharmacology, and Census data) was followed by a "pot-luck challenge" in which the participants exchanged tasks. Fifteen datasets are available to study causal problems.
[www] Challenge web site (data available) [Wsp] WCCI 2008 wshop; NIPS 2008 workshop [Resu] Results [Code] Causal explorer (Matlab) [JMLR] JMLR W&CP proceedings vol 3; JMLR W&CP proceedings vol 6 [CiML] Book edited (free PDF of CiML vol 2)
Fast scoring in a large database (KDD cup 2009)
Customer Relationship Management (CRM) is a key element of modern marketing strategies. The KDD Cup 2009 offered the opportunity to work on large marketing databases from the French telecom company Orange to predict the propensity of customers to switch provider (churn), to buy new products or services (appetency), or to buy upgrades or add-ons proposed to them to make the sale more profitable (up-selling). This challenge attracted over 450 participants from 46 countries.
[www] Challenge web site (data available) [Wsp] KDD cup 2009 workshop page [Resu] Results [Code] Matlab software (CLOP) [JMLR] JMLR W&CP proceedings vol 7 [Book] Book edited (free PDF of CiML vol 3)
Learning to rank challenge from Yahoo! labs (ICML 2010)
The datasets come from web search ranking and are a subset of what Yahoo! uses to train its ranking function. They consist of feature vectors extracted from query-URL pairs along with relevance judgments. The relevance judgments can take 5 different values from 0 (irrelevant) to 4 (perfectly relevant). The queries, URLs, and feature descriptions are not disclosed, only the feature values. The challenge, which ran from March 1 to May 31, attracted very large participation, with 4,736 submissions coming from 1,055 teams.
[www] Challenge web site (data available) [Wsp] ICML 2010 workshop page [Resu] Results [JMLR] JMLR W&CP proceedings vol 14
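As a rough illustration of the data layout described for this challenge (one feature vector per query-URL pair, with a relevance grade from 0 to 4), the Python sketch below groups rows by query, ranks each group with a scoring function, and evaluates the ranking with NDCG. The choice of NDCG as metric, the row layout `(query_id, features, relevance)`, and the names `mean_ndcg` and `score` are assumptions made for illustration; they are not taken from the challenge materials.

```python
import math
from collections import defaultdict


def dcg(relevances):
    """Discounted cumulative gain of relevance grades listed in ranked order."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances))


def ndcg(relevances):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


def mean_ndcg(rows, score):
    """Average NDCG over queries.

    rows  : iterable of (query_id, features, relevance), hypothetical layout
    score : function mapping a feature vector to a real-valued score
    """
    by_query = defaultdict(list)
    for query_id, features, relevance in rows:
        by_query[query_id].append((score(features), relevance))
    values = []
    for docs in by_query.values():
        # Rank the documents of one query by decreasing model score.
        ranked = [rel for _, rel in sorted(docs, key=lambda d: d[0], reverse=True)]
        values.append(ndcg(ranked))
    return sum(values) / len(values)


# Toy data: two queries, one-dimensional features, grades in 0..4.
toy_rows = [("q1", [0.9], 4), ("q1", [0.1], 0),
            ("q2", [0.5], 2), ("q2", [0.7], 3)]
print(mean_ndcg(toy_rows, lambda f: f[0]))
```

Grouping by query before sorting matters here: relevance grades are only comparable among the URLs retrieved for the same query.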
Active Learning Challenge (AISTATS 2010 and WCCI 2010)
Labeling data is expensive, but large amounts of unlabeled data are available at low cost. Such problems might be tackled from different angles: learning from unlabeled data or active learning. In the former case, the algorithms must make do with the limited amount of labeled data and capitalize on the unlabeled data with semi-supervised learning methods. In the latter case, the algorithms may place a limited number of queries to get labels. The goal in that case is to choose which data to label as effectively as possible, a problem referred to as active learning.
[www] Challenge website (data available) [Wsp] AISTATS 2010 wsp; WCCI 2010 wsp [Resu] Results [Code] Sample Matlab code [JMLR] JMLR W&CP proceedings vol 16 [Book] In press
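The active learning setting described for this challenge, in which a learner spends a limited budget of label queries, can be sketched with a toy pool-based loop in Python. This is only an illustration under strong assumptions: the data are one-dimensional, the classifier is a simple threshold, and the query rule is uncertainty sampling (ask for the point closest to the current boundary). The names `fit_threshold`, `active_learning`, `oracle`, and `demo`, as well as the hidden boundary at 0.37, are hypothetical and do not come from the challenge.

```python
import random


def fit_threshold(labeled):
    """Toy 1-D classifier: boundary midway between the largest labeled
    negative and the smallest labeled positive."""
    neg = [x for x, y in labeled if y == 0]
    pos = [x for x, y in labeled if y == 1]
    return (max(neg) + min(pos)) / 2.0


def active_learning(pool, oracle, seed_labeled, budget):
    """Pool-based loop: spend `budget` queries on the most uncertain points."""
    labeled = list(seed_labeled)
    seen = {x for x, _ in labeled}
    unlabeled = [x for x in pool if x not in seen]
    for _ in range(budget):
        if not unlabeled:
            break
        threshold = fit_threshold(labeled)
        # Uncertainty sampling: the point nearest the boundary is the
        # one the current classifier is least sure about.
        query = min(unlabeled, key=lambda x: abs(x - threshold))
        unlabeled.remove(query)
        labeled.append((query, oracle(query)))  # one label requested
    return fit_threshold(labeled), len(labeled)


def demo():
    rng = random.Random(0)
    pool = [rng.uniform(0.0, 1.0) for _ in range(1000)]
    true_boundary = 0.37  # hypothetical ground truth, hidden from the learner

    def oracle(x):
        return int(x > true_boundary)

    seed = [(0.0, 0), (1.0, 1)]  # one labeled example per class to start
    return active_learning(pool, oracle, seed, budget=15)


threshold, n_labels = demo()
print(threshold, n_labels)
```

In this toy case the query rule behaves much like a bisection search, so a handful of labels locates the boundary about as well as the pool allows; real problems with noisy labels and many features are considerably harder.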
Unsupervised and Transfer Learning Challenge, UTL
[www] Challenge web site (data available) [Wsp] ICML 2011 wshop; IJCNN 2011 wshop [Resu] Results [Code] Sample Matlab code [JMLR] Call for paper [Book] In preparation
Learn the rhythms, predict the musical scores (KDD cup 2011)
Yahoo! Music has amassed billions of user ratings for musical pieces. When properly analyzed, the raw ratings encode information on how songs are grouped, which hidden patterns link various albums, which artists complement each other, and above all, which songs users would like to listen to. The KDD Cup contest released over 300 million ratings performed by over 1 million anonymized users of Yahoo! The competition attracted more than 2000 contestants with about 1300 teams reaching the final stage of the competition.
[www] Challenge web site (data available) [Wsp] KDD cup 2011 workshop [Resu] Results [JMLR] Call for paper
One-Shot-Learning Gesture Challenge (in the news)
Humans are capable of recognizing patterns like hand gestures after seeing just one example. Can machines do that too?
We are organizing a challenge on gesture and sign language recognition from video data. We are mostly focusing on hand gestures, although facial expressions may enter into account. Applications include recognizing signals for man-machine communication, translating sign languages for the deaf to hearing people, and computer gaming.
[www] Challenge web site (data available) [round1][round2] [Wsp] CVPR2012 and ICPR 2012 [Resu] Round 1 (login:CVPR2012, password:papers) [Code] Sample code (Matlab) [JMLR] Call for papers [CiML]
Other resources:
Pascal challenges: The Pascal network is sponsoring several challenges in Machine learning.
Data mining competitions:
A list of data mining competitions maintained by KDnuggets, including the well known KDD cup.
NNGC: Neural Network Grand Challenge in time series forecasting.
Platforms hosting challenges:
Kaggle: Presently hosting the $3 million Heritage Health Prize.
Tunedit: A similar platform, more academically oriented.
Crowdsourcing:
Amazon Mechanical Turk: Lets you hire people from all around the world to solve your tasks. Used to label computer vision data.
Crowdflower: Hire people to collect, filter and enhance data.
Netflix: The 1 million dollar Netflix prize, which attracted a lot of attention and broke new ground for recommender systems.
Robocup: Robots that play soccer, in a contest held every year.
UCI machine learning repository: A great collection of datasets for machine learning research.
DELVE: A platform developed at the University of Toronto to benchmark machine learning algorithms.
CAMDA
Critical Assessment of Microarray Data Analysis, an annual conference on gene expression microarray data analysis. This conference includes a contest with emphasis on gene selection, a special case of feature selection.
ICDAR
International Conference on Document Analysis and Recognition, a biennial conference that features a contest in printed text recognition. Feature extraction/selection is a key component to win such a contest.
TREC
Text Retrieval Conference, organized every year by NIST. The conference is organized around the results of a competition. Past winners have had to address feature extraction/selection effectively.
ICPR
In conjunction with the International Conference on Pattern Recognition, ICPR 2004, a face recognition contest is being organized.
CASP
An important competition in protein structure prediction called Critical Assessment of Techniques for Protein Structure Prediction.
ICAPS competitions
Competitions in planning and knowledge engineering
ICMI competitions
Competitions on multimodal interaction
Data resources:
Computer vision datasets










