Friday, November 12, 2010

The AI set of functions

I recently read an article by Y. Bengio and Y. LeCun titled "Scaling Learning Algorithms towards AI". You can also find it as a book chapter in "Large-Scale Kernel Machines", L. Bottou, O. Chapelle, D. DeCoste, J. Weston (eds.), MIT Press, 2007.

In some respects it is an "opinion paper", in which the authors advocate for deep learning architectures and their vision of Machine Learning. However, I think the main message is extremely relevant. I was actually surprised to see how much it agrees with my own opinions.
Here is how I would summarize it:

- no learning algorithm can be completely universal, due to the "No free lunch theorem"
- that's not such a big problem: we don't care about the set of all possible functions
- we care about the "AI set", which contains the functions useful for vision, language, reasoning, etc.
- we need to create learning algorithms with an inductive bias towards the AI set
- the models should "efficiently" represent the functions of interest, in terms of having low Kolmogorov complexity
- researchers have exploited the "smoothness" prior extensively with non-parametric methods. However, many manifolds of interest have strong local variations (a toy illustration follows after this list).
- we need to explore other types of priors, more appropriate to the AI set.
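
To make the last two points more concrete, here is a toy sketch (my own illustration with assumed numbers, not an experiment from the paper) of how a purely local, smoothness-based learner fails on a highly varying target such as the d-bit parity function: the nearest neighbour of almost any input differs in a single bit and therefore carries the opposite label.

```python
# Toy illustration (mine, not the paper's): 1-nearest-neighbour relies on a
# smoothness prior, but for d-bit parity every Hamming neighbour of an input
# has the opposite label, so generalization to unseen inputs is below chance.
import numpy as np

rng = np.random.default_rng(0)
d = 12                                        # number of input bits
X_all = ((np.arange(2 ** d)[:, None] >> np.arange(d)) & 1).astype(float)
y_all = X_all.sum(axis=1) % 2                 # parity label of each bit vector

# Train on a random half of the input space, test on the other half.
perm = rng.permutation(2 ** d)
train, test = perm[: 2 ** (d - 1)], perm[2 ** (d - 1):]

def one_nn_predict(x):
    # Predict with the nearest training example in Hamming distance.
    dist = np.abs(X_all[train] - x).sum(axis=1)
    return y_all[train][np.argmin(dist)]

acc = np.mean([one_nn_predict(x) == y for x, y in zip(X_all[test], y_all[test])])
print(f"1-NN accuracy on unseen parity inputs: {acc:.2f}")  # close to 0.0
```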

The authors then give two examples of such "broad" priors: the sharing of weights in convolutional networks (inspired by translation invariance in vision) and the use of multi-layer architectures (which can be seen as levels of increasing abstraction).
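
As a quick back-of-the-envelope check on the first of these priors, here is a small sketch (my own numbers, assuming a LeNet-like layer purely for illustration) of how much weight sharing shrinks the number of free parameters compared with a fully connected mapping of the same size.

```python
# Back-of-the-envelope comparison (my own illustrative numbers): mapping a
# 32x32 input to six 28x28 feature maps with shared 5x5 convolution kernels
# versus a dense (fully connected) layer with the same input/output sizes.
in_h = in_w = 32                     # input image size
k = 5                                # convolution kernel size
n_maps = 6                           # number of output feature maps
out_h = out_w = in_h - k + 1         # 28x28 output per map ("valid" convolution)

dense_params = (in_h * in_w) * (out_h * out_w * n_maps)  # every pair of units connected
conv_params = n_maps * (k * k + 1)                       # one shared kernel + bias per map

print(f"fully connected: {dense_params:,} parameters")   # 4,816,896
print(f"convolutional:   {conv_params:,} parameters")    # 156
```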

Of course, this is where many alternatives remain open! Many other useful inductive biases could be found. That's where I think we should focus our research efforts! :)

Monday, November 8, 2010

Tutorial: handwritten digit recognition with convolutional neural networks

I recently added to my webpage a tutorial on how to use the torch5 library to train a convolutional neural network for the task of handwritten digit recognition.
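
For readers who just want a feel for the kind of model the tutorial trains, here is a minimal LeNet-style sketch in Python/PyTorch. This is my own rough analogue for illustration only, not the torch5 (Lua) code from the tutorial, and the layer sizes are assumptions.

```python
# Minimal LeNet-style sketch (my own analogue in PyTorch, NOT the tutorial's
# torch5/Lua code; layer sizes are illustrative assumptions).
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5),   # 1x28x28 -> 6x24x24 feature maps
    nn.Tanh(),
    nn.MaxPool2d(2),                  # 6x24x24 -> 6x12x12
    nn.Conv2d(6, 16, kernel_size=5),  # 6x12x12 -> 16x8x8
    nn.Tanh(),
    nn.MaxPool2d(2),                  # 16x8x8 -> 16x4x4
    nn.Flatten(),
    nn.Linear(16 * 4 * 4, 10),        # scores for the 10 digit classes
)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# One training step on a stand-in batch (real code would load MNIST images).
images = torch.randn(8, 1, 28, 28)    # fake 28x28 grayscale digits
labels = torch.randint(0, 10, (8,))   # fake digit labels
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(f"loss after one step: {loss.item():.3f}")
```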

Saturday, October 23, 2010

NYC Machine Learning Symposium 2010

The event took place yesterday at the New York Academy of Sciences, a building right next to the World Trade Center. The views from the 40th floor were breathtaking:


The names of the participants in the room were no less impressive (in no particular order): Corinna Cortes (Google), Rob Schapire and David Blei (Princeton University), John Langford and Alex Smola (Yahoo), Yann LeCun (NYU), Sanjoy Dasgupta (UC San Diego), Michael Collins (MIT), Patrick Haffner (AT&T), among many others.
I particularly liked seeing the latest developments in LeCun's group, including a demo by Benoit Corda and Clément Farabet on speeding up Convolutional Neural Networks with GPUs and FPGAs.
Alex Kulesza and Ben Taskar presented nice work on "Structured Determinantal Point Processes", which can be seen as a probabilistic model with a bias towards diversity in the hidden structures.
Matthew Hoffman (with D. Blei and F. Bach) used stochastic gradient descent (widely used in the neural network community) for the online training of topic models. Sean Gerrish and D. Blei had an amusing application of topic models to the prediction of votes by Senators!
I was also happy to see some Machine Learning being applied to sustainability and environmental problems. Gregory Moore and Charles Bergeron had a poster on trash detection in lakes, rivers and oceans.
To conclude, the best student paper award went to a more theoretical paper by Kareem Amin, Michael Kearns and Umar Syed (U Penn) called "Parametric Bandits, Query Learning, and the Haystack Dimension", which defines a measure of complexity for multi-armed bandit problems in which the number of actions can be infinite (there is some analogy to the role of VC-dimension in other learning models).

There were probably many other interesting posters worth being mentioned, but I didn't have the chance to check them all!

On the personal side: my summer internship at NEC Labs with David Grangier is about to finish. It was an amazing learning experience and I am very grateful for it.
Next step: back to Idiap Research Institute, EPFL and all the Swiss lakes and mountains! :)


Tuesday, July 6, 2010

Machine Learning recent sites

In the last few months (during which I haven't posted on this blog), a few interesting web platforms related to Machine Learning popped up, most notably:

MLcomp.org - you can upload your datasets and/or your algorithms, and experiments will run automatically. You can then see statistics on classifier performance and computation time. It is intended to help researchers and practitioners compare different methods, and it works as a collaborative platform where code and data can be shared.

MetaOptimize.com - it hosts a great Q&A site about Machine Learning and related topics, in the style of the platform StackOverflow uses for programming questions.

I find these two websites a great way to improve collaboration within the ML community. Highly recommended!

The last link is more market-oriented, and it comes from Google:

Google Predict: it puts together well-established ML algorithms in an API that developers can use to make predictions on their own datasets.