UCSD
Here is a video of Dave Rumelhart giving a very similar lecture.
- Rumelhart, D.E., Hinton, G.E., & Williams, R.J. (1986) Learning internal representations by error propagation. In Parallel Distributed Processing: Explorations in the Microstructure of Cognition, D.E. Rumelhart, J.L. McClelland, and the PDP Group. pdf
- Tong, M.H., Joyce, C.A., and Cottrell, G.W. (2008) Why is the fusiform face area recruited for novel categories of expertise? A neurocomputational investigation. Brain Research 1202:14-24. pdf
Stanford
- Olshausen, B.A. & Field, D.J. (1996) Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381:607-609. pdf
- Rajat Raina, Alexis Battle, Honglak Lee, Benjamin Packer, Andrew Y. Ng (2007) Self-taught Learning: Transfer Learning from Unlabeled Data. In International Conference on Machine Learning. pdf
- PLUS!: Andrew's tutorial on deep networks and relevant papers is here.
U. Montreal
- Yoshua Bengio (2012) Evolving Culture vs Local Minima. arXiv:1203.2990v1 [cs.LG] 14 Mar 2012 pdf
- Dumitru Erhan, Yoshua Bengio, Aaron Courville, Pierre-Antoine Manzagol, Pascal Vincent, Samy Bengio (2010) Why Does Unsupervised Pre-training Help Deep Learning? Journal of Machine Learning Research 11:625-660 pdf
Courant Institute, NYU
- Yann LeCun, Koray Kavukcuoglu and Clément Farabet: Convolutional Networks and Applications in Vision, Proc. International Symposium on Circuits and Systems (ISCAS'10), IEEE, 2010 pdf
- Kevin Jarrett, Koray Kavukcuoglu, Marc'Aurelio Ranzato and Yann LeCun: What is the Best Multi-Stage Architecture for Object Recognition?, Proc. International Conference on Computer Vision (ICCV'09), IEEE, 2009 pdf
- Yoshua Bengio and Yann LeCun: Scaling learning algorithms towards AI, in Bottou, L. and Chapelle, O. and DeCoste, D. and Weston, J. (Eds), Large-Scale Kernel Machines, MIT Press, 2007. pdf
- http://yann.lecun.com/exdb/publis/index.html#lecun-iscas-10
University of Toronto
- Ruslan Salakhutdinov, Josh Tenenbaum & Antonio Torralba (2012) Learning to Learn with Compound Hierarchical-Deep Models. Neural Information Processing Systems (NIPS 25). pdf
- Ruslan Salakhutdinov and Geoffrey Hinton (2009) Deep Boltzmann Machines. In 12th International Conference on Artificial Intelligence and Statistics. pdf
NYU
- Graham W. Taylor, Geoffrey E. Hinton, Sam T. Roweis (2011) Two Distributed-State Models For Generating High-Dimensional Time Series. Journal of Machine Learning Research 12:1025-1068 pdf
University of Toronto
- Hinton, G. E., Krizhevsky, A. and Wang, S. (2011) Transforming Auto-encoders. ICANN-11: International Conference on Artificial Neural Networks, Helsinki. [pdf]
CU Boulder
- O'Reilly, R.C., Munakata, Y., Frank, M.J., Hazy, T.E., and contributors (2012) Learning. In O'Reilly, R.C., Munakata, Y., Frank, M.J., Hazy, T.E., and contributors, Computational Cognitive Neuroscience. Wiki Book, 1st Edition. URL: http://ccnbook.colorado.edu
- O'Reilly, R.C. (1996). Biologically Plausible Error-driven Learning using Local Activation Differences: The Generalized Recirculation Algorithm. Neural Computation, 8:895-938. pdf
UCSD
- Shan, Honghao, Zhang, Lingyun and Garrison W. Cottrell (2007) Recursive ICA. In Advances in Neural Information Processing Systems 20. MIT Press, Cambridge, MA. [pdf]
- Shan, H. and Cottrell, G.W. (2008) Looking around the back yard helps the recognition of faces and digits. In Computer Vision and Pattern Recognition (CVPR 2008). [pdf]