Deep Learning in 4000 Words

January 15, 2015

If you are building your personal knowledge base about smart software, I suggest you read “A Brief Overview of Deep Learning.” The write-up is accessible, which is not something I usually associate with outputs from Caltech wonks.

I highlighted this passage with my light blue marker:

In the old days, people believed that neural networks could “solve everything”. Why couldn’t they do it in the past? There are several reasons.

  • Computers were slow. So the neural networks of the past were tiny. And tiny neural networks cannot achieve very high performance on anything. In other words, small neural networks are not powerful.
  • Datasets were small. So even if it was somehow magically possible to train LDNNs [large, deep neural networks], there were no large datasets that had enough information to constrain their numerous parameters. So failure was inevitable.
  • Nobody knew how to train deep nets. Deep networks are important. The current best object recognition networks have between 20 and 25 successive layers of convolutions. A 2-layer neural network cannot do anything good on object recognition. Yet back in the day everyone was very sure that deep nets could not be trained with SGD, since that would have been too good to be true!

It’s funny how science progresses, and how easy training deep neural networks looks in retrospect.
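For the curious, here is what “easy” looks like today: a minimal sketch of training a network with a couple dozen stacked layers using plain SGD, the very recipe once dismissed as too good to be true. The framework (PyTorch), the layer sizes, and the synthetic data are my own illustrative assumptions, not anything from the cited write-up.

    # Minimal sketch: a several-layer network trained with plain SGD.
    # Framework, sizes, and data are illustrative assumptions.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    # Synthetic data: 512 samples, 32 features, 10 classes.
    X = torch.randn(512, 32)
    y = torch.randint(0, 10, (512,))

    # Stack several linear + ReLU layers, "deep" by 1990s standards.
    layers, width = [], 32
    for _ in range(6):
        layers += [nn.Linear(width, 64), nn.ReLU()]
        width = 64
    layers.append(nn.Linear(width, 10))
    model = nn.Sequential(*layers)

    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()

    for epoch in range(100):
        opt.zero_grad()
        loss = loss_fn(model(X), y)   # forward pass
        loss.backward()               # backpropagate through all layers
        opt.step()                    # one SGD update
    print(f"final loss: {loss.item():.3f}")

Nothing fancy: full-batch gradient descent on synthetic data, and the loss drops. The point is that what was once considered impossible now fits in a short script.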

Highly recommended.

Stephen E Arnold, January 15, 2015
