But what is a neural network? Deep Learning Chapter 1

This is part of a megapost about the 3Blue1Brown series on deep learning.

Overview

This is intended as a lightweight introduction to the topic. The motivating example is the very hard task of programmatically classifying handwritten digits.

There are many variants of neural networks.

Plain vanilla – Multilayer perceptron

Neuron: A thing that holds a number in \([0,1]\). The number inside the neuron is called ‘activation’.

The first layer of the network has 784 neurons, one per pixel of the 28×28 input image.

The last layer has only 10 neurons, one for each digit from 0 to 9. The most active one is the network’s answer.

In between there are 2 hidden layers with 16 neurons each.
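To get a feeling for how many dials such a network has, here is a small sketch that counts them. It assumes a fully connected network (every neuron is connected to every neuron of the previous layer) and already counts the weights and biases that are introduced further down. The function name `count_parameters` is made up for this note.

```python
# Layer sizes from the text: 784 inputs, two hidden layers of 16, 10 outputs.
layer_sizes = [784, 16, 16, 10]

def count_parameters(sizes):
    """Return (number of weights, number of biases) for a fully connected net."""
    # One weight per connection between two neighbouring layers.
    weights = sum(n_in * n_out for n_in, n_out in zip(sizes, sizes[1:]))
    # One bias per neuron, except for the input layer.
    biases = sum(sizes[1:])
    return weights, biases

weights, biases = count_parameters(layer_sizes)
print(weights, biases, weights + biases)  # 12960 42 13002
```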

The Component Analogy

We hope that each hidden layer represents components of the digits. (Note: this always seems to be presented as a claim; I have never seen a proof of it.)

The analogy goes further. We dissect the circles into single edges, and those into pixels (or the other way around, seen from input to output).

How to design the activation flow

Pixels -> Edges -> Patterns -> Digits.

The task at hand is: which dials have to be turned to reliably recognize a pattern?

Take all the activations of the first layer and compute their weighted sum.

\( \text{sumOfNeuron}_X = w_1 a_1 + w_2 a_2 + \dots + w_n a_n \)
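The weighted sum above, written out as a minimal sketch. The weights and activations are made-up numbers just to have something to compute; `weighted_sum` is a name chosen for this note.

```python
def weighted_sum(weights, activations):
    """Compute w_1*a_1 + w_2*a_2 + ... + w_n*a_n."""
    if len(weights) != len(activations):
        raise ValueError("need exactly one weight per activation")
    return sum(w * a for w, a in zip(weights, activations))

# Hypothetical values: three activations from the previous layer
# and one weight per connection.
activations = [1.0, 0.5, 0.0]
weights = [2.0, -1.0, 0.5]

print(weighted_sum(weights, activations))  # 2.0 - 0.5 + 0.0 = 1.5
```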

Activation function (per neuron)

Sigmoid!!! (I like that curve)

Squashes any input (the weighted sum) into an output between 0 and 1: very negative inputs end up close to 0, very positive inputs close to 1.
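A small sketch of the curve, assuming the usual logistic definition \( \sigma(x) = \frac{1}{1 + e^{-x}} \). Splitting on the sign of `x` is just one way to keep `math.exp` from overflowing for large inputs; it is not part of the definition.

```python
import math

def sigmoid(x):
    """Logistic function 1 / (1 + e^(-x)); mathematically between 0 and 1."""
    # Branch on the sign so math.exp never gets a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

# Very negative inputs land near 0, very positive ones near 1.
for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(x, round(sigmoid(x), 4))
```

At `x = 0` the function returns exactly 0.5, the midpoint of the curve.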

Bias (per neuron)

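Maybe the neuron should not become active as soon as the weighted sum is positive, but only once it exceeds some threshold. Then we add a negative bias to the weighted sum before applying the sigmoid. With a bias of -10, for example, the neuron only becomes meaningfully active when the weighted sum exceeds 10.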
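To see what the bias does, here is a sketch of a single neuron: weighted sum, plus bias, then sigmoid. The inputs, the weights, and the bias of -10 are illustrative values only.

```python
import math

def neuron(weights, activations, bias):
    """sigma(w_1*a_1 + ... + w_n*a_n + bias) for a single neuron."""
    z = sum(w * a for w, a in zip(weights, activations)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical inputs whose weighted sum is 4.
activations = [1.0, 1.0, 1.0, 1.0]
weights = [1.0, 1.0, 1.0, 1.0]

print(round(neuron(weights, activations, 0.0), 3))    # 0.982, clearly active
print(round(neuron(weights, activations, -10.0), 3))  # 0.002, stays inactive
```

The same weighted sum of 4 lights the neuron up without a bias, but is far below the threshold once the bias is -10.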

Notation

Because we want to standardize everything.

The activations of a layer are represented as a column vector, the weights going into one neuron of the next layer as one row of the weight matrix, and the biases as another column vector. The \( \sigma \) function is the sigmoid mentioned above, applied separately to each entry of the result, one entry per neuron in the next layer (indexed \( 0 \) to \( k \)).

\[
\sigma \left(
\begin{bmatrix} w_{0,0} & w_{0,1} & \cdots & w_{0,n} \\ w_{1,0} & w_{1,1} & \cdots & w_{1,n} \\ \vdots & \vdots & \ddots & \vdots \\ w_{k,0} & w_{k,1} & \cdots & w_{k,n} \end{bmatrix}
\cdot
\begin{bmatrix} a_{0} \\ a_{1} \\ \vdots \\ a_{n} \end{bmatrix}
+
\begin{bmatrix} b_{0} \\ b_{1} \\ \vdots \\ b_{k} \end{bmatrix}
\right)
=
\begin{bmatrix} r_{0} \\ r_{1} \\ \vdots \\ r_{k} \end{bmatrix}
\]

This gets abbreviated to the following:

\[
\sigma \left( W a^{(0)} + b \right) = a^{(1)}
\]
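The compact formula translates almost one to one into code. This is only a sketch in plain Python lists (a real implementation would likely use a matrix library); `layer_forward` and the tiny 3-input, 2-output layer are made up for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(W, a, b):
    """Return sigma(W a + b) for one layer.

    W: list of rows, one row of weights per neuron in the next layer
    a: activations of the current layer
    b: one bias per neuron in the next layer
    """
    return [
        sigmoid(sum(w * x for w, x in zip(row, a)) + bias)
        for row, bias in zip(W, b)
    ]

# Hypothetical tiny layer: 3 inputs, 2 outputs.
W = [[1.0, -1.0, 0.0],
     [0.5, 0.5, 0.5]]
a0 = [1.0, 1.0, 0.0]
b = [0.0, -3.0]

a1 = layer_forward(W, a0, b)
print(a1)  # first entry is sigma(0) = 0.5, second is sigma(-2)
```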

What is a Neuron – revisited

A function: it takes in the outputs of all the neurons in the previous layer and spits out a number.
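Taken one step further, the whole network is then just a function too: 784 numbers in, 10 numbers out. A sketch of that idea, using the 784-16-16-10 layout from above. The weights and biases here are random placeholders, so the output is meaningless until the network has been trained; `make_layer` and `network` are names invented for this note.

```python
import math
import random

def sigmoid(x):
    # Sign-split form avoids overflow in math.exp for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def make_layer(n_in, n_out, rng):
    """Random placeholder parameters: (weight rows, biases)."""
    W = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    return W, b

def network(pixels, layers):
    """Map 784 pixel activations to 10 output activations."""
    a = pixels
    for W, b in layers:
        # Each neuron: weighted sum of the previous layer, plus bias, then sigmoid.
        a = [sigmoid(sum(w * x for w, x in zip(row, a)) + bias)
             for row, bias in zip(W, b)]
    return a

rng = random.Random(0)
sizes = [784, 16, 16, 10]
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]

pixels = [rng.random() for _ in range(784)]  # stand-in for a grayscale image
output = network(pixels, layers)
print(len(output), output.index(max(output)))  # number of outputs, most active digit
```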

How To Speak

Patrick Winston’s How to Speak talk has been an MIT tradition for over 40 years. Offered every January during the Independent Activities Period (IAP), usually to overflow crowds, the talk is intended to improve your speaking ability in critical situations by teaching you a few heuristic rules.

Start

  1. Do not start a talk with a joke.
  2. Promise – tell them what they are going to learn by the end of your talk.
  3. Cycle – repeat your idea several times so that it is completely clear to everyone.
  4. Make a “Fence” around your idea so that it can be distinguished from someone else’s idea.
  5. Verbal punctuation – sum up at intervals during your talk so that listeners who drifted off can get back on.
  6. Ask a question – an intriguing one.

Place and Time

  1. The best time for a lecture is 11 am (not too early and not right after lunch).
  2. The place should be well lit.
  3. The place should be seen and checked before the lecture.
  4. The place should be at least half full; choose it according to the expected number of listeners.

Tools

For Teaching

  1. Board – it’s got graphics, speed, target. Watch your hands! Don’t hold them behind your back; it’s better to keep them free and use them for pointing at the board.
  2. Props – use them in order to make your ideas visual. Visual perception is the most effective way to interact with listeners.

For Job Talk / Exposing / Slides

  1. Don’t put too many words on a slide. Slides should just reflect what you’re saying, not the other way around. Pictures attract attention and people start waiting for your explanation – use that.
  2. Make each slide as simple as you can – no title, no distracting pictures, frames, bullet points and so on.
  3. Do not use a laser pointer – you lose eye contact with the audience. Instead, put arrows directly on the slide.

Informing

Show your listeners that your stuff is cool and interesting.

You have to be able to

  • Show your vision of the problem
  • Show that you’ve done something particular (step by step)

All of that should be done really quickly, in no more than 5 minutes.

Persuade your listeners you’re not a rookie

Getting Famous

If you want people to remember your ideas, you’ve got to have the 5 S’s:

  • Symbol associated with your ideas
  • Slogan describing your idea
  • Surprise
  • Salient idea – the one that sticks out
  • Story – how you did it

How to End

  1. Don’t put collaborators at the end; do that at the beginning.
  2. End with a ‘Contribution’ slide that sums up everything you’ve uniquely contributed to the result.
  3. At the very end you could tell a joke, since people will then leave the event in a good mood and keep a good memory of your talk.
  4. Don’t end with “Thank you!”. End with
    • A quote
    • A salute: tell them how much you valued their being here