<h1>Backpropagation by Example</h1>
<p>Rahim Sonawalla · 2022-12-20 · <a href="https://www.hirahim.com/posts/backpropagation-by-example/">https://www.hirahim.com/posts/backpropagation-by-example/</a></p>
<p>Note: this post contains LaTeX-formatted text, such as mathematical formulas, and SVG images, which may not render correctly in your feed reader. Please read this entry in your web browser.</p><p>I’ve been spending a lot of time on <a rel="noopener" target="_blank" href="https://course.fast.ai/Lessons/lesson3.html">Lesson 3</a>/<a rel="noopener" target="_blank" href="https://github.com/fastai/fastbook/blob/master/04_mnist_basics.ipynb">Chapter 4</a> of the <a rel="noopener" target="_blank" href="https://course.fast.ai">fast.ai course</a> trying to really understand the fundamentals of neural networks. Conceptually, they’re pretty simple, but I wanted to take some time to work through the details.</p>
<p>In particular, I spent a lot of time on backpropagation. There are plenty of sites explaining the formulas and proofs for backpropagation, but surprisingly few that work through any examples with real numbers. The best was <a rel="noopener" target="_blank" href="https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/">Matt Mazur’s post</a>, but it left out the handling of biases, so I figured I’d try writing something up.</p>
<p>I won’t get into proofs, since there are better sources for that, like <a rel="noopener" target="_blank" href="http://neuralnetworksanddeeplearning.com/chap2.html">Michael Nielsen’s online book</a>. I’ll also assume some existing understanding of neural networks. Basically, I’m hoping this will be helpful for other learners who are around where I am.</p>
<p>You’ll want to go through these examples in order, since they build on each other. And for best effect, try to do them by hand with pen, paper, and a calculator. I’ll use <a rel="noopener" target="_blank" href="https://pytorch.org/">PyTorch</a> to check for correctness, but won’t do anything more complex than basic tensor operations. You can find the supporting <a href="./backpropagation-by-example.ipynb">Jupyter notebook here</a> or <a rel="noopener" target="_blank" href="https://www.kaggle.com/hirahim/backpropagation-by-example">on Kaggle</a>.</p>
<h2 id="loss-and-activation-functions">Loss and Activation Functions</h2>
<p>To keep things simple, all the examples will use the same loss and activation functions. We’ll use <a rel="noopener" target="_blank" href="https://pytorch.org/docs/stable/generated/torch.nn.MSELoss.html#torch.nn.MSELoss">mean squared error</a> as our loss function since it’s easy to calculate by hand. Typically it’s the average of the squared differences across all the output values, but since all our examples will have only one output value, the formula we’ll use is just:</p>
<p>$$ \text{mse} = (\text{prediction} - \text{actual})^2 $$</p>
<p>For our activation function, we’ll use <a rel="noopener" target="_blank" href="https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html">ReLU</a> since it’s also easy to work with by hand:</p>
<p>$$ \operatorname{ReLU}(x) = \max(0, x) $$</p>
<p>Of note, there’s some debate about the correct definition of ReLU’s derivative since it isn’t defined at \(x = 0\). We won’t run into this situation with any of our examples, but in general I’ve been going with:</p>
<p>$$
\operatorname{ReLU^{\prime}}(x) =
\begin{cases}
0 & \text{if } x \leq 0 \\
1 & \text{if } x > 0
\end{cases}
$$</p>
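<p>If you want to double-check the arithmetic as you go, here’s a minimal plain-Python sketch of these two functions and their derivatives (single values only; we’ll reuse these names in the hand calculations below):</p>
<pre data-lang="python"><code class="language-python"># Plain-Python versions of our loss and activation functions, plus their
# derivatives, for checking the hand calculations.
def mse(prediction, actual):
    return (prediction - actual) ** 2

def mse_prime(prediction, actual):
    return 2 * (prediction - actual)

def relu(x):
    return max(0, x)

def relu_prime(x):
    return 1 if x > 0 else 0
</code></pre>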
<h2 id="starting-small">Starting Small</h2>
<p>To kick things off, we’ll start with a rudimentary network with just one input, one output, and no hidden layers.</p>
<p><a href="images/simple-network.svg"><img src="https://www.hirahim.com/posts/backpropagation-by-example/images/simple-network.svg" alt="Diagram of a simple network with one input neuron, labeled “i”, one output neuron, labeled “o”, one connection between them with weight “w”, and one bias, labeled “b”." /></a></p>
<p>In the diagram above, “i” is the input node, “o” is the output node, “b” is the bias, and “w” is the weight.</p>
<p>Normally, you’d use random values for the weight and bias, but we’ll use fixed values so that we can work through the examples together. We’ll use 0.5 for the input (\(i\)), 0.3 for the weight (\(w\)), and 0.4 for the bias (\(b\)). This will give us an output prediction (\(o\)) of 0.55. Let’s work through how we got that.</p>
<p><a href="images/simple-network-with-values.svg"><img src="https://www.hirahim.com/posts/backpropagation-by-example/images/simple-network-with-values.svg" alt="Diagram of a simple network above with fixed values. 0.5 for the input layer, 0.3 for the weight, and 0.4 for the bias." /></a></p>
<h3 id="forward-pass">Forward Pass</h3>
<p>In order to get the output prediction (\(o_{out}\)), we’ll do a forward pass through the network using the following formula: </p>
<p>$$
\begin{align*}
o_{net} &= w * i + b \\
o_{out} &= \operatorname{ReLU}(o_{net})
\end{align*}
$$</p>
<p>Plugging in our values, we have:</p>
<p>$$
\begin{align*}
o_{net} &= 0.3 * 0.5 + 0.4 \\
&= 0.55 \\
\\
o_{out} &= \operatorname{ReLU}(0.55) \\
&= 0.55
\end{align*}
$$</p>
<p>Next, we’ll calculate our loss (\(E\)) assuming the actual (a.k.a. target) value was 0.95:</p>
<p>$$
\begin{align*}
E &= (\text{prediction} - \text{actual})^2 \\
&= (0.55 - 0.95)^2 \\
&= 0.16
\end{align*}
$$</p>
<p>Let’s quickly check our work with PyTorch:</p>
<pre data-lang="python" style="background-color:#2b2c2f;color:#cccece;" class="language-python "><code class="language-python" data-lang="python"><span style="color:#c594c5;">import </span><span>torch
</span><span>
</span><span style="color:#c594c5;">def </span><span style="color:#6699cc;">activation_function</span><span style="color:#5fb3b3;">(</span><span style="color:#f99157;">input</span><span style="color:#5fb3b3;">):
</span><span> relu </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">nn</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">ReLU</span><span style="color:#5fb3b3;">()
</span><span> </span><span style="color:#c594c5;">return </span><span style="color:#6699cc;">relu</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">input</span><span style="color:#5fb3b3;">)
</span><span>
</span><span style="color:#c594c5;">def </span><span style="color:#6699cc;">loss_function</span><span style="color:#5fb3b3;">(</span><span style="color:#f99157;">prediction</span><span style="color:#5fb3b3;">, </span><span style="color:#f99157;">actual</span><span style="color:#5fb3b3;">):
</span><span> mse_loss </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">nn</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">MSELoss</span><span style="color:#5fb3b3;">()
</span><span> </span><span style="color:#c594c5;">return </span><span style="color:#6699cc;">mse_loss</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">prediction</span><span style="color:#5fb3b3;">, </span><span style="color:#6699cc;">actual</span><span style="color:#5fb3b3;">)
</span><span>
</span><span>inputs </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">5</span><span style="color:#5fb3b3;">])
</span><span>actual </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">95</span><span style="color:#5fb3b3;">])
</span><span>
</span><span>output_layer_weights </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">3</span><span style="color:#5fb3b3;">]).</span><span style="color:#6699cc;">requires_grad_</span><span style="color:#5fb3b3;">()
</span><span>output_layer_bias </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">4</span><span style="color:#5fb3b3;">]).</span><span style="color:#6699cc;">requires_grad_</span><span style="color:#5fb3b3;">()
</span><span>
</span><span>output_layer_out </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">activation_function</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">inputs</span><span style="color:#5fb3b3;">@</span><span style="color:#6699cc;">output_layer_weights </span><span style="color:#5fb3b3;">+ </span><span style="color:#6699cc;">output_layer_bias</span><span style="color:#5fb3b3;">)
</span><span>
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">("</span><span style="color:#99c794;">Prediction: </span><span style="color:#5fb3b3;">", </span><span style="color:#6699cc;">output_layer_out</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Prediction: tensor([0.5500], grad_fn=<ReluBackward0>)"
</span><span>
</span><span>loss </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">loss_function</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">output_layer_out</span><span style="color:#5fb3b3;">, </span><span style="color:#6699cc;">actual</span><span style="color:#5fb3b3;">)
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">("</span><span style="color:#99c794;">Loss: </span><span style="color:#5fb3b3;">", </span><span style="color:#6699cc;">loss</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Loss: tensor(0.1600, grad_fn=<MseLossBackward0>)"
</span></code></pre>
<h3 id="backpropagation">Backpropagation</h3>
<p>Now that we’ve done the forward pass and calculated our loss (\(E\)), we can move on to the main event: backpropagation. To do that, we’ll need to calculate our gradients. For the weight, it’s defined by the <a rel="noopener" target="_blank" href="https://www.khanacademy.org/math/ap-calculus-ab/ab-differentiation-2-new">chain rule</a> as:</p>
<p>$$ \frac{\partial E}{\partial w} = \frac{\partial E}{\partial o_{out}} * \frac{\partial o_{out}}{\partial o_{net}} * \frac{\partial o_{net}}{\partial w} $$</p>
<p>We’ll start by calculating \(\partial E / \partial o_{out}\), which is the partial derivative of the mean squared error with respect to \(o_{out}\). Using the <a rel="noopener" target="_blank" href="https://www.khanacademy.org/math/old-ap-calculus-ab/ab-derivative-rules/ab-diff-negative-fraction-powers/a/power-rule-review">power rule</a>, this becomes:</p>
<p>$$
\begin{align*}
\operatorname{mse} &= (0.55 - 0.95)^2 \\
\frac{\partial E}{\partial o_{out}} &= 2(0.55 - 0.95)^{2 - 1} \\
&= -0.8
\end{align*}
$$</p>
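<p>If the power rule feels rusty, a quick finite-difference check confirms this value (a minimal sketch that nudges the prediction by a small epsilon):</p>
<pre data-lang="python"><code class="language-python"># Numerically approximate dE/do_out at o_out = 0.55 with a central difference.
eps = 1e-6
E = lambda p: (p - 0.95) ** 2
print((E(0.55 + eps) - E(0.55 - eps)) / (2 * eps))
# Prints approximately -0.8
</code></pre>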
<p>Next, we’ll calculate \(\partial o_{out} / \partial o_{net}\), which is the partial derivative of \(o_{out}\) with respect to \(o_{net}\). This translates to evaluating the derivative of ReLU at 0.55, which is 1 (\(\operatorname{ReLU^\prime}(0.55) = 1\)). Since ReLU’s derivative is 1 for any positive input, this factor will be 1 in all our examples.</p>
<p>$$
\begin{align*}
\frac{\partial o_{out}}{\partial o_{net}} &= \operatorname{ReLU^\prime}(0.55) \\
&= 1
\end{align*}
$$</p>
<p>Lastly, we’ll calculate \(\partial o_{net} / \partial w\):</p>
<p>$$
\begin{align*}
\frac{\partial o_{net}}{\partial w} &= \frac{\partial w * i + b}{\partial w} \\
&= \frac{\cancel{\partial w} * i + b}{\cancel{\partial w}} \\
&= i + b \\
&= i + 0 \\
&= i \\
&= 0.5
\end{align*}
$$</p>
<p>With our numbers plugged in, we get:</p>
<p>$$
\begin{align*}
\frac{\partial E}{\partial w} &= -0.8 * 1 * 0.5 \\
&= -0.4 \\
\end{align*}
$$</p>
<p>For the bias, we have a similar formula:</p>
<p>$$ \frac{\partial E}{\partial b} = \frac{\partial E}{\partial o_{out}} * \frac{\partial o_{out}}{\partial o_{net}} * \frac{\partial o_{net}}{\partial b} $$</p>
<p>We have the first two values already, and \(\partial o_{net} / \partial b\) will always be 1, so we have:</p>
<p>$$
\begin{align*}
\frac{\partial E}{\partial b} &= 2(0.55 - 0.95) * 1 * 1 \\
&= -0.8 \\
\end{align*}
$$</p>
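<p>Using the plain-Python helpers sketched earlier, the whole chain-rule calculation for both gradients fits in a few lines:</p>
<pre data-lang="python"><code class="language-python"># Chain rule by hand for the simple network.
i, w, b, target = 0.5, 0.3, 0.4, 0.95

o_net = w * i + b                     # 0.55
o_out = relu(o_net)                   # 0.55

dE_dout = mse_prime(o_out, target)    # -0.8
dout_dnet = relu_prime(o_net)         # 1

dE_dw = dE_dout * dout_dnet * i       # -0.4
dE_db = dE_dout * dout_dnet * 1       # -0.8
</code></pre>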
<p>We’ll once again check our work against PyTorch:</p>
<pre data-lang="python" style="background-color:#2b2c2f;color:#cccece;" class="language-python "><code class="language-python" data-lang="python"><span style="color:#6699cc;">loss</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">backward</span><span style="color:#5fb3b3;">()
</span><span>
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">("</span><span style="color:#99c794;">Output layer weight gradient: </span><span style="color:#5fb3b3;">", </span><span style="color:#6699cc;">output_layer_weights</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">grad</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Output layer weight gradient: tensor([-0.4000])"
</span><span>
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">("</span><span style="color:#99c794;">Output layer bias gradient: </span><span style="color:#5fb3b3;">", </span><span style="color:#6699cc;">output_layer_bias</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">grad</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Output layer bias gradient: tensor([-0.8000])"
</span></code></pre>
<p>To take things a little further, we can complete the backward pass by using these gradients to update our weight and bias. After that, we can do another forward pass to see if the loss is reduced.</p>
<pre data-lang="python" style="background-color:#2b2c2f;color:#cccece;" class="language-python "><code class="language-python" data-lang="python"><span>learning_rate </span><span style="color:#5fb3b3;">= </span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">5
</span><span>
</span><span>updated_output_layer_bias </span><span style="color:#5fb3b3;">= </span><span>output_layer_bias </span><span style="color:#5fb3b3;">- </span><span>learning_rate </span><span style="color:#5fb3b3;">* </span><span>output_layer_bias</span><span style="color:#5fb3b3;">.</span><span>grad</span><span style="color:#5fb3b3;">.</span><span>data
</span><span>updated_output_layer_weights </span><span style="color:#5fb3b3;">= </span><span>output_layer_weights </span><span style="color:#5fb3b3;">- </span><span>learning_rate </span><span style="color:#5fb3b3;">* </span><span>output_layer_weights</span><span style="color:#5fb3b3;">.</span><span>grad</span><span style="color:#5fb3b3;">.</span><span>data
</span><span>
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">("</span><span style="color:#99c794;">Updated weight: </span><span style="color:#5fb3b3;">", </span><span style="color:#6699cc;">updated_output_layer_weights</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Updated weight: tensor([0.5000], grad_fn=<SubBackward0>)"
</span><span>
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">("</span><span style="color:#99c794;">Updated bias: </span><span style="color:#5fb3b3;">", </span><span style="color:#6699cc;">updated_output_layer_bias</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Updated bias: tensor([0.8000], grad_fn=<SubBackward0>)"
</span><span>
</span><span>updated_output_layer_out </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">activation_function</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">inputs</span><span style="color:#5fb3b3;">@</span><span style="color:#6699cc;">updated_output_layer_weights </span><span style="color:#5fb3b3;">+ </span><span style="color:#6699cc;">updated_output_layer_bias</span><span style="color:#5fb3b3;">)
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">("</span><span style="color:#99c794;">Updated prediction: </span><span style="color:#5fb3b3;">", </span><span style="color:#6699cc;">updated_output_layer_out</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Updated prediction: tensor([1.0500], grad_fn=<ReluBackward0>)"
</span><span>
</span><span>loss </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">loss_function</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">updated_output_layer_out</span><span style="color:#5fb3b3;">, </span><span style="color:#6699cc;">actual</span><span style="color:#5fb3b3;">)
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">("</span><span style="color:#99c794;">Updated loss: </span><span style="color:#5fb3b3;">", </span><span style="color:#6699cc;">loss</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Updated loss: tensor(0.0100, grad_fn=<MseLossBackward0>)"
</span></code></pre>
<p>Cool, our loss has gone down from 0.16 to 0.01! We could repeat the forward and backward passes to further train our network, but we’ll stop here and move on.</p>
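<p>If you did want to keep going, the training loop would look something like this sketch, using the tensors we already have (note that PyTorch accumulates gradients, so we have to zero them out between steps):</p>
<pre data-lang="python"><code class="language-python">learning_rate = 0.5

for step in range(10):
    # Forward pass.
    prediction = activation_function(inputs@output_layer_weights + output_layer_bias)
    loss = loss_function(prediction, actual)

    # Backward pass. Clear any gradients left over from a previous step,
    # since backward() adds to them rather than replacing them.
    if output_layer_weights.grad is not None:
        output_layer_weights.grad.zero_()
        output_layer_bias.grad.zero_()
    loss.backward()

    # Update step, done outside the autograd graph.
    with torch.no_grad():
        output_layer_weights -= learning_rate * output_layer_weights.grad
        output_layer_bias -= learning_rate * output_layer_bias.grad

print("Final loss: ", loss)
</code></pre>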
<h2 id="one-hidden-layer">One Hidden Layer</h2>
<p>Let’s do another example. This time, we’ll add a hidden layer (\(h\)), but still with one neuron in each layer.</p>
<p><a href="images/one-hidden-layer.svg"><img src="https://www.hirahim.com/posts/backpropagation-by-example/images/one-hidden-layer.svg" alt="Diagram of a neural network with one input neuron of value 0.5, one hidden neuron with an incoming weight of 0.3, bias of 0.4, and value of 0.55, and one output neuron with an incoming weight of 0.2, bias of 0.1, and prediction value of 0.21." /></a></p>
<p>Briefly, the forward pass would be:</p>
<p>$$
\begin{align*}
h &= \operatorname{ReLU}(w_{h} * i + b_{h}) \\
&= \operatorname{ReLU}(0.3 * 0.5 + 0.4) \\
&= 0.55 \\
\\
o &= \operatorname{ReLU}(w_{o} * h + b_{o}) \\
&= \operatorname{ReLU}(0.2 * 0.55 + 0.1) \\
&= 0.21 \\
\\
E &= (0.21 - 0.95)^2 = 0.5476
\end{align*}
$$</p>
<p>(Since ReLU leaves positive inputs unchanged, and all the pre-activation values in our examples are positive, we’ll drop the \(o_{net}\) and \(o_{out}\) notations and just use \(o\) in our formulas going forward.)</p>
<pre data-lang="python" style="background-color:#2b2c2f;color:#cccece;" class="language-python "><code class="language-python" data-lang="python"><span>layer1_weights </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">3</span><span style="color:#5fb3b3;">]).</span><span style="color:#6699cc;">requires_grad_</span><span style="color:#5fb3b3;">()
</span><span>layer1_bias </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">4</span><span style="color:#5fb3b3;">]).</span><span style="color:#6699cc;">requires_grad_</span><span style="color:#5fb3b3;">()
</span><span>
</span><span>output_layer_weights </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">2</span><span style="color:#5fb3b3;">]).</span><span style="color:#6699cc;">requires_grad_</span><span style="color:#5fb3b3;">()
</span><span>output_layer_bias </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">1</span><span style="color:#5fb3b3;">]).</span><span style="color:#6699cc;">requires_grad_</span><span style="color:#5fb3b3;">()
</span><span>
</span><span>layer1_out </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">activation_function</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">inputs</span><span style="color:#5fb3b3;">@</span><span style="color:#6699cc;">layer1_weights </span><span style="color:#5fb3b3;">+ </span><span style="color:#6699cc;">layer1_bias</span><span style="color:#5fb3b3;">)
</span><span>Hidden layer</span><span style="color:#5fb3b3;">: </span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">5500</span><span style="color:#5fb3b3;">], </span><span style="color:#f99157;">grad_fn</span><span style="color:#5fb3b3;">=<</span><span style="color:#6699cc;">ReluBackward0</span><span style="color:#5fb3b3;">>)
</span><span style="color:#5f6364;"># Prints "Hidden layer: tensor([0.5500], grad_fn=<ReluBackward0>)"
</span><span>
</span><span>output_layer_out </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">activation_function</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">layer1_out</span><span style="color:#5fb3b3;">@</span><span style="color:#6699cc;">output_layer_weights </span><span style="color:#5fb3b3;">+ </span><span style="color:#6699cc;">output_layer_bias</span><span style="color:#5fb3b3;">)
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">("</span><span style="color:#99c794;">Prediction: </span><span style="color:#5fb3b3;">", </span><span style="color:#6699cc;">output_layer_out</span><span style="color:#5fb3b3;">)
</span><span>Prediction</span><span style="color:#5fb3b3;">: </span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">2100</span><span style="color:#5fb3b3;">], </span><span style="color:#f99157;">grad_fn</span><span style="color:#5fb3b3;">=<</span><span style="color:#6699cc;">ReluBackward0</span><span style="color:#5fb3b3;">>)
</span><span>
</span><span>loss </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">loss_function</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">output_layer_out</span><span style="color:#5fb3b3;">, </span><span style="color:#6699cc;">actual</span><span style="color:#5fb3b3;">)
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">("</span><span style="color:#99c794;">Loss: </span><span style="color:#5fb3b3;">", </span><span style="color:#6699cc;">loss</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Loss: tensor(0.5476, grad_fn=<MseLossBackward0>)"
</span></code></pre>
<p>For backpropagation, we’ll take it one layer at a time, starting from the output layer. The gradient for the output layer’s weight (we’ll call it \(w_{o}\)) would be:</p>
<p>$$
\begin{align*}
\frac{\partial E}{\partial w_{o}} &= \operatorname{mse^\prime}(0.21, 0.95) * \operatorname{ReLU^\prime}(0.21) * h \\
&= 2(0.21 - 0.95) * 1 * 0.55 \\
&= -0.814 \\
\end{align*}
$$</p>
<p>This is the same formula as before, but using the (post-ReLU) value from the hidden layer’s node, 0.55, instead of the input layer’s value, 0.5. And the gradient for the bias (we’ll call it \(b_{o}\)) would be:</p>
<p>$$
\begin{align*}
\frac{\partial E}{\partial b_{o}} &= \operatorname{mse^\prime}(0.21, 0.95) * \operatorname{ReLU^\prime}(0.21) * 1 \\
&= 2(0.21 - 0.95) * 1 * 1 \\
&= -1.48 \\
\end{align*}
$$</p>
<p>To help keep things compact, we’ll start referring to \(\operatorname{mse^\prime}(0.21, 0.95)\) as just \(\operatorname{mse^\prime}\). For the hidden layer’s weight gradient, \(w_{h}\):</p>
<p>$$
\begin{align*}
\frac{\partial E}{\partial w_{h}} &= (\operatorname{mse^\prime} * w_{o}) * \operatorname{ReLU^\prime}(0.55) * i \\
&= (2(0.21 - 0.95) * 0.2) * 1 * 0.5 \\
&= -0.148 \\
\end{align*}
$$</p>
<p>Here, we use almost the same formula as our simple network example, but instead of just the derivative of the mean squared error, we use the derivative of the mean squared error times the output layer’s weight. Similarly, for the bias, \(b_{h}\):</p>
<p>$$
\begin{align*}
\frac{\partial E}{\partial b_{h}} &= (\operatorname{mse^\prime} * w_{o}) * \operatorname{ReLU^\prime}(0.55) * 1 \\
&= (2(0.21 - 0.95) * 0.2) * 1 * 1 \\
&= -0.296 \\
\end{align*}
$$</p>
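<p>Before we check with PyTorch, here’s the same hand calculation in plain Python, reusing the helper functions sketched earlier:</p>
<pre data-lang="python"><code class="language-python"># Chain rule by hand for the one-hidden-layer network.
i, target = 0.5, 0.95
w_h, b_h = 0.3, 0.4
w_o, b_o = 0.2, 0.1

h = relu(w_h * i + b_h)                        # 0.55
o = relu(w_o * h + b_o)                        # 0.21

dE_do = mse_prime(o, target)                   # -1.48

dE_dw_o = dE_do * relu_prime(o) * h            # -0.814
dE_db_o = dE_do * relu_prime(o) * 1            # -1.48

dE_dw_h = (dE_do * w_o) * relu_prime(h) * i    # -0.148
dE_db_h = (dE_do * w_o) * relu_prime(h) * 1    # -0.296
</code></pre>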
<pre data-lang="python" style="background-color:#2b2c2f;color:#cccece;" class="language-python "><code class="language-python" data-lang="python"><span style="color:#6699cc;">loss</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">backward</span><span style="color:#5fb3b3;">()
</span><span>
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">('</span><span style="color:#99c794;">Gradient for output layer weights: </span><span style="color:#5fb3b3;">', </span><span style="color:#6699cc;">output_layer_weights</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">grad</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Gradient for output layer weights: tensor([-0.8140])"
</span><span>
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">('</span><span style="color:#99c794;">Gradient for output layer bias: </span><span style="color:#5fb3b3;">', </span><span style="color:#6699cc;">output_layer_bias</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">grad</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Gradient for output layer bias: tensor([-1.4800])"
</span><span>
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">('</span><span style="color:#99c794;">Gradient for hidden layer weights: </span><span style="color:#5fb3b3;">', </span><span style="color:#6699cc;">layer1_weights</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">grad</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Gradient for hidden layer weights: tensor([-0.1480])"
</span><span>
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">('</span><span style="color:#99c794;">Gradient for hidden layer bias: </span><span style="color:#5fb3b3;">', </span><span style="color:#6699cc;">layer1_bias</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">grad</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Gradient for hidden layer bias: tensor([-0.2960])"
</span></code></pre>
<h2 id="two-neurons">Two Neurons</h2>
<p>Let’s keep building. This time, we’ll add an extra neuron to the hidden layer.</p>
<p><a href="images/two-neurons.svg"><img src="https://www.hirahim.com/posts/backpropagation-by-example/images/two-neurons.svg" alt="Diagram of a neural network with one input neuron of value 0.5, two hidden neurons with incoming weights (from top to bottom) of 0.3 and 0.2, biases of 0.4 and 0.1, and values of 0.55 and 0.2, and one output neuron with an incoming weights of 0.1 and 0.7, bias of 0.3, and prediction value of 0.495." /></a></p>
<p>We’ll name the top neuron in the hidden layer \(h_{1}\) and the bottom neuron \(h_{2}\). For the forward pass, the formulas are:</p>
<p>$$
\begin{align*}
h_{1} &= \operatorname{ReLU}(w_{h1} * i + b_{h1}) \\
&= \operatorname{ReLU}(0.3 * 0.5 + 0.4) \\
&= 0.55 \\
h_{2} &= \operatorname{ReLU}(w_{h2} * i + b_{h2}) \\
&= \operatorname{ReLU}(0.2 * 0.5 + 0.1) \\
&= 0.2 \\
\\
o &= \operatorname{ReLU}((w_{o1} * h_{1}) + (w_{o2} * h_{2}) + b_{o}) \\
&= \operatorname{ReLU}((0.1 * 0.55) + (0.7 * 0.2) + 0.3) \\
&= 0.495 \\
\\
E &= (0.495 - 0.95)^2 = 0.207025
\end{align*}
$$</p>
<pre data-lang="python" style="background-color:#2b2c2f;color:#cccece;" class="language-python "><code class="language-python" data-lang="python"><span>inputs </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">5</span><span style="color:#5fb3b3;">])
</span><span>actual </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">95</span><span style="color:#5fb3b3;">])
</span><span>
</span><span>layer1_weights </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([[</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">3</span><span style="color:#5fb3b3;">, </span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">2</span><span style="color:#5fb3b3;">]]).</span><span style="color:#6699cc;">requires_grad_</span><span style="color:#5fb3b3;">()
</span><span>layer1_bias </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">4</span><span style="color:#5fb3b3;">, </span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">1</span><span style="color:#5fb3b3;">]).</span><span style="color:#6699cc;">requires_grad_</span><span style="color:#5fb3b3;">()
</span><span>
</span><span>output_layer_weights </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([[</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">1</span><span style="color:#5fb3b3;">], [</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">7</span><span style="color:#5fb3b3;">]]).</span><span style="color:#6699cc;">requires_grad_</span><span style="color:#5fb3b3;">()
</span><span>output_layer_bias </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">3</span><span style="color:#5fb3b3;">]).</span><span style="color:#6699cc;">requires_grad_</span><span style="color:#5fb3b3;">()
</span><span>
</span><span>layer1_out </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">activation_function</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">inputs</span><span style="color:#5fb3b3;">@</span><span style="color:#6699cc;">layer1_weights </span><span style="color:#5fb3b3;">+ </span><span style="color:#6699cc;">layer1_bias</span><span style="color:#5fb3b3;">)
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">("</span><span style="color:#99c794;">Hidden layer weights: </span><span style="color:#5fb3b3;">", </span><span style="color:#6699cc;">layer1_out</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Hidden layer weights: tensor([0.5500, 0.2000], grad_fn=<ReluBackward0>)"
</span><span>
</span><span>output_layer_out </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">activation_function</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">layer1_out</span><span style="color:#5fb3b3;">@</span><span style="color:#6699cc;">output_layer_weights </span><span style="color:#5fb3b3;">+ </span><span style="color:#6699cc;">output_layer_bias</span><span style="color:#5fb3b3;">)
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">("</span><span style="color:#99c794;">Prediction: </span><span style="color:#5fb3b3;">", </span><span style="color:#6699cc;">output_layer_out</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Prediction: tensor([0.4950], grad_fn=<ReluBackward0>)"
</span><span>
</span><span>loss </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">loss_function</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">output_layer_out</span><span style="color:#5fb3b3;">, </span><span style="color:#6699cc;">actual</span><span style="color:#5fb3b3;">)
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">("</span><span style="color:#99c794;">Loss: </span><span style="color:#5fb3b3;">", </span><span style="color:#6699cc;">loss</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Loss: tensor(0.2070, grad_fn=<MseLossBackward0>)"
</span></code></pre>
<p>For backpropagation, the formulas are:</p>
<p>$$
\begin{align*}
\frac{\partial E}{\partial w_{o1}} &= 2(0.495 - 0.95) * 1 * .55 \\
&= -0.5005 \\
\frac{\partial E}{\partial w_{o2}} &= 2(0.495 - 0.95) * 1 * .2 \\
&= -0.182 \\
\frac{\partial E}{\partial b_{o}} &= 2(0.495 - 0.95) * 1 * 1 \\
&= -0.91 \\
\\
\frac{\partial E}{\partial w_{h1}} &= (2(0.495 - 0.95) * 0.1) * 1 * 0.5 \\
&= -0.0455 \\
\frac{\partial E}{\partial w_{h2}} &= (2(0.495 - 0.95) * 0.7) * 1 * 0.5 \\
&= -0.3185 \\
\frac{\partial E}{\partial b_{h1}} &= (2(0.495 - 0.95) * 0.1) * 1 * 1 \\
&= -0.091 \\
\frac{\partial E}{\partial b_{h2}} &= (2(0.495 - 0.95) * 0.7) * 1 * 1 \\
&= -0.637
\end{align*}
$$</p>
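<p>And the same calculation in plain Python, writing ReLU’s derivative as a literal 1 since all our pre-activations are positive:</p>
<pre data-lang="python"><code class="language-python"># Chain rule by hand for the two-neuron hidden layer.
i, target = 0.5, 0.95
h1, h2 = 0.55, 0.2                 # hidden activations from the forward pass
w_o1, w_o2 = 0.1, 0.7

dE_do = mse_prime(0.495, target)   # -0.91

dE_dw_o1 = dE_do * 1 * h1          # -0.5005
dE_dw_o2 = dE_do * 1 * h2          # -0.182
dE_db_o = dE_do * 1 * 1            # -0.91

dE_dw_h1 = (dE_do * w_o1) * 1 * i  # -0.0455
dE_dw_h2 = (dE_do * w_o2) * 1 * i  # -0.3185
dE_db_h1 = (dE_do * w_o1) * 1 * 1  # -0.091
dE_db_h2 = (dE_do * w_o2) * 1 * 1  # -0.637
</code></pre>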
<pre data-lang="python" style="background-color:#2b2c2f;color:#cccece;" class="language-python "><code class="language-python" data-lang="python"><span style="color:#6699cc;">loss</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">backward</span><span style="color:#5fb3b3;">()
</span><span>
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">('</span><span style="color:#99c794;">Gradient for output layer weights: </span><span style="color:#5fb3b3;">', </span><span style="color:#6699cc;">output_layer_weights</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">grad</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Gradient for output layer weights: tensor([[-0.5005], [-0.1820]])"
</span><span>
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">('</span><span style="color:#99c794;">Gradient for output layer bias: </span><span style="color:#5fb3b3;">', </span><span style="color:#6699cc;">output_layer_bias</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">grad</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Gradient for output layer bias: tensor([-0.9100])"
</span><span>
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">('</span><span style="color:#99c794;">Gradient for hidden layer weights: </span><span style="color:#5fb3b3;">', </span><span style="color:#6699cc;">layer1_weights</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">grad</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Gradient for hidden layer weights: tensor([[-0.0455, -0.3185]])"
</span><span>
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">('</span><span style="color:#99c794;">Gradient for hidden layer bias: </span><span style="color:#5fb3b3;">', </span><span style="color:#6699cc;">layer1_bias</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">grad</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Gradient for hidden layer bias: tensor([-0.0910, -0.6370])"
</span></code></pre>
<h2 id="two-hidden-layers">Two Hidden Layers</h2>
<p>Let’s conclude with one last example. This time, we’ll have two hidden layers. We’ll refer to the first hidden layer’s neurons as \(h_{1}\) and \(h_{2}\), and the second hidden layer’s neurons as \(j_{1}\) and \(j_{2}\).</p>
<p><a href="images/two-hidden-layers.svg"><img src="https://www.hirahim.com/posts/backpropagation-by-example/images/two-hidden-layers.svg" alt="Diagram of a more complex neural network. There’s still one input neuron of value 0.5, but now there are two hidden layers. The first hidden layer has two neurons with incoming weights (from top to bottom) of 0.3 and 0.2, biases of 0.4 and 0.1, and values of 0.55 and 0.2. The second hidden layer has two neurons with incoming weights of 0.1 and 0.6 and bias of 0.3 for the top neuron and incoming weights of 0.8 and 0.7 and bias of 0.2 for the bottom neuron. The output layer has a single neuron with incoming weights of 0.4 and 0.6, a bias of 0.3, and a prediction of 0.958." /></a></p>
<p>$$
\begin{align*}
h_{1} &= \operatorname{ReLU}(0.3 * 0.5 + 0.4) = 0.55 \\
h_{2} &= \operatorname{ReLU}(0.2 * 0.5 + 0.1) = 0.2 \\
\\
j_{1} &= \operatorname{ReLU}((0.1 * 0.55) + (0.6 * 0.2) + 0.3) \\
&= 0.475 \\
j_{2} &= \operatorname{ReLU}((0.8 * 0.55) + (0.7 * 0.2) + 0.2) \\
&= 0.78 \\
\\
o &= \operatorname{ReLU}((0.4 * 0.475) + (0.6 * 0.78) + 0.3) \\
&= 0.958 \\
\\
E &= (0.958 - 0.95)^2 = 0.000064
\end{align*}
$$</p>
<pre data-lang="python" style="background-color:#2b2c2f;color:#cccece;" class="language-python "><code class="language-python" data-lang="python"><span>inputs </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">5</span><span style="color:#5fb3b3;">])
</span><span>actual </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">95</span><span style="color:#5fb3b3;">])
</span><span>
</span><span>layer1_weights </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([[</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">3</span><span style="color:#5fb3b3;">, </span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">2</span><span style="color:#5fb3b3;">]]).</span><span style="color:#6699cc;">requires_grad_</span><span style="color:#5fb3b3;">()
</span><span>layer1_bias </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">4</span><span style="color:#5fb3b3;">, </span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">1</span><span style="color:#5fb3b3;">]).</span><span style="color:#6699cc;">requires_grad_</span><span style="color:#5fb3b3;">()
</span><span>
</span><span>layer2_weights </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([[</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">1</span><span style="color:#5fb3b3;">, </span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">8</span><span style="color:#5fb3b3;">], [</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">6</span><span style="color:#5fb3b3;">, </span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">7</span><span style="color:#5fb3b3;">]]).</span><span style="color:#6699cc;">requires_grad_</span><span style="color:#5fb3b3;">()
</span><span>layer2_bias </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">3</span><span style="color:#5fb3b3;">, </span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">2</span><span style="color:#5fb3b3;">]).</span><span style="color:#6699cc;">requires_grad_</span><span style="color:#5fb3b3;">()
</span><span>
</span><span>output_layer_weights </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([[</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">4</span><span style="color:#5fb3b3;">], [</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">6</span><span style="color:#5fb3b3;">]]).</span><span style="color:#6699cc;">requires_grad_</span><span style="color:#5fb3b3;">()
</span><span>output_layer_bias </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">torch</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">tensor</span><span style="color:#5fb3b3;">([</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">3</span><span style="color:#5fb3b3;">]).</span><span style="color:#6699cc;">requires_grad_</span><span style="color:#5fb3b3;">()
</span><span>
</span><span>layer1_out </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">activation_function</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">inputs</span><span style="color:#5fb3b3;">@</span><span style="color:#6699cc;">layer1_weights </span><span style="color:#5fb3b3;">+ </span><span style="color:#6699cc;">layer1_bias</span><span style="color:#5fb3b3;">)
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">("</span><span style="color:#99c794;">Layer 1 weights: </span><span style="color:#5fb3b3;">", </span><span style="color:#6699cc;">layer1_out</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Layer 1 weights: tensor([0.5500, 0.2000], grad_fn=<ReluBackward0>)"
</span><span>
</span><span>layer2_out </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">activation_function</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">layer1_out</span><span style="color:#5fb3b3;">@</span><span style="color:#6699cc;">layer2_weights </span><span style="color:#5fb3b3;">+ </span><span style="color:#6699cc;">layer2_bias</span><span style="color:#5fb3b3;">)
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">("</span><span style="color:#99c794;">Layer 2 weights: </span><span style="color:#5fb3b3;">", </span><span style="color:#6699cc;">layer2_out</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Layer 2 weights: tensor([0.4750, 0.7800], grad_fn=<ReluBackward0>)"
</span><span>
</span><span>output_layer_out </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">activation_function</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">layer2_out</span><span style="color:#5fb3b3;">@</span><span style="color:#6699cc;">output_layer_weights </span><span style="color:#5fb3b3;">+ </span><span style="color:#6699cc;">output_layer_bias</span><span style="color:#5fb3b3;">)
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">("</span><span style="color:#99c794;">Prediction: </span><span style="color:#5fb3b3;">", </span><span style="color:#6699cc;">output_layer_out</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Prediction: tensor([0.9580], grad_fn=<ReluBackward0>)"
</span><span>
</span><span>loss </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">loss_function</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">output_layer_out</span><span style="color:#5fb3b3;">, </span><span style="color:#6699cc;">actual</span><span style="color:#5fb3b3;">)
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">("</span><span style="color:#99c794;">Loss: </span><span style="color:#5fb3b3;">", </span><span style="color:#6699cc;">loss</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Loss: tensor(6.4001e-05, grad_fn=<MseLossBackward0>)"
</span></code></pre>
<p>For backpropagation, the formulas for the output layer’s gradients are:</p>
<p>$$
\begin{align*}
\frac{\partial E}{\partial w_{o1}} &= 2(0.958 - 0.95) * 1 * .475 \\
&= 0.0076 \\
\frac{\partial E}{\partial w_{o2}} &= 2(0.958 - 0.95) * 1 * .78 \\
&= 0.01248 \\
\frac{\partial E}{\partial b_{o}} &= 2(0.958 - 0.95) * 1 * 1 \\
&= 0.016
\end{align*}
$$</p>
<p>For the second hidden layer, we’ll use \(w_{j11}\) and \(w_{j12}\) to refer to the two weights feeding into the first neuron (\(j_{1}\)) and \(w_{j21}\) and \(w_{j22}\) to refer to the two weights feeding into the second neuron, \(j_{2}\). The gradients for the set of weights and bias feeding into the first neuron, \(j_{1}\), are:</p>
<p>$$
\begin{align*}
\frac{\partial E}{\partial w_{j11}} &= (2(0.958 - 0.95) * 0.4) * 1 * 0.55 \\
&= 0.00352 \\
\frac{\partial E}{\partial w_{j12}} &= (2(0.958 - 0.95) * 0.4) * 1 * 0.2 \\
&= 0.00128 \\
\frac{\partial E}{\partial b_{j1}} &= (2(0.958 - 0.95) * 0.4) * 1 * 1 \\
&= 0.0064
\end{align*}
$$</p>
<p>The gradients for the set of weights and bias feeding into the second neuron, \(j_{2}\), are:</p>
<p>$$
\begin{align*}
\frac{\partial E}{\partial w_{j21}} &= (2(0.958 - 0.95) * 0.6) * 1 * 0.55 \\
&= 0.00528 \\
\frac{\partial E}{\partial w_{j22}} &= (2(0.958 - 0.95) * 0.6) * 1 * 0.2 \\
&= 0.00192 \\
\frac{\partial E}{\partial b_{j2}} &= (2(0.958 - 0.95) * 0.6) * 1 * 1 \\
&= 0.0096
\end{align*}
$$</p>
<p>The formulas to calculate the gradients for the first hidden layer get pretty long, so we’ll start by defining some variables to keep things readable. To start, we’ve seen \(2(0.958 - 0.95)\) used throughout. This is the derivative of the mean squared error, \(\operatorname{mse}^\prime(0.958, 0.95)\):</p>
<p>$$
\begin{align*}
\frac{\partial E}{\partial o} &= \operatorname{mse}^\prime(0.958, 0.95) \\
&= 2(0.958 - 0.95) \\
&= 0.016
\end{align*}
$$</p>
<p>Next, we’ll take the \(\operatorname{mse}^\prime\) value from above and walk the path back from the output node to \(h_{1}\), multiplying the weights as we go. There are two paths to get to \(h_{1}\), so we’ll start with the topmost path:</p>
<p>$$
\begin{align*}
a &= \operatorname{mse}^\prime(0.958, 0.95) * w_{o1} * w_{j11} \\
&= 0.016 * 0.4 * 0.1 \\
&= 0.00064
\end{align*}
$$</p>
<p><a href="images/first-hidden-layer-first-weight-path.svg"><img src="https://www.hirahim.com/posts/backpropagation-by-example/images/first-hidden-layer-first-weight-path.svg" alt="Diagram showing the path walked through the two hidden layer neural network in order to arrive at our intermediate variable “a”." /></a></p>
<p>Next, we’ll do the bottom path to \(h_{1}\):</p>
<p>$$
\begin{align*}
b &= \operatorname{mse}^\prime(0.958, 0.95) * w_{o2} * w_{j21} \\
&= 0.016 * 0.6 * 0.8 \\
&= 0.00768
\end{align*}
$$</p>
<p><a href="images/first-hidden-layer-second-weight-path.svg"><img src="https://www.hirahim.com/posts/backpropagation-by-example/images/first-hidden-layer-second-weight-path.svg" alt="Diagram showing the path walked through the two hidden layer neural network in order to arrive at our intermediate variable “b”." /></a></p>
<p>With these done, we just need to add the two values together and multiply the sum by \(\operatorname{ReLU}^\prime\) (always 1 in our examples) and the input value, 0.5, to get our answer:</p>
<p>$$
\begin{align*}
\frac{\partial E}{\partial w_{h1}} &= (a + b) * 1 * i \\
&= (0.00064 + 0.00768) * 1 * 0.5 \\
&= 0.00416
\end{align*}
$$</p>
<p>We can do something similar to what we’ve seen before for \(h_{1}\)’s bias:</p>
<p>$$
\begin{align*}
\frac{\partial E}{\partial b_{h1}} &= (a + b) * 1 * 1 \\
&= (0.00064 + 0.00768) * 1 * 1 \\
&= 0.00832
\end{align*}
$$</p>
<p>Now we just need to do \(h_{2}\)’s weights. Like before, we’ll start by taking the \(\operatorname{mse}^\prime\) value and walking the topmost path to \(h_{2}\):</p>
<p>$$
\begin{align*}
c &= \operatorname{mse}^\prime(0.958, 0.95) * w_{o1} * w_{j12} \\
&= 0.016 * 0.4 * 0.6 \\
&= 0.00384
\end{align*}
$$</p>
<p><a href="images/first-hidden-layer-third-weight-path.svg"><img src="https://www.hirahim.com/posts/backpropagation-by-example/images/first-hidden-layer-third-weight-path.svg" alt="Diagram showing the path walked through the two hidden layer neural network in order to arrive at our intermediate variable “c”." /></a></p>
<p>And then the bottom-most path:</p>
<p>$$
\begin{align*}
d &= \operatorname{mse}^\prime(0.958, 0.95) * w_{o2} * w_{j22} \\
&= 0.016 * 0.6 * 0.7 \\
&= 0.00672
\end{align*}
$$</p>
<p><a href="images/first-hidden-layer-fourth-weight-path.svg"><img src="https://www.hirahim.com/posts/backpropagation-by-example/images/first-hidden-layer-fourth-weight-path.svg" alt="Diagram showing the path walked through the two hidden layer neural network in order to arrive at our intermediate variable “d”." /></a></p>
<p>And then, like before:</p>
<p>$$
\begin{align*}
\frac{\partial E}{\partial w_{h2}} &= (c + d) * 1 * 0.5 \\
&= (0.00384 + 0.00672) * 1 * 0.5 \\
&= 0.00528 \\
\\
\frac{\partial E}{\partial b_{h2}} &= (c + d) * 1 * 1 \\
&= (0.00384 + 0.00672) * 1 * 1 \\
&= 0.01056
\end{align*}
$$</p>
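<p>Here’s the whole path-walking calculation for the first hidden layer in plain Python, mirroring the intermediate variables \(a\), \(b\), \(c\), and \(d\) above (again reusing mse_prime from the earlier sketch):</p>
<pre data-lang="python"><code class="language-python"># First hidden layer gradients for the two-hidden-layer network.
i = 0.5
w_o1, w_o2 = 0.4, 0.6
w_j11, w_j12 = 0.1, 0.6
w_j21, w_j22 = 0.8, 0.7

dE_do = mse_prime(0.958, 0.95)   # 0.016

# Walk every path from the output back to h1 and h2, multiplying the
# weights along the way.
a = dE_do * w_o1 * w_j11         # 0.00064 (top path to h1)
b = dE_do * w_o2 * w_j21         # 0.00768 (bottom path to h1)
c = dE_do * w_o1 * w_j12         # 0.00384 (top path to h2)
d = dE_do * w_o2 * w_j22         # 0.00672 (bottom path to h2)

dE_dw_h1 = (a + b) * 1 * i       # 0.00416
dE_db_h1 = (a + b) * 1 * 1       # 0.00832
dE_dw_h2 = (c + d) * 1 * i       # 0.00528
dE_db_h2 = (c + d) * 1 * 1       # 0.01056
</code></pre>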
<pre data-lang="python" style="background-color:#2b2c2f;color:#cccece;" class="language-python "><code class="language-python" data-lang="python"><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">('</span><span style="color:#99c794;">Gradient for output layer weights: </span><span style="color:#5fb3b3;">', </span><span style="color:#6699cc;">output_layer_weights</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">grad</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Gradient for output layer weights: tensor([[0.0076], [0.0125]])"
</span><span>
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">('</span><span style="color:#99c794;">Gradient for output layer bias: </span><span style="color:#5fb3b3;">', </span><span style="color:#6699cc;">output_layer_bias</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">grad</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Gradient for output layer bias: tensor([0.0160])"
</span><span>
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">('</span><span style="color:#99c794;">Gradient for hidden layer 2 weights: </span><span style="color:#5fb3b3;">', </span><span style="color:#6699cc;">layer2_weights</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">grad</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Gradient for hidden layer 2 weights: tensor([[0.0035, 0.0053], [0.0013, 0.0019]])"
</span><span>
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">('</span><span style="color:#99c794;">Gradient for hidden layer 2 bias: </span><span style="color:#5fb3b3;">', </span><span style="color:#6699cc;">layer2_bias</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">grad</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Gradient for hidden layer 2 bias: tensor([0.0064, 0.0096])"
</span><span>
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">('</span><span style="color:#99c794;">Gradient for hidden layer 1 weights: </span><span style="color:#5fb3b3;">', </span><span style="color:#6699cc;">layer1_weights</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">grad</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Gradient for hidden layer 1 weights: tensor([[0.0042, 0.0053]])"
</span><span>
</span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">('</span><span style="color:#99c794;">Gradient for hidden layer 1 bias: </span><span style="color:#5fb3b3;">', </span><span style="color:#6699cc;">layer1_bias</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">grad</span><span style="color:#5fb3b3;">)
</span><span style="color:#5f6364;"># Prints "Gradient for hidden layer 1 bias: tensor([0.0083, 0.0106])"
</span></code></pre>
<h2 id="wrapping-up">Wrapping Up</h2>
<p>Phew, so much for going another year without using calculus! If you’ve found this helpful or spot any errors, please let me know! Feel free to shoot me an e-mail at <a href="mailto:rahim@sonawalla.org">rahim@sonawalla.org</a>.</p>
<h1>Using Machine Learning to Detect Songs Produced by Jack Antonoff</h1>
<p>Rahim Sonawalla · 2022-11-30 · <a href="https://www.hirahim.com/posts/antonoff-detector/">https://www.hirahim.com/posts/antonoff-detector/</a></p>
<p>It’s been 17 years since I last did anything in the AI/ML space. Back then, it was as an undergrad taking a “Projects in Artificial Intelligence” class using genetic algorithms to make an app that dynamically generated melodies over chord progressions. With announcements of ground-breaking advances in neural networks coming out seemingly daily for the past few months, I knew I needed to find some time to better understand what’s been going on in all those years. So I requested a big chunk of time off work, and started going through the <a rel="noopener" target="_blank" href="https://course.fast.ai/">fast.ai course</a>.</p>
<p>I’ve only just finished the first lesson so far, but thanks to how approachable fast.ai makes its materials, I’ve managed to throw together something interesting.</p>
<h2 id="the-idea">The Idea</h2>
<p>Last month, <a rel="noopener" target="_blank" href="https://twitter.com/calebgamman">@calebgamman</a> put up a viral video of him guessing which songs off Taylor Swift’s new album, Midnights, were produced by Jack Antonoff. If you haven’t already, you should take a moment to watch it <a rel="noopener" target="_blank" href="https://twitter.com/calebgamman/status/1583474726797979649">here</a>.</p>
<p>(<a rel="noopener" target="_blank" href="https://www.youtube.com/@calebgamman">Caleb’s YouTube channel</a> is also worth checking out. He’s got a bunch of high-quality videos, including one on his thoughts <a rel="noopener" target="_blank" href="https://www.youtube.com/watch?v=MravA_dgUkQ">on AI</a>, and a criminally low subscriber count.)</p>
<p>I was <em>very</em> skeptical it would work at all, but figured it might be fun to see if I could use what I learned from the first chapter (and some Googling) to train a neural network to do something similar. I’m in absolute disbelief that it actually turned out to work decently and didn’t require a huge training set nor much coding at all.</p>
<h2 id="the-approach">The Approach</h2>
<p>The implementation turned out to be straightforward: take the audio files, turn them into spectrograms, and then use those to “fine tune” an existing, pre-trained computer vision model.</p>
<p>To start, I went through Taylor Swift’s album pages on Wikipedia to find songs produced by Jack Antonoff. Antonoff has produced music for a ton of other artists, including The 1975, Lorde, and Lana del Rey, but I decided to limit the training set to just Taylor Swift songs, since I wasn’t sure what would happen if I started throwing in other artists. I also grabbed URLs to song previews for each song and put everything in a <a rel="noopener" target="_blank" href="https://gist.github.com/rahims/7d4177e4272aa697da5df737aefd55dd">CSV file</a>. Here’s a sample of its structure:</p>
<pre style="background-color:#2b2c2f;color:#cccece;"><code><span>Producer,Artist,Title,URL
</span><span>Jack Antonoff,Taylor Swift,Out of the Woods,https://audio-ssl.itunes.apple.com/itunes-assets/AudioPreview115/v4/86/37/dd/8637dd40-bbbd-ed09-9c0d-cc802f5cfd21/mzaf_10076581349863730065.plus.aac.ep.m4a
</span><span>Jack Antonoff,Taylor Swift,Cruel Summer,https://audio-ssl.itunes.apple.com/itunes-assets/AudioPreview112/v4/85/8e/b9/858eb9a3-e75a-363a-4049-71d654a4104c/mzaf_4981758308159811037.plus.aac.ep.m4a
</span></code></pre>
<p>I ended up with 85 songs to use for training, 39 produced by Antonoff and 46 produced by others. Next, I wrote a Python script to parse the CSV, download the audio previews, and generate a spectrogram of the first 30 seconds of each audio preview:</p>
<pre data-lang="python" style="background-color:#2b2c2f;color:#cccece;" class="language-python "><code class="language-python" data-lang="python"><span style="color:#c594c5;">import </span><span>re
</span><span style="color:#c594c5;">import </span><span>csv
</span><span style="color:#c594c5;">import </span><span>librosa
</span><span style="color:#c594c5;">import </span><span>numpy </span><span style="color:#c594c5;">as </span><span>np
</span><span style="color:#c594c5;">import </span><span>matplotlib</span><span style="color:#5fb3b3;">.</span><span>pyplot </span><span style="color:#c594c5;">as </span><span>plt
</span><span style="color:#c594c5;">import </span><span>librosa</span><span style="color:#5fb3b3;">.</span><span>display
</span><span style="color:#c594c5;">from </span><span>urllib</span><span style="color:#5fb3b3;">.</span><span>request </span><span style="color:#c594c5;">import </span><span>urlretrieve
</span><span>
</span><span style="color:#c594c5;">def </span><span style="color:#6699cc;">clean_name</span><span style="color:#5fb3b3;">(</span><span style="color:#f99157;">name</span><span style="color:#5fb3b3;">):
</span><span> cleaned </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">re</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">sub</span><span style="color:#5fb3b3;">('</span><span style="color:#99c794;">[^a-zA-Z0-9 </span><span style="color:#5fb3b3;">\n</span><span style="color:#99c794;">\.]</span><span style="color:#5fb3b3;">', '', </span><span style="color:#6699cc;">name</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">lower</span><span style="color:#5fb3b3;">())
</span><span> </span><span style="color:#c594c5;">return </span><span style="color:#6699cc;">cleaned</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">replace</span><span style="color:#5fb3b3;">(" ", "</span><span style="color:#99c794;">_</span><span style="color:#5fb3b3;">")
</span><span>
</span><span style="color:#c594c5;">def </span><span style="color:#6699cc;">plot</span><span style="color:#5fb3b3;">(</span><span style="color:#f99157;">filename</span><span style="color:#5fb3b3;">, </span><span style="color:#f99157;">outname</span><span style="color:#5fb3b3;">):
</span><span> </span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">(</span><span style="color:#c594c5;">f</span><span style="color:#5fb3b3;">'</span><span style="color:#99c794;">Loading </span><span style="color:#5fb3b3;">{</span><span style="color:#6699cc;">filename</span><span style="color:#5fb3b3;">}')
</span><span> y, sr </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">librosa</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">load</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">filename</span><span style="color:#5fb3b3;">, </span><span style="color:#f99157;">duration</span><span style="color:#5fb3b3;">=</span><span style="color:#f99157;">30</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">)
</span><span>
</span><span> fig, ax </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">plt</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">subplots</span><span style="color:#5fb3b3;">()
</span><span> </span><span style="color:#6699cc;">plt</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">axis</span><span style="color:#5fb3b3;">('</span><span style="color:#99c794;">off</span><span style="color:#5fb3b3;">')
</span><span>
</span><span> S </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">librosa</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">feature</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">melspectrogram</span><span style="color:#5fb3b3;">(</span><span style="color:#f99157;">y</span><span style="color:#5fb3b3;">=</span><span style="color:#6699cc;">y</span><span style="color:#5fb3b3;">, </span><span style="color:#f99157;">sr</span><span style="color:#5fb3b3;">=</span><span style="color:#6699cc;">sr</span><span style="color:#5fb3b3;">)
</span><span> S_dB </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">librosa</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">power_to_db</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">S</span><span style="color:#5fb3b3;">, </span><span style="color:#f99157;">ref</span><span style="color:#5fb3b3;">=</span><span style="color:#6699cc;">np</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">max</span><span style="color:#5fb3b3;">)
</span><span> </span><span style="color:#6699cc;">librosa</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">display</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">specshow</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">S_dB</span><span style="color:#5fb3b3;">, </span><span style="color:#f99157;">sr</span><span style="color:#5fb3b3;">=</span><span style="color:#6699cc;">sr</span><span style="color:#5fb3b3;">, </span><span style="color:#f99157;">x_axis</span><span style="color:#5fb3b3;">='</span><span style="color:#99c794;">time</span><span style="color:#5fb3b3;">', </span><span style="color:#f99157;">y_axis</span><span style="color:#5fb3b3;">='</span><span style="color:#99c794;">mel</span><span style="color:#5fb3b3;">', </span><span style="color:#f99157;">ax</span><span style="color:#5fb3b3;">=</span><span style="color:#6699cc;">ax</span><span style="color:#5fb3b3;">)
</span><span>
</span><span> </span><span style="color:#6699cc;">fig</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">savefig</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">outname</span><span style="color:#5fb3b3;">, </span><span style="color:#f99157;">bbox_inches </span><span style="color:#5fb3b3;">= '</span><span style="color:#99c794;">tight</span><span style="color:#5fb3b3;">', </span><span style="color:#f99157;">pad_inches </span><span style="color:#5fb3b3;">= </span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">)
</span><span> </span><span style="color:#6699cc;">plt</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">close</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">fig</span><span style="color:#5fb3b3;">)
</span><span>
</span><span style="color:#c594c5;">def </span><span style="color:#6699cc;">download_files</span><span style="color:#5fb3b3;">(</span><span style="color:#f99157;">data_file</span><span style="color:#5fb3b3;">):
</span><span> </span><span style="color:#c594c5;">with </span><span style="color:#6699cc;">open</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">data_file</span><span style="color:#5fb3b3;">, </span><span style="color:#f99157;">newline</span><span style="color:#5fb3b3;">='') </span><span style="color:#c594c5;">as </span><span>csvfile</span><span style="color:#5fb3b3;">:
</span><span> csvreader </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">csv</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">DictReader</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">csvfile</span><span style="color:#5fb3b3;">)
</span><span>
</span><span> </span><span style="color:#c594c5;">for </span><span>row </span><span style="color:#c594c5;">in </span><span>csvreader</span><span style="color:#5fb3b3;">:
</span><span> producer </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">clean_name</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">row</span><span style="color:#5fb3b3;">['</span><span style="color:#99c794;">Producer</span><span style="color:#5fb3b3;">'])
</span><span> artist </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">clean_name</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">row</span><span style="color:#5fb3b3;">['</span><span style="color:#99c794;">Artist</span><span style="color:#5fb3b3;">'])
</span><span> title </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">clean_name</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">row</span><span style="color:#5fb3b3;">['</span><span style="color:#99c794;">Title</span><span style="color:#5fb3b3;">'])
</span><span> url </span><span style="color:#5fb3b3;">= </span><span>row</span><span style="color:#5fb3b3;">['</span><span style="color:#99c794;">URL</span><span style="color:#5fb3b3;">']
</span><span>
</span><span> filename </span><span style="color:#5fb3b3;">= </span><span style="color:#c594c5;">f</span><span style="color:#5fb3b3;">'{</span><span>producer</span><span style="color:#5fb3b3;">}</span><span style="color:#99c794;">-</span><span style="color:#5fb3b3;">{</span><span>artist</span><span style="color:#5fb3b3;">}</span><span style="color:#99c794;">-</span><span style="color:#5fb3b3;">{</span><span>title</span><span style="color:#5fb3b3;">}'
</span><span>
</span><span> </span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">(</span><span style="color:#c594c5;">f</span><span style="color:#5fb3b3;">'</span><span style="color:#99c794;">Downloading </span><span style="color:#5fb3b3;">{</span><span style="color:#6699cc;">filename</span><span style="color:#5fb3b3;">}')
</span><span> </span><span style="color:#6699cc;">urlretrieve</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">url</span><span style="color:#5fb3b3;">, </span><span style="color:#c594c5;">f</span><span style="color:#5fb3b3;">'</span><span style="color:#99c794;">./originals/</span><span style="color:#5fb3b3;">{</span><span style="color:#6699cc;">filename</span><span style="color:#5fb3b3;">}</span><span style="color:#99c794;">.m4a</span><span style="color:#5fb3b3;">')
</span><span>
</span><span> </span><span style="color:#6699cc;">print</span><span style="color:#5fb3b3;">(</span><span style="color:#c594c5;">f</span><span style="color:#5fb3b3;">'</span><span style="color:#99c794;">Generating spectrograms for </span><span style="color:#5fb3b3;">{</span><span style="color:#6699cc;">filename</span><span style="color:#5fb3b3;">}')
</span><span> </span><span style="color:#6699cc;">plot</span><span style="color:#5fb3b3;">(</span><span style="color:#c594c5;">f</span><span style="color:#5fb3b3;">'</span><span style="color:#99c794;">./originals/</span><span style="color:#5fb3b3;">{</span><span style="color:#6699cc;">filename</span><span style="color:#5fb3b3;">}</span><span style="color:#99c794;">.m4a</span><span style="color:#5fb3b3;">', </span><span style="color:#c594c5;">f</span><span style="color:#5fb3b3;">'</span><span style="color:#99c794;">./spectrograms/</span><span style="color:#5fb3b3;">{</span><span style="color:#6699cc;">filename</span><span style="color:#5fb3b3;">}</span><span style="color:#99c794;">.png</span><span style="color:#5fb3b3;">')
</span><span>
</span><span style="color:#6699cc;">download_files</span><span style="color:#5fb3b3;">('</span><span style="color:#99c794;">data.csv</span><span style="color:#5fb3b3;">')
</span></code></pre>
<p>I spent a bunch of time trying to figure out how to generate and save spectrogram images. A lot of tutorials I found online used the <a rel="noopener" target="_blank" href="https://fastaudio.github.io/">fastaudio</a> library, but that had version conflicts with the <a rel="noopener" target="_blank" href="https://github.com/fastai/fastai">fastai</a> libraries, so I decided to switch to using <a rel="noopener" target="_blank" href="https://librosa.org/">librosa</a> instead. Eventually, I got everything working and saving images that looked like this:</p>
<p><a href="images/jack_antonoff-taylor_swift-antihero.png"><img src="https://www.hirahim.com/posts/antonoff-detector/images/jack_antonoff-taylor_swift-antihero.png" alt="Spectrogram of the first 30 seconds of Taylor Swift’s song, Anti-Hero" /></a></p>
<p>From there, I moved over to a Jupyter notebook on Kaggle to do the actual training since my four-year-old laptop wasn’t up for the task. I zipped up the images, uploaded them as a Kaggle dataset, and wrote some basic code to fine tune a vision model:</p>
<pre data-lang="python" style="background-color:#2b2c2f;color:#cccece;" class="language-python "><code class="language-python" data-lang="python"><span style="color:#c594c5;">from </span><span>fastai</span><span style="color:#5fb3b3;">.</span><span>vision</span><span style="color:#5fb3b3;">.</span><span>all </span><span style="color:#c594c5;">import </span><span style="color:#f99157;">*
</span><span>
</span><span>input_path </span><span style="color:#5fb3b3;">= '</span><span style="color:#99c794;">/kaggle/input/antonoff-detector/spectograms</span><span style="color:#5fb3b3;">'
</span><span>
</span><span style="color:#c594c5;">def </span><span style="color:#6699cc;">is_produced_by_jack</span><span style="color:#5fb3b3;">(</span><span style="color:#f99157;">x</span><span style="color:#5fb3b3;">):
</span><span> </span><span style="color:#c594c5;">return </span><span style="color:#6699cc;">x</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">startswith</span><span style="color:#5fb3b3;">('</span><span style="color:#99c794;">jack_antonoff-</span><span style="color:#5fb3b3;">')
</span><span>
</span><span>dls </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">ImageDataLoaders</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">from_name_func</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">input_path</span><span style="color:#5fb3b3;">,
</span><span style="color:#6699cc;"> get_image_files</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">input_path</span><span style="color:#5fb3b3;">),
</span><span style="color:#6699cc;"> </span><span style="color:#f99157;">valid_pct</span><span style="color:#5fb3b3;">=</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">2</span><span style="color:#5fb3b3;">,
</span><span style="color:#6699cc;"> </span><span style="color:#f99157;">seed</span><span style="color:#5fb3b3;">=</span><span style="color:#f99157;">42</span><span style="color:#5fb3b3;">,
</span><span style="color:#6699cc;"> </span><span style="color:#f99157;">label_func</span><span style="color:#5fb3b3;">=</span><span style="color:#6699cc;">is_produced_by_jack</span><span style="color:#5fb3b3;">,
</span><span style="color:#6699cc;"> </span><span style="color:#f99157;">bs</span><span style="color:#5fb3b3;">=</span><span style="color:#f99157;">16</span><span style="color:#5fb3b3;">)
</span><span>
</span><span>learn </span><span style="color:#5fb3b3;">= </span><span style="color:#6699cc;">vision_learner</span><span style="color:#5fb3b3;">(</span><span style="color:#6699cc;">dls</span><span style="color:#5fb3b3;">, </span><span style="color:#6699cc;">resnet18</span><span style="color:#5fb3b3;">, </span><span style="color:#f99157;">loss_func</span><span style="color:#5fb3b3;">=</span><span style="color:#6699cc;">CrossEntropyLossFlat</span><span style="color:#5fb3b3;">(), </span><span style="color:#f99157;">metrics</span><span style="color:#5fb3b3;">=[</span><span style="color:#6699cc;">accuracy</span><span style="color:#5fb3b3;">])
</span><span>
</span><span>callbacks </span><span style="color:#5fb3b3;">= [</span><span style="color:#6699cc;">ReduceLROnPlateau</span><span style="color:#5fb3b3;">(</span><span style="color:#f99157;">monitor</span><span style="color:#5fb3b3;">='</span><span style="color:#99c794;">valid_loss</span><span style="color:#5fb3b3;">', </span><span style="color:#f99157;">min_delta</span><span style="color:#5fb3b3;">=</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">1</span><span style="color:#5fb3b3;">, </span><span style="color:#f99157;">patience</span><span style="color:#5fb3b3;">=</span><span style="color:#f99157;">2</span><span style="color:#5fb3b3;">),
</span><span> </span><span style="color:#6699cc;">EarlyStoppingCallback</span><span style="color:#5fb3b3;">(</span><span style="color:#f99157;">monitor</span><span style="color:#5fb3b3;">='</span><span style="color:#99c794;">valid_loss</span><span style="color:#5fb3b3;">', </span><span style="color:#f99157;">min_delta</span><span style="color:#5fb3b3;">=</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">05</span><span style="color:#5fb3b3;">, </span><span style="color:#f99157;">patience</span><span style="color:#5fb3b3;">=</span><span style="color:#f99157;">4</span><span style="color:#5fb3b3;">)]
</span><span>
</span><span style="color:#6699cc;">learn</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">fine_tune</span><span style="color:#5fb3b3;">(</span><span style="color:#f99157;">20</span><span style="color:#5fb3b3;">, </span><span style="color:#f99157;">wd</span><span style="color:#5fb3b3;">=</span><span style="color:#f99157;">0</span><span style="color:#5fb3b3;">.</span><span style="color:#f99157;">1</span><span style="color:#5fb3b3;">, </span><span style="color:#f99157;">cbs</span><span style="color:#5fb3b3;">=</span><span style="color:#6699cc;">callbacks</span><span style="color:#5fb3b3;">)
</span></code></pre>
<p>Finally, I tried the model out on some song spectrograms that I had set aside for testing (ones the model had never seen before) and was surprised to find it actually did a good job! 75% of the time, it was able to correctly identify whether a Taylor Swift song was produced by Jack Antonoff. I’m honestly amazed at how well it did with so little work.</p>
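<p>For the curious, here’s roughly what checking a single held-out spectrogram looks like. This is a minimal sketch assuming fastai’s standard <code>predict</code> API; the file path is made up for illustration:</p>
<pre data-lang="python" style="background-color:#2b2c2f;color:#cccece;" class="language-python "><code class="language-python" data-lang="python"># Minimal sketch: classify one held-out spectrogram with the fine-tuned model.
# The path below is illustrative, not an actual file from my test set.
test_image = '/kaggle/input/antonoff-detector/test/some_test_song.png'

# predict returns the decoded label, its index, and per-class probabilities.
# Labels here are True/False, since the label function returns a boolean.
label, label_idx, probs = learn.predict(test_image)
print(f'Produced by Antonoff? {label} (probability: {float(probs[label_idx]):.2f})')
</code></pre>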
<p>From there, I tried a few tweaks to see if I could improve the accuracy:</p>
<ol>
<li>Instead of using the <code>resnet18</code> architecture, I tried <code>resnet34</code>. Unfortunately, that didn’t help and resulted in decreased accuracy. I tried <code>densenet161</code> as well, since I’d seen <a rel="noopener" target="_blank" href="https://wandb.ai/jhartquist/fastaudio-esc-50/reports/Fine-Tuning-ResNet-18-for-Audio-Classification--VmlldzoyOTAyMzc">an article</a> about how it performed better than <code>resnet18</code> on audio spectrograms, but I kept hitting GPU limits in Kaggle and wasn’t able to run it at all.</li>
<li>I tried different batch sizes: 8, 16, and 32, but settled on 16.</li>
<li>I tried some light data augmentation using <a rel="noopener" target="_blank" href="https://docs.fast.ai/callback.mixup.html#mixup">MixUp</a>, but that didn’t help. (See the sketch after this list.)</li>
<li>The audio previews were about a minute long, and I was only taking the first 30 seconds, so I figured I might be able to increase the accuracy by making spectrograms of the second 30 seconds of the songs and adding those to the training set as well. This helped a little, bringing the accuracy up to 78% against never-seen-before test data. For this model, I also tried to see how well it would do against songs not by Taylor Swift (some produced by Antonoff and some by others), but came back with only 54% accuracy.</li>
</ol>
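<p>For the MixUp experiment mentioned in the list above, the change was tiny. Here’s a sketch of what it could look like, using fastai’s <code>MixUp</code> callback and reusing the earlier callback list:</p>
<pre data-lang="python" style="background-color:#2b2c2f;color:#cccece;" class="language-python "><code class="language-python" data-lang="python">from fastai.callback.mixup import MixUp

# Same training call as before, with MixUp appended to the callback list.
# In my runs, this didn't end up improving accuracy.
learn.fine_tune(20, wd=0.1, cbs=callbacks + [MixUp()])
</code></pre>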
<p>While doing all this, I was surprised to find that training models isn’t deterministic—different training runs result in different levels of accuracy. For example, here I train the model and get an accuracy of 87%:</p>
<p><a href="images/run-1.png"><img src="https://www.hirahim.com/posts/antonoff-detector/images/run-1.png" alt="Screenshot of the output of the first model training run showing 87% accuracy" /></a></p>
<p>And then immediately run it again—with no other changes—and get an accuracy of 77%:</p>
<p><a href="images/run-2.png"><img src="https://www.hirahim.com/posts/antonoff-detector/images/run-2.png" alt="Screenshot of output of second model training run showing 77% accuracy" /></a></p>
<p>Even stranger, the model with 87% accuracy was wrong when I asked it about a Lorde song produced by Joel Little, while the model with 77% accuracy correctly identified that it wasn’t produced by Antonoff.</p>
<p>In retrospect, this non-determinism makes sense: the neural network is “discovering” a set of weights that fits the problem space, and with a small training set there could be multiple sets of weights that work. It still feels…weird, though. And I have no idea why the model with lower accuracy gave a better answer for the Lorde test case.</p>
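<p>Relatedly, if you want more repeatable runs, fastai ships a <code>set_seed</code> helper that seeds the underlying random number generators. Here’s a sketch, with the caveat that some GPU operations can remain nondeterministic (the <code>seed=42</code> I passed to <code>ImageDataLoaders</code> earlier only fixes the train/validation split, not weight initialization or batch order):</p>
<pre data-lang="python" style="background-color:#2b2c2f;color:#cccece;" class="language-python "><code class="language-python" data-lang="python">from fastai.torch_core import set_seed

# Seeds Python, NumPy, and PyTorch RNGs; call this before building the
# DataLoaders and Learner. reproducible=True also asks cuDNN to prefer
# deterministic algorithms where it can.
set_seed(42, reproducible=True)
</code></pre>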
<h2 id="final-thoughts">Final Thoughts</h2>
<p>Overall, I wasn’t prepared for how “fuzzy” things would be. I thought there’d be a more precise way to determine which architecture and parameter values to use, but it ended up being a lot of “try it and see how it works”. Will using MixUp improve accuracy? Maybe? Try it and find out! What batch size should I be using? Small, but not too small. How small? Try a few and see!</p>
<p>I still don’t know enough to make any bold predictions about AI, but I’m having fun with what I’m learning so far and looking forward to the rest of the course.</p>
Shutting Down SoundCloud on Sonos
Rahim Sonawalla
2014-07-14T00:00:00+00:00
2014-07-14T00:00:00+00:00
https://www.hirahim.com/posts/shutting-down-soundcloud-on-sonos/
<p>Projects rarely end cleanly. It’s tough to shut the door on something you’ve put yourself into, so more often than not, you set it aside, tell yourself you’ll pick it up again someday, and it ends up ignored and eventually forgotten. It’s the slow death of software, and I’m as guilty of it as anyone else, but in the case of <a href="https://www.hirahim.com/projects/sonos-soundcloud/">SoundCloud on Sonos</a>, I’m happy to say I won’t be. SoundCloud <a rel="noopener" target="_blank" href="https://sonos.soundcloud.com/">released an official service for Sonos</a> today, so I’m shutting mine down. I started the project almost two years ago to solve a need that a lot of people had, and now that an official service is available, there’s not a lot of sense in keeping mine around. (If you need help setting up the official version, <a rel="noopener" target="_blank" href="https://sonos.custhelp.com/app/answers/detail/a_id/2836">check the setup instructions</a>. You can find uninstall instructions for my service <a href="https://www.hirahim.com/projects/sonos-soundcloud/#uninstall">here</a>.)</p>
<h2 id="metrics">Metrics</h2>
<p>Overall, I’d say the project was successful. The service had roughly 10,000 unique households registered, playing around 22,000 tracks each day, and making 2 million requests to the server a week, all off word-of-mouth and one free <a rel="noopener" target="_blank" href="https://www.heroku.com/">Heroku</a> instance. That’s a lot of houses filled with great music!</p>
<p><a href="images/mixpanel.png"><img src="https://www.hirahim.com/posts/shutting-down-soundcloud-on-sonos/images/mixpanel.png" alt="Mixpanel Metrics" /></a></p>
<p><a href="images/newrelic.png"><img src="https://www.hirahim.com/posts/shutting-down-soundcloud-on-sonos/images/newrelic.png" alt="New Relic Metrics" /></a></p>
<h2 id="goodwill">Goodwill</h2>
<p>Metrics are one thing, but what really made me proud was all the kind e-mails and tweets. During the lower points, it felt good knowing that someone out there was a little happier because of something I made. If you’re using software you like, take a moment to write the creator(s) an e-mail; it’ll help more than you think.</p>
<p><a href="images/glitch-mob-tweet.png"><img src="https://www.hirahim.com/posts/shutting-down-soundcloud-on-sonos/images/glitch-mob-tweet.png" alt="Screenshot of a tweet from The Glitch Mob" /></a>
<a href="images/sarah-tweet.png"><img src="https://www.hirahim.com/posts/shutting-down-soundcloud-on-sonos/images/sarah-tweet.png" alt="Screenshot of a tweet from Sarah Julien" /></a>
<a href="images/thank-you-email.png"><img src="https://www.hirahim.com/posts/shutting-down-soundcloud-on-sonos/images/thank-you-email.png" alt="Screenshot of a “thank you” e-mail" /></a>
<a href="images/thanks-email.png"><img src="https://www.hirahim.com/posts/shutting-down-soundcloud-on-sonos/images/thanks-email.png" alt="Screenshot of another “thank you” e-mail" /></a></p>
<h2 id="other-integrations-and-source-code">Other Integrations and Source Code</h2>
<p>Outside of support questions, the two most frequently asked questions I got from people were “could you do something similar for Mixcloud and 8tracks” and “can I have the source code?” With the service shutting down today, I imagine that’ll be on people’s minds now more than ever. Unfortunately, the answer to both of those questions is no, because I currently work at <a rel="noopener" target="_blank" href="https://www.beatsmusic.com/">Beats Music</a>, and doing either would probably be seen as helping out a competitor. (I started SoundCloud on Sonos before I joined Beats Music.) However, if you’re a developer, you might be interested in <a rel="noopener" target="_blank" href="https://github.com/SoCo/SoCo">SoCo</a>, an open-source library to control Sonos speakers that <a href="https://www.hirahim.com/posts/dissecting-the-sonos-controller/">I put out a while back</a>. (For similar reasons, it’s now managed by a group of very active and very friendly developers.)</p>
<h2 id="what-s-next">What’s Next</h2>
<p>For non-compete reasons, I’ve got no immediate plans to do any new Sonos apps, nor any SoundCloud integrations, but my love for connected devices (what initially drew me to Sonos) is still strong and I’ll be exploring that with other devices as they catch my eye. You can always reach me at <a href="mailto:rahim@sonawalla.org">rahim@sonawalla.org</a> or <a rel="noopener" target="_blank" href="https://twitter.com/rahims">@rahims</a>.</p>
SoundCloud on Sonos
Rahim Sonawalla
2012-11-25T00:00:00+00:00
2012-11-25T00:00:00+00:00
https://www.hirahim.com/projects/sonos-soundcloud/
<p>Just a placeholder. All the actual stuff is in the template since it was straight copied over.</p>
<h2 id="uninstall">Uninstall</h2>
<p>^ This is needed because of a deeplink in another post</p>
Saving Pandora Songs to Rdio
Rahim Sonawalla
2012-10-28T00:00:00+00:00
2012-10-28T00:00:00+00:00
https://www.hirahim.com/posts/saving-pandora-songs-to-rdio/
<p>I flew to Austin the other weekend for the <a rel="noopener" target="_blank" href="http://austinmusichacks.eventbrite.com/">Austin Music Hackathon</a> and built a quick Chrome extension called “<a rel="noopener" target="_blank" href="https://github.com/rahims/Pandora-to-Rdio">Pandora to Rdio</a>“. (Lame name, I know.) It adds any track you “thumbs up” in Pandora to an Rdio playlist. Take the pain out of creating playlists by letting Pandora do the heavy lifting of finding music you like. Download the extension/grab its source code on <a rel="noopener" target="_blank" href="https://github.com/rahims/Pandora-to-Rdio">my Github page</a>.</p>
<h2 id="motivation">Motivation</h2>
<p>Pandora is great at finding music I like, but it streams one track after another without any ability to repeat a song. Rdio is great at letting me pick and choose songs I want to listen to and repeat them as often as I like, but—despite its name—does a lousy job of finding me new music. I wanted to get a <a rel="noopener" target="_blank" href="https://www.youtube.com/watch?v=fPRtTc-cuEQ">peanut butter chocolate situation</a> going and combine the good parts of each service.</p>
<h2 id="revisiting-pandora">Revisiting Pandora</h2>
<p>I created a somewhat popular Firefox extension called <a href="https://www.hirahim.com/posts/harmony-10-now-available/">Harmony</a> a few years back that automatically scrobbled songs you listened to on Pandora to Last.fm. It used an undocumented Pandora Javascript API that ended up being removed during one of Pandora’s redesigns (subsequently breaking my extension). I’d been meaning to tinker around with Pandora again since then, and what better time than at a music hackathon?</p>
<p>The latest version of the Pandora web player is in HTML, so hooking into it is straightforward. The Chrome Developer Tools came in handy: I simply right-clicked on the “thumbs up” button, hit “Inspect”, and had its CSS class name. I did the same for the song’s name and artist. With those, I used a few lines of jQuery to add an event listener to the “thumbs up” button and extract the values for song title and artist.</p>
<p><a href="images/Chrome-Developer-Tools-Pandora.png"><img src="https://www.hirahim.com/posts/saving-pandora-songs-to-rdio/images/Chrome-Developer-Tools-Pandora.png" alt="A screenshot of the Chrome Developer Tools inspecting the Pandora website" /></a></p>
<p>The trickier part was working with OAuth in a browser extension. The Chrome Extension documentation has a <a rel="noopener" target="_blank" href="http://developer.chrome.com/extensions/tut_oauth.html">page</a> dedicated to it, and provides a helper library that makes the OAuth dance a breeze. Well, except that if you use the code as provided, you’ll be presented with an error in your Javascript Console:</p>
<pre style="background-color:#2b2c2f;color:#cccece;"><code><span>Refused to execute inline script because it violates the following Content Security Policy directive: "script-src 'self' chrome-extension-resource:". - chrome_ex_oauth.html:19
</span></code></pre>
<p>The reason for this is that the Chrome security policy doesn’t allow HTML pages in extensions to have inline Javascript. The workaround is to move the small amount of Javascript code in <code>chrome_ex_oauth.html</code> into a new Javascript file. I called mine <code>oauth_helper.js</code>:</p>
<pre data-lang="javascript" style="background-color:#2b2c2f;color:#cccece;" class="language-javascript "><code class="language-javascript" data-lang="javascript"><span style="color:#6699cc;">$</span><span>(document)</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">ready</span><span>(</span><span style="color:#c594c5;">function</span><span style="color:#5fb3b3;">() {
</span><span> ChromeExOAuth</span><span style="color:#5fb3b3;">.</span><span style="color:#6699cc;">initCallbackPage</span><span>()</span><span style="color:#5fb3b3;">;
</span><span style="color:#5fb3b3;">}</span><span>)</span><span style="color:#5fb3b3;">;
</span></code></pre>
<p>Here’s my updated <a rel="noopener" target="_blank" href="https://github.com/rahims/Pandora-to-Rdio/blob/master/chrome_ex_oauth.html">chrome_ex_oauth.html</a> and <a rel="noopener" target="_blank" href="https://github.com/rahims/Pandora-to-Rdio/blob/master/oauth_helper.js">oauth_helper.js</a> for reference.</p>
<h2 id="invisible-interfaces-and-room-for-improvement">Invisible Interfaces and Room for Improvement</h2>
<p>When you install the extension, one of the things you’ll notice is that there isn’t anything to notice. Unlike most other extensions, Pandora to Rdio doesn’t add buttons or any other elements to the browser’s chrome. This was a conscious decision. The interface is Pandora itself. Like a song? Hit the thumbs up button in Pandora, and the extension will silently take care of adding the song to your Rdio playlist. It’ll even create the special “Pandora Favorites” playlist automatically if it doesn’t exist. An ideal invisible interface only disturbs the user for errors the application couldn’t deal with on its own, so you won’t find any “successfully added track” type notifications. Success is the default, not something that needs to be called out every time. Unfortunately, Pandora to Rdio isn’t an ideal invisible interface: while it doesn’t disturb you with success notifications, it also doesn’t call out errors properly. This is something I wanted to work on, but I ended up running out of time.</p>
<p>Unfortunately, it’s also not packaged for distribution to non-technical users since each person needs to sign up for their own (free) <a rel="noopener" target="_blank" href="http://developer.rdio.com/">Rdio Developer account</a> and enter their Rdio API credentials in the <a rel="noopener" target="_blank" href="https://github.com/rahims/Pandora-to-Rdio/blob/master/background.js">background.js</a> file. I wrote up <a rel="noopener" target="_blank" href="https://github.com/rahims/Pandora-to-Rdio#installation">step-by-step install instructions</a> that should help make things easier.</p>
<p>Despite a few rough edges, I think the project is still a useful starting point for more elaborate hackathon projects. You can grab the source code on <a rel="noopener" target="_blank" href="https://github.com/rahims/Pandora-to-Rdio">my Github page</a>. If you end up using it to build something cool, let me know all about it, <a rel="noopener" target="_blank" href="https://twitter.com/rahims">@rahims</a> or <a href="mailto:rahim@sonawalla.org">rahim@sonawalla.org</a>!</p>
Dissecting the Sonos Controller
Rahim Sonawalla
2012-04-29T00:00:00+00:00
2012-04-29T00:00:00+00:00
https://www.hirahim.com/posts/dissecting-the-sonos-controller/
<p>I’ve been eyeing the <a rel="noopener" target="_blank" href="http://www.sonos.com/system/">Sonos devices</a> for some time now, but never found a chance to sit down and play with them. For the folks that have never heard of Sonos, they make high-quality wireless speakers that you can set up in your home to stream music through. The interesting thing about them is that they’re fairly easy to configure—just plug in power and connect to your home network—and they play well with each other. Adding multiple speakers is as easy as plugging them in and joining them to your home network. The devices automatically find each other, and from there you can easily arrange them however you like (have all the speakers play the same song, have one song playing in one room and a different song in another, and so on). The downside is that they’re quite expensive (the cheapest speaker costs ~$300), which made it hard for me to justify getting one. However, that all changed recently. I was in Australia for <a rel="noopener" target="_blank" href="http://sydney.musichackday.org/2012">Music Hack Day Sydney</a> this past weekend and, as a sponsor of the event, Sonos was there with a bunch of Play:5 speakers!</p>
<p>Realizing that this was the perfect opportunity to finally tinker with the Sonos players, I started digging through their <a rel="noopener" target="_blank" href="http://musicpartners.sonos.com/?q=node/21">developer documentation</a>. What I quickly found out, though, was that while there were plenty of docs (and sample code) for creating applications that stream music <em>through</em> the speakers, there was no documentation on how one could control the speakers themselves. After a bit of reading, I learned that the Sonos devices communicate with each other over <a rel="noopener" target="_blank" href="https://en.wikipedia.org/wiki/Universal_Plug_and_Play">UPnP</a>, so controlling the speakers would simply be a matter of working with that protocol. Unfortunately, that was easier said than done. Despite being an open protocol, UPnP is lacking in easy-to-understand documentation and easy-to-use libraries. To be fair, there seem to be good UPnP libraries in C, but I didn’t want to dust off my ancient C programming skills for a weekend hackathon. I was able to find two Python libraries, <a rel="noopener" target="_blank" href="http://brisa.garage.maemo.org/">BRisa</a> and <a rel="noopener" target="_blank" href="https://github.com/henkelis/sonospy">sonospy</a>, but neither worked when I gave them a go. It was time to get creative.</p>
<h2 id="peaking-under-the-hood">Peaking Under the Hood</h2>
<p>I had a working Sonos controller on my computer (the official Sonos application); I just needed to figure out how it did its magic. Enter <a rel="noopener" target="_blank" href="http://www.wireshark.org/">Wireshark</a>, the revealer of secrets. After a few packet sniffing sessions, I had what I needed. Sonos devices can be controlled by sending them SOAP messages. For example, to pause the currently playing track, you would make the following HTTP POST request:</p>
<pre style="background-color:#2b2c2f;color:#cccece;"><code><span>POST http://[Sonos speaker’s IP address]:1400/MediaRenderer/AVTransport/Control
</span><span>
</span><span>POST Headers:
</span><span>Content-type: text/xml
</span><span>SOAPACTION: "urn:schemas-upnp-org:service:AVTransport:1#Pause"
</span><span>
</span><span>POST Body:
</span><span><s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/" s:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"><s:Body><u:Pause xmlns:u="urn:schemas-upnp-org:service:AVTransport:1"><InstanceID>0</InstanceID><Speed>1</Speed></u:Pause></s:Body></s:Envelope>
</span></code></pre>
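<p>That raw capture translates to just a few lines of Python. Here’s a rough sketch using the <code>requests</code> library (the IP address is a placeholder, and the envelope is the same one from the capture above):</p>
<pre data-lang="python" style="background-color:#2b2c2f;color:#cccece;" class="language-python "><code class="language-python" data-lang="python">import requests

SPEAKER_IP = '10.0.0.102'  # placeholder: use your own speaker's IP

# The same SOAP envelope captured by Wireshark above.
body = ('<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/" '
        's:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">'
        '<s:Body><u:Pause xmlns:u="urn:schemas-upnp-org:service:AVTransport:1">'
        '<InstanceID>0</InstanceID><Speed>1</Speed></u:Pause></s:Body></s:Envelope>')

response = requests.post(
    f'http://{SPEAKER_IP}:1400/MediaRenderer/AVTransport/Control',
    data=body,
    headers={
        'Content-Type': 'text/xml',
        'SOAPACTION': '"urn:schemas-upnp-org:service:AVTransport:1#Pause"',
    })
print(response.status_code)  # 200 means the speaker paused
</code></pre>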
<p>Finding out the Sonos speaker’s IP address is done programmatically through the <a rel="noopener" target="_blank" href="https://en.wikipedia.org/wiki/Simple_Service_Discovery_Protocol">Simple Service Discovery Protocol</a> (SSDP). I wasn’t able to write the code to do dynamic discovery over the weekend, so I took the easier route. If you have the official Sonos application installed, you can get a report of all the speakers’ IPs by clicking on “About My Sonos System” in the Sonos menu.</p>
<p><a href="images/Menubar.png"><img src="https://www.hirahim.com/posts/dissecting-the-sonos-controller/images/Menubar.png" alt="About Sonos Menubar" /></a></p>
<p>It’ll give you an output similar to the one below.</p>
<pre style="background-color:#2b2c2f;color:#cccece;"><code><span>Associated ZP: 10.0.0.103
</span><span>———————————
</span><span>BRIDGE: BRIDGE
</span><span>Serial Number: 00-0E-58-4F-87-AC:F
</span><span>Version: 3.7 (build 17551200)
</span><span>Hardware Version: 1.5.0.0-2
</span><span>IP Address: 10.0.0.100
</span><span>———————————
</span><span>PLAY:5: HACK CHILL ROOM1
</span><span>Serial Number: 00-0E-58-5D-15-32:E
</span><span>Version: 3.7 (build 17551200e)
</span><span>Hardware Version: 1.16.4.1-2
</span><span>IP Address: 10.0.0.102
</span><span>OTP: 1.1.1(1-16-4-zp5s-0.5)
</span><span>———————————
</span></code></pre>
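<p>For completeness, the dynamic discovery I skipped is only a handful of lines: you multicast an SSDP M-SEARCH request and collect the responses. Here’s a rough sketch (the ZonePlayer search target is an assumption about how Sonos devices advertise themselves):</p>
<pre data-lang="python" style="background-color:#2b2c2f;color:#cccece;" class="language-python "><code class="language-python" data-lang="python">import socket

# SSDP discovery: multicast an M-SEARCH request and listen for replies.
# The search target (ST) is an assumption; 'ssdp:all' also works, but it
# returns every UPnP device on the network, not just Sonos players.
MSEARCH = '\r\n'.join([
    'M-SEARCH * HTTP/1.1',
    'HOST: 239.255.255.250:1900',
    'MAN: "ssdp:discover"',
    'MX: 1',
    'ST: urn:schemas-upnp-org:device:ZonePlayer:1',
    '', ''])

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.settimeout(2.0)
sock.sendto(MSEARCH.encode('ascii'), ('239.255.255.250', 1900))

try:
    while True:
        data, (address, port) = sock.recvfrom(1024)
        print(f'Possible Sonos device at {address}')
except socket.timeout:
    pass  # no more responses within the timeout
</code></pre>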
<p>I was able to implement the other basic controller functionality, like play, stop, next track, and previous track, as well as getting information about the currently playing track, by sending similar SOAP messages. A few hours into the weekend, I was able to control the hackathon’s Sonos speakers through my terminal! (This got pretty annoying for the other hackers at the event since those were the same speakers that were playing the music for the weekend.)</p>
<h2 id="spicing-things-up">Spicing Things Up</h2>
<p>A terminal prompt makes for a lousy hackathon demo, so I decided to code up some simple applications on top of my base Sonos controller library. The first was a web app version of the Sonos controller. To spice things up, I used the <a rel="noopener" target="_blank" href="http://developer.rovicorp.com/">Rovi API</a> to bring in high-quality cover art and editorial album reviews. I also used the <a rel="noopener" target="_blank" href="http://developer.rdio.com/">Rdio API</a> to add in a “favorite” button. Push it and the currently playing track gets added to an Rdio playlist.</p>
<p><a href="images/soco.png"><img src="https://www.hirahim.com/posts/dissecting-the-sonos-controller/images/soco.png" alt="Screenshot of SoCo web app" /></a></p>
<p>For my second example application, I used the <a rel="noopener" target="_blank" href="http://www.twilio.com/">Twilio API</a> to let me control the Sonos speaker through SMS. This was particularly cool because the official Sonos application for the iPhone only works when you’re on your home network. With SMS, I could control the speakers even when I was out. Ran out of the house, but forgot to turn off your music? Just send your speaker a text message.</p>
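<p>The gist of the Twilio side is a small webhook. Here’s a rough Flask sketch of the idea (not my actual hack code; the <code>send_soap_action</code> helper is hypothetical, standing in for SOAP requests like the pause example above, and Twilio delivers the incoming message text in the <code>Body</code> form field):</p>
<pre data-lang="python" style="background-color:#2b2c2f;color:#cccece;" class="language-python "><code class="language-python" data-lang="python">from flask import Flask, request

app = Flask(__name__)

def send_soap_action(action):
    # Hypothetical helper: sends a SOAP message like the pause request above.
    ...

# Configure this URL as the SMS webhook for your Twilio number. Twilio POSTs
# the incoming message text in the 'Body' form field.
@app.route('/sms', methods=['POST'])
def sms():
    command = request.form.get('Body', '').strip().lower()
    if command in ('play', 'pause'):
        send_soap_action(command.capitalize())
    # Respond with empty TwiML so no reply SMS is sent.
    return '<Response></Response>', 200, {'Content-Type': 'text/xml'}
</code></pre>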
<h2 id="the-payoff">The Payoff</h2>
<p>All three demos went over well with the audience at the hackathon during my presentation on Sunday night, and I ended up winning a Sonos Play:3 and Bridge of my very own!</p>
<p><a href="images/sonos.png"><img src="https://www.hirahim.com/posts/dissecting-the-sonos-controller/images/sonos.png" alt="Sonos Play:3 and Bridge" /></a></p>
<h2 id="the-real-payoff">The Real Payoff</h2>
<p>While winning some hardware of my own is awesome, the real win for me is the potential for great things in the near future. All of my code for the weekend and the sample applications is available on <a rel="noopener" target="_blank" href="https://github.com/rahims/SoCo">my Github page</a>. With the basic groundwork laid, I hope developers at future hackdays will be able to create all kinds of interesting takes on controlling Sonos devices. Bring on the Kinect and OpenCV hacks! If you build something cool, please let me know about it through <a href="mailto:rahim@sonawalla.org">e-mail</a> or <a rel="noopener" target="_blank" href="http://www.twitter.com/rahims">Twitter</a>.</p>
How to Renew an Expired Passport in Nine Hours
Rahim Sonawalla
2011-08-03T00:00:00+00:00
2011-08-03T00:00:00+00:00
https://www.hirahim.com/posts/how-to-renew-an-expired-passport-in-nine-hours/
<p><strong>Update (7/5/17):</strong> Almost 6 years later, and this post is still going strong! All the information below is still accurate.</p>
<p><strong>Update (3/17/14):</strong> I wrote this post a few years ago, so please take that with a grain of salt. That being said, based on the happy user comments, it looks like nothing of the process I describe below has changed–everything is still accurate and should work for you.</p>
<p>It was the day before <a rel="noopener" target="_blank" href="http://www.startupfestival.com/">Startup Festival</a> and I was just a few hours away from hopping on a red eye to Montreal. I went to United’s website to check in and was prompted to enter my passport number. I pulled out my passport and took a look at the information: expired July 8th, 2011. It was July 12th—shit. My passport had expired four days ago and I had a flight to catch.</p>
<p>I called up United and, as I suspected, they informed me that I wouldn’t be able to fly on an expired passport. My only option was to cancel the flight, scramble to renew my passport in a day, and book a new flight for the next day. This is how I did it.</p>
<p>Disclaimer: I did this in Los Angeles. The process should be the same elsewhere in the nation, but your mileage may vary. Also, I cut this close; there’s no guarantee that you’ll be able to renew your passport the same day, but it’s worth a shot.</p>
<ul>
<li><strong>Make an appointment.</strong> Call up the <a rel="noopener" target="_blank" href="http://travel.state.gov/passport/">US Passport Office</a> and use their automated phone system to make an appointment. When I called, they gave me a date late into next week. That wouldn’t work for me, so I decided to hang up and show up the next day, sans appointment. Following the appointment process is the recommended way, but I was desperate.</li>
<li><strong>Get your photo taken.</strong> An expired passport means an outdated passport photo. In order to renew your passport, you’ll need new photos. Call around to the local CVS and Walgreens. I was able to find a nearby Walgreens that had a photo department that was open until 10 PM. I went over and had two photos made. It cost me around $10. If you can’t find anything, you can go to the photo place near the <a rel="noopener" target="_blank" href="http://maps.google.com/maps/place?q=Federal+Building+11000+Wilshire+Blvd.,+Suite+1000+Los+Angeles,+CA+90024-3602&hl=en&cid=3988301706415377649">Federal Building’s</a> cafeteria the next morning. They open at 7 AM and charge $14 for two photos. (Avoid this since it would cause you to lose your place in line.)</li>
<li><strong>Fill out the renewal form.</strong> Go to the US Passport Office’s website and download the <a rel="noopener" target="_blank" href="http://travel.state.gov/passport/forms/ds82/ds82_843.html">DS-82 form</a>. Print it out and complete it ahead of time so that you’re not scrambling at the last minute. Don’t staple your photos to it; they’ll take care of that at the office. When filling out the form, you’ll want the passport book, not the passport card. (I accidentally checked “both” and had to change it later.)</li>
<li><strong>Print out your itinerary.</strong> In order to renew your passport at the last minute, you need to be able to prove that it’s urgent. Print out the itinerary that shows you’re flying soon. (I attempted to use my hotel reservation, but they wanted to see a flight itinerary.)</li>
<li><strong>The next morning</strong>, bring with you: a copy of your itinerary, your DS-82, your expired passport, and your wallet.</li>
<li><strong>Line up early.</strong> At the time of this writing, the Los Angeles US Passport Office opens at 7 AM. (Check the website for your location’s hours.) I woke up at 5 AM and was at the office by 5:30 AM. There were three people already lined up in front of me. Get there early. If you don’t have an appointment, you’ll need to line up at the “Will Call” windows. You should be able to see them when you arrive at the <a rel="noopener" target="_blank" href="http://maps.google.com/maps/place?q=Federal+Building+11000+Wilshire+Blvd.,+Suite+1000+Los+Angeles,+CA+90024-3602&hl=en&cid=3988301706415377649">Federal Building</a>. They’re labeled “Will Call A” and “Will Call B”. There will be two lines, one at will call for people that don’t have appointments, and one at the door to the passport office for those that do have appointments.</li>
<li><strong>Make your case.</strong> At around 7 AM the window will open and they’ll take people one at a time. Be polite, show your itinerary, and explain your situation. “My passport expired and I need to fly out tonight. I don’t have an appointment, is there anything you can do?” The person will give you a ticket with a number on it and tell you to go wait in the line with people that have appointments. That ticket is essentially your appointment.</li>
<li><strong>Wait.</strong> You’ll slowly make your way through the appointment line to security. The security check is just like the one you’d find at an airport. Turn off your cell phones (this is required) and leave as much as you can in your car (don’t bring a laptop or anything more than your paperwork and old passport). Once in the building you’ll head to the Passport office and stand in another line. Eventually you’ll get to the window where the clerk will look your paperwork over and give you another ticket. You’ll then sit in the waiting area and wait for your number to be called. Cell phones and laptops are not allowed, so chat up the people next to you or bring a book—the analog kind.</li>
<li><strong>Pay the fee.</strong> Eventually, your number will be called and you’ll go up to a window. (By now it was around 9 AM.) The clerk will look over your paperwork once more and you’ll pay $110 to renew the passport and $60 to expedite the process, so $170 total. (These were the prices at the time of writing; they may increase. They accept credit cards.) If everything goes well, they’ll give you a receipt with a time to come back to pick up your passport. Typically, it’s between 1 PM and 3 PM that same day. If you got this far, there’s a very good chance you’ll be getting your passport.</li>
<li><strong>Wait some more.</strong> Go grab lunch at In-n-Out and come back to the Federal Building around 2 PM. (The receipt says to show up between 1 PM and 3 PM, but they never have it ready that early.) Stand in the will call line and give the clerk your receipt. They’ll either have your passport ready, or ask you to wait a bit longer. I ended up getting mine at 3 PM, right when the Passport office closes.</li>
</ul>
<p>There you have it, a passport in nine hours. Now, never do this again. Be prudent and renew by mail well in advance next time. Happy travels!</p>
Harmony 1.0 now available
Rahim Sonawalla
2008-07-05T00:00:00+00:00
2008-07-05T00:00:00+00:00
https://www.hirahim.com/posts/harmony-10-now-available/
<p>I submitted my entry for the <a rel="noopener" target="_blank" href="http://labs.mozilla.com/contests/extendfirefox3/">Extend Firefox 3</a> contest last night. (Awesome way to spend the 4th of July, I know.) I’m calling it <a rel="noopener" target="_blank" href="https://addons.mozilla.org/en-US/firefox/addon/8012">Harmony</a>, mostly because I was short on time and couldn’t think of a better name. My goal was to create a lite version of the <a rel="noopener" target="_blank" href="http://www.last.fm/">Last.fm</a> client that specialized in web media, but I had to cut a lot of features to make sure it got done in time for the contest deadline, so this first release isn’t really like I imagined it would be when I started. (The contest began on March 17th; I found out about it two weeks before the July 4th deadline.)</p>
<p>In this release, you can look up information about artists by selecting the artist’s name, right clicking, and hitting “look up artist.” Also, if you listen to music on the <a rel="noopener" target="_blank" href="http://www.pandora.com/">Pandora</a> web site, or using <a rel="noopener" target="_blank" href="https://addons.mozilla.org/en-US/firefox/addon/219">FoxyTunes</a>, you’ll get real-time information about the song; just click the Harmony icon in the browser’s status bar to show the panel.</p>
<p>If you’ve got a Last.fm account, you can scrobble the music you listen to on the Pandora web site, or in FoxyTunes; just go to preferences (either by clicking on the “tools” menu in Firefox, then clicking “add-ons,” selecting “harmony” from the list, and clicking on the “options” button, or by clicking on the Harmony menu icon on the top left of the Harmony panel and clicking preferences) and enter your Last.fm account information. You’ll probably need to restart the browser to get it to integrate with Pandora and FoxyTunes properly.</p>
<p>In the next release, I’d like to add support for displaying album information, and caching listening information if it can’t be scrobbled when the song is played.</p>
<p>If you’re interested in playing around with it, you can download it <a rel="noopener" target="_blank" href="https://addons.mozilla.org/en-US/firefox/addon/8012">here</a>. I’m going to be using <a rel="noopener" target="_blank" href="http://getsatisfaction.com/hirahim/products/hirahim_harmony">Get Satisfaction</a> for handling support, so drop your bug reports and feature requests there. Here’s a video to get you started.</p>
<div >
<iframe src="//player.vimeo.com/video/1282974" webkitallowfullscreen mozallowfullscreen allowfullscreen></iframe>
</div>
<p>That’s it for now. I’m off to enjoy the rest of the Independence Day weekend by paying $10.50 to watch <a rel="noopener" target="_blank" href="http://www.imdb.com/title/tt0448157/">Hancock</a> :/</p>