5 Minutes of Machine Learning: Manipulating Tensors [Day 6]

Now that we know what TensorFlow is, what a tensor is, and how it generally behaves, let’s talk about manipulating tensors. If you are utterly confused about how and why you got here, please start on Day 1 of the 5 Minutes of Machine Learning series, or hop to the last post if you just want to cut to the chase and learn about TensorFlow for a bit.

At this point, do yourself a favor and knock out this quick notebook of exercises for messing around with TensorFlow… it will make the rest of this conversation much easier to digest (:

In a brief review, let’s redefine what a tensor is… I also just want an excuse to post a few of my horrible drawings again:

Remember this? Yaaas…

So just in review, tensors can be scalar values, vectors, matrices, or higher-dimensional arrays. Today, we will be focusing on matrices when it comes to tensor manipulation, per Google’s Machine Learning Crash Course, which is what this series is based on as I teach myself machine learning (they said it would be easy…).

Recall how I compared a graph with nodes and edges (edges being tensors) to waves on a beach. Here is another horrible drawing as a refresher:

Think of nodes as values that change each time a wave of tensors ripples over them.

We will continue to work with this concept today as we talk about changing matrix-based tensors, specifically with multiplication, since the Google TensorFlow gods in the ML Crash Course claim this is the most common basic tensor manipulation operation to start with.

Let’s pull back a second before we dive into tensor manipulation. Why would we want to do this as machine learning developers? Well, we have to manipulate tensors to tweak how our model trains and how our algorithms perform in training, so that we get the best model results. While you might not do some of this work manually yourself, it really puts some of the Estimator functions in perspective if you like to understand what is going on under the hood, and copying and pasting code from the TensorFlow website without understanding what it is doing isn’t really your thing (I am guilty of this sometimes…).

meep.

Time travel back to middle school algebra for a second…

mmhmm.

For matrices, a tensor’s shape is the same thing as the matrix’s size (or dimensions).

So if I say I have a 2 x 3 matrix, the matrix is 2 rows by 3 columns as you can see in the meme above (first matrix in the upper left hand corner).

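Traditionally in mathematics, element-wise operations such as addition can only be performed on matrices of the same shape, while matrix multiplication needs the number of columns in the first matrix to match the number of rows in the second. Check this site out for general rules on how to multiply matrices before you continue.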

… So you did that, right? Lookie here at this part:

Source here.
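If you want to poke at the matrix multiplication shape rule yourself, here is a minimal plain-Python sketch (no TensorFlow needed). The helper name `matmul` and the sample matrices are made up for illustration; the only thing it assumes is the standard rule that the columns of the first matrix must match the rows of the second.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows (hypothetical helper)."""
    rows_a, cols_a = len(a), len(a[0])
    rows_b, cols_b = len(b), len(b[0])
    # The inner dimensions must agree: (rows_a x cols_a) times (rows_b x cols_b).
    if cols_a != rows_b:
        raise ValueError(
            f"cannot multiply {rows_a}x{cols_a} by {rows_b}x{cols_b}"
        )
    # Each output cell is the dot product of a row of `a` and a column of `b`.
    return [
        [sum(a[i][k] * b[k][j] for k in range(cols_a)) for j in range(cols_b)]
        for i in range(rows_a)
    ]


a = [[1, 2, 3],
     [4, 5, 6]]          # 2 x 3
b = [[7, 8],
     [9, 10],
     [11, 12]]           # 3 x 2

product = matmul(a, b)   # 2 x 2 result
print(product)           # [[58, 64], [139, 154]]
```

Trying `matmul(a, a)` raises a `ValueError`, because a 2 x 3 matrix cannot be multiplied by another 2 x 3 matrix.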

Well guess what? In TensorFlow, the same-shape rule for element-wise operations goes out of the window… with something called broadcasting. This NumPy documentation page outlines broadcasting simply and really, really well. Go on a tangent here and scan this page before continuing.

Broadcasting

If you’re lazy or time constrained (this thing is titled 5 minutes, after all…), check out the examples of broadcasting below:

A      (2d array):  5 x 4
B      (1d array):      1
Result (2d array):  5 x 4

A      (2d array):  5 x 4
B      (1d array):      4
Result (2d array):  5 x 4

A      (3d array):  15 x 3 x 5
B      (3d array):  15 x 1 x 5
Result (3d array):  15 x 3 x 5

A      (3d array):  15 x 3 x 5
B      (2d array):       3 x 5
Result (3d array):  15 x 3 x 5

A      (3d array):  15 x 3 x 5
B      (2d array):       3 x 1
Result (3d array):  15 x 3 x 5

So the general rule of thumb for manipulating matrix tensors is NOT that they have to be the same shape. Instead, line the shapes up from the right (the trailing dimension) and compare them pair by pair: each pair “matches” if the two numbers are equal or if one of them is 1. A missing leading dimension counts as 1.

Because I usually have to draw things out myself, check out my interpretation of the code snippet above:

Yes, I used instagram stories to do this…
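If drawings aren’t your thing, here is roughly the same idea as a plain-Python sketch. It only covers the two simplest cases from the list (a 2d array plus a 1d array of length 1 or of matching width), the helper name `add_with_broadcast` is made up, and I used a 3 x 4 matrix instead of 5 x 4 to keep it short. Real broadcasting in NumPy or TensorFlow is more general than this.

```python
def add_with_broadcast(matrix, vector):
    """Add a 1-D `vector` to every row of a 2-D `matrix` (hypothetical helper)."""
    width = len(matrix[0])
    if len(vector) == 1:
        # Shape "1": stretch the single value across all columns.
        vector = vector * width
    if len(vector) != width:
        raise ValueError(
            f"cannot broadcast length {len(vector)} against {width} columns"
        )
    # Shape "4" against "3 x 4": reuse the same vector for every row.
    return [[x + y for x, y in zip(row, vector)] for row in matrix]


a = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]                           # 3 x 4

print(add_with_broadcast(a, [100]))             # B has shape 1
print(add_with_broadcast(a, [10, 20, 30, 40]))  # B has shape 4
```

In both calls the result keeps the shape of `a`, because the smaller operand gets stretched to fit.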

Here are examples of tensor shapes that would NOT broadcast:

A      (1d array):  3
B      (1d array):  4 # trailing dimensions do not match

A      (2d array):      2 x 1
B      (3d array):  8 x 4 x 3 # second from last dimensions mismatched
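To check shapes like the ones above without memorizing them, here is a small sketch of the broadcasting rule as the NumPy docs describe it: compare the shapes from the trailing dimension backwards, and treat two sizes as compatible when they are equal or one of them is 1. The function name `broadcast_shape` is my own; it only looks at shapes, not at actual data.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError (hypothetical helper)."""
    result = []
    # Walk both shapes from the trailing dimension; a missing leading
    # dimension is treated as size 1.
    for dim_a, dim_b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"shapes {shape_a} and {shape_b} do not broadcast")
    return tuple(reversed(result))


# Shapes that broadcast.
print(broadcast_shape((5, 4), (1,)))         # (5, 4)
print(broadcast_shape((15, 3, 5), (3, 1)))   # (15, 3, 5)

# Shapes that do not broadcast.
for bad_a, bad_b in [((3,), (4,)), ((2, 1), (8, 4, 3))]:
    try:
        broadcast_shape(bad_a, bad_b)
    except ValueError as err:
        print(err)
```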

If you have some extra time on your hands (either now, or sometime shortly after you read this) I would highly recommend knocking out this notebook of exercises, applying everything you just learned!

So, great, thanks to broadcasting we can run element-wise operations (like multiplication) on MORE combinations of tensor shapes now.

So how does this work while coding with TensorFlow? We will use tf.reshape to change the shape of tensors when performing operations to train a model. If you want to run through this example below yourself, I’ll point you back to this notebook to actually run it:

import tensorflow as tf

with tf.Graph().as_default():
    # Create an 8x2 matrix (2-D tensor).
    matrix = tf.constant([[1, 2], [3, 4], [5, 6], [7, 8],
                          [9, 10], [11, 12], [13, 14], [15, 16]], dtype=tf.int32)

    # Reshape the 8x2 matrix into a 2x8 matrix.
    reshaped_2x8_matrix = tf.reshape(matrix, [2, 8])

    # Reshape the 8x2 matrix into a 4x4 matrix.
    reshaped_4x4_matrix = tf.reshape(matrix, [4, 4])

So what happens here? To end this (longer…) 5 minutes series, one of my elegant drawings:

A tensor can be reshaped however you need to reshape it to perform multiplication, as long as the total number of elements stays the same
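To make the drawing a bit more concrete, here is a plain-Python sketch of what reshaping does to the 8x2 matrix from the example. It is not how TensorFlow implements it; the helper `reshape` is made up, and it just mimics the row-major behavior of tf.reshape: read the values left to right, top to bottom, then refill them into the new shape in the same order.

```python
def reshape(matrix, rows, cols):
    """Reshape a list-of-rows matrix in row-major order (hypothetical helper)."""
    # Flatten: read values left to right, top to bottom.
    flat = [value for row in matrix for value in row]
    # The new shape must hold exactly the same number of elements.
    if rows * cols != len(flat):
        raise ValueError(f"cannot reshape {len(flat)} elements into {rows}x{cols}")
    # Refill: cut the flat list into `rows` chunks of length `cols`.
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


matrix = [[1, 2], [3, 4], [5, 6], [7, 8],
          [9, 10], [11, 12], [13, 14], [15, 16]]   # 8 x 2

reshaped_2x8 = reshape(matrix, 2, 8)
reshaped_4x4 = reshape(matrix, 4, 4)

print(reshaped_2x8)
print(reshaped_4x4)
```

Asking for a shape like 3 x 5 raises a `ValueError`, since 16 elements cannot fill 15 slots.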

So I think you get it now! Stay tuned for the next post, which will cover initializing and assigning variables in TensorFlow!