Mon 10 February 2014

Deep Learning in Python with Pylearn2 and Amazon EC2

In working on a Kaggle competition, I wanted to start working with neural networks. There are a few libraries for this, but I decided to use Pylearn2. It's built on Theano, which compiles your models into code that can run on the GPU. My only problem is that, while I have a beefy 2011 MacBook Pro, it has an AMD card - no CUDA. Well, let's worry about that later.

After checking out the GitHub repo, I moved to DIRECTORY and started working through the softmax_regression tutorial. I've seen people complain that the documentation for Pylearn2 is mediocre at best. I've actually found the tutorials, at least, to be fairly decent thus far - so maybe that situation has improved. I'm still trying to get used to the API - it's a good deal more complicated than scikit-learn's, for example.

The softmax_regression tutorial trains a softmax regressor (a multi-class logistic regressor) on the well-known MNIST dataset. It's straightforward and runs just fine on my MacBook Pro's CPU. Then I moved on to the multilayer_perceptron tutorial. This tutorial actually warns you that you'll probably want a GPU.

I set up the multilayer_perceptron model based on the tutorial. And waited... and waited... and then went to make some lunch. When I got back and saw it still running, I started looking at what Amazon EC2 instances I could set up. There are two GPU instance types: g2.2xlarge and cg1.4xlarge. g2 instances are the current generation, have a GK104-based GRID GPU, cost about a quarter as much, and are usually used for "Game streaming, 3D application streaming, and other server-side graphics workloads". cg1 instances have more CPU compute units and a more powerful Tesla M2050 "Fermi" GPU, are more expensive, and are often used for "Computational chemistry, rendering, financial modeling, and engineering design".

Let's see how they stack up. I spun up one of each instance type using the Ubuntu 12.04 HVM image. Two notes:

  1. The HVM image is a bit further down the list than the initially visible paravirtual Ubuntu option, which cannot be used with GPU compute instances.
  2. The older cg1 instances are available in US East but specifically not in US West.

After some debugging, I came up with the following shell script that sets up the entire environment.
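
In outline, the script does something like this - a sketch, not the exact file: the CUDA repository package name/version, the apt package list, and the GitHub install steps are assumptions that may need adjusting.

```bash
#!/bin/bash
# Sketch of the EC2 setup, assuming Ubuntu 12.04 and CUDA 5.5.
# The CUDA repo package below is an assumption and may need updating.
set -e

# Build tools and the Python scientific stack
sudo apt-get update
sudo apt-get install -y git python-pip python-dev python-numpy python-scipy \
    python-yaml libblas-dev liblapack-dev gfortran
sudo pip install ipython

# CUDA toolkit (provides nvcc) from NVIDIA's Ubuntu 12.04 repository
wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1204/x86_64/cuda-repo-ubuntu1204_5.5-0_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1204_5.5-0_amd64.deb
sudo apt-get update
sudo apt-get install -y cuda
echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc

# Theano and Pylearn2 from GitHub
sudo pip install --upgrade git+https://github.com/Theano/Theano.git
git clone https://github.com/lisa-lab/pylearn2.git ~/pylearn2
(cd ~/pylearn2 && sudo python setup.py develop)

# Tell Theano to use the GPU with 32-bit floats
cat > ~/.theanorc <<EOF
[global]
device = gpu
floatX = float32
EOF

# MNIST into ~/data/mnist, where Pylearn2 will look for it
mkdir -p ~/data/mnist && cd ~/data/mnist
for f in train-images-idx3-ubyte train-labels-idx1-ubyte \
         t10k-images-idx3-ubyte t10k-labels-idx1-ubyte; do
    wget http://yann.lecun.com/exdb/mnist/${f}.gz && gunzip ${f}.gz
done
echo 'export PYLEARN2_DATA_PATH=~/data' >> ~/.bashrc
```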

This sets up several things:

  1. IPython
  2. the MNIST data set in ~/data/mnist
  3. the nvcc CUDA compiler, Pylearn2, and Theano

That does the system-wide setup. Next, I ran time python work.py, where work.py is the following file.
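
In outline, work.py looks something like this - the first model from the multilayer_perceptron tutorial translated from YAML into Python. The hyperparameters mirror the tutorial's YAML; treat the exact values here as illustrative.

```python
# Sketch of work.py: the MLP tutorial's sigmoid + softmax model in Python.
from pylearn2.datasets.mnist import MNIST
from pylearn2.models import mlp
from pylearn2.training_algorithms.sgd import SGD
from pylearn2.termination_criteria import MonitorBased
from pylearn2.train import Train
from pylearn2.train_extensions.best_params import MonitorBasedSaveBest

# MNIST splits, read from $PYLEARN2_DATA_PATH/mnist
train = MNIST(which_set='train', one_hot=True, start=0, stop=50000)
valid = MNIST(which_set='train', one_hot=True, start=50000, stop=60000)
test = MNIST(which_set='test', one_hot=True)

# One sigmoid hidden layer followed by a softmax output layer
model = mlp.MLP(
    nvis=784,
    layers=[
        mlp.Sigmoid(layer_name='h0', dim=500, sparse_init=15),
        mlp.Softmax(layer_name='y', n_classes=10, irange=0.),
    ],
)

# Plain SGD; stop once validation misclassification stops improving
algorithm = SGD(
    batch_size=100,
    learning_rate=.01,
    monitoring_dataset={'train': train, 'valid': valid, 'test': test},
    termination_criterion=MonitorBased(
        channel_name='valid_y_misclass', prop_decrease=0., N=10),
)

trainer = Train(
    dataset=train,
    model=model,
    algorithm=algorithm,
    extensions=[MonitorBasedSaveBest(channel_name='valid_y_misclass',
                                     save_path='mlp_best.pkl')],
)
trainer.main_loop()
```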

I converted the tutorials from YAML to Python just because I prefer to understand what I'm doing in the context of Python rather than YAML - there's a pretty simple 1-to-1 conversion between them. Since the convergence conditions are the same, all three machines achieve the same 0.9813 accuracy. However, the times are dramatically different.

Computer      Time          Time (s)
MacBook Pro   118m49.801s   7129.801
g2.2xlarge    28m14.577s    1694.577
cg1.4xlarge   19m3.275s     1143.275

Looking at the numbers, a few conclusions:

  1. Don't try to use the CPU for tasks that are meant for the GPU. It's just really slow.
  2. The g2 instance is about 50% slower than the cg1 instance. On the other hand, it's about 28% of the price per hour. Per unit of work, then, g2 comes out at roughly half the price: a run takes about one and a half times as long, but each hour costs roughly a quarter as much (a quick back-of-the-envelope calculation follows below). That said, for a small to medium sized job it might be worth paying the cg1 premium for the faster turnaround.
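
As a sanity check on that claim, here's the arithmetic, using the run times from the table above and the ~28% hourly price ratio quoted earlier (the ratio, not the absolute dollar amounts, is all that matters here):

```python
# Back-of-the-envelope price/performance comparison: g2.2xlarge vs. cg1.4xlarge.
g2_seconds = 1694.577
cg1_seconds = 1143.275
g2_price_ratio = 0.28  # g2 hourly price as a fraction of cg1's (figure quoted above)

relative_time = g2_seconds / cg1_seconds        # ~1.48: g2 takes ~48% longer
relative_cost = relative_time * g2_price_ratio  # ~0.41: a g2 run costs ~41% as much

print("g2 takes %.0f%% longer per run" % ((relative_time - 1) * 100))
print("but costs only %.0f%% as much as the same run on cg1" % (relative_cost * 100))
```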

If anyone has a Retina MacBook Pro - those have some fairly beefy GPUs in them - I'd be very curious to see how it stacks up.

