In working on a Kaggle competition, I wanted to start working with neural networks. There are a few
libraries to do this, but I decided to use Pylearn2. It's built on Theano, which can compile your code
to run on the GPU. My only problem is that, while I have a beefy 2011 MacBook Pro, it has an AMD card -
no CUDA. Well, let's worry about that later.
After checking out the GitHub repo, I moved to DIRECTORY and started working through the
softmax_regression tutorial. I've seen people complain that the documentation for Pylearn2 is mediocre
at best. I've actually found the tutorials, at least, to be fairly decent thus far - so maybe that
situation has improved. I'm still trying to get used to the API - it's a good deal more complicated
than scikit-learn's, for example.
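For context on what that first tutorial trains: softmax regression is just multi-class logistic
regression. Here's a minimal NumPy sketch of the model - my own illustration, not Pylearn2 code, and
the toy data is made up:

```python
import numpy as np

def softmax(z):
    # Subtract the row max before exponentiating, for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def predict(X, W, b):
    # Class probabilities, shape (n_samples, n_classes).
    return softmax(X @ W + b)

def grad_step(X, y_onehot, W, b, lr=0.5):
    # One full-batch gradient step on the cross-entropy loss;
    # the gradient of the loss w.r.t. the logits is (p - y).
    p = predict(X, W, b)
    err = (p - y_onehot) / len(X)
    return W - lr * (X.T @ err), b - lr * err.sum(axis=0)

# Tiny sanity check: two linearly separable classes in 2-D.
X = np.array([[0., 0.], [0., 1.], [2., 0.], [2., 1.]])
y = np.eye(2)[[0, 0, 1, 1]]
W, b = np.zeros((2, 2)), np.zeros(2)
for _ in range(200):
    W, b = grad_step(X, y, W, b)
print(predict(X, W, b).argmax(axis=1))  # [0 0 1 1]
```

The real tutorial does the same thing at MNIST scale (784 inputs, 10 classes), with Theano handling
the gradients.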
The softmax_regression tutorial trains a softmax regressor (a multi-class logistic regressor) on the
well-known MNIST dataset. It's straightforward and runs just fine on my MacBook Pro's CPU. Then I
moved to the multilayer_perceptron tutorial. This tutorial actually warns you that you'll probably want a GPU.
I set up the multilayer_perceptron model based on the tutorial. And waited... and waited... and then went to
make some lunch. When I got back and saw it still running, I started looking at what Amazon EC2 instances I
could set up. There are two GPU instance types: g2.2xlarge and cg1.4xlarge. g2 instances are the current
generation, have a GK104-based GRID K520 GPU, are about a quarter the cost, and are usually used for "game
streaming, 3D application streaming, and other server-side graphics workloads". cg1 instances have more CPU
compute units, a more powerful Tesla M2050 "Fermi" GPU, are more expensive, and are often used for
"computational chemistry, rendering, financial modeling, and engineering design".
Let's see how they stack up. I spun up one of each instance type using the Ubuntu 12.04 HVM image. Two
notes: 1. The HVM version is a bit further down the list than the initially visible paravirtual Ubuntu
option, which cannot be used with GPU compute instances. 2. The old cg1 instances are available in US
East but, notably, not in US West.
After some debugging, I came up with the following shell script that sets up the entire environment.
This sets up several things:
1. The MNIST data set in ~/data/mnist
2. The nvcc CUDA compiler, Pylearn2, and Theano
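The script itself isn't reproduced here, so here's a hedged sketch of a setup script along those lines.
The package names, the lisa-lab/pylearn2 repo layout, and the data path convention are my assumptions
for an Ubuntu 12.04-era instance - treat this as a starting point, not the exact script:

```shell
#!/bin/bash
# Hedged reconstruction, not the original script from this post.
set -e

# CUDA toolchain (nvcc) plus Python build dependencies; the
# nvidia-cuda-toolkit package name is an assumption for this Ubuntu release.
sudo apt-get update
sudo apt-get install -y nvidia-cuda-toolkit git python-pip python-numpy python-scipy

# Theano, and Pylearn2 from its GitHub repo (as in the post).
sudo pip install Theano
git clone https://github.com/lisa-lab/pylearn2.git
(cd pylearn2 && sudo python setup.py develop)

# Tell Theano to target the GPU with float32.
printf '[global]\ndevice = gpu\nfloatX = float32\n' > ~/.theanorc

# MNIST in ~/data/mnist, where Pylearn2 finds it via PYLEARN2_DATA_PATH.
mkdir -p ~/data/mnist && cd ~/data/mnist
for f in train-images-idx3-ubyte train-labels-idx1-ubyte \
         t10k-images-idx3-ubyte t10k-labels-idx1-ubyte; do
    wget -q http://yann.lecun.com/exdb/mnist/$f.gz && gunzip -f $f.gz
done
echo 'export PYLEARN2_DATA_PATH=$HOME/data' >> ~/.bashrc
```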
That does the system-wide setup. Next, I ran time python work.py, where work.py is the following.
I converted the tutorials from YAML to Python just because I prefer to understand what I'm doing in the
context of Python rather than YAML - there's a pretty simple one-to-one conversion between them. Since the
convergence conditions are the same, all machines achieve 0.9813 accuracy. However, the times are quite different.
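Since the work.py listing isn't reproduced here, this is a hedged sketch of what the YAML-to-Python
conversion looks like, based on the structure of the multilayer_perceptron tutorial's YAML. Pylearn2 is
a Python-2-era library and I haven't re-run this, so treat the class and argument names as untested
assumptions:

```python
# Each !obj: node in the tutorial YAML becomes a constructor call
# with the same keyword arguments.
from pylearn2.datasets.mnist import MNIST
from pylearn2.models.mlp import MLP, Sigmoid, Softmax
from pylearn2.training_algorithms.sgd import SGD
from pylearn2.termination_criteria import MonitorBased
from pylearn2.train import Train

model = MLP(
    nvis=784,
    layers=[Sigmoid(layer_name='h0', dim=500, irange=.05),
            Softmax(layer_name='y', n_classes=10, irange=0.)],
)
algorithm = SGD(
    batch_size=100,
    learning_rate=.01,
    monitoring_dataset={'valid': MNIST(which_set='train', start=50000, stop=60000)},
    termination_criterion=MonitorBased(channel_name='valid_y_misclass'),
)
Train(
    dataset=MNIST(which_set='train', start=0, stop=50000),
    model=model,
    algorithm=algorithm,
).main_loop()
```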
Looking at the numbers, a few conclusions:
Don't try to use the CPU for tasks that are meant for the GPU. It's just really slow.
The g2 instances are about 50% slower than the cg1 instances. On the other hand, they're about 28% of
the price. Per unit of compute, then, g2 costs less than half as much: 1.5x the time at roughly 0.28x the
price works out to about 0.42x the cost per job. That being said, if I were running a small-to-medium-sized
job, it might be worth paying more for the cg1's time savings.
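That back-of-the-envelope cost comparison is simple enough to check in a couple of lines, using the two
ratios from above (the exact prices and timings aren't repeated here):

```python
# Ratios from the text: g2 takes ~1.5x the time of cg1, at ~0.28x the hourly price.
time_ratio = 1.5    # g2 wall-clock time / cg1 wall-clock time
price_ratio = 0.28  # g2 hourly price / cg1 hourly price

# Relative cost of finishing the same job on g2 instead of cg1:
# you pay the lower rate, but for proportionally longer.
cost_per_job = time_ratio * price_ratio
print("g2 cost per job vs cg1: %.2f" % cost_per_job)  # 0.42, i.e. less than half
```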
If anyone has a Retina MacBook Pro - those have some fairly beefy GPUs in them - I'd be very curious
to see how they stack up.