EC2 Setup

For forthcoming assignments, you may find it useful to use an Amazon EC2 instance. You can sign up for AWS Educate credits using this link:

https://aws.amazon.com/education/awseducate

Once you have your account, you can create, start, and stop machines from the AWS console. Before connecting to Amazon instances, you need an RSA public/private key pair (EC2 does not accept DSA keys).

The key pair consists of two files: <keyname> and <keyname>.pub. The first file is the private key. Be careful with it: don't share it, and make sure it is not readable by anyone else.

You can create key pairs in a Unix environment with:

ssh-keygen

This will ask you where to save the keys. Make sure the private key file is readable and writable by its owner only (mode 600).
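As a minimal sketch of the permission step, the helper below sets owner-only read/write access on a key file and prints the resulting permission string so you can check it. The function name restrict_key and the example key path are hypothetical, chosen only for illustration.

```sh
# restrict_key FILE: make FILE readable and writable by its owner only,
# then print the first ten characters of its "ls -l" line (the mode string).
restrict_key() {
    chmod 600 "$1" || return 1
    ls -l "$1" | cut -c1-10
}

# Typical use (the key path is an example, ssh-keygen will prompt for a passphrase):
#   ssh-keygen -t rsa -b 4096 -f "$HOME/.ssh/ec2_course_key"
#   restrict_key "$HOME/.ssh/ec2_course_key"
```

A correctly protected private key shows the mode string -rw-------.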

Googling "Amazon Web Services" should bring you to a login page. After logging in, you should see a page like the one below. We recommend you select "US West (Oregon)" from the drop-down menu (these instances are on the west coast but cheaper than California), then select EC2.

[screenshot]

At the EC2 console, select "Key Pairs" from the left-hand side menu (in the "Network & Security" group). Then select "Import Key Pair" and browse to the location of your public key. Upload it and give it a name (it doesn't matter what name you give it, but make sure you remember where the private key is).

At the EC2 console, now select "Instances" from the left-hand side:

[screenshot]

From the instances menu you will see a list of the instances you currently have (none in the beginning). Clicking on the big blue "Launch Instance" button at the top of the page will lead you through the process of setting up a virtual machine. You should then see a page with various images. Select the "Deep Learning AMI (Ubuntu) Version 4.0". This image has a variety of deep learning toolkits, including TensorFlow and PyTorch.

[screenshot: ami2.png]

The next page shows you the various virtual machine options. We recommend p3.2xlarge (GPU) or c5.2xlarge (non-GPU) instances. Check the current prices of these instances in the US West (Oregon) region before deciding.

Click on "Configure Instance Details" at the bottom of the screen. You will then see various options that you can configure. The defaults should be fine. Click "Review and Launch". This takes you to a page that summarizes the instance's configuration. Finally hit the "Launch" button. It will ask you to select the key pair to use. Select the right one (you probably only have one if you're reading this), and check the "acknowledge.." box. Then click "Launch" to really start your instance. The final button "View Instances" takes you back to the Instances page, where you will see your new instance launching.

The display will look something like this (with fewer machines). Make sure your machine is selected (blue check mark). Then look at the bottom of the list.

[screenshot: IP.png]

You should see a "Public DNS" entry. That is the public machine name you should connect to in order to use your machine. Unfortunately it will be different each time you start your instance. You can either copy this entire entry, or just use the IP address, which is the numeric part of the name; in this example it's 34.211.248.29. The IP address is also listed in the "IPv4 Public IP" column of the instance table. Normally you'll have to scroll to the right to see it. Either the DNS name or the IP address will work.
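To illustrate how the IP address is embedded in the DNS name, here is a small sketch. It assumes the usual EC2 naming pattern ec2-<a>-<b>-<c>-<d>.<region>.compute.amazonaws.com. The full host name used in the comment is an assumed example built from the IP address above, and dns_to_ip is a hypothetical helper name.

```sh
# dns_to_ip NAME: extract the dotted IP address from an EC2 public DNS name
# of the (assumed) form ec2-A-B-C-D.<region>.compute.amazonaws.com
dns_to_ip() {
    printf '%s\n' "$1" |
        sed -e 's/^ec2-//' -e 's/\..*$//' -e 'y/-/./'
}

# Example:
#   dns_to_ip ec2-34-211-248-29.us-west-2.compute.amazonaws.com
#   prints 34.211.248.29
```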

Connecting

To connect to an EC2 host (using the DNS name or IP address you looked up above) from Mac OS X or Linux, use the script linked here. Copy it to a file, and make that file executable (chmod 755 filename). Make sure you fill in the name of your private key file and the machine's IP address in the script before you try to connect. For Windows, follow these directions for using PuTTY.

Note: Amazon EC2 machine names are volatile and will be different each time you start the machine. You will have to look up the name from the EC2 console and save it to the connection script each time.

To connect from a Mac or Linux machine, run the connection script you just edited from a terminal prompt.
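The course's connection script is not reproduced here, but a script of this kind might look roughly like the sketch below. It is an illustration, not the actual script: the function name connect, the DRY_RUN switch, and the single tunnel on port 8888 are assumptions, and the "ubuntu" login name is inferred from the Ubuntu image's /home/ubuntu directory.

```sh
# connect KEY HOST: open an SSH session to an EC2 host using a private key,
# forwarding local port 8888 to port 8888 on the instance.
# Set DRY_RUN=1 to print the ssh command instead of running it.
connect() {
    key=$1
    host=$2
    if [ ! -r "$key" ]; then
        echo "cannot read private key: $key" >&2
        return 1
    fi
    # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is set and non-empty
    ${DRY_RUN:+echo} ssh -i "$key" -L 8888:localhost:8888 "ubuntu@$host"
}

# Example (fill in your own key file and the current address of your instance):
#   connect "$HOME/.ssh/ec2_course_key" 34.211.248.29
```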

Once inside, you will find yourself in a Linux environment. The machine has 16 GB RAM, an 8-core Intel processor, and an NVIDIA GPU.

Running Notebooks

You can start notebooks normally from the command prompt:

ipython notebook

When you start a notebook it will first try to start a browser through the EC2 machine's X11 service. If you are running X11 on your client machine (e.g. if you use Linux), it will start a new window on your client. It's painfully slow to work in this mode, however, because you have to transfer all the pixels that change when you perform an action. If you get this window, just close it (you don't have to do anything if you don't see it).

If you get the error "OSError: /home/ubuntu/anaconda3/envs/tensorflow_p36/bin/../lib/libstdc++.so.6: version `GLIBCXX_3.4.20' not found" with TensorFlow, follow these steps.

Instead, the notebooks can be used with your local browser. To start ipython without starting its own browser, do:

ipython notebook --no-browser

Then open a new browser window or tab on your client (e.g. your laptop) and go to the URL:

http://localhost:8888

This (8888) is the default port number of the first notebook you open on a machine. If you start more than one notebook, they will use higher numbers. Just look at the notebook's startup message on the terminal that you started it on to see which port number it uses, and use that in your connection URL.

Although the machine address is "localhost", the connection is actually made to the remote EC2 instance. That's because the connection script created SSH tunnels from port 8888 (and higher) on your local machine to the corresponding ports on the remote EC2 instance. That simplifies connecting, and also secures the connections against snooping in transit.

Once at that URL you can interact with the notebook as you would on your local machine.
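To make the tunnel idea concrete, the sketch below builds the ssh -L options for a run of consecutive ports. Each option of the form -L PORT:localhost:PORT tells ssh to listen on PORT on your local machine and forward connections to the same port on the remote host. The helper name tunnel_flags and the choice of three ports are assumptions for illustration. The real connection script may set up its tunnels differently.

```sh
# tunnel_flags FIRST COUNT: print ssh -L options that forward COUNT
# consecutive local ports, starting at FIRST, to the same ports remotely.
tunnel_flags() {
    port=$1
    last=$(( $1 + $2 - 1 ))
    flags=""
    while [ "$port" -le "$last" ]; do
        flags="$flags -L $port:localhost:$port"
        port=$(( port + 1 ))
    done
    # Drop the leading space before printing
    printf '%s\n' "${flags# }"
}

# Example: tunnel_flags 8888 3 prints
#   -L 8888:localhost:8888 -L 8889:localhost:8889 -L 8890:localhost:8890
# The output is left unquoted in an ssh command so it splits into options:
#   ssh -i KEYFILE $(tunnel_flags 8888 3) ubuntu@HOST
```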

Stop your Machine

Stop your machine from the EC2 management console when you are done. Logging out does not stop the machine. Nor does shutting it down from a command prompt on the virtual machine: "sudo shutdown now" will cause your VM to shut down, but billing will continue. You have to go to the EC2 management console, select your instance, click on the "Actions" menu, select "Instance State" and select "Stop". Do not click "Terminate". That destroys your instance and any disks attached to it. Avoid rapid restarts. Amazon rounds each start/stop cycle up to a full hour for billing, so 5 rapid start/stop cycles in 10 minutes will be billed as 5 hours.
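The arithmetic behind the billing example can be sketched as follows, assuming the rule as stated in this document: every start/stop cycle is charged as a whole number of hours, rounded up. The helper billed_hours is hypothetical, and you should check the current AWS pricing pages for the rule that applies to your instance type.

```sh
# billed_hours MINUTES...: total hours charged when each start/stop cycle
# (given as its running time in minutes) is rounded up to a full hour.
billed_hours() {
    total=0
    for minutes in "$@"; do
        # (minutes + 59) / 60 is integer division that rounds up
        total=$(( total + (minutes + 59) / 60 ))
    done
    echo "$total"
}

# Five 2-minute cycles:   billed_hours 2 2 2 2 2   prints 5
# One 10-minute session:  billed_hours 10          prints 1
```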

Also be careful if you have been working for a while, because the connection from the EC2 Management Console in your browser to the web server may be stale. It's an active page, so you can still click on Actions-->Instance State-->Stop, but it may not stop your instance. Always refresh the "EC2 Management Console" page in your browser before applying actions. Make sure you see positive evidence that your instance is stopping: refresh the Instances page, and make sure your instance shows "stopped" or "stopping".
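If you happen to have the AWS command line tools installed and configured (an assumption, since this guide otherwise uses only the web console), the same check can be sketched from a terminal. The helper names and the instance ID in the comment are hypothetical.

```sh
# instance_state ID: print the current state name of one instance
# (for example running, stopping, or stopped) using the AWS CLI.
instance_state() {
    aws ec2 describe-instances --instance-ids "$1" \
        --query 'Reservations[].Instances[].State.Name' --output text
}

# confirm_stopped ID: succeed only if the instance is stopping or stopped.
confirm_stopped() {
    state=$(instance_state "$1") || return 1
    case "$state" in
        stopping|stopped) echo "ok: $state" ;;
        *) echo "not stopped: $state"; return 1 ;;
    esac
}

# Example (replace with your own instance ID from the console):
#   confirm_stopped i-0123456789abcdef0
```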