HW 1

Text Analysis and Entity Resolution

In this HW you will apply some basic text preparation methods and use the IPython notebook. First of all, you need to start IPython. IPython is a client-server system: it runs a server task for editing and evaluation, and a client, which can be any web browser. From a terminal prompt in your virtual machine, type:

ipython notebook

You will see some startup diagnostics on the screen (sometimes this takes a while), and then the IPython dashboard should open in your default web browser. Leave the terminal window open and the ipython task running; if you close it, IPython will stop working in the browser.

If the browser window doesn't open, or if you need to restart the browser, point it at this URL:

http://localhost:8888
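
If the page at that URL does not load, it can help to confirm that the server is actually accepting connections before troubleshooting the browser. The following is a minimal sketch, not part of the assignment: the function name `server_is_listening` is hypothetical, and the host and port are assumed from the URL above. It only checks that something is listening on the port, not that it is IPython.

```python
import socket


def server_is_listening(host="localhost", port=8888, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Connection refused or timed out: nothing is listening there.
        return False
    conn.close()
    return True


if __name__ == "__main__":
    if server_is_listening():
        print("A server is listening on localhost:8888")
    else:
        print("Nothing is listening on localhost:8888; is 'ipython notebook' still running?")
```

Run it from a second terminal in the VM so that the terminal running the ipython task stays open.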

Documentation for IPython is available on the IPython project page.

Once you have IPython running, you need to load the HW assignment, which is an IPython notebook file. You should be able to click on this link to get the file. If you do this inside a browser that is running on your VM, you can save it somewhere in the VM's file system, e.g. in a directory you created named ~/hw1. If you have trouble with networking in your VM, open the link above in any browser on your host machine, save it to a file, and then use drag-and-drop to copy the file from your host to your VM. Drag-and-drop is not enabled by default, but you can turn it on from the "Settings" option in the Oracle VM control panel.

If you just want to look at the HW, you can use this link. This opens a (non-editable) viewer with the HW file. Don't try to do anything in that window; it's solely for previewing the assignment.

Now that you have hw1.ipynb in a directory on your VM (say in ~/hw1), you need to load it into IPython. That means you need to "upload" it to the server. From the "Notebooks" tab in IPython, either use drag-and-drop or click on the highlighted link to upload a file. The file will first appear in a list with an "Upload" button. You have to click this button to actually upload the file. Finally, clicking on the filename "hw1.ipynb" will open the IPython editor/evaluator.

Once you have the hw1 notebook open, you can follow the directions in it to do the assignment. Make sure you regularly hit File->Save and Checkpoint so that your work is being saved. When you are done editing, don't close the IPython server. First use File->Download As and save as an IPython Notebook (.ipynb), e.g. as ~/hw1/hw1-submit.ipynb. You will submit this file later.
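
Before submitting, you may want to confirm that the downloaded file is intact. A notebook file is a JSON document with a top-level `nbformat` field, so a rough sanity check is possible without opening IPython. The sketch below is an optional illustration, not a required step: the function name `check_notebook` is hypothetical, and the path is the example location mentioned above.

```python
import io
import json
import os


def check_notebook(path):
    """Return "ok" if path looks like a notebook file, else a short reason."""
    path = os.path.expanduser(path)
    if not os.path.isfile(path):
        return "missing file"
    try:
        with io.open(path, encoding="utf-8") as f:
            nb = json.load(f)
    except ValueError:
        # Raised for malformed JSON (and for undecodable bytes).
        return "not valid JSON"
    if not isinstance(nb, dict) or "nbformat" not in nb:
        return "no nbformat field"
    return "ok"


if __name__ == "__main__":
    # Example path from the instructions; adjust if you saved elsewhere.
    print(check_notebook("~/hw1/hw1-submit.ipynb"))
```

This only checks the file's overall shape. It does not check that your answers are present or correct.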

Submission Instructions

Submit homework 1 by Thursday 10 September at 10:00 PM using your bCourses account and this link. Your submission must include a file named 'hw1-submit.ipynb'.