Assignment 3 Description

In this assignment, you will learn more about vector space models of semantics (specifically, word embeddings), attention in the context of machine translation, and memory networks. The goals of this assignment are as follows:

  • Understand and implement two predictive methods for learning word embeddings.
  • Solve two semantic tasks using pretrained word embeddings.
  • Implement several methods for attention in neural machine translation.
  • Implement an end-to-end memory network for QA.

All parts of the assignment must be submitted by 11PM on Monday 4/2.

Setup

You can work on the assignment in one of three ways: locally on your own machine, on the EECS instructional machines, or on a virtual machine in EC2. See the Assignment 2 Description page for more information on these options.

GPU Resources

GPUs are not required for this assignment but will help to speed up training and processing time for some of the questions.

Working on the assignment

Get v6 of the code as a zip file here.

Start IPython

After you have downloaded the assignment, you should start the Jupyter (IPython) notebook server from the assignment3 directory, with the jupyter notebook command. There are cells in appropriate places in each notebook that serve to download data and pretrained weights.

Some Notes

NOTE 1: The assignment3 code has been tested to be compatible with Python versions 3.5 and 3.6 (it may work with other versions of 3.x, but we won't be officially supporting them). For this assignment, we are NOT officially supporting Python 2; use it at your own risk. You will need to make sure during your virtualenv setup that the correct version of Python is used. You can confirm your Python version by (1) activating your virtualenv and (2) running python --version.

NOTE 2: We will be using the latest version of TensorFlow (1.6), and we will be discouraging the use of high-level abstractions from the tf.nn module. We will not be using PyTorch for this assignment.

NOTE 3: If you are working in a virtual environment on OSX, you may encounter errors with matplotlib due to the issues described here. In our testing, this issue appears to be fixed in the most recent version of matplotlib, but if you do run into it you may have to use the start_ipython_osx.sh script from the assignment3 directory (instead of jupyter notebook above) to launch your IPython notebook server. Note that you may have to modify some variables within the script to match your Python version and installation directory. The script assumes that your virtual environment is named .env.

Submitting your work

Whether you work on the assignment locally or on EC2, once you have finished all parts of the assignment (make sure you have the latest version of the code) and made a PDF of your assignment as per the instructions in Piazza post @314, run the collect_submission.sh script; this will produce a file called assignment3.zip. Please submit this file on bCourses.

We will also ask you to make a parallel submission on Gradescope for this assignment to make grading more efficient. Please follow the instructions in Piazza post @314 to submit a PDF version of your assignment to Gradescope.

Note: As of 3/23/18, all notebooks are available. You need v6 of the assignment 3 zip file, linked above, to get the latest fixes. Make sure you don't overwrite your hard work when you download a new version of the code!

Assignment Tasks

Pay attention to Note 2 above: we are discouraging the use of high-level abstractions from tf.nn so that you understand the implementation details of the methods. Please write the code with lower-level TensorFlow primitives. (You may, however, use such abstractions for the course project.) As a starting point, see the TensorFlow documentation for how to make the sigmoid cross-entropy numerically stable here.
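To illustrate the numerical-stability point, here is a minimal NumPy sketch of the stable sigmoid cross-entropy formula (the same reformulation used by tf.nn.sigmoid_cross_entropy_with_logits); the function name and test values here are our own, not part of the assignment code:

```python
import numpy as np

def sigmoid_cross_entropy_with_logits(logits, labels):
    """Numerically stable sigmoid cross-entropy.

    Naively computing -z*log(sigmoid(x)) - (1-z)*log(1 - sigmoid(x))
    overflows for large |x|. The algebraically equivalent form
    max(x, 0) - x*z + log(1 + exp(-|x|)) never exponentiates a large
    positive number, so it stays finite.
    """
    x, z = logits, labels
    return np.maximum(x, 0) - x * z + np.log1p(np.exp(-np.abs(x)))

# Matches the naive formula where the naive formula is well-behaved...
x = np.array([-2.0, 0.0, 3.0])
z = np.array([0.0, 1.0, 1.0])
sig = 1.0 / (1.0 + np.exp(-x))
naive = -z * np.log(sig) - (1 - z) * np.log(1 - sig)
assert np.allclose(sigmoid_cross_entropy_with_logits(x, z), naive)

# ...and stays finite where the naive formula would overflow.
big = sigmoid_cross_entropy_with_logits(np.array([1000.0]), np.array([0.0]))
assert np.isfinite(big).all()
```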

Q1: Word Embeddings (30 points)

The Jupyter notebook word_embeddings_tf.ipynb will walk you through implementing two methods for learning word embeddings. You will implement both and test your implementations by visualizing the learned embeddings with t-SNE. The notebook will also present a word similarity task and an analogy task, both of which you will solve using pretrained word embeddings.
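As a rough sketch of how the analogy task works with word embeddings (using tiny made-up vectors in place of real pretrained embeddings, and cosine similarity as the ranking metric):

```python
import numpy as np

# Toy 3-d "embeddings" for illustration only; real pretrained vectors
# (e.g. GloVe) have hundreds of dimensions and a large vocabulary.
emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.5, 0.9, 0.0]),
    "woman": np.array([0.5, 0.1, 0.9]),
}

def analogy(a, b, c, emb):
    """Solve a : b :: c : ? by cosine similarity to (b - a + c)."""
    target = emb[b] - emb[a] + emb[c]
    best, best_sim = None, -np.inf
    for w, v in emb.items():
        if w in (a, b, c):          # exclude the query words themselves
            continue
        sim = v @ target / (np.linalg.norm(v) * np.linalg.norm(target))
        if sim > best_sim:
            best, best_sim = w, sim
    return best

print(analogy("man", "woman", "king", emb))  # -> queen
```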

Q2: Attention & Machine Translation (40 points)

The Jupyter notebook machine_translation_and_attention_tf.ipynb will walk you through implementing an LSTM cell that uses attention for the decoder of an NMT model, as well as three different methods for implementing the attention computation. You will test your implementations by training the models.
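For intuition, here is a NumPy sketch of three common attention score functions (the dot, general, and concat variants from Luong et al., 2015). The parameter names and shapes below are illustrative assumptions; the notebook defines the exact variants you must implement:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_src = 4, 5                      # decoder hidden size, source length
h_t = rng.normal(size=(d,))          # decoder hidden state at step t
H_s = rng.normal(size=(n_src, d))    # encoder hidden states (one per source word)

# Hypothetical learned parameters for the parameterized score functions.
W_a = rng.normal(size=(d, d))        # "general" (bilinear) score
W_c = rng.normal(size=(d, 2 * d))    # "concat" score
v_a = rng.normal(size=(d,))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Three ways to score each encoder state against the decoder state:
score_dot = H_s @ h_t                              # dot product
score_general = H_s @ (W_a @ h_t)                  # bilinear form
score_concat = np.array([v_a @ np.tanh(W_c @ np.concatenate([h_t, h_s]))
                         for h_s in H_s])          # small feedforward net

# Turning scores into a context vector is the same for every variant:
alpha = softmax(score_dot)           # attention weights over source positions
context = alpha @ H_s                # weighted sum of encoder states
```

The context vector is then combined with the decoder state to produce the attentional output that feeds the next prediction.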

Q3: Memory Networks (30 points)

The Jupyter notebook memory_networks_tf.ipynb will walk you through implementing the MemN2N model. You will test your implementation by training the model on the bAbI question answering (QA) task.
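A single MemN2N hop can be sketched in NumPy as follows. In the real model the memory and query vectors come from embedding the story sentences and question with learned matrices (A, C, and B in the paper); the random stand-ins and shapes below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_mem = 8, 5                      # embedding dim, number of memory slots

m = rng.normal(size=(n_mem, d))      # input memories  (A-embedded sentences)
c = rng.normal(size=(n_mem, d))      # output memories (C-embedded sentences)
u = rng.normal(size=(d,))            # query embedding (B-embedded question)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# One hop: match the query against the input memories, then read a
# weighted sum of the output memories and add it to the query.
p = softmax(m @ u)                   # attention over memory slots
o = p @ c                            # read vector
u_next = u + o                       # input to the next hop (or the answer layer)
```

Stacking several such hops (sharing or tying the embedding matrices between hops) gives the full end-to-end model.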