Assignment 3 Description
In this assignment, you will learn more about vector space models of semantics (specifically, word embeddings), attention in the context of machine translation, and memory networks. The goals of this assignment are as follows:
- Understand and implement two predictive methods for learning word embeddings.
- Solve two semantic tasks using pretrained word embeddings.
- Implement several methods for attention in neural machine translation.
- Implement an end-to-end memory network for QA.
All parts of the assignment must be submitted by 11PM on Monday 4/2.
Setup
You can work on the assignment in one of three ways: locally on your own machine, on the EECS instructional machines, or on a virtual machine in EC2. See the Assignment 2 Description page for more information on these options.
GPU Resources
GPUs are not required for this assignment but will help to speed up training and processing time for some of the questions.
Working on the assignment
Get V6 of the code as a zip file here.
Start IPython
After you have downloaded the assignment, start the Jupyter (IPython) notebook server from the assignment3 directory with the jupyter notebook command. Each notebook contains cells in the appropriate places that download the data and pretrained weights.
Some Notes
NOTE 1: The assignment3 code has been tested to be compatible with Python versions 3.5 and 3.6 (it may work with other versions of 3.x, but we won't be officially supporting them). For this assignment we are NOT officially supporting Python 2; use it at your own risk. Make sure that the correct version of Python is used during your virtualenv setup. You can confirm your Python version by (1) activating your virtualenv and (2) running python --version.
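For example, from a POSIX shell (a sketch; it assumes your virtualenv lives at .env, the same name the OSX launch script in Note 3 expects):

```shell
# Activate the virtualenv if it exists, then report the interpreter version.
[ -f .env/bin/activate ] && . .env/bin/activate
python3 --version    # should report Python 3.5.x or 3.6.x
```

Inside an activated virtualenv the interpreter is also available as plain `python`, which is what the notebooks will use.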
NOTE 2: We will be using the latest version of TensorFlow (1.6), and we will be discouraging the use of high-level abstractions from the tf.nn module. We will not be using PyTorch for this assignment.
NOTE 3: If you are working in a virtual environment on OSX, you may encounter errors with matplotlib due to the issues described here. In our testing, this issue seems to be fixed in the most recent version of matplotlib, but if you do run into it you may have to use the start_ipython_osx.sh script from the assignment3 directory (instead of jupyter notebook above) to launch your IPython notebook server. Note that you may have to modify some variables within the script to match your Python version and installation directory. The script assumes that your virtual environment is named .env.
Submitting your work
Whether you work on the assignment locally or on EC2, once you are done with all parts of the assignment (make sure you have the latest version) and have made a PDF of your assignment per the instructions in Piazza post @314, run the collect_submission.sh script; this will produce a file called assignment3.zip. Please submit this file here to make a submission on bCourses.
We will also ask you to make a parallel submission to the course website on Gradescope for this assignment, to make grading more efficient. Please follow the instructions detailed in Piazza post @314 to submit a PDF version of your assignment to Gradescope.
Note: As of 3/23/18, all notebooks are available. You need to have v6 of the assignment 3 zip file which is linked above to get the latest fixes. Make sure you don't overwrite your hard work when you download a new version of the code!
Assignment Tasks
Pay attention to Note 2 above: we are discouraging the use of high-level abstractions from tf.nn to make sure you understand the implementation details of the methods. Please write the code with lower-level TensorFlow primitives. (You may, however, use such abstractions for the course project.) As a starting point, see the TensorFlow documentation on how to make the sigmoid cross-entropy numerically stable here.
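As a reference for the numerics, the stable formulation described in the TF docs can be sketched in plain Python. The naive form computes log(sigmoid(x)) and log(1 - sigmoid(x)) directly, which blows up for large-magnitude logits; the stable form rewrites the loss as max(x, 0) - x*z + log(1 + exp(-|x|)):

```python
import math

def sigmoid_xent_naive(x, z):
    """Naive sigmoid cross-entropy; the log terms blow up for large |x|."""
    p = 1.0 / (1.0 + math.exp(-x))
    return -z * math.log(p) - (1.0 - z) * math.log(1.0 - p)

def sigmoid_xent_stable(x, z):
    """Stable form from the TF docs: max(x, 0) - x*z + log(1 + exp(-|x|))."""
    return max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))

# The two agree for moderate logits...
for x, z in [(-2.0, 0.0), (0.5, 1.0), (3.0, 1.0)]:
    assert abs(sigmoid_xent_naive(x, z) - sigmoid_xent_stable(x, z)) < 1e-9

# ...but only the stable form survives extreme logits (the naive form
# would take log(0) here and raise a math domain error).
print(sigmoid_xent_stable(1000.0, 0.0))  # 1000.0
```

This is exactly the trick implemented by tf.nn.sigmoid_cross_entropy_with_logits, which you may study but should reimplement yourself with low-level ops.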
Q1: Word Embeddings (30 points)
The Jupyter notebook word_embeddings_tf.ipynb will walk you through implementing two methods for learning word embeddings. You will implement both and test your implementations by visualizing the learned embeddings with t-SNE. The notebook will also present a word similarity task and an analogy task, both of which you will solve using pretrained word embeddings.
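For intuition, both the similarity and analogy tasks reduce to nearest-neighbor lookups under cosine similarity in the embedding space. A toy sketch with made-up 3-d vectors (hypothetical values; the notebook uses real pretrained embeddings with hundreds of dimensions):

```python
import numpy as np

# Toy embeddings with hypothetical values, chosen so the analogy works out.
emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
    "apple": np.array([0.0, 0.5, 0.1]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def analogy(a, b, c):
    """Solve a : b :: c : ? via nearest neighbor to (b - a + c)."""
    target = emb[b] - emb[a] + emb[c]
    # Exclude the query words themselves, as is standard for this task.
    scores = {w: cosine(target, v) for w, v in emb.items() if w not in (a, b, c)}
    return max(scores, key=scores.get)

print(analogy("man", "king", "woman"))  # queen
```

Excluding the query words matters: the nearest neighbor of (king - man + woman) is often one of the inputs themselves.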
Q2: Attention & Machine Translation (40 points)
The Jupyter notebook machine_translation_and_attention_tf.ipynb will walk you through implementing an LSTM cell that uses attention for the decoder of an NMT model, as well as three different methods for implementing the attention computation. You will test your implementations by training the models.
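Common choices for scoring a decoder state against the encoder states are the dot-product, bilinear ("general"), and additive ("concat") forms. A hedged NumPy sketch of the shared pattern (names, shapes, and the specific three scoring rules are assumptions, not necessarily the notebook's; your solution must be written in TensorFlow):

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 5, 4                       # source length, hidden size (assumed)
H = rng.normal(size=(T, d))       # encoder hidden states h_1..h_T
s = rng.normal(size=(d,))         # current decoder state

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# 1) Dot score:       score_t = h_t . s
dot_scores = H @ s
# 2) "General" score: score_t = h_t . (W s)   (a learned bilinear form)
W = rng.normal(size=(d, d))
general_scores = H @ (W @ s)
# 3) Additive/concat: score_t = v . tanh(W1 h_t + W2 s)
W1, W2, v = rng.normal(size=(d, d)), rng.normal(size=(d, d)), rng.normal(size=(d,))
concat_scores = np.tanh(H @ W1.T + W2 @ s) @ v

# Every scoring rule then yields weights and a context vector the same way.
for scores in (dot_scores, general_scores, concat_scores):
    alpha = softmax(scores)       # attention weights over the T source positions
    context = alpha @ H           # weighted sum of encoder states, shape (d,)
    assert np.isclose(alpha.sum(), 1.0) and context.shape == (d,)
print("all three scoring methods produce valid context vectors")
```

The context vector is then fed into the decoder LSTM cell (e.g. concatenated with its input or output) to condition the next prediction on the source.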
Q3: Memory Networks (30 points)
The Jupyter notebook memory_networks_tf.ipynb will walk you through implementing the MemN2N model. You will test your implementation by training the model on the bAbI question answering (QA) task.
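At its core, one MemN2N "hop" is soft attention over embedded memories: match the embedded query against the input memories, then read out an attention-weighted sum of the output memories. A NumPy sketch of a single hop with bag-of-words sentence encoding (dimensions, random data, and variable names are illustrative assumptions; the notebook's TensorFlow model adds multiple hops and training):

```python
import numpy as np

rng = np.random.default_rng(0)
V, d, n_mem = 20, 8, 6            # vocab size, embedding dim, #memories (assumed)

A = rng.normal(size=(V, d))       # input-memory embedding matrix
C = rng.normal(size=(V, d))       # output-memory embedding matrix
B = rng.normal(size=(V, d))       # question embedding matrix
W = rng.normal(size=(d, V))       # final answer projection

story = rng.integers(0, V, size=(n_mem, 3))   # n_mem sentences of 3 word ids
question = rng.integers(0, V, size=(4,))      # question word ids

def bow(E, ids):
    """Bag-of-words sentence encoding: sum of the word embeddings."""
    return E[ids].sum(axis=0)

m = np.stack([bow(A, sent) for sent in story])   # input memories  (n_mem, d)
c = np.stack([bow(C, sent) for sent in story])   # output memories (n_mem, d)
u = bow(B, question)                             # query vector    (d,)

scores = m @ u                    # match the query against each memory
p = np.exp(scores - scores.max()); p /= p.sum()  # softmax attention weights
o = p @ c                         # read step: weighted sum of output memories
logits = (o + u) @ W              # single-hop answer scores over the vocabulary
print(logits.shape)               # (20,)
```

Stacking hops means feeding (o + u) back in as the next query; with a softmax over the final logits you get the answer distribution trained on bAbI.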