Course Syllabus

Welcome to Applied Natural Language Processing (i256)!

Mon, Wed 10:30-12:00, 202 South Hall

Prof. Marti Hearst

Fall 2015

Office Hours: Wed 3-4pm, 307B South Hall

TA: John Semerdjian

Much of the most valuable information available online today resides in textual form, but natural language is notoriously difficult to process automatically. Applied natural language processing -- also known as automated content analysis and language engineering -- can provide partial solutions.

This course will examine the state of the art in applied NLP, with an emphasis on how well the algorithms work and how they can be used (or not) in applications. Many ready-to-use, plug-and-play software tools for NLP algorithms are available today, so this course will emphasize becoming fluent at writing quick programs with existing tools. The intended learning outcomes are for students to:

    • Learn about major NLP issues and solutions
    • Become agile with NLP programming
    • Be able to assess NLP problems
    • Be able to get the gist of relevant research papers

This course will also use a different learning approach from the one used in most classes, one that hundreds of research papers have shown to work better than the traditional lecture. This method is variously known as active learning and peer/collaborative learning. What it means for students is:

    • Lecturing will be minimized in favor of active work in class, which means students must prepare for class in advance.  Therefore ...
    • Students must prepare and turn in materials before class every week.
    • Students will be actively engaged during most of the class period, including extensive programming in class.
    • Students will work closely with other students in class to improve their learning.
    • For these reasons, the class must be taken for a grade.  No auditors, no S/U.

The course book is free online; it is the book that accompanies the NLTK software, which we will be working with extensively throughout the semester. Another terrific book is Jurafsky & Martin's Speech and Language Processing, but since it is both too expensive and a bit too technical, we are not using it in this class.

The UC Berkeley code of conduct is in effect in this class; you are expected to do your own work except when explicitly asked to work with others.  You may consult with others but you must write your own code when that is required by an assignment.  If you use code from elsewhere, you must explicitly note which pieces of code come from elsewhere and describe where the code comes from.

We are also using bCourses, which is a pretty terrific course management tool. The best way to view what is happening is via the Modules View.

See the flyer for the final project poster session

Course Summary:

Date Details