Final Project Proposal

Due by 10pm Friday 9/25

Team Forming

The first step in starting your project is team formation. Some of you may already have a team; others may be more interested in specific topics. To facilitate team matching, you can use the discussion functionality in bCourses. In the menu on the left, click "Discussions" to see a list of discussion threads. There is a thread for each of the recommended projects, titled "<Project Title> Project". If you are interested in a project, reply to its thread so that others can see your post.

If you want to start a new project and are looking for partners, you can post a new discussion topic (make sure it is named "<Project Title> Project"), and others can reply to it to express interest. You can take it from there.

Here are the recommended projects:

Recommended Projects Page

Writeup

For the project proposal assignment, we would like you to produce a Project Proposal of at least 2 pages. You can submit either plain text or a PDF file. The proposal should include these sections:

The Team

A list of the members of your team and their expected roles. If all team members have similar backgrounds, say so; in that case you can describe a symmetric work breakdown.

Problem Statement and Background

A high-level statement of the problem you intend to address, e.g. finding correspondences between neural recordings and DNN layers. Try to translate the high-level statement into specific questions if you can, but this may be difficult before you do exploratory data analysis.

Give background on the problem you are solving: why it is interesting, who is interested, what is known, some references about it, etc.

The Data Source(s) You Intend to Use

Describe the data source(s) you will use. If you're not doing one of the recommended projects, make sure you have access to the data you want to use *in the quantity and quality you need*. Ideally you should get all the data in your hands before starting your project. Several teams in past offerings of this course have run into trouble because they couldn't obtain the volume or quality of data they needed. Many online data sources (e.g. Yelp) strongly throttle the data volume you can get through their API. Others (e.g. Twitter) have a higher limit but impose use restrictions that you must respect. Certain types of metadata (e.g. location, demographics) in social media sources are both sparse and unreliable.
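One quick way to check whether a throttled API can deliver the volume you need is a back-of-the-envelope estimate of collection time. The sketch below is only an illustration: the function name and all of the numbers are hypothetical and do not describe the limits of any real service, so substitute the quota published by your data source.

```python
import math


def days_to_collect(total_records, records_per_request, requests_per_day):
    """Estimate the whole days needed to download total_records
    when the API caps the number of requests per day."""
    if min(total_records, records_per_request, requests_per_day) <= 0:
        raise ValueError("all arguments must be positive")
    # Each request returns at most records_per_request records.
    requests_needed = math.ceil(total_records / records_per_request)
    # The daily quota bounds how many requests can be sent per day.
    return math.ceil(requests_needed / requests_per_day)


# Hypothetical scenario: one million records, 50 records per request,
# and a quota of 5,000 requests per day.
print(days_to_collect(1_000_000, 50, 5_000))
```

If the estimate is a large fraction of the time available for the project, consider a smaller sample or a different data source.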

Describe how you plan to obtain the data, or how you got it if you already have it.

Give a summary of the cleaning/joining of data that you expect to do up front.

Goals of Your Analysis

List some goals of your analysis, ideally in the form of testable hypotheses or well-defined success metrics. These can be tentative, and you don't need to stick to them throughout your project. Again, since you haven't done any exploratory analysis yet, you might assume that the data has structure that it doesn't, and you might not yet have seen other interesting patterns in the data. But you should always approach the data with some expectations so that your efforts are focused.
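As one illustration of what "testable" can mean in practice, a hypothesis such as "group A has a different mean than group B" can be checked with a permutation test. The sketch below is one possible approach, not a required method; the function name, the sample values, and the number of permutations are assumptions chosen for the example.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the estimated probability of seeing a difference at least as
    large as the observed one if group labels were assigned at random.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign group labels
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical measurements for two groups.
a = [12.1, 11.8, 12.5, 12.9, 11.6, 12.3]
b = [13.4, 13.9, 12.8, 14.1, 13.6, 13.2]
print(permutation_p_value(a, b))
```

Stating the comparison, the test statistic, and the threshold you would accept before looking at the data makes the hypothesis concrete enough to evaluate later.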

Description of Data Analysis Tools You Plan to Use

Describe the tools you plan to use throughout the project. As you might expect, there will be several stages in the project. Those will include exploratory data analysis, modeling (using machine learning), and possibly a scale-up from a sample of data to the full dataset. This part can also be tentative, and we will give you feedback on your analysis plan as part of grading the assignment.

Describe the Data Products Your Project Will Produce

Data products include results of statistical tests, performance analyses of learning algorithms, and visualizations of the data or model parameters. List the ones relevant to your project.

Submission

Use this link to submit your writeup.