Syllabus for CS 7150 Spring 2025

Class time and Location: Tuesdays 11:45 AM - 1:25 PM and Thursdays 2:50 PM - 4:30 PM, Dodge Hall 173

The class meetings are in-person only.

The class is very participatory, with groups of students presenting and discussing papers on most days. On the first day of class, please fill out https://bit.ly/cs7150-roles25 to sign up.

Sign up for piazza here: https://piazza.com/northeastern/spring2025/cs715041859202530/home

Professor: David Bau davidbau@northeastern.edu

Professor Office Hours: Thursdays at 2 PM, right before class, at the lecture hall.

TAs: Vatika Tewari tewari.v@northeastern.edu

TA help hours:

To be posted.

Summary

In this course we will learn the principles and practice of deep learning methods and research.

We will cover the capabilities of deep networks, the main methods for training them effectively, and the common architectures and techniques for using a deep network to process and produce images and natural language text.  In addition to gaining hands-on experience with these methods, we will discuss seminal research papers, survey some of the main questions and debates that have emerged in deep learning research, and explore some current research topics.  There will be a final project where you work with a partner to choose, investigate, and write a blog exploration of a deep learning research topic centered on a paper you choose.

Grading

We are planning 160 total points of coursework.

Class participation: 40 points.  In-class paper presentations and discussion.

Semester Overview: Throughout the semester, 15 lectures will be dedicated to discussing the 15 selected research papers, each led by a panel of students who present the paper and field follow-up questions from the audience. These papers and their dates are pre-scheduled and can be seen in the class schedule below.

Points Breakdown & To Dos (40 Points):

  1. Nightly Questions: The night before each paper discussion, read the assigned paper and submit a question on Canvas for Prof. Bau or the review panel, along with a brief explanation of why you think the question is important or reasonable. Each submitted question earns 1 point, for a total of 12 points.
  2. In-Class Interactions: Prof. Bau may randomly call on students during any lecture, be it a paper presentation or a standard class. Students earn points from being called on and from other active participation; class participation is 7 points.
  3. Paper Presentation & Review: Three times during the semester, you'll be assigned one of the roles listed below, each of which asks you to review and present a paper from a different perspective. This collaborative effort, done with several other students, requires the creation of a slide deck (more on that below). Successfully leading these discussions can earn you up to 7 points per presentation, for a total of 21 points.

Roles (several students for each paper):

  1. Diagrammer (1 student): Create a slide visually explaining the method of the paper.
  2. Reviewers (up to 2 students): Address NeurIPS review questions. Summaries to be included in the last slide and detailed reviews uploaded to a specific Google Drive folder.
  3. Archaeologist (1 student): Offer historical context relevant to the paper.
  4. Private Investigator (up to 1 student): Delve into the backgrounds and motivations of the paper's authors.
  5. Academic Researcher (up to 1 student): Suggest a potential future academic project based on the paper's findings or method.

Format & Preparation:

- Duration: Each paper discussion will last 45 minutes.

- Roles & Sign-up: Students are required to fill the role-playing sign-up form on Canvas before January 9. Here, you can indicate your role preference for each paper. While we aim to honor your preferences, please understand that adjustments might be necessary.

- Slide Deck & Materials Submission: All panel participants should prepare and submit the required materials the night before the scheduled discussion. This includes:

  - A Google Slides presentation, only for those in designated roles (see the 'INSTRUCTIONS: For reference purposes' file in the shared drive, which details for each role what to prepare, where, and which files to upload)

  - A question about the paper for the broader class (Nightly Questions through canvas)

  - Acknowledgment of the role you're playing for that paper (if you have a role for a particular paper, select it from the options within the same nightly question assignment)

Class Procedure (during presentation):

  1. Slide Presentation (10-15 mins): Professor Bau will guide the class through the Google Slides, with each student briefly explaining (in under a minute) their slide and findings.
  2. Open Discussion: TAs will select students at random to pose their previously submitted questions to the panel.

Homework: 40 points.  Programming and calculation exercises.

These will be Jupyter notebooks to be worked through by students individually, submitted online every two weeks by 11 AM before Tuesday class.  Late homework submissions will be accepted but lose points for each day late, with no credit after a week.  10 points per homework.

Midterm: 40 points.  A written exam about foundational methods.  Closed-book.

Final Project: 40 points.  A blog/webpage report and presentation about a research paper that you choose. Done by groups of 2 students (3 with permission).  Similar to the paper-reading roleplaying exercise, except that you choose the papers, you play all the roles, and instead of putting together slides, you will assemble an illustrated blog with all your analysis.  Optionally, your project may include a programming demonstration or a research extension of the concepts in the paper that you choose.

 

Late day policy

Late homework submissions will be accepted.  Every student has a bank of 3 free late days that can be applied without asking permission (e.g., 2 days on one homework, 1 day on another).  Beyond the 3 banked late days, a 20% penalty is applied automatically for each day late.

 

Collaboration Policy

Collaboration is allowed, but you should think about the problems yourself before discussing them with others.  When you do seek out help, we strongly advise you to find a fellow classmate to talk with and work together rather than copying an answer. You will all learn much more by thinking collaboratively and explaining ideas to one another.  When you collaborate, you must acknowledge your collaborators by listing them explicitly.

 

Academic integrity policy

Please read the university's academic integrity and plagiarism policies.

 

Topic Map for the Course


List of important concepts to know after this course

*Starred items may be on the midterm exam.

 

The historical and intellectual evolution of deep network methods:

  • From Ramon y Cajal to McCulloch-Pitts Neurons to Cybenko's Universal Approximation*
  • Comparing Rosenblatt’s Perceptron Learning to Rumelhart-Hinton’s Backpropagation*
  • Lettvin’s Grandmother Neuron idea vs Hinton’s Parallel Distributed Processing Model*
  • Hinton-Salakhutdinov’s Layerwise Pretraining vs Glorot-Bengio’s End-to-End Training*
  • From Hubel and Wiesel to LeCun’s LeNet to Krizhevsky’s AlexNet*
  • From Elman’s Finding Structure in Time to GPT-3 Few-Shot Learning

How to use deep network tools (pytorch)

  • GPU tensors*
  • Autograd / backpropagation*
  • SGD and Adam and other optimizers*
  • Network modules*
  • Fast data loading*
  • Using pretrained models*
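
The pieces above (tensors, autograd, an optimizer, a module, and a DataLoader) fit together in a short training loop. Here is a minimal illustrative sketch, not course-provided code; the toy regression data, model size, and hyperparameters are invented for illustration:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Invented toy data: y = 3x + 1 plus a little noise
X = torch.randn(256, 1)
y = 3 * X + 1 + 0.1 * torch.randn(256, 1)
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

model = nn.Linear(1, 1)                          # a network module
opt = torch.optim.SGD(model.parameters(), lr=0.1)  # the optimizer

for epoch in range(20):
    for xb, yb in loader:                        # fast batched data loading
        loss = nn.functional.mse_loss(model(xb), yb)
        opt.zero_grad()
        loss.backward()                          # autograd computes gradients
        opt.step()                               # SGD updates the weights
```

After training, `model.weight` and `model.bias` should land near the generating values 3 and 1. On a GPU machine, moving `model` and the batches with `.to("cuda")` is the only change needed.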

How to train a model deeper than a few layers

  • The backpropagation algorithm*
  • Vanishing and exploding gradient*
  • Xavier or Kaiming initialization*
  • ReLU vs Sigmoid vs Threshold nonlinearities*
  • Residual connections*
  • Batch normalization*
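
To make the backpropagation item concrete, here is a pure-Python sketch of the chain rule on a two-weight network with one ReLU unit, checked against finite differences; the tiny network and the numeric values are invented for illustration:

```python
# Tiny network: y = w2 * relu(w1 * x), squared-error loss (y - t)^2
def forward(w1, w2, x, t):
    h = max(0.0, w1 * x)                 # ReLU hidden unit
    y = w2 * h
    return (y - t) ** 2

def grads(w1, w2, x, t):
    # Manual backpropagation via the chain rule
    h = max(0.0, w1 * x)
    y = w2 * h
    dL_dy = 2 * (y - t)
    dL_dw2 = dL_dy * h                   # gradient for the output weight
    dL_dh = dL_dy * w2                   # backprop through the output weight
    dh_dz = 1.0 if w1 * x > 0 else 0.0   # ReLU derivative (0 where it clamps)
    dL_dw1 = dL_dh * dh_dz * x           # gradient for the input weight
    return dL_dw1, dL_dw2

# Sanity check against central finite differences
w1, w2, x, t = 0.5, -1.5, 2.0, 1.0
g1, g2 = grads(w1, w2, x, t)
eps = 1e-6
fd1 = (forward(w1 + eps, w2, x, t) - forward(w1 - eps, w2, x, t)) / (2 * eps)
fd2 = (forward(w1, w2 + eps, x, t) - forward(w1, w2 - eps, x, t)) / (2 * eps)
```

The same recursive application of the chain rule, applied layer by layer from the loss backward, is what autograd automates for deep networks.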

How to improve generalization

  • Holdout tests*
  • L2 Weight decay*, L1 regularization*
  • Dropout*
  • Early stopping*
  • Data augmentation*
  • Denoising*
  • Weight sharing, invariants and equivariances*
  • Double-descent
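
One way to see L2 weight decay concretely: adding lam * w**2 to the loss is equivalent to adding 2 * lam * w to the gradient, which shrinks each weight toward zero at every step. A toy one-dimensional sketch (the loss and constants are invented for illustration):

```python
lam, lr = 0.01, 0.1
w = 5.0

def dL_dw(w):
    # Gradient of an invented toy loss L(w) = (w - 3)^2
    return 2 * (w - 3)

for _ in range(200):
    # Gradient step with the weight-decay term 2 * lam * w added
    w -= lr * (dL_dw(w) + 2 * lam * w)

# Fixed point: 2(w - 3) + 2*lam*w = 0  =>  w = 3 / (1 + lam)
```

The regularized optimum 3 / (1 + lam) sits slightly below the unregularized optimum 3, showing how weight decay biases solutions toward smaller weights.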

Learning to Learn

  • Metalearning
  • Prompting

Doing research

  • Reading papers
  • Writing papers

 

How to design effective objectives

  • Softmax and classification*
  • Cross-Entropy loss, KL divergence, and distribution-matching*
  • Autoencoders* and Denoising Autoencoders*
  • L2 reconstruction loss and L1 median-finding loss*
  • Contrastive learning*
  • Variational Autoencoders
  • Diffusion Models
  • Normalizing Flows
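
Softmax and cross-entropy can be sketched in a few lines of plain Python (the example logits below are invented). Note that cross-entropy reduces to the negative log of the probability the model assigns to the correct class:

```python
import math

def softmax(logits):
    m = max(logits)                        # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits, target):
    # CE loss = -log p[target], where p = softmax(logits)
    return -math.log(softmax(logits)[target])

p = softmax([2.0, 1.0, 0.1])               # probabilities summing to 1
loss = cross_entropy([2.0, 1.0, 0.1], 0)   # small when class 0 is likely
```

Minimizing this loss over a dataset matches the model's predicted distribution to the empirical one, which is why it is equivalent (up to a constant) to minimizing KL divergence.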

How to input images

  • Convolutional Layers*
  • Pooling*
  • AlexNet, VGG, and ResNet*

How to output images

  • U-Nets*
  • Perceptual losses
  • Generative Adversarial Networks

How to understand and visualize a convolutional network

  • Adversarial examples
  • Gradient visualizations
  • Saliency maps
  • Feature visualizations and dissection

How to input and output text

  • Word embedding*
  • Recurrent networks*
  • Language models*
  • LSTM and GRU*
  • Attention and Transformers*
  • BERT, GPT, and T5
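
The attention item can be sketched in plain Python as scaled dot-product attention, softmax(Q K^T / sqrt(d)) V, over lists of vectors; the example query, keys, and values below are invented for illustration:

```python
import math

def attention(Q, K, V):
    # Scaled dot-product attention: each query mixes the value rows,
    # weighted by softmax of its scaled dot products with the keys.
    d = len(Q[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        m = max(scores)                           # stabilize the softmax
        exps = [math.exp(s - m) for s in scores]
        Z = sum(exps)
        weights = [e / Z for e in exps]
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# The query aligns with the first key, so the output is close to the first value row
out = attention([[10.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]])
```

A Transformer layer computes Q, K, and V as learned linear projections of the same token embeddings and runs many such attention heads in parallel.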

How to understand and visualize a language model

  • Probability visualization
  • Attention visualization
  • Probing classifiers
  • Logit lens
  • Causal traces

What makes a good representation

  • Semantic vector composition
  • Disentanglement
  • Transfer learning



Tentative Class Schedule

(Note: this schedule is likely to change; we may omit some topics or substitute others.)

Date Note Handout Due Topic Activity Reading
Tuesday, January 7, 2025 HW1, reading signups History: neurons, perceptrons, universality, backprop. david https://papers.baulab.info/
Thursday, January 9, 2025 Signups Tensors, GPUs, Autograd, Optimizers, Modules, DataLoader How-to-read-pytorch Rumelhart 1986 (Backpropagation)
Tuesday, January 14, 2025 Classification and backpropagation david Bottou 1990 (Modules)
Thursday, January 16, 2025 Optimization and deep networks presentation Kingma 2015 (ADAM)
Tuesday, January 21, 2025 HW2 HW1 Initialization presentation Glorot 2010 (initialization)
Thursday, January 23, 2025 Last drop day Equivariances and convolutions david LeCun 1989 (LeNet)
Tuesday, January 28, 2025 Achieving Depth: residual nets, batchnorm presentation He 2016 (ResNet)
Thursday, January 30, 2025 David in DC [special help day] (no lecture)
Tuesday, February 4, 2025 HW3, project signup HW2 Neural Language modeling david Elman 1990 (RNNs)
Thursday, February 6, 2025 Recurrent neural networks and gating presentation Cho 2014 (GRUs)
Tuesday, February 11, 2025 Transformers presentation Vaswani 2017 (Transformers)
Thursday, February 13, 2025 Multimodal representation learning: CLIP presentation Radford 2021 (CLIP)
Tuesday, February 18, 2025 HW4 HW3 Midterm review david
Thursday, February 20, 2025 David in DC Midterm Midterm exam midterm
Tuesday, February 25, 2025 Image generation david
Thursday, February 27, 2025 Project titles and teams due Adversarial generation: GANs presentation Goodfellow 2014 (GAN)
Tuesday, March 4, 2025 Spring break
Thursday, March 6, 2025 Spring break
Tuesday, March 11, 2025 Variational Autoencoders: VAEs presentation Kingma 2013 (VAE)
Thursday, March 13, 2025 HW4 Normalizing Flows presentation Dinh 2017 (Real NVP)
Tuesday, March 18, 2025 Diffusion Models presentation Sohl-Dickstein 2015 (Diffusion)
Thursday, March 20, 2025 Large language models presentation Brown 2020 (GPT-3)
Tuesday, March 25, 2025 RLHF and DPO presentation Ouyang 2022 (RLHF)
Thursday, March 27, 2025 How to do research david
Tuesday, April 1, 2025 Project abstracts due Mechanistic interpretability david Meng 2022 (ROME)
Thursday, April 3, 2025 Mamba guest Sen Sharma 2024 (Mamba)
Tuesday, April 8, 2025 Project reviews due In-context learning guest Todd 2024 (In-context Learning)
Thursday, April 10, 2025 Model editing guest Gandikota 2024 (Unlearning)
Tuesday, April 15, 2025 Poster day final projects
Monday, April 21, 2025 final project report due (midnight) final projects