Syllabus for CS 7150 Spring 2025

Class time and Location: Tuesdays 11:45 AM - 1:25 PM and Thursdays 2:50 PM - 4:30 PM, Dodge Hall 173

The class meetings are in-person only.

The class is very participatory, with groups of students presenting and discussing papers on most days. On the first day of class, please fill out https://bit.ly/cs7150-roles25 to sign up.

Sign up for piazza here: https://piazza.com/northeastern/spring2025/cs715041859202530/home

Professor: David Bau davidbau@northeastern.edu

Professor Office Hours: Thursdays at 2 PM, right before class, at the lecture hall.

TAs: Vatika Tewari tewari.v@northeastern.edu

TA help hours:

To be posted.

Summary

In this course we will learn the principles and practice of deep learning methods and research.

We will cover the capabilities of deep networks, the main methods for training them effectively, and the common architectures and techniques for using a deep network to process and produce images and natural language text.  In addition to gaining hands-on experience with these methods, we will discuss seminal research papers, survey some of the main questions and debates that have emerged in deep learning research, and explore some current research topics.  There will be a final project where you work with a partner to choose, investigate, and write a blog exploration of a deep learning research topic centered on a paper you choose.

Grading

We are planning 160 total points of coursework.

Class participation: 40 points.  In-class paper presentations and discussion.

Semester Overview: Throughout the semester, 15 lectures will be dedicated to discussing the 15 selected research papers, each led by a panel of students who present the paper and field follow-up questions from the audience. These papers and their dates are pre-scheduled and can be seen in the class schedule below.

Points Breakdown & To Dos (40 Points):

  1. Nightly Questions: The night before each paper discussion, read the assigned paper and submit a question on Canvas for Prof. Bau or the review panel, along with a brief explanation of why you think the question is important or reasonable. Each submitted question earns 1 point, for a total of 12 points.
  2. In-Class Interactions: Prof. Bau may randomly call on students during any lecture, be it a paper presentation or a standard class. Students earn points from being called on and from other active participation; class participation is 7 points.
  3. Paper Presentation & Review: Three times during the semester, you'll be assigned one of the roles listed below, each of which asks you to review and present a paper from a different perspective. This collaborative effort, done with several other students, requires the creation of a slide deck (more on that below). Successfully leading these discussions can earn you up to 7 points per presentation, for a total of 21 points.

Roles (several students for each paper):

  1. Diagrammer (1 student): Create a slide visually explaining the method of the paper.
  2. Reviewers (up to 2 students): Address NeurIPS review questions. Summaries to be included in the last slide and detailed reviews uploaded to a specific Google Drive folder.
  3. Archaeologist (1 student): Offer historical context relevant to the paper.
  4. Private Investigator (up to 1 student): Delve into the backgrounds and motivations of the paper's authors.
  5. Academic Researcher (up to 1 student): Suggest a potential future academic project based on the paper's findings or method.

Format & Preparation:

- Duration: Each paper discussion will last 45 minutes.

- Roles & Sign-up: Students are required to fill the role-playing sign-up form on Canvas before January 9. Here, you can indicate your role preference for each paper. While we aim to honor your preferences, please understand that adjustments might be necessary.

- Slide Deck & Materials Submission: All panel participants should prepare and submit the required materials the night before the scheduled discussion. This includes:

  - A Google Slides presentation, only for those in designated roles (see the 'INSTRUCTIONS: For reference purposes' file in the shared drive, which details for each role what to prepare, where, and which files to upload)

  - A question about the paper for the broader class (Nightly Questions through canvas)

  - Acknowledgment of the role you're playing for that paper (if you have a role for a particular paper, select it from the options within the same nightly question assignment)

Class Procedure (during presentation):

  1. Slide Presentation (10-15 mins): Professor Bau will guide the class through the Google Slides, with each student briefly explaining (in under a minute) their slide and findings.
  2. Open Discussion: TAs will select students at random to pose their previously submitted questions to the panel.

Homework: 40 points.  Programming and calculation exercises.

These will be Jupyter notebooks to be worked through by students individually, submitted online every two weeks by 11 AM before Tuesday class.  Late homework submissions will be accepted but lose points for each day late, with no credit after a week.  10 points per homework.

Midterm: 40 points.  A written exam about foundational methods.  Closed-book.

Final Project: 40 points.  A blog/webpage report and presentation about a research paper that you choose. Done by groups of 2 students (3 with permission).  Similar to the paper-reading roleplaying exercise, except that you choose the papers, you play all the roles, and instead of putting together slides, you will assemble an illustrated blog with all your analysis.  Optionally, your project may include a programming demonstration or a research extension of the concepts in the paper that you choose.

 

Late day policy

Late homework submissions will be accepted.  Every student has a bank of 3 free late days that can be applied without asking permission (e.g., 2 days on one homework, 1 day on another).  Beyond the 3 banked late days, a 20% penalty is applied automatically for each day late.

 

Collaboration Policy

Collaboration is allowed, but you should think about the problems yourself before discussing them with others.  When you do seek out help, we strongly advise you to find a fellow classmate to talk with and work together rather than copying an answer. You will all learn much more by thinking collaboratively and explaining ideas to one another.  When you collaborate, you must acknowledge your collaborators by listing them explicitly.

 

Academic integrity policy

Please read the university's academic integrity and plagiarism policies.

 

Topic Map for the Course


List of important concepts to know after this course

*Starred items may be on the midterm exam.

 

The historical and intellectual evolution of deep network methods:

  • From Ramon y Cajal to McCulloch-Pitts Neurons to Cybenko's Universal Approximation*
  • Comparing Rosenblatt’s Perceptron Learning to Rumelhart-Hinton’s Backpropagation*
  • Lettvin’s Grandmother Neuron idea vs Hinton’s Parallel Distributed Processing Model*
  • Hinton-Salakhutdinov’s Layerwise Pretraining vs Glorot-Bengio’s End-to-End Training*
  • From Hubel and Wiesel to LeCun’s LeNet to Krizhevsky’s AlexNet*
  • From Elman’s Finding Structure in Time to GPT-3 Few-Shot Learning

How to use deep network tools (pytorch)

  • GPU tensors*
  • Autograd / backpropagation*
  • SGD and Adam and other optimizers*
  • Network modules*
  • Fast data loading*
  • Using pretrained models*
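
The pieces above (tensors, autograd, an optimizer, a module, and a DataLoader) fit together in a short training loop. Here is a minimal illustrative sketch, not course-provided code; the toy regression data, model size, and hyperparameters are invented for illustration:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Invented toy data: y = 3x + 1 plus a little noise
X = torch.randn(256, 1)
y = 3 * X + 1 + 0.1 * torch.randn(256, 1)
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

model = nn.Linear(1, 1)                          # a network module
opt = torch.optim.SGD(model.parameters(), lr=0.1)  # the optimizer

for epoch in range(20):
    for xb, yb in loader:                        # fast batched data loading
        loss = nn.functional.mse_loss(model(xb), yb)
        opt.zero_grad()
        loss.backward()                          # autograd computes gradients
        opt.step()                               # SGD updates the weights
```

After training, `model.weight` and `model.bias` should land near the generating values 3 and 1. On a GPU machine, moving `model` and the batches with `.to("cuda")` is the only change needed.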

How to train a model deeper than a few layers

  • The backpropagation algorithm*
  • Vanishing and exploding gradient*
  • Xavier or Kaiming initialization*
  • ReLU vs Sigmoid vs Threshold nonlinearities*
  • Residual connections*
  • Batch normalization*
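
To make the backpropagation item concrete, here is a pure-Python sketch of the chain rule on a two-weight network with one ReLU unit, checked against finite differences; the tiny network and the numeric values are invented for illustration:

```python
# Tiny network: y = w2 * relu(w1 * x), squared-error loss (y - t)^2
def forward(w1, w2, x, t):
    h = max(0.0, w1 * x)                 # ReLU hidden unit
    y = w2 * h
    return (y - t) ** 2

def grads(w1, w2, x, t):
    # Manual backpropagation via the chain rule
    h = max(0.0, w1 * x)
    y = w2 * h
    dL_dy = 2 * (y - t)
    dL_dw2 = dL_dy * h                   # gradient for the output weight
    dL_dh = dL_dy * w2                   # backprop through the output weight
    dh_dz = 1.0 if w1 * x > 0 else 0.0   # ReLU derivative (0 where it clamps)
    dL_dw1 = dL_dh * dh_dz * x           # gradient for the input weight
    return dL_dw1, dL_dw2

# Sanity check against central finite differences
w1, w2, x, t = 0.5, -1.5, 2.0, 1.0
g1, g2 = grads(w1, w2, x, t)
eps = 1e-6
fd1 = (forward(w1 + eps, w2, x, t) - forward(w1 - eps, w2, x, t)) / (2 * eps)
fd2 = (forward(w1, w2 + eps, x, t) - forward(w1, w2 - eps, x, t)) / (2 * eps)
```

The same recursive application of the chain rule, applied layer by layer from the loss backward, is what autograd automates for deep networks.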

How to improve generalization

  • Holdout tests*
  • L2 Weight decay*, L1 regularization*
  • Dropout*
  • Early stopping*
  • Data augmentation*
  • Denoising*
  • Weight sharing, invariants and equivariances*
  • Double-descent
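
One way to see L2 weight decay concretely: adding lam * w**2 to the loss is equivalent to adding 2 * lam * w to the gradient, which shrinks each weight toward zero at every step. A toy one-dimensional sketch (the loss and constants are invented for illustration):

```python
lam, lr = 0.01, 0.1
w = 5.0

def dL_dw(w):
    # Gradient of an invented toy loss L(w) = (w - 3)^2
    return 2 * (w - 3)

for _ in range(200):
    # Gradient step with the weight-decay term 2 * lam * w added
    w -= lr * (dL_dw(w) + 2 * lam * w)

# Fixed point: 2(w - 3) + 2*lam*w = 0  =>  w = 3 / (1 + lam)
```

The regularized optimum 3 / (1 + lam) sits slightly below the unregularized optimum 3, showing how weight decay biases solutions toward smaller weights.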

Learning to Learn

  • Metalearning
  • Prompting

Doing research

  • Reading papers
  • Writing papers

 

How to design effective objectives

  • Softmax and classification*
  • Cross-Entropy loss, KL divergence, and distribution-matching*
  • Autoencoders* and Denoising Autoencoders*
  • L2 reconstruction loss and L1 median-finding loss*
  • Contrastive learning*
  • Variational Autoencoders
  • Diffusion Models
  • Normalizing Flows
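
Softmax and cross-entropy can be sketched in a few lines of plain Python (the example logits below are invented). Note that cross-entropy reduces to the negative log of the probability the model assigns to the correct class:

```python
import math

def softmax(logits):
    m = max(logits)                        # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits, target):
    # CE loss = -log p[target], where p = softmax(logits)
    return -math.log(softmax(logits)[target])

p = softmax([2.0, 1.0, 0.1])               # probabilities summing to 1
loss = cross_entropy([2.0, 1.0, 0.1], 0)   # small when class 0 is likely
```

Minimizing this loss over a dataset matches the model's predicted distribution to the empirical one, which is why it is equivalent (up to a constant) to minimizing KL divergence.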

How to input images

  • Convolutional Layers*
  • Pooling*
  • AlexNet, VGG, and ResNet*

How to output images

  • U-Nets*
  • Perceptual losses
  • Generative Adversarial Networks

How to understand and visualize a convolutional network

  • Adversarial examples
  • Gradient visualizations
  • Saliency maps
  • Feature visualizations and dissection

How to input and output text

  • Word embedding*
  • Recurrent networks*
  • Language models*
  • LSTM and GRU*
  • Attention and Transformers*
  • BERT, GPT, and T5
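
The attention item can be sketched in plain Python as scaled dot-product attention, softmax(Q K^T / sqrt(d)) V, over lists of vectors; the example query, keys, and values below are invented for illustration:

```python
import math

def attention(Q, K, V):
    # Scaled dot-product attention: each query mixes the value rows,
    # weighted by softmax of its scaled dot products with the keys.
    d = len(Q[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        m = max(scores)                           # stabilize the softmax
        exps = [math.exp(s - m) for s in scores]
        Z = sum(exps)
        weights = [e / Z for e in exps]
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# The query aligns with the first key, so the output is close to the first value row
out = attention([[10.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]])
```

A Transformer layer computes Q, K, and V as learned linear projections of the same token embeddings and runs many such attention heads in parallel.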

How to understand and visualize a language model

  • Probability visualization
  • Attention visualization
  • Probing classifiers
  • Logit lens
  • Causal traces

What makes a good representation

  • Semantic vector composition
  • Disentanglement
  • Transfer learning



Tentative Class Schedule

(Note: this schedule is likely to change; we may omit some topics or substitute others.)

Date Note Handout Due Topic Activity Reading
Tuesday, January 7, 2025 HW1, reading signups History: neurons, perceptrons, universality, backprop. david https://papers.baulab.info/
Thursday, January 9, 2025 Signups Tensors, GPUs, Autograd, Optimizers, Modules, DataLoader How-to-read-pytorch Rumelhart 1986 (Backpropagation)
Tuesday, January 14, 2025 Classification and backpropagation david Bottou 1990 (Modules)
Thursday, January 16, 2025 Optimization and deep networks presentation Kingma 2015 (ADAM)
Tuesday, January 21, 2025 HW2 HW1 Initialization presentation Glorot 2010 (initialization)
Thursday, January 23, 2025 Last drop day Equivariances and convolutions david LeCun 1989 (LeNet)
Tuesday, January 28, 2025 Achieving Depth: residual nets, batchnorm presentation He 2016 (ResNet)
Thursday, January 30, 2025 David in DC [special help day] (no lecture)
Tuesday, February 4, 2025 HW3, project signup HW2 Neural Language modeling david Elman 1990 (RNNs)
Thursday, February 6, 2025 Recurrent neural networks and gating presentation Cho 2014 (GRUs)
Tuesday, February 11, 2025 Transformers presentation Vaswani 2017 (Transformers)
Thursday, February 13, 2025 Multimodal representation learning: CLIP presentation Radford 2021 (CLIP)
Tuesday, February 18, 2025 HW4 HW3 Midterm review david
Thursday, February 20, 2025 David in DC Midterm Midterm exam midterm
Tuesday, February 25, 2025 Image generation david
Thursday, February 27, 2025 Project titles and teams due Adversarial generation: GANs presentation Goodfellow 2014 (GAN)
Tuesday, March 4, 2025 Spring break
Thursday, March 6, 2025 Spring break
Tuesday, March 11, 2025 Variational Autoencoders: VAEs presentation Kingma 2013 (VAE)
Thursday, March 13, 2025 HW4 Normalizing Flows presentation Dinh 2017 (Real NVP)
Tuesday, March 18, 2025 Diffusion Models presentation Sohl-Dickstein 2015 (Diffusion)
Thursday, March 20, 2025 Large language models presentation Brown 2020 (GPT-3)
Tuesday, March 25, 2025 RLHF and DPO presentation Ouyang 2022 (RLHF)
Thursday, March 27, 2025 How to do research david
Tuesday, April 1, 2025 Project abstracts due Mechanistic interpretability david Meng 2022 (ROME)
Thursday, April 3, 2025 Mamba guest Sen Sharma 2024 (Mamba)
Tuesday, April 8, 2025 Project reviews due In-context learning guest Todd 2024 (In-context Learning)
Thursday, April 10, 2025 Model editing guest Gandikota 2024 (Unlearning)
Tuesday, April 15, 2025 Poster day final projects
Monday, April 21, 2025 final project report due (midnight) final projects