Seminar course: Bridging Language in Machines and Language in the Brain

Current natural language processing (NLP) models (e.g. ChatGPT and GPT-4) have impressive capabilities, but how closely do they actually align with the capabilities of the only system that truly understands complex language: the human brain? In this seminar, we will review work that studies the existing alignment between the representations of language constructed by NLP models and the representations of language in the human brain obtained from brain imaging devices, as humans and models process the same language input. We will discuss the reasons for the existing alignment and some of the established remaining gaps. We will additionally review work that aims to bring NLP models closer to the human brain. Lastly, students will have the opportunity to propose and complete related projects.

Instructor: Mariya Toneva
Teaching Assistant: Gabriele Merlin

Course structure

The course consists of five main components that contribute to the final grade as follows:

  1. Attendance and participation during class (20%)
  2. Presentations and reports of research papers: one presentation (10%) and two reports (10% in total)
  3. Project proposal (20%)
  4. Project report and code (20%)
  5. Final presentation (20%)

Introductory lectures

We will begin the course with two lectures. These introductory sessions aim to familiarize students with the fundamental concepts in machine learning and neuroscience that are needed to engage with research at the intersection of the two fields. The dates and locations of the lectures are specified in the timeline below.

Research paper presentations and reports

Over the course of the semester, two days will be dedicated to discussing relevant research. On each of these days, we will discuss 4 research papers from the list below. On one of these days, each student will give a 20-minute presentation on one of the papers; on the other, the student will submit two 2-page reports on 2 of the papers (1 report per paper; reports should use the NeurIPS template). The research papers will be randomly assigned. Reports and presentation slides are due by the start of class. The dates and locations of the research paper discussions are specified in the timeline below.

Please structure the presentation/report as follows:

  • a short summary of the paper
  • a discussion of how the paper extends the state of the art
  • the main strengths of the paper
  • the main weaknesses of the paper
  • a discussion of how the paper could be improved

If you wish, you may also pursue these ideas as part of your project.

Project

Students will have the opportunity to pursue a research project of their choice, or to select a topic from a pre-defined set. As part of the project, students will be expected to:

  • submit a project proposal outlining the research question, proposed methodology, dataset, and closest related work
  • incorporate feedback from the instructors about the proposed project
  • complete the project and submit a report detailing the project and findings, as well as any related code
  • present the project (20-minute presentation)

We ask that the project proposal be 2 pages and the project report be 8 pages, with the opportunity to submit additional content as a supplementary file. All reports should use the NeurIPS template.

Timeline

The course will meet in person. See below for the dates and locations of each class. Note that class attendance and participation will both contribute to the final grade. We will block about 8 hours of time for the final project presentations. The exact dates will be finalized in discussion with enrolled students. Attendance at the final presentations will be mandatory.

Date    Event                                                      Time & Location
Oct 31  Lecture 1: LLMs and estimating alignment [slides]          1-2:30pm, Room 005, E1 5
Nov 7   Lecture 2: Neuroscience [slides]                           1-2:30pm, Room 005, E1 5
Nov 21  Paper presentations & reports                              1-4pm, Room 105, E1 5
Dec 5   Paper presentations & reports                              1-4pm, Room 105, E1 5
Dec 12  Office hours for projects                                  3-5pm, Online
Dec 19  [Optional] First draft of project proposals due
Jan 9   Final project proposals due
Jan 16  Instructors send feedback on project proposals via email
Jan 23  [Optional] Office hours for projects                       1-3pm, Room 438, E1 5
Mar 6   Project report & code due
Mar 13  Final project presentations                                9am-4pm, Room 029, E1 5

List of research papers

Measuring the alignment between brains & LMs:

Understanding the reasons behind the existing alignment:

Improving the alignment further: