reviewing MIT’s Intro To Deep Learning by Aiden H. '28
the highs and lows of the beloved IAP class
This past week was the first week of IAP (Independent Activities Period, aka the month of January where students can take a long winter break, take unique, accelerated classes, just do research, or participate in a MISTI program, among other things). Along with a UROP (Undergraduate Research Opportunities Program, aka undergrad research), I took Intro to Deep Learning (or 6.S191, which no one calls it), an annual class taught by two former MIT students turned power couple, Alexander and Ava Amini.
Since the class started, it has kind of blown up due to all of the lectures and labs being open-sourced on the website and YouTube, with this Obama deepfake specifically making the rounds in 2020.
Given the opportunity to take such a famed class, I decided to give my unabashed (have I ever been abashed though) thoughts.
Lectures
The class ran from Monday to Friday with a scheduled 1:00-4:00 P.M. lecture time. Each day was scheduled for “two” lectures, one given by Alexander and one given by Ava. Typically they lasted only a little over an hour each, meaning I ended up leaving around 3:15 each day while the remaining lecture time was used as office hours with the TAs.
Having no prior knowledge of deep learning except that one 3Blue1Brown video, I found the lectures relatively straightforward in terms of how they explained the foundational architecture around different deep learning networks. Since there was extra time every day, the lecturers were also very open to answering lots of questions. (Side note: since the class was open to the public, it was so funny being in a lecture hall with adults who probably haven't been in a lecture for years. like they really just blurt out questions no filter no hand raise no shame. even in a lecture hall of like 250.) Still, the length of pure lecture on completely new, high-level material felt pretty draining. Because Alexander and Ava are high-up industry professionals who probably can’t take more than a week off of work, I understand the pressed schedule. From a purely logistical standpoint, I think two weeks with more time to digest and review material would have solidified my understanding more. For example, they went over CNNs, VAEs, and GANs in one day, and of course by the next morning I couldn’t articulate to you the differences between them because we went straight into types of reinforcement learning.
Additionally, I understand why, with such a short timeframe and an attempt to be open-sourced to as many people as possible, the lectures were purely theoretical, but I kind of needed and wanted the mathematical basis behind a lot of these structures. To say they skimmed over the most important math would be false (the first lectures leaned heavily on matrix operations, propagation with non-linearities, the loss function and gradient descent, etc.), but once applied to different structures, I felt lost on exactly what was happening between layers or even across structures. How did each differ mathematically? Is there a way to view vector and matrix values at each point in a deep neural network? Again, as I don’t know much about deep learning, I don’t know if this is a niche thing to call out or an important topic skimmed over, but for an MIT crowd, I want to see the numbers move! Maybe they didn’t want to assume that much linear algebra knowledge though.
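For what it's worth, there is a way to watch the numbers move. Below is a minimal sketch, not taken from the course materials, of a tiny two-layer network in plain NumPy that records the vector at every step of the forward pass. The layer sizes, the random weights, the input, and the helper name `forward_with_trace` are all made up for illustration.

```python
import numpy as np

def forward_with_trace(x, layers):
    """Run x through a list of (W, b) layers, with ReLU between layers,
    and record every intermediate vector along the way."""
    trace = [("input", x)]
    a = x
    for i, (W, b) in enumerate(layers, start=1):
        z = W @ a + b  # linear step: matrix multiply plus bias
        trace.append((f"layer {i} pre-activation", z))
        if i < len(layers):
            a = np.maximum(z, 0.0)  # ReLU non-linearity on hidden layers only
            trace.append((f"layer {i} activation", a))
        else:
            a = z  # last layer is left linear in this sketch
    return a, trace

# Hypothetical network: 3 inputs -> 4 hidden units -> 2 outputs.
rng = np.random.default_rng(0)
layers = [
    (rng.normal(size=(4, 3)), np.zeros(4)),
    (rng.normal(size=(2, 4)), np.zeros(2)),
]
x = np.array([1.0, -2.0, 0.5])

output, trace = forward_with_trace(x, layers)
for name, value in trace:
    print(f"{name:>24}: shape={value.shape} values={np.round(value, 3)}")
```

Each printed row is the state of the data at one point in the network, so you can see where the ReLU zeroes out negative entries and how the shape changes from layer to layer. Frameworks offer the same idea on real models; in PyTorch, for example, `register_forward_hook` on a module lets you capture its output during a forward pass.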
After three days of intro lectures, Thursday and Friday had class sponsors come give guest lectures on a variety of topics to give more specifics on industry-specific applications. I found these to be very hit or miss with how informative they actually were. Ava gave the last guest lecture on deep learning for biology (which duh was my favorite), but because I personally am only interested in that topic compared to something like LLMs, I got pretty bored by other lectures no matter how engaging and passionate the presenters were. For students who genuinely come into the class without any knowledge of deep learning (which is becoming increasingly less likely), the variety of these guest lectures could help inform how they want to enter the field. For students or industry professionals who have already chosen a sector, these were less helpful, since tactical knowledge > inspiration.
Nonetheless, there is something really important to say about both Alexander and Ava’s ability to teach these topics to such a broad audience (in person and online) and also their desire to apply them to the modernizing deep learning landscape. Although I don’t know how different this year was compared to previous ones, they made sure to focus on how everything can be applied to the rise of now-widespread tech like LLMs/chatbots or self-driving cars that exploded in recent years.
Software Labs
In tandem with the first three lectures, there were three online coding “labs” that walked through some of the basics of deep learning implementation. The labs themselves were pretty straightforward, not requiring students to write the code for an entire model but instead only fill in some lines of code here and there, but they had pretty cool implementations (a song generator, a face debiasing system, and a chatbot that speaks like Yoda). The labs were not required to pass the class, but completing them and uploading them along with a written explanation of what you did allowed you to enter into a contest for a prize (headphones, a desktop monitor, and a Kindle, respectively).
Starting with the pros, the labs each had very interesting applications that definitely helped distract from the fairly monotonous complexity that learning abstract computer science in a lecture style has. Also not having them all be required took some weight off of the beginning of the week, allowing me to choose which labs I was more interested in completing, considering each one took at least an hour to complete and hours more to hone for competition’s sake.
Now for the rant. I would say that the access to labs of this quality is definitely something to be grateful for, also considering they are open-sourced for millions. But the quality of the work derived from the labs was not worth the amount of time spent on them. Although it's unrealistic for beginner-level engineers to create their own models, a majority of the labs were spent in Google Colab (the bane of my existence) just importing things I’ve never heard of and running patches of code that were vaguely explained but that I couldn’t possibly begin to create on my own. Also when trying to better my code, I quickly ran into the free GPU limit on Colab so I ended up having to redo it all on separate Google accounts (which on Colab means rerunning every cell, some of which took like 30 minutes because, yk, training takes a while). A similar issue occurred when using the external sites for APIs and exports they had recommended because they were created by the course sponsors/guest lecturers (all of which is nice, but highkey they weren’t that user friendly or reliable software). Finally, one of the cells of code that was prewritten straight up didn’t work and I (ChatGPT) tried to fix it, but because I didn’t have access to the dictionary they were using (it was imported), I didn’t even know where to start.
On lab 3 (training an LLM to speak like Yoda), I tried for like 6+ hours over multiple days to try and create the best model in order to win the contest, but I kept giving up because factors beyond my control kept limiting what I could do and made the process very laborious (see: I ran out of GPU so I switched accounts which meant running the hour of code again only to notice my free API key had expired so I had to do it again the next morning but Colab sucks and actually didn’t save right etc. etc. etc.). In the end I didn’t even have a final product.
Nonetheless, all of my bitching and moaning can’t be taken as too serious of a negative, because as stated before, none of this was technically required, it just did give me a bit of hassle.
Final Presentation
For MIT students taking the class for credit, there were two options for a final project to be submitted by Friday’s lecture: a one-pager reviewing a deep learning paper (boring!) or a 5 minute group presentation on a deep learning proposal to be given at the end of Friday’s lecture (definitely less boring!). Doing the proposal also put MIT-based teams up for the “grand prizes” of Apple Watches, desktop monitors, and an NVIDIA 4070 GPU (aka something I could've sold for like at least $500).
The night before it was due, I worked with some other course 6-7 [computer science (course 6) and bio (course 7) interdisciplinary major] freshman friends on an application that we all had general UROP experience with, and imo we cranked out a pretty good project. (Actually full credit to Grace C. '28 for staying up after we all went to sleep to literally download a bunch of AI software to run our THEORETICAL lab until 6 A.M. just because she wanted data to flex during our presentation. a truly dedicated 6-7.) Tragically, at midnight the night before presentations, the course staff announced that because so many people opted to present over the paper, presentations had to be cut from 5 minutes to 3 minutes. **cue major panic**
After the final two guest lectures on Friday, we got to sit and listen to everyone’s final project proposals, which was pretty interesting considering people taking the class ranged from MIT freshmen to students at other universities who travelled for the class to literal business professionals with like master’s degrees who took off work to take this for fun. Ngl though, having to listen to two hours of presentations after two hours of lecture was verrrry draining and I was verrrry over it in the end. You could also definitely tell that most presentations were cut short or glossed over because of the time, but I also understand why we couldn’t all stay there until 8:00 P.M. listening to projects. They even brought out a mini gong to ring (hit? strike? play?) when your team’s time was up (even though a lot of people kept talking anyway!!!!!!!), which was hilarious.
In the end my team didn’t place top 3, but that’s okay (jk this is me cosplaying as not a sore loser. it is okay but duh I think we should've won lol) because there were a ton of cool and unique projects. You can see all final presentation slides here (…I was in group 6)!
Final Scores
(Important: I rank things on what is a perfect bell curve in my mind (5 is average), NOT a school grading system (7 is passing, 5 is failing).)
How much I learned: 8/10
Difficulty/homework: 6/10
Pacing: 4/10
How much free stuff I got: 6/10 (I got one shirt, but that’s definitely above average for a class, unless you count like, credit that counts towards my degree and my future or whatever)
Overall: 6/10 – a slightly above average class
tl;dr fine class lol