I’m Not Normal About Friends (Part 1) by Jenny B. '25
i lost six nights of sleep to perform amateur data analysis on a sitcom
I started watching Friends a few weeks ago. I thought it was going to be an innocuous little break from the stress-inducing shows that I usually choose to watch like The Bear or Black Mirror.
Then out of the blue, 10+ weeks of non-stop assignments gave way to a week of utter tranquility. No assignments due. No exams. I suddenly had a lot of time on my hands.
Like generations of other clueless young adults before me, I was sucked into the Friends zone. I love the show.
Don’t know what Friends is about? Here’s a quick rundown.
Am I obsessed? Sure. Use whatever word you want. In a bout of curiosity-fueled insanity that is not uncommon in the mind of a sleep-deprived college student, I sacrificed six nights’ worth of sleep (an offering of my life source for each Friend) to generate data from the Friends script and make sense of it. (I do NOT encourage sleep deprivation, by the way. I almost got a migraine on the third night. There were a lot of risks in making this post.) I wanted numbers. I wanted hidden knowledge.
What’s my purpose for all of this? To find out who’s the best Friends character? No need to. Each character has their own collection of traits that makes all of them equally endearing, and it would be unfair to judge them through such a rigid standard. (It’s Chandler.)
My purpose is this: What do certain pieces of data reveal about the characters and their relationships, if they reveal anything at all?
I split my results into two parts. This post is Part I, which focuses on the dialogue in the show. Part II will focus on the on-screen interactions between the Friends, and I’ll release that sometime in the future.
Table of contents
- 0 – Overview of Part I
- 1 – Who speaks the most?
- 2 – Who says the most words?
- 2.5 – What about Pheebs?
- 3 – Who says the most words on average?
0 – Overview of Part I
The dialogue in Friends is one of the most important elements of the show. Sure, it’s a sitcom, and it’s in the nature of a sitcom to rely heavily on dialogue. But that shouldn’t take away from how well the writers gave each character a distinctive voice while balancing the dialogue so every Friend consistently plays a major role.
I wanted to answer three questions.
- How many times does each character get to speak? Who participates in conversations the most?
- How many words did each character say? Who had the most to say overall?
- How many words did each character say on average? How much does a character tend to say? What does this say about each character’s speech patterns?
To illustrate the difference between these three, imagine that the entire show was just these lines shared between Chandler and Joey.
- I equated the number of times that a character speaks with the number of character cues that they have in the script. In other words, the number of lines that a character has. For example, “Chandler:” appears once in this over-abridged version of Friends, and “Joey:” appears twice.
- Word count is self-explanatory.
- I calculated the number of words that a character spoke on average by dividing their word count by the number of times they speak.
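To make the three metrics concrete, here is a minimal sketch in plain Python. The mini-script is a hypothetical stand-in (placeholder quotes, not real dialogue) that only matches the shape described above: one “Chandler:” cue and two “Joey:” cues. The variable names are mine, not from my actual code.

```python
from collections import Counter

# Hypothetical stand-in script: one Chandler cue and two Joey cues.
# The quotes are placeholders, not real dialogue from the show.
script = [
    ("Joey", "placeholder line one"),
    ("Chandler", "a somewhat longer placeholder line here"),
    ("Joey", "short one"),
]

line_counts = Counter()  # number of character cues per speaker
word_counts = Counter()  # total words per speaker

for speaker, quote in script:
    line_counts[speaker] += 1
    # Words are counted with a simple whitespace split (an assumption)
    word_counts[speaker] += len(quote.split())

# Average words per line = total words / number of times spoken
avg_words = {name: word_counts[name] / line_counts[name] for name in line_counts}

for name in line_counts:
    print(name, line_counts[name], word_counts[name], avg_words[name])
```

In this toy version Joey speaks more often, but Chandler says more words each time he speaks, which is exactly the kind of difference the third question is meant to catch.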
To apply these questions to the entire show, I used this dataset that documents every quote that each character says in order of appearance in the script, as well as what episode/season the quote occurs in. I used Python libraries NumPy and pandas for my calculations, and matplotlib and seaborn to make pretty pictures out of the data I calculated.
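For the curious, a rough sketch of how the same three numbers could be computed with pandas. This is not my exact code. The tiny DataFrame is a hypothetical stand-in for the dataset, and the column names (`season`, `author`, `quote`) are assumptions that should be checked against the real CSV.

```python
import pandas as pd

# Tiny hypothetical stand-in for the transcript dataset. The column names
# ("season", "author", "quote") are assumptions, and the quotes are placeholders.
df = pd.DataFrame({
    "season": [1, 1, 1, 2],
    "author": ["Ross", "Rachel", "Ross", "Phoebe"],
    "quote": ["placeholder text", "another placeholder quote", "hi", "one two three four"],
})

# Words per quote, using a simple whitespace split
df["n_words"] = df["quote"].str.split().str.len()

# One row per character: times spoken, total words, words per line
stats = df.groupby("author").agg(
    lines=("quote", "count"),
    words=("n_words", "sum"),
)
stats["avg_words_per_line"] = stats["words"] / stats["lines"]
print(stats)
```

With the real data, the stand-in DataFrame would be replaced by a `pd.read_csv(...)` call on the downloaded file, and the rest should carry over as long as the column names match.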
Disclaimer: I’m not a data scientist, or whatever Chandler’s job is supposed to be.
1 – Who speaks the most?
It makes sense why Rachel and Ross take the lead here. The Rachel-Ross plotline is one of the longest and most prominent dynamics in the show. Ross pines for Rachel starting from Season 1, Rachel pines for Ross, and then they get together, and I guess other stuff happens too. I wouldn’t know, I’ve only watched up to Season 3 so far. But I know that their relationship keeps changing over time.
That raises another question. Do the characters’ stats change with each season as the dynamics between them evolve throughout the series?
This line graph plots every character’s percentage of lines within a season, for every season.
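The numbers behind a graph like this can be sketched as follows: count the lines per character within each season, then divide by that season’s total. Again, the rows and column names here are hypothetical stand-ins, not the real dataset.

```python
import pandas as pd

# Hypothetical stand-in rows: one row per spoken line.
lines = pd.DataFrame({
    "season": [1, 1, 1, 1, 2, 2],
    "author": ["Ross", "Ross", "Rachel", "Chandler", "Rachel", "Chandler"],
})

# Count lines per (season, character), then pivot into a
# seasons x characters table; missing pairs become 0.
counts = lines.groupby(["season", "author"]).size().unstack(fill_value=0)

# Divide each row by its season total so every season sums to 100%
pct = counts.div(counts.sum(axis=1), axis=0) * 100
print(pct.round(1))
```

With matplotlib installed, calling `pct.plot()` on a table shaped like this draws one line per character across the seasons.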
Features that I immediately noticed in this multicolored landscape:
Rachel & Ross – Although it’s obvious where Rachel and Ross aggregated most of their stats, I wanted to know why Ross was so prominent in the earlier parts of the show and Rachel was more prominent in the later parts.
Chandler – In Season 8 he hit not only his own lowest percentage, but the lowest percentage in the entire graph. I even graphed the data with raw totals for each character instead of converting them into percentages, and the drop is still there. Because he’s the most goated character in the entire show (okay, fine: because he’s in a relationship with Monica starting from S4), I got worried that something had happened to him and that I had inadvertently spoiled something dreadful for myself.
I looked up recaps of the seasons to figure out what happened during these highs and lows. I used a reputable academic source to make a rough timeline for both Rachel/Ross and Monica/Chandler.
The whole Rachel/Ross thing is a lot messier than I thought it was. I’d have to look closer into the script to prove whether there’s any clear causation between relationship status and number of lines (correlation doesn't necessarily imply causation!), but you can kind of make out a rough correlation.
You can also kind of see a correlation with Monica/Chandler. The good news is that the writers didn’t nerf Chandler in S8—MonChan were just settling into their marriage. You can see in the picture where I accidentally skipped over it because the description of their relationship in S8 was so short.
Monica – The most interesting part to me was that Monica had the highest % of lines in S9 (assuming that my calculations are close to the actual number, which I don't know for sure). There’s something to say about how Monica, who had thoughts about motherhood since S1, has the highest percentage in the season where she and Chandler start to build a family for themselves.
Monica: What if my own baby hates me? Huh? What am I gonna do then?
Chandler: Monica, will you stop? This is nuts. Do you know how long it’s gonna be before you actually have to deal with this problem?
– “The One With the Baby on the Bus”, Season 2 Ep 6
2 – Who says the most words?
I personally don’t think this data is as interesting as the other two, so I’ll quickly go over what I did think was interesting.
Similar to above, the overall stats reflect the whole Rachel-Ross plot line. Phoebe takes last place again, although this time she’s right behind Monica.
The Rachel-Ross Valley shows up here like it did in the previous line plot, as does the Chandler Rift.
Otherwise, this line plot is definitely different! Phoebe consistently had the lowest or second lowest % in the previous section, but here she even takes #2 in S2 and S3. Joey’s stats also peak, taking #1 in S6.
2.5 – What about Pheebs?
I need to admit something. This post was originally just going to answer the first two questions: who speaks the most, and who says the most words. It turns out that I’m not the first person to figure this out.
When I was through coding up everything and began writing this post, I came across a post by Yashu Seth that did the same thing, six years ago. The way we gathered our data and wrote up our code is different, but we reached very similar-looking plots. Whelp. At least I can cross-check my results with someone, right?
Both Seth and I got the result that Phoebe takes last place in Total # of Lines/# of Times Spoken and Total # of Words. On one hand, that makes sense. She’s the most independent one out of the Friends.
On the other hand, if I were someone who had never seen Friends and saw both of those stats, I might assume that Phoebe doesn’t have a lot to say or to contribute to the plot. At least, that logic lines up when I apply this same question to The Big Bang Theory (dataset used). Sheldon and Leonard take up way more of the show than Howard and Raj do, and you can even go as far as to say that they’re more important than Howard or Raj.
As a huge Phoebe fan, I know that definitely isn’t true for Pheebs. She definitely isn’t the quietest of the main characters, and she certainly isn’t the Howard/Raj of Friends. There’s a reason why I yell “SHE’S JUST LIKE ME” whenever Phoebe says anything, and a reason why I mentally disowned Howard from the list of fictional MIT alumni.
There’s a whole collection of her iconic moments that I honestly felt were being undersold by the data that I had so far. There has to be some metric that better represents her character.
And that leads into my final question.
3 – Who says the most words on average?
’tis a resounding discovery for Phoebe and Joey enthusiasts alike. Whenever they speak, Phoebe and Joey both say the most words on average.
I really want to explore why Phoebe and Joey have noticeable spikes in Seasons 2 and 6 respectively, but since I promised that I would post this today, I’m going to keep myself from adding any more to this behemoth of a post.
I also don’t know how to end this post, so um. Here’s a recap.
- Ross and Rachel both said the most lines and the most words overall. This is probably because their relationship takes up a good chunk of the series’ plot lines.
- Phoebe and Joey say the most words on average. This is even more interesting considering that Phoebe had the lowest % of lines and words overall.
But is there a better way to visualize the dynamics between the characters? I’ll leave that for Part II.
References
- Friends dialogue data: https://www.kaggle.com/datasets/ryanstonebraker/friends-transcript
- The Big Bang Theory dialogue data: https://www.kaggle.com/datasets/mitramir5/the-big-bang-theory-series-transcript