Do long movie titles make for better movies?

(Or, I use years of statistics knowledge to try and prove a point)

Birds of Prey UK poster.
Image: Warner Bros. Pictures

With 56 letters, 19 syllables, and 11 words, Birds of Prey (and the Fantabulous Emancipation of One Harley Quinn) joins the ranks of Birdman or (The Unexpected Virtue of Ignorance) and Borat: Cultural Learnings of America for Make Benefit Glorious Nation of Kazakhstan in the annals of lengthy film titles. The journey to reclaim Harley’s character clearly demanded a title that was as different from the 12-letter, four-syllable, two-word Suicide Squad as could be.

Besides being mouthfuls, Birdman or (The Unexpected Virtue of Ignorance) and Borat: Cultural Learnings of America for Make Benefit Glorious Nation of Kazakhstan also happen to have high critical ratings, both sitting at 91% “fresh” on review aggregator Rotten Tomatoes. Compared to recent flubs like Cats and Dolittle, which boast dinky one-word titles and horrible Rotten Tomatoes scores, these overly loquacious names seem indicative of a higher cinematic pedigree.

Is it possible that, compared to their less verbose counterparts, movies with long titles are just … better? Does Birds of Prey (and the Fantabulous Emancipation of One Harley Quinn) have a critical leg up on the superhero movie competition, just by stuffing syllables onto its poster?

Thanks to the magic of statistics, there’s a way to test the hypothesis. We’re going to prove — with math! — whether or not movies with longer titles are better.

Movie taste is subjective, but we’ll use Rotten Tomatoes’ critics scores as a barometer for putting some quantitative markers on the idea of a “good movie.” Furthermore, let us clarify the golden rule of statistics: Correlation does not imply causation. Suppose our thesis is correct, and we find a trend between critical scores and movies with long titles. That doesn’t necessarily mean one causes the other, but that they share a strong relationship that could be influenced by a wide variety of factors (movies with long, pretentious titles may be liked by long, pretentious people, for instance). Still, if there is indeed a relationship between title length and critical score, we could have concrete proof of why Birdman or (The Unexpected Virtue of Ignorance) was so well-received.

One might assume from scanning the list of Rotten Tomatoes’ top 100 movies of all time that shorter titles play better for critics, with roughly 50% of the titles settling for one or two words. But keep in mind that 52% of the worst movies on Rotten Tomatoes are also one- to two-worders. Plus, the best and worst movies lists don’t tell us anything about the films that are bad-but-not-too-bad or good-but-not-too-good — we need to get a full sample of movies from across the gamut of quality in order to see if there is indeed a trend.

Victoria (Francesca Hayward), a white cat, wearing a string of pearls in Cats (2019)
Cats (2019): one word in the title, 20% “fresh” on Rotten Tomatoes.
Image: Universal Pictures

The importance of randomness

Strong statistical studies incorporate randomness in order to prevent unconscious bias. Since my thesis is all about movies with longer titles being better, I can’t just go ahead and pick bad short-title movies to fill out my roster. That would be cheating. And even if I don’t mean to steer the results my way, I may do it without thinking. That’s where randomness comes in.

To achieve randomness, I used a “Random Movie Roulette” Letterboxd list of 7,596 movies curated by user Tobias Andersen. Using a random number generator, I navigated to the designated movie on the list, added it to a spreadsheet, then repeated the process until I reached 100 titles. The “Random Movie Roulette” list is extensive and random (especially to me, as a person who did not make it), but I had to pass on a few titles that did not have Rotten Tomatoes scores. Notable entries that did not make the cut include: The Gnome-Mobile, Hot Splash, Bloodfight, and The Dungeonmaster. Movies that did: Master and Commander: The Far Side of the World, Mad Max Beyond Thunderdome, The Private Lives of Pippa Lee, Chef, Titus, Dredd, and Daddy Longlegs.
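The draw-and-repeat procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow actually used: the list size and sample size come from the text, while `has_rt_score` is a hypothetical stand-in for checking Rotten Tomatoes by hand, and skipping repeated draws is an assumption the text does not spell out.

```python
import random

LIST_SIZE = 7596    # films on the Letterboxd list, per the article
SAMPLE_SIZE = 100   # titles collected, per the article


def has_rt_score(position):
    """Hypothetical stand-in for looking a film up on Rotten Tomatoes.

    Every 10th list position is treated as unscored, purely so the
    sketch has something to skip.
    """
    return position % 10 != 0


def draw_sample(rng, list_size=LIST_SIZE, sample_size=SAMPLE_SIZE):
    """Pick random list positions until enough scored films are found."""
    chosen = []
    seen = set()
    while len(chosen) < sample_size:
        position = rng.randint(1, list_size)  # inclusive on both ends
        if position in seen:
            continue  # assumption: a film drawn twice is not counted twice
        seen.add(position)
        if has_rt_score(position):
            chosen.append(position)
    return chosen


sample = draw_sample(random.Random(2020))  # fixed seed for repeatability
print(len(sample), min(sample), max(sample))
```

Because the generator, not the researcher, decides which positions get looked up, there is no room to quietly favor bad short titles.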

[Disclaimer: Because I didn’t know the titles the random generator would spit out, I did want to ensure that there were at least two very long titles and two very short titles, each corresponding to different movie quality. I manually input Cats (bad) and Birdman or (The Unexpected Virtue of Ignorance) (good), as well as Amadeus (good) and A Kid in King Arthur’s Court (bad).]

Breaking things down

Titles alone are hard to quantify, so I broke them down further by the number of letters, words, and syllables — since there’s a clear difference between Cats and Amadeus — although I didn’t count parentheses, punctuation, and spaces as characters. I also counted numbers as one word. In the end, I would use each of these parameters (words, syllables, and letters) in separate graphs, but with the same analysis applied to each.
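The letter and word counts can be sketched as two small Python functions. This is a hedged approximation of the rules above, not the spreadsheet actually used: the function names are made up for illustration, and counting each digit as a character is an assumption inferred from the 22-letter figure for Around the World in 80 Days. Syllables are left out because they depend on pronunciation rather than spelling.

```python
def count_letters(title):
    """Count letters and digits, ignoring spaces and punctuation."""
    return sum(1 for ch in title if ch.isalnum())


def count_words(title):
    """Count whitespace-separated tokens that contain a letter or digit."""
    return sum(
        1 for token in title.split() if any(ch.isalnum() for ch in token)
    )


for title in (
    "Suicide Squad",
    "Birds of Prey (and the Fantabulous Emancipation of One Harley Quinn)",
    "Around the World in 80 Days",
):
    print(title, count_letters(title), count_words(title))
```

Run on these three titles, the sketch gives 12 letters and two words, 56 letters and 11 words, and 22 letters and six words, matching the counts quoted in the article.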

Making sense of the data

After gathering the data, I created simple scatter plots in Google Sheets. [Author’s note: To aspiring statisticians out there, Microsoft Excel is objectively better, but Google Sheets is free, baby.] Scatter plots are the bread and butter of statistical analysis. These dots make sense of all the data. Basically, scatter plots take lists of numbers and put them in visual form, allowing us to clearly see if there is any possible relationship between variables.

Along the x-axis, we have the predictive element: in this case, the length of the movie’s title, be it conveyed through words, syllables, or letters. On the y-axis, we chart the element being predicted, the Rotten Tomatoes score. Each dot represents a movie. For instance, if we just plotted Cats and Birdman or (The Unexpected Virtue of Ignorance) based on the number of words in their title, we’d get this:

A graph of Cats and Birdman, plotting the number of words in the title versus their Rotten Tomatoes score.
Chart: Petrana Radulovic/Polygon

Ideally, in order to support our hypothesis, we’d want to have an upward slope, representing a positive correlation, as seen above. Once again, each dot represents a movie; in this case, Cats is on the bottom left and Birdman or (The Unexpected Virtue of Ignorance) is on the top right. They’re plotted using the number of words in their titles as the x-coordinate and the Rotten Tomatoes score as the y-coordinate.

The trendline generated is a line of best fit, meaning it runs as close as it can to all of the data points at once. In this overly simplified example, our data is just two points, so they both fall neatly on it. Other cases with a more chaotic data spread will see dots both above and below the line. In our case, the line is a tool to help visualize trends in the data, if any.
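For the two-movie example, the line of best fit can be worked out directly. The sketch below assumes the trendline is an ordinary least-squares fit, which is the usual meaning of a linear trendline. The scores (20 and 91) are the ones quoted in the article, the word counts (one and seven) come from counting the words in each title, and `best_fit_line` is a name invented for this illustration.

```python
def best_fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


words = [1, 7]     # Cats; Birdman or (The Unexpected Virtue of Ignorance)
scores = [20, 91]  # Rotten Tomatoes scores quoted in the article
slope, intercept = best_fit_line(words, scores)
print(round(slope, 2), round(intercept, 2))
```

With only two points, the fitted line passes through both of them, so the slope is simply the rise over the run: 71 points across six extra words, or roughly 11.8 points per word.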

Since our hypothesis is that longer titles have better scores, we’re looking for an upward slope. A downward slope, or negative correlation, would occur if longer titles implied bad movies, such as if I had plotted Amadeus and A Kid in King Arthur’s Court.

Let’s see some graphs

With 100 movie titles, each broken down by letters, syllables, and words, we now have our data. The titles included 1956’s Around the World in 80 Days (22 letters, eight syllables, six words); Last Train from Gun Hill (20 letters, five syllables, five words); Yours, Mine, and Ours (15 letters, four syllables, four words); Action Jackson (13 letters, four syllables, two words); White Girl (nine letters, two syllables, two words); and 2017’s The Mummy (eight letters, three syllables, two words).

As in our very simple example, on the graphs below, each dot represents a movie, with each x-value representing its length (determined by letters, syllables, or words, depending on the graph) and each y-value representing the movie’s corresponding RT critics score. Unlike in the very simple example, things get a little more wild:

Three scatter plots: number of letters, syllables, and words in the title vs. Rotten Tomatoes score.
Charts: Petrana Radulovic/Polygon

Remember the upward-trending line in the Cats and Birdman or (The Unexpected Virtue of Ignorance) example? In each of our graphs — letters, syllables, and words — the trendline points upward in a similar fashion, indicating the pattern we hoped for: Longer titles imply better scores. It’s not as dramatic as the rigged example, but it is there.

Have I done it? Have I broken down Rotten Tomatoes-aggregated critics to their bare essentials?

Well, not quite

Even though I’m conducting my research in the proper way, and I could absolutely take the charts out of context and present my findings, I’m an ethical person. I can’t just skew statistics in order to prove my point. C’mon.

In statistics, there is a value known as the correlation coefficient, R, that signifies the strength of the linear relationship between two variables, such as the number of letters in movie titles and their respective Rotten Tomatoes scores. The actual equation is cumbersome (see below), but thankfully, Google Sheets has a built-in command ([clears throat] Google Sheets, activate CORREL) that generates an R-value when you input two ranges of data. An R of 1 (or -1 in the other direction) means the points fall perfectly on a line, whereas an R of 0 means there is no linear relationship at all.

Statistics equation to calculate R, the Pearson correlation coefficient.
Image: Data Science Central
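For readers without the image, the standard textbook definition of the Pearson correlation coefficient is written out here for n paired observations, with x standing for title length, y for the Rotten Tomatoes score, and the barred symbols for their sample means. The notation is the conventional one and may differ from the symbols in the pictured equation.

```latex
R = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```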

Another variable commonly used in statistics is R² (the R-value, but squared). Statisticians use R² more often than R to quantify the relationship between two variables, since it eliminates the negative aspect and therefore is less confusing. It should be noted, though, that there are certainly cases where there’s a not-great R² (like 0.2) that covers up an otherwise acceptable R (0.44).
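To see how squaring removes the sign, here is a minimal Python sketch of the Pearson calculation run on two made-up data sets. The numbers are invented purely for illustration and are not taken from the 100-movie sample, and `pearson_r` is a hypothetical helper name rather than a spreadsheet function.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up illustration data, not the article's sample.
title_words = [1, 2, 3, 4, 5]
rising = [10, 30, 20, 50, 40]
falling = [40, 50, 20, 30, 10]  # the same scores in reverse order

r_up = pearson_r(title_words, rising)
r_down = pearson_r(title_words, falling)
print(round(r_up, 2), round(r_up ** 2, 2))      # 0.8 0.64
print(round(r_down, 2), round(r_down ** 2, 2))  # -0.8 0.64
```

Both data sets end up with the same squared value even though one trend points up and the other down, which is why the direction has to be read from R or from the slope of the trendline.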

After plugging the data through the Google Sheets commands that generate R and R², a fact comes to light: Both the R-value and R²-value for each of these graphs kind of suck.

“Number of Words vs. Rotten Tomatoes Score” has it the worst, with a measly R² of 0.006 and an R of 0.07, which basically implies no correlation. “Number of Letters vs. Rotten Tomatoes Score” wobbles in with an R² of 0.024 and an R of 0.15. There is a small, tiny spark of hope: while the R² for syllables is a pretty pathetic 0.04, the R-value is 0.2. That’s very, very weak, but because the guidelines for an “acceptable” R-value can differ depending on the specific textbook or parameters that are used (and they can definitely be manipulated), an R-value of 0.2 could still point to some sort of relationship.
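As a quick sanity check, the reported pairs can be squared by hand or in a few lines of Python. The values below are the rounded figures quoted above, so small mismatches are expected. Reading the squared value as the share of score variation accounted for by a straight-line fit is the standard textbook interpretation, not something the article itself states.

```python
reported = {
    "words": (0.07, 0.006),
    "letters": (0.15, 0.024),
    "syllables": (0.2, 0.04),
}

for measure, (r, r_squared) in reported.items():
    # Rounded R values will not square to exactly the rounded squared value.
    share = r_squared * 100
    print(measure, round(r ** 2, 4), r_squared, f"{share:.1f}% of variation")
```

Even in the best case, syllables, title length accounts for only about 4% of the variation in scores, leaving roughly 96% to everything else about the movie.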

Y’know, if we were desperate and wanted to fudge our data. Which we’re not. But I’m just saying — we could.

Alas, by my own ethical standards, I cannot support my hypothesis that movies with longer titles are regarded as better by critics. I could maybe argue that wordier, multisyllabic titles have a weak tendency to be critical darlings ... but the glaring chart in my college engineering statistics textbook of acceptable R-values would call me a liar and haunt my dreams.

At the very least, we’ve learned something here today, and can trudge on knowing that if someone at a fancy party tries to point out that movie critics love long titles, we have an instant and supported rebuttal.