Alphabetism in Baseball

You may already be aware of this, dear reader, but alphabet discrimination exists.  People with surnames near the beginning of the alphabet own a slight but noticeable advantage over their late-alphabet colleagues.  They appear earlier in directories, leading to more phone calls.  They receive more applause at awards ceremonies and graduations, because people tend to get tired of clapping by the time the T’s roll around.  They are even more likely to receive tenure and Nobel Prizes, according to a study by Liran Einav and Leeat Yariv, because authors of co-authored work in certain fields tend to be listed in alphabetical order.

The alphabet is important in baseball, too.  David Aardsma, despite the success he’s found in an eight-year career, is still best known for supplanting Hank Aaron as the first name in the alphabetical list of players.  This fact is the second sentence in his Wikipedia article.  People are still upset by this.

But is there alphabet discrimination in baseball?  I collected the performances of every hitter in baseball history (this is an activity which sounds far more impressive than it actually is), grouped them by the first letter of their surname, and averaged each group’s hitting ability, as represented by FanGraphs’ own wRC+.  The stunning and aesthetically pleasing result:

(Note: Each player’s career wRC+ is counted once, no matter how many seasons they played.  Since a superior player is more likely to last multiple seasons than an inferior player, the graph doesn’t average out at 100 even if the average player does.)
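For you kids at home with a Python interpreter, the letter-by-letter averaging can be sketched roughly as follows. This is a minimal sketch and not the code behind the graph: the surnames and wRC+ values are invented placeholders, and `average_by_initial` is a hypothetical helper name. Each player counts exactly once, as described in the note above.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: (surname, career wRC+).
# These players and numbers are invented for illustration only.
careers = [
    ("Alpha", 110),
    ("Avery", 70),
    ("Baker", 95),
    ("Quill", 60),
    ("Quince", 80),
]


def average_by_initial(careers):
    """Return the unweighted mean career wRC+ for each surname initial.

    Every player counts once, regardless of how long the career lasted.
    """
    by_letter = defaultdict(list)
    for surname, wrc_plus in careers:
        by_letter[surname[0].upper()].append(wrc_plus)
    # Sort so the result reads from A to Z, like the x-axis of the graph.
    return {letter: mean(values) for letter, values in sorted(by_letter.items())}


letter_means = average_by_initial(careers)
for letter, value in letter_means.items():
    print(letter, value)
```

Because the mean is unweighted, a single long career moves its letter no more than a single cup of coffee does, which is why one good hitter can rescue a sparsely populated letter.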

From this beautiful and concise graph we can draw several conclusions:

  • The next player whose last name starts with X will be the greatest player whose last name starts with X… of all time.
  • Having a last name beginning with a Q is the kiss of death.  In fact, the letter Q owes its recent success to the performance of Carlos Quentin; without him, the average wRC+ would be 76.
  • Other than that, not much.

But why stop there?  Why not examine hitting ability based on something even more arbitrary, such as the length of a player’s last name?

Bringing up the rear there is America’s favorite Saltalamacchia, proud owner of a career .699 OPS.  But what’s surprising is the strength of the trend.  For you kids at home with the graphing calculators, the data sports an r-squared of .69, and it jumps to .78 if we boot out a certain busted catching prospect.
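For those without a graphing calculator handy, here is one way the r-squared of a straight-line fit could be computed, with and without the longest name. It is only a sketch under assumptions: the (name length, average wRC+) points are invented and will not reproduce the figures quoted above, and `r_squared` is a hypothetical helper that squares the ordinary Pearson correlation.

```python
from statistics import mean


def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. r-squared of a simple linear fit."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical points: (surname length, mean wRC+ at that length).
# Invented for illustration; not the data behind the graph.
points = [(3, 96), (4, 93), (5, 94), (6, 90), (7, 91), (8, 88), (14, 60)]

# Fit using every point.
r2_all = r_squared([x for x, _ in points], [y for _, y in points])

# Fit again after booting out the longest surname.
longest = max(x for x, _ in points)
trimmed = [(x, y) for x, y in points if x != longest]
r2_trimmed = r_squared([x for x, _ in trimmed], [y for _, y in trimmed])

print(round(r2_all, 2), round(r2_trimmed, 2))
```

Note that r-squared describes how much of the variation a line explains. It says nothing by itself about whether short names cause good hitting.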

The causes of this, if any, lie in obscurity.  Perhaps players lose confidence when the PA announcer botches their name at home games; perhaps scouts are more likely to remember short names when scanning for talent.  Who can say?  The world is full of biases, swirling and eddying around us all.

Patrick Dubuque is a wastrel and a general layabout. Many of the sites he has written for are now dead. Follow him on Twitter @euqubud.

15 Comments
Theo
13 years ago

As you can see from the first graph, children, the average player is, statistically, below average.

Friedman
13 years ago
Reply to  Theo

I definitely thought the same thing.

The only explanation that I can think of is that he didn’t weight wRC+ by PA. Players with high wRC+ will stay around much longer and for every Albert Pujols, there will be multiple replacement level players with subpar wRC+s. If they’re given equal weight, this might result in the below-average error.
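The weighting point can be illustrated with a toy example. This is a sketch on invented numbers, not real careers: one long-lived star and four short-lived reserves, with hypothetical helper names `unweighted_mean` and `pa_weighted_mean`. The toy league is not calibrated to a 100 baseline, so only the gap between the two averages matters.

```python
# Hypothetical players: (career plate appearances, career wRC+).
# Invented for illustration only.
players = [
    (10000, 150),  # one star with a long career
    (300, 70),
    (250, 65),
    (400, 80),
    (150, 60),
]


def unweighted_mean(players):
    """Average wRC+ with every player counted once."""
    return sum(wrc for _, wrc in players) / len(players)


def pa_weighted_mean(players):
    """Average wRC+ with each player weighted by plate appearances."""
    total_pa = sum(pa for pa, _ in players)
    return sum(pa * wrc for pa, wrc in players) / total_pa


print(unweighted_mean(players))   # every player counts equally
print(pa_weighted_mean(players))  # the star's playing time dominates
```

Counting players equally lets the many short careers drag the average down, while weighting by PA lets the star's playing time pull it back up.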