Batted Balls to the Future
September 25, 2012
Over four years ago, we started using our play-by-play data to derive college batters’ batted ball distributions in a number of different ways. Here's what we wrote in a Hardball Times article back in 2008:
Taking all batted balls into account, [Vanderbilt OF Dominic] de la Osa, a right-handed batter, is spraying the ball all over the place, going to the opposite field nearly as often as he pulls it. However, it would appear that his home run power is almost exclusively to pull. Here's a look at his batted-ball distribution for last season and this season, compared to his two-year total for extra-base hits:
        LF   CF   RF
2007    45   31   36
2008    20   11   20
XBH     39   10    7
His circuit clouts present an even more dramatic split: Of the 25 in the last two years for which we have directional data, 21 (including all five this year) went to left. There's no doubting that de la Osa has "plenty of power," but it would appear that it's highly concentrated to pull.
|Kris Bryant (RHB, San Diego) 2012 vs RHP|
...and where is he most likely to hit the ball when facing a left handed pitcher?
|Kris Bryant (RHB, San Diego) 2012 vs LHP|
(For extra insight, mouse over the percentages to compare each number to the Division I average for right-handed batters.)
Why is this important? Well, for opposing college coaches, we suspect it could be a big help when positioning their defenses. For MLB scouts, it can assist in better defining a prospect’s tendency to pull the ball, or his ability to hit with power to the opposite field.
And for the rest of us simple fans? Hopefully it can help point out where to sit in the bleachers to catch that home run ball.
MLB Draft Team-by-Team Breakdown
June 08, 2011
1,530 draft picks are in the books. Here's a look at how each team went about their business, separating hitters and pitchers, and also breaking down by collegiates, juco players, and high schoolers.
TEAM      BAT   PITCH   4YR    JC    HS
ARI        24      28    34     3    15
ATL        26      24    31     8    11
BAL        22      28    20    12    18
BOS        25      28    21     3    29
CHC        24      26    20     9    21
CIN        23      27    25     6    19
CLE        22      28    23    10    17
COL        27      24    29     2    20
CWS        22      28    29     8    13
DET        29      20    29     5    15
FLA        21      29    26    13    11
HOU        26      24    30     7    13
KC         25      25    24     6    20
LAA        28      21    35     6     8
LAD        27      23    28     7    15
MIL        27      24    27     3    21
MIN        17      35    29     5    18
NYM        23      28    26     3    22
NYY        16      34    16     6    28
OAK        28      21    30     3    16
PHI        27      24    23     4    24
PIT        29      21    21     5    24
SD         29      24    31     5    17
SEA        23      28    31     7    13
SF         23      28    35     6    10
STL        25      25    30     8    12
TB         32      28    35     5    20
TEX        19      32    22     5    24
TOR        24      31    22     3    30
WSH        25      26    36     6     9
Total     738     792   818   179   533
Average  24.6    26.4  27.3   6.0  17.8
Here's another way of looking at things. This has been hailed as a banner year for college pitching, and in fact, teams did take almost one more college pitcher per organization. Some teams went nuts--the Giants, for example, took 22 college arms.
TEAM     4YR-BAT  4YR-PIT  JC-BAT  JC-PIT  HS-BAT  HS-PIT
ARI           16       18       1       2       7       8
ATL           14       17       4       4       8       3
BAL           10       10       3       9       9       9
BOS            9       12       1       2      15      14
CHC           11        9       2       7      11      10
CIN            8       17       5       1      10       9
CLE            8       15       4       6      10       7
COL           12       17       1       1      14       6
CWS           15       14       1       7       6       7
DET           14       15       2       3      13       2
FLA            9       17       5       8       7       4
HOU           18       12       1       6       7       6
KC            10       14       5       1      10      10
LAA           19       16       2       4       7       1
LAD           15       13       5       2       7       8
MIL           10       17       0       3      17       4
MIN           10       19       1       4       6      12
NYM           13       13       0       3      10      12
NYY            5       11       2       4       9      19
OAK           14       16       2       1      12       4
PHI           10       13       2       2      15       9
PIT           15        6       3       2      11      13
SD            16       15       3       2      10       7
SEA           12       19       4       3       7       6
SF            13       22       3       3       7       3
STL           12       18       4       4       9       3
TB            16       19       2       3      14       6
TEX            9       13       0       5      10      14
TOR           11       11       2       1      11      19
WSH           16       20       3       3       6       3
Total        370      448      73     106     295     238
Average     12.3     14.9     2.4     3.5     9.8     7.9
For comparison, I published similar tables after last year's draft.
A Much Closer Look at Tournament Brackets
May 25, 2011
One of my favorite aspects of the college baseball postseason is the multitude of brackets. In other sports, at other levels, the postseason is consistent and straightforward--you win a best-of-5 series to advance to the League Championship Series, or you win a single game to get from the Sweet 16 to the Elite 8.
In college baseball, just about every step of the way, things are more complicated.
The theme running through most steps of the college baseball postseason is "double-elimination." Not every conference tournament works that way, but most do. The Field of 64 leans heavily on double-elimination as well, with four-team double-elimination brackets in the regionals and two more four-teamers determining the national finalists.
What I want to know is this: How effective are these double-elimination brackets at unearthing the best team?
The perfect bracket
It's probably impossible to come up with a single perfect approach to a week-long postseason bracket. Still, it's worth thinking about what the ideal approach consists of.
Most importantly, a good bracket should make it likely that the best team wins. It should do this without relying on proper seeding, in large part because accurate seeding is so hard (or, at least, rare) in college baseball. Seedings are often based on short, unbalanced in-conference schedules. A team shouldn't have their hopes dashed because--like Vanderbilt in the SEC tournament this year--they are seeded 4th instead of 1st, where they belong.
It's crucial that the bracket be fair to all comers, particularly at the national level, because in the World Series of D-I, D-II, and D-III, the bracket is set before the exact teams are known. In those cases, no one is even trying to seed the teams that end up in the final eight.
Ultimately, I think we can use two criteria to determine a "good" bracket:
- The best team(s) have a good chance of winning.
- Each team's chances of winning are not heavily dependent on seeding.
If we agreed on the best team in a given tournament, it would be easy to fulfill the first requirement--just give them a bye to the final round, and let the other seven teams slug it out for the other spot in the finals. But that approach violates the second requirement.
On the other hand, we could choose the winner at random--that would fulfill the second requirement. But it wouldn't favor the best team at all, making it a horrible way for a tournament to determine the best team.
Jeff's Magical Field of Eight
To test out the various bracket formats, I created an eight-team field with a wide range of ability. It consists of: Oregon State (.700), Stanford (.650), Samford (.617), Rider (.590), Belmont (.561), USC (.530), Nicholls State (.500), and Pepperdine (.450). The numbers in parentheses are my strength ratings, and they are meant to express how often each team would win against an average opponent. So OSU would win 70% of games against such a team, while Nicholls State would break even.
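To make the ratings concrete, here is a minimal sketch of how a pair of them could be turned into a head-to-head win probability. It assumes the log5 method, which is a common choice but is not confirmed as the one used for the results in this article; the names `RATINGS` and `log5` are made up for illustration.

```python
# Strength ratings from the eight-team field above: the expected
# winning percentage against an average (.500) opponent.
RATINGS = {
    "Oregon State": 0.700,
    "Stanford": 0.650,
    "Samford": 0.617,
    "Rider": 0.590,
    "Belmont": 0.561,
    "USC": 0.530,
    "Nicholls State": 0.500,
    "Pepperdine": 0.450,
}


def log5(a: float, b: float) -> float:
    """Chance that a team rated `a` beats a team rated `b`.

    ASSUMPTION: log5 is used here for illustration only; the article
    does not say how its head-to-head odds were computed.
    """
    return a * (1 - b) / (a * (1 - b) + b * (1 - a))


if __name__ == "__main__":
    osu = RATINGS["Oregon State"]
    for team, rating in RATINGS.items():
        if team != "Oregon State":
            print(f"Oregon State over {team}: {log5(osu, rating):.3f}")
```

Against a .500 team, log5 simply returns the stronger team's own rating, which matches the definition of the ratings given above.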
Given this field, there's a fairly wide gap between OSU and their competition. For each bracket, we should expect to see Oregon State with the best chance of winning, regardless of whether they are seeded #1. To restate my criteria for a "good" bracket format: The ideal tourney structure is likely to spit out OSU as the winner, and seeding the teams correctly or incorrectly doesn't affect the results much.
Let's take a quick overview of the results. I've had to nickname the different bracket formats; we'll get into the details of each shortly. The percentages reflect the odds, in each bracket format, that Oregon State would beat out the rest of the field.
Format                Max Games   Seeded   Random
2x(4-team bracket)           15    31.4%    29.6%
2x(4-team pools)             13    31.3%    23.8%
8-team bracket               15    32.7%    31.1%
6-team + play-in             13    35.9%    30.5%
Omaha bracket                17    34.1%    32.3%
3-game playoffs              21    36.9%    29.8%
5-game playoffs              35    43.2%    34.4%
7-game playoffs              49    48.2%    41.7%
154-game season             616    82.4%    82.4%
In general, the more games, the more likely the best team wins. No surprise there. What is striking is the difference between the 'seeded' and 'random' results in some of the formats. Some tournament structures give a huge edge to the top seed--presumably a relatively easy draw through to the final rounds. Others provide a slight edge, but one that doesn't last as long.
Obviously, the last few formats listed are not feasible. I showed the 154-game season (each team plays every other team 22 times, just like pre-expansion MLB) to indicate the quality difference implied by the .700 level, .650 level, etc--over the course of a season, there would rarely be much doubt as to who the winner is. I included the 5- and 7-game playoffs to suggest how the MLB playoff structure compares.
The 2x4 is the most popular eight-team format in college baseball. The field is split up into two four-team brackets; one consists of the 1, 4, 5, and 8 seeds, and the other is made up of the 2, 3, 6, and 7 seeds. Each bracket plays its own double-elimination tournament. Here's an example bracket.
The only good thing this structure has going for it is that it doesn't favor the top-seeded team too much. But in terms of spitting out the "right" winner, it's about as bad as it gets. One problem is the initial splitting of the field--if the teams aren't seeded correctly, you can end up with the four best teams duking it out for one spot in the final. That's essentially what happened in last year's Division II World Series, where according to my numbers, the three best teams all ended up in the top half. At least one team ranked 4th or lower was guaranteed to end up in the final.
The other problem is more obvious: It all boils down to a single game. It's a cliche that in baseball, anything can happen in one game. That's why, in almost every other format, either the winner has to work harder (e.g. win a three-game series, as in Omaha) or the loser has to play worse (e.g. lose twice in a double-elimination bracket).
The pool-play format (2x(4-team pools) in the table above) is less popular; it is used for the ACC and Conference USA tournaments. Instead of two four-team double-elimination brackets, the first few days are given to two four-team round robins. After the field is split in half (again, 1/4/5/8 and 2/3/6/7), each team plays one game against every other team in their half. Here's a sample, though it isn't very enlightening.
With a single-game final round, this bracket has the same problem as the traditional 2x4. Making matters worse is the frequency of ties. With each team playing three games, it's highly likely that a four-team pool will end up with two, or even three, teams tied with a 2-1 record. No matter what the tie-breaking procedure, it's not going to be very reliable. I'm not confident that every conference uses the same tiebreaking rules, but in at least one case, the tiebreaker for three 2-1's simply hands the victory to the highest-seeded team! That's all well and good when the best team is seeded #1, but we should recognize that that's not reliable.
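To get a feel for how often a pool ends in a tie, the sketch below enumerates all 64 possible outcomes of a four-team round robin and adds up the probability of those in which two or more teams share the best record. It is only an illustration: the head-to-head odds come from the log5 method, which is an assumption, and `top_tie_probability` is a hypothetical helper name.

```python
from itertools import combinations, product


def log5(a: float, b: float) -> float:
    """ASSUMPTION: log5 head-to-head odds from two strength ratings."""
    return a * (1 - b) / (a * (1 - b) + b * (1 - a))


def top_tie_probability(ratings):
    """Probability that a single round robin ends with two or more
    teams sharing the best won-loss record."""
    n = len(ratings)
    games = list(combinations(range(n), 2))  # every pairing plays once
    tie_prob = 0.0
    # Walk through every possible set of results (2**6 for four teams).
    for outcome in product((True, False), repeat=len(games)):
        prob = 1.0
        wins = [0] * n
        for (i, j), first_team_wins in zip(games, outcome):
            p = log5(ratings[i], ratings[j])
            if first_team_wins:
                prob *= p
                wins[i] += 1
            else:
                prob *= 1 - p
                wins[j] += 1
        if wins.count(max(wins)) > 1:
            tie_prob += prob
    return tie_prob


if __name__ == "__main__":
    # Correctly seeded 1/4/5/8 pool from the eight-team field above.
    pool = [0.700, 0.590, 0.561, 0.450]
    print(f"Chance of a tie for first: {top_tie_probability(pool):.3f}")
```

With four evenly matched teams the answer is exactly one half: either somebody goes 3-0, or the top spot is shared by teams at 2-1.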
Because of the single-game final, we're unlikely to get the "right" winner out of this bracket. And because of the possibility of a tiebreaker, this format hands a big bonus to the teams lucky enough to be seeded #1 and #2. The pool-play bracket is probably the worst option.
I was surprised to discover that only one conference (the Great West) uses a "pure" eight-team double-elimination bracket. It's also in use in the Division III World Series. There are a few variations on the theme, but they don't affect the numbers much. Here's an example of the exact format I used.
There's not a huge statistical difference, but this is better than the previous two brackets at handing the championship to the best team, and it is the best of the 15-game brackets when the field is seeded randomly. The key lies in its choice not to split the field in half. As we've seen, allowing only one finalist from each half doesn't have the desired result when all the best teams are shoved in one half. The 8-team bracket still runs that risk slightly--after all, somebody has to play the top teams in the first round--but by dumping everyone in the same loser's bracket, it's more likely that the most deserving teams make it to the final round.
Finally, the double-elimination format is extended to the last game, so it's less likely that the champion is decided in nine innings. If the team from the loser's bracket wins Game 14, it has to win again. If the team from the winner's bracket wins Game 14, it is the lone undefeated team. All things considered, this is probably the best single-week format in current use.
6-team + play-in
In the interest of completeness, here's another approach. Many conferences hold six-team tournaments, and sure enough, there are a variety of six-team double-elimination brackets, as well. The Big South Conference compromised between six and eight. They have an eight-team tournament, but with what is essentially a six-team bracket.
The Big South tourney opens with two play-in games, in which the bottom four seeds fight for a spot in the six-team bracket. Thus, it is (at least at the start) single-elimination for some teams and double-elimination for others. Here's the very complicated Big South bracket.
There's probably no good way to do a six-team bracket. The Big South option may be the best, as it doesn't give byes to the top seeds. (The Mountain West Conference, for example, does give byes.) However, any six-team format is going to be stacked for the top seed. In the Big South, the top seed gets to face the lowest-seeded team that came out of the play-in games. Then, if they win, they don't play another winner--they play the winner of the first loser's bracket game. In other words, they may face the worst remaining team in each of their first two games.
That's better than the six-teamer with byes. With byes, the top seeds don't even have to play the first round, and then (at least in some scenarios) the #1 seed is guaranteed to face the lowest-seeded surviving team.
As the table shows, this format is better than the alternatives at giving the championship to the best team--if the best team is the #1 seed. However, there's a big gap between the seeded and random approaches. If the best team is seeded anything but #1, it has an uphill battle, especially if it is marooned in a play-in game.
The Omaha format (2x4 plus a three-game playoff) isn't bad, but it's probably not feasible for conference tournaments because it requires at least one extra day. It also has one of the same problems as the 2x4 format, where the best teams could be stuck slugging it out for only one place in the final round.
An interesting alternative would be to play a series of three-game playoffs. The #1 and #8 seeds would fight for a spot in the semis, where the winner plays the winner of #4 vs. #5, and so on. Essentially, it would be like the baseball postseason. If games 1 and 2 were always played as a double-header, the tourney could be crammed into six days; what makes it probably impossible is that a team could end up playing nine games in those six days, an absolute nightmare for coaching staffs.
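For readers who want to experiment, here is a rough sketch of how title odds in a bracket of three-game playoffs could be worked out exactly rather than by simulation. It rests on assumptions the text does not confirm: head-to-head odds come from log5, every game in a series is independent, and the other half of the bracket is 2 vs. 7 and 3 vs. 6. The function names are hypothetical, and the output should not be expected to match the table above exactly.

```python
from math import comb


def log5(a: float, b: float) -> float:
    """ASSUMPTION: log5 head-to-head odds from two strength ratings."""
    return a * (1 - b) / (a * (1 - b) + b * (1 - a))


def series_win(p: float, games: int = 3) -> float:
    """Chance of winning a best-of-`games` series when each game is
    won independently with probability p."""
    need = games // 2 + 1
    # The loser takes k games before the winner's clinching victory.
    return sum(comb(need - 1 + k, k) * p**need * (1 - p)**k
               for k in range(need))


def bracket_odds(ratings, games: int = 3):
    """Title odds for each slot of a single-elimination bracket of
    best-of-`games` series. `ratings` is in bracket order, so slots
    0 and 1 meet first, then the winner meets the winner of 2 and 3."""
    sides = [{i: 1.0} for i in range(len(ratings))]
    while len(sides) > 1:
        merged_sides = []
        for left, right in zip(sides[::2], sides[1::2]):
            merged = {}
            for i, p_i in left.items():
                for j, p_j in right.items():
                    w = series_win(log5(ratings[i], ratings[j]), games)
                    merged[i] = merged.get(i, 0.0) + p_i * p_j * w
                    merged[j] = merged.get(j, 0.0) + p_i * p_j * (1 - w)
            merged_sides.append(merged)
        sides = merged_sides
    return sides[0]


if __name__ == "__main__":
    # Seeds 1, 8, 4, 5, 2, 7, 3, 6 using the ratings from the field above.
    seeded = [0.700, 0.450, 0.590, 0.561, 0.650, 0.500, 0.617, 0.530]
    odds = bracket_odds(seeded)
    print(f"Top seed's title odds: {odds[0]:.3f}")
```

Passing `games=5` or `games=7` gives the longer playoff formats, and shuffling the order of `ratings` mimics a randomly seeded field.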
I embarked on this study thinking that there would be a bigger difference between tournament formats. If seeding is done reasonably well, there isn't much to separate the brackets. If seeding is done poorly, the pool-play approach isn't very good, but even then, all of the other options do a reasonably good job.
Now that we all know more than we ever wanted to know about the implications of various eight-team brackets, we can rest easy in the knowledge that, in this year's conference tournaments, no one is getting screwed. At least not too badly.
Except for Clemson.
Updated Power Rankings
April 04, 2011
We've set things up so that our Power Rankings update daily. The list includes all 300 Division One teams, so whether you're interested in the top 25 or the top 250, you can find it here.
There's a link on the left-hand sidebar under "Draft Toolbox," and you should probably bookmark the power rankings page, too.
The exact details of our algorithm are geeky, boring, and subject to change. For all that, you might be interested in the general idea.
- We start with each team's Pythagorean winning percentage--that is, an expected winning percentage based on how many runs they score and allow. This is a better tool than won-loss record, because it gives teams more credit for blowout wins. After all, who is the better team, the one that wins 11-1 or 11-7?
- Next, we measure the quality of each team's schedule. This is absolutely crucial, especially early in the season, when teams are playing out-of-conference opponents.
- We also adjust for how many games each team has played at home. Home field advantage is stronger in college than in the pros, and northern teams don't get nearly as many home games as their southern brethren. We account for that.
- Finally, we measure the skill level of each team's returning players. Using Wins Above Replacement (WAR) from 2010, we estimate the quality of each team before the season starts. Then, as the season progresses, we increase the weight given to in-season results and decrease the weight given to 2010 WAR.
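As a rough illustration of the first and last steps, the sketch below computes a Pythagorean winning percentage and blends a preseason estimate with in-season results. It is not the actual algorithm: the exponent of 2 is the classic textbook value, and the linear weighting with its 30-game cutoff is a purely hypothetical stand-in for whatever schedule is really used. The schedule and home-field adjustments are left out entirely.

```python
def pythagorean_pct(runs_scored: float, runs_allowed: float,
                    exponent: float = 2.0) -> float:
    """Expected winning percentage from runs scored and allowed.

    ASSUMPTION: exponent 2 is the classic value; the exponent used
    in the rankings is not stated.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def blended_rating(preseason: float, in_season: float,
                   games_played: int,
                   full_weight_games: int = 30) -> float:
    """Shift weight from the preseason estimate to in-season results
    as games accumulate.

    ASSUMPTION: a linear schedule reaching full in-season weight at
    `full_weight_games` is hypothetical, chosen only for illustration.
    """
    w = min(games_played / full_weight_games, 1.0)
    return w * in_season + (1 - w) * preseason


if __name__ == "__main__":
    # The 11-1 winner grades out well ahead of the 11-7 winner.
    print(f"11-1: {pythagorean_pct(11, 1):.3f}")
    print(f"11-7: {pythagorean_pct(11, 7):.3f}")
    print(f"Blend after 15 games: {blended_rating(0.600, 0.800, 15):.3f}")
```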
Division One Power Rankings: March 28
March 28, 2011
The Cavaliers are still on top in this week's College Splits Power Rankings. The Gators and Aggies have pushed themselves closer, each gaining two spots over last week's rankings. South Carolina also gained; they are back in the top 10 after appearing at #13 last week.
The CS rankings are based not only on 2011 performance, but also on the previous performance of returning players.
- 1. Virginia (24-2)
- 2. Vanderbilt (22-3)
- 3. Texas A&M (19-5)
- 4. Florida (21-4)
- 5. Clemson (13-9)
- 6. Florida State (18-6)
- 7. North Carolina (23-3)
- 8. UCLA (11-8)
- 9. South Carolina (18-5)
- 10. Oklahoma (19-6)
- 11. Arizona (17-7)
- 12. California (16-5)
- 13. Georgia Tech (21-4)
- 14. Arizona State (18-6)
- 15. Texas Christian (15-8)
- 16. Texas (17-7)
- 17. Louisiana State (17-7)
- 18. Miami (FL) (14-11)
- 19. Mississippi State (18-6)
- 20. Alabama (18-7)
- 21. Auburn (14-10)
- 22. Mississippi (18-7)
- 23. Oregon State (18-6)
- 24. Arkansas (18-6)
- 25. Louisville (15-8)
Next five: Nebraska, Central Florida, Southern Mississippi, Oklahoma State, Troy