I did not make this headline up
It's home-opener week, which seems like a great time to start looking at cumulative stats. This will be one of a few of these I do over the rest of the season.
Baseball has long lived at the forefront of statistics in sports. From the heavy emphasis on batting over .300 to advanced sabermetrics, baseball's history is forever linked with teaching kids that not all math is useless. In honor of that longstanding tradition, today we look at some stats from our baseball team and then wonder what the hell they might mean. College baseball stats are not just loosely kept; they also fluctuate wildly over periods of time.
First, because of the nature of college baseball's shortened season (compared to major leagues), pitching statistics don't really offer enough data until very late in the season, if at all. There's just not enough to say about 17 innings of work for a starter or 8 innings, if that, for a reliever. So we're going to focus on just batting statistics in this and most future posts of this type.
Second, college baseball stats are very basic. There is no way to track pitches accurately without either a dedicated sports information director or someone at games. It's painstakingly, eye-gouging-ly monotonous to calculate batting averages with runners in scoring position. You have to hope your team has play-by-play on the bottom of their box score, and then you have to read through each at-bat, and all surrounding at-bats in order to come up with the raw data. Just to come up with the data that I have, I had to go through each box score and type in each statistic to have a game log for each player.
This is just the way things are.
The first thing I always like to post is a track on how our team batting average, on base percentage, and slugging percentage have progressed over the season.
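For anyone who wants to build the same kind of running chart from a game log, here is a minimal Python sketch. The GameLine field names and the two sample games are made up for illustration and are not Michigan's actual box scores. The formulas are the standard definitions: average is hits over at-bats, on-base is times on base over at-bats plus walks, hit-by-pitches, and sacrifice flies, and slugging is total bases over at-bats.

```python
from dataclasses import dataclass
from itertools import accumulate


@dataclass
class GameLine:
    """One game's team batting totals (hypothetical field names)."""
    ab: int   # at-bats
    h: int    # hits
    bb: int   # walks
    hbp: int  # hit by pitch
    sf: int   # sacrifice flies
    tb: int   # total bases


def combine(a: GameLine, b: GameLine) -> GameLine:
    """Add two stat lines together field by field."""
    return GameLine(
        ab=a.ab + b.ab, h=a.h + b.h, bb=a.bb + b.bb,
        hbp=a.hbp + b.hbp, sf=a.sf + b.sf, tb=a.tb + b.tb,
    )


def slash_line(t: GameLine) -> tuple:
    """Return (AVG, OBP, SLG) for a set of totals, rounded to three places."""
    on_base_chances = t.ab + t.bb + t.hbp + t.sf
    avg = t.h / t.ab if t.ab else 0.0
    obp = (t.h + t.bb + t.hbp) / on_base_chances if on_base_chances else 0.0
    slg = t.tb / t.ab if t.ab else 0.0
    return round(avg, 3), round(obp, 3), round(slg, 3)


def running_slash_lines(games: list) -> list:
    """Cumulative (AVG, OBP, SLG) after each game, in game order."""
    return [slash_line(total) for total in accumulate(games, combine)]


# Two invented games, only to show the shape of the input.
sample_games = [
    GameLine(ab=30, h=9, bb=3, hbp=1, sf=1, tb=14),
    GameLine(ab=34, h=7, bb=5, hbp=0, sf=0, tb=10),
]

for game_number, line in enumerate(running_slash_lines(sample_games), start=1):
    print(game_number, line)
```

Each tuple is one point on the chart: plot the first, second, and third values against the game number to get the three lines.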
This year I've tried to add a fourth line to represent the quality of competition Michigan has faced. The purple line represents RPI, scaled so that the #1 team in the nation registers 1.000, the 151st team registers .500, and the 302nd team registers zero. The RPI comes from Boyd's World (in this case, the data was taken on Sunday 3/21). I felt this would serve as a reference for explaining certain peaks and valleys.
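The exact formula behind that scale isn't spelled out here, but a straight linear mapping from rank to the 0-to-1 scale fits the three reference points given. A rough Python sketch, assuming a 302-team field and assuming the mapping really is linear (the function name is just for illustration):

```python
def rank_to_scale(rank: int, total_teams: int = 302) -> float:
    """Map an RPI rank onto a 0-to-1 scale (assumed linear).

    Rank 1 maps to 1.0 and the last-ranked team maps to 0.0.
    """
    if not 1 <= rank <= total_teams:
        raise ValueError("rank must be between 1 and total_teams")
    return (total_teams - rank) / (total_teams - 1)


for rank in (1, 151, 302):
    print(rank, round(rank_to_scale(rank), 3))
```

Under this assumption the 151st team lands a hair above .500 rather than exactly on it, which is close enough for a reference line on a chart.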
Other than the realization that we've played a tough schedule this year, what jumps out to me is the lower slugging percentage. Last year, Michigan regularly slugged around .475. The last graphic I made last year was this one, 37 games through the season:
We're slugging just over .440 this season, where last season was spent hovering around .475. Sure, the competition has gotten a bit tougher, but something else seems spotty here. We'll look at the slugging percentage and other non-Excel visualizations after the jump…
So let's take a look at those slugging percentages (if this embed will work properly):
The original interactive embed didn't, so we're going with just the image--ed.
The larger squares represent the batters with the most at-bats. The darker squares represent the batters with the highest slugging percentage. I'm a big proponent of slugging as the most important of the three major derived stats (batting average and on base percentage are the other two). As you can see, Crank, Biondi, and Berset are really carrying the load with LaMarre out. Dennis is also performing solidly.
What catches my eye, though, is Mike Dufek. Dufek is slugging a measly .397 this year, after slugging .627 last season. That's a significant drop-off. The only thing I can imagine is that Mike is working on cutting down the strikeouts. He's gone from striking out once in every 3.8 at-bats last year to just once in every 6.6 this season. Dufek hit 17 homers and 19 doubles last season. At this point in the season, he's on pace for 15 doubles and ZERO homers. I'm not sure that's all that bad, though: Dufek is also on pace to knock in 61 RBI this season, 17 more than last season. Win some, lose some. With LaMarre due back soon, this drop in slugging percentage will probably reverse course and hopefully rise above last year's level by the time we get too far into the conference season.
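The strikeout rate and "on pace" figures are simple arithmetic, sketched below in Python. The inputs in the example calls are hypothetical (the raw at-bat, strikeout, and games-played counts aren't listed in this post), and the 56-game season length is an assumption used only to show the calculation.

```python
def at_bats_per_strikeout(at_bats: int, strikeouts: int) -> float:
    """At-bats per strikeout; higher means the batter strikes out less often."""
    if strikeouts == 0:
        return float("inf")  # no strikeouts yet, so the rate is unbounded
    return at_bats / strikeouts


def on_pace(total_so_far: int, games_played: int, season_games: int) -> int:
    """Project a counting stat over a full season at the current per-game rate."""
    if games_played <= 0:
        raise ValueError("games_played must be positive")
    return round(total_so_far * season_games / games_played)


# Hypothetical inputs: 66 at-bats with 10 strikeouts,
# and 5 doubles through 18 games of an assumed 56-game season.
print(at_bats_per_strikeout(66, 10))
print(on_pace(5, 18, 56))
```

The projection is a straight-line extrapolation, so it swings a lot this early in the year: one extra double in a weekend moves the pace by three.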
If you interact with the same visualization above and switch over to on-base percentage instead of slugging percentage, you'll notice that those three slugging leaders (Crank, Biondi, and Berset) are the on-base leaders as well. Biondi's .494 on-base percentage is phenomenal for a true freshman playing the caliber of baseball Michigan has faced this season. Looking at Berset's 1.058 OPS (slugging plus on-base) makes me wonder just how different last year could have been with our current captain. That broken finger he suffered sliding into second was devastating, even more so than losing LaMarre has been this year.
And while I can't quite give you batting average with runners in scoring position, this same visualization does offer a comparison of runners left on base compared to RBI.
Clicking the image should take you to a clearer version
What this diagram shows is player tiles sized by their number of RBI and the color shaded by the number of base runners they've stranded to end an inning. Lorenz has had it a bit rough this season, hitting in only 5 RBI while leaving 24 on base. This has really been amplified the last two weekends as he's stranded 19 over the last 7 games.
Anthony Toth also stands out as under-performing in several of these visualizations. In the two hole in the lineup, he should be hitting for a solid combination of average and power. He's definitely improved over the last year, but he's still not a best case second place hitter, especially with the .365 slugging percentage.
Toth's chart is somewhat strange again this year. Last year, he started off the first 8 games of the season with an identical batting average and slugging percentage. He was single or nothing. He's done a bit better this year with the walks, though, and that's encouraging, as he's at least getting on behind Biondi and setting up the middle of the order to score runs. Toth's strikeout rate is still pretty high, as it was last year. He's currently striking out once in every 4.625 at-bats, even more frequently than his one in 4.95 last season. This may be a concern moving forward, but I don't see anyone worthy of taking his spot from him at the moment. If Dennis can solidify himself a bit more as the season goes along, we may see a change.
Speaking of the freshman, a look at his chart:
Dennis struggled a bit with the top clubs we've faced this season, but he's been one of the hottest hitters over the last two weeks. The best news is his climbing slugging percentage and massive drop in strikeouts. During the week against FGCU and UNC, Dennis struck out 9 times. The last two weeks, he's struck out just 3 times, and he's also belted 3 homers. His slugging percentage has risen 222 points in that same span. That's definitely encouraging.
I think the offense has rebounded well since the week following LaMarre's exit. Berset and Crank are way out-performing my expectations, and Biondi has been a monster in the lead off spot. I'm not so concerned about Dufek as he hasn't been relied on as much as last year, and his cut down in strikeouts hasn't affected his RBI production at all. He's performed much better behind two, and soon to be three, other big hitters in the lineup, and he's getting his job done.
I am keeping my eye on Toth vs Dennis in the lineup. I don't think it's time for a change yet, but if Dennis keeps producing, we could see Toth move to the bottom of the lineup and act as a lead off man for Biondi to hit around/in. Just something to watch come next stat watch, which will probably come after conference play begins next week.
Damn you many eyes. I'm going to have to fix that to just the jpg. It just won't embed right.
Do you think the team would be about right SLG-wise if LaMarre were in the lineup and producing normally?
Okay, a couple numbers to reference this as unbiased as I can:
There's also the caveat that Berset was missing from half of the games in the 2009 season through 18 games, so that slugging percentage may also be a bit low.
Okay, now for the biased part: I think we'd be just about the same as last year had LaMarre stayed in the lineup, probably a bit better given we wouldn't have fallen to pieces that week or two after he left the team. That's probably to be expected, as this year's team was expected to be a much better gap power team. I think there's still a little room for improvement (even after the LaMarre numbers are taken into account). I think that's going to come from Dufek as he continues getting hot.
boom! exactly what i was looking for'd
Great post. I really liked the statistics graphs.
Great work FA, although I might add that there are many SABR sites on these interwebs that have shown Team OBP correlates better to team runs than does Team SLG.
I would agree with them depending on the team strategy,
which is why I make sure that particular graphic makes it on here, too. EDIT: which is why I wished that the interactive graphic would have made it through. It had the on-base comparison. It was pretty close to the same look as the slugging, with a bit more in Toth's favor.
If there was a way for me to quantify how Michigan produces runs, I think you'd see that Michigan has been pretty weak at manufacturing runs the last two years. The team has relied much more on the big hit. While you still need runners on for that, I get the feeling that the on-base has been a lesser emphasis than power to the gaps. I prefer the balance and was one to rail on our lack of on-base last year, especially before the Fellows/Toth switch in the lineup.
I personally prefer OPS as the combination, but I haven't found a college OPS line to compare to the MLB average. In other words, I'm not sure that a 1.000 OPS is that good in college or not. I would assume so, but I don't have the complete numbers.
Those sites are working with major league data, and their findings apply to the major leagues. What you want to know is where the greatest separations in talent per league reside. For instance, it's not true that major league pitchers have no ability to control whether a batted ball goes for a hit. It's true that they all have approximately the same talent. The difference on a per-pitch basis is much smaller than the difference in the quality of defense received, for one. So when you run a regression with limited seasons, you get nearly no correlation.