MLB Projections
This post attempts to erase any doubt about the effectiveness of Daisuke Matsuzaka in the Major Leagues. It is intended particularly for those skeptics who have never seen the ace pitch a single game, but are willing to write him off because of the big money he will cost and the level of competition he has faced in his career. There is a method to my scouting and analysis here at Matsuzaka Watch. It is born in part from Sabermetrics, as applied by Jim Albright of Baseball Guru.
Albright is a very bright guy. He's dedicated himself to applying SABR analysis to Japanese baseball, with a very limited statistical set and a lot of creativity. His work is imperfect, but highly effective. There is clearly some margin of error, but any differences in projections are purely due to the relatively rough data set and the inevitable outside factors that come with cross-cultural adjustments and lifestyle changes. His work on predicting Japan-to-MLB production has been very interesting and for the most part has been refined to a very respectable level. You can read it for yourself.
Here, I will attempt to demonstrate the projection for Matsuzaka in a similar way. There are a couple of important differences. I will use the formulas that Albright has developed, but instead of a generic prediction I will use the exact data set to project Matsuzaka's record on the Yankees in 2005 and 2006. This blog is intended to inform all fans about the Japanese ace, but it is first a spin-off of my Yankee blog, Canyon of Heroes, and as such will continue to be devoted to bringing Matsuzaka to New York. Apologies to the good fans of other clubs. This information will be valuable to all of you too. Follow me...
Albright has used a large cross section of data for players crossing the ocean to play in both NPB and MLB and has found some important ratios that capture the differences in performance in the categories of hits, home runs, walks, and strikeouts. This data is useful in projecting Component ERA, which can in turn be extrapolated to Pythagorean Expectations on wins and losses. You can read the info in Albright's article, and the Wikipedia listings for the other metrics, if you don't understand them but want to learn. Otherwise, just try to catch the general sense of this.
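For anyone who wants to see the moving parts, here is a rough sketch of that chain in Python: scale a Japanese stat line by conversion ratios, turn it into a Component ERA, then run it through a Pythagorean expectation. Treat it as an illustration and not as Albright's actual method. The conversion factors are placeholders I made up (his real ratios are in his article), the sample stat line is invented, the Component ERA formula is the Bill James version as it is commonly stated (without intentional walks), and the names `PitchingLine`, `translate`, `component_era`, and `pythagorean_record` are mine. The record step assumes one decision for every nine innings and treats the Component ERA as the pitcher's runs allowed per nine.

```python
from dataclasses import dataclass


@dataclass
class PitchingLine:
    ip: float   # innings pitched
    bfp: float  # batters faced
    h: float    # hits allowed
    hr: float   # home runs allowed
    bb: float   # walks allowed
    hbp: float  # hit batters
    so: float   # strikeouts (carried along; Component ERA does not use them)


# HYPOTHETICAL NPB-to-MLB multipliers. Placeholders only, not Albright's ratios.
PLACEHOLDER_FACTORS = {"h": 1.05, "hr": 1.30, "bb": 1.10, "so": 0.90}


def translate(line: PitchingLine, factors: dict) -> PitchingLine:
    """Scale each counting stat by its conversion factor."""
    h = line.h * factors["h"]
    bb = line.bb * factors["bb"]
    # Assumption: extra hits and walks mean extra batters faced.
    bfp = line.bfp + (h - line.h) + (bb - line.bb)
    return PitchingLine(
        ip=line.ip, bfp=bfp, h=h,
        hr=line.hr * factors["hr"],
        bb=bb, hbp=line.hbp,
        so=line.so * factors["so"],
    )


def component_era(p: PitchingLine) -> float:
    """Bill James's Component ERA, in the form that omits intentional walks."""
    ptb = 0.89 * (1.255 * (p.h - p.hr) + 4 * p.hr) + 0.475 * (p.bb + p.hbp)
    raw = 9 * ((p.h + p.bb + p.hbp) * ptb) / (p.bfp * p.ip)
    erc = raw - 0.56
    if erc < 2.24:
        # Low-end adjustment used in the commonly stated formula.
        erc = raw * 0.75
    return erc


def pythagorean_record(team_runs_per_game: float, runs_allowed_per_9: float,
                       innings: float, exponent: float = 2.0):
    """Return (wins, losses), assuming one decision per nine innings."""
    rs = team_runs_per_game ** exponent
    ra = runs_allowed_per_9 ** exponent
    win_pct = rs / (rs + ra)
    decisions = innings / 9
    wins = win_pct * decisions
    return wins, decisions - wins


if __name__ == "__main__":
    # Made-up sample line, NOT Matsuzaka's real numbers.
    npb_line = PitchingLine(ip=200, bfp=800, h=170, hr=15, bb=50, hbp=5, so=200)
    mlb_line = translate(npb_line, PLACEHOLDER_FACTORS)
    erc = component_era(mlb_line)
    # 5.0 runs per game is a placeholder for the team's offense.
    wins, losses = pythagorean_record(5.0, erc, innings=215)
    print(f"Component ERA: {erc:.2f}, projected record: {wins:.1f}-{losses:.1f}")
```

Swap in the real ratios, a real stat line, and the team's actual runs per game, and the same three steps give you a projected ERA and record.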
Matsuzaka's 2005 statistics for Seibu can be translated to MLB equivalents using the aforementioned ratios, and then an ERAC is created. Using the Yankees' 2005 Pythagorean data, and projecting Matsuzaka's participation in their ballgames over 215 innings, I came up with these results (1 decision for each 9 innings pitched):
That's Cy Young material. In the American League in 2005, those stats would rank Matsuzaka thus:
I projected the current 2006 season for Matsuzaka out to the expected 25 games pitched, and adjusted the current statistical pace he's on to final numbers. I subsequently converted those numbers to MLB equivalents, and put Daisuke in the context of the 2006 Yankees (1 decision for each 9 innings pitched).
Again, Cy Young candidate. The 2006 season is still in progress so this is a rough comparison and analysis, but it's not a stretch to put Daisuke among the league leaders in virtually every major category again. Remember, this data has been calculated according to the necessary "dumbing down" factors that accurately predicted the MLB projections for most of the players who have made the leap across the Pacific, give or take a few hundredths of a point here or there, and a few tummy aches from the oilier Western food and travel routine.
If you want to "dumb down" the stats even further because you're a pessimist, there's still room to make Matsuzaka a top frontline pitcher on any club in the Majors. Give him a 3.50 ERA and you've totally blown metrics and established data out of the water in favor of doubt, and you still have one of the best pitchers in the AL. There's no way around it. He's going to be a monster, and should command top dollar at 26 years old.