FH302: Introduction to Fantasy Hockey Statistics

05/20/2007 10:53 PM

FS 302 – Introduction to Fantasy Hockey Statistics
Contributed By: Richard Lomax

This is the first in a series of articles on fantasy hockey statistics.  The idea is to teach you a little about (a) statistics, (b) fantasy hockey, and (c) using statistics to improve your team.  The plan is to start at the introductory level and work up to more complex types of statistics.  In this article we start off with the most basic statistics.  But rest assured, you will be challenged to drop the mitts at some point.  Visors are optional in this curriculum, so keep the sticks down.

First a little about my qualifications: (1) statistics professor for 25 years; (2) played high school, college, and senior hockey; (3) 15 years of coaching experience; (4) 14 years of fantasy hockey in several leagues (obviously including some statistical analysis); (5) member of the American Statistical Association and the Society for International Hockey Research; and (6) publications that include five statistics textbooks and a book chapter on fantasy sports.  That’s enough about me; now strap on the blades and get out on the ice.

Standing around at center (centre) ice, the coach says, “So why are you trying out for the Stat Geeks hockey team?  Couldn’t hang with the big boys?”
One player says, “My fantasy team is in the basement and I need major help.  I don’t think very highly of statistics, but I am willing to try anything to improve my squad and make the playoffs.  Can you help us, coach?” After many laps around the rink, multiple directives of “again”, and lots of sweat, the boys have cleaned up and are sitting around the chalkboard in the locker room…

Averages are really mean.
Say you want to know the average salary of your fantasy team.  It turns out that there are three types of averages in statistics, so the term average is deceptive.  The first type of average is the mean.  If you take the total or sum of all of the salaries on your team and divide by the number of players on your team, this is known as the mean salary.

Mean salary = (sum of salaries) / (# of players)

You’ve been computing the mean all of your life, but you might not have known the proper term.

However, the mean is not always appropriate.  Say that most of your players’ salaries are between a half million and two million dollars.  But you also have one stud making ten million (I made the same mistake as the Boston Bruins in signing Marty Lapointe to an outrageous contract; wasn’t the smartest move, eh?).  This particular player’s salary is known as an outlier because it is very different from the rest of the players’ salaries.  If you compute the mean salary in this particular situation, that one contract drags it upward; on a short roster it could be something like four million.  In other words, the mean salary is not near most of the player salaries on your team and thus does not represent your team salary very well.

Are there any alternatives?   The second type of average is the median.  The median is the middle value, so that half of the salaries are above the median and half of the salaries are below the median.  For example, if you have 21 players and you list their salaries from highest to lowest, the middle salary will be the 11th one, as there will be 10 higher player salaries and 10 lower player salaries.  An advantage is that outliers have much less influence on the median than on the mean.  Therefore, the bottom line is this: you want an average that does a nice job of representing all of the values.  If you have a few outliers, values that are very different from the rest, then you should use the median.  Otherwise the mean is preferred.
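To see the outlier effect on the chalkboard, here is a minimal Python sketch.  The roster and its salaries (in millions) are made up for illustration; only the idea of one ten-million stud among cheaper players comes from the discussion above.

```python
from statistics import mean, median

# Hypothetical roster: salaries in millions of dollars (made-up numbers).
# Six players earn between 0.5 and 2 million; one stud earns 10 million.
salaries = [0.5, 0.75, 1.0, 1.0, 1.5, 2.0, 10.0]

mean_salary = mean(salaries)      # sum of salaries / number of players
median_salary = median(salaries)  # middle value once the salaries are sorted

print(f"Mean salary:   {mean_salary:.2f} million")
print(f"Median salary: {median_salary:.2f} million")
```

With these numbers the mean comes out near 2.39 million, higher than every salary except the outlier, while the median sits at 1 million, right where most of the roster is.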

There is one final type of average, called the mode.  This is the value that occurs most often, in other words, the most frequently occurring value.  For example, if the mode salary for your team is 1 million, then more players on your team make 1 million than any other salary.  The mode is a quick and dirty measure of average, handy if you need an answer immediately, but otherwise it is not very useful.  Since basic statistics are widely available in computer software and on calculators, and laptops are on your bench already, there is no reason to use something quick and dirty anymore (other than that scrappy energy player you just brought up from the minors).  Go for accuracy!
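If you do want the mode, a short sketch like the following would find it.  The salaries are invented for illustration, and it assumes Python 3.8 or later, where the standard library's statistics.multimode is available.

```python
from statistics import multimode

# Hypothetical salaries in millions of dollars (made-up numbers).
salaries = [0.5, 1.0, 1.0, 1.0, 1.5, 2.0, 2.0]

# multimode returns every value tied for most frequent, as a list.
modes = multimode(salaries)
print(modes)

# A roster can have more than one mode when values tie for most common.
tied = multimode([1.0, 1.0, 2.0, 2.0])
print(tied)
```

Here three players make 1 million, so the first list holds only 1.0.  The second call shows one reason the mode is a rough tool: when two salaries are equally common, you get two answers.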

Assignment #1
You thought you could get away without doing any work in this course?  Guess again, chin strap!  Here is today’s question.  How do you compute GAA (goals against average) and why does it have to be so darn complicated?  At the end of today’s lesson we will discuss this assignment, so don’t scroll down now unless you want to spend two minutes in the sin bin with Tie Domi (and as your coach, I will find out).  HINT: By its name we know that GAA is some sort of average.  So take a couple of minutes to think about this one and jot down your answers…

My scores are all over the place, coach.
Let’s look at one more important type of statistic today before we head out for a cold one.  The GMs in your league have reported their team salary means, but being the inquisitive type you want to know more.  For example, two teams in your fantasy league have a mean salary of 3 million, the Stanley Stinkpots and the Atlanta Aces.  That is, both teams have the same mean salary.  What does this tell us about individual player salaries?  Are they about the same for the two teams?  Or do salaries tend to be close to the mean for the Stinkpots and spread out quite a bit for the Aces?   No type of average can answer any of those questions.

You need a different kind of statistic to determine the variability or spread of the salaries.  Two measures of variability are most commonly used, the range and the variance (and its wingman, the standard deviation).  The range is the difference between the largest value and the smallest value.  So if the largest salary is 3 million and the smallest salary is a half million, then the range is 2.5 million.  The larger the range, the more the scores are spread out.  However, this is also a quick and dirty index because it only involves two players, the largest and smallest salaried players.
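The range takes one line of code.  This sketch uses a made-up salary list whose largest and smallest values match the example above (3 million and a half million); the three salaries in between are assumptions.

```python
# Hypothetical salaries in millions of dollars; only the 3.0 and 0.5
# endpoints come from the example in the text.
salaries = [0.5, 1.0, 1.5, 2.0, 3.0]

# Range = largest value minus smallest value.
salary_range = max(salaries) - min(salaries)
print(f"Range: {salary_range} million")
```

Notice that the three middle salaries could be changed to anything between 0.5 and 3.0 without moving the range at all, which is exactly why it is a quick and dirty index.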

If you want something that takes all of the players into account, not just the two most extreme, then the variance is the measure you want to use.  All statistical software, and even non-statistical programs like Excel, will compute basic statistics for you.  But if you insist on having an equation, the variance is computed by taking the difference between each player’s salary and the mean (i.e., each player has a deviation or difference from the mean value), squaring that deviation, summing the squared deviations across all of the players, and then dividing by the number of players.  The equation for the variance is the following.

Variance = (sum of (salary - mean)^2) / (number of players)

The larger the variance, the more the scores are spread out from the mean.  The smaller the variance, the more the scores tend to pile up right around the mean.  So if the Aces have a larger variance for salary than the Stinkpots, then the Aces salaries tend to be further from the mean of 3 million (some superstars and some low salaried grinders), while the Stinkpots salaries tend to be rather close to the mean of 3 million (no real superstars, but no cheap players either).

One characteristic of the variance is that it is measured in squared values.  So for salary, the variance is measured in dollars squared.  Some folks don’t like this, so they take the square root of the variance to get a measure in dollars.  This is known as the standard deviation.  Whether you use the variance or standard deviation to assess the spread of the scores is really personal preference, as one is the square root of the other.  The larger the number, the more spread; the smaller the number, the less spread.  Thus comparing the salary variances for two teams will tell you the same as comparing the salary standard deviations for the same two teams.
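Here is a rough sketch of the variance and standard deviation for the Stinkpots and the Aces.  The individual salaries (in millions) are invented so that both teams have a mean of 3 million; the function names are just illustrative.  The variance function follows the equation given earlier, dividing by the number of players.

```python
import math


def variance(values):
    """Mean of the squared deviations from the mean (divides by n)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def standard_deviation(values):
    """Square root of the variance, back in the original units."""
    return math.sqrt(variance(values))


# Hypothetical salaries in millions; both teams average 3 million.
stinkpots = [2.5, 3.0, 3.0, 3.0, 3.5]  # bunched up near the mean
aces = [0.5, 1.0, 3.0, 4.5, 6.0]       # superstars and cheap grinders

for name, salaries in (("Stinkpots", stinkpots), ("Aces", aces)):
    team_mean = sum(salaries) / len(salaries)
    print(f"{name}: mean={team_mean:.2f}, "
          f"variance={variance(salaries):.2f}, "
          f"sd={standard_deviation(salaries):.2f}")
```

With these made-up rosters the Stinkpots' variance is 0.1 and the Aces' is 4.3, even though the means are identical.  If you would rather not roll your own, Python's statistics.pvariance and statistics.pstdev use the same divide-by-n form; statistics.variance and statistics.stdev divide by n - 1 instead, so they give slightly larger numbers.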

Just one more thing, again!
Let us finish up today’s lesson with a couple of other quick tidbits.  These are intended to be review items from elementary school, so don’t get insulted, Junior.
First we have totals or sums.  For instance, total points (PT) is the sum of goals (G) and assists (A), or PT = G + A.  Hey, you said that you didn’t know anything about statistics!  Obviously we can compute all sorts of totals such as GP (total games played), PIMs (total penalty minutes), W (total wins), SO (total shutouts), MP (total minutes played), EN (total empty net goals), and SV (total saves).

Second we have percentages.  Take, for example, save percentage (SV%).  This is the number of saves (SV) divided by the number of shots on goal (SA or shots against) times 100% (this puts things on a 0% to 100% scale).  So essentially it is the number of shots that the goalie saved divided by the total number of shots on goal (where SA is actually the sum of saves plus goals allowed) times 100%.  In general, a percentage is the number of times a particular event (e.g., saves) occurs divided by the total number of events (e.g., shots = saves + goals) times 100%.  Thus if a goaltender faces 50 shots, saving 40 and allowing 10 goals, then the percentage is 40/50 x 100% = 80%.  Power play (PP) and penalty killing (PK) statistics can be reported as percentages (e.g., 90% PK means that 90% of the penalties are successfully killed off by the defending team).  Note, however, that all of these statistics are often reported without multiplying by 100% (e.g., PK = .90 means the same as 90%).
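The save percentage arithmetic can be sketched as a small function.  The function name and the zero-shot check are assumptions added for illustration; the 40 saves and 10 goals are the numbers from the example above.

```python
def save_percentage(saves, goals_allowed):
    """Saves divided by shots against (saves + goals), on a 0-100 scale."""
    shots_against = saves + goals_allowed
    if shots_against == 0:
        # Assumption: with no shots faced, the percentage is undefined.
        raise ValueError("no shots against, save percentage is undefined")
    return 100 * saves / shots_against


# The goalie from the text: 50 shots faced, 40 saved, 10 goals allowed.
sv_pct = save_percentage(40, 10)
print(f"SV% = {sv_pct:.1f}%")  # prints "SV% = 80.0%"
```

Dropping the multiplication by 100 gives the decimal form (0.80 here) that many stat sheets report.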

Answer to Assignment #1
Yes, unfortunately GAA has to be a bit complicated and here is the reason.  Say that a goalie only plays 30 minutes in one game and is replaced for some reason (e.g., injury, leaking like a sieve, etc.).  For example, during his 30 minutes of play he lets in 5 goals.  We then need to pro-rate this over an entire game.  That is, if he were to play an entire game, we would expect him to give up 10 goals (and then go see the GM for a bus ticket to the minors).

More specifically, this is how GAA is calculated.  First we compute what we might call Real GP (real games played), where Real GP = MP / 60, with MP being total minutes played.  This is the pro-rated part.  If Felix the Cat played 540 minutes across a total of 10 games, then Real GP = 540 / 60 = 9.  In other words, he played the equivalent of 9 full games over that 10-game span, because he missed some minutes along the way (i.e., 60 total minutes were missed for various reasons).  Now we can compute GAA from Real GP and Goals Against (GA), where GAA = GA / Real GP.  So if Felix gave up 36 goals over those 10 games, then GAA = 36 / 9 = 4.00.  So his GAA = 4, when pro-rated across complete 60-minute games.  So how did you do on this assignment?  Are you going to need to skate a few more laps or are you ready for a frosty beverage?
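For those who like to check their homework with a laptop, here is a minimal sketch of the GAA calculation described above.  The function and parameter names are hypothetical, and the guard against zero minutes is an added assumption; the inputs are the two examples from this answer.

```python
def goals_against_average(goals_against, minutes_played, game_length=60):
    """GAA = GA / Real GP, where Real GP = MP / 60."""
    if minutes_played <= 0:
        # Assumption: GAA is undefined for a goalie with no ice time.
        raise ValueError("minutes played must be positive")
    real_gp = minutes_played / game_length  # the pro-rated games played
    return goals_against / real_gp


# Felix the Cat: 36 goals against in 540 minutes (9 real games played).
felix_gaa = goals_against_average(36, 540)
print(f"Felix GAA: {felix_gaa:.2f}")  # prints "Felix GAA: 4.00"

# The pulled goalie: 5 goals in 30 minutes pro-rates to 10 per full game.
pulled_gaa = goals_against_average(5, 30)
print(f"Pulled goalie GAA: {pulled_gaa:.2f}")  # prints "Pulled goalie GAA: 10.00"
```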
