Monday, June 17, 2013
Question 10: How do we best predict the outcome of games or series?
We have relished the question, and hopefully you have been following our website to see us actively publish our daily NBA predictions (winner and spread). Many people ask how we make our predictions and maintain such high accuracy. A few years ago I published an article that explains at a high level how I make predictions in the NBA. Since then, we have maintained a similar approach to predicting the winner of every game.
First, we try to keep things simple. It really comes down to how teams match up -- through their basic stats. The better shooting team will typically win games. Combine that aspect with strong defense and you are almost a sure thing. Typically, there are no surprises in the NBA -- standings are usually the same year after year (unless trades or injuries occur). Was anyone surprised that Miami and OKC finished at the top? Are we shocked to see the Spurs in the Finals (after Westbrook went down)? Although the Bobcats tried to mislead us at 7-5, did they not finish where they were supposed to?
Keeping to the idea of being simple, box scores can explain a lot about why a team wins. It takes a little craftiness to determine which box scores to use (home/away, last few games, vs conference, vs division, etc.). Another aspect is to determine which statistics are important. This can be accomplished through feature reduction by techniques such as principal component analysis or factor analysis. Finally, a strong prediction algorithm (such as artificial neural networks) can take a seemingly impossible problem and make it simple.
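To make the feature-reduction idea a little more concrete, here is a minimal sketch that finds the first principal component of a small stat table using power iteration. It is an illustration only and not the model we run: the function name, the stat columns and the numbers are all made up, and real box score stats would normally be standardized first so that one large-valued column does not dominate.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return the unit vector along which the rows vary the most."""
    n, d = len(rows), len(rows[0])
    # Center each column on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the columns.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly multiply by the covariance matrix and renormalize.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # no variance along v; keep the last vector
        v = [x / norm for x in w]
    return v

# Hypothetical per-game team stats: [FG%, rebounds, turnovers]
team_stats = [
    [0.47, 44.0, 13.0],
    [0.44, 41.0, 15.0],
    [0.49, 45.0, 12.0],
    [0.43, 40.0, 16.0],
]
component = first_principal_component(team_stats)
print(component)
```

Stats with large weights in the returned vector move together and carry most of the variation, which is one way to decide which columns are worth feeding to a prediction algorithm.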
So, in short, I don't think this is a problem to solve. We are doing it every day and have maintained an 81% accuracy in predicting the winner of every NBA game this season. Most of our incorrect decisions came at the beginning (not enough data) of the season and at the end (injuries, rest) of the season. Are there better stats out there to use -- maybe -- but why make a problem more difficult than necessary?
This concludes our examination into the 10 questions to solve. We will start posting some work we are currently doing and information related to other sports. Will the Spurs win in 6 -- our model says no...
Question 9: What is the market value for player performance?
Obviously this question is asked a lot, especially when people wonder why athletes are paid millions of dollars to play a sport. What warrants a high salary? Is LeBron James really worth $50 million like he thinks (or $40 million according to Pelton)? Are a lot of players overpaid?
I like Pelton's use of WARP (wins above replacement player), but I don't think this necessarily encompasses all of what a player is worth. In addition to the number of wins a player can provide a team, there are other aspects of revenue management that need to be included, such as the ability to generate ticket sales, sponsorship opportunities (as we will start to see on backboards next year) and other revenue-generating circumstances.
Are players paid what they are worth -- I think it really comes down to a comparison to other players. It's not the dollar amount that is of interest (although a GM or owner might say otherwise), but rather it is how much a player makes compared to others. LeBron should have the highest salary in basketball, but chose not to in order to help form the Big 3. Therefore, does it really matter if players are paid what they are worth (Chris Paul and Dwight Howard in 2013)? It only really matters that players aren't overpaid (Kobe anyone?) and organizations aren't "losing" money on their supposed stars.
If it came down to player performance through various statistics (personal and +/-) and WARP, sign me up! Otherwise (and unfortunately) it seems to come down to perception, mania (Lin and Terry), agent skills and greed that drive salaries up and down -- and pure stats lose the battle here...
Thursday, June 13, 2013
Question 8: How do statistics translate from other leagues to the NBA?
As luck would have it, we are looking at this exact problem. We have been asked in the past if one could determine which players in the European leagues would do well in the NBA. However, we are currently examining which players at the collegiate level will do well in the NBA. I cannot yet get into the specifics, but we are able to determine with high accuracy which players fit into specified categories (level of NBA talent) based solely on box score data in college.
Pelton discusses the use of translations to determine how college players will do in the NBA. He notes that while this is successfully used in baseball, translating statistics is much more difficult in basketball. We are taking a somewhat different look at the problem.
Every NBA player has been categorized based on their talent/success in the NBA. This could be by statistics, influence or any number of subjective measures. We then break these groups up by position so that we do not measure centers the same as point guards (Pelton did the same). Each position has its own unique set of statistics that explain each player (common techniques such as principal component analysis or factor analysis can be applied here).
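A minimal sketch of why the position split matters: the same raw number can be ordinary for a center and exceptional for a point guard. The player names and numbers below are hypothetical, and the within-position z-score is only a stand-in for whichever per-position statistics end up being chosen.

```python
from collections import defaultdict
from statistics import mean, pstdev

def zscores_by_position(players, stat):
    """Standardize one stat within each position group."""
    groups = defaultdict(list)
    for p in players:
        groups[p["position"]].append(p[stat])
    scores = {}
    for p in players:
        values = groups[p["position"]]
        mu, sigma = mean(values), pstdev(values)
        # A group with no spread gets a neutral score instead of dividing by zero.
        scores[p["name"]] = 0.0 if sigma == 0 else (p[stat] - mu) / sigma
    return scores

# Hypothetical college rebounds per game
players = [
    {"name": "Center A", "position": "C", "reb": 10.0},
    {"name": "Center B", "position": "C", "reb": 12.0},
    {"name": "Guard A", "position": "PG", "reb": 3.0},
    {"name": "Guard B", "position": "PG", "reb": 5.0},
]
print(zscores_by_position(players, "reb"))
```

Here a guard with 5 rebounds stands out from his group exactly as much as a center with 12 stands out from his, which is the comparison a raw leaderboard hides.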
After capturing the right statistics for each group, we can then apply various prediction algorithms to determine which players belong in which groups. The beauty behind our algorithms is that they "see" things we could easily have missed!
I look forward to sharing our results. With this information, we take the guessing out of the draft and provide insight into the true floppers of the NBA.
A fantastic question by Kevin Pelton -- which should be number 1 on the list. If you claim to love Moneyball, this question should just scream potential to you!
Question 7: What role does coaching play in the success of teams and players?
The entire NBA nation might say a tremendous role since the Spurs' Popovich has shut down LeBron James with his "I dare you to shoot jumpers" defense. So far, the King has been dethroned.
Very little analytical work has been conducted on coaches, as pointed out by Pelton. Coach tenures are very short in the NBA with the exception of several notables. Next to Popovich (1996) and Doc Rivers (2004), the longest tenured coaches in the NBA are Spoelstra, Carlisle and Brooks since 2008. Each of these coaches has made the NBA Finals.
This fact alone shows some merit that the success of teams may rely on coaching. However, are the players growing? Are they getting better under direction? Or do these teams just have good players?
Analytically, I do not see the value in pursuing this question. Every coach approaches the game differently and every player accepts coaching advice in a different manner. The key is finding the right coach that fits the system in place (Spoelstra is a decent candidate) or building a system around the right coach (Doc and Popovich).
Friday, May 31, 2013
In the spirit of the playoffs, we are going to finish up with Kevin Pelton's 10 questions to solve.
Question 6: Do per-minute (or per-play) stats translate across changes in playing time?
In my opinion, this is one of the most exciting statistical questions in this list. Each year we see a new 6th man of the year -- it seems that with more playing time they do better and better (with the exception of JR Smith in the playoffs). Lance Stephenson has taken the bull by the horns with his increased minutes this year.
Recently we did a study on a particular Houston Rockets player to measure his per minute stats throughout a game. Ideally, this player was hoping for more playing time based on particular statistics. An important aspect was rebounds per minute. The chart below shows that Houston had more rebounds per minute when this player was off (red line) the court versus when he was on (blue line) the court.
However, if we put our tunnel vision on, we may fall into the trap of saying that this player is detrimental to the team in terms of rebounding the ball. When examining other metrics, we find that Houston has far fewer rebound opportunities when the player is on the court versus off -- thus leading to fewer rebounds per minute. In addition, field goal attempts were much lower as well. Two words - Good Defense.
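The trap described above is easy to reproduce with a few lines of arithmetic. The totals below are hypothetical and are not the Houston numbers; they are only chosen so that rebounds per minute and rebounds per opportunity point in opposite directions.

```python
def per_minute(count, minutes):
    """Raw rate: how many events happen per minute played."""
    return count / minutes

def per_opportunity(count, opportunities):
    """Share of available chances that were converted."""
    return count / opportunities

# Hypothetical team totals with one player on and off the court
on_court = {"minutes": 600, "rebounds": 240, "opportunities": 450}
off_court = {"minutes": 400, "rebounds": 180, "opportunities": 380}

on_per_min = per_minute(on_court["rebounds"], on_court["minutes"])
off_per_min = per_minute(off_court["rebounds"], off_court["minutes"])
on_rate = per_opportunity(on_court["rebounds"], on_court["opportunities"])
off_rate = per_opportunity(off_court["rebounds"], off_court["opportunities"])

print(f"rebounds per minute:      on {on_per_min:.3f}, off {off_per_min:.3f}")
print(f"rebounds per opportunity: on {on_rate:.3f}, off {off_rate:.3f}")
```

With these made-up totals the team grabs fewer rebounds per minute with the player on the court, yet converts a larger share of the rebounds that are actually available, simply because there are fewer chances to begin with.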
Could this type of stat analysis lead to determining more playing time for players? I think so -- it steps outside the bounds of how we normally look at per minute stats. It would be great to see more research done in this area -- especially its effects come playoff time!
Thursday, April 18, 2013
To recap, we correctly predicted the winner in 81.04% of NBA games over the entire season. Since our website debut in early March, we maintained a respectable 77.03% accuracy. In terms of covering the spread, our season average held an astonishing 72.71% accuracy rate. Since early March, our spread accuracy was lower at 61.54%. We only saw three days since March 4 with a record < .500.
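For anyone who wants to keep the same kind of scorecard, here is a small sketch of how overall accuracy and sub-.500 days can be tallied from a list of picks. The dates and results in the example are invented for illustration and are not our actual pick log.

```python
from collections import defaultdict

def overall_accuracy(picks):
    """Fraction of picks that were correct."""
    picks = list(picks)
    return sum(1 for _, correct in picks if correct) / len(picks)

def daily_records(picks):
    """Map each date to a (wins, losses) tuple."""
    tallies = defaultdict(lambda: [0, 0])
    for day, correct in picks:
        tallies[day][0 if correct else 1] += 1
    return {day: (wins, losses) for day, (wins, losses) in tallies.items()}

def losing_days(picks):
    """Dates with a record below .500, i.e. more misses than hits."""
    return sorted(day for day, (wins, losses) in daily_records(picks).items()
                  if wins < losses)

# Hypothetical pick log: (date, was the pick correct?)
pick_log = [
    ("2013-03-04", True), ("2013-03-04", True), ("2013-03-04", False),
    ("2013-03-05", False), ("2013-03-05", False), ("2013-03-05", True),
]
print(overall_accuracy(pick_log))
print(daily_records(pick_log))
print(losing_days(pick_log))
```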
Although our accuracies were lower at the end of the season, this is to be expected - especially in the last two weeks. For most of March, our spread accuracy hovered around the 66% mark. As key players started to sit and injuries started to pile up, our model had a difficult time adjusting. For instance, the Spurs were crushing teams throughout the season -- our model thought they should maintain that consistency until the final day. Most people would not pick every game -- we chose not to discriminate! Whether it was a confidence of 1% or 99%, we made our pick.
We hope you enjoyed following our NBA picks. Remember you can check out all our previous picks and results on the NBA Game Predictions page of our website. We will soon be moving onto MLB -- expect us to roll out our model in late May (our model needs more pitching data).
Tuesday, April 2, 2013
Taking a look at their bracket, I was excited but not surprised. What excited me -- their model chose Florida to win the National Championship -- same as Perduco Sports. Of course, we now know that is impossible since they were shredded by the Wolverines. However, our use of neural networks and their use of regression/Markov models seem to show that statistically, Florida should have been a serious contender for the title game!
However, what didn't surprise me was their lack of upset picks throughout the tournament. Their only true upset picks came at the 7-10 and 6-11 matchups. Not one 5-12 matchup was selected, which is surprising since nearly 40% of the 12 seeds have won their first-round games in the last 10 years. This lack of upset picks indicates they follow the RPI system very closely -- although they claim to be better than this system.
Unfortunately, 2013 appears to be a bust for Georgia Tech's model. Although "madness" is very difficult to predict, I was pleased to see another advanced analytical model obtaining similar results to our approach.
Perduco Sports had 5 of the Elite 8 teams (just missing Kansas), with all of our Final Four teams still fighting. There were 16 different possible ways this last Sat/Sun could have gone down. Luck was not on our side as the one possible combination occurred where we didn't have one team in the Final Four. There's always next year!
Monday, March 25, 2013
The amazing thing is that La Salle, a play-in game participant, is still in the tournament with a good chance to advance to the Sweet 16. Now another case can be made -- is the First Four enough? It would appear not. Can we open the tournament up to more teams? Should we?
Typically, the At-Large bids for the tournament do not go higher than the 12th seed. Those teams grabbing the final spots are known as the bubble teams. This year, Wildcat nation got the first of two major upsets when they were denied the chance to defend their national title by having their bubble burst. For many, the most popular first-round "upset" pick is the 5v12 matchup. From 2002-2012, the #5 seed held the advantage with a record of 26-18. This year, the #5 seed went 1-3 to close the gap at 27-21. This means that almost half the time the bubble team defeats a top 20 caliber team.
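The 5v12 arithmetic above can be checked in a few lines. The records are the ones quoted in this post; the helper names are just for the example.

```python
from fractions import Fraction

def add_records(a, b):
    """Combine two (wins, losses) records."""
    return (a[0] + b[0], a[1] + b[1])

def win_share(wins, losses):
    """Exact fraction of games won."""
    return Fraction(wins, wins + losses)

five_seed_2002_2012 = (26, 18)  # #5 seed wins, #5 seed losses
five_seed_2013 = (1, 3)

combined = add_records(five_seed_2002_2012, five_seed_2013)
# A #5 seed loss is a #12 seed win, so swap the order for the underdog's share.
twelve_seed_share = win_share(combined[1], combined[0])

print(combined)                  # (27, 21)
print(float(twelve_seed_share))  # 0.4375
```

So the #12 seed has won 21 of 48 meetings, or 43.75%, up from 18 of 44 (about 41%) before this year's tournament.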
It is very difficult for a #12 seed to reach the Sweet Sixteen (it has happened only 19 times in tournament history). It is even more difficult to win a Sweet Sixteen matchup, where #12 seeds have a record of 1-18. However, aren't the first two rounds what make March Madness exciting? Don't we love it when Duke and Missouri fall in the first round in the same year? I think nearly every basketball fan was either streaming the FGCU game last night or at least clicking refresh every two seconds to see the result.
So let the debate begin. Should we add more play-in games? Should the play-in games consist of better matchups? Does anyone even watch the NIT unless your hometown team is in the tournament? What's wrong with a little more basketball...
Friday, March 22, 2013
A Curry-less Davidson proved to be a strong matchup against Marquette. They essentially led the entire game with a couple of lead changes midway (as seen in the Statsheet.com chart below).
Wednesday, March 20, 2013
Although we haven't spent as much time as we would like in setting up this year's tournament picks, we still had to post our bracket! The entire bracket was created using statistical analysis techniques such as neural networks, factor analysis and a few others -- no subjective input.
To create the bracket, we utilized the last 10 years' worth of tournament results. We developed models for every possible seed pairing and used season statistics to determine who would be the victor of each matchup. These models are similar to the ones we use for predicting NBA games and spreads.
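To show what "one model per seed pairing" can look like in its simplest form, here is a sketch that keys results by seed pairing and computes how often the better seed won. This historical win rate is only a baseline stand-in and not the neural network models used for the bracket, and the game list is hypothetical.

```python
from collections import defaultdict

def seed_pair_win_rates(games):
    """games: iterable of (winner_seed, loser_seed) tuples.

    Returns {(better_seed, worse_seed): share of games won by the better seed}.
    """
    tallies = defaultdict(lambda: [0, 0])  # [better-seed wins, total games]
    for winner_seed, loser_seed in games:
        if winner_seed == loser_seed:
            continue  # equal seeds have no favorite to track
        key = (min(winner_seed, loser_seed), max(winner_seed, loser_seed))
        tallies[key][1] += 1
        if winner_seed < loser_seed:
            tallies[key][0] += 1
    return {key: wins / total for key, (wins, total) in tallies.items()}

# Hypothetical results
past_games = [(5, 12), (12, 5), (5, 12), (1, 16)]
print(seed_pair_win_rates(past_games))
```

A fuller version would replace the win rate stored under each key with a model trained on the season statistics of the two teams in that pairing.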
We anticipate a strong showing this year -- but next year we should be in full force with optimal NCAA algorithms to pick the perfect bracket!
Enjoy our picks!
Monday, March 18, 2013
Question 5: How much effect do players have on their teammates’ statistics?
I am not sold on the idea that this is really a question that needs to be answered. I think the more important question is how players affect the team’s statistics – not individual teammates. Yes, it is true that high scorers, elite rebounders and elite passers will take away shots, rebounds and assists from other teammates, but this isn’t necessarily what is of interest (unless you are an agent trying to get your player more stats). The true question is what effect the player has on his team while he is on or off the court.
The most common ways to look at this problem are the +/- or different per minute statistics. From this, we can see what overall effect the player had on the court. Basketballvalue.com does a fantastic job at looking at the efficiency of players in different lineups. They aggregate the entire season and see how each player did with every possible lineup throughout the season.
Currently, Perduco Sports is examining how a player contributes in different areas (shooting, rebounding, turnovers, etc.) throughout each game. We want to determine at what point during a game a particular player is more efficient and has a positive effect on their team. The graphs below show the difference in Houston stats (turnovers per minute and personal fouls per minute) when Patrick Patterson was on the court (blue) versus when he was off the court (red). We analyzed each minute of the game as whole minutes. Next, we looked at the opponents' stats (points per minute and rebounds per minute) when Patterson was on the court versus off the court.
Looking at analysis like this allows analysts, coaches, and teams to see the true impact of a player during the game. This is just the beginning phase in determining how a player contributes to their team’s success. After all, basketball is a team sport!
Friday, March 15, 2013
Now we start getting to the hefty statistical questions presented by ESPN Insider Kevin Pelton.
Question 4: How do players’ roles on offense affect their efficiency?
Since joining the sports domain and talking to analysts across all of the major sports, this question seems to hit the mark on one of the most important questions to answer. Each player on the court plays a different role – whether they are volume scorers, rebounders or key defenders. The important thing to keep in mind is that not all roles can be compared the same. Therefore, one efficiency metric is not sufficient in this type of problem.
Eli Witus’ article is well written and discusses usage versus efficiency between highly used players and low usage players. Eli uses individual offensive rating and individual possessions as metrics. Again, it all comes down to what metric is used to measure efficiency. It is important to remember to classify players differently based on their roles.
At Perduco, we are examining this problem even closer. To truly measure efficiency versus usage, it is important to take into play fatigue and where in the game fatigue occurs. It’s possible that a starter is efficient for three quarters and sees diminishing returns when reaching the 40 minute mark. However, that same player could excel in 4th quarters. This leads to the assumption that there is fatigue or poor efficiency somewhere else in the game.
The graph below shows some very preliminary work being done to show per minute efficiencies of players. We are examining how a player does at each minute across an entire season (i.e. is the player more effective at different points in the game?). The graph shows the pts/min for Patrick Patterson when he played for the Houston Rockets in the shortened 2011-2012 season. We can see a dip in performance at the beginning of second halves throughout the season. Patterson saw lower usage at these game times and a lower efficiency (we are only showing pts/min) as well. We will continue to post our results that will look at how players perform throughout the game rather than just games in general.
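The aggregation behind a chart like this is straightforward. Below is a minimal sketch that averages points by game minute across every minute a player was on the court; the log format and the numbers are hypothetical and are not Patterson's actual data.

```python
from collections import defaultdict

def avg_points_by_game_minute(minute_log):
    """minute_log: iterable of (game_minute, points) tuples, one entry for each
    whole minute the player was on the court in any game of the season.
    """
    totals = defaultdict(lambda: [0, 0])  # game_minute -> [points, minutes played]
    for game_minute, points in minute_log:
        totals[game_minute][0] += points
        totals[game_minute][1] += 1
    return {minute: pts / played
            for minute, (pts, played) in sorted(totals.items())}

# Hypothetical entries: minute 1 of two games, minute 25 of two games
season_log = [(1, 2), (1, 0), (25, 0), (25, 1)]
print(avg_points_by_game_minute(season_log))
```

Dividing by the minutes actually played at each point of the game, rather than by the number of games, keeps minutes the player usually spends on the bench from looking like slumps.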
Friday, March 8, 2013
Question 3: What is the best way to develop young players?
No matter how statisticians approach this problem, this could be the most difficult topic on which to convince coaches and staff that statistics can answer the question of how to develop NBA players. Every player is different and a combination of talent and coachability comes into play. Some players thrive with a mentor present while others need to find their own path to greatness. In the NFL, would Aaron Rodgers be the quarterback he is today if he started playing immediately in his career?
Too much variation exists among how players are developed – even from the young toddler stage. Trying to use statistics to answer this question is almost impossible since there is so much missing data and backstory that all play a role in player development (in terms of how Pelton describes the problem). It’s hard enough to convince coaches on how to execute other aspects of the game based on statistics – trying to influence how to develop/coach a player based on “unproven” statistics could be a bit far-fetched.
Pelton mentions the role of minutes young players play. I believe this could be a good research problem in terms of how players do over their career based on the minutes they played early on. However, this does not answer any part of the question of how NBA players develop. Nothing is taken into account about the type of coach, mentors, teammates, previous coaching, etc.
Although an interesting and important issue, this author believes this problem is more of a philosophy question rather than one to be answered by statistical analysis.
Wednesday, March 6, 2013
Question 2: How do basketball players age?
Pelton references a WSJ article that states that 25 is the age at which basketball players peak in performance. Pelton performed his own analysis and found a peaking age of 27.1, which is in line with MLB players. Both of these analyses focus on wins per season or improvements with respect to wins.
An interesting problem indeed – yet, approached in a limited fashion. First, as the authors have admitted, not all players age the same. But to choose essentially one metric to classify how a player ages is too simplistic. Why not look at how players at different positions age differently? Is it appropriate to put Shaq and Stockton in the same group to measure age – I think not.
Pelton discusses how rebounding age peaks earlier, hence why his age is 27 and the WSJ found an age of 25. Not every basketball player is the same; therefore, let’s not classify them all the same. In addition, every study is always looking at LeBron, Kobe, Duncan, etc. If we are asking how elite players age, that is a different question than how a basketball player ages. Elite players are on their own level. Let’s reach back to all the other players in the league that seem to be around for a long time, yet may not have peaked such as Redick, Delfino, and Novak.
I’m not sold on the idea of wins per season as the metric of a player peaking in age. It should be based on their position and expected contribution over time (given numerous stats and the position’s expectations). Every player contributes differently and simply because a player does not produce wins per season doesn’t necessarily mean a player has peaked as an individual basketball player.
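As a sketch of the position-based view argued for here, the snippet below finds the age with the best average production separately for each position. The season rows and the generic "value" metric are hypothetical; any stat that fits the position's expectations could be substituted.

```python
from collections import defaultdict
from statistics import mean

def peak_age_by_position(seasons, metric):
    """seasons: list of dicts with 'position', 'age' and the metric key.

    Returns {position: age with the highest average metric}.
    Ties go to the age that appears first in the data.
    """
    by_position = defaultdict(lambda: defaultdict(list))
    for season in seasons:
        by_position[season["position"]][season["age"]].append(season[metric])
    peaks = {}
    for position, ages in by_position.items():
        peaks[position] = max(ages, key=lambda age: mean(ages[age]))
    return peaks

# Hypothetical player-seasons
seasons = [
    {"position": "C", "age": 24, "value": 9.0},
    {"position": "C", "age": 25, "value": 11.0},
    {"position": "C", "age": 28, "value": 8.0},
    {"position": "PG", "age": 25, "value": 5.0},
    {"position": "PG", "age": 29, "value": 8.0},
    {"position": "PG", "age": 29, "value": 6.0},
]
print(peak_age_by_position(seasons, "value"))
```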
Tuesday, March 5, 2013
ESPN Insider Kevin Pelton, inspired by the MIT Sloan Sports Analytics Conference, wrote an excerpt about 10 questions to be answered in the NBA, which are similar to David Hilbert’s 23 problems in mathematics and Keith Woolner’s 23 problems in Baseball. Over the next 10 days, we will examine each question that Pelton poses and give our insight into the approach of the problem.
Monday, March 4, 2013
This data is truly the future of all sports analytics. Kopp discussed every possible way that coaches and owners could use this data to improve their team. There are numerous pros and cons when dealing with this data and we are going to indulge in a couple.
Thursday, February 28, 2013
We, as analysts, need to avoid the trap of displaying simple box score data in cool 6-D plots that no one else has and call it new insight. Although I am all for visual display of data – Brian Burke does an excellent job displaying unique analysis – we need to keep in mind that true analysis comes from really searching the data for new insight. Apply data mining, use machine learning, execute simulations – use whatever mathematical technique it takes to figure out statistically why the Ravens won the Super Bowl.
Check back often for updates to our projects, blog and game predictions or follow us on Twitter (@PerducoSports). We welcome any and all questions. Enjoy our site!