Just in time for the 2017 NCAA Men’s Basketball Tournament, I have put together a new statistical model to assess team strength and make predictions about the tournament. Two years ago I used a well-known model to make predictions about the NCAA Tournament, with limited success. The upcoming tournament is the first time the new model has been used, so this year is essentially a live test case. Following the tournament, I will provide more details of the model and its results in a paper I am preparing to submit for peer review. For now, though, I will use this page to share the findings as the tournament progresses.

How the Model Works

The model predicts the scores of each game using the strength of the teams playing and the location of the game. The model also predicts how consistent the scores of each game will be, based on the consistency of each team's scores and the consistency of scores for games played at the same location. Finally, the model provides a relationship between the two scores of each game based on the teams playing. A Bayesian algorithm was built that uses data from all games played to estimate all model parameters.
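To make the three pieces above concrete, here is a minimal sketch of how one game's scores could be drawn. The functional form is an assumption, not the model's actual specification: it treats the two scores as bivariate normal, with each mean standing in for team strength plus a location effect, each standard deviation standing in for score consistency, and a single correlation `rho` standing in for the relationship between the two scores. The function name and all numbers are hypothetical.

```python
import math
import random

def draw_game_scores(mean_a, mean_b, sd_a, sd_b, rho, rng):
    """Draw one correlated pair of scores from a bivariate normal.

    ASSUMPTION: the bivariate normal form, and the idea that each mean is
    'team strength + location effect', are illustrative choices only.
    """
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    score_a = mean_a + sd_a * z1
    # Mixing z1 into the second score induces correlation rho between the two.
    score_b = mean_b + sd_b * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return score_a, score_b

rng = random.Random(2017)
# Hypothetical values: team A averages 75 points here, team B averages 70.
draws = [draw_game_scores(75.0, 70.0, 10.0, 12.0, 0.3, rng) for _ in range(20000)]
share_a_wins = sum(a > b for a, b in draws) / len(draws)
```

In the real model, the means, standard deviations, and correlation would come from the Bayesian estimates rather than from fixed numbers like these.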

Tournament Prediction Estimates

For predictions, the algorithm randomly draws the scores of all tournament games, creating 10,000 simulated tournament brackets. The simulated brackets are then tabulated to give the proportion of times a given team wins the tournament or advances to a given round.
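The simulate-and-tabulate procedure can be sketched as follows. This is an illustration under simplifying assumptions, not the model's code: the bracket has four hypothetical teams (the team count must be a power of two), scores are independent normal draws from fixed, made-up strengths and standard deviations, and all function names are invented here. The actual model draws from its estimated parameters instead.

```python
import random
from collections import Counter

def simulate_game(team_a, team_b, strength, sd, rng):
    """Draw one score per team and return the winner, redrawing on a tie."""
    while True:
        score_a = rng.gauss(strength[team_a], sd[team_a])
        score_b = rng.gauss(strength[team_b], sd[team_b])
        if score_a != score_b:
            return team_a if score_a > score_b else team_b

def simulate_bracket(teams, strength, sd, rng):
    """Play one single-elimination bracket; return the survivors of each round."""
    survivors = []
    alive = list(teams)
    while len(alive) > 1:
        # Adjacent teams in the list play each other.
        alive = [simulate_game(alive[i], alive[i + 1], strength, sd, rng)
                 for i in range(0, len(alive), 2)]
        survivors.append(alive)
    return survivors

def advancement_proportions(teams, strength, sd, n_sims=10_000, seed=2017):
    """Tabulate how often each team survives each round across n_sims brackets."""
    rng = random.Random(seed)
    counts = None
    for _ in range(n_sims):
        rounds = simulate_bracket(teams, strength, sd, rng)
        if counts is None:
            counts = [Counter() for _ in rounds]
        for counter, alive in zip(counts, rounds):
            counter.update(alive)
    return [{team: c[team] / n_sims for team in teams} for c in counts]

# Hypothetical four-team bracket: A plays B, C plays D, winners meet in the final.
teams = ["A", "B", "C", "D"]
strength = {"A": 80.0, "B": 68.0, "C": 70.0, "D": 65.0}
sd = {"A": 10.0, "B": 10.0, "C": 10.0, "D": 10.0}

props = advancement_proportions(teams, strength, sd)
champion_props = props[-1]  # proportion of simulated brackets each team wins
```

Each entry of `props` corresponds to one round, so `props[0]` gives the proportion of simulated brackets in which each team reaches the final and `props[-1]` gives the proportion in which it wins the title.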

Click here to see the model’s predictions for the NCAA tournament.

Team Rating Estimates

The model can also be used to create estimates of team strength, which are loosely called ratings. Ratings are produced from the estimated team parameters (across the 10,000 iterations). The rating value represents the number of standard deviations a team will score above or below the score of an average team. Because team strength and team consistency both factor into the model, the rating blends the two. For instance, two teams with the same strength but differing consistencies will have different ratings, with the more consistent team having the higher rating.
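As a rough illustration of how strength and consistency could blend into one number, the sketch below divides a team's scoring margin over an average team by that team's own standard deviation. The formula, the function name, and the numbers are assumptions made for illustration; the exact calculation is not given here.

```python
def rating(team_strength, team_sd, average_strength):
    """Standard deviations above (+) or below (-) an average team's score.

    ASSUMPTION: this standardized difference is one plausible reading of the
    description, not the model's documented formula.
    """
    return (team_strength - average_strength) / team_sd

# Two hypothetical teams with equal strength but different consistency.
steady = rating(team_strength=75.0, team_sd=8.0, average_strength=70.0)
streaky = rating(team_strength=75.0, team_sd=12.0, average_strength=70.0)
```

Here the more consistent team (smaller standard deviation) receives the higher rating, matching the example in the text. Under this particular form, that ordering holds for teams above the average; how below-average teams are handled is not specified.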

Click here to see the current estimated team ratings.