A tool that scrapes play-by-play game data from the backend NHL API and parses it to determine season-long and average player statistics. An efficiency value is then produced for each player, which helps show which players are most productive during their time on the ice.
Python & R
This project was completed as part of an independent study at Allegheny College during the Spring 2021 semester, but work on it will continue in my free time over the summer. Expect updates to this project and page in the future; it is a passion project of mine that I didn't have much time to work on during the school year. The first step of the project was to run a stepwise regression on NHL statistics using R. Stepwise regression adds or removes independent variables one at a time, keeping those that most improve the prediction of the dependent variable and giving each retained variable a weight (its coefficient). Via stepwise regression, I was able to determine that attributes such as shots and time on ice, among others, were the most impactful in predicting a player's points. Once I had an idea of which attributes were most relevant, I could begin work on implementing my efficiency value calculator.
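To make the stepwise idea concrete, here is a minimal sketch of forward stepwise selection. The original analysis was done in R; this version is written in Python with NumPy purely for illustration. The use of AIC as the stopping criterion, the function names `fit_aic` and `forward_stepwise`, and the synthetic data are all assumptions, not the model or data used in the project.

```python
import numpy as np


def fit_aic(X, y, cols):
    """Fit ordinary least squares (with intercept) on the given columns
    and return the model's AIC (up to an additive constant)."""
    n = len(y)
    design = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    k = design.shape[1]  # number of fitted parameters
    return n * np.log(rss / n) + 2 * k


def forward_stepwise(X, y, names):
    """Greedily add the variable that lowers AIC the most; stop when
    no remaining variable improves the model."""
    selected = []
    remaining = list(range(X.shape[1]))
    best_aic = fit_aic(X, y, selected)  # intercept-only baseline
    while remaining:
        aic, j = min((fit_aic(X, y, selected + [j]), j) for j in remaining)
        if aic >= best_aic:
            break
        best_aic = aic
        selected.append(j)
        remaining.remove(j)
    return [names[j] for j in selected]


# Synthetic data for illustration only (not real NHL statistics).
rng = np.random.default_rng(0)
n = 200
shots = rng.normal(150, 40, n)
time_on_ice = rng.normal(1000, 200, n)
hits = rng.normal(80, 30, n)  # unrelated to points in this toy data
points = 0.3 * shots + 0.2 * time_on_ice + rng.normal(0, 1, n)

X = np.column_stack([shots, time_on_ice, hits])
names = ["shots", "time_on_ice", "hits"]
selected = forward_stepwise(X, points, names)
print(selected)
```

In this toy data, points depend only on shots and time on ice, so those two variables should be picked up first; whether an unrelated variable such as hits is also kept depends on noise and on the criterion chosen.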
I implemented most of my project in Python. The tool can scrape data from the NHL API, parse this scraped data on a game-by-game basis, and determine a player's cumulative stats for a given season. Users can choose the season, the portion of that season (preseason, regular season, or postseason), and how many games' worth of data to scrape. Once the data is scraped and the players' cumulative stats are calculated, their average statistics are computed, which allows for the calculation of an efficiency value ranking. Statistics such as points, shots, and hits, among others, were factored into the efficiency value ranking, which was calculated by weighing these statistics against each player's playing time. Overall, the resulting efficiency metric was quite accurate, but more work could be done to improve it.
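The pipeline described above (per-game stats, then cumulative stats, then averages, then an efficiency value) can be sketched roughly as follows. This is not the project's actual code: the input layout, the stat names, the `WEIGHTS` values, and the "weighted stats per 60 minutes of ice time" formula are all hypothetical choices made for illustration, and the per-game data is hard-coded rather than scraped.

```python
from collections import defaultdict

# Hypothetical weights; the project's real weights are not given in this write-up.
WEIGHTS = {"points": 1.0, "shots": 0.3, "hits": 0.1}


def cumulative_stats(games):
    """Sum each player's stats across games and count games played."""
    totals = defaultdict(lambda: defaultdict(float))
    for game in games:
        for player, stats in game.items():
            for name, value in stats.items():
                totals[player][name] += value
            totals[player]["games"] += 1
    return {player: dict(stats) for player, stats in totals.items()}


def average_stats(totals):
    """Convert cumulative totals into per-game averages."""
    return {
        player: {k: v / s["games"] for k, v in s.items() if k != "games"}
        for player, s in totals.items()
    }


def efficiency(totals, weights=WEIGHTS):
    """Weighted stat total per 60 minutes of ice time (assumed formula)."""
    result = {}
    for player, s in totals.items():
        minutes = s.get("toi_minutes", 0.0)
        if minutes <= 0:
            continue  # skip players with no recorded ice time
        weighted = sum(w * s.get(name, 0.0) for name, w in weights.items())
        result[player] = 60.0 * weighted / minutes
    return result


# Made-up per-game stat lines for two placeholder players.
games = [
    {"Player A": {"points": 2, "shots": 5, "hits": 1, "toi_minutes": 20},
     "Player B": {"points": 1, "shots": 2, "hits": 4, "toi_minutes": 10}},
    {"Player A": {"points": 0, "shots": 3, "hits": 1, "toi_minutes": 20},
     "Player B": {"points": 1, "shots": 2, "hits": 0, "toi_minutes": 10}},
]

totals = cumulative_stats(games)
averages = average_stats(totals)
scores = efficiency(totals)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Normalizing by ice time is what lets a player with fewer minutes outrank one with larger raw totals: here Player B has the same points as Player A in half the ice time, so Player B ranks first.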
This tool allows users to scrape large amounts of NHL game data without needing prior knowledge of the NHL backend API, view cumulative and average player stats, and get a feel for player efficiency on the ice. It paints a good picture of which players make the most of their time on ice, showing which players have the greatest impact on the game. Such information could be useful for general managers making personnel decisions in relation to salary.