Predictive Models

From problem statement to solution approach, leading to result and conclusion.

Published in

Analytics Vidhya

4 min read · Jan 30, 2020

The image of a basketball player above may have made you wonder what a basketball player has to do with predictive models in ML. That is exactly what this story focuses on: how much basketball players improve, and how ML can be used to build a predictive model for it. This is not one of my own projects, but I found it so interesting that I couldn’t resist sharing it with you all. So let’s get started.

As stated above, this story discusses not only the problem statement but also the solution approach, which ultimately leads to some results and a conclusion. The problem statement is as follows:

Based on data such as a player’s age, position, and performance last season, predict how much the player will improve next season.

In my previous story I already discussed getting a dataset from Kaggle, converting it into a dataframe, and extracting the required information (data cleaning), so we won’t cover that today. We will jump directly to the step after data cleaning, i.e. Exploratory Data Analysis. Often the dataset has no target variable; instead, we derive it from the given parameters or from combinations of them. That is what we do here. We focus on player improvement, so we calculate the difference in “win shares” between two consecutive years as the target variable.
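To make the target definition concrete, here is a minimal sketch in plain Python. The player names, seasons, win-share values, and the helper `add_improvement` are all made up for illustration; the original project's column names and tooling are not given in this story.

```python
# Hypothetical per-season records; names and win-share values are illustrative only.
seasons = [
    {"player": "A", "season": 2017, "win_shares": 2.0},
    {"player": "A", "season": 2018, "win_shares": 4.5},
    {"player": "A", "season": 2019, "win_shares": 4.0},
    {"player": "B", "season": 2018, "win_shares": 6.0},
    {"player": "B", "season": 2019, "win_shares": 3.5},
]


def add_improvement(records):
    """Return one row per (player, season) that has a following season on record.

    improvement = win shares next season - win shares this season
    """
    by_key = {(r["player"], r["season"]): r["win_shares"] for r in records}
    rows = []
    for (player, season), ws in sorted(by_key.items()):
        next_ws = by_key.get((player, season + 1))
        if next_ws is None:
            continue  # no consecutive season, so the target is undefined
        rows.append({
            "player": player,
            "season": season,
            "win_shares": ws,
            "improvement": next_ws - ws,
        })
    return rows


targets = add_improvement(seasons)
for row in targets:
    print(row)
```

Rows without a following season are dropped rather than filled in, since there is nothing to compare them against.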

1. Relationship between improvement and age: Younger players are expected to improve faster and more effectively than older players. This is supported by the following figure, a box plot of improvement for players of different ages.

**Improvement of players with respect to age**

2. Relationship between improvement and playing time/duration: In strenuous sports such as hockey, football, and basketball, playing time also tends to affect how much a player’s skills improve. Players with short playing time usually show significant improvement, so improvement is associated with short, effective participation.

3. Relationship between improvement and number of games played: If a player played few games, it may be because he was on a break, or was injured and recovering. Those who were heavily involved, on the other hand, may face fatigue or carry several injuries, which will adversely affect their performance in the coming season. The scatter plot below shows this.

4. Relationship between improvement and last year’s improvement: Sometimes a player improves his game over time, although that is not always the case. Sometimes a player’s performance stays within fixed bounds, and this is what the plot below shows.
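The four relationships above were explored with plots. As a rough numeric companion, the sketch below computes the Pearson correlation between each candidate feature and the improvement target. The rows and the field names (`age`, `minutes`, `games`, `prev_improvement`) are invented for illustration and are not taken from the original dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up rows: one per player-season, with the improvement target already attached.
players = [
    {"age": 21, "minutes": 900, "games": 40, "prev_improvement": 1.0, "improvement": 2.0},
    {"age": 24, "minutes": 1500, "games": 60, "prev_improvement": 1.5, "improvement": 1.0},
    {"age": 27, "minutes": 2100, "games": 72, "prev_improvement": -0.5, "improvement": 0.5},
    {"age": 30, "minutes": 2600, "games": 80, "prev_improvement": 0.0, "improvement": -0.5},
    {"age": 33, "minutes": 3000, "games": 82, "prev_improvement": -1.0, "improvement": -1.5},
]

features = ["age", "minutes", "games", "prev_improvement"]
target = [p["improvement"] for p in players]
correlations = {f: pearson([p[f] for p in players], target) for f in features}

for name, value in correlations.items():
    print(f"{name}: {value:+.2f}")
```

A correlation coefficient only captures linear association, so it complements the box and scatter plots rather than replacing them.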

Predictive Modelling

There are two types of predictive models, i.e. regression and classification. Regression models predict continuous values, which is more useful in this case because it lets us estimate the extent of improvement, while a classification model only tells us whether a player improves or not (a categorical outcome). So we can apply linear models (linear regression, Ridge regression, and Lasso regression), support vector machines (SVM), random forest, and gradient boosting models to the dataset, with root mean squared error (RMSE) as the tuning and evaluation metric. The predicted values had a much narrower range than the actual values, as shown below.

As a result, the prediction errors grew larger as the actual values deviated further from zero, as shown.
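The RMSE metric and the narrow-range effect can be illustrated with a small sketch. It fits a one-feature least-squares line in plain Python rather than any of the models named above, and the ages and improvement values are made up, so treat it as an illustration of the idea and not a reproduction of the original results.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between actual and predicted values."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (single feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up data: age as the only feature, improvement as the target.
ages = [21, 23, 25, 27, 29, 31, 33]
improvement = [3.0, -0.5, 1.5, -2.0, 0.5, -3.5, 1.0]

slope, intercept = fit_line(ages, improvement)
predicted = [slope * age + intercept for age in ages]

model_rmse = rmse(improvement, predicted)
mean_value = sum(improvement) / len(improvement)
baseline_rmse = rmse(improvement, [mean_value] * len(improvement))

actual_range = max(improvement) - min(improvement)
predicted_range = max(predicted) - min(predicted)

print(f"model RMSE: {model_rmse:.3f}, baseline RMSE: {baseline_rmse:.3f}")
print(f"actual range: {actual_range:.2f}, predicted range: {predicted_range:.2f}")
```

With noisy targets like these, the fitted predictions stay close to the mean, so the players with the largest improvement or decline end up with the largest errors.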

As a solution, we give players with a large improvement or decline higher weights in model training and evaluation, because they are rarer. Using this approach, we found that the predicted target values have a range and distribution similar to those of the actual target values, as shown below.
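The story does not say which weighting scheme was used. As one possible sketch, the code below assumes a weight of `1 + |improvement|` per player (the helper `magnitude_weights` and the sample numbers are hypothetical) and shows how a weighted RMSE penalizes misses on the rare, large-change players more heavily.

```python
import math


def magnitude_weights(actual, base=1.0):
    """Assumed scheme: weight grows with the absolute size of the target."""
    return [base + abs(a) for a in actual]


def weighted_rmse(actual, predicted, weights):
    """RMSE where each squared error is scaled by its sample weight."""
    weighted_sq = sum(w * (a - p) ** 2 for a, p, w in zip(actual, predicted, weights))
    return math.sqrt(weighted_sq / sum(weights))


# Made-up values: three players near zero, two with large changes that are under-predicted.
actual = [0.2, -0.1, 0.3, 4.0, -3.5]
predicted = [0.1, 0.0, 0.2, 1.0, -0.5]

weights = magnitude_weights(actual)
plain = weighted_rmse(actual, predicted, [1.0] * len(actual))
weighted = weighted_rmse(actual, predicted, weights)

print(f"unweighted RMSE: {plain:.3f}")
print(f"weighted RMSE:   {weighted:.3f}")
```

For training, many scikit-learn regressors accept a `sample_weight` argument in `fit`, which is one way such weights could be passed to the models listed earlier.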

Performance of different models

The performance of the various models is summarized below.

Conclusion and Result

To reach the conclusion and solution, we used a number of factors, which makes the model rather complex, but this is what real-world models are like. Building predictive models is not the only task; even more important is evaluating the performance of the different models and checking their efficiency, which helps us choose the most efficient model for a particular case.

Really excited for my 2nd story on Medium.

Feel free to comment below.


Written by Mayank Bhandari
