Unbalanced datasets are very common. Examples include credit card transactions (the majority are authentic), malware detection (the majority of samples are benign), internet traffic (the majority is friendly), CT scans (the majority show no tumor), etc.
Why do we need to deal with this, and how do we deal with it? Here we are going to use a Jupyter notebook to illustrate the problem.
I am writing this post little by little, so it may take a few days to finish.
https://github.com/chaowu2009/ML_Projects/blob/master/ML_unbalanced_data.ipynb
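As a quick illustration of why unbalanced data needs special care, here is a minimal sketch. It is not taken from the notebook above. The numbers are made up: I assume 10,000 credit card transactions of which 1% are fraudulent, and a hypothetical "model" that always predicts the majority class.

```python
# Made-up data for illustration only: 100 fraud cases (label 1)
# and 9,900 authentic transactions (label 0).
labels = [1] * 100 + [0] * 9900

# A hypothetical "classifier" that ignores its input and always
# predicts the majority class (authentic).
predictions = [0] * len(labels)

# Accuracy: fraction of predictions that match the labels.
correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

# Recall on the minority class: fraction of fraud cases that were caught.
positives = sum(labels)
true_positives = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
recall = true_positives / positives

print(f"accuracy: {accuracy:.2f}")  # accuracy: 0.99
print(f"recall:   {recall:.2f}")    # recall:   0.00
```

The model looks 99% accurate but does not catch a single fraud case. So plain accuracy can be misleading on unbalanced data, and it helps to also look at a metric for the minority class, such as recall.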