Tuesday, 22 October 2019

🐼 ELI5 Library : A powerful library for Feature Engineering 🐼

ELI5 Library:

ELI5 is a Python package that helps to debug machine learning classifiers and explain their predictions.
It supports the following machine learning frameworks and packages:

scikit-learn, XGBoost, LightGBM, CatBoost, lightning, sklearn-crfsuite, Keras

Installation:

            pip install eli5
            conda install -c conda-forge eli5

Implementation:


  • Build a sample model to find which features are most important.
  • Calculate the weights and inspect the important features using the eli5 library.
  • Visualize the important features.

  • Feature importance decreases from the top of the column to the bottom.
  • Features shown in green have a positive impact on our prediction.
  • Features shown in white have no impact on our prediction.
  • The most important feature here is var_81.
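
The steps above can be sketched roughly as follows. This is only an illustration, not the original notebook: the data is synthetic, the feature names var_0 to var_4 and the choice of RandomForestClassifier are assumptions, and the target is deliberately driven by var_3 so that one feature stands out. The eli5 import is wrapped in a try/except so the script still runs where eli5 is not installed.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# eli5 is treated as optional here so the rest of the script still runs
# without it (install with `pip install eli5`).
try:
    import eli5
except ImportError:
    eli5 = None

# Synthetic stand-in data (assumption): five features, target driven by var_3.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(500, 5)),
                 columns=[f"var_{i}" for i in range(5)])
y = (X["var_3"] + 0.1 * rng.normal(size=500) > 0).astype(int)

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Step 1: build a sample model.
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)
print("validation accuracy:", model.score(X_val, y_val))

# Step 2: ask eli5 for the feature weights as a DataFrame,
# sorted with the most important feature first.
weights = None
if eli5 is not None:
    weights = eli5.explain_weights_df(model, feature_names=list(X.columns))
    print(weights)

# Step 3: in a Jupyter notebook, the colored table can be rendered with
# eli5.show_weights(model, feature_names=list(X.columns)).
```

On this toy data var_3 should come out on top, in the same way var_81 does in the post.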

Wednesday, 7 August 2019

🐼 Powerful Info Function in Pandas 🐼

          Info Function:


  • Before you start on any of the other Pandas tools, examine the dataframe with info() to get an overview of your dataset. The summary lists all columns with their data types and the number of non-null values in each column. It also gives the range index for the index axis.

  • To print a shorter summary, set the verbose parameter to False. The summary then leaves out the per-column listing, which is helpful when the dataframe has thousands of attributes.
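
A small sketch of both calls, assuming a made-up dataframe (the column names and values below are hypothetical and only for illustration):

```python
import io

import pandas as pd

# Hypothetical sample data with a couple of missing values.
df = pd.DataFrame({
    "name": ["Asha", "Ravi", "Meera", "Kabir"],
    "city": ["Pune", None, "Delhi", "Mumbai"],
    "score": [81.5, 67.0, None, 90.0],
})

# Full summary: index range, then every column with its
# non-null count and data type.
df.info()

# Short summary: the per-column listing is left out.
df.info(verbose=False)

# info() prints to stdout and returns None; pass a buffer
# if you want to keep the text.
full, short = io.StringIO(), io.StringIO()
df.info(buf=full)
df.info(verbose=False, buf=short)
```

The short form still reports the number of columns and the dtype counts, so it remains useful as a quick check on very wide dataframes.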








Wednesday, 31 July 2019

🐼 New DataSet Exploration in less time 🐼

       Just Use pandas_profiling  :  A Fantastic Package

        Steps:

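  • Install the pandas_profiling library via pip or conda.
            pip install pandas-profiling
            conda install -c conda-forge pandas-profiling
  • Import it with import pandas_profiling, then build a report from a dataframe df with pandas_profiling.ProfileReport(df).
  • It's a time-saving trick that takes only a few lines of code when exploring the data.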


Wednesday, 24 July 2019

🐼 Take a look at the fillna() function 🐼


        
      fillna() function
  • It fills in the missing values in your data.
  • If you think that some entries don't need to be filled in, you can remove them with the dropna() function.
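
Here is a minimal sketch of the two functions side by side. The dataframe and the fill values ("unknown" for text, the column mean for numbers) are assumptions chosen for the example, not a recommendation for every dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing city and one missing score.
df = pd.DataFrame({
    "city": ["Pune", None, "Delhi", "Mumbai"],
    "score": [80.0, 70.0, np.nan, 60.0],
})

# fillna(): replace missing values, here with a different value per column.
filled = df.fillna({"city": "unknown", "score": df["score"].mean()})
print(filled)

# dropna(): remove every row that has at least one missing value.
dropped = df.dropna()
print(dropped)
```

Both calls return a new dataframe and leave df unchanged, so you can compare the results before deciding which one to keep.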