Lazypredict: Run All Sklearn Algorithms With a Line Of Code

How to (and why you shouldn’t) use it

Lazypredict: Run All Sklearn Algorithms With a Line Of Code

An output of lazypredict.

Here are two pain points of data scientists:

Pain Point 1: Limited time in the data science lifecycle

Data scientists have to prioritize. This may mean spending more time on understanding the business problem and identifying the most appropriate approach rather than focusing solely on developing machine learning algorithms.

Pain point 2: Machine learning modeling can be time-consuming

Fine-tuning a machine learning algorithm involves finding the optimal values for these hyperparameters, which can be a trial-and-error process. This takes a long time.

Automated machine learning can help data scientists tremendously. Image by stable diffusion.

AutoML saves the day

AutoML can address these. One nascent library is lazypredict. In this post, I run through the following:

  • What is lazypredict
  • Installing lazypredict
  • How to use it for automatically fit scikit-learn regression algorithms
  • How to use it for automatically fit classification algorithms
  • Why you shouldn’t use it (and what else you can use)

Note: I’m not affiliated with lazypredict.

What is Lazypredict

Lazypredict is a Python package that aims to automate the machine learning modeling process. It works on both regression and classification tasks.

Its key feature is its ability to automate the training and evaluation of machine learning models. It provides a simple interface for defining a range of hyperparameters and then trains and evaluates a model using a variety of different combinations of these hyperparameters.

Installing Lazypredict

On your terminal, run the following

pip install lazypredict

However, you might need to manually install some dependencies of lazypredict. If you run into issues that say that you need to install scikit-learn, xgboost, or lightgbm, you can run pip install to install the necessary libraries.

Personally, I got it to work on python 3.9.13by having the following requirements.txt

pandas==1.4.4
numpy==1.21.5
scikit-learn==1.0.2
lazypredict==0.2.12

I installed the following libraries by running this command on the terminal: pip install -r requirements.txt .

It’s even better to use a virtual environment in this case.

Using Lazypredict for Regression

Let’s walk through the code. (If you just want the complete code, search “Full code” in this article.)

We’ll first import the necessary libraries.

from lazypredict.Supervised import LazyRegressor
from sklearn import datasets
from sklearn.utils import shuffle
import numpy as np

First, we’ll import the Diabetes dataset.

Ten baseline variables, age, sex, body mass index, average blood pressure, and six blood serum measurements were obtained for each of n = 442 diabetes patients, as well as the response of interest, a quantitative measure of disease progression one year after baseline.
# Import the Diabetes Dataset
diabetes = datasets.load_diabetes()

Next, we shuffle the dataset so that we can split them into train-test sets.

# Shuffle the dataset 
X, y = shuffle(diabetes.data, diabetes.target, random_state=13)

# Cast the numerical values into a numpy float.
X = X.astype(np.float32)

# Split the dataset into 90% and 10%.
offset = int(X.shape[0] * 0.9)

# Split into train and test
X_train, y_train = X[:offset], y[:offset]
X_test, y_test = X[offset:], y[offset:]


Next, we initialize the LazyRegressor object.

# Running the Lazypredict library and fit multiple regression libraries
# for the same dataset
reg = LazyRegressor(verbose=0, 
                    ignore_warnings=False, 
                    custom_metric=None,
                    predictions=False,
                    random_state = 13)

# Parameters
# ----------
# verbose : int, optional (default=0)
#       For the liblinear and lbfgs solvers set verbose to any positive
#       number for verbosity.
# ignore_warnings : bool, optional (default=True)
#       When set to True, the warning related to algorigms that are not able
#       to run are ignored.
# custom_metric : function, optional (default=None)
#       When function is provided, models are evaluated based on the custom 
#       evaluation metric provided.
# prediction : bool, optional (default=False)
#       When set to True, the predictions of all the models models are 
#       returned as dataframe.
# regressors : list, optional (default="all")
#       When function is provided, trains the chosen regressor(s).

Now, we will fitmultiple regression algorithms with the lazypredict library. This step took 3 seconds in total.

Under the hood, the fit method does the following:

  1. Split all features into three categories: numerical (features which are numbers) or categorical (features which are text)
  2. Further split categorical features into two: ‘High’ categorical features (which have more unique values than the total number of features) and ‘low’ categorical features (which have less unique values than the total number of features)
  3. Each feature is then preprocessed in this manner:
  • Numerical features: Impute missing values with mean, then standardize the feature (removing the mean and dividing by the variance)
  • ‘High’ categorical features: Impute missing values with the value ‘missing’, then perform one-hot encoding.
  • ‘Low’ categorical features: Impute missing values with the value ‘missing’, then perform ordinal encoding (convert each unique string value into an integer. In the example of a Gender column— ‘Male’ is encoded as 0 and ‘Female’ 1.)
  • Fit the training dataset on each algorithm.
  • Test each algorithm on the testing set. By default, the metrics are adjusted R-squared, R-squared, root-mean-squared error, and the time taken.
models, predictions = reg.fit(X_train, X_test, y_train, y_test)
model_dictionary = reg.provide_models(X_train, X_test, y_train, y_test)
models

Here is the result.

| Model                         |   Adjusted R-Squared |   R-Squared |   RMSE |   Time Taken |
|:------------------------------|---------------------:|------------:|-------:|-------------:|
| ExtraTreesRegressor           |                 0.38 |        0.52 |  54.22 |         0.17 |
| OrthogonalMatchingPursuitCV   |                 0.37 |        0.52 |  54.39 |         0.01 |
| Lasso                         |                 0.37 |        0.52 |  54.46 |         0.01 |
| LassoLars                     |                 0.37 |        0.52 |  54.46 |         0.01 |
| LarsCV                        |                 0.37 |        0.51 |  54.54 |         0.02 |
| LassoCV                       |                 0.37 |        0.51 |  54.59 |         0.07 |
| PassiveAggressiveRegressor    |                 0.37 |        0.51 |  54.74 |         0.01 |
| LassoLarsIC                   |                 0.36 |        0.51 |  54.83 |         0.01 |
| SGDRegressor                  |                 0.36 |        0.51 |  54.85 |         0.01 |
| RidgeCV                       |                 0.36 |        0.51 |  54.91 |         0.01 |
| Ridge                         |                 0.36 |        0.51 |  54.91 |         0.01 |
| BayesianRidge                 |                 0.36 |        0.51 |  54.94 |         0.01 |
| LassoLarsCV                   |                 0.36 |        0.51 |  54.96 |         0.02 |
| LinearRegression              |                 0.36 |        0.51 |  54.96 |         0.01 |
| TransformedTargetRegressor    |                 0.36 |        0.51 |  54.96 |         0.01 |
| Lars                          |                 0.36 |        0.50 |  55.09 |         0.01 |
| ElasticNetCV                  |                 0.36 |        0.50 |  55.20 |         0.06 |
| HuberRegressor                |                 0.36 |        0.50 |  55.24 |         0.02 |
| RandomForestRegressor         |                 0.35 |        0.50 |  55.48 |         0.25 |
| AdaBoostRegressor             |                 0.34 |        0.49 |  55.88 |         0.08 |
| LGBMRegressor                 |                 0.34 |        0.49 |  55.93 |         0.05 |
| HistGradientBoostingRegressor |                 0.34 |        0.49 |  56.08 |         0.20 |
| PoissonRegressor              |                 0.32 |        0.48 |  56.61 |         0.01 |
| ElasticNet                    |                 0.30 |        0.46 |  57.49 |         0.01 |
| KNeighborsRegressor           |                 0.30 |        0.46 |  57.57 |         0.01 |
| OrthogonalMatchingPursuit     |                 0.29 |        0.45 |  57.87 |         0.01 |
| BaggingRegressor              |                 0.29 |        0.45 |  57.92 |         0.04 |
| XGBRegressor                  |                 0.28 |        0.45 |  58.18 |         0.11 |
| GradientBoostingRegressor     |                 0.25 |        0.42 |  59.70 |         0.12 |
| TweedieRegressor              |                 0.24 |        0.42 |  59.81 |         0.01 |
| GammaRegressor                |                 0.22 |        0.40 |  60.61 |         0.01 |
| RANSACRegressor               |                 0.20 |        0.38 |  61.40 |         0.12 |
| LinearSVR                     |                 0.12 |        0.32 |  64.66 |         0.01 |
| ExtraTreeRegressor            |                 0.00 |        0.23 |  68.73 |         0.01 |
| NuSVR                         |                -0.07 |        0.18 |  71.06 |         0.01 |
| SVR                           |                -0.10 |        0.15 |  72.04 |         0.02 |
| DummyRegressor                |                -0.30 |       -0.00 |  78.37 |         0.01 |
| QuantileRegressor             |                -0.35 |       -0.04 |  79.84 |         1.42 |
| DecisionTreeRegressor         |                -0.47 |       -0.14 |  83.42 |         0.01 |
| GaussianProcessRegressor      |                -0.77 |       -0.37 |  91.51 |         0.02 |
| MLPRegressor                  |                -1.87 |       -1.22 | 116.51 |         0.21 |
| KernelRidge                   |                -5.04 |       -3.67 | 169.06 |         0.01 |

Here’s the full code for regression on a Diabetes dataset.

from lazypredict.Supervised import LazyRegressor
from sklearn import datasets
from sklearn.utils import shuffle
import numpy as np

# Import the Diabetes Dataset
diabetes = datasets.load_diabetes()

# Shuffle the dataset 
X, y = shuffle(diabetes.data, diabetes.target, random_state=13)

# Cast the numerical values
X = X.astype(np.float32)
offset = int(X.shape[0] * 0.9)

# Split into train and test
X_train, y_train = X[:offset], y[:offset]
X_test, y_test = X[offset:], y[offset:]

# Running the Lazypredict library and fit multiple regression libraries
# for the same dataset
reg = LazyRegressor(verbose=0, ignore_warnings=False, custom_metric=None)
models, predictions = reg.fit(X_train, X_test, y_train, y_test)
model_dictionary = reg.provide_models(X_train, X_test, y_train, y_test)
models

Using Lazypredict for Classification

Let’s use Lazypredict for classification. (If you just want the full code, search “full code” in this article.

First, import the necessary libraries.

from lazypredict.Supervised import LazyClassifier
 from sklearn.datasets import load_iris
 from sklearn.model_selection import train_test_split

Next, we load the data, the Iris dataset, and split it into train and test sets. Here’s what it contains.

The data set consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimeters.
data = load_iris()
 X = data.data
 y= data.target
 X_train, X_test, y_train, y_test = train_test_split(X, y,test_size=.5,random_state =123)

Next, we initialize the LazyClassifier object.

# Running the Lazypredict library and fit multiple regression libraries
# for the same dataset
clf = LazyClassifier(verbose=0,ignore_warnings=True, custom_metric=None)

"""
    Parameters
    ----------
    verbose : int, optional (default=0)
        For the liblinear and lbfgs solvers set verbose to any positive
        number for verbosity.
    ignore_warnings : bool, optional (default=True)
        When set to True, the warning related to algorigms that are not able to run are ignored.
    custom_metric : function, optional (default=None)
        When function is provided, models are evaluated based on the custom evaluation metric provided.
    prediction : bool, optional (default=False)
        When set to True, the predictions of all the models models are returned as dataframe.
    classifiers : list, optional (default="all")
        When function is provided, trains the chosen classifier(s).
"""



Then, we call the lazy regressor'sfitmethod, which fits ltiple classification algorithms with the lazypredict library. This step took 1 second in total for this small dataset.

(Search the keyword “under the hood, the fit method” to jump to the section where I explain what fit does.)

models,predictions = clf.fit(X_train, X_test, y_train, y_test)



Lastly, we can see how each model performs using provide_models. This reports the accuracy, balanced accuracy, ROC AUC, and F1 score on the test set.

# Calculate performance of all models on test dataset
 model_dictionary = clf.provide_models(X_train,X_test,y_train,y_test)
 models

Here is the full result.

| Model                         |   Accuracy |   Balanced Accuracy | ROC AUC   |   F1 Score |   Time Taken |
|:------------------------------|-----------:|--------------------:|:----------|-----------:|-------------:|
| LinearDiscriminantAnalysis    |       0.99 |                0.99 |           |       0.99 |         0.01 |
| AdaBoostClassifier            |       0.97 |                0.98 |           |       0.97 |         0.13 |
| PassiveAggressiveClassifier   |       0.97 |                0.98 |           |       0.97 |         0.01 |
| LogisticRegression            |       0.97 |                0.98 |           |       0.97 |         0.01 |
| GaussianNB                    |       0.97 |                0.98 |           |       0.97 |         0.01 |
| SGDClassifier                 |       0.96 |                0.96 |           |       0.96 |         0.01 |
| RandomForestClassifier        |       0.96 |                0.96 |           |       0.96 |         0.19 |
| QuadraticDiscriminantAnalysis |       0.96 |                0.96 |           |       0.96 |         0.01 |
| Perceptron                    |       0.96 |                0.96 |           |       0.96 |         0.01 |
| LGBMClassifier                |       0.96 |                0.96 |           |       0.96 |         0.30 |
| ExtraTreeClassifier           |       0.96 |                0.96 |           |       0.96 |         0.01 |
| BaggingClassifier             |       0.95 |                0.95 |           |       0.95 |         0.03 |
| ExtraTreesClassifier          |       0.95 |                0.95 |           |       0.95 |         0.13 |
| XGBClassifier                 |       0.95 |                0.95 |           |       0.95 |         0.19 |
| DecisionTreeClassifier        |       0.95 |                0.95 |           |       0.95 |         0.01 |
| LinearSVC                     |       0.95 |                0.95 |           |       0.95 |         0.01 |
| CalibratedClassifierCV        |       0.95 |                0.95 |           |       0.95 |         0.04 |
| KNeighborsClassifier          |       0.93 |                0.94 |           |       0.93 |         0.01 |
| NuSVC                         |       0.93 |                0.94 |           |       0.93 |         0.01 |
| SVC                           |       0.93 |                0.94 |           |       0.93 |         0.01 |
| RidgeClassifierCV             |       0.91 |                0.91 |           |       0.91 |         0.01 |
| NearestCentroid               |       0.89 |                0.90 |           |       0.89 |         0.01 |
| LabelPropagation              |       0.89 |                0.90 |           |       0.90 |         0.01 |
| LabelSpreading                |       0.89 |                0.90 |           |       0.90 |         0.01 |
| RidgeClassifier               |       0.88 |                0.89 |           |       0.88 |         0.01 |
| BernoulliNB                   |       0.79 |                0.75 |           |       0.77 |         0.01 |
| DummyClassifier               |       0.27 |                0.33 |           |       0.11 |         0.01 |

Here is the full code for classification.

from lazypredict.Supervised import LazyClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Load dataset
data = load_breast_cancer()
X = data.data
y= data.target

# Split data into train and test with a 90:10 ratio
X_train, X_test, y_train, y_test = train_test_split(X, y,test_size=.1,random_state =123)

# Initialize the Lazypredict library
clf = LazyClassifier(verbose=0,ignore_warnings=True, custom_metric=None)

# Fit all classification algorithms on training dataset
models,predictions = clf.fit(X_train, X_test, y_train, y_test)

# Calculate performance of all models on test dataset
model_dictionary = clf.provide_models(X_train,X_test,y_train,y_test)
models

Do I recommend lazypredict?

If you get to install it, lazypredict is very simple to use. Its syntax is very close to scikit-learn, making the learning curve very gentle.

But it has some critical weaknesses.

  1. Difficult installation: Many reported difficulties in installing the libraries because the developers did not add the requirements.txt that document their required dependencies.
  2. Limited documentation: I had to comb through the source code to find out how the preprocessing runs. This is not ideal. I also do not know the hyperparameters used to perform each of the classification and regression tasks.
  3. Limited customizability: I still have yet to find ways to customize the preprocessing steps.
  4. Unclear how to use the model after lazypredict: Once you’re done with the lazypredict library, you’d ideally want to select the best algorithm. Lazypredict does not make this easy since you do not have an easy way of exporting the best algorithm.

Main takeaway

Lazypredict’s critical weaknesses limit its utility. It is nice, but it’s still underdeveloped.

I’d strongly recommend you check out other AutoML libraries that are superior in terms of documentation and customizability.

Here are some alternatives.

  1. TPOT (Check out how to use TPOT here)
  2. Auto-Sklearn
  3. Auto-ViML
  4. H2O AutoML
  5. Auto-Keras
  6. MLBox
  7. Hyperopt Sklearn
  8. AutoGluon
Data scientist + Robots = Magic. Photo by Andy Kelly on Unsplash

I’m Travis Tang, a data scientist in Tech. I share how you can use open-sourced libraries on Medium. I also share data analytics and science tips on LinkedIn daily. Follow me if you like this content.