lazypredict package
Subpackages
Submodules
lazypredict.Supervised module
Supervised Models — LazyClassifier and LazyRegressor for rapid model benchmarking.
Provides LazyClassifier and LazyRegressor classes that train multiple scikit-learn models with minimal code to quickly identify which algorithms perform best on a given dataset.
- lazypredict.Supervised.Classification
alias of LazyClassifier
- class lazypredict.Supervised.LazyClassifier(verbose: int = 0, ignore_warnings: bool = True, custom_metric: Callable | None = None, predictions: bool = False, random_state: int = 42, classifiers: str | List = 'all', cv: int | None = None, timeout: int | float | None = None, categorical_encoder: str = 'onehot', n_jobs: int = -1, max_models: int | None = None, progress_callback: Callable | None = None, use_gpu: bool = False)[source]
Bases: LazyEstimator
Fit all classification algorithms available in scikit-learn and benchmark them.
- Parameters:
verbose (int, optional (default=0)) – Set to a positive number to enable progress bars and per-model metric output.
ignore_warnings (bool, optional (default=True)) – When True, warnings and errors from individual models are suppressed.
custom_metric (callable or None, optional (default=None)) – A function f(y_true, y_pred) used for additional evaluation.
predictions (bool, optional (default=False)) – When True, fit() returns a tuple of (scores, predictions_dataframe).
random_state (int, optional (default=42)) – Random seed passed to models that accept it.
classifiers (list or "all", optional (default="all")) – Specific classifier classes to train, or "all" for every available one.
cv (int or None, optional (default=None)) – Number of folds for cross-validation. If None, uses train/test split only.
timeout (int or float or None, optional (default=None)) – Maximum seconds for each model. Models exceeding this are skipped.
categorical_encoder (str, optional (default='onehot')) – Encoder for categorical features: 'onehot', 'ordinal', 'target', or 'binary'.
n_jobs (int, optional (default=-1)) – Number of parallel jobs for cross-validation. -1 uses all processors.
max_models (int or None, optional (default=None)) – Maximum number of models to train. None means train all.
progress_callback (callable or None, optional (default=None)) – Callback f(model_name, current, total, metrics) called after each model.
use_gpu (bool, optional (default=False)) – When True, enables GPU acceleration for models that support it (e.g., XGBoost, LightGBM, CatBoost). When cuML (RAPIDS) is installed, GPU-accelerated scikit-learn equivalents are also added automatically. Falls back to CPU if CUDA is unavailable.
Examples
>>> from lazypredict.Supervised import LazyClassifier
>>> from sklearn.datasets import load_breast_cancer
>>> from sklearn.model_selection import train_test_split
>>> data = load_breast_cancer()
>>> X = data.data
>>> y = data.target
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.5, random_state=123)
>>> clf = LazyClassifier(verbose=0, ignore_warnings=True, custom_metric=None)
>>> models, predictions = clf.fit(X_train, X_test, y_train, y_test)
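The custom_metric parameter expects a callable of the form f(y_true, y_pred). The following is a minimal sketch of such a function in plain Python; the name misclassification_rate is hypothetical and is not part of lazypredict.

```python
def misclassification_rate(y_true, y_pred):
    """Fraction of samples whose predicted label differs from the true label."""
    y_true = list(y_true)
    y_pred = list(y_pred)
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


# Quick check on toy labels: one of the four predictions is wrong.
rate = misclassification_rate([0, 1, 1, 0], [0, 1, 0, 0])
print(rate)
```

Going by the constructor signature, a function like this would be passed as LazyClassifier(custom_metric=misclassification_rate).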
- class lazypredict.Supervised.LazyRegressor(verbose: int = 0, ignore_warnings: bool = True, custom_metric: Callable | None = None, predictions: bool = False, random_state: int = 42, regressors: str | List = 'all', cv: int | None = None, timeout: int | float | None = None, categorical_encoder: str = 'onehot', n_jobs: int = -1, max_models: int | None = None, progress_callback: Callable | None = None, use_gpu: bool = False)[source]
Bases: LazyEstimator
Fit all regression algorithms available in scikit-learn and benchmark them.
- Parameters:
verbose (int, optional (default=0)) – Set to a positive number to enable progress bars and per-model metric output.
ignore_warnings (bool, optional (default=True)) – When True, warnings and errors from individual models are suppressed.
custom_metric (callable or None, optional (default=None)) – A function f(y_true, y_pred) used for additional evaluation.
predictions (bool, optional (default=False)) – When True, fit() returns a tuple of (scores, predictions_dataframe).
random_state (int, optional (default=42)) – Random seed passed to models that accept it.
regressors (list or "all", optional (default="all")) – Specific regressor classes to train, or "all" for every available one.
cv (int or None, optional (default=None)) – Number of folds for cross-validation. If None, uses train/test split only.
timeout (int or float or None, optional (default=None)) – Maximum seconds for each model. Models exceeding this are skipped.
categorical_encoder (str, optional (default='onehot')) – Encoder for categorical features: 'onehot', 'ordinal', 'target', or 'binary'.
n_jobs (int, optional (default=-1)) – Number of parallel jobs for cross-validation. -1 uses all processors.
max_models (int or None, optional (default=None)) – Maximum number of models to train. None means train all.
progress_callback (callable or None, optional (default=None)) – Callback f(model_name, current, total, metrics) called after each model.
use_gpu (bool, optional (default=False)) – When True, enables GPU acceleration for models that support it (e.g., XGBoost, LightGBM). Falls back to CPU if CUDA is unavailable.
Examples
>>> from lazypredict.Supervised import LazyRegressor
>>> from sklearn import datasets
>>> from sklearn.utils import shuffle
>>> import numpy as np
>>> diabetes = datasets.load_diabetes()
>>> X, y = shuffle(diabetes.data, diabetes.target, random_state=13)
>>> X = X.astype(np.float32)
>>> offset = int(X.shape[0] * 0.9)
>>> X_train, y_train = X[:offset], y[:offset]
>>> X_test, y_test = X[offset:], y[offset:]
>>> reg = LazyRegressor(verbose=0, ignore_warnings=False, custom_metric=None)
>>> models, predictions = reg.fit(X_train, X_test, y_train, y_test)
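The progress_callback parameter is documented as a callable f(model_name, current, total, metrics). The sketch below shows one possible callback and simulates two invocations by hand. The helper make_progress_logger is hypothetical, and treating metrics as a dict of numeric values is an assumption rather than documented behaviour.

```python
def make_progress_logger(log):
    """Return a callback that appends one summary line per finished model to `log`."""
    def on_model_done(model_name, current, total, metrics):
        # Assumption: `metrics` is a dict mapping metric name -> numeric value.
        percent = 100.0 * current / total
        summary = ", ".join(f"{k}={v:.3f}" for k, v in sorted(metrics.items()))
        log.append(f"[{current}/{total} {percent:.0f}%] {model_name}: {summary}")
    return on_model_done


# Simulate the calls a benchmarking loop might make after each model.
lines = []
callback = make_progress_logger(lines)
callback("Ridge", 1, 2, {"R-Squared": 0.91, "RMSE": 3.2})
callback("Lasso", 2, 2, {"R-Squared": 0.88, "RMSE": 3.7})
for line in lines:
    print(line)
```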
- lazypredict.Supervised.Regression
alias of LazyRegressor
- lazypredict.Supervised.time_limit(seconds: int)[source]
Context manager to limit execution time of a code block.
- Parameters:
seconds (int) – Maximum time in seconds for the code block to execute.
- Raises:
TimeoutException – If the code block exceeds the time limit.
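A context manager with this contract can be sketched with POSIX signals. This illustrates the general technique only; whether lazypredict implements time_limit this way is an assumption. The names time_limit_sketch and the local TimeoutException are stand-ins, and the approach works only on Unix-like systems in the main thread.

```python
import signal
import time
from contextlib import contextmanager


class TimeoutException(Exception):
    """Local stand-in for lazypredict.exceptions.TimeoutException."""


@contextmanager
def time_limit_sketch(seconds):
    """Raise TimeoutException if the with-block runs longer than `seconds`."""
    def _on_alarm(signum, frame):
        raise TimeoutException(f"timed out after {seconds} s")

    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)  # start the countdown
    try:
        yield
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler


# A fast block finishes normally; a slow one is interrupted.
with time_limit_sketch(1):
    fast_result = sum(range(1000))

timed_out = False
try:
    with time_limit_sketch(0.05):
        time.sleep(1)
except TimeoutException:
    timed_out = True
```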
lazypredict.TimeSeriesForecasting module
Time Series Forecasting — LazyForecaster for rapid model benchmarking.
Provides a LazyForecaster class that trains multiple statistical, ML, and deep-learning forecasting models to quickly identify which algorithms perform best on a given time series.
- class lazypredict.TimeSeriesForecasting.AutoARIMAForecaster[source]
Bases: ForecasterWrapper
Auto-ARIMA via pmdarima with automatic (p, d, q) parameter tuning.
Performs a stepwise search over ARIMA orders to minimise information criteria. Supports exogenous features.
Requires pmdarima.
- fit(y_train, X_train=None)[source]
Fit the forecaster on training data.
- Parameters:
y_train (np.ndarray) – 1-D array of training observations in chronological order.
X_train (np.ndarray or None, optional) – Exogenous feature matrix of shape (len(y_train), n_features).
- property name
Human-readable model name used as index in result tables.
- predict(horizon, X_test=None)[source]
Forecast horizon steps into the future.
- Parameters:
horizon (int) – Number of future time steps to forecast.
X_test (np.ndarray or None, optional) – Exogenous features for the forecast period, shape (horizon, n_features).
- Returns:
1-D array of length horizon with point forecasts.
- Return type:
np.ndarray
- class lazypredict.TimeSeriesForecasting.ForecasterWrapper[source]
Bases: ABC
Abstract base class providing a uniform interface for all forecasting models.
Every forecaster wrapper must implement fit(), predict(), and the name property so that LazyForecaster can train and evaluate them interchangeably.
- abstractmethod fit(y_train: ndarray, X_train: ndarray | None = None) None[source]
Fit the forecaster on training data.
- Parameters:
y_train (np.ndarray) – 1-D array of training observations in chronological order.
X_train (np.ndarray or None, optional) – Exogenous feature matrix of shape (len(y_train), n_features).
- abstract property name: str
Human-readable model name used as index in result tables.
- abstractmethod predict(horizon: int, X_test: ndarray | None = None) ndarray[source]
Forecast horizon steps into the future.
- Parameters:
horizon (int) – Number of future time steps to forecast.
X_test (np.ndarray or None, optional) – Exogenous features for the forecast period, shape (horizon, n_features).
- Returns:
1-D array of length horizon with point forecasts.
- Return type:
np.ndarray
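The three-member contract described above (fit(), predict(), and the name property) can be illustrated with a small self-contained sketch. ForecasterWrapperSketch is a local stand-in for the real base class, MeanForecaster is a hypothetical model, and plain lists are used in place of NumPy arrays to keep the example dependency-free.

```python
from abc import ABC, abstractmethod


class ForecasterWrapperSketch(ABC):
    """Local stand-in mirroring the documented interface."""

    @abstractmethod
    def fit(self, y_train, X_train=None):
        ...

    @abstractmethod
    def predict(self, horizon, X_test=None):
        ...

    @property
    @abstractmethod
    def name(self):
        ...


class MeanForecaster(ForecasterWrapperSketch):
    """Hypothetical forecaster that predicts the training mean at every step."""

    def fit(self, y_train, X_train=None):
        values = [float(v) for v in y_train]
        if not values:
            raise ValueError("y_train must not be empty")
        self._mean = sum(values) / len(values)

    def predict(self, horizon, X_test=None):
        # Exogenous features are ignored by this toy model.
        return [self._mean] * horizon

    @property
    def name(self):
        return "Mean"


model = MeanForecaster()
model.fit([2.0, 4.0, 6.0])
forecast = model.predict(horizon=3)
print(model.name, forecast)
```

Because all three members are abstract, a subclass that omits any of them cannot be instantiated, which is what lets a benchmarking loop treat every wrapper the same way.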
- class lazypredict.TimeSeriesForecasting.GRUForecaster(n_lags: int = 10, n_rolling: Tuple[int, ...] = (3, 7), hidden_size: int = 64, n_epochs: int = 50, batch_size: int = 32, learning_rate: float = 0.001, random_state: int = 42, use_gpu: bool = False)[source]
Bases: _TorchRNNForecaster
Single-layer GRU forecaster.
See _TorchRNNForecaster for parameter details. Requires torch.
- class lazypredict.TimeSeriesForecasting.HoltForecaster[source]
Bases: ForecasterWrapper
Holt’s linear trend method (double exponential smoothing).
Captures level and trend components but ignores seasonality. Requires statsmodels.
- fit(y_train, X_train=None)[source]
Fit the forecaster on training data.
- Parameters:
y_train (np.ndarray) – 1-D array of training observations in chronological order.
X_train (np.ndarray or None, optional) – Exogenous feature matrix of shape (len(y_train), n_features).
- property name
Human-readable model name used as index in result tables.
- predict(horizon, X_test=None)[source]
Forecast horizon steps into the future.
- Parameters:
horizon (int) – Number of future time steps to forecast.
X_test (np.ndarray or None, optional) – Exogenous features for the forecast period, shape (horizon, n_features).
- Returns:
1-D array of length horizon with point forecasts.
- Return type:
np.ndarray
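To make the level and trend components concrete, here is a minimal pure-Python sketch of double exponential smoothing. It is not the wrapper's implementation: the wrapper relies on statsmodels, whereas this sketch uses fixed smoothing parameters and a simple initialisation (first value as level, first difference as trend), both of which are assumptions made for illustration. The function name holt_linear_forecast is hypothetical.

```python
def holt_linear_forecast(y, horizon, alpha=0.8, beta=0.2):
    """Double exponential smoothing with fixed smoothing parameters."""
    if len(y) < 2:
        raise ValueError("need at least two observations")
    level = float(y[0])
    trend = float(y[1]) - float(y[0])
    for value in y[1:]:
        previous_level = level
        # Level: blend the new observation with the one-step-ahead forecast.
        level = alpha * value + (1 - alpha) * (previous_level + trend)
        # Trend: blend the latest level change with the previous trend.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    # The h-step forecast extends the last level along the last trend.
    return [level + h * trend for h in range(1, horizon + 1)]


# A perfectly linear series should be extrapolated along the same line.
forecast = holt_linear_forecast([1, 2, 3, 4, 5], horizon=3)
print(forecast)
```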
- class lazypredict.TimeSeriesForecasting.HoltWintersForecaster(seasonal: str = 'add', seasonal_periods: int | None = None, label_suffix: str = '')[source]
Bases: ForecasterWrapper
Holt-Winters triple exponential smoothing with additive or multiplicative seasonality.
Requires statsmodels.
- Parameters:
seasonal (str, optional (default="add")) – Type of seasonal component: "add" or "mul".
seasonal_periods (int or None, optional (default=None)) – Number of observations per seasonal cycle (e.g. 12 for monthly data with yearly seasonality). Must be >= 2.
label_suffix (str, optional (default="")) – Suffix appended to the model name for display.
- fit(y_train, X_train=None)[source]
Fit the forecaster on training data.
- Parameters:
y_train (np.ndarray) – 1-D array of training observations in chronological order.
X_train (np.ndarray or None, optional) – Exogenous feature matrix of shape (len(y_train), n_features).
- property name
Human-readable model name used as index in result tables.
- predict(horizon, X_test=None)[source]
Forecast horizon steps into the future.
- Parameters:
horizon (int) – Number of future time steps to forecast.
X_test (np.ndarray or None, optional) – Exogenous features for the forecast period, shape (horizon, n_features).
- Returns:
1-D array of length horizon with point forecasts.
- Return type:
np.ndarray
- class lazypredict.TimeSeriesForecasting.LSTMForecaster(n_lags: int = 10, n_rolling: Tuple[int, ...] = (3, 7), hidden_size: int = 64, n_epochs: int = 50, batch_size: int = 32, learning_rate: float = 0.001, random_state: int = 42, use_gpu: bool = False)[source]
Bases: _TorchRNNForecaster
Single-layer LSTM forecaster.
See _TorchRNNForecaster for parameter details. Requires torch.
- class lazypredict.TimeSeriesForecasting.LazyForecaster(verbose: int = 0, ignore_warnings: bool = True, custom_metric: Callable | None = None, predictions: bool = False, random_state: int = 42, forecasters: str | List[str] = 'all', cv: int | None = None, timeout: int | float | None = None, n_lags: int = 10, n_rolling: Tuple[int, ...] = (3, 7), seasonal_period: int | None = None, sort_by: str = 'RMSE', n_jobs: int = -1, max_models: int | None = None, progress_callback: Callable | None = None, use_gpu: bool = False, foundation_model_path: str | None = None)[source]
Bases: object
Fit multiple time series forecasting models and benchmark them.
Runs statistical, machine-learning, deep-learning, and pretrained foundation models on a time series and returns a ranked DataFrame of metrics so you can quickly see which approach works best.
- Parameters:
verbose (int, optional (default=0)) – Controls progress-bar visibility and per-model metric logging.
ignore_warnings (bool, optional (default=True)) – When True, model-level exceptions are silently stored in self.errors and the loop continues.
custom_metric (callable or None, optional (default=None)) – Additional metric function f(y_true, y_pred) -> float.
predictions (bool, optional (default=False)) – When True, fit() returns a second DataFrame of predictions.
random_state (int, optional (default=42)) – Seed for ML and deep-learning models.
forecasters (str or list, optional (default="all")) – "all" to run every available model, or a list of model names (strings) to select a subset.
cv (int or None, optional (default=None)) – Number of TimeSeriesSplit folds for cross-validation.
timeout (int, float, or None, optional (default=None)) – Maximum training time in seconds per model.
n_lags (int, optional (default=10)) – Number of lag features for ML/DL models.
n_rolling (tuple of int, optional (default=(3, 7))) – Rolling-window sizes for feature engineering.
seasonal_period (int or None, optional (default=None)) – Seasonal period. None triggers auto-detection via ACF.
sort_by (str, optional (default="RMSE")) – Metric column to sort results by.
n_jobs (int, optional (default=-1)) – Parallel jobs for cross-validation.
max_models (int or None, optional (default=None)) – Limit the number of models to train.
progress_callback (callable or None, optional (default=None)) – Called after each model as f(name, current, total, metrics).
use_gpu (bool, optional (default=False)) – When True, enables GPU acceleration for models that support it (e.g., XGBoost, LightGBM, LSTM, GRU). Falls back to CPU if CUDA is unavailable.
foundation_model_path (str or None, optional (default=None)) – Local filesystem path to pre-downloaded foundation model weights (e.g. TimesFM). Use this when you are offline, behind a firewall, or in an air-gapped environment. When None (default), the model is downloaded from Hugging Face automatically.
- models
Fitted ForecasterWrapper objects keyed by model name.
- Type:
dict
- errors
Exceptions from models that failed, keyed by model name.
- Type:
dict
- fit(y_train: Series | ndarray, y_test: Series | ndarray, X_train: DataFrame | ndarray | None = None, X_test: DataFrame | ndarray | None = None) Tuple[DataFrame, DataFrame][source]
Fit forecasting models and evaluate on test data.
- Parameters:
y_train (array-like) – Training time series (chronological order).
y_test (array-like) – Held-out future values to forecast against.
X_train (array-like or None) – Exogenous features for the training period.
X_test (array-like or None) – Exogenous features for the forecast period.
- Returns:
scores (pd.DataFrame) – Metric table for every model, sorted by sort_by.
predictions (pd.DataFrame) – Per-model predictions (empty if self.predictions is False).
- load_models(path: str) Dict[str, ForecasterWrapper][source]
Load previously saved models from disk.
- Parameters:
path (str) – Directory containing .joblib files written by save_models().
- Returns:
Mapping of model name to ForecasterWrapper.
- Return type:
dict
- Raises:
FileNotFoundError – If path does not exist.
- predict(y_history: Series | ndarray, horizon: int, model_name: str | None = None, X_test: DataFrame | ndarray | None = None) Dict[str, ndarray] | ndarray[source]
Produce forecasts from previously fitted models.
Each model is re-fit on y_history before predicting so that the most recent observations are used.
- Parameters:
y_history (array-like) – Historical time series to condition the forecast on.
horizon (int) – Number of future time steps to forecast.
model_name (str or None, optional) – If given, only this model is used and a single np.ndarray is returned. Otherwise all fitted models are used and a dict mapping model names to arrays is returned.
X_test (array-like or None, optional) – Exogenous features for the forecast period.
- Returns:
A single forecast array when model_name is specified, or a {name: np.ndarray} dict for all models.
- Return type:
np.ndarray or dict
- Raises:
ValueError – If no models have been fitted or model_name is not found.
- provide_models(y_train: Series | ndarray, y_test: Series | ndarray, X_train: DataFrame | ndarray | None = None, X_test: DataFrame | ndarray | None = None) Dict[str, ForecasterWrapper][source]
Return all fitted forecaster wrappers.
Calls fit() automatically if no models have been fitted yet.
- Parameters:
y_train (array-like) – Training time series.
y_test (array-like) – Test time series (needed only if fit has not been called).
X_train (array-like or None, optional) – Exogenous features for the training period.
X_test (array-like or None, optional) – Exogenous features for the forecast period.
- Returns:
Mapping of model name to fitted ForecasterWrapper.
- Return type:
dict
- class lazypredict.TimeSeriesForecasting.MLForecaster(estimator_class, model_name: str, n_lags: int = 10, n_rolling: Tuple[int, ...] = (3, 7), random_state: int = 42, use_gpu: bool = False)[source]
Bases: ForecasterWrapper
Wraps any scikit-learn regressor for time series by engineering lag and rolling features.
The training series is transformed into a tabular supervised-learning problem using create_lag_features(). Multi-step forecasts are produced via recursive (autoregressive) prediction with recursive_forecast().
- Parameters:
estimator_class (type) – An sklearn-compatible regressor class (not an instance).
model_name (str) – Display name used in result tables.
n_lags (int, optional (default=10)) – Number of lag features to create.
n_rolling (tuple of int, optional (default=(3, 7))) – Window sizes for rolling mean/std features.
random_state (int, optional (default=42)) – Seed passed to the estimator if it accepts random_state.
- fit(y_train, X_train=None)[source]
Fit the forecaster on training data.
- Parameters:
y_train (np.ndarray) – 1-D array of training observations in chronological order.
X_train (np.ndarray or None, optional) – Exogenous feature matrix of shape (len(y_train), n_features).
- property name
Human-readable model name used as index in result tables.
- predict(horizon, X_test=None)[source]
Forecast horizon steps into the future.
- Parameters:
horizon (int) – Number of future time steps to forecast.
X_test (np.ndarray or None, optional) – Exogenous features for the forecast period, shape (horizon, n_features).
- Returns:
1-D array of length horizon with point forecasts.
- Return type:
np.ndarray
- class lazypredict.TimeSeriesForecasting.NaiveForecaster[source]
Bases: ForecasterWrapper
Naive baseline that predicts the last observed value for all future steps.
This is the simplest possible forecaster and serves as a lower-bound benchmark. Exogenous features are ignored.
- fit(y_train, X_train=None)[source]
Fit the forecaster on training data.
- Parameters:
y_train (np.ndarray) – 1-D array of training observations in chronological order.
X_train (np.ndarray or None, optional) – Exogenous feature matrix of shape (len(y_train), n_features).
- property name
Human-readable model name used as index in result tables.
- predict(horizon, X_test=None)[source]
Forecast horizon steps into the future.
- Parameters:
horizon (int) – Number of future time steps to forecast.
X_test (np.ndarray or None, optional) – Exogenous features for the forecast period, shape (horizon, n_features).
- Returns:
1-D array of length horizon with point forecasts.
- Return type:
np.ndarray
- class lazypredict.TimeSeriesForecasting.SARIMAXForecaster(seasonal_period: int | None = None)[source]
Bases: ForecasterWrapper
Seasonal ARIMA with eXogenous regressors (SARIMAX).
Uses a default order of (1,1,1) with seasonal order (1,1,1,sp) when a seasonal period is detected. Supports exogenous features via X_train / X_test.
Requires statsmodels.
- Parameters:
seasonal_period (int or None, optional (default=None)) – Seasonal period. When <= 1 the seasonal component is disabled.
- fit(y_train, X_train=None)[source]
Fit the forecaster on training data.
- Parameters:
y_train (np.ndarray) – 1-D array of training observations in chronological order.
X_train (np.ndarray or None, optional) – Exogenous feature matrix of shape (len(y_train), n_features).
- property name
Human-readable model name used as index in result tables.
- predict(horizon, X_test=None)[source]
Forecast horizon steps into the future.
- Parameters:
horizon (int) – Number of future time steps to forecast.
X_test (np.ndarray or None, optional) – Exogenous features for the forecast period, shape (horizon, n_features).
- Returns:
1-D array of length horizon with point forecasts.
- Return type:
np.ndarray
- class lazypredict.TimeSeriesForecasting.SeasonalNaiveForecaster(seasonal_period: int = 1)[source]
Bases: ForecasterWrapper
Seasonal naive baseline that repeats the last complete seasonal cycle.
- Parameters:
seasonal_period (int, optional (default=1)) – Length of one seasonal cycle. When set to 1 this behaves identically to NaiveForecaster.
- fit(y_train, X_train=None)[source]
Fit the forecaster on training data.
- Parameters:
y_train (np.ndarray) – 1-D array of training observations in chronological order.
X_train (np.ndarray or None, optional) – Exogenous feature matrix of shape (len(y_train), n_features).
- property name
Human-readable model name used as index in result tables.
- predict(horizon, X_test=None)[source]
Forecast horizon steps into the future.
- Parameters:
horizon (int) – Number of future time steps to forecast.
X_test (np.ndarray or None, optional) – Exogenous features for the forecast period, shape (horizon, n_features).
- Returns:
1-D array of length horizon with point forecasts.
- Return type:
np.ndarray
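The seasonal naive rule, and its reduction to the plain naive forecast when the period is 1, can be sketched in a few lines. The function seasonal_naive_forecast is a hypothetical standalone illustration that uses plain lists; it is not the wrapper's code.

```python
def seasonal_naive_forecast(y, horizon, seasonal_period=1):
    """Repeat the last `seasonal_period` observations until `horizon` is covered."""
    if seasonal_period < 1:
        raise ValueError("seasonal_period must be >= 1")
    if len(y) < seasonal_period:
        raise ValueError("series is shorter than one seasonal cycle")
    last_cycle = list(y[-seasonal_period:])
    return [last_cycle[h % seasonal_period] for h in range(horizon)]


history = [10, 20, 30, 11, 21, 31]
# Period 3: the last cycle (11, 21, 31) is tiled over the forecast horizon.
seasonal = seasonal_naive_forecast(history, horizon=5, seasonal_period=3)
# Period 1: only the last observation is repeated, i.e. the plain naive forecast.
naive = seasonal_naive_forecast(history, horizon=3, seasonal_period=1)
print(seasonal, naive)
```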
- class lazypredict.TimeSeriesForecasting.SimpleExpSmoothingForecaster[source]
Bases: ForecasterWrapper
Simple Exponential Smoothing (no trend, no seasonality).
Uses statsmodels.tsa.holtwinters.SimpleExpSmoothing with optimised smoothing parameters. Suitable for series without trend or seasonality. Requires statsmodels.
- fit(y_train, X_train=None)[source]
Fit the forecaster on training data.
- Parameters:
y_train (np.ndarray) – 1-D array of training observations in chronological order.
X_train (np.ndarray or None, optional) – Exogenous feature matrix of shape (len(y_train), n_features).
- property name
Human-readable model name used as index in result tables.
- predict(horizon, X_test=None)[source]
Forecast horizon steps into the future.
- Parameters:
horizon (int) – Number of future time steps to forecast.
X_test (np.ndarray or None, optional) – Exogenous features for the forecast period, shape (horizon, n_features).
- Returns:
1-D array of length horizon with point forecasts.
- Return type:
np.ndarray
- class lazypredict.TimeSeriesForecasting.ThetaForecaster(seasonal_period: int | None = None)[source]
Bases: ForecasterWrapper
Theta method from statsmodels.
The Theta method decomposes the series using a modified theta-line and produces forecasts by extrapolating with simple exponential smoothing.
Requires statsmodels.
- Parameters:
seasonal_period (int or None, optional (default=None)) – Seasonal period passed to ThetaModel. None defaults to 1 (no seasonality).
- fit(y_train, X_train=None)[source]
Fit the forecaster on training data.
- Parameters:
y_train (np.ndarray) – 1-D array of training observations in chronological order.
X_train (np.ndarray or None, optional) – Exogenous feature matrix of shape (len(y_train), n_features).
- property name
Human-readable model name used as index in result tables.
- predict(horizon, X_test=None)[source]
Forecast horizon steps into the future.
- Parameters:
horizon (int) – Number of future time steps to forecast.
X_test (np.ndarray or None, optional) – Exogenous features for the forecast period, shape (horizon, n_features).
- Returns:
1-D array of length horizon with point forecasts.
- Return type:
np.ndarray
- class lazypredict.TimeSeriesForecasting.TimesFMForecaster(use_gpu: bool = False, model_path: str | None = None)[source]
Bases: ForecasterWrapper
Google TimesFM 2.5 zero-shot pretrained foundation model for forecasting.
TimesFM is a 200M-parameter transformer pre-trained on a large corpus of real and synthetic time series. It performs zero-shot inference, so no task-specific training is needed. Exogenous features are not supported and will be silently ignored.
When use_gpu=True and CUDA is available, the model is placed on GPU for faster inference.
Requires timesfm and torch (Python 3.10-3.11 only).
- Parameters:
use_gpu (bool, optional (default=False)) – Place the model on a CUDA device when available.
model_path (str or None, optional (default=None)) – Path to a local directory containing the pre-downloaded TimesFM model weights. When None (default), the model is downloaded from Hugging Face (google/timesfm-2.5-200m-pytorch). Use this when you are offline or behind a firewall.
- fit(y_train, X_train=None)[source]
Fit the forecaster on training data.
- Parameters:
y_train (np.ndarray) – 1-D array of training observations in chronological order.
X_train (np.ndarray or None, optional) – Exogenous feature matrix of shape (len(y_train), n_features).
- property name
Human-readable model name used as index in result tables.
- predict(horizon, X_test=None)[source]
Forecast horizon steps into the future.
- Parameters:
horizon (int) – Number of future time steps to forecast.
X_test (np.ndarray or None, optional) – Exogenous features for the forecast period, shape (horizon, n_features).
- Returns:
1-D array of length horizon with point forecasts.
- Return type:
np.ndarray
lazypredict.cli module
Console script for lazypredict — quick model benchmarking from the command line.
lazypredict.config module
Configuration constants and defaults for LazyPredict.
- lazypredict.config.get_cuml_models() dict[source]
Return a mapping of sklearn model names to cuML GPU equivalents.
cuML (RAPIDS) provides GPU-accelerated drop-in replacements for many scikit-learn estimators. This function returns the available ones.
- Returns:
{sklearn_name: cuml_class} for available cuML models. Empty dict if cuML is not installed.
- Return type:
dict
- lazypredict.config.get_gpu_model_params(model_class, use_gpu: bool) dict[source]
Return GPU-related keyword arguments for a model class.
Inspects the model class module to determine if it supports GPU acceleration and returns the appropriate kwargs.
Supported GPU backends:
XGBoost: device="cuda"
LightGBM: device="gpu"
CatBoost: task_type="GPU"
cuML (RAPIDS): No extra params needed (GPU-native).
- Parameters:
model_class (type) – The model class to inspect.
use_gpu (bool) – Whether GPU usage has been requested by the user.
- Returns:
Keyword arguments to pass to the model constructor for GPU support. Empty dict if the model does not support GPU or use_gpu is False.
- Return type:
dict
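The backend table above can be expressed as a lookup keyed on the top-level package of the model class. This is a sketch of the idea only: the name gpu_kwargs_sketch and the exact inspection logic are assumptions, and the dummy classes merely imitate the module paths of real estimators so that no GPU library is needed to run it.

```python
GPU_KWARGS_BY_PACKAGE = {
    "xgboost": {"device": "cuda"},
    "lightgbm": {"device": "gpu"},
    "catboost": {"task_type": "GPU"},
    "cuml": {},  # GPU-native, no extra arguments
}


def gpu_kwargs_sketch(model_class, use_gpu):
    """Return constructor kwargs for GPU use, keyed on the class's top-level package."""
    if not use_gpu:
        return {}
    package = model_class.__module__.split(".")[0]
    # Copy so callers cannot mutate the lookup table.
    return dict(GPU_KWARGS_BY_PACKAGE.get(package, {}))


# Dummy classes that only imitate the module paths of real estimators.
FakeXGB = type("XGBClassifier", (), {"__module__": "xgboost.sklearn"})
FakeSklearn = type("Ridge", (), {"__module__": "sklearn.linear_model"})

print(gpu_kwargs_sketch(FakeXGB, use_gpu=True))
print(gpu_kwargs_sketch(FakeSklearn, use_gpu=True))
```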
lazypredict.exceptions module
Custom exception types for LazyPredict.
- exception lazypredict.exceptions.DataValidationError[source]
Bases: LazyPredictError, ValueError
Raised when input data fails validation checks.
- exception lazypredict.exceptions.InsufficientDataError[source]
Bases: LazyPredictError, ValueError
Raised when the time series is too short for the requested configuration.
- exception lazypredict.exceptions.InvalidParameterError[source]
Bases: LazyPredictError, ValueError
Raised when an invalid parameter is passed to a constructor or method.
- exception lazypredict.exceptions.LazyPredictError[source]
Bases: Exception
Base exception for all LazyPredict errors.
- exception lazypredict.exceptions.ModelFitError(model_name: str, original_error: Exception)[source]
Bases: LazyPredictError
Raised when a model fails during fitting.
- exception lazypredict.exceptions.TimeoutException[source]
Bases: LazyPredictError
Raised when a model exceeds the allotted time limit.
lazypredict.metrics module
Metric helper functions for LazyPredict.
- lazypredict.metrics.adjusted_rsquared(r2: float, n: int, p: int) float[source]
Calculate adjusted R-squared.
- Parameters:
r2 (float) – R-squared value.
n (int) – Number of observations.
p (int) – Number of predictors.
- Returns:
Adjusted R-squared value.
- Return type:
float
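For reference, the standard definition of adjusted R-squared is 1 - (1 - r2) * (n - 1) / (n - p - 1). The sketch below implements that textbook formula; it is assumed, not confirmed by this page, that the library function computes the same quantity. The name adjusted_rsquared_sketch and the input check are illustrative.

```python
def adjusted_rsquared_sketch(r2, n, p):
    """Textbook adjusted R-squared for n observations and p predictors."""
    if n - p - 1 <= 0:
        raise ValueError("n must be greater than p + 1")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# With 100 observations and 5 predictors, the adjustment lowers 0.9 slightly.
value = adjusted_rsquared_sketch(0.9, n=100, p=5)
print(round(value, 4))
```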
- lazypredict.metrics.compute_forecast_metrics(y_true: ndarray, y_pred: ndarray, y_train: ndarray, seasonal_period: int = 1) dict[source]
Compute all standard forecasting metrics at once.
- Parameters:
y_true (array-like) – Actual test values.
y_pred (array-like) – Predicted values.
y_train (array-like) – Training series (needed for MASE).
seasonal_period (int) – Seasonal period for MASE baseline.
- Returns:
Keys: mae, rmse, r_squared, mape, smape, mase.
- Return type:
dict
- lazypredict.metrics.mean_absolute_percentage_error(y_true: ndarray, y_pred: ndarray) float[source]
Mean Absolute Percentage Error (MAPE).
Undefined when y_true contains zeros; those entries are excluded.
- Parameters:
y_true (array-like) – Actual values.
y_pred (array-like) – Predicted values.
- Returns:
MAPE as a percentage (0–100+).
- Return type:
float
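The zero-exclusion rule described above can be sketched as follows. The function mape_sketch is a hypothetical pure-Python illustration; returning NaN when every actual value is zero is an assumption, since the page does not say what the library does in that case.

```python
def mape_sketch(y_true, y_pred):
    """MAPE in percent, skipping entries where the actual value is zero."""
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred) if t != 0]
    if not ratios:
        return float("nan")  # assumption: nothing left to average
    return 100.0 * sum(ratios) / len(ratios)


# The middle pair is skipped because its actual value is zero.
score = mape_sketch([100, 0, 200], [110, 5, 180])
print(score)
```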
- lazypredict.metrics.mean_absolute_scaled_error(y_true: ndarray, y_pred: ndarray, y_train: ndarray, seasonal_period: int = 1) float[source]
Mean Absolute Scaled Error (MASE).
Scale-free metric relative to the in-sample naive forecast error. seasonal_period=1 compares against a random-walk naive forecast; values > 1 use seasonal naive.
- Parameters:
y_true (array-like) – Actual test values.
y_pred (array-like) – Predicted values.
y_train (array-like) – Training series used to compute the naive scaling factor.
seasonal_period (int) – Seasonal period for the naive baseline (1 = non-seasonal).
- Returns:
MASE value. Values < 1 beat the naive baseline.
- Return type:
float
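The scaling idea behind MASE is to divide the test-set MAE by the in-sample MAE of a naive forecast that repeats the value from seasonal_period steps earlier. The sketch below follows that standard definition; mase_sketch is a hypothetical name, and returning NaN for a zero scale is an assumption.

```python
def mase_sketch(y_true, y_pred, y_train, seasonal_period=1):
    """Test MAE divided by the in-sample MAE of a (seasonal) naive forecast."""
    m = seasonal_period
    if len(y_train) <= m:
        raise ValueError("y_train must be longer than seasonal_period")
    # In-sample errors of the naive rule y[t] = y[t - m].
    naive_errors = [abs(y_train[t] - y_train[t - m]) for t in range(m, len(y_train))]
    scale = sum(naive_errors) / len(naive_errors)
    if scale == 0:
        return float("nan")  # assumption: e.g. a constant training series
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    return mae / scale


# Naive in-sample error is 1.0 here, so MASE equals the test MAE of 0.5.
score = mase_sketch([6, 7], [6.5, 6.5], y_train=[1, 2, 3, 4, 5])
print(score)
```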
- lazypredict.metrics.symmetric_mean_absolute_percentage_error(y_true: ndarray, y_pred: ndarray) float[source]
Symmetric Mean Absolute Percentage Error (SMAPE).
More balanced than MAPE; handles zeros better.
- Parameters:
y_true (array-like) – Actual values.
y_pred (array-like) – Predicted values.
- Returns:
SMAPE as a percentage (0–200).
- Return type:
float
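A 0 to 200 range corresponds to the common SMAPE definition 100 * mean(2 * |y - y_hat| / (|y| + |y_hat|)). The sketch below implements that definition; smape_sketch is a hypothetical name, and counting a pair of exact zeros as zero error is an assumption about edge-case handling.

```python
def smape_sketch(y_true, y_pred):
    """SMAPE in percent on the 0 to 200 scale."""
    terms = []
    for t, p in zip(y_true, y_pred):
        denominator = abs(t) + abs(p)
        # Assumption: a pair of exact zeros counts as zero error.
        terms.append(0.0 if denominator == 0 else 2.0 * abs(t - p) / denominator)
    return 100.0 * sum(terms) / len(terms)


# Predicting 0 for an actual of 100 hits the upper bound of the scale.
worst = smape_sketch([100], [0])
mixed = smape_sketch([100, 100], [100, 50])
print(worst, mixed)
```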
lazypredict.preprocessing module
Preprocessing utilities — encoders, transformers, and cardinality splitting.
- lazypredict.preprocessing.build_preprocessor(X_train: DataFrame, categorical_encoder: str) ColumnTransformer[source]
Build a ColumnTransformer for the given data.
- lazypredict.preprocessing.get_card_split(df: DataFrame, cols: Index | list, n: int = 11) Tuple[Index, Index][source]
Split categorical columns into two lists based on cardinality.
- Parameters:
df (pandas.DataFrame) – DataFrame from which the cardinality of the columns is calculated.
cols (array-like) – Categorical columns to evaluate.
n (int, optional (default=11)) – Columns with more than n unique values are considered high cardinality.
- Returns:
card_low (pandas.Index) – Columns with cardinality <= n.
card_high (pandas.Index) – Columns with cardinality > n.
- lazypredict.preprocessing.get_categorical_encoder(encoder_type: str = 'onehot', cardinality: str = 'low') Pipeline[source]
Get categorical encoder pipeline based on encoder type and cardinality.
- Parameters:
encoder_type (str, optional (default='onehot')) – Type of encoder: 'onehot', 'ordinal', 'target', or 'binary'.
cardinality (str, optional (default='low')) – Cardinality level: 'low' or 'high'.
- Returns:
Sklearn pipeline with imputer and encoder.
- Return type:
Pipeline
- Raises:
ValueError – If encoder_type is not one of the recognised values.
lazypredict.ts_preprocessing module
Time series feature engineering utilities for LazyForecaster.
- lazypredict.ts_preprocessing.create_lag_features(y: ndarray, n_lags: int = 10, n_rolling: Tuple[int, ...] | None = None, X_exog: ndarray | None = None) Tuple[ndarray, ndarray][source]
Create lag features, rolling statistics, and a diff feature.
- Parameters:
y (np.ndarray) – Time series values.
n_lags (int) – Number of lag features (y_{t-1} … y_{t-n_lags}).
n_rolling (tuple of int or None) – Window sizes for rolling mean and rolling std.
X_exog (np.ndarray or None) – Exogenous features to append (shape (len(y), k)).
- Returns:
X_features (np.ndarray) – Feature matrix of shape (n_valid, n_features).
y_target (np.ndarray) – Target values aligned with the feature rows.
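To make the feature layout concrete, here is a rough NumPy sketch of lag, rolling, and diff features. The column order, the use of only past values in each rolling window, and the helper name `lag_features_sketch` are assumptions; exogenous features are left out, and the library's actual layout may differ.

```python
import numpy as np


def lag_features_sketch(y, n_lags=3, n_rolling=(2,)):
    """Build (X_features, y_target) from lags, rolling stats, and one diff."""
    y = np.asarray(y, dtype=float)
    windows = tuple(n_rolling or ())
    # First index with enough history for every feature (diff needs 2 values).
    start = max(n_lags, max(windows, default=0), 2)
    rows = []
    for t in range(start, len(y)):
        feats = list(y[t - n_lags:t][::-1])  # y_{t-1}, ..., y_{t-n_lags}
        for w in windows:
            past = y[t - w:t]                # strictly before t, no leakage
            feats += [past.mean(), past.std()]
        feats.append(y[t - 1] - y[t - 2])    # first difference of the last step
        rows.append(feats)
    return np.array(rows), y[start:]


X_features, y_target = lag_features_sketch(np.arange(10), n_lags=3, n_rolling=(2,))
```

Rows without enough history are dropped, which is why the number of valid rows is smaller than `len(y)`.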
- lazypredict.ts_preprocessing.detect_seasonal_period(y: ndarray, max_period: int = 365) int | None[source]
Auto-detect the dominant seasonal period using autocorrelation.
- Parameters:
y (np.ndarray) – Time series values.
max_period (int) – Maximum candidate period to consider.
- Returns:
Detected seasonal period, or None if no significant seasonality is found.
- Return type:
int or None
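One simple way to detect a period from the autocorrelation function is to pick the lag with the highest local ACF peak above some cut-off. The sketch below illustrates that idea only; the `threshold` parameter, the peak rule, and the name `seasonal_period_sketch` are assumptions, and the library's significance test may be different.

```python
import numpy as np


def seasonal_period_sketch(y, max_period=365, threshold=0.3):
    """Return the lag of the strongest local ACF peak, or None."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    denom = float(np.dot(y, y))
    max_lag = min(max_period, len(y) // 2)
    if denom == 0.0 or max_lag < 3:
        return None  # constant or very short series: nothing to detect
    # acf[i] holds the autocorrelation at lag i + 1.
    acf = np.array(
        [np.dot(y[:-lag], y[lag:]) / denom for lag in range(1, max_lag + 1)]
    )
    best_lag, best_val = None, threshold
    for i in range(1, len(acf) - 1):
        is_peak = acf[i] > acf[i - 1] and acf[i] >= acf[i + 1]
        if is_peak and acf[i] > best_val:
            best_lag, best_val = i + 1, acf[i]
    return best_lag


t = np.arange(240)
monthly = np.sin(2 * np.pi * t / 12)
period = seasonal_period_sketch(monthly)
```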
- lazypredict.ts_preprocessing.recursive_forecast(estimator, scaler: StandardScaler, y_history: ndarray, horizon: int, n_lags: int, n_rolling: Tuple[int, ...] | None = None, X_exog: ndarray | None = None) ndarray[source]
Multi-step recursive (autoregressive) forecast.
At each step the model predicts the next value, which is then appended to the history before computing features for the following step.
- Parameters:
estimator – Fitted sklearn-compatible regressor.
scaler (StandardScaler) – Fitted scaler used during training.
y_history (np.ndarray) – Historical series used for context.
horizon (int) – Number of future steps to predict.
n_lags (int) – Number of lag features.
n_rolling (tuple of int or None) – Rolling window sizes.
X_exog (np.ndarray or None) – Exogenous features for the forecast period (shape (horizon, k)).
- Returns:
Array of length horizon with predicted values.
- Return type:
np.ndarray
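The recursive loop described above can be sketched as follows. To stay self-contained, the sketch takes a hypothetical `predict_one` callable instead of a fitted estimator and scaler, uses lag features only, and ignores rolling windows and exogenous inputs.

```python
import numpy as np


def recursive_forecast_sketch(predict_one, y_history, horizon, n_lags):
    """Predict `horizon` steps, feeding each prediction back into the history."""
    history = [float(v) for v in y_history]
    forecasts = []
    for _ in range(horizon):
        # Most recent n_lags values, newest first.
        features = np.array(history[-n_lags:][::-1])
        y_next = float(predict_one(features))
        forecasts.append(y_next)
        history.append(y_next)  # the prediction becomes part of the context
    return np.array(forecasts)


def mean_of_last_two(features):
    """Stand-in model: average of the two most recent values."""
    return features[:2].mean()


forecast = recursive_forecast_sketch(mean_of_last_two, [1.0, 3.0], horizon=3, n_lags=2)
```

Because every step consumes earlier predictions, errors can compound as the horizon grows.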
Module contents
Top-level package for Lazy Predict.
- class lazypredict.LazyClassifier(verbose: int = 0, ignore_warnings: bool = True, custom_metric: Callable | None = None, predictions: bool = False, random_state: int = 42, classifiers: str | List = 'all', cv: int | None = None, timeout: int | float | None = None, categorical_encoder: str = 'onehot', n_jobs: int = -1, max_models: int | None = None, progress_callback: Callable | None = None, use_gpu: bool = False)[source]
Bases: LazyEstimator
Fit all classification algorithms available in scikit-learn and benchmark them.
- Parameters:
verbose (int, optional (default=0)) – Set to a positive number to enable progress bars and per-model metric output.
ignore_warnings (bool, optional (default=True)) – When True, warnings and errors from individual models are suppressed.
custom_metric (callable or None, optional (default=None)) – A function f(y_true, y_pred) used for additional evaluation.
predictions (bool, optional (default=False)) – When True, fit() returns a tuple of (scores, predictions_dataframe).
random_state (int, optional (default=42)) – Random seed passed to models that accept it.
classifiers (list or "all", optional (default="all")) – Specific classifier classes to train, or "all" for every available one.
cv (int or None, optional (default=None)) – Number of folds for cross-validation. If None, uses train/test split only.
timeout (int or float or None, optional (default=None)) – Maximum seconds for each model. Models exceeding this are skipped.
categorical_encoder (str, optional (default='onehot')) – Encoder for categorical features: 'onehot', 'ordinal', 'target', or 'binary'.
n_jobs (int, optional (default=-1)) – Number of parallel jobs for cross-validation. -1 uses all processors.
max_models (int or None, optional (default=None)) – Maximum number of models to train. None means train all.
progress_callback (callable or None, optional (default=None)) – Callback f(model_name, current, total, metrics) called after each model.
use_gpu (bool, optional (default=False)) – When True, enables GPU acceleration for models that support it (e.g., XGBoost, LightGBM, CatBoost). When cuML (RAPIDS) is installed, GPU-accelerated scikit-learn equivalents are also added automatically. Falls back to CPU if CUDA is unavailable.
Examples
>>> from lazypredict.Supervised import LazyClassifier
>>> from sklearn.datasets import load_breast_cancer
>>> from sklearn.model_selection import train_test_split
>>> data = load_breast_cancer()
>>> X = data.data
>>> y = data.target
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.5, random_state=123)
>>> # predictions=True so that fit() returns the (scores, predictions) tuple
>>> clf = LazyClassifier(verbose=0, ignore_warnings=True, custom_metric=None, predictions=True)
>>> models, predictions = clf.fit(X_train, X_test, y_train, y_test)
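The custom_metric and progress_callback parameters both take plain callables with the signatures given in the parameter list. The functions below are hypothetical examples of each; treating metrics as an opaque object is an assumption, since its exact type is not documented here.

```python
import numpy as np


def misclassification_rate(y_true, y_pred):
    """Custom metric with the f(y_true, y_pred) signature."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true != y_pred))


progress_log = []


def log_progress(model_name, current, total, metrics):
    """Callback with the f(model_name, current, total, metrics) signature."""
    # `metrics` is accepted but not inspected; its structure is an assumption.
    progress_log.append(f"[{current}/{total}] {model_name}")


# Called by hand here to show the expected arguments.
log_progress("DummyClassifier", 1, 3, {"Accuracy": 0.5})
error = misclassification_rate([0, 1, 1, 0], [0, 1, 0, 0])
```

Both callables would then be passed by keyword, for example `LazyClassifier(custom_metric=misclassification_rate, progress_callback=log_progress)`.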
- class lazypredict.LazyEstimator(verbose: int = 0, ignore_warnings: bool = True, custom_metric: Callable | None = None, predictions: bool = False, random_state: int = 42, cv: int | None = None, timeout: int | float | None = None, categorical_encoder: str = 'onehot', n_jobs: int = -1, max_models: int | None = None, progress_callback: Callable | None = None, use_gpu: bool = False)[source]
Bases: object
Abstract base class with shared logic for LazyClassifier and LazyRegressor.
Subclasses must implement _get_estimator_list, _compute_metrics, _build_scores_dataframe, and _estimator_step_name.
- fit(X_train: DataFrame | ndarray, X_test: DataFrame | ndarray, y_train: Series | ndarray, y_test: Series | ndarray) DataFrame | Tuple[DataFrame, DataFrame][source]
Fit estimators and score on test data.
- Parameters:
X_train (array-like) – Training feature matrix.
X_test (array-like) – Testing feature matrix.
y_train (array-like) – Training target vector.
y_test (array-like) – Testing target vector.
- Returns:
scores (pandas.DataFrame) – Metrics for every model.
predictions (pandas.DataFrame) – Only returned when self.predictions is True.
- load_models(path: str) Dict[str, Pipeline][source]
Load models from disk.
- Parameters:
path (str) – Directory path containing saved models.
- Returns:
Mapping of model name to loaded Pipeline.
- Return type:
dict
- predict(X_test: DataFrame | ndarray, model_name: str | None = None) Dict[str, ndarray] | ndarray[source]
Make predictions using fitted models.
- Parameters:
X_test (array-like) – Test feature matrix.
model_name (str or None, optional (default=None)) – Specific model to use. If None, returns predictions from all models.
- Returns:
Dictionary of predictions keyed by model name, or a single array.
- Return type:
dict or numpy.ndarray
- class lazypredict.LazyForecaster(verbose: int = 0, ignore_warnings: bool = True, custom_metric: Callable | None = None, predictions: bool = False, random_state: int = 42, forecasters: str | List[str] = 'all', cv: int | None = None, timeout: int | float | None = None, n_lags: int = 10, n_rolling: Tuple[int, ...] = (3, 7), seasonal_period: int | None = None, sort_by: str = 'RMSE', n_jobs: int = -1, max_models: int | None = None, progress_callback: Callable | None = None, use_gpu: bool = False, foundation_model_path: str | None = None)[source]
Bases: object
Fit multiple time series forecasting models and benchmark them.
Runs statistical, machine-learning, deep-learning, and pretrained foundation models on a time series and returns a ranked DataFrame of metrics so you can quickly see which approach works best.
- Parameters:
verbose (int, optional (default=0)) – Controls progress-bar visibility and per-model metric logging.
ignore_warnings (bool, optional (default=True)) – When True, model-level exceptions are silently stored in self.errors and the loop continues.
custom_metric (callable or None, optional (default=None)) – Additional metric function f(y_true, y_pred) -> float.
predictions (bool, optional (default=False)) – When True, the second DataFrame returned by fit() contains per-model predictions; otherwise it is empty.
random_state (int, optional (default=42)) – Seed for ML and deep-learning models.
forecasters (str or list, optional (default="all")) – "all" to run every available model, or a list of model names (strings) to select a subset.
cv (int or None, optional (default=None)) – Number of TimeSeriesSplit folds for cross-validation.
timeout (int, float, or None, optional (default=None)) – Maximum training time in seconds per model.
n_lags (int, optional (default=10)) – Number of lag features for ML/DL models.
n_rolling (tuple of int, optional (default=(3, 7))) – Rolling-window sizes for feature engineering.
seasonal_period (int or None, optional (default=None)) – Seasonal period. None triggers auto-detection via ACF.
sort_by (str, optional (default="RMSE")) – Metric column to sort results by.
n_jobs (int, optional (default=-1)) – Parallel jobs for cross-validation.
max_models (int or None, optional (default=None)) – Limit the number of models to train.
progress_callback (callable or None, optional (default=None)) – Called after each model as f(name, current, total, metrics).
use_gpu (bool, optional (default=False)) – When True, enables GPU acceleration for models that support it (e.g., XGBoost, LightGBM, LSTM, GRU). Falls back to CPU if CUDA is unavailable.
foundation_model_path (str or None, optional (default=None)) – Local filesystem path to pre-downloaded foundation model weights (e.g. TimesFM). Use this when you are offline, behind a firewall, or in an air-gapped environment. When None (default), the model is downloaded from Hugging Face automatically.
- models
Fitted ForecasterWrapper objects keyed by model name.
- Type:
dict
- errors
Exceptions from models that failed, keyed by model name.
- Type:
dict
- fit(y_train: Series | ndarray, y_test: Series | ndarray, X_train: DataFrame | ndarray | None = None, X_test: DataFrame | ndarray | None = None) Tuple[DataFrame, DataFrame][source]
Fit forecasting models and evaluate on test data.
- Parameters:
y_train (array-like) – Training time series (chronological order).
y_test (array-like) – Held-out future values to forecast against.
X_train (array-like or None) – Exogenous features for the training period.
X_test (array-like or None) – Exogenous features for the forecast period.
- Returns:
scores (pd.DataFrame) – Metric table for every model, sorted by sort_by.
predictions (pd.DataFrame) – Per-model predictions (empty if self.predictions is False).
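Because fit() expects y_train in chronological order and y_test as the held-out future values, the series should be split by position rather than shuffled. A minimal sketch of such a split follows; the helper name `chronological_split` is hypothetical and not part of the package.

```python
import numpy as np


def chronological_split(y, test_size):
    """Hold out the last `test_size` observations without shuffling."""
    y = np.asarray(y)
    if not 0 < test_size < len(y):
        raise ValueError("test_size must be between 1 and len(y) - 1")
    return y[:-test_size], y[-test_size:]


series = np.arange(10)
y_train, y_test = chronological_split(series, test_size=3)
```

When exogenous features are used, X_train and X_test would be split at the same position so that their rows stay aligned with y_train and y_test.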
- load_models(path: str) Dict[str, ForecasterWrapper][source]
Load previously saved models from disk.
- Parameters:
path (str) – Directory containing .joblib files written by save_models().
- Returns:
Mapping of model name to ForecasterWrapper.
- Return type:
dict
- Raises:
FileNotFoundError – If path does not exist.
- predict(y_history: Series | ndarray, horizon: int, model_name: str | None = None, X_test: DataFrame | ndarray | None = None) Dict[str, ndarray] | ndarray[source]
Produce forecasts from previously fitted models.
Each model is re-fit on y_history before predicting so that the most recent observations are used.
- Parameters:
y_history (array-like) – Historical time series to condition the forecast on.
horizon (int) – Number of future time steps to forecast.
model_name (str or None, optional) – If given, only this model is used and a single np.ndarray is returned. Otherwise all fitted models are used and a dict mapping model names to arrays is returned.
X_test (array-like or None, optional) – Exogenous features for the forecast period.
- Returns:
A single forecast array when model_name is specified, or a {name: np.ndarray} dict for all models.
- Return type:
np.ndarray or dict
- Raises:
ValueError – If no models have been fitted or model_name is not found.
- provide_models(y_train: Series | ndarray, y_test: Series | ndarray, X_train: DataFrame | ndarray | None = None, X_test: DataFrame | ndarray | None = None) Dict[str, ForecasterWrapper][source]
Return all fitted forecaster wrappers.
Calls fit() automatically if no models have been fitted yet.
- Parameters:
y_train (array-like) – Training time series.
y_test (array-like) – Test time series (needed only if fit has not been called).
X_train (array-like or None, optional) – Exogenous features for the training period.
X_test (array-like or None, optional) – Exogenous features for the forecast period.
- Returns:
Mapping of model name to fitted ForecasterWrapper.
- Return type:
dict
- class lazypredict.LazyRegressor(verbose: int = 0, ignore_warnings: bool = True, custom_metric: Callable | None = None, predictions: bool = False, random_state: int = 42, regressors: str | List = 'all', cv: int | None = None, timeout: int | float | None = None, categorical_encoder: str = 'onehot', n_jobs: int = -1, max_models: int | None = None, progress_callback: Callable | None = None, use_gpu: bool = False)[source]
Bases: LazyEstimator
Fit all regression algorithms available in scikit-learn and benchmark them.
- Parameters:
verbose (int, optional (default=0)) – Set to a positive number to enable progress bars and per-model metric output.
ignore_warnings (bool, optional (default=True)) – When True, warnings and errors from individual models are suppressed.
custom_metric (callable or None, optional (default=None)) – A function f(y_true, y_pred) used for additional evaluation.
predictions (bool, optional (default=False)) – When True, fit() returns a tuple of (scores, predictions_dataframe).
random_state (int, optional (default=42)) – Random seed passed to models that accept it.
regressors (list or "all", optional (default="all")) – Specific regressor classes to train, or "all" for every available one.
cv (int or None, optional (default=None)) – Number of folds for cross-validation. If None, uses train/test split only.
timeout (int or float or None, optional (default=None)) – Maximum seconds for each model. Models exceeding this are skipped.
categorical_encoder (str, optional (default='onehot')) – Encoder for categorical features: 'onehot', 'ordinal', 'target', or 'binary'.
n_jobs (int, optional (default=-1)) – Number of parallel jobs for cross-validation. -1 uses all processors.
max_models (int or None, optional (default=None)) – Maximum number of models to train. None means train all.
progress_callback (callable or None, optional (default=None)) – Callback f(model_name, current, total, metrics) called after each model.
use_gpu (bool, optional (default=False)) – When True, enables GPU acceleration for models that support it (e.g., XGBoost, LightGBM). Falls back to CPU if CUDA is unavailable.
Examples
>>> from lazypredict.Supervised import LazyRegressor
>>> from sklearn import datasets
>>> from sklearn.utils import shuffle
>>> import numpy as np
>>> diabetes = datasets.load_diabetes()
>>> X, y = shuffle(diabetes.data, diabetes.target, random_state=13)
>>> X = X.astype(np.float32)
>>> offset = int(X.shape[0] * 0.9)
>>> X_train, y_train = X[:offset], y[:offset]
>>> X_test, y_test = X[offset:], y[offset:]
>>> # predictions=True so that fit() returns the (scores, predictions) tuple
>>> reg = LazyRegressor(verbose=0, ignore_warnings=False, custom_metric=None, predictions=True)
>>> models, predictions = reg.fit(X_train, X_test, y_train, y_test)