AutoTS

chronos.autots.forecast

AutoTSTrainer trains a time series pipeline (including data processing, feature engineering, and model) with AutoML.

class zoo.chronos.autots.forecast.AutoTSTrainer(horizon=1, dt_col='datetime', target_col='value', logs_dir='~/zoo_automl_logs', extra_features_col=None, search_alg=None, search_alg_params=None, scheduler=None, scheduler_params=None, name='automl')[source]

Bases: object

The Automated Time Series Forecast Trainer

Initialize the AutoTS Trainer.

Parameters
  • horizon – steps to look forward

  • dt_col – the datetime column

  • target_col – the target column to forecast

  • extra_features_col – extra feature columns

fit(train_df, validation_df=None, metric='mse', recipe: zoo.automl.recipe.base.Recipe = <zoo.chronos.config.recipe.SmokeRecipe object>, uncertainty: bool = False, upload_dir=None)[source]

Fit a time series forecasting pipeline with AutoML.

Parameters
  • train_df – the input dataframe (as pandas.DataFrame)

  • validation_df – the validation dataframe (as pandas.DataFrame)

  • recipe – the configuration of searching

  • metric – the evaluation metric to optimize

  • uncertainty – whether to enable uncertainty calculation (will output an uncertainty sigma)

  • upload_dir – Optional URI to sync training results and checkpoints. We only support hdfs URI for now.

Returns

a TSPipeline
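A minimal usage sketch of the trainer follows. It assumes that analytics-zoo and pandas are installed and that any runtime initialization the library needs has already been done. The helper names make_rows and train_pipeline and the synthetic data are hypothetical; only the constructor arguments and the fit call are taken from the signatures above.

```python
import datetime


def make_rows(n, start=datetime.datetime(2021, 1, 1)):
    """Build n hourly records using AutoTSTrainer's default column
    names ('datetime' and 'value'). The values are synthetic."""
    return [
        {"datetime": start + datetime.timedelta(hours=i), "value": float(i % 24)}
        for i in range(n)
    ]


def train_pipeline(rows):
    """Fit a forecasting pipeline on the given records and return it."""
    # Imported lazily so that make_rows stays usable without these packages.
    import pandas as pd
    from zoo.chronos.autots.forecast import AutoTSTrainer

    train_df = pd.DataFrame(rows)
    trainer = AutoTSTrainer(horizon=1, dt_col="datetime", target_col="value")
    # No recipe is passed, so the default SmokeRecipe from the signature applies.
    return trainer.fit(train_df, metric="mse")
```

Calling train_pipeline(make_rows(200)) would, under those assumptions, return a TSPipeline.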

class zoo.chronos.autots.forecast.TSPipeline[source]

Bases: object

A pipeline for time series forecasting.

Initialize an empty TSPipeline. Usually it is not called by the user directly. A TSPipeline is obtained either from AutoTSTrainer.fit or from TSPipeline.load.

save(pipeline_file)[source]

Save the pipeline to a file

Parameters

pipeline_file – the file path

Returns

static load(pipeline_file)[source]

Load pipeline from a file

Parameters

pipeline_file – the pipeline file

Returns

a TSPipeline object
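The save and load calls can be combined into a round trip. The sketch below shows only the call order: reload_and_predict and StubPipeline are hypothetical, and the stub exists so that the snippet runs without the library. It is not the real TSPipeline.

```python
def reload_and_predict(pipeline_cls, pipeline, pipeline_file, input_df):
    """Save `pipeline`, load it back through `pipeline_cls.load`,
    and return the reloaded pipeline's forecast for `input_df`."""
    pipeline.save(pipeline_file)
    restored = pipeline_cls.load(pipeline_file)
    return restored.predict(input_df)


class StubPipeline:
    """Hypothetical stand-in exposing the same save/load/predict names."""

    _store = {}  # in-memory replacement for the file system

    def __init__(self, offset=0):
        self.offset = offset

    def save(self, pipeline_file):
        StubPipeline._store[pipeline_file] = self.offset

    @staticmethod
    def load(pipeline_file):
        return StubPipeline(StubPipeline._store[pipeline_file])

    def predict(self, input_df):
        return [value + self.offset for value in input_df]


print(reload_and_predict(StubPipeline, StubPipeline(1), "demo.ppl", [1, 2]))
```

With the real library, TSPipeline would be passed as pipeline_cls and the object returned by AutoTSTrainer.fit as pipeline.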

fit(input_df, validation_df=None, uncertainty: bool = False, epochs=1, **user_config)[source]

Incremental Fitting

Parameters
  • input_df – the input dataframe

  • validation_df – the validation dataframe

  • uncertainty – whether to calculate uncertainty

  • epochs – number of epochs to train

  • user_config – user configurations

Returns

predict(input_df)[source]

Prediction.

Parameters

input_df – the input dataframe

Returns

the forecast results

evaluate(input_df, metrics=['mse'], multioutput='raw_values')[source]

Evaluation

Parameters
  • input_df – the input dataframe

  • metrics – the evaluation metrics

  • multioutput – output mode of multiple output, whether to aggregate

Returns

the evaluation results
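The effect of multioutput can be illustrated with a plain-Python MSE. This is a sketch, not the library's implementation: the value 'uniform_average' is assumed by analogy with scikit-learn's metric functions and is not confirmed by this reference.

```python
def mse(y_true, y_pred, multioutput="raw_values"):
    """Mean squared error over rows, computed per output column.

    y_true and y_pred are equal-length lists of rows, each row a list
    with one value per output.
    """
    n_rows = len(y_true)
    n_outputs = len(y_true[0])
    per_output = [
        sum((t[j] - p[j]) ** 2 for t, p in zip(y_true, y_pred)) / n_rows
        for j in range(n_outputs)
    ]
    if multioutput == "raw_values":
        return per_output  # one score per output, no aggregation
    if multioutput == "uniform_average":
        return sum(per_output) / n_outputs  # single aggregated score
    raise ValueError(f"unsupported multioutput: {multioutput!r}")
```

With 'raw_values' the caller gets one score per target column; the aggregated mode collapses them into a single number.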

chronos.config.recipe

Recipe is used for search configuration for AutoTSTrainer.

class zoo.chronos.config.recipe.SmokeRecipe[source]

Bases: zoo.automl.recipe.base.Recipe

A very simple Recipe for smoke test that runs one epoch and one iteration with only 1 random sample.

class zoo.chronos.config.recipe.MTNetSmokeRecipe[source]

Bases: zoo.automl.recipe.base.Recipe

A very simple Recipe for smoke test that runs one epoch and one iteration with only 1 random sample.

class zoo.chronos.config.recipe.TCNSmokeRecipe[source]

Bases: zoo.automl.recipe.base.Recipe

A very simple Recipe for smoke test that runs one epoch and one iteration with only 1 random sample.

class zoo.chronos.config.recipe.PastSeqParamHandler[source]

Bases: object

Utility to handle PastSeq Param

static get_past_seq_config(look_back)[source]

Generate past sequence config based on look_back.

Parameters

look_back – look_back configuration

Returns

search configuration for past sequence

class zoo.chronos.config.recipe.GridRandomRecipe(num_rand_samples=1, look_back=2, epochs=5, training_iteration=10)[source]

Bases: zoo.automl.recipe.base.Recipe

A recipe that involves both grid search and random search.

Constructor.

Parameters
  • num_rand_samples – number of hyper-param configurations sampled randomly

  • look_back – the length to look back, either a tuple of 2 int values in the format (min len, max len), or a single int, which is a fixed length to look back.

  • training_iteration – no. of iterations for training (n epochs) in trials

  • epochs – no. of epochs to train in each iteration
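The two accepted forms of look_back can be told apart as in the sketch below. describe_look_back is a hypothetical helper, and the dictionary it returns is illustrative only; it is not the format produced by PastSeqParamHandler.get_past_seq_config.

```python
def describe_look_back(look_back):
    """Normalize look_back into a small description dict.

    A tuple is read as (min len, max len); an int as a fixed length.
    """
    if isinstance(look_back, tuple):
        if len(look_back) != 2:
            raise ValueError("tuple look_back must be (min len, max len)")
        low, high = look_back
        if not (isinstance(low, int) and isinstance(high, int)) or low > high:
            raise ValueError("expected two ints with min len <= max len")
        return {"kind": "range", "min_len": low, "max_len": high}
    if isinstance(look_back, int):
        return {"kind": "fixed", "len": look_back}
    raise TypeError("look_back must be an int or a tuple of two ints")
```

A fixed length removes look_back from the search, while a range makes it one more hyper-parameter to sample.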

class zoo.chronos.config.recipe.LSTMSeq2SeqRandomRecipe(input_feature_num, output_feature_num, future_seq_len, num_rand_samples=1, epochs=1, training_iteration=20, batch_size=[128, 256, 512], lr=(0.001, 0.01), lstm_hidden_dim=[64, 128], lstm_layer_num=[1, 2, 3, 4], dropout=(0, 0.25), teacher_forcing=[True, False])[source]

Bases: zoo.automl.recipe.base.Recipe

A recipe that involves both grid search and random search, only for Seq2SeqPytorch. Note: This recipe is specifically designed for third-party model searching, rather than TimeSequencePredictor.

Constructor. Set a param to a list for grid search. Set a param to a tuple of length 2 for random search.

Parameters
  • input_feature_num – (int) no. of input feature

  • output_feature_num – (int) no. of output feature

  • future_seq_len – (int) no. of steps to be predicted (i.e. horizon)

  • num_rand_samples – (int) number of hyper-param configurations sampled randomly

  • epochs – (int) no. of epochs to train in each iteration

  • training_iteration – (int) no. of iterations for training (n epochs) in trials

  • batch_size – (tuple|list) grid search candidates for batch size

  • lr – (tuple|list) learning rate

  • lstm_hidden_dim – (tuple|list) lstm hidden dim for both encoder and decoder

  • lstm_layer_num – (tuple|list) no. of lstm layer for both encoder and decoder

  • dropout – (tuple|list) dropout for lstm layer

  • teacher_forcing – (list) whether to use the teacher forcing mechanism during training
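The list-versus-tuple convention can be sketched in plain Python. expand_search_space is a hypothetical helper, not part of the library. Two points are assumptions of this sketch: tuples are sampled uniformly, and the grid is repeated once per random sample.

```python
import itertools
import random


def expand_search_space(space, num_rand_samples=1, seed=0):
    """Return a list of concrete configs from a {name: list | tuple} space.

    list  -> grid search: every listed value is tried.
    tuple -> random search: a value is drawn from (low, high).
    """
    rng = random.Random(seed)
    grid_keys = [k for k, v in space.items() if isinstance(v, list)]
    rand_keys = [k for k, v in space.items() if isinstance(v, tuple)]
    configs = []
    for _ in range(num_rand_samples):
        # One full pass over the grid per random sample.
        for combo in itertools.product(*(space[k] for k in grid_keys)):
            config = dict(zip(grid_keys, combo))
            for k in rand_keys:
                low, high = space[k]
                config[k] = rng.uniform(low, high)
            configs.append(config)
    return configs


space = {
    "batch_size": [128, 256, 512],
    "lstm_hidden_dim": [64, 128],
    "lr": (0.001, 0.01),
}
configs = expand_search_space(space, num_rand_samples=2)
print(len(configs))
```

Each added list-valued param multiplies the number of configurations, whereas tuple-valued params only change what is drawn inside each one.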

class zoo.chronos.config.recipe.LSTMGridRandomRecipe(num_rand_samples=1, epochs=5, training_iteration=10, look_back=2, lstm_1_units=[16, 32, 64, 128], lstm_2_units=[16, 32, 64], batch_size=[32, 64])[source]

Bases: zoo.automl.recipe.base.Recipe

A recipe that involves both grid search and random search, only for LSTM.

Constructor.

Parameters
  • lstm_1_units – random search candidates for num of lstm_1_units

  • lstm_2_units – grid search candidates for num of lstm_2_units

  • batch_size – grid search candidates for batch size

  • num_rand_samples – number of hyper-param configurations sampled randomly

  • look_back – the length to look back, either a tuple of 2 int values in the format (min len, max len), or a single int, which is a fixed length to look back.

  • training_iteration – no. of iterations for training (n epochs) in trials

  • epochs – no. of epochs to train in each iteration

class zoo.chronos.config.recipe.Seq2SeqRandomRecipe(num_rand_samples=1, epochs=5, training_iteration=10, look_back=2, latent_dim=[32, 64, 128, 256], batch_size=[32, 64])[source]

Bases: zoo.automl.recipe.base.Recipe

A recipe that involves both grid search and random search, only for Seq2Seq.

Constructor.

Parameters
  • latent_dim – search candidates for the latent dimension

  • batch_size – grid search candidates for batch size

  • num_rand_samples – number of hyper-param configurations sampled randomly

  • look_back – the length to look back, either a tuple of 2 int values in the format (min len, max len), or a single int, which is a fixed length to look back.

  • training_iteration – no. of iterations for training (n epochs) in trials

  • epochs – no. of epochs to train in each iteration

class zoo.chronos.config.recipe.MTNetGridRandomRecipe(num_rand_samples=1, epochs=5, training_iteration=10, time_step=[3, 4], long_num=[3, 4], cnn_height=[2, 3], cnn_hid_size=[32, 50, 100], ar_size=[2, 3], batch_size=[32, 64])[source]

Bases: zoo.automl.recipe.base.Recipe

Grid+Random Recipe for MTNet

Constructor.

Parameters
  • num_rand_samples – number of hyper-param configurations sampled randomly

  • training_iteration – no. of iterations for training (n epochs) in trials

  • epochs – no. of epochs to train in each iteration

  • time_step – random search candidates for model param “time_step”

  • long_num – random search candidates for model param “long_num”

  • ar_size – random search candidates for model param “ar_size”

  • batch_size – grid search candidates for batch size

  • cnn_height – random search candidates for model param “cnn_height”

  • cnn_hid_size – random search candidates for model param “cnn_hid_size”

class zoo.chronos.config.recipe.TCNGridRandomRecipe(num_rand_samples=1, training_iteration=40, batch_size=[256, 512], hidden_size=[32, 48], levels=[6, 8], kernel_size=[3, 5], dropout=[0, 0.1], lr=[0.001, 0.003])[source]

Bases: zoo.automl.recipe.base.Recipe

Grid+Random Recipe for TCN

Constructor.

Parameters
  • num_rand_samples – number of hyper-param configurations sampled randomly

  • training_iteration – no. of iterations for training (n epochs) in trials

  • batch_size – grid search candidates for batch size

  • hidden_size – grid search candidates for hidden size of each layer

  • levels – the number of layers

  • kernel_size – the kernel size of each layer

  • dropout – dropout rate (1 - keep probability)

  • lr – learning rate

class zoo.chronos.config.recipe.RandomRecipe(num_rand_samples=1, look_back=2, epochs=5, reward_metric=-0.05, training_iteration=10)[source]

Bases: zoo.automl.recipe.base.Recipe

Pure random sample Recipe. Often used as baseline.

Constructor.

Parameters
  • num_rand_samples – number of hyper-param configurations sampled randomly

  • look_back – the length to look back, either a tuple of 2 int values in the format (min len, max len), or a single int, which is a fixed length to look back.

  • reward_metric – the rewarding metric value, when reached, stop trial

  • training_iteration – no. of iterations for training (n epochs) in trials

  • epochs – no. of epochs to train in each iteration

class zoo.chronos.config.recipe.BayesRecipe(num_samples=1, look_back=2, epochs=5, reward_metric=-0.05, training_iteration=5)[source]

Bases: zoo.automl.recipe.base.Recipe

A Bayes search Recipe. (Experimental)

Constructor.

Parameters
  • num_samples – number of hyper-param configurations sampled

  • look_back – the length to look back, either a tuple of 2 int values in the format (min len, max len), or a single int, which is a fixed length to look back.

  • reward_metric – the rewarding metric value, when reached, stop trial

  • training_iteration – no. of iterations for training (n epochs) in trials

  • epochs – no. of epochs to train in each iteration

class zoo.chronos.config.recipe.XgbRegressorGridRandomRecipe(num_rand_samples=1, n_estimators=[8, 15], max_depth=[10, 15], n_jobs=-1, tree_method='hist', random_state=2, seed=0, lr=(0.0001, 0.1), subsample=0.8, colsample_bytree=0.8, min_child_weight=[1, 2, 3], gamma=0, reg_alpha=0, reg_lambda=1)[source]

Bases: zoo.automl.recipe.base.Recipe

Grid + Random Recipe for XGBoost Regressor.

Constructor. For XGBoost hyper parameters, refer to https://xgboost.readthedocs.io/en/latest/python/python_api.html for details.

Parameters
  • num_rand_samples – number of hyper-param configurations sampled randomly

  • n_estimators – number of gradient boosted trees.

  • max_depth – max tree depth

  • n_jobs – number of parallel threads used to run xgboost.

  • tree_method – specify which tree method to use.

  • random_state – random number seed.

  • seed – seed used to generate the folds

  • lr – learning rate

  • subsample – subsample ratio of the training instance

  • colsample_bytree – subsample ratio of columns when constructing each tree.

  • min_child_weight – minimum sum of instance weight(hessian) needed in a child.

  • gamma – minimum loss reduction required to make a further partition on a leaf node of the tree.

  • reg_alpha – L1 regularization term on weights (xgb’s alpha).

  • reg_lambda – L2 regularization term on weights (xgb’s lambda).

class zoo.chronos.config.recipe.XgbRegressorSkOptRecipe(num_rand_samples=10, n_estimators_range=(50, 1000), max_depth_range=(2, 15), lr=(0.0001, 0.1), min_child_weight=[1, 2, 3])[source]

Bases: zoo.automl.recipe.base.Recipe

A recipe using SkOpt search algorithm for XGBoost Regressor.

Constructor.

Parameters
  • num_rand_samples – number of hyper-param configurations sampled randomly

  • n_estimators_range – range of number of gradient boosted trees.

  • max_depth_range – range of max tree depth

  • lr – learning rate

  • min_child_weight – minimum sum of instance weight(hessian) needed in a child.

chronos.autots.model.auto_tcn

AutoTCN is a TCN forecasting model with Auto tuning.

class zoo.chronos.autots.model.auto_tcn.AutoTCN(input_feature_num, output_target_num, past_seq_len, future_seq_len, optimizer, loss, metric, hidden_units=None, levels=None, num_channels=None, kernel_size=7, lr=0.001, dropout=0.2, backend='torch', logs_dir='/tmp/auto_tcn', cpus_per_trial=1, name='auto_tcn')[source]

Bases: object

Create an AutoTCN.

Parameters
  • input_feature_num – Int. The number of features in the input

  • output_target_num – Int. The number of targets in the output

  • past_seq_len – Int. The number of historical steps used for forecasting.

  • future_seq_len – Int. The number of future steps to forecast.

  • optimizer – String or pyTorch optimizer creator function or tf.keras optimizer instance.

  • loss – String or pytorch/tf.keras loss instance or pytorch loss creator function.

  • metric – String. The evaluation metric name to optimize. e.g. “mse”

  • hidden_units – Int or hp sampling function from an integer space. The number of hidden units or filters for each convolutional layer. It is similar to units for LSTM. It defaults to 30. The hidden_units value is ignored if num_channels is specified. For hp sampling, see zoo.orca.automl.hp for more details. e.g. hp.grid_search([32, 64]).

  • levels – Int or hp sampling function from an integer space. The number of levels of TemporalBlocks to use. It defaults to 8. The levels value is ignored if num_channels is specified.

  • num_channels – List of integers. A list of hidden_units for each level. You could specify num_channels if you want different hidden_units for different levels. By default, num_channels equals [hidden_units] * (levels - 1) + [output_target_num].

  • kernel_size – Int or hp sampling function from an integer space. The size of the kernel to use in each convolutional layer.

  • lr – float or hp sampling function from a float space. Learning rate. e.g. hp.choice([0.001, 0.003, 0.01])

  • dropout – float or hp sampling function from a float space. Dropout rate. e.g. hp.uniform(0.1, 0.3)

  • backend – The backend of the TCN model. We only support backend as “torch” for now.

  • logs_dir – Local directory to save logs and results. It defaults to “/tmp/auto_tcn”

  • cpus_per_trial – Int. Number of cpus for each trial. It defaults to 1.

  • name – name of the AutoTCN. It defaults to “auto_tcn”
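How hidden_units, levels and num_channels interact can be sketched as follows. resolve_num_channels is a hypothetical helper that restates the rule given in the parameter descriptions above. It handles plain integers only, not hp sampling functions.

```python
def resolve_num_channels(output_target_num, hidden_units=None, levels=None,
                         num_channels=None):
    """Return the per-level channel list described in the parameter docs."""
    if num_channels is not None:
        # An explicit list wins; hidden_units and levels are ignored.
        return list(num_channels)
    hidden_units = 30 if hidden_units is None else hidden_units  # documented default
    levels = 8 if levels is None else levels                      # documented default
    return [hidden_units] * (levels - 1) + [output_target_num]


print(resolve_num_channels(output_target_num=2, hidden_units=16, levels=3))
```

The last entry always matches output_target_num under the default rule, so only the earlier levels carry hidden_units channels.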

fit(data, epochs=1, batch_size=32, validation_data=None, metric_threshold=None, n_sampling=1, search_alg=None, search_alg_params=None, scheduler=None, scheduler_params=None)[source]

Automatically fit the model and search for the best hyper parameters.

Parameters
  • data – train data. For backend of “torch”, data can be a tuple of ndarrays or a function that takes a config dictionary as parameter and returns a PyTorch DataLoader. For backend of “keras”, data can be a tuple of ndarrays. If data is a tuple of ndarrays, it should be in the form of (x, y), where x is training input data and y is training target data.

  • epochs – Max number of epochs to train in each trial. Defaults to 1. If you have also set metric_threshold, a trial will stop if either it has been optimized to the metric_threshold or it has been trained for {epochs} epochs.

  • batch_size – Int or hp sampling function from an integer space. Training batch size. It defaults to 32.

  • validation_data – Validation data. Validation data type should be the same as data.

  • metric_threshold – a trial will be terminated when metric threshold is met

  • n_sampling – Number of times to sample from the search_space. Defaults to 1. If hp.grid_search is in search_space, the grid will be repeated n_sampling times. If this is -1, (virtually) infinite samples are generated until a stopping condition is met.

  • search_alg – str, all supported searcher provided by ray tune (i.e. “variant_generator”, “random”, “ax”, “dragonfly”, “skopt”, “hyperopt”, “bayesopt”, “bohb”, “nevergrad”, “optuna”, “zoopt” and “sigopt”)

  • search_alg_params – extra parameters for searcher algorithm besides search_space, metric and searcher mode

  • scheduler – str, all supported scheduler provided by ray tune

  • scheduler_params – parameters for scheduler
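The interplay of epochs and metric_threshold can be sketched as a simulated trial. run_trial is hypothetical: the per-epoch metric values stand in for real training, and a lower metric is assumed to be better (as with “mse”).

```python
def run_trial(metric_per_epoch, epochs=1, metric_threshold=None):
    """Return (epochs_trained, last_metric) for one simulated trial.

    The trial ends when the metric reaches metric_threshold or when
    `epochs` epochs have been trained, whichever comes first.
    """
    trained, last = 0, None
    for value in metric_per_epoch[:epochs]:
        trained += 1
        last = value
        if metric_threshold is not None and value <= metric_threshold:
            break  # threshold met: stop early
    return trained, last


print(run_trial([0.9, 0.5, 0.2, 0.1], epochs=10, metric_threshold=0.3))
```

Setting epochs high and relying on metric_threshold lets good trials finish early, while epochs still caps the cost of poor ones.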

Returns

get_best_model()[source]

Get the best tcn model.

chronos.autots.model.auto_lstm

AutoLSTM is an LSTM forecasting model with Auto tuning.

class zoo.chronos.autots.model.auto_lstm.AutoLSTM(input_feature_num, output_target_num, past_seq_len, optimizer, loss, metric, hidden_dim=32, layer_num=1, lr=0.001, dropout=0.2, backend='torch', logs_dir='/tmp/auto_lstm', cpus_per_trial=1, name='auto_lstm')[source]

Bases: object

Create an AutoLSTM.

Parameters
  • input_feature_num – Int. The number of features in the input

  • output_target_num – Int. The number of targets in the output

  • past_seq_len – Int or hp sampling function. The number of historical steps used for forecasting.

  • optimizer – String or pyTorch optimizer creator function or tf.keras optimizer instance.

  • loss – String or pytorch/tf.keras loss instance or pytorch loss creator function.

  • metric – String. The evaluation metric name to optimize. e.g. “mse”

  • hidden_dim – Int or hp sampling function from an integer space. The number of features in the hidden state h. For hp sampling, see zoo.orca.automl.hp for more details. e.g. hp.grid_search([32, 64]).

  • layer_num – Int or hp sampling function from an integer space. Number of recurrent layers. e.g. hp.randint(1, 3)

  • lr – float or hp sampling function from a float space. Learning rate. e.g. hp.choice([0.001, 0.003, 0.01])

  • dropout – float or hp sampling function from a float space. Dropout rate. e.g. hp.uniform(0.1, 0.3)

  • backend – The backend of the lstm model. We only support backend as “torch” for now.

  • logs_dir – Local directory to save logs and results. It defaults to “/tmp/auto_lstm”

  • cpus_per_trial – Int. Number of cpus for each trial. It defaults to 1.

  • name – name of the AutoLSTM. It defaults to “auto_lstm”
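The role of past_seq_len can be illustrated by slicing a series into input and target windows. This is a sketch on plain lists: make_windows is a hypothetical helper, and the array shapes that fit actually expects (for example a trailing feature dimension) are not specified in this reference.

```python
def make_windows(series, past_seq_len, future_seq_len=1):
    """Split a univariate series (list of numbers) into (x, y) samples.

    x[i] holds past_seq_len consecutive values and y[i] holds the
    future_seq_len values that immediately follow them.
    """
    x, y = [], []
    last_start = len(series) - past_seq_len - future_seq_len
    for start in range(last_start + 1):
        split = start + past_seq_len
        x.append(series[start:split])
        y.append(series[split:split + future_seq_len])
    return x, y


x, y = make_windows([1, 2, 3, 4, 5], past_seq_len=3)
print(x, y)
```

A larger past_seq_len gives each sample more history but yields fewer samples from the same series.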

fit(data, epochs=1, batch_size=32, validation_data=None, metric_threshold=None, n_sampling=1, search_alg=None, search_alg_params=None, scheduler=None, scheduler_params=None)[source]

Automatically fit the model and search for the best hyper parameters.

Parameters
  • data – train data. For backend of “torch”, data can be a tuple of ndarrays or a function that takes a config dictionary as parameter and returns a PyTorch DataLoader. For backend of “keras”, data can be a tuple of ndarrays. If data is a tuple of ndarrays, it should be in the form of (x, y), where x is training input data and y is training target data.

  • epochs – Max number of epochs to train in each trial. Defaults to 1. If you have also set metric_threshold, a trial will stop if either it has been optimized to the metric_threshold or it has been trained for {epochs} epochs.

  • batch_size – Int or hp sampling function from an integer space. Training batch size. It defaults to 32.

  • validation_data – Validation data. Validation data type should be the same as data.

  • metric_threshold – a trial will be terminated when metric threshold is met

  • n_sampling – Number of times to sample from the search_space. Defaults to 1. If hp.grid_search is in search_space, the grid will be repeated n_sampling times. If this is -1, (virtually) infinite samples are generated until a stopping condition is met.

  • search_alg – str, all supported searcher provided by ray tune (i.e. “variant_generator”, “random”, “ax”, “dragonfly”, “skopt”, “hyperopt”, “bayesopt”, “bohb”, “nevergrad”, “optuna”, “zoopt” and “sigopt”)

  • search_alg_params – extra parameters for searcher algorithm besides search_space, metric and searcher mode

  • scheduler – str, all supported scheduler provided by ray tune

  • scheduler_params – parameters for scheduler
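The n_sampling rule implies a simple trial count, sketched below. count_trials is a hypothetical helper. It assumes every hp.grid_search entry is fully crossed with the others, and it needs Python 3.8 or later for math.prod.

```python
import math


def count_trials(grid_sizes, n_sampling=1):
    """Number of trials, given the length of each hp.grid_search list.

    Returns math.inf for n_sampling == -1, where sampling continues
    until a stopping condition is met.
    """
    if n_sampling == -1:
        return math.inf
    if n_sampling < 1:
        raise ValueError("n_sampling must be -1 or a positive integer")
    # The whole grid is repeated once per sample.
    return math.prod(grid_sizes) * n_sampling


print(count_trials([2, 3], n_sampling=2))
```

For example, a grid over two hidden_dim values and three batch sizes is passed as [2, 3], so doubling n_sampling doubles the number of trials.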

Returns

get_best_model()[source]

Get the best lstm model.