AutoTS

chronos.autots.forecast

AutoTSTrainer trains a time series pipeline (including data processing, feature engineering, and model) with AutoML.

class zoo.chronos.autots.forecast.AutoTSTrainer(horizon=1, dt_col='datetime', target_col='value', logs_dir='~/zoo_automl_logs', extra_features_col=None, search_alg=None, search_alg_params=None, scheduler=None, scheduler_params=None, name='automl')[source]¶

Bases: object

The Automated Time Series Forecast Trainer

Initialize the AutoTS Trainer.

Parameters

horizon – steps to look forward
dt_col – the datetime column
target_col – the target column to forecast
extra_features_col – extra feature columns
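As an illustration of what "steps to look forward" means, the sketch below splits a univariate series into (past, future) pairs. It is a generic windowing example, not the library's internal feature engineering; the helper name make_windows and its look_back argument are hypothetical and chosen only for this sketch.

```python
def make_windows(series, look_back, horizon):
    """Split a univariate series into (past, future) pairs.

    Each sample uses `look_back` consecutive values as input and the
    following `horizon` values as the forecasting target.
    """
    if look_back < 1 or horizon < 1:
        raise ValueError("look_back and horizon must be positive")
    samples = []
    # The last valid start index leaves room for look_back + horizon values.
    for start in range(len(series) - look_back - horizon + 1):
        past = series[start:start + look_back]
        future = series[start + look_back:start + look_back + horizon]
        samples.append((past, future))
    return samples


values = [10, 11, 12, 13, 14, 15]
pairs = make_windows(values, look_back=3, horizon=1)
for past, future in pairs:
    print(past, "->", future)
# [10, 11, 12] -> [13]
# [11, 12, 13] -> [14]
# [12, 13, 14] -> [15]
```

A larger horizon shortens the number of usable samples, since each sample needs look_back + horizon consecutive values.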

fit(train_df, validation_df=None, metric='mse', recipe: zoo.automl.recipe.base.Recipe = <zoo.chronos.config.recipe.SmokeRecipe object>, uncertainty: bool = False, upload_dir=None)[source]¶

Fit a time series forecasting pipeline with AutoML.

Parameters

train_df – the input dataframe (as pandas.DataFrame)
validation_df – the validation dataframe (as pandas.DataFrame)
recipe – the configuration of searching
metric – the evaluation metric to optimize
uncertainty – whether to enable uncertainty calculation (will output an uncertainty sigma)
upload_dir – Optional URI to sync training results and checkpoints. We only support hdfs URI for now.

Returns: a TSPipeline

class zoo.chronos.autots.forecast.TSPipeline[source]¶

Bases: object

A pipeline for time series forecasting.

Initialize an empty TSPipeline. Usually it is not called by the user directly. A TSPipeline is obtained either from AutoTSTrainer.fit or from TSPipeline.load.

save(pipeline_file)[source]¶

Save the pipeline to a file

Parameters: pipeline_file – the file path
Returns

static load(pipeline_file)[source]¶

Load pipeline from a file

Parameters: pipeline_file – the pipeline file
Returns: a TSPipeline object

fit(input_df, validation_df=None, uncertainty: bool = False, epochs=1, **user_config)[source]¶

Incremental Fitting

Parameters

input_df – the input dataframe
validation_df – the validation dataframe
uncertainty – whether to calculate uncertainty
epochs – number of epochs to train
user_config – user configurations

Returns

predict(input_df)[source]¶

Prediction.

Parameters: input_df – the input dataframe
Returns: the forecast results

evaluate(input_df, metrics=['mse'], multioutput='raw_values')[source]¶

Evaluation

Parameters

input_df – the input dataframe
metrics – the evaluation metrics
multioutput – output mode of multiple output, whether to aggregate

Returns

the evaluation results
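The multioutput argument controls whether per-target errors are reported separately or aggregated. The sketch below computes MSE by hand to show the difference. It assumes the option names follow the scikit-learn convention ("raw_values" and "uniform_average"); only "raw_values" appears in the signature above, so the second name is an assumption, and the mse helper is hypothetical.

```python
def mse(y_true, y_pred, multioutput="raw_values"):
    """Mean squared error over rows, computed per output column."""
    n_rows = len(y_true)
    n_cols = len(y_true[0])
    per_output = []
    for col in range(n_cols):
        squared = [(y_true[r][col] - y_pred[r][col]) ** 2 for r in range(n_rows)]
        per_output.append(sum(squared) / n_rows)
    if multioutput == "raw_values":
        # One error value per output column, no aggregation.
        return per_output
    if multioutput == "uniform_average":
        # Assumed option name: plain average across output columns.
        return sum(per_output) / n_cols
    raise ValueError(f"unsupported multioutput mode: {multioutput!r}")


y_true = [[1.0, 2.0], [3.0, 4.0]]
y_pred = [[1.0, 4.0], [3.0, 0.0]]
print(mse(y_true, y_pred))                                 # [0.0, 10.0]
print(mse(y_true, y_pred, multioutput="uniform_average"))  # 5.0
```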

chronos.config.recipe

Recipe is used for search configuration for AutoTSTrainer.

class zoo.chronos.config.recipe.SmokeRecipe[source]¶

Bases: zoo.automl.recipe.base.Recipe

A very simple Recipe for smoke test that runs one epoch and one iteration with only 1 random sample.

class zoo.chronos.config.recipe.MTNetSmokeRecipe[source]¶

Bases: zoo.automl.recipe.base.Recipe

A very simple Recipe for smoke test that runs one epoch and one iteration with only 1 random sample.

class zoo.chronos.config.recipe.TCNSmokeRecipe[source]¶

Bases: zoo.automl.recipe.base.Recipe

A very simple Recipe for smoke test that runs one epoch and one iteration with only 1 random sample.

class zoo.chronos.config.recipe.PastSeqParamHandler[source]¶

Bases: object

Utility to handle PastSeq Param

static get_past_seq_config(look_back)[source]¶

Generate past sequence config based on look_back.

Parameters: look_back – look_back configuration
Returns: search configuration for past sequence
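The recipes in this module accept look_back either as a single int (a fixed length) or as a (min len, max len) tuple. A minimal sketch of how such a value could be resolved to a concrete length is shown below. The function resolve_look_back is hypothetical and is not the library's implementation; in particular, treating both bounds as inclusive and sampling uniformly are assumptions.

```python
import random


def resolve_look_back(look_back, rng):
    """Return a concrete look-back length for one trial."""
    if isinstance(look_back, int):
        # A single int means a fixed length to look back.
        return look_back
    if (isinstance(look_back, tuple) and len(look_back) == 2
            and all(isinstance(v, int) for v in look_back)):
        low, high = look_back
        if low > high:
            raise ValueError("look_back must be (min len, max len)")
        # Assumption: both bounds are inclusive.
        return rng.randint(low, high)
    raise ValueError("look_back must be an int or a tuple of 2 ints")


rng = random.Random(0)
print(resolve_look_back(5, rng))       # always 5
print(resolve_look_back((2, 4), rng))  # some value between 2 and 4
```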

class zoo.chronos.config.recipe.GridRandomRecipe(num_rand_samples=1, look_back=2, epochs=5, training_iteration=10)[source]¶

Bases: zoo.automl.recipe.base.Recipe

A recipe involves both grid search and random search.

Constructor.

Parameters

num_rand_samples – number of hyper-param configurations sampled randomly
look_back – the length to look back, either a tuple with 2 int values, in the format (min len, max len), or a single int, which is a fixed length to look back.
training_iteration – no. of iterations for training (n epochs) in trials
epochs – no. of epochs to train in each iteration

class zoo.chronos.config.recipe.LSTMSeq2SeqRandomRecipe(input_feature_num, output_feature_num, future_seq_len, num_rand_samples=1, epochs=1, training_iteration=20, batch_size=[128, 256, 512], lr=(0.001, 0.01), lstm_hidden_dim=[64, 128], lstm_layer_num=[1, 2, 3, 4], dropout=(0, 0.25), teacher_forcing=[True, False])[source]¶

Bases: zoo.automl.recipe.base.Recipe

A recipe involves both grid search and random search, only for Seq2SeqPytorch. Note: This recipe is specifically designed for third-party model searching, rather than TimeSequencePredictor.

Constructor. Set a param to a list for grid search. Set a param to a tuple with length = 2 for random search.

Parameters

input_feature_num – (int) no. of input features
output_feature_num – (int) no. of output features
future_seq_len – (int) no. of steps to be predicted (i.e. horizon)
num_rand_samples – (int) number of hyper-param configurations sampled randomly
epochs – (int) no. of epochs to train in each iteration
training_iteration – (int) no. of iterations for training (n epochs) in trials
batch_size – (tuple|list) grid search candidates for batch size
lr – (tuple|list) learning rate
lstm_hidden_dim – (tuple|list) lstm hidden dim for both encoder and decoder
lstm_layer_num – (tuple|list) no. of lstm layer for both encoder and decoder
dropout – (tuple|list) dropout for lstm layer
teacher_forcing – (list) whether to use the teacher forcing mechanism during training
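The list-versus-tuple convention described in the constructor can be sketched in plain Python as follows. This is an illustration only: expand_search_space is a hypothetical helper, uniform sampling between the tuple bounds is an assumption (the source does not state the distribution), and the way grid points combine with num_rand_samples is likewise assumed.

```python
import itertools
import random


def expand_search_space(space, num_rand_samples, seed=0):
    """Enumerate list-valued params and sample tuple-valued params."""
    rng = random.Random(seed)
    grid_keys = [k for k, v in space.items() if isinstance(v, list)]
    rand_keys = [k for k, v in space.items() if isinstance(v, tuple)]
    configs = []
    # Every combination of the list-valued (grid) params is visited.
    for combo in itertools.product(*(space[k] for k in grid_keys)):
        for _ in range(num_rand_samples):
            config = dict(zip(grid_keys, combo))
            for key in rand_keys:
                low, high = space[key]
                # Assumption: uniform sampling between the two bounds.
                config[key] = rng.uniform(low, high)
            configs.append(config)
    return configs


space = {
    "batch_size": [128, 256, 512],   # list: grid search
    "lstm_hidden_dim": [64, 128],    # list: grid search
    "lr": (0.001, 0.01),             # tuple: random search
    "dropout": (0, 0.25),            # tuple: random search
}
configs = expand_search_space(space, num_rand_samples=2)
print(len(configs))  # 3 * 2 grid points, 2 random draws each: 12
```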

class zoo.chronos.config.recipe.LSTMGridRandomRecipe(num_rand_samples=1, epochs=5, training_iteration=10, look_back=2, lstm_1_units=[16, 32, 64, 128], lstm_2_units=[16, 32, 64], batch_size=[32, 64])[source]¶

Bases: zoo.automl.recipe.base.Recipe

A recipe involves both grid search and random search, only for LSTM.

Constructor.

Parameters

lstm_1_units – random search candidates for num of lstm_1_units
lstm_2_units – grid search candidates for num of lstm_2_units
batch_size – grid search candidates for batch size
num_rand_samples – number of hyper-param configurations sampled randomly
look_back – the length to look back, either a tuple with 2 int values, in the format (min len, max len), or a single int, which is a fixed length to look back.
training_iteration – no. of iterations for training (n epochs) in trials
epochs – no. of epochs to train in each iteration

class zoo.chronos.config.recipe.Seq2SeqRandomRecipe(num_rand_samples=1, epochs=5, training_iteration=10, look_back=2, latent_dim=[32, 64, 128, 256], batch_size=[32, 64])[source]¶

Bases: zoo.automl.recipe.base.Recipe

A recipe involves both grid search and random search, only for Seq2Seq.

Constructor.

Parameters

latent_dim – search candidates for the latent dimension
batch_size – grid search candidates for batch size
num_rand_samples – number of hyper-param configurations sampled randomly
look_back – the length to look back, either a tuple with 2 int values, in the format (min len, max len), or a single int, which is a fixed length to look back.
training_iteration – no. of iterations for training (n epochs) in trials
epochs – no. of epochs to train in each iteration

class zoo.chronos.config.recipe.MTNetGridRandomRecipe(num_rand_samples=1, epochs=5, training_iteration=10, time_step=[3, 4], long_num=[3, 4], cnn_height=[2, 3], cnn_hid_size=[32, 50, 100], ar_size=[2, 3], batch_size=[32, 64])[source]¶

Bases: zoo.automl.recipe.base.Recipe

Grid+Random Recipe for MTNet

Constructor.

Parameters

num_rand_samples – number of hyper-param configurations sampled randomly
training_iteration – no. of iterations for training (n epochs) in trials
epochs – no. of epochs to train in each iteration
time_step – random search candidates for model param “time_step”
long_num – random search candidates for model param “long_num”
ar_size – random search candidates for model param “ar_size”
batch_size – grid search candidates for batch size
cnn_height – random search candidates for model param “cnn_height”
cnn_hid_size – random search candidates for model param “cnn_hid_size”

class zoo.chronos.config.recipe.TCNGridRandomRecipe(num_rand_samples=1, training_iteration=40, batch_size=[256, 512], hidden_size=[32, 48], levels=[6, 8], kernel_size=[3, 5], dropout=[0, 0.1], lr=[0.001, 0.003])[source]¶

Bases: zoo.automl.recipe.base.Recipe

Grid+Random Recipe for TCN

Constructor.

Parameters

num_rand_samples – number of hyper-param configurations sampled randomly
training_iteration – no. of iterations for training (n epochs) in trials
batch_size – grid search candidates for batch size
hidden_size – grid search candidates for hidden size of each layer
levels – the number of layers
kernel_size – the kernel size of each layer
dropout – dropout rate (1 - keep probability)
lr – learning rate

class zoo.chronos.config.recipe.RandomRecipe(num_rand_samples=1, look_back=2, epochs=5, reward_metric=- 0.05, training_iteration=10)[source]¶

Bases: zoo.automl.recipe.base.Recipe

Pure random sample Recipe. Often used as baseline.

Constructor.

Parameters

num_rand_samples – number of hyper-param configurations sampled randomly
look_back – the length to look back, either a tuple with 2 int values, in the format (min len, max len), or a single int, which is a fixed length to look back.
reward_metric – the rewarding metric value, when reached, stop trial
training_iteration – no. of iterations for training (n epochs) in trials
epochs – no. of epochs to train in each iteration

class zoo.chronos.config.recipe.BayesRecipe(num_samples=1, look_back=2, epochs=5, reward_metric=- 0.05, training_iteration=5)[source]¶

Bases: zoo.automl.recipe.base.Recipe

A Bayes search Recipe. (Experimental)

Constructor.

Parameters

num_samples – number of hyper-param configurations sampled
look_back – the length to look back, either a tuple with 2 int values, in the format (min len, max len), or a single int, which is a fixed length to look back.
reward_metric – the rewarding metric value, when reached, stop trial
training_iteration – no. of iterations for training (n epochs) in trials
epochs – no. of epochs to train in each iteration

class zoo.chronos.config.recipe.XgbRegressorGridRandomRecipe(num_rand_samples=1, n_estimators=[8, 15], max_depth=[10, 15], n_jobs=- 1, tree_method='hist', random_state=2, seed=0, lr=(0.0001, 0.1), subsample=0.8, colsample_bytree=0.8, min_child_weight=[1, 2, 3], gamma=0, reg_alpha=0, reg_lambda=1)[source]¶

Bases: zoo.automl.recipe.base.Recipe

Grid + Random Recipe for XGBoost Regressor.

Constructor. For XGBoost hyper parameters, refer to https://xgboost.readthedocs.io/en/latest/python/python_api.html for details.

Parameters

num_rand_samples – number of hyper-param configurations sampled randomly
n_estimators – number of gradient boosted trees.
max_depth – max tree depth
n_jobs – number of parallel threads used to run xgboost.
tree_method – specify which tree method to use.
random_state – random number seed.
seed – seed used to generate the folds
lr – learning rate
subsample – subsample ratio of the training instance
colsample_bytree – subsample ratio of columns when constructing each tree.
min_child_weight – minimum sum of instance weight(hessian) needed in a child.
gamma – minimum loss reduction required to make a further partition on a leaf node of the tree.
reg_alpha – L1 regularization term on weights (xgb’s alpha).
reg_lambda – L2 regularization term on weights (xgb’s lambda).

class zoo.chronos.config.recipe.XgbRegressorSkOptRecipe(num_rand_samples=10, n_estimators_range=(50, 1000), max_depth_range=(2, 15), lr=(0.0001, 0.1), min_child_weight=[1, 2, 3])[source]¶

Bases: zoo.automl.recipe.base.Recipe

A recipe using SkOpt search algorithm for XGBoost Regressor.

Constructor.

Parameters

num_rand_samples – number of hyper-param configurations sampled randomly
n_estimators_range – range of number of gradient boosted trees.
max_depth_range – range of max tree depth
lr – learning rate
min_child_weight – minimum sum of instance weight(hessian) needed in a child.

chronos.autots.model.auto_tcn

AutoTCN is a TCN forecasting model with Auto tuning.

class zoo.chronos.autots.model.auto_tcn.AutoTCN(input_feature_num, output_target_num, past_seq_len, future_seq_len, optimizer, loss, metric, hidden_units=None, levels=None, num_channels=None, kernel_size=7, lr=0.001, dropout=0.2, backend='torch', logs_dir='/tmp/auto_tcn', cpus_per_trial=1, name='auto_tcn')[source]¶

Bases: object

Create an AutoTCN.

Parameters

input_feature_num – Int. The number of features in the input
output_target_num – Int. The number of targets in the output
past_seq_len – Int. The number of historical steps used for forecasting.
future_seq_len – Int. The number of future steps to forecast.
optimizer – String or pyTorch optimizer creator function or tf.keras optimizer instance.
loss – String or pytorch/tf.keras loss instance or pytorch loss creator function.
metric – String. The evaluation metric name to optimize. e.g. “mse”
hidden_units – Int or hp sampling function from an integer space. The number of hidden units or filters for each convolutional layer. It is similar to units for LSTM. It defaults to 30. The hidden_units value is ignored if num_channels is specified. For hp sampling, see zoo.orca.automl.hp for more details, e.g. hp.grid_search([32, 64]).
levels – Int or hp sampling function from an integer space. The number of levels of TemporalBlocks to use. It defaults to 8. The levels value is ignored if num_channels is specified.
num_channels – List of integers. A list of hidden_units for each level. You could specify num_channels if you want different hidden_units for different levels. By default, num_channels equals [hidden_units] * (levels - 1) + [output_target_num].
kernel_size – Int or hp sampling function from an integer space. The size of the kernel to use in each convolutional layer.
lr – float or hp sampling function from a float space. Learning rate. e.g. hp.choice([0.001, 0.003, 0.01])
dropout – float or hp sampling function from a float space. Dropout rate. e.g. hp.uniform(0.1, 0.3)
backend – The backend of the TCN model. We only support backend as “torch” for now.
logs_dir – Local directory to save logs and results. It defaults to “/tmp/auto_tcn”
cpus_per_trial – Int. Number of cpus for each trial. It defaults to 1.
name – name of the AutoTCN. It defaults to “auto_tcn”
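The interaction between hidden_units, levels and num_channels described above can be worked through with a small sketch. The helper resolve_num_channels is hypothetical and only mirrors the documented formula and defaults (30 hidden units, 8 levels); it assumes plain int values rather than hp sampling functions.

```python
def resolve_num_channels(output_target_num, hidden_units=None, levels=None,
                         num_channels=None):
    """Apply the documented default for num_channels."""
    if num_channels is not None:
        # An explicit num_channels takes precedence; hidden_units and
        # levels are ignored in that case.
        return list(num_channels)
    hidden_units = 30 if hidden_units is None else hidden_units
    levels = 8 if levels is None else levels
    return [hidden_units] * (levels - 1) + [output_target_num]


print(resolve_num_channels(output_target_num=1))
# [30, 30, 30, 30, 30, 30, 30, 1]
print(resolve_num_channels(2, hidden_units=16, levels=3))
# [16, 16, 2]
print(resolve_num_channels(1, hidden_units=16, levels=3, num_channels=[8, 4, 1]))
# [8, 4, 1]
```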

fit(data, epochs=1, batch_size=32, validation_data=None, metric_threshold=None, n_sampling=1, search_alg=None, search_alg_params=None, scheduler=None, scheduler_params=None)[source]¶

Automatically fit the model and search for the best hyper parameters.

Parameters

data – train data. For backend of “torch”, data can be a tuple of ndarrays or a function that takes a config dictionary as parameter and returns a PyTorch DataLoader. For backend of “keras”, data can be a tuple of ndarrays. If data is a tuple of ndarrays, it should be in the form of (x, y), where x is training input data and y is training target data.
epochs – Max number of epochs to train in each trial. Defaults to 1. If you have also set metric_threshold, a trial will stop either when it reaches the metric_threshold or after it has been trained for the given number of epochs, whichever comes first.
batch_size – Int or hp sampling function from an integer space. Training batch size. It defaults to 32.
validation_data – Validation data. Validation data type should be the same as data.
metric_threshold – a trial will be terminated when metric threshold is met
n_sampling – Number of times to sample from the search_space. Defaults to 1. If hp.grid_search is in search_space, the grid will be repeated n_sampling times. If this is -1, (virtually) infinite samples are generated until a stopping condition is met.
search_alg – str, any searcher supported by Ray Tune (i.e. “variant_generator”, “random”, “ax”, “dragonfly”, “skopt”, “hyperopt”, “bayesopt”, “bohb”, “nevergrad”, “optuna”, “zoopt” and “sigopt”)
search_alg_params – extra parameters for searcher algorithm besides search_space, metric and searcher mode
scheduler – str, any scheduler supported by Ray Tune
scheduler_params – parameters for scheduler

Returns

get_best_model()[source]¶

Get the best TCN model.
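The stopping rule described for epochs and metric_threshold in fit can be sketched as follows. This is a simplified model of a single trial, not the library's scheduler: the epochs_run helper is hypothetical, and it assumes a metric where lower is better (such as “mse”), with a mode switch added only for illustration.

```python
def epochs_run(metric_per_epoch, epochs, metric_threshold=None, mode="min"):
    """Return how many epochs a trial trains before it stops."""
    trained = 0
    # Never train longer than the epochs budget.
    for value in metric_per_epoch[:epochs]:
        trained += 1
        if metric_threshold is None:
            continue
        if mode == "min":
            reached = value <= metric_threshold
        else:
            reached = value >= metric_threshold
        if reached:
            # Threshold met: stop early.
            break
    return trained


losses = [0.9, 0.5, 0.2, 0.1, 0.05]
print(epochs_run(losses, epochs=5))                        # 5
print(epochs_run(losses, epochs=5, metric_threshold=0.2))  # 3
print(epochs_run(losses, epochs=2, metric_threshold=0.2))  # 2
```

In the last call the epochs budget is exhausted before the threshold is reached, so the trial stops after 2 epochs.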

chronos.autots.model.auto_lstm

AutoLSTM is an LSTM forecasting model with Auto tuning.

class zoo.chronos.autots.model.auto_lstm.AutoLSTM(input_feature_num, output_target_num, past_seq_len, optimizer, loss, metric, hidden_dim=32, layer_num=1, lr=0.001, dropout=0.2, backend='torch', logs_dir='/tmp/auto_lstm', cpus_per_trial=1, name='auto_lstm')[source]¶

Bases: object

Create an AutoLSTM.

Parameters

input_feature_num – Int. The number of features in the input
output_target_num – Int. The number of targets in the output
past_seq_len – Int or hp sampling function. The number of historical steps used for forecasting.
optimizer – String or pyTorch optimizer creator function or tf.keras optimizer instance.
loss – String or pytorch/tf.keras loss instance or pytorch loss creator function.
metric – String. The evaluation metric name to optimize. e.g. “mse”
hidden_dim – Int or hp sampling function from an integer space. The number of features in the hidden state h. For hp sampling, see zoo.orca.automl.hp for more details, e.g. hp.grid_search([32, 64]).
layer_num – Int or hp sampling function from an integer space. Number of recurrent layers. e.g. hp.randint(1, 3)
lr – float or hp sampling function from a float space. Learning rate. e.g. hp.choice([0.001, 0.003, 0.01])
dropout – float or hp sampling function from a float space. Dropout rate. e.g. hp.uniform(0.1, 0.3)
backend – The backend of the lstm model. We only support backend as “torch” for now.
logs_dir – Local directory to save logs and results. It defaults to “/tmp/auto_lstm”
cpus_per_trial – Int. Number of cpus for each trial. It defaults to 1.
name – name of the AutoLSTM. It defaults to “auto_lstm”

fit(data, epochs=1, batch_size=32, validation_data=None, metric_threshold=None, n_sampling=1, search_alg=None, search_alg_params=None, scheduler=None, scheduler_params=None)[source]¶

Automatically fit the model and search for the best hyper parameters.

Parameters

data – train data. For backend of “torch”, data can be a tuple of ndarrays or a function that takes a config dictionary as parameter and returns a PyTorch DataLoader. For backend of “keras”, data can be a tuple of ndarrays. If data is a tuple of ndarrays, it should be in the form of (x, y), where x is training input data and y is training target data.
epochs – Max number of epochs to train in each trial. Defaults to 1. If you have also set metric_threshold, a trial will stop either when it reaches the metric_threshold or after it has been trained for the given number of epochs, whichever comes first.
batch_size – Int or hp sampling function from an integer space. Training batch size. It defaults to 32.
validation_data – Validation data. Validation data type should be the same as data.
metric_threshold – a trial will be terminated when metric threshold is met
n_sampling – Number of times to sample from the search_space. Defaults to 1. If hp.grid_search is in search_space, the grid will be repeated n_sampling times. If this is -1, (virtually) infinite samples are generated until a stopping condition is met.
search_alg – str, any searcher supported by Ray Tune (i.e. “variant_generator”, “random”, “ax”, “dragonfly”, “skopt”, “hyperopt”, “bayesopt”, “bohb”, “nevergrad”, “optuna”, “zoopt” and “sigopt”)
search_alg_params – extra parameters for searcher algorithm besides search_space, metric and searcher mode
scheduler – str, any scheduler supported by Ray Tune
scheduler_params – parameters for scheduler

Returns

get_best_model()[source]¶

Get the best LSTM model.
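The effect of n_sampling on the number of trials launched by fit can be estimated with the sketch below. The Grid class is a hypothetical stand-in for an hp.grid_search(...) entry and count_trials is an illustrative helper, not part of the library; the arithmetic follows the statement that the grid is repeated n_sampling times.

```python
import math
from dataclasses import dataclass


@dataclass
class Grid:
    """Hypothetical stand-in for an hp.grid_search(...) entry."""
    values: list


def count_trials(search_space, n_sampling=1):
    """Estimate how many trials a search would launch."""
    if n_sampling == -1:
        # Samples are generated until a stopping condition is met.
        return math.inf
    # Size of the full grid: product of candidate counts (1 if no grid).
    grid_size = math.prod(
        len(v.values) for v in search_space.values() if isinstance(v, Grid)
    )
    return grid_size * n_sampling


space = {"hidden_dim": Grid([32, 64]), "layer_num": Grid([1, 2, 3]), "lr": 0.001}
print(count_trials(space, n_sampling=1))        # 6
print(count_trials(space, n_sampling=3))        # 18
print(count_trials({"lr": 0.001}, n_sampling=4))  # 4
```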