AutoTS
chronos.autots.forecast
AutoTSTrainer trains a time series pipeline (including data processing, feature engineering, and model) with AutoML.
- class zoo.chronos.autots.forecast.AutoTSTrainer(horizon=1, dt_col='datetime', target_col='value', logs_dir='~/zoo_automl_logs', extra_features_col=None, search_alg=None, search_alg_params=None, scheduler=None, scheduler_params=None, name='automl')
Bases: object
The Automated Time Series Forecast Trainer.
Initialize the AutoTS Trainer.
- Parameters
horizon – steps to look forward
dt_col – the datetime column
target_col – the target column to forecast
extra_features_col – extra feature columns
- fit(train_df, validation_df=None, metric='mse', recipe: zoo.automl.recipe.base.Recipe = <zoo.chronos.config.recipe.SmokeRecipe object>, uncertainty: bool = False, upload_dir=None)
Fit a time series forecasting pipeline with AutoML.
- Parameters
train_df – the input dataframe (as pandas.DataFrame)
validation_df – the validation dataframe (as pandas.DataFrame)
recipe – the search configuration
metric – the evaluation metric to optimize
uncertainty – whether to enable uncertainty calculation (will output an uncertainty sigma)
upload_dir – optional URI to sync training results and checkpoints. Only HDFS URIs are supported for now.
- Returns
a TSPipeline
- class zoo.chronos.autots.forecast.TSPipeline
Bases: object
A pipeline for time series forecasting.
Initialize an empty TSPipeline. Usually it is not called by the user directly. A TSPipeline is obtained either from AutoTSTrainer.fit or from TSPipeline.load.
- save(pipeline_file)
Save the pipeline to a file
- Parameters
pipeline_file – the file path
- Returns
- static load(pipeline_file)
Load pipeline from a file
- Parameters
pipeline_file – the pipeline file
- Returns
a TSPipeline object
- fit(input_df, validation_df=None, uncertainty: bool = False, epochs=1, **user_config)
Incremental fitting.
- Parameters
input_df – the input dataframe
validation_df – the validation dataframe
uncertainty – whether to calculate uncertainty
epochs – number of epochs to train
user_config – user configurations
- Returns
chronos.config.recipe
Recipe is used for search configuration for AutoTSTrainer.
- class zoo.chronos.config.recipe.SmokeRecipe
Bases: zoo.automl.recipe.base.Recipe
A very simple Recipe for smoke test that runs one epoch and one iteration with only 1 random sample.
- class zoo.chronos.config.recipe.MTNetSmokeRecipe
Bases: zoo.automl.recipe.base.Recipe
A very simple Recipe for smoke test that runs one epoch and one iteration with only 1 random sample.
- class zoo.chronos.config.recipe.TCNSmokeRecipe
Bases: zoo.automl.recipe.base.Recipe
A very simple Recipe for smoke test that runs one epoch and one iteration with only 1 random sample.
- class zoo.chronos.config.recipe.PastSeqParamHandler
Bases: object
Utility to handle the PastSeq param.
- class zoo.chronos.config.recipe.GridRandomRecipe(num_rand_samples=1, look_back=2, epochs=5, training_iteration=10)
Bases: zoo.automl.recipe.base.Recipe
A recipe that involves both grid search and random search.
Constructor.
- Parameters
num_rand_samples – number of hyper-param configurations sampled randomly
look_back – the length to look back, either a tuple of 2 int values in the format (min len, max len), or a single int, which is a fixed length to look back.
training_iteration – no. of iterations for training (n epochs) in trials
epochs – no. of epochs to train in each iteration
- class zoo.chronos.config.recipe.LSTMSeq2SeqRandomRecipe(input_feature_num, output_feature_num, future_seq_len, num_rand_samples=1, epochs=1, training_iteration=20, batch_size=[128, 256, 512], lr=(0.001, 0.01), lstm_hidden_dim=[64, 128], lstm_layer_num=[1, 2, 3, 4], dropout=(0, 0.25), teacher_forcing=[True, False])
Bases: zoo.automl.recipe.base.Recipe
A recipe that involves both grid search and random search, only for Seq2SeqPytorch. Note: This recipe is specifically designed for third-party model searching, rather than TimeSequencePredictor.
Constructor. Set a param to a list for grid search. Set a param to a tuple of length 2 for random search.
- Parameters
input_feature_num – (int) no. of input feature
output_feature_num – (int) no. of output feature
future_seq_len – (int) no. of steps to be predicted (i.e. horizon)
num_rand_samples – (int) number of hyper-param configurations sampled randomly
epochs – (int) no. of epochs to train in each iteration
training_iteration – (int) no. of iterations for training (n epochs) in trials
batch_size – (tuple|list) grid search candidates for batch size
lr – (tuple|list) learning rate
lstm_hidden_dim – (tuple|list) lstm hidden dim for both encoder and decoder
lstm_layer_num – (tuple|list) no. of lstm layer for both encoder and decoder
dropout – (tuple|list) dropout for lstm layer
teacher_forcing – (list) whether to use the teacher forcing mechanism during training
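The list-versus-tuple convention described in the constructor note can be illustrated with a small standalone sketch. It is not the library's implementation: `expand_search_space` is a hypothetical helper, and sampling tuples uniformly is an assumption made only for illustration. Lists become grid axes, 2-tuples become random ranges, and anything else is passed through unchanged.

```python
import itertools
import random


def expand_search_space(space, num_rand_samples=1, seed=0):
    """Hypothetical helper: expand a param dict into concrete configurations.

    list  -> grid search axis (every value is tried)
    tuple -> (low, high) range, sampled uniformly (an assumption)
    other -> fixed value copied into every configuration
    """
    rng = random.Random(seed)
    grid_keys = [k for k, v in space.items() if isinstance(v, list)]
    rand_keys = [k for k, v in space.items() if isinstance(v, tuple)]
    fixed = {k: v for k, v in space.items() if not isinstance(v, (list, tuple))}

    configs = []
    for _ in range(num_rand_samples):
        # Each random sample repeats the full grid.
        for combo in itertools.product(*(space[k] for k in grid_keys)):
            config = dict(fixed)
            config.update(zip(grid_keys, combo))
            for key in rand_keys:
                low, high = space[key]
                config[key] = rng.uniform(low, high)
            configs.append(config)
    return configs


space = {
    "batch_size": [128, 256, 512],  # grid axis
    "lstm_hidden_dim": [64, 128],   # grid axis
    "lr": (0.001, 0.01),            # random range
    "dropout": (0, 0.25),           # random range
    "epochs": 1,                    # fixed value
}
configs = expand_search_space(space, num_rand_samples=2)
print(len(configs))  # 2 samples x (3 x 2) grid points = 12
```

Under these assumptions the number of configurations grows with the product of the list lengths, so keeping lists short limits the search cost.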
- class zoo.chronos.config.recipe.LSTMGridRandomRecipe(num_rand_samples=1, epochs=5, training_iteration=10, look_back=2, lstm_1_units=[16, 32, 64, 128], lstm_2_units=[16, 32, 64], batch_size=[32, 64])
Bases: zoo.automl.recipe.base.Recipe
A recipe that involves both grid search and random search, only for LSTM.
Constructor.
- Parameters
lstm_1_units – random search candidates for num of lstm_1_units
lstm_2_units – grid search candidates for num of lstm_2_units
batch_size – grid search candidates for batch size
num_rand_samples – number of hyper-param configurations sampled randomly
look_back – the length to look back, either a tuple of 2 int values in the format (min len, max len), or a single int, which is a fixed length to look back.
training_iteration – no. of iterations for training (n epochs) in trials
epochs – no. of epochs to train in each iteration
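The two accepted forms of look_back can be sketched as follows. `sample_look_back` is a hypothetical name, and treating both tuple bounds as inclusive is an assumption; the source only says that a tuple is (min len, max len) and that a single int is a fixed length.

```python
import random


def sample_look_back(look_back, rng=random):
    """Hypothetical helper: resolve look_back into one concrete length."""
    if isinstance(look_back, tuple):
        if len(look_back) != 2:
            raise ValueError("look_back tuple must be (min len, max len)")
        low, high = look_back
        if low > high:
            raise ValueError("min len must not exceed max len")
        # Assumption: both ends of the range are inclusive.
        return rng.randint(low, high)
    if isinstance(look_back, int):
        # A single int is a fixed look-back length.
        return look_back
    raise TypeError("look_back must be an int or a tuple of 2 ints")


fixed_length = sample_look_back(2)
sampled_length = sample_look_back((3, 7), rng=random.Random(0))
print(fixed_length, 3 <= sampled_length <= 7)
```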
- class zoo.chronos.config.recipe.Seq2SeqRandomRecipe(num_rand_samples=1, epochs=5, training_iteration=10, look_back=2, latent_dim=[32, 64, 128, 256], batch_size=[32, 64])
Bases: zoo.automl.recipe.base.Recipe
A recipe that involves both grid search and random search, only for LSTM.
Constructor.
- Parameters
latent_dim – search candidates for the latent dimension
batch_size – grid search candidates for batch size
num_rand_samples – number of hyper-param configurations sampled randomly
look_back – the length to look back, either a tuple of 2 int values in the format (min len, max len), or a single int, which is a fixed length to look back.
training_iteration – no. of iterations for training (n epochs) in trials
epochs – no. of epochs to train in each iteration
- class zoo.chronos.config.recipe.MTNetGridRandomRecipe(num_rand_samples=1, epochs=5, training_iteration=10, time_step=[3, 4], long_num=[3, 4], cnn_height=[2, 3], cnn_hid_size=[32, 50, 100], ar_size=[2, 3], batch_size=[32, 64])
Bases: zoo.automl.recipe.base.Recipe
Grid + Random Recipe for MTNet.
Constructor.
- Parameters
num_rand_samples – number of hyper-param configurations sampled randomly
training_iteration – no. of iterations for training (n epochs) in trials
epochs – no. of epochs to train in each iteration
time_step – random search candidates for model param “time_step”
long_num – random search candidates for model param “long_num”
ar_size – random search candidates for model param “ar_size”
batch_size – grid search candidates for batch size
cnn_height – random search candidates for model param “cnn_height”
cnn_hid_size – random search candidates for model param “cnn_hid_size”
- class zoo.chronos.config.recipe.TCNGridRandomRecipe(num_rand_samples=1, training_iteration=40, batch_size=[256, 512], hidden_size=[32, 48], levels=[6, 8], kernel_size=[3, 5], dropout=[0, 0.1], lr=[0.001, 0.003])
Bases: zoo.automl.recipe.base.Recipe
Grid + Random Recipe for TCN.
Constructor.
- Parameters
num_rand_samples – number of hyper-param configurations sampled randomly
training_iteration – no. of iterations for training (n epochs) in trials
batch_size – grid search candidates for batch size
hidden_size – grid search candidates for hidden size of each layer
levels – the number of layers
kernel_size – the kernel size of each layer
dropout – dropout rate (1 - keep probability)
lr – learning rate
- class zoo.chronos.config.recipe.RandomRecipe(num_rand_samples=1, look_back=2, epochs=5, reward_metric=-0.05, training_iteration=10)
Bases: zoo.automl.recipe.base.Recipe
Pure random sample Recipe. Often used as baseline.
Constructor.
- Parameters
num_rand_samples – number of hyper-param configurations sampled randomly
look_back – the length to look back, either a tuple of 2 int values in the format (min len, max len), or a single int, which is a fixed length to look back.
reward_metric – the rewarding metric value, when reached, stop trial
training_iteration – no. of iterations for training (n epochs) in trials
epochs – no. of epochs to train in each iteration
- class zoo.chronos.config.recipe.BayesRecipe(num_samples=1, look_back=2, epochs=5, reward_metric=-0.05, training_iteration=5)
Bases: zoo.automl.recipe.base.Recipe
A Bayes search Recipe. (Experimental)
Constructor.
- Parameters
num_samples – number of hyper-param configurations sampled
look_back – the length to look back, either a tuple of 2 int values in the format (min len, max len), or a single int, which is a fixed length to look back.
reward_metric – the rewarding metric value, when reached, stop trial
training_iteration – no. of iterations for training (n epochs) in trials
epochs – no. of epochs to train in each iteration
- class zoo.chronos.config.recipe.XgbRegressorGridRandomRecipe(num_rand_samples=1, n_estimators=[8, 15], max_depth=[10, 15], n_jobs=-1, tree_method='hist', random_state=2, seed=0, lr=(0.0001, 0.1), subsample=0.8, colsample_bytree=0.8, min_child_weight=[1, 2, 3], gamma=0, reg_alpha=0, reg_lambda=1)
Bases: zoo.automl.recipe.base.Recipe
Grid + Random Recipe for XGBoost Regressor.
Constructor. For XGBoost hyper parameters, refer to https://xgboost.readthedocs.io/en/latest/python/python_api.html for details.
- Parameters
num_rand_samples – number of hyper-param configurations sampled randomly
n_estimators – number of gradient boosted trees.
max_depth – max tree depth
n_jobs – number of parallel threads used to run xgboost.
tree_method – specify which tree method to use.
random_state – random number seed.
seed – seed used to generate the folds
lr – learning rate
subsample – subsample ratio of the training instance
colsample_bytree – subsample ratio of columns when constructing each tree.
min_child_weight – minimum sum of instance weight(hessian) needed in a child.
gamma – minimum loss reduction required to make a further partition on a leaf node of the tree.
reg_alpha – L1 regularization term on weights (xgb’s alpha).
reg_lambda – L2 regularization term on weights (xgb’s lambda).
- class zoo.chronos.config.recipe.XgbRegressorSkOptRecipe(num_rand_samples=10, n_estimators_range=(50, 1000), max_depth_range=(2, 15), lr=(0.0001, 0.1), min_child_weight=[1, 2, 3])
Bases: zoo.automl.recipe.base.Recipe
A recipe using SkOpt search algorithm for XGBoost Regressor.
Constructor.
- Parameters
num_rand_samples – number of hyper-param configurations sampled randomly
n_estimators_range – range of number of gradient boosted trees.
max_depth_range – range of max tree depth
lr – learning rate
min_child_weight – minimum sum of instance weight(hessian) needed in a child.
chronos.autots.model.auto_tcn
AutoTCN is a TCN forecasting model with Auto tuning.
- class zoo.chronos.autots.model.auto_tcn.AutoTCN(input_feature_num, output_target_num, past_seq_len, future_seq_len, optimizer, loss, metric, hidden_units=None, levels=None, num_channels=None, kernel_size=7, lr=0.001, dropout=0.2, backend='torch', logs_dir='/tmp/auto_tcn', cpus_per_trial=1, name='auto_tcn')
Bases: object
Create an AutoTCN.
- Parameters
input_feature_num – Int. The number of features in the input
output_target_num – Int. The number of targets in the output
past_seq_len – Int. The number of historical steps used for forecasting.
future_seq_len – Int. The number of future steps to forecast.
optimizer – String or PyTorch optimizer creator function or tf.keras optimizer instance.
loss – String or pytorch/tf.keras loss instance or pytorch loss creator function.
metric – String. The evaluation metric name to optimize. e.g. “mse”
hidden_units – Int or hp sampling function from an integer space. The number of hidden units or filters for each convolutional layer. It is similar to units for LSTM. It defaults to 30. We will omit the hidden_units value if num_channels is specified. For hp sampling, see zoo.orca.automl.hp for more details. e.g. hp.grid_search([32, 64]).
levels – Int or hp sampling function from an integer space. The number of levels of TemporalBlocks to use. It defaults to 8. We will omit the levels value if num_channels is specified.
num_channels – List of integers. A list of hidden_units for each level. You can specify num_channels if you want different hidden_units for different levels. By default, num_channels equals [hidden_units] * (levels - 1) + [output_target_num].
kernel_size – Int or hp sampling function from an integer space. The size of the kernel to use in each convolutional layer.
lr – float or hp sampling function from a float space. Learning rate. e.g. hp.choice([0.001, 0.003, 0.01])
dropout – float or hp sampling function from a float space. Dropout rate. e.g. hp.uniform(0.1, 0.3)
backend – The backend of the TCN model. We only support backend as “torch” for now.
logs_dir – Local directory to save logs and results. It defaults to “/tmp/auto_tcn”
cpus_per_trial – Int. Number of cpus for each trial. It defaults to 1.
name – name of the AutoTCN. It defaults to “auto_tcn”
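The default for num_channels stated above is a one-line formula; the sketch below evaluates it with the documented defaults (hidden_units of 30 and levels of 8). `default_num_channels` is a hypothetical helper written for illustration, not a function of the library.

```python
def default_num_channels(output_target_num, hidden_units=30, levels=8):
    """Hypothetical helper mirroring the documented default:
    [hidden_units] * (levels - 1) + [output_target_num].
    """
    return [hidden_units] * (levels - 1) + [output_target_num]


channels = default_num_channels(output_target_num=1)
print(channels)  # [30, 30, 30, 30, 30, 30, 30, 1]
```

The last level has output_target_num channels, so the list always has exactly `levels` entries.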
- fit(data, epochs=1, batch_size=32, validation_data=None, metric_threshold=None, n_sampling=1, search_alg=None, search_alg_params=None, scheduler=None, scheduler_params=None)
Automatically fit the model and search for the best hyper parameters.
- Parameters
data – Train data. For backend of “torch”, data can be a tuple of ndarrays or a function that takes a config dictionary as parameter and returns a PyTorch DataLoader. For backend of “keras”, data can be a tuple of ndarrays. If data is a tuple of ndarrays, it should be in the form of (x, y), where x is training input data and y is training target data.
epochs – Max number of epochs to train in each trial. Defaults to 1. If you have also set metric_threshold, a trial will stop if either it has been optimized to the metric_threshold or it has been trained for {epochs} epochs.
batch_size – Int or hp sampling function from an integer space. Training batch size. It defaults to 32.
validation_data – Validation data. Validation data type should be the same as data.
metric_threshold – a trial will be terminated when metric threshold is met
n_sampling – Number of times to sample from the search_space. Defaults to 1. If hp.grid_search is in search_space, the grid will be repeated n_sampling of times. If this is -1, (virtually) infinite samples are generated until a stopping condition is met.
search_alg – str, any searcher supported by Ray Tune (i.e. “variant_generator”, “random”, “ax”, “dragonfly”, “skopt”, “hyperopt”, “bayesopt”, “bohb”, “nevergrad”, “optuna”, “zoopt” and “sigopt”)
search_alg_params – extra parameters for searcher algorithm besides search_space, metric and searcher mode
scheduler – str, all supported scheduler provided by ray tune
scheduler_params – parameters for scheduler
- Returns
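As a rough illustration of the (x, y) form that the data parameter expects, the sketch below slices a univariate series into input and target windows using only the standard library. The nesting (samples, steps, features) is an assumption, since the shapes are not stated here, and `roll_windows` is a hypothetical helper; in practice the nested lists would be converted to ndarrays before being passed to fit.

```python
def roll_windows(series, past_seq_len, future_seq_len):
    """Hypothetical helper: slice a univariate series into (x, y) windows.

    x[i] holds past_seq_len consecutive steps and y[i] holds the
    future_seq_len steps that immediately follow. Each step is wrapped in a
    one-element list so the nesting reads (samples, steps, features).
    """
    x, y = [], []
    last_start = len(series) - past_seq_len - future_seq_len
    for start in range(last_start + 1):
        split = start + past_seq_len
        x.append([[value] for value in series[start:split]])
        y.append([[value] for value in series[split:split + future_seq_len]])
    return x, y


series = [float(i) for i in range(10)]
x, y = roll_windows(series, past_seq_len=4, future_seq_len=2)
print(len(x), len(x[0]), len(y[0]))  # 5 4 2
```

With this layout, past_seq_len and future_seq_len correspond to the second dimension of x and y respectively.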
chronos.autots.model.auto_lstm
AutoLSTM is an LSTM forecasting model with Auto tuning.
- class zoo.chronos.autots.model.auto_lstm.AutoLSTM(input_feature_num, output_target_num, past_seq_len, optimizer, loss, metric, hidden_dim=32, layer_num=1, lr=0.001, dropout=0.2, backend='torch', logs_dir='/tmp/auto_lstm', cpus_per_trial=1, name='auto_lstm')
Bases: object
Create an AutoLSTM.
- Parameters
input_feature_num – Int. The number of features in the input
output_target_num – Int. The number of targets in the output
past_seq_len – Int or hp sampling function. The number of historical steps used for forecasting.
optimizer – String or PyTorch optimizer creator function or tf.keras optimizer instance.
loss – String or pytorch/tf.keras loss instance or pytorch loss creator function.
metric – String. The evaluation metric name to optimize. e.g. “mse”
hidden_dim – Int or hp sampling function from an integer space. The number of features in the hidden state h. For hp sampling, see zoo.orca.automl.hp for more details. e.g. hp.grid_search([32, 64]).
layer_num – Int or hp sampling function from an integer space. Number of recurrent layers. e.g. hp.randint(1, 3)
lr – float or hp sampling function from a float space. Learning rate. e.g. hp.choice([0.001, 0.003, 0.01])
dropout – float or hp sampling function from a float space. Dropout rate. e.g. hp.uniform(0.1, 0.3)
backend – The backend of the lstm model. We only support backend as “torch” for now.
logs_dir – Local directory to save logs and results. It defaults to “/tmp/auto_lstm”
cpus_per_trial – Int. Number of cpus for each trial. It defaults to 1.
name – name of the AutoLSTM. It defaults to “auto_lstm”
- fit(data, epochs=1, batch_size=32, validation_data=None, metric_threshold=None, n_sampling=1, search_alg=None, search_alg_params=None, scheduler=None, scheduler_params=None)
Automatically fit the model and search for the best hyper parameters.
- Parameters
data – Train data. For backend of “torch”, data can be a tuple of ndarrays or a function that takes a config dictionary as parameter and returns a PyTorch DataLoader. For backend of “keras”, data can be a tuple of ndarrays. If data is a tuple of ndarrays, it should be in the form of (x, y), where x is training input data and y is training target data.
epochs – Max number of epochs to train in each trial. Defaults to 1. If you have also set metric_threshold, a trial will stop if either it has been optimized to the metric_threshold or it has been trained for {epochs} epochs.
batch_size – Int or hp sampling function from an integer space. Training batch size. It defaults to 32.
validation_data – Validation data. Validation data type should be the same as data.
metric_threshold – a trial will be terminated when metric threshold is met
n_sampling – Number of times to sample from the search_space. Defaults to 1. If hp.grid_search is in search_space, the grid will be repeated n_sampling of times. If this is -1, (virtually) infinite samples are generated until a stopping condition is met.
search_alg – str, any searcher supported by Ray Tune (i.e. “variant_generator”, “random”, “ax”, “dragonfly”, “skopt”, “hyperopt”, “bayesopt”, “bohb”, “nevergrad”, “optuna”, “zoopt” and “sigopt”)
search_alg_params – extra parameters for searcher algorithm besides search_space, metric and searcher mode
scheduler – str, all supported scheduler provided by ray tune
scheduler_params – parameters for scheduler
- Returns
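The interaction between n_sampling and hp.grid_search described above can be estimated with simple arithmetic: the grid is repeated n_sampling times, so the trial count is n_sampling multiplied by the number of grid points. The sketch below is a back-of-the-envelope estimate, not library code; `count_trials` and the plain-dict representation of grid axes are hypothetical.

```python
import math


def count_trials(grid_axes, n_sampling=1):
    """Hypothetical helper: estimate how many trials a search launches.

    grid_axes maps each grid-searched param to its list of candidates.
    Randomly sampled params add no factor: each sample draws one value.
    """
    if n_sampling == -1:
        # (Virtually) infinite samples until a stopping condition is met.
        return math.inf
    grid_points = math.prod(len(values) for values in grid_axes.values())
    return n_sampling * grid_points


grid_axes = {"hidden_dim": [32, 64], "batch_size": [32, 64, 128]}
print(count_trials(grid_axes, n_sampling=4))  # 4 x (2 x 3) = 24
print(count_trials({}, n_sampling=4))         # no grid: 4 trials
```

Since the count multiplies quickly, a small n_sampling is usually enough when several parameters are grid-searched.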