Chronos User Guide

1 Overview

Chronos is an application framework for building large-scale time series analysis applications.

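You can use Chronos for time series forecasting (with the AutoTS pipeline or with standalone forecasters), anomaly detection, and time series data processing and feature engineering.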

2 Install

Install analytics-zoo with the [automl] target to pull in the additional dependencies that Chronos needs.

conda create -n my_env python=3.7
conda activate my_env
pip install --pre --upgrade analytics-zoo[automl]

3 Initialization

Chronos uses Orca to enable distributed training and AutoML capabilities. Initialize Orca as shown below; see Orca Context for more details. Note that the init_ray_on_spark argument must be True for Chronos.

from zoo.orca import init_orca_context

# args.cluster_mode is assumed to come from the script's command-line arguments
if args.cluster_mode == "local":
    init_orca_context(cluster_mode="local", cores=4, init_ray_on_spark=True) # run in local mode
elif args.cluster_mode == "k8s":
    init_orca_context(cluster_mode="k8s", num_nodes=2, cores=2, init_ray_on_spark=True) # run on K8s cluster
elif args.cluster_mode == "yarn":
    init_orca_context(cluster_mode="yarn-client", num_nodes=2, cores=2, init_ray_on_spark=True) # run on Hadoop YARN cluster

View Quick Start for a more detailed example.
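
The initialization snippet reads args.cluster_mode without showing where it comes from. One possible way to supply it is sketched below with the standard argparse module. The --cluster_mode flag name and the ORCA_SETTINGS, parse_args and orca_kwargs helpers are illustrative assumptions, not part of Chronos or Orca; only the keyword arguments are copied from the calls above. The sketch does not call init_orca_context itself, so it runs without analytics-zoo installed.

```python
import argparse

# Keyword arguments copied from the init_orca_context calls in this section.
# The dictionary itself is a hypothetical helper, not an Orca API.
ORCA_SETTINGS = {
    "local": dict(cluster_mode="local", cores=4, init_ray_on_spark=True),
    "k8s": dict(cluster_mode="k8s", num_nodes=2, cores=2, init_ray_on_spark=True),
    "yarn": dict(cluster_mode="yarn-client", num_nodes=2, cores=2, init_ray_on_spark=True),
}


def parse_args(argv=None):
    """Parse the (assumed) --cluster_mode command-line flag."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--cluster_mode", choices=sorted(ORCA_SETTINGS), default="local")
    return parser.parse_args(argv)


def orca_kwargs(cluster_mode):
    """Return a copy of the keyword arguments for the chosen mode."""
    return dict(ORCA_SETTINGS[cluster_mode])


args = parse_args([])  # pass None instead of [] to read the real command line
kwargs = orca_kwargs(args.cluster_mode)
print(kwargs)
# With analytics-zoo installed, the next step would be:
#   from zoo.orca import init_orca_context
#   init_orca_context(**kwargs)
```

Keeping the settings in one table makes it easy to check that init_ray_on_spark is True for every mode.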


4 Forecasting

Time series forecasting uses the history of a series to predict its future values. There are two ways to do forecasting:

  • Use AutoTS pipeline

  • Use Standalone Forecaster pipeline

4.1 Use AutoTS Pipeline (with AutoML)

You can use the AutoTS package to build a time series forecasting pipeline with AutoML.

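The general workflow has two steps:

  • Create an AutoTSTrainer and train it on your data to obtain a TSPipeline.

  • Use the TSPipeline for prediction, evaluation, and incremental fitting.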

View AutoTS notebook example for more details.

4.1.1 Prepare input data

You should prepare a training dataset and, optionally, a validation dataset. Both must be provided as pandas DataFrames. Each DataFrame should have at least two columns:

  • The datetime column, which should be in pandas datetime format (you can use pandas.to_datetime to convert strings to datetimes)

  • The target column, which contains the data points at the associated timestamps; these data points will be used to predict future data points.

Each row may also have other input columns that serve as extra features, so the final input data could look like the following.

datetime    target  extra_feature_1  extra_feature_2
2019-06-06  1.2     1                2
2019-06-07  2.30    2                1
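
As a minimal sketch, the table above could be built as a pandas DataFrame like this. The prepare helper is a hypothetical name introduced for illustration, and sorting by time is an assumption about what a forecasting pipeline typically expects rather than a documented Chronos requirement.

```python
import pandas as pd

# Raw records as they might arrive from a CSV file: datetimes are plain strings.
raw = pd.DataFrame({
    "datetime": ["2019-06-06", "2019-06-07"],
    "target": [1.2, 2.30],
    "extra_feature_1": [1, 2],
    "extra_feature_2": [2, 1],
})


def prepare(df, dt_col="datetime"):
    """Return a copy with dt_col converted to pandas datetime, sorted by time."""
    out = df.copy()
    out[dt_col] = pd.to_datetime(out[dt_col])
    return out.sort_values(dt_col).reset_index(drop=True)


train_df = prepare(raw)
print(train_df.dtypes)
```

The printed dtypes let you confirm that the datetime column is no longer a string column before passing the DataFrame on.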

4.1.2 Create AutoTSTrainer

You can create an AutoTSTrainer as follows (dt_col is the name of the datetime column, target_col is the name of the target column, and extra_features_col lists the names of the extra feature columns):

from zoo.chronos.autots.forecast import AutoTSTrainer

trainer = AutoTSTrainer(dt_col="datetime", target_col="target", horizon=1, extra_features_col=["extra_feature_1","extra_feature_2"])

View AutoTSTrainer API Doc for more details.

4.1.3 Train AutoTS pipeline

You can then train on the input data using AutoTSTrainer.fit with AutoML as follows:

from zoo.chronos.config.recipe import SmokeRecipe  # import path assumed; see the Recipe API docs

ts_pipeline = trainer.fit(train_df, validation_df, recipe=SmokeRecipe())

The recipe argument configures the search space for automatic tuning. View Recipe API docs for available recipes. After training, fit returns a TSPipeline, which includes not only the model but also the data preprocessing and postprocessing steps.

During fit, appropriate hyperparameters are selected automatically for the models and data processing steps in the pipeline. After training has stopped, you can use the built-in visualization tool to inspect the training results.

4.1.4 Use TSPipeline

Use TSPipeline.predict|evaluate|fit for prediction, evaluation or (incremental) fitting. Note: incremental fitting on a TSPipeline only updates the model weights in the standard way and does not involve AutoML.

ts_pipeline.predict(test_df)
ts_pipeline.evaluate(val_df)
ts_pipeline.fit(new_train_df, new_val_df, epochs=10)

Use TSPipeline.save|load to save a pipeline to a file or load it back.

from zoo.chronos.autots.forecast import TSPipeline
loaded_ppl = TSPipeline.load(file)
loaded_ppl.save(another_file)

View TSPipeline API Doc for more details.

Note: init_orca_context is not needed if you just use the trained TSPipeline for inference, evaluation or incremental fitting.


4.2 Use Standalone Forecaster Pipeline

Chronos provides a set of standalone time series forecasters without AutoML support, including deep learning models as well as traditional statistical models.

View the example notebooks for Network Traffic Prediction.

The common process of using a Forecaster is shown below.

f = Forecaster()
f.fit(...)
f.predict(...)

Refer to API docs of each Forecaster for detailed usage instructions and examples.
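
To make the fit-then-predict pattern concrete without depending on any particular Forecaster, here is a toy stand-in. PersistenceForecaster is a hypothetical class written for this guide, not a Chronos API, and its fit(x, y) / predict(x) signatures are an assumption about the general shape of the workflow. It simply repeats the last value of each input window.

```python
class PersistenceForecaster:
    """Hypothetical stand-in, not a Chronos class.

    Predicts that every future step equals the last value of the input window.
    """

    def __init__(self):
        self.horizon = None  # number of future steps, learned from y in fit()

    def fit(self, x, y):
        # x: list of input windows; y: list of target windows, one per input window
        if not x or len(x) != len(y):
            raise ValueError("x and y must be non-empty and have the same length")
        self.horizon = len(y[0])
        return self

    def predict(self, x):
        if self.horizon is None:
            raise RuntimeError("call fit() before predict()")
        return [[window[-1]] * self.horizon for window in x]


f = PersistenceForecaster()
f.fit([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]], [[4.0, 5.0], [5.0, 6.0]])
forecast = f.predict([[3.0, 4.0, 5.0]])
print(forecast)
```

A baseline like this is also a useful sanity check: a trained model should normally beat it on the validation data.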

4.2.1 LSTMForecaster

LSTMForecaster wraps a vanilla LSTM model, and is suitable for univariate time series forecasting.

View Network Traffic Prediction notebook and LSTMForecaster API Doc for more details.

4.2.2 Seq2SeqForecaster

Seq2SeqForecaster wraps a sequence-to-sequence model based on LSTM, and is suitable for multivariate and multistep time series forecasting.

View Seq2SeqForecaster API Doc for more details.
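
Multivariate, multistep forecasting is usually framed as mapping a window of past rows to several future target values. The sketch below shows one common way to cut a series into such samples. The roll function and its lookback, horizon and target_index parameters are illustrative names, not the Chronos API, and the exact input shapes a given Forecaster expects should be taken from its API doc.

```python
def roll(rows, lookback, horizon, target_index=0):
    """Split a multivariate series into (past window, future targets) samples.

    rows: list of per-timestep feature lists, ordered by time.
    Returns a list of (x, y) pairs where x holds `lookback` consecutive rows
    and y holds the next `horizon` values of column `target_index`.
    """
    samples = []
    for start in range(len(rows) - lookback - horizon + 1):
        x = rows[start:start + lookback]
        future = rows[start + lookback:start + lookback + horizon]
        y = [row[target_index] for row in future]
        samples.append((x, y))
    return samples


# Six timesteps, two variables per timestep (target first, then one extra feature).
series = [[10.0, 1], [11.0, 0], [12.0, 1], [13.0, 0], [14.0, 1], [15.0, 0]]
samples = roll(series, lookback=3, horizon=2)
for x, y in samples:
    print(x, "->", y)
```

With six timesteps, a lookback of 3 and a horizon of 2, only two complete samples fit, which is why short series leave little data for training.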

4.2.3 TCNForecaster

A Temporal Convolutional Network (TCN) is a neural network that uses a convolutional architecture rather than a recurrent one. It supports multistep and multivariate cases. Causal convolutions allow large-scale parallel computation, which gives TCN a shorter inference time than RNN-based models such as LSTM.

View Network Traffic multivariate multistep Prediction notebook and TCNForecaster API Doc for more details.
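
The idea of a causal convolution can be illustrated in a few lines of plain Python. This is a sketch of the general technique only, assuming left zero-padding and a single channel; it is not the TCNForecaster implementation, and causal_conv1d is a name chosen for this example. Each output depends only on the current and earlier inputs, and each output can be computed independently of the others, which is what makes the operation easy to parallelize.

```python
def causal_conv1d(signal, kernel, dilation=1):
    """1-D causal convolution with left zero-padding.

    output[t] = sum over k of kernel[k] * signal[t - k * dilation],
    where indices before the start of the signal contribute zero.
    """
    output = []
    for t in range(len(signal)):
        total = 0.0
        for k, weight in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # never look at positions before the series starts
                total += weight * signal[idx]
        output.append(total)
    return output


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [0.5, 0.5]
out = causal_conv1d(signal, kernel, dilation=2)
print(out)

# Changing the last input must not affect any earlier output.
changed = causal_conv1d([1.0, 2.0, 3.0, 4.0, 100.0], kernel, dilation=2)
print(changed[:4] == out[:4])
```

Increasing the dilation widens the span of history each output sees without adding kernel weights.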

4.2.4 MTNetForecaster

MTNetForecaster wraps an MTNet model. The model architecture mostly follows the MTNet paper with slight modifications, and is suitable for multivariate time series forecasting.

View Network Traffic Prediction notebook and MTNetForecaster API Doc for more details.

4.2.5 TCMFForecaster

TCMFForecaster wraps a model architecture that follows the implementation described in the DeepGLO paper with slight modifications. It is especially suitable for extremely high-dimensional (up to millions of series) multivariate time series forecasting.

View High-dimensional Electricity Data Forecasting example and TCMFForecaster API Doc for more details.

5 Anomaly Detection

Anomaly Detection detects abnormal samples in a given time series. Chronos provides a set of unsupervised anomaly detectors.

View the example notebooks for Datacenter AIOps.

5.1 ThresholdDetector

ThresholdDetector detects anomalies based on thresholds. It can be used to detect anomalies in a given time series (notebook), or together with Forecasters to detect anomalies in newly arriving samples (notebook).

View ThresholdDetector API Doc for more details.
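
The two usage modes described above can be sketched in plain Python. The function names and parameters below are hypothetical and only illustrate the idea of threshold-based detection; they are not the ThresholdDetector API. The first function flags values outside an allowed range, and the second flags samples whose forecast error is too large.

```python
def detect_by_range(values, low, high):
    """Return indices of values outside the closed interval [low, high]."""
    return [i for i, v in enumerate(values) if v < low or v > high]


def detect_by_forecast_error(actual, predicted, max_abs_error):
    """Return indices where |actual - predicted| exceeds max_abs_error."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return [
        i
        for i, (a, p) in enumerate(zip(actual, predicted))
        if abs(a - p) > max_abs_error
    ]


observed = [1.0, 1.2, 9.5, 1.1, -4.0]
forecast = [1.1, 1.1, 1.2, 1.0, 1.1]  # e.g. the output of a forecaster

range_anomalies = detect_by_range(observed, low=0.0, high=5.0)
error_anomalies = detect_by_forecast_error(observed, forecast, max_abs_error=1.0)
print(range_anomalies, error_anomalies)
```

The range check needs no model at all, while the forecast-error check adapts to series whose normal level changes over time.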

5.2 AEDetector

AEDetector detects anomalies based on the reconstruction error of an autoencoder network.

View anomaly detection notebook and AEDetector API Doc for more details.

5.3 DBScanDetector

DBScanDetector uses the DBSCAN clustering algorithm for anomaly detection.

View anomaly detection notebook and DBScanDetector API Doc for more details.
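
To show how DBSCAN can flag anomalies, here is a simplified one-dimensional version in plain Python. It is a sketch of the clustering idea, not the DBScanDetector implementation. The eps and min_samples names follow common DBSCAN terminology, and the neighbor count includes the point itself. Points that end up in no cluster are labeled -1 and treated as anomalies.

```python
def dbscan_1d(values, eps, min_samples):
    """Simplified 1-D DBSCAN. Returns one label per value; -1 means noise."""
    n = len(values)
    # Neighbors of i: all points within eps of values[i], including i itself.
    neighbors = [
        [j for j in range(n) if abs(values[i] - values[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n  # None = not visited yet
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # core point: keep growing the cluster
    return labels


values = [1.0, 1.1, 0.9, 1.05, 0.95, 10.0, 1.0]
labels = dbscan_1d(values, eps=0.3, min_samples=3)
anomalies = [i for i, label in enumerate(labels) if label == -1]
print(labels, anomalies)
```

The value 10.0 has no neighbors within eps, so it cannot join the dense cluster around 1.0 and is reported as the only anomaly.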

6 Data Processing and Features

Chronos provides TSDataset for time series data processing and feature engineering.

View TSDataset API Doc for more details.