The Long Short-term Cognitive Network (LSTCN) model [1] is an efficient recurrent neural network for time series forecasting. It supports both one-step-ahead and multiple-step-ahead forecasting of univariate and multivariate time series. In terms of forecasting error, the LSTCN model is competitive with state-of-the-art recurrent neural networks such as LSTM and GRU, while being much faster.
LSTCN can be installed from PyPI:
pip install lstcn
An LSTCN model [1] can be defined as a recurrent neural network composed of sequentially ordered Short-term Cognitive Network (STCN) blocks [2]. Each STCN block is a two-layer neural network that uses shallow learning to fit the model to a specific time patch; the knowledge learned in that block is then transferred to the following block. Time patches are temporal pieces of data resulting from partitioning the time series.
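To make the notion of a time patch concrete, here is a minimal sketch that partitions a series into contiguous, chronologically ordered pieces. The function name `split_into_patches` and the near-equal patch sizes are assumptions made for illustration; the library may partition the data differently.

```python
def split_into_patches(series, n_patches):
    """Partition a sequence into n_patches contiguous, in-order pieces.

    Hypothetical helper for illustration only; not part of the lstcn package.
    """
    if not 1 <= n_patches <= len(series):
        raise ValueError("n_patches must be between 1 and len(series)")
    base, extra = divmod(len(series), n_patches)
    patches, start = [], 0
    for i in range(n_patches):
        # The first `extra` patches receive one additional observation.
        end = start + base + (1 if i < extra else 0)
        patches.append(series[start:end])
        start = end
    return patches


series = list(range(10))          # ten observations in time order
patches = split_into_patches(series, 3)
print(patches)                    # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

Under this reading, each patch would be handled by one STCN block, in order, so the number of patches corresponds to the number of blocks in the network.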
Let us assume that
The input gate operates with the prior knowledge matrix
where
where
The LSTCN interface follows the conventions of the scikit-learn library.
First, create an LSTCN object, specifying the number of features and the number of steps to predict ahead:
model = LSTCN(n_features, n_steps)
Optionally, you can also specify the number of STCN blocks in the network (n_blocks), the activation function (function), the regression solver (solver) and the regularization penalty parameter (alpha). For more details, check the documentation of the LSTCN class.
To train an LSTCN model, simply call the fit method:
model.fit(X_train, Y_train)
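The training arrays are not described here, so the following is only a sketch of one common way to turn a raw series into input/output pairs for multiple-step-ahead forecasting. The helper `make_windows` is hypothetical, and both the window length and the nested-list layout are assumptions; check the documentation of the LSTCN class for the array shapes that fit actually expects.

```python
def make_windows(series, n_steps):
    """Build (X, Y) pairs with a sliding window.

    series: list of rows, each row holding the n_features values at one time step.
    X[i] holds n_steps consecutive rows; Y[i] holds the n_steps rows that follow.
    Hypothetical helper for illustration only; not part of the lstcn package.
    """
    X, Y = [], []
    for start in range(len(series) - 2 * n_steps + 1):
        X.append(series[start:start + n_steps])
        Y.append(series[start + n_steps:start + 2 * n_steps])
    return X, Y


# A univariate toy series with six time steps.
series = [[float(t)] for t in range(6)]
X, Y = make_windows(series, n_steps=2)
print(len(X))   # 3
print(X[0])     # [[0.0], [1.0]]
print(Y[0])     # [[2.0], [3.0]]
```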
Use walk-forward cross-validation and grid search (or any other suitable validation strategy from scikit-learn) to select the best-performing model:
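from sklearn.metrics import make_scorer
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

# Walk-forward splits: each fold trains on the past and validates on the data that follows
tscv = TimeSeriesSplit(n_splits=5)
# greater_is_better=False tells scikit-learn to treat the score as an error to minimize
scorer = make_scorer(model.score, greater_is_better=False)
param_search = {
    'alpha': [1.0E-3, 1.0E-2, 1.0E-1],
    'n_blocks': range(2, 6)
}
gsearch = GridSearchCV(estimator=model, cv=tscv, param_grid=param_search, refit=True,
                       n_jobs=-1, error_score='raise', scoring=scorer)
gsearch.fit(X_train, Y_train)
# refit=True retrains the best configuration on the full training set
best_model = gsearch.best_estimator_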
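To show what walk-forward validation does with the data, here is a dependency-free sketch of an expanding-window splitter in the spirit of scikit-learn's TimeSeriesSplit. The function `walk_forward_splits` is a hypothetical stand-in written for illustration, not the scikit-learn implementation; in practice TimeSeriesSplit itself should be used, as in the snippet above.

```python
def walk_forward_splits(n_samples, n_splits):
    """Yield (train_indices, test_indices) pairs with an expanding training window.

    Every test fold lies strictly after its training data, so the model is
    never validated on observations that precede what it was trained on.
    """
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("too few samples for the requested number of splits")
    first_test = n_samples - n_splits * test_size
    for test_start in range(first_test, n_samples, test_size):
        train = list(range(0, test_start))
        test = list(range(test_start, test_start + test_size))
        yield train, test


for train, test in walk_forward_splits(n_samples=12, n_splits=5):
    print(len(train), test)
# 2 [2, 3]
# 4 [4, 5]
# 6 [6, 7]
# 8 [8, 9]
# 10 [10, 11]
```

Unlike shuffled k-fold cross-validation, this scheme preserves temporal order, which is why it suits forecasting models.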
To predict new data, use the predict method:
Y_pred = best_model.predict(X_test)
The figure below shows the predictions on the test set for the target series oil temperature of the ETTh1 dataset [5], which contains 17420 records of electricity transformer temperatures in China:
In this example, we used 80% of the dataset for training and validation, and 20% for testing. The mean absolute error for the training set is 0.0355, while the test error is 0.0192. More importantly, the model's hyperparameter tuning (exploring 15 models) runs in 3.8599 seconds!
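The evaluation protocol described above (a chronological 80/20 split scored with the mean absolute error) can be sketched as follows. The helper names are hypothetical, and the toy numbers are placeholders that only demonstrate the arithmetic; they are not results from the ETTh1 experiment.

```python
def chronological_split(records, train_percent=80):
    """Split time-ordered records without shuffling: earliest part first."""
    cut = len(records) * train_percent // 100
    return records[:cut], records[cut:]


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# With 17420 records, an 80/20 split leaves 13936 for training and validation.
train, test = chronological_split(list(range(17420)))
print(len(train), len(test))                                  # 13936 3484

# Toy values, for illustration only.
print(mean_absolute_error([1.0, 2.0, 3.0], [1.5, 2.0, 2.0]))  # 0.5
```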
If you use the LSTCN model in your research, please cite the following papers:
- Nápoles, G., Grau, I., Jastrzębska, A., & Salgueiro, Y. (2022). Long short-term cognitive networks. Neural Computing and Applications, 1-13.
- Nápoles, G., Vanhoenshoven, F., & Vanhoof, K. (2019). Short-term cognitive networks, flexible reasoning and nonsynaptic learning. Neural Networks, 115, 72-81.
Some application papers with nice examples and further explanations:
- Morales-Hernández, A., Nápoles, G., Jastrzębska, A., Salgueiro, Y., & Vanhoof, K. (2022). Online learning of windmill time series using Long Short-term Cognitive Networks. Expert Systems with Applications, 117721.
- Grau, I., de Hoop, M., Glaser, A., Nápoles, G., & Dijkman, R. (2022). Semiconductor Demand Forecasting using Long Short-term Cognitive Networks. In Proceedings of the 34th Benelux Conference on Artificial Intelligence and 31st Belgian-Dutch Conference on Machine Learning, BNAIC/BeNeLearn 2022.
The following paper introduces the dataset used in the example: