Interpretable Time Series Autoregression

Quantifying periodicity and seasonality of time series with sparse autoregression. The optimization on sparse autoregression is used to identify dominant and positive auto-correlations of time series (e.g., human mobility and climate variables).

(Updated on August 9, 2025)


In this post, we intend to explain the essential ideas of our research work:

Content:

In Part I of this series, we introduce the essential idea of time series autoregression in statistics.


I. Univariate Autoregression

Time series autoregression is a statistical model used to analyze and forecast time series data. The class of autoregression models is widely used in the fields of economics, finance, weather forecasting, and signal processing. Exploring auto-correlations from univariate autoregression is meaningful for understanding time series.

I-A. Definition of Autoregression

The essential idea of time series autoregression is that a given data point of a time series is linearly dependent on the previous data points. Mathematically, the $d$th-order univariate autoregression of time series $\boldsymbol{x}=(x_1,x_2,\ldots,x_T)^\top\in\mathbb{R}^{T}$ can be written as follows,

$$x_t=\sum_{k=1}^{d}w_kx_{t-k}+\epsilon_t$$

for all $t\in\{d+1,d+2,\ldots,T\}$. The integer $d\in\mathbb{Z}^{+}$ is the order. Here, $x_t$ is the value of the time series at time $t$. The vector $\boldsymbol{w}=(w_1,w_2,\ldots,w_d)^\top\in\mathbb{R}^{d}$ represents the autoregressive coefficients. The random error $\epsilon_t$ is assumed to be normally distributed, following a mean of zero and a constant variance.

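There is a closed-form solution to the coefficient vector from the optimization problem such that

$$\min_{\boldsymbol{w}}\ \sum_{t=d+1}^{T}\Bigl(x_t-\sum_{k=1}^{d}w_kx_{t-k}\Bigr)^2$$

which is equivalent to

$$\boldsymbol{w}:=\arg\min_{\boldsymbol{w}}\ \|\tilde{\boldsymbol{x}}-\boldsymbol{A}\boldsymbol{w}\|_2^2=\boldsymbol{A}^{\dagger}\tilde{\boldsymbol{x}}$$

where $\|\cdot\|_2$ denotes the $\ell_2$-norm. The symbol $\cdot^{\dagger}$ is the Moore–Penrose inverse of a matrix. While using the $\ell_2$-norm, the vector $\tilde{\boldsymbol{x}}$ consists of the last $T-d$ entries in the time series vector $\boldsymbol{x}$, i.e.,

$$\tilde{\boldsymbol{x}}=(x_{d+1},x_{d+2},\ldots,x_T)^\top\in\mathbb{R}^{T-d}$$

The matrix $\boldsymbol{A}$ is also comprised of the entries in the time series vector $\boldsymbol{x}$, which is given by

$$\boldsymbol{A}=\begin{bmatrix} x_{d} & x_{d-1} & \cdots & x_{1} \\ x_{d+1} & x_{d} & \cdots & x_{2} \\ \vdots & \vdots & \ddots & \vdots \\ x_{T-1} & x_{T-2} & \cdots & x_{T-d} \end{bmatrix}\in\mathbb{R}^{(T-d)\times d}$$

In essence, given the data pair $\{\tilde{\boldsymbol{x}},\boldsymbol{A}\}$ constructed by the time series $\boldsymbol{x}$, the univariate autoregression can be easily converted into a linear regression formula. Thus, the closed-form solution is the least-squares solution.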

Consider one quick example:
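A minimal NumPy sketch of the least-squares construction described above, run on a synthetic second-order series. The function name `fit_ar`, the coefficients `true_w`, the noise level, and the random seed are all assumptions chosen for illustration; they are not taken from the original post.

```python
import numpy as np

def fit_ar(x, d):
    """Least-squares estimate of d-th order autoregressive coefficients.

    Hypothetical helper for illustration: it stacks the lagged series into
    a (T - d) x d matrix and regresses the last T - d entries on it.
    """
    x = np.asarray(x, dtype=float)
    T = x.shape[0]
    # Column k-1 holds the lag-k series, so each row is (x_{t-1}, ..., x_{t-d}).
    A = np.column_stack([x[d - k : T - k] for k in range(1, d + 1)])
    x_tilde = x[d:]  # the last T - d entries of the series
    w, *_ = np.linalg.lstsq(A, x_tilde, rcond=None)
    return w, A, x_tilde

# Synthetic stationary series with known coefficients (assumed values).
rng = np.random.default_rng(0)
true_w = np.array([0.5, -0.3])
T = 2000
x = np.zeros(T)
for t in range(2, T):
    x[t] = true_w[0] * x[t - 1] + true_w[1] * x[t - 2] + 0.1 * rng.standard_normal()

w_hat, A, x_tilde = fit_ar(x, d=2)
# The same estimate via the Moore-Penrose inverse.
w_pinv = np.linalg.pinv(A) @ x_tilde
print("estimated coefficients:", w_hat)
```

Both routes solve the same least-squares problem, so `w_hat` and `w_pinv` should agree up to floating-point error, and both should land near `true_w` for a series of this length.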


I-B. Motivation of Sparse Autoregression

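However, challenges arise if there is a sparsity constraint in the form of the $\ell_0$-norm, for instance,

$$\min_{\boldsymbol{w}}\ \|\tilde{\boldsymbol{x}}-\boldsymbol{A}\boldsymbol{w}\|_2^2\quad\text{s.t.}\quad\|\boldsymbol{w}\|_0\leq\tau$$

where the upper bound of the constraint is an integer $\tau\in\mathbb{Z}^{+}$, which is supposed to be no greater than the order $d$. In the constraint, $\|\boldsymbol{w}\|_0$ counts the number of nonzero entries in the vector $\boldsymbol{w}$, and $\tau$ is the sparsity level.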
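To see why the sparsity constraint makes the problem hard, the sketch below solves it exactly by enumerating every candidate support of at most a given size and fitting least squares on each. This brute-force search is only an illustration of the combinatorial nature of the constraint, not the approach developed later in this post. The function name `sparse_ar_bruteforce`, the variable `tau`, and the synthetic series with a single lag-7 dependence are assumptions for illustration.

```python
from itertools import combinations

import numpy as np

def sparse_ar_bruteforce(A, x_tilde, tau):
    """Exact least squares under at most `tau` nonzero coefficients.

    Enumerates all supports of size 1..tau, so the cost grows
    combinatorially with the order d. Illustration only.
    """
    d = A.shape[1]
    best_w = np.zeros(d)
    best_loss = float(x_tilde @ x_tilde)  # loss of the all-zero vector
    for size in range(1, tau + 1):
        for support in combinations(range(d), size):
            cols = list(support)
            coef, *_ = np.linalg.lstsq(A[:, cols], x_tilde, rcond=None)
            resid = x_tilde - A[:, cols] @ coef
            loss = float(resid @ resid)
            if loss < best_loss:
                best_loss = loss
                best_w = np.zeros(d)
                best_w[cols] = coef
    return best_w, best_loss

# Synthetic series that depends only on its value 7 steps back (assumed setup).
rng = np.random.default_rng(1)
T, d, tau = 700, 10, 2
x = rng.standard_normal(T)
for t in range(7, T):
    x[t] = 0.8 * x[t - 7] + rng.standard_normal()

# Lagged design matrix: column k-1 is the lag-k series.
A = np.column_stack([x[d - k : T - k] for k in range(1, d + 1)])
x_tilde = x[d:]

w_sparse, loss = sparse_ar_bruteforce(A, x_tilde, tau)
print("nonzero lags:", np.flatnonzero(w_sparse) + 1)
```

With $d=10$ and a sparsity level of 2 the search covers only 55 supports, but the count grows quickly with the order, which is what motivates more scalable formulations.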

II. Sparse Autoregression

II-A. Mixed-Integer Programming

II-B. Semidefinite Programming

III. Time-Varying Sparse Autoregression

The optimization problem is formulated as follows,



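Example 1. For any vectors $\boldsymbol{x},\boldsymbol{y}\in\mathbb{R}^{n}$, verify that $\boldsymbol{x}^\top\boldsymbol{y}=\operatorname{tr}(\boldsymbol{x}\boldsymbol{y}^\top)$.

According to the definition of inner product, we have $\boldsymbol{x}^\top\boldsymbol{y}=\sum_{i=1}^{n}x_iy_i$. In contrast, the outer product between $\boldsymbol{x}$ and $\boldsymbol{y}$ is given by

$$\boldsymbol{x}\boldsymbol{y}^\top=\begin{bmatrix} x_1y_1 & x_1y_2 & \cdots & x_1y_n \\ x_2y_1 & x_2y_2 & \cdots & x_2y_n \\ \vdots & \vdots & \ddots & \vdots \\ x_ny_1 & x_ny_2 & \cdots & x_ny_n \end{bmatrix}\in\mathbb{R}^{n\times n}$$

Since the trace of a square matrix is the sum of its diagonal entries, we have

$$\operatorname{tr}(\boldsymbol{x}\boldsymbol{y}^\top)=\sum_{i=1}^{n}x_iy_i=\boldsymbol{x}^\top\boldsymbol{y}$$

as claimed.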
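The identity in Example 1 can also be checked numerically. The short NumPy sketch below compares the inner product of two random vectors with the trace of their outer product; the vector length and random seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5
x = rng.standard_normal(n)
y = rng.standard_normal(n)

inner = x @ y                     # sum of x_i * y_i
outer = np.outer(x, y)            # n x n matrix with entries x_i * y_j
trace_of_outer = np.trace(outer)  # sum of diagonal entries x_i * y_i

print(np.isclose(inner, trace_of_outer))
```

The two quantities agree up to floating-point rounding because the diagonal of the outer product contains exactly the terms of the inner product.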



III-A. Ridesharing Data

III-B. Formulating Time-Varying Systems

III-C. Solving the Optimization Problem

IV. Periodicity of Hangzhou Metro Passenger Flow

IV-A. Data Description

IV-B. Periodicity Analysis

IV-C. Spatially-Varying Systems


(Posted by Xinyu Chen on February 15, 2025)