Problem Set (Without Solutions) - Undergraduate Level
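Use the formulas and code below as a reference when computing statistics by hand or checking your work. For a sample time series x₁, x₂, …, xₙ of length n, the sample mean is x̄ = (1/n) Σ xᵢ and the sample variance is s² = (1/(n−1)) Σ (xᵢ − x̄)², with both sums over i = 1, …, n.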
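For lag k, use the demeaned series (subtract the mean first). With n observations, there are n−k pairs (xᵢ − x̄, xᵢ₊ₖ − x̄), so the sample autocovariance is γ(k) = (1/(n−k)) Σ (xᵢ − x̄)(xᵢ₊ₖ − x̄), summed over i = 1, …, n−k.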
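Some texts use 1/n instead of 1/(n−k); be consistent. For this problem set, either 1/(n−k) or 1/n is acceptable as long as you use the same convention at every lag. Both conventions give the same γ(0), which divides by n, so do not substitute the 1/(n−1) sample variance for γ(0) when forming ρ(k).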
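The autocorrelation is ρ(k) = γ(k)/γ(0), so ρ(0) = 1 whenever γ(0) > 0. Autocorrelation is dimensionless. With the 1/n convention it always lies in [−1, 1]; with 1/(n−k), sample values for short series can fall outside that range at larger lags.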
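To see how the two normalizations differ in practice, here is a minimal sketch that computes γ(k) under both conventions for the same series. The function name `autocov_both` and the sample values are illustrative assumptions and are not taken from the exercises.

```python
import numpy as np

def autocov_both(x, k):
    """Return (gamma_k using 1/(n-k), gamma_k using 1/n) for 0 <= k < n."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()                    # demeaned series
    s = np.sum(d[:n - k] * d[k:])       # sum over the n-k available pairs
    return s / (n - k), s / n

x = [1.0, 2.0, 4.0, 7.0, 11.0]          # illustrative data only
for k in range(3):
    per_pair, per_n = autocov_both(x, k)
    print(f"lag {k}: 1/(n-k) -> {per_pair:.4f}, 1/n -> {per_n:.4f}")
```

At lag 0 the two values coincide. For k > 0 they differ by the factor n/(n−k), so the 1/(n−k) estimate is larger in magnitude, and the gap grows with the lag.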
Example with a small array (replace with your data). For ACF, mean removal is applied.
import numpy as np

def sample_mean(x):
    return np.mean(x)

def sample_var(x):
    return np.var(x, ddof=1)  # ddof=1 gives 1/(n-1)

def autocovariance(x, k):
    """Sample autocovariance at lag k (k >= 0), with mean removal."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    x_centered = x - np.mean(x)
    if k >= n:
        return 0.0
    # Slice with [:n - k] rather than [:-k]: at k = 0, [:-0] is an empty slice.
    return np.sum(x_centered[:n - k] * x_centered[k:]) / (n - k)

def autocorrelation(x, k):
    """Sample autocorrelation at lag k."""
    return autocovariance(x, k) / autocovariance(x, 0)

# Example: compute for a short series
x = np.array([2, 4, 6, 8, 10])
print("Mean:", sample_mean(x))
print("Variance:", sample_var(x))
for lag in [0, 1, 2]:
    print(f"ACF({lag}):", autocorrelation(x, lag))
Without mean removal (for comparison in Exercise Type 2): use xᵢ and xᵢ₊ₖ directly in the product sum instead of (xᵢ − x̄)(xᵢ₊ₖ − x̄). In code, replace x_centered with x in the autocovariance sum.
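The substitution described above might look like the following sketch. The name `autocorrelation_raw` is hypothetical, the 1/(n−k) normalization is assumed to match the reference code, and the data are illustrative rather than taken from the exercises. It also assumes the series is not identically zero, since the lag-0 term is used as the divisor.

```python
import numpy as np

def autocorrelation_raw(x, k):
    """Lag-k autocorrelation WITHOUT mean removal (raw products, 1/(n-k) scaling)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    if k >= n:
        return 0.0
    raw_k = np.sum(x[:n - k] * x[k:]) / (n - k)  # mean product of x_i and x_{i+k}
    raw_0 = np.sum(x * x) / n                    # lag-0 term: mean of x_i squared
    return raw_k / raw_0

x = [1.0, 2.0, 4.0, 7.0, 11.0]                   # illustrative data only
for lag in [0, 1, 2]:
    print(f"raw ACF({lag}):", autocorrelation_raw(x, lag))
```

Running this next to the mean-removed version on the same series makes it easy to see how much of the apparent correlation comes from the level of the series rather than from its fluctuations.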
Given the time series: x = [2, 4, 6, 8, 10]
Compute:
Given the time series: x = [5, 5, 5, 5, 5]
Compute the autocovariance and autocorrelation. What happens when all values are identical?
Given the time series: x = [10, 8, 6, 4, 2]
Compute the autocovariance γ(k) and autocorrelation ρ(k) for lags k = 0, 1, 2.
Given the time series: x = [3, 7, 3, 7, 3]
Compute the autocovariance and autocorrelation for lags k = 0, 1, 2, 3.
Given the time series: x = [12, 15, 18, 12, 15, 18, 12]
Compute the autocovariance and autocorrelation for lags k = 0, 1, 2, 3.
Given the time series: x = [5, 7, 9, 11, 13]
Given the time series: x = [100, 102, 104, 106, 108]
Given the time series: x = [20, 20, 20, 25, 25, 25]
Compare the autocorrelation at lag 1 computed with and without mean removal. What does this tell you about the series?
Given the time series: x = [1, 3, 5, 1, 3, 5]
Compute autocorrelation at lag 2 with and without mean removal. Explain why the results differ.
Given the time series: x = [10, 12, 14, 10, 12, 14, 10]
A time series has an ACF plot where ρ(0) = 1 and ρ(k) ≈ 0 for all k > 0, with values randomly scattered around zero within the confidence bands.
A time series has an ACF plot showing ρ(k) that decays very slowly, remaining positive and significant even at large lags (e.g., ρ(20) > 0.5).
A time series has an ACF plot showing oscillatory (sinusoidal) behavior, with autocorrelations alternating between positive and negative values in a periodic pattern.
A time series has an ACF plot where ρ(k) shows a sharp cutoff after lag q = 2, with ρ(1) and ρ(2) being significant, but ρ(k) ≈ 0 for all k > 2.
A time series has an ACF plot showing exponential decay: ρ(k) starts high and decays gradually, remaining positive but decreasing, with no sharp cutoff.
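When comparing ACF descriptions like those above against your own data, it can help to compute the sample ACF over a range of lags together with an approximate 95% band. The sketch below is one way to do that. It assumes the "confidence bands" are the usual large-sample bounds ±1.96/√n for an uncorrelated series, uses the 1/n convention so that every value stays in [−1, 1], and requires a non-constant series with max_lag < n. The helper name `acf_with_band` and the input series are made up for illustration.

```python
import numpy as np

def acf_with_band(x, max_lag):
    """Return (acf[0..max_lag], band) using the 1/n convention.

    band is the approximate 95% bound 1.96 / sqrt(n) for an uncorrelated series.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    gamma0 = np.sum(d * d) / n
    acf = np.array([np.sum(d[:n - k] * d[k:]) / n / gamma0
                    for k in range(max_lag + 1)])
    band = 1.96 / np.sqrt(n)
    return acf, band

x = (np.arange(50) * 7) % 11            # arbitrary deterministic series, illustrative only
acf, band = acf_with_band(x, 10)
outside = np.flatnonzero(np.abs(acf[1:]) > band) + 1
print("band: +/-", band)
print("lags outside the band:", outside)
```

Lags listed as outside the band are the ones a plot would show as significant; replace `x` with your own series before drawing conclusions.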
Consider a heart rate time series where the ACF shows:
Consider a blood pressure time series where the ACF shows:
Consider an EEG-like signal where the ACF shows:
Consider a respiratory rate time series where the ACF shows:
Consider a body temperature time series where the ACF shows:
This problem set covers:
This problem set provides exercises for practice. Solutions are available separately.