EEG & EMG Practice: Stationarity, Unit Root, Differencing, AR/MA/ARMA

Practice Exercises (Kaggle Notebook) — No Solutions

Kaggle Notebook Environment

Complete these exercises in a Kaggle notebook. Use the EEG/EMG dataset and path conventions below.

Dataset

Kaggle: EEG & EMG TS Practice
File: eeg_emg_ts.csv
Description: Two physiological time series over the same time index (1-minute epochs, 800 epochs).
- eeg_alpha_power: Simulated EEG alpha-band power (8–12 Hz band, microvolts² scale). Includes trend, weak periodic modulation (alpha rhythm), and AR-like dynamics.
- emg_rms: Simulated EMG RMS amplitude (arbitrary units). Includes trend and MA-like dynamics (burst-like behavior).
Length: 800 rows (~13.3 hours of 1-min epochs).
Columns: date, eeg_alpha_power, emg_rms.

Adding the dataset on Kaggle

Open the dataset page (link below), or open your notebook and click Add Data.
Search for eeg-emg-ts-practice by trongnghia7171, or use the link: https://www.kaggle.com/datasets/trongnghia7171/eeg-emg-ts-practice.
Attach the dataset to your notebook. The file path will be /kaggle/input/eeg-emg-ts-practice/eeg_emg_ts.csv.

Loading the data in the notebook

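import pandas as pd
# Path of the attached dataset inside the Kaggle notebook
path = "/kaggle/input/eeg-emg-ts-practice/eeg_emg_ts.csv"
df = pd.read_csv(path)
# Parse the timestamps and use them as a sorted datetime index
df["date"] = pd.to_datetime(df["date"])
df = df.set_index("date").sort_index()
# One pandas Series per signal
eeg = df["eeg_alpha_power"]
emg = df["emg_rms"]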

Required packages

Kaggle notebooks include pandas, numpy, matplotlib, and statsmodels. If statsmodels is missing, uncomment and run the following line in a cell:

# !pip install statsmodels

Objective: Use EEG and EMG time series to practice the full pipeline (unit root tests, differencing, ACF/PACF model selection, AR/MA/ARMA fit, evaluation) with EEG- and EMG-specific interpretations (alpha rhythm, burst dynamics, physiological non-stationarity). Solutions are available separately.

Exercise 1: Load and explore

Task 1.1

Load the CSV from the Kaggle input path, parse date, and set it as the datetime index. Extract the EEG series (eeg_alpha_power) and the EMG series (emg_rms).

Task 1.2

Plot both series (EEG and EMG) and compute basic statistics (mean, std, min, max) for each.
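
The statistics in Task 1.2 can be computed for both columns in one call. The following is a minimal sketch on a synthetic stand-in frame (the name demo and its values are made up for illustration, not taken from the dataset); in the notebook, the same call would apply to df.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data: two trending series on a 1-minute index.
rng = np.random.default_rng(0)
idx = pd.date_range("2024-01-01", periods=800, freq="min")
demo = pd.DataFrame(
    {
        "eeg_alpha_power": 10 + 0.01 * np.arange(800) + rng.normal(0, 1, 800),
        "emg_rms": 5 + 0.005 * np.arange(800) + rng.normal(0, 0.5, 800),
    },
    index=idx,
)

# One row per statistic, one column per series.
summary = demo.agg(["mean", "std", "min", "max"])
print(summary.round(3))
```

For the plots, demo.plot(subplots=True) draws one panel per column when matplotlib is available.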

Task 1.3

Comment on trend, variability, and physiological interpretation: e.g. alpha power changes over a recording session (drowsiness, arousal); EMG baseline drift and burst-like structure.

Exercise 2: Unit root tests

Task 2.1

Run the Augmented Dickey–Fuller (ADF) test on the raw EEG series and on the raw EMG series. Report test statistic, p-value, and critical values for each.

Task 2.2

Conclude for each series whether it is stationary or has a unit root. State the null and alternative hypotheses and your decision.

Task 2.3 (optional)

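Run the KPSS test for each series and compare its conclusion with the ADF result. The KPSS null hypothesis is stationarity, the opposite of the ADF null, so the two tests together help distinguish trend-stationary from difference-stationary behavior.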

Exercise 3: Differencing

Task 3.1

Compute the first difference of the EEG series and of the EMG series. Plot both differenced series.
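
First differencing is a single pandas call. The sketch below uses a synthetic random walk (the names level and diff1 are illustrative) and shows that the first value of the differenced series is undefined and should be dropped before testing or plotting.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
idx = pd.date_range("2024-01-01", periods=800, freq="min")
steps = rng.normal(size=800)
level = pd.Series(np.cumsum(steps), index=idx, name="level")

# First difference: y_t - y_{t-1}. The first value has no predecessor, so drop the NaN.
diff1 = level.diff().dropna()

print(len(level), len(diff1))
print(diff1.describe().round(3))
```

Differencing a random walk recovers its increments, so diff1 here equals the simulated steps from the second observation onward.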

Task 3.2

Run the ADF test (and optionally KPSS) on each differenced series. Conclude whether the differenced series are stationary and hence whether d = 1 is appropriate for both.

Exercise 4: ACF and PACF for model selection

Task 4.1 — EEG

Plot the ACF and PACF of the differenced EEG series (use a reasonable number of lags, e.g. 40–50). Using the usual guidelines (PACF cutoff for AR order, ACF cutoff for MA order), suggest candidate orders (p, q) for the differenced series, hence (p, d, q) with d = 1 for the raw EEG (e.g. ARIMA(1,1,0), ARIMA(2,1,0)). Note any periodic peaks (e.g. alpha-rhythm modulation at a given lag).

Task 4.2 — EMG

Plot the ACF and PACF of the differenced EMG series. Suggest ARIMA(p, 1, q) for the raw EMG. Comment on EMG-specific behavior (e.g. short memory, MA-like cutoff typical of burst dynamics).

Exercise 5: AR/MA/ARMA fit

Task 5.1 — EEG

Fit 2–3 candidate ARIMA(p, 1, q) models for the EEG series (e.g. (1,1,0), (2,1,0), (1,1,1)). Report estimated coefficients, AIC, and BIC. Choose a preferred model (e.g. by BIC).

Task 5.2 — EMG

Fit 2–3 candidate ARIMA(p, 1, q) models for the EMG series (e.g. (0,1,1), (1,1,1), (0,1,2)). Report coefficients, AIC, BIC. Choose a preferred model.

Task 5.3

Compare the EEG and EMG preferred orders (e.g. AR vs MA dominance) and interpret in terms of physiological dynamics.

Exercise 6: Evaluation

Task 6.1 — Residual diagnostics

For both preferred models (EEG and EMG), plot the ACF of the residuals and run a Ljung–Box test (e.g. on the first 20 lags). Comment on whether the residuals behave like white noise for each.

Task 6.2 — Forecast evaluation

For both series, use a temporal train/test split (e.g. hold out the last 10–15% as the test set) and fit the preferred model on the training portion only. Generate one-step or short-horizon forecasts for the test period. Compute RMSE and MAE for EEG and for EMG. Plot forecasts vs actual values for each. Briefly compare EEG vs EMG forecastability.

Summary

This practice set covers:

- Load and explore: EEG alpha power and EMG RMS in Kaggle; physiological interpretation.
- Unit root tests: ADF (and optionally KPSS) on raw EEG and EMG; conclude non-stationarity.
- Differencing: First difference for both; confirm stationarity with ADF.
- ACF/PACF: Model selection for EEG (AR-dominated, optional periodicity) and EMG (MA-dominated, burst-like).
- AR/MA/ARMA fit: Fit ARIMA for EEG and EMG; compare orders and interpret.
- Evaluation: Residual diagnostics and forecast evaluation (RMSE, MAE) for both; compare forecastability.

Complete the exercises in a Kaggle notebook. Solutions are available separately.