Time series and classifier ensembles

This assignment is divided into two large blocks. The first one studies how to decompose a time series and build forecasts from its components. The second one compares different ways of combining classifiers in supervised learning.

Quick guide

The common idea behind both blocks is model combination. In the time series block, the final forecast is assembled from a variance-stabilizing transformation, an estimated trend and a seasonal model. In the classification block, the final decision is produced by several learners that reduce variance, bias or both, depending on the ensemble strategy.

1. Time series with AirPassengers

The notebook begins with the AirPassengers dataset, a classic monthly series of airline passengers. The first plots already show the three ingredients that drive the rest of the exercise: heteroscedasticity, trend and seasonality.

The analysis is developed in stages. First, the variance is stabilized to remove heteroscedasticity. Then the long-term trend is estimated and separated, and finally the seasonal behavior is studied in order to identify the period and fit a seasonal forecasting model.
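
As a minimal sketch of this first stage, assuming a Python stack with pandas and statsmodels (the original notebook is in Spanish and its exact tooling may differ), the series can be loaded and log-transformed as follows; the log is the usual variance-stabilizing choice for AirPassengers:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Load the classic monthly AirPassengers series (1949-1960) from R's datasets.
data = sm.datasets.get_rdataset("AirPassengers").data
y = pd.Series(
    data["value"].values,
    index=pd.date_range("1949-01", periods=len(data), freq="MS"),
    name="passengers",
)

# The seasonal swings grow with the level of the series (heteroscedasticity);
# a log transform makes the seasonal amplitude roughly constant over time.
y_log = np.log(y)

y.plot(title="AirPassengers, raw scale")
y_log.plot(title="AirPassengers, log scale")
```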

1.1 Components of the series

  • Heteroscedasticity: transform the series so that seasonal amplitude remains more stable over time.
  • Trend: isolate the long-term linear growth component.
  • Seasonality: detect the periodic behavior and model it explicitly (see the sketch after this list).
  • SARIMA: use a seasonal ARIMA model once the period has been identified.
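
Continuing the sketch above (still an assumed implementation, not the notebook's exact code), the trend and seasonal components can be separated on the log scale and the 12-month period confirmed from the autocorrelation function:

```python
import statsmodels.api as sm
from statsmodels.graphics.tsaplots import plot_acf

# Additive decomposition is appropriate on the log scale, where the
# multiplicative seasonality of the raw series becomes additive.
decomposition = sm.tsa.seasonal_decompose(y_log, model="additive", period=12)
decomposition.plot()

# The autocorrelation of the detrended series shows clear spikes at
# lags 12, 24, ..., confirming a yearly (period-12) seasonal pattern.
detrended = (y_log - decomposition.trend).dropna()
plot_acf(detrended, lags=36)
```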

The notebook therefore treats forecasting as a structured decomposition problem rather than a single black-box prediction step.

1.2 Forecasting the next two years

After fitting the model, the assignment predicts the next two years and compares the forecast with the real continuation of the series. The exercise then rebuilds the final prediction step by step: first the SARIMA output, then the trend component, and finally the inverse transformation needed to restore the original heteroscedastic behavior.
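
A hedged sketch of this stage, reusing y_log from the earlier sketches: the SARIMA order below is a common illustrative choice for this series, not necessarily the one selected in the notebook, and here the differencing inside the model absorbs the trend that the notebook rebuilds explicitly before inverting the log.

```python
import numpy as np
import statsmodels.api as sm

train = y_log[:-24]  # hold out the last two years as the "real" continuation

model = sm.tsa.SARIMAX(
    train,
    order=(1, 1, 1),               # non-seasonal ARIMA part
    seasonal_order=(1, 1, 1, 12),  # seasonal part with period 12
)
fit = model.fit(disp=False)

# Forecast 24 months ahead on the log scale...
forecast_log = fit.get_forecast(steps=24).predicted_mean

# ...then invert the transformation to return to the original scale,
# which restores the growing seasonal amplitude of the raw series.
forecast = np.exp(forecast_log)
print(forecast.head())
```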

This makes the forecasting pipeline transparent: each transformation is justified and later reversed so that the prediction can be interpreted in the original scale of the problem.

2. Combination of classifiers

The second half of the assignment changes domain and works with a supervised dataset about whether a car is likely to skid in a curve. From there, the notebook compares several ensemble strategies instead of relying on a single model.

2.1 Parallel combinations

The notebook starts from a decision tree baseline and then studies ensembles built from many instances of the same base learner, such as bagging and boosting. (Strictly speaking, bagging trains its learners in parallel on bootstrap samples, while boosting trains them sequentially, re-weighting the examples earlier learners got wrong.) The point is to measure how performance changes when multiple weak or moderate learners are aggregated.
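
A hedged comparison sketch in scikit-learn, using a synthetic dataset in place of the notebook's skidding data, which is not reproduced here:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Stand-in data; the notebook uses its own car-skidding dataset.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

models = {
    "tree (baseline)": DecisionTreeClassifier(random_state=0),
    # Bagging: independent trees on bootstrap samples, aggregated by vote.
    "bagging": BaggingClassifier(
        DecisionTreeClassifier(random_state=0), n_estimators=100, random_state=0
    ),
    # Boosting: shallow trees fitted one after another, each focusing on
    # the examples the previous ones misclassified.
    "boosting": AdaBoostClassifier(n_estimators=100, random_state=0),
}

# Same problem, same evaluation protocol; only the combination strategy varies.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```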

This section is useful as a controlled comparison: the same classification problem is kept fixed while the combination strategy becomes more sophisticated.

2.2 Sequential combinations

The final block studies sequential combinations of different base classifiers. The notebook introduces KNN and SVM as component models and then builds higher-level strategies such as stacking and cascading.

In other words, the assignment does not limit itself to ensemble voting. It also explores workflows where the output of one classifier becomes part of the input or decision process of another.
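
As an illustration of the stacking part (cascading has no off-the-shelf scikit-learn estimator, so it is omitted here), a minimal sketch with KNN and SVM as level-0 learners and logistic regression as the meta-model, again on synthetic stand-in data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Level-0 learners: their cross-validated predictions become the input
# features of the level-1 meta-model, which learns how to combine them.
stack = StackingClassifier(
    estimators=[
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier())),
        ("svm", make_pipeline(StandardScaler(), SVC(probability=True))),
    ],
    final_estimator=LogisticRegression(),
    cv=5,
)

print(cross_val_score(stack, X, y, cv=5).mean())
```

Scaling is included inside each base pipeline because both KNN and SVM are distance-based and sensitive to feature magnitudes; the meta-model itself only sees the base learners' predicted probabilities.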

Overall objective

Together, both parts of the notebook show two complementary ideas: forecasting improves when the temporal structure of the data is modeled explicitly, and classification often improves when several learners are combined with the right architecture. The full notebook with code, plots and outputs remains below in Spanish.