Time Series Fundamentals: Basic Terms with Simple Explanation
Introduction
Imagine you're looking at how your favorite coffee shop's sales change over time. This is a time series! Let's break down the key concepts in simple terms.
Core Components: The Building Blocks
1. Trend 📈
The general direction your data moves over time.
Like:
- A child's height growing over years (upward trend)
- Ice cream sales over a decade (might show an upward trend)
- Number of landline phones over recent years (downward trend)
Real-world example: Monthly sales at a growing business
Jan: $1,000 → Feb: $1,100 → Mar: $1,250 → Apr: $1,400
2. Seasonality 🌞❄️
Regular patterns that repeat over fixed periods.
Like:
- Ice cream sales peak in summer, drop in winter
- Coffee shop sales higher on weekdays, lower on weekends
- Retail sales spike during holidays
Real-world example: Monthly ice cream sales
Summer: 1,000 units → Fall: 500 units → Winter: 200 units → Spring: 600 units (the pattern repeats next year)
3. Cyclical Patterns 🔄
Up and down movements that aren't fixed to a specific time period.
Like:
- Housing market cycles (boom and bust)
- Economic cycles (growth and recession)
- Fashion trends (styles coming and going)
Real-world example: Business cycle
Growth phase: 2-3 years → Peak: ~6 months → Decline phase: 1-2 years → Bottom: ~1 year
4. Random Variations ⚡
Unexpected changes that don't follow any pattern.
Like:
- A surprise rainy day affecting restaurant sales
- A celebrity tweet boosting product sales
- Power outage affecting store sales
Key Analysis Techniques: Understanding Your Data
1. Decomposition Methods 🧩
Breaking down your time series into its basic parts (like separating ingredients in a recipe): the trend, the seasonal pattern, and the random variation (also called the error or residual).
Example:
Total Sales = Trend + Seasonal Pattern + Random Changes
1000 = 800 (trend) + 150 (seasonal) + 50 (random)
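Here's a minimal sketch of additive decomposition with statsmodels' `seasonal_decompose`; the monthly sales series is made-up illustrative data:

```python
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# 36 months of synthetic sales: an upward trend plus a summer bump
idx = pd.date_range("2021-01-01", periods=36, freq="MS")
sales = pd.Series(
    [800 + 10 * i + (150 if i % 12 in (5, 6, 7) else 0) for i in range(36)],
    index=idx,
)

result = seasonal_decompose(sales, model="additive", period=12)
print(result.trend.dropna().head())   # the long-run direction
print(result.seasonal.head(12))       # the repeating yearly pattern
print(result.resid.dropna().head())   # what's left over: the random changes
```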
2. Stationarity Testing 📊
Checking if your data's basic properties stay stable over time.
Like:
- College seats offered each year (roughly constant, so stationary)
- Chocolate sales that grow year after year (non-stationary)
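One common check is the Augmented Dickey-Fuller test. A minimal sketch with statsmodels, on made-up data:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)
stable = rng.normal(loc=100, scale=5, size=200)             # steady mean and spread
trending = np.cumsum(rng.normal(loc=1, scale=5, size=200))  # random walk with drift

for name, series in [("stable", stable), ("trending", trending)]:
    stat, pvalue = adfuller(series)[:2]
    # A small p-value (< 0.05) suggests the series is stationary
    print(f"{name}: ADF statistic = {stat:.2f}, p-value = {pvalue:.3f}")
```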
3. Autocorrelation Analysis 🔍
How today's values relate to past values.
Example:
- If it rains today, it's more likely to rain tomorrow
- If sales are high today, they might be high tomorrow
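A minimal sketch using pandas' built-in `autocorr`, on a made-up "persistent" series where each value is mostly yesterday's value plus noise:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
values = [50.0]
for _ in range(199):
    values.append(0.9 * values[-1] + rng.normal(scale=2))  # today ≈ 0.9 × yesterday
sales = pd.Series(values)

for lag in (1, 2, 7):
    # Correlation between the series and itself shifted back `lag` steps
    print(f"lag {lag}: autocorrelation = {sales.autocorr(lag=lag):.2f}")
```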
4. Cross-correlation Techniques 🤝
How two different time series relate to each other.
Example:
- How ice cream sales relate to temperature
- How coffee sales relate to rainfall
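A minimal sketch with pandas, using made-up temperature and ice cream data where sales follow temperature with a one-day delay:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
temperature = pd.Series(
    20 + 10 * np.sin(np.linspace(0, 6 * np.pi, 120)) + rng.normal(scale=1, size=120)
)
# Sales react to yesterday's temperature, plus some noise
ice_cream = 30 * temperature.shift(1) + rng.normal(scale=20, size=120)

for lag in (0, 1, 2):
    # Correlate today's sales with temperature `lag` days earlier
    print(f"lag {lag}: corr = {ice_cream.corr(temperature.shift(lag)):.2f}")
```

The correlation should peak at lag 1, matching the one-day delay built into the data.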
Preprocessing Strategies: Cleaning Your Data
1. Handling Missing Values 🕳️
Methods:
- Use the last known value
- Take an average of nearby values
- Make an educated guess based on patterns
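A minimal sketch of these three strategies with pandas, on a tiny made-up series:

```python
import numpy as np
import pandas as pd

sales = pd.Series([100, 110, np.nan, np.nan, 130, 125])

print(sales.ffill())               # use the last known value
print(sales.fillna(sales.mean()))  # fill with the average of the known values
print(sales.interpolate())         # educated guess: draw a line between neighbors
```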
2. Smoothing Techniques 🔄
Making your data less jumpy by removing small fluctuations.
Like:
- Taking average of nearby values
- Giving more importance to recent values
Before smoothing:
100, 120, 90, 110, 95
After smoothing (3-point centered moving average; the edges average the two available values):
110.0, 103.3, 106.7, 98.3, 102.5
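A minimal sketch of both ideas with pandas; the rolling average below reproduces the numbers above:

```python
import pandas as pd

sales = pd.Series([100, 120, 90, 110, 95])

# Average of nearby values (3-point centered window; edges use what's available)
print(sales.rolling(window=3, center=True, min_periods=1).mean())

# Exponentially weighted average: more importance to recent values
print(sales.ewm(alpha=0.5).mean())
```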
3. Normalization 📏
Putting all your data on the same scale, usually 0 to 1, so that different measurements become comparable.
4. Scaling 📊
Scaling is like adjusting the zoom level on a map: the same data in more manageable, comparable units. Common techniques:
a. Standard Scaling (Standardization) 📊
Transforms data to have:
- Mean = 0
- Standard Deviation = 1
Use it when:
- Your data roughly follows a normal distribution
- The algorithm is sensitive to magnitude (like neural networks)
- Outliers aren't extreme
b. Min-Max Scaling 🎯
Transforms data to a fixed range [0,1]
Use it when:
- You need values between 0 and 1
- You're doing image processing
- The minimum and maximum values are meaningful
c. Robust Scaling 💪
Uses statistics that are robust to outliers:
- Median (instead of mean)
- Interquartile Range (instead of standard deviation)
d. Max Absolute Scaling 📈
Scales data by dividing by the maximum absolute value, so results fall in [-1, 1]
e. Decimal Scaling 🔢
Moves the decimal point based on the maximum absolute value (divide by a power of 10 until every value falls below 1)
f. Log Scaling 📉
Takes the logarithm of the values (positive data only), compressing large ranges
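A minimal sketch of all six techniques, using scikit-learn where a built-in exists (decimal scaling is done by hand), on made-up data with one outlier:

```python
import numpy as np
from sklearn.preprocessing import (
    MaxAbsScaler, MinMaxScaler, RobustScaler, StandardScaler,
)

X = np.array([[1.0], [5.0], [10.0], [50.0], [1000.0]])  # note the outlier

print(StandardScaler().fit_transform(X).ravel())  # mean 0, standard deviation 1
print(MinMaxScaler().fit_transform(X).ravel())    # squeezed into [0, 1]
print(RobustScaler().fit_transform(X).ravel())    # median/IQR, resists the outlier
print(MaxAbsScaler().fit_transform(X).ravel())    # divide by max |value| -> [-1, 1]

digits = len(str(int(np.abs(X).max())))           # decimal scaling by hand:
print((X / 10**digits).ravel())                   # shift the decimal point
print(np.log(X).ravel())                          # log scaling (positive data only)
```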
Quick Decision Guide 🤔
Choose Standard Scaling When:
- Using linear models
- Data is roughly normal
- Mean/variance important
Choose Min-Max Scaling When:
- Need bounded values
- Data is not normal
- Working with neural networks
Choose Robust Scaling When:
- Have outliers
- Data is skewed
- Working with financial data
Choose Max Absolute Scaling When:
- Have sparse data
- Need to preserve zero entries
- Working with text data
Choose Decimal Scaling When:
- Need simple transformation
- Want readable numbers
- Doing basic reporting
Choose Log Scaling When:
- Have exponential growth
- Data spans many magnitudes
- Working with prices/populations
Common Pitfalls ⚠️
- Data Leakage
  - Fit the scaler on training data only (see the sketch after this list)
  - Apply the same fitted scaling to the test data
- Zero Variance
  - Check for constant features
  - Handle them separately if needed
- Negative Values
  - Some scalers (like log scaling) can't handle negative values
  - Check your data's range before choosing
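A minimal sketch of leakage-free scaling with scikit-learn, on made-up data (`shuffle=False` keeps the time order, which matters for time series):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X = np.arange(100, dtype=float).reshape(-1, 1)
X_train, X_test = train_test_split(X, test_size=0.2, shuffle=False)

scaler = StandardScaler().fit(X_train)    # learn mean/std from training data only
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)  # reuse the training parameters on test data
```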
Best Practices 👍
- Always scale after splitting data
- Save scaler parameters for later use
- Scale target variable separately
- Document scaling method used
- Check scaled data distribution
5. Differencing ➖
Like measuring the steps between values instead of the values themselves.
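A minimal sketch with pandas, reusing the monthly sales from the trend example:

```python
import pandas as pd

sales = pd.Series([1000, 1100, 1250, 1400])
print(sales.diff())  # NaN, 100, 150, 150: the "steps" between values
```

Differencing removes the level of the series, which often turns a trending (non-stationary) series into a stationary one.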