Literally Everything You Can Do With Time Series!
Since my first week on this platform, I have been fascinated by the topic of time series analysis.
This article collects a broad range of topics in time series analysis. My aim is to make it a comprehensive reference for beginners and experienced practitioners alike.
# Importing libraries
import os
import warnings
warnings.filterwarnings('ignore')
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
# Above is a special style template for matplotlib, highly useful for visualizing time series data
%matplotlib inline
from pylab import rcParams
from plotly import tools
import plotly.plotly as py
from plotly.offline import init_notebook_mode, iplot
init_notebook_mode(connected=True)
import plotly.graph_objs as go
import plotly.figure_factory as ff
import statsmodels.api as sm
from numpy.random import normal, seed
from scipy.stats import norm
from statsmodels.tsa.arima_model import ARMA
from statsmodels.tsa.stattools import adfuller
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.arima_model import ARIMA
import math
from sklearn.metrics import mean_squared_error
from datetime import datetime, timedelta
rcParams['figure.figsize'] = 11, 9
print(os.listdir("../input"))
This code imports the libraries used throughout the analysis:
os: access to operating-system functionality, such as listing directories and working with file paths
warnings: controls warning messages; filterwarnings('ignore') suppresses them for the rest of the session
numpy: numerical computation and array manipulation
pandas: data analysis and manipulation
matplotlib: data visualization
pylab: exposes matplotlib and numpy in a single namespace; here it is only used to import rcParams
plotly: interactive visualizations
statsmodels: modeling and analysis of time series data
math: standard-library mathematical functions for scalar values
The code also sets the plot style and the default figure size (11 by 9 inches). The last step prints the list of files in the directory “../input”.
%%bash
# https://pypi.org/project/yahoofinancials/
pip install yahoofinancials
This cell uses the %%bash cell magic, available in Jupyter Notebook and Google Colaboratory, to run its contents as a shell script rather than as Python. The pip command installs the “yahoofinancials” package from the Python Package Index (PyPI).
yahoofinancials is a Python wrapper that provides a simple interface for retrieving stock prices and other financial data from Yahoo Finance.
Once installed in the current Python environment, the package is available for use in subsequent code cells.
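To confirm that the install landed in the environment the notebook kernel is actually using, one option is to look the package up with the standard library. This is a minimal sketch; the helper name is_installed is made up for illustration and is not part of yahoofinancials.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if `package_name` can be imported in the current environment."""
    # find_spec returns None when no top-level module with this name is found
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    print("yahoofinancials installed:", is_installed("yahoofinancials"))
```

If this prints False right after a successful pip run, pip most likely installed into a different Python environment than the one the kernel runs in.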
from yahoofinancials import YahooFinancials
from joblib import Memory
TMPDIR = '/tmp'
memory = Memory(TMPDIR, verbose=0)
The code imports the YahooFinancials class from the yahoofinancials package and the Memory class from the joblib package.
A YahooFinancials instance is created for a particular stock or set of stocks and is then used to fetch financial data for it.
The Memory class from joblib caches the results of function calls on disk. Here it is used to cache the results of YahooFinancials requests, which can be slow and resource-intensive. Memory is initialized with a temporary directory path (TMPDIR) where the cached results are stored; verbose=0 silences its log output.
With caching, repeated calls with the same arguments are served from the cache instead of triggering a new request.
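The caching behaviour can be seen in isolation with a small sketch. The function slow_lookup, its payload and the call_log list are made up for illustration; they stand in for a slow remote request and are not part of yahoofinancials.

```python
import tempfile
import time

from joblib import Memory

# A throwaway cache directory; the article uses '/tmp' instead
cache_dir = tempfile.mkdtemp()
memory = Memory(cache_dir, verbose=0)

call_log = []  # records every time the function body actually runs


@memory.cache
def slow_lookup(symbol: str) -> dict:
    """Hypothetical stand-in for a slow remote request."""
    call_log.append(symbol)
    time.sleep(0.1)  # simulate latency
    return {"symbol": symbol, "price": 100.0}  # made-up payload


first = slow_lookup("AAPL")   # runs the body and stores the result on disk
second = slow_lookup("AAPL")  # same argument: served from the cache
third = slow_lookup("MSFT")   # new argument: runs the body again

print(call_log)
```

The function body runs once per distinct argument, so the second "AAPL" call never reaches the simulated request. Because the cache lives on disk, it also survives a kernel restart as long as the directory is kept.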
@memory.cache
def get_ticker_data(ticker: str, param_start_date, param_end_date) -> dict:
    # Fetch daily historical prices; joblib caches the result on disk
    raw_data = YahooFinancials(ticker)
    return raw_data.get_historical_price_data(param_start_date, param_end_date, "daily").copy()


def fetch_ticker_data(ticker: str, start_date, end_date) -> pd.DataFrame:
    # Business days in the requested range (built here but not used in the returned frame)
    date_range = pd.bdate_range(start=start_date, end=end_date)
    values = pd.DataFrame({'Date': date_range})
    values['Date'] = pd.to_datetime(values['Date'])
    raw_data = get_ticker_data(ticker, start_date, end_date)
    # Keep only the date, price and volume columns from the raw price records
    return pd.DataFrame(raw_data[ticker]["prices"])[['date', 'open', 'high', 'low', 'adjclose', 'volume']]
Using the Yahoo Finance API, this code retrieves historical prices for a stock ticker symbol and converts them into a Pandas DataFrame using two functions, get_ticker_data() and fetch_ticker_data().
The get_ticker_data() function takes three arguments:
ticker: a string representing the ticker symbol of the stock to fetch
param_start_date: a string representing the start date of the data in “YYYY-MM-DD” format
param_end_date: a string representing the end date of the data in “YYYY-MM-DD” format
First, the function creates an instance of the YahooFinancials class with the given ticker symbol. It then calls the get_historical_price_data() method on this object to get daily historical price data for the requested range, and returns the result as a Python dictionary.
The fetch_ticker_data() function also takes three arguments:
ticker: a string representing the stock ticker symbol
start_date: a string representing the start date of the data in “YYYY-MM-DD” format
end_date: a string representing the end date of the data in “YYYY-MM-DD” format
Using pd.bdate_range(), the function generates the business days between start_date and end_date and stores them in the Date column of a Pandas DataFrame. It then calls get_ticker_data() to retrieve the historical price data for the ticker symbol and date range. Finally, it builds a new DataFrame from the entries under raw_data[ticker]["prices"], keeping the date, open, high, low, adjclose and volume columns. As written, the business-day DataFrame is not used in the returned result.
Because get_ticker_data() is decorated with @memory.cache, calling it again with the same arguments returns the cached result instead of making a new API request, which reduces the number of API calls and improves performance.
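One way the business-day range could be put to use is to align the fetched prices to it, so that missing trading days show up as explicit gaps. The sketch below runs offline on a made-up payload that mimics the {ticker: {"prices": [...]}} layout; the ticker "DEMO" and all values are invented, and it assumes the 'date' field holds Unix timestamps in seconds.

```python
import pandas as pd

# Made-up payload mimicking the {ticker: {"prices": [...]}} layout used above
raw_data = {
    "DEMO": {
        "prices": [
            {"date": 1672756200, "open": 10.0, "high": 11.0, "low": 9.5,
             "adjclose": 10.5, "volume": 1000},  # 2023-01-03
            {"date": 1672842600, "open": 10.5, "high": 12.0, "low": 10.0,
             "adjclose": 11.5, "volume": 1200},  # 2023-01-04
            {"date": 1673015400, "open": 11.0, "high": 11.5, "low": 10.5,
             "adjclose": 11.0, "volume": 900},   # 2023-01-06
        ]
    }
}

columns = ["date", "open", "high", "low", "adjclose", "volume"]
prices = pd.DataFrame(raw_data["DEMO"]["prices"])[columns]

# Assumption: 'date' is a Unix timestamp in seconds; keep only the calendar day
prices["date"] = pd.to_datetime(prices["date"], unit="s").dt.normalize()
prices = prices.set_index("date")

# Business days (Monday to Friday) in the requested range
business_days = pd.bdate_range(start="2023-01-02", end="2023-01-06")

# Reindex so every business day has a row; days without data become NaN
aligned = prices.reindex(business_days)
print(aligned)
```

Here the rows for 2023-01-02 and 2023-01-05 come out as NaN, which makes the gaps visible before any filling or resampling is applied. This is an alternative to what fetch_ticker_data() does, not a description of it.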
1. Introduction to date and time
1.1 Importing time series data
1.2 Cleaning and preparing time series data
1.3 Visualizing the datasets
1.4 Timestamps and Periods
1.5 Using date_range
1.6 Using to_datetime
1.7 Shifting and lags
1.8 Resampling
2. Finance and Statistics
2.1 Percent change
2.2 Stock returns
2.3 Absolute change in successive rows
2.4 Comparing two or more time series
2.5 Window functions
2.6 OHLC charts
2.7 Candlestick charts
2.8 Autocorrelation and Partial Autocorrelation
3. Time series decomposition and Random Walks
3.1 Trends, Seasonality and Noise
3.2 White Noise
3.3 Random Walk
3.4 Stationarity
4. Modelling using statsmodels
4.1 AR models
4.2 MA models
4.3 ARMA models
4.4 ARIMA models
4.5 VAR models
4.6 State space methods
4.6.1 SARIMA models
4.6.2 Unobserved components
4.6.3 Dynamic Factor models