Stock prediction using 3 methods (LSTM+ARIMA+MCMC)
During the course of my bachelor’s thesis, I tested various stock prediction methods and made mistakes along the way.
Most of you are probably familiar with LSTM, ARIMA (and its variations) and MCMC (Markov Chain Monte Carlo). Stock prices are hard to forecast because of market volatility and social influences on the trend. My research therefore focused on combining these methods so that the predictions stay flexible and adapt to the historical data of each stock. I hope you find this notebook useful.
Installing Packages
For data scraping I use the yfinance package, which is easy to install and use.
yfinance is an open-source library developed by Ran Aroussi for accessing Yahoo Finance’s financial data. Yahoo Finance provides market data on stocks, bonds, currencies, and cryptocurrencies. It also provides market news, reports, and analysis, as well as options and fundamentals data, which sets it apart from some of its competitors.
!pip install yfinance --quiet
!pip install pmdarima --quiet
These two lines use the Python package installer “pip” to install two libraries: “yfinance” and “pmdarima”. The leading “!” runs each command in the notebook’s shell.
The first line, “!pip install yfinance --quiet”, installs “yfinance”, which retrieves financial data from Yahoo Finance. The “--quiet” flag suppresses most of the output that pip prints during installation.
The second line, “!pip install pmdarima --quiet”, installs “pmdarima”, a library for time series analysis and forecasting built around the AutoRegressive Integrated Moving Average (ARIMA) model. Here too, “--quiet” keeps the installation output short.
Together, these two libraries cover data retrieval and time series forecasting.
!pip install statsmodels==0.11.0rc1 --quiet
!pip install -Iv pulp==1.6.8 --quiet
These two lines use “pip” to install pinned versions of two more libraries: “statsmodels” and “pulp”.
“!pip install statsmodels==0.11.0rc1 --quiet” installs “statsmodels”, a library for statistical modeling and econometrics in Python. The “==0.11.0rc1” specifier pins the install to exactly that version, a release candidate of 0.11.0, which some code or projects may require. The “--quiet” flag reduces the output printed during installation.
“!pip install -Iv pulp==1.6.8 --quiet” installs version 1.6.8 of “pulp”, a library for linear programming and optimization. The “-Iv” option combines “-I” (“--ignore-installed”), which installs the requested version even if another version, such as a newer one, is already present, with “-v” (“--verbose”). The “==1.6.8” specifier pins the exact version.
Pinning both versions means that code depending on these specific releases keeps working even when newer ones are available.
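Because the installs run quietly, it can be worth confirming which versions actually ended up in the environment. The snippet below is a minimal sketch using only the standard library (“importlib.metadata”, Python 3.8+). The helper name “installed_version” is my own and is not part of any of these packages.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent.

    Hypothetical helper for illustration only.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the four packages installed above.
for name in ("yfinance", "pmdarima", "statsmodels", "pulp"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

If the pins were applied, “statsmodels” and “pulp” should report the versions requested above.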
Collecting Train Data
import yfinance as yf
# getting data from Yahoo Finance
stock_name = 'AMD' # change the ticker symbol here; this example uses AMD
data = yf.download(stock_name, start="2020-03-26", end="2021-03-29")
This code imports the “yfinance” library with the alias “yf” and uses it to download historical stock price data from Yahoo Finance for a given stock ticker.
“import yfinance as yf” imports the “yfinance” library and creates an alias “yf”. Financial data is retrieved from Yahoo Finance using this library.
The second line creates a variable “stock_name” with the value ‘AMD’, which represents the stock ticker of the company whose stock price data should be downloaded. A different stock can be downloaded by changing this value.
The third line calls the “download” function from the “yfinance” library to retrieve historical stock prices for the given ticker between the specified dates. This example uses a start date of March 26th, 2020 and an end date of March 29th, 2021. The retrieved data is stored in the variable “data”.
Through the “yfinance” library in Python, this code retrieves historical stock prices from Yahoo Finance for a given ticker.
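A quick way to sanity-check the download is to compare the number of rows in “data” against the number of weekdays in the requested range. The sketch below counts Monday-to-Friday dates with the standard library only. It assumes that “yfinance” returns one row per trading day and treats the “end” date as exclusive; the helper name “count_weekdays” is hypothetical. Market holidays are not modeled, so the result is only an upper bound.

```python
from datetime import date, timedelta


def count_weekdays(start, end):
    """Count Monday-to-Friday dates in the half-open range [start, end).

    Hypothetical helper; holidays are ignored, so this is an upper bound
    on the number of trading days.
    """
    day = date.fromisoformat(start)
    last = date.fromisoformat(end)
    total = 0
    while day < last:
        if day.weekday() < 5:  # 0-4 are Monday to Friday
            total += 1
        day += timedelta(days=1)
    return total


# Same date range as the download call above.
upper_bound = count_weekdays("2020-03-26", "2021-03-29")
print(f"Expect at most {upper_bound} rows in the downloaded data")
```

If “len(data)” is far below this bound, the ticker symbol or the date range is worth a second look.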
# import plotly package for graphs
import plotly
import plotly.graph_objs as go
import plotly.express as px
from plotly.subplots import make_subplots
This code imports several modules and functions from the plotly package, which is used to create interactive graphs and visualizations.
‘# import plotly package for graphs’ is a comment that describes what the following code is for.
Plotly, a popular Python library that creates interactive visualizations, is imported by the second line “import plotly”.
“import plotly.graph_objs as go” imports the “graph_objs” module from the “plotly” package, and creates an alias “go”. There are several classes in this module that define different types of graphs and visualizations, including scatter plots, bar charts, and heatmaps.
The fourth line imports the “express” module from the “plotly” package under the alias “px”. This module creates many types of visualizations with minimal code.
The fifth line imports the “make_subplots” function from the “plotly” package’s “subplots” module. This function creates a grid of subplots within a single figure.
Overall, this code imports several key components of the “plotly” package that are commonly used in Python to create interactive graphs and visualizations.