This is how you do Asset Allocation With Machine Learning
There is no source code to download. Every code block is inside the article itself, and the code downloads its data from Yahoo Finance.
Let’s start coding:
# python 3.7
# For yahoo finance
import io
import re
import requests
# The usual suspects
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Fancy graphics
plt.style.use('seaborn')
# Getting Yahoo finance data
def getdata(tickers,start,end,frequency):
    OHLC = {}
    cookie = ''
    crumb = ''
    # Load a quote page to get the session cookie and the crumb token
    res = requests.get('https://finance.yahoo.com/quote/SPY/history')
    cookie = res.cookies['B']
    pattern = re.compile(r'.*"CrumbStore":\{"crumb":"(?P<crumb>[^"]+)"\}')
    for line in res.text.splitlines():
        m = pattern.match(line)
        if m is not None:
            crumb = m.groupdict()['crumb']
    # Download the price history of each ticker as CSV
    for ticker in tickers:
        url_str = "https://query1.finance.yahoo.com/v7/finance/download/%s"
        url_str += "?period1=%s&period2=%s&interval=%s&events=history&crumb=%s"
        url = url_str % (ticker, start, end, frequency, crumb)
        res = requests.get(url, cookies={'B': cookie}).text
        OHLC[ticker] = pd.read_csv(io.StringIO(res), index_col=0,
                                   error_bad_lines=False).replace('null', np.nan).dropna()
        OHLC[ticker].index = pd.to_datetime(OHLC[ticker].index)
        OHLC[ticker] = OHLC[ticker].apply(pd.to_numeric)
    # Dictionary mapping each ticker to its DataFrame
    return OHLC
# Assets under consideration
tickers = ['%5EGSPTSE','%5EGSPC','%5ESTOXX','000001.SS']
# If yahoo data retrieval fails, try until it returns something
data = None
while data is None:
    try:
        data = getdata(tickers,'946685000','1685008000','1d')
    except Exception:
        pass
ICP = pd.DataFrame({'SP500': data['%5EGSPC']['Adj Close'],
                    'TSX': data['%5EGSPTSE']['Adj Close'],
                    'STOXX600': data['%5ESTOXX']['Adj Close'],
                    'SSE': data['000001.SS']['Adj Close']}).fillna(method='ffill')
# Yahoo Finance now returns incomplete data for some of the tickers, so we have to drop the affected rows
ICP = ICP.dropna()
First, this code retrieves historical price data from Yahoo Finance. We import io, re, and requests to handle the download, along with numpy, pandas, and matplotlib. After that, we define a function, getdata, that takes a list of tickers, a start date and an end date (passed as Unix timestamps), and a frequency such as '1d'.
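The start and end values passed to getdata in the code above ('946685000' and '1685008000') appear to be Unix timestamps, that is, seconds since 1970-01-01 UTC. Here is a minimal sketch of converting between calendar dates and such values. The helper names to_unix and from_unix are hypothetical and not part of the original code:

```python
from datetime import datetime, timezone

def to_unix(date_str):
    """Convert a 'YYYY-MM-DD' date (taken as midnight UTC) to a Unix-timestamp string."""
    dt = datetime.strptime(date_str, '%Y-%m-%d').replace(tzinfo=timezone.utc)
    return str(int(dt.timestamp()))

def from_unix(ts):
    """Convert a Unix timestamp (string or int, in seconds) to a UTC datetime."""
    return datetime.fromtimestamp(int(ts), tz=timezone.utc)

print(to_unix('2000-01-01'))     # 946684800
print(from_unix('946685000'))    # 2000-01-01 00:03:20+00:00
print(from_unix('1685008000'))   # 2023-05-25 09:46:40+00:00
```

So the request in this article covers roughly January 2000 to late May 2023.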
The function uses the requests library to send HTTP requests to Yahoo Finance and retrieve data based on the arguments we specify. It first loads a quote page to obtain a session cookie, then uses a regular expression to extract the "crumb" token from the response, since both are sent with every download request. After that, we create a list of tickers and store it in the tickers variable.
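To make the regular-expression step concrete, here is a small sketch that runs the same pattern as getdata against a made-up line of text. The sample line and the function name extract_crumb are assumptions for illustration only; the real page source may look different:

```python
import re

# Same pattern as in getdata, written as a raw string
pattern = re.compile(r'.*"CrumbStore":\{"crumb":"(?P<crumb>[^"]+)"\}')

# Made-up stand-in for one line of the page source (not real Yahoo output)
sample_line = 'root.App.main = {"context":{"CrumbStore":{"crumb":"AbC123xyz"},"other":1}};'

def extract_crumb(text):
    """Return the last crumb found in text, or '' if no line matches."""
    crumb = ''
    for line in text.splitlines():
        m = pattern.match(line)
        if m is not None:
            crumb = m.groupdict()['crumb']
    return crumb

print(extract_crumb(sample_line))             # AbC123xyz
print(repr(extract_crumb('no token here')))   # ''
```

If no line matches, the crumb stays empty, so a check for an empty string before building the download URL would make failures easier to spot.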
The list contains the Yahoo Finance symbols for four indices: the S&P 500, TSX, STOXX 600, and SSE. The %5E prefix is the URL-encoded form of the ^ character that Yahoo uses in index symbols. After that, a while loop attempts to retrieve the data from Yahoo Finance using the previously defined getdata function. If there is an error, the loop tries again until it successfully retrieves data.
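The retry loop in the code above repeats forever if the request keeps failing, for example when the endpoint changes. Here is a minimal sketch of a bounded alternative. The helper retry is hypothetical, and flaky_fetch is a stand-in for getdata so that the snippet runs without network access:

```python
import time

def retry(fetch, attempts=5, delay=0.0):
    """Call fetch() until it succeeds; raise RuntimeError after `attempts` failures."""
    last_error = None
    for _ in range(attempts):
        try:
            return fetch()
        except Exception as exc:   # KeyboardInterrupt is not caught here
            last_error = exc
            time.sleep(delay)
    raise RuntimeError('no data after %d attempts' % attempts) from last_error

calls = {'n': 0}

def flaky_fetch():
    """Stand-in for getdata: fails twice, then returns a placeholder result."""
    calls['n'] += 1
    if calls['n'] < 3:
        raise ConnectionError('simulated failure')
    return {'%5EGSPC': 'placeholder'}

data = retry(flaky_fetch, attempts=5)
print(calls['n'], data)
```

With the article's function, the call would look like retry(lambda: getdata(tickers, '946685000', '1685008000', '1d'), delay=1.0).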
The adjusted closing prices are then stored in a pandas DataFrame named ICP, with the columns SP500, TSX, STOXX600, and SSE representing the different assets. Because the markets do not all trade on the same days, the fillna method with method='ffill' replaces each missing value with the previous value in its column, and dropna then removes any rows that still contain missing values.
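To see what the forward fill and the final dropna do, here is a minimal sketch on made-up numbers (not real market data). The two series stand in for two indices with slightly different trading calendars:

```python
import pandas as pd

# Made-up prices; each series has its own set of dates
a = pd.Series([10.0, 11.0, 12.0],
              index=pd.to_datetime(['2020-01-02', '2020-01-03', '2020-01-06']))
b = pd.Series([20.0, 21.0],
              index=pd.to_datetime(['2020-01-03', '2020-01-07']))

# Building a DataFrame from Series aligns them on the union of their dates,
# which leaves NaN wherever a series has no value for a date
df = pd.DataFrame({'A': a, 'B': b})

filled = df.ffill()       # same effect as fillna(method='ffill')
clean = filled.dropna()   # removes leading rows that have no earlier value to carry forward

print(df)
print(clean)
```

Only the first row is dropped here, because column B has no earlier price to carry forward on 2020-01-02.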
Note that getdata itself returns a dictionary that maps each ticker to a DataFrame of its price history; ICP is built from that dictionary afterwards, outside the function.