Algorithmic Trading: Building a Three-Pillar Quantitative Strategy

From Equal Weight to Robust Value and High-Quality Momentum: A Practical Python Guide to Portfolio Construction

Onepagecode
Nov 20, 2025

A download link for the source code is at the end of the article!

The realm of stock market investing is increasingly driven by quantitative strategies — systematic, rules-based approaches that leverage computational power and data analysis over human intuition. While passive strategies like tracking the S&P 500 remain popular, sophisticated investors often combine fundamental principles like value and dynamic principles like momentum to enhance returns and manage risk. This comprehensive guide details the process of building a robust algorithmic trading toolkit in Python. We will develop and implement three distinct strategies: a foundational Equal-Weight S&P 500 Index Fund for balanced exposure, a sophisticated High-Quality Momentum (HQM) screener focused on consistently strong price trends, and a Robust Value (RV) strategy that identifies deeply undervalued stocks using a combination of five key financial metrics. Through this project, you’ll learn to handle real-world financial data, manage APIs, execute complex calculations, and translate investment hypotheses into actionable, automated trading recommendations.


The S&P 500 is the most popular stock market index in the world. The biggest fund tied to this index is the SPDR® S&P 500® ETF Trust, which manages over US$250 billion in assets.


In this part of the course, we will create a Python script that takes the value of your portfolio and tells you how many shares of each S&P 500 company you should buy to create an equal-weight version of the index fund.

Library Imports

First, let’s import the open-source software libraries that we will be using in this tutorial.

import numpy as np  # The NumPy numerical computing library
import pandas as pd  # The pandas data science library
import requests  # The requests library for HTTP requests in Python
import xlsxwriter  # The XlsxWriter library for creating Excel files
import math  # The Python math module

Imagine you’re setting up for a fun cooking project where we want to analyze trends in the kitchen. Our first step is to gather our ingredients, and in the world of programming, we often do this by importing libraries. Each library we import acts like a special tool or recipe that helps us accomplish specific tasks.

So, we begin by bringing in numpy, the numerical computing library. Think of NumPy as a set of very sharp knives that help us chop and manipulate large amounts of numerical data quickly and efficiently. Then we grab pandas, which is like a well-organized pantry that allows us to store and manage data in a way that’s easy to work with. It’s perfect for handling tables, just like having all your ingredients neatly lined up before cooking.

Next comes requests, which is like sending a messenger to fetch supplies from the market. This library allows us to make HTTP requests, meaning it can help us grab data from the internet, like prices or trends, which is crucial for our algorithmic trading project. Moving along, we have xlsxwriter. You can think of it as a fancy wrapping paper; it helps us neatly package our data into Excel files, which makes it easy to share and present our findings.

Lastly, we import the math module, our trusty companion for performing various mathematical operations. It’s like having a calculator that allows us to measure and calculate accurately as we put our trades together. Each of these imports lays a foundation for a powerful program that can analyze market patterns, ultimately helping us make smarter trading decisions. As we continue this project, each ingredient will come into play to craft our trading strategy seamlessly.

Importing Our List of Stocks

Next, let’s import the S&P 500 constituents.

Since these constituents can change, the best approach would be to connect directly to the index provider, Standard & Poor’s, and regularly retrieve their latest list.

You can download the dataset and the code using the link at the end of this article.

However, accessing the index provider’s API isn’t part of our course.

Now, let’s import these stocks into our Jupyter Notebook file.

stocks = pd.read_csv('sp_500_stocks.csv')

Imagine stepping into a bustling stock market filled with countless opportunities, and you want to capitalize on potential trends using machine learning. To begin our journey, we need to gather information, and that’s where our first line comes in.

Here, we see the command that pulls in a treasure trove of data about various stocks, specifically those in the S&P 500. The pd.read_csv function acts like a friendly librarian, retrieving the specific book — in this case, a file called ‘sp_500_stocks.csv’ — from our digital library. This file is a spreadsheet, containing crucial information about each stock, such as their prices, market capitalization, and industry sector.

By using pd, we are actually tapping into the powerful Pandas library, which is fantastic for managing and analyzing data, much like a chef uses various utensils to prepare a meal efficiently. Reading in this data prepares our ingredients, allowing us to cook up insightful analyses and predictions later in the project.

Once we have everything neatly organized, we can start working with the stocks, analyzing how they perform, and creating models that can predict their future behavior. Think of this step as laying the groundwork for a recipe — without the right ingredients, we can’t bake a delicious cake. As we continue with our work, each additional line will build on this foundation, guiding us toward smarter trading decisions using the intricate tools of machine learning and deep learning. Together, we’ll uncover patterns and insights that can transform our approach to trading.

Importing Your API Token

It’s time to add your IEX Cloud API token! This is the data provider we’ll use throughout the course.

Please remember to store your API token (and other sensitive information) in a secrets.py file. Ensure that this file is excluded from your Git repository by adding it to .gitignore. In this course, we will use a sandbox API token, which means the data will be randomly generated and, most importantly, free of charge.

from secrets import IEX_CLOUD_API_TOKEN

Imagine embarking on a journey through the world of algorithmic trading, where machines make smart decisions about buying and selling stocks. Our adventure begins by tapping into a treasure chest of financial data provided by IEX Cloud, an essential resource for analyzing stock market trends. To access this wealth of information safely, we first summon a special key — an API token — hidden away in our secrets for security. It’s like having a special pass that grants us entry to exclusive club insights.

Now, picture this line as us turning to a locked door (the financial data) and using our secret key to open it. The “from secrets import IEX_CLOUD_API_TOKEN” line brings that token into our program, ensuring that when we inquire about stock prices or market changes, we do so quietly and securely, without exposing our key to the world. This creates a protective layer around our data access, much like how a superhero keeps their true identity under wraps.

By incorporating this token at the very start of our coding story, we set the stage for all that follows. This key is the foundation upon which we will build our machine learning models, which will learn from the data we gather and help predict future market movements. Think of it as establishing the groundwork for our complex recipe, where every ingredient and instruction — each line and function — plays a pivotal role in creating a tasty dish of smart trading algorithms. With this key in hand, we are ready to dive deeper into the realms of machine learning and deep learning, exploring how computers can become savvy traders in the bustling stock market.
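For reference, the secrets.py file this import assumes can be as small as a single line; the token below is a placeholder, not a real credential. One caution worth noting: a local secrets.py shadows Python's standard-library `secrets` module, so avoid importing the stdlib module elsewhere in this project.

```python
# secrets.py -- a minimal sketch of the credentials file this tutorial assumes.
# Keep it out of version control (add "secrets.py" to your .gitignore).
IEX_CLOUD_API_TOKEN = 'Tpk_your_sandbox_token_here'  # placeholder, not a real token
```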

Making Our First API Call

Let’s get ready to make our API calls to IEX Cloud.

Here’s what we need to gather from the API:

* The market capitalization for each stock

* The price of each stock

symbol = 'AAPL'
api_url = f'https://sandbox.iexapis.com/stable/stock/{symbol}/quote?token={IEX_CLOUD_API_TOKEN}'
data = requests.get(api_url).json()
data

Imagine we’re trying to gather the latest information about Apple Inc. for a trading strategy we’re developing using machine learning. To begin, we define our target company with the symbol ‘AAPL’, which serves like a name tag that helps us identify it in the bustling stock market. Next, we create a URL that acts like a delivery address for requesting data from a specialized service called an API, which stands for Application Programming Interface. Picture this as asking a librarian to fetch a specific book; we need to tell it exactly what we’re looking for.

The URL we construct uses Python’s f-string, allowing us to seamlessly insert our symbol right into the address, making it dynamic. It’s like writing a letter where the name of the person changes based on who you are addressing. After we have our complete URL, we send a request to this address, asking for the stock quote information using the requests.get() method, which is like dialing a phone number to get a voice on the other end. When we receive the response, it comes back as a JSON object, which is a structured form of data that we can easily work with, similar to receiving a neatly organized report.

Finally, by using .json(), we convert the response into a Python-friendly format, allowing us to interact with the data more easily. It’s as if we are unpackaging a box full of information so we can see what it contains. The goal here is to prepare ourselves with the necessary stock information, which we will later leverage in our trading algorithms, sharpening our decision-making as we automate the trading process. This foundational step is crucial as we move forward into more sophisticated analyses and predictions.
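The exact payload depends on the provider, but as an offline sketch (with an abbreviated, invented response body), the parsing step that `.json()` performs for us looks like this:

```python
import json

# A trimmed, invented stand-in for the JSON body a quote endpoint might return.
raw_body = '{"symbol": "AAPL", "latestPrice": 150.25, "marketCap": 2450000000000}'

data = json.loads(raw_body)      # requests' .json() does this decoding step for us
price = data['latestPrice']      # most recent trade price
market_cap = data['marketCap']   # total company value in dollars
print(price, market_cap)
```

Once decoded, the response behaves like an ordinary Python dictionary, which is exactly how we will access it in the next section.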

Parsing Our API Call

The API call we made in the previous code block includes all the details we need to create our equal-weight S&P 500 strategy.

However, the data still needs to be organized properly. Let’s parse it first.

data['latestPrice']
data['marketCap']

Imagine you’re reading a treasure map that leads you to the hidden gems of the stock market. In our adventurous journey, we’re collecting key treasures from a data treasure chest that contains important insights about various stocks.

The first treasure we’re reaching for is labeled ‘latestPrice.’ Here, we’re extracting the most recent price of a particular stock, allowing us to see how much it is currently worth. Think of it as checking the price tag while shopping for your favorite gadget; by knowing this, you can make informed decisions about buying or selling.

Next, we turn to our second treasure labeled ‘marketCap.’ This reflects the total value of a company in the stock market, similar to calculating how much all the toys in a toy store might be worth — all the items combined give you a picture of the store’s worth. By gathering this information, we can assess the company’s size and its potential for growth or decline, much like determining whether a small toy store can compete with a large department store.

Each time we sift through this treasure chest, we’re not just looking for pretty numbers; we’re gathering critical information that will feed into our machine learning models. These models will analyze this data to uncover patterns and trends that can help us make smarter trading decisions, turning our adventure into potential profit. So, as we continue to explore and gather our treasures, we build a stronger foundation for our larger quest in algorithmic trading. And with each line we write, we’re one step closer to unveiling the secrets of the market!

Incorporating Our Stocks Data into a Pandas DataFrame

Next, we will add our stock prices and market capitalization to a pandas DataFrame. You can think of a DataFrame as the Python equivalent of a spreadsheet, where it holds tabular data.

my_columns = ['Ticker', 'Price', 'Market Capitalization', 'Number Of Shares to Buy']
final_dataframe = pd.DataFrame(columns=my_columns)
final_dataframe

Imagine we’re chefs in the exciting kitchen of algorithmic trading, where we need a special recipe to manage our ingredients — like a list of stocks and their details. We start by creating a list called my_columns, which acts like a grocery list featuring key ingredients: ‘Ticker’, ‘Price’, ‘Market Capitalization’, and ‘Number Of Shares to Buy’. Each of these items will help us bake a successful trading strategy, just as good ingredients are essential for a delicious dish.

Next, we use a tool from the pandas library, which is like our trusty kitchen equipment, to whip up something new. We call this creation final_dataframe. In our context, it’s like a well-organized kitchen counter, where we’ve set aside spaces for each ingredient listed in our my_columns. When we use pd.DataFrame(columns=my_columns), we’re effectively preparing a clean slate — imagine setting up empty containers labeled accordingly, ready to be filled with delicious stock data.

With this structure in place, final_dataframe has been initialized but remains empty for now. It’s like having our containers ready to catch the fresh produce and spices we’re going to get later. This empty dataframe will eventually hold all the relevant information we gather, whether it’s prices or how many shares we want to buy, which is crucial for our trading algorithm.

As we continue our journey through the world of algorithmic trading using machine learning and deep learning, this dataframe will become a vibrant hub where data comes alive, helping us make informed decisions about our trades. Together, we’ll fill it up and transform it into a powerful resource in our trading toolkit.

# Note: DataFrame.append was removed in pandas 2.0; on newer versions use pd.concat.
final_dataframe = final_dataframe.append(
    pd.Series(['AAPL',
               data['latestPrice'],
               data['marketCap'],
               'N/A'],
              index=my_columns),
    ignore_index=True)
final_dataframe

Imagine we’re working on a project that aims to make intelligent decisions in financial trading, much like a skilled chef who knows exactly how to combine ingredients for the perfect dish. In this part of our program, we’re gathering important information about a stock, specifically Apple, and organizing it neatly in a table, which we call a DataFrame.

Here, we start with our final_dataframe, a collection of data that we’ve built up over time, like a recipe book filled with our favorite meals. To add new information about Apple, we reach into our cooking toolbox and pull out the trusty pd.Series, which acts like a recipe card. Each item in this card represents a key detail: ‘AAPL’ is the name of the stock, data[‘latestPrice’] gives us its current value, data[‘marketCap’] provides its overall worth in the market, and ‘N/A’ is a placeholder for information we don’t have yet.

By using the append method, we’re carefully adding this recipe card to our larger recipe book, ensuring that we keep everything neatly organized. The ignore_index=True part ensures that we’re not fussing over the previous page numbers but instead just adding it to the end of our book, making it easy to read.

Finally, when we refer back to final_dataframe, we see all our collected ingredients together at a glance, ready for our machine learning model to analyze and help us make savvy trading decisions. Ultimately, this is part of the larger endeavor to use sophisticated algorithms to outsmart the market, much like a chef who evolves their recipes based on taste tests and feedback. Each detail we collect now becomes part of that recipe for success.
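A caveat worth knowing: `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0. On newer pandas versions, the same row insertion can be sketched with `pd.concat` (the quote values below are invented placeholders standing in for the API response):

```python
import pandas as pd

my_columns = ['Ticker', 'Price', 'Market Capitalization', 'Number Of Shares to Buy']
final_dataframe = pd.DataFrame(columns=my_columns)

# Invented values standing in for data['latestPrice'] and data['marketCap'].
new_row = pd.DataFrame([['AAPL', 150.25, 2450000000000, 'N/A']], columns=my_columns)
final_dataframe = pd.concat([final_dataframe, new_row], ignore_index=True)
print(final_dataframe)
```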

Looping Through Our List of Stock Tickers

By applying the same idea we discussed earlier, we can gather data for all S&P 500 stocks and save this information in the DataFrame using a for loop.

final_dataframe = pd.DataFrame(columns=my_columns)
for symbol in stocks['Ticker']:
    api_url = f'https://sandbox.iexapis.com/stable/stock/{symbol}/quote?token={IEX_CLOUD_API_TOKEN}'
    data = requests.get(api_url).json()
    final_dataframe = final_dataframe.append(
        pd.Series([symbol,
                   data['latestPrice'],
                   data['marketCap'],
                   'N/A'],
                  index=my_columns),
        ignore_index=True)

Imagine we’re embarking on a journey to gather key financial data about various stocks. We begin by creating an empty treasure chest, which in programming we call a DataFrame — think of it as a well-organized table where our findings will be stored. This treasure chest is initialized with specific columns that we’ve decided to track, represented by my_columns, which might include details like stock ticker symbols, their latest prices, and market capitalizations.

Next, we open a book of stocks, where each entry is a stock ticker, akin to the titles of stories we want to explore. As we read through each ticker symbol one by one, we want to collect the latest information about that stock. To do this, we craft a special message, much like sending a letter that requests information from a distant library. This letter is the api_url, which is carefully constructed to include the stock symbol and our secret access key to the IEX Cloud, ensuring we are authorized to receive the data.

Once we send off our request using requests.get(api_url), we receive a treasure trove of information in the form of a response. This response is converted from its original form to something we can understand — just as we might transform a foreign language into our native tongue; here, it becomes a JSON object.

With the latest price and market capitalization in hand, we write a new entry into our DataFrame. Think of this as placing a new gem into our treasure chest. We embrace the pd.Series function, which helps us organize our findings according to the same columns we defined earlier. Finally, we gently close the chest with ignore_index=True, ensuring that our entries keep their unique identities as we continue our quest.

In the grand scheme of our project, gathering this data is a foundational step, akin to securing reliable ingredients before we begin cooking up our machine learning algorithms to trade intelligently in the stock market. Each line serves its purpose in building a robust toolkit for that exciting journey ahead.

final_dataframe

Imagine we are crafting the foundation of a powerful tool that could help automate trading decisions in the financial markets. Our journey begins with a collection of data, which we call final_dataframe. Think of it as a meticulously organized filing cabinet filled with crucial information — each drawer labeled with different attributes related to stocks, trends, and other vital indicators.

When we refer to final_dataframe, we are actually calling upon the entirety of this organized collection. It’s where all the magic begins! Imagine flipping through a well-organized binder where each page represents specific historical data points, like prices, volumes, and patterns. By referencing final_dataframe, we’re telling our program, “Let’s dive into this treasure trove of data to analyze and make informed decisions.”


The beauty of this approach lies in how we can improve and refine our analysis. Just like a chef might tweak a recipe after tasting the dish, we can modify our data collection and analysis methods based on what the markets tell us. This flexibility allows us to make adjustments to our algorithms and predictions, honing our trading strategies over time.

As we continue to build on this foundational piece, we will feed final_dataframe into various machine learning models, helping our program learn from the data and adapt to new trends and fluctuations. Each part of our project will layer on top of this collection, creating a dynamic system that can hopefully predict market movements and execute trades with wisdom and speed. So, let’s keep our eyes on how this essential data will empower our algorithmic trading engines moving forward!

Enhancing Performance with Batch API Calls

Using batch API calls is a simple way to boost your code’s performance.

HTTP requests are usually among the slowest parts of a script.

Additionally, many API providers offer discounted rates for batch API calls, as they are more efficient for them to handle.

IEX Cloud allows batch API calls for up to 100 tickers at a time. This change cuts down our total API calls in this section from 500 to just 5 — a remarkable improvement! Here, we’ll break our stock list into groups of 100 and make one batch API call for each group.

# Function sourced from
# https://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks
def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

Imagine we’re on a journey to analyze financial data for algo trading, where we need to effectively manage and manipulate our information. In our adventure, we encounter a helper function named chunks, which acts like our trusty guide, breaking down larger tasks into manageable pieces.

At the beginning, we define this function. It’s like opening a toolbox and selecting our most essential tool for the job. This particular tool takes two ingredients: a list, which we can picture as a long string of pearls, each pearl representing an item of data, and a number n, which tells our tool how many pearls we’d like in each group.

Inside the function, the magic happens in a friendly loop that gently walks through our list. Just as you might take a step down a path, stopping to admire a cluster of flowers before moving on, the loop iterates from the start of the list to its end, stepping through in jumps of size n. Each time it takes a leap, it creates a new chunk, like gathering a handful of flowers to form a lovely bouquet.

This chunk is created using the yield keyword, which is like a magical basket that lets us pick and share our bouquet one at a time, rather than all at once. This is efficient and allows us to keep our hands free for the next step.

As we piece together these chunks from our data, we prepare to process them for our algo trading task, ensuring that we handle our information systematically. Each step and each chunk brings us closer to creating sophisticated models that can make predictions in the fast-paced world of finance.
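Before wiring `chunks` into our API workflow, it helps to watch it work on a toy list of tickers (the symbols and the chunk size of 2 below are just illustrative):

```python
# Same helper as above, demonstrated on a small list.
def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

groups = list(chunks(['AAPL', 'MSFT', 'GOOG', 'AMZN', 'TSLA'], 2))
print(groups)  # [['AAPL', 'MSFT'], ['GOOG', 'AMZN'], ['TSLA']]
```

Note that the final chunk may be shorter than `n`, which is exactly what we want when 500 tickers do not divide evenly into groups of 100.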

symbol_groups = list(chunks(stocks['Ticker'], 100))
symbol_strings = []
for i in range(0, len(symbol_groups)):
    symbol_strings.append(','.join(symbol_groups[i]))

final_dataframe = pd.DataFrame(columns=my_columns)

for symbol_string in symbol_strings:
    batch_api_call_url = f'https://sandbox.iexapis.com/stable/stock/market/batch/?types=quote&symbols={symbol_string}&token={IEX_CLOUD_API_TOKEN}'
    data = requests.get(batch_api_call_url).json()
    for symbol in symbol_string.split(','):
        final_dataframe = final_dataframe.append(
            pd.Series([symbol,
                       data[symbol]['quote']['latestPrice'],
                       data[symbol]['quote']['marketCap'],
                       'N/A'],
                      index=my_columns),
            ignore_index=True)

final_dataframe

Imagine we’re embarking on a journey to gather financial insight on a set of stocks, akin to collecting ingredients for a grand feast. Our adventure begins with a collection of tickers representing various stocks, and we want to chunk them into manageable groups of 100 to avoid overwhelming ourselves, much like organizing our ingredients by type.

As we create these smaller groups, each chunk becomes a basket of tickers we can handle better. We then take each basket and transform it into a string of tickers separated by commas, making it easier to communicate with the API, which is like calling a supplier to request information about our ingredients. This is what our loop does — it gathers each group and prepares it for our upcoming requests.

Now, with our baskets ready, we create a blank table, or a DataFrame, where we will store the essential details — like the price and market capitalization — of each stock, setting the stage for our feast. With each string of tickers, we make an API call to a financial service, asking for the latest data. Think of this as sending a grocery list to our supplier who will deliver the freshest stock information.

As we receive the data back, we meticulously pour each piece of information into our DataFrame, a structured way to keep everything organized. We loop through each ticker in the string, pulling its price and market cap from the data we received and adding these to our table along with some placeholders.

Finally, our collection of data sits neatly in the DataFrame, ready to be used for our next steps in analyzing the stocks through machine learning and deep learning, ultimately guiding our trading decisions.

Figuring Out How Many Shares to Purchase

Looking at the DataFrame above, we still need to determine how many shares of each stock to buy.

Let’s take care of that next.

portfolio_size = input("Enter the value of your portfolio:")

while True:
    try:
        val = float(portfolio_size)
        break
    except ValueError:
        print("That's not a number! \n Try again:")
        portfolio_size = input("Enter the value of your portfolio:")

Imagine we’re starting a journey in the world of algorithmic trading, where each piece of code helps us make smart decisions with our investments. We begin by asking for the value of your portfolio, kind of like asking someone how much money they have to spend at a market. This is where the program engages with the user, presenting a prompt: “Enter the value of your portfolio.”

Now, the interesting part happens next. The program takes the input from the user and tries to turn that input into a number by using the float function, which is like converting a stack of coins into a clear amount of cash. However, sometimes, people might not enter a tidy number — perhaps they mistakenly type in letters instead of digits. This is where we introduce a protective shield: a try-except block. Think of it as a safety net that lets us catch any mistakes while we’re soaring high in our coding adventure.

If the user enters something that can’t be converted to a number, like “apple” instead of “1000”, it triggers a ValueError, which pops us down to the except block, where we gently remind the user of their mistake with a friendly message. Then, we try again, prompting them to enter the portfolio value once more. This repetitive feedback loop is crucial for ensuring we get the right input before we move forward. It’s all about making sure our calculations are based on accurate data.

By handling inputs carefully, we prepare for the next exciting steps in our trading algorithms, ensuring we’re equipped with the right numbers before diving into the complexities of machine learning and deep learning. With this foundation in place, we’re one step closer to building an intelligent trading system.
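The retry idea described above can also be packaged into a small reusable helper. Here, `prompt_portfolio_size` is a hypothetical name, and the `read_input` parameter is an assumption added in this sketch so the loop can be exercised with canned answers instead of a live prompt:

```python
def prompt_portfolio_size(read_input=input):
    """Keep prompting until the user supplies something float() can parse."""
    while True:
        raw = read_input("Enter the value of your portfolio:")
        try:
            return float(raw)
        except ValueError:
            print("That's not a number! \n Try again:")

# Exercise the loop with scripted answers: a bad one, then a good one.
answers = iter(['apple', '1000'])
val = prompt_portfolio_size(read_input=lambda _: next(answers))
print(val)  # 1000.0
```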

position_size = float(portfolio_size) / len(final_dataframe.index)
for i in range(0, len(final_dataframe['Ticker'])):
    final_dataframe.loc[i, 'Number Of Shares to Buy'] = math.floor(position_size / final_dataframe['Price'][i])
final_dataframe

Imagine we’re working together to build a smart trading algorithm that decides how many shares of each stock we can buy with the money we have. First, we need to figure out how much of our total portfolio we should allocate for each stock. The line that begins with “position_size” elegantly does just that: it takes our total amount of money, called portfolio_size, and divides it by the number of different stocks we’re interested in, which sits in final_dataframe. This division tells us how much we can spend on each stock.

Next, we take our journey through each stock in our list using a loop. Think of this loop as going down a checklist, one stock at a time. We create a range that covers every index of our stock tickers. The loop starts at zero and stops just before the total number of stocks, since Python’s range excludes its endpoint, so every ticker is visited exactly once.

Now, for each stock, we need to determine how many shares we can buy without exceeding our allocated budget. The line inside our loop does exactly that! It looks up the price of the current stock and calculates how many shares can fit into our budgeted amount by dividing position_size by the stock’s price. The math.floor function comes into play here; it ensures that we only count whole shares since we can’t buy a fraction of a share. Finally, we store this number back into our final_dataframe, which keeps track of our decisions.

When we complete this loop, we have a clear picture of how many shares to buy for each stock in our final plan. All these little steps lead us closer to creating a powerful algorithm for trading that leverages data and smart calculations!
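The arithmetic above can be checked by hand with invented numbers:

```python
import math

# A worked example of the sizing rule, with hypothetical figures.
portfolio_size = 10_000_000.0                        # total capital in dollars
number_of_stocks = 500                               # rows in final_dataframe
position_size = portfolio_size / number_of_stocks    # dollars allocated per stock
print(position_size)                                 # 20000.0

share_price = 150.25                                 # hypothetical price of one stock
shares_to_buy = math.floor(position_size / share_price)
print(shares_to_buy)  # 133, since fractional shares are rounded down
```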

Formatting Our Excel Output

We will use the XlsxWriter library in Python to create beautifully formatted Excel files.

XlsxWriter is a fantastic package that provides a lot of customization options. However, it can feel a bit overwhelming for new users. So, this section will be a bit detailed as I want to ensure we cover how XlsxWriter operates thoroughly.

Initializing our XlsxWriter Object

writer = pd.ExcelWriter('recommended_trades.xlsx', engine='xlsxwriter')
final_dataframe.to_excel(writer, sheet_name='Recommended Trades', index=False)

Imagine you’ve been on a journey through the exciting world of algorithmic trading, where you’ve developed a sophisticated strategy to decide when to buy and sell stocks based on data intelligence. Now, after all that hard work with machine learning and deep learning, it’s time to share your recommendations in an easy-to-read format.

The first line introduces a character in our story: the writer. Think of it as a skilled secretary who is prepared to neatly organize and document your findings. By using pd.ExcelWriter, you’re asking this secretary to create an empty Excel file named ‘recommended_trades.xlsx’. The engine=’xlsxwriter’ part is like picking the best tool for the job, ensuring that your secretary has the right skills to craft the file just the way you need it.

Now that our secretary is ready, we move on to the next important task: filling in the file with content. The second line takes the final_dataframe, which is like a well-organized notebook containing all the results of your calculations — your recommended trades. By calling to_excel, you’re instructing our secretary to transfer this data into your Excel file under a specific sheet named ‘Recommended Trades’. By using index=False, you’re politely requesting that the notes (or row indices) are left out, providing a cleaner look that focuses purely on your recommendations.

As we wrap up this small yet crucial chapter, remember that every part of your code works together like chapters in a book, supporting the bigger narrative of developing a robust trading strategy. With your recommendations neatly documented, you are one step closer to sharing your insights and making informed trading decisions.

Preparing the Formats for Our .xlsx File

Formats can include colors, fonts, and symbols such as % and $. For our Excel document, we will use four main formats:

* String format for ticker symbols

* $XX.XX format for stock prices

* $XX,XXX format for market capitalization

* Integer format for the number of shares to buy

background_color = '#0a0a23'
font_color = '#ffffff'

string_format = writer.book.add_format(
        {
            'font_color': font_color,
            'bg_color': background_color,
            'border': 1
        }
    )

dollar_format = writer.book.add_format(
        {
            'num_format': '$0.00',
            'font_color': font_color,
            'bg_color': background_color,
            'border': 1
        }
    )

integer_format = writer.book.add_format(
        {
            'num_format': '0',
            'font_color': font_color,
            'bg_color': background_color,
            'border': 1
        }
    )

Imagine we’re setting the stage for a grand performance, where our focus is on displaying information beautifully and clearly. As we embark on this journey, our first task is to decide the atmosphere — we create a background color, a deep, rich shade of blue known as #0a0a23, which will be the canvas where our data dances. Next, we choose a bold font color, pure white (#ffffff), which will ensure that our text shines brightly against that canvas.

Now, just like a painter uses different brushes to create specific textures, we create various formats for the data we want to display, starting with string_format. Think of this as a carefully crafted recipe card where we specify that our text will be white on that blue background with a border around it to keep things tidy.

But what if we want to represent money? We can’t just throw numbers around; they need to sparkle just like jewels. So, we have dollar_format, which formats numbers as currency — adding a dollar sign and fixing two decimal places so they’re ready to impress. Again, we retain our color choices for consistency, and we wrap those figures in a neat package with a border.

Finally, we have integer_format, which is our way of showcasing whole numbers. Much like presenting fruits in a bowl, we want them neat and simple, clear of any frills.

Together, these formats let us approach our data with both elegance and clarity, perfectly aligning with the larger ambition of our project to create an effective algorithmic trading tool that not only performs well but also communicates its findings beautifully.
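Note that the bullet list above also mentions a dedicated $XX,XXX style for market capitalization, while the snippet defines only three formats and reuses `dollar_format` for that column. If you prefer the thousands-separator style, a minimal self-contained sketch follows; the workbook filename and the `market_cap_format` name are my own, introduced purely for illustration:

```python
import xlsxwriter

workbook = xlsxwriter.Workbook("format_demo.xlsx")

# '$#,##0' tells Excel to show a dollar sign and thousands separators,
# which is the "$XX,XXX" style described for market capitalization.
market_cap_format = workbook.add_format(
    {
        "num_format": "$#,##0",
        "font_color": "#ffffff",
        "bg_color": "#0a0a23",
        "border": 1,
    }
)

worksheet = workbook.add_worksheet("Demo")
worksheet.write_number("A1", 2500000, market_cap_format)  # renders as $2,500,000
workbook.close()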

Formatting Columns in Our .xlsx File

We can easily format specific columns in our spreadsheet using the set_column method on the writer.sheets['Recommended Trades'] object.

Here’s how it works:


writer.sheets['Recommended Trades'].set_column('B:B', # This specifies that we want to format column B
                                               18, # This sets the width of the column to 18 characters
                                               string_format # This applies the 'string_format' to the column
                                               )
# writer.sheets['Recommended Trades'].write('A1', 'Ticker', string_format)
# writer.sheets['Recommended Trades'].write('B1', 'Price', string_format)
# writer.sheets['Recommended Trades'].write('C1', 'Market Capitalization', string_format)
# writer.sheets['Recommended Trades'].write('D1', 'Number Of Shares to Buy', string_format)
# writer.sheets['Recommended Trades'].set_column('A:A', 20, string_format)
# writer.sheets['Recommended Trades'].set_column('B:B', 20, dollar_format)
# writer.sheets['Recommended Trades'].set_column('C:C', 20, dollar_format)
# writer.sheets['Recommended Trades'].set_column('D:D', 20, integer_format)

Imagine we want to create a neat and organized spreadsheet to present our recommended trades in the stock market, as if we’re sorting through a treasure map. We begin with lines that set up the headers for our treasure map, which detail the most crucial information. The first four lines act like our title signs, labeling each column clearly. We write “Ticker” in cell A1, “Price” in B1, “Market Capitalization” in C1, and “Number Of Shares to Buy” in D1. Each of these labels helps us understand what each part of our treasure map represents.

Next, we focus on the columns themselves. We want them to be easy to read, so we set their sizes just right. Imagine resizing the columns like adjusting the width of a bookshelf to fit all our books comfortably. The set_column function comes in here, where we specify that column A should have a width of 20 to ensure our ticker symbols have enough space, just like giving plenty of room for lengthy book titles. We do the same for columns B and C, where we also apply a currency format, because prices and market capitalizations are much more readable when presented as dollar amounts. Here, the “dollar_format” acts like a shiny sticker on our prices, clearly indicating their value.

Finally, column D holds the number of shares to buy, formatted as an integer, because this is a straightforward count, much like counting the number of gemstones in our treasure chest. By visually organizing our data, we’re making our analysis easier, paving the way for the machine learning and deep learning processes to follow as we decide on trading strategies. In the grand scheme of our project, this is about building a foundation that effectively communicates insights, enhancing our algorithmic trading decisions.

This code functions well, but it goes against the software principle of “Don’t Repeat Yourself.”

We can make it better by using two loops instead.


column_formats = { 
                    'A': ['Ticker', string_format],
                    'B': ['Price', dollar_format],
                    'C': ['Market Capitalization', dollar_format],
                    'D': ['Number of Shares to Buy', integer_format]
                    }

for column in column_formats.keys():
    writer.sheets['Recommended Trades'].set_column(f'{column}:{column}', 20, column_formats[column][1])
    writer.sheets['Recommended Trades'].write(f'{column}1', column_formats[column][0], string_format)

Imagine we’re crafting a beautifully organized spreadsheet for trading recommendations, much like laying out a well-arranged table for a feast. Here, we start by defining our ingredients, which in programming we call variables. The collection of column formats serves as our recipe card. Each entry in this collection maps a column letter, like ‘A’ or ‘B’, to two essential details: the title of the column, such as ‘Ticker’ or ‘Price’, and the format it should have, like how we’d choose a fancy plate for serving a dish.

Next, we embark on a journey, represented by a loop, where we will visit each column listed in our recipe card. Think of this as going down the line at a buffet, where we fill each station just right. For every column in our collections like ‘A’ through ‘D’, we tell our program to adjust the width of that column to 20 units — like making sure there’s enough space for a heaping serving!

As we adjust each column’s width, we also drop in the header in the first row, labeling it with the appropriate title from our recipe card. Each header is formatted correctly, ensuring that ‘Ticker’ is deliciously presented, ‘Price’ shines, and so forth.

By the time we’ve looped through all the columns, we’ve effectively organized our trading recommendations, making them easy to read and attractive — perfect for anyone who will glance at them. This sets a solid foundation for managing our algorithmic trading insights using machine learning and deep learning techniques, as clean data presentation is key to making informed decisions!

Saving Our Excel Output

It’s simple to save our Excel file:

writer.save()  # note: in pandas 1.5+ writer.save() is deprecated; use writer.close() instead

Imagine we have been hard at work on a thrilling project where our goal is to develop an algorithm that can trade stocks intelligently, much like how a savvy trader might make decisions based on patterns and trends. As we dive into the final steps of our workflow, we come across a line of code that’s like sealing the deal after a rigorous negotiation: writer.save().

Picture a writer meticulously documenting every detail of our trading strategy and the data that supports it. Each time we run our complex model, new insights and data points are generated, similar to taking extensive notes during a brainstorming session. But to ensure these important insights don’t just float away, we need to save them, preserving our work — this is precisely what the writer.save() command does.

By invoking this command, we are instructing our program to take all the data, like strategies and results from our algorithm, and store it permanently in a way that we can use later. Imagine it as putting completed pages into a file folder, ensuring we have a store of knowledge ready for when we need it most.

So, after orchestrating all the thinking, training, and testing, writer.save() culminates this process by ensuring everything we’ve built is securely kept, ready to influence or improve our trading decisions in the future. This simple yet powerful line is a reminder of the importance of preserving our work in our journey of creating intelligent trading algorithms, allowing us to reflect and iterate upon our trading strategies, much like a chef refining a delicious recipe over time.

Quantitative Momentum Strategy

“Momentum investing” refers to putting your money into stocks that have experienced the biggest price increases.

In this project, we’ll create an investment strategy that picks the 50 stocks with the strongest price momentum. After that, we will determine the recommended trades for an equal-weight portfolio consisting of these 50 stocks.

To start, we need to import the open-source software libraries that we will be using throughout this tutorial.

import numpy as np #The NumPy numerical computing library
import pandas as pd #The Pandas data science library
import requests #The requests library for HTTP requests in Python
import xlsxwriter #The XlsxWriter library for writing Excel files
import math #The Python math module
from scipy import stats #The SciPy stats module

Let’s imagine we’re embarking on an exciting journey into the world of algorithmic trading using machine learning and deep learning. To set the stage, we first need to gather the right tools for our adventure. We begin by inviting some special friends to help us along the way.

First, we bring in NumPy, represented by import numpy as np. Think of NumPy as our powerful calculator, enabling us to perform complex mathematical operations quickly and efficiently, just like how a sophisticated gadget can simplify a challenging task. Alongside it, we welcome Pandas with import pandas as pd. Imagine Pandas as a tidy librarian, keeping our data organized in neat tables and making it easy to analyze — helpful when we dive into the vast sea of stock data.

Next, we bring in the requests library with import requests. This is like a messenger that goes out to collect vital information from the internet, such as current stock prices or market news, ensuring we’re always up to date. Along with it, we have import xlsxwriter, our assistant that helps us document our findings and results in well-organized spreadsheet files. You can think of it as making a scrapbook of our trading journey, capturing all the important moments.

To perform calculations, we also call upon the math module with import math. Picture it as our trusty mathematical toolbox, providing essential functions like finding square roots or computing logarithms. Finally, we invite from scipy import stats, a statistician friend who will guide us through the intricacies of data analysis — helping us understand trends and patterns in market behavior.

With this ensemble of libraries, we are now equipped to navigate the exciting world of algorithmic trading. As we progress, we’ll use these tools to build models that empower us to make informed decisions in the stock market. So let’s roll up our sleeves and dive into the next steps of our trading adventure!

Bringing in Our Stock List

Just like before, we need to bring in our stock list and our API token before moving forward. Please ensure that the .csv file is still in your working directory and import it using the command below:

stocks = pd.read_csv('sp_500_stocks.csv')
from secrets import IEX_CLOUD_API_TOKEN  # imported from a local secrets.py in the working directory

Imagine we’re embarking on an exciting treasure hunt in the world of finance, where our goal is to gather valuable insights from stock market data. To kick off our adventure, we first need to gather our treasure map, which in this case is a file containing information about various stocks. The first line of our program plays this crucial part by using a library called pandas, often abbreviated as ‘pd’. When we read in the contents of ‘sp_500_stocks.csv’, we are essentially opening a book filled with vital stock information — the S&P 500 stocks — which lets us know what treasures we might find.

Now, as we dive deeper into our quest, we come across something intriguing in the next line: we call upon a magical key, known as the IEX_CLOUD_API_TOKEN. Just like a key unlocks a hidden chest, this token lets us access real-time market data from the IEX Cloud, an external service that feeds us up-to-date stock prices and other valuable information. It’s important to keep this token secure, which is why it’s stored in a separate secret location, ensuring only we can access it for our mission.

Everything we’ve set up so far is preparing us to delve into the realm of algorithmic trading using machine learning and deep learning. With our stock data in hand and our key ready, we can now start crafting intelligent strategies that will guide our trading decisions. This wonderful knowledge we’re building will enable us to not just read the market, but to predict and react to it effectively, turning us into savvy traders on this journey!

Making Our First API Call

It’s time to create the first version of our momentum screener!

We need to obtain one-year price returns for every stock in our selection. Here’s how we can do that.

symbol = 'AAPL'
api_url = f'https://sandbox.iexapis.com/stable/stock/{symbol}/stats?token={IEX_CLOUD_API_TOKEN}'
data = requests.get(api_url).json()
data

Imagine we’re embarking on a journey to unlock insights into the world of stocks, beginning with Apple, which we refer to by its symbol ‘AAPL’. This symbol is like a secret code that helps us identify the company that interests us. Here, our goal is to fetch information about Apple’s stock performance from a trusted source online.

We’re now crafting a specific address, or URL, that leads us to this treasure trove of data. By combining a base URL with our chosen symbol and a secure token, we create a unique path to Apple’s statistics. Think of this as writing down the address of a library where all the books about Apple are kept, but only we have the special key (the API token) that unlocks those doors.

Next, we make a request to this URL, akin to sending a messenger to our library to retrieve those precious books. In the world of programming, the ‘requests’ library acts like our reliable messenger, diligently going to the address we crafted and bringing back the invaluable information we seek. This information comes back as a jumble of data that looks like a slice of a larger puzzle, and by using the .json() method, we tidy it up into a format that we can easily understand and work with — imagine cleaning up the papers we receive so we can read and analyze them.

Finally, displaying the data is as if we’re laying out all the insights from those books right in front of us, ready for examination. With this foundational step, we’ve taken one crucial stride in our larger quest to harness the power of machine learning and deep learning in algo trading. There’s much more ahead, and each piece of code will continue to build toward our ultimate objective of making smart trading decisions.

Understanding Our API Call

This API call contains all the information we require. We can access it using the familiar square-bracket notation from the first project of this course. Here’s an example.

data['year1ChangePercent']

Imagine we are trying to understand how a stock’s price has changed over time. In our journey through stock trading, we’re going to analyze data that tells us about price movements. The line before us, data[‘year1ChangePercent’], acts like a window through which we can glimpse the percentage change in stock prices over a year.

Picture a filing card packed with facts about a single stock — in this case, Apple. The data we’re referencing is that card: a dictionary of key-value pairs describing one company’s statistics.

Now, by using the square brackets with the string ‘year1ChangePercent’, we are essentially saying, “Show me the entry that tells me how much this stock’s price has changed over the past year.” It’s akin to picking up a specific recipe card from a recipe box; we want the change over one year, which gives us insight into how well the stock has performed.

By extracting the year one change percentage, we prepare ourselves to make informed decisions about our trading strategies. In algorithmic trading, understanding these changes deeply informs our machine learning models. These models learn patterns from historical data, which helps us predict future movements and make savvy trading choices. So, as we explore this line, we take a crucial step toward harnessing the power of data in the fascinating world of trading, using machine learning and deep learning to guide us.
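Since sandbox API responses can occasionally omit a field, one defensive habit is to read keys with `dict.get` and a fallback value instead of square brackets. The response dictionary below is made up for illustration:

```python
# A made-up response standing in for the parsed JSON from the stats endpoint.
data = {"year1ChangePercent": 0.3421, "companyName": "Apple Inc."}

# Square brackets raise KeyError when a field is missing;
# .get() lets us supply a fallback instead.
year1_return = data.get("year1ChangePercent", 0.0)
missing_field = data.get("month1ChangePercent", 0.0)

print(year1_return)   # 0.3421
print(missing_field)  # 0.0
```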

Making Batch API Calls and Creating Our DataFrame

Just like in our first project, it’s time to make several batch API calls and gather the information we need for our DataFrame.

Let’s begin by running the following code cell. It includes some code we previously created that we can use again for this project. Specifically, it features a function called chunks that will help us split our list of securities into groups of 100.

# Function sourced from 
# https://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks
def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]   
        
symbol_groups = list(chunks(stocks['Ticker'], 100))
symbol_strings = []
for i in range(0, len(symbol_groups)):
    symbol_strings.append(','.join(symbol_groups[i]))
#     print(symbol_strings[i])

my_columns = ['Ticker', 'Price', 'One-Year Price Return', 'Number of Shares to Buy']

In our exploration of algorithmic trading using machine learning, we have an exciting task ahead: organizing stock ticker symbols into manageable groups for analysis. Imagine you have a long list of grocery items, but instead of tackling them all at once, you decide to make it easier by grouping them into smaller batches. That’s exactly what our program is doing.

We start by defining a function called “chunks.” Think of it as a recipe that specifies how to slice a big cake into smaller pieces. The function takes a list, which in our context is a collection of stock ticker symbols, and a number, n, which tells it how large each piece should be. The function then carefully retrieves these pieces one by one using a simple loop, allowing us to create manageable chunks of our original list.

Next, we use this function to break our stock tickers into groups of 100. In our story, these groups are like boxes of groceries, making it easier for us to carry them. We store these grouped tickers in symbol_groups. But we need to transform this into a format that’s easier to handle, so we create another list called symbol_strings. Here, we iterate over our groups and, like a meticulous chef combining ingredients, we join the tickers in each group into a single string, separated by commas.

Finally, we define a list called my_columns, which outlines the important pieces of information we want to track, including the ticker symbol and the stock price. This list will serve as the framework for our deeper analysis, guiding us as we continue to dive into our algorithmic trading project and harness the power of data to make informed trading decisions. By breaking our work into these digestible chunks, we maintain clarity and efficiency in our trading strategy development.
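To see `chunks` and the comma-joining step in action, here is a tiny self-contained run with an invented five-ticker list and groups of 2:

```python
def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

tickers = ["AAPL", "MSFT", "GOOG", "AMZN", "META"]

# Groups of 2: the final chunk is smaller when the list doesn't divide evenly.
groups = list(chunks(tickers, 2))
print(groups)  # [['AAPL', 'MSFT'], ['GOOG', 'AMZN'], ['META']]

# Joining each group with commas mirrors how symbol_strings is built.
joined = [",".join(g) for g in groups]
print(joined)  # ['AAPL,MSFT', 'GOOG,AMZN', 'META']
```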

Next, let’s create a blank DataFrame and then add our data to it one piece at a time.

final_dataframe = pd.DataFrame(columns = my_columns)

for symbol_string in symbol_strings:
#     print(symbol_strings)
    batch_api_call_url = f'https://sandbox.iexapis.com/stable/stock/market/batch/?types=stats,quote&symbols={symbol_string}&token={IEX_CLOUD_API_TOKEN}'
    data = requests.get(batch_api_call_url).json()
    for symbol in symbol_string.split(','):
        final_dataframe = final_dataframe.append(
                                        pd.Series([symbol, 
                                                   data[symbol]['quote']['latestPrice'],
                                                   data[symbol]['stats']['year1ChangePercent'],
                                                   'N/A'
                                                   ], 
                                                  index = my_columns), 
                                        ignore_index = True)
        
    
final_dataframe

Imagine we want to gather some financial data from the stock market so we can later analyze it for trading decisions. To kick off our program, we start by creating an empty table or, as we call it in programming, a DataFrame. Think of this DataFrame as a blank canvas where we will paint the information we collect. We label the columns with specific names held in my_columns, which define what kind of data we will be filling in, like stock prices and performance metrics.

Next, we embark on a journey, iterating through a collection of symbols — these refer to different stocks, much like exploring various ingredients for a recipe. For each symbol, we construct a recipe for requesting data from an API, which is like a restaurant’s kitchen we’re ordering from. This URL includes all the necessary ingredients — our symbols and an access token, which acts like a VIP pass to get the ingredients we need.

Then, we take a special step: we call the API to retrieve data. Think of requests.get as sending our order into the kitchen. Once our culinary staff (the API) has prepared the data, it returns a response in a format we can easily work with, called JSON — a structured yet flexible format like a neatly organized pantry.

Now, we dive deeper into the response, extracting relevant information for each symbol. Here, we are crafting specific rows for our DataFrame, where each row contains the stock symbol, its latest price, and its annual change percentage, alongside a placeholder marked as ‘N/A’. It’s as if we’re measuring out the ingredients for each dish before serving it up.

Finally, we have our complete DataFrame filled with financial ingredients, ripe and ready for analysis. This organized data lays the foundation for our algorithms to learn from in our larger goal of developing smart trading strategies!
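One practical caveat: `DataFrame.append` was deprecated in pandas 1.4 and removed in 2.0. On newer pandas versions, the same row-at-a-time pattern can be expressed by collecting plain lists and building the DataFrame once at the end. A minimal sketch with invented prices and returns:

```python
import pandas as pd

my_columns = ["Ticker", "Price", "One-Year Price Return", "Number of Shares to Buy"]

# Collect each row as a plain list, then build the DataFrame in one shot.
rows = []
for symbol, price, year1 in [("AAPL", 150.0, 0.34), ("MSFT", 300.0, 0.28)]:
    rows.append([symbol, price, year1, "N/A"])

final_dataframe = pd.DataFrame(rows, columns=my_columns)
print(final_dataframe.shape)  # (2, 4)
```

Building the table once is also faster, since each `append` call copied the entire DataFrame.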

Removing Low-Momentum Stocks

Our investment strategy aims to find the 50 stocks in the S&P 500 with the highest momentum.

To achieve this, we will now remove any stocks from our DataFrame that don’t meet this momentum standard. We’ll sort the DataFrame based on the one-year price return of the stocks and eliminate those that are not in the top 50.

final_dataframe.sort_values('One-Year Price Return', ascending = False, inplace = True)
final_dataframe = final_dataframe[:50]
final_dataframe.reset_index(drop = True, inplace = True)
final_dataframe

Imagine you’re sorting through a treasure chest filled with various coins, and you only want the shiny ones that have gained the most value over the past year. Our journey begins with a collection of data we call final_dataframe, which holds information about different assets — think of them as the various coins you’ve gathered.

The first line of our program carefully sorts this treasure chest based on a specific criterion, the ‘One-Year Price Return.’ By arranging these assets in descending order, we make sure that the ones with the highest returns are at the top, just like prioritizing the shiniest coins first. The ascending=False tells the program to arrange them from the highest to the lowest, while inplace=True ensures we’re making these changes directly in our treasure chest without needing to create a new one.

Next, we take a closer look at just the top 50 assets that have shone the brightest. This is where our second line comes into play, slicing final_dataframe to keep only these top contenders, similar to selecting the finest coins you want to keep in your collection.

Lastly, we tidy things up by resetting the index of our data. Think of this as re-labeling your coins in order to reflect that we now have only the best ones in our collection. By using drop=True, we seamlessly discard the old labels to avoid any confusion, letting our new sorted dataframe shine.

By conducting these steps, we’re honing in on the most promising assets for our algorithmic trading project. Understanding which ones have performed best will help us make informed decisions as we venture deeper into machine learning and deep learning. Each of these operations is essential in curating the finest selection of data, which is the foundation of our trading strategy.
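Incidentally, pandas can perform the sort-and-slice in a single step with `nlargest`; here is a sketch with invented tickers and returns:

```python
import pandas as pd

df = pd.DataFrame({
    "Ticker": ["AAA", "BBB", "CCC", "DDD"],
    "One-Year Price Return": [0.10, 0.45, 0.30, 0.05],
})

# Keep the 2 rows with the highest one-year return, already sorted descending.
top = df.nlargest(2, "One-Year Price Return").reset_index(drop=True)
print(list(top["Ticker"]))  # ['BBB', 'CCC']
```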

Calculating the Number of Shares to Buy

Just like in the last project, we need to determine how many shares to buy. This time, we’ll put this calculation inside a function, as we’ll use it again later in this Jupyter Notebook.

You already have most of the information you need, so go ahead and complete the next two code cells on your own without waiting for me!

def portfolio_input():
    global portfolio_size
    while True:
        portfolio_size = input("Enter the value of your portfolio:")
        try:
            float(portfolio_size)
            break
        except ValueError:
            print("That's not a number! \nTry again:")

portfolio_input()
print(portfolio_size)

Imagine you’re embarking on an exciting journey into the world of algorithmic trading with machine learning. To get started, we need to gather some important information that will shape our trading decisions. The first step we take is defining a special recipe, or function, called portfolio_input(). Think of this function as a cooking guide where every step helps us prepare our main dish: a successful trading strategy.

Inside our function, we declare a global ingredient called portfolio_size. By using the word ‘global,’ we’re saying this ingredient will be accessible throughout the entire recipe, allowing us to use it later in our project. Next, we prompt our users to enter the value of their portfolio — imagine this as asking a friend how much they’re willing to invest.

Now comes the tricky part. We want to ensure that the value entered is indeed a number, so we use a try-except block for error handling. Think of it as a safety net: if the user accidentally types in something that isn’t a number — like a word — the program catches that mistake gracefully. We throw a message back to the user informing them that they need to enter a valid number, almost like a helpful chef encouraging a cook to double-check their measurements.

Once a proper value is confirmed and our recipe is complete, we call our portfolio_input() function, like turning on the oven to start cooking. Finally, we print our portfolio_size, allowing us to see the value that will help us navigate our trading landscape. This setup is just the beginning, laying down a foundation for our trading algorithms to flourish. As we progress, these values will feed into our machine learning models, driving our decisions in the financial markets.

position_size = float(portfolio_size) / len(final_dataframe.index)
for i in range(0, len(final_dataframe['Ticker'])):
    final_dataframe.loc[i, 'Number of Shares to Buy'] = math.floor(position_size / final_dataframe['Price'][i])
final_dataframe

Imagine we’re managing a garden of investments, carefully choosing how much of each plant to nurture based on the resources we have. First, we determine how much money we can allocate to each individual investment, represented by the variable called position_size. Here, we take our total portfolio_size — think of it as a set amount of seeds — and divide it by the total number of different stocks or “plants” we want to invest in, which we get from final_dataframe.index. This division helps us ensure that we don’t overwater any single plant.

Next, we step into a loop, a bit like walking through each row of our garden, examining every stock we have. For each i, which represents each stock, we calculate how many shares we can afford to buy. The math.floor function — imagine it as a gardener deciding to round down how many seeds can realistically be planted — you can’t plant half a seed — takes our calculated position_size, divides it by the price of the specific stock, and gives us a whole number of shares.

As we repeat this step for all stocks, we fill up our data frame, which keeps track of these calculations in a column labeled ‘Number of Shares to Buy’. By the end of this process, we’ve efficiently planned how to allocate our investment across various stocks, ensuring that we’re ready to plant our seeds wisely in the market. Ultimately, this careful preparation will play a crucial role in our larger project of using machine learning and deep learning to make informed trading decisions.
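As a quick sanity check on this arithmetic, here is the same calculation run with an invented $100,000 portfolio, 50 positions, and an assumed share price:

```python
import math

portfolio_size = 100_000.0
number_of_positions = 50

# Equal-weight: every position receives the same slice of the portfolio.
position_size = portfolio_size / number_of_positions
print(position_size)  # 2000.0

# At an assumed share price of $150.25, flooring avoids fractional shares.
price = 150.25
shares_to_buy = math.floor(position_size / price)
print(shares_to_buy)  # 13
```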

Creating a Better (and More Realistic) Momentum Strategy

In the world of quantitative investing, firms often make a clear distinction between “high quality” and “low quality” momentum stocks:

* High-quality momentum stocks demonstrate consistent and gradual outperformance over long periods.

* Low-quality momentum stocks may experience long stretches without any momentum, only to suddenly rise.

High-quality momentum stocks are favored because low-quality momentum is often driven by temporary news that is unlikely to be repeated, such as an FDA approval for a biotech company.

To identify high-quality momentum, we’ll develop a strategy that picks stocks from the top percentiles of:

* 1-month price returns

* 3-month price returns

* 6-month price returns

* 1-year price returns
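The percentile machinery behind this idea can be sketched with `scipy.stats.percentileofscore`, which we imported earlier. The return figures below are invented, and averaging a stock's percentile across the four time windows is one natural way to form a single high-quality-momentum score:

```python
from scipy import stats

# Invented one-year returns for a handful of stocks.
one_year_returns = [0.10, 0.25, 0.40, 0.05, 0.55]

# percentileofscore reports what share of the sample a value ranks at.
pct = stats.percentileofscore(one_year_returns, 0.40)
print(pct)  # 80.0

# One natural combined score: average the stock's percentile across windows.
percentiles = [80.0, 90.0, 70.0, 85.0]  # 1y, 6m, 3m, 1m (invented)
hqm_score = sum(percentiles) / len(percentiles)
print(hqm_score)  # 81.25
```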

Let’s begin by building our DataFrame. You’ll see that I frequently use the abbreviation hqm, which stands for high-quality momentum.

hqm_columns = [
                'Ticker', 
                'Price', 
                'Number of Shares to Buy', 
                'One-Year Price Return', 
                'One-Year Return Percentile',
                'Six-Month Price Return',
                'Six-Month Return Percentile',
                'Three-Month Price Return',
                'Three-Month Return Percentile',
                'One-Month Price Return',
                'One-Month Return Percentile',
                'HQM Score'
                ]

hqm_dataframe = pd.DataFrame(columns = hqm_columns)

for symbol_string in symbol_strings:
#     print(symbol_strings)
    batch_api_call_url = f'https://sandbox.iexapis.com/stable/stock/market/batch/?types=stats,quote&symbols={symbol_string}&token={IEX_CLOUD_API_TOKEN}'
    data = requests.get(batch_api_call_url).json()
    for symbol in symbol_string.split(','):
        hqm_dataframe = hqm_dataframe.append(
                                        pd.Series([symbol, 
                                                   data[symbol]['quote']['latestPrice'],
                                                   'N/A',
                                                   data[symbol]['stats']['year1ChangePercent'],
                                                   'N/A',
                                                   data[symbol]['stats']['month6ChangePercent'],
                                                   'N/A',
                                                   data[symbol]['stats']['month3ChangePercent'],
                                                   'N/A',
                                                   data[symbol]['stats']['month1ChangePercent'],
                                                   'N/A',
                                                   'N/A'
                                                   ], 
                                                  index = hqm_columns), 
                                        ignore_index = True)
        
hqm_dataframe.columns

Imagine stepping into a world where we want to make smart investments, much like a chef crafting a delicious meal by choosing the right ingredients and layering flavors. In this program, our goal is to gather valuable stock data, preparing it neatly to help us choose the best opportunities in the market.

We begin by creating a list of important information we need about each stock, akin to writing down the ingredients and their quantities for our recipe. This list is stored in hqm_columns, with elements like ‘Ticker’ for the stock’s symbol, and various price return metrics that tell us how well the stock has performed over time.

Next, we create an empty frame, like a blank canvas to fill in, where all our data will seamlessly fit. This frame is designed to hold our stock metrics, initialized by calling pd.DataFrame with our column names.

Now, we move into the heart of our program where the magic happens. We loop through an array of stock symbols, each representing a potential ingredient in our financial feast. Inside this loop, we construct a URL to gather data from an API — think of this as reaching out to a supplier to get the freshest produce.

Once we have the data, we break it down for each stock symbol, pulling out specific pieces, just like chopping vegetables into precise sizes for cooking. We append each stock’s details into our data frame, ensuring that every key piece of information finds its proper place — while temporarily using ‘N/A’ for values yet to be calculated.

Finally, the last line displays our data frame's columns, letting us confirm that everything is organized and labeled correctly. This process lays a meaningful foundation for our larger project in algorithmic trading using machine learning, where this carefully curated data will help us make informed investment decisions.
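
The row-collection pattern can be sketched in miniature with dummy tickers and prices. Note that newer versions of pandas (2.0 and later) removed `DataFrame.append`, so gathering one `pd.Series` per stock and building the frame in a single step is the more durable pattern:

```python
import pandas as pd

# A minimal sketch (dummy data) of the row-collection pattern:
# one pd.Series per stock, then a single DataFrame construction,
# since DataFrame.append was removed in pandas 2.0.
columns = ['Ticker', 'Price']
rows = [pd.Series(['AAA', 10.0], index=columns),
        pd.Series(['BBB', 20.0], index=columns)]

frame = pd.DataFrame(rows)  # each Series becomes one row
print(frame['Ticker'].tolist())  # ['AAA', 'BBB']
```

The same construction works for the twelve-column HQM frame; only the per-row contents differ.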

Calculating Momentum Percentiles

Next, we will calculate momentum percentile scores for each stock in our universe. Specifically, we’ll focus on determining percentile scores for the following metrics for every stock:

* One-Year Price Return

* Six-Month Price Return

* Three-Month Price Return

* One-Month Price Return

Here’s our approach:

time_periods = [
                'One-Year',
                'Six-Month',
                'Three-Month',
                'One-Month'
                ]

for row in hqm_dataframe.index:
    for time_period in time_periods:
        hqm_dataframe.loc[row, f'{time_period} Return Percentile'] = stats.percentileofscore(
            hqm_dataframe[f'{time_period} Price Return'],
            hqm_dataframe.loc[row, f'{time_period} Price Return']
        ) / 100

# Print each percentile score to make sure it was calculated properly
for time_period in time_periods:
    print(hqm_dataframe[f'{time_period} Return Percentile'])

# Print the entire DataFrame
hqm_dataframe

Imagine we are working together to analyze historical stock market data, aiming to understand how individual stocks performed over different time periods. Our journey begins by defining a list that specifies these time frames, ranging from a single year down to just one month. Think of this list as our guidebook, showing us the paths we’ll take in our exploration.

Next, we enter a loop that traverses each row of our data, akin to inspecting each stock in our collection. For every stock, we then enter another loop that goes through each time period in our guidebook. Here, we’re calculating the relative performance of each stock, determining how well it performed compared to others during the specified time frames.

Using a statistical function, we find the percentile rank of each stock’s return for each time period. This is like figuring out where a contestant places in a race compared to everyone else; we divide the score by 100 to convert it into a percentage, ensuring we can easily interpret what this means. As we fill in our data frame with these calculated values, we’re essentially building a clearer picture of how each stock stacks up against its peers.

Once we finish calculating all the percentiles, we want to double-check our work. So, we print out each percentile score for validation — like verifying that our math on a test is correct. Finally, when we print the entire DataFrame, we get a comprehensive view of all our hard work, showing us the results of our analysis in one elegant snapshot.

By the end of this journey, we’ve constructed a powerful tool that helps us assess stock performance, setting the stage for our deeper exploration into trading strategies using machine learning and deep learning techniques.
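
To make the percentile mechanics concrete, here is a toy sketch (the return figures are invented) of what `stats.percentileofscore` yields once we divide by 100:

```python
from scipy import stats

# Toy example (made-up returns) showing what percentileofscore produces.
# A return that beats most of its peers lands near 1.0 after dividing by 100.
returns = [0.05, 0.10, 0.20, 0.40, 0.80]

percentile = stats.percentileofscore(returns, 0.40) / 100
print(percentile)  # 0.8 -> the 0.40 return ranks 4th out of 5
```

In the real loop, the list of peer returns is simply the relevant column of the DataFrame, and the score being ranked is that stock's own return.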

Calculating the HQM Score

Now, let’s calculate our HQM Score, which represents the high-quality momentum score we’ll use to select stocks for this investing strategy.

The HQM Score will be the average of the four momentum percentile scores we determined in the previous section.

To find the average, we will utilize the mean function from Python’s built-in statistics module.

from statistics import mean

for row in hqm_dataframe.index:
    momentum_percentiles = []
    for time_period in time_periods:
        momentum_percentiles.append(hqm_dataframe.loc[row, f'{time_period} Return Percentile'])
    hqm_dataframe.loc[row, 'HQM Score'] = mean(momentum_percentiles)

Imagine we have a special recipe book filled with various ingredients — our data — focused on high-quality momentum trading. In this section of the program, we are going to calculate something we call the “HQM Score,” which helps us determine the strength of a stock’s momentum over different time periods.

First, we gather our ingredients. We start by pulling in the average function from a library called statistics, which will help us average out numbers later. With that ready, we set out on our journey with a loop that will traverse each row in our dataframe. Think of this as flipping through each page of our recipe book, where each page contains information about a different stock.

For every stock, we introduce an empty list called momentum_percentiles to gather the information we need. Now, just like measuring ingredients for a recipe, we have another loop that goes through a predefined set of time periods — like 1 month, 3 months, or 6 months. As we measure, we fetch and append each stock’s momentum data from the dataframe, specifically the return percentiles for those time frames. This is akin to collecting various spices that will bring depth to our final dish.

Once we have gathered all these values, we whisk them together using the mean function, which calculates the average of our collected percentiles. This average becomes our HQM Score — a representation of the overall momentum strength for that particular stock. We then place this score back into the dataframe, ensuring that each stock has its unique position — just like labeling jars in our kitchen.

In summary, this part of our project helps us prepare an essential ingredient for our trading strategy, basing decisions on calculated momentum, and setting the stage for our machine learning journey ahead.
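
The averaging step can be seen in miniature; the four percentile values below are invented purely for illustration:

```python
from statistics import mean

# Hypothetical momentum percentiles for one stock
# (one-year, six-month, three-month, one-month)
momentum_percentiles = [0.75, 1.0, 0.5, 0.75]

hqm_score = mean(momentum_percentiles)
print(hqm_score)  # 0.75
```

A stock that sits high in all four rankings earns a high HQM Score, while a single strong period diluted by three weak ones does not.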

Choosing the Top 50 Momentum Stocks

Once again, we can find the 50 top momentum stocks in our dataset by organizing the DataFrame based on the HQM Score column and keeping just the leading 50 entries.

hqm_dataframe = hqm_dataframe.sort_values(by = 'HQM Score', ascending = False)
hqm_dataframe = hqm_dataframe[:50]
hqm_dataframe = hqm_dataframe.reset_index(drop = True)

Imagine we’re trying to find the best candidates for a big investment project based on their performance ratings. The program we’re looking at does just that, helping us sift through a sea of data to find the top performers using something called the HQM Score, which you can think of as a well-crafted grade that tells us how promising each candidate is.

In the first line, we dive into our data collection, called hqm_dataframe, which holds all the candidates and their scores. By using the sort_values function, we’re effectively organizing this list as one might tidy up files in a drawer by importance — placing the highest scores right at the top for easy access. The ascending=False part is like saying, “I want my best scores to be front and center,” as it ensures that the candidates with the highest HQM Scores float to the top of our list.

Now that we have our candidates neatly sorted, we want to focus on the top tier — the elite 50. So, in the next line, we trim our dataset to just those top performers by keeping only the first 50 rows. This is similar to pulling out the top 50 dessert recipes from a thick cookbook when you're hosting a fancy dinner; only the best make it to the table.

By the end of this little coding journey, we have not just organized and refined our potential investment choices but have also set a strong foundation for further analysis or decision-making. As we continue to explore more advanced techniques in algorithmic trading through machine learning, these skills of sorting and filtering will be crucial as we aim to make informed and strategic choices in our trading endeavors.
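
The same sort-and-trim idea can be sketched on a tiny, fabricated frame, here keeping the top two rows instead of fifty. Note that the sorted result must be assigned back (or sorted in place) and the index reset so that later positional loops work:

```python
import pandas as pd

# Fabricated scores to illustrate sort, slice, and index reset
df = pd.DataFrame({'Ticker': ['AAA', 'BBB', 'CCC'],
                   'HQM Score': [0.40, 0.90, 0.65]})

df = df.sort_values(by='HQM Score', ascending=False)[:2]
df = df.reset_index(drop=True)  # rows are numbered 0, 1 again
print(df['Ticker'].tolist())  # ['BBB', 'CCC']
```

Without the assignment, `sort_values` returns a sorted copy that is silently discarded and the original order survives.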

Determining How Many Shares to Purchase

We’ll use the portfolio_input function we developed earlier to input the size of our portfolio. After that, we’ll apply a similar approach in a for loop to figure out how many shares to buy for each stock in our investment collection.

portfolio_input()

Imagine we are embarking on an exciting journey into the world of algorithmic trading, where we empower machines to make informed decisions about buying and selling stocks. Our adventure begins with a simple yet powerful function call: portfolio_input(). Think of this as opening a door to a room filled with investment possibilities.

Now, this function serves a vital purpose — it invites users to input their investment preferences and portfolio details. Just like a master chef gathers ingredients before cooking, we need specific information to craft our trading strategy. This could range from the types of stocks a user is interested in to how much money they wish to allocate to each one. By collecting this data, we prepare ourselves to build a trading model tailored to individual needs.

When we call portfolio_input(), we essentially summon a pre-defined recipe card from our programming cookbook. This function, behind the scenes, might have its own instructions for how to gather and validate information. Imagine it gently prompting the user, like a friendly guide at a market, asking questions and ensuring that everything is in order before moving forward. Each response shapes the foundation of our trading strategy, ensuring that what we build is relevant and effective.

As we step away from this function call, we see that we’ve laid a crucial building block for our trading journey. By defining what our users want and need, we are equipping ourselves with the right tools to predict market movements intelligently. This act of gathering inputs is not just a formality; it is the cornerstone of our larger project, helping us to create a machine learning model that can analyze data and make wise trading decisions in the fast-paced stock market.

position_size = float(portfolio_size) / len(hqm_dataframe.index)
for i in range(len(hqm_dataframe.index)):
    hqm_dataframe.loc[i, 'Number of Shares to Buy'] = math.floor(position_size / hqm_dataframe['Price'][i])
hqm_dataframe

Imagine we have a pot of money, which we call our portfolio, and we want to invest it wisely using the data we’ve gathered in a table called hqm_dataframe. First, we determine how much of our portfolio we can allocate to each stock by dividing our total portfolio size by the number of different stocks in our table. This division gives us the position size, a key concept that tells us how much to invest in each individual stock.

Next, we embark on a journey through our data, looking at each stock one by one. We use a loop, which is like repeating a step in a recipe for every stock we have, with the range covering every row in the DataFrame from first to last. Inside this loop, we calculate how many shares we can buy for each stock based on its price. We take our position size and divide it by the price of the stock. The magic here is in using the math.floor() function, which ensures we only count whole shares — just like you wouldn't want to buy half a loaf of bread.

As we go through the loop, we fill in a new column in our dataframe, noting down the number of shares we can buy for each stock. Finally, after the loop has done its work, we have a complete picture of our investment strategy directly in this table.

In the grand scheme of our algorithmic trading project, this step is vital because it lays the foundation for our next actions — executing trades based on informed decisions derived from our machine learning models.
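
To see the arithmetic with concrete (hypothetical) numbers:

```python
import math

# Hypothetical figures: a $100,000 portfolio split equally across 50
# stocks gives $2,000 per position; at a $157.30 share price that
# buys 12 whole shares (the fractional remainder is left uninvested).
portfolio_size = 100_000
position_size = portfolio_size / 50   # 2000.0
shares = math.floor(position_size / 157.30)
print(shares)  # 12
```

Rounding down rather than to the nearest integer guarantees the recommended trades never exceed the portfolio's cash.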

Formatting Our Excel Output

We will use the XlsxWriter library in Python to create beautifully formatted Excel files.

XlsxWriter is a fantastic tool that provides plenty of customization options. However, it can feel a bit overwhelming for beginners. Because of this, this section will be a bit lengthy, as I want to make sure I thoroughly explain how XlsxWriter works.

writer = pd.ExcelWriter('momentum_strategy.xlsx', engine='xlsxwriter')
hqm_dataframe.to_excel(writer, sheet_name='Momentum Strategy', index = False)

Imagine you’re preparing a special recipe that combines all the best ingredients to create a delicious dish. In the world of algorithmic trading, we live in a similar culinary adventure, where we strive to combine data and insights to serve up a successful strategy. Here, we have a little piece of our journey where we want to save our carefully crafted strategy, just like neatly organizing our delicious recipes in a cookbook.

First, we see the line that starts with “writer.” Think of this as opening a new, pristine recipe book — specifically an Excel workbook named ‘momentum_strategy.xlsx.’ The ‘ExcelWriter’ is our tool that helps us write down all the important elements of our strategy, using a special technique called the ‘xlsxwriter’ engine that ensures our pages are beautifully formatted for easy reading later.

Next, we employ our “hqm_dataframe,” which is like our freshly assembled ingredients list for the momentum strategy. This dataframe holds all the data we’ve collected and analyzed, much like how a chef prepares their ingredients before cooking. The command “to_excel(writer, sheet_name=’Momentum Strategy’, index=False)” acts as our cooking instruction; it tells our program to take the contents of this dataframe and elegantly arrange them on a specific page labeled ‘Momentum Strategy.’ The ‘index=False’ part means we’re making the presentation cleaner by not including a messy extra list of numbers next to our ingredients.

In this small but significant step, we capture the essence of our strategy and organize it beautifully, making it easier to review and refine as we continue to enhance our algorithmic trading skills. As we move forward in our project, think of how this neatly coded recipe will serve us well as we experiment with our trading journey!

Setting Up the Formats for Our .xlsx File

In our first project, we learned that formats consist of colors, fonts, and symbols like % and $. For our Excel document, we will need four main formats:

* String format for tickers

* $XX.XX format for stock prices

* 0.0% format for the price returns and return percentiles

* Integer format for the number of shares to purchase

Since we created these formats in the previous section, I’ve listed them below for your convenience. Please run this code cell before moving forward.

background_color = '#0a0a23'
font_color = '#ffffff'

string_template = writer.book.add_format(
        {
            'font_color': font_color,
            'bg_color': background_color,
            'border': 1
        }
    )

dollar_template = writer.book.add_format(
        {
            'num_format':'$0.00',
            'font_color': font_color,
            'bg_color': background_color,
            'border': 1
        }
    )

integer_template = writer.book.add_format(
        {
            'num_format':'0',
            'font_color': font_color,
            'bg_color': background_color,
            'border': 1
        }
    )

percent_template = writer.book.add_format(
        {
            'num_format':'0.0%',
            'font_color': font_color,
            'bg_color': background_color,
            'border': 1
        }
    )

Imagine we’re preparing a beautifully formatted report that will help us visualize our algorithmic trading results. To do this, we start by defining the colors and formats for our presentation, making it visually appealing and easy to understand.

First, we choose a background color, a deep navy blue, represented by the hex code #0a0a23, and a bright font color in white, #ffffff. These colors are like the canvas and paint of our artistry; they set the mood for our work.

Now, we create a series of templates, which are like reusable recipe cards that tell us how to display different types of data consistently. The first template, string_template, is used for simple text. It specifies our chosen colors and adds a border around any cell that uses this format, just like framing a picture to draw attention to it.

Next, we have dollar_template, which is specially designed for displaying money values. It adds a dollar sign and formats the number to always have two decimal places. This is particularly important in the finance world, where accuracy matters, much like measuring ingredients precisely in a recipe.

Then, we have the integer_template for whole numbers, formatted without decimals, and the percent_template, which formats numbers as percentages with one decimal point. Think of these as different ways to present our data that cater to the specific stories we want to tell — be it showing profits, counts, or rates of return.

Together, these templates will ensure that our algorithmic trading results are not just functional but also beautiful and easy to digest, helping us communicate our findings effectively to others. Each template enriches our narrative, transforming raw data into a compelling story.

column_formats = {
                    'A': ['Ticker', string_template],
                    'B': ['Price', dollar_template],
                    'C': ['Number of Shares to Buy', integer_template],
                    'D': ['One-Year Price Return', percent_template],
                    'E': ['One-Year Return Percentile', percent_template],
                    'F': ['Six-Month Price Return', percent_template],
                    'G': ['Six-Month Return Percentile', percent_template],
                    'H': ['Three-Month Price Return', percent_template],
                    'I': ['Three-Month Return Percentile', percent_template],
                    'J': ['One-Month Price Return', percent_template],
                    # the HQM Score is an average of percentiles (a value
                    # between 0 and 1), so a percent format suits it best
                    'K': ['One-Month Return Percentile', percent_template],
                    'L': ['HQM Score', percent_template]
                    }

for column in column_formats.keys():
    writer.sheets['Momentum Strategy'].set_column(f'{column}:{column}', 20, column_formats[column][1])
    writer.sheets['Momentum Strategy'].write(f'{column}1', column_formats[column][0], string_template)

Imagine we’re setting up a powerful tool to help us with our algorithmic trading, where we need to organize and visualize important financial data clearly. Our first step is to create a structured layout using a set of defined formats. We start with a dictionary named column_formats, which you can think of as a labeled box of templates, each representing a specific type of data we will track. For instance, the ‘A’ key is like a label for our ‘Ticker’, which refers to stock symbols, while ‘B’ refers to ‘Price’, formatted to display dollar values.

Now, we have a series of identifiers, or keys, each linked to information about various financial metrics, including returns over different periods. These keys will direct how we present our data when we put it into a spreadsheet. Each line in this dictionary highlights the importance of clarity in our project and helps maintain consistency across our columns.

Next, we move into a loop, which is like following a recipe repeatedly for each ingredient we have — we’re going to format each column as defined in our box earlier. The for loop iterates through each column key, allowing us to apply specific formatting for the column width and data type. The set_column function adjusts the width of each column to make sure our data fits comfortably without overwhelming the viewer, while the write function fills in the header of each column with appropriate titles, all made easy to read.

So, with each iteration, we systematically shape our spreadsheet to become a clear and useful interface for tracking and analyzing trading performance. This meticulous organization sets the stage for diving into deeper analytical processes down the road, enhancing our trading strategies with machine learning and deep learning insights.

Saving Our Excel Output

Just like before, saving our Excel output is a breeze!

writer.close()  # in pandas versions before 2.0 this was writer.save()

Here, we see a simple yet significant command: writer.close(). (In older versions of pandas this method was called writer.save(); closing the writer is what actually writes the workbook to disk.) Think of this as the moment when an author saves their manuscript after a long night of writing, ensuring that all their hard work isn't lost. This line of code tells the program to take all the beautiful patterns and insights we've uncovered, packaging them up neatly so they can be revisited later. The writer acts like a diligent librarian, ensuring that every detail is preserved in a way that we can easily access it down the line.

Without this step, all our efforts to train our model using machine learning and deep learning could vanish into thin air, much like forgetting to save a document could mean starting all over again. This becomes especially critical in algo trading, where decisions are made at lightning speed and depend heavily on the data we have at our disposal.

So, as we prepare to conclude this chapter in our project, remember that saving our work isn’t just about securing data; it’s about creating a foundation for intelligent trading decisions that we can refine and build upon in the future. With each save, we move closer to a system that has the potential to transform how we engage with the markets.


Quantitative Value Strategy

“Value investing” involves choosing stocks that are the most affordable based on typical measures of business value, such as earnings or assets.

In this project, we will develop an investment strategy that identifies the 50 stocks with the strongest value metrics. Then, we will suggest trades for a portfolio that equally weights these 50 stocks.

Library Imports

To get started, we’ll first import the open-source software libraries that we’ll use throughout this tutorial.

import numpy as np #The Numpy numerical computing library
import pandas as pd #The Pandas data science library
import requests #The requests library for HTTP requests in Python
import xlsxwriter #The XlsxWriter library for writing Excel files
import math #The Python math module
from scipy import stats #The SciPy stats module

Imagine you’re about to embark on a journey into the fascinating world of algorithmic trading, where data guides decisions in the fast-paced stock market. To equip ourselves for this adventure, we start by gathering some essential tools, just like gathering supplies for a camping trip.

First, we invite NumPy into our program, the numerical computing library that’s like our trusty calculator, turning complex calculations into simple operations. Next, we bring in Pandas, the go-to data science library, which helps us organize and manipulate our data effortlessly, much like arranging ingredients before cooking a meal.

But what’s a journey without a map? Enter the Requests library, allowing us to fetch data from the web as if we’re sending out scouts to gather intel. We often want to save this data, and that’s where XlsxWriter comes into play. It’s like a digital notebook where we can neatly jot down our findings, keeping everything organized for later review.

Alongside these tools, we have the math module, which helps us perform basic mathematical operations — think of it as the fundamental skills that help us navigate through complex problems. Lastly, the SciPy stats module is our statistical toolkit, helping us understand the chances of different outcomes, very much like calculating odds in a game of chance.

As we piece together these libraries, we form a powerful foundation to build our algorithmic trading system. Each part is vital, like ingredients in a recipe designed to prepare us for building predictive models and making informed trading decisions, setting us on a path to become savvy market participants.

Bringing in Our Stock List and API Token

Just like before, we need to bring in our stock list and API token before moving forward. Please ensure that the .csv file is still in your working directory and use the command below to import it:

stocks = pd.read_csv('sp_500_stocks.csv')
from secrets import IEX_CLOUD_API_TOKEN # our local secrets.py file, not the standard-library secrets module

Imagine we’re embarking on a journey into the world of algorithmic trading using machine learning and deep learning. Our adventure begins by gathering information — the raw materials we need to make informed decisions. The first line we encounter is like opening a treasure chest; it invites us to load data about the S&P 500 stocks from a CSV file. Think of the CSV file as a neatly organized notebook that holds the key details about various companies we’re interested in, such as their prices and market caps.

Next, we see another line where we’re importing a card from a hidden realm — it’s like pulling out a special key that allows us to access a secure vault of financial data. Here, the card we import is called the API token from IEX Cloud, which provides us with real-time stock data. This token is crucial because, in the world of trading, timely information can make all the difference, just like knowing the weather before planning a picnic.

As we piece this together, we realize we’re setting the foundation for a more complex project where we will analyze stock movements and make predictions about future trends. The data we’ve invited into our program will serve as a formidable ally in our quest to uncover patterns and potentially discover profitable trading strategies.

So, as we continue, let’s stay curious about how we’ll use these tools and insights to navigate the exciting and dynamic landscape of financial markets!

Creating Our First API Call

Now, let’s dive into making our first version of the value screener!

We’ll begin by developing a straightforward value screener that ranks securities using just one metric: the price-to-earnings ratio.

symbol = 'AAPL'
api_url = f'https://sandbox.iexapis.com/stable/stock/{symbol}/quote?token={IEX_CLOUD_API_TOKEN}'
data = requests.get(api_url).json()
data

Imagine we’re trying to gather some key information about a stock, specifically Apple, represented by the symbol ‘AAPL’. The first line of our journey sets up this symbol, like choosing a character in a video game — this is the stock we will be focusing on.

Next, we construct a URL, which serves as a treasure map guiding us to our desired data. We use an f-string, a handy tool that allows us to insert our chosen symbol into a larger string. This URL points us to the IEX Cloud API, a service that connects us to financial data — like a portal to a world of market insights. The placeholder token in this URL, IEX_CLOUD_API_TOKEN, acts as our key, granting us permission to access the treasure trove of stock information.

Now, as we traverse to the URL, we use requests.get(api_url) to send a request to the IEX API, almost like knocking on a door and asking for the information we seek. When the server responds, we eagerly use .json() to convert the data we receive into a format we can work with — think of it as translating a foreign language into our own.

Finally, we simply place data by itself, which in this context allows us to peek at the treasures we’ve just retrieved. We opened the door, got the information, and now it’s ready for us to analyze and use in our trading strategies. This step is crucial as we build upon it in our quest to harness machine learning algorithms that can help us make predictive trading decisions, paving our way deeper into the world of algorithmic trading.

Understanding Our API Call

This API call provides the metric we’re looking for — the price-to-earnings ratio.

Let’s look at an example of how to extract this metric from our API call:

pe_ratio = data['peRatio']
pe_ratio

Imagine we’re embarking on a journey to better understand stock markets through the lens of algorithms and machine learning. Our adventure begins with a specific financial metric known as the price-to-earnings ratio, or pe ratio for short. Think of the pe ratio as a storyteller that reveals whether a company’s stock is priced fairly compared to its earnings.

In our first line, we see pe_ratio = data[‘peRatio’]. Here, we’re retrieving the pe ratio from a collection of data, which is like opening a treasure chest filled with valuable information about various companies. As we assign this treasure to the variable pe_ratio, we create a labeled box where we can easily find and refer to this specific piece of information later on, much like putting a name on a jar of cookies.

When we look at pe_ratio on the next line, it’s as if we’re taking a peek inside that jar. At this moment, we’re simply displaying the contents of pe_ratio, allowing us to see the actual values held within. This step might seem straightforward, but it’s crucial — it confirms that we’ve successfully retrieved our treasure, giving us a foundation to build upon.

By focusing on the pe ratio, we are setting the stage for deeper insights and analyses that we can perform later. As we continue through our project, we’ll be using this information as a key ingredient in our algorithmic trading recipe, mixed with machine learning and deep learning techniques to navigate the stock market’s complexities. Together, we’ll uncover patterns and make predictions that can guide trading decisions, all the while standing on the solid groundwork laid by understanding fundamental metrics like the pe ratio.
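
As a hedged illustration, the dictionary below imitates a few fields of an IEX Cloud quote payload (the values are invented), so the lookup can be seen in isolation without a network call:

```python
# A mock quote payload (invented values) standing in for the JSON
# returned by the API; the lookup mirrors data['peRatio'] above.
data = {
    'symbol': 'AAPL',
    'latestPrice': 150.25,
    'peRatio': 28.7,
}

pe_ratio = data['peRatio']
print(pe_ratio)  # 28.7
```

Since `requests.get(...).json()` returns an ordinary Python dict, extracting any other field works the same way: index by its key.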

Making a Batch API Call & Creating Our DataFrame

Just like in our first project, we’re ready to make several batch API calls and gather the information we need for our DataFrame.

Let’s begin by running the following code cell. It includes some code we created last time that we can use again in this project. Specifically, it features a function called chunks, which helps us split our list of securities into groups of 100.

# Function sourced from 
# https://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks
def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

symbol_groups = list(chunks(stocks['Ticker'], 100))
symbol_strings = []
for i in range(0, len(symbol_groups)):
    symbol_strings.append(','.join(symbol_groups[i]))
#     print(symbol_strings[i])

my_columns = ['Ticker', 'Price', 'Price-to-Earnings Ratio', 'Number of Shares to Buy']

Imagine we’re embarking on a journey to analyze several stocks, specifically looking at groups of tickers that represent each stock. Our task here is to effectively manage these tickers, breaking them down into smaller, more manageable pieces. This makes it easier to work with them, especially when we’re dealing with large data sets, like in algorithmic trading.

First, we define a function called *chunks*. Think of this function as a helper who divides a big pizza into smaller slices so everyone can enjoy a piece. The function takes a list, lst, which contains our stock tickers, and a number, n, which tells the function how many tickers should be in each group. Within the function, we use a simple loop to step through the list in increments of n, yielding sections of n tickers until we’ve covered the entire list. This way, we can handle our data in a way that feels less overwhelming.

Next, we take the entire list of stock tickers from a dataset named stocks, and use our chunks function to split it into groups of 100. We store those groups in symbol_groups, creating a structured way to work with our tickers. However, we need to harmonize these groups into a more readable format, so we initialize an empty list called symbol_strings.

We then cycle through each group, appending a formatted string of tickers, separated by commas, to symbol_strings. This step is essential, as it creates a clean, digestible list of tickers for later use, enabling us to feed them into our trading algorithms without hassle.

Finally, we define my_columns, which acts like a table of contents for our future analysis. It outlines the kinds of data we want to track for each stock, setting us up perfectly for the deeper insights we want to gain from our algorithmic trading project.
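To see the chunking in action on something smaller than the full S&P 500, here is the same function run over a toy list of five tickers, split into groups of two:

```python
def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

# A toy ticker list standing in for the full S&P 500 universe.
tickers = ['AAPL', 'MSFT', 'AMZN', 'GOOG', 'META']
groups = list(chunks(tickers, 2))
strings = [','.join(group) for group in groups]
print(strings)  # ['AAPL,MSFT', 'AMZN,GOOG', 'META']
```

Note that the last chunk simply holds whatever is left over, which is why the real script can safely chunk 500+ tickers into groups of 100.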

Next, let’s create a new, empty DataFrame and add our data to it one piece at a time.

final_dataframe = pd.DataFrame(columns = my_columns)

for symbol_string in symbol_strings:
#     print(symbol_strings)
    batch_api_call_url = f'https://sandbox.iexapis.com/stable/stock/market/batch/?types=quote&symbols={symbol_string}&token={IEX_CLOUD_API_TOKEN}'
    data = requests.get(batch_api_call_url).json()
    for symbol in symbol_string.split(','):
        # Note: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0.
        final_dataframe = final_dataframe.append(
                                        pd.Series([symbol, 
                                                   data[symbol]['quote']['latestPrice'],
                                                   data[symbol]['quote']['peRatio'],
                                                   'N/A'
                                                   ], 
                                                  index = my_columns), 
                                        ignore_index = True)

final_dataframe

Imagine we’re collecting vital information about stocks from a financial market, like gathering ingredients for a delicious recipe. To start, we create a blank shopping list, our final_dataframe, with specific categories marked by my_columns. Each category represents a type of information we want to gather about our stocks, akin to needing both apples and oranges on our list.

Next, we take a stroll through a collection of symbols, or stock tickers, represented by symbol_strings. Each symbol is like a different recipe we’ll explore. For each of these symbols, we construct a unique URL to fetch data from an online API, much like looking up a recipe in a cookbook — the URL contains specific information allowing us to request just what we need. This is where we utilize f-strings: they help us neatly insert the stock symbol into our URL, ensuring we get the correct information for each stock.

Once we receive the information — our data, like fresh ingredients arriving at our kitchen door — we dive into each individual symbol. We split the string of symbols into separate pieces so we can handle them one at a time. With each symbol and its corresponding data, we extract the latest price and the price-to-earnings ratio. Then, we create a series, or a small structured list, filled with these essential pieces of information, labeling them according to our earlier categories.

Finally, we append this information to our shopping list, or final_dataframe, ensuring that our list grows with each symbol we process. After iterating through all the symbols, our list is complete, brimming with crucial data we can later analyze to make informed trading decisions. This is a crucial step in our larger project, where we will use this data to fuel our machine learning models, guiding us towards smarter trading strategies.
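One caveat worth flagging: `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0. If you are on a newer pandas, the same row-building pattern can be expressed by collecting rows in a plain list and constructing the DataFrame once. The `fake_quotes` dictionary below is made-up stand-in data, not a real API response:

```python
import pandas as pd

my_columns = ['Ticker', 'Price', 'Price-to-Earnings Ratio', 'Number of Shares to Buy']

# Fake quote data standing in for the parsed JSON from the batch API call.
fake_quotes = {'AAPL': {'latestPrice': 150.0, 'peRatio': 30.0},
               'MSFT': {'latestPrice': 300.0, 'peRatio': 35.0}}

rows = []
for symbol, quote in fake_quotes.items():
    rows.append([symbol, quote['latestPrice'], quote['peRatio'], 'N/A'])

# Build the DataFrame in one shot instead of appending row by row.
final_dataframe = pd.DataFrame(rows, columns=my_columns)
print(final_dataframe.shape)  # (2, 4)
```

Building from a list is also considerably faster than repeated appends, since each `append` call copies the whole DataFrame.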

Removing Glamour Stocks

A “glamour stock” is essentially the opposite of a “value stock.”

To achieve our goal of finding the 50 best value stocks, the next step is to eliminate glamour stocks from our DataFrame.

We’ll organize the DataFrame based on the stocks’ price-to-earnings ratio, and we’ll remove all stocks that fall outside the top 50.

final_dataframe.sort_values('Price-to-Earnings Ratio', inplace = True)
final_dataframe = final_dataframe[final_dataframe['Price-to-Earnings Ratio'] > 0]
final_dataframe = final_dataframe[:50]
final_dataframe.reset_index(inplace = True)
final_dataframe.drop('index', axis=1, inplace = True)

Imagine we’re working on a project that helps us make smarter decisions in stock trading through machine learning. The goal here is to prepare our data, particularly how we evaluate companies by their Price-to-Earnings Ratio, often abbreviated as P/E Ratio. Think of this ratio as a quick way to gauge a company’s profitability — lower ratios might indicate better value, like finding a hidden gem!

In the first step, we gently guide our dataset to sort it by the P/E Ratio, with the lowest values coming first. This is like organizing our recipe ingredients from lightest to heaviest, making it easier to spot which companies we might want to analyze more closely. The term inplace=True means we’re making the changes right in our existing dataframe, so we don’t create a separate copy — just a tidy kitchen!

Next, we want to eliminate any companies that have a P/E Ratio of zero or less. Companies with zero or negative ratios might not be ideal for investment, just like we wouldn’t include spoiled ingredients in our dish. We only keep those companies with a positive P/E Ratio, which allows our selection to be filled with promising opportunities.

With the next line, we’re slicing our dataframe to retain only the top 50 companies, akin to picking the top 50 recipes from our collection based on their ratings. This helps us focus our analysis on the best candidates for investment.

To keep things organized, we reset the index, ensuring our new list is numbered neatly from zero to forty-nine. Finally, we get rid of any leftover index labels that are no longer needed, just like cleaning up excess notes after getting the recipe right.

All these steps help us refine our data into actionable insights, setting the stage for our machine-learning algorithms to predict which stocks might soar next.
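The sort-filter-slice pattern is easier to see on a tiny made-up universe. Here the tickers and P/E values are invented, and we keep the top 2 instead of the top 50:

```python
import pandas as pd

# Toy P/E data; the negative value mimics a loss-making company.
df = pd.DataFrame({'Ticker': ['A', 'B', 'C', 'D'],
                   'Price-to-Earnings Ratio': [25.0, -5.0, 10.0, 40.0]})

df.sort_values('Price-to-Earnings Ratio', inplace=True)        # cheapest first
df = df[df['Price-to-Earnings Ratio'] > 0]                     # drop zero/negative earners
df = df[:2]                                                    # keep the 2 cheapest (50 in the real script)
df.reset_index(drop=True, inplace=True)
print(list(df['Ticker']))  # ['C', 'A']
```

Company B is sorted to the front by its negative ratio but then removed by the positivity filter, which is why the filter matters before slicing.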

Determining How Many Shares to Purchase

Next, let’s figure out how many shares we should buy.

We’ll use the portfolio_input function that we developed in our momentum project for this calculation.

I’ve included this function below.

def portfolio_input():
    global portfolio_size
    portfolio_size = input("Enter the value of your portfolio:")

    try:
        val = float(portfolio_size)
    except ValueError:
        print("That's not a number! \n Try again:")
        portfolio_size = input("Enter the value of your portfolio:")

Imagine we’re embarking on a journey where we need to set the foundation for managing our investment portfolio in algorithmic trading. Our adventure begins with a function called portfolio_input, which acts like a special recipe card that helps us gather crucial information about how much money we’re willing to invest.

As we step into this function, we reach for a global ingredient: portfolio_size, which means that this value can be accessed throughout our entire program, like a secret stash we can revisit whenever we need. The first action we take is to prompt the user: “Enter the value of your portfolio.” Here, we invite someone to share their investment amount, which will ultimately influence our trading decisions.

Once we have this input, the next step is crucial — just like a chef needs to ensure their measuring cup is accurate. We try to convert the input into a floating number with float(portfolio_size). This is where our recipe can get tricky, as not all inputs will be valid. If someone mistakenly types in text instead of a number, we might run into a ‘ValueError’, like realizing that we’ve added salt instead of sugar to our dish.

To handle this potential mishap, we wrap our conversion attempt in a protective shield, known as a try-except block. If the input isn’t a number, we catch the error and kindly inform our user that their response wasn’t quite right. We invite them to try again, ensuring they understand what’s needed without discouragement.

Through this little function, we’re not just gathering numbers; we’re laying the groundwork for analyzing investments intelligently with our future machine learning tools. As we proceed with our project, creating a robust portfolio management system will help us make well-informed trading decisions, enhancing our algorithmic trading strategy.
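One limitation of the function above is that it only retries once and never re-checks the second attempt. If you want a stricter version, one option is to split parsing from prompting so the validation is easy to test; `parse_portfolio_size` and `ask_portfolio_size` are hypothetical helper names, not part of the original course code:

```python
def parse_portfolio_size(raw):
    """Return the portfolio value as a float, or None if it can't be parsed."""
    try:
        return float(raw)
    except ValueError:
        return None

def ask_portfolio_size(prompt="Enter the value of your portfolio:"):
    """Keep asking until the user types something float() can parse."""
    while True:
        value = parse_portfolio_size(input(prompt))
        if value is not None:
            return value
        print("That's not a number!\nTry again:")

print(parse_portfolio_size("10000"))  # 10000.0
```

The loop guarantees every attempt is validated, and the pure parsing helper can be unit-tested without simulating keyboard input.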

Use the portfolio_input function to take in a portfolio_size value from the user of this script.

portfolio_input()

Imagine we’re crafting a smart trading bot that helps us navigate the bustling world of finance. In our journey, we encounter a little function called portfolio_input(). Think of this function as a friendly guide, standing at the entrance of our investment adventure, ready to gather crucial information to shape our trading strategy.

When we call portfolio_input(), it’s like ringing a doorbell. This action prompts our program to spring into motion, inviting us to type in the total value of our portfolio, the single ingredient this recipe needs. The function stores that value in the global portfolio_size variable, so the rest of the script can reach for it whenever it needs to size a position.

As we imagine this happening, it’s essential to remember that functions are reusable recipe cards in our programming kitchen, helping us bake the same delightful dish — our trading strategy — over and over again without starting from scratch. Once we have our inputs snugly tucked away, our smart trading bot can use this information to make informed decisions, optimizing our investments.

By calling portfolio_input(), we’re taking the first brave step toward building a robust trading strategy that employs machine learning. We put ourselves in a position to analyze our financial landscape, ensuring we are guided by data rather than just intuition. With every detail collected through this little function, we move closer to crafting a powerful tool that could set us apart in the fast-paced world of algorithmic trading.

You can now use the global portfolio_size variable to figure out how many shares our strategy should buy.

position_size = float(portfolio_size) / len(final_dataframe.index)
for i in range(0, len(final_dataframe['Ticker'])):
    final_dataframe.loc[i, 'Number of Shares to Buy'] = math.floor(position_size / final_dataframe['Price'][i])
final_dataframe

Imagine we have a portfolio, a kind of treasure chest filled with various investments, and we want to decide how many shares of each stock we should buy. First, we need to figure out our position size, which can be thought of as the portion of our treasure chest we want to allocate to each stock. By dividing our total portfolio size by the number of different stocks we’ve collected, we determine how much money we’ll use for each stock investment.


Now, we embark on a loop, kind of like going down a list at the grocery store, where each item we check corresponds to a different stock in our dataframe. This loop runs once for every stock we have. For each stock, represented by the index i, we calculate how many shares we can purchase. We take our calculated position size — remember, that’s our allocated money for this stock — and we divide it by the price of that stock at index i.

Because we don’t want to buy partial shares (imagine trying to buy half a piece of fruit!), we use math.floor to round down to the nearest whole number. This tells us the maximum number of shares we can afford to buy for that particular stock. We store this result back into our dataframe, making it easy to see all our potential purchases lined up.

At the end, we have a complete overview in our dataframe of how many shares we can buy for each stock based on our strategic allocation. This step is crucial as we prepare to make informed trading decisions using sophisticated algorithms and machine learning techniques later in our project. Together, we are shaping our approach to algorithmic trading!
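The position-sizing arithmetic can be sketched on its own with made-up numbers, a $100,000 portfolio spread equally over four hypothetical stocks:

```python
import math

# Hypothetical inputs: total portfolio value and per-share prices.
portfolio_size = 100_000.0
prices = {'A': 150.0, 'B': 47.0, 'C': 310.0, 'D': 12.5}

position_size = portfolio_size / len(prices)   # $25,000 allocated per stock
shares = {ticker: math.floor(position_size / price) for ticker, price in prices.items()}
print(shares)  # {'A': 166, 'B': 531, 'C': 80, 'D': 2000}
```

Because math.floor always rounds down, the strategy never recommends more shares than the allocation can actually pay for; a little cash is simply left over for each position.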

Creating a Stronger and More Realistic Value Strategy

Every valuation metric has its own shortcomings.

For instance, the price-to-earnings ratio doesn’t work well with stocks that have negative earnings.

Likewise, it can be challenging to value stocks that buy back their own shares using the price-to-book ratio.

To build effective quantitative value strategies, investors usually rely on a combination of valuation metrics. In this section, we will focus on stocks that fall into the lowest percentiles for the following metrics:

* Price-to-earnings ratio

* Price-to-book ratio

* Price-to-sales ratio

* Enterprise Value divided by Earnings Before Interest, Taxes, Depreciation, and Amortization (EV/EBITDA)

* Enterprise Value divided by Gross Profit (EV/GP)

Some of these metrics aren’t available directly from the IEX Cloud API and need to be calculated from raw data. We will begin by determining each data point from the ground up.

symbol = 'AAPL'
batch_api_call_url = f'https://sandbox.iexapis.com/stable/stock/market/batch/?types=advanced-stats,quote&symbols={symbol}&token={IEX_CLOUD_API_TOKEN}'
data = requests.get(batch_api_call_url).json()

# P/E Ratio
pe_ratio = data[symbol]['quote']['peRatio']

# P/B Ratio
pb_ratio = data[symbol]['advanced-stats']['priceToBook']

# P/S Ratio
ps_ratio = data[symbol]['advanced-stats']['priceToSales']

# EV/EBITDA
enterprise_value = data[symbol]['advanced-stats']['enterpriseValue']
ebitda = data[symbol]['advanced-stats']['EBITDA']
ev_to_ebitda = enterprise_value/ebitda

# EV/GP
gross_profit = data[symbol]['advanced-stats']['grossProfit']
ev_to_gross_profit = enterprise_value/gross_profit

Imagine we’re about to dive into the fascinating world of stock analysis using Python. Our objective here is to fetch key financial metrics for a company — in this case, Apple, with the ticker symbol ‘AAPL’ — to help us understand its financial health better.

First, we create a URL that will serve as our bridge to the data we seek, where we specify the type of information we want: advanced stats and quotes. Think of this as writing an address on an envelope that will guide our request to the right destination. We use Python’s f-string to neatly embed the symbol and our API token into this URL, which allows us to connect to the dataset we need.

Next, we send a request to this URL, much like sending our envelope off in the mail, and then we receive a response that we can transform into a usable format with .json(), like opening a letter to read its contents.

With the financial data now accessible, we extract several important ratios. First up is the P/E ratio, or price-to-earnings ratio, which tells us how much investors are willing to pay per dollar of earnings. We access this by navigating through the structure of our received data, much like following the path in a maze.

Then, we examine the P/B ratio, or price-to-book ratio, which compares a company’s market value to its book value, helping investors determine if the stock is undervalued or overvalued.

We also calculate other ratios, like the P/S ratio, which assesses the company’s price compared to its sales, and the EV/EBITDA ratio, where we first find the enterprise value and EBITDA, serving as a measure of a company’s overall valuation against its earnings. Lastly, we look at the EV/GP ratio, which takes the enterprise value and divides it by gross profit to provide additional insights.

By the end of this journey through the code, we’ve gathered valuable metrics that inform our trading decisions, all part of our broader project exploring algorithmic trading powered by machine learning and deep learning. Understanding these financial fundamentals is key to guiding our investment strategies effectively.

Now, let’s get started on creating our DataFrame. You’ll see that I frequently use the abbreviation rv. This stands for robust value, and it’s the term we’ll use for this advanced strategy from now on.

rv_columns = [
    'Ticker',
    'Price',
    'Number of Shares to Buy', 
    'Price-to-Earnings Ratio',
    'PE Percentile',
    'Price-to-Book Ratio',
    'PB Percentile',
    'Price-to-Sales Ratio',
    'PS Percentile',
    'EV/EBITDA',
    'EV/EBITDA Percentile',
    'EV/GP',
    'EV/GP Percentile',
    'RV Score'
]

rv_dataframe = pd.DataFrame(columns = rv_columns)

for symbol_string in symbol_strings:
    batch_api_call_url = f'https://sandbox.iexapis.com/stable/stock/market/batch?symbols={symbol_string}&types=quote,advanced-stats&token={IEX_CLOUD_API_TOKEN}'
    data = requests.get(batch_api_call_url).json()
    for symbol in symbol_string.split(','):
        enterprise_value = data[symbol]['advanced-stats']['enterpriseValue']
        ebitda = data[symbol]['advanced-stats']['EBITDA']
        gross_profit = data[symbol]['advanced-stats']['grossProfit']
        
        try:
            ev_to_ebitda = enterprise_value/ebitda
        except TypeError:
            ev_to_ebitda = np.nan
        
        try:
            ev_to_gross_profit = enterprise_value/gross_profit
        except TypeError:
            ev_to_gross_profit = np.nan
            
        # Note: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0.
        rv_dataframe = rv_dataframe.append(
            pd.Series([
                symbol,
                data[symbol]['quote']['latestPrice'],
                'N/A',
                data[symbol]['quote']['peRatio'],
                'N/A',
                data[symbol]['advanced-stats']['priceToBook'],
                'N/A',
                data[symbol]['advanced-stats']['priceToSales'],
                'N/A',
                ev_to_ebitda,
                'N/A',
                ev_to_gross_profit,
                'N/A',
                'N/A'
        ],
        index = rv_columns),
            ignore_index = True
        )

Imagine we’re cooking up a recipe for a robust financial analysis tool aimed at algorithmic trading using machine learning. The first thing we notice is that we’re setting up a table — it’s like laying out our ingredients before we start cooking. Here, we define the *rv_columns*, which represent the various financial metrics we want to track for different stocks, like their prices, earnings ratios, and something called the RV Score.

Next, we create an empty DataFrame, *rv_dataframe*, inspired by a blank canvas, awaiting our delicious data to fill it. We’ll be adding information about various stocks to this empty table.

Now we dive into a loop, symbolizing the repetitive steps in our recipe. For every *symbol_string*, we’ll fetch stock data from an online API, akin to gathering ingredients from a pantry. The API call retrieves detailed information about stocks, and converting that to JSON format is like prepping our ingredients for cooking.

As we split *symbol_string* into individual stock symbols, we are systematically addressing each ingredient one by one. For each symbol, we grab three essential pieces of information — enterprise value, EBITDA, and gross profit — using our already prepared ingredients.

Now comes a crucial step, where we try to calculate the *EV to EBITDA ratio*. If an ingredient is missing entirely (the API returns None for fields it cannot supply, and dividing by None raises a TypeError), we handle that gracefully, avoiding any kitchen disasters by assigning the special value NaN instead.

Finally, we take all this carefully prepared information and add it to our DataFrame. It’s like plating a dish: we arrange our values beautifully, ensuring each column has the right ingredient, from ticker symbols to important ratios.

As we collect more batches of data, our table fills up, empowering our algorithmic trading strategies with real-time insights. This is just a foundational step toward building a powerful machine learning model that can help us navigate the stock market effectively.
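The try/except pattern above can be isolated into a tiny helper; `safe_ratio` is a hypothetical name for illustration, not something from the tutorial's codebase:

```python
import numpy as np

def safe_ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when either value is missing (None)."""
    try:
        return numerator / denominator
    except TypeError:          # arithmetic on None raises TypeError
        return np.nan

print(safe_ratio(100.0, 4.0))   # 25.0
print(safe_ratio(100.0, None))  # nan
```

Using NaN rather than crashing lets the loop keep processing the rest of the universe; the gaps are cleaned up in the next section.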

Handling Missing Data in Our DataFrame

Our DataFrame has a few gaps because some of the metrics we need aren’t available from the API we’re using.

To find the missing data, you can use pandas’ isnull method:

rv_dataframe[rv_dataframe.isnull().any(axis=1)]

Let’s imagine we’re on a treasure hunt, and our goal is to discover hidden gems of data within a treasure chest called a DataFrame. Now, sometimes when we collect data, we might come across mysterious empty spaces — think of them as little question marks — indicating we don’t have complete information. Our job today is to identify these missing pieces.

In our quest, we start by peering into our treasure chest, which we refer to as rv_dataframe. This DataFrame is filled with rows and columns of intriguing data points — sort of like a spreadsheet full of important numbers and descriptions about our trading strategies. Now, we use a special tool called isnull(). Picture it as a magnifying glass that helps us spot any of those pesky question marks hiding in our data. It scans through each cell to see if any spot is empty.

However, finding empty spaces is just the beginning. We want to know which rows contain these gaps wherever they might be nestled. So, we combine isnull() with any(axis=1), which is like saying, “If any of my data jewels in this row are missing, let me know.” The axis parameter here is our guide that tells it to look across the row — rather like checking every item in a single line of treasures.

What we get back is a view of only those rows that are incomplete — like a filtered treasure map leading us to the spots that need attention. By identifying these missing values, we can then decide how to fill in the gaps or remove them, ensuring our future analysis or model training is as robust as possible. This process is essential in our larger project of using machine learning for algorithmic trading, as it helps ensure that our models have high-quality, accurate data to learn from and make informed trading decisions.
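Here is the same row-filtering trick on a three-row toy DataFrame, where one stock is deliberately missing its EV/EBITDA value:

```python
import numpy as np
import pandas as pd

# Toy data: stock 'B' has a missing EV/EBITDA value.
df = pd.DataFrame({'Ticker': ['A', 'B', 'C'],
                   'EV/EBITDA': [12.0, np.nan, 8.0]})

# Keep only the rows where at least one cell is null.
missing_rows = df[df.isnull().any(axis=1)]
print(list(missing_rows['Ticker']))  # ['B']
```

The boolean mask from isnull().any(axis=1) is True exactly for the incomplete rows, so indexing with it returns only the stocks that need attention.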

Handling missing data is a key topic in data science.

There are two primary methods to consider:

* Remove the missing data from the dataset (you can use pandas’ dropna method for this)

* Substitute missing data with a new value (pandas’ fillna method can help)

In this tutorial, we will focus on replacing missing data with the average of the existing non-NaN values in that column.

Here’s the code to accomplish this:

for column in ['Price-to-Earnings Ratio', 'Price-to-Book Ratio', 'Price-to-Sales Ratio', 'EV/EBITDA', 'EV/GP']:
    rv_dataframe[column].fillna(rv_dataframe[column].mean(), inplace = True)

Imagine we have a treasure map filled with data about different stocks, and our goal is to find the most promising investments. However, just like any map, there are some unclear spots — these are represented by missing values in our data. Now, to make our map useful, we need to fill in those gaps intelligently.

Here, we start by looking at a list of important indicators — think of them as specific landmarks on our map. Each one tells us something crucial about a stock’s health, like its Price-to-Earnings Ratio or Price-to-Book Ratio. These landmarks are laid out in a collection called rv_dataframe, where all our data is stored orderly.

Next, we take our treasure hunt step by step by using a loop, which is akin to repeating the process for each landmark. For every indicator in our list, we delve into the data and search for any missing treasure — what we call NaN, or “Not a Number,” in programming. When we find these holes in our data, we decide to fill them using the average value of the field. Think of the average as a sensible guess based on all the other clues around it.

To execute this, the method fillna is employed, which replaces the missing values with that average. The inplace=True ensures that we’re not just making a copy of the data but actually modifying the original map right where we need it.

By the end of this process, our map of stock indicators is clearer and more accurate, enabling us to make better investment decisions as we explore the exciting world of algorithmic trading using machine learning. Each step brings us closer to successfully navigating the market!
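On a toy column the mean-imputation step looks like this. (The assignment form below is used instead of column-level inplace=True, since chained inplace fills are deprecated in newer pandas; the behavior is the same.)

```python
import numpy as np
import pandas as pd

# Toy column with one gap; the mean of the non-NaN values (10 and 30) is 20.
df = pd.DataFrame({'Price-to-Earnings Ratio': [10.0, np.nan, 30.0]})

df['Price-to-Earnings Ratio'] = df['Price-to-Earnings Ratio'].fillna(
    df['Price-to-Earnings Ratio'].mean())
print(list(df['Price-to-Earnings Ratio']))  # [10.0, 20.0, 30.0]
```

Note that Series.mean() skips NaN values by default, which is exactly why this imputation works without any extra filtering.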

If we execute the earlier statement to show rows with missing data, we should not see any results returned.

rv_dataframe[rv_dataframe.isnull().any(axis=1)]

Imagine you’re sifting through a large pile of papers, looking for any that have missing information. In our little coding adventure, we’re doing something similar, but instead of papers, we’re working with something called a dataframe, which is like a neatly organized table of information. This dataframe, named rv_dataframe, contains various data points that are crucial for our algorithmic trading project.

Now, the goal is to identify any rows in this table that have missing values — think of it as hunting for blank spots on your project pages. To accomplish this, we start by checking each row of our dataframe to see if there are any gaps. The method isnull() acts like a keen-eyed assistant that scans each cell, pointing out where the missing values are present. When we call any(axis=1), it’s as though we are saying, “Hey, show me any row where at least one cell is empty.”

Finally, we wrap this search in brackets that simply filter our table, allowing us to collect all those rows with missing data. What we end up with is a new table that displays only the entries needing attention — like a checklist of things to fix.

This piece of code is essential for ensuring our future decisions are made on complete and accurate data, paving the way for our machine learning algorithms to operate effectively. As we delve deeper into our project, each piece of code will help refine our approach to creating smarter trading strategies.

Calculating Value Percentiles

Next, we’ll calculate value score percentiles for each stock in our universe. Specifically, we want to find the percentile scores for these metrics for every stock:

* Price-to-earnings ratio

* Price-to-book ratio

* Price-to-sales ratio

* EV/EBITDA

* EV/GP

Here’s our approach:

metrics = {
            'Price-to-Earnings Ratio': 'PE Percentile',
            'Price-to-Book Ratio': 'PB Percentile',
            'Price-to-Sales Ratio': 'PS Percentile',
            'EV/EBITDA': 'EV/EBITDA Percentile',
            'EV/GP': 'EV/GP Percentile'
}

for row in rv_dataframe.index:
    for metric in metrics.keys():
        rv_dataframe.loc[row, metrics[metric]] = stats.percentileofscore(rv_dataframe[metric], rv_dataframe.loc[row, metric])/100

# Print each percentile score to make sure it was calculated properly
for metric in metrics.values():
    print(rv_dataframe[metric])

# Print the entire DataFrame    
rv_dataframe

Imagine we’re on a journey to explore how we can analyze financial data, particularly for algo trading, where decision-making relies heavily on numerical insights. Here, our adventure begins with a map, which is represented by a collection of metrics — a dictionary. Each metric is like a specific measurement that helps us understand a company’s financial health, such as the Price-to-Earnings Ratio or the EV/EBITDA.

Now, picture this dictionary as a recipe book where each metric is paired with its percentile counterpart, guiding us on how to quantify our financial metrics against the broader market. As we dive deeper, we need to compute how these financial metrics rank, so we embark on a double-loop journey. The first loop is like going through every row of our data — a table of companies, if you will — while the inner loop is our way of comparing each metric in our dictionary.

With every company in our index, we calculate the percentile score for each metric. It’s like determining how well a student performed compared to their peers; we take the specific metric for that company and analyze it. Using the stats.percentileofscore function, we find out where our company stands in relation to others as a score out of 100, then divide by 100 so it’s stored as a fraction between 0 and 1. Each calculated percentile score gets stored conveniently in our DataFrame under the corresponding key, making our information neatly organized.

Finally, we want to ensure everything looks accurate, so we print out the computed percentile scores, shedding light on where each company stands financially. When we print the entire DataFrame, it’s like revealing the big picture: a comprehensive view of our analysis that can guide our trading decisions in the complex world of finance. This meticulous process sets the stage for informed decision-making as we harness machine learning and deep learning in our algo trading project.
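The rank logic can be sketched without scipy. The hypothetical `percentile_of` helper below mirrors the "weak" variant of percentileofscore (the fraction of values less than or equal to the score), scaled to 0 to 1 as in the code above; scipy's default "rank" variant handles ties slightly differently:

```python
def percentile_of(values, score):
    """Fraction of values less than or equal to score (scipy's 'weak' kind, / 100)."""
    return sum(v <= score for v in values) / len(values)

# Toy P/E universe: a stock at 10.0 is cheaper than half of its peers.
pe_ratios = [5.0, 10.0, 15.0, 20.0]
print(percentile_of(pe_ratios, 10.0))  # 0.5
```

A low percentile means the stock is cheap relative to the universe on that metric, which is why the RV strategy hunts for the lowest scores.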

Determining the RV Score

Next, let’s find our RV Score, which stands for Robust Value. This score will help us filter stocks as part of our investing strategy.

The RV Score is simply the average of the five percentile scores we calculated in the previous section.

To calculate the average, we will use the mean function from Python’s built-in statistics module.

from statistics import mean

for row in rv_dataframe.index:
    value_percentiles = []
    for metric in metrics.keys():
        value_percentiles.append(rv_dataframe.loc[row, metrics[metric]])
    rv_dataframe.loc[row, 'RV Score'] = mean(value_percentiles)
    
rv_dataframe

Imagine we’re diving into a fascinating world of algorithmic trading, where technology and finance meet. Our goal here is to calculate something called the “RV Score,” a crucial metric that will help us assess different trading strategies using data we have in a table, or as we call it, a DataFrame.

We start by bringing in a handy tool from a library designed for statistical functions. Here, we import the function that computes the average, which is like finding the middle point in our collection of numbers — it helps us summarize information effectively.

Now, we set off on a journey through each row of our DataFrame, much like flipping through pages in a book, where each row holds vital information about a single stock. For every row we visit, we create a blank list called value_percentiles, ready to collect our values like a container waiting to be filled with ingredients for a recipe.

Next, we peek at a collection of metrics, which are like ingredients that quantify various aspects of our trading strategy. For each metric, we extract its corresponding value from our DataFrame, adding it into our value_percentiles list. This is akin to measuring out each ingredient — making sure we have everything we need to achieve that perfect score.

Once we’ve gathered all the values for a given day, we calculate their average using the mean function we imported earlier. This average is the RV Score, and we carefully store it back into our DataFrame, ensuring our calculated wisdom is recorded for future analysis.

In the end, as we look at our updated DataFrame, we see that we’ve enriched our data landscape, paving the way for smarter trading decisions in our algorithmic trading venture.
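To see the mechanics in isolation, here is a minimal sketch with made-up percentile values for two hypothetical tickers. The tickers, column names, and numbers are purely illustrative; the row-by-row averaging is the same pattern as above:

```python
import pandas as pd
from statistics import mean

# Toy DataFrame with made-up percentile values for two hypothetical tickers,
# standing in for the five valuation percentiles computed earlier.
df = pd.DataFrame({
    'Ticker': ['AAA', 'BBB'],
    'PE Percentile': [0.10, 0.80],
    'PB Percentile': [0.20, 0.70],
    'PS Percentile': [0.30, 0.60],
    'EV/EBITDA Percentile': [0.40, 0.50],
    'EV/GP Percentile': [0.50, 0.40],
})

# Average the five percentile columns row by row.
percentile_cols = [c for c in df.columns if c.endswith('Percentile')]
for row in df.index:
    df.loc[row, 'RV Score'] = mean(df.loc[row, c] for c in percentile_cols)

print(df[['Ticker', 'RV Score']])
```

Ticker AAA averages to 0.3 and BBB to 0.6, so AAA would rank as the better value candidate.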

Choosing the 50 Best Value Stocks

Once again, we can find the 50 best value stocks by organizing the DataFrame based on the RV Score column and keeping only the top 50 entries.

rv_dataframe.sort_values(by = 'RV Score', inplace = True)
rv_dataframe = rv_dataframe[:50]
rv_dataframe.reset_index(drop = True, inplace = True)

Imagine we’re curating a collection of the best investment opportunities based on their “RV Score,” which tells us how appealing each option is. Our goal is to identify the top performers in a sea of data, much like sorting through a box of chocolates to find the most delectable ones.

We begin by taking our data, represented here as a dataframe, and we want to sort it. The first line arranges our dataframe in ascending order based on the RV Score. Since a lower RV Score means a stock looks cheaper across all five valuation metrics, ascending order ensures the best value candidates float to the top. The sort_values method meticulously organizes the rows based on the values in the 'RV Score' column, and by setting inplace = True, we're saying we want this sorting to happen directly within our existing collection, rather than creating a new one.

Next, we trim down the list to just the top 50 candidates. This is akin to picking the top 50 chocolates that caught our eye. By slicing the dataframe to rv_dataframe[:50], we’re focusing our attention solely on those sweet selections, making it easier for us to analyze and act on them.

Finally, we call for a little housekeeping to reset the index of our refined list with the reset_index method. Think of it like renumbering our favorite chocolates so that they have a fresh order, ignoring any previous numbering that might still linger. By using drop=True, we’re ensuring we don’t keep any old index numbers, which makes everything clean and tidy.

Putting these pieces together, we streamline our analysis of the top investment opportunities, setting the stage for deeper exploration into machine learning and deep learning strategies that can help us make informed trading decisions.
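The same sort-and-slice pattern can be seen end to end in a toy example. The tickers and scores below are invented, and we keep the top 2 instead of the top 50 so the result is easy to eyeball:

```python
import pandas as pd

# Hypothetical RV Scores; lower means the stock looks cheaper across the metrics.
df = pd.DataFrame({
    'Ticker': ['AAA', 'BBB', 'CCC', 'DDD'],
    'RV Score': [0.62, 0.15, 0.48, 0.30],
})

top_n = 2  # the article keeps 50; shrunk here for illustration

# Ascending sort floats the cheapest stocks to the top; slice, then re-index.
df.sort_values(by='RV Score', inplace=True)
df = df[:top_n]
df.reset_index(drop=True, inplace=True)

print(df['Ticker'].tolist())  # ['BBB', 'DDD']
```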

Determining How Many Shares to Purchase

We’ll utilize the portfolio_input function we developed earlier to gather our portfolio size. After that, we will apply a similar approach in a for loop to figure out how many shares to buy for each stock in our investment options.

portfolio_input()

Imagine we’re embarking on an exciting adventure into the world of algorithmic trading through machine learning! Right at the heart of our journey, we have a little function called portfolio_input(). Think of this function as an invitation for us to gather important information about our investment choices — like collecting ingredients for a delicious meal.

When we call portfolio_input(), it's like ringing a doorbell to let our program know it's time to step into the kitchen and start our preparation. This function prompts us for the total value of our portfolio, the overall budget we're willing to commit, much like a chef deciding how much to spend at the market before cooking.

Now, every time we run this function, we might enter a different portfolio size depending on our circumstances and market expectations, which keeps our approach fresh and adaptable. It's this flexibility that allows us to size a portfolio that aligns with our trading strategy. Without such input, our algorithm wouldn't know how to proceed; it's like trying to bake a cake without knowing how many guests we're serving!

So, as we venture forth from portfolio_input(), we set the foundations for not just executing trades but also leveraging machine learning to optimize our decisions based on past data and predictions. This first step is crucial as it lays the groundwork for creating an intelligent trading system that learns and evolves — much like a cook perfecting a recipe over time! And as we continue to build on this, we’ll see how each part comes together in our larger quest for smart, effective trading.

position_size = float(portfolio_size) / len(rv_dataframe.index)
for i in range(len(rv_dataframe.index)):
    rv_dataframe.loc[i, 'Number of Shares to Buy'] = math.floor(position_size / rv_dataframe['Price'][i])
rv_dataframe

Imagine we’re on a quest to optimize our trading strategy in the world of algorithmic trading. Our first goal is to determine how many shares of a stock we should buy based on our available portfolio size. So, we start by calculating the position size, which is like budgeting your dollars for each investment. We take the total portfolio size and divide it by the number of stocks we’re interested in, represented by the length of our stock data, or rv_dataframe.index. This division tells us how much money we can allocate to each stock.

Next, we embark on a journey through each stock listed in our rv_dataframe, which is like flipping through a recipe book for each stock's entry. We use a loop that runs over every row index, from the first stock to the last, so that no stock is left out of our calculations. Inside this loop, we make our calculations specific. For each stock, we determine how many shares we can buy. We take our budget for that stock, referred to as position_size, and divide it by the current price of that stock, rv_dataframe['Price'][i].

However, since we can’t buy half a share, we use a function called math.floor, which rounds down to the nearest whole number — it’s like making sure you only buy complete cookies from the bakery, not fractions of them! This calculated number of shares is then stored back into our data frame, specifically in the column labeled ‘Number of Shares to Buy’.

Finally, at the end, we look back at our data frame, which now has a clear picture of how many shares we can purchase for each stock. This powerful step moves us closer to executing our trading strategy, laying the foundation for more complex decisions and helping us navigate the exciting world of machine learning and deep learning in trading.
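A self-contained sketch of the same allocation logic, using a hypothetical two-stock universe and a hard-coded budget in place of the portfolio_input() prompt:

```python
import math
import pandas as pd

# Hypothetical two-stock universe; in the article, portfolio_size comes from
# the portfolio_input() helper and prices come from the API.
df = pd.DataFrame({'Ticker': ['AAA', 'BBB'], 'Price': [310.0, 47.5]})
portfolio_size = 10_000.0

# Equal allocation: split the portfolio evenly across every stock.
position_size = portfolio_size / len(df.index)

# Loop over every row (including the last) and round down to whole shares.
for i in range(len(df.index)):
    df.loc[i, 'Number of Shares to Buy'] = math.floor(position_size / df.loc[i, 'Price'])

print(df['Number of Shares to Buy'].tolist())
```

With $5,000 allocated per stock, $310.00 shares round down to 16 and $47.50 shares round down to 105, never overspending the per-stock budget.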

Formatting Our Excel Output

In this section, we’ll use the XlsxWriter library for Python to create beautifully formatted Excel files.

XlsxWriter is a fantastic package that provides a lot of customization options. However, it can feel a bit overwhelming for beginners. Because of this, this section will be quite detailed as I want to ensure you understand how XlsxWriter works.

writer = pd.ExcelWriter('value_strategy.xlsx', engine='xlsxwriter')
rv_dataframe.to_excel(writer, sheet_name='Value Strategy', index = False)

Imagine we’re preparing a delightful feast for our friends who are hungry for insights about a sophisticated trading strategy. Our feast begins by creating a stylish serving platter, the Excel file, where we’ll elegantly present our data. The first line of code sets the stage by inviting pandas, a handy library for manipulating data, to create an Excel writer object. We name our platter ‘value_strategy.xlsx’ and choose the ‘xlsxwriter’ engine, which is like selecting a fine set of utensils to ensure everything is served beautifully.

Now, with our platter ready, we want to fill it with a rich dish made of our findings — the rv_dataframe. Think of this dataframe as a recipe filled with key ingredients that describe our trading strategy. The second line gently scoops up the data from rv_dataframe and presents it on our ‘Value Strategy’ sheet within the Excel file. By specifying index=False, we’re making sure that we don’t clutter our presentation with any unnecessary ingredient labels; we just want the pure essence of our strategy to shine through.

Once our data is painstakingly poured into the platter, it’s all set for our friends to savor and explore. This process — writing data to an Excel file — is crucial in our larger project of algorithmic trading using machine learning and deep learning. We want our insights to be accessible and digestible, so our trading strategies can be tested and refined effectively, ensuring our feast of knowledge feeds the hungry minds behind the trading decisions.

Creating the Formats We Need for Our .xlsx File

As we discussed in our first project, formats can include colors, fonts, and symbols like % and $. For our Excel document, we will require five main formats:

* String format for tickers

* $XX.XX format for stock prices

* $XX,XXX format for market capitalization

* Integer format for the number of shares to purchase

* Float formats with 1 decimal for each valuation metric

Since we’ve already created some formats in earlier sections of this course, you can find them listed below. Please run this code cell before moving on.

background_color = '#0a0a23'
font_color = '#ffffff'

string_template = writer.book.add_format(
        {
            'font_color': font_color,
            'bg_color': background_color,
            'border': 1
        }
    )

dollar_template = writer.book.add_format(
        {
            'num_format':'$0.00',
            'font_color': font_color,
            'bg_color': background_color,
            'border': 1
        }
    )

integer_template = writer.book.add_format(
        {
            'num_format':'0',
            'font_color': font_color,
            'bg_color': background_color,
            'border': 1
        }
    )

float_template = writer.book.add_format(
        {
            'num_format':'0.0',
            'font_color': font_color,
            'bg_color': background_color,
            'border': 1
        }
    )

percent_template = writer.book.add_format(
        {
            'num_format':'0.0%',
            'font_color': font_color,
            'bg_color': background_color,
            'border': 1
        }
    )

Imagine we’re creating a beautifully designed report for our algorithmic trading project, and we want to present our data in a clear and appealing way. The first two lines set the stage for our visual theme, where we choose a deep blue background color and crisp white font color, representing a sleek and professional look. It’s like picking the perfect wallpaper and font for a room we’re going to decorate.

Next, we create our first recipe card, lovingly named string_template. This is where we define a format for text entries in our report. It includes our chosen colors and also a border, giving our text a nice, framed appearance. It’s like crafting a custom frame for a painting; it enhances the overall picture.

We then define several more recipe cards, each tailored to specific data types we expect to present. The dollar_template is designed for monetary values, ensuring that anything shown in dollars will be formatted with a dollar sign and two decimal places, just like we'd want to see in a financial report. Similarly, we craft the integer_template for whole numbers and the float_template for the valuation metrics, which displays a single decimal place so ratios like 12.3 stay readable without excess precision.

Finally, the percent_template is like dressing up percentages for a celebration, showing one decimal place in a way that’s easy to read. Each template directly contributes to how our data will be displayed, making our report visually coherent and professional.

As we compile this report, we are preparing to visualize our trading results, making complex data easy to digest, and presenting it in a way that will impress stakeholders and aid in our decision-making. This attention to detail in presentation is crucial in the world of algorithmic trading, as it builds trust and clarity in our strategy’s outcomes.

column_formats = {
                    'A': ['Ticker', string_template],
                    'B': ['Price', dollar_template],
                    'C': ['Number of Shares to Buy', integer_template],
                    'D': ['Price-to-Earnings Ratio', float_template],
                    'E': ['PE Percentile', percent_template],
                    'F': ['Price-to-Book Ratio', float_template],
                    'G': ['PB Percentile', percent_template],
                    'H': ['Price-to-Sales Ratio', float_template],
                    'I': ['PS Percentile', percent_template],
                    'J': ['EV/EBITDA', float_template],
                    'K': ['EV/EBITDA Percentile', percent_template],
                    'L': ['EV/GP', float_template],
                    'M': ['EV/GP Percentile', percent_template],
                    'N': ['RV Score', percent_template]
                 }

for column in column_formats.keys():
    writer.sheets['Value Strategy'].set_column(f'{column}:{column}', 25, column_formats[column][1])
    writer.sheets['Value Strategy'].write(f'{column}1', column_formats[column][0], column_formats[column][1])

Imagine we’re building a neat and organized spreadsheet to analyze stock data for our algorithmic trading project. First, we create a dictionary called column_formats, which is like a blueprint for each column we will include. Each entry maps a column letter, like ‘A’ or ‘B’, to a pair: the column’s name and a certain format that tells us how to style the data. Think of it like designing a recipe card where each ingredient has a specific format to follow — some might be strings, others numbers or percentages.

Now, we need to make sure our spreadsheet reflects this design. We embark on a journey through each column defined in our blueprint using a loop. This is similar to repeating a step in a recipe for each dish we’re preparing, ensuring we don’t miss any important details. As we loop through the keys of our column_formats, we access both the name of each column and its format.

In the first part of the loop, we adjust the width of each column in our spreadsheet to 25 units, so that no data gets squished — this is like ensuring there’s enough room on our table for all the ingredients. The set_column function does this, applying the specific formatting provided in our blueprint.

Next, we use the write function to actually place the name of the column in the first row of the spreadsheet, still honoring that format we’ve specified. This action puts the header in place, much like putting the title on a recipe — it’s what tells us about the contents below.

As we continue to build this structured spreadsheet, we’re paving the way for our algorithmic trading to effectively analyze stock data, guiding our machine learning and deep learning efforts in making smart investment decisions.
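The f-strings in that loop build standard Excel address strings. A tiny standalone sketch, using a shortened, hypothetical column map and no actual workbook, shows the strings being generated:

```python
# Shortened, hypothetical column map; no workbook is opened here, we only
# demonstrate the Excel address strings the loop constructs for xlsxwriter.
column_formats = {
    'A': 'Ticker',
    'B': 'Price',
    'C': 'Number of Shares to Buy',
}

addresses = []
for column in column_formats:
    col_range = f'{column}:{column}'  # whole-column range passed to set_column
    header_cell = f'{column}1'        # row-1 cell passed to write
    addresses.append((col_range, header_cell))

print(addresses)  # [('A:A', 'A1'), ('B:B', 'B1'), ('C:C', 'C1')]
```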

Saving Our Excel Output

Just like before, saving our Excel output is simple and straightforward. Note that in recent versions of pandas, ExcelWriter's save() method has been replaced by close(), which writes the workbook to disk and releases the file:

writer.close()

And done. With this final step we save all of our progress and results to the Excel file, and we are done with the project.

If you want to download the source code and the entire dataset please use the url below:

Use button below to download the source code:
