This week, I started rewriting my day trading bot from scratch. The original bot was created in 2019 and has not aged well, due to the many additions I've made in a hurry. On the other hand, it's been a steady money maker, even through rough conditions, thanks to a conservative strategy. When TD Ameritrade's API stopped working (the company was bought by Schwab), I decided to use this opportunity to give the bot some more attention while I switch to the TradeStation API.
Repo: alhankeser/borsa
Many of the transformations I need work quite well as SQL queries, and it's very possible that this would run faster in SQL than in pandas, mainly for backtesting. As for live day trading, I don't need blazing speed since I'm only making a trading decision once a minute.
Also, I feel like it: I want to better understand DuckDB's use cases and its drawbacks.
In my first week of rewriting the bot, I've realized that I can do nearly everything needed in dbt, with the help of some Python (or language of choice) as a controller. I almost thought I could get away with no loops of any kind, but found myself needing a recursive macro that generated CTEs. The reason for this was the need to be able to backtest buy decisions where the sell event depended on conditions related to the buy price (think trailing stops). It's easy to do when there's only one trade in a day, but there could be multiple trades and figuring out when the next buy event takes place and re-doing the same analysis made my brain melt. So recurse, I did.
While it wasn't as elegant a solution as I was hoping for, it is light-years faster than the many loops I was using in my original pandas approach. Here is the macro in question at this moment in time. It takes a list of potential buys (as suggested by trading strategies), buys on the first one, and holds until the first sell event. It then calls the same macro again to create a subsequent CTE that is filtered to the timestamps after the most recent sell date. We do that a maximum of 3 times, as that is my current limit on the number of trades per day (arbitrary at the moment, for testing purposes).
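For illustration, here's roughly what that recursion computes, sketched in plain Python. The function and parameter names are hypothetical, and the trailing stop here is a simple fixed percentage rather than the dynamic stops described later:

```python
def backtest_day(candidates, prices, max_trades=3, trail_pct=0.01):
    """Sketch of the CTE recursion: buy at the first candidate timestamp,
    hold until a trailing stop fires, then repeat on later timestamps.

    candidates: timestamps where a strategy says "buy"
    prices: dict mapping timestamp -> price (one entry per minute)
    """
    timestamps = sorted(prices)
    trades = []
    cutoff = None  # only consider candidates after the most recent sell
    for _ in range(max_trades):
        buy_ts = next(
            (ts for ts in sorted(candidates) if cutoff is None or ts > cutoff),
            None,
        )
        if buy_ts is None:
            break
        buy_price = prices[buy_ts]
        peak = buy_price
        # fall back to selling at the close if the stop never fires
        sell_ts, sell_price = timestamps[-1], prices[timestamps[-1]]
        for ts in timestamps:
            if ts <= buy_ts:
                continue
            peak = max(peak, prices[ts])
            if prices[ts] <= peak * (1 - trail_pct):  # trailing stop fired
                sell_ts, sell_price = ts, prices[ts]
                break
        trades.append((buy_ts, buy_price, sell_ts, sell_price))
        cutoff = sell_ts
    return trades
```

The SQL version unrolls this loop: each recursive macro call emits one more CTE, filtered to rows after the previous sell timestamp.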
My strategies are quite simple: they are a set of conditions under which I buy. My sell conditions are all based around open profit and some dynamic trailing stops (not yet implemented in the latest version). At first, I thought I was going to build these through Python like I did the first time around, but it turns out that passing variables to dbt via dbt_project.yml vars works as expected for dictionary-like objects. The below works, which means I could call a dbt run from anywhere and override the variable:
# dbt_project.yml
vars:
  strategies: [
    {
      id: "test1",
      buy: "(sma0 > sma5)"
    },
    {
      id: "test2",
      buy: "(sma0 > sma5 and sma5 > sma10)"
    }
  ]
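For example, a controller script could override the variable per run using dbt's --vars flag, which accepts a YAML/JSON string. The strategy values here are made up:

```python
import json
import subprocess  # used only if you actually invoke dbt below

# Hypothetical one-off strategy, overriding the dbt_project.yml default
# without editing the file.
strategies = [{"id": "test3", "buy": "(sma0 > sma20)"}]

cmd = [
    "dbt", "run",
    "--vars", json.dumps({"strategies": strategies}),
]
# subprocess.run(cmd, check=True)  # uncomment to run against the project
```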
Then in a macro that unions all strategies, I can do this:
{% macro union_strategies(strategies=var("strategies"), model=ref("int__indicators")) %}
(
    {% for strategy in strategies %}
    select
        symbol,
        ts,
        '{{ strategy['id'] }}' as strategy_id,
        {{ strategy['buy'] }} as buy,
    from {{ model }}
    {% if not loop.last %}
    union all
    {% endif %}
    {% endfor %}
) unioned_strategies
{% endmacro %}
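To make the rendered output concrete, here is a small Python sketch that mimics what the Jinja loop produces for two strategies. It's pure string-building, no dbt involved, and the model name is just the macro's default:

```python
strategies = [
    {"id": "test1", "buy": "(sma0 > sma5)"},
    {"id": "test2", "buy": "(sma0 > sma5 and sma5 > sma10)"},
]

def union_strategies(strategies, model="int__indicators"):
    # One SELECT per strategy, glued together with UNION ALL,
    # mirroring the Jinja for-loop above.
    selects = [
        f"select symbol, ts, '{s['id']}' as strategy_id, {s['buy']} as buy from {model}"
        for s in strategies
    ]
    return "(\n" + "\nunion all\n".join(selects) + "\n) unioned_strategies"

sql = union_strategies(strategies)
```

Every strategy gets evaluated against the same indicators model, with its buy expression compiled directly into the SELECT list.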
Implementing the same logic in an easy-to-test way in ordinary code would be much more involved. As I progressed through the various pieces of this project in SQL, I realized that SQL is actually one of the better options for such a project, and I don't feel like I'm shoe-horning the trading bot into dbt and DuckDB.
Most of my time has gone toward finding a quick and easy way to visualize the backtest results. It's challenging because I need to see not only the overall results of each strategy, but also individual days, so I can tune the rules. Comparing against an ML model will be a future step, as opposed to all of this manual work. To get started, though, I'd like to identify a baseline strategy that works decently well.
I played around with both Plotly Dash and Vega. While I quickly got both working to do simple things, I was not able to get to a more advanced stage in a reasonable amount of time, so I cut my losses and went back to what I can dependably create complex visuals with: good ol' matplotlib. It's slow to generate plots, but fast for me to develop on and what's holding me back right now is not the performance of my app, but instead the time I have to dedicate to it.
I added an "optimal" strategy by default to compare backtest results against. That way, I can see how much each strategy is leaving on the table. To create this optimal strategy, I'm using the SQL version of finding the maximum subarray difference, where the earlier value must be the lower one (buy low first, sell high later):
optimal_strategy as (
select
i1.symbol,
i1.ts_day,
i1.price as best_buy_price,
i1.ts as best_buy_ts,
i1.minutes_since_open as best_buy_minutes_since_open,
i2.price as best_sell_price,
i2.ts as best_sell_ts,
i2.minutes_since_open as best_sell_minutes_since_open,
i2.price - i1.price as best_price_diff,
from indicators as i1
inner join
indicators as i2
on i2.ts > i1.ts
and i2.ts_day = i1.ts_day
and i2.symbol = i1.symbol
where i2.price - i1.price > 0
and i1.minutes_since_open >= {{ var("buy_after_minutes") }}
qualify
row_number() over (
partition by i1.symbol, i1.ts_day
order by i2.price - i1.price desc
)
= 1
)
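As a sanity check on that self-join, the same "optimal" pair can be computed in a single pass over one day's prices (the classic best-time-to-buy-and-sell-stock scan). Here's a Python sketch; the buy_after_minutes filter is omitted for brevity:

```python
def optimal_trade(prices):
    """Single-pass equivalent of the SQL self-join: find the buy/sell pair
    (buy strictly before sell; one symbol and one day assumed) that
    maximizes sell - buy. O(n) instead of the join's O(n^2).

    prices: list of (ts, price) tuples in chronological order.
    Returns (buy_ts, buy_price, sell_ts, sell_price, diff), or None when
    no profitable pair exists.
    """
    best = None
    min_ts, min_price = None, float("inf")
    for ts, price in prices:
        if price < min_price:
            min_ts, min_price = ts, price  # new candidate buy point
        diff = price - min_price
        if diff > 0 and (best is None or diff > best[4]):
            best = (min_ts, min_price, ts, price, diff)
    return best
```

The self-join reads more naturally in a dbt model, but the linear scan is handy for spot-checking the SQL results on a single day.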
At some point last year, TD Ameritrade accounts got migrated to Charles Schwab. Despite promises to the contrary, one of the victims of this acquisition was the TD Ameritrade API, which was one of the better "free" options available. Last I checked, Schwab had not re-enabled the API for the common people. (Update: I got an email in late March saying that it was again available for non-business users.)
That's where TradeStation comes in. I've been eyeing the platform for a while as it seems simpler and they have a documented API, which is a nice change from TD Ameritrade.