Backtest

Overall Statistics
Total Orders: 159
Average Win: 1.56%
Average Loss: -1.19%
Compounding Annual Return: 97.424%
Drawdown: 23.200%
Expectancy: 0.139
Start Equity: 100000
End Equity: 125458.52
Net Profit: 25.459%
Sharpe Ratio: 1.55
Sortino Ratio: 2.627
Probabilistic Sharpe Ratio: 59.880%
Loss Rate: 51%
Win Rate: 49%
Profit-Loss Ratio: 1.31
Alpha: 0.537
Beta: -0.169
Annual Standard Deviation: 0.349
Annual Variance: 0.122
Information Ratio: 1.03
Tracking Error: 0.544
Treynor Ratio: -3.202
Total Fees: $792.20
Estimated Strategy Capacity: $15000000.00
Lowest Capacity Asset: TSLA UNU3P8Y3WFAD
Portfolio Turnover: 258.68%

# region imports
from AlgorithmImports import *
# endregion

class SmoothBlueBull(QCAlgorithm):

    def Initialize(self):
        self.SetStartDate(2023, 11, 1)
        self.SetEndDate(2024, 3, 1)
        self.SetCash(100000)
        self.tsla = self.AddEquity("TSLA", Resolution.Minute)

    def OnData(self, data: Slice):
        if not self.Portfolio.Invested:
            self.SetHoldings(self.tsla.Symbol, 1)

#region imports
from AlgorithmImports import *
#endregion

# - Starting in the Research Environment (research.ipynb), we gather the data we need.
# - We subscribe to all of the assets that we want to analyze. In this case, we use TSLA.
# - We then subscribe to the TiingoNews data of TSLA, which provides news articles that feature TSLA.
# - We gather the TiingoNews articles from 2023/11/01 - 2024/03/01.
# - We choose these months because they are the most recent articles and we limit it to 5 months to stay within our OpenAI quota.
# - In the notebook, we next group the articles by date and by hour.
# - We exclude duplicated articles (TiingoNews articles that have the same title as one of the past 100 article titles).
# - The following image shows the number of articles per day and the cumulative number of articles during the time period.
# - The following image shows the number of articles per hour and the cumulative number of articles during the time period.
# - Next, we iterate through each hour of articles and ask GPT4, through the OpenAI API, to summarize all the articles released during that hour as a single sentiment score between -10 and +10.
# - Here is the prompt we use:
# - "Article <i> title: <title>
# - Article <i> description: <Description>
# - . . .
# - Review the news titles and descriptions above and then create an aggregated sentiment score which represents the emotional positivity towards TSLA after seeing all of the news articles. -10 represents extreme negative sentiment, +10 represents extreme positive sentiment, and 0 represents neutral sentiment. Reply ONLY with the numerical value in JSON format. For example, `{ "sentiment-score": 0 }`"
# - We gather all of the sentiment scores into a DataFrame and then save the results as a CSV into the ObjectStore.
# -
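The de-duplication, prompt-building, and reply-parsing steps described above could be sketched as follows. This is a minimal sketch, not the notebook's actual code: the helper names (`drop_recent_duplicates`, `build_prompt`, `parse_score`) and the dictionary layout of an article are hypothetical, and the OpenAI API call itself is deliberately left out.

```python
import json
from collections import deque

# Instruction text copied from the prompt quoted in the notes above.
INSTRUCTION = (
    "Review the news titles and descriptions above and then create an "
    "aggregated sentiment score which represents the emotional positivity "
    "towards TSLA after seeing all of the news articles. -10 represents "
    "extreme negative sentiment, +10 represents extreme positive sentiment, "
    "and 0 represents neutral sentiment. Reply ONLY with the numerical value "
    'in JSON format. For example, `{ "sentiment-score": 0 }`'
)


def drop_recent_duplicates(articles, window=100):
    """Drop articles whose title matches one of the last `window` kept titles.

    Assumption: only titles of kept articles enter the window.
    """
    recent = deque(maxlen=window)
    kept = []
    for article in articles:
        title = article["title"]
        if title in recent:
            continue
        recent.append(title)
        kept.append(article)
    return kept


def build_prompt(articles):
    """Number each article's title and description, then append the instruction."""
    lines = []
    for i, article in enumerate(articles, start=1):
        lines.append(f"Article {i} title: {article['title']}")
        lines.append(f"Article {i} description: {article['description']}")
    lines.append(INSTRUCTION)
    return "\n".join(lines)


def parse_score(reply):
    """Read the score from a JSON reply and check it lies in [-10, 10]."""
    score = float(json.loads(reply)["sentiment-score"])
    if not -10 <= score <= 10:
        raise ValueError(f"sentiment score out of range: {score}")
    return score
```

Validating the range in `parse_score` is a design choice of this sketch: it lets a malformed or out-of-range model reply fail loudly instead of silently ending up in the CSV.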
# - Now, we transition to the backtesting environment to build an algorithm that uses the data we just created.
# - In the main.py file, we define a custom data class that reads the CSV data from the Object Store and injects it into the algorithm.
# - We apply a RateOfChange indicator to the data set so that we can observe the direction of sentiment changes.
# - The trading logic is:
#   - If sentiment is flat/increasing and not already long, long.
#   - If sentiment is negative and sentiment is decreasing and not already short, short.
# - The algorithm achieves a 1.695 Sharpe ratio.
# - In contrast, the benchmark (buy and hold TSLA) achieves a -0.06 Sharpe ratio.
# - Therefore, the strategy outperforms the benchmark in terms of risk-adjusted returns.
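The rate-of-change signal and the two trading rules from the notes above can be illustrated with plain Python. This is a simplified sketch under stated assumptions: `rate_of_change` and `target_weight` are hypothetical helpers, and the formula is the textbook rate of change rather than a copy of the LEAN indicator's internals.

```python
def rate_of_change(values, period=2):
    """Relative change between the latest value and the one `period` samples ago.

    Assumption: returns 0.0 while there is too little history or when the
    past value is zero (to avoid dividing by zero).
    """
    if len(values) <= period:
        return 0.0
    past = values[-1 - period]
    if past == 0:
        return 0.0
    # Note: when `past` is negative, the sign of the result is flipped
    # relative to the direction of the raw change.
    return (values[-1] - past) / past


def target_weight(sentiment, roc, is_long, is_short):
    """Return 1 (go long), -1 (go short), or None (leave the position as is)."""
    # Flat or increasing sentiment and not already long: go long.
    if roc >= 0 and not is_long:
        return 1
    # Negative and decreasing sentiment and not already short: go short.
    if sentiment < 0 and roc < 0 and not is_short:
        return -1
    return None


# Usage: hourly sentiment scores, oldest first.
scores = [4, 3, 2]
signal = target_weight(scores[-1], rate_of_change(scores), is_long=True, is_short=False)
```

In the usage lines, sentiment is falling but still positive, so neither rule fires and the existing long position is kept.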

# region imports
from AlgorithmImports import *
# endregion


class LLMSummarizationAlgorithm(QCAlgorithm):
    """
    This algorithm demonstrates how to load in the sentiment scores from
    the Object Store and use them to inform trading decisions. Before 
    you can run this algorithm, run the cells in the `research.ipynb` 
    file.
    """

    def initialize(self):
        self.set_start_date(2023, 11, 1)
        self.set_end_date(2024, 3, 1)
        self.set_cash(100_000)

        self._tsla = self.add_equity("TSLA")
        self._dataset_symbol = self.add_data(
            TiingoNewsSentiment, "TiingoNewsSentiment", Resolution.HOUR
        ).symbol
        self._roc = self.roc(self._dataset_symbol, 2)

        self.set_benchmark(self._tsla.symbol)

    def on_data(self, data):
        # Get the current sentiment.
        if self._dataset_symbol not in data:
            return
        sentiment = data[self._dataset_symbol].value
        
        # If the market isn't open right now, do nothing.
        if not self.is_market_open(self._tsla.symbol):
            return

        self.plot(
            "Sentiment", "Change in OpenAI Sentiment", self._roc.current.value
        )

        # If sentiment is flat/increasing and not already long, long.
        # The condition to buy here doesn't include `sentiment >= 0`
        # because excluding it allows the algorithm to buy when 
        # sentiment is down but relatively OK/good (since sentiment
        # is flat/increasing). If you wait for sentiment to be 
        # positive, it'll be too late to benefit from the reversion
        # in price upwards following a crash from negative news. 
        if self._roc.current.value >= 0 and not self._tsla.holdings.is_long:
            self.set_holdings(self._tsla.symbol, 1)
        # If sentiment is negative and sentiment is decreasing and not 
        # already short, short.
        elif (sentiment < 0 and 
            self._roc.current.value < 0 and 
            not self._tsla.holdings.is_short):
            self.set_holdings(self._tsla.symbol, -1)


class TiingoNewsSentiment(PythonData):

    def get_source(self, config, date, is_live):
        return SubscriptionDataSource(
            f"tiingo-{date.strftime('%Y-%m-%d')}.csv", 
            SubscriptionTransportMedium.OBJECT_STORE, 
            FileFormat.CSV
        )

    def reader(self, config, line, date, is_live):
        # Skip blank lines and the header line (which starts with a comma).
        if not line.strip() or line[0] == ",":
            return None
        
        # Parse the CSV line into a list.
        data = line.split(',')

        # Construct the new sentiment datapoint.
        t = TiingoNewsSentiment()
        t.symbol = config.symbol
        t.time = date.replace(hour=int(data[0]), minute=0, second=0)
        t.end_time = t.time + timedelta(hours=1)
        t.value = float(data[1])
        t["sentiment"] = t.value
        t["volume"] = float(data[2])

        return t
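For reference, the file layout that `get_source` and `reader` above appear to expect can be sketched with the standard library alone. The layout is inferred from the reader (a header starting with a comma, then `hour,sentiment,volume` rows, one file per day); the helper names are hypothetical, and treating `volume` as the number of articles in the hour is an assumption.

```python
import csv
import io
from datetime import datetime, timedelta


def key_for(date):
    """Object Store key for one day, matching the pattern used in get_source."""
    return f"tiingo-{date.strftime('%Y-%m-%d')}.csv"


def daily_csv(rows):
    """Serialize (hour, sentiment, volume) rows; the header starts with a comma."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["", "sentiment", "volume"])
    writer.writerows(rows)
    return buffer.getvalue()


def parse_line(line, date):
    """Standalone mirror of the reader logic, returning a plain dict."""
    # Skip blank lines and the header line.
    if not line.strip() or line.startswith(","):
        return None
    hour, sentiment, volume = line.split(",")[:3]
    start = date.replace(hour=int(hour), minute=0, second=0)
    return {
        "time": start,
        "end_time": start + timedelta(hours=1),
        "sentiment": float(sentiment),
        "volume": float(volume),
    }


# Usage: write one day of scores, then read the rows back.
day = datetime(2023, 11, 1)
text = daily_csv([(9, 3, 4), (10, -2.5, 1)])
points = [p for p in (parse_line(l, day) for l in text.splitlines()) if p]
```

Round-tripping a small file like this is a cheap way to check that the notebook's output and the custom data class agree on column order before running a backtest.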