DistilBERT

Introduction

This page explains how to use DistilBERT in LEAN trading algorithms. The model repository provides the following description:

The DistilBERT model was proposed in the blog post Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT, and the paper DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. DistilBERT is a small, fast, cheap and light Transformer model trained by distilling BERT base. It has 40% less parameters than bert-base-uncased, runs 60% faster while preserving over 95% of BERT's performances as measured on the GLUE language understanding benchmark. This model is a fine-tune checkpoint of DistilBERT-base-cased, fine-tuned using (a second step of) knowledge distillation on SQuAD v1.1.

Use Cases

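This DistilBERT checkpoint is an extractive question-answering model: given a question and a context passage, it returns the span of the context that most likely answers the question. The following use cases explain how you might utilize it in trading algorithms: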

  • Summarize the upcoming changes reported in regulatory articles in preparation for sentiment analysis.
  • Extract relevant information from news feeds like Tiingo and Benzinga while reducing noise.
  • Parse US SEC filings to find critical information about company performance, legal issues, or changes in management.

Load Pre-Trained Model

Follow these steps to load the pre-trained DistilBERT model:

  1. Import the pipeline.

    from transformers import pipeline

  2. Create a pipeline object with the DistilBERT model. In QuantConnect Cloud, the model path is distilbert/distilbert-base-cased-distilled-squad.

    question_answerer = pipeline(
        "question-answering", 
        model='distilbert/distilbert-base-cased-distilled-squad', 
        local_files_only=True
    )
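
Loading a Transformer model takes noticeably longer than answering a single question, so you may prefer to create the pipeline once and reuse it. The following is a minimal sketch of that idea, not part of the LEAN or transformers API: `get_question_answerer` and `MODEL_PATH` are hypothetical names, and the import is deferred into the function so the module can be imported before the model is needed.

```python
from functools import lru_cache

# Model path quoted above for QuantConnect Cloud.
MODEL_PATH = "distilbert/distilbert-base-cased-distilled-squad"


@lru_cache(maxsize=1)
def get_question_answerer():
    """Create the question-answering pipeline on first use and cache it.

    Hypothetical helper: later calls return the same pipeline object
    instead of loading the model again.
    """
    # Deferred import: transformers is only needed when the model is loaded.
    from transformers import pipeline

    return pipeline(
        "question-answering",
        model=MODEL_PATH,
        local_files_only=True,
    )
```

Calling `get_question_answerer()` a second time returns the cached pipeline rather than building a new one.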

Answer Questions

Follow these steps to answer questions with DistilBERT:

  1. Load the model.
  2. Define the context.

    context = """
    Introduced on 2024-11-01.
    The legislation introduces a flat corporate income tax rate, reducing the current graduated system 
    to 5.5% for taxable years starting from January 1, 2025, and further lowering it to 3.5% for years 
    beginning January 1, 2026. It establishes a bonus depreciation deduction for qualified property and 
    research expenditures, allowing immediate cost recovery. The bill also terminates various corporate 
    tax credits, exemptions, and deductions, including those related to motion picture production, 
    research and development, and historic structure rehabilitation, with most credits expiring after 
    June 30, 2025. Additionally, it repeals provisions for certain tax programs and credits, such as the
    corporate tax apportionment program and various incentives for businesses. The changes aim to simplify 
    corporate taxation and reduce the number of available tax credits, with the new provisions effective 
    for income tax periods starting January 1, 2025, and franchise tax periods beginning January 1, 2026. 
    The legislation will take effect upon the governor's signature or after the designated period for 
    gubernatorial action.
    """

  3. Create the question.

    question = "What do these changes accomplish?"

  4. Pass the question and context to the pipeline.

    result = question_answerer(question=question, context=context)

    The following dictionary is an example result:

    {
        'score': 0.19066356122493744,
        'start': 820, 
        'end': 895, 
        'answer': 'simplify corporate taxation and reduce the number of available tax credits'
    }
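
In the result, `score` is the model's confidence in the extracted span (between 0 and 1), `start` and `end` are character offsets into the context, and `answer` is the extracted text. A score of about 0.19, as in the example, is fairly low, so you may want to check the score before acting on an answer. The following is a minimal sketch of such a check: `accept_answer` is a hypothetical helper, the `min_score` default is an arbitrary illustrative value, and the sample inputs are hand-built rather than real model output.

```python
def accept_answer(result, context, min_score=0.15):
    """Return the answer text if the result passes basic checks, else None.

    `result` is a dict shaped like the example result; `min_score` is an
    illustrative threshold, not a recommended value.
    """
    # Reject low-confidence answers.
    if result["score"] < min_score:
        return None
    # Reject offsets that cannot refer to this context string.
    if not 0 <= result["start"] < result["end"] <= len(context):
        return None
    return result["answer"]


# Hand-built sample inputs (assumed values, not produced by the model).
sample_context = "The bill lowers the corporate tax rate to 3.5% in 2026."
start = sample_context.index("3.5%")
sample_result = {
    "score": 0.42,
    "start": start,
    "end": start + len("3.5%"),
    "answer": "3.5%",
}
print(accept_answer(sample_result, sample_context))
```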

Examples
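
As an illustration of the use cases above, the following sketch asks several questions about one piece of text and keeps only the answers that clear a confidence threshold. It is a sketch under stated assumptions rather than a complete LEAN algorithm: `answer_questions`, `min_score`, and the stand-in `fake_question_answerer` are hypothetical. The stand-in only mimics the call signature and result shape of the pipeline, so the block runs without the model.

```python
def answer_questions(question_answerer, context, questions, min_score=0.1):
    """Ask each question against the context and keep confident answers.

    `question_answerer` is any callable that accepts `question` and
    `context` keyword arguments and returns a dict with 'score' and
    'answer' keys, such as the pipeline object created earlier.
    """
    answers = {}
    for question in questions:
        result = question_answerer(question=question, context=context)
        if result["score"] >= min_score:
            answers[question] = result["answer"]
    return answers


def fake_question_answerer(question, context):
    """Stand-in for the real pipeline (assumption, for illustration only)."""
    if "rate" in question:
        return {"score": 0.8, "start": 0, "end": 4, "answer": "5.5%"}
    return {"score": 0.01, "start": 0, "end": 1, "answer": "?"}


answers = answer_questions(
    fake_question_answerer,
    context="5.5% is the new flat corporate income tax rate.",
    questions=["What is the new tax rate?", "Who sponsored the bill?"],
)
print(answers)
```

In an algorithm, you would pass the real `question_answerer` pipeline in place of the stand-in and choose `min_score` based on how the model behaves on your own data.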
