DistilBERT
Introduction
This page explains how to use DistilBERT in LEAN trading algorithms. The model repository provides the following description:
The DistilBERT model was proposed in the blog post Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT, and the paper DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. DistilBERT is a small, fast, cheap and light Transformer model trained by distilling BERT base. It has 40% less parameters than bert-base-uncased, runs 60% faster while preserving over 95% of BERT's performances as measured on the GLUE language understanding benchmark. This model is a fine-tune checkpoint of DistilBERT-base-cased, fine-tuned using (a second step of) knowledge distillation on SQuAD v1.1.
Use Cases
This DistilBERT checkpoint is an extractive question-answering model: given a question and a context passage, it returns the span of the context most likely to contain the answer. The following use cases explain how you might utilize it in trading algorithms:
- Summarize the upcoming changes reported through regulatory articles in preparation for sentiment analysis.
- Extract relevant information from news feeds like Tiingo and Benzinga while reducing noise.
- Parse US SEC filings to find critical information about company performance, legal issues, or changes in management.
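As a rough sketch of the extraction use cases above, the helper below asks several questions about one document and collects the raw results. The function name `ask_questions`, the field labels, the sample text, and the `stub_qa` stand-in are all hypothetical. In an algorithm, you would pass the `question_answerer` pipeline from the Load Pre-Trained Model section instead of the stub.

```python
from typing import Callable, Dict


def ask_questions(qa: Callable[..., dict], context: str,
                  questions: Dict[str, str]) -> Dict[str, dict]:
    """Ask each labeled question about the same context.

    `qa` is any callable that accepts `question` and `context` keyword
    arguments and returns a dict with 'score', 'start', 'end', and 'answer',
    which is the shape of the example result on this page.
    """
    return {label: qa(question=question, context=context)
            for label, question in questions.items()}


def stub_qa(question: str, context: str) -> dict:
    """Hypothetical stand-in for the real pipeline so the sketch runs offline.

    It ignores the question and always returns the first sentence.
    """
    end = context.find(".") + 1 or len(context)
    return {"score": 0.5, "start": 0, "end": end, "answer": context[:end]}


# Made-up text for illustration only.
filing = "The CEO resigned on Monday. Revenue rose 4% year over year."
questions = {
    "management": "Who left the company?",
    "performance": "How did revenue change?",
}
results = ask_questions(stub_qa, filing, questions)
for label, result in results.items():
    print(label, result["answer"], round(result["score"], 2))
```

Keeping the pipeline as a parameter lets you exercise the surrounding logic without loading the model, then swap in the real pipeline unchanged.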
Load Pre-Trained Model
Follow these steps to load the pre-trained DistilBERT model:
- Import the pipeline.
- Create a pipeline object with the DistilBERT model.
from transformers import pipeline
In QuantConnect Cloud, the path is distilbert/distilbert-base-cased-distilled-squad.
question_answerer = pipeline(
    "question-answering",
    model="distilbert/distilbert-base-cased-distilled-squad",
    local_files_only=True
)
Answer Questions
Follow these steps to answer questions with DistilBERT:
- Load the model.
- Define the context.
- Create the question.
- Pass the question and context to the pipeline.
context = """ Introduced on 2024-11-01. The legislation introduces a flat corporate income tax rate, reducing the current graduated system to 5.5% for taxable years starting from January 1, 2025, and further lowering it to 3.5% for years beginning January 1, 2026. It establishes a bonus depreciation deduction for qualified property and research expenditures, allowing immediate cost recovery. The bill also terminates various corporate tax credits, exemptions, and deductions, including those related to motion picture production, research and development, and historic structure rehabilitation, with most credits expiring after June 30, 2025. Additionally, it repeals provisions for certain tax programs and credits, such as the corporate tax apportionment program and various incentives for businesses. The changes aim to simplify corporate taxation and reduce the number of available tax credits, with the new provisions effective for income tax periods starting January 1, 2025, and franchise tax periods beginning January 1, 2026. The legislation will take effect upon the governor's signature or after the designated period for gubernatorial action. """
question = "What do these changes accomplish?"
result = question_answerer(question=question, context=context)
The following dictionary is an example result:
{ 'score': 0.19066356122493744, 'start': 820, 'end': 895, 'answer': 'simplify corporate taxation and reduce the number of available tax credits' }
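One way to consume this dictionary, sketched under assumptions: treat `score` as a confidence value and ignore answers below a cutoff, and treat `start` and `end` as character offsets into the context so you can pull out the surrounding text. The helper names, the 0.15 cutoff, and the margin are illustrative choices, not values from the model repository, and the sample dictionary is hand-built rather than model output.

```python
from typing import Optional


def confident_answer(result: dict, min_score: float = 0.15) -> Optional[str]:
    """Return the answer text, or None when the score is below the cutoff."""
    if result["score"] < min_score:
        return None
    return result["answer"].strip()


def surrounding_text(result: dict, context: str, margin: int = 40) -> str:
    """Return the answer span plus up to `margin` characters on each side.

    Assumes 'start' and 'end' are character offsets into `context`.
    """
    lo = max(0, result["start"] - margin)
    hi = min(len(context), result["end"] + margin)
    return context[lo:hi]


# Hand-built example in the same shape as the result above (not model output).
context = "The changes aim to simplify corporate taxation and reduce credits."
answer = "simplify corporate taxation"
start = context.index(answer)
example = {"score": 0.19, "start": start, "end": start + len(answer), "answer": answer}

print(confident_answer(example))                 # passes the 0.15 cutoff
print(confident_answer(example, min_score=0.5))  # None: below the cutoff
print(surrounding_text(example, context, margin=10))
```

In an algorithm, a `None` return could mean "skip this article" rather than acting on a low-confidence answer. The right cutoff depends on your data and is worth tuning.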