Since 2018, I have been learning machine learning and its application in algorithmic trading. It is not an easy path, but I have found some books that have been very useful to me. One of them is Marcos López de Prado's book "Advances in Financial Machine Learning".
Each chapter is an opportunity to question the basis of many analyses. For example, at the very beginning, chapter 2 discusses bars that are not typical but may be useful, especially when a normal distribution of the data is assumed. Some of the types of bars addressed in the chapter are: tick bars, volume bars, dollar bars, and information-driven bars.
These types of bars are used even in technical trading, although they are not as well known. In the book we find a rigorous presentation of these bars from the mathematical point of view. In other chapters we even find code examples that help us implement the ideas.
Going through each chapter and programming each idea from scratch is a good way to achieve a better understanding of the concepts, but the truth is that there are already many GitHub repositories that have tried to program them (the book has become very popular). For me, the best implementation of the concepts of the book has been achieved by the "Hudson and Thames" team. These guys are doing an excellent job, and it is not limited to the ideas of López de Prado. Thanks to the programming environment we enjoy in QuantConnect, it is possible to load the library created by the "Hudson and Thames" team, called mlfinlab.
I would like to make a small note here regarding the QuantConnect environment. It's not easy for someone who is starting, like me, to get all the necessary pieces to enter the world of algorithmic trading. First, it is a great problem to obtain financial data that is useful for predictive analysis. For a private individual, the costs of obtaining the data are prohibitive. Then there is the fact that developing a backtesting platform is not easy, much less a live trading platform. And finally, there is the point of being able to load libraries that allow us to make personalized analyses. Other platforms do not allow loading TensorFlow, for example, or other famous deep learning libraries, which limits research in the field of machine learning. The only site I have found where all these problems are solved is QuantConnect.
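To make the idea of a volume bar concrete, here is a minimal sketch in plain Python. It is not the mlfinlab implementation and not code from the book; the `Bar` class, the `volume_bars` function, and the sample ticks are hypothetical names and values chosen for illustration. The sketch assumes each tick is a (price, size) pair and closes a bar as soon as the accumulated volume reaches a threshold.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Bar:
    open: float
    high: float
    low: float
    close: float
    volume: float


def volume_bars(ticks: Iterable[Tuple[float, float]], threshold: float) -> List[Bar]:
    """Group (price, size) ticks into bars that close once the
    accumulated volume reaches `threshold`. (Hypothetical helper.)"""
    bars: List[Bar] = []
    prices: List[float] = []
    cum_volume = 0.0
    for price, size in ticks:
        prices.append(price)
        cum_volume += size
        if cum_volume >= threshold:
            bars.append(Bar(prices[0], max(prices), min(prices), prices[-1], cum_volume))
            # Start a fresh bar; any overshoot stays in the bar just closed.
            prices = []
            cum_volume = 0.0
    # Ticks left over after the last full bar are discarded in this sketch.
    return bars


# Made-up ticks, for illustration only.
ticks = [(10.0, 30), (10.5, 50), (10.2, 40), (9.9, 100), (10.1, 10)]
bars = volume_bars(ticks, threshold=100)
for bar in bars:
    print(bar)
```

Replacing `size` with `price * size` in the accumulator would give dollar bars, and counting ticks instead would give tick bars.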
Returning to the topic of this post, in the attached backtest you will find an example of an algorithm that shows how to use mlfinlab to create volume bars. You will also find a Jupyter notebook with basic tests to compare the normality of various types of bars: time bars, tick bars, volume bars and dollar bars. The notebook is based on the work of Jacques Joubert of the "Hudson and Thames" team.
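As a rough illustration of what comparing the normality of bar types can look like, the sketch below computes the Jarque-Bera statistic on log returns in plain Python. Using Jarque-Bera here is an assumption, not necessarily the test used in the attached notebook, and the function names and sample closes are hypothetical. A lower statistic means the returns are closer to a normal distribution.

```python
import math
from typing import List, Sequence


def log_returns(closes: Sequence[float]) -> List[float]:
    """Log returns between consecutive bar closes."""
    return [math.log(b / a) for a, b in zip(closes, closes[1:])]


def jarque_bera(values: Sequence[float]) -> float:
    """Jarque-Bera statistic: n/6 * (S^2 + (K - 3)^2 / 4),
    where S is the sample skewness and K the sample kurtosis."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)


# Made-up closing prices per bar type, for illustration only.
closes_by_bar_type = {
    "time": [100.0, 100.4, 99.1, 99.3, 101.9, 101.7, 101.8, 98.9],
    "volume": [100.0, 100.5, 100.1, 100.7, 100.2, 100.9, 100.4, 101.0],
}
for name, closes in closes_by_bar_type.items():
    print(name, round(jarque_bera(log_returns(closes)), 4))
```

With real data, each list of closes would come from the corresponding bar series built over the same period.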
Greetings.
John Radosta
Interesting library. I noticed the functions output a CSV though, which wouldn't really work for event-driven trading (t=0). Do you think this library could be used as a consolidator? Are there functions that just return a DataFrame instead of a CSV?
Brian Christopher
John Radosta QC is updating a lot of their backend packages, including MlFinLab, which is on version 0.9 or so now. It doesn't require a CSV as input or output; it can take a pandas DataFrame as input and return a new DataFrame.
Derek Melchin
Hi John,
We currently support version 0.4.1 of mlfinlab. Unfortunately, the source code for this version shows that get_volume_bars only accepts a CSV pointer, not a DataFrame. We are going to upgrade to 0.9.3, which does support the ability to pass in a DataFrame. For future reference, all our machine learning libraries are listed with their respective version number in the Machine Learning section of our documentation.
Best,
Derek
The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by QuantConnect. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. QuantConnect makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances. All investments involve risk, including loss of principal. You should consult with an investment professional before making any investment decisions.
Dirk bothof
When can the upgrade of mlfinlab be expected? I have been reading their docs this morning and I have to say I'm excited!
Dirk bothof
So guys, I checked out the documentation and it is not consistent:
https://www.quantconnect.com/docs/algorithm-reference/machine-learning
shows that QuantConnect is running 0.4.3, and
https://www.quantconnect.com/docs/key-concepts/supported-libraries
says 0.9.3. I'll investigate a little this weekend, but calling .__version__ on mlfinlab does not work.
Jared Broad
Thanks Dirk. I've updated the machine learning section.
Kevin Baker
I got an error when trying to run the example above.
Being a beginner, I just tried changes until it appeared to work, for example, changing the AddEquity to Hour resolution.
I pass the DataFrame into MLFinLab instead of the CSV. I used a hack to add some plotting: made OnData continue to plot after the Tick History is captured and fed to MLFinLab. Ignore the dates when looking at the Volume Levels chart.
Question: Why does it only seem to work when AddEquity is set to Hour resolution?
Observation: There are some huge numbers in the Volume Bar data returned by MLFinLab. Perhaps they are start of day/end of day numbers that should be weeded out and treated differently.
Gahl Goziker
Hi Kevin,
There is a known issue where history requests for tick-resolution data in Python raise a runtime error.
We are actively working to resolve this issue, and you can track progress here.
Best regards,
Gahl Goziker
.ekz.
Has anyone found any viable workarounds for tick-data?
Derek Melchin
Hi .ekz.,
To get tick data, request a list of Tick objects instead of a DataFrame.
For more information, see History Requests.
Best,
Derek Melchin
.ekz.
Brilliant. Thanks Derek Melchin
Lars Klawitter
hi,
I've recently subscribed to the MLFinLab library with Hudson & Thames and have come across this post from 2 years ago.
I've had a go at plugging tick bars into MLFinLab to get dollar bars, using Enzo's code as a starting point - many thanks for sharing the example!
I tried to aim at 50 dollar bars per trading day, as suggested by López de Prado in "Advances in Financial Machine Learning".
To that end I'm calculating the last 5 days' average daily dollar volume and keeping a running total in OnData until 1/50 of the average daily dollar volume is reached, and then creating the bar.
Whilst the logic works as intended, it seems awfully clunky and I'm sure there is a more elegant way of achieving this.
As far as I understand the MLFinLab integration via auth token is a fairly recent thing, so I'd be interested whether other QC community members are also experimenting with the current version of MLFinLab and whether there are any examples/best practises anybody would want to share in regards to the integration into QC.
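For readers following along, the running-total logic described above can be sketched in plain Python as follows. This is not Lars's code and does not use the MLFinLab or QuantConnect APIs; `dollar_bar_threshold`, `DollarBarBuilder`, and the sample figures are hypothetical. It assumes the threshold is the 5-day average daily dollar volume divided by the target of 50 bars per day.

```python
from typing import Dict, List, Optional, Sequence


def dollar_bar_threshold(daily_dollar_volumes: Sequence[float], bars_per_day: int = 50) -> float:
    """Average daily dollar volume divided by the target number of bars per day."""
    average = sum(daily_dollar_volumes) / len(daily_dollar_volumes)
    return average / bars_per_day


class DollarBarBuilder:
    """Accumulates ticks and emits a bar once the traded dollar value
    reaches the threshold. (Hypothetical helper for illustration.)"""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self._prices: List[float] = []
        self._volume = 0.0
        self._dollars = 0.0

    def update(self, price: float, size: float) -> Optional[Dict[str, float]]:
        """Feed one tick; returns a finished bar or None."""
        self._prices.append(price)
        self._volume += size
        self._dollars += price * size
        if self._dollars < self.threshold:
            return None
        bar = {
            "open": self._prices[0],
            "high": max(self._prices),
            "low": min(self._prices),
            "close": self._prices[-1],
            "volume": self._volume,
            "dollar_volume": self._dollars,
        }
        # Reset the running totals for the next bar.
        self._prices, self._volume, self._dollars = [], 0.0, 0.0
        return bar


# Made-up dollar volumes for the last 5 days, for illustration only.
threshold = dollar_bar_threshold([1_000_000, 1_200_000, 800_000, 1_100_000, 900_000])
builder = DollarBarBuilder(threshold)
results = [builder.update(p, s) for p, s in [(100.0, 120), (101.0, 100), (100.5, 50)]]
print(threshold, results)
```

In an algorithm, `update` would be called from the tick handler, and the threshold would be recomputed once per day from the trailing daily dollar volumes.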
Enzo Garofalo