Hi, I'm preparing data (in pandas) for a machine learning Forex strategy. The data comes from FRED, FXCM, Alpha Vantage, etc. How can different features be combined in a pandas DataFrame? For example, fundamental data with price time series (GBPUSD + technical indicators + GDP + interest rates, etc.). The problem is aligning dates: one feature is daily while the others are generally monthly. I know about scaling and feature selection/reduction with PCA, but I'm interested in preprocessing and joining features with different scales, values, and timeframes. Please describe a detailed process in pandas or scikit-learn, from cleaning to scaling, for getting fundamental and price features merged and ready for machine learning training and testing. I will then compare several ML models, such as random forests and SVMs, and choose the best performer. Thank you very much.
Derek Melchin
Hi Federico,
To merge DataFrames whose samples are at different frequencies, we can aggregate or disaggregate the data to whatever frequency we'd like. When disaggregating (for example, monthly to daily), we can simply fill-forward the missing data. On the other hand, the process of data aggregation (for example, daily to monthly) can be subjective. With OHLC features, the process is trivial: take the first open, the highest high, the lowest low, and the last close of the period. Without them, some aggregation options are to use the first, last, or most common data point over the aggregated period. I recommend reviewing pandas' resample and concat methods. This SO post provides a good example of using them.
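As a rough illustration of both directions, here is a minimal sketch using resample, reindex, and concat. The data is synthetic: the column names `close` and `rate`, the dates, and the values are assumptions made up for the example, not real GBPUSD or FRED data.

```python
import numpy as np
import pandas as pd

# Synthetic daily closes on business days (stand-in for a GBPUSD price series).
daily_index = pd.date_range("2021-01-01", "2021-03-31", freq="B")
prices = pd.DataFrame(
    {"close": np.linspace(1.36, 1.38, len(daily_index))}, index=daily_index
)

# Synthetic monthly series (stand-in for an interest rate), stamped at month start.
monthly = pd.DataFrame(
    {"rate": [0.10, 0.10, 0.25]},
    index=pd.to_datetime(["2021-01-01", "2021-02-01", "2021-03-01"]),
)

# Disaggregate: align the monthly series to the daily index and fill forward,
# so each day carries the most recent monthly value.
rate_daily = monthly.reindex(daily_index, method="ffill")
daily_features = pd.concat([prices, rate_daily], axis=1)

# Aggregate: reduce the daily series to monthly using the last observation
# of each month ("MS" labels each bin by its month start).
monthly_close = prices["close"].resample("MS").last()
monthly_features = pd.concat([monthly_close, monthly["rate"]], axis=1)

print(daily_features.head())
print(monthly_features)
```

One caveat when doing this with real fundamental data: figures such as GDP are published some time after the period they describe, so it is probably safer to index them by release date before filling forward. Otherwise the daily rows may contain values that were not yet known on that day.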
With regard to scaling techniques, consider researching normalization and standardization. There are many great online resources that document the formulas used in these processes.
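For reference, the two formulas can be sketched directly in pandas. Standardization computes z = (x - mean) / std, and min-max normalization computes (x - min) / (max - min). The feature values, the 80/20 chronological split, and the variable names below are assumptions for illustration only. The statistics are taken from the training rows alone so that nothing from the test period leaks into the transform.

```python
import numpy as np
import pandas as pd

# Synthetic merged feature table (hypothetical values, not real market data).
rng = np.random.default_rng(0)
features = pd.DataFrame(
    {
        "close": 1.37 + 0.01 * rng.standard_normal(100),
        "rate": np.repeat([0.10, 0.25], 50),
    },
    index=pd.date_range("2021-01-01", periods=100, freq="B"),
)

# Chronological split: the first 80% of rows train, the rest test (no shuffling).
split = int(len(features) * 0.8)
train, test = features.iloc[:split], features.iloc[split:]

# Standardization: z = (x - mean) / std, using training statistics only.
mean, std = train.mean(), train.std(ddof=0)
train_std = (train - mean) / std
test_std = (test - mean) / std

# Min-max normalization: (x - min) / (max - min), using training statistics only.
lo, hi = train.min(), train.max()
train_norm = (train - lo) / (hi - lo)
test_norm = (test - lo) / (hi - lo)

print(train_std.describe().loc[["mean", "std"]])
print(train_norm.describe().loc[["min", "max"]])
```

scikit-learn's StandardScaler and MinMaxScaler apply the same formulas through fit and transform, which is convenient when the scaling step sits in a Pipeline in front of the models being compared. Note that normalized test values can fall outside [0, 1] when the test period contains values beyond the training range.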
Best,
Derek Melchin
The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by QuantConnect. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. QuantConnect makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances. All investments involve risk, including loss of principal. You should consult with an investment professional before making any investment decisions.
Federico Juvara