Hi
I have been trying to work out a way to loop over a list of fundamental ratios and get the results neatly into a pandas DataFrame, so I can fit a multivariable linear regression model on the data. Unfortunately, my lack of data manipulation skills in pandas stops me from making it work. I have found a way to do it without a loop, but it takes up a lot of space and is not flexible at all (see code).
qb = QuantBook()
from sklearn.decomposition import FactorAnalysis
tickers = ("AAPL", "MSFT")
symbols = [qb.AddEquity(ticker, Resolution.Daily).Symbol for ticker in tickers]
start_time = datetime(2017, 1, 2)
end_time = datetime(2020, 1, 2)
funda_start = datetime(2020, 1, 1)
funda_end = datetime(2020, 1, 2)
fundamentals = ["ValuationRatios.SustainableGrowthRate", "ValuationRatios.PayoutRatio"]
growth_rate = qb.GetFundamental(symbols, "ValuationRatios.SustainableGrowthRate", funda_start, funda_end)
pay_ratio = qb.GetFundamental(symbols, "ValuationRatios.PayoutRatio", funda_start, funda_end)
dataframe = pd.concat([growth_rate, pay_ratio], axis=0)
dataframe = dataframe.transpose()
dataframe.columns = fundamentals
returns = qb.History(symbols, start_time, end_time, Resolution.Daily)['close']
returns = returns.to_frame()
returns.reset_index(["time"], inplace=True)
final_returns = returns.groupby(level=0)
final_returns = final_returns.agg(lambda x: x.iloc[-1]).close
final_returns = final_returns.to_frame()
final_data = pd.concat([dataframe, final_returns], axis=1)
y = final_data.close
X = dataframe
from sklearn import linear_model
regr = linear_model.LinearRegression()
regr.fit(X, y)
print(regr.coef_)
The problem I am encountering is that looping over the list and appending to an empty DataFrame does not work (see code):
qb = QuantBook()
tickers = ("AAPL", "MSFT")
symbols = [qb.AddEquity(ticker, Resolution.Daily).Symbol for ticker in tickers]
start_time = datetime(2017, 1, 2)
end_time = datetime(2020, 1, 2)
funda_start = datetime(2020, 1, 1)
funda_end = datetime(2020, 1, 2)
fundamentals = ["ValuationRatios.SustainableGrowthRate", "ValuationRatios.PayoutRatio"]
ratios = pd.DataFrame()
for fundamental in fundamentals:
    ratio = qb.GetFundamental(symbols, fundamental, funda_start, funda_end)
    ratio1 = ratio.transpose()
    ratio1.set_axis([fundamental], axis=1)
    ratios.append(ratio1)
ratios
I was wondering if anyone knows how to solve this problem, as it becomes more irritating the more variables the program uses.
Have a nice day
Lucas
Varad Kabade
Hi Lucas
We recommend using pandas.concat to combine the DataFrames for the different fundamentals. Although it is not necessary for linear regression, it is also good practice to rename your columns for easier management. The snippet below would do the trick:
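The snippet originally attached to this reply is not preserved here. A minimal sketch of the idea follows, using small made-up frames in place of the qb.GetFundamental results so that it runs outside the research environment. The mock_fundamental helper, the values, and the assumed shape (one row per date, one column per symbol) are assumptions for illustration only, not QuantConnect API behaviour.

```python
import pandas as pd

fundamentals = ["ValuationRatios.SustainableGrowthRate", "ValuationRatios.PayoutRatio"]
tickers = ["AAPL", "MSFT"]


def mock_fundamental(name):
    # Hypothetical stand-in for qb.GetFundamental(symbols, name, start, end).
    # Assumed shape: one row per date, one column per symbol; values are made up.
    base = 0.1 if name.endswith("SustainableGrowthRate") else 0.2
    return pd.DataFrame(
        [[base, base + 0.05]],
        index=pd.to_datetime(["2020-01-01"]),
        columns=tickers,
    )


frames = []
for fundamental in fundamentals:
    ratio = mock_fundamental(fundamental).transpose()  # symbols become the rows
    ratio.columns = [fundamental]                      # rename the single date column
    frames.append(ratio)                               # list.append mutates the list in place

# Combine once, side by side, aligned on the symbol index.
ratios = pd.concat(frames, axis=1)
print(ratios)
```

The loop in the question leaves ratios empty because DataFrame.append returns a new frame instead of modifying ratios, so its result would have to be assigned back. The same applies to set_axis in current pandas versions. DataFrame.append was also removed in pandas 2.0, so collecting the pieces in a list and calling pd.concat once is the more durable pattern.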
Best,
Varad Kabade
Lucas
Hi Varad
As always, thank you for the help. When running the code, it produces a DataFrame with 552 columns and 1 row. This is fine, and can be fixed with the following code, which makes it nice and easy to read for anyone with the same issue:
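The code referred to here is not preserved in the thread. As a rough sketch of one way such a reshape could be done, the example below assumes a much smaller one-row frame whose columns are labelled by (fundamental, symbol) pairs. That column layout, the level names, and the values are assumptions for illustration, not the actual output described above.

```python
import pandas as pd

fundamentals = ["ValuationRatios.SustainableGrowthRate", "ValuationRatios.PayoutRatio"]
tickers = ["AAPL", "MSFT"]

# Hypothetical wide result: a single row, with columns labelled (fundamental, symbol).
columns = pd.MultiIndex.from_product(
    [fundamentals, tickers], names=["fundamental", "symbol"]
)
wide = pd.DataFrame(
    [[0.10, 0.15, 0.20, 0.25]],  # made-up values
    index=pd.to_datetime(["2020-01-01"]),
    columns=columns,
)

# Take the single row as a Series, then move the fundamental level into columns
# so that each symbol gets one row and each fundamental one column.
tidy = wide.iloc[0].unstack(level="fundamental")

# unstack may reorder the labels, so restore the original ordering explicitly.
tidy = tidy.loc[tickers, fundamentals]
print(tidy)
```

The resulting frame has one row per symbol and one column per fundamental, which is the layout that can be passed as X to LinearRegression.fit.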
Have a good day
Lucas