A few simple rows - Restated

A lot has happened since I first started to open the black box and shared my investment strategy with you in A few simple rows. As promised, today I will show you the details of my most recent strategy, namely the Dolvol + Mom (Dynamic) strategy from Let’s turn the GAS on.

Much of what I show here today is already covered in old posts, and to frequent readers it may seem a bit repetitive. However, since I want everything that I do here to be as open and accessible as possible, I have decided to compile all my most up-to-date work here in one place.


Starting with the method, my investment strategy uses a combination of the parametric portfolio policy and a generalized autoregressive score (GAS) model to decide which stocks within the OMX Stockholm 30 Index to invest in each month. 

In essence, I model the asset weights of each stock each month as

where 1/N stands for the equally weighted benchmark weights and the product of 𝜃 and x is the factor that decides how each asset weight should deviate from the benchmark. For this purpose, I let x represent a matrix containing the cross-sectionally standardized asset characteristics (in this case past trading volume and momentum), and 𝜃 is a vector of dynamically modelled theta coefficients.
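Written out, and assuming the standard parametric portfolio policy form (the 1/N_t scaling of the deviation term is an assumption here, taken from that standard form), the weight of stock i in month t would be:

```latex
w_{i,t} = \frac{1}{N_t} + \frac{1}{N_t}\,\theta_t^{\top} x_{i,t}
```

Because each characteristic in x is standardized to zero mean across stocks, the deviations sum to zero and the weights still add up to one.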

The relationship between the asset characteristics and the thetas can be compared to a traditional portfolio sorting strategy, for instance one where portfolios are sorted according to size or book value. The main purpose of the thetas in my investment strategy is to determine how strongly and in which direction the portfolio should be tilted on each characteristic.


The theta of each characteristic each month is modelled using GAS


where omega (ω) stands for the long-run or unconditional mean, that is, the value that theta is supposed to converge towards in the long run. Beta (β) is the persistence parameter, which basically tells the model how much it should continue to believe in its previous theta. The last parameter, alpha (α), represents the learning rate, which plays a key role in updating the dynamic values of theta given the driving variable of the scaled score function (s).
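As a reading aid, the common GAS(1,1) recursion for a single characteristic looks as follows. This is a sketch of the textbook form, and the exact parameterization is an assumption:

```latex
\theta_{t+1} = \omega + \beta\,\theta_t + \alpha\, s_t
```

In this form, theta reverts towards ω/(1−β) when the score averages zero. A variant that uses ω(1−β) as the intercept makes ω itself the long-run mean.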

And what on earth is “the driving variable of the scaled score function”?! Well, at the end of the day I want to find the “optimal” asset weights of each stock to invest in. To do so, my investment strategy makes use of an optimization objective that looks like

in which I maximize the investor utility arising from the portfolio return, given a fixed risk aversion parameter (γ).

With this optimizing perspective, I define the scaled score function as the partial derivative of the one-month investor utility with respect to theta, assuming “unit” scaling, i.e., St = 1. 
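For reference, here is a sketch of what the objective and the score could look like. It assumes CRRA utility, which is the usual choice for parametric portfolio policies, and weights of the form (1/N_t)(1 + θ_t'x_{i,t}). Both are assumptions.

```latex
\max_{\omega,\,\beta,\,\alpha}\; \frac{1}{T}\sum_{t} u\!\left(r_{p,t+1}\right),
\qquad
u(r_{p,t+1}) = \frac{(1 + r_{p,t+1})^{1-\gamma}}{1-\gamma},
\qquad
r_{p,t+1} = \sum_{i=1}^{N_t} w_{i,t}\, r_{i,t+1}

s_t = \frac{\partial u(r_{p,t+1})}{\partial \theta_t}
    = (1 + r_{p,t+1})^{-\gamma}\, \frac{1}{N_t} \sum_{i=1}^{N_t} x_{i,t}\, r_{i,t+1}
```

The second line follows from the chain rule: the derivative of the utility with respect to the portfolio return, times the derivative of the portfolio return with respect to theta.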

Together, these equations constitute the building blocks of my current investment strategy. With that said, I think it’s time to show you hands-on how to implement this in code.


Each month, I use yfinance to download the historical stock prices and trading volumes needed for my analysis. In the spirit of openness, this is what it looks like when I retrieve my data for November 2022.

It is of course possible to download data for a longer period than just one month (my total sample starts in December 1999), but the dataframe that you get out is kind of messy and I don’t have any good solution for that except for manual CSV cleaning. That is why I update my already clean historical data with the latest values each month.
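A minimal sketch of such a download, assuming daily data is fetched for one calendar month with yf.download. The helper names month_bounds and download_month are hypothetical, and how the daily values are then turned into monthly figures is left out.

```python
from datetime import date


def month_bounds(year, month):
    """Return (start, end) ISO dates for one calendar month; end is exclusive."""
    start = date(year, month, 1)
    # Roll over to January of the next year when month is December.
    end = date(year + (month == 12), month % 12 + 1, 1)
    return start.isoformat(), end.isoformat()


def download_month(tickers, year, month):
    """Download adjusted close prices and volumes for one month (hypothetical helper)."""
    import yfinance as yf  # imported lazily so month_bounds works without yfinance

    start, end = month_bounds(year, month)
    data = yf.download(tickers, start=start, end=end,
                       interval="1d", auto_adjust=False)
    # With several tickers the columns are a (field, ticker) MultiIndex.
    return data[["Adj Close", "Volume"]]
```

A call could look like download_month(["VOLV-B.ST", "ERIC-B.ST"], 2022, 11), where the two tickers are only illustrative.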


Moving on, the first steps in my investment strategy recipe involve (i) reading in monthly price and volume data, (ii) creating all variables and characteristics needed, i.e., returns, past trading volumes / dolvol, and momentum, and (iii) general housekeeping like making sure that all variables are structured in nice and uniform arrays. 

I calculate dolvol as the logarithm of each stock price times its trading volume, while momentum is measured as the compounded stock return over one year (from t-13 to t-1). You might notice that inin and adj seem to contain identical data at this point, but I will later use inin to filter out certain stocks with missing data during the oldest part of the sample period. Additionally, I use k1 = 16 to start my sample period a bit later than December 1999, considering the relatively small number of stocks with available data in those early years.
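The definitions above can be sketched as follows, assuming adj and vol are arrays with one row per month and one column per stock. The function name characteristics is hypothetical, and whether dolvol is lagged before use is not shown here.

```python
import numpy as np


def characteristics(adj, vol):
    """Compute monthly returns, dolvol and 12-month momentum from price/volume arrays."""
    adj = np.asarray(adj, dtype=float)
    vol = np.asarray(vol, dtype=float)

    # Simple monthly return; the first month has no predecessor.
    ret = np.full_like(adj, np.nan)
    ret[1:] = adj[1:] / adj[:-1] - 1.0

    # Dolvol: log of price times trading volume.
    dolvol = np.log(adj * vol)

    # Momentum: compounded return from t-13 to t-1, i.e. adj[t-1] / adj[t-13] - 1.
    mom = np.full_like(adj, np.nan)
    mom[13:] = adj[12:-1] / adj[:-13] - 1.0
    return ret, dolvol, mom
```

The first 13 rows of mom stay NaN because a full year of lagged prices is not yet available.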


The next step is to construct the matrix of cross-sectionally standardized asset characteristics (x) so that they have zero mean and unit standard deviation. At the same time, I also gather N (the number of active stocks each month) and r (the return of each active stock each month). The last three rows are used to divide my sample so far into two parts. The first part includes all months except the last, November 2022. November 2022 is included in the second part of my sample and will be used for illustrative purposes later. 
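A minimal sketch of the cross-sectional standardization, assuming missing stocks are marked with NaN and the population standard deviation is used (both assumptions). The function names are hypothetical.

```python
import numpy as np


def standardize_cross_section(char):
    """Standardize each month (row) across stocks to zero mean and unit std, ignoring NaNs."""
    char = np.asarray(char, dtype=float)
    mean = np.nanmean(char, axis=1, keepdims=True)
    std = np.nanstd(char, axis=1, keepdims=True)  # ddof=0 is an assumption
    return (char - mean) / std


def active_count(char):
    """Number of stocks with available data in each month (N)."""
    return np.sum(~np.isnan(np.asarray(char, dtype=float)), axis=1)
```

Standardizing within each month keeps the characteristics comparable over time, even when the number of active stocks changes.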

Now to some trickier parts. I define the function for the one-month investor utility and its corresponding partial derivative with respect to the theta of each month. As explained previously, this derivative, dj, represents the scaled score function within GAS.

rho = 5 (please don’t ask me why I have chosen to call this rho) stands for the risk aversion parameter (γ above), which is set equal to five. Additionally, t in jac(t, i) stands for the theta of each month, while i stands for the specific month within the sample period, and dj = grad(jac, 0) ensures that I take the derivative with respect to theta rather than with respect to time.
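To show what such a utility and its derivative compute, here is a sketch that writes the gradient out by hand instead of using automatic differentiation. It assumes CRRA utility and weights of the form (1 + x·theta)/N. The names utility and utility_grad are hypothetical stand-ins for jac and dj, and the sign convention may differ if the utility is negated for a minimizer.

```python
import numpy as np

RHO = 5.0  # risk aversion parameter


def portfolio_return(theta, x, r):
    """Return of the policy portfolio for one month; x is (N, K), r is (N,)."""
    n = len(r)
    w = 1.0 / n + (x @ theta) / n  # benchmark weight plus tilt (assumed form)
    return w @ r


def utility(theta, x, r, rho=RHO):
    """One-month CRRA utility of the portfolio return."""
    rp = portfolio_return(theta, x, r)
    return (1.0 + rp) ** (1.0 - rho) / (1.0 - rho)


def utility_grad(theta, x, r, rho=RHO):
    """Analytical derivative of the utility with respect to theta (the score)."""
    rp = portfolio_return(theta, x, r)
    # Chain rule: du/drp = (1 + rp)^(-rho) and drp/dtheta = x' r / N.
    return (1.0 + rp) ** (-rho) * (x.T @ r) / len(r)
```

An automatic differentiation call such as grad(jac, 0) should return the same numbers as utility_grad whenever the utility has this form.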


At last, we now have all that we need and it is finally time to start the optimization. 

As explained in Let’s turn the GAS on, I run this optimization twice. First using x0 = [0.0, 0.98, 10, 0.0, 0.98, 10] and second using x0 = [-0.002, 0.98, 10, 0.006, 0.98, 10], where -0.002 and 0.006 are the estimated values of omega from the first stage. I do this to help the model begin with more “reasonable” starting values for its long-run means.
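The estimation step can be sketched roughly as below. Everything here is an assumption made for illustration: the GAS recursion theta = omega + beta * theta + alpha * score, omega as the starting value of theta, CRRA utility, and Nelder-Mead as the optimizer. The function gas mimics the gas(params, flag) call pattern used later in this post, but it takes the data as explicit arguments.

```python
import numpy as np
from scipy.optimize import minimize

RHO = 5.0  # risk aversion parameter


def port_ret(theta, x, r):
    """Portfolio return for one month with weights (1 + x @ theta) / N."""
    return (1.0 + x @ theta) @ r / len(r)


def score(theta, x, r):
    """Derivative of the one-month CRRA utility with respect to theta."""
    return (1.0 + port_ret(theta, x, r)) ** (-RHO) * (x.T @ r) / len(r)


def gas(params, xs, rs, return_thetas=False):
    """Filter theta through the sample; return thetas or the negative mean utility."""
    p = np.asarray(params, dtype=float).reshape(-1, 3)
    omega, beta, alpha = p[:, 0], p[:, 1], p[:, 2]
    theta = omega.copy()  # assumed starting value
    thetas, utils = [], []
    for x, r in zip(xs, rs):
        thetas.append(theta)
        rp = port_ret(theta, x, r)
        utils.append((1.0 + rp) ** (1.0 - RHO) / (1.0 - RHO))
        # GAS update: long-run term + persistence + learning rate times score.
        theta = omega + beta * theta + alpha * score(theta, x, r)
    if return_thetas:
        return np.array(thetas)
    return -np.mean(utils)  # negated because minimize() minimizes


def fit(xs, rs, x0, maxiter=2000):
    """Estimate (omega, beta, alpha) per characteristic by maximizing mean utility."""
    return minimize(gas, x0, args=(xs, rs), method="Nelder-Mead",
                    options={"maxiter": maxiter})
```

Here xs and rs are lists with one standardized characteristic matrix and one return vector per month. Running fit a second time with the first-stage omegas plugged into x0 mirrors the two-stage procedure described above.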

In the pr = list( row, you can also see me playing around a bit with np.exp and np.log. As explained in A few simple rows, I do this to ensure positive asset weights only. To be honest, though, at this stage I’m not sure whether this is actually working, and I might need to come back to it later.

Nevertheless, following this optimization, I get the following estimated values of omega, beta, and alpha.

The first three values are the omega, beta, and alpha associated with the dolvol characteristic, while the latter three are tied to momentum. Together, these estimated parameters will be used for modelling the dynamic values of theta over the second part of my sample, November 2022.


For this purpose, I start by defining omega, beta and alpha given the previously estimated parameters.

Additionally, I define a new scaled score function using my second sample. 


Continuing, I use the following rows to obtain the dynamic values of theta for the period

where tet = gas(res.x, True)[-1] states that I want to use the last values of theta obtained during the optimization as a starting point.


Finally, this gives me the following pair of thetas for dolvol and momentum respectively.  

 

Basically, they tell me to put less weight on stocks with a high past trading volume and more weight on momentum stocks.


Lastly, I use the estimated values of theta to find the final asset allocation as of November 2022, and I use a monthly budget of SEK 5,000 to transform each asset weight into units, i.e., the number of shares to buy of each stock.
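A minimal sketch of this last step. Two things are swapped in here as assumptions: negative weights are clipped to zero and the rest renormalized (instead of the np.exp / np.log approach mentioned earlier), and share counts are rounded down to whole shares. The function name weights_to_shares is hypothetical.

```python
import numpy as np


def weights_to_shares(theta, x, prices, budget=5000.0):
    """Turn thetas and characteristics into weights and whole shares to buy."""
    prices = np.asarray(prices, dtype=float)
    n = len(prices)
    w = (1.0 + np.asarray(x) @ np.asarray(theta)) / n  # policy weights (assumed form)
    w = np.clip(w, 0.0, None)  # assumption: long-only via clipping
    w = w / w.sum()            # renormalize so the weights sum to one
    shares = np.floor(w * budget / prices).astype(int)  # assumption: round down
    return w, shares
```

Rounding down means a small part of the budget is left unspent each month.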

 

As a result, this is what I should have bought this November had I used my proposed Dolvol + Mom (Dynamic) strategy as presented in detail above.

Feel free to compare this to my actual November buy from This is what I buy in November.

 
