Full Transcript: David Wright on AI In Investing
Using Machine Learning in an Investing Process
Justin: David, thank you very much for joining us on Excess Returns.
David: Well, thank you very much for having me.
Justin: As head of quantitative investing at Pictet Asset Management, you play a very important role at the firm in terms of leading and shaping how the firm approaches building systematic strategies. And that includes strategies that use artificial intelligence and machine learning in terms of stock selection.
And so I think today what we wanted to do is sit down with you and talk about how you are leveraging AI and machine learning in your investment process, what it can and can’t solve for, how you are thinking about this intersection of human judgment and a machine-driven investment process, and how you think things may evolve in the future. So that’s gonna be the broader framework that we’ll get into with you today, and then we’ll talk specifically about how these things are being used in terms of building the portfolios that you guys are running.
I should note that Pictet recently launched a series of US-based ETFs. You can Google that, but the URL is etf.am.pictet.com for the US strategies. And I think a few of those strategies actually have an AI orientation or focus. So a lot of what we’re gonna talk about today is powering how you guys are building those portfolios.
So again, thank you very much for joining us. I think our audience is gonna find this conversation very, very interesting. Where I wanted to start with you is a little more broadly, just to set the stage. When we think about AI, it is a very broad term. So when you think about it from an investing context, can you explain what you mean, what you think about when you hear words like artificial intelligence used to power an investment strategy or process?
David: Yeah, I mean, we get asked this high-level starting point quite a bit. So AI, artificial intelligence, I simply think it means a machine doing some kind of function or role that a human would generally do. So whether it’s generating something, whether it’s sensing something, it’s doing something in a human-like way.
In practice though, when most people are talking about AI and finance, they mean machine learning. Not all AI is machine learning, but machine learning is a very large subset of it. And what we generally mean there is, rather than a human programming an algorithm to do a task, an algorithm is trained on data to learn to do that task.
Justin: Can you kind of bring that through, like how does that work with like the training aspect of it? ‘cause that’s a very important part of machine learning. So sort of talk through how the system actually does that.
David: Okay. So again, I think we’re using the right kind of terminology here. I sometimes think the learning aspect confuses people, because it gives the impression that most machine learning is very free and unsupervised and there isn’t a lot of structure to it.
But I like it once we start using the word train, because I think it highlights the structure around a lot of what’s going on here. So, using the example of what we do in our world: essentially a big part of the training is defining what you want to train the algorithm to do. And for us, powering that ETF that you mentioned, PQNT is the one that is particularly run by my team, of the three that Pictet have launched.
The key here is that we want to be able to forecast the relative attractiveness, or the relative return, for a broad universe of stocks over the next month. Now, to do that, we define a large amount of input information.
Jargon-wise, we would call these features, which tell us something about the stock or the company. We then define an output that we want to train the model to be able to forecast going forward. And that output, again, is a one-month specific return on every company in the universe.
And then, rather than a portfolio manager assigning weights to those features or characteristics of the company and testing how a model would look with that weighting scheme, we actually take decades of information, the point-in-time features and the rolling one-month forward returns, and an algorithm is trained on the data to understand how you most effectively move from that input to that output.
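Editor’s illustration: the setup David describes, point-in-time features as inputs and a one-month forward relative return as the output, can be sketched as a supervised-learning table. This is a minimal sketch and not Pictet’s pipeline: the function name `build_training_rows`, the toy tickers, the single made-up feature, and the choice to measure “relative” return against the equal-weighted universe average are all assumptions for illustration.

```python
from statistics import mean
from typing import Dict, List, Tuple

# One training row: (date index, ticker, feature vector, relative forward return)
Row = Tuple[int, str, List[float], float]


def build_training_rows(
    features: Dict[str, List[List[float]]],
    prices: Dict[str, List[float]],
    horizon: int,
) -> List[Row]:
    """Pair features known at date t with the return from t to t + horizon.

    Assumption: every stock has one price and one feature vector per date,
    and the 'relative' return is the stock's forward return minus the
    equal-weighted average forward return of the universe on that date.
    """
    n_dates = min(len(p) for p in prices.values())
    rows: List[Row] = []
    # Stop early enough that the forward return is always observable.
    for t in range(n_dates - horizon):
        fwd = {s: p[t + horizon] / p[t] - 1.0 for s, p in prices.items()}
        universe_avg = mean(fwd.values())
        for s in sorted(fwd):
            rows.append((t, s, features[s][t], fwd[s] - universe_avg))
    return rows


# Hypothetical toy data: two stocks, three monthly dates, one feature each.
prices = {"AAA": [100.0, 110.0, 121.0], "BBB": [100.0, 100.0, 90.0]}
features = {"AAA": [[0.5], [0.7], [0.9]], "BBB": [[-0.2], [-0.1], [0.0]]}

rows = build_training_rows(features, prices, horizon=1)
for row in rows:
    print(row)
```

The point of the sketch is the alignment: each row uses only features available at date t, while the target looks one period ahead, which is why the last date produces no training row.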
Justin: I think we’ll try to peel back the onion on a lot of what you just said. But before we get to that, I just want to ask you, because I think a lot of people listening to this have probably used some type of large language model, whether that be ChatGPT, Gemini, or Claude.
And I do think some investors, both individual investors and professional investors, are actually also using those large language models to select stocks. So how would you compare or contrast the machine learning method versus the ChatGPT large language model method?
How would you describe that?
David: I mean, so the first thing, in very shorthand, is that you’ve gotta have the right technique for the specific task at hand. Now, a large language model, as many of us are becoming much more aware, is a form of generative AI. It’s a form of generative AI that is trained on text, and then it generates text as its output.
It’s not making a prediction, it’s not classifying anything in a particular way. It’s making a generative output. For what we are specifically wanting to do, and again, I defined it quite tightly, the approach that we are using is based around regression trees or decision trees.
And again, we can go into a lot more detail about what that exactly means. They are very well suited to making broad forecasts, particularly numerical forecasts. They can be trained with differing types of data, but they offer benefits in the financial field well above LLMs in that, again, they are very effective for making predictions.
They’re very stable approaches. LLMs, and the types of machine learning that you use to train LLMs, are not particularly stable. They’re not always gonna work in the same way, and they’re very hard to interpret. The ChatGPTs of this world, I think, are getting better if you challenge them on why they’ve come up with their answer, but that type of approach is really not embedded into LLMs.
With the types of approach that we are using, being able to interpret and understand why it comes up with its answers is a core benefit of that approach.
Justin: Yeah. It’s almost like, if you say to an LLM, give me a portfolio that is made up of stocks, sometimes it almost feeds you back the prompt that you’re giving it.
So it’s not giving you a true objective thing, because it’s tilted towards the bias that you have prompted it with. But just back to something you said earlier. Walk us through a simple version of this decision tree, this process that is embedded in the machine learning.
Give us an example of that with something like a financial data input, something that would go into that decision tree process, I guess.
David: Okay. So we can talk more about the decision tree process that we are using, and specifics to it, as we go through this.
But in simple terms, it’s almost exactly as it sounds. I think most of us would get the idea of a decision tree. You have splitting points, like different branches within the tree. Each of those splitting points is set by one of these features. And again, we can come into more detail on this, but for those of us in the quant world, that is machine learning jargon.
In practice, what we generally mean is a signal: again, a characteristic of the company or the stock. So you can think of a simple decision tree as having a feature or a signal at every splitting point in the tree, and then how you split, whether it is positive or negative for the forecast, is gonna be set at a score of that feature.
It could be something like: negative is below the one-third point within the distribution, and positive is the other two thirds. It could be an actual defined point in the distribution of the score. So again, every feature is assigned, and there is a splitting point on it.
You can then start to use and understand the effectiveness of this individual decision tree and, in the approach that we are using, use that to start refining other decision trees that you can combine together.
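Editor’s illustration: a single split on a feature score, and one tree being used to refine the next, can be sketched with one-split trees (“stumps”) fitted in sequence to the previous trees’ errors. Fitting each new tree to residuals is gradient-boosting style and is an assumption here, since the conversation does not name the exact algorithm. The function names, the two toy features, the learning rate, and the data are likewise hypothetical.

```python
from statistics import mean
from typing import List, Tuple

# A stump: (feature index, threshold, value if score <= threshold, value if above)
Stump = Tuple[int, float, float, float]


def fit_stump(X: List[List[float]], y: List[float]) -> Stump:
    """Find the single feature/threshold split with the lowest squared error."""
    best, best_err = None, float("inf")
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] <= thr]
            right = [yi for row, yi in zip(X, y) if row[j] > thr]
            if not left or not right:
                continue  # a split must send stocks down both branches
            lv, rv = mean(left), mean(right)
            err = sum((v - lv) ** 2 for v in left) + sum((v - rv) ** 2 for v in right)
            if err < best_err:
                best, best_err = (j, thr, lv, rv), err
    if best is None:
        raise ValueError("no valid split found")
    return best


def predict_stump(stump: Stump, row: List[float]) -> float:
    j, thr, lv, rv = stump
    return lv if row[j] <= thr else rv


def fit_boosted(X: List[List[float]], y: List[float], n_trees: int, lr: float) -> List[Stump]:
    """Fit stumps one after another, each on what the earlier ones got wrong."""
    stumps: List[Stump] = []
    residual = list(y)
    for _ in range(n_trees):
        stump = fit_stump(X, residual)
        stumps.append(stump)
        residual = [r - lr * predict_stump(stump, row) for r, row in zip(residual, X)]
    return stumps


def predict_boosted(stumps: List[Stump], row: List[float], lr: float) -> float:
    return sum(lr * predict_stump(s, row) for s in stumps)


# Hypothetical scores for four stocks: [momentum-like score, valuation-like score]
X = [[-1.0, 0.2], [-0.5, -0.3], [0.4, 0.8], [0.9, -0.6]]
y = [-0.02, -0.01, 0.01, 0.03]  # made-up one-month relative returns

stumps = fit_boosted(X, y, n_trees=20, lr=0.5)
print("first split:", stumps[0])
print("forecasts:", [round(predict_boosted(stumps, row, 0.5), 4) for row in X])
```

On this toy data the first stump splits on the first feature, sending the two low-scoring stocks down the negative branch and the two high-scoring stocks down the positive one; later stumps only correct what remains unexplained.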
Jack: Going back to the beginning, I wanna talk about the data you referred to, because I think that’s probably an interesting choice you have to make, which is how much data am I putting in here?
I mean, you could probably do something very narrow and put in just fundamental data or something like that. And you could probably get into things like the number of cars in the Walmart parking lot or some of this more advanced data. How do you think about what to put in there in the first place?
David: Yeah, I mean, I think this is gonna be a good way to benchmark what we do a little bit here, but it is simplifying it. And maybe you have to take a little bit of a step back here. So much of what we’re gonna talk about today is very technology focused, very data heavy, but a lot of the decisions made by the people that develop this and use it have a big impact on the way that the algorithm learns and then is structured.
And probably one of the biggest macro decisions on what type of machine learning you’re using is the data. And again, I think we can simplify that decision that we made down to: do we want to go the more traditional data route that quants have been using for 30 or 40 years, your prices, your fundamental information, your sell-side forecasts, your positioning-type data? Or do you want to go the more alternative data route, as you alluded to: the satellite imagery, the smartphone locations, the social media posting, those types of things that have garnered a lot of excitement in the quant space?
Now, through a lot of testing and refining our approach, we like to train our models over around a 15-year look-back, a 15-year period of data.
That means we have multiple different economic and market cycles within it. Our models can learn from a lot of historical information. They can learn about the persistence of the relationships that we’re gonna end up trading off. Now, a lot of the more alternative data does not have that kind of history.
It might have three years, it might have five years of history. It also is unlikely to have full breadth. Using more traditional data, you’re generally gonna have a feature score for every company in your universe, and you’ll probably have that every trading day, right back for that 15 years.
If we start incorporating more alternative data, we don’t have the history, we don’t have the breadth. It creates a lot of challenges. So we have focused much more on that traditional data side. Again, ’cause of the history, ’cause of the breadth, and because a lot of those signals have a strong rationale on why they should work and a history of developing alpha in traditional approaches as well.
Jack: That rationale is what I wanted to ask you about next, ’cause that seems to be a little bit of a debate within the quant community. We’ve always been told we should use factors that are intuitive, so we should use things where we can understand why this should influence stock prices. But I think a lot of people in the new age of machine learning are going a different direction, which is saying if it’s in the data, then it’s true. I don’t care if I understand why it influences stock prices. So how do you think about that balance in terms of putting data in there that should have something to do with stock prices?
David: Yeah, I mean, I’ve got a sort of wry smile here. One of the reasons that I find this so interesting is you talked about the quant industry having this debate.
I would say it’s even going on in my own team. If I think about the backgrounds in my team, who work on these types of strategies, who have developed these strategies, we’ve got more traditional quants who have built factor models historically and are very, very wedded to the idea of that rationality, that explainability of the features that we would use.
At the other extreme, we’ve got some of the computer scientists, data scientists, who are very much about the machine learning approach: let’s just give it lots and lots of data and see what it comes up with. And then we’ve also got experts, and I would characterize this as a bit more of a physics background, somewhere in the middle, who have got a lot of experience of building large models for other types of tasks.
So we’ve probably ended up coming down somewhere in the middle. A lot of the features that we use are essentially signals. They are things that have been researched by us or academia and have a strong rationale on why they should work. There is a fundamental underpinning on why they work.
There is a behavioral underpinning on why they should work. But we also have a great belief that we want to use as many of them as possible, and that we are willing to put in additional features that would have less of a rationale or less of a history of working as signals, but that could potentially have a strong conditioning benefit in how they work together with other, more traditional signals.
So I would say we’ve come out somewhere in the middle, but the majority of the features that we do use do have some kind of rationale behind them.
Jack: Do you think as an industry, we’ll go more and more towards this not needing a rationale over time? Like, I remember when we talked to Cliff Asness, he was very resistant to doing any of this in AQR’s process, but he’s slowly been getting more and more comfortable with a little bit of this "I don’t understand why this works, but it works" type of thing.
I mean, do you think as people get more comfortable with these techniques, we’re gonna get away from this rationale over time?
David: Yeah, I mean, I think your horizon of investment is a little bit of a part of this. So I talked about how our forecasting model, with this strategy, is around one month.
We do, in some of our AI strategies, combine some models that we train with a multi-month horizon as well. But let’s just view us as a one-month horizon type strategy for this answer. Quite a lot of the teams who are focusing on machine learning are more at the high-frequency end of the market, intraday.
I think at that point, the rationale becomes less important. I do think you can just feed the machines a lot of data, and you have to have that kind of acceptance around it. I think as you go further out on the horizon, the more that rationale will remain important to you. So I do think there are gonna remain different schools within this.
From our side, the way that I would characterize it is that a lot of the signals, the features that we train the model with, have the rationale. But then there are the relationships that the machine learning is understanding between those features, the interactions, the non-linear elements to them, where we believe we’re generating a lot of our alpha from. I think we have to accept that with a lot of those relationships, you and I could maybe look at the thousands of relationships that machine learning identified, and we could probably put a solid story on 10 or 20% on why some of those work. And then a lot we don’t understand why they work. So we’ve had to get comfortable with that kind of breakdown. We like rationale in our inputs, but increasingly, with what the machine finds in the relationships between those different pieces, we do just have to accept that they’re a little bit less obvious.
Jack: You mentioned you guys use sort of a one-month timeframe here.
How do you think about timeframe with this? Because traditional factors, value, momentum, would be over longer timeframes. I mean, do you think this type of technology is more suited for shorter timeframes, or is that just the way you chose to implement it?
David: Yeah, I mean, a little bit of both is the short answer. Certainly in the refining of this approach, we tried lots of different things. So let’s use horizon. We trained five-day models, we trained 10-day models, we trained 20-day models. As I mentioned, in some of our strategies, we do combine in some six-month trained models.
What we are increasingly finding, though, is that as the horizon gets longer and longer, the real benefit of using the machine learning over and above just building a traditional factor model, and how much of the return you can forecast with that, gets lower. So it does seem that the horizon is a big part of that.
The shorter the horizon becomes, the more beneficial the machine learning element becomes within it. We had a history as a team running factor strategies, actually emphasizing more the quality and low vol factors, which are really at the longer end of the quant spectrum.
We wanted to have something that was complementary alongside that, so that we could build different strategies. So once we are into that shorter end of the market, we did find in all of our analysis that the machine learning just became much more helpful over that horizon.
Justin: Do you have any theory as to why?
David: Yes, but some of it’s conjecture, I think, even amongst us a little bit at this point. And I think people could justifiably argue about this. I think when you go out to a longer horizon, and maybe this would sound counterintuitive, there’s maybe a smaller number of things that are driving the relative return.
You can pin it down to economic cycle elements, very specific company fundamental elements, with these things that are more aligned to style elements or common elements. Once you get down to the shorter horizon, again, sort of counterintuitively, lots of different things are driving returns.
It gets quite noisy at that point, and that noise is just very hard to understand. It’s clearly really hard to understand as humans, but even with more traditional quant approaches, it’s very hard to disentangle a lot of that noise that’s going on. And machine learning does give you that ability and that capability.
Jack: Do you think machine learning will make traditional factor strategies better? I mean, in the US here, there’s tons of value ETFs and momentum ETFs and multifactor ETFs. Do you think for that type of thing, this will make those things better? Or do you think it doesn’t add value for those types of strategies?
David: I think it has a lot of potential, but it comes back to one of the earlier questions. My team’s example here is the easiest way to illustrate this. I mentioned we run factor strategies. They emphasize particularly the quality and low vol element, and a little bit of value.
And then we run these machine-learned strategies that are forced to be factor neutral as much as possible, both in the way that we train the model and in the way we construct portfolios. At the moment, while we are very collaborative as a team, those strategies have been kept very separate. So one uses AI and machine learning, one doesn’t.
And again, it’s because we’ve spent so much time training our machine learning engine for one very specific task. However, we are increasingly, as a team, investigating the idea: can the factors that we are using in our more factor-driven strategy be improved with other types of machine learning?
So I put a lot of hope, as I know a lot of quants do, in: can you use LLMs to make better analyst forecasts? Can you analyze the sell-side reports that are written? Can you analyze the earnings call transcripts using LLMs? Can you analyze news?
So I think there is real potential on the factor side to make improvements to some of the momentum models or the value models or the quality models using those types of things. But again, that is gonna be a very different machine learning technique or tool than the one that we’ve built to make this prediction model.
Jack: One of the misconceptions I think that’s out there about machine learning is this idea that it’s just data mining. And as quants, we’re always taught, they have this whole saying: you torture the data until it confesses. And we’re taught, just don’t keep testing things over and over and over again to get the result you want.
And some people, I think, have a misconception that that’s what machine learning does. So can you explain why that’s not the case?
David: Yeah, I mean, I can, but there is also a little bit that it kind of is data mining. That’s the funny thing that you almost have to admit to. Clearly the power of the model that we’ve trained is that it’s learning from data.
So you’re inherently building that element within it. But clearly you have to put a huge amount of guardrails in place to overcome and check for some of the challenges that this potentially creates. And clearly the one that you’re alluding to here is overfitting.
It’s over-forcing a model to look too much like the past, and then it’s not necessarily gonna work once you’re using it live. So, a few things on how we look to overcome this. Firstly, I think the fact that a lot of the features we are using have this rationale to them is a good starting point.
Then there is the type of technique that we are using when we train the models over 15 years. Within that 15-year period, we train for 12 years and then we validate on three years of the training set that we pull out. We train some models on this 12 years and validate on three, and we will train another part on a different 12 and validate on three.
And we do multiple different combinations of that, again trying to make it as robust as possible. Also, while we are basing the training on the last 15 years, we do exactly the same type of training outside that 15 years as well, to see whether the relationships that the model can identify, and then generate alpha from, are pretty stable, like 20, 30 years ago.
So there’s lots of different techniques, statistical techniques, often pioneered outside finance, that you can use to make this training as robust as possible. But clearly, at its starting point, you are learning relationships from data.
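Editor’s illustration: the "train on 12 years, validate on three, then rotate" idea can be sketched as a set of train/validation splits over a 15-year window. How the three held-out years are actually chosen is not stated in the conversation; the sketch below assumes contiguous, non-overlapping three-year blocks, each held out once. The function name `block_splits` and the calendar years are hypothetical.

```python
from typing import List, Tuple


def block_splits(years: List[int], val_len: int) -> List[Tuple[List[int], List[int]]]:
    """Hold out each contiguous block of `val_len` years once; train on the rest.

    Assumption: the window length is a multiple of `val_len`, so 15 years
    with val_len=3 gives five splits of 12 training and 3 validation years.
    """
    if len(years) % val_len != 0:
        raise ValueError("window length must be a multiple of val_len")
    splits = []
    for start in range(0, len(years), val_len):
        val = years[start:start + val_len]
        train = years[:start] + years[start + val_len:]
        splits.append((train, val))
    return splits


years = list(range(2010, 2025))  # a hypothetical 15-year training window
splits = block_splits(years, val_len=3)

for train, val in splits:
    print(f"validate {val[0]}-{val[-1]}, train on the other {len(train)} years")
```

A model that scores similarly across all five held-out blocks is less likely to have memorized one particular period. A real pipeline would also need to deal with month-ahead return labels that straddle a block boundary, which this sketch ignores.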
Jack: One of the things I’ve been thinking about a lot is what do humans do better and what do these models do better?
What does AI do better? And that’s probably gonna change over time, but I’m thinking about where you are right now: what does your team do better than these AI systems can do? And what do the AI systems do better than your team can do?
David: Yeah, I mean, again, I ask myself this a lot as well, thinking about how I run the team and how that might need to evolve over time.
So firstly, if we think about the process that we are using here, the feature engineering, creating the signals, the features, is still predominantly a human-done approach within our team. Again, I talked about how some of these ideas that we’re using come from academia. Some of them we research ourselves.
Some ideas can come maybe from an interaction that someone is having with someone else in the firm; Pictet does a lot of different things. But generally that type of thing is coming from us. So the broad group of features that we choose to train the model with, that broad group is defined by us.
And I still think, at this point, defining that universe of features is something the human is better at. Now, with our investment horizon, what the machine learning, the AI, is way better at is understanding which of those features should be most emphasized, which of those features work most effectively together, and really essentially structuring the model itself. So with the horizon that we’re talking about here, the machine is better at structuring the model by learning from that historical data.
In the way that we then take that and construct a portfolio, clearly that is a very automated process. Actually taking this information, interacting it with risk views, cost views, constraints, and then building a portfolio: that optimization is clearly much more effective in a machine-driven way, even if it’s not necessarily an AI way.
Then the final piece for us is checking a final portfolio before we go and trade it. While we have many dashboards that give us lots of useful information, I still like that a portfolio management team does that and has that final check. Maybe there is some news, some piece of information that the model is unaware of. So the start and the end for us is a bit more human.
But the piece in the middle, that’s really where we see the real benefit of the machine.
Jack: So do you not tell the models anything in terms of what data is more important when you feed it in? Like for instance, you’re feeding in fundamentals and you think those should have a lot to do with stock prices and you’re feeding some peripheral data set.
Do you say anything about this is more important than this? Or do you let it figure that out?
David: I mean, so broadly we are letting it figure it out. Now, there is an element, of course, that we are slightly forcing this just by the number of features that we’re using here. If we think of the breakdown of around 400 features, like a hundred to 125 are price based.
A similar number we build from sell-side forecasts. We probably build another hundred from more accounting-based information, and the remainder is from investor positioning, calendar information, and more qualitative information about companies. So clearly some of those final pieces that I’ve mentioned there are smaller, so you have forced it a little bit: the bigger groupings of features potentially have a higher likelihood of importance.
But again, what is incredibly powerful about these decision-tree-based models is that what they are learning is not just the importance of these features, which means that they would be further up in the decision-making process, so they have a larger impact on the path that a stock would follow to make the forecast, but also which features cluster in with them.
What we tend to see is that there are certain features in the feature set that do not have a very high number of them, but they can work very effectively with the more common features and have a lot of conditioning elements to them, a lot of relationships that are built with their involvement.
So again, we really find learning that from the data is much more effective than us trying to force that too hard.
Jack: And is the data you’re using an ongoing process? So for instance, do you have a live model sitting here and then behind the scenes you have more of a test model where you’re continually testing different data sets to see if they should be in the live model?
David: So we do, we will add data sets and features over time. While PQNT, Pictet’s active ETF for the international market, has launched quite recently, we’ve run this strategy, either long-short or long-only, in other vehicles live now for over two and a half years.
In that time, the feature set we’re using has grown from around 250 to 400. Predominantly that is engineering features from existing data sets, but we are also testing other data sets and building other features from them as well. So yes, in the background we are constantly refining and developing the model that we use.
As for how that then comes in live: whether new features are incorporated or not, every three months we do a full retraining of the model. So the model is trained on 15 years of data, and then every three months we roll it forward by three months. We drop the oldest three months, we add on three new months, and we do the training completely fresh.
While we set the parameters the same, so the structure around the training is the same as we were comfortable with previously, it’s done without any understanding that the previous model existed. It’s completely retrained, and if we want to incorporate new features, we can at that point.
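Editor’s illustration: the rolling retraining schedule described here, a 15-year window moved forward three months at a time, can be sketched as a list of (start, end) month pairs. The function names, the month-end convention, and the December 2024 starting point are assumptions for illustration only.

```python
from typing import List, Tuple

YearMonth = Tuple[int, int]  # (year, month), month in 1..12


def add_months(year: int, month: int, k: int) -> YearMonth:
    """Shift a (year, month) pair by k months; k may be negative."""
    idx = year * 12 + (month - 1) + k
    return idx // 12, idx % 12 + 1


def retrain_windows(
    first_end: YearMonth,
    n_retrains: int,
    window_years: int = 15,
    step_months: int = 3,
) -> List[Tuple[YearMonth, YearMonth]]:
    """Return the inclusive (start, end) months used for each full retrain.

    Each retrain drops the oldest `step_months` months and adds the newest,
    so the window length stays at window_years * 12 months.
    """
    windows = []
    for i in range(n_retrains):
        end = add_months(*first_end, i * step_months)
        start = add_months(*end, -(window_years * 12 - 1))
        windows.append((start, end))
    return windows


# Hypothetical schedule: first window ends December 2024, then two more retrains.
windows = retrain_windows(first_end=(2024, 12), n_retrains=3)
for (sy, sm), (ey, em) in windows:
    print(f"train on {sy}-{sm:02d} through {ey}-{em:02d}")
```

Consecutive windows share all but three of their 180 months, so a model rebuilt from scratch on the new window would be expected to look much like its predecessor.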
But building it again from the ground up and making sure that the model looks very similar is a nice safety check that you’ve not done anything stupid in that training.
Jack: I’m just curious, as you do these retraining runs, are you seeing the technology leap a lot over time? Like in the US we’re seeing with these LLMs,
we’re seeing everybody jumping on top of each other, and these massive changes, these massive innovations. As you keep looking at this, is what you’re able to do, or how fast you’re able to do it, changing rapidly over time?
David: I mean, certainly, in the time that we’ve been doing this, the last five, six years, which would be an evolution from the first academic papers the team wrote on this, through to a more applied approach to training our own models, to then running live capital on it, the time it takes us to train this has definitely fallen.
So we are finding efficiencies. Our hardware has also improved over time, so we have more computer processing chips, more graphics processing chips to speed up these things. So the approach is becoming more efficient, and we can do these things quicker.
But I think what is really interesting for us is how stable a lot of the relationships that we actually find are. So yes, the machine learning world is improving. The efficiency in the way that we train models is improving. But for something that we consider so dynamic, the financial markets, the equity markets, many of the relationships that we find through the data analysis, the relationships between these different features, have been static for a long while.
Versions of this model that we could train with nineties data have a lot of the same relationships between the features in them as now, and these relationships are still very, very effective. And I would say that has been the biggest surprise to us through this approach: how stable a lot of these relationships we discover are.
Jack: And I would think that’s important in terms of thinking about things that will work over the long term. Like if every run you did, you got completely different results, you’d probably have less confidence than you do now, because you’re seeing similar things each time.
David: Exactly. And if someone asked me to decompose the benefit over and above a traditional factor model... and they’re slightly different things, particularly with the different horizons.
But as people who work in quant, and people who have looked at this a lot, like yourselves, we know that many traditional factors have a decay element to them over time.
And that decay can be quite significant in some factors, and it can be slower in others. And it’s not necessarily a 45 degree angle. It can be a little bit more up and down than that, but generally there is a decay element to it. What’s really exciting for us is that for the 40 to 50% of the return of our model that comes from capturing and understanding these non-linear relationships, these interactions between the features, the decay is just minimal in comparison to a linear exposure to a traditional factor.
So that is really exciting for us in the potential in this.
Jack: Going back to the human versus AI thing, and I know none of us know the answer to this, but I’m thinking about, as we go well into the future, how is that gonna change? Like, is AI gonna be able to take over all the things we do well? Like you talked about, you guys are better at figuring out what to feed into the model.
Like, are we all gonna be sitting on a beach at some point while these models are just running and competing with each other? I’m just trying to think, if you think forward, how do you think about how this might play out?
David: I mean, I’ve worked in the quant industry for 25 years. What I’ve seen over that time, and my colleagues and other people in the industry will have seen something very similar, is that almost every part of a quant process has got more automated over time.
I mean, it’s almost every aspect of it. There have been ways to improve and get more technology in there. So I’ve used some descriptions of things where it’s more human led for us, or where there are stopping points in the process. Another stopping point in the process for us is that our machine learning model, our trained model, produces our relative return forecast.
That relative return forecast is then used within a relatively traditional mean-variance optimization to build the portfolio, again adding in risk views and cost views. We have research underway, and certainly people using machine learning who are much higher frequency than us would have to work in this way given the time constraint, where we actually train a model that is risk and cost aware, so it outputs holdings for us rather than just a forecast that we then work with in the portfolio construction.
So this is just an example. I do think the steps are probably gonna get more and more automated. But at the same time, a lot of my time is spent with our client base, and while people really appreciate the returns that we’re able to deliver with this type of strategy, they do want to know that there are human elements to it.
They do want to know that there is very strong human oversight, that there are guardrails and structure and supervision around how these models are trained. So I think you’re always gonna have a little bit of that side pushing back, in that they wouldn’t want it to be the machine front to back.
Jack: Yeah. It’s interesting, ’cause when I talk to clients, one of the things that they often say is, I use a quant strategy ’cause I wanna take emotion outta the investment process, human emotion. And that’s very true. You do take a lot of it out. But what I always say to them is, you don’t completely take it out right now, because you want someone sitting behind that strategy who knows what they’re doing, who decides what goes in there.
You want to have someone. I could panic when my strategy’s not working and start pulling stuff out of it. So right now the human sitting on top is a very important part, I think, of these strategies, and I wonder over time how that will change. And I don’t think any of us necessarily know the answer.
David: Yeah. I mean, again, I think the human emotion one is very interesting, and bias, and clearly quant was structured to overcome that. But we have to be fair to ourselves and the way that we work. If a human is building the model and allocating weight to groups of signals or certain factors, bias comes back in that way as well.
So I think one of the reasons I do really like what we’re doing with the machine learning side of things is that it takes away that piece where the human can get back into it and involve themselves and their views on things. We’re taking that aspect away from them by learning from the data, understanding from the data the feature importance and the way to structure these different things.
So I think this is a step forward in further removing a source of bias that maybe we overlook a little, one that exists in traditional quant approaches.
Jack: I wanna come back to ChatGPT. Justin mentioned it earlier, but a lot of people are thinking about the idea now of using this to build and test strategies.
So, something like: you are Warren Buffett, you know everything Warren Buffett has ever known, select stocks like Warren Buffett and create a strategy that’ll work going forward. And I’m wondering, just as someone inside of this, how you might think of the ability of those models to do that, and whether they’re intended to do that type of thing successfully with strategies that might work going forward.
David: Yeah, I mean, I really remain very skeptical that that is what they’re gonna be able to do effectively. Again, I see it from my own usage: ChatGPT has got way better, there’s clearly more reasoning within it than just full generation now. But it hasn’t been trained to do that very specific task.
It hasn’t been trained to build a portfolio. It hasn’t been trained to do, if we use the Buffett example, that in-depth analysis on individual companies. But whether it has that or not, the challenge that I still mostly have with the LLMs and the generative side of things being effective in finance is just the lack of interpretability.
So really, I mean, you could question it on why it’s coming to that decision, and it’ll give you an answer. But really being able to interpret how it came to that, I don’t see that in the LLMs now. With the approach that we’re using, again, we’ve kept it at a reasonably high level in the way that we’ve talked about how we do things, but we’ve talked about the decision tree based approach.
We end up having thousands and thousands of decision trees as our end model. They’re trained with gradient boosting, which is the technique where they’re iteratively trained one after the other, each learning from the mistakes of the previous tree. But still, the end model ends up being thousands and thousands of decision trees.
Those trees are interpretable. So for every position that comes out of our model on a given day on a given stock, we can understand which features in which combination are driving the view of that model. So it’s interpretable. If, as we did when we were starting to investigate this space and look at different ways of doing this, we trained the model using a neural network, deep learning, which is the way LLMs are trained, you don’t have that interpretability.
So actually we found that we were able with neural networks to produce forecasts that were of a similar kind of accuracy to the boosted tree models that we use, but then we wouldn’t have any way of interpreting them.
So trying to find that combination of accuracy, stability, and interpretability, I think that is gonna keep a lot of people away from using the LLM approach.
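As a rough illustration of the boosting idea David describes (trees trained one after the other, each learning from the mistakes of the previous one), here is a minimal Python sketch. It is not the firm's model: it assumes a single made-up feature, squared-error loss, depth-1 trees (stumps), and hypothetical hyperparameters, purely to show the mechanism and why each tree remains a readable rule.

```python
# Toy gradient boosting with decision stumps, for illustration only.
# Assumptions (not from the interview): one feature, squared-error loss,
# made-up data, and hypothetical n_trees / learning_rate values.

def fit_stump(xs, residuals):
    """Pick the threshold split on x that minimises squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so the right side is non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_mean, right_mean)


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_boosted(xs, ys, n_trees=50, learning_rate=0.1):
    """Each new stump is fitted to the residuals left by the stumps before it."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return {"base": base, "learning_rate": learning_rate, "stumps": stumps}


def predict_boosted(model, x):
    correction = sum(predict_stump(s, x) for s in model["stumps"])
    return model["base"] + model["learning_rate"] * correction


def mse(model, xs, ys):
    return sum((y - predict_boosted(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [0.1, 0.0, 0.2, 0.1, 1.0, 1.1, 0.9, 1.2]
one_tree = fit_boosted(xs, ys, n_trees=1)
many_trees = fit_boosted(xs, ys, n_trees=100)
print(mse(one_tree, xs, ys), mse(many_trees, xs, ys))
```

Every element of `model["stumps"]` is a plain "if x <= threshold" rule, which is the sense in which a tree ensemble can be inspected in a way a neural network's weights cannot.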
Justin: Just one point of clarification. So when you say accuracy, you mean like, of 10 positions you buy, you might get six outta 10 correct. Is that what you mean by accuracy?
David: Exactly. But for us it’s a relative forecast. The way that we would judge that, and this is simplifying a little bit, is the correlation between the distribution of the forecasts and the distribution of the actual final returns over the horizon within that group.
That for us is how we’re assessing accuracy.
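One common way to measure the kind of accuracy David describes is a cross-sectional rank correlation between forecasts and realized returns, often called a rank information coefficient. The sketch below assumes a Spearman rank correlation, which the interview does not specify, and uses made-up numbers for five hypothetical stocks.

```python
# Cross-sectional rank correlation between forecasts and realized returns.
# Assumption: Spearman rank correlation; the interview only says "correlation".

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def rank_ic(forecasts, realized):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(forecasts), ranks(realized))


# Hypothetical one-month forecasts and realized relative returns.
forecasts = [0.02, -0.01, 0.005, -0.03, 0.01]
realized = [0.03, 0.00, -0.01, -0.02, 0.015]
print(rank_ic(forecasts, realized))
```

A value near 1 means the forecasts ordered the stocks almost as the realized returns did, even if no individual forecast was close in magnitude.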
Jack: And I think one of the challenges is, when we build models, we basically only want the model to know what it would’ve known at the time. And that can be tough with these LLMs, because they know everything as of now. So you can’t go back and say, forget everything you’ve learned since 1990 and build me a model in 1990.
Because obviously if I know everything I know now, I could build an incredibly successful model, but it’s probably not gonna work going forward. So I think that’s sort of a challenge of these types of things.
David: And again, I think that’s why I just don’t see a lot of quants using it, given the history that we have on the quant side of understanding peek-ahead bias, restated data, all these types of things, and of really having a solid understanding of the way to correctly do backtests.
I think it just creates a barrier that I don’t see really being overcome to actually use it to build portfolios or make forecasts. So again, if we contrast that with what we are able to do: when we run simulations for this strategy, the model that we use on a given day, even if we’re simulating right back to the 1990s, has its 15-year look-back at that point. It doesn’t know anything about the data that we’ve had since.
It doesn’t know about any relationships that have developed since. So you really get a robustness to that simulation in a way that, as you described, you wouldn’t with an LLM-type model.
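The point-in-time discipline David describes can be sketched as a walk-forward schedule: each model only sees a fixed look-back window that ends where its live period begins. The 15-year look-back and three-month retraining cadence come from the interview; the function names, the first-of-month date convention, and the example dates are assumptions for illustration.

```python
from datetime import date

# Walk-forward schedule: every model is trained only on data that precedes
# the period in which it is used. Dates are assumed to be first-of-month.

def add_months(d, months):
    """Shift a first-of-month date by a (possibly negative) number of months."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def walk_forward_schedule(first_live, last_live, lookback_years=15, retrain_months=3):
    schedule = []
    live_start = first_live
    while live_start < last_live:
        live_end = min(add_months(live_start, retrain_months), last_live)
        schedule.append({
            # Training window: the look-back period ending where live use begins.
            "train_start": add_months(live_start, -12 * lookback_years),
            "train_end": live_start,
            # Live window: the model is used here, then replaced by a retrain.
            "live_start": live_start,
            "live_end": live_end,
        })
        live_start = live_end
    return schedule


schedule = walk_forward_schedule(date(1995, 1, 1), date(1996, 1, 1))
for window in schedule:
    print(window)
```

Because the training target is a forward return, a real pipeline would presumably also need to stop the training labels one forecast horizon before `train_end`; that detail is omitted here.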
Justin: So let’s just talk about how this all comes into an actual portfolio. So first off, are you buying a broad swath of stocks, or is this used for a more focused, concentrated type of portfolio that you’re constructing off of this system?
David: No, it’s very much about building a diversified portfolio. So this forecasting model, we train it over 15 years, as I’ve described a couple of times. We then use that to make our daily one-month-forward forecast for the next three months. And then we retrain on 15 years. So again, we’ve defined the structure of it there.
That model’s power, as we’ve also got into a little bit, is its cross-sectional forecasting power. So its forecasting accuracy on an individual name, of whether that name’s gonna go up 5% or down 5% over the next month, is not fantastic, but its ability to say where in that distribution a name sits, it’s much better at that.
So what does that mean in the way that we then want to build a portfolio? We want to take that cross-sectional power and best reflect that in a portfolio. So we do that in two ways. In long only, we do that in enhanced index, so lower tracking error strategies that hold a lot of the index names and just use this forecasting model to tilt a proportion of the stocks above the benchmark weight and tilt a proportion of the stocks below benchmark weight, or not hold them.
And then if we wanna run a higher octane type of strategy, we would do that with leverage in long-short. So we remove the benchmark, we still maintain our diversification, and we use leverage to lever up the portfolio so we can combine higher risk and diversification. So the power here is very much in a diversified forecast and then a diversified portfolio.
Justin: Yeah, and I think in that enhanced index, correct me if I’m wrong here, it’s like factor neutral, sector neutral, and geographic neutral. So you really are isolating, as a track record gets established over time, the alpha, which hopefully is there, and which will be a function completely of the stock selection process.
David: Exactly. And that comes out in two dimensions. So when we train the model, I’ve talked about our features as inputs, the hundreds of features, and our output is then that 20-trading-day forecast. What we train the model on is the specific return. So we clean those historical returns statistically of the market beta, and of the sector, industry, country, and style betas as well.
So it’s learning to just forecast that piece of the stock, the idiosyncratic part of the return, the part of the return that is unique to the company. And then when we build the portfolio, we have a lot of guardrails on all those dimensions as well. So exactly as you’ve described, in enhanced index we don’t want to deviate much from the benchmark on any of those common dimensions.
And then what our clients will see, if they have the ability to analyze the performance that we deliver to them, is that the vast majority of it, well above 95%, will come from stock-specific alpha and not from any of those common elements.
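To make the idea of a "specific return" concrete, here is a deliberately simplified Python sketch. It strips only one common component, the sector, by subtracting each sector's average return for the period, which is what the residual of an equal-weighted cross-sectional regression on sector dummies amounts to. The interview describes also removing market, industry, country, and style effects; those are omitted here, and the tickers and numbers are made up.

```python
from collections import defaultdict

# Simplified "specific return": remove the sector's average return from each
# stock's return for one period. Only the sector dimension is handled here;
# market, country, and style effects mentioned in the interview are not.

def specific_returns(returns, sectors):
    """returns: ticker -> period return; sectors: ticker -> sector label."""
    by_sector = defaultdict(list)
    for ticker, r in returns.items():
        by_sector[sectors[ticker]].append(r)
    sector_mean = {s: sum(v) / len(v) for s, v in by_sector.items()}
    return {t: r - sector_mean[sectors[t]] for t, r in returns.items()}


# Hypothetical tickers and one-month returns.
returns = {"AAA": 0.04, "BBB": 0.02, "CCC": -0.01, "DDD": -0.03}
sectors = {"AAA": "tech", "BBB": "tech", "CCC": "energy", "DDD": "energy"}
residuals = specific_returns(returns, sectors)
print(residuals)
```

In this toy example both sectors moved in different directions, yet the residuals only record how each stock did relative to its own sector, which is the part a stock-selection model is meant to forecast.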
Justin: So are you guys running this on the S&P 500, or what index are you currently running it on?
David: So, the evolution to this point: the first strategy that we did, with a vehicle domiciled here in Europe, we did against an MSCI World benchmark. Then the peak one that we’ve launched as an active ETF in the US is against an EAFE benchmark. So, okay, World ex-US and Canada. We will likely bring other strategies with the same approach to the US market, and our research has shown, and we’re already running money for individual clients, that it works with US and with EM.
So, we intend to do this in Europe. What we’ve seen is that this is a very transferable approach as long as you have a kind of minimum universe size. So, as much as I would love it, there’s probably never gonna be an AI-driven UK strategy. It’s not broad enough. We’re not gonna run a Swiss version of this strategy. But as long as we’ve got probably four or five hundred names, then it looks to be pretty effective.
Justin: And you find the persistence, and I’m sorry if you just answered this, but you find the persistence across different geographic areas and markets? I guess the model would sort of vet that out if there was any difference. I’m thinking like international versus US, or emerging versus developed, and how that kind of plays out.
I guess the model would kind of figure that out.
David: I mean, so yes, but also we wanted to make sure that was the case in the way that we trained it and tested it as well. Again, the power of this, as I hope we’re getting across, is finding the relationships between the different features.
The way that they work together most effectively to forecast a return. Now, we had some prior thought that a lot of these relationships that we found, maybe they would work in the US, maybe they would work in Europe, but maybe they would be less effective in emerging markets, for example. Maybe just because of the dynamics of those markets.
I don’t know, the dominance of different investor types, or just the maturity of the market, whatever it might be. But what we found is that these relationships were very, very transferable. And the way that we checked that is we ran simulations where we just took the model that was trained on our global universe, a global model that we’d built and were live with.
And we saw how that structure of model would work in something like EM, or just in the US. Then we trained versions of the model where the training could only see the subset of the universe that we would then run it on. Then we trained variations of the model where we would slightly change the feature set in line with our view of the dynamics of the market.
And what we found is that the most effective thing to use in these different regions was the globally trained model with the global feature set. So again, it said these relationships were very stable, and you actually can learn things in one market. And even though it wasn’t necessarily trained on that subset of the market, it works very effectively there.
Jack: How do you think about rebalancing something like this? I mean, I would assume it’d be something that would happen slowly over time, and it would be a function of the change in the projected return from the model, balanced against the trading costs. But how do you do that, and how much of that is done by a person and how much of that is completely inside of the model?
David: Okay. Your first part, you answered almost perfectly, so maybe I’ll just repeat what you said. So the model does update daily. Given the amount of money that we’re running in the strategy at the moment, and the fact that the model forecast is quite similar from day to day, we do at the moment rebalance weekly.
So that is regular enough for us, where we’re capturing some evolving views in the model and we are not gonna spend too much in transaction costs to do that. And that’s a key consideration, as it is quite a fast strategy. As the strategy grows over time, we’ve been analyzing whether we will need to rebalance more frequently.
And I think that probably is the case. So over time, as the strategy grows, we will need to rebalance more frequently. Clearly the model is not changing more often, but we need to chop our trades up a little bit more. There is certainly an optimal level of rebalancing that you need to do. It is a human that has made that final decision, but we analyze and test a huge amount of data to get to that decision.
Justin: Is there, I don’t know if this is the right terminology, one conditioning effect or signal, if those two things are the same, I’m not sure. I’m just trying to get at something that really surprised you when you looked at it and you said, wow, that is just something I never thought would have that effect.
Obviously the data you’re putting in, you’re putting it in there because, like you said, it’s data that should be important to the future performance of stock prices. But I’m just wondering if there’s something that came out of the model where you and your team were just like, I never would’ve
thought that that would have that much impact on returns.
David: Yeah, I mean, okay, so I kind of need to answer this in two ways. The first one is, as I described earlier at the start, quite a lot will surprise us, a vast number of these conditioning elements. And again, let me just describe it, given how you hit on it at the start there.
So we have these 400 features or signals, and what we can think of as the conditioning elements are which of those work together, almost like IF functions. So you might have a traditional signal, like the ratio of sell-side analyst upgrades to downgrades over the last month.
And in a traditional model, you can just trade on that as an individual, isolated signal. But can the model learn from data what other features will improve the performance of that signal? If this is happening and that is happening and that is happening. So, to go back to what I said at the start, we can probably put a story on maybe 10 or 20 of these little groups of features that work together very effectively, and then a large number of them do not have an obvious reason why they work.
So the short, easy out for me is to say lots of it surprises us. And actually it’s easier to give people an idea of how this works with something that is less surprising. Like, calendar information works very well with analyst-type information. So the further away you are from a company reporting its official numbers, the better off you are following the sell-side forecast.
And then the closer you get to the official numbers, the less relevant it is. Now, in my mind, that has a rationality to it: the market is probably gonna respond to live numbers, so I’ll care less about the forecast. But then there are thousands, tens of thousands, of relationships in how these different pieces work together that just don’t have that obvious piece to them. And it’s that where we’re getting a lot of the alpha from. So probably the thing that’s closest to answering both those pieces is that the calendar element does work with a lot of different aspects, probably more than we would expect.
Justin: Yeah. And that calendar is, like you said, the longer you are out, analyst recommendations and buys and sells might be more important.
And as you hone in, and this is just the one example, I’m just repeating it back, it gets less important. That sort of idea, right?
David: Exactly. But again, generally what ends up being used here is not pairs of features.
It’s like six features together. So it will be something like: if the ratio of upgrades to downgrades is above this, and you are within 10 days of the reporting period, and the short interest is below this level, and the CEO has been in the seat for at least six months. It’s a lot of those combinations of things working together that is the power of the model.
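The kind of multi-feature condition David describes reads naturally as a chain of IF tests, which is also what a single path through a decision tree looks like. The Python sketch below encodes his example; the field names and every threshold other than the 10 days and six months he mentions are hypothetical placeholders, not values from the model.

```python
# One path through a decision tree, written as a chain of conditions.
# Field names and the 1.5 / 0.05 thresholds are hypothetical; the 10-day and
# six-month figures are the ones mentioned in the interview.

def conditioned_signal(stock):
    return (
        stock["upgrade_downgrade_ratio"] > 1.5   # analysts net upgrading
        and stock["days_to_report"] <= 10        # close to the reporting date
        and stock["short_interest"] < 0.05       # low short interest
        and stock["ceo_tenure_months"] >= 6      # CEO in seat at least six months
    )


candidate = {
    "upgrade_downgrade_ratio": 2.0,
    "days_to_report": 7,
    "short_interest": 0.02,
    "ceo_tenure_months": 18,
}
print(conditioned_signal(candidate))
print(conditioned_signal({**candidate, "days_to_report": 30}))
```

The second call changes only one input, which shows how a signal that fires in one context can be switched off entirely by a conditioning feature.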
Justin: Yeah, interesting. It’d be interesting to look at that in detail. That’s cool.
Jack: I’m just wondering, is there anything that surprised you in terms of not being used? Like for instance, if we think about traditional factors, are valuation factors not used that often in here?
Or is there anything like that, where maybe traditional factors are not used as much as you thought they might be?
David: I mean, so in that feature set, as I talked about, there are a hundred features that are more fundamental in nature, but we tend to force them to not be that traditional in the way that we construct them.
So we are not necessarily using the dividend yield of a company, but we’re using how the dividend changes over a certain horizon. We’re not necessarily looking at the return on assets, but we are looking at how much volatility there’s been in the return on assets of a particular company.
So we do force the construction a little bit, so it moves away from that traditional factor bias. But I think maybe I’ll turn it around. What is a little surprising for us is actually the other way around, even when we train the model with just the specific return as the target.
If we don’t then put some guardrails on those style factors in the portfolio construction, some of it still sneaks back in. So we have found, and I don’t think this is a bias because our factor models tend to be a bit more quality in nature, that if we don’t force that style neutrality at the portfolio construction level, even when we’ve trained on the specific return, we get a bit of quality.
So what does that say? It says that certain quality elements do have a forecasting power, or at least a conditioning element to them, in forecasting that specific return. So that has probably been a bit of a surprise for us.
Justin: We know it’s getting late in the day, Geneva time, for you, David.
So, this has been a great conversation. Thank you so much. We like to ask all of our guests two standard closing questions, and the first is: what is the one thing you believe about investing that you think most of your peers would disagree with you on?
David: I’m gonna answer it a little bit more like, how would I offer something different to our competitors?
And a big one for me is that I’m not a big fan of talking about what your edge is, because I think it can make you quite arrogant in thinking that you have a moat in a specific area. I think the power of what we’ve built is that we’ve done very in-depth work at every stage of what we do.
The feature generation, the training, the portfolio construction, and I think we’ve got very, very good at that. So I would differ with some competitors who want to tell you they’ve got a specific edge in one area. I think you need to just refine and be as effective as you can in all aspects of what you’re doing.
Justin: And the last question is, based on your experience in the markets, what’s the one lesson you would teach your average investor?
David: So most of my friends don’t work in finance. So the one I give when asked things like this is: don’t overtrade. To bring this back to my world, and Jack asked about this quite a lot, trading costs, and understanding the speed and horizon of your forecast, that element is so incredibly important, and it’s so easy to just give up your alpha.
So I think that’s what I always tell my friends: don’t trade too much.
Justin: Great discussion. David, thank you very much for joining us. We appreciate it.
David: Thank you very much for having me.

