The SO1 Supermarket Gym – What’s next

Applying the SO1 Supermarket Gym in software integration projects and upcoming research projects

This is the last part of our interview series with Sebastian, our Chief AI Officer, on the Supermarket Gym. So far, we have covered:

Part 1: Introduction to the Supermarket Gym – what it is and why we developed it
Part 2: How we generate data for the Supermarket Gym to accurately simulate shopping behavior – and what type of academic research is considered
Part 3: Applying the Supermarket Gym in fundamental research: concrete examples of how SO1 uses the tool in its AI development

So let’s now look at additional applications and upcoming research projects at SO1 (Segment of One).

Sebastian, are there any other ways a retailer can profit from the Supermarket Gym?

We have another use case at SO1 which, funnily enough, has nothing to do with algorithms, and that is actually integration testing.

For this we set up a simulated retailer with a control unit that creates fake customer traffic. By that I mean that this control unit decides whether customers go shopping in a particular week or not. Once we have this customer traffic, we can use the Supermarket Gym to simulate purchases for these customers, and not only for typical week-by-week shopping without any coupons: we can use coupons as well. So imagine a system where you have this control unit creating fake customer traffic, an API from which you can request coupons for these customers, an API to which you can feed these coupons and receive back your basket data (that is exactly the simulation), and a database in which you store all of that information in real time. What you now have is a completely virtual environment, an integration test so to speak, where you test a coupon engine, your databases, and all of your APIs. So that’s another application we use it for.

And it doesn’t matter how simplified the simulation is. There is just the continuous data flow that feeds data into the API, collects data, retrieves data from these APIs, stores everything in databases, and runs continuously.
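The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SO1's actual system: the function names (`control_unit`, `coupon_api`, `simulation_api`, `run_simulated_retailer`), the product list, the purchase probabilities and the table layout are all hypothetical, and the two "APIs" are plain functions standing in for real services.

```python
import random
import sqlite3

# All names and numbers here are hypothetical stand-ins, not SO1's system.
PRODUCTS = ["coffee", "pasta", "milk", "soap"]
BASE_PROB = 0.2  # assumed chance of buying any product without a coupon


def control_unit(customers, visit_prob, rng):
    """Fake customer traffic: decide who goes shopping this week."""
    return [c for c in customers if rng.random() < visit_prob]


def coupon_api(customer, rng):
    """Stand-in coupon engine: returns one random coupon per customer."""
    return {"product": rng.choice(PRODUCTS), "discount": rng.choice([0.1, 0.2, 0.3])}


def simulation_api(customer, coupon, rng):
    """Stand-in simulator: the couponed product is more likely to be bought."""
    basket = []
    for product in PRODUCTS:
        lift = coupon["discount"] if product == coupon["product"] else 0.0
        if rng.random() < BASE_PROB + lift:
            basket.append(product)
    return basket


def run_simulated_retailer(n_weeks, n_customers, visit_prob=0.5, seed=0):
    """Run traffic -> coupons -> baskets -> database, week by week."""
    rng = random.Random(seed)
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE purchases "
        "(week INTEGER, customer INTEGER, product TEXT, couponed INTEGER)"
    )
    customers = list(range(n_customers))
    for week in range(n_weeks):
        for customer in control_unit(customers, visit_prob, rng):
            coupon = coupon_api(customer, rng)
            for product in simulation_api(customer, coupon, rng):
                db.execute(
                    "INSERT INTO purchases VALUES (?, ?, ?, ?)",
                    (week, customer, product, int(product == coupon["product"])),
                )
    db.commit()
    return db
```

Calling `run_simulated_retailer(n_weeks=4, n_customers=50)` exercises every component once per shopping trip. Raising `n_customers` or `visit_prob` is one simple way to turn the same loop into a load test.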

It’s a great way for us to test various components of our system, such as automatic model training, which we use to continuously retrain our models to improve them, or to adjust them to certain trends and markets. We can also just test our APIs, and modifications to our APIs. And before we deploy experiments, we can test the experiments in this virtual environment as well, to make sure that our experiments are actually set up properly.

So all of that is more of a “software development” angle on the Supermarket Gym. It’s not an “algorithm development” angle.

And retailers can really benefit from both, right?

Definitely. We can develop new algorithms for retailers in the Supermarket Gym, and a retailer can also continuously test their software environment and their APIs. It’s incredibly efficient and satisfying, compared to the time and effort that a retailer would otherwise spend on creating tests for their APIs.

Just by creating a “simulated retailer” everything comes for free: you can scale it up or down by the number of consumers, scale traffic any way you want, run load tests, and it’s all fairly close to reality, which is a major benefit for an integration test.

Absolutely! I think there’s one last question that might be interesting. What are the next steps in the AI development of SO1 or in the Supermarket Gym?

Right now we’re investing heavily to improve our deep learning models. And there are initial prototypes available that we’re already using in our model stack.

By model “stack” I mean that we don’t just use one machine learning model, because one single model is simply not powerful enough to cope with the complexity of the decision problem that we need to deal with here. We build a model stack. At the top of this model stack is what we often call our expert model. As probably most machine learning companies out there do, we use boosted tree models here.

So we have this boosted tree model into which we feed a whole bunch of inputs. Some of them can be very, very simple: they’re just features, which we then transform, using our data and our expert knowledge, into very powerful features. There are also more complicated features that are based on second-stage models. For example, we have models trained on auxiliary problems, not necessarily our main coupon optimization problem but problems like price sensitivity, purchase incidence, etc. We use separate models for those and feed their predictions into our boosted tree model.

The third component, probably the most complex and advanced, uses numerous deep learning models. It could be representation learning, autoencoders, non-linear matrix factorization, things like that. We again take the outputs (the predictions) of these models and feed them into our boosted tree model.

And that’s what we do with our cross-category deep neural network as well. We apply it to real data, but instead of building an end-to-end deep learning model, we feed the output of that network into our boosted tree model to improve the latter’s performance. This is kind of the first stage of leveraging deep learning at SO1, because you realize that, given the power of our current boosted tree stack, the end-to-end neural network is not powerful enough to replace the full stack. So we feed the deep neural network output into the boosted tree stack to improve it, and we see that it works. So some of the knowledge contained in that deep neural network is orthogonal to, i.e. not already captured by, the knowledge that is leveraged in our boosted tree model.
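The stacking idea (predictions of other models become input features of a boosted tree model) can be illustrated with a toy sketch in plain Python. Everything here is an assumption made for illustration: the boosted model is a minimal gradient boosting over one-split decision stumps rather than a production library, `auxiliary_model` is a fixed rule standing in for a trained second-stage or deep learning model, and the four-row data set is constructed so that the raw features alone carry no signal a single split can use.

```python
def fit_stump(X, residuals):
    """Best single-feature threshold split by squared error, or None."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    return None if best is None else best[1:]


def fit_boosted_stumps(X, y, n_rounds=20, lr=0.5):
    """Gradient boosting for squared loss: each stump fits the residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        stump = fit_stump(X, [yi - pi for yi, pi in zip(y, preds)])
        if stump is None:
            break
        j, t, lm, rm = stump
        stumps.append(stump)
        preds = [p + lr * (lm if row[j] <= t else rm) for row, p in zip(X, preds)]
    return base, lr, stumps


def predict(model, row):
    base, lr, stumps = model
    return base + sum(lr * (lm if row[j] <= t else rm) for j, t, lm, rm in stumps)


def mse(model, X, y):
    return sum((predict(model, row) - yi) ** 2 for row, yi in zip(X, y)) / len(y)


def auxiliary_model(row):
    """Hypothetical second-stage model; a fixed rule, not a trained model."""
    return int(row[0] != row[1])


# Toy data: the target depends on an interaction of the two raw features.
X_raw = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

# Stacking: append the auxiliary model's prediction as an extra feature.
X_stacked = [row + [auxiliary_model(row)] for row in X_raw]

model_raw = fit_boosted_stumps(X_raw, y)
model_stacked = fit_boosted_stumps(X_stacked, y)
mse_raw = mse(model_raw, X_raw, y)
mse_stacked = mse(model_stacked, X_stacked, y)
```

On this constructed example the model trained on raw features cannot improve on predicting the mean, while the stacked model fits the data almost exactly, because the auxiliary output contributes information the top model could not extract on its own. Real data will rarely be this clear-cut.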

Now, where will things go? Something that is still in fundamental research and not in production yet is deep reinforcement learning. Here we train deep reinforcement learning agents, based on deep neural networks, that first of all learn consumer purchases. But then, of course, we also learn the value function for our agents, which we define in three different ways: in terms of redemption rates, revenue or profit. So we train another deep neural network to predict the value conditioned on a coupon treatment, that is, how that coupon treatment affects the value.

Once we have that, we can try to come up with a coupon policy that generates actions conditioned on a given state. The “state” in our context would be the data that we collected in the past on each consumer, i.e. the loyalty card data that we have, which the policy then transforms into the next coupon action. So this is something that we’re currently testing in the simulation. We’re seeing that this is actually quite powerful and promises very good improvements in all three metrics: redemption rates, revenue and profit.
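As a rough illustration of how a value function and a policy fit together, the sketch below picks, for a given consumer state, the coupon with the highest predicted value under each of the three metrics. It is a deliberately simplified one-step greedy policy, not deep reinforcement learning: `predicted_value` is a hand-written formula standing in for a learned value network, and the prices, margin, discount levels and state fields are all assumed for the example.

```python
PRICES = {"coffee": 5.0, "pasta": 2.0}  # assumed shelf prices
MARGIN = 0.3                             # assumed margin as a share of price
DISCOUNTS = (0.1, 0.2, 0.3)              # assumed coupon discount levels


def redemption_prob(state, coupon):
    """Assumed response model: discounts lift the base purchase frequency."""
    category, discount = coupon
    base = state["purchase_freq"][category]
    return min(1.0, base * (1.0 + state["price_sensitivity"] * discount))


def predicted_value(state, coupon, metric):
    """Stand-in for a learned value function, conditioned on the coupon."""
    category, discount = coupon
    p = redemption_prob(state, coupon)
    price = PRICES[category]
    if metric == "redemption":
        return p
    if metric == "revenue":
        return p * price * (1.0 - discount)
    if metric == "profit":
        return p * price * (MARGIN - discount)
    raise ValueError(f"unknown metric: {metric}")


def coupon_policy(state, metric):
    """Map a state to the coupon action with the highest predicted value."""
    candidates = [(category, d) for category in PRICES for d in DISCOUNTS]
    return max(candidates, key=lambda coupon: predicted_value(state, coupon, metric))


# The "state" summarizes a consumer's loyalty card history (toy values).
state = {"purchase_freq": {"coffee": 0.4, "pasta": 0.1}, "price_sensitivity": 1.0}

best_by_metric = {m: coupon_policy(state, m) for m in ("redemption", "revenue", "profit")}
```

With these toy numbers the deepest coffee discount maximizes redemption, while the smallest one maximizes revenue and profit, which shows why the choice of metric matters when defining the value function.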

So we hope that this is something we can evaluate in an A/B test in 2019 and then roll out to our clients very soon. But, as I said, right now it’s still more in the fundamental research stage and not yet in production.

Thank you so much, Sebastian, that was really insightful!

The Supermarket Gym – an interview series (edited for clarity and length) with Sebastian Gabel, Chief AI Officer at SO1:

Part 1: Introduction to the Supermarket Gym – what it is and why we developed it
Part 2: How we generate data for the Supermarket Gym to accurately simulate shopping behavior – and what type of academic research is considered
Part 3: Applying the Supermarket Gym in fundamental research: concrete examples of how SO1 uses the tool in its AI development
Part 4: Applying the Supermarket Gym in software integration projects (plus what’s next on SO1’s fundamental research agenda)
