Let the data speak: Revealing market structures in FMCG retail to drive demand

Online and offline retailers in the FMCG market must regularly make decisions about assortment planning, product placement, category management, advertising, shelf design, and product categorization. These decisions directly influence the ability of retailers to acquire new customers, develop better relationships with them, and make important cost-related assortment decisions.

Yet most assortments are still categorized with products, rather than the customer, in mind. Essentially, they are driven by logistics and category management, as opposed to the consumers’ view of market structures.

To help retailers better understand these structures, we at SO1 worked together with Humboldt University in Berlin to develop a new, scalable, and fully automated machine learning approach called P2V-MAP, based on our proprietary AI technology. The system can derive market structures from billions of shopping baskets and large product assortments without any human input.

Note: This article is adapted from a scientific publication in the Journal of Marketing Research by our Chief AI Officer Dr. Sebastian Gabel. The research has also won the EHI Retail Science Award 2020.

The product map

To demonstrate this new approach, we analyzed basket data collected in a large German city over 12 months from 147 stores of the city’s largest grocery retailer. The data consist of 3,386,500,350 training samples derived from 73,048,605 shopping baskets, and 30,763 products from 133 product categories.

P2V-MAP maps products according to their co-occurrence patterns in customers’ shopping baskets across time. It requires no product master data, no human labeling, and no third-party or personal customer data. It needs only the GTINs within shopping baskets, which all retailers have freely available through their checkout systems, and an anonymous identifier to match customers’ historical baskets (e.g., a hashed loyalty card ID).

The output is a detailed product map that clusters all products within the retailer’s portfolio. No previous assumptions are made about these product clusters; rather, the AI forms them by grouping products that are similar in the eyes of the customer.

For better human interpretation, we labeled some of these clusters by having a quick look at the products or subcategories there:

Some clusters are similar to retailers’ own product categorization, such as juices, soft drinks, bottled water, chocolate bars or wine. We can also see that non-food categories such as pet food, home care or body care are quite far from the others, reflecting the fact that customers use and shop for them in a very different way than they do for groceries.

But the majority of these product clusters do not reflect categories as defined by retailers. Instead, P2V-MAP clusters products from across several different categories, such as Italian, Barbecue, Baking, Vegetarian, Lactose-free, Organic and even Children’s Products (e.g., small juice boxes, snacks for school, bologna in the shape of teddy bears, or Disney yogurt). Products from these clusters are mostly located on different shelves throughout the store, but customers tend to buy them together.

Use cases

This deep understanding of market structure can be applied in many areas of retail. Let’s take a look at four use cases that show how grocery and drugstore retailers can apply this technology. First, a quick overview:

  1. Defining thematic categories for in-store placement and advertising — One of the main benefits of this approach is the ability to discover meaningful product clusters grouped in a way customers shop for them. The groupings that are revealed are not based on brands’ attempted positioning or the retailer’s own category definitions. Instead, they reveal how customers themselves view the products. This understanding provides an excellent foundation for designing specific in-store thematic shelves (such as BBQ, Organic or Baby products) and promoting them via mass advertising.
  2. Managing categories and brands — The insight and knowledge that retailers have at their fingertips with this technology means likely product substitutes can be easily identified. The same data that enables them to optimize category depths and evaluate the fit of new products gives them added leverage when negotiating conditions with brands.
  3. Evaluating the fit of new products and brands — A unique feature of this approach is the ability to map products that are not yet in a retailer’s portfolio and for which there is no data. This way, category managers can evaluate how both existing and new products contribute to overall category performance before even putting the new ones on their shelves.
  4. 1:1 personalization of recommendation and price — The ultimate value of this technology lies in its ability to understand customers’ individual product preferences and the price they are willing to pay for specific products. These insights can be automatically transformed into real-time decisions that personalize both product recommendation and pricing. The only thing that the AI needs is basket data and some kind of loyalty program to reach customers.

More of the use cases for this technology are described in the white paper 7 Business cases of using AI for retail promotion targeting.

1. Defining thematic categories for in-store placement and advertising

Quite often, retailers promote thematic product groups (such as Italian food, Mexican food or BBQ) to inspire their customers. But how do they know which themes are trending right now, and which exact products to pick for them to maximize their effect?

As an example, let’s zoom in to the P2V-MAP barbecue cluster and look at it at the product level:

The top part of the barbecue cluster contains ingredients for “American” barbecue dishes such as hamburgers and hot dogs. Hot dog sausages and hot dog buns are located on the left, burger buns and beef patties on the right. In between, we find condiments that can be used with both hot dogs and hamburgers (e.g., sliced pickles, fried onions, and sauces).

The close proximity on the product map suggests that there is some kind of relationship between these products, but we don’t know yet whether these relationships are complementary or substitutional. This is important because for good product recommendation we need to know which products work well together and which compete with each other.

For example, burger buns and burger patties are likely complements, meaning customers tend to buy them together. Discounting just one of the products will likely increase demand for both. But burger patties and hot dog sausages might have a substitutive relationship, meaning that discounting one of them will make the customer prefer it over the other. So if we discount burger patties, we might see a parallel increase in demand for burger buns, but a decrease in demand for multiple hot dog ingredients.

To reveal these specific relationships, P2V-MAP provides a score for each pair of products that measures how likely they are to co-occur in one basket, that is, how often consumers buy the two products together in one shopping trip. For products that sit close together on the map, a low score points to substitutes and a high score points to complements.
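As a rough illustration of what a pairwise co-occurrence score can look like, the sketch below computes lift (how often two products share a basket, divided by what chance alone would predict) from a handful of toy baskets. Lift is a simple stand-in chosen for readability and is not claimed to be the score P2V-MAP itself uses; the basket contents and the function name `lift_scores` are made up for this example.

```python
from collections import Counter
from itertools import combinations


def lift_scores(baskets):
    """Return {(a, b): lift} for every product pair that co-occurs at least once.

    lift = P(a and b in the same basket) / (P(a) * P(b)).
    Values above 1 mean the pair appears together more often than chance.
    """
    n = len(baskets)
    product_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore quantities, fix the order within a pair
        product_counts.update(items)
        pair_counts.update(combinations(items, 2))
    return {
        (a, b): (count / n) / ((product_counts[a] / n) * (product_counts[b] / n))
        for (a, b), count in pair_counts.items()
    }


# Hypothetical baskets; product names are used instead of GTINs for readability.
baskets = [
    ["burger buns", "beef patties", "sliced pickles"],
    ["burger buns", "beef patties", "fried onions"],
    ["hot dog buns", "hot dog sausages", "sliced pickles"],
    ["hot dog buns", "hot dog sausages", "fried onions"],
    ["burger buns", "beef patties"],
    ["hot dog buns", "hot dog sausages"],
]

scores = lift_scores(baskets)
print(scores[("beef patties", "burger buns")])                # bought together in every case
print(scores.get(("beef patties", "hot dog sausages"), 0.0))  # never bought together
```

In this toy data, buns and patties score well above 1, while patties and hot dog sausages never share a basket, which matches the complement and substitute pattern described above.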


As expected, co-occurrence scores are high between hot dog buns and sausages, and between burger buns and beef patties, but not across these two groups. Both groups, however, co-occur frequently with additional ingredients such as fried onions, sliced pickles, and different sauces. The system understands by itself that when promoting the barbecue cluster we always need to include either hot dogs or burgers, but in both cases also additional ingredients such as pickles, sauces or fried onions.



Just to remind ourselves: the AI uses GTINs (i.e. pure numbers) to understand how these products work together. The co-occurrence score reveals categories and even typical recipes in a fully automated manner.

These relationships allow retailers to create engaging themed product placements inside stores. These themes can also be advertised in mass channels such as leaflets, social media, out-of-home, TV or print. But the full power of such AI-based insights is best leveraged within digital channels: the AI is also able to understand relationships for an individual customer and can consider that, e.g., customer A might have a gluten intolerance, customer B might not like onions at all, and customer C might prefer the meat-free product alternative.

2. Managing categories and brands

By understanding product relationships, retailers can optimize the category depths and gain more bargaining power when dealing with brands. To better illustrate this use case, let’s zoom into one of the P2V-MAP chocolate clusters:

The first thing to note is that this specific excerpt of the chocolate cluster does not include any low-cocoa products such as milk chocolate, white chocolate or chocolate bars. This suggests that customers don’t see such products as substitutes for dark chocolate tablets.

We can also see that Lindt products (considered a premium chocolate brand in Germany) with similar percentages of cocoa, both ‘mild’ and ‘regular’ varieties, are very close together. The brands Zetti and Sarotti (more regional, old-fashioned brands, and considered to be low to medium-priced) are located away from Lindt, but very close together, indicating that customers likely wouldn’t mind switching between them. On the bottom-right, we can see a subcategory of chocolate tablets with additional ingredients (Orange, Chili, Sea Salt…).

Among the dark chocolates, the percentage of cocoa content increases from left to right. The price decreases from the bottom up, with Lindt at 1.99 euros and Zetti / Sarotti at 1.29 euros. This is a good indication that customers distinguish dark chocolates by two important attributes: price and percentage of cocoa. Of course, the brand can be a strong differentiator as well, but not in the case of Sarotti and Zetti.

Looking at the map, we can quickly recognize that products with the same percentage of cocoa and the same price are likely strong substitutes. These findings enable the retailer to either reduce the number of substitute brands or negotiate better conditions with them.

3. Evaluating the fit of new products and brands

Most techniques for visualizing market structure depend on past data. However, retailers might also want to understand the potential impact of new products on their existing market structures. In the case of the dark chocolate example, we might want to know how a new brand would fit with the existing mix—will it create additional demand or just be a substitute for other products from within the category? These insights might help to evaluate the contribution of new products to overall category performance before they are even put into stores.

Using this analysis, category managers can better understand and evaluate the benefit of adding new brands and products to their portfolio before they sign contracts and secure logistics. This will also give them more bargaining power over brands that do not represent a significant contribution to category performance.

P2V-MAP makes it possible to map products for which no shopping data exists. To understand how, we have to step back and look at how exactly the product map is created.

The first step towards creating the product map is for the algorithms to analyze customer shopping data from tens of millions of baskets. Based on the co-occurrence of products in these baskets, the AI derives more than 800 different product attributes by which every individual product is defined.

These attributes represent how customers see the products and they can be anything from “sweet”, “fresh”, “healthy” through more abstract definitions such as “premium looking” to completely unknown attributes that we, as humans, do not even have names for. Remember, these are all latent attributes defined by unsupervised machine learning algorithms, so we have no idea what they really are. All we know is that these attributes are a mathematical representation of how the product is perceived by customers.

Each product is represented by a vector of values on these 800 latent attributes. To visualize this, we would have to create an incredibly complex 800-dimensional space and place every single product in it. But unlike computers, humans can’t think in more than three dimensions, and on a computer screen, even that is too many. To make the results easier to interpret, we apply dimensionality reduction, and that is how the two-dimensional P2V-MAP is plotted.
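The article does not say which dimensionality reduction method is applied. Purely to show the idea of squeezing high-dimensional product vectors into two plottable coordinates, the sketch below uses principal component analysis (PCA) computed by power iteration in plain Python. PCA is an assumed stand-in, and the toy vectors, product names, and the function `project_2d` are hypothetical.

```python
import math
import random


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _top_direction(rows, iterations=100, seed=0):
    """Approximate the leading principal direction of centered rows by power iteration."""
    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    v = [rng.gauss(0.0, 1.0) for _ in rows[0]]
    for _ in range(iterations):
        # Apply X^T X to v without building the covariance matrix explicitly.
        scores = [_dot(row, v) for row in rows]
        w = [sum(s * row[j] for s, row in zip(scores, rows)) for j in range(len(v))]
        norm = math.sqrt(_dot(w, w))
        if norm < 1e-12:  # no variance left in the data
            return [0.0] * len(v)
        v = [x / norm for x in w]
    return v


def project_2d(vectors):
    """Map {name: vector} to {name: (x, y)} using the first two principal components."""
    names = list(vectors)
    dim = len(vectors[names[0]])
    mean = [sum(vectors[n][j] for n in names) / len(names) for j in range(dim)]
    rows = [[vectors[n][j] - mean[j] for j in range(dim)] for n in names]

    first = _top_direction(rows)
    # Deflation: subtract each row's share of the first component, then repeat.
    deflated = []
    for row in rows:
        s = _dot(row, first)
        deflated.append([x - s * f for x, f in zip(row, first)])
    second = _top_direction(deflated)

    return {
        n: (_dot(row, first), _dot(rest, second))
        for n, row, rest in zip(names, rows, deflated)
    }


# Hypothetical 4-dimensional vectors standing in for the 800+ latent attributes.
toy_vectors = {
    "cola": [1.0, 0.9, 0.1, 0.0],
    "lemonade": [0.9, 1.0, 0.0, 0.1],
    "dog food": [0.0, 0.1, 1.0, 0.9],
    "cat food": [0.1, 0.0, 0.9, 1.0],
}

coords = project_2d(toy_vectors)
for name, (x, y) in coords.items():
    print(f"{name:10s} x={x:+.2f} y={y:+.2f}")
```

The two soft drinks end up close to each other and far from the two pet foods, which is the kind of cluster structure a two-dimensional map is meant to preserve.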

Now the question is, how can we define the attributes for new products that we have no data for? Well, since every product is defined as a vector, we can use vector arithmetic to derive the attributes of a new product simply by combining the vectors of existing ones.

Let’s imagine we are considering adding a new product to our portfolio, a Mars ice cream bar. To place it in the product map, we can use the data from three other products we already have data for: Mars chocolate bars, Snickers chocolate bars, and Snickers ice cream bars. The equation would then look like this:

Mars ice cream ≈ Mars chocolate + (Snickers ice cream − Snickers chocolate)

The difference between the product vectors for Snickers ice cream and Snickers chocolate captures the “frozen” set of attributes. Adding this characteristic to the vector for Mars chocolate results in a “frozen” version of the chocolate bar, that is, Mars ice cream bars.
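The same arithmetic can be written out in a few lines of Python. This is a minimal sketch with hand-picked, hypothetical 4-dimensional vectors (real product vectors would be learned from basket data and have 800+ dimensions); the function names are invented for the example.

```python
import math

# Hypothetical product vectors, chosen by hand so the arithmetic is easy to follow.
vectors = {
    "Mars chocolate bar":     [0.9, 0.1, 0.8, 0.0],
    "Snickers chocolate bar": [0.1, 0.9, 0.8, 0.0],
    "Snickers ice cream bar": [0.1, 0.9, 0.7, 1.0],
    "Mars ice cream bar":     [0.9, 0.1, 0.7, 1.0],  # treated as unseen; used only as a check
    "Frozen pizza":           [0.0, 0.0, 0.1, 1.0],
}


def estimate_new_product(base, with_attribute, without_attribute):
    """Compute base + (with_attribute - without_attribute), element by element."""
    return [b + (w - wo) for b, w, wo in zip(base, with_attribute, without_attribute)]


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def closest_product(query, candidates):
    """Name of the candidate whose vector lies nearest to the query vector."""
    return min(candidates, key=lambda name: euclidean(query, candidates[name]))


estimate = estimate_new_product(
    vectors["Mars chocolate bar"],
    vectors["Snickers ice cream bar"],
    vectors["Snickers chocolate bar"],
)
print(closest_product(estimate, vectors))
```

With these toy numbers, the estimated vector lands next to the held-out Mars ice cream bar rather than next to the chocolate bars or the frozen pizza.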

This works surprisingly well across all kinds of product categories, as seen in the examples below:

To verify these findings, we removed all shopping baskets containing Mars ice cream bars, recalculated these product attributes using vector arithmetic and calculated a new product map. Then we compared the two maps (real vs. calculated data) and evaluated two metrics: the 19 nearest product neighbors and their distance from the observed product. While of course it was not a 100% match, the results are convincing—17 out of 19 neighbors are the same and their distances are very similar.
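A check of this kind can be sketched as a nearest-neighbor overlap count: find the k closest products to the target in the map built from real data, do the same in the map built from the estimated vector, and count how many neighbors the two lists share. The coordinates, product names, and function names below are hypothetical, and Euclidean distance is an assumption.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_neighbors(target, vectors, k):
    """Names of the k products closest to `target`, excluding the target itself."""
    others = [name for name in vectors if name != target]
    others.sort(key=lambda name: euclidean(vectors[target], vectors[name]))
    return others[:k]


def neighbor_overlap(target, observed, estimated, k):
    """How many of the k nearest neighbors agree between the two maps."""
    real = set(nearest_neighbors(target, observed, k))
    approx = set(nearest_neighbors(target, estimated, k))
    return len(real & approx)


# Toy 2-D maps: identical except for the position of the product under test.
observed = {"new": (0.0, 0.0), "a": (1.0, 0.0), "b": (0.0, 2.0), "c": (3.0, 3.0), "d": (5.0, 0.0)}
estimated = dict(observed, new=(4.0, 1.0))  # deliberately poor estimate

overlap = neighbor_overlap("new", observed, estimated, k=3)
print(f"{overlap} of 3 nearest neighbors match")
```

The article's evaluation used 19 neighbors and also compared their distances; the same functions extend to that by raising `k` and comparing the distance values directly.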

4. Promotion personalization on 1:1 level

Once the algorithms process the robust shopping data to derive more than 800 latent attributes for each product in a retailer’s portfolio, we are just one step from something that is essentially more powerful than all the use cases above—truly individual 1:1 product recommendation and price optimization.

The same machine learning algorithms from SO1 that derived the product attributes can also map customer preferences towards them. If we have some kind of loyalty program in place, the algorithms can analyze the shopping history of each individual customer in this program and capture his or her preferences on all 800 product attributes. This allows the algorithms to predict which products customers like simply by calculating the average distance between customer preferences and product vectors. Products with a smaller average distance to a customer’s preferences are more likely to be preferred.
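To make the distance idea concrete, here is a minimal sketch under two assumptions that are not taken from the article: a customer's preference vector is approximated by averaging the vectors of products they have bought, and closeness is measured with Euclidean distance. The 2-dimensional vectors, product names, and the function `recommend` are hypothetical.

```python
import math


def mean_vector(vectors):
    """Element-wise average of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def recommend(history, product_vectors, top_n=3):
    """Rank products the customer has not bought yet, closest to their profile first."""
    profile = mean_vector([product_vectors[name] for name in history])
    candidates = [name for name in product_vectors if name not in history]
    candidates.sort(key=lambda name: euclidean(profile, product_vectors[name]))
    return candidates[:top_n]


# Hypothetical product vectors on two made-up latent attributes.
product_vectors = {
    "oat milk": [1.0, 0.0],
    "tofu": [0.9, 0.1],
    "veggie burger": [0.8, 0.2],
    "beef patties": [0.0, 1.0],
    "salami": [0.1, 0.9],
}

recommendations = recommend(["oat milk", "tofu"], product_vectors)
print(recommendations)
```

A customer who has bought oat milk and tofu is ranked closest to the veggie burger, even though that product never appears in their history.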

This works for each individual customer without the need to segment them. Some eat healthy in general, but frozen pizza is their guilty pleasure. Some might be vegetarians, but from time to time they might like to have that delicious smoked salmon. The algorithms can predict all these nuances even if they don’t have individual historical data supporting them (e.g. the customers haven’t bought pizza or smoked salmon at this retailer yet). This is possible because the vector arithmetic can also be applied to customers.

This allows the AI to work with high precision even with a small amount of customer data, i.e., just five shopping trips per customer. All of this is calculated on demand, in seconds, and individually for each customer. The product recommendations can hence be delivered to any channel: email, an app, check-in coupon kiosks or printed POS receipts.

What’s more, the AI will also know how price-sensitive these customers will be towards each product offered, meaning it can calculate the discount needed to convince them to buy. This discount, too, is calculated by observing the distance between customer preferences and products. The bigger the gap, the bigger the discount will need to be to convince the customer to buy. This prevents over- and under-discounting of products in promotions, a well-known retail phenomenon.

The technology delivers a significant performance uplift on most KPIs used by retail CRM teams. Our real-world tests with some of the major German and US grocers have shown that our tested AI, the SO1 Engine, is able to increase redemption rates more than fivefold while growing revenues from loyalty club members by 15.8% and profits by 36.4%. More results are presented here.

Figure: SO1 revenue uplift (increase in targeted basket size) vs. percentage of average discount offered. This impact of individual promotions has been proven by rigorous A/B testing at all participating retailers.

The bottom line

The application of the P2V-MAP approach to real data from a leading German grocery retailer demonstrated the effectiveness of our proprietary AI technology. P2V-MAP also proved to be superior to alternative, well-established reference models in an extensive simulation study described in the full peer-reviewed article.

To learn more about the technology behind it, the SO1 Engine, visit our website, and sign up for our newsletter below to receive more articles on retail innovation and AI.

If you’re interested in a more detailed and scientifically accurate discussion of this topic, have a look at the full article in the Journal of Marketing Research.
