Automate Brand Tracking in FMCG with Machine Learning

FMCG retail is characterized by fast dynamics and aggressive competition. Retailers and CPG brands striving to keep a unique market position must constantly keep an eye on their brand performance. But among yogurts alone, there are around 1500 different products in some markets; it’s next to impossible to keep track of all of them using only panel research or manual data analysis. Our study across 147 grocery stores in Germany shows that machine learning can help.

If retailers find a way to harness the abundance of transactional data they have on customers, they can offer brands a comprehensive understanding of their position in the market. What’s more, retailers can also use this data to optimize their categories, store assortments, and promotional offers.

You might have already heard of the product-to-vec (P2V) analysis of market structures. This machine learning approach gives retailers and brands a whole new way to work with data. The research behind this approach by Dr. Sebastian Gabel, powered by our SO1 Engine, won the EHI Retail Science Award 2020 and was published in the Journal of Marketing Research.

The approach, based on deep neural networks, uses a retailer’s shopping data and looks at the co-occurrence of thousands of products in millions of shopping baskets. This allows it to map all products on 800 different vectors and place them into a visual map where every product in the retailer’s portfolio is depicted as a single dot:

The first thing to notice is that products are grouped the way customers shop for them, not the way retailers place them on shelves. Examples of such clusters are Italian food, Barbecue, Baking, Breakfast, Vegetarian, and even Children’s Products (e.g., small juice boxes, snacks for school, lunch meats in the shape of teddy bears, or Disney yogurt). We can also see which products in these groups are complements (e.g., pasta + tomato sauce) and which are substitutes (two different types of pasta).
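To make the idea of complements and substitutes more concrete, here is a minimal sketch in Python. It is not the P2V neural network from the research; it only counts co-occurrences in a handful of made-up baskets and compares products by the company they keep. The underlying assumption (ours, for illustration) is that complements are often bought together, while substitutes are rarely bought together but appear alongside the same other products. The baskets and the names `context`, `cosine`, and `relation` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations
from math import sqrt

# Hypothetical toy baskets; real input would be POS or loyalty-program data.
baskets = [
    ["pasta", "tomato sauce", "parmesan"],
    ["pasta", "tomato sauce"],
    ["penne", "tomato sauce", "parmesan"],
    ["penne", "tomato sauce"],
    ["cereal", "milk"],
]

# context[p][q] = number of baskets that contain both p and q.
context = defaultdict(Counter)
for basket in baskets:
    for p, q in permutations(set(basket), 2):
        context[p][q] += 1

def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def relation(p, q):
    """Return how often p and q share a basket and how similar their contexts are."""
    return {
        "bought_together": context[p][q],
        "context_similarity": cosine(context[p], context[q]),
    }

# Likely substitutes: never in the same basket, but nearly identical contexts.
print(relation("pasta", "penne"))
# Likely complements: frequently in the same basket.
print(relation("pasta", "tomato sauce"))
```

In this toy data, the two pasta types never share a basket but have the same neighbors, while pasta and tomato sauce are bought together in every pasta basket. A learned embedding captures the same kind of signal at a much larger scale.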

The map above comes from a large German grocery chain and is derived from 73 million shopping baskets. The analysis works completely autonomously, using only raw shopping data from POS systems and loyalty programs. Let’s look at a couple of examples of how brands can benefit from this unbiased view of market structures.

Discovering substitution

Retailers want to discover substitution to get rid of redundant brands or negotiate better conditions with them. Brands need insights on their closest competitors and how well they differentiate from them to prevent substitution in the first place.

To see how this analysis can help, let’s zoom into the dark chocolate sub-cluster:

We see that Lindt products (considered a premium chocolate brand in Germany) with similar percentages of cocoa, both ‘mild’ and ‘regular’ varieties, are very close together. The brands Zetti and Sarotti (more regional and low-priced brands) are located away from Lindt, but very close together, indicating a strong substitution relationship. On the bottom-right, we see a subcategory of dark chocolate bars with additional ingredients (Orange, Chili, Sea Salt…), indicating they hold their own unique position on the market.

This gives the brands a few important insights. First, Lindt is strongly differentiated from other brands, but the Mild version is not clearly recognized by customers. Lindt might want to better communicate the difference between the two on the package or in advertising.

Brands Zetti and Sarotti seem to be strong substitutes for each other. Customers likely don’t differentiate between them very well and wouldn’t mind switching if one of the brands was on promotion, out of stock or completely missing. A change in marketing strategy might be needed to better differentiate the two.
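Reading substitution off the map amounts to asking which products sit closest to each other. A minimal sketch of that lookup, assuming each product already has 2-D map coordinates: the coordinates and product labels below are invented for illustration and are not taken from the study, and `closest_substitutes` is a hypothetical helper.

```python
from math import dist

# Hypothetical 2-D map positions; illustrative values only.
coords = {
    "lindt_dark": (0.0, 0.0),
    "lindt_dark_mild": (0.1, 0.05),
    "zetti_dark": (2.0, 3.0),
    "sarotti_dark": (2.1, 3.05),
    "flavored_chili": (4.0, -1.0),
}

def closest_substitutes(product, coords, k=1):
    """Return the k products nearest to `product` by Euclidean distance."""
    others = [
        (dist(coords[product], xy), name)
        for name, xy in coords.items()
        if name != product
    ]
    return [name for _, name in sorted(others)[:k]]

print(closest_substitutes("zetti_dark", coords))
print(closest_substitutes("lindt_dark", coords, k=2))
```

With real embeddings, the same lookup would typically run in the full vector space rather than on the 2-D projection, since projecting to two dimensions can distort distances.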

What drives differentiation

So we already know which products are substitutes and which are uniquely positioned, but what exactly drives this differentiation? Looking into the dark chocolate sub-cluster, we can see two trends: price decreases from the bottom up (with Lindt for 2€ and Zetti/Sarotti for 1.29€), as does the percentage of cocoa from left to right. These two attributes, therefore, look to be the main differentiators, unless the chocolates are flavored.

For Zetti and Sarotti, this might mean an opportunity to introduce a flavored version of their chocolate. With lower prices and a regional brand, this new product might compete with the Lindt flavored chocolates.
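The price and cocoa trends in the dark chocolate sub-cluster were read off the map by eye, but the same check can be done numerically by correlating each map axis with each product attribute. The sketch below assumes this setup; the four products, their coordinates, and the cocoa percentages are made up, and only the two price points (2€ and 1.29€) come from the example above. `pearson` and `axis_drivers` are hypothetical helper names.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical products: (map_x, map_y, price_eur, cocoa_pct).
products = [
    (0.0, 0.0, 2.00, 85),
    (1.0, 0.1, 2.00, 70),
    (0.1, 1.0, 1.29, 85),
    (1.1, 1.1, 1.29, 70),
]

def axis_drivers(products):
    """Correlate each map axis with each attribute."""
    xs, ys, prices, cocoa = zip(*products)
    return {
        ("x", "price"): pearson(xs, prices),
        ("x", "cocoa"): pearson(xs, cocoa),
        ("y", "price"): pearson(ys, prices),
        ("y", "cocoa"): pearson(ys, cocoa),
    }

for (axis, attribute), r in axis_drivers(products).items():
    print(f"{axis} vs {attribute}: {r:+.2f}")
```

A correlation close to -1 or +1 suggests that the attribute lines up with that axis; values near zero suggest it does not. On real data, this would be run per sub-cluster, since different categories can be driven by different attributes.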

Let’s zoom into another cluster, this time wine. We see that the first and foremost differentiator is whether a wine is sparkling (“Sekt”) or standard. Small 0.25 l bottles of wine and sparkling wine hold their own position quite far from the others.

On the right, we zoom into the 0.7 l wine cluster. Price seems to be the strongest differentiator here, much more than the actual type of wine (e.g. Chatou, Sauvignon, Riesling…). When products are separated this clearly by price, it points to a highly commoditized market, which wine appears to be. This suggests that the category is mainly driven by price and promotions, but there is still room for differentiation, for instance through stronger branding or regional origins.

Product positioning

The approach can also be used to determine the positioning of certain products. For example, a CPG biscuit brand approached us to find out whether consumers perceive their products more as a snack or as a breakfast product. Looking into the map, we quickly recognized that biscuit products from this brand were located among breakfast products, quite far from other biscuits and snacks.

This insight was used to strengthen the “breakfast” positioning of the brand even more via advertising and packaging, to start partnerships with brands in complementary breakfast categories such as cereals or jams, and even to expand into some of these categories.
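The snack-or-breakfast question above can be phrased as a simple distance check: which cluster centre is the product closest to? The following sketch assumes that cluster membership and map coordinates are already known. All coordinates are invented for illustration, and `centroid` and `nearest_cluster` are hypothetical helpers rather than part of the SO1 Engine.

```python
from math import dist

def centroid(points):
    """Mean position of a list of equally sized coordinate tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def nearest_cluster(product_xy, clusters):
    """Name of the cluster whose centroid is closest to the product."""
    centroids = {name: centroid(pts) for name, pts in clusters.items()}
    return min(centroids, key=lambda name: dist(product_xy, centroids[name]))

# Hypothetical map positions of products in two clusters.
clusters = {
    "breakfast": [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1)],
    "snacks": [(5.0, 4.0), (5.2, 4.1), (4.8, 3.9)],
}
biscuit_xy = (1.3, 1.2)  # hypothetical position of the biscuit product

print(nearest_cluster(biscuit_xy, clusters))
```

Comparing against centroids is the simplest option; for irregularly shaped clusters, looking at the product’s nearest neighbors and the clusters they belong to may give a more reliable answer.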

The bottom line

The key benefits of this approach are simplicity, objectivity, and automation. Because the approach only uses raw shopping data and reflects true consumer behavior, it avoids the biases that often occur in other approaches. And because the whole analysis is automated, it can be repeated as often as needed with new or filtered data to assess the effectiveness of marketing campaigns, the introduction of new products, or seasonal and regional influences. For a better understanding of the P2V approach, have a look at the full article by Dr. Sebastian Gabel.

The technology used in this research – the SO1 Engine – is also used in our advanced personalization engine. Based on the insights gained in this process, it allows retailers to autonomously recommend the right products at the right price and time for each individual customer — all with retailers’ and brands’ business goals in mind.
