While meeting with one of our partners yesterday, we once again found ourselves discussing the importance of building customer-centric, demand-driven machine learning solutions.

In this era of Data Science, Data Mining, Machine Learning and AI hype, only added value makes a difference… well… that, and of course the ability to turn complex algorithms into comprehensible information that the business can actually relate to.

This line of thought brings us to our next article… So… let’s start over, the right way…

Hello everybody,

We do hope this article finds you well.

In today’s part of our Data Science Series we will be talking about promotional effectiveness – how changes in the price of our product affect our sales volumes. Why do we want to know that? Well, so that we can choose one campaign over another as efficiently as possible in pursuit of our final goal (e.g. maximum profitability), and do so in an informed manner, with clear expectations.

This time we have set ourselves 3 main goals, plus a final step of putting it all together, and we took an approach that is constructive both statistically and business-wise in addressing them. Please see the picture below, after which we will elaborate a bit on each topic:

Let’s dig into each of the Advanced Analytics Approach sections and see how well we do in meeting our goals:

1)     Start with the data

Never underestimate the process of getting to know your data – analyze past trends and customer behavior, set expectations for the further steps of the modelling exercise, etc. In the chart on the left we are sharing just one slice of a great visual experience. There we see how sales volume (the red line) is highly correlated with whether we are in a promotional week or not. Furthermore, we can set our expectations in terms of values: roughly what is the average sales volume, have our sales been increasing over the past year, do we observe any obvious seasonality, etc.
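
To make this first look at the data concrete, here is a minimal sketch of how the relation between sales volume and promotional weeks could be quantified. The function name `promo_summary` and the weekly figures are hypothetical and used for illustration only; they are not the data behind the chart.

```python
from math import sqrt

def promo_summary(sales, promo):
    """Compare promo and regular weeks and compute their Pearson correlation.

    sales: weekly sales volumes; promo: 1 for a promotional week, else 0.
    """
    on = [s for s, p in zip(sales, promo) if p]
    off = [s for s, p in zip(sales, promo) if not p]
    n = len(sales)
    mean_s = sum(sales) / n
    mean_p = sum(promo) / n
    cov = sum((s - mean_s) * (p - mean_p) for s, p in zip(sales, promo))
    var_s = sum((s - mean_s) ** 2 for s in sales)
    var_p = sum((p - mean_p) ** 2 for p in promo)
    return {
        "mean_promo": sum(on) / len(on),
        "mean_regular": sum(off) / len(off),
        "correlation": cov / sqrt(var_s * var_p),
    }

# Hypothetical weekly data (assumption, for illustration only).
summary = promo_summary([100, 110, 180, 190, 105, 175], [0, 0, 1, 1, 0, 1])
```

A correlation close to 1 together with a clear gap between the two means is the numerical counterpart of what the chart shows visually.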

2)     Estimate the baseline volume

This answers our question: “What would our sales volume have been had we not applied any price promotion?” Put simply – without any specific effort, how well would we do in terms of sales?

This is not a straightforward exercise, since nobody can repeat history and compare parallel realities. Thankfully, this is where statistics comes to help. Our approach:

  • Start with the periods without promotion;
  • Use a Kalman filter with a final smoothing pass over the series to fill in the information for the promotional periods.

Note: We have tested more than one approach for estimating the baseline, and in this case they give comparable results.
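
The two steps above can be sketched as follows. This is a minimal illustration and not necessarily the model we used: it assumes a simple local-level (random walk plus noise) state-space model, treats promotional weeks as missing observations, and finishes with a Rauch-Tung-Striebel smoothing pass. The function name and the noise variances `q` and `r` are hypothetical choices.

```python
def baseline_local_level(sales, promo, q=1.0, r=4.0):
    """Estimate baseline volume with a local-level Kalman filter and smoother.

    sales: weekly volumes; promo: truthy for promotional weeks.
    q, r: assumed process and observation noise variances (illustrative).
    """
    n = len(sales)
    # Start from the first non-promotional week with a very uncertain state.
    x = next(y for y, p in zip(sales, promo) if not p)
    P = 1e6
    x_pred, P_pred, x_filt, P_filt = [], [], [], []
    for y, p in zip(sales, promo):
        xp, Pp = x, P + q                 # predict: level follows a random walk
        if p:
            x, P = xp, Pp                 # promo week: skip the update step
        else:
            k = Pp / (Pp + r)             # Kalman gain
            x = xp + k * (y - xp)
            P = (1.0 - k) * Pp
        x_pred.append(xp)
        P_pred.append(Pp)
        x_filt.append(x)
        P_filt.append(P)
    # Backward (RTS) smoothing pass: uses information from later weeks too.
    smoothed = x_filt[:]
    for t in range(n - 2, -1, -1):
        c = P_filt[t] / P_pred[t + 1]
        smoothed[t] = x_filt[t] + c * (smoothed[t + 1] - x_pred[t + 1])
    return smoothed

# Hypothetical data: weeks 3 and 4 are promotional.
baseline = baseline_local_level([100, 100, 250, 260, 110, 110], [0, 0, 1, 1, 0, 0])
```

Because the promotional weeks are never fed into the update step, the baseline inside the promotion is an interpolation driven by the regular weeks before and after it.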

3)     Difference between baseline volume and actual sales


Now that we have the baseline volume we compare two things:

  • Are actual sales significantly different from the baseline volume, meaning the effect of promotions is worth the effort;
  • An initial idea of the relative importance of the different promotional activities (measured as the ratio of actual to baseline volume) – we see that the most effective promotion type is A (see the related graph).
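
As a rough sketch of these two comparisons, the snippet below computes the mean actual-to-baseline ratio per promotion type and a paired t-statistic on the differences. The choice of a paired t-test is our assumption for illustration (weekly sales are often autocorrelated, which this simple test ignores), and all names and figures are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev

def uplift_by_type(actual, baseline, promo_type):
    """Mean actual/baseline ratio per promotion type (None = no promotion)."""
    ratios = defaultdict(list)
    for a, b, t in zip(actual, baseline, promo_type):
        if t is not None:
            ratios[t].append(a / b)
    return {t: mean(v) for t, v in ratios.items()}

def paired_t_statistic(actual, baseline):
    """t-statistic of the mean difference between actual and baseline volume."""
    diffs = [a - b for a, b in zip(actual, baseline)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Hypothetical weeks with a flat baseline of 100 units.
actual = [100, 150, 160, 100, 120, 130]
baseline = [100, 100, 100, 100, 100, 100]
types = [None, "A", "A", None, "B", "B"]

uplift = uplift_by_type(actual, baseline, types)
promo_weeks = [i for i, t in enumerate(types) if t is not None]
t_stat = paired_t_statistic([actual[i] for i in promo_weeks],
                            [baseline[i] for i in promo_weeks])
```

In this toy example type A lifts volume by roughly 55% and type B by roughly 25%, which mirrors the kind of ranking shown in the graph.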

4)     Identify the volume uplift drivers

Well, isn’t it always better to know the reason behind an event (causality rather than just observing the result)? Following this belief, we investigate the patterns and relationships between the volume movements that we have and the rest of the data (the profile of our business and environment):

  • Here we use regression analysis to pinpoint significant relationships between potential volume drivers (features) and the target – sales volume;
  • Feature engineering is important – creating a better description of the environment can increase model accuracy but requires some work. Starting from the price, a few examples of additional features that were derived are: a promo-week flag, a first-week-of-promotion flag, a price lag, etc.

This already gives us inferential knowledge – which of these drivers are more significant than others and what the direction (positive/negative) of their relationship with sales volume is.
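
A minimal sketch of this step, assuming NumPy is available: the features named above are derived from the price series and an ordinary least squares regression is fitted. Defining a promotional week as "price below the regular price" is our simplifying assumption, and the function names are hypothetical.

```python
import numpy as np

def build_features(price, regular_price):
    """Columns: intercept, price, promo flag, first-promo-week flag, price lag."""
    price = np.asarray(price, dtype=float)
    promo = (price < regular_price).astype(float)       # assumed promo definition
    prev_promo = np.concatenate(([0.0], promo[:-1]))
    first_week = promo * (1.0 - prev_promo)              # promo now, not last week
    price_lag = np.concatenate(([price[0]], price[:-1])) # last week's price
    return np.column_stack([np.ones_like(price), price, promo, first_week, price_lag])

def fit_ols(X, y):
    """Least squares coefficients and their standard errors."""
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta, se
```

Dividing each coefficient by its standard error gives the usual t-statistic, and the sign of the coefficient gives the direction of the relationship. If the promotional price never varies, the price and the promo flag become perfectly collinear and one of them should be dropped.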

5)     Measure competitors’ impact

So far so good. But – yes, we can do better. Continuing our effort to create a better (more holistic) representation of the environment and so improve our predictions, we integrate what directly affects us – our competitors.

In the same manner as with feature engineering, we test features in our model such as: the relative difference between our actual price and the average market price; a variable indicating the number of competitors in the market, etc. The result is – analyzing competitors and using information about their presence on the market and their prices pays off!

Note: Even though it is outside the scope of this article, we want to stress, again, how crucial it is to investigate and validate the results of your model regardless of its type and its levels of interpretability and complexity. Examples are: validation on well-selected, previously unseen data and residual diagnostics (auto-correlation function, partial auto-correlation function, QQ-plot, etc.).
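
The two competitor features mentioned above could be derived along these lines. This is a sketch under assumptions: the average market price here is taken over competitors only (excluding our own price), and a week with no competitors gets a relative difference of 0.0. The function name is hypothetical.

```python
def competitor_features(own_price, competitor_prices):
    """Relative price gap to the market and number of active competitors.

    competitor_prices: prices of the competitors present in a given week.
    """
    n = len(competitor_prices)
    if n == 0:
        # Assumption: with no competitors there is no price gap to measure.
        return {"rel_price_diff": 0.0, "n_competitors": 0}
    market_avg = sum(competitor_prices) / n
    return {
        "rel_price_diff": (own_price - market_avg) / market_avg,
        "n_competitors": n,
    }

# Hypothetical week: we sell at 9 while two competitors sell at 10.
features = competitor_features(9.0, [10.0, 10.0])
```

A negative `rel_price_diff` means we are cheaper than the market that week; these columns can then be added to the regression alongside the price-based features.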
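
For readers who want a starting point, here is a small sketch of two of the checks named in the note: a time-ordered hold-out split and the auto-correlation function of the residuals, with the common approximate 95% band of ±1.96/√n. The function names are hypothetical and the band is a rule of thumb rather than a formal test.

```python
from math import sqrt

def time_ordered_split(rows, holdout_weeks):
    """Keep the most recent weeks as unseen validation data (no shuffling)."""
    return rows[:-holdout_weeks], rows[-holdout_weeks:]

def autocorrelation(residuals, max_lag):
    """Sample auto-correlation of the residuals for lags 0..max_lag."""
    n = len(residuals)
    m = sum(residuals) / n
    d = [r - m for r in residuals]
    denom = sum(x * x for x in d)
    return [sum(d[t] * d[t - k] for t in range(k, n)) / denom
            for k in range(max_lag + 1)]

def lags_outside_band(acf_values, n):
    """Lags whose auto-correlation exceeds the approximate 95% band."""
    band = 1.96 / sqrt(n)
    return [k for k, r in enumerate(acf_values) if k > 0 and abs(r) > band]
```

Lags that fall outside the band suggest that the model has left some time structure unexplained, for example a seasonality or a lagged promotion effect.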

6)     Inference and strategy-ready ML solution

Along the way – because a data science solution is more of a knowledge-gathering path than solely a complex modelling algorithm – we have already learned a lot. At this stage we summarize this knowledge and its place in our business strategy and automation.

Examples of what we have learned:

  • Which promotional activities have the highest impact on sales volume;
  • Is there a significant difference between the baseline volume (no promotion) and the one we get if a campaign is applied; how does that differ among the different promotion types;
  • The first week of the campaign is the most effective – going forward, the impact diminishes significantly;
  • Not all promotional activities can be related to the sales volume dynamics with certainty;
  • Competitors’ presence, their prices and their relation to our prices matter. Furthermore, we are able to identify statistically that some competitors play a much more significant role in our results than others. Knowing this can guide both our strategy and our research.
  • And others…

Now we have the knowledge path summarized – not only as general knowledge but quantified as well – and can say our solution is strategy-ready. What does that mean? It means that the solution itself is not the end goal but rather the leverage that we use to automate our processes and make our efforts more efficient. Some real-case examples:

  • Choose a specific promotion over another based on market research and competitors’ behavior;
  • If we know our customers – distribute promotions based on customer profile;
  • And many more!

We are far from thinking that this exhausts the topic, and therefore we will continue to dig deeper and peel back other layers of know-how in the quest to bring value and data-driven strategies to our clients.

Truthfully yours,

SeedSet Ltd.

____________ ______________ ______________ ______________

With this series, our aim is to increase data science coverage and to make data-driven decisions an integral part of more and more companies around the globe.