It is often easier to keep existing customers than to win new ones. Syskoplan Reply has designed a probability model for Churn Management that forecasts the churn probability of each individual customer, and has implemented it in an SAP HANA database using R.
The objective of Churn Management is to identify customers who are highly likely to leave a company. These customers can then be addressed in a targeted manner in an attempt to retain them. In the given case, Churn Management was implemented for a non-contractual relationship between customers and a trading company. The challenge is that, without a contractual commitment, the loss of a customer cannot be observed directly. A customer may still be loyal even if they have not made a purchase for a long time.
A further challenge arises from data and consumer protection: the individual transaction histories of customers may only be analysed back to a specific point in the past, usually the past two years. The diagram highlights the effect of this limitation on the data that is used. All observations outside the two-year period, i.e. those in the coloured area, are not used. This significantly affects data quality, especially for the example customers 1 and 2, which in turn affects the ability to forecast individual customer behaviour.
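The effect of the two-year limit on what a model gets to see can be illustrated with a minimal Python sketch. It is not the code used in the project (which prepared the data in SAP HANA). The function name `summarise_in_window`, the 730-day window and the example dates are assumptions made purely for illustration. The sketch reduces a purchase history to the three figures that models of this family typically work with: the number of repeat purchases, the time of the last purchase and the length of the observation period.

```python
from datetime import date, timedelta


def summarise_in_window(purchases, analysis_date, window_days=730):
    """Summarise one customer's purchases that fall inside the permitted window.

    Hypothetical helper for illustration only. Returns a tuple (x, t_x, T):
      x   = number of repeat purchases visible in the window
      t_x = days between the first visible purchase and the last visible one
      T   = days between the first visible purchase and the analysis date
    Returns None if no purchase is visible at all.
    """
    window_start = analysis_date - timedelta(days=window_days)
    # Keep only purchase days inside the window; several receipts on the
    # same day count as one purchase occasion.
    visible = sorted({d for d in purchases if window_start <= d <= analysis_date})
    if not visible:
        return None
    first, last = visible[0], visible[-1]
    x = len(visible) - 1
    t_x = (last - first).days
    T = (analysis_date - first).days
    return x, t_x, T


# Illustrative customer: two old purchases fall outside the two-year window.
analysis_date = date(2014, 1, 1)
history = [date(2010, 5, 1), date(2011, 3, 1), date(2013, 6, 1), date(2013, 9, 1)]

truncated = summarise_in_window(history, analysis_date)
full = summarise_in_window(history, analysis_date, window_days=10_000)
print("two-year window:", truncated)
print("full history:   ", full)
```

With the window applied, a customer who has in fact bought four times since 2010 looks like a recent customer with a single repeat purchase. This is the kind of distortion described for the example customers above.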
The sales receipts of all customers were stored in an SAP HANA database, and the data for forecasting customer behaviour was prepared directly in SAP HANA. For the subsequent testing of the probability model, R was used via the R integration offered by SAP. The R code was embedded in the SQL code of the HANA application. The HANA Calculation Engine then started the processing on its own in an R session on an external server, and the results were written to a table in the SAP HANA database.
In order to resolve the above-mentioned challenges, it was particularly important to choose the right model for forecasting customer behaviour. This is why R was used as the statistics environment. This open source software offers numerous statistical methods and is kept very much up to date: new statistical methods are now usually made available in R first, each in its own extension package. The software therefore places practically no restrictions on the choice of methods and models. The selected, centred beta geometric/NBD model takes two phases of customer behaviour into consideration: an active phase, in which the customer makes purchases, and an inactive phase, which begins once the customer has churned.
In a simulation study, we have shown that the selected model forecasts churn better than other models that also map two phases of customer behaviour, above all because of the two-year limit on the data that may be examined.
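The kind of forecast a two-phase model produces can be sketched with the standard BG/NBD model of Fader, Hardie and Lee. Note that this is the standard form, not the centred variant selected in the project, and that the parameter values below are illustrative assumptions rather than estimates from the project data. In practice the four parameters r, alpha, a and b would be estimated from all customers by maximum likelihood. Given a customer's number of repeat purchases x, the time of the last purchase t_x and the observation length T, the model yields the probability that the customer is still active:

```python
def p_alive_bgnbd(x, t_x, T, r, alpha, a, b):
    """Probability that a customer is still active under the standard BG/NBD model.

    x     : number of repeat purchases observed
    t_x   : time of the last purchase (same time unit as T and alpha)
    T     : length of the observation period
    r, alpha : parameters of the gamma distribution of purchase rates
    a, b     : parameters of the beta distribution of dropout probabilities
    """
    if x == 0:
        # In the standard BG/NBD model a customer can only drop out directly
        # after a purchase, so a customer without repeat purchases is
        # active by construction.
        return 1.0
    ratio = (alpha + T) / (alpha + t_x)
    return 1.0 / (1.0 + (a / (b + x - 1)) * ratio ** (r + x))


# Illustrative parameter values (assumed, time measured in weeks).
params = dict(r=0.25, alpha=4.5, a=0.8, b=2.5)

# Two customers with the same number of purchases but different recency.
recent_buyer = p_alive_bgnbd(x=4, t_x=100.0, T=104.0, **params)
silent_buyer = p_alive_bgnbd(x=4, t_x=30.0, T=104.0, **params)

print(f"recent buyer: P(active) = {recent_buyer:.2f}, churn = {1 - recent_buyer:.2f}")
print(f"silent buyer: P(active) = {silent_buyer:.2f}, churn = {1 - silent_buyer:.2f}")
```

The churn probability is simply one minus the probability of being active. The special case for x = 0 also hints at why the choice of model variant matters when histories are truncated: a customer whose earlier purchases fall outside the window may appear to have no repeat purchases at all.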
With this implementation of Churn Management, the analytical capabilities of the SAP HANA database can be fully exploited together with the powerful statistics environment R. Active and inactive customers can be identified, and customers can then be contacted in a targeted manner in order to try and retain them.