False positives: A harsh truth
Despite a string of compliance developments in recent years, companies still face the same old issues. False positives are one such serial offender. To avoid missing any critical hits, compliance systems tend to strike early, triggering false alarms. Weeding out these false-positive results can take up a substantial share of a compliance employee’s time. It is a monotonous and laborious task, not to mention a weighty responsibility. And despite all of this effort, critical hits still slip through the cracks. Reducing the incidence of false-positive results while improving hit rates would mark a considerable step forward. This is one of the key challenges being tackled by the latest systems and technologies.
Conventional classification of customer segments: The downside
Rule-based approaches to segmentation group customers according to specific criteria, or static attributes, such as “corporate client” or “international customer”. Rules are defined for these groups and threshold values are determined, taking into account relative risk assessments.
The drawback of this strategy is that the criteria for group allocation have to be defined “from outside” and hence remain static. Structures inherent in the data itself are not taken into account. In addition, only a limited number of criteria, or dimensions, can be considered at one time. This makes the threshold values both inaccurate and inflexible.
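To make the rule-based approach concrete, here is a minimal sketch. The segment names, the attribute flags, and the threshold amounts are all invented for illustration; they are assumptions and do not come from any real rule set.

```python
from dataclasses import dataclass

# Hypothetical static thresholds per segment (illustrative values only).
SEGMENT_THRESHOLDS = {
    "corporate": 50_000.0,
    "international": 10_000.0,
    "retail": 5_000.0,
}


@dataclass
class Customer:
    customer_id: str
    is_corporate: bool
    is_international: bool


def assign_segment(customer: Customer) -> str:
    """Assign a segment from fixed attributes; the first matching rule wins."""
    if customer.is_corporate:
        return "corporate"
    if customer.is_international:
        return "international"
    return "retail"


def is_flagged(customer: Customer, amount: float) -> bool:
    """Flag a payment that exceeds the fixed threshold of the customer's segment."""
    return amount > SEGMENT_THRESHOLDS[assign_segment(customer)]
```

Both the rules and the thresholds are written down in advance here, so the actual behavior of the customers never influences which segment they land in.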
This is where cutting-edge data mining techniques can be put to work – to drive up the accuracy of results. The evaluation of existing data allows criteria to be defined flexibly and gives rise to more precise threshold values.
What does compliance have to do with archaeology?
Cluster analysis (for example, k-means) is used in archaeology to determine the age of historical finds. This is based on the fact that clay fragments from the same period are more similar in design than those from different eras. Cluster analysis can therefore be used to group vast numbers of ceramic fragments and show which pieces originate from the same era. In essence, this is the overall aim of cluster analysis: to group a large quantity of objects according to their degree of similarity and so reveal the structure already inherent in the data.
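The grouping idea can be sketched with a small, plain-Python version of k-means. The fragment measurements are invented. The starting centroids are simply taken at evenly spaced positions in the input, a simplification that keeps the run reproducible; production libraries typically use randomized schemes such as k-means++ instead.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iterations=100):
    """Group points into k clusters; returns (centroids, labels)."""
    # Simplified, reproducible start: centroids at evenly spaced input positions.
    centroids = [tuple(points[i * len(points) // k]) for i in range(k)]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels


# Invented measurements per fragment: (wall thickness in mm, decoration density).
fragments = [(4.0, 0.9), (4.2, 1.0), (3.9, 0.8), (9.1, 0.2), (8.8, 0.3), (9.0, 0.1)]
centroids, labels = kmeans(fragments, k=2)
print(labels)
```

Fragments that receive the same label are the ones the algorithm considers most similar. No era was named in advance; the groups emerge from the measurements alone.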
Cluster analysis in compliance
The same approach can be used to group customers according to inherent similarities. Each customer segment is internally homogeneous: it is made up of customers with similar characteristics and preferences. Customers who differ greatly from one another end up in different segments, so the segments are heterogeneous relative to one another. With this type of customer segmentation as a basis, threshold values can be defined significantly more precisely.
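How a threshold is derived from a segment is not specified here. As one possible sketch, the example below assumes that each customer already carries a cluster label and sets the threshold of a segment to its mean payment amount plus three standard deviations. The function names, the factor of three, and the sample amounts are illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev


def segment_thresholds(amounts, labels, n_sigma=3.0):
    """Return one threshold per segment: mean + n_sigma * standard deviation."""
    grouped = defaultdict(list)
    for amount, label in zip(amounts, labels):
        grouped[label].append(amount)
    return {
        label: mean(values) + n_sigma * pstdev(values)
        for label, values in grouped.items()
    }


def is_unusual(amount, label, thresholds):
    """A payment is unusual if it exceeds the threshold of its own segment."""
    return amount > thresholds[label]


# Invented payment amounts with cluster labels from a prior segmentation step.
amounts = [100.0, 120.0, 80.0, 10000.0, 12000.0, 8000.0]
labels = [0, 0, 0, 1, 1, 1]
thresholds = segment_thresholds(amounts, labels)
```

With these numbers, a payment of 500 is unusual for segment 0 but unremarkable for segment 1. A single global threshold could not capture both cases at once.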
Advantages of using cluster analysis in customer segmentation:
- Existing data is analyzed and evaluated for inherent similarities.
- Commonalities are not based on static attributes, but rather on customer behavior.
- Empirical classification is therefore prioritized.
- Differences in customer behavior within segments are minimized.
- Differences between individual customer segments are maximized.
- Theoretically, any number of dimensions can be included in the classification, e.g. payment amount, number of transactions, general transaction behavior, sender, beneficiary, etc.
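When several dimensions are combined, they usually need to be brought onto a comparable scale first, otherwise a dimension with large numbers (such as payment amount) dominates the similarity measure. The sketch below builds a two-dimensional behavior vector per customer and standardizes each dimension. The choice of features and the helper names are assumptions made for illustration.

```python
from statistics import mean, pstdev


def build_features(transactions):
    """Map customer id -> (average payment amount, number of transactions).

    `transactions` is a dict of customer id -> list of payment amounts.
    """
    return {
        customer_id: (mean(amounts), float(len(amounts)))
        for customer_id, amounts in transactions.items()
    }


def standardize(vectors):
    """Rescale every dimension to mean 0 and standard deviation 1 (z-scores)."""
    columns = list(zip(*vectors))
    means = [mean(column) for column in columns]
    # A constant dimension has deviation 0; fall back to 1.0 to avoid dividing by zero.
    deviations = [pstdev(column) or 1.0 for column in columns]
    return [
        tuple((x - m) / s for x, m, s in zip(vector, means, deviations))
        for vector in vectors
    ]


# Invented transaction histories for three customers.
history = {"a": [100.0, 100.0], "b": [200.0] * 4, "c": [300.0] * 6}
features = build_features(history)
scaled = standardize(list(features.values()))
```

The scaled vectors can then be passed to a clustering routine. Further dimensions are added by extending the tuple returned by `build_features`.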
Even with a single dimension and a relatively small database, customer clustering can yield good results. As a rule, though, the more data is available, the more information can be gleaned from it and the more accurate the classification becomes.
The bottom line? Using cluster analysis in customer segmentation can help make your compliance solution more precise, lighten the load on your workforce – and boost confidence in your solution.