How fraud detection in the “Wild West” era was superior
- Wiebe Fokma
- Utrecht
I started detecting online banking fraud back in 2011. At that time, the Netherlands was suffering a huge peak in online banking fraud, especially from malware that injected large numbers of fraudulent transactions. As the newly installed detection system came without any detection rules, we started with a blank canvas. Looking back, it was a wonderful time of exploration, much like the Wild West. We learned by doing and improved rapidly over time. A great advantage was that the RiskShield system supported whatever we could think of. The West was wide open and ours to conquer.
Nationality? Yuck!
We had lots of data points available, and because we were using them for fraud detection, we were allowed to use them freely. Because GDPR was still something for the future, we could also use those that are nowadays considered “unethical”, such as a customer’s nationality. But were we unethical, and did we discriminate? And is everything much better nowadays with all the internal and external rules and regulations? I seriously doubt it.
Nowadays, the typical bank’s ethical committee does not allow for the use of customer nationality at all. At first glance, this seems logical, as discrimination is lurking around the corner. Yet nationality can very well be used to reduce false positives, particularly for customers with a foreign one. It can even do more.
Customer nationality can be used to prevent (indirect) discrimination
The problem is that people with a foreign nationality do behave differently. Not fraudulently, just differently, because of their upbringing, culture, etc. This means that if you create rules or models without considering nationality, your model might treat this different but genuine behaviour as deviant and generate an alert. This is called indirect discrimination. And yes, that is exactly what you want to avoid. Hypothetical? Not at all. Last year, it was revealed that the model used to detect fraud with Dutch student allowances had done exactly this: behaviour that was typically fraudulent for indigenous Dutch students was normal and genuine for students with a non-Western migration background. Around 10,000 students were the victims of this indirect discrimination and wrongly branded as fraudsters. They are, of course, being compensated.
Had the model taken nationality or non-Western background into account, it could have raised alerts only for the Dutch natives for whom this behaviour actually indicated fraud. This shows that nationality can be used to prevent discrimination.
By definition, any data point that can lead to discrimination can also be used to prevent it.
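The effect described above can be illustrated with a small sketch. It is not the student-allowance model or any bank's actual rule set; the data is synthetic, the groups "A" and "B" are placeholders, and the scoring method (distance from the median in units of the median absolute deviation) is an assumption chosen for simplicity. With a single population-wide baseline, everyone in group B is flagged simply for behaving like group B. With a baseline per group, only the customer who is unusual within their own group stands out.

```python
from statistics import median

# Synthetic, hypothetical data: monthly count of some behaviour per customer.
# In group B a count of 6 or 7 is perfectly normal; in group A it is not.
records = [
    {"id": "a1", "group": "A", "value": 0},
    {"id": "a2", "group": "A", "value": 1},
    {"id": "a3", "group": "A", "value": 0},
    {"id": "a4", "group": "A", "value": 1},
    {"id": "a5", "group": "A", "value": 0},
    {"id": "a6", "group": "A", "value": 1},
    {"id": "a7", "group": "A", "value": 0},
    {"id": "a8", "group": "A", "value": 9},  # unusual within group A
    {"id": "b1", "group": "B", "value": 6},
    {"id": "b2", "group": "B", "value": 7},
    {"id": "b3", "group": "B", "value": 6},
    {"id": "b4", "group": "B", "value": 7},
]


def robust_scores(values):
    """Distance from the median, in units of the median absolute deviation."""
    med = median(values)
    mad = max(median([abs(v - med) for v in values]), 1e-9)  # avoid dividing by zero
    return [abs(v - med) / mad for v in values]


def flagged(records, threshold=3.0, by_group=False):
    """Return the ids whose score exceeds the threshold.

    by_group=False: one baseline for the whole population.
    by_group=True:  a separate baseline per group.
    """
    buckets = {}
    for r in records:
        key = r["group"] if by_group else "all"
        buckets.setdefault(key, []).append(r)
    result = set()
    for members in buckets.values():
        scores = robust_scores([m["value"] for m in members])
        result.update(m["id"] for m, s in zip(members, scores) if s > threshold)
    return result


print(sorted(flagged(records)))                 # ['a8', 'b1', 'b2', 'b3', 'b4']
print(sorted(flagged(records, by_group=True)))  # ['a8']
```

Whether a sensitive attribute may be used this way is a legal and ethical decision, not a technical one; the sketch only shows the mechanism.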
Where did it go wrong?
So, something was lost when fraud detection progressed into the civilised era. Something has changed fundamentally, and apparently not only for the better. There are two clear causes: banks have become very cautious, and they rely heavily on machine learning.
Reason one: Banks have become risk-averse
Most importantly, banks have become extremely risk-averse. This is due to pressure from laws and regulations, consumer organisations and bad publicity. Social media platforms play a pivotal role in major modern-day scams but are able to wash their hands of responsibility, leaving banks to bear the entire burden. Banks want to demonstrate that they are going above and beyond, so any data point that could potentially lead to discrimination is typically excluded from use. Better safe than sorry.
Reason two: Detection relies on machine learning
At the same time, the banking industry is pushing for model creation with minimal human intervention. The aim is to avoid bias, and fewer experts also mean lower costs. A machine learning model is an indivisible whole. Therefore, if it doesn’t function for even a small part, the whole model must be retrained. It’s like trying to create a picture using AI. Somehow, the more detailed your description, the more unwanted deviations you get. If you give an identical description twice, you get two completely different pictures. This is the current state of AI and machine learning: you can achieve great things, but the much-needed control is lacking.
There are two possible solutions: thorough analysis of the model, or testing it. Analysing a model takes a lot of time and expertise, especially for models using hundreds of parameters, so this is not a realistic solution. Testing is therefore the way to go. However, typical machine learning testing has a significant drawback: it requires setting aside some 20-30% of the very scarce fraud data, which is then not available for training and makes the models much worse than they could be. Either way, those models are effectively black boxes.
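To make the cost of a test set concrete, here is a minimal arithmetic sketch. The figure of 200 confirmed fraud cases is purely an illustrative assumption, not a number from any real data set; the function name is hypothetical as well.

```python
def holdout_split_counts(n_fraud, holdout_pct):
    """Return (fraud cases left for training, fraud cases reserved for testing)."""
    n_test = n_fraud * holdout_pct // 100  # integer arithmetic, rounds down
    return n_fraud - n_test, n_test


# Assumed for illustration: 200 confirmed fraud cases in total.
for pct in (20, 30):
    train, test = holdout_split_counts(200, pct)
    print(f"{pct}% holdout: {train} fraud cases to learn from, {test} for testing")
```

With so few positive examples, every case moved to the test set is one the model never learns from.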
The solution in use: better safe than sorry – just do not use any risky data points
Back to the nationality of the customer. In any model that is not fully under control, whether a black box or effectively a black box, you do not want nationality to be used: the risk of discrimination is far too high. It is then better not to use it and, if testing shows any possible sign of indirect discrimination, to just retrain the model. Machine learning is currently a trick we neither really understand nor control, just like the sorcerer’s apprentice. Understandably and justifiably, this has led to regulations like the AI Act requiring human supervision.
Anomaly detection
Back in the Wild West era, we found that anomaly detection was superior. Profile normal customer behaviour and any fraud stands out, including fraud you could not think of when building the model. It is great to see that leading banks that are in full control, such as HSBC, use anomaly detection combined with machine learning: the best of both worlds. And indeed, HSBC states that if the FATF comes up with new money laundering typologies, they do not need to backtest to check whether these might have occurred at their institution because, if they had, an alert would have been raised.
It is good to see that the best of the Wild West era is making a comeback. When combined with machine learning, it is a powerful combination indeed. The rest of the industry is still struggling, but this looks like the last hurdle before the era of AI for fraud detection is truly ushered in.
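The idea of profiling normal behaviour can be sketched in a few lines. This is a simplified illustration, not the RiskShield configuration or HSBC's approach: the field names, the z-score limit of 3 and the two checks (amount and destination country) are all assumptions. The point is that nothing in the profile describes a specific fraud scenario; a transaction is flagged only because it does not fit what this customer normally does.

```python
from statistics import mean, stdev


def build_profile(history):
    """Summarise a customer's past transactions (needs at least two)."""
    amounts = [t["amount"] for t in history]
    return {
        "mean": mean(amounts),
        "std": stdev(amounts),
        "countries": {t["country"] for t in history},
    }


def anomaly_reasons(profile, txn, z_limit=3.0):
    """Return the ways in which a transaction deviates from the profile."""
    reasons = []
    std = max(profile["std"], 1e-9)  # guard against identical past amounts
    if abs(txn["amount"] - profile["mean"]) / std > z_limit:
        reasons.append("unusual_amount")
    if txn["country"] not in profile["countries"]:
        reasons.append("new_country")
    return reasons


# Hypothetical customer history: small domestic payments only.
history = [
    {"amount": 40, "country": "NL"},
    {"amount": 55, "country": "NL"},
    {"amount": 60, "country": "NL"},
    {"amount": 45, "country": "NL"},
    {"amount": 50, "country": "NL"},
]
profile = build_profile(history)

print(anomaly_reasons(profile, {"amount": 52, "country": "NL"}))    # []
print(anomaly_reasons(profile, {"amount": 58, "country": "DE"}))    # ['new_country']
print(anomaly_reasons(profile, {"amount": 4800, "country": "XX"}))  # ['unusual_amount', 'new_country']
```

A production system would profile many more dimensions and update profiles over time; the reasons returned here are what keep such a check explainable.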