
Huge Payoff Is Possible By Identifying Fraud Schemes

Darshana Daga

The volume and dollar value of fraudulent transactions have mushroomed in recent years, driven by the phenomenal growth of electronic transactions across every industry segment and by the relative anonymity that online commerce offers to crooks of all shades. In an era of extreme automation, high transaction volumes, and a highly connected world, fraudulent transactions can hit virtually any business.

Thus, there is a huge potential payoff in apprehending the crooks themselves rather than trying to detect their individual transactions in real time. That is the motive behind efforts to identify the fraud schemes these crooks perpetrate: the goal is to catch them, not just to flag individual transactions as they occur.

So, what exactly is a Fraud Scheme?

  • Casinos and Hotels monitoring cash movements for anti-money laundering compliance
  • Restaurants and Retail Chains looking to spot cashier fraud
  • Hospitals trying to track down supply chain fraud
  • Insurance Companies tracking down billions of dollars in fraudulent claims
  • Financial Institutions monitoring activities for potential account compromises
  • E-commerce Operations transacting with customers located anywhere in the world
  • Government Agencies managing tax refunds, unemployment claims, and disability claims.

These are just a few examples of situations where a business handles a large volume of transactions and is exposed to what we refer to as bad actors, who can be responsible for a large volume of fraudulent transactions. Each transaction might be small in itself and designed to be undetectable by traditional transaction-based fraud screening systems. Such systems can sometimes identify a portion of a fraudulent scheme, but most often they fail to “connect the dots” and alert the business to the overarching pattern of fraud.

In each of these scenarios, typical fraud detection systems, which screen every transaction in real time as it is processed, would fail to detect what may later appear to be a fairly evident pattern. For example, each transaction in a string of fraudulent online orders would typically appear legitimate: the credit card information would match, the customer data would not match any existing blacklist, most orders might include a single high-value product, and the IP address geolocation would match the delivery area.

What such a system cannot detect is that many of the transactions are related in some way. They might have been placed within the same timeframe, or shipped to a few specific delivery addresses. The credit cards used, although different, might all be pre-paid accounts from a known issuer of reloadable cards. The same phone number might be associated with the new accounts, even though the customer information differs. Or the weblogs might reveal that the same ‘cookie’ was associated with most of the fraudulent sessions, implying that the same computer was likely used to place the string of fraudulent orders.
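As a minimal sketch of how such links can be followed, the Python snippet below groups orders that share a phone number, delivery address, or cookie, either directly or through a chain of other orders. The sample records, the field names (`phone`, `ship_to`, `cookie`), and the `link_transactions` helper are all hypothetical and chosen only for illustration; a production system would likely use many more attributes and fuzzier matching.

```python
from collections import defaultdict

# Hypothetical sample orders; field names and values are illustrative assumptions.
ORDERS = [
    {"id": "T1", "phone": "555-0101", "ship_to": "12 Elm St", "cookie": "c-aaa"},
    {"id": "T2", "phone": "555-0101", "ship_to": "98 Oak Ave", "cookie": "c-bbb"},
    {"id": "T3", "phone": "555-0199", "ship_to": "98 Oak Ave", "cookie": "c-ccc"},
    {"id": "T4", "phone": "555-0142", "ship_to": "7 Pine Rd", "cookie": "c-ddd"},
    {"id": "T5", "phone": "555-0177", "ship_to": "3 Lake Dr", "cookie": "c-ccc"},
]

# Attributes treated as evidence that two orders are related (an assumption).
LINK_FIELDS = ("phone", "ship_to", "cookie")


def link_transactions(orders, link_fields=LINK_FIELDS):
    """Group order ids that share a linking attribute, directly or transitively."""
    parent = {o["id"]: o["id"] for o in orders}

    def find(x):
        # Walk up to the root of x's group, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Remember the first order seen for each (field, value) pair and merge
    # every later order carrying the same value into that order's group.
    first_seen = {}
    for o in orders:
        for field in link_fields:
            value = o.get(field)
            if value is None:
                continue
            key = (field, value)
            if key in first_seen:
                union(first_seen[key], o["id"])
            else:
                first_seen[key] = o["id"]

    groups = defaultdict(list)
    for o in orders:
        groups[find(o["id"])].append(o["id"])
    # Largest groups first, with ids sorted inside each group for stable output.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: (-len(g), g))


if __name__ == "__main__":
    for group in link_transactions(ORDERS):
        print(group)
```

In this made-up data, no single attribute ties T1 to T5, yet they end up in the same group: T1 and T2 share a phone number, T2 and T3 share a delivery address, and T3 and T5 share a cookie. T4 shares nothing and stays on its own.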

Detecting Patterns of Fraud

To protect your business from fraud schemes and bad actors before they can cause significant damage, you need two fundamental capabilities. First, the ability to relate transactions to each other, whether they are claims, tax returns, online purchases, money transfers, etc. Second, the ability to evaluate which groups of related transactions are “anomalies” that are suspicious and should trigger further investigation.
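The second capability can be sketched in a few lines of Python. The example below assumes that related transactions have already been grouped and that each group is summarized by a single number, here the count of transactions in the group. It then flags groups that are unusually large relative to the rest, using a modified z-score based on the median and the median absolute deviation. The group names, the counts, the `flag_anomalous_groups` helper, and the 3.5 cutoff (a common rule of thumb) are illustrative assumptions, not a description of any particular product.

```python
from statistics import median

# Hypothetical group summaries: number of linked transactions per group.
GROUP_SIZES = {"g1": 2, "g2": 1, "g3": 3, "g4": 2, "g5": 1, "g6": 2, "g7": 14}


def flag_anomalous_groups(values, threshold=3.5):
    """Return keys whose value is unusually high by a modified z-score.

    The score uses the median and the median absolute deviation (MAD), so a
    single huge group does not distort the baseline it is compared against.
    """
    data = list(values.values())
    if not data:
        return []
    med = median(data)
    mad = median([abs(v - med) for v in data])
    if mad == 0:
        # No measurable spread, so this simple score cannot rank anything.
        return []
    flagged = []
    for key, v in values.items():
        # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
        score = 0.6745 * (v - med) / mad
        if score > threshold:  # one-sided: only unusually large groups
            flagged.append(key)
    return flagged


if __name__ == "__main__":
    print(flag_anomalous_groups(GROUP_SIZES))
```

A flagged group is a candidate for further investigation, not proof of fraud. In practice a group would be described by several attributes rather than a single count.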

The bad actors are unlikely to be detectable through explicit clues, such as a customer account or a payment method, because fraudsters are very savvy at avoiding such obvious clues. They can easily create multiple accounts and use hundreds of diverse financial accounts (both for paying and receiving money). For example, fraudulent tax refund claims are typically routed to dozens of pre-paid credit card accounts, which are quite anonymous, easily available, and practical. On the other hand, a healthcare provider may have a genuine reason for submitting a pattern of high-value claims, because they happen to specialize in serving a narrow group of patients or diagnoses. Or they may not. Unfortunately, it is difficult to determine in advance which piece of data will be useful in detecting the next scheme, or which clues the fraudsters will overlook that might help link the next string of fraudulent transactions.

So, what is the solution?

Fortunately, data science allows us to create advanced anomaly detection software “engines” that use machine learning (a detailed discussion of which is beyond the scope of this article) to rapidly process very large volumes of data and analyze dozens of data elements. These engines create groups of transactions that can lead to the underlying “bad actors.” They do this by analyzing the attributes of groups of transactions and considering whether there are unusual similarities, or dissimilarities, across these “clusters” of transactions. All this is done without making specific assumptions about what is to be considered “normal” or “abnormal,” by leveraging academically sound Data Science principles.

In other words, we can use Data Science to uncover broader schemes that are not easily detectable by brute-force manual approaches. This raises the level of visibility, awareness, and preparedness to face the most sophisticated and challenging fraud schemes, because we can “see the forest for the trees!”

