Monday, January 11, 2016

Benford's Law and fraud detection

I could have majored in mathematics, but it's probably just as well that I didn't. I've forgotten most of the calculus and matrix stuff that I once barely knew. But I still have a healthy respect for mathematicians, and it turns out that the science - while abstract - has true practical benefits.

Steven J. Miller of Williams College recently wrote an article about Benford's law. The law, which did not originate with a person named Benford but with a person named Newcomb, states (in simple terms) that "often the first digits of numbers in a data set are not distributed equally." Miller provides this example:

One particularly nice illustration is the example of a geometric process, say a stock that increases 4% a year. If we start with US$1, then after one year we have $1.04. After two years, we have $1.0816, and so on, finally reaching $2 after about 17.673 years. It would take approximately 58.708 years to reach $10. If we increase by a constant multiple each time, it’ll take more time to go from 1 to 2 than from 9 to 10 because the magnitude of the increase is larger at 9 than at 1 and the distance to cover is the same.

So if you look at the first digit of the stock's value at the end of each of those 58 years, there are a lot of 1's (17 of them), fewer 2's, fewer 3's, and so forth.
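Miller's example is easy to check with a few lines of Python. This is only a rough sketch: the names `leading_digit` and `counts` are mine, not Miller's, and I am assuming the value is read at the end of each of years 1 through 58 (the last full year before the $1 reaches $10).

```python
from collections import Counter

def leading_digit(x: float) -> int:
    """Return the first significant digit of a positive number."""
    # Scale the value into the range [1, 10) and keep the integer part.
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)

# Assumption: start with $1, grow 4% a year, and read the value
# at the end of each of years 1 through 58.
values = [1.04 ** year for year in range(1, 59)]

tally = Counter(leading_digit(v) for v in values)
counts = {digit: tally.get(digit, 0) for digit in range(1, 10)}

for digit, n in counts.items():
    print(f"{digit}: {'#' * n} ({n})")
```

Under those assumptions the tally shows 17 values that start with a 1 and only 2 that start with a 9, which is the lopsided pattern the example describes.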

But Mark Nigrini of West Virginia University has a more practical example: detecting fraud in financial transactions. If you look at a monthly credit card statement, Nigrini would expect the charges to follow Benford's Law: the largest share of the numbers (about 30%) would start with a 1, fewer would start with a 2, and so on down to 9.
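Neither article excerpt spells out the actual percentages, but the usual statement of Benford's Law is that a first digit d turns up with probability log10(1 + 1/d). Here is a small sketch that prints those expected shares; the function name `benford_share` is just my own label.

```python
import math

def benford_share(d: int) -> float:
    """Expected share of numbers whose first digit is d (1 through 9)."""
    return math.log10(1 + 1 / d)

shares = {d: benford_share(d) for d in range(1, 10)}

for d, p in shares.items():
    print(f"{d}: {p:.1%}")
```

The shares run from roughly 30% for a leading 1 down to under 5% for a leading 9, and the nine of them add up to 100%.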

Fraudsters generally don't follow Benford's Law when they enter fraudulent transactions. Miller describes how one such fraudster was discovered:

An investigation at one bank turned up many more stolen card totals starting with a 4 than Benford’s law would predict. Eventually they found that a large number were around $4,800 or $4,900, and attributable to one agent who was having friends run up debts just below the threshold before reporting the card stolen! Fraudsters discovered, thanks again to Benford’s law.

I can personally attest to this. One of my credit cards was compromised in the past, and the credit card company asked me to confirm whether certain purchases were truly made by me. Ignoring the fact that all of the purchases were made at one store chain in one geographic area, and all were made on the very same day, the pattern of the numbers probably gave the credit card company a clue. Here are some of the dollar values for the transactions that the credit card company questioned:

$1.52
$43.22
$49.20
$49.49
$49.99
$49.99
$49.99
$49.99
$49.99


This is not the complete list of transactions that were questioned, but I think you get the idea.

So if you were to graph these by first digit, you would see that one transaction begins with the digit 1, none begin with 2, none begin with 3, a whole bunch begin with 4, and none begin with the digits 5 through 9.
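For anyone who wants to see that tally without graph paper, here is a minimal sketch using the nine amounts listed above. The helper `leading_digit` is my own, and I have no idea what method the credit card company actually used. Nine charges is also far too small a sample for any formal statistical test, so treat this as an illustration only.

```python
from collections import Counter

# The questioned amounts listed above, kept as strings to avoid
# any floating-point surprises.
amounts = ["1.52", "43.22", "49.20", "49.49",
           "49.99", "49.99", "49.99", "49.99", "49.99"]

def leading_digit(amount: str) -> int:
    """Return the first nonzero digit of a positive dollar amount."""
    # Strip leading zeros and the decimal point, e.g. "0.52" -> "52".
    return int(amount.lstrip("0.")[0])

tally = Counter(leading_digit(a) for a in amounts)
counts = {digit: tally.get(digit, 0) for digit in range(1, 10)}

for digit, n in counts.items():
    print(f"{digit}: {'#' * n}")
```

The printout is one mark next to the 1, eight marks next to the 4, and nothing anywhere else.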

Now that's an unusual pattern.