Friday, January 20, 2012

Question authority - looking at piracy statistics

All of us rely on some commonly-accepted wisdom and take it at face value, but often we don't probe the "wisdom" to see if it is true.

For example, a January 17 item from Jennifer Collins, discussing the (then) forthcoming Wikipedia blackout, included this statement.

The Obama administration has opposed portions of the bill and the House has also pulled back. But with piracy costing up to $775 billion a year, virtually everyone agrees the bills in some form will survive.

Steven Hodson saw this statement, and (in a post on Forty Two Times) offered the following suggestion:

It would be extremely beneficial to her audience if Ms. Collins were to provide additional substantiation for the claim in her recent post.

Remember what I said about taking things at face value? Hodson did NOT write the words I quoted above. Here's what he ACTUALLY said (and you can check me on it):

Okay Jennifer Collins consider this your official call out – where is the proof - publicly available proof that has been verified by more than just the entertainment industry that piracy costs $775 billion a year.

Prove it, please.

Show us the facts and figures, as well as where you got the information to base such a claim on.

Prove it because I am calling bullshit.

I don't know if Ms. Collins saw Mr. Hodson's request, but I did some searching on my own and found the source of the $775 billion piracy cost claim. It turns out that the figure comes from a February 2011 report from the International Chamber of Commerce. Here's part of the press release that the Chamber issued at the time:

A new report released today by the International Chamber of Commerce (ICC) indicates that the global economic and social impacts of counterfeiting and piracy will reach US$1.7 trillion by 2015 and put 2.5 million legitimate jobs at risk each year....

The report reveals that based on 2008 data, the total global economic and social impacts of counterfeit and pirated products are as much as US$775 billion every year.

OK, so there's the figure that Collins cited. But a single figure alone does not necessarily allow one to develop a full-fledged anti-piracy policy. First off, what is included in this figure? The first hint is in the press release itself:

This includes impacts of lost tax revenue and higher government spending on law enforcement and health care.

Right here you learn a little about the methodology used. The total costs used in the $775 billion figure not only include the direct estimates of losses due to piracy, but also include outside costs, such as law enforcement costs. This suggests an obvious solution - if you want to reduce the costs of piracy, reduce the amount of money you're spending on law enforcement.

To really analyze this $775 billion figure, you need to take a detailed look at the study itself. If you go to this page, you can find links to PDF versions of the Executive Summary and the Full Report. I confess that I haven't read the full report, but I have spent some time reading the Executive Summary (PDF here). Ignoring future considerations, what makes up the $775 billion figure? Page 2 of the Executive Summary lists four categories that are analyzed:

• Category 1: Counterfeit and pirated goods moving through
international trade. We update the OECD’s estimate of the value of
counterfeit and pirated goods moving through international trade, drawing
on new customs seizure data indicating that the incidence of counterfeiting
and piracy has increased relative to the 2005-based customs data used in the
OECD’s 2008 study.

• Category 2: Value of domestically produced and consumed counterfeit
and pirated products. We develop a methodology, derived from the
OECD’s modeling work, to generate an estimate of the value of domestic
manufacture and consumption of counterfeit and pirate products – thereby
capturing an estimated value of fake products that do not cross borders.

• Category 3: Volume of pirated digital products being distributed via
the Internet. We describe, evaluate and contextualize industry reports and
academic studies on the value of digital piracy of recorded music, movies
and software. We then use these studies to produce an estimate of the total
value of digital piracy that has been calculated using consistent assumptions
and methodology across these industries.

• Category 4: Broader economy-wide effects. We provide a summary of
previous analysis aimed at identifying the broader economy-wide effects of
counterfeiting and piracy.

A later statement further identifies category 4 as "Effects on government tax revenues, welfare spending, costs of crime health services, FDI flows." This is the law enforcement/health care category that I discussed earlier.

For 2008, the dollar values for these four categories are (page 3):

Category 1: Between $285 billion and $360 billion

Category 2: Between $140 billion and $215 billion

Category 3: Between $30 billion and $75 billion

Category 4: $125 billion

Which brings us to a grand total of...well, it depends. If you add the low end of each range, you get $580 billion. If you add the high end of each range, you get $775 billion.
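For anyone who wants to check the addition, here is a minimal sketch in Python that sums the low and high ends of the four ranges listed above. The variable names are mine, not the report's, and using the single Category 4 figure for both the low and the high total is my assumption about how the totals were built.

```python
# Ranges in billions of US dollars, as listed on page 3 of the Executive Summary.
# Category 4 is a single figure, so (as an assumption) it is used for both ends.
categories = {
    "Category 1": (285, 360),
    "Category 2": (140, 215),
    "Category 3": (30, 75),
    "Category 4": (125, 125),
}

# Add up the low end of every range, then the high end of every range.
low_total = sum(low for low, _ in categories.values())
high_total = sum(high for _, high in categories.values())

print(f"Low-end total:  ${low_total} billion")
print(f"High-end total: ${high_total} billion")
```

The two totals are $195 billion apart, and that entire gap comes from the width of the ranges in categories 1 through 3.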

Guess which figure the Chamber highlighted in its report.

In addition, it's important to discern when someone is making an apples-and-oranges comparison. When one is talking about rogue websites, it's improper to say that rogue websites cost $775 billion. Rogue websites would only fit into category 3 of the analysis above, which means that the rogue website cost would be at most roughly one-tenth of the $775 billion figure - $75 billion, or perhaps as little as $30 billion.

And there's a world of difference between $30 billion and $775 billion.
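To put that difference in proportion, here is a rough sketch in Python that expresses the Category 3 range as a share of the headline figure. Again, the variable names are my own, and treating Category 3 as the only category relevant to rogue websites is the assumption made in this post, not something the report states.

```python
# Figures in billions of US dollars, taken from the numbers quoted in this post.
digital_low, digital_high = 30, 75   # Category 3 range (digital piracy)
headline_total = 775                 # the figure highlighted in the press release

# Share of the headline figure attributable to Category 3 at each end of its range.
share_low = digital_low / headline_total
share_high = digital_high / headline_total

print(f"Category 3 share of the headline figure: {share_low:.1%} to {share_high:.1%}")
```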

As I said above, I have not looked into the detailed report to see exactly how the Chamber came up with these figures. If you would like to do so, go to this page and click on the "Full Report" icon (PDF).

P.S. If you believe that it's flawed to misrepresent the figures in a study, what about misrepresenting whether a cited study even exists? Kevin Fogarty wrote about this aspect of the problem. An excerpt:

According to a Government Accountability Office (GAO) report published in 2010, all the estimates on which pro-SOPA forces based their numbers – which became the justification for SOPA and PIPA – are from studies no one can find.

Though both the FBI and Customs and Border Protection are cited as primary sources of the studies – because, presumably, they'd paid to have them done and then published the results – it turned out that neither agency had ever done such a study and had no idea where the numbers came from.

In each case the first use of the figure was in a speech and press release put out by the agency – the FBI's in 2002 – with a vague reference to the origin of the figure.

When even the source every citation references as the originator of a piece of data has no idea where it came from and denies ever funding studies that might have come up with those estimates, the odds that the data are anything but a chimera begin to grow.

The most likely origination for both numbers, according to the GAO report, was that someone from the FBI or CBP used the figures in a speech, quoting or misquoting numbers without an accurate citation that would make fact-checking possible.

It just goes to show that press releases are kind of like surveys - unless you know the underlying facts and methodology, any claim in a press release should be considered questionable.

