With this post we continue our reading of Cathy O’Neil’s Weapons of Math Destruction. (If you’d like to catch up with the reading schedule, click here. All posts for this reading, including the schedule one, are grouped here.)
Here I’ll summarize this week’s chapters, then offer some discussion questions.
But first, some book club business.
Last week’s reading elicited some fine comments. Elsewhere Mike Richichi continued his bookblogging with reflections on last week’s reading. On Twitter Mark Corbett Wilson offered this bracing thought:
Elsewhere on the web, this Motherboard article explores how bias works in one of Google’s APIs.
Chapter 4, “Propaganda Machine: Online Advertising”
While the chapter title invokes the general topic of online advertising, this section really focuses on one instance: its use by for-profit colleges and universities. This continues last week’s theme of data in higher education. As Tressie Cottom also argues, O’Neil finds for-profits aggressively targeting poor people. They promote “ads that pinpoint people in great need and sell them false or overpriced promises. They find inequality and feast on it.” (70) And they work with Cottom’s “gospel of education”, that widespread belief in the fruitful connection between education and career progress (81).
In terms of online advertising in general, O’Neil points out that Facebook and Google ads are interesting WMDs because they have some unusual features, one being learning from campaigns. They energetically learn from new data, and now “sift through data on their own… [w]ith machine learning” (75). They also act at very large scale, which further strengthens their ability to learn.
The end of chapter 4 touches on one other industry using predatory advertising: payday loan outfits.
Chapter 5, “Civilian Casualties: Justice in the Age of Big Data”
Here the book shifts ground from education to criminal justice, identifying a new group of bad algorithms. We lead off with Predpol, software designed to help police determine the most criminogenic areas of a city. This causes problems when applied to nuisance crimes through a broken windows policing approach, as that disproportionately targets poor neighborhoods. A feedback loop then results when analysis gathers more data from those regions, revealing more crime. “The result is that we criminalize poverty, believing all the while that our tools are not only scientific but fair.” (91) (O’Neil offers a fine, acidic thought experiment whereby police give rich white-collar areas the same treatment, 89-90).
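That feedback loop can be illustrated with a toy simulation. This is my own sketch, not anything from the book: none of the numbers below come from O’Neil, and the model is deliberately simple. Two neighborhoods have identical true rates of nuisance crime, but the historical record over-represents neighborhood A. If patrols are dispatched in proportion to recorded crime, and patrols in turn generate more records, the initial bias never washes out:

```python
# Toy, deterministic model of the Predpol-style feedback loop.
# Assumed numbers are illustrative only.
true_rate = 0.3                  # same real crime rate in both neighborhoods
recorded = {"A": 40, "B": 10}    # biased historical record: A over-policed
patrols_per_day = 100

for day in range(30):
    total = sum(recorded.values())
    for hood in recorded:
        # Software dispatches patrols in proportion to recorded crime...
        patrols = patrols_per_day * recorded[hood] / total
        # ...and more patrols surface more crime to record.
        recorded[hood] += patrols * true_rate

ratio = recorded["A"] / recorded["B"]
print(f'A: {recorded["A"]:.0f}, B: {recorded["B"]:.0f}, ratio {ratio:.1f}')
# Prints: A: 760, B: 190, ratio 4.0
```

The 4:1 bias in the starting data is preserved indefinitely, and the absolute gap between the neighborhoods keeps growing, even though the underlying rates are equal. The data appears to confirm the original assumption, which is exactly the loop O’Neil warns about.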
O’Neil derives several general principles about WMDs from the policing experience, the first being that big data and data analytics involve “a choice between fairness and efficiency” (94). Efficiency is easier for software to handle. A second concept is that WMDs tend to be applied unequally. As with the previous chapter, many enterprises aim their data instruments at the poor. One more point flows from this: we tend to gather data unevenly, ruling out certain categories because they are uncomfortable. There is, in short, always a question of which data is obtained, and which is excluded.
The chapter concludes with a recommendation for community policing, for “attempting to build relationships in the neighborhood” rather than subjecting an area to data surveillance (103).
- Are there advertising campaigns using big data that avoid these problems?
- If non-profit higher education competition heats up, will those campuses turn to these sorts of big data campaigns?
- Mark Wilson asks a powerful question. If all data massively collected in America is biased, how should we proceed?
Next up, for November 6: chapter 6, “Ineligible to Serve: Getting a Job” and chapter 7, “Sweating Bullets: On the Job”.
Again, I go back to my discussion of narrative. Most of the people who are running these algorithms have a poor sense of storytelling (or they are evil manipulators of the data). I usually go with ignorance over conspiracy. There are simply too many players to explain away these kinds of things as some vast conspiracy (and I don’t think O’Neil does this). We live (and have always lived) in a country (world) with too small a subset of critical thinkers. You don’t have to be an advanced mathematician to understand how this system perverts itself. People will do what is easy. Critical analysis is hard. We also do a poor job of teaching it in school.
What we are seeing, and this is true of all forms of technology, is that technology is acting as an accelerant on stupidity. Was drunk wagoning a problem? Maybe. But few innocents died from it (horse-sense, you know). Adding a car to the mix, however, amplified the killing capability of stupidity and created conditions for mass carnage.
Information tech does no less. Was this data out there before computers? Yes. But it was too hard to get at to do anything meaningful with it. Now we are seeing a surfacing of data that justifies stereotyped, sometimes racist, confidence-man-like instincts, allowing them to be scaled by those who don’t have innate skills in this area. If you were a bad wagon-master, the horses would help you, to a degree. We don’t have horses anymore.
Improving the critical thinking skills of all involved would go a long way toward solving all of these problems. We should not get too distracted by the tech. It’s just a means to an end. As a society, we need to start worrying about the ends more. I think O’Neil makes us start thinking in those terms.
I’m very glad of your focus on storytelling, Tom. Will add a question about that in the next post.
Good point about the historicity of data-gathering. Now it’s scaled.
I would also recommend Wolfgang Greller on learning analytics and data ~ including cautions. (on Twitter as @WolfgangGreller)
Advertising is a particularly sticky problem, precisely because the status quo is relatively weak, so any improvement tends to justify the WMD. If a trailer for a movie I like follows me around YouTube, well, that’s kind of fun. It’s only when I click on an odd slogan and get followed around Facebook for a month by ads for a product I don’t want that it’s a problem. Similarly, I like the algorithms which consume the data from my frequent buyer card at the supermarket, giving me discounts for products I actually use. I don’t like thinking about the fact that the discount is essentially paid for with a tax on newcomers to the market (or that I’m the one paying the tax at stores where I don’t have a frequent buyer card).
Regarding the justice system, I found myself thinking about the way that policing WMDs feed into sentencing and parole WMDs. There’s some good coverage of sentencing “guidelines” over at Popehat (www.popehat.com). The point you’ll see over there most often is that news stories saying “could serve up to 50 years” should almost always be mistrusted, since the guidelines rarely produce the maximum. I also seem to remember the argument that the sentencing algorithm significantly swings power toward the prosecutor and away from the judge, since the particular way that charges are framed and pursued can have an effect on the algorithmic result which judges are often unwilling to overrule (and which juries don’t understand… see again the “black box” problem).
Excellent points, from the frequent shopper tax to the “up to” weasel phrase to the black box.
Are you seeing big data used for penal matters in Ohio?
Excellent question, and not one for which I have a good answer. (Might be an interesting book club project – identify 3-5 big data projects/potential WMDs acting on your life or in your community.) I’d tend to assume that the big cities are using big data approaches, and that rural communities like mine aren’t.
There’s a current controversy about the fact that prison inmates are counted as residents of the prison for census purposes. This increases the population of the rural districts where the prisons are, which means it increases the political power of those districts’ free voters. (And decreases the population and political power of their home communities.) Looking at it as an unintended consequence, this shows the risk of even a pretty simplistic algorithm. But I don’t think it rises to O’Neil’s definition of WMD because it’s transparent and at least theoretically subject to feedback.