The rise of big data policing rests in part on the belief that data-based decisions can be more objective, fair, and accurate than traditional policing. Data is data and thus, the thinking goes, not subject to the same subjective errors as human decision making. But in truth, algorithms encode both error and bias. As David Vladeck, the former director of the Bureau of Consumer Protection at the Federal Trade Commission (who was, thus, in charge of much of the law surrounding big data consumer protection), once warned, "Algorithms may also be imperfect decisional tools. Algorithms themselves are designed by humans, leaving open the possibility that unrecognized human bias may taint the process. And algorithms are no better than the data they process, and we know that much of that data may be unreliable, outdated, or reflect bias."

Algorithmic technologies that aid law enforcement in targeting crime must compete with a host of very human questions. What data goes into the computer model? After all, the inputs determine the outputs. How much data must go into the model? The choice of sample size can alter the outcome. How do you account for cultural differences? Sometimes algorithms try to smooth out the anomalies in the data—anomalies that can correspond with minority populations. How do you address the complexity in the data or the "noise" that results from imperfect results?

Sometimes, the machines get it wrong because of racial or gender bias built into the model. For policing, this is a serious concern. [...]

As Frank Pasquale has written in his acclaimed book The Black Box Society, "Algorithms are not immune from the fundamental problem of discrimination, in which negative and baseless assumptions congeal into prejudice. . . . And they must often use data laced with all-too-human prejudice."

Inputs go in and generalizations come out, so that if historical crime data shows that robberies happen at banks more often than at nursery schools, the algorithm will correlate banks with robberies, without any need to understand that banks hold lots of cash and nursery schools do not. "Why" does not matter to the math. The correlation is the key. Of course, algorithms can replicate past biases, so that if an algorithm is built around biased data, analysts will get a biased result. For example, if police primarily arrest people of color from minority neighborhoods for marijuana, even though people of all races and all neighborhoods use marijuana at equal rates, the algorithm will correlate race with marijuana use. The algorithm will also correlate marijuana with certain locations. A policing strategy based on such an algorithm will correlate race and drugs, even though the correlation does not accurately reflect the actual underlying criminal activity across society. And even if race were completely stripped out of the model, the correlation with communities of color might still remain because of the location. A proxy for racial bias can be baked into the system, even without any formal focus on race as a variable. [...]

As mathematician Jeremy Kun has written, "It's true that an algorithm itself is quantitative—it boils down to a sequence of arithmetic steps for solving a problem. The danger is that these algorithms, which are trained on data produced by people, may reflect the biases in that data, perpetuating structural racism and negative biases about minority groups." Big data policing involves a similar danger of perpetuating structural racism and negative biases about minority groups.
"How" we target impacts "whom" we target, and underlying existing racial biases means that data-driven policing may well reflect those biases.
>minority communities tend to be less affluent, not as well educated, and offer less of a chance of upward social mobility
>minority communities as a result have a higher crime rate
>the crime data being taken in acknowledges that those areas are committing a disproportionate amount of crime in comparison to other areas...
>so this means the technology is biased for noticing trends that occur among different racial communities

Really?
1. Statistics are easily manipulated and interpreted in different ways. They're a guideline, but on their own they're insufficient to dictate policy and prone to misuse. The real problems arise when big data is used not just to provide information but to make decisions affecting individual people.
Imagine you're a man looking to become a teacher. The school district employs an algorithm to assess job applications and potential candidates. This system takes into account dozens of characteristics and data points to evaluate your profile. One of the things it learns from the data it's trained on is that men make up the vast majority of sex offenders and are responsible for almost all cases of teachers sexually or physically abusing students. As a result, it ties men to these crimes and incorporates this into its decision-making. Every man, by default, gets a point deduction because he fits a higher-risk profile, and men will systematically be hired less often. This goes for a dozen different things. Say you're applying to a college. Its algorithm determines that people from your state / area / region / background tend to drop out more often than average. Since every student is an investment, colleges want successful ones. As such, your name is by default put at the bottom of the list, despite no person at the school having met you or being able to assess you on your merits alone. The same thing applies here. All of this, as you say, is based on accurate, real and reliable facts that notice trends in our society, yet I think you're going to have to agree that it's far from fair.
2. These algorithms exacerbate existing problems and biases by creating a feedback loop. Say the system identifies an area or a specific group that has a lot of issues with crime. As a result, the police focus their attention there and deploy more cops with a specific mandate. You will see that even more crime is now recorded in this area, merely because there are simply more cops actively looking for it. There isn't any more crime than there was before; it's just noticed more often. You then put this new data in the system and voila - feedback loop. "The computer was correct, we listened to it and caught more criminals who are black". This can lead to adverse effects and the over- or under-policing of certain areas. The attributes you've found serve as a proxy for race, and rather than fairly policing anything, you're now effectively policing people based on the color of their skin.
Take policies like stop and frisk or random traffic stops. There's been a lot of research into these, finding substantial racial bias in how they were executed. If you now use that data to train a computer to determine who should be "randomly" stopped, you'll find that it also focuses more on blacks. Aside from the problem of how this affects innocent individuals (see 1.), simply by focusing more on blacks, you'll now find more criminals among them. That's basic logic. Feed this back into the system and you'll end up with a situation where whites are given a pass or stopped less and less based on the assumption that they're less likely to be criminals, but this assumption is already based on previous data (analysis) and can therefore exacerbate the issues and bias. This can lead to the underlying problem being ignored and existing problems being continued rather than fixed.
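To show what I mean by the loop, here's a rough hypothetical simulation (all numbers invented, not real stop-and-frisk data): two areas with identical offending, where the extra patrols keep going to wherever the most incidents were recorded last time.

```python
# Hypothetical feedback-loop sketch (invented numbers). Both areas have the
# same true rate of offending, but area B starts out with more recorded hits
# because it was targeted more heavily in the past.

true_offense_rate = {"A": 0.10, "B": 0.10}
recorded_hits = {"A": 45, "B": 55}       # biased historical stop data
baseline_patrols = 40                     # every area gets some presence
hotspot_unit = 20                         # extra unit goes where the data points

for year in range(1, 6):
    hotspot = max(recorded_hits, key=recorded_hits.get)
    patrols = {area: baseline_patrols + (hotspot_unit if area == hotspot else 0)
               for area in recorded_hits}
    # What gets "found" depends on how many officers are looking, not on any
    # difference in underlying behaviour (the * 10 just scales it to a count).
    recorded_hits = {area: round(patrols[area] * true_offense_rate[area] * 10)
                     for area in recorded_hits}
    print(f"Year {year}: extra unit sent to {hotspot}, "
          f"recorded hits = {recorded_hits}")

# The extra unit goes to B every single year, and B duly produces more recorded
# hits, so the data keeps "confirming" the original skew even though offending
# is identical in both areas.
```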
3. You now also institutionalize the problem. It's easy to check a person and have them justify certain actions in order to determine if they're prejudiced or wrong, but it's a lot harder with a very intelligent computer. People take what technology says for granted and trust that it's neutral, fair and accurate, while it very often isn't. The more we move towards machine learning, the more we run the risk of incorporating these issues that are potentially extremely difficult to detect. A famous example is that of image recognition software distinguishing between different animals. An AI was trained to do this and it became extremely good at it with very little effort. So good that the people who created it became skeptical. Want to take a guess what they found when they really put it to the test? I'll comment later.
Sorry for the late reply by the way, I just got home recently, started writing, and got logged off by accident because I forgot to click the keep me logged on button.
Regardless, even with the systematic bias towards male teachers, I can say from my own anecdotal experience that the metric doesn't have as great an effect as you think.
I'll probably finish the rest tomorrow in a shorter version.
Poor people are more involved in crime > more police focus on poor areas, less on richer areas > more poor people arrested and convicted > even more evidence that the poor have criminal propensity > even more focus on the poor (reinforcing the "prison pipeline" and putting more of them in prison, which we know teaches criminal habits and makes them less likely to be employed afterwards, so they remain poor) and even more sentences and harsh judgments > system grows increasingly more "biased" against poor people because of the feedback loop (it's being proven "right" because more poor people are going to jail because of it) > the current problems of inequality persist, the underlying problem goes unaddressed, minority communities are further ostracized, the rich/privileged are given more "passes" while the poor/disadvantaged are given less leeway and more punishments > social mobility is stifled and the divide between the rich and poor grows because institutionalized computer systems serve as an added obstacle…

And all of this happens on the basis of cold, hard and factual data paired with a very smart computer. This is just one of the dozens of possible scenarios, but I hope that this clarifies what I meant. Data is not necessarily wrong or inherently bad, even when it's "biased". The point is that technology can pick up on these inequalities / problems / different treatments and actually reinforce them further because it considers them the norm. The risk is that algorithms learn from data, create generalized (and potentially "prejudiced" or "biased") profiles, and then apply them to individuals ("your father was abusive, which means that you're more likely to be abusive too, so you'll be denied a job since you're considered a potential abuser regardless of the person you are") who suffer as a consequence but have almost no way to fight back, because their disparate treatment is (often wrongfully) legitimized as "well, the computer says so and it's an intelligent piece of technology, so it's neutral and always objective".
I understand what you are saying about the feedback loops. However, to address that concern, this information should not be treated as the same data set, but rather as a subset of that data set. As I discussed in my previous post, when an area is given a more intensive treatment for the purpose of remedying the difference between that area and the norm, the data should be used to analyze how the area is improving over time. Think about it like a science experiment: the area is receiving a new variable in its equation (the increased police presence). Treating the area like the other areas is what sustains that feedback loop.
My guess is that the computer created generalizations that simplified the process of identifying animals in the quickest way possible. If I am correct, such a thing reminds me of the discussion of how AIs talking to one another "created" a new language by simplifying the syntax of English.
Despite this though, I doubt that anytime in the future we will solely rely on AIs.
Fuck the quadpost / monologue, but I'm going to take some of the stuff I wrote here and reuse it for an article I'm working on. Thanks for the help Zen.
Can I get a tldr?
Quote from: Dietrich Six on January 14, 2018, 11:43:36 PM
Can I get a tldr?

Advanced analytical systems and AI are increasingly being used to support policy being drafted and decisions being made about people, including in the area of law enforcement and criminal justice. While very beneficial in several ways, these new technologies also come with risks. The design of the system itself can be flawed, but equally realistic is that "bias" from big data will find its way into the AI that is supposed to learn from it. AIs are created to detect patterns and apply them back into practice. This not only risks decisions about an individual person being made largely on the basis of a profile of how people like him are expected to act, but also risks perpetuating current inequalities and problems.

If prejudice or other societal factors lead to cops disproportionately targeting black people in "random" vehicle stops and patdowns, an AI learning from police records and arrest data can easily pick up on the relation between race and police encounters. From this data, it can draw the conclusion that black people are more likely to be criminals than whites, and that blacks should therefore be considered more likely suspects for unsolved crimes or be subject to even more scrutiny. When presented with two identical people of the exact same background and profile (with the only difference being that one is white and the other is black), the police AI will then pick out the black guy as the likely offender, because that's what it learned from (potentially biased and flawed) arrest data in the past.

This is a major issue, as it can decrease social mobility, exacerbate inequality and result in blatantly unfair treatment of people. It's made worse because it's done by a super intelligent computer that people are unlikely to doubt (as they believe it's hard maths and completely objective) and that's very difficult to hold accountable or assess for errors and bias (due to how complex, inaccessible and secretive these systems are).
Why are we listening to computers?
Quote from: Dietrich Six on January 15, 2018, 06:18:21 PM
Why are we listening to computers?

We already are. Every time you get into a car, you trust computers to tell you how fast you're going and whether it's safe to cross the street when the light's green. Every time you sign into your PC or console and log in to a secure service, you trust that the computer isn't sending your payments to a scammer and your personal information to a hacker. It's just becoming more pervasive.

There are a lot of reasons why this is taking off the way it is. Big data analytics and predictive computing can be used very effectively for a lot of good things. It can detect and predict the spread of infectious diseases before any human could. It can pick up on possible terrorist attacks before they happen. It can pick up on patterns investigators might miss to solve cases and fight crime. It can help businesses and government allocate their resources more effectively and free up precious time and commodities to spend elsewhere. It can automate tasks and improve the economy. It can assist researchers everywhere to map and address the consequences of global warming, pollution and international conflict. It can help delivery companies route their trucks better, medical businesses cure diseases faster and cities cut down on littering and traffic accidents more efficiently. It creates fun, new technologies like automatic drones, self-driving cars and image recognition that lets computers identify what is in a picture and improve search engines. There are untold reasons why computers can help us make decisions. Problem is that, as with many new things, it's not all safe.
And this can and does go pretty far. You're applying for jobs or colleges. Your profile is checked and scored based on how well you would do. Aside from your own qualifications (degree, experience, traits), you're also scored against a general profile made of you based on similarities and the information they have on you. Your name sounds foreign or Spanish? Shame, but -10 points on language skills, because statistically those people are less fluent in English than "Richard Smith" is. You're from area X? Ouch, well that place has some of the highest substance abuse rates in the country, so you'll get a -10 on reliability because, statistically speaking, you're more likely to be a drunk or drug addict. You went to this high school? Oof, people from that school tend to have lower graduation rates than the national average, so that's a -10 on adequacy. Your parents don't have college degrees? Sucks to be you, but it's a fact that children of college-educated parents are more likely to score well on university-level tests, so -10 on efficiency. That's -40 points on your application, based entirely on hard, solid and valid statistics or facts. Perfectly reasonable, no?

Only, these aren't facts about you. They're facts about people like you, taken on average. And of course, this will hold true for many like you. They won't do as well, they will fail more and they might in the end drop out. But for many, this doesn't ring true. They aren't drunks, they are motivated, they would get good grades and they do speak English well. But in the end, they don't even get the chance to try because the system rejects them. This likely condemns them to worse jobs, a lower education and ultimately an almost guaranteed lower social status, all while people from a "good" area with rich, white parents get more opportunities, so that the inequality and the social divide grow while social mobility drops.

Obviously, this is an exaggeration. It doesn't happen now, but it very well could in the not so distant future. As machine learning and AI become more commonplace and powerful, and the amounts of different data they are fed with continue to grow, it becomes increasingly difficult to ascertain exactly what goes on in their "brain". And as these systems are almost always proprietary and owned by companies, there's almost no real way to look into them and find out how they work - especially not if you're just an ordinary person.
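Purely to show the arithmetic behind that -40, here's a hypothetical sketch of what such a scoring routine boils down to (every attribute, penalty and name is invented): group-level statistics turned into fixed deductions that get applied to an individual who may share none of those traits.

```python
# Hypothetical applicant-scoring sketch; every rule and number is invented to
# mirror the "-10 points" example above. The deductions are driven entirely by
# group-level statistics, not by anything the individual applicant has done.

GROUP_PENALTIES = [
    # (attribute, value that triggers the deduction, points, group statistic)
    ("name_origin",    "foreign",  -10, "group average English fluency"),
    ("home_area",      "area_x",   -10, "regional substance-abuse rates"),
    ("high_school",    "school_y", -10, "school-level graduation rates"),
    ("parents_degree", False,      -10, "parental education statistics"),
]

def score_applicant(applicant: dict, base_score: int = 100) -> int:
    """Start from the applicant's own merits, then apply group deductions."""
    score = base_score
    for attribute, trigger, points, _reason in GROUP_PENALTIES:
        if applicant.get(attribute) == trigger:
            score += points
    return score

# Two applicants with identical personal qualifications (same base score);
# only their backgrounds differ.
richard = {"name_origin": "local",   "home_area": "suburb",
           "high_school": "school_z", "parents_degree": True}
maria   = {"name_origin": "foreign", "home_area": "area_x",
           "high_school": "school_y", "parents_degree": False}

print(score_applicant(richard))  # 100
print(score_applicant(maria))    # 60  -> -40 before anyone has met her
```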
I brought this up to further illustrate my example from earlier. There are some serious disparate and negative effects that can come from, as you put it, "noticing trends" and applying them to decision-making without proper safeguards, oversight and mitigation techniques in place, even when they are based on solid, valid and statistically sound facts. And these things can and really do happen, partially because of how difficult it is to assess these systems and pick out flaws. Remember when Google's image recognition software identified black people as gorillas? After several years, we finally found out two days ago what their "solution" is. Instead of fixing the actual algorithm, which is a difficult thing to do even for a company like Google, they just removed gorillas from their labelling software altogether and made it into what's effectively an "advanced image recognition tool - for everything other than gorillas" package.
Regarding the first section, I feel pretty 50/50 about the implications here. While we can both agree that this isn't fair, the position of a college doing this is understandable. When it comes to compiling data, outliers shouldn't be taken as the norm of a distribution. If a college found two people of equal qualifications, but one came from a background with a family of drug abusers, I wouldn't chastise the college for choosing the safer of the two bets. However, like you said, this does create the problem of making social mobility harder for people. Generally, this is why safeguards such as affirmative action have become so commonplace.
With the latter quote, I do feel that much of what we are finding shows how much in its infancy AI software still is with respect to its future potential; a lot of what we will find right now are the hiccups that come with refining these systems. Especially right now, given that AIs can only work in a series of yes and no answers.
All of these things use data that has been collected from humans and is therefore imperfect. The real problem is that machines can't feel like humans can and will likely go to extremes that humans recognize as unsafe or irresponsible.

Artificial intelligence will likely be the downfall of mankind, and I for one do not welcome our circuited overlords.

Will we never learn, Flee?
It seems there is no backup plan if the EU turns into a Fourth Reich. In the US, people have guns if the government starts oppressing people.
Quote from: Genghis Khan on January 16, 2018, 03:05:57 PM
It seems there is no backup plan if the EU turns into a Fourth Reich. In the US, people have guns if the government starts oppressing people.

But how can the EU turn tyrannical when Muslim immigrants are going to tear down the government and turn the entire country into a barren wasteland controlled by Sharia law in the first place?
Quote from: Dietrich Six on January 15, 2018, 06:56:22 PM
All of these things use data that has been collected from humans and is therefore imperfect. The real problem is that machines can't feel like humans can and will likely go to extremes that humans recognize as unsafe or irresponsible. Artificial intelligence will likely be the downfall of mankind, and I for one do not welcome our circuited overlords. Will we never learn, Flee?

I agree with the first part but not so much the second. I think AI can be a huge force for good. We just need to be very careful and mindful from here on out. Advanced analytics need to be accountable, transparent and auditable. They need to be able to justify why they arrived at certain outcomes and how they analyzed data. Safeguards, alert mechanisms, and supervised and fair learning need to be standard and mandated by law. Independent and technically capable oversight bodies need to have access and sufficient power to scrutinize commercial and governmental dealings. The EU is taking steps towards this with its Resolutions on Big Data and Robotics as well as its new General Data Protection Regulation, but this also needs to catch on in the US (as it has in NYC, where the first algorithmic transparency bill was recently adopted). We can't shut down these technologies, and it's probably not in our best interests to do so either. We shouldn't be overly paranoid and shun them because of unlikely doomsday scenarios, but we should also show some serious restraint and take the proper steps to think this through and mitigate or avoid potentially negative consequences.
Quote from: Zen on January 15, 2018, 09:41:01 PM
With the latter quote, I do feel that much of what we are finding shows how much in its infancy AI software still is with respect to its future potential; a lot of what we will find right now are the hiccups that come with refining these systems. Especially right now, given that AIs can only work in a series of yes and no answers.

This just adds to my point, though. AI is still in its infancy (even though it can definitely do more than just provide yes or no answers), which is exactly why these are important things to consider and regulate now rather than letting it grow up without these issues being addressed.