Crowdworkers form their own digital networks


Michael Pooler

They are the invisible labourers whose toil in the digital economy powers many rising technology companies.

Crowdworkers pick up the slack where artificial intelligence meets its limits. They do small online data tasks, on an outsourced basis and usually from home, that involve basic computer skills, from labelling images and transcription to identifying pornography, which machines and algorithms alone cannot manage. The work is often repetitive and simple but requires human judgment and insight.

“For a lot of entrepreneurs working on lean start-up budgets, it’s not viable to take somebody on to do this work,” says Lilly Irani, a computer scientist at the University of California, San Diego, who sees the phenomenon as part of the broader socio-economic reconfiguration ushered in by on-demand services such as Uber, Lyft and TaskRabbit.

“[Crowdwork is] making large segments of labour infinitely flexible through computers and apps for a certain class of people – innovators and entrepreneurs,” says Ms Irani.

There are no exact figures but active crowdworkers are believed to number hundreds of thousands globally, with most concentrated in the US and India.

Champions of this emerging sector say the work is flexible and provides a path out of poverty for people in developing nations, as well as a financial lifeline in countries with weak social safety nets. Its critics say that the often minimal pay – sometimes as low as 50 cents an hour – and absence of employment rights, such as guaranteed work, sick pay and holidays, sit uneasily with the world-changing aspirations of the tech entrepreneurs and engineers who farm out the mouse-clicking and keyboard-tapping assignments.

“It’s not like a nine to five [job], where the same work is available to you,” says 27-year-old Ozlem Demirci, who lives in the US. “I call $20 [a] day a fantastic day. Some days I earn just a couple of dollars.”

This emerging but disparate virtual workforce is increasingly trying to tip the scales back in its favour.

Amazon’s Mechanical Turk, dubbed MTurk, is the best known and biggest crowdwork marketplace with a pool of 500,000 workers worldwide, according to the online retailer. Launched in 2005, it takes its name from an 18th century autonomous chess-playing machine that toured Europe before it was exposed as having a living chess master operating it from inside.

While “requesters”, who post tasks with set prices, are free to withhold payment without explanation if dissatisfied, a “Turker” who receives several bad ratings on the quality of their work is barred from viewing swaths of tasks. They have no grievance mechanism: Amazon, which receives a commission on each paid “human intelligence task”, or HIT, does not intervene in disputes. Studies put the median hourly MTurk wage at between $1.38 and $5. “How much a worker makes really depends on what tasks [they choose], how good their work is and if they are a casual or a full-time worker,” says Amazon.

Although they are geographically dispersed, crowdworkers are establishing digital versions of mutual aid and workplace solidarity. There are internet forums where Turkers, as they call themselves, share tips, experiences of particular requesters, boast about their task tallies and air anxieties about work drying up or paying too little.

TurkOpticon, a web browser plug-in developed by Ms Irani and a colleague with input from Turkers, enables workers to rate requesters on communication, generosity, fairness and promptness of payment. Some other websites provide coding to help users to operate the Amazon system more efficiently.

“Only when you harness all the tools available can you make a living wage, or [even] a good living,” says Kristy Milland, a thirtysomething mother and student in Ontario, Canada. Ms Milland says she made “double the poverty line” by Turking full-time, which was enough to support her family and pay medical bills for two years after her husband lost his job. Even so, obtaining work involves unpaid admin: “We have to look for HITs, then check out the requester on TurkOpticon, track who we have worked for and set up alerts in case they post more good work.”

However, crowdsourcing companies’ ability to get round statutory minimum wages is being challenged in the US. A California court has been asked to approve a financial settlement between San Francisco-based CrowdFlower and two of its former workers, who claimed breach of federal minimum wage laws. Ellen Doyle, a lawyer for one of the claimants, says that although the case will not set a precedent, such actions could in her view eventually lead to crowdworkers being reclassified as employees instead of their present status as independent contractors. “It isn’t a big leap to say even if you work online from your house or the coffee shop you are still performing work which is essentially highly controlled by the company.” CrowdFlower did not respond to requests for comment.

There are signs that potential litigation is prompting some companies to rethink their strategy on digital outsourcing. Some crowdwork companies are increasingly targeting video gamers or younger people with in-game rewards or Facebook credits instead of cash.

Other businesses are looking overseas. CloudFactory, which recently announced a $3m fundraising, engages workers in Kenya and Nepal. The company aims to improve its service by treating its 3,000-odd contractors better than its competitors.

Instead of an open marketplace, it uses handpicked trained and supervised workers who spend minimal time looking for tasks, says chief executive Mark Sears. “[Our workers] are usually earning $1-$3 an hour, which ironically is more than what many people are getting paid in the US [for similar work]”.

Not all crowdworkers are disempowered. One, who prefers not to be named, says he earns up to $200 on a good day and adds that some people do not pay tax on their earnings – a requirement in most jurisdictions.

While the Turker forums have pressed for and won improvements such as convincing requesters to communicate better or redesign aptitude tests and work-distribution algorithms, there is scepticism that more collective forms of self-organising can prevail. “A union would never work on any crowdsource platform,” writes one forum user. “If the workers staged a strike, thousands of non-union workers would quickly fill the void.”

Further reading: Payment by the hour, not the task

As digital crowdwork faces a backlash over low pay and absence of protection for workers, some entrepreneurs sense an opportunity in adopting socially conscious models.

“When people are badly paid and it’s relatively transactional, they show up, do the work and disappear. There’s no incentive to do a good job,” says Anand Kulkarni, chief executive of MobileWorks, whose LeadGenius platform launched in 2010 and has “several hundred” full-time workers in 50 countries. MobileWorks recently raised $6m on top of $2.2m in early-stage funding.

The company targets disadvantaged and marginalised groups, from military veterans to refugees. Unlike most other crowdwork companies, it pays hourly. The idea is to root out poor-quality work by removing the incentive to complete assignments hastily.

A US-based crowdworker for MobileWorks can expect up to 40 hours work a week, says Mr Kulkarni. “Pay is almost always above the minimum wage in the countries we are working in,” he adds. The lower-skilled tasks are assigned to developing countries.

However, its workers are freelancers, not employees, and therefore enjoy no holiday or sick pay, or security of work.




Nonprofit 2.0

Nonprofit 2.0 is more than just a conference on the next generation web. It’s a next generation conference in format. Ever attend a conference, for a keynote, and find the rest of the content to be wanting? NonProfit 2.0 delivers the best of both worlds, offering great keynote sessions led by the most innovative nonprofit campaigners, thought leaders, and strategists in the space but in an unconference way with no PowerPoint, 15 minute leads, and open questions and dialogue for fantastic conversations. Then from midmorning forward, NonProfit 2.0 shifts into a full-on Unconference with DC’s brightest minds strategizing for social good. Our past keynotes have included Jean Case of the Case Foundation, Allison Fine and Beth Kanter co-authors of the Networked Nonprofit, Stacey Monk, Co-Founder of Epic Change, Robert Wolfe, Co-Founder of Crowdrise, and Paull Young of charity: water.


The Internet of Soda: Why Coca-Cola Has Stockpiled 16 Million Network IDs


Ever used one of those Burger King soda fountains that lets you create your own drink? It’s actually the brainchild of the Coca-Cola company, and it’s connected to the internet.
They’re known as Freestyle machines. You’ll find more than 2,000 of them inside the world’s Burger Kings, and many other restaurants and movie theaters throughout the United States and the United Kingdom have them as well. They’re all connected to the internet, so that Coca-Cola can track what drinks people are making and how often.
That’s one reason Coca-Cola now owns 16 million unique network identifiers usually reserved for Wi-Fi cards and other networking equipment. This week, many people were surprised to learn that a soda company controls that many network IDs, with the geeks at Slashdot, a popular online hangout for techies, launching an epic discussion about what the company would do with all those addresses. But we already know a bit about Coke’s plans, and as we approach “The Internet of Things,” the fact of the matter is that this sort of network ID grab is par for the course.
Coca-Cola’s network ID stash isn’t that unusual, according to John Matherly, the man behind Shodan, a search engine for The Internet of Things — the ever-growing array of devices that tap into our global network. First off, Coca-Cola has owned these identifiers since at least 2010. And while 16 million may sound like a lot, it’s actually the smallest number of these unique identifiers you can reserve at one time.
Coca-Cola did not respond to a request for an interview about its Internet of Things ambitions. But in all likelihood, these addresses are already being used in the company’s Freestyle machines as well as the internet-connected vending machines it’s testing in Texas. That said, the decision to secure a block of addresses en masse may hint at even larger online ambitions for the company.
All network cards have these unique identifiers, referred to as a “media access control address,” or MAC address. These are separate from IP addresses assigned by internet service providers. A device’s MAC address will be the same on any network, regardless of the IP address. If you use your laptop on your wireless network at home and then use it on a wireless network at a coffee shop, your MAC address will remain the same but your IP address will change. In that sense, a MAC address is more like a virtual serial number than a network address.
An organization called the Institute of Electrical and Electronics Engineers, Incorporated, or IEEE, manages MAC address registration. These identifiers are usually reserved by companies that sell networking cards and equipment. Because of how identifiers are generated, companies that want to reserve a block of addresses must do so in sets of 16 million, as Matherly explains.
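The 16 million figure follows from the address layout. A MAC address is 48 bits long, and under the traditional IEEE assignment the first 24 bits identify the organization, which leaves 24 bits for the owner to number its own devices. Here is a minimal Python sketch of that arithmetic; the sample address and the helper name `split_mac` are made up for illustration.

```python
# Sketch: why one MAC address block holds about 16 million addresses.
# Assumes the traditional layout: a 24-bit organization prefix followed
# by a 24-bit device number chosen by the block's owner.

DEVICE_BITS = 24


def split_mac(mac: str) -> tuple[str, str]:
    """Split a colon-separated MAC address into (prefix, device part)."""
    octets = mac.lower().split(":")
    if len(octets) != 6:
        raise ValueError(f"expected 6 octets, got {len(octets)}")
    return ":".join(octets[:3]), ":".join(octets[3:])


# Number of distinct device numbers available under a single prefix.
block_size = 2 ** DEVICE_BITS

# Hypothetical address, not a real Coca-Cola device.
prefix, device = split_mac("AA:BB:CC:00:00:01")
print(prefix, device, block_size)
```

Every device in the block shares the same prefix, which is how a lookup of the first three octets can reveal who reserved the range.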
That’s not a lot for a company like Cisco, which sells millions of network devices each year. But for companies like Coca-Cola, it may be more than they need. That said, the mere decision to grab them by the block — as opposed to securing them one-by-one — shows Coke is serious about this internet thing. “What’s more interesting to me is that they decided to get their own MAC address range,” Matherly says. “This could’ve been a prudent move by an engineer to guarantee their devices a unique range of identifiers in the future, or it’s part of a larger strategy to make their infrastructure — including vending machines — more connected.”
In addition to the Freestyle machines, Coca-Cola has been testing 200 web-connected vending machines in the Austin, Texas area and plans to roll out tens of thousands more, according to Bloomberg Businessweek.
The killer feature of a web-connected vending machine is the ability to process credit cards and mobile payments. But the Freestyle machines and the smart vending machines also open a new frontier in collecting real-time data about customer behavior. Coca-Cola could greatly simplify logistics, as Businessweek points out, by making it possible to know when to restock a machine and how much inventory each one needs, without having to send a truck to each one first.
That only scratches the surface, Matherly says. Because the machines use digital displays, Coca-Cola could do the same type of A/B testing that marketers and political campaigners do on websites and advertisements. For example, the company could determine which arrangements of drinks lead to the most sales, and it could customize the displays on each individual machine depending on what works best for its customers.
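To make the A/B testing idea concrete, here is a rough Python sketch of how two display layouts might be compared. It uses a standard two-proportion z-test; the sales and view counts are invented, and nothing here describes how Coca-Cola actually analyzes its machines.

```python
import math


def two_proportion_z(sales_a, views_a, sales_b, views_b):
    """Return the z statistic comparing the sale rates of two layouts."""
    rate_a = sales_a / views_a
    rate_b = sales_b / views_b
    # Pooled rate under the assumption that both layouts perform the same.
    pooled = (sales_a + sales_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    return (rate_b - rate_a) / std_err


# Invented numbers: layout A versus layout B on comparable machines.
z = two_proportion_z(sales_a=120, views_a=1000, sales_b=150, views_b=1000)
print(f"z = {z:.2f}")
```

A z value beyond roughly 1.96 in either direction is the conventional threshold for calling the difference significant at the 5 percent level.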
It’s unlikely that Coca-Cola will burn through all 16 million MAC addresses with vending machines and soda fountains alone. There are only about 6 to 7 million vending machines total in the entire United States, according to a spokesperson at USA Technologies, a company that sells payment processing systems for vending machines and other retailers. There’s probably roughly twice that number in the entire world.
That means the company could take its ambitions even further.

How Your Data Are Being Deeply Mined


Alice E. Marwick


The recent revelations regarding the NSA’s collection of the personal information and the digital activities of millions of people across the world have attracted immense attention and public concern. But there are equally troubling and equally opaque systems run by advertising, marketing, and data-mining firms that are far less known. Using techniques ranging from supermarket loyalty cards to targeted advertising on Facebook, private companies systematically collect very personal information, from who you are, to what you do, to what you buy. Data about your online and offline behavior are combined, analyzed, and sold to marketers, corporations, governments, and even criminals. The scope of this collection, aggregation, and brokering of information is similar to, if not larger than, that of the NSA, yet it is almost entirely unregulated and many of the activities of data-mining and digital marketing firms are not publicly known at all.

Here I will discuss two things: the involuntary, or passive, collecting of data by private corporations; and the voluntary, or active, collection and aggregation of their own personal data by individuals. While I think it is the former that we should be more concerned with, the latter poses the question of whether it is possible for us to take full advantage of social media without playing into larger corporate interests.

Database Marketing
The industry of collecting, aggregating, and brokering personal data is known as “database marketing.” The second-largest company in this field, Acxiom, has 23,000 computer servers that process more than 50 trillion data transactions per year, according to The New York Times.1 It claims to have records on hundreds of millions of Americans, including 1.1 billion browser cookies (small pieces of data sent from a website, used to track the user’s activity), 200 million mobile profiles, and an average of 1,500 pieces of data per consumer. These data include information gleaned from publicly available records like home valuation and vehicle ownership, information about online behavior tracked through cookies, browser advertising, and the like, data from customer surveys, and “offline” buying behavior. The CEO, Scott Howe, says, “Our digital reach will soon approach nearly every Internet user in the US.”2

Visiting virtually any website places a digital cookie, or small text file, on your computer. “First-party” cookies are placed by the site itself, such as Gmail saving your password so that you don’t have to log in every time you visit the site. “Third-party cookies” persist across sites, tracking what sites you visit, in what order. For those who have logged in, Google Chrome and Firefox sync this browsing history across devices, combining what you do on your iPad with your iPhone with your laptop. This is used to deliver advertising.
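The cross-site tracking described above can be illustrated with a toy Python simulation. This is a simplified sketch, not how any particular ad network is built: the `Tracker` class, the cookie name `tracker_id`, and the site names are all hypothetical.

```python
from collections import defaultdict
import uuid


class Tracker:
    """Toy ad network whose script is embedded on many unrelated sites."""

    def __init__(self):
        # Maps a cookie id to the list of sites visited, in order.
        self.history = defaultdict(list)

    def on_page_load(self, cookie_jar: dict, site: str) -> None:
        # The browser sends the tracker's own cookie back from every site
        # that embeds the tracker, so a single id follows the user around.
        cookie_id = cookie_jar.setdefault("tracker_id", uuid.uuid4().hex)
        self.history[cookie_id].append(site)


tracker = Tracker()
browser_cookies = {}  # cookies this browser holds for the tracker's domain
for site in ["shoes.example", "news.example", "recipes.example"]:
    tracker.on_page_load(browser_cookies, site)

visits = tracker.history[browser_cookies["tracker_id"]]
print(visits)
```

None of the three sites shares data with the others, yet the tracker ends up with the full ordered browsing trail because the same cookie is presented on each visit.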

For example, a few nights ago I was browsing LLBean.com for winter boots on my iPhone. A few days later, LLBean.com ads showed up on a news blog I was reading on my iPad. This “behavioral targeting” is falling out of fashion in favor of “predictive targeting,” which uses sophisticated data-mining techniques to predict for L.L.Bean whether or not I am likely to purchase something upon seeing an LLBean.com ad.

Acxiom provides “premium proprietary behavioral insights” that “number in the thousands and cover consumer interests ranging from brand and channel affinities to product usage and purchase timing.” In other words, Acxiom creates profiles, or digital dossiers, about millions of people, based on the 1,500 points of data about them it claims to have. These data might include your education level; how many children you have; the type of car you drive; your stock portfolio; your recent purchases; and your race, age, and education level. These data are combined across sources—for instance, magazine subscriber lists and public records of home ownership—to determine whether you fit into a number of predefined categories such as “McMansions and Minivans” or “adult with wealthy parent.”3 Acxiom is then able to sell these consumer profiles to its customers, who include twelve of the top fifteen credit card issuers, seven of the top ten retail banks, eight of the top ten telecom/media companies, and nine of the top ten property and casualty insurers.

Acxiom may be one of the largest data brokers, but it represents a dramatic shift in the way that personal information is handled online. The movement toward “Big Data,” which uses computational techniques to find social insights in very large groupings of data, is rapidly transforming industries from health care to electoral politics. Big Data has many well-known social uses, for example by the police and by managers aiming to increase productivity. But it also poses new challenges to privacy on an unprecedented level and scale. Big Data is made up of “little data,” and these little data may be deeply personal.
Alone, the fact that you purchased a bottle of cocoa butter lotion from Target is unremarkable. Target, on the other hand, assigns each customer a single Guest ID number, linked to their credit card number, e-mail address, or name. Every purchase and interaction you have with Target is then linked to your Guest ID, including the cocoa butter.

Now, Target has spent a great deal of time figuring out how to market to people about to have a baby. While most people remain fairly constant in their shopping habits—buying toilet paper here, socks there—the birth of a child is a life change that brings immense upheaval. Since birth records are public, new parents are bombarded with marketing and advertising offers. So Target’s goal was to identify parents before the baby was born. The chief statistician for Target, Andrew Pole, said, “We knew that if we could identify [new parents] in their second trimester, there’s a good chance we could capture them for years.”4 Pole had been mining immense amounts of data about the shopping habits of pregnant women and new parents. He found that women purchased certain things during their pregnancy, such as cocoa butter, calcium tablets, and large purses that could double as diaper bags.

Target then began sending targeted mail to women during their pregnancy. This backfired. Women found it creepy—how did Target know they were pregnant? In one famous case, the father of a teenage girl called Target to complain that it was encouraging teen pregnancy by mailing her coupons for car seats and diapers. A week later, he called back and apologized; she hadn’t told her father yet that she was pregnant.5

So the Target managers changed their tactics. They mixed in coupons for wine and lawnmowers with those for pacifiers and Baby Wipes. Pregnant women could use the coupons without realizing that Target knew they were pregnant. As Pole told The New York Times Magazine, “Even if you’re following the law, you can do things where people get queasy.”

These same techniques were used to great effect by the Obama campaign before the 2012 election. Famously, the campaign recruited some of the most brilliant young experts in analytics and behavioral science, and put them in a room called “the cave” for sixteen hours a day.6 The chief data scientist for the campaign was an analyst who had formerly mined Big Data to improve supermarket promotions. This “dream team” was able to deliver microtargeted demographics to Obama—they could predict exactly how much money they would get back from each fund-raising e-mail. When the team discovered that East Coast women between thirty and forty were not donating as much as might be expected, they offered a chance to have dinner with Sarah Jessica Parker as an incentive.7 Every evening, the campaign ran 66,000 simulations to model the state of the election. The Obama analysts were not only using cutting-edge database marketing techniques, they were developing techniques that were far beyond the state of the art.

The Obama campaign’s tactics illuminate something that is often missed in our discussions of data-mining and marketing—the fact that governments and politicians are major clients of marketing agencies and data brokers. For instance, the campaign bought data on the television-watching habits of Ohioans from a company called FourthWallMedia. Each household was assigned a number, but the names of those in the household were not revealed. The Obama campaign, however, was able to combine lists of voters with lists of cable subscribers, which it could then coordinate with the supposedly anonymous ID numbers used to track the usage patterns of television set-top boxes.8 It could then target campaign ads to the exact times that certain voters were watching television. As a result, the campaign bought airtime during unconventional programming, like Sons of Anarchy, The Walking Dead, and Don’t Trust the B— in Apt. 23, rather than during local news programming as conventional wisdom would have advised.
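The kind of list matching described here can be sketched in a few lines of Python. The example assumes, purely for illustration, that street address is the shared field between the voter list and the subscriber list; the source does not say which fields the campaign used, and all names, addresses, and IDs below are invented.

```python
# Sketch of record linkage: "anonymous" household ids become identifiable
# once a second list shares a field (here, address) with a named list.

voters = [
    {"name": "A. Voter", "address": "12 Elm St"},
    {"name": "B. Voter", "address": "9 Oak Ave"},
]
subscribers = [
    {"address": "12 Elm St", "household_id": "H-001"},
    {"address": "9 Oak Ave", "household_id": "H-002"},
]
viewing = [
    {"household_id": "H-001", "show": "late-night drama", "hour": 23},
    {"household_id": "H-002", "show": "local news", "hour": 18},
]

# Step 1: address -> household id, from the subscriber list.
id_by_address = {s["address"]: s["household_id"] for s in subscribers}

# Step 2: household id -> viewing records, from the set-top box data.
views_by_id = {}
for row in viewing:
    views_by_id.setdefault(row["household_id"], []).append(
        (row["show"], row["hour"])
    )

# Step 3: chain the two lookups to attach viewing habits to named voters.
linked = {
    v["name"]: views_by_id.get(id_by_address.get(v["address"]), [])
    for v in voters
}
print(linked)
```

The viewing data never contains a name, but two ordinary dictionary lookups are enough to tie each record back to a person.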

The “cave dwellers” were even able to match voter lists with Facebook information, using “Facebook Connect,” Facebook’s sign-on technology, which is used for many sign-ups and commenting systems online. Knowing that some of these users were Obama supporters, the campaign could figure out how to get them to persuade their perhaps less motivated friends to vote. Observing lists of Facebook friends and comparing them with tagged photos, the campaign matched these “friends” with lists of persuadable voters and then mobilized Obama supporters to convince their “real-life” friends to vote.

Social Media
In view of these sophisticated data-mining and analyzing techniques, is there any way we can use social media—or the Internet itself—without adding to our profiles collected by companies like Acxiom, Experian, or Epsilon?

Social media allow us to collect and track data about ourselves. For instance, I have been using a website called Last.fm since 2005 to track every piece of digital music I have listened to when using iTunes or Spotify. As a result, I have a fascinating picture of how my musical tastes have changed over time, and Last.fm is able to recommend obscure bands to me based on this extensive listening history.

Using social media allows us to connect with friends; to learn more about ourselves; even to improve our lives. The Quantified Self movement, which builds on techniques used by women for decades, such as counting calories, promotes the use of personal data for self-knowledge. Measuring your sleep cycles over time, for instance, can help you learn to avoid caffeine after 4:00 pm, or realize that, if you want to fall asleep, you can’t use the Internet for an hour before bedtime.

But these data are immensely beneficial to data brokers. Imagine how a health insurer might react to viewing your caloric intake on MyFitnessPal, the number of steps you walk per day tracked by Fitbit, how often you check in to your local gym using Foursquare, and what you eat based on the pictures of your meals that you post on Instagram. Each piece of information, by itself, may be inconsequential, but the aggregation of this information creates a larger picture. Data trackers can centrally access such information and add it to their databases. Two large consequences of this collection of data deserve more attention.

The first is data discrimination. Once customers are sliced and diced into segmented demographic categories, they can be sorted. An Acxiom presentation to the Consumer Marketing Organization in 2013 placed customers into “customer value segments” and noted that while the top 30 percent of customers add 500 percent of value, the bottom 20 percent actually cost 400 percent of value. In other words, it behooves companies to shower their top customers with attention, while ignoring the bottom 20 percent, who may spend “too much” time on customer service calls, and may cost companies in returns or coupons, or otherwise cost more than they provide.

These “low-value targets” are known in industry parlance as “waste.” Joseph Turow, a University of Pennsylvania professor in communications who studies niche marketing, asks what happens to those people who fall into the categories of “waste,” entirely without their knowledge or any notification. Do they suffer price discrimination? Poor service? Do they miss out on the offers given to others? Such discrimination is still more insidious because it is entirely invisible.

Second, we may be more concerned with government surveillance than with marketers or data brokers collecting personal information, but this ignores the fact that the government regularly purchases data from these companies. ChoicePoint, now owned by Elsevier, was an enormous data aggregator that combined personal data extracted from public and private databases, including Social Security numbers, credit reports, and criminal records. It maintained 17 billion records on businesses and individuals, which it sold to approximately 100,000 clients, including thirty-five government agencies and seven thousand federal, state, and local law enforcement agencies.9

For instance, the State Department purchased records on millions of Latin American citizens, which were then checked against immigration databases. ChoicePoint was also investigated for selling 145,000 personal records to an identity theft ring. More recently, Experian, one of the three major credit bureaus, mistakenly sold personal records to a Vietnamese hacker. Scammers refer to these records, which include Social Security numbers and mothers’ maiden names, as “fullz,” because they contain enough personal information for crooked operators to apply for credit cards or take out loans.

A few years ago, I toured the experimental lab of a large advertising agency. They showed me the cutting edge of consumer-monitoring technologies. Someday, not too far in the future, if you’re at Duane Reade, aimlessly staring at a giant shelf of shampoo trying to figure out which to buy, the shelf will track your eye movements and which bottles you pick up and examine in more detail. Using this data, Duane Reade can algorithmically generate a coupon for a particular brand of shampoo, which you can then print from the shelf. I watched an experimental application that tracks the movements of individuals through a mall, based on the unique identifiers, or MAC addresses, of their cell phones, kept in purses or pockets but available to wireless tracking devices. Again, in all of these cases, the individuals are unaware that they are being tracked. A description of such procedures may be hidden at the end of a byzantine privacy policy people may not have noticed when they bought their devices, or written on a notice next to a CCTV camera. Though they may not be technically illegal, they seem ethically dubious.

While the easy answer to these problems is to opt out of loyalty cards, Internet use, or social media, this is hardly realistic. In fact, it is practically impossible to live life, online or offline, without being tracked—unless one takes extreme measures of avoidance. Cities track car movements; radio-frequency identification (RFID) tags are attached to clothing and dry cleaning; CCTV cameras are in most stores.10 The technology is developing far more rapidly than our consumer protection laws, which in many cases are out of date and difficult to apply to our networked world.

The Federal Trade Commission and the Senate Commerce Committee are currently investigating data brokers and calling for more transparency in the collection and dissemination of personal information. Those of us concerned with privacy must continue to demand that checks and balances be applied to these private corporations. People should be encouraged to investigate the various opt-out tools, ad-blockers, and plug-ins that are available for most platforms. While closer scrutiny of the NSA is necessary and needed, we must apply equal pressure to private corporations to ensure that seemingly harmless targeted mail campaigns and advertisements do not give way to insidious and dangerous violations of personal privacy.

1 See Natasha Singer, “Acxiom, the Quiet Giant of Consumer Database Marketing,” The New York Times, June 16, 2012.
2 See Judith Aquino, “Acxiom Prepares New ‘Audience Operating System’ Amid Wobbly Earnings,” AdExchanger.com, August 1, 2013.
3 See Natasha Singer, “A Data Broker Offers a Peek Behind the Curtain,” The New York Times, August 31, 2013.
4 See Charles Duhigg, “How Companies Learn Your Secrets,” The New York Times Magazine, February 16, 2012.
5 See Kashmir Hill, “How Target Figured Out a Teen Girl Was Pregnant Before Her Father Did,” Forbes, February 16, 2012.
6 See Jim Rutenberg, “Data You Can Believe In: The Obama Campaign’s Digital Masterminds Cash In,” The New York Times Magazine, June 20, 2013.
7 See Michael Scherer, “Inside the Secret World of the Data Crunchers Who Helped Obama Win,” Time, November 7, 2012.
8 See Lois Beckett, “Everything We Know (So Far) About Obama’s Big Data Tactics,” ProPublica, November 29, 2012.
9 See “ChoicePoint,” at the Electronic Privacy Information Center.
10 See Sarah Kessler, “Think You Can Live Offline Without Being Tracked? Here’s What It Takes,” Fast Company, October 15, 2013.