The Internet of Soda: Why Coca-Cola Has Stockpiled 16 Million Network IDs


Ever used one of those Burger King soda fountains that lets you create your own drink? It’s actually the brainchild of the Coca-Cola company, and it’s connected to the internet.
They’re known as Freestyle machines. You’ll find more than 2,000 of them inside the world’s Burger Kings, and many other restaurants and movie theaters throughout the United States and the United Kingdom have them as well. They’re all connected to the internet, so that Coca-Cola can track what drinks people are making and how often.
That’s one reason Coca-Cola now owns 16 million unique network identifiers usually reserved for Wi-Fi cards and other networking equipment. This week, many people were surprised to learn that a soda company controls that many network IDs, with the geeks at Slashdot, a popular online hangout for techies, launching an epic discussion about what the company would do with all those addresses. But we already know a bit about Coke’s plans, and as we approach “The Internet of Things,” the fact of the matter is that this sort of network ID grab is par for the course.
Coca-Cola’s network ID stash isn’t that unusual, according to John Matherly, the man behind Shodan, a search engine for The Internet of Things — the ever-growing array of devices that tap into our global network. First off, Coca-Cola has owned these identifiers since at least 2010. And while 16 million may sound like a lot, it’s actually the smallest number of these unique identifiers you can reserve at one time.
Coca-Cola did not respond to a request for an interview about its Internet of Things ambitions. But in all likelihood, these addresses are already being used in the company’s Freestyle machines as well as the internet-connected vending machines it’s testing in Texas. That said, the decision to secure a block of addresses en masse may hint at even larger online ambitions for the company.
All network cards have these unique identifiers, known as “media access control” addresses, or MAC addresses. These are separate from IP addresses assigned by internet service providers. A device’s MAC address will be the same on any network, regardless of the IP address. If you use your laptop on your wireless network at home and then use it on a wireless network at a coffee shop, your MAC address will remain the same but your IP address will change. In that sense, a MAC address is more like a virtual serial number than a network address.
An organization called the Institute of Electrical and Electronics Engineers, Incorporated, or IEEE, manages MAC address registration. These identifiers are usually reserved by companies that sell networking cards and equipment. Because of how identifiers are generated, companies that want to reserve a block of addresses must do so in sets of 16 million, as Matherly explains.
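The figure of 16 million follows from the layout of the address itself. A MAC address is 48 bits long; the first 24 bits form the prefix that the IEEE assigns to an organization, and the remaining 24 bits are left for that organization to number its own devices. A minimal Python sketch of the arithmetic, using a made-up address rather than one of Coca-Cola's:

```python
def split_mac(mac: str) -> tuple:
    """Split a MAC address into its registered prefix and device-specific part."""
    octets = mac.replace("-", ":").lower().split(":")
    if len(octets) != 6 or any(len(o) != 2 for o in octets):
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    int("".join(octets), 16)  # raises ValueError on non-hex characters
    return ":".join(octets[:3]), ":".join(octets[3:])

# 24 bits remain for the device part, so one prefix covers 2**24 addresses.
ADDRESSES_PER_BLOCK = 2 ** 24

prefix, device = split_mac("00:11:22:AA:BB:CC")  # made-up address
print(prefix, device, ADDRESSES_PER_BLOCK)
```

Two to the 24th power is 16,777,216, which is where the "16 million" comes from.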
That’s not a lot for a company like Cisco, which sells millions of network devices each year. But for companies like Coca-Cola, it may be more than they need. That said, the mere decision to grab them by the block, as opposed to securing them one by one, shows Coke is serious about this internet thing. “What’s more interesting to me is that they decided to get their own MAC address range,” Matherly says. “This could’ve been a prudent move by an engineer to guarantee their devices a unique range of identifiers in the future, or it’s part of a larger strategy to make their infrastructure, including vending machines, more connected.”
In addition to the Freestyle machines, Coca-Cola has been testing 200 web-connected vending machines in the Austin, Texas area and plans to roll out tens of thousands more, according to Bloomberg Businessweek.
The killer feature of a web-connected vending machine is the ability to process credit cards and mobile payments. But the Freestyle machines and the smart vending machines also open a new frontier in collecting real-time data about customer behavior. Coca-Cola could greatly simplify logistics, as Businessweek points out, by making it possible to know when to restock a machine and how much inventory each one needs, without having to send a truck to each one first.
That only scratches the surface, Matherly says. Because the machines use digital displays, Coca-Cola could do the same type of A/B testing that marketers and political campaigners do on websites and advertisements. For example, the company could determine which arrangements of drinks lead to the most sales, and it could customize the displays on each individual machine depending on what works best for its customers.
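To make the A/B testing idea concrete, here is a rough Python sketch that compares two hypothetical screen layouts with a standard two-proportion z-test. The counts are invented for illustration, and nothing here describes how Coca-Cola actually evaluates its displays.

```python
import math

def two_proportion_z(sales_a, views_a, sales_b, views_b):
    """Return (rate_a, rate_b, z) for a two-proportion z-test."""
    rate_a = sales_a / views_a
    rate_b = sales_b / views_b
    # Pool both layouts to estimate the rate under "no real difference".
    pooled = (sales_a + sales_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    return rate_a, rate_b, (rate_b - rate_a) / std_err

# Invented counts: screen views and resulting sales for two layouts.
rate_a, rate_b, z = two_proportion_z(120, 1000, 150, 1000)
print(f"A: {rate_a:.1%}  B: {rate_b:.1%}  z = {z:.2f}")
# An absolute z above roughly 1.96 is the usual bar for a 5% significance level.
```

The point of the test is to avoid reshuffling every machine's display on the strength of a difference that could easily be noise.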
It’s unlikely that Coca-Cola will burn through all 16 million MAC addresses with vending machines and soda fountains alone. There are only about 6 to 7 million vending machines total in the entire United States, according to a spokesperson at USA Technologies, a company that sells payment processing systems for vending machines and other retailers. There’s probably roughly twice that number in the entire world.
That means the company could take its ambitions even further.

How Your Data Are Being Deeply Mined


Alice E. Marwick


The recent revelations regarding the NSA’s collection of the personal information and the digital activities of millions of people across the world have attracted immense attention and public concern. But there are equally troubling and equally opaque systems run by advertising, marketing, and data-mining firms that are far less known. Using techniques ranging from supermarket loyalty cards to targeted advertising on Facebook, private companies systematically collect very personal information, from who you are, to what you do, to what you buy. Data about your online and offline behavior are combined, analyzed, and sold to marketers, corporations, governments, and even criminals. The scope of this collection, aggregation, and brokering of information is similar to, if not larger than, that of the NSA, yet it is almost entirely unregulated and many of the activities of data-mining and digital marketing firms are not publicly known at all.

Here I will discuss two things: the involuntary, or passive, collecting of data by private corporations; and the voluntary, or active, collection and aggregation of their own personal data by individuals. While I think it is the former that we should be more concerned with, the latter poses the question of whether it is possible for us to take full advantage of social media without playing into larger corporate interests.

Database Marketing
The industry of collecting, aggregating, and brokering personal data is known as “database marketing.” The second-largest company in this field, Acxiom, has 23,000 computer servers that process more than 50 trillion data transactions per year, according to The New York Times.1 It claims to have records on hundreds of millions of Americans, including 1.1 billion browser cookies (small pieces of data sent from a website, used to track the user’s activity), 200 million mobile profiles, and an average of 1,500 pieces of data per consumer. These data include information gleaned from publicly available records like home valuation and vehicle ownership, information about online behavior tracked through cookies, browser advertising, and the like, data from customer surveys, and “offline” buying behavior. The CEO, Scott Howe, says, “Our digital reach will soon approach nearly every Internet user in the US.”2

Visiting virtually any website places a digital cookie, or small text file, on your computer. “First-party” cookies are placed by the site itself, such as Gmail saving your password so that you don’t have to log in every time you visit the site. “Third-party cookies” persist across sites, tracking what sites you visit, in what order. For those who have logged in, Google Chrome and Firefox sync this browsing history across devices, combining what you do on your iPad with your iPhone with your laptop. This is used to deliver advertising.
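The mechanics of third-party tracking can be sketched in a few lines of Python. This is a toy simulation, not a description of any real ad network: the `Tracker` class, the `tracker_id` cookie name, and the example sites are all hypothetical.

```python
import uuid
from collections import defaultdict

class Tracker:
    """Toy ad network whose script is embedded on many unrelated sites."""

    def __init__(self):
        self.history = defaultdict(list)  # cookie ID -> sites seen, in order

    def serve(self, browser_cookies: dict, site: str) -> None:
        # The cookie belongs to the tracker's domain, not to the visited site,
        # so the same ID comes back no matter which site embeds the tracker.
        cookie_id = browser_cookies.setdefault("tracker_id", uuid.uuid4().hex)
        self.history[cookie_id].append(site)

tracker = Tracker()
browser = {}  # stands in for one browser's cookie jar for the tracker's domain
for site in ["shoes.example", "news.example", "recipes.example"]:
    tracker.serve(browser, site)

print(tracker.history[browser["tracker_id"]])
```

None of the three sites needs to share anything with the others; the common tracker is what stitches the visits into one ordered history.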

For example, a few nights ago I was browsing LLBean.com for winter boots on my iPhone. A few days later, LLBean.com ads showed up on a news blog I was reading on my iPad. This “behavioral targeting” is falling out of fashion in favor of “predictive targeting,” which uses sophisticated data-mining techniques to predict for L.L.Bean whether or not I am likely to purchase something upon seeing an LLBean.com ad.

Acxiom provides “premium proprietary behavioral insights” that “number in the thousands and cover consumer interests ranging from brand and channel affinities to product usage and purchase timing.” In other words, Acxiom creates profiles, or digital dossiers, about millions of people, based on the 1,500 points of data about them it claims to have. These data might include your education level; how many children you have; the type of car you drive; your stock portfolio; your recent purchases; and your race and age. These data are combined across sources (for instance, magazine subscriber lists and public records of home ownership) to determine whether you fit into a number of predefined categories such as “McMansions and Minivans” or “adult with wealthy parent.”3 Acxiom is then able to sell these consumer profiles to its customers, who include twelve of the top fifteen credit card issuers, seven of the top ten retail banks, eight of the top ten telecom/media companies, and nine of the top ten property and casualty insurers.

Acxiom may be one of the largest data brokers, but it represents a dramatic shift in the way that personal information is handled online. The movement toward “Big Data,” which uses computational techniques to find social insights in very large groupings of data, is rapidly transforming industries from health care to electoral politics. Big Data has many well-known social uses, for example by the police and by managers aiming to increase productivity. But it also poses new challenges to privacy on an unprecedented level and scale. Big Data is made up of “little data,” and these little data may be deeply personal.
Alone, the fact that you purchased a bottle of cocoa butter lotion from Target is unremarkable. Target, on the other hand, assigns each customer a single Guest ID number, linked to their credit card number, e-mail address, or name. Every purchase and interaction you have with Target is then linked to your Guest ID, including the cocoa butter.
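A minimal Python sketch of how such an identifier might work follows. The `GuestRegistry` class and its fields are hypothetical, and Target's actual system is not public; the sketch only shows the general idea that any one known identifier is enough to attach a purchase to an existing profile.

```python
from collections import defaultdict
from itertools import count

class GuestRegistry:
    """Toy registry that files purchases under one ID per customer."""

    def __init__(self):
        self._next_id = count(1)
        self._id_by_identifier = {}          # card, e-mail, or name -> guest ID
        self.purchases = defaultdict(list)   # guest ID -> items bought

    def record(self, item, **identifiers):
        # Reuse an existing ID if any supplied identifier has been seen before.
        known = [self._id_by_identifier[v] for v in identifiers.values()
                 if v in self._id_by_identifier]
        guest_id = known[0] if known else next(self._next_id)
        for value in identifiers.values():
            self._id_by_identifier.setdefault(value, guest_id)
        self.purchases[guest_id].append(item)
        return guest_id

registry = GuestRegistry()
first = registry.record("cocoa butter lotion", card="card-0001", email="a@example.com")
second = registry.record("calcium tablets", email="a@example.com")  # no card this time
other = registry.record("socks", card="card-0002")
print(registry.purchases[first])
```

The second purchase carries only an e-mail address, yet it lands in the same profile as the first. A real system would also have to merge profiles that turn out to belong to one person, which this sketch leaves out.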

Now, Target has spent a great deal of time figuring out how to market to people about to have a baby. While most people remain fairly constant in their shopping habits—buying toilet paper here, socks there—the birth of a child is a life change that brings immense upheaval. Since birth records are public, new parents are bombarded with marketing and advertising offers. So Target’s goal was to identify parents before the baby was born. The chief statistician for Target, Andrew Pole, said, “We knew that if we could identify [new parents] in their second trimester, there’s a good chance we could capture them for years.”4 Pole had been mining immense amounts of data about the shopping habits of pregnant women and new parents. He found that women purchased certain things during their pregnancy, such as cocoa butter, calcium tablets, and large purses that could double as diaper bags.

Target then began sending targeted mail to women during their pregnancy. This backfired. Women found it creepy—how did Target know they were pregnant? In one famous case, the father of a teenage girl called Target to complain that it was encouraging teen pregnancy by mailing her coupons for car seats and diapers. A week later, he called back and apologized; she hadn’t told her father yet that she was pregnant.5

So the Target managers changed their tactics. They mixed in coupons for wine and lawnmowers with those for pacifiers and Baby Wipes. Pregnant women could use the coupons without realizing that Target knew they were pregnant. As Pole told The New York Times Magazine, “Even if you’re following the law, you can do things where people get queasy.”

These same techniques were used to great effect by the Obama campaign before the 2012 election. Famously, the campaign recruited some of the most brilliant young experts in analytics and behavioral science, and put them in a room called “the cave” for sixteen hours a day.6 The chief data scientist for the campaign was an analyst who had formerly mined Big Data to improve supermarket promotions. This “dream team” was able to deliver microtargeted demographics to Obama—they could predict exactly how much money they would get back from each fund-raising e-mail. When the team discovered that East Coast women between thirty and forty were not donating as much as might be expected, they offered a chance to have dinner with Sarah Jessica Parker as an incentive.7 Every evening, the campaign ran 66,000 simulations to model the state of the election. The Obama analysts were not only using cutting-edge database marketing techniques, they were developing techniques that were far beyond the state of the art.

The Obama campaign’s tactics illuminate something that is often missed in our discussions of data-mining and marketing—the fact that governments and politicians are major clients of marketing agencies and data brokers. For instance, the campaign bought data on the television-watching habits of Ohioans from a company called FourthWallMedia. Each household was assigned a number, but the names of those in the household were not revealed. The Obama campaign, however, was able to combine lists of voters with lists of cable subscribers, which it could then coordinate with the supposedly anonymous ID numbers used to track the usage patterns of television set-top boxes.8 It could then target campaign ads to the exact times that certain voters were watching television. As a result, the campaign bought airtime during unconventional programming, like Sons of Anarchy, The Walking Dead, and Don’t Trust the B— in Apt. 23, rather than during local news programming as conventional wisdom would have advised.
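The re-identification step amounts to a chain of joins, which the following Python sketch illustrates. Every record here is invented, and the choice of street address as the joining key is an assumption made for the example; the sources do not say which fields the campaign actually matched on.

```python
# Invented records. The viewing data alone carries only a box number.
voters = [
    {"name": "A. Voter", "address": "12 Elm St"},
    {"name": "B. Voter", "address": "34 Oak Ave"},
]
subscribers = [
    {"address": "12 Elm St", "box_id": 9001},
    {"address": "34 Oak Ave", "box_id": 9002},
]
viewing = [
    {"box_id": 9001, "show": "Late-night drama", "hour": 22},
    {"box_id": 9002, "show": "Local news", "hour": 18},
]

# Join 1: subscriber list gives address -> box number.
box_by_address = {s["address"]: s["box_id"] for s in subscribers}
# Join 2: voter list gives box number -> name, via the shared address.
name_by_box = {box_by_address[v["address"]]: v["name"]
               for v in voters if v["address"] in box_by_address}
# Join 3: attach names to the "anonymous" viewing records.
linked = [{"name": name_by_box[r["box_id"]], **r}
          for r in viewing if r["box_id"] in name_by_box]

for row in linked:
    print(row["name"], row["show"], row["hour"])
```

No single list reveals who watches what and when. The combination does.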

The “cave dwellers” were even able to match voter lists with Facebook information, using “Facebook Connect,” Facebook’s sign-on technology, which is used for many sign-ups and commenting systems online. Knowing that some of these users were Obama supporters, the campaign could figure out how to get them to persuade their perhaps less motivated friends to vote. Observing lists of Facebook friends and comparing them with tagged photos, the campaign matched these “friends” with lists of persuadable voters and then mobilized Obama supporters to convince their “real-life” friends to vote.

Social Media
In view of these sophisticated data-mining and analyzing techniques, is there any way we can use social media, or the Internet itself, without adding to our profiles collected by companies like Acxiom, Experian, or Epsilon?

Social media allow us to collect and track data about ourselves. For instance, I have been using a website called Last.fm since 2005 to track every piece of digital music I have listened to when using iTunes or Spotify. As a result, I have a fascinating picture of how my musical tastes have changed over time, and Last.fm is able to recommend obscure bands to me based on this extensive listening history.

Using social media allows us to connect with friends; to learn more about ourselves; even to improve our lives. The Quantified Self movement, which builds on techniques used by women for decades, such as counting calories, promotes the use of personal data for self-knowledge. Measuring your sleep cycles over time, for instance, can help you learn to avoid caffeine after 4:00 pm, or realize that, if you want to fall asleep, you can’t use the Internet for an hour before bedtime.

But these data are immensely beneficial to data brokers. Imagine how a health insurer might react to viewing your caloric intake on MyFitnessPal, the number of steps you walk per day tracked by Fitbit, how often you check in to your local gym using Foursquare, and what you eat based on the pictures of your meals that you post on Instagram. Each piece of information, by itself, may be inconsequential, but the aggregation of this information creates a larger picture. Data trackers can centrally access such information and add it to their databases. Two large consequences of this collection of data deserve more attention.

The first is data discrimination. Once customers are sliced and diced into segmented demographic categories, they can be sorted. An Acxiom presentation to the Consumer Marketing Organization in 2013 placed customers into “customer value segments” and noted that while the top 30 percent of customers add 500 percent of value, the bottom 20 percent actually cost 400 percent of value. In other words, it behooves companies to shower their top customers with attention, while ignoring the bottom 20 percent, who may spend “too much” time on customer service calls, and may cost companies in returns or coupons, or otherwise cost more than they provide.

These “low-value targets” are known in industry parlance as “waste.” Joseph Turow, a University of Pennsylvania professor in communications who studies niche marketing, asks what happens to those people who fall into the categories of “waste,” entirely without their knowledge or any notification. Do they suffer price discrimination? Poor service? Do they miss out on the offers given to others? Such discrimination is still more insidious because it is entirely invisible.

Second, we may be more concerned with government surveillance than with marketers or data brokers collecting personal information, but this ignores the fact that the government regularly purchases data from these companies. ChoicePoint, now owned by Elsevier, was an enormous data aggregator that combined personal data extracted from public and private databases, including Social Security numbers, credit reports, and criminal records. It maintained 17 billion records on businesses and individuals, which it sold to approximately 100,000 clients, including thirty-five government agencies and seven thousand federal, state, and local law enforcement agencies.9

For instance, the State Department purchased records on millions of Latin American citizens, which were then checked against immigration databases. ChoicePoint was also investigated for selling 145,000 personal records to an identity theft ring. More recently, Experian, one of the three major credit bureaus, mistakenly sold personal records to a Vietnamese hacker. Scammers refer to these records, which include Social Security numbers and mothers’ maiden names, as “fullz,” because they contain enough personal information for crooked operators to apply for credit cards or take out loans.

A few years ago, I toured the experimental lab of a large advertising agency. They showed me the cutting edge of consumer-monitoring technologies. Someday, not too far in the future, if you’re at Duane Reade, aimlessly staring at a giant shelf of shampoo trying to figure out which to buy, the shelf will track your eye movements and which bottles you pick up and examine in more detail. Using these data, Duane Reade can algorithmically generate a coupon for a particular brand of shampoo, which you can then print from the shelf. I watched an experimental application that tracks the movements of individuals through a mall, based on the unique identifiers, or MAC addresses, of their cell phones, kept in purses or pockets but available to wireless tracking devices. Again, in all of these cases, the individuals are unaware that they are being tracked. A description of such procedures may be hidden at the end of a byzantine privacy policy people may not have noticed when they bought their devices, or written on a notice next to a CCTV camera. Though they may not be technically illegal, they seem ethically dubious.
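The mall-tracking application can be approximated in a few lines of Python. This is a sketch under stated assumptions, not the agency's software: the sensor locations, timestamps, and MAC addresses are all invented, and the only idea carried over from the demonstration is that a phone's MAC address stays the same from one sensor to the next.

```python
from collections import defaultdict

# Invented sightings: (seconds since opening, sensor location, phone MAC).
sightings = [
    (240, "shoe store", "aa:bb:cc:00:00:01"),
    (10, "entrance", "aa:bb:cc:00:00:01"),
    (95, "food court", "aa:bb:cc:00:00:02"),
    (600, "food court", "aa:bb:cc:00:00:01"),
]

# Sorting by time and grouping by MAC turns scattered sightings into paths.
paths = defaultdict(list)
for when, where, mac in sorted(sightings):
    paths[mac].append(where)

for mac, route in paths.items():
    print(mac, " -> ".join(route))
```

Nothing in the data names the shopper, but the identifier is stable enough to follow one person from the entrance to the food court.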

While the easy answer to these problems is to opt out of loyalty cards, Internet use, or social media, this is hardly realistic. In fact, it is practically impossible to live life, online or offline, without being tracked—unless one takes extreme measures of avoidance. Cities track car movements; radio-frequency identification (RFID) tags are attached to clothing and dry cleaning; CCTV cameras are in most stores.10 The technology is developing far more rapidly than our consumer protection laws, which in many cases are out of date and difficult to apply to our networked world.

The Federal Trade Commission and the Senate Commerce Committee are currently investigating data brokers and calling for more transparency in the collection and dissemination of personal information. Those of us concerned with privacy must continue to demand that checks and balances be applied to these private corporations. People should be encouraged to investigate the various opt-out tools, ad-blockers, and plug-ins that are available for most platforms. While closer scrutiny of the NSA is necessary, we must apply equal pressure to private corporations to ensure that seemingly harmless targeted mail campaigns and advertisements do not give way to insidious and dangerous violations of personal privacy.

1 See Natasha Singer, “Acxiom, the Quiet Giant of Consumer Database Marketing,” The New York Times, June 16, 2012.
2 See Judith Aquino, “Acxiom Prepares New ‘Audience Operating System’ Amid Wobbly Earnings,” AdExchanger.com, August 1, 2013.
3 See Natasha Singer, “A Data Broker Offers a Peek Behind the Curtain,” The New York Times, August 31, 2013.
4 See Charles Duhigg, “How Companies Learn Your Secrets,” The New York Times Magazine, February 16, 2012.
5 See Kashmir Hill, “How Target Figured Out a Teen Girl Was Pregnant Before Her Father Did,” Forbes, February 16, 2012.
6 See Jim Rutenberg, “Data You Can Believe In: The Obama Campaign’s Digital Masterminds Cash In,” The New York Times Magazine, June 20, 2013.
7 See Michael Scherer, “Inside the Secret World of the Data Crunchers Who Helped Obama Win,” Time, November 7, 2012.
8 See Lois Beckett, “Everything We Know (So Far) About Obama’s Big Data Tactics,” ProPublica, November 29, 2012.
9 See “ChoicePoint,” at the Electronic Privacy Information Center.
10 See Sarah Kessler, “Think You Can Live Offline Without Being Tracked? Here’s What It Takes,” Fast Company, October 15, 2013.

Listen to Pandora, and It Listens Back

Pandora, the Internet radio service, is plying a new tune.
After years of customizing playlists to individual listeners by analyzing components of the songs they like, then playing them tracks with similar traits, the company has started data-mining users’ musical tastes for clues about the kinds of ads most likely to engage them.
“It’s becoming quite apparent to us that the world of playing the perfect music to people and the world of playing perfect advertising to them are strikingly similar,” says Eric Bieschke, Pandora’s chief scientist.
Consider someone who’s in an adventurous musical mood on a weekend afternoon, he says. One hypothesis is that this listener may be more likely to click on an ad for, say, adventure travel in Costa Rica than a person in an office on a Monday morning listening to familiar tunes. And that person at the office, Mr. Bieschke says, may be more inclined to respond to a more conservative travel ad for a restaurant-and-museum tour of Paris. Pandora is now testing hypotheses like these by, among other methods, measuring the frequency of ad clicks. “There are a lot of interesting things we can do on the music side that bridge the way to advertising,” says Mr. Bieschke, who led the development of Pandora’s music recommendation engine.
A few services, like Pandora, Amazon and Netflix, were early in developing algorithms to recommend products based on an individual customer’s preferences or those of people with similar profiles. Now, some companies are trying to differentiate themselves by using their proprietary data sets to make deeper inferences about individuals and try to influence their behavior.
This online ad customization technique is known as behavioral targeting, but Pandora adds a music layer. Pandora has collected song preference and other details about more than 200 million registered users, and those people have expressed their song likes and dislikes by pressing the site’s thumbs-up and thumbs-down buttons more than 35 billion times. Because Pandora needs to understand the type of device a listener is using in order to deliver songs in a playable format, its system also knows whether people are tuning in from their cars, from iPhones or Android phones or from desktops.
So it seems only logical for the company to start seeking correlations between users’ listening habits and the kinds of ads they might be most receptive to.
“The advantage of using our own in-house data is that we have it down to the individual level, to the specific person who is using Pandora,” Mr. Bieschke says. “We take all of these signals and look at correlations that lead us to come up with magical insights about somebody.”
People’s music, movie or book choices may reveal much more than commercial likes and dislikes. Certain product or cultural preferences can give glimpses into consumers’ political beliefs, religious faith, sexual orientation or other intimate issues. That means many organizations now are not merely collecting details about where we go and what we buy, but are also making inferences about who we are.
“I would guess, looking at music choices, you could probably predict with high accuracy a person’s worldview,” says Vitaly Shmatikov, an associate professor of computer science at the University of Texas at Austin, where he studies computer security and privacy. “You might be able to predict people’s stance on issues like gun control or the environment because there are bands and music tracks that do express strong positions.”
Pandora, for one, has a political ad-targeting system that has been used in presidential and congressional campaigns, and even a few for governor. It can deconstruct users’ song preferences to predict their political party of choice. (The company does not analyze listeners’ attitudes to individual political issues like abortion or fracking.)
During the next federal election cycle, for instance, Pandora users tuning into country music acts, stand-up comedians or Christian bands might hear or see ads for Republican candidates for Congress. Others listening to hip-hop tunes, or to classical acts like the Berlin Philharmonic, might hear ads for Democrats.
Because Pandora users provide their ZIP codes when they register, Mr. Bieschke says, “we can play ads only for the specific districts political campaigns want to target,” and “we can use their music to predict users’ political affiliations.” But he cautioned that the predictions about users’ political parties are machine-generated forecasts for groups of listeners with certain similar characteristics and may not be correct for any particular listener.
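A deliberately crude Python sketch shows the shape of this kind of targeting. It is not Pandora's system, which the company has not published: the genre-to-party table merely restates the examples given above, a simple tally stands in for whatever statistical model is really used, and the ZIP codes and user records are invented.

```python
from collections import Counter

# Toy lookup restating the article's examples; a stand-in for a real model.
GENRE_LEANING = {
    "country": "R", "christian": "R", "comedy": "R",
    "hip-hop": "D", "classical": "D",
}

def forecast_party(genres_played):
    """Return 'R', 'D', or None when there is no signal or an even split."""
    tally = Counter(GENRE_LEANING[g] for g in genres_played if g in GENRE_LEANING)
    if tally["R"] == tally["D"]:
        return None
    return "R" if tally["R"] > tally["D"] else "D"

def eligible_for_ad(user, target_zips, target_party):
    """An ad runs only for users in the target district with a matching forecast."""
    return user["zip"] in target_zips and forecast_party(user["genres"]) == target_party

# Invented listeners and ZIP codes.
users = [
    {"zip": "00001", "genres": ["country", "country", "hip-hop"]},
    {"zip": "00001", "genres": ["classical", "hip-hop"]},
    {"zip": "00002", "genres": ["country"]},
]
print([eligible_for_ad(u, {"00001"}, "R") for u in users])
```

As Mr. Bieschke's caveat suggests, a forecast like this describes a group of similar listeners. It can easily be wrong about the individual who happens to like both country and hip-hop.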
Shazam, the song recognition app with 80 million unique monthly users, also plays ads based on users’ preferred music genres. “Hypothetically, a Ford F-150 pickup truck might over-index to country music listeners,” says Kevin McGurn, Shazam’s chief revenue officer. For those who prefer U2 and Coldplay, a demographic that skews to middle-age people with relatively high incomes, he says, the app might play ads for luxury cars like Jaguars.
In its privacy policy, Pandora describes the types of information it collects about users and the purposes — music personalization and ad customization — for which the information may be employed. Although users may elect to pay $36 annually to opt out of receiving ads, advertising on the free service accounts for the bulk of Pandora’s revenue. Out of $427.1 million in revenue in the 2013 fiscal year, advertising generated $375.2 million.
Pandora’s inferences about individuals become more discerning as time goes on. How we think about the ethics and accuracy of algorithms is another matter.
“I’m optimistic that the benefits to society will outweigh the risks,” Professor Shmatikov says. “But our attitudes will have to evolve to understand that now everybody knows more about who we are.”