Tagged: data

  • feedwordpress 17:36:20 on 2018/11/05 Permalink
    Tags: data

    Lazy Ad Buying Is Killing The Open Web. 

    But…I just *bought* a robe. I don’t want another one.

    If you’ve read my rants for long enough, you know I’m fond of programmatic advertising. I’ve called it the most important artifact in human history, replacing the Macintosh as the most significant tool ever created.

    So yes, I think programmatic advertising is a big deal. As I wrote in the aforementioned post:

    “I believe the very same technologies we’ve built to serve real time, data-driven advertising will soon be re-purposed across nearly every segment of our society. Programmatic adtech is the heir to the database of intentions – it’s that database turned real time and distributed far outside of search. And that’s a very, very big deal. (I just wish I had a cooler name for it than “adtech.”)” 

    But lately, I’m starting to wonder if perhaps adtech is failing, not for any technical reason, but because the people leveraging it are complicit in what might best be called a massive failure of imagination.

    I’m about to go on a rant here, so please forgive me in advance.

    But honestly, who else out there is sick of being followed by ads so stupid a fourth grader could do a better job of targeting them?

    Case in point is the ad above. I took this screen shot from my phone this past weekend while I was reading a New York Times article. The image – of a robe Amazon wanted me to buy – was instantly annoying, because I had in fact purchased a robe on Amazon several days before. Why on earth was Amazon retargeting me for a product I just bought?!

    But wait, it gets worse! As I perused the next Times article, this ad shows up:

    That would have made sense *after* I bought a robe, but… I bought slippers two weeks ago. So WTF?

    You might think this ad makes more sense. If the dude buys a robe, makes sense to try to sell him a new pair of slippers, no? Well, sure, but only if that same dude didn’t buy a new pair of slippers two weeks ago. Which, in fact, I did just do.

    So, yeah, this ad sucks as well. Not only is it not useful or relevant, it’s downright annoying. The vast machinery of adtech has correctly identified me as a robe-and-slippers-buying customer. But it’s failed to realize *I’ve already bought the damn things.*

    Is it possible that adtech is this stupid? This poorly instrumented? I mean, are programmatic buyers simply tagging visitors who land on ecommerce pages (male robe intender?) without caring about whether those visitors actually bought anything?

    Are the human beings responsible for setting the dials of programmatic just this lazy?

    Yes.

    I’ve been a critical observer of adtech over the past ten or so years, and one consistent takeaway is this: If there’s a way for a buyer to cut corners, declare an easy win, and keep doing things the way they’ve always been done, well, they most certainly will.

    But why does it have to be this way? Digging into the examples above yields an extremely frustrating set of facts. Consider the data the adtech infrastructure either got *right* about me as a customer, or could have gotten right:

    • I am a frequent ecommerce customer, usually buying on Amazon
    • I recently purchased both a robe and some slippers
    • I am reading on the New York Times site as a logged-in (i.e., data-rich) customer of the Times’ offerings

    These are just the obvious data points. My mobile ID and cookies, all of which are available to programmatic buyers, certainly indicate a high household income, a propensity to click on certain kinds of ads, a rich web browsing history reflecting a thickly veined lode of interest data, among countless other possible inputs.

    Imagine if a programmatic campaign actually paid attention to all this rich data? Start with the fact I just purchased a robe and slippers. What are products related to those two that Amazon might show me? Well, according to its own “people who bought this item also bought” algorithms, folks who bought men’s robes also bought robes for the women in their life. Now there’s a cool recommendation! I might have clicked on an ad that showed a cool robe for my wife. But no, I’m shown an ad for a product I already have.
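    To make the contrast concrete, here is a minimal sketch of the two approaches: tagging every page visitor versus checking purchase history first. Everything in it is an assumption for illustration (the `Shopper` record, the product IDs, the tiny `ALSO_BOUGHT` table); it is not how Amazon or any real ad platform works.

```python
from dataclasses import dataclass, field


@dataclass
class Shopper:
    """Hypothetical profile: product pages viewed and products actually bought."""
    viewed: set = field(default_factory=set)
    purchased: set = field(default_factory=set)


# Hypothetical stand-in for a "people who bought this item also bought" lookup.
ALSO_BOUGHT = {
    "mens-robe": ["womens-robe"],
}


def lazy_retargeting(shopper):
    """Retarget every product page the shopper ever visited, bought or not."""
    return sorted(shopper.viewed)


def purchase_aware_retargeting(shopper):
    """Skip products already bought; add related items the shopper lacks."""
    ads = set(shopper.viewed - shopper.purchased)
    for product in shopper.purchased:
        for related in ALSO_BOUGHT.get(product, []):
            if related not in shopper.purchased:
                ads.add(related)
    return sorted(ads)


me = Shopper(
    viewed={"mens-robe", "mens-slippers"},
    purchased={"mens-robe", "mens-slippers"},
)
print(lazy_retargeting(me))            # ['mens-robe', 'mens-slippers']
print(purchase_aware_retargeting(me))  # ['womens-robe']
```

    The second function needs exactly one extra signal, the purchase record, which is the signal the ads above appear to ignore.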

    Why?

    I’ve got a few calls in to verify my hunch, but I suspect the ugly truth is pure laziness on the part of the folks responsible for buying ads. Consider: The average cost for a thousand views (CPM) of a targeted programmatic advertisement hovers between ten cents (yes, ten pennies) and $2. With costs that low, the advertising community can afford to waste ad inventory.

    Let’s apply that reality to our robe example. Let’s say the robe costs $60, and yields a $20 profit for our e-commerce advertiser, not including marketing costs. That means that same advertiser can spend up to $19.99 per unit on advertising and still clear a penny (more, if a robe purchaser turns out to be a “big basket” e-commerce spender). So what does our advertiser do? Well, they set up a retargeting campaign aimed at anyone who ever visited our robe’s page. With CPMs averaging around a buck, that robe’s going to follow nearly 20,000 folks around the internet, hoping that just one of them converts.

    Put another way, programmatic advertising is a pure numbers game, and as long as the numbers show one penny of profit, no one is motivated to make the system any better. I’ve encountered many similar examples of ad buyers ignoring high-quality data signals, preferring instead to “waste reach” because, well, it’s just easier to set up campaigns on one or two factors. Inventory is cheap. Why not?
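    For anyone who wants to check the robe math, here is a rough sketch of the arithmetic. The function name and the cents-based units are my own choices for illustration; the figures are the ones from the example above ($20 margin, $1 CPM).

```python
def impressions_for_budget(budget_cents: int, cpm_cents: int) -> int:
    """Impressions a budget buys at a given CPM (cost per 1,000 impressions)."""
    return budget_cents * 1000 // cpm_cents


margin_cents = 2000                 # $20 profit per robe, before marketing
ad_budget_cents = margin_cents - 1  # spend all but one penny of it on ads
cpm_cents = 100                     # "around a buck" per thousand views

impressions = impressions_for_budget(ad_budget_cents, cpm_cents)
print(impressions)  # 19990

# The campaign breaks even if a single one of those impressions converts.
break_even_rate = 1 / impressions
print(f"break-even conversion rate: {break_even_rate:.6f}")
```

    At the ten-cent end of the CPM range, the same budget buys ten times as many impressions, which is why nobody feels pressure to target more carefully.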

    This is problematic. What’s the point of having all that rich (and hard won) targeting data if buyers won’t use it, and consumers don’t benefit from it? An ecosystem that fails to encourage innovation will stagnate and lose share to walled gardens like Facebook, Google, and others. If the ads suck on the open web (and they do), then consumers will either install ad blockers (and they are), or abandon the open web altogether (and they are).

    We can do so much better. Shouldn’t we try?

     

     

     
  • feedwordpress 20:07:01 on 2018/10/31 Permalink
    Tags: data, food, small business

    After the Token Act: A New Data Economy Driven By Small Business Entrepreneurship 

    Gramercy Tavern in New York City

    If Walmart can leverage data tokens to lure Amazon’s best customers away, what else is possible in a world enabled by my fictional Token Act?

    Well, Walmart vs. Amazon is all about big business – a platform giant (Amazon) disrupting an OldBigCo (Walmart and its kin). Over the past two decades, Amazon bumped Walmart out of the race to a trillion-dollar market cap, and the OldCo from Bentonville had to reset and play the role of the upstart. The Token Act levels the playing field, forcing both to win where it really matters: In service to the customer.

    But while BigCos are sexy and well known, it’s the small and medium-sized business ecosystem that determines whether or not we have an economy of mass flourishing.  So let’s explore the Token Act from the point of view of a small business startup, in this case, a new neighborhood restaurant. I briefly touched upon this idea in my set up post, Don’t Break Up The Tech Oligarchs. Force Them To Share Instead.  (If you haven’t already, you might want to read that post before this one, as I lay out the framework in which this scenario would play out.) What I envision below assumes the Token Act has passed, and we’re at least a year or two into its adoption by most major data players. Here we go…

    ***

    Fresh off her $2,700 win from Walmart, Michelle decides she’s ready to lean into a lifelong dream: Starting a restaurant in her newly adopted neighborhood of Chelsea in New York City. Since moving to the area from California, she’s noticed two puzzling trends: First, a dearth of interesting mid- to high-end dinner spots walking distance from her new place, and second, what appears to be higher-than-average vacancy rates for the retail storefronts in the same general area. It appears to be a buyer’s market for retail restaurant space in Chelsea. So why aren’t new places launching? She read the Times’ piece on vacancies a few years ago (before the Token Act passed) and was left just as puzzled as before – seems like there’s no rhyme or reason to the market.

    Michelle wants to start a high-end American gastropub – the kind of place she loved back when she lived in Northern California (she’s fond of Danny Meyer’s Gramercy Tavern, pictured above, but it’s a bit too far away from her new place). She has a strong hunch that such a place would be a hit in her new neighborhood, but she’s not sure her new neighbors will agree.

    Now starting a restaurant requires a certain breed of insanity – they say the best way to make a small fortune in the business is to start with a large one. The truth is, launching restaurants has historically been a crap shoot – you might find the best talent, the best designer, and the best location – but if for some reason you don’t bring the je ne sais quoi, the place will fail within months, leaving you and your partners millions of dollars poorer.

    It’s that je ne sais quoi that Michelle is determined to reveal. The tools she will leverage? The newly liberated resources of data tokens.

    Before we continue, allow me to draw your attention back to the rise of search, indeed, the very era which begat Searchblog in the early 2000s. Google Adwords launched in 2000, and within a few years, the media world had been turned upside down by what I termed The Database of Intentions.  As if by magic, people everywhere could suddenly ask new kinds of questions, finding themselves both surprised and delighted by the answers they received.

    A Gates-Line compliant ecosystem quickly developed on top of this new platform, driven by an emerging industry of search engine marketing and optimization. SEO/SEM sprang into existence to help small and medium sized businesses take advantage of the Google platform – by 2006 the industry stood at nearly $10 billion in spend, growing more than 60 percent year on year. Adwords grew from zero to millions of advertisers by connecting to a long tail of small businesses that took advantage of an entirely new class of revealed information: The intents, desires, and needs of tens of millions of consumers, who relentlessly poured their queries into Google’s placid and unblinking search box.

    Were you a limo service in the Bronx looking for new customers? It paid huge dividends to purchase Adwords like “car service bronx” and “best limo manhattan.” Were you a dry cleaner in West LA hoping to expand? Best be first in line when customers typed in “best cleaners Beverly Hills.” Selling heavy machinery to construction services in the midwest? If you don’t own keywords like “caterpillar dealer des moines” you’d lose, and quick, to whoever did optimize to phrases like that.

    My point is simply this: Adwords was a freaking revolution, but it ain’t nothing compared to what will happen if we unleash data tokens on the world.

    ***

    Ok, back to Michelle and her new restaurant. Of course Michelle will leverage Adwords, and Facebook, and any other advertising service to help her new business grow. But none of those services can help her figure out her je ne sais quoi – for that, she needs something entirely novel. She needs a new question machine. And the ecosystem that develops around data tokens will offer it.

    Thanks to her Walmart experience, Michelle has become aware of the power of personal data. She’s also read up on the Token Act, the new law requiring all data players at scale to allow individuals to create machine-readable data tokens that can be exchanged for value as directed by the consumer. After doing a bit of research, she stumbles across a startup called OfferExchange, which manages “Token Offers” on behalf of anyone who might want to query TokenLand. OfferExchange is a spinout from ProtocolLabs, a pioneer in secure blockchain software platforms like Filecoin. It’s still early in TokenLand, so an at-scale Google of the space hasn’t emerged. OfferExchange works more like a bespoke yet platform-based research outfit – the firm has a sophisticated website and impressive client list. It uses Facebook, Twitter, LiveRamp, and Instagram to identify potential token-creating consumers, then solicits those individuals with offers of cash or other value in exchange for said tokens.

    Michelle does a Crunchbase search for OfferExchange and sees it’s backed by Union Square Ventures and Benchmark, which gives her some comfort – those firms don’t fund fly-by-night hucksters. And the OfferExchange site is impressive – in less than five minutes, it guides her through the construction of an elegant query. Here’s how the process works:

    First, the site asks Michelle what her goal is. “Starting a restaurant in New York City,” she responds. The site reconstructs around her answer, showing suggested data repositories she might mine. “Restaurants, New York City,” reads the top layer of a directory-like page. Underneath are several categories, each populated with familiar company names:

    • Restaurant Reservation and Review Services
      • OpenTable, Google, Resy, Yelp, Eat24, Facebook (more)
    • Food Delivery Services
      • GrubHub, Uber Eats, Postmates, Instacart (more)
    • Transportation Services
      • Uber, Lyft, Juno, Via (more)
    • Real Estate Services (Commercial)
      • LoopNet, DocuSign, CompStak (more)
    • Location Services
      • Foursquare, Uber, Lyft, Google, NinthDecimal (more)
    • Financial Services
      • American Express, Visa, Mastercard, Apple Pay, Diners Club (more)

    And so on – if she wished, Michelle could dig into dozens of categories related to her initial “restaurant New York City” search.

    Michelle’s imagination sparks – the kinds of queries she could ask of these services are mind-blowing. She could limit her query to people who live within walking distance of her neighborhood, asking her *actual neighbors* for tokens that tell her what restaurants they eat at, when they eat there, the size of their checks, related reviews, abandoned reservations, the works. She might discover that folks like Indian takeout on Mondays, that they rarely spend more than $100 on a meal on Tuesdays, but that they splurge on the weekends. She could discover the percentage of diners in Chelsea who travel more than two miles by car service to eat out at a place similar to the one she has in mind, and what the size of the check might be when they do. She can also check historical average rents for restaurants in her zip code, over time, which will certainly help with negotiating her lease. The possibilities are endless.

    Put another way, with OfferExchange’s services, Michelle can litigate the merde out of her je ne sais quoi.

    *** 

    This post is getting long, so I’ll stop here and pull back for a spot of Thinking Out Loud. I could continue the story, imagining the process of the token offer Michelle would put out through OfferExchange’s platform, but suffice it to say, she’d be willing to pay upwards of $5-20 per potential customer for their data. The marketing benefit alone – alerting potential customers in the neighborhood that she’s exploring a new restaurant in the area – is worth tens of thousands already. And of course, OfferExchange can offer anyone who contributes their tokens to Michelle’s new project a discount on their first meal at the restaurant, should it actually launch. Cool!

    But let’s stop there and consider what happens when local entrepreneurs have access to the information currently siloed across thousands of walled garden services like Uber, LoopNet, Resy, and of course Facebook and Google. While better data won’t ensure that Michelle’s restaurant will succeed, it certainly increases the odds that it won’t fail. And it will give both Michelle and her investors – local banks, savvy friends and family members – much more conviction that her new enterprise is viable. Take this local restaurant example and apply it to all manner of small business – dry cleaners, hardware stores, bike shops – and this newly liberated class of information enables an explosion of efficiency, investment, and, well, flourishing in what has become, over the past four decades, a stagnant SMB environment.

    Is this Moneyball for SMB? Perhaps. And yes, I can imagine any number of downsides to this new data economy. But I also believe the benefits would far outweigh the downsides. Under the Token Act as I envision it, co-creators of the data – the services like Uber, OpenTable, or Facebook – have the right to charge a vig for the data being monetized. Sure, it’d be possible for an entrepreneur to steal customers via tokens, but I’m going to guess the economic value of allowing your customers to discover new use cases for their data will dwarf the downside of possibly losing those customers to a new competitor. Plus, this new competitive force will drive everyone to play at a higher level, focusing not on moats built on data silos, but instead on what really matters: A highly satisfied customer. That’s certainly Michelle’s goal, and the goal of every successful local business. Why shouldn’t it also be the goal of the data giants?

     
  • feedwordpress 02:36:10 on 2018/10/22 Permalink
    Tags: data, data portability

    Instead of Breaking Up The Tech Oligarchs, Let’s Try This One Simple Hack 

    (image)

    Social conversations about difficult and complex topics have arcs – they tend to start scattered, with many threads and potential paths, then resolve over time toward consensus. This consensus differs based on groups within society – Fox News aficionados will cluster one way, NPR devotees another. Regardless of the group, such consensus then becomes presumption – and once a group of people presume, they fail to explore potentially difficult or presumably impossible alternative solutions.

    This is often a good thing – an efficient way to get to an answer. But it can also mean we fail to imagine a better solution, because our own biases are obstructing a more elegant path forward.

    This is my sense of the current conversation around the impact of what Professor Scott Galloway has named “The Four” – the largest and most powerful American companies in technology (they are Apple, Amazon, Google, and Facebook, for those just returning from a ten-year nap).  Over the past year or so, the conversation around technology has become one of “something must be done.” Tech was too powerful, it consumed too much of our data and too much of our economic growth. Europe passed GDPR, Congress held ineffectual hearings, Facebook kept screwing up, Google failed to show up…it was all of a piece.

    The conversation evolved into a debate about various remedies, and recently, it’s resolved into a pretty consistent consensus, at least amongst a certain class of tech observers: These companies need to be broken up. Antitrust, many now claim, is the best remedy for the market dominance these companies have amassed.

    It’s a seductive response, with seductive historical precedent. In the 1970s and 80s, antitrust broke up AT&T, ultimately paving the way for the Internet to flourish. In the 90s, antitrust provided the framework for the government’s case against Microsoft, opening the door for new companies like Google and Facebook to dominate the next version of the Internet. Why wouldn’t antitrust regulation usher in #Internet3? Imagine a world where YouTube, Instagram, and Amazon Web Services are all separate companies. Would not that world be better?

    Perhaps. I’m not well read enough in antitrust law to argue one way or the other, but I know that antitrust turns on the idea of consumer harm (usually measured in terms of price), and there’s a strong argument to be made that a free service like Google or Facebook can’t possibly cause consumer harm. Then again, there are many who argue that data is in fact currency, and The Four have essentially monopolized a class of that currency.

    But even as I stare at the antitrust remedy, another solution keeps poking at me, one that on its face seems quite elegant and rather unexplored.

    The idea is simply this: Require all companies who’ve reached a certain scale to build machine-readable data portability into their platforms. The right to data portability is explicit in the EU’s newly enacted GDPR framework, but so far the impact has been slight: There’s enough wiggle room in the verbiage to hamper technical implementation and scope. Plus, let’s be honest: Europe has never really been a hotbed of open innovation in the first place.

    But what if we had a similar statute here? And I don’t mean all of GDPR – that’s certainly a non starter. But that one rule, that one requirement: That every data service at scale had to stand up an API that allowed consumers to access their co-created data, download a copy of it (which I am calling a token), and make that copy available to any service they deemed worthy?

    Imagine what might come of that in the United States?

    I’m not a policy expert, and the devil’s always in the details. So let me be clear in what I mean when I say “machine-readable data portability”: The right to take, via an API, what is essentially a “token” containing all (or a portion of) the data you’ve co-created in one service, and offer it, with various protections, permissions, and revocability, to another service. In my Senate testimony, I gave the example of a token that has all your Amazon purchases, which you then give to Walmart so it can do a historical price comparison and tell you how much money you would save if you shopped at its online service. Walmart would have a powerful incentive to get consumers to create and share that token – the most difficult problem in nearly all of business is getting a customer to switch to a similar service. That would be quite a valuable token, I’d wager*.
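    Here is one way such a token might look in code, purely as a thought experiment. No such API exists today; the `DataToken` class, its `grant`/`revoke`/`read` methods, and the prices are all hypothetical names and numbers I made up to show the three properties that matter: the data is structured, the consumer decides who can read it, and that permission can be pulled back.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Purchase:
    item: str
    price_cents: int


@dataclass
class DataToken:
    """Hypothetical token: a consumer's copy of data co-created with a service."""
    source: str      # service where the data was created, e.g. "amazon"
    owner: str       # the consumer who controls the token
    purchases: list
    granted_to: set = field(default_factory=set)

    def grant(self, service: str) -> None:
        self.granted_to.add(service)

    def revoke(self, service: str) -> None:
        self.granted_to.discard(service)

    def read(self, service: str) -> list:
        """Return the data only to services the owner has granted access."""
        if service not in self.granted_to:
            raise PermissionError(f"{service} has no access to this token")
        return list(self.purchases)


def potential_savings_cents(token: DataToken, reader: str, reader_prices: dict) -> int:
    """Total the reader's price advantage on items it sells for less."""
    total = 0
    for purchase in token.read(reader):
        their_price = reader_prices.get(purchase.item)
        if their_price is not None and their_price < purchase.price_cents:
            total += purchase.price_cents - their_price
    return total


token = DataToken(
    source="amazon",
    owner="me",
    purchases=[Purchase("robe", 6000), Purchase("slippers", 3500)],
)
token.grant("walmart")
walmart_prices = {"robe": 5500, "slippers": 3600}  # invented prices
print(potential_savings_cents(token, "walmart", walmart_prices))  # 500

token.revoke("walmart")  # after this, token.read("walmart") raises PermissionError
```

    A real system would need authentication, audit trails, and some way to enforce revocation after a copy has been shared, none of which this toy attempts.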

    Should be simple to do, no? I mean, don’t we at least co-own the information about what we bought at Amazon?

    Well, no. Not really. Between confusing terms of service, hard to find dashboards, and confounding data reporting standards, The Four can both claim we “own our own data” while at the same time ensuring there’ll never be a true market for the information they have about us.

    So yes, my idea is easily dismissed. The initial response I’ve had to it is always some variation of: “There’s no way The Four would let this happen.” That’s exactly the kind of bias I refer to above – we assume that The Four control the dialog, that they either will thwart this idea through intensive lobbying, clever terms of service, and soft power, or that the idea is practically impossible because of technical or market limitations. To that I ask… why?

    Why is it impossible for me to tokenize all of my Lyft ride data, and give it for free to an academic project that is mapping the impact of ride sharing on congestion in major cities? Why is it impossible for a small business owner to create an RFP for all OpenTable, Resy, and other dining data, so she can determine the best kind of restaurant to open in her neighborhood? I’m pretty certain she’d pay a few bucks a head for that kind of data – so why can’t I sell that information to her (with a vig back to OpenTable and Resy) if the value exchange is there to be monetized? Why can’t I tokenize and sell my Twitter interactions to a brand (or more likely, an agency or research company) interested in understanding the mind of a father who lives in Manhattan? Why can’t I tokenize and trade my Spotify history for better recommendations on live shows to see, or movies to watch, or books to read? Or, simply give it to a free service that’s sprung up to give me suggestions about new music to check out?

    Why can’t an ecosystem of agents, startups, and data brokers emerge, a new industry of information processing not seen since the rise of search optimization in the early aughts, leveraging and arbitraging consumer information to create entirely new kinds of businesses driven by insights currently buried in today’s data monopolies?

    Such a world would be fascinating, exciting, sometimes sketchy, and a hell of a lot of fun. It’d be driven by the individual choices of millions of consumers – choosing which agents to trust, which tokens to create, which trades felt fair. There’d be fails, there’d be fraud, there’d be bad actors. But over time, the good would win over the bad, because the decision making is distributed across the entire population of Internet users. In short, we’d push the decision making to the node – to us. Sure, we’d do stupid things. And sure, the hucksters and the hustlers would make short term killings. But I’ll take an open system like this over a closed one any day of the week, especially if the open system is governed by an architecture empowering the individual to make their own decisions.

    It’d be a lot like the Internet was once imagined to be.

    I’ve been noodling on such an ecosystem, and I’m convinced it could dwarf our current Internet in terms of overall value created (and credit where credit is due, The Four have created a lot of value). It’d run laps around The Four when it comes to innovation – tens of thousands of new companies would form, all of them feeding off the newly liberated oxygen of high quality, structured, machine readable data. Trusted independent platforms for value exchange would arise. Independent third party agents would munge tokens from competing services, verifying claims and earning the trust of consumers (will Walmart really save you a thousand bucks a year?! We can prove it, or not!). Huge platforms would develop for the processing, securitization, permissioning, and validation of our data. Man, it’d feel like…well, like the recumbent, boring old Internet was finally exciting again.

    There’s no technical reason why this world doesn’t exist. The progenitors of the Web have already imagined it; heck, Tim Berners-Lee recently announced he’s working pretty much full time on creating a system devoted to the foundational elements needed for it to blossom.

    But until we as a society write machine-readable data portability into law, such efforts will be relegated to interesting side shows. And more likely than not, we’ll spend the next few years arguing about breaking up The Four, and let’s be honest, that’s an argument The Four want us to have, because they’re going to win it (more money, better lawyers, etc. etc.). Instead, we should  just require them – and all other data services of scale – to free the data they’ve so far managed to imprison. One simple new law could change all of that. Shouldn’t we consider it?

    *In another post, I’ll explore this example in detail. It’s really, really fascinating. 

     
  • feedwordpress 17:53:30 on 2018/07/22 Permalink
    Tags: data

    The Tragedy of the Data Commons 

    Before, and after?

    A theme of my writing over the past ten or so years has been the role of data in society. I tend to frame that role anthropologically: How have we adapted to this new element in our society? What tools and social structures have we created in response to its emergence as a currency in our world? How have power structures shifted as a result?

    Increasingly, I’ve been worrying a hypothesis: Like a city built over generations without central planning or consideration for much more than fundamental capitalistic values, we’ve architected an ecosystem around data that is not only dysfunctional, it’s possibly antithetical to the core values of democratic society. Houston, it seems, we really do have a problem.

    I know, it’s been a while since I’ve written here, and most of my recent stuff has focused on Facebook. I’ve been on the road the entire summer, and preparing to move from the Bay Area to NYC (that’s another post). But before you roll your eyes in anticipation of yet another Facebook rant, no, this post is not about Facebook, despite that company’s continued inability to govern itself.

    No, this post is about the business of health insurance.

    Last week ProPublica published a story titled Health Insurers Are Vacuuming Up Details About You — And It Could Raise Your Rates. It’s the second in an ongoing series the investigative unit is doing on the role of data in healthcare. I’ve been watching this story develop for years, and ProPublica’s piece does a nice job of framing the issue. It envisions “a future in which everything you do — the things you buy, the food you eat, the time you spend watching TV — may help determine how much you pay for health insurance.” Unsurprisingly, the health industry has developed an insatiable appetite for personal data about the individuals it covers. Over the past decade or so, all of our quotidian activities (and far more) have been turned into data, and that data can be, and is being, sold to the insurance industry:

    “The companies are tracking your race, education level, TV habits, marital status, net worth. They’re collecting what you post on social media, whether you’re behind on your bills, what you order online. Then they feed this information into complicated computer algorithms that spit out predictions about how much your health care could cost them.”

    HIPAA, the regulatory framework governing health information in the United States, only covers and protects medical data – not search histories, streaming usage, or grocery loyalty data. But if you think your search, video, and food choices aren’t related to health, well, let’s just say your insurance company begs to differ.

    Lest we dive into a rabbit hole about the corrosive combination of healthcare profit margins with personal data (ProPublica’s story does a fine job of that anyway), I want to pull back and think about what’s really going on here.

    The Tragedy of the Commons

    One of the most fundamental tensions in an open society is the potential misuse of resources held “in common” – resources to which all individuals have access. Garrett Hardin’s 1968 essay on the subject, “The Tragedy of the Commons,” explores this tension, concluding that the problem of human overpopulation has no technical solution. (A technical solution is one that can be achieved through science and/or engineering alone; it does not require a shift in human values or morality, which is to say, a political solution.) Hardin’s essay has become one of the most cited works in social science – the tragedy of the commons is a facile concept that applies to countless problems across society.

    In the essay, Hardin employs a simple example of a common grazing pasture, open to all who own livestock. The pasture, of course, can only support a finite number of cattle. But as Hardin argues, cattle owners are financially motivated to graze as many cattle as they possibly can, driving the number of grass munchers beyond the land’s capacity, ultimately destroying the commons. “Freedom in a commons brings ruin to all,” he concludes, delivering an intellectual middle finger to Smith’s “invisible hand” in the process.

    So what does this have to do with healthcare, data, and the insurance industry? Well, consider how the insurance industry prices its policies. Insurance has always been a data-driven business – it’s driven by actuarial risk assessment, a statistical method that predicts the probability of a certain event happening. Creating and refining these risk assessments lies at the heart of the insurance industry, and until recently, the amount of data informing actuarial models has been staggeringly slight. Age, location, and tobacco use are pretty much how policies are priced under Obamacare, for example. Given this paucity, one might argue that it’s an utterly *good* thing that the insurance industry is beefing up its databases. Right?

    Perhaps not. When a population is aggregated on high-level data points like age and location, we’re essentially being judged on a simple shared commons – all 18-year-olds who live in Los Angeles are being treated essentially the same, regardless of whether one person has a lurking gene for cancer and another will live without health complications for decades. In essence, we’re sharing the load of public health in common – evening out the societal costs in the process.

    But once the system can discriminate on a multitude of data points, the commons collapses, devolving into a system rewarding whoever has the most profitable profile. That 18-year-old with flawless genes, the right zip code, an enviable inheritance, and all the right social media habits will pay next to nothing for health insurance. But the 18-year-old with a mutated BRCA1 gene, a poor zip code, and a proclivity to sit around eating Pringles while playing Fortnite? That teenager is not going to be able to afford health insurance.

    Put another way, adding personalized data to the insurance commons destroys the fabric of that commons. Healthcare has been resistant to this force until recently, but we’re already seeing the same forces at work in other aspects of our previously shared public goods.

    A public good, to review, is defined as “a commodity or service that is provided without profit to all members of a society, either by the government or a private individual or organization.” A good example is public transportation. The rise of data-driven services like Uber and Lyft has been a boon for anyone who can afford these services, but the unforeseen externalities are disastrous for the public good. Ridership, and therefore revenue, falls for public transportation systems, which fall into a spiral of neglect and decay. Our public streets become clogged with circling rideshare drivers, roadway maintenance costs skyrocket, and – perhaps most perniciously – we become a society of individuals who forget how to interact with each other in public spaces like buses, subways, and trolley cars.

    Once you start to think about public goods in this way, you start to see the data-driven erosion of the public good everywhere. Our public square, where we debate political and social issues, has become 2.2 billion data-driven Truman Shows, to paraphrase social media critic Roger McNamee. Retail outlets, where we once interacted with our fellow citizens, are now inhabited by armies of Taskrabbits and Instacarters. Public education is hollowed out by data-driven personalized learning startups like AltSchool, Khan Academy, or, let’s face it, YouTube how-to videos.

    We’re facing a crisis of the commons – of the public spaces we once held as fundamental to the functioning of our democratic society. And we have data-driven capitalism to blame for it.

    Now, before you conclude that Battelle has become a neo-luddite, know that I remain a massive fan of data-driven business. However, if we fail to re-architect the core framework of how data flows through society – if we continue to favor the rights of corporations to determine how value flows to individuals absent the balancing weight of the public commons – we’re heading down a path of social ruin. ProPublica’s warning on health insurance is proof that the problem is not limited to Facebook alone. It is a problem across our entire society. It’s time we woke up to it.

    So what do we do about it? That’ll be the focus of a lot of my writing going forward. As Hardin writes presciently in his original article, “It is when the hidden decisions are made explicit that the arguments begin. The problem for the years ahead is to work out an acceptable theory of weighting.” In the case of data-driven decisioning, we can no longer outsource that work to private corporations with lofty sounding mission statements, whether they be in healthcare, insurance, social media, ride sharing, or e-commerce.

     

     

     
  • feedwordpress 23:29:34 on 2018/06/19 Permalink
    Tags: , , data, , , , , , ,   

    My Senate Testimony 

    Today I had a chance to testify to the US Senate on the subject of Facebook, Cambridge Analytica, and data privacy. It was an honor, and a bit scary, but overall an experience I’ll never forget. Below is the written testimony I delivered to the Commerce committee on Sunday, released on its site today. If you’d like to watch, head right here; I think it’ll be up soon. Forgive the way the links work; I had to consider that this would be printed and bound in the Congressional Record. I might post a shorter version that I read in as my verbal remarks next…we’ll see.


     

    Honorable Committee Members –

     

    My name is John Battelle. For more than thirty years, I’ve made my career reporting, writing, and starting companies at the intersection of technology, society, and business. I appreciate the opportunity to submit this written and verbal testimony to your committee.

    Over the years I’ve written extensively about the business models, strategies, and societal impact of technology companies, with a particular emphasis on the role of data, and the role of large, well-known firms. In the 1980s and 90s I focused on Apple and Microsoft, among others. In the late 90s I focused on the nascent Internet industry; the early 2000s brought my attention to Google, Amazon, and later, Twitter and Facebook. My writings tend to be observational, predictive, analytical, and opinionated.

    Concurrently I’ve been an entrepreneur, founding or co-founding and leading half a dozen companies in the media and technology industries. All of these companies, which span magazines, digital publishing tools, events, and advertising technology platforms, have been active participants in what is broadly understood to be the “technology industry” in the United States and, on several occasions, abroad as well. Over the years these companies have employed thousands of staff members, including hundreds of journalists, and helped to support tens of thousands of independent creators across the Internet. I also serve on the boards of several companies, all of which are deeply involved in the technology and data industries.

    In the past few years my work has focused on the role of the corporation in society, with a particular emphasis on the role technology plays in transforming that role. Given this focus, a natural subject of my work has been on companies that are the most visible exemplars of technology’s impact on business and society. Of these, Facebook has been perhaps my most frequent subject in the past year or two.

    Given the focus of this hearing, the remainder of my written testimony will focus on a number of observations related generally to Facebook, and specifically to the impact of the Cambridge Analytica story. For purposes of brevity, I will summarize many of my points here, and provide links to longer form writings that can be found on the open Internet.

    Facebook broke through the traditional Valley startup company noise in the mid 2000s, a typical founder-driven success story backed by all the right venture capital, replete with a narrative of early intrigue between partners, an ambitious mission (“to make the world more open and connected”), a sky-high private valuation, and any number of controversial decisions around its relationship to its initial customers, the users of its service (later in its life, Facebook’s core customers bifurcated to include advertisers). I was initially skeptical about the service, but when Sheryl Sandberg, a respected Google executive, moved to Facebook to run its advertising business, I became certain it would grow to be one of the most important companies in technology. I was convinced Facebook would challenge Google for supremacy in the hyper-growth world of personalized advertising. In those early days, I often made the point that while Google’s early corporate culture sprang from the open, interconnected world wide web, Facebook was built on the precept of an insular walled garden, where a user’s experience was entirely controlled by the Facebook service itself. This approach to creating a digital service not only threatened the core business model of Google (which was based on indexing and creating value from open web pages), it also raised a significant question of what kind of public commons we wanted to inhabit as we migrated our attention and our social relationships to the web. (Examples: https://battellemedia.com/archives/2012/02/its-not-whether-googles-threatened-its-asking-ourselves-what-commons-do-we-wish-for ; https://battellemedia.com/archives/2012/03/why-hath-google-forsaken-us-a-meditation)

    In the past five or so years, of course, Facebook has come to dominate what is colloquially known as the public square – the metaphorical space where our society comes together to communicate with itself, to debate matters of public interest, and to privately and publicly converse on any number of topics. Since the dawn of the American republic, independent publishers (often referred to as the Fourth Estate – from pamphleteers to journalists to bloggers) have always been important actors in the center of this space. As a publisher myself, I became increasingly concerned that Facebook’s appropriation of public discourse would imperil the viability of independent publishers. This of course has come to pass.

    As is well understood by members of this committee, Facebook employed two crucial strategies to grow its service in its early days. The first was what is universally known as the News Feed, which mixed personal news from “friends” with public stories from independent publishers. The second strategy was the Facebook “Platform,” which encouraged developers to create useful (and sometimes not so useful) products and services inside Facebook’s walled garden service. During the rise of both News Feed and Platform, I repeatedly warned independent publishers to avoid committing themselves and their future viability to either News Feed or the Platform, as Facebook would likely change its policies in the future, leaving publishers without recourse. (Examples: https://battellemedia.com/archives/2012/01/put-your-taproot-into-the-independent-web ; https://battellemedia.com/archives/2012/11/facebook-is-now-making-its-own-weather ; https://shift.newco.co/we-can-fix-this-f-cking-mess-bf6595ac6ccd ; https://shift.newco.co/ads-blocking-and-tackling-18129db3c352)

    Of course, the potent mix of News Feed and a subset of independent publishers combined to deliver us the Cambridge Analytica scandal, and we are still grappling with the implications of this incident on our democracy. But it is important to remember that while the Cambridge Analytica breach seems unusual, it is in fact not – it represents business as usual for Facebook. Facebook’s business model is driven by its role as a data broker. Early in its history, Facebook realized it could grow faster if it allowed third parties, often referred to as developers, to access its burgeoning trove of user data, then manipulate that data to create services on Facebook’s platform that increased a Facebook user’s engagement on the platform. Indeed, in his early years as CEO of Facebook, Mark Zuckerberg was enamored with the “platform business model,” and hoped to emulate such icons as Bill Gates (who built the Windows platform) or Steve Jobs (who later built the iOS/app store platform).

    However, Facebook’s core business model of advertising, driven as it is by the brokerage of its users’ personal information, stood in conflict with Zuckerberg’s stated goal of creating a world-beating platform. By their nature, platforms are places where third parties can create value. They do so by leveraging the structure, assets, and distribution inherent to the platform. In the case of Windows, for example, developers capitalized on Microsoft’s well-understood user interface, its core code base, and its massive adoption by hundreds of millions of computer users. Bill Gates famously defined a successful platform as one that creates more value for the ecosystem that gathers around it than for the platform itself. By this test – known as the Gates Line – Facebook’s early platform fell far short. Developers who leveraged access to Facebook’s core asset – its user data – failed to make enough advertising revenue to be viable, because Facebook (and its advertisers) would always preference Facebook’s own advertising inventory over that of its developer partners. In retrospect, it’s now commonly understood in the Valley that Facebook’s platform efforts were a failure in terms of creating a true ecosystem of value, but a success in terms of driving ever more engagement through Facebook’s service.

    For an advertising-based business model, engagement trumps all other possible metrics. As it grew into one of the most successful public companies in the history of business, Facebook nimbly identified the most engaging portions of its developer ecosystem, incorporated those ideas into its core services, and became a ruthlessly efficient acquirer and manipulator of its users’ engagement. It then processed that engagement into advertising opportunities, leveraging its extraordinary data assets in the process. Those advertising opportunities drew millions of advertisers large and small, and built the business whose impact we now struggle to understand.

    To truly understand the impact of Facebook on our culture, we must first understand the business model it employs. Interested observers of Facebook will draw ill-informed conclusions about the company absent a deep comprehension of its core driver – the business of personalized advertising. I have written extensively on this subject, but a core takeaway is this: The technology infrastructure that allows companies like Facebook to identify exactly the right message to put in front of exactly the right person at exactly the right time is, in all aspects of the word, marvelous. But the externalities of manufacturing attention and selling it to the highest bidder have not been fully examined by our society. (Examples: https://shift.newco.co/its-the-advertising-model-stupid-b843cd7edbe9 ; https://shift.newco.co/lost-context-how-did-we-end-up-here-fd680c0cb6da ; https://battellemedia.com/archives/2013/11/why-the-banner-ad-is-heroic-and-adtech-is-our-greatest-technology-artifact ; https://shift.newco.co/do-big-advertisers-even-matter-to-the-platforms-9c8ccfe6d3dc )

    The Cambridge Analytica scandal has finally focused our attention on these externalities, and we should use this opportunity to go beyond the specifics of that incident, and consider the broader implications. The “failure” of Facebook’s Platform initiative is not a failure of the concept of an open platform. It is instead a failure by an immature, blinkered company (Facebook) to properly govern its own platform, as well as a failure of our own regulatory oversight to govern the environment in which Facebook operates. Truly open platforms are regulated by the platform creator in a way that allows for explosive innovation (see the Gates Line) and shared value creation. (Examples: https://shift.newco.co/its-not-the-platforms-that-need-regulation-2f55177a2297 ; https://shift.newco.co/memo-to-techs-titans-please-remember-what-it-was-like-to-be-small-d6668a8fa630)

    The absolutely wrong conclusion to draw from the Cambridge Analytica scandal is that entities like Facebook must build ever-higher walls around their services and their data. In fact, the conclusion should be the opposite. A truly open society should allow individuals and properly governed third parties to share their data so as to create a society of what Nobel laureate Edmund Phelps calls “mass flourishing.” My own work now centers on how our society might shift what I call the “social architecture of data” from one where the control, processing and value exchange around data is managed entirely by massive, closed entities like Facebook, to one where individuals and their contracted agents manage that process themselves. (Examples: https://shift.newco.co/are-we-dumb-terminals-86f1e1315a63 ; https://shift.newco.co/facebook-tear-down-this-wall-400385b7475d ; https://shift.newco.co/how-facebook-google-amazon-and-their-peers-could-change-techs-awful-narrative-9a758516210a ; https://shift.newco.co/on-facebook-a156710f2679 ; https://battellemedia.com/archives/2014/03/branded-data-preferences )

    Another mistaken belief to emerge from the Cambridge Analytica scandal is that any company, no matter how powerful, well intentioned, or intelligent, can by itself “fix” the problems the scandal has revealed. Facebook has grown to a size, scope, and impact on our society that outstrips its ability to manage the externalities it has created. To presume otherwise is to succumb to arrogance, ignorance, or worse. The bald truth is this: Not even Mark Zuckerberg understands how Facebook works, nor does he comprehend its impact on our society. (Examples: https://shift.newco.co/we-allowed-this-to-happen-were-sorry-we-need-your-help-e26ed0bc87ac ; https://shift.newco.co/i-apologize-d5c831ce0690 ; https://shift.newco.co/facebooks-data-trove-may-well-determine-trump-s-fate-71047fd86921 ; https://shift.newco.co/its-time-to-ask-ourselves-how-tech-is-changing-our-kids-and-our-future-2ce1d0e59c3c )

    Another misconception: Facebook does not “sell” its data to any third parties. While Facebook may not sell copies of its data to these third parties, it certainly sells leases to that data, and this distinction bears significant scrutiny. The company may not wish to be understood as such, but it is most certainly the largest data broker in the history of the data industry.

    Lastly, the Cambridge Analytica scandal may seem to be entirely about a violation of privacy, but to truly understand its impact, we must consider the implications relating to future economic innovation. Facebook has used the scandal as an excuse to limit third party data sharing across and outside its platform. While this seems logical at first glance, it is in fact destructive to long term economic value creation.

    So what might be done about all of this? While I understand the lure of sweeping legislation that attempts to “cure” the ills of technological progress, such approaches often have their own unexpected consequences. For example, the EU’s adoption of GDPR, drafted to limit the power of companies like Facebook, may in fact only strengthen that company’s grip on its market, while severely limiting entrepreneurial innovation in the process. (Example: https://shift.newco.co/how-gdpr-kills-the-innovation-economy-844570b70a7a )

    As policy makers and informed citizens, we should strive to create a flexible, secure, and innovation-friendly approach to data governance that allows for maximum innovation while also ensuring maximum control over the data by all affected parties, including individuals, and importantly, the beneficiaries of future innovation not yet conceived and created. To play forward the current architecture of data in our society – where most of the valuable information is controlled by an increasingly small oligarchy of massive corporations – is to imagine a sterile landscape hostile to new ideas and mass flourishing.

    Instead, we must explore a world governed by an enlightened regulatory framework that encourages data sharing, high standards of governance, and maximum value creation, with the individual at the center of that value exchange. As I recently wrote: “Imagine … you can download your own Facebook or Amazon “token,” a magic data coin containing not only all the useful data and insights about you, but a control panel that allows you to set and revoke permissions around that data for any context. You might pass your Amazon token to Walmart, set its permissions to “view purchase history” and ask Walmart to determine how much money it might have saved you had you purchased those items on Walmart’s service instead of Amazon. You might pass your Facebook token to Google, set the permissions to compare your social graph with others across Google’s network, and then ask Google to show you search results based on your social relationships. You might pass your Google token to a startup that already has your genome and your health history, and ask it to munge the two in case your 20-year history of searching might infer some insights into your health outcomes. This might seem like a parlor game, but this is the kind of parlor game that could unleash an explosion of new use cases for data, new startups, new jobs, and new economic value.”

    It is our responsibility to examine our current body of legislation as it relates to how corporations such as Facebook impact the lives of consumers and the norms of our society overall. Much of the argument around this issue turns on the definition of “consumer harm” under current policy. Given that data is non-rivalrous and services such as Facebook are free of charge, it is often presumed there is no harm to consumers (or by extension, to society) in its use. This also applies to arguments about antitrust enforcement. I think our society will look back on this line of reasoning as deeply flawed once we evolve to an understanding of data as equal to – or possibly even more valuable than – monetary currency.

    Most observers of technology agree that data is a new class of currency in society, yet we continue to struggle to understand its impact, and how best to govern it. The manufacturing of data into currency is the main business of Facebook and countless other information age businesses. Currently the only participatory right in this value creation for a user of these services is to A/engage with the services offered and B/purchase the stock of the company offering the services. Neither of these options affords the user – or society – compensation commensurate with the value created for the firm. We can and must do better as a society, and we can and must expect more of our business leaders.

    (More: https://shift.newco.co/its-time-for-platforms-to-come-clean-on-political-advertising-69311f582955 ; https://shift.newco.co/come-on-what-did-you-think-they-do-with-your-data-396fd855e7e1 ; https://shift.newco.co/tech-is-public-enemy-1-so-now-what-dee0c0cc40fe ; https://shift.newco.co/why-is-amazons-go-not-bodega-2-0-6f148075afd5 ; https://shift.newco.co/predictions-2017-cfe0806bed84 ; https://shift.newco.co/the-automatic-weapons-of-social-media-3ccce92553ad )

    Respectfully submitted,

    John Battelle

    Ross, California

    June 17, 2018

     