The Cambridge Analytica Scandal: Data Harvesting Confirmed at Scale
The data really was taken from up to 87 million people. What Cambridge Analytica could actually do with it is the harder question.

On 17 March 2018, a young man with dyed pink hair sat under studio lights and told The Observer and The New York Times that he had helped build a machine for harvesting the private lives of Americans. His name was Christopher Wylie, he was twenty-eight, and until 2014 he had been director of research at a British firm called Cambridge Analytica. What he described was not a rumour or a suspicion. It was a supply chain: an academic app, a loophole in Facebook’s developer platform, and roughly eighty-seven million personal profiles pulled out of the world’s largest social network and repurposed as raw material for a political consultancy owned by one of the biggest donors to the American right. The remarkable thing about the story that followed is not that people believed it. It is that so much of the frightening version turned out to be provable, and so much of the most frightening version turned out to be marketing.
What the documents actually show
Strip away the theatre and the confirmed core is bad enough to stand on its own. In 2013 a Cambridge University psychology researcher named Aleksandr Kogan built a personality-quiz app called thisisyourdigitallife. Around 270,000 people were paid small sums to take it, and in doing so granted the app permission to see their Facebook data. Under Facebook’s platform rules at the time, that permission also reached into their friends’ data. So from roughly a quarter of a million consenting users, Kogan’s app collected the profiles of tens of millions of people who had never heard of it. Facebook’s own later disclosure put the figure at up to eighty-seven million.
Kogan passed that dataset to Cambridge Analytica, in breach of Facebook’s platform policy, which forbade selling or transferring data collected this way. Cambridge Analytica was majority-owned by the American hedge-fund billionaire Robert Mercer and had Steve Bannon as a vice-president; it was a spin-off of an older British defence and elections contractor, SCL Group, whose past work included what it called “psychological operations” for military and political clients. The firm sold itself on “psychographics”: the promise that by scoring individuals on the five-factor personality model — openness, conscientiousness, extraversion, agreeableness, neuroticism, the so-called OCEAN traits — it could target political messaging with almost surgical precision.
That is the documented spine, and it was confirmed by a convergence of independent sources rather than a single anonymous tip. Wylie produced contracts and invoices. Channel 4 News ran undercover footage of chief executive Alexander Nix boasting to a fake Sri Lankan client about entrapment stings, Ukrainian sex workers, and untraceable operations. Britain’s Information Commissioner’s Office raided the firm’s London offices under warrant. The ICO fined Facebook £500,000 in October 2018, the maximum available under the pre-GDPR law, and in 2019 the US Federal Trade Commission fined the company five billion dollars over its privacy practices, the largest such penalty in the agency’s history to that point. Cambridge Analytica itself filed for insolvency in May 2018, weeks after the story broke.
For anyone who had spent years being told that worrying about social-media data was paranoia, the moment was vindicating. A thing many people had felt in their gut — that the free platform was extracting something valuable and doing something with it out of sight — had been dragged into the light with paperwork attached. This sits alongside the Snowden revelations as one of the episodes that moved mass data harvesting out of the folder marked “conspiracy theory” and into the folder marked “court record”.
The fork: harvesting is not the same as controlling
Here is where the story branches, and where the popular retelling started adding things the evidence never supported.
The confirmed claim is about acquisition. Data was taken improperly, at scale, without meaningful consent. That is real and it is settled. The unconfirmed claim — the one that powered a thousand headlines and a Netflix documentary — is about effect: that Cambridge Analytica used this data to psychologically manipulate voters through personalised advertising, and that this manipulation swung the 2016 US presidential election and the Brexit referendum. That claim is far shakier, and the people best placed to know have quietly said so.
Start with the psychographics themselves. The academic evidence that you can meaningfully change political behaviour by tailoring adverts to someone’s personality score is thin. The foundational research the firm leaned on came from the Cambridge Psychometrics Centre and the work of Michal Kosinski, showing that Facebook likes could predict personality traits reasonably well. Predicting a trait is a long way from flipping a vote. When journalists and academics later tried to find evidence that the OCEAN-targeted adverts had actually moved anyone, they mostly found sales patter. Even Wylie, the whistleblower, was careful in his own book to describe the firm as arguably ineffective at the specific thing it claimed to be a genius at — a company that may have been better at frightening its clients and terrifying its critics than at changing minds.
Then there is the Brexit strand. The Leave campaign’s association with Cambridge Analytica became folk knowledge almost overnight, yet the ICO’s own three-year investigation, concluded in October 2020, found no significant evidence that Cambridge Analytica had been involved in the EU referendum beyond some initial enquiries. The commissioner, Elizabeth Denham, wrote that the firm’s systems were “not as sophisticated” as the public debate assumed. The most consequential provable wrongdoing in the Brexit data story turned out to involve a different set of actors, among them the Vote Leave and Leave.EU campaigns, and a separate set of fines.
It is worth dwelling on why the efficacy question is so hard to settle, because the difficulty is itself instructive. Political campaigns are noisy environments in which thousands of factors move at once — the economy, the candidates, the news cycle, the weather on polling day. Isolating the effect of one firm’s personality-targeted adverts inside that storm is close to impossible even for researchers with full access to the data, which no independent party ever had. The firm exploited exactly this opacity. Because nobody could prove the adverts had not worked, the claim that they had worked could float free of evidence, sustained by the firm’s own confidence and its critics’ worst fears feeding each other. That is a recurring feature of the manipulation panic: the harder a claimed effect is to measure, the easier it becomes to believe in, because disproof is unavailable and the imagination fills the vacuum.
None of this exonerates the company. The unauthorised harvesting was real regardless of whether the targeting worked. But the fork matters, because two very different fears got welded together in the public mind. One fear is that companies take our data without consent. That fear was confirmed. The other fear is that they have therefore acquired a kind of remote control over our decisions. That fear was sold to us — by the very firm we were afraid of.
The journey: how a firm became a folk-devil
The salesmanship is the part that gets lost. Cambridge Analytica’s entire commercial pitch depended on clients believing it could do the almost-magical thing. Alexander Nix stood on conference stages and told rooms of marketers that his firm had a personality profile on every adult in America. That was the product. When the scandal broke, the firm’s own boasts became the prosecution’s evidence, and the public simply took the salesman at his word. The manipulation-machine narrative was, in a real sense, Cambridge Analytica’s marketing brochure read back to it as an indictment.
It travelled because it arrived at exactly the right moment. In early 2018 a great many people, particularly in Britain and the United States, were looking for an explanation for two election results they found bewildering. A shadowy data firm with a supervillain aesthetic (undercover stings, a smirking chief executive, a reclusive billionaire, Bannon lurking in the background) offered something enormously comforting to the losing side: the idea that the results had been engineered, the output of a machine that had reached into millions of heads and hacked them. That is a far more bearable story than the alternative, which is that half the country disagreed with you for reasons of its own, arrived at without any machine’s help.
The pattern is old. When a financier or a firm accumulates real, verifiable influence, the public imagination tends to inflate it into omnipotence, and the inflation often carries a nastier cargo than the facts. The same dynamic runs through the Soros conspiracy theories, where a genuine philanthropic and political operation gets ballooned into a puppet-master mythology. Cambridge Analytica became the tech-age version of that figure: a name you could attach to any electoral outcome you disliked, evidence of effect optional.
What it was really about
Underneath the whole affair sits a discomfort that predates Facebook by decades: the sense that we have surrendered something intimate to institutions we cannot see, in exchange for conveniences we cannot give up. The quiz app is the perfect emblem. Roughly 270,000 people willingly told a stranger’s software about their personalities in exchange for a small payment and a moment’s diversion, and in doing so they handed over the friends who trusted them. Nobody twisted an arm. The extraction ran on ordinary human curiosity and the invisible architecture of consent that platforms had built to be as frictionless, and as unread, as possible.
That is why the scandal landed so hard, and why the exaggerations stuck so easily. People were not really arguing about OCEAN scores or the efficacy of micro-targeting. They were reckoning with the realisation that a decade of casual clicking had built a detailed record of who they were, held by companies whose incentives did not obviously include their welfare. The mind-control story gave that formless dread a shape and a villain. The truer story is quieter and harder to hold: the machine that reads you does not need to control you to change the world, because knowing you well enough to sell to you is already an enormous and largely unaccountable power. That power did not fold along with Cambridge Analytica in 2018. It was never the property of one badly behaved firm; it is the business model, and it is still running.
The most useful thing to carry out of the affair is a habit of separating the two questions the scandal fused. Did they take the data? Yes, provably, at a scale that should still unsettle anyone. Did taking it give them the power to reach into eighty-seven million heads and rewrite the vote? That is what the firm most wanted us to believe, and it is the one claim for which the paperwork never arrived. The gap between those two answers is where the real education lives, and it is a gap the platforms themselves have every reason to keep us from noticing.




