Cianan Brennan: How data becomes money - the price we pay for keeping web content free

The convenience companies such as Google have brought to modern living is undeniable — and the company says that it never technically sells our information to third parties. But are we sacrificing too much of our privacy just to make our online lives easier to manage? Cianan Brennan reports

Picture: iStock

One of the recurring criticisms in the endless debate on the rights and wrongs of present-day big tech and the quest for data accountability is the idea that users of such tech are effectively walking cash-generators for the corporate behemoths.

Like so much in this world, such a statement is not entirely wrong — but it is not entirely right either.

The likes of Google, Facebook, and Twitter subsist by selling advertising, it’s true. All three would argue that they do not sell data, however.

Nonetheless, marketing is crucial to their business model. The most accomplished, and most profitable, of the three is Google. Why? “Because they got there first,” says a source.

“They got in early, they have a better algorithm, and they show more relevant ads. First-mover advantage is hugely important.”

In the third quarter of 2018, 87% of Google’s total revenue came from advertising, equating to about €20.4bn.

Google CEO Sundar Pichai is on the record as saying that Google will never sell any personal information to third parties. “You get to decide how your information is used,” Mr Pichai wrote in The New York Times in 2019.

There is certainly merit to that statement — the convenience which Google provides to modern living, in terms of traffic-monitoring for example, is only possible via the sharing of location data. If no one shares their location, the service cannot work.

Meanwhile, personal data from Google Drive or Gmail is never used for marketing purposes, the company says.

Nevertheless, Google has shown itself to be wise to the fact that scepticism about big data is only growing with time — hence the commitment made in 2019 to auto-wipe data for new accounts, and to prompt long-term users to do the same.

Google CEO Sundar Pichai. Picture: Justin Sullivan/Getty Images

The company still needs to make money, however. Pichai sums it up thus: “A small subset of data helps serve ads that are relevant and that provide the revenue that keeps Google products free and accessible.

“That revenue also sustains a broad community of content creators, which in turn helps keep content on the web free for everyone.” 

This all sounds very altruistic, and certainly the utility provided by Google cannot be denied. Regardless, it is one of the most profitable companies on the planet, so money is certainly being made somewhere.

The recently introduced California Consumer Privacy Act (CCPA), which came into effect at the beginning of 2020 in Google’s home state, has a number of wide-reaching effects. One of those is to define what a “sale” is in terms of data transfers. Under the act, any transaction which sees something of value change hands equates to a sale, and in all such cases, the individual whose data is being used has to be given the opportunity to opt out of the deal.

This is one of the key reasons why acknowledging the sale of data would be so problematic for the manner in which Google functions. The company does acknowledge that at some point in the process a sale takes place; it just denies that it is the one doing the selling.

The argument is not dissimilar to that made regarding the torrenting of media, with those hosting torrent files, or torrent sites, arguing that they are merely providing a service, not committing any overt act.

But how exactly does one’s online data become monetised?

Google, the all-conquering search giant formed in 1998, operates its revenue-generating business model in two main ways.

Firstly it assembles all the data it has recorded from your various online interactions into a profile of interests, and potentially physical characteristics, and allows advertisers to target their messaging at that demographic.

Secondly, it shares its vast wealth of data with advertisers directly and asks them to bid for ad space, be it on mobile devices or desktop, via a constant stream of online auctions.

Such auctions are constantly happening in the background via an automated system known as "real-time bidding".

The system is a double-edged sword. In auctioning off advertising space on their apps and websites, publishers likewise share personal data via permissions — phone numbers, device IDs, browsing history — with Google and hundreds of other companies like it.

Stock image. Picture: Andrew Matthews/PA Wire

The process sees a person’s data go through a number of different layers in real time, from leaving a user’s device to ending up in the hands of an advertiser.

Supply-side platforms (SSPs) collect the data to sell, ad tech exchanges like Google organise the automated auctions for advertising space between themselves and the advertisers, and demand-side platforms (DSPs) do the bidding on behalf of the advertisers themselves.

The auctions may sound grandiose given the real-world variant of the term. In reality they last mere milliseconds, with the highest bidder getting the space.
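As a rough illustration of the "highest bidder gets the space" step, the following Python sketch picks a winner from a handful of bids. The DSP names, advertisers, prices, and the optional floor price are all hypothetical, and real exchanges apply pricing rules and checks that are not modelled here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Bid:
    dsp: str          # hypothetical demand-side platform name
    advertiser: str   # hypothetical advertiser the DSP bids for
    price_usd: float  # amount offered for this one ad slot


def run_auction(bids: List[Bid], floor_usd: float = 0.0) -> Optional[Bid]:
    """Return the highest bid at or above the floor, or None if there is none.

    The floor price is an assumption added for illustration only.
    """
    eligible = [b for b in bids if b.price_usd >= floor_usd]
    if not eligible:
        return None
    return max(eligible, key=lambda b: b.price_usd)


# Made-up bids arriving for a single ad slot.
bids = [
    Bid("dsp_a", "shoe_retailer", 2.10),
    Bid("dsp_b", "airline", 3.40),
    Bid("dsp_c", "bank", 1.75),
]

winner = run_auction(bids, floor_usd=1.00)
print(winner.advertiser if winner else "no eligible bids")
```

The point of the sketch is only that the selection is mechanical: once the bids are in, choosing a winner is a single comparison, which is why billions of these auctions can be run every day.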

In this manner, ads based on your recent browsing activity or personal preferences magically end up targeting you via an app that may be completely unrelated to the activity on sale.

Where Google has the advantage in this system is that it controls vast tracts of the real-time bidding universe. This has resulted from prescient acquisitions as the company has developed, not dissimilar to Facebook’s strategic acquisitions of Instagram and WhatsApp, together with the ongoing success of its own signature advertising products.

For example, in 2009 it bought AdMob, the largest ad SSP for the then-burgeoning smartphone application advertising market. In 2020 AdMob under Google’s ownership sends more than 40bn mobile and text ads per month across mobile browsers and individual applications.

AdMob develops tools for web developers to build into their apps, known as software development kits (SDKs), which then connect the apps to the ad exchanges directly.

From there, your app shows you an ad based on the volume of data that AdMob shares with Google and the exchange, and to end the process both Google and the app developer are paid for the service.

AdSense meanwhile is Google’s flagship monetisation product, counting some 11m websites as its customers and paying out $10bn to publishers each year. It works simply by a publisher making advertising slots available on its site which Google fills with the winning bids in those same real-time auctions.

Billions of these auctions happen every day with Google acting both as data facilitator and oftentimes the exchange also. In each case Google receives bid requests from across the web; it then shares data with the DSPs representing the advertisers, and receives remuneration once the auction is completed. The amount depends on the scale of the auction and the number of clicks the displayed ad produces. Cost per click (CPC) varies depending on the subject matter, with an average across the spectrum of between $1 and $2.

AdSense’s revenue split equates to 68% for the publisher and 32% for Google. This equates to millions of dollars per day. Make no mistake, Google is an enormous revenue generator, both for itself and for thousands of individual sites which likely could not exist without that income.
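To put a rough number on "millions of dollars per day", the sketch below applies the 68/32 split to the $10bn annual publisher payout cited above. It assumes, purely for illustration, that the whole payout was made at the 68% rate, which the figures quoted do not confirm.

```python
PUBLISHER_SHARE = 0.68                # publisher's cut, as quoted above
ANNUAL_PUBLISHER_PAYOUT_USD = 10e9    # $10bn paid to publishers per year
DAYS_PER_YEAR = 365


def implied_split(publisher_payout: float, publisher_share: float = PUBLISHER_SHARE):
    """Back out gross ad spend and Google's cut from the publisher payout.

    Assumption: every dollar of the payout was shared at `publisher_share`.
    """
    gross = publisher_payout / publisher_share
    google_cut = gross - publisher_payout
    return gross, google_cut


gross, google_cut = implied_split(ANNUAL_PUBLISHER_PAYOUT_USD)

print(f"Implied gross ad spend: ${gross / 1e9:.1f}bn a year")
print(f"Implied Google share:   ${google_cut / 1e9:.1f}bn a year")
print(f"Google share per day:   ${google_cut / DAYS_PER_YEAR / 1e6:.1f}m")
print(f"Publisher share per day: ${ANNUAL_PUBLISHER_PAYOUT_USD / DAYS_PER_YEAR / 1e6:.1f}m")
```

On those assumptions, Google's share comes to roughly $4.7bn a year, or close to $13m a day, with publishers collectively receiving about $27m a day.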

Meanwhile, at the user’s end, the data used in each auction on their device is tied back to identifiers specific to them, from phone identifiers or cookies (the footprints of desktop browsing) to information from a person’s own Google account.

Then there are the means by which Google personalises your data, preferences, and online history. You can get a feel for this personally by checking out myaccount.google.com.

Should an advertiser wish to target a specific age group, with particular interests, Google’s algorithm moulds the copious amounts of data it processes into inferences as to demographics and interests, which are then made available to the demand side for targeted ads.

Taking that a step further, its Customer Match facility allows advertisers to upload data on and target individual users by name, email, and device ID, information which is shared with Google routinely for example via Google Play Services, the app installed on all Android devices charged with ensuring third-party applications run properly.

This particular form of data-sharing then sees ads served at the individual customer via their account across platforms.

Cookies, meanwhile, are of less import than Google’s own data collection and profiling, it seems.

“Cookies are for when you specifically want to target someone who has already performed a desired behaviour, say in purchasing a specific product,” says an ad-exchange source. “It’s called remarketing, and they’re nice to have because they’re proven, as opposed to speculative.

“But that serves to seriously limit the number of people you can show an ad to.”

So is there a problem for individual publishers, given they get a cut of all advertising that comes their way via Google?

This really depends on what the advertiser is looking for — although with the Covid-19 pandemic having decimated in-house advertising budgets of media companies for example, at present the point is slightly moot. Nevertheless, publishers can boast their own first-party data — that is, metrics exclusive to the site in question, like browsing history and viewer behaviours.

In happier times, an in-house marketing sales department is of more use to an advertiser looking for a bespoke campaign with a specific return.

“Google takes approximately 30% of all revenue earned from a website, as well as requiring you to use their publishers’ ad manager system, which can add up to substantial costs,” says one in-house source.

“If we sell directly, we can layer on our own first-party data regarding a user’s interests, but even then we’d have to use the Google management system which we pay a standing fee for,” they say.

“First-party data is gold dust and is essentially the best quality. Many agencies would love to get their hands on such data, but many sites seem reluctant to give it up, most likely as it would be like opening up Pandora’s box.

“Google are definitely a hindrance overall, but we can’t really operate without them as they are so established. Even other market leaders use the Google platform overall, so it’s hard to compete. They have publishers cornered, in essence.

File picture: PA

“In short, many other companies have tried and failed to offer a direct system, but as Google is so prevalent in the industry, it’s not really a runner.”

When speaking about the benefit of dealing directly with a data bank like Google, one industry source says: “It’s all about efficiency. 

“You want to be able to go online, have a campaign drawn up inside 30 minutes and let loose. Then you can come back the next day and adjust. That’s the beauty of online advertising, in that it’s measurable. Most traditional advertising, a logo on a shirt, say, can’t really be measured.

“And a lot of advertising has traditionally been about maintaining brand recognition. But online advertising evolves all the time, it’s a question of evolving with it to get the best return on investment.”

So this is how the system works. What then are the potential issues, other than the irritation of low-grade, looped advertising content?

As with all data-heavy projects of all hues, the largest issues exist in terms of privacy breaches, and the ever-expanding potential for same. With the sheer amount of data flying through the ether as a matter of course, data grabs become an ever-more attractive prospect for those looking to exploit the system.

At present, the Irish Data Protection Commission has open inquiries under way into both the adtech industry and the real-time bidding advertising framework.

The aforementioned customer-matching technique is one particular avenue with the potential to prove problematic from a privacy standpoint, with research suggesting that such specifically targeted systems, which are utilised by all the big tech companies, can lead to the reverse engineering of phone numbers and other personally identifiable information (PII).

However, it is also true, as Sundar Pichai suggests, that online marketing only works proportionate to the information which users are willing to voluntarily give up.

By routinely deleting browser histories and location data, a user will make it at least a little more difficult for companies to target them.

What Google is banking on, and with good reason, is that enough people are happy enough to give up their data for the returns they receive.

Unless that changes, the billions being made via automated online advertising will not be going anywhere.

Putting a market value on genetic data

Genuity’s endeavours with the Irish public’s genetic data raise questions as to why the Government has not become involved in orchestrating a publicly-funded genomics project which would cost a fraction of the price of the private one, and would see the collated data returned to the public domain, says Cianan Brennan.

It’s 2020, and we’re not yet flying around in hovercars. Yet the world is a fundamentally different place to what it was 20 years ago.

Back then, mobile phones were becoming increasingly common, but the smart device revolution was still several years away. Facebook wouldn’t appear before 2005. Google existed, but was many years removed from the all-conquering behemoth we know now.

Nowadays, in a world of smartphone ubiquity and endless online information, the marketing of data has become commonplace. While such information is valuable, the whole oblique nature of the process renders it, for good or ill, of little everyday interest to most.

The health and wellbeing sphere has been far from immune to the encroachment of big data.

Google routinely monitors the exercise its users take, where they cycle to, where they’ve walked.

Popular apps such as Fitbit, acquired last year by Google, or Apple Health, collect data on individual users’ fitness and wellbeing statistics as a matter of course, though the sharing of said information is voluntary.

Closer to home the recent launch of the Covid Tracker app was a poster-child for responsible data processing in the case of a civic-minded app, one which people who would ordinarily have little appetite for interacting with the data giants such as Google would be more inclined to download.

However, the app has not been without its problems, most of which have stemmed from the Google/Android side of things and have been fired by that company’s need to acquire data to feed its behavioural marketing models.

Asked at the launch about the appropriateness or otherwise of relying so heavily on private entities Apple and Google for an application aimed at curing a societal ill, Health Minister Stephen Donnelly compared the tech giants' status to that of a national utility.

“I think it is probably a facet of modern life. They are the technology platforms. We’re largely dependent on the ESB to keep the country running, but that’s ok. We are largely dependent on these private-sector companies to keep the internet running, to keep broadband running, to do an awful lot of things,” he said.

“It just is what it is.” 

However, there is other data relating to our collective health — physical information that is of huge value. And the sheer scope of what it means is something the Irish State, let alone its people, has yet to get a handle on.

We’re talking about genetic data, and the market value of the individual human genome — the DNA files which double as the building blocks for humankind.

The focus of this market is not necessarily research. The value lies in what may seem a far more banal avenue, namely the predictability of susceptibility to disease. But if you think that means the potential for enormous profit does not exist, you are very much mistaken.

Last month, Blackstone Group, a giant American private-equity multinational, acquired about 75% of Ancestry.com in a deal worth the guts of €4bn.

Ancestry is an American genealogy company set up in 1996, the largest for-profit entity of its kind on the planet, with its European headquarters in Dublin.

Its fundamental premise is the sale to consumers of DNA kits with which they can map their genetic ancestry. This basically translates to an estimate of ethnicity, together with the inference of family relationships between its 18m users. In practice, the service is used by customers to trace unknown biological relations. Analysis of its kit samples is performed by the American corporation Quest Diagnostics, one of the largest clinical laboratory companies in the world. Irish people will be familiar with it as one of the laboratories, used to process Cervical Check samples, which became household names after a number of Irish women took lawsuits over incorrect smear-test results.

Blackstone’s recent track record has been one of investment in growth industries and companies likely to benefit from dramatic shifts in consumer behaviour. It is far from inconceivable that the move towards marketable genomes is a classic example of such a shift.

What may come as a surprise is that a valuation can indeed be placed on an individual’s genetic sequence.

“For $99 they’ll tell you what percent Viking you are, and how susceptible you are to cancer,” says Simon McGarr, a privacy solicitor.

“It’s amazing that they get you to pay them. You don’t get much of anything in return really. But they’ve got the genome, and the genome is valuable.” 

How valuable?

A recent presentation by pharmaceutical research company Open Orphan put the price per sample at anything between $450 and $3,000. A compromise figure of $1,500 is often accepted as the current going rate.

Apply that rate to a full population and you begin to see that genomics is big business, with large-scale pharma multinationals on the lookout for areas where they can access swathes of genetic data, ostensibly for pharmaceutical research, most especially in the field of rare disease drugs.

Individual citizens owning the rights to their own genetic data is far from usual, but that cannot alter the fact that the acquisition of such data is big business in 2020.

So, what if there were a determined grab for the genetic data of the Irish people?

That’s an easy question to answer because it has to all intents and purposes already happened.

In late June of this year a private entity known as Genomics Medicine Ireland (GMI), established in 2015, quietly rebranded its American, Icelandic, and Irish operations as Genuity Science, while acquiring a new chief executive.

GMI had made any number of headlines in recent years, with its goal of sequencing the genomes of 450,000 people — a tenth of the Irish population, and a large enough cohort to enable the effective profiling of the entire citizenry, given the interrelated ancestry of Ireland’s people.

The company has been partnering with individual hospitals to acquire large tracts of genetic data.

Earlier this year it found itself embroiled in a controversy of sorts after a joint venture between itself and Beaumont Hospital in Dublin, aimed at harvesting the genetic data of 9,000 brain-tumour patients of the hospital, was broadcast publicly via a series of newspaper advertorials. The catch is that participation in the project is on an opt-out basis, rather than opt-in, and many of the participants are already dead.

The study itself had initially been blocked by the State’s Health Research Consent Declaration Committee, established under the Data Protection Act 2018, which gave effect to GDPR in Irish law. That decision was overturned on appeal, with a caveat that the study must be publicly advertised.

Eventually the company and Beaumont bowed to public pressure and extended the deadline for opting out of the study by three months until the middle of this month, to mitigate the effects the Covid-19 pandemic would have had on the visibility of the public information announcements. This week the deadline was extended once more until the end of this year, following a flurry of criticism from the likes of Social Democrats co-leader Róisín Shortall.

Why the name change?

Despite its name, Genomics Medicine Ireland was in fact a subsidiary of an international genetics company, WuXi NextCode, which acquired it in 2018.

Officially, Genuity has said the rebrand was necessitated by it having to “overhaul its structure after China introduced new national security regulations which make it harder for foreign genetic research companies to share data”. The move saw WuXi NextCode split from its Chinese operations as part of the corporate restructuring.

However, the company’s prior association with China, given the political climate in the Asian country, had led to criticism from some quarters, with WuXi NextCode last year rebuffing accusations of ties to China, emanating from the American Senate, and reiterating its status as a multinational headquartered in the US.

“Where does GMI stand? They’ve kept stating that they’re not Chinese, so that would suggest it was damaging the study,” said one industry source, speaking on condition of anonymity.

“This restructuring allows us to sharpen the short-term strategic focus of our business to better catalyse the biopharma industry’s ability to effectively and efficiently integrate genomic data and insights into their drug development endeavors,” said Rob Brainin, incoming CEO of the new entity. “Our long-term vision and commitment to improving the lives of patients by accelerating the pace of precision health remains the same.”

Where Genuity stands out is that in it the Government has opted for the genomic profiling of the Irish nation via a private entity, one backed by roughly €70m in taxpayers' money via the Irish Strategic Investment Fund.

Meanwhile, a WuXi NextCode executive went on LinkedIn in August 2019 to announce that GMI had “12,000 whole genome sequenced MS patients” and was “actively looking for a pharma partner to explore the underlying genetic and biological drivers”. The post was removed shortly thereafter. The executive no longer works for WuXi.

This is not to say that Genuity’s data-acquisition practices have escaped regulatory scrutiny. In November 2019, months prior to the Beaumont Hospital public announcement, the Data Protection Commission informed privacy advocacy grouping, Digital Rights Ireland, that it had commenced a “widespread compliance and supervision” investigation regarding how the company processes the genetic data of Irish citizens. No update on the matter has yet been released, although it is understood the possibility of a full inquiry into the company’s data-acquisition practices has been mooted and may yet materialise.

The overriding question regarding Genuity’s endeavours with the Irish public’s genetic data is why the Government hasn’t become involved in orchestrating a publicly-funded genomics project — one which would cost a fraction of the price of the private project, and which would see the collated data returned to the public domain.

“What these private projects are doing is collating genome banks and then charging big pharmaceutical companies to access them,” says McGarr. “You could use that genetic information to profile who might be susceptible to the coronavirus, for example.” The then-GMI had made just under $13m in revenue through “making available certain genomic data to be used in extensive research” per its most recently-filed accounts as at the end of 2018.

“It comes down to this: What do people know and what do they expect? DNA is abstract; someone could consent to selling it, but would they want it sold on to the big pharma companies?” says McGarr.

“What if it were to affect my life insurance, for example, when a profile becomes available showing a predisposition towards cancer and then my family can’t get cover? The consequences are so enormous it’s mind-boggling.”

In theory, the sale of genetic data does not have to be a one-off either, but something that can be repeated, with the same windfall on each occasion. Taking the 450,000 cohort which Genuity has aimed for, at a price of $1,500 per sample, you’re left with a databank with a value of $675m with the possibility of resales.
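The arithmetic behind that valuation can be checked with a short Python sketch. It uses only the figures quoted in this article (the 450,000 target cohort, and per-sample prices of $450, $1,500, and $3,000); the variable names are illustrative, and treating every sample as saleable at a single price is a simplifying assumption.

```python
COHORT_SIZE = 450_000  # Genuity's stated sequencing target

# Per-sample prices in US dollars, as quoted from the Open Orphan presentation.
PRICE_PER_SAMPLE_USD = {
    "low": 450,
    "going rate": 1_500,
    "high": 3_000,
}


def databank_value(cohort_size: int, price_per_sample: int) -> int:
    """Value of one sale of the full databank at a flat per-sample price."""
    return cohort_size * price_per_sample


for label, price in PRICE_PER_SAMPLE_USD.items():
    value = databank_value(COHORT_SIZE, price)
    print(f"{label:>10}: ${value / 1e6:,.1f}m")
```

At the low end the databank would be worth $202.5m per sale, at the going rate $675m, and at the top of the range $1.35bn, before any resale is counted.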

“The fact that Ireland hasn’t considered the ethical problems involved in the collation of this data is an abomination,” says the industry source.

“For €10m we could have a national genome project which benefits those taking part. Private industry, its responsibility is to make profits via therapeutic products. You have to presume that such concerns make decisions which will be beneficial to them,” they say.

So why wouldn’t the Government get involved in a publicly-funded genome project, of which there are multiple examples already in existence across Europe? Why instead sign on with private industry?

“My feeling is that there has been a tendency to worship at the altar of foreign direct investment in Ireland at a cost of letting foreign companies dictate the terms of engagement,” says the industry source. 

“These companies are not questioned in terms of risk-benefit ratios. That is fine in terms of reputational risk, but your genome is another ball game. It’s like a naked picture of you. If you knew someone was taking one you wouldn’t let them. In Ireland we’ve chosen the most exploitative model possible.” 

“This private project should be countered by a public one. Look for diseases in the Irish population and benefit a huge amount of people for the least amount of investment possible.”
