Google cancelled an agreement with the NIH that would have involved publicly posting 100,000 X-ray images of 30,000 patients

Google almost exposed personal information for 30,000 patients: report

November 19, 2019
by John R. Fischer, Senior Reporter
A telephone call from the National Institutes of Health (NIH) saved Google from publicly exposing the personal health data of 30,000 patients.

The data was contained in more than 100,000 chest X-ray images that Google planned to post in 2017 as part of a project with the NIH, which had provided the scans, The Washington Post discovered this past week in records obtained under a Freedom of Information Act request.

The call, which came two days before the expected posting date, led Google to “abruptly” cancel the project with the NIH, according to a series of emails found within the records that detailed the incident. The discovery of the event, which was never reported, comes amid another revelation this month of a secret agreement between Google and healthcare provider Ascension to transfer 50 million American health records to Google's cloud.

“We take great care to protect patient data and ensure that personal information remains private and secure,” said Google spokesman Michael Moeschler in regard to the NIH project, reported The Post. “Out of an abundance of caution, and in the interest of protecting personal privacy, we elected to not host the NIH data set. We deleted all images from our internal systems and did not pursue further work with NIH.”

Post reporters discovered the details of the project and its cancellation in a series of emails within the records. A source, who wished to remain anonymous for unspecified reasons, told the paper that Google researchers did not make any legal agreements around the privacy of patient information, and that the company rushed to announce its plans without properly verifying that privacy of the data was in order.

Justin Cohen, chief of the office of communications & media relations at the NIH Clinical Center, told The Post that Google was one of a number of cloud providers that the NIH considered for hosting the 112,000 X-ray scans, which were taken at the NIH Clinical Center, a government-funded research hospital in Bethesda, Maryland. The two organizations worked together to delete any trace of personal health information from the images. Google aimed to finish the job by July 21, 2017, the date on which it planned to announce the project and publicly release the data set at an AI conference in Honolulu.

Overseeing the work was Fei-Fei Li, then chief scientist of the tech giant’s cloud computing division. She wanted to use the project to demonstrate how TensorFlow, a Google tool that teaches computers to identify the markings of different diseases in images, could be applied to solve some of the most complex problems in medicine, according to the unnamed source.

Upon finding personal information in dozens of images, the NIH contacted Google on July 19 to warn it. The information included the dates on which X-rays were taken and distinctive jewelry that patients wore during the exams, according to the emails. After consulting its attorneys and emailing the NIH to ask if the data was protected under HIPAA, Google chose not to move forward with the project and deleted all of the X-rays from its servers. The 112,000 images were later published in September 2017 after the NIH had scrubbed all traces of personal data from them.

"The data was released to the scientific community to give researchers access and increase their ability to teach computers how to detect and diagnose disease. Ultimately, this artificial intelligence mechanism can lead to clinicians making better diagnostic decisions for patients," Cohen told HCB News, adding that a media advisory with details about the release of the dataset was issued in September 2017. "As noted in the media advisory, the hope is that machine learning tools will be able to process large amounts of scans, and confirm results radiologists have found as well as identify findings that may have been overlooked."

Google is currently facing a number of challenges following a whistleblower's revelation this month of the company's project to collect and store in its cloud the personal health information (including full names, dates of birth and clinical histories) of millions of patients at Ascension. Google maintains that both the NIH and Ascension projects are compliant with federal privacy laws, and that the objective with Ascension was to provide better recommendations to physicians while demonstrating its cloud storage services.

Many, however, have raised concerns over information privacy, as it was not immediately clear whether patients consented to have their files transferred from Ascension’s servers, or what Google’s intentions were. The Department of Health and Human Services announced this week that it intends to evaluate whether Google’s “mass collection of individuals’ health records” violated HIPAA privacy regulations.

The company is also under scrutiny for a $2.1 billion deal to acquire Fitbit, due to concerns over antitrust and privacy violations. The deal is currently awaiting regulatory approval.

“Why should Google be permitted to acquire even more companies while they’re under DOJ antitrust investigation?” tweeted Sen. Josh Hawley (R-Mo.) after the deal was announced this month, with Rep. David N. Cicilline (D-R.I.), the chairman of a House committee on antitrust issues, requesting “an immediate and thorough investigation” of the acquisition, according to The Post.

The X-rays scanned by the NIH belong mostly to patients with lung disease, according to the emails.

The NIH has broad authority to share medical data with outside “consultants” for research purposes and specifies this in waivers collected from patients. It is unknown whether Google would be considered a consultant under the policy. Cohen maintains that the project was in line with the Privacy Act and that "no personally identifiable information (PII) has been released," due to staff "rigorously reviewing" data to ensure the removal of PII.

Google did not respond to a request for comment.