Google has been forced to shut down a data analysis system it was using to develop a censored search engine for China after members of the company’s privacy team raised internal complaints that it had been kept secret from them, The Intercept has learned.
The internal rift over the system has had massive ramifications, effectively ending work on the censored search engine, known as Dragonfly, according to two sources familiar with the plans. The incident represents a major blow to top Google executives, including CEO Sundar Pichai, who have over the last two years made the China project one of their main priorities.
The dispute began in mid-August, when The Intercept revealed that Google employees working on Dragonfly had been using a Beijing-based website to help develop blacklists for the censored search engine, which was designed to block out broad categories of information related to democracy, human rights, and peaceful protest, in accordance with strict rules on censorship in China that are enforced by the country’s authoritarian Communist Party government.
The Beijing-based website, 265.com, is a Chinese-language web directory service that claims to be “China’s most used homepage.” Google purchased the site in 2008 from Cai Wensheng, a billionaire Chinese entrepreneur. 265.com provides its Chinese visitors with news updates, information about financial markets, horoscopes, and advertisements for cheap flights and hotels. It also has a function that allows people to search for websites, images, and videos. However, search queries entered on 265.com are redirected to Baidu, the most popular search engine in China and Google’s main competitor in the country. As The Intercept reported in August, it appears that Google has used 265.com as a honeypot for market research, storing information about Chinese users’ searches before sending them along to Baidu.
According to two Google sources, engineers working on Dragonfly obtained large datasets showing queries that Chinese people were entering into the 265.com search engine. At least one of the engineers obtained a key needed to access an “application programming interface,” or API, associated with 265.com, and used it to harvest search data from the site. Members of Google’s privacy team, however, were kept in the dark about the use of 265.com — a serious breach of company protocol.
The engineers used the data they pulled from 265.com to learn about the kinds of things that people located in mainland China routinely search for in Mandarin. This helped them to build a prototype of Dragonfly. The engineers used the sample queries from 265.com, for instance, to review lists of websites Chinese people would see if they typed the same word or phrase into Google. They then used a tool they called “BeaconTower” to check whether any websites in the Google search results would be blocked by China’s internet censorship system, known as the Great Firewall. Through this process, the engineers compiled a list of thousands of banned websites, which they integrated into the Dragonfly search platform so that it would purge links to websites prohibited in China, such as those of the online encyclopedia Wikipedia and British news broadcaster BBC.
Under normal company procedure, analysis of people’s search queries is subject to tight constraints and should be reviewed by the company’s privacy staff, whose job is to safeguard user rights. But the privacy team only found out about the 265.com data access after The Intercept revealed it, and were “really pissed,” according to one Google source. Members of the privacy team confronted the executives responsible for managing Dragonfly. Following a series of discussions, two sources said, Google engineers were told that they were no longer permitted to continue using the 265.com data to help develop Dragonfly, which has since had severe consequences for the project.
“The 265 data was integral to Dragonfly,” said one source. “Access to the data has been suspended now, which has stopped progress.”
In recent weeks, teams working on Dragonfly have been told to use different datasets for their work. They are no longer gathering search queries from mainland China and are instead now studying “global Chinese” queries that are entered into Google by people living in countries such as the United States and Malaysia. Those queries are qualitatively different from searches originating from within China itself, making it virtually impossible for the Dragonfly team to hone the accuracy of results. Significantly, several groups of engineers have now been moved off of Dragonfly completely and told to shift their attention away from China to instead work on projects related to India, Indonesia, Russia, the Middle East, and Brazil.
Records show that 265.com is still hosted on Google servers, but its physical address is listed under the name of the “Beijing Guxiang Information and Technology Co.,” which has an office space on the third floor of a tower building in northwest Beijing’s Haidian district. 265.com is operated as a Google subsidiary, but unlike most Google-owned websites — such as YouTube and Google.com — it is not blocked in China and can be freely accessed by people in the country using any standard internet browser.
The internal dispute at Google over the 265.com data access is not the first time important information related to Dragonfly has been withheld from the company’s privacy team. The Intercept reported in November that privacy and security employees working on the project had been shut out of key meetings and felt that senior executives had sidelined them. Yonatan Zunger, formerly a 14-year veteran of Google and one of the leading engineers at the company, worked on Dragonfly for several months last year and said the project was shrouded in extreme secrecy and handled in a “highly unusual” way from the outset. Scott Beaumont, Google’s leader in China and a key architect of the Dragonfly project, “did not feel that the security, privacy, and legal teams should be able to question his product decisions,” according to Zunger, “and maintained an openly adversarial relationship with them — quite outside the Google norm.”
Last week, Pichai, Google’s CEO, appeared before Congress, where he faced questions on Dragonfly. Pichai stated that “right now” there were no plans to launch the search engine, though he refused to rule it out in the future. Google had originally aimed to launch Dragonfly between January and April 2019. Leaks about the plan and the extraordinary backlash that ensued both internally and externally appear to have forced company executives to shelve it at least in the short term, two sources familiar with the project said.
Google did not respond to requests for comment.