From the Data

Google Academics Inc. Methodology

Our database of Google-sponsored research began with anecdotal inquiries into conference proceedings on policy issues of interest to Google and into the curricula vitae of authors with known ties to the company. We supplemented this research with three structured searches of Google Scholar, described in detail below. While this methodology yielded a significant number of policy papers that could be traced through their authors to Google funding, the dataset is by no means comprehensive. Please let us know if you notice any errors or omissions.
 
For the first structured search, we built a database of recipients of direct Google research support using the company’s own disclosure pages. We then searched for award recipients’ names on Google Scholar in conjunction with keywords for relevant policy issues: antitrust, search neutrality, net neutrality, search bias, copyright infringement, patent lawsuit, intellectual property, data security, fair use, anticompetitive, public policy, regulation, regulatory, tax, taxation, taxes. We then restricted the results for each author to papers published after the year in which she received her first Google award. Finally, an analyst sorted through these papers manually to determine a) Correct author identification, b) Relevance to policy issues of interest to Google, and c) Whether the author acknowledged Google support in the paper.
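
The first search can be sketched roughly as follows. This is only an illustration under stated assumptions: the quoted-phrase query format, the function names, and the record fields (author, year) are hypothetical and chosen for the example. The keyword list is the one given above.

```python
from itertools import product

# Policy keywords as listed in the methodology.
POLICY_KEYWORDS = [
    "antitrust", "search neutrality", "net neutrality", "search bias",
    "copyright infringement", "patent lawsuit", "intellectual property",
    "data security", "fair use", "anticompetitive", "public policy",
    "regulation", "regulatory", "tax", "taxation", "taxes",
]


def build_queries(author_names, keywords=POLICY_KEYWORDS):
    """Pair every author name with every keyword (query format is an assumption)."""
    return [f'"{name}" "{kw}"' for name, kw in product(author_names, keywords)]


def published_after_first_award(papers, first_award_year):
    """Keep papers published strictly after the author's first award year.

    papers: list of dicts with hypothetical keys "author" and "year".
    first_award_year: dict mapping author name to the year of the first award.
    """
    return [p for p in papers if p["year"] > first_award_year[p["author"]]]
```

The filtered list would then go to an analyst, since author identification, relevance, and acknowledgement of support were all checked by hand.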
 

The second structured search targeted papers that explicitly acknowledged Google support. We searched Google Scholar for the co-occurrence of the policy keywords identified above with acknowledgement language: “grant from Google,” “support from Google,” “funding from Google,” “fellowship from Google,” “Google grant,” “Google fellowship,” “Google funding,” “Google research grant,” “Google * fellowship,” “grateful to Google,” “thanks to Google,” “thank Google,” “thanks Google.” An analyst sorted through these papers manually to verify a) Whether the responsive phrase was in fact an acknowledgement of Google support, and b) Relevance to policy issues of interest to Google.
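
To illustrate how the acknowledgement phrases behave as search patterns, the sketch below matches them against a passage of text. It is a hypothetical pre-screen, not a description of how Google Scholar evaluates queries. Treating the "*" wildcard as one to three arbitrary words, matching case-insensitively, and the function names are all assumptions.

```python
import re

# Acknowledgement phrases as listed in the methodology.
ACK_PHRASES = [
    "grant from Google", "support from Google", "funding from Google",
    "fellowship from Google", "Google grant", "Google fellowship",
    "Google funding", "Google research grant", "Google * fellowship",
    "grateful to Google", "thanks to Google", "thank Google", "thanks Google",
]

# Assumption: "*" stands for one to three whitespace-separated words.
WILDCARD = r"\S+(?:\s+\S+){0,2}"


def phrase_to_regex(phrase):
    """Compile a phrase into a case-insensitive regex with flexible whitespace."""
    tokens = [WILDCARD if tok == "*" else re.escape(tok) for tok in phrase.split()]
    return re.compile(r"\b" + r"\s+".join(tokens) + r"\b", re.IGNORECASE)


def find_acknowledgements(text, phrases=ACK_PHRASES):
    """Return the phrases from the list that occur somewhere in the text."""
    return [p for p in phrases if phrase_to_regex(p).search(text)]
```

A sentence such as "Thanks to Google, search is easy" also matches "thanks to Google" without acknowledging any funding, which is why each hit needed manual verification.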

In the final structured search, we compiled a list of all of the authors of papers identified in the second structured search and in anecdotal research, and searched Google Scholar for their names alongside the policy keywords used in the previous two searches. We sought papers by authors who may have acknowledged Google support in one publication but not in other papers addressing the same topic. An analyst manually reviewed the results of this search to verify a) Whether the paper was published after the date of first known Google support, b) Relevance to policy issues of interest to Google, and c) Whether the author acknowledged Google support in the paper.
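
The author-based follow-up search can be sketched in the same spirit. The example assumes that the date of first known Google support is approximated by the year of an author's earliest acknowledging paper, and that dates are compared at year granularity. Both are simplifications, and the field names (authors, year) are hypothetical.

```python
def first_support_year(acknowledging_papers):
    """Map each author to the earliest year of a paper acknowledging Google support."""
    first = {}
    for paper in acknowledging_papers:
        for author in paper["authors"]:
            first[author] = min(paper["year"], first.get(author, paper["year"]))
    return first


def candidates_for_review(search_results, first_year):
    """Keep results with at least one author whose first known support predates the paper."""
    return [
        paper
        for paper in search_results
        if any(
            author in first_year and paper["year"] > first_year[author]
            for author in paper["authors"]
        )
    ]
```

The output is only a candidate list. Relevance and acknowledgement of support still have to be judged by a reader.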

To construct the citation network, we looked up each paper in our completed database of Google-funded papers on Google Scholar and followed its “cited by…” link. We gathered data on each article that cited a Google-funded paper. An analyst manually cleaned this data and omitted results written in alphabets other than the Latin alphabet, because those citations could not be verified. The resulting network contained nearly 6,000 connections and is too large to display in an interactive visualization. The interactive network graph that accompanies this article therefore shows the 329 Google-funded papers that we identified and all unfunded papers that cite more than one piece of Google-funded scholarship.
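
The reduction from the full network to the displayed graph can be sketched as below. This is a minimal sketch, assuming citations are stored as (citing title, cited title) pairs. The Latin-alphabet check is a rough automated stand-in for the analyst's manual cleaning, and all names are hypothetical.

```python
import unicodedata
from collections import defaultdict


def is_latin(text):
    """Heuristic: every alphabetic character has a Unicode name containing LATIN."""
    return all("LATIN" in unicodedata.name(ch, "") for ch in text if ch.isalpha())


def display_subgraph(edges, funded):
    """Reduce the citation network to the nodes and edges shown in the graph.

    edges: iterable of (citing_title, cited_title) pairs.
    funded: set of titles of Google-funded papers.
    Returns (nodes, kept_edges).
    """
    cited_by = defaultdict(set)
    for citing, cited in edges:
        # Only citations of funded papers matter; skip non-Latin citing titles.
        if cited in funded and is_latin(citing):
            cited_by[citing].add(cited)

    # All funded papers, plus unfunded papers citing more than one funded paper.
    nodes = set(funded) | {c for c, targets in cited_by.items() if len(targets) > 1}
    kept_edges = sorted((c, t) for c in cited_by if c in nodes for t in cited_by[c])
    return nodes, kept_edges
```

Requiring more than one funded citation removes the unfunded papers that touch the network only once, which is what keeps the displayed graph smaller than the full set of connections.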