Stimulate AI innovation with a National Research Cloud
A recently published white paper provides a roadmap for creating a National Research Cloud (NRC) that would fuel artificial intelligence-based research.
In “Building a National AI Research Resource: A Blueprint for the National Research Cloud”, the Stanford Institute for Human-Centered Artificial Intelligence (HAI) and the Stanford Law School’s Policy Lab, which studied the feasibility of an NRC, make recommendations on how an NRC might play out.
First, the document, released on October 6, recommends the use of a dual investment strategy that makes the most of public IT infrastructure and the services of commercial cloud providers. In the short term, “NRC’s computational model can be quickly launched by subsidizing and negotiating cloud computing for AI researchers with existing vendors, expanding existing initiatives like the [National Science Foundation’s] CloudBank project,” which provides subsidized access to existing cloud resources.
NRC should also invest in a pilot test of public infrastructure to assess its ability to provide similar resources in the long run, similar to the way Energy Department national laboratories operate now. They own supercomputing facilities that researchers get approval to use.
Second, the paper recommends that eligibility to access and use NRC be limited, at least initially, to academic and nonprofit AI researchers, specifically those who are considered principal investigators (PIs) at U.S. colleges and universities, and to what the paper terms “Affiliated Government Agencies.” Those are organizations that will contribute previously unreleased, high-value datasets to NRC in return for subsidized compute resources.
Third, the researchers recommend a default base-level access for computing that should cover most PIs’ needs and a custom grant process for access to additional compute beyond that base.
The whitepaper’s data access model is based on four recommendations. The first is that NRC focus on government data. Second, it recommends “a tiered access model: by default, researchers will gain access to government data that is already public; researchers can then apply through a streamlined process to gain access at higher security levels on a project-specific basis.”
One challenge to sharing data is the Privacy Act of 1974, which requires that there not be a central repository of government data, putting it somewhat at odds with an NRC. The researchers took that into consideration.
“There’s no question here about removing or eliminating the Privacy Act,” Jennifer King, privacy and data policy fellow at the Stanford HAI, said during a webinar announcing the blueprint. Instead, the researchers came up with ways the two could exist compatibly. For instance, the act allows for an exemption for statistical research, which makes up the bulk of AI study, and agencies are expected to protect data with privacy treatments beyond anonymization, including differential privacy, homomorphic encryption or synthetic datasets.
In terms of where to locate NRC, the whitepaper recommends that it be a Federally Funded Research and Development Center (FFRDC) to start, moving to a public/private partnership in the long run.
Overall, the paper identified three primary themes. They are complementarity between compute and data, rebalancing AI research toward long-term non-commercial research and coordinating short- and long-term approaches to creating the NRC.
The idea for an NRC came from Stanford HAI’s founders, who helped usher it into legislation with the National AI Research Resource Task Force Act, part of the National Defense Authorization Act. The act created a task force, set up in June, to study and plan for the implementation of a “National Artificial Intelligence Research Resource” (NAIRR), also known as NRC, the paper states.
The need for an NRC stems from an imbalance between commercial and noncommercial AI research that threatens to “undermine the historical innovation ecosystem where basic, fundamental and noncommercial research have laid the foundations for applications that may be decades away, not yet marketable or promote the public interest,” the blueprint states.
“We need to understand there are inherent differences between academic research and commercialized research,” Russell Wald, director of policy for the Stanford HAI, said during the webinar. “With longer time horizons and no profit constraints, basic scientific research has given way to breakthroughs such as GPS, the internet and CRISPR [DNA sequences]. Examples like this have led to the eventual commercialization of these discoveries and greater downstream benefits to society.”
For example, after Landsat imagery went from around $600 per file to free to the public in 2008, it generated productivity savings resulting in annual economic benefits of $3-4 billion, Wald said.
He added that 82% of the algorithms in use today came from federally funded academic and nonprofit efforts, but that share is shrinking as the “innovation ecosystem” is threatened by the high cost of computing power, limited access to the raw data used to train AI models and the “brain drain” of AI researchers from universities.
“NRC will generate distinct positive externalities by integrating computation and data, the two bottlenecks for high-quality AI research,” according to the report. “Specifically, NRC will provide affordable access to high-end computing resources, large-scale government datasets in a secure cloud environment, and the expertise to take advantage of this resource through a close partnership between universities, government and industry. By expanding access to these essential resources in AI research, NRC will support basic scientific research in AI, the democratization of AI innovation, and the promotion of US leadership in AI.”
Stephanie Kanowitz is a freelance writer based in Northern Virginia.