
Healthcare artificial intelligence (AI) is a promising area of AI research and development. At the same time, it raises critical questions about transparency, auditability, privacy, cybersecurity, inequity, national security, and more. Health data from around the world, at both the individual and population levels, are necessary for curing disease, innovating treatment, advancing research, and assessing and responding to public health threats, among many other functions. Collecting, analyzing, and sharing health data can and does deliver tangible benefits for society. Doing so also necessarily creates risks, such as privacy risks to individuals and populations and cybersecurity risks to organizations.
The US healthcare sector is also global, underpinned by a complex international supply chain of vendors and partners, and many healthcare research programs, clinical trials, and other initiatives depend upon access to patient data from a wide range of populations and countries. Combined, these factors make cross-border data flow policies, alongside related data policy considerations, critical to the future of US health, biopharma, and life sciences AI development.
This issue brief proceeds in three parts, looking across the roots and drivers of US cross-border data flow policy, the state of health data in the US AI sector, and future policy directions. It follows the first issue brief in this series, focused on similar issues in the European Union,1Mark Scott, Navigating the European Union’s AI and Health Data Framework, Atlantic Council, April 10, 2026, https://www.atlanticcouncil.org/in-depth-research-reports/issue-brief/navigating-the-european-unions-ai-and-health-data-framework/. and comes before the next in the series, focused on similar issues in China. Key points of this brief include:
Historically, the drivers of American cross-border data flow policy have been commercial interests, consumer protection concerns, and criticisms from the European Union—in particular, the perceived risks associated with transfers of EU citizens’ data to the United States. In recent years, while those trends persist, US national security concerns about foreign adversaries’ access to US personal data, as well as US debates over AI competitiveness, have become much more significant factors in US cross-border data flow policy. This section provides a high-level overview of the roots and drivers of US cross-border data policy, identifying major policies and themes. As discussed later, these shifts could have implications for all the data components in the AI supply chain: training data, testing data, models (themselves), model architectures, model weights, application programming interfaces (APIs), and software development kits (SDKs).2Justin Sherman, Securing Data in the AI Supply Chain, Atlantic Council, September 5, 2025, https://www.atlanticcouncil.org/in-depth-research-reports/issue-brief/securing-data-in-the-ai-supply-chain/. See also: Kemba Walden and Devin Lynch, The AI Tech Stack: A Primer for Tech and Cyber Policy, Paladin Global Institute, June 2025, https://www.paladincapgroup.com/wp-content/uploads/2025/06/AI-Tech-Stack-Report.pdf.
For many years, the United States shied away from implementing substantial cross-border data flow restrictions. Many Americans and US businesses have considered this the status quo. Largely, this position has stemmed from US policy—and a related, somewhat idealistic view3Justin Sherman and Robert Morgus, The Idealized Internet vs. Internet Realities (Version 1.0): Analytical Framework for Assessing the Freedom, Openness, Interoperability, Security, and Resiliency of the Global Internet, New America, July 26, 2018, https://www.newamerica.org/insights/idealized-internet-vs-internet-realities/; Evgeny Morozov, The Net Delusion: The Dark Side of Internet Freedom (PublicAffairs, 2011).—that the internet must remain “free” and “open,” with minimal state regulation imposed on traffic or infrastructure.4See, for example: Paula Dobriansky, “Global Internet Freedom Task Force Presentation,” U.S. Department of State, December 20, 2006, https://2001-2009.state.gov/g/rls/rm/78142.htm. Cross-border data flow limits imposed from Washington would, in this view, undercut the so-called internet freedom agenda. Such limits would also harm the fast-growing US cloud computing sector—led by the so-called hyperscalers: Amazon (AWS), Google (Google Cloud), and Microsoft (Azure)—and the many sectors, from finance to health to logistics, moving systems onto the cloud to enable access to systems and data globally without local infrastructure.
As the European Union implemented measures over the last three-plus decades that could hinder US-EU cross-border data flows, driven at least initially by concerns about individual privacy rights,5Note: The European Union calls this “data protection.” the US government undertook decisive efforts to ensure those transfers could legally continue. For example, in 1995, the European Union implemented a directive on the processing of personal data and its free movement that, among other provisions, prohibited transfers of EU personal data to countries without “adequate” levels of protection.6European Parliament and Council, Directive 95/46/EC of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data, Official Journal of the European Communities L 281 (October 24, 1995), 31–50, https://eur-lex.europa.eu/eli/dir/1995/46/oj/eng. Adequacy evaluations considered, among other factors, the nature of the data transferred, the purpose and duration of the proposed processing operations, the country of origin and the final destination, and the final destination’s rule of law and other security measures. This led to a lengthy and complicated negotiation process, at the end of which the US Department of Commerce issued the Safe Harbor Privacy Principles in July 2000 and sent them to the European Commission to receive an adequacy determination.7Issuance of Safe Harbor Principles and Transmission to European Commission, 65 FR 45666 (July 24, 2000), https://www.federalregister.gov/documents/2000/07/24/00-18489/issuance-of-safe-harbor-principles-and-transmission-to-european-commission. The Commission then ruled that data transfers to the United States under the Safe Harbor Principles were “adequate,” and thus permitted.8European Commission, Commission Decision 2000/520/EC of 26 July 2000 pursuant to Directive 95/46/EC of the European Parliament and of the Council on the adequacy of the protection provided by the safe harbour privacy principles and related frequently asked questions issued by the US Department of Commerce, Official Journal of the European Communities L 215 (July 26, 2000), 7–47, https://eur-lex.europa.eu/eli/dec/2000/520/oj/eng. In the decade or so afterwards, cross-border data flows operated under these principles—and the United States and the European Union were able to continue reaching agreement on other aspects of cross-border data flow policy, such as promoting to other countries the importance of the free flow of information across borders and the cross-border supply of information technology and communications services.9See, for example: Office of the US Trade Representative, “United States-European Union Trade Principles for Information and Communication Technology Services,” April 4, 2011, https://ustr.gov/callout/united-states-european-union-trade-principles-information-and-communication-technology-serv-0.
In tandem, the United States maintained its position favoring a relatively open internet without cross-border data flow restrictions, while criticizing countries that moved in the other direction. The 2008 National Trade Estimate called attention to regulations of international data flows and restrictions on the use of non-US data as services barriers for American companies.10Office of the US Trade Representative, 2008 National Trade Estimate Report on Foreign Trade Barriers (March 2008), 1–7, https://ustr.gov/about-us/policy-offices/press-office/reports-and-publications/archives/2008/2008-national-trade-estimate-report-fo-0. In 2013, the National Trade Estimate specifically called out China’s restrictions on cross-border data flows and rules for data sovereignty as concerning risks for non-Chinese companies.11Office of the US Trade Representative, 2013 National Trade Estimate Report on Foreign Trade Barriers (March 2013), 100, https://www.ustr.gov/sites/default/files/2013%20NTE.pdf. The list goes on.
The biggest disruption to US-EU cross-border data flows occurred in 2013. When Edward Snowden leaked information about classified US government intelligence activities,12See, for example: House Permanent Select Committee on Intelligence, Review of the Unauthorized Disclosures of Former National Security Agency Contractor Edward Snowden (H. Rept. 114-891, 114th Congress), December 2016, https://www.congress.gov/committee-report/114th-congress/house-report/891/1?outputFormat=pdf. the political and popular backlash from the European Union (among other places in the world) was immense.13See, for example: Nick Bryant, “The Snowden Effect on US Diplomacy,” BBC, October 24, 2013, https://www.bbc.com/news/world-us-canada-24664045. For example, one European Parliament-commissioned report stated that there was an “absence of any cognizable privacy rights for ‘non-US persons’ under FISA,” or the United States’ Foreign Intelligence Surveillance Act.14Caspar Bowden, “The US Surveillance Programmes and Their Impact on EU Citizens’ Fundamental Rights,” PE 474.405 (European Parliament: Directorate-General for Internal Policies, 2013), https://www.europarl.europa.eu/RegData/etudes/note/join/2013/474405/IPOL-LIBE_NT(2013)474405_EN.pdf. These leaks prompted researcher Maximilian Schrems to challenge the validity of Facebook transferring EU citizen data to the United States under the Safe Harbor Principles, citing the leaked information as the basis to contest the past EU “adequacy” ruling.15See, for example, a summary at: “Data Protection Commissioner vs. Facebook (Schrems II),” Columbia University, accessed April 7, 2026, https://globalfreedomofexpression.columbia.edu/cases/data-protection-commissioner-v-facebook-schrems-ii/. This prompted several iterations of European courts overturning US-EU cross-border data flow adequacy determinations and the United States and European Union renegotiating agreements, including the 2016 Privacy Shield Framework (after the Schrems I decision invalidated Safe Harbor) and the EU-US Data Privacy Framework, or “Privacy Shield 2.0” (after the Schrems II decision invalidated Privacy Shield).16See, for example: Chris D. Linebaugh and Edward C. Liu, “EU Data Transfer Requirements and US Intelligence Laws: Understanding Schrems II and Its Impact on the US-EU Privacy Shield,” R46724 (Congressional Research Service, March 17, 2021), https://www.congress.gov/crs-product/R46724; International Trade Administration, “Data Privacy Framework Program,” U.S. Department of Commerce, accessed April 7, 2026, https://www.dataprivacyframework.gov/Program-Overview. Regardless of one’s view of these court rulings, it is clear that they have caused significant uncertainty for EU and US data-transferring organizations—and some of this uncertainty lingers, as legal challenges to US-EU adequacy determinations can still develop. In September 2025, for example, the European General Court dismissed a challenge to the Framework brought by a member of the French Parliament.17Joe Duball, ed., “European General Court Dismisses Latombe Challenge, Upholds EU-US Data Privacy Framework,” IAPP, September 3, 2025, https://iapp.org/news/a/european-general-court-dismisses-latombe-challenge-upholds-eu-us-data-privacy-framework. However, that ruling is currently on appeal to the European Court of Justice, with a decision possible next year.
Other notable US policy developments include processes for law enforcement access to data stored abroad and multilateral engagements to create interoperable data transfer policies. In the former case, the United States passed the Clarifying Lawful Overseas Use of Data (CLOUD) Act in 2018, which has two core components: (1) the act allows the US government to establish executive agreements with other countries to ensure, under certain rule of law criteria, that those countries’ law enforcement agencies can obtain expedited, direct access to US company-held data for investigations pursuant to lawful process, bypassing the often onerous requests under a mutual legal assistance treaty (MLAT); and (2) it clarifies that US law can require US companies to produce data they hold or control, whether or not the infrastructure used to store the data is physically within the United States.18US Department of Justice, “Promoting Public Safety, Privacy, and the Rule of Law Around the World: The Purpose and Impact of the CLOUD Act,” April 2019, 3, https://www.justice.gov/archives/opa/press-release/file/1153446/dl?inline=. Requirements for a bilateral executive agreement included that a country’s legal system must have “robust substantive and procedural protections for privacy and civil liberties” vis-à-vis law enforcement data collection, which scholars have noted resemble requirements in many countries’ laws, such as the EU bloc’s General Data Protection Regulation (GDPR) provisions on data minimization, transparency, and accountability.19Peter Swire and Jennifer Daskal, “Frequently Asked Questions about the US Cloud Act,” Cross Border Data Forum (CBDF), accessed April 7, 2026, https://www.crossborderdataforum.org/cloudactfaqs/. In the latter case, the Commerce Department, along with Canada, Japan, South Korea, the Philippines, Singapore, and Chinese Taipei, established the Global Cross-Border Privacy Rules Forum in 2022 to promote values-aligned certifications for companies to carry out compliant cross-border data transfers.20US Department of Commerce, “Statement by Commerce Secretary Raimondo on Establishment of the Global Cross-Border Privacy Rules (CBPR) Forum,” news release, April 21, 2022, https://www.commerce.gov/news/press-releases/2022/04/statement-commerce-secretary-raimondo-establishment-global-cross-border.
Since 2024, two major federal measures have placed restrictions on cross-border data flows—including flows within the health sector and of health data that could train and test AI systems—for national security purposes. The first is a Biden administration executive order (EO) whose implementing regulations remain operative. The second is a congressional law that expanded the Federal Trade Commission’s (FTC) authorities vis-à-vis data and national security.
Drawing on authorities under the International Emergency Economic Powers Act (IEEPA), President Joseph Biden signed EO 14117, titled “Preventing Access to Americans’ Bulk Sensitive Personal Data and United States Government-Related Data by Countries of Concern,” in February 2024.21Exec. Order No. 14117, 89 Fed. Reg. 15421 (February 28, 2024), https://www.federalregister.gov/d/2024-04573. It stated that US foreign adversaries (termed “countries of concern”) can access bulk US data and use advanced technologies, including AI systems, to analyze and manipulate that data to advance national security threats, such as espionage or cyber operations. EO 14117 directed the US Department of Justice (DOJ) to establish regulations that specify limits on commercial transactions involving Americans’ bulk sensitive personal data to protect US national security interests. The result is the DOJ’s Data Security Program—what companies and others variably call the “bulk data program” or the “data broker and national security program”—finalized in January 2025 and fully implemented in April 2025.2290 Fed. Reg. 1636 (January 8, 2025), https://www.federalregister.gov/documents/2025/01/08/2024-31486/preventing-access-to-us-sensitive-personal-data-and-government-related-data-by-countries-of-concern.
Part of the final rule focuses on data brokerage: the sale of data, licensing of access to data, or similar commercial transactions in which one entity transfers data to another that did not already have it. The rule restricts the brokerage of two categories of data from US companies to countries of concern: bulk sensitive personal data, which can relate to any US individual and is restricted once data-type-specific volume thresholds are met, and US government-related data explicitly tied to individuals, such as current or former military or intelligence personnel, which is restricted regardless of the amount of data involved (i.e., no threshold). The rule defines the countries of concern as China, Russia, Iran, North Korea, Cuba, and Venezuela.23For an accessible summary of the rule, see: National Security Division, “Data Security Program: Frequently Asked Questions,” U.S. Department of Justice, April 11, 2025, https://www.justice.gov/opa/media/1396351/dl.
For health-related data, the rule prohibits data brokerage if the data transferred by any entity over the preceding 12 months, in one transaction or multiple, exceeded data-type-specific thresholds. For the health-related categories, the rule’s bulk thresholds are:
- human genomic data on more than 100 US persons;
- human epigenomic, proteomic, or transcriptomic data on more than 1,000 US persons;
- biometric identifiers on more than 1,000 US persons; and
- personal health data on more than 10,000 US persons.
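To make the mechanics of these thresholds concrete, below is a minimal sketch in Python of how an organization might tally covered transfers against the rule’s 12-month, cross-transaction aggregation. The record format, field names, and logic are the author’s illustrative assumptions, not a compliance tool, and the sketch ignores important details such as de-duplicating the US persons covered across transactions.

```python
from datetime import date, timedelta

# Bulk thresholds for the health-related categories discussed above,
# counted over the preceding 12 months across one or more transactions.
# Illustrative simplification only; not legal or compliance guidance.
BULK_THRESHOLDS = {
    "human_genomic": 100,          # US persons
    "human_other_omic": 1_000,     # epigenomic, proteomic, transcriptomic
    "biometric_identifiers": 1_000,
    "personal_health": 10_000,
}

def exceeds_bulk_threshold(transfers, category, as_of):
    """Sum US persons covered by transfers of `category` in the trailing
    12 months and compare against that category's bulk threshold.

    `transfers` is a list of (date, category, us_persons_covered) tuples,
    a hypothetical record format used here for illustration only."""
    window_start = as_of - timedelta(days=365)
    total = sum(
        n for (d, cat, n) in transfers
        if cat == category and window_start <= d <= as_of
    )
    return total > BULK_THRESHOLDS[category]

# Example: two transactions that individually sit below the genomic
# threshold but together exceed it within the 12-month window.
history = [
    (date(2025, 3, 1), "human_genomic", 60),
    (date(2025, 9, 1), "human_genomic", 55),
]
print(exceeds_bulk_threshold(history, "human_genomic", date(2025, 12, 1)))  # True
```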
Importantly, the rule also introduces requirements for vendor, employment, and investment agreements vis-à-vis genomic data and personal health data (among others). It prohibits transactions that provide a country of concern or covered person with access to bulk ‘omic data (i.e., genomic, epigenomic, proteomic, or transcriptomic data as defined above),24Note: This excludes pathogen-specific data embedded in human ‘omic datasets. or to human biospecimens from which an entity could derive bulk human genomic data. Moreover, it prohibits transactions of bulk US personal health data that happen through vendor, employment, and investment agreements unless the US individual or entity carrying out the transaction complies with specified security requirements. The Cybersecurity and Infrastructure Security Agency (CISA) developed these requirements, which span organization-, system-, and data-level protections (including data minimization and masking, encryption, or privacy-enhancing techniques). They are designed to ensure that the covered person and country of concern cannot access regulated data through this type of covered data transaction.
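For a sense of what such data-level controls might look like in practice, the sketch below applies two of the techniques the requirements name, data minimization and masking, to a single hypothetical record. The field names, record layout, and key handling are assumptions for illustration; actual compliance depends on the full set of organization-, system-, and data-level requirements, not any single transformation.

```python
import hashlib
import hmac

# Hypothetical patient record; all field names and values are illustrative.
record = {
    "patient_id": "P-48291",
    "name": "Jane Doe",
    "zip_code": "27514",
    "diagnosis_code": "E11.9",
    "lab_result": 6.8,
}

DROP_FIELDS = {"name"}                       # minimization: remove outright
MASK_FIELDS = {"patient_id", "zip_code"}     # masking: replace with a keyed hash
SECRET_KEY = b"example-key-manage-securely"  # placeholder; real key management required

def minimize_and_mask(rec):
    """Drop direct identifiers and mask quasi-identifiers with an HMAC,
    one privacy-enhancing technique among several such controls permit."""
    out = {}
    for field, value in rec.items():
        if field in DROP_FIELDS:
            continue  # removed entirely before any covered transaction
        if field in MASK_FIELDS:
            digest = hmac.new(SECRET_KEY, str(value).encode(), hashlib.sha256)
            out[field] = digest.hexdigest()[:12]  # truncated token replaces the value
        else:
            out[field] = value
    return out

print(minimize_and_mask(record))
```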
The DOJ’s Data Security Program took effect in April 2025, but the department delayed enforcement for 90 days, until July 8, 2025, and delayed certain affirmative due diligence obligations until October 6, 2025.25US Department of Justice, “Justice Department Implements Critical National Security Program to Protect Americans’ Sensitive Data from Foreign Adversaries,” news release, April 11, 2025, https://www.justice.gov/opa/pr/justice-department-implements-critical-national-security-program-protect-americans-sensitive. While the team responsible for enforcing the program has lost most of its staff, the statute of limitations for any civil or criminal violation is 10 years.
The second recent national security measure for cross-border data flows is the Protecting Americans’ Data from Foreign Adversaries Act (PADFAA),2615 U.S.C. ch. 123 (Supp. II 2024), https://www.law.cornell.edu/uscode/text/15/chapter-123. which Congress passed in April 2024 along with the Protecting Americans from Foreign Adversary Controlled Applications Act (PAFACA),27Emergency Supplemental Appropriations Act, 2024, Pub. L. 118-50, 138 Stat. 934, https://www.congress.gov/118/plaws/publ50/PLAW-118publ50.pdf. or the TikTok divest-or-ban law. PADFAA made it unlawful for a “data broker” anywhere in the United States to sell, license, rent, trade, transfer, release, disclose, provide access to, or otherwise make available the “personally identifiable, sensitive data” of a US individual to any “foreign adversary country” or any entity controlled by one.
The law’s definitions of key terms determine its scope. Unlike the DOJ program, PADFAA scoped “data broker” to mean any entity that, for valuable consideration, sells, licenses, rents, trades, transfers, releases, discloses, provides access to, or otherwise makes available US individuals’ data that the broker did not collect directly from those individuals. In other words, PADFAA defines a data broker as a third party.28Note: The DOJ’s Data Security Program’s definition includes both first- and third-party data sellers. The statute’s definition of foreign adversaries comes from a list of countries in an existing (but separate) federal statute, meaning PADFAA’s restrictions only apply to covered transactions with North Korea, China, Russia, and Iran (a narrower list than the DOJ program’s six countries of concern).2910 U.S.C. § 4872(2) (2022), https://www.law.cornell.edu/uscode/text/10/4872. Importantly, PADFAA also defined personally identifiable, sensitive data as any data that identifies, or is linked or reasonably linkable to, an individual or an individual’s device, alone or in combination with other data, across multiple categories, such as government-issued identifiers, health conditions and treatments, device log-ins, sexual behavior, data on any individual under the age of 17, identifying online activity, and precise geolocation data. The law’s limitations on health data sales apply to “any information that describes or reveals the past, present, or future physical health, mental health, disability, diagnosis, or healthcare condition or treatment of an individual.”
Hence, in some ways, the DOJ program is broader: it encompasses first-party data brokers (i.e., those that collect data directly from individuals), as well as a category of low-risk transfers, and covers two additional countries (Cuba and Venezuela). In other ways, however, PADFAA is broader: it governs all data flows within its remit equally (without any data-type thresholds) and covers broader categories of personal data, such as an individual’s private communications (e.g., voicemails, emails, texts, direct messages). It is possible that PADFAA could affect a large corporation in the health, biopharma, life sciences, or related sectors, if that corporation were to engage in covered sales of health data it did not directly collect from consumers. Yet the DOJ Data Security Program affects these sectors much more, because it encompasses first-party collectors transferring data to the covered countries. For example, this means that many pharmaceutical companies that transfer above-threshold volumes of US citizen health data to organizations in China for health research face obligations under the regulations, subject to relevant exemptions, such as for “drug, biological product, and medical device authorizations” and “other clinical investigations and post-marketing surveillance data.”30 90 Fed. Reg. 1636 § 202.510 and 202.511 (January 8, 2025). This is an important point. Companies may or may not internally make distinctions, such as in the product development or scientific research lifecycle, between some of these ways of transacting in data (e.g., is it sent via an employment agreement or another kind of transfer?) and the related purpose (e.g., is it for other clinical investigations?), but the DOJ program certainly does.
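The scope differences between the two regimes can be hard to keep straight. The toy function below encodes only the high-level distinctions discussed above (first- versus third-party sellers, the two country lists, and PADFAA’s lack of volume thresholds); it deliberately omits the exemptions, transaction types, and definitional nuances that determine actual coverage, so it is an explanatory aid rather than a scoping tool.

```python
# Grossly simplified scoping logic for the two regimes discussed above.
# Real applicability turns on exemptions (e.g., clinical-trial carve-outs),
# transaction type, and detailed definitions not modeled here.

PADFAA_ADVERSARIES = {"China", "Russia", "Iran", "North Korea"}
DOJ_COUNTRIES_OF_CONCERN = PADFAA_ADVERSARIES | {"Cuba", "Venezuela"}

def regimes_implicated(destination, first_party, exceeds_doj_threshold):
    """Return which programs a US health-data transfer may implicate.

    destination: country receiving the data
    first_party: True if the seller collected the data directly from individuals
    exceeds_doj_threshold: True if the DOJ bulk threshold for the data type is met
    """
    hits = []
    # PADFAA: third-party "data brokers" only, four adversary countries,
    # and no volume thresholds.
    if destination in PADFAA_ADVERSARIES and not first_party:
        hits.append("PADFAA")
    # DOJ program: first- and third-party sellers, six countries of concern,
    # volume thresholds (government-related data, with no threshold, not modeled).
    if destination in DOJ_COUNTRIES_OF_CONCERN and exceeds_doj_threshold:
        hits.append("DOJ Data Security Program")
    return hits

# A pharma company (first-party collector) sending above-threshold health data to China:
print(regimes_implicated("China", first_party=True, exceeds_doj_threshold=True))
# ['DOJ Data Security Program']
```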
Nonetheless, these programs do not apply to any one specific part of the AI supply chain per se. If a data component in the AI supply chain—be it training data, testing data, or something else31Note: It is less likely in this case. The other data components in the AI supply chain are models (themselves), model architectures, model weights, APIs, and SDKs. But APIs and SDKs are certainly two categories of technological mechanisms to effectuate federally regulated cross-border data transfers.—fits under the definitions of the DOJ program or PADFAA, the restrictions enter into effect. This means the programs can affect large training or testing datasets for various AI models and the extent to which model developers or maintainers, for example, can or cannot transfer AI models to countries and entities of concern (and if so, under what protections).
AI innovation has heavily driven the recent discourse concerning data flows in Washington, DC—in some cases through the conceptualization of AI research and development as an “arms race” with the Chinese government (or, sometimes, framed as a technological and economic race with China writ large).32Note: The lines in China between the public sector and the private sector are fundamentally blurrier (or in some cases, meaningfully nonexistent) than they are in the United States—this is not to suggest otherwise. It is merely to point out the differing ways in which members of Congress, congressional staff, executive branch policymakers, think tank analysts, and so on have discussed the “AI arms race” concept of late. These debates have not yet resulted in significant legal or policy changes for cross-border data flows. For now, discourse about an “AI arms race” and innovation-regulation dynamics has mostly focused on executive branch actions aimed at preventing US states from charting their own paths in governing various AI technologies33Exec. Order No. 14365, 90 Fed. Reg. 58499 (December 11, 2025), https://www.federalregister.gov/documents/2025/12/16/2025-23092/ensuring-a-national-policy-framework-for-artificial-intelligence. (even though a wide range of states are pushing forward with AI regulations anyway).34Cecilia Kang, “States Plow Ahead with A.I. Regulation, Defying Trump,” New York Times, March 30, 2026, https://www.nytimes.com/2026/03/30/technology/trump-states-ai-gavin-newsom-california.html.
However, it is inevitable that the AI innovation discourse will affect cross-border data flows in the future. In Chatham House Rule discussions in which the author has participated, for instance, some analysts and companies have suggested that enabling certain health innovations in AI (beyond large language models, or LLMs) will require maintaining some degree of cross-border flows of US health data. Conversely, other such discussions have underscored the national security interest in further restrictions on US health, genetic, and biometric data to protect genuine US security interests. The bipartisan, bicameral National Security Commission on Emerging Biotechnology, to give one example, recommended in its 2025 report that Congress conduct oversight of existing policies and add new authorities as warranted to ensure that China cannot obtain bulk and sensitive biological data from the United States.35National Security Commission on Emerging Biotechnology (NSCEB), “4.2 Block China from Obtaining Sensitive U.S. Biological Data,” in Charting the Future of Biotechnology (Final Report), United States Senate, April 2025, https://www.biotech.senate.gov/final-report/chapters/chapter-4/section-2/. Proposals to further restrict cross-border health, genetic, and other data flows—not in spite of but due to AI competition with China—are likely to be salient in the coming years.
The discussion around the role of health data and technology within the US AI landscape is at a critical inflection point. When the American Hospital Association surveyed thousands of hospitals around the country in 2023, 43.9 percent of hospitals in metro counties reported using some type of AI in their operations, such as for automating tasks, optimizing administrative and clinical work, and predicting patient demand.36Nicole Summers-Gabr, “The Use of AI in the Health Care Workplace: The U.S. Experience,” Federal Reserve Bank of St. Louis, July 15, 2025, https://www.stlouisfed.org/on-the-economy/2025/jul/use-ai-health-care-workplace-us-experience. In 2025, by one estimate, US investors put 46 percent of their healthcare sector investments into healthcare AI companies.37Silicon Valley Bank, “AI Investment Accounted for Nearly Half of Healthcare Investment in 2025; Silicon Valley Bank Releases 17th Healthcare Investments and Exits Report,” PR Newswire, January 8, 2026, https://www.prnewswire.com/news-releases/ai-investment-accounted-for-nearly-half-of-healthcare-investment-in-2025-silicon-valley-bank-releases-17th-healthcare-investments-and-exits-report-302656179.html. In 2026, 75 percent of US health systems queried for a survey reported using at least one AI application, up from 59 percent the year prior.38Cailey Gleeson, “Health System AI Adoption Surges in 2026 with Execs Reporting Increased ROI: Survey,” Fierce Healthcare, March 24, 2026, https://www.fiercehealthcare.com/ai-and-machine-learning/75-us-healthcare-systems-use-plan-use-ai-platform-2026. Survey results vary, but it is clear that many US health organizations are increasingly leveraging AI models in their operations.
Data itself is another critical part of the picture. The United States has many organizations providing a range of health datasets that entities could use to help train and test AI models, from LLMs to image recognition systems. Such datasets cover two critical data components of the AI supply chain: training data and testing data.
Federal agencies publish datasets for COVID-19 tracking, de-identified patient data for cancer surveillance, medical imagery (e.g., CTs, or computed tomography scans, and MRIs, or magnetic resonance imaging) related to clinical subjects, environmental health data, multimodal genomic and electronic health records data, and much more.39See, for example: “SEER Incidence Data, 1975–2023,” National Cancer Institute, accessed April 7, 2026, https://seer.cancer.gov/data/; “Cancer Imaging Archive,” National Cancer Institute, accessed April 7, 2026, https://www.cancerimagingarchive.net; “CDC Wonder,” Centers for Disease Control and Prevention, accessed April 7, 2026, https://wonder.cdc.gov; “The Home of HHS Open Data,” U.S. Department of Health and Human Services, accessed April 7, 2026, https://healthdata.gov. MIT, Harvard Medical School, and Beth Israel Deaconess Medical Center researchers maintain a database of hundreds of thousands of emergency department and intensive care unit (ICU) patients’ deidentified records.40“MIMIC-IV,” PhysioNet, October 11, 2024, https://physionet.org/content/mimiciv/3.1/. Note: The author, while not having personally evaluated the privacy standards of this particular dataset, observes that the most common use of “deidentified” in this context is to reference a regulatory standard that removes specific types of identifiers from the data, yet this does not mean the data is impossible to link back to specific individuals. Stanford publishes datasets of radiology reports and chest X-rays, abdominal CT scans, whole brain MRI studies, and many other kinds of health-related image training datasets.41Center for Artificial Intelligence in Medicine and Imaging, “Shared Datasets,” Stanford University, accessed April 7, 2026, https://aimi.stanford.edu/shared-datasets. Nonprofits and companies publish health data usable for AI training and testing, too: Google maintains an online “data commons” that includes health information from a wide range of sources;42“Health,” Data Commons, accessed April 7, 2026, https://www.datacommons.org/explore/health. the Radiological Society of North America runs “AI challenges” that invite researchers to develop high-performing machine learning (ML) models for specific health tasks;43“AI Challenges,” Radiological Society of North America, accessed April 7, 2026, https://www.rsna.org/artificial-intelligence/ai-image-challenge. MITRE even developed an open-source, synthetic patient generator that models the medical history of synthetic, realistic patients to create data.44“Synthetic Patient Generation,” Synthea, accessed April 7, 2026, https://synthetichealth.github.io/synthea/.
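The caveat in the note above about “deidentified” data is worth making concrete. The toy example below joins a deidentified health record to a hypothetical outside dataset on shared quasi-identifiers (ZIP code, birth date, sex), the classic mechanism by which records stripped of direct identifiers can still be linked back to individuals. All records here are fabricated for illustration.

```python
# Toy re-identification by linkage: a "deidentified" health record shares
# quasi-identifiers with an outside dataset that carries names.
deidentified = [
    {"zip": "27514", "birth_date": "1984-02-11", "sex": "F", "diagnosis": "E11.9"},
]
voter_roll = [  # hypothetical public record with direct identifiers
    {"zip": "27514", "birth_date": "1984-02-11", "sex": "F", "name": "Jane Doe"},
    {"zip": "27514", "birth_date": "1991-07-30", "sex": "M", "name": "John Roe"},
]

QUASI_IDENTIFIERS = ("zip", "birth_date", "sex")

def link(records_a, records_b):
    """Match records whose quasi-identifier tuples coincide."""
    index = {tuple(r[q] for q in QUASI_IDENTIFIERS): r for r in records_b}
    matches = []
    for a in records_a:
        key = tuple(a[q] for q in QUASI_IDENTIFIERS)
        if key in index:
            matches.append((a, index[key]))
    return matches

for health_rec, identified in link(deidentified, voter_roll):
    print(identified["name"], "->", health_rec["diagnosis"])  # Jane Doe -> E11.9
```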
To whom they provide this data varies. Sometimes, these organizations provide such data directly to other entities, such as other companies or researchers. Sometimes, these organizations publicly post such data for anyone on the internet to download (with varying permissible uses). As with many datasets, an individual’s or organization’s access to funding (including if they need to purchase licenses for data), computing capabilities, data storage, subject matter expertise, and other informational and infrastructural resources will impact which groups can effectively leverage these datasets for health and other purposes. The widespread availability of free-to-use or low-cost AI models, including LLMs, arguably lowers many of these barriers to entry.
At the same time, many large corporations in the healthcare, biopharma, and life sciences industries have their own vast proprietary datasets, such as clinical trial data or internally developed survey data, that are not available to the public and that they can use to train AI systems. One former chief data officer at three of the largest US healthcare companies recently commented, in this vein, that smaller language models trained on proprietary datasets held by enterprises will deliver much more value in the future.45Bill Siwicki, “LLMs Have Their Uses, but Healthcare Needs ‘Small Language Models’ Too, Expert Says,” Healthcare IT News, November 6, 2025, https://www.healthcareitnews.com/news/llms-have-their-uses-healthcare-needs-small-language-models-too-expert-says. This may be especially true in some highly sensitive data categories, such as genomic data. The greater demands for privacy and cybersecurity protections (including strict access controls—in other words, limiting the pool of people who can access the data) mean companies, universities, and even government agencies looking to develop medical and health AI models may rely on internal datasets more than anything they can procure from a public source.46Note: While some data brokers have shown an interest in genetic data, this sector of the data brokerage market is nascent. Health data for AI purposes in the United States, therefore, goes far beyond the text that an LLM chatbot vendor may scrape from across the internet.
For American companies seeking to use health data to train AI systems, perhaps the most significant distinction lies in whether a company is subject to the Health Insurance Portability and Accountability Act (HIPAA). HIPAA, passed in 1996, applies to four categories of entities: healthcare providers, health plans, healthcare clearinghouses, and their business associates, which include entities that provide financial, legal, accounting, or other services to a HIPAA-covered entity or that receive data from HIPAA-covered entities in conjunction with HIPAA-covered activities, such as data analysis firms or website design contractors.4745 C.F.R. pts. 160, 162, and 164 (as amended through March 26, 2013), https://www.hhs.gov/sites/default/files/ocr/privacy/hipaa/administrative/combined/hipaa-simplification-201303.pdf. All entities regulated by HIPAA must comply with HIPAA’s privacy and security rules when seeking to train an AI system on health data, such as by ensuring they have a specialized agreement (a “business associate agreement”) with an AI vendor before sharing any patient data.48For a discussion of these regulations in the AI context, see: Delaram Rezaeikhonakdar, “AI Chatbots and Challenges of HIPAA Compliance for AI Developers and Vendors,” Journal of Law, Medicine, and Ethics 51, no. 4 (Winter 2023): 988–95, https://pmc.ncbi.nlm.nih.gov/articles/PMC10937180/.
However, HIPAA does not apply to many other entities, such as social media companies, smart device manufacturers, advertising technology companies, data brokers, and many other data-collecting, -generating, and -analyzing companies. This means that many companies face no significant federal regulations when leveraging consumers’ health data to train AI applications. For example, a non-HIPAA-covered mental health app that wanted to use the customer data it collected to train a predictive health model would not face HIPAA restrictions in doing so.49Note: The app must still comply with other consumer protection laws, such as the FTC’s Health Breach Notification Rule. See: 16 C.F.R. pt. 318 (2024), https://www.ftc.gov/legal-library/browse/rules/health-breach-notification-rule. Compounding this gap is just how much health data companies collect on individuals outside the bounds of a hospital or clinic, including online purchases, smart device biometrics, and data indicating physical activity and movement levels. Such legal and regulatory gaps create significant privacy and cybersecurity risks for individuals’ health data. They have also catalyzed states’ interest in regulating health data with their own laws, such as Washington’s My Health My Data Act,50See, for example: Washington My Health My Data Act, Wash. Rev. Code § 19.373 (2023), https://app.leg.wa.gov/RCW/default.aspx?cite=19.373&full=true. to fill gaps left by decades-old federal sectoral laws. Nonetheless, many states still lack health-data-focused privacy laws addressing the ways in which the myriad types of data generated outside the scope of hospitals or clinics can be used for AI training purposes.
For example, OpenAI said in January 2026 that 40 million people globally use ChatGPT daily for “health information.”51Megan Morrone, “Exclusive: 40 million People Turn to ChatGPT for Health Care,” Axios, January 5, 2026, https://www.axios.com/2026/01/05/chatgpt-openai-health-insurance-aca. The company also says that about 230 million people ask the chatbot health and wellness questions each week.52Amanda Silberling, “OpenAI Unveils ChatGPT Health, Says 230 Million Users Ask about Health Each Week,” TechCrunch, January 7, 2026, https://techcrunch.com/2026/01/07/openai-unveils-chatgpt-health-says-230-million-users-ask-about-health-each-week/. Such usage is likely to continue driving legislative and regulatory concerns about AI applications’ uses of US health data outside the scope of HIPAA and other privacy and security requirements and best practices.
Beyond commercial chatbot usage for health questions, companies, universities, and government agencies are likely to deepen their work on more specialized AI health models (including those that are smaller or trained on proprietary data) in the coming years. Google’s DeepMind, through its Isomorphic Labs startup, has been expanding its AlphaFold protein folding model to predict the structures of proteins, DNA, and other biomolecules.53“AlphaFold Server,” Google DeepMind, accessed April 7, 2026, https://alphafoldserver.com/welcome; “The Isomorphic Labs Drug Design Engine Unlocks a New Frontier beyond AlphaFold,” Isomorphic Labs, February 10, 2026, https://www.isomorphiclabs.com/articles/the-isomorphic-labs-drug-design-engine-unlocks-a-new-frontier. Scientists at Lawrence Livermore National Laboratory, AMD, and Columbia University have developed a biological computing model, ElMerFold, run on the National Nuclear Security Administration-funded supercomputer El Capitan, to advance biosecurity efforts.54Jeremy Thomas, “LLNL and Partners Launch Record-Breaking Protein-Folding Workflow on World’s Fastest Supercomputer,” Lawrence Livermore National Laboratory, November 14, 2025, https://www.llnl.gov/article/53581/llnl-partners-launch-record-breaking-protein-folding-workflow-worlds-fastest-supercomputer. One company, Insitro, uses ML capabilities to analyze in vitro cellular data to identify therapeutic insights and interventions across diseases.55“Making Medicines Differently,” Insitro, accessed April 7, 2026, https://www.insitro.com. Another, Numerion Labs, uses an AI platform to analyze chemical structures to predict drug functions and perform other tasks.56“Numerion Labs: Platform,” Numerion Labs, accessed April 7, 2026, https://numerionlabs.ai/drug-hunting-platform/. These kinds of innovations are likely to become more important components of the health AI space in the coming years.
There are at least three likely continuing drivers of US cross-border data flow policy as it pertains to health data and health-related AI models broadly: EU-related adequacy disruptions; US government national security concerns about data transfers and touchpoints (particularly to and with China); and US government and industry narratives about an AI arms race.
Currently, the EU-US Data Privacy Framework remains in place. However, European courts could rule differently in future cases akin to the dismissed 2025 challenge. Many European policymakers’ reactions to the last year or so of technology developments, political and policy changes, and rule of law challenges in the United States—including calls to reduce dependence on American technology from European politicians,57See, for example: Mathieu Pollet, “Europeans Think Trump Can Shut Down Their Internet,” Politico Europe, March 19, 2026, https://www.politico.eu/article/europeans-donald-trump-internet-technology-us/. as well as from military and security agencies58Justin Sherman, “Europe’s Talk of Dropping U.S. Tech Is Growing. How It Would Happen,” Barron’s, January 28, 2026, https://www.barrons.com/articles/europes-digital-soverignty-over-greenland-trump-threats-55fc91a9—have generated significant debate about the nature of US-EU technological ties, including data flows and touchpoints. Nonetheless, it is not clear whether these conversations will directly translate into challenges to the Data Privacy Framework per se. The fates of any potential challenges would be, in a word, complicated. On the one hand, for challenges filed in the current environment, the odds of invalidation based entirely on the state of US law and politics are likely much greater than in 2024. Invalidation would have ramifications for a wide range of US sectors, including the cross-border flow of health data and for health and pharma organizations working on AI applications, such as those related to drug discovery, image recognition, or patient record analysis. On the other hand, the current Trump administration may well pursue means other than negotiation to secure adequacy. Other compliance questions for businesses under current frameworks, meanwhile, persist.59See, for example: “New Standard Contractual Clauses – Questions and Answers Overview,” European Commission, accessed April 22, 2026, https://commission.europa.eu/law/law-topic/data-protection/international-dimension-data-protection/new-standard-contractual-clauses-questions-and-answers-overview_en.
Alongside European concerns about data transfers to the United States, here at home, the US government will likely stay focused over the next decade on the national security risks associated with the transfer of certain data to—or certain data touchpoints with—entities in China and other foreign adversary countries. Building on PADFAA and the DOJ’s Data Security Program, legislators on both sides of the aisle in Congress remain interested in additional measures to bolster and expand the programs to further govern how US data can flow to China.60Note: Author conversations with congressional offices. Genetic, health, and biotech-related data are of particular concern for many policymakers, because of their identifiability (especially genetic data) and the military and intelligence contexts in which an adversary could leverage them.61See, for example: Prohibiting Foreign Access to American Genetic Information Act of 2024, S.3558, 118th Congress (2024), https://www.congress.gov/bill/118th-congress/senate-bill/3558/text; United States House of Representatives Select Committee on China, “Moolenaar, Krishnamoorthi, Dunn Recommend Strengthened Controls to Prohibit the PLA from Accessing U.S. Clinical Trial Data,” news release, January 10, 2025, https://chinaselectcommittee.house.gov/media/press-releases/moolenaar-krishnamoorthi-dunn-recommend-strengthened-controles-to-prohibit-the-pla-from-accessing-us-clinical-trial-data; Krysta Escobar, “‘Terrifying’: Why U.S. Senator in Top Intel Post Wants More Spying on Chinese Companies,” CNBC, December 6, 2025, https://www.cnbc.com/2025/12/06/china-us-technology-spying-senate-concerns.html. While the current administration has articulated and pursued a clear preference for regulatory rollbacks across industries, there remains strong underlying bipartisan congressional interest in national security regulations that affect health and biological data vis-à-vis China. There are also many consumer protection reasons for Congress to pass comprehensive privacy legislation that would encompass these data categories, distinct from national security per se. But the last few years have underscored that Congress is far more likely to pass piecemeal, data-focused national security laws than to make meaningful movement on a comprehensive federal privacy framework. Members keep introducing bills, but progress keeps running into major roadblocks, such as debates over a private right of action.
Notably, US corporate and government discourse about the idea of an “AI race” with China will affect US health data, cross-border data flows, and data in the health AI context going forward. Many forces exist simultaneously. Some companies sincerely believe that US firms must move faster than their counterparts in China to maintain long-term American technological advantage and strategic competitiveness. Plenty of other companies, particularly in Big Tech, have also wielded these arguments instrumentally to pursue their respective ends (e.g., killing privacy regulations).62Justin Sherman, “Don’t Be Fooled by Big Tech’s Anti-China Sideshow,” WIRED, July 30, 2020, https://www.wired.com/story/opinion-dont-be-fooled-by-big-techs-anti-china-sideshow/. There are policymakers, similarly, who genuinely focus their time on the ways in which the Chinese government leverages AI for national security purposes, ranging from surveillance,63See, for example: “China’s Algorithms of Repression,” Human Rights Watch, May 1, 2019, https://www.hrw.org/report/2019/05/01/chinas-algorithms-repression/reverse-engineering-xinjiang-police-mass; Dave Davies, “Facial Recognition and Beyond: Journalist Ventures Inside China’s ‘Surveillance State,’” NPR, January 5, 2021, https://www.npr.org/2021/01/05/953515627/facial-recognition-and-beyond-journalist-ventures-inside-chinas-surveillance-sta; Paul Mozur and Aaron Krolik, “A Surveillance Net Blankets China’s Cities, Giving Police Vast Powers,” New York Times, December 17, 2019, https://www.nytimes.com/2019/12/17/technology/china-surveillance.html. to drone swarms,64Timothy Ditter, “China Readies Drone Swarms for Future War,” Center for Naval Analyses (CNA), September 24, 2025, https://www.cna.org/our-media/indepth/2025/09/china-readies-drone-swarms-for-future-war. to, evidently, cyber operations.65“Disrupting the First Reported AI-Orchestrated Cyber Espionage Campaign,” Anthropic, November 13, 2025, https://www.anthropic.com/news/disrupting-AI-espionage. There are also policymakers whose arguments about an AI arms race hinge more on the bottom lines of American firms writ large, compared to Chinese competitors, than on a precisely articulated vision of what defines the supposed race.66Tristan Bove, “America Could ‘Lose the AI Race’ Because of Too Much ‘Pessimism,’ White House AI Czar David Sacks Says,” Fortune, January 22, 2026, https://fortune.com/2026/01/22/david-sacks-warns-america-could-lose-the-ai-race-because-of-pessimism/. All told—and without getting too deep into the notion of an “AI race” itself—it is clear that future administrations will contend with these competing views of AI, competition, and China in ways that could significantly relax or tighten the US regulatory apparatus for health data, AI models such as those for image recognition or genomic data analysis, and cross-border data flows.
Looking forward, the challenge for policymakers, as is often the case in policymaking, lies in doubling down on areas where important interests align while navigating balancing acts where they diverge. The challenge for health, biopharma, and related AI companies will include deepening their understanding of, appreciation for, and fluency in issues related to US national security, AI competition, and China. And the challenge across the board, including for non-US entities and those in civil society, will be to thoughtfully critique the overarching framings that guide US policy in this area, to ensure that measures are calibrated, evidence based, and grounded in clear, genuine intentions on the part of those framings’ promoters. This could include considering:
Justin Sherman is a nonresident senior fellow at the Atlantic Council’s Cyber Statecraft Initiative. He is also the founder and CEO of Global Cyber Strategies, a Washington, DC-based research and advisory firm, a contributing editor at Lawfare, and a columnist at Barron’s. He is the author of the book Navigating Technology and National Security.
The author would like to thank Ken Propp, Lee Licata, Jolynn Dellinger, Stacey Gray, and Trey Herr for their comments on earlier drafts of this report, various health and biopharma sector experts for background discussions, and Nitansha Bansal, Kenton Thibaut, and the rest of the project team for their support.

The Atlantic Council’s Cyber Statecraft Initiative, part of the Atlantic Council Technology Programs, works at the nexus of geopolitics and cybersecurity to craft strategies to help shape the conduct of statecraft and to better inform and secure users of technology.