June 23, 2023
Ian J. Stewart
Note: the following text was generated by ChatGPT based on Dr. Stewart’s slides and was subsequently lightly edited by a human. A video of the seminar is available from Dr. Stewart on request.
At a CNS seminar on June 16th, Dr. Ian Stewart, executive director of CNS Washington, D.C., gave an exploratory talk on the role of Large Language Models (LLMs) in nonproliferation research. The presentation focused on how these AI-powered tools could aid research workflows and data analysis.
Dr. Stewart opened by noting that LLMs are a relatively new machine-learning technology whose ability to generate text, code, and structured data is of significant potential relevance to the nonproliferation domain. He went on to say that while LLMs can replace or enhance existing research tools, it is crucial to recognize their limitations. Being prediction-based, they may sometimes yield incorrect results. Furthermore, LLMs can be computationally expensive, meaning they are not always the optimal tool, even for applications to which they otherwise seem well suited.
Moving to the main element of his presentation, Dr. Stewart highlighted the importance of defining a structured workflow when using LLMs. Structured workflows make it possible to examine the suitability of LLMs in various research scenarios. Nonproliferation is a data-intensive field that requires integrating a variety of data sources for comprehensive insights, and it can benefit greatly from such workflow optimization.
Dr. Stewart noted that within CNS, nonproliferation is approached from various perspectives, including diplomatic, location-based, historical, and legal standpoints, as well as from a data perspective. He went on to say that much of his own work takes a data-centric approach to nonproliferation that mirrors the Activity-Based Intelligence (ABI) concept of understanding the ‘who,’ ‘what,’ ‘why,’ ‘where,’ ‘how,’ and ‘when’ of an activity. The challenge lies in linking disparate data sources to form an intelligible and cohesive picture of the nonproliferation landscape. This linkage and picture-building process can involve constructing a knowledge graph or an ontology.
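To make the linkage idea concrete, the following minimal sketch (not from the talk) shows how records from different hypothetical sources might be joined into a small knowledge graph using the networkx library; every entity, source, and relation in it is an invented placeholder.

```python
import networkx as nx

# Build a small directed knowledge graph linking ABI-style entities.
# All nodes, attributes, and edges are invented placeholders.
G = nx.DiGraph()

# 'Who' and 'what' nodes drawn from different hypothetical data sources.
G.add_node("Person A", kind="individual", source="corporate registry")
G.add_node("Company B", kind="company", source="trade data")
G.add_node("Facility C", kind="facility", source="open-source reporting")

# Edges capture the relationships ('how' and 'where') between entities.
G.add_edge("Person A", "Company B", relation="director of")
G.add_edge("Company B", "Facility C", relation="supplied equipment to")

# Traversing the graph links a person to a facility via a company,
# turning disparate records into one cohesive picture.
for path in nx.all_simple_paths(G, "Person A", "Facility C"):
    print(" -> ".join(path))
```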
Emphasizing the value of ABI in nonproliferation, Dr. Stewart outlined how ABI uses every piece of available data to build a comprehensive profile of an entity or area of interest, whether that entity is an individual, a proliferation network, or an entire field of work. From mapping every weapons of mass destruction (WMD) facility to identifying every involved company and its activities, ABI helps gather crucial data. Key individuals and their actions are also identified and examined. The data for nonproliferation research comes from a variety of sources, including general open-source information, trade data, contractual data, corporate data, patents, tenders, and scientific-cooperation data from resources such as Scopus. To facilitate efficient data exploration, a range of tools is used, including LLMs and other machine-learning approaches.
Dr. Stewart went on to introduce one such tool: the Nonproliferation Archive, which he is developing as a repository of 10,000 original nonproliferation documents. He detailed specific ways in which LLMs had been useful in developing this archive. For example, metadata for the archived documents usually does not exist, so LLMs were employed to help create it. Additionally, LLMs were used to produce document summaries so that the documents could be posted with accompanying text. Finally, transcription tools such as Whisper were used to produce transcripts of videos that could then be summarized.
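As a rough illustration of this archive workflow — a sketch, not Dr. Stewart’s actual code — the snippet below transcribes a video with the open-source whisper package and then asks the ChatGPT API (via the pre-1.0 openai Python interface current at the time of the talk) for a summary and basic metadata. The file name, model choices, and prompt wording are all assumptions.

```python
import openai
import whisper

openai.api_key = "YOUR_API_KEY"  # assumed to be supplied by the user

# Step 1: transcribe a video with the open-source Whisper model.
model = whisper.load_model("base")
transcript = model.transcribe("seminar_video.mp4")["text"]  # hypothetical file

# Step 2: ask the ChatGPT API for a summary plus simple metadata.
# The prompt wording is illustrative only.
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You summarize nonproliferation documents."},
        {"role": "user", "content": (
            "Summarize the following transcript in one paragraph, then "
            "suggest a title, date, and topic as metadata:\n\n"
            + transcript[:8000]  # truncate to stay within the context window
        )},
    ],
)
print(response["choices"][0]["message"]["content"])
```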
Dr. Stewart provided several other examples of how LLMs could be integrated into research workflows. One involved extracting data from 10,000 tenders to produce a single structured dataset, a task that other machine-learning approaches cannot solve without human review. LLMs were also used to analyze 2,000 academic papers mentioning machine learning and weapons, helping to determine their relevance to different military systems. In exploring nonproliferation workflows in which LLMs can be used, Dr. Stewart also reiterated the potential use of LLMs in natural language processing, specifically to analyze archive documents for nuclear history. This would involve optical character recognition (OCR), running the text through the ChatGPT API to produce document summaries, and tagging the content for user review, rating, and commentary. The advantage of this approach is that LLMs seem capable of reading OCRed documents even when the quality of the OCR is relatively poor.
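The tender-extraction pattern can be sketched in a few lines: prompt the model to return a fixed set of JSON fields for each tender, then aggregate the results into one dataset. The field names and prompt below are assumptions, and in practice the model’s output would need validation and human review, as noted above.

```python
import json
import openai

openai.api_key = "YOUR_API_KEY"

# Hypothetical schema; the fields actually extracted were not specified in the talk.
FIELDS = ["issuing_organization", "item_description", "deadline", "country"]

def extract_tender(text: str) -> dict:
    """Ask the model to pull a fixed set of fields out of one tender."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": (
                "Extract the following fields from this tender and reply "
                f"with JSON only ({', '.join(FIELDS)}). Use null for any "
                "field that is absent.\n\n" + text
            ),
        }],
        temperature=0,  # favor deterministic, structured output
    )
    # The reply is not guaranteed to be valid JSON; real use would add
    # error handling and retries here.
    return json.loads(response["choices"][0]["message"]["content"])

# Aggregate many tenders into a single structured dataset.
tenders = ["Tender text 1 ...", "Tender text 2 ..."]  # placeholder inputs
dataset = [extract_tender(t) for t in tenders]
print(json.dumps(dataset, indent=2))
```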
Several other use cases for LLMs, such as news summarization, classification, and tagging, were also highlighted. Other potential workflows involve using LLMs as a tool to examine documents by asking questions about their content, sources, related material, and so on. This might include asking the LLM to explain a country’s position on an issue, or asking it to detail who introduced different considerations and in what order. Another potential use could be to answer questions on technical subjects such as export controls, although this might require retraining the LLM, a topic Dr. Stewart returned to later in the talk.
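The document-examination workflow follows the same shape: place the document’s text in the prompt and ask a question about it. The sketch below assumes the document fits in the model’s context window; the file name and question are invented.

```python
import openai

openai.api_key = "YOUR_API_KEY"

def ask_document(document: str, question: str) -> str:
    """Answer a question about one document by including its text in the prompt."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": (
                "Answer questions using only the document provided. "
                "Say so if the document does not contain the answer."
            )},
            {"role": "user", "content": f"Document:\n{document}\n\nQuestion: {question}"},
        ],
    )
    return response["choices"][0]["message"]["content"]

# Hypothetical usage mirroring the examples from the talk.
with open("meeting_record.txt") as f:  # placeholder file name
    doc = f.read()
print(ask_document(doc, "Who introduced each consideration, and in what order?"))
```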
Dr. Stewart then discussed cases where LLMs may not be a good fit. For instance, an attempt to use an LLM as a master tool in a simulation-based exercise did not yield satisfactory results: the LLM merely regurgitated the scenario without providing helpful staged responses. This led into a discussion of the need to train LLMs for specific use cases. Training would ideally involve generating a series of prompts and ideal responses to better suit the specific research needs of a project. Dr. Stewart said that as LLMs develop, it should also become possible to feed in larger quantities of text, allowing LLMs to answer more context-specific questions about that text.
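At the time of the talk, OpenAI’s fine-tuning endpoint accepted JSONL files of prompt/completion pairs for its base models, which matches the ‘prompts and ideal responses’ approach described above. The sketch below shows what such a training file might look like for a hypothetical export-control assistant; both examples are invented, not real export-control guidance.

```python
import json

# Invented prompt/ideal-response pairs. The "###" separator and " END"
# stop token follow the formatting conventions OpenAI documented for
# fine-tuning at the time.
examples = [
    {
        "prompt": "Is a license required to export item X to country Y?\n\n###\n\n",
        "completion": " Under the assumed control list, item X is controlled, "
                      "so a license would be required. END",
    },
    {
        "prompt": "Which regime covers dual-use machine tools?\n\n###\n\n",
        "completion": " Dual-use machine tools fall under the Wassenaar "
                      "Arrangement's dual-use list. END",
    },
]

# Write one JSON object per line, the format the fine-tuning API expected.
with open("training_data.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```

The resulting file could then be submitted with the command-line tool of the time (for example, `openai api fine_tunes.create -t training_data.jsonl -m davinci`).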
To conclude, Dr. Stewart reiterated the significant potential of LLMs in aiding nonproliferation research. While they offer many immediately relevant workflows and can extend or replace existing tools, they are computationally more expensive than other machine-learning approaches, and, being prediction-based, they do sometimes make mistakes. With a structured workflow, however, it is possible to identify the scenarios where LLMs are the most suitable tools and where they fall short. As LLM technology continues to evolve, its integration into nonproliferation research is likely to become only more prevalent and beneficial. With OpenAI’s commitment to facilitating retraining, the future of LLMs in nonproliferation research seems promising indeed.