Generating Novel Molecules Using AI

By Charles Xie and Xiaotong Ding

In this article, we show how to 1) create novel molecules using generative AI (GenAI) based on large language models (LLMs), 2) visualize and compare the generated molecules in a molecular gallery, and 3) run molecular dynamics simulations to explore them further. The GenAI model used to generate molecules in this article is based on OpenAI's o4-mini.

Streamlining molecular generation, visualization, and simulation in a single system

The following video shows how this works in AIMS: we asked it to generate a hydrophobic molecule similar to benzene, then visualized and investigated the result using the existing tools in AIMS. These follow-up tasks are critically important because they help answer the "so what?" question often asked by people who use GenAI to create novel structures but are rarely given a way to test their validity. One might think that the validation could also be left to GenAI. However, most current LLMs may not be good at the complex analysis required by many science and engineering problems, which are numerical rather than textual in nature. In those cases, combining GenAI with numerical analysis may be a good solution. The following section provides an example that illustrates this point.

Validating and correcting generated molecules

A weakness of GenAI is that it can sometimes output a chemically incorrect structure, like the one shown in the following video. Running a molecular dynamics simulation in AIMS can correct the erroneous structure automatically, as the video shows. This example illustrates how AIMS combines AI with scientific simulation to achieve better results.
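To give a rough feel for why a dynamics simulation can repair a distorted geometry, here is a minimal sketch in Python. It is not AIMS's force field or integrator, which this article does not describe. It assumes a single harmonic bond between two equal atoms in one dimension, a friction term to remove energy, a semi-implicit Euler integrator, and made-up reduced units; the function name `relax_bond` and all parameter values are hypothetical, except that 1.40 Å is roughly the carbon-carbon bond length in benzene.

```python
def relax_bond(r_start, r0=1.40, k=100.0, mass=12.0,
               damping=5.0, dt=0.01, steps=5000):
    """Relax one harmonic bond by damped dynamics and return its final length.

    r_start : initial (possibly wrong) bond length
    r0      : equilibrium bond length of the harmonic spring
    k       : spring constant (arbitrary reduced units, an assumption)
    damping : friction coefficient that drains kinetic energy
    """
    mu = mass / 2.0          # reduced mass of two equal atoms
    r, v = r_start, 0.0      # bond length and its rate of change
    for _ in range(steps):
        # Restoring force pulls r toward r0; friction opposes motion.
        force = -k * (r - r0) - damping * v
        v += (force / mu) * dt   # update velocity first (semi-implicit Euler)
        r += v * dt              # then update position with the new velocity
    return r


if __name__ == "__main__":
    # A bond generated too long and one generated too short.
    for wrong_length in (2.0, 0.9):
        print(wrong_length, "->", round(relax_bond(wrong_length), 3))
```

Under these assumptions, both the overstretched and the compressed bond settle close to the equilibrium length. A simulation of this kind can only fix geometry that the force field penalizes; it would not, for example, repair an atom that was given the wrong number of bonds.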

Managing generated molecules in design

The generated molecules can be saved in an AIMS project, making the results easy to document and share. The following embedded window shows four different molecules that the AI returned when we asked it four times to "generate a novel hydrophobic molecule similar to benzene". The "AI Memory" option in Gallery Settings was disabled, so each molecule was generated independently of those already in the project. Returning a different result each time is a useful feature for design, because generating many alternatives widens our search of the solution space.

Live model above (view in full screen) — Chrome or Edge recommended
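The repeated-request workflow described above can be sketched as a small loop that sends the same prompt several times and keeps each distinct reply once. This is only an illustration under stated assumptions: `collect_candidates` and `fake_generator` are hypothetical names, the generator is a stand-in that picks from a fixed list of placeholder SMILES strings rather than calling an LLM, and none of this reflects how AIMS is implemented.

```python
import random
from typing import Callable, List


def collect_candidates(generate: Callable[[str], str],
                       prompt: str, n_requests: int) -> List[str]:
    """Send the same prompt n_requests times; keep distinct replies in order."""
    seen = set()
    unique = []
    for _ in range(n_requests):
        reply = generate(prompt).strip()
        # Plain string comparison: two different spellings of the same
        # molecule would NOT be recognized as duplicates here.
        if reply not in seen:
            seen.add(reply)
            unique.append(reply)
    return unique


# Stand-in for the LLM (an assumption for demonstration only).
# The placeholders are SMILES for toluene, cyclohexane, ethylbenzene, p-xylene.
_rng = random.Random(0)
_PLACEHOLDERS = ["Cc1ccccc1", "C1CCCCC1", "CCc1ccccc1", "Cc1ccc(C)cc1"]


def fake_generator(prompt: str) -> str:
    """Ignore the prompt and return a random placeholder molecule."""
    return _rng.choice(_PLACEHOLDERS)


if __name__ == "__main__":
    prompt = "generate a novel hydrophobic molecule similar to benzene"
    print(collect_candidates(fake_generator, prompt, n_requests=4))
```

Because each call receives only the prompt and no record of earlier replies, the requests are independent, which mirrors the case where "AI Memory" is disabled. Independent requests can repeat themselves, so the loop may return fewer candidates than the number of requests.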
