RAPID Reaction Software Ecosystem 

Lead Team Members: Dion Vlachos (PI), Raul Lobo, Marat Orazov, Sunita Chandrasekaran

Supporting Team Members: Sashank Kasiraju, Siddhant Lambor

This project focuses on building next-generation kinetics modeling software and data-harnessing tools to analyze, design, and optimize chemical kinetics and mechanisms, catalysts, chemical reactors, and modular manufacturing processes.

The Virtual Kinetics Lab provides tools to accelerate the multiscale modeling workflow by integrating first-principles calculations and data-driven methods. These tools calculate thermodynamic properties of adsorbates on catalysts and reaction rate constants, generate reaction pathways and networks, build kinetic models, and visualize and analyze reaction models and kinetics. We leverage advanced optimization and machine learning paradigms for kinetic parameter estimation, similarity analysis of molecules, uncertainty quantification, and development of surrogate models. We introduce a tool for optimal catalyst prediction. We provide integrated databases that retain information from simulations at various scales, including the learned reaction rules, density functional theory data, thermochemistry data, reaction rate parameters, and reaction mechanisms. The Virtual Kinetics Lab software tools are easily accessible and user-centric and provide smart visualization and analysis tools for chemical systems to derive technical insights.

Virtual Kinetics Lab | Software Ecosystem


The Python Multiscale Thermochemistry Toolbox (pMuTT) is a Python library that implements statistical thermodynamics for thermochemistry calculations and serves as a one-stop calculator. Conversion between quantum mechanically computed properties and thermodynamic properties is ubiquitous in multiscale modeling. pMuTT is open-source software that converts data from a) experimental observations or b) ab initio or DFT calculations into thermodynamic properties of species and reactions and kinetic parameters of reactions. It supports several input and output formats and can generate energy profiles and graphs. pMuTT offers extensive functionality out of the box to automate these routine and repetitive tasks. pMuTT is implemented in Python and can be installed using “pip install pmutt”. pMuTT also has an extensive documentation page with many clear examples. As an open-source, object-oriented Python library, it can be easily accessed and adapted in your own Python code. For more information please visit: pMuTT Documentation.

pMuTT Functionalities
  • User selectable energy modes (translational, vibrational, rotational, electronic, nuclear)
  • User selectable statistical mechanics models for each energy mode
  • Converts output of DFT (Gaussian and VASP) into rate constants
  • Creates specific heat, enthalpy, and entropy output for thermodynamic calculations
  • Creates energy diagrams, free energy diagrams, and phase diagrams
  • Creates empirical objects (NASA polynomials, NASA9 polynomials, Shomate polynomials)
  • Creates input files for MKM software (Chemkin, OpenMKM)
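
As a rough illustration of what an empirical thermochemistry object evaluates, the sketch below computes dimensionless heat capacity, enthalpy, and entropy from the standard seven-coefficient NASA polynomial form. It does not use the pMuTT API; the function name `nasa7_properties` and the coefficient values are assumptions made for this example only.

```python
import math

R = 8.314462618  # J/(mol K), molar gas constant


def nasa7_properties(a, T):
    """Return (Cp/R, H/RT, S/R) from seven NASA polynomial coefficients."""
    a1, a2, a3, a4, a5, a6, a7 = a
    cp_R = a1 + a2 * T + a3 * T**2 + a4 * T**3 + a5 * T**4
    h_RT = a1 + a2 * T / 2 + a3 * T**2 / 3 + a4 * T**3 / 4 + a5 * T**4 / 5 + a6 / T
    s_R = a1 * math.log(T) + a2 * T + a3 * T**2 / 2 + a4 * T**3 / 3 + a5 * T**4 / 4 + a7
    return cp_R, h_RT, s_R


# Hypothetical coefficients for illustration only (not fitted to a real species).
coeffs = (3.5, 0.0, 0.0, 0.0, 0.0, -1000.0, 4.0)
T = 500.0  # K

cp_R, h_RT, s_R = nasa7_properties(coeffs, T)
gibbs = (h_RT - s_R) * R * T  # G = H - T*S, in J/mol
```

In practice the coefficients would come from a fit to statistical-thermodynamics or experimental data, usually with separate coefficient sets for a low and a high temperature range.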

Python Group Additivity (pGrAdd) is a Python package and database that implements the First-Principles Semi-Empirical (FPSE) Group Additivity (GA) method for estimating thermodynamic properties of molecules in the gas phase, liquid phase, and on catalysts. pGrAdd allows researchers to rapidly estimate the thermodynamic properties of thousands of molecules/adsorbates and of large molecules from the thermochemistry of a smaller dataset of small molecules, building and deploying models in a fraction of the time normally required. pGrAdd contains databases of gas species and adsorbates on Pt(111) surfaces so new models can be built immediately. In addition, a user can easily build a new database from their own DFT data for any adsorbates and surfaces they require. The method identifies the groups, estimates the group additivity values, and uses these to predict the thermochemistry of new molecules. The method can be applied to other catalysts by supplying new DFT data or by simply using the extended linear scaling relations and the Pt data. pGrAdd can be easily installed using pip. For more information please visit: pGrAdd Documentation.

pGrAdd Functionalities
  • Computes thermochemical data for species using group additivity
  • Reads SMILES strings and converts them to central atom-nearest neighbor groups
  • Contains databases to convert groups to thermodynamic contributions
  • Includes a gas database based on Benson’s work and 5 databases for adsorbates on Pt(111)
  • Creates empirical objects compatible with pMuTT
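
The core arithmetic of group additivity can be sketched in a few lines: a property is estimated as the sum of group contributions weighted by how often each group occurs in the molecule. The snippet below is not the pGrAdd API. The group labels, the helper `estimate_property`, and the contribution values are placeholders chosen for illustration, and the group counts are supplied by hand rather than derived from a SMILES string.

```python
from collections import Counter

# Placeholder group contributions to the enthalpy of formation (kcal/mol).
# These numbers are illustrative, not values from the pGrAdd databases.
GROUP_VALUES = {
    "C(C)(H)3": -10.0,   # methyl carbon bonded to one carbon
    "C(C)2(H)2": -5.0,   # methylene carbon bonded to two carbons
}


def estimate_property(group_counts, group_values):
    """Sum group contributions weighted by how often each group occurs."""
    missing = [g for g in group_counts if g not in group_values]
    if missing:
        raise KeyError(f"No contribution available for groups: {missing}")
    return sum(n * group_values[g] for g, n in group_counts.items())


# Propane (CH3-CH2-CH3): two methyl groups and one methylene group.
propane_groups = Counter({"C(C)(H)3": 2, "C(C)2(H)2": 1})
propane_estimate = estimate_property(propane_groups, GROUP_VALUES)

# n-Butane (CH3-CH2-CH2-CH3): two methyl groups and two methylene groups.
butane_groups = Counter({"C(C)(H)3": 2, "C(C)2(H)2": 2})
butane_estimate = estimate_property(butane_groups, GROUP_VALUES)
```

Raising an error for an unknown group, rather than silently skipping it, keeps an incomplete database from producing a plausible-looking but wrong estimate.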

The Virtual Kinetic Laboratory Units (VUnits) is a Python library for unit conversion and constants developed by the Vlachos Research Group at the University of Delaware. This code supports Python-based Virtual Kinetic Laboratory software and aims to be lightweight. The list of supported units, constants, and other documentation can be found in the VUnits documentation.
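
To give a sense of the kind of conversion such a library handles, here is a minimal sketch of an energy-unit converter built on a lookup table of factors. It is not the VUnits API: the table `TO_J_PER_MOL` and the function `convert_energy` are hypothetical names, and only a handful of units are included.

```python
# Exact SI defining constants (2019 redefinition).
ELEMENTARY_CHARGE = 1.602176634e-19  # C, so 1 eV = 1.602176634e-19 J
AVOGADRO = 6.02214076e23             # 1/mol
CAL_TO_J = 4.184                     # thermochemical calorie in J

# Factors that convert one unit of energy into J/mol (hypothetical table layout).
TO_J_PER_MOL = {
    "J/mol": 1.0,
    "kJ/mol": 1.0e3,
    "kcal/mol": 1.0e3 * CAL_TO_J,
    "eV": ELEMENTARY_CHARGE * AVOGADRO,  # eV per particle, scaled to one mole
}


def convert_energy(value, from_unit, to_unit):
    """Convert an energy by passing through the common base unit J/mol."""
    return value * TO_J_PER_MOL[from_unit] / TO_J_PER_MOL[to_unit]


# A DFT energy difference of 1 eV per molecule expressed in kJ/mol.
ev_in_kj_per_mol = convert_energy(1.0, "eV", "kJ/mol")
```

Routing every conversion through one base unit means each new unit needs a single factor rather than a factor for every pair of units.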


The Python Reaction Stencil (pReSt) is a first-principles-based reaction mechanism generation framework. It learns the reaction rules from DFT data of published reaction mechanisms by generating molecular graphs of reactants and products, extracting the common subgraph, and defining all bonds that change during a reaction. It can generate reaction networks not studied before, “flag” reactions not seen before for further DFT convergence tests, and easily reconcile differences between catalysts and reactants that may introduce new pathways never seen before. The pReSt framework can also be a diagnostic tool for data (mechanism) quality assessment and novel pathway discovery for new molecules.

pReSt Functionalities
  • “Learns” reaction rules from published mechanisms by reading in the relaxed chemical structures for each species in a reaction mechanism
  • Can read structures from VASP (CONTCAR file), Gaussian (log file) and from SMILES strings
  • Stores reaction rules in a database to be used for generating new reaction mechanisms with different initial reactants
  • Flags unseen reactions for further investigation
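
The idea of defining a reaction rule by the bonds that change can be sketched with plain sets. The example below is a simplification and not the pReSt implementation: it assumes the atom mapping between reactants and products is already known, whereas the framework described above extracts the common subgraph from relaxed structures. The helper names, the atom indices, and the rule format are all hypothetical.

```python
def bond(i, j):
    """An undirected bond between two atom indices."""
    return frozenset((i, j))


def extract_rule(reactant_bonds, product_bonds):
    """Return the bonds broken and formed when going from reactants to products."""
    broken = reactant_bonds - product_bonds
    formed = product_bonds - reactant_bonds
    return broken, formed


def describe(bonds, elements):
    """Translate index-based bonds into sorted element-pair labels such as 'C-H'."""
    return tuple(sorted("-".join(sorted(elements[i] for i in b)) for b in bonds))


# C-H scission of adsorbed methyl: CH3* + * -> CH2* + H*
# Hypothetical atom map: 0 = C, 1 to 3 = H, 4 and 5 = surface sites ("*").
elements = {0: "C", 1: "H", 2: "H", 3: "H", 4: "*", 5: "*"}
reactants = {bond(0, 1), bond(0, 2), bond(0, 3), bond(0, 4)}
products = {bond(0, 1), bond(0, 2), bond(0, 4), bond(3, 5)}

broken, formed = extract_rule(reactants, products)
rule = (describe(broken, elements), describe(formed, elements))

# A rule that is absent from the stored set would be flagged for further checks.
known_rules = {(("C-C",), ("*-C",))}
is_unseen = rule not in known_rules
```

Storing rules as element-level bond changes, rather than as whole reactions, is what lets the same rule apply to a new reactant that contains the same local bonding pattern.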

Reaction Network Viewer (ReNView) quickly generates a graphical representation of the reaction fluxes within a system, which is essential for identifying dominant reaction pathways and reducing a mechanism without manual data processing. ReNView helps users analyze reaction mechanisms and identify key species and reactions by showing the flux for each path (via line thickness), partially equilibrated reactions (via line color), and surface coverage (via species background colors). The code is updated regularly; upcoming features include lumping species based on characteristics such as molecular weight and number of monomers. For more information please visit: ReNView documentation.

ReNView Functionalities
  • Generates a graphical representation of reaction pathways in a chemically reacting system
  • Identifies quasi-equilibrated and non-equilibrated reaction steps based on partial equilibrium index
  • Portrays the relative magnitude and direction of each reaction flux for all pathways
  • Offers excellent visual representation for dominant reaction path identification and mechanism reduction
  • Interfaces with microkinetic modeling software: Chemkin and OpenMKM
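
A partial equilibrium index can be computed directly from the forward and reverse rates of a step. The sketch below assumes the commonly used definition, forward rate divided by the sum of forward and reverse rates, so that a value near 0.5 indicates a quasi-equilibrated step. The function names and the tolerance of 0.05 are assumptions for illustration, not ReNView settings.

```python
def partial_equilibrium_index(rate_forward, rate_reverse):
    """PEI = r_f / (r_f + r_r); 0.5 means forward and reverse rates balance."""
    total = rate_forward + rate_reverse
    if total == 0.0:
        raise ValueError("Both rates are zero; the index is undefined.")
    return rate_forward / total


def classify_step(rate_forward, rate_reverse, tolerance=0.05):
    """Return the index, the net flux, and whether the step is quasi-equilibrated."""
    pei = partial_equilibrium_index(rate_forward, rate_reverse)
    net_flux = rate_forward - rate_reverse
    equilibrated = abs(pei - 0.5) <= tolerance
    return pei, net_flux, equilibrated


# Hypothetical rates in arbitrary but consistent units.
fast_reversible = classify_step(1.00, 0.98)   # nearly balanced step
irreversible = classify_step(2.0, 0.0)        # forward-only step
```

The net flux sets how thick a path would be drawn, while the index separates steps that carry flux from steps that merely exchange rapidly in both directions.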

Open-source Microkinetic Modeling (OpenMKM) is a multiscale microkinetics toolkit for modeling homogeneous reactions, e.g., gas-phase, and/or heterogeneous catalytic reactions. Microkinetic modeling enables coupling of “microscale” atomistic data with “macroscale” reactor observables. OpenMKM is a modular, object-oriented, open-source C++ software toolbox developed at the Delaware Energy Institute and is built upon the popular and robust open-source Cantera software. With OpenMKM, users can quickly set up and start performing microkinetic simulations without the need to write any code. For more information please visit: OpenMKM Documentation.

OpenMKM Functionalities
  • Users can run different ideal reactors such as batch, CSTR, PFR etc. with various heat balance options such as isothermal, adiabatic and temperature ramp etc.
  • User inputs are conveniently divided into reactor setup files, and thermodynamic and kinetic definition files to easily apply reaction mechanisms to different reactors.
  • Tightly integrated with the pMuTT and ReNView software for input file generation and post-processing.
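
For readers less familiar with the ideal reactor types listed above, the following sketch compares the conversion of a first-order, irreversible, isothermal reaction A -> B in a batch reactor and in a CSTR. It is a textbook illustration only and unrelated to how OpenMKM, a C++ toolkit driven by input files, is used; the function names and the numerical values are assumptions.

```python
import math


def conversion_batch(k, t):
    """Batch reactor (or PFR with residence time t): X = 1 - exp(-k t)."""
    return 1.0 - math.exp(-k * t)


def conversion_cstr(k, tau):
    """Steady-state CSTR with residence time tau: X = k tau / (1 + k tau)."""
    return k * tau / (1.0 + k * tau)


# Hypothetical rate constant (1/s) and time (s), chosen so that k * t = 1.
k, t = 0.5, 2.0
x_batch = conversion_batch(k, t)
x_cstr = conversion_cstr(k, t)
```

For the same rate constant and residence time, the batch or plug-flow reactor reaches a higher conversion than the CSTR, because the CSTR operates everywhere at the low outlet concentration.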

Descriptor-Based Microkinetic Analysis Package (DescMAP) is a Python-based software package for automating volcano curve generation. It builds on existing software tools in the Virtual Kinetic Laboratory suite. DescMAP provides modules for descriptor selection, descriptor sampling, kinetic performance analysis, and volcano curve creation. Inputting data via spreadsheets and controlling program behavior via template files increases flexibility and supported capabilities. Interactive output graphs enhance interpretability and allow users to zoom in on optimal properties of the catalyst.

DescMAP Functionalities
  • Provides user selectable electronic (species adsorption energy) and geometric (generalized coordination number) descriptors
  • Offers diverse methods to input species’ thermodynamics such as those involving DFT, GA, LSRs, NASA polynomials, Shomate polynomials, or direct input of properties
  • Automates calculation of kinetic parameters using data from DFT or BEP relationships
  • Provides interactive graphs for post-processing and multiple output options for analyzing and visualizing kinetic performance
  • Interfaces with thermochemistry software (pMuTT and pGrAdd) and microkinetic modeling software (Chemkin and OpenMKM)
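
The step from a descriptor to kinetic parameters can be illustrated with a linear Brønsted-Evans-Polanyi (BEP) relation followed by the Arrhenius equation. This is a sketch under stated assumptions, not DescMAP code: the BEP slope and intercept, the pre-exponential factor, and the sampled reaction energies are hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # kJ/(mol K)


def bep_activation_energy(delta_e_rxn, alpha, beta):
    """Linear BEP relation: Ea = alpha * dE_rxn + beta (all in kJ/mol)."""
    return alpha * delta_e_rxn + beta


def rate_constant(pre_exponential, activation_energy, temperature):
    """Arrhenius rate constant for an activation energy in kJ/mol."""
    return pre_exponential * math.exp(-activation_energy / (R_KJ * temperature))


# Hypothetical BEP parameters and descriptor-dependent reaction energies.
ALPHA, BETA = 0.5, 100.0
reaction_energies = [-60.0, -40.0, -20.0, 0.0]  # kJ/mol, one per sampled catalyst

barriers = [bep_activation_energy(dE, ALPHA, BETA) for dE in reaction_energies]
constants = [rate_constant(1.0e13, Ea, 600.0) for Ea in barriers]
```

A volcano curve emerges when several elementary steps depend on the same descriptor in opposing ways, so the full analysis feeds rate constants like these into a microkinetic model rather than reading them individually.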

Chemical Kinetics Database (CKineticsDB) is a state-of-the-art datahub for microkinetic modeling files based on ab initio calculations. It is extensible and adaptable, with a data model rooted in the inherent relations of the stored data, resulting in efficient data management practices. CKineticsDB retains all the information from simulations at various scales and allows accurate regeneration of publication results. The stored data can be accessed based on software parameters, catalysis parameters, reactions of interest, and publications. The data is curated before uploading to the database through a semi-automated process to ensure high quality standards and uniformity in the stored files.

CKineticsDB Functionalities
  • Provides access to published data related to density functional theory and microkinetic modeling.
  • Uses a state-of-the-art backend to store unaltered files, which can be used directly to reproduce publication results.
  • Offers features to download specific data according to users’ interests, based on catalysis and computational parameters as well as publication details.
  • Contains the latest published data of the group with more than 8000 DFT calculations and growing.

Artificial Intelligence Molecular Similarity (AIMSim) is an open-source, accessible cheminformatics platform for performing similarity operations on collections of molecules (molecular datasets). AIMSim brings together the rich knowledge base of the cheminformatics community in an easily accessible interface. It provides a unified platform to simplify cheminformatics workflows for molecular datasets, such as diversity quantification, outlier and novelty analysis, clustering, and inter-molecular comparisons. AIMSim uses Python to abstract sophisticated similarity operations, molecular fingerprinting, and multiprocessing capabilities from the user. The user decides the granularity of control over the operations, ranging from no-code GUI use to programmatic usage similar to a Python package. With thousands of available chemical descriptors and almost 50 unique metrics for calculating similarity, AIMSim has a diverse range of applications from GNN verification to novelty analysis. Visit our online documentation for installation details, unit tests, and detailed descriptions of the class structure. Users will also find this interactive, online tutorial helpful.

AIMSim Functionalities
  • Carries out similarity analyses on chemical datasets
  • Provides a graphical user interface (GUI) and rich visualization to make similarity operations intuitive.
  • Supports a range of control, from no-code GUI use to programmatic usage similar to a Python package.
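
One widely used similarity metric for molecular fingerprints is the Tanimoto coefficient, the size of the intersection of two bit sets divided by the size of their union. The sketch below computes it for a tiny hypothetical dataset and picks the most similar pair. It does not use the AIMSim API, the fingerprints are made-up sets of "on" bit positions, and returning 1.0 for two empty fingerprints is a convention assumed here.

```python
from itertools import combinations


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 1.0  # assumed convention: two empty fingerprints are identical
    return len(a & b) / len(union)


# Hypothetical fingerprints, each stored as the set of "on" bit positions.
fingerprints = {
    "mol_a": {1, 4, 7, 9},
    "mol_b": {1, 4, 7, 12},
    "mol_c": {2, 3, 20},
}

# Score every unordered pair of molecules in the dataset.
pair_scores = {
    (a, b): tanimoto(fingerprints[a], fingerprints[b])
    for a, b in combinations(sorted(fingerprints), 2)
}
most_similar = max(pair_scores, key=pair_scores.get)
```

Pairwise scores like these are the raw material for the workflows named above: clustering groups molecules with high mutual similarity, while outlier analysis looks for molecules whose best score against the rest of the set is low.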

The Chemical Kinetic Bayesian Inference Toolbox (CKBIT) is an open-source Python library that facilitates Bayesian inference of kinetic model parameters. Bayesian techniques rapidly estimate optimal parameter values (maximum a posteriori) or probability distributions (Markov chain Monte Carlo, variational inference). Leveraging Excel data entry to minimize coding, users may estimate activation energies, pre-exponential terms, and reaction orders from chemical kinetic data from various reactors (batch, continuous stirred-tank, plug flow) and reaction networks. Additional capabilities of hierarchical error modeling and prior distribution specification make CKBIT a flexible, accurate tool for kinetic parameter estimation and uncertainty quantification. Multiple examples are available online to facilitate implementation. For more information please visit: CKBIT Documentation.

CKBIT Functionalities
  • Provides Bayesian maximum a posteriori estimation from kinetic data
  • Estimates reaction orders, apparent activation energies, and pre-exponential terms
  • Offers prior knowledge specification and hierarchical error modeling
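
Maximum a posteriori (MAP) estimation can be shown in closed form for a simplified case. The sketch below estimates only the activation energy from the linearized Arrhenius equation, ln k = ln A - Ea/(R T), assuming a known pre-exponential factor, a Gaussian prior on Ea, and Gaussian noise on ln k. These assumptions, the function name, and the synthetic data are choices made for this example; it is not the CKBIT API, which handles more general models.

```python
import math

R = 8.314462618  # J/(mol K)


def map_activation_energy(temps, rate_constants, ln_A, prior_mean, prior_sd, noise_sd):
    """MAP estimate of Ea (J/mol) under a Gaussian prior and Gaussian noise on ln k."""
    xs = [-1.0 / (R * T) for T in temps]               # regressor multiplying Ea
    ys = [math.log(k) - ln_A for k in rate_constants]  # observed ln k - ln A
    precision = 1.0 / prior_sd**2 + sum(x * x for x in xs) / noise_sd**2
    weighted = prior_mean / prior_sd**2 + sum(x * y for x, y in zip(xs, ys)) / noise_sd**2
    return weighted / precision


# Synthetic, noise-free data generated from assumed "true" parameters.
EA_TRUE, A_TRUE = 80_000.0, 1.0e13
temps = [500.0, 550.0, 600.0]
ks = [A_TRUE * math.exp(-EA_TRUE / (R * T)) for T in temps]

ea_map = map_activation_energy(
    temps, ks, math.log(A_TRUE), prior_mean=60_000.0, prior_sd=20_000.0, noise_sd=0.1
)
```

The estimate is a precision-weighted average of the prior mean and the least-squares value, so informative data pull it close to the value implied by the measurements while a tight prior would hold it near the prior mean.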

Next Experiment Toolkit in PyTorch (NEXTorch) is an open-source software package in Python/PyTorch to facilitate experimental design using Bayesian Optimization (BO). It can also be used for learning the theory and implementation of BO. The modular and flexible design of NEXTorch can handle mixed types of parameters and data-type conversions, support both automated and human-in-the-loop optimization, and offer various visualization options. It can be used in chemical synthesis in laboratory experiments, molecular modeling, reaction condition optimization, parameter estimation, and reactor geometry optimization, to mention a few examples. Such tasks can be performed without extensive programming effort so that the user can focus on domain-specific questions. Its BoTorch/PyTorch backend enables GPU acceleration, parallelization, and state-of-the-art Bayesian optimization algorithms. Moreover, NEXTorch provides an interface to commonly used simulation tools in the reaction engineering field, such as CFD and multiphysics simulations, for automatic optimization. For more information please visit: NEXTorch documentation.

NEXTorch Functionalities
  • Scalable, flexible, and accessible to the end users, such as chemists and engineers, to solve real world problems
  • Intuitive and includes researchers as a part of the optimization process
  • Provides choices for design of experiments and visualization capabilities
  • Involves fast predictive modeling, flexible optimization loops, easy interfacing with legacy software, and multiple types of parameters and data type conversions
  • Provides GPU acceleration, parallelization, and state-of-the-art Bayesian optimization algorithms

Python-based Quantification under Uncertainty with Analysis through Deconvolution (pQUAD) is easily customizable software for product quantification. It uses a principal component regression (PCR) model with error propagation via prediction intervals for multivariable regression. It can analyze experimental measurement errors by including the deconvolution of multi-component spectra. pQUAD has been developed using an agile software development workflow that incorporates feedback from experimentalist collaborators. It also provides a user interface via Command Prompt in addition to the Python module. For more information please visit: pQUAD documentation.

pQUAD Functionalities
  • Minimizes error by training on high quality pure-component spectra
  • Built with a modular structure, allowing independent analyses using sub-workflows of the package.
  • Provides the pure-component spectra and concentrations used for training the PCR model and the mixture data used in testing the model.
  • Includes a Windows command prompt interface, activated via a Windows .bat file, to facilitate usability.

Parameter Estimation with Bayesian Optimization (petBOA) is a Python-based, open-source tool designed for parameter estimation of mathematical models. It uses Gaussian process-based Bayesian optimization to minimize objective functions. The optimizer is extended from the core NEXTorch software package. Example templates for kinetic parameter estimation using experimental data for widely used reactor models, such as batch and plug-flow reactors (PFR), are bundled with the tool.

petBOA Functionalities
  • Provides parameter estimation from experimental kinetic data given a custom objective function
  • Leverages a Gaussian process prior for Bayesian optimization using the NEXTorch software
  • Applies acquisition functions: Expected Improvement (EI) and Probability of Improvement (PI)
  • Offers open-ended equation framework and parameter input for user-specified estimation
  • Interfaces with OpenMKM software for microkinetic model parameter estimation
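
The two acquisition functions named above have simple closed forms when the surrogate prediction at a candidate point is Gaussian with mean mu and standard deviation sigma. The sketch below writes them for a minimization problem, matching the objective-function minimization described for petBOA. It is a standalone illustration rather than petBOA or NEXTorch code, and the candidate predictions are hypothetical.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability_of_improvement(mu, sigma, best):
    """PI: probability that the objective at this point falls below the best value."""
    if sigma == 0.0:
        return 1.0 if mu < best else 0.0
    return normal_cdf((best - mu) / sigma)


def expected_improvement(mu, sigma, best):
    """EI: expected amount by which this point improves on the best value."""
    if sigma == 0.0:
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    return (best - mu) * normal_cdf(z) + sigma * normal_pdf(z)


# Hypothetical surrogate predictions (mean, standard deviation) at three candidates.
best_so_far = 1.0
candidates = [(0.9, 0.05), (1.2, 0.8), (1.0, 0.0)]

ei_values = [expected_improvement(mu, sd, best_so_far) for mu, sd in candidates]
next_index = max(range(len(candidates)), key=ei_values.__getitem__)
```

In this example EI favors the second candidate, whose predicted mean is worse but whose uncertainty is large, which illustrates how the acquisition function trades off exploiting good predictions against exploring poorly known regions.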

Learn Chemical Kinetics (LearnCK) is a Python framework designed to leverage machine learning as surrogate models for highly nonlinear and stiff chemical reaction kinetics represented within microkinetic models (MKM). Currently, the toolkit enables artificial neural network (ANN) representation of intrinsic chemical kinetics for heterogeneous catalytic systems. This method is exceptionally powerful for large-scale MKM simulations. The tool provides a simple user interface a) to quickly build neural networks with the TensorFlow framework and train them, and b) to develop reactor models (such as PFR and CSTR) using the previously trained ANN-based surrogate black-box models. A simple user interface is also provided for statistical sampling methods, such as Latin Hypercube Sampling (LHS), to generate training data spanning the phase-space envelope of relevant reaction conditions.

LearnCK Functionalities
  • Open-source Python toolkit that uses machine (deep) learning to approximate chemical kinetics, reducing both model complexity and computational cost.
  • Currently designed to learn and approximate kinetics from data generated by first-principles microkinetic modeling.
  • Trained neural networks run several orders of magnitude faster than the original microkinetic models for large-scale chemical reaction systems.
  • Easy user interface to create custom neural networks built with the TensorFlow API. Provides an API for training-data cleansing, normalization, and visualization.
  • SciPy-based Python reactor model wrappers that use these black-box neural network approximators to recreate the original reactor models.
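
Latin Hypercube Sampling, mentioned above as a way to generate training data, can be sketched with the standard library alone. Each variable's range is split into as many equal strata as there are samples, one point is drawn per stratum, and the strata are shuffled independently for each variable. The function below is a minimal illustration, not LearnCK code, and the temperature and pressure bounds are hypothetical.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Return n_samples points with exactly one point per stratum in every dimension."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniform draw inside each of n_samples equal-width strata of [0, 1).
        points = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(points)  # decouple the stratum order between dimensions
        columns.append([low + p * (high - low) for p in points])
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


# Hypothetical operating window: temperature in K and pressure in bar.
samples = latin_hypercube(5, [(500.0, 700.0), (1.0, 10.0)])

# Unit-cube samples, convenient for checking the stratification property.
unit_samples = latin_hypercube(5, [(0.0, 1.0), (0.0, 1.0)])
```

Compared with plain random sampling, this guarantees that every slice of each variable's range is represented, which matters when each training point requires an expensive microkinetic simulation.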

A key technology gap identified in the RAPID roadmap was a lack of models and catalysts for focus areas, including commodity chemicals, natural gas, and biomass. The project provides the necessary infrastructure to accelerate the development of new catalysts by replacing traditional experimentation with data-driven tools and simulations that help reduce the time and cost from discovery to commercialization and market delivery. The Virtual Kinetics Lab advances industry knowledge necessary to meet the Department of Energy’s goals for increasing efficiency, reducing energy usage and feedstock waste, and decreasing the environmental impact of clean energy manufacturing.

While this project focuses primarily on heterogeneous catalysis — chemical reactions between materials of different phases (solids, liquids, and gases) — the methods developed may also be applicable to chemical reactions between molecules belonging to the same phase (homogeneous), as well as those that involve electricity (electrocatalysis).

UD RAPID Partners