AI Article Synopsis

  • The development of the DigiMOF database addresses challenges in identifying suitable metal-organic frameworks (MOFs) for specific applications by extracting valuable synthetic information from existing literature.
  • By utilizing ChemDataExtractor for data mining, researchers compiled a large dataset from over 43,000 journal articles, leading to the identification of 15,501 unique MOF materials and essential synthesis-related properties.
  • The database not only enhances the efficiency of MOF discovery but also provides a centralized resource for researchers to analyze various characteristics like topology, surface area, and density, thus facilitating future research and applications.

Article Abstract

The vastness of materials space, particularly that which is concerned with metal-organic frameworks (MOFs), creates the critical problem of performing efficient identification of promising materials for specific applications. Although high-throughput computational approaches, including the use of machine learning, have been useful in rapid screening and rational design of MOFs, they tend to neglect descriptors related to their synthesis. One way to improve the efficiency of MOF discovery is to data-mine published MOF papers to extract the materials informatics knowledge contained within journal articles. Here, by adapting the chemistry-aware natural language processing tool, ChemDataExtractor (CDE), we generated an open-source database of MOFs focused on their synthetic properties: the DigiMOF database. Using the CDE web scraping package alongside the Cambridge Structural Database (CSD) MOF subset, we automatically downloaded 43,281 unique MOF journal articles, extracted 15,501 unique MOF materials, and text-mined over 52,680 associated properties including the synthesis method, solvent, organic linker, metal precursor, and topology. Additionally, we developed an alternative data extraction technique to obtain and transform the chemical names assigned to each CSD entry in order to determine linker types for each structure in the CSD MOF subset. This data enabled us to match MOFs to a list of known linkers provided by Tokyo Chemical Industry UK Ltd. (TCI) and analyze the cost of these important chemicals. This centralized, structured database reveals the MOF synthetic data embedded within thousands of MOF publications and contains further topology, metal type, accessible surface area, largest cavity diameter, pore limiting diameter, open metal sites, and density calculations for all 3D MOFs in the CSD MOF subset. 
The DigiMOF database and associated software are publicly available for other researchers to rapidly search for MOFs with specific properties, conduct further analysis of alternative MOF production pathways, and create additional parsers to search for additional desirable properties.
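As a rough illustration of what searching such a database for MOFs with specific properties might look like, the following minimal sketch filters a small in-memory list of records. The record fields, the `search` helper, and the sample entries are all hypothetical placeholders chosen for illustration; they are not the DigiMOF schema, the DigiMOF software interface, or real data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MofRecord:
    """Hypothetical record layout; field names are assumptions, not the DigiMOF schema."""
    name: str
    synthesis_method: Optional[str] = None
    solvent: Optional[str] = None
    topology: Optional[str] = None
    surface_area_m2_g: Optional[float] = None


def search(records: List[MofRecord], *, solvent: Optional[str] = None,
           topology: Optional[str] = None,
           min_surface_area: Optional[float] = None) -> List[MofRecord]:
    """Return the records that satisfy every criterion that was supplied."""
    hits = []
    for rec in records:
        # Text fields are compared case-insensitively; a missing value never matches.
        if solvent is not None and (rec.solvent or "").lower() != solvent.lower():
            continue
        if topology is not None and (rec.topology or "").lower() != topology.lower():
            continue
        # Records without a surface area are excluded when a minimum is requested.
        if min_surface_area is not None and (
            rec.surface_area_m2_g is None or rec.surface_area_m2_g < min_surface_area
        ):
            continue
        hits.append(rec)
    return hits


# Placeholder entries for demonstration only; the values are invented.
RECORDS = [
    MofRecord("example-mof-1", "solvothermal", "DMF", "pcu", 1200.0),
    MofRecord("example-mof-2", "solvothermal", "water", "dia", 450.0),
    MofRecord("example-mof-3", "mechanochemical", None, "pcu", None),
]

matches = search(RECORDS, solvent="dmf", min_surface_area=1000)
print([rec.name for rec in matches])
```

Missing values are treated as non-matches here, which seems a reasonable default for text-mined data where many properties are absent for a given material; a real query layer might instead report such records separately.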

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10269341
DOI: http://dx.doi.org/10.1021/acs.chemmater.3c00788

Publication Analysis

Top Keywords

digimof database: 12
csd mof: 12
mof subset: 12
mof: 10
journal articles: 8
unique mof: 8
mofs: 6
database: 5
database metal-organic: 4
metal-organic framework: 4
