Overview
The central goal of Materials Science and Engineering is to discover and deploy innovative materials that serve society. Over the last two decades, the Materials by Design (MbD) paradigm has provided a framework for materials development that depends on the integration of computational and experimental methods. Since 2011, the Materials Genome Initiative (MGI) has made theory- and data-led approaches the centerpiece of accelerating materials design.
The HTMDEC initiative embraces data-centric methods, with a specific focus on AI and machine learning, to accelerate the development of materials in high-throughput research collaborations. A central challenge is providing seamless, integrated access to heterogeneous data and the associated analyses needed to drive workflows across distributed and diverse research teams.
In response to this challenge, we are developing data infrastructure that supports the automated curation of high-throughput materials research data using collaboratively developed semantics and a graphical data model that can be used to drive AI and ML workflows. At the core of our approach is innovative data infrastructure that automates the capture and curation of both raw and derived research products, along with their associated analyses, and enables access to them in support of the FAIR data and software principles.
This includes:
- Automated data streaming, capture, and transformation via OpenMSIStream
- Scalable and extensible data management and access via Girder
- Transparent and reproducible data analysis via Whole Tale
- ML-centric graphical data model via GEMD
- Unified sample system and data annotation
See also:
[Administrator’s Guide](admin-guide)