ML-YS-MPEAs

A machine learning framework for training artificial neural networks to predict yield strength of multi-principal element alloys (MPEAs) using cross-validation and automated feature engineering.


Publication details:


The pre-trained yield strength model will soon be integrated into MAPAL alongside the existing phase selection and hardness prediction models.


Features

• Automated Neural Network Training: Build and train ANN models with customizable architectures
• Cross-Validation: K-fold cross-validation with multiple runs for robust model evaluation
• Feature Engineering: Automatic calculation of alloy features including:
  • Elemental properties (mean, variance, etc.)
  • Miedema enthalpy calculations (H_chem, H_el, H_mix, H_IM)
  • Phase fraction predictions (f_FCC, f_BCC, f_IM)
• Model Persistence: Automatic saving of trained models, predictions, and performance metrics
• Convergence Monitoring: Ensures models converge properly with error threshold checking
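
As a rough illustration of the "elemental properties (mean, variance, etc.)" features listed above, the sketch below computes a composition-weighted mean and variance of the valence electron concentration (VEC) for an equiatomic CoCrFeNi alloy. The helper names `weighted_mean` and `weighted_variance` are hypothetical and are not MAPAL's API; MAPAL's own definitions may differ in detail.

```python
def weighted_mean(fractions, values):
    """Composition-weighted mean: sum_i c_i * x_i."""
    return sum(c * x for c, x in zip(fractions, values))


def weighted_variance(fractions, values):
    """Composition-weighted variance: sum_i c_i * (x_i - mean)^2."""
    mean = weighted_mean(fractions, values)
    return sum(c * (x - mean) ** 2 for c, x in zip(fractions, values))


# Valence electron counts of the constituent elements.
VEC = {"Co": 9, "Cr": 6, "Fe": 8, "Ni": 10}

# Atomic fractions of an equiatomic alloy (illustrative input, not repository data).
composition = {"Co": 0.25, "Cr": 0.25, "Fe": 0.25, "Ni": 0.25}

elements = list(composition)
fractions = [composition[el] for el in elements]
values = [VEC[el] for el in elements]

vec_mean = weighted_mean(fractions, values)
vec_var = weighted_variance(fractions, values)
print(f"VEC mean = {vec_mean}, VEC variance = {vec_var}")
```

The same two functions can be reused for any tabulated elemental property, which is presumably how a feature such as (comp_avg,VEC) is formed.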

Dependencies

| Package | Purpose |
| --- | --- |
| pandas | Data manipulation and analysis |
| numpy | Numerical computing |
| tensorflow | Neural network framework |
| scikit-learn | Machine learning utilities |
| mapal | Alloy featurization and pre-trained model transfer learning |
| argparse | Command line argument parsing (part of the Python standard library, no installation needed) |
| openpyxl | Excel file reading |
| xlsxwriter | Excel file writing with multi-sheet support |

Installation

• Clone this repository
• Install required dependencies:

pip install pandas numpy tensorflow scikit-learn openpyxl xlsxwriter mapal

Usage

Command Line Arguments

| Argument | Type | Description | Required |
| --- | --- | --- | --- |
| -input_filepath | String | Path to the Excel input file containing model configurations | Yes |
| -modIdStart | Integer | Starting model ID from the input file | Yes |
| -modIdEnd | Integer | Ending model ID from the input file | Yes |
| -saveDir | String | Directory to save trained models | No (default: "trained-models-raw") |
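
The arguments above could be declared with `argparse` roughly as follows. This is a minimal sketch reconstructed from the table, not the actual contents of run-autoANN.py; the function name `build_parser` and the help strings are assumptions. The model ID range is treated as inclusive, which matches the single-model usage where both IDs are 1.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented arguments."""
    parser = argparse.ArgumentParser(description="Train ANN models listed in an Excel input file.")
    parser.add_argument("-input_filepath", type=str, required=True,
                        help="Path to the Excel input file with model configurations")
    parser.add_argument("-modIdStart", type=int, required=True,
                        help="Starting model ID from the input file")
    parser.add_argument("-modIdEnd", type=int, required=True,
                        help="Ending model ID from the input file")
    parser.add_argument("-saveDir", type=str, default="trained-models-raw",
                        help="Directory to save trained models")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(
    ["-input_filepath", "models_config.xlsx", "-modIdStart", "1", "-modIdEnd", "5"]
)

# Inclusive range of model IDs to train.
model_ids = list(range(args.modIdStart, args.modIdEnd + 1))
print(args.input_filepath, args.saveDir, model_ids)
```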

Basic Usage Examples

Single model training:

python run-autoANN.py -input_filepath models_config.xlsx -modIdStart 1 -modIdEnd 1

Multiple models training:

python run-autoANN.py -input_filepath models_config.xlsx -modIdStart 1 -modIdEnd 5

Custom save directory:

python run-autoANN.py -input_filepath models_config.xlsx -modIdStart 1 -modIdEnd 5 -saveDir my_models

Input File (input-autoANN.xlsx) Configuration

Required Columns in Excel Input File

| Column | Data Type | Description | Example |
| --- | --- | --- | --- |
| model_id | Integer | Unique identifier for each model | 1, 2, 3... |
| database_path | String | Path to the training database within the _database directory | "db-mpea-ys-ac-t25-seed1.xlsx" |
| x | String | Feature names as defined in MAPAL, separated by semicolons | "(asymmetry,r_cov);(comp_avg,VEC);H_el" |
| y | String | Target variable name | "YS_MPa" |
| cols_to_keep | String | Columns in the input database file to retain in the output (comma-separated) | "alloy_name,phases,phase_code" |
| layer_units | String | Neural network layer sizes (comma-separated) | "64,32,16" |
| activation_functions | String | Activation functions (comma-separated) | "relu,relu,sigmoid" |
| optimizer | String | Optimizer name | "Adam" |
| learning_rate | Float | Learning rate value | 0.001 |
| loss_function | String | Loss function name | "mae" |
| Kfold_split | Integer | Number of cross-validation folds | 5 |
| n_runs | Integer | Number of training runs, each with random data shuffling and model initialization | 3 |
| patience_val | Integer | Early stopping patience | 50 |
| max_epochs | Integer | Maximum training epochs | 1000 |
| check_error_value | Float | Error convergence threshold | 200 |
| check_after_iterations | Integer | Check convergence after N epochs | 100 |
| tensorboard_logs | Integer | Enable TensorBoard logging (1 or 'true') | 1 |
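
To make the string-encoded columns more concrete, here is a minimal sketch of how one row of the input file might be parsed and how `Kfold_split` and `n_runs` could combine into repeated K-fold splits. The functions `split_list`, `parse_row`, and `repeated_kfold` are hypothetical and use only the standard library as a stand-in for scikit-learn's `KFold`; the repository's own code may work differently. The check that layer sizes and activation functions have equal length is an assumption based on the example values.

```python
import random


def split_list(text, sep=","):
    """Split a delimited string and drop surrounding whitespace and empty items."""
    return [item.strip() for item in str(text).split(sep) if item.strip()]


def parse_row(row):
    """Turn the string-encoded columns of one configuration row into Python values."""
    cfg = {
        "features": split_list(row["x"], sep=";"),
        "layer_units": [int(u) for u in split_list(row["layer_units"])],
        "activations": split_list(row["activation_functions"]),
        "cols_to_keep": split_list(row["cols_to_keep"]),
        "tensorboard": str(row["tensorboard_logs"]).strip().lower() in {"1", "true"},
    }
    # Assumed constraint: one activation function per layer.
    if len(cfg["layer_units"]) != len(cfg["activations"]):
        raise ValueError("layer_units and activation_functions differ in length")
    return cfg


def repeated_kfold(n_samples, k, n_runs, seed=0):
    """Yield (run, fold, train_idx, val_idx), reshuffling the data in every run."""
    for run in range(n_runs):
        indices = list(range(n_samples))
        random.Random(seed + run).shuffle(indices)
        folds = [indices[i::k] for i in range(k)]
        for fold_id, val_idx in enumerate(folds):
            train_idx = [i for j, fold in enumerate(folds) if j != fold_id for i in fold]
            yield run, fold_id, train_idx, val_idx


# Example row using the values from the table above.
row = {
    "x": "(asymmetry,r_cov);(comp_avg,VEC);H_el",
    "layer_units": "64,32,16",
    "activation_functions": "relu,relu,sigmoid",
    "cols_to_keep": "alloy_name,phases,phase_code",
    "tensorboard_logs": 1,
    "Kfold_split": 5,
    "n_runs": 3,
}

cfg = parse_row(row)
splits = list(repeated_kfold(n_samples=10, k=row["Kfold_split"], n_runs=row["n_runs"]))
print(cfg["features"], cfg["layer_units"], len(splits))
```

With 5 folds and 3 runs, each model configuration is trained and evaluated 15 times, which is why the output files aggregate predictions and metrics over all runs.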

Output Structure

For each trained model, the following directory structure is created:

Directory/File Description
M-{model_id}/ Main model directory
├── M-{model_id}_inputParam.xlsx Input parameters used for training
├── M-{model_id}_xmin-feats.csv Feature minimum values for normalization
├── M-{model_id}_xmax-feats.csv Feature maximum values for normalization
├── M-{model_id}_CV-predictions-all-runs.xlsx Cross-validation predictions
├── M-{model_id}_CV-performance-all-runs.xlsx Performance metrics summary
├── M-{model_id}_model-checkpoints/ Training checkpoints directory
├── M-{model_id}_model-save-final/ Final trained models directory
├── M-{model_id}_csv-logs/ Training logs in CSV format
└── M-{model_id}_tensorboard-logs/ TensorBoard logs (if enabled)
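
The xmin and xmax files store per-feature minima and maxima for normalization. Assuming a standard min-max scaling to the range [0, 1] (the exact formula and CSV layout used by the repository are not stated here), new inputs would need to be scaled with the same stored values before prediction. The function `min_max_scale` and the numbers below are purely illustrative.

```python
def min_max_scale(values, xmin, xmax):
    """Scale each feature to [0, 1] using stored training minima and maxima."""
    scaled = []
    for value, lo, hi in zip(values, xmin, xmax):
        span = hi - lo
        # Guard against constant features, where max equals min.
        scaled.append((value - lo) / span if span != 0 else 0.0)
    return scaled


# Hypothetical stored minima and maxima for two features.
xmin = [4.0, 100.0]
xmax = [10.0, 300.0]

scaled = min_max_scale([7.0, 150.0], xmin, xmax)
print(scaled)
```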

Footnotes

  1. Department of Metallurgical and Materials Engineering, Indian Institute of Technology Ropar, Rupnagar 140001, Punjab, India
