https://github.com/Harrison-Teeg/LECA/raw/main/images/leca_logo.png

LECA - The Liquid Electrolyte Composition Analysis Package

The Liquid Electrolyte Composition Analysis (LECA) package provides a simplified, Jupyter Notebook [1] based workflow for applying scikit-learn [2] machine learning regression models to predict liquid electrolyte behavior from composition.

Requirements

Python 3.7+, Jupyter Notebook 6.4.11+

With the following Python libraries:
  • Scikit-learn 1.3.1+

  • NumPy 1.22.3+

  • Matplotlib 3.5.1+

  • Pandas 1.4.2+

  • SciPy 1.8.1+

  • Uncertainties 3.1.7+

  • MAPIE 0.5.6

  • HDBSCAN 0.8.28+

  • Seaborn 0.11.2+

  • GPyOpt 1.2.6+
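The version list above can be checked against the active environment with a short script. This is a minimal sketch, not part of LECA: the distribution names are assumed to be the usual PyPI names, the helper names are made up for illustration, and every entry is treated as a lower bound (MAPIE is listed without a "+", so a newer MAPIE may still need manual attention). It relies on importlib.metadata, which is in the standard library from Python 3.8 onward.

```python
import re
from importlib import metadata  # standard library in Python 3.8+

# Minimum versions copied from the requirements list; the keys are assumed
# PyPI distribution names and may need adjusting for your environment.
MINIMUM_VERSIONS = {
    "scikit-learn": "1.3.1",
    "numpy": "1.22.3",
    "matplotlib": "3.5.1",
    "pandas": "1.4.2",
    "scipy": "1.8.1",
    "uncertainties": "3.1.7",
    "MAPIE": "0.5.6",
    "hdbscan": "0.8.28",
    "seaborn": "0.11.2",
    "GPyOpt": "1.2.6",
}


def version_tuple(version):
    """Turn '1.22.3' (or '1.22.3rc1') into (1, 22, 3) for a rough comparison."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def check_requirement(name, minimum):
    """Return 'missing', 'too old', or 'ok' for one distribution."""
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        return "missing"
    if version_tuple(installed) < version_tuple(minimum):
        return "too old"
    return "ok"


if __name__ == "__main__":
    for name, minimum in MINIMUM_VERSIONS.items():
        print(f"{name:>14} >= {minimum}: {check_requirement(name, minimum)}")
```

The comparison only looks at the leading numeric components of each version, which is enough for a quick sanity check but is not a full version parser.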

Installation

The LECA package can be installed directly from source with:

pip install git+https://github.com/Harrison-Teeg/LECA.git

Envisaged LECA Workflow

https://github.com/Harrison-Teeg/LECA/raw/main/images/LECA_overview.png

Data Import and Feature Engineering

  • Import data from JSON or CSV files

  • Visualize the dataset with a feature overview and an interactive data visualizer

  • Identify and filter outlier values using HDBSCAN [3]

  • Manually filter nonsensical values with explicit, user-defined bounds

  • Combine repeated measurements and record statistical behavior (measurement noise)

  • Generate surrogate models for Arrhenius behavior or other user-defined values
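Two of the steps above, combining repeated measurements and fitting an Arrhenius surrogate, can be sketched in plain Python. This is an illustration under stated assumptions, not LECA's API: the function names and the record layout (composition, temperature in kelvin, measured value) are hypothetical, and the surrogate is the textbook form value = A * exp(-Ea / (R * T)), fitted by least squares on ln(value) against 1/T.

```python
import math
from collections import defaultdict
from statistics import mean, stdev

R = 8.314462618  # molar gas constant in J/(mol K)


def combine_repeats(records):
    """Group (composition, temperature_K, value) records.

    Returns {(composition, temperature_K): (mean, sample_std, count)}.
    A group with a single measurement gets a spread of 0.0.
    """
    groups = defaultdict(list)
    for composition, temperature, value in records:
        groups[(composition, temperature)].append(value)
    combined = {}
    for key, values in groups.items():
        spread = stdev(values) if len(values) > 1 else 0.0
        combined[key] = (mean(values), spread, len(values))
    return combined


def fit_arrhenius(temperatures, values):
    """Least-squares fit of ln(value) = ln(A) - Ea / (R * T).

    Returns (A, Ea) with Ea in J/mol. Values must be positive.
    """
    x = [1.0 / t for t in temperatures]
    y = [math.log(v) for v in values]
    x_bar, y_bar = mean(x), mean(y)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    intercept = y_bar - slope * x_bar
    return math.exp(intercept), -slope * R


if __name__ == "__main__":
    # Invented example data: one composition measured twice at each temperature.
    records = [
        ("mix-1", 273.15, 4.1), ("mix-1", 273.15, 4.3),
        ("mix-1", 293.15, 6.8), ("mix-1", 293.15, 7.0),
        ("mix-1", 313.15, 10.2), ("mix-1", 313.15, 10.6),
    ]
    combined = combine_repeats(records)
    temps = [key[1] for key in combined]
    means = [stats[0] for stats in combined.values()]
    prefactor, activation_energy = fit_arrhenius(temps, means)
    print(f"A = {prefactor:.3g}, Ea = {activation_energy / 1000:.2f} kJ/mol")
```

Keeping the per-group standard deviation alongside the mean preserves an estimate of the measurement noise, which can later be compared with model error.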

Initialize Regression Models and Compare Results

  • Data splitting and scaling are handled automatically

  • Declare regression models to implement (supports N-dimensional feature/objective space)
    • Linear Regression (LR)

    • Gaussian Process Regression (GPR) (supports isotropic/anisotropic RBF, Matérn, Rational Quadratic (RQ), and custom kernels)

    • Neural Network (NN)

    • Random Forest (RF)

  • Hyperparameter Optimization for NN and RF with GPyOpt [4]

  • Customized polynomial selection for LR [5]

  • Cross-validated scoring and visualization of models for a simple overview of comparative performance

  • Ensemble-based uncertainty estimation for LR / NN / RF models using MAPIE [6]

  • Validate performance of models on unseen validation data
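To make the comparison step concrete, the following sketch scores three of the model families listed above with 5-fold cross-validation and then checks them on held-out data. It uses scikit-learn directly rather than LECA's own interface, which is not shown here. The synthetic data, the model settings, and the variable names are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic stand-in data: two composition features and one smooth objective.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(120, 2))
y = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2 + rng.normal(0.0, 0.05, size=120)

# Hold out validation data that the models never see during cross-validation.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Each pipeline scales the features first, so scaling is refit inside every fold.
models = {
    "LR (degree 2)": make_pipeline(
        StandardScaler(), PolynomialFeatures(degree=2), LinearRegression()
    ),
    "RF": make_pipeline(
        StandardScaler(), RandomForestRegressor(n_estimators=100, random_state=0)
    ),
    "GPR (anisotropic RBF)": make_pipeline(
        StandardScaler(),
        GaussianProcessRegressor(
            # One length scale per feature makes the RBF kernel anisotropic.
            kernel=RBF(length_scale=[1.0, 1.0]) + WhiteKernel(noise_level=0.1),
            normalize_y=True,
            random_state=0,
        ),
    ),
}

cv = KFold(n_splits=5, shuffle=True, random_state=0)
results = {}
for name, model in models.items():
    cv_scores = cross_val_score(model, X_train, y_train, cv=cv, scoring="r2")
    model.fit(X_train, y_train)  # refit on all training data before validating
    results[name] = (cv_scores.mean(), cv_scores.std(), model.score(X_val, y_val))

for name, (cv_mean, cv_std, val_r2) in results.items():
    print(f"{name:<22} CV R2 = {cv_mean:.3f} +/- {cv_std:.3f}, validation R2 = {val_r2:.3f}")
```

Wrapping the scaler and the regressor in one pipeline keeps the scaling statistics from leaking between training and test folds.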

Analyze Objective Function for Compositions

  • Interactive widgets to visualize objective function and model uncertainty for various compositions

  • Return the optimal composition that maximizes/minimizes the objective function

  • Ranked Batch Mode Active Learning (RBMAL) module based on the approach of Cardoso et al. [7]
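The idea behind ranked batch selection can be illustrated with a small NumPy sketch. It is loosely modeled on the general ranked batch-mode scheme and is not a reproduction of the cited paper or of LECA's module. The function name, the Euclidean similarity measure, and the weighting alpha = |pool| / (|pool| + |labeled|) are assumptions. Each round picks the pool point with the best mix of model uncertainty and distance from already-covered points, then treats that point as covered.

```python
import numpy as np


def ranked_batch(X_labeled, X_pool, uncertainty, batch_size):
    """Return pool indices in the order they would be proposed for measurement.

    uncertainty holds one value per pool row, assumed to be scaled to [0, 1]
    (for example a normalized predictive standard deviation).
    """
    X_labeled = np.asarray(X_labeled, dtype=float)
    X_pool = np.asarray(X_pool, dtype=float)
    uncertainty = np.asarray(uncertainty, dtype=float)

    # Distance from every pool point to its nearest labeled point.
    diffs = X_pool[:, None, :] - X_labeled[None, :, :]
    nearest = np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)

    available = np.ones(len(X_pool), dtype=bool)
    n_labeled = len(X_labeled)
    selected = []
    for _ in range(min(batch_size, len(X_pool))):
        n_pool = int(available.sum())
        # Assumed weighting: favor exploration while the pool is comparatively large.
        alpha = n_pool / (n_pool + n_labeled)
        similarity = 1.0 / (1.0 + nearest)
        score = alpha * (1.0 - similarity) + (1.0 - alpha) * uncertainty
        score[~available] = -np.inf  # never pick the same point twice
        best = int(np.argmax(score))
        selected.append(best)
        available[best] = False
        # Treat the chosen point as labeled so that its neighbors lose novelty.
        n_labeled += 1
        nearest = np.minimum(nearest, np.linalg.norm(X_pool - X_pool[best], axis=1))
    return selected


if __name__ == "__main__":
    labeled = [[0.0, 0.0]]
    pool = [[0.1, 0.0], [1.0, 1.0], [1.05, 1.0], [0.0, 1.0]]
    pool_uncertainty = [0.2, 0.9, 0.9, 0.5]
    print(ranked_batch(labeled, pool, pool_uncertainty, batch_size=2))
```

In the usage example, two pool points sit almost on top of each other with equally high uncertainty. Once one of them is chosen, the other loses most of its novelty score, so the batch spreads out instead of sampling the same region twice.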

Areas of Further Development

Multi-objective optimization: identifying Pareto fronts across multiple objectives of an electrolyte composition (e.g. electrochemical stability and conductivity).
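For readers unfamiliar with the term, a Pareto front is the set of candidates that no other candidate beats or matches on every objective while being strictly better on at least one. The sketch below is only an illustration of that definition and not a planned LECA interface. The function name is hypothetical, the numbers are invented, and every objective is assumed to be maximized.

```python
def pareto_front(points):
    """Return the non-dominated points, assuming every objective is maximized."""
    front = []
    for i, candidate in enumerate(points):
        dominated = any(
            # 'other' dominates if it is at least as good everywhere
            # and strictly better in at least one objective.
            all(o >= c for o, c in zip(other, candidate))
            and any(o > c for o, c in zip(other, candidate))
            for j, other in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(candidate)
    return front


if __name__ == "__main__":
    # Invented (stability, conductivity) pairs in arbitrary units.
    candidates = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1), (3, 1)]
    print(pareto_front(candidates))
```

This brute-force check compares every pair of candidates, which is adequate for small candidate sets. An objective that should be minimized can be negated before calling the function.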

References

Acknowledgments

This project has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreements No 957189 (BIG-MAP) and No 957213 (BATTERY2030+).