
Machine learning for deep elastic strain engineering of semiconductor electronic band structure and effective mass

| Metadata | Details |
| --- | --- |
| Publication Date | 2021-05-28 |
| Journal | npj Computational Materials |
| Authors | Evgenii Tsymbalov, Zhe Shi, Ming Dao, Subra Suresh, Ju Li |
| Institutions | Nanyang Technological University, Skolkovo Institute of Science and Technology |
| Citations | 37 |
| Analysis | Full AI Review Included |

This research introduces a highly efficient, physics-informed Machine Learning (ML) framework designed to accelerate the discovery and optimization of “deep” Elastic Strain Engineering (ESE) pathways in semiconductors.

  • Core Value Proposition: The framework overcomes the computational bottleneck of traditional ab initio methods (Density Functional Theory/GW corrections) by using a Convolutional Neural Network (CNN) to predict complex electronic properties across the vast six-dimensional (6D) strain space.
  • ML Architecture: A CNN model represents the electronic band structure as a “digital image” in k-space, incorporating known physical symmetries (time-reversal, k-space periodicity) and correlations (intra-band and inter-band) for enhanced accuracy and speed.
  • Performance Metrics: The CNN achieves a relative error of less than 0.5% in band structure prediction compared to high-fidelity GW calculations, while offering inference speeds two orders of magnitude faster than competing Kernel Ridge Regression (KRR) models.
  • Training Efficiency: A knowledge transfer strategy, involving pre-training on low-fidelity PBE data followed by data fusion with high-fidelity GW data, significantly reduced the total ab initio computational cost.
  • Active Learning: An integrated active learning cycle uses uncertainty estimation to select the most informative strain states for calculation autonomously, further reducing the required training data by a factor of two to three.
  • Diamond Optimization: Applied to diamond, the model successfully mapped energy-efficient strain pathways to achieve critical transitions, including indirect-to-direct bandgap conversion and insulator-to-metal transition (0 eV bandgap) at low elastic strain energy densities.
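
The "digital image" representation and the symmetries listed above can be illustrated with a short NumPy sketch. It is not the authors' network: the band data are a hypothetical toy function, the channels-last array layout `E[i, j, l, n]` is an assumption, and the two filters use fixed weights where a trained CNN would learn them. The sketch only shows how periodic boundary handling, the time-reversal check, and the separation of intra-band and inter-band kernels could look in code.

```python
import itertools
import numpy as np

M, N_BANDS = 8, 4  # 8 x 8 x 8 mesh as quoted in the summary; 4 bands is an arbitrary choice

def toy_bands(m=M, n_bands=N_BANDS):
    """Hypothetical smooth, periodic bands E[i, j, l, n] (not diamond data)."""
    k = 2 * np.pi * np.arange(m) / m
    kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
    shape = np.cos(kx) + np.cos(ky) + np.cos(kz)  # even in k and periodic
    return np.stack([(n + 1) * shape + 3.0 * n for n in range(n_bands)], axis=-1)

def time_reversal_partner(E):
    """Return E'[i, j, l, n] = E[-i mod m, -j mod m, -l mod m, n]."""
    flipped = np.flip(E, axis=(0, 1, 2))
    # Rolling by one keeps index 0 (the Gamma point) fixed after the flip.
    return np.roll(flipped, shift=1, axis=(0, 1, 2))

def intra_band_filter(E, kernel):
    """Periodic 3x3x3x1 convolution: mixes neighbouring k-points, never bands."""
    out = np.zeros_like(E)
    for a, b, c in itertools.product((-1, 0, 1), repeat=3):
        shifted = np.roll(E, shift=(a, b, c), axis=(0, 1, 2))  # wraps around the zone
        out += kernel[a + 1, b + 1, c + 1] * shifted
    return out

def inter_band_filter(E, weights):
    """1x1x1x3 convolution: mixes adjacent bands at the same k-point.

    Edge padding for the lowest and highest band is an arbitrary choice here.
    """
    padded = np.pad(E, ((0, 0), (0, 0), (0, 0), (1, 1)), mode="edge")
    n_bands = E.shape[-1]
    return sum(w * padded[..., i:i + n_bands] for i, w in enumerate(weights))

E = toy_bands()
smooth = intra_band_filter(E, np.full((3, 3, 3), 1.0 / 27.0))
mixed = inter_band_filter(E, np.array([0.25, 0.5, 0.25]))
```

Wrapping the k axes with `np.roll` is what encodes k-space periodicity; a filter that zero-padded the mesh edges would treat the zone boundary as a physical edge, which it is not.
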

| Parameter | Value | Unit | Context |
| --- | --- | --- | --- |
| Undeformed Diamond Bandgap (Eg) | 5.6 | eV | Wide bandgap semiconductor reference. |
| Maximum Elastic Strain Range (Normal) | ±15 | % | Range sampled in the 6D strain space. |
| Maximum Elastic Strain Range (Shear) | ±10 | % | Range sampled in the 6D strain space. |
| ML Relative Error (Band Structure) | <0.5 | % | Accuracy achieved against GW calculations. |
| Minimum Strain Energy Density (h) for Direct Bandgap | ~20 | meV/Å³ | Energy-efficient pathway identified for diamond. |
| Longitudinal Effective Mass (mL) | 1.55 | me (free electron mass) | Conduction Band Minimum (CBM) in undeformed diamond. |
| Transverse Effective Mass (mT) | 0.31 | me (free electron mass) | CBM in undeformed diamond. |
| DFT Plane Wave Energy Cutoff | 600 | eV | First-principles calculation setting (VASP/PBE). |
| ML k-Space Mesh Resolution | 8 x 8 x 8 | points | Input tensor dimension for the CNN model. |
| CNN Parameter Count | ~276,000 | parameters | Total parameters in the three-block CNN model. |
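
The strain space is six-dimensional because a symmetric 3 x 3 strain tensor has three independent normal and three independent shear components. As a rough illustration of the ranges listed above, the sketch below draws random strain tensors inside those bounds. Uniform sampling, the component ordering, and the function names are assumptions made for this example; the summary does not state the sampling scheme the authors used, nor whether the shear limit refers to tensorial or engineering shear.

```python
import numpy as np

NORMAL_MAX, SHEAR_MAX = 0.15, 0.10  # +/-15 % normal and +/-10 % shear, as quoted above

def sample_strains(n_samples, seed=0):
    """Draw symmetric 3x3 strain tensors with uniformly distributed components."""
    rng = np.random.default_rng(seed)
    normal = rng.uniform(-NORMAL_MAX, NORMAL_MAX, size=(n_samples, 3))
    shear = rng.uniform(-SHEAR_MAX, SHEAR_MAX, size=(n_samples, 3))
    eps = np.zeros((n_samples, 3, 3))
    eps[:, 0, 0], eps[:, 1, 1], eps[:, 2, 2] = normal.T
    eps[:, 1, 2] = eps[:, 2, 1] = shear[:, 0]  # yz
    eps[:, 0, 2] = eps[:, 2, 0] = shear[:, 1]  # xz
    eps[:, 0, 1] = eps[:, 1, 0] = shear[:, 2]  # xy
    return eps

def to_six_vector(eps):
    """Flatten each tensor to (e_xx, e_yy, e_zz, e_yz, e_xz, e_xy)."""
    return np.stack([eps[:, 0, 0], eps[:, 1, 1], eps[:, 2, 2],
                     eps[:, 1, 2], eps[:, 0, 2], eps[:, 0, 1]], axis=1)

strains = sample_strains(1000)
inputs = to_six_vector(strains)  # one 6-component row per candidate strain state
```

A real workflow would still have to discard any sampled state that fails a stability check before running first-principles calculations on it; that filter is not part of this sketch.
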

The ML framework relies on a three-part training process—preliminary training, data fusion, and active learning—to efficiently construct a high-accuracy surrogate model for the electronic band structure En(k; Δ).

  1. First-Principles Data Generation:

    • DFT-PBE Calculations: Used the Projector Augmented Wave (PAW) method and the PBE functional (Perdew-Burke-Ernzerhof) with a 600 eV energy cutoff to generate a large, low-fidelity dataset (~35,000 strain samples).
    • GW Calculations: A smaller, high-fidelity dataset (~6,000 samples) was generated using many-body GW corrections (partially self-consistent GW0) on top of PBE settings for improved bandgap accuracy.
    • Stability Check: Phonon calculations were performed to ensure that sampled strain states remained within the phonon-stable elastic regime, avoiding phase transitions or fracture.
  2. Physics-Informed CNN Architecture:

    • Input Representation: The 6D strain tensor (Δ) is passed through fully connected layers and reshaped into a rank-5 tensor, representing the band structure as N stacked 3D images (m x m x m voxels).
    • Convolutional Blocks: The CNN uses specialized kernels to capture physical correlations:
      • Intra-band correlation: 3x3x3x1 kernel applied across adjacent k-points within the same band (ensuring piecewise smoothness).
      • Inter-band correlation: 1x1x1x3 kernel applied across different bands at the same k-point.
    • Symmetry Constraints: The architecture inherently accounts for time-reversal symmetry (En(-k) = En(k)) and k-space periodicity (reduced zone scheme).
  3. Knowledge Transfer and Data Fusion:

    • Pre-training: The CNN was initially trained on the large, computationally cheap DFT-PBE dataset to learn general features of the band structure deformation.
    • Fusion: The learned network parameters were used as a starting point for training on the smaller, costly GW dataset, exploiting the knowledge gathered from the PBE data to stabilize and accelerate high-accuracy training.
  4. Active Learning Cycle:

    • Uncertainty Estimation: Dropout-based inference, enhanced with Gaussian processes, was used to quantify the model’s uncertainty (expected error) for unsampled strain states.
    • Autonomous Sampling: In iterative cycles, approximately 200 strain cases exhibiting the highest uncertainty were automatically selected and added to the GW training set, significantly reducing the total number of ab initio calculations required.
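
One acquisition step of the active learning cycle described above might be sketched as follows. This is a simplified stand-in, not the authors' implementation: the tiny two-layer network with random weights, the candidate pool, and all function names are hypothetical, and the Gaussian-process enhancement mentioned in the text is left out. Only the core idea is shown: keep dropout active at inference, use the spread of repeated predictions as the uncertainty, and pick the 200 most uncertain strain states.

```python
import numpy as np

def mc_dropout_predict(x, w1, w2, n_passes=50, p_drop=0.1, seed=0):
    """Run n_passes stochastic forward passes of a tiny two-layer network.

    Dropout stays active at inference, so the spread across passes serves as
    an uncertainty estimate. Returns (mean, std), each of shape (n_candidates,).
    """
    rng = np.random.default_rng(seed)
    hidden = np.maximum(x @ w1, 0.0)  # ReLU hidden layer
    outputs = []
    for _ in range(n_passes):
        mask = rng.random(hidden.shape) >= p_drop  # drop each unit with probability p_drop
        outputs.append(((hidden * mask) / (1.0 - p_drop)) @ w2)
    outputs = np.stack(outputs)  # shape (n_passes, n_candidates)
    return outputs.mean(axis=0), outputs.std(axis=0)

def select_most_uncertain(std, batch_size=200):
    """Indices of the batch_size candidates with the largest uncertainty."""
    return np.argsort(std)[::-1][:batch_size]

rng = np.random.default_rng(1)
candidates = rng.uniform(-0.1, 0.1, size=(5000, 6))  # hypothetical pool of unlabeled strains
w1 = rng.normal(size=(6, 64))  # random weights stand in for a trained model
w2 = rng.normal(size=64)
mean, std = mc_dropout_predict(candidates, w1, w2)
chosen = select_most_uncertain(std)
```

In a full cycle, the strain states in `chosen` would be sent to the high-fidelity calculation, appended to the training set, and the model retrained before the next round of selection.
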

The ability to rapidly and accurately predict electronic properties under extreme mechanical stress opens new avenues for designing next-generation semiconductor devices, particularly those utilizing diamond or strained silicon.

| Industry/Sector | Application Context | Relevance to ESE/ML Framework |
| --- | --- | --- |
| High-Power Electronics | Designing high-efficiency, radiation-hardened power devices (diodes, MOSFETs) capable of operating at high temperatures and voltages. | ESE allows for precise tuning of the bandgap (Eg) and effective mass (m*), maximizing breakdown voltage and minimizing switching losses in wide bandgap materials like diamond. |
| Nanophotonics and Optoelectronics | Developing integrated light sources (LEDs) and high-speed photodetectors from indirect bandgap materials. | The ML model identifies energy-efficient strain states that induce the critical indirect-to-direct bandgap transition, enabling diamond to function as an efficient light emitter. |
| Quantum Information Processing | Engineering the local strain environment around color centers (e.g., NV centers in diamond) to optimize spin coherence and control. | The framework provides accurate prediction of band dispersion and effective mass, which are crucial inputs for modeling strain effects on defect energy levels and quantum properties. |
| High-Speed Logic and RF Devices | Enhancing carrier mobility in nanoscale semiconductor channels (e.g., silicon nanowires or strained diamond films). | ESE is used to minimize the conductivity effective mass (mcond), directly increasing carrier mobility for faster switching speeds in logic and radio-frequency circuits. |
| Materials Design and Screening | Rapidly exploring the vast 6D strain space for new material phases or optimal operating conditions for functional materials beyond diamond (e.g., Si, SiC, GaN). | The general ML architecture serves as a fast, accurate surrogate model, replacing costly DFT/GW calculations in the initial screening phase of any ESE-based materials design project. |
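
To make the conductivity effective mass mentioned above concrete, the sketch below evaluates the standard expression for equivalent ellipsoidal conduction valleys in a cubic semiconductor, mcond = 3 / (1/mL + 2/mT), using the undeformed-diamond values quoted earlier (mL = 1.55 me, mT = 0.31 me). Whether the paper uses exactly this definition is not stated in this summary, so the formula choice and the function name are assumptions; under strain the valleys generally stop being equivalent and this simple average no longer applies.

```python
def conductivity_effective_mass(m_l, m_t):
    """Harmonic average over one longitudinal and two transverse directions.

    Masses are in units of the free electron mass; assumes equivalent,
    ellipsoidal valleys (an assumption that holds only without strain).
    """
    return 3.0 / (1.0 / m_l + 2.0 / m_t)

# Undeformed diamond values quoted in this review.
m_cond = conductivity_effective_mass(1.55, 0.31)
```

With these inputs the result is about 0.42 me, much closer to the light transverse mass than to the heavy longitudinal one, because the harmonic average is dominated by the two transverse directions.
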

Abstract: The controlled introduction of elastic strains is an appealing strategy for modulating the physical properties of semiconductor materials. With the recent discovery of large elastic deformation in nanoscale specimens as diverse as silicon and diamond, employing this strategy to improve device performance necessitates first-principles computations of the fundamental electronic band structure and target figures-of-merit, through the design of an optimal straining pathway. Such simulations, however, call for approaches that combine deep learning algorithms and physics of deformation with band structure calculations to custom-design electronic and optical properties. Motivated by this challenge, we present here details of a machine learning framework involving convolutional neural networks to represent the topology and curvature of band structures in k-space. These calculations enable us to identify ways in which the physical properties can be altered through “deep” elastic strain engineering up to a large fraction of the ideal strain. Algorithms capable of active learning and informed by the underlying physics were presented here for predicting the bandgap and the band structure. By training a surrogate model with ab initio computational data, our method can identify the most efficient strain energy pathway to realize physical property changes. The power of this method is further demonstrated with results from the prediction of strain states that influence the effective electron mass. We illustrate the applications of the method with specific results for diamonds, although the general deep learning technique presented here is potentially useful for optimizing the physical properties of a wide variety of semiconductor materials.