Feat_emulators_module #603

rhugman · 2025-07-02T16:08:04Z

Add pyemu.emulators module for surrogate modeling capabilities

...thanks Claude for the PR description...

Work in progress. I would really appreciate outside opinions on what doesn't work well, what you would like done differently, and so on. I tried to design the classes in a sensible manner, but other sets of eyes would be appreciated!

Summary

This PR introduces a new pyemu.emulators module that provides a comprehensive framework for building and deploying surrogate models (emulators) for computationally expensive simulations. The module includes three main emulator types: Data Space Inversion (DSI), Gaussian Process Regression (GPR), and Learning-based Pattern-driven Forecast Approach (LPFA), along with a robust data transformation pipeline.

Key Features

🔧 Base Architecture

Base Emulator Class: Common interface for all emulator implementations with standardized fit(), predict(), save(), and load() methods
Flexible Transform Pipeline: Data transform and inverse-transform pipeline with support for log10, normal score, standard scaling, and min-max transformations
PEST++ Integration: integration with PEST++ workflows for optimization and uncertainty quantification
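
To make the common interface concrete, here is a minimal sketch of what a shared fit()/predict()/save()/load() contract could look like. The names `SketchEmulator` and `MeanEmulator` are hypothetical and invented for illustration; this is not the code of pyemu.emulators.Emulator, and pickle is used for persistence only as one simple assumption.

```python
import pickle
from abc import ABC, abstractmethod


class SketchEmulator(ABC):
    """Hypothetical base class, NOT the actual pyemu.emulators.Emulator."""

    def __init__(self):
        self.fitted = False

    @abstractmethod
    def fit(self, X, y):
        """Train on inputs X and targets y; return self."""

    @abstractmethod
    def predict(self, X):
        """Return one prediction per row of X."""

    def save(self, filename):
        # Serialize the whole object; pickle is one simple option (assumption).
        with open(filename, "wb") as f:
            pickle.dump(self, f)

    @classmethod
    def load(cls, filename):
        with open(filename, "rb") as f:
            obj = pickle.load(f)
        if not isinstance(obj, cls):
            raise TypeError(f"expected {cls.__name__}, got {type(obj).__name__}")
        return obj


class MeanEmulator(SketchEmulator):
    """Toy subclass: always predicts the mean of the training targets."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        self.fitted = True
        return self

    def predict(self, X):
        if not self.fitted:
            raise RuntimeError("call fit() before predict()")
        return [self.mean_ for _ in X]


emu = MeanEmulator().fit([[0.0], [1.0], [2.0]], [1.0, 2.0, 3.0])
print(emu.predict([[5.0], [6.0]]))
```

Keeping save() and load() on the base class means every subclass inherits the same persistence behaviour and only has to implement fit() and predict().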

🎯 Emulator Implementations

1. Data Space Inversion (DSI)

Based on Sun & Durlofsky (2017) methodology
Uses Singular Value Decomposition (SVD) for dimensionality reduction
Supports energy-based truncation for computational efficiency
Includes Data Space Inversion Variable Control (DSIVC) for multi-objective optimization
Automated PEST++ template generation for history matching and optimization workflows
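
The SVD and energy-based truncation steps listed above can be sketched with plain NumPy. This is only an illustration of the general technique (keep the smallest number of singular components whose squared singular values reach a target fraction of the total); the function name `truncated_svd_by_energy`, the `energy` argument, and the synthetic ensemble are assumptions, not the DSI class's actual code.

```python
import numpy as np


def truncated_svd_by_energy(ensemble, energy=0.99):
    """Mean-centre an (n_realizations, n_outputs) ensemble and truncate its SVD.

    Hypothetical helper for illustration only.
    """
    mean = ensemble.mean(axis=0)
    deviations = ensemble - mean
    U, s, Vt = np.linalg.svd(deviations, full_matrices=False)
    # Cumulative fraction of total "energy" (sum of squared singular values).
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    # First index at which the cumulative energy reaches the target, as a count.
    k = int(np.searchsorted(cumulative, energy)) + 1
    k = min(k, s.size)  # guard against round-off when energy is close to 1
    return mean, U[:, :k], s[:k], Vt[:k]


# Synthetic ensemble: 50 realizations of 20 outputs driven by 3 latent factors.
rng = np.random.default_rng(0)
ensemble = rng.normal(size=(50, 3)) @ rng.normal(size=(3, 20))

mean, U, s, Vt = truncated_svd_by_energy(ensemble, energy=0.999999)
reconstruction = mean + (U * s) @ Vt
print("components kept:", s.size)
```

Lowering `energy` keeps fewer components, which trades reconstruction accuracy for a smaller latent space.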

2. Gaussian Process Regression (GPR)

Scikit-learn based implementation with multiple kernel support
Uncertainty quantification through prediction standard deviations
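
As a point of reference for the two items above, the following sketch calls scikit-learn directly rather than the new GPR class. It shows how a kernel is selected and how `predict(..., return_std=True)` returns a standard deviation alongside each prediction. The training data and the `kernels` dictionary are made up for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, Matern, WhiteKernel

# Made-up training set: 30 samples, 2 inputs, 1 output.
rng = np.random.default_rng(0)
X_train = rng.uniform(0.0, 1.0, size=(30, 2))
y_train = np.sin(3.0 * X_train[:, 0]) + X_train[:, 1] ** 2

# Two candidate kernels; the small WhiteKernel term absorbs noise.
kernels = {
    "rbf": ConstantKernel(1.0) * RBF(length_scale=[1.0, 1.0]) + WhiteKernel(1e-4),
    "matern": ConstantKernel(1.0) * Matern(length_scale=[1.0, 1.0], nu=2.5)
    + WhiteKernel(1e-4),
}

gpr = GaussianProcessRegressor(kernel=kernels["matern"], normalize_y=True, random_state=0)
gpr.fit(X_train, y_train)

# return_std=True yields the predictive standard deviation per sample.
X_new = rng.uniform(0.0, 1.0, size=(5, 2))
mean, std = gpr.predict(X_new, return_std=True)
print(np.column_stack([mean, std]))
```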

3. Learning-based Pattern-driven Forecast Approach (LPFA)

Neural network-based emulator using scikit-learn MLPRegressor
Principal Component Analysis (PCA) for dimensionality reduction
Row-wise scaling for time-series data
Optional noise modeling for residual uncertainty
Early stopping and regularization support
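
The combination of PCA, an MLPRegressor, early stopping, and regularization can be sketched with scikit-learn alone. This is not the LPFA class itself: the synthetic decay curves, layer sizes, and hyperparameters are assumptions, and column-wise `StandardScaler` stands in for the row-wise scaling mentioned above to keep the example short.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data: 200 decay curves; the first 40 times are the "history",
# five later times are the "forecast".
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 40)
amplitude = rng.uniform(0.5, 2.0, size=(200, 1))
rate = rng.uniform(1.0, 3.0, size=(200, 1))
history = amplitude * np.exp(-rate * t)                  # shape (200, 40)
forecast = amplitude * np.exp(-rate * (1.0 + t[:5]))     # shape (200, 5)

model = make_pipeline(
    StandardScaler(),
    PCA(n_components=3),          # reduce the 40 inputs to 3 components
    MLPRegressor(
        hidden_layer_sizes=(32, 32),
        alpha=1e-3,               # L2 regularization strength
        early_stopping=True,      # hold out part of the data to decide when to stop
        validation_fraction=0.2,
        n_iter_no_change=20,
        max_iter=500,
        random_state=0,
    ),
)
model.fit(history, forecast)
predicted = model.predict(history[:4])
print(predicted.shape)
```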

🔄 Data Transformation Pipeline

AutobotsAssemble: Main transformation coordinator
Multiple Transformers: Log10, normal score, standard scaling, min-max scaling
Row-wise Scaling: Specialized for time-series and grouped data
Reversible Operations: Full inverse transformation support
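
The idea of a reversible pipeline can be illustrated with a small NumPy sketch: steps are applied in order on the way in and inverted in reverse order on the way out. The classes `Log10Step`, `StandardScaleStep`, and `SketchPipeline` are hypothetical names for illustration and do not reproduce the AutobotsAssemble API.

```python
import numpy as np


class Log10Step:
    """Hypothetical step: log10 forward, power of ten inverse."""

    def fit(self, X):
        return self

    def transform(self, X):
        return np.log10(X)

    def inverse_transform(self, Z):
        return 10.0 ** Z


class StandardScaleStep:
    """Hypothetical step: column-wise zero mean and unit variance."""

    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        std = X.std(axis=0)
        self.std_ = np.where(std > 0.0, std, 1.0)  # avoid dividing by zero
        return self

    def transform(self, X):
        return (X - self.mean_) / self.std_

    def inverse_transform(self, Z):
        return Z * self.std_ + self.mean_


class SketchPipeline:
    """Apply steps in order; invert them in reverse order."""

    def __init__(self, steps):
        self.steps = list(steps)

    def fit_transform(self, X):
        Z = np.asarray(X, dtype=float)
        for step in self.steps:
            Z = step.fit(Z).transform(Z)
        return Z

    def inverse_transform(self, Z):
        X = np.asarray(Z, dtype=float)
        for step in reversed(self.steps):
            X = step.inverse_transform(X)
        return X


rng = np.random.default_rng(0)
data = rng.lognormal(mean=0.0, sigma=1.0, size=(100, 3))  # strictly positive
pipe = SketchPipeline([Log10Step(), StandardScaleStep()])
scaled = pipe.fit_transform(data)
restored = pipe.inverse_transform(scaled)
print(np.allclose(restored, data))
```

Each step stores whatever it learned during fit(), so predictions made in transformed space can be mapped back to original units.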

Technical Implementation

Core Classes

# Base class
pyemu.emulators.Emulator

# Emulator implementations  
pyemu.emulators.DSI
pyemu.emulators.GPR
pyemu.emulators.LPFA

# Transformation framework
pyemu.emulators.transformers.AutobotsAssemble

PEST++ Integration

Automatic template folder generation
PyWorker helper functions for DSI and GPR

Example Usage

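import pyemu
from pyemu.emulators import DSI, GPR, LPFA

# observation_ensemble, training_data, data, new_data, inputs, outputs, groups,
# fit_groups, and forecasts are placeholders for user-supplied objects.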

# DSI Emulator
dsi = DSI(data=observation_ensemble, transforms=[
    {'type': 'normal_score', 'quadratic_extrapolation': True}
])
dsi.fit()
pst_dsi = dsi.prepare_pestpp("template_dir")

# GPR Emulator  
gpr = GPR(data=training_data, input_names=inputs, output_names=outputs)
gpr.fit()
gpr.prepare_pestpp("pest_dir", "case_name")

# LPFA Emulator
lpfa = LPFA(data=data, input_names=inputs, groups=groups, 
            fit_groups=fit_groups, output_names=forecasts)
lpfa.fit(epochs=200)
predictions = lpfa.predict(new_data)

Testing

Test suite in emulator_tests.py
Tests for all three emulator types with various transformation combinations
Integration tests with PEST++ workflows
Verification against known analytical solutions (ZDT1 benchmark)
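
For readers unfamiliar with it, ZDT1 is a standard two-objective benchmark whose Pareto front is known in closed form. The sketch below gives the usual textbook definition; how the test suite actually sets up and uses ZDT1 is not shown in this PR description, so the function name and calls here are illustrative only.

```python
import numpy as np


def zdt1(x):
    """ZDT1 objectives for decision variables x in [0, 1]^n, n >= 2."""
    x = np.asarray(x, dtype=float)
    f1 = x[0]
    g = 1.0 + 9.0 * x[1:].sum() / (x.size - 1)
    f2 = g * (1.0 - np.sqrt(f1 / g))
    return f1, f2


# On the Pareto front all x[1:] are zero, so g = 1 and f2 = 1 - sqrt(f1).
on_front = zdt1([0.25, 0.0, 0.0])
off_front = zdt1([0.25, 1.0, 1.0])
print(on_front, off_front)
```

Because the front is known analytically, an emulator-driven optimization can be checked by comparing its non-dominated points against f2 = 1 - sqrt(f1).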

Breaking Changes

Legacy GPR helper functions are still supported
Legacy DSI helper functions are broken.

Dependencies

numpy, pandas (existing pyemu dependencies)
scikit-learn (for GPR and LPFA implementations)
Standard library modules: os, shutil, pickle, inspect


Rui Hugman and others added 30 commits June 16, 2025 15:13

introducing transformer classes and pipeline

ce5dff3

transformer tests

f3c45ac

dsi initial commit

d1d684e

refactor dsi helper functions

0337259

refactor dsi out of EnDS

4848655

initial tests commit

c776d8f

Portmanager class for dsivc

2b73575

adding ies_exe path arg to dsivc_fwd run fnx to deal with pytest

069fadc

updates to dsivc for pytest'

eb26410

checkin dsi

4dcbeb4

docstrings

5b1ad83

init

7d3fbff

init

8f57091

fix to dsi tests

e83f18f

moved dsi tests to dsi_tests.py

61e8d95

moved dsi tests to dsi_tests.py

042c96a

docstrings

8a817a5

use class save instead of pickle

18ed8ed

checkin baseline ldfa with sklearn

19b3801

rename test file

133e0de

rename ldfa to lpfa

c7cfadc

lpfa test

d4ae84e

added transform pipeline test for ldfa

6a8900b

refactor StandardSclaer to use sklearn

b2761dc

fix imports

9e62644

refactor naming and streamline emulator building workflow

444591b

functional gpr class + pestpp setup

23f57ac

gpr tests

a50fe51

general fixes to ppw

a7f3a6f

refactored gpr helper fnxs to maintain legacy, but also use new GPR c…

9937420

…lass

rhugman and others added 8 commits July 2, 2025 17:25

init updates

3c942a0

fix to utils gpr test

7bd1807

Merge branch 'develop' into feat_emulators2

75cd698

fi to grp_pyworker

3133d09

fix legacy gpr oyworker handling

c6b1f0c

Merge branch 'feat_emulators2' of https://github.com/rhugman/pyemu in…

b00a4c5

…to feat_emulators2

mystery of the disapearing t_d argument

1156903

checkin tests

b27d95f
