Research – Tianyou Li (李天佑)

Research Philosophy

My research is driven by a shift from merely fitting existing data to strategically acquiring the information we need. This evolution rests on three key realizations:

Structure as a Prior (Phase I): Early on, I realized standard ML fails to generalize in data-scarce settings. I addressed this by explicitly incorporating the system's intrinsic properties as structural inductive biases, ensuring model robustness and physical plausibility even in under-sampled regions.
Beyond Deterministic Metrics (Phase II): I found that performance gains can be spurious, often stemming from favorable artifacts of stochasticity. Shifting focus from "fitting data" to "knowing what we don't know," I prioritized calibrated uncertainty estimates over standard predictive metrics.
Active Decision Making (Phase III): This led to my current focus on Structure-Aware Gaussian Processes & Bayesian Optimization. By combining structural priors with probabilistic surrogate modeling, I develop algorithms that leverage uncertainty to efficiently sample the input space and ultimately automate scientific discovery.

Phase III: Active Decision Making

2025 - Present

Gaussian Processes for Uncertainty Quantification of Implicit Functions

Gaussian Processes · Structure-Aware Prior · Uncertainty Quantification

Advisor: Prof. David Bindel (Cornell University)

Many scientific systems exhibit non-smooth responses that behave as implicit functions defined by equilibrium conditions. Rather than modeling the equilibria directly, we proposed reconstructing the underlying energy landscape with a structure-aware GP prior. By incorporating a semi-parametric mean function for energy regularization and conditioning on equilibrium constraints, this approach automatically recovers accurate, multi-modal posterior distributions, enabling safe active learning strategies that efficiently navigate phase transitions.
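One way to write this construction down is sketched below. The notation is assumed for illustration and does not come from the project itself: x is the system state, \theta the control input, E the energy, \phi a vector of basis functions with coefficients \beta, and k the covariance kernel. Because differentiation is a linear operator, the gradient of a GP is jointly Gaussian with the GP itself, which is what allows conditioning on the equilibrium (stationarity) condition.

```latex
% Assumed notation: z = (x, \theta), E = energy, \phi^\top \beta = semi-parametric mean.
\begin{align}
  x^\star(\theta) &\in \bigl\{\, x : \nabla_x E(x,\theta) = 0,\;
      \nabla_x^2 E(x,\theta) \succeq 0 \,\bigr\}
      && \text{(equilibria as an implicit function of } \theta\text{)} \\
  E &\sim \mathcal{GP}\bigl(\phi(z)^\top \beta,\; k(z, z')\bigr)
      && \text{(prior on the energy landscape)} \\
  \operatorname{Cov}\bigl[\partial_{x_i} E(z),\, E(z')\bigr]
      &= \frac{\partial k(z, z')}{\partial x_i}
      && \text{(gradient-value covariance)} \\
  \operatorname{Cov}\bigl[\partial_{x_i} E(z),\, \partial_{x'_j} E(z')\bigr]
      &= \frac{\partial^2 k(z, z')}{\partial x_i\, \partial x'_j}
      && \text{(gradient-gradient covariance)} \\
  \text{condition on}\quad \nabla_x E(x_n, \theta_n) &= 0,
      \qquad n = 1, \dots, N
      && \text{(observed equilibria)}
\end{align}
```

Under this reading, a single \theta can admit several minimizers of the posterior energy, which is one way a multi-modal distribution over equilibria can arise.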

2025 - Present

Adaptable Microarray Platform for High-Throughput Bayesian Optimization

Bayesian Optimization · Structure-Aware Prior · AI for Science

Advisor: Dr. Nate Cira (Cornell University)

To optimize the enzymatic reaction TMB-HRP-H₂O₂ with multiple additive reagents, we developed a high-throughput BO framework tailored for adaptable microarray platforms. Addressing a mixed discrete-continuous design space characterized by modal transitions and hardware-imposed cardinality constraints, we introduced a slot-based encoding that enforces feasibility by construction. This representation streamlines the optimization of acquisition functions while preserving the surrogate model's capacity to capture the system's inherent non-smooth structure.
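The idea of enforcing a cardinality constraint by construction can be illustrated with a minimal sketch. This is one plausible reading of a slot-based encoding, not the project's actual implementation: the additive names, the number of slots, and the `decode_slots` helper are all hypothetical. Each slot holds a discrete choice (which additive, or empty) and a continuous amount, so any point in the slot space decodes to a composition with at most as many active additives as there are slots.

```python
from typing import List, Sequence

# Hypothetical additive library; the real reagents are not listed in the text.
ADDITIVES = ["additive_A", "additive_B", "additive_C", "additive_D"]
EMPTY = -1  # sentinel meaning "this slot is unused"


def decode_slots(choices: Sequence[int], amounts: Sequence[float],
                 n_additives: int) -> List[float]:
    """Map K (choice, amount) slots to a dense composition vector.

    The result has at most K non-zero entries, so a "no more than K
    additives" constraint holds for every encoded point.
    """
    if len(choices) != len(amounts):
        raise ValueError("choices and amounts must have the same length")
    composition = [0.0] * n_additives
    for idx, amount in zip(choices, amounts):
        if idx == EMPTY:
            continue  # an empty slot contributes nothing
        if not 0 <= idx < n_additives:
            raise ValueError(f"unknown additive index: {idx}")
        # Two slots naming the same additive simply add their amounts.
        composition[idx] += max(amount, 0.0)
    return composition


# Three slots over four additives: the third slot is empty, so its amount is ignored.
example = decode_slots([0, 2, EMPTY], [0.5, 0.25, 0.9], len(ADDITIVES))
active = sum(1 for value in example if value > 0.0)
```

With a representation like this, an acquisition optimizer can search over slot contents without a separate feasibility check, while the surrogate model can still be fit on the decoded composition.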

2025 - Present

Bayesian Optimization for Reactive Sputtering of Superconducting Thin Films

Gaussian Processes · Structure-Aware Prior · AI for Science

Advisor: Dr. Jingjie Yeo (Cornell University)

Enhanced Surrogate Modeling for Reactive Sputtering. We benchmarked meta-learning studies on reactive sputtering and fixed Gaussian Process Regression implementations that neglected inter-feature correlations. We then replaced the standard models with SAAS-prior GPs and demonstrated that Attentive Neural Processes (ANP) yield more reliable uncertainty estimates for downstream optimization than traditional methods.
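A common way to check whether a surrogate's uncertainty estimates are reliable is to compare the nominal level of its predictive intervals with their empirical coverage on held-out data. The sketch below assumes Gaussian predictive distributions; the function name `interval_coverage` is hypothetical, and this is not necessarily the metric used in the project.

```python
from statistics import NormalDist
from typing import Sequence


def interval_coverage(targets: Sequence[float], means: Sequence[float],
                      stds: Sequence[float], level: float = 0.95) -> float:
    """Fraction of targets inside the central `level` Gaussian interval.

    For a well-calibrated model this fraction should be close to `level`.
    """
    if not (len(targets) == len(means) == len(stds)) or not targets:
        raise ValueError("inputs must be non-empty and of equal length")
    # Half-width of the central interval in units of the predictive std.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    hits = sum(1 for y, mu, s in zip(targets, means, stds)
               if abs(y - mu) <= z * s)
    return hits / len(targets)


# Four held-out points under a unit-variance prediction centred at zero.
coverage = interval_coverage([0.5, -1.0, 3.0, 1.5], [0.0] * 4, [1.0] * 4)
```

Coverage well below the nominal level suggests overconfident predictions, and coverage well above it suggests intervals that are wider than necessary.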

Phase II: Beyond Deterministic Metrics

2024 - 2025

Transformer-based Any-step Dynamics Model for Model-based RL

Model-Based RL · Learning for Control

Course: CS 5756 (Cornell University)

Revisiting Any-step Dynamics Models for MBRL. While reproducing an ICLR paper on the Any-step Dynamics Model, I identified a discrepancy between the authors' reported error analysis and their appendix results. I hypothesized that the issue stemmed from the GRU encoder's inability to handle long-horizon dependencies effectively, so I built a Transformer-based variant (TADM) that replaces the recurrent backbone with an attention mechanism. The resulting model not only corrected the error-scaling behavior but also enabled parallel training, achieving state-of-the-art sample efficiency on MuJoCo benchmarks with significantly lower held-out NLL.
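For reference, held-out NLL for a probabilistic dynamics model is typically computed as below. This is a generic sketch that assumes independent Gaussian predictions per state dimension; the helper name `gaussian_nll` is hypothetical and the code is not taken from the TADM implementation.

```python
import math
from typing import Sequence


def gaussian_nll(targets: Sequence[float], means: Sequence[float],
                 variances: Sequence[float]) -> float:
    """Mean negative log-likelihood under independent Gaussian predictions."""
    if not (len(targets) == len(means) == len(variances)) or not targets:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, mu, var in zip(targets, means, variances):
        if var <= 0.0:
            raise ValueError("variances must be positive")
        # -log N(y | mu, var) = 0.5 * (log(2*pi*var) + (y - mu)^2 / var)
        total += 0.5 * (math.log(2.0 * math.pi * var) + (y - mu) ** 2 / var)
    return total / len(targets)


# The same prediction error scored with an overconfident and an honest variance.
overconfident = gaussian_nll([1.0], [0.0], [0.01])
honest = gaussian_nll([1.0], [0.0], [1.0])
```

Unlike mean squared error, this metric penalizes a model that is wrong and overconfident more heavily than one that is wrong but reports an appropriately large variance.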

2024 - 2025

Characterization of LNP Drug Delivery Vehicles by Machine Learning

AI for Science · Physics-based Simulation · Lipid Nanoparticle

Advisor: Prof. Peter Doerschuk (Cornell University)

Physics-Informed LNP Characterization via SAXS. To address the scarcity of realistic Small-Angle X-ray Scattering (SAXS) data for Lipid Nanoparticle (LNP) characterization, we constructed a physics-based simulator that synthesizes high-fidelity scattering profiles from cryo-EM morphologies using FFT approximations. By leveraging this synthetic data to pretrain dual-task deep learning architectures before fine-tuning on limited experimental measurements, we bridged the gap between theoretical physical laws and data-driven correlations. This sim-to-real framework achieved robust generalization (R² > 0.96) on held-out data, establishing a scalable computational proxy that significantly reduces reliance on expensive electron microscopy.
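The core of an FFT-based scattering calculation can be sketched in a few lines: the scattered intensity is approximated by the squared magnitude of the Fourier transform of the excess electron density, then averaged over all orientations of the scattering vector. The version below is a simplified illustration, not the project's simulator. It assumes a cubic voxel grid and a single particle, uses a solid sphere as a stand-in morphology, and omits effects such as polydispersity, inter-particle structure factors, and instrument smearing. The function name `scattering_profile` is hypothetical.

```python
import numpy as np


def scattering_profile(density: np.ndarray, voxel_size: float = 1.0,
                       n_bins: int = 10):
    """Radially averaged |FFT(density)|^2 for a cubic 3D density grid.

    Returns (q_centers, intensity), with q in inverse units of voxel_size.
    """
    if density.ndim != 3 or len(set(density.shape)) != 1:
        raise ValueError("expected a cubic 3D grid")
    n = density.shape[0]
    # Scattering intensity on the reciprocal-space grid.
    intensity = np.abs(np.fft.fftn(density)) ** 2
    # Magnitude of the scattering vector at every grid point.
    freqs = 2.0 * np.pi * np.fft.fftfreq(n, d=voxel_size)
    qx, qy, qz = np.meshgrid(freqs, freqs, freqs, indexing="ij")
    q = np.sqrt(qx ** 2 + qy ** 2 + qz ** 2).ravel()
    # Average over spherical shells up to the Nyquist limit pi / voxel_size;
    # grid points beyond the last edge are ignored by np.histogram.
    edges = np.linspace(0.0, np.pi / voxel_size, n_bins + 1)
    sums, _ = np.histogram(q, bins=edges, weights=intensity.ravel())
    counts, _ = np.histogram(q, bins=edges)
    profile = sums / np.maximum(counts, 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, profile


# Stand-in morphology: a solid sphere of radius 6 voxels in a 32^3 box.
n = 32
axis = np.arange(n) - n / 2
x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
sphere = (x ** 2 + y ** 2 + z ** 2 <= 6.0 ** 2).astype(float)
q_centers, profile = scattering_profile(sphere)
```

For a compact particle like this, the profile peaks in the lowest-q bin (forward scattering) and decays toward higher q, which is the qualitative shape a measured SAXS curve would be compared against.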

Phase I: Structure as a Prior

2023 - 2024

NFC Device Identification Using Deep Learning and RF Fingerprinting

Metric Learning · Structure-aware Prior · RF Fingerprinting

Advisor: Prof. Junqing Zhang (University of Liverpool)

Robust NFC Device Identification via Domain-Adversarial Learning. Standard RF fingerprinting often fails to generalize when reader-tag geometries change, primarily due to position-induced carrier-frequency offsets (CFO) and systematic signal distortions. Instead of treating these variations as random noise, we formulated the problem as a structured domain shift and developed a deep learning pipeline operating directly on raw baseband waveforms. By integrating multi-proxy metric learning strategies, our framework explicitly learns position-invariant representations, achieving 98.4% cross-configuration accuracy and eliminating the need for handcrafted feature extraction.
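To make the multi-proxy idea concrete, here is a minimal sketch of one such loss. It is an assumption about the general technique rather than the project's training objective: each device class owns several proxy vectors (for instance, one per reader-tag geometry), the class score is the best cosine similarity among that class's proxies, and a cross-entropy is taken over classes. The function name `multi_proxy_loss` and the `scale` parameter are hypothetical.

```python
import math
from typing import Sequence


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; zero-length vectors are treated as unit norm."""
    norm_a = math.sqrt(sum(x * x for x in a)) or 1.0
    norm_b = math.sqrt(sum(x * x for x in b)) or 1.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def multi_proxy_loss(embedding: Sequence[float], label: int,
                     proxies: Sequence[Sequence[Sequence[float]]],
                     scale: float = 10.0) -> float:
    """Cross-entropy over classes, scoring each class by its closest proxy.

    proxies[c] is the list of proxy vectors that belong to class c.
    """
    scores = [scale * max(_cosine(embedding, p) for p in class_proxies)
              for class_proxies in proxies]
    # Numerically stable log-sum-exp over class scores.
    top = max(scores)
    log_norm = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_norm - scores[label]


# Two classes with two proxies each, in a 2D embedding space.
proxies = [[[1.0, 0.0], [0.0, 1.0]], [[-1.0, 0.0], [0.0, -1.0]]]
loss_correct = multi_proxy_loss([0.9, 0.1], 0, proxies)
loss_wrong = multi_proxy_loss([0.9, 0.1], 1, proxies)
```

Because a sample only needs to be close to one of its class's proxies, the same device can occupy several clusters in embedding space while still being classified consistently.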

2022 - 2023

Perovskite-based Optoelectronic Artificial Synaptic TFT

Neuromorphic Computing · Physics-informed ML · Perovskite

Advisor: Prof. Chun Zhao (Xi'an Jiaotong-Liverpool University)

Physics-Informed Learning for Perovskite-based Neuromorphic Systems. Treating emerging synaptic devices merely as noisy approximations of ideal weights often leads to unstable performance. To bridge the gap between device physics and learning dynamics, we characterized the potentiation-depression dynamics of CsFAMA perovskite TFTs and integrated these hardware constraints directly into the neural network's weight-update equations. We further enhanced this device-aware pipeline by incorporating influence-based relabeling strategies (RDIA-LS) to mitigate the effect of label noise. This co-design approach aligned learning dynamics with hardware behavior, demonstrating superior robustness under extreme noise regimes (up to 80%) and leading to a publication in Nano Energy.
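A device-aware weight update of this kind can be sketched with a generic saturating-exponential model, a form often used to describe nonlinear potentiation and depression in synaptic devices. Everything specific here is an assumption for illustration: the constants are not measured CsFAMA values, the conductance is normalized to [0, 1], and the `device_update` rule (one pulse per step, direction set by the gradient sign) is a simplification rather than the published update equations.

```python
import math

# Hypothetical, normalized device parameters (not measured values).
G_MIN, G_MAX = 0.0, 1.0
ALPHA_P, BETA_P = 0.1, 3.0   # potentiation step size and nonlinearity
ALPHA_D, BETA_D = 0.1, 3.0   # depression step size and nonlinearity


def device_update(g: float, grad: float) -> float:
    """Apply one potentiation or depression pulse to conductance g.

    The pulse direction follows the gradient sign; the step size shrinks
    as g approaches the bound it is moving toward.
    """
    span = G_MAX - G_MIN
    if grad < 0.0:
        # Loss decreases as the weight grows: potentiate.
        g_new = g + ALPHA_P * math.exp(-BETA_P * (g - G_MIN) / span)
    elif grad > 0.0:
        # Loss decreases as the weight shrinks: depress.
        g_new = g - ALPHA_D * math.exp(-BETA_D * (G_MAX - g) / span)
    else:
        g_new = g
    # Conductance cannot leave the physical range of the device.
    return min(max(g_new, G_MIN), G_MAX)


# The same gradient produces a much smaller step near saturation.
step_low = device_update(0.0, -1.0) - 0.0
step_high = device_update(0.9, -1.0) - 0.9
```

Training against an update rule like this exposes the network to the asymmetry and saturation it will encounter on hardware, instead of assuming ideal, linear weight changes.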