ICLR 2026 — Conference Paper

Test-Time Accuracy-Cost Control in Neural Simulators via Recurrent-Depth

Harris Abdul Majid (The University of Edinburgh), Pietro Sittoni (Gran Sasso Science Institute), Francesco Tudisco (The University of Edinburgh)
01 Abstract

Test-Time Accuracy-Cost Control for Neural Simulators

Scientific computing often requires explicit control of the accuracy-cost trade-off: fast, approximate predictions for exploration or real-time control, and accurate (but slow) predictions for final analysis or critical decisions. Numerical methods naturally provide this flexibility through discretization, order, and tolerances. Neural simulators do not: once trained, their expected accuracy and inference cost are fixed.

We introduce RecurrSim, an architecture-agnostic framework that enables test-time accuracy-cost control in neural simulators. RecurrSim models i) serve multiple computational budgets, ii) support anytime prediction, and iii) are highly parameter-efficient.

Key Idea: RecurrSim introduces a user-controlled knob: the number of recurrent iterations K. A small K (shallow model) yields fast, approximate predictions, while a large K (deep model) yields slower but more accurate predictions.

02 The Problem

Why Test-Time Accuracy-Cost Control for Neural Simulators Matters

Simulations Are Fundamental to Science and Engineering

Simulations allow scientists to study complex systems and engineers to optimize designs without expensive or impractical experiments.

Numerical Methods Have Explicit Control of the Accuracy-Cost Trade-Off

Finer discretizations, higher-order methods, and stricter tolerances usually yield more accurate predictions, but require more compute.

Many Choices Affect the Accuracy-Cost Trade-Off Only During Training, Not Inference

During training, allocating more computational resources generally leads to more accurate predictions; for example, through more data, larger models, or longer training. But after training, models typically have a fixed expected accuracy and inference cost.

The Gap: Unlike numerical methods, neural simulators do not provide explicit test-time accuracy-cost control.

Existing Adaptive Inference Methods Have Limitations

Deep equilibrium models (DEQs) can, in principle, adapt compute by iterating toward a fixed point, but in practice they often fail to converge; as a result, running DEQs for additional iterations does not reliably yield more accurate predictions.

Diffusion-based methods offer a user-controlled knob (the number of denoising steps), but their intermediate states are generally not directly usable as predictions; therefore, diffusion models do not provide anytime prediction.

03 Contributions

What RecurrSim Enables

1. RecurrSim is an architecture-agnostic framework that enables explicit test-time accuracy-cost control via recurrent depth.

2. By varying the number of recurrent iterations, RecurrSim can produce predictions across multiple accuracy-cost settings.

3. RecurrSim supports anytime prediction, producing usable intermediate predictions throughout inference.

4. RecurrSim is highly parameter-efficient, matching or exceeding the performance of larger, non-weight-tied baselines.

04 Method

Recurrent-Depth Simulator

Figure 1 — RecurrSim architecture diagram

RecurrSim consists of an encoder that maps the input state x to a conditioning vector c; a recurrent-depth block that iteratively updates a latent state z_k conditioned on c; and a decoder that maps the final latent state to the output state.

z_k = R([c, z_{k−1}], θ_R), k = 1, …, K.
Training

RecurrSim is trained end-to-end. For each sample, the number of recurrent iterations K is drawn from a distribution, the recurrent-depth block is applied K times, and gradients are back-propagated. To bound memory, we use truncated backpropagation-through-depth.

Inference

At test time, the user chooses the number of recurrent iterations K according to the desired accuracy and the available compute. The first few iterations make the largest adjustments; subsequent iterations contribute progressively smaller refinements.
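This anytime behaviour can be sketched as a loop that decodes the latent after every recurrent iteration, so a usable prediction exists at each depth. The `encode`, `step`, and `decode` callables (and the scalar toy in the usage note) are illustrative stand-ins, not the paper's API:

```python
def anytime_rollout(encode, step, decode, x, K_max=16, z0=None):
    """Anytime inference sketch: decode after every recurrent iteration.

    encode/step/decode stand in for the model's encoder, recurrent-depth
    block, and decoder (hypothetical names for illustration).
    """
    c = encode(x)
    z = z0 if z0 is not None else c  # the paper initialises z from noise
    preds = []
    for _ in range(K_max):
        z = step(c, z)               # one recurrent-depth iteration
        preds.append(decode(z))      # usable prediction at this depth
    return preds                     # preds[k-1] = prediction after k steps
```

With a toy contraction `step(c, z) = 0.5 * (z + c)`, the early iterations make the largest moves toward the fixed point and later ones refine it, mirroring the behaviour described above.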

Key Details

During training, the number of recurrent iterations K is drawn from a Poisson log-normal distribution, exposing the model to a broad spectrum of compute budgets, which encourages stability across both shallow and deep rollouts.
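A Poisson log-normal draw is a Poisson sample whose rate is itself log-normally distributed; this can be sketched in plain Python. The (mu, sigma) parameters and the minimum depth of 1 are illustrative assumptions, as this section does not state the paper's exact values:

```python
import math
import random

def sample_K(mu=1.0, sigma=0.5, k_min=1):
    """Sample a recurrent depth K from a Poisson log-normal distribution.

    The rate of the Poisson is drawn from a log-normal, so most samples
    are shallow while occasional samples are much deeper. Parameters are
    illustrative, not the paper's.
    """
    rate = math.exp(random.gauss(mu, sigma))  # log-normally distributed rate
    # Poisson sampling via Knuth's method (adequate for small rates).
    threshold, k, p = math.exp(-rate), 0, 1.0
    while True:
        p *= random.random()
        if p <= threshold:
            break
        k += 1
    return max(k, k_min)  # ensure at least one recurrent iteration
```

Heavier tails than a plain Poisson mean the model occasionally trains at large depths, which is what lets it remain stable when the user requests deep rollouts at test time.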

Pseudocode
Python
# Standard (non-recurrent) simulator forward pass, for comparison.
def forward(self, x):
  z = self.encoder(x)
  z = self.processor(z)
  x = self.decoder(z)
  return x
Python
# RecurrSim forward pass. B is the truncated-backpropagation window:
# only the final B iterations receive gradients.
def forward(self, x, K=None):
  c = self.encoder(x)
  z = sample_noise()

  if self.training:
    with no_grad():
      K = sample_K()
      for _ in range(K - B):    # early iterations run without gradients
        z = self.processor(cat([c, z], dim=1))
    for _ in range(B):          # final B iterations are back-propagated
      z = self.processor(cat([c, z], dim=1))
  else:
    if K is None:               # fall back to a default test-time depth
      K = K_DEFAULT
    for _ in range(K):          # user-chosen number of iterations
      z = self.processor(cat([c, z], dim=1))

  x = self.decoder(z)
  return x
05 Results — Accuracy-Cost Trade-Off

RecurrSim Enables Explicit Test-Time Accuracy-Cost Control

Figure 2 — Trajectory Error vs. Recurrent Steps (K)
As the number of recurrent iterations K increases, trajectory error decreases and converges, even beyond the maximum number of iterations seen during training (adaptive depth), while predictions remain physically meaningful at every iteration (anytime prediction).
06 Results — Baselines

RecurrSim Outperforms Existing Adaptive Inference Methods

| Model | Adaptive Depth | Anytime Prediction | Burgers (MSE ↓) | KdV (MSE ↓) | KS (Corr. ↑) |
|---|---|---|---|---|---|
| FNO-DEQ | ~ | ~ | 0.045 ± 0.007 | 0.094 ± 0.028 | 41.9 ± 5.6 |
| ACDM | ✓ | ✗ | 0.044 ± 0.004 | 0.043 ± 0.015 | 40.5 ± 9.3 |
| PDE-Refiner | ✓ | ✗ | 0.161 ± 0.024 | 0.028 ± 0.010 | 61.6 ± 7.7 |
| RecurrSim | ✓ | ✓ | 0.008 ± 0.001 | 0.022 ± 0.005 | 74.7 ± 4.2 |

RecurrSim is the only method providing both adaptive depth and anytime prediction, while also achieving the best overall performance.

07 Results — Architecture-Agnostic

RecurrSim Transfers Across Architectures

RecurrFNO vs. FNO (3D Navier-Stokes dataset): 50% fewer parameters, 14% less memory

RecurrViT vs. ViT (Active Matter dataset): 42% fewer parameters, 85% lower error

RecurrUPT vs. UPT (ShapeNet-Car dataset): 44% fewer parameters, 8% lower error

08 Takeaway

Recurrent-depth modeling is a general and reusable idea: during training, supervise the model across a range of depths; at inference, the model can trade accuracy for compute, while supporting adaptive depth and anytime prediction.

09 Citation
BibTeX
@inproceedings{
  majid2026testtime,
  title={Test-Time Accuracy-Cost Control in Neural Simulators via Recurrent-Depth},
  author={Harris Abdul Majid and Pietro Sittoni and Francesco Tudisco},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=U2j9ZNgHqw}
}