SoFoNO : Arbitrary-Scale Image Super-Resolution via Sobolev Fourier Neural Operator

Neurocomputing'25

Jong Kwon Oh ^a ¹ Applied Mathematics and Machine Learning

Hwijae Son ^b ¹

Hyung Ju Hwang ^a Applied Mathematics and Machine Learning

Jihyong Oh ^c ^* Creative Vision and Multimedia

^a Pohang University of Science and Technology ^b Konkuk University ^c Chung-Ang University

¹Co-first authors, ^*Corresponding Author



TL;DR — We present Sobolev Fourier Neural Operator (SoFoNO) that learns Sobolev exponents to enhance frequency-domain detail reconstruction, achieving more realistic arbitrary-scale super-resolution without attention.

Abstract

Accurately reconstructing fine textures and sharp edges remains a significant challenge in Single Image Super-Resolution (SISR) tasks, often resulting in overly smooth and less realistic images. To alleviate this issue we propose a novel SISR framework named Sobolev Fourier Neural Operator (SoFoNO). Central to our approach is a specialized architecture featuring a Sobolev Branch, which effectively captures detailed structures in the frequency domain via a learnable Sobolev exponent. Importantly, the learned Sobolev exponent is directly employed as derivative order parameters within the Sobolev loss function, enabling more precise and visually coherent reconstructions. Unlike conventional pixel-level loss functions, the Sobolev loss explicitly incorporates frequency-domain penalties, significantly enhancing the reconstruction quality of detailed image structures. Extensive experiments conducted on multiple datasets under both in-scale and out-scale scenarios demonstrate that our SoFoNO provides robust and effective performance in arbitrary-scale super-resolution, consistently outperforming representative existing methods across various tested scale factors without relying on attention mechanisms.

Demo

Changing resolution (34px → 340px): input and output.

Zoom-in animation: input and output.

Multi-scale Before/After Comparison

Upscaling factor: ×2

Photo comparison between the low-resolution input (bicubic) and the high-resolution output (reconstructed).

Motivation

We use a practical FFT-based approximation of the Sobolev norm that applies whether or not $s \in \mathbb{N}$:

\begin{equation} \Vert f \Vert_{H^s}^2 \approx \Vert \mathcal{F}^{-1}\!\big((1+\Vert \xi \Vert^2)^{s/2}\,\mathcal{F}(f)\big)\Vert_{L^2}^2 \,, \end{equation}

where $\xi$ represents the frequency domain variable, $\mathcal{F}$ and $\mathcal{F}^{-1}$ denote the Fast Fourier Transform (FFT) and the Inverse Fast Fourier Transform (IFFT), respectively, and $s \in \mathbb{R}$ represents the derivative order.
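The approximation above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `sobolev_norm_sq` is hypothetical, and the choice of an integer-valued frequency grid (cycles per image) is an assumption, since the scaling of $\xi$ is not stated here.

```python
import numpy as np

def sobolev_norm_sq(f, s):
    """Approximate ||f||_{H^s}^2 for a 2-D array f via FFT -> weight -> IFFT."""
    h, w = f.shape
    # Frequency grid in cycles per image (assumed scaling of xi).
    ky = np.fft.fftfreq(h) * h
    kx = np.fft.fftfreq(w) * w
    xi_sq = ky[:, None] ** 2 + kx[None, :] ** 2
    # Sobolev multiplier (1 + |xi|^2)^(s/2); s may be any real number.
    weight = (1.0 + xi_sq) ** (s / 2.0)
    g = np.fft.ifft2(weight * np.fft.fft2(f))
    # Squared L2 norm of the filtered signal.
    return float(np.sum(np.abs(g) ** 2))

rng = np.random.default_rng(0)
img = rng.standard_normal((32, 32))
print(sobolev_norm_sq(img, 0.0))  # s = 0 reduces to the plain squared L2 norm
print(sobolev_norm_sq(img, 1.0))  # s > 0 weights high frequencies up
```

Because the multiplier is defined for every real $s$, the same routine handles fractional and negative orders without any finite-difference stencil.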

We introduce a learnable Sobolev exponent s that directly controls frequency emphasis in the transformation. When s > 0, high-frequency components are amplified, resulting in sharper outputs, whereas s < 0 emphasizes low-frequency components, producing smoother outputs.
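The effect of the sign of s can be checked directly on the multiplier $(1+\Vert \xi \Vert^2)^{s/2}$. The sketch below compares the gain at a high frequency with the gain at a low frequency; the helper names and the two sample frequencies are illustrative assumptions, not values from the paper.

```python
import numpy as np

def sobolev_weight(xi_norm, s):
    """Multiplier (1 + |xi|^2)^(s/2) applied to a frequency of magnitude xi_norm."""
    return (1.0 + np.asarray(xi_norm, dtype=float) ** 2) ** (s / 2.0)

def relative_gain(s, low=1.0, high=16.0):
    """Gain at a high frequency divided by gain at a low frequency."""
    return float(sobolev_weight(high, s) / sobolev_weight(low, s))

for s in (-1.0, 0.0, 1.0):
    # A ratio above 1 means high frequencies are emphasized relative to low ones.
    print(f"s = {s:+.1f}  high/low gain ratio = {relative_gain(s):.3f}")
```

A ratio above 1 (s > 0) corresponds to sharpening, a ratio below 1 (s < 0) to smoothing, and s = 0 leaves the spectrum unchanged.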

Architecture

The proposed SoFoNO architecture consists of three main stages: Encoder, SoFoNO Blocks, and Decoder.

First, a low-resolution (LR) image is processed by an EDSR encoder to produce deep feature maps that capture local texture information. Then, a Local Ensemble module performs arbitrary-scale upsampling by aggregating nearby latent features with their relative coordinates, enabling flexible resolution synthesis. These features are fed into a sequence of SoFoNO Blocks, each containing two parallel branches: a Local Branch that refines spatial details and a Sobolev Branch that operates in the frequency domain to emphasize high-frequency components using a learnable Sobolev exponent. The two branches are adaptively fused through Cross-Mixing and AdaIN operations to effectively integrate spatial and spectral information. Finally, the Decoder reconstructs the high-resolution image through lightweight convolutional layers and bilinear interpolation, while a combined real- and frequency-domain loss ensures balanced optimization.
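One plausible reading of the combined real- and frequency-domain loss is a pixel-level term plus a Sobolev-weighted spectral term that uses the same exponent s. The sketch below follows that reading under stated assumptions: the L1 pixel term, the balancing weight `lam`, the mean reductions, and the function names are all hypothetical and are not taken from the paper.

```python
import numpy as np

def sobolev_weight_grid(shape, s):
    """Multiplier (1 + |xi|^2)^(s/2) on the 2-D FFT grid (cycles per image, assumed)."""
    h, w = shape
    ky = np.fft.fftfreq(h) * h
    kx = np.fft.fftfreq(w) * w
    return (1.0 + ky[:, None] ** 2 + kx[None, :] ** 2) ** (s / 2.0)

def combined_loss(pred, target, s, lam=0.1):
    """Pixel-domain L1 term plus a Sobolev-weighted frequency-domain term.

    `lam` is an illustrative balancing weight, not a value from the paper.
    """
    diff = pred - target
    pixel_term = np.mean(np.abs(diff))
    # With the orthonormal FFT, Parseval's identity lets us measure the
    # weighted error in the frequency domain without an inverse transform.
    spectrum = np.fft.fft2(diff, norm="ortho")
    freq_term = np.mean(np.abs(sobolev_weight_grid(diff.shape, s) * spectrum) ** 2)
    return float(pixel_term + lam * freq_term)

rng = np.random.default_rng(0)
target = rng.standard_normal((16, 16))
pred = target + 0.1 * rng.standard_normal((16, 16))
print(combined_loss(pred, target, s=0.5))
```

In an actual training setup s would be a trainable parameter shared between the Sobolev Branch and the loss; here it is passed in as a plain number to keep the example framework-free.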

Results

Quantitative Results

DIV2K Dataset Benchmark Table — Results on the DIV2K validation set.

Benchmark Table — Results on the Benchmark datasets.

SoFoNO consistently outperforms the compared methods on the DIV2K validation set and maintains strong performance on multiple benchmark datasets, including Set5, Set14, B100, and Urban100. These results indicate that SoFoNO generalizes well and produces detail-preserving reconstructions across diverse image domains.

Qualitative Results

The left panel (×4) shows an in-scale case, where SoFoNO accurately reconstructs fine textures and structural patterns and preserves high-frequency details. The right panel (×10) shows an out-scale case, where conventional methods often fail: SoFoNO restores complex structures such as ceiling textures while maintaining both texture coherence and geometric consistency.

Analysis

During training, the learnable Sobolev exponent s initially takes negative values, indicating a focus on low-frequency, coarse structures. As training progresses, s gradually increases and becomes positive, shifting the model’s emphasis toward high-frequency details and resulting in sharper, more refined reconstructions.

Training dynamics of Sobolev exponent — Convergence trend of parameter s in SoFoNO during training.

Frequency emphasis visualization — Reconstructed image (×7) patches (left) and corresponding frequency error maps (right) for different values of s.
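A frequency error map of this kind can be approximated as the magnitude of the Fourier transform of the reconstruction error. How the maps shown on this page were computed is not specified, so the sketch below is only one common convention (log-magnitude, zero frequency centred), with a synthetic example and a hypothetical function name.

```python
import numpy as np

def frequency_error_map(pred, target):
    """Log-magnitude spectrum of the error, with the zero frequency at the centre."""
    err = np.fft.fft2(pred - target, norm="ortho")
    return np.log1p(np.abs(np.fft.fftshift(err)))

# Synthetic example: the "reconstruction" misses a high-frequency component.
n = 64
_, x = np.mgrid[0:n, 0:n]
target = np.sin(2 * np.pi * 3 * x / n) + 0.5 * np.sin(2 * np.pi * 20 * x / n)
pred = np.sin(2 * np.pi * 3 * x / n)

emap = frequency_error_map(pred, target)
peak_row, peak_col = np.unravel_index(np.argmax(emap), emap.shape)
# The error concentrates 20 bins away from the centre along the horizontal axis.
print(peak_row - n // 2, abs(peak_col - n // 2))
```

Energy far from the centre of such a map indicates high-frequency detail that the reconstruction failed to recover.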

BibTeX citation

@article{oh2025sofono,
  title={SoFoNO: Arbitrary-scale image super-resolution via Sobolev Fourier neural operator},
  author={Oh, Jong Kwon and Son, Hwijae and Hwang, Hyung Ju and Oh, Jihyong},
  journal={Neurocomputing},
  pages={131944},
  year={2025},
  publisher={Elsevier}
}