International Journal of Emerging Research in Science, Engineering, and Management
Vol. 2, Issue 1, pp. 218-225, January 2026.
This work is licensed under a Creative Commons Attribution 4.0 International License.
V. Jayasree
K. Gnanesh
K. Jagadeesh
B. Anudeep
R. Aravind
Department of ECE, Siddartha Institute of Science and Technology, Puttur, India.
Abstract: Image denoising is a fundamental task in image processing, as noise introduced during image acquisition and transmission significantly degrades visual quality, structural integrity, and texture details. Classical image denoising techniques, including hybrid filtering and the Non-Local Means (NLM) algorithm, are effective in suppressing noise but rely heavily on local similarity assumptions and manual parameter tuning, which limits their performance under complex and real-world noise conditions. To address these limitations, this paper proposes a hybrid image denoising framework that integrates classical hybrid filters and NLM with a Vision Transformer (ViT)–based neural architecture. In the proposed approach, hybrid filtering and NLM are first employed to perform initial noise reduction and preserve local structures. Subsequently, a Vision Transformer refines the denoised output by leveraging self-attention mechanisms to capture global contextual information and long-range dependencies across image patches. This transformer-based modeling enables improved reconstruction of edges, textures, and fine structural details that are often lost in traditional denoising methods. The proposed method is implemented using Python and deep learning frameworks and evaluated on standard benchmark images corrupted with different types of synthetic noise, including Gaussian, salt-and-pepper, and speckle noise. Experimental results, measured using Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM), demonstrate that the proposed hybrid–Vision Transformer approach consistently outperforms classical hybrid and NLM-based denoising techniques. The results confirm the effectiveness of combining local filtering strategies with global self-attention for robust and high-quality image denoising, making the proposed framework suitable for applications in medical imaging, satellite imagery, and general digital image restoration.
Keywords: Image Denoising, Hybrid Filtering, Non-Local Means, Vision Transformer, Self-Attention.
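The evaluation pipeline the abstract describes (corrupt a clean image with synthetic Gaussian noise, apply a local filtering stage, score with PSNR) can be sketched minimally in NumPy. This is an illustrative sketch only, not the authors' implementation: a simple 3×3 median filter stands in for the hybrid-filter/NLM stage, and the Vision Transformer refinement stage is omitted.

```python
import numpy as np

def add_gaussian_noise(img, sigma=25.0, seed=0):
    """Corrupt an image (0-255 range) with additive Gaussian noise."""
    rng = np.random.default_rng(seed)
    noisy = img + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0.0, 255.0)

def median_filter3(img):
    """3x3 median filter with edge padding -- the 'local' denoising stage."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    # Stack the 9 shifted views of the padded image, then take the
    # per-pixel median across the neighborhood axis.
    stacked = np.stack([padded[r:r + h, c:c + w]
                        for r in range(3) for c in range(3)])
    return np.median(stacked, axis=0)

def psnr(clean, test, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB."""
    mse = np.mean((clean.astype(np.float64) - test) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# Toy demo on a synthetic gradient image: the filtered output should
# score a higher PSNR against the clean image than the noisy input does.
clean = np.tile(np.linspace(0.0, 255.0, 64), (64, 1))
noisy = add_gaussian_noise(clean)
denoised = median_filter3(noisy)
assert psnr(clean, denoised) > psnr(clean, noisy)
```

In the full method, the output of this local stage would then be refined by a transformer whose self-attention aggregates information across all image patches; that global stage is what this numpy-only sketch leaves out.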
