Sparsity-based Image Denoising via Dictionary Learning and Structural Clustering

To be presented at CVPR'2011
Weisheng Dong, Xin Li, Lei Zhang and Guangming Shi


Where does the sparsity in image signals come from? Local and nonlocal image models have supplied complementary views of the regularity in natural images: the former attempts to construct or learn a dictionary of basis functions that promotes sparsity, while the latter connects sparsity with the self-similarity of the image source by clustering. In this paper, we present a variational framework for unifying these two views and propose a new denoising algorithm built upon clustering-based sparse representation (CSR). Inspired by the success of L1-optimization, we have formulated a double-header L1-optimization problem where the regularization involves both dictionary learning and structural clustering. A surrogate-function based iterative shrinkage solution has been developed to solve the double-header L1-optimization problem, and a probabilistic interpretation of the CSR model is also included. Our experimental results have shown convincing improvements over the state-of-the-art denoising technique BM3D on the class of regular texture images. The PSNR performance of CSR denoising is at least comparable and often superior to other competing schemes, including BM3D, on a collection of 12 generic natural images.
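To make the "double-header" idea concrete, here is a minimal sketch of the kind of per-coefficient shrinkage step such a problem leads to. It is not the released MATLAB code. It assumes (my assumption, not stated above) an orthonormal dictionary, so that the problem decouples into scalar problems of the form min over a of 0.5*(a - v)^2 + lam1*|a| + lam2*|a - b|. The names v (noisy coefficient), b (cluster-centroid coefficient), lam1, lam2 and double_shrink are all hypothetical and chosen for illustration.

```python
import numpy as np

def double_shrink(v, b, lam1, lam2):
    """Closed-form minimizer of 0.5*(a - v)**2 + lam1*|a| + lam2*|a - b|.

    Illustrative only: v is a noisy coefficient, b a cluster-centroid
    coefficient, lam1/lam2 the two L1 weights (all hypothetical names).
    Works elementwise on arrays via broadcasting.
    """
    v = np.asarray(v, dtype=float)
    b = np.asarray(b, dtype=float)
    # Flip signs so that the centroid is non-negative; undo at the end.
    s = np.where(b < 0, -1.0, 1.0)
    v, b = s * v, s * b
    # The optimality condition is piecewise linear in v, with two flat
    # pieces: one where a sticks to 0 and one where a sticks to b.
    a = np.where(v < -lam1 - lam2, v + lam1 + lam2,
        np.where(v <= lam1 - lam2, 0.0,
        np.where(v < b + lam1 - lam2, v - lam1 + lam2,
        np.where(v <= b + lam1 + lam2, b, v - lam1 - lam2))))
    return s * a

if __name__ == "__main__":
    # Coefficients close to the centroid b = 2 are pulled onto it,
    # small ones are set to zero, large ones are shrunk by lam1 + lam2.
    print(double_shrink([-1.0, 1.5, 2.5, 5.0], 2.0, 1.0, 1.0))
```

With lam2 = 0 this reduces to ordinary soft thresholding, which is the usual dictionary-learning shrinkage; the lam2 term is what adds the pull toward the cluster centroid.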


Paper [pdf], at 2011 IEEE CVPR

Bibtex entry

author = {Weisheng Dong and Xin Li and Lei Zhang and Guangming Shi},
Booktitle = { IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
Title = {Sparsity-based Image Denoising via Dictionary Learning and Structural Clustering},
Year = {2011}}

MATLAB code demo: CSR_denoise.rar (including test images), cvpr2011_results.rar (saved experimental results for BM3D, K-SVD, SA-DCT and CSR)


Related works


This work was partially supported by grant NSF-CCF-0914353, NSFC (No. 60736043, 61072104, 61070138, and 61071170), and the Fundamental Research Funds of the Central Universities of China (No. K50510020003).


A Little Bit of History of Image Denoising

To the best of my knowledge, image denoising (H-index=36) was first studied by Nasser Nahi at USC in the early 1970s (though he used the name "statistical image enhancement" in his paper). In the late 1970s, the problem was attacked by computer vision pioneers such as S. Zucker and A. Rosenfeld in their paper "Iterative enhancement of noisy images" and Tom Huang in his edited book "Picture processing and digital filtering". In 1980, J.S. Lee published an important paper titled "Digital image enhancement and noise filtering by use of local statistics". The invention of wavelet transforms in the late 1980s led to dramatic progress in image denoising, progress that originated in Simoncelli and Adelson's 1996 ICIP paper "Noise removal via Bayesian wavelet coring". Since then, numerous wavelet-based image denoising algorithms have appeared, including the UIUC group's SPL1999 paper, my own ICIP2000 paper, Vetterli and his group's TIP2000 paper, and Portilla et al.'s TIP2003 paper on GSM denoising. The class of geometric wavelets, such as the curvelet transform, has also found promising applications in image denoising.

When I attended CVPR for the first time, at CVPR'2005, I played the imposter of Javier Portilla and challenged a presenter for overlooking the GSM algorithm in their experimental comparison; but I was wise enough to keep my mouth shut after Buades' talk on nonlocal means denoising, even though their reported results were not that great (it turned out their work earned the Best Paper Honorable Mention Award). Since then, many researchers, including myself, have explored the potential of nonlocal denoising. One of the recent breakthroughs was made by researchers from Finland with their BM3D scheme, published in 2007. My recent ICIP2008 and ICIP2010 papers are motivated by the success of BM3D, but I am more interested in theoretical explanations than in algorithmic refinement. My collaboration with Dr. Weisheng Dong started with deblurring and superresolution, but we soon realized that the same model is applicable to denoising too. Our work on CSR denoising can be viewed as a merger of K-SVD (dictionary learning) and BM3D (structural clustering), and it is closely related to the idea of joint sparsity in LSSC and of pooling features in deconvolutional networks (thanks to the anonymous CVPR reviewers). We believe it will be fruitful to explore such connections further in the future.

Back to the theory: I recently found another interesting connection, this time with the theory of learning with structured sparsity. The "new" concept there is to define a so-called coding complexity regularization term, which I interpret as a derivative of entropy in information theory or of description length in coding theory. So if the sparsity in compressed sensing replaces L0 with L1, and structured sparsity then replaces L0 with coding complexity, haven't we moved back to where we started from, namely statistical modeling of natural images? After all, structured sparsity is really another way of saying that sparse coefficients are not independent and admit another level of sparsity encoding (as in our own work and in deconvolutional networks). The most important lesson that I have learned from my experience of working on image denoising seems to be this: Hilbert space, which lies at the foundation of wavelet theory and compressed sensing, might be the wrong starting point for thinking about images. Beyond Hilbert spaces, there are more general inner product spaces, normed vector spaces, metric spaces and topological spaces. If images live on a manifold (here is a good reference for this claim), which is a nonconvex object, we must be extra cautious with the use of tools associated with convex optimization. Otherwise, we might face the same dilemma that machine learning people have dealt with all along (linear learning models vs. nonlinear natural problems).

