Manyu Jin, Tao Wang, Zexuan Ji, and Xiaobo Shen
Abstract: Perceptual image quality assessment (IQA) is one of the most indispensable yet challenging problems in image processing and computer vision. It is necessary to develop automatic and efficient approaches that can accurately predict perceptual image quality consistently with human subjective evaluation. To further improve the prediction accuracy for the distortion of color images, in this paper we propose a novel, effective and efficient IQA model called perceptual gradient similarity deviation (PGSD). Based on the gradient magnitude similarity, we propose a gradient direction selection method to automatically determine the pixel-wise perceptual gradient. The luminance and chrominance channels are both taken into account to characterize the quality degradation caused by intensity and color distortions. Finally, a multi-scale strategy is utilized and pooled with different weights to incorporate image details at different resolutions. Experimental results on the LIVE, CSIQ and TID2013 databases demonstrate the superior performance of the proposed algorithm.
Keywords: Image quality assessment, full reference, perceptual gradient similarity, multi-scale, standard deviation pooling.
Objective image quality assessment (IQA) is a significant basis for measuring the performance of image acquisition, image transmission and image processing algorithms. Most existing methods focus on extracting image features related to the human visual system (HVS) to establish evaluation models. Based on whether there is a reference image, IQA techniques are generally classified into three categories, namely full reference (FR), reduced reference (RR) and no reference (NR). FR-IQA assumes that the reference image is undistorted, and measures quality by calculating the differences between a reference image and a distorted image. Through the in-depth study of imaging theory, image processing techniques such as Hermite spectral collocation, the extended Hamiltonian algorithm, seeded region growing, collaborative representation [Zhang, Sun, Ji et al. (2016)] and the Hidden Markov model [Zheng, Jeon, Sun et al. (2017)] have been widely used in computer vision. However, the critical problem of IQA is the extraction of quality-related features consistent with human eye perception. The most classical objective IQA algorithms are the Peak Signal-to-Noise Ratio (PSNR) [Avcibas, Sankur and Sayood (2002)] and the Mean Squared Error (MSE). However, neither method is well associated with perceptual quality, since they do not consider structural characteristics. To overcome this limitation, the structural similarity index (SSIM) [Wang, Bovik, Sheikh et al. (2004)] utilizes correlations between pixels and the concept of structure information to measure the quality score. Inspired by SSIM, a variety of improved algorithms have been proposed, including gradient-based structural similarity (GSSIM) [Chen, Yang and Xie (2006)], the multi-scale structural similarity index (MS-SSIM) [Wang, Simoncelli and Bovik (2004)], and information content weighted SSIM (IW-SSIM) [Wang and Li (2011)]. These approaches are consistent with human subjective perception to a certain degree.
Moreover, many models have also been developed based on the properties of human vision. From the perspective of information theory, Sheikh et al. [Sheikh, Bovik and De (2005)] pointed out that natural images have statistical characteristics, and proposed the information fidelity criterion (IFC) and visual information fidelity (VIF) [Sheikh and Bovik (2006)]. Both methods utilize the mutual information between the original image and the distorted image. Chandler et al. [Chandler and Hemami (2007)] proposed the visual signal-to-noise ratio (VSNR) metric to quantify the visual fidelity of natural images based on near-threshold and supra-threshold properties of human vision. Most apparent distortion (MAD) [Larson and Chandler (2010)] assumes that the HVS uses two distinct strategies when evaluating high-quality and low-quality images. Based on the hypothesis that the visual saliency map is closely related to image quality, Zhang et al. [Zhang and Li (2013)] proposed the spectral residual based similarity (SR-SIM) method. They further proposed the visual saliency induced index (VSI) method [Zhang, Shen and Li (2014)]. VSI utilizes the visual saliency map as both a feature and a weighting function to reflect the importance of local regions.
Neuropsychological studies have shown that the human visual system is sensitive to the structural distortions of edge details in an image. Most distortions tend to change the gradient values. The image gradient reflects the most significant brightness changes in the image, and is often used to extract edges and other structures. Gradient information has been employed for FR-IQA in many different ways. GSSIM improves SSIM by replacing the contrast comparison with a gradient-based contrast comparison. Gradient similarity (GSM) [Liu, Lin and Narwaria (2012)] also uses such information to capture structural and contrast changes. Zhang et al. [Zhang, Zhang, Mou et al. (2011)] constructed the feature similarity index (FSIM/FSIMc for color images) to further improve the performance. FSIM calculates gradient similarity and phase similarity respectively, regarding the gradient as an independent feature and pooling with a phase congruency weighted average. Because of the high computational complexity of phase congruency features, Xue et al. [Xue, Zhang, Mou et al. (2014)] proposed an effective metric called gradient magnitude similarity deviation (GMSD), which first calculates the gradient magnitude similarity and then uses a standard deviation pooling strategy to obtain the evaluation score. GMSD proves that utilizing the image gradient magnitude can yield highly accurate quality prediction.
Most related methods model the FR-IQA strategy in gray space. However, color is an essential element in describing the content of images, and the RGB model is the most common color model in computer vision. The RGB model, however, cannot separate luminance and chrominance, which is not consistent with human subjective perception of color similarity. Other well-known color spaces include HSV, YIQ, Lab and so on. The HSV (hue, saturation, value) model is a pyramid color space that is more closely associated with the way human vision perceives color-making attributes. The YIQ model contains three components: the luminance value Y and two chrominance values I and Q. The relationship between the YIQ and RGB color spaces is a linear transformation, and YIQ can adapt to changes of lightness intensity. Like YIQ, the Lab model comprises one luminance and two chrominance channels. Extended from FSIM, FSIMc uses the YIQ color space to achieve color image quality assessment. It computes phase congruency and gradient similarity in the luminance component combined with the chromatic similarities in the I and Q channels. CSV [Temel and Alregib (2016)] uses the CIEDE2000 color difference formulation to quantify low-level color degradation and calculates the Earth mover's distance between color name probability vectors to measure significant color degradation.
Generally, perceptual quality is influenced by numerous factors, including display resolution, chrominance information and viewing distance. A natural image may contain objects and structures that are relevant at different scales, and the human eye readily identifies and processes the information they present. MS-SSIM proposes a multi-scale structural similarity method, which is more stable than the single-scale SSIM model. Multi-scale contrast similarity deviation (MCSD) [Wang, Zhang, Jia et al. (2016)] explores contrast features by resorting to the multi-scale representation.
As discussed above, since GMSD computes the gradient only on the luminance channel of the image, it does not work well for color distortions. In this paper, we propose perceptual gradient similarity deviation (PGSD) to further improve the prediction accuracy for the distortion of color images. Inspired by GMSD, we propose a gradient direction selection method to automatically determine the pixel-wise perceptual gradient. Both the luminance and chrominance channels are taken into consideration when characterizing the quality degradation caused by intensity and color distortions. Finally, a multi-scale strategy is utilized and pooled with distinct weights to incorporate image details at different resolutions and obtain the final score.
Research has shown that the human visual system is sensitive to the edges of an image. The gradient of an image can reflect detail contrast and texture changes, and is closely related to perceptual quality. In this paper, we improve the existing gradient calculation method by automatically selecting the gradient directions.
Similar to GMSD, we first adopt the Prewitt filters as gradient operators to obtain the gradient magnitude. The operators along the horizontal (x) and vertical (y) directions are defined as follows:
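The operators themselves (Eq. (1)) are not reproduced in this text. The normalized 3×3 Prewitt kernels used by GMSD, which this passage presumably follows, take the form:

```latex
h_x = \frac{1}{3}
\begin{bmatrix}
1 & 0 & -1 \\
1 & 0 & -1 \\
1 & 0 & -1
\end{bmatrix},
\qquad
h_y = \frac{1}{3}
\begin{bmatrix}
1 & 1 & 1 \\
0 & 0 & 0 \\
-1 & -1 & -1
\end{bmatrix}
```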
These operators constitute an orthogonal coordinate system. The gradient magnitude (GM) is defined as the root of the sum of squares of the directional gradients along these two orthogonal directions. The GM maps of the reference (r) and the distorted (d) image are computed as follows:
where the symbol ⊗ denotes the convolution operation, m_r1 and m_d1 are the gradient magnitude images extracted from the reference and distorted images respectively, and i = (i1, i2) represents the pixel coordinates.
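Eqs. (2) and (3) are not reproduced in this text; consistent with the definitions above, they would read:

```latex
m_{r1}(i) = \sqrt{\left(r \otimes h_x\right)^2(i) + \left(r \otimes h_y\right)^2(i)},
\qquad
m_{d1}(i) = \sqrt{\left(d \otimes h_x\right)^2(i) + \left(d \otimes h_y\right)^2(i)}
```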
In order to find the direction of maximum gradient change more accurately, we introduce another set of filters, which are defined as follows:
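The second filter pair (Eq. (4)) is likewise not reproduced in this text. One plausible 45°-rotated Prewitt pair consistent with the description of diagonal operators is shown below; these are an assumption, not necessarily the paper's exact kernels:

```latex
h_u = \frac{1}{3}
\begin{bmatrix}
0 & -1 & -1 \\
1 & 0 & -1 \\
1 & 1 & 0
\end{bmatrix},
\qquad
h_v = \frac{1}{3}
\begin{bmatrix}
1 & 1 & 0 \\
1 & 0 & -1 \\
0 & -1 & -1
\end{bmatrix}
```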
These two operators are also orthogonal, but unlike the traditional ones, they are more efficient at capturing sloped edges. The horizontal and vertical operators can capture only one component of a sloped edge, which can, however, be captured by one of the diagonal operators. The corresponding GM maps can then be calculated with the convolution operation as in Eqs. (2) and (3):
For each image, we obtain two GM images (m_r1, m_r2 or m_d1, m_d2). For the reference image, the two GM images are compared pixel by pixel and the larger values are selected to construct the final GM map:
The purpose of this step is to determine the gradient direction that is closer to the maximum rate of change. To ensure that the gradient values being compared come from the same coordinate system, the GM map of the distorted image is constructed based on the selection made for the reference image, which can be defined as
Therefore, we obtain the final GM maps for both the reference and distorted images. We select the gradient direction of each pixel along the maximum changing rate based on the reference image, which makes the comparison of gradients more accurate.
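As an illustration, the direction-selection procedure above can be sketched in Python. This is a minimal sketch, not the authors' code: the filter kernels `H1`/`H2` (standard Prewitt and an assumed diagonal variant) and the helper names are assumptions.

```python
import numpy as np
from scipy.signal import convolve2d

# Orthogonal filter pairs. H1 is the standard Prewitt pair;
# H2 is an assumed 45-degree rotated pair for sloped edges.
H1 = (np.array([[1, 0, -1]] * 3) / 3.0,              # horizontal (x)
      np.array([[1, 0, -1]] * 3).T / 3.0)            # vertical (y)
H2 = (np.array([[0, -1, -1], [1, 0, -1], [1, 1, 0]]) / 3.0,   # diagonal
      np.array([[1, 1, 0], [1, 0, -1], [0, -1, -1]]) / 3.0)   # anti-diagonal

def gm(img, pair):
    """Gradient magnitude over one orthogonal filter pair."""
    gx = convolve2d(img, pair[0], mode="same", boundary="symm")
    gy = convolve2d(img, pair[1], mode="same", boundary="symm")
    return np.sqrt(gx ** 2 + gy ** 2)

def perceptual_gm(ref, dist):
    """Direction-selected GM maps: per pixel, pick the filter pair
    that maximizes the reference gradient, and apply the SAME choice
    to the distorted image so both values share one coordinate system."""
    mr1, mr2 = gm(ref, H1), gm(ref, H2)
    md1, md2 = gm(dist, H1), gm(dist, H2)
    use_first = mr1 >= mr2            # selection driven by the reference
    mr = np.where(use_first, mr1, mr2)
    md = np.where(use_first, md1, md2)
    return mr, md
```

Note that the selection mask `use_first` is computed only from the reference image, mirroring the constraint stated above that compared gradient values must come from the same coordinate system.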
Many studies and experiments show that visual information enters the visual cortex through different neural channels and is then processed by different neurons. The color feature is another type of information, besides brightness, that reflects the image content. Calculating the gradient similarity only on the gray scale cannot guarantee an accurate evaluation of color distortions, such as changes of color saturation and chromatic aberrations. Therefore, utilizing multiple channels, including luminance and chrominance channels, allows the IQA model to extract perceptual features that are more consistent with human perception. Hence, we transform the RGB color images into an opponent color space [Geusebroek, Boomgaard, Smeulders et al. (2001)]:
where L represents the luminance information and M and N contain the chrominance information. The conversion weights are optimized for the HVS.
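The transform itself is not reproduced in this text. In the Gaussian opponent color model of Geusebroek et al., as commonly quoted in later IQA work with the L/M/N naming, it takes the following form (a reconstruction from that literature, not necessarily the paper's exact coefficients):

```latex
\begin{bmatrix} L \\ M \\ N \end{bmatrix} =
\begin{bmatrix}
0.06 & 0.63 & 0.27 \\
0.30 & 0.04 & -0.35 \\
0.34 & -0.60 & 0.17
\end{bmatrix}
\begin{bmatrix} R \\ G \\ B \end{bmatrix}
```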
Figure 1: Illustration of the computational process of the proposed PGS map
Then the gradient magnitude maps are calculated on each of these three channels using the direction selection method described in Section 2.1. For each image, we obtain three GM maps: G_L, G_M and G_N. The similarities calculated between the two GM maps of each channel are defined as:
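The per-channel similarity equations (Eqs. (10)-(12)) are not reproduced in this text. The standard gradient-similarity form used by GMSD-style metrics, with c_1 for the luminance channel and c_2 for the chrominance channels as suggested by the parameter settings later in the paper, would read:

```latex
S_L(i) = \frac{2\,G_L^r(i)\,G_L^d(i) + c_1}{G_L^r(i)^2 + G_L^d(i)^2 + c_1},
\qquad
S_X(i) = \frac{2\,G_X^r(i)\,G_X^d(i) + c_2}{G_X^r(i)^2 + G_X^d(i)^2 + c_2},
\quad X \in \{M, N\}
```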
where c_1 and c_2 are positive constants. Then, we combine S_M and S_N to obtain the chrominance similarity measure, denoted by S_C:
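The combining equation is not reproduced in this text. A common choice, used for example by FSIMc for its two chromatic similarities, is the pixel-wise product; this is a reconstruction, not necessarily the paper's exact formula:

```latex
S_C(i) = S_M(i)\,S_N(i)
```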
The similarities with respect to luminance and chrominance are described as follows:
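The combined perceptual gradient similarity (PGS) map (Eq. (14)) is not reproduced in this text. Following the exponent-weighting convention of SSIM and FSIMc, one plausible reconstruction is the form below; a weighted sum αS_L + βS_C would be an equally plausible alternative, so this is an assumption:

```latex
\mathrm{PGS}(i) = \left[S_L(i)\right]^{\alpha} \left[S_C(i)\right]^{\beta}
```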
where α and β are two positive parameters to adjust the relative importance of luminance and chrominance. The procedures to calculate the PGS map are illustrated in Fig. 1.
Finally, we apply standard deviation pooling and take the result as the IQA index, called the perceptual gradient similarity deviation (PGSD):
where N is the total number of pixels in the image. The perceptual gradient similarity mean (PGSM) is the average of the PGS map and is defined as:
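The two pooling equations are not reproduced in this text; in the standard-deviation-pooling form introduced by GMSD, consistent with the definitions above, they would read:

```latex
\mathrm{PGSM} = \frac{1}{N}\sum_{i} \mathrm{PGS}(i),
\qquad
\mathrm{PGSD} = \sqrt{\frac{1}{N}\sum_{i} \bigl(\mathrm{PGS}(i) - \mathrm{PGSM}\bigr)^2}
```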
The value of PGSD reflects the range of distortion severities in an image, where a higher score means a larger distortion range and a lower image perceptual quality, and vice versa.
Fig. 2 shows two examples comparing PGSD with GMSD. GMSD computes the gradient magnitude only on the gray scale. Both testing images contain declines of color saturation, but at different levels. The subjective scores (MOS) are 4.31707 and 3.60978, respectively; a higher MOS means better image quality. The corresponding PGSD scores are 0.0131 and 0.0499, which are consistent with the trend of the MOS. However, there is no obvious change between the GMSD scores, which means that the GMSD method can hardly detect the change of perceptual quality. The proposed PGSD is able to consider the variation of local quality and evaluate the color distortion. Consequently, it obtains results highly correlated with subjective image quality.
Figure 2: Examples of GMSD and PGSD for evaluating a change of color saturation. (a) and (d) are two distorted images with different levels from the TID2013 database. (b) and (e) are the PGS maps calculated by the proposed PGSD. (c) and (f) are the GMS maps calculated by the GMSD model. The subjective quality scores (MOS), PGSD indexes and GMSD indexes are listed under the corresponding images
Typical multi-scale methods include wavelet transform, curvelet transform, and pyramid decomposition. In image quality assessment, the perceptual characteristics of the image are closely related to the observed distance and the sampling density. Therefore, scale information also has an effect on the perceptual quality.
Figure 3: Multi-scale framework
In this paper, we apply the multi-scale strategy by utilizing low-pass filtering and down-sampling in the three channels of a color image. The flow chart is presented in Fig. 3. In the image preprocessing step, the color image is transformed into the opponent color space as described in Section 2.2.
On the basis of the previous scale level, the luminance and two chrominance images are first filtered by a low-pass filter, and then down-sampled by a factor of 2. The original image is indexed as Scale 1 in this paper, and the largest scale is indexed as Scale M. The overall evaluation score PGSD is obtained by weighting the scores at the different scales, which is expressed as:
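The overall score (Eq. (17)) is not reproduced in this text; consistent with the description, it is a weighted sum of the single-scale scores:

```latex
\mathrm{PGSD} = \sum_{i=1}^{M} \omega_i \,\mathrm{PGSD}_i,
\qquad
\sum_{i=1}^{M} \omega_i = 1
```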
where PGSD_i and ω_i respectively represent the single-scale IQA score and the weight at the i-th scale, and the weights sum to 1.
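The multi-scale loop can be sketched as follows. This is a hypothetical illustration: the 2×2 box low-pass filter and the `single_scale_pgsd` callable are assumptions standing in for the paper's exact filter and per-scale metric.

```python
import numpy as np

def multiscale_pgsd(ref, dist, single_scale_pgsd,
                    weights=(0.1333, 0.3448, 0.2856, 0.2363)):
    """Weighted multi-scale pooling: score each scale with a
    caller-supplied single-scale metric, then low-pass filter and
    down-sample by 2 before moving to the next scale."""
    def downsample(img):
        # 2x2 box low-pass filter fused with factor-2 down-sampling
        h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
        img = img[:h, :w]
        return (img[0::2, 0::2] + img[1::2, 0::2] +
                img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

    score = 0.0
    for i, w in enumerate(weights):       # Scale 1 is the original image
        score += w * single_scale_pgsd(ref, dist)
        if i < len(weights) - 1:
            ref, dist = downsample(ref), downsample(dist)
    return score
```

The default weights are the values reported later in the parameter settings; since they sum to 1, identical inputs yield the single-scale score unchanged.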
In this section, we tested the performance of the proposed PGSD on three databases: the LIVE database [Sheikh, Moorthy, Wang et al. (2004)], the CSIQ database [Larson and Chandler (2009)], and the TID2013 database [Ponomarenko, Ieremeiev, Lukin et al. (2013)]. Tab. 1 shows the main information of these three databases.
Table 1: Description of testing databases
Four common indices were used to evaluate the prediction accuracy and consistency, including the Spearman Rank-Order Correlation Coefficient (SROCC), Kendall Rank-Order Correlation Coefficient (KROCC), Pearson Linear Correlation Coefficient (PLCC) and Root Mean Squared Error (RMSE). The values of SROCC, KROCC and PLCC all range from 0 to 1, with a higher value representing a more accurate evaluation. For RMSE, a lower value indicates that the corresponding algorithm produces a more accurate estimation. Generally, a logistic regression function is utilized to provide a nonlinear mapping between the objective scores and the subjective mean opinion scores (MOS), which is defined as:
where β_i, i = 1, 2, …, 5 are regression model parameters, and x is the predicted image quality. After the regression, the above four indices can be calculated for the performance evaluation.
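The logistic function (Eq. (18)) is not reproduced in this text; the five-parameter form standard in IQA evaluation, matching the parameters β_1, …, β_5 just described, is:

```latex
q(x) = \beta_1\!\left(\frac{1}{2} - \frac{1}{1 + e^{\beta_2 (x - \beta_3)}}\right) + \beta_4 x + \beta_5
```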
Unless otherwise specified, the parameters of the proposed PGSD were set as follows. We set the constants c_1 = 170 and c_2 = 180 in Eqs. (10) and (11), respectively. The factors α and β in Eq. (14) were set to α = 0.6 and β = 0.4, respectively. The multi-scale weights in Eq. (17) were set to ω = [0.1333, 0.3448, 0.2856, 0.2363].
The proposed PGSD was compared with eight state-of-the-art FR-IQA techniques, including SSIM, MS-SSIM, IW-SSIM, GSM, FSIMc, GMSD, CSV and MCSD. Both FSIMc and CSV can evaluate color images. It should be noted that the parameters of all comparison algorithms were set to the values suggested in the corresponding references.
Table 2: Performance evaluation results in LIVE, CSIQ and TID2013 databases
In the first experiment, all four performance indices were examined on the three testing databases. The results are presented in Tab. 2, in which the top three performances for each indicator are highlighted in boldface. The best ranking models are PGSD (10 times), CSV (7 times), FSIMc (7 times), MCSD (7 times) and GMSD (6 times). For the CSIQ and TID2013 datasets, all of PGSD's indicators reach the top three. FSIMc, GMSD, MCSD and CSV obtain satisfactory results on only one specific database each. We also show the weighted average of the four indices over the three databases in Tab. 3. We can observe that PGSD has the best performance in SROCC, KROCC and RMSE. The overall performance comparison shows that our improvements are effective.
Table 3: Weighted average results of LIVE, CSIQ and TID2013 databases
Generally, a good IQA model should also have the ability to accurately predict image quality for each specific type of distortion. Tab. 4 lists the SROCC scores for each type in the three databases, in which the top three performances are highlighted in boldface. For the TID2013 database, the proposed PGSD has better performance for color distortions than the conventional GMSD, which means that the color channel decomposition is effective. We can observe that the proposed PGSD ranks among the top three models 27 times over all 35 distortion types, followed by MCSD and GMSD with 24 times and 18 times, respectively.
For the color distortions of contrast change and change of color saturation, the SROCC values of GMSD are 0.3235 and 0.2948, respectively. The corresponding indexes of PGSD are 0.6343 and 0.7785, both of which are ranked in the top three. All the SROCC values of PGSD are above 0.6, which indicates that PGSD is almost valid for all distortion types.
Fig. 4 shows the scatter plots of predicted quality scores against subjective DMOS scores for all the comparison IQA models on the TID2013 database. The curves were obtained by the nonlinear fitting mentioned in Eq. (18). The scatter distribution of MS-SSIM is more centralized than that of SSIM, which means that the multi-scale strategy is effective. In the scatter distributions of GMSD and MCSD, some points concentrate along the line of zero predicted score, which indicates that the corresponding distortions are accurately evaluated. Comparatively, the scatter distribution of PGSD is more concentrated than the others, which means that the subjective and objective scores are more consistent.
Table 4: SROCC performance comparison on each individual distortion
Figure 4: Scatter plots of subjective DMOS against predicted quality scores by IQA models on the TID2013 database. (a) SSIM, (b) MS-SSIM, (c) IW-SSIM, (d) GSM, (e) FSIMc, (f) GMSD, (g) CSV, (h) MCSD and (i) PGSD
In addition, we evaluated the performances using statistical significance tests to draw statistically meaningful conclusions. After nonlinear regression, we compared the prediction residuals of each pair of models by applying the left-tailed F-test at a significance level of 0.05. A value of H=1 indicates that the first model (represented by the row in Fig. 5) is superior to the second one (represented by the column in Fig. 5) in IQA performance. A value of H=0 means that the first model is not significantly better than the second one. Fig. 5(a)-5(c) show the significance test results on the LIVE, CSIQ and TID2013 databases, respectively. We can find that PGSD is significantly better than most models on the CSIQ database. For the LIVE database, PGSD is significantly better than all the others except CSV. In general, taking 0.05 as the level of significance, this evaluation shows that PGSD performs steadily better than most comparison methods.

Besides accuracy, a good IQA model should also have high efficiency. To analyze the complexities of all the comparison IQA models, Tab. 5 lists the running times on an image of size 512×512, where the models are ordered by execution time. All algorithms were implemented on the Matlab R2016a platform and were tested on a PC (Intel Core i3-2120 CPU, 3.30 GHz, 6 GB RAM, and 64-bit Windows 8). The execution time is the average over 30 repetitions for each model. From Tab. 5 we can observe that the proposed PGSD is the fastest among the models that can process color images: it is 1.7 times faster than FSIMc and 4.6 times faster than CSV. Therefore, we can conclude that the proposed method outperforms state-of-the-art methods in terms of both prediction accuracy and efficiency.
Figure 5: The results of statistical significance tests for all the comparison IQA models on (a) LIVE, (b) CSIQ and (c) TID2013 databases. The value 1 indicates that the model in the row is significantly better than the model in the column, while the value 0 indicates that two comparison methods have no significant difference
Table 5: Average execution time of all comparison IQA models
In this paper, we have proposed an FR-IQA model called perceptual gradient similarity deviation (PGSD), which considers perceptual quality factors related to the HVS, including texture edges, chrominance information and viewing distance. To fully take all direction changes into account, a gradient direction selection method has been proposed to automatically determine the pixel-wise perceptual gradient. Both the luminance and chrominance channels have been taken into account to characterize the quality degradation caused by intensity and color distortions. Finally, we have presented a multi-scale pooling strategy that is more accurate and stable than single-scale assessment. The experimental results demonstrate that the proposed PGSD outperforms state-of-the-art methods in terms of prediction accuracy and efficiency. Future work will be devoted to further reducing the complexity of the proposed algorithm and considering more visual features, such as the saliency induced index of Zhang et al. [Zhang, Shen and Li (2014)], to further improve the prediction accuracy for the distortion of color images.
Computers, Materials & Continua, 2018, Issue 9