Scoring

The core quality metric

We identify the following high-level requirements for a super-resolved (SR) image:

The pixel-wise values in SR should be as close as possible to the target HR image (after the removal of unnecessary bias).
The quality of the image should be independent of pixel-values marked as concealed in the target image.
The SR image should not reconstruct volatile features (like clouds) or introduce artifacts.

Given these requirements, we take the Peak Signal-to-Noise Ratio (PSNR) as our starting point and modify a few aspects. The PSNR is based on the pixel-wise Mean Square Error (MSE), which we can easily restrict to clear (i.e. unconcealed) pixels, as reconstruction of clouds is meaningless.

A potential drawback of PSNR is its high sensitivity to biases in brightness. A constant shift in the intensity of every pixel contributes to the error at every pixel, so it quickly dominates the MSE and may be even more detrimental than submitting images with artifacts and clouds. Thus, we equalize the intensities of the submitted images such that the average pixel brightness of both images matches. We call the PSNR modified for brightness and clouds cPSNR.

Lastly, we notice that the absolute value of the cPSNR depends on the image-set, with some image-sets giving (on average) higher cPSNRs than others. To have each set of images contribute equally to the final score, we establish a baseline solution, compute its cPSNR and use it for normalization.

The final score is the average over all normalized cPSNRs.

Image registration

To compensate for pixel shifts, the submitted images are cropped by a 3-pixel border, resulting in a 378x378 format. These cropped images are then evaluated at the corresponding patches around the center of the ground-truth images, with the highest cPSNR determining the score. In the following, $HR$ is the ground-truth image and $SR$ the submitted image, both in 384x384 resolution. We denote the cropped 378x378 images as follows: for all $u, v \in \lbrace 0, \ldots, 6 \rbrace $, $HR_{u,v}$ is the subimage of $HR$ with its upper left corner at coordinates $(u,v)$ and its lower right corner at $(378+u, 378+v)$. Analogously, $SR_{3,3}$ is the center part of $SR$ used for comparison. As this patch is always the same, we will omit the $(3,3)$ offset in the notation and write $SR$ instead of $SR_{3,3}$ for simplicity.
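The cropping scheme above can be sketched in a few lines of NumPy. This is only an illustration, not the official scoring code: the function names `crop_sr` and `hr_patches` are hypothetical, and we assume that images are 2-D arrays and that $u$ indexes the first array axis and $v$ the second.

```python
import numpy as np

BORDER = 3                  # pixels removed on each side of SR
SIZE = 384                  # side length of HR and SR
CROP = SIZE - 2 * BORDER    # side length of the compared patches (378)

def crop_sr(sr):
    """Return the central 378x378 patch SR_{3,3} of a 384x384 image."""
    return sr[BORDER:BORDER + CROP, BORDER:BORDER + CROP]

def hr_patches(hr):
    """Yield ((u, v), HR_{u,v}) for all offsets u, v in {0, ..., 6}."""
    for u in range(2 * BORDER + 1):
        for v in range(2 * BORDER + 1):
            # Assumption: u shifts along axis 0, v along axis 1.
            yield (u, v), hr[u:u + CROP, v:v + CROP]
```

There are 7 x 7 = 49 candidate patches of $HR$, and the patch at offset $(3,3)$ is the one aligned with the cropped $SR$ when no pixel shift is present.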

Formal computation of a submission score

We assume that the pixel-intensities are represented as real numbers $ \in [0, 1] $ for any given image. Furthermore, for any given image $I$ with a corresponding quality map, we define $clear(I)$ as the set of pixel coordinates that are indicated as clear for image $I$. For every possible $u, v$, we first compute the bias in brightness $b$ as follows:

$$ b = \frac{1}{\mid clear(HR_{u,v})\mid} \sum\limits_{(x, y) \in clear(HR_{u,v})} \left( HR_{u,v}(x,y) - SR(x,y) \right). $$

Next, we compute the corrected clear mean-square error $cMSE$ of $SR$ w.r.t. $HR_{u,v}$

$$ cMSE(HR_{u,v}, SR) = \frac{1}{\mid clear(HR_{u,v})\mid} \sum\limits_{(x, y) \in clear(HR_{u,v})} \left(HR_{u,v}(x,y) - (SR(x,y) + b) \right)^2 $$

which results in a clear Peak Signal-to-Noise Ratio (cPSNR) of

$$ cPSNR(HR_{u,v}, SR) = -10 \cdot \log_{10} \left(cMSE(HR_{u,v}, SR)\right). $$
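As a minimal sketch of the three formulas above for a single offset $(u, v)$, the following NumPy function computes $b$, $cMSE$ and $cPSNR$. The name `cpsnr` and its parameters are assumptions made for illustration; in particular, we assume the quality map is available as a boolean array `clear` that is `True` exactly at the coordinates in $clear(HR_{u,v})$.

```python
import numpy as np

def cpsnr(hr_patch, sr_patch, clear):
    """cPSNR of sr_patch w.r.t. hr_patch, restricted to clear pixels.

    hr_patch, sr_patch: float arrays of equal shape, intensities in [0, 1].
    clear: boolean array of the same shape (True = clear pixel).
    """
    # Pixel-wise difference HR - SR on clear pixels only.
    diff = hr_patch[clear] - sr_patch[clear]
    # Brightness bias b: mean difference over the clear pixels.
    b = diff.mean()
    # cMSE: HR - (SR + b) equals diff - b.
    cmse = np.mean((diff - b) ** 2)
    return -10.0 * np.log10(cmse)

# Tiny example: errors of +/-0.1 around a flat image give cMSE = 0.01.
hr = np.full((2, 2), 0.5)
sr = np.array([[0.6, 0.4], [0.4, 0.6]])
clear = np.ones((2, 2), dtype=bool)
print(cpsnr(hr, sr, clear))        # approximately 20
print(cpsnr(hr, sr + 0.2, clear))  # approximately 20 as well: the shift is removed by b
```

Note that a perfect reconstruction (up to a constant shift) has $cMSE = 0$, for which this sketch returns infinity and NumPy emits a divide-by-zero warning.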

Let $N(HR)$ be the baseline cPSNR of image $HR$ as found in the file norm.csv. The individual score for image $SR$ is

$$ z(SR) = \min\limits_{u, v \in \lbrace 0, \ldots, 6 \rbrace} \left\lbrace \frac{N(HR)}{cPSNR(HR_{u,v}, SR)} \right\rbrace. $$
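The search over all offsets can be sketched as follows. This is an illustrative implementation under stated assumptions, not the official evaluation code: the names `individual_score`, `hr_clear` and `baseline_cpsnr` are hypothetical, images are assumed to be square 2-D arrays, the quality map of $HR$ is assumed to be a boolean array cropped at the same offset as the image, and $u$ is assumed to index the first array axis.

```python
import numpy as np

BORDER = 3               # border cropped from SR
MAX_SHIFT = 2 * BORDER   # offsets u, v range over 0..6

def cpsnr(hr_patch, sr_patch, clear):
    """Bias-corrected PSNR over clear pixels (see the formulas above)."""
    diff = hr_patch[clear] - sr_patch[clear]
    cmse = np.mean((diff - diff.mean()) ** 2)
    return -10.0 * np.log10(cmse)

def individual_score(hr, hr_clear, sr, baseline_cpsnr):
    """z(SR): minimum over all offsets of N(HR) / cPSNR(HR_{u,v}, SR)."""
    crop = hr.shape[0] - 2 * BORDER
    sr_center = sr[BORDER:BORDER + crop, BORDER:BORDER + crop]
    ratios = []
    for u in range(MAX_SHIFT + 1):
        for v in range(MAX_SHIFT + 1):
            hr_uv = hr[u:u + crop, v:v + crop]
            clear_uv = hr_clear[u:u + crop, v:v + crop]
            ratios.append(baseline_cpsnr / cpsnr(hr_uv, sr_center, clear_uv))
    return min(ratios)
```

Here `baseline_cpsnr` stands for the value $N(HR)$ that would be read from norm.csv for the corresponding image.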

Finally, the overall score of the submission is

$$ Z(submission) = \frac{1}{\mid submission \mid} \sum\limits_{SR \in submission} z(SR). $$

If this score is lower than $1$, the super-resolution performs, on average, better than the baseline submission, i.e. a merger of some bicubic upscalings of low resolution images. See the section "example submission" under Submission Rules for details.
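To see why lower scores are better, note that $N(HR)$ does not depend on the offset, so (assuming the cPSNR values are positive) minimizing the ratio over $u, v$ is the same as dividing $N(HR)$ by the highest cPSNR. As a worked example with purely hypothetical numbers, suppose $N(HR) = 40$:

$$ \max_{u,v} cPSNR(HR_{u,v}, SR) = 42 \;\Rightarrow\; z(SR) = \frac{40}{42} \approx 0.952, \qquad \max_{u,v} cPSNR(HR_{u,v}, SR) = 38 \;\Rightarrow\; z(SR) = \frac{40}{38} \approx 1.053. $$

In the first case the submitted image beats the baseline for this image; in the second it does not.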

During the competition, your submissions are evaluated only on a fixed selection of half of the available test image-sets. The final evaluation and ranking will be computed at the end of the competition on the complete test-set.