Evaluating Super-Resolution Sentinel Imagery for Tree Counting

Ben Bogart
9 min read · Feb 27, 2024

Part 1: Georeferencing the Yosemite Tree Dataset

Introduction

At Komaza, our mission involves cultivating tiny forests, each home to roughly 200 trees. Given this scale, traditional 10m resolution Sentinel 2 imagery just doesn’t cut it for tree counting. High-resolution imagery has been economically out of reach.

By now, you have probably heard of super-resolution satellite imagery in the GIS world. You’ve probably also heard of Yosef Akhtman’s groundbreaking Sentinel-2 Deep Resolution 3.0 model, which boasts Sentinel-2 imagery with an impressive 10 bands at 1m resolution. This piqued my curiosity: Can super-resolution (SR) imagery fill in the gaps, painting a detailed enough picture of each tree to enable “good enough” tree counting on these micro forests?

For a closer look at the Sentinel-2 Deep Resolution work, check out Akhtman’s blog post here:

Initially, it seemed like a dead end. I took the known tree locations (collected with a smartphone in close proximity to the trees) and layered them over Akhtman’s 1m SR imagery (acquired from Digifarm). The resolution is not sufficient to distinguish individual trees.

Tree points on 1m SR imagery from Sentinel-2 Deep Resolution 2.0 acquired from Digifarm — Image by author

But here’s where it gets interesting. What if the solution lies in the extra bands of data or maybe the temporal information available through Sentinel 2 imagery? Yosef himself thinks we’re teetering on the brink of possibility. So, why don’t we dive in and see if we can tip the scales? Join me as we explore this frontier together.

Part 1: Georeferencing the Yosemite Dataset

To evaluate the possibility of tree counting with 1m SR imagery, I am turning to a benchmark tree counting dataset known as the Yosemite Tree Dataset by Guang Chen. You can find it on GitHub here:

As with all three of the most common tree counting datasets available today, the images are not georeferenced (they contain no spatial information for viewing on a map or aligning with other datasets). However, this dataset is a single, continuous image, which means we can fairly easily georeference the dataset in one go. Once this is done, we can download the corresponding SR imagery and compare tree counting methods on the two datasets.

We have to be careful: we cannot resample the labels. The labels act as a density map for the trees (or they will once we apply a Gaussian blur to them later in this series). The sum of the values in that density map must equal the tree count, and resampling the labels destroys that relationship. Since we cannot resample the labels, we cannot resample the dataset either. The sketch just below illustrates the problem with toy values.
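To see why, here is a small, purely illustrative Python sketch (toy values, not the actual label file): single-pixel tree annotations sum to the tree count, but a resampled version of the same array generally does not.

import numpy as np
from scipy.ndimage import zoom

# Toy label map: five single-pixel tree annotations
labels = np.zeros((100, 100), dtype=np.float32)
labels[[10, 30, 55, 70, 90], [20, 40, 60, 15, 80]] = 1.0
print(labels.sum())  # 5.0 -- equals the tree count

# Bilinear resampling to half resolution breaks the count/sum relationship
resampled = zoom(labels, 0.5, order=1)
print(resampled.sum())  # generally no longer 5.0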

So our process is going to be:

  1. Generate and export the affine transformation parameters that will serve to georeference the dataset image.
  2. Generate a GeoTiff from the PNG, incorporating the affine transformation parameters and specifying a coordinate reference system, all without resampling.
  3. Validate our work to ensure everything is as it should be.

Generate and export the affine transformation parameters

The foundation of accurate georeferencing lies in identifying Ground Control Points (GCPs) — specific locations in the image whose real-world coordinates are known. For the Yosemite Tree Dataset, which originates from Google Maps, we utilize a Google Maps satellite base layer for our georeferencing efforts within QGIS version 3.26.1.

Setting Up the Base Layer:

Integrate Google Maps Satellite: By adding the Google Maps Satellite layer to our QGIS project, we ensure a consistent reference frame. Follow these steps:

  • Navigate to the Browser Panel (View > Panels > Browser Panel if it’s not already open)
  • Right-click on XYZ Tiles and create a new connection named Google Maps Satellite with the following URL:
http://mt0.google.com/vt/lyrs=s&hl=en&x={x}&y={y}&z={z}
XYZ Connection settings — Image by Author
  • Double-click the Google Maps Satellite item under XYZ tiles to activate the base layer in your project.

Coordinate Reference System (CRS) Selection: Google Maps imagery is served in the Web Mercator projection, commonly referenced by the unofficial code EPSG:900913 (equivalent to EPSG:3857), which we adopt for our project to match the dataset’s original mapping source. This selection is made under Project > Properties > CRS or via the CRS selector in the lower right-hand corner of the QGIS interface.

Select Project CRS — Image by Author

Using the Georeferencer

We’ll use the Georeferencer to add Ground Control points that QGIS will use to calculate the affine transformation parameters we need. To load the dataset in the Georeferencer:

  • Open the Georeferencer tool. This has moved around over time, but as of QGIS 3.26 it is located in the menu Layer > Georeferencer....
  • Load the data set image z20_data.png from the Yosemite Tree Dataset in the Georeferencer with the menu option File > Open Raster.

Setting up the Georeferencer

Our objective here is precise: to avoid resampling and thus preserve the integrity of the original image and its labels. The following settings are critical to accomplish this:

  • Open Settings > Transformation Settings….
  • Set Transformation type to Linear
  • Set Target CRS to EPSG:900913. Because we know the image is projected in this CRS (it is from Google Maps), this ensures we only rotate and scale the image and do not calculate skew.
  • Enable Create world file only (linear transformations). This will prevent resampling and will store the affine transformation parameters in a file we can use to generate a GeoTIFF.
Transformation Settings — Image by Author

Incorporating Ground Control Points

Identifying at least three GCPs that are discernible in both the dataset image and the Google Maps base layer is crucial. I was lucky to get some general orientation from the dataset owner, which allowed me to get close enough to visually identify how the dataset fits with the map. The dataset image is of higher resolution and was captured at a different time of day and in a different season than the base map, so I relied on solid geological features for the GCPs.

To add GCPs:

  • Use the Add Point tool
  • Select the point in the Georeferencer
  • In the resulting dialog, select from map canvas and select the point on the base map.

For a video demonstration of this process, see this video on YouTube by Alessio Rovere.

Here is a points file with the GCPs I used. You can import this into Georeferencer with the menu option File > Load GCP Points ...

Export the World File

After defining our GCPs, we calculate and export the affine transformation parameters to a world file (z20_data.wld) with the menu option File > Start Georeferencing or with the green triangle in the toolbar. This file, along with the original PNG, will allow us to generate a georeferenced GeoTIFF file.

You should now have a file named z20_data.wld with contents similar to this (the six lines are, in order: the x pixel size, two rotation terms, the negative y pixel size, and the x and y map coordinates of the center of the upper-left pixel):

0.14928196386703424
0
0
-0.14928399627483205
-13310022.86812310852110386
4562262.78164943121373653
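To make those six numbers concrete, here is a minimal Python sketch (my own illustration, not part of the georeferencing workflow; the function name pixel_to_map is just for demonstration) showing how the world-file values map a pixel index to map coordinates:

with open('z20_data.wld') as wld:
    A, D, B, E, C, F = (float(line) for line in wld)

def pixel_to_map(col, row):
    # World file convention: x = A*col + B*row + C, y = D*col + E*row + F,
    # where C and F are the map coordinates of the center of the upper-left pixel
    return A * col + B * row + C, D * col + E * row + F

print(pixel_to_map(0, 0))      # upper-left pixel center
print(pixel_to_map(500, 500))  # a pixel 500 columns right and 500 rows down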

Generate a GeoTiff without resampling

With the affine transformation parameters securely stored in our world file, the next step is to create a GeoTiff. This process necessitates using GDAL, a powerful open-source library for manipulating raster geospatial data formats. Ensuring GDAL is installed on your system is a prerequisite for the following steps.

Converting PNG to GeoTiff:

Utilizing the command line, we employ gdal_translate to transform our PNG image into a GeoTiff format. This step is crucial for integrating the coordinate reference system and applying a compression option to optimize the file size without compromising data integrity. These are the necessary gdal_translate options:

  • -a_srs assigns the coordinate reference system to your GeoTiff, because the world file format does not specify one.
  • -co appends a creation option. In this case, I specify the compression format.

The full command is:

gdal_translate -of GTiff -a_srs EPSG:900913 -co "COMPRESS=DEFLATE" z20_data.png z20_data.tif

Aligning the Label File:

The integrity of our dataset extends to its labels. To ensure consistency, we replicate the world file for our label image, maintaining a 1:1 correspondence between the imagery and its annotations. This duplication is followed by a similar GDAL translation process, mirroring the steps taken with the primary dataset image:

cp z20_data.wld z20_label.wld
gdal_translate -of GTiff -a_srs EPSG:900913 -co "COMPRESS=DEFLATE" z20_label.png z20_label.tif

This operation not only preserves the spatial accuracy of our dataset but also sets the stage for subsequent analyses by ensuring that both the imagery and its labels are perfectly aligned in a geospatial context.

Extracting Tree Geopoints (Extra Credit)

While not a prerequisite for our primary modeling task, having a vector dataset of the tree locations makes visualization easier and opens the door to deeper analysis of the data.

Converting Pixel Locations to Tree Geopoints

The method involves two steps: first, extracting the labeled geopoints from our GeoTIFF file, and second, reprojecting those points into the widely used EPSG:4326 coordinate reference system. This process leverages Python’s geospatial libraries: rasterio for handling raster data, shapely for point representations, and geopandas for reprojection. For details, see the comments in the script below.

import rasterio
from shapely.geometry import Point
import geopandas as gpd

raster_path = 'z20_label.tif'
source_crs = 'EPSG:900913'  # Raster's original CRS
target_crs = 'EPSG:4326'    # Target CRS

# Open the raster and locate every labeled (value 255) pixel
with rasterio.open(raster_path) as src:
    band1 = src.read(1)
    rows, cols = (band1 == 255).nonzero()
    # Convert pixel (row, col) indices to coordinates in the raster's CRS
    raster_crs_coords = [src.xy(row, col) for row, col in zip(rows, cols)]

# Convert coordinates to Shapely Points
point_coords = [Point(x, y) for x, y in raster_crs_coords]

# Generate a GeoSeries in our original CRS
gs = gpd.GeoSeries(point_coords, crs=source_crs)

# Convert to EPSG:4326 and save as GeoJSON
gs.to_crs(target_crs).to_file('trees.geojson')
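As an optional sanity check (my own addition, not part of the dataset’s tooling), the number of extracted points should equal the number of labeled pixels, i.e. the dataset’s tree count:

import geopandas as gpd

# Each labeled pixel became one point, so this should print the tree count
print(len(gpd.read_file('trees.geojson')))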

Validate the Work

To validate our work, we need to confirm two crucial outcomes:

  1. The pixel values are identical between the original PNG and the generated GeoTIFF.
  2. The GeoTIFF is correctly georeferenced.

Validating Pixel Values

For the first validation step, we employ a straightforward script that compares the pixel values of the two image files to ensure they match exactly. The script uses the gdal module from the osgeo package to load each image as a numpy array and then compares the arrays for equality:

from osgeo import gdal
import numpy as np

def loadnumpy(path):
    # Read every band of the image into a single (bands, rows, cols) array
    ds = gdal.Open(path)
    bands_data = []
    for b in range(1, ds.RasterCount + 1):
        bands_data.append(ds.GetRasterBand(b).ReadAsArray())
    img = np.stack(bands_data, axis=0)
    ds = None
    return img

def compare_images(path1, path2):
    img1 = loadnumpy(path1)
    img2 = loadnumpy(path2)

    print(f'Image 1 shape: {img1.shape}')
    print(f'Image 2 shape: {img2.shape}')

    if np.array_equal(img1, img2):
        print("The pixel values of the two images are identical")
    else:
        print("The pixel values of the two images are NOT identical")

compare_images(
    'z20_data.png',
    'z20_data.tif'
)

compare_images(
    'z20_label.png',
    'z20_label.tif'
)

Running this, we get the following output, which confirms that the sizes and pixel values in the GeoTIFF files are indeed identical to those in the original PNG files. No resampling occurred during the conversion process.

Image 1 shape: (3, 46400, 27200)
Image 2 shape: (3, 46400, 27200)
The pixel values of the two images are identical
Image 1 shape: (1, 46400, 27200)
Image 2 shape: (1, 46400, 27200)
The pixel values of the two images are identical

Validating Georeferencing

For the second validation, I conducted a visual inspection within QGIS. This involves overlaying the georeferenced GeoTIFF on top of a Google satellite base map and using a map swipe tool to visually compare the alignment of features between the two layers. If the georeferenced layer correctly aligns with identifiable features on the satellite base map, it validates that the GeoTIFF is properly georeferenced.
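For a complementary, non-visual check, gdalinfo (installed alongside GDAL) reports the CRS and corner coordinates written into the GeoTIFF, which you can compare against the extent of the area on the base map:

gdalinfo z20_data.tif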

Done!

Conclusion and Next Steps

The good news is that if you want to use this dataset and need it georeferenced, you do not need to go through this process. Mr. Chen, who is responsible for the 98,949 tree dataset, has agreed to include this GeoTIFF in the public repository soon. However, if you decide to take this on for another dataset, you now have the tools.

In the next installment of this series, I will delve into constructing a dataset designed to test the efficacy of 10-band super-resolution (SR) imagery in the context of tree counting, particularly in comparison to high-resolution RGB aerial imagery. The model I plan to use, named TreeFormer, leverages a local tree count ranking mechanism that uses unlabeled data to enhance the model training. So, in addition to acquiring super-resolution corresponding to the Yosemite Tree Dataset, we will also acquire additional unlabeled high-resolution and super-resolution imagery to capitalize on this ranking mechanism.

Stay tuned. Part II is coming soon.
