Description
Will contribute a couple of failing unit tests against this in #231
While investigating #224, it turned out that for some data, notably MODIS, the tile layout sometimes has an unexpectedly inconsistent number of columns and rows. We expect the eastern and southern edges to possibly have fewer cells, but the layout sometimes has one more column or row per tile in the middle of the overall extent. The ProjectedRaster's extent, though, is sized consistently with the other tiles.
A snippet of Python code to highlight the issue:
from pyspark.sql.functions import round as sql_round
from pyrasterframes.rasterfunctions import rf_dimensions, rf_extent

# Assumes `spark` is a SparkSession with RasterFrames enabled.
df = spark.read.raster(
    'https://modis-pds.s3.amazonaws.com/MCD43A4.006/31/11/2017158/MCD43A4.A2017158.h31v11.006.2017171203421_B01.TIF',
    tile_dimensions=(256, 256))

# Per tile: derived cell size, extent size, and dimensions, then a count of
# tiles for each distinct combination.
cell_size_df = df.select(
    sql_round((rf_extent('proj_raster').xmax - rf_extent('proj_raster').xmin) /
              rf_dimensions('proj_raster').cols, 3).alias('x_res'),
    sql_round((rf_extent('proj_raster').ymax - rf_extent('proj_raster').ymin) /
              rf_dimensions('proj_raster').rows, 3).alias('y_res'),
    sql_round(rf_extent('proj_raster').xmax - rf_extent('proj_raster').xmin, 3).alias('x_ext'),
    sql_round(rf_extent('proj_raster').ymax - rf_extent('proj_raster').ymin, 3).alias('y_ext'),
    rf_dimensions('proj_raster').cols.alias('cols'),
    rf_dimensions('proj_raster').rows.alias('rows'),
).groupBy(['x_res', 'y_res', 'x_ext', 'y_ext', 'cols', 'rows']).count().toPandas()

The resulting DF is in inconsistent_grid.csv.zip
Here's a look at it, omitting the easternmost and southernmost tiles. Observe that the extents are consistently sized but the numbers of rows and columns are not. The resolution from gdalinfo is Pixel Size = (463.312716527499731,-463.312716527916677)
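To make the symptom concrete, here is a minimal sketch of the consistency check implied above: divide a tile's extent width by its column count and compare with the gdalinfo pixel size. The helper names are hypothetical, and the 257-column tile is only an illustration of "one more column", not a value taken from the attached CSV.

```python
import math

# x component of the pixel size reported by gdalinfo (from the issue text).
NOMINAL_RES = 463.312716527499731
TILE_SIZE = 256  # tile_dimensions requested in the read


def derived_resolution(extent_width: float, cols: int) -> float:
    """Cell size implied by a tile's extent width and its column count."""
    return extent_width / cols


def is_consistent(extent_width: float, cols: int,
                  nominal: float = NOMINAL_RES, rel_tol: float = 1e-6) -> bool:
    """True if the implied cell size matches the nominal pixel size."""
    return math.isclose(derived_resolution(extent_width, cols), nominal,
                        rel_tol=rel_tol)


# Extent width of a full 256-column tile at the nominal resolution.
full_tile_width = NOMINAL_RES * TILE_SIZE

print(is_consistent(full_tile_width, 256))  # expected column count
print(is_consistent(full_tile_width, 257))  # hypothetical extra column
```

A tile with a consistently sized extent but an extra column implies a smaller cell size than the source raster, which is why the x_res and y_res columns in the grouped output are useful for spotting the problem.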
It's worth noting that the southern/eastern edge is actually a little worse off.
In this case the original product is 2400 x 2400, so it seems the layout should be 9 x 256 + 1 x 96 in each dimension.
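The expected layout can be sketched with plain integer arithmetic. This is only an illustration of the 9 x 256 + 1 x 96 reasoning, not how RasterFrames computes its layout; `tile_sizes` is a hypothetical helper.

```python
from typing import List


def tile_sizes(total_pixels: int, tile_size: int) -> List[int]:
    """Split one raster dimension into full tiles plus a final partial tile."""
    full, remainder = divmod(total_pixels, tile_size)
    sizes = [tile_size] * full
    if remainder:
        sizes.append(remainder)  # only the last (eastern/southern) tile is short
    return sizes


sizes = tile_sizes(2400, 256)
print(len(sizes), sizes[0], sizes[-1])  # 10 256 96
```

Under this scheme every interior tile has exactly 256 columns or rows, so any interior tile reporting a different count points to the inconsistency described in this issue.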

