Converting to Cloud Optimized GeoTiff (COG) format #27
Replies: 9 comments 13 replies
-
Cloud optimized geotiffs are really only necessary when accessing files over in the internet (passing a URL and it going through GDALs VFS interface). If the file is on your local system, it being a COG isn't all that necessary. With that said, large image's |
Beta Was this translation helpful? Give feedback.
-
Yes, I have terabytes of the things. :) Good suggestion for others. import rioxarray
kw = {}
kw.update(driver="COG")
result['red'].rio.to_raster('ResultRed.tif', **kw) |
Beta Was this translation helpful? Give feedback.
-
I have found it to be pretty good generally. Could be issues with dask, maybe, but the vanilla version:- |
Beta Was this translation helpful? Give feedback.
-
Although had reason to use the above in an environment with only GDAL 3.0. session = boto3.Session(profile_name='cogstuff')
with rasterio.Env(session):
s3 = session.resource('s3')
object = s3.Object('cogbucket', 'test/test_COG_2.tif')
tif_file2 = '/vsis3/cogbucket/test/test.tif'
ds = gdal.Open(tif_file2 )
print(ds)
gdal.Translate('/vsimem/temp_file.tif', ds, format="GTiff", creationOptions=['TILED=YES' 'COPY_SRC_OVERVIEWS=YES' 'COMPRESS=LZW', 'BLOCKXSIZE=512', 'BLOCKYSIZE=512', 'PREDICTOR=YES'])
f = gdal.VSIFOpenL('/vsimem/temp_file.tif', 'rb')
gdal.VSIFSeekL(f, 0, 2) # seek to end
size = gdal.VSIFTellL(f)
gdal.VSIFSeekL(f, 0, 0) # seek to beginning
data = gdal.VSIFReadL(1, size, f)
gdal.VSIFCloseL(f)
object.put(Body=data)
ds = None |
Beta Was this translation helpful? Give feedback.
-
I have an issue with displaying properly plots from xarray (stored in zarr) through mapbox (don't really know how I would tile the underlaying raster) as I am running into memory issues. |
Beta Was this translation helpful? Give feedback.
-
FWIW, I'm working on a blog (will share when it posts) to highlight differences in "valid" COGs. Here is a script that will highlight the difference for the same image saved as two formats:
from memory_profiler import profile
from osgeo import gdal
kwargs = dict(
xRes=611.49622628141,
yRes=611.49622628141,
outputBounds=[-9079495.967826376, 3443946.7464169012, -8922952.933898335, 3600489.780344942],
srcSRS="+proj=utm +zone=17 +datum=WGS84 +units=m +no_defs",
dstSRS="epsg:3857",
)
@profile
def get_tile(path):
# Open from remote server that supports HTTP range requests
source = gdal.Open(path)
# Extract tile 8/70/105
ds = gdal.Warp("", source, format="VRT", **kwargs)
# Perform operation and fetch data
return ds.ReadAsArray()
valid = "/vsicurl?url=https%3A%2F%2Fdata.kitware.com%2Fapi%2Fv1%2Ffile%2F62d70e66bddec9d0c478d85c%2Fdownload&use_head=no&list_dir=no"
invalid = "/vsicurl?url=https%3A%2F%2Fdata.kitware.com%2Fapi%2Fv1%2Ffile%2F626733ce4acac99f42fa5894%2Fdownload&use_head=no&list_dir=no"
# Valid COG
get_tile(valid)
# uses ~25Mb
# 13.8 ms ± 1.54 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
# Non-COG
get_tile(invalid)
# Uses ~160 Mb
# 17.2 s ± 1.74 s per loop (mean ± std. dev. of 7 runs, 1 loop each) The script and data are hosted here: https://data.kitware.com/#folder/6267337e4acac99f42fa5844 |
Beta Was this translation helpful? Give feedback.
-
Can you add your code? |
Beta Was this translation helpful? Give feedback.
-
If you are reading zarr and hence getting a dask backed dataset - you will have problems writing those via rioxarray using the COG driver. At least last tiem It ried it. You'd have to definite it with the various rasterio parameters as Bane gives above. A 10K x 20K raster will take a while if you are downloading it on home type bandwidth. |
Beta Was this translation helpful? Give feedback.
-
I have tried a new method to generate a "COG" directly from a stacked xarray dataset. Again it is quite slow and I have difficulties to understand. It seems to me that all the formats are similar so there should not be a lot of operations. The data go in memory and it seems to run for a while with quite a good usage of the cores but I am puzzled that it takes so much. I must be doing something wrong from osgeo import osr, gdal
import xarray as xr
ds= xr.open_zarr('westcoast_master_50m.zarr')
temp_arr= ds.to_stacked_array('v', ['y', 'x']).sum(dim='v')
res=50
width, height=len(ds.x), len(ds.y)
COG='Optimised Geotiff.tif'
options = [
'TILED=YES',
'COMPRESS=LZW',
'COPY_SRC_OVERVIEWS=YES',
]
driver=gdal.GetDriverByName('GTiff')
outraster=driver.Create(COG, width, height,1, gdal.GDT_Float32, options=options)
outraster.SetGeoTransform([ds.x.values[0],50,0,ds.y.values[0],-50,0])
outrasterSRS= osr.SpatialReference()
outrasterSRS.ImportFromEPSG(3857)
outraster.SetProjection(outrasterSRS.ExportToWkt())
outraster.GetRasterBand(1).WriteArray(temp_arr.values) #<-- heavy stuff in memory
outraster.FlushCache() This time I got a 56.4 Mb Geotiff |
Beta Was this translation helpful? Give feedback.
-
Maybe a paragraph in the README 'here's some things to make COGs with'?
Beta Was this translation helpful? Give feedback.
All reactions