Raster Data
In this section, we revisit the gdalworkshop/world.tif
dataset.
dataset = AG.read("gdalworkshop/world.tif")
GDAL Dataset (Driver: GTiff/GeoTIFF) File(s): Dataset (width x height): 2048 x 1024 (pixels) Number of raster bands: 3 [GA_ReadOnly] Band 1 (Red): 2048 x 1024 (UInt8) [GA_ReadOnly] Band 2 (Green): 2048 x 1024 (UInt8) [GA_ReadOnly] Band 3 (Blue): 2048 x 1024 (UInt8)
A description of the display is available in Raster Datasets.
Raster Bands
We can examine an individual raster band
band = ArchGDAL.getband(dataset, 1)
[GA_ReadOnly] Band 1 (Red): 2048 x 1024 (UInt8) blocksize: 256×256, nodata: nothing, units: 1.0px + 0.0 overviews: (0) 1024x512 (1) 512x256 (2) 256x128 (3) 128x64 (4) 64x32 (5) 32x16 (6) 16x8
You can programmatically retrieve the information in the header using
ArchGDAL.accessflag(band)
: the access flag for this band. (GA_ReadOnly
)colorinterp =
ArchGDAL.getcolorinterp(band)
: color interpretation of the values in the band (GCI_RedBand
)ArchGDAL.getname(colorinterp)
: name (string) corresponding to color interpretation ("Red"
)ArchGDAL.width(band)
: width (pixels) of the band (2048
)ArchGDAL.height(band)
: height (pixels) of the band (1024
)ArchGDAL.indexof(band)
: the index of the band (1+) within its dataset, or 0 if unknown. (1
)ArchGDAL.pixeltype(band)
: pixel data type for this band. (UInt8
)
You can get additional attribute information using
ArchGDAL.getscale(band)
: the scale inunits = (px * scale) + offset
(1.0
)ArchGDAL.getoffset(band)
: the offset inunits = (px * scale) + offset
(0.0
)ArchGDAL.getunittype(band)
: name for the units, e.g. "m" (meters) or "ft" (feet). (""
)ArchGDAL.getnodatavalue(band)
: a special marker value used to mark pixels that are not valid data. (-1.0e10
)(x,y) =
ArchGDAL.blocksize(band)
: the "natural" block size of this band ((256,256)
)
GDAL contains a concept of the natural block size of rasters so that applications can organized data access efficiently for some file formats. The natural block size is the block size that is most efficient for accessing the format. For many formats this is simple a whole scanline. However, for tiled images this will typically be the tile size.
Finally, you can obtain overviews:
ArchGDAL.noverview(band)
: the number of overview layers available, zero if none. (7
)ArchGDAL.getoverview(band, i)
: returns thei
-th overview in the raster band. Each overview is itself a raster band, e.g.
ArchGDAL.getoverview(band, 2)
[GA_ReadOnly] Band 1 (Red): 256 x 128 (UInt8) blocksize: 128×128, nodata: nothing, units: 1.0px + 0.0 overviews:
Raster I/O
Reading Raster Values
The general operative method for reading in raster values from a dataset
or band
is to use ArchGDAL.read()
.
ArchGDAL.read(dataset)
: reads the entire dataset as a single multidimensional array.ArchGDAL.read(dataset, indices)
: reads the raster bands at theindices
(in that order) into a multidimensional array.ArchGDAL.read(dataset, i)
: reads thei
-th raster band into an array.ArchGDAL.read(band)
: reads the raster band into an array.
The array returned by read
has (cols, rows, bands)
dimensions.
To convert to a format used by the Images.jl ecosystem, you can either create a view using PermutedDimsArray(A, (3,2,1))
or create a permuted copy using permutedims(A, (3,2,1))
. The resulting arrays will have (bands, rows, cols)
dimensions.
You can also specify the subset of rows and columns (provided as UnitRange
s like 2:9
) to read:
ArchGDAL.read(dataset, indices, rows, cols)
ArchGDAL.read(dataset, i, rows, cols)
ArchGDAL.read(band, rows, cols)
On other occasions, it might be easier to first specify a position (xoffset,yoffset)
to read from, and the size (xsize, ysize)
of the window to read:
ArchGDAL.read(dataset, indices, xoffset, yoffset, xsize, ysize)
ArchGDAL.read(dataset, i, xoffset, yoffset, xsize, ysize)
ArchGDAL.read(band, xoffset, yoffset, xsize, ysize)
You might have an existing buffer that you wish to read the values into. In such cases, the general API for doing so is to write ArchGDAL.read!(source, buffer, args...)
instead of ArchGDAL.read(source, args...)
.
Writing Raster Values
For writing values from a buffer
to a raster dataset
or band
, the following methods are available:
ArchGDAL.write!(band, buffer)
ArchGDAL.write!(band, buffer, rows, cols)
ArchGDAL.write!(band, buffer, xoffset, yoffset, xsize, ysize)
ArchGDAL.write!(dataset, buffer, i)
ArchGDAL.write!(dataset, buffer, i, rows, cols)
ArchGDAL.write!(dataset, buffer, i, xoffset, yoffset, xsize, ysize)
ArchGDAL.write!(dataset, buffer, indices)
ArchGDAL.write!(dataset, buffer, indices, rows, cols)
ArchGDAL.write!(dataset, buffer, indices, xoffset, yoffset, xsize, ysize)
ArchGDAL expects the dimensions of the buffer to be (cols, rows, bands)
or (cols, rows)
.
Windowed Reads and Writes
Following the description in mapbox/rasterio's documentation, a window is a view onto a rectangular subset of a raster dataset. This is useful when you want to work on rasters that are larger than your computers RAM or process chunks of large rasters in parallel.
For that purpose, we have a method called ArchGDAL.windows(band)
which iterates over the windows of a raster band, returning the indices corresponding to the rasterblocks within that raster band for efficiency:
using Base.Iterators: take # to prevent showing all blocks
windows = ArchGDAL.windows(band)
for (cols, rows) in take(windows, 5)
@info "Window" cols rows
end
┌ Info: Window │ cols = 1:256 └ rows = 1:256 ┌ Info: Window │ cols = 1:256 └ rows = 257:512 ┌ Info: Window │ cols = 1:256 └ rows = 513:768 ┌ Info: Window │ cols = 1:256 └ rows = 769:1024 ┌ Info: Window │ cols = 257:512 └ rows = 1:256
Alternatively, we have another method called ArchGDAL.blocks(band)
which iterates over the windows of a raster band, returning the offset
and size
corresponding to the rasterblocks within that raster band for efficiency:
blocks = ArchGDAL.blocks(band)
for (xyoffset, xysize) in take(blocks, 5)
@info "Window offset" xyoffset xysize
end
┌ Info: Window offset │ xyoffset = (0, 0) └ xysize = (256, 256) ┌ Info: Window offset │ xyoffset = (1, 0) └ xysize = (256, 256) ┌ Info: Window offset │ xyoffset = (2, 0) └ xysize = (256, 256) ┌ Info: Window offset │ xyoffset = (3, 0) └ xysize = (256, 256) ┌ Info: Window offset │ xyoffset = (0, 1) └ xysize = (256, 256)
These methods are often used for reading/writing a block of image data efficiently, as it accesses "natural" blocks from the raster band without resampling, or data type conversion.
Using the DiskArray interface
Raster bands as 2D DiskArrays
As of ArchGDAL version 0.5.0 and higher a RasterBand
is a subtype of AbstractDiskArray
from the DiskArrays.jl package. This means that a RasterBand
is also an AbstractArray
and can therefore be treated like any Julia array. This means that square bracket indexing works in addition to the read
methods described above.
band[1000:1010,300:310]
11×11 Matrix{UInt8}: 0x87 0x8a 0x7b 0x5f 0x32 0x08 0x11 0x11 0x0d 0x0e 0x09 0x8c 0x96 0x70 0x70 0x32 0x10 0x0e 0x0e 0x0a 0x11 0x0a 0x88 0xa5 0x87 0x6f 0x25 0x10 0x0f 0x10 0x0a 0x10 0x07 0x80 0x9c 0x87 0x60 0x1c 0x16 0x10 0x0b 0x03 0x0c 0x0f 0x7a 0x7e 0x6e 0x57 0x2d 0x12 0x0f 0x0c 0x05 0x0e 0x0f 0x80 0x6e 0x5e 0x5b 0x3d 0x0a 0x0e 0x13 0x0f 0x13 0x07 0x8f 0x70 0x58 0x5a 0x26 0x0f 0x10 0x11 0x0a 0x11 0x0f 0x96 0x73 0x56 0x63 0x1b 0x13 0x10 0x0d 0x03 0x0e 0x18 0x91 0x71 0x5a 0x80 0x35 0x0b 0x0e 0x13 0x0b 0x10 0x0f 0x87 0x69 0x67 0x84 0x61 0x00 0x0f 0x0c 0x04 0x0c 0x11 0x88 0x6a 0x72 0x70 0x1a 0x0e 0x0f 0x0c 0x09 0x06 0x06
Also, windowed reading of the data can alternatively be done through the DiskArrays interface:
using DiskArrays: eachchunk
# take only the first 3 chunks to decrease output
using Base.Iterators: take
for (rows, cols) in take(eachchunk(band), 3)
@info "window" rows cols
end
┌ Info: window │ rows = CartesianIndex(1, 1) └ cols = CartesianIndex(2, 1) ┌ Info: window │ rows = CartesianIndex(257, 1) └ cols = CartesianIndex(258, 1) ┌ Info: window │ rows = CartesianIndex(513, 1) └ cols = CartesianIndex(514, 1)
This code is similar to the window function mentioned in Windowed Reads and Writes but more portable because the raster band can be exchanged with any other type implementing the DiskArrays interface. Also, for many operations it will not be necessary anymore to implement the window loop, since the DiskArrays
package provides efficient implementations for reductions and lazy broadcasting, so that for example operations like:
sum(sqrt.(band), dims=1)
1×1024 Matrix{Float64}: 7094.48 7094.48 7094.48 7094.48 … 30197.9 30368.0 30295.8 29723.7
will read the data block by block allocating only the amount of memory in the order of the size of a single raster block. See the DiskArrays README for more information on DiskArrays.jl
The RasterDataset type
Many raster datasets that contain multiple bands of the same size and data type can also be abstracted as a 3D array where the last dimension represents the band index. In order to open a raster dataset in a way that it is represented as a 3D AbstractArray
there is the readraster
funtion. It returns a RasterDataset
which is a thin wrapper around a Dataset
but it is a subtype of AbstractDiskArray{T,3}
and therefore part of the array hierarchy.
This means that data can be accessed with the square-bracket syntax
dataset = AG.readraster("gdalworkshop/world.tif")
dataset[1000,300,:]
3-element Vector{UInt8}: 0x87 0x78 0x41
and broadcasting, views and reductions are provided by the DiskArrays package. In addition, many ArchGDAL functions like (getband
, nraster
, getgeotransform
, etc) are delegated to the wrapped Dataset and work for RasterDatasets as well.