This vignette aims to provide a “cookbook” walking through common use cases and code patterns for rsi. If you’ve got a problem that it seems like rsi should be able to solve, hopefully this document can help – otherwise, open an issue and we’ll see if we can get you started! And if you’ve got a use case that took you a second to figure out, please feel free to open a PR to add it as an example to this document.
With that introduction out of the way, we’ll go ahead and load rsi:
And start answering: How can I…
Get one composite per year, month, or other interval?
If you’re looking to get separate files for several intervals, you’ll
need to call get_stac_data()
separately for each of those
intervals. The easiest way to do this is through something like
vapply()
or a for-loop. Iterate along your intervals of
interest, construct your start_date
and
end_date
inside of each iteration, and call
get_stac_data()
using those dates as arguments:
aoi <- sf::st_point(c(-74.912131, 44.080410))
aoi <- sf::st_set_crs(sf::st_sfc(aoi), 4326)
aoi <- sf::st_buffer(sf::st_transform(aoi, 5070), 1000)
downloaded_years <- vapply(
2018:2020,
function(year) {
get_stac_data(
aoi = aoi,
start_date = glue::glue("{year}-01-01"),
end_date = glue::glue("{year}-12-31"),
asset_names = "lcpri",
stac_source = "https://planetarycomputer.microsoft.com/api/stac/v1",
collection = "usgs-lcmap-conus-v13",
output_filename = tempfile(fileext = ".tif")
)
},
character(1)
)
downloaded_years
#> [1] "/tmp/RtmpwkKpr9/file28725a50761.tif"
#> [2] "/tmp/RtmpwkKpr9/file287250d4cd3f.tif"
#> [3] "/tmp/RtmpwkKpr9/file287257b6f008.tif"
This ensures that get_stac_data()
is run using the same
arguments each time, so your outputs should be standardized across each
interval!
Filter the imagery I download by cloud cover (or other metadata)?
If you want to refine the outputs from your STAC query, the best approach is to write a custom query function using CQL2. CQL2 is a complicated topic, which is covered in a bit more detail both in the Downloading Data vignette as well as in the STAC website’s tutorials, but at its core is a query language that lets us filter our results down using a spatiotemporal area of interest as well as other item-level metadata.
This requires knowing what metadata your STAC API provides for the
items you’re querying! Luckily, a good number of these fields are
standardized via the STAC standard and various STAC extensions. For
instance, items implementing the electro-optical
extension will have an eo:cloud_cover
field which we
could use to filter the results of our query.
Writing a query function to filter using this field might look like this:
aoi <- sf::st_point(c(-74.912131, 44.080410))
aoi <- sf::st_set_crs(sf::st_sfc(aoi), 4326)
aoi <- sf::st_buffer(sf::st_transform(aoi, 5070), 1000)
custom_query_function <- function(bbox,
stac_source,
collection,
start_date,
end_date,
limit,
...) {
geometry <- rstac::cql2_bbox_as_geojson(bbox)
datetime <- rstac::cql2_interval(start_date, end_date)
request <- rstac::ext_filter(
rstac::stac(stac_source),
collection == {{collection}} &&
t_intersects(datetime, {{datetime}}) &&
s_intersects(geometry, {{geometry}}) &&
`eo:cloud_cover` < 50
)
rstac::items_fetch(rstac::post_request(request))
}
And we can use this to filter down how much data we’ll download! For
instance, we could download Landsat imagery for our area of interest
without using our new query function and setting
composite_function
to NULL
, so that we’ll get
one file per Landsat image:
unfiltered_landsat_images <- get_landsat_imagery(
aoi,
start_date = "2023-06-01",
end_date = "2023-09-01",
# Only downloading one asset, because we aren't using this data for anything
asset_names = landsat_band_mapping$planetary_computer_v1["red"],
mask_function = NULL,
mask_band = NULL,
output_filename = tempfile(fileext = ".tif"),
composite_function = NULL
)
length(unfiltered_landsat_images)
#> [1] 12
And we could compare that to a query using our custom query function, to see how many images are filtered out by our CQL2 query:
filtered_landsat_images <- get_landsat_imagery(
aoi,
start_date = "2023-06-01",
end_date = "2023-09-01",
query_function = custom_query_function,
asset_names = landsat_band_mapping$planetary_computer_v1["red"],
mask_function = NULL,
mask_band = NULL,
output_filename = tempfile(fileext = ".tif"),
composite_function = NULL
)
length(filtered_landsat_images)
#> [1] 7
We only wind up downloading about half the number of images!