Prepare data

library(impIndicator)
library(b3gbi) # General biodiversity indicators for data cubes
library(dplyr) # Data wrangling
library(knitr) # Nice tables

Import GBIF data using read.csv(), readr::read_csv(), or readxl::read_excel() based on the data set format.
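As a minimal sketch of the import step, the snippet below writes a tiny tab-delimited file to a temporary location and reads it back with base R. The file and its two rows are stand-ins for a real download, and the object name occurrences is illustrative. GBIF "simple" downloads are usually tab-delimited even when the file is named .csv; treat that as an assumption and check it against your own file.

```r
# Create a tiny tab-delimited stand-in for a GBIF download (illustration only)
gbif_file <- tempfile(fileext = ".csv")
writeLines(
  c(
    paste(c("decimalLatitude", "decimalLongitude", "species", "speciesKey",
            "coordinateUncertaintyInMeters", "year"), collapse = "\t"),
    paste(c("-33.47209", "26.25137", "Acacia mearnsii", "2979775", "25", "2024"),
          collapse = "\t"),
    paste(c("-34.66322", "19.80716", "Acacia cyclops", "2980425", "NA", "2024"),
          collapse = "\t")
  ),
  gbif_file
)

# read.delim() assumes tab-separated fields; quote = "" avoids problems with
# stray quotation marks in free-text fields
occurrences <- read.delim(gbif_file, quote = "", stringsAsFactors = FALSE)

# Inspect column names and types before building a cube
str(occurrences)
```

For a comma-separated file, read.csv() is the equivalent call.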

Here is an example of GBIF occurrence data with the minimum required columns: decimalLatitude, decimalLongitude, species, speciesKey, coordinateUncertaintyInMeters, and year.

decimalLatitude  decimalLongitude  species            speciesKey  coordinateUncertaintyInMeters  year
-33.47209        26.25137          Acacia mearnsii    2979775     25                             2024
-32.34151        19.02159          Acacia mearnsii    2979775     8                              2024
-34.56317        19.79653          Acacia longifolia  2978730     5                              2024
-34.66322        19.80716          Acacia cyclops     2980425     NA                             2024
-34.38089        19.22371          Acacia longifolia  2978730     15                             2024
-33.01186        18.36404          Acacia saligna     2978552     4                              2024
-33.68421        18.70806          Acacia saligna     2978552     15                             2024
-34.33013        18.99397          Acacia longifolia  2978730     4                              2024
-26.19055        28.11916          Acacia mearnsii    2979775     9                              2024
-34.42525        19.86027          Acacia cyclops     2980425     15                             2024
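Before building a cube, it can help to confirm that a data set actually contains the columns listed above. The following is a sketch only: check_required_columns() is a hypothetical helper written for this example, not a function from impIndicator, and taxa_example holds three rows copied from the table.

```r
# Minimum required columns, as listed in the text above
required_cols <- c("decimalLatitude", "decimalLongitude", "species",
                   "speciesKey", "coordinateUncertaintyInMeters", "year")

# Hypothetical helper (not part of impIndicator): stop with a clear message
# if any required column is absent, otherwise return the data invisibly
check_required_columns <- function(data, required = required_cols) {
  missing_cols <- setdiff(required, names(data))
  if (length(missing_cols) > 0) {
    stop("Missing required column(s): ", paste(missing_cols, collapse = ", "))
  }
  invisible(data)
}

# Three rows taken from the example table
taxa_example <- data.frame(
  decimalLatitude = c(-33.47209, -32.34151, -34.66322),
  decimalLongitude = c(26.25137, 19.02159, 19.80716),
  species = c("Acacia mearnsii", "Acacia mearnsii", "Acacia cyclops"),
  speciesKey = c(2979775, 2979775, 2980425),
  coordinateUncertaintyInMeters = c(25, 8, NA),
  year = c(2024, 2024, 2024)
)

check_required_columns(taxa_example)

# Count records without a coordinate uncertainty (NA in the table)
sum(is.na(taxa_example$coordinateUncertaintyInMeters))
```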

The study region must be supplied either as a shapefile of the study area (an sf object, as in the example below) or as a character string naming the country of the study area. An example is:

southAfrica_sf
#> Simple feature collection with 1 feature and 0 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: 16.48333 ymin: -34.822 xmax: 32.89043 ymax: -22.13639
#> Geodetic CRS: WGS 84
#> geometry
#> 1 MULTIPOLYGON (((31.2975 -22...
# Build the data cube from the occurrence data and the study region
acacia_cube <- taxa_cube(
  taxa = taxa_Acacia,
  region = southAfrica_sf,
  first_year = 2010,
  last_year = 2023
)
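When the study region is stored as a shapefile on disk, it can be read into an sf object before being passed as region. The sketch below assumes the sf package is installed and uses the North Carolina shapefile bundled with sf purely as a stand-in for a real study area. The steps that dissolve the polygons and reproject to WGS 84 are there to mirror the single-feature WGS 84 example region shown above; whether taxa_cube() requires them is not stated here.

```r
library(sf)

# Shapefile bundled with the sf package, used here only as a stand-in
shapefile_path <- system.file("shape/nc.shp", package = "sf")

# Read the shapefile into an sf object
region_raw <- st_read(shapefile_path, quiet = TRUE)

# Dissolve all polygons into one geometry, then reproject to WGS 84 (EPSG:4326)
region_geometry <- st_transform(st_union(region_raw), 4326)

# Wrap the geometry in an sf object with one feature and no attribute fields
region_sf <- st_sf(geometry = region_geometry)

region_sf
```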

The cube returned by taxa_cube() is a sim_cube object. Below is the cube for the Acacia taxa in South Africa:

# View processed cube
acacia_cube
#>
#> Simulated data cube for calculating biodiversity indicators
#>
#> Date Range: 2010 - 2023
#> Number of cells: 385
#> Grid reference system: custom
#> Coordinate range:
#> xmin xmax ymin ymax
#> 16.60833 31.35833 -34.69700 -22.94701
#>
#> Total number of observations: 5663
#> Number of species represented: 28
#> Number of families represented: Data not present
#>
#> Kingdoms represented: Data not present
#>
#> First 10 rows of data (use n = to show more):
#>
#> # A tibble: 5,663 × 8
#> scientificName taxonKey minCoordinateUncertaintyInMeters year cellCode xcoord ycoord obs
#> <chr> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl>
#> 1 Acacia mearnsii 2979775 8 2010 1376 30.4 -29.7 1
#> 2 Acacia saligna 2978552 1 2010 206 18.4 -33.9 1
#> 3 Acacia implexa 2979232 1 2010 206 18.4 -33.9 1
#> 4 Acacia pycnantha 2978604 1 2010 206 18.4 -33.9 1
#> 5 Acacia cyclops 2980425 122 2010 668 18.4 -32.2 1
#> 6 Acacia mearnsii 2979775 1100 2010 1110 29.9 -30.7 1
#> 7 Acacia mearnsii 2979775 1 2010 215 20.6 -33.9 1
#> 8 Acacia mearnsii 2979775 110 2010 215 20.6 -33.9 1
#> 9 Acacia pycnantha 2978604 1100 2010 143 19.1 -34.2 1
#> 10 Acacia saligna 2978552 1 2011 206 18.4 -33.9 1
#> # ℹ 5,653 more rows
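To illustrate how the rows of such a cube can be summarised, the sketch below rebuilds the ten rows printed above as a plain data frame and counts observations and occupied cells per species with dplyr. The name cube_rows is illustrative, and the example deliberately does not assume how the occurrence table is stored inside the sim_cube object.

```r
library(dplyr)

# The first ten rows of the printed cube, re-entered by hand for illustration
cube_rows <- data.frame(
  scientificName = c("Acacia mearnsii", "Acacia saligna", "Acacia implexa",
                     "Acacia pycnantha", "Acacia cyclops", "Acacia mearnsii",
                     "Acacia mearnsii", "Acacia mearnsii", "Acacia pycnantha",
                     "Acacia saligna"),
  year = c(rep(2010, 9), 2011),
  cellCode = c("1376", "206", "206", "206", "668", "1110", "215", "215",
               "143", "206"),
  obs = rep(1, 10)
)

# Total observations and number of distinct grid cells per species
species_summary <- cube_rows %>%
  group_by(scientificName) %>%
  summarise(
    total_obs = sum(obs),
    n_cells = n_distinct(cellCode),
    .groups = "drop"
  ) %>%
  arrange(desc(total_obs))

species_summary
```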

The cube can be generated from GBIF occurrence data downloaded with rgbif::occ_data().
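A download along these lines might look like the sketch below. download_gbif_occurrences() is a hypothetical wrapper written for this example, not a function from rgbif or impIndicator. It assumes the rgbif package is installed and an internet connection is available, and that a single-query occ_data() result exposes the records in its data element. The country code "ZA" and the limit of 500 are illustrative choices.

```r
# Hypothetical helper (not part of rgbif or impIndicator)
download_gbif_occurrences <- function(scientific_name, country_code, limit = 500) {
  if (!requireNamespace("rgbif", quietly = TRUE)) {
    stop("Package 'rgbif' is required for this example.")
  }

  # Query GBIF for georeferenced records of one taxon in one country
  result <- rgbif::occ_data(
    scientificName = scientific_name,
    country = country_code,
    hasCoordinate = TRUE,
    limit = limit
  )

  occurrences <- result$data
  if (is.null(occurrences)) {
    return(data.frame()) # no records were returned
  }

  # Keep only the minimum required columns that are present in the result
  wanted <- c("decimalLatitude", "decimalLongitude", "species", "speciesKey",
              "coordinateUncertaintyInMeters", "year")
  occurrences[, intersect(wanted, names(occurrences)), drop = FALSE]
}

# Usage (needs internet access, so it is left commented out):
# taxa_gbif <- download_gbif_occurrences("Acacia mearnsii", "ZA")
```

The resulting table could then be supplied as the taxa argument of taxa_cube(), in the same way as taxa_Acacia.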

impIndicator also works with cubes that use a standard grid reference system, such as eea and eqdgc, once they have been processed with b3gbi::process_cube(). An example is the Denmark mammal cube bundled with the b3gbi package.

# Load GBIF data cube
cube_name <- system.file("extdata", "denmark_mammals_cube_eqdgc.csv",
                         package = "b3gbi")
# Prepare cube
mammal_cube <- b3gbi::process_cube(cube_name, first_year = 2000)
# View cube
mammal_cube
#>
#> Processed data cube for calculating biodiversity indicators
#>
#> Date Range: 2000 - 2024
#> Single-resolution cube with cell size 0.25degrees
#> Number of cells: 265
#> Grid reference system: eqdgc
#> Coordinate range:
#> xmin xmax ymin ymax
#> 3.25 15.25 54.25 58.25
#>
#> Total number of observations: 191676
#> Number of species represented: 97
#> Number of families represented: 31
#>
#> Kingdoms represented: Animalia
#>
#> First 10 rows of data (use n = to show more):
#>
#> # A tibble: 28,155 × 15
#> year cellCode kingdomKey kingdom familyKey family taxonKey scientificName obs minCoordinateUncertainty…¹
#> <dbl> <chr> <dbl> <chr> <dbl> <chr> <dbl> <chr> <dbl> <dbl>
#> 1 2000 E006N56DB 1 Animalia 9680 Odobenidae 5218819 Odobenus rosmarus 1 111
#> 2 2000 E008N54BA 1 Animalia 5310 Phocidae 2434793 Phoca vitulina 3 1000
#> 3 2000 E008N55AA 1 Animalia 5310 Phocidae 2434793 Phoca vitulina 1 1000
#> 4 2000 E008N55AB 1 Animalia 5307 Mustelidae 2433753 Lutra lutra 1 1000
#> 5 2000 E008N55AB 1 Animalia 5510 Muridae 7429082 Mus musculus 5 1000
#> 6 2000 E008N55AC 1 Animalia 5307 Mustelidae 2433753 Lutra lutra 1 1000
#> 7 2000 E008N55AC 1 Animalia 5310 Phocidae 2434793 Phoca vitulina 3 1000
#> 8 2000 E008N55AC 1 Animalia 5310 Phocidae 2434806 Halichoerus grypus 2 1000
#> 9 2000 E008N55AC 1 Animalia 9701 Canidae 5219243 Vulpes vulpes 1 980
#> 10 2000 E008N55AC 1 Animalia 9379 Leporidae 7952072 Lepus europaeus 1 1000
#> # ℹ 28,145 more rows
#> # ℹ abbreviated name: ¹​minCoordinateUncertaintyInMeters
#> # ℹ 5 more variables: minTemporalUncertainty <dbl>, familyCount <dbl>, xcoord <dbl>, ycoord <dbl>, resolution <chr>