
Prepare data

library(impIndicator)
library(b3gbi) # General biodiversity indicators for data cubes
library(dplyr) # Data wrangling
library(knitr) # Nice tables

Process GBIF data from the R environment

Import GBIF data using read.csv(), readr::read_csv(), or readxl::read_excel(), depending on the file format.

Here is an example of GBIF occurrence data with the minimum required columns: decimalLatitude, decimalLongitude, species, speciesKey, coordinateUncertaintyInMeters, and year.

decimalLatitude  decimalLongitude  species            speciesKey  coordinateUncertaintyInMeters  year
      -33.47209          26.25137  Acacia mearnsii       2979775                             25  2024
      -32.34151          19.02159  Acacia mearnsii       2979775                              8  2024
      -34.56317          19.79653  Acacia longifolia     2978730                              5  2024
      -34.66322          19.80716  Acacia cyclops        2980425                             NA  2024
      -34.38089          19.22371  Acacia longifolia     2978730                             15  2024
      -33.01186          18.36404  Acacia saligna        2978552                              4  2024
      -33.68421          18.70806  Acacia saligna        2978552                             15  2024
      -34.33013          18.99397  Acacia longifolia     2978730                              4  2024
      -26.19055          28.11916  Acacia mearnsii       2979775                              9  2024
      -34.42525          19.86027  Acacia cyclops        2980425                             15  2024
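
As a minimal sketch of the import step, the snippet below reads a CSV with read.csv() and checks that the minimum required columns are present. The helper missing_columns() and the temporary file are illustrative assumptions, not part of impIndicator; the two rows are copied from the table above and stand in for a real GBIF download.

```r
# Minimum required columns, as listed in the text above
required_cols <- c(
  "decimalLatitude", "decimalLongitude", "species",
  "speciesKey", "coordinateUncertaintyInMeters", "year"
)

# Hypothetical helper (not part of impIndicator):
# returns the required columns that are absent from `data`
missing_columns <- function(data, required = required_cols) {
  setdiff(required, names(data))
}

# Stand-in for a downloaded GBIF file: two rows from the table above
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  paste(required_cols, collapse = ","),
  "-33.47209,26.25137,Acacia mearnsii,2979775,25,2024",
  "-34.66322,19.80716,Acacia cyclops,2980425,NA,2024"
), csv_path)

# Import the occurrences and stop early if a required column is missing
occurrences <- read.csv(csv_path)
missing <- missing_columns(occurrences)
if (length(missing) > 0) {
  stop("Missing required columns: ", paste(missing, collapse = ", "))
}
```

Checking the columns before building a cube gives a clearer error than a failure deep inside later processing. Note that a missing coordinateUncertaintyInMeters value is read as NA, as in the fourth row of the table.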

The study region must be supplied either as a shapefile of the study area (loaded as a simple feature collection, as shown below) or as a character string naming the country. An example is:

southAfrica_sf
#> Simple feature collection with 1 feature and 0 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: 16.48333 ymin: -34.822 xmax: 32.89043 ymax: -22.13639
#> Geodetic CRS: WGS 84
#> geometry
#> 1 MULTIPOLYGON (((31.2975 -22...

# Build the data cube from the occurrence data for 2010 to 2023
acacia_cube <- taxa_cube(
  taxa = taxa_Acacia,
  region = southAfrica_sf,
  first_year = 2010,
  last_year = 2023
)

The resulting cube is a sim_cube object. Below is the cube for Acacia taxa in South Africa:

# View processed cube
acacia_cube
#>
#> Simulated data cube for calculating biodiversity indicators
#>
#> Date Range: 2010 - 2023
#> Number of cells: 385
#> Grid reference system: custom
#> Coordinate range:
#> xmin xmax ymin ymax
#> 16.60833 31.35833 -34.69700 -22.94701
#>
#> Total number of observations: 5663
#> Number of species represented: 28
#> Number of families represented: Data not present
#>
#> Kingdoms represented: Data not present
#>
#> First 10 rows of data (use n = to show more):

Download from GBIF website

The cube can also be generated from GBIF occurrences downloaded with rgbif::occ_data().
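
The download step might look like the sketch below. The wrapper fetch_gbif_occurrences(), its defaults, and the example species and country code are assumptions made for illustration; only rgbif::occ_data() itself comes from the text. The call needs the rgbif and dplyr packages and a network connection, so the usage line is left commented out.

```r
# Hypothetical wrapper (not part of impIndicator or rgbif):
# download georeferenced GBIF occurrences for one species and
# keep only the minimum columns described earlier
fetch_gbif_occurrences <- function(scientific_name,
                                   country = "ZA",      # ISO 3166-1 alpha-2 code
                                   years = "2010,2023", # GBIF range syntax: "from,to"
                                   limit = 500) {
  res <- rgbif::occ_data(
    scientificName = scientific_name,
    country = country,
    hasCoordinate = TRUE, # drop records without coordinates
    year = years,
    limit = limit
  )

  # occ_data() returns a list with `meta` and `data`; `data` is NULL when nothing matches
  if (is.null(res$data)) {
    return(NULL)
  }

  # any_of() tolerates columns that GBIF did not return for this query
  dplyr::select(
    res$data,
    dplyr::any_of(c(
      "decimalLatitude", "decimalLongitude", "species",
      "speciesKey", "coordinateUncertaintyInMeters", "year"
    ))
  )
}

# Usage (requires internet access):
# acacia_occ <- fetch_gbif_occurrences("Acacia mearnsii")
```

The returned data frame has the same shape as the example table in the previous section, so it could presumably be processed in the same way; check the taxa_cube() documentation before relying on that.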

Cube with standard grid

impIndicator also works with cubes on a standard grid, such as eea and eqdgc, that have been processed with b3gbi::process_cube(). An example is the mammal_cube built from data in the b3gbi package.

# Load GBIF data cube
cube_name <- system.file("extdata", "denmark_mammals_cube_eqdgc.csv",
                         package = "b3gbi")
# Prepare cube
mammal_cube <- b3gbi::process_cube(cube_name, first_year = 2000)
# View cube
mammal_cube
#>
#> Processed data cube for calculating biodiversity indicators
#>
#> Date Range: 2000 - 2024
#> Single-resolution cube with cell size 0.25degrees
#> Number of cells: 265
#> Grid reference system: eqdgc
#> Coordinate range:
#> xmin xmax ymin ymax
#> 3.375 15.125 54.375 58.125
#>
#> Total number of observations: 191676
#> Number of species represented: 97
#> Number of families represented: 31
#>
#> Kingdoms represented: Animalia
#>
#> First 10 rows of data (use n = to show more):