Only queries GBIF for taxa not already in taxonomy_file.
Usage
get_taxonomy(
df,
taxa_col = "original_name",
taxonomy_file = tempfile(),
force_new = list(original_name = NULL, timediff = as.difftime(26, units = "weeks")),
remove_taxa = c("BOLD:", "dead", "unverified", "annual herb", "annual grass", "\\?"),
remove_strings = c("\\sx\\s.*", "\\sX\\s.*", "\\s\\-\\-\\s.*",
"\\s\\(.*\\)", "\\ssp\\.$", "\\sssp\\.$", "\\sspec\\.$"),
remove_dead = FALSE,
...
)
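The default force_new in the signature above pairs a NULL set of names with a 26-week difftime. As a rough illustration of how an age cut-off like this can be applied in base R, consider the sketch below. It is not envClean's own code: the `previous` data frame, its `stamp` column and the taxa names are hypothetical stand-ins for whatever a saved taxonomy_file records.

```r
# Hypothetical record of earlier searches; `stamp` is an assumed column name.
now <- Sys.time()
previous <- data.frame(
  original_name = c("Eucalyptus leucoxylon", "Acacia pycnantha"),
  stamp = now - as.difftime(c(2, 40), units = "weeks"),
  stringsAsFactors = FALSE
)

# Same cut-off as the default shown in Usage.
max_age <- as.difftime(26, units = "weeks")

# Names last searched longer ago than the cut-off are candidates to requery.
age <- difftime(now, previous$stamp, units = "weeks")
stale <- previous$original_name[age > max_age]
stale
```

Here only the name last searched 40 weeks ago exceeds the cut-off.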
Arguments
- df
Dataframe with taxa column.
- taxa_col
Character. Name of the column containing taxa names. Each unique taxon in this column will appear in the results in a column called original_name.
- taxonomy_file
Character. Path to save results to.
- force_new
List with elements named as per taxa_col and a difftime (named timediff in the default shown under Usage). If taxonomy_file already exists, any taxa_col matches between force_new and taxonomy_file will be requeried. Likewise, any original_name that has not been searched within the difftime will be requeried. Note that the taxa element should be named as provided in the taxa_col argument. Set either to NULL to ignore.
- remove_taxa
Character. Regular expressions to be matched. Any row whose taxa name matches is removed before searching.
- remove_strings
Character. Regular expressions to be matched. Any matching text is removed from the name before searching; the row itself remains.
- ...
Arguments passed to rgbif::name_backbone_checklist().
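The difference between remove_taxa and remove_strings, as described above, can be sketched with base R. This is an illustration only, using a subset of the default patterns and made-up names. The `searched_name` column is a hypothetical name chosen for the example, and envClean's internal implementation may differ.

```r
# Made-up input; only the column name follows the default taxa_col.
df <- data.frame(
  original_name = c(
    "Eucalyptus leucoxylon",
    "Eucalyptus sp.",
    "Acacia pycnantha (cultivated)",
    "BOLD:AAA1234",
    "dead shrub",
    "Olearia?"
  ),
  stringsAsFactors = FALSE
)

# A subset of the defaults shown in Usage.
remove_taxa <- c("BOLD:", "dead", "unverified", "\\?")
remove_strings <- c("\\s\\(.*\\)", "\\ssp\\.$")

# remove_taxa: drop whole rows whose name matches any pattern.
drop_row <- grepl(paste(remove_taxa, collapse = "|"), df$original_name)
kept <- df[!drop_row, , drop = FALSE]

# remove_strings: delete the matching text from the name, keeping the row.
kept$searched_name <- gsub(
  paste(remove_strings, collapse = "|"), "", kept$original_name
)

kept
```

Three rows survive the remove_taxa step, and two of them have trailing text stripped before they would be searched.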
Value
Dataframe. Results from envClean::get_gbif_tax(), tweaked so that column rank is lowercase and an ordered factor as per envClean::lurank. Writes taxonomy_file and gsub("\\.", "_accepted.", taxonomy_file).
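To see what the second path looks like, the gsub() call above can be run on example paths (the file names here are made up). Because the pattern matches every literal dot, a path containing more than one dot is changed at each of them.

```r
# A path with a single dot: the suffix is inserted before the extension.
taxonomy_file <- "taxonomy.rds"
accepted_file <- gsub("\\.", "_accepted.", taxonomy_file)
accepted_file
# "taxonomy_accepted.rds"

# A path with two dots: both are replaced.
two_dots <- gsub("\\.", "_accepted.", "out/taxonomy.v2.rds")
two_dots
# "out/taxonomy_accepted.v2_accepted.rds"
```

A taxonomy_file with exactly one dot, immediately before the extension, gives the most predictable second file name.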
Details
Common (vernacularName) is no longer supported here. Use get_gbif_common()
on a downstream result. It may be helpful to keep a usageKey through the
cleaning process for use in getting common names. Part of the reason for
removing that functionality here was the ambiguity over which key to use,
particularly around species vs subspecies.