Use a data map to select, rename, adjust and align columns

Useful for preparing data from several different sources into a common structure that can be read collectively via arrow::open_dataset().

remap_data_names(
  this_name,
  df_to_remap,
  data_map = NULL,
  out_file = NULL,
  exclude_cols = c("order", "epsg", "desc", "data_name_use", "url"),
  add_month = !is.null(data_map),
  add_year = !is.null(data_map),
  add_occ = !is.null(data_map),
  occ_cols = c("occ_derivation", "quantity"),
  absences = c("0", "none detected", "none observed", "None detected", "ABSENT"),
  previous = c("delete", "move"),
  compare_previous = TRUE,
  compare_cols = c("data_name", "survey"),
  ...
)

Arguments

this_name: Character. Name of the data source.
df_to_remap: Dataframe containing the columns to select and (potentially) rename
data_map: Dataframe or NULL. Mapping of fields to retrieve. See the example envImport::data_map.
out_file: Character. Name of file to save. If NULL, this will be here::here("ds", this_name, "this_name.parquet")
add_month, add_year: Logical. Add a year and/or month column to the returned data frame (requires a date field to be specified by data_map).
add_occ: Logical. Make an occ (occurrence) column where 1 = detected and 0 = not detected? Because original data sets record numbers and absences in many different ways, this should not be considered 100% reliable.
absences: Character. If add_occ, which values are considered absences?
previous: Character. What to do with any previous out_file. Default is 'delete'. The alternative, 'move', renames the previous file in the same location to gsub("\\.parquet", paste0("moved__", format(now(), "%Y%m%d_%H%M%S"), ".parquet"), out_file).
compare_previous: Logical. If TRUE, a comparison of records per compare_cols will be made between the new and previous out_file. Ignored unless previous == "move".
compare_cols: Character. If compare_previous, which columns to compare. Default is c("data_name", "survey").
...: Not used
exclude_names: Character. Column names in namesmap to exclude from the combined data.
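
The derived columns described by add_year, add_month and add_occ can be pictured with the base R sketch below. It is only an illustration under stated assumptions: the data frame df and its date and quantity columns are hypothetical, and the rule used for occ (any quantity value listed in absences becomes 0, everything else becomes 1) is a guess at the general idea, not the code remap_data_names() actually runs.

```r
# Hypothetical, already remapped data with a date and a quantity column
df <- data.frame(
  date = as.Date(c("2020-01-15", "2021-06-30", "2021-11-02")),
  quantity = c("3", "none detected", "0"),
  stringsAsFactors = FALSE
)

# Values treated as absences (copied from the documented default)
absences <- c("0", "none detected", "none observed", "None detected", "ABSENT")

# Year and month derived from the date column
df$year <- as.integer(format(df$date, "%Y"))
df$month <- as.integer(format(df$date, "%m"))

# Assumed rule: 0 when quantity matches an absence value, otherwise 1.
# Note that a missing quantity would also become 1 under this rule.
df$occ <- ifelse(df$quantity %in% absences, 0L, 1L)

print(df)
```

In this sketch the match against absences is exact and case sensitive, which is consistent with the default listing both "none detected" and "None detected". Any other spelling in the source data would be scored as a detection, one reason the occ column should be checked before use.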

Value

Tibble with selected, renamed, adjusted and aligned columns
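
As a rough illustration of the selecting and renaming step, the base R sketch below assumes a data map with one row per data source and one column per output field, where each cell holds that source's original column name. That layout, the helper remap_sketch() and all column names are assumptions made for illustration. This is not the code remap_data_names() runs, and envImport::data_map should be consulted for the real structure.

```r
# Hypothetical data map: one row per source, one column per output field
data_map <- data.frame(
  data_name = c("source_a", "source_b"),
  order = c(1, 2),
  date = c("visit_date", "obs_date"),
  taxa = c("species", "sci_name"),
  quantity = c("count", NA),
  stringsAsFactors = FALSE
)

# Hypothetical raw data from "source_a"
df_to_remap <- data.frame(
  visit_date = as.Date(c("2020-01-15", "2021-06-30")),
  species = c("Acanthiza pusilla", "Acanthiza lineata"),
  count = c("3", "0"),
  observer = c("x", "y"),
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of any package)
remap_sketch <- function(this_name, df, map, exclude_cols = c("order")) {
  # Keep the row for this source and drop the non-field columns
  row <- map[map$data_name == this_name, , drop = FALSE]
  row <- row[, setdiff(names(row), c("data_name", exclude_cols)), drop = FALSE]

  # Named character vector: names are new columns, values are old columns
  lookup <- unlist(row[1, ])
  lookup <- lookup[!is.na(lookup)]
  lookup <- lookup[lookup %in% names(df)]

  # Select the mapped columns, then rename them
  out <- df[, unname(lookup), drop = FALSE]
  names(out) <- names(lookup)
  out
}

remapped <- remap_sketch("source_a", df_to_remap, data_map)
print(names(remapped))
```

Columns that are not mentioned in the map (observer here) are dropped, so every source processed this way ends up with the same field names, which is what allows the saved files to be read together.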

Details

Includes code from the Stack Exchange network post by Dan.
