This function cleans raw Vessel Monitoring System (VMS) data files:
it removes rows with NULL values in the coordinates, parses dates, and returns a data.frame.
Arguments
- path_to_data
It can be a path to the downloaded file or the data object itself. If the function is called with a path, it adds a
file column to the returned data.frame that stores the name of the file as a reference.
Details
It takes a raw data file downloaded with the vms_download() function, either by
specifying its path directly or by referencing a data.frame already stored as an R object.
If a path is used, a column with the name of the raw file is added for future reference.
It also splits the date into three new columns, year, month, and day, and retains the original date column.
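To make the two cleaning steps concrete, here is a minimal base R sketch of the idea, not the package's actual implementation. The toy data frame and its coordinate values are made up for illustration; only the column names (date, latitude, longitude, year, month, day) follow the example output on this page.

```r
# Toy input: three rows, two of them with a missing coordinate (illustrative values).
raw <- data.frame(
  date = as.POSIXct(
    c("2019-01-10 03:59:00", "2019-01-08 12:06:00", "2019-01-04 05:41:00"),
    tz = "UTC"
  ),
  latitude  = c(23.2, NA, 22.3),
  longitude = c(-106.4, -97.8, NA)
)

# Keep only rows where both coordinates are present.
kept <- raw[!is.na(raw$latitude) & !is.na(raw$longitude), , drop = FALSE]

# Derive year, month, and day from the parsed date; the date column is retained.
kept$year  <- as.integer(format(kept$date, "%Y"))
kept$month <- as.integer(format(kept$date, "%m"))
kept$day   <- as.integer(format(kept$date, "%d"))

kept
```

Only the first row survives here, since the other two each lack a coordinate.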
This function can be used with apply-family functions over a list
of files, or it can be parallelized using furrr functions.
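A minimal sketch of the list-of-files pattern follows. The helper clean_many() and the folder name "raw_vms" are hypothetical and not part of the package; the helper takes the cleaning function as an argument so that it runs on its own. The commented lines show how it might be called with vms_clean, and a possible parallel variant that assumes the furrr and future packages are installed.

```r
# Hypothetical helper: apply a cleaning function to every path and stack the results.
clean_many <- function(paths, cleaner) {
  cleaned <- lapply(paths, cleaner)  # one data.frame per file
  do.call(rbind, cleaned)            # assumes every result has the same columns
}

# With the package attached, a call might look like this
# ("raw_vms" is an assumed folder of downloaded raw files):
# files   <- list.files("raw_vms", full.names = TRUE)
# all_vms <- clean_many(files, vms_clean)

# Possible parallel variant, assuming furrr and future are available:
# future::plan(future::multisession)
# all_vms <- furrr::future_map_dfr(files, vms_clean)
```

Because each file is passed as a path, every cleaned chunk should carry the file column described above, which keeps the origin of each row traceable after stacking.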
Examples
# Use the sample dataset, or any data.frame already stored as an object.
# A file path can also be passed directly as the argument.
data("sample_dataset")
cleaned_vms <- vms_clean(sample_dataset)
#> Cleaned: 969 empty rows from data!
head(cleaned_vms)
#> # A tibble: 6 × 13
#>      id  year month   day date                vessel_name  RNP   port_base owner
#>   <int> <dbl> <dbl> <int> <dttm>              <chr>        <chr> <chr>     <chr>
#> 1     1  2019     1    10 2019-01-10 03:59:00 EUROPESCA IV 54809 MAZATLAN… EURO…
#> 2     2  2019     1    10 2019-01-10 18:04:00 TABASCO 454  27041 FRONTERA  ARMA…
#> 3     3  2019     1     1 2019-01-01 19:37:00 AVENTURERO   63677 TAMPICO   EDGA…
#> 4     4  2019     1     8 2019-01-08 12:06:00 DON ANTONIO… 17392 MAZATLAN… PESQ…
#> 5     5  2019     1    10 2019-01-10 06:08:00 PESCAMEX 14  45526 PROGRESO  PESC…
#> 6     6  2019     1     4 2019-01-04 05:41:00 DON QUINTIN… 30619 TAMPICO   LUIS…
#> # ℹ 4 more variables: latitude <dbl>, longitude <dbl>, speed <dbl>,
#> #   direction <dbl>
