This function cleans the columns of raw Vessel Monitoring System (VMS) data files, removes rows with NULL coordinates, parses dates, and returns a data.frame.
Arguments
- path_to_data
A path to the downloaded file, or the data object itself. If the function is called with a path, it adds a file column to the returned data.frame that stores the name of the file as a reference.
Details
It takes a raw data file downloaded using the vms_download() function, specified either directly by its path or as a data.frame already stored as an R object.
If a path is used, a column with the name of the raw file is added for future reference.
It also splits date into three new columns, year, month, and day, and retains the original date column.
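The cleaning steps described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy data, the date format, and the use of NA for missing coordinates are assumptions made for illustration, and only the column names (date, latitude, longitude, year, month, day) come from the sample output.

```r
# Toy raw data (made up); column names follow the sample output.
raw <- data.frame(
  id        = 1:4,
  date      = c("2019-01-10 03:59:00", "2019-01-10 18:04:00",
                "2019-01-01 19:37:00", "2019-01-08 12:06:00"),
  latitude  = c(23.2, NA, 22.3, 23.1),
  longitude = c(-106.4, -92.6, NA, -106.5),
  stringsAsFactors = FALSE
)

# 1. Drop rows whose coordinates are missing.
keep <- !is.na(raw$latitude) & !is.na(raw$longitude)
cleaned <- raw[keep, ]
message("Dropped ", sum(!keep), " rows with missing coordinates")

# 2. Parse the date column into a date-time (format is an assumption).
cleaned$date <- as.POSIXct(cleaned$date,
                           format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# 3. Split the date into year, month, and day, keeping the original column.
cleaned$year  <- as.integer(format(cleaned$date, "%Y"))
cleaned$month <- as.integer(format(cleaned$date, "%m"))
cleaned$day   <- as.integer(format(cleaned$date, "%d"))
```

In this toy example two of the four rows are dropped, and the remaining rows gain year, month, and day columns alongside the parsed date.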
This function can be used with apply functions over a list of files, or it can be parallelized using furrr functions.
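As a rough sketch of the apply pattern, the snippet below runs a cleaning function over several files and stacks the results. The helper clean_many, the stand-in toy_cleaner, the CSV format, and the file names are all hypothetical and exist only so the example runs without the package; in real use the cleaner argument would be vms_clean.

```r
# Hypothetical helper: apply a cleaning function to several raw files and
# stack the results into one data.frame.
clean_many <- function(paths, cleaner) {
  cleaned <- lapply(paths, cleaner)
  do.call(rbind, cleaned)
}

# Stand-in cleaner (NOT the package function): reads a CSV, drops rows with
# missing coordinates, and records the source file name.
toy_cleaner <- function(path) {
  x <- read.csv(path)
  x <- x[!is.na(x$latitude) & !is.na(x$longitude), , drop = FALSE]
  x$file <- basename(path)
  x
}

# Two small made-up files written to a temporary directory.
raw_dir <- tempfile("vms_raw_")
dir.create(raw_dir)
write.csv(data.frame(id = 1:2, latitude = c(23.2, NA),
                     longitude = c(-106.4, -92.6)),
          file.path(raw_dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:4, latitude = c(22.3, 21.0),
                     longitude = c(-97.8, -86.7)),
          file.path(raw_dir, "b.csv"), row.names = FALSE)

# Clean every file in the directory and combine the results.
paths <- list.files(raw_dir, pattern = "\\.csv$", full.names = TRUE)
all_clean <- clean_many(paths, toy_cleaner)
```

With the package loaded, the same shape would be clean_many(paths, vms_clean). For a parallel run, one option is to call future::plan(future::multisession) and then replace lapply with furrr::future_map, which takes the same list and function arguments.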
Examples
# Use the sample dataset, or any data.frame already stored as an object.
# A file path can also be passed directly as the argument.
data("sample_dataset")
cleaned_vms <- vms_clean(sample_dataset)
#> Cleaned: 969 empty rows from data!
head(cleaned_vms)
#> # A tibble: 6 × 13
#>      id  year month   day date                vessel_name  RNP   port_base owner
#>   <int> <dbl> <dbl> <int> <dttm>              <chr>        <chr> <chr>     <chr>
#> 1     1  2019     1    10 2019-01-10 03:59:00 EUROPESCA IV 54809 MAZATLAN… EURO…
#> 2     2  2019     1    10 2019-01-10 18:04:00 TABASCO 454  27041 FRONTERA  ARMA…
#> 3     3  2019     1     1 2019-01-01 19:37:00 AVENTURERO   63677 TAMPICO   EDGA…
#> 4     4  2019     1     8 2019-01-08 12:06:00 DON ANTONIO… 17392 MAZATLAN… PESQ…
#> 5     5  2019     1    10 2019-01-10 06:08:00 PESCAMEX 14  45526 PROGRESO  PESC…
#> 6     6  2019     1     4 2019-01-04 05:41:00 DON QUINTIN… 30619 TAMPICO   LUIS…
#> # ℹ 4 more variables: latitude <dbl>, longitude <dbl>, speed <dbl>,
#> #   direction <dbl>