validate version 0.1.5 is out

A new version of the validate package for data validation was just accepted on CRAN and will be available on all mirrors in a few days.

The most important addition is that you can now reference the data set as a whole, using the "dot" syntax like so:

library(validate)
library(magrittr)  # provides the %>% pipe

iris %>% check_that(
    nrow(.) > 100
  , "Sepal.Width" %in% names(.)) %>%
summary()

  rule items passes fails nNA error warning                  expression
1   V1     1      1     0   0 FALSE   FALSE               nrow(.) > 100
2   V2     1      1     0   0 FALSE   FALSE "Sepal.Width" %in% names(.)

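Also, you can now control what a rule returns when it evaluates to NA by passing the na.value option. For example, setting it to FALSE counts a missing result as a failure, so the output is always TRUE or FALSE.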

dat <- data.frame(x = c(1, NA, -1))
v <- validator(x > 0)
values(confront(dat, v))
        V1
[1,]  TRUE
[2,]    NA
[3,] FALSE
values(confront(dat,v,na.value=FALSE))
        V1
[1,]  TRUE
[2,] FALSE
[3,] FALSE
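
If you want the same behaviour for every confrontation in a session, the option can probably be set once instead of per call. The following is a minimal sketch, assuming that voptions() accepts na.value as a global setting in the same way confront() accepts it as an argument; the variable name res is only for illustration.

```r
library(validate)

# Assumption: na.value can be set globally through voptions().
voptions(na.value = FALSE)

dat <- data.frame(x = c(1, NA, -1))
v <- validator(x > 0)

# No na.value argument here: the global setting should apply.
res <- values(confront(dat, v))
print(res)

# Put the documented default (NA) back so later calls are unaffected.
voptions(na.value = NA)
```

Restoring the default at the end keeps the global setting from silently changing the results of unrelated checks later in the session.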

A complete list of changes and bugfixes can be found in the NEWS file. Below I also include the changes in 0.1.4, since I did not write about that release before.

I will be talking about this package at the upcoming useR!2016 event, so join me if you're interested!

version 0.1.5

  • The '.' is now used to reference the validated data set as a whole.
  • Small change in output of 'compare' to match the table in van den Broek et al. (2013)

version 0.1.4

  • 'confront' now emits a warning when a variable name conflicts with the name of a reference data set
  • Deprecated 'validate_reset', in favour of the shorter 'reset' (use 'validate::reset' in case of ambiguity)
  • Deprecated 'validate_options' in favour of the shorter 'voptions'
  • New option na.value with default value NA, controlling the output when a rule evaluates to NA.
  • Added rules from the ESSnet on validation (deliverable 17) to automated tests.
  • Added 'grepl' to allowed validation syntax (suggested by Dusan Sovic)
  • Exported a few functions with keyword 'internal' for extensibility
  • Bugfix: blocks sometimes reported the wrong number of blocks (in case of a single connected block).
  • Bugfix: macro expansion failed when macros were reused in other macros.
  • Bugfix: certain nonlinear relations were recognized as linear
  • Bugfix: rules that use (anonymous) function definitions raised an error when printed.
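
To illustrate the 'grepl' entry in the 0.1.4 list, here is a small sketch of a pattern-based rule. The data frame and its code column are made up for this example; only the use of grepl inside validator() comes from the change list.

```r
library(validate)

# Hypothetical data: identifiers that should be one capital letter
# followed by one or more digits.
dat <- data.frame(code = c("A12", "B7", "12C"), stringsAsFactors = FALSE)

# grepl is allowed in the validation syntax, so a regular
# expression can serve directly as a rule.
v <- validator(grepl("^[A-Z][0-9]+$", code))

res <- values(confront(dat, v))
print(res)
```

The third record starts with a digit, so it is the only one that fails the rule.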