Category Archives: programming

The program for uRos2018 is online

The uRos2018 conference is aimed at professionals and academics who are involved in producing or consuming official (government) statistics. We are happy to announce that we recently posted the full program of the 6th international conference on the use of … Continue reading

Posted in official statistics, programming, R

stringdist 0.9.5.1: now with C API

Version 0.9.5.1 of stringdist is on CRAN. The main new feature, with a huge thanks to our awesome new contributor Chris Muir, is that we made it easy to call stringdist functionality from your package's C or C++ code. The … Continue reading

Posted in data cleaning, programming, R, string metrics

The use of R in official statistics conference 2018

On September 12-14 the 6th international conference on the use of R in official statistics (#uRos2018) will take place at the Dutch National Statistical Office in Den Haag, the Netherlands. The conference is aimed at producers and users of official … Continue reading

Posted in official statistics, R

Track changes in data with the lumberjack %>>%

So you are using this pipeline to have data treated by different functions in R. For example, you may be imputing some missing values using the simputation package. Let us first load the only realistic dataset in R: > data(retailers, … Continue reading

Posted in data cleaning, data manipulation, programming, R

Announcing the simputation package: make imputation simple

I am happy to announce that my simputation package has appeared on CRAN this weekend. This package aims to simplify missing value imputation. In particular it offers standardized interfaces that make it easy to define both imputation method and imputation … Continue reading

Posted in data cleaning, data correction methods, imputation, programming, R | 5 Comments

stringdist 0.9.4.2 released

stringdist 0.9.4.2 was accepted on CRAN at the end of last week. This release only fixes a few bugs affecting the stringdistmatrix function when it is called with a single argument. From the NEWS file: bugfix in stringdistmatrix(a): value of p, for … Continue reading

Posted in programming, R, string metrics | 2 Comments

validate version 0.1.5 is out

A new version of the validate package for data validation was just accepted on CRAN and will be available on all mirrors in a few days. The most important addition is that you can now reference the data set as … Continue reading

Posted in data cleaning, data correction methods, data manipulation, programming, R, Uncategorized

Easy data validation with the validate package

The validate package is our attempt to make checking data against domain knowledge as easy as possible. Here is an example.

    library(magrittr)
    library(validate)
    iris %>% check_that(
      Sepal.Width > 0.5 * Sepal.Length
      , mean(Sepal.Width) > 0
      , if ( Sepal.Width > …

Continue reading

Posted in data cleaning, programming, R | 11 Comments

settings 0.2.3

An updated version of the settings package has been accepted on CRAN. The settings package provides an alternative way to manage options settings in R. It aims to allow layered options management, where global options are the defaults that can easily … Continue reading

Posted in programming, R

stringdist 0.9.4 and 0.9.3: distances between integer sequences

A new release of stringdist has been accepted on CRAN. stringdist offers a number of popular distance functions between sequences of integers or characters that are independent of character encoding. version 0.9.4 bugfix: edge case for zero-size for lower tridiagonal … Continue reading

Posted in programming, R, string metrics