Tag Archives: data correction methods
gower 0.2.0 is on CRAN
A new version of the R package gower has just been released on CRAN. Thanks to our new contributor David Turner, who was kind enough to provide a pull request, gower now also computes weighted Gower distances. From the NEWS file: … Continue reading
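To give an idea of what a weighted Gower distance is, here is a minimal sketch in Python. It is not the package's code: the function name `gower_distance` and its arguments are made up for illustration. It assumes the usual definition, in which numeric variables contribute a range-scaled absolute difference, categorical variables contribute 0 or 1, and the result is a weighted average over the variables observed in both records.

```python
def gower_distance(x, y, ranges, weights=None):
    """Weighted Gower distance between two records (illustrative sketch).

    x, y    : sequences of equal length; None marks a missing value.
    ranges  : per variable, the numeric range (max - min) used for scaling,
              or None if the variable is categorical.
    weights : per-variable non-negative weights; defaults to equal weights.
    """
    if weights is None:
        weights = [1.0] * len(x)
    numerator = 0.0
    denominator = 0.0
    for xi, yi, rng, w in zip(x, y, ranges, weights):
        if xi is None or yi is None:
            continue  # variables missing in either record are skipped
        if rng is None:
            d = 0.0 if xi == yi else 1.0      # categorical: match or mismatch
        else:
            d = abs(xi - yi) / rng if rng > 0 else 0.0  # numeric: scaled difference
        numerator += w * d
        denominator += w
    return numerator / denominator if denominator > 0 else float("nan")


# One numeric variable (range 4) and one categorical variable.
a = [1.0, "a"]
b = [3.0, "b"]
unweighted = gower_distance(a, b, ranges=[4.0, None])
weighted = gower_distance(a, b, ranges=[4.0, None], weights=[1.0, 3.0])
```

Giving the categorical variable three times the weight pulls the distance towards its mismatch score of 1.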
Announcing the simputation package: make imputation simple
I am happy to announce that my simputation package has appeared on CRAN this weekend. This package aims to simplify missing value imputation. In particular it offers standardized interfaces that make it easy to define both imputation method and imputation … Continue reading
validate version 0.1.5 is out
A new version of the validate package for data validation was just accepted on CRAN and will be available on all mirrors in a few days. The most important addition is that you can now reference the data set as … Continue reading
stringdist 0.8: now with soundex
An update to the stringdist package was released earlier this month. Thanks to a contribution by Jan van der Laan, the package now includes a method to compute soundex codes as defined here. Briefly, soundex encoding aims to translate words … Continue reading
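For readers who have not seen soundex before, the following Python sketch implements the common American Soundex variant. It is an assumption that this matches the definition the post links to, and it is not the stringdist implementation; the function name `soundex` is chosen for illustration only.

```python
# Consonant groups and their Soundex digits; vowels, y, h and w have no code.
_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word):
    """Return the 4-character American Soundex code of an ASCII word."""
    letters = [c for c in word.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()          # the first letter is kept as-is
    prev = _CODES.get(letters[0], "")    # its code still suppresses repeats
    for ch in letters[1:]:
        if ch in "hw":
            continue                     # h and w do not separate equal codes
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code                      # a vowel resets prev to ""
        if len(result) == 4:
            break
    return result.ljust(4, "0")          # pad short codes with zeros


codes = [soundex(w) for w in ("Robert", "Rupert", "Ashcraft", "Tymczak")]
```

Words that sound alike, such as Robert and Rupert, end up with the same code.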
Approximate string matching in R
I have released a new version of the stringdist package. Besides some new string distance algorithms, it now contains two convenient matching functions: amatch: Equivalent to R's match function but allowing for approximate matching. ain: Similar to R's %in% … Continue reading
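The idea behind approximate matching can be sketched in a few lines of Python. This is not the stringdist API: `approx_match` and `approx_in` are hypothetical names, the distance is plain Levenshtein (an assumption, the package offers several distances), and positions are 0-based with `None` for "no match" where R's match would return a 1-based index or NA.

```python
def levenshtein(a, b):
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def approx_match(x, table, max_dist=1):
    """For each string in x, the position of the closest entry in table
    within max_dist, or None if no entry is close enough."""
    out = []
    for s in x:
        best_idx, best_d = None, max_dist + 1
        for idx, t in enumerate(table):
            d = levenshtein(s, t)
            if d < best_d:               # strict '<' keeps the first best match
                best_idx, best_d = idx, d
        out.append(best_idx)
    return out


def approx_in(x, table, max_dist=1):
    """True for each string in x that has an approximate match in table."""
    return [i is not None for i in approx_match(x, table, max_dist)]


table = ["leela", "uhura", "picard"]
loose = approx_match(["leia", "uhura"], table, max_dist=2)
strict = approx_match(["leia", "uhura"], table, max_dist=1)
found = approx_in(["leia", "uhura"], table, max_dist=1)
```

With a maximum distance of 2, "leia" matches "leela"; with a maximum distance of 1 it no longer does.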
Deductive imputation with the deducorrect package
Missing data hinders statistical analyses. Estimating missing values (imputation) prior to analysis is one way to deal with that. In some cases, however, the missing values need not be estimated at all, since they can be derived with certainty from other … Continue reading
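A small Python sketch may help to show what "derived with certainty" can mean. It assumes, purely for illustration, a single balance rule stating that a set of parts adds up to a total; the function `impute_from_sum` and the variable names are hypothetical and are not taken from deducorrect. When exactly one value in the rule is missing, it follows from the others without any estimation.

```python
def impute_from_sum(record, parts, total):
    """Fill in a single missing value using the rule sum(parts) == total.

    record : dict of variable name -> number, with None for missing values.
    parts  : names of the variables that add up to the total.
    total  : name of the total variable.
    Returns a new dict; the record is returned unchanged (as a copy) unless
    exactly one variable involved in the rule is missing.
    """
    involved = list(parts) + [total]
    missing = [k for k in involved if record[k] is None]
    out = dict(record)
    if len(missing) != 1:
        return out                       # nothing can be derived with certainty
    m = missing[0]
    if m == total:
        out[m] = sum(record[k] for k in parts)
    else:
        out[m] = record[total] - sum(record[k] for k in parts if k != m)
    return out


rec = {"staff": 30, "other": None, "total": 100}
fixed = impute_from_sum(rec, parts=["staff", "other"], total="total")
```

If two or more values in the rule are missing, the sketch leaves the record alone, since the rule no longer pins down a unique solution.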
What do your rules look like? editrules 1.8-x answers with the help of igraph
We (Edwin de Jonge and I) have recently updated our editrules package. The most important new features include (beta) support for categorical data. However, in this post I'm going to show some visualizations we included, made possible by Gabor Csardi's … Continue reading
Improving data quality with deducorrect
Does your raw numerical data suffer from typos? Sign errors? Variable swaps? Rounding errors? You may be able to fix all that with the deducorrect package. Today, we (that is Edwin de Jonge, Sander Scholtus and myself) uploaded the 1.0-0 … Continue reading