stringdist now with C API

A new version of stringdist is on CRAN. The main new feature, with huge thanks to our awesome new contributor Chris Muir, is that it is now easy to call stringdist functionality from your package's C or C++ code.

The main steps to get it done are:

  1. Make sure to add stringdist to the Imports: and LinkingTo: fields in your DESCRIPTION file.
  2. Add #include <stringdist_api.h> to your C/C++ source file.
  3. Start using stringdist from C!
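
As a rough sketch of step 1, the relevant lines of a DESCRIPTION file could look like the fragment below. The package name mypkg is a hypothetical placeholder, and all other fields a real DESCRIPTION file needs are left out.

```text
Package: mypkg
Imports: stringdist
LinkingTo: stringdist
```

LinkingTo makes the headers shipped with stringdist visible to the compiler when your package is built, which is what lets the #include in step 2 resolve.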

Here's an example source file:

#include <R.h>
#include <Rdefines.h>
#include <stringdist_api.h>

SEXP my_soundex(SEXP strings, SEXP useBytes){
  Rprintf("\nWow, using 'stringdist' soundex encoding, from my own C code!\n");
  return sd_soundex(strings, useBytes);
}

Great! How can I learn more?

  • The full API is described in a PDF file, generated with doxygen, that comes with the package. You can find it by typing ?stringdist_api on the R command line.
  • A minimal example package that links to stringdist is available on GitHub
  • A more sophisticated package with more elaborate examples can be found here: refinr (by Chris)

Any other news?

A few fixes, and a couple of long-deprecated function arguments have finally been removed. Check out the NEWS file on CRAN for a complete overview.

Happy coding!


2 Responses to stringdist now with C API

  1. Jason says:

    I recently read your benchmarks of stringdist vs RecordLinkage on R-Bloggers (several years old now). Is stringdist the fastest? I'm trying to do a comparison of around 20,000 x 20,000 records and I'm looking for the fastest way to process these.

    • mark says:

      It depends a bit on the distance you use, but I would say in general: yes, especially since stringdist can take advantage of multiple cores.
