A new feature of data.table for us developers

The R package data.table is truly a work of art. The power it gives you is immense, and it’s always the first library I load in every single script.

And as of v1.13.0 it gives you even more! You may have read the NEWS file and seen this entry:

  1. CsubsetDT C function is now exported for use by other packages

If you are unsure what it means, hopefully this article will help.

Go on…

When you import data.table into your own package, you can use its subsetting operator [.data.table by making your objects data.tables.

This is great at the R level, but if you write any C or C++ code and want this fast subsetting, you have had to leave C/C++, go back to R, do the subset, then carry on with whatever your code does. Crossing the divide like this costs you extra code and an extra trip through the R interpreter.
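To make that round trip concrete, here is a minimal sketch of the old pattern using Rcpp. The names subset_via_r and r_subset are made up for illustration and are not part of data.table. The only point is that the subset itself happens back in R, not in C++.

```r
library(Rcpp)
library(data.table)

cppFunction('
  SEXP subset_via_r(SEXP x, SEXP rows, Function r_subset) {
    // Hypothetical helper: hand control back to the R interpreter
    // for the subset, then resume in C++ with the result.
    return r_subset(x, rows);
  }
')

dt <- data.table(id = 1:5, value = letters[1:5])

# The R-level subset that the C++ code has to call back into
r_subset <- function(x, rows) x[rows, ]

res <- subset_via_r(dt, 2:3, r_subset)
```

Every call like this passes through an R closure, which is exactly the detour the rest of this post removes.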

Now with CsubsetDT you can call this subset operation directly from C/C++. No need to go back to R, or indeed to implement your own version!

But how?

Simply look up the registered function from your own code with R_GetCCallable(), et voilà.

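library(Rcpp)

cppFunction(
  depends = 'data.table'
  , code = '
    SEXP dt_subset(SEXP x, SEXP rows, SEXP cols) {

      // Look up the C function that data.table registered as "CsubsetDT"
      typedef SEXP (*subset_fun)(SEXP, SEXP, SEXP);
      subset_fun dtsubset = (subset_fun) R_GetCCallable("data.table", "CsubsetDT");

      // Call it directly, without going back through R
      return dtsubset(x, rows, cols);
    }
  '
)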

Now you can use data.table’s subset operation directly from C/C++:

df <- mapdeck::roads

dt_subset( df, 35:50, c(11L,12L) )
##    RIGHT_LOC  ROAD_NAME
## 1   HAWTHORN    BURWOOD
## 2   HAWTHORN     PALMER
## 3   HAWTHORN   ISABELLA
## 4   RICHMOND    UNNAMED
## 5   RICHMOND MAIN YARRA
## 6   HAWTHORN     BRIDGE
## 7   RICHMOND     BRIDGE
## 8   HAWTHORN     BRIDGE
## 9   RICHMOND MAIN YARRA
## 10  RICHMOND MAIN YARRA
## 11  HAWTHORN    UNNAMED
## 12  HAWTHORN    UNNAMED
## 13  HAWTHORN     BRIDGE
## 14  RICHMOND     BRIDGE
## 15  HAWTHORN    UNNAMED
## 16  HAWTHORN   CRESWICK

Addendum

Work is already underway on a data.table API, which means that in a future release you won’t even need the R_GetCCallable() lookup. You’ll only need to include the datatable.h file and use the functions directly!