The R package data.table is truly a work of art. The power it gives you is immense, and it’s always the first library I load in every single script.
And as of v1.13.0 it gives you even more! If you read the package's NEWS file and saw this entry
- CsubsetDT C function is now exported for use by other packages
but are unsure what it means, hopefully this article will help.
When you import data.table into your own package you can use its subsetting operator [.data.table by making your objects data.tables.
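As a small illustration of the R-level route, here is a minimal sketch. The data.frame `df` and its column names are made up for the example; in a package you would also declare data.table under Imports.

```r
library(data.table)

# A plain data.frame, as a package function might receive from a user
# (hypothetical example data)
df <- data.frame(
  id    = 1:5,
  value = c(2.5, 3.1, 4.7, 5.0, 6.2),
  grp   = c("a", "b", "a", "b", "a"),
  stringsAsFactors = FALSE
)

# Make the object a data.table so that `[` dispatches to [.data.table
dt <- as.data.table(df)

# Subset rows 2 to 4 and keep two columns
res <- dt[2:4, .(id, grp)]
print(res)
```

`as.data.table()` returns a copy; `setDT()` would convert the object by reference instead.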
This is great at the R level, but if you write any C or C++ code and want this fast subsetting, you have to leave C/C++, go back to R, do the subset, then carry on with whatever your code does. Crossing the divide like this means extra code to write and maintain.
Now with CsubsetDT you can call this subset operator directly in C/C++. No need to go back to R, or indeed implement your own version!
Simply look up the function with R_GetCCallable() in your own code, et voilà.
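library(Rcpp)
cppFunction(
  depends = 'data.table'
  , code = '
  SEXP dt_subset(SEXP x, SEXP rows, SEXP cols) {
    // Look up the C function that data.table exports under the name "CsubsetDT"
    SEXP(*dtsubset)(SEXP, SEXP, SEXP) = (SEXP(*)(SEXP, SEXP, SEXP)) R_GetCCallable("data.table", "CsubsetDT");
    // rows and cols are integer vectors of 1-based indices
    return dtsubset(x, rows, cols);
  }
  '
)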
Now you can use data.table’s subset operation directly from C/C++:
df <- mapdeck::roads
dt_subset( df, 35:50, c(11L,12L) )
## RIGHT_LOC ROAD_NAME
## 1 HAWTHORN BURWOOD
## 2 HAWTHORN PALMER
## 3 HAWTHORN ISABELLA
## 4 RICHMOND UNNAMED
## 5 RICHMOND MAIN YARRA
## 6 HAWTHORN BRIDGE
## 7 RICHMOND BRIDGE
## 8 HAWTHORN BRIDGE
## 9 RICHMOND MAIN YARRA
## 10 RICHMOND MAIN YARRA
## 11 HAWTHORN UNNAMED
## 12 HAWTHORN UNNAMED
## 13 HAWTHORN BRIDGE
## 14 RICHMOND BRIDGE
## 15 HAWTHORN UNNAMED
## 16 HAWTHORN CRESWICK
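The dt_subset() function above repeats the R_GetCCallable() lookup on every call. A common idiom, described in the Writing R Extensions manual, is to cache the pointer in a static variable so the lookup happens only once. The sketch below is one way to do that; the name `dt_subset_cached` and the example table are made up for illustration, and it assumes data.table v1.13.0 or later and a working compiler toolchain.

```r
library(Rcpp)
library(data.table)  # loaded first so its C routines are available for the lookup

cppFunction(
  depends = 'data.table'
  , code = '
  SEXP dt_subset_cached(SEXP x, SEXP rows, SEXP cols) {
    // Function pointer type matching CsubsetDT(x, rows, cols)
    typedef SEXP (*subset_fun)(SEXP, SEXP, SEXP);
    // Looked up on the first call only, then reused on later calls
    static subset_fun dtsubset = NULL;
    if (dtsubset == NULL) {
      dtsubset = (subset_fun) R_GetCCallable("data.table", "CsubsetDT");
    }
    return dtsubset(x, rows, cols);
  }
  '
)

# Hypothetical example data
dt <- data.table(id = 1:5, value = c(2.5, 3.1, 4.7, 5.0, 6.2))

# Integer row indices and integer column indices, as in the call above
res <- dt_subset_cached(dt, 2:4, c(1L, 2L))
print(res)
```

Both `rows` and `cols` are passed as integer vectors here, mirroring the `35:50` and `c(11L,12L)` arguments used with dt_subset().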
Work is already underway to make a data.table API, which means that in a future release you won’t even have to call R_GetCCallable() yourself; you’ll only need to include the datatable.h file and use the functions directly!