Always be ready for the most powerful data wrangling package in R.
R is my favorite programming language when it comes to data analytics. And one of the reasons I find it so powerful is a single package called data.table.
It does not get nearly as much attention as its major alternative, the dplyr family. I’ve spent years wondering why.
Its elegant API design and unparalleled performance still have not made it one of the most popular data wrangling packages in R.
Anyway, I’ve stuck with it for many years, and I’m sure there are many more to come. :)
OK, today's topic is using data.table on macOS, which can be troublesome because Apple's toolchain lacks built-in OpenMP support. If we simply install data.table from CRAN, it will warn that it is running in single-threaded mode. That is not cool. So here is a very quick fix for that.
brew install libomp
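Before moving on, it can be worth confirming that the install actually put the OpenMP header and library where the later steps expect them. The sketch below is only an illustration: `check_libomp` is a hypothetical helper name, and the file names `include/omp.h` and `lib/libomp.dylib` are my assumption about the usual Homebrew layout.

```shell
# Hypothetical helper (not part of Homebrew or R): check that a libomp
# prefix contains the OpenMP header and the shared library.
# Assumption: Homebrew places them at include/omp.h and lib/libomp.dylib.
check_libomp() {
  prefix="$1"
  if [ -f "$prefix/include/omp.h" ] && [ -f "$prefix/lib/libomp.dylib" ]; then
    echo "libomp found under $prefix"
  else
    echo "libomp not found under $prefix" >&2
    return 1
  fi
}

# Usage on a Mac with Homebrew (left commented so the definition can be
# pasted or sourced without side effects):
# check_libomp "$(brew --prefix libomp)"
```

If the check fails, re-running the install and looking at the output of `brew --prefix libomp` is a reasonable first step.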
~/.R/Makevars
This file gives R additional compiler arguments to use when it builds packages from source. Run the following one-off commands to update it:
OMP_PATH=$(brew --prefix libomp)
mkdir -p ~/.R
cat <<EOT >> ~/.R/Makevars
CPPFLAGS += -Xclang -fopenmp -I ${OMP_PATH}/include
LDFLAGS += -lomp -L ${OMP_PATH}/lib
EOT
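One thing to keep in mind: `>>` appends every time, so running the commands above twice leaves duplicate lines in the file. If you expect to re-run the setup (for example from a dotfiles script), a guarded variant along these lines may help. This is only a sketch: `add_openmp_flags` is a hypothetical helper name, and the `grep` for `-fopenmp` is a simple heuristic of mine for "already configured".

```shell
# Hypothetical helper: append the OpenMP flags to a Makevars file only
# if no line in it mentions -fopenmp yet.
#   $1 = libomp prefix, e.g. the output of `brew --prefix libomp`
#   $2 = path to the Makevars file, e.g. "$HOME/.R/Makevars"
add_openmp_flags() {
  omp_path="$1"
  makevars="$2"
  mkdir -p "$(dirname "$makevars")"
  # Skip the append when an earlier run already added the flags.
  if [ -f "$makevars" ] && grep -q -- "-fopenmp" "$makevars"; then
    echo "OpenMP flags already present in $makevars"
    return 0
  fi
  cat <<EOT >> "$makevars"
CPPFLAGS += -Xclang -fopenmp -I ${omp_path}/include
LDFLAGS += -lomp -L ${omp_path}/lib
EOT
  echo "OpenMP flags appended to $makevars"
}

# Usage on a Mac with Homebrew (left commented on purpose):
# add_openmp_flags "$(brew --prefix libomp)" "$HOME/.R/Makevars"
```

The two lines it writes are the same ones as in the commands above; only the "check first" step is new.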
data.table from source
install.packages("data.table", type = "source", repos = "https://Rdatatable.gitlab.io/data.table")
Done. That should do the trick.
I’m using the native aarch64 build of R. Here is my sessionInfo:
R version 4.1.3 (2022-03-10)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.2
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib
locale:
[1] C/UTF-8/C/C/C/C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_4.1.3
Check it out:
library(data.table)
getDTthreads() # check how many threads in use
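If you would rather run this check from the terminal (say, at the end of a setup script), something like the sketch below could work. `describe_threads` is a hypothetical helper of mine, not part of data.table; the only data.table call used is `getDTthreads()`, and the script stays silent when `Rscript` is not on the PATH.

```shell
# Hypothetical helper: turn a thread count into a human-readable verdict.
describe_threads() {
  case "$1" in
    ''|*[!0-9]*) echo "unknown (could not read a thread count)" ;;
    0|1)         echo "single-threaded" ;;
    *)           echo "multi-threaded ($1 threads)" ;;
  esac
}

# Ask R for the thread count only if Rscript is available.
if command -v Rscript >/dev/null 2>&1; then
  # data.table::getDTthreads() is called without attaching the package,
  # so only the number itself is printed; errors are swallowed.
  threads=$(Rscript -e 'cat(data.table::getDTthreads())' 2>/dev/null || true)
  describe_threads "$threads"
fi
```

Anything other than "single-threaded" here suggests the source build picked up OpenMP.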
Now we’re talking!