vignettes/svFast.Rmd
The {svFast} package provides a series of math and stat functions that
compute in parallel if the vector is large enough (by default, >=
50,000 elements). Otherwise, these functions are simplified versions of
the corresponding base R functions. They run on numeric vectors and do
not retain any attributes of the input vector. Unlike their base R
counterparts, they are not generic functions. They are designed to be
fast on vectors of hundreds of thousands to hundreds of millions of
items. Their names are the same as the corresponding base functions,
followed by an underscore. For example, log_() is the fast version of
log().
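To make the point about attributes concrete, here is a minimal base R sketch. The helper log_like() is a hypothetical stand-in written only for illustration; it is not svFast's implementation and only mimics the documented behaviour of dropping attributes and skipping method dispatch on the input's class.

```r
# A named numeric vector: names are an attribute of x
x <- c(a = 1, b = 10, c = 100)

# Base R log() keeps the names of its input
log(x)

# Hypothetical stand-in (NOT the svFast code): as.numeric() strips all
# attributes, so the result is a bare numeric vector
log_like <- function(x, base = exp(1)) {
  log(as.numeric(x), base = base)
}

# Same values as log(x), but the names are gone
log_like(x)
names(log_like(x))
```

If the names or other attributes matter downstream, they have to be saved before the call and restored afterwards, for instance with names(res) <- names(x).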
library(svFast)
# The number of threads used for calculation can be changed with:
#RcppParallel::setThreadOptions(numThreads = 4)
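As a small sketch of how the thread setting might be inspected and changed around a computation, the following uses only RcppParallel's defaultNumThreads() and setThreadOptions(). The choice of two threads is an arbitrary value for illustration, and the placeholder comment marks where svFast calls would go.

```r
library(RcppParallel)

# Number of threads RcppParallel uses by default on this machine
n_default <- defaultNumThreads()
n_default

# Restrict subsequent parallel computations to two threads
# (2 is an arbitrary value chosen for this illustration)
setThreadOptions(numThreads = 2)

# ... calls to log_() and similar functions on large vectors go here ...

# Hand the choice back to RcppParallel
setThreadOptions(numThreads = "auto")
```

Resetting to "auto" at the end avoids leaving a restricted thread count in place for the rest of the session.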
When the vector x is short, the fun_() versions run sequentially. For
vectors of roughly 1,000 items or more, their speed is similar to that
of the base R equivalents. For smaller vectors, the overhead of Rcpp
makes these functions slower.
x <- 1:10
log_(x)
#> [1] 0.0000000 0.6931472 1.0986123 1.3862944 1.6094379 1.7917595 1.9459101
#> [8] 2.0794415 2.1972246 2.3025851
identical(log_(x), log(x))
#> [1] TRUE
# log in base 8
log_(x, base = 8)
#> [1] 0.0000000 0.3333333 0.5283208 0.6666667 0.7739760 0.8616542 0.9357850
#> [8] 1.0000000 1.0566417 1.1073094
bench::mark(log(x), log_(x)) # Slower with such a short vector
#> # A tibble: 2 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 log(x)        200ns 221.07ns  2297082.        0B        0
#> 2 log_(x)      1.55µs   1.84µs   529934.        0B        0
When x is long, the fun_() versions run in parallel and are faster than
their base R equivalents (nearly twice as fast in the benchmark below;
the gain depends on the number of threads).
x2 <- runif(1e5)
# x2 size larger than 50,000, parallel computation activated
bench::mark(log(x2), log_(x2)) # Faster (depends on number of threads)
#> # A tibble: 2 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 log(x2)       841µs    899µs     1110.     781KB     17.8
#> 2 log_(x2)      455µs    476µs     2094.     781KB     31.8
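The two benchmarks above sample only one short and one long vector. A possible way to see where log_() starts to pay off on a given machine is to repeat the comparison over several lengths. The sketch below assumes {svFast} and {bench} are installed. The vector sizes and the column names of the summary are arbitrary choices made for this illustration, and the timings will vary with hardware and thread count, so no output is shown.

```r
library(svFast)

# Arbitrary vector lengths on both sides of the 50,000-element threshold
sizes <- c(10, 1e3, 1e4, 1e5, 1e6)

timings <- do.call(rbind, lapply(sizes, function(n) {
  x <- runif(n)
  # Same comparison as above, repeated for each length
  b <- bench::mark(log(x), log_(x))
  data.frame(
    n = n,
    base_median_s = as.numeric(b$median[1]), # median time of log(), seconds
    fast_median_s = as.numeric(b$median[2])  # median time of log_(), seconds
  )
}))

# Ratio > 1 means log_() was faster than log() for that length
timings$speedup <- timings$base_median_s / timings$fast_median_s
timings
```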