A surprise
I came to write this post because I was surprised by what I
found.
The seed for it came from a somewhat obscure source: the Tokyo.R
Slack channel, which is actually a fairly vibrant R community. In any
case, there was a post by an R user surprised to find parallel
computation much slower than the sequential alternative – even though he
had thousands, even tens of thousands, of ‘embarrassingly parallel’
iterations to compute.
He demonstrated this with a simple benchmarking exercise, which
showed a variety of parallel map functions from various packages (which
shall remain nameless), each slower than the last, and (much) slower
than the non-parallel versions.
The replies to this post could be anticipated, and mostly aimed to
impart some of the received wisdom: namely that you need computations to
be sufficiently complex to benefit from parallel processing, due to the
extra overhead from sending and coordinating information to and from
workers. For simple functions, it is just not worth the effort.
And this is indeed the ‘received wisdom’…
and I thought about it… and the benchmarking results continued to
look really embarrassing.
The implicit answer was just not particularly satisfying:
‘sometimes it works, you just have to judge when’.
And it didn’t really answer the original poster either – he had
merely used a simple example to expose the problem, not because his
real usage was that simple.
The parallel methods just didn’t work. Or rather, didn’t ‘just
work’™.
And this is what sparked off The Investigation.
The Investigation
It didn’t seem right that there should be such a high bar before
parallel computations become beneficial in R.
My starting point would be mirai, somewhat naturally (as
I’m the author). I also knew that mirai
would be fast, as
it was designed to be minimalist.
mirai? That’s みらい, or Japanese for ‘future’. All you
need to know for now is that it’s a package that can create its own
special type of parallel clusters.
I had not done such a benchmarking exercise before as performance
itself was not its raison d’être. More than anything else, it
was built as a reliable scheduler for distributed computing. It is the
engine that powers crew, the high
performance computing element of targets, where it
is used in industrial-scale reproducible pipelines.
And this is what I found:
Applying the statistical function rpois() over 10,000 iterations:
library(parallel)
library(mirai)

base <- parallel::makeCluster(4)
mirai <- mirai::make_cluster(4)

x <- 1:10000

res <- microbenchmark::microbenchmark(
  parLapply(base, x, rpois, n = 1),
  lapply(x, rpois, n = 1),
  parLapply(mirai, x, rpois, n = 1)
)
ggplot2::autoplot(res) + ggplot2::theme_minimal()
Using the ‘mirai’ cluster was faster than the simple non-parallel
lapply(), which in turn was much faster than the default base
parallel cluster.
Faster!
I’m only showing the comparison with base R functions. They’re often
the most performant after all. The other packages that had featured in
the original benchmarking suffer from an even greater overhead than that
of base R, so there’s little point showing them above.
Let’s confirm with an even simpler function…
Applying the base function sum() over 10,000 iterations:
res <- microbenchmark::microbenchmark(
  parLapply(base, x, sum),
  lapply(x, sum),
  parLapply(mirai, x, sum)
)
ggplot2::autoplot(res) + ggplot2::theme_minimal()
mirai holds its own! Not much faster than sequential, but not
slower either.
But what if the data being transmitted back and forth is larger –
would that make a difference? Well, let’s change up the original
rpois() example: instead of iterating over lambda, have it return
increasingly large vectors.
Applying the statistical function rpois() to generate random vectors
of around length 10,000:
x <- 9900:10100

res <- microbenchmark::microbenchmark(
  parLapplyLB(base, x, rpois, lambda = 1),
  lapply(x, rpois, lambda = 1),
  parLapplyLB(mirai, x, rpois, lambda = 1)
)
ggplot2::autoplot(res) + ggplot2::theme_minimal()
The advantage is maintained! 1
So ultimately, what does this all mean?
Well, quite significantly, that virtually anywhere you have
‘embarrassingly parallel’ code where you would use lapply()
or purrr::map(), you can now confidently replace it with a
parallel parLapply() using a ‘mirai’ cluster.
The answer is no longer ‘sometimes it works, you
just have to judge when’, but:
‘yes, it works!’.
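As a minimal sketch of that substitution (the cluster size and the toy function here are illustrative, not taken from the original benchmarks):

```r
library(parallel)
library(mirai)

# A 'mirai' cluster is a regular 'parallel' cluster object
cl <- mirai::make_cluster(2)

x <- 1:100
f <- function(i) i^2  # stand-in for real per-element work

# before: res <- lapply(x, f)
res <- parLapply(cl, x, f)

stop_cluster(cl)  # clean up the background processes
```

The only change to existing code is swapping lapply(x, f) for parLapply(cl, x, f); the results are an identical list.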
What is this Magic?
mirai uses the latest NNG (Nanomsg Next Generation)
technology, a lightweight messaging library and concurrency framework 2 – which
means that the communications layer is so fast that it no longer
creates a bottleneck.
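To get a feel for that layer, here is a minimal sketch using the nanonext package (mirai’s R binding for NNG); the ‘pair’ protocol and in-process URL are illustrative choices, not how mirai itself is wired internally:

```r
library(nanonext)

# Two endpoints of an NNG 'pair' socket, connected over the
# in-process transport (an "ipc://" URL would use IPC instead)
s1 <- socket("pair", listen = "inproc://demo")
s2 <- socket("pair", dial = "inproc://demo")

send(s2, "hello")    # serialise and send an R object
msg <- recv(s1)      # receive and unserialise it

close(s1)
close(s2)
```

The same socket-based messaging works across IPC and TCP transports, which is what lets a cluster span processes or machines.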
The package leverages new connection types such as IPC (inter-process
communications), which are not available to base R. As part of R Project
Sprint 2023, R Core invited participants to provide alternative
communications backends for the parallel package, and
‘mirai’ clusters were born as a result.
A ‘mirai’ cluster is simply another type of ‘parallel’ cluster:
persistent background processes utilising cores on your own machine,
or on other machines across the network (HPCs or even the cloud).
I’ll leave it here for this post. You’re welcome to give
mirai
a try, it’s available on CRAN and at https://github.com/shikokuchuo/mirai.
Continue reading: mirai Parallel Clusters
Understanding Parallel Computing and Future Developments in R
This post is a follow-up on an informative article from R-bloggers which illustrated that parallel computing is sometimes mistakenly assumed to be slower than sequential computing. The article presented some insightful benchmark tests and shed light on how a tool called mirai performs better. Its key argument concerns simple or ‘embarrassingly parallel’ code – functions with iterations or loops that can be executed simultaneously and independently. It points out that parallel computations in such cases can be speedier than non-parallel ones, contrary to what many believe.
Long-term Implications
The value of understanding when and how to leverage parallel computing is enormous. As computational workload spikes, whether due to more complex analysis demands or increases in dataset size, the use of parallel processing will become more and more common.
In particular, regular users of the mainstream R system – such as researchers, data scientists, and business analysts – will find such knowledge and tools of tremendous value. They can significantly cut down the time it takes to process large datasets and complex calculations by properly leveraging parallel computing resources and tools like mirai.
The development and acceptance of mirai-like packages will also incentivise similar intuitive, user-friendly software packages that democratise access to high-performance computing.
Potential Future Developments
- Mirai could prompt updates in the default R system to optimise for parallel computations as it has evidently showcased its speed in benchmark tests.
- Mirai’s rise to broader usage could inspire the development of more specific packages leveraging parallelism for specific computations.
- These learnings may further feed into the development of intelligent systems capable of deciding when best to leverage parallel computation, optimising efficiency, and user experience.
Actionable Advice
Users who often compute ’embarrassingly parallel’ tasks should try and harness the power of mirai.
- Start by installing and using the package via CRAN or GitHub for parallel processing in lieu of lapply() or purrr::map().
- Keep an eye on the evolution of tools like mirai and adapt your workflows accordingly. As software continues to evolve, so must our best practices.
- Take some time to understand when it is best to use parallel versus sequential computing. The article gives us some insights; a rule of thumb is to consider the complexity and size of the computation job at hand.
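That rule of thumb can be sketched with base R alone (the cluster size and workload below are illustrative): when per-element work is trivial, the per-task dispatch overhead of a default cluster tends to dominate.

```r
library(parallel)

cl <- makeCluster(2)
x <- 1:1000

# Trivial per-element work: with a default socket cluster, messaging
# overhead often makes the parallel version no faster (or slower)
t_seq <- system.time(r1 <- lapply(x, sqrt))["elapsed"]
t_par <- system.time(r2 <- parLapply(cl, x, sqrt))["elapsed"]

stopCluster(cl)
```

Comparing t_seq and t_par on your own machine shows whether a given job is heavy enough to benefit; the results themselves are identical either way.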
Notes
A benchmarking exercise in the original article showed that the mirai package resulted in faster computations than both the default parallel cluster in R and non-parallel computations for execution of a simple function over 10,000 iterations.
The mirai package benefits from NNG (Nanomsg Next Generation) technology, a high-performance messaging library and concurrency framework. It applies new connection types such as IPC (inter-process communications), not available to base R.