Simplifying Async Programming in R/Shiny Applications

[This article was first published on R | Discindo, and kindly contributed to R-bloggers].

In the R/Shiny community we are fortunate to have several approaches for async programming.
It is an active field of development with a variety of
options depending on the needs of the application. For examples and deeper overviews of
the state of async programming in R, head over to Veerle van Leemput’s
writing,
the Futureverse documentation or the
mirai / crew repos.

In this post, I am going to focus on an approach to simplify making multiple
async calls in shiny applications. Really, it boils down to developing a module
that wraps the initialization and polling of a callr::r_bg process into a single
function, and makes it easier to write a larger async-capable shiny app while
keeping the code shorter and more compact.

The problem

I am working on refactoring a relatively large shiny application where many of the
computations are time-consuming. Ideally, I would like to convert the major bottlenecks
into async routines. Typically, this is done by setting up future/promise constructs or
sending a job to a subprocess, keeping the main shiny process free, and then
polling the subprocess ‘manually’ to fetch the result (callr/mirai/crew).

After reviewing the available options, and trying a few things, I decided
to go with callr for async, although mirai and crew were close seconds.
This choice was mostly because of callr’s simplicity and because I have previous
experience with it.

The callr workflow can be summarised in the following steps:

send a call to the subprocess (possibly within a reactive and dependent on events within shiny)
monitor the status of the background process to know when to fetch the results
the polling observer has to have a switch, so we don’t waste resources on polling
while there is nothing running.
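
Stripped of shiny, the steps above can be tried in a plain R session. The following is a minimal sketch, assuming callr is installed; the inline function and the one-second sleep are placeholders for a real long computation, and the `while` loop stands in for the polling observer a shiny app would use.

```r
library(callr)

# 1. Send a call to a background R process; r_bg() returns immediately.
bg_process <- r_bg(
  func = function(x, sleep) {
    Sys.sleep(sleep)   # placeholder for a long computation
    utils::head(x)
  },
  args = list(x = mtcars, sleep = 1)
)

# 2. Poll until the subprocess has finished.
while (bg_process$is_alive()) {
  Sys.sleep(0.1)
}

# 3. Fetch the value returned by the function.
result <- bg_process$get_result()
```

In a shiny app the blocking loop is replaced by an observer with `invalidateLater()`, so the main process stays free between checks.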

In a shiny server, it's probably some 15-20 lines of code, depending on the complexity of
the function call we are sending to the subprocess. It looks something like this:

# The function we want to run async
# (sleep is added to mimic long computation)
head_six <- function(x, sleep) {
  Sys.sleep(sleep)
  head(x)
}

# the r_bg call
# (`my_data` stands for whatever data the app provides)
args <- list(head_six = head_six, x = my_data, sleep = 5)
bg_process <- callr::r_bg(
  func = function(head_six, x, sleep) {
    head_six(x, sleep)
  },
  args = args,
  supervise = TRUE
)

# turn on polling after the task has been sent to the subprocess
poll_switch <- shiny::reactiveVal(TRUE)

# reactive to store the result returned by the subprocess
result_rct <- shiny::reactiveVal(NULL)

# monitor the background process
shiny::observe({
  shiny::req(isTRUE(poll_switch()))
  shiny::invalidateLater(300)
  message("checking")
  alive <- bg_process$is_alive()
  if (isFALSE(alive)) {
    # store the result and switch polling off
    result_rct(bg_process$get_result())
    message("done")
    poll_switch(FALSE)
  }
})

# do stuff with `result_rct()`

Having to write this in 20 different places where async might be needed in
an application is definitely a chore, not to mention error-prone, as one needs
to keep track of the names of the process objects, polling switches, and result
reactives. Then of course, some async bits would need to respond to events, like
button clicks or other reactives in the shiny session, while others would need
to run without explicit triggers, adding to the complexity and maintenance of the
codebase.

The solution

I wanted to simplify the above process and make it quicker to write the async
code. I wanted a function or a module server that would take a function by name
and its arguments and then run the function in a background process, poll the
process and return the result when ready. Additionally, I wanted this module
to be flexible enough such that one can trigger the execution from the outside
(e.g., from the parent module) or to run without external triggers.

In the end, I came up with a solution with 3 components: the function that does the
long computation, an async version of this function, and a module server that
will do the shiny things. Below are the 3 parts starting with the trivial head_six
function (same as above):

# The function we want to run async
# (sleep is added to mimic long computation)
head_six <- function(x, sleep) {
  Sys.sleep(sleep)
  head(x)
}

The async version of the function is a wrapper that is prepared manually for the
function we need to run async. It is abstracting the callr::r_bg call, and
can live in a separate script (together with the function it wraps) instead
of the shiny server. There probably are ways to generate this function with
code, and I might try that soon, but for now creating this wrapper does not
bother me much. Having an async function that you can test and debug interactively
might actually be preferred.

# Async version of `head_six`
# calls `r_bg` and returns the process object
head_six_async <- function(x, sleep) {
  args <- list(head_six = head_six, x = x, sleep = sleep)
  bg_process <- callr::r_bg(
    func = function(head_six, x, sleep) {
      head_six(x, sleep)
    },
    args = args,
    supervise = TRUE
  )
  return(bg_process)
}
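
Because the wrapper returns a plain callr process object, it can be exercised in the console before it goes anywhere near a shiny server. The sketch below repeats the two definitions in condensed form so that it runs on its own; using `iris` and a one-second sleep is an arbitrary choice for illustration.

```r
# Definitions repeated (condensed) so the snippet is self-contained.
head_six <- function(x, sleep) {
  Sys.sleep(sleep)
  utils::head(x)
}

head_six_async <- function(x, sleep) {
  callr::r_bg(
    func = function(head_six, x, sleep) head_six(x, sleep),
    args = list(head_six = head_six, x = x, sleep = sleep),
    supervise = TRUE
  )
}

# Start the job; the call returns while the subprocess is still working.
proc <- head_six_async(iris, sleep = 1)

# Block until the subprocess exits (fine in the console, not in a shiny server).
proc$wait()

# Fetch the value computed in the background.
res <- proc$get_result()
```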

The third part is the function (module server) that calls the async version of
the function doing the time-consuming task. The module also has reactives
to switch polling on/off, and an observer to monitor and fetch the result. It
returns a list with two elements: a reactive with the result of the async
job, and a function that updates the polling reactive (poll_rct), which allows
one to initiate the task from the outside, for example from a button in
another module that should trigger the computation inside this async module.

mod_async_srv <- function(id, fun_async, fun_args, wait_for_event = FALSE) {
  moduleServer(id, function(input, output, session) {
    # result of the async job and the polling switch
    res_rct <- shiny::reactiveVal(NULL)
    poll_rct <- shiny::reactiveVal(TRUE)

    # when waiting for an external trigger, start with polling off
    if (isTRUE(wait_for_event)) {
      poll_rct(FALSE)
    }

    # launch the background process whenever polling is switched on
    bg_job <- reactive({
      req(isTRUE(poll_rct()))
      do.call(fun_async, fun_args)
    }) |> bindEvent(poll_rct())

    # monitor the background process and fetch the result when it is done
    observe({
      req(isTRUE(poll_rct()))
      invalidateLater(250)
      message(sprintf("checking: %s", id))
      alive <- bg_job()$is_alive()
      if (isFALSE(alive)) {
        res_rct(bg_job()$get_result())
        message(sprintf("done: %s", id))
        poll_rct(FALSE)
      }
    })

    return(list(
      start_job = function() poll_rct(TRUE),
      get_result = reactive(res_rct())
    ))
  })
}

Note that this is not a typical shiny module, in that it does not have
(and does not strictly need) a UI part. So we don’t have to worry
about the namespace (ns <- session$ns) inside it. We simply want to observe
and return. One could add a UI component to, perhaps, notify the user about the
progress (checking, checking, … done) of the async job.
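
One detail the module leaves open is what happens when the background function fails: callr's `$get_result()` re-throws the subprocess error in the calling process, which inside the observer would surface as an error in the shiny session. A possible guard is sketched below; `safe_get_result` is a hypothetical helper name, not part of callr or of the module above.

```r
library(callr)

# Hypothetical helper: fetch the result, turning a subprocess error into
# a condition object instead of an uncaught error in the caller.
safe_get_result <- function(proc) {
  tryCatch(proc$get_result(), error = function(e) e)
}

# A background job that fails on purpose.
failing <- r_bg(function() stop("something went wrong"))
failing$wait()

out <- safe_get_result(failing)
failed <- inherits(out, "error")
```

Inside the module, `res_rct(safe_get_result(bg_job()))` would then store either the value or the condition, and downstream code could check `inherits(x, "error")` before plotting.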

With this module, refactoring to async becomes more streamlined. For example,
we could have a scenario like this.

server <- function(input, output, session) {
  # async job triggered on event (input$go_async_job1)
  async_job1 <- mod_async_srv(
    id = "job1_srv",
    fun_async = "job1_async",
    fun_args = list(x = x, z = z),
    wait_for_event = TRUE
  )
  observeEvent(input$go_async_job1, {
    async_job1$start_job()
  })
  output$x <- renderPlot({
    plot_fun(async_job1$get_result())
  })

  # async job that runs without external intervention
  async_job2 <- mod_async_srv(
    id = "job2_srv",
    fun_async = "job2_async",
    fun_args = list(a = a, b = b),
    wait_for_event = FALSE
  )
  output$y <- renderPlot({
    table_fun(async_job2$get_result())
  })
}

Note that the two instances of mod_async_srv use different async functions
with different sets of arguments, and are triggered in different ways. This provides
some flexibility while keeping the server code minimal.

Nothing special here, no magic, just some wrappers to make life a bit easier when
writing large shiny applications with async capabilities.

Demo

To test out this approach you can download the following gist. In it, I have
two callr background async jobs, to show the head of iris and mtcars,
with different sleep time. The iris job waits for user click, while the
mtcars job runs on its own when the app starts. Neither async job blocks
the main shiny process, as they are both in the background, so the slider and
histogram work throughout.

Summary

In this post I went over an approach to organize callr background async jobs using a module, in order to make the async code faster to write, less error-prone and overall cleaner.

Continue reading: A simple workflow for async {shiny} with {callr}

Long-term Implications and Future Developments in R/Shiny Asynchronous Programming

The use of asynchronous programming in R/Shiny is a topic of considerable importance among developers. Given its potential, there are several methods to achieve this result, depending on the needs of the application. Recently, a new approach has been proposed to simplify making multiple async calls in shiny applications: developing a module that bundles the initialization and polling of a callr::r_bg process into a single function. This method aims to make writing larger async-capable shiny apps more comfortable while keeping the code shorter and more compact.

Asynchronous Programming and Future Trends

In Shiny, the primary purpose of async programming is to keep the main R process, which is single-threaded, free to respond to users while long computations run elsewhere, for example in a callr background process. Moving time-consuming work out of the main process keeps the interface responsive, and independent jobs can run at the same time. This becomes more valuable as applications grow in size and complexity.

The Shiny framework is a great tool to develop interactive web applications straight from R scripts. However, as these applications grow in size and complexity, expanding their async capabilities can bring multiple benefits. The outlined approach wraps the initialisation and polling of async calls in a single module. This reduces code length and repetition, which makes development less error-prone and maintenance easier.

Future Developments

Developments in R/Shiny asynchronous programming are ongoing, and for this specific approach, exploring code generation methods can automate the creation of async function wrappers. This development could further simplify the coding process, making the implementation of async programming even more straightforward.
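
One way such wrapper generation might look is a function factory that takes any function and returns its async counterpart. This is only a sketch under assumptions: `make_async` is a hypothetical name, and the approach presumes the wrapped function is self-contained (anything it needs beyond base R is referenced with `::` or passed as an argument), since it runs in a fresh R process.

```r
library(callr)

# Hypothetical factory: given a function, return a wrapper that runs it
# in a background process and returns the callr process object.
make_async <- function(fun) {
  force(fun)
  function(...) {
    r_bg(
      func = function(fun, args) do.call(fun, args),
      args = list(fun = fun, args = list(...)),
      supervise = TRUE
    )
  }
}

# The function from the post, and its generated async version.
head_six <- function(x, sleep) {
  Sys.sleep(sleep)
  utils::head(x)
}
head_six_async <- make_async(head_six)

proc <- head_six_async(x = iris, sleep = 1)
proc$wait()
res <- proc$get_result()
```

A wrapper built this way accepts its arguments through `...`, so it can still be called with `do.call()` and a named list, which is how the module server invokes `fun_async`.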

Furthermore, considering user interactions could also be valuable. Including UI components that notify users about the progress of async jobs could enhance user experience by making the operation of async applications clearer.
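
A very small version of this idea needs no change to the module at all, because the result reactive stays `NULL` until the job has finished. The helper below is a hypothetical sketch; it cannot tell "not started" from "running", and it would mislabel a job whose legitimate return value is `NULL`.

```r
# Hypothetical helper: map the module's result onto a status label.
job_status <- function(result) {
  if (is.null(result)) "pending" else "done"
}

status_before <- job_status(NULL)         # no result yet
status_after  <- job_status(head(iris))   # a result has arrived

# In a server function this could feed a text output, for example:
# output$job1_status <- shiny::renderText(job_status(async_job1$get_result()))
```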

Actionable Advice

Developers should consider the below strategies:

Understand user needs: Developers should evaluate the specific needs of users thoroughly before deciding on the most appropriate async programming method. Experiment with different methods, including the proposed module, to see which solution best fits their context.
Use resources efficiently: Each background job is a separate R process with its own start-up cost and memory, so reserve async for the real bottlenecks and switch polling off while nothing is running.
Stay updated: The landscape of async programming in R/Shiny is dynamic. Staying current with advancements in the field can provide insights into more efficient and effective programming techniques. Developers should check out resources such as Veerle van Leemput’s writing, the Futureverse documentation, and the mirai / crew repos for recent practices, concepts, and insights.
Continuously improve: There is always room for improvement in programming. That includes code automation, which can simplify processes and improve efficiency.

In Conclusion

Async programming in R/Shiny holds great potential for future development and offers numerous opportunities to develop more robust and efficient applications. As the field continues to progress, so too will the available techniques and methodologies, each offering unique advantages to meet different application needs.