You can read the original post in its original format on Rtask website by ThinkR here: Tame your namespace with a dash of suggests
We’ve all felt it, that little shiver running across our skin as we watch the check of our newest commit running, thinking “Where could this possibly go wrong?”.
And suddenly, the three little redeeming ticks: 0 errors ✔ | 0 warnings ✔ | 0 notes ✔
Hallelujah!
We git commit the whole thing, we proudly git push our branch and we open our Pull Request (PR) with a light and perky spirit.
Ideally, it works every time.
But sometimes it doesn’t! The Continuous Integration (CI) crashes because of a hidden missing dependency, even though everything seemed to be well declared. 🫠
Want to get to the bottom of this? Join me on this expedition, hunting for Imports and Suggests in our dependencies!
1. A package and its CI
Our example package generates graphics with {ggplot2}.
On its README.md we can see the R-CMD-check badge, telling us this package has been tested through the Continuous Integration (CI) of GitHub Actions.
The CI automatically launches a check() of the package, not from a local installation, but from a minimal Docker environment. The steps to run are defined in the config file R-CMD-check.yaml.
Good news, the badge is green: the CI runs with no errors!
2. A new function
We need a new function to save a graph in several formats at once.
We add this new function to the save_plot.R file using {fusen} [1], sprinkled with a little documentation in the style of {roxygen2} [2], and a usage example.
Let’s take a look at the code together.
a. the documentation
    #' save_plot
    #'
    #' @param plot ggplot A ggplot object to be saved
    #' @param ext character A vector of output format, can be multiple of "png", "svg", "jpeg", "pdf"
    #' @param path character A path where to save the output
    #' @param filename character The filename for the output, extension will be added
    #'
    #' @importFrom purrr walk
    #' @importFrom ggplot2 ggsave
    #' @importFrom glue glue
    #'
    #' @return None Create output files
    #'
    #' @export
- The @param tag describes the function’s four parameters
- The @importFrom tag specifies the imports required to use the function
  - e.g. @importFrom purrr walk loads walk() from the {purrr} package
b. the function body
    save_plot <- function(
        plot,
        ext = c("png", "jpeg", "pdf"),
        path = "graph_output",
        filename = "output") {
      # accept every documented format, even those outside the defaults (e.g. "svg")
      ext <- match.arg(
        ext,
        choices = c("png", "svg", "jpeg", "pdf"),
        several.ok = TRUE
      )
      # save all formats
      ext %>% walk(
        \(x) ggsave(
          filename = file.path(path, glue("{filename}.{x}")),
          plot = plot,
          device = x
        )
      )
    }
- This function uses the {purrr} and {ggplot2} packages to export the graphic in several formats.
- The list of export formats is specified by the ext parameter and defaults to png, jpeg and pdf.
- The format is added as a suffix to the file name using glue().
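To make the file-naming step concrete, here is a minimal base R sketch of the paths that end up being handed to ggsave(). It uses paste0() in place of glue() so that it runs without any extra package; that substitution is an assumption made for the illustration, not the package's code.

```r
# Default values taken from the save_plot() signature
path <- "graph_output"
filename <- "output"
ext <- c("png", "jpeg", "pdf")

# paste0() stands in for glue("{filename}.{x}") here (illustrative substitution)
output_files <- file.path(path, paste0(filename, ".", ext))
print(output_files)
```

Each requested extension yields exactly one path, for instance graph_output/output.png for the png format.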
c. the usage example
    # create temp dir
    tmp_path <- tempfile(pattern = "saveplot")
    dir.create(tmp_path)

    ext <- c("svg", "pdf")
    data <- fetch_dataset(type = "dino")
    p <- plot_dataset(
      data,
      type = "ggplot",
      candy = TRUE,
      title = "the candynosaurus rex"
    )
    save_plot(
      plot = p,
      filename = "dino",
      ext = ext,
      path = tmp_path
    )

    # clean
    unlink(tmp_path, recursive = TRUE)
- We create a graph using the plot_dataset() function already implemented in the package.
- We export the graph in svg and pdf format to a temporary folder, which we delete at the end of the example.
3. Here comes the check
We’ve got the function ready, now it’s time to make sure everything’s running smoothly.
As we’re well-mannered, we’ll do it in two steps.
a. the local check
- First, I check that my code is running on my machine
  - I run a local check with devtools::check()
  - All green, all good!
b. the CI check
- I then check that my code runs in a minimal environment thanks to the CI
  - I send it all to the remote, and create a Pull Request (PR)
  - The PR starts its battery of checks on the new branch
  - And then, crash
4. A CI neither green nor cabbage-looking
What do you mean, errors?! We’ve checked that it works! Our confidence in the check command takes a hit.
Before we lose hope, let’s take a closer look.
a. the R-CMD-check
- It seems that the CI checks have hit a snag
- If you go to the Details of the logs, you’ll find the error message
A first clue, then. It seems that the error comes from the example of our new save_plot() function.
b. the backtrace
- Going down a little further in the logs, we find the error’s backtrace
- The backtrace unrolls the sequence of functions executed before the error
- This allows us to trace the origin of the problem in the sub-functions of a call
The backtrace tells us that the error comes from outside our package, in the ggsave() function of {ggplot2}.
c. the ggsave() function
- Never mind, let’s dig into {ggplot2}!
Bingo! The error occurs when our function tries to save a graphic in svg format but can’t find the {svglite} package!
5. Rewinding the thread of dependencies
- Our investigation then leads us to two questions:
  - Why didn’t the import of {ggplot2} also load {svglite} as a dependency?
  - Why did this error appear in the CI but not at the local check?
a. the {ggplot2} imports
Let’s start with these imports.
- To use {ggplot2} functions in our package, we’ve specified that ggplot2::ggsave() is part of the package’s imports [3]
  - In other words, these imports correspond to the dependencies that are essential for the package to function properly
  - The list of dependencies can be found in the DESCRIPTION file
- You can also check the list of {ggplot2} dependencies in its own DESCRIPTION file.
  - And what do I see!
  - {svglite} is part of {ggplot2}’s suggests, not imports
    - the suggests section lists little-used dependencies or dependencies intended for developers (e.g. {testthat})
    - their installation as dependencies is not mandatory
    - in the case of our CI, the list of {ggplot2} imports has been installed, but not the suggests list
When the pipeline tried to save the graphic as svg, it went all the way down to the ggplot2::ggsave() function, but didn’t find the {svglite} package.
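One way to run this comparison from the R console is sketched below. It reads the Imports and Suggests fields of an installed package with utils::packageDescription(). The helpers parse_dep_field() and declared_deps() are hypothetical names introduced for this illustration, and the result depends on the {ggplot2} version installed on your machine.

```r
# Hypothetical helper: turn a comma-separated DESCRIPTION field into package names.
parse_dep_field <- function(field) {
  if (is.null(field) || is.na(field)) {
    return(character(0))
  }
  deps <- strsplit(field, ",")[[1]]
  # drop version constraints such as "(>= 1.3.0)"
  deps <- gsub("\\s*\\([^)]*\\)", "", deps)
  deps <- trimws(deps)
  deps[nzchar(deps)]
}

# Hypothetical helper: list what an installed package declares as imports and suggests.
declared_deps <- function(pkg) {
  desc <- utils::packageDescription(pkg, fields = c("Imports", "Suggests"))
  list(
    Imports = parse_dep_field(desc$Imports),
    Suggests = parse_dep_field(desc$Suggests)
  )
}

# Only query ggplot2 if it is installed on this machine
if (requireNamespace("ggplot2", quietly = TRUE)) {
  deps <- declared_deps("ggplot2")
  print("svglite" %in% deps$Imports)
  print("svglite" %in% deps$Suggests)
}
```

Reading the DESCRIPTION of the installed copy keeps the check offline, but it only describes the version you happen to have locally.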
b. what about my local check ?
How come this problem doesn’t occur during the local check?
- On our local machine, the {svglite} package is already installed
  - We can load it without any problem
  - We can run save_plot() and check() locally without any problem
- On the CI, we use minimal Docker environments
  - They have no packages installed other than the imports from the DESCRIPTION file
  - {svglite} will therefore not be installed, and this will show up in check()
In other words, {svglite} is not installed by default with our package. How can we solve this?
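Before pushing, a quick way to see what your own machine can load is to ask R directly. A minimal sketch, where missing_pkgs() is a hypothetical helper and the package names are only examples:

```r
# Hypothetical helper: return the packages from `pkgs` that cannot be loaded here.
missing_pkgs <- function(pkgs) {
  installed <- vapply(
    pkgs,
    function(pkg) requireNamespace(pkg, quietly = TRUE),
    logical(1)
  )
  pkgs[!installed]
}

# An empty result locally says nothing about the CI,
# which starts from a much smaller library.
print(missing_pkgs(c("ggplot2", "svglite")))
```

An empty result on a developer machine is exactly the situation described above: everything is installed locally, so the gap only becomes visible in the minimal environment.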
6. Getting {svglite} on board
A first solution to our ailing CI would be to move {svglite} into our package’s imports.
- We add the line #' @importFrom svglite svglite to our function documentation
- This will add {svglite} to the list of imports in the DESCRIPTION file
- We check the logs when updating the doc
Does everything go back to normal after that?
- The local devtools::check() remains green. So far, so good
- We test the CI in a new Pull Request
Bingo! The CI is back to green!
7. There’s something fishy going on
a. adieu
We could be satisfied with this version. Except that it feels like a pebble in our shoe.
Why is that? For this:
Our triple green is gone!
Adding the {svglite} dependency brings the package’s total number of mandatory dependencies (imports) up to 21.
Rightly so, devtools::check() warns us with a note that this is not an optimal situation for maintaining code.
CRAN advises you to move as many dependencies as possible to the suggests section.
b. fishing for suggestions
Moving as many imports as possible to suggests is one thing, but how do we decide on the fate of each of our dependencies?
According to CRAN, we can identify as suggests the dependencies that:
- are not necessarily useful to the user, including:
  - packages used only in examples, tests and/or vignettes
  - packages associated with functions that are rarely used by the user.
Let’s take the case of our dependency on {svglite} in deptrapr::save_plot():
- we keep it in imports if:
  - save_plot() is a flagship function of the package
  - it is regularly used (in which case we can add svg to the default formats)
- it is set to suggests if:
  - save_plot() is a very rarely used function
  - it is only used in the save_plot() example
Note that, with this approach, there’s no need to wait until you have 21 dependencies to start sorting, right?
Let’s say save_plot() is a minor function in our package. As {ggplot2} does, we can then move the {svglite} import of our package to suggests.
Let’s do it the right way.
8. Passing from imports to suggests
Let’s try to migrate our dependency on {svglite} from imports to suggests.
a. a breath of roxygen
- To update by hand:
  - we delete our previous roxygen2 line to remove {svglite} from the imports
  - we run the command usethis::use_package(package = "svglite", type = "Suggests") to add {svglite} to suggests this time
- Would you prefer to use {attachment} or {fusen} to update your dependencies?
  - in this case, you need to save this modification in the configuration file [4]
  - run the command attachment::att_amend_desc(extra.suggests = "svglite", update.config = TRUE)
  - with this, the addition of {svglite} will be remembered the next time attachment::att_amend_desc() is called
  - it also works with {fusen}: inflate() will use the {attachment} configuration file in the background!
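To see what this migration amounts to in the DESCRIPTION file, here is a base R sketch on a throwaway file. The DESCRIPTION content and the move_to_suggests() helper are made up for the illustration and are not how usethis or {attachment} work internally. In a real package, let those tools edit the file for you.

```r
# Throwaway DESCRIPTION with made-up fields, written to a temporary folder
desc_path <- file.path(tempdir(), "DESCRIPTION")
write.dcf(
  data.frame(
    Package = "deptrapr",
    Version = "0.0.1",
    Imports = "ggplot2, glue, purrr, svglite",
    stringsAsFactors = FALSE
  ),
  file = desc_path
)

# Split a comma-separated field into package names
split_deps <- function(x) {
  if (is.null(x) || is.na(x)) {
    return(character(0))
  }
  trimws(strsplit(x, ",")[[1]])
}

# Hypothetical helper: move `pkg` from Imports to Suggests in a DESCRIPTION file
move_to_suggests <- function(path, pkg) {
  fields <- as.list(read.dcf(path)[1, ])
  fields$Imports <- paste(setdiff(split_deps(fields$Imports), pkg), collapse = ", ")
  fields$Suggests <- paste(union(split_deps(fields$Suggests), pkg), collapse = ", ")
  write.dcf(as.data.frame(fields, stringsAsFactors = FALSE), file = path)
  invisible(read.dcf(path))
}

updated <- move_to_suggests(desc_path, "svglite")
print(updated[1, c("Imports", "Suggests")])

# clean
unlink(desc_path)
```

After the call, {svglite} no longer appears in the Imports field and is listed under Suggests instead, which is the only change the package metadata needs.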
Once {svglite} has been switched to suggests, our three ticks from the local devtools::check() are back to green. Hurray!
If we stopped here, our GitHub CI would work without a hitch, as it installs by default:
- the package’s imports dependencies, as well as their own imports
- the package’s suggests dependencies (including {svglite} here)
Except that it’s not enough as it is.
c. avoid backlash
Why do more? To do better!
Otherwise, we’d be passing the hot potato on to the next developer !
Let’s put ourselves in the shoes of the next person who wants to use our package.
If they try to save their graphic as svg, they’ll have to rewind the backtrace up to the suggests dependencies, just as we did with {ggplot2}.
This may sound simple, but it can quickly get bogged down in the following cases:
- there are 10 packages in suggests
  - the CI would stop at each missing dependency once the previous one had been corrected
- the function runs after a 10-minute calculation
  - the dependency error would cause us to lose all calculations
- our package uses many nested functions
  - the backtrace would not allow us to trace the error
🛟 9. A namespace put to the test
a. the requireNamespace() function
As the output of usethis::use_package() indicates, for suggests to be successful, they must be accompanied by a requireNamespace().
This function is used to check whether or not the dependency is missing, and to decide what action to take depending on the situation.
This enables us to achieve two useful behaviors:
- make the error message more explicit
  - we can specify which dependency is missing and why it’s needed
  - to do this, we add a message() or a warning() to the execution conditioned by the requireNamespace()
- avoid errors
  - you can skip execution if the package is not installed and continue without error
  - to do this, we modify the parameters in the execution conditioned by requireNamespace()
In our case, we get something like this:
    save_plot <- function(
        plot,
        ext = c("png", "jpeg", "pdf"),
        path = "graph_output",
        filename = "output") {
      # accept every documented format, even those outside the defaults (e.g. "svg")
      ext <- match.arg(
        ext,
        choices = c("png", "svg", "jpeg", "pdf"),
        several.ok = TRUE
      )
      # check svglite is installed, but only when svg export is requested
      if ("svg" %in% ext && !requireNamespace("svglite", quietly = TRUE)) {
        warning(
          paste(
            "Skipping SVG export.",
            "To enable svg export with ggsave(),",
            "please install svglite: install.packages('svglite')"
          )
        )
        ext <- ext[ext != "svg"]
      }
      # save all formats
      ext %>% walk(\(x) {
        ggsave(
          filename = file.path(path, glue("{filename}.{x}")),
          plot = plot,
          device = x
        )
      })
    }
Now if {svglite} isn’t installed, save_plot() points out the solution, all without crashing!
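The save_plot() version above takes the "skip and warn" route. When the optional package is indispensable for the requested task, the other behavior, a clear and early error, could be sketched as follows. stop_if_not_installed() is a hypothetical helper, not a function from the package or from usethis.

```r
# Hypothetical helper: fail early with an explicit message
# when a suggested package is missing.
stop_if_not_installed <- function(pkg, reason) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(
      sprintf(
        "Package '%s' is needed %s. Install it with install.packages('%s').",
        pkg, reason, pkg
      ),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Example call: check up front, before any long computation starts.
# tryCatch() is only here so the sketch runs whether or not svglite is installed.
result <- tryCatch(
  stop_if_not_installed("svglite", "to export plots as svg"),
  error = function(e) conditionMessage(e)
)
print(result)
```

Calling such a check at the very top of a function means a missing dependency is reported before a 10-minute calculation, not after it.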
b. a nice readme
As we’ve seen in our wanderings, {svglite} won’t install when a user tries to install our {deptrapr} package, because it’s part of the suggests.
To save our users the trouble of poking around in the DESCRIPTION to discover this dependency, we can make their life easier with our README.
We add a paragraph mentioning the use of suggests, and the possibility to install them using the dependencies = TRUE parameter during installation.
10. Wrap-up
For peace of mind while running a check, the suggests & requireNamespace() combo now has your back!
Another way of quickly detecting this type of error when developing a function is to pair it with a unit test.
Having a test for the save_plot() function would have enabled us to detect the missing dependency at the local devtools::check() step, without having to go through the CI.
We’d directly obtain an error similar to the one observed in the CI.
Two last tips for the road:
- Packages that are not imported are not part of the NAMESPACE, so it is recommended to call them under the format pkg::function() in the code, to avoid conflicts.
- If your CI uses devtools::install(), the suggested packages will not be installed by default; to do so, specify dependencies = TRUE in your yaml.
- a grammar for creating documentation in a snap of fingers
- c.f. the function’s roxygen2 documentation
- since version 0.4.4 of {attachment}
Long-Term Implications and Future Developments
The discussion in the original article centers on improving the practice of coding, specifically in R, through efficient management of dependencies. The future of streamlined collaborative software development focuses heavily on refining dependency management that respects namespace environment and results in cleaner code and smoother integration of various modules.
The use of ‘imports’ and ‘suggests’ in package dependencies is a potentially significant shift in coding practices, focusing on precision, efficiency, and improved collaboration. In the long run, this can contribute to more robust software and applications. Developers will likely see shorter debugging times and fewer integration issues, especially when merging code written by different teams.
However, it’s also possible that an overuse of ‘suggests’ may end up confusing less experienced developers who might struggle to understand why certain code options aren’t working due to optional dependencies not being present. Thus, it’s vital to ensure these dependencies are documented clearly.
Actionable Advice
- Understand the difference between ‘imports’ and ‘suggests’ – the former is crucial for your package to function properly, while the latter should be used for little-used dependencies.
- If you’re going to use ‘suggests’ for a dependency, make sure to apply the requireNamespace() function. This prevents errors when integrating and lets you give clear error messages for missing dependencies. This can save time debugging issues related to missing dependencies.
- Document each dependency clearly, whether it’s an ‘import’ or a ‘suggest’. Do this within your code and also in end-user documentation like your README file. This prevents future confusion.
- To avoid conflicts, call functions from packages that are not imported in the format pkg::function().
- Back your checks with unit tests, so that missing dependencies are caught during the local check stage before the code is pushed to continuous integration pipelines. This saves time and allows for quicker responses.
- Where possible, during CI use devtools::install() with dependencies = TRUE in your .yaml file. This will ensure suggested packages are correctly installed and could save you time tracking down issues later.
In conclusion, proper management of package dependencies is an integral part of writing clean, efficient, and collaborative code. By following the tips listed above, you can keep your packages running smoothly and avoid unnecessary debugging or code-review rounds.