[This article was first published on The Jumping Rivers Blog, and kindly contributed to R-bloggers].


R 4.5.0 (“How About a Twenty-Six”) was released on 11th April, 2025.
Here we summarise some of the interesting changes that have been
introduced. In previous blog posts we have discussed the new features
introduced in R 4.4.0 and earlier versions (see the links at the end of
this post).

The full changelog can be found on the r-release ‘NEWS’ page, and if you
want to keep up to date with developments in base R, have a look at the
r-devel ‘NEWS’ page.

penguins

Who doesn’t love a new dataset?

One of the great things about learning R for data science is that there
is a collection of datasets to work with, built into the base
installation of R. The Palmer Penguins dataset has been available via an
external package since 2020, and has been added to R 4.5.0 as a base
dataset.

This dataset is useful for clustering and classification tasks and was
originally highlighted as an alternative to the iris dataset.

In addition to the penguins dataset, there is a related penguins_raw
dataset. This may prove useful when teaching or learning data cleaning.
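
For a first look at the new data, a couple of base functions are enough
(a minimal sketch; note that the base dataset uses abbreviated column
names such as bill_len and bill_dep, which differ slightly from those in
the {palmerpenguins} package):

# The {datasets} package is attached by default, so the new data is
# available immediately in R 4.5.0
head(penguins)
summary(penguins$species)

# penguins_raw keeps the original column names and extra measurement
# fields, which makes it handy for practising data cleaning
str(penguins_raw)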

use()

If you have worked in languages other than R, its approach to importing
code from packages may seem strange. In a Python script, you either
import a package and then call functions through the package’s explicit
namespace:

import numpy
numpy.array([1, 2, 3])
# array([1, 2, 3])

Or you import a specific function by name prior to its use:

from numpy import array
array([1, 2, 3])
# array([1, 2, 3])

In an R script, we either use explicitly-namespaced functions (without
loading the containing package):

penguins |>
 dplyr::filter(bill_len > 40)

Or we load a package, adding all its exported functions to our
namespace, and then use the specific functions we need:

library("dplyr")
penguins |>
 filter(bill_len > 40)

The latter form can cause some confusion. If you load multiple packages,
there may be naming conflicts between the exported functions. Indeed,
there is a filter() function in the base package {stats} that is masked
when we load {dplyr} – so the behaviour of filter() differs before and
after loading {dplyr}.
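
As a small illustration (a sketch, assuming a fresh session), the base
function works on time series, so the same name means something quite
different before and after {dplyr} is attached:

# In a fresh session, filter() resolves to stats::filter(), a linear
# (e.g. moving-average) filter for time series
filter(1:10, rep(1 / 3, 3))

library("dplyr")
# The same name now resolves to dplyr::filter(), which expects a data frame
filter(penguins, bill_len > 40)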

R 4.5.0 introduces a new way to load objects from a package: use().
This allows us to be more precise about which functions we load, and
from where:

# R 4.5.0 (New session)
use("dplyr", c("filter", "select"))

# Attaching package: ‘dplyr’
#
# The following object is masked from ‘package:stats’:
#
# filter
#

penguins |>
 filter(bill_len > 40) |>
 select(species:bill_dep)

# species island bill_len bill_dep
# 1 Adelie Torgersen 40.3 18.0
# 2 Adelie Torgersen 42.0 20.2
# 3 Adelie Torgersen 41.1 17.6
# 4 Adelie Torgersen 42.5 20.7
# 5 Adelie Torgersen 46.0 21.5
# 6 Adelie Biscoe 40.6 18.6

Note that only the objects named in the use() call are attached from the
package:

# R 4.5.0 (Session continued)
n_distinct(penguins)
# Error in n_distinct(penguins) : could not find function "n_distinct"
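
The {dplyr} namespace is still loaded, however, so anything that was not
attached by use() can still be reached with an explicit dplyr:: prefix:

# Explicit namespacing works regardless of what use() attached
dplyr::n_distinct(penguins$species)
# [1] 3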

A feature similar to use() has been available in the {box} and {import}
packages for a while. {box} is a particularly interesting project, as it
allows more fine-grained control over the import and export of objects
from specific code files.
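
For comparison, a rough {box} equivalent of the earlier example might look
like this (a sketch using the {box} package, not base R):

# box::use() attaches only the objects listed inside the brackets
box::use(dplyr[filter, select])

penguins |>
 filter(bill_len > 40) |>
 select(species:bill_dep)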

Parallel downloads

Historically, the install.packages() function worked sequentially –
both the downloading and installing of packages were performed one at a
time. This meant it could be slow to install many packages.

We often recommend the
{pak} package for
installing packages because it can download and install packages in
parallel.

But as of R 4.5.0, install.packages() (and the related
download.packages() and update.packages()) are capable of
downloading packages in parallel. This may speed up the whole
download-and-install process. As described in a post on the R-project
blog by Tomas Kalibera, the typical expected speed-up is around 2-5x
(although this is highly variable).
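
No new syntax is needed to benefit from this: passing several packages to
a single install.packages() call gives R the chance to fetch them in
parallel (a minimal sketch with example package names; the long-standing
Ncpus argument separately parallelises the installation of source
packages):

# One call, several packages: R 4.5.0 can download these simultaneously
# (when using the libcurl download method, the default on most platforms)
install.packages(
 c("dplyr", "ggplot2", "data.table"),
 Ncpus = 2
)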

C23

C23 is the current standard for the C language. Much of base R, and many
R packages, is written in C and must be compiled. If a C23 compiler is
available on your machine, R will now preferentially use it.

grepv()

For pattern matching in base R, grep() and related functions are the
main tools. By default, grep() returns the indices of the entries in a
vector that match a pattern.

penguins_raw$Comments |> grep(pattern = "Nest", x = _)
# [1] 7 8 29 30 39 40 69 70 121 122 131 132 139 140 163 164 193 194 199
# [20] 200 271 272 277 278 293 294 299 300 301 302 303 304 315 316 341 342

We have been able to extract the values of the input vector, rather than
the indices, by specifying value = TRUE in the arguments to grep():

penguins_raw$Comments |>
 grep(pattern = "Nest", x = _, value = TRUE)
# [1] "Nest never observed with full clutch."
# [2] "Nest never observed with full clutch."
# [3] "Nest never observed with full clutch."
# [4] "Nest never observed with full clutch."
# [5] "Nest never observed with full clutch."
# [6] "Nest never observed with full clutch. Not enough blood for isotopes."

Now, in R 4.5.0, a new function grepv() has been introduced, which
returns the matching values rather than their indices:

penguins_raw$Comments |>
 grepv(pattern = "Nest", x = _)
# [1] "Nest never observed with full clutch."
# [2] "Nest never observed with full clutch."
# [3] "Nest never observed with full clutch."
# [4] "Nest never observed with full clutch."
# [5] "Nest never observed with full clutch."
# [6] "Nest never observed with full clutch. Not enough blood for isotopes."

Contributions from R-Dev-Days

Many of the changes described in the “R News” for the new release came
about as contributions from “R Dev Days”. These are regular events that
aim to expand the number of people contributing code to the core of R.
In 2024, Jumping Rivers staff attended the events in London and Newcastle
(prior to “SatRDays” and “Shiny In Production”, respectively). Dev days
are often attached to a conference and provide an interesting challenge
for anyone interested in keeping R healthy and learning some new skills.

Trying out R 4.5.0

To take away the pain of installing the latest development version of R,
you can use Docker. To run the devel version of R, use the following
commands:

docker pull rstudio/r-base:devel-jammy
docker run --rm -it rstudio/r-base:devel-jammy

Once R 4.5 is the released version of R and the r-docker repository
has been updated, you can use the following commands to test out R 4.5:

docker pull rstudio/r-base:4.5-jammy
docker run --rm -it rstudio/r-base:4.5-jammy

An alternative way to install multiple versions of R on the same machine
is to use rig.

See also

The R 4.x versions have introduced a wealth of interesting changes.
These have been summarised in our earlier blog posts:

For updates and revisions to this article, see the original post





The Future of R: A Comprehensive Look at R 4.5.0 and Its Long-Term Implications

In April 2025 the latest version of R, R 4.5.0, was released, with
implications for the future development of the language. This
comprehensive follow-up discusses the long-term implications of the
changes, looks at likely future developments, and offers actionable
advice based on the key insights.

The Inclusion of the Palmer Penguins Dataset

The Palmer Penguins dataset, which was previously only available through
an external package, is now a core component of the base R installation.
This dataset, ideal for clustering and classification tasks, also offers
an excellent teaching tool for data cleaning via its associated
penguins_raw dataset.

This addition reinforces the importance of data diversity in R
programming, where information comes in all shapes and sizes. Moving
forward, we might see more comprehensive and diverse datasets included in
core R, providing more robust built-in tools for R users.

Optimal Code Importation with use()

The addition of the use() function in R 4.5.0 streamlines the process of
importing code from packages. This development brings R’s import
practices closer to those of other languages such as Python. It is worth
leveraging this new function for precise function loading and effective
namespace management.

The use() function allows us to look forward to an R development environment that focuses increasingly on user-friendly and efficient coding processes. Further advances in avoiding namespace conflicts and enhancing code readability are likely to emerge.

Bolstering Download Efficiency

The parallel downloading capability in the install.packages(),
download.packages(), and update.packages() functions substantially
reduces package installation time. Not only does this make R more
efficient, but it also brings base R in line with tools that already
employ parallel downloads, such as the {pak} package.

In the long term, this could prompt package developers to optimise their
packages for parallel downloading and installation. In addition, this
advancement may lead to further parallel processing capabilities in other
areas of R.

More Modern Compilation with C23

By preferring C23, the most recent standard for the C language, R is
moving towards more modern compilation practices. Moving forward, R users
can expect cleaner and more efficient compilation and smoother
integration with C-based code.

Pattern Matching Simplified with grepv()

The introduction of grepv() simplifies pattern matching in R by
returning the matching values rather than their indices. In future
releases, we might anticipate additional functions to aid pattern
matching and data extraction tasks.

Growth in R Contributions and Active Developer Community

The regular contributions from “R Dev Days” highlight the growing R
developer community. Continued support for, and contributions to, R’s
core from enthusiastic users mean the language can continue to advance
and grow to meet the evolving needs of the coding and data science
community.

Taken together, the developments in R 4.5.0 suggest a promising
direction: users can anticipate an R that is more efficient, more
user-friendly, firmly grounded in modern practices, and backed by a
growing and active developer community.

Getting Started with R 4.5.0

To install and experience the new features in R 4.5.0, using Docker is
recommended. Docker allows users to easily pull the latest R release and
run it, all from the command line.

Looking Back

The latest release is part of an ongoing series of R changes and
improvements. The improvements made in versions 4.0.0 to 4.4.0 have
already created a more powerful and flexible R environment.

The takeaway here is that R continues to thrive as a language of choice for data science thanks to the enthusiastic R community committed to constantly enhancing its capabilities.

Conclusion

The R 4.5.0 release is a testament to the continual improvements that
keep R a leading language for data science. From the inclusion of new
datasets to more modern compilation, the changes point towards a more
user-friendly, efficient, and powerful version of R. As an R user, it is
worth staying abreast of these advancements, taking advantage of the new
features, and contributing your own ideas for the community’s benefit.
