[This article was first published on R-posts.com, and kindly contributed to R-bloggers].
Join our workshop on Using LLMs with ellmer, which is part of our workshops for Ukraine series!
Here’s some more info:
Title: Using LLMs with ellmer
Date: Friday, June 13th, 18:00 – 20:00 CEST (Rome, Berlin, Paris timezone)
Speaker: Hadley Wickham is Chief Scientist at Posit PBC, winner of the 2019 COPSS award, and a member of the R Foundation. He builds tools (both computational and cognitive) to make data science easier, faster, and more fun. His work includes packages for data science (like the tidyverse, which includes ggplot2, dplyr, and tidyr) and principled software development (e.g. roxygen2, testthat, and pkgdown). He is also a writer, educator, and speaker promoting the use of R for data science. Learn more on his website, <http://hadley.nz>.
Description: Join us for an engaging, hands-on hackathon workshop where you’ll learn to use large language models (LLMs) from R with the ellmer (https://ellmer.tidyverse.org) package. In this 2-hour session, we’ll combine theory with practical exercises to help you create AI-driven solutions. No extensive preparation needed!
## What you’ll learn:
– A quick intro to LLMs: what they’re good at and where they struggle
– How to use ellmer with different model providers (OpenAI, Anthropic, Google Gemini, and others)
– Effective prompt design strategies and practical applications for your work
– Function calling: how to let LLMs use R functions for tasks they can’t handle well
– Extracting structured data from text, images, and video using LLMs
## What you’ll need:
– A laptop with R installed
– The development version of ellmer (`pak::pak("tidyverse/ellmer")`)
– An account with either Claude (cheap) or Google Gemini (free).
Minimal registration fee: 20 euro (or 20 USD or 800 UAH)
Please note that the registration confirmation is sent to all registered participants 1 day before the workshop, rather than immediately after registration.
Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)
Fill in the registration form, attaching a screenshot of a donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after donation).
If you are not personally interested in attending, you can also contribute by sponsoring a student's participation, so that the student can attend for free. If you choose to sponsor a student, all proceeds will also go directly to organisations working in Ukraine. You can either sponsor a particular student or leave it up to us to allocate the sponsored place to a student who has signed up for the waiting list.
Save your donation receipt (after the donation is processed, there is an option to enter your email address on the website to which the donation receipt is sent)
Fill in the sponsorship form, attaching the screenshot of the donation receipt (please attach the screenshot of the donation receipt that was emailed to you rather than the page you see after the donation). You can indicate whether you want to sponsor a particular student or we can allocate this spot ourselves to the students from the waiting list. You can also indicate whether you prefer us to prioritize students from developing countries when assigning place(s) that you sponsored.
If you are a university student and cannot afford the registration fee, you can also sign up for the waiting list here. (Note that signing up for the waiting list does not guarantee participation.)
You can also find more information about this workshop series, a schedule of our future workshops, and a list of our past workshops (with recordings and materials) here.
Looking forward to seeing you during the workshop!
Analysis: The Future of LLMs with ellmer Workshops
In the ever-evolving field of data science, continuous learning and keeping up-to-date with the latest technologies and methodologies are of utmost importance. A recent announcement on R-bloggers.com discussed a fast-approaching online workshop on ‘Using LLMs with ellmer’ which undoubtedly caught the attention of many data science enthusiasts.
Implications and Future Developments
Large Language Models (LLMs), as introduced in this workshop, are a critical component in the realm of AI, capable of understanding and generating human-like text. Notably, the ellmer package enables these advanced AI capabilities to be integrated into the R environment. Ensuring that data scientists are adept in such tools has long-term implications for the speed, efficiency, and novel applications in data science.
Hadley Wickham, the speaker for this session, is a distinguished data scientist and prolific contributor to R packages, so future workshops held by him or by speakers of a similar calibre would be highly beneficial for learners. Increased demand for these workshops could plausibly make them a regular occurrence, facilitating upskilling in the R community.
In the future, we might see an expansion of topics, covering more R packages and advanced AI techniques. Furthermore, the flexible approach today’s workshop adopted towards payment (acceptable in different currencies and also by sponsoring a student) combined with its charitable cause, paints an encouraging picture of an inclusive learning community that values diversity and social responsibility. This could lead to increased accessibility in the future, as more and more professionals and students benefit from these affordable (or sponsored) learning opportunities.
Actionable Advice
Stay Informed: Regularly check R-bloggers and similar resources for updates about forthcoming workshops and apply promptly. Remember that registration confirmations are sent out a day before the workshop.
Prepare Adequately: Ensuring that the necessary prerequisites are met before the workshop (such as having R installed and setting up the ellmer package) allows for a more effective learning experience.
Be Charitable: If able, consider sponsoring a student. This not only supports the learning of individuals unable to afford the fee, but also contributes to organisations working in Ukraine, which receive all proceeds.
Take Part: Even if one is not an R user, such workshops, often held by industry experts, offer valuable insights which could be applied to data science work in general.
By utilizing such actionable advice, not only can individuals further their personal knowledge and skills, but the broader R, data science, and AI communities can continue to grow and evolve positively.
A few years ago, the R community started using ORCID (“Open Researcher and Contributor ID”) to persistently and uniquely identify individual authors of packages in DESCRIPTION.
The idea is the following: you enter authors’ ORCID as a specially named comment in their person() object.
For instance I can be represented by:
Although anyone could use your ORCID, maliciously or inadvertently [1], you definitely benefit from using your ORCID in your work.
In the case of R packages, CRAN pages and pkgdown websites feature a pretty icon linking to your ORCID profile that in turn can link to your favorite online presence.
Recognition! Personal branding!
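As a minimal sketch of the pattern described above, the snippet below builds a `person()` entry with an ORCID comment. The name, email and iD are placeholders (the iD is the fictitious example record that ORCID uses in its own documentation), not a real package author.

```r
# Placeholder author: name, email and ORCID iD are illustrative only.
author <- person(
  given = "Josiah",
  family = "Carberry",
  email = "josiah@example.org",
  role = c("aut", "cre"),
  comment = c(ORCID = "0000-0002-1825-0097")
)

# The ORCID is stored as a named element of the comment field.
names(author$comment)

# format() renders the entry as it would appear in author listings.
format(author)
```

In a real DESCRIPTION, this call would sit inside the `Authors@R` field.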
This year, the exact same idea was applied to organizations using ROR (“Research Organization Registry”) IDs.
Any organization, be it a research organization, an initiative or a company, can request to be listed in the registry.
A few months ago, it became possible to list ROR IDs in DESCRIPTION, which a few dozen CRAN packages currently do –
although this is still far from the thousands of CRAN packages adopting ORCIDs.
Thanks to R Core for adding the feature [2] and to Achim Zeileis for spreading the news.
A package maintainer might need to list organizations in DESCRIPTION: for instance a company that owns the copyright to the package (“cph” role), or an entity that funded work on the software (“fnd” role).
Adding the organization’s ROR ID to its person() object identifies it even more clearly.
As an illustration, rOpenSci can be represented by:
person("rOpenSci", role = "fnd",
       comment = c("https://ropensci.org/", ROR = "019jywm96"))
The ROR icon, although less striking than the bright green ORCID icon, appears on the CRAN page of the package and links to the organization’s ROR page, which in turn can link to the organization’s website.
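To see how such an entry sits in a DESCRIPTION file, here is a minimal sketch using only base R. The package name `examplepkg`, the author "Jane Doe" and the temporary file are hypothetical. Only the rOpenSci entry comes from the example above.

```r
# Hypothetical DESCRIPTION content; only the Authors@R field matters here.
desc_lines <- c(
  "Package: examplepkg",
  "Title: An Example Package",
  "Version: 0.0.1",
  "Authors@R: c(",
  "    person(\"Jane\", \"Doe\", role = c(\"aut\", \"cre\"),",
  "           email = \"jane@example.org\"),",
  "    person(\"rOpenSci\", role = \"fnd\",",
  "           comment = c(\"https://ropensci.org/\", ROR = \"019jywm96\"))",
  "  )"
)
path <- tempfile("DESCRIPTION")
writeLines(desc_lines, path)

# read.dcf() returns a character matrix; Authors@R is R code stored as text.
fields <- read.dcf(path, fields = "Authors@R")
authors <- eval(parse(text = fields[1, "Authors@R"]))

# Flag the entries whose comment carries an element named "ROR".
has_ror <- vapply(
  seq_along(authors),
  function(i) "ROR" %in% names(authors[i]$comment),
  logical(1)
)
funders <- authors[has_ror]
format(funders)
```

Because the ROR ID is just a named element of `comment`, the same filter works for `ORCID` by swapping the name.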
In 2018 we reported on tooling for using ORCID.
This year, we’d like to explain the tooling for including ROR IDs.
ROR support in the {devtools} ecosystem
Once ROR IDs were supported by base R, a next technical step was for them to be supported by Posit’s “devtools ecosystem” too.
Even if devtools is not strictly necessary for developing packages, many package developers, including some in the rOpenSci community, do use devtools.
The code supporting ROR in desc, roxygen2 and pkgdown follows the code supporting ORCID in those packages.
It is very fortunate that ORCID support was added before ROR because “orcid” is a better string to search for than “ror” that comes up in, say, “error”.
ROR IDs support in {desc}
The desc package, maintained by Gábor Csárdi, helps you manipulate DESCRIPTION files programmatically.
In its current development version, all functions handling authors (adding, searching or complementing entries) now feature a ror argument.
Furthermore, a new function, desc_add_ror(), was created.
For instance you can add a ROR ID to an author entry:
desc::desc_add_ror("019jywm96", given = "rOpenSci")
You can add an author entry including its ROR ID:
desc::desc_add_author(given = "rOpenSci", ror = "019jywm96", role = "fnd")
These functions can be handy to update a bunch of packages at once.
Even if packages are updated one by one, it is shorter to share and apply the instructions as a code snippet.
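A batch update along those lines might look like the sketch below. The helper `add_ror_to_packages()` is hypothetical, and the `file` argument of `desc::desc_add_ror()` is an assumption, mirroring the other `desc_add_*()` helpers. Check the documentation of the desc version you have installed before relying on it.

```r
# Hypothetical helper: add a ROR ID to the same author entry in several packages.
# `add_ror` defaults to desc::desc_add_ror(); its `file` argument is assumed
# to behave like the `file` argument of desc's other desc_add_*() functions.
add_ror_to_packages <- function(pkg_dirs, ror, given,
                                add_ror = desc::desc_add_ror) {
  desc_files <- file.path(pkg_dirs, "DESCRIPTION")
  found <- file.exists(desc_files)

  # Skip directories without a DESCRIPTION instead of failing halfway through.
  for (desc_file in desc_files[found]) {
    add_ror(ror, given = given, file = desc_file)
  }

  # Return the packages that were touched, invisibly.
  invisible(pkg_dirs[found])
}
```

With a development version of desc installed, a call such as `add_ror_to_packages(c("pkgA", "pkgB"), "019jywm96", given = "rOpenSci")` would then update each package in turn (the directory names are placeholders).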
ROR support in {roxygen2}
The roxygen2 package, maintained by Hadley Wickham, generates your package’s NAMESPACE and manual pages using specially formatted comments.
Among those manual pages, your package might (and should, according to our dev guide) contain a package-level one.
You can create such a page using usethis::use_package_doc().
This creates R/package-name-package.R, for instance R/usethis-package.R, which holds the package-level documentation.
ROR support in {pkgdown}
The pkgdown package, maintained by Hadley Wickham, creates a documentation website for your package based on its metadata and documentation.
Since its 2.1.2 version, ROR IDs in DESCRIPTION are transformed into icons, similar to ORCID IDs.
The sidebar of tinkr’s website includes a ROR icon next to rOpenSci’s name.
Support for ROR icons?
As of today, ROR icons like those on the CRAN pages, pkgdown websites and our website’s footer come from files. We have, however, opened an icon request for ROR in the Font Awesome repository, which you can upvote with a thumbs up. This strategy worked for ORCID. There’s already a ROR icon in the more specialized academicons library.
Conclusion: go forth, register and use ROR IDs!
In this tech note, we explained what ROR IDs are: persistent IDs for organizations.
They are to organizations what ORCIDs are to individuals.
We’ve shown ROR IDs are supported in the base R and devtools ecosystems.
ROR IDs can help identify more clearly an entity you list in your package’s DESCRIPTION because it, say, funded the work or owns the copyrights to it.
We encourage you to register your organization with the Research Organization Registry and to use the resulting ID in your package’s DESCRIPTION.
Such a task could be tackled during a package spring cleaning.
[1] Don’t we all resort to copy-pasting formatting from others’ metadata files?
[2] Currently, packages on CRAN with a ROR ID in DESCRIPTION get a NOTE in CRAN checks, which can be ignored.
The R community uses persistent identifiers such as ORCID (“Open Researcher and Contributor ID”) and ROR (“Research Organization Registry”) to uniquely identify the individual authors and organizations involved in creating R packages. These identifiers offer recognition and personal or organizational branding, and they can link to online profiles or websites.
Long-Term Implications
If used consistently and appropriately, ORCID and ROR IDs can greatly support the open science movement by ensuring clear attributions of contributions to scientific packages and results. This can foster transparency and collaboration within the scientific community, stimulating research and development. In the future, these IDs could become a standard tool for recognizing the work of researchers and organizations involved in the creation of scientific packages. It could also enhance the mobility and recognition of individual contributors across multiple projects.
Possible Future Developments
We may witness an expansion of these unique identifiers in other areas of open-source development, reaching beyond the scientific community. As these identifiers grow in popularity, they could integrate with other digital tools used by researchers, such as digital repositories, lab notebooks and bibliographic management tools. This would allow for a seamless tracking and crediting of research contributions, while also promoting open science practices.
Actionable Advice
If you’re part of the R community or engaged with open source development, consider adopting ORCID and ROR IDs. Registering your organization with the Research Organization Registry and using these IDs consistently can enhance visibility and recognition for your work. Also, take advantage of the tooling available for including ROR IDs, such as the desc, roxygen2 and pkgdown packages of the devtools ecosystem.
If you’re already using these identifiers, explore further how you can integrate them with other tools and platforms you use. And lastly, contribute to further enhancement of the system by submitting and voting for icon requests.
[This article was first published on pharmaverse blog, and kindly contributed to R-bloggers].
Hello pharmaverse community!
I’m thrilled to announce that Ashley Tarasiewicz will be taking over the Atorus seat from me on the Pharmaverse council.
Ashley has been a contributor to the pharmaverse since the early days of its inception, contributing to packages such as {Tplyr}, {pharmaRTF}, and {metacore}. One of Ashley’s biggest strengths is being a voice of the user, calling on her history working through clinical reporting and submissions. In the time since Ashley joined Atorus, she built Atorus Academy to help with the up-skilling of SAS programmers to learn R. Ashley has now transitioned to product owner of OpenVal, a validated R distribution.
It was a no-brainer for me to have Ashley take over my council seat. For me personally, Ashley is an individual that I’ve leaned on since the very beginning of my career. As a sounding board and trusted confidant, having her at Atorus has been critical to our success. As such, I’m confident that her voice in the pharmaverse council will continue to help steer the community and drive progress for the industry overall.
In recent developments, Ashley Tarasiewicz is taking over the Atorus seat on the pharmaverse council. This comes after several years of contributions to pharmaverse packages such as Tplyr, pharmaRTF and metacore. With Ashley now product owner of OpenVal, there is much to anticipate regarding her influence on the council. This article evaluates the potential implications and future developments that may arise from this change.
Implications and Future Developments
Enhancements in Clinical Reporting and Submission
Given Ashley’s background in clinical reporting and submissions, one of the potential long-term implications is that Atorus may now streamline its reporting, analysis and submission process. This would lead to efficient data management and potentially faster drug delivery to the market.
Professional Development Opportunities
With the establishment of Atorus Academy by Ashley, a platform for the up-skilling of SAS programmers has been created. In the long run, this could lead to the up-skilling of industry professionals to learn R, consequently pushing the boundaries of industry knowledge and technical ability.
Product Expansion
Ashley’s new role as the product owner of OpenVal might be a sign of potential new innovative products or improvements to the existing ones. OpenVal, as a validated R distribution, may be further developed under Ashley’s influence to include more comprehensive statistical and graphical techniques for improved data analyses.
Actionable Recommendations
Professional Growth: With the opportunity provided by Atorus Academy to learn R, industry professionals are encouraged to take this opportunity to enhance their skills. This will not only provide an individual advantage but also elevate the standard of the industry.
Market Watch: Potential competitors and investors should monitor the developments at Atorus under Ashley’s watch. With her track record, innovative new products can be expected.
Industry Collaboration: Considering Ashley’s voice in the pharmaverse council, there should be more engagement with her views by other council members and the industry. This would involve keen tracking of her propositions and developing suitable policies and responses.
Conclusion
In conclusion, Ashley Tarasiewicz’s assumption of Atorus’ pharmaverse council seat is a significant development that holds potential for positive change in the pharmaceutical programming landscape. Professionals, competitors, investors and market watchers should prepare for initiatives that move the industry forward.
As technology continues to evolve and our dependence on digital infrastructure grows, data centres have become a critical component of our modern society. However, the massive amounts of electricity required to power and cool these facilities have raised concerns about their environmental impact. According to a recent report, data centres accounted for approximately 1.5% of global electricity consumption in 2024 (Nature). This staggering statistic highlights the urgent need for the industry to explore more sustainable practices and embrace future trends that can help mitigate their carbon footprint.
The Rise of Renewable Energy
One clear trend that we can expect to see in the future is the increased adoption of renewable energy sources to power data centres. As the urgency to combat climate change grows, governments and businesses are recognizing the importance of transitioning to clean energy alternatives. Renewable energy technologies such as solar, wind, and hydroelectric power offer a reliable and sustainable solution for data centres.
In recent years, we have already witnessed prominent tech companies and data centre operators investing in renewable energy projects. For instance, Google has pledged to reach 100% renewable energy for its global operations, including data centres (Google Sustainability). This commitment not only helps reduce their environmental impact but also provides an opportunity for the industry to lead by example and inspire others to follow suit.
Improving Energy Efficiency
Another crucial aspect of future trends for data centres is the continuous pursuit of energy efficiency. With power consumption being a significant contributor to their environmental footprint, data centre operators are investing in innovative solutions to optimize energy usage and reduce waste.
Advanced cooling technologies, such as liquid cooling, are gaining traction as they can significantly improve energy efficiency compared to traditional air-cooling methods. Furthermore, implementing intelligent software systems and artificial intelligence algorithms can help optimize workload distribution and resource utilization, ultimately reducing overall energy consumption.
The Advent of Edge Computing
Edge computing is poised to revolutionize the data centre industry by bringing computation closer to the source of data generation. Instead of transmitting vast amounts of data to centralized data centres, edge computing allows for processing and storage to occur directly on the devices or at the edge of the network, reducing the need for extensive data infrastructure.
This trend has the potential to lower the overall energy requirements of data centres, as fewer resources will be needed for long-distance data transmission. As the Internet of Things (IoT) continues to expand, edge computing can play a vital role in managing and processing the massive volumes of data generated by billions of connected devices.
Recommendations for the Industry
Invest in renewable energy: Data centre operators should prioritize the adoption of renewable energy sources to power their facilities. Collaborating with energy providers, governments, and clean energy advocates can help accelerate the transition to sustainable energy.
Implement energy-efficient practices: By investing in advanced cooling technologies and optimizing resource utilization, data centres can significantly improve their energy efficiency. This includes exploring innovative solutions such as liquid cooling and leveraging artificial intelligence for workload management.
Embrace edge computing: As the industry moves towards edge computing, data centre operators should adapt their infrastructure to support this trend. This involves developing edge data centres and investing in robust network infrastructure at the edge to facilitate efficient data processing and storage.
Educate and raise awareness: It is crucial for the industry to actively educate the public and stakeholders about the environmental impact of data centres and the steps being taken to mitigate it. Spreading awareness and promoting sustainable practices can inspire change and encourage others to follow suit.
In conclusion, the future trends for data centres revolve around sustainability and efficiency. The adoption of renewable energy, continuous improvements in energy efficiency, and the rise of edge computing are key factors that will shape the industry in the coming years. By embracing these trends and implementing the recommended practices, the data centre industry can pave the way towards a more sustainable and environmentally conscious future.
Our next two activities, Coworking Mini-Hackathons for First-Time Contributors, will take place February 4th 2025 1-3 UTC and March 4th 2025 13-15 UTC (see below for details), but first, let’s review what we learned from this Community Call.
Community call
Our three panellists each shared different experiences and perspectives on making contributions to open source software.
Sunny and Pascal shared their experiences with getting involved, Pascal and Yaoxiang shared technical tips for git and testing, and all three offered advice for first time contributors.
Sunny focused on her journey making her first R package, bbsTaiwan, as part of the rOpenSci Champions Program, and Pascal shared his experiences as a first-time contributor to the babelquarto package after being a long-time solo user of git.
Then Yaoxiang rounded out our call with advice for first-time contributors on the importance of including tests and how to deal with different testing situations, referring to his experience with medrxivr.
Sunny recommended that you have a plan for your contributions, but remain flexible as things change or don’t proceed as you may have expected.
Both Sunny and Pascal pointed out that they found git to be less scary than they expected once they got started, and that they learned so much while collaborating with others.
Among other technical suggestions, Pascal and Yaoxiang both commented that starting small and using good descriptions can be really helpful, whether for git commit messages or code tests.
Mini-hackathons
Hopefully this community call has inspired you to get involved in open source software.
If you’re curious about contributing to Open Source Software, and would like some support to get started, our coworking mini-hackathons are for you!
During these sessions you’ll join others making contributions to R packages while package maintainers and other mentors are available live to answer questions and give guidance.
We’ll also have a special Slack channel ready as a place for asynchronous questions during the event and in the week following.
These collaborative events are designed to help first-time contributors get started with open-source projects.
Whether you’re improving documentation, reviewing translations, fixing bugs, or adding new features, our mentors will guide you every step of the way.
No prior experience required. Non-first time contributors are very welcome too—just bring your curiosity and enthusiasm!
The Future of Open Source Contributions: Insights from the FOSS Community Call
In a recent community call themed “From Novice to Contributor: Making and Supporting First-Time Contributions to FOSS,” panellists Sunny Tseng, Pascal Burkhard, and Yaoxiang Li shared their first-hand experiences and advice for novice contributors. The session, moderated by Hugo Gruson, opened a series of activities to support first-time contributors to Open Source Software and offers valuable insights on the future of open source contributions.
Key Takeaways and Future Implications
During the call, the panellists each shared unique experiences and perspectives, extending valuable advice to newcomers in the field. The first-time contributors offered their candid insights that could essentially drive the future development of the open-source ecosystem.
Focus on Improving Skills
Sunny discussed the importance of planning while also staying flexible, as things may not always go as expected. She drew on what she learned while creating her first R package, an experience that reflects the culture of continuous learning in open source.
Collaboration and User-friendly Tools
Pascal’s experiences underlined the importance of collaboration in open source projects, suggesting that the future of open source contributions could involve more collaborative efforts. He stressed the benefit of user-friendly tools like git, which he found to be less intimidating than anticipated and incredibly helpful for collaborations.
Technical Expertise and Descriptive Communication
Yaoxiang advocated including tests and explained how to deal with different testing situations, using his experience with medrxivr as an example. His advice hints at the importance of technical skill in successful open source contributions. Both Pascal and Yaoxiang also emphasized the value of detailed and descriptive communication, whether for commit messages or code tests. This could influence the culture of open communication and diligence in open source contributions.
Long-Term Implications and Future Developments
The trends mentioned above suggest that the open-source community continues to evolve towards inclusivity, collaboration, technical mastery, and transparent communication. The field will likely become more welcoming for first-time contributors, continuously facilitate skill growth, encourage collaboration, and promote diligent, descriptive communication.
Actionable Advice for Future Contributors
Plan your contributions but remain adaptable as situations may change.
Take advantage of user-friendly tools like git for collaboration.
Utilize descriptive communication in your contributions, whether for commit messages or code tests.
Do not overlook the significance of technical skills, especially for conducting tests.
Looking Forward: Coworking Mini-Hackathons for First-Time Contributors
rOpenSci is hosting Coworking Mini-Hackathons for First-Time Contributors in February and March of 2025. These events are a great opportunity for novice contributors to learn and explore the world of open source. They can get hands-on experience, benefit from live mentors, and connect with a supportive community. The move towards such inclusive events further highlights the future development of the open-source world – that of embracing first-time contributors and providing them with the necessary support.
Final Thoughts
The open-source community is heading towards a more inclusive and collaborative future. The insights shared by Sunny, Pascal, and Yaoxiang are not only inspiring for novices but also indicate the direction in which open-source contributions are moving. By offering ample support to first-time contributors, we can foster a richer and more diverse community, driving innovation and technological advancements.