rOpenSci Monthly News: Leadership Changes, Multilingual Dev Guide, New Packages, and More

[This article was first published on rOpenSci – open tools for open science, and kindly contributed to R-bloggers]. (You can report issues about the content on this page here.)


Dear rOpenSci friends, it’s time for our monthly news roundup!

You can read this post on our blog.
Now let’s dive into the activity at and around rOpenSci!

rOpenSci HQ

Leadership changes at rOpenSci

After 13 years at the helm of rOpenSci, our founding executive director Karthik Ram is stepping down.
Noam Ross, rOpenSci’s current lead for peer review, will be our new Executive Director.
Karthik will remain a key advisor to rOpenSci.
We thank him for his years of leadership and service to the community!

Read Karthik’s farewell post and Noam’s post about his new role on our blog.

rOpenSci Dev Guide 0.9.0: Multilingual Now! And Better

We’re delighted to announce we’ve released a new version of our guide,
“rOpenSci Packages: Development, Maintenance, and Peer Review”!

A highlight is that our guide is now bilingual (English and Spanish), thanks to work by Yanina Bellini Saibene, Elio Campitelli and Pao Corrales, and thanks to the support of the Chan Zuckerberg Initiative, NumFOCUS, and the R Consortium.
Read the guide in Spanish.

Our guide is also being translated into Portuguese thanks to volunteers.
We are very grateful for their work!

Read more in the blog post about the release.
Thanks to all contributors who made this release possible.

Interview with Code for Thought podcast

Our community manager, Yanina Bellini Saibene, talked with Peter Schmidt of the Code for Thought podcast about the importance of making computing materials accessible to non-English-speaking learners.
Listen to the episode.
Find out more about rOpenSci's multilingual publishing project.

Coworking

Read all about coworking!

Join us for social coworking & office hours monthly on first Tuesdays!
Hosted by Steffi LaZerte and various community hosts.
Everyone welcome.
No RSVP needed.
Consult our Events page to find your local time and how to join.

And remember, you can always cowork independently on work related to R, work on packages that tend to be neglected, or work on whatever you need to get done!

Software 📦

New packages

The following three packages recently became a part of our software suite, or were recently reviewed again:

  • nuts, developed by Moritz Hennicke together with Werner Krause: Motivated by changing administrative boundaries over time, the nuts package can convert European regional data with NUTS codes between versions (2006, 2010, 2013, 2016 and 2021) and levels (NUTS 1, NUTS 2 and NUTS 3). The package uses spatial interpolation as in Lam (1983) doi:10.1559/152304083783914958 based on granular (100m x 100m) area, population and land use data provided by the European Commission’s Joint Research Center. It is available on CRAN. It has been reviewed by Pueyo-Ros Josep and Le Meur Nolwenn.

  • quadkeyr, developed by Florencia D’Andrea together with Pilar Fernandez: Quadkeyr functions generate raster images based on QuadKey-identified data, facilitating efficient integration of Tile Maps data into R workflows. In particular, Quadkeyr provides support to process and analyze Facebook mobility datasets within the R environment. It has been reviewed by Maria Paula Caldas and Vincent van Hees.

  • weatherOz, developed by Rodrigo Pires together with Anna Hepworth, Rebecca O’Leary, Jonathan Carroll, James Goldie, Dean Marchiori, Paul Melloy, Mark Padgham, Hugh Parsonage, Keith Pembleton, and Adam H. Sparks: Provides automated downloading, parsing and formatting of weather data for Australia through API endpoints provided by the Department of Primary Industries and Regional Development (DPIRD) of Western Australia and by the Science and Technology Division of the Queensland Government's Department of Environment and Science (DES). It also retrieves the Bureau of Meteorology (BOM) précis and coastal forecasts and agriculture bulletin data, and downloads and imports radar and satellite imagery files. It has been reviewed by Laurens Geffert and Sam Rogers.

Discover more packages, read more about Software Peer Review.

New versions

The following nineteen packages have had an update since the last newsletter: frictionless (v1.0.3), aRxiv (0.10), cffr (v1.0.0), chromer (v0.8), drake (7.13.9), GSODR (v4.0.0), lightr (v1.7.1), lingtypology (v1.1.17), magick (2.8.3), melt (v1.11.2), nodbi (v0.10.4), nuts (v1.0.0), paleobioDB (v1.0.0), quadkeyr (v0.1.0), rtweet (v2.0.0), ruODK (v1.4.2), spocc (v1.2.3), tarchetypes (0.8.0), and targets (1.6.0).

Software Peer Review

There are thirteen recently closed and active submissions and six submissions on hold, with issues at different stages.

Find out more about Software Peer Review and how to get involved.

On the blog

Software Review

Tech Notes

Calls for contributions

Calls for maintainers

If you’re interested in maintaining any of the R packages below, you might enjoy reading our blog post What Does It Mean to Maintain a Package?.

Calls for contributions

Also refer to our help wanted page – before opening a PR, we recommend asking in the issue whether help is still needed.

Package development corner

Some useful tips for R package developers. 👀

Reminder: R Consortium Infrastructure Steering Committee (ISC) Grant Program Accepting Proposals until April 1st!

The R Consortium Call for Proposals might be a relevant funding opportunity for your package!
Find out more in their post.
If you can’t prepare your proposal in time, the next call will start September 1st.

@examplesIf for conditional examples in package manuals

Did you know you can make some examples in your package manual conditional on, say, the session being interactive?
The @examplesIf roxygen2 tag is really handy.
What’s more, inside the examples of a single manual page, you can seamlessly mix and match @examples and @examplesIf pieces.
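For instance, a roxygen2 block might mix the two tags like this (a sketch; the function name and URL are placeholders, not from any real package):

```r
#' Open the rOpenSci website
#'
#' @examples
#' # Runs everywhere, including on CRAN
#' paste("Docs live at", "https://ropensci.org")
#' @examplesIf interactive()
#' # Runs only when the user's session is interactive
#' utils::browseURL("https://ropensci.org")
#' @export
open_ropensci <- function() invisible(utils::browseURL("https://ropensci.org"))
```

When the manual page is rendered, the `@examplesIf` chunk is wrapped in a condition, so `R CMD check` on CRAN skips it while interactive users still see a runnable example.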

‘argument “..2” is missing, with no default’

Mike Mahoney posted an important PSA on Mastodon:

if you’re getting a new error message ‘argument “..2” is missing, with no default’ on #rstats 4.3.3, it’s likely because you have a trailing comma in a call to glue::glue()
seeing this pop up in a few Slacks so figured I’d share
https://github.com/tidyverse/glue/issues/320

Thanks, Mike!
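Under the hood, the trailing comma creates an empty (missing) argument that gets passed into `...`; the same error can be reproduced in base R without glue at all:

```r
# A function that forces evaluation of the second element of its dots
f <- function(...) ..2

f("a", "b")
#> [1] "b"

# A trailing comma passes a *missing* second argument into the dots,
# and evaluating ..2 then fails with exactly the message from the PSA:
msg <- tryCatch(f("a", ), error = conditionMessage)
msg
#> [1] "argument \"..2\" is missing, with no default"
```

glue iterates over and evaluates its `...` arguments, which is how a stray trailing comma in `glue::glue()` surfaces this message.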

Useful hack: a CRAN-specific .Rbuildignore

The .Rbuildignore file lists files that should not be included when building your package, such as your pkgdown configuration file.
Trevor L. Davis posted a neat idea on Mastodon: using a CRAN-specific .Rbuildignore, so that CRAN submissions omit some tests and vignettes to keep the package under the size limit.

Regarding tests themselves, remember you can skip some or all on CRAN (but make sure you’re running them on continuous integration!).
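testthat's `skip_on_cran()` is driven by the `NOT_CRAN` environment variable, which devtools and most CI setups conventionally set to "true". A minimal base-R sketch of that convention (the helper names are hypothetical):

```r
# TRUE unless NOT_CRAN is set to "true", mirroring testthat::skip_on_cran()
on_cran <- function() {
  !identical(tolower(Sys.getenv("NOT_CRAN")), "true")
}

run_network_tests <- function() {
  if (on_cran()) {
    # Keep CRAN checks fast and offline; CI runs the full suite instead
    return(invisible("skipped on CRAN"))
  }
  # ... long-running download/API tests would go here ...
  invisible("ran full suite")
}
```

On CI you would export `NOT_CRAN=true` so the guarded tests actually run there.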

Key advantages of using the keyring package

If your package needs users to provide secrets, such as API tokens, you might be interested in wrapping or recommending the keyring package (maintained by Gábor Csárdi), which accesses the system credential store from R.
See this recent R-hub blog post.
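A common pattern is to check an environment variable first (handy on CI, where no keychain exists) and fall back to the OS credential store. The helper, service, and variable names below are hypothetical:

```r
get_api_token <- function(service = "myapi", user = "me") {
  # Prefer an environment variable, e.g. on CI runners
  token <- Sys.getenv("MYAPI_TOKEN", unset = "")
  if (nzchar(token)) {
    return(token)
  }
  # Fall back to the OS credential store via keyring, if installed; a token
  # would have been stored once with keyring::key_set("myapi", username = "me")
  if (requireNamespace("keyring", quietly = TRUE)) {
    return(keyring::key_get(service, username = user))
  }
  stop("No token found: set MYAPI_TOKEN or store one with keyring::key_set()")
}
```

This keeps the secret out of scripts and .Rprofile files while still working in headless environments.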

A package for linting roxygen2 documentation

The compelling roxylint package by Doug Kelkhoff allows you to check some aspects of your roxygen2 docs, such as the use of full stops and sentence case.
See the list of current rules.

Last words

Thanks for reading! If you want to get involved with rOpenSci, check out our Contributing Guide that can help direct you to the right place, whether you want to make code contributions, non-code contributions, or contribute in other ways like sharing use cases.
You can also support our work through donations.

If you haven’t subscribed to our newsletter yet, you can do so via a form. Until it’s time for our next newsletter, you can keep in touch with us via our website and Mastodon account.

Continue reading: rOpenSci News Digest, March 2024

Long-term implications and possible future developments

With numerous intriguing updates and developments mentioned in the rOpenSci news article, here are several long-term implications and possible future directions.

Leadership Changes at rOpenSci

The change in leadership from Karthik Ram to Noam Ross, both integral figures at rOpenSci, is likely to bring a shift in the organization’s approach and direction. As Noam takes the helm, the strategic roadmap for rOpenSci may change and the organization’s priorities may evolve, leading to new initiatives and modifications to existing practices.

Enhanced Guide in Multiple Languages

The fact that rOpenSci’s guide is now bilingual (English and Spanish) has the potential to dramatically expand the organization’s reach to non-English-speaking audiences. The ongoing translation into Portuguese suggests a broader aim of making rOpenSci accessible to as many users worldwide as possible, and implies that more language versions may be developed in the future.

Coworking and Community Building

rOpenSci’s coworking initiative helps foster a sense of community, where users can collaborate, learn from one another, and help improve and maintain various R packages. It can enhance creativity and productivity and encourage knowledge exchange, fostering a more robust R user base.

New Packages

The inclusion of new packages like ‘nuts’, ‘quadkeyr’, and ‘weatherOz’ demonstrates the growth and adaptability of the open-source software that rOpenSci provides. This makes rOpenSci a more versatile and valuable platform for open science, particularly for researchers working with European regional data, QuadKey-identified data, and Australian weather data, respectively.

Actionable advice based on these insights

If you are an existing member of rOpenSci, explore, in light of the leadership change, any new strategic directions that Noam Ross plans to implement, and find out how you can align your contributions with those plans. For all users, the availability of the guide in different languages lowers the barriers to using rOpenSci’s resources, so take this opportunity to deepen your understanding.

Engage in rOpenSci’s coworking sessions, which offer an opportunity to learn from and connect with other users across the globe. Explore the newly added packages and check whether any could be beneficial for your research or contributions. Lastly, consider contributing to rOpenSci yourself, whether through code or non-code contributions; proactive participation will enhance your skills and deepen your understanding of open science.

Read the original article

“Jumpstart Your MLOps Journey with Free GitHub Resources”

Begin your MLOps journey with these comprehensive free resources available on GitHub.

Embarking on Your MLOps Journey with Comprehensive Free Resources on GitHub

It’s no secret that Machine Learning Operations (MLOps) is rapidly becoming a significant necessity in the world of technology and business. With the increasing relevance of data-driven decision making, integrating machine learning (ML) systems into business systems has become a cornerstone of modern business strategy. Thankfully, numerous comprehensive and free resources are available on GitHub to make your start in MLOps smoother and more effective.

Long-term implications and future developments in MLOps

Machine Learning Operations, or MLOps, aims to bridge the gap between the development of ML models and their operation in production systems. With businesses relying more on machine learning models for data analysis and decision making, the need for a framework to manage these models becomes crucial. The long-term implications of MLOps are far-reaching and exciting.

MLOps is set to become an integral part of business strategy in more industries. We anticipate a future where businesses across sectors will rely on MLOps for the functional and efficient operation of their ML systems in production environments. This suggests a potential for an exponential rise in the demand for MLOps skills and resources.

The democratization of machine learning through MLOps opens the door to a future where ML models are as ubiquitous as software applications are today. In that future, businesses incorporating ML models into their operations will be as commonplace as businesses having a website.

Actionable Advice Based on the Anticipated MLOps Future Developments

Leverage the available resources

With an unprecedented array of free resources available on GitHub for kick-starting your journey into MLOps, the first piece of advice is to take advantage of these resources. They present beginners with an invaluable opportunity to understand the terrain before diving in fully. Experiment with different models, understand the best practices, and identify the pitfalls to avoid while managing ML models.

Devote ample time to learning MLOps

Given the anticipated rise in the significance of MLOps in business and technology, it is crucial for tech-savvy individuals and businesses alike to devote ample time to understanding and learning this field. Far from being just a trend or buzzword, MLOps will likely become an essential component of technology and business operations.

Stay adaptable and keep learning

The field of MLOps, like most tech fields, is continuously evolving. What works today may be outdated tomorrow. To ensure long-term success in this field, it is crucial to stay adaptable and open to learning new things. Monitor trends, follow new research, join discussions, and continue to learn.

Implement ML with a clear plan

Before deploying ML models into business operations, have a clear plan. Understand the problem you’re trying to solve, the resources at your disposal, and the best ML model for the task. Then use MLOps as your guiding principle in developing and deploying the ML model.

The resources available on GitHub provide an excellent starting point for this journey, providing a wealth of information and support for those ready to dive into the riveting world of MLOps.

Read the original article

Digital transformation in finance is the process of implementing advanced digital technologies to boost financial processes.

Digital Transformation in Finance: The Future Beyond

The digital transformation in finance indicates a paradigm shift towards the extensive utilization of sophisticated digital technologies to enhance financial processes. This transformation is reshaping the finance industry in numerous ways, leaving its indelible mark on all affiliated business operations, forecasting a landscape of technology-enhanced capabilities.

Future Implications

The digital transformation in the finance sector is not merely a passing trend. It alters the way finance industries function, fostering transparency, speed, and efficiency in operations. As companies continue to engage in digital transformation, the potential effects on the global financial landscape are profound.

Increased Automation

One of the primary anticipated long-term implications of digital transformation in finance is the rise of automation. Automated financial operations will unlock increased productivity, optimizing tasks such as data entry, compliance checks, and report generation. This can lead to lower operational costs and time savings.

New Job Opportunities

While automation does eliminate some roles, it simultaneously creates new ones. With digital transformation, new skill sets will be in demand, such as data analysis, cybersecurity, AI and machine learning expertise. This implies a shift in the job market, promoting upskilling and retraining of the workforce.

Improved Customer Experience

Digital transformation also improves customer experience by providing fast, stress-free, and seamless services. The adoption of digital processes means 24/7 availability, reducing wait times and making services more conveniently accessible.

Potential Future Developments

Acceleration of AI Integration

Artificial Intelligence (AI) is expected to play a more significant role in reshaping financial services. AI can optimize numerous financial operations, from credit scoring and fraud detection to customer service and financial advising.

Increase in Cybersecurity Investments

As financial operations continue to digitize, the sector becomes a prime target for cyber-attacks. Therefore, cybersecurity will likely become a critical investment area to ensure safe and secure transactions.

Greater Regulatory Scrutiny

With rapid digital transformation, regulatory bodies will likely scrutinize financial institutions more rigorously. Compliance with data protection regulations and other directives will become critical for operations.

Actionable Advice

  1. Embrace digital technology: Financial institutions must proactively adopt digital solutions, keeping an open mind for modern technologies like AI and machine learning.
  2. Invest in cybersecurity: To manage digital risks, firms should increase investment in cybersecurity infrastructure and policies.
  3. Focus on customer experience: Make customer satisfaction a priority by providing seamless, efficient, and secure services.
  4. Retrain workforce: Encourage workforce to learn new skills related to technological advancements.
  5. Compliance reviews: Regularly review your digital operations to ensure they comply with all regulatory requirements and data protection laws.

In conclusion, digital transformation in finance is setting significant trends. Financial institutions should monitor these developments closely and adapt accordingly to stay competitive and relevant in the technological era.

Read the original article

“The Impact of Local Hosting on AI: Privacy, Speed, Autonomy, and Customization”

The Evolving Landscape of Artificial Intelligence: Embracing Locally-Hosted AIs

In an age where data privacy concerns and demands for personalized experiences are at an all-time high, the evolution of artificial intelligence (AI) deployment from cloud-based to locally-hosted solutions merits a discerning analysis. This shift has significant implications that trace the contours of our digital lives. Entering this complex terrain, we must consider the potential for transformative impacts on privacy, speed, autonomy, and customization. As we unpack these elements, it is vital to critically engage with both the promises and the challenges that locally-hosted AIs present.

Privacy: A Return to Personal Agency

Privacy stands at the forefront of the debate over locally-hosted AI. This paradigm offers a promising alternative to the cloud, addressing the pervasive anxiety over data sovereignty and vulnerability. However, with the localisation of data, there emerges a nuanced set of privacy concerns that necessitate a careful examination.

Speed and Efficiency: The Quest for Real-Time Interactions

Local hosting also brings the promise of increased speed, a crucial factor for real-time decision-making and interactions. Does the theoretical reduction in latency translate into perceptible benefits, or does it introduce new limitations? This aspect of local AI hosting calls for a deep dive into the architecture and efficiency of such systems.

Autonomy: Independence from the Cloud

The notion of autonomy in locally-hosted AIs presents a dual-edged sword. On one hand, it offers freedom from the tether of cloud reliance. On the other hand, questions about the sustainability and inclusiveness of independently managed systems surface, serving up a rich field for inquiry.

Customization: The Personal Touch

Lastly, customization is a key driver for local AI hosting. The potential for tailored AI experiences individual to the user is unparalleled. We must probe the extent to which this personal customization is feasibly realizable and what it means for the user interface dynamic.

As we delve into these topics, the interplay between locally-hosted AIs and the broader technological ecosystem becomes apparent. We need to dissect this paradigm shift critically, balancing the excitement of innovation with a sober consideration of its implications. Let us journey through this nuanced landscape to uncover the deep-seated effects of hosting artificial intelligence on the edge of our personal devices.

Let’s explore the significance of locally-hosted AIs and how the shift from cloud to local hosting impacts privacy, speed, autonomy, and customization.

Read the original article

Improving Cancer Imaging Diagnosis with Bayesian Networks and Deep Learning: A Bayesian Deep Learning Approach

arXiv:2403.19083v1 Announce Type: new Abstract: With recent advancements in the development of artificial intelligence applications using theories and algorithms in machine learning, many accurate models can be created to train and predict on given datasets. With the realization of the importance of imaging interpretation in cancer diagnosis, this article aims to investigate the theory behind Deep Learning and Bayesian Network prediction models. Based on the advantages and drawbacks of each model, different approaches will be used to construct a Bayesian Deep Learning Model, combining the strengths while minimizing the weaknesses. Finally, the applications and accuracy of the resulting Bayesian Deep Learning approach in the health industry in classifying images will be analyzed.
In the article “Deep Learning and Bayesian Network Models in Cancer Diagnosis: A Comparative Study,” the authors explore the intersection of artificial intelligence and healthcare. Specifically, they delve into the theory behind Deep Learning and Bayesian Network prediction models and their applications in imaging interpretation for cancer diagnosis. By examining the strengths and weaknesses of each model, the authors propose a novel approach – the Bayesian Deep Learning Model – that combines the advantages of both while mitigating their limitations. The article concludes with an analysis of the accuracy and potential applications of this approach in the health industry, particularly in classifying medical images.

The Power of Bayesian Deep Learning: Revolutionizing Cancer Diagnosis with AI

Advancements in artificial intelligence (AI) have paved the way for remarkable breakthroughs in various fields. In the realm of healthcare, the ability to accurately interpret medical images can mean the difference between life and death, especially in cancer diagnosis. This article explores the underlying themes and concepts of Deep Learning and Bayesian Network prediction models, and proposes an innovative solution — the Bayesian Deep Learning Model — that combines the strengths of both approaches while minimizing their weaknesses.

The Theory Behind Deep Learning and Bayesian Networks

Deep Learning, a subset of machine learning, is a powerful approach that simulates the human brain’s neural network. It excels at automatically learning and extracting intricate features from complex datasets, without the need for explicit feature engineering. However, one of its limitations lies in uncertainty estimation, which is crucial for reliable medical diagnosis.

On the other hand, Bayesian Networks are probabilistic graphical models that can effectively handle uncertainty. They provide a structured representation of dependencies among variables and allow for principled inference and reasoning. However, they often struggle with capturing complex nonlinear patterns in data.

The Birth of Bayesian Deep Learning

Recognizing the advantages of both Deep Learning and Bayesian Networks, researchers have endeavored to combine them into a unified model. By incorporating Bayesian inference and uncertainty estimation into Deep Learning architectures, the Bayesian Deep Learning Model inherits the best of both worlds.

One approach to constructing a Bayesian Deep Learning Model is by integrating dropout layers into a deep neural network. Dropout is a technique that randomly deactivates neurons during training, forcing the network to learn robust representations by preventing overfitting. By interpreting dropout as approximate Bayesian inference, the model can estimate both aleatoric and epistemic uncertainties.
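As an illustration of that idea, here is a toy base-R sketch (untrained, randomly initialised weights, not any model from the paper): dropout stays active at prediction time, the forward pass is repeated many times, and the spread of the resulting predictions serves as a rough epistemic-uncertainty signal.

```r
set.seed(42)

# Toy one-hidden-layer network with random (untrained) weights
W1 <- matrix(rnorm(8), nrow = 8); b1 <- rnorm(8)   # 1 input  -> 8 hidden
W2 <- matrix(rnorm(8), nrow = 1); b2 <- rnorm(1)   # 8 hidden -> 1 output

# One stochastic forward pass: dropout is kept ON at prediction time
forward_mc <- function(x, p_drop = 0.5) {
  h <- tanh(W1 %*% x + b1)                    # hidden activations
  mask <- rbinom(length(h), 1, 1 - p_drop)    # random dropout mask
  h <- h * mask / (1 - p_drop)                # inverted-dropout rescaling
  as.numeric(W2 %*% h + b2)
}

# Monte Carlo dropout: repeat the pass and summarise the prediction spread
preds <- replicate(200, forward_mc(matrix(0.7)))
c(mean = mean(preds), sd = sd(preds))  # sd acts as an uncertainty proxy
```

In a real imaging model the same loop would run over a trained network's dropout layers, and the per-image standard deviation would flag cases where the classifier's prediction should not be trusted blindly.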

Revolutionizing Cancer Diagnosis with the Bayesian Deep Learning Model

The potential applications of the Bayesian Deep Learning Model are vast, particularly in the health industry. Imagine a system capable of accurately classifying medical images with quantified uncertainties, providing doctors with invaluable insights for making informed decisions.

By training the model on large datasets of medical images, the Bayesian Deep Learning Model can learn to detect intricate patterns indicative of cancerous tissues. Through its Bayesian framework, the model can not only provide predictions but also quantify the uncertainty associated with each prediction.

This level of uncertainty estimation is pivotal in healthcare, as it enables doctors to assess the reliability of the model’s predictions and make informed decisions. It can prevent misdiagnosis or unnecessary invasive procedures, ultimately enhancing patient care and outcomes.

The Journey Towards Enhanced Accuracy

The accuracy of the Bayesian Deep Learning Model in classifying medical images is an ongoing pursuit. To further enhance its performance, researchers are exploring techniques such as semi-supervised learning and active learning.

Semi-supervised learning leverages unlabeled data in combination with labeled data to improve model generalization. By leveraging vast amounts of available unlabeled medical images, the model can extract additional meaningful information and further refine its predictions.

Active learning, on the other hand, aims to optimize the training process by selectively choosing the most informative samples for annotation. By actively selecting samples that the model finds uncertain, researchers can iteratively improve the model’s accuracy and efficiency.
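As a tiny illustration of that selection step (the uncertainty scores here are simulated, not produced by a real model):

```r
set.seed(1)

# Hypothetical per-image predictive uncertainty, e.g. an MC-dropout sd
pred_sd <- runif(500)

# Active-learning step: pick the most uncertain images for expert annotation
budget <- 20
to_label <- order(pred_sd, decreasing = TRUE)[seq_len(budget)]

# These indices would be sent to annotators, then added to the training set
pred_sd[to_label][1:3]  # the three highest-uncertainty scores
```

Each labelling round spends the annotation budget where the model is least confident, which is why active learning can improve accuracy faster than labelling images at random.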

The Future of Cancer Diagnosis

The Bayesian Deep Learning Model represents a significant step forward in revolutionizing cancer diagnosis. By combining the strengths of Deep Learning and Bayesian Networks, it equips healthcare professionals with a powerful tool for accurate image interpretation and uncertainty quantification.

As the model continues to evolve and improve, it holds the potential to enhance early detection rates, improve patient outcomes, and alleviate the burden on healthcare providers. With further research and development, we can hope to usher in a future where AI plays an integral role in cancer diagnosis, saving lives and bringing us closer to a world free of this disease.

“The intersection of artificial intelligence and healthcare holds immense promise. By harnessing the power of Bayesian Deep Learning, we can transform cancer diagnosis and improve patient care in unprecedented ways.”

The research paper, titled “Investigating Deep Learning and Bayesian Network Prediction Models for Imaging Interpretation in Cancer Diagnosis,” explores the integration of two powerful machine learning techniques, Deep Learning and Bayesian Networks, for improving the accuracy of cancer diagnosis through image analysis. This is a significant contribution to the field of healthcare as accurate and timely diagnosis is crucial for effective treatment.

Deep Learning is a subset of machine learning that focuses on training neural networks to learn from large amounts of data. It has shown remarkable success in various domains, including image recognition. On the other hand, Bayesian Networks are probabilistic graphical models that represent uncertain relationships between variables. They provide a framework for capturing complex dependencies and reasoning under uncertainty.

By combining the strengths of these two models, the authors aim to construct a Bayesian Deep Learning Model that can leverage the power of Deep Learning for feature extraction and Bayesian Networks for probabilistic reasoning. This approach has the potential to enhance the accuracy of cancer diagnosis by incorporating uncertainty and capturing complex relationships between imaging features.

The paper acknowledges the advantages and drawbacks of both Deep Learning and Bayesian Networks. Deep Learning models excel at learning intricate patterns from large datasets, but they often lack interpretability and struggle with uncertainty estimation. On the other hand, Bayesian Networks offer interpretability and uncertainty quantification but may struggle with capturing complex patterns in high-dimensional data.

To overcome these limitations, the authors propose a hybrid approach that combines the strengths of both models. The Deep Learning component can be used to extract high-level features from medical images, while the Bayesian Network component can capture the uncertainty and dependencies among these features. By integrating these models, the resulting Bayesian Deep Learning approach can provide accurate predictions while also offering interpretability and uncertainty quantification.

The potential applications of this Bayesian Deep Learning approach in the health industry are vast. In the context of cancer diagnosis, accurate classification of medical images can significantly improve patient outcomes by enabling early detection and personalized treatment plans. The paper’s analysis of the resulting approach’s accuracy in classifying images will provide valuable insights into its effectiveness and potential impact in real-world healthcare settings.

In conclusion, the integration of Deep Learning and Bayesian Networks in the form of a Bayesian Deep Learning Model holds great promise for improving cancer diagnosis by leveraging the strengths of both models. The paper’s exploration of this approach and its analysis of its applications and accuracy in the health industry will contribute to the advancement of medical imaging interpretation and have a significant impact on patient care.
Read the original article