by jsendak | Apr 28, 2025 | DS Articles
Starting freelancing can feel overwhelming, but mastering specialized, high-paying skills can help you stand out in competitive markets and secure better opportunities.
Mastering Specialized Skills: A Long-Term Strategy for Freelancers
With freelancing becoming increasingly popular around the globe, prospective freelancers may feel overwhelmed by the competition. One strategy that can help you rise above the crowd is mastering specialized, high-paying skills. This not only improves your marketability but also increases your chances of securing lucrative opportunities. Let’s delve into the long-term implications and foreseeable future developments of this approach.
Long-Term Implications
Developing a specialization, rather than a wide array of rudimentary skills, can set you apart as a freelancer. As companies rely more on remote and freelance work, the demand for specific expertise increases. This results in more consistent work and higher pay for freelancers with specialized skills. By building these sought-after abilities, you create a more sustainable freelance business for yourself in the long run.
Note that mastering a high-paying skill does not limit you to one skill. You can choose to specialize in multiple areas, conferring a level of versatility that is highly prized in the creative market.
Future Developments
Technology undeniably impacts the freelance market, with new tools and platforms regularly introduced. Therefore, ensuring you’re up to date with these changes augments your attractiveness to potential clients. Skills like AI programming, cybersecurity, and data analytics are examples of in-demand specializations that are expected to grow in the future.
Potential Future Specializations
- Artificial Intelligence (AI) and Machine Learning
- Cybersecurity
- Data Analysis and Business Intelligence
- Automation
- Content Strategy
Actionable Advice
To be successful as a freelancer, it’s crucial to stay ahead of the curve. Here’s practical advice that can help you leverage your skills for a profitable freelancing career.
Invest in Learning
Invest in your development, whether that’s self-study, online courses, or degrees. Continuously seek to improve your skills. This is vital in keeping you relevant in the fast-paced freelance market.
Stay Abreast of the Market Trends
Keep track of market trends. Understand which skills are both high-paying and in demand. It is equally essential to be aware of skills that are becoming obsolete.
Networking
Valuable connections can open doors to better opportunities. Make sure to establish good relationships with clients, related professionals, and the freelance community.
In conclusion, freelancing is a journey, and possessing high-paying specialized skills can help you successfully navigate this path. Though the task may seem daunting, with the right strategy, commitment, and time investment, you can chart a successful freelance career.
Read the original article
by jsendak | Apr 27, 2025 | DS Articles
Zhenguo Zhang’s Blog /2025/04/26/r-how-to-create-an-error-barplot-with-overlaid-points-using-ggplot/ –
library(ggplot2)
library(dplyr)
Sometimes you may want to create a plot with the following features:
- a point to indicate the mean of a group
- error bars to indicate the standard deviation of the group
- and each group may have subgroups, which are represented by different colors.
In this post, I will show you how to create such a plot using the ggplot2 package in R.
We will use the built-in mtcars dataset as an example. We need to compute the following variables for later use:
- The mean and standard deviation of mpg for each group of cyl (number of cylinders) and gear (number of gears); here cyl is the main group and gear is the subgroup.
# Load the mtcars dataset
data(mtcars)
# Compute the mean and standard deviation of mpg for each group
mtcars_summary <- mtcars %>%
  group_by(cyl, gear) %>%
  summarise(mean_mpg = mean(mpg), sd_mpg = sd(mpg)) %>%
  ungroup()
# replace the NA values in sd_mpg with 1
mtcars_summary$sd_mpg[is.na(mtcars_summary$sd_mpg)] <- 1
# convert group variables into factors
mtcars_summary$cyl <- factor(mtcars_summary$cyl)
mtcars_summary$gear <- factor(mtcars_summary$gear)
Create the plot – first try
Now we can create the plot using ggplot2. We will use the geom_point() function to create the points and the geom_errorbar() function to create the error bars. We will also use the aes() function to specify the aesthetics of the plot.
# Create the plot
plt <- ggplot(mtcars_summary, aes(x = cyl, y = mean_mpg, color = gear)) +
  geom_point(size = 3) + # add points
  geom_errorbar(aes(ymin = mean_mpg - sd_mpg, ymax = mean_mpg + sd_mpg), width = 0.2) + # add error bars
  labs(x = "Number of Cylinders", y = "Mean MPG", color = "Number of Gears") + # add labels
  theme_minimal() + # use a minimal theme
  theme(legend.position = "top") # move the legend to the top
plt
[Plot: points with error bars; all subgroups share the same x-axis position]
Well, it is working, but the problem is that the error bars and points are all aligned at the same x-axis position. This is not what we want. We want the subgroups to be separated by a small distance.
Create the plot – second try
To separate the subgroups, we can use the position_dodge() function. It shifts the points and error bars to the left and right so that they do not overlap.
pd <- position_dodge(width = 0.5)
# Create the plot with position_dodge
plt <- ggplot(mtcars_summary, aes(x = cyl, y = mean_mpg, color = gear)) +
  geom_point(size = 3, position = pd) + # add points with position_dodge
  geom_errorbar(aes(ymin = mean_mpg - sd_mpg, ymax = mean_mpg + sd_mpg), width = 0.2, position = pd) + # add error bars with position_dodge
  labs(x = "Number of Cylinders", y = "Mean MPG", color = "Number of Gears") + # add labels
  theme_minimal() + # use a minimal theme
  theme(legend.position = "top") # move the legend to the top
plt
[Plot: points with error bars; subgroups dodged horizontally at each x-axis position]
Cool, isn’t it?
The only difference is that we added the position = pd argument to the geom_point() and geom_errorbar() functions. This tells ggplot2 to use the position_dodge() function to separate the subgroups.
Conclusion
In this post, we learned how to create a plot with error bars and overlaid points using the ggplot2 package in R. We also learned how to separate the subgroups using the position_dodge() function.
If you want to learn more about the position_dodge() function, you can check an excellent post here.
Happy programming! 
Continue reading: [R] How to create errorbars with overlaid points using ggplot
Long-Term Implications and Future Developments
The blog post by Zhenguo Zhang provides a well-detailed guide on how to create a plot with overlaid points and error bars using the ggplot2 package in R. This skill is increasingly essential in the data analysis field, especially as organizations lean further into data-driven decision making. As a developer or data analyst, mastering ggplot2 for data visualization not only increases efficiency but also improves the clarity of your data reports.
Possibility of Increased Use of ggplot2
With the continual growth of data analysis across almost all sectors, we can expect more people to rely on ggplot2 for their data visualization needs. Its ability to create complex, detailed plots with a few lines of code makes it a powerful tool for data analysis.
The Need for Improved Visualization Tools
The use of overlaid points and error bars as shown by Zhenguo Zhang is an essential technique in data visualization. However, there is a need to simplify this process and make it more user-friendly for people without programming skills. We can then expect future developments to focus on improving user experience by introducing new functions or tools that make data visualization easier.
Actionable Advice
For individuals dealing with R and data visualization, here are some tips:
- Enhance your R skills: Increasing your knowledge of R and its associated data visualization packages, particularly ggplot2, will prove invaluable in professional data analysis.
- Constant learning: ggplot2 is constantly being updated with new features and functionalities. Therefore, continuously updating your knowledge and skills on the package will keep you ready and equipped to handle any changes that may arise.
- Engage the R community: Participating in R-bloggers and other similar communities can provide you with a platform to not only share but also learn from others.
- Explore other visualization tools: While ggplot2 is quite powerful, other packages may be better suited for specific kinds of data visualization. Be open to learning and using other visualization tools.
Remember: The key in today’s data analysis field lies not simply in analyzing and reporting data, but in presenting it in a way that is easy to understand.
Read the original article
by jsendak | Apr 27, 2025 | DS Articles
A step-by-step guide to speed up the model inference by caching requests and generating fast responses.
Analysis: Accelerating Model Inference Through Effective Caching Practices
A major development in the realm of model inference is the caching of requests, which allows for the generation of fast responses and streamlined operations. This advancement yields significant improvements in model inference speed and is set to shape the future dynamics of this field.
Long-Term Implications
The use of request caching presents a number of long-term implications. Primarily, there is dramatically improved efficiency: shorter response times expedite the processing of large volumes of data in model inference. This could lead to major advancements in areas reliant on big data analytics and artificial intelligence, such as healthcare, finance, and smart city development.
Moreover, it may result in substantial cost savings. Faster model inference reduces the demand for expensive processing power, potentially lowering overhead costs. This is particularly beneficial for smaller organizations and initiatives, as it allows them to enhance performance without significant financial investment.
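As a minimal sketch of the idea (the original article’s code is not reproduced here), caching a request can be as simple as memoising the inference call so that repeated identical requests are served from an in-memory cache. The example below uses R’s {memoise} package with a hypothetical run_inference() stand-in for a slow model call:
library(memoise)

# Hypothetical stand-in for a slow model call (illustrative only)
run_inference <- function(prompt) {
  Sys.sleep(2)                                 # simulate expensive inference
  paste("model response to:", prompt)
}

# Wrap the function so identical requests are answered from an in-memory cache
cached_inference <- memoise(run_inference)

system.time(cached_inference("summarise Q1 sales"))  # first call: slow (cache miss)
system.time(cached_inference("summarise Q1 sales"))  # repeat call: near-instant (cache hit)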
Future Developments
With the continuous evolution of this technology, we can expect several developments in the future. There will likely be advancements in caching algorithms that could lead to even faster responses and more efficient model inference processes. We may also see the development of specific hardware to further accelerate these techniques.
Furthermore, industries that utilize model inference are expected to adapt quickly to these developments. They will likely incorporate these caching strategies into their systems, leading to widespread integration across multiple sectors. Overall, the future for efficient model inference through caching requests is not only promising but essential for handling growing volumes of data effectively.
Actionable Advice
Considering the highlighted implications and future trends, the following actions can prove beneficial:
- Invest in Learning: Organizations should invest in technical training aimed at understanding and implementing caching strategies for model inference. This will enhance their capacity to rapidly process data and generate insights.
- Prioritize Research and Development: Continual advancements in this field necessitate a focus on research and development. Companies should prioritize staying up-to-date with the latest ways to improve model inference through caching.
- Plan for Integration: If not already implementing this technology, organizations need to plan for its seamless integration into their existing systems. This will involve considering both logistical and technical aspects.
The successful implementation of request caching for model inference can significantly overhaul existing data processing methods. This elevates the importance of not just understanding this technology, but also planning for its optimal use in the near future.
Read the original article
by jsendak | Apr 26, 2025 | DS Articles
Episode 22 – Mathematical Optimization for AI | AI Think Tank Podcast
Discover how mathematical optimization is transforming artificial intelligence, machine learning, and real-world decision-making in this episode of the AI Think Tank Podcast. Host Dan Wilson talks with Jerry Yurchisin, Senior Data Scientist at Gurobi, the industry leader in optimization software. Learn how companies like the NFL, Instacart, and major energy providers use Gurobi’s optimizer to solve complex problems in scheduling, logistics, finance, and AI workflows. Gain insight into practical applications, from cloud resource management to real-time analytics, and explore tools like the Burrito Optimization Game. Perfect for data scientists, AI engineers, and business leaders looking to unlock smarter, faster decisions through the power of math.
Transforming AI With Mathematical Optimization
In the 22nd episode of the AI Think Tank Podcast, host Dan Wilson discusses the ground-breaking integration of mathematical optimization in artificial intelligence (AI), machine learning (ML), and real-world decision-making with Jerry Yurchisin, a Senior Data Scientist at Gurobi. Gurobi is at the forefront of producing optimization software and has a wide clientele ranging from the NFL and Instacart to significant energy providers.
The Shift to Optimization in Problem Solving
These companies rely on Gurobi’s optimizer to resolve intricate problems in scheduling, logistics, finance, and AI workflows. This lets them replace traditional problem-solving approaches with mathematical optimization, which delivers increased efficiency and quicker, better-informed decision-making. The implications of this shift are far-reaching, as it could revolutionize various industries and pave the way for more complex, real-time solutions and analytics.
Cloud Resource Management and Real-Time Analytics
There are numerous practical applications of mathematical optimization, from managing cloud resources to real-time analytics. Such applications can offer significant long-term value by ensuring resources are used effectively and by providing immediate interpretation of data.
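To make the idea concrete, here is a toy resource-allocation problem expressed as a linear program in R, using the open-source {lpSolve} package (a small illustrative sketch with made-up numbers; the companies mentioned in the episode use Gurobi’s own solver and APIs, which are not shown here):
library(lpSolve)

# Toy product-mix problem: maximise profit from two products
# subject to limited machine hours and labour hours.
profit      <- c(40, 30)                 # profit per unit of products A and B
constraints <- matrix(c(2, 1,            # machine hours needed per unit
                        1, 2),           # labour hours needed per unit
                      nrow = 2, byrow = TRUE)
available   <- c(100, 80)                # hours available for each resource

solution <- lp("max", profit, constraints, c("<=", "<="), available)
solution$solution                        # optimal production quantities
solution$objval                          # maximum achievable profit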
The Future of Mathematical Optimization
As mathematical optimization becomes more mainstream, we could expect to witness an increase in problem-solving efficiency with an optimization-first approach in almost every industry. This rapid development could pave the way for more advancements in rapid, real-time analytics, intelligent algorithms, and data-driven decision making, significantly increasing productivity across sectors.
Actionable Advice for Stakeholders
- Business leaders should invest in mathematical optimization to ensure efficient and sensible decision-making and to stay competitive in the ever-evolving business landscape.
- Data scientists and AI engineers should strive to remain at the forefront of such advancements, regularly updating their knowledge and skills in mathematical optimization to provide innovative solutions.
- Companies should further tap into the potential of optimization software like Gurobi’s that enable them to solve complex problems efficiently and quickly.
- Lastly, stakeholders should look into practical applications of optimization such as cloud resource management and real-time analytics, harnessing their potential for better utilization of resources and rapid insight generation.
The Burrito Optimization Game
For a more fun approach towards understanding mathematical optimization, tools like the Burrito Optimization Game shine a light on the use of mathematics in real-world problem-solving scenarios. It is an interesting example of how mathematical optimization can be both entertaining and educational.
Concluding Thoughts
Mathematical optimization is indeed a game-changer in the AI and ML landscape and poses a wide array of applications across sectors. As industries embrace this advancement, we could witness an exponential increase in growth, efficiency, and productivity. Thus, realizing and investing in its potential is crucial for any forward-looking enterprise.
Read the original article
by jsendak | Apr 25, 2025 | DS Articles
In which I confront the way I read code in different languages, and end up
wishing that R had a feature that it doesn’t.
This is a bit of a thought-dump as I consider some code – please don’t take it
as a criticism of any design choices; the tidyverse team have written orders of magnitude more code than I have and have certainly considered their approach more than I
will. I believe it’s useful to challenge our own assumptions and dig in to how
we react to reading code.
The blog post
describing the latest updates to the tidyverse {scales} package neatly
demonstrates the usage of the new functionality, but because the examples are
written outside of actual plotting code, one feature stuck out to me in
particular…
label_glue("The {x} penguin")(c("Gentoo", "Chinstrap", "Adelie"))
# The Gentoo penguin
# The Chinstrap penguin
# The Adelie penguin
Here, label_glue is a function that takes a {glue} string as an argument and returns a “labelling” function. That function is then passed the vector of penguin species, which is used in the {glue} string to produce the output.
Note
For those coming to this post from a Python background, {glue} is R’s answer to f-strings, and is used in almost exactly the same way for simple cases:
## R:
name <- "Jonathan"
glue::glue("My name is {name}")
# My name is Jonathan
## Python:
>>> name = 'Jonathan'
>>> f"My name is {name}"
# 'My name is Jonathan'
There’s nothing magic going on with the label_glue()()
call – functions are
being applied to arguments – but it’s always useful to interrogate surprise when
reading some code.
Spelling out an example might be a bit clearer. A simplified version of
label_glue
might look like this
tmp_label_glue <- function(pattern = "{x}") {
  function(x) {
    glue::glue_data(list(x = x), pattern)
  }
}
This returns a function which takes one argument, so if we evaluate it we get
tmp_label_glue("The {x} penguin")
# function(x) {
# glue::glue_data(list(x = x), pattern)
# }
# <environment: 0x1137a72a8>
This has the benefit that we can store this result as a new named function
penguin_label <- tmp_label_glue("The {x} penguin")
penguin_label
# function(x) {
# glue::glue_data(list(x = x), pattern)
# }
# <bytecode: 0x113914e48>
# <environment: 0x113ed4000>
penguin_label(c("Gentoo", "Chinstrap", "Adelie"))
# The Gentoo penguin
# The Chinstrap penguin
# The Adelie penguin
This is versatile, because different {glue} strings can produce different
functions – it’s a function generator. That’s neat if you want different
functions, but if you’re only working with that one pattern, it can seem odd to
call it inline without naming it, as in the earlier example
label_glue("The {x} penguin")(c("Gentoo", "Chinstrap", "Adelie"))
It looks like we should be able to have all of these arguments in the same
function
label_glue("The {x} penguin", c("Gentoo", "Chinstrap", "Adelie"))
but apart from the fact that label_glue
doesn’t take the labels as an
argument, that doesn’t return a function, and the place where this will be used
takes a function as the argument.
So, why do the functions from {scales} take functions as arguments? The reason
would seem to be that this enables them to work lazily – we don’t necessarily
know the values we want to pass to the generated function at the call site;
maybe those are computed as part of the plotting process.
We also don’t want to have to extract these labels out ourselves and compute on
them; it’s convenient to let the scale_*
function do that for us, if we just
provide a function for it to use when the time is right.
But what is passed to that generated function? That depends on where it’s
used… if I used it in scale_y_discrete
then it might look like this
library(ggplot2)
library(palmerpenguins)
p <- ggplot(penguins[complete.cases(penguins), ]) +
  aes(bill_length_mm, species) +
  geom_point()
p + scale_y_discrete(labels = penguin_label)
since the labels
argument takes a function, and penguin_label
is a function
created above.
I could equivalently write that as
p + scale_y_discrete(labels = label_glue("The {x} penguin"))
and not need the “temporary” function variable.
So what gets passed in here? That’s a bit hard to dig out of the source, but one
could reasonably expect that at some point the supplied function will be called
with the available labels as an argument.
I have a suspicion that the “external” use of this function, as
label_glue("The {x} penguin")(c("Gentoo", "Chinstrap", "Adelie"))
is clashing with my (much more recent) understanding of Haskell and the way that
partial application works. In Haskell, all functions take exactly 1 argument,
even if they look like they take more. This function
ghci> do_thing x y z = x + y + z
looks like it takes 3 arguments, and it looks like you can use it that way
ghci> do_thing 2 3 4
9
but really, each “layer” of arguments is a function with 1 argument, i.e. an
honest R equivalent would be
do_thing <- function(x) {
  function(y) {
    function(z) {
      x + y + z
    }
  }
}
do_thing(2)(3)(4)
# [1] 9
What’s important here is that we can “peel off” some of the layers, and we get
back a function that takes the remaining argument(s)
do_thing(2)(3)
# function(z) {
# x + y + z
# }
# <bytecode: 0x116b72ba0>
# <environment: 0x116ab2778>
partial <- do_thing(2)(3)
partial(4)
# [1] 9
In Haskell, that looks like this
ghci> partial = do_thing 2 3
ghci> partial 4
9
Requesting the type signature of this function shows
ghci> :type do_thing
do_thing :: Num a => a -> a -> a -> a
so it’s a function that takes some value of type a
(which needs to be a Num
because we’re using +
for addition; this is inferred by the compiler) and then
we have
a -> a -> a -> a
This can be read as “a function that takes 3 values of a type a
and returns 1
value of that same type” but equivalently (literally; this is all just syntactic
sugar) we can write it as
a -> (a -> (a -> a))
which is “takes a value of type a and returns a function that takes a value of type a, which itself returns a function that takes a value of type a and returns a value of type a”. With a bit of ASCII art…
a -> (a -> (a -> a))
|     |     |    |
|     |     |_z__|
|     |_y________|
|_x______________|
If we ask for the type signature when some of the arguments are provided
ghci> :type do_thing 2 3
do_thing 2 3 :: Num a => a -> a
we see that now it is a function of a single variable (a -> a).
With that in mind, the labelling functions look like a great candidate for
partially applied functions! If we had
label_glue(pattern, labels)
then
label_glue(pattern)
would be a function “waiting” for a labels
argument. Isn’t that the same as
what we have? Almost, but not quite. label_glue doesn’t take a labels argument; it returns a function which will use them, so the lack of a labels argument isn’t a signal for this. label_glue(pattern) still returns a function, but that’s not obvious, especially when used inline as
scale_y_discrete(labels = label_glue("The {x} penguin"))
When I read R code like that I see the parentheses at the end of label_glue
and read it as “this is a function invocation; the return value will be used
here”. That’s correct, but in this case the return value is another function.
There’s nothing here that says “this will return a function”. There’s no
convention in R for signalling this (and being dynamically typed, all one can do
is read the documentation) but one could imagine one, e.g. label_glue_F, in a similar fashion to how Julia uses an exclamation mark to signify an in-place mutating function: sort! vs sort.
Passing around functions is all the rage in functional programming, and it’s how
you can do things like this
sapply(mtcars[, 1:4], mean)
# mpg cyl disp hp
# 20.09062 6.18750 230.72188 146.68750
Here I’m passing a list (the first four columns of the mtcars dataset) and a function (mean, by name) to sapply, which essentially does a map(l, f) and produces the mean of each of these columns, returning a named vector of the means.
That becomes very powerful where partial application is allowed, enabling things
like
ghci> add_5 = (+5)
ghci> map [1..10] add_5
[6,7,8,9,10,11,12,13,14,15]
In R, we would need to create a new function more explicitly, i.e. referring to
an arbitrary argument
add_5 <- \(x) x + 5
sapply(1:10, add_5)
# [1] 6 7 8 9 10 11 12 13 14 15
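The {purrr} package also offers partial() for pre-filling arguments, which is roughly the R analogue of the Haskell section above (a minimal sketch, assuming purrr >= 0.3 is installed):
library(purrr)

# Pre-fill the first argument of `+`, mirroring Haskell's (+5)
add_5 <- partial(`+`, 5)
sapply(1:10, add_5)
# [1]  6  7  8  9 10 11 12 13 14 15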
Maybe my pattern-recognition has become a bit too overfitted on the idea that in
R “no parentheses = function, not result; parentheses = result”.
This reads weirdly to me
calc_mean <- function() {
  function(x) {
    mean(x)
  }
}
sapply(mtcars[, 1:4], calc_mean())
but it’s exactly the same as the earlier example, since calc_mean() essentially returns a mean function
calc_mean()(1:10)
# [1] 5.5
For that reason, I like the idea of naming the labelling function, since I read
this
p + scale_y_discrete(labels = penguin_label)
as passing a function. The parentheses get used in the right place – where the
function has been called.
Now, having to define that variable just to use it in the scale_y_discrete
call is probably a bit much, so yeah, inlining it makes sense, with the caveat
that you have to know it’s a function.
None of this was meant to say that the {scales} approach is wrong in any way – I
just wanted to address my own perceptions of the arg = fun()
design. It does
make sense, but it looks different. Am I alone on this?
Let me know on Mastodon and/or the comment
section below.
Continue reading: Function Generators vs Partial Application in R
An Overview of Function Generators and Partial Application in R
The article explores the way the author reads and comprehends code in different languages. Notably, the author discusses function generators and partial application in R, using examples from the tidyverse {scales} package, the label_glue() function, and {glue} strings.
Key Insights
- {glue} is R’s equivalent of Python’s f-strings.
- label_glue("The {x} penguin")(c("Gentoo", "Chinstrap", "Adelie")) demonstrates the use of {glue} strings in R to produce a labelled string for each element of a vector.
- label_glue functions as a function generator. It returns a function that takes one argument. This allows for flexibility as different {glue} strings can generate different functions.
- The {scales} functions take functions as arguments to work lazily, i.e., they don’t need to know the values they want to pass to the generated function at the call site. These values might be calculated as part of the plotting process.
- The process of partial application allows us to “peel off” each layer of function calls.
Long term implications and future developments
Understanding function generators and partial application is crucial to effective R programming. The post provides helpful insight into the code-reading process by probing the usage of {scales}, {glue} strings, and label_glue.
The code examples demonstrate how different {glue} strings can generate different functions and how the concept of function generators and partial application can be applied to enhance R’s versatility as a programming language. These concepts have essential long-term implications for code optimization.
Understanding these methods aids programming efficiency, enabling cleaner, more concise, and more effective coding practices. In the future, the dynamic use of function generators and partial application may be extended to more complex programming scenarios, increasing the usability of R for tackling complicated tasks.
Actionable Advice
- Try to incorporate the use of function generators and partial applications in your regular R programming routine. Begin with simple tasks and gradually extend to more complex scenarios.
- Remember that with R, “no parentheses = function, not result; parentheses = result”. This is important when trying to distinguish between a function and a result.
- Remember that labelling functions like label_glue() work lazily with {scales} – the values passed to the generated function need not be known at the call site. This is an essential aspect of programming with R.
Read the original article