[This article was first published on Johannes B. Gruber and kindly contributed to R-bloggers.]


If there is one development at the moment that I wholeheartedly enjoy reading about, it’s that what remains of what was once called Twitter is seeing a large E𝕏odus.
Since a certain billionaire took over that platform, it has continuously become worse, and I was hoping that politicians, media outlets and my fellow social scientists would come to Bluesky instead, which is apparently exactly what is happening now.
So after a lot of disappointment with world events this year, my wish that Bluesky would become Twitter’s heir seems to be coming true.
The reasons I like Bluesky so much are that it connects me with a peer group that is spread around the world, like Twitter once did, and that it is built on open source infrastructure, which not only makes it billionaire-proof but also makes it incredibly easy to tap into the data.
Overall it is just a place of joy right now, and thanks to how seriously the developers have taken community moderation, I’m hopeful that it will stay this way.

However, that led to a problem this week which can only be described as ‘incredibly first world’.
I was getting too many notifications about new followers!
So many that it became impossible to go through all of them and check whom to follow back.
My approach to solving the problem?
Using R and the atrrr package I created with friends earlier this year.

Who follows me, but I’m not following back?

I start by looking at who follows me, and whom I already follow back:

library(atrrr)
library(tidyverse)
my_followers <- get_followers("jbgruber.bsky.social", limit = Inf) |>
  # remove columns containing more complex data
  select(-ends_with("_data"))
my_follows <- get_follows("jbgruber.bsky.social", limit = Inf) |>
  select(-ends_with("_data"))
not_yet_follows <- my_followers |>
  filter(!actor_handle %in% my_follows$actor_handle)
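
The same "in one table but not the other" logic can also be written as an anti-join. Here is a minimal sketch on made-up handles, so it runs without calling the Bluesky API; the toy tibbles only stand in for the real get_followers() and get_follows() results:

```r
library(dplyr)

# Toy stand-ins for the get_followers() / get_follows() results
followers <- tibble(
  actor_handle = c("a.bsky.social", "b.bsky.social", "c.bsky.social")
)
follows <- tibble(
  actor_handle = c("b.bsky.social", "d.bsky.social")
)

# Keep only the followers that have no match in follows
not_yet <- anti_join(followers, follows, by = "actor_handle")
not_yet$actor_handle
```

Both versions should return the same rows; the anti-join just states the intent a little more directly.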

Now not_yet_follows contains 372 people!
More than I thought.
My assumption is that they are interested in similar topics and it would probably enrich my feed if I followed a chunk of them back.
But how to decide?
I came up with three criteria:

  1. who is already followed by a large chunk of my follows
  2. who has #commsky, #polsky or #rstats in their description
  3. who has a big account, which I defined at the moment as 1,000 followers+

Numbers 1 and 3 are made under the assumption that popular accounts are popular for a reason, so I’m relying on the wisdom of the crowd.

Who is followed by the people I follow?

To answer this, we need to get quite a bit of data.
Specifically, I loop through all accounts that I follow and get the follows from them:

follows_of_follows <- my_follows |>
  pull(actor_handle) |>
  # iterate over follows getting their follows
  map(function(handle) {
    get_follows(handle, limit = Inf, verbose = FALSE) |>
      mutate(from = handle)
  }, .progress = interactive()) |>
  bind_rows() |>
  # not sure what this means
  filter(actor_handle != "handle.invalid")
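
A loop over several hundred accounts can stop halfway if a single request fails. One way to guard against that, sketched here under assumptions, is to wrap the request in purrr's possibly(). The function fetch_follows() below is a hypothetical stand-in for get_follows() that fails on purpose for one handle, so the example runs offline:

```r
library(purrr)
library(dplyr)

# Hypothetical stand-in for get_follows(): fails for one handle
fetch_follows <- function(handle) {
  if (handle == "broken.example") stop("request failed")
  tibble(actor_handle = paste0("friend-of-", handle))
}

# possibly() returns `otherwise` instead of raising the error
safe_fetch <- possibly(fetch_follows, otherwise = NULL)

handles <- c("a.example", "broken.example", "b.example")

result <- handles |>
  map(function(handle) {
    out <- safe_fetch(handle)
    # skip handles whose request failed
    if (is.null(out)) return(NULL)
    mutate(out, from = handle)
  }) |>
  # bind_rows() drops the NULL entries
  bind_rows()

result
```

The failed handle is simply missing from the result, so it is worth comparing unique(result$from) against the input afterwards to see what needs a retry.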

This data is huge, with over 450,000 rows, one for each follow relationship.
So who in the not_yet_follows list shows up there most often?

follows_of_follows_count <- follows_of_follows |>
  count(actor_handle, name = "n_following", sort = TRUE)
follows_of_follows_count
## # A tibble: 160,440 × 2
##    actor_handle              n_following
##    <chr>                           <int>
##  1 jbgruber.bsky.social              400
##  2 claesdevreese.bsky.social         352
##  3 rossdahlke.bsky.social            292
##  4 alessandronai.bsky.social         285
##  5 favstats.eu                       263
##  6 feloe.bsky.social                 263
##  7 jamoeberl.bsky.social             246
##  8 brendannyhan.bsky.social          226
##  9 fgilardi.bsky.social              225
## 10 dfreelon.bsky.social              224
## # ℹ 160,430 more rows

Unsurprisingly, I’m on top of this very specific list since this is a network around my own account.
But let’s see who among my not_yet_follows list is popular here:

popular_among_follows <- not_yet_follows |>
  left_join(follows_of_follows_count, by = "actor_handle") |>
  filter(n_following > 30)

I kept the people with an n_following of more than 30, which is an arbitrary number I picked, and ended up with 76 people I should look into.
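
To see what the join and filter do with accounts that nobody I follow is following, here is a small sketch on invented data (the single-letter handles and the threshold of 1 are only for illustration):

```r
library(dplyr)

# Toy edge list: `from` follows `actor_handle`
edges <- tibble(
  from         = c("x", "y", "z", "x", "y"),
  actor_handle = c("a", "a", "a", "b", "c")
)
counts <- count(edges, actor_handle, name = "n_following", sort = TRUE)

# Candidates, one of whom never appears in the edge list
candidates <- tibble(actor_handle = c("a", "b", "nobody"))

joined <- left_join(candidates, counts, by = "actor_handle")
# "nobody" gets NA for n_following; filter() drops rows where the
# condition evaluates to NA, so no extra is.na() check is needed
popular <- filter(joined, n_following > 1)
popular
```

So candidates without any match are removed silently by the threshold filter rather than causing an error.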

Who matches my interest in their description?

Specifically, I look for a couple of key hashtags: #commsky, #polsky or #rstats in their description.
These are the words I look for when checking out someone’s bio, and if one of them shows up, I very likely want to follow that person.
Looking for the keywords is pretty simple, since we already have the data:

probably_interesting_content <- not_yet_follows |>
  filter(!is.na(actor_description)) |>
  filter(str_detect(actor_description, regex("#commsky|#polsky|#rstats",
                                             ignore_case = TRUE)))

Only 20 accounts fit this filter.
Maybe I could find better keywords?
But this is just a demo of what you could do, so let’s move on.
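
If you experiment with other keywords, it can help to see how many of them each bio contains. A minimal sketch on made-up bios, using the same pattern as above:

```r
library(dplyr)
library(stringr)

# Toy bios; one is missing, one has no matching hashtag
bios <- tibble(
  actor_handle      = c("a", "b", "c"),
  actor_description = c("Political scientist #PolSky #rstats", NA, "I like cats")
)

pattern <- regex("#commsky|#polsky|#rstats", ignore_case = TRUE)

matches <- bios |>
  # str_detect() returns NA for a missing bio and filter() drops it
  filter(str_detect(actor_description, pattern)) |>
  # count how many of the hashtags appear in each bio
  mutate(n_tags = str_count(actor_description, pattern))

matches
```

Sorting by n_tags would put the bios that mention several of the hashtags first.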

Who are the big accounts trying to connect?

We can look up the user info to see how many followers they have (see footnote 1).

popular_not_yet_follows <- not_yet_follows |>
  mutate(followers_count = get_user_info(actor_handle)$followers_count) |>
  filter(followers_count > 1000)

Again, the 1,000-follower threshold is arbitrary, but when I look at an account and see a four-figure follower count, I still think it’s a lot.
This gave me 80 accounts.
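
The mutate() call above relies on the lookup returning exactly one row per handle, in the same order as the input. A more defensive variant, sketched here on toy data, joins the lookup result by handle instead. The user_info tibble is a hypothetical stand-in for what get_user_info() returns, and I am assuming it has an actor_handle column:

```r
library(dplyr)

candidates <- tibble(actor_handle = c("a", "b", "c"))

# Hypothetical stand-in for the get_user_info() result, returned in a
# different order and with one account missing
user_info <- tibble(
  actor_handle    = c("c", "a"),
  followers_count = c(2500L, 40L)
)

big_accounts <- candidates |>
  # match on the handle instead of relying on row order
  left_join(user_info, by = "actor_handle") |>
  filter(followers_count > 1000)

big_accounts
```

With the join, a missing or reordered row leads to an NA that is filtered out, not to a follower count attached to the wrong account.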

So what could I do now?
Two ways to approach it:

  1. Let’s just follow everyone who fits any of these criteria:
lets_follow <- bind_rows(
  popular_among_follows,
  probably_interesting_content,
  popular_not_yet_follows
) |>
  distinct(actor_handle) |>
  pull(actor_handle)

follow(lets_follow)
  2. More realistically though, I still want to have a look at the 136 accounts before following them.

This can be done relatively conveniently by opening the user profiles in my browser.
I can do that with:

walk(
  paste0("https://bsky.app/profile/", lets_follow),
  browseURL
)
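
Opening more than a hundred tabs is a lot, so it might help to look at the accounts that match several criteria first. A minimal sketch with toy tibbles standing in for the three filtered tables from above (the labels network, keywords and size are names I made up for this example):

```r
library(dplyr)

# Toy stand-ins for the three filtered tables
popular_among_follows        <- tibble(actor_handle = c("a", "b"))
probably_interesting_content <- tibble(actor_handle = c("b", "c"))
popular_not_yet_follows      <- tibble(actor_handle = c("b", "a"))

ranked <- bind_rows(
  network  = popular_among_follows,
  keywords = probably_interesting_content,
  size     = popular_not_yet_follows,
  # record which criterion each row came from
  .id = "criterion"
) |>
  # number of criteria each account matched, highest first
  count(actor_handle, name = "n_criteria", sort = TRUE)

ranked
```

Passing ranked$actor_handle to the walk() call would then open the strongest candidates first.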

How else can I find followers?

You can also use follows_of_follows_count to find accounts that are popular among the people you follow but that you do not follow yet, this time without the condition that they follow you.

popular_among_follows2 <-  follows_of_follows_count |>
  filter(!actor_handle %in% my_follows$actor_handle) |>
  filter(n_following > 30)

This gives me another 60 accounts to look through.

Of course, the best way to search for interesting accounts when you are new to the platform is to look for starter packs.
The website Bluesky Directory has these ordered by topic and lets you search through them.

How can I learn more about atrrr?

We collected a couple of tutorials on the package’s website: https://jbgruber.github.io/atrrr/
If there is something you would like to have explained (better) or you went through the docs and found an interesting endpoint, head over to GitHub and create an issue.
We are very open to ideas that make the package better!


  1. This currently only works with the development version of atrrr, which you can install via remotes::install_github("JBGruber/atrrr").


Continue reading: So many new people on Bluesky! Who should I follow?

Examining the Potential Effects and Implications of the Transition from Twitter to Bluesky

As the remains of the once-popular social media platform Twitter experience an ongoing E𝕏odus, a parallel move towards the open-source platform Bluesky is apparent. This shift is visible not only in the actions of social scientists, media outlets, and politicians, but also among a large number of users who might prioritize the open-source infrastructure of Bluesky over Twitter.

A Better Social Media Experience

Bluesky is rising in popularity and preference as it offers an experience quite close to Twitter’s golden days – efficiently connecting users with a peer group scattered around the globe, with an emphasis on ease of data access. Such ease of data access is attributed to Bluesky’s underlying open-source architecture, which manages to keep the platform immune from the influences of billionaires, simultaneously promoting community moderation.

Although the increasing engagement has recently led to situations where users might feel overwhelmed by the notification count from new followers, it still reflects the platform’s growing popularity.

Edging Towards Personalized Experience

One proposed solution to this problem uses R, a programming language, and a package called ‘atrrr’ to create filters based on personalized criteria. The workflow first determines which followers a user does not yet follow back. The resulting list is then filtered by popularity among the accounts the user already follows, by specific keywords in the bio, and by the number of followers each account has. These filters help identify accounts to follow back that would ideally enrich the user’s feed.

Actionable Advice Based on the Transition Implications

Evidently, the advent of Bluesky seems to offer a promising alternative for Twitter users and is seen as a refreshing development in the sphere of social media. However, the issue of efficiently managing a growing number of followers is a challenge that must be addressed. The long-term implications of this transition may include some of the following:

  1. The continuous exodus from Twitter to Bluesky could signify a broader shift in priority towards open-source platforms. Platforms that can guard against negative billionaire influence and offer straightforward data access could soon be the norm.
  2. As these platforms grow, the need for efficient tools that can manage increasing engagement becomes absolutely crucial. This means there could be a surge in demand for social data analysis and manipulation packages/tools such as ‘atrrr’.
  3. As users become more selective about who they follow, more personalized algorithms will be needed to monitor content relevance. This signifies an increased emphasis on AI and data science expertise in social media planning and development.

Advice for Bluesky Users

Those feeling overwhelmed by follower notifications on Bluesky might consider using ‘atrrr’ to help manage the influx. Notably, ‘atrrr’ allows users to filter the followers they might want to follow back based on their own criteria, which can make the platform easier to keep up with.

For Bluesky newbies or individuals seeking to expand their network, the website Bluesky Directory offers starter packs ordered by topics to help find interesting accounts. It could be a good starting point to navigate the platform and establish a strong presence.

For Learners and Developers

Educational resources and tutorials on using ‘atrrr’ can be found on the package’s website. To improve and build upon the current version of ‘atrrr’, the creators warmly welcome suggestions, issues, or ideas that could enhance the package. This is an opportunity for positive collaboration and an open invitation for individuals wanting to contribute to a relevant and influential project.

A Conclusion on the Transition

In conclusion, the transition from Twitter to Bluesky seems to reflect users’ desire for a better social media experience. Developers and social media strategists can build upon this shift, focusing on creating tools and algorithms that help manage growing engagement and deliver personalized, enriching content. Bluesky seems to have started its journey on the right note. The successful management of large-scale user migration and the active participation of its users in development might write the success story for Bluesky, making it an ideal successor to Twitter.
