R Puzzle Solutions: A Showcase of R’s Data Manipulation Power

[This article was first published on Numbers around us – Medium, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don’t.

Puzzles no. 399–403

Puzzles

Author: ExcelBI

All files (xlsx with puzzle and R with solution) for each and every puzzle are available on my Github. Enjoy.

Puzzle #399

Today we have been given a list of random strings containing of certain number of letters duplicated. And our task is to count how many of each letters are there and present it as alphabetically pasted string. Sounds nice and it is nice. Let’s go.

Loading libraries and data

library(tidyverse)
library(readxl)

input = read_excel("Excel/399 Counter Dictionary.xlsx", range = "A1:A10")
test  = read_excel("Excel/399 Counter Dictionary.xlsx", range = "B1:B10")

Transformation

count_chars = function(string) {
  chars = string %>%
    str_split(., pattern = "") %>%
    unlist() %>%
    tibble(char = .) %>%
    group_by(char) %>%
    summarise(count = n()) %>%
    ungroup() %>%
    arrange(char) %>%
    unite("char_count", c("char", "count"), sep = ":") %>%
    pull(char_count) %>%
    str_c(collapse = ", ")

  return(chars)
}

result = input %>%
  mutate(`Answer Expected` = map_chr(String, count_chars)) %>%
  select(-String)

Validation

identical(result, test)
# [1] TRUE

Puzzle #400

Once again we are playing with coordinates and checking if they form one structure. But this time vertices are mixed and we have some more to do.
In this puzzle I will give you one surprise. Be patient.

Loading libraries and data

library(tidyverse)
library(readxl)

input = read_excel("Excel/400 Connected Points_v2.xlsx", range = "A1:D8")
test  = read_excel("Excel/400 Connected Points_v2.xlsx", range = "E1:E8")

Transformation

result = input %>%
  mutate(row = row_number()) %>%
  select(row, everything()) %>%
  pivot_longer(-row, names_to = "col", values_to = "value") %>%
  select(-col) %>%
  na.omit() %>%
  group_by(row) %>%
  separate_rows(value, sep = ", ") %>%
  group_by(row, value) %>%
  summarise(n = n()) %>%
  ungroup() %>%
  select(-value) %>%
  group_by(n, row) %>%
  summarise(count = n()) %>%
  ungroup() %>%
  filter(n == 1) %>%
  mutate(`Answer Expected` = ifelse(count == 2, "Yes", "No")) %>%
  select(`Answer Expected`)

Validation

identical(test, result)
# [1] TRUE

Optimized version

I asked AI chat to optimize my code from above, because I don’t really like when my code is to long without a purpose. So I tried it, and that is really a surprise.

result2 <- input %>%
  mutate(`Answer Expected` = pmap_chr(., ~ {
    unique_values <- na.omit(c(...))
    if (length(unique(unique_values)) == 2) "Yes" else "No"
  })) %>%
  select(`Answer Expected`)


identical(test, result2)
# [1] TRUE

Puzzle #401

I am not using matrices in my daily work often, but I really like puzzles in which I can use them to solve. Today we have to form triangle from string. We have to bend it to size of matrix. Let’s try.

Loading libraries and data

library(tidyverse)
library(readxl)

input1 = read_excel("Excel/401 Make Triangle.xlsx",
                    range = "A2:A2", col_names = F) %>% pull()
input2 = read_excel("Excel/401 Make Triangle.xlsx",
                    range = "A5:A5", col_names = F) %>% pull()
input3 = read_excel("Excel/401 Make Triangle.xlsx",
                    range = "A9:A9", col_names = F) %>% pull()
input4 = read_excel("Excel/401 Make Triangle.xlsx",
                    range = "A14:A14", col_names = F) %>% pull()
input5 = read_excel("Excel/401 Make Triangle.xlsx",
                    range = "A19:A19", col_names = F) %>% pull()

test1 = read_excel("Excel/401 Make Triangle.xlsx",
                   range = "C2:D3", col_names = F) %>% as.matrix(.)
dimnames(test1) = list(NULL, NULL)
test2 = read_excel("Excel/401 Make Triangle.xlsx",
                   range = "C5:D7",col_names = F) %>% as.matrix(.)
dimnames(test2) = list(NULL, NULL)
test3 = read_excel("Excel/401 Make Triangle.xlsx",
                   range = "C9:E12",col_names = F) %>% as.matrix(.)
dimnames(test3) = list(NULL, NULL)
test4 = read_excel("Excel/401 Make Triangle.xlsx",
                   range = "C14:F17", col_names = F) %>% as.matrix(.)
dimnames(test4) = list(NULL, NULL)
test5 = read_excel("Excel/401 Make Triangle.xlsx",
                   range = "C19:G23", col_names = F)  %>% as.matrix(.)
dimnames(test5) = list(NULL, NULL)

Transformation and validation

triangle = function(string) {
  chars = str_split(string, "") %>% unlist()
  nchars = length(chars)
  positions = tibble(row = 1:10) %>%
    mutate(start = cumsum(c(1, row[-5])),
           end = start + row - 1)
  nrow = positions %>%
    mutate(nrow = map2_dbl(start, end, ~ sum(.x <= nchars &
                                               nchars <= .y))) %>%
    filter(nrow == 1) %>%
    pull(row)
  M = matrix(NA, nrow = nrow, ncol = nrow)

  for (i in 1:nrow) {
    M[i, 1:i] = chars[positions$start[i]:positions$end[i]]
  }

  FM = M %>%
    as_tibble() %>%
    select(where( ~ !all(is.na(.)))) %>%
    as.matrix()
  dimnames(FM) = list(NULL, NULL)


  return(FM)
}
identical(triangle(input1), test1) # TRUE
identical(triangle(input2), test2) # TRUE
identical(triangle(input3), test3) # TRUE
identical(triangle(input4), test4) # TRUE
identical(triangle(input5), test5) # TRUE

Puzzle #402

One of common topics in our series is of course cyphering. And today we have again some spy level puzzle. We have some phrase and keyword using which we need to code given phrase. Few weeks ago there was puzzle when lacking letters in keyword were taken from coded phrase. Today we are repeating key how many times we need. And there is one more detail, we have to handle spaces as well. Not so simple, but satisfying.

Loading libraries and data

library(tidyverse)
library(readxl)

input = read_excel("Excel/402 Vignere Cipher.xlsx", range = "A1:B10")
test  = read_excel("Excel/402 Vignere Cipher.xlsx", range = "C1:C10")

Transformation

code = function(plain_text, key) {
  coding_df = tibble(letters = letters, numbers = 0:25)

  plain_text_clean = plain_text %>%
    str_remove_all(pattern = "s") %>%
    str_split(pattern = "") %>%
    unlist()

  key = str_split(key, "") %>% unlist()
  key_full = rep(key, length.out = length(plain_text_clean))

  df = data.frame(plain_text = plain_text_clean, key = key_full) %>%
    left_join(coding_df, by = c("plain_text" = "letters")) %>%
    left_join(coding_df, by = c("key" = "letters")) %>%
    mutate(coded = (numbers.x + numbers.y) %% 26) %>%
    select(coded) %>%
    left_join(coding_df, by = c("coded" = "numbers")) %>%
    pull(letters)

  words_starts = str_split(plain_text, " ") %>%
    unlist() %>%
    str_length()

  words = list()

  for (i in 1:length(words_starts)) {
    if (i == 1) {
      words[[i]] = paste(df[1:words_starts[i]], collapse = "")
    } else {
      words[[i]] = paste(df[(sum(words_starts[1:(i-1)])+1):(sum(words_starts[1:i]))], collapse = "")
    }
  }

  words = unlist(words) %>% str_c(collapse = " ")

  return(words)
}

result = input %>%
  mutate(`Answer Expected` = map2_chr(`Plain Text`, Keyword, code))

Validation

identical(result$`Answer Expected`, test$`Answer Expected`)
# [1] TRUE

Puzzle #403

We are summarizing some values into year brackets. Usually you do it using crosstab. And our job today is to make crosstab that is not excel crosstab, but should work like it. From R side usually you have to make pivot, but I didn’t. So we have pivot table (another word for crosstab), without using pivot neither in R nor in Excel. How? Look on it.

Loading libraries and data

library(tidyverse)
library(readxl)

input = read_excel("Excel/403 Generate Pivot Table.xlsx", range = "A1:B100")
test  = read_excel("Excel/403 Generate Pivot Table.xlsx", range = "D2:F9")

Transformation

result = input %>%
  add_row(Year = 2024, Value = 0) %>% ## just to have proper year range at the end
  mutate(group = cut(Year, breaks = seq(1989, 2024, 5), labels = FALSE, include.lowest = TRUE)) %>%
  group_by(group) %>%
  summarize(Year = paste0(min(Year), "-", max(Year)),
            `Sum of Value` = sum(Value)) %>%
  ungroup() %>%
  mutate(`% of Value` = `Sum of Value`/sum(`Sum of Value`)) %>%
  select(-group)

Validation

identical(result, test)
# [1] TRUE

Feel free to comment, share and contact me with advices, questions and your ideas how to improve anything. Contact me on Linkedin if you wish as well.

R Solution for Excel Puzzles was originally published in Numbers around us on Medium, where people are continuing the conversation by highlighting and responding to this story.

To leave a comment for the author, please follow the link and comment on their blog: Numbers around us – Medium.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you’re looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don’t.

Continue reading: R Solution for Excel Puzzles

Analysis

In this text, the author challenges the reader with multiple puzzles. These puzzles are a testament to the flexibility and efficiency of the R programming language, which is used to solve each problem. The incorporation of the tidyverse and readxl libraries into the solutions further showcase the power of R. The problems touch on data processing in different forms: string manipulation, coordinate transformation, matrix generation, cipher creation, and pivot table creation.

Long-term Implications

The author demonstrates how R can be leveraged for data manipulation of various types, which provides insights into its potential uses for other data-related applications in the future. This includes data analysis, visualisation, machine learning, and modeling, with the possibility of extending R’s capabilities with libraries like tidyverse and readxl. This implies that learning and utilising R can be essential for analysts, data scientists and even business officials who engage with data regularly.

Possible Future Developments

The R programming language will likely continue to evolve, with more powerful and efficient libraries being developed. These will likely improve the language’s data pre-processing functionalities further, making it an even more potent analytic instrument. As more individuals becomes aware and learn R, a possible future development could be the provision of simpler interface for R that allows even non-programmers to execute complex data manipulations.

Actionable Advice

If you work with data in any capacity, consider learning and using R for your data processing needs. This language, with its numerous libraries such as tidyverse and readxl, not only provides extensive functionality for manipulating, summarising and analysing data, but also offers a deep capability to handle and solve complex data-oriented problems. Despite R having a somewhat steep learning curve initially, especially for those without programming background, the potential long-term benefits of being able to hand-craft solutions to problems make it an investment worth making.

More Exercises and Practice

Consider trying R on more complex problems such as those presented in the text. The more practice you get using the language, the more comfortable you will become in writing efficient R code. Further, consider developing the habit of constantly looking for problems to solve using R so as to enhance your problem-solving skills while also mastering the language.

Read the original article

R Puzzle Solutions: A Showcase of R’s Data Manipulation Power

Puzzles

Puzzle #399

Loading libraries and data

Transformation

Validation

Puzzle #400

Loading libraries and data

Transformation

Validation

Optimized version

Puzzle #401

Loading libraries and data

Transformation and validation

Puzzle #402

Loading libraries and data

Transformation

Validation

Puzzle #403

Loading libraries and data

Transformation

Validation

Analysis

Long-term Implications

Possible Future Developments

Actionable Advice

More Exercises and Practice

Submit a Comment Cancel reply

Recent Posts

Recent Comments