[This article was first published on Numbers around us – Medium, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)


Want to share your content on R-bloggers? click here if you have a blog, or here if you don’t.

#223–224

Puzzles

Author: ExcelBI

All files (xlsx with puzzle and R with solution) for each and every puzzle are available on my Github. Enjoy.

Puzzle #223

As usual on weekends we are mainly doing table transformations. Sometimes it need simple manouvers and sometimes very complicated almost magical tricks. And today we have exactly that situation. First one is like: squeeze, pull, push, twist. Simple usage of basic functions.

If you are curious, why I illustrated it with scene of cafeteria… We sometimes need to complete tables like lunch on tray. 😀

Loading libraries and data

library(tidyverse)
library(readxl)

path = "Power Query/PQ_Challenge_223.xlsx"
input = read_excel(path, range = "A1:D14")
test  = read_excel(path, range = "F1:J8")

Transformation

result = input %>%
  unite("Code", c("Type", "Code"), sep = "") %>%
  mutate(col = ifelse(row_number() %% 2 == 0, 2, 1),
         row = (row_number() + 1) %/% 2,
         .by = Group) %>%
  pivot_wider(names_from = col, values_from = c(Code, Value), names_sep = "") %>%
  select(-row)

Validation

all.equal(result, test)
#> [1] TRUE

Puzzle #224

And the second one is one of the most complicated to solve because it is totally untidy. We have different data in the same row, we have headers every couple of rows. Fortunatelly I found a way to rebuild it in less than 15 LOC.

Loading libraries and data

library(tidyverse)
library(readxl)
library(janitor)

path = "Power Query/PQ_Challenge_224.xlsx"
input = read_excel(path, range = "A1:D12")
test  = read_excel(path, range = "F1:I20")

Transformation

result = input %>%
  mutate(date = ifelse(str_detect(Column1, "d"), Column1, NA)) %>%
  fill(date) %>%
  set_names(.[1, ]) %>%
  rename("Name" = 1, "date" = 5) %>%
  filter(!str_detect(Name, "d")) %>%
  mutate(date = coalesce(excel_numeric_to_date(as.numeric(date)), mdy(date))) %>%
  pivot_longer(-c(date, Name), names_to = "Data", values_to = "Value") %>%
  na.omit() %>%
  select(Date = date, Name, Data, Value) %>%
  mutate(Value = as.numeric(Value),
         Date = as.POSIXct(Date))

Validation

all.equal(result, test, check.attributes = FALSE)
#> [1] TRUE

Feel free to comment, share and contact me with advices, questions and your ideas how to improve anything. Contact me on Linkedin if you wish as well.


PowerQuery Puzzle solved with R was originally published in Numbers around us on Medium, where people are continuing the conversation by highlighting and responding to this story.

To leave a comment for the author, please follow the link and comment on their blog: Numbers around us – Medium.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you’re looking to post or find an R/data-science job.


Want to share your content on R-bloggers? click here if you have a blog, or here if you don’t.

Continue reading: PowerQuery Puzzle solved with R

Analysis of Puzzles #223 & #224: Long-Term Implications & Possible Future Developments

The text is an analysis of two data transformation puzzles, #223 and #224, carried out using the R statistical language. Each puzzle starts with loading appropriate libraries and data from Excel files. The solution process involves series of data transformation steps, and each puzzle result is checked against a test dataset to confirm accuracy.

Key Insights

  • Both puzzles employ diverse transformation functions from the tidyverse and readxl libraries in R, demonstrating the versatility of these packages in handling numerous data manipulation tasks.
  • These puzzles also highlight the importance of creativity and fluency in R’s diverse function repertoire to solve intricate data manipulation tasks.
  • Efficient data transformation skills are vital in using R for data analysis purposes.

Future Implications and Developments

As statistical analysis and data science continue to evolve and influence various sectors, possibly including business, healthcare, and technology, fluency in languages like R and the adept use of their data manipulation functionalities will serve as crucial skills. Therefore, mastering puzzle-like problems can be an effective way to advance one’s data transformation and general data manipulation skills.

In the future, more advanced and multifaceted queries could play an essential role in shaping the development of functional packages within R. As such, interactive platforms and communities, like those exemplified by R-Bloggers and Medium, offering these kinds of problem-solving experiences on an ongoing basis will be crucial for R learning and mastery.

Actionable Advice

If you’re an aspiring data analyst or scientist, solve diverse data manipulation puzzles that reflect real-world situations. Such experience will build your proficiency and capacity to handle data manipulation tasks in R. Regularly participating in the collective analysis and problem-solving platforms like R-Bloggers or Medium will help you gain experience, learn new perspectives, and expand your knowledge. Remember to explore and understand the various packages offered within R to gain maximum functionality. Above all, always check your results against established test datasets to ensure accuracy in your solutions.

Read the original article