[This article was first published on Numbers around us – Medium, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)


Want to share your content on R-bloggers? click here if you have a blog, or here if you don’t.

#227–228

Puzzles

Author: ExcelBI

All files (xlsx with puzzle and R with solution) for each and every puzzle are available on my Github. Enjoy.

Puzzle #227

We are given two tables one with employees data, and second with conditional assignment of salary. There is kind of regular expression in there but not in official notation, but rather giving asterisk power of Joker — replace one or many characters. As this placeholder notation cannot be used in R, I made some modification in T2 table using as I called it BRAIN.API (my own mind :D). Check what was later.

Loading libraries and data

library(tidyverse)
library(readxl)

path = "Power Query/PQ_Challenge_227.xlsx"
input1 = read_excel(path, range = "A2:D13")
input2 = read_excel(path, range = "F2:H6")
test  = read_excel(path, range = "J2:N11") %>%
  arrange(Sequence, Name)

Transformation

input2 = input2 %>%
  mutate(pattern_seq = c("^1.*", "321", ".*", ".*8$"),
         pattern_name = c("^M.*", "^S.*", ".*[aA]$", ".*"))

input = input1 %>%
  cross_join(input2) %>%
  mutate(check_seq = str_detect(string = Sequence.x, pattern = pattern_seq),
         check_name = str_detect(string = Name.x, pattern = pattern_name),
         both_conditions = check_seq & check_name) %>%
  filter(both_conditions) %>%
  select(Sequence = Sequence.x,Name = Name.x, Weight, `Bonus %`, Salary) %>%
  arrange(Sequence, Name)

Validation

all.equal(input, test, check.attributes = FALSE)
#> [1] TRUE

Puzzle #228

Sometimes tables need some stretching and squeezing… and some stretching again. Especially when there are more than one header available. Fortunatelly we have special package for such cases: unpivotr. We need to “behead” some columns and rows and then we would put it in order we all accept to follow.

Loading libraries and data

library(tidyverse)
library(readxl)
library(unpivotr)

path = "Power Query/PQ_Challenge_228.xlsx"
input = read_excel(path, range = "A1:H5", col_names = F)
test  = read_excel(path, range = "J1:M20") %>%
  arrange(Category, Student, Value)

Transformation

result = input %>%
  as_cells() %>%
  behead("left", "Student") %>%
  behead("up-left", "Category") %>%
  behead("up", "Value") %>%
  select(Student, Category, Value, Marks = chr) %>%
  mutate(Marks = as.integer(Marks)) %>%
  na.omit() %>%
  arrange(Category, Student, Value)

Validation

all.equal(result, test, check.attributes = F)
#> [1] TRUE

Feel free to comment, share and contact me with advices, questions and your ideas how to improve anything. Contact me on Linkedin if you wish as well.


PowerQuery Puzzle solved with R was originally published in Numbers around us on Medium, where people are continuing the conversation by highlighting and responding to this story.

To leave a comment for the author, please follow the link and comment on their blog: Numbers around us – Medium.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you’re looking to post or find an R/data-science job.


Want to share your content on R-bloggers? click here if you have a blog, or here if you don’t.

Continue reading: PowerQuery Puzzle solved with R

Analysis of PowerQuery Puzzle Solved with R

The text outlines two puzzles #227 and #228 associated with data manipulation and transformation using R language. Libraries like tidyverse, readxl, and unpivotr were predominantly used for these purposes.

Puzzle #227 – Employee Data Analysis

In the first puzzle, the author describes a situation where two tables are provided, one comprised of employee data and another stipulating conditional assignment of salary. Expressions replacing a character or set of characters were used within this placeholder notation but reveals that this notation doesn’t support in R. Therefore, the author modifies it in the T2 table employing his own BRAIN.API placeholder notation.

Puzzle #228 – Tables Transformation

In the second puzzle, the utilization of table transformation is discussed. The problem pertains to situations where tables need to be reorganized, modified or transformed notably when multiple headers are involved. The “unpivotr” library comes handy in this case. It is responsible for “beheading” certain rows and columns and then reorganizing them in a more acceptable order. The mention of “behead” suggests the function in the library responsible for these transformations.

Long-Term Implications and Possible Future Developments

With these puzzles, it is evident that R language provides a comprehensive and efficient solution for sorting, analyzing, and transforming data. The importance of the manipulation of data and understanding trends in employee salaries and organizational structures is critical for any enterprise. As such, the potential for these R solutions to increasingly be integrated and used in HR and administrative functions is high.

Predictions

R’s power for data transformation and analysis will see its adoption grow in industries with heavy reliance on quantifiable metrics such as the finance, insurance and healthcare sectors. Furthermore, big data growth will necessitate the development of more sophisticated and intuitive R libraries and functions.

Actionable Advice

Professionals using R language should continually learn and adapt as new libraries and functions become available that might make tasks more efficient. Equally, they should focus on industries where R is increasing in popularity and try to understand the industry-specific problems that R is being used to solve.

Peers

Share your practical applications or solutions involving R. This will encourage collaboration and joint problem-solving, improving skills and creating more efficient solutions. Understanding how others approach problems can provide new perspectives, that can improve problem-solving abilities.

Industry Professionals

Keep reaching out to industry professionals. This is an effective way to remain updated about how R is being deployed in different industries, which can introduce new career opportunities or provide valuable insights for problem-solving.

Continuous Learning

Finally, maintaining a state of continuous learning is key. New libraries and functions are being produced all the time, and investing effort in staying updated is vital to success as an R programmer.

Read the original article