“Restructuring Data and Calculating Sales Proportionally: PowerQuery Puzzle Solved with R

[This article was first published on Numbers around us – Medium, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don’t.

#187–188

Puzzles

Author: ExcelBI

All files (xlsx with puzzle and R with solution) for each and every puzzle are available on my Github. Enjoy.

Puzzle #187

There are two ways of showing data and it really depends on what is the purpose of certain presentation. Sometimes we only need to show those data point that have any data in it, but sometimes we need to make something like empty data point, which means that we need to include in our report all needed dimension values, all labels, but still have no value or value replaced by 0 for this row.

And that is what the task today is about. We have sales values for different continents in different years, but we don’t have cases where sales were 0. But we need to restructure report to have it. With some twists because we need summary rows and empty rows, and… North America is abbreviated to NA, and you know what could it cause in data. Check my solution.

Loading libraries and data

library(tidyverse)
library(readxl)

input = read_excel("Power Query/PQ_Challenge_187.xlsx", range = "A1:C12")
test  = read_excel("Power Query/PQ_Challenge_187.xlsx", range = "E1:G30")

Transformation

all <- expand_grid(Continent = unique(sort(input$Continent)), Year = unique(sort(input$Year)))

result1 <- all %>%
  left_join(input, by = c("Continent", "Year")) %>%
  mutate(Sales = replace_na(Sales, 0),
         Year = as.character(Year))

years <- unique(sort(result1$Year))

empty_row <- tibble(Continent = NA, Year = NA, Sales = NA_real_)

totals <- map_dfr(years, ~ {
  yearly_data <- result1 %>%
    filter(Year == .x)
  total_row <- summarise(yearly_data, Continent = "TOTAL", Year = .x, Sales = sum(Sales))
  bind_rows(yearly_data, total_row, empty_row)
})

grand_total <- summarise(result1, Continent = "GRAND TOTAL", Year = "2010-2013", Sales = sum(Sales))

result <- bind_rows(totals, grand_total)

Validation

identical(result, test)
# [1] TRUE

Puzzle #188

Sometimes we are missing some dimension of data, because they are just aggregated into bigger sets. And we have it here. Sales were summarized by period, very irregular period. And we need that data per quarter. As we don’t know how exactly sales went, we need to calculate how many days of each quarters there was sales and assign money proportionally. Let dig in it.

Loading libraries and data

library(tidyverse)
library(readxl)

input = read_excel("Power Query/PQ_Challenge_188.xlsx", range = "A1:D4")
test  = read_excel("Power Query/PQ_Challenge_188.xlsx", range = "F1:H11")

Transformation

result = input %>%
  mutate(date = map2(`From Date`, `To Date`, seq, by = "day"),
         days = map_int(date, length),
         daily = Amount / days) %>%
  unnest(date) %>%
  mutate(quarter = quarter(date),
         year = year(date) %>% as.character() %>% str_sub(3, 4),
         Quarter = paste0("Q",quarter,"-",year)) %>%
  summarise(Amount = sum(daily) %>% round(0), .by = c(Store, Quarter))

Validation

identical(result, test)
# [1] TRUE

Feel free to comment, share and contact me with advices, questions and your ideas how to improve anything. Contact me on Linkedin if you wish as well.

PowerQuery Puzzle solved with R was originally published in Numbers around us on Medium, where people are continuing the conversation by highlighting and responding to this story.

To leave a comment for the author, please follow the link and comment on their blog: Numbers around us – Medium.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you’re looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don’t.

Continue reading: PowerQuery Puzzle solved with R

Long-Term Implications and Future Developments

The key points of the text are primarily based on the use of certain transformations in R on datasets. The text provides details of two significant puzzles related to data manipulation and analysis using R, one involving data representation with zero sales and the other dealing with the distribution of aggregated data into smaller sets.

Implications and Future Trends

As per the text, the way datasets are dealt with in R is changing dynamically. This opens up several long-term implications and potential future trends:

There will be a continuous need for the simplification of complex data manipulations, which might prompt more software development in this area.
There will be a necessity for professionals to learn and master R programming, especially for those dealing with sales data or any kind of structured or unstructured data.
We can expect more advanced libraries and more sophisticated built-in R functions that will ease the process of data transformation.
Continuous updates and enhancements can be expected in the future versions of R, improving its usability and functionality.
Automated data processing and data transformation techniques might become more prevalent setting up a trend in the future.

Actionable Advice

Considering the direction these developments are taking, here are some actionable insights:

Learn and adapt: There is a need for professionals to continually learn and adapt to the use of R and its vast array of libraries for data manipulation tasks.
Watch out for new libraries: Stay updated with the latest libraries and trends in R, there are always new and better ways of simplifying complex tasks.
Automate: Whenever possible, automate repetitive tasks using R’s scripting capabilities, this will save valuable time and reduce the likelihood of errors in data transformation processes.
Contribute: If you are an adept R user, consider contributing to open-source R projects or creating your own libraries. Not only will this enhance your profile, but it also assists the community in advancing the software’s capabilities.

Read the original article

“Restructuring Data and Calculating Sales Proportionally: PowerQuery Puzzle Solved with R

Puzzles

Puzzle #187

Loading libraries and data

Transformation

Validation

Puzzle #188

Loading libraries and data

Transformation

Validation

Long-Term Implications and Future Developments

Implications and Future Trends

Actionable Advice

Submit a Comment Cancel reply

Recent Posts

Recent Comments