How to create a different report for each subset of a data frame with R Markdown?

I have a dataset that looks like

    City       Score  Count  Returns
    Dallas       2.9     61       21
    Phoenix      2.6     52       14
    Milwaukee    1.7     38        7
    Chicago      1.2     95       16
    Phoenix      5.9     96       16
    Dallas       1.9     45       12
    Dallas       2.7     75       45
    Chicago      2.2     75       10
    Milwaukee    2.6     12        2
    Milwaukee    4.5     32        0
    Dallas       1.9     65       12
    Chicago      4.9     95       13
    Chicago      5       45        5
    Phoenix      5.2     43        5

I would like to create a report using R Markdown, but I need a separate report for each city, because one city must not see another city's report. How can I create a report for each city and save it as a PDF?

Each report requires the median Score, the average Count, and the average Returns. I know that with dplyr I could just use

    finaldat <- dat %>%
      group_by(City) %>%
      summarise(Score = median(Score), Count = mean(Count), Return = mean(Returns))

But the difficulty is producing a separate report for every City. Each report also uses only that city's subset of the data, not the complete data. The report itself is extensive, and its structure is the same for every City; only the data differ.

1 answer

It looks like a parameterized report might be what you need. See the R Markdown documentation on parameterized reports for more details, but the main idea is that you declare a parameter in the YAML header of your R Markdown report and use that parameter within the report to customize it (for example, by filtering the data by City in your case). Then, in a separate R script, you render the report several times, once for each City value, which you pass as a parameter to the render function. Here is a basic example:

In the R Markdown report, you declare the parameter in the YAML header. The value listed, Dallas in this case, is the default used if no other value is supplied when rendering the report:

    ---
    title: My Document
    output: pdf_document
    params:
      My_City: Dallas
    ---

Then, in the same R Markdown document, you have the whole report: whatever calculations depend on City, plus the boilerplate that is the same for every City. You access the parameter with params$My_City. The code below filters the data frame to the current value of the My_City parameter:

```{r}
dat %>%
  filter(City == params$My_City) %>%
  summarise(Score = median(Score), Count = mean(Count), Return = mean(Returns))
```
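To try the filtering step outside of a rendered report, here is a minimal sketch in plain R. It assumes the data frame is built by hand from the table in the question, and it uses an ordinary list named `params` as a stand-in for the object that R Markdown supplies during rendering. It is written in base R so that it runs without dplyr; it is meant to mirror the chunk above, not to replace it.

```r
# Data copied from the table in the question.
dat <- data.frame(
  City = c("Dallas", "Phoenix", "Milwaukee", "Chicago", "Phoenix",
           "Dallas", "Dallas", "Chicago", "Milwaukee", "Milwaukee",
           "Dallas", "Chicago", "Chicago", "Phoenix"),
  Score = c(2.9, 2.6, 1.7, 1.2, 5.9, 1.9, 2.7, 2.2, 2.6, 4.5, 1.9, 4.9, 5, 5.2),
  Count = c(61, 52, 38, 95, 96, 45, 75, 75, 12, 32, 65, 95, 45, 43),
  Returns = c(21, 14, 7, 16, 16, 12, 45, 10, 2, 0, 12, 13, 5, 5)
)

# Stand-in for the params object (an assumption for this sketch only;
# inside a rendered report, R Markdown creates params for you).
params <- list(My_City = "Dallas")

# Keep only the rows for the selected city.
city_dat <- dat[dat$City == params$My_City, ]

# Same three statistics as in the report chunk.
city_summary <- data.frame(
  Score = median(city_dat$Score),
  Count = mean(city_dat$Count),
  Return = mean(city_dat$Returns)
)

print(city_summary)
```

Changing `My_City` in the stand-in list to another city name gives that city's summary, which is what passing a different parameter value does at render time.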

Then, in a separate R script, you would do something like the following to create a separate report for each City (I assume here that the R Markdown file above is called MyReport.Rmd):

    for (i in unique(dat$City)) {
      rmarkdown::render("MyReport.Rmd",
                        params = list(My_City = i),
                        output_file = paste0(i, ".pdf"))
    }

In the code above, I assumed that the dat data frame is in the global environment of the separate R script that renders MyReport.Rmd. However, you could also just supply a vector of city names instead of getting them from unique(dat$City).
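As a rough sketch of that second option, the script below loops over an explicit vector of city names. The `report_file` helper and the `reports` output folder are hypothetical names introduced only for illustration. The render call is guarded so that the script does nothing if rmarkdown or MyReport.Rmd is missing. PDF output additionally needs a LaTeX installation, and the report still expects `dat` to be available in the environment it is rendered from.

```r
# City names supplied directly instead of taken from unique(dat$City).
cities <- c("Dallas", "Phoenix", "Milwaukee", "Chicago")

# Hypothetical helper: build a file-system-friendly PDF name for a city,
# e.g. replacing spaces and punctuation with underscores.
report_file <- function(city) {
  paste0(gsub("[^A-Za-z0-9]+", "_", city), ".pdf")
}

# Render one report per city, but only if the prerequisites are present.
if (requireNamespace("rmarkdown", quietly = TRUE) && file.exists("MyReport.Rmd")) {
  # "reports" is an assumed folder name; create it if it does not exist.
  dir.create("reports", showWarnings = FALSE)
  for (city in cities) {
    rmarkdown::render(
      "MyReport.Rmd",
      params = list(My_City = city),
      output_file = report_file(city),
      output_dir = "reports"
    )
  }
}
```

Writing each PDF under its own file name keeps the cities' reports separate, so each file can be shared with only the city it belongs to.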
