I have spent a lot of time searching and cannot find a solution to my specific question. I would really appreciate any help.
I have a large data.frame (1258 observations of 298 variables), where each row is a sample from a participant and each column is a specific bacterial genus found in that sample. I have several entries for each participant, and the time point of each entry is indicated in its own column.
Here is an example of what the data frame looks like.
Corynebacterium <- c(0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 0.5, 0.7, 0.1, 0.0)
Paenibacillus <- c(0.0, 0.1, 0.7, 0.3, 0.5, 0.7, 0.0, 0.0, 0.0, 0.3, 0.3, 0.0)
Psychrobacter <- c(0.1, 0.1, 0.5, 0.0, 0.0, 0.0, 0.3, 0.6, 0.0, 0.6, 0.7, 0.0)
Staphylocccus <- c(0.5, 0.0, 0.3, 0.0, 0.3, 0.2, 0.5, 0.0, 0.4, 0.1, 0.1, 0.5)
TimePoint <- c("A", "B", "C", "D", "E", "F", "A", "B", "C", "D", "E", "F")
SampleDF <- data.frame(Corynebacterium, Paenibacillus, Psychrobacter,
Staphylocccus, TimePoint)
I would like to know, for each genus, the proportion of nonzero cells out of the total number of cells at a given time point.
For example, for Corynebacterium at TimePoint A: # nonzero cells / total # cells = 1/2 = 0.5, so 50% of the Corynebacterium cells at TimePoint A are nonzero.
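To make the target concrete, here is a minimal sketch of the calculation in base R. It relies on the fact that `mean(x != 0)` equals the number of nonzero cells divided by the total number of cells. It assumes every column other than `TimePoint` is a genus column; the names `genus_cols` and `NonZeroProp` are made up for illustration. This is only one possible approach, not necessarily the best one for the full 298-column data.

```r
# Rebuild the example data so this block runs on its own.
SampleDF <- data.frame(
  Corynebacterium = c(0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 0.5, 0.7, 0.1, 0.0),
  Paenibacillus   = c(0.0, 0.1, 0.7, 0.3, 0.5, 0.7, 0.0, 0.0, 0.0, 0.3, 0.3, 0.0),
  Psychrobacter   = c(0.1, 0.1, 0.5, 0.0, 0.0, 0.0, 0.3, 0.6, 0.0, 0.6, 0.7, 0.0),
  Staphylocccus   = c(0.5, 0.0, 0.3, 0.0, 0.3, 0.2, 0.5, 0.0, 0.4, 0.1, 0.1, 0.5),
  TimePoint       = c("A", "B", "C", "D", "E", "F", "A", "B", "C", "D", "E", "F")
)

# Assumption: every column except the grouping variable is a genus.
genus_cols <- setdiff(names(SampleDF), "TimePoint")

# For each time point and genus, mean(x != 0) is
# (number of nonzero cells) / (total number of cells).
NonZeroProp <- aggregate(
  SampleDF[genus_cols],
  by  = list(TimePoint = SampleDF$TimePoint),
  FUN = function(x) mean(x != 0)
)

print(NonZeroProp)
```

The result has one row per time point and one column per genus, so the Corynebacterium value in the TimePoint A row is the 1/2 = 0.5 from the example above.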