Sampling values from the columns of a matrix without replacement

I have some experience with R, but I always struggle when writing new code. I found some very useful posts here while working on my current project, but I cannot figure out the next step. Here is what I have done so far:

  • Imported a 20x20 .csv of rankings; each column contains each integer from 1 to 20 exactly once, so all colSums are 210. The rowSums differ.

  • Used a post here to randomly select 4 rows from the original matrix and put them in a new 4x20 matrix.

Now I need to sample 5 columns from each row without replacement across columns. That is, I need to use each column only once and end up with five values in each row. (I have no strong preference between a matrix with the 20 values in their original positions and 60 zeros, or 4 vectors of 5 values. I think I would rather have a matrix.)

If context helps, I am trying to create groups based on students' rankings of topics in a class. Rows are topics and columns are voters (students). Ultimately, I want to generate these random assignments in a for loop and run the program many times, so that I can optimize the selection (by some measure; obviously there are different optimization methods) rather than eyeballing the original matrix, which is what I did in the past.

This is my 4x20 matrix:

     J  E  I  S  A  N  H  T  M  B  D  K  O  G  P  L  Q  R  F  C
 2   5  4  1  1  5 13  3  4 13 11 14 14 20  9 15  9 11 17  9 15
 13 20 19 17 19 19  7  4 19  7  1  5  1 17 15 10  6  7 14  6  3
 14 18  2 12 14 11 19 18 15 19  4  8 19  2  2 13  7  9  1 12 10
 18  4  7 18  5 12 18  2 20  6  7 16 15  5 18  1 13  2 18 14 16

This is (one version of) what I want:

     J  E  I  S  A  N  H  T  M  B  D  K  O  G  P  L  Q  R  F  C
 2   0  4  1  1  0  0  3  4  0  0  0  0  0  0  0  0  0  0  0  0
 13  0  0  0  0  0  7  0  0  0  1  5  1  0  0  0  0  0  0  0  3
 14  0  0  0  0 11  0  0  0  0  0  0  0  0  2  0  7  0  1 12  0
 18  4  0  0  0  0  0  0  0  6  0  0  0  5  0  1  0  2  0  0  0
4 answers

You can use apply. The following command randomly samples five values from each row and returns a matrix of the results:

 apply(mat, 1, sample, 5) 

You might want to transpose the returned matrix with t so that its rows correspond to the rows of the original matrix.
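
A minimal sketch of why the transpose is needed. The matrix `mat` below is made-up stand-in data (an assumption, not the asker's rankings); `apply` over rows returns one result per column, so the output comes back as 5x4:

```r
set.seed(42)  # only to make the sketch reproducible
# Hypothetical stand-in for the 4x20 matrix from the question
mat <- matrix(sample(20, 80, replace = TRUE), nrow = 4)

res <- apply(mat, 1, sample, 5)  # one column of results per row of mat
dim(res)                         # 5 4
res <- t(res)                    # transpose so rows line up with mat
dim(res)                         # 4 5
```

Because each row is sampled independently here, the same column can be picked for more than one row.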


If you want to use each column only once, you can use the following command:

 mat[cbind(seq(nrow(mat)), sample(ncol(mat), 5 * nrow(mat)))] 

It returns a vector of the selected values, ordered by draw (row 1, row 2, and so on, then row 1 again), not grouped by row.
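
If a compact matrix with one row of picks per original row is preferred over a flat vector, the result can be reshaped. This is a sketch on made-up data (`mat` and `picked` are illustrative names, not from the question), and it relies on the number of columns being exactly 5 times the number of rows:

```r
set.seed(1)
mat <- matrix(sample(20, 80, replace = TRUE), nrow = 4)  # hypothetical data

idx  <- cbind(seq(nrow(mat)), sample(ncol(mat), 5 * nrow(mat)))
vals <- mat[idx]

# Row indices in idx cycle 1, 2, 3, 4, 1, 2, ..., so filling a matrix
# column by column puts every value back in the row it came from.
picked <- matrix(vals, nrow = nrow(mat))
```

Here `picked[i, ]` holds the five values drawn from row `i` of `mat`, and `idx[, 2]` records which columns were used.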

To match the desired output format (matrix, including zeros and randomly selected values), you can use the following strategy:

 # create an index of the values to be kept
 idx <- cbind(seq(nrow(mat)), sample(ncol(mat), 5 * nrow(mat)))
 # create a new matrix of zeroes
 mat2 <- matrix(0, ncol = ncol(mat), nrow = nrow(mat))
 # copy the values from the original matrix to the new one
 mat2[idx] <- mat[idx]
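
Since the question mentions repeating the random assignment many times and keeping the best one, the strategy can be wrapped in a function. This is only a sketch: `sample_once` is a hypothetical helper, the data are made up, and "lowest total rank" is an assumed criterion standing in for whatever measure you actually want to optimize.

```r
# Hypothetical helper: keep k values per row, using every column at most once
sample_once <- function(mat, k) {
  stopifnot(k * nrow(mat) <= ncol(mat))
  cols <- sample(ncol(mat), k * nrow(mat))
  idx  <- cbind(rep(seq_len(nrow(mat)), times = k), cols)
  out  <- matrix(0, nrow = nrow(mat), ncol = ncol(mat), dimnames = dimnames(mat))
  out[idx] <- mat[idx]
  out
}

set.seed(123)
mat <- matrix(sample(20, 80, replace = TRUE), nrow = 4)  # made-up rankings

# Assumed criterion: a lower total rank means students got topics they prefer
runs  <- replicate(1000, sample_once(mat, 5), simplify = FALSE)
score <- vapply(runs, sum, numeric(1))
best  <- runs[[which.min(score)]]
```

`replicate` plays the role of the for loop here; swapping `sum` for another scoring function changes what "best" means without touching the sampling step.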

This should work:

 data <- matrix(sample(letters, 20 * 4, replace = TRUE), nrow = 4)  # create fake data
 cols <- sample(1:20)      # shuffle the order of the columns
 out <- matrix("", 4, 5)   # 5 letters for each of the 4 rows
 for (i in 1:4) {
   # row i takes the next block of 5 shuffled columns
   out[i, ] <- data[i, cols[1:5 + (i - 1) * 5]]
 }

Assuming your data.frame is called "x", here is a simple approach that builds a list of one-row data.frames and binds them back together with rbind.

Here is your data:

 x <- structure(list(
   J = c(5L, 20L, 18L, 4L), E = c(4L, 19L, 2L, 7L),
   I = c(1L, 17L, 12L, 18L), S = c(1L, 19L, 14L, 5L),
   A = c(5L, 19L, 11L, 12L), N = c(13L, 7L, 19L, 18L),
   H = c(3L, 4L, 18L, 2L), T = c(4L, 19L, 15L, 20L),
   M = c(13L, 7L, 19L, 6L), B = c(11L, 1L, 4L, 7L),
   D = c(14L, 5L, 8L, 16L), K = c(14L, 1L, 19L, 15L),
   O = c(20L, 17L, 2L, 5L), G = c(9L, 15L, 2L, 18L),
   P = c(15L, 10L, 13L, 1L), L = c(9L, 6L, 7L, 13L),
   Q = c(11L, 7L, 9L, 2L), R = c(17L, 14L, 1L, 18L),
   F = c(9L, 6L, 12L, 14L), C = c(15L, 3L, 10L, 16L)),
   .Names = c("J", "E", "I", "S", "A", "N", "H", "T", "M", "B",
              "D", "K", "O", "G", "P", "L", "Q", "R", "F", "C"),
   class = "data.frame", row.names = c("2", "13", "14", "18"))

And the sample:

 set.seed(1)
 temp <- matrix(sample(20), nrow = 4)
 do.call(rbind, lapply(1:4, function(y) {
   x[y, -temp[y, ]] <- 0
   x[y, ]
 }))
 #     J  E  I  S  A  N  H  T  M  B  D  K  O  G  P  L  Q  R  F  C
 # 2   0  0  0  1  0 13  0  0  0  0  0 14  0  0  0  0  0  0  9 15
 # 13 20  0  0  0  0  0  0 19  0  1  0  0  0 15  0  0  7  0  0  0
 # 14  0  0 12  0 11  0  0  0  0  0  8  0  0  0 13  0  0  1  0  0
 # 18  0  7  0  0  0  0  2  0  6  0  0  0  5  0  0 13  0  0  0  0

Using the Matrix package, we can easily do this from indices:

 library(Matrix)
 # X is assumed to be the 4x20 numeric matrix
 i <- sample(nrow(X), ncol(X), replace=TRUE)  # a random row for every column
 j <- seq(ncol(X))
 sparseMatrix(i,j,x=X[cbind(i,j)])

gives:

 > sparseMatrix(i,j,x=X[cbind(i,j)])
 4 x 20 sparse Matrix of class "dgCMatrix"
 [1,]  .  .  .  .  . 13  .  . 13  . 14  .  .  9  .  .  .  .  . 15
 [2,]  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  6  .
 [3,]  .  .  . 14 11  .  . 15  .  4  . 19  2  . 13  .  .  .  .  .
 [4,]  4  7 18  .  .  .  2  .  .  .  .  .  .  .  . 13  2 18  .  .
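
Note that drawing a row for each column independently uses every column once but does not guarantee five values per row (row 2 in the output above ends up with a single value). If exactly five per row is required, one option is to shuffle a balanced vector of row labels instead. The sketch below uses made-up data and a dense base R matrix so that it runs without extra packages; `X` and `out` are illustrative names.

```r
set.seed(7)
X <- matrix(sample(20, 80, replace = TRUE), nrow = 4)  # hypothetical data

# Balanced row labels: each row index appears ncol(X) / nrow(X) times,
# then the labels are shuffled across the columns.
i <- sample(rep(seq_len(nrow(X)), length.out = ncol(X)))
j <- seq_len(ncol(X))

out <- matrix(0, nrow(X), ncol(X))
out[cbind(i, j)] <- X[cbind(i, j)]
table(i)  # every row label appears exactly 5 times
```

The same `i` and `j` can be passed to `sparseMatrix(i, j, x = X[cbind(i, j)])` if a sparse result is preferred.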
