How to import Excel data into R using column name and row name

I am new to R and would like to know how to import Excel data into R using row names and column names. In particular, I need a subset of the data from several sheets in a single Excel file. Can I use row names and column names to identify and retrieve specific cells in R?

    Worksheet 1
    ----------
    *  X  Y  Z
    A  1  2  2
    B  1  1  1
    C  1  3  4
    D  4  2  2
    E  2  2  2
    ----------

    Worksheet 2
    ----------
    *  X  Y1 Z1
    A  1  2  2
    B  1  2  3
    C  1  3  4
    D  4  1  1
    E  2  1  1

For example, in the tables above, how could I extract the data (2,2,2,2) using the row and column names (D, Y) (D, Z) (E, Y) (E, Z) in sheet 1?

And how can I extract the data (1,1,1,1) using the row and column names (D, Y1) (D, Z1) (E, Y1) (E, Z1) in sheet 2?

Thanks for the help provided.

Barry

+7
2 answers

@Andrie mentioned the XLConnect package. It is a very useful I/O package between R and Excel, and it lets you select a region in an Excel worksheet.

I created an Excel file similar to yours in my Dropbox shared folder; you can download the example.xls file here.

    require(XLConnect)

    ## A5:C5 correspond to (D,Y) (D,Z) (E,Y) (E,Z) in your example
    selectworksheet1 <- readWorksheetFromFile("/home/ahmadou/Dropbox/Public/example.xls",
                                              sheet = "Worksheet1",
                                              region = "A5:C5",
                                              header = FALSE)
    selectworksheet1
    ##   Col0 Col1 Col2
    ## 1    2    2    2

    ## B4:C5 correspond to (D,Y1) (D,Z1) (E,Y1) (E,Z1) in the second example
    selectworksheet2 <- readWorksheetFromFile("/home/ahmadou/Dropbox/Public/example.xls",
                                              sheet = "Worksheet2",
                                              region = "B4:C5",
                                              header = FALSE)
    selectworksheet2
    ##   Col0 Col1
    ## 1    1    1
    ## 2    1    1

    unlist(selectworksheet2)
    ## Col01 Col02 Col11 Col12
    ##     1     1     1     1
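If the whole sheet is read into a data frame instead of a fixed region, selecting by row and column name becomes ordinary data frame indexing. The sketch below is only an illustration: the data frame is built by hand as a stand-in for Worksheet 1 from the question, and the label column name `id` is a hypothetical choice, not something produced by XLConnect.

```r
# Stand-in for Worksheet 1 after a full-sheet read (values from the question).
# "id" is a hypothetical name for the column that holds the row labels.
sheet1 <- data.frame(
  id = c("A", "B", "C", "D", "E"),
  X  = c(1, 1, 1, 4, 2),
  Y  = c(2, 1, 3, 2, 2),
  Z  = c(2, 1, 4, 2, 2),
  stringsAsFactors = FALSE
)

# Promote the label column to row names, then drop it.
rownames(sheet1) <- sheet1$id
sheet1$id <- NULL

# Select cells by row name and column name.
subset1 <- sheet1[c("D", "E"), c("Y", "Z")]
print(subset1)

# Flatten to a plain vector (column-major order).
values1 <- unname(unlist(subset1))
print(values1)
```

With a real file, `sheet1` would come from the read call rather than from `data.frame()`, and the rest of the indexing would stay the same.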
+8

There are several packages that provide functions for importing Excel data into R; see the R Data Import/Export manual.

I have found the xlsx package useful (it reads both .xls and .xlsx files). I do not believe that it accepts row and column names as input, but it does accept their numeric positions (e.g. row 1, column 4). In your case, something like this should work, assuming X, Y, and Z correspond to columns 1-3:

    library(xlsx)

    # First example subset (ss1): rows 4-5, columns 2-3 of sheet 1.
    # header = FALSE because the selected rows hold data, not column names;
    # read.xlsx treats the first row it reads as a header by default.
    ss1 <- read.xlsx("myfile.xlsx", sheetIndex = 1,
                     rowIndex = 4:5, colIndex = 2:3, header = FALSE)

    # Second example subset (ss2): the same call with sheetIndex = 2.
    ss2 <- read.xlsx("myfile.xlsx", sheetIndex = 2,
                     rowIndex = 4:5, colIndex = 2:3, header = FALSE)

However, you will need to experiment with your own file until everything works as expected. You can also select a sheet by name with sheetName, but I believe that sheetIndex usually works more reliably once you have the correct index for each sheet. And be careful if the first row is a header, since that shifts the row numbers.
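Because read.xlsx takes numeric positions, one way to start from names is to translate them with base R's `match()`. This is a minimal sketch, assuming the label vectors below list the labels in the same order as the sheet and that there is no header row or label column to offset for; `row_labels` and `col_labels` are hypothetical names introduced for illustration.

```r
# Assumed order of the labels in the sheet (hypothetical helper vectors).
row_labels <- c("A", "B", "C", "D", "E")
col_labels <- c("X", "Y", "Z")

# Translate the wanted names into numeric positions.
row_idx <- match(c("D", "E"), row_labels)
col_idx <- match(c("Y", "Z"), col_labels)

print(row_idx)  # 4 5
print(col_idx)  # 2 3
```

The resulting vectors could then be passed as rowIndex and colIndex. If the sheet has a header row or a column of labels, add 1 to the corresponding index vector.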

Having said all this, my preferred option would be to export the sheet to a text format such as CSV, use shell tools (cut, head, tail, etc.) to extract the required rows and columns, and then import the result into R.
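Once the sheet is in CSV form, the selection by name can also be done in R itself rather than with shell tools. The following is a rough sketch, assuming the export keeps the labels in the first column; the inline text stands in for an exported copy of Worksheet 2 from the question, and `csv_text` is a hypothetical name.

```r
# Inline stand-in for Worksheet 2 exported as CSV (values from the question).
csv_text <- "*,X,Y1,Z1
A,1,2,2
B,1,2,3
C,1,3,4
D,4,1,1
E,2,1,1"

# row.names = 1 turns the first column into row names.
sheet2 <- read.csv(text = csv_text, row.names = 1)

# Select cells by row name and column name.
subset2 <- sheet2[c("D", "E"), c("Y1", "Z1")]
print(subset2)
```

For a real export, replace the `text =` argument with the path to the CSV file.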

+2
