Ranking data columns in R

Question

Ranking data columns in R

I have a data frame, below are examples of data from it.

Company     Category    Margin
SBI             BK      34.5
PNB             BK      39.5
UCO BANK        BK      39.9
ANDHRA BANK     BK      41.3
INDIAN BANK     BK      42.3
DENA BANK       BK      44.5
VIJAYA BANK     BK      44.5
UNION BANK      BK      47.6
CENTRAL BANK    BK      49.8
INFOSYS         IT      5.6
HCL TECH        IT      5.9
TCS             IT      6.9
CMC             IT      12.6
TECHMAHINDRA    IT      12.6
COGNIZANT       IT      15.8
IGATE           IT      22.4
WIPRO           IT      22.9
HEXAWARE        IT      34.8
MAHINDRA SATYAM IT      34.8
DR. REDDYS      PH      14.5
SUN PHARMA      PH      19.2
CIPLA           PH      23.9
LUPIN           PH      23.9
DIVIS LABS      PH      29

A careful look at the data frame indicates that it is sorted into CATEGORY, MARGIN, and then COMPANY categories.

Now my requirement is to add a new column called Ranking and assign a rating starting at 1 for each CATEGORY set. The rating numbering should begin with 1 when a new CATEGORY appears in the list.

Output result:

Company     Category    Margin     Ranking
SBI             BK      34.5       1
PNB             BK      39.5       2
UCO BANK        BK      39.9       3 
ANDHRA BANK     BK      41.3       4
INDIAN BANK     BK      42.3       5
DENA BANK       BK      44.5       6
VIJAYA BANK     BK      44.5       7
UNION BANK      BK      47.6       8
CENTRAL BANK    BK      49.8       9
INFOSYS         IT      5.6        1
HCL TECH        IT      5.9        2
TCS             IT      6.9        3
CMC             IT      12.6       4
TECHMAHINDRA    IT      12.6       5
COGNIZANT       IT      15.8       6
IGATE           IT      22.4       7
WIPRO           IT      22.9       8
HEXAWARE        IT      34.8       9
MAHINDRA SATYAM IT      34.8       10
DR. REDDYS      PH      14.5       1
SUN PHARMA      PH      19.2       2
CIPLA           PH      23.9       3
LUPIN           PH      23.9       4
DIVIS LABS      PH      29         5

Additional requirements

Assume that the input data set is completely zigzag. Then

unique(df$Category)   # gives 5 different category
[1] "BK" "IT" "PH" "MT" "EG"

After formatting, the same one returns

unique(df$Category)   # gives only 3 categories. rest of 2 categories were deleted.
[1] "BK" "IT" "PH"

Note: In the process of formatting the input dataset to prepare it for missing values, several categories have been deleted.

.. dataframe

, Ranking . . , - , , .

head(companyRanks(3), 4) returns
    COMPANY     CATEGORY
BK  UCO BANK        BK      
IT  TCS             IT      
PH  CIPLA           PH      
MT  <NA>            MT
EG  <NA>            EG

head(companyRanks(10), 4)  # returns:
            COMPANY     CATEGORY
BK             <NA>           BK  # Since there is no company with rank 10 under category BK, NA returned
IT  MAHINDRA SATYAM           IT      
PH             <NA>           PH      
MT             <NA>           MT
EG             <NA>           EG

- , ?

+4

r dataframe ranking

Kumar 16 . '13 12:18

1

Sophia · Answer 1 · 2013-10-16T12:28:00+0000

, dataframe df, :

df$Ranking <- ave( df$Margin, df$Category, FUN=rank )

Ranking data columns in R

More articles: