The third column in my numpy array is Age. In this column, about 75% of the entries are valid and 25% are blank. Column 2 is Gender, and using some manipulation I have calculated the average age of the men in my dataset to be 30. The average age of the women in my dataset is 28.
I want to replace all the blank Age values for men with 30, and all the blank Age values for women with 28.
However, I can't get this to work. Does anyone have a suggestion or know what I'm doing wrong?
Here is my code:
# my entire data set is stored in a numpy array defined as x
ismale = x[::,1]=='male'
maleAgeBlank = x[ismale][::,2]==''
x[ismale][maleAgeBlank][::,2] = 30
For some reason, when I finish running the code above and type x to display the data set, the blanks still exist even though I set them to 30. Please note that I cannot do x[maleAgeBlank], because that selection would include some female data, since the female rows have not been excluded yet.
Is there a way to get what I want? For some reason, if I do x[ismale][::,1] = 1 (setting the column with "male" to 1), that works, but x[ismale][maleAgeBlank][::,2] = 30 doesn't work.
Sample of the array:
array([['3', '1', '22', ..., '0', '7.25', '2'],
['1', '0', '38', ..., '0', '71.2833', '0'],
['3', '0', '26', ..., '0', '7.925', '2'],
...,
['3', '0', '', ..., '2', '23.45', '2'],
['1', '1', '26', ..., '0', '30', '0'],
['3', '1', '32', ..., '0', '7.75', '1']],
dtype='<U82')
array(['3', '1', '22', '1', '0', '7.25', '2'],
dtype='<U82')
Note that in the output above I had already changed column 2 to 0 for female and 1 for male.
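For reference, here is a minimal sketch of what is probably going on, assuming a small made-up array in place of the real data (the rows, the four-column layout, and the names is_male and age_blank are all hypothetical). Indexing with a boolean mask returns a copy, so an assignment made through chained indexing lands in a temporary array and never reaches x. Building both masks against the full array and combining them in a single indexing step assigns into x directly.

```python
import numpy as np

# Hypothetical miniature of the data set: column 1 is gender
# ('1' = male, '0' = female), column 2 is Age, '' marks a missing age.
x = np.array([
    ['3', '1', '22', '7.25'],
    ['1', '0', '',   '71.2833'],
    ['3', '1', '',   '7.925'],
    ['1', '0', '38', '30'],
])

# Both masks are computed on the full array, so they line up row for row.
is_male = x[:, 1] == '1'
age_blank = x[:, 2] == ''

# Chained indexing: x[is_male] is a copy, so this write is lost.
x[is_male][:, 2] = '30'

# Single indexing step with a combined mask: this writes into x itself.
x[is_male & age_blank, 2] = '30'
x[~is_male & age_blank, 2] = '28'

print(x[:, 2])
```

Because the array holds strings, the replacement values are written as '30' and '28' rather than as integers.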