I have a pandas DataFrame that looks like this:
import pandas as pd
d = {'some_col': ['A', 'B', 'C', 'D', 'E'],
     'alert_status': [1, 2, 0, 0, 5]}
df = pd.DataFrame(d)
Several tasks in my work require the same pandas operations, so I am starting to write standardized functions that take a DataFrame as a parameter and return something. Here's a simple one:
def alert_read_text(df, alert_status=None):
    if alert_status is None:
        print('Warning: A column name with the alerts must be specified')
    # Collapse every status >= 1 to 1 (this writes into df)
    alert_read_criteria = df[alert_status] >= 1
    df[alert_status].loc[alert_read_criteria] = 1
    alert_status_dict = {0: 'Not Read',
                         1: 'Read'}
    # Replace the numeric codes with text labels (this also writes into df)
    df[alert_status] = df[alert_status].map(alert_status_dict)
    return df[alert_status]
I want the function to return a Series, so that I can add it as a new column to an existing DataFrame:
df['alert_status_text'] = alert_read_text(df, alert_status='alert_status')
At present, the function does return the Series I want, but it also modifies the existing column. How can I write it so that the original column is not modified?
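For reference, here is a minimal sketch of the behavior I am after, assuming it is acceptable to build a new Series instead of assigning back into df. The name alert_read_text_copy is hypothetical, and raising an error instead of printing a warning is my own assumption. It may not be the most idiomatic approach:

```python
import pandas as pd


def alert_read_text_copy(df, alert_status=None):
    # Hypothetical non-mutating variant of alert_read_text.
    if alert_status is None:
        raise ValueError('A column name with the alerts must be specified')
    # The comparison builds a new boolean Series; nothing is assigned into df.
    is_read = df[alert_status] >= 1
    # Map the boolean flags to text labels, which yields another new Series.
    return is_read.map({False: 'Not Read', True: 'Read'})


df = pd.DataFrame({'some_col': ['A', 'B', 'C', 'D', 'E'],
                   'alert_status': [1, 2, 0, 0, 5]})
df['alert_status_text'] = alert_read_text_copy(df, alert_status='alert_status')
```

With this version, df['alert_status'] should keep its numeric values and only the new alert_status_text column holds the labels.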