I am writing a function that compares a dataframe (df) with a series (S) and returns a new dataframe. The common column is "name". I want the function to return a dataframe with the same number of rows as the series (S) and the same number of columns as df. The function should search the name column in df and find all matching names in the series (S). If a match is found, I want to create a row in the new dataframe that mirrors the df row for that name. If no match is found, I still want to create a row in the resulting dataframe, but with 0.0 in every cell of that row. I have been trying to figure this out for the last 6 hours. I think I am running into broadcasting issues. Here is what I have tried.
Here are some sample data.
Series:
S[500:505]
500 Nanotechnology
501 Music
502 Logistics & Supply Chain
503 Computer & Network Security
504 Computer Software
Name: name, dtype: object
DataFrame (it also has a name column):
   Defense & Space  Computer Software  Internet  Semiconductors  \
0              1.0                0.0       0.0             0.0
1              0.0                1.0       0.5             0.5
2              0.0                0.5       1.0             0.5
3              0.0                0.5       0.5             1.0
4              0.5                0.0       0.0             0.0
S.shape = (31454,)
df.shape = (100,101)
all_zeros = np.zeros((len(S),len(df.columns)))
Convert the numpy array to a dataframe:
result = pd.DataFrame(data=all_zeros, columns=df.columns, index=range(len(S)))
result = result.drop('name', axis=1)
def set_cell_values(row):
    return df.iloc[1, :]
for index in range(len(df)):
    names_are_equal = df['name'][index] == result['name']
    map(lambda x: set_cell_values(row), result[names_are_equal])
How can I fill each row of result with the matching row of df (or leave it as zeros when there is no match)?
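For reference, the behavior described at the top can be sketched without an explicit loop by indexing df on name and calling reindex with the names from S. This is only a minimal sketch: the toy S and df below are made-up stand-ins for the real data, align_to_series is a hypothetical helper name, and it assumes the names in df are unique.

```python
import pandas as pd

# Toy stand-ins for the real data (hypothetical values, not from the post).
S = pd.Series(
    ["Nanotechnology", "Music", "Computer Software", "Internet"],
    name="name",
)
df = pd.DataFrame(
    {
        "name": ["Computer Software", "Internet"],
        "Computer Software": [1.0, 0.5],
        "Internet": [0.5, 1.0],
    }
)


def align_to_series(df, S):
    """Return one row per entry of S: the matching df row, or all 0.0."""
    # Index df by name so rows can be looked up by label.
    # Assumption: names in df are unique (reindex fails on duplicate labels).
    lookup = df.set_index("name")
    # Pull the row for each name in S; names missing from df become 0.0 rows.
    aligned = lookup.reindex(S.to_numpy(), fill_value=0.0)
    # Restore S's own index and put the name column back in front.
    aligned.index = S.index
    aligned.insert(0, "name", S.to_numpy())
    return aligned


result = align_to_series(df, S)
print(result)
```

The result has len(S) rows and the same columns as df. Repeated names in S are fine with this approach; only duplicates in df's name column would need to be resolved first.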