I have a simple pandas DataFrame like this:

    import pandas as pd

    # Build the sample frame from a list of row dictionaries
    df = [
        {'col1': 'A', 'col2': 'B', 'col3': 'C', 'col4': '0'},
        {'col1': 'M', 'col2': '0', 'col3': 'M', 'col4': '0'},
        {'col1': 'B', 'col2': 'B', 'col3': '0', 'col4': 'B'},
        {'col1': 'X', 'col2': '0', 'col3': 'Y', 'col4': '0'},
    ]
    df = pd.DataFrame(df)
    # Fix the column order explicitly
    df = df[['col1', 'col2', 'col3', 'col4']]
    df

Which looks like this:

| col1 | col2 | col3 | col4 |
|------|------|------|------|
| A    | B    | C    | 0    |
| M    | 0    | M    | 0    |
| B    | B    | 0    | B    |
| X    | 0    | Y    | 0    |

Row by row, I want to replace duplicate values with the character "0". In other words, keep the first occurrence of each value in a row and replace every later repeat of it with "0", for example:

| col1 | col2 | col3 | col4 |
|------|------|------|------|
| A    | B    | C    | 0    |
| M    | 0    | 0    | 0    |
| B    | 0    | 0    | 0    |
| X    | 0    | Y    | 0    |

It seems so simple, but I'm stuck. Any nudge in the right direction would really be appreciated.
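For reference, here is a minimal sketch of one way this might be approached, assuming the frame above and that "duplicate" means any repeat of a value already seen earlier in the same row. The helper name `zero_repeats` and the variable `result` are hypothetical, chosen only for illustration. It relies on `Series.duplicated()`, which flags every occurrence after the first, and `Series.mask()`, which replaces the flagged positions.

```python
import pandas as pd

# Sample frame from the question
df = pd.DataFrame(
    [
        {"col1": "A", "col2": "B", "col3": "C", "col4": "0"},
        {"col1": "M", "col2": "0", "col3": "M", "col4": "0"},
        {"col1": "B", "col2": "B", "col3": "0", "col4": "B"},
        {"col1": "X", "col2": "0", "col3": "Y", "col4": "0"},
    ]
)[["col1", "col2", "col3", "col4"]]


def zero_repeats(row):
    """Hypothetical helper: keep the first occurrence of each value in a row,
    and replace every later repeat with the string "0"."""
    # duplicated() is True for each value that already appeared earlier in the row
    return row.mask(row.duplicated(), "0")


# axis=1 applies the helper to each row rather than each column
result = df.apply(zero_repeats, axis=1)
print(result)
```

Applying a Python function per row is simple to read but may be slow on large frames; for a small frame like this one, that trade-off seems acceptable.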