You can group by the Id field:
In [11]: df
Out[11]:
   Id                 Date
0   1  2012-03-01 00:00:00
1   1  2013-04-08 00:00:00
2   2  2013-01-17 00:00:00
3   2  2013-05-04 00:00:00
4   2  2012-10-30 00:00:00
5   3  2013-01-03 00:00:00

In [12]: g = df.groupby('Id')
If you are not sure that the rows are in date order, you can do something along these lines:
In [13]: g.agg(lambda x: x.iloc[x.Date.argmax()])
Out[13]:
                   Date
Id
1   2013-04-08 00:00:00
2   2013-05-04 00:00:00
3   2013-01-03 00:00:00
which for each group grabs the row with the largest (latest) date (that is the argmax part).
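In newer pandas versions `Series.argmax` returns an integer position rather than an index label, so the line above may not behave the same way everywhere. A minimal sketch of the same idea, assuming the `Date` column has been parsed to datetimes (the frame below just rebuilds the example data), uses `idxmax` to find the label of the latest row in each group:

```python
import pandas as pd

# Rebuild the example frame from above; Date is parsed to datetimes (assumption)
df = pd.DataFrame({
    "Id": [1, 1, 2, 2, 2, 3],
    "Date": pd.to_datetime([
        "2012-03-01", "2013-04-08", "2013-01-17",
        "2013-05-04", "2012-10-30", "2013-01-03",
    ]),
})

# For each Id, idxmax gives the index label of the row with the latest Date
latest_idx = df.groupby("Id")["Date"].idxmax()

# Pull those rows out of the original frame and index the result by Id
latest = df.loc[latest_idx].set_index("Id")
print(latest)
```

Because the rows are selected by label from the original frame, this does not depend on the order in which the rows appear.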
If you knew that they were in order, you could simply take the last (or first) entry:
In [14]: g.last()
Out[14]:
                   Date
Id
1   2013-04-08 00:00:00
2   2012-10-30 00:00:00
3   2013-01-03 00:00:00
(Note: they are not in order here, so in this case it does not work! Id 2 gets 2012-10-30 instead of 2013-05-04.)
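One way to make the last-entry approach safe is to sort by date before grouping. This is a sketch under the assumption that `Date` holds real datetimes (or strings that sort chronologically); the frame below rebuilds the example data:

```python
import pandas as pd

# Rebuild the example frame; Date is parsed to datetimes (assumption)
df = pd.DataFrame({
    "Id": [1, 1, 2, 2, 2, 3],
    "Date": pd.to_datetime([
        "2012-03-01", "2013-04-08", "2013-01-17",
        "2013-05-04", "2012-10-30", "2013-01-03",
    ]),
})

# Sort by Date first, so the last row within each Id is the latest one
latest = df.sort_values("Date").groupby("Id").last()
print(latest)
```

Using `.first()` instead of `.last()` on the sorted frame would give the earliest date per Id.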
Andy Hayden