Create a new pandas time series DataFrame from another DataFrame

How can I create a new pandas time series DataFrame from an existing df?

Say event A started on 11/28 11:35 and ended on 11/29 19:53, which counts as 1. A second instance of event A started on 11/28 11:37 and ended on 11/29 19:53, which counts as another 1, so I incremented the value of A to 2. (Sorry, the data entry of 11/28 was wrong; it should be 11/29.)

The source df specifies the start and end time of each event, and the same event can occur several times simultaneously. The new df should hold a time series of the total number of events for each minute in the range from min(Start-Time) to max(End-Time).

Source Df:

Start-Time       |  End-Time         | Event
11/28/2014 11:35 |  11/29/2014 19:53 | A
11/28/2014 11:36 |  11/28/2014 11:37 | B
11/28/2014 11:32 |  11/28/2014 19:53 | C
11/28/2014 11:37 |  11/28/2014 19:53 | A
......

New Df:

TimeStamp        | A |  B | C
11/28/2014 11:35 | 1 |  0 | 1
11/28/2014 11:36 | 1 |  1 | 1
11/28/2014 11:37 | 2 |  1 | 1
.....
11/29/2014 19:53 | 2 |  0 | 1
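To make the example reproducible, the four sample rows of the source table could be loaded into a DataFrame as below. This is only a sketch: the times are kept as plain strings exactly as written in the table, and the elided rows ("......") are not filled in.

```python
import pandas as pd

# The four visible rows of the source table; the elided rows are left out.
df = pd.DataFrame({
    "Start-Time": ["11/28/2014 11:35", "11/28/2014 11:36",
                   "11/28/2014 11:32", "11/28/2014 11:37"],
    "End-Time": ["11/29/2014 19:53", "11/28/2014 11:37",
                 "11/28/2014 19:53", "11/28/2014 19:53"],
    "Event": ["A", "B", "C", "A"],
})
print(df)
```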

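One way to do it is to treat each start as switching an event "on" (+1) and each end as switching it off (-1), and then take a cumulative sum (note: this gives one row per time at which something changes, not one row per minute):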

import pandas as pd

# wide to long: one row per (Event, Start-Time/End-Time) pair
df = pd.melt(df, id_vars="Event", var_name="Which", value_name="Time")
# a start switches the event on (+1), an end switches it off (-1)
df["Signal"] = df.pop("Which").map({"Start-Time": 1, "End-Time": -1})
pivoted = df.pivot(columns="Event", index="Time").fillna(0)
pivoted = pivoted.sort_index() # just in case; can't remember if this is guaranteed
df_out = pivoted.cumsum() + (pivoted == -1)

>>> df_out
                 Signal      
Event                 A  B  C
Time                         
11/28/2014 11:32      0  0  1
11/28/2014 11:35      1  0  1
11/28/2014 11:36      1  1  1
11/28/2014 11:37      2  1  1
11/28/2014 19:53      2  0  1
11/29/2014 19:53      1  0  0

Step by step: after the melt and the new "Signal" column, the frame looks like this:

>>> df
  Event              Time  Signal
0     A  11/28/2014 11:35       1
1     B  11/28/2014 11:36       1
2     C  11/28/2014 11:32       1
3     A  11/28/2014 11:37       1
4     A  11/29/2014 19:53      -1
5     B  11/28/2014 11:37      -1
6     C  11/28/2014 19:53      -1
7     A  11/28/2014 19:53      -1

After pivoting and filling the missing values with 0:

>>> pivoted
                 Signal      
Event                 A  B  C
Time                         
11/28/2014 11:32      0  0  1
11/28/2014 11:35      1  0  0
11/28/2014 11:36      0  1  0
11/28/2014 11:37      1 -1  0
11/28/2014 19:53     -1  0 -1
11/29/2014 19:53     -1  0  0

Taking the cumulative sum gives the running count:

>>> pivoted.cumsum()
                 Signal      
Event                 A  B  C
Time                         
11/28/2014 11:32      0  0  1
11/28/2014 11:35      1  0  1
11/28/2014 11:36      1  1  1
11/28/2014 11:37      2  0  1
11/28/2014 19:53      1  0  0
11/29/2014 19:53      0  0  0

Since an event should still be counted in the minute it ends, add 1 back wherever the signal was -1:

>>> pivoted.cumsum() + (pivoted == -1)
                 Signal      
Event                 A  B  C
Time                         
11/28/2014 11:32      0  0  1
11/28/2014 11:35      1  0  1
11/28/2014 11:36      1  1  1
11/28/2014 11:37      2  1  1
11/28/2014 19:53      2  0  1
11/29/2014 19:53      1  0  0
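The result above only has rows at the times where something changes, while the question asks for one row per minute between the earliest start and the latest end. A possible extension is sketched below. It is not part of the answer above and makes a few assumptions: times are parsed with the format `%m/%d/%Y %H:%M`, the -1 is placed one minute after each end (instead of the `pivoted == -1` correction) so that forward-filling stays correct, and `pivot_table` with a sum is used so that several instances sharing the same time do not raise an error. The names `signal`, `changes`, and `per_minute` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Start-Time": ["11/28/2014 11:35", "11/28/2014 11:36",
                   "11/28/2014 11:32", "11/28/2014 11:37"],
    "End-Time": ["11/29/2014 19:53", "11/28/2014 11:37",
                 "11/28/2014 19:53", "11/28/2014 19:53"],
    "Event": ["A", "B", "C", "A"],
})

fmt = "%m/%d/%Y %H:%M"  # assumed input format
starts = pd.to_datetime(df["Start-Time"], format=fmt)
ends = pd.to_datetime(df["End-Time"], format=fmt)

# +1 at the start minute, -1 one minute after the end,
# so the end minute itself is still counted as active.
signal = pd.concat([
    pd.DataFrame({"Time": starts, "Event": df["Event"], "Signal": 1}),
    pd.DataFrame({"Time": ends + pd.Timedelta(minutes=1),
                  "Event": df["Event"], "Signal": -1}),
])

# Summing handles several instances that share the same (Time, Event).
changes = signal.pivot_table(index="Time", columns="Event",
                             values="Signal", aggfunc="sum", fill_value=0)

# One row per minute from the earliest start to the latest end.
minutes = pd.date_range(starts.min(), ends.max(), freq="min")
per_minute = (changes.cumsum()
              .reindex(minutes, method="ffill")
              .fillna(0)
              .astype(int))

print(per_minute.loc["2014-11-28 11:35":"2014-11-28 11:38"])
```

With the sample rows as entered (second A ending on 11/28), `per_minute` has one row for every minute from 11/28 11:32 through 11/29 19:53.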

Similar to @DSM's answer: stack the start and end columns, then groupby and aggregate by length. Finally, pivot the result.

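import pandas as pd

start = [35, 36, 37, 36, 35]
end = [56, 56, 56, 58, 58]
events = ['A', 'B', 'C', 'A', 'A']

df = pd.DataFrame({'start': start, 'end': end, 'events': events})

# stack the 'start' and 'end' columns here
# (pd.concat replaces Series.append, which was removed in pandas 2.0)
new_df = pd.DataFrame({
    'times': pd.concat([df['start'], df['end']]),
    'events': pd.concat([df['events'], df['events']]),
})
stacked = new_df  # kept only so the stacked frame can be inspected

# count the rows for each (times, events) pair
new_df = new_df.groupby(['times', 'events']).size()

# massage the data frame to conform to desired output
new_df = (new_df.reset_index(name='count')
          .pivot(index='times', columns='events', values='count')
          .fillna(0).astype(int))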

The stacked data frame:

  events  times
0      A     35
1      B     36
2      C     37
3      A     36
4      A     35
0      A     56
1      B     56
2      C     56
3      A     58
4      A     58

After the groupby:

times  events
35     A         2
36     A         1
       B         1
37     C         1
56     A         1
       B         1
       C         1
58     A         2

And finally, after the pivot:

events  A  B  C
times          
35      2  0  0
36      1  1  0
37      0  0  1
56      1  1  1
58      2  0  0

, @DSM , , , append , . , .
