Answer edit history
1
Added an alternative solution
answer
CHANGED
@@ -21,4 +21,37 @@
4 2018-01-26 b 3 days
5 2018-01-30 b 4 days
"""
```

#### Alternative solution: scanning from the top
When there are many unique ids and only a few rows per id, scanning the rows from the top and computing the difference as shown below may be faster.
```Python
import pandas as pd

df = pd.DataFrame({'id': ['a', 'a', 'a', 'b', 'b', 'b'],
                   'date': ['2018/01/23', '2018/01/24', '2018/01/26',
                            '2018/01/23', '2018/01/26', '2018/01/30']},
                  columns=['id', 'date'])
df['date'] = pd.to_datetime(df['date'])
# Initialize with a zero Timedelta so the column keeps a timedelta dtype
# (initializing with the integer 0 leaves a mixed-type column).
df['delta'] = pd.Timedelta(0)

prev_id, prev_date = df.loc[0, 'id'], df.loc[0, 'date']
for idx, row in df.iterrows():
    cur_id = row['id']
    cur_date = row['date']
    # Take the difference only while still inside the same id;
    # the first row of each id keeps the initial zero.
    if prev_id == cur_id:
        df.loc[idx, 'delta'] = cur_date - prev_date
    prev_id = cur_id
    prev_date = cur_date

print(df)
"""
  id       date  delta
0  a 2018-01-23 0 days
1  a 2018-01-24 1 days
2  a 2018-01-26 2 days
3  b 2018-01-23 0 days
4  b 2018-01-26 3 days
5  b 2018-01-30 4 days
"""
```
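As a rough sketch of the same top-down scan, the loop can also collect the differences in a plain Python list instead of writing to the DataFrame row by row, which may reduce the per-row overhead. The helper name `delta_by_scan` is hypothetical (not from the original answer), and the sketch assumes rows of the same id are contiguous and already sorted by date. The `groupby(...).diff()` call at the end is only there as a cross-check of the result.

```Python
import pandas as pd


def delta_by_scan(df):
    """Hypothetical helper: per-id date differences in one pass from the top.

    Assumes rows with the same id are contiguous and sorted by date.
    """
    deltas = []
    prev_id, prev_date = None, None
    for cur_id, cur_date in zip(df['id'], df['date']):
        if cur_id == prev_id:
            # Same id as the previous row: difference to the previous date
            deltas.append(cur_date - prev_date)
        else:
            # First row of a new id: no previous date to compare with
            deltas.append(pd.Timedelta(0))
        prev_id, prev_date = cur_id, cur_date
    return pd.Series(deltas, index=df.index)


df = pd.DataFrame({'id': ['a', 'a', 'a', 'b', 'b', 'b'],
                   'date': ['2018/01/23', '2018/01/24', '2018/01/26',
                            '2018/01/23', '2018/01/26', '2018/01/30']},
                  columns=['id', 'date'])
df['date'] = pd.to_datetime(df['date'])
df['delta'] = delta_by_scan(df)

# Cross-check against a groupby-based computation of the same quantity
expected = df.groupby('id')['date'].diff().fillna(pd.Timedelta(0))
print((df['delta'] == expected).all())
```

Whether this is actually faster than a grouped computation depends on the data, so it is worth timing both on a realistic sample before choosing one.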