Answer edit history
6
Wording fix; no code changes
answer
CHANGED
@@ -49,7 +49,7 @@
|
|
49
49
|
```
|
50
50
|
|
51
51
|
### Another idea
|
52
|
-
I think this approach would also work. It should be faster, although timing it per id may not be feasible. Also,
|
52
|
+
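I think this approach would also work. It should be faster, although timing it per id may not be feasible. Also, I am concerned about how much time the loop above takes.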
|
53
53
|
|
54
54
|
```python
|
55
55
|
import pandas as pd
|
5
Should be faster than iterrows (or so I intend)
answer
CHANGED
@@ -59,10 +59,10 @@
|
|
59
59
|
|
60
60
|
before_id = None
|
61
61
|
zero_points = []
|
62
|
-
for i,
|
62
|
+
for i, id_ in df["id"].iteritems():
|
63
|
-
if
|
63
|
+
if id_ != before_id:
|
64
64
|
zero_points.append(i)
|
65
|
-
before_id =
|
65
|
+
before_id = id_
|
66
66
|
|
67
67
|
df.loc[:,"delta"] = df.loc[:,"date"].diff()
|
68
68
|
df.loc[zero_points,"delta"] = pd.Timedelta(0)
|
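The loop in this revision only records the rows where `id` changes, so the same boundaries can probably be found without a Python-level loop. The following is a minimal sketch, assuming (as the loop itself does) that rows with the same `id` are contiguous. The name `is_group_start` is made up for illustration, and `df["date"] = ...` is used instead of `df.loc[:, "date"] = ...` so that the column reliably becomes a datetime column. Note also that `Series.iteritems()` was removed in pandas 2.0; `Series.items()` is the replacement if the loop is kept.

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["a", "a", "a", "b", "b", "b"],
    "date": ["2018/01/23", "2018/01/24", "2018/01/26",
             "2018/01/23", "2018/01/26", "2018/01/30"],
})
df["date"] = pd.to_datetime(df["date"])

# True wherever the id differs from the previous row. Row 0 is always True
# because shift() leaves a missing value in the first position.
is_group_start = df["id"] != df["id"].shift()
zero_points = df.index[is_group_start].tolist()

# Difference to the previous row over the whole column, then reset the
# first row of each id to zero, as in the loop-based version.
df["delta"] = df["date"].diff()
df.loc[is_group_start, "delta"] = pd.Timedelta(0)
print(df)
```

Whether this is actually faster than the loop depends on the data size, so it is worth timing on the real data.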
4
Addendum
answer
CHANGED
@@ -46,4 +46,35 @@
|
|
46
46
|
5 2018-01-30 b 4 days
|
47
47
|
"""
|
48
48
|
|
49
|
+
```
|
50
|
+
|
51
|
+
### Another idea
|
52
|
+
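I think this approach would also work. It should be faster, although timing it per id may not be feasible. Also, I am concerned about how much time the preceding loop takes.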
|
53
|
+
|
54
|
+
```python
|
55
|
+
import pandas as pd
|
56
|
+
|
57
|
+
df = pd.DataFrame({'id':['a','a','a','b','b','b'],'date':['2018/01/23','2018/01/24','2018/01/26','2018/01/23','2018/01/26','2018/01/30']})
|
58
|
+
df.loc[:,'date'] = pd.to_datetime(df['date'])
|
59
|
+
|
60
|
+
before_id = None
|
61
|
+
zero_points = []
|
62
|
+
for i, row in df.iterrows():
|
63
|
+
if row["id"] != before_id:
|
64
|
+
zero_points.append(i)
|
65
|
+
before_id = row["id"]
|
66
|
+
|
67
|
+
df.loc[:,"delta"] = df.loc[:,"date"].diff()
|
68
|
+
df.loc[zero_points,"delta"] = pd.Timedelta(0)
|
69
|
+
print(df)
|
70
|
+
|
71
|
+
"""
|
72
|
+
date id delta
|
73
|
+
0 2018-01-23 a 0 days
|
74
|
+
1 2018-01-24 a 1 days
|
75
|
+
2 2018-01-26 a 2 days
|
76
|
+
3 2018-01-23 b 0 days
|
77
|
+
4 2018-01-26 b 3 days
|
78
|
+
5 2018-01-30 b 4 days
|
79
|
+
"""
|
49
80
|
```
|
3
Minor change
answer
CHANGED
@@ -23,7 +23,7 @@
|
|
23
23
|
|
24
24
|
```
|
25
25
|
|
26
|
-
###### fillna
|
26
|
+
###### Version that (supposedly) cuts the overhead of calling fillna every time
|
27
27
|
```python
|
28
28
|
import pandas as pd
|
29
29
|
|
2
Addendum
answer
CHANGED
@@ -21,4 +21,29 @@
|
|
21
21
|
5 2018-01-30 b 4 days
|
22
22
|
"""
|
23
23
|
|
24
|
+
```
|
25
|
+
|
26
|
+
###### Version that (supposedly) cuts the fillna overhead
|
27
|
+
```python
|
28
|
+
import pandas as pd
|
29
|
+
|
30
|
+
df = pd.DataFrame({'id':['a','a','a','b','b','b'],'date':['2018/01/23','2018/01/24','2018/01/26','2018/01/23','2018/01/26','2018/01/30']})
|
31
|
+
df.loc[:,'date'] = pd.to_datetime(df['date'])
|
32
|
+
|
33
|
+
grp = df.groupby('id')
|
34
|
+
for grp_name, grp_idx in grp.groups.items():
|
35
|
+
df.loc[grp_idx,'delta'] = df.loc[grp_idx,'date'].diff()
|
36
|
+
|
37
|
+
df.fillna(0, inplace=True)
|
38
|
+
print(df)
|
39
|
+
"""
|
40
|
+
date id delta
|
41
|
+
0 2018-01-23 a 0 days
|
42
|
+
1 2018-01-24 a 1 days
|
43
|
+
2 2018-01-26 a 2 days
|
44
|
+
3 2018-01-23 b 0 days
|
45
|
+
4 2018-01-26 b 3 days
|
46
|
+
5 2018-01-30 b 4 days
|
47
|
+
"""
|
48
|
+
|
24
49
|
```
|
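The per-group loop in this revision could likely be collapsed into a single grouped `diff()` call. This is a sketch under the assumption that the goal is just the per-`id` difference with the first row of each group set to zero. It fills with `pd.Timedelta(0)` rather than the integer `0`, since filling a timedelta column with an integer is not handled the same way across pandas versions.

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["a", "a", "a", "b", "b", "b"],
    "date": ["2018/01/23", "2018/01/24", "2018/01/26",
             "2018/01/23", "2018/01/26", "2018/01/30"],
})
df["date"] = pd.to_datetime(df["date"])

# diff() is evaluated inside each id group, so the first row of every
# group has no predecessor and comes back as NaT.
delta = df.groupby("id")["date"].diff()

# Replace those NaT values with a zero-length timedelta in one pass.
df["delta"] = delta.fillna(pd.Timedelta(0))
print(df)
```

Unlike the boundary-detection idea, this does not require rows with the same `id` to be adjacent.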
1
Fix
answer
CHANGED
@@ -8,7 +8,7 @@
|
|
8
8
|
|
9
9
|
grp = df.groupby('id')
|
10
10
|
for grp_name, grp_idx in grp.groups.items():
|
11
|
-
df.loc[grp_idx,'delta'] = df.loc[
|
11
|
+
df.loc[grp_idx,'delta'] = df.loc[grp_idx,'date'].diff().fillna(0)
|
12
12
|
|
13
13
|
print(df)
|
14
14
|
"""
|
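Revision 5 claims that iterating over the `id` column should be faster than `iterrows`. A rough way to check that claim is sketched below. The data (200 ids with 10 rows each) and the function names `starts_iterrows` and `starts_items` are made up for illustration, and `Series.items()` is used in place of `iteritems()`, which newer pandas versions no longer provide.

```python
import timeit

import pandas as pd

# Hypothetical test data: 200 ids, each occupying 10 consecutive rows.
n_ids, rows_per_id = 200, 10
df = pd.DataFrame({
    "id": [f"id{k}" for k in range(n_ids) for _ in range(rows_per_id)],
})


def starts_iterrows(frame):
    """Group start positions found by looping over whole rows."""
    before_id, points = None, []
    for i, row in frame.iterrows():
        if row["id"] != before_id:
            points.append(i)
        before_id = row["id"]
    return points


def starts_items(frame):
    """Group start positions found by looping over the id column only."""
    before_id, points = None, []
    for i, id_ in frame["id"].items():
        if id_ != before_id:
            points.append(i)
        before_id = id_
    return points


# Both loops must agree before their timings are worth comparing.
assert starts_iterrows(df) == starts_items(df)

t_iterrows = timeit.timeit(lambda: starts_iterrows(df), number=3)
t_items = timeit.timeit(lambda: starts_items(df), number=3)
print(f"iterrows: {t_iterrows:.4f} s, items: {t_items:.4f} s")
```

The absolute numbers depend on the machine and the pandas version, so only the relative difference on data of realistic size is meaningful.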