Answer edit history
6
Wording fix; no code changes
test
CHANGED
@@ -100,7 +100,7 @@
 
 ### Another idea
 
-I think this would work too. It should be faster, but measuring per id may not be possible. Also, I'm concerned about how much time the previous loop takes.
+I think this would work too. It should be faster, but measuring per id may not be possible. Also, I'm concerned about how much time the loop above takes.
 
 
 
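One way to look into the concern about the loop's cost is simply to time it. The following is a minimal sketch, not part of the original answer: the synthetic data, the row count, and the helper name `find_boundaries` are all hypothetical, and it iterates with `Series.items()`.

```python
import time

import pandas as pd

# Hypothetical data: 10,000 rows, 100 consecutive rows per id.
n_rows = 10_000
df = pd.DataFrame({"id": [f"id{k // 100}" for k in range(n_rows)]})


def find_boundaries(ids):
    """Return the index labels where the id differs from the previous row."""
    before_id = None
    zero_points = []
    for i, id_ in ids.items():
        if id_ != before_id:
            zero_points.append(i)
        before_id = id_
    return zero_points


# Time only the boundary-detection loop.
start = time.perf_counter()
zero_points = find_boundaries(df["id"])
elapsed = time.perf_counter() - start

print(f"{len(zero_points)} boundaries found in {elapsed:.4f} s")
```

The measured time depends on the machine and on the real row count, so the number is only meaningful when compared against the other variants on the same data.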
5
Should be faster than iterrows (that is the intent)
test
CHANGED
@@ -120,13 +120,13 @@
 
 zero_points = []
 
-for i, row in df.iterrows():
+for i, id_ in df["id"].iteritems():
 
-    if row["id"] != before_id:
+    if id_ != before_id:
 
         zero_points.append(i)
 
-    before_id = row["id"]
+    before_id = id_
 
 
 
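Note that `Series.iteritems()` was removed in pandas 2.0, so on current versions this loop needs `Series.items()` instead. The sketch below shows that variant next to a loop-free alternative that compares each id with the previous row via `shift()`. The names `is_start` and `zero_points_vec` are assumptions introduced for illustration, and the data is assumed to be sorted so that rows of the same id are contiguous, as in the answer's example.

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["a", "a", "a", "b", "b", "b"],
    "date": pd.to_datetime([
        "2018/01/23", "2018/01/24", "2018/01/26",
        "2018/01/23", "2018/01/26", "2018/01/30",
    ]),
})

# Loop version: Series.items() yields (index label, value) pairs.
before_id = None
zero_points = []
for i, id_ in df["id"].items():
    if id_ != before_id:
        zero_points.append(i)
    before_id = id_

# Loop-free version: a row starts a new group when its id differs
# from the id of the row directly above it.
is_start = df["id"] != df["id"].shift()
zero_points_vec = df.index[is_start].tolist()

# Take the row-to-row difference, then reset it to zero at group starts.
df["delta"] = df["date"].diff().mask(is_start, pd.Timedelta(0))
print(df)
```

Both approaches find the same boundary rows; the second avoids the Python-level loop entirely.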
4
Addendum
test
CHANGED
@@ -95,3 +95,65 @@
 
 
 ```
+
+
+
+### Another idea
+
+I think this would work too. It should be faster, but measuring per id may not be possible. Also, I'm concerned about how much time the previous loop takes.
+
+
+
+```python
+
+import pandas as pd
+
+
+
+df = pd.DataFrame({'id':['a','a','a','b','b','b'],'date':['2018/01/23','2018/01/24','2018/01/26','2018/01/23','2018/01/26','2018/01/30']})
+
+df.loc[:,'date'] = pd.to_datetime(df['date'])
+
+
+
+before_id = None
+
+zero_points = []
+
+for i, row in df.iterrows():
+
+    if row["id"] != before_id:
+
+        zero_points.append(i)
+
+    before_id = row["id"]
+
+
+
+df.loc[:,"delta"] = df.loc[:,"date"].diff()
+
+df.loc[zero_points,"delta"] = pd.Timedelta(0)
+
+print(df)
+
+
+
+"""
+
+        date id  delta
+
+0 2018-01-23  a 0 days
+
+1 2018-01-24  a 1 days
+
+2 2018-01-26  a 2 days
+
+3 2018-01-23  b 0 days
+
+4 2018-01-26  b 3 days
+
+5 2018-01-30  b 4 days
+
+"""
+
+```
3
Minor change
test
CHANGED
@@ -48,7 +48,7 @@
 
 
 
-###### Version that reduces (supposedly) the fillna overhead
+###### Version that reduces (supposedly) the overhead of calling fillna every time
 
 ```python
 
2
Addendum
test
CHANGED
@@ -45,3 +45,53 @@
 
 
 ```
+
+
+
+###### Version that reduces (supposedly) the fillna overhead
+
+```python
+
+import pandas as pd
+
+
+
+df = pd.DataFrame({'id':['a','a','a','b','b','b'],'date':['2018/01/23','2018/01/24','2018/01/26','2018/01/23','2018/01/26','2018/01/30']})
+
+df.loc[:,'date'] = pd.to_datetime(df['date'])
+
+
+
+grp = df.groupby('id')
+
+for grp_name, grp_idx in grp.groups.items():
+
+    df.loc[grp_idx,'delta'] = df.loc[grp_idx,'date'].diff()
+
+
+
+df.fillna(0, inplace=True)
+
+print(df)
+
+"""
+
+        date id  delta
+
+0 2018-01-23  a 0 days
+
+1 2018-01-24  a 1 days
+
+2 2018-01-26  a 2 days
+
+3 2018-01-23  b 0 days
+
+4 2018-01-26  b 3 days
+
+5 2018-01-30  b 4 days
+
+"""
+
+
+
+```
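As a possible alternative to looping over `grp.groups`, the per-group difference can be computed in a single call with the group-by `diff()` method. This is a sketch rather than the answer's own method: it fills the first row of each group with `pd.Timedelta(0)` instead of the integer `0`, which is a choice made here to keep the column as a timedelta type.

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["a", "a", "a", "b", "b", "b"],
    "date": ["2018/01/23", "2018/01/24", "2018/01/26",
             "2018/01/23", "2018/01/26", "2018/01/30"],
})
df["date"] = pd.to_datetime(df["date"])

# diff() within each id group gives NaT for the first row of every group;
# fill those once, after the group-wise computation.
df["delta"] = df.groupby("id")["date"].diff().fillna(pd.Timedelta(0))
print(df)
```

Because the grouping is handled by pandas, this version does not require rows of the same id to be adjacent, although the dates within each id still need to be in order for the differences to be meaningful.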
1
Fix
test
CHANGED
@@ -18,7 +18,7 @@
 
 for grp_name, grp_idx in grp.groups.items():
 
-    df.loc[grp_idx,'delta'] = df.loc[gr,'date'].diff().fillna(0)
+    df.loc[grp_idx,'delta'] = df.loc[grp_idx,'date'].diff().fillna(0)
 
 
 