[Python] Time series analysis (random forest): meaning of a coding error.

Question edit history

2020/06/28 13:48

Posted

mango55

Score 22

test CHANGED Viewed

@@ -84,9 +84,9 @@
-Y_vars = target_col
+y = dataset[target_col]
-X_vars = feature_cols
+X = dataset[feature_cols]

2020/06/28 13:48

Posted

mango55

Score 22

test CHANGED Viewed

@@ -140,6 +140,8 @@
 ```
+Error after fixing the code (the error differs from the first one)
 KeyError                                  Traceback (most recent call last)
 ~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)

2020/06/28 11:58

Posted

mango55

Score 22

test CHANGED Viewed

@@ -138,7 +138,7 @@
 ```
-```Error details
+```
 KeyError                                  Traceback (most recent call last)
@@ -240,7 +240,9 @@
+```
+file_0627.csv
 |date|patient|6day_exclusion_rate|14day_exclusion_rate

Added corrections and additional details.

2020/06/27 02:08

Posted

mango55

Score 22

test CHANGED Viewed

@@ -28,9 +28,65 @@
 ```Python
+%matplotlib inline
+import matplotlib
+import matplotlib.pyplot as plt
+import numpy as np
+import pandas as pd
+from sklearn.linear_model import LinearRegression
+from sklearn.tree import DecisionTreeRegressor
+from sklearn.ensemble import RandomForestRegressor
+from sklearn.model_selection import GridSearchCV
+from sklearn.model_selection import train_test_split
+from sklearn.metrics import mean_squared_error
+dataset = pd.read_csv('file_0627.csv')
+dataset.head()
-Y_vars = dataset['patient']
+target_col = 'patient'
-X_vars = dataset['exclusion_rate']
+exclude_cols = ['date','patient','14day_exclusion_rate']
+feature_cols = []
+for col in dataset.columns:
+    if col not in exclude_cols:
+        feature_cols.append(col)
+X_train_val, X_test, y_train_val, y_test = \
+    train_test_split(X, y, test_size=0.3, random_state=1234)  # split 1
+X_train, X_val, y_train, y_val = \
+    train_test_split(X_train_val, y_train_val, test_size=0.3, random_state=1234)  # split 2
+Y_vars = target_col
+X_vars = feature_cols
@@ -86,7 +142,43 @@
 KeyError                                  Traceback (most recent call last)
+~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
+   2656             try:
+-> 2657                 return self._engine.get_loc(key)
+   2658             except KeyError:
+pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
+pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
+pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
+pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
+KeyError: 'patient'
+During handling of the above exception, another exception occurred:
+KeyError                                  Traceback (most recent call last)
-<ipython-input-31-84f18d09bd67> in <module>
+<ipython-input-57-82605815cc01> in <module>
       3
@@ -102,62 +194,56 @@
 ~\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
-   2932                 key = list(key)
-   2933             indexer = self.loc._convert_to_indexer(key, axis=1,
--> 2934                                                    raise_missing=True)
-   2935
-   2936         # take() does not accept boolean indexers
-~\Anaconda3\lib\site-packages\pandas\core\indexing.py in _convert_to_indexer(self, obj, axis, is_setter, raise_missing)
-   1352                 kwargs = {'raise_missing': True if is_setter else
-   1353                           raise_missing}
--> 1354                 return self._get_listlike_indexer(obj, axis, **kwargs)[1]
-   1355         else:
-   1356             try:
-~\Anaconda3\lib\site-packages\pandas\core\indexing.py in _get_listlike_indexer(self, key, axis, raise_missing)
-   1159         self._validate_read_indexer(keyarr, indexer,
-   1160                                     o._get_axis_number(axis),
--> 1161                                     raise_missing=raise_missing)
-   1162         return keyarr, indexer
-   1163
-~\Anaconda3\lib\site-packages\pandas\core\indexing.py in _validate_read_indexer(self, key, indexer, axis, raise_missing)
-   1244                 raise KeyError(
-   1245                     u"None of [{key}] are in the [{axis}]".format(
--> 1246                         key=key, axis=self.obj._get_axis_name(axis)))
-   1247
-   1248             # We (temporarily) allow for some missing keys with .loc, except in
-KeyError: "None of [Float64Index([           0.140526,            0.131246,            0.134081,\n                         0.258836,            0.183608,            0.121047,\n              0.12695399999999998,            0.130412,            0.129215,\n                        -0.000315,\n              ...\n                         0.133338,            0.120761, 0.20714499999999997,\n                         0.255416, 0.11556300000000001, 0.15191500000000002,\n                         0.136953,            0.140565,            0.132261,\n                         0.215802],\n             dtype='float64', length=4465)] are in the [columns]"
-```
+   2925             if self.columns.nlevels > 1:
+   2926                 return self._getitem_multilevel(key)
+-> 2927             indexer = self.columns.get_loc(key)
+   2928             if is_integer(indexer):
+   2929                 indexer = [indexer]
+~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
+   2657                 return self._engine.get_loc(key)
+   2658             except KeyError:
+-> 2659                 return self._engine.get_loc(self._maybe_cast_indexer(key))
+   2660         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
+   2661         if indexer.ndim > 1 or indexer.size > 1:
+pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
+pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
+pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
+pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
+KeyError: 'patient'
+|date|patient|6day_exclusion_rate|14day_exclusion_rate
+|2020/4/1|181|0.117179|0.130412
+|2020/4/2|186|0.17748|0.129215

2020/06/27 02:06

Posted

mango55

Score 22

test CHANGED Viewed

@@ -1 +1 @@
-[Python] Time series analysis (random forest): meaning of a coding error
+[Python] Time series analysis (random forest): meaning of a coding error.

test CHANGED Viewed

@@ -27,18 +27,6 @@
 ```Python
-rf = RandomForestRegressor(random_state=1234)
-rf.fit(X_train, y_train)
-y_pred = rf.predict(X_val)
-rf_mse = mean_squared_error(y_val, y_pred)
-print('Random Forest RMSE: ', np.sqrt(rf_mse))
 Y_vars = dataset['patient']

2020/06/25 13:26

Posted

mango55

Score 22

test CHANGED Viewed

@@ -1,4 +1,6 @@
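 I am forecasting a future number of people with time series analysis (random forest).
 Because the period covered by my data is short, I want to pick a specific date, use predictions of the counts for the previous day and the day before that as training data, and use the actual counts as test data.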

2020/06/25 13:22

Posted

mango55

Score 22

test CHANGED Viewed

@@ -1 +1 @@
-[Python] Meaning of an error
+[Python] Time series analysis (random forest): meaning of a coding error

test CHANGED Viewed

@@ -10,11 +10,11 @@
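 What I want to achieve
-① Create the training data: the predicted number of people for 3/31 and 3/30, made from the 4/1 data
+① Create the training data: predict the number of people for 3/31 and 3/30 from the 4/1 data
-② Make predictions with ① as the training data and the actual counts as the test data
+② Predict with ① as the training data and the actual counts as the test data
-③ Compute the RMSE with a random forest
+③ Use ① and ② to compute the RMSE with a random forest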
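
Step ③ asks for the RMSE, which is the square root of the mean of the squared differences between actual and predicted values. Below is a minimal sketch in plain Python; `rmse` is a hypothetical helper name and the numbers are illustrative only, not results from the question's data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error of two equal-length, non-empty sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative actual and predicted counts (not model output).
actual = [181, 186, 190]
predicted = [178, 190, 190]
print("RMSE:", rmse(actual, predicted))
```

This has the same form as `np.sqrt(mean_squared_error(y_val, y_pred))` from the earlier revision of the question, written out without NumPy or scikit-learn so each step is visible.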