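I want to split my data into labels and features
I am predicting bento (boxed lunch) sales with Python.
When I tried to split the data into x and y,
I got an error.
Problem / error message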
KeyError Traceback (most recent call last)
~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2889 try:
-> 2890 return self._engine.get_loc(key)
2891 except KeyError:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'Y'
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
<ipython-input-3-063c85bc5a18> in <module>
7 lunch = pd.read_csv("train.csv", sep=";", encoding="utf-8")
8
----> 9 y = lunch["Y"]
10 x = lunch.drop("Y", axis=0)
11
~\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
2973 if self.columns.nlevels > 1:
2974 return self._getitem_multilevel(key)
-> 2975 indexer = self.columns.get_loc(key)
2976 if is_integer(indexer):
2977 indexer = [indexer]
~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2890 return self._engine.get_loc(key)
2891 except KeyError:
-> 2892 return self._engine.get_loc(self._maybe_cast_indexer(key))
2893 indexer = self.get_indexer([key], method=method, tolerance=tolerance)
2894 if indexer.ndim > 1 or indexer.size > 1:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'Y'
Relevant source code
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
lunch = pd.read_csv("train.csv", sep=";", encoding="utf-8")
# Label (target) column
y = lunch["Y"]
# Features: axis=1 drops the "Y" column (axis=0 would look for a row labeled "Y")
x = lunch.drop("Y", axis=1)
# train_test_split returns x_train, x_test, y_train, y_test, in that order
x_train, x_test, y_train, y_test = train_test_split(
x, y, test_size=0.25)
model = RandomForestClassifier()
Model = model.fit(x_train, y_train)
print("Training set score: {:.2f}".format(Model.score(x_train, y_train)))
print("Test set score: {:.2f}".format(Model.score(x_test, y_test)))
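For reference, here is a minimal sketch of the label/feature split on its own. It assumes a small made-up DataFrame in place of train.csv (the values and the choice of columns are illustrative only, not the real data). `drop(columns=...)` removes the label column, and `train_test_split` returns its pieces in the order x_train, x_test, y_train, y_test.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for train.csv; the values are made up for illustration.
lunch = pd.DataFrame({
    "Y": [90, 101, 118, 120, 130, 135, 145, 140],
    "soldout": [0, 1, 0, 1, 1, 0, 0, 1],
    "temperature": [19.8, 17.0, 15.5, 15.2, 16.1, 14.6, 17.9, 14.7],
})

y = lunch["Y"]               # label column
x = lunch.drop(columns="Y")  # every remaining column is treated as a feature

# Note the order of the returned values: both x parts first, then both y parts.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0)

print(x_train.shape, x_test.shape)
print(y_train.shape, y_test.shape)
```

Fixing `random_state` is only there to make the sketch reproducible; it is not required for the split to work.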
The CSV data looks roughly like this:
datetime Y week soldout name kcal remarks event payday weather precipitation temperature
0 2013-11-18 90 月 0 厚切りイカフライ NaN NaN NaN NaN 快晴 -- 19.8
1 2013-11-19 101 火 1 手作りヒレカツ NaN NaN NaN NaN 快晴 -- 17.0
2 2013-11-20 118 水 0 白身魚唐揚げ野菜あん NaN NaN NaN NaN 快晴 -- 15.5
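`KeyError: 'Y'` means pandas found no column named exactly `Y` in the DataFrame. One possible cause is a separator mismatch. This is an assumption, since the raw file is not shown here. If the file is comma-separated but read with `sep=";"`, the entire header line becomes a single column name. The sketch below uses a made-up in-memory CSV (not the real train.csv) to show how printing the column names exposes this.

```python
import io
import pandas as pd

# Hypothetical comma-separated text standing in for train.csv.
raw = "datetime,Y,week\n2013-11-18,90,Mon\n2013-11-19,101,Tue\n"

# With a separator that does not match the file, the header stays one string.
wrong = pd.read_csv(io.StringIO(raw), sep=";")
print(list(wrong.columns))

# With the matching separator, each field becomes its own column.
right = pd.read_csv(io.StringIO(raw), sep=",")
print(list(right.columns))

# Membership checks show which DataFrame can be indexed with "Y".
print("Y" in wrong.columns, "Y" in right.columns)
```

Printing `lunch.columns` right after `read_csv` is a quick way to confirm the exact column names, including case and stray whitespace, before indexing with `lunch["Y"]`.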
What I tried
I tried updating pandas, but was not able to.
Additional information (framework/tool versions, etc.)
Jupyter Notebook
Python 3.5