About a KeyError (Python, machine learning, Kaggle Titanic competition)

Question

### Background / what I want to achieve

While building a survival prediction model for the Titanic (a Kaggle competition, written in Python), I got the error message below when implementing the data formatting for sex (Sex) and port of embarkation (Embarked). I am following this site (viewable by paid members only): https://aiacademy.jp/texts/show/?id=67&course=5176

The preprocessing of the dataset has two steps:

(1) Replace missing data with substitute values
(2) Convert string categorical data to numbers

The error occurs in step (2), where I use dummy variables for the two columns in the dataset that contain strings, Sex and Embarked.

### Problem / error message

```
KeyError                                  Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2645             try:
-> 2646                 return self._engine.get_loc(key)
   2647             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Sex'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
 in
----> 1 sex_dum = pd.get_dummies(df["Sex"])
      2 df = pd.concat((df,sex_dum),axis=1)
      3 df = df.drop("Sex",axis=1)
      4 df = df.drop("female",axis=1)
      5

/opt/conda/lib/python3.7/site-packages/pandas/core/frame.py in __getitem__(self, key)
   2798             if self.columns.nlevels > 1:
   2799                 return self._getitem_multilevel(key)
-> 2800             indexer = self.columns.get_loc(key)
   2801             if is_integer(indexer):
   2802                 indexer = [indexer]

/opt/conda/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2646                 return self._engine.get_loc(key)
   2647             except KeyError:
-> 2648                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2649         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2650         if indexer.ndim > 1 or indexer.size > 1:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Sex'
```

Error message: KeyError

### Relevant source code

```Python
import pandas as pd
df = pd.read_csv("/kaggle/input/titanic/train.csv")
df.head()

df.isnull().sum()

df["Age"].fillna(df["Age"].median(),inplace=True)

df =df.drop("Cabin",axis=1)

import matplotlib.pyplot as plt
import seaborn as sns

sns.countplot(x = df["Pclass"],hue = df["Survived"])
plt.show()

import numpy as np
edge = np.arange(0,100,10)

plt.hist((df[df["Survived"]==0]["Age"],df[df["Survived"]==1]["Age"]),histtype="barstacked",bins=edge,label=[0,1])
plt.legend(title="Survived")
plt.show()

df["Familysize"] =df['SibSp']+df['Parch']+1

pd.crosstab(df["Familysize"],df["Survived"],normalize='index').plot(kind="bar",stacked=True)
plt.show()

sex_dum = pd.get_dummies(df["Sex"])
df = pd.concat((df,sex_dum),axis=1)
df = df.drop("Sex",axis=1)
df = df.drop("female",axis=1)

emb_dum = pd.get_dummies(df["Embarked"])
df = pd.concat((df,emb_dum),axis=1)
df = df.drop(["Embarked","S"],axis=1)

df = df.drop(["Name","Ticket","PassengerId","Parch","SibSp"],axis=1)  # drop the columns not used this time
```

### What I tried

I checked two or three times for stray spaces and typos, but in the end I could not identify the cause. I am a beginner with almost no debugging knowledge. If anyone understands what is going on, I would appreciate your help.

Accepted Answer

I ran the code from the question almost unchanged, and it executed without any problems.
```Python
import pandas as pd
df = pd.read_csv("train.csv")
df.head()

df.isnull().sum()

df["Age"].fillna(df["Age"].median(),inplace=True)

df =df.drop("Cabin",axis=1)

import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

sns.countplot(x = df["Pclass"],hue = df["Survived"])

import numpy as np
edge = np.arange(0,100,10)

plt.hist((df[df["Survived"]==0]["Age"],df[df["Survived"]==1]["Age"]),histtype="barstacked",bins=edge,label=[0,1])
plt.legend(title="Survived")

df["Familysize"] =df['SibSp']+df['Parch']+1

pd.crosstab(df["Familysize"],df["Survived"],normalize='index').plot(kind="bar",stacked=True)

sex_dum = pd.get_dummies(df["Sex"])
df = pd.concat((df,sex_dum),axis=1)
df = df.drop("Sex",axis=1)
df = df.drop("female",axis=1)

emb_dum = pd.get_dummies(df["Embarked"])
df = pd.concat((df,emb_dum),axis=1)
df = df.drop(["Embarked","S"],axis=1)

df = df.drop(["Name","Ticket","PassengerId","Parch","SibSp"],axis=1)
```
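Since the code runs cleanly from a fresh start, one possible explanation is that the cell was executed a second time in the same notebook session, after `Sex` had already been dropped from `df`. This is an assumption; the question does not confirm it. The sketch below reproduces that situation on a tiny made-up DataFrame (not the real `train.csv`); `encode_sex` is a hypothetical helper that wraps the four lines from the question.

```Python
import pandas as pd

# Tiny stand-in for train.csv (hypothetical rows, not the real data)
df = pd.DataFrame({
    "Sex": ["male", "female", "female"],
    "Survived": [0, 1, 1],
})

def encode_sex(frame):
    """One-hot encode Sex and keep only the 'male' column, as in the question."""
    sex_dum = pd.get_dummies(frame["Sex"])
    frame = pd.concat((frame, sex_dum), axis=1)
    return frame.drop(["Sex", "female"], axis=1)

df = encode_sex(df)            # first run: succeeds
print(df.columns.tolist())     # 'Sex' is no longer among the columns

try:
    df = encode_sex(df)        # second run on the already-converted frame
except KeyError as err:
    print("KeyError:", err)    # same kind of error as in the question

# Guard: only convert while the column still exists
if "Sex" in df.columns:
    df = encode_sex(df)
```

If this is what happened, re-running the notebook from the `pd.read_csv` cell (or restarting the kernel and running all cells in order) should make the error disappear.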

[Execution environment]
Python 3.7
Jupyter Notebook
pandas 0.22.0
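The traceback in the question comes from a different installation (`/opt/conda/lib/python3.7/...` on Kaggle), so it may be worth comparing versions and looking at the column names exactly as pandas sees them. The snippet below is only a sketch: the two-line CSV is made up, and the stray space in its header is put there on purpose to show how `repr()` exposes it. It does not imply that the real `train.csv` has such a header.

```Python
import io
import sys
import pandas as pd

# Versions of the interpreter and of pandas in the current session
print("Python", sys.version.split()[0])
print("pandas", pd.__version__)

# Hypothetical CSV standing in for train.csv; note the space before "Sex"
sample = io.StringIO("PassengerId, Sex,Embarked\n1,male,S\n")
cols = pd.read_csv(sample).columns.tolist()

print([repr(c) for c in cols])  # repr() makes leading/trailing spaces visible
print("Sex" in cols)            # exact-match lookup, the same test df["Sex"] relies on
```

On the real data, running `print([repr(c) for c in df.columns])` immediately before the failing line shows whether `Sex` is actually present at that point.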
