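I am running a multiple regression analysis in Python. After choosing the explanatory variables and running the program, I got the error shown below.
Is the problem that some of the variable (column) names are numbers?
If anyone knows how to solve this, I would appreciate your advice.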
<Below is the dataframe used for the regression analysis>
<The code that raises the error>
# Split into the target variable and the explanatory variables

columnList =['city_code','region_code','emailer_for_promotion','homepage_featured','TYPE_B','TYPE_C','Biryani','Desert','Extras',\
 'Fish','Other Snacks','Pasta','Pizza','Rice Bowl','Salad','Sandwich','Seafood','Soup','Starters','Indian','Italian','Thai'\
 '1.9','2.0','2.4','2.7','2.8','2.9','3.0','3.2','3.4','3.5','3.6','3.7','3.8','3.9','4.0','4.1','4.2','4.4','4.5',\
 '4.6','4.7','4.8','5.0','5.1','5.3','5.6','6.3','6.7','7.0']

X = train_query.loc[:,columnList]
y = train_query.loc[:, ['num_rank']]

# Split into model-building data and model-validation data (80:20)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

print(X_train.shape)
print(X_test.shape)

print(y_train.shape)
print(y_test.shape)
<The error output>
KeyError                                  Traceback (most recent call last)
<ipython-input-175-37e26af03518> in <module>()
     18 columnList =['city_code','region_code','emailer_for_promotion','homepage_featured','TYPE_B','TYPE_C','Biryani','Desert','Extras', 'Fish','Other Snacks','Pasta','Pizza','Rice Bowl','Salad','Sandwich','Seafood','Soup','Starters','Indian','Italian','Thai' '1.9','2.0','2.4','2.7','2.8','2.9','3.0','3.2','3.4','3.5','3.6','3.7','3.8','3.9','4.0','4.1','4.2','4.4','4.5', '4.6','4.7','4.8','5.0','5.1','5.3'...
     19
---> 20 X = train_query.loc[:,columnList]
     21 y = train_query.loc[:, ['num_rank']]
     22

6 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/indexing.py in _validate_read_indexer(self, key, indexer, axis, raise_missing)
   1314             with option_context("display.max_seq_items", 10, "display.width", 80):
   1315                 raise KeyError(
-> 1316                     "Passing list-likes to .loc or [] with any missing labels "
   1317                     "is no longer supported. "
   1318                     f"The following labels were missing: {not_found}. "

KeyError: "Passing list-likes to .loc or [] with any missing labels is no longer supported. The following labels were missing: Index(['Thai1.9', '2.0', '2.4', '2.7', '2.8',\n ...\n '5.3', '5.6', '6.3', '6.7', '7.0'],\n dtype='object', length=29). See https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#deprecate-loc-reindex-listlike"
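The label 'Thai1.9' in the message above looks like the result of Python joining two adjacent string literals: there is no comma between 'Thai' and '1.9' in columnList. A minimal sketch of that behaviour, using a shortened list that is not my real data:

```python
# Adjacent string literals are concatenated by Python, so a missing comma
# silently merges two list items into one.
without_comma = ['Italian', 'Thai'
                 '1.9', '2.0']
with_comma = ['Italian', 'Thai',
              '1.9', '2.0']

print(without_comma)  # ['Italian', 'Thai1.9', '2.0']
print(with_comma)     # ['Italian', 'Thai', '1.9', '2.0']
```

This would explain the 'Thai1.9' entry, but not why the other numeric labels are also reported as missing.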
<Code with the comma added>
# Split into the target variable and the explanatory variables
columnList =['city_code','region_code','emailer_for_promotion','homepage_featured','TYPE_B','TYPE_C','Biryani','Desert','Extras',\
 'Fish','Other Snacks','Pasta','Pizza','Rice Bowl','Salad','Sandwich','Seafood','Soup','Starters','Indian','Italian','Thai',\
 '1.9','2.0','2.4','2.7','2.8','2.9','3.0','3.2','3.4','3.5','3.6','3.7','3.8','3.9','4.0','4.1','4.2','4.4','4.5',\
 '4.6','4.7','4.8','5.0','5.1','5.3','5.6','6.3','6.7','7.0']

X = train_query.loc[:,columnList]
y = train_query.loc[:, ['num_rank']]

# Split into model-building data and model-validation data (80:20)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

print(X_train.shape)
print(X_test.shape)

print(y_train.shape)
print(y_test.shape)
<Error output>
KeyError                                  Traceback (most recent call last)
<ipython-input-106-09c4fefa540e> in <module>()
     26 columnList =['city_code','region_code','emailer_for_promotion','homepage_featured','TYPE_B','TYPE_C','Biryani','Desert','Extras', 'Fish','Other Snacks','Pasta','Pizza','Rice Bowl','Salad','Sandwich','Seafood','Soup','Starters','Indian','Italian','Thai', '1.9','2.0','2.4','2.7','2.8','2.9','3.0','3.2','3.4','3.5','3.6','3.7','3.8','3.9','4.0','4.1','4.2','4.4','4.5', '4.6','4.7','4.8','5.0','5.1',...
     27
---> 28 X = train_query.loc[:,columnList]
     29 y = train_query.loc[:, ['num_rank']]
     30

6 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/indexing.py in _validate_read_indexer(self, key, indexer, axis, raise_missing)
   1314             with option_context("display.max_seq_items", 10, "display.width", 80):
   1315                 raise KeyError(
-> 1316                     "Passing list-likes to .loc or [] with any missing labels "
   1317                     "is no longer supported. "
   1318                     f"The following labels were missing: {not_found}. "

KeyError: "Passing list-likes to .loc or [] with any missing labels is no longer supported. The following labels were missing: Index(['1.9', '2.0', '2.4', '2.7', '2.8',\n ...\n '5.3', '5.6', '6.3', '6.7', '7.0'],\n dtype='object', length=29). See https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#deprecate-loc-reindex-listlike"
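With the comma added, the 29 labels still reported as missing are exactly the numeric ones. One possible cause, which is only an assumption here, is that those columns are labelled with floats (1.9) rather than strings ('1.9'), for example if they came from one-hot encoding a numeric column. The sketch below uses a small hypothetical stand-in for train_query to show how this could be checked and one way to work around it:

```python
import pandas as pd

# Hypothetical stand-in for train_query (not the real data): the rating
# columns are labelled with floats (1.9, 2.0), not strings ('1.9', '2.0').
train_query = pd.DataFrame({
    "city_code": [590, 526],
    "Thai": [0, 1],
    1.9: [1, 0],
    2.0: [0, 1],
    "num_rank": [3, 1],
})

columnList = ["city_code", "Thai", "1.9", "2.0"]

# Labels that are requested but not present in the frame.
missing = [c for c in columnList if c not in train_query.columns]
print(missing)

# The Python type of each column label shows whether it is a str or a float.
label_types = {c: type(c).__name__ for c in train_query.columns}
print(label_types)

# One option: convert every label to str so that selecting by string works.
train_query.columns = train_query.columns.map(str)
X = train_query.loc[:, columnList]
print(X.shape)
```

If the real labels do turn out to be floats, the alternative is to leave the frame unchanged and list them in columnList as numbers (1.9 instead of '1.9').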