I am currently trying to perform classification on the iris dataset in Python with the leave-one-out method, but I am not sure whether my approach is correct, so I would appreciate any guidance.
The code below does produce a result, but I am not confident it is right.
python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.model_selection import LeaveOneOut
from sklearn.model_selection import cross_val_score
import pandas as pd

df = pd.read_csv('drive/My Drive/iris.txt', delim_whitespace=True, header=None)
X = df.iloc[:, 0:4]
y = df.iloc[:, 4]

# Check the features
print(X)
# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
# Create a model instance and train it on the training data
logreg = LogisticRegression().fit(X_train, y_train)
# Evaluate the model on the test set
#print(logreg.score(X_test, y_test))

loo = LeaveOneOut()
score = cross_val_score(logreg, X, y, cv=loo)  # use LeaveOneOut() as the cross-validation splitter
score.mean()
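If I understand correctly, the `cross_val_score(logreg, X, y, cv=loo)` call should be equivalent to fitting a fresh model once per sample, each time holding that one sample out as the test set. Here is the explicit loop I believe it corresponds to (my own sketch for checking my understanding; `X` and `y` are the same as above):

python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut

loo = LeaveOneOut()
scores = []
for train_idx, test_idx in loo.split(X):
    # Fit on all samples except one, then score on the single held-out sample
    model = LogisticRegression()
    model.fit(X.iloc[train_idx], y.iloc[train_idx])
    scores.append(model.score(X.iloc[test_idx], y.iloc[test_idx]))  # 1.0 or 0.0
print(np.mean(scores))  # fraction of held-out samples classified correctly

If this loop and the `cross_val_score` call agree, I assume my usage is correct, but please point out anything I have misunderstood.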
Below is the iris dataset.
From the left, the columns are sepal length, sepal width, petal length, and petal width; the rightmost column is the species.
There are three species, labeled 1.0, 2.0, and 3.0.
iris.txt
5.1 3.5 1.4 0.2 1.0
4.9 3.0 1.4 0.2 1.0
4.7 3.2 1.3 0.2 1.0
4.6 3.1 1.5 0.2 1.0
5.0 3.6 1.4 0.2 1.0
5.4 3.9 1.7 0.4 1.0
4.6 3.4 1.4 0.3 1.0
5.0 3.4 1.5 0.2 1.0
4.4 2.9 1.4 0.2 1.0
4.9 3.1 1.5 0.1 1.0
5.4 3.7 1.5 0.2 1.0
4.8 3.4 1.6 0.2 1.0
4.8 3.0 1.4 0.1 1.0
4.3 3.0 1.1 0.1 1.0
5.8 4.0 1.2 0.2 1.0
5.7 4.4 1.5 0.4 1.0
5.4 3.9 1.3 0.4 1.0
5.1 3.5 1.4 0.3 1.0
5.7 3.8 1.7 0.3 1.0
5.1 3.8 1.5 0.3 1.0
5.4 3.4 1.7 0.2 1.0
5.1 3.7 1.5 0.4 1.0
4.6 3.6 1.0 0.2 1.0
5.1 3.3 1.7 0.5 1.0
4.8 3.4 1.9 0.2 1.0
5.0 3.0 1.6 0.2 1.0
5.0 3.4 1.6 0.4 1.0
5.2 3.5 1.5 0.2 1.0
5.2 3.4 1.4 0.2 1.0
4.7 3.2 1.6 0.2 1.0
4.8 3.1 1.6 0.2 1.0
5.4 3.4 1.5 0.4 1.0
5.2 4.1 1.5 0.1 1.0
5.5 4.2 1.4 0.2 1.0
4.9 3.1 1.5 0.2 1.0
5.0 3.2 1.2 0.2 1.0
5.5 3.5 1.3 0.2 1.0
4.9 3.6 1.4 0.1 1.0
4.4 3.0 1.3 0.2 1.0
5.1 3.4 1.5 0.2 1.0
5.0 3.5 1.3 0.3 1.0
4.5 2.3 1.3 0.3 1.0
4.4 3.2 1.3 0.2 1.0
5.0 3.5 1.6 0.6 1.0
5.1 3.8 1.9 0.4 1.0
4.8 3.0 1.4 0.3 1.0
5.1 3.8 1.6 0.2 1.0
4.6 3.2 1.4 0.2 1.0
5.3 3.7 1.5 0.2 1.0
5.0 3.3 1.4 0.2 1.0
7.0 3.2 4.7 1.4 2.0
6.4 3.2 4.5 1.5 2.0
6.9 3.1 4.9 1.5 2.0
5.5 2.3 4.0 1.3 2.0
6.5 2.8 4.6 1.5 2.0
5.7 2.8 4.5 1.3 2.0
6.3 3.3 4.7 1.6 2.0
4.9 2.4 3.3 1.0 2.0
6.6 2.9 4.6 1.3 2.0
5.2 2.7 3.9 1.4 2.0
5.0 2.0 3.5 1.0 2.0
5.9 3.0 4.2 1.5 2.0
6.0 2.2 4.0 1.0 2.0
6.1 2.9 4.7 1.4 2.0
5.6 2.9 3.6 1.3 2.0
6.7 3.1 4.5 1.4 2.0
4.6 3.0 4.5 1.5 2.0
5.8 2.7 4.1 1.0 2.0
6.2 2.2 4.5 1.5 2.0
5.6 2.5 3.9 1.1 2.0
5.9 3.2 4.8 1.8 2.0
6.1 2.8 4.0 1.3 2.0
6.3 2.5 4.9 1.5 2.0
6.1 2.8 4.7 1.2 2.0
6.4 2.9 4.3 1.3 2.0
6.6 3.0 4.9 1.4 2.0
6.8 2.8 4.8 1.4 2.0
6.7 3.0 5.0 1.7 2.0
6.0 2.9 4.5 1.5 2.0
5.7 2.6 3.5 1.0 2.0
5.5 2.4 3.8 1.1 2.0
5.5 2.4 3.7 1.0 2.0
5.8 2.7 3.9 1.2 2.0
6.0 2.7 5.1 1.6 2.0
5.4 3.0 4.5 1.5 2.0
6.0 3.4 4.5 1.6 2.0
6.7 3.1 4.7 1.5 2.0
6.3 2.3 4.4 1.3 2.0
5.6 3.0 4.1 1.3 2.0
5.5 2.5 4.0 1.3 2.0
5.5 2.6 4.4 1.2 2.0
6.1 3.0 4.6 1.4 2.0
5.8 2.6 4.0 1.2 2.0
5.0 2.3 3.3 1.0 2.0
5.6 2.7 4.2 1.3 2.0
5.7 3.0 4.2 1.2 2.0
5.7 2.9 4.2 1.3 2.0
6.2 2.9 4.3 1.3 2.0
5.1 2.5 3.0 1.1 2.0
5.7 2.8 4.1 1.3 2.0
6.3 3.3 6.0 2.5 3.0
5.8 2.7 5.1 1.9 3.0
7.1 3.0 5.9 2.1 3.0
6.3 2.9 5.6 1.8 3.0
6.5 3.0 5.8 2.2 3.0
7.6 3.0 6.6 2.1 3.0
4.9 2.5 4.5 1.7 3.0
7.3 2.9 6.3 1.8 3.0
6.7 2.5 5.8 1.8 3.0
7.2 3.6 6.1 2.5 3.0
6.5 3.2 5.1 2.0 3.0
6.4 2.7 5.3 1.9 3.0
6.8 3.0 5.5 2.1 3.0
5.7 2.5 5.0 2.0 3.0
5.8 2.8 5.1 2.4 3.0
6.4 3.2 5.3 2.3 3.0
6.5 3.0 5.5 1.8 3.0
7.7 3.8 6.7 2.2 3.0
7.7 2.6 6.9 2.3 3.0
6.0 2.2 5.0 1.5 3.0
6.9 3.2 5.7 2.3 3.0
5.6 2.8 4.9 2.0 3.0
7.7 2.8 6.7 2.0 3.0
6.3 2.7 4.9 1.8 3.0
6.7 3.3 5.7 2.1 3.0
7.2 3.2 6.0 1.8 3.0
6.2 2.8 4.8 1.8 3.0
6.1 3.0 4.9 1.8 3.0
6.4 2.8 5.6 2.1 3.0
7.2 3.0 5.8 1.6 3.0
7.4 2.8 6.1 1.9 3.0
7.9 3.8 6.4 2.0 3.0
6.4 2.8 5.6 2.2 3.0
6.3 2.8 5.1 1.5 3.0
6.1 2.6 5.6 1.4 3.0
7.7 3.0 6.1 2.3 3.0
6.3 3.4 5.6 2.4 3.0
6.4 3.1 5.5 1.8 3.0
6.0 3.0 4.8 1.8 3.0
6.9 3.1 5.4 2.1 3.0
6.7 3.1 5.6 2.4 3.0
6.9 3.1 5.1 2.3 3.0
5.8 2.7 5.1 1.9 3.0
6.8 3.2 5.9 2.3 3.0
6.7 3.3 5.7 2.5 3.0
6.7 3.0 5.2 2.3 3.0
6.3 2.5 5.0 1.9 3.0
6.5 3.0 5.2 2.0 3.0
6.2 3.4 5.4 2.3 3.0
5.0 3.0 5.1 1.8 3.0
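As a sanity check that the file parses the way I described (150 rows, four feature columns plus one label column with three classes), I think a quick inspection like the following should work (my own check, separate from the main code):

python
import pandas as pd

df = pd.read_csv('drive/My Drive/iris.txt', delim_whitespace=True, header=None)
print(df.shape)              # expecting (150, 5)
print(df[4].value_counts())  # expecting 50 rows each of labels 1.0, 2.0, 3.0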
(Update)
When I run this code, the following errors appear, and yet for some reason a classification accuracy value is still displayed. Why is that...?
error
/usr/local/lib/python3.6/dist-packages/sklearn/linear_model/_logistic.py:940: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG)
/usr/local/lib/python3.6/dist-packages/sklearn/linear_model/_logistic.py:940: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG)
/usr/local/lib/python3.6/dist-packages/sklearn/linear_model/_logistic.py:940: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
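My guess is that `ConvergenceWarning` is just a warning rather than an exception, so each fold still finishes with whatever coefficients the solver had reached, and a score is still printed. Following the message's own suggestion, a variant I am considering (my own sketch; it raises `max_iter` and standardizes the features inside a `Pipeline` so the scaler is refitted on each training fold) would look like this:

python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

# Scale the features, then fit logistic regression with a higher iteration limit,
# as the ConvergenceWarning message suggests
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(pipe, X, y, cv=LeaveOneOut())
print(scores.mean())

Would this be an appropriate way to address the warning?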
Thank you in advance.