Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py:432: FutureWarning: how should I deal with it?

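I am running Jupyter Notebook from anaconda3 on a second-hand Let's note (Windows 10). I am currently working through O'Reilly's "Pythonではじめる機械学習" and am around page 62.
I ran the code below, copied almost verbatim from the book. It uses scikit-learn's cancer dataset, fits LogisticRegression (no penalty is specified, so the default L2 regularization applies) with three different values of the regularization parameter C, prints the accuracy for each, and plots the coefficient magnitudes.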
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load the data and make a stratified train/test split
cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, stratify=cancer.target, random_state=42)

# Default regularization strength (C=1)
logreg = LogisticRegression().fit(X_train, y_train)
print("Training set score:{:.3f}".format(logreg.score(X_train, y_train)))
print("Test set score:{:.3f}".format(logreg.score(X_test, y_test)))

# Weaker regularization (C=100)
logreg100 = LogisticRegression(C=100).fit(X_train, y_train)
print("Training set score:{:.3f}".format(logreg100.score(X_train, y_train)))
print("Test set score:{:.3f}".format(logreg100.score(X_test, y_test)))

# Stronger regularization (C=0.01)
logreg001 = LogisticRegression(C=0.01).fit(X_train, y_train)
print("Training set score:{:.3f}".format(logreg001.score(X_train, y_train)))
print("Test set score:{:.3f}".format(logreg001.score(X_test, y_test)))

# Plot the learned coefficients of the three models, one marker per feature
plt.plot(logreg.coef_.T, 'o', label="C=1")
plt.plot(logreg100.coef_.T, '^', label="C=100")
plt.plot(logreg001.coef_.T, 'v', label="C=0.01")
plt.xticks(range(cancer.data.shape[1]), cancer.feature_names, rotation=90)
plt.hlines(0, 0, cancer.data.shape[1])
plt.ylim(-5, 5)
plt.xlabel("Feature")
plt.ylabel("Coefficient magnitude")
plt.legend()

When I ran it, the following warning-like messages appeared.
C:\Users\user\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.
FutureWarning)
C:\Users\user\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.
FutureWarning)
C:\Users\user\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.
FutureWarning)
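The accuracy scores and the graph did come out as shown in the book, but I would appreciate it if someone could explain what this message means, what causes it, and whether I need to do anything about it.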

meg_

2020/05/11 07:57

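The warning is raised inside the module, so I don't think there is anything you can do about it. If it really bothers you, upgrading that module should help. Note: if you do, watch out for compatibility with the other modules that depend on it.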

searabbit

2020/05/11 08:45

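Thank you for the quick reply. I am not very familiar with this, but my understanding is that the version of something I am using is old, so nothing can be done for now. Since it is only a message being displayed, it does not really bother me, so as long as it does no actual harm I will ignore it for the time being. Thank you.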

meg_

2020/05/11 08:52

My guess is that the installed sklearn is old.

searabbit

2020/05/11 09:52

Thank you for the follow-up. In Jupyter I ran import sklearn;print(sklearn.__version__) and got 0.21.3. I then looked at the official scikit-learn site, which said "March 2020. scikit-learn 0.22.2 is available for download (Changelog).", so I suppose my version is a little old. I am wary of updating on my own and breaking things here and there, so I will try it once I have learned a bit more.


1 answer

Best answer

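In scikit-learn, an option whose behavior is certain to change in a future release, but has not yet changed in the version you are running, may be reported as a warning (a FutureWarning, as here).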
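To make the mechanism concrete, here is a minimal sketch of how a library can announce a default that is scheduled to change. The function make_model and the sentinel _UNSET are hypothetical names invented for illustration; this is not scikit-learn's actual source, only the general pattern, with the message text borrowed from the question.

```python
import warnings

_UNSET = "warn"  # hypothetical sentinel meaning "the caller did not choose"


def make_model(solver=_UNSET):
    """Hypothetical constructor whose default solver is scheduled to change."""
    if solver == _UNSET:
        # The caller relied on the default, so announce the upcoming change...
        warnings.warn(
            "Default solver will be changed to 'lbfgs' in 0.22. "
            "Specify a solver to silence this warning.",
            FutureWarning,
        )
        # ...but keep the current behaviour for now.
        solver = "liblinear"
    return solver


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    implicit = make_model()             # relies on the default: warns
    n_after_implicit = len(caught)
    explicit = make_model("liblinear")  # explicit choice: no new warning
    n_after_explicit = len(caught)

print(implicit, explicit, n_after_implicit, n_after_explicit)
```

Both calls end up with the same solver. The only difference is that the call that left the choice to the library is told the choice will be made differently in the future.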

Specifying the solver explicitly, as in LogisticRegression(solver="liblinear"), makes the warning go away. That is what "Specify a solver to silence this warning." means.
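As a rough check, the sketch below fits the same model as in the question with the solver named explicitly and records any warnings raised during the fit. The variable names caught and solver_warnings are mine, and filtering on the text "Default solver" is just an assumption based on the message quoted in the question.

```python
import warnings

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, stratify=cancer.target, random_state=42
)

# Record every warning raised while fitting with an explicit solver
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    logreg = LogisticRegression(solver="liblinear").fit(X_train, y_train)

# Keep only the default-solver message from the question, if it appears at all
solver_warnings = [w for w in caught if "Default solver" in str(w.message)]
print("Default-solver warnings:", len(solver_warnings))
print("Test set score:{:.3f}".format(logreg.score(X_test, y_test)))
```

Since 'liblinear' is the default in 0.21.3, naming it should leave the results unchanged and only remove the message.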

solver : str, {‘newton-cg’, ‘lbfgs’, ‘liblinear’, ‘sag’, ‘saga’}, optional (default=’liblinear’).
Algorithm to use in the optimization problem.

For small datasets, ‘liblinear’ is a good choice, whereas ‘sag’ and ‘saga’ are faster for large ones.
For multiclass problems, only ‘newton-cg’, ‘sag’, ‘saga’ and ‘lbfgs’ handle multinomial loss; ‘liblinear’ is limited to one-versus-rest schemes.
‘newton-cg’, ‘lbfgs’, ‘sag’ and ‘saga’ handle L2 or no penalty
‘liblinear’ and ‘saga’ also handle L1 penalty
‘saga’ also supports ‘elasticnet’ penalty
‘liblinear’ does not handle no penalty
Note that ‘sag’ and ‘saga’ fast convergence is only guaranteed on features with approximately the same scale. You can preprocess the data with a scaler from sklearn.preprocessing.

New in version 0.17: Stochastic Average Gradient descent solver.

New in version 0.19: SAGA solver.

Changed in version 0.20: Default will change from ‘liblinear’ to ‘lbfgs’ in 0.22.
sklearn.linear_model.LogisticRegression — scikit-learn 0.21.3 documentation
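The quoted documentation also says that 'liblinear' handles the L1 penalty. A minimal sketch of that combination on the cancer data follows, assuming the parameter names from the 0.21.3 documentation above (penalty, solver, C, max_iter); newer scikit-learn releases may spell these options differently or warn about them. The dictionary nonzero_counts is a name I made up for illustration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, stratify=cancer.target, random_state=42
)

nonzero_counts = {}  # C value -> number of features with a non-zero coefficient
for C in (0.01, 1, 100):
    # The L1 penalty needs a solver that supports it; 'liblinear' is one of them
    model = LogisticRegression(penalty="l1", solver="liblinear", C=C, max_iter=1000)
    model.fit(X_train, y_train)
    nonzero_counts[C] = int(np.sum(model.coef_ != 0))
    print("C={}: test score {:.3f}, features used {}".format(
        C, model.score(X_test, y_test), nonzero_counts[C]))
```

With L1, a smaller C should push more coefficients to exactly zero, so the count of features used is expected to shrink as C decreases.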

Posted 2020/05/14 06:49

hayataka2049

Total score 30935

searabbit

2020/05/14 07:46

Thank you for your answer. As you advised, I added solver="liblinear" to the code and the warning disappeared. The cause is now clear too, which is a relief. Thank you.
Before: logreg=LogisticRegression().fit(X_train,y_train)
After: logreg=LogisticRegression(solver="liblinear").fit(X_train,y_train)
