I'm working through a Kaggle Titanic kernel, typing it out by hand to learn.
Source kernel: A Data Science Framework: To Achieve 99% Accuracy
This time I'm stuck on BaggingClassifier.
Environment
- Python 3.6.5
- Jupyter Notebook
- Windows 7
Where I got stuck, and the error message
I suspect the problem is how grid_param is written. The original Titanic kernel did not prefix the parameter names with `classifier__`, but after reading Stack Overflow I added the prefix.
↑ The `classifier__` prefix has since been removed.
Even with it removed, ExtraTreesClassifier still raises an error.
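For what it's worth, a minimal sketch (using scikit-learn's iris toy data, not the Titanic data) of when the `classifier__` prefix actually matters: GridSearchCV only expects the `stepname__param` form when the estimator is a Pipeline; with a bare estimator, the plain parameter name is correct.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

X, y = load_iris(return_X_y=True)

# Bare estimator: parameter names are used as-is.
bare = GridSearchCV(ExtraTreesClassifier(random_state=0),
                    {'n_estimators': [10, 50]}, cv=3)
bare.fit(X, y)

# Pipeline: the same parameter must be prefixed with the step name.
pipe = Pipeline([('classifier', ExtraTreesClassifier(random_state=0))])
piped = GridSearchCV(pipe, {'classifier__n_estimators': [10, 50]}, cv=3)
piped.fit(X, y)

print(bare.best_params_)   # plain key, e.g. 'n_estimators'
print(piped.best_params_)  # prefixed key, e.g. 'classifier__n_estimators'
```

So since the kernel passes each estimator directly (not wrapped in a Pipeline), removing `classifier__` was the right move.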
```python
#WARNING: Running is very computational intensive and time expensive.
grid_n_estimator = [10, 50, 100, 300]
grid_ratio = [.1, .25, .5, .75, 1.0]
grid_learn = [.01, .03, .05, .1, .25]
grid_max_depth = [2, 4, 6, 8, 10, None]
grid_min_samples = [5, 10, .03, .05, .10]
grid_criterion = ['gini', 'entropy']
grid_bool = [True, False]
grid_seed = [0]

grid_param = [
    [{  # AdaBoostClassifier
        'n_estimators': grid_n_estimator,
        'learning_rate': grid_learn,
        'random_state': grid_seed
    }],

    [{  # BaggingClassifier
        'n_estimators': grid_n_estimator,
        'max_samples': grid_ratio,
        'random_state': grid_seed
    }],

    [{  # ExtraTreesClassifier
        'n_estimators': grid_n_estimator,
        'criterion': grid_criterion,
        'max_depth': grid_max_depth,
        'random state': grid_seed  # NOTE: space instead of underscore -- this key triggers the ValueError below
    }],

    [{  # GradientBoostingClassifier
        'learning_rate': [.05],
        'n_estimators': [300],
        'max_depth': grid_max_depth,
        'random_state': grid_seed
    }],

    [{  # RandomForestClassifier
        'n_estimators': grid_n_estimator,
        'criterion': grid_criterion,
        'max_depth': grid_max_depth,
        'oob_score': [True],
        'random_state': grid_seed
    }],

    [{  # GaussianProcessClassifier
        'max_iter_predict': grid_n_estimator,
        'random_state': grid_seed
    }],

    [{  # LogisticRegressionCV
        'fit_intercept': grid_bool,
        'solver': ['newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga'],
        'random_state': grid_seed
    }],

    [{  # BernoulliNB
        'alpha': grid_ratio,
    }],

    [{}],  # GaussianNB -- nothing to tune

    [{  # KNeighborsClassifier
        'n_neighbors': [1, 2, 3, 4, 5, 6, 7],
        'weights': ['uniform', 'distance'],
        'algorithm': ['auto', 'ball_tree', 'kd_tree', 'brute']
    }],

    [{  # SVC
        'C': [1, 2, 3, 4, 5],
        'gamma': grid_ratio,
        'decision_function_shape': ['ovo', 'ovr'],
        'probability': [True],
        'random_state': grid_seed
    }],

    [{  # XGBClassifier
        'learning_rate': grid_learn,
        'max_depth': [1, 2, 4, 6, 8, 10],
        'n_estimators': grid_n_estimator,
        'seed': grid_seed
    }]
]


start_total = time.perf_counter()
for clf, param in zip(vote_est, grid_param):
    start = time.perf_counter()
    best_search = model_selection.GridSearchCV(estimator=clf[1], param_grid=param,
                                               cv=cv_split, scoring='roc_auc')
    best_search.fit(data1[data1_x_bin], data1[Target])
    run = time.perf_counter() - start

    best_param = best_search.best_params_
    print('The best parameter for {} is {} with a runtime of {:.2f} seconds'
          .format(clf[1].__class__.__name__, best_param, run))
    clf[1].set_params(**best_param)

run_total = time.perf_counter() - start_total
print('Total optimization time was {:.2f} minutes.'.format(run_total / 60))

print('-' * 10)
```
Error message after removing `classifier__`:
```
ValueError: Invalid parameter random state for estimator ExtraTreesClassifier(bootstrap=False, class_weight=None, criterion='gini', max_depth=2, max_features='auto', max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, min_samples_leaf=1, min_samples_split=2, min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=1, oob_score=False, random_state=None, verbose=0, warm_start=False). Check the list of available parameters with `estimator.get_params().keys()`.
```
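As the traceback itself suggests, `estimator.get_params().keys()` lists every parameter name GridSearchCV will accept for an estimator. A quick check (a small sketch, independent of the kernel) shows that `'random_state'` with an underscore is a valid key, while `'random state'` with a space is not:

```python
from sklearn.ensemble import ExtraTreesClassifier

# All tunable parameter names for this estimator.
params = ExtraTreesClassifier().get_params().keys()
print(sorted(params))

print('random_state' in params)  # True
print('random state' in params)  # False: the space makes it an invalid key
```

So the third dict in grid_param (the ExtraTreesClassifier one) needs `'random_state'` spelled with an underscore.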
I finally managed to submit my Kaggle Titanic solution!
Thanks to hayataka2049-san (*≧∀≦)