Edit history

Question edit history

Current status

2017/12/11 14:47

Posted

Deactivated user

Score 0

title CHANGED

File without changes

body CHANGED

@@ -1,6 +1,6 @@
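 ###Premise / what I want to achieve
-I want to create a confusion matrix based on the accuracy results of my k-nearest neighbors implementation.
+I have code that creates a confusion matrix for the Digits data, and I want to create a confusion matrix for the MNIST data in the same way. How should I do this?
-It does not work, possibly because the dataset is large.
 Please tell me the cause and how to fix it.
 Also, why do I get an error this time, when the confusion matrix worked for the Digits data as shown below?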

Formatting improvements and the current state of the code

2017/12/11 14:46

Posted

Deactivated user

Score 0

title CHANGED

File without changes

body CHANGED

@@ -3,29 +3,32 @@
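 It does not work, possibly because the dataset is large.
 Please tell me the cause and how to fix it.
+Also, why do I get an error this time, when the confusion matrix worked for the Digits data as shown below?
 ###Problem / error message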
 ```python
 ---------------------------------------------------------------------------
 ValueError                                Traceback (most recent call last)
-<ipython-input-5-8b4965474dfa> in <module>()
+<ipython-input-5-f6ad643389d7> in <module>()
-     28
+     29
-     29 if __name__ == '__main__':
+     30 if __name__ == '__main__':
----> 30     main()
+---> 31     main()
-<ipython-input-5-8b4965474dfa> in main()
+<ipython-input-5-f6ad643389d7> in main()
-     24
+     21
-     25     # Show the confusion matrix
+     22     # Compute accuracy
----> 26     cm = confusion_matrix(test_dataY, predicted_labels)
+---> 23     score = accuracy_score(test_dataY, predicted_labels)
-     27     print(cm)
+     24     print("Accuracy: {}".format(score))
-     28
+     25
-~\Anaconda3\lib\site-packages\sklearn\metrics\classification.py in confusion_matrix(y_true, y_pred, labels, sample_weight)
+~\Anaconda3\lib\site-packages\sklearn\metrics\classification.py in accuracy_score(y_true, y_pred, normalize, sample_weight)
-    248
+    174
-    249     """
+    175     # Compute accuracy for each possible representation
---> 250     y_type, y_true, y_pred = _check_targets(y_true, y_pred)
+--> 176     y_type, y_true, y_pred = _check_targets(y_true, y_pred)
-    251     if y_type not in ("binary", "multiclass"):
+    177     if y_type.startswith('multilabel'):
-    252         raise ValueError("%s is not supported" % y_type)
+    178         differing_labels = count_nonzero(y_true - y_pred, axis=1)
 ~\Anaconda3\lib\site-packages\sklearn\metrics\classification.py in _check_targets(y_true, y_pred)
      69     y_pred : array or indicator matrix
@@ -41,11 +44,48 @@
     205
     206
-ValueError: Found input variables with inconsistent numbers of samples: [21000, 0]
+ValueError: Found input variables with inconsistent numbers of samples: [21000, 1]
 ```
+###Source code for the Digits data
+```python
+from sklearn import datasets
+from sklearn.model_selection import LeaveOneOut
+from sklearn.metrics import accuracy_score, confusion_matrix
+from sklearn.neighbors import KNeighborsClassifier
+def main():
+    dataset = datasets.load_digits()
+    features = dataset.data
+    targets = dataset.target
+    predicted_labels = []
+    loo = LeaveOneOut()
+    for train, test in loo.split(features):
+        train_data = features[train]
+        target_data = targets[train]
+        model = KNeighborsClassifier(n_neighbors=1, metric='euclidean')
+        model.fit(train_data, target_data)
+        predicted_label = model.predict(features[test])
+        predicted_labels.append(predicted_label)
+    print(predicted_labels)
+    score = accuracy_score(targets, predicted_labels)
+    print(score)
+    # Show the confusion matrix
+    cm = confusion_matrix(targets, predicted_labels)
+    print(cm)
+if __name__ == '__main__':
+    main()
+```
-###Relevant source code
+###Source code for the MNIST data
 ```python
 from collections import Counter
 from matplotlib import pyplot as plt
@@ -66,11 +106,9 @@
     features = mnist.data
     targets = mnist.target
+    # Split the data
     train_dataX, test_dataX, train_dataY, test_dataY = model_selection.train_test_split(features,targets,test_size=0.3)
-    # Compute accuracy and elapsed time for each number of neighbors
-    accuracy_scores = []
     predicted_labels = []
     # Train the model
@@ -79,15 +117,18 @@
     # Classify the single held-out test sample
     predicted_label = model.predict(test_dataX)
+    predicted_labels.append(predicted_label)
+    ## print(predicted_labels)
     # Compute accuracy
-    score = accuracy_score(test_dataY, predicted_label)
+    score = accuracy_score(test_dataY, predicted_labels)
     print("Accuracy: {}".format(score))
     # Show the confusion matrix
     cm = confusion_matrix(test_dataY, predicted_labels)
     print(cm)
 if __name__ == '__main__':
     main()
 ```
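
A note on the likely cause, judging only from the error message and the code above: `model.predict(test_dataX)` already returns one label per test sample, so appending that whole array to `predicted_labels` produces a list with a single element, which scikit-learn appears to read as 1 sample against the 21000 in `test_dataY`. The sketch below is a minimal illustration, not the poster's exact script. It assumes the small Digits dataset as a stand-in for MNIST so that it runs quickly, and the `random_state=0` argument and the names `train_X`, `test_X`, `train_y`, `test_y`, and `wrapped` are my own.

```python
import numpy as np
from sklearn import datasets, model_selection
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.neighbors import KNeighborsClassifier

# Assumption: Digits is used here as a small stand-in for MNIST.
dataset = datasets.load_digits()
train_X, test_X, train_y, test_y = model_selection.train_test_split(
    dataset.data, dataset.target, test_size=0.3, random_state=0)

model = KNeighborsClassifier(n_neighbors=1, metric='euclidean')
model.fit(train_X, train_y)

# predict() on the whole test set returns a 1-D array, one label per sample.
predicted_label = model.predict(test_X)

# Wrapping that array in a list adds an outer dimension of length 1,
# which is what predicted_labels.append(predicted_label) does in the question.
wrapped = np.asarray([predicted_label])
print(predicted_label.shape, wrapped.shape)

# Passing the 1-D array directly keeps the sample counts consistent.
score = accuracy_score(test_y, predicted_label)
cm = confusion_matrix(test_y, predicted_label)
print("Accuracy: {}".format(score))
print(cm)
```

In the leave-one-out loop of the Digits code, each call to `predict` returns a single label, so collecting the results in a list still yields one entry per sample. If a flat array is wanted there, `np.concatenate(predicted_labels)` is one way to get it.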