Training a CNN with TensorFlow

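Background / what I want to achieve

I was able to train on the MNIST data using a program from a book, so I would like to try a CNN on a dataset I created myself.
The data are 64x64 grayscale images, and the program classifies them into two classes.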

The problem

The accuracy does not change from epoch to epoch, whether it is measured on the training data or on the validation data.

INFO:tensorflow:Summary name validation error is illegal; using validation_error instead.
Epoch: 0001 cost = 0.693147182
Validation Error: 0.5
Epoch: 0002 cost = 0.693147182
Validation Error: 0.5
Epoch: 0003 cost = 0.693147182
Validation Error: 0.5
Epoch: 0004 cost = 0.693147182
Validation Error: 0.5
Epoch: 0005 cost = 0.693147182
Validation Error: 0.5
Optimization Finished!
Test Accuracy: 0.5

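Relevant source code

readData contains the program that loads the data from the dataset I created.
trainImage: training image data with shape [3200][64][64]; the first 1900 images have label [1,0] and the remaining 1300 have label [0,1]
trainLabel: labels for the training data with shape [3200][2]; the first 1900 are [1,0] and the remaining 1300 are [0,1]
valiImage, testImage: validation and test image data, each with shape [200][64][64]; the first 100 images have label [1,0] and the remaining 100 have label [0,1]
valiLabel, testLabel: labels for the validation and test images, each with shape [200][2]; the first 100 are [1,0] and the remaining 100 are [0,1]

tl: a list of indices used to shuffle the 3200 training samples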

Python
# -*- coding: utf-8 -*-
import random
import time

import cv2
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

import readData as rd  # my own loader module (contents not shown here)

# Load the custom dataset: 64x64 grayscale images with 2-element one-hot labels.
trainImage,trainLabel,valiImage,valiLabel,testImage,testLabel = rd.readData()
# Flatten the validation and test images to [N, 64*64].
valiImage = np.asarray(valiImage)
valiImage = valiImage.reshape([200, 4096])
testImage = np.asarray(testImage)
testImage = testImage.reshape([200, 4096])

# Index list used to shuffle the training samples.
tl = list(range(len(trainImage)))
random.shuffle(tl)


# Parameters
learning_rate = 0.0001
training_epochs = 5
batch_size = 100
display_step = 1

def conv2d(input, weight_shape, bias_shape):
    # Weight stddev = sqrt(2 / fan_in), with fan_in = kernel_h * kernel_w * in_channels.
    incoming = weight_shape[0] * weight_shape[1] * weight_shape[2]
    weight_init = tf.random_normal_initializer(stddev=(2.0/incoming)**0.5)
    W = tf.get_variable("W", weight_shape, initializer=weight_init)
    bias_init = tf.constant_initializer(value=0)
    b = tf.get_variable("b", bias_shape, initializer=bias_init)
    return tf.nn.relu(tf.nn.bias_add(tf.nn.conv2d(input, W, strides=[1, 1, 1, 1], padding='SAME'), b))

def max_pool(input, k=2):
    return tf.nn.max_pool(input, ksize=[1, k, k, 1], strides=[1, k, k, 1], padding='SAME')

def layer(input, weight_shape, bias_shape):
    # Fully connected layer followed by ReLU.
    weight_init = tf.random_normal_initializer(stddev=(2.0/weight_shape[0])**0.5)
    bias_init = tf.constant_initializer(value=0)
    W = tf.get_variable("W", weight_shape, initializer=weight_init)
    b = tf.get_variable("b", bias_shape, initializer=bias_init)
    return tf.nn.relu(tf.matmul(input, W) + b)

def inference(x, keep_prob):
    # Reshape the flat input to NHWC: [batch, 64, 64, 1].
    x = tf.reshape(x, shape=[-1, 64, 64, 1])
    with tf.variable_scope("conv_1"):
        conv_1 = conv2d(x, [5, 5, 1, 32], [32])
        pool_1 = max_pool(conv_1)
    with tf.variable_scope("conv_2"):
        conv_2 = conv2d(pool_1, [5, 5, 32, 64], [64])
        pool_2 = max_pool(conv_2)
    with tf.variable_scope("fc"):
        # Two 2x2 poolings reduce 64x64 to 16x16, with 64 channels.
        pool_2_flat = tf.reshape(pool_2, [-1, 16 * 16 * 64])
        fc_1 = layer(pool_2_flat, [16*16*64, 1024], [1024])
        # apply dropout
        fc_1_drop = tf.nn.dropout(fc_1, keep_prob)
    with tf.variable_scope("output"):
        # Built with layer(), so ReLU is applied to the two outputs as well.
        output = layer(fc_1_drop, [1024, 2], [2])

    return output

def loss(output, y):
    xentropy = tf.nn.softmax_cross_entropy_with_logits(logits=output, labels=y)
    loss = tf.reduce_mean(xentropy)
    return loss

def training(cost, global_step):
    tf.summary.scalar("cost", cost)
    optimizer = tf.train.AdamOptimizer(learning_rate)
    train_op = optimizer.minimize(cost, global_step=global_step)
    return train_op

def evaluate(output, y):
    correct_prediction = tf.equal(tf.argmax(output, 1), tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    tf.summary.scalar("validation error", (1.0 - accuracy))
    return accuracy

if __name__ == '__main__':
    with tf.device("/gpu:0"):
        with tf.Graph().as_default():
            with tf.variable_scope("mnist_conv_model"):
                x = tf.placeholder("float", [None, 4096])
                y = tf.placeholder("float", [None, 2])
                keep_prob = tf.placeholder(tf.float32) # dropout probability

                output = inference(x, keep_prob)
                cost = loss(output, y)
                global_step = tf.Variable(0, name='global_step', trainable=False)
                train_op = training(cost, global_step)
                eval_op = evaluate(output, y)
                summary_op = tf.summary.merge_all()
                saver = tf.train.Saver()
                sess = tf.Session()
                summary_writer = tf.summary.FileWriter("conv_mnist_logs/",graph=sess.graph)
                init_op = tf.global_variables_initializer()
                sess.run(init_op)

                # Training cycle
                for epoch in range(training_epochs):   # one epoch uses every training image once
                    avg_cost = 0.
                    total_batch = int(len(trainImage)/batch_size) # training images / batch size = batches per epoch

                    random.shuffle(tl)   # reshuffle the index list for this epoch

                    # Loop over all batches
                    for i in range(total_batch):  # go through the training images batch by batch
                        # Take batch_size images and their labels in the shuffled order.
                        bi = []
                        bl = []
                        for m in range(batch_size):
                            bi.append(trainImage[tl[m+i*batch_size]])
                            bl.append(trainLabel[tl[m+i*batch_size]])

                        minibatch_x = np.asarray(bi)
                        minibatch_y = np.asarray(bl)
                        minibatch_x = minibatch_x.reshape([batch_size, 4096])

                        # Fit training using batch data
                        sess.run(train_op, feed_dict={x: minibatch_x, y: minibatch_y, keep_prob: 0.5})
                        # Compute average loss
                        avg_cost += sess.run(cost, feed_dict={x: minibatch_x, y: minibatch_y, keep_prob: 0.5})/total_batch

                    # Display logs per epoch step
                    if epoch % display_step == 0:
                        print("Epoch:", '%04d' % (epoch+1), "cost =", "{:.9f}".format(avg_cost))
                        accuracy = sess.run(eval_op, feed_dict={x: valiImage, y: valiLabel, keep_prob: 1})
                        print("Validation Error:", (1 - accuracy))
                        summary_str = sess.run(summary_op, feed_dict={x: minibatch_x, y: minibatch_y, keep_prob: 0.5})
                        summary_writer.add_summary(summary_str, sess.run(global_step))

                        saver.save(sess, "conv_mnist_logs/model-checkpoint", global_step=global_step)

                print("Optimization Finished!")

                # Feed the test images and their correct labels
                accuracy = sess.run(eval_op, feed_dict={x: testImage, y: testLabel, keep_prob: 1})

                print("Test Accuracy:", accuracy)
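
The per-epoch shuffling that the program does through the index list tl can also be expressed with a NumPy permutation. The following is only a sketch under assumptions: iterate_minibatches is a hypothetical helper that is not part of the original program, and it assumes the images and labels are NumPy arrays with the same first dimension. The small stand-in arrays exist only to make the example runnable.

```python
import numpy as np


def iterate_minibatches(images, labels, batch_size, rng):
    """Yield shuffled (images, labels) minibatches; hypothetical helper."""
    order = rng.permutation(len(images))  # fresh random order on every call
    # Skip a trailing incomplete batch, like int(len(trainImage) / batch_size).
    for start in range(0, len(images) - batch_size + 1, batch_size):
        idx = order[start:start + batch_size]
        # Indexing both arrays with the same idx keeps image/label pairs aligned.
        yield images[idx], labels[idx]


# Stand-in data (assumption): 10 "images" of 4 pixels, image i filled with value i.
demo_images = np.repeat(np.arange(10, dtype=np.float32)[:, None], 4, axis=1)
# Even-numbered images get label [1, 0], odd-numbered images get [0, 1].
demo_labels = np.eye(2, dtype=np.float32)[np.arange(10) % 2]

rng = np.random.default_rng(0)
batches = list(iterate_minibatches(demo_images, demo_labels, batch_size=3, rng=rng))
print("number of batches:", len(batches))
```

Calling the generator once per epoch gives a new order each time, which corresponds to the random.shuffle(tl) call at the top of the epoch loop.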

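What I tried

The accuracy on the validation and test data appears to be 0.5 because every sample is classified as [1,0].
As a check, when I restricted the validation data valiImage to its first 100 samples (all labeled [1,0]), the accuracy became 1.0.

I am a TensorFlow beginner, so I may be doing something completely off the mark. Please point out any mistakes.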
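
The constant cost of 0.693147182 matches ln 2, which is the softmax cross-entropy of a two-class output whose two logits are equal. In the posted code the output layer is built with layer(), which applies tf.nn.relu, so both logits are clamped to zero whenever their pre-activations are negative. Whether that is what actually happens in this run is an assumption and is not confirmed anywhere in the thread. The NumPy sketch below uses made-up pre-activation values to show that such a state gives a loss of ln 2, an argmax of 0 for every sample, and a zero gradient through the ReLU.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


# Made-up pre-activations of the output layer, all negative (assumption).
pre_activation = np.array([[-3.0, -1.0], [-0.5, -2.0], [-4.0, -0.1], [-1.5, -1.5]])
labels = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])

logits = np.maximum(pre_activation, 0.0)  # ReLU clamps every logit to 0
probs = softmax(logits)                   # 0.5 / 0.5 for every sample
mean_loss = -(labels * np.log(probs)).sum(axis=1).mean()

predictions = logits.argmax(axis=1)       # NumPy resolves ties to index 0, i.e. [1, 0]
accuracy = (predictions == labels.argmax(axis=1)).mean()

# Per-sample gradient w.r.t. the pre-activations: (probs - labels) masked by
# the ReLU derivative, which is 0 wherever the pre-activation is not positive.
grad_pre_activation = (probs - labels) * (pre_activation > 0)

print("mean loss:", mean_loss, "ln 2:", np.log(2.0))
print("predictions:", predictions, "accuracy:", accuracy)
print("gradient is all zero:", not grad_pre_activation.any())
```

If this is the cause, returning tf.matmul(input, W) + b without the ReLU for the output layer only would be one thing to try, since tf.nn.softmax_cross_entropy_with_logits expects unscaled logits.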


1 answer

A common cause of this kind of behavior is that preprocessing applied during training is not applied at test time. Have you checked that?

Posted 2018/11/06 05:52

tiitoi

Total score: 21960

tohoho

2018/11/06 10:38

What kind of processing do you mean by preprocessing?

tiitoi

2018/11/06 10:46

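For images, one example is normalizing the pixel values to [0, 1]. I can't see the contents of readData.py, so this is speculation, but are the training data and the test data prepared under the same conditions?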
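
Scaling to [0, 1] as mentioned in this comment could look like the following sketch. It assumes the images are stored as 8-bit values in the range 0 to 255, which the thread does not confirm, and normalize_images is a hypothetical helper name. The point is that the same function is applied to the training, validation, and test images.

```python
import numpy as np


def normalize_images(images):
    """Flatten each image to one row and scale 0..255 pixel values to [0, 1]."""
    arr = np.asarray(images, dtype=np.float32)
    return arr.reshape(len(arr), -1) / 255.0


# Stand-in data (assumption): random 8-bit 64x64 grayscale images.
rng = np.random.default_rng(0)
raw_train = rng.integers(0, 256, size=(8, 64, 64), dtype=np.uint8)
raw_test = rng.integers(0, 256, size=(4, 64, 64), dtype=np.uint8)

# Apply the identical preprocessing to every split.
train_x = normalize_images(raw_train)
test_x = normalize_images(raw_test)
print(train_x.shape, test_x.shape, train_x.min(), train_x.max())
```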

tohoho

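2018/11/06 14:04 (edited)

I checked everything: the image data are 64x64 with 1 channel, and the labels are 2-element vectors, either [0,1] or [1,0]. I believe I made the conditions the same for training and testing. Could the problem come from the fact that I took a CNN written for MNIST and changed it only slightly? I changed both the data size and the number of classes on my own, so I am not confident about it.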

tiitoi

2018/11/06 15:05

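The only thing you changed is the input shape, right? The model construction looks fine to me. So if the accuracy on the training data is close to 100% while the outputs on the test data are all the same, I suspect there is a problem in the code that builds the data, but that part is not included in the question, so I can't tell.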
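
Since the data-building code is not shown, one way to compare the splits is to print a few summary statistics for each of them. This is a sketch only: summarize_split is a hypothetical helper, and the stand-in arrays below merely imitate the shapes described in the question.

```python
import numpy as np


def summarize_split(images, labels):
    """Return basic statistics for one data split (hypothetical helper)."""
    images = np.asarray(images, dtype=np.float32)
    labels = np.asarray(labels)
    return {
        "image_shape": images.shape,
        "pixel_min": float(images.min()),
        "pixel_max": float(images.max()),
        "pixel_mean": float(images.mean()),
        # Column sums of one-hot labels give the number of samples per class.
        "class_counts": labels.sum(axis=0).astype(int).tolist(),
    }


# Stand-in validation split (assumption): 200 images, first 100 labeled [1, 0].
rng = np.random.default_rng(0)
vali_images = rng.random((200, 64, 64), dtype=np.float32)
vali_labels = np.concatenate([np.tile([1, 0], (100, 1)), np.tile([0, 1], (100, 1))])

vali_summary = summarize_split(vali_images, vali_labels)
print(vali_summary)
```

Running the same function on trainImage, valiImage, and testImage and comparing the pixel ranges and class counts would show whether the splits were prepared differently.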
