Question edit history

Added tags

2019/03/15 10:54

Posted

yusukee345

Score 31

title CHANGED Viewed

File without changes

body CHANGED Viewed

File without changes

Simplified the source code and added the execution environment

2019/03/15 10:54

Posted

yusukee345

Score 31

title CHANGED Viewed

	@@ -1,1 +1,1 @@
1	- CNN-Autoencoder: loss is almost always nan and no accuracy comes out
1	+ CNN-Autoencoder: loss is almost always nan and no accuracy comes out, and I am stuck.

body CHANGED Viewed

@@ -1,27 +1,35 @@
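 Hello.
-I am trying to build an image classifier using a CNN-Autoencoder. I load the 28*28 images (460 of them) in folder1, train the CNN-Autoencoder on them, then test with the images in folder1 again and output the reconstructions (decoded) of the test images. As shown below, no loss or Accuracy values come out. On top of that, the reconstructed images are almost all completely black or completely white, so the model is clearly not learning.
+I am trying to build an image classifier using a CNN-Autoencoder. I load the 28*28 images (460 of them) in folder1, train the CNN-Autoencoder on them, then test with the images in folder1 again and output the reconstructions (decoded) of the test images. As training proceeds, no loss or Accuracy values come out, as shown below. On top of that, the reconstructed images are almost all completely white, so the model is clearly not learning.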
 ```
- step, loss, accuracy =      1: -204.758  0.000
-  step, loss, accuracy =      2: -734.519  0.000
-  step, loss, accuracy =      3:    nan  0.957
+  step, loss, accuracy =      1:    nan  0.935
+  step, loss, accuracy =      2:    nan  0.935
+  step, loss, accuracy =      3:    nan  0.891
-  step, loss, accuracy =      4:    nan  0.848
+  step, loss, accuracy =      4:    nan  0.891
   step, loss, accuracy =      5:    nan  0.913
+  step, loss, accuracy =      6:    nan  0.978
-  step, loss, accuracy =      6:    nan  0.913
+  step, loss, accuracy =      7:    nan  0.913
+  step, loss, accuracy =      8:    nan  0.935
-  step, loss, accuracy =      7:    nan  0.848
+  step, loss, accuracy =      9:    nan  0.848
-  step, loss, accuracy =      8:    nan  0.848
-  step, loss, accuracy =      9:    nan  0.891
+  step, loss, accuracy =     10:    nan  0.913
+  step, loss, accuracy =     11:    nan  0.935
-  step, loss, accuracy =     10:    nan  0.870
+  step, loss, accuracy =     12:    nan  0.870
+  step, loss, accuracy =     13:    nan  0.848
+  step, loss, accuracy =     14:    nan  0.913
+  step, loss, accuracy =     15:    nan  0.935
+  step, loss, accuracy =     16:    nan  0.935
+  step, loss, accuracy =     17:    nan  0.913
+  step, loss, accuracy =     18:    nan  0.913
+  step, loss, accuracy =     19:    nan  0.913
+  step, loss, accuracy =     20:    nan  0.826
+loss (test) =  nan
+accuracy(test) =  0.9347826
 ```
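-I am using the code below. It is structured so that my_nn_lib.py defines the convolution layers and so on, and cnn_ae.py runs the training.
+I am using the code below.
-The code is taken from this author's article: https://qiita.com/TomokIshii/items/26b7414bdb3cd3052786 . my_nn_lib.py is used exactly as published there.
+The code is taken from this author's article: https://qiita.com/TomokIshii/items/26b7414bdb3cd3052786 . I rewrote the network definition at a higher level (using tf.layers.conv2d instead of tf.nn.conv2d).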
 `cnn_ae.py`
 ```
-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
 import numpy as np
 import matplotlib as mpl
 import random
@@ -40,91 +48,32 @@
 two_layer = 7
-# Up-sampling 2-D Layer (deconvolutoinal Layer)
-class Conv2Dtranspose(object):
-    '''
-      constructor's args:
-          input      : input image (2D matrix)
-          output_siz : output image size
-          in_ch      : number of incoming image channel
-          out_ch     : number of outgoing image channel
-          patch_siz  : filter(patch) size
-    '''
-    def __init__(self, input, output_siz, in_ch, out_ch, patch_siz, activation='relu'):
-        self.input = input
-        self.rows = output_siz[0]
-        self.cols = output_siz[1]
-        self.out_ch = out_ch
-        self.activation = activation
-        wshape = [patch_siz[0], patch_siz[1], out_ch, in_ch]  # note the arguments order
-        w_cvt = tf.Variable(tf.truncated_normal(wshape, stddev=0.1),
-                            trainable=True)
-        b_cvt = tf.Variable(tf.constant(0.1, shape=[out_ch]),
-                            trainable=True)
-        self.batsiz = tf.shape(input)[0]
-        self.w = w_cvt
-        self.b = b_cvt
-        self.params = [self.w, self.b]
-    def output(self):
-        shape4D = [self.batsiz, self.rows, self.cols, self.out_ch]
-        linout = tf.nn.conv2d_transpose(self.input, self.w, output_shape=shape4D,
-                                        strides=[1, 2, 2, 1], padding='SAME') + self.b
-        if self.activation == 'relu':
-            self.output = tf.nn.relu(linout)
-        elif self.activation == 'sigmoid':
-            self.output = tf.sigmoid(linout)
-        else:
-            self.output = linout
-        return self.output
 # Create the model
 def mk_nn_model(x):
     # Encoding phase
     x_image = tf.reshape(x, [-1, image_size, image_size, 1])
-    conv1 = Convolution2D(x_image, (image_size, image_size), 1, 16,
-                          (3, 3), activation='relu')
-    conv1_out = conv1.output()
+    conv1 = tf.layers.conv2d(inputs=x_image, filters=16, kernel_size=(3, 3), padding='same', activation=tf.nn.relu)
-    pool1 = MaxPooling2D(conv1_out)
+    pool1 = tf.layers.max_pooling2d(conv1, pool_size=(2, 2), strides=(2, 2), padding='same')
-    pool1_out = pool1.output()
+    conv2 = tf.layers.conv2d(inputs=pool1, filters=8, kernel_size=(3, 3), padding='same', activation=tf.nn.relu)
-    conv2 = Convolution2D(pool1_out, (one_layer, one_layer), 16, 8,
+    pool2 = tf.layers.max_pooling2d(conv2, pool_size=(2, 2), strides=(2, 2), padding='same')
-                          (3, 3), activation='relu')
-    conv2_out = conv2.output()
-    pool2 = MaxPooling2D(conv2_out)
+    conv3 = tf.layers.conv2d(pool2,  filters=8, kernel_size=(3, 3), padding='same', activation=tf.nn.relu)
-    pool2_out = pool2.output()
-    conv3 = Convolution2D(pool2_out, (two_layer, two_layer), 8, 8, (3, 3), activation='relu')
+    encoded = tf.layers.max_pooling2d(conv3, pool_size=(2, 2), strides=(2, 2), padding='same')
-    conv3_out = conv3.output()
-    pool3 = MaxPooling2D(conv3_out)
-    pool3_out = pool3.output()
     # at this point the representation is (8, 4, 4) i.e. 128-dimensional
     # Decoding phase
-    conv_t1 = Conv2Dtranspose(pool3_out, (two_layer, two_layer), 8, 8,
+    upsample1 = tf.image.resize_images(encoded, size=(two_layer, two_layer), method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)
-                              (3, 3), activation='relu')
+    conv4 = tf.layers.conv2d(inputs=upsample1, filters=8, kernel_size=(3, 3), padding='same', activation=tf.nn.relu)
-    conv_t1_out = conv_t1.output()
-    conv_t2 = Conv2Dtranspose(conv_t1_out, (one_layer, one_layer), 8, 8,
+    upsample2 = tf.image.resize_images(conv4, size=(one_layer, one_layer), method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)
-                              (3, 3), activation='relu')
+    conv5 = tf.layers.conv2d(inputs=upsample2, filters=8, kernel_size=(3, 3), padding='same', activation=tf.nn.relu)
-    conv_t2_out = conv_t2.output()
-    conv_t3 = Conv2Dtranspose(conv_t2_out, (image_size, image_size), 8, 16,
+    upsample3 = tf.image.resize_images(conv5, size=(image_size, image_size), method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)
-                              (3, 3), activation='relu')
+    conv6 = tf.layers.conv2d(inputs=upsample3, filters=16, kernel_size=(3, 3), padding='same', activation=tf.nn.relu)
-    conv_t3_out = conv_t3.output()
-    conv_last = Convolution2D(conv_t3_out, (image_size, image_size), 16, 1, (3, 3),
+    logits = tf.layers.conv2d(inputs=conv6, filters=1, kernel_size=(3, 3), padding='same', activation=None)
-                              activation='sigmoid')
-    decoded = conv_last.output()
+    decoded = tf.nn.sigmoid(logits)
     decoded = tf.reshape(decoded, [-1, image_size*image_size])
     cross_entropy = -1. * x * tf.log(decoded) - (1. - x) * tf.log(1. - decoded)
@@ -154,7 +103,7 @@
     loss, decoded, accuracy = mk_nn_model(x)
     train_step = tf.train.AdagradOptimizer(learning_rate).minimize(loss)
-    epochs = 10
+    epochs = 20
     batch_size = data_number // 10
     init = tf.initialize_all_variables()
@@ -172,9 +121,9 @@
         # generate decoded image with test data
         X = np.random.permutation(fileset)
         test_image = X[0:batch_size]
-        decoded_imgs = decoded.eval(test_image)
+        decoded_imgs, test_loss, test_accuracy = sess.run([decoded, loss, accuracy], feed_dict={x: test_image})
-        print('loss (test) = ', loss.eval(test_image))
+        print('loss (test) = ', test_loss)
-        print('accuracy(test) = ', accuracy.eval(test_image))
+        print('accuracy(test) = ', test_accuracy)
     x_test = test_image
     n = 10  # how many digits we will display
@@ -198,6 +147,11 @@
     plt.savefig('images.png')
 ```
+####Execution environment
+OS Windows10 64bit
+Processor Intel(R) Core(TM)i7-8700k CPU @ 3.70GHz
+RAM 32.0 GB
+Anaconda Prompt
 ####What I tried
 ・Changed the sigmoid just before decoded to softmax. As a result, only nan was displayed and nothing improved at all.
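
One possible way to see where a nan like this can come from, offered as an assumption rather than a confirmed diagnosis of the post above: the `cross_entropy` line in `cnn_ae.py` takes `tf.log(decoded)` and `tf.log(1. - decoded)` directly, and once a sigmoid output rounds to exactly 0 or 1 in float32, one of the terms becomes `0 * log(0)`, which evaluates to nan. The NumPy sketch below reproduces only that arithmetic. The helper names `sigmoid`, `naive_bce`, and `stable_bce_from_logits`, as well as the sample values of `x` and `logits`, are hypothetical and chosen purely for illustration; they do not come from the original code.

```python
import numpy as np


def sigmoid(z):
    # Plain logistic function, evaluated in the dtype of z.
    return 1. / (1. + np.exp(-z))


def naive_bce(x, decoded):
    # Same form as the cross_entropy line in cnn_ae.py, written with NumPy.
    return -1. * x * np.log(decoded) - (1. - x) * np.log(1. - decoded)


def stable_bce_from_logits(x, logits):
    # Algebraically equivalent cross-entropy computed from the logits,
    # so log() is never applied to an exact 0.
    return np.maximum(logits, 0.) - logits * x + np.log1p(np.exp(-np.abs(logits)))


# Hypothetical targets and pre-sigmoid values (float32, as in TensorFlow by default).
x = np.array([1., 0., 1.], dtype=np.float32)
logits = np.array([100., -100., 0.5], dtype=np.float32)

# Silence the warnings that the saturated cases trigger on purpose.
with np.errstate(divide='ignore', invalid='ignore', over='ignore'):
    decoded = sigmoid(logits)       # the first two entries round to exactly 1.0 and 0.0
    naive = naive_bce(x, decoded)   # the first two entries contain 0 * log(0)

stable = stable_bce_from_logits(x, logits)

print('naive :', naive)
print('stable:', stable)
```

In this sketch the first two entries of `naive` are nan even though the prediction matches the target, while `stable` stays finite for all three. TensorFlow provides `tf.nn.sigmoid_cross_entropy_with_logits(labels=..., logits=...)`, which computes the cross-entropy from the logits in this way; whether feeding it the pre-sigmoid `logits` tensor (reshaped to match `x`) removes the nan in this particular script has not been tested here.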