Changing the input size in TensorFlow

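I am currently classifying images with an AlexNet model, but the accuracy seems to diverge(?) and training makes no progress even when I change the learning rate. I would therefore like to try a larger input image. If I increase the image size from 28, how should I change the values in the convolutional and fully connected layers?
I get a "Cannot feed value" error, but I don't know how the values should be changed.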

python
# -*- coding: utf-8 -*-
# Alexnet
import sys
import cv2
import numpy as np
import tensorflow as tf
import tensorflow.python.platform
import os

NUM_CLASSES = 12
IMAGE_SIZE = 28
IMAGE_PIXELS = IMAGE_SIZE*IMAGE_SIZE*3

flags = tf.app.flags
FLAGS = flags.FLAGS
flags.DEFINE_string('train', 'train.txt', 'File name of train data')
flags.DEFINE_string('test', 'test.txt', 'File name of test data')
flags.DEFINE_string('image_dir', 'data', 'Directory of images')
flags.DEFINE_string('train_dir', 'logs', 'Directory to put the training data.')
flags.DEFINE_integer('max_steps', 100, 'Number of steps to run trainer.')
flags.DEFINE_integer('batch_size', 50, 'Batch size. '
                     'Must divide evenly into the dataset sizes.')
flags.DEFINE_float('learning_rate', 1e-5, 'Initial learning rate.')

def inference(images_placeholder, keep_prob):
    """ Build the prediction model.
    Args:
        images_placeholder: placeholder for the images
        keep_prob: placeholder for the dropout keep probability
    Returns:
        y_conv: probability-like score for each class
    """
    # Initialize weights from a truncated normal distribution with stddev 0.1
    def weight_variable(shape):
      initial = tf.truncated_normal(shape, stddev=0.1)
      return tf.Variable(initial)
    # Initialize biases to the constant 0.1
    def bias_variable(shape):
      initial = tf.constant(0.1, shape=shape)
      return tf.Variable(initial)
    # Convolutional layer
    def conv2d(x, W):
      return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
    # Pooling layer
    def max_pool_2x2(x):
      return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                            strides=[1, 2, 2, 1], padding='SAME')

    # Reshape the input to 28x28x3
    x_image = tf.reshape(images_placeholder, [-1, IMAGE_SIZE, IMAGE_SIZE, 3])
    # Convolutional layer 1
    with tf.name_scope('conv1') as scope:
        W_conv1 = weight_variable([5, 5, 3, 32])
        b_conv1 = bias_variable([32])
        h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)

    # Pooling layer 1
    with tf.name_scope('pool1') as scope:
        h_pool1 = max_pool_2x2(h_conv1)

    # Convolutional layer 2
    with tf.name_scope('conv2') as scope:
        W_conv2 = weight_variable([5, 5, 32, 64])
        b_conv2 = bias_variable([64])
        h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)

    # Pooling layer 2
    with tf.name_scope('pool2') as scope:
        h_pool2 = max_pool_2x2(h_conv2)

    # Convolutional layer 3
    with tf.name_scope('conv3') as scope:
        W_conv3 = weight_variable([5, 5, 64, 64])
        b_conv3 = bias_variable([64])
        h_conv3 = tf.nn.relu(conv2d(h_pool2, W_conv3) + b_conv3)

    # Convolutional layer 4
    with tf.name_scope('conv4') as scope:
        W_conv4 = weight_variable([5, 5, 64, 64])
        b_conv4 = bias_variable([64])
        h_conv4 = tf.nn.relu(conv2d(h_conv3, W_conv4) + b_conv4)

    # Convolutional layer 5
    with tf.name_scope('conv5') as scope:
        W_conv5 = weight_variable([5, 5, 64, 64])
        b_conv5 = bias_variable([64])
        h_conv5 = tf.nn.relu(conv2d(h_conv4, W_conv5) + b_conv5)

    # Pooling layer 3
    with tf.name_scope('pool3') as scope:
        h_pool3 = max_pool_2x2(h_conv5)

    # Fully connected layer 1
    with tf.name_scope('fc1') as scope:
        W_fc1 = weight_variable([7*7*64, 1024])
        b_fc1 = bias_variable([1024])
        h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
        h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
        # Apply dropout
        h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

    # Fully connected layer 2
    with tf.name_scope('fc2') as scope:
        W_fc2 = weight_variable([1024, 1024])
        b_fc2 = bias_variable([1024])
        h_fc2 = tf.nn.relu(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
        # Apply dropout
        h_fc2_drop = tf.nn.dropout(h_fc2, keep_prob)
    # Fully connected layer 3
    with tf.name_scope('fc3') as scope:
        W_fc3 = weight_variable([1024, NUM_CLASSES])
        b_fc3 = bias_variable([NUM_CLASSES])

        y_conv = tf.matmul(h_fc2_drop, W_fc3) + b_fc3
    # Normalize with the softmax function
    with tf.name_scope('softmax') as scope:
        y_conv=tf.nn.softmax(tf.matmul(h_fc2_drop, W_fc3) + b_fc3)
    # Return the probability-like score for each label
    return y_conv

# (rest omitted)


1 answer

Best answer

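In the code below, the `7*7` in the `7*7*64` of the fully connected layer (fc1) is the spatial size of the feature map that reaches fc1 when a 28x28 image is fed in. The fully connected layer flattens it into a one-dimensional vector of 7x7x64 = 3136 elements and computes on those.
If you change the input image size, the feature map size on arrival at fc1 changes, and so does the number of elements fed into the fully connected layer. Adjust this part to match the image size you want.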

python
    # Fully connected layer 1
    with tf.name_scope('fc1') as scope:
        W_fc1 = weight_variable([7*7*64, 1024])
        b_fc1 = bias_variable([1024])
        h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
        h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
        # Apply dropout
        h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
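
Instead of hard-coding `7*7*64`, the flattened size can be derived from the shape of the tensor being flattened. The following is only a sketch in plain Python: `flat_dim` is a hypothetical helper, and it assumes a shape list of the form `[batch, height, width, channels]`, such as the one `h_pool2.get_shape().as_list()` would return in TensorFlow 1.x.

```python
def flat_dim(shape):
    """Return the number of elements per example for a [batch, h, w, c] shape.

    `shape` is assumed to be a list such as [None, 7, 7, 64]. The leading
    batch dimension (often None) is ignored.
    """
    size = 1
    for dim in shape[1:]:
        if dim is None:
            raise ValueError("non-batch dimensions must be known: %r" % (shape,))
        size *= dim
    return size


# Shape of h_pool2 for a 28x28 input: two 2x2 poolings give 7x7 with 64 channels.
print(flat_dim([None, 7, 7, 64]))    # 3136
# Shape after three 2x2 poolings of a 96x96 input.
print(flat_dim([None, 12, 12, 64]))  # 9216
```

Using the same derived value for both the first dimension of `W_fc1` and the `tf.reshape` target keeps the two from drifting apart when the input size changes.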

Note that this network applies pooling twice, and each pooling shrinks the image.
As a result, the image size changes as 28x28 → 14x14 → 7x7.
(pool3 isn't actually used in the code you posted, is it?)
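
The shrinkage described above can be checked with a few lines of plain Python. This is a sketch, not part of the original code: `pooled_size` and `fc1_input_dim` are hypothetical helpers, and they assume what the posted network uses, namely stride-1 `SAME` convolutions (which keep height and width) and 2x2, stride-2 `SAME` pooling (which halves the size, rounding up).

```python
def pooled_size(image_size, num_pools):
    """Side length of the feature map after `num_pools` 2x2/stride-2 SAME poolings."""
    size = image_size
    for _ in range(num_pools):
        size = (size + 1) // 2  # SAME padding rounds the halved size up
    return size


def fc1_input_dim(image_size, num_pools, channels=64):
    """Number of elements fed into fc1 after flattening the last feature map."""
    side = pooled_size(image_size, num_pools)
    return side * side * channels


print(pooled_size(28, 2), fc1_input_dim(28, 2))  # 7 3136
print(pooled_size(96, 3), fc1_input_dim(96, 3))  # 12 9216
```

If pool3 were actually wired in while keeping a 28x28 input, the same calculation gives 28 → 14 → 7 → 4, so fc1 would receive 4*4*64 = 1024 elements rather than 3136.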

Posted 2018/01/15 02:19

diningyo


TyoNgc

2018/01/15 02:37

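Thank you for your answer. The pooling part was my mistake. For example, if I set the input to 96, am I right that with three poolings the fully connected layer becomes 12*12*64? Also, what does the 1024 next to it represent? Does that value also need to change with the input size?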

diningyo

2018/01/15 02:46

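> For example, if I set the input to 96, am I right that with three poolings the fully connected layer becomes 12*12*64?
Yes, that understanding is correct.
> Also, what does the 1024 next to it represent? Does that value also need to change with the input size?
1024 is the output dimension of the fully connected layer. It does not need to change for the network to run. In this network the fully connected part is:
fc1 -> IN:7x7x64 / OUT:1024
fc2 -> IN:1024 / OUT:1024
fc3 -> IN:1024 / OUT:12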
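
To make the point about 1024 concrete, here is a small sketch in plain Python that lists the weight shapes of the three fully connected layers for a given flattened input size. `fc_layer_shapes` and `num_weights` are hypothetical helpers. The hidden width of 1024 and the 12 classes are taken from the posted code.

```python
def fc_layer_shapes(flat_in, hidden=1024, num_classes=12):
    """Return the (inputs, outputs) weight shape of fc1, fc2 and fc3."""
    return [(flat_in, hidden), (hidden, hidden), (hidden, num_classes)]


def num_weights(shapes):
    """Total number of weight entries, biases not included."""
    return sum(n_in * n_out for n_in, n_out in shapes)


for side in (7, 12):  # 28x28 input with 2 poolings, 96x96 input with 3 poolings
    shapes = fc_layer_shapes(side * side * 64)
    print(side, shapes, num_weights(shapes))
```

Only the first dimension of fc1 depends on the input image size. The hidden width is a free design choice, though a larger input makes the fc1 weight matrix correspondingly larger.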

TyoNgc

2018/01/15 02:51

Sorry to keep asking, but what could be causing an error like this? Dimensions must be equal, but are 3136 and 9216 for 'fc1/MatMul' (op: 'MatMul') with input shapes: [?,3136], [9216,1024]. I will add the modified program to the question body just in case.
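
The two numbers in a message like this can be decoded by hand. The sketch below uses a hypothetical helper, `spatial_side`, and assumes 64 channels and square feature maps, as in the posted network. It shows that 3136 corresponds to a 7x7x64 tensor and 9216 to a 12x12x64 one. One plausible reading, then, is that the weight matrix was already sized for a 96-pixel input with three poolings while the tensor passed to `tf.matmul` was still being reshaped to `7*7*64`. The actual modified program is not shown here, so this is only a guess.

```python
def spatial_side(flat_dim, channels=64):
    """Recover the side length of a square feature map from its flattened size."""
    if flat_dim % channels != 0:
        raise ValueError("%d is not a multiple of %d channels" % (flat_dim, channels))
    area = flat_dim // channels
    side = int(round(area ** 0.5))
    if side * side != area:
        raise ValueError("%d does not describe a square feature map" % flat_dim)
    return side


print(spatial_side(3136))  # 7  -> flattened input is 7x7x64
print(spatial_side(9216))  # 12 -> W_fc1 expects 12x12x64
```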

TyoNgc

2018/01/15 02:54

Sorry, this was a misunderstanding on my part.
