How to use a CNN model built with TensorFlow and Keras

Question

# What I want to do

Until now I have done image analysis through APIs, but I wanted to go deeper, so I started studying with tensorflow and keras. To get a feel for it, I built a CNN model for MNIST in keras, and I am now trying to see what result I get when I feed the model an image I prepared myself.

# Environment

- Mac
- Python 3.6.5
- tensorflow 1.8.0
- keras 2.2.0
- jupyternotebook 1.0.0

# What I don't understand

While reading a Qiita [article](https://qiita.com/nagayosi/items/0034e5e82813b05e41df), I ran the sample code distributed with keras, [mnist_cnn.py](https://github.com/keras-team/keras/blob/master/examples/mnist_cnn.py), in jupyter notebook. The sample code itself built the model without trouble, but ```model.predict``` raises an error and I cannot get any further.

# What I tried

- Running mnist_cnn.py

mnist_cnn.py ran normally without any problems.

```
# mnist tutorial by keras (train time about 45 minutes)

'''Trains a simple convnet on the MNIST dataset.
Gets to 99.25% test accuracy after 12 epochs
(there is still a lot of margin for parameter tuning).
16 seconds per epoch on a GRID K520 GPU.
'''

from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

batch_size = 128
num_classes = 10
epochs = 12

# input image dimensions
img_rows, img_cols = 28, 28

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
```

- Saving the model

I saved the trained model. This also worked normally.

```
# save model
json_string = model.to_json()
open('tutorial_mnist.json', 'w').write(json_string)
model.save_weights('tutorial_mnist.h5')
```

- Loading the model and getting a prediction for my own image

I borrowed the image from a Kaggle dataset. I took img_1.jpg from testSample.zip at [https://www.kaggle.com/scolianni/mnistasjpg](https://www.kaggle.com/scolianni/mnistasjpg) and ran predict on it, but an error occurred.

```
import keras
from keras.datasets import mnist
from keras.models import model_from_json
from keras.utils import np_utils
from keras.preprocessing import image
from PIL import Image
import matplotlib.pyplot as plt
import numpy

# load the model
model = model_from_json(open('tutorial_mnist.json').read())
model.load_weights('tutorial_mnist.h5')
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

# predict
filepath = "/Users/atg/work/python/tensorflow/jupyter/mnist_test_data/img_1.jpg"
img = Image.open(filepath).convert('RGB') ## Gray->L, RGB->RGB
img = img.resize((28, 28))
x = numpy.array(img, dtype=numpy.float32)
x = x / 255.
x = x[None, ...]

pred = model.predict(x, batch_size=1, verbose=0)
score = numpy.max(pred)
pred_label = np.argmax(pred)
print("pred", pred)
print("score", score)
print("pred_label", pred_label)
```

* The error that occurred

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in ()
     23 x = x[None, ...]
     24 
---> 25 pred = model.predict(x, batch_size=1, verbose=0)
     26 score = numpy.max(pred)
     27 pred_label = np.argmax(pred)

~/work/python/tensorflow/lib/python3.6/site-packages/keras/engine/training.py in predict(self, x, batch_size, verbose, steps)
   1150                              'argument.')
   1151         # Validate user data.
-> 1152         x, _, _ = self._standardize_user_data(x)
   1153         if self.stateful:
   1154             if x[0].shape[0] > batch_size and x[0].shape[0] % batch_size != 0:

~/work/python/tensorflow/lib/python3.6/site-packages/keras/engine/training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_array_lengths, batch_size)
    752             feed_input_shapes,
    753             check_batch_axis=False,  # Don't enforce the batch size.
--> 754             exception_prefix='input')
    755 
    756         if y is not None:

~/work/python/tensorflow/lib/python3.6/site-packages/keras/engine/training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
    134                             ': expected ' + names[i] + ' to have shape ' +
    135                             str(shape) + ' but got array with shape ' +
--> 136                             str(data_shape))
    137     return data
    138 

ValueError: Error when checking input: expected conv2d_1_input to have shape (28, 28, 1) but got array with shape (28, 28, 3)
```

Since the error says the input does not have the expected shape, I tried changing the shape with reshape.

- Modified code

```
import keras
from keras.datasets import mnist
from keras.models import model_from_json
from keras.utils import np_utils
from keras.preprocessing import image
from PIL import Image
import matplotlib.pyplot as plt
import numpy

# load the model
model = model_from_json(open('tutorial_mnist.json').read())
model.load_weights('tutorial_mnist.h5')
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

# predict
filepath = "/Users/atg/work/python/tensorflow/jupyter/mnist_test_data/img_1.jpg"
img = Image.open(filepath).convert('RGB') ## Gray->L, RGB->RGB
img = img.resize((28, 28))
x = numpy.array(img, dtype=numpy.float32)
x = x / 255.
x = x[None, ...]
x = numpy.reshape(x, [28,28,1])

pred = model.predict(x, batch_size=1, verbose=0)
score = numpy.max(pred)
pred_label = np.argmax(pred)
print("pred", pred)
print("score", score)
print("pred_label", pred_label)
```

- Error

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in ()
     22 x = x / 255.
     23 x = x[None, ...]
---> 24 x = numpy.reshape(x, [28,28,1])
     25 
     26 pred = model.predict(x, batch_size=1, verbose=0)

~/work/python/tensorflow/lib/python3.6/site-packages/numpy/core/fromnumeric.py in reshape(a, newshape, order)
    255            [5, 6]])
    256     """
--> 257     return _wrapfunc(a, 'reshape', newshape, order=order)
    258 
    259 

~/work/python/tensorflow/lib/python3.6/site-packages/numpy/core/fromnumeric.py in _wrapfunc(obj, method, *args, **kwds)
     50 def _wrapfunc(obj, method, *args, **kwds):
     51     try:
---> 52         return getattr(obj, method)(*args, **kwds)
     53 
     54     # An AttributeError occurs if the object does not have

ValueError: cannot reshape array of size 2352 into shape (28,28,1)
```

This time the error is about the array size. I suspect that my whole approach to loading the image and converting it into a form the model accepts before calling predict is wrong, so I posted this question on teratail. Qiita and overseas forums cover everything up to building the model, but when it comes to making predictions with the finished model, every site does it differently, and even copying sample code verbatim does not work, so I have gradually lost track of what the correct method is. If you know of any sites or methods that would help, I would appreciate your guidance.

Accepted Answer

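So the prediction model you built is an image classification model that predicts which digit is written in an MNIST image (handwritten digits 0-9).
In that case, as the code below shows, the model takes an image of shape (28, 28, 1) as input, that is, (rows, cols, channels), or equivalently (height, width, channels).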
```python
# input image dimensions
img_rows, img_cols = 28, 28

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)


# 一部省略

model = Sequential()

# (28, 28, 1) or (1, 28, 28)の画像を入力層に入れる
# channel first か channel lastによって変わる
# 質問者さんの場合はchannel lastのようですね
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
```
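As a rough illustration of what those two `reshape` branches do, the sketch below uses a zero-filled array as a stand-in for the `x_train` returned by `mnist.load_data()` (the name `fake_x_train` and the batch of 5 images are assumptions made only for this example).

```python
import numpy

# Hypothetical stand-in for x_train from mnist.load_data(): 5 images of 28x28
fake_x_train = numpy.zeros((5, 28, 28), dtype=numpy.uint8)

# channels_last: the channel axis is appended after rows and cols
channels_last = fake_x_train.reshape(fake_x_train.shape[0], 28, 28, 1)

# channels_first: the channel axis comes right after the batch axis
channels_first = fake_x_train.reshape(fake_x_train.shape[0], 1, 28, 28)

print(channels_last.shape)   # (5, 28, 28, 1)
print(channels_first.shape)  # (5, 1, 28, 28)
```

Either way, each sample carries exactly one channel, so anything passed to `model.predict` needs one channel per image as well.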

I added comments to your image preprocessing code for prediction.
```python
# predict
filepath = "/Users/atg/work/python/tensorflow/jupyter/mnist_test_data/img_1.jpg"

# Choose grayscale here rather than RGB
# MNIST is grayscale to begin with
img = Image.open(filepath).convert('RGB') ## Gray->L, RGB->RGB
x = numpy.array(img, dtype=numpy.float32)
x = x / 255.
x = x[None, ...]
print(x.shape)
# (1, 28, 28, 3)
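# The channel dimension is 3 because the image was read as RGB
x = numpy.reshape(x, [28,28,1])
# The error you posted is raised here because this reshape is impossible
# (1 * 28 * 28 * 3 = 2352 values do not fit into 28 * 28 * 1 = 784):
# ValueError: cannot reshape array of size 2352 into shape (28,28,1)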
```
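The size mismatch behind that second error can be reproduced without the model or the image file. The following is a minimal sketch in which a zero-filled array (an assumption, standing in for the RGB image after `x[None, ...]`) is reshaped the same way.

```python
import numpy

# Hypothetical stand-in for the RGB image after x[None, ...]
rgb_batch = numpy.zeros((1, 28, 28, 3), dtype=numpy.float32)

print(rgb_batch.size)  # 2352 elements
print(28 * 28 * 1)     # 784 elements requested by the reshape

# reshape can only rearrange elements, never drop them, so this fails
try:
    rgb_batch.reshape((28, 28, 1))
    reshape_error = None
except ValueError as error:
    reshape_error = str(error)

print(reshape_error)
```

In other words, `reshape` cannot turn three channels into one; the channels have to be reduced when the image is loaded.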
For now, I fixed the parts pointed out above.
```python
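# Corrected version of the code above
import numpy
from PIL import Image

# predict
filepath = "/Users/atg/work/python/tensorflow/jupyter/mnist_test_data/img_1.jpg"
img = Image.open(filepath).convert('L') ## Gray->L, RGB->RGB
img = img.resize((28, 28))  # as in the question's code, so the reshape below always fits
x = numpy.array(img, dtype=numpy.float32)
x = x.reshape((28, 28, 1))
x = x / 255.
x = x[None, ...]
print(x.shape)
# (1, 28, 28, 1)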
```
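To make the remaining steps concrete, here is a minimal sketch of the preprocessing wrapped in a helper, followed by reading the label out of a prediction. The helper name `to_model_input`, the random `fake_image`, and the hand-written `fake_pred` are all assumptions for illustration; in the real notebook the image would come from `Image.open(filepath).convert('L')` and the prediction from `model.predict(x)`. Note also that the question's code imports `numpy` but then calls `np.argmax`, which would raise a `NameError` once `predict` succeeds, so the sketch uses `numpy.argmax`.

```python
import numpy


def to_model_input(gray_image, channels_last=True):
    """Turn a 28x28 grayscale array (values 0-255) into a batch of one sample."""
    x = numpy.asarray(gray_image, dtype=numpy.float32)
    if x.shape != (28, 28):
        raise ValueError("expected a 28x28 grayscale image, got shape %s" % (x.shape,))
    x = x / 255.0  # same scaling as the training data in mnist_cnn.py
    if channels_last:
        return x[None, :, :, None]  # (1, 28, 28, 1)
    return x[None, None, :, :]      # (1, 1, 28, 28)


# Hypothetical stand-in for numpy.array(img) of a 28x28 'L' mode image
fake_image = numpy.random.randint(0, 256, size=(28, 28)).astype(numpy.uint8)
x = to_model_input(fake_image)
print(x.shape)

# Hypothetical stand-in for model.predict(x): one row of 10 class scores
fake_pred = numpy.array([[0.01, 0.02, 0.90, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01]])
score = numpy.max(fake_pred)          # highest class score
pred_label = numpy.argmax(fake_pred)  # index of that class, i.e. the digit
print("score", score)
print("pred_label", pred_label)
```

With the real model, `x` would be passed to `model.predict(x, batch_size=1, verbose=0)` exactly as in the question.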
