Model training for an image-recognition machine learning program in Python

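What I want to achieve

I want to implement a Python program that is trained with supervised learning on pairs of color images and their corresponding grayscale images and that, once trained, takes a grayscale image and colorizes it.

So far the program does the following:
Loads the color images, creates the corresponding grayscale images, and splits the pairs into training and test data
Defines the model with Sequential()
Compiles the model

In the next step, model training, I pass in the training and test data, but the image sizes of the paired images do not match and an error is raised.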

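The problem and what I don't understand

When the color images and their corresponding grayscale images are passed to model training, the sizes of the color and grayscale images do not match. This is my first time writing an image recognition program and I do not know how to fix it.

Error message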

error
WARNING:tensorflow:From C:\Users\owner\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\src\losses.py:2976: The name tf.losses.sparse_softmax_cross_entropy is deprecated. Please use tf.compat.v1.losses.sparse_softmax_cross_entropy instead.

WARNING:tensorflow:From C:\Users\owner\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\src\backend.py:873: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

2024-01-13 23:52:05.829647: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE SSE2 SSE3 SSE4.1 SSE4.2 AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING:tensorflow:From c:\Users\owner\.vscode\code\Python\ml\programs\rgb_to_gray copy.py:69: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead.

Epoch 1/50
Traceback (most recent call last):
  File "c:\Users\owner\.vscode\code\Python\ml\programs\rgb_to_gray copy.py", line 73, in <module>
    model.fit(X_train, y_train, epochs=50, batch_size=8, validation_data=(X_test, y_test))
  File "C:\Users\owner\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\src\utils\traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Users\owner\AppData\Local\Temp\__autograph_generated_filec5vu3b7a.py", line 15, in tf__train_function
    retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
ValueError: in user code:

    File "C:\Users\owner\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\src\engine\training.py", line 1401, in train_function  *
        return step_function(self, iterator)
    File "C:\Users\owner\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\src\engine\training.py", line 1384, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "C:\Users\owner\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\src\engine\training.py", line 1373, in run_step  **
        outputs = model.train_step(data)
    File "C:\Users\owner\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\src\engine\training.py", line 1151, in train_step
        loss = self.compute_loss(x, y, y_pred, sample_weight)
    File "C:\Users\owner\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\src\engine\training.py", line 1209, in compute_loss
        return self.compiled_loss(
    File "C:\Users\owner\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\src\engine\compile_utils.py", line 277, in __call__
        loss_value = loss_obj(y_t, y_p, sample_weight=sw)
    File "C:\Users\owner\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\src\losses.py", line 143, in __call__
        losses = call_fn(y_true, y_pred)
    File "C:\Users\owner\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\src\losses.py", line 270, in call  **
        return ag_fn(y_true, y_pred, **self._fn_kwargs)
    File "C:\Users\owner\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\src\losses.py", line 1706, in mean_squared_error
        return backend.mean(tf.math.squared_difference(y_pred, y_true), axis=-1)

    ValueError: Dimensions must be equal, but are 2048 and 256 for '{{node mean_squared_error/SquaredDifference}} = SquaredDifference[T=DT_FLOAT](sequential/conv2d_6/Sigmoid, mean_squared_error/Cast)' with input shapes: [?,2048,2048,3], [?,256,256,3].

Relevant source code

Python3
import os
os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'

import numpy as np
import tensorflow as tf
import cv2
import glob
import keras
from sklearn.model_selection import train_test_split
from tensorflow import keras
from keras.models import Sequential
from keras.layers import Conv2D, UpSampling2D, InputLayer
#from keras.optimizers import adam_v2
from tensorflow.python.keras.optimizers import adam_v2
from keras.preprocessing.image import img_to_array
import matplotlib.pyplot as plt

# Function that prepares pairs of color and grayscale images
def prepare_data(img_paths, img_size=(256, 256)):
    color_imgs = []
    gray_imgs = []

    for img_path in img_paths:
        # Color image
        color_img = cv2.imread(img_path)
        color_img = cv2.cvtColor(color_img, cv2.COLOR_BGR2RGB)
        color_img = cv2.resize(color_img, img_size)

        # Grayscale image
        gray_img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
        gray_img = cv2.resize(gray_img, img_size)
        gray_img = np.expand_dims(gray_img, axis=-1)  # Add a channel dimension

        color_imgs.append(color_img)
        gray_imgs.append(gray_img)

    return np.array(color_imgs), np.array(gray_imgs)

# Path to the dataset
dataset_paths = glob.glob('../image/*.JPG')

# Prepare the pairs of color and grayscale images
color_images, gray_images = prepare_data(dataset_paths)

# Split the data into training and test data
X_train, X_test, y_train, y_test = train_test_split(gray_images, color_images, test_size=0.8, random_state=42)

# Resize the labels
y_train = y_train[:len(X_train)]
y_test = y_test[:len(X_test)]



# Model definition
model = Sequential()
model.add(InputLayer(input_shape=(256, 256, 1)))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(UpSampling2D((2, 2)))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(UpSampling2D((2, 2)))
model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
model.add(UpSampling2D((2, 2)))
model.add(Conv2D(3, (3, 3), activation='sigmoid', padding='same'))

# Compile the model
optimizer = tf.compat.v1.train.AdamOptimizer(learning_rate=0.0002)
model.compile(optimizer=optimizer, loss='mean_squared_error', metrics=['accuracy'])

# Train the model
model.fit(X_train, y_train, epochs=50, batch_size=8, validation_data=(X_test, y_test))

# Prediction for one grayscale sample
input_gray_image = X_test[0].reshape(1, 256, 256, 1)
predicted_color_image = model.predict(input_gray_image)

# Show the results
plt.subplot(1, 2, 1)
plt.title('Input Grayscale Image')
plt.imshow(X_test[0].reshape(256, 256), cmap='gray')

plt.subplot(1, 2, 2)
plt.title('Predicted Color Image')
plt.imshow(predicted_color_image[0])
plt.show()

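What I tried and researched

Searched teratail, Google, and similar sites
Modified the source code on my own
Asked an acquaintance
Other

Details and results of the above

I tried changes such as altering the image size in the model definition, but the same error appeared.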

Additional information

Development environment: Visual Studio Code

tensorflow 2.15.0
tensorflow-estimator 2.15.0
tensorflow-intel 2.15.0
tensorflow-io-gcs-filesystem 0.31.0

Python 3.9.13

meg_

2024/01/13 16:24 (edited)

What does model.summary() show? Doesn't the model's output need to be 256x256x3?

dundee_dance

2024/01/13 16:24

Model: "sequential"
_________________________________________________________________
 Layer (type)                    Output Shape              Param #
=================================================================
 conv2d (Conv2D)                 (None, 256, 256, 64)      640
 conv2d_1 (Conv2D)               (None, 256, 256, 64)      36928
 up_sampling2d (UpSampling2D)    (None, 512, 512, 64)      0
 conv2d_2 (Conv2D)               (None, 512, 512, 128)     73856
 conv2d_3 (Conv2D)               (None, 512, 512, 128)     147584
 up_sampling2d_1 (UpSampling2D)  (None, 1024, 1024, 128)   0
 conv2d_4 (Conv2D)               (None, 1024, 1024, 256)   295168
 conv2d_5 (Conv2D)               (None, 1024, 1024, 256)   590080
 up_sampling2d_2 (UpSampling2D)  (None, 2048, 2048, 256)   0
 conv2d_6 (Conv2D)               (None, 2048, 2048, 3)     6915
=================================================================
Total params: 1151171 (4.39 MB)
Trainable params: 1151171 (4.39 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
None

This is the result I get when I run it after compiling the model.
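
The growth visible in this summary can be reproduced by hand: a Conv2D with padding='same' (and the default stride of 1) keeps height and width, while each UpSampling2D((2, 2)) doubles them. Below is a minimal sketch of that arithmetic in plain Python. The helper trace_spatial_size and the ("conv_same", ...) / ("upsample", ...) tuples are hypothetical names used only for illustration; they are not part of Keras.

```python
def trace_spatial_size(input_size, layers):
    """Return the spatial size (height == width) after each layer.

    Each layer is ("conv_same", None) or ("upsample", factor).
    Assumes square inputs and stride-1 convolutions with 'same' padding.
    """
    sizes = []
    size = input_size
    for kind, factor in layers:
        if kind == "upsample":
            size *= factor          # UpSampling2D multiplies height and width
        elif kind != "conv_same":
            raise ValueError(f"unknown layer kind: {kind}")
        sizes.append(size)          # 'same' convolutions leave the size unchanged
    return sizes


# Layer order taken from the model definition in the question
question_layers = [
    ("conv_same", None), ("conv_same", None), ("upsample", 2),
    ("conv_same", None), ("conv_same", None), ("upsample", 2),
    ("conv_same", None), ("conv_same", None), ("upsample", 2),
    ("conv_same", None),
]

sizes = trace_spatial_size(256, question_layers)
print(sizes)  # [256, 256, 512, 512, 512, 1024, 1024, 1024, 2048, 2048]
```

Three doublings turn 256 into 2048, which is the first shape reported in the error message, while the color targets stay at 256.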


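1 answer

Best answer

"In the next step, model training, I pass in the training and test data, but the image sizes of the paired images do not match and an error is raised."

Judging from the error message below, the error is caused by a mismatch in data dimensions at the loss function (mean_squared_error).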

ValueError: Dimensions must be equal, but are 2048 and 256 for '{{node mean_squared_error/SquaredDifference}} = SquaredDifference[T=DT_FLOAT](sequential/conv2d_6/Sigmoid, mean_squared_error/Cast)' with input shapes: [?,2048,2048,3], [?,256,256,3].

This happens because the model's output has shape (2048, 2048, 3) while the training targets have shape (256, 256, 3).
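
One possible way to remove the mismatch, offered as a sketch and not as the answerer's own code: pair every UpSampling2D with a MaxPooling2D earlier in the network so that the output comes back to 256x256. The function name build_colorizer and the filter counts below are assumptions chosen only to keep the example small.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_colorizer(img_size=256):
    """Encoder-decoder sketch: two 2x poolings undone by two 2x upsamplings."""
    return keras.Sequential([
        keras.Input(shape=(img_size, img_size, 1)),
        layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),    # 256 -> 128
        layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),    # 128 -> 64
        layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
        layers.UpSampling2D((2, 2)),    # 64 -> 128
        layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
        layers.UpSampling2D((2, 2)),    # 128 -> 256
        # Three output channels (RGB), same height and width as the input
        layers.Conv2D(3, (3, 3), activation="sigmoid", padding="same"),
    ])


model = build_colorizer()
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.0002),
    loss="mean_squared_error",
)
print(model.output_shape)  # (None, 256, 256, 3)
```

Simply deleting the three UpSampling2D layers from the model in the question would also keep the output at 256x256x3; the pooling variant is shown because it is closer to the encoder-decoder layout that colorization networks commonly use.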

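The article below uses PyTorch, but it has the same goal, so it should be a very useful reference.
白黒画像を畳み込みニューラルネットワーク（CNN）を用いてカラー化する (Colorizing black-and-white images with a convolutional neural network (CNN))

For Keras, the following article should be helpful.
オートエンコーダーとしてのU-Net（自己符号化から白黒画像のカラー化まで） (U-Net as an autoencoder: from self-encoding to colorizing black-and-white images)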
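
A separate point that this thread does not discuss, so treat it as an assumption about the question's data: the final layer uses a sigmoid, whose outputs lie between 0 and 1, while cv2.imread returns uint8 pixel values from 0 to 255. Scaling both arrays to [0, 1] before calling model.fit keeps the targets inside the range the model can produce. The sketch below uses NumPy with random arrays as hypothetical stand-ins for gray_images and color_images; to_unit_range is an illustrative helper name.

```python
import numpy as np


def to_unit_range(images):
    """Convert uint8 images (0-255) to float32 values in [0, 1]."""
    return images.astype(np.float32) / 255.0


# Random stand-ins for the arrays returned by prepare_data() in the question
rng = np.random.default_rng(0)
gray_images = rng.integers(0, 256, size=(4, 256, 256, 1), dtype=np.uint8)
color_images = rng.integers(0, 256, size=(4, 256, 256, 3), dtype=np.uint8)

X = to_unit_range(gray_images)   # model inputs
y = to_unit_range(color_images)  # model targets

print(X.dtype, X.shape, y.shape)
```

With targets in [0, 1], the array returned by model.predict can be passed to plt.imshow directly, since imshow accepts float RGB data in that range.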

Posted 2024/01/13 17:46

meg_

Total score 10922