Cannot build a network that combines multi-input CNN layers with an LSTM layer

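I am trying to build a model that combines a CNN and an RNN using Keras.
There are six inputs. Each input is convolved separately, the results are concatenated into a single tensor, and that tensor is then fed to an LSTM.
The raw data has shape (time, input_number, 50, 50), and both the output and the target data are one-dimensional.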

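I wrote the code below, but I get a dimension-mismatch error at the point where the data is passed to the LSTM, and after fixing the dimensions I get a TypeError instead.
I would like to write this as a single model, but for now I have split it into two. With Sequential() it looks as if it could be written as one model, but I do not know how to handle multiple inputs there.
Also, even if the model can be built, can it actually be trained with model.fit and the like?
I would appreciate any advice or reference code for building this model.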

python
############################################
#Inputs
#############################################


input_data_1  =  Input(shape=(1,50,50))
input_data_2  =  Input(shape=(1,50,50))
input_data_3  =  Input(shape=(1,50,50))
input_data_4  =  Input(shape=(1,50,50))
input_data_5  =  Input(shape=(1,50,50))
input_data_6  =  Input(shape=(1,50,50))

#############################################
#Convolution Layer 1
#############################################

x_1 = Conv2D(5, (5, 5), padding='same')(input_data_1)
x_1 = BatchNormalization()(x_1)
x_1 = Activation('relu')(x_1)
x_1 = MaxPooling2D((2, 2), padding='same')(x_1)

x_2 = Conv2D(5, (5, 5), padding='same')(input_data_2)
x_2 = BatchNormalization()(x_2)
x_2 = Activation('relu')(x_2)
x_2 = MaxPooling2D((2, 2), padding='same')(x_2)

x_3 = Conv2D(5, (5, 5), padding='same')(input_data_3)
x_3 = BatchNormalization()(x_3)
x_3 = Activation('relu')(x_3)
x_3 = MaxPooling2D((2, 2), padding='same')(x_3)

x_4 = Conv2D(5, (5, 5), padding='same')(input_data_4)
x_4 = BatchNormalization()(x_4)
x_4 = Activation('relu')(x_4)
x_4 = MaxPooling2D((2, 2), padding='same')(x_4)

x_5 = Conv2D(5, (5, 5), padding='same')(input_data_5)
x_5 = BatchNormalization()(x_5)
x_5 = Activation('relu')(x_5)
x_5 = MaxPooling2D((2, 2), padding='same')(x_5)

x_6 = Conv2D(5, (5, 5), padding='same')(input_data_6)
x_6 = BatchNormalization()(x_6)
x_6 = Activation('relu')(x_6)
x_6 = MaxPooling2D((2, 2), padding='same')(x_6)


x_1 = Dropout(0.25)(x_1)
x_2 = Dropout(0.25)(x_2)
x_3 = Dropout(0.25)(x_3)
x_4 = Dropout(0.25)(x_4)
x_5 = Dropout(0.25)(x_5)
x_6 = Dropout(0.25)(x_6)

#############################################
#Convolution Layer 2
#############################################

x_1_b = Conv2D(5, (3, 3), padding='same')(x_1)
x_1_b = BatchNormalization()(x_1_b)
x_1_b = Activation('relu')(x_1_b)
x_1_b = MaxPooling2D((2, 2), padding='same')(x_1_b)

x_2_b = Conv2D(5, (3, 3), padding='same')(x_2)
x_2_b = BatchNormalization()(x_2_b)
x_2_b = Activation('relu')(x_2_b)
x_2_b = MaxPooling2D((2, 2), padding='same')(x_2_b)

x_3_b = Conv2D(5, (3, 3), padding='same')(x_3)
x_3_b = BatchNormalization()(x_3_b)
x_3_b = Activation('relu')(x_3_b)
x_3_b = MaxPooling2D((2, 2), padding='same')(x_3_b)

x_4_b = Conv2D(5, (3, 3), padding='same')(x_4)
x_4_b = BatchNormalization()(x_4_b)
x_4_b = Activation('relu')(x_4_b)
x_4_b = MaxPooling2D((2, 2), padding='same')(x_4_b)

x_5_b = Conv2D(5, (3, 3), padding='same')(x_5)
x_5_b = BatchNormalization()(x_5_b)
x_5_b = Activation('relu')(x_5_b)
x_5_b = MaxPooling2D((2, 2), padding='same')(x_5_b)

x_6_b = Conv2D(5, (3, 3), padding='same')(x_6)
x_6_b = BatchNormalization()(x_6_b)
x_6_b = Activation('relu')(x_6_b)
x_6_b = MaxPooling2D((2, 2), padding='same')(x_6_b)

#############################################
#Fully Connected Layer 1~3
#############################################

x_1_1 = Dense(20, activation = 'relu')(x_1_b)
x_2_1 = Dense(20, activation = 'relu')(x_2_b)
x_3_1 = Dense(20, activation = 'relu')(x_3_b)
x_4_1 = Dense(20, activation = 'relu')(x_4_b)
x_5_1 = Dense(20, activation = 'relu')(x_5_b)
x_6_1 = Dense(20, activation = 'relu')(x_6_b)

#Dropout

x_1_1 = Dropout(0.25)(x_1_1)
x_2_1 = Dropout(0.25)(x_2_1)
x_3_1 = Dropout(0.25)(x_3_1)
x_4_1 = Dropout(0.25)(x_4_1)
x_5_1 = Dropout(0.25)(x_5_1)
x_6_1 = Dropout(0.25)(x_6_1)



x_1_2 = Dense(20, activation = 'relu')(x_1_1)
x_2_2 = Dense(20, activation = 'relu')(x_2_1)
x_3_2 = Dense(20, activation = 'relu')(x_3_1)
x_4_2 = Dense(20, activation = 'relu')(x_4_1)
x_5_2 = Dense(20, activation = 'relu')(x_5_1)
x_6_2 = Dense(20, activation = 'relu')(x_6_1)


x_1_3 = Dense(1, activation = 'relu')(x_1_2)
x_2_3 = Dense(1, activation = 'relu')(x_2_2)
x_3_3 = Dense(1, activation = 'relu')(x_3_2)
x_4_3 = Dense(1, activation = 'relu')(x_4_2)
x_5_3 = Dense(1, activation = 'relu')(x_5_2)
x_6_3 = Dense(1, activation = 'relu')(x_6_2)


merge = layers.concatenate([x_1_3, x_2_3, x_3_3, x_4_3, x_5_3, x_6_3], axis = 1)

x = BatchNormalization()(merge)
x = Activation('relu')(x)

model_conv = Model(inputs = [input_data_1, input_data_2, input_data_3, input_data_4, input_data_5, input_data_6] , outputs = x )


y = Flatten()(x)
y = y[np.newaxis,:,:]


#############################################
#LSTM layer
#############################################

lstm = LSTM(30)(y)
y = Dense(20)(lstm)
y = Dense(20)(y)
y = BatchNormalization()(y)
output  = Dense(1)(y)
model_lstm = Model(input = lstm , output = output)


# # # #############################################
# # # #Visualize
# # # #############################################
# from keras.utils import plot_model

# model_conv.summary()
# plot_model(model_conv, to_file = 'model_conv.png',show_shapes=True, show_layer_names=False)

# model_lstm.summary()
# plot_model(model_lstm,to_file = 'model_lstm.png', show_shapes=True, show_layer_names=True)


1 answer

Best answer

The value passed as the inputs of model_lstm is not appropriate. If you pass the Input layers instead, as shown below, you can at least get summary() to run.

python
model_lstm = Model(inputs = [input_data_1, input_data_2, input_data_3, input_data_4, input_data_5, input_data_6] , outputs = output)

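That said, I do not think it is a good choice in Keras to use the dimension that normally serves as batch_size as the time dimension.

Below I explain how to recast this into a more conventional Keras style.
If the fix above already gives you the model you want, feel free to skip the rest.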

Alternative

To write this in a more Keras-like way, I would suggest making the input shape (batch_size, time, input_dim, 50, 50).
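
One way to get from the raw (time, input_number, 50, 50) array to inputs of that shape is to cut it into overlapping windows. The following is only a sketch with NumPy: the helper name `make_windows`, the window length, and the choice to pair each window with the target at its last time step are all assumptions, not something stated in the question.

```python
import numpy as np

def make_windows(raw, targets, time_steps):
    """Slice raw data of shape (T, n_inputs, H, W) into overlapping windows.

    Returns a list with one array per input, each of shape
    (n_windows, time_steps, 1, H, W), plus targets of shape (n_windows, 1).
    Pairing a window with the target at its last step is an assumption.
    """
    total, n_inputs = raw.shape[0], raw.shape[1]
    n_windows = total - time_steps + 1
    # windows[i] covers time steps i .. i + time_steps - 1
    windows = np.stack([raw[i:i + time_steps] for i in range(n_windows)])
    # Slicing with k:k+1 keeps a size-1 axis, giving (n_windows, time, 1, H, W)
    inputs = [windows[:, :, k:k + 1] for k in range(n_inputs)]
    y = np.asarray(targets)[time_steps - 1:].reshape(n_windows, 1)
    return inputs, y

raw = np.random.rand(30, 6, 50, 50).astype("float32")   # dummy data
targets = np.random.rand(30).astype("float32")           # dummy targets
inputs, y = make_windows(raw, targets, time_steps=10)
print(len(inputs), inputs[0].shape, y.shape)   # 6 (21, 10, 1, 50, 50) (21, 1)
```

A model with six Input layers accepts such a list directly, for example `model.fit(inputs, y, batch_size=8, epochs=10)`, so training with model.fit should be possible once the model is built.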

First, pull out just the CNN part and write a function that builds a keras.Model for it.

python
from keras.layers import (Input, Conv2D, BatchNormalization, Activation,
                          MaxPooling2D, Dropout, Dense)
from keras.models import Model

def conv_block(input_dim, width, height):
    # Build one CNN branch as a standalone model (no time axis here).
    x  =  Input(shape=(input_dim, width, height))
    #############################################
    #Convolution Layer 1
    #############################################
    x_1 = Conv2D(5, (5, 5), padding='same')(x)
    x_1 = BatchNormalization()(x_1)
    x_1 = Activation('relu')(x_1)
    x_1 = MaxPooling2D((2, 2), padding='same')(x_1)
    x_1 = Dropout(0.25)(x_1)
    #############################################
    #Convolution Layer 2
    #############################################
    x_1_b = Conv2D(5, (3, 3), padding='same')(x_1)
    x_1_b = BatchNormalization()(x_1_b)
    x_1_b = Activation('relu')(x_1_b)
    x_1_b = MaxPooling2D((2, 2), padding='same')(x_1_b)
    #############################################
    #Fully Connected Layer 1~3
    #############################################
    x_1_1 = Dense(20, activation = 'relu')(x_1_b)
    x_1_1 = Dropout(0.25)(x_1_1)
    x_1_2 = Dense(20, activation = 'relu')(x_1_1)
    x_1_3 = Dense(1, activation = 'relu')(x_1_2)
    model = Model(inputs=x, outputs=x_1_3)
    return model
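
One thing worth checking here is the image data format. Assuming Keras is left at its default `channels_last` setting, Input(shape=(1, 50, 50)) is read as height 1, width 50, and 50 channels, not as a single-channel 50x50 image. The sketch below traces the per-sample output shape of conv_block by hand under that assumption; the helper names are made up for illustration.

```python
import math

def same_pool(n, pool=2):
    """Output length of pooling with padding='same' and stride equal to pool."""
    return math.ceil(n / pool)

def conv_block_shape(input_shape):
    """Trace conv_block's per-sample output shape, assuming channels_last.

    input_shape is read as (height, width, channels).
    """
    h, w, _ = input_shape
    # Conv2D with padding='same' keeps height and width unchanged.
    h, w = same_pool(h), same_pool(w)   # first MaxPooling2D
    h, w = same_pool(h), same_pool(w)   # second MaxPooling2D
    # Dense acts on the last axis only, so the final Dense(1) leaves 1 channel.
    return (h, w, 1)

print(conv_block_shape((1, 50, 50)))   # (1, 13, 1)
print(conv_block_shape((50, 50, 1)))   # (13, 13, 1)
```

The first result agrees with the (None, 10, 1, 13, 1) reported for the TimeDistributed layers in the summary at the end of this answer. If each input is meant to be a 50x50 image with one channel, a shape of (50, 50, 1), or passing data_format='channels_first' to the convolution and pooling layers, is probably closer to the intent.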

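Next, change the inputs so that they include a time axis. Because conv_block does not convolve over the time dimension, use keras.layers.TimeDistributed to pass only the non-time dimensions to conv_block. The same conv_block is applied once per time step, producing an output of shape (batch size, time, conv_block output).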

python
from keras.layers import TimeDistributed

time = 10
############################################
#Inputs
############################################
input_data_1  =  Input(shape=(time, 1,50,50))
input_data_2  =  Input(shape=(time, 1,50,50))
input_data_3  =  Input(shape=(time, 1,50,50))
input_data_4  =  Input(shape=(time, 1,50,50))
input_data_5  =  Input(shape=(time, 1,50,50))
input_data_6  =  Input(shape=(time, 1,50,50))

############################################
#Conv blocks
############################################
x_1_3 = TimeDistributed(conv_block(1, 50, 50))(input_data_1)
x_2_3 = TimeDistributed(conv_block(1, 50, 50))(input_data_2)
x_3_3 = TimeDistributed(conv_block(1, 50, 50))(input_data_3)
x_4_3 = TimeDistributed(conv_block(1, 50, 50))(input_data_4)
x_5_3 = TimeDistributed(conv_block(1, 50, 50))(input_data_5)
x_6_3 = TimeDistributed(conv_block(1, 50, 50))(input_data_6)
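
Conceptually, TimeDistributed behaves as if the batch and time axes were merged, the wrapped model applied, and the axes split again. The NumPy sketch below illustrates only that idea; it is not how Keras implements the layer, and `toy_block` is a hypothetical stand-in for conv_block.

```python
import numpy as np

def time_distributed(fn, x):
    """Apply fn to every time step of x, which has shape (batch, time, ...)."""
    batch, time = x.shape[0], x.shape[1]
    folded = x.reshape((batch * time,) + x.shape[2:])   # merge batch and time
    out = fn(folded)                                     # fn sees ordinary batches
    return out.reshape((batch, time) + out.shape[1:])    # split them again

def toy_block(frames):
    """Hypothetical stand-in for conv_block: mean over the last two axes."""
    return frames.mean(axis=(-2, -1))

x = np.random.rand(4, 10, 1, 50, 50)
print(time_distributed(toy_block, x).shape)   # (4, 10, 1)
```

Note that each call to conv_block(1, 50, 50) above builds a new model, so the six branches have separate weights; within one branch, the same weights are reused at every time step.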

Next, merge the branches while keeping the time dimension.

python
from keras import layers
from keras.layers import Reshape

############################################
#Merge Layers
############################################
merge = layers.concatenate([x_1_3, x_2_3, x_3_3, x_4_3, x_5_3, x_6_3], axis = 2)
# merge.shape => TensorShape([None, 10, 6, 13, 1])
x = BatchNormalization()(merge)
x = Activation('relu')(x)
# Flatten without time dimension
output_dim = x.shape[2] * x.shape[3] * x.shape[4]
y_conv = Reshape((time, output_dim))(x)
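
The shape bookkeeping of this step can be checked with plain NumPy arrays. This is a sketch with dummy data and an arbitrary batch size of 2; it mirrors the concatenate and Reshape calls but does not involve Keras at all.

```python
import numpy as np

batch, time = 2, 10
# Six dummy branch outputs, each shaped like a TimeDistributed(conv_block) output.
branches = [np.random.rand(batch, time, 1, 13, 1) for _ in range(6)]

merged = np.concatenate(branches, axis=2)        # join the branches on axis 2
features = int(np.prod(merged.shape[2:]))        # 6 * 13 * 1
y_conv = merged.reshape(batch, time, features)   # keep batch and time axes

print(merged.shape, y_conv.shape)   # (2, 10, 6, 13, 1) (2, 10, 78)
```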

The tensor now has the standard LSTM input shape, (batch size, time, features), so all that remains is to build the LSTM.

python
from keras.layers import LSTM

#############################################
#LSTM layer
#############################################
# Input_shape=(None, time, emb) => output_shape=(None, 30)
lstm = LSTM(30)(y_conv)
y = Dense(20)(lstm)
y = Dense(20)(y)
y = BatchNormalization()(y)
output  = Dense(1)(y)
# model_lstm = Model(inputs = y_conv , outputs = output)
model = Model([input_data_1, input_data_2, input_data_3, input_data_4, input_data_5, input_data_6], outputs=output)

model.summary()

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            [(None, 10, 1, 50, 5 0                                            
__________________________________________________________________________________________________
input_2 (InputLayer)            [(None, 10, 1, 50, 5 0                                            
__________________________________________________________________________________________________
input_3 (InputLayer)            [(None, 10, 1, 50, 5 0                                            
__________________________________________________________________________________________________
input_4 (InputLayer)            [(None, 10, 1, 50, 5 0                                            
__________________________________________________________________________________________________
input_5 (InputLayer)            [(None, 10, 1, 50, 5 0                                            
__________________________________________________________________________________________________
input_6 (InputLayer)            [(None, 10, 1, 50, 5 0                                            
__________________________________________________________________________________________________
time_distributed (TimeDistribut (None, 10, 1, 13, 1) 7086        input_1[0][0]                    
__________________________________________________________________________________________________
time_distributed_1 (TimeDistrib (None, 10, 1, 13, 1) 7086        input_2[0][0]                    
__________________________________________________________________________________________________
time_distributed_2 (TimeDistrib (None, 10, 1, 13, 1) 7086        input_3[0][0]                    
__________________________________________________________________________________________________
time_distributed_3 (TimeDistrib (None, 10, 1, 13, 1) 7086        input_4[0][0]                    
__________________________________________________________________________________________________
time_distributed_4 (TimeDistrib (None, 10, 1, 13, 1) 7086        input_5[0][0]                    
__________________________________________________________________________________________________
time_distributed_5 (TimeDistrib (None, 10, 1, 13, 1) 7086        input_6[0][0]                    
__________________________________________________________________________________________________
concatenate (Concatenate)       (None, 10, 6, 13, 1) 0           time_distributed[0][0]           
                                                                 time_distributed_1[0][0]         
                                                                 time_distributed_2[0][0]         
                                                                 time_distributed_3[0][0]         
                                                                 time_distributed_4[0][0]         
                                                                 time_distributed_5[0][0]         
__________________________________________________________________________________________________
batch_normalization_12 (BatchNo (None, 10, 6, 13, 1) 4           concatenate[0][0]                
__________________________________________________________________________________________________
activation_12 (Activation)      (None, 10, 6, 13, 1) 0           batch_normalization_12[0][0]     
__________________________________________________________________________________________________
reshape (Reshape)               (None, 10, 78)       0           activation_12[0][0]              
__________________________________________________________________________________________________
lstm (LSTM)                     (None, 30)           13080       reshape[0][0]                    
__________________________________________________________________________________________________
dense_18 (Dense)                (None, 20)           620         lstm[0][0]                       
__________________________________________________________________________________________________
dense_19 (Dense)                (None, 20)           420         dense_18[0][0]                   
__________________________________________________________________________________________________
batch_normalization_13 (BatchNo (None, 20)           80          dense_19[0][0]                   
__________________________________________________________________________________________________
dense_20 (Dense)                (None, 1)            21          batch_normalization_13[0][0]     
==================================================================================================
Total params: 56,741
Trainable params: 56,579
Non-trainable params: 162
__________________________________________________________________________________________________
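
The parameter counts in this summary can be reproduced by hand with the standard formulas for each layer type. The sketch below does that arithmetic; the helper names are made up, and the 50 input channels of the first convolution follow from reading the (1, 50, 50) input as channels_last, which is an assumption that happens to match the totals.

```python
def conv2d_params(kernel, in_ch, filters):
    return kernel * kernel * in_ch * filters + filters   # weights + biases

def dense_params(in_dim, units):
    return in_dim * units + units                        # weights + biases

def batchnorm_params(channels):
    return 4 * channels   # gamma, beta, moving mean, moving variance

def lstm_params(in_dim, units):
    return 4 * (units * (in_dim + units) + units)        # four gates

conv_block_total = (
    conv2d_params(5, 50, 5) + batchnorm_params(5)
    + conv2d_params(3, 5, 5) + batchnorm_params(5)
    + dense_params(5, 20) + dense_params(20, 20) + dense_params(20, 1)
)
total = (
    6 * conv_block_total + batchnorm_params(1) + lstm_params(78, 30)
    + dense_params(30, 20) + dense_params(20, 20)
    + batchnorm_params(20) + dense_params(20, 1)
)
print(conv_block_total, lstm_params(78, 30), total)   # 7086 13080 56741
```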

Posted 2020/02/23 05:34

Edited 2020/02/26 15:31

T.Tom

Total score 58

blackmk

2020/02/25 02:13

Thank you very much for such a thorough answer. The alternative does indeed seem easier to write in Keras and less forced. I learned a great deal. Thank you.

blackmk

2020/02/25 05:11

Sorry for the follow-up, but I am getting 'Node' object has no attribute 'output_masks'. The keras and tensorflow modules are probably getting mixed together, so if possible could you tell me which other modules you imported?

T.Tom

2020/02/26 15:34

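So you were using plain Keras. I had misunderstood. I have revised the import statements so that everything comes from the keras package, so please try again with this version.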

blackmk

2020/02/27 02:38

Sorry to bother you again. When passing from the CNN layers to the LSTM layer, I get float() argument must be a string or a number, not 'Dimension'. I also checked the shapes, and the dimensions of y_conv are exactly as T.Tom wrote. What kind of mistake is this? Here are my imports: from keras.layers import Input,Dense,Dropout,Conv2D,BatchNormalization,Activation,TimeDistributed,Reshape,concatenate,LSTM and from keras.models import Model
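
A likely cause, offered as an assumption since the thread does not confirm it: with a TensorFlow 1.x backend, the entries of x.shape are Dimension objects, so output_dim in the merge step becomes a Dimension instead of an int, and the LSTM fails when it later tries to treat that value as a number. Converting each entry to int before multiplying avoids this. The sketch below shows the idea; `flat_feature_dim` is a hypothetical helper and `FakeDim` is only a stand-in so the example runs without TensorFlow.

```python
def flat_feature_dim(shape, start=2):
    """Multiply the static sizes of shape[start:] and return a plain int."""
    dim = 1
    for size in shape[start:]:
        dim *= int(size)   # int() unwraps Dimension-like objects
    return dim

class FakeDim:
    """Hypothetical stand-in for a TF 1.x Dimension, for demonstration only."""
    def __init__(self, value):
        self.value = value

    def __int__(self):
        return self.value

shape = [None, FakeDim(10), FakeDim(6), FakeDim(13), FakeDim(1)]
output_dim = flat_feature_dim(shape)
print(output_dim, type(output_dim).__name__)   # 78 int
```

In the model itself this would read y_conv = Reshape((time, flat_feature_dim(x.shape)))(x). Another option is keras.backend.int_shape(x), which returns the shape as a tuple of ints (or None for unknown axes).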
