Question edit history

Revision 1: information correction
How can I make the dimensions match when training on sentences?
Right now I am getting the following error: `ValueError: Error when checking input: expected input_1 to have shape (68,) but got array with shape (100,)`.
```python
# ...
# ...

def vectorize_stories(data, word_idx, story_maxlen, query_maxlen):
    X = []
    Xq = []
    Y = []
    for story, query, answer in data:
        x = [word_idx[w] for w in story]
        xq = [word_idx[w] for w in query]
        # let's not forget that index 0 is reserved
        y = np.zeros(len(word_idx) + 1)
        y[word_idx[answer]] = 1
        X.append(x)
        Xq.append(xq)
        Y.append(y)
    return (pad_sequences(X, maxlen=story_maxlen),
            pad_sequences(Xq, maxlen=query_maxlen), np.array(Y))

# ...
# ...

inputs_train, queries_train, answers_train = vectorize_stories(train, word_idx, 100, 100)
inputs_test, queries_test, answers_test = vectorize_stories(test, word_idx, 100, 100)

input_sequence = Input((1000,))
question = Input((1000,))

input_encoder_m = Sequential()
input_encoder_m.add(Embedding(input_dim=vocab_size,
                              output_dim=64))
input_encoder_m.add(Dropout(0.3))

input_encoder_c = Sequential()
input_encoder_c.add(Embedding(input_dim=vocab_size,
                              output_dim=1000))
input_encoder_c.add(Dropout(0.3))

question_encoder = Sequential()
question_encoder.add(Embedding(input_dim=vocab_size,
                               output_dim=64,
                               input_length=1000))
question_encoder.add(Dropout(0.3))

input_encoded_m = input_encoder_m(input_sequence)
input_encoded_c = input_encoder_c(input_sequence)
question_encoded = question_encoder(question)

match = dot([input_encoded_m, question_encoded], axes=(2, 2))
match = Activation('softmax')(match)

response = add([match, input_encoded_c])
response = Permute((2, 1))(response)

answer = concatenate([response, question_encoded])
answer = LSTM(32)(answer)
answer = Dropout(0.3)(answer)
answer = Dense(vocab_size)(answer)
answer = Activation('softmax')(answer)

model = Model([input_sequence, question], answer)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit([inputs_train, queries_train], answers_train,
          batch_size=32,
          epochs=120,
          validation_data=([inputs_test, queries_test], answers_test))
```
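For reference, the upstream Keras `babi_memnn` example that this script resembles does not hard-code the `100` values; it derives the two maxlen values from the data itself. A minimal sketch of that idea (`train_data`/`test_data` here are invented placeholders with the same `(story, query, answer)` structure, not my actual datasets):

```python
# Sketch only: train_data/test_data are invented placeholder examples.
train_data = [
    (['Sandra', 'moved', 'to', 'the', 'kitchen', '.'],
     ['Where', 'is', 'Sandra', '?'], 'kitchen'),
    (['John', 'went', 'to', 'the', 'hallway', '.',
      'John', 'grabbed', 'the', 'milk', '.'],
     ['Where', 'is', 'John', '?'], 'hallway'),
]
test_data = train_data  # placeholder

# Longest story / query across both splits; these values would then be used
# consistently for pad_sequences(..., maxlen=...) and for Input((story_maxlen,)).
story_maxlen = max(len(story) for story, _, _ in train_data + test_data)
query_maxlen = max(len(query) for _, query, _ in train_data + test_data)

print(story_maxlen, query_maxlen)  # 11 4
```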
That is the code I wrote. `train` and `test` hold sentences stored as token lists, like this:

```python
['Sandra', 'moved', 'to', 'the', 'kitchen', '.', 'John', 'travelled', 'to', 'the', 'kitchen', '.', 'Sandra', 'moved', 'to', 'the', 'hallway']
```
I think this error occurs because the dimensionality of the input differs from the dimensionality I specified, but how should I make them consistent? Also, the training sentences vary in length; in that case, how should I align the dimensions as well?
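For context on what I understand the padding step to do (a dependency-free sketch, not the real Keras function): with its default settings, `pad_sequences` gives every sequence the same dimension by keeping only the last `maxlen` entries and left-padding shorter ones with 0. Roughly:

```python
def pad_like_keras(seqs, maxlen, value=0):
    """Toy stand-in for keras.preprocessing.sequence.pad_sequences with the
    default settings (pre-truncate, pre-pad with zeros)."""
    out = []
    for s in seqs:
        s = list(s)[-maxlen:]                        # drop the front if too long
        out.append([value] * (maxlen - len(s)) + s)  # left-pad if too short
    return out

print(pad_like_keras([[1, 2], [1, 2, 3, 4, 5]], 4))
# [[0, 0, 1, 2], [2, 3, 4, 5]]
```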