Background
I am trying to get BERT running by following this Qiita article: https://qiita.com/namakemono/items/4c779c9898028fc36ff3 , but an error occurs and I cannot get it to work. Could someone please tell me how to resolve it?
What I want to achieve
Run BERT without errors.
Problem / error message
```
TypeError                                 Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_18868/906636053.py in <module>
      6 x_train = to_features(train_texts, max_length)
      7 y_train = tf.keras.utils.to_categorical(train_labels, num_classes=num_classes)
----> 8 model = build_model(model_name, num_classes=num_classes, max_length=max_length)

~\AppData\Local\Temp/ipykernel_18868/1666341763.py in build_model(model_name, num_classes, max_length)
     11         token_type_ids=token_type_ids
     12     )
---> 13     output = tf.keras.layers.Dense(num_classes, activation="softmax")(pooler_output)
     14     model = tf.keras.Model(inputs=[input_ids, attention_mask, token_type_ids], outputs=[output])
     15     optimizer = tf.keras.optimizers.Adam(learning_rate=3e-5, epsilon=1e-08, clipnorm=1.0)

~\anaconda3\lib\site-packages\keras\utils\traceback_utils.py in error_handler(*args, **kwargs)
     65     except Exception as e:  # pylint: disable=broad-except
     66       filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67       raise e.with_traceback(filtered_tb) from None
     68     finally:
     69       del filtered_tb

~\anaconda3\lib\site-packages\keras\engine\input_spec.py in assert_input_compatibility(input_spec, inputs, layer_name)
    195   # have a `shape` attribute.
    196   if not hasattr(x, 'shape'):
--> 197     raise TypeError(f'Inputs to a layer should be tensors. Got: {x}')
    198
    199   if len(inputs) != len(input_spec):

TypeError: Inputs to a layer should be tensors. Got: pooler_output
```
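From the last frame, the Dense layer is receiving the literal string "pooler_output" instead of a tensor. My guess (unverified) is that newer versions of transformers return a dict-like ModelOutput object, so tuple-unpacking it yields its keys rather than tensors. A minimal sketch of that behavior, assuming TFBaseModelOutputWithPooling is what TFBertModel returns:

```python
# Minimal sketch of my assumption about the cause (not a verified fix):
# a transformers ModelOutput is dict-like, so tuple-unpacking yields its KEYS.
import tensorflow as tf
from transformers.modeling_tf_outputs import TFBaseModelOutputWithPooling

out = TFBaseModelOutputWithPooling(
    last_hidden_state=tf.zeros((1, 15, 768)),  # dummy tensors for illustration only
    pooler_output=tf.zeros((1, 768)),
)

a, b = out   # unpacking iterates over the keys, not the values
print(a, b)  # -> last_hidden_state pooler_output  (strings, not tensors!)

# The actual tensor is reached by attribute (or key) access instead:
print(out.pooler_output.shape)  # -> (1, 768)
```

This would explain why the error message ends with exactly `Got: pooler_output`.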
Relevant source code
```python
import numpy as np
import tensorflow as tf
import transformers
from sklearn.metrics import accuracy_score

# model_name is taken from here (cf. https://huggingface.co/transformers/pretrained_models.html)
model_name = "cl-tohoku/bert-base-japanese"
tokenizer = transformers.BertTokenizer.from_pretrained(model_name)

# Training data
train_texts = [
    "この犬は可愛いです",
    "その猫は気まぐれです",
    "あの蛇は苦手です"
]
train_labels = [1, 0, 0]  # 1: like, 0: dislike

# Test data
test_texts = [
    "その猫はかわいいです",
    "どの鳥も嫌いです",
    "あのヤギは怖いです"
]
test_labels = [1, 0, 0]

# Convert a list of texts into transformers input features
def to_features(texts, max_length):
    shape = (len(texts), max_length)
    # input_ids, attention_mask and token_type_ids are explained in the glossary
    # (cf. https://huggingface.co/transformers/glossary.html)
    input_ids = np.zeros(shape, dtype="int32")
    attention_mask = np.zeros(shape, dtype="int32")
    token_type_ids = np.zeros(shape, dtype="int32")
    for i, text in enumerate(texts):
        encoded_dict = tokenizer.encode_plus(text, max_length=max_length, pad_to_max_length=True)
        input_ids[i] = encoded_dict["input_ids"]
        attention_mask[i] = encoded_dict["attention_mask"]
        token_type_ids[i] = encoded_dict["token_type_ids"]
    return [input_ids, attention_mask, token_type_ids]

# Build a model that classifies a single text
def build_model(model_name, num_classes, max_length):
    input_shape = (max_length, )
    input_ids = tf.keras.layers.Input(input_shape, dtype=tf.int32)
    attention_mask = tf.keras.layers.Input(input_shape, dtype=tf.int32)
    token_type_ids = tf.keras.layers.Input(input_shape, dtype=tf.int32)
    bert_model = transformers.TFBertModel.from_pretrained(model_name)
    last_hidden_state, pooler_output = bert_model(
        input_ids,
        attention_mask=attention_mask,
        token_type_ids=token_type_ids
    )
    output = tf.keras.layers.Dense(num_classes, activation="softmax")(pooler_output)
    model = tf.keras.Model(inputs=[input_ids, attention_mask, token_type_ids], outputs=[output])
    optimizer = tf.keras.optimizers.Adam(learning_rate=3e-5, epsilon=1e-08, clipnorm=1.0)
    model.compile(optimizer=optimizer, loss="categorical_crossentropy", metrics=["acc"])
    return model

num_classes = 2
max_length = 15
batch_size = 10
epochs = 3

x_train = to_features(train_texts, max_length)
y_train = tf.keras.utils.to_categorical(train_labels, num_classes=num_classes)
model = build_model(model_name, num_classes=num_classes, max_length=max_length)

# Train
model.fit(
    x_train,
    y_train,
    batch_size=batch_size,
    epochs=epochs
)

# Predict
x_test = to_features(test_texts, max_length)
y_test = np.asarray(test_labels)
y_preda = model.predict(x_test)
y_pred = np.argmax(y_preda, axis=1)
print("Accuracy: %.5f" % accuracy_score(y_test, y_pred))
```
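Based on the traceback, the line that seems to fail is the tuple unpacking inside build_model. A sketch of the change I suspect is needed (my assumption from reading the transformers docs, not yet confirmed in my environment):

```python
# Sketch of a possible fix inside build_model (assumes a newer transformers
# where the model call returns a dict-like ModelOutput rather than a tuple):
outputs = bert_model(
    input_ids,
    attention_mask=attention_mask,
    token_type_ids=token_type_ids
)
pooler_output = outputs.pooler_output  # take the tensor by attribute access
output = tf.keras.layers.Dense(num_classes, activation="softmax")(pooler_output)
```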
What I tried
The following warning appeared:

```
Some layers from the model checkpoint at cl-tohoku/bert-base-japanese-char-whole-word-masking were not used when initializing TFBertModel: ['mlm___cls', 'nsp___cls']
- This IS expected if you are initializing TFBertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFBertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
All the layers of TFBertModel were initialized from the model checkpoint at cl-tohoku/bert-base-japanese-char-whole-word-masking.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBertModel for predictions without further training.
```

Because of this, I tried changing the BERT model and also looked into the error message itself, but I could not find a solution on my own.
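One more thing I am planning to try (found while searching, not yet tested in my environment) is asking the model for the old tuple return format, which would keep the original unpacking line working unchanged:

```python
# Alternative sketch (assumption: the installed transformers accepts
# return_dict=False in the model call and then returns a plain tuple):
last_hidden_state, pooler_output = bert_model(
    input_ids,
    attention_mask=attention_mask,
    token_type_ids=token_type_ids,
    return_dict=False
)
```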
Supplementary information (framework/tool versions, etc.)
I am using Jupyter as my development environment.
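I have not noted the exact framework versions, but since the error looks version-dependent, I can check them with the standard `__version__` attributes:

```python
import tensorflow as tf
import transformers

# Print the installed versions of the relevant libraries
print("tensorflow  :", tf.__version__)
print("transformers:", transformers.__version__)
```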