Background
I am trying to get BERT running by following this Qiita article: https://qiita.com/namakemono/items/4c779c9898028fc36ff3 , but an error occurs and I cannot get it to work. Could someone please tell me how to resolve it?
What I want to achieve
Run BERT without errors.
Problem / error message
```
TypeError                                 Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_18868/906636053.py in <module>
      6 x_train = to_features(train_texts, max_length)
      7 y_train = tf.keras.utils.to_categorical(train_labels, num_classes=num_classes)
----> 8 model = build_model(model_name, num_classes=num_classes, max_length=max_length)

~\AppData\Local\Temp/ipykernel_18868/1666341763.py in build_model(model_name, num_classes, max_length)
     11         token_type_ids=token_type_ids
     12     )
---> 13     output = tf.keras.layers.Dense(num_classes, activation="softmax")(pooler_output)
     14     model = tf.keras.Model(inputs=[input_ids, attention_mask, token_type_ids], outputs=[output])
     15     optimizer = tf.keras.optimizers.Adam(learning_rate=3e-5, epsilon=1e-08, clipnorm=1.0)

~\anaconda3\lib\site-packages\keras\utils\traceback_utils.py in error_handler(*args, **kwargs)
     65     except Exception as e:  # pylint: disable=broad-except
     66       filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67       raise e.with_traceback(filtered_tb) from None
     68     finally:
     69       del filtered_tb

~\anaconda3\lib\site-packages\keras\engine\input_spec.py in assert_input_compatibility(input_spec, inputs, layer_name)
    195   # have a `shape` attribute.
    196   if not hasattr(x, 'shape'):
--> 197     raise TypeError(f'Inputs to a layer should be tensors. Got: {x}')
    198
    199   if len(inputs) != len(input_spec):

TypeError: Inputs to a layer should be tensors. Got: pooler_output
```
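From the last frame, the Dense layer is receiving the literal string "pooler_output" instead of a tensor. My guess (unverified) is that newer versions of transformers return a dict-like ModelOutput object, so tuple-unpacking it yields its keys rather than tensors. A minimal sketch of that behavior, assuming TFBaseModelOutputWithPooling is what TFBertModel returns:

```python
# Minimal sketch of my assumption about the cause (not a verified fix):
# a transformers ModelOutput is dict-like, so tuple-unpacking yields its KEYS.
import tensorflow as tf
from transformers.modeling_tf_outputs import TFBaseModelOutputWithPooling

out = TFBaseModelOutputWithPooling(
    last_hidden_state=tf.zeros((1, 15, 768)),  # dummy tensors for illustration only
    pooler_output=tf.zeros((1, 768)),
)

a, b = out   # unpacking iterates over the keys, not the values
print(a, b)  # -> last_hidden_state pooler_output  (strings, not tensors!)

# The actual tensor is reached by attribute (or key) access instead:
print(out.pooler_output.shape)  # -> (1, 768)
```

This would explain why the error message ends with exactly `Got: pooler_output`.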
Relevant source code
```python
import numpy as np
import tensorflow as tf
import transformers
from sklearn.metrics import accuracy_score

# model_name is taken from here (cf. https://huggingface.co/transformers/pretrained_models.html)
model_name = "cl-tohoku/bert-base-japanese"
tokenizer = transformers.BertTokenizer.from_pretrained(model_name)

# Training data
train_texts = [
    "この犬は可愛いです",
    "その猫は気まぐれです",
    "あの蛇は苦手です"
]
train_labels = [1, 0, 0]  # 1: like, 0: dislike

# Test data
test_texts = [
    "その猫はかわいいです",
    "どの鳥も嫌いです",
    "あのヤギは怖いです"
]
test_labels = [1, 0, 0]

# Convert a list of texts into transformers input features
def to_features(texts, max_length):
    shape = (len(texts), max_length)
    # input_ids, attention_mask and token_type_ids are explained in the glossary
    # (cf. https://huggingface.co/transformers/glossary.html)
    input_ids = np.zeros(shape, dtype="int32")
    attention_mask = np.zeros(shape, dtype="int32")
    token_type_ids = np.zeros(shape, dtype="int32")
    for i, text in enumerate(texts):
        encoded_dict = tokenizer.encode_plus(text, max_length=max_length, pad_to_max_length=True)
        input_ids[i] = encoded_dict["input_ids"]
        attention_mask[i] = encoded_dict["attention_mask"]
        token_type_ids[i] = encoded_dict["token_type_ids"]
    return [input_ids, attention_mask, token_type_ids]

# Build a model that classifies a single text
def build_model(model_name, num_classes, max_length):
    input_shape = (max_length, )
    input_ids = tf.keras.layers.Input(input_shape, dtype=tf.int32)
    attention_mask = tf.keras.layers.Input(input_shape, dtype=tf.int32)
    token_type_ids = tf.keras.layers.Input(input_shape, dtype=tf.int32)
    bert_model = transformers.TFBertModel.from_pretrained(model_name)
    last_hidden_state, pooler_output = bert_model(
        input_ids,
        attention_mask=attention_mask,
        token_type_ids=token_type_ids
    )
    output = tf.keras.layers.Dense(num_classes, activation="softmax")(pooler_output)
    model = tf.keras.Model(inputs=[input_ids, attention_mask, token_type_ids], outputs=[output])
    optimizer = tf.keras.optimizers.Adam(learning_rate=3e-5, epsilon=1e-08, clipnorm=1.0)
    model.compile(optimizer=optimizer, loss="categorical_crossentropy", metrics=["acc"])
    return model

num_classes = 2
max_length = 15
batch_size = 10
epochs = 3

x_train = to_features(train_texts, max_length)
y_train = tf.keras.utils.to_categorical(train_labels, num_classes=num_classes)
model = build_model(model_name, num_classes=num_classes, max_length=max_length)

# Train
model.fit(
    x_train,
    y_train,
    batch_size=batch_size,
    epochs=epochs
)

# Predict
x_test = to_features(test_texts, max_length)
y_test = np.asarray(test_labels)
y_preda = model.predict(x_test)
y_pred = np.argmax(y_preda, axis=1)
print("Accuracy: %.5f" % accuracy_score(y_test, y_pred))
```
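Based on the traceback, the line that seems to fail is the tuple unpacking inside build_model. A sketch of the change I suspect is needed (my assumption from reading the transformers docs, not yet confirmed in my environment):

```python
# Sketch of a possible fix inside build_model (assumes a newer transformers
# where the model call returns a dict-like ModelOutput rather than a tuple):
outputs = bert_model(
    input_ids,
    attention_mask=attention_mask,
    token_type_ids=token_type_ids
)
pooler_output = outputs.pooler_output  # take the tensor by attribute access
output = tf.keras.layers.Dense(num_classes, activation="softmax")(pooler_output)
```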
What I tried
The following warning appeared:

```
Some layers from the model checkpoint at cl-tohoku/bert-base-japanese-char-whole-word-masking were not used when initializing TFBertModel: ['mlm___cls', 'nsp___cls']
- This IS expected if you are initializing TFBertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFBertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
All the layers of TFBertModel were initialized from the model checkpoint at cl-tohoku/bert-base-japanese-char-whole-word-masking.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBertModel for predictions without further training.
```

Because of this, I tried changing the BERT model and also looked into the error message itself, but I could not find a solution on my own.
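One more thing I am planning to try (found while searching, not yet tested in my environment) is asking the model for the old tuple return format, which would keep the original unpacking line working unchanged:

```python
# Alternative sketch (assumption: the installed transformers accepts
# return_dict=False in the model call and then returns a plain tuple):
last_hidden_state, pooler_output = bert_model(
    input_ids,
    attention_mask=attention_mask,
    token_type_ids=token_type_ids,
    return_dict=False
)
```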
Supplementary information (framework/tool versions, etc.)
I am using Jupyter as my development environment.
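I have not noted the exact framework versions, but since the error looks version-dependent, I can check them with the standard `__version__` attributes:

```python
import tensorflow as tf
import transformers

# Print the installed versions of the relevant libraries
print("tensorflow  :", tf.__version__)
print("transformers:", transformers.__version__)
```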