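Background / what I want to achieve
I'm a beginner in machine learning. I'm trying to solve a binary document classification task with Laboro's pretrained DistilBERT.
Problem / error message
model.fit() raises the following error.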
ValueError                                Traceback (most recent call last)
<ipython-input-95-23c2a77e679e> in <module>()
      6           batch_size=BATCH_SIZE,
      7           epochs=EPOCHS,
----> 8           validation_data=(features_val, labels_val))

ValueError: Shape mismatch: The shape of labels (received (4, 1)) should equal the shape of logits except for the last dimension (received (4, 128, 768)).
Is the shape of the model's output the cause?
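To make the rule quoted in the error message concrete, here is a minimal pure-Python sketch of the shape check. It is a simplified illustration, not the actual Keras/TensorFlow implementation: the function name `labels_fit_logits` is hypothetical, and the flattening branch reflects my assumption about why labels with a trailing size-1 axis are normally accepted alongside `(batch, num_classes)` logits.

```python
from math import prod


def labels_fit_logits(labels_shape, logits_shape):
    """Simplified check: do sparse labels fit logits of the given shape?

    Hypothetical helper for illustration only; it mirrors the rule quoted
    in the error message (labels shape == logits shape without the last axis).
    """
    labels_shape = tuple(labels_shape)
    logits_shape = tuple(logits_shape)
    if len(labels_shape) != len(logits_shape) - 1:
        # Assumption: when the ranks do not line up, labels are flattened,
        # so only the number of elements has to match.
        return prod(labels_shape) == prod(logits_shape[:-1])
    # Ranks line up: every axis except the last logits axis must match.
    return labels_shape == logits_shape[:-1]


# Shapes I expected for a 2-class classifier with batch size 4.
print(labels_fit_logits((4, 1), (4, 2)))         # True
# Shapes reported in the error message above.
print(labels_fit_logits((4, 1), (4, 128, 768)))  # False
```

Under this reading, logits of shape (4, 128, 768) look like per-token hidden vectors (128 tokens, 768 dimensions) rather than two class scores per document, so the loss seems to be receiving a different tensor than the classification output I intended.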
Relevant source code
features_train holds the encoded input strings and labels_train holds the encoded labels (their contents are shown after the code).
python
import os
import random
import sys
import warnings
import re
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras import losses
from tensorflow.keras.optimizers import Adam
from transformers import AlbertTokenizer, TFDistilBertForSequenceClassification, DistilBertConfig

BATCH_SIZE = 4
EPOCHS = 5
MAXLEN = 128
LR = 1e-5
pretrained_model_name_or_path = 'laboro-ai/distilbert-base-japanese'

model = TFDistilBertForSequenceClassification.from_pretrained(pretrained_model_name_or_path, from_pt=True, num_labels=2)
optimizer = Adam(learning_rate=3e-5)
loss = losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer=optimizer,
              loss=loss,
              metrics=['accuracy'])

model.fit(x=features_train,
          y=labels_train,
          batch_size=BATCH_SIZE,
          epochs=EPOCHS,
          validation_data=(features_val, labels_val))
#### features_train
{'input_ids': <tf.Tensor: shape=(3754, 128), dtype=int32, numpy=
array([[    2,     6, 22258, ...,     1,     1,     1],
       [    2,     6, 19958, ...,     1,     1,     1],
       [    2,     6,  1780, ..., 15229, 22485,     3],
       ...,
       [    2,  5676, 11915, ...,     1,     1,     1],
       [    2,  5676, 11915, ...,     1,     1,     1],
       [    2,  5676, 11915, ...,     1,     1,     1]], dtype=int32)>,
 'attention_mask': <tf.Tensor: shape=(3754, 128), dtype=int32, numpy=
array([[1, 1, 1, ..., 0, 0, 0],
       [1, 1, 1, ..., 0, 0, 0],
       [1, 1, 1, ..., 1, 1, 1],
       ...,
       [1, 1, 1, ..., 0, 0, 0],
       [1, 1, 1, ..., 0, 0, 0],
       [1, 1, 1, ..., 0, 0, 0]], dtype=int32)>}
#### labels_train
[0 1 0 ... 0 1 0]
What I tried
When I checked each layer's shape with model.summary(), every output shape was reported as "multiple".
_________________________________________________________________
Layer (type)                 Output Shape              Param #
_________________________________________________________________
distilbert (TFDistilBertMain multiple                  67497984
_________________________________________________________________
pre_classifier (Dense)       multiple                  590592
_________________________________________________________________
classifier (Dense)           multiple                  1538
_________________________________________________________________
dropout_119 (Dropout)        multiple                  0
_________________________________________________________________
Total params: 68,090,114
Trainable params: 68,090,114
Non-trainable params: 0
_________________________________________________________________
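Since the summary only reports "multiple", another way to see concrete shapes would be to call the model on a small batch and list the shape of every output it returns. The sketch below is a hypothetical helper (`describe_outputs` is my own name, not a library function) that does this for any mapping from output name to tensor-like objects. The stand-in objects and the output names in the demo are placeholders for illustration, not real model output.

```python
from types import SimpleNamespace


def describe_outputs(outputs):
    """Map each output name to its shape (or a list of shapes).

    `outputs` is assumed to be a mapping whose values either have a
    `.shape` attribute or are tuples/lists of such objects.
    """
    described = {}
    for name, value in outputs.items():
        if isinstance(value, (tuple, list)):
            # Some outputs may hold several tensors, e.g. one per layer.
            described[name] = [tuple(v.shape) for v in value]
        else:
            described[name] = tuple(value.shape)
    return described


# Placeholder objects standing in for tensors (illustrative shapes only).
fake_outputs = {
    "logits": SimpleNamespace(shape=(4, 2)),
    "hidden_states": (SimpleNamespace(shape=(4, 128, 768)),) * 2,
}
print(describe_outputs(fake_outputs))
# {'logits': (4, 2), 'hidden_states': [(4, 128, 768), (4, 128, 768)]}
```

With the real model, something like describe_outputs(model({k: v[:4] for k, v in features_train.items()})) might work, assuming the return value behaves like a mapping from output name to tensor. If more than one output is listed, that could explain why the loss is compared against a tensor other than the class logits.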
Additional information (framework/tool versions, etc.)
The environment is Google Colab. The transformers and sentencepiece libraries are at their latest versions.