What do the outputs of transformers' TFBertForQuestionAnswering mean?

Question

[transformers:TFBertForQuestionAnswering](https://huggingface.co/transformers/model_doc/bert.html?highlight=forsequenceclassification#tfbertforquestionanswering)
I have a question about the Example for TFBertForQuestionAnswering on the page linked above.

```Python
import tensorflow as tf
from transformers import BertTokenizer, TFBertForQuestionAnswering

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = TFBertForQuestionAnswering.from_pretrained('bert-base-uncased')
input_ids = tf.constant(tokenizer.encode("Hello, my dog is cute", add_special_tokens=True))[None, :]  # Batch size 1
outputs = model(input_ids)
start_scores, end_scores = outputs[:2]
```

In this code, what do start_scores and end_scores mean?

Accepted Answer

As the documentation says, TFBertForQuestionAnswering extracts the part of the input text that corresponds to the answer.

> Bert Model with a span classification head on top for extractive question-answering tasks like SQuAD (a linear layers on top of the hidden-states output to compute span start logits and span end logits).

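So start_scores and end_scores are the logits for the start and the end of the extracted span. Each of them holds one value per token of the input sequence, and each value expresses how likely that token is to be the start (or the end) of the answer. Strictly speaking these are logits, not probabilities; applying a softmax over the sequence turns them into probabilities. To perform the question-answering task, you take the index of the largest value in start_scores and the index of the largest value in end_scores, then extract the tokens from the first index through the second from the input text.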
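The extraction step described above can be sketched as follows. This is only a minimal illustration: the tokens and logit values are made up for the example (they are not real model output), and `argmax` and `softmax` are small helpers defined here in plain Python rather than functions from transformers.

```python
import math

# Hypothetical tokens and logits, invented for illustration only.
# A real model returns one start logit and one end logit per input token.
tokens = ["[CLS]", "who", "is", "cute", "?", "[SEP]",
          "my", "dog", "is", "cute", "[SEP]"]
start_scores = [-5.0, -4.2, -4.8, -3.9, -5.1, -4.7, 3.4, 2.1, -1.0, -0.5, -4.9]
end_scores = [-5.2, -4.5, -4.1, -3.0, -5.0, -4.6, -0.8, 3.9, -0.2, 1.1, -5.0]


def argmax(values):
    """Return the index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


def softmax(values):
    """Convert logits into probabilities that sum to 1."""
    shift = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Logits become probabilities only after a softmax over the sequence.
start_probs = softmax(start_scores)
end_probs = softmax(end_scores)

# Pick the most likely start and end token positions.
start_index = argmax(start_scores)
end_index = argmax(end_scores)

# Slice the tokens from start to end (inclusive) to get the answer span.
answer = " ".join(tokens[start_index:end_index + 1])
print(start_index, end_index, answer)
```

With real model output, the two argmax positions are chosen independently, so the end index can come out before the start index; in that case the slice is empty and some extra handling is needed.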
