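Background
Hello.
Thank you for taking a look at my question.
I am getting an error when using the pipeline feature of transformers and have been stuck on it.
I know you are busy, but I would appreciate an answer.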
What I want to achieve
I want to run question-answering with the transformers pipeline feature.
The models I am using are studio-ousia/luke-japanese-base-lite and
My_luke_model_squad.pth, which is studio-ousia/luke-japanese-base-lite fine-tuned on the DDQA dataset (a driving-domain QA dataset).
My_luke_model_squad.pth is available here ( https://huggingface.co/Mizuiro-sakura/luke-japanese-finetuned-question-answering/tree/main ).
Problem and error message
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "C:\Users\AppData\Local\Programs\Python\Python39\lib\multiprocessing\pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "C:\Users\AppData\Local\Programs\Python\Python39\lib\multiprocessing\pool.py", line 48, in mapstar
    return list(map(*args))
  File "C:\Users\AppData\Local\Programs\Python\Python39\lib\site-packages\transformers\data\processors\squad.py", line 180, in squad_convert_example_to_features
    encoded_dict = tokenizer.encode_plus(  # TODO(thom) update this logic
  File "C:\Users\AppData\Local\Programs\Python\Python39\lib\site-packages\transformers\tokenization_utils_base.py", line 2667, in encode_plus
    return self._encode_plus(
  File "C:\Users\AppData\Local\Programs\Python\Python39\lib\site-packages\transformers\models\mluke\tokenization_mluke.py", line 559, in _encode_plus
    ) = self._create_input_sequence(
  File "C:\Users\AppData\Local\Programs\Python\Python39\lib\site-packages\transformers\models\mluke\tokenization_mluke.py", line 775, in _create_input_sequence
    first_ids = get_input_ids(text)
  File "C:\Users\AppData\Local\Programs\Python\Python39\lib\site-packages\transformers\models\mluke\tokenization_mluke.py", line 735, in get_input_ids
    tokens = self.tokenize(text, **kwargs)
  File "C:\Userst\AppData\Local\Programs\Python\Python39\lib\site-packages\transformers\tokenization_utils.py", line 520, in tokenize
    if token in no_split_token:
TypeError: unhashable type: 'list'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\desktop\Python\luke_squad.py", line 15, in <module>
    result=qa(text)
  File "C:\Users\AppData\Local\Programs\Python\Python39\lib\site-packages\transformers\pipelines\question_answering.py", line 380, in __call__
    return super().__call__(examples[0], **kwargs)
  File "C:\Users\AppData\Local\Programs\Python\Python39\lib\site-packages\transformers\pipelines\base.py", line 1074, in __call__
    return self.run_single(inputs, preprocess_params, forward_params, postprocess_params)
  File "C:\Users\AppData\Local\Programs\Python\Python39\lib\site-packages\transformers\pipelines\base.py", line 1095, in run_single
    for model_inputs in self.preprocess(inputs, **preprocess_params):
  File "C:\Users\AppData\Local\Programs\Python\Python39\lib\site-packages\transformers\pipelines\question_answering.py", line 396, in preprocess
    features = squad_convert_examples_to_features(
  File "C:\Users\AppData\Local\Programs\Python\Python39\lib\site-packages\transformers\data\processors\squad.py", line 377, in squad_convert_examples_to_features
    features = list(
  File "C:\Users\AppData\Local\Programs\Python\Python39\lib\site-packages\tqdm\std.py", line 1111, in __iter__
    for obj in iterable:
  File "C:\Users\AppData\Local\Programs\Python\Python39\lib\multiprocessing\pool.py", line 420, in <genexpr>
    return (item for chunk in result for item in chunk)
  File "C:\Users\AppData\Local\Programs\Python\Python39\lib\multiprocessing\pool.py", line 870, in next
    raise value
TypeError: unhashable type: 'list'
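The innermost frame of the worker traceback fails at `if token in no_split_token`, which suggests that `tokenize` was handed a list where it expected a string. A minimal sketch of that failure mode in plain Python follows; the set contents and the token list are made-up values for illustration, not values taken from transformers.

```python
# Hypothetical stand-in for the tokenizer's set of tokens that must not be split.
no_split_token = {"<pad>", "<mask>"}

# Assumption for illustration: the tokenizer received a list, not a str.
token = ["好きな", "食べ物"]

try:
    # Set membership has to hash the left operand, and lists are not hashable.
    found = token in no_split_token
    message = ""
except TypeError as exc:
    message = str(exc)

print(message)
```

The same check succeeds when `token` is a string, so the question is probably where in the preprocessing path a list reaches `tokenize`.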
Source code in question
python
import torch
from transformers import MLukeTokenizer, pipeline

tokenizer = MLukeTokenizer.from_pretrained('studio-ousia/luke-japanese-base-lite')
model = torch.load('My_luke_model_squad.pth')  # load the fine-tuned model
qa = pipeline('question-answering', model=model, tokenizer=tokenizer)

text = {
    'context': '私の名前はEIMIです。好きな食べ物は苺です。趣味は皆さんと会話することです。',
    'question': '好きな食べ物は何ですか'
}
if __name__ == '__main__':
    result = qa(text)

    print(result)
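For comparison with the pipeline call above, here is a minimal sketch of the span-selection step that extractive question answering relies on: given a start score and an end score per token, pick the highest-scoring span whose start is not after its end. The function name `best_span`, the `max_len` parameter, and the scores are hypothetical and written in plain Python; this is not the pipeline's actual implementation.

```python
from typing import List, Tuple


def best_span(start_scores: List[float], end_scores: List[float],
              max_len: int = 15) -> Tuple[int, int]:
    """Return (start, end) token indices maximizing start score + end score."""
    best = (0, 0)
    best_score = float("-inf")
    for i, s in enumerate(start_scores):
        # Only consider end positions at or after the start, within max_len tokens.
        for j in range(i, min(i + max_len, len(end_scores))):
            score = s + end_scores[j]
            if score > best_score:
                best_score = score
                best = (i, j)
    return best


# Made-up scores for a five-token context.
start = [0.1, 0.2, 3.0, 0.1, 0.0]
end = [0.0, 0.1, 0.5, 2.5, 0.2]
span = best_span(start, end)
print(span)
```

With a real model, the scores would be the start and end logits it outputs, and the selected indices would be mapped back to text with the tokenizer.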
Additional information (framework/tool versions, etc.)
python:3.9.13
transformers:4.24.0