Question edit history

Added answers to the follow-up questions.

2018/11/16 04:30

Posted

Score 35

test CHANGED Viewed

@@ -129,3 +129,39 @@
 1 The part of the code that causes the UnicodeDecodeError
 2 How to deal with the encoding problem. I would appreciate answers to both points.
+Answers to the follow-up questions
+1 Contents of data_train_s when the error occurs
+["print(len([s for s in l if s.endswith('e')]))", 'onSuccess {Function}', "1' UNION ALL SELECT CONCAT(0x716b6b6a71,(CASE WHEN (EXISTS(SELECT creditcard_id FROM performance_schema.events_waits_summary_by_instance)) THEN 1 ELSE 0 END),0x716a717a71),NULL-- mwJp", 'onAfterRender {Function}', "1' UNION ALL SELECT CONCAT(0x716b6b6a71,(CASE WHEN (EXISTS(SELECT aTEC FROM zsTX)) THEN 1 ELSE 0 END),0x716a717a71),NULL-- utMa", 'select* from database where id = 1;']
+2 Contents of the traceback
+Traceback (most recent call last):
+  File "svm.py", line 59, in <module>
+    words = get_words(data_train_s)
+  File "svm.py", line 30, in get_words
+    ret.append(get_words_main(content))
+  File "svm.py", line 35, in get_words_main
+    return [token for token in tokenize(content)]
+  File "svm.py", line 35, in <listcomp>
+    return [token for token in tokenize(content)]
+  File "svm.py", line 22, in tokenize
+    yield node.surface.lower()
+UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb6 in position 0: invalid start byte
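
The traceback shows the error is raised when `node.surface` is read inside `tokenize`, which means the bytes returned for that token are not valid UTF-8 (0xb6 is a continuation byte, so it cannot begin a character). One way to find which element of `data_train_s` triggers it is to tokenize the items one at a time and record the failures. The sketch below is only an illustration under assumptions: `find_failing_inputs` and `fake_tokenize` are hypothetical names, and `fake_tokenize` is a stand-in that reproduces the same error message, not the `tokenize` from `svm.py`.

```python
def find_failing_inputs(items, tokenize):
    """Return (index, item, message) for each item whose tokenization fails to decode."""
    failures = []
    for index, item in enumerate(items):
        try:
            # tokenize is a generator in the traceback, so consume it fully
            for _ in tokenize(item):
                pass
        except UnicodeDecodeError as exc:
            failures.append((index, item, str(exc)))
    return failures


def fake_tokenize(text):
    """Hypothetical stand-in for the real tokenize(); not the code from svm.py."""
    for word in text.split():
        if word == "BAD":
            # 0xb6 is a UTF-8 continuation byte, so it cannot start a character
            yield b"\xb6".decode("utf-8")
        yield word.lower()


# Assumed sample input; the second item is rigged to fail in fake_tokenize
sample = ["select* from database where id = 1;", "onSuccess BAD {Function}"]
failures = find_failing_inputs(sample, fake_tokenize)
for index, item, message in failures:
    print(index, repr(item), message)
```

With the real code, passing `data_train_s` and the actual `tokenize` in place of the sample and the stand-in would show whether one specific string or every string fails, which helps separate a bad input from a tokenizer or dictionary encoding mismatch.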