Question edit history

1

Added what I tried

2020/02/12 03:15

Posted

kimtakuya_

Score 22

test CHANGED
File without changes
test CHANGED
@@ -54,6 +54,8 @@
 
  ### What I tried
 
+ First attempt
+
  I specified utf-8 as the character encoding, but it does not work.
 
  f=open(os.path.join(glove_dir,'glove.6B.100d.txt'),encoding='utf-8')
@@ -63,6 +65,36 @@
  error:
 
  'utf-8' codec can't decode byte 0x93 in position 5457: invalid start byte
+
+
+
+ Second attempt
+
+ f=open(os.path.join(glove_dir,'glove.6B.100d.txt'),encoding='utf-8', errors='backslashreplace')
+
+ for i, line in enumerate(f, start=1):
+
+ if '\x93' in line:
+
+ print(i, 'data=', line)
+
+ I ran this to print information about the lines that (presumably?) trigger the error.
+
+
+
+ output:
+
+ 218 data= 遯カ\x93 -0.22111 -0.37868 -0.45325 0.14185 -0.41884 -0.068733 0.9203 -0.61626 -0.5716 -0.42955 1.2049 -1.2358 -0.26185 0.088171 0.75712 -0.24336 0.46966 0.15848 -0.63489 0.040005 0.28095 0.086989 0.80209 0.74317 0.30236 -0.57191 0.65167 -0.4509 1.1676 -0.060849 -0.85457 1.012 0.6167 -0.9409 -0.59359 -0.32423 0.31153 0.97604 -0.33894 0.32657 0.32848 -1.118 -0.090404 -0.61118 0.32629 -0.61908 0.9044 -0.8888 0.0023076 0.58002 -0.71818 -0.43466 0.55749 1.1147 -0.74757 -2.8426 -0.3132 -0.72711 0.16355 0.32031 -0.26561 0.28186 -0.86369 -0.25157 1.0981 -0.2622 -0.49901 0.071966 0.20213 0.072797 -0.23135 -0.022841 0.52705 0.25267 -0.081948 -0.53206 0.39748 0.53545 -0.89259 -0.64567 0.15596 0.022857 -0.29035 0.003132 -0.8019 0.29554 0.10346 -1.2921 0.31751 0.64262 -0.3628 0.15087 0.13307 1.1898 -0.31689 0.22648 -1.0675 -0.26161 0.080567 -1.2265
+
+
+
+ 76956 data= 遶\x87\x93 -0.28535 0.59318 -0.98323 1.4728 0.25745 0.41131 -0.17679 -0.11268 -0.58251 0.18187 1.3638 0.25468 -0.25794 -0.32218 0.055278 0.36173 -0.12181 0.27412 0.60933 0.62075 0.37659 1.0968 -0.34157 0.039422 1.4037 0.4129 0.15789 0.5491 -0.17189 0.55394 0.10505 0.16119 0.2298 -0.58572 -0.69517 0.19146 0.011744 -0.13812 -0.14675 0.92818 0.94505 0.064279 -0.12379 -0.057076 -0.68696 -0.99602 0.052588 -0.86161 0.22469 0.38867 -0.39322 -0.10653 -0.60466 0.8345 -0.57358 0.64156 -0.12862 -0.6559 -0.43485 0.67029 -0.81563 -0.48901 -0.81829 0.051527 0.63091 -0.10189 -0.34779 -0.99888 -0.43871 0.12262 0.51132 0.84689 -0.2557 0.49127 1.0359 0.76659 0.023119 -0.42643 0.37069 -0.30532 -0.396 0.071616 -0.14664 0.68623 -0.48416 0.68954 0.47875 -0.31374 0.59729 0.67351 -0.072777 -0.53515 -0.36786 -0.8431 -0.038361 -0.66771 -0.055356 0.35602 -0.40015 0.052309
+
+
+
+ 82823 data= h逶サ\x93 0.3164 -0.21773 -0.84591 -0.079903 0.12024 0.39647 -0.92599 -0.53315 0.23001 -0.27539 0.21821 0.56403 -0.33615 -0.74269 -0.85972 -0.0066054 0.74867 -0.92791 0.69111 -0.66205 0.42678 0.058274 -0.5581 0.36632 -1.4436 1.1286 0.16445 0.5975 0.38322 -0.70766 0.46191 0.77508 0.18933 -0.29637 -0.73876 0.028421 0.14837 0.16107 -0.13688 0.35587 -0.11276 -0.92444 -0.18708 0.53556 -0.14804 -0.028497 0.40851 -0.32742 0.1738 0.29484 -0.92802 -0.13762 0.21952 -0.13727 -0.73242 0.70597 -0.62059 -0.17175 -0.47128 -0.72417 0.26664 0.3126 0.15971 -0.36321 -0.31418 0.015932 0.88777 -0.43147 -0.94412 0.2309 0.076188 -0.21855 -0.32944 0.22401 0.51158 -0.099254 -0.33135 -0.25784 -0.22101 -0.5114 -0.17049 -0.32388 -0.32175 -0.3164 0.78179 1.442 -1.027 -0.58906 0.41066 -0.083527 -0.22247 -0.84649 0.20128 0.50814 -0.025018 -0.44438 0.18665 -0.97964 -0.38345 -0.97738
+
+
 
 
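As a rough illustration of the second attempt in this revision, the following sketch reads a file in binary mode and reports every line that fails to decode as UTF-8, which avoids depending on how a replaced byte is rendered. The helper name `find_undecodable_lines` and the small sample file are hypothetical and stand in for `glove.6B.100d.txt`; this is not the poster's code. The last part also shows one detail worth checking: with `errors='backslashreplace'`, an undecodable byte 0x93 is turned into the four characters backslash, `x`, `9`, `3`, so a search for it needs an escaped backslash, whereas a plain `'\x93'` denotes the single character U+0093.

```python
import os
import tempfile


def find_undecodable_lines(path, encoding="utf-8"):
    """Return (line_number, raw_bytes, error) for each line that fails to decode.

    Hypothetical helper for illustration only.
    """
    bad = []
    with open(path, "rb") as f:  # binary mode: nothing is decoded while reading
        for lineno, raw in enumerate(f, start=1):
            try:
                raw.decode(encoding)
            except UnicodeDecodeError as exc:
                bad.append((lineno, raw, exc))
    return bad


# Hypothetical sample data: line 2 contains a stray 0x93 byte.
sample = b"the 0.1 0.2\nbad\x93 0.3 0.4\nof 0.5 0.6\n"
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(sample)
    sample_path = tmp.name

bad_lines = find_undecodable_lines(sample_path)
for lineno, raw, exc in bad_lines:
    # Show the line number, the start of the raw bytes, and why decoding failed.
    print(lineno, raw[:20], exc.reason)

# Text-mode variant: the replaced byte appears as the literal text \x93,
# so the search string must contain an escaped backslash.
with open(sample_path, encoding="utf-8", errors="backslashreplace") as f:
    flagged = [i for i, line in enumerate(f, start=1) if "\\x93" in line]

os.remove(sample_path)
```

Inspecting the raw bytes of the reported lines can help decide whether the file was saved in a different encoding or was damaged by an earlier re-save, which is a separate question from how to open it.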