Edit history

Question edit history

Added what I found through my own research

2018/07/11 00:44

Posted

jun_endo

Score 56

@@ -12,7 +12,13 @@
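 The strings I want to recognize all have the form
+```
-```a number from 0 to 9999 followed by a trailing letter```
+a number from 0 to 9999 followed by a trailing letter
+```
 and share that structure.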
@@ -34,7 +40,7 @@
-****Addendum, July 11, 2018****
+###****Addendum, July 11, 2018****
@@ -46,17 +52,23 @@
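+However, **the number of strings to recognize is extremely large**, and even if I can prepare the image data,
+**when creating the labels**, even a simple calculation shows that
+**a 260,000-dimensional array would be needed**.
+(I asked about this in the question below.)
-↓
+↓ How to create labels for machine learning
 [https://teratail.com/questions/135207](https://teratail.com/questions/135207)
-However, **the number of strings to recognize is extremely large**, and even if I can prepare the image data,
-**when creating the labels**, even a simple calculation shows that
-**a 260,000-dimensional array would be needed**.
 Since that is clearly impractical,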
@@ -66,8 +78,248 @@
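 and only then can it be fed to machine learning for recognition.
+(I asked about this in the question below.)
+↓ Machine learning: recognizing a digit string one character at a time
 [https://teratail.com/questions/135292](https://teratail.com/questions/135292)
+###A character-by-character recognizer exists for digits only
+Per-digit number recognition
+[https://stackoverflow.com/questions/9413216/simple-digit-recognition-ocr-in-opencv-python](https://stackoverflow.com/questions/9413216/simple-digit-recognition-ocr-in-opencv-python)
+I updated the program above to run on current versions and ran it,
+and it worked well.
+**Training program**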
+```lang-python
+import sys
+import numpy as np
+import cv2
+im = cv2.imread('pitrain.png')
+im3 = im.copy()
+gray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
+blur = cv2.GaussianBlur(gray,(5,5),0)
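+# Inverted binary image via Gaussian adaptive thresholding (block size 11, C = 2)
+thresh = cv2.adaptiveThreshold(blur,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,cv2.THRESH_BINARY_INV,11,2)
+#################      Now finding Contours         ###################
+# [-2:] keeps (contours, hierarchy) whether findContours returns 3 values (OpenCV 3.x) or 2 (OpenCV 4.x)
+cnts, hierarchy = cv2.findContours(thresh,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)[-2:]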
+samples =  np.empty((0,100))
+responses = []
+keys = [i for i in range(48,58)]
+for cnt in cnts:
+    if cv2.contourArea(cnt)>50:
+        [x,y,w,h] = cv2.boundingRect(cnt)
+        if  h>28:
+            cv2.rectangle(im,(x,y),(x+w,y+h),(0,0,255),2)
+            roi = thresh[y:y+h,x:x+w]
+            roismall = cv2.resize(roi,(10,10))
+            cv2.imshow('norm',im)
+            key = cv2.waitKey(0)
+            if key == 27:  # (escape to quit)
+                sys.exit()
+            elif key in keys:
+                responses.append(int(chr(key)))
+                sample = roismall.reshape((1,100))
+                samples = np.append(samples,sample,0)
+responses = np.array(responses,np.float32)
+responses = responses.reshape((responses.size,1))
+print("training complete")
+np.savetxt('generalsamples.data',samples)
+np.savetxt('generalresponses.data',responses)
+```
+pitrain.png
+![pitrain.png](69d6761129d6a5c7cbee8de2a1ec50b9.png)
+For testing
+```lang-python
+import cv2
+import numpy as np
+#######   training part    ###############
+samples = np.loadtxt('generalsamples.data',np.float32)
+responses = np.loadtxt('generalresponses.data',np.float32)
+responses = responses.reshape((responses.size,1))
+model = cv2.ml.KNearest_create()
+model.train(samples, cv2.ml.ROW_SAMPLE, responses)
+############################# testing part  #########################
+im = cv2.imread('PS.png')
+out = np.zeros(im.shape,np.uint8)
+gray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
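+# Same inverted Gaussian adaptive threshold as in the training program
+thresh = cv2.adaptiveThreshold(gray,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,cv2.THRESH_BINARY_INV,11,2)
+# [-2:] keeps (contours, hierarchy) on both OpenCV 3.x and 4.x
+contours,hierarchy = cv2.findContours(thresh,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)[-2:]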
+for cnt in contours:
+    if cv2.contourArea(cnt)>50:
+        [x,y,w,h] = cv2.boundingRect(cnt)
+        if  h>28:
+            cv2.rectangle(im,(x,y),(x+w,y+h),(0,255,0),2)
+            roi = thresh[y:y+h,x:x+w]
+            roismall = cv2.resize(roi,(10,10))
+            roismall = roismall.reshape((1,100))
+            roismall = np.float32(roismall)
+            retval, results, neigh_resp, dists = model.findNearest(roismall, k = 1)
+            string = str(int((results[0][0])))
+            cv2.putText(out,string,(x,y+h),cv2.FONT_HERSHEY_SIMPLEX,1,(0,255,0))
+cv2.imshow('im',im)
+cv2.imshow('out',out)
+cv2.waitKey(10000)
+```
+pi.png
+![pi.png](67b652489feda92c62077f820a65e03d.png)
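+**The results are shown on the site linked above, so please see them there.**
+###I tried mixing letters into the digit image
+Without changing the program above at all,
+I swapped the image for the one below.
+PS.png
+![Image description](c3efd715601900962bdf6b8d334c496c.png)
+When I ran it,
-1. I want to recognize mixed strings of **digits + letters**
+as expected, the letters were also recognized as digits,
+and the attempt failed.
+###
+If there is something I can add to the program above
+so that it can recognize letters,
+I would like to know.
+Any other approaches would also be welcome.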
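
One way the training and test programs above might be extended to letters is to stop converting the pressed key to an integer digit and to store its character code as the class label instead. The sketch below is only an illustration under assumptions: `ALLOWED_CHARS`, `key_to_label`, and `label_to_char` are hypothetical helpers that are not part of the original programs, the training image is assumed to contain letters as well as digits, and nothing here guarantees that a nearest-neighbor classifier on 10x10 pixels can separate similar shapes such as `0` and `O`.

```python
import string

# Characters the classifier may be trained on: digits plus letters.
# Assumption: the post only says "alphabet", so both cases are accepted here.
ALLOWED_CHARS = string.digits + string.ascii_letters


def key_to_label(key):
    """Map a cv2.waitKey() code to a numeric class label, or None to skip it."""
    if key < 0:  # waitKey returns -1 when no key was pressed
        return None
    ch = chr(key & 0xFF)  # keep only the low byte of the key code
    if ch in ALLOWED_CHARS:
        return float(ord(ch))  # the character code itself is the label
    return None


def label_to_char(label):
    """Turn a label returned by the classifier back into its character."""
    return chr(int(label))
```

In the training loop, `responses.append(int(chr(key)))` would then become an append of `key_to_label(key)` whenever it is not `None`, and in the test program the recognized text would be `label_to_char(results[0][0])` rather than `str(int(results[0][0]))`.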

Edit 1

2018/07/11 00:44

Posted

jun_endo

Score 56

@@ -44,6 +44,14 @@
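 **I was originally considering doing character recognition with machine learning**.
+↓
+[https://teratail.com/questions/135207](https://teratail.com/questions/135207)
 However, **the number of strings to recognize is extremely large**, and even if I can prepare the image data,
 **when creating the labels**, even a simple calculation shows that
@@ -58,7 +66,7 @@
 and only then can it be fed to machine learning for recognition.
+[https://teratail.com/questions/135292](https://teratail.com/questions/135292)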

Added my own thinking on the problem

2018/07/11 00:16

Posted

jun_endo

Score 56

@@ -31,3 +31,35 @@
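 Could someone please advise?
+****Addendum, July 11, 2018****
+The reason I want to run **object recognition on characters** is this:
+**I was originally considering doing character recognition with machine learning**.
+However, **the number of strings to recognize is extremely large**, and even if I can prepare the image data,
+**when creating the labels**, even a simple calculation shows that
+**a 260,000-dimensional array would be needed**.
+Since that is clearly impractical,
+**in the preprocessing stage I want to
+split each string into its individual characters and treat each one as a virtual image of its own**,
+and only then can it be fed to machine learning for recognition.
+1. I want to recognize mixed strings of **digits + letters**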
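
The label count mentioned above can be checked with a little arithmetic. This sketch assumes the figure of 260,000 comes from pairing each of the 10,000 numbers (0 to 9999) with one of 26 letters, which the post does not state explicitly. `per_char_labels` is a hypothetical helper, and the use of uppercase letters only is also an assumption.

```python
import string

DIGITS = string.digits            # "0" .. "9"
LETTERS = string.ascii_uppercase  # assumption: 26 uppercase letters

# Whole-string labels: every number 0..9999 paired with every letter.
whole_string_classes = 10_000 * len(LETTERS)

# Per-character labels: one class per possible character.
per_char_classes = len(DIGITS) + len(LETTERS)


def per_char_labels(text):
    """Encode a string as a list of class indices, one per character."""
    alphabet = DIGITS + LETTERS
    return [alphabet.index(ch) for ch in text]


print(whole_string_classes)      # 260000
print(per_char_classes)          # 36
print(per_char_labels("123A"))   # [1, 2, 3, 10]
```

Under these assumptions, splitting the string into characters reduces the label space from 260,000 classes to 36, which is the motivation for the per-character preprocessing described above.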