I want to change the text-drawing code from OpenCV to PIL.

Background / what I want to achieve

As background, I am editing demo.py and yolo.py from the deep_sort_yolov3 repository on GitHub.
[Reference GitHub repository: deep_sort_yolov3]

demo.py (line numbers may differ slightly)
Line 12: changed from PIL import Image → from PIL import Image, ImageFont, ImageDraw
Line 47: changed model_filename = 'model_data/market1501.pb' → model_filename = 'model_data/mars-small128.pb'
Line 67: added font = ImageFont.truetype(font='font/yumin.ttf', size=50) (below fps=)
Line 77: added draw = ImageDraw.Draw(image) (below image=)
Line 112: commented out cv2.putText(frame, str(class_names[0]), ...)
Below it, added draw.text((int(bbox[0]), int(bbox[1])), str(class_names[0]), fill=(255, 255, 255), font=font)
and frame = np.array(image[...,::-1])

yolo.py (line numbers may differ slightly)
Line 54: changed with open(classes_path) as f: → with open(classes_path, encoding="utf-8") as f:
Line 1268: commented out if predicted_class != args["class"]: and the lines that follow
Line 1224: uncommented if predicted_class != 'person' and predicted_class != 'car': and the lines that follow
Also changed 'person' to 'ひと'
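The encoding change above matters because the class file now contains the Japanese label 'ひと'. A minimal sketch of reading such a file, assuming one class name per line; the helper name read_class_names and the temporary file are hypothetical and only for illustration, not code from yolo.py:

```python
import os
import tempfile


def read_class_names(classes_path):
    """Read one class name per line, decoding the file explicitly as UTF-8."""
    with open(classes_path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


# Write a small stand-in for model_data/coco_classes.txt and read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "coco_classes.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("ひと\nbicycle\ncar\n")
    class_names = read_class_names(path)

print(class_names)
```

Without encoding="utf-8", open() falls back to the platform default encoding, which on some systems cannot decode a UTF-8 file containing Japanese text.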

Inside the deep_sort_yolov3-master folder
Added a font folder
Added an output folder
Added a test_video folder
Changed person to ひと in coco_classes.txt inside model_data

Downloaded yolov3.weights
Converted the weights file above:
python convert.py yolov3.cfg yolov3.weights model_data/yolo.h5

I would appreciate help from anyone who knows how to fix this.

Problem / error message

Traceback (most recent call last):
  File "demo.py", line 181, in <module>
    main(YOLO())
  File "demo.py", line 121, in main
    frame = np.array(image[...,::-1])
TypeError: 'Image' object is not subscriptable

Relevant source code

python
1#! /usr/bin/env python
2# -*- coding: utf-8 -*-
3
4from __future__ import division, print_function, absolute_import
5import os
6import datetime
7from timeit import time
8import warnings
9import cv2
10import numpy as np
11import argparse
12#from PIL import Image
13from PIL import Image, ImageFont, ImageDraw
14from yolo import YOLO
15from deep_sort import preprocessing
16from deep_sort import nn_matching
17from deep_sort.detection import Detection
18from deep_sort.tracker import Tracker
19from tools import generate_detections as gdet
20from deep_sort.detection import Detection as ddet
21from collections import deque
22from keras import backend
23
24backend.clear_session()
25ap = argparse.ArgumentParser()
26ap.add_argument("-i", "--input",help="path to input video", default = "./test_video/test.avi")
27ap.add_argument("-c", "--class",help="name of class", default = "person")
28args = vars(ap.parse_args())
29
30pts = [deque(maxlen=30) for _ in range(9999)]
31warnings.filterwarnings('ignore')
32
33# initialize a list of colors to represent each possible class label
34np.random.seed(100)
35COLORS = np.random.randint(0, 255, size=(200, 3),
36	dtype="uint8")
37
38def main(yolo):
39
40    start = time.time()
41    #Definition of the parameters
42    max_cosine_distance = 0.5 # threshold on cosine distance
43    nn_budget = None
44    nms_max_overlap = 0.3 # non-maximum suppression threshold
45
46    counter = []
47    #deep_sort
48    #model_filename = 'model_data/market1501.pb'
49    model_filename = 'model_data/mars-small128.pb'
50    encoder = gdet.create_box_encoder(model_filename,batch_size=1)
51
52    metric = nn_matching.NearestNeighborDistanceMetric("cosine", max_cosine_distance, nn_budget)
53    tracker = Tracker(metric)
54
55    writeVideo_flag = True
56    #video_path = "./output/output.avi"
57    video_capture = cv2.VideoCapture(args["input"])
58
59    if writeVideo_flag:
60    # Define the codec and create VideoWriter object
61        w = int(video_capture.get(3))
62        h = int(video_capture.get(4))
63        #fourcc = cv2.VideoWriter_fourcc(*'MJPG')
64        fourcc = cv2.VideoWriter_fourcc(*'XVID')
65        out = cv2.VideoWriter('./output/'+args["input"][43:57]+ "_" + args["class"] + '_output.avi', fourcc, 15, (w, h))
66        list_file = open('detection.txt', 'w')
67        frame_index = -1
68
69    fps = 0.0
70
71    font = ImageFont.truetype(font='font/yumin.ttf', size=30)
72
73    while True:
74
75        ret, frame = video_capture.read()  # frame shape 640*480*3
76        if ret != True:
77            break
78        t1 = time.time()
79
80       # image = Image.fromarray(frame)
81        image = Image.fromarray(frame[...,::-1]) #bgr to rgb
82        draw = ImageDraw.Draw(image)
83
84        boxs,class_names = yolo.detect_image(image)
85        features = encoder(frame,boxs)
86        # score to 1.0 here).
87        detections = [Detection(bbox, 1.0, feature) for bbox, feature in zip(boxs, features)]
88        # Run non-maxima suppression.
89        boxes = np.array([d.tlwh for d in detections])
90        scores = np.array([d.confidence for d in detections])
91        indices = preprocessing.non_max_suppression(boxes, nms_max_overlap, scores)
92        detections = [detections[i] for i in indices]
93
94        # Call the tracker
95        tracker.predict()
96        tracker.update(detections)
97
98        i = int(0)
99        indexIDs = []
100        c = []
101        boxes = []
102        for det in detections:
103            bbox = det.to_tlbr()
104            cv2.rectangle(frame,(int(bbox[0]), int(bbox[1])), (int(bbox[2]), int(bbox[3])),(255,255,255), 2)
105
106        for track in tracker.tracks:
107            if not track.is_confirmed() or track.time_since_update > 1:
108                continue
109            #boxes.append([track[0], track[1], track[2], track[3]])
110            indexIDs.append(int(track.track_id))
111            counter.append(int(track.track_id))
112            bbox = track.to_tlbr()
113            color = [int(c) for c in COLORS[indexIDs[i] % len(COLORS)]]
114
115            cv2.rectangle(frame, (int(bbox[0]), int(bbox[1])), (int(bbox[2]), int(bbox[3])),(color), 3)
116            cv2.putText(frame,str(track.track_id),(int(bbox[0]), int(bbox[1] -50)),0, 5e-3 * 150, (color),2)
117            if len(class_names) > 0:
118               class_name = class_names[0]
119               #cv2.putText(frame, str(class_names[0]),(int(bbox[0]), int(bbox[1] -20)),0, 5e-3 * 150, (color),2)
120               draw.text((int(bbox[0]), int(bbox[1])), str(class_names[0]), fill=(255, 255, 255), font=font)
121               frame = np.array(image[...,::-1])
122            #frame = np.array(image)
123            i += 1
124            #bbox_center_point(x,y)
125            center = (int(((bbox[0])+(bbox[2]))/2),int(((bbox[1])+(bbox[3]))/2))
126            #track_id[center]
127            pts[track.track_id].append(center)
128            thickness = 5
129            #center point
130            cv2.circle(frame,  (center), 1, color, thickness)
131
132	    #draw motion path
133            for j in range(1, len(pts[track.track_id])):
134                if pts[track.track_id][j - 1] is None or pts[track.track_id][j] is None:
135                   continue
136                thickness = int(np.sqrt(64 / float(j + 1)) * 2)
137                cv2.line(frame,(pts[track.track_id][j-1]), (pts[track.track_id][j]),(color),thickness)
138                #cv2.putText(frame, str(class_names[j]),(int(bbox[0]), int(bbox[1] -20)),0, 5e-3 * 150, (255,255,255),2)
139
140        count = len(set(counter))
141        cv2.putText(frame, "Total Object Counter: "+str(count),(int(20), int(120)),0, 5e-3 * 200, (0,255,0),2)
142        cv2.putText(frame, "Current Object Counter: "+str(i),(int(20), int(80)),0, 5e-3 * 200, (0,255,0),2)
143        cv2.putText(frame, "FPS: %f"%(fps),(int(20), int(40)),0, 5e-3 * 200, (0,255,0),3)
144        cv2.namedWindow("YOLO3_Deep_SORT", 0);
145        cv2.resizeWindow('YOLO3_Deep_SORT', 1024, 768);
146        cv2.imshow('YOLO3_Deep_SORT', frame)
147
148        if writeVideo_flag:
149            #save a frame
150            out.write(frame)
151            frame_index = frame_index + 1
152            list_file.write(str(frame_index)+' ')
153            if len(boxs) != 0:
154                for i in range(0,len(boxs)):
155                    list_file.write(str(boxs[i][0]) + ' '+str(boxs[i][1]) + ' '+str(boxs[i][2]) + ' '+str(boxs[i][3]) + ' ')
156            list_file.write('\n')
157        fps  = ( fps + (1./(time.time()-t1)) ) / 2
158        #print(set(counter))
159
160        # Press Q to stop!
161        if cv2.waitKey(1) & 0xFF == ord('q'):
162            break
163    print(" ")
164    print("[Finish]")
165    end = time.time()
166
167    if len(pts[track.track_id]) != None:
168       print(args["input"][43:57]+": "+ str(count) + " " + str(class_name) +' Found')
169
170    else:
171       print("[No Found]")
172
173    video_capture.release()
174
175    if writeVideo_flag:
176        out.release()
177        list_file.close()
178    cv2.destroyAllWindows()
179
180if __name__ == '__main__':
181    main(YOLO())

Withdrawn user

2021/11/28 21:20 (edited)

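> File "demo.py", line 121, in main
> frame = np.array(image[...,::-1])
> TypeError: 'Image' object is not subscriptable

Given this, and as the title suggests, the real issue is how the data is handled in PIL, not YOLO. Anyone trying to reproduce it would have to get YOLO working and prepare dummy data, which raises the bar considerably. Could you repost with the minimum code needed to trigger the error? For example, Python code that runs with just OpenCV and PIL, plus a dummy image.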

jbpb0

2021/11/28 22:13

> 'Image' object is not subscriptable

frame = np.array(image[...,::-1])
↓ Fix
frame = np.array(image)[...,::-1]

I think that is the fix. Also, unless you add

image = Image.fromarray(frame[...,::-1]) #bgr to rgb
draw = ImageDraw.Draw(image)

immediately before

draw.text((int(bbox[0]), int(bbox[1])), str(class_names[0]), fill=(255, 255, 255), font=font)

won't everything drawn above that point with cv2.rectangle and cv2.putText be lost?
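The ordering issue raised in this comment can be sketched as follows. This is only an illustration under stated assumptions: draw_label is a hypothetical helper, a NumPy slice assignment stands in for the earlier cv2.rectangle / cv2.putText calls so that the sketch runs without OpenCV, and the default PIL font is used instead of font/yumin.ttf (a Japanese label would need a font that contains those glyphs).

```python
import numpy as np
from PIL import Image, ImageDraw


def draw_label(frame_bgr, text, xy):
    """Draw text with PIL on the current BGR frame and return a new BGR frame."""
    # Build the PIL image from the frame as it is now, so earlier drawings are kept.
    image = Image.fromarray(frame_bgr[..., ::-1])      # BGR -> RGB
    draw = ImageDraw.Draw(image)
    draw.text(xy, text, fill=(255, 255, 255))          # default font; pass font=... if needed
    # Convert back to an ndarray first, then reverse channels; copy to get a plain array.
    return np.array(image)[..., ::-1].copy()           # RGB -> BGR


frame = np.zeros((60, 120, 3), dtype=np.uint8)
frame[10:12, 10:50] = (0, 255, 0)   # stand-in for something drawn earlier with OpenCV

frame = draw_label(frame, "person", (10, 30))
```

Because the PIL image is created from the current frame right before the text is drawn, the green line drawn earlier is still present in the returned frame.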

caramel

2021/11/29 10:02

> frame = np.array(image)[...,::-1]

That resolved the error message I was getting. However, I then got the following error message:

Traceback (most recent call last):
  File "demo.py", line 183, in <module>
    main(YOLO())
  File "demo.py", line 132, in main
    cv2.circle(frame, (center), 1, color, thickness)
cv2.error: OpenCV(4.5.3) :-1: error: (-5:Bad argument) in function 'circle'
> Overload resolution failed:
>  - Layout of the output array img is incompatible with cv::Mat (step[ndims-1] != elemsize or step[1] != elemsize*nchannels)
>  - Expected Ptr<cv::UMat> for argument 'img'

The arguments passed to circle are int values, so I don't think they are the problem.

> Expected Ptr<cv::UMat> for argument 'img'

Does "Expected Ptr<cv::UMat> for argument 'img'" mean that I should change the fromarray in image = Image.fromarray(frame[...,::-1]) to UMat?

> won't everything drawn above that point with cv2.rectangle and cv2.putText be lost?

True, I had not noticed they were disappearing until you pointed it out. I will look into that once the error above is resolved.

jbpb0

2021/11/30 05:42

> I then got the following error message

That is a separate issue from this question, so please post it as a separate question.
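The thread does not resolve the cv2.circle error, but the "Layout of the output array img is incompatible with cv::Mat" line matches what happens when an array is a reversed view: [..., ::-1] returns a view with a negative stride on the channel axis rather than a new contiguous buffer. A minimal sketch of that likely cause and one possible fix, using only NumPy and PIL (the variable names are illustrative):

```python
import numpy as np
from PIL import Image

image = Image.new("RGB", (6, 4), (10, 20, 30))   # small stand-in for the annotated frame

view = np.array(image)[..., ::-1]     # RGB -> BGR as a view with a negative channel stride
fixed = np.ascontiguousarray(view)    # same values copied into a C-contiguous buffer

print(view.flags["C_CONTIGUOUS"], view.strides)
print(fixed.flags["C_CONTIGUOUS"], fixed.strides)
```

Passing the contiguous copy to OpenCV drawing functions should avoid the layout complaint. Calling .copy() on the view, or converting with cv2.cvtColor(np.array(image), cv2.COLOR_RGB2BGR), are other ways to obtain a contiguous BGR array.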


1 answer

'Image' object is not subscriptable

python
frame = np.array(image[...,::-1])

↓ Fix

python
frame = np.array(image)[...,::-1]
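A minimal sketch that reproduces the error and the fix without YOLO or a video file, assuming only NumPy and PIL are installed. The small synthetic array stands in for an OpenCV BGR frame:

```python
import numpy as np
from PIL import Image

# Stand-in for an OpenCV frame: BGR order, uint8, pure blue.
frame = np.zeros((4, 6, 3), dtype=np.uint8)
frame[..., 0] = 255

image = Image.fromarray(frame[..., ::-1])   # BGR -> RGB, wrapped as a PIL Image

# A PIL Image does not support NumPy-style indexing, so this raises TypeError.
try:
    image[..., ::-1]
    raised = False
except TypeError:
    raised = True

# Convert to an ndarray first, then reverse the channel axis (RGB -> BGR).
restored = np.array(image)[..., ::-1]

print(raised, np.array_equal(restored, frame))
```

The difference is only where the closing parenthesis sits: the slice has to be applied to the ndarray returned by np.array(image), not to the Image object passed into it.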

Posted 2021/11/30 05:39

jbpb0

Total score 7658

