YOLOv3: how to get the confidence level

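I want to get the confidence values of detected objects in detect.py, a file published on the YOLOv3 homepage.
I assume this information is stored in prediction in the source code and I am trying to extract it, but I cannot work out what the values inside it mean. I would appreciate it if you could tell me how to output the confidence of each detected object correctly.
The source code below is an excerpt from detect.py.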

python
def arg_parse():

    parser = argparse.ArgumentParser(description='YOLO v3 Detection Module')

    parser.add_argument("--images", dest = 'images', help =
                        "Image / Directory containing images to perform detection upon",
                        default = "imgs", type = str)
    parser.add_argument("--det", dest = 'det', help =
                        "Image / Directory to store detections to",
                        default = "det", type = str)
    parser.add_argument("--bs", dest = "bs", help = "Batch size", default = 1)
    parser.add_argument("--confidence", dest = "confidence", help = "Object Confidence to filter predictions", default = 0.5)
    parser.add_argument("--nms_thresh", dest = "nms_thresh", help = "NMS Threshhold", default = 0.4)
    parser.add_argument("--cfg", dest = 'cfgfile', help =
                        "Config file",
                        default = "cfg/yolov3.cfg", type = str)
    parser.add_argument("--weights", dest = 'weightsfile', help =
                        "weightsfile",
                        default = "yolov3.weights", type = str)
    parser.add_argument("--reso", dest = 'reso', help =
                        "Input resolution of the network. Increase to increase accuracy. Decrease to increase speed",
                        default = "416", type = str)
    parser.add_argument("--scales", dest = "scales", help = "Scales to use for detection",
                        default = "1,2,3", type = str)

    return parser.parse_args()

if __name__ ==  '__main__':
    args = arg_parse()
    scales = args.scales
    images = args.images
    batch_size = int(args.bs)
    confidence = float(args.confidence)
    nms_thesh = float(args.nms_thresh)
    start = 0

    CUDA = torch.cuda.is_available()

    num_classes = 80
    classes = load_classes('data/coco.names')

    model = Darknet(args.cfgfile)
    model.load_weights(args.weightsfile)

    model.net_info["height"] = args.reso
    inp_dim = int(model.net_info["height"])
    assert inp_dim % 32 == 0
    assert inp_dim > 32

    #If there's a GPU availible, put the model on GPU
    if CUDA:
        model.cuda()

    model.eval()

    read_dir = time.time()
    #Detection phase
    try:
        imlist = [osp.join(osp.realpath('.'), images, img) for img in os.listdir(images) if os.path.splitext(img)[1] == '.png' or os.path.splitext(img)[1] =='.jpeg' or os.path.splitext(img)[1] =='.jpg']
    except NotADirectoryError:
        imlist = []
        imlist.append(osp.join(osp.realpath('.'), images))
    except FileNotFoundError:
        print ("No file or directory with the name {}".format(images))
        exit()

    if not os.path.exists(args.det):
        os.makedirs(args.det)

    load_batch = time.time()

    batches = list(map(prep_image, imlist, [inp_dim for x in range(len(imlist))]))
    im_batches = [x[0] for x in batches]
    orig_ims = [x[1] for x in batches]
    im_dim_list = [x[2] for x in batches]
    im_dim_list = torch.FloatTensor(im_dim_list).repeat(1,2)

    if CUDA:
        im_dim_list = im_dim_list.cuda()

    leftover = 0

    if (len(im_dim_list) % batch_size):
        leftover = 1

    if batch_size != 1:
        num_batches = len(imlist) // batch_size + leftover
        im_batches = [torch.cat((im_batches[i*batch_size : min((i +  1)*batch_size,
                            len(im_batches))]))  for i in range(num_batches)]

    i = 0

    write = False
    model(get_test_input(inp_dim, CUDA), CUDA)

    start_det_loop = time.time()

    objs = {}

    for batch in im_batches:
        #load the image
        start = time.time()
        if CUDA:
            batch = batch.cuda()

        with torch.no_grad():
            prediction = model(Variable(batch), CUDA)
            # prediction here
            print ("prediction", prediction)

        prediction = write_results(prediction, confidence, num_classes, nms = True, nms_conf = nms_thesh)
        # prediction here
        print ("prediction2", prediction)

        if type(prediction) == int:
            i += 1
            continue

        end = time.time()

        prediction[:,0] += i*batch_size

        if not write:
            output = prediction
            write = 1
        else:
            output = torch.cat((output,prediction))

The two print calls on prediction produce the following output.

prediction tensor([[[1.5383e+01, 1.2399e+01, 9.3864e+01, ..., 7.5703e-04,
9.0208e-04, 5.9246e-04],
[1.8194e+01, 1.4778e+01, 1.0411e+02, ..., 2.1265e-04,
1.1475e-03, 1.6560e-03],
[2.1265e+01, 1.2748e+01, 3.8478e+02, ..., 3.6203e-03,
7.6282e-03, 6.8394e-03],
...,
[4.1259e+02, 4.1129e+02, 3.3664e+00, ..., 2.8758e-05,
3.9763e-05, 2.3203e-05],
[4.1155e+02, 4.0989e+02, 7.5316e+00, ..., 1.7735e-04,
2.2018e-04, 2.0052e-04],
[4.1110e+02, 4.1259e+02, 5.2966e+01, ..., 9.5141e-05,
1.5668e-04, 2.1929e-04]]])

prediction2 tensor([[ 0.0000, 89.3013, 110.7477, 303.7198, 294.3178, 0.9951, 0.9997,
1.0000],
[ 0.0000, 256.5005, 98.3645, 373.2559, 144.1284, 0.9953, 0.9431,
7.0000],
[ 0.0000, 69.5096, 173.2218, 170.4211, 343.0221, 0.9997, 0.9882,
16.0000]])


1 answer

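This looks like Python code, so I assume you are using someone else's port rather than the official implementation?
The official implementation is written in C.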

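In the prediction tensor you posted (the 3-D one printed before write_results is called), I believe the fifth element of the last dimension, index 4, is the confidence.
So the following will at least print the confidence of every bounding box:
print(prediction[:,:,4].shape, prediction[:,:,4])
If you want the maximum of those values, use:
print(prediction[:,:,4].max())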
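To make the indexing above concrete, here is a minimal sketch. It assumes the last dimension is laid out as [center_x, center_y, width, height, objectness, class scores...], which matches the shape of the output you posted but which I have not confirmed against the code you are using. The tensor is random stand-in data, the box count 10647 assumes a 416x416 input, and the names objectness, class_scores, and mask are mine for illustration.

```python
import torch

# Stand-in for the raw model output: (batch, num_boxes, 5 + num_classes).
# Random values only; 10647 boxes assumes a 416x416 input.
torch.manual_seed(0)
num_classes = 80
prediction = torch.rand(1, 10647, 5 + num_classes)

# Assumed layout of the last dimension (not confirmed against the repo):
# [center_x, center_y, width, height, objectness, class_0 ... class_79]
objectness = prediction[:, :, 4]        # one value per box, shape (1, 10647)
class_scores = prediction[:, :, 5:]     # per-class scores, shape (1, 10647, 80)
best_score, best_class = class_scores.max(dim=2)

# Keep only the boxes whose objectness exceeds the threshold
# (0.5 is the default of --confidence in the excerpt).
confidence = 0.5
mask = objectness > confidence
kept_objectness = objectness[mask]
kept_classes = best_class[mask]

print(objectness.shape, kept_objectness.shape)
```

Note that this only filters by threshold. It does not apply non-maximum suppression, so many of the remaining boxes will overlap on the same object.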

As for whether that value is the confidence of a detected object, it would be better to trace the code before and after this function.
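For the tensor returned by write_results (your "prediction2" output), the rows look like one detection each. The sketch below reads them under the assumption that the columns are [image_index, x1, y1, x2, y2, objectness, class_score, class_index]. That layout is my guess from the printed values, so please check it against write_results. The helper summarize_detections is a hypothetical name, not part of the repository.

```python
def summarize_detections(rows, class_names=None):
    """Convert post-NMS rows into dictionaries.

    Assumed column layout (a guess from the printed output):
    [image_index, x1, y1, x2, y2, objectness, class_score, class_index]
    """
    results = []
    for row in rows:
        class_index = int(row[7])
        # Fall back to the numeric index when no name list is given
        label = class_names[class_index] if class_names else str(class_index)
        results.append({
            "image": int(row[0]),
            "box": tuple(row[1:5]),
            "objectness": row[5],
            "class_score": row[6],
            "label": label,
        })
    return results


# Rows copied from the "prediction2" output in the question
rows = [
    [0.0, 89.3013, 110.7477, 303.7198, 294.3178, 0.9951, 0.9997, 1.0],
    [0.0, 256.5005, 98.3645, 373.2559, 144.1284, 0.9953, 0.9431, 7.0],
    [0.0, 69.5096, 173.2218, 170.4211, 343.0221, 0.9997, 0.9882, 16.0],
]

detections = summarize_detections(rows)
for det in detections:
    print(det["label"], det["objectness"], det["class_score"])
```

With the real tensor you could pass prediction.tolist() as rows and the classes list from the excerpt as class_names. Do this after the type(prediction) == int check, because the excerpt treats an int return value as "no detections".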

Posted 2020/04/24 09:04