Background / what I want to achieve
I am using Ubuntu 20.04, Python 3.8, and VSCode.
I have code that estimates head pose; instead of opening the camera with OpenCV, I want to drive it with RealSense code.
Thank you in advance.
Problem / error message
Traceback (most recent call last):
  File "main_realsense.py", line 65, in <module>
    facebox = mark_detector.extract_cnn_facebox(frames)
  File "/head_pose_estimation/mark_detector.py", line 138, in extract_cnn_facebox
    _, raw_boxes = self.face_detector.get_faceboxes(
  File "/head_pose_estimation/mark_detector.py", line 22, in get_faceboxes
    rows, cols, _ = image.shape
AttributeError: 'tuple' object has no attribute 'shape'
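The error itself is easy to reproduce outside the project: get_faceboxes() does `rows, cols, _ = image.shape`, which works on a single NumPy image but not on a tuple of images. A minimal standalone sketch (the array sizes are arbitrary):

import numpy as np

frame = np.zeros((480, 640, 3), dtype=np.uint8)  # a single image: has .shape
rows, cols, _ = frame.shape                      # fine

frames = (frame, frame)                          # a (color, depth) tuple
rows, cols, _ = frames.shape                     # AttributeError: 'tuple' object has no attribute 'shape'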
Relevant source code
I changed the first code block into the second one.
if __name__ == '__main__':
    # Before estimation started, there are some startup works to do.

    # 1. Setup the video source from webcam or video file.
    video_src = args.cam if args.cam is not None else args.video
    if video_src is None:
        print("Video source not assigned, default webcam will be used.")
        video_src = "path"

    cap = cv2.VideoCapture(video_src)

    # Get the frame size. This will be used by the pose estimator.
    width = cap.get(cv2.CAP_PROP_FRAME_WIDTH)
    height = cap.get(cv2.CAP_PROP_FRAME_HEIGHT)

    # 2. Introduce a pose estimator to solve pose.
    pose_estimator = PoseEstimator(img_size=(height, width))

    # 3. Introduce a mark detector to detect landmarks.
    mark_detector = MarkDetector()

    # 4. Measure the performance with a tick meter.
    tm = cv2.TickMeter()

    # Now, let the frames flow.
    while True:

        # Read a frame.
        frame_got, frame = cap.read()
        if frame_got is False:
            break

        # If the frame comes from webcam, flip it so it looks like a mirror.
        if video_src == 0:
            frame = cv2.flip(frame, 2)

        # Step 1: Get a face from current frame.
        facebox = mark_detector.extract_cnn_facebox(frame)

        # Any face found?
        if facebox is not None:

            # Step 2: Detect landmarks. Crop and feed the face area into the
            # mark detector.
            x1, y1, x2, y2 = facebox
            face_img = frame[y1: y2, x1: x2]

            # Run the detection.
            tm.start()
            marks = mark_detector.detect_marks(face_img)
            tm.stop()

            # Convert the locations from local face area to the global image.
            marks *= (x2 - x1)
            marks[:, 0] += x1
            marks[:, 1] += y1

            # Try pose estimation with 68 points.
            pose = pose_estimator.solve_pose_by_68_points(marks)

            # All done. The best way to show the result would be drawing the
            # pose on the frame in realtime.

            # Do you want to see the pose annotation?
            pose_estimator.draw_annotation_box(
                frame, pose[0], pose[1], color=(0, 255, 0))

            # Do you want to see the head axes?
            pose_estimator.draw_axes(frame, pose[0], pose[1])

            # Do you want to see the marks?
            mark_detector.draw_marks(frame, marks, color=(0, 255, 0))

            # Do you want to see the facebox?
            mark_detector.draw_box(frame, [facebox])

        # Show preview.
        cv2.imshow("Preview", frame)
        if cv2.waitKey(1) == 27:
            break
if __name__ == '__main__':
    cap = RealsenseCapture()
    # Configure the stream properties.
    cap.WIDTH = 640
    cap.HEIGHT = 480
    cap.FPS = 30

    # 1. Setup the video source from webcam or video file.
    video_src = args.cam if args.cam is not None else args.video
    if video_src is None:
        print("Video source not assigned, default webcam will be used.")

    cap.start()

    # Get the frame size. This will be used by the pose estimator.
    # width = cap.get(cv2.CAP_PROP_FRAME_WIDTH)
    # height = cap.get(cv2.CAP_PROP_FRAME_HEIGHT)

    # 2. Introduce a pose estimator to solve pose.
    pose_estimator = PoseEstimator(img_size=(cap.HEIGHT, cap.WIDTH))

    # 3. Introduce a mark detector to detect landmarks.
    mark_detector = MarkDetector()

    # 4. Measure the performance with a tick meter.
    tm = cv2.TickMeter()

    # Now, let the frames flow.
    while True:

        # Read the frames. cap.read() returns (ret, (color, depth)),
        # so `frames` is a tuple of two images, not a single image.
        ret, frames = cap.read()
        color_frame = frames[0]
        depth_frame = frames[1]

        # Convert the depth frame to a heat map.
        depth_colormap = cv2.applyColorMap(cv2.convertScaleAbs(
            depth_frame, alpha=0.08), cv2.COLORMAP_JET)

        if ret is False:
            break

        # If the frame comes from webcam, flip it so it looks like a mirror.
        if video_src == 0:
            frames = cv2.flip(frames, 2)

        # Step 1: Get a face from the current frame.
        # This is the line that raises the AttributeError: `frames` is a
        # tuple, but extract_cnn_facebox() expects a single image array.
        facebox = mark_detector.extract_cnn_facebox(frames)

        # Any face found?
        if facebox is not None:

            # Step 2: Detect landmarks. Crop and feed the face area into the
            # mark detector.
            x1, y1, x2, y2 = facebox
            face_img = frames[y1: y2, x1: x2]

            # Run the detection.
            tm.start()
            marks = mark_detector.detect_marks(face_img)
            tm.stop()

            # Convert the locations from local face area to the global image.
            marks *= (x2 - x1)
            marks[:, 0] += x1
            marks[:, 1] += y1

            # Try pose estimation with 68 points.
            pose = pose_estimator.solve_pose_by_68_points(marks)

            # All done. The best way to show the result would be drawing the
            # pose on the frames in realtime.

            # Do you want to see the pose annotation?
            pose_estimator.draw_annotation_box(
                frames, pose[0], pose[1], color=(0, 255, 0))

            # Do you want to see the head axes?
            pose_estimator.draw_axes(frames, pose[0], pose[1])

            # Do you want to see the marks?
            mark_detector.draw_marks(frames, marks, color=(0, 255, 0))

            # Do you want to see the facebox?
            mark_detector.draw_box(frames, [facebox])

        # Rendering: put RGB and depth side by side.
        images = np.hstack((color_frame, depth_colormap))

        cv2.imshow('Preview', frames)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    # Stop streaming.
    cap.release()
    cv2.destroyAllWindows()
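The definition of RealsenseCapture is not shown in the question, but the `ret, frames = cap.read()` unpacking above implies a wrapper roughly like the sketch below. This is a reconstruction using the standard pyrealsense2 API, not the asker's actual class; the point is that read() hands back a tuple of two images rather than one image.

import numpy as np
import pyrealsense2 as rs

class RealsenseCapture:
    def __init__(self):
        # Default stream properties; overwritten by the caller above.
        self.WIDTH, self.HEIGHT, self.FPS = 640, 480, 30

    def start(self):
        # Open both a color stream and a depth stream on the device.
        self.pipeline = rs.pipeline()
        config = rs.config()
        config.enable_stream(rs.stream.color, self.WIDTH, self.HEIGHT,
                             rs.format.bgr8, self.FPS)
        config.enable_stream(rs.stream.depth, self.WIDTH, self.HEIGHT,
                             rs.format.z16, self.FPS)
        self.pipeline.start(config)

    def read(self):
        # Returns (ret, (color_image, depth_image)) -- a tuple of TWO arrays.
        frames = self.pipeline.wait_for_frames()
        color_frame = frames.get_color_frame()
        depth_frame = frames.get_depth_frame()
        if not color_frame or not depth_frame:
            return False, (None, None)
        color_image = np.asanyarray(color_frame.get_data())
        depth_image = np.asanyarray(depth_frame.get_data())
        return True, (color_image, depth_image)

    def release(self):
        self.pipeline.stop()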
Do you want to use the RealSense camera
1. through the OpenCV interface, as with cv2.VideoCapture() (i.e. treat it as a plain color USB camera), or
2. by extracting the color image frames from pyrealsense2 and converting them to OpenCV's format?
Which of the two is it?
Note: if you want to use the depth information, option 2 is the only choice.
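For reference, option 2 usually boils down to the sketch below: pull a frameset from the pyrealsense2 pipeline, take the color frame, and convert its buffer with np.asanyarray(), which yields a BGR NumPy array that OpenCV functions accept directly. The resolution and frame rate here are placeholders.

import cv2
import numpy as np
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
pipeline.start(config)

try:
    while True:
        frameset = pipeline.wait_for_frames()
        color_frame = frameset.get_color_frame()
        if not color_frame:
            continue
        # Convert the frame buffer to an OpenCV-style BGR image.
        color_image = np.asanyarray(color_frame.get_data())
        cv2.imshow('color', color_image)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
finally:
    pipeline.stop()
    cv2.destroyAllWindows()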
Thank you for your reply.
I would like to go with method 2.
Best regards.
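With method 2, the fix to the loop in the question then comes down to handing the detector the single color image instead of the (color, depth) tuple. A minimal sketch of the relevant lines, assuming cap.read() returns (ret, (color, depth)) as reconstructed above:

ret, frames = cap.read()
if ret is False:
    break
color_frame = frames[0]   # BGR NumPy array: OpenCV-compatible
depth_frame = frames[1]

# Detect on the single color image, not on the tuple.
facebox = mark_detector.extract_cnn_facebox(color_frame)

# Likewise, crop, draw, and display using color_frame, e.g.:
#   face_img = color_frame[y1:y2, x1:x2]
#   cv2.imshow('Preview', color_frame)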