1

I'm trying to use Open3D for image recognition. I want to detect the block, then put a square around it (I have a ply with 524288 points). I'm using the Intel real sense d435 as the camera.

I turned the video (the images of the video) into a NumPy array. I get the interstices of the depth of frames, and create a Open3D image. I create a point cloud from the depth of the image, and flip the point cloud to align with the camera view. Then compute the AABB. Get the bounding box coordinates (I want a box around the block to check that the program is running correctly), then draw the box.

while True:
    # Wait for a coherent pair of frames: depth and color
    frames = pipeline.wait_for_frames()
    depth_frame = frames.get_depth_frame()
    color_frame = frames.get_color_frame()
    
    if not depth_frame or not color_frame:
        continue. 

    # Convert images to numpy arrays
    depth_image = np.asanyarray(depth_frame.get_data())
    color_image = np.asanyarray(color_frame.get_data())

    # Create an Open3D image from the depth image
    depth_o3d = o3d.geometry.Image(depth_image)
    color_o3d = o3d.geometry.Image(color_image)

    # Get the intrinsics of the depth frame
    intrinsics = depth_frame.profile.as_video_stream_profile().intrinsics
    pinhole_camera_intrinsic = o3d.camera.PinholeCameraIntrinsic(
        intrinsics.width, intrinsics.height, intrinsics.fx, intrinsics.fy, intrinsics.ppx, intrinsics.ppy
    )

    # Create a point cloud from the depth image
    rgbd_image = o3d.geometry.RGBDImage.create_from_color_and_depth(
        color_o3d, depth_o3d, convert_rgb_to_intensity=False
    )
    pcd = o3d.geometry.PointCloud.create_from_rgbd_image(
        rgbd_image, pinhole_camera_intrinsic
    )

    # Flip the point cloud to align with the camera view
    pcd.transform([[1, 0, 0, 0],
                   [0, -1, 0, 0],
                   [0, 0, -1, 0],
                   [0, 0, 0, 1]])

    # Compute the axis-aligned bounding box (AABB)
    aabb = pcd.get_axis_aligned_bounding_box()
    aabb.color = (1, 0, 0)  # Set the color of the bounding box to red

    # Get the bounding box coordinates
    min_bound = aabb.get_min_bound()
    max_bound = aabb.get_max_bound()

    # Convert the bounding box coordinates to pixel coordinates
    min_bound_pixel = rs.rs2_project_point_to_pixel(intrinsics, min_bound)
    max_bound_pixel = rs.rs2_project_point_to_pixel(intrinsics, max_bound)

    # Draw the bounding box on the color image
    cv2.rectangle(color_image, (int(min_bound_pixel[0]), int(min_bound_pixel[1])), 
                  (int(max_bound_pixel[0]), int(max_bound_pixel[1])), (0, 255, 0), 2)

    # Display the color image with the bounding box
    cv2.imshow("color window", color_image)

    # Break the loop if 'q' is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

But the code I have isn't detecting the block. How can I fix that?

2
  • What is a "stl"? Commented Feb 10 at 20:42
  • STL is a file format carrying triangle meshes. Commented Feb 10 at 21:27

0

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.