CN105339981B - Method for registering data using a set of primitives - Google Patents

Method for registering data using a set of primitives

Info

Publication number
CN105339981B
CN105339981B
Authority
CN
China
Prior art keywords
coordinate system
primitives
plane
camera
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201480034631.3A
Other languages
Chinese (zh)
Other versions
CN105339981A (en)
Inventor
Yuichi Taguchi
E. Ataer-Cansizoglu
S. Ramalingam
T. W. Garaas
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mitsubishi Electric Corp
Original Assignee
Mitsubishi Electric Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US13/921,296 (US9420265B2)
Application filed by Mitsubishi Electric Corp
Publication of CN105339981A
Application granted
Publication of CN105339981B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30244 Camera pose

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Processing Or Creating Images (AREA)
  • Studio Devices (AREA)

Abstract

A method registers data using a set of primitives that includes points and planes. First, the method selects a first set of primitives from data in a first coordinate system, where the first set includes at least three primitives, at least one of which is a plane. A transformation from the first coordinate system to a second coordinate system is predicted. The first set of primitives is transformed to the second coordinate system using the transformation. A second set of primitives is determined from the first set of primitives transformed to the second coordinate system. Then, the second coordinate system is registered with the first coordinate system using the first set of primitives in the first coordinate system and the second set of primitives in the second coordinate system. The registration can be used to track the pose of the camera that acquires the data.

Description

Method for registering data using a set of primitives

Technical Field

The present invention relates generally to computer vision and, more particularly, to estimating the pose of a camera.

Background Art

Systems and methods that track the pose of a camera while reconstructing the 3D structure of a scene are widely used in augmented reality (AR) visualization, robot navigation, scene modeling, and other computer vision applications. Such processing is commonly referred to as simultaneous localization and mapping (SLAM). A real-time SLAM system can use a conventional camera that acquires two-dimensional (2D) images, a depth camera that acquires three-dimensional (3D) point clouds (sets of 3D points), or a red, green, blue, and depth (RGB-D) camera, such as Kinect®, that acquires both 2D images and 3D point clouds. Tracking refers to the process of using the predicted motion of the camera to sequentially estimate the pose of the camera, while relocalization refers to the process of using feature-based global registration to recover from tracking failures.

SLAM systems using 2D cameras are usually successful for textured scenes, but are likely to fail in textureless regions. Systems using depth cameras rely on geometric variations in the scene, such as curved surfaces and depth boundaries, with the help of iterative closest point (ICP) methods. However, ICP-based systems often fail when the geometric variation is small, such as in planar scenes. Systems using RGB-D cameras can exploit both texture and geometric features, but they still require distinctive textures.

Many methods do not explicitly address the difficulty of reconstructing 3D models larger than a single room. To extend those methods to larger scenes, better memory management techniques are required. However, memory limitation is not the only challenge. Typically, room-scale scenes include many objects with textured and geometric features. To scale to larger scenes, camera poses need to be tracked in regions with limited texture and insufficient geometric variation, such as corridors.

Camera Tracking

Given some 3D correspondences, systems that acquire 3D point clouds using 3D sensors cast the tracking problem as a registration problem. ICP methods iteratively locate point-to-point or point-to-plane correspondences, starting from an initial pose estimate given by camera motion prediction. ICP has been widely used with line-scan 3D sensors in mobile robotics (also known as scan matching), and with depth cameras and 3D sensors that produce full 3D point clouds. U.S. 2012/0194516 uses point-to-plane correspondences in an ICP method for camera pose tracking. Its representation of the map is a set of voxels, where each voxel represents a truncated signed distance function for the distance to the closest surface point. That method does not extract planes from the 3D point clouds; instead, point-to-plane correspondences are established by determining the normals of 3D points from local neighborhoods. Such ICP-based methods require scenes with sufficient geometric variation for accurate registration.

Another method extracts features from RGB images and performs descriptor-based point matching to determine point-to-point correspondences and estimate the camera pose, which is then refined with an ICP method. That method uses both texture (RGB) and geometric (depth) features in the scene. However, using only point features remains problematic for textureless regions and regions with repetitive textures.

SLAM Using Planes

Plane features have been used in many SLAM systems. To determine the camera pose, at least three planes whose normals span R³ are required. Therefore, using only planes leads to many degeneracy issues, especially when the field of view (FOV) or the range of the sensor is small, such as with Kinect®. A combination of a large-FOV line-scan 3D sensor and a small-FOV depth camera can avoid the degeneracy, at additional system cost.

The method described in the related application uses point-plane SLAM, which uses both points and planes to avoid the failure modes common in methods that use only one of these primitives. That system does not use any camera motion prediction. Instead, the system performs relocalization for every frame by locating point and plane correspondences globally. As a result, the system can only process about three frames per second, and encounters failures with some repetitive textures due to descriptor-based point matching.

The method described in the related application also registers 3D data in different coordinate systems using both point-to-point and plane-to-plane correspondences.

Summary of the Invention

Planes are dominant in indoor and outdoor scenes that include man-made structures. Embodiments of the invention provide a system and method for tracking an RGB-D camera that use both points and planes as primitive features. By fitting planes, the method implicitly handles the noise in depth data that is characteristic of 3D sensors. The tracking method is supported by relocalization and bundle adjustment processes to provide a real-time simultaneous localization and mapping (SLAM) system using a hand-held or robot-mounted RGB-D camera.

An object of the invention is to enable fast and accurate registration while minimizing the degeneracy issues that cause registration failures. The method locates point and plane correspondences using camera motion prediction, and provides a tracker based on a prediction-and-correction framework. The method incorporates relocalization and bundle adjustment processes, both using points and planes, to recover from tracking failures and to continuously refine camera pose estimates.

Specifically, a method registers data using a set of primitives that includes points and planes. First, the method selects a first set of primitives from data in a first coordinate system, where the first set of primitives includes at least three primitives, at least one of which is a plane.

A transformation from the first coordinate system to a second coordinate system is predicted. The first set of primitives is transformed to the second coordinate system using the transformation. A second set of primitives is determined from the first set of primitives transformed to the second coordinate system.

Then, the second coordinate system is registered with the first coordinate system using the first set of primitives in the first coordinate system and the second set of primitives in the second coordinate system. The registration can be used to track the pose of the camera that acquires the data.
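To make these steps concrete, the following NumPy sketch (not from the patent) walks through them on synthetic data using point primitives only; the helper kabsch() is a standard closed-form point solver standing in for the mixed point-and-plane solver of the related application, and all names and values are illustrative assumptions.

```python
import numpy as np

def kabsch(P, Q):
    """Closed-form rigid registration of corresponding 3D point sets
    (Kabsch/Umeyama); an illustrative stand-in for the patent's
    closed-form mixed point/plane solver."""
    cP, cQ = P.mean(0), Q.mean(0)
    U, _, Vt = np.linalg.svd((P - cP).T @ (Q - cQ))
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ S @ U.T
    return R, cQ - R @ cP

rng = np.random.default_rng(1)
first_set = rng.uniform(-1.0, 1.0, (3, 3))     # select three point primitives

# Predict a transformation to the second coordinate system (here, a guess
# that the camera slid 50 mm along x).
R_pred, t_pred = np.eye(3), np.array([0.05, 0.0, 0.0])
predicted = first_set @ R_pred.T + t_pred      # transform the first set

# Determine the second set: in the method, measurements are searched for
# near `predicted`; here we simulate them with the true motion plus noise.
a = np.deg2rad(3.0)
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0,        0.0,       1.0]])
t_true = np.array([0.06, -0.01, 0.0])
second_set = first_set @ R_true.T + t_true + 1e-3 * rng.standard_normal((3, 3))
print("search offsets:", np.linalg.norm(second_set - predicted, axis=1))

# Register the second coordinate system with the first.
R, t = kabsch(first_set, second_set)
print(np.round(R, 3), np.round(t, 3))
```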

Brief Description of the Drawings

FIG. 1 is a flowchart of a method for tracking the pose of a camera according to embodiments of the invention; and

FIG. 2 is a schematic of a procedure for establishing point-to-point and plane-to-plane correspondences between a current frame and a map using a predicted pose of the camera according to embodiments of the invention.

Detailed Description

Embodiments of the invention provide a system and method for tracking the pose of a camera. The invention extends the embodiments described in the related U.S. Application Ser. No. 13/539,060 by using camera motion prediction for faster correspondence search and registration. We use point-to-point and plane-to-plane correspondences established between the current frame and a map. The map includes points and planes from frames previously registered in a global coordinate system. Here, we focus on using camera motion prediction to establish plane-to-plane correspondences, as well as the hybrid case of establishing both point-to-point and plane-to-plane correspondences.

System Overview

In the preferred system, the RGB-D camera 102 is a Kinect® or an Xtion PRO LIVE, which acquires a sequence of frames 101. We use a keyframe-based SLAM system: we select several representative frames as keyframes and store the keyframes, registered in a single global coordinate system, in a map. In contrast to prior-art SLAM systems that use only points, we use both points and planes as primitives in all processing in the system. The points and planes in each frame are called measurements, and measurements from keyframes are stored in the map as landmarks.

Given the map, we estimate the pose of the current frame using a prediction-and-correction framework: we predict the pose of the camera, use the predicted pose to determine correspondences between point and plane measurements and point and plane landmarks, and then use the correspondences to determine the camera pose.

Tracking may fail due to incorrect or insufficient correspondences. After a predetermined number of consecutive tracking failures, we relocalize, using a global point and plane correspondence search between the current frame and the map. We also apply bundle adjustment using points and planes to asynchronously refine the landmarks in the map.

Method Overview

As shown in FIG. 1, a current frame 101 of a scene 103 is acquired 110 by a red, green, blue, and depth (RGB-D) camera 102. The pose of the camera at the time the frame is acquired is predicted, and the predicted pose is used to locate 130 point and plane correspondences between the frame and a map 194. The point and plane correspondences are used in a random sample consensus (RANSAC) framework 140 to register the frame to the map. If the registration fails 150, the number of consecutive failures is counted; if 154 is false (F), the method continues with the next frame; otherwise, if 154 is true (T), the camera is relocalized 158 using a global registration method that does not use camera motion prediction.

If the RANSAC registration succeeds, the pose 160 estimated in the RANSAC framework is used as the pose of the frame. Next, it is determined 170 whether the current frame is a keyframe; if false, the method continues with the next frame at step 110. Otherwise, additional points and planes are extracted 180 from the current frame, the map 194 is updated 190, and the method continues with the next frame. The map is asynchronously refined 198 using bundle adjustment.

These steps can be performed in a processor connected to memory and input/output interfaces as known in the art.

Camera Pose Tracking

As noted above, our tracking uses features that include both points and planes. The tracking is based on a prediction-and-correction scheme, which can be summarized as follows. For each frame, we predict the pose using a camera motion model. Based on the predicted pose, we locate point and plane measurements in the frame that correspond to point and plane landmarks in the map. We perform RANSAC-based registration using the point and plane correspondences. If the pose is different from the pose of any keyframe currently stored in the map, we extract additional point and plane measurements and add the frame to the map as a new keyframe.

Camera Motion Prediction

We denote the pose of the k-th frame as

T_k = [ R_k  t_k ]
      [ 0^T   1  ],

where R_k and t_k denote the rotation matrix and translation vector, respectively. We use the first frame to define the coordinate system of the map; thus, T_1 is the identity matrix, and T_k represents the pose of the k-th frame relative to the map.

We predict the pose of the k-th frame, T̂_k, using a constant velocity assumption. Let ΔT denote the previously estimated motion between the (k-1)-th frame and the (k-2)-th frame, i.e., ΔT = T_(k-1) T_(k-2)^(-1). Then, we predict the pose of the k-th frame as T̂_k = ΔT T_(k-1).
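As a minimal illustration (not from the patent), the constant-velocity prediction amounts to two matrix products on 4×4 homogeneous poses; the function names and example motion below are assumptions.

```python
import numpy as np

def make_pose(R, t):
    """Assemble the 4x4 pose T_k = [R_k t_k; 0^T 1]."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def predict_pose(T_km1, T_km2):
    """Constant-velocity prediction: delta_T = T_(k-1) T_(k-2)^(-1),
    then T_hat_k = delta_T T_(k-1)."""
    delta_T = T_km1 @ np.linalg.inv(T_km2)
    return delta_T @ T_km1

# Example: per frame, the camera rotates 2 degrees about z and moves 10 mm along x.
a = np.deg2rad(2.0)
Rz = np.array([[np.cos(a), -np.sin(a), 0.0],
               [np.sin(a),  np.cos(a), 0.0],
               [0.0,        0.0,       1.0]])
T1 = make_pose(np.eye(3), np.zeros(3))          # first frame defines the map frame
T2 = make_pose(Rz, np.array([0.01, 0.0, 0.0]))  # pose of frame k-1
print(np.round(predict_pose(T2, T1), 3))        # predicted pose of frame k
```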

Locating Point and Plane Correspondences

As shown in FIG. 2, we use the predicted pose to locate the point and plane measurements in the k-th frame that correspond to landmarks in the map. Given the predicted pose 201 of the current frame, we locate correspondences between the point and plane landmarks in the map 202 and the point and plane measurements in the current frame 203. We first transform the landmarks in the map to the current frame using the predicted pose. Then, for each point, we perform a local search using an optical flow procedure, starting from the predicted pixel location in the current frame. For each plane, we first locate the parameters of the predicted plane. We then consider a set of reference points on the predicted plane and locate the pixels connected to each reference point that lie on the predicted plane. The reference point with the largest number of connected pixels is selected, and all of the connected pixels are used to refine the plane parameters.

Point correspondences: Let p_i = (x_i, y_i, z_i, 1)^T denote the i-th point landmark 210 in the map, represented as a homogeneous vector. The 2D image projection 220 of p_i in the current frame is predicted as

û_i = FP(T̂_k p_i),

where T̂_k p_i is the 3D point transformed to the coordinate system of the k-th frame, and the function FP(·) determines the forward projection of a 3D point onto the image plane using the internal camera calibration parameters. We locate the corresponding point measurement using the Lucas-Kanade optical flow method, starting from the initial position û_i. Let v_i be the determined optical flow vector 230. Then, the corresponding point measurement is

p_i^m = D(û_i + v_i) BP(û_i + v_i),

where the function BP(·) back-projects a 2D image pixel to a 3D ray, and D(·) refers to the depth value of the pixel. If the optical flow vector cannot be determined, or if the pixel position û_i + v_i has an invalid depth value, the feature is considered lost.
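The following sketch shows one way FP(·), BP(·), and D(·) could look for a pinhole camera; the intrinsic values are assumed (typical RGB-D defaults), and the optical-flow computation itself is omitted — the flow vector is simply given.

```python
import numpy as np

K = np.array([[525.0,   0.0, 319.5],    # assumed pinhole intrinsics
              [  0.0, 525.0, 239.5],
              [  0.0,   0.0,   1.0]])

def FP(p_cam):
    """Forward-project a 3D point (camera frame) to 2D pixel coordinates."""
    uvw = K @ p_cam[:3]
    return uvw[:2] / uvw[2]

def BP(u):
    """Back-project a 2D pixel to a 3D ray with unit depth (z = 1)."""
    return np.linalg.inv(K) @ np.array([u[0], u[1], 1.0])

def point_measurement(u_hat, v, depth_map):
    """p^m = D(u_hat + v) BP(u_hat + v); returns None if the feature is lost."""
    u = np.round(u_hat + v).astype(int)
    h, w = depth_map.shape
    if not (0 <= u[0] < w and 0 <= u[1] < h):
        return None
    d = depth_map[u[1], u[0]]
    if not np.isfinite(d) or d <= 0:
        return None                      # invalid depth value
    return d * BP(u)

# Example: a landmark 1.5 m in front of the predicted camera pose.
p_cam = np.array([0.2, -0.1, 1.5, 1.0])  # T_hat_k p_i, homogeneous
u_hat = FP(p_cam)                        # seed for the optical-flow search
depth = np.full((480, 640), 1.5)         # synthetic depth image
print(point_measurement(u_hat, np.array([1.2, -0.8]), depth))
```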

Plane correspondences: Instead of performing a time-consuming plane extraction procedure independently for each frame, as in the prior art, we use the predicted pose to extract the planes. This yields faster plane measurement extraction and also provides the plane correspondences.

Let π_j = (a_j, b_j, c_j, d_j)^T denote the plane equation of the j-th plane landmark 240 in the map. We assume that the plane landmark and the corresponding measurement have some overlapping region in the image. To locate such a corresponding plane measurement, we randomly select several reference points 250, q_j,r (r = 1, ..., N), from the inliers of the j-th plane landmark, and transform the reference points to the k-th frame as 255

q_j,r^k = T̂_k q_j,r.

We also transform π_j to the k-th frame as 245

π_j^k = (T̂_k)^(-T) π_j.

We locate the connected pixels 260 lying on the plane π_j^k, starting from each transformed reference point, and select the reference point with the largest number of inliers. These inliers are used to refine the plane equation, yielding the corresponding plane measurement. If the number of inliers is less than a threshold, the plane landmark is declared lost. For example, we use N = 5 reference points, a threshold of 50 mm on the point-to-plane distance for determining the inliers on the plane, and a threshold of 9000 on the minimum number of inliers.
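A plane π transforms by the inverse transpose of the pose, since π^T x = 0 must be preserved under x' = T x. The sketch below (assumed helper names; the 50 mm threshold is from the text) transforms a plane and counts inliers.

```python
import numpy as np

def transform_plane(pi, T):
    """pi^T x = 0 and x' = T x imply pi' = T^(-T) pi."""
    return np.linalg.inv(T).T @ pi

def plane_inliers(points_h, pi, dist_thresh=0.05):
    """Boolean mask of homogeneous points (N, 4) within dist_thresh meters
    (i.e., 50 mm) point-to-plane distance."""
    s = np.linalg.norm(pi[:3])
    dist = np.abs(points_h[:, :3] @ (pi[:3] / s) + pi[3] / s)
    return dist < dist_thresh

# Example: the plane z = 2 observed after the camera moves 100 mm along z.
pi = np.array([0.0, 0.0, 1.0, -2.0])
T = np.eye(4); T[:3, 3] = [0.0, 0.0, 0.1]
pi_k = transform_plane(pi, T)                     # becomes z = 2.1 in the new frame
pts = np.column_stack([np.random.rand(10000, 2),
                       np.full(10000, 2.1),
                       np.ones(10000)])
print(pi_k, int(plane_inliers(pts, pi_k).sum()))  # all 10000 points are inliers
```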

Landmark Selection

Performing the above processing using all of the landmarks in the map can be inefficient. Therefore, we use the landmarks appearing in the single keyframe that is closest to the current frame. The closest keyframe is selected before the tracking process using the pose T_(k-1) of the previous frame.
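The patent does not specify the distance metric for "closest"; the sketch below assumes translation distance between the previous frame's pose and each keyframe pose.

```python
import numpy as np

def closest_keyframe(T_prev, keyframe_poses):
    """Index of the keyframe whose translation is nearest to T_(k-1);
    the choice of metric is an assumption."""
    d = [np.linalg.norm(T_prev[:3, 3] - T[:3, 3]) for T in keyframe_poses]
    return int(np.argmin(d))

kf0, kf1 = np.eye(4), np.eye(4).copy()
kf1[:3, 3] = [1.0, 0.0, 0.0]
T_prev = np.eye(4); T_prev[:3, 3] = [0.9, 0.0, 0.0]
print(closest_keyframe(T_prev, [kf0, kf1]))   # -> 1
```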

RANSAC Registration

The prediction-based correspondence search provides candidates of point-to-point and plane-to-plane correspondences, which may include outliers. We therefore perform RANSAC-based registration to determine the inliers and the camera pose. To determine the pose unambiguously, we need at least three correspondences. Thus, if there are fewer than three candidate correspondences, we immediately declare a tracking failure. For accurate camera tracking, we also declare a tracking failure when there are only a small number of candidate correspondences.

If there is a sufficient number of candidates, we solve the registration problem using the closed-form solution for mixed correspondences. The procedure prioritizes plane correspondences over point correspondences, because the number of planes is usually much smaller than the number of points, and because planes are less noisy due to the support from many points. Tracking is regarded as successful if RANSAC locates a sufficient number of inliers, e.g., 40% of the number of all point and plane measurements. The procedure yields the corrected pose T_k of the k-th frame.
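As an illustration of the RANSAC stage, the sketch below samples minimal three-correspondence sets and accepts a pose once at least 40% of the measurements are inliers, as in the example threshold above. It handles the point-only case with a closed-form Kabsch solve; the plane-prioritized mixed solver described in the related application is not reproduced here, and the other thresholds are assumptions.

```python
import numpy as np

def kabsch(P, Q):
    """Closed-form rigid transform (R, t) with Q ~ R P + t for (N, 3) point sets."""
    cP, cQ = P.mean(0), Q.mean(0)
    U, _, Vt = np.linalg.svd((P - cP).T @ (Q - cQ))
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ S @ U.T
    return R, cQ - R @ cP

def ransac_register(P, Q, iters=100, tol=0.03, min_inlier_ratio=0.4, seed=0):
    """Minimal-sample RANSAC over candidate correspondences; returns None on
    tracking failure."""
    rng = np.random.default_rng(seed)
    best_inl = None
    for _ in range(iters):
        idx = rng.choice(len(P), 3, replace=False)   # minimal set of 3
        R, t = kabsch(P[idx], Q[idx])
        inl = np.linalg.norm(P @ R.T + t - Q, axis=1) < tol
        if best_inl is None or inl.sum() > best_inl.sum():
            best_inl = inl
    if best_inl.sum() < min_inlier_ratio * len(P):
        return None                                  # tracking failure
    return kabsch(P[best_inl], Q[best_inl])          # refit on all inliers

# Example: 20 candidate correspondences, 5 of which are gross outliers.
rng = np.random.default_rng(2)
P = rng.uniform(-1.0, 1.0, (20, 3))
a = np.deg2rad(10.0)
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0,        0.0,       1.0]])
Q = P @ R_true.T + np.array([0.1, 0.0, 0.0])
Q[:5] += rng.uniform(0.5, 1.0, (5, 3))
print(ransac_register(P, Q))
```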

Map Update

We determine the k-th frame to be a keyframe if the estimated pose T_k is sufficiently different from the poses of all existing keyframes in the map. To check this condition, we can use, e.g., thresholds of 100 mm in translation and 5° in rotation. For a new keyframe, the point and plane measurements located as inliers in the RANSAC-based registration are associated with the corresponding landmarks, while those located as outliers are discarded. We then extract additional point and plane measurements that newly appear in the frame. Additional point measurements are extracted on pixels that are not close to any existing point measurements, using keypoint detectors such as the scale-invariant feature transform (SIFT) and speeded-up robust features (SURF). Additional plane measurements are extracted by applying RANSAC-based plane fitting on pixels that are not inliers of any existing plane measurements. The additional point and plane measurements are added to the map as new landmarks. In addition, we extract feature descriptors (such as SIFT and SURF) for all point measurements in the frame for use in relocalization.
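A sketch of the keyframe test with the example thresholds (100 mm translation, 5° rotation); how the two thresholds combine is an assumption, since the text only says "sufficiently different".

```python
import numpy as np

def is_new_keyframe(T_k, keyframe_poses, t_thresh=0.1, r_thresh=np.deg2rad(5.0)):
    """Frame k becomes a keyframe unless some existing keyframe is within
    both 100 mm (0.1 m) translation and 5 degrees rotation of T_k."""
    for T in keyframe_poses:
        dt = np.linalg.norm(T_k[:3, 3] - T[:3, 3])
        dR = T_k[:3, :3].T @ T[:3, :3]
        da = np.arccos(np.clip((np.trace(dR) - 1.0) / 2.0, -1.0, 1.0))
        if dt <= t_thresh and da <= r_thresh:
            return False
    return True

kfs = [np.eye(4)]
T_k = np.eye(4); T_k[:3, 3] = [0.12, 0.0, 0.0]   # 120 mm from the only keyframe
print(is_new_keyframe(T_k, kfs))                  # -> True
```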

Claims (14)

1. A method for registering data using a set of primitives, wherein the data has three dimensions (3D) and both points and planes serve as the primitives, the method comprising the steps of:
selecting a first set of primitives from the data in a first coordinate system, wherein the first set of primitives includes at least three primitives, at least one of which is a plane;
predicting a transformation from the first coordinate system to a second coordinate system, wherein the transformation is predicted using a camera motion model;
transforming the first set of primitives to the second coordinate system using the predicted transformation;
determining a second set of primitives from the first set of primitives transformed to the second coordinate system; and
registering the second coordinate system with the first coordinate system using the mutually corresponding first set of primitives in the first coordinate system and second set of primitives in the second coordinate system, wherein the registration is used for simultaneous localization and mapping (SLAM), and wherein the above steps are performed in a processor.

2. The method of claim 1, wherein, among the at least three primitives of the first set, at least one primitive is a point in the first coordinate system and at least one primitive is a plane in the first coordinate system, and, among the at least three primitives of the second set, at least one primitive is a point in the second coordinate system and at least one primitive is a plane in the second coordinate system.

3. The method of claim 1, wherein the data are acquired by a movable camera.

4. The method of claim 1, wherein the data include texture and depth.

5. The method of claim 1, wherein the registering uses random sample consensus (RANSAC).

6. The method of claim 1, wherein the data are in the form of a sequence of frames acquired by a camera.

7. The method of claim 6, further comprising:
selecting a set of frames from the sequence of frames as keyframes; and
storing the keyframes in a map, wherein the keyframes include the points and the planes, and the points and the planes are stored in the map as landmarks.

8. The method of claim 7, further comprising:
predicting a pose of the camera for each frame; and
determining the pose of the camera for each frame from the registering, so as to track the camera.

9. The method of claim 1, wherein the registering is in real time.

10. The method of claim 7, further comprising:
applying bundle adjustment using the points and the planes to refine the landmarks in the map.

11. The method of claim 8, wherein the pose of the k-th frame is

T_k = [ R_k  t_k ]
      [ 0^T   1  ],

where R_k and t_k denote the rotation matrix and translation vector, respectively.

12. The method of claim 8, wherein the predicting uses a constant velocity assumption.

13. The method of claim 6, wherein the points in the frames are located using an optical flow procedure.

14. The method of claim 1, wherein plane correspondences are prioritized over point correspondences.
CN201480034631.3A 2013-06-19 2014-05-30 Method for registering data using a set of primitives Active CN105339981B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US13/921,296 US9420265B2 (en) 2012-06-29 2013-06-19 Tracking poses of 3D camera using points and planes
US13/921,296 2013-06-19
PCT/JP2014/065026 WO2014203743A1 (en) 2013-06-19 2014-05-30 Method for registering data using set of primitives

Publications (2)

Publication Number Publication Date
CN105339981A CN105339981A (en) 2016-02-17
CN105339981B true CN105339981B (en) 2019-04-12

Family

ID=50979838

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201480034631.3A Active CN105339981B (en) 2013-06-19 2014-05-30 Method for registering data using a set of primitives

Country Status (4)

Country Link
JP (1) JP6228239B2 (en)
CN (1) CN105339981B (en)
DE (1) DE112014002943T5 (en)
WO (1) WO2014203743A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6775969B2 (en) * 2016-02-29 2020-10-28 キヤノン株式会社 Information processing equipment, information processing methods, and programs
CA3032812A1 (en) 2016-08-04 2018-02-08 Reification Inc. Methods for simultaneous localization and mapping (slam) and related apparatus and systems
CN106780601B (en) * 2016-12-01 2020-03-27 北京未动科技有限公司 Spatial position tracking method and device and intelligent equipment
EP3333538B1 (en) * 2016-12-07 2020-09-09 Hexagon Technology Center GmbH Scanner vis

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009237845A (en) * 2008-03-27 2009-10-15 Sony Corp Information processor, information processing method, and computer program
JP2010288112A (en) * 2009-06-12 2010-12-24 Nissan Motor Co Ltd Self-position estimation apparatus and self-position estimation method
CN102609942A (en) * 2011-01-31 2012-07-25 微软公司 Mobile camera localization using depth maps
CN103123727A (en) * 2011-11-21 2013-05-29 联想(北京)有限公司 Method and device for simultaneous positioning and map building

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5580164B2 (en) * 2010-10-18 2014-08-27 株式会社トプコン Optical information processing apparatus, optical information processing method, optical information processing system, and optical information processing program

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009237845A (en) * 2008-03-27 2009-10-15 Sony Corp Information processor, information processing method, and computer program
JP2010288112A (en) * 2009-06-12 2010-12-24 Nissan Motor Co Ltd Self-position estimation apparatus and self-position estimation method
CN102609942A (en) * 2011-01-31 2012-07-25 微软公司 Mobile camera localization using depth maps
CN103123727A (en) * 2011-11-21 2013-05-29 联想(北京)有限公司 Method and device for simultaneous positioning and map building

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
3D SLAM using planar segments; Jan Weingarten et al.; IEEE/RSJ International Conference on Intelligent Robots and Systems; 2006-10-01
Camera tracking for augmented reality media; Bolan Jiang et al.; Multimedia and Expo, 2000 (ICME 2000), IEEE International Conference on, New York; 2000-07-30
MonoSLAM: Real-Time Single Camera SLAM; Andrew J. Davison et al.; IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 29, No. 6; 2007-06-01
Point-plane SLAM for hand-held 3D sensors; Yuichi Taguchi et al.; 2013 IEEE International Conference on Robotics and Automation; 2013-05-06
RGB-D camera-based parallel tracking and meshing; Sebastian Lieberknecht et al.; Mixed and Augmented Reality (ISMAR), 2011 10th IEEE International Symposium on; 2011-10-26
RGB-D mapping: Using Kinect-style depth cameras for dense 3D modeling of indoor environments; P. Henry et al.; The International Journal of Robotics Research, Vol. 31, No. 5; 2012-02-10

Also Published As

Publication number Publication date
JP6228239B2 (en) 2017-11-08
CN105339981A (en) 2016-02-17
JP2016527574A (en) 2016-09-08
WO2014203743A1 (en) 2014-12-24
DE112014002943T5 (en) 2016-03-10

Similar Documents

Publication Publication Date Title
US9420265B2 (en) Tracking poses of 3D camera using points and planes
JP7173772B2 (en) Video processing method and apparatus using depth value estimation
CN102763132B (en) Three-dimensional measurement apparatus and processing method
CN103988226B (en) Method for estimating camera motion and for determining 3D model of reality
Ataer-Cansizoglu et al. Tracking an RGB-D camera using points and planes
Herrera et al. Dt-slam: Deferred triangulation for robust slam
CN110702111A (en) Simultaneous localization and map creation (SLAM) using dual event cameras
Vidas et al. Real-time mobile 3D temperature mapping
KR20180087947A (en) Modeling method and modeling apparatus using 3d point cloud
JP2018523881A (en) Method and system for aligning data
Ataer-Cansizoglu et al. Calibration of non-overlapping cameras using an external SLAM system
JP6922348B2 (en) Information processing equipment, methods, and programs
CN108416385A (en) Simultaneous localization and mapping method based on an improved image matching strategy
Tomono 3-D localization and mapping using a single camera based on structure-from-motion with automatic baseline selection
JP2019032218A (en) Position information recording method and apparatus
CN105339981B (en) Method for registering data using a set of primitives
CN120266160A (en) Neural network based positioning
CN110310325B (en) Virtual measurement method, electronic device and computer readable storage medium
JP2006113832A (en) Stereo image processing apparatus and program
Lui et al. An Iterative 5-pt Algorithm for Fast and Robust Essential Matrix Estimation.
Fu et al. FSVO: Semi-direct monocular visual odometry using fixed maps
KR101896183B1 (en) 3-d straight lines detection method for camera motion estimation
Mair et al. Efficient camera-based pose estimation for real-time applications
Laskar et al. Robust loop closures for scene reconstruction by combining odometry and visual correspondences
Chien et al. Regularised energy model for robust monocular ego-motion estimation.

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant