Real-time visual tracking of 3D objects with dynamic handling of occlusion

Position-based visual servoing requires estimating and tracking the three-dimensional position and orientation of a 3D target object from camera images. This paper describes a novel approach to the problem that consists of two steps. First, a set of spatial pose constraints is derived from image features, by means of which the 3D object pose is calculated with an efficient model-fitting algorithm. Kalman filtering is then used to estimate object velocity and acceleration. Compared to previous approaches that use Kalman filters to estimate the object state directly from image features, the proposed method has several advantages: computation time is only O(n) rather than O(n^3), where n is the number of image features considered; sensor fusion is simplified; and temporal estimation is decoupled from the choice of image features. The last point is of particular importance if occlusions that may occur during tracking are to be predicted and dynamically handled. With the proposed tracking method, a robot could be precisely controlled with respect to static objects and could robustly follow targets moving in 6 degrees of freedom, while occlusions were continuously predicted and appropriate features automatically selected at video rate (25 Hz). High robustness is obtained by Hough transform-based feature extraction.
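To make the two-step structure concrete, the following is a minimal sketch, not the authors' implementation: it assumes the model-fitting step has already produced a 6-DOF pose measurement per frame, and applies a constant-acceleration Kalman filter independently to each pose component to recover velocity and acceleration. Treating the rotation parameters as independent scalars, the specific noise values, and the class and function names are all illustrative assumptions; they are only meant to show why the filter's cost is constant per frame and independent of the number of image features.

import numpy as np

DT = 1.0 / 25.0  # video rate (25 Hz), as reported in the paper

# Constant-acceleration model for one scalar pose component: x = [p, v, a]
F = np.array([[1.0, DT, 0.5 * DT**2],
              [0.0, 1.0, DT],
              [0.0, 0.0, 1.0]])
H = np.array([[1.0, 0.0, 0.0]])   # only the pose component itself is measured
Q = 1e-3 * np.eye(3)              # process noise covariance (assumed value)
R = np.array([[1e-2]])            # measurement noise covariance (assumed value)

class PoseComponentKF:
    """Kalman filter for one pose parameter (hypothetical helper class)."""

    def __init__(self):
        self.x = np.zeros((3, 1))  # estimated value, velocity, acceleration
        self.P = np.eye(3)

    def predict(self):
        self.x = F @ self.x
        self.P = F @ self.P @ F.T + Q

    def update(self, z):
        # z: scalar pose component delivered by the model-fitting step
        y = np.array([[z]]) - H @ self.x
        S = H @ self.P @ H.T + R
        K = self.P @ H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(3) - K @ H) @ self.P
        return self.x

# One filter per pose parameter (3 translations + 3 rotation parameters).
filters = [PoseComponentKF() for _ in range(6)]

def track(fitted_pose):
    """fitted_pose: length-6 vector from the model-fitting step (one frame)."""
    estimates = []
    for f, z in zip(filters, fitted_pose):
        f.predict()
        estimates.append(f.update(z).ravel())
    return np.array(estimates)    # rows: [value, velocity, acceleration]

Because the filter operates on the fixed-size fitted pose rather than on the image features themselves, its cost per frame does not grow with n; the predicted pose and velocity can also be projected back into the image to anticipate occlusions and select visible features, in the spirit of the dynamic occlusion handling described above.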