
IEEE/CAA Journal of Automatica Sinica
Citation: | Yuzhen Liu, Ziyang Meng, Yao Zou and Ming Cao, "Visual Object Tracking and Servoing Control of a Nano-Scale Quadrotor: System, Algorithms, and Experiments," IEEE/CAA J. Autom. Sinica, vol. 8, no. 2, pp. 344-360, Feb. 2021. doi: 10.1109/JAS.2020.1003530 |
IN recent years, the study of UAVs has attracted increasing attention from both industrial and academic communities. Miniaturization is one of the major trends of the development of UAV technologies. In 2019, DJI-innovations① launched a small quadrotor named the “Mavic mini”, which is the smallest of their products. However, it still weighs
In addition to miniaturization, intellectualization is another major trend in the development of UAV technologies. In particular, in order to excute various tasks in many practical applications (e.g., surveillance, augmented reality, environmental monitoring, behavior modeling, rescue, and search), the capability of autonomous object tracking is indispensable [5]–[14]. For example, the authors of [13] apply an open-TLD (tracking-learning-detection) [15] tracker on a UAV platform, i.e., AR Drone 2.0, to accomplish the task of tracking non-artificial targets including people or moving cars. However, the TLD-based approach cannot achieve comparable performance to state-of-the-art tracking algorithms. Li et al. [14] realize a vision-based pedestrian tracking system on the UAV platform DJI Matrice100 by exploiting the well-known correlation filter-based tracker. In particular, a frequency domain correlation filter acts as the base tracker, and its online training model is further transformed to the spatial domain to obtain the re-detection model. Since the detection classifier depends on the accuracy of the base tracking model and the generic object proposal [16], this method still cannot efficiently deal with situations where there are variations in object appearance. The above research works focus on relatively large UAV platforms. In contrast, few research works involving visual object tracking use nano-scale UAV platforms due to their very limited volume size, payload ability and power consumption. Firstly, a majority of existing onboard localization and navigation sensors cannot be mounted on nano-scale UAVs, which include a GPS module, laser range finders and normal cameras. Moreover, most existing state-of-the-art visual object tracking solutions require a significant amount of computation resources, and thus, they are not available to be directly employed on nano-scale UAVs. Last but not least, the flight of the nano-scale UAVs is more susceptible to the external disturbances due to their light weight and small volume. Srisamosorn et al. [4] use a nano-scale quadrotor to realize person tracking with the use of multiple environmental cameras. In [5], Briod propose an ego-motion estimation algorithm for use with a
Typical visual object tracking approaches have been recently proposed in [11], [14], [15], and [17]–[24]. In particular, early template-based methods take advantage of selectively updating template and multiple key frames to find an optimal patch to describe object appearance [14]. Moreover, discriminative approaches [11], [17], [18] have been proposed, and both the foreground and background information is considered from sequential images. In [11], two linear support vector machine (SVM) classifiers are trained with simple local binary pattern (LBP) features to detect and track a drogue object during aerial refueling. In addition, the correlation filter-based tracker [19]–[22] emerges as one of the most successful and popular tracking frameworks due to its promising performance and computational efficiency. In particular, Bolme et al. [19] leverage the correlation filter-based method to achieve object tracking, in which the minimum output sum of squared error (MOSSE) filter is proposed with an operation speed of 669 frames per second (FPS). Henriques et al. [20] propose a popular correlation filter-based object tracking algorithm, denoted by kernelized correlation filter (KCF), where the kernel technique is used to improve performance by allowing classification on a rich, high-dimensional feature space. The fact that cyclic matrices can be diagonalized in the Fourier space is leveraged to significantly speed up the matrix operation. However, because most of the aforementioned tracking methods employ relatively risky update schemes and are based on the spatiotemporal consistency of visual cues, they can only be used to handle short-term tracking. In other words, the tracking errors will inevitably be accumulated during long-term tracking, and the lack of robustness against the variations of object appearance (e.g., changes in geometry/photometry, different camera viewpoints, partial occlusion, or object disappearing) usually leads to tracking failure in challenging scenarios. On the other hand, some state-of-the-art object detection methods based on deep-learning (e.g., YOLOv3 [23], MASK R-CNN [24]) have been proposed, and are shown to be robust with excellent detection accuracy. However, the required computations are significant and may lead to the poor real-time performance on CPU or average performance GPU platforms.
In parallel, visual servoing control methods [25]–[29] can be classified into two major categories: image-based visual servoing (IBVS) and position-based visual servoing (PBVS) depending on whether the image measurements from the camera are directly implemented in the control loop. In the IBVS scheme, the controller is designed using two-dimensional pixel coordinates from the image plane. In [25], Zheng et al. propose an IBVS controller for a quadrotor. The trajectories of the image moment features are first designed, following the definition in the virtual image plane. Then, a feature trajectory tracking controller is proposed to track the designed trajectories. In contrast, the PBVS controller requires the reconstruction of the three-dimensional Cartesian coordinates of the observed object from visual data. The main advantage of PBVS is that the control law is precisely formulated in the Cartesian coordinate space such that the control problem (computation of the feedback signal) is separated from the pose estimation problem. In general, PBVS control can be resolved into two sub-tasks: relative position estimation (between the object and the UAV) and trajectory tracking controller design. In particular, most of existing methods need the prior model information of object or depth data measured by sensors (e.g., RGB-D sensor or stereo sensor). In [28], Popova and Liu develop a PBVS control method based on cascaded proportional-integral-derivative (PID) controllers for a quadrotor to track a ground moving object. In this work, the relative position between the object and the quadrotor is estimated based on the assumption that the object is always on flat ground during the whole tracking process. However, it is not general enough for most practical tracking cases. Also, extensive trajectory tracking control techniques have been proposed [30]–[37]. Some traditional linear control methods [30], [31] are employed to stabilize the quadrotor in a small neighborhood around the equilibrium by linearizing the dynamic model of the quadrotor. The authors of [12], [14], and [32] adopt the cascade PID controllers consisting of the inner-loop attitude subsystem and the outer-loop position/velocity subsystems. However, in real-world applications, the control system usually suffers from input deadzones, parametric uncertainties and external disturbances. Hence, nonlinear adaptive control methods have been widely proposed [33]–[37]. In [33], the authors proposed a neural network-based adaptive control method for a ship-mounted crane to achieve boom/rope positioning while simultaneously ensuring payload swing suppression in the presence of external disturbances and input dead-zone effects. In [35], an asymptotic tracking controller is presented for a quadrotor using the robust integral of the signum of the error (RISE) method and the immersion and invariance (I&I)-based adaptive control method. In [36], an adaptive backstepping control algorithm is proposed to drive a helicopter to achieve trajectory tracking. However, only numerical simulation results are presented in this work, and the control strategy is not verified in real-world flight experiments. In addition, although some research works have been conducted on trajectory tracking control or visual servoing control for quadrotors, most of them have been performed on the platforms which typically weigh more than
Mainly motivated by the above references, in this paper, we develop a
To the best of our knowledge, few works have reported stable and robust object tracking on a UAV weighing less than
The rest of this paper is organized as follows. In Section II, we give an overview of the overall system and present the hardware structure and system flow chart. Sections III and IV, respectively show the key components of the proposed tracking system: visual object tracking module and position-based visual servoing control module. In Section V, implementation details and experimental results are presented. In addition, some main notations used in this paper are given in Table I.
Notations | Definitions |
{I}, {B}, {C} | Inertial frame, body frame, and camera frame, respectively; {B} is assumed to be coincident with {C} |
$(u, v)$ | The coordinates of the point in the image plane |
$\phi,\theta,\psi$ | Euler angles, i.e., the roll, pitch, and yaw |
$^A{\bf{x}}$ | The vector ${\bf{x}}$ is expressed in the frame $\{A\}$ |
$^A_B{\bf{R}}$ | Rotation matrix from the frame {B} to the frame {A} |
$^A{\bf{p}}_B$ | The position of the origin of the frame $\{{B}\}$ expressed in the frame $\{{A}\}$ |
${\bf{S}}({{{\omega}}})$ | Skew-symmetric matrix of vector ${{\omega}}=[\omega_x,\omega_y,\omega_z]^T$, i.e., $ \left[ {0−ωyωzωy0−ωx−ωzωx0} \right]$ |
${\bf{I}}_n$ | $n \times n$ identity matrix |
In this section, we first introduce the hardware structure of the developed nano-scale UAV platform. Then, the flow chart of the proposed tracking system is presented.
1) Nano-Scale UAV Platform: Our UAV platform is the Crazyflie 2.0② (Fig. 1), a nano-scale quadrotor, which weighs only 27 g with a size of 92 mm
2) Positioning Deck: In order to provide the UAV platform with accurate pose estimation, we integrate a new positioning deck (Fig. 2) (with a weight of
3) Visual Module: In order to enable the considered nano-scale UAV with the capability of object tracking, a lightweight visual module is integrated, and correspondingly, the ground station is equipped with an image acquisition card to receive image frames in real time (30 Hz). This visual module weighs only 4.7 g and includes a camera and an analogue video transmitter as shown in Fig. 2. In particular, the camera consists of a complementary metal oxide semiconductor (CMOS) image sensor with an image resolution of
The flow chart of the proposed tracking system is shown in Fig. 3. First, the nano-scale quadrotor takes off from the ground. The visual object tracking module processes the received images to estimate the location of the object in the image plane. Then, the relative position between the object and the quadrotor in the inertial frame are calculated based on the visual tracking results. Finally, the adaptive tracking controller calculates the corresponding control input to track the object. In addition, if the visual object tracking module fails to track the object, the yaw angle of the quadrotor will be adjusted in time to search again the missing target. If the object fails to be found after a preset period of time, the quadrotor will land automatically for safety reasons.
As shown in Fig. 3, the proposed tracking system mainly consists of two key components: the visual object tracking module (Section III) and position-based visual servoing control module (Section IV), in which the considered PBVS control is resolved into two sub-tasks: relative position estimation (Section IV-A) and adaptive tracking controller design (Section IV-B).
In this section, we will detail the proposed visual object tracking module, where two operation modes are included, i.e., tracking an artificial object (the concentric circles), and tracking a non-artificial object (a pedestrian).
In the object tracking problem, the considered objects are usually divided into artificial and non-artificial objects. Generally speaking, the case of artificial objects is relatively simple with obvious features (e.g., special geometry shape or color), such that one can directly leverage frame by frame detection method to achieve tracking with these simple features. In comparison, the case of non-artificial objects, e.g., pedestrians, vehicles, and UAVs, is relatively complex since the selection and extraction of features are difficult. Therefore, for non-artificial objects, it is not possible to directly apply simple feature-based detection methods to realize stable and accurate tracking.
In this section, we first introduce an artificial object tracking algorithm based on the frame by frame detection method with concentric circles as an example. Then, we propose RMCTer, a dual-component object tracking algorithm, where the pedestrian is regarded as a tracking example of a non-artificial object. It is important to point out that the proposed RMCTer algorithm is general and can be easily applied to track other artificial or non-artificial objects.
The details are described in Algorithm 1 below, where
During flight, the concentric circles may distort into two concentric ellipses in the image, while the height (or width) ratio of the two ellipses (
Algorithm 1
Input:
Output:
1:
2:
3:
4: for
5: if (the number of points in
6:
7: if
8:
9:
10:
11: return TRUE
12: end if
13: end if
14: end for
15: return FALSE
In this section, we propose a robust multi-collaborative tracker (RMCTer) to achieve stable and accurate pedestrian tracking in real time. As shown in Fig. 4, the RMCTer consists of a short-term tracking module and a long-term processing module. In particular, a two-stage correlation filter-based tracker consisting of a translation filter and a scale filter is employed for the short-term tracking module. This component generally works accurately and efficiently in relatively stable scenarios. Along with this short-term module, in order to be robust against variations in the appearance of object and compensate for visual tracking errors, an efficient long-term processing module is presented. It is composed of a learning-based multi-object detection network (i.e., YOLOv3), and a two-stage template scoring component based on the image histogram and SURF (speeded up robust features) matching. Note that, although the detection network (YOLOv3) requires a relatively large amount of computation resource, our short-term module is computationally efficient with reliable short-term tracking performance. A periodic takeover control strategy is adopted for the proposed tracker to ensure both effectiveness and real-time performance. For each
The following subsections detail successively all the components of RMCTer: the two-stage short-term tracking module, long-term processing module and initialization/re-initialization approach.
1) Two-Stage Short-Term Tracking Module: In order to obtain accurate and efficient short-term tracking performance, we leverage a correlation filter-based tracker, where a two-stage filter process is performed to obtain translation estimation and scale estimation. In particular, the employed tracker is based on the KCF [20] and discriminative scale space correlation filter (DSSCF) [22]. Compared with [20]–[22] where only the current frame is used to train the discriminative model such that the temporal correlation is inevitably neglected, we use the recent frames in a sliding window to train our discriminative model. In this way, the tracking model is weighted by time order and inherently contains a temporal context, leading to a better description of the current object appearance.
For translation estimation, the goal of training is to find a classifier
minwk∑m=m0βm∑i(⟨w,ϕ(xi)⟩−yi)2+λ‖w‖2 | (1) |
where
The solution to (1) is given by
ˆα=k∑m=m0βmˆymk∑m=m0βmˆkxxm+λ | (2) |
where
kxx′=exp(−1σ2(‖x‖2+‖x′‖2−2F−1(C∑c=1cˆx∗⊙cˆx′))) | (3) |
where
During the detection, given a candidate image patch
ˆf(z)=ˆkxz⊙ˆα. | (4) |
Then, we take the inverse Fourier transform of
In order to handle scale variation, it follows from [22] that a second-stage one dimensional correlation filter is performed. To evaluate the scale filter, the scale pyramids are built around the tracking target. In particular, the image patches centered around the location obtained by the first-stage translation filter are cropped from the image. The sizes of image patches are
In addition, the short-term tracking state is estimated by the value of PSR [19], which is defined as
ε=gmax−μσ | (5) |
where
St={excellence,ε≥τ2success,τ1≤ε<τ2failure,ε<τ1. | (6) |
If the state is
2) Long-Term Processing Module: As described in Section I, a single short-term tracking module cannot effectively adapt to the variations of object appearance and handle tracking errors. Therefore, we propose a long-term processing module, which is composed of a state-of-the-art deep learning-based object detection network (i.e., YOLOv3) and a two-stage template scoring component.
In particular, the input frame is first processed by the YOLOv3 network, whose output consists of multiple bounding boxes containing various objects and the corresponding labels. Then, the image patches corresponding to the bounding boxes with pedestrian labels are cropped from the image and selected as the candidate image patches. If there are no bounding boxes with the pedestrian label, this frame is directly processed by the short-term tracking module, and the next frame is input to the long-term processing module. If this condition occurs in
δmax=maxi{Hi+ℓA×Ki} | (7) |
where
x=ρxl+(1−ρ)xs | (8) |
where
As the tracking process proceeds, the template images should be updated in real time due to possible variations in the object appearance. Therefore, we perform a fast templates updating method. In particular, the tracking patches with short-term tracking state
3) Initialization/Re-Initialization: The initialization/re-initialization approach becomes simple thanks to the well-established long-term processing module. To be specific, the processing result of the long-term processing module serves as the initial bounding box of the object after the long-term processing state stays as
Based on the visual object tracking results (Section III), we propose a PBVS control method for a nano-scale quadrotor to achieve tracking. In this section, we first introduce a method of estimating the relative position between the object and the quadrotor. Then, a nonlinear adaptive tracking controller is proposed by taking into account the dynamics of the nano-scale quadrotor and the presence of internal and external disturbances.
In order to perform the PBVS task, the relative position between the object and the quadrotor in the inertial frame must be calculated. Therefore, we propose a feasible two-stage estimation method, and the following assumptions are first imposed.
Assumption 1: The object height is relatively stable.
Assumption 2: In the initialization stage, the altitude of the quadrotor is available and the tracking object is on flat ground.
Remark 1: Assumption 1 is reasonable since most common practical tracking objects satisfy such an assumption, e.g., rigid-body, cars, standing pedestrian. Note that Assumption 2 is only needed in the initialization stage instead of being required in the whole tracking process as considered in [9] and [28]. Therefore, Assumption 2 can be more easily guaranteed and more applicable to the practical tracking assignment (e.g., slope ground, object moving up and moving down during tracking).
As shown in Fig. 5, we denote
IpfA=sAICRK−1[uA′vA′1]+IpC=sA[xfAyfAzfA]+IpCIpfB=sBICRK−1[uB′vB′1]+IpC=sB[xfByfBzfB]+IpC | (9) |
where
By denoting the height of the tracking object as
IpfB−IpfA=[00hp] | (10) |
i.e.,
sAxfA=sBxfB | (11) |
sAyfA=sByfB | (12) |
hp=sAzfA−sBzfB. | (13) |
In terms of (11) and (12), we have that
sB=sA2(xfAxfB+yfAyfB). | (14) |
Substituting (14) into (13) yields
sA=hpzfA−zfB2(xfAxfB+yfAyfB). | (15) |
In the tracking stage (Fig. 5(a)), the relative position between the object and the quadrotor in the inertial frame, i.e.,
Id=[dxdydz]=[IpC(1)−IpfA(1)IpC(2)−IpfA(2)IpC(3)−IpfA(3)]=[−sAxfA−sAyfA−sAzfA] | (16) |
where
Id=[hpxfAzfB2(xfAxfB+yfAyfB)−zfAhpyfAzfB2(xfAxfB+yfAyfB)−zfAhpzfAzfB2(xfAxfB+yfAyfB)−zfA]. | (17) |
However, the value of object height cannot be directly obtained since there is no prior model information of the tracking object. We therefore design a feasible method to estimate the object height.
In the initialization stage, as shown in Fig. 5(b), it follows from Assumption 2 that:
huo=IpC(3)−IpfB(3)=−sBzfB | (18) |
where
sB=−huozfB. | (19) |
Substituting (14) and (19) into (13) yields
hp=huo[1−zfBzfA2(xfAxfB+yfAyfB)]. | (20) |
Based on (9) and (20), the height of the object
Note that after initialization, our estimation method does not depend on the flat ground assumption and does not need the altitude information of the UAV. It is especially useful in the cases where the object moves up or moves down during tracking, or the altitude information of the UAV is not reliable (note that the upper measurement limit of ToF sensor is only
Next, we present a nonlinear adaptive tracking controller for the nano-scale quadrotor to achieve stable object tracking with consideration of the presence of internal and external disturbances.
For notation simplicity,
1) The Nano-Scale Quadrotor Model: By considering the nano-scale quadrotor model as a rigid body, its kinematics and dynamics equations are described as follows [41]:
˙p=v | (21) |
m˙v=−mge3+Rf+Δf | (22) |
f=Fe3 | (23) |
˙R=RS(ω) | (24) |
J˙ω=−S(ω)Jω+τ+Δτ | (25) |
where
R=[cθcψcψsθsϕ−cϕsψcϕcψsθ−sϕsψcθsψsψsθsϕ+cϕcψcψsψsθ−sϕcψ−sθcθsϕcθcϕ] |
with
The applied lift force
[Fτxτyτz]=[11110−d0dd0−d0k−kk−k][l1l2l3l4] | (26) |
where
Assumption 3: The disturbance forces
|Δfi|≤ϖi,|Δτi|≤Θi,i=x,y,z. | (27) |
Assumption 4: During the flight progress, the pitch angle
Remark 2: Assumption 3 is valid because the forces
2) Simplified Model: Due to the strong coupling of the above quadrotor model (21)–(25), it should be decoupled to facilitate the control design. In particular, the quadrotor model can be divided into two subsystems [44]
a) Altitude system
˙pz=vz | (28) |
m˙vz=−mg+cθcϕF+Δfz. | (29) |
b) Longitudinal-lateral subsystem
˙ˉp=ˉv | (30) |
m˙ˉv=FˉR+Δˉf | (31) |
˙ˉR=ˇRˉω | (32) |
˙ψ=sϕcθωy+cϕcθωz | (33) |
J˙ω=−S(ω)Jω+τ+Δτ | (34) |
where
3) Desired Trajectory: We define the desired trajectories of the nano-scale quadrotor in the inertial frame as
[prψr]=[p+Δdψ+Δφ] | (35) |
Δd=d−dexp | (36) |
where
4) Control Law Design: Before designing the controller, a lemma is first introduced.
Lemma 1: Given any scalar positive function
|η|−η2√η2+ε(t)<√ε(t),∀η∈R. | (37) |
Proof: Since
|η|−η2√η2+ε(t)=|η|√η2+ε(t)−|η|√η2+ε(t)<√η2+ε(t)−|η|≤√η2+√ε(t)−|η|=√ε(t). | (38) |
■
Our control objective is to develop control thrust
We first consider the altitude subsystem, and derive a control thrust
˙˜pz=˙pz−˙prz=vz−˙prz. | (39) |
Define a Lyapunav function
˙Vpz=˜pz(vz−˙prz). | (40) |
Design a virtual control law
αz=−kpz˜pz+˙prz,kpz>0. | (41) |
Substituting (41) into (40) yields
˙Vpz=−kpz˜p2z+˜pz˜vz | (42) |
where
m˙˜vz=m(˙vz−˙αz)=−mg+cθcϕF+Δfz−m˙αz. | (43) |
Assign a Lyapunov function
˙Vvz=˜vz(−mg+cθcϕF+Δfz−m˙αz)≤˜vz(−mg+cθcϕF−m˙αz)+|˜vz|ϖz. |
In view of Lemma 1, it follows that:
˙Vvz<˜vz(−mg+cθcϕF−m˙αz+˜vzϖz√˜v2z+εvz)+√εvzϖz. | (44) |
Under Assumption 4, we design
F=1cθcϕ(m(g+˙αz)−kvz˜vz−˜pz−˜vzˆϖz√˜v2z+εvz) | (45) |
where
˙V∗vz<−kvz˜v2z+˜vz(˜vz˜ϖz√˜v2z+εvz−˜pz)+√εvzϖz−1ςvz˜ϖz˙ˆϖz. |
Then, the update law for
˙ˆϖz=ςvz(˜v2z√˜v2z+εvz−σvzˆϖz),σvz>0 | (46) |
and it follows that:
˙V∗vz<−kvz˜v2z−˜vz˜pz+√εvzϖz+σvz˜ϖzˆϖz. | (47) |
We next consider the longitudinal-lateral subsystem, and design
Define the longitudinal-lateral position error
˙Vˉp=˜ˉpT(ˉv−˙ˉpr). | (48) |
We then design a virtual control law
αˉp=−kˉp˜ˉp+˙ˉpr,kpz>0. | (49) |
Substituting (49) into (48) yields
˙Vˉp=−kˉp‖˜ˉp‖2+˜ˉpT˜ˉv | (50) |
where
m˙˜ˉv=m˙ˉv−m˙αˉp=FˉR+Δˉf−m˙αˉp. | (51) |
Assign a Lyapunov function
˙Vˉv=˜ˉvT(FˉR+Δˉf−m˙αˉp)≤˜ˉvT(FˉR−m˙αˉp)+∑i=x,y|˜ˉvi|ϖi. | (52) |
In view of Lemma 1, it follows that:
˙Vˉv<˜ˉvT(FˉR−m˙αˉp+Fεˉv(˜ˉv)ˉϖ)+√εˉv∑i=x,yϖi | (53) |
where
αˉv=1F(−kˉv˜ˉv+m˙αˉp−Fεˉv(˜ˉv)ˆˉϖ−˜ˉp) | (54) |
where
V∗ˉv=Vˉv+12˜ˉϖTΞ−1ˉv˜ˉϖ | (55) |
where
˙V∗ˉv<−kˉv‖˜ˉv‖2+˜ˉvT(F˜ˉR−˜ˉp+Fεˉv(˜ˉv)˜ˉϖ)+√εˉv∑i=x,yϖi−˜ˉϖTΞ−1ˉv˙ˆˉϖ | (56) |
where
˙ˆˉϖ=Ξˉv(Fεˉv(˜ˉv)˜ˉv−σˉvˆˉϖ),σˉv>0 | (57) |
such that
˙V∗ˉv<−kˉv‖˜ˉv‖2+˜ˉvT(F˜ˉR−˜ˉp)+√εˉv∑i=x,yϖi+σˉv˜ˉϖTˆˉϖ. | (58) |
In view of (32), the error dynamics of
˙VˉR=˜ˉRT(ˇRˉω−˙αˉv). | (59) |
Design a virtual control
αˉR=ˇR−1(−kˉR˜ˉR−F˜ˉv+˙αˉv). | (60) |
Substituting (60) into (59) yields
˙VˉR=−kˉR‖˜ˉR‖2+˜ˉRT(−F˜ˉv+ˇR˜ˉω) | (61) |
where
Define the yaw error
˙Vψ=˜ψ(sϕcθωy+cϕcθωz−˙ψr). | (62) |
Design a virtual control law
αψ=cθcϕ(−kψ˜ψ+˙ψr−sϕcθωy),kψ>0. | (63) |
Substituting (63) into (62) yields
˙Vψ=−kψ˜ψ2+cϕcθ˜ψ˜ωz | (64) |
where
Define
J˙˜ω=J˙ω−J˙αR=−S(ω)Jω+τ+Δτ−J˙αR. | (65) |
In view of Assumption 3, choose a Lyapunov function
˙Vω=˜ωT(−S(ω)Jω+τ+Δτ−J˙αR)≤˜ωT(−S(ω)Jω+τ−J˙αR)+∑i=x,y,z|˜ωi|Θi. | (66) |
From Lemma 1, it follows that:
˙Vω<˜ωT[−S(ω)Jω+τ−J˙αR+Fεω(˜ω)Θ]+√εω∑i=x,y,zΘi | (67) |
where
τ=−kω˜ω−ς+S(ω)Jω+J˙αR−Fεω(˜ω)ˆΘ | (68) |
where
˙V∗ω<−kω‖˜ω‖2+˜ωT[Fεω(˜ω)˜Θ−ς]+√εω∑i=x,y,zΘi−˜ΘTΞ−1ω˙ˆΘ. | (69) |
In addition, the update law for
˙ˆΘ=Ξω(Fεω(˜ω)˜ω−σωˆΘ),σω>0 | (70) |
then it follows that:
˙V∗ω<−kω‖˜ω‖2−˜ωTς+√εω∑i=x,y,zΘi+σω˜ΘTˆΘ. | (71) |
Theorem 1: Consider the nano-scale quadrotor system (21)–(25). Suppose Assumptions 3 and 4 hold. The proposed control laws (45), (68), and the parameter update laws (46), (57), (70) guarantee that the tracking errors of the closed-loop system remain uniformly bounded and converge to a small compact set containing the origin.
Proof: Consider the Lyapunov function
V=Vpz+V∗vz+Vˉp+V∗ˉv+VˉR+Vψ+V∗ω. |
In terms of (42), (47), (50), (58), (61), (64), and (71), the derivative of
˙V<−kpz˜p2z−kvz˜v2z−kˉp‖˜ˉp‖2−kˉv‖˜ˉv‖2−kˉR‖˜ˉR‖2−kψ˜ψ2−kω‖˜ω‖2+√εvzϖz+√εˉv∑i=x,yϖi+√εω∑i=x,y,zΘi+σvz˜ϖzˆϖz+σˉv˜ˉϖTˆˉϖ+σω˜ΘTˆΘ≤−kpz˜p2z−kvz˜v2z−kˉp‖˜ˉp‖2−kˉv‖˜ˉv‖2−kˉR‖˜ˉR‖2−kψ˜ψ2−kω‖˜ω‖2−σvz2˜ϖ2z−σˉv2‖˜ˉϖ‖2−σω2‖˜Θ‖2+√εvzϖz+√εˉv∑i=x,yϖi+√εω∑i=x,y,zΘi+σvz‖ϖz‖22+σˉv‖ˉϖ‖22+σω‖Θ‖22≤−2kminV+C |
where
Remark 3: It follows from (60) and (68) that the derivatives of
˙α=h,h=−λ0|α−αR|12sign(α−αR)+α′α′=−λ1sign(α′−h) | (72) |
where
Remark 4: Compared with the visual servoing controllers adopted in [14], [25], and [28], the proposed algorithm considers the existence of disturbances, in which we leverage the
In this section, we demonstrate the capabilities of the proposed tracking system with the tailor-made nano-scale quadrotor platform. In particular, the experiments consist of four parts. First, we demonstrate the visual object tracking module (Section III), which includes two operation modes: tracking an artificial object (the concentric circles), and tracking a non-artificial object (a pedestrian). Second, the method of estimating the relative position between the object and the quadrotor (Section IV-A) is verified. Third, hovering control tests in the presence of periodic disturbances are performed to demonstrate the proposed adaptive tracking controller (Section IV-B). Last but not least, real-world object tracking flight tests are performed to validate the system. The video of the experiments is available online at
We then present the implementation details. In particular, the parameters for RMCTer are set as follows: the number of frames used for initialization is
As described in Section III, the proposed visual object tracking module includes two operation modes: tracking an artificial object (the concentric circles) and tracking a non-artificial object (a pedestrian).
First, we compare the developed concentric circle tracking algorithm (Section III-A) with the method adopted in [40]. The result is shown in Fig. 6. The left image is the result by leveraging the method of [40], and the right image shows the result by performing the proposed algorithm. The time consumptions of algorithms are marked with yellow text in the lower left corner of the images. It can be seen that, although these two methods both successfully identify the concentric circles, the proposed algorithm uses less time due to the use of the inclusion relationship of concentric circles.
Next, we validate the capabilities of the proposed tracking algorithm RMCTer (Section III-B) on public datasets and in real-world environment. We conduct the evaluation on the popular benchmark proposed by [47], which contains
In order to evaluate the accuracy of relative position estimation introduced in Section IV-A, we conduct two indoor experiments, where the object is fixed and we manually control the quadrotor moving to change the relative position between the object and the quadrotor in real time. Meanwhile, an OptiTrack④ motion capture system in Fig. 9 is used to record the three-dimensional positions of the object and the quadrotor to obtain the groundtruth of relative position. The experimental results are shown in Figs. 10 and 11. In particular, Fig. 10 shows the comparison between the estimated and groundtruth of the relative position between the nano-scale quadrotor and the concentric circles, and Fig. 11 shows the comparison between the quadrotor and the pedestrian. It can be seen that the estimated results achieve good accuracy. In particular, corresponding to Fig. 10, the estimation errors are respectively
In order to verify the proposed tracking controller (Section IV-B), adaptive hovering control experiment is performed. The control objective is to let the nano-scale quadrotor hovering at the original point even under a periodic disturbance. The desired horizontal position is set to be
To demonstrate the whole proposed tracking system, real-world object tracking flight tests are performed. In the experiment, the tracking object moves randomly in three-dimensional space, and the disturbance is the same as the one for the hovering control test (Section V-C). The three-dimensional positions of the object and the quadrotor are recorded by the OptiTrack system. Figs. 13 and 14 show the tracking performance of concentric circle tracking and single pedestrian tracking, respectively. It can be seen that the trajectories of the nano-scale quadrotor are consistent with those of different objects. In particular, in Fig. 13, the average tracking errors are 6.78 cm, 8.65 cm, and 5.19 cm, respectively for the
In this paper, we propose a monocular vision-based object tracking and servoing control system using a tailor-made nano-scale UAV platform. In order to guarantee tracking robustness and accuracy, we propose a novel visual object tracking algorithm, i.e., RMCTer, where a powerful short-term tracking module and an efficient long-term processing module are tightly integrated. Moreover, an adaptive PBVS control method is proposed to achieve stable and accurate object tracking in the presence of internal and external disturbances. The experimental results demonstrate that each component of the proposed system is effective, and that the whole tracking system has high accuracy, strong stability, and low latency. An interesting direction for further work is the realization of a lightweight and robust state estimation algorithm (i.e., visual-inertial odometry or simultaneous localization and mapping) on the nano-scale UAV platform, such that stable object tracking can be obtained in more complex and more challenging environment.
[1] |
K. Fregene, “Unmanned aerial vehicles and control: Lockheed martin advanced technology laboratories,” IEEE Control Syst. Mag., vol. 32, no. 5, pp. 32–34, Oct. 2012. doi: 10.1109/MCS.2012.2205474
|
[2] |
X. Zhang, B. Xian, B. Zhao, and Y. Zhang, “Autonomous flight control of a nano quadrotor helicopter in a GPS-denied environment using on-board vision,” IEEE Trans. Ind. Electron., vol. 62, no. 10, pp. 6392–6403, Oct. 2015. doi: 10.1109/TIE.2015.2420036
|
[3] |
G. W. Cai, J. Dias, and L. Seneviratne, “A survey of small-scale unmanned aerial vehicles: Recent advances and future development trend,” Unmanned Syst., vol. 2, no. 2, pp. 175–199, Apr. 2014. doi: 10.1142/S2301385014300017
|
[4] |
V. Srisamosorn, N. Kuwahara, A. Yamashita, T. Ogata, and J. Ota, “Human-tracking system using quadrotors and multiple environmental cameras for face-tracking application,” Int. J. Adv. Robot. Syst., vol. 14, no. 5, pp. 1–18, Oct. 2017.
|
[5] |
A. Briod, J. C. Zufferey, and D. Floreano, “Optic-flow based control of a 46 g quadrotor,” in Proc. IEEE/RSJ Int. Conf. Intelligent Robots and Systems, Tokyo, Japan, 2013, pp. 149–158.
|
[6] |
D. Palossi, J. Singh, M. Magno, and L. Benini, “Target following on nano-scale unmanned aerial vehicles,” in Proc. 7th IEEE Int. Workshop on Advances in Sensors and Interfaces, Vieste, Italy, 2017, pp. 170–175.
|
[7] |
D. Floreano and R. J. Wood, “Science, technology and the future of small autonomous drones,” Nature, vol. 521, no. 7553, pp. 460–466, May 2015. doi: 10.1038/nature14542
|
[8] |
P. Serra, R. Cunha, T. Hamel, D. Cabecinhas, and C. Silvestre, “Landing of a quadrotor on a moving target using dynamic image-based visual servo control,” IEEE Trans. Robot., vol. 32, no. 6, pp. 1524–1535, Dec. 2016. doi: 10.1109/TRO.2016.2604495
|
[9] |
H. Cheng, L. S. Lin, Z. Q. Zheng, Y. W. Guan, and Z. C. Liu, “An autonomous vision-based target tracking system for rotorcraft unmanned aerial vehicles,” in Proc. IEEE/RSJ Int. Conf. Intelligent Robots and Systems, Vancouver, Canada, 2017, pp. 1732–1738.
|
[10] |
A. Rodriguez-Ramos, A. Alvarez-Fernandez, H. Bavle, P. Campoy, and J. P. How, “Vision-based multirotor following using synthetic learning techniques,” Sensors, vol. 19, no. 21, pp. 4794, Nov. 2019. doi: 10.3390/s19214794
|
[11] |
Y. J. Yin, X. G. Wang, D. Xu, F. F. Liu, Y. L. Wang, and W. Q. Wu, “Robust visual detection-learning-tracking framework for autonomous aerial refueling of UAVs,” IEEE Trans. Instrum. Meas., vol. 65, no. 3, pp. 510–521, Mar. 2016. doi: 10.1109/TIM.2015.2509318
|
[12] |
P. Campoy, J. F. Correa, I. Mondragón, I. Mondragón, C. Martínez, M. Olivares, L. Mejías, and J. Artieda, “Computer vision onboard UAVs for civilian tasks,” J. Intell. Robot. Syst., vol. 54, no. 1–3, pp. 105–134, 2009. doi: 10.1007/s10846-008-9256-z
|
[13] |
J. Pestana, J. L. Sanchez-Lopez, S. Saripalli, and P. Campoy, “Computer vision based general object following for GPS-denied multirotor unmanned vehicles,” in Proc. American Control Conf., Portland, USA, 2014, pp. 1886–1891.
|
[14] |
R. Li, M. Pang, C. Zhao, G. Y. Zhou, and L. Fang, “Monocular long-term target following on UAVs,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition Workshops, Las Vegas, USA, 2016, pp. 29–37.
|
[15] |
Z. Kalal, K. Mikolajczyk, and J. Matas, “Tracking-learning-detection,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 34, no. 7, pp. 1409–1422, Jul. 2012. doi: 10.1109/TPAMI.2011.239
|
[16] |
C. L. Zitnick and P. Dollár, “Edge boxes: Locating object proposals from edges,” in Proc. 13th European Conf. Computer Vision, Zurich, Switzerland, 2014, pp. 391–405.
|
[17] |
B. Babenko, M. H. Yang, and S. Belongie, “Robust object tracking with online multiple instance learning,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 8, pp. 1619–1632, Aug. 2011. doi: 10.1109/TPAMI.2010.226
|
[18] |
S. Hare, S. Golodetz, A. Saffari, V. Vineet, M. M. Cheng, S. L. Hicks, and P. H. S. Torr, “Struck: Structured output tracking with kernels,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 38, no. 10, pp. 2096–2109, Oct. 2016. doi: 10.1109/TPAMI.2015.2509974
|
[19] |
D. S. Bolme, J. R. Beveridge, B. A. Draper, and Y. M. Lui, “Visual object tracking using adaptive correlation filters,” in Proc. IEEE Computer Society Conf. Computer Vision and Pattern Recognition, San Francisco, USA, 2010, pp. 2544–2550.
|
[20] |
J. F. Henriques, R. Caseiro, P. Martins, and J. Batista, “High-speed tracking with kernelized correlation filters,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 37, no. 3, pp. 583–596, Mar. 2015. doi: 10.1109/TPAMI.2014.2345390
|
[21] |
J. F. Henriques, R. Caseiro, P. Martins, and J. Batista, “Exploiting the circulant structure of tracking-by-detection with kernels,” in Proc. 12th European Conf. Computer Vision, Florence, Italy, 2012, pp. 702–715.
|
[22] |
M. Danelljan, G. Häger, F. S. Khan, and M. Felsberg, “Discriminative scale space tracking,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 8, pp. 1561–1575, Aug. 2017. doi: 10.1109/TPAMI.2016.2609928
|
[23] |
J. Redmon and A. Farhadi, “YOLOv3: An incremental improvement,” arXiv preprint arXiv: 1804.02767, 2018.
|
[24] |
K. M. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask R-CNN,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 42, no. 2, pp. 386–397, Feb. 2020. doi: 10.1109/TPAMI.2018.2844175
|
[25] |
D. L. Zheng, H. S. Wang, W. D. Chen, and Y. Wang, “Planning and tracking in image space for image-based visual servoing of a quadrotor,” IEEE Trans. Ind. Electron., vol. 65, no. 4, pp. 3376–3385, Apr. 2018. doi: 10.1109/TIE.2017.2752124
|
[26] |
F. Chaumette and S. Hutchinson, “Visual servo control. I. Basic approaches,” IEEE Robot. &Automat. Mag., vol. 13, no. 4, pp. 82–90, Dec. 2006.
|
[27] |
N. Guenard, T. Hamel, and R. Mahony, “A practical visual servo control for an unmanned aerial vehicle,” IEEE Trans. Robot., vol. 24, no. 2, pp. 331–340, Apr. 2008. doi: 10.1109/TRO.2008.916666
|
[28] |
M. G. Popova and H. H. Liu, “Position-based visual servoing for target tracking by a quadrotor UAV,” in Proc. AIAA Guidance, Navigation, and Control Conf., San Diego, USA, 2016, pp. 2092–2103.
|
[29] |
W. B. Zhao, H. Liu, F. L. Lewis, K. P. Valavanis, and X. L. Wang, “Robust visual servoing control for ground target tracking of quadrotors,” IEEE Trans. Control Syst. Technol., vol. 28, no. 5, pp. 1980–1987, Sept. 2020. doi: 10.1109/TCST.2019.2922159
|
[30] |
F. Rinaldi, S. Chiesa, and F. Quagliotti, “Linear quadratic control for quadrotors UAVs dynamics and formation flight,” J. Intell. &Robot. Syst., vol. 70, no. 1–4, pp. 203–220, Apr. 2013.
|
[31] |
D. Mellinger, M. Shomin, and V. Kumar, “Control of quadrotors for robust perching and landing,” in Proc. Int. Powered Lift Conf., Philadelphia, USA, 2010, pp. 205–225.
|
[32] |
Y. Z. Liu and Z. Y. Meng, “Visual object tracking for a nano-scale quadrotor,” in Proc. 15th Int. Conf. Control, Automation, Robotics and Vision, Singapore, 2018, pp. 843–847.
|
[33] |
T. Yang, N. Sun, H. Chen, and Y. C. Fang, “Neural network-based adaptive antiswing control of an underactuated ship-mounted crane with roll motions and input dead zones,” IEEE Trans. Neural Netw. Learn. Syst., vol. 31, no. 3, pp. 901–914, Mar. 2020. doi: 10.1109/TNNLS.2019.2910580
|
[34] |
N. Sun, D. K. Liang, Y. M. Wu, Y. H. Chen, Y. D. Qin, and Y. C. Fang, “Adaptive control for pneumatic artificial muscle systems with parametric uncertainties and unidirectional input constraints,” IEEE Trans. Ind. Informatics, vol. 16, no. 2, pp. 969–979, Feb. 2020. doi: 10.1109/TII.2019.2923715
|
[35] |
B. Zhao, B. Xian, Y. Zhang, and X. Zhang, “Nonlinear robust adaptive tracking control of a quadrotor UAV via immersion and invariance methodology,” IEEE Trans. Ind. Electron., vol. 62, no. 5, pp. 2891–2902, May 2015. doi: 10.1109/TIE.2014.2364982
|
[36] |
Y. Zou and W. Huo, “Adaptive tracking control for a model helicopter with disturbances,” in Proc. American Control Conf, Chicago, USA, 2015, pp. 3824–3829.
|
[37] |
P. Marantos, C. P. Bechlioulis, and K. J. Kyriakopoulos, “Robust trajectory tracking control for small-scale unmanned helicopters with model uncertainties,” IEEE Trans. Control Syst. Technol., vol. 25, no. 6, pp. 2010–2021, Nov. 2017. doi: 10.1109/TCST.2016.2642160
|
[38] |
M. W. Mueller, M. Hamer, and R. D’Andrea, “Fusing ultra-wideband range measurements with accelerometers and rate gyroscopes for quadrocopter state estimation,” in Proc. IEEE Int. Conf. Robotics & Automation, Seattle, USA, 2015, pp. 1730–1736.
|
[39] |
M. Greiff, “Modelling and control of the crazyflie quadrotor for aggressive and autonomous flight by optical flow driven state estimation,” M.S. thesis, Lund University, Sweden, 2017.
|
[40] |
Y. J. Li, “The design and implementation of four rotor UAV fixed landing system based on machine vision,” M.S. thesis, South China University of Technology, Guangzhou, China, 2015.
|
[41] |
R. Mahony, V. Kumar, and P. Corke, “Multirotor aerial vehicles: Modeling, estimation, and control of quadrotor,” IEEE Robot. &Autom. Mag., vol. 19, no. 3, pp. 20–32, Sept. 2012.
|
[42] |
Y. Zou, “Trajectory tracking controller for quadrotors without velocity and angular velocity measurements,” IET Control Theory &Appl., vol. 11, no. 1, pp. 101–109, Jan. 2017.
|
[43] |
M. Huang, B. Xian, C. Diao, K. Y. Yang, and Y. Feng, “Adaptive tracking control of underactuated quadrotor unmanned aerial vehicles via backstepping,” in Proc. American Control Conf., Baltimore, USA, 2010, pp. 2076–2081.
|
[44] |
B. Zhu and W. Huo, “Robust nonlinear control for a model-scaled helicopter with parameter uncertainties,” Nonlinear Dyn., vol. 73, no. 1–2, pp. 1139–1154, Jul. 2013. doi: 10.1007/s11071-013-0858-z
|
[45] |
M. Krstić, I. Kanellakopoulos, and P. Kokotović, Nonlinear and Adaptive Control Design. New York, UK: Wiley, 1995.
|
[46] |
A. Levant, “Higher-order sliding modes, differentiation and output-feedback control,” Int. J. Control, vol. 76, no. 9–10, pp. 924–941, 2003. doi: 10.1080/0020717031000099029
|
[47] |
Y. Wu, J. Lim, and M. H. Yang, “Object tracking benchmark,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 37, no. 9, pp. 1834–1848, Sept. 2015. doi: 10.1109/TPAMI.2014.2388226
|
Notations | Definitions |
{I}, {B}, {C} | Inertial frame, body frame, and camera frame, respectively; {B} is assumed to be coincident with {C} |
$(u, v)$ | The coordinates of the point in the image plane |
$\phi,\theta,\psi$ | Euler angles, i.e., the roll, pitch, and yaw |
$^A{\bf{x}}$ | The vector ${\bf{x}}$ is expressed in the frame $\{A\}$ |
$^A_B{\bf{R}}$ | Rotation matrix from the frame {B} to the frame {A} |
$^A{\bf{p}}_B$ | The position of the origin of the frame $\{{B}\}$ expressed in the frame $\{{A}\}$ |
${\bf{S}}({{{\omega}}})$ | Skew-symmetric matrix of vector ${{\omega}}=[\omega_x,\omega_y,\omega_z]^T$, i.e., $ \left[ {0−ωyωzωy0−ωx−ωzωx0} \right]$ |
${\bf{I}}_n$ | $n \times n$ identity matrix |