Abstract

Conventional methods for inspection and monitoring using human operators are tedious,
time consuming, and prone to errors. Technological innovations in the fields of image processing
and machine vision have provided opportunities to automate these manual tasks in
applications as diverse as surveillance, medical diagnostics, remote sensing, industrial quality
control, and precision agriculture. Such automation can increase efficiency, productivity,
effectiveness, speed, quality, and yield. Researchers over the years have investigated the
feasibility, applications, and implications of different sensors and image processing algorithms,
and now the use of low-cost sensors and mobile robots for monitoring and inspection
applications is a reality.
In this dissertation I focus on inspection and monitoring tasks that can be automated by applying
image processing techniques to the output obtained from optical sensors (color and
color-depth sensors). Although some of the industrial and military applications involving
vision sensors for automation have already evolved into real-world products, there are many
areas that still require thorough investigation and research for real-world implementation. I
target applications that have societal importance in the areas of safety and precision agriculture.
First, I present QuickBlaze, a flame and smoke detection system based on vision sensors,
aimed at early detection of fire incidents in open or enclosed, indoor or outdoor environments.
We use simple image and video processing techniques to compute motion and color cues,
enabling segmentation of flame and smoke candidates from the background in real time.
QuickBlaze does not require any offline training, although parameters must be adjusted
manually during a calibration phase to suit the particular camera's field of view and the
surrounding environment. In an extensive empirical evaluation benchmarking QuickBlaze
against commercial fire detection software, we find that it responds 2.66 times faster and
localizes fire incidents more accurately. By detecting fire early in the burning process, our
real-time video processing approach has the potential to shorten the critical period from
combustion to human response.
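The combination of motion and color cues described above can be sketched as follows. This is a minimal illustration in Python/NumPy on synthetic frames; the thresholds and the red-dominance color rule are illustrative assumptions, not QuickBlaze's actual parameters.

```python
import numpy as np

def flame_candidates(prev_frame, frame, motion_thresh=25, r_min=180):
    """Combine a motion cue (frame differencing) with a color cue
    (a simple red-dominant rule) to mask flame-candidate pixels.
    Thresholds are illustrative and would be tuned during calibration."""
    # Motion cue: per-pixel absolute difference between consecutive frames.
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16)).max(axis=2)
    motion = diff > motion_thresh
    # Color cue: flame pixels tend to satisfy R > G > B with a bright red channel.
    r = frame[..., 0].astype(int)
    g = frame[..., 1].astype(int)
    b = frame[..., 2].astype(int)
    color = (r > r_min) & (r > g) & (g > b)
    return motion & color

# Synthetic 8x8 RGB frames: a bright flame-like patch appears in the new frame.
prev = np.zeros((8, 8, 3), dtype=np.uint8)
cur = np.zeros((8, 8, 3), dtype=np.uint8)
cur[2:4, 2:4] = (220, 120, 40)   # flame-colored pixels

mask = flame_candidates(prev, cur)
print(mask.sum())  # 4 candidate pixels
```

In a real pipeline, the resulting candidate mask would be cleaned up and tracked over several frames before an alarm is raised.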
Second, I present a novel method for joint localization of a quadcopter pursuer with a
monocular camera and an arbitrary target. Our focus is on mobile robots that are capable
of tracking and monitoring a target in scenarios such as person/child/animal monitoring or
tracking a fugitive. Our method localizes both the pursuer and target with respect to a common
reference frame. We show that predicting and correcting pursuer and target trajectories
simultaneously produces better results than standard approaches to estimating relative target
trajectories in a 3D coordinate system. The effectiveness of the proposed method is demonstrated
by a series of experiments with a real quadcopter pursuing a human. The results
show that the visual tracker can deal effectively with target occlusions and that joint localization
outperforms standard localization methods.
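The idea of jointly predicting and correcting pursuer and target states can be illustrated with a toy Kalman filter over a stacked state vector. This hypothetical 1-D constant-velocity sketch, with assumed noise covariances, is only an illustration of the joint-estimation principle, not the dissertation's actual 3-D formulation.

```python
import numpy as np

dt = 1.0
# Stacked joint state: [pursuer pos, pursuer vel, target pos, target vel].
F = np.array([[1, dt, 0, 0],
              [0, 1,  0, 0],
              [0, 0,  1, dt],
              [0, 0,  0, 1]], dtype=float)
# Joint measurement: pursuer position (e.g. odometry) and the target's
# offset relative to the pursuer (e.g. from the visual tracker).
H = np.array([[ 1, 0, 0, 0],
              [-1, 0, 1, 0]], dtype=float)
Q = np.eye(4) * 1e-4   # process noise covariance (assumed)
R = np.eye(2) * 1e-2   # measurement noise covariance (assumed)

x = np.zeros(4)        # initial joint estimate
P = np.eye(4)

for k in range(1, 21):
    pursuer, target = 1.0 * k, 5.0 + 0.5 * k   # ground-truth motion
    # Predict both trajectories with the shared motion model.
    x = F @ x
    P = F @ P @ F.T + Q
    # Correct with the joint measurement [pursuer pos, target - pursuer].
    z = np.array([pursuer, target - pursuer])
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(4) - K @ H) @ P

print(np.round(x, 1))  # estimates approach pursuer [20, 1] and target [15, 0.5]
```

Because the relative measurement couples the two sub-states, correcting them jointly lets pursuer odometry and visual target observations constrain each other, which is the intuition behind the improvement over independent relative-trajectory estimation.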
Third, I present a textured fruit segmentation method based on super-pixel over-segmentation,
dense SIFT descriptors, and bag-of-visual-word histogram classification within each super-pixel.
An empirical evaluation of the proposed technique for textured fruit segmentation
yields a 96.67% detection rate, a per-pixel accuracy of 97.657%, and a per-frame false alarm
rate of 0.645%, compared to a detection rate of 90.0%, accuracy of 84.94%, and false alarm
rate of 0.887% for the baseline sparse keypoint-based method. I conclude that super-pixel
over-segmentation, dense SIFT descriptors, and bag-of-visual-word histogram classification
are effective for in-field segmentation of textured green fruits from the background.
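The per-super-pixel bag-of-visual-words step can be sketched as follows: each local descriptor is assigned to its nearest visual word, and the normalized word histogram summarizes the super-pixel for classification. Random vectors stand in here for dense SIFT descriptors and a learned vocabulary.

```python
import numpy as np

def bovw_histogram(descriptors, vocabulary):
    """Quantize local descriptors against a visual-word vocabulary and
    return an L1-normalized bag-of-visual-words histogram.
    In the fruit segmentation pipeline this would be computed per super-pixel
    from dense SIFT descriptors; both inputs here are random stand-ins."""
    # Distance from every descriptor to every visual word.
    d = np.linalg.norm(descriptors[:, None, :] - vocabulary[None, :, :], axis=2)
    words = d.argmin(axis=1)                      # nearest-word assignment
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()                      # normalized histogram

rng = np.random.default_rng(0)
vocab = rng.normal(size=(50, 128))    # 50 visual words, 128-D (SIFT-sized)
desc = rng.normal(size=(200, 128))    # dense descriptors inside one super-pixel
h = bovw_histogram(desc, vocab)
print(h.shape, round(h.sum(), 6))     # (50,) 1.0
```

Each super-pixel's histogram would then be fed to a classifier (such as an SVM) to label it as fruit or background.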
Fourth, I present two new methods for automated counting of fruit in images of mango
tree canopies, one using texture-based dense segmentation and one using shape-based fruit
detection, and compare these methods with existing techniques. We tested
the robustness of each algorithm on multiple sets of images of mango trees acquired over
a period of three years. These image sets vary in imaging conditions (light and exposure),
distance to the tree, average number of fruit on the tree, orchard, and season. I find that
for fruit-background segmentation, either K-nearest neighbor pixel classification based on
color and smoothness, or pixel classification based on super-pixel over-segmentation, clustering
of dense SIFT (Scale-Invariant Feature Transform) features into visual words, and
bag-of-visual-word super-pixel classification using support vector machines, is more effective
than simple contrast- and color-based segmentation. I find that pixel classification is best
followed by fruit detection using an elliptical shape model or by blob detection using color
filtering and morphological image processing techniques.
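The counting step after blob detection can be sketched as a connected-components count on a binary mask. The toy mask below stands in for the output of color filtering and morphological cleanup; the breadth-first flood fill is a minimal illustration, not the dissertation's actual implementation.

```python
import numpy as np
from collections import deque

def count_blobs(mask):
    """Count 4-connected components in a binary mask via BFS flood fill.
    In a fruit-counting pipeline the mask would come from color filtering
    and morphological cleanup; here it is a hand-made toy example."""
    seen = np.zeros_like(mask, dtype=bool)
    rows, cols = mask.shape
    blobs = 0
    for i in range(rows):
        for j in range(cols):
            if mask[i, j] and not seen[i, j]:
                blobs += 1                       # new component found
                q = deque([(i, j)])
                seen[i, j] = True
                while q:                         # flood-fill the component
                    r, c = q.popleft()
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < rows and 0 <= cc < cols
                                and mask[rr, cc] and not seen[rr, cc]):
                            seen[rr, cc] = True
                            q.append((rr, cc))
    return blobs

mask = np.zeros((6, 6), dtype=bool)
mask[1:3, 1:3] = True      # first fruit blob
mask[4, 4] = True          # second fruit blob
print(count_blobs(mask))   # 2
```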
Fifth, I investigate modeling of natural objects using RGB-D sensors
and a combination of volumetric 3D reconstruction and parametric shape modeling.
We apply the general method to the specific case of detecting and modeling quadric objects
(pineapple fruit) in cluttered agricultural environments, towards applications in fruit health
monitoring and crop yield prediction. Our method first estimates the camera trajectory and then performs
volumetric reconstruction of the scene. Next, we detect fruit and segment out point
clouds that belong to fruit regions. We use two novel methods for robust estimation of a
parametric shape model from the dense point cloud: (i) MSAC-based robust fitting of an
ellipsoid to the 3D point cloud, and (ii) nonlinear least-squares minimization of dense SIFT
descriptor distances between fruit pixels in corresponding
frames. We compare our shape modeling methods with a baseline direct ellipsoid estimation
method. We find that our parametric shape modeling methods are more robust and better
able to estimate the size, shape, and volume of pineapple fruit than the baseline direct
method.
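The MSAC-based robust fitting idea can be sketched as follows. This simplified illustration fits an axis-aligned algebraic quadric to a synthetic, sphere-sampled point cloud with added clutter; the thresholds, iteration count, and axis-aligned restriction are assumptions of the sketch, not the actual method.

```python
import numpy as np

def fit_quadric(points):
    """Least-squares fit of an axis-aligned quadric
    A x^2 + B y^2 + C z^2 + D x + E y + F z = 1."""
    x, y, z = points.T
    M = np.column_stack([x * x, y * y, z * z, x, y, z])
    theta, *_ = np.linalg.lstsq(M, np.ones(len(points)), rcond=None)
    return theta

def residuals(points, theta):
    x, y, z = points.T
    M = np.column_stack([x * x, y * y, z * z, x, y, z])
    return np.abs(M @ theta - 1.0)

def msac_ellipsoid(points, iters=200, thresh=0.05, seed=0):
    """MSAC loop: minimal 6-point fits scored with a truncated quadratic
    loss, followed by a least-squares refit on the best model's inliers."""
    rng = np.random.default_rng(seed)
    best_score, best_theta = np.inf, None
    for _ in range(iters):
        sample = points[rng.choice(len(points), 6, replace=False)]
        theta = fit_quadric(sample)
        r = residuals(points, theta)
        score = (np.minimum(r, thresh) ** 2).sum()  # MSAC truncated loss
        if score < best_score:
            best_score, best_theta = score, theta
    inliers = residuals(points, best_theta) < thresh
    return fit_quadric(points[inliers])

# Synthetic ellipsoid surface: center (1, 2, 3), semi-axes (2, 1, 1.5).
rng = np.random.default_rng(1)
u = rng.normal(size=(300, 3))
u /= np.linalg.norm(u, axis=1, keepdims=True)      # random unit directions
surface = np.array([1, 2, 3]) + u * np.array([2, 1, 1.5])
outliers = rng.uniform(-5, 10, size=(60, 3))       # clutter points

A, B, C, D, E, F = msac_ellipsoid(np.vstack([surface, outliers]))
center = np.array([-D / (2 * A), -E / (2 * B), -F / (2 * C)])
print(np.round(center, 2))   # close to the true center [1, 2, 3]
```

The truncated loss is what distinguishes MSAC from plain RANSAC: inliers contribute their actual squared residual rather than zero, so models are ranked by fit quality as well as inlier count.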
The techniques proposed in this dissertation will aid the development of new, and the
evolution of existing, machine-vision-based civilian applications that are emerging with the
availability of low-cost optical sensors and computing systems.