Tech Report - Ray

Features and Capabilities

Computer Vision Architecture: Object detection

Computer Vision Architecture: Processing and Interfacing

Autonomous Identification of Humans

For autonomous identification of humans, three major components work together: intruder detection, intruder tracking, and tracking-based driving.

The intruder detection component is based on a deep learning approach. We will collect custom data of people (intruders) walking or running, captured from an aerial view, and train our own YOLOv5 model starting from the pretrained weights provided by Ultralytics, which were trained on the MS COCO dataset. With this component, the drone can detect objects in its camera's field of view in real time. The component takes the video stream from the camera and returns the bounding box of the intruder in the image whenever the intruder is in the field of view.
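As a minimal sketch of the last step, the function below picks the intruder's bounding box out of one frame's detections. It assumes each detection is a row of the form (x1, y1, x2, y2, confidence, class), which is the layout YOLOv5 reports for its results. The function name, the class index 0 for "person" (its index in MS COCO; a custom-trained model may use a different one), and the confidence threshold are illustrative assumptions, not values fixed by our design.

```python
PERSON_CLASS = 0  # assumption: index of the "person" class (0 in MS COCO)


def pick_intruder_box(detections, conf_threshold=0.5, person_class=PERSON_CLASS):
    """Return (x1, y1, x2, y2) of the most confident person detection, or None.

    `detections` is an iterable of rows (x1, y1, x2, y2, confidence, class).
    """
    candidates = [
        d for d in detections
        if int(d[5]) == person_class and d[4] >= conf_threshold
    ]
    if not candidates:
        return None  # no intruder in the field of view for this frame
    best = max(candidates, key=lambda d: d[4])
    return tuple(best[:4])


if __name__ == "__main__":
    frame_detections = [
        (10, 20, 50, 120, 0.90, 0),    # person, high confidence
        (200, 40, 260, 180, 0.60, 0),  # person, lower confidence
        (5, 5, 30, 30, 0.95, 2),       # a different class, ignored
    ]
    print(pick_intruder_box(frame_detections))
```

Returning None when nothing passes the filter lets the downstream components tell "no intruder in view" apart from a valid box.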

The intruder tracking component is based on Simple Online and Realtime Tracking with a Deep Association Metric (DeepSORT). Compared to other tracking techniques such as SORT, Boosting, KCF, and GOTURN, DeepSORT can track objects through longer periods of occlusion, which reduces the number of identity switches. The intruder tracking component takes the bounding box produced by the intruder detection component as its starting point and tracks the movement of the intruder for as long as the intruder remains in the field of view.
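The robustness to occlusion comes largely from matching on appearance as well as motion. The sketch below illustrates only the appearance part: each track keeps a gallery of past appearance embeddings, and a new detection is assigned to the track whose gallery contains the closest embedding by cosine distance. This is a simplified illustration and not the DeepSORT implementation. The full method also gates matches with a Kalman-filter motion model and solves the assignment with the Hungarian algorithm, whereas this sketch uses a greedy assignment. The function names and the 0.2 threshold are assumptions, and embeddings are assumed to be non-zero vectors.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors (0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def match_by_appearance(track_galleries, detection_embeddings, max_distance=0.2):
    """Greedily match detections to tracks by appearance.

    track_galleries: dict mapping track_id -> list of stored embeddings.
    detection_embeddings: list of embeddings, one per detection.
    Returns a dict mapping detection index -> track_id.
    """
    pairs = []
    for track_id, gallery in track_galleries.items():
        for det_idx, emb in enumerate(detection_embeddings):
            # Distance to a track = smallest distance to any stored embedding.
            dist = min(cosine_distance(g, emb) for g in gallery)
            if dist <= max_distance:
                pairs.append((dist, track_id, det_idx))

    pairs.sort()  # closest pairs are assigned first
    matches, used_tracks = {}, set()
    for dist, track_id, det_idx in pairs:
        if track_id in used_tracks or det_idx in matches:
            continue
        matches[det_idx] = track_id
        used_tracks.add(track_id)
    return matches


if __name__ == "__main__":
    galleries = {1: [[1.0, 0.0]], 2: [[0.0, 1.0]]}
    detections = [[0.1, 0.99], [0.98, 0.05]]
    print(match_by_appearance(galleries, detections))
```

Because the gallery persists while the intruder is hidden, a detection that reappears after an occlusion can still be matched to its original track identity instead of starting a new one.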

The tracking-based driving component is an assisting module for the intruder tracking component. While the tracking component follows the intruder, the coordinates and size of the intruder's bounding box change in real time. The tracking-based driving component takes this bounding box information and calculates how large the box is and where its centre lies relative to the image. If the bounding box is large compared to the image, the intruder is close to the drone and the drone needs to back off; if it is small, the intruder is far away and the drone needs to move closer. If the bounding box is not centred in the image, the drone needs to move sideways or up and down so that the intruder remains in the centre of the image. The component then sends drive commands to the drone based on this logic.
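The driving logic can be sketched as a small function that maps a bounding box to a drive command. This is a minimal sketch under stated assumptions: the DriveCommand type, the target area ratio, and the tolerance values are hypothetical placeholders that would need tuning on the real drone, and the commands are reduced to a direction of -1, 0, or +1 per axis instead of actual velocities.

```python
from dataclasses import dataclass


@dataclass
class DriveCommand:
    forward: float   # +1 move closer, -1 back off, 0 hold distance
    lateral: float   # +1 move right, -1 move left, 0 hold
    vertical: float  # +1 move up, -1 move down, 0 hold


def drive_from_bbox(bbox, image_size, target_area_ratio=0.10,
                    area_tolerance=0.03, center_tolerance=0.10):
    """Turn a bounding box (x1, y1, x2, y2) into a drive command.

    image_size is (width, height) in pixels. All thresholds are assumptions.
    """
    x1, y1, x2, y2 = bbox
    width, height = image_size

    # Fraction of the image covered by the box: a proxy for distance.
    area_ratio = ((x2 - x1) * (y2 - y1)) / (width * height)

    # Offset of the box centre from the image centre, normalised to [-1, 1].
    dx = ((x1 + x2) / 2 - width / 2) / (width / 2)
    dy = ((y1 + y2) / 2 - height / 2) / (height / 2)

    if area_ratio > target_area_ratio + area_tolerance:
        forward = -1.0  # box too large: intruder is close, back off
    elif area_ratio < target_area_ratio - area_tolerance:
        forward = 1.0   # box too small: intruder is far, move closer
    else:
        forward = 0.0

    lateral = 0.0 if abs(dx) <= center_tolerance else (1.0 if dx > 0 else -1.0)
    # Image y grows downward, so a box above the centre (dy < 0) means move up.
    vertical = 0.0 if abs(dy) <= center_tolerance else (-1.0 if dy > 0 else 1.0)

    return DriveCommand(forward, lateral, vertical)


if __name__ == "__main__":
    # Small box in the upper right of a 640x480 frame.
    print(drive_from_bbox((500, 20, 540, 80), (640, 480)))
```

The tolerance bands act as a dead zone, so the drone does not oscillate when the intruder is already roughly centred and at the desired distance.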