Goals

PS: The goals stated here are for the 2020-2021 season. That season's competition has concluded, but we will retain this architecture, and future competition goals may be similar to this one.

...

To do so, the CV system is segmented into a list of modules, each of which assists in accomplishing one of the above tasks.


The Modules

Note: it may be useful to follow along in our CV repository as you read this guide (https://github.com/UWARG/computer-vision-python).

DeckLinkSRC

Video data for the CV system originates from a GoPro mounted on the plane. The GoPro feeds a BlackSheep video transmitter, which sends a live stream to the ground. The receiver is attached to a capture card on the CV computer, known as the DeckLink, which interprets and processes the video input. The DeckLinkSRC program was designed to extract video data from this card and attach it to the CV program's data pipelines.

However, this implementation has since changed, and the module now simply reads from an OBS stream which pulls data directly from the DeckLink. This module is also responsible for handling the life cycle of the video stream, including saving it for future training and closing the stream when prompted.

Command

The command module serves as the communication interface between the firmware and CV systems. It includes two major components: the POGI (Plane Out, Ground In) system, responsible for transferring telemetry data from the XBee (firmware RF receiver) to the CV system, and the PIGO (Plane In, Ground Out) system, responsible for transferring CV data about pylon or box locations to the firmware autopilot. Data is transferred via two shared JSON files that are used by both the CV and firmware systems.
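A minimal sketch of the shared-file exchange, using only the standard library. The file names and field names here are hypothetical placeholders; the real schema is whatever the CV and firmware teams agreed on.

```python
import json
from pathlib import Path

# Hypothetical file names; the actual shared files are defined with firmware.
POGI_FILE = Path("POGI.json")
PIGO_FILE = Path("PIGO.json")

def read_pogi(path=POGI_FILE):
    """Read the latest telemetry (Plane Out, Ground In) from the shared file."""
    with open(path) as f:
        return json.load(f)

def write_pigo(data, path=PIGO_FILE):
    """Write CV output (Plane In, Ground Out) for the firmware to consume."""
    with open(path, "w") as f:
        json.dump(data, f)
```

For example, after detecting a pylon the CV side might call `write_pigo({"heading": 90.0})`, and the firmware side would pick it up on its next read of the shared file.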

Target Acquisition

This module hosts and runs the YOLOv5 object detection neural network, which detects pylons within a video frame. Object detection involves creating a so-called "bounding box" that identifies the rough location of a target object within an image. The bounding box includes data about its height, width, and the x-y coordinates of its top-left corner. Note that the coordinate system of an image places the origin at the top left, with the bottom right defined by (+IMAGE_WIDTH, +IMAGE_HEIGHT).
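The bounding box and coordinate convention above can be sketched as a small data structure. The field names are illustrative assumptions; the real detection output is whatever the repository's YOLOv5 wrapper emits.

```python
from dataclasses import dataclass

@dataclass
class BoundingBox:
    # Top-left corner in pixels; the image origin is at the top left,
    # so x grows rightward and y grows downward.
    x: float
    y: float
    width: float
    height: float

    def centre(self):
        """Pixel coordinates of the box centre."""
        return (self.x + self.width / 2, self.y + self.height / 2)

    def bottom_right(self):
        """Opposite corner, toward (+IMAGE_WIDTH, +IMAGE_HEIGHT)."""
        return (self.x + self.width, self.y + self.height)
```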

Geolocation

The geolocation module implements a complex mathematical algorithm designed to correlate bounding boxes detected within an image to actual coordinates on the ground, based on telemetry data such as altitude, the plane's GPS coordinates, and the camera gimbal's orientation. The algorithm itself is too involved to cover here; you may refer to the following link if you're interested in learning more about the math behind it: https://docs.google.com/document/d/1VKYIrmWJYfpLPCAjQPiH1J5fnQPtNcjEC2NE5Eg4QxU/edit

Search

Once the plane has landed, the search module uses stored coordinates of depot/clinic locations to give the firmware program information on how to turn to face the rough direction of the boxes. Note that this heading will not be very accurate, due to GPS accuracy limitations and unaccounted-for variables within the geolocating process; the point of this module is simply to get the plane facing in roughly the right direction so that the Taxi program can begin package detection.

Taxi

The taxi program uses another YOLOv5 object detection model to search for cardboard boxes (the packages) within a video frame. It also includes logic for calculating the distance from the plane to the box, based on the apparent size of the box within the image and the camera's focal length. This data is sent to the firmware system to help it navigate towards the box. Once the plane is relatively close, manual control is handed over to the pilot to allow for the precise maneuvers needed for QR code scanning.
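The distance calculation follows the standard pinhole-camera relation: distance = (real width × focal length) / apparent width in pixels. The constants below are illustrative assumptions, not the team's actual calibration values.

```python
# Assumed values for illustration only.
KNOWN_BOX_WIDTH_M = 0.5    # hypothetical real-world width of the package
FOCAL_LENGTH_PX = 1000.0   # hypothetical focal length in pixel units

def distance_to_box(pixel_width,
                    real_width_m=KNOWN_BOX_WIDTH_M,
                    focal_length_px=FOCAL_LENGTH_PX):
    """Pinhole-camera estimate: distance = (W * f) / w_pixels."""
    return real_width_m * focal_length_px / pixel_width
```

A box whose bounding box spans 100 pixels under these assumed constants would be estimated at 0.5 × 1000 / 100 = 5 m away; as the plane taxis closer, the pixel width grows and the estimate shrinks.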

QR Scanner

The QR scanner module uses OpenCV and the pyzbar library to detect QR codes within a video frame, decode them, and display the decoded information alongside a bounding box for the pilot to see.

Timestamp

The timestamp module attaches timestamps to pieces of data as they are queued into the data pipelines (more on those below), to help with tasks such as matching image data with telemetry data.
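A minimal sketch of the stamping and matching idea, using only the standard library. The function names and the matching tolerance are assumptions for illustration, not the module's actual interface.

```python
import time

def stamp(data, timestamp=None):
    """Pair a piece of data with the time it entered the pipeline."""
    return (timestamp if timestamp is not None else time.time(), data)

def is_matched(stamped_frame, stamped_telemetry, tolerance_s=0.1):
    """True if an image and a telemetry reading were captured close together."""
    return abs(stamped_frame[0] - stamped_telemetry[0]) <= tolerance_s
```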

Main Program & Multiprocessing

The CV architecture is heavily object-oriented: each of the modules discussed above is instantiated as an object within the main program (located in the root directory).

...

In this manner, producer-consumer pairs, pipelines, and processes form the core concepts of the CV multiprocessing system.
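The pattern can be sketched in miniature: one process produces data into a queue (the "pipeline"), another consumes it. The names and the doubling "work" are illustrative, not taken from the repository.

```python
import multiprocessing as mp

def producer(pipeline):
    """E.g. DeckLinkSRC pushing frames into a pipeline."""
    for frame_id in range(3):
        pipeline.put(frame_id)
    pipeline.put(None)  # sentinel: end of stream

def consumer(pipeline, results):
    """E.g. a downstream module processing each frame as it arrives."""
    while True:
        frame = pipeline.get()
        if frame is None:
            break
        results.put(frame * 2)  # placeholder for real per-frame work

if __name__ == "__main__":
    pipeline, results = mp.Queue(), mp.Queue()
    p = mp.Process(target=producer, args=(pipeline,))
    c = mp.Process(target=consumer, args=(pipeline, results))
    p.start(); c.start()
    p.join(); c.join()
    print([results.get() for _ in range(3)])
```

With a single consumer the queue preserves ordering, so the output above is deterministic; with multiple consumer processes (as in a real pipeline), results can arrive out of order, which is one reason the timestamp module exists.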

Approach to Testing

Unit tests are included in each module and are written with the pytest library. The integration testing strategy is covered here: Integration Testing Architecture.