2022-2023 Computer Vision Architecture

Link to last year’s architecture:

Link to GitHub repository: https://github.com/UWARG/computer-vision-python

CV airside

CV airside is written in Python 3.8 and is data-driven, transforming data from one form to another. The core program takes video and an odometry stream as input, and outputs locations of interest on the ground to the last step.

In previous years, the last step was:

  • 2020-2021: Estimating the location of cones to pass to ZeroPilot for landing.

  • 2021-2022: Tracking the path of a human.

For 2022-2023, the last step is searching for and confirming a landing pad for landing, using the locations.

The transformation done by the core program is divided into smaller units, as each frame is time-independent of every other frame, and each odometry sample is handled the same way as any other.

CV airside runs on an NVIDIA Jetson TX2i (the Jetson).

Interfaces

Input

CV airside has two sources of input: Video and ZeroPilot.

Video

Video is from the PiCam 2 through the Linux MIPI driver, and is in a high-quality colour format. Specifically:

  • TODO: Resolution, bit depth, and frame rate.

ZeroPilot

Communication from ZeroPilot is through the Linux UART driver; CV airside receives message type 0 (odometry) or message type 1 (movement request).

  • Odometry contains, at minimum, position (longitude, latitude, and height) and orientation (yaw, pitch, roll).

    • TODO: Units

  • A movement request informs CV airside that ZeroPilot is ready to receive a movement command.

More information here:
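As a sketch, the deserialized odometry message might be represented in Python as follows. The class and field names are assumptions for illustration, since the exact UART format and units are still TODO above:

```python
from dataclasses import dataclass


@dataclass
class Odometry:
    """Message type 0 from ZeroPilot, deserialized.

    Field names and units are placeholders until the UART format
    and units (TODO above) are finalized.
    """
    latitude: float
    longitude: float
    height: float
    yaw: float
    pitch: float
    roll: float
    timestamp: float  # Attached by CV airside on receipt
```

A frozen dataclass (or a plain named tuple) would also work here, since odometry samples are never mutated after receipt.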

Output

CV airside has one output destination: ZeroPilot.

ZeroPilot

Communication to ZeroPilot is through the Linux UART driver; CV airside transmits message type 2 (movement command) or message type 3 (landing initiation command).

  • A movement command contains a point relative to the drone (relative includes its current heading) and an absolute compass heading.

    • Positive x, y, z is in the drone’s forward, right, and up direction respectively. Negative is opposite.

    • TODO: Units

  • A landing initiation command informs ZeroPilot to land at its current location (i.e. the landing pad should be directly beneath the drone).

    • Once this command has been sent, CV airside no longer needs to run. More information here:

More information here:
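A movement command following the convention above might be sketched in Python as follows; the class and field names are illustrative, and units are still TODO:

```python
from dataclasses import dataclass


@dataclass
class MovementCommand:
    """Message type 2 to ZeroPilot: a point relative to the drone
    plus an absolute compass heading.

    Names and units are placeholders (units are TODO above).
    """
    # Positive x, y, z = forward, right, up relative to the drone's
    # current heading; negative is the opposite direction.
    relative_x: float
    relative_y: float
    relative_z: float
    heading: float  # Absolute compass heading
```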

Producer-consumer worker model

Multiprocessing

As parallelism is impossible in Python multithreading due to the Global Interpreter Lock, CV airside uses the Python multiprocessing library. Interprocess communication is handled by the library’s Queue class as well as its synchronization primitives: Lock and Semaphore.

More information here:
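A minimal producer-consumer sketch using the multiprocessing library's Queue is shown below. The function names and the None sentinel convention are illustrative, not the actual CV airside code:

```python
import multiprocessing as mp


def producer(out_queue: mp.Queue) -> None:
    # Stand-in for a worker that generates data for the next worker
    for i in range(3):
        out_queue.put(i * 2)
    out_queue.put(None)  # Sentinel: no more data


def consumer(in_queue: mp.Queue, results: mp.Queue) -> None:
    # Stand-in for the next worker, consuming until the sentinel
    while True:
        item = in_queue.get()
        if item is None:
            break
        results.put(item + 1)


if __name__ == "__main__":
    queue = mp.Queue()
    results = mp.Queue()
    p = mp.Process(target=producer, args=(queue,))
    c = mp.Process(target=consumer, args=(queue, results))
    p.start()
    c.start()
    p.join()
    c.join()
    print([results.get() for _ in range(3)])  # [1, 3, 5]
```

Because Queue is process-safe, multiple consumers can share one input queue without extra locking, which is how the same transformation can be parallelized across several workers.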

Workers

CV airside is divided into smaller units, each called a worker. Each worker is specialized to transform data from one form to another, repeatedly consuming inputs from a queue from the previous worker and producing outputs to a queue for the next worker.

The transformation within a worker is also subdivided, and may require state. Therefore, each worker instantiates a class object with the required methods and members. The class object is created by the worker and is independent from those of other workers.

Figure 1: Details of a worker process.

main

All workers are managed by the main process, which can command workers to pause, resume, and exit. main is responsible for instantiating the required settings and interprocess communication objects and passing them to all workers on creation.

Data flows from the first worker to the last, and there may be multiple of the same type of worker so that a specific transformation is done in parallel for increased throughput.

Figure 2: Example of overall data flow.
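One way main's control of workers and the use of duplicate workers for throughput might look is sketched below. The pause/exit flags as multiprocessing Events and the worker body are assumptions, not the actual implementation:

```python
import multiprocessing as mp
import queue


def worker(pause, exit_flag, in_q, out_q):
    """Hypothetical worker that honours main's pause/exit controls."""
    while not exit_flag.is_set():
        if pause.is_set():
            continue  # Paused: skip work (a real worker might sleep briefly)
        try:
            item = in_q.get(timeout=0.1)
        except queue.Empty:
            continue  # No data yet; re-check the control flags
        out_q.put(item + 1)  # Placeholder transformation


if __name__ == "__main__":
    pause = mp.Event()
    exit_flag = mp.Event()
    in_q, out_q = mp.Queue(), mp.Queue()
    # Two workers of the same type share one input queue, so the
    # transformation runs in parallel for increased throughput.
    workers = [
        mp.Process(target=worker, args=(pause, exit_flag, in_q, out_q))
        for _ in range(2)
    ]
    for w in workers:
        w.start()
    for i in range(4):
        in_q.put(i)
    results = sorted(out_q.get() for _ in range(4))
    exit_flag.set()
    for w in workers:
        w.join()
    print(results)  # [1, 2, 3, 4]
```

Note that with multiple workers of the same type, output order is not guaranteed, which is one reason timestamps are attached to data rather than relying on queue order.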

Workers in CV airside

Name

Description

Next worker


zp_input_worker (previously commsInterface)

Input: UART message.

Output: Odometry OR movement request.

  • TODO: Describe what this data structure looks like in Python.

Deserializes the UART message into an odometry or movement request, attaches a timestamp, and chooses the next worker to pass it to.

Odometry: merge_targets_odometry

Request: zp_search_movement

video_input_worker (previously decklinksrc)

Input: Video.

Output: Frame.

  • TODO: What does this look like?

Transforms video from the Linux MIPI driver into a TODO: cv2? frame and attaches a timestamp.

target_acquisition_worker

target_acquisition_worker (previously targetAcquisitionWorker)

Input: Frame.

Output: Bounding boxes.

  • TODO: What does this look like?

Runs ML inference on the provided frame to create bounding boxes and attaches the timestamp of the original frame to them.

merge_worker

merge_worker (previously mergeImageWithTelemetryWorker)

Input: Odometry AND bounding boxes.

Output: Merged odometry and bounding boxes.

  • TODO: What does this look like?

Combines the bounding boxes with the odometry with the closest timestamp. The stored list of odometry must contain at least one entry with a timestamp before the bounding boxes’ and at least one after.

geolocation_worker

geolocation_worker

Input: Merged odometry and bounding boxes.

Output: Locations on the ground.

  • TODO: What does this look like?

Transforms the bounding boxes (locations on a frame) into locations on the ground, using odometry.

zp_search_movement

zp_search_movement (previously various last steps)

Input: Locations on the ground.

Output: UART message.

  • Movement command OR landing initiation command.

If there are no locations, start a search pattern by sending movement commands to ZP. This worker keeps track of the area that has already been searched, as well as the path to continue the search.

If there are locations, investigate closer by sending movement commands to ZP, until hovering 2 metres above the landing pad.

Once hovering 2 metres above the landing pad, if the landing pad can still be recognized, send the landing initiation command. Otherwise, send movement commands to increase the drone’s height until the landing pad is found again. If the maximum height is reached without finding the landing pad, restart the search.

N/A
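The timestamp-bracketing rule used by merge_worker above can be sketched as follows. The function name and the (timestamp, data) tuple layout are assumptions, since the actual data structures are still TODO:

```python
def match_odometry(odometry_list, box_timestamp):
    """Return the stored odometry entry closest in time to the
    bounding boxes' timestamp.

    Per the merge_worker rule, the stored list must contain at least
    one entry timestamped before the boxes and at least one after;
    returns None if the boxes cannot be bracketed yet.

    Each odometry entry is assumed to be a (timestamp, data) tuple.
    """
    before = [o for o in odometry_list if o[0] <= box_timestamp]
    after = [o for o in odometry_list if o[0] >= box_timestamp]
    if not before or not after:
        return None  # Wait for more odometry before merging
    return min(odometry_list, key=lambda o: abs(o[0] - box_timestamp))
```

Requiring an entry on each side guards against merging with stale odometry when the odometry stream lags behind the video stream.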

CV groundside

TODO: More detail

CV groundside displays pilot video and telemetry data received from the drone, calculates paths using waypoints, and transmits the list of waypoints to the drone.

TODO: Towers?

Telemetry

TODO

Pathing

The path is calculated by TODO: Detail.

Waypoints

Definition of a waypoint:

In Autonomous Flight: Cruise mode, the drone will proceed to each waypoint in the list until it arrives at the last waypoint, then transition into Autonomous Flight: Search mode.

More detail here:
