...
Link to last year’s architecture: Computer Vision Architecture 2021-2022
Link to GitHub repository: https://github.com/UWARG/computer-vision-python
CV airside
CV airside is written in Python 3.8 and is data-driven, transforming data from one form to another. The core program takes video and an odometry stream as input, and outputs locations of interest on the ground to the final worker.
...
CV airside runs on an NVIDIA Jetson TX2i (the Jetson).
Interfaces
Input
CV airside has two sources of input: Video and ZeroPilot.
Video
Video comes from the PiCam 2 through the Linux MIPI driver, in a high-quality colour format. Specifically:
TODO: Resolution, bit depth, and frame rate.
ZeroPilot
Communication from ZeroPilot is through the Linux UART driver; CV airside receives message type 0 (odometry) or message type 1 (movement request).
...
More information here: Communication & Message Formats
Output
CV airside has one output destination: ZeroPilot.
ZeroPilot
Communication to ZeroPilot is through the Linux UART driver; CV airside transmits message type 2 (movement command) or message type 3 (landing initiation command).
A movement command contains a point relative to the drone (relative includes its current heading) and an absolute compass heading.
Positive x, y, and z are in the drone's forward, right, and up directions respectively; negative values are in the opposite directions.
TODO: Units
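Under the convention above, a drone-relative point can be rotated into a ground-frame (north-east-up) displacement using the compass heading. A minimal sketch, with a hypothetical helper name and units left unspecified per the TODO above:

```python
import math

def relative_to_ground_displacement(x: float, y: float, z: float,
                                    heading_deg: float) -> tuple:
    """Rotate a drone-relative point (forward, right, up) into a
    north-east-up ground displacement, given the drone's compass heading.

    Hypothetical helper for illustration; units are TBD.
    """
    heading = math.radians(heading_deg)
    # Heading 0 means forward points north; heading 90 means forward points east.
    north = x * math.cos(heading) - y * math.sin(heading)
    east = x * math.sin(heading) + y * math.cos(heading)
    return north, east, z
```

For example, a point 1 unit forward with the drone facing east (heading 90°) maps to roughly 1 unit east and 0 units north.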
A landing initiation command informs ZeroPilot to land at its current location (i.e. the landing pad should be directly beneath the drone).
Once this command has been sent, CV airside no longer needs to run. More information here: https://uwarg-docs.atlassian.net/wiki/spaces/ARCHS22/pages/2123431971/2023+System+Architecture#Autonomous-flight-mode:-Landing
More information here: Communication & Message Formats
Producer-consumer worker model
Multiprocessing
As parallelism is impossible in Python multithreading due to the Global Interpreter Lock, CV airside uses the Python multiprocessing library. Interprocess communication is handled by the library's Queue class, as well as its synchronization primitives: Lock and Semaphore.
More information here: https://docs.python.org/3.8/library/multiprocessing.html
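A minimal sketch of this producer-consumer pattern using multiprocessing's Queue and Process (the worker and function names here are illustrative, not CV airside's actual workers):

```python
import multiprocessing as mp

def square_worker(in_queue, out_queue):
    # Consume items until the None sentinel arrives, producing squared values.
    while True:
        item = in_queue.get()
        if item is None:
            out_queue.put(None)  # Forward the sentinel to the next stage.
            break
        out_queue.put(item * item)

def run_pipeline(values):
    in_q = mp.Queue()
    out_q = mp.Queue()
    worker = mp.Process(target=square_worker, args=(in_q, out_q))
    worker.start()
    for v in values:
        in_q.put(v)
    in_q.put(None)  # Sentinel tells the worker to exit.
    results = []
    while True:
        item = out_q.get()
        if item is None:
            break
        results.append(item)
    worker.join()
    return results
```

Each worker in CV airside follows this shape: block on the input queue, transform, put to the output queue.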
Workers
CV airside is divided into smaller units, each called a worker. Each worker is specialized to transform data from one form to another, repeatedly consuming inputs from a queue from the previous worker and producing outputs to a queue for the next worker.
The transformation within a worker is also subdivided, and may require state. Therefore, each worker instantiates a class with the required methods and members. Each instance is created by its worker and is independent of the instances in other workers.
...
main
All workers are managed by the main process, which can command workers to pause, resume, and exit. main is responsible for instantiating and passing the required settings and interprocess communication objects to all workers on creation.
Data flows from the first worker to the last, and there may be multiple of the same type of worker so that a specific transformation is done in parallel for increased throughput.
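The pause/resume/exit control described above can be sketched with multiprocessing Event flags checked between loop iterations (an illustrative pattern with hypothetical names, not the actual main implementation):

```python
import multiprocessing as mp
import time

def controlled_worker(pause_event, exit_event, out_queue):
    """Worker loop that checks control flags between transform iterations."""
    iterations = 0
    while not exit_event.is_set():
        if pause_event.is_set():
            time.sleep(0.01)  # Paused: idle until resumed.
            continue
        iterations += 1      # Stand-in for one data transformation.
        time.sleep(0.01)
    out_queue.put(iterations)  # Report work done before exiting.

def main():
    pause_event = mp.Event()
    exit_event = mp.Event()
    out_queue = mp.Queue()
    worker = mp.Process(target=controlled_worker,
                        args=(pause_event, exit_event, out_queue))
    worker.start()
    time.sleep(0.1)       # Let the worker run.
    pause_event.set()     # Command: pause.
    time.sleep(0.1)
    pause_event.clear()   # Command: resume.
    time.sleep(0.1)
    exit_event.set()      # Command: exit.
    worker.join()
    return out_queue.get()
```

Sharing the same Event objects across several workers of one type lets main pause or stop that whole stage at once.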
...
Workers in CV airside
...
| Name | Description | Next worker |
|---|---|---|
| zp_input_worker (previously commsInterface) | Input: UART message. Output: Odometry OR movement request. Deserializes the UART message into an odometry or a movement request, attaches a timestamp, and chooses the next worker to pass it to. | Odometry: merge_targets_odometry. Request: zp_search_movement |
| video_input_worker (previously decklinksrc) | Input: Video. Output: Frame. Transforms video from the Linux MIPI driver into a TODO: cv2? frame and attaches a timestamp. | target_acquisition_worker |
| target_acquisition_worker (previously targetAcquisitionWorker) | Input: Frame. Output: Bounding boxes. Runs ML inference on the provided frame to create bounding boxes and attaches the timestamp of the original frame to them. | merge_worker |
| merge_worker (previously mergeImageWithTelemetryWorker) | Input: Odometry AND bounding boxes. Output: Merged odometry and bounding boxes. Combines the bounding boxes with the odometry that has the closest timestamp. The stored list of odometry must contain at least one entry with a timestamp before the bounding boxes and at least one after. | geolocation_worker |
| geolocation_worker | Input: Merged odometry and bounding boxes. Output: Locations on the ground. Transforms the bounding boxes (locations on a frame) into locations on the ground, using odometry. | zp_search_movement |
| zp_search_movement (previously various last steps) | Input: Locations on the ground. Output: UART message. If there are no locations, starts a search pattern by sending movement commands to ZP; this worker keeps track of the area already searched, as well as the path to continue the search. If there are locations, investigates closer by sending movement commands to ZP, until hovering 2 metres above the landing pad. At that height, if the landing pad can still be recognized, the landing initiation command is sent; otherwise, movement commands are sent to increase height until the landing pad is found again. If the maximum height is reached and the landing pad has not been found, the search restarts. | N/A |
CV groundside
TODO: Is this good enough? More detail needed.
CV groundside displays pilot video and telemetry data received from the drone, calculates paths using waypoints, and transmits the list of waypoints to the drone.
TODO: Towers?
Telemetry
TODO
Pathing
The path is calculated by TODO: Detail.
Waypoints
Definition of a waypoint: https://uwarg-docs.atlassian.net/wiki/spaces/ARCHS22/pages/2123431971/2023+System+Architecture#Waypoints
In Autonomous Flight: Cruise mode, the drone proceeds to each waypoint in the list until it arrives at the last waypoint, then transitions into Autonomous Flight: Search mode.
More detail here: https://uwarg-docs.atlassian.net/wiki/spaces/ARCHS22/pages/2123431971/2023+System+Architecture#Autonomous-flight-mode:-Cruise