The module receives images from the pipelines and uses the run_locator() function to return geographical coordinates. A series of methods within run_locator() follows a mathematical process to do this.
...
The diagram above shows the vectors representing several inputs from the pipeline, along with the projection that maps a pixel on the image to a geographic coordinate (the red A vector and the dotted line down to the (X, Y) coordinate). Each vector that is an input from the pipeline is listed below.
O is the vector to the camera location.
C is the vector of the camera's orientation. Its magnitude represents the resolution of the image (a larger magnitude means higher resolution and more zoom).
U, V are vectors that act as the axes of the image.
Below are the other variables:
A is a vector pointing from the camera to an arbitrary point on the image.
(X, Y) is the geographic coordinate of the arbitrary point. As a 3-dimensional vector it is written (X, Y, 0), since Z = 0.
The A vector can be extended down to the plane representing the Earth to obtain a geographic coordinate using the following calculation:
A_i = C + m_i(U) + n_i(V), where m_i and n_i are scalars that direct the A vector to a specific point on the image.
(X, Y, 0) = O + t(A_i), where t is a scalar.
0 = O_z + t(A_i)_z → t = -(O_z / (A_i)_z), where the subscript z denotes the z-component of each vector.
The geographic coordinate (X, Y) of any point on an image can therefore be calculated from these equations.
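As an illustration, the calculation above can be sketched in Python with NumPy. The vector values and the pixel_to_coordinate() helper below are made up for the example; the real O, C, U, and V come from the pipeline.

```python
import numpy as np

# Hypothetical example inputs (in practice these come from the pipeline).
O = np.array([0.0, 0.0, 100.0])   # camera position; z is the height above the ground plane
C = np.array([0.0, 0.0, -50.0])   # camera orientation; magnitude encodes resolution/zoom
U = np.array([1.0, 0.0, 0.0])     # image x-axis
V = np.array([0.0, 1.0, 0.0])     # image y-axis

def pixel_to_coordinate(m, n):
    """Project the image point given by scalars (m, n) onto the ground plane z = 0."""
    A = C + m * U + n * V          # A_i = C + m_i(U) + n_i(V)
    t = -O[2] / A[2]               # solve 0 = O_z + t(A_i)_z for t
    ground = O + t * A             # (X, Y, 0) = O + t(A_i)
    return ground[0], ground[1]    # geographic coordinate (X, Y)

x, y = pixel_to_coordinate(10.0, -5.0)
# With these example inputs: A = (10, -5, -50), t = 2, so (x, y) = (20.0, -10.0)
```

Each call evaluates the full vector equation, which is what makes computing many points per image costly and motivates the projective mapping described next.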
Since several points per image are needed (to compute the mean and variance of the detected object), the calculation can become expensive. A projective mapping algorithm is used in hopes of reducing the time needed to compute the desired number of coordinates in an image: