Real-Time Edge Perception
What it does
A vision system mounted on a moving vehicle identifies objects in its path, estimates the distance to each one in metres, and produces an encoded video stream of the annotated scene, all in real time on a single edge device. The detection labels and distance estimates feed the vehicle’s control loop; the encoded video goes to a monitoring station.
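As an illustration of the distance step, one common approach is to take the median of the valid depth values inside the central part of each detection box. The sketch below assumes the depth map is a row-major grid of metres in which invalid pixels are NaN or infinite; the function name `estimate_distance_m` and the `shrink` parameter are hypothetical, and this is not necessarily how the deployed system merges depth.

```python
import math
from statistics import median
from typing import Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels, x1/y1 exclusive


def estimate_distance_m(
    depth_m: Sequence[Sequence[float]],
    box: Box,
    shrink: float = 0.5,
) -> Optional[float]:
    """Median depth in metres over the central region of a detection box.

    Assumption: invalid depth pixels are NaN, infinite, or non-positive.
    Returns None when the region holds no valid sample.
    """
    if not depth_m or not depth_m[0]:
        return None
    height, width = len(depth_m), len(depth_m[0])

    # Shrink the box around its centre so background pixels near the
    # edges of the detection contribute less to the estimate.
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half_w = (x1 - x0) * shrink / 2
    half_h = (y1 - y0) * shrink / 2

    # Clamp the sampled region to the image bounds.
    left = max(0, int(cx - half_w))
    right = min(width, math.ceil(cx + half_w))
    top = max(0, int(cy - half_h))
    bottom = min(height, math.ceil(cy + half_h))

    samples = [
        depth_m[row][col]
        for row in range(top, bottom)
        for col in range(left, right)
        if math.isfinite(depth_m[row][col]) and depth_m[row][col] > 0
    ]
    # The median tolerates a few outliers better than the mean would.
    return median(samples) if samples else None
```

The median is used here rather than the mean because a handful of background or mis-matched pixels inside the box would otherwise pull the estimate away from the object.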
Why edge
The latency budget between “see the object” and “act on the object” is too tight for a round trip to the cloud, so inference has to run on the device. The interesting engineering is everywhere but the model: hardware-accelerated video encoding, async multi-stage pipelines that don’t drop frames, and depth sensing in lighting conditions that the factory floor wasn’t designed for.
Architecture
- YOLO object detection with TensorRT acceleration on a Jetson-class edge device.
- ZED-class stereo camera for synchronised colour + depth.
- Async pipeline: capture → inference → depth merge → encode → publish, with each stage on its own thread.
- Hardware H.264 encoding so the CPU never becomes the bottleneck.
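The thread-per-stage layout can be sketched with the Python standard library alone. In this minimal sketch, stages are connected by bounded queues, so a slow stage applies backpressure upstream instead of dropping frames. The names `run_stage` and `run_pipeline`, the queue depth, and the stand-in stage functions are all hypothetical; real stages would wrap the camera SDK, the TensorRT engine, and the hardware encoder, and would need error handling that is omitted here.

```python
import queue
import threading
from typing import Any, Callable, Iterable, List

_STOP = object()  # sentinel passed downstream to shut stages down in order


def run_stage(work: Callable[[Any], Any],
              inbox: "queue.Queue[Any]",
              outbox: "queue.Queue[Any]") -> None:
    """Apply `work` to each item from `inbox` and forward the result."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # propagate shutdown to the next stage
            return
        # put() blocks while the outbox is full: backpressure, no dropped frames.
        outbox.put(work(item))


def run_pipeline(frames: Iterable[Any],
                 stages: List[Callable[[Any], Any]],
                 depth: int = 4) -> List[Any]:
    """Run each stage on its own thread; return the outputs in input order."""
    queues = [queue.Queue(maxsize=depth) for _ in range(len(stages) + 1)]
    workers = [
        threading.Thread(target=run_stage,
                         args=(work, queues[i], queues[i + 1]),
                         daemon=True)
        for i, work in enumerate(stages)
    ]
    for worker in workers:
        worker.start()

    def capture() -> None:
        # Stand-in for the capture stage: feeds frames, then signals the end.
        for frame in frames:
            queues[0].put(frame)
        queues[0].put(_STOP)

    threading.Thread(target=capture, daemon=True).start()

    # Stand-in for the publish stage: drain the final queue on this thread.
    published = []
    while True:
        item = queues[-1].get()
        if item is _STOP:
            break
        published.append(item)
    for worker in workers:
        worker.join()
    return published


if __name__ == "__main__":
    # Hypothetical stand-ins for inference, depth merge, and encode.
    infer = lambda frame: {"frame": frame, "label": "object"}
    merge_depth = lambda det: {**det, "distance_m": 2.0}
    encode = lambda det: repr(det).encode("utf-8")
    packets = run_pipeline(range(5), [infer, merge_depth, encode])
    print(len(packets), "packets published")
```

Because each stage has a single worker and the queues are first-in, first-out, frames leave the pipeline in the order they were captured. The trade-off of blocking queues is that a persistently slow stage raises end-to-end latency, so the queue depth has to be sized against the latency budget.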
See it running
Captures from real deployments are below.