Low-resolution imaging/vision uses less energy per frame. High-resolution imaging/vision enables precise visual understanding. Mobile vision systems would benefit from the ability to situationally sacrifice image resolution to save system energy when imaging detail is unnecessary. The figure below shows the resolution-energy tradeoff.
Unfortunately, in today's systems, any change in sensor resolution leads to a substantial pause in frame delivery, as much as 280 ms. Frame delivery is bottlenecked by a sequence of reconfiguration procedures and memory management in current operating systems before it resumes at the new resolution. This latency from reconfiguration impedes the adoption of otherwise beneficial resolution-energy tradeoff mechanisms.
We propose Banner, a media framework that provides rapid sensor resolution reconfiguration as a modification to common media frameworks, e.g., V4L2. Banner avoids most of the sequential procedures involved in reconfiguring sensor resolution: (1) Banner avoids repeated memory allocation; (2) Banner sets the sensor format in parallel with the user application. The figure below shows the resolution reconfiguration pipeline before and after Banner is integrated into the system.
In particular, Banner employs two key techniques: parallel reconfiguration and format-oblivious memory management. Parallel reconfiguration aims at reconfiguring the sensor while the application is processing frames for the previous resolution such that the reconfiguration latency is hidden. Format-oblivious memory management aims at maintaining a single set of frame buffers, regardless of resolution, to eliminate repeated invocation of expensive memory allocation system calls.
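The idea of hiding reconfiguration behind application work can be illustrated with a minimal Python sketch. This is not Banner's implementation: `FakeSensor`, `process`, and `capture_with_parallel_reconfig` are hypothetical names, and the sensor is simulated in memory rather than driven through V4L2. The sketch only shows the ordering: the frame at the old resolution is captured first, the format change runs on a background thread while the application processes that frame, and the next capture waits for the change to finish.

```python
import threading


class FakeSensor:
    """Hypothetical in-memory stand-in for a camera; not a real V4L2 binding."""

    def __init__(self, resolution):
        self.resolution = resolution
        self._lock = threading.Lock()

    def set_format(self, resolution):
        # In a real driver this would be the slow sensor format change.
        with self._lock:
            self.resolution = resolution

    def capture(self):
        # Return a frame tagged with the resolution in effect right now.
        with self._lock:
            return {"resolution": self.resolution}


def process(frame):
    # Placeholder for application work on a frame (display, offload, ...).
    return frame["resolution"]


def capture_with_parallel_reconfig(sensor, new_resolution):
    # 1. Capture a frame at the previous resolution.
    frame = sensor.capture()
    # 2. Start reconfiguring the sensor on a separate thread.
    worker = threading.Thread(target=sensor.set_format, args=(new_resolution,))
    worker.start()
    # 3. Meanwhile, the application processes the old-resolution frame.
    processed = process(frame)
    # 4. Make sure the sensor is ready before the next capture.
    worker.join()
    return processed, sensor.capture()
```

In this sketch the reconfiguration cost overlaps with step 3 instead of adding to it, which is the effect the paragraph above describes.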
The figure below illustrates the parallel reconfiguration strategy.
The parallel reconfiguration module is designed around three considerations. First, the sensor is not always busy; there is idle time between captures. Second, the reconfiguration thread must not be interrupted; otherwise the end-to-end latency increases. Dequeuing a buffer signals that a capture is complete, and queuing a buffer signals the next capture, so the system must identify the right time to reconfigure the sensor. Third, reconfiguration itself takes time, owing to camera driver implementations and camera hardware limitations. Thread-level concurrency addresses the first and second considerations, while a reconfiguration timing budget addresses the second and third.
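One way to reason about such a timing budget is sketched below. The model is an assumption made for illustration, not Banner's actual policy: it treats each frame interval as one capture followed by idle time, and asks whether a reconfiguration of a given duration fits inside that idle time. The function names and all numeric values used with them are hypothetical.

```python
def idle_window_ms(fps: float, capture_ms: float) -> float:
    """Idle time per frame interval, assuming one capture per interval."""
    frame_interval_ms = 1000.0 / fps
    # The sensor is busy for capture_ms; the remainder is idle (never negative).
    return max(0.0, frame_interval_ms - capture_ms)


def fits_in_idle_window(fps: float, capture_ms: float, reconfig_ms: float) -> bool:
    """True if the reconfiguration can finish before the next capture starts."""
    return reconfig_ms <= idle_window_ms(fps, capture_ms)
```

Under this simplified model, a lower frame rate leaves a longer idle window, so the same reconfiguration may fit at 15 FPS but not at 25 FPS. Real drivers and sensors may impose additional constraints that the model does not capture.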
Format-oblivious memory management allocates buffers only once and delivers frames to the application according to the number of bytes used rather than the frame format. The figure below shows that format-oblivious memory management reuses previously allocated buffers to store frames with different formats.
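The buffer reuse described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Banner's code: `FormatObliviousPool` and its methods are hypothetical, buffers are plain `bytearray` objects rather than driver-mapped memory, and frame sizes assume 2 bytes per pixel. Buffers are allocated once at the largest expected frame size, and the application only ever sees the bytes actually used.

```python
class FormatObliviousPool:
    """Hypothetical pool: buffers are allocated once and reused for any format."""

    def __init__(self, count: int, max_frame_bytes: int):
        # Allocate every buffer once, sized for the largest expected frame.
        self._buffers = [bytearray(max_frame_bytes) for _ in range(count)]

    def fill(self, index: int, payload: bytes) -> int:
        """Simulate the driver writing a frame; return the bytes used."""
        buf = self._buffers[index]
        if len(payload) > len(buf):
            raise ValueError("frame larger than the preallocated buffer")
        # Overwrite only the leading bytes; the buffer itself is not resized.
        memoryview(buf)[:len(payload)] = payload
        return len(payload)

    def frame(self, index: int, bytes_used: int) -> memoryview:
        """Expose only the bytes in use, whatever the frame format."""
        return memoryview(self._buffers[index])[:bytes_used]

    def buffer_ids(self):
        # Identity of each underlying buffer, to show that none is reallocated.
        return [id(b) for b in self._buffers]
```

For example, a pool sized for 1080p frames (`1920 * 1080 * 2` bytes under the 2-bytes-per-pixel assumption) can hold a 480p frame in the same buffer; `frame()` then returns a view covering only the `640 * 480 * 2` bytes in use, and no new allocation takes place when the resolution changes.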
We evaluate and validate the effectiveness of Banner for reconfiguring sensor resolution in a variety of vision tasks, including a display-only application working at 25 FPS, a frame offloading application working at 15 FPS and a marker-based pose estimation application running at 15 FPS.
As shown in the figure above, Banner completely eliminates the frame-to-frame reconfiguration latency (226 ms to 33 ms), i.e., it removes the frame drop during sensor resolution reconfiguration. Banner also halves the end-to-end resolution reconfiguration latency (226 ms to 105 ms). This enables a more than 49% reduction in system power consumption by allowing continuous vision applications to reconfigure the sensor resolution to 480p instead of downsampling from 1080p to 480p, as measured in a cloud-based offloading workload running on a Jetson TX2 board. As a result, Banner unlocks unprecedented capabilities for mobile vision applications to dynamically reconfigure sensor resolution and balance energy efficiency against task accuracy.