Client required faster image processing on smartphones for a better user experience, without sacrificing accuracy.
Acceleration without accuracy degradation
Client faced high cloud server costs for their facial recognition pipeline, so their neural networks were accelerated to bring those costs down.
Reduction in cloud infrastructure costs
Low latency & RAM consumption
Client needed to reduce their neural network model size to meet their RAM limitations while maintaining low latency.
Meet chipset's RAM limitations
Client required compression of their neural network to meet their chipset's 5 MB RAM limitation.
Reduction of hardware costs
Client was incurring very high hardware costs from operating object detection on 25 video streams.
Reduction in server costs
Reduction of peak RAM consumption from 36 MB to 4.5 MB
Faster facial keypoint detection
Client could not achieve fast enough facial keypoint detection to provide a seamless mobile app experience.
Acceleration of the baseline model on multiple mobile platforms