Enot — neural network compression & acceleration

AutoDL framework for neural network compression & acceleration

Up to 20 times neural network acceleration

Up to 25 times neural network compression

Up to 70% hardware cost reduction

2 weeks to a compressed model

Get a free 2-week trial

ENOT is a framework with a Python API that can be quickly and easily integrated into various neural network training pipelines.

Product

Language

CNN

RNN

LSTM

DNN

Machine Learning Frameworks

Deployment

NN Types

Hardware Libraries

CPU

FPGA

GPU

NPU

Runtime

On-premises

On the cloud

Processing images faster

Client required faster image processing on smartphones for better user experience without sacrificing accuracy.

Case

Smartphone manufacturer

4.8 times

acceleration without accuracy degradation

Reduction of cloud costs

Client was facing high cloud server costs for their facial recognition pipeline, so their neural networks were accelerated to reduce those costs.

Case

Results

3.2 times

reduction in cloud infrastructure costs

Low latency & RAM consumption

Case

Client needed to reduce their NN model size to meet their RAM limitations while maintaining low latency.

Results

4.8 MB

NN model size

9.1 MB

Peak RAM consumption

4 ms

Latency

Meet chipset's RAM limitations

Case

Client required compression of their neural network to meet their chipset's 5 MB RAM limitation.

Results

8 times

Reduction of peak RAM consumption, from 36 MB to 4.5 MB

Reduction of hardware costs

Case

Client was incurring very high hardware costs from operating object detection on 25 video streams.

Results

4.2 times

Acceleration

2.2 times

Reduction in server costs

Faster facial keypoint detection

Client could not achieve fast enough facial keypoint detection to provide a seamless mobile app experience.

Case

Results

48%

Acceleration of the baseline model on multiple mobile platforms

Results

Telecommunications

Smartphone manufacturer

Electronics manufacturer

AI-based mobile app

Oil & Gas

Electronics

Healthcare

Oil & Gas

Autonomous Driving

Cloud Computing

Telecom

Mobile Apps

Internet of Things

Robotics

ENOT applies NAS, pruning, quantization and distillation simultaneously to achieve the highest compression/acceleration rate without accuracy degradation. It automates the search for the optimal neural network architecture, taking into account latency, RAM and model size constraints for different hardware and software platforms.

Technology

Our neural network architecture search engine automatically finds the best possible architecture from millions of available options, taking several parameters into account:

— input resolution
— depth of neural network
— operation type
— activation type
— number of neurons at each layer
— bit width for target hardware platform for NN inference
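
As a rough illustration of what a constrained search over these parameters looks like, the sketch below enumerates a tiny, hypothetical search space and keeps the best-scoring configuration that fits a model size and latency budget. It is not ENOT's search engine or API: the parameter values, the cost formulas in `estimate`, and the `score` stand-in for accuracy are all assumptions made for this example. A real search would cover far more options and would measure accuracy and latency on the target hardware instead of using closed-form proxies.

```python
import math
from itertools import product

# Hypothetical search space mirroring the parameters listed above.
SEARCH_SPACE = {
    "resolution": [128, 160, 224],               # input resolution
    "depth": [8, 12, 16],                        # number of layers
    "op": ["conv3x3", "depthwise_separable"],    # operation type
    "activation": ["relu", "hswish"],            # activation type
    "width": [16, 32, 64],                       # channels per layer
    "bits": [8, 16],                             # bit width for inference
}


def weights_per_layer(op, width):
    """Weight count of one layer with `width` input and output channels."""
    if op == "conv3x3":
        return 9 * width * width
    # 3x3 depthwise convolution followed by a 1x1 pointwise convolution
    return 9 * width + width * width


def estimate(cfg):
    """Toy cost and quality proxies (assumptions, not measurements)."""
    params = cfg["depth"] * weights_per_layer(cfg["op"], cfg["width"])
    macs = params * cfg["resolution"] ** 2       # assumes stride 1 everywhere
    size_mb = params * cfg["bits"] / 8 / 1e6
    # Assumed throughput: 1e9 MACs per millisecond at 8 bits, half that at 16.
    latency_ms = macs / 1e9 * (cfg["bits"] / 8)
    # Stand-in for validation accuracy: grows with compute, with small
    # made-up adjustments for bit width and activation type.
    score = math.log10(macs)
    score -= 0.05 if cfg["bits"] == 8 else 0.0
    score += 0.02 if cfg["activation"] == "hswish" else 0.0
    return {"size_mb": size_mb, "latency_ms": latency_ms, "score": score}


def search(max_size_mb, max_latency_ms):
    """Exhaustively scan the space; return (config, estimate) or None."""
    best = None
    keys = list(SEARCH_SPACE)
    for values in product(*SEARCH_SPACE.values()):
        cfg = dict(zip(keys, values))
        est = estimate(cfg)
        if est["size_mb"] > max_size_mb or est["latency_ms"] > max_latency_ms:
            continue  # violates a hardware constraint
        if best is None or est["score"] > best[1]["score"]:
            best = (cfg, est)
    return best


best = search(max_size_mb=0.5, max_latency_ms=10.0)
if best is not None:
    cfg, est = best
    print(cfg, est)
```

Exhaustive enumeration only works here because the toy space has 216 combinations. With millions of candidates, a search has to sample or prune the space rather than visit every option.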

NAS

Pruning

Quantization

Distillation
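
To make two of these methods concrete, here is a minimal sketch of magnitude pruning and symmetric 8-bit quantization applied to a plain list of weights. These are generic textbook versions written for illustration, assuming a single per-tensor scale and unstructured pruning. The function names are hypothetical and this is not how ENOT implements either method.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    # Indices of the k weights closest to zero
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric per-tensor quantization to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map quantized integers back to approximate float weights."""
    return [v * scale for v in q]


weights = [0.5, -0.02, 0.3, 0.01, -0.8, 0.04]
pruned = magnitude_prune(weights, sparsity=0.5)
q, scale = quantize_int8(pruned)
restored = dequantize(q, scale)
print(pruned, q, restored)
```

Storing weights as 8-bit integers instead of 32-bit floats cuts their storage by a factor of four, and the zeros introduced by pruning can be skipped or stored sparsely. Both steps usually cost some accuracy, which is why they are typically followed by fine-tuning.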