TernaryNet: faster deep model inference without GPUs for medical 3D segmentation using sparse and binary convolutions.

Int J Comput Assist Radiol Surg

Biomedical Image Analysis Group, Department of Computing, Imperial College London, London, SW7 2AZ, UK.

Published: September 2018

Purpose: Deep convolutional neural networks (DCNN) are currently ubiquitous in medical imaging. While their versatility and high-quality results for common image analysis tasks including segmentation, localisation and prediction are astonishing, this large representational power comes at the cost of highly demanding computational effort. This limits their practical applications for image-guided interventions and diagnostic (point-of-care) support on mobile devices without graphics processing units (GPU).

Methods: We propose a new scheme that approximates both trainable weights and neural activations in deep networks by ternary values and tackles the open question of backpropagation when dealing with non-differentiable functions. Our solution enables the removal of the expensive floating-point matrix multiplications throughout any convolutional neural network and replaces them by energy- and time-preserving binary operators and population counts.

Results: We evaluate our approach for the segmentation of the pancreas in CT. Here, our ternary approximation within a fully convolutional network leads to more than 90% memory reduction and high accuracy (without any post-processing), with a Dice overlap of 71.0% that comes close to that obtained with high-precision weights and activations. We further provide a concept for sub-second inference without GPUs and demonstrate significant improvements over binary quantisation and over variants trained without our proposed ternary hyperbolic tangent continuation.
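The abstract does not spell out the ternary hyperbolic tangent continuation; one plausible smooth surrogate (an assumption for illustration, the paper's exact parameterisation may differ) sums two shifted tanh steps, so that a steepness parameter (here named `beta`) can be annealed from a differentiable function towards a hard ternary step in {-1, 0, +1}:

```python
import math

def ternary_tanh(x, beta=4.0):
    """Smooth ternary activation: a sum of two shifted tanh steps.
    For finite beta it is differentiable everywhere (so gradients can
    flow in backpropagation); as beta grows it approaches a hard step:
    -1 below -0.5, 0 between -0.5 and +0.5, +1 above +0.5.
    The +-0.5 shifts and the beta schedule are assumptions."""
    return 0.5 * (math.tanh(beta * (x - 0.5)) + math.tanh(beta * (x + 0.5)))
```

Annealing `beta` upward during training would let the network learn with useful gradients early on while converging to genuinely ternary activations for inference.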

Conclusions: We present a key enabling technique for highly efficient DCNN inference without GPUs that will help to bring the advances of deep learning to practical clinical applications. It also holds great promise for improving accuracy in large-scale medical data retrieval.

DOI: http://dx.doi.org/10.1007/s11548-018-1797-4


Similar Publications

Article Synopsis
  • Demand for computing power in major scientific experiments, like the CMS at CERN, is expected to significantly increase over the coming decades.
  • The implementation of coprocessors, particularly GPUs, in data processing workflows can enhance performance and efficiency, especially for machine learning tasks.
  • The Services for Optimized Network Inference on Coprocessors (SONIC) approach allows for improved use of coprocessors, demonstrating successful integration and acceleration of workflows across various environments without sacrificing throughput.

Intelligent mobile image sensing powered by deep learning analyzes images captured by cameras from mobile devices, such as smartphones or smartwatches. It supports numerous mobile applications, such as image classification, face recognition, and camera scene detection. Unfortunately, mobile devices often lack the resources necessary for deep learning, leading to increased inference latency and rapid battery consumption.


Associative memory is a cornerstone of cognitive intelligence within the human brain. The Bayesian confidence propagation neural network (BCPNN), a cortex-inspired model with high biological plausibility, has proven effective in emulating high-level cognitive functions like associative memory. However, the current approach using GPUs to simulate BCPNN-based associative memory tasks encounters challenges in latency and power efficiency as the model size scales.


Motivation: Local ancestry inference is a powerful technique in genetics, revealing population history and the genetic basis of diseases. It is particularly valuable for improving eQTL discovery and fine-mapping in admixed populations. Despite the widespread use of the RFMix software for local ancestry inference, large-scale genomic studies face challenges of high memory consumption and processing times when handling RFMix output files.


Dynamic networks have become a pivotal area of study in deep learning due to their ability to selectively activate computing units (such as layers or channels) or dynamically allocate computation to information-rich regions. This capability significantly curtails unnecessary computations, adapting to varying inputs. Despite these advantages, the practical efficiency of dynamic models often falls short of theoretical computation.

