7.2.9 Vector Ineffectual Activation Identifier Format (VIAI) 159
7.2.10 Ineffectual Activation Skipping 159
7.2.11 Ineffectual Weight Skipping 161
Exercise 161
References 161
8 Network Sparsity 163
8.1 Energy Efficient Inference Engine (EIE) 163
8.1.1 Leading Nonzero Detection (LNZD) Network 163
8.1.2 Central Control Unit (CCU) 164
8.1.3 Processing Element (PE) 164
8.1.4 Deep Compression 166
8.1.5 Sparse Matrix Computation 167
8.1.6 System Performance 169
8.2 Cambricon-X Accelerator 169
8.2.1 Computation Unit 171
8.2.2 Buffer Controller 171
8.2.3 System Performance 174
8.3 SCNN Accelerator 175
8.3.1 SCNN PT-IS-CP-Dense Dataflow 175
8.3.2 SCNN PT-IS-CP-Sparse Dataflow 177
8.3.3 SCNN Tiled Architecture 178
8.3.4 Processing Element Architecture 179
8.3.5 Data Compression 180
8.3.6 System Performance 180
8.4 SeerNet Accelerator 183
8.4.1 Low-Bit Quantization 183
8.4.2 Efficient Quantization 184
8.4.3 Quantized Convolution 185
8.4.4 Inference Acceleration 186
8.4.5 Sparsity-Mask Encoding 186
8.4.6 System Performance 188
Exercise 188
References 188
9 3D Neural Processing 191
9.1 3D Integrated Circuit Architecture 191
9.2 Power Distribution Network 193
9.3 3D Network Bridge 195
9.3.1 3D Network-on-Chip 195
9.3.2 Multiple-Channel High-Speed Link 195
9.4 Power-Saving Techniques 198
9.4.1 Power Gating 198
9.4.2 Clock Gating 199
Exercise 200
References 201
Appendix A: Neural Network Topology 203
Index 205
ARTIFICIAL INTELLIGENCE HARDWARE DESIGN

Learn foundational and advanced topics in Neural Processing Unit design with real-world examples from leading voices in the field.

In Artificial Intelligence Hardware Design: Challenges and Solutions, distinguished researchers and authors Drs. Albert Chun Chen Liu and Oscar Ming Kin Law deliver a rigorous and practical treatment of the design of circuits and systems for accelerating neural network processing. Beginning with a discussion of neural networks and their development history, the book goes on to describe parallel architectures, streaming graphs for massively parallel computation, and convolution optimization. The authors offer readers an illustration of in-memory computation through Georgia Tech's Neurocube and Stanford's Tetris accelerator using the Hybrid Memory Cube, as well as near-memory architecture through the embedded eDRAM of the Institute of Computing Technology at the Chinese Academy of Sciences, among other institutions.

Readers will also find a discussion of 3D neural processing techniques to support multiple-layer neural networks, as well as:

- A thorough introduction to neural networks and neural network development history, as well as Convolutional Neural Network (CNN) models
- Explorations of various parallel architectures, including the Intel CPU, Nvidia GPU, Google TPU, and Microsoft NPU, emphasizing hardware and software integration for performance improvement
- Discussions of streaming graphs for massively parallel computation with the Blaize GSP and Graphcore IPU
- An examination of how to optimize convolution with the UCLA Deep Convolutional Neural Network accelerator's filter decomposition

Perfect for hardware and software engineers and firmware developers, Artificial Intelligence Hardware Design is an indispensable resource for anyone working with Neural Processing Units in either a hardware or software capacity.
About the Authors

Albert Chun Chen Liu, PhD, is Chief Executive Officer of Kneron. He is an Adjunct Associate Professor at National Tsing Hua University, National Chiao Tung University, and National Cheng Kung University. He has published over 15 IEEE papers and is an IEEE Senior Member. He is a recipient of the 2007 IBM Problem Solving Award, based on the use of the EIP tool suite, and the 2021 IEEE TCAS Darlington Award.
Oscar Ming Kin Law, PhD, is the Director of Engineering at Kneron. He works on smart robot development and in-memory architecture for neural networks. He has over twenty years of experience in the semiconductor industry, working on CPU, GPU, and mobile design, and holds over 60 patents in various areas.