This Mighty Brain Chip Is So Efficient It Could Bring Advanced AI To Your Phone
A major factor is how computer chips are currently set up. Based on the prevailing Von Neumann architecture, a chip separates memory storage from its central processors. Each computation is a nightmarish Monday morning commute, with the chip constantly shuttling data back and forth between the two compartments, forming the notorious “memory wall.”
If you’ve ever been stuck in traffic, you know the frustration: it wastes time and energy. As AI algorithms grow increasingly complex, the problem only gets worse.
So why not design a chip based on the brain, a potentially perfect match for deep neural nets?
Enter compute-in-memory, or CIM, chips. True to their name, these chips compute and store memory at the same site. Forget commuting; the chips are highly efficient work-from-home alternatives, nixing the data-traffic bottleneck and promising greater efficiency and lower energy consumption.
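To make that contrast concrete, here’s a rough back-of-the-envelope sketch in Python. It simply counts how many values have to cross the memory-processor boundary for a single matrix-vector multiply, the workhorse operation of deep learning, under the two schemes; the layer size is an arbitrary assumption, not a measurement of any real chip.

```python
import numpy as np

# Toy model (illustrative only): the core workload of deep learning inference
# is the matrix-vector multiply y = W @ x. On a Von Neumann chip, the weight
# matrix W must be fetched from memory for every inference; in a
# compute-in-memory chip the weights stay put and only the small input and
# output vectors move.

def values_moved_von_neumann(W, x):
    # Weights, inputs, and outputs all travel between memory and processor.
    return W.size + x.size + W.shape[0]

def values_moved_cim(W, x):
    # Weights are stationary inside the memory array; only activations travel.
    return x.size + W.shape[0]

W = np.random.randn(1024, 1024)   # weights of one fully connected layer (assumed size)
x = np.random.randn(1024)         # one input activation vector

print(values_moved_von_neumann(W, x))   # 1,050,624 values shuttled
print(values_moved_cim(W, x))           # 2,048 values shuttled
```

Even in this toy model, nearly all the traffic in the conventional case comes from fetching weights, which is exactly the traffic a CIM chip eliminates by computing where the weights are stored.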
Or so goes the theory. Most CIM chips built to run AI algorithms have focused solely on chip design, showcasing their capabilities using simulations of the chip rather than running tasks on full-fledged hardware. The chips also struggle to adapt to multiple different AI tasks, such as image recognition and voice perception, limiting their integration into smartphones and other everyday devices.
This month, a study in Nature upgraded CIM from the ground up. Rather than focusing solely on the chip’s design, the international team, led by neuromorphic hardware experts Dr. H.S. Philip Wong at Stanford and Dr. Gert Cauwenberghs at UC San Diego, optimized the entire setup, from technology to architecture to the algorithms that calibrate the hardware.
The resulting NeuRRAM chip is a powerful neuromorphic computing behemoth with 48 parallel cores and 3 million memory cells. Highly versatile, the chip tackled multiple standard AI benchmark tasks, such as reading handwritten numbers, identifying cars and other objects in images, and decoding voice recordings, with over 84 percent accuracy.
While the success rate may seem mediocre, it rivals existing digital chips while using dramatically less power. To the authors, it’s a step closer to bringing AI directly to our devices rather than shuttling data to the cloud for computation.
“Having those computations done on the chip instead of sending data to and from the cloud could allow faster, more secure, cheaper, and more scalable AI going into the future and give more people access to AI power,” said Wong.
Neural Inspiration
AI-specific chips are now a dime a dozen. From Google’s Tensor Processing Unit (TPU) and Tesla’s Dojo supercomputer architecture to Baidu and Amazon, tech giants are investing millions in the AI chip gold rush to build processors that support increasingly sophisticated deep learning algorithms. Some even tap into machine learning to design chip architectures tailored to AI software, bringing the race full circle.
One fascinating idea comes straight from the brain. As information passes through our neurons, they “wire up” into networks through physical “docks” called synapses. These structures, which sit on top of neural branches like little mushrooms, are multitaskers: they both compute and store information through changes in their protein composition.
In other words, unlike classic computers, neurons don’t need to shuttle information from memory to CPUs. This gives the brain its edge over digital devices: it’s highly energy efficient and performs multiple calculations simultaneously, all packed into a three-pound jelly stuffed inside the skull.
Why not recreate aspects of the brain?
Enter neuromorphic computing. One hack uses RRAMs, or resistive random-access memory devices (also dubbed ‘memristors’). RRAMs store information even when cut off from power by changing the resistance of their hardware.
Like synapses, these components can be packed into dense arrays in a tiny area, creating circuits capable of highly complex calculations without the bulk. When combined with CMOS, a fabrication process for constructing circuits in our current microprocessors and chips, the duo becomes even more powerful for running deep learning algorithms.
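To see why that pairing suits deep learning, here’s an idealized Python sketch of a crossbar doing a matrix-vector multiply in the analog domain: each cell’s conductance stores a weight, input voltages drive the array, and Ohm’s law plus Kirchhoff’s current law perform the multiply-and-add. The conductance range and array size are assumptions, and real devices add noise and nonlinearity that this sketch ignores.

```python
import numpy as np

# Idealized crossbar sketch: each output current is a sum of conductance x voltage
# terms (Ohm's law per cell, Kirchhoff's current law for the sum). Device noise,
# wire resistance, and the exact physical row/column orientation are glossed over.

rng = np.random.default_rng(0)
weights = rng.uniform(-1, 1, size=(4, 8))      # the signed weights we want to store

# A conductance can't be negative, so a common trick is to use a pair of cells
# per weight and take the difference of the two column currents.
g_max = 100e-6                                 # assumed maximum conductance, in siemens
g_pos = np.clip(weights, 0, None) * g_max
g_neg = np.clip(-weights, 0, None) * g_max

v_in = rng.uniform(0, 0.2, size=8)             # input voltages driving the array

# Analog output: the differential column currents give a signed dot product.
i_out = g_pos @ v_in - g_neg @ v_in

# The same result computed digitally, for comparison.
reference = (weights * g_max) @ v_in
print(np.allclose(i_out, reference))           # True
```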
Yet it comes at a cost. “The highly-parallel analog computation within RRAM-CIM architecture brings superior efficiency but makes it challenging to realize the same degree of functional flexibility and computational accuracy as in digital circuits,” the authors said.
Optimization Genie
The new research delved into every part of an RRAM-CIM chip, redesigning it for practical use.
It starts with the technology. NeuRRAM boasts 48 cores that compute in parallel, with RRAM devices physically interwoven into CMOS circuits. Like a neuron, each core can be individually switched off when not in use, conserving power while its memory remains stored on the RRAM.
These RRAM cells, all three million of them, are wired so that data can flow in both directions. It’s a critical design choice, allowing the chip to flexibly adapt to multiple different types of AI algorithms, the authors explained. For example, the convolutional neural network (CNN), a type of deep neural net that excels at computer vision, needs data to flow in only one direction.
In contrast, the LSTM, a type of deep neural net often used for audio recognition, processes data recurrently to match signals over time. Like synapses, the chip encodes how strongly one RRAM “neuron” connects to another.
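As a rough illustration of the two dataflows the chip has to support, here’s a minimal Python sketch: a feed-forward pass where data moves through the layers once, and a recurrent pass where a hidden state loops back on itself at every time step. The recurrent network is simplified to a vanilla RNN rather than a full LSTM, and the layer sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

# Feed-forward (CNN-style): data flows through the layers once, in one direction.
def feed_forward(x, layers):
    for W in layers:
        x = np.maximum(W @ x, 0)                  # linear layer + ReLU
    return x

# Recurrent (LSTM-style, simplified): the same weights are reused at every
# time step, and the hidden state feeds back into the computation.
def recurrent(sequence, W_in, W_rec):
    h = np.zeros(W_rec.shape[0])
    for x_t in sequence:
        h = np.tanh(W_in @ x_t + W_rec @ h)       # hidden state loops back
    return h

layers = [rng.standard_normal((16, 32)), rng.standard_normal((8, 16))]
print(feed_forward(rng.standard_normal(32), layers).shape)         # (8,)

W_in, W_rec = rng.standard_normal((16, 32)), rng.standard_normal((16, 16))
print(recurrent(rng.standard_normal((5, 32)), W_in, W_rec).shape)  # (16,)
```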
This flexible architecture also made it possible to fine-tune the flow of information to minimize traffic jams. Like expanding a single-lane road into a multi-lane highway, the chip can duplicate a network’s current “memory” of its most computationally intensive problems across multiple cores, so that several cores analyze the problem simultaneously.
The final touchup over previous CIM chips is a stronger bridge between brain-like computation, which is often analog, and digital processing. Here, the chip uses a neuron circuit that can easily convert analog calculations into digital signals. It’s a step up from previous “power-hungry and area-hungry” setups, the authors explained.
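As a loose illustration of that analog-to-digital step (not the paper’s actual neuron circuit), here’s a sketch of quantizing analog partial sums into a handful of digital levels; the bit width and voltage range are assumptions.

```python
import numpy as np

# Minimal sketch of an analog-to-digital conversion: an analog partial sum is
# clipped to the converter's input range and rounded to one of 2**bits levels.

def adc(analog_values, bits=4, v_min=-1.0, v_max=1.0):
    levels = 2 ** bits - 1
    clipped = np.clip(analog_values, v_min, v_max)
    codes = np.round((clipped - v_min) / (v_max - v_min) * levels)
    return codes.astype(int)

partial_sums = np.array([-0.93, -0.2, 0.0, 0.41, 1.7])   # made-up analog readings
print(adc(partial_sums))                                  # [ 1  6  8 11 15]
```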
The optimizations paid off. Putting their theory to the test, the team manufactured the NeuRRAM chip and developed algorithms to program the hardware for different AI algorithms, like a PlayStation 5 running different games.
In a multitude of benchmark tests, the chip performed like a champ. Running a seven-layer CNN on the chip, NeuRRAM had an error rate of less than one percent at recognizing handwritten digits using the popular MNIST database.
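For a sense of what such a network looks like, here’s one plausible way to lay out a seven-layer CNN for MNIST in PyTorch, counting five convolutional and two fully connected weight layers; the study’s exact layer sizes aren’t given here, so these are illustrative.

```python
import torch
import torch.nn as nn

# Illustrative seven-layer CNN for 28x28 grayscale MNIST digits
# (seven layers with weights: five convolutions plus two fully connected layers).
model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 28x28 -> 14x14
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 14x14 -> 7x7
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.Flatten(),
    nn.Linear(64 * 7 * 7, 128), nn.ReLU(),
    nn.Linear(128, 10),                                            # 10 digit classes
)

logits = model(torch.randn(1, 1, 28, 28))   # one fake digit image
print(logits.shape)                         # torch.Size([1, 10])
```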
It also excelled at more complicated tasks. Loading another popular deep neural net, the LSTM, the chip was roughly 85 percent accurate when challenged with Google speech command recognition. Using just eight cores, the chip, running yet another AI architecture, could also recover noisy images, reducing errors by roughly 70 percent.
One Word: Energy
Most AI algorithms are power hogs. NeuRRAM ran at half the energy cost of previous state-of-the-art RRAM-CIM chips, further translating the promise of energy savings from neuromorphic computing into reality.
But the study’s standout is its strategy. Too often, when designing chips, scientists need to balance efficiency, versatility, and accuracy across multiple tasks, metrics that are often at odds with one another. The problem gets even harder when all the computing is done directly on the hardware. NeuRRAM showed it’s possible to battle all the beasts at once.
The authors said the strategy used here could also be used to optimize other neuromorphic computing devices, such as phase-change memory technologies.
For now, NeuRRAM is a proof of concept, showing that a physical chip, rather than a simulation of one, works as intended. But there’s room for improvement, including further scaling up RRAMs and shrinking the chip’s size so it could one day fit into our phones.
“Maybe today it is used to do simple AI tasks such as keyword spotting or human detection, but tomorrow it could enable a whole different user experience. Imagine real-time video analytics combined with speech recognition all within a tiny device,” said study author Dr. Weier Wan. “As a researcher and an engineer, I aim to bring research innovations from labs into practical use.”
Read the original article on Singularity Hub.