Significant Leap Forward in Performance and Power Efficiency Reported Using Altera High-end FPGAs with Hard Floating Point DSP Blocks
Hong Kong, March 2, 2015 – Altera Corporation announced that Microsoft (NASDAQ: MSFT) is using Altera Arria® 10 FPGAs (field programmable gate arrays) to achieve compelling performance-per-Watt in data center acceleration based on CNN (convolutional neural network) algorithms. These algorithms are frequently used for image classification, image recognition, and natural language processing.
Microsoft researchers are working on advancing cloud technologies and are using the Arria 10 Developer Kit and engineering samples of Arria 10 FPGAs, which are demonstrating up to 40 GFLOPS-per-Watt, an industry-leading level in data center performance. Compared with GPGPUs, this FPGA performance offers a performance-to-power advantage of more than 3X for CNN platforms. This performance is achieved by using either OpenCL, an open software development language, or VHDL to program the Arria 10 FPGA and its IEEE 754 hard floating point DSP (digital signal processing) blocks.
“We are seeing a significant leap forward in CNN performance and power efficiency with Arria 10 engineering samples, and the silicon’s precision hard floating point in the DSP blocks is part of the reason we are seeing compelling results in our research,” said Doug Burger, director, Client and Cloud Apps, Microsoft Research. Burger describes some of the infrastructure-level challenges facing the data center and how Microsoft is addressing them by replacing traditional CPUs with reprogrammable FPGAs.
“The FPGA has an architectural advantage for neural algorithms with the ability to convolve and do pooling very efficiently with a flexible data path which enables many OpenCL kernels to pass data directly to each other without having to go to external memory,” said Michael Strickland, director of the Compute and Storage Business Unit, Altera. “Arria 10 has an additional architectural advantage of supporting hard floating point for both multiplication and addition – this hard floating point enables more logic and a faster clock speed than traditional FPGA products.”
Altera previously announced that Microsoft is using Altera Stratix V FPGAs to accelerate search on Microsoft’s innovative Catapult board, which is being deployed in servers in the first Bing data center later this year.
Related Quotes
Altera 20 nm FPGAs with Hard Floating Point DSP Demonstrate Industry-leading Performance and Power Efficiency
Many companies are using Altera Arria® 10 FPGA products with on-board hard floating point DSP to achieve compelling performance-per-Watt. Altera is working closely with customers and partners on solutions for high performance computing (HPC), data center acceleration, and financial systems.
Microsoft – Doug Burger, Director of Client and Cloud Apps
“We are seeing a significant leap forward in CNN performance and power efficiency with Arria 10 engineering samples, and the silicon’s precision hard floating point in the DSP blocks is part of the reason we are seeing compelling results in our research,” said Doug Burger, director, Client and Cloud Apps, Microsoft Research.
Bittware – Jeff Milrod, President and CEO, Bittware
“Altera’s Arria 10 is a true game changer. Native floating-point engines on these devices give system designers access to massive floating-point resources with tremendous ease-of-use and power efficiency in an FPGA. Classic signal processing applications can now interface analog signals directly to Arria 10 and process them there in floating point,” said Jeff Milrod, president and CEO, Bittware. “For HPC and acceleration applications, FPGA algorithms no longer need to be ported to fixed point, nor do they need to be inefficiently implemented in fixed-point emulation of floating point. The Arria 10’s native floating point provides more than 40 GFLOPS/W with a higher Fmax, while using only one-third of the logic resources. It is easier to use, lower power, faster, and less resource-intensive than any other alternative previously available.”
Gidel – Reuven Weintraub, Founder and CTO, Gidel
“We are enthusiastic about the Altera Arria 10’s unprecedented flops-per-power performance. For a long time, FPGAs excelled in performance-per-power for bit, byte and then integer processing,” said Reuven Weintraub, founder and CTO, Gidel. “The Altera Arria 10’s tremendous floating-point-per-power opens the way for Gidel products to be a great fit for many more HPC and DSP applications.”
Nallatech – Allan Cantle, President and Founder, Nallatech
“Nallatech has ported several of our customers’ production codes that required floating point math using Altera’s OpenCL compiler. By targeting these at the new Arria 10 FPGA with dedicated floating-point DSPs, we can see savings in logic resource usage, increased clock frequencies and further improvements in performance/watt metrics, making Nallatech’s new Arria 10-based accelerators more compelling for a wider range of application areas,” said Allan Cantle, president and founder, Nallatech.
ReFLEX CES – Yann Casteignau, Principal Engineer, ReFLEX CES
“The ReFLEX CES recently-released FPGA boards based on Altera Arria 10 FPGAs will largely benefit from the new floating-point DSP blocks implemented in this Generation 10 FPGA family,” said Yann Casteignau, principal engineer, ReFLEX CES. “Our target is to give customers a significant GFLOPS/W ratio increase (ratio of three is expected), and at the same time, reduce the logic required to implement complex floating-point computations, leaving maximum space for custom design implementation. Many of our customers use ReFLEX CES boards for high performance computing, and power consumption is often a challenge. With Arria 10 FPGAs, the power consumption is reduced for better computing performances. The Arria 10 new hard-coded DSP floating-point operator is a decisive advantage for ReFLEX CES boards when it comes to increasing performance, reducing the logic needs, and optimizing the GFLOPS/W ratio.”