This paper presents an optimised, high-throughput architecture for integer squaring on FPGAs. The approach requires fewer DSP blocks than a standard multiplier. Previous work proposed the tiling method for double-precision squaring, which uses the fewest DSP blocks reported so far. However, that approach incurs a large overhead in look-up table (LUT) consumption and has a complex, irregular structure that is not suitable for larger word sizes. The architecture proposed in this paper matches the DSP block reduction of the tiling method while incurring a much lower LUT overhead: 21.8% fewer LUTs for a 53-bit squarer. The architecture is mapped to a Xilinx Virtex 6 FPGA and evaluated over a wide range of operand word sizes, demonstrating its scalability and efficiency. © 2013 IEEE.
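As a rough illustration of why a dedicated squarer can need fewer hardware multipliers than a general multiplier, the sketch below splits an operand into fixed-width chunks and exploits the symmetry of the partial products: each cross term x_i * x_j with i < j appears twice in the square, so it can be computed once and doubled with a shift. This is a generic software model, not the architecture proposed in the paper or the tiling method it is compared against. The function names, the 17-bit chunk width, and the chunk count of 4 are assumptions chosen only for the example.

```python
def split_chunks(x, width, count):
    """Split a non-negative integer into `count` chunks of `width` bits, LSB first."""
    mask = (1 << width) - 1
    return [(x >> (i * width)) & mask for i in range(count)]


def square_by_symmetry(x, width=17, count=4):
    """Square x using chunked partial products.

    Returns (x * x, number of chunk-by-chunk multiplications performed).
    The chunk width and count are illustrative assumptions, not values
    taken from the paper.
    """
    if x < 0 or (x >> (width * count)) != 0:
        raise ValueError("x does not fit in width * count bits")

    chunks = split_chunks(x, width, count)
    total = 0
    mults = 0
    for i in range(count):
        # Diagonal term: x_i^2, weighted by 2^(2*i*width).
        total += (chunks[i] * chunks[i]) << (2 * i * width)
        mults += 1
        for j in range(i + 1, count):
            # Cross term appears twice in the square, so compute it once
            # and double it with an extra left shift of one bit.
            total += (chunks[i] * chunks[j]) << ((i + j) * width + 1)
            mults += 1
    return total, mults


if __name__ == "__main__":
    x = (1 << 53) - 1  # largest 53-bit operand
    result, mults = square_by_symmetry(x)
    print(result == x * x, mults, "multiplications instead of", 4 * 4)
```

With n chunks, this scheme performs n(n+1)/2 chunk multiplications where a general n-by-n chunked multiplier would perform n². How those products map onto DSP blocks and LUTs in a real design depends on the device and the architecture, which this model does not attempt to capture.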
|Original language|English (US)|
|Title of host publication|Proceedings - 21st Annual International IEEE Symposium on Field-Programmable Custom Computing Machines, FCCM 2013|
|Number of pages|4|
|State|Published - Aug 12 2013|