Figure 1: Architecture of Symmetric Transpose and Systolic 16-Tap FIRs
By: Hichem Belhadj, Govind Krishnan, Madhubabu Anumukonda, Microsemi
One of the pillars of Microsemi’s religion is “Power Matters.” While previous generations of Microsemi FPGAs delivered the lowest static power in their class, SmartFusion2 and IGLOO2 were designed to deliver not only the lowest static power but also the lowest total power. This was achieved through power-conscious development of the process technology, the architecture, the design of the configurable logic, and embedded features such as SerDes, DDR2/3, and DSP blocks. Additionally, these devices offer a Flash*Freeze power mode that reduces power consumption to even less than the static power. This article focuses on DSP designs: it briefly discusses FIR filter architectures and presents a comparison of power dissipation measured on actual silicon.
Finite Impulse Response (FIR) filters are among the DSP blocks most widely used to remove unwanted noise, improve signal quality, or shape the signal spectrum. Several architectures exist for these filters (Transpose or Systolic, each with or without symmetry), and they differ in characteristics such as initial latency, number of DSP blocks, throughput, and number of pipeline registers. Figure 1 depicts the symmetric versions of Transpose and Systolic 16-Tap FIRs and illustrates the differences between these architectures.
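To make the role of symmetry concrete, here is a minimal behavioral sketch in Python (not the RTL used in the study). It assumes an even-length filter whose coefficients satisfy h[k] = h[N-1-k], so each pair of mirrored samples can be added before the multiply and only N/2 multiplications are needed per output. The function names `fir_direct` and `fir_symmetric` are illustrative, not taken from any Microsemi tool.

```python
def fir_direct(h, x):
    """Reference FIR: one multiply per tap for every output sample."""
    n_taps = len(h)
    y = []
    for n in range(len(x)):
        acc = 0
        for k in range(n_taps):
            if n - k >= 0:  # samples before the start are treated as zero
                acc += h[k] * x[n - k]
        y.append(acc)
    return y


def fir_symmetric(h_half, x):
    """Folded FIR for an even-length symmetric filter.

    h_half holds only the first N/2 coefficients; the second half is
    assumed to be their mirror image.
    """
    half = len(h_half)
    n_taps = 2 * half

    def sample(i):
        return x[i] if i >= 0 else 0

    y = []
    for n in range(len(x)):
        acc = 0
        for k in range(half):
            # Pre-add the two samples that share coefficient h[k]
            pre_add = sample(n - k) + sample(n - (n_taps - 1 - k))
            acc += h_half[k] * pre_add  # one multiply covers two taps
        y.append(acc)
    return y
```

For a 16-Tap symmetric filter this sketch performs 8 multiplications per output sample instead of 16, which is why the symmetric variants are attractive when the multiplier count matters.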
In a nutshell, systolic architectures use pipeline stages and reduce the input fanout to increase the operating frequency. However, the initial latency of an N-Tap systolic FIR is (2*N - 2) cycles. Transpose architectures run at a lower frequency but have a better initial latency of (N - 1) cycles and use fewer sequential resources. Other criteria related to filter stability arise, in particular when the number of taps is very large and weighting needs to be considered. For instance, in a voice processing application dealing with echo cancellation, the weights need to be higher at the near end, where most of the echo resides, and decrease on the later filter taps, where the echo is lower.
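The two latency expressions above can be tabulated for the tap counts mentioned in this article. The following is a small illustrative calculation; the helper names are hypothetical.

```python
def systolic_latency(n_taps):
    """Initial latency, in cycles, of an N-Tap systolic FIR: 2*N - 2."""
    return 2 * n_taps - 2


def transpose_latency(n_taps):
    """Initial latency, in cycles, of an N-Tap transpose FIR: N - 1."""
    return n_taps - 1


for n in (16, 32, 64, 128):
    print(f"{n:>3} taps: systolic {systolic_latency(n):>3} cycles, "
          f"transpose {transpose_latency(n):>3} cycles")
```

For the 16-Tap filters of Figure 1 this gives 30 cycles for the systolic form against 15 for the transpose form, so the systolic latency is always exactly twice the transpose latency.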
The power study covered 32-, 64-, and 128-Tap Transpose FIR implementations, the power estimation tools, and, more importantly, actual silicon measurements at various temperatures. The following figures show the total power measured on silicon at room temperature on the development kits for IGLOO2, Artix7, and Cyclone V.
Looking at the charts in Figure 2, several comments are worth making:
- The silicon measurements show that IGLOO2 consumes significantly less power than Artix7 and Cyclone V. The savings are more substantial at lower frequencies and at high temperatures. They would be larger still at 1V, since the IGLOO2 numbers were collected at 1.2V while the device is capable of running at 1V.
- The power dissipation of IGLOO2 scales linearly with the number of Taps.
- The Artix7 power figures are worse when the number of Taps is low, while the Cyclone V power dissipation gets worse as the number of Taps grows. An in-depth analysis of this peculiar increase in Cyclone V power consumption with the number of Taps points to some architectural issues.
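As a rough illustration of why operating IGLOO2 at 1V instead of 1.2V would matter, the sketch below applies the usual first-order CMOS rule that dynamic power scales with the square of the supply voltage at a fixed clock frequency and switching activity. This rule is an assumption introduced here, not a measurement from the study, and static power does not follow it.

```python
def dynamic_power_ratio(v_new, v_old):
    """First-order estimate: P_dyn is proportional to V^2 at fixed f and activity."""
    return (v_new / v_old) ** 2


ratio = dynamic_power_ratio(1.0, 1.2)
print(f"Estimated dynamic power at 1.0 V relative to 1.2 V: {ratio:.2f}")
```

Under that assumption the dynamic component would drop to roughly 0.69 of its 1.2V value, a reduction of about 30%. Actual silicon figures would have to be measured.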
In conclusion, the Microsemi SmartFusion2 and IGLOO2 families are the best suited for power-conscious, high-speed designs, particularly DSP-intensive ones.