BLUSP Processor

Block diagram of the BLUSP processor

An architecture built around a programmable kernel is often a good solution to a design problem.

Most architectures on the market are either good at control code but poor at running intensive, repetitive DSP code, or optimized for DSP code and therefore hard to program and ill-suited to large chunks of control code.

Although many architectures claim to be low power, significant improvement in power consumption is still possible.

Finally, many problems benefit from a few dedicated instructions, so an approach that allows a limited number of dedicated instructions to be introduced quickly provides significant value.

Design

The design of the BLUSP architecture was driven by a number of specific applications in which we have demonstrated its applicability: wireless transceivers for standards such as 802.15.4, 802.11ah and Bluetooth; (automotive) radar receivers; and (automotive) image recognition. It is also applicable to medical applications such as hearing aids.

Its calculation power and speed also make it suitable for complex, high-data-rate modulation applications. For example, it can handle a 256-point FFT in 1350 cycles with 16-bit precision. Its 32-bit capability makes it suitable for audio applications: the core can implement an MP3 decoder in 32-bit arithmetic running at a 4 MHz clock rate.
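To put the cycle count in perspective, it can be converted to wall-clock time for a given clock rate. The sketch below is only illustrative arithmetic: the function names are hypothetical, and the throughput figure assumes the core does nothing but back-to-back transforms, which is an idealisation rather than a measured result.

```c
#include <assert.h>

/* Wall-clock time, in microseconds, needed to run `cycles` clock cycles
 * at a clock of `clock_mhz` MHz (1 MHz equals 1 cycle per microsecond). */
double cycles_to_us(double cycles, double clock_mhz)
{
    assert(clock_mhz > 0.0);
    return cycles / clock_mhz;
}

/* Upper bound on back-to-back transforms per second, ASSUMING the core
 * runs nothing else in between (illustrative, not a measured figure). */
double transforms_per_second(double cycles_per_transform, double clock_mhz)
{
    assert(cycles_per_transform > 0.0);
    return clock_mhz * 1e6 / cycles_per_transform;
}
```

With the 1350 cycles quoted above, cycles_to_us(1350.0, 600.0) gives 2.25 microseconds at the 600 MHz quoted for the 40nm LP library, and cycles_to_us(1350.0, 4.0) gives 337.5 microseconds at the 4 MHz used for the MP3 decoder.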

More specifically, BLUSP stands for:

Ultra low power

In a TSMC 40nm LP library operated at 1.1 V, the core draws 13 µA/MHz when running a typical program. It draws 21 µA/MHz (= 7 pJ/instruction) with a “smoker” pattern, which loads the slots and the complex MAC at 100%.

These figures are based on simulations with post-layout extraction of load capacitances and resistances.
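The current figures can be converted to energy with simple arithmetic: a current in µA/MHz is a charge in pC per clock cycle, and multiplying by the supply voltage gives pJ per cycle. The sketch below shows this conversion. The function names are hypothetical, and the per-instruction figure ASSUMES that the per-cycle energy is divided evenly over fully occupied issue slots; the description does not state how its per-instruction number was derived.

```c
#include <assert.h>

/* Energy per clock cycle in picojoules.
 * uA/MHz = 1e-6 A / 1e6 Hz = 1e-12 C per cycle (pC); pC * V = pJ. */
double energy_per_cycle_pj(double current_ua_per_mhz, double supply_v)
{
    assert(current_ua_per_mhz >= 0.0 && supply_v >= 0.0);
    return current_ua_per_mhz * supply_v;
}

/* Energy per issued instruction in picojoules, ASSUMING every one of
 * `issue_slots` slots issues an instruction on every cycle. */
double energy_per_instruction_pj(double current_ua_per_mhz, double supply_v,
                                 int issue_slots)
{
    assert(issue_slots > 0);
    return energy_per_cycle_pj(current_ua_per_mhz, supply_v) / issue_slots;
}
```

Under these assumptions, 13 µA/MHz at 1.1 V corresponds to about 14.3 pJ per cycle and 21 µA/MHz to about 23.1 pJ per cycle. Spread over the three issue slots of the core, the latter is about 7.7 pJ per instruction, in the same range as the quoted figure.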

Calculation Performance 

BLUSP is a high-performance DSC (a mix of DSP and controller) core with a 3-issue architecture, a single-cycle complex MAC, and a 600 MHz clock speed in a 40nm LP library.
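For readers less familiar with the term, a complex MAC (multiply-accumulate) computes acc += a * b on complex operands, which takes four real multiplications and four additions or subtractions. The plain C sketch below shows the operation that such a unit performs in one cycle. The type names, the 16-bit operand width and the 64-bit accumulator are assumptions chosen for illustration; they are not taken from the BLUSP instruction set or its toolchain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types: 16-bit complex samples, wide complex accumulator. */
typedef struct { int16_t re, im; } cplx16;
typedef struct { int64_t re, im; } cplx_acc;

/* One complex multiply-accumulate step: acc += a * b.
 * (a.re + j*a.im) * (b.re + j*b.im)
 *   = (a.re*b.re - a.im*b.im) + j*(a.re*b.im + a.im*b.re) */
static void cmac(cplx_acc *acc, cplx16 a, cplx16 b)
{
    acc->re += (int64_t)a.re * b.re - (int64_t)a.im * b.im;
    acc->im += (int64_t)a.re * b.im + (int64_t)a.im * b.re;
}

/* Complex dot product without conjugation: sum over x[i] * y[i].
 * This loop shape is typical of filters and FFT butterflies. */
cplx_acc cdot(const cplx16 *x, const cplx16 *y, size_t n)
{
    cplx_acc acc = { 0, 0 };
    assert(n == 0 || (x != NULL && y != NULL));
    for (size_t i = 0; i < n; i++) {
        cmac(&acc, x[i], y[i]);
    }
    return acc;
}
```

On a general-purpose controller each cmac step costs several instructions; a single-cycle complex MAC collapses the loop body into one operation, which is where much of the DSP throughput comes from.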

Standard C Programming Model

We offer an Eclipse-based software development kit. The core features a small, ultra-RISC instruction set with few (only one) specific data types.

This makes it very easy to optimize standard C code for the core.

The software environment consists of an (assembly) code simulator, a debugger, a profiler and a compiler. It has been developed in cooperation with an external partner.

Possible to add dedicated instructions

Often, adding a few dedicated instructions can significantly improve the power consumption of certain algorithms. Within around 3 months, BlueICe can deliver dedicated versions of the core and tools that include these dedicated instructions.

In the future, BlueICe intends to evolve its core into a multiprocessor architecture, driven by the same design targets: an easy programming model, close to standard C; low power; and adaptability to the specific problem.

Furthermore, a 6-issue version has been defined that is even more powerful and has lower power consumption.