My Research

I am a research fellow working on architectural prototyping and parallel programming models for (embedded) many-core architectures. My research follows three main directions:

a. Developing modular, scalable and power-efficient parallel software for modern many-cores is a hard task (aka: nightmare) for programmers. They need new abstractions and an ad-hoc software stack (libraries, drivers) to extract the maximum performance per unit of energy from their code.

b. Integrating hardware accelerators and cores on the same chip is a common design trend nowadays. Rapidly prototyping the accelerators and the platform, and efficiently exploiting them from the application layer, are still open problems.

c. Providing real-time guarantees on single-core general-purpose (GP) machines is not easy at all... and it gets even worse when you scale to multi- and many-cores!

R&D on compilers, extensions to programming models, runtime libraries and design toolflows to tackle these issues... well, that’s my work.

I received a joint PhD from University of Bologna (Italy) and Université de Bretagne-Sud (France). I spent 5 years doing research on:

Parallel Programming models for MPSoCs

Software for embedded systems will become more complex, as multicore hardware enables more functions to be implemented on the same device. Making the most of the advanced features of such platforms requires software developers to have a good knowledge of the underlying hardware architecture. Of course, such knowledge is hard to acquire, so several mechanisms and paradigms have been devised to make programmers aware of “what is under the curtain” and let them specify architecture-level optimizations on data and code allocation, synchronization mechanisms and more. One of these standards is OpenMP.
OpenMP is a widely adopted shared-memory parallel programming interface. It consists of a set of compiler directives, library routines and environment variables that provide a simple means to specify parallel execution within sequential code. It was originally used on Symmetric Multi-Processors (SMP), but recently many implementations for embedded MPSoCs have been proposed (for example, on Cell).
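To give a flavour of those three ingredients, here is a minimal sketch in C. The function names dot and available_threads are made up for this illustration and do not come from any of my projects. A single directive parallelizes an otherwise sequential loop, a library routine reports the thread count, and the OMP_NUM_THREADS environment variable sets that count at launch time. The code also builds as plain sequential C when OpenMP is switched off, since the compiler then ignores the pragma.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>   /* OpenMP library routines, e.g. omp_get_max_threads() */
#endif

/* Dot product of two vectors of length n (hypothetical example kernel). */
double dot(const double *a, const double *b, int n)
{
    assert(a != NULL && b != NULL && n >= 0);
    double sum = 0.0;

    /* Compiler directive: the loop iterations are divided among the threads.
     * The reduction clause gives each thread a private partial sum and
     * combines the partial sums when the loop ends. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += a[i] * b[i];

    return sum;
}

/* Library routine: upper bound on the threads a parallel region may use.
 * Falls back to 1 when the file is compiled without OpenMP support.
 * The value can be set from outside through the OMP_NUM_THREADS
 * environment variable, without recompiling. */
int available_threads(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}
```

With GCC, for instance, OpenMP is enabled with the -fopenmp flag; without it the same source still computes the same result on a single thread.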

Busses and Predictability for MPSoCs

In this area my research focused on embedded high-predictability systems for automotive and avionics, mainly on shared-bus architectures. Predictability (the capability to predict system behaviour in terms of worst-case response times) is a key property of real-time systems, and the main challenges in those domains come from shared hardware resources and the complex multi-task software environment. All these aspects introduce contention, which leads to potentially unpredictable response times. In detail, I explored the use of Time Division Multiple Access (TDMA) techniques to limit the impact of shared-bus contention on performance. TDMA techniques implicitly ensure total predictability, but they have big limitations in terms of performance, which strongly depends on the time-slot allocation. The main issue is therefore to develop new TDMA-based algorithms and schemes that find a good tradeoff between performance and predictability. My work contributed to the European Project Predator (FP7).
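The dependence on slot allocation can be made concrete with a small sketch in C. It assumes a deliberately simplified bus model that is not taken from the Predator project: every slot has the same length, a transfer may only start at the beginning of a slot owned by the requesting core, and it always completes within that slot. The function name tdma_latency_bound and the table layout are hypothetical. Under these assumptions, the worst case is a request that just misses the start of an owned slot, so the bound is the largest distance between two consecutive owned slots plus one slot for the transfer itself.

```c
#include <assert.h>

/* Upper bound (in cycles) on the time between a bus request from `core`
 * and the completion of its transfer, under the simplified model above.
 *   owner[i]  : id of the core that owns slot i of the TDMA round
 *   n_slots   : number of slots in one round
 *   slot_len  : length of every slot, in cycles
 * Returns -1 if the core owns no slot (no bound exists). */
long tdma_latency_bound(const int *owner, int n_slots, int slot_len, int core)
{
    assert(owner != 0 && n_slots > 0 && slot_len > 0);

    int first = -1;   /* index of the first slot owned by the core */
    int prev = -1;    /* index of the most recent owned slot seen  */
    int max_gap = 0;  /* largest distance between consecutive owned slots */

    for (int i = 0; i < n_slots; i++) {
        if (owner[i] != core)
            continue;
        if (first < 0)
            first = i;
        else if (i - prev > max_gap)
            max_gap = i - prev;
        prev = i;
    }
    if (first < 0)
        return -1;

    /* The schedule repeats, so also measure the gap that wraps around
     * from the last owned slot to the first one of the next round. */
    int wrap = first + n_slots - prev;
    if (wrap > max_gap)
        max_gap = wrap;

    /* Waiting for the next owned slot, plus one slot for the transfer. */
    return (long)(max_gap + 1) * slot_len;
}
```

For the round {0, 1, 0, 2} with 10-cycle slots, core 0 owns two evenly spaced slots and gets a bound of 30 cycles, while core 1 owns a single slot and gets 50 cycles. The bound is fully predictable in both cases, but the price of a tighter bound for one core is bus time taken away from the others.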

A few links…