We are witnessing a major paradigm shift in microprocessor architecture. Performance keeps increasing, just as it has for decades, but raw CPU clock speed has run into power and heat limits. Instead of more megahertz, multiple processor cores now provide the performance gains we expect.
Unfortunately, to benefit from the multi-core performance potential, software has to be parallelized. This process is difficult, labor-intensive and error-prone. Moreover, parallelization aims at a moving target. Future multi-core systems will not only contain more processor cores but also cores of different kinds. It is highly unlikely that anyone will find the time to manually modify source code to keep up with hardware progress.
Most independent software vendors (ISVs) feel challenged by the prospect of parallelizing their multi-million-line legacy code bases. It is difficult because most programmers lack multi-core experience, and acquiring the necessary knowledge involves a steep learning curve. Even with relevant training, it is labor-intensive because tracking and resolving dependencies in the source code takes a lot of time without adequate tools. It is also error-prone, since parallel software is orders of magnitude harder to test than sequential software.
The technological foundation of FASThread is an auto-parallelization framework that ensures the parallel code preserves the correctness of the sequential code. Unlike previous unsuccessful attempts, our proprietary auto-parallelization framework can handle especially challenging, pointer-heavy source code. Moreover, our auto-parallelizer interacts with the user to remove dependencies from the sequential source so that it can be parallelized automatically.
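To make the idea of removing dependencies from sequential source more concrete, here is a minimal sketch in C. It is an illustration only: it is not FASThread output and uses no FASThread API, and the function names and the loop itself are hypothetical. The first version carries a variable from one iteration to the next and uses plain pointers that might overlap; the second computes the same result with independent iterations.

```c
#include <stddef.h>

/* Hypothetical example: the write position `pos` is carried from one
   iteration to the next, so each iteration depends on the previous one.
   The plain pointers also leave open whether `in` and `out` overlap. */
void scale_with_dependency(const double *in, double *out, size_t n, double k)
{
    size_t pos = 0;
    for (size_t i = 0; i < n; i++) {
        out[pos] = in[i] * k;
        pos = pos + 1;          /* loop-carried dependency */
    }
}

/* Same result for non-overlapping buffers. The carried variable is
   replaced by the loop index, and `restrict` (C99) states that `in`
   and `out` do not alias, so every iteration is independent. */
void scale_independent(const double *restrict in, double *restrict out,
                       size_t n, double k)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = in[i] * k;
    }
}
```

Both functions are still ordinary sequential code that can be tested with a sequential debugger; the difference is that the second form leaves nothing that forces the iterations to run in order.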
Nema Labs offers automatic application optimization for multi-core systems. Our key strategy is to leverage the existing skill set of conventionally trained programmers who have mastered sequential programming and high-productivity programming tools such as IDEs, sequential debuggers and the like.
Nema Labs was founded to enable a smooth transition to parallel code for fast and reliable multi-core execution, especially for programmers with little prior experience with parallelization. Our first product, FASThread, is an interactive compiler add-on integrated into an IDE. FASThread gives each of its supported C/C++ compilers the ability to optimize for multi-core platforms.
We believe the developers’ existing knowledge and experience are best used by extending the functionality of well-known development tools. Furthermore, FASThread presents intuitive feedback to the programmer on how best to modify the sequential code so that it can be parallelized automatically and reliably. As a result, FASThread and programmers work in unison: the programmer tweaks and debugs sequential code with well-known tools, and FASThread automatically generates high-performance parallel applications.
The feedback loop is established by conveying information from the auto-parallelization framework to the programmer. The information is presented in an intuitive way and includes customized references to a database full of programming tips that can unleash the full power of parallel, multi-core code.
The long-term vision of the technology is to shield software developers from platform-dependent optimizations so that they can focus on software innovation, while the FASThread technology fuels it by unlocking the performance of present and future multi-core platforms, be they homogeneous or heterogeneous.
Getting up to speed with FASThread is a matter of hours. In our experience, code that takes hours to parallelize with FASThread would take days or weeks with other tools.
The first release of FASThread integrates with Visual Studio and targets Intel and AMD processors. An Eclipse/Linux version, as well as a command-line version of FASThread, will be available in the near future. Support for the GNU and Intel C compilers will also be added.
We are busy developing our second generation of tools to address many-core platforms. Here we are extending our technology to support combinations of multiple parallelism patterns, and adding more elaborate analysis to strengthen our underlying auto-parallelization framework.