Nema Labs created FASThread to help any programmer overcome the most pressing challenges of developing performance-critical software for multi-core platforms. The driving forces behind FASThread are the following:
FASThread integrates into an IDE (Integrated Development Environment) and offers parallelization of sequential source code in a familiar high-productivity programming environment.
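Writing code for maximum multi-core performance requires programmers to master four involved steps: (1) identifying candidates for parallelization, (2) selecting reliable threading methods, (3) implementing them, and (4) verifying multi-core performance.
FASThread addresses all of these challenges.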
1. Identifying candidates for parallelization
Code segments where most of the execution time is spent are the best candidates for performance improvements through parallelization. Finding these segments among thousands of lines of code is challenging without assistance.
FASThread’s integrated profiler automatically identifies the code segments where parallelization will provide maximum leverage. The profiling information is presented to the programmer, who can then decide which code segments should be modified.
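To illustrate the kind of information such a profile captures, the sketch below times code segments by hand and reports which one dominates. It is not FASThread's profiler: `SegmentProfile`, `time_segment`, and `hottest` are hypothetical names invented for this illustration, and a real profiler would typically rely on sampling or instrumentation rather than manual timers.

```cpp
#include <chrono>
#include <map>
#include <string>

// Hypothetical helper (not part of FASThread): accumulated wall-clock
// seconds per named code segment.
using SegmentProfile = std::map<std::string, double>;

// Run `work` once and add its elapsed time to the entry for `name`.
template <typename Work>
void time_segment(SegmentProfile& profile, const std::string& name, Work work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    profile[name] += std::chrono::duration<double>(stop - start).count();
}

// Return the name of the segment with the largest accumulated time,
// or an empty string if nothing has been recorded.
inline std::string hottest(const SegmentProfile& profile) {
    std::string best;
    double best_time = -1.0;
    for (const auto& entry : profile) {
        if (entry.second > best_time) {
            best_time = entry.second;
            best = entry.first;
        }
    }
    return best;
}
```

Wrapping each candidate segment in `time_segment` and then calling `hottest` gives a rough, manual version of the ranking that a profiler produces automatically.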
2 & 3. Selecting and implementing reliable threading methods
Conceptually, a code segment can be parallelized by distributing its components (e.g., the iterations of a loop) evenly across the available cores. The parallelized segment executes correctly only if its components are independent of one another. Establishing that independence is a difficult and involved problem. It is particularly challenging in C/C++, where pointers are pervasive and two pointers may refer to the same memory.
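As a concrete picture of distributing loop iterations across cores, here is a minimal hand-written sketch using `std::thread`. It does not show what FASThread generates; the function names and the contiguous-chunk partitioning are assumptions chosen for illustration. The loop is safe to split only because iteration `i` touches nothing but `in[i]` and `out[i]`.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Sequential reference. Each iteration reads in[i] and writes out[i] only,
// so no iteration depends on another. (A loop such as
// out[i] = out[i - 1] + in[i] would NOT have this property.)
// Precondition for both functions: out.size() >= in.size().
void scale_sequential(const std::vector<double>& in, std::vector<double>& out,
                      double factor) {
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = in[i] * factor;
    }
}

// Hand-parallelized version: split the iteration space into contiguous
// chunks of near-equal size, one chunk per worker thread.
void scale_parallel(const std::vector<double>& in, std::vector<double>& out,
                    double factor, unsigned workers) {
    workers = std::max(1u, workers);
    const std::size_t n = in.size();
    const std::size_t chunk = (n + workers - 1) / workers;  // ceiling division
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(n, w * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) break;  // nothing left for this or later workers
        pool.emplace_back([&in, &out, factor, begin, end] {
            for (std::size_t i = begin; i < end; ++i) {
                out[i] = in[i] * factor;
            }
        });
    }
    for (std::thread& t : pool) t.join();  // wait for every chunk to finish
}
```

Because the chunks write to disjoint ranges of `out`, no locking is needed. If `in` and `out` could overlap through pointers, that guarantee would disappear, which is the aliasing difficulty described above.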
FASThread’s automatic parallelization analysis quickly establishes whether the components of a code segment are independent. If the segment cannot be parallelized directly, FASThread lists every issue that prevents parallelization. Interactive help pages guide the developer through each issue in turn, with generic examples of how to eliminate that kind of issue by removing dependencies from the C/C++ code.
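The source does not enumerate the issues FASThread reports, so the following is only a generic illustration of removing a dependency, not an excerpt from its help pages. It assumes one common case: a counter carried from one loop iteration to the next. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Before: `pos` is updated in every iteration and read by the next one.
// This loop-carried dependency forces the iterations to run in order.
// Precondition (both functions): out is large enough to hold index
// (in.size() - 1) * stride.
void fill_strided_dependent(std::vector<int>& out, const std::vector<int>& in,
                            std::size_t stride) {
    std::size_t pos = 0;
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[pos] = in[i];
        pos += stride;  // value needed by the following iteration
    }
}

// After: the position is computed directly from the loop index, so each
// iteration is self-contained. With stride >= 1 every iteration writes a
// distinct element, which makes the iterations independent.
void fill_strided_independent(std::vector<int>& out, const std::vector<int>& in,
                              std::size_t stride) {
    assert(stride >= 1);  // stride 0 would make all iterations write out[0]
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i * stride] = in[i];
    }
}
```

The two functions produce the same result, but only the second form lets the iterations be distributed across cores in any order.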
When all issues have been rectified, FASThread can parallelize the code automatically. The auto-parallelization framework applies source-level optimizations for the targeted platform and generates a parallel binary executable from the sequential source code. The parallel binary preserves the correctness of the sequential source.
4. Verifying multi-core performance
The final step is to verify that the parallel version delivers the expected performance increase. FASThread includes functionality to run performance tests of the parallel binary on the multi-core platform. It launches a sequence of runs of both the original sequential version and the parallel version on a preset number of cores, then reports the speedup. If the performance improvement is not as expected, the programmer can quickly return to the profiling step and choose another code segment for parallelization.
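The speedup comparison can be sketched by hand as follows. This is not FASThread's test harness: `best_time_seconds` and `speedup` are hypothetical helpers, and taking the fastest of several runs is an assumption, since the source does not say how repeated runs are aggregated. Speedup is taken in the usual sense of sequential time divided by parallel time.

```cpp
#include <algorithm>
#include <chrono>
#include <limits>

// Hypothetical helper: run `work` `runs` times and return the fastest
// wall-clock time in seconds. Returns infinity if runs <= 0.
template <typename Work>
double best_time_seconds(Work work, int runs) {
    double best = std::numeric_limits<double>::infinity();
    for (int r = 0; r < runs; ++r) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        best = std::min(best,
                        std::chrono::duration<double>(stop - start).count());
    }
    return best;
}

// Speedup of the parallel version relative to the sequential one.
// A value above 1.0 means the parallel version is faster.
inline double speedup(double sequential_seconds, double parallel_seconds) {
    return sequential_seconds / parallel_seconds;
}
```

Timing the sequential and parallel versions with `best_time_seconds` and passing both results to `speedup` gives a single number to compare against the core count. A result well below the number of cores suggests the chosen segment is too small or is limited by something other than computation.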