Why FASThread?

Nema Labs created FASThread to help any programmer overcome the most pressing challenges of developing performance-critical software for multi-core platforms. The driving forces behind FASThread are the following:

  • Writing parallel code is difficult and time-consuming. Parallelization should be abstracted away from the developer and done fully automatically.
  • Parallel code is to some extent a platform-dependent optimization and will have to be redone when a new platform becomes available or when the code is ported to another (multi-core) platform. The platform-dependent optimizations, e.g. parallelization, should be done automatically. The original, sequential source code should not be modified, as that would create maintenance problems.
  • Generation of parallel code must be reliable, as testing parallel code is a tedious and error-prone process.

FASThread integrates into an IDE (Integrated Development Environment) and offers parallelization of sequential source code in a familiar high-productivity programming environment.

Multi-core challenges addressed

Writing code for maximum multi-core performance requires programmers to master several involved steps:

  1. Identifying hot-spots – the code segments where most of the execution time is spent.
  2. Selecting and implementing appropriate parallelization methods.
  3. Verifying that no bugs were introduced in the parallel code.
  4. Verifying multi-core performance.

FASThread addresses all these challenges.

FASThread workflow

1. Identifying candidates for parallelization
Code segments where most of the execution time is spent are the best candidates for performance improvements through parallelization. Finding these segments among thousands of lines of code is challenging without assistance.
FASThread’s integrated profiler automatically identifies code segments where parallelization will provide maximum leverage. The profiling information is presented to the programmer who can decide which code segments should be modified.

2 & 3. Selecting and implementing reliable threading methods
Conceptually, a code segment can be parallelized by distributing its components (e.g., iterations in a loop) evenly across the available cores. The parallelized code segment will execute correctly only if its components are independent of one another, that is, if no component reads or writes data that another component writes. Establishing this independence is a difficult and involved problem. In C/C++, where pointers prevail, it is particularly challenging.
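
As an illustration of why pointers make independence hard to establish, the following is a minimal sketch in C. The function name `add_one` is hypothetical and is not taken from FASThread. The loop looks as if every iteration is independent, but that holds only when the two pointers refer to non-overlapping memory.

```c
#include <stddef.h>

/* Each iteration reads src[i] and writes dst[i].
 * If dst and src do not overlap, the iterations are independent.
 * If the caller passes overlapping pointers (for example dst == src + 1),
 * iteration i reads the value written by iteration i - 1, and the loop
 * carries a dependency that forbids naive parallelization. */
void add_one(int *dst, const int *src, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i] + 1;
}
```

Calling `add_one(out, in, n)` with two separate arrays gives independent iterations, whereas `add_one(a + 1, a, n)` on a single array turns the same loop into a sequential chain. The function body alone does not reveal which case applies, so the analysis has to reason about where the pointers may point.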

FASThread’s automatic parallelization analysis quickly establishes whether code components are independent. If a code segment cannot be parallelized directly, FASThread lists all the issues it has encountered that prevent parallelization. Interactive help pages guide the developer through each issue in turn. These pages provide generic examples of how a given issue can be eliminated by removing dependencies from the C/C++ code.
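
To give a sense of what removing a dependency can look like, here is a generic sketch in C. It is not taken from the FASThread help pages, and the function names are hypothetical. The first version updates a running position from one iteration to the next. The second computes the position from the loop index alone, so no iteration depends on a value produced by an earlier one.

```c
#include <stddef.h>

/* Before: `pos` is carried from one iteration to the next,
 * so the iterations cannot simply be handed to different cores. */
void fill_strided_before(int *a, size_t n, size_t stride, int value)
{
    size_t pos = 0;
    for (size_t i = 0; i < n; ++i) {
        a[pos] = value;
        pos += stride;
    }
}

/* After: the position is derived from `i` only.
 * Each iteration touches a distinct element when stride > 0. */
void fill_strided_after(int *a, size_t n, size_t stride, int value)
{
    for (size_t i = 0; i < n; ++i)
        a[i * stride] = value;
}
```

Both versions write the same elements. Only the second makes the independence of the iterations visible in the code itself.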

When all issues have been rectified, FASThread can parallelize the code automatically. The auto-parallelization framework performs source-level optimizations for the targeted platform and generates a parallel binary executable from the sequential source code. FASThread preserves the correctness of the sequential source in the parallel binary.

4. Verifying multi-core performance
The final step is to verify that the parallel version delivers the expected performance increase. FASThread includes functionality to run performance tests of the parallel binary on the multi-core platform. It launches a sequence of runs of both the original sequential version and the parallel version on a preset number of cores, and reports the resulting speedup. If the performance improvement is not as expected, one can quickly go back to the profiling step to choose another code segment for parallelization.
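
For reference, speedup is conventionally defined as the sequential run time divided by the parallel run time, and parallel efficiency as the speedup divided by the number of cores. The sketch below shows these standard definitions in C. The function names are hypothetical, and how FASThread itself computes or presents its results is not described here.

```c
/* Speedup: how many times faster the parallel run is than the sequential run.
 * Both times must be in the same unit and t_par must be greater than zero. */
double speedup(double t_seq, double t_par)
{
    return t_seq / t_par;
}

/* Efficiency: speedup per core; 1.0 means perfectly linear scaling. */
double efficiency(double t_seq, double t_par, int cores)
{
    return speedup(t_seq, t_par) / (double)cores;
}
```

For example, a sequential run of 8 seconds and a parallel run of 2 seconds on 4 cores give a speedup of 4 and an efficiency of 1.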