pasl

While multicore computers have become mainstream, multicore programming remains full of pitfalls. Using P cores to speed up one’s program by a factor close to P turns out to be far more challenging than it might sound at first. Beyond the immediate difficultly of parallelizing existing algorithms, one faces the problem of amortizing the costs of thread creation, the challenge of dealing with dynamic load balancing, the hazards of concurrent programming, and the headaches associated with weak memory models.

Since January 2011, we have been working on the development of new abstractions and scheduling algorithms for multicore programming. We belive that, by using the right programming language abstractions together with a flexible scheduler, multicore programming can become much more accessible. To experiment our ideas, we have been developing a C++ library, called pasl.

Related publications

Oracle Scheduling: Controlling Granularity in Implicitly Parallel Languages

Efficient Primitives for Creating and Scheduling Parallel Computations

Scheduling Parallel Programs by Work Stealing with Private Deques

Theory and Practice of Chunked Sequences

Software

Screenshots

Speedup curves

Speedup curve for string search (pattern of 10 chars, text of 10 billion chars).

Processor utilization

The plot shows utilization along the y axis and time along the x axis in the computation of matrix-multiply.

Experimental results

Our experimental results show that work stealing algorithms based on private deques are competitive with traditional concurrent work stealing deques.

Parallel runs averaged on 20 runs (usually 5% to 10% difference between a fast and a slow run); see our PPoPP’13 paper for details.

Theoretical result

We have proved the efficiency of a work stealing scheduler based on private deques.

Team

Collaborators

References

Acar, Umut A., Arthur Charguéraud, and Mike Rainey. 2011. “Oracle Scheduling: Controlling Granularity in Implicitly Parallel Languages.” In Proceedings of the 26th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, (OOPSLA), edited by Cristina Videira Lopes and Kathleen Fisher, 499–518. ACM. http://arthur.chargueraud.org/research/2011/oracle.

———. 2012. “Efficient Primitives for Creating and Scheduling Parallel Computations.” In Declarative Aspects and Applications of Multicore Programming (DAMP).

———. 2013. “Scheduling Parallel Programs by Work Stealing with Private Deques.” In Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 219–28. PPoPP ’13. ACM.

———. 2014. “Theory and Practice of Chunked Sequences.” In Algorithms - ESA 2014 - 22th Annual European Symposium. Proceedings, edited by Andreas S. Schulz and Dorothea Wagner, 8737:25–36. Lecture Notes in Computer Science. Springer.