Online / 6 & 7 February 2021


Accelerating HPC applications with Out-of-Order Commit Processors

With the end of Moore’s law, improving single-core processor performance in an energy-efficient manner can be extremely difficult. One alternative is to rethink conventional processor design methodologies and propose innovative ideas that unlock additional performance and efficiency. To overcome these difficulties, we propose a compiler-informed, non-speculative out-of-order commit processor that attacks the limitations of in-order commit in current out-of-order cores, increasing the effective instruction window and using the critical resources of the core more intelligently. We build our core on the open-source RISC-V ISA. The hardware and software ecosystem around RISC-V enables building custom hardware and experimenting with new HW/SW cooperative ideas.

While modern out-of-order processors execute instructions out of order to increase instruction-level parallelism, they retire instructions and manage their limited resources (register file, load/store queue, etc.) in program order to guarantee safe instruction retirement. However, this implementation forces instructions to wait for all preceding branches to resolve before they can release their critical resources, which leaves a significant amount of performance on the table. We propose a HW/SW co-design that enables non-speculative out-of-order commit in a lightweight manner, improving performance and efficiency. The key insight of our work is that identifying true branch dependencies lets independent instructions commit early, which can lead to higher performance. Dependency analysis shows that not all instructions depend on the most recent branch in the reorder buffer; in-order commit therefore misses opportunities to improve performance, because it holds the critical resources of independent instructions longer than necessary. In our processor, the compiler detects true branch dependencies, which enables the hardware to manage critical resources more intelligently. We also introduce a new interface between the hardware and the OS that enables precise exception handling by exposing the recent changes made by out-of-order committed instructions.
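The difference between the two commit policies can be illustrated with a toy model. The sketch below is not the design presented in the talk: the `Instr` record, the `branch_deps` hint, and the cycle numbers are hypothetical, and the model ignores exceptions, memory ordering, and all other commit conditions. It only compares when each instruction could release its resources under in-order commit versus a simplified commit rule that waits only for the branches an instruction truly depends on.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    done: int                 # cycle at which execution finishes (for a branch: resolves)
    branch_deps: tuple = ()   # hypothetical compiler hint: indices of older branches
                              # this instruction truly depends on


def in_order_release(prog):
    """Release cycle per instruction when commit follows program order."""
    cycles, latest = [], 0
    for ins in prog:
        # An instruction cannot commit before every older instruction has finished.
        latest = max(latest, ins.done)
        cycles.append(latest)
    return cycles


def dependency_aware_release(prog):
    """Release cycle when an instruction waits only for its own completion
    and for the branches listed in its (assumed) dependency hint."""
    cycles = []
    for ins in prog:
        ready = ins.done
        for b in ins.branch_deps:
            ready = max(ready, prog[b].done)
        cycles.append(ready)
    return cycles


# Hypothetical five-instruction window with one slow-to-resolve branch.
prog = [
    Instr("load r1", done=2),
    Instr("beq  r1", done=20),                   # branch resolves late
    Instr("add  r2", done=3),                    # assumed past the reconvergence point
    Instr("mul  r3", done=5, branch_deps=(1,)),  # control-dependent on the branch
    Instr("add  r4", done=4),                    # assumed independent of the branch
]

print(in_order_release(prog))          # [2, 20, 20, 20, 20]
print(dependency_aware_release(prog))  # [2, 20, 3, 20, 4]
```

In this toy window, the two instructions that do not depend on the slow branch could free their resources at cycles 3 and 4 instead of cycle 20, which is the kind of missed opportunity the dependency analysis points to.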

In our talk, we will look at the potential of our out-of-order commit core for HPC workloads. Initial studies with C-based HPC applications show promising results, and we intend to present results for a variety of additional HPC workloads to evaluate the potential of the design. We believe our HW/SW co-design might be a way to build future processors. This work will appear in the proceedings of the 26th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2021).


Ali Hajiabadi