HeteroCL: A Multi-Paradigm Programming Infrastructure for Software-Defined Reconfigurable Computing
Website | Installation | Tutorials | Samples | Documentation
Introduction
With the pursuit of improving compute performance under strict power constraints, there is an increasing need to deploy applications on heterogeneous hardware architectures with accelerators such as GPUs and FPGAs. However, although these heterogeneous computing platforms are becoming widely available, they are very difficult to program, especially FPGAs. As a result, the use of such platforms has been limited to a small subset of programmers with specialized hardware knowledge.
To tackle this challenge, we introduce HeteroCL, a programming infrastructure composed of a Python-based domain-specific language (DSL) and a compilation flow. The HeteroCL DSL provides a clean programming abstraction that decouples algorithm specification from three important types of hardware customization: compute, data types, and memory architectures. HeteroCL can further capture the interdependence among these customization techniques, allowing programmers to explore performance/area/accuracy trade-offs in a systematic and productive manner. In addition, our framework currently provides two advanced domain-specific optimizations, stencil analysis and systolic array generation, which produce highly efficient microarchitectures for accelerating popular workloads from the image processing and deep learning domains.
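To make the idea of decoupling concrete, here is a minimal sketch in plain Python. It is not HeteroCL code and does not use the HeteroCL API; the names `quantize`, `blur3`, `max_error`, and `SIGNAL` are hypothetical and exist only for this illustration. The algorithm (a 3-point moving average, which is a tiny 1-D stencil) is written once, and the data type is supplied separately, so different fixed-point widths can be compared for accuracy without touching the algorithm.

```python
# Toy illustration in plain Python; none of this is the HeteroCL API.
from functools import partial


def quantize(x, total_bits, frac_bits):
    """Round x to a signed fixed-point value, saturating on overflow."""
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))       # smallest representable integer code
    hi = (1 << (total_bits - 1)) - 1    # largest representable integer code
    code = max(lo, min(hi, round(x * scale)))
    return code / scale


def blur3(signal, cast=float):
    """Algorithm only: 3-point moving average over the interior points.

    `cast` is the data-type customization, applied to every output value.
    """
    return [
        cast((signal[i - 1] + signal[i] + signal[i + 1]) / 3)
        for i in range(1, len(signal) - 1)
    ]


def max_error(signal, total_bits, frac_bits):
    """Largest absolute deviation of the fixed-point result from floating point."""
    reference = blur3(signal)
    cast = partial(quantize, total_bits=total_bits, frac_bits=frac_bits)
    approx = blur3(signal, cast=cast)
    return max(abs(r - a) for r, a in zip(reference, approx))


SIGNAL = [0.1, 0.5, 0.9, 0.3, 0.7]

if __name__ == "__main__":
    # Same algorithm, two data-type choices (assumed widths, for illustration).
    for total_bits, frac_bits in [(16, 12), (8, 4)]:
        err = max_error(SIGNAL, total_bits, frac_bits)
        print(f"fixed<{total_bits},{frac_bits}>: max abs error = {err:.6f}")
```

Narrower types trade accuracy for area, which is one axis of the trade-off space described above. In HeteroCL itself, such customizations are expressed through the DSL rather than by passing a `cast` function as in this sketch.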
Current Compilation Flow

Evaluation on AWS F1 (Xilinx Virtex UltraScale+ VU9P FPGA)
The speedup is measured against single-core, single-thread CPU execution on AWS F1.
| Benchmark | Data Sizes | Type | #LUTs | #FFs | #BRAMs | #DSPs | Frequency (MHz) | Speedup | Back End |
|---|---|---|---|---|---|---|---|---|---|
| KNN Digit Recognition (image classification) | K=3, #images=1800 | uint49 | 4009 | 5835 | 88 | 0 | 250 | 12.5 | General |
| K-Means (clustering) | K=16, #elem=320 x 32 | int32 | 212708 | 235011 | 32 | 1536 | 190.6 | 16.0 | General |
| Smith-Waterman (genomic sequencing) | string len=128 | uint2 | 110841 | 88369 | 1409 | 0 | 152.2 | 20.9 | General |
| Seidel (image processing) | 2160 pixel x 3840 pixel | fixed16 | 21719 | 31663 | 46 | 96 | 250 | 5.9 | Stencil |
| Gaussian (image processing) | 2160 pixel x 3840 pixel | fixed16 | 70833 | 131160 | 46 | 688 | 250 | 13.2 | Stencil |
| Jacobi (linear algebra) | 2160 pixel x 3840 pixel | fixed16 | 14883 | 22485 | 46 | 48 | 250 | 5.0 | Stencil |
| GEMM (matrix multiplication) | 1024 x 1024 x 1024 | fixed16 | 454492 | 800283 | 932 | 2507 | 236.8 | 8.9 | Systolic Array |
| LeNet Inference (CNN) | MNIST | fixed16 | 362291 | 660186 | 739.5 | 1368 | 250 | 10.6 | Systolic Array |

Publication
If you use HeteroCL in your design, please cite our FPGA'19 paper:
@article{lai2019heterocl,
  title={HeteroCL: A Multi-Paradigm Programming Infrastructure for Software-Defined Reconfigurable Computing},
  author={Lai, Yi-Hsiang and Chi, Yuze and Hu, Yuwei and Wang, Jie and Yu, Cody Hao and Zhou, Yuan and Cong, Jason and Zhang, Zhiru},
  journal={Int'l Symp. on Field-Programmable Gate Arrays (FPGA)},
  year={2019}
}

Related Work
HeteroCL is a Python-based DSL extended from TVM, and it extends Halide IR as its intermediate representation. HeteroCL incorporates the SODA framework, the PolySA framework, and the Merlin Compiler for FPGA back-end generation.
Stencil with Optimized Dataflow Architecture (SODA)
Polyhedral-Based Systolic Array Auto-Compilation (PolySA)
Merlin Compiler
Halide
TVM

Contributing to HeteroCL
Use pull requests.
Follow the Python coding style.
Follow the Python docstring style.