Abstract: The rising popularity of deep learning algorithms demands specialized accelerators for matrix-matrix multiplication. Most matrix multipliers are based on the systolic array ...