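Lab Report (Revised)

Lab Requirements

  • Complete the instruction encoding design (4 points)
  • Pass at least two self-written valid assembly test programs that produce the correct madd instruction encoding (4 points)
  • Pass at least two self-written invalid assembly test programs that trigger appropriate error handling (4 points)
  • Pass at least one self-written C test program that produces the correct madd assembly instruction (4 points)
  • Follow coding conventions: good style, appropriate comments, and a sound architecture (4 points)

Additional Work Completed

  • Used a suitable testing method (llvm-lit regression tests)
  • Added matching support to the LLVM front end, so that inline assembly or a builtin function in a high-level language generates the corresponding assembly

Code Patches

Main Module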

rt@rogerthat ~/m/llvm-project-llvmorg-13.0.0 (master)> git format-patch 3c115231b000ed300a6f0551b5b30bd549d566f3
0001-Add-builtin-__builtin_riscv_madd.patch
0002-Edit-Emit-function-type-error-and-grammer-error.patch
0003-Repair-fix-int_riscv_madd-type.patch
0004-Edit-no-changes.patch
0005-Edit-edit-comment.patch
0006-Add-test-cases-instruction-ID-case.patch
0007-Info-submodule-updated.patch
0008-Edit-format-code.patch

Submodule (llvm/lib/Target/RISCV)

rt@rogerthat ~/m/l/l/l/T/RISCV (master)> git format-patch 60950388fdb2be61593e2552100ed49186cf6b14
0001-Add-Pat-intr-mov-MADD-to-RISCVInstrInfoM.td.patch
0002-Edit-MADD-duplicated.patch
0003-Add-MADDW-MADD.patch
0004-Edit-edit-format.patch
0005-Add-specific-instruction-error.patch
0006-ADD-new-madd-pattern.patch

Usage

Installation and Build

wget https://github.com/llvm/llvm-project/archive/refs/tags/llvmorg-13.0.0.tar.gz
tar -xzf llvmorg-13.0.0.tar.gz
cd llvm-project-llvmorg-13.0.0
mkdir build install 
cd build
cmake -DLLVM_ENABLE_PROJECTS="clang" \
	-DLLVM_TARGETS_TO_BUILD="RISCV" \
	-DCMAKE_BUILD_TYPE=Debug \
	-DCMAKE_INSTALL_PREFIX=(pwd)/compile/llvm-project-llvmorg-13.0.0/install \
	../llvm
make -j3 # keep parallelism low: on a machine with 32 GB of RAM, building with 8 jobs easily runs out of memory at link time

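Applying the Patches

git apply $diff

or

git am *.patch

Note that there are two modules (the main repository and the llvm/lib/Target/RISCV submodule), and each has its own patch series.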

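Instruction Encoding

Different files under llvm/lib/Target/RISCV correspond to different instruction set extensions. After reviewing them, I decided to add the madd instruction to the M extension.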

// llvm/lib/Target/RISCV/RISCVInstrInfoM.td
// customized mul+add instruction
def MADD    : ALU_rr<0b1111111, 0b111, "madd">,
              Sched<[WriteIALU, ReadIALU, ReadIALU]>;
// customized 64 bit version
def MADDW   : ALUW_rr<0b1111111, 0b111, "maddw">,
              Sched<[WriteIALU, ReadIALU, ReadIALU]>;

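The instruction inherits from ALU_rr. Note that this class sets several properties, such as hasSideEffects; when an instruction inherits them, the build checks them for conflicts. Each property has its own meaning.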

// llvm/lib/Target/RISCV/RISCVInstrInfo.td
let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in
class ALU_rr<bits<7> funct7, bits<3> funct3, string opcodestr>
    : RVInstR<funct7, funct3, OPC_OP, (outs GPR:$rd), (ins GPR:$rs1, GPR:$rs2),
              opcodestr, "$rd, $rs1, $rs2">;
// llvm/lib/Target/RISCV/RISCVInstrFormats.td
class RVInstR<bits<7> funct7, bits<3> funct3, RISCVOpcode opcode, dag outs,
              dag ins, string opcodestr, string argstr>
    : RVInst<outs, ins, opcodestr, argstr, [], InstFormatR> {
  bits<5> rs2;
  bits<5> rs1;
  bits<5> rd;

  let Inst{31-25} = funct7;
  let Inst{24-20} = rs2;
  let Inst{19-15} = rs1;
  let Inst{14-12} = funct3;
  let Inst{11-7} = rd;
  let Opcode = opcode.Value;
}

The opcode is OPC_OP, funct7 is 0b1111111, funct3 is 0b111, and each register is encoded in 5 bits (operands are constrained to GPR registers).
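To make the bit layout concrete, here is a minimal Python sketch that packs the R-type fields into a 32-bit word and prints the little-endian bytes that llvm-mc reports with -show-encoding. The function names are hypothetical helpers for illustration, not part of LLVM. The value 0b0110011 for OPC_OP is the standard RISC-V OP major opcode.

```python
# Illustrative sketch: pack R-type fields into a 32-bit instruction word.
OPC_OP = 0b0110011  # standard RISC-V OP major opcode


def encode_r_type(funct7, rs2, rs1, funct3, rd, opcode=OPC_OP):
    """Return the 32-bit word for an R-type instruction."""
    fields = (("funct7", funct7, 7), ("rs2", rs2, 5), ("rs1", rs1, 5),
              ("funct3", funct3, 3), ("rd", rd, 5), ("opcode", opcode, 7))
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
    # Same layout as RVInstR: Inst{31-25}, {24-20}, {19-15}, {14-12}, {11-7}, opcode.
    return ((funct7 << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | opcode)


def little_endian_bytes(word):
    """Split a 32-bit word into bytes in the order the encoding is printed."""
    return list(word.to_bytes(4, "little"))


if __name__ == "__main__":
    # madd a0, a1, a2  (a0 = x10, a1 = x11, a2 = x12)
    word = encode_r_type(0b1111111, rs2=12, rs1=11, funct3=0b111, rd=10)
    print([hex(b) for b in little_endian_bytes(word)])
```

With funct7 and funct3 fixed, only the three register numbers vary, which is why the two madd test encodings differ only in the register bits.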

Instruction Pattern

// Add New MADD Pattern
def : Pat<(add GPR:$rs1, (mul GPR:$rs1, GPR:$rs2)), (MADD GPR:$rs1, GPR:$rs2)>;
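Read literally, this pattern replaces the DAG rs1 + rs1 * rs2 with a single MADD whose inputs are rs1 and rs2. The following Python model is a sketch of that reading only. It assumes 32-bit wraparound (XLEN = 32), and madd_model is a hypothetical name. It does not describe any hardware implementation.

```python
def madd_model(rs1, rs2, xlen=32):
    """Model of the value the pattern assigns to MADD: rs1 + rs1 * rs2, wrapped to xlen bits."""
    mask = (1 << xlen) - 1
    return (rs1 + rs1 * rs2) & mask


if __name__ == "__main__":
    a, b = 1, 2
    # For the C expression a * b + b, the multiplicand shared with the add is b,
    # so b plays the role of rs1 and a the role of rs2.
    print(madd_model(b, a), a * b + b)
```

Because add and mul are commutative, an expression such as a * b + b can still match the pattern once the shared operand is placed in rs1.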

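Results and Test Programs

llvm-lit Tests

  • Pass at least two self-written valid assembly test programs that produce the correct madd instruction encoding (4 points)
  • Pass at least two self-written invalid assembly test programs that trigger appropriate error handling (4 points)

Valid Instruction Encoding Tests

llvm/test/MC/RISCV/rv32m-valid.s

The last two instructions are the newly added test cases.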

# RUN: llvm-mc %s -triple=riscv32 -mattr=+m -riscv-no-aliases -show-encoding \
# RUN:     | FileCheck -check-prefixes=CHECK-ASM,CHECK-ASM-AND-OBJ %s
# RUN: llvm-mc %s -triple=riscv64 -mattr=+m -riscv-no-aliases -show-encoding \
# RUN:     | FileCheck -check-prefixes=CHECK-ASM,CHECK-ASM-AND-OBJ %s
# RUN: llvm-mc -filetype=obj -triple=riscv32 -mattr=+m < %s \
# RUN:     | llvm-objdump --mattr=+m -M no-aliases -d -r - \
# RUN:     | FileCheck --check-prefix=CHECK-ASM-AND-OBJ %s
# RUN: llvm-mc -filetype=obj -triple=riscv64 -mattr=+m < %s \
# RUN:     | llvm-objdump --mattr=+m -M no-aliases -d -r - \
# RUN:     | FileCheck --check-prefix=CHECK-ASM-AND-OBJ %s

# CHECK-ASM-AND-OBJ: mul a4, ra, s0
# CHECK-ASM: encoding: [0x33,0x87,0x80,0x02]
mul a4, ra, s0
# CHECK-ASM-AND-OBJ: mulh ra, zero, zero
# CHECK-ASM: encoding: [0xb3,0x10,0x00,0x02]
mulh x1, x0, x0
# CHECK-ASM-AND-OBJ: mulhsu t0, t2, t1
# CHECK-ASM: encoding: [0xb3,0xa2,0x63,0x02]
mulhsu t0, t2, t1
# CHECK-ASM-AND-OBJ: mulhu a5, a4, a3
# CHECK-ASM: encoding: [0xb3,0x37,0xd7,0x02]
mulhu a5, a4, a3
# CHECK-ASM-AND-OBJ: div s0, s0, s0
# CHECK-ASM: encoding: [0x33,0x44,0x84,0x02]
div s0, s0, s0
# CHECK-ASM-AND-OBJ: divu gp, a0, a1
# CHECK-ASM: encoding: [0xb3,0x51,0xb5,0x02]
divu gp, a0, a1
# CHECK-ASM-AND-OBJ: rem s2, s2, s8
# CHECK-ASM: encoding: [0x33,0x69,0x89,0x03]
rem s2, s2, s8
# CHECK-ASM-AND-OBJ: remu s2, s2, s8
# CHECK-ASM: encoding: [0x33,0x79,0x89,0x03]
remu x18, x18, x24

# CHECK-ASM-AND-OBJ: madd a0, a1, a2
# CHECK-ASM: encoding: [0x33,0xf5,0xc5,0xfe] 
madd a0, a1, a2
# CHECK-ASM-AND-OBJ: madd ra, ra, ra
# CHECK-ASM: encoding: [0xb3,0xf0,0x10,0xfe]
madd ra, ra, ra

Test Results

rt@rogerthat ~/m/llvm-project-llvmorg-13.0.0 (master)> llvm-lit llvm/test/MC/RISCV/rv32m-valid.s 
-- Testing: 1 tests, 1 workers --
PASS: LLVM :: MC/RISCV/rv32m-valid.s (1 of 1)

Testing Time: 0.11s
  Passed: 1

Invalid Instruction Tests

llvm/test/MC/RISCV/rv32m-invalid.s

The maddw check and the last four madd instructions are the newly added test cases.

# RUN: not llvm-mc -triple riscv32 -mattr=+m < %s 2>&1 | FileCheck %s

# RV64M instructions can't be used for RV32
mulw ra, sp, gp # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set
divw tp, t0, t1 # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set
divuw t2, s0, s2 # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set
remw a0, a1, a2 # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set
remuw a3, a4, a5 # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set

maddw a0, a1, a2 # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set

# RV32M operands can't be Imm or float regs
madd a0, a1, 1  # check: :[[@line]]:14: error: invalid Imm operand for instruction: gpr is needed 
madd a0, 1, a1  # check: :[[@line]]:10: error: invalid Imm operand for instruction: gpr is needed 
madd a0, f0, a1 # check: :[[@line]]:10: error: invalid rs1 for instruction: gpr is needed
madd a0, a1, f1 # check: :[[@line]]:14: error: invalid rs2 for instruction: gpr is needed

Test Results

rt@rogerthat ~/m/llvm-project-llvmorg-13.0.0 (master)> llvm-lit llvm/test/MC/RISCV/rv32m-invalid.s 
-- Testing: 1 tests, 1 workers --
PASS: LLVM :: MC/RISCV/rv32m-invalid.s (1 of 1)

Testing Time: 0.03s
  Passed: 1
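The column numbers in the madd test lines (14 for the third operand, 10 for the second) are 1-based positions of the offending operand within the source line. The Python sketch below only illustrates how such a position can be derived and how a non-GPR operand can be spotted. It is not how LLVM's assembly parser is implemented, and first_operand_error and its message text are hypothetical.

```python
import re

# Names accepted as general-purpose registers (x0-x31 plus ABI names).
GPR_NAMES = {f"x{i}" for i in range(32)} | {
    "zero", "ra", "sp", "gp", "tp", "fp",
    *(f"t{i}" for i in range(7)),
    *(f"s{i}" for i in range(12)),
    *(f"a{i}" for i in range(8)),
}


def first_operand_error(line):
    """Return (column, message) for the first non-GPR operand, or None if all are GPRs."""
    code = line.split("#", 1)[0]          # drop the trailing comment, keep positions
    tokens = list(re.finditer(r"[^\s,]+", code))
    if not tokens:
        return None                        # blank or comment-only line
    operands = tokens[1:]                  # tokens[0] is the mnemonic
    if len(operands) != 3:
        return (tokens[0].start() + 1, "expected three operands")
    for tok in operands:
        if tok.group() not in GPR_NAMES:
            return (tok.start() + 1, "gpr is needed")
    return None


if __name__ == "__main__":
    for src in ("madd a0, a1, 1", "madd a0, f0, a1", "madd a0, a1, a2"):
        print(src, "->", first_operand_error(src))
```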

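C Test Program

  • Pass at least one self-written C test program that produces the correct madd assembly instruction (4 points)

  • Added matching support to the LLVM front end, so that inline assembly or a builtin function in a high-level language generates the corresponding assembly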

int main() {
    asm("madd x1, x1, x1");
    int a = 1, b = 2, c;
    c=__builtin_riscv_madd(a,b);
    c = a*b+b;
    return 0;
}

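The program exercises both inline assembly and the builtin function __builtin_riscv_madd, as well as the plain expression a*b+b, which instruction selection is expected to map to madd.

Compile without linking, producing only an object file: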

clang --target=riscv32 -march=rv32g -c -o riscv-test.o riscv-test.c

rt@rogerthat ~/m/test> llvm-objdump -d riscv-test.o

riscv-test.o:   file format elf32-littleriscv

Disassembly of section .text:

00000000 <main>:
       0: 13 01 01 fe   addi    sp, sp, -32
       4: 23 2e 11 00   sw      ra, 28(sp)
       8: 23 2c 81 00   sw      s0, 24(sp)
       c: 13 04 01 02   addi    s0, sp, 32
      10: 13 05 00 00   mv      a0, zero
      14: 23 2a a4 fe   sw      a0, -12(s0)
      18: b3 f0 10 fe   madd    ra, ra, ra
      1c: 93 05 10 00   addi    a1, zero, 1
      20: 23 28 b4 fe   sw      a1, -16(s0)
      24: 93 05 20 00   addi    a1, zero, 2
      28: 23 26 b4 fe   sw      a1, -20(s0)
      2c: 83 25 04 ff   lw      a1, -16(s0)
      30: 03 26 c4 fe   lw      a2, -20(s0)
      34: b3 f5 c5 fe   madd    a1, a1, a2
      38: 23 24 b4 fe   sw      a1, -24(s0)
      3c: 03 26 04 ff   lw      a2, -16(s0)
      40: 83 25 c4 fe   lw      a1, -20(s0)
      44: b3 f5 c5 fe   madd    a1, a1, a2
      48: 23 24 b4 fe   sw      a1, -24(s0)
      4c: 03 24 81 01   lw      s0, 24(sp)
      50: 83 20 c1 01   lw      ra, 28(sp)
      54: 13 01 01 02   addi    sp, sp, 32
      58: 67 80 00 00   ret

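The disassembly shows that the code was generated correctly: the inline assembly (offset 0x18), the builtin call on a and b (offset 0x34), and the expression a*b+b selected through the DAG pattern (offset 0x44) each produce a madd instruction.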
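As a cross-check of the disassembly, the raw bytes can be decoded by hand. The Python sketch below splits the four bytes of an R-type instruction back into its fields. decode_r_type is a hypothetical helper, and the sample bytes b3 f5 c5 fe are taken from offset 0x34 in the listing above.

```python
def decode_r_type(raw_bytes):
    """Split four little-endian instruction bytes into R-type fields."""
    word = int.from_bytes(bytes(raw_bytes), "little")
    return {
        "funct7": (word >> 25) & 0x7F,
        "rs2": (word >> 20) & 0x1F,
        "rs1": (word >> 15) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rd": (word >> 7) & 0x1F,
        "opcode": word & 0x7F,
    }


if __name__ == "__main__":
    fields = decode_r_type(bytes.fromhex("b3f5c5fe"))
    # Expect funct7 = 0b1111111 and funct3 = 0b111 (madd),
    # rd = x11 (a1), rs1 = x11 (a1), rs2 = x12 (a2).
    print(fields)
```

The decoded register numbers agree with the objdump text madd a1, a1, a2.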

References

  1. How to Write an LLVM Backend

  2. LLVM Testing Framework

  3. Add intrinsic for Zbb extension

  4. LLVM COMMAND

Further Reading

  1. How to run Clang-LLVM for RISCV target machine for Bitmanip extension

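Troubleshooting

  1. The build is killed with signal 9: this happens when memory or disk space runs out. The link step creates files in /tmp. If memory is insufficient, add swap space; if disk space is insufficient, enlarge the logical volume mounted at /tmp (if that is feasible).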