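How to Add an "Instruction" to LLVM RISC-V (Improved)
Lab Report (Improved)
Lab Requirements
- Complete the instruction encoding design (4 points)
- Pass at least 2 self-written valid assembly test programs that produce the correct madd instruction encoding (4 points)
- Pass at least 2 self-written invalid assembly test programs with appropriate error handling (4 points)
- Pass at least 1 self-written C test program that produces the correct madd assembly instruction (4 points)
- Follow coding conventions: good style, appropriate comments, sound architecture (4 points)
Additional Work Completed
- Used a reasonably appropriate testing method
- Added matching support to the LLVM front end as well, so that inline assembly or a builtin function in a high-level language generates the corresponding assembly
Code Patches
Main module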
rt@rogerthat ~/m/llvm-project-llvmorg-13.0.0 (master)> git format-patch 3c115231b000ed300a6f0551b5b30bd549d566f3
0001-Add-builtin-__builtin_riscv_madd.patch
0002-Edit-Emit-function-type-error-and-grammer-error.patch
0003-Repair-fix-int_riscv_madd-type.patch
0004-Edit-no-changes.patch
0005-Edit-edit-comment.patch
0006-Add-test-cases-instruction-ID-case.patch
0007-Info-submodule-updated.patch
0008-Edit-format-code.patch
Submodule (llvm/lib/Target/RISCV)
rt@rogerthat ~/m/l/l/l/T/RISCV (master)> git format-patch 60950388fdb2be61593e2552100ed49186cf6b14
0001-Add-Pat-intr-mov-MADD-to-RISCVInstrInfoM.td.patch
0002-Edit-MADD-duplicated.patch
0003-Add-MADDW-MADD.patch
0004-Edit-edit-format.patch
0005-Add-specific-instruction-error.patch
0006-ADD-new-madd-pattern.patch
Usage
Installation and Build
wget https://github.com/llvm/llvm-project/archive/refs/tags/llvmorg-13.0.0.tar.gz
tar -xzf llvmorg-13.0.0.tar.gz
cd llvm-project-llvmorg-13.0.0
mkdir build install 
cd build
# (pwd) below is fish-shell command substitution; in bash or zsh, write $(pwd) instead
cmake -DLLVM_ENABLE_PROJECTS="clang" \
	-DLLVM_TARGETS_TO_BUILD="RISCV" \
	-DCMAKE_BUILD_TYPE=Debug \
	-DCMAKE_INSTALL_PREFIX=(pwd)/compile/llvm-project-llvmorg-13.0.0/install \
	../llvm
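make -j3  # keep parallelism low to avoid running out of memory: even with 32 GB of RAM and 8 cores, the link step can easily exhaust memory
Applying the patches
git apply $diff
or
git am *.patch
Note that there are two modules (the main repository and the llvm/lib/Target/RISCV submodule), and each has its own patch series.
Instruction Encoding
Different .td files correspond to different instruction set extensions. After some reading, I decided to add the madd instruction to the M extension.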
// llvm/lib/Target/RISCV/RISCVInstrInfoM.td
// customized mul+add instruction
def MADD    : ALU_rr<0b1111111, 0b111, "madd">,
              Sched<[WriteIALU, ReadIALU, ReadIALU]>;
// customized 64 bit version
def MADDW   : ALUW_rr<0b1111111, 0b111, "maddw">,
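              Sched<[WriteIALU, ReadIALU, ReadIALU]>;
The instruction inherits from ALU_rr. Note that the class carries properties such as hasSideEffects; during inheritance, the build checks whether they conflict. Each property has its own meaning.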
// llvm/lib/Target/RISCV/RISCVInstrInfo.td
let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in
class ALU_rr<bits<7> funct7, bits<3> funct3, string opcodestr>
    : RVInstR<funct7, funct3, OPC_OP, (outs GPR:$rd), (ins GPR:$rs1, GPR:$rs2),
              opcodestr, "$rd, $rs1, $rs2">;
// llvm/lib/Target/RISCV/RISCVInstrFormats.td
class RVInstR<bits<7> funct7, bits<3> funct3, RISCVOpcode opcode, dag outs,
              dag ins, string opcodestr, string argstr>
    : RVInst<outs, ins, opcodestr, argstr, [], InstFormatR> {
  bits<5> rs2;
  bits<5> rs1;
  bits<5> rd;
  let Inst{31-25} = funct7;
  let Inst{24-20} = rs2;
  let Inst{19-15} = rs1;
  let Inst{14-12} = funct3;
  let Inst{11-7} = rd;
  let Opcode = opcode.Value;
}
The opcode is OPC_OP, funct7 is 0b1111111, and funct3 is 0b111. Each register operand is encoded in 5 bits and is constrained to the GPR register class.
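To make the bit layout concrete, the following Python sketch packs the RVInstR fields by hand and prints the bytes in the little-endian order that llvm-mc -show-encoding uses. It is only an illustration, not part of the patches: the helper names (encode_r_type, encode_madd, le_bytes) and the small register table are made up for this example, and OPC_OP is assumed to be the standard RISC-V OP major opcode 0b0110011.

```python
# Minimal model of the R-type layout described by RVInstR.
OPC_OP = 0b0110011  # assumed: standard RISC-V OP major opcode

# ABI register names -> x-register numbers (only the ones used in the tests).
REG = {"zero": 0, "ra": 1, "a0": 10, "a1": 11, "a2": 12}


def encode_r_type(funct7, rs2, rs1, funct3, rd, opcode=OPC_OP):
    """Pack the fields into a 32-bit word, mirroring the Inst{...} assignments."""
    return (
        (funct7 & 0x7F) << 25
        | (rs2 & 0x1F) << 20
        | (rs1 & 0x1F) << 15
        | (funct3 & 0x7) << 12
        | (rd & 0x1F) << 7
        | (opcode & 0x7F)
    )


def encode_madd(rd, rs1, rs2):
    """Encode `madd rd, rs1, rs2` with funct7=0b1111111 and funct3=0b111."""
    return encode_r_type(0b1111111, REG[rs2], REG[rs1], 0b111, REG[rd])


def le_bytes(word):
    """Return the word as a little-endian list of four byte values."""
    return list(word.to_bytes(4, "little"))


print([hex(b) for b in le_bytes(encode_madd("a0", "a1", "a2"))])
print([hex(b) for b in le_bytes(encode_madd("ra", "ra", "ra"))])
```

The two printed byte lists are 0x33, 0xf5, 0xc5, 0xfe and 0xb3, 0xf0, 0x10, 0xfe, which is a quick way to derive the expected encodings for the assembler tests before writing the CHECK lines.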
Instruction Pattern
// Add New MADD Pattern: select rs1 + (rs1 * rs2) into a single MADD
def : Pat<(add GPR:$rs1, (mul GPR:$rs1, GPR:$rs2)), (MADD GPR:$rs1, GPR:$rs2)>;
Experimental Results / Tests
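llvm-lit tests
- Pass at least 2 self-written valid assembly test programs that produce the correct madd instruction encoding (4 points)
- Pass at least 2 self-written invalid assembly test programs with appropriate error handling (4 points)
Valid instruction encoding tests
llvm/test/MC/RISCV/rv32m-valid.s
The last two instructions are the newly added test cases.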
# RUN: llvm-mc %s -triple=riscv32 -mattr=+m -riscv-no-aliases -show-encoding \
# RUN:     | FileCheck -check-prefixes=CHECK-ASM,CHECK-ASM-AND-OBJ %s
# RUN: llvm-mc %s -triple=riscv64 -mattr=+m -riscv-no-aliases -show-encoding \
# RUN:     | FileCheck -check-prefixes=CHECK-ASM,CHECK-ASM-AND-OBJ %s
# RUN: llvm-mc -filetype=obj -triple=riscv32 -mattr=+m < %s \
# RUN:     | llvm-objdump --mattr=+m -M no-aliases -d -r - \
# RUN:     | FileCheck --check-prefix=CHECK-ASM-AND-OBJ %s
# RUN: llvm-mc -filetype=obj -triple=riscv64 -mattr=+m < %s \
# RUN:     | llvm-objdump --mattr=+m -M no-aliases -d -r - \
# RUN:     | FileCheck --check-prefix=CHECK-ASM-AND-OBJ %s
# CHECK-ASM-AND-OBJ: mul a4, ra, s0
# CHECK-ASM: encoding: [0x33,0x87,0x80,0x02]
mul a4, ra, s0
# CHECK-ASM-AND-OBJ: mulh ra, zero, zero
# CHECK-ASM: encoding: [0xb3,0x10,0x00,0x02]
mulh x1, x0, x0
# CHECK-ASM-AND-OBJ: mulhsu t0, t2, t1
# CHECK-ASM: encoding: [0xb3,0xa2,0x63,0x02]
mulhsu t0, t2, t1
# CHECK-ASM-AND-OBJ: mulhu a5, a4, a3
# CHECK-ASM: encoding: [0xb3,0x37,0xd7,0x02]
mulhu a5, a4, a3
# CHECK-ASM-AND-OBJ: div s0, s0, s0
# CHECK-ASM: encoding: [0x33,0x44,0x84,0x02]
div s0, s0, s0
# CHECK-ASM-AND-OBJ: divu gp, a0, a1
# CHECK-ASM: encoding: [0xb3,0x51,0xb5,0x02]
divu gp, a0, a1
# CHECK-ASM-AND-OBJ: rem s2, s2, s8
# CHECK-ASM: encoding: [0x33,0x69,0x89,0x03]
rem s2, s2, s8
# CHECK-ASM-AND-OBJ: remu s2, s2, s8
# CHECK-ASM: encoding: [0x33,0x79,0x89,0x03]
remu x18, x18, x24
# CHECK-ASM-AND-OBJ: madd a0, a1, a2
# CHECK-ASM: encoding: [0x33,0xf5,0xc5,0xfe] 
madd a0, a1, a2
# CHECK-ASM-AND-OBJ: madd ra, ra, ra
# CHECK-ASM: encoding: [0xb3,0xf0,0x10,0xfe]
madd ra, ra, ra
Test result
rt@rogerthat ~/m/llvm-project-llvmorg-13.0.0 (master)> llvm-lit llvm/test/MC/RISCV/rv32m-valid.s 
-- Testing: 1 tests, 1 workers --
PASS: LLVM :: MC/RISCV/rv32m-valid.s (1 of 1)
Testing Time: 0.11s
  Passed: 1
Invalid instruction encoding tests
llvm/test/MC/RISCV/rv32m-invalid.s
The maddw line and the last four madd lines are the newly added test cases.
# RUN: not llvm-mc -triple riscv32 -mattr=+m < %s 2>&1 | FileCheck %s
# RV64M instructions can't be used for RV32
mulw ra, sp, gp # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set
divw tp, t0, t1 # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set
divuw t2, s0, s2 # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set
remw a0, a1, a2 # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set
remuw a3, a4, a5 # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set
maddw a0, a1, a2 # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set
# RV32M operands can't be Imm or float regs
madd a0, a1, 1  # CHECK: :[[@LINE]]:14: error: invalid Imm operand for instruction: gpr is needed
madd a0, 1, a1  # CHECK: :[[@LINE]]:10: error: invalid Imm operand for instruction: gpr is needed
madd a0, f0, a1 # CHECK: :[[@LINE]]:10: error: invalid rs1 for instruction: gpr is needed
madd a0, a1, f1 # CHECK: :[[@LINE]]:14: error: invalid rs2 for instruction: gpr is needed
Test result
rt@rogerthat ~/m/llvm-project-llvmorg-13.0.0 (master)> llvm-lit llvm/test/MC/RISCV/rv32m-invalid.s 
-- Testing: 1 tests, 1 workers --
PASS: LLVM :: MC/RISCV/rv32m-invalid.s (1 of 1)
Testing Time: 0.03s
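  Passed: 1
C test program
- Pass at least 1 self-written C test program that produces the correct madd assembly instruction (4 points)
- Add matching support to the LLVM front end as well, so that inline assembly or a builtin function in a high-level language generates the corresponding assembly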
int main() {
    asm("madd x1, x1, x1");
    int a = 1, b = 2, c;
    c=__builtin_riscv_madd(a,b);
    c = a*b+b;
    return 0;
}
As you can see, this tests both inline assembly and the builtin function.
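As a sanity check on what this test program should compute, here is a small Python model. It assumes the semantics implied by the selection pattern, rd = rs1 + rs1 * rs2 with 32-bit wraparound, and it assumes the builtin passes its two arguments as rs1 and rs2 in order. Neither assumption is stated as a specification in this report, and the function names are only illustrative.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Wrap an arbitrary Python int to a signed 32-bit value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def madd(rs1, rs2):
    """Assumed semantics from the pattern: rd = rs1 + rs1 * rs2 (mod 2**32)."""
    return to_signed32(rs1 + rs1 * rs2)


a, b = 1, 2
print(madd(a, b))  # builtin call, assuming rs1 = a and rs2 = b
print(madd(b, a))  # a*b + b matches the pattern with rs1 = b and rs2 = a
```

Under these assumptions `__builtin_riscv_madd(a, b)` yields 3 while `a*b+b` yields 4 for a = 1, b = 2, so the two statements in the test are not interchangeable even though both lower to madd.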
Compile options: compile only, without linking, to produce an object file
clang --target=riscv32 -march=rv32g -c -o riscv-test.o riscv-test.c
rt@rogerthat ~/m/test> llvm-objdump -d riscv-test.o
riscv-test.o:   file format elf32-littleriscv
Disassembly of section .text:
00000000 <main>:
       0: 13 01 01 fe   addi    sp, sp, -32
       4: 23 2e 11 00   sw      ra, 28(sp)
       8: 23 2c 81 00   sw      s0, 24(sp)
       c: 13 04 01 02   addi    s0, sp, 32
      10: 13 05 00 00   mv      a0, zero
      14: 23 2a a4 fe   sw      a0, -12(s0)
      18: b3 f0 10 fe   madd    ra, ra, ra
      1c: 93 05 10 00   addi    a1, zero, 1
      20: 23 28 b4 fe   sw      a1, -16(s0)
      24: 93 05 20 00   addi    a1, zero, 2
      28: 23 26 b4 fe   sw      a1, -20(s0)
      2c: 83 25 04 ff   lw      a1, -16(s0)
      30: 03 26 c4 fe   lw      a2, -20(s0)
      34: b3 f5 c5 fe   madd    a1, a1, a2
      38: 23 24 b4 fe   sw      a1, -24(s0)
      3c: 03 26 04 ff   lw      a2, -16(s0)
      40: 83 25 c4 fe   lw      a1, -20(s0)
      44: b3 f5 c5 fe   madd    a1, a1, a2
      48: 23 24 b4 fe   sw      a1, -24(s0)
      4c: 03 24 81 01   lw      s0, 24(sp)
      50: 83 20 c1 01   lw      ra, 28(sp)
      54: 13 01 01 02   addi    sp, sp, 32
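      58: 67 80 00 00   ret
The disassembly shows that the program compiles correctly: the inline assembly, the builtin call on a and b, and the expression a*b+b (through the DAG selection pattern) each produce a madd instruction.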
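To double-check the disassembly by hand, the byte column of the madd lines can be decoded back into R-type fields. The sketch below is a standalone illustration (decode_r_type is a made-up helper, not an LLVM API); it assumes the bytes are listed in little-endian memory order, as llvm-objdump prints them for this target.

```python
def decode_r_type(hex_bytes):
    """Decode a little-endian byte string such as 'b3 f5 c5 fe' into R-type fields."""
    word = int.from_bytes(bytes.fromhex(hex_bytes), "little")
    return {
        "funct7": (word >> 25) & 0x7F,
        "rs2": (word >> 20) & 0x1F,
        "rs1": (word >> 15) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rd": (word >> 7) & 0x1F,
        "opcode": word & 0x7F,
    }


# Byte columns copied from the madd lines of the disassembly.
for raw in ("b3 f0 10 fe", "b3 f5 c5 fe"):
    print(raw, decode_r_type(raw))
```

Both words decode to funct7 = 0x7f, funct3 = 7, and opcode = 0x33; the second one has rd = 11, rs1 = 11, rs2 = 12, which corresponds to madd a1, a1, a2.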
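Additional Notes
Troubleshooting
- The build is killed with signal 9 -> out of memory or disk space. The link step creates files in /tmp; if memory is insufficient, add swap space; if disk space is insufficient, extend the logical volume mounted at /tmp (where feasible).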
Unless otherwise stated, all articles on this blog are licensed under CC BY-NC-SA 4.0. Please credit R0gerThat when reposting!