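How to Add an "Instruction" to the LLVM RISC-V Backend (Improved)
Lab Report (Improved)
Lab Requirements
- Complete the instruction encoding design (4 points)
- Write at least 2 valid assembly test programs that produce the correct madd instruction encoding (4 points)
- Write at least 2 invalid assembly test programs that trigger appropriate error handling (4 points)
- Write at least 1 C test program that produces the correct madd assembly instruction (4 points)
- Follow coding conventions: good style, appropriate comments, sound architecture (4 points)
Extra work completed
- Used a reasonably appropriate testing method
- Front-end support as well: inline assembly or a builtin function in a high-level language generates the corresponding assembly
Code Patches
Main module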
rt@rogerthat ~/m/llvm-project-llvmorg-13.0.0 (master)> git format-patch 3c115231b000ed300a6f0551b5b30bd549d566f3
0001-Add-builtin-__builtin_riscv_madd.patch
0002-Edit-Emit-function-type-error-and-grammer-error.patch
0003-Repair-fix-int_riscv_madd-type.patch
0004-Edit-no-changes.patch
0005-Edit-edit-comment.patch
0006-Add-test-cases-instruction-ID-case.patch
0007-Info-submodule-updated.patch
0008-Edit-format-code.patch
Submodule (llvm/lib/Target/RISCV)
rt@rogerthat ~/m/l/l/l/T/RISCV (master)> git format-patch 60950388fdb2be61593e2552100ed49186cf6b14
0001-Add-Pat-intr-mov-MADD-to-RISCVInstrInfoM.td.patch
0002-Edit-MADD-duplicated.patch
0003-Add-MADDW-MADD.patch
0004-Edit-edit-format.patch
0005-Add-specific-instruction-error.patch
0006-ADD-new-madd-pattern.patch
Usage
Installation and Build
wget https://github.com/llvm/llvm-project/archive/refs/tags/llvmorg-13.0.0.tar.gz
tar -xzf llvmorg-13.0.0.tar.gz
cd llvm-project-llvmorg-13.0.0
mkdir build install
cd build
cmake -DLLVM_ENABLE_PROJECTS="clang" \
-DLLVM_TARGETS_TO_BUILD="RISCV" \
-DCMAKE_BUILD_TYPE=Debug \
-DCMAKE_INSTALL_PREFIX=(pwd)/compile/llvm-project-llvmorg-13.0.0/install \
../llvm
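make -j3  # keep parallelism low to avoid running out of memory: even with 32 GB of RAM, an 8-core build easily exhausts memory at link time
Applying the patches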
git apply $diff
or
git am *.patch
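Note that there are two modules: the main repository and the llvm/lib/Target/RISCV submodule.
Instruction Encoding
Each file corresponds to a different instruction-set extension. After some reading, I decided to add the madd instruction to the M extension.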
// llvm/lib/Target/RISCV/RISCVInstrInfoM.td
// customized mul+add instruction
def MADD : ALU_rr<0b1111111, 0b111, "madd">,
Sched<[WriteIALU, ReadIALU, ReadIALU]>;
// customized 64 bit version
def MADDW : ALUW_rr<0b1111111, 0b111, "maddw">,
Sched<[WriteIALU, ReadIALU, ReadIALU]>;
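The instruction inherits from ALU_rr. Note that this class sets several properties, such as hasSideEffects, and TableGen checks them for conflicts during inheritance. Each property has its own meaning.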
// llvm/lib/Target/RISCV/RISCVInstrInfo.td
let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in
class ALU_rr<bits<7> funct7, bits<3> funct3, string opcodestr>
: RVInstR<funct7, funct3, OPC_OP, (outs GPR:$rd), (ins GPR:$rs1, GPR:$rs2),
opcodestr, "$rd, $rs1, $rs2">;
// llvm/lib/Target/RISCV/RISCVInstrFormats.td
class RVInstR<bits<7> funct7, bits<3> funct3, RISCVOpcode opcode, dag outs,
dag ins, string opcodestr, string argstr>
: RVInst<outs, ins, opcodestr, argstr, [], InstFormatR> {
bits<5> rs2;
bits<5> rs1;
bits<5> rd;
let Inst{31-25} = funct7;
let Inst{24-20} = rs2;
let Inst{19-15} = rs1;
let Inst{14-12} = funct3;
let Inst{11-7} = rd;
let Opcode = opcode.Value;
}
The opcode is OPC_OP, funct7 is 0b1111111, funct3 is 0b111, and each register is encoded in 5 bits (constrained to GPR registers).
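The field layout above can be checked by hand. The following is a minimal Python sketch, not part of the patch, that packs the R-type fields in the order given by RVInstR and prints the little-endian bytes. The helper names `encode_r` and `encode_madd` are hypothetical; the value of OPC_OP (0b0110011) is taken from the RISC-V specification.

```python
# Minimal sketch: pack R-type fields the way RVInstR lays them out.
# encode_r / encode_madd are hypothetical helper names, not LLVM APIs.

OPC_OP = 0b0110011  # major opcode for register-register integer ops

# ABI register names in x0..x31 order.
ABI_NAMES = ("zero ra sp gp tp t0 t1 t2 s0 s1 a0 a1 a2 a3 a4 a5 a6 a7 "
             "s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 t3 t4 t5 t6").split()
REG = {name: number for number, name in enumerate(ABI_NAMES)}


def encode_r(funct7: int, rs2: int, rs1: int, funct3: int, rd: int,
             opcode: int = OPC_OP) -> int:
    """Return the 32-bit instruction word for an R-type instruction."""
    return ((funct7 & 0x7F) << 25 | (rs2 & 0x1F) << 20 | (rs1 & 0x1F) << 15
            | (funct3 & 0x7) << 12 | (rd & 0x1F) << 7 | (opcode & 0x7F))


def encode_madd(rd: str, rs1: str, rs2: str) -> bytes:
    """Encode madd with funct7 = 0b1111111 and funct3 = 0b111, little-endian."""
    word = encode_r(0b1111111, REG[rs2], REG[rs1], 0b111, REG[rd])
    return word.to_bytes(4, "little")


print(encode_madd("a0", "a1", "a2").hex(" "))  # 33 f5 c5 fe
print(encode_madd("ra", "ra", "ra").hex(" "))  # b3 f0 10 fe
```

These are the same byte sequences that the llvm-mc tests in this report expect for `madd a0, a1, a2` and `madd ra, ra, ra`.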
Instruction Pattern
// Add New MADD Pattern
def : Pat<(add GPR:$rs1, (mul GPR:$rs1, GPR:$rs2)), (MADD GPR:$rs1, GPR:$rs2)>;
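To see which source expressions this pattern can select, it helps to model it. The sketch below assumes the semantics implied by the pattern itself, rd = rs1 + rs1 * rs2 with 32-bit wraparound; no hardware definition is given in this report, so treat that as an assumption. The function names are hypothetical.

```python
# Minimal sketch of the semantics implied by the pattern (an assumption):
# (add rs1, (mul rs1, rs2))  ->  rd = rs1 + rs1 * rs2, wrapped to 32 bits.

MASK32 = 0xFFFFFFFF


def madd(rs1: int, rs2: int) -> int:
    """Model of the assumed madd result on RV32."""
    return (rs1 + rs1 * rs2) & MASK32


def c_expr(a: int, b: int) -> int:
    """The C expression a*b+b, wrapped to 32 bits."""
    return (a * b + b) & MASK32


a, b = 1, 2
# Because mul is commutative, a*b+b is b + b*a, so it fits the pattern
# with rs1 = b and rs2 = a.
print(madd(b, a), c_expr(a, b))  # 4 4
```

Note that the addend must be the same value as one of the multiplicands: an expression such as a*b+c with three distinct values does not fit this two-operand pattern.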
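Results / Test Programs
llvm-lit tests
- Write at least 2 valid assembly test programs that produce the correct madd instruction encoding (4 points)
- Write at least 2 invalid assembly test programs that trigger appropriate error handling (4 points)
Valid instruction encoding tests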
llvm/test/MC/RISCV/rv32m-valid.s
The last two instructions are the newly added test cases.
# RUN: llvm-mc %s -triple=riscv32 -mattr=+m -riscv-no-aliases -show-encoding \
# RUN: | FileCheck -check-prefixes=CHECK-ASM,CHECK-ASM-AND-OBJ %s
# RUN: llvm-mc %s -triple=riscv64 -mattr=+m -riscv-no-aliases -show-encoding \
# RUN: | FileCheck -check-prefixes=CHECK-ASM,CHECK-ASM-AND-OBJ %s
# RUN: llvm-mc -filetype=obj -triple=riscv32 -mattr=+m < %s \
# RUN: | llvm-objdump --mattr=+m -M no-aliases -d -r - \
# RUN: | FileCheck --check-prefix=CHECK-ASM-AND-OBJ %s
# RUN: llvm-mc -filetype=obj -triple=riscv64 -mattr=+m < %s \
# RUN: | llvm-objdump --mattr=+m -M no-aliases -d -r - \
# RUN: | FileCheck --check-prefix=CHECK-ASM-AND-OBJ %s
# CHECK-ASM-AND-OBJ: mul a4, ra, s0
# CHECK-ASM: encoding: [0x33,0x87,0x80,0x02]
mul a4, ra, s0
# CHECK-ASM-AND-OBJ: mulh ra, zero, zero
# CHECK-ASM: encoding: [0xb3,0x10,0x00,0x02]
mulh x1, x0, x0
# CHECK-ASM-AND-OBJ: mulhsu t0, t2, t1
# CHECK-ASM: encoding: [0xb3,0xa2,0x63,0x02]
mulhsu t0, t2, t1
# CHECK-ASM-AND-OBJ: mulhu a5, a4, a3
# CHECK-ASM: encoding: [0xb3,0x37,0xd7,0x02]
mulhu a5, a4, a3
# CHECK-ASM-AND-OBJ: div s0, s0, s0
# CHECK-ASM: encoding: [0x33,0x44,0x84,0x02]
div s0, s0, s0
# CHECK-ASM-AND-OBJ: divu gp, a0, a1
# CHECK-ASM: encoding: [0xb3,0x51,0xb5,0x02]
divu gp, a0, a1
# CHECK-ASM-AND-OBJ: rem s2, s2, s8
# CHECK-ASM: encoding: [0x33,0x69,0x89,0x03]
rem s2, s2, s8
# CHECK-ASM-AND-OBJ: remu s2, s2, s8
# CHECK-ASM: encoding: [0x33,0x79,0x89,0x03]
remu x18, x18, x24
# CHECK-ASM-AND-OBJ: madd a0, a1, a2
# CHECK-ASM: encoding: [0x33,0xf5,0xc5,0xfe]
madd a0, a1, a2
# CHECK-ASM-AND-OBJ: madd ra, ra, ra
# CHECK-ASM: encoding: [0xb3,0xf0,0x10,0xfe]
madd ra, ra, ra
Test results
rt@rogerthat ~/m/llvm-project-llvmorg-13.0.0 (master)> llvm-lit llvm/test/MC/RISCV/rv32m-valid.s
-- Testing: 1 tests, 1 workers --
PASS: LLVM :: MC/RISCV/rv32m-valid.s (1 of 1)
Testing Time: 0.11s
Passed: 1
Invalid instruction encoding tests
llvm/test/MC/RISCV/rv32m-invalid.s
The last four instructions are the newly added tests.
# RUN: not llvm-mc -triple riscv32 -mattr=+m < %s 2>&1 | FileCheck %s
# RV64M instructions can't be used for RV32
mulw ra, sp, gp # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set
divw tp, t0, t1 # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set
divuw t2, s0, s2 # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set
remw a0, a1, a2 # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set
remuw a3, a4, a5 # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set
maddw a0, a1, a2 # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set
# RV32M operands can't be Imm or float regs
madd a0, a1, 1 # CHECK: :[[@LINE]]:14: error: invalid Imm operand for instruction: gpr is needed
madd a0, 1, a1 # CHECK: :[[@LINE]]:10: error: invalid Imm operand for instruction: gpr is needed
madd a0, f0, a1 # CHECK: :[[@LINE]]:10: error: invalid rs1 for instruction: gpr is needed
madd a0, a1, f1 # CHECK: :[[@LINE]]:14: error: invalid rs2 for instruction: gpr is needed
Test results
rt@rogerthat ~/m/llvm-project-llvmorg-13.0.0 (master)> llvm-lit llvm/test/MC/RISCV/rv32m-invalid.s
-- Testing: 1 tests, 1 workers --
PASS: LLVM :: MC/RISCV/rv32m-invalid.s (1 of 1)
Testing Time: 0.03s
Passed: 1
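The column numbers in the madd checks (10 and 14) are the 1-based columns where the offending operand starts. The sketch below shows one way to derive them when writing such checks. It assumes that the diagnostic points at the first character of the operand, which is what the expected columns in this test imply. `operand_columns` is a hypothetical helper, not part of LLVM.

```python
# Minimal sketch: compute the 1-based start column of each operand in an
# assembly line, to help fill in the ":<column>:" part of a check line.

def operand_columns(line: str) -> list[int]:
    """Return the 1-based start column of every comma-separated operand."""
    code = line.split("#", 1)[0]        # drop any trailing comment
    pos = code.index(" ")               # end of the mnemonic
    columns = []
    for operand in code[pos:].split(","):
        stripped = operand.lstrip()
        if stripped.strip():
            leading = len(operand) - len(stripped)
            columns.append(pos + leading + 1)
        pos += len(operand) + 1         # skip the operand and its comma
    return columns


print(operand_columns("madd a0, a1, 1"))   # [6, 10, 14]
print(operand_columns("madd a0, f0, a1"))  # [6, 10, 14]
```

The third operand starts at column 14 and the second at column 10, matching the positions used in the four madd error checks.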
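C test program
Write at least 1 C test program that produces the correct madd assembly instruction (4 points)
Front-end support as well: inline assembly or a builtin function in a high-level language generates the corresponding assembly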
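int main() {
  // Inline assembly: emits madd directly (x1 is ra).
  asm("madd x1, x1, x1");
  int a = 1, b = 2, c;
  // Builtin added to the clang front end by the patch.
  c = __builtin_riscv_madd(a, b);
  // Plain C expression, expected to be selected through the MADD pattern.
  c = a * b + b;
  return 0;
}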
This program tests both inline assembly and the builtin function.
Compile options: generate an object file only, without linking.
clang --target=riscv32 -march=rv32g -c -o riscv-test.o riscv-test.c
rt@rogerthat ~/m/test> llvm-objdump -d riscv-test.o
riscv-test.o: file format elf32-littleriscv
Disassembly of section .text:
00000000 <main>:
0: 13 01 01 fe addi sp, sp, -32
4: 23 2e 11 00 sw ra, 28(sp)
8: 23 2c 81 00 sw s0, 24(sp)
c: 13 04 01 02 addi s0, sp, 32
10: 13 05 00 00 mv a0, zero
14: 23 2a a4 fe sw a0, -12(s0)
18: b3 f0 10 fe madd ra, ra, ra
1c: 93 05 10 00 addi a1, zero, 1
20: 23 28 b4 fe sw a1, -16(s0)
24: 93 05 20 00 addi a1, zero, 2
28: 23 26 b4 fe sw a1, -20(s0)
2c: 83 25 04 ff lw a1, -16(s0)
30: 03 26 c4 fe lw a2, -20(s0)
34: b3 f5 c5 fe madd a1, a1, a2
38: 23 24 b4 fe sw a1, -24(s0)
3c: 03 26 04 ff lw a2, -16(s0)
40: 83 25 c4 fe lw a1, -20(s0)
44: b3 f5 c5 fe madd a1, a1, a2
48: 23 24 b4 fe sw a1, -24(s0)
4c: 03 24 81 01 lw s0, 24(sp)
50: 83 20 c1 01 lw ra, 28(sp)
54: 13 01 01 02 addi sp, sp, 32
58: 67 80 00 00 ret
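The disassembly shows that the program was compiled correctly: madd is emitted for the inline assembly (offset 0x18), for the builtin call on a and b (offset 0x34), and, through DAG instruction selection, for the expression a*b+b (offset 0x44).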
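The raw bytes in the disassembly can also be cross-checked independently of llvm-objdump. The following minimal Python sketch splits a 4-byte little-endian word back into its R-type fields and tests for the madd encoding described earlier. The helper names `decode_r` and `is_madd` are hypothetical.

```python
# Minimal sketch: decode the raw bytes from the objdump listing back into
# R-type fields and check for the madd encoding (opcode OP, funct3 0b111,
# funct7 0b1111111).

def decode_r(raw: bytes) -> dict:
    """Split a 4-byte little-endian instruction into R-type fields."""
    word = int.from_bytes(raw, "little")
    return {
        "opcode": word & 0x7F,
        "rd": (word >> 7) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rs1": (word >> 15) & 0x1F,
        "rs2": (word >> 20) & 0x1F,
        "funct7": (word >> 25) & 0x7F,
    }


def is_madd(fields: dict) -> bool:
    """True if the fields carry the madd opcode, funct3 and funct7."""
    return (fields["opcode"] == 0b0110011 and fields["funct3"] == 0b111
            and fields["funct7"] == 0b1111111)


for text in ("b3 f0 10 fe", "b3 f5 c5 fe", "23 28 b4 fe"):
    fields = decode_r(bytes.fromhex(text))
    print(text, is_madd(fields), fields["rd"], fields["rs1"], fields["rs2"])
```

The first two byte sequences decode as madd with registers x1, x1, x1 and x11, x11, x12 (that is, ra, ra, ra and a1, a1, a2), while the third, an sw from the same listing, is rejected.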
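Further notes
Possible errors and how to handle them
- The build is killed with signal 9 -> not enough memory or disk space. Linking creates files in /tmp. If memory is short, add swap space; if disk space is short, grow the logical volume mounted at /tmp (if that is feasible).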
Unless otherwise stated, all articles on this blog are licensed under CC BY-NC-SA 4.0. Please credit R0gerThat when reposting!