Lab Report (Revised)

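Requirements

Complete the instruction encoding design (4 points)
Pass at least 2 self-written valid assembly test programs that produce the correct madd instruction encoding (4 points)
Pass at least 2 self-written invalid assembly test programs with appropriate error handling (4 points)
Pass at least 1 self-written C test program that produces the correct madd assembly instruction (4 points)
Follow coding conventions: good style, appropriate comments, sound architecture (4 points)

Additional work completed

Used a reasonably suitable testing method
Added the corresponding support to the LLVM front end as well, so that inline assembly or a builtin function in a high-level language generates the corresponding assembly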

Code Patches

Main module

rt@rogerthat ~/m/llvm-project-llvmorg-13.0.0 (master)> git format-patch 3c115231b000ed300a6f0551b5b30bd549d566f3
0001-Add-builtin-__builtin_riscv_madd.patch
0002-Edit-Emit-function-type-error-and-grammer-error.patch
0003-Repair-fix-int_riscv_madd-type.patch
0004-Edit-no-changes.patch
0005-Edit-edit-comment.patch
0006-Add-test-cases-instruction-ID-case.patch
0007-Info-submodule-updated.patch
0008-Edit-format-code.patch

Submodule (llvm/lib/Target/RISCV)

rt@rogerthat ~/m/l/l/l/T/RISCV (master)> git format-patch 60950388fdb2be61593e2552100ed49186cf6b14
0001-Add-Pat-intr-mov-MADD-to-RISCVInstrInfoM.td.patch
0002-Edit-MADD-duplicated.patch
0003-Add-MADDW-MADD.patch
0004-Edit-edit-format.patch
0005-Add-specific-instruction-error.patch
0006-ADD-new-madd-pattern.patch

Usage

Installation and build

wget https://github.com/llvm/llvm-project/archive/refs/tags/llvmorg-13.0.0.tar.gz
tar -xzf llvmorg-13.0.0.tar.gz
cd llvm-project-llvmorg-13.0.0
mkdir build install
cd build
cmake -DLLVM_ENABLE_PROJECTS="clang" \
	-DLLVM_TARGETS_TO_BUILD="RISCV" \
	-DCMAKE_BUILD_TYPE=Debug \
	-DCMAKE_INSTALL_PREFIX=(pwd)/compile/llvm-project-llvmorg-13.0.0/install \
	../llvm
make -j3  # keep parallelism low: even with 32 GB of RAM and 8 cores, the link step easily runs out of memory

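Applying the patches

git apply $diff, or

git am *.patch

Note that there are two modules: the main repository and the llvm/lib/Target/RISCV submodule. Each has its own patch series.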

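Instruction encoding

Different files correspond to different instruction set extensions. After some research, the madd instruction was added to the M extension.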

// llvm/lib/Target/RISCV/RISCVInstrInfoM.td
// customized mul+add instruction
def MADD    : ALU_rr<0b1111111, 0b111, "madd">,
              Sched<[WriteIALU, ReadIALU, ReadIALU]>;
// customized RV64-only variant operating on 32-bit words
def MADDW   : ALUW_rr<0b1111111, 0b111, "maddw">,
              Sched<[WriteIALU, ReadIALU, ReadIALU]>;

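The instruction inherits from ALU_rr. Note that the class sets several properties, such as hasSideEffects; when a definition inherits them, TableGen checks whether they conflict. Each property has its own meaning.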

// llvm/lib/Target/RISCV/RISCVInstrInfo.td
let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in
class ALU_rr<bits<7> funct7, bits<3> funct3, string opcodestr>
    : RVInstR<funct7, funct3, OPC_OP, (outs GPR:$rd), (ins GPR:$rs1, GPR:$rs2),
              opcodestr, "$rd, $rs1, $rs2">;

// llvm/lib/Target/RISCV/RISCVInstrFormats.td
class RVInstR<bits<7> funct7, bits<3> funct3, RISCVOpcode opcode, dag outs,
              dag ins, string opcodestr, string argstr>
    : RVInst<outs, ins, opcodestr, argstr, [], InstFormatR> {
  bits<5> rs2;
  bits<5> rs1;
  bits<5> rd;

  let Inst{31-25} = funct7;
  let Inst{24-20} = rs2;
  let Inst{19-15} = rs1;
  let Inst{14-12} = funct3;
  let Inst{11-7} = rd;
  let Opcode = opcode.Value;
}

OPC_OP is used as the opcode, funct7 is 0b1111111 and funct3 is 0b111. Each register is encoded in 5 bits (operands are constrained to GPR registers).
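
To make the bit layout concrete, here is a minimal C sketch that packs the R-type fields in the same positions RVInstR assigns them. The function names encode_rtype and encode_madd are hypothetical helpers introduced for illustration; they are not part of LLVM. The opcode value 0x33 is the standard OP major opcode (0b0110011), placed in bits 6-0.

```c
#include <stdint.h>

/* Pack the fields of a RISC-V R-type instruction into one 32-bit word,
 * using the same bit ranges as RVInstR. */
uint32_t encode_rtype(uint32_t funct7, uint32_t rs2, uint32_t rs1,
                      uint32_t funct3, uint32_t rd, uint32_t opcode) {
    return ((funct7 & 0x7fu) << 25) |  /* Inst{31-25} */
           ((rs2    & 0x1fu) << 20) |  /* Inst{24-20} */
           ((rs1    & 0x1fu) << 15) |  /* Inst{19-15} */
           ((funct3 & 0x07u) << 12) |  /* Inst{14-12} */
           ((rd     & 0x1fu) << 7)  |  /* Inst{11-7}  */
           (opcode  & 0x7fu);          /* Inst{6-0}   */
}

/* madd rd, rs1, rs2: funct7 = 0b1111111, funct3 = 0b111, opcode = OP (0x33).
 * Register arguments are x-register numbers (for example a0 = x10). */
uint32_t encode_madd(uint32_t rd, uint32_t rs1, uint32_t rs2) {
    return encode_rtype(0x7fu, rs2, rs1, 0x7u, rd, 0x33u);
}
```

For example, encode_madd(10, 11, 12) corresponds to madd a0, a1, a2 and gives 0xfec5f533. Stored little-endian, that word is the byte sequence 33 f5 c5 fe.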

Instruction pattern

// New MADD pattern: select MADD for rs1 + (rs1 * rs2)
def : Pat<(add GPR:$rs1, (mul GPR:$rs1, GPR:$rs2)), (MADD GPR:$rs1, GPR:$rs2)>;
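
Because MADD has only two source operands, the pattern matches only expressions in which the addend is also one of the multiplicands. The following C sketch is a reference model of the value the pattern computes. It assumes the hardware implements exactly rd = rs1 + rs1 * rs2 with 32-bit wraparound on RV32, which the pattern implies but this report does not otherwise specify; madd_model is a hypothetical name.

```c
#include <stdint.h>

/* Reference model of the selected operation: rd = rs1 + rs1 * rs2.
 * Unsigned arithmetic gives 32-bit wraparound without signed-overflow
 * undefined behavior. */
uint32_t madd_model(uint32_t rs1, uint32_t rs2) {
    return rs1 + rs1 * rs2;
}
```

Under this model an expression such as a * b + b maps to the instruction with rs1 = b and rs2 = a, while a * b + c with an unrelated c is not covered by the pattern.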

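Results / test programs

llvm-lit tests

Pass at least 2 self-written valid assembly test programs that produce the correct madd instruction encoding (4 points)
Pass at least 2 self-written invalid assembly test programs with appropriate error handling (4 points)

Valid instruction encoding tests

llvm/test/MC/RISCV/rv32m-valid.s

The last two instructions are the newly added test cases.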

# RUN: llvm-mc %s -triple=riscv32 -mattr=+m -riscv-no-aliases -show-encoding \
# RUN:     | FileCheck -check-prefixes=CHECK-ASM,CHECK-ASM-AND-OBJ %s
# RUN: llvm-mc %s -triple=riscv64 -mattr=+m -riscv-no-aliases -show-encoding \
# RUN:     | FileCheck -check-prefixes=CHECK-ASM,CHECK-ASM-AND-OBJ %s
# RUN: llvm-mc -filetype=obj -triple=riscv32 -mattr=+m < %s \
# RUN:     | llvm-objdump --mattr=+m -M no-aliases -d -r - \
# RUN:     | FileCheck --check-prefix=CHECK-ASM-AND-OBJ %s
# RUN: llvm-mc -filetype=obj -triple=riscv64 -mattr=+m < %s \
# RUN:     | llvm-objdump --mattr=+m -M no-aliases -d -r - \
# RUN:     | FileCheck --check-prefix=CHECK-ASM-AND-OBJ %s

# CHECK-ASM-AND-OBJ: mul a4, ra, s0
# CHECK-ASM: encoding: [0x33,0x87,0x80,0x02]
mul a4, ra, s0
# CHECK-ASM-AND-OBJ: mulh ra, zero, zero
# CHECK-ASM: encoding: [0xb3,0x10,0x00,0x02]
mulh x1, x0, x0
# CHECK-ASM-AND-OBJ: mulhsu t0, t2, t1
# CHECK-ASM: encoding: [0xb3,0xa2,0x63,0x02]
mulhsu t0, t2, t1
# CHECK-ASM-AND-OBJ: mulhu a5, a4, a3
# CHECK-ASM: encoding: [0xb3,0x37,0xd7,0x02]
mulhu a5, a4, a3
# CHECK-ASM-AND-OBJ: div s0, s0, s0
# CHECK-ASM: encoding: [0x33,0x44,0x84,0x02]
div s0, s0, s0
# CHECK-ASM-AND-OBJ: divu gp, a0, a1
# CHECK-ASM: encoding: [0xb3,0x51,0xb5,0x02]
divu gp, a0, a1
# CHECK-ASM-AND-OBJ: rem s2, s2, s8
# CHECK-ASM: encoding: [0x33,0x69,0x89,0x03]
rem s2, s2, s8
# CHECK-ASM-AND-OBJ: remu s2, s2, s8
# CHECK-ASM: encoding: [0x33,0x79,0x89,0x03]
remu x18, x18, x24

# CHECK-ASM-AND-OBJ: madd a0, a1, a2
# CHECK-ASM: encoding: [0x33,0xf5,0xc5,0xfe]
madd a0, a1, a2
# CHECK-ASM-AND-OBJ: madd ra, ra, ra
# CHECK-ASM: encoding: [0xb3,0xf0,0x10,0xfe]
madd ra, ra, ra
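
The encoding lines above list the instruction bytes from the lowest address upward, so the 32-bit word has to be reassembled in little-endian order before its fields can be read. The C sketch below does this and extracts each R-type field. All function names are hypothetical helpers for illustration, not LLVM APIs.

```c
#include <stdint.h>

/* Rebuild the instruction word from the bytes of an "encoding: [...]" line,
 * given in the order they are printed (lowest address first). */
uint32_t word_from_le_bytes(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3) {
    return (uint32_t)b0 | ((uint32_t)b1 << 8) |
           ((uint32_t)b2 << 16) | ((uint32_t)b3 << 24);
}

/* Field extractors for the R-type layout. */
uint32_t rtype_funct7(uint32_t w) { return (w >> 25) & 0x7fu; }
uint32_t rtype_rs2(uint32_t w)    { return (w >> 20) & 0x1fu; }
uint32_t rtype_rs1(uint32_t w)    { return (w >> 15) & 0x1fu; }
uint32_t rtype_funct3(uint32_t w) { return (w >> 12) & 0x07u; }
uint32_t rtype_rd(uint32_t w)     { return (w >> 7) & 0x1fu; }
uint32_t rtype_opcode(uint32_t w) { return w & 0x7fu; }
```

Applied to [0x33,0xf5,0xc5,0xfe], this yields funct7 = 0x7f, rs2 = 12 (a2), rs1 = 11 (a1), funct3 = 7, rd = 10 (a0) and opcode = 0x33, which agrees with madd a0, a1, a2.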

Test result

rt@rogerthat ~/m/llvm-project-llvmorg-13.0.0 (master)> llvm-lit llvm/test/MC/RISCV/rv32m-valid.s 
-- Testing: 1 tests, 1 workers --
PASS: LLVM :: MC/RISCV/rv32m-valid.s (1 of 1)

Testing Time: 0.11s
  Passed: 1

Invalid instruction tests

llvm/test/MC/RISCV/rv32m-invalid.s

The last four instructions are newly added tests; the maddw check above them is new as well.

# RUN: not llvm-mc -triple riscv32 -mattr=+m < %s 2>&1 | FileCheck %s

# RV64M instructions can't be used for RV32
mulw ra, sp, gp # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set
divw tp, t0, t1 # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set
divuw t2, s0, s2 # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set
remw a0, a1, a2 # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set
remuw a3, a4, a5 # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set

maddw a0, a1, a2 # CHECK: :[[@LINE]]:1: error: instruction requires the following: RV64I Base Instruction Set

# RV32M operands can't be Imm or float regs
madd a0, a1, 1  # CHECK: :[[@LINE]]:14: error: invalid Imm operand for instruction: gpr is needed
madd a0, 1, a1  # CHECK: :[[@LINE]]:10: error: invalid Imm operand for instruction: gpr is needed
madd a0, f0, a1 # CHECK: :[[@LINE]]:10: error: invalid rs1 for instruction: gpr is needed
madd a0, a1, f1 # CHECK: :[[@LINE]]:14: error: invalid rs2 for instruction: gpr is needed

Test result

rt@rogerthat ~/m/llvm-project-llvmorg-13.0.0 (master)> llvm-lit llvm/test/MC/RISCV/rv32m-invalid.s 
-- Testing: 1 tests, 1 workers --
PASS: LLVM :: MC/RISCV/rv32m-invalid.s (1 of 1)

Testing Time: 0.03s
  Passed: 1

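C test program

Pass at least 1 self-written C test program that produces the correct madd assembly instruction (4 points)
The LLVM front end can also be given corresponding support, so that inline assembly or builtin functions in a high-level language generate the corresponding assembly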

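int main() {
    /* Inline assembly: emits madd directly (x1 is ra). */
    asm("madd x1, x1, x1");
    int a = 1, b = 2, c;
    /* Builtin function added to the Clang front end. */
    c = __builtin_riscv_madd(a, b);
    /* Plain C expression, matched by the MADD selection pattern. */
    c = a * b + b;
    return 0;
}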

The program exercises both inline assembly and the builtin function.
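
Since __builtin_riscv_madd exists only in the patched compiler, code that also has to build with a stock toolchain can guard the call. The sketch below is one possible way to do this with __has_builtin. HAVE_RISCV_MADD and madd_compat are hypothetical names, and the fallback assumes the builtin returns a + a * b, which is what the selection pattern suggests but which this report does not state explicitly.

```c
/* HAVE_RISCV_MADD is a hypothetical feature macro introduced for this sketch.
 * The nested #if keeps compilers without __has_builtin from failing to parse. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_riscv_madd)
#    define HAVE_RISCV_MADD 1
#  endif
#endif

/* Returns a + a * b, using the custom builtin when the compiler provides it. */
int madd_compat(int a, int b) {
#ifdef HAVE_RISCV_MADD
    return __builtin_riscv_madd(a, b);
#else
    /* Assumed semantics; unsigned arithmetic avoids signed-overflow
     * undefined behavior. */
    return (int)((unsigned)a + (unsigned)a * (unsigned)b);
#endif
}
```

With a = 1 and b = 2 as in the test program, madd_compat(a, b) evaluates to 3 under this assumption.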

Compile options: compile only, producing an object file without linking

clang --target=riscv32 -march=rv32g -c -o riscv-test.o riscv-test.c

rt@rogerthat ~/m/test> llvm-objdump -d riscv-test.o

riscv-test.o:   file format elf32-littleriscv

Disassembly of section .text:

00000000 <main>:
       0: 13 01 01 fe   addi    sp, sp, -32
       4: 23 2e 11 00   sw      ra, 28(sp)
       8: 23 2c 81 00   sw      s0, 24(sp)
       c: 13 04 01 02   addi    s0, sp, 32
      10: 13 05 00 00   mv      a0, zero
      14: 23 2a a4 fe   sw      a0, -12(s0)
      18: b3 f0 10 fe   madd    ra, ra, ra
      1c: 93 05 10 00   addi    a1, zero, 1
      20: 23 28 b4 fe   sw      a1, -16(s0)
      24: 93 05 20 00   addi    a1, zero, 2
      28: 23 26 b4 fe   sw      a1, -20(s0)
      2c: 83 25 04 ff   lw      a1, -16(s0)
      30: 03 26 c4 fe   lw      a2, -20(s0)
      34: b3 f5 c5 fe   madd    a1, a1, a2
      38: 23 24 b4 fe   sw      a1, -24(s0)
      3c: 03 26 04 ff   lw      a2, -16(s0)
      40: 83 25 c4 fe   lw      a1, -20(s0)
      44: b3 f5 c5 fe   madd    a1, a1, a2
      48: 23 24 b4 fe   sw      a1, -24(s0)
      4c: 03 24 81 01   lw      s0, 24(sp)
      50: 83 20 c1 01   lw      ra, 28(sp)
      54: 13 01 01 02   addi    sp, sp, 32
      58: 67 80 00 00   ret

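The object code is generated correctly: madd is emitted for the inline assembly (offset 0x18), for the builtin call on a and b (offset 0x34), and, through the DAG selection pattern, for the expression a*b+b (offset 0x44).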