FPGA開発日記 (FPGA Development Diary)

Article index by category: https://msyksphinz.github.io/github_pages , English version: https://fpgadevdiary.hatenadiary.com/

Implementing a GShare branch predictor in my own RISC-V CPU core

On my own RISC-V CPU core, I reworked a few parts of the design and the GShare branch predictor now works correctly.

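The awkward part of GShare is that predicting several branches at once, when multiple instructions are fetched in the same cycle, is quite difficult. In practice the logic has become rather complex, and as it stands it will probably be hard to run at a decent clock frequency.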

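Performance itself is no longer much of a problem. Running the first half of Dhrystone, the number of branch mispredictions seems to have dropped considerably: of the 14,525 lines in bru_detail.log, 377 contain Miss.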

ms-x1carbon:dhrystone msyksphinz$ wc bru_detail.log
  14525  281750 2485690 bru_detail.log
ms-x1carbon:dhrystone msyksphinz$ grep Miss bru_detail.log| wc
    377    7211   63806
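The grep above counts every line containing Miss. As a rough sketch, the same log could also be tallied per result in Python to get a miss ratio over the logged branches. The helper names `tally` and `miss_rate` are hypothetical, and the sketch assumes each branch line carries exactly one `Succ` or `Miss` token, as in the excerpt further down.

```python
import re
from collections import Counter

# Assumption: every branch line in bru_detail.log contains either the
# token "Succ" or the token "Miss" as a separate word.
RESULT_RE = re.compile(r"\b(Succ|Miss)\b")

def tally(lines):
    """Count Succ/Miss tokens over an iterable of log lines (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        m = RESULT_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts

def miss_rate(counts):
    """Fraction of logged branches that were mispredicted (0.0 if none logged)."""
    total = counts["Succ"] + counts["Miss"]
    return counts["Miss"] / total if total else 0.0

# Two lines copied from the log excerpt in this post.
sample = [
    "199470 : (25, 2) pc_vaddr = 0080002060, target_addr = 0080002086, "
    "pred_target_addr = 0080002086, ras_index =          8, Succ, ret",
    "199474 : (26, 2) pc_vaddr = 0080002088, target_addr = 008000208a, "
    "pred_target_addr = 0000000000, NotTaken, bim=0, gidx= 619, "
    "bhr=1101101111, Succ, bnez    a0, pc - 14",
]
counts = tally(sample)
print(counts["Succ"], counts["Miss"], miss_rate(counts))
```

For the full run, `tally(open("bru_detail.log"))` would do the same over the whole file.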

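One factor behind the remaining mispredictions is, I think, that the GHR is kept quite short (currently 10 bits). However, in the current implementation simply lengthening the GHR also makes the PHT larger, so I feel this needs some improvement.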
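To make the GHR/PHT coupling concrete, here is a minimal software sketch of a textbook GShare predictor. It is not the RTL of this core: the class name `GShare`, the `pc >> 1` hash (assuming 2-byte-aligned RISC-V instructions), the initial counter value, and the optional history folding are all assumptions for illustration. When the index width equals the GHR length, the PHT has 2^GHR entries, so a 10-bit GHR means 1024 counters and a 20-bit GHR would mean about a million. XOR-folding the history down to a fixed index width is one common way to decouple the two, at the cost of extra aliasing.

```python
class GShare:
    """Textbook GShare sketch: PHT of 2-bit saturating counters indexed by PC xor GHR."""

    def __init__(self, ghr_bits=10, index_bits=None):
        self.ghr_bits = ghr_bits
        # Without folding, the index is as wide as the GHR, so the PHT
        # size grows as 2**ghr_bits.
        self.index_bits = ghr_bits if index_bits is None else index_bits
        self.ghr = 0
        self.pht = [1] * (1 << self.index_bits)  # 1 = weakly not-taken (assumed reset value)

    def _folded_history(self):
        # XOR index_bits-wide chunks of the GHR together (assumed folding scheme).
        mask = (1 << self.index_bits) - 1
        h, folded = self.ghr, 0
        while h:
            folded ^= h & mask
            h >>= self.index_bits
        return folded

    def index(self, pc):
        mask = (1 << self.index_bits) - 1
        return ((pc >> 1) ^ self._folded_history()) & mask

    def predict(self, pc):
        return self.pht[self.index(pc)] >= 2  # counter values 2 and 3 mean "taken"

    def update(self, pc, taken):
        i = self.index(pc)
        if taken:
            self.pht[i] = min(3, self.pht[i] + 1)
        else:
            self.pht[i] = max(0, self.pht[i] - 1)
        # Shift the outcome into the LSB of the history register.
        self.ghr = ((self.ghr << 1) | int(taken)) & ((1 << self.ghr_bits) - 1)


plain = GShare(ghr_bits=10)                  # 1024-entry PHT
folded = GShare(ghr_bits=20, index_bits=10)  # longer history, still 1024 entries
print(len(plain.pht), len(folded.pht))
```

Folding is not free: for example, a strictly alternating 20-bit history folds to the same 10-bit value in both phases, so the history no longer distinguishes them. The sketch also makes no attempt to reproduce the `gidx` values printed in the log below, since the actual hash used in the core is not described here.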

The log below shows the GHR shifting along as branches resolve, with the predictions succeeding.

              199418 : (21, 4) pc_vaddr = 0080002b6c, target_addr = 008000206e, pred_target_addr = 008000206e, ras_index =          7, Succ, jal     pc - 0xafe
              199434 : (23, 8) pc_vaddr = 0080002082, target_addr = 0080002052, pred_target_addr = 0080002052, ras_index =          8, Succ, jal     pc - 0x30
              199466 : (24, 4) pc_vaddr = 008000205a, target_addr = 008000205e, pred_target_addr = 0000000000, NotTaken, bim=0, gidx= 761, bhr=1111111011, Succ, beq     a0, a1, pc + 8
              199470 : (25, 2) pc_vaddr = 0080002060, target_addr = 0080002086, pred_target_addr = 0080002086, ras_index =          8, Succ, ret
              199474 : (26, 2) pc_vaddr = 0080002088, target_addr = 008000208a, pred_target_addr = 0000000000, NotTaken, bim=0, gidx= 619, bhr=1101101111, Succ, bnez    a0, pc - 14
              199478 : (27, 4) pc_vaddr = 008000208e, target_addr = 00800028e8, pred_target_addr = 00800028e8, ras_index =          8, Succ, jal     pc + 0x85a
              199486 : (28,16) pc_vaddr = 00800028f4, target_addr = 00800028f6, pred_target_addr = 0000000000, NotTaken, bim=0, gidx= 572, bhr=1101111011, Succ, beqz    a5, pc + 14
              199510 : (29, 1) pc_vaddr = 00800028f6, target_addr = 00800028e8, pred_target_addr = 00800028e8, Taken   , bim=3, gidx= 572, bhr=1011110110, Succ, beq     a5, a4, pc - 14
              199514 : (30,16) pc_vaddr = 00800028f4, target_addr = 00800028f6, pred_target_addr = 0000000000, NotTaken, bim=0, gidx= 170, bhr=0111101101, Succ, beqz    a5, pc + 14
              199518 : (31, 1) pc_vaddr = 00800028f6, target_addr = 00800028e8, pred_target_addr = 00800028e8, Taken   , bim=3, gidx= 170, bhr=1111011010, Succ, beq     a5, a4, pc - 14
              199522 : (32,16) pc_vaddr = 00800028f4, target_addr = 00800028f6, pred_target_addr = 0000000000, NotTaken, bim=0, gidx= 754, bhr=1110110101, Succ, beqz    a5, pc + 14
              199526 : (33, 1) pc_vaddr = 00800028f6, target_addr = 00800028e8, pred_target_addr = 00800028e8, Taken   , bim=3, gidx= 754, bhr=1101101010, Succ, beq     a5, a4, pc - 14
              199530 : (34,16) pc_vaddr = 00800028f4, target_addr = 00800028f6, pred_target_addr = 0000000000, NotTaken, bim=0, gidx= 914, bhr=1011010101, Succ, beqz    a5, pc + 14
              199534 : (35, 1) pc_vaddr = 00800028f6, target_addr = 00800028e8, pred_target_addr = 00800028e8, Taken   , bim=3, gidx= 914, bhr=0110101010, Succ, beq     a5, a4, pc - 14
              199538 : (36,16) pc_vaddr = 00800028f4, target_addr = 00800028f6, pred_target_addr = 0000000000, NotTaken, bim=0, gidx= 530, bhr=1101010101, Succ, beqz    a5, pc + 14
              199542 : (37, 1) pc_vaddr = 00800028f6, target_addr = 00800028e8, pred_target_addr = 00800028e8, Taken   , bim=3, gidx= 530, bhr=1010101010, Succ, beq     a5, a4, pc - 14
              199546 : (38,16) pc_vaddr = 00800028f4, target_addr = 00800028f6, pred_target_addr = 0000000000, NotTaken, bim=0, gidx=  18, bhr=0101010101, Succ, beqz    a5, pc + 14
              199550 : (39, 1) pc_vaddr = 00800028f6, target_addr = 00800028e8, pred_target_addr = 00800028e8, Taken   , bim=2, gidx=  18, bhr=1010101010, Succ, beq     a5, a4, pc - 14
              199554 : (40,16) pc_vaddr = 00800028f4, target_addr = 00800028f6, pred_target_addr = 0000000000, NotTaken, bim=0, gidx=  18, bhr=0101010101, Succ, beqz    a5, pc + 14
              199558 : (41, 1) pc_vaddr = 00800028f6, target_addr = 00800028e8, pred_target_addr = 00800028e8, Taken   , bim=2, gidx=  18, bhr=1010101010, Succ, beq     a5, a4, pc - 14
              199562 : (42,16) pc_vaddr = 00800028f4, target_addr = 00800028f6, pred_target_addr = 0000000000, NotTaken, bim=0, gidx=  18, bhr=0101010101, Succ, beqz    a5, pc + 14
              199566 : (43, 1) pc_vaddr = 00800028f6, target_addr = 00800028e8, pred_target_addr = 00800028e8, Taken   , bim=2, gidx=  18, bhr=1010101010, Succ, beq     a5, a4, pc - 14
              199570 : (44,16) pc_vaddr = 00800028f4, target_addr = 00800028f6, pred_target_addr = 0000000000, NotTaken, bim=0, gidx=  18, bhr=0101010101, Succ, beqz    a5, pc + 14
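In the loop that starts at cycle 199486, consecutive `bhr` values in the log are consistent with a simple shift register: shift left by one and insert the branch outcome (1 for taken, 0 for not taken) at the least significant bit. The sketch below replays a few of those values; the function name `shift_ghr` is hypothetical, and the update rule is inferred from the log rather than taken from the RTL. Earlier lines in the excerpt do not follow this one-step relation, so it is checked only for this loop.

```python
GHR_BITS = 10  # GHR length stated in the post

def shift_ghr(ghr, taken, bits=GHR_BITS):
    """Shift the outcome into the LSB and drop the oldest bit (inferred rule)."""
    return ((ghr << 1) | int(taken)) & ((1 << bits) - 1)

# (bhr, outcome) pairs copied from the log, cycles 199486 to 199522.
# All of these lines are Succ, so the predicted direction equals the outcome.
observed = [
    ("1101111011", False),
    ("1011110110", True),
    ("0111101101", False),
    ("1111011010", True),
    ("1110110101", False),
]

replayed = [
    format(shift_ghr(int(cur, 2), taken), "010b")
    for cur, taken in observed[:-1]
]
print(replayed == [bhr for bhr, _ in observed[1:]])
```

Once the history is filled with the alternating pattern, the register toggles between 0101010101 and 1010101010, which matches the repeating tail of the log.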
      1000 :        567 : IPC(recent) = 0.57, IPC(total) = 0.57
      2000 :       1261 : IPC(recent) = 0.69, IPC(total) = 0.63
      3000 :       2447 : IPC(recent) = 1.19, IPC(total) = 0.82
      4000 :       3911 : IPC(recent) = 1.46, IPC(total) = 0.98
      5000 :       5349 : IPC(recent) = 1.44, IPC(total) = 1.07
      6000 :       6820 : IPC(recent) = 1.47, IPC(total) = 1.14
      7000 :       8220 : IPC(recent) = 1.40, IPC(total) = 1.17
      8000 :       9670 : IPC(recent) = 1.45, IPC(total) = 1.21
      9000 :      11117 : IPC(recent) = 1.44, IPC(total) = 1.24
     10000 :      12578 : IPC(recent) = 1.46, IPC(total) = 1.26
     11000 :      13993 : IPC(recent) = 1.41, IPC(total) = 1.27
     12000 :      15430 : IPC(recent) = 1.44, IPC(total) = 1.29
     13000 :      16892 : IPC(recent) = 1.46, IPC(total) = 1.30
     14000 :      18340 : IPC(recent) = 1.45, IPC(total) = 1.31
     15000 :      19763 : IPC(recent) = 1.42, IPC(total) = 1.32
     16000 :      21197 : IPC(recent) = 1.43, IPC(total) = 1.32
     17000 :      22667 : IPC(recent) = 1.47, IPC(total) = 1.33
     18000 :      24098 : IPC(recent) = 1.43, IPC(total) = 1.34
     19000 :      25538 : IPC(recent) = 1.44, IPC(total) = 1.34
     20000 :      26969 : IPC(recent) = 1.43, IPC(total) = 1.35
     21000 :      28432 : IPC(recent) = 1.46, IPC(total) = 1.35
     22000 :      29860 : IPC(recent) = 1.42, IPC(total) = 1.36
     23000 :      31313 : IPC(recent) = 1.45, IPC(total) = 1.36
     24000 :      32744 : IPC(recent) = 1.43, IPC(total) = 1.36
     25000 :      34203 : IPC(recent) = 1.46, IPC(total) = 1.37
     26000 :      35618 : IPC(recent) = 1.42, IPC(total) = 1.37
     27000 :      37085 : IPC(recent) = 1.46, IPC(total) = 1.37
     28000 :      38519 : IPC(recent) = 1.43, IPC(total) = 1.38
     29000 :      39972 : IPC(recent) = 1.45, IPC(total) = 1.38
     30000 :      41387 : IPC(recent) = 1.42, IPC(total) = 1.38
     31000 :      42855 : IPC(recent) = 1.47, IPC(total) = 1.38
     32000 :      44289 : IPC(recent) = 1.43, IPC(total) = 1.38
     33000 :      45742 : IPC(recent) = 1.45, IPC(total) = 1.39
     34000 :      47157 : IPC(recent) = 1.42, IPC(total) = 1.39
     35000 :      48630 : IPC(recent) = 1.47, IPC(total) = 1.39
     36000 :      50046 : IPC(recent) = 1.42, IPC(total) = 1.39
     37000 :      51517 : IPC(recent) = 1.47, IPC(total) = 1.39
     38000 :      52918 : IPC(recent) = 1.40, IPC(total) = 1.39
     39000 :      54396 : IPC(recent) = 1.47, IPC(total) = 1.39
     40000 :      55814 : IPC(recent) = 1.42, IPC(total) = 1.40
     41000 :      57292 : IPC(recent) = 1.48, IPC(total) = 1.40
     42000 :      58674 : IPC(recent) = 1.38, IPC(total) = 1.40
     43000 :      60168 : IPC(recent) = 1.49, IPC(total) = 1.40
     44000 :      61573 : IPC(recent) = 1.41, IPC(total) = 1.40
     45000 :      63067 : IPC(recent) = 1.49, IPC(total) = 1.40
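The IPC log above appears to list the cycle count, then the cumulative number of retired instructions, followed by the IPC over the most recent 1000 cycles and the IPC since the start. That reading is an assumption, but the numbers are consistent with it, as the following sketch shows for the first four rows. The function name `ipc_report` is hypothetical.

```python
def ipc_report(samples):
    """samples: list of (cycle, cumulative_instructions) pairs in log order.

    Returns (recent, total) IPC strings per sample, where "recent" covers
    the interval since the previous sample and "total" the whole run.
    """
    rows = []
    prev_cycle, prev_insts = 0, 0
    for cycle, insts in samples:
        recent = (insts - prev_insts) / (cycle - prev_cycle)
        total = insts / cycle
        rows.append((f"{recent:.2f}", f"{total:.2f}"))
        prev_cycle, prev_insts = cycle, insts
    return rows

# First four rows copied from the log.
rows = ipc_report([(1000, 567), (2000, 1261), (3000, 2447), (4000, 3911)])
for recent, total in rows:
    print(f"IPC(recent) = {recent}, IPC(total) = {total}")
```

Under this reading, the recent IPC settles at roughly 1.4 to 1.5 from cycle 4000 onward, while the total IPC is pulled down by the slow first 2000 cycles and climbs gradually to 1.40.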