On my self-made RISC-V CPU core, after several structural improvements the GShare branch predictor now works correctly.
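The tricky part of GShare is that predicting branches for several instructions fetched at the same time is quite hard. In practice the logic has become rather complex, and as it stands it will probably be difficult to run it at a reasonable clock frequency.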
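The difficulty can be illustrated with a small software model. This is a minimal sketch, not the RTL of this core: the class name `GShare`, the helper `predict_group`, and the plain `(pc >> 1) ^ ghr` hash are assumptions made for illustration. The point it shows is that the index for the k-th branch in a fetch group depends on the predictions made for the branches before it, which in hardware turns into a serial chain of lookups.

```python
class GShare:
    """Toy GShare model: 2-bit counters indexed by (pc XOR global history)."""

    def __init__(self, hist_bits=10):
        self.mask = (1 << hist_bits) - 1
        self.ghr = 0                            # global history register
        self.pht = [1] * (1 << hist_bits)       # 2-bit counters, weakly not-taken

    def index(self, pc, ghr):
        # Hypothetical hash: drop PC bit 0 and XOR with the history.
        return ((pc >> 1) ^ ghr) & self.mask

    def predict_group(self, branch_pcs):
        """Predict the conditional branches of one fetch group in order.

        The history seen by slot k already contains the predictions of
        slots 0..k-1, so the lookups depend on each other.
        """
        ghr = self.ghr
        preds = []
        for pc in branch_pcs:
            taken = self.pht[self.index(pc, ghr)] >= 2
            preds.append(taken)
            ghr = ((ghr << 1) | int(taken)) & self.mask   # speculative history
            if taken:
                break   # later slots in the group are not on the predicted path
        return preds, ghr

    def update(self, pc, ghr_at_predict, taken):
        """Train the counter that was used for this prediction."""
        i = self.index(pc, ghr_at_predict)
        c = self.pht[i]
        self.pht[i] = min(c + 1, 3) if taken else max(c - 1, 0)
```

In this model a two-branch fetch group needs the first prediction before the second table index is even known. Real designs often avoid that chain by reading the candidate entries in parallel and selecting late, which is one source of the extra logic.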
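Performance itself is now in fairly good shape. Running the first half of Dhrystone, I think the number of branch mispredictions has also dropped considerably: 377 lines containing "Miss" out of 14525 lines in the branch log.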
ms-x1carbon:dhrystone msyksphinz$ wc bru_detail.log
   14525  281750 2485690 bru_detail.log
ms-x1carbon:dhrystone msyksphinz$ grep Miss bru_detail.log| wc
     377    7211   63806
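The same counts can be turned into a miss ratio with a short script. This is a sketch that assumes every line of `bru_detail.log` is one resolved branch and that a mispredicted branch is marked with the word `Miss` (the token used in the `grep` above); the function name `miss_rate` is hypothetical.

```python
def miss_rate(log_lines):
    """Return (misses, total, ratio) for an iterable of log lines."""
    lines = list(log_lines)
    misses = sum(1 for ln in lines if "Miss" in ln)   # same match as `grep Miss`
    total = len(lines)                                # same count as `wc -l`
    ratio = misses / total if total else 0.0
    return misses, total, ratio


# Ratio for the counts reported by wc above.
reported_ratio = 377 / 14525
print(f"{reported_ratio:.2%}")
```

To run it on the actual file, something like `miss_rate(open("bru_detail.log"))` would do.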
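One likely reason for the remaining mispredictions is that the GHR is kept quite short (currently 10 bits). However, simply lengthening it would also enlarge the PHT in the current implementation, so I feel some improvement is needed there.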
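To make the size concern concrete: a PHT indexed directly by an n-bit value needs 2^n entries, so each extra history bit doubles the table. One common way to decouple history length from table size is to XOR-fold a longer history down to the index width. The sketch below is only an illustration of that general technique, not a description of what this core does or will do; `pht_bits`, `fold_history`, and the 2-bit counter width are assumptions.

```python
def pht_bits(index_bits, counter_bits=2):
    """Storage in bits for a PHT with 2**index_bits counters."""
    return (1 << index_bits) * counter_bits


def fold_history(ghr, hist_bits, index_bits):
    """XOR-fold a hist_bits-wide history into index_bits bits."""
    mask = (1 << index_bits) - 1
    ghr &= (1 << hist_bits) - 1        # keep only the valid history bits
    folded = 0
    while ghr:
        folded ^= ghr & mask           # fold in the next index_bits-wide chunk
        ghr >>= index_bits
    return folded
```

With these definitions, `pht_bits(10)` is 2048 bits while `pht_bits(20)` is 2097152 bits, whereas `fold_history(ghr, 20, 10)` lets a 20-bit history index the 1024-entry table at the cost of some aliasing.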
The log below shows the GHR shifting along as branches are processed, and the branch predictions succeeding.
199418 : (21, 4) pc_vaddr = 0080002b6c, target_addr = 008000206e, pred_target_addr = 008000206e, ras_index = 7, Succ, jal pc - 0xafe
199434 : (23, 8) pc_vaddr = 0080002082, target_addr = 0080002052, pred_target_addr = 0080002052, ras_index = 8, Succ, jal pc - 0x30
199466 : (24, 4) pc_vaddr = 008000205a, target_addr = 008000205e, pred_target_addr = 0000000000, NotTaken, bim=0, gidx= 761, bhr=1111111011, Succ, beq a0, a1, pc + 8
199470 : (25, 2) pc_vaddr = 0080002060, target_addr = 0080002086, pred_target_addr = 0080002086, ras_index = 8, Succ, ret
199474 : (26, 2) pc_vaddr = 0080002088, target_addr = 008000208a, pred_target_addr = 0000000000, NotTaken, bim=0, gidx= 619, bhr=1101101111, Succ, bnez a0, pc - 14
199478 : (27, 4) pc_vaddr = 008000208e, target_addr = 00800028e8, pred_target_addr = 00800028e8, ras_index = 8, Succ, jal pc + 0x85a
199486 : (28,16) pc_vaddr = 00800028f4, target_addr = 00800028f6, pred_target_addr = 0000000000, NotTaken, bim=0, gidx= 572, bhr=1101111011, Succ, beqz a5, pc + 14
199510 : (29, 1) pc_vaddr = 00800028f6, target_addr = 00800028e8, pred_target_addr = 00800028e8, Taken , bim=3, gidx= 572, bhr=1011110110, Succ, beq a5, a4, pc - 14
199514 : (30,16) pc_vaddr = 00800028f4, target_addr = 00800028f6, pred_target_addr = 0000000000, NotTaken, bim=0, gidx= 170, bhr=0111101101, Succ, beqz a5, pc + 14
199518 : (31, 1) pc_vaddr = 00800028f6, target_addr = 00800028e8, pred_target_addr = 00800028e8, Taken , bim=3, gidx= 170, bhr=1111011010, Succ, beq a5, a4, pc - 14
199522 : (32,16) pc_vaddr = 00800028f4, target_addr = 00800028f6, pred_target_addr = 0000000000, NotTaken, bim=0, gidx= 754, bhr=1110110101, Succ, beqz a5, pc + 14
199526 : (33, 1) pc_vaddr = 00800028f6, target_addr = 00800028e8, pred_target_addr = 00800028e8, Taken , bim=3, gidx= 754, bhr=1101101010, Succ, beq a5, a4, pc - 14
199530 : (34,16) pc_vaddr = 00800028f4, target_addr = 00800028f6, pred_target_addr = 0000000000, NotTaken, bim=0, gidx= 914, bhr=1011010101, Succ, beqz a5, pc + 14
199534 : (35, 1) pc_vaddr = 00800028f6, target_addr = 00800028e8, pred_target_addr = 00800028e8, Taken , bim=3, gidx= 914, bhr=0110101010, Succ, beq a5, a4, pc - 14
199538 : (36,16) pc_vaddr = 00800028f4, target_addr = 00800028f6, pred_target_addr = 0000000000, NotTaken, bim=0, gidx= 530, bhr=1101010101, Succ, beqz a5, pc + 14
199542 : (37, 1) pc_vaddr = 00800028f6, target_addr = 00800028e8, pred_target_addr = 00800028e8, Taken , bim=3, gidx= 530, bhr=1010101010, Succ, beq a5, a4, pc - 14
199546 : (38,16) pc_vaddr = 00800028f4, target_addr = 00800028f6, pred_target_addr = 0000000000, NotTaken, bim=0, gidx= 18, bhr=0101010101, Succ, beqz a5, pc + 14
199550 : (39, 1) pc_vaddr = 00800028f6, target_addr = 00800028e8, pred_target_addr = 00800028e8, Taken , bim=2, gidx= 18, bhr=1010101010, Succ, beq a5, a4, pc - 14
199554 : (40,16) pc_vaddr = 00800028f4, target_addr = 00800028f6, pred_target_addr = 0000000000, NotTaken, bim=0, gidx= 18, bhr=0101010101, Succ, beqz a5, pc + 14
199558 : (41, 1) pc_vaddr = 00800028f6, target_addr = 00800028e8, pred_target_addr = 00800028e8, Taken , bim=2, gidx= 18, bhr=1010101010, Succ, beq a5, a4, pc - 14
199562 : (42,16) pc_vaddr = 00800028f4, target_addr = 00800028f6, pred_target_addr = 0000000000, NotTaken, bim=0, gidx= 18, bhr=0101010101, Succ, beqz a5, pc + 14
199566 : (43, 1) pc_vaddr = 00800028f6, target_addr = 00800028e8, pred_target_addr = 00800028e8, Taken , bim=2, gidx= 18, bhr=1010101010, Succ, beq a5, a4, pc - 14
199570 : (44,16) pc_vaddr = 00800028f4, target_addr = 00800028f6, pred_target_addr = 0000000000, NotTaken, bim=0, gidx= 18, bhr=0101010101, Succ, beqz a5, pc + 14
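As a cross-check, the `gidx` and `bhr` columns of this log can be reproduced with a little arithmetic. The relations below are inferred from the printed values only and are not taken from the RTL, so treat them as assumptions: for the entries at `pc_vaddr` 008000205a, 0080002088 and 00800028f4, `gidx` matches `(pc_vaddr >> 5) XOR bhr` truncated to 10 bits, and within the loop from entry 28 onward `bhr` shifts left by one per conditional branch with the outcome (1 = taken) entering at bit 0. The function names are hypothetical.

```python
MASK = (1 << 10) - 1   # 10-bit history, as stated in the post


def gidx(pc, bhr):
    # Inferred from the log: PC shifted right by 5, XORed with the history.
    return ((pc >> 5) ^ bhr) & MASK


def next_bhr(bhr, taken):
    # Inferred from the log: shift left, newest outcome in bit 0.
    return ((bhr << 1) | int(taken)) & MASK


# (pc_vaddr, bhr, gidx) copied from log entries 24, 26, 28, 30 and 38.
samples = [
    (0x008000205a, 0b1111111011, 761),
    (0x0080002088, 0b1101101111, 619),
    (0x00800028f4, 0b1101111011, 572),
    (0x00800028f4, 0b0111101101, 170),
    (0x00800028f4, 0b0101010101, 18),
]
matches = [gidx(pc, bhr) == expected for pc, bhr, expected in samples]

# History chain across entries 28 (NotTaken) -> 29 (Taken) -> 30.
bhr_29 = next_bhr(0b1101111011, taken=False)
bhr_30 = next_bhr(bhr_29, taken=True)
```

The branch at 00800028f6 prints the same `gidx` as the branch at 00800028f4 just before it, even though its `bhr` has already shifted. That would be consistent with the index being computed once per fetch block from the history at fetch time, but this is a guess from the log.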
1000 : 567 : IPC(recent) = 0.57, IPC(total) = 0.57
2000 : 1261 : IPC(recent) = 0.69, IPC(total) = 0.63
3000 : 2447 : IPC(recent) = 1.19, IPC(total) = 0.82
4000 : 3911 : IPC(recent) = 1.46, IPC(total) = 0.98
5000 : 5349 : IPC(recent) = 1.44, IPC(total) = 1.07
6000 : 6820 : IPC(recent) = 1.47, IPC(total) = 1.14
7000 : 8220 : IPC(recent) = 1.40, IPC(total) = 1.17
8000 : 9670 : IPC(recent) = 1.45, IPC(total) = 1.21
9000 : 11117 : IPC(recent) = 1.44, IPC(total) = 1.24
10000 : 12578 : IPC(recent) = 1.46, IPC(total) = 1.26
11000 : 13993 : IPC(recent) = 1.41, IPC(total) = 1.27
12000 : 15430 : IPC(recent) = 1.44, IPC(total) = 1.29
13000 : 16892 : IPC(recent) = 1.46, IPC(total) = 1.30
14000 : 18340 : IPC(recent) = 1.45, IPC(total) = 1.31
15000 : 19763 : IPC(recent) = 1.42, IPC(total) = 1.32
16000 : 21197 : IPC(recent) = 1.43, IPC(total) = 1.32
17000 : 22667 : IPC(recent) = 1.47, IPC(total) = 1.33
18000 : 24098 : IPC(recent) = 1.43, IPC(total) = 1.34
19000 : 25538 : IPC(recent) = 1.44, IPC(total) = 1.34
20000 : 26969 : IPC(recent) = 1.43, IPC(total) = 1.35
21000 : 28432 : IPC(recent) = 1.46, IPC(total) = 1.35
22000 : 29860 : IPC(recent) = 1.42, IPC(total) = 1.36
23000 : 31313 : IPC(recent) = 1.45, IPC(total) = 1.36
24000 : 32744 : IPC(recent) = 1.43, IPC(total) = 1.36
25000 : 34203 : IPC(recent) = 1.46, IPC(total) = 1.37
26000 : 35618 : IPC(recent) = 1.42, IPC(total) = 1.37
27000 : 37085 : IPC(recent) = 1.46, IPC(total) = 1.37
28000 : 38519 : IPC(recent) = 1.43, IPC(total) = 1.38
29000 : 39972 : IPC(recent) = 1.45, IPC(total) = 1.38
30000 : 41387 : IPC(recent) = 1.42, IPC(total) = 1.38
31000 : 42855 : IPC(recent) = 1.47, IPC(total) = 1.38
32000 : 44289 : IPC(recent) = 1.43, IPC(total) = 1.38
33000 : 45742 : IPC(recent) = 1.45, IPC(total) = 1.39
34000 : 47157 : IPC(recent) = 1.42, IPC(total) = 1.39
35000 : 48630 : IPC(recent) = 1.47, IPC(total) = 1.39
36000 : 50046 : IPC(recent) = 1.42, IPC(total) = 1.39
37000 : 51517 : IPC(recent) = 1.47, IPC(total) = 1.39
38000 : 52918 : IPC(recent) = 1.40, IPC(total) = 1.39
39000 : 54396 : IPC(recent) = 1.47, IPC(total) = 1.39
40000 : 55814 : IPC(recent) = 1.42, IPC(total) = 1.40
41000 : 57292 : IPC(recent) = 1.48, IPC(total) = 1.40
42000 : 58674 : IPC(recent) = 1.38, IPC(total) = 1.40
43000 : 60168 : IPC(recent) = 1.49, IPC(total) = 1.40
44000 : 61573 : IPC(recent) = 1.41, IPC(total) = 1.40
45000 : 63067 : IPC(recent) = 1.49, IPC(total) = 1.40
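The two IPC columns can be recomputed from the first two columns, read here as cycle count and retired-instruction count. That reading is an interpretation that matches the printed values, not something the log states, and the helper name `ipc_report` is hypothetical. Under that reading, `IPC(recent)` is the instruction delta over the last window of cycles and `IPC(total)` is the cumulative ratio.

```python
def ipc_report(samples):
    """samples: (cycle, instructions) pairs in increasing cycle order.

    Returns (cycle, recent_ipc, total_ipc) tuples rounded to two decimals.
    """
    rows = []
    prev_cycle, prev_inst = 0, 0
    for cycle, inst in samples:
        recent = (inst - prev_inst) / (cycle - prev_cycle)   # last window only
        total = inst / cycle                                 # since cycle 0
        rows.append((cycle, round(recent, 2), round(total, 2)))
        prev_cycle, prev_inst = cycle, inst
    return rows


# First three samples copied from the log above.
rows = ipc_report([(1000, 567), (2000, 1261), (3000, 2447)])
```

Read this way, the core settles at roughly 1.4 to 1.5 instructions per cycle after the first few thousand cycles, and the cumulative figure converges toward that as the slow start is amortized.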