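I am trying to implement a GShare branch predictor in my own CPU. I want to verify that what I am implementing is actually correct, so I am building various models and writing things up as I go. These are my notes.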
Experiment
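Consider the following program. In short, it sets a value according to whether each of the two arguments is even or odd, then compares the two results.
This program uses three branch instructions:
- Check whether argument 0 is even or odd
- Check whether argument 1 is even or odd
- Check whether the two results above differ
In other words, the outcome of the last comparison depends heavily on the two comparisons before it.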
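    .global branch_count
    branch_count:
        andi    a0, a0, 1
        beqz    a0, .cut_aa         # branch 1: taken when argument 0 is even
        li      a0, 0
        j       .bb_check
    .cut_aa:
        li      a0, 1
    .bb_check:
        andi    a1, a1, 1
        beqz    a1, .cut_bb         # branch 2: taken when argument 1 is even
        li      a1, 0
        j       .final_check
    .cut_bb:
        li      a1, 1
    .final_check:
        bne     a0, a1, .ret_true   # branch 3: taken when the two results differ
        li      a0, 0               # no ret here: falls through, so the return value is always 1
    .ret_true:
        li      a0, 1
        ret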
This branch_count() is then called over and over, across every combination of arguments.
    int result_count = 0;

    extern int branch_count(int aa, int bb);

    int main ()
    {
      for (int a = 0; a < 10; a++) {
        for (int b = 0; b < 10; b++) {
          if (branch_count (a, b)) {
            result_count ++;
          }
        }
      }
      return 0;
    }
With this in place, I run the simulation and collect the branch prediction results for the current implementation.
    grep 80002014 bru_detail.log
    15778 : (09,2) pc_vaddr = 0000000080002014, target_addr = 0000000080002018, pred_target_addr = 0000000000000000, NotTaken, bim=1, Succ, DASM(0x00b51363)
    16014 : (11,1) pc_vaddr = 0000000080002014, target_addr = 000000008000201a, pred_target_addr = 0000000000000000, Taken , bim=0, Miss, DASM(0x00b51363)
    16178 : (14,2) pc_vaddr = 0000000080002014, target_addr = 0000000080002018, pred_target_addr = 000000008000201a, NotTaken, bim=1, Succ, DASM(0x00b51363)
    16302 : (13,1) pc_vaddr = 0000000080002014, target_addr = 000000008000201a, pred_target_addr = 000000008000201a, Taken , bim=0, Miss, DASM(0x00b51363)
    16462 : (14,2) pc_vaddr = 0000000080002014, target_addr = 0000000080002018, pred_target_addr = 000000008000201a, NotTaken, bim=1, Succ, DASM(0x00b51363)
    16586 : (13,1) pc_vaddr = 0000000080002014, target_addr = 000000008000201a, pred_target_addr = 000000008000201a, Taken , bim=0, Miss, DASM(0x00b51363)
    16746 : (14,2) pc_vaddr = 0000000080002014, target_addr = 0000000080002018, pred_target_addr = 000000008000201a, NotTaken, bim=1, Succ, DASM(0x00b51363)
    16870 : (13,1) pc_vaddr = 0000000080002014, target_addr = 000000008000201a, pred_target_addr = 000000008000201a, Taken , bim=0, Miss, DASM(0x00b51363)
    17030 : (14,2) pc_vaddr = 0000000080002014, target_addr = 0000000080002018, pred_target_addr = 000000008000201a, NotTaken, bim=1, Succ, DASM(0x00b51363)
    17154 : (13,1) pc_vaddr = 0000000080002014, target_addr = 000000008000201a, pred_target_addr = 000000008000201a, Taken , bim=0, Miss, DASM(0x00b51363)
    17498 : (14,2) pc_vaddr = 0000000080002014, target_addr = 000000008000201a, pred_target_addr = 000000008000201a, Taken , bim=1, Miss, DASM(0x00b51363)
    17702 : (03,1) pc_vaddr = 0000000080002014, target_addr = 0000000080002018, pred_target_addr = 000000008000201a, NotTaken, bim=2, Miss, DASM(0x00b51363)
    17874 : (07,2) pc_vaddr = 0000000080002014, target_addr = 000000008000201a, pred_target_addr = 000000008000201a, Taken , bim=1, Miss, DASM(0x00b51363)
    18038 : (09,1) pc_vaddr = 0000000080002014, target_addr = 0000000080002018, pred_target_addr = 000000008000201a, NotTaken, bim=2, Miss, DASM(0x00b51363)
    18210 : (13,2) pc_vaddr = 0000000080002014, target_addr = 000000008000201a, pred_target_addr = 000000008000201a, Taken , bim=1, Miss, DASM(0x00b51363)
    18374 : (15,1) pc_vaddr = 0000000080002014, target_addr = 0000000080002018, pred_target_addr = 000000008000201a, NotTaken, bim=2, Miss, DASM(0x00b51363)
    18546 : (03,2) pc_vaddr = 0000000080002014, target_addr = 000000008000201a, pred_target_addr = 000000008000201a, Taken , bim=1, Miss, DASM(0x00b51363)
    18710 : (05,1) pc_vaddr = 0000000080002014, target_addr = 0000000080002018, pred_target_addr = 000000008000201a, NotTaken, bim=2, Miss, DASM(0x00b51363)
    18882 : (09,2) pc_vaddr = 0000000080002014, target_addr = 000000008000201a, pred_target_addr = 000000008000201a, Taken , bim=1, Miss, DASM(0x00b51363)
    19046 : (11,1) pc_vaddr = 0000000080002014, target_addr = 0000000080002018, pred_target_addr = 000000008000201a, NotTaken, bim=2, Miss, DASM(0x00b51363)
    19310 : (06,2) pc_vaddr = 0000000080002014, target_addr = 0000000080002018, pred_target_addr = 000000008000201a, NotTaken, bim=1, Succ, DASM(0x00b51363)
    19466 : (07,1) pc_vaddr = 0000000080002014, target_addr = 000000008000201a, pred_target_addr = 000000008000201a, Taken , bim=0, Miss, DASM(0x00b51363)
    19626 : (08,2) pc_vaddr = 0000000080002014, target_addr = 0000000080002018, pred_target_addr = 000000008000201a, NotTaken, bim=1, Succ, DASM(0x00b51363)
    19750 : (07,1) pc_vaddr = 0000000080002014, target_addr = 000000008000201a, pred_target_addr = 000000008000201a, Taken , bim=0, Miss, DASM(0x00b51363)
    19910 : (08,2) pc_vaddr = 0000000080002014, target_addr = 0000000080002018, pred_target_addr = 000000008000201a, NotTaken, bim=1, Succ, DASM(0x00b51363)
    20034 : (07,1) pc_vaddr = 0000000080002014, target_addr = 000000008000201a, pred_target_addr = 000000008000201a, Taken , bim=0, Miss, DASM(0x00b51363)
    20194 : (08,2) pc_vaddr = 0000000080002014, target_addr = 0000000080002018, pred_target_addr = 000000008000201a, NotTaken, bim=1, Succ, DASM(0x00b51363)
    20318 : (07,1) pc_vaddr = 0000000080002014, target_addr = 000000008000201a, pred_target_addr = 000000008000201a, Taken , bim=0, Miss, DASM(0x00b51363)
    20478 : (08,2) pc_vaddr = 0000000080002014, target_addr = 0000000080002018, pred_target_addr = 000000008000201a, NotTaken, bim=1, Succ, DASM(0x00b51363)
    20602 : (07,1) pc_vaddr = 0000000080002014, target_addr = 000000008000201a, pred_target_addr = 000000008000201a, Taken , bim=0, Miss, DASM(0x00b51363)
    20866 : (01,2) pc_vaddr = 0000000080002014, target_addr = 000000008000201a, pred_target_addr = 000000008000201a, Taken , bim=1, Miss, DASM(0x00b51363)
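The direction of this branch flips between Taken and NotTaken on almost every execution, and 21 of the 31 executions logged above are mispredicted. This is what I need to improve.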