FPGA開発日記 (FPGA Development Diary)

Article index by category: https://msyksphinz.github.io/github_pages , English version: https://fpgadevdiary.hatenadiary.com/

Trying out the Branch Prediction Championship Kit, an evaluation kit for branch predictors (installation)

I used Ubuntu 20.04 LTS.

Apparently the build requires Boost.

jilp.org

sudo apt install libboost-dev libboost-iostreams-dev

Download the Championship Kit and build it.

curl -L http://hpca23.cse.tamu.edu/cbp2016/cbp2016.final.tar.gz | tar xz
cd cbp2016.eval
cd sim
make

That seems to have produced a binary. The next step, I suppose, is to bring in trace files and analyze them? Download and extract the very large trace archive.

cd ../traces
curl -L http://hpca23.cse.tamu.edu/cbp2016/evaluationTraces.Final.tar | tar x
mv evaluationTraces/* .

What on earth is all this?

...
-rw-r--r-- 1 msyksphinz msyksphinz   2226737 Apr 22  2016 SHORT_SERVER-61.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   1318337 Apr 22  2016 SHORT_SERVER-60.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz  46038679 Apr 22  2016 SHORT_SERVER-5.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   2699244 Apr 22  2016 SHORT_SERVER-79.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   1873985 Apr 22  2016 SHORT_SERVER-78.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   6787630 Apr 22  2016 SHORT_SERVER-77.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   3608147 Apr 22  2016 SHORT_SERVER-76.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   9221889 Apr 22  2016 SHORT_SERVER-75.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   2442634 Apr 22  2016 SHORT_SERVER-74.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   3453716 Apr 22  2016 SHORT_SERVER-73.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   3390638 Apr 22  2016 SHORT_SERVER-72.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz    973732 Apr 22  2016 SHORT_SERVER-71.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   1490451 Apr 22  2016 SHORT_SERVER-70.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz 114213235 Apr 22  2016 SHORT_SERVER-6.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz  10081228 Apr 22  2016 SHORT_SERVER-92.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   6232632 Apr 22  2016 SHORT_SERVER-91.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   2954378 Apr 22  2016 SHORT_SERVER-90.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   5605736 Apr 22  2016 SHORT_SERVER-89.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   2432037 Apr 22  2016 SHORT_SERVER-88.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   4658574 Apr 22  2016 SHORT_SERVER-87.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   4277872 Apr 22  2016 SHORT_SERVER-86.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz    857480 Apr 22  2016 SHORT_SERVER-85.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   4819861 Apr 22  2016 SHORT_SERVER-84.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   1990277 Apr 22  2016 SHORT_SERVER-83.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   3601609 Apr 22  2016 SHORT_SERVER-82.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz  13071864 Apr 22  2016 SHORT_SERVER-81.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   2352295 Apr 22  2016 SHORT_SERVER-80.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz  38042298 Apr 22  2016 SHORT_SERVER-8.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz  51433959 Apr 22  2016 SHORT_SERVER-7.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   1731878 Apr 22  2016 SHORT_SERVER-99.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   1230601 Apr 22  2016 SHORT_SERVER-98.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   1239842 Apr 22  2016 SHORT_SERVER-97.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   1092030 Apr 22  2016 SHORT_SERVER-96.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz  13436816 Apr 22  2016 SHORT_SERVER-95.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz  11446195 Apr 22  2016 SHORT_SERVER-94.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz  14727370 Apr 22  2016 SHORT_SERVER-93.bt9.trace.gz
-rw-r--r-- 1 msyksphinz msyksphinz   6886787 Apr 22  2016 SHORT_SERVER-9.bt9.trace.gz
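
A listing this long is hard to take in at a glance. As a rough aid, the sketch below sums the compressed size of the traces per workload family. The function name `trace_size_by_family` is hypothetical, and it assumes the family is everything before the last `-` in the file name (for example `SHORT_SERVER` in `SHORT_SERVER-61.bt9.trace.gz`), which is only inferred from the names shown above.

```shell
# Hypothetical helper, not part of the kit: summarize *.bt9.trace.gz files
# in a directory by workload family (file count and total compressed bytes).
trace_size_by_family() {
  dir=$1
  for f in "$dir"/*.bt9.trace.gz; do
    [ -e "$f" ] || continue                 # skip if the glob matched nothing
    name=$(basename "$f" .bt9.trace.gz)     # e.g. SHORT_SERVER-61
    # Assumption: the family is the part before the last '-'.
    printf '%s %s\n' "${name%-*}" "$(wc -c < "$f")"
  done |
    awk '{ bytes[$1] += $2; count[$1]++ }
         END { for (k in bytes) printf "%s %d files %.0f bytes\n", k, count[k], bytes[k] }' |
    sort
}
```

Run from the traces directory, this would be called as `trace_size_by_family .`.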

I see, so the kit comes with trace files taken from ARM cores.

BT9_SPA_TRACE_FORMAT
bt9_minor_version: 0
has_physical_address: 1
md5_checksum:
conversion_date:
original_stf_input_file:
total_instruction_count:        346537889    # Instruction count
branch_instruction_count:         63649054    # Branch count
invalid_physical_branch_target_count:         28185384    # Invalid Physical Target Count
A32_instruction_count:        236061553    # A32 instructions
A64_instruction_count:                0    # A64 instructions
T32_instruction_count:        110476336    # T32 instructions
unidentified_instruction_count:                0    # Unidentified instructions
BT9_NODES
#NODE   id      virtual_address    physical_address          opcode  size
NODE    0                    0                    -                0    0
NODE    1           0x8001bd84           0x8001bd84       0xe1a0f00e    4    class:  RET+IND+UCD  behavior:   AT+DIR  taken_cnt:     9699  not_taken_cnt:        0  tgt_cnt:   1  # mnemonic: "mov       pc, lr"
NODE    2           0x804779b4           0x804779b4       0xebee597a    4    class: CALL+DIR+UCD  behavior:  DYN+DIR  taken_cnt:     9771  not_taken_cnt:        2  tgt_cnt:   1  # mnemonic: "bl        0x000000008000dfa4"
NODE    3           0x8000dfc4           0x8000dfc4       0xeb00c52a    4    class: CALL+DIR+UCD  behavior:   AT+DIR  taken_cnt:     9773  not_taken_cnt:        0  tgt_cnt:   1  # mnemonic: "bl        0x000000008003f474"
NODE    4           0x8003f48c           0x8003f48c       0xebffffd5    4    class: CALL+DIR+UCD  behavior:   AT+DIR  taken_cnt:     9773  not_taken_cnt:        0  tgt_cnt:   1  # mnemonic: "bl        0x000000008003f3e8"
NODE    5           0x8003f410           0x8003f410        0x8bd81f0    4    class:  RET+IND+CND  behavior:  ANT+DIR  taken_cnt:        0  not_taken_cnt:     9773  tgt_cnt:   0  # mnemonic: "popeq     {r4, r5, r6, r7, r8, pc}"
NODE    6           0x8003f428           0x8003f428       0xe12fff33    4    class: CALL+IND+UCD  behavior:   AT+DIR  taken_cnt:     9773  not_taken_cnt:        0  tgt_cnt:   1  # mnemonic: "blx       r3"
NODE    7           0x800090e0           0x800090e0       0x979ff101    4    class:  JMP+IND+CND  behavior:   AT+DIR  taken_cnt:     9773  not_taken_cnt:        0  tgt_cnt:   1  # mnemonic: "ldrls     pc, [pc, r1, lsl #2]"
NODE    8           0x80009198           0x80009198        0xa000006    4    class:  JMP+DIR+CND  behavior:   AT+DIR  taken_cnt:     9773  not_taken_cnt:        0  tgt_cnt:   1  # mnemonic: "beq       0x00000000800091b8"
NODE    9           0x800091c4           0x800091c4       0xe8bd8038    4    class:  RET+IND+UCD  behavior:  DYN+DIR  taken_cnt:     9771  not_taken_cnt:        2  tgt_cnt:   1  # mnemonic: "pop       {r3, r4, r5, pc}"
NODE   10           0x8003f444           0x8003f444       0x18bd81f0    4    class:  RET+IND+CND  behavior:  ANT+DIR  taken_cnt:        0  not_taken_cnt:     9773  tgt_cnt:   0  # mnemonic: "popne     {r4, r5, r6, r7, r8, pc}"
NODE   11           0x8003f464           0x8003f464       0x1affffea    4    class:  JMP+DIR+CND  behavior:  ANT+DIR  taken_cnt:        0  not_taken_cnt:     9773  tgt_cnt:   0  # mnemonic: "bne       0x000000008003f414"
NODE   12           0x8003f468           0x8003f468       0xe8bd81f0    4    class:  RET+IND+UCD  behavior:   AT+DIR  taken_cnt:     9773  not_taken_cnt:        0  tgt_cnt:   1  # mnemonic: "pop       {r4, r5, r6, r7, r8, pc}"
NODE   13           0x8003f494           0x8003f494       0xe8bd8000    4    class:  RET+IND+UCD  behavior:   AT+DIR  taken_cnt:     9773  not_taken_cnt:        0  tgt_cnt:   1  # mnemonic: "ldmfd     sp!, {pc}"
NODE   14           0x8000dfcc           0x8000dfcc       0xe894aff0    4    class:  RET+IND+UCD  behavior:  DYN+DIR  taken_cnt:     9769  not_taken_cnt:        4  tgt_cnt:   1  # mnemonic: "ldm       r4, {r4, r5, r6, r7, r8, r9, r10, >
NODE   15           0x804779cc           0x804779cc       0xebef29e9    4    class: CALL+DIR+UCD  behavior:   AT+DIR  taken_cnt:     9773  not_taken_cnt:        0  tgt_cnt:   1  # mnemonic: "bl        0x0000000080042178"
NODE   16           0x800421b8           0x800421b8       0x1a00001e    4    class:  JMP+DIR+CND  behavior:  ANT+DIR  taken_cnt:        0  not_taken_cnt:     9773  tgt_cnt:   0  # mnemonic: "bne       0x0000000080042238"
NODE   17           0x800421d0           0x800421d0        0xa000009    4    class:  JMP+DIR+CND  behavior:  DYN+DIR  taken_cnt:     9722  not_taken_cnt:       51  tgt_cnt:   1  # mnemonic: "beq       0x00000000800421fc"
NODE   18           0x800421ec           0x800421ec       0x1afffffa    4    class:  JMP+DIR+CND  behavior:  ANT+DIR  taken_cnt:        0  not_taken_cnt:       51  tgt_cnt:   0  # mnemonic: "bne       0x00000000800421dc"
NODE   19           0x800421f8           0x800421f8        0xa000011    4    class:  JMP+DIR+CND  behavior:  ANT+DIR  taken_cnt:        0  not_taken_cnt:       51  tgt_cnt:   0  # mnemonic: "beq       0x0000000080042244"
NODE   20           0x80042200           0x80042200       0x18bd88f0    4    class:  RET+IND+CND  behavior:   AT+DIR  taken_cnt:     9773  not_taken_cnt:        0  tgt_cnt:   1  # mnemonic: "popne     {r4, r5, r6, r7, r11, pc}"
NODE   21           0x804779e8           0x804779e8       0x1a000036    4    class:  JMP+DIR+CND  behavior:  ANT+DIR  taken_cnt:        0  not_taken_cnt:     9774  tgt_cnt:   0  # mnemonic: "bne       0x0000000080477ac8"
NODE   22           0x804779f4           0x804779f4       0x1affff66    4    class:  JMP+DIR+CND  behavior:  DYN+DIR  taken_cnt:       83  not_taken_cnt:     9691  tgt_cnt:   1  # mnemonic: "bne       0x0000000080477794"
NODE   23           0x804779fc           0x804779fc       0xe8bd8ff0    4    class:  RET+IND+UCD  behavior:  DYN+IND  taken_cnt:     9690  not_taken_cnt:        1  tgt_cnt:   8  # mnemonic: "pop       {r4, r5, r6, r7, r8, r9, r10, r11,>
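
The dump above is plain text once decompressed, so it can be inspected with standard tools. The following is a minimal sketch that relies only on the field labels visible in the dump (`total_instruction_count:`, `branch_instruction_count:`, `class:`, `taken_cnt:`, `not_taken_cnt:`); it is not based on a BT9 specification, and the function names are hypothetical.

```shell
# Hypothetical helpers for a decompressed BT9 trace read from stdin.

# Print the fraction of instructions that are branches, using the header counts.
bt9_branch_fraction() {
  awk '
    $1 == "total_instruction_count:"  { total = $2 }
    $1 == "branch_instruction_count:" { branches = $2 }
    $1 == "BT9_NODES"                 { exit }   # header is over; jump to END
    END { if (total > 0) printf "%.4f\n", branches / total }'
}

# For each NODE line print: node id, branch class, taken ratio.
# Nodes without counters (such as NODE 0 above) are skipped.
bt9_node_taken_ratio() {
  awk '
    $1 == "NODE" {
      cls = ""; taken = ""; not_taken = ""
      for (i = 2; i < NF; i++) {
        if ($i == "class:")         cls = $(i + 1)
        if ($i == "taken_cnt:")     taken = $(i + 1)
        if ($i == "not_taken_cnt:") not_taken = $(i + 1)
      }
      total = taken + not_taken
      if (taken != "" && total > 0)
        printf "%s %s %.4f\n", $2, cls, taken / total
    }'
}
```

A possible invocation, using one of the file names from the listing as an example, is `gzip -dc SHORT_SERVER-9.bt9.trace.gz | bt9_node_taken_ratio | head`. With the header counts shown above, `bt9_branch_fraction` works out to roughly 18% branches.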

Run it with ./scripts/doit.sh.

mkdir: cannot create directory ‘../results/new_traces’: File exists
Running predictor on this trace ../sim/predictor ../traces/LONG_MOBILE-1.bt9.trace.gz > ../results/new_traces/LONG_MOBILE-1.res
Running predictor on this trace ../sim/predictor ../traces/LONG_MOBILE-4.bt9.trace.gz > ../results/new_traces/LONG_MOBILE-4.res
Running predictor on this trace ../sim/predictor ../traces/SHORT_MOBILE-25.bt9.trace.gz > ../results/new_traces/SHORT_MOBILE-25.res
Running predictor on this trace ../sim/predictor ../traces/SHORT_MOBILE-2.bt9.trace.gz > ../results/new_traces/SHORT_MOBILE-2.res
Running predictor on this trace ../sim/predictor ../traces/SHORT_MOBILE-4.bt9.trace.gz > ../results/new_traces/SHORT_MOBILE-4.res
Running predictor on this trace ../sim/predictor ../traces/LONG_MOBILE-2.bt9.trace.gz > ../results/new_traces/LONG_MOBILE-2.res
Running predictor on this trace ../sim/predictor ../traces/SHORT_MOBILE-1.bt9.trace.gz > ../results/new_traces/SHORT_MOBILE-1.res
Running predictor on this trace ../sim/predictor ../traces/SHORT_MOBILE-27.bt9.trace.gz > ../results/new_traces/SHORT_MOBILE-27.res
...
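
Judging from the log, each trace is passed to the predictor binary and the output is redirected to a `.res` file named after the trace. The sketch below reproduces that pattern for a chosen directory. It is not the kit's actual script; the function name `run_all` and its three arguments are assumptions made for illustration. It uses `mkdir -p`, so an existing results directory does not produce the "File exists" message seen above.

```shell
# Hypothetical re-creation of the loop suggested by the log above.
# Usage: run_all <predictor-binary> <trace-dir> <result-dir>
run_all() {
  predictor=$1
  trace_dir=$2
  out_dir=$3
  mkdir -p "$out_dir"                         # no error if it already exists
  for trace in "$trace_dir"/*.bt9.trace.gz; do
    [ -e "$trace" ] || continue               # skip if the glob matched nothing
    name=$(basename "$trace" .bt9.trace.gz)   # e.g. LONG_MOBILE-1
    echo "Running predictor on $trace"
    "$predictor" "$trace" > "$out_dir/$name.res"
  done
}
```

From the scripts directory, this could be called as `run_all ../sim/predictor ../traces ../results/new_traces`, matching the paths in the log.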

This takes a really long time. In the end I got results like the following.

ResultDirs ==>                  lts/new_traces

LONG_MOBILE-1                  0.679
LONG_MOBILE-4                  5.923
SHORT_MOBILE-25                0.001
SHORT_MOBILE-2                 0.032
SHORT_MOBILE-4                 0.679
LONG_MOBILE-2                  0.002
SHORT_MOBILE-1                 0.036
SHORT_MOBILE-27                0.001
SHORT_MOBILE-30                0.048
LONG_MOBILE-3                  0.003
SHORT_MOBILE-24                0.001
SHORT_MOBILE-28                0.158
SHORT_MOBILE-3                 0.067

AMEAN                          0.587
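
The AMEAN row appears to be the arithmetic mean of the per-trace values: the 13 numbers sum to 7.630, and 7.630 / 13 is about 0.587. A small sketch to recompute it from a table like the one above follows. The function name `amean` is hypothetical, and it assumes each data row is exactly a trace name followed by one number.

```shell
# Hypothetical helper: read a results table on stdin and print the
# arithmetic mean of the per-trace values, ignoring the AMEAN row itself.
amean() {
  awk 'NF == 2 && $1 != "AMEAN" && $2 ~ /^[0-9]+(\.[0-9]+)?$/ { sum += $2; n++ }
       END { if (n > 0) printf "%.3f\n", sum / n }'
}
```

Because LONG_MOBILE-4 (5.923) is far larger than the other values, it dominates this mean.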