FPGA Development Diary

Article index by category: https://msyksphinz.github.io/github_pages , English version: https://fpgadevdiary.hatenadiary.com/

Rocket Chip Has a Dual-Core Configuration (1. Trying Simulation and Vivado Synthesis)

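Rocket Chip, the RISC-V implementation developed at UCB, is normally run in its single-core configuration, but a closer look shows that a dual-core configuration exists as well.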

class DualCoreConfig extends Config(
  new WithNBigCores(2) ++ new BaseConfig)

So, in principle, running the following in the emulator/ directory should build a dual-core Rocket Chip.

make CONFIG=DualCoreConfig

To go one step further and run Dhrystone on the dual-core Rocket Chip, specify the output file as the make target.

make CONFIG=DualCoreConfig output/dhrystone.riscv.out
  • Result:
$ make CONFIG=DualCoreConfig output/dhrystone.riscv.out
mkdir -p ./output
ln -fs /home/msyksphinz/riscv64//riscv64-unknown-elf/share/riscv-tests/benchmarks/dhrystone.riscv output/dhrystone.riscv
./emulator-freechips.rocketchip.system-DualCoreConfig +max-cycles=100000000 +verbose output/dhrystone.riscv 3>&1 1>&2 2>&3 | /home/msyksphinz/riscv64/bin/spike-dasm  > output/dhrystone.riscv.out && [ $PIPESTATUS -eq 0 ]
Microseconds for one run through Dhrystone: 532
Dhrystones per Second:                      1877
mcycle = 266427
minstret = 203319
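
As a quick sanity check on the counters printed above, dividing minstret by mcycle gives the instructions retired per cycle for the measured region. The sketch below only does that arithmetic; the variable names are my own, and the two values are copied from the log.

```python
# Counter values copied from the Dhrystone run above.
mcycle = 266427    # cycles elapsed
minstret = 203319  # instructions retired

ipc = minstret / mcycle  # instructions per cycle
cpi = mcycle / minstret  # cycles per instruction

print(f"IPC = {ipc:.3f}, CPI = {cpi:.3f}")
```

This works out to roughly 0.76 instructions per cycle, or about 1.31 cycles per instruction.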

Running it writes trace information for both cores to output/dhrystone.riscv.out. C0 appears to be core 0 and C1 core 1.

C1:        901 [0] pc=[000001004c] W[r 0=ffffffea5d08a67f][0] R[r 0=a2550095c80cb9b7] R[r 5=9b4288d9bf70827b] inst=[10500073] wfi (args unknown)
C0:        902 [1] pc=[0000000828] W[r 8=0000000000000000][1] R[r 0=0000000000000000] R[r20=0000000000000003] inst=[f1402473] csrr    s0, mhartid
C1:        902 [0] pc=[000001004c] W[r 0=ffffffea5d08a67f][0] R[r 0=a2550095c80cb9b7] R[r 5=9b4288d9bf70827b] inst=[10500073] wfi (args unknown)
C0:        903 [0] pc=[0000000828] W[r 0=0000000000000000][0] R[r 0=0000000000000000] R[r20=0000000000000003] inst=[f1402473] csrr    s0, mhartid
C1:        903 [0] pc=[000001004c] W[r 0=ffffffea5d08a67f][0] R[r 0=a2550095c80cb9b7] R[r 5=9b4288d9bf70827b] inst=[10500073] wfi (args unknown)
C0:        904 [0] pc=[0000000828] W[r 0=0000000000000000][0] R[r 0=0000000000000000] R[r20=0000000000000003] inst=[f1402473] csrr    s0, mhartid
C1:        904 [1] pc=[0000010050] W[r 0=0000000000010052][1] R[r31=a2550095c80cb9b7] R[r29=9b4288d9bf70827b] inst=[0000bff5] j       pc - 4
C0:        905 [1] pc=[000000082c] W[r 0=0000000000000400][1] R[r 8=0000000000000000] R[r 0=0000000000000000] inst=[40044403] lbu     s0, 1024(s0)
C1:        905 [0] pc=[0000010050] W[r 0=0000000000010052][0] R[r31=a2550095c80cb9b7] R[r29=9b4288d9bf70827b] inst=[0000bff5] j       pc - 4
C0:        906 [0] pc=[000000082c] W[r 0=0000000000000400][0] R[r 8=0000000000000000] R[r 0=0000000000000000] inst=[40044403] lbu     s0, 1024(s0)
C1:        906 [1] pc=[000001004c] W[r 0=ffffffea5d08a67f][0] R[r 0=0000000000000000] R[r 5=9b4288d9bf70827b] inst=[10500073] wfi (args unknown)
C0:        907 [0] pc=[000000082c] W[r 0=0000000000000400][0] R[r 8=0000000000000000] R[r 0=0000000000000000] inst=[40044403] lbu     s0, 1024(s0)
C1:        907 [0] pc=[000001004c] W[r 0=ffffffea5d08a67f][0] R[r 0=a2550095c80cb9b7] R[r 5=9b4288d9bf70827b] inst=[10500073] wfi (args unknown)
C0:        908 [0] pc=[000000082c] W[r 0=0000000000000400][0] R[r 8=0000000000000000] R[r 0=0000000000000000] inst=[40044403] lbu     s0, 1024(s0)
C1:        908 [0] pc=[000001004c] W[r 0=ffffffea5d08a67f][0] R[r 0=a2550095c80cb9b7] R[r 5=9b4288d9bf70827b] inst=[10500073] wfi (args unknown)
C0:        909 [0] pc=[000000082c] W[r 0=0000000000000400][0] R[r 8=0000000000000000] R[r 0=0000000000000000] inst=[40044403] lbu     s0, 1024(s0)

Let's split the trace into one file per core.

grep "\[1\] pc" output/dhrystone.riscv.out | grep C0 > output/dhrystone.c0.out
grep "\[1\] pc" output/dhrystone.riscv.out | grep C1 > output/dhrystone.c1.out
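
The same split can be sketched in Python, which is handy when you want to post-process the trace further. This is only a rough equivalent of the two grep commands above: the regular expression and the function name split_by_core are my own, and I am assuming that the flag in brackets right after the cycle count marks cycles in which an instruction actually retired. The sample lines are abbreviated copies of lines from the trace above.

```python
import re
from collections import defaultdict

# Matches the start of a trace line such as
# "C0:        902 [1] pc=[0000000828] ..."
# Groups: core id, cycle count, flag in brackets, program counter.
TRACE_RE = re.compile(r"^C(\d+):\s+(\d+) \[([01])\] pc=\[([0-9a-f]+)\]")

def split_by_core(lines):
    """Group the trace lines whose flag is 1 by core id."""
    per_core = defaultdict(list)
    for line in lines:
        m = TRACE_RE.match(line)
        if m and m.group(3) == "1":
            per_core[int(m.group(1))].append(line.rstrip("\n"))
    return dict(per_core)

# Abbreviated copies of three trace lines (register fields dropped).
sample = [
    "C1:        901 [0] pc=[000001004c] inst=[10500073] wfi",
    "C0:        902 [1] pc=[0000000828] inst=[f1402473] csrr    s0, mhartid",
    "C1:        904 [1] pc=[0000010050] inst=[0000bff5] j       pc - 4",
]

for core, kept in sorted(split_by_core(sample).items()):
    print(f"core {core}: {len(kept)} line(s) kept")
```

To run it on the real trace, pass an open file instead of the sample list, for example split_by_core(open("output/dhrystone.riscv.out")).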

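Apparently core 0 runs Dhrystone normally while core 1 sits idle: in the excerpts below, core 1 reads mhartid (1), loads 1 into a1, and then keeps repeating bgeu a0, a1, pc + 0, a branch to itself that is always taken.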

  • output/dhrystone.c0.out (excerpt)
...
C0:     539000 [1] pc=[0000000804] W[r 0=0000000000000808][1] R[r 0=0000000000000000] R[r12=0000000000000003] inst=[04c0006f] j       pc + 0x4c
C0:     539023 [1] pc=[0000000850] W[r 8=0000000000000000][1] R[r 0=0000000000000000] R[r20=0000000000000003] inst=[f1402473] csrr    s0, mhartid
C0:     539026 [1] pc=[0000000854] W[r 0=0000000000000108][0] R[r 0=0000000000000000] R[r 8=0000000000000000] inst=[10802423] sw      s0, 264(zero)
C0:     539027 [1] pc=[0000000858] W[r 8=00000000800050f0][1] R[r 0=0000000000000000] R[r18=0000000000000003] inst=[7b202473] csrr    s0, dscratch
C0:     539028 [1] pc=[000000085c] W[r 0=00000000800050f0][0] R[r 0=0000000000000000] R[r18=0000000000000003] inst=[7b200073] dret (args unknown)
  • output/dhrystone.c1.out (excerpt)
C1:     163725 [1] pc=[0080000104] W[r 4=0000000080005140][1] R[r 4=000000008000517f] R[r 0=0000000000000000] inst=[fc027213] andi    tp, tp, -64
C1:     163726 [1] pc=[0080000108] W[r10=0000000000000001][1] R[r 0=0000000000000000] R[r20=0000000000000003] inst=[f1402573] csrr    a0, mhartid
C1:     163727 [1] pc=[008000010c] W[r11=0000000000000001][1] R[r 0=0000000000000000] R[r 1=0000000000000003] inst=[00004585] li      a1, 1
C1:     163729 [1] pc=[008000010e] W[r 0=0000000000000000][0] R[r10=0000000000000001] R[r11=0000000000000001] inst=[00b57063] bgeu    a0, a1, pc + 0
C1:     163734 [1] pc=[008000010e] W[r 0=0000000000000000][0] R[r10=0000000000000001] R[r11=0000000000000001] inst=[00b57063] bgeu    a0, a1, pc + 0
C1:     163739 [1] pc=[008000010e] W[r 0=0000000000000000][0] R[r10=0000000000000001] R[r11=0000000000000001] inst=[00b57063] bgeu    a0, a1, pc + 0
C1:     163741 [1] pc=[008000010e] W[r 0=0000000000000000][0] R[r10=0000000000000001] R[r11=0000000000000001] inst=[00b57063] bgeu    a0, a1, pc + 0
C1:     163744 [1] pc=[008000010e] W[r 0=0000000000000000][0] R[r10=0000000000000001] R[r11=0000000000000001] inst=[00b57063] bgeu    a0, a1, pc + 0
C1:     163749 [1] pc=[008000010e] W[r 0=0000000000000000][0] R[r10=0000000000000001] R[r11=0000000000000001] inst=[00b57063] bgeu    a0, a1, pc + 0
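
The C1 excerpt ends on the same instruction word, inst=[00b57063], over and over. To double-check that this really is a branch to itself, here is a small sketch that decodes the B-type fields by hand, following the standard RISC-V base encoding. The function name decode_btype is my own and is not part of any tool used in this post.

```python
def decode_btype(inst):
    """Decode the fields of a 32-bit RISC-V B-type (branch) instruction."""
    opcode = inst & 0x7F
    funct3 = (inst >> 12) & 0x7
    rs1 = (inst >> 15) & 0x1F
    rs2 = (inst >> 20) & 0x1F
    # The branch offset is scattered across the word:
    # imm[12]=inst[31], imm[11]=inst[7], imm[10:5]=inst[30:25], imm[4:1]=inst[11:8]
    imm = (
        (((inst >> 31) & 0x1) << 12)
        | (((inst >> 7) & 0x1) << 11)
        | (((inst >> 25) & 0x3F) << 5)
        | (((inst >> 8) & 0xF) << 1)
    )
    if imm & 0x1000:  # sign-extend the 13-bit offset
        imm -= 0x2000
    return {"opcode": opcode, "funct3": funct3, "rs1": rs1, "rs2": rs2, "imm": imm}

fields = decode_btype(0x00B57063)
print(fields)
```

Opcode 0x63 with funct3 = 7 is BGEU, rs1 = x10 (a0), rs2 = x11 (a1), and the offset is 0. In the trace a0 holds mhartid = 1 and a1 holds 1, so the comparison is always true and the branch targets itself, which is why core 1 never gets past this point.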

Trying a ZedBoard Build with the Dual-Core Configuration

Apply the following changes to create a dual-core configuration. For the FPGA build, I created ZynqDualConfig.

diff --git a/common/src/main/scala/Configs.scala b/common/src/main/scala/Configs.scala
index 7ae2c38..40addd2 100644
--- a/common/src/main/scala/Configs.scala
+++ b/common/src/main/scala/Configs.scala
@@ -39,6 +39,7 @@ class WithSmallCores extends Config(
   })

 class ZynqConfig extends Config(new WithZynqAdapter ++ new DefaultFPGAConfig)
+class ZynqDualConfig extends Config(new WithZynqAdapter ++ new DefaultFPGADualConfig)
 class ZynqSmallConfig extends Config(new WithSmallCores ++ new ZynqConfig)

 class WithIntegrationTest extends Config(
  • rocket-chip/src/main/scala/rocketchip/Configs.scala
diff --git a/src/main/scala/rocketchip/Configs.scala b/src/main/scala/rocketchip/Configs.scala
index e253bfb..5387c14 100644
--- a/src/main/scala/rocketchip/Configs.scala
+++ b/src/main/scala/rocketchip/Configs.scala
@@ -92,6 +92,8 @@ class FPGAConfig extends Config (
 )

 class DefaultFPGAConfig extends Config(new FPGAConfig ++ new BaseConfig)
+class DefaultFPGADualConfig extends Config(new FPGAConfig ++ new WithNCores(2) ++ new BaseConfig)
 class DefaultL2FPGAConfig extends Config(
   new WithL2Capacity(64) ++ new WithL2Cache ++ new DefaultFPGAConfig)

Then run make. In the zedboard directory, I ran make CONFIG=ZynqDualConfig rocket && make CONFIG=ZynqDualConfig.

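However, two cores did not fit in the ZedBoard's FPGA: placement fails because the design needs 58438 Slice LUTs and only 53200 are available. Too bad.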

...
Netlist sorting complete. Time (s): cpu = 00:00:00.09 ; elapsed = 00:00:00.09 . Memory (MB): peak = 2660.359 ; gain = 0.000 ; free physical = 3793 ; free virtual = 8269

Phase 1.2 IO Placement/ Clock Placement/ Build Placer Device
ERROR: [Place 30-640] Place Check : This design requires more Slice LUTs cells than are available in the target device. This design requires 58438 of such cell types but only 53200 compatible sites are available in the target device. Please analyze your synthesis results and constraints to ensure the design is mapped to Xilinx primitives as expected. If so, please consider targeting a larger device. Please set tcl parameter "drc.disableLUTOverUtilError" to 1 to change this error to warning.
ERROR: [Place 30-640] Place Check : This design requires more LUT as Logic cells than are available in the target device. This design requires 57297 of such cell types but only 53200 compatible sites are available in the target device. Please analyze your synthesis results and constraints to ensure the design is mapped to Xilinx primitives as expected. If so, please consider targeting a larger device. Please set tcl parameter "drc.disableLUTOverUtilError" to 1 to change this error to warning.
INFO: [Timing 38-35] Done setting XDC timing constraints.
Phase 1.2 IO Placement/ Clock Placement/ Build Placer Device | Checksum: e1670006

Time (s): cpu = 00:00:19 ; elapsed = 00:00:12 . Memory (MB): peak = 2660.359 ; gain = 0.000 ; free physical = 3699 ; free virtual = 8179
Phase 1 Placer Initialization | Checksum: e1670006

Time (s): cpu = 00:00:19 ; elapsed = 00:00:12 . Memory (MB): peak = 2660.359 ; gain = 0.000 ; free physical = 3699 ; free virtual = 8179
ERROR: [Place 30-99] Placer failed with error: 'Implementation Feasibility check failed, Please see the previously displayed individual error or warning messages for more details.'
Please review all ERROR, CRITICAL WARNING, and WARNING messages during placement to understand the cause for failure.
Ending Placer Task | Checksum: e1670006

Time (s): cpu = 00:00:19 ; elapsed = 00:00:12 . Memory (MB): peak = 2660.359 ; gain = 0.000 ; free physical = 3700 ; free virtual = 8180
49 Infos, 0 Warnings, 0 Critical Warnings and 4 Errors encountered.
place_design failed
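
To get a feel for how far over budget the design is, the numbers in the two [Place 30-640] errors can be turned into a utilization figure. The sketch below is plain arithmetic on values copied from the log; the dictionary layout and variable names are my own.

```python
# Numbers copied from the two [Place 30-640] errors above.
available = 53200
required = {
    "Slice LUTs": 58438,
    "LUT as Logic": 57297,
}

for name, need in required.items():
    over = need - available
    print(f"{name}: {need}/{available} = {need / available:.1%} ({over} over)")
```

So the dual-core design overshoots the device by about 5200 Slice LUTs, close to 10% more than the part provides, which is too much to hope a few synthesis options will recover.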


With the small-core configuration (SmallConfig), the design did manage to fit.

diff --git a/common/src/main/scala/Configs.scala b/common/src/main/scala/Configs.scala
index 7ae2c38..3a30d2c 100644
--- a/common/src/main/scala/Configs.scala
+++ b/common/src/main/scala/Configs.scala
@@ -39,6 +39,8 @@ class WithSmallCores extends Config(
   })

 class ZynqConfig extends Config(new WithZynqAdapter ++ new DefaultFPGAConfig)
+class ZynqDualConfig extends Config(new WithZynqAdapter ++ new DefaultFPGADualConfig)
+class ZynqDualSmallConfig extends Config(new WithZynqAdapter ++ new DefaultFPGASmallDualConfig)
 class ZynqSmallConfig extends Config(new WithSmallCores ++ new ZynqConfig)
  • rocket-chip/src/main/scala/rocketchip/Configs.scala
diff --git a/src/main/scala/rocketchip/Configs.scala b/src/main/scala/rocketchip/Configs.scala
index e253bfb..a4c0aa3 100644
--- a/src/main/scala/rocketchip/Configs.scala
+++ b/src/main/scala/rocketchip/Configs.scala
@@ -92,6 +92,9 @@ class FPGAConfig extends Config (
 )

 class DefaultFPGAConfig extends Config(new FPGAConfig ++ new BaseConfig)
+class DefaultFPGADualConfig extends Config(new FPGAConfig ++ new WithNCores(2) ++ new BaseConfig)
+class DefaultFPGASmallDualConfig extends Config(new FPGAConfig ++ new WithSmallCores ++ new WithNCores(2) ++ new BaseConfig)
 class DefaultL2FPGAConfig extends Config(
   new WithL2Capacity(64) ++ new WithL2Cache ++ new DefaultFPGAConfig)

This switches the build over to DefaultFPGASmallDualConfig. I ran synthesis and implementation with make CONFIG=ZynqDualSmallConfig.
