FPGA開発日記 (FPGA Development Diary)

Article index by category: https://msyksphinz.github.io/github_pages , English version: https://fpgadevdiary.hatenadiary.com/

Running logic synthesis on my home-built RISC-V CPU core with Vivado (critical path analysis)

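I ran logic synthesis on my home-built out-of-order CPU for the first time in a very long while, and found that it has a very large critical path. That is not good, so it needs to be improved.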

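I used Vivado (v2019.2, according to the report header). I do not know the target FPGA device well; the report header lists it as 7z030-fbg484.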

------------------------------------------------------------------------------------
| Tool Version : Vivado v.2019.2 (lin64) Build 2708876 Wed Nov  6 21:39:14 MST 2019
| Date         : Thu Dec 15 19:10:30 2022
| Host         : polygon running 64-bit Ubuntu 22.04.1 LTS
| Command      : report_timing -file scariv_tile_wrapper_timing_synth.rpt
| Design       : scariv_tile_wrapper
| Device       : 7z030-fbg484
| Speed File   : -3  PRODUCTION 1.11 2014-09-11
------------------------------------------------------------------------------------

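The timing report is shown below. The worst path misses the 10 ns requirement by 31.580 ns and has 145 logic levels, running from an FPU input pipeline register to r_ex1_div_stall_reg in the ALU. From my analysis, the logic I added to forward register file values passes through many modules, and that is what has become the critical path.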

Timing Report

Slack (VIOLATED) :        -31.580ns  (required time - arrival time)
  Source:                 u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/gen_input_pipeline[3].inp_pipe_operands_q_reg[4][1][21]/C
                            (rising edge-triggered cell FDCE clocked by i_clk  {rise@0.000ns fall@5.000ns period=10.000ns})
  Destination:            u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_div_stall_reg/D
                            (rising edge-triggered cell FDCE clocked by i_clk  {rise@0.000ns fall@5.000ns period=10.000ns})
  Path Group:             i_clk
  Path Type:              Setup (Max at Slow Process Corner)
  Requirement:            10.000ns  (i_clk rise@10.000ns - i_clk rise@0.000ns)
  Data Path Delay:        41.441ns  (logic 10.609ns (25.600%)  route 30.832ns (74.400%))
  Logic Levels:           145  (CARRY4=53 LUT1=1 LUT2=9 LUT3=9 LUT4=10 LUT5=15 LUT6=47 MUXF7=1)
    Location             Delay type                Incr(ns)  Path(ns)    Netlist Resource(s)
  -------------------------------------------------------------------    -------------------
                         (clock i_clk rise edge)      0.000     0.000 r
                                                      0.000     0.000 r  i_clk (IN)
                         net (fo=0)                   0.000     0.000    i_clk
                         IBUF (Prop_ibuf_I_O)         0.744     0.744 r  i_clk_IBUF_inst/O
                         net (fo=1, unplaced)         0.390     1.134    i_clk_IBUF
                         BUFG (Prop_bufg_I_O)         0.080     1.214 r  i_clk_IBUF_BUFG_inst/O
                         net (fo=289811, unplaced)    0.584     1.798    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i_clk_IBUF_BUFG
                         FDCE                                         r  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/gen_input_pipeline[3].inp_pipe_operands_q_reg[4][1][21]/C
  -------------------------------------------------------------------    -------------------
                         FDCE (Prop_fdce_C_Q)         0.226     2.024 r  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/gen_input_pipeline[3].inp_pipe_operands_q_reg[4][1][21]/Q
                         net (fo=2, unplaced)         0.399     2.423    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/gen_input_pipeline[3].inp_pipe_operands_q_reg_n_1_[4][1][21]
                         LUT4 (Prop_lut4_I1_O)        0.119     2.542 r  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___43_i_59/O
                         net (fo=1, unplaced)         0.244     2.786    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___43_i_59_n_1
                         LUT5 (Prop_lut5_I4_O)        0.043     2.829 r  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___43_i_34/O
                         net (fo=1, unplaced)         0.515     3.344    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___43_i_34_n_1
                         LUT6 (Prop_lut6_I1_O)        0.043     3.387 f  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___43_i_21/O
                         net (fo=5, unplaced)         0.272     3.659    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___43_i_21_n_1
                         LUT3 (Prop_lut3_I1_O)        0.043     3.702 r  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___39_i_123/O
                         net (fo=2, unplaced)         0.255     3.957    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/info_q[1][is_subnormal]
                         LUT5 (Prop_lut5_I4_O)        0.043     4.000 r  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___39_i_121/O
                         net (fo=2, unplaced)         0.255     4.255    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___39_i_121_n_1
                         LUT6 (Prop_lut6_I5_O)        0.043     4.298 r  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___39_i_78/O
                         net (fo=2, unplaced)         0.259     4.557    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___39_i_78_n_1
                         CARRY4 (Prop_carry4_DI[2]_O[3])
                                                      0.224     4.781 r  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___39_i_56/O[3]
                         net (fo=3, unplaced)         0.284     5.065    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/exponent_product0[3]
                         LUT2 (Prop_lut2_I0_O)        0.120     5.185 r  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___39_i_55/O
                         net (fo=11, unplaced)        0.290     5.475    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/exponent_product_q__0[3]
                         LUT2 (Prop_lut2_I1_O)        0.043     5.518 r  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/i___31_i_590/O
                         net (fo=1, unplaced)         0.000     5.518    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_891[1]
                         CARRY4 (Prop_carry4_S[3]_CO[3])
                                                      0.173     5.691 r  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_346/CO[3]
                         net (fo=1, unplaced)         0.000     5.691    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_346_n_1
                         CARRY4 (Prop_carry4_CI_O[1])
                                                      0.159     5.850 f  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_350/O[1]
                         net (fo=10, unplaced)        0.361     6.211    u_scariv_tile/fpu.u_fp_phy_registers/exponent_difference_4[5]
                         LUT2 (Prop_lut2_I0_O)        0.126     6.337 r  u_scariv_tile/fpu.u_fp_phy_registers/i___31_i_885/O
                         net (fo=1, unplaced)         0.000     6.337    u_scariv_tile/fpu.u_fp_phy_registers/i___31_i_885_n_1
                         CARRY4 (Prop_carry4_DI[2]_CO[3])
                                                      0.203     6.540 r  u_scariv_tile/fpu.u_fp_phy_registers/i___31_i_586/CO[3]
                         net (fo=1, unplaced)         0.000     6.540    u_scariv_tile/fpu.u_fp_phy_registers/i___31_i_586_n_1
                         CARRY4 (Prop_carry4_CI_CO[2])
                                                      0.122     6.662 r  u_scariv_tile/fpu.u_fp_phy_registers/i___31_i_345/CO[2]
                         net (fo=22, unplaced)        0.306     6.968    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_214_0[0]
                         LUT6 (Prop_lut6_I4_O)        0.122     7.090 f  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_594/O
                         net (fo=55, unplaced)        0.329     7.419    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_594_n_1
                         LUT6 (Prop_lut6_I5_O)        0.043     7.462 f  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_347/O
                         net (fo=3, unplaced)         0.262     7.724    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_347_n_1
                         LUT6 (Prop_lut6_I0_O)        0.043     7.767 f  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_351/O
                         net (fo=2, unplaced)         0.255     8.022    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_351_n_1
                         LUT6 (Prop_lut6_I5_O)        0.043     8.065 f  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_391/O
...
                         LUT6 (Prop_lut6_I1_O)        0.043    41.275 r  u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_pipe_ctrl[imm][0]_i_11/O
                         net (fo=1, unplaced)         0.244    41.519    u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_pipe_ctrl[imm][0]_i_11_n_1
                         LUT4 (Prop_lut4_I1_O)        0.043    41.562 r  u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_pipe_ctrl[imm][0]_i_6/O
                         net (fo=11, unplaced)        0.290    41.852    u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_pipe_ctrl[imm][0]_i_6_n_1
                         LUT4 (Prop_lut4_I2_O)        0.043    41.895 r  u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_pipe_ctrl[op][0]_i_9/O
                         net (fo=3, unplaced)         0.262    42.157    u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_pipe_ctrl[op][0]_i_9_n_1
                         LUT5 (Prop_lut5_I0_O)        0.043    42.200 r  u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_pipe_ctrl[op][0]_i_4/O
                         net (fo=1, unplaced)         0.377    42.577    u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_pipe_ctrl[op][0]_i_4_n_1
                         LUT5 (Prop_lut5_I0_O)        0.043    42.620 r  u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_pipe_ctrl[op][0]_i_2/O
                         net (fo=1, unplaced)         0.000    42.620    u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_pipe_ctrl[op][0]_i_2_n_1
                         MUXF7 (Prop_muxf7_I0_O)      0.112    42.732 r  u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_pipe_ctrl_reg[op][0]_i_1/O
                         net (fo=2, unplaced)         0.388    43.120    u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/w_ex0_pipe_ctrl[op][0]
                         LUT5 (Prop_lut5_I1_O)        0.119    43.239 r  u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_div_stall_i_1/O
                         net (fo=1, unplaced)         0.000    43.239    u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/w_ex0_div_stall
                         FDCE                                         r  u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_div_stall_reg/D
  -------------------------------------------------------------------    -------------------

                         (clock i_clk rise edge)     10.000    10.000 r
                                                      0.000    10.000 r  i_clk (IN)
                         net (fo=0)                   0.000    10.000    i_clk
                         IBUF (Prop_ibuf_I_O)         0.669    10.669 r  i_clk_IBUF_inst/O
                         net (fo=1, unplaced)         0.371    11.039    i_clk_IBUF
                         BUFG (Prop_bufg_I_O)         0.072    11.111 r  i_clk_IBUF_BUFG_inst/O
                         net (fo=289811, unplaced)    0.439    11.550    u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/i_clk_IBUF_BUFG
                         FDCE                                         r  u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_div_stall_reg/C
                         clock pessimism              0.103    11.653
                         clock uncertainty           -0.035    11.618
                         FDCE (Setup_fdce_C_D)        0.041    11.659    u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_div_stall_reg
  -------------------------------------------------------------------
                         required time                         11.659
                         arrival time                         -43.239
  -------------------------------------------------------------------
                         slack                                -31.580
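
To cross-check the numbers at the end of the report and to see which modules the path crosses, a small script can help. The following is only a minimal sketch, not part of the original flow: the function names slack_ns and module_hops are hypothetical, and REPORT holds just a few lines copied from the report above. It recomputes slack as required time minus arrival time, and lists the leading hierarchy levels of each net on the path.

```python
import re

# A few lines copied from the report above. Paste the whole report here
# (or read the .rpt file) to analyze the full path.
REPORT = """\
                         net (fo=10, unplaced)        0.361     6.211    u_scariv_tile/fpu.u_fp_phy_registers/exponent_difference_4[5]
                         net (fo=22, unplaced)        0.306     6.968    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_214_0[0]
                         net (fo=2, unplaced)         0.388    43.120    u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/w_ex0_pipe_ctrl[op][0]
                         required time                         11.659
                         arrival time                         -43.239
"""

# Matches "net (fo=N, unplaced)  incr  path  name" and "net (fo=N)  incr  path  name".
NET_LINE = re.compile(
    r"^\s*net \(fo=(\d+)(?:, \w+)?\)\s+[\d.]+\s+[\d.]+\s+(\S+)", re.M
)


def slack_ns(report):
    """Recompute slack as required time minus arrival time (in ns)."""
    required = float(re.search(r"required time\s+(-?\d[\d.]*)", report).group(1))
    # The summary prints the arrival time already negated (e.g. -43.239).
    arrival = abs(float(re.search(r"arrival time\s+(-?\d[\d.]*)", report).group(1)))
    return round(required - arrival, 3)


def module_hops(report, depth=2):
    """List the modules the path visits, collapsing consecutive repeats."""
    hops = []
    for _fanout, net in NET_LINE.findall(report):
        scope = net.split("/")[:-1]      # drop the leaf net name
        if not scope:
            continue                     # top-level net such as i_clk_IBUF
        module = "/".join(scope[:depth])
        if not hops or hops[-1] != module:
            hops.append(module)
    return hops


if __name__ == "__main__":
    print("slack [ns]:", slack_ns(REPORT))
    for hop in module_hops(REPORT):
        print("  ->", hop)
```

On these sample lines the recomputed slack matches the -31.580 ns in the report. Note that in a flattened netlist the hierarchy prefix of a net name is only a hint about where the logic really lives, so the list of hops should be read with caution.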

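No matter how many times I check the critical paths in this area, I cannot pin down the specific location, so I will turn off hierarchy flattening entirely for now and look for the part that is truly critical.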

+
 synth_design -top ${TOP_NAME} -part $DEVICE_NAME -fanout_limit 10000 \
     -include_dir ../../src/fpnew/src/common_cells/include \
+    -flatten_hierarchy none \
     -include_dir ../../src \
     -verilog_define $::env(RV_DEFINE)
 write_checkpoint -force ${TOP_NAME}.dcp

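With that change, the critical path moved to the frontend side. This is the logic around extracting RVC instructions. This one is going to be quite a hassle... What could I do to simplify the extraction logic?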
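
As a reminder of why this kind of logic tends to form a long chain, here is a minimal software model of instruction boundary detection. It is a sketch under assumptions, not the RTL of this design: the function name instruction_starts is hypothetical, and it assumes that only 16-bit and 32-bit encodings appear and that the first halfword begins an instruction. In RISC-V, an instruction whose lowest two bits are 0b11 is 32 bits wide (wider formats are ignored here), and anything else is a 16-bit RVC instruction.

```python
def instruction_starts(halfwords):
    """Return the halfword indices at which an instruction starts.

    Assumes the first halfword begins an instruction and that only
    16-bit and 32-bit encodings are present.
    """
    starts = []
    i = 0
    while i < len(halfwords):
        starts.append(i)
        # Lowest two bits 0b11 -> 32-bit instruction, otherwise 16-bit RVC.
        is_32bit = (halfwords[i] & 0b11) == 0b11
        # The next start depends on the length decided here, so each
        # decision depends on the one before it.
        i += 2 if is_32bit else 1
    return starts


if __name__ == "__main__":
    # Example halfwords chosen only for their low two bits:
    # 16-bit, then a 32-bit instruction (two halfwords), then 16-bit.
    fetch_group = [0x4501, 0x0513, 0x0015, 0x8082]
    print(instruction_starts(fetch_group))
```

The point of the model is the dependency in the loop: where instruction k starts depends on the length of every instruction before it. If the hardware resolves this serially, the expander for slot 1 has to wait for slot 0, which would be consistent with a path that goes from rvc_expand_loop[0] into rvc_expand_loop[1], although the report alone does not confirm that this is the cause.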

                         LUT4 (Prop_lut4_I0_O)        0.043     4.003 r  u_scariv_tile/u_frontend/u_scariv_inst_buffer/word_loop[0].u_decoder_inst_cat_i_745/O
                         net (fo=128, unplaced)       0.354     4.357    u_scariv_tile/u_frontend/u_scariv_inst_buffer/word_loop[0].u_decoder_inst_cat_i_745_n_0
                         MUXF7 (Prop_muxf7_S_O)       0.168     4.525 f  u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[0].u_scariv_rvc_expander_i_532/O
                         net (fo=1, unplaced)         0.377     4.902    u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[0].u_scariv_rvc_expander_i_532_n_0
                         LUT6 (Prop_lut6_I1_O)        0.119     5.021 f  u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[0].u_scariv_rvc_expander_i_161/O
                         net (fo=1, unplaced)         0.377     5.398    u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[0].u_scariv_rvc_expander_i_161_n_0
                         LUT6 (Prop_lut6_I1_O)        0.043     5.441 f  u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[0].u_scariv_rvc_expander_i_58/O
                         net (fo=1, unplaced)         0.244     5.685    u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[0].u_scariv_rvc_expander_i_58_n_0
                         LUT5 (Prop_lut5_I4_O)        0.043     5.728 f  u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[0].u_scariv_rvc_expander_i_16/O
                         net (fo=89, unplaced)        0.319     6.047    u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[0].u_scariv_rvc_expander_i_16_n_0
                         LUT2 (Prop_lut2_I1_O)        0.043     6.090 r  u_scariv_tile/u_frontend/u_scariv_inst_buffer/u_bru_inst_cnt_i_5/O
                         net (fo=107, unplaced)       0.345     6.435    u_scariv_tile/u_frontend/u_scariv_inst_buffer/u_bru_inst_cnt_i_5_n_0
                         LUT3 (Prop_lut3_I2_O)        0.043     6.478 r  u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[1].u_scariv_rvc_expander_i_117/O
                         net (fo=2, unplaced)         0.255     6.733    u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[1].u_scariv_rvc_expander_i_117_n_0
                         LUT6 (Prop_lut6_I5_O)        0.043     6.776 r  u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[1].u_scariv_rvc_expander_i_50/O
                         net (fo=3, unplaced)         0.262     7.038    u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[1].u_scariv_rvc_expander_i_50_n_0
                         LUT6 (Prop_lut6_I5_O)        0.043     7.081 r  u_scariv_tile/u_frontend/u_scariv_inst_buffer/u_except_disp_pick_up_i_65/O
                         net (fo=42, unplaced)        0.322     7.403    u_scariv_tile/u_frontend/u_scariv_inst_buffer/u_except_disp_pick_up_i_65_n_0
                         LUT4 (Prop_lut4_I0_O)        0.043     7.446 r  u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[1].u_scariv_rvc_expander_i_1660/O
                         net (fo=128, unplaced)       0.354     7.800    u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[1].u_scariv_rvc_expander_i_1660_n_0
                         MUXF7 (Prop_muxf7_S_O)       0.168     7.968 f  u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[1].u_scariv_rvc_expander_i_833/O
                         net (fo=1, unplaced)         0.000     7.968    u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[1].u_scariv_rvc_expander_i_833_n_0
                         MUXF8 (Prop_muxf8_I0_O)      0.044     8.012 f  u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[1].u_scariv_rvc_expander_i_348/O
                         net (fo=1, unplaced)         0.351     8.363    u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[1].u_scariv_rvc_expander_i_348_n_0
                         LUT6 (Prop_lut6_I0_O)        0.120     8.483 f  u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[1].u_scariv_rvc_expander_i_110/O