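I ran logic synthesis on my home-built out-of-order CPU for the first time in a long while and found a very long critical path. That is not acceptable, so it needs to be fixed.
I used Vivado. I am not really sure about the FPGA device that was targeted.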
------------------------------------------------------------------------------------
| Tool Version : Vivado v.2019.2 (lin64) Build 2708876 Wed Nov 6 21:39:14 MST 2019
| Date         : Thu Dec 15 19:10:30 2022
| Host         : polygon running 64-bit Ubuntu 22.04.1 LTS
| Command      : report_timing -file scariv_tile_wrapper_timing_synth.rpt
| Design       : scariv_tile_wrapper
| Device       : 7z030-fbg484
| Speed File   : -3 PRODUCTION 1.11 2014-09-11
------------------------------------------------------------------------------------
The timing report is shown below. Analyzing it, the logic I had added to forward register file data passes through many modules, and that has become the critical path.
Timing Report

Slack (VIOLATED) :        -31.580ns  (required time - arrival time)
  Source:                 u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/gen_input_pipeline[3].inp_pipe_operands_q_reg[4][1][21]/C
                            (rising edge-triggered cell FDCE clocked by i_clk {rise@0.000ns fall@5.000ns period=10.000ns})
  Destination:            u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_div_stall_reg/D
                            (rising edge-triggered cell FDCE clocked by i_clk {rise@0.000ns fall@5.000ns period=10.000ns})
  Path Group:             i_clk
  Path Type:              Setup (Max at Slow Process Corner)
  Requirement:            10.000ns  (i_clk rise@10.000ns - i_clk rise@0.000ns)
  Data Path Delay:        41.441ns  (logic 10.609ns (25.600%) route 30.832ns (74.400%))
  Logic Levels:           145  (CARRY4=53 LUT1=1 LUT2=9 LUT3=9 LUT4=10 LUT5=15 LUT6=47 MUXF7=1)
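The summary figures can be cross-checked with a little arithmetic. The sketch below only reuses numbers printed in the summary; the variable names are mine, and the "minimum period" is the usual rough estimate (requirement minus slack) for this one path, not a tool-reported value. Since this is a post-synthesis report and the nets in the path listing are marked unplaced, the route share is an estimate rather than a measured routing delay.

```python
# Figures copied from the report summary above.
requirement_ns = 10.000   # i_clk period
slack_ns = -31.580        # reported setup slack
data_path_ns = 41.441     # total data path delay
logic_ns = 10.609         # cell delay part
route_ns = 30.832         # (estimated) net delay part
logic_levels = 145

# Rough shortest period at which this path would meet setup,
# assuming nothing else about the path changes.
min_period_ns = requirement_ns - slack_ns
fmax_mhz = 1000.0 / min_period_ns

# How much of the data path is net delay, and the average cost per logic level.
route_share = route_ns / data_path_ns
avg_per_level_ns = data_path_ns / logic_levels

print(f"min period    : {min_period_ns:.3f} ns ({fmax_mhz:.2f} MHz)")
print(f"route share   : {route_share:.1%}")
print(f"avg per level : {avg_per_level_ns:.3f} ns over {logic_levels} levels")
```

With 145 logic levels, the depth of the chain matters far more than any single cell in it, which is consistent with the path wandering through several modules.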
Location  Delay type                         Incr(ns)  Path(ns)    Netlist Resource(s)
-------------------------------------------------------------------    -------------------
          (clock i_clk rise edge)               0.000     0.000 r
                                                0.000     0.000 r  i_clk (IN)
          net (fo=0)                            0.000     0.000    i_clk
          IBUF (Prop_ibuf_I_O)                  0.744     0.744 r  i_clk_IBUF_inst/O
          net (fo=1, unplaced)                  0.390     1.134    i_clk_IBUF
          BUFG (Prop_bufg_I_O)                  0.080     1.214 r  i_clk_IBUF_BUFG_inst/O
          net (fo=289811, unplaced)             0.584     1.798    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i_clk_IBUF_BUFG
          FDCE                                                  r  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/gen_input_pipeline[3].inp_pipe_operands_q_reg[4][1][21]/C
-------------------------------------------------------------------    -------------------
          FDCE (Prop_fdce_C_Q)                  0.226     2.024 r  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/gen_input_pipeline[3].inp_pipe_operands_q_reg[4][1][21]/Q
          net (fo=2, unplaced)                  0.399     2.423    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/gen_input_pipeline[3].inp_pipe_operands_q_reg_n_1_[4][1][21]
          LUT4 (Prop_lut4_I1_O)                 0.119     2.542 r  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___43_i_59/O
          net (fo=1, unplaced)                  0.244     2.786    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___43_i_59_n_1
          LUT5 (Prop_lut5_I4_O)                 0.043     2.829 r  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___43_i_34/O
          net (fo=1, unplaced)                  0.515     3.344    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___43_i_34_n_1
          LUT6 (Prop_lut6_I1_O)                 0.043     3.387 f  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___43_i_21/O
          net (fo=5, unplaced)                  0.272     3.659    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___43_i_21_n_1
          LUT3 (Prop_lut3_I1_O)                 0.043     3.702 r  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___39_i_123/O
          net (fo=2, unplaced)                  0.255     3.957    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/info_q[1][is_subnormal]
          LUT5 (Prop_lut5_I4_O)                 0.043     4.000 r  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___39_i_121/O
          net (fo=2, unplaced)                  0.255     4.255    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___39_i_121_n_1
          LUT6 (Prop_lut6_I5_O)                 0.043     4.298 r  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___39_i_78/O
          net (fo=2, unplaced)                  0.259     4.557    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___39_i_78_n_1
          CARRY4 (Prop_carry4_DI[2]_O[3])       0.224     4.781 r  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___39_i_56/O[3]
          net (fo=3, unplaced)                  0.284     5.065    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/exponent_product0[3]
          LUT2 (Prop_lut2_I0_O)                 0.120     5.185 r  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___39_i_55/O
          net (fo=11, unplaced)                 0.290     5.475    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/exponent_product_q__0[3]
          LUT2 (Prop_lut2_I1_O)                 0.043     5.518 r  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/i___31_i_590/O
          net (fo=1, unplaced)                  0.000     5.518    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_891[1]
          CARRY4 (Prop_carry4_S[3]_CO[3])       0.173     5.691 r  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_346/CO[3]
          net (fo=1, unplaced)                  0.000     5.691    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_346_n_1
          CARRY4 (Prop_carry4_CI_O[1])          0.159     5.850 f  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_350/O[1]
          net (fo=10, unplaced)                 0.361     6.211    u_scariv_tile/fpu.u_fp_phy_registers/exponent_difference_4[5]
          LUT2 (Prop_lut2_I0_O)                 0.126     6.337 r  u_scariv_tile/fpu.u_fp_phy_registers/i___31_i_885/O
          net (fo=1, unplaced)                  0.000     6.337    u_scariv_tile/fpu.u_fp_phy_registers/i___31_i_885_n_1
          CARRY4 (Prop_carry4_DI[2]_CO[3])      0.203     6.540 r  u_scariv_tile/fpu.u_fp_phy_registers/i___31_i_586/CO[3]
          net (fo=1, unplaced)                  0.000     6.540    u_scariv_tile/fpu.u_fp_phy_registers/i___31_i_586_n_1
          CARRY4 (Prop_carry4_CI_CO[2])         0.122     6.662 r  u_scariv_tile/fpu.u_fp_phy_registers/i___31_i_345/CO[2]
          net (fo=22, unplaced)                 0.306     6.968    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_214_0[0]
          LUT6 (Prop_lut6_I4_O)                 0.122     7.090 f  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_594/O
          net (fo=55, unplaced)                 0.329     7.419    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_594_n_1
          LUT6 (Prop_lut6_I5_O)                 0.043     7.462 f  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_347/O
          net (fo=3, unplaced)                  0.262     7.724    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_347_n_1
          LUT6 (Prop_lut6_I0_O)                 0.043     7.767 f  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_351/O
          net (fo=2, unplaced)                  0.255     8.022    u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_351_n_1
          LUT6 (Prop_lut6_I5_O)                 0.043     8.065 f  u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_391/O
          ...
          LUT6 (Prop_lut6_I1_O)                 0.043    41.275 r  u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_pipe_ctrl[imm][0]_i_11/O
          net (fo=1, unplaced)                  0.244    41.519    u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_pipe_ctrl[imm][0]_i_11_n_1
          LUT4 (Prop_lut4_I1_O)                 0.043    41.562 r  u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_pipe_ctrl[imm][0]_i_6/O
          net (fo=11, unplaced)                 0.290    41.852    u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_pipe_ctrl[imm][0]_i_6_n_1
          LUT4 (Prop_lut4_I2_O)                 0.043    41.895 r  u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_pipe_ctrl[op][0]_i_9/O
          net (fo=3, unplaced)                  0.262    42.157    u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_pipe_ctrl[op][0]_i_9_n_1
          LUT5 (Prop_lut5_I0_O)                 0.043    42.200 r  u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_pipe_ctrl[op][0]_i_4/O
          net (fo=1, unplaced)                  0.377    42.577    u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_pipe_ctrl[op][0]_i_4_n_1
          LUT5 (Prop_lut5_I0_O)                 0.043    42.620 r  u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_pipe_ctrl[op][0]_i_2/O
          net (fo=1, unplaced)                  0.000    42.620    u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_pipe_ctrl[op][0]_i_2_n_1
          MUXF7 (Prop_muxf7_I0_O)               0.112    42.732 r  u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_pipe_ctrl_reg[op][0]_i_1/O
          net (fo=2, unplaced)                  0.388    43.120    u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/w_ex0_pipe_ctrl[op][0]
          LUT5 (Prop_lut5_I1_O)                 0.119    43.239 r  u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_div_stall_i_1/O
          net (fo=1, unplaced)                  0.000    43.239    u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/w_ex0_div_stall
          FDCE                                                  r  u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_div_stall_reg/D
-------------------------------------------------------------------    -------------------
          (clock i_clk rise edge)              10.000    10.000 r
                                                0.000    10.000 r  i_clk (IN)
          net (fo=0)                            0.000    10.000    i_clk
          IBUF (Prop_ibuf_I_O)                  0.669    10.669 r  i_clk_IBUF_inst/O
          net (fo=1, unplaced)                  0.371    11.039    i_clk_IBUF
          BUFG (Prop_bufg_I_O)                  0.072    11.111 r  i_clk_IBUF_BUFG_inst/O
          net (fo=289811, unplaced)             0.439    11.550    u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/i_clk_IBUF_BUFG
          FDCE                                                  r  u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_div_stall_reg/C
          clock pessimism                       0.103    11.653
          clock uncertainty                    -0.035    11.618
          FDCE (Setup_fdce_C_D)                 0.041    11.659    u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_div_stall_reg
-------------------------------------------------------------------
          required time                                  11.659
          arrival time                                  -43.239
-------------------------------------------------------------------
          slack                                         -31.580
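Because the path crosses several modules, it can help to total the incremental delay per hierarchy prefix instead of reading the listing row by row. The following is a minimal sketch, not a Vivado feature: the function name `delay_by_module`, the regular expression, and the choice of two hierarchy levels are my own assumptions, and it expects one report row per line in the form "delay type, Incr(ns), Path(ns), optional r/f flag, netlist resource". The sample rows are copied from the report above. With hierarchy flattening enabled, instance names may not reflect the module where the logic was written, so treat the grouping as a hint only.

```python
import re
from collections import defaultdict

# A few data-path rows copied from the report above (one row per line).
SAMPLE = """\
LUT2 (Prop_lut2_I0_O) 0.126 6.337 r u_scariv_tile/fpu.u_fp_phy_registers/i___31_i_885/O
net (fo=1, unplaced) 0.000 6.337 u_scariv_tile/fpu.u_fp_phy_registers/i___31_i_885_n_1
CARRY4 (Prop_carry4_DI[2]_CO[3]) 0.203 6.540 r u_scariv_tile/fpu.u_fp_phy_registers/i___31_i_586/CO[3]
net (fo=22, unplaced) 0.306 6.968 u_scariv_tile/fpu.fpu_loop[1].u_scariv_fpu/u_fpu/u_scariv_fpnew_wrapper/fma64.fpnew_64/i___31_i_214_0[0]
LUT5 (Prop_lut5_I1_O) 0.119 43.239 r u_scariv_tile/alu_loop[0].u_scariv_alu/u_alu/r_ex1_div_stall_i_1/O
"""

ROW = re.compile(
    r"^\s*(?P<kind>.+?)\s+"      # cell or net description
    r"(?P<incr>-?\d+\.\d+)\s+"   # Incr(ns)
    r"(?P<path>-?\d+\.\d+)\s+"   # Path(ns)
    r"(?:[rf]\s+)?"              # optional rise/fall flag
    r"(?P<res>\S+)\s*$"          # netlist resource
)


def delay_by_module(report_text, depth=2):
    """Sum Incr(ns) per hierarchy prefix of `depth` levels; skip other lines."""
    totals = defaultdict(float)
    for line in report_text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # separators, headers, rows without a resource
        module = "/".join(m["res"].split("/")[:depth])
        totals[module] += float(m["incr"])
    return dict(totals)


if __name__ == "__main__":
    totals = delay_by_module(SAMPLE)
    for module, ns in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{ns:8.3f} ns  {module}")
```

Run over a complete path listing, the per-module totals show where the path spends most of its time, and raising `depth` narrows this down to individual sub-blocks.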
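With critical paths like this one, I can never pin down the exact location no matter how many times I look, so I will turn off hierarchy flattening entirely and look for the place that is truly critical.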
+ synth_design -top ${TOP_NAME} -part $DEVICE_NAME -fanout_limit 10000 \
      -include_dir ../../src/fpnew/src/common_cells/include \
+     -flatten_hierarchy none \
      -include_dir ../../src \
      -verilog_define $::env(RV_DEFINE)
  write_checkpoint -force ${TOP_NAME}.dcp
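With that change, the critical path moved to the frontend, into the logic around extracting RVC instructions. This one is fairly troublesome... I wonder what I could do to simplify the extraction logic.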
          LUT4 (Prop_lut4_I0_O)                 0.043     4.003 r  u_scariv_tile/u_frontend/u_scariv_inst_buffer/word_loop[0].u_decoder_inst_cat_i_745/O
          net (fo=128, unplaced)                0.354     4.357    u_scariv_tile/u_frontend/u_scariv_inst_buffer/word_loop[0].u_decoder_inst_cat_i_745_n_0
          MUXF7 (Prop_muxf7_S_O)                0.168     4.525 f  u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[0].u_scariv_rvc_expander_i_532/O
          net (fo=1, unplaced)                  0.377     4.902    u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[0].u_scariv_rvc_expander_i_532_n_0
          LUT6 (Prop_lut6_I1_O)                 0.119     5.021 f  u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[0].u_scariv_rvc_expander_i_161/O
          net (fo=1, unplaced)                  0.377     5.398    u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[0].u_scariv_rvc_expander_i_161_n_0
          LUT6 (Prop_lut6_I1_O)                 0.043     5.441 f  u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[0].u_scariv_rvc_expander_i_58/O
          net (fo=1, unplaced)                  0.244     5.685    u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[0].u_scariv_rvc_expander_i_58_n_0
          LUT5 (Prop_lut5_I4_O)                 0.043     5.728 f  u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[0].u_scariv_rvc_expander_i_16/O
          net (fo=89, unplaced)                 0.319     6.047    u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[0].u_scariv_rvc_expander_i_16_n_0
          LUT2 (Prop_lut2_I1_O)                 0.043     6.090 r  u_scariv_tile/u_frontend/u_scariv_inst_buffer/u_bru_inst_cnt_i_5/O
          net (fo=107, unplaced)                0.345     6.435    u_scariv_tile/u_frontend/u_scariv_inst_buffer/u_bru_inst_cnt_i_5_n_0
          LUT3 (Prop_lut3_I2_O)                 0.043     6.478 r  u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[1].u_scariv_rvc_expander_i_117/O
          net (fo=2, unplaced)                  0.255     6.733    u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[1].u_scariv_rvc_expander_i_117_n_0
          LUT6 (Prop_lut6_I5_O)                 0.043     6.776 r  u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[1].u_scariv_rvc_expander_i_50/O
          net (fo=3, unplaced)                  0.262     7.038    u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[1].u_scariv_rvc_expander_i_50_n_0
          LUT6 (Prop_lut6_I5_O)                 0.043     7.081 r  u_scariv_tile/u_frontend/u_scariv_inst_buffer/u_except_disp_pick_up_i_65/O
          net (fo=42, unplaced)                 0.322     7.403    u_scariv_tile/u_frontend/u_scariv_inst_buffer/u_except_disp_pick_up_i_65_n_0
          LUT4 (Prop_lut4_I0_O)                 0.043     7.446 r  u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[1].u_scariv_rvc_expander_i_1660/O
          net (fo=128, unplaced)                0.354     7.800    u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[1].u_scariv_rvc_expander_i_1660_n_0
          MUXF7 (Prop_muxf7_S_O)                0.168     7.968 f  u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[1].u_scariv_rvc_expander_i_833/O
          net (fo=1, unplaced)                  0.000     7.968    u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[1].u_scariv_rvc_expander_i_833_n_0
          MUXF8 (Prop_muxf8_I0_O)               0.044     8.012 f  u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[1].u_scariv_rvc_expander_i_348/O
          net (fo=1, unplaced)                  0.351     8.363    u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[1].u_scariv_rvc_expander_i_348_n_0
          LUT6 (Prop_lut6_I0_O)                 0.120     8.483 f  u_scariv_tile/u_frontend/u_scariv_inst_buffer/rvc_expand_loop[1].u_scariv_rvc_expander_i_110/O
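In the excerpt above, signals from `rvc_expand_loop[0]` appear to feed `rvc_expand_loop[1]`, which is what one would expect if each slot has to wait for the previous slot to learn where its instruction starts. The Python model below is only an illustration of that dependency, not the scariv implementation: the function names and the parcel values are hypothetical, and it assumes only 16-bit and 32-bit encodings, where a 16-bit parcel whose low two bits are `11` starts a 32-bit instruction. It compares a straightforward serial scan with a variant that classifies every parcel independently and leaves only a one-bit recurrence between slots, which is one common way to shorten such a chain.

```python
def is_rvc(parcel):
    """A 16-bit parcel starts a compressed instruction unless bits [1:0] are 0b11."""
    return (parcel & 0b11) != 0b11


def starts_serial(parcels, first_is_start=True):
    """Walk left to right; each slot waits for the previous slot's decision."""
    starts = []
    skip = not first_is_start  # True if parcel 0 is the upper half of an earlier instruction
    for p in parcels:
        if skip:
            starts.append(False)
            skip = False
        else:
            starts.append(True)
            skip = not is_rvc(p)  # a 32-bit instruction consumes the next parcel
    return starts


def starts_parallel(parcels, first_is_start=True):
    """Classify every parcel independently, then resolve a one-bit chain."""
    is32 = [not is_rvc(p) for p in parcels]  # no dependency between slots
    starts = []
    prev_start, prev_is32 = False, False
    for i in range(len(parcels)):
        # Slot i starts an instruction unless slot i-1 started a 32-bit one.
        s = first_is_start if i == 0 else not (prev_start and prev_is32)
        starts.append(s)
        prev_start, prev_is32 = s, is32[i]
    return starts


# Hypothetical fetch group: a compressed parcel, the two halves of one
# 32-bit instruction (0x0513 is the lower half), and another compressed parcel.
PARCELS = [0x4501, 0x0513, 0x0015, 0x8082]

if __name__ == "__main__":
    print(starts_serial(PARCELS))
    print(starts_parallel(PARCELS))
```

In hardware terms, the second form means expansion and decode can be computed speculatively for every 16-bit position in parallel, and only the narrow start-bit chain has to ripple from slot to slot. Whether that trade of area for depth pays off here would have to be confirmed by re-running synthesis.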