FPGA Development Diary

Article index by category: https://msyksphinz.github.io/github_pages , English version: https://fpgadevdiary.hatenadiary.com/

Considering a VIPT Cache Policy for My Homemade CPU (3. Reducing the Critical Path)

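I have adopted VIPT for my homemade CPU's cache and started implementing it. The goal is to shorten the critical path of the LSU pipeline, but several obstacles stand in the way of the target frequency. In the latest Vivado results, the path that takes the L1D cache-hit and forwarding results and writes them into the MSHR, all within a single cycle, is very hard to fit into the timing budget.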

Max Delay Paths
--------------------------------------------------------------------------------------
Slack (VIOLATED) :        -9.398ns  (required time - arrival time)
  Source:                 mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/r_ex2_pipe_ctrl_reg[size][0]/C
                            (rising edge-triggered cell FDRE clocked by main_crg_clkout0  {rise@0.000ns fall@10.000ns period=20.000ns})
  Destination:            mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/entry_loop[5].u_issue_entry/r_entry_reg[rd_regs][0][early_index][1]/D
                            (rising edge-triggered cell FDCE clocked by main_crg_clkout0  {rise@0.000ns fall@10.000ns period=20.000ns})

This path is simply too long, so I will try moving the MSHR load portion to the next stage.

    SLICE_X90Y153        FDRE (Prop_fdre_C_Q)         0.518    12.507 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/r_ex2_pipe_ctrl_reg[size][0]/Q
                         net (fo=180, routed)         1.095    13.602    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/ldq_haz_check_if\\.ex2_size[0]
    SLICE_X90Y153        LUT6 (Prop_lut6_I3_O)        0.124    13.726 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/ex2_fwd_check_if\\.paddr_dw[4]_INST_0/O
                         net (fo=17, routed)          1.376    15.102    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/coarse_rs2_ram.u_rs2_dataram/ex2_fwd_check_if[0]\\.paddr_dw[4]
    SLICE_X89Y165        LUT4 (Prop_lut4_I0_O)        0.124    15.226 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/coarse_rs2_ram.u_rs2_dataram/ex2_fwd_check_if[0]\\.fwd_miss_valid_INST_0_i_62/O
                         net (fo=1, routed)           0.639    15.866    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/coarse_rs2_ram.u_rs2_dataram/ex2_fwd_check_if[0]\\.fwd_miss_valid_INST_0_i_62_n_0
    SLICE_X89Y166        LUT6 (Prop_lut6_I2_O)        0.124    15.990 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/coarse_rs2_ram.u_rs2_dataram/ex2_fwd_check_if[0]\\.fwd_miss_valid_INST_0_i_33/O
                         net (fo=3, routed)           0.992    16.982    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/coarse_rs2_ram.u_rs2_dataram/r_entry_reg[is_valid]_0
    SLICE_X92Y166        LUT5 (Prop_lut5_I0_O)        0.124    17.106 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/coarse_rs2_ram.u_rs2_dataram/ex2_fwd_check_if[0]\\.fwd_miss_valid_INST_0_i_13/O
                         net (fo=15, routed)          1.404    18.510    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/w_ex2_fwd_valid[0][14]
    SLICE_X89Y172        LUT6 (Prop_lut6_I4_O)        0.124    18.634 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_miss_haz_index[0]_INST_0_i_17/O
                         net (fo=1, routed)           1.524    20.157    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_miss_haz_index[0]_INST_0_i_17_n_0
    SLICE_X89Y174        LUT6 (Prop_lut6_I0_O)        0.124    20.281 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_miss_haz_index[0]_INST_0_i_9/O
                         net (fo=3, routed)           1.049    21.330    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/coarse_stld_fwd.fwd_loop[0].w_ex2_fwd_valid_oh[11]
    SLICE_X89Y175        LUT4 (Prop_lut4_I0_O)        0.124    21.454 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_miss_haz_index[1]_INST_0_i_9/O
                         net (fo=2, routed)           0.604    22.058    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_miss_haz_index[1]_INST_0_i_9_n_0
    SLICE_X88Y176        LUT6 (Prop_lut6_I4_O)        0.124    22.182 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_miss_haz_index[1]_INST_0_i_1/O
                         net (fo=2, routed)           0.179    22.361    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_miss_haz_index[1]_INST_0_i_1_n_0
    SLICE_X88Y176        LUT3 (Prop_lut3_I0_O)        0.124    22.485 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_miss_haz_index[3]_INST_0_i_6/O
                         net (fo=2, routed)           1.191    23.677    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_miss_haz_index[3]_INST_0_i_6_n_0
    SLICE_X86Y164        LUT4 (Prop_lut4_I0_O)        0.124    23.801 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_miss_haz_index[2]_INST_0/O
                         net (fo=162, routed)         0.723    24.524    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_miss_haz_index[2]
    SLICE_X86Y160        MUXF7 (Prop_muxf7_S_O)       0.296    24.820 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_dw[7]_INST_0_i_29/O
                         net (fo=1, routed)           0.000    24.820    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_dw[7]_INST_0_i_29_n_0
    SLICE_X86Y160        MUXF8 (Prop_muxf8_I0_O)      0.104    24.924 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_dw[7]_INST_0_i_9/O
                         net (fo=8, routed)           1.220    26.143    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/coarse_stld_fwd.fwd_loop[0].size[0]
    SLICE_X85Y153        LUT6 (Prop_lut6_I0_O)        0.316    26.459 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_dw[4]_INST_0_i_1/O
                         net (fo=2, routed)           0.441    26.901    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/gen_dw49_return[4]
    SLICE_X84Y151        LUT2 (Prop_lut2_I1_O)        0.124    27.025 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_dw[4]_INST_0/O
                         net (fo=6, routed)           0.521    27.545    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/ex2_fwd_check_if\\.fwd_dw[4]
    SLICE_X85Y151        LUT2 (Prop_lut2_I0_O)        0.124    27.669 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/r_ex3_aligned_data[23]_i_19/O
                         net (fo=1, routed)           0.720    28.390    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/r_ex3_aligned_data[23]_i_19_n_0
    SLICE_X85Y149        LUT6 (Prop_lut6_I0_O)        0.124    28.514 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/r_ex3_aligned_data[23]_i_8/O
                         net (fo=1, routed)           0.445    28.959    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/r_ex3_aligned_data[23]_i_8_n_0
    SLICE_X85Y147        LUT6 (Prop_lut6_I0_O)        0.124    29.083 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/r_ex3_aligned_data[23]_i_3/O
                         net (fo=9, routed)           0.478    29.561    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/r_ex3_aligned_data[23]_i_3_n_0
    SLICE_X85Y145        LUT2 (Prop_lut2_I1_O)        0.124    29.685 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/l1d_missu_if\\.load_INST_0_i_11/O
                         net (fo=2, routed)           0.304    29.989    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/l1d_missu_if\\.load_INST_0_i_11_n_0
    SLICE_X85Y144        LUT6 (Prop_lut6_I3_O)        0.124    30.113 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/l1d_missu_if\\.load_INST_0_i_3/O
                         net (fo=2, routed)           0.951    31.065    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/l1d_missu_if\\.load_INST_0_i_3_n_0
    SLICE_X97Y142        LUT5 (Prop_lut5_I3_O)        0.124    31.189 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/l1d_missu_if\\.load_INST_0/O
                         net (fo=11, routed)          0.311    31.499    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_l1d_mshr/entry_loop[0].u_entry/l1d_missu_if\\.load
    SLICE_X97Y142        LUT5 (Prop_lut5_I2_O)        0.124    31.623 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_l1d_mshr/entry_loop[0].u_entry/lsu_loop[0].u_mycpu_lsu_i_905/O
                         net (fo=60, routed)          0.356    31.980    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_l1d_mshr/entry_loop[0].u_entry/w_l1d_missu_loads_no_conflicts0
    SLICE_X99Y142        LUT3 (Prop_lut3_I0_O)        0.124    32.104 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_l1d_mshr/entry_loop[0].u_entry/lsu_loop[0].u_mycpu_lsu_i_269/O
                         net (fo=1, routed)           0.285    32.389    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/l1d_missu_if\\.resp_payload[full]
    SLICE_X97Y142        LUT6 (Prop_lut6_I4_O)        0.124    32.513 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/lsu_pipe_haz_if\\.payload[hazard_typ][0]_INST_0_i_1/O
                         net (fo=34, routed)          0.388    32.900    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/lsu_pipe_haz_if\\.payload[hazard_typ][0]
    SLICE_X95Y142        LUT6 (Prop_lut6_I0_O)        0.124    33.024 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/lrsc_if\\.sc_check_valid_INST_0_i_1/O
                         net (fo=3, routed)           0.511    33.535    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/lrsc_if\\.sc_check_valid_INST_0_i_1_n_0
    SLICE_X99Y137        LUT4 (Prop_lut4_I0_O)        0.124    33.659 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/ex2_mispred_out_if\\.mis_valid_INST_0_i_1/O
                         net (fo=4, routed)           0.732    34.390    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/ex2_mispred_out_if\\.mis_valid_INST_0_i_1_n_0
    SLICE_X90Y130        LUT6 (Prop_lut6_I0_O)        0.124    34.514 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/ex2_mispred_out_if\\.mis_valid_INST_0/O
                         net (fo=75, routed)          1.756    36.270    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_alu/mispred_lsu_if[0]\\.mis_valid
    SLICE_X70Y114        LUT6 (Prop_lut6_I3_O)        0.124    36.394 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_alu/ex0_early_wr_if\\.valid_INST_0_i_1/O
                         net (fo=3, routed)           0.624    37.019    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_alu/p_4_in
    SLICE_X67Y108        LUT4 (Prop_lut4_I0_O)        0.124    37.143 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_alu/ex0_early_wr_if\\.valid_INST_0/O
                         net (fo=110, routed)         1.139    38.282    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/u_iq_payload_ram/u_ram/early_wr_if[1]\\.valid
    SLICE_X59Y106        LUT5 (Prop_lut5_I0_O)        0.124    38.406 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/u_iq_payload_ram/u_ram/entry_loop[2].u_issue_entry_i_44/O
                         net (fo=14, routed)          0.506    38.911    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/u_iq_payload_ram/u_ram/entry_loop[2].u_issue_entry_i_44_n_0
    SLICE_X57Y105        LUT4 (Prop_lut4_I1_O)        0.124    39.035 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/u_iq_payload_ram/u_ram/entry_loop[0].u_issue_entry_i_46/O
                         net (fo=28, routed)          0.706    39.741    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/u_entry_freelist/r_entry_reg[rds][0][predict_ready][0]_i_2
    SLICE_X55Y103        LUT6 (Prop_lut6_I4_O)        0.124    39.865 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/u_entry_freelist/entry_loop[5].u_issue_entry_i_35/O
                         net (fo=1, routed)           0.506    40.372    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/entry_loop[5].u_issue_entry/i_put_data[rd_regs][0][early_index][1]
    SLICE_X53Y103        LUT6 (Prop_lut6_I1_O)        0.124    40.496 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/entry_loop[5].u_issue_entry/r_entry_reg[rds][0][early_index][1]_i_2/O
                         net (fo=1, routed)           0.639    41.135    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/entry_loop[5].u_issue_entry/r_entry_reg[rds][0][early_index][1]_i_2_n_0
    SLICE_X53Y104        LUT3 (Prop_lut3_I0_O)        0.152    41.287 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/entry_loop[5].u_issue_entry/r_entry_reg[rds][0][early_index][1]_i_1/O
                         net (fo=1, routed)           0.000    41.287    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/entry_loop[5].u_issue_entry/r_entry_reg[rds][0][early_index][1]_i_1_n_0
    SLICE_X53Y104        FDCE                                         r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/entry_loop[5].u_issue_entry/r_entry_reg[rd_regs][0][early_index][1]/D
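
To see where the time on a path like this goes, it can help to split the delay into cell delay and routing delay and to count the logic levels. The following is a minimal sketch, assuming the report lines have the same layout as the excerpt above; the function name `summarize_path` and the regular expressions are my own illustration, not part of Vivado.

```python
import re

# Cell lines look like: "SLICE_X90Y153  LUT6 (Prop_lut6_I3_O)  0.124  13.726 f  <pin>"
CELL_RE = re.compile(r"^\s*\S+\s+(\w+) \(Prop_\w+\)\s+(\d+\.\d+)\s+(\d+\.\d+)")
# Net lines look like: "net (fo=17, routed)  1.376  15.102  <net>"
NET_RE = re.compile(r"^\s*net \(fo=(\d+), routed\)\s+(\d+\.\d+)\s+(\d+\.\d+)")


def summarize_path(lines):
    """Sum cell and net delays (ns) over the data path lines of one timing path.

    Hypothetical helper: it only understands the two line shapes above and
    silently skips everything else (headers, the final endpoint line, etc.).
    """
    cell_ns = 0.0
    net_ns = 0.0
    levels = 0
    max_fanout = 0
    for line in lines:
        cell = CELL_RE.match(line)
        if cell:
            cell_ns += float(cell.group(2))
            # The launching flip-flop (FDRE/FDCE) is not a logic level.
            if not cell.group(1).startswith("FD"):
                levels += 1
            continue
        net = NET_RE.match(line)
        if net:
            net_ns += float(net.group(2))
            max_fanout = max(max_fanout, int(net.group(1)))
    return {
        "cell_ns": cell_ns,
        "net_ns": net_ns,
        "levels": levels,
        "max_fanout": max_fanout,
    }
```

Feeding the data path lines of the report into `summarize_path` gives a quick view of whether the path is dominated by logic depth or by routing, which is useful when deciding which part to push into the next pipeline stage.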