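I have started implementing VIPT (virtually indexed, physically tagged) for the cache of my self-made CPU. The aim is to shorten the critical path of the LSU pipeline, but several obstacles remain on the way to the target frequency. In the latest Vivado results, the write to the MSHR is derived from the L1D cache hit and the forwarding result within a single cycle, and it is hard to fit that path into the clock period.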
Max Delay Paths
--------------------------------------------------------------------------------------
Slack (VIOLATED) :        -9.398ns  (required time - arrival time)
  Source:                 mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/r_ex2_pipe_ctrl_reg[size][0]/C
                            (rising edge-triggered cell FDRE clocked by main_crg_clkout0  {rise@0.000ns fall@10.000ns period=20.000ns})
  Destination:            mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/entry_loop[5].u_issue_entry/r_entry_reg[rd_regs][0][early_index][1]/D
                            (rising edge-triggered cell FDCE clocked by main_crg_clkout0  {rise@0.000ns fall@10.000ns period=20.000ns})
This path is far too long, so I will try moving the MSHR load part to the next pipeline stage.
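The effect of that move on when the miss request is raised can be sketched with a small cycle-level model. This is an illustrative Python toy, not the actual RTL: the `Load` record, the `simulate` function, and the assumption that a load needs the MSHR only when neither the L1D nor store forwarding supplies its data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Load:
    name: str
    l1d_hit: bool   # L1D tag compare hit in EX2 (assumed)
    fwd_hit: bool   # store-to-load forwarding supplied the data in EX2 (assumed)


def simulate(loads, registered):
    """Return {load name: cycle in which its MSHR request is raised}.

    One load enters EX2 per cycle, starting at cycle 0.
    registered=False: the miss is sent to the MSHR in the same cycle (EX2).
    registered=True:  the miss is captured in a pipeline register and sent
                      to the MSHR one cycle later (EX3).
    """
    requests = {}
    ex3_miss = None  # hypothetical EX2 -> EX3 pipeline register
    for cycle in range(len(loads) + 1):
        # EX3: issue the request that was captured in the previous cycle.
        if registered and ex3_miss is not None:
            requests[ex3_miss] = cycle
        ex3_miss = None

        # EX2: combine cache hit and forwarding result into a miss decision.
        if cycle < len(loads):
            ld = loads[cycle]
            miss = not (ld.l1d_hit or ld.fwd_hit)
            if miss:
                if registered:
                    ex3_miss = ld.name
                else:
                    requests[ld.name] = cycle
    return requests


if __name__ == "__main__":
    loads = [
        Load("A", l1d_hit=True, fwd_hit=False),
        Load("B", l1d_hit=False, fwd_hit=False),
        Load("C", l1d_hit=False, fwd_hit=True),
    ]
    print("same cycle :", simulate(loads, registered=False))
    print("next stage :", simulate(loads, registered=True))
```

In this model only load B reaches the MSHR, and the registered variant raises its request one cycle later. That extra cycle of miss latency is the price for ending the hit/forwarding logic at a register instead of letting it continue into the MSHR.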
SLICE_X90Y153  FDRE (Prop_fdre_C_Q)  0.518  12.507 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/r_ex2_pipe_ctrl_reg[size][0]/Q
               net (fo=180, routed)  1.095  13.602    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/ldq_haz_check_if\\.ex2_size[0]
SLICE_X90Y153  LUT6 (Prop_lut6_I3_O)  0.124  13.726 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/ex2_fwd_check_if\\.paddr_dw[4]_INST_0/O
               net (fo=17, routed)  1.376  15.102    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/coarse_rs2_ram.u_rs2_dataram/ex2_fwd_check_if[0]\\.paddr_dw[4]
SLICE_X89Y165  LUT4 (Prop_lut4_I0_O)  0.124  15.226 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/coarse_rs2_ram.u_rs2_dataram/ex2_fwd_check_if[0]\\.fwd_miss_valid_INST_0_i_62/O
               net (fo=1, routed)  0.639  15.866    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/coarse_rs2_ram.u_rs2_dataram/ex2_fwd_check_if[0]\\.fwd_miss_valid_INST_0_i_62_n_0
SLICE_X89Y166  LUT6 (Prop_lut6_I2_O)  0.124  15.990 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/coarse_rs2_ram.u_rs2_dataram/ex2_fwd_check_if[0]\\.fwd_miss_valid_INST_0_i_33/O
               net (fo=3, routed)  0.992  16.982    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/coarse_rs2_ram.u_rs2_dataram/r_entry_reg[is_valid]_0
SLICE_X92Y166  LUT5 (Prop_lut5_I0_O)  0.124  17.106 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/coarse_rs2_ram.u_rs2_dataram/ex2_fwd_check_if[0]\\.fwd_miss_valid_INST_0_i_13/O
               net (fo=15, routed)  1.404  18.510    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/w_ex2_fwd_valid[0][14]
SLICE_X89Y172  LUT6 (Prop_lut6_I4_O)  0.124  18.634 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_miss_haz_index[0]_INST_0_i_17/O
               net (fo=1, routed)  1.524  20.157    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_miss_haz_index[0]_INST_0_i_17_n_0
SLICE_X89Y174  LUT6 (Prop_lut6_I0_O)  0.124  20.281 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_miss_haz_index[0]_INST_0_i_9/O
               net (fo=3, routed)  1.049  21.330    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/coarse_stld_fwd.fwd_loop[0].w_ex2_fwd_valid_oh[11]
SLICE_X89Y175  LUT4 (Prop_lut4_I0_O)  0.124  21.454 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_miss_haz_index[1]_INST_0_i_9/O
               net (fo=2, routed)  0.604  22.058    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_miss_haz_index[1]_INST_0_i_9_n_0
SLICE_X88Y176  LUT6 (Prop_lut6_I4_O)  0.124  22.182 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_miss_haz_index[1]_INST_0_i_1/O
               net (fo=2, routed)  0.179  22.361    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_miss_haz_index[1]_INST_0_i_1_n_0
SLICE_X88Y176  LUT3 (Prop_lut3_I0_O)  0.124  22.485 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_miss_haz_index[3]_INST_0_i_6/O
               net (fo=2, routed)  1.191  23.677    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_miss_haz_index[3]_INST_0_i_6_n_0
SLICE_X86Y164  LUT4 (Prop_lut4_I0_O)  0.124  23.801 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_miss_haz_index[2]_INST_0/O
               net (fo=162, routed)  0.723  24.524    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_miss_haz_index[2]
SLICE_X86Y160  MUXF7 (Prop_muxf7_S_O)  0.296  24.820 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_dw[7]_INST_0_i_29/O
               net (fo=1, routed)  0.000  24.820    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_dw[7]_INST_0_i_29_n_0
SLICE_X86Y160  MUXF8 (Prop_muxf8_I0_O)  0.104  24.924 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_dw[7]_INST_0_i_9/O
               net (fo=8, routed)  1.220  26.143    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/coarse_stld_fwd.fwd_loop[0].size[0]
SLICE_X85Y153  LUT6 (Prop_lut6_I0_O)  0.316  26.459 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_dw[4]_INST_0_i_1/O
               net (fo=2, routed)  0.441  26.901    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/gen_dw49_return[4]
SLICE_X84Y151  LUT2 (Prop_lut2_I1_O)  0.124  27.025 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_stq/u_req_ptr/ex2_fwd_check_if[0]\\.fwd_dw[4]_INST_0/O
               net (fo=6, routed)  0.521  27.545    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/ex2_fwd_check_if\\.fwd_dw[4]
SLICE_X85Y151  LUT2 (Prop_lut2_I0_O)  0.124  27.669 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/r_ex3_aligned_data[23]_i_19/O
               net (fo=1, routed)  0.720  28.390    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/r_ex3_aligned_data[23]_i_19_n_0
SLICE_X85Y149  LUT6 (Prop_lut6_I0_O)  0.124  28.514 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/r_ex3_aligned_data[23]_i_8/O
               net (fo=1, routed)  0.445  28.959    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/r_ex3_aligned_data[23]_i_8_n_0
SLICE_X85Y147  LUT6 (Prop_lut6_I0_O)  0.124  29.083 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/r_ex3_aligned_data[23]_i_3/O
               net (fo=9, routed)  0.478  29.561    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/r_ex3_aligned_data[23]_i_3_n_0
SLICE_X85Y145  LUT2 (Prop_lut2_I1_O)  0.124  29.685 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/l1d_missu_if\\.load_INST_0_i_11/O
               net (fo=2, routed)  0.304  29.989    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/l1d_missu_if\\.load_INST_0_i_11_n_0
SLICE_X85Y144  LUT6 (Prop_lut6_I3_O)  0.124  30.113 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/l1d_missu_if\\.load_INST_0_i_3/O
               net (fo=2, routed)  0.951  31.065    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/l1d_missu_if\\.load_INST_0_i_3_n_0
SLICE_X97Y142  LUT5 (Prop_lut5_I3_O)  0.124  31.189 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/l1d_missu_if\\.load_INST_0/O
               net (fo=11, routed)  0.311  31.499    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_l1d_mshr/entry_loop[0].u_entry/l1d_missu_if\\.load
SLICE_X97Y142  LUT5 (Prop_lut5_I2_O)  0.124  31.623 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_l1d_mshr/entry_loop[0].u_entry/lsu_loop[0].u_mycpu_lsu_i_905/O
               net (fo=60, routed)  0.356  31.980    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_l1d_mshr/entry_loop[0].u_entry/w_l1d_missu_loads_no_conflicts0
SLICE_X99Y142  LUT3 (Prop_lut3_I0_O)  0.124  32.104 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/u_l1d_mshr/entry_loop[0].u_entry/lsu_loop[0].u_mycpu_lsu_i_269/O
               net (fo=1, routed)  0.285  32.389    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/l1d_missu_if\\.resp_payload[full]
SLICE_X97Y142  LUT6 (Prop_lut6_I4_O)  0.124  32.513 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/lsu_pipe_haz_if\\.payload[hazard_typ][0]_INST_0_i_1/O
               net (fo=34, routed)  0.388  32.900    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/lsu_pipe_haz_if\\.payload[hazard_typ][0]
SLICE_X95Y142  LUT6 (Prop_lut6_I0_O)  0.124  33.024 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/lrsc_if\\.sc_check_valid_INST_0_i_1/O
               net (fo=3, routed)  0.511  33.535    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/lrsc_if\\.sc_check_valid_INST_0_i_1_n_0
SLICE_X99Y137  LUT4 (Prop_lut4_I0_O)  0.124  33.659 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/ex2_mispred_out_if\\.mis_valid_INST_0_i_1/O
               net (fo=4, routed)  0.732  34.390    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/ex2_mispred_out_if\\.mis_valid_INST_0_i_1_n_0
SLICE_X90Y130  LUT6 (Prop_lut6_I0_O)  0.124  34.514 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/u_lsu_top/lsu_loop[0].u_mycpu_lsu/u_lsu_pipe/ex2_mispred_out_if\\.mis_valid_INST_0/O
               net (fo=75, routed)  1.756  36.270    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_alu/mispred_lsu_if[0]\\.mis_valid
SLICE_X70Y114  LUT6 (Prop_lut6_I3_O)  0.124  36.394 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_alu/ex0_early_wr_if\\.valid_INST_0_i_1/O
               net (fo=3, routed)  0.624  37.019    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_alu/p_4_in
SLICE_X67Y108  LUT4 (Prop_lut4_I0_O)  0.124  37.143 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_alu/ex0_early_wr_if\\.valid_INST_0/O
               net (fo=110, routed)  1.139  38.282    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/u_iq_payload_ram/u_ram/early_wr_if[1]\\.valid
SLICE_X59Y106  LUT5 (Prop_lut5_I0_O)  0.124  38.406 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/u_iq_payload_ram/u_ram/entry_loop[2].u_issue_entry_i_44/O
               net (fo=14, routed)  0.506  38.911    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/u_iq_payload_ram/u_ram/entry_loop[2].u_issue_entry_i_44_n_0
SLICE_X57Y105  LUT4 (Prop_lut4_I1_O)  0.124  39.035 f  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/u_iq_payload_ram/u_ram/entry_loop[0].u_issue_entry_i_46/O
               net (fo=28, routed)  0.706  39.741    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/u_entry_freelist/r_entry_reg[rds][0][predict_ready][0]_i_2
SLICE_X55Y103  LUT6 (Prop_lut6_I4_O)  0.124  39.865 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/u_entry_freelist/entry_loop[5].u_issue_entry_i_35/O
               net (fo=1, routed)  0.506  40.372    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/entry_loop[5].u_issue_entry/i_put_data[rd_regs][0][early_index][1]
SLICE_X53Y103  LUT6 (Prop_lut6_I1_O)  0.124  40.496 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/entry_loop[5].u_issue_entry/r_entry_reg[rds][0][early_index][1]_i_2/O
               net (fo=1, routed)  0.639  41.135    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/entry_loop[5].u_issue_entry/r_entry_reg[rds][0][early_index][1]_i_2_n_0
SLICE_X53Y104  LUT3 (Prop_lut3_I0_O)  0.152  41.287 r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/entry_loop[5].u_issue_entry/r_entry_reg[rds][0][early_index][1]_i_1/O
               net (fo=1, routed)  0.000  41.287    mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/entry_loop[5].u_issue_entry/r_entry_reg[rds][0][early_index][1]_i_1_n_0
SLICE_X53Y104  FDCE  r  mycpu_subsystem_axi_wrapper/u_mycpu_subsystem/u_tile/alu_loop[1].u_alu/u_mycpu_issue_unit/entry_loop[5].u_issue_entry/r_entry_reg[rd_regs][0][early_index][1]/D
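As a rough cross-check, the numbers in the timing report above can be turned into the period this path would actually need. The following is a minimal Python sketch, assuming a single-cycle path within one clock domain and assuming that clock skew and uncertainty do not change with the period. The function names are mine and are not part of any Vivado API.

```python
# Values copied from the timing report above (all in ns).
PERIOD_NS = 20.000        # main_crg_clkout0 period
SLACK_NS = -9.398         # reported slack (required time - arrival time)
ARRIVAL_AT_D_NS = 41.287  # arrival time at the destination FDCE D pin
SOURCE_Q_NS = 12.507      # arrival time at the source FDRE Q pin
CLK_TO_Q_NS = 0.518       # Prop_fdre_C_Q increment of the source FDRE


def min_period_ns(period_ns, slack_ns):
    """Smallest period at which this path would have zero slack."""
    return period_ns - slack_ns


def fmax_mhz(period_ns, slack_ns):
    """Clock frequency corresponding to that minimum period."""
    return 1000.0 / min_period_ns(period_ns, slack_ns)


def data_path_delay_ns(arrival_ns, source_q_ns, clk_to_q_ns):
    """Delay from the source clock pin to the destination D pin."""
    launch_ns = source_q_ns - clk_to_q_ns  # clock arrival at the source C pin
    return arrival_ns - launch_ns


if __name__ == "__main__":
    print(f"min period : {min_period_ns(PERIOD_NS, SLACK_NS):.3f} ns")
    print(f"fmax       : {fmax_mhz(PERIOD_NS, SLACK_NS):.1f} MHz")
    delay = data_path_delay_ns(ARRIVAL_AT_D_NS, SOURCE_Q_NS, CLK_TO_Q_NS)
    print(f"data path  : {delay:.3f} ns")
```

Under those assumptions the path needs a period of about 29.4 ns, roughly 34 MHz against the 50 MHz implied by the 20 ns constraint, and the data path alone takes about 29.3 ns from the EX2 control register to the issue-queue entry.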