drm/amdgpu: correct the calculation of RAS bad page

After the introduction of NPS RAS, one bad page record on eeprom may be
related to 1 or 16 bad pages, so the bad page record and bad page are
two different concepts, define a new variable to store bad page number.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This commit is contained in:
Tao Zhou
2024-11-29 16:52:41 +08:00
committed by Alex Deucher
parent 1f06e7f344
commit ae756cd853
4 changed files with 36 additions and 22 deletions

View File

@@ -169,7 +169,8 @@ void amdgpu_umc_handle_bad_pages(struct amdgpu_device *adev,
err_data->err_addr_cnt, false);
amdgpu_ras_save_bad_pages(adev, &err_count);
amdgpu_dpm_send_hbm_bad_pages_num(adev, con->eeprom_control.ras_num_recs);
amdgpu_dpm_send_hbm_bad_pages_num(adev,
con->eeprom_control.ras_num_bad_pages);
if (con->update_channel_flag == true) {
amdgpu_dpm_send_hbm_bad_channel_flag(adev, con->eeprom_control.bad_channel_bitmap);