drm/msm: Avoid unclocked GMU register access in 6xx gpu_busy

From testing on sc7180-trogdor devices, reading the GMU registers
needs the GMU clocks to be enabled. Those clocks get turned on in
a6xx_gmu_resume(). Confusingly enough, that function is called as a
result of the runtime_pm of the GPU "struct device", not the GMU
"struct device". Unfortunately the current a6xx_gpu_busy() grabs a
reference to the GMU's "struct device".

The fact that we were grabbing the wrong reference was easily seen to
cause crashes that happen if we change the GPU's pm_runtime usage to
not use autosuspend. It's also believed to cause some long tail GPU
crashes even with autosuspend.

We could look at changing it so that we do pm_runtime_get_if_in_use()
on the GPU's "struct device", but then we run into a different
problem. pm_runtime_get_if_in_use() will return 0 for the GPU's
"struct device" the whole time when we're in the "autosuspend
delay". That is, when we drop the last reference to the GPU but we're
waiting a period before actually suspending then we'll think the GPU
is off. One reason that's bad is that if the GPU didn't actually turn
off then the cycle counter doesn't lose state and that throws off all
of our calculations.

Let's change the code to keep track of the suspend state of
devfreq. msm_devfreq_suspend() is always called before we actually
suspend the GPU and msm_devfreq_resume() after we resume it. This
means we can use the suspended state to know if we're powered or not.

NOTE: one might wonder when exactly our status function is called when
devfreq is supposed to be disabled. The stack crawl I captured was:
  msm_devfreq_get_dev_status
  devfreq_simple_ondemand_func
  devfreq_update_target
  qos_notifier_call
  qos_max_notifier_call
  blocking_notifier_call_chain
  pm_qos_update_target
  freq_qos_apply
  apply_constraint
  __dev_pm_qos_update_request
  dev_pm_qos_update_request
  msm_devfreq_idle_work

Fixes: eadf79286a ("drm/msm: Check for powered down HW in the devfreq callbacks")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/489124/
Link: https://lore.kernel.org/r/20220610124639.v4.1.Ie846c5352bc307ee4248d7cab998ab3016b85d06@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>
This commit is contained in:
Douglas Anderson
2022-06-10 12:47:31 -07:00
committed by Rob Clark
parent 1ff1da40d6
commit 6694482a70
6 changed files with 54 additions and 34 deletions

View File

@@ -20,6 +20,7 @@ static int msm_devfreq_target(struct device *dev, unsigned long *freq,
u32 flags)
{
struct msm_gpu *gpu = dev_to_gpu(dev);
struct msm_gpu_devfreq *df = &gpu->devfreq;
struct dev_pm_opp *opp;
/*
@@ -32,10 +33,13 @@ static int msm_devfreq_target(struct device *dev, unsigned long *freq,
trace_msm_gpu_freq_change(dev_pm_opp_get_freq(opp));
if (gpu->funcs->gpu_set_freq)
gpu->funcs->gpu_set_freq(gpu, opp);
else
if (gpu->funcs->gpu_set_freq) {
mutex_lock(&df->lock);
gpu->funcs->gpu_set_freq(gpu, opp, df->suspended);
mutex_unlock(&df->lock);
} else {
clk_set_rate(gpu->core_clk, *freq);
}
dev_pm_opp_put(opp);
@@ -58,16 +62,25 @@ static void get_raw_dev_status(struct msm_gpu *gpu,
unsigned long sample_rate;
ktime_t time;
mutex_lock(&df->lock);
status->current_frequency = get_freq(gpu);
busy_cycles = gpu->funcs->gpu_busy(gpu, &sample_rate);
time = ktime_get();
busy_time = busy_cycles - df->busy_cycles;
status->total_time = ktime_us_delta(time, df->time);
df->busy_cycles = busy_cycles;
df->time = time;
if (df->suspended) {
mutex_unlock(&df->lock);
status->busy_time = 0;
return;
}
busy_cycles = gpu->funcs->gpu_busy(gpu, &sample_rate);
busy_time = busy_cycles - df->busy_cycles;
df->busy_cycles = busy_cycles;
mutex_unlock(&df->lock);
busy_time *= USEC_PER_SEC;
busy_time = div64_ul(busy_time, sample_rate);
if (WARN_ON(busy_time > ~0LU))
@@ -175,6 +188,8 @@ void msm_devfreq_init(struct msm_gpu *gpu)
if (!gpu->funcs->gpu_busy)
return;
mutex_init(&df->lock);
dev_pm_qos_add_request(&gpu->pdev->dev, &df->idle_freq,
DEV_PM_QOS_MAX_FREQUENCY,
PM_QOS_MAX_FREQUENCY_DEFAULT_VALUE);
@@ -244,12 +259,16 @@ void msm_devfreq_cleanup(struct msm_gpu *gpu)
void msm_devfreq_resume(struct msm_gpu *gpu)
{
struct msm_gpu_devfreq *df = &gpu->devfreq;
unsigned long sample_rate;
if (!has_devfreq(gpu))
return;
df->busy_cycles = 0;
mutex_lock(&df->lock);
df->busy_cycles = gpu->funcs->gpu_busy(gpu, &sample_rate);
df->time = ktime_get();
df->suspended = false;
mutex_unlock(&df->lock);
devfreq_resume_device(df->devfreq);
}
@@ -261,6 +280,10 @@ void msm_devfreq_suspend(struct msm_gpu *gpu)
if (!has_devfreq(gpu))
return;
mutex_lock(&df->lock);
df->suspended = true;
mutex_unlock(&df->lock);
devfreq_suspend_device(df->devfreq);
cancel_idle_work(df);