LSPCON likes to throw short HPDs during the enable seqeunce prior to the
link being trained. These obviously result in the channel CR/EQ check
failing and thus we schedule a pointless hotplug work to retrain the
link. Avoid that by ignoring the bad CR/EQ status until we've actually
initially trained the link.
I've not actually investigated to see what LSPCON is trying to signal
with the short pulse. But as long as it signals anything I think we're
supposed to check the link status anyway, so I don't really see other
good ways to solve this. I've not seen these short pulses being
generated by normal DP sinks.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180117192149.17760-5-ville.syrjala@linux.intel.com
Doing link retraining from the short pulse handler is problematic since
that might introduce deadlocks with MST sideband processing. Currently
we don't retrain MST links from this code, but we want to change that.
So better to move the entire thing to the hotplug work. We can utilize
the new encoder->hotplug() hook for this.
The only thing we leave in the short pulse handler is the link status
check. That one still depends on the link parameters stored under
intel_dp, so no locking around that but races should be mostly harmless
as the actual retraining code will recheck the link state if we
end up there by mistake.
v2: Rebase due to ->post_hotplug() now being just ->hotplug()
Check the connector type to figure out if we should do
the HDMI thing or the DP think for DDI
[pushed with whitespace changes for sparse]
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180117192149.17760-3-ville.syrjala@linux.intel.com
The LG 4k TV I have doesn't deassert HPD when I turn the TV off, but
when I turn it back on it will pulse the HPD line. By that time it has
forgotten everything we told it about scrambling and the clock ratio.
Hence if we want to get a picture out if it again we have to tell it
whether we're currently sending scrambled data or not. Implement
that via the encoder->hotplug() hook.
v2: Force a full modeset to not follow the HDMI 2.0 spec more
closely (Shashank)
[pushed with whitespace fixes to make sparse happy]
Cc: Shashank Sharma <shashank.sharma@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180117192149.17760-1-ville.syrjala@linux.intel.com
Compute workload tends to be "bursty", Only tune the behavior of
nature dpm don't work well for most of such workloads. From test
results, Fix sclk in highest two levels can get better performance.
so add min sclk setting into the default cumpute workload policy on
smu7.
user still can change sclk range through sysfs pp_dpm_sclk
for better perf/watt.
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It show what parameters can be configured to tune
the behavior of natural dpm for perf/watt on smu7.
user can select the mode per workload, but even the default per
workload settings are not bulletproof.
user can configure custom settings per different use case
for better perf or better perf/watt.
cat pp_power_profile_mode
NUM MODE_NAME SCLK_UP_HYST SCLK_DOWN_HYST SCLK_ACTIVE_LEVEL MCLK_UP_HYST MCLK_DOWN_HYST MCLK_ACTIVE_LEVEL
0 3D_FULL_SCREEN: 0 100 30 0 100 10
1 POWER_SAVING: 10 0 30 - - -
2 VIDEO: - - - 10 16 31
3 VR: 0 11 50 0 100 10
4 COMPUTE: 0 5 30 - - -
5 CUSTOM: 0 0 0 0 0 0
* CURRENT: 0 100 30 0 100 10
Under manual dpm level,
user can echo "0/1/2/3/4">pp_power_profile_mode
to select 3D_FULL_SCREEN/POWER_SAVING/VIDEO/VR/COMPUTE
mode.
echo "5 * * * * * * * *">pp_power_profile_mode
to set custom settings.
"5 * * * * * * * *" mean "CUSTOM enable_sclk SCLK_UP_HYST
SCLK_DOWN_HYST SCLK_ACTIVE_LEVEL enable_mclk MCLK_UP_HYST
MCLK_DOWN_HYST MCLK_ACTIVE_LEVEL"
if the parameter enable_sclk/enable_mclk is true,
driver will update the following parameters to dpm table.
if false, ignore the following parameters.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The goal here is to try and reduce the latency of signaling additional
requests following the wakeup from interrupt by reducing the list of
to-be-signaled requests from an rbtree to a sorted linked list. The
original choice of using an rbtree was to facilitate random insertions
of request into the signaler while maintaining a sorted list. However,
if we assume that most new requests are added when they are submitted,
we see those new requests in execution order making a insertion sort
fast, and the reduction in overhead of each signaler iteration
significant.
Since commit 56299fb7d9 ("drm/i915: Signal first fence from irq handler
if complete"), we signal most fences directly from notify_ring() in the
interrupt handler greatly reducing the amount of work that actually
needs to be done by the signaler kthread. All the thread is then
required to do is operate as the bottom-half, cleaning up after the
interrupt handler and preparing the next waiter. This includes signaling
all later completed fences in a saturated system, but on a mostly idle
system we only have to rebuild the wait rbtree in time for the next
interrupt. With this de-emphasis of the signaler's role, we want to
rejig it's datastructures to reduce the amount of work we require to
both setup the signal tree and maintain it on every interrupt.
References: 56299fb7d9 ("drm/i915: Signal first fence from irq handler if complete")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180222092545.17216-1-chris@chris-wilson.co.uk
The Parfait (version 2.1.0) static code analysis tool found the
following NULL pointer derefernce problem.
- drivers/gpu/drm/drm_vblank.c
Null pointer checks were added to return values from calls to
drm_crtc_from_index(). There is a possibility, however minute, that
crtc->index may not be found when trying to find the struct crtc
from it's assigned index given in drm_crtc_init_with_planes().
3 return checks for NULL where added with a call to
WARN_ON(!crtc).
Signed-off-by: Joe Moriarty <joe.moriarty@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20180220191157.100960-2-joe.moriarty@oracle.com
In XenGT, ioreq copy is used to trap mmio write and ppgtt write. Both
of them are memory write, ioreq handler couldn't distinguish them. So
ioreq handler probe the ppgtt write handler, if it is succuess, this
ioreq is ppgtt write, otherwise it is mmio write.
So ppgtt write handler should return an error at the failure of finding
page track, it is fatal to implement ioreq handler in XenGT.
Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
page_track_handler take lock at the beginning, the lock should be released
at the failure of finding page track. Otherwise deadlock will happen.
Fixes: e502a2af4c ("drm/i915/gvt: Provide generic page_track infrastructure for write-protected page")
Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Add a new debugfs entry kvmgt_nr_cache_entries under vgpu which shows
the number of entry in dma cache.
$ cat /sys/kernel/debug/gvt/vgpu1/kvmgt_nr_cache_entries
10101
v3: fix compiling error for some configuration. (Xiong Zhang <xiong.y.zhang@intel.com>)
v2: keep debugfs layout flat.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
The implementation of current kvmgt implicitly setup dma mapping at MPT
API gfn_to_mfn. First this design against the API's original purpose.
Second, there is no unmap hit in this design. The result is that the
dma mapping keep growing larger and larger. For mutl-vm case, they will
consume IOMMU IOVA low 4GB address space quickly and so tons of rbtree
entries crated in the IOMMU IOVA allocator. Finally, single IOVA
allocation can take as long as ~70ms. Such latency is intolerable.
To address both above issues, this patch introduced two new MPT API:
o dma_map_guest_page - setup dma map for guest page
o dma_unmap_guest_page - cancel dma map for guest page
The kvmgt implements these 2 API. And to reduce dma setup overhead for
duplicated pages (eg. scratch pages), two caches are used: one is for
mapping gfn to struct gvt_dma, another is for mapping dma addr to
struct gvt_dma.
With these 2 new API, the gtt now is able to cancel dma mapping when page
table is invalidated. The dma mapping is not in a gradual increase now.
v2: follow the old logic for VFIO_IOMMU_NOTIFY_DMA_UNMAP at this point.
Cc: Hang Yuan <hang.yuan@intel.com>
Cc: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Fix below error with minor code refactor.
CHECK drivers/gpu/drm/i915//gvt/handlers.c
drivers/gpu/drm/i915//gvt/handlers.c:203 sanitize_fence_mmio_access() error: 'vgpu' dereferencing possible ERR_PTR()
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Fix check error at
CHECK drivers/gpu/drm/i915//gvt/kvmgt.c
drivers/gpu/drm/i915//gvt/kvmgt.c:455 intel_vgpu_create() error: we previously assumed 'vgpu' could be null (see line 454)
For failed vgpu create, just show error return in failure message.
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
There is one issue relates to Coarse Power Gating(CPG) on KBL NUC in GVT-g,
vgpu can't get the correct default context by updating the registers before
inhibit context submission. It always get back the hardware default value
unless the inhibit context submission happened before the 1st time
forcewake put. With this wrong default context, vgpu will run with
incorrect state and meet unknown issues.
The solution is initialize these mmios by adding lri command in ring buffer
of the inhibit context, then gpu hardware has no chance to go down RC6 when
lri commands are right being executed, and then vgpu can get correct
default context for further use.
v3:
- fix code fault, use 'for' to loop through mmio render list(Zhenyu)
v4:
- save the count of engine mmio need to be restored for inhibit context and
refine some comments. (Kevin)
v5:
- code rebase
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
No functional change. This defination will also be used in future patchesi.
v4:
- refine patch description (Kevin)
Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
We don't know how many page tables will be shadowed. It varies
considerably corresponding to guest load. Radix tree is a better
choice for us. Since Page Frame Number is used as key so most of
the bits are common.
Here is some performance data (duration in us) of looking up a
element:
Before: (aka. ppgtt_find_shadow_page)
0.308 0.292 0.246 0.432 0.143 ... 0.311 0.225 0.382 0.199 0.325
After: (aka. intel_vgpu_find_spt_by_mfn)
0.106 0.106 0.107 0.106 0.105 0.107 ... 0.107 0.109 0.105 0.108
This time I didn't get the early data of hash table. The data is
measured when desktop is shown.
As last change, the overall benchmark almost is not changed, but
we get better scalability.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>