aboutsummaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorDaniel Vetter <daniel.vetter@ffwll.ch>2013-05-24 21:29:32 +0200
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>2013-06-13 10:49:37 -0700
commitc7ac79de28de4a4c6a2e903ff097c4bfed98e816 (patch)
tree2d6384bf9a11c00844e4cb14ee267cf7245adeee
parent0676316a2f650ffebb3c25bc8486e0322a5b0497 (diff)
drm/i915: Fix spurious -EIO/SIGBUS on wedged gpus
commit 7abb690a0e095717420ba78dcab4309abbbec78a upstream. Chris Wilson noticed that since commit 1f83fee08d625f8d0130f9fe5ef7b17c2e022f3c [v3.9] Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Thu Nov 15 17:17:22 2012 +0100 drm/i915: clear up wedged transitions X can again get -EIO when it does not expect it. And even worse score a SIGBUS when accessing gtt mmaps. The established ABI is that we _only_ return an -EIO from execbuf - all other ioctls should just work. And since the reset code moves all bos out of gpu domains and clears out all the last_seqno/ring tracking there really shouldn't be any reason for non-execbuf code to ever touch the hw and see an -EIO. After some extensive discussions we've noticed that these spurios -EIO are caused by i915_gem_wait_for_error: http://www.mail-archive.com/intel-gfx@lists.freedesktop.org/msg20540.html That is easy to fix by returning 0 instead of -EIO, since grabbing the dev->struct_mutex does not yet mean that we actually want to touch the hw. And so there is no reason at all to fail with -EIO. But that's not the entire since, since often (at least it's easily googleable) dmesg indicates that the reset fails and we declare the gpu wedged. Then, quite a bit later X wakes up with the "Timed out waiting for the gpu reset to complete" DRM_ERROR message in wait_for_errror and brings down the desktop with an -EIO/SIGBUS. So clearly we're missing a wakeup somewhere, since the gpu reset just doesn't take 10 seconds to complete. And indeed we're do handle the terminally wedged state wrong. Fix this all up. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-rw-r--r--drivers/gpu/drm/i915/i915_gem.c7
1 files changed, 2 insertions, 5 deletions
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 73cb479a488..db2cff319c8 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -91,14 +91,11 @@ i915_gem_wait_for_error(struct i915_gpu_error *error)
{
int ret;
-#define EXIT_COND (!i915_reset_in_progress(error))
+#define EXIT_COND (!i915_reset_in_progress(error) || \
+ i915_terminally_wedged(error))
if (EXIT_COND)
return 0;
- /* GPU is already declared terminally dead, give up. */
- if (i915_terminally_wedged(error))
- return -EIO;
-
/*
* Only wait 10 seconds for the gpu reset to complete to avoid hanging
* userspace. If it takes that long something really bad is going on and