aboutsummaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorAlex Elder <elder@inktank.com>2012-10-08 20:37:30 -0700
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>2012-11-05 09:56:49 +0100
commita407c28651969f9e34fd51ea0415886b70e5c991 (patch)
treea9ba73aa2f32e8e5e1699c5c704c4cd2a7622e99
parent6d59ab3707595e1a7b2e6051ff3f5f0dabd0ad06 (diff)
rbd: reset BACKOFF if unable to re-queue
commit 588377d6199034c36d335e7df5818b731fea072c upstream. If ceph_fault() is unable to queue work after a delay, it sets the BACKOFF connection flag so con_work() will attempt to do so. In con_work(), when BACKOFF is set, if queue_delayed_work() doesn't result in newly-queued work, it simply ignores this condition and proceeds as if no backoff delay were desired. There are two problems with this--one of which is a bug. The first problem is simply that the intended behavior is to back off, and if we aren't able queue the work item to run after a delay we're not doing that. The only reason queue_delayed_work() won't queue work is if the provided work item is already queued. In the messenger, this means that con_work() is already scheduled to be run again. So if we simply set the BACKOFF flag again when this occurs, we know the next con_work() call will again attempt to hold off activity on the connection until after the delay. The second problem--the bug--is a leak of a reference count. If queue_delayed_work() returns 0 in con_work(), con->ops->put() drops the connection reference held on entry to con_work(). However, processing is (was) allowed to continue, and at the end of the function a second con->ops->put() is called. This patch fixes both problems. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-rw-r--r--net/ceph/messenger.c3
1 files changed, 2 insertions, 1 deletions
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 159aa8bef9e7..cad0d17ec45e 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2300,10 +2300,11 @@ restart:
mutex_unlock(&con->mutex);
return;
} else {
- con->ops->put(con);
dout("con_work %p FAILED to back off %lu\n", con,
con->delay);
+ set_bit(CON_FLAG_BACKOFF, &con->flags);
}
+ goto done;
}
if (con->state == CON_STATE_STANDBY) {