Commit Graph

237 Commits

Author SHA1 Message Date
Lars Ellenberg
46385c84ac drbd: move put_ldev from __req_mod() to the endio callback
One invocation in the endio handler is good enough,
we don't need mention it for each of the different ways
it calls __req_mod().

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:51 +02:00
Lars Ellenberg
d64957c9a9 drbd: fix WRITE_ACKED_BY_PEER_AND_SIS to not set RQ_NET_DONE
Just because this request happened during a resync does
not mean it may pretend to have been barrier-acked.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:50 +02:00
Lars Ellenberg
41c4a0035b drbd: fix READ_RETRY_REMOTE_CANCELED to not complete if device is suspended
READ_RETRY_REMOTE_CANCELED needs to be grouped with the other _CANCELED
cases, not with CONNECTION_LOST_WHILE_PENDING, as that would complete
(fail) the bio even if the device became suspended.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:48 +02:00
Lars Ellenberg
6d49e101fd drbd: make OOS_HANDED_TO_NETWORK its own case
OOS_HANDED_TO_NETWORK should not be grouped with the various
*_CANCELED/*_FAILED cases.
Also, not only clear the RQ_NET_QUEUED flag, but also mark it RQ_NET_DONE,
so it can be distinguished from a local-only request even after that.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:47 +02:00
Lars Ellenberg
001a88687a drbd: fix potential data corruption and protocol error
We assumed only bios with bi_idx == 0 would end up
in drbd_make_request().

That is wrong.

At least device mapper, in __clone_and_map(), may submit
clones only covering a partial bio, but sharing
the original bvec, by adjusting bi_idx and relevant
other bio members of the clone.

We used __bio_for_each_segment() in various places,
even though that is documented as
 * drivers should not use the __ version unless they _really_ want to
 * run through the entire bio and not just pending pieces

Impact: we would send the full bio bvec, even for the clone
with bi_idx > 0, which will cause data corruption on the
peer (because we submit wrong data at the clone offset),
and will cause a DRBD protocol error, disconnect/reconnect
and resync (thus fixing the corruption),
because the next package header would be expected right
in the middle of the sent data, causing DRBD magic mismatch.

Fix: drop the assert, and use bio_for_each_segment()
instead of the __ version.

Conflicts:

	drbd/drbd_tracing.c

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:39 +02:00
Philipp Reisner
fc28845bc0 drbd: Fix a potential race that could case data inconsistency
When we have a write request and a state change C_WF_BITMAP_S -> C_SYNC_SOURCE
at the same time, and it happens that the line

	remote = remote && drbd_should_do_remote(s);

stills sees C_WF_BITMAP_S, and

	send_oos = rw == WRITE && drbd_should_send_oos(s);

already sees C_SYNC_SOURCE both are 0.

This causes the write to not be mirrored, but marked as out-of-sync on the
Sync_Source node.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:34 +02:00
Lars Ellenberg
031a7c173f drbd: add missing part_round_stats to _drbd_start_io_acct
Without this, iostat frequently sees bogus svctime and >= 100% "utilization".

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:16:33 +02:00
Philipp Reisner
dfa8bedbfe drbd: Implemented the disk-timeout option
When the disk-timeout is active, and it expires for a single request,
we consider the local disk as D_FAILED. Note: With this change,
I made both timeout based state transitions HARD state transitions.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 15:10:45 +02:00
Philipp Reisner
2b4dd36fba drbd: Immediately allow completion of IOs, that wait for IO completions on a failed disk
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2012-05-09 10:16:04 +02:00
Philipp Reisner
2f5cdd0b2c drbd: Converted the transfer log from mdev to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:58 +02:00
Andreas Gruenbacher
1b3bb47d52 drbd: Remove redundant check
Opening a device only succeeds on a primary node, or when explicitly
setting the allow_oos module parameter to allow opening the device
read-only on a secondary node.  There is no other way that a request can
get into drbd_make_request(), so this code cannot trigger.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:52 +02:00
Andreas Gruenbacher
7be8da0798 drbd: Improve how conflicting writes are handled
The previous algorithm for dealing with overlapping concurrent writes
was generating unnecessary warnings for scenarios which could be
legitimate, and did not always handle partially overlapping requests
correctly.  Improve it algorithm as follows:

* While local or remote write requests are in progress, conflicting new
  local write requests will be delayed (commit 82172f7).

* When a conflict between a local and remote write request is detected,
  the node with the discard flag decides how to resolve the conflict: It
  will ask its peer to discard conflicting requests which are fully
  contained in the local request and retry requests which overlap only
  partially.  This involves a protocol change.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:51 +02:00
Lars Ellenberg
8c387def58 drbd: simplify condition in drbd_may_do_local_read()
fold
	if (x >= (N+1))
		return 0;
	if (x < N)
		return 0;
into
	if (x != N)
		return 0;

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:39 +02:00
Andreas Gruenbacher
c670a39867 drbd: Use the IS_ALIGNED() macro in some more places
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:39 +02:00
Andreas Gruenbacher
8ca9844f10 drbd: Remove obsolete comment
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:38 +02:00
Andreas Gruenbacher
fcefa62e4c drbd: Rename drbd_endio_{pri,sec} -> drbd_{,peer_}request_endio
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-10-14 16:47:36 +02:00
Philipp Reisner
a21e929827 drbd: Moved the mdev member into drbd_work (from drbd_request and drbd_peer_request)
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:33:08 +02:00
Andreas Gruenbacher
6024fece73 drbd: Defer new writes when detecting conflicting writes
Before submitting a new local write request, wait for any conflicting
local or remote requests to complete.

We could assume that the new request occurred first and that the
conflicting requests overwrote it (and therefore discard the new
reques), but we know for sure that the new request occurred after the
conflicting requests and so this behavior would we weird.  We would also
end up with the wrong result if the new request is not fully contained
within the conflicting requests.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:34 +02:00
Andreas Gruenbacher
ddd8877d31 drbd: Remove unnecessary reference counting left-over
Nothing in this function accesses mdev->tconn->net_conf, so there is no
need for get_net_conf() / put_net_conf() anymore.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:33 +02:00
Andreas Gruenbacher
5e4722645a drbd: _req_conflicts(): Get rid of the epoch_entries tree
Instead of keeping a separate tree for local and remote write requests
for finding requests and for conflict detection, use the same tree for
both purposes.  Introduce a flag to allow distinguishing the two
possible types of entries in this tree.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:32 +02:00
Andreas Gruenbacher
53840641bb drbd: Allow to wait for the completion of an epoch entry as well
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:31 +02:00
Andreas Gruenbacher
a500c2efbb drbd: struct drbd_request: Introduce a new collision flag
This flag is set when a processes puts itself to sleep to wait for a
conflicting request to complete.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:29 +02:00
Andreas Gruenbacher
9e204cddaf drbd: Move some functions to where they are used
Move drbd_update_congested() to drbd_main.c, and drbd_req_new() and
drbd_req_free() to drbd_req.c: those functions are not used anywhere
else.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-09-28 10:26:28 +02:00
Christoph Hellwig
5a7bbad27a block: remove support for bio remapping from ->make_request
There is very little benefit in allowing to let a ->make_request
instance update the bios device and sector and loop around it in
__generic_make_request when we can archive the same through calling
generic_make_request from the driver and letting the loop in
generic_make_request handle it.

Note that various drivers got the return value from ->make_request and
returned non-zero values for errors.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-09-12 12:12:01 +02:00
Philipp Reisner
87eeee41f8 drbd: moved req_lock and transfer log from mdev to tconn
sed -i \
       -e 's/mdev->req_lock/mdev->tconn->req_lock/g' \
       -e 's/mdev->unused_spare_tle/mdev->tconn->unused_spare_tle/g' \
       -e 's/mdev->newest_tle/mdev->tconn->newest_tle/g' \
       -e 's/mdev->oldest_tle/mdev->tconn->oldest_tle/g' \
       -e 's/mdev->out_of_sequence_requests/mdev->tconn->out_of_sequence_requests/g' \
       *.[ch]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:30:15 +02:00
Philipp Reisner
31890f4ab2 drbd: moved agreed_pro_version, last_received and ko_count to tconn
sed -i \
       -e 's/mdev->agreed_pro_version/mdev->tconn->agreed_pro_version/g' \
       -e 's/mdev->last_received/mdev->tconn->last_received/g' \
       -e 's/mdev->ko_count/mdev->tconn->ko_count/g' \
       *.[ch]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:27:07 +02:00
Philipp Reisner
e42325a576 drbd: moved data and meta from mdev to tconn
Patch mostly:

sed -i -e 's/mdev->data/mdev->tconn->data/g' \
       -e 's/mdev->meta/mdev->tconn->meta/g' \
       *.[ch]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:27:05 +02:00
Philipp Reisner
b2fb6dbe52 drbd: moved net_cont and net_cnt_wait from mdev to tconn
Patch partly generated by:

sed -i -e 's/get_net_conf(mdev)/get_net_conf(mdev->tconn)/g' \
       -e 's/put_net_conf(mdev)/put_net_conf(mdev->tconn)/g' \
       -e 's/get_net_conf(odev)/get_net_conf(odev->tconn)/g' \
       -e 's/put_net_conf(odev)/put_net_conf(odev->tconn)/g' \
       *.[ch]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:27:04 +02:00
Philipp Reisner
89e58e755e drbd: moved net_conf from mdev to tconn
Besides moving the struct member, everything else is generated by:

sed -i -e 's/mdev->net_conf/mdev->tconn->net_conf/g' \
       -e 's/odev->net_conf/odev->tconn->net_conf/g' \
       *.[ch]

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:27:03 +02:00
Andreas Gruenbacher
8554df1c6d drbd: Convert all constants in enum drbd_req_event to upper case
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:26:55 +02:00
Andreas Gruenbacher
bb3bfe9614 drbd: Remove the unused hash tables
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:26:54 +02:00
Andreas Gruenbacher
8b946255f8 drbd: Use interval tree for overlapping epoch entry detection
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:26:53 +02:00
Andreas Gruenbacher
010f6e678f drbd: Put sector and size in struct drbd_epoch_entry into struct drbd_interval
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:26:52 +02:00
Andreas Gruenbacher
dac1389ccc drbd: Add read_requests tree
We do not do collision detection for read requests, but we still need to
look up the request objects when we receive a package over the network.
Using the same data structure for read and write requests results in
simpler code once the tl_hash and app_reads_hash tables are removed.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-29 11:26:31 +02:00
Andreas Gruenbacher
de696716e8 drbd: Use interval tree for overlapping write request detection
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-25 14:58:06 +02:00
Andreas Gruenbacher
ace652acf2 drbd: Put sector and size in struct drbd_request into struct drbd_interval
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-08-25 14:58:05 +02:00
Bart Van Assche
24c4830c8e drbd: Fix spelling
Found these with the help of ispell -l.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2011-05-24 10:21:29 +02:00
Lars Ellenberg
76727f684a drbd: fix potential activity log refcount imbalance in error path
It is no longer sufficient to trigger on local WRITE,
we need to check on (rq_state & RQ_IN_ACT_LOG)
before calling drbd_al_complete_io also in the error path.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-05-24 10:06:44 +02:00
Or Gerlitz
03567812d8 drbd: drop code present under #ifdef which is relevant to 2.6.28 and below
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:48:21 +01:00
Philipp Reisner
7fde2be930 drbd: Implemented real timeout checking for request processing time
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:48:16 +01:00
Philipp Reisner
039312b648 drbd: Removed left over, now wrong comments
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:48:09 +01:00
Lars Ellenberg
e636db5b95 drbd: fix potential imbalance of ap_in_flight
When we receive a barrier ack, we walk the ring list of drbd requests
in the transfer log of the respective epoch, do some housekeeping,
and free those objects.

We tried to keep epochs of mirrored and unmirrored drbd requests
separate, and assert that no local-only requests are present in a
barrier_acked epoch.

It turns out that this has quite a number of corner cases and would
add bloated code without functional benefit.

We now revert the (insufficient) commits
 drbd: Fixed an issue with AHEAD -> SYNC_SOURCE transitions
 drbd: Ensure that an epoch contains only requests of one kind
and instead fix the processing of barrier acks to cope with
a mix of local-only and mirrored requests.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:48:06 +01:00
Philipp Reisner
6a35c45f89 drbd: Ensure that an epoch contains only requests of one kind
The assert in drbd_req.c:755 forces us to have only requests of
one kind in an epoch. The two kinds we distinguish here are:
local-only or mirrored.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:45:42 +01:00
Philipp Reisner
71c78cfba2 drbd: Nothing should stop SyncSource -> Ahead transitions
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:45:34 +01:00
Philipp Reisner
da0a78161d drbd: Be more careful with SyncSource -> Ahead transitions
We may not get from SyncSource to Ahead if we have sent some
P_RS_DATA_REPLY packets to the peer and are waiting for
P_WRITE_ACK.

Again, this is not relevant for proper tuned systems, but makes
sure that the not-tuned system does not get diverging bitmaps.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:45:26 +01:00
Philipp Reisner
c88d65e223 drbd: Documenting drbd_should_do_remote() and drbd_should_send_oos()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:43:32 +01:00
Andreas Gruenbacher
81e84650c2 drbd: Use the standard bool, true, and false keywords
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:24 +01:00
Andreas Gruenbacher
0cf9d27e38 drbd: Get rid of unnecessary macros (2)
The FAULT_ACTIVE macro just wraps the drbd_insert_fault macro for no
apparent reason.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:15 +01:00
Andreas Gruenbacher
2f58dcfc85 drbd: Rename drbd_make_request_26 to drbd_make_request
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:36:11 +01:00
Lars Ellenberg
8a3c104438 drbd: fix regression, we need to close drbd epochs during normal operation
commit e2041475e6ddb081734d161f6421977323f5a9b9
drbd: Starting with protocol 96 we can allow app-IO while receiving the bitmap

Contained a bad chunk that tried to optimize away drbd barriers during
bitmap exchange, but accidentally dropped them for normal mode as well.

Impact: depending on activity log size and access pattern, activity log
extents may not be recycled in time, causeing IO to block indefinetely.

Fix: skip drbd barriers only if there is no connection to send them on,
or the request being completed has not been on the network at all.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-03-10 11:35:20 +01:00