Bugzilla – Bug 911337
3.18.1-1-desktop doesn't honor settings for block devices
Last modified: 2015-01-06 20:54:49 UTC
# uname -a
Linux saturn 3.18.1-1-desktop #1 SMP PREEMPT Wed Dec 17 18:20:30 UTC 2014 (5f2f35e) x86_64 x86_64 x86_64 GNU/Linux

# cat /etc/os-release
NAME=openSUSE
VERSION="20141225 (Tumbleweed)"
VERSION_ID="20141225"
PRETTY_NAME="openSUSE 20141225 (Tumbleweed) (x86_64)"
ID=opensuse
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:opensuse:opensuse:20141225"
BUG_REPORT_URL="https://bugs.opensuse.org"
HOME_URL="https://opensuse.org/"
ID_LIKE="suse"

# cat /sys/block/sd?/queue/scheduler
none
none
none
none
none

# cat /sys/block/sdb/queue/scheduler
none
# echo deadline > /sys/block/sdb/queue/scheduler
# echo $?
0
# cat /sys/block/sdb/queue/scheduler
none

Similar issues exist with other tunables, such as "nr_requests": values can be echoed into the files, but they are ignored.
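For anyone scripting around this, the check above can be sketched as a pair of helpers. These are hypothetical, not part of any kernel interface; the SYSFS_ROOT override exists only so they can be exercised against a fake sysfs tree:

```shell
#!/bin/sh
# SYSFS_ROOT defaults to the real sysfs; override it to test against
# a fake tree.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# current_scheduler DEV: print the active scheduler, i.e. the entry
# shown in [brackets]; if nothing is bracketed (e.g. "none"), print
# the file content as-is.
current_scheduler() {
    sed -e 's/.*\[//' -e 's/\].*//' "$SYSFS_ROOT/block/$1/queue/scheduler"
}

# set_scheduler DEV SCHED: write the scheduler and verify it stuck.
# As this report shows, echo returning 0 proves nothing -- the file
# has to be read back.
set_scheduler() {
    f="$SYSFS_ROOT/block/$1/queue/scheduler"
    [ -w "$f" ] || { echo "no writable scheduler file for $1" >&2; return 1; }
    echo "$2" > "$f" || return 1
    [ "$(current_scheduler "$1")" = "$2" ]
}
```

On an affected 3.18.1 box, `set_scheduler sdb deadline` would fail the read-back even though the write itself succeeded, which is exactly the symptom reported.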
Same result on 13.1 with a Kernel:Stable 3.18 kernel. I have elevator=deadline on the boot command line, and dmesg contains the right information:

[    1.059057] io scheduler deadline registered (default)

cat and echo give the same result as reported.
Same here:

# cat /sys/block/sda/queue/scheduler
none

I have "deadline" set via systemd on openSUSE 13.2 Tumbleweed x64.
Hmm, strange. I'm running openSUSE 13.2 and everything works as expected here. What's more interesting, the contents of 'scheduler' should rather look like:

quack:/crypted/home/jack/source/linux-fs # cat /sys/block/sda/queue/scheduler
noop [deadline] cfq

The output 'none' means there either isn't any IO scheduler, or the device isn't stackable (e.g. it is a DM device). Neither of these seems to be your case. What does your IO setup look like, guys? Are you using sda/sdb directly, or is there device mapper involved?
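As an aside, the "is there device mapper involved?" question can usually be answered from sysfs itself: entries in /sys/block are symlinks, and virtual/stacked devices (dm-*, md*, loop*, ...) resolve under /sys/devices/virtual/block/, while real disks resolve under their controller's path. A sketch with made-up helper names:

```shell
#!/bin/sh
# is_virtual_path PATH: classify a resolved sysfs block path.
is_virtual_path() {
    case "$1" in
        */devices/virtual/block/*) return 0 ;;
        *) return 1 ;;
    esac
}

# is_virtual DEV: true if DEV (e.g. dm-0) is a virtual/stacked device.
is_virtual() {
    is_virtual_path "$(readlink -f "/sys/block/$1")"
}
```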
@Jan: see my comment. In dmesg the scheduler seems to be registered, but cat /sys/block/sda/queue/scheduler always returns none. For example, for me (still on 13.1): with the normal 3.11.x series the scheduler is present; booting 3.18.1, it is absent. The previous 3.17.x series also showed the scheduler. The devices are always direct block devices (even if some sit behind hardware RAID). Tested on 12 different configurations.
Thanks, so the regression happened somewhere between 3.17 and 3.18. That's good to know; I'll check the changes. BTW: I did see your comment, I just probably didn't explain myself clearly enough. Sure, the IO scheduler is registered in the system (which is what dmesg shows), but for some reason the sd? devices may not be attached to any IO scheduler, due to a bug.
Ah, I think I see what has happened. Is there a directory 'mq' in /sys/block/sda/?
Yes, it's there:

/sys/block/sda/mq:
total 0
drwxr-xr-x 10 root root 0 Jan  6 11:11 0

/sys/block/sda/mq/0:
total 0
-r--r--r-- 1 root root 4096 Jan  6 11:11 active
drwxr-xr-x 2 root root    0 Jan  6 11:11 cpu0
drwxr-xr-x 2 root root    0 Jan  6 11:11 cpu1
drwxr-xr-x 2 root root    0 Jan  6 11:11 cpu2
drwxr-xr-x 2 root root    0 Jan  6 11:11 cpu3
drwxr-xr-x 2 root root    0 Jan  6 11:11 cpu4
drwxr-xr-x 2 root root    0 Jan  6 11:11 cpu5
drwxr-xr-x 2 root root    0 Jan  6 11:11 cpu6
drwxr-xr-x 2 root root    0 Jan  6 11:11 cpu7
-r--r--r-- 1 root root 4096 Jan  6 11:11 cpu_list
-r--r--r-- 1 root root 4096 Jan  6 11:11 dispatched
-r--r--r-- 1 root root 4096 Jan  6 11:11 pending
-r--r--r-- 1 root root 4096 Jan  6 11:11 queued
-r--r--r-- 1 root root 4096 Jan  6 11:11 run
-r--r--r-- 1 root root 4096 Jan  6 11:11 tags

/sys/block/sda/mq/0/cpu0:
total 0
-r--r--r-- 1 root root 4096 Jan  6 11:12 completed
-r--r--r-- 1 root root 4096 Jan  6 11:12 dispatched
-r--r--r-- 1 root root 4096 Jan  6 11:12 merged
-r--r--r-- 1 root root 4096 Jan  6 11:12 rq_list

(/sys/block/sda/mq/0/cpu1 through cpu7 contain the same four files.)
That explains it: the device is now handled by the block multiqueue (blk-mq) layer. That layer does low-overhead handling of IO requests (it is a replacement of the old block layer, aimed at fast devices), and in particular there is no IO scheduler involved. So unless you observe a regression in behavior, things work as they should. I'll keep the bug open for a bit so that other reporters can confirm whether they are in the same situation.
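Based on the above, whether a given disk is driven by blk-mq can be checked by the presence of the 'mq' directory in sysfs (the legacy request layer does not create one). A sketch; the function name is made up, and SYSFS_ROOT exists only so it can be tested against a fake tree:

```shell
#!/bin/sh
# uses_blk_mq DEV: true if DEV has an 'mq' sysfs directory,
# i.e. is handled by the multiqueue block layer.
uses_blk_mq() {
    [ -d "${SYSFS_ROOT:-/sys}/block/$1/mq" ]
}

# Classify all sd* disks on this machine.
root="${SYSFS_ROOT:-/sys}"
for d in "$root"/block/sd*; do
    [ -e "$d" ] || continue
    dev="${d##*/}"
    if uses_blk_mq "$dev"; then
        echo "$dev: blk-mq (no IO scheduler, 'scheduler' reads none)"
    else
        echo "$dev: legacy queue, scheduler: $(cat "$d/queue/scheduler")"
    fi
done
```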
(In reply to Jan Kara from comment #8)
> That explains it - the device is now handled by block-multiqueue layer.

As Bruno confirmed, I have the "mq" directory in all of my devices (sd{a..e}, with sda being an SSD). The "queue" directory still exists, with all tunable entries in there, such as "scheduler", "nr_requests", etc. None of them can be changed in 3.18.1, while this was possible in 3.17.4.

> That layer does low-overhead handling of IO requests (it is a replacement of
> the old block layer for fast devices) and in particular there is no IO
> scheduler involved. So unless you observe any regression in behavior things
> work as they should.

Hmm, if I understand this correctly, there are no _tuning_ knobs anymore?!? The CFQ scheduler was really unusable in certain I/O situations, while deadline behaved in a much friendlier way, latency-wise; that's why I used to carry a suitable udev rules file around. Is this correct? Are there no tunables anymore?

> I'll keep the bug open for a bit so that other reporters can confirm whether
> they are in the same situation.

I'll check my various I/O-intensive workloads and report back...
I just ran my I/O-intensive workload again (copying a 48 GiB file on an XFS file system on an LVM LV on an MD RAID10 across 4 spinning disks) while trying to start some processes in the foreground. It appears the kernels have become much better at such situations nowadays! 3.18.1 vs. 3.17.4 (deadline) behaves almost identically, while 3.17.4 (cfq) is still worse than both 3.17.4 (deadline) *and* 3.18.1 (but not as bad as it was several years ago). Looks like I need to take another look at all my tuning tweaks... ;)
Glad to hear that, Manfred :). Regarding your questions in comment 9: some tunables in /sys/block/.../queue/ are still used, e.g. read_ahead_kb, add_random, ... Since there is no IO scheduling (basically things behave as with the noop IO scheduler), and the number of requests is limited only by the number of tags available in hardware, tunables like 'scheduler' or 'nr_requests' don't make sense anymore. So I'm closing the bug, since this is in fact a feature.
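For tunables that are still honored (read_ahead_kb, add_random, ...), a write-then-read-back guards against silently ignored values, which is exactly what bit this report. A sketch; set_tunable is a made-up helper, and the SYSFS_ROOT override exists only so it can be tested against a fake tree:

```shell
#!/bin/sh
# set_tunable DEV NAME VALUE: write a queue tunable and read it back,
# since a successful write alone does not guarantee the kernel
# honored the value.
set_tunable() {
    f="${SYSFS_ROOT:-/sys}/block/$1/queue/$2"
    [ -f "$f" ] || { echo "$1 has no tunable '$2'" >&2; return 1; }
    echo "$3" > "$f" 2>/dev/null || return 1
    [ "$(cat "$f")" = "$3" ]
}

# Example (as root): set_tunable sda read_ahead_kb 1024
```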