Bug 345039 - Encrypted filesystem write blocks the system
Encrypted filesystem write blocks the system
Status: RESOLVED FIXED
Classification: openSUSE
Product: openSUSE 11.1
Classification: openSUSE
Component: Kernel
Final
i686 openSUSE 11.1
: P5 - None : Major (vote)
: ---
Assigned To: Jeff Mahoney
E-mail List
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2007-11-29 20:24 UTC by Carlos Robinson
Modified: 2010-11-27 20:55 UTC (History)
3 users (show)

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
Requested report detailed info (9.65 KB, application/octet-stream)
2008-01-03 11:26 UTC, Carlos Robinson
Details
requested dmesg output (20.72 KB, application/x-gunzip)
2008-02-20 01:49 UTC, Carlos Robinson
Details
System.map-2.6.22.17-0.1-cer of the running kernel (211.82 KB, application/x-gunzip)
2008-02-20 01:50 UTC, Carlos Robinson
Details
kernel.config.gz of the running kernel for the dmesg file above. (20.74 KB, application/x-gunzip)
2008-02-20 01:51 UTC, Carlos Robinson
Details
dmesg_after_standard_kernel.out.gz (17.11 KB, application/x-gunzip)
2008-02-20 20:28 UTC, Carlos Robinson
Details
kernel.gz (24.44 KB, application/x-gunzip)
2008-02-20 20:29 UTC, Carlos Robinson
Details
messages.gz (4.20 KB, application/x-gunzip)
2008-02-20 20:30 UTC, Carlos Robinson
Details
boot.msg.gz (11.58 KB, application/x-gunzip)
2008-02-20 20:33 UTC, Carlos Robinson
Details
kernel log (3.01 KB, application/x-gunzip)
2008-02-21 12:37 UTC, Carlos Robinson
Details
messages.gz, debug kernel (4.28 KB, application/x-gunzip)
2008-02-21 12:39 UTC, Carlos Robinson
Details
dmesg_debug_kernel.out.gz (8.37 KB, application/x-gunzip)
2008-02-21 12:40 UTC, Carlos Robinson
Details
boot.msg_debug_kernel.gz (12.49 KB, application/x-gunzip)
2008-02-21 12:40 UTC, Carlos Robinson
Details
boot.msg.gz, debug kernel, good log (12.41 KB, application/x-gunzip)
2008-02-21 21:27 UTC, Carlos Robinson
Details
dmesg.out.gz file (23.84 KB, application/x-gunzip)
2008-03-17 20:25 UTC, Carlos Robinson
Details
messages log file (22.64 KB, application/x-gunzip)
2008-05-18 11:42 UTC, Carlos Robinson
Details
factory_11.0_20080517_boot.msg.gz (27.81 KB, application/x-gunzip)
2008-05-18 11:42 UTC, Carlos Robinson
Details
Output from "top" during the lock in factory (3.42 KB, application/x-gunzip)
2008-05-22 19:18 UTC, Carlos Robinson
Details
kernel log during crash using LUKS XFS crypto filesystem (re comment 46) (28.25 KB, application/x-gunzip)
2008-07-16 11:29 UTC, Carlos Robinson
Details
messages log file opensuse 11.1 GM (37.64 KB, application/x-gunzip)
2008-12-29 19:13 UTC, Carlos Robinson
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Carlos Robinson 2007-11-29 20:24:33 UTC
(I classify this provisionally as a security problem, considering encryption a security feature; please reclassify as needed).


Setup description (longish)
----------------------------


I have several encrypted filesystems inherited from 10.2 and before (system has been upgraded during the time from 8.1 up to 10.3; ie, I don't use fresh install). CPU is a P-IV, 1GiB, and I have 3 HD (± 300, 120 & 60 GB) and more than 16 partitions on two of them.

The main encrypted partition is activated via "/etc/init.d/boot.crypto start" during boot up, and /etc/cryptotab:

/dev/loop0  /dev/disk/by-id/ata-ST3320620A_5QF2M56F-part15  \
   /cripta   xfs   twofish256   noatime,nodiratime


It is mounted via loop0, /dev/mapper/cryptotab_loop0, and I think /dev/dm-0, but exactly how, I don't understand (please, documentation anywhere :-?)


nimrodel:~ # mount -l | grep cripta
/dev/mapper/cryptotab_loop0 on /cripta type xfs (rw,noatime,nodiratime) [crypta_320]

nimrodel:~ # dmsetup info
Name:              cryptotab_loop0
State:             ACTIVE
Tables present:    LIVE
Open count:        1
Event number:      0
Major, minor:      253, 0
Number of targets: 1



That one has not given problems as far as I know. However, I have three more encripted filesystems, on file (not partitions) and mounted manually. Many of the files stored on them are 300-400 MiB. One of the filesystem is largish (28GiB), and the other two have the exact size of a DVD (4.4GiB), because I burn the file directly to a DVD as my backup procedure. Thus, I can also mount encrypted DVDs from the command line using "mount" as user, and all this through entries in fstab:

/biggy/crypta_f.mm.x   /mnt/crypta.mm.x  xfs  \
   noauto,user,noatime,nodiratime,loop,encryption=twofish256  1 4

/Grande/imgs/crypta_f1_dvd.mm.x    /mnt/crypta.mm_dvd1.x   xfs  \
   noauto,user,loop,encryption=twofish256  1 4

/Grande/imgs/crypta_f2_dvd.mm.x    /mnt/crypta.mm_dvd2.x  xfs   \
   noauto,user,loop,encryption=twofish256  1 4


nimrodel:~ # mount -l | grep twofish
/biggy/crypta_f.mm.x on /mnt/crypta.mm.x type xfs (rw,noexec,nosuid,nodev,noatime,nodiratime,loop=/dev/loop1,encryption=twofish256)

nimrodel:~ # losetup -a
/dev/loop0: [000e]:4593 (/dev/disk/by-id/ata-ST3320620A_5QF2M56F-part15)
/dev/loop1: [1650]:135 (/biggy/crypta_f.mm.x), encryption twofish (type 18), key length 32


This setup has worked very well while using 10.2, but in 10.3 I have problems: I experience filesystem crash or lock up when writing to the large encrypted one.



Problem description
--------------------

Reading from /mnt/crypta.mm.x (28GiB) works (I managed to copy it elsewhere without a glitch).

However, writing to this filesystem, from one of the encrypted DVDs fail. To clear out the DVD subsystem, I copied over the DVD using dd to a file, and loopmounted that one:

/dev/loop4: [0314]:9142822 (/Grande/imgs/roto), encryption twofish (type 18), key length 32

Well, whatever the data source, writing to /mnt/crypta.mm.x a set of large files (over 300 MiB each), locks the console where the copy is having place. The copy process stops. It is unkillable. Umount of that filesystem locks (and umount is unkillable). Logout of GNome locks if an xterm locked by this problem. Reboot of system locks, requiring hard power off, and later, filesystem repair (not, it seems, of /mnt/crypta.mm.x).

If gnome is active, I see in the CPU system monitor applet that "IOWait" is 100%, but cpu ussage is very low (encryption uses a lot of cpu here).

gkrellm ends by freezing.

If I try to "ls -l" the destination dir of the copy (that is locked,
frozen), the terminal doing this ls also freezes.

I have to lazy umount ("umount -l /mountpoints &") all the mountpoints I
can, and then try to reboot (which hangs), and finally poweroff the machine
forcefully.

There is absolutely nothing in the logs relative to this problem (I believe I know how to look into logs and find problems, even if not understanding them; that was my work specialty some years ago O:-) ).


I have fsck the encrypted filesystem, nothing reported:


  nimrodel:~ # losetup  -e twofish256 /dev/loop2 /biggy/crypta_f.mm.x
  Password:

  nimrodel:~ # file -s /dev/loop2
  /dev/loop2: SGI XFS filesystem data (blksz 4096, inosz 256, v2 dirs)

  nimrodel:~ # xfs_check /dev/loop2
  ERROR: The filesystem has valuable metadata changes in a log which needs to
  be replayed.  Mount the filesystem to replay the log, and unmount it before
  re-running xfs_check.  If you are unable to mount the filesystem, then use
  the xfs_repair -L option to destroy the log and attempt a repair.
  Note that destroying the log may cause corruption -- please attempt a mount
  of the filesystem before doing this.

  nimrodel:~ # mount /dev/loop2 /mnt/crypta.mm.x/
  nimrodel:~ # umount /dev/loop2
  nimrodel:~ # xfs_check /dev/loop2
  nimrodel:~ #

No errors, you see.


I'm aware that the encryption filesystem method have changed in 10.3, but the
only document I have is the release notes. Probably I would have to use a
different method than losetup, but I have no idea which exactly.


But notice that the problem arises from a filesystem mounted directly from
fstab - shouldn't this method be used now anymore?

I can not mount this partition from /etc/cryptotab as I need to mount and umount it manually, at will.

In any case, the "classic" method should not freeze the computer, as it does :-?


I can run tests; right now I think I can reproduce the situation at will, as I had to reboot several times yesterday while trying to diagnose it (several hours). But please, be careful with the suggested tests, as a failure means powering off forcefully, which means a minimum of half an hour per test, providing I don't loose data in the process. So far I have been lucky. And please, try to limit tests to runlevel 3, as a gnome lockup plays havoc with things like command history and some gadgets.

I haven't seen any messages related to the lock. Not in dmesg, not in consoles, log files, etc. nothing. Not related to the problem even remotely, and sometimes not a message about anything at all.

I'm willing to recompile the kernel and increase verbosity somewhere, if you know that somewhere (I don't).


Mmmm... I think I wrote all I know about it. I hope you can decipher the problem!
Comment 1 Christian Boltz 2007-11-29 23:07:08 UTC
Reassigning to the kernel developers (as suggested by Ludwig Nussel) - and CC'ing Ludwig because he has seen a similar problem at home (see his mail in suse-security).
Comment 2 Carlos Robinson 2007-12-30 11:48:44 UTC
This problem continues. I increased the severity because this bug impedes work.

Soon after I write large files to an encrypted, loop mounted, filesystem, the operation locks. All programs are thereafter unable to exit or be killed -9, and system can not even reboot or halt, only power switch works, forcing a huge fsck.


It does not matter if filesystem is mounted through device mapper, new style (dm-crypt), or old style (cryptoloop). A week ago I tried new style.


/etc/crypttab:

mycrypt_mm_f  /biggy/crypta_f.mm.x  none  \
cipher=twofish-cbc-plain,size=256,hash=sha512,noauto,loop


/etc/fstab:

/dev/mapper/mycrypt_mm_f   /mnt/crypta.mm.x   xfs   \
noauto,user,noatime,nodiratime   1 4



nimrodel:~ # df -h /mnt/crypta.mm.x
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/mycrypt_mm_f
                       28G   19G  9.3G  67% /mnt/crypta.mm.x



Mounted as:

/dev/mapper/mycrypt_mm_f on /mnt/crypta.mm.x type xfs \
(rw,noexec,nosuid,nodev,noatime,nodiratime)



Logs: nothing is logged anywhere. Nothing!




I'd be happy to try kernel options to enable more logging, or try tests...
I can trigger the bug at will, it is reproducible here.


But I need clues, I don't know what to do: you are the experts, not me. The next step for me would be to erase and recreate the filesystem, but if that works you will not be able to diagnose it.
Comment 3 Jeff Mahoney 2008-01-02 22:20:15 UTC
There could be a deadlock or an Oops occurring. Since you're running a GUI and the system is locking up in the manner it is, it's tough to debug.

If you can reproduce this at will, could you do so in console mode after running dmesg -n 8 ; klogconsole -r 0

That will dump all kernel output messages to the current console. If there's an oops, you'll see it. Otherwise, if it locks up without an Oops, please do a alt+sysrq+t to dump the states of the processes on the system. If something is stuck in a deadlock, we should be able to see it.

The best way to capture this info is with a serial console, but if you're unable to set one up - a photo of the Oops will do. If there's sysrq+t output, that's going to be more difficult to capture in its entirety.
Comment 4 Carlos Robinson 2008-01-03 11:12:25 UTC
(In reply to comment #3 from Jeff Mahoney)
> There could be a deadlock or an Oops occurring. Since you're running a GUI and
> the system is locking up in the manner it is, it's tough to debug.

The first time I was in gnome; later on, I always tested in runlevel 3.

I will attach my report as a file. The gist of it: nothing.


> If you can reproduce this at will, could you do so in console mode after
> running dmesg -n 8 ; klogconsole -r 0

No output at all. The cp command simply stops copying, cpu goes iddle.


> please do a
> alt+sysrq+t to dump the states of the processes on the system. If something is
> stuck in a deadlock, we should be able to see it.

No output at all. Don't we need to enable first sysrq somehow?


> The best way to capture this info is with a serial console, but if you're
> unable to set one up - a photo of the Oops will do. If there's sysrq+t output,
> that's going to be more difficult to capture in its entirety.

Serial port would be possible (if you point me to a howto, including pinout of serial cable), but I don't think it is necessary, the system doesn't crash till I try to umount or access the affected filesystem. I can read the log, do dmesg, cat /proc/something, etc. However, there is no message anywhere. Nothing, nada.
 

Comment 5 Carlos Robinson 2008-01-03 11:26:58 UTC
Created attachment 189148 [details]
Requested report detailed info
Comment 6 Carlos Robinson 2008-01-24 00:27:53 UTC
There is some unexpected relation to Bug 349942 - see comment number 4 there:
(https://bugzilla.novell.com/show_bug.cgi?id=349942#c4)

I will try to reproduce this bug with Apparmour disabled, time permitting.
Comment 7 Frank Fiene 2008-01-30 18:04:09 UTC
Hmm, maybe i have the same error.

I've shared an encrypted xfs formatted disk with nfs.

If i try to copy large files (bigger than 350MB), cp ends normally but gives me a file of round about 350MB. Nor error messages from cp and nothing in /var/log/messages both on server and client.

If i try to mount an iso file on this server with loop, every file after the first 350MB are corrupt or are not existing.

I've mounted with r(w)size of 8192 and 1024, tried also to disable NFSv4.

I didn't try to enable NFS over TCP and i think UDP is not the problem.

Puh AppArmor. On client disabled, i don't know if on server. Normaly i install without AppArmor but i am not sure ...
Comment 8 Carlos Robinson 2008-01-31 00:00:54 UTC
I finally tried with apparmour disabled, after doing a backup just in case.

I managed to copy 4.7 GB from an encrypted DVD to an encrypted filesystem without problems. I then tried to copy another one... and this one failed. So apparmour is not the problem. Nor did I get any error message anywhere: the copy process stops, the affected filesystem is not umountable.

Question, please, someone:

Is there any option I can enable in the kernel, even recompiling the kernel, so that we could get more detailed log allowing us to learn what the heck is happening here?

Comment 9 Frank Fiene 2008-01-31 08:27:48 UTC
Sorry, not the same error for me.

I've tested with samba with the same shared directory and it works, but samba is so bloody slow (6MB/s vs. 12MB/s on a 100MBit line)

I will create a new bug report.
Comment 10 Jeff Mahoney 2008-02-19 21:22:39 UTC
@Carlos:

If the system is hanging and you can still issue commands (as seems to be the case), can you try the following:

echo t > /proc/sysrq-trigger

Depending on whether writing files works, either save the output of dmesg to a file, or output it to another machine like this:

dmesg | ssh <somehost> "cat > dmesg.out"

That will give a kernel stack trace of everything on the system, and should tell me where the encrypted block device is hanging.
Comment 11 Carlos Robinson 2008-02-19 23:13:08 UTC
(In reply to comment #10 from Jeff Mahoney)
> @Carlos:
> 
> If the system is hanging and you can still issue commands (as seems to be the
> case), can you try the following:
> 
> echo t > /proc/sysrq-trigger

Do I do that before I start the file copy procedure, or after it hangs? I assume the former.

>
> Depending on whether writing files works, either save the output of dmesg to a
> file, or output it to another machine like this:
> 
> dmesg | ssh <somehost> "cat > dmesg.out"
> 
> That will give a kernel stack trace of everything on the system, and should
> tell me where the encrypted block device is hanging.

Ok, I'll try, time permitting.
 
Comment 12 Jeff Mahoney 2008-02-19 23:15:49 UTC
No, it needs to be done after the hang. A trace before the hang will just show a normally running system, chugging along. :)
Comment 13 Carlos Robinson 2008-02-20 01:49:13 UTC
Created attachment 195842 [details]
requested dmesg output

That is the dmesg file, saved locally. It took me longer to test because first I had to boot my spare computer, which was recalcitrant tonight (it "runs" 7.3).

There is a snag: I forgot I was running my slightly customized kernel. I'll attach my kernel config file and System.map, but if that doesn't help, tell me and I'll redo the test with the default kernel - but not tonight, here it is 2:48 AM.
Comment 14 Carlos Robinson 2008-02-20 01:50:23 UTC
Created attachment 195843 [details]
System.map-2.6.22.17-0.1-cer  of the running kernel
Comment 15 Carlos Robinson 2008-02-20 01:51:11 UTC
Created attachment 195845 [details]
kernel.config.gz of the running kernel for the dmesg file above.
Comment 16 Jeff Mahoney 2008-02-20 18:10:11 UTC
That's a showstopper. Obviously we can't support kernels we don't ship. If you're able to reproduce without VMware and with a kernel as shipped by Novell, please re-open this bug. 
Comment 17 Carlos Robinson 2008-02-20 18:38:53 UTC
(In reply to comment #16 from Jeff Mahoney)
> That's a showstopper. Obviously we can't support kernels we don't ship. If
> you're able to reproduce without VMware and with a kernel as shipped by 
> Novell, please re-open this bug. 
> 

You are unfair and unjust! :-/

If you have read this report carefully you'd know that's not the case. What about Ludwig? Is he also using a non standard software?

I said that yesterday I was running my kernel, I repeat: _Yesterday_. I can reproduce any time with the default kernel, and I have done so often here.

Also, Vmware is not involved, as it happens that I used vmware yesterday to test something, the first time since November.

I simply forgot to reboot before submitting my report. I was so happy that you gave me instructions to follow that I forgot to reboot to the default kernel before testing and reporting, and I told you so. I also told you I could repeat the test in standard conditions.

You could simply have said: "Yes, please repeat the test in standard conditions." I would have set aside two or three hours to please you. But no, you go and close the bug as invalid after all these months of testings on your behalf.

YOU ARE UNFAIR!  :-/

Comment 18 Carlos Robinson 2008-02-20 18:39:56 UTC
Invalid close as "invalid".
Comment 19 Jeff Mahoney 2008-02-20 18:52:34 UTC
My apologies, I misunderstood. It's standard practice for us to close out bugs in unsupportable situations. The intention is that if the problem is reproducible with a shipped kernel, the user will re-open it and provide the information. In many cases, the reporter is unable to reproduce it and the bug stays closed. If we did otherwise, we'd end up with loads of "NEEDINFO" bugs collecting dust since those users tend to fall silent when it's shown that the problem isn't with anything we ship. I did mention that you could re-open it to reproduce with the shipped kernel, so closing the bug as invalid under such situations is not "final," as you've just proved. :)

If you're able, the contents of /var/log/messages would be helpful as well. It looks like the dmesg output was too long and ended up getting cut off at the beginning.
Comment 20 Carlos Robinson 2008-02-20 20:28:29 UTC
Created attachment 196086 [details]
dmesg_after_standard_kernel.out.gz

Ok, apologies accepted.

You could,however, simply mark as needinfo, an automatically as invalid after, say, a month without response.

Here I add the data. I'll attach my /var/log/kernel file and /var/log/messages (edited to contain data of today only), because my messages file does not contain kernel messages (syslog-ng configuration non standard).

If you need more logs or data, ask :-)
Comment 21 Carlos Robinson 2008-02-20 20:29:15 UTC
Created attachment 196087 [details]
kernel.gz
Comment 22 Carlos Robinson 2008-02-20 20:30:05 UTC
Created attachment 196088 [details]
messages.gz
Comment 23 Carlos Robinson 2008-02-20 20:33:20 UTC
Created attachment 196089 [details]
boot.msg.gz
Comment 24 Jeff Mahoney 2008-02-21 00:46:08 UTC
Hrm, the goal was to get an un-truncated log, but that didn't work.

So, how about we try two things at once:

I've built a -debug kernel that contains things like lock debugging, as well as and increased log buffer size. This isn't a kernel you'd want to run on a production system, but it contains a bunch of sanity checks that should help us here.

http://ftp.suse.com/pub/people/jeffm/suse/testpkgs/345039/

Once you've installed the kernel, reboot, and re-run the tests as before.
Comment 25 Carlos Robinson 2008-02-21 01:11:07 UTC
Thanks for your effort -- I hope we can catch this one :-)

Ok, I will install it tomorrow. I see two rpms, I suppose I install both.

What was the rpm command to install a supplementary kernel without removing the already installed one? I don't remember off-hand. If you don't know I'll look the man page again. Grub is no problem, I'll handle that ok.
Comment 26 Jeff Mahoney 2008-02-21 01:43:43 UTC
rpm -i <rpm> does it. 

You'll only need the .i386.rpm. The .nosrc.rpm is the source rpm. It's only "nosrc" instead of "src" because otherwise we'd have to ship a full copy of the linux kernel source for _every_ flavor of the kernel we ship. That's a lot of space to waste. ;)
Comment 27 Ludwig Nussel 2008-02-21 08:00:12 UTC
FWIW the problem I had at home was caused by the sata driver and is fixed/worked around meanwhile.
Comment 28 Carlos Robinson 2008-02-21 12:37:45 UTC
Created attachment 196255 [details]
kernel log

Here you are.

I had to copy over 9 GiB till the crash happened, I almost thought it did not want to help us today. Yiks!

But I think that something wrong happened to the trace: it is to small. The dmesg file doesn't contain anything. I think I saw more in the terminal, but I can't be sure. I did "dmesg | tee somefile", it started to pour pages, then stopped.

The kernel log file contains less than a screen of traces.

Ok, if you think something happened to the data, I'll repeat the test, dumping via ssh to the aux pc.
Comment 29 Carlos Robinson 2008-02-21 12:39:39 UTC
Created attachment 196260 [details]
messages.gz, debug kernel
Comment 30 Carlos Robinson 2008-02-21 12:40:10 UTC
Created attachment 196261 [details]
dmesg_debug_kernel.out.gz
Comment 31 Carlos Robinson 2008-02-21 12:40:51 UTC
Created attachment 196262 [details]
boot.msg_debug_kernel.gz
Comment 32 Carlos Robinson 2008-02-21 19:41:43 UTC
One detail: this time, when lazy umounting everything, I got the message:

can't lock file /etc/mtab~: timed out

This is the first time I see this.


Ah, please tell me if the test was successful, and you saw something - even if then it takes months to repair; but knowing that the fault is finally known would be be something.
Comment 33 Jeff Mahoney 2008-02-21 20:02:22 UTC
It certainly seems as though something happened to the data. The dmesg output provided is lacking any sort of stack trace information. I appreciate your efforts in trying to track this down. It can only be more frustrating for you than it is for me. When you commented that the output was only half a page, I was hopeful that it was a lockdep report citing a deadlock problem. It just ended up being a really truncated stack trace. Given that the next line after the trace was a "klog starting" line, it looks as though it was from a previous test run.

Hopefully dumping it over the network will give the full output.
Comment 34 Carlos Robinson 2008-02-21 21:27:59 UTC
Created attachment 196436 [details]
boot.msg.gz, debug kernel, good log

Ok, this time I think we got the data. dmesg worked fine both locally and in the remote ssh. I don't know why it failed the other time: I suppose that the command "tee" crashed and impeded the correct saving of the file. I did notice "tee" crashing, but I did not notice that the /var/log/kernel file didn't had all the data.

The restart you saw in the previous log is simply the next reboot of the computer: I edited the log to remove previous lines to this debug session, and maybe the next normal session, to save size.

I'm attaching this time only the dmesg output, because I think it is complete. If you want kernel and messages log, just ask for it.

And obviously, yes, I want to have this solved O:-)

I can make do rebooting to 10.2 and saving what I need, but...
Comment 35 Jeff Mahoney 2008-03-17 19:45:02 UTC
Actually, that looks just like boot.msg, not dmesg. That ends up silencing most of the kernel output during boot once syslog starts.
Comment 36 Carlos Robinson 2008-03-17 20:25:48 UTC
Created attachment 202656 [details]
dmesg.out.gz file

Indeed, because the attachment of Comment #34 is the boot log file. This was a month ago, and now I can't remember how or why the dmesg file wasn't attached. Maybe firefox crashed or stuck.

Anyway, here goes dmesg.
Comment 37 Jeff Mahoney 2008-03-18 18:02:00 UTC
Ok, here's what I see that's of potential interest:

xfsdatad/0    D 00000000  7000  1958      2 (L-TLB)
       f60dfee4 00000046 00000001 00000000 f60dfe78 c04aa718 c04aa518 c044fa00
       c044fa00 00000046 00000009 df8934c0 79edc300 000000ac 00000000 df8935f0
       c1a0da00 00000000 0000000a 00000000 c1b7f828 0044788f f789b70c 00000000
Call Trace:
 [<c01e1ebd>] rwsem_down_failed_common+0x14a/0x169
 [<c02e475e>] rwsem_down_write_failed+0x1b/0x23
 [<c02e4826>] call_rwsem_down_write_failed+0x6/0x10
 [<f941e854>] xfs_ilock+0x45/0x65 [xfs]
 [<f943fab9>] xfs_setfilesize+0x35/0xc3 [xfs]
 [<f943fc31>] xfs_end_bio_delalloc+0x10/0x1a [xfs]
 [<c0132061>] run_workqueue+0x7d/0x110
 [<c01329e1>] worker_thread+0xbe/0xcd
 [<c01350ea>] kthread+0x3b/0x61
 [<c0106167>] kernel_thread_helper+0x7/0x10

loop1         D EE785D4C  5148  5569      2 (L-TLB)
       ee785dc4 00000046 00000086 ee785d4c 00000086 dfeac914 ee785d60 c044fa00
       c044fa00 dfeac890 0000000a ee765830 79edc300 000000ac 00000000 ee765960
       c1a0da00 00000000 dfeac890 ee785d88 c1b7f828 ee785d94 c01c9b53 ee785dac
Call Trace:
 [<c02e3702>] io_schedule+0x34/0x53
 [<c0154519>] sync_page+0x3b/0x3f
 [<c02e38eb>] __wait_on_bit+0x36/0x5d
 [<c0154683>] wait_on_page_bit+0x5d/0x64
 [<c0159183>] write_cache_pages+0x14e/0x27b
 [<c01592d2>] generic_writepages+0x22/0x2a
 [<f9440089>] xfs_vm_writepages+0x36/0x3c [xfs]
 [<c01592fd>] do_writepages+0x23/0x34
 [<c015520c>] __filemap_fdatawrite_range+0x60/0x6c
 [<c0155413>] filemap_fdatawrite+0x26/0x28
 [<f926541f>] sync_file+0x28/0x62 [loop]
 [<f9266904>] loop_thread+0x3ef/0x428 [loop]
 [<c01350ea>] kthread+0x3b/0x61
 [<c0106167>] kernel_thread_helper+0x7/0x10

mc            D EE5DD75C  5152  5710   5532 (NOTLB)
       ee5dd770 00000086 00000002 ee5dd75c ee5dd74c 00000000 00000001 c044fa00
       c044fa00 00000010 00000007 eeaa0c90 eb727b00 000000bc 00000000 eeaa0dc0
       c1a0da00 00000000 00012cff 00000000 c1b7f828 0001f36c 0000d7ff ffffffff
Call Trace:
 [<c02e4a17>] __down+0xc3/0xd7
 [<c02e479a>] __down_failed+0xa/0x10
 [<f94418b6>] xfs_buf_lock+0x4b/0x4e [xfs]
 [<f942fb90>] xfs_getsb+0x25/0x31 [xfs]
 [<f94355da>] xfs_trans_getsb+0x3b/0x6d [xfs]
 [<f94346eb>] xfs_trans_apply_sb_deltas+0x15/0x4a9 [xfs]
 [<f9434c6e>] _xfs_trans_commit+0xef/0x2fd [xfs]
 [<f94261bc>] xfs_iomap_write_allocate+0x2ea/0x422 [xfs]
 [<f9424f6b>] xfs_iomap+0x34c/0x3e0 [xfs]
 [<f94460de>] xfs_bmap+0x2c/0x32 [xfs]
 [<f9440024>] xfs_map_blocks+0x3c/0x6b [xfs]
 [<f9440ff0>] xfs_page_state_convert+0x30f/0x5f6 [xfs]
 [<f944140b>] xfs_vm_writepage+0x93/0xc8 [xfs]
 [<c0158e2b>] __writepage+0xb/0x25
 [<c01591a9>] write_cache_pages+0x174/0x27b
 [<c01592d2>] generic_writepages+0x22/0x2a
 [<f9440089>] xfs_vm_writepages+0x36/0x3c [xfs]
 [<c01592fd>] do_writepages+0x23/0x34
 [<c018c153>] __writeback_single_inode+0x1a0/0x337
 [<c018c658>] sync_sb_inodes+0x18e/0x260
 [<c018ca81>] writeback_inodes+0x84/0xd3
 [<c0159665>] balance_dirty_pages_ratelimited_nr+0xea/0x1ac
 [<c0155af6>] generic_file_buffered_write+0x47a/0x549
 [<f94473b3>] xfs_write+0x614/0x941 [xfs]
 [<f94439c8>] xfs_file_aio_write+0x6a/0x72 [xfs]
 [<c0172ca7>] do_sync_write+0xc5/0x102
 [<c0173480>] vfs_write+0xaf/0x13e
 [<c0173a29>] sys_write+0x3d/0x61
 [<c0104e86>] sysenter_past_esp+0x5f/0x89

pdflush       D EE855D80  5512  5781      2 (L-TLB)
       ee855d94 00000046 00000002 ee855d80 ee855d70 00000000 00000000 c044fa00
       c044fa00 00000000 0000000a ee23f300 76116f00 000000b3 00000000 ee23f430
       c1a0da00 00000000 f0150290 f7b78140 df9dc438 0001cbc0 f92a5ca0 ffffffff
Call Trace:
 [<c02e3702>] io_schedule+0x34/0x53
 [<c0154519>] sync_page+0x3b/0x3f
 [<c02e3828>] __wait_on_bit_lock+0x36/0x60
 [<c01544ca>] __lock_page+0x54/0x5c
 [<c0159142>] write_cache_pages+0x10d/0x27b
 [<c01592d2>] generic_writepages+0x22/0x2a
 [<f9440089>] xfs_vm_writepages+0x36/0x3c [xfs]
 [<c01592fd>] do_writepages+0x23/0x34
 [<c018c06b>] __writeback_single_inode+0xb8/0x337
 [<c018c658>] sync_sb_inodes+0x18e/0x260
 [<c018ca81>] writeback_inodes+0x84/0xd3
 [<c015991e>] wb_kupdate+0x84/0xe4
 [<c0159d13>] pdflush+0x117/0x1ae
 [<c01350ea>] kthread+0x3b/0x61
 [<c0106167>] kernel_thread_helper+0x7/0x10

Comment 38 Jeff Mahoney 2008-03-18 18:08:50 UTC
There have been a *lot* of upstream XFS changes recently. Would you be willing to test an openSUSE 11.0 Alpha kernel to see if this is still reproducable?
Comment 39 Carlos Robinson 2008-03-18 18:44:28 UTC
Yes, of course.

Easiest for me is to try an os11.0 kernel on os10.3, if that is possible. Just tell me which one to download and install. The advantage for me, if it works, would be to "repair" 10.3 that way.

If that's not possible, I have another partition on which I have intended to install alpha, but didn't do it yet. It would be longer to do, but I would.

Comment 40 Carlos Robinson 2008-03-18 18:45:11 UTC
Changing status
Comment 41 Carlos Robinson 2008-03-30 10:23:12 UTC
(In reply to comment #39 from Carlos Robinson)

> If that's not possible, I have another partition on which I have intended to
> install alpha, but didn't do it yet. It would be longer to do, but I would.

I have tested alpha 11, the Friday release. 



minas-morgul:~ # uname -a
Linux minas-morgul 2.6.25-rc6-git5-10-default #1 SMP 2008-03-20 22:57:18 +0100 i686 i686 i386 GNU/Linux


The "problem" is very much alive and kicking here. :-/

I don't have a log report, I had forgotten the "echo t > /proc/sysrq-trigger" command and once the system was crashed I could no longer mount the partition with my notes to refresh my memory. If you want it, tell me and I will try again.
Comment 42 Jeff Mahoney 2008-03-30 14:01:26 UTC
It would be useful, but not essential. I was hoping to see if any of the recent fixes helped. Can you try to reproduce with kernel-vanilla from 10.3 as well? This will show whether or not the problem was introduced by any of our patches.

I understand this can be a pain. I don't have an environment set up locally with which to track this down.
Comment 43 Carlos Robinson 2008-03-30 19:37:22 UTC
Ok, I just tried with kernel vmlinuz-2.6.22.5-31-vanilla.

I copied 9 GiB from two encrypted partitions to another encrypted partition, two times (delete destination space, copy, delete, copy again): no problem, no crash, fast copy.

So, yes, the problem is an addition, and the new XFS things don't solve the problem.

Hopefully this can help you to find the problem... for me, I know that I don't need to boot into 10.2 to do my encrypted backups, I can use the vanilla kernel in 10.3 - but I don't know what else will stop to work, and what security problems have not been plugged.
Comment 44 Carlos Robinson 2008-05-18 11:42:07 UTC
Created attachment 216233 [details]
messages log file

I have checked current factory to see if the problem is still active - and it is, very much so :-(

Kernel: 2.6.25.3-2-default
Factory version: probably Beta 3. Fully up to date.

Attached goes the /var/log/messages file, and the boot.msg file.


Knowing that 10.3 Vanilla does not have this bug, I would willing to try your kernel with some patches removed. If you can not provide such a kernel, I'm wiling to compile it myself, but I'd need guidance as to what patches to remove and how: I have compiled kernels many times, but I have never removed patches; and in this case I have no idea what to remove, of course

It takes about 4 hours to compile a kernel here, plus testing time. But if you can not reproduce this bug on your premises, it stands to reason that it must be me who does the testing: but I _need_ _guidance_ as to what to test.

This bug is ugly: so far I'm the only known person bitten, but there could be more out there, and you never know what a broken piece of software might do any other day.

Please help.
Comment 45 Carlos Robinson 2008-05-18 11:42:45 UTC
Created attachment 216234 [details]
factory_11.0_20080517_boot.msg.gz
Comment 46 Carlos Robinson 2008-05-22 19:18:16 UTC
Created attachment 217613 [details]
Output from "top" during the lock in factory

This is a log taken from the command 'top' during the lock. The copy process was made with 'mc', so look at this line:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  WCHAN     Flags   COMMAND                                         
10628 crypt     20   0  6688 2836 1956 D  0.0  0.3   0:03.12 xfs_ilock ..4.2... mc                                              

Ii hope this helps a bit more to learn what is happening.
Comment 47 Carlos Robinson 2008-07-16 11:25:04 UTC
I'm still waiting for trying to solve this (see comment #44)

Meanwhile, I have reproduced the problem using LUKS encryption (10.3), creating encrypted filesystems using Yast.

/etc/crypttab:

cr_crypta_f5_dvd.mm.x   /Grande/imgs/crypta_f5_dvd.mm.x   none    noauto
cr_crypta_f6_dvd.mm.x   /Grande/imgs/crypta_f6_dvd.mm.x   none    noauto

(I changed the last "none" (written by Yast) to "noauto")


/etc/fstab:

/dev/mapper/cr_crypta_f5_dvd.mm.x  /mnt/crypta.mm_dvd5.x   xfs             noatime,noauto   0 0

/dev/mapper/cr_crypta_f6_dvd.mm.x  /mnt/crypta.mm_dvd6.x   xfs             noatime,noauto   0 0


nimrodel:~ # file /Grande/imgs/crypta_f[56]*x
/Grande/imgs/crypta_f5_dvd.mm.x: LUKS encrypted file, ver 1 [aes, cbc-essiv:sha256, sha1] UUID: 1ec0ab07-73d0-4ccd-b8df-08076ff
/Grande/imgs/crypta_f6_dvd.mm.x: LUKS encrypted file, ver 1 [aes, cbc-essiv:sha256, sha1] UUID: bb7b10ee-74a1-43aa-b55a-df39ca3



These appeared to work for some hours, then locked. I rebooted, tried again, locked fast. I'll attach the dmesg output later.


I also made filesystems in the same way, but using ext3 and reiserfs, which do not lock (so far, after writing +40 GB). So the problem is indeed associated with XFS. There is something ugly hidden in there, waiting to come out when you less expect it.
Comment 48 Carlos Robinson 2008-07-16 11:29:55 UTC
Created attachment 228120 [details]
kernel log during crash using LUKS XFS crypto filesystem (re comment 46)
Comment 49 Carlos Robinson 2008-07-16 11:31:23 UTC
(In reply to comment #48 from Carlos Robinson)
> Created an attachment (id=228120) [details]
> kernel log during crash using LUKS XFS crypto filesystem (re comment 46)
> 

Errata:

(re comment 47)
Comment 50 Carlos Robinson 2008-12-29 19:13:40 UTC
Created attachment 262656 [details]
messages log file opensuse 11.1 GM

Well, the bug remains alive on OpenSUSE 11.1 GM.

I tried twice. First attempt, an encrypted XFS filesystem made with 10.3:

/etc/crypttab:

mycr_mm_f       /data/biggy/crypta_f.mm.x   none    \
             cipher=twofish-cbc-plain,size=256,hash=sha512,noauto,loop

/etc/fstab:

/dev/mapper/mycr_mm_f    /data/crypta.mm.x   xfs     \
          noauto,user,noatime,nodiratime          0 0


You will find the messages log attached, it is the first part. Then I tried with another filesystem, XFS LUKS made with 11.0:

/etc/crypttab:

mcr_mm_f5_dvd.x    /data/Grande/imgs/crypta_f5_dvd.mm.x    none    noauto


/etc/fstab:

/dev/mapper/mcr_mm_f5_dvd.x  /data/crypta.mm_dvd5.x  xfs  noatime,noauto 0 0


The log is in the second part of the file. Both attempts failed when I tried to write (big files), nothing got written in fact. Procedure is the same as previously.


As I see no progress in this matter, I have destroyed my big XFS encrypted filesystem and created another one as reiserfs. Notice that I have kept that filesystem for a year for the sole purpose of your tests, and I have seen no movement at all on this bug - so I destroy it.
Comment 51 Carlos Robinson 2009-10-18 16:39:20 UTC
It seems that the problem has disappeared in 11.2

I still have to do more tests, but it looks good.

Who changed what?
Comment 52 Carlos Robinson 2009-10-19 18:40:08 UTC
I have written 83G to an encrypted XFS filesystem, via USB, and it went fine.
11.2 RC1
Comment 53 Carlos Robinson 2010-01-05 01:25:11 UTC
Problem persists. Opened new report, and reported to SGI directly.
Comment 54 Carlos Robinson 2010-01-05 14:20:51 UTC
(In reply to comment #53)
> Problem persists. Opened new report, and reported to SGI directly.

SGI reports back that it is DM lockup. Back to you.

http://oss.sgi.com/bugzilla/show_bug.cgi?id=860


Opened Bug 568319.
Comment 55 Carlos Robinson 2010-01-15 20:47:46 UTC
Per the suggestion in the SGI bugzilla, I tried mounting 'nobarrier', which avoids the issue in 11.2 - but it does not work for 11.0
Comment 56 Carlos Robinson 2010-11-27 20:55:00 UTC
This is solved in 11.2 as far as I can see. There was a patch to the kernel that solved it - see Bug 568319 - DM lockup causes filesystem failure.

Although I'm not the assignee, I'll close it, as the reporter and affected party...