
Linux Fixes Regression That Broke File Names With ❤️ & Other Special Characters

([Linux Storage] 3 Hours Ago Case Folding Gone Wrong)


Linus Torvalds tonight reverted some code in the mainline Linux kernel that had inadvertently broken support for filenames containing ❤️ and other special Unicode characters on file-systems with case-folding (optional case-insensitive file/folder name) support.

Merged into the Linux kernel last month was [1]this change to the kernel's Unicode handling to stop special-casing ignorable code points. The commit stripped around 3k lines of kernel code and left the ignorable code points to decompose/casefold to themselves. Unfortunately, this ended up breaking things for file-systems with Unicode case-folding support for case-insensitive file/folder handling, like F2FS. In turn, those running new Linux kernels were no longer able to access existing files with special characters in their names, such as the ❤️ emoji.
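To see why an emoji filename is affected by a change to ignorable code points, it helps to look at what the emoji is made of. The short Python sketch below assumes the heart emoji is encoded as U+2764 followed by U+FE0F, which is the usual emoji-presentation form; U+FE0F is a variation selector, and variation selectors are default ignorable code points in Unicode. The variable names are only illustrative.

```python
import unicodedata

# Assumption: the heart emoji is the two-code-point sequence U+2764 U+FE0F.
heart = "\u2764\ufe0f"

# List each code point with its official Unicode character name.
code_points = [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in heart]

for code_point, name in code_points:
    print(code_point, name)
```

The second code point is the one that the old and new folding logic treat differently, so any filename containing it is exposed to the regression.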

[2]This kernel bug report raised the issue of certain files no longer being found on an F2FS file-system after that Unicode change.

With that Unicode change clearly causing problems and breaking user-space access to existing files, of all things, Linus Torvalds promptly reverted the problematic code.

Linus Torvalds [3]commented in the revert:

"It turns out that we can't do this, because while the old behavior of ignoring ignorable code points was most definitely wrong, we have case-folding filesystems with on-disk hash values with that wrong behavior.

So now you can't look up those names, because they hash to something different.

Of course, it's also entirely possible that in the meantime people have created *new* files with the new ("more correct") case folding logic, and reverting will just make other things break.

The correct solution is to not do case folding in filesystems, but sadly, people seem to never really understand that. People still see it as a feature, not a bug."
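The hash mismatch described in the revert message can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the kernel's implementation: `IGNORABLE` is a hypothetical three-entry subset of ignorable code points, `fold_old` and `fold_new` are stand-ins for the two folding behaviors, and SHA-256 stands in for whatever hash the filesystem actually stores on disk.

```python
import hashlib

# Hypothetical subset of ignorable code points, for illustration only.
IGNORABLE = {"\ufe0f", "\u200d", "\u00ad"}

def fold_old(name: str) -> str:
    """Old behavior in this sketch: case-fold and drop ignorable code points."""
    return "".join(ch for ch in name.casefold() if ch not in IGNORABLE)

def fold_new(name: str) -> str:
    """Changed behavior in this sketch: ignorable code points fold to themselves."""
    return name.casefold()

def name_hash(folded: str) -> str:
    """Stand-in for the directory-entry hash stored on disk (assumption)."""
    return hashlib.sha256(folded.encode("utf-8")).hexdigest()

def lookup(directory: dict, name: str, fold) -> "str | None":
    """Find a directory entry by hashing the folded form of the requested name."""
    return directory.get(name_hash(fold(name)))

heart_name = "I \u2764\ufe0f Linux.txt"  # contains the ignorable U+FE0F
plain_name = "README.txt"                # contains no ignorable code points

# Directory entries were written to "disk" using the old folding logic.
directory = {name_hash(fold_old(n)): n for n in (heart_name, plain_name)}

found_old = lookup(directory, heart_name, fold_old)    # same logic as at creation
found_new = lookup(directory, heart_name, fold_new)    # changed logic, different hash
plain_new = lookup(directory, "readme.TXT", fold_new)  # unaffected by the change

print(found_old, found_new, plain_new)
```

In this model only names containing an ignorable code point go missing, and the same effect would apply in reverse to files created under the new logic once the old logic is restored, which is the risk the revert message acknowledges.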

If you don't use case-folding on a supported file-system, or aren't running a very recent kernel, you have nothing to worry about, especially if you don't typically put special characters in your filenames. In any case, one more interesting and unusual Linux kernel regression is now resolved.



[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5c26d2f1d3f5e4be3e196526bead29ecb139cf91

[2] https://bugzilla.kernel.org/show_bug.cgi?id=219586

[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=231825b2e1ff6ba799c5eaf396d3ab2354e37c6b


