[thelist] perl: sorting and umlauts

David R. Miller dmiller at mcc.ca
Tue Aug 15 11:21:18 CDT 2000

Joxn wrote:
> Hi,
> I have a list with country names in German language and want to sort it
> by name. However, when I just use 'sort', I get this result:

I just picked up this thread (back from summer holidays). If your output is
destined for the web, here's a great sorting trick you can use (which I'll
take full credit for).

First, convert all your non-ASCII characters to HTML entities. Then sort,
ignoring everything in each entity except its root character, using a
Schwartzian Transform (named after Randal Schwartz). The following snippet
sorts a list of French words for eventual display in an options list.

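	# If French, alphabetically reorder the options using a Schwartzian
	# Transform: build [num, text, sort key], sort on the key, then unpack.
	if ($FORM{lang} eq 'french') {
		@options =
			map { +{num=>$_->[0], text=>$_->[1]} }
			sort {$a->[2] cmp $b->[2]}
			map {
				my $temp = $_->{text};
				# Strip the entity markup so that only the root letter
				# remains, e.g. &eacute; becomes e
				$temp =~ s/&|acute;|grave;|circ;|cedil;|uml;//g;
				[$_->{num}, $_->{text}, $temp]
			} @options;
	}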

This means, for instance, that é, è and ê (written as &eacute;, &egrave; and
&ecirc;) will all be sorted as though they were e.
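
For anyone who wants to try the idea outside a CGI script, here is a minimal
standalone sketch applied to the German country names from the original
question. The sample data and the helper names sort_key and sort_options are
made up for illustration, and it assumes the text has already been converted
to HTML entities. The substitution is also a slightly stricter variant of the
one above: it only touches complete entities of the form &Xuml;, &Xacute; and
so on, and leaves something like &amp; alone.

```perl
use strict;
use warnings;

# Reduce each accented-letter entity to its root letter,
# e.g. &Ouml; becomes O and &eacute; becomes e.
sub sort_key {
    my ($text) = @_;
    (my $key = $text) =~ s/&([A-Za-z])(?:acute|grave|circ|cedil|uml);/$1/g;
    return $key;
}

# Schwartzian Transform: compute the sort key once per record,
# sort on that key, then rebuild the original hash records.
sub sort_options {
    return map  { +{ num => $_->[0], text => $_->[1] } }
           sort { $a->[2] cmp $b->[2] }
           map  { [ $_->{num}, $_->{text}, sort_key($_->{text}) ] } @_;
}

# Hypothetical sample data, already entity-encoded.
my @options = (
    { num => 1, text => '&Auml;gypten' },
    { num => 2, text => 'Zypern' },
    { num => 3, text => '&Ouml;sterreich' },
    { num => 4, text => 'D&auml;nemark' },
    { num => 5, text => 'Belgien' },
    { num => 6, text => 'T&uuml;rkei' },
);

print "$_->{num}: $_->{text}\n" for sort_options(@options);
```

A plain cmp on the encoded text would put &Auml;gypten and &Ouml;sterreich
ahead of everything else, because & sorts before the letters in ASCII. With
the stripped key they sort as Agypten and Osterreich. Note that cmp is still
case-sensitive, so mixed-case lists may want lc() applied to the key as well.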

David R. Miller
Manager, Computer-Based Testing
Medical Council of Canada
dmiller at mcc.ca
(613) 521-6012
(613) 521-9722 (fax)
