Quick ActiveSupport::Multibyte glossary trick
I was trying to make a glossary of words grouped by their first letter, but I wanted words starting with the letter é grouped with words starting with the letter e. No small feat you might imagine. Wrong.
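# Group words by their first letter, with accents stripped (é goes under e).
dict = words.inject({}) do |groups, word|
  # Decompose so é becomes e + combining accent, then keep only the first character.
  letter = word.chars.decompose[0..0].downcase.to_s
  groups[letter] ||= []
  groups[letter] << word
  groups
end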
The reason this works is that letters like é have a decomposed form in Unicode: a Latin base letter followed by a combining accent. I’m not sure what happens if you run Arabic through this code, but we’ll cross that bridge when we get there.
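For comparison, here is a rough sketch of the same trick without ActiveSupport, assuming Ruby 2.2 or later, where String#unicode_normalize is part of the standard library. The helper names glossary_letter and build_glossary are made up for illustration, and the word list is sample data.

```ruby
# Hypothetical helper: the letter a word should be filed under.
def glossary_letter(word)
  # NFD normalization splits "é" into "e" followed by U+0301 (combining acute
  # accent), so the first character of the result is the bare base letter.
  # [0, 1] returns "" for an empty string instead of nil.
  word.unicode_normalize(:nfd)[0, 1].downcase
end

# Hypothetical helper: group words by their accent-free first letter.
def build_glossary(words)
  words.group_by { |word| glossary_letter(word) }
end

glossary = build_glossary(%w[école eagle Étude apple])
# glossary => {"e"=>["école", "eagle", "Étude"], "a"=>["apple"]}
```

This only helps for letters that actually have a canonical decomposition. Something like ø has none, so it would still end up in its own group.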