Rails + PostgreSQL accent insensitive search
For a project we’re currently working on the need for accent insensitive search came up.
I google for a couple of solutions, most fo which required applying a translate() to the records or using to_ascii() in order to replace accented characters with plain ones. For me this should be a considerable performance hit as the DB had to apply the function to every record before performing the query.
I decided to go with a case insensitive using the operator ‘~*’. Basically, for each character which has accented variations I introduce an expression (ex: a -> (a|à|á|â|ã) resulting in a query like :
select * from my_table where name ~* '(a|à|á|â|ã)ndr(e|é|è|ê)'
This will search for the name “André” in the database.
Since I’m using ruby I created a new method in the String class called accent_insensitive_regexp:
class String
@@ACCENT_INSENSITIVE_REGEXES=[
"(a|á|à|â|ã)","(e|é|è|ê)","(i|í|ì)","(o|ó|ò|ô|õ)","(u|ú|ù)","(c|ç)"
]def accent_insensitive_regexp
res=self
@@ACCENT_INSENSITIVE_REGEXES.each {|exp|
res.gsub! Regexp.new(exp), exp
}
res
endend
Now i can my case and accent insensitive search working by using something like:
User.find :all, :conditions=>[" name ~* '?' ",name.downcase.accent_insensitive_regexp]
Those who’ve tried to use the downcase and upcase methods in Rails will notice that this will not work properly since these methods to not deal with the accented characters.
For dealing with that issue I changed those two methods in the String class to:
alias_method
ld_upcase, :upcase
alias_methodld_downcase, :downcase
@@DOWNCASE_ACCENT_CHARS=”çàáèéìíòóùúâêôãõ”
@@UPCASE_ACCENT_CHARS=”ÇÀÁÈÉÌÍÒÓÙÚÂÊÔÃÕ”def upcase
self.old_upcase.tr(@@DOWNCASE_ACCENT_CHARS,@@UPCASE_ACCENT_CHARS)
enddef downcase
self.old_downcase.tr(@@UPCASE_ACCENT_CHARS,@@DOWNCASE_ACCENT_CHARS)
end
And there you have it. I quick and dirty hack to get this stuff to work properly in RoR. Please notice I haven’t spent much time cleaning the code I just though this could get someone with similar problems started.

Comments(2)

















































I like this approach but your query will not find ANDRÉ or any name with a uppercase accented letter. Any suggestion to solve this problem?
I ordered parts for a new pc that I am building and in the heat of the moment forgot to add an OS to my shopping cart on newegg. I ordered: amd phenom 2 955BE sapphire 4850 A GIGABYTE GA-MA790XT-UD4P The ram and cd/dvd drive im sure will work since they arent anything fancy or new. I have linux 9.04 64bit cd. So will linux recognize my hardware?
________________
unlock iphone 3gs