[nycphp-talk] Squashing accented characters

Andrew Yochum andrew at plexpod.com
Fri Oct 22 14:57:53 EDT 2010


Hi Paul,

You can achieve that with Unicode transliteration:
     http://cldr.unicode.org/index/cldr-spec/transliteration-guidelines
Check out the PHP iconv extension, which can transliterate when you
append //TRANSLIT to the target charset:
     http://us.php.net/manual/en/intro.iconv.php
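
A minimal sketch of that idea, assuming the iconv extension is loaded and
the source file is saved as UTF-8. The function name squash_accents and the
small override table are made up for illustration. iconv's //TRANSLIT output
depends on the platform and locale, so the table pins down the characters
you care about first and iconv only approximates whatever is left:

```php
<?php
// Hypothetical helper (not from any library): squash accented Latin
// characters in a UTF-8 string down to plain ASCII.
function squash_accents(string $text): string
{
    // Illustrative overrides only; extend to cover Latin-1, Latin Extended-A
    // and whatever else shows up in your data. Keys are UTF-8 literals.
    static $map = [
        'ü' => 'u', 'Ü' => 'U',
        'ö' => 'o', 'Ö' => 'O',
        'ä' => 'a', 'Ä' => 'A',
        'ß' => 'ss',
        'ł' => 'l', 'Ł' => 'L',
        'ó' => 'o', 'ź' => 'z',
    ];

    // strtr() with an array replaces each key with its value in one pass.
    $text = strtr($text, $map);

    // Let iconv approximate anything the table did not cover. The result
    // for unmapped characters varies by iconv implementation and locale.
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $text);

    // iconv() returns false on failure; fall back to the table-only result.
    return $ascii === false ? $text : $ascii;
}
```

Run both the indexed place names and the incoming queries through the same
function so the two sides always match up.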

Hope that helps!

Regards,
Andrew

On 10/22/10 2:50 PM, Paul A Houle wrote:
> For my site at
>
> http://ookaboo.com/
>
> I'm running into the problem that people are searching for 
> "Dusseldorf" but the name of the place is "Düsseldorf",  so they don't 
> find it.
>
> It seems to me a good answer to this is to have some function that 
> squashes accented characters down to unaccented forms.  I'd index the 
> unaccented forms and also squash down queries so they'd always match 
> up.  I definitely need to do both ISO-Latin-1 and
> Latin Extended-A, because fate has given me a lot of place names
> that have the Polish dark L in them (ł
> <http://fileformat.info/info/unicode/char/0142/>).  It also seems like
> there are a lot of characters in Latin Extended-B that would also map
> plausibly to unaccented characters.
>
> I can see how to write something like this: I'd need to parse out the
> Unicode code points from UTF-8 and run them through a lookup table.
> But it's a lot of details, and I wonder if anybody has already written
> a PHP function to do this.


-- 
Andrew Yochum
Plexpod
andrew at plexpod.com
office: 718-360-0879
mobile: 347-688-4699
fax:    718-504-6289
