Buttitta: Huge string replace in JavaScript?

I've got a small JavaScript application that parses files the user
drops into the browser. Recently I've discovered an issue with some
non-English characters. The files dropped here use
the Windows-1252 character set, so characters such as ñ are actually
coming through as Ã±, and I must convert them all to the proper characters.
I've found an extremely useful website with a collection of these
characters and the counterparts I need to convert them to.
I've condensed that down into two JavaScript arrays:
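// Entries whose second character is invisible (bytes undefined in Windows-1252,
// the no-break space, the soft hyphen) are written as \u escapes, assumed from
// the UTF-8 byte values of the matching replaceWith entries.
var toReplace = ["Ã€", "Ã\u0081", "Ã‚", "Ãƒ", "Ã„", "Ã…", "Ã†", "Ã‡", "Ãˆ",
"Ã‰", "ÃŠ", "Ã‹", "ÃŒ", "Ã\u008D", "ÃŽ", "Ã\u008F", "Ã\u0090", "Ã‘", "Ã’", "Ã“", "Ã”", "Ã•",
"Ã–", "Ã—", "Ã˜", "Ã™", "Ãš", "Ã›", "Ãœ", "Ã\u009D", "Ãž", "ÃŸ", "Ã\u00A0", "Ã¡",
"Ã¢", "Ã£", "Ã¤", "Ã¥", "Ã¦", "Ã§", "Ã¨", "Ã©", "Ãª", "Ã«", "Ã¬", "Ã\u00AD",
"Ã®", "Ã¯", "Ã°", "Ã±", "Ã²", "Ã³", "Ã´", "Ãµ", "Ã¶", "Ã·", "Ã¸", "Ã¹",
"Ãº", "Ã»", "Ã¼", "Ã½", "Ã¾", "Ã¿"];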
var replaceWith = ["À", "Á", "Â", "Ã", "Ä", "Å", "Æ", "Ç", "È", "É", "Ê",
"Ë", "Ì", "Í", "Î", "Ï", "Ð", "Ñ", "Ò", "Ó", "Ô", "Õ", "Ö", "×", "Ø", "Ù",
"Ú", "Û", "Ü", "Ý", "Þ", "ß", "à", "á", "â", "ã", "ä", "å", "æ", "ç", "è",
"é", "ê", "ë", "ì", "í", "î", "ï", "ð", "ñ", "ò", "ó", "ô", "õ", "ö", "÷",
"ø", "ù", "ú", "û", "ü", "ý", "þ", "ÿ"];
What would be the most efficient way to replace every sequence in a
paragraph that appears in toReplace with its counterpart (same index) in replaceWith?
I'm hoping this won't be too loop-heavy, since it's not uncommon to drop
over 100 files into this application, which already does some heavy looping
and parsing.
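One way to avoid a separate pass per entry, sketched under the assumption that the two arrays stay as they are: build a lookup object and a single combined regular expression once, then let String.prototype.replace call a function for each match. The names lookup, escapeRegExp, pattern and fixMojibake are hypothetical, and the tables are shortened for illustration.

```javascript
// Shortened tables for illustration; the full arrays would be used the same way.
var toReplace = ["Ã±", "Ã©", "Ã¼"];
var replaceWith = ["ñ", "é", "ü"];

// Build the lookup object once: garbled sequence -> intended character.
var lookup = {};
for (var i = 0; i < toReplace.length; i++) {
  lookup[toReplace[i]] = replaceWith[i];
}

// Escape regex metacharacters so each entry is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// One alternation covering every garbled sequence, also built only once.
var pattern = new RegExp(toReplace.map(escapeRegExp).join("|"), "g");

// Hypothetical helper: a single pass over the text, one lookup per match.
function fixMojibake(text) {
  return text.replace(pattern, function (match) {
    return lookup[match];
  });
}
```

Calling fixMojibake(fileText) once per file keeps the work to one scan of each file, however many entries the tables hold.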
Perhaps there is a better way to do this instead of keeping these
characters in arrays?
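The pattern Ã± is what the two UTF-8 bytes of ñ (0xC3 0xB1) look like when decoded as Windows-1252. Assuming that is what is happening here, an alternative that needs no replacement arrays is to map each character back to its Windows-1252 byte and decode those bytes as UTF-8. The sketch below assumes an environment that provides TextDecoder; CP1252_HIGH and redecodeAsUtf8 are hypothetical names.

```javascript
// Code points that Windows-1252 assigns to bytes 0x80..0x9F, in byte order.
// The five undefined bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) map to themselves.
var CP1252_HIGH = [
  0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
  0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
  0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
  0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178
];

// Hypothetical helper: undo a "UTF-8 read as Windows-1252" decode.
// Returns the input unchanged if it cannot be such a mis-decoded string.
function redecodeAsUtf8(garbled) {
  var bytes = new Uint8Array(garbled.length);
  for (var i = 0; i < garbled.length; i++) {
    var code = garbled.charCodeAt(i);
    if (code > 0xFF) {
      // Characters above U+00FF must come from the 0x80..0x9F block.
      var idx = CP1252_HIGH.indexOf(code);
      if (idx === -1) {
        return garbled; // not representable in Windows-1252
      }
      code = 0x80 + idx;
    }
    bytes[i] = code;
  }
  try {
    // fatal: true makes invalid UTF-8 throw instead of yielding U+FFFD.
    return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch (e) {
    return garbled; // bytes were not valid UTF-8, so leave the text alone
  }
}
```

If the files are read with FileReader, passing "UTF-8" as the second argument to readAsText may avoid the garbling in the first place, though that depends on how the application actually reads them.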

Buttitta

Tuesday, 13 August 2013
