Character Set in JavaScript

Last updated on 5 Nov, 2020

JavaScript programs are written using the Unicode character set. Unicode is a superset of ASCII and supports most of the world's languages.

Unicode is a standard for the consistent representation of text, maintained by the Unicode Consortium, a non-profit organization based in Mountain View, California.

Non-English Text

Let us try to print Japanese text using a console.log() statement. I translated "hope" to Japanese using Google Translate, which romanizes the result as Nozomu.
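A minimal sketch of that step. The exact string is an assumption: it takes the translated text to be 望む, which is romanized as Nozomu.

```javascript
// Assumption: "望む" (romanized "Nozomu") stands in for the translated text.
const hope = "望む";

console.log(hope);        // "望む"
console.log(hope.length); // 2
```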


The code above prints the Japanese text to the console unchanged.


Since JavaScript supports the Unicode character set, it is also possible to use words from other languages as variable names.

const പേര് = "Backbencher";
console.log(പേര്); // "Backbencher"

The code above uses a Malayalam word as an identifier, which is valid in JavaScript.

Escape Sequence

If hardware or software limitations prevent us from typing a particular Unicode character, we can use an escape sequence instead. Any Unicode character in the Basic Multilingual Plane (code points U+0000 to U+FFFF) can be represented using 6 characters: a \, a u and 4 hexadecimal digits.

console.log("\u2764"); // "❤"

The code above logs a heart symbol to the console.
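The four-digit form only reaches code points up to U+FFFF. For characters beyond that range, such as most emoji, ES2015 added the \u{...} form, and the same character can also be written as a pair of four-digit escapes (a surrogate pair). The sketch below uses U+1F600 purely as an illustrative character; the variable names are made up for this example.

```javascript
// ES2015 code point escape: any code point, 1 to 6 hex digits inside braces.
const grin = "\u{1F600}";

// The same character written as a UTF-16 surrogate pair.
const grinPair = "\uD83D\uDE00";

console.log(grin === grinPair);                // true
console.log(grin.length);                      // 2 (two UTF-16 code units)
console.log(grin.codePointAt(0).toString(16)); // "1f600"
```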

Another useful case is writing accented Latin letters. How do we write an é? We can use a Unicode escape here as well.

console.log("\u00e9"); // "é"

To the JavaScript engine, é and \u00e9 are the same.

console.log("é" === "\u00e9"); // true


Some characters can be written in more than one way using Unicode. Take é: it can be written as a single Unicode character, as seen above.

console.log("\u00e9"); // "é"

é can also be written by combining the plain ASCII e with the combining acute accent (\u0301). The combining mark adds the accent to whatever character precedes it.

console.log("e\u0301"); // "é"
console.log("f\u0301"); // "f́"

Even though both techniques produce the same visible output, the two strings are not equal internally.

console.log("\u00e9" === "e\u0301"); // false
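One way to compare such strings reliably is to normalize both sides first with the built-in String.prototype.normalize(). The sketch below is illustrative only; codePoints is a hypothetical helper written for this example, not a built-in.

```javascript
const precomposed = "\u00e9";  // single code point é
const decomposed = "e\u0301";  // e followed by combining acute accent

// Hypothetical helper: list the code points of a string as U+XXXX labels.
const codePoints = (s) =>
  [...s].map((c) => "U+" + c.codePointAt(0).toString(16).toUpperCase().padStart(4, "0"));

console.log(codePoints(precomposed)); // ["U+00E9"]
console.log(codePoints(decomposed));  // ["U+0065", "U+0301"]

// NFC composes base letter + combining mark into one code point where possible.
console.log(precomposed.normalize("NFC") === decomposed.normalize("NFC")); // true
```

Normalizing user input to a single form (NFC is a common choice) before comparing or storing it avoids this kind of mismatch.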

Unicode Application

Even though we can use Unicode in variable names and string literals, doing so directly is rare. I have not seen anyone use a Japanese word as a variable name. For maximum readability, it is best to name variables in English.

There can be scenarios where we need to insert a special character such as the copyright symbol. Using a Unicode escape in that case can save us from inserting an additional image.

console.log("\u00A9"); // "©"
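As a small sketch of that scenario, the escape can sit inside a longer string such as a footer line. The year and site name below are placeholder values chosen for illustration.

```javascript
// Placeholder values for illustration only.
const year = 2020;
const siteName = "Backbencher";

// Escape sequences work inside template literals as well as quoted strings.
const footer = `\u00A9 ${year} ${siteName}`;

console.log(footer); // "© 2020 Backbencher"
```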
--- ○ ---
Joby Joseph
Web Architect