Google: 60 Percent of the Web’s Content is Now in Unicode

Unicode's mission is to enable people around the world to use computers in their own language by creating a standard for encoding the characters of all the world's writing systems. Judging from the latest data from Google, Unicode is clearly on its way to fulfilling this mission. According to Google, about 60% of the web's content is now encoded in Unicode.


Since 2006 alone, Unicode's usage has grown 800%. As Google notes, if it had also counted content in ASCII, which is basically a subset of most other encodings, Unicode's share would have been closer to 80%.
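To see why ASCII can reasonably be counted toward Unicode's share, here is a minimal Python sketch, not taken from Google's post. It checks that the same ASCII-only bytes decode to identical text under UTF-8 and a few legacy encodings. The sample string and the list of encodings are illustrative assumptions.

```python
# Illustrative sample: a string made only of ASCII characters (an assumption for this sketch).
sample = "Hello, web!"
ascii_bytes = sample.encode("ascii")

# A handful of common encodings, chosen for illustration; this list is not exhaustive.
encodings = ["utf-8", "iso-8859-1", "windows-1252", "koi8-r"]

# Decode the very same bytes under each encoding.
decoded = {enc: ascii_bytes.decode(enc) for enc in encodings}

# True if every encoding interprets the ASCII bytes as the same text.
all_same = all(text == sample for text in decoded.values())
print(all_same)
```

Because the bytes are identical either way, a page that contains only ASCII characters is already valid UTF-8, whatever encoding it declares.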

Today's Unicode standard includes close to 110,000 characters, among them 75,000 Chinese ideographs, the characters used for Arabic and Russian, and hundreds of emoji symbols.

Google itself uses Unicode as the internal format for all the text in Google Search. Indeed, whenever it encounters text in any other format, the first thing it does is convert it to Unicode.
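The convert-first approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Google's actual pipeline: the helper name `to_unicode`, the Latin-1 fallback, and the sample text are all hypothetical.

```python
def to_unicode(raw: bytes, declared_encoding: str = "utf-8") -> str:
    """Decode raw bytes into a Python str, which holds Unicode code points.

    Hypothetical helper for illustration only.
    """
    try:
        return raw.decode(declared_encoding)
    except (UnicodeDecodeError, LookupError):
        # Assumed fallback: Latin-1 maps every byte to a code point, so it never fails,
        # although the result may be wrong if the real encoding was something else.
        return raw.decode("latin-1")


# Bytes as they might arrive from a page served in a legacy encoding (sample data).
legacy_bytes = "Größe".encode("iso-8859-1")

# Step 1: convert to Unicode as soon as the text comes in.
text = to_unicode(legacy_bytes, "iso-8859-1")

# Step 2: from here on, work with Unicode only, for example by storing it as UTF-8.
utf8_bytes = text.encode("utf-8")
```

Converting at the boundary means the rest of a system only ever has to handle one representation of text, whatever encoding the original page used.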


About the author

Frederic Lardinois has written 851 articles for SiliconFilter

Frederic Lardinois founded SiliconFilter in 2011. Before starting this site, he wrote about 1,500 articles for ReadWriteWeb. His areas of interest are consumer web and mobile apps, as well as Internet-connected devices like cars, smart sensors and toasters.

