How to Set a Web Page's Encoding Type


What is an Encoding Type?

The encoding type refers to the character encoding: the method of converting a sequence of bytes into a sequence of characters. When the internet was first created, character encodings were tailored almost entirely to the English language. In response to a request, the server transfers its message to the client as a stream of bytes. Once the client receives the bytes, it interprets them as a sequence of characters. This conversion can range from a simple one-to-one correspondence to complex switching schemes. If the browser or reader does not know the character set, it has to guess at the encoding type, which can lead to errors wherever a sequence of bytes is misread. Essentially, character encodings are what allow different languages and characters to be transmitted across the internet. Specifying the encoding type tells browsers and readers to use that encoding rather than guessing. For background on character encodings and their evolution, visit www.w3.org.
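To make the byte-to-character step concrete, here is a minimal Python sketch. The sample word and the choice of ISO-8859-1 as the "wrong" encoding are assumptions made only for illustration. It shows that the same bytes produce different characters depending on which encoding the reader applies.

```python
# A short piece of text, turned into bytes the way a server might send it.
text = "café"
data = text.encode("utf-8")

# Decoding with the encoding that produced the bytes recovers the text.
as_utf8 = data.decode("utf-8")

# Decoding with a different encoding (ISO-8859-1 here) raises no error,
# but silently yields the wrong characters for the accented letter.
as_latin1 = data.decode("iso-8859-1")

print(data)       # the raw bytes on the wire
print(as_utf8)    # the text as intended
print(as_latin1)  # the text as misread
```

The accented letter takes two bytes in UTF-8, so a reader that assumes one byte per character displays two unrelated characters in its place.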

Why Set a Web Page's Encoding Type?

When you code your website without providing an encoding type, you probably have no idea how much work the browser is doing behind the scenes. If no encoding type is specified, the browser first tries to guess the encoding from the content, and only then falls back to a default encoding. That default depends on the browser and its locale, so it is not guaranteed to be UTF-8. This guessing can make the browser load the page more slowly than if the character encoding type had been provided. Search engines face a similar issue, but they have a bit of an advantage. Browsers are forced to load the page because the user chose the URL. Search engines, on the other hand, can simply omit a web page without a character encoding type from the results, because they cannot be certain that it will help the user with their search. This does not mean that search engines won't index web pages without an encoding type, but such pages may fare worse than similar pages that specify an encoding.
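The guess-then-fall-back behavior can be sketched roughly as follows. This is a deliberately simplified illustration, not the algorithm any particular browser uses: the function name `guess_encoding`, the order of the checks, and the `windows-1252` fallback are all assumptions chosen for the example.

```python
import codecs

# Byte order marks this sketch knows about, paired with the encoding they signal.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def guess_encoding(data: bytes, fallback: str = "windows-1252") -> str:
    """Guess an encoding for undeclared bytes (hypothetical helper)."""
    # Step 1: a byte order mark at the start is the strongest hint.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # Step 2: if the bytes happen to be valid UTF-8, assume UTF-8.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Step 3: give up guessing and use the fallback (an assumed default).
        return fallback


print(guess_encoding("café".encode("utf-8")))
print(guess_encoding("café".encode("cp1252")))
```

Every step here is work that a declared encoding makes unnecessary, and the last step can still pick the wrong answer.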

Setting the Encoding Type

The majority of web pages use UTF-8, but you can choose a different character encoding for your web page. Whatever encoding type you choose, make sure you implement it correctly (the file must actually be saved in the encoding you declare) and that the page contains only characters defined by Unicode. The encoding type of an HTML document can be set in two ways. We will set our web page encoding to UTF-8 using both of them below:

Specify the Encoding in the HTTP Headers

Both server configuration files and server-side code can set the HTTP header (Content-Type) that specifies the web page's encoding type. In this example, we are using an .htaccess file for Apache.

# Serve example.html as HTML and declare UTF-8 in the Content-Type header.
<Files example.html>
ForceType text/html;charset=UTF-8
</Files>
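The same header can also be set from server-side code. As a minimal sketch, assuming a Python WSGI application (the function name `app` and the page content are placeholders, not part of the example above), the encoding is declared in the Content-Type header and the body is encoded to match:

```python
def app(environ, start_response):
    """Tiny WSGI application that declares its encoding in the HTTP header."""
    html = "<!DOCTYPE html><html><body><p>café</p></body></html>"
    # Encode the body with the same encoding that the header declares.
    body = html.encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=UTF-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

For local experiments, a function like this could be served with the standard library's wsgiref.simple_server.make_server. The important point is that the declared charset and the bytes actually sent agree.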

Specify the Encoding in the Head Tag

While you can use this in any HTML or XHTML document, it is considered secondary to using HTTP headers. This is primarily because the headers are received first, whereas with an in-document element such as this, the browser needs to first parse part of the page to find the encoding.

<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
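To illustrate why the in-document declaration costs extra work, here is a rough Python sketch of a client scanning the start of a page for that meta element before it knows how to decode the rest. The class and function names are hypothetical, the 1024-byte window is an assumption for the example, and real browsers use a more involved pre-scan.

```python
from html.parser import HTMLParser
from typing import Optional


class MetaCharsetFinder(HTMLParser):
    """Records the charset from the first http-equiv Content-Type meta tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attr = dict(attrs)
        if (attr.get("http-equiv") or "").lower() != "content-type":
            return
        # content looks like "text/html;charset=utf-8"; find the charset part.
        for part in (attr.get("content") or "").split(";"):
            name, _, value = part.strip().partition("=")
            if name.strip().lower() == "charset" and value:
                self.charset = value.strip().lower()


def find_meta_charset(raw: bytes) -> Optional[str]:
    """Scan the first bytes of a page for a declared charset, if any."""
    finder = MetaCharsetFinder()
    # The tag itself is plain ASCII, so an ASCII decode is enough to find it.
    finder.feed(raw[:1024].decode("ascii", errors="replace"))
    return finder.charset


page = b'<html><head><meta http-equiv="Content-Type" content="text/html;charset=utf-8" /></head></html>'
print(find_meta_charset(page))
```

With the encoding in the HTTP header, none of this scanning is needed before decoding can begin.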