Encoding in web development is the process of transforming data from one format into another using an agreed-upon code system, or scheme. Such a scheme is built from a set of characters: numbers, letters, symbols, and so on.
Whoever designs a given scheme assigns each character a unique value. Encoding data simply means converting it into the corresponding values from the scheme one chooses to use.
We encode information so that the different systems that process data on the web can consume it properly. Encoded text can be decoded back into its original form by reversing the scheme used to encode it. Encoding is essential in web development because it supports both the communication and the storage of information. However, encoding does not protect data, so one should not rely on it when sending confidential or private information.
Encoding vs Encryption vs Hashing
Many people outside the development world confuse encoding with encryption and hashing, and use the three terms interchangeably. However, encoding is not the same as encryption, or as hashing for that matter.
Encoding in web development, as explained above, is a sort of translation: a process that helps a system read and display data properly using an agreed-upon scheme. Encryption, in comparison, keeps data confidential by turning the original data into a seemingly random string of characters. The encryption can then be reversed only by someone who holds the right key.
As opposed to both encryption and encoding, hashing is a one-way process that turns data into a fixed-length value called a hash. Once a piece of data has been hashed, it cannot be turned back into its original form. Hashing is especially useful when storing sensitive data like user passwords.
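To make the difference concrete, here is a minimal sketch in Python. Base64 and SHA-256 are my own picks for illustration (the text above does not name specific schemes), and the sample message is made up. Encryption is left out because it needs a key and usually a third-party library. Also note that real password storage typically relies on salted, deliberately slow hash functions rather than a single plain SHA-256 call.

```python
import base64
import hashlib

message = "hello web"  # hypothetical sample data

# Encoding: anyone who knows the scheme can reverse it, no key needed.
encoded = base64.b64encode(message.encode("utf-8"))
decoded = base64.b64decode(encoded).decode("utf-8")

# Hashing: produces a fixed-length digest; there is no matching "decode" call.
digest = hashlib.sha256(message.encode("utf-8")).hexdigest()

print(encoded)       # Base64 bytes, readable by anyone
print(decoded)       # the original message again
print(len(digest))   # SHA-256 hex digests are always 64 characters long
```

The encoded value round-trips back to the original, while the digest only lets you check whether some other input hashes to the same value.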
So why do we use encoding in web development?
The simple answer is that computers don’t see data as we humans do. They see it in binary, a system made up of zeros and ones. So, encoding information helps computers “translate” data from text, images, videos, etc. into zeros and ones they can read and understand.
Once the information is processed, it is decoded using the same scheme and served back to us, the users, in a form that we can consume and understand.
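As a rough illustration of that round trip, the sketch below turns a short piece of text into the zeros and ones a computer stores, then decodes it back. The sample string and the choice of UTF-8 are assumptions made for the example.

```python
text = "Hi"  # hypothetical sample text

# Encode: characters -> bytes -> a printable string of bits.
data = text.encode("utf-8")
bits = " ".join(format(byte, "08b") for byte in data)
print(bits)  # 01001000 01101001

# Decode: bits -> bytes -> characters.
restored = bytes(int(chunk, 2) for chunk in bits.split()).decode("utf-8")
print(restored)  # Hi
```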
Most commonly used encoding standards
ASCII encoding
Encoding in web development uses various universal coding schemes, and the oldest and best-known among them is ASCII. ASCII stands for American Standard Code for Information Interchange and evolved from telegraph codes. This standard was “born” in the sixties and is still used to encode text files to this day.
ASCII defines 128 characters, numbered 0 to 127, each of which fits in 7 bits. Values from 128 to 255, which also fit in an 8-bit byte, are not defined by ASCII itself. The defined characters include digits, symbols, punctuation marks, and lowercase and uppercase letters. ASCII characters are either printable (95 of them, counting the space) or non-printable control characters (33 of them, such as tab and newline).
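A quick way to poke at the ASCII table is Python's built-in ord and chr functions. The sketch below is only an illustration: it looks up a couple of codes, counts the printable and control characters, and shows what happens when a character falls outside ASCII.

```python
# Each ASCII character maps to a number from 0 to 127, and back.
print(ord("A"))  # 65
print(chr(97))   # a

# Codes 32 to 126 are printable; 0 to 31 and 127 are control characters.
printable = [chr(code) for code in range(128) if 32 <= code <= 126]
control = [code for code in range(128) if code < 32 or code == 127]
print(len(printable), len(control))  # 95 33

# Anything outside the 128 defined characters cannot be encoded as ASCII.
try:
    "\u00e9".encode("ascii")  # U+00E9 is an e with an acute accent
    fits_in_ascii = True
except UnicodeEncodeError:
    fits_in_ascii = False
print(fits_in_ascii)  # False
```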
Unicode encoding
Unicode builds on ASCII: its first 128 characters are identical to the ASCII set, but it goes far beyond them to cover the characters of most of the world's writing systems, which makes Unicode more complex.
UTF-8 is a very popular choice, but it is only one way of encoding Unicode. It is the default choice for HTML5 documents and for the input submitted from them in the browser. Another example of a Unicode encoding is UTF-16. The 8 and 16 endings refer to the size in bits of the basic unit each encoding works with, not to a fixed size per character: UTF-8 uses one to four 8-bit units per character, while UTF-16 uses one or two 16-bit units. Ultimately, both can represent every Unicode character.
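To see how the two encodings differ in practice, the sketch below encodes a few sample characters both ways and counts the bytes. The helper name encoded_lengths and the sample characters are made up for this example.

```python
def encoded_lengths(char: str) -> tuple[int, int]:
    """Return the number of bytes char takes in UTF-8 and in UTF-16."""
    utf8 = char.encode("utf-8")
    utf16 = char.encode("utf-16-be")  # explicit byte order, so no BOM is added
    return len(utf8), len(utf16)

samples = {
    "A": "Latin capital letter A",
    "\u00e9": "e with acute accent",
    "\u20ac": "euro sign",
    "\U0001F600": "grinning face emoji",
}

for char, name in samples.items():
    utf8_len, utf16_len = encoded_lengths(char)
    print(f"U+{ord(char):04X} {name}: UTF-8 {utf8_len} bytes, UTF-16 {utf16_len} bytes")
```

The letter A takes a single byte in UTF-8 but two in UTF-16, while the emoji takes four bytes in both. Either way, decoding gives back the same character.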
URL encoding
URL (Uniform Resource Locator) encoding, also known as percent-encoding, relies heavily on the ASCII character set to encode and transmit data over the internet.
However, a URL may only contain a limited set of ASCII characters, so any character outside that set, or any character that has a special meaning in a URL, needs to be replaced with a “safe” representation. Simply put, URL encoding replaces each unsafe character with the percentage sign % followed by two hexadecimal digits.
For example, there are no literal spaces within a URL; really, check for yourself. So, URL encoding replaces a space with %20, the percentage sign followed by 20, which is the hexadecimal ASCII code for a space. In form data and query strings, the plus sign + is commonly used for a space instead.
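Python's standard library exposes both flavors through urllib.parse, so here is a small sketch of them side by side. The sample strings are invented for the example.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

raw = "web dev & encoding"  # hypothetical text to place in a URL

# Percent-encoding: a space becomes %20 and & becomes %26.
percent_encoded = quote(raw)
print(percent_encoded)  # web%20dev%20%26%20encoding

# Form-style encoding: a space becomes + instead.
form_encoded = quote_plus(raw)
print(form_encoded)  # web+dev+%26+encoding

# Non-ASCII characters are first encoded as UTF-8, then each byte is escaped.
accented = quote("caf\u00e9")
print(accented)  # caf%C3%A9

# Both forms decode back to the original text.
print(unquote(percent_encoded) == raw, unquote_plus(form_encoded) == raw)  # True True
```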