Dispersion Design

Percent Encoding of Characters



Percent encoding is a method of representing characters that are not permitted in a string, such as reserved or unsafe characters in a URL. Each such character is replaced by a '%' followed by the character's two-digit hexadecimal value.

Percent encoding is most often seen in URLs (URIs) and the most commonly encoded character is a space. URLs are not allowed to contain the space character (ASCII character number 32, which is 0x20 in hexadecimal notation), so a space character gets written as '%20'. For example:

http://www.dispersiondesign.com/path containing spaces/

would be encoded as:

http://www.dispersiondesign.com/path%20containing%20spaces/

The following table shows some characters and their percent encoded equivalent:

Character | ASCII value | ASCII value (in hex) | Percent Encoded
space | 32 | 0x20 | %20
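
Entries like these can be computed rather than looked up. The following is a minimal JavaScript sketch, assuming single-byte ASCII characters only. The helper name tableRow and the sample characters are chosen for illustration and are not part of the original article:

```javascript
// Hypothetical helper: builds one table entry for a single ASCII character.
function tableRow(ch) {
	var code = ch.charCodeAt(0);               // decimal ASCII value
	var hex = code.toString(16).toUpperCase(); // hexadecimal digits
	if (hex.length < 2) {
		hex = '0' + hex;                       // always use two hex digits
	}
	return { character: ch, decimal: code, hex: '0x' + hex, encoded: '%' + hex };
}

// Sample characters picked for illustration.
var rows = [' ', '#', '%', '/', '?'].map(tableRow);

rows.forEach(function (r) {
	console.log(JSON.stringify(r.character) + '\t' + r.decimal + '\t' + r.hex + '\t' + r.encoded);
});
```

The first entry is the space character from the example above: decimal 32, hexadecimal 0x20, percent encoded as %20.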

Let’s see how to decode and encode percent encoded strings.

Decoding (Unescaping) Percent Encoding

In programming languages that support regular expressions, such as Perl, PHP and JavaScript, decoding a percent encoded string is a simple substitution operation. First we need a regular expression that locates valid percent encoded character sequences. In URLs, a percent encoded sequence starts with a '%' (percent) character, followed by exactly two characters that can be 0-9, a-f or A-F. In regular expression syntax, we can find two consecutive characters that are 0-9, a-f or A-F with:

[0-9a-fA-F]{2}

Finding these characters with a preceding '%' character, and capturing the two hexadecimal digits in a group so they are available as $1, is then simply:

%([0-9a-fA-F]{2})

The percent encoded value is a hexadecimal value, so it needs to be converted to a decimal value. In Perl, this is accomplished using the hex() function:

my $decimal = hex($1);

Then, the resulting decimal value needs to be converted to a character. The function in Perl for this is chr():

my $character = chr($decimal);
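
To make the steps concrete (match, hexadecimal to number, number to character), here is a short JavaScript trace for one sample string. It is an illustrative sketch only: the sample string and variable names are assumptions, and parseInt() and String.fromCharCode() stand in for Perl's hex() and chr():

```javascript
var sample = 'A%20B %zz %2';

// Only '%' followed by exactly two hex digits matches,
// so '%zz' and the trailing '%2' are ignored.
var matches = sample.match(/%[0-9a-fA-F]{2}/g);

var hexDigits = matches[0].slice(1);          // drop the leading '%'
var number = parseInt(hexDigits, 16);         // hexadecimal string to number
var character = String.fromCharCode(number);  // number to character

console.log(matches, hexDigits, number, JSON.stringify(character));
```

Here the only match is '%20', which converts to the number 32 and then to a space character.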

Putting this together, the unescaping (decoding) of percent encoding can be performed in Perl with a single line of code:

$str =~ s/%([0-9a-fA-F]{2})/chr(hex($1))/ge;

JavaScript Solution

In JavaScript, the same thing can be performed with the parseInt() and String.fromCharCode() functions:

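// Match '%' followed by exactly two hex digits, capturing the digits.
var regex = /%([0-9a-fA-F]{2})/g;
str = str.replace(regex, function (match, p1) {
	// p1 holds the two hex digits: convert them to a number, then to a character.
	return String.fromCharCode(parseInt(p1, 16));
});
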
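However, JavaScript has a built-in function called unescape() that can perform the same task. It is deprecated in current JavaScript, and decodeURIComponent() is the standard replacement for decoding URL components: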

str = unescape(str);
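
One caveat is worth illustrating. Both the regular expression approach and unescape() turn each %XX sequence into a single character code, so a character that was encoded as several UTF-8 bytes does not come back intact. The sketch below contrasts this with the standard decodeURIComponent() function. The helper name decodeBytewise and the sample strings are hypothetical:

```javascript
// Hypothetical helper: the regular expression approach wrapped in a function.
function decodeBytewise(str) {
	return str.replace(/%([0-9a-fA-F]{2})/g, function (match, p1) {
		return String.fromCharCode(parseInt(p1, 16));
	});
}

var encoded = 'caf%C3%A9%20au%20lait';

// Each %XX becomes one character code, so the two UTF-8 bytes C3 A9
// come back as two separate characters (U+00C3 and U+00A9).
var bytewise = decodeBytewise(encoded);

// decodeURIComponent() treats C3 A9 as one UTF-8 sequence (U+00E9).
var utf8 = decodeURIComponent(encoded);

// Malformed input: the regex leaves '%zz' untouched, while
// decodeURIComponent() throws a URIError.
var untouched = decodeBytewise('100%zz');
var threw = false;
try {
	decodeURIComponent('100%zz');
} catch (e) {
	threw = e instanceof URIError;
}
```

Which behaviour is preferable depends on how the string was encoded in the first place. For URLs produced by modern browsers, UTF-8 is the usual assumption.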

Encoding (Escaping) with Percent Encoding

Creating a percent encoded string requires that the invalid characters first be defined. For example, if you wish to encode all characters that are not a-z, A-Z and 0-9, you would need a regular expression like the following:

[^0-9a-zA-Z]

Now, in Perl, these characters can be substituted using ord() to get the decimal ASCII value for the character and sprintf() to get the hexadecimal equivalent:

$str =~ s/([^0-9a-zA-Z])/sprintf("%%%02X", ord($1))/ge;

JavaScript Solution

In JavaScript, the solution can be written:

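var regex = /[^0-9a-zA-Z]/g;
str = str.replace(regex, function (ch) {
	var d = ch.charCodeAt(0);
	// Pad single-digit hex values with a leading zero; uppercase matches the Perl %02X output.
	return (d < 16 ? '%0' : '%') + d.toString(16).toUpperCase();
});
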
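JavaScript also has a built-in function called escape() that will percent-encode a string. However, the escape() function does not give you any control over which characters are escaped, and it is deprecated in current JavaScript. The standard alternative for URL components is encodeURIComponent(), which likewise uses a fixed set of characters.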
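
Where control over the escaped set is needed, the regular expression can be passed in as a parameter. This is a minimal sketch, assuming ASCII-only input (character codes below 256). The function name percentEncode and its unsafe parameter are hypothetical:

```javascript
// Hypothetical helper: escapes every character matched by `unsafe`,
// which should be a global regex that matches one character at a time.
function percentEncode(str, unsafe) {
	return str.replace(unsafe, function (ch) {
		var hex = ch.charCodeAt(0).toString(16).toUpperCase();
		// Always emit two hex digits after the '%'.
		return '%' + (hex.length < 2 ? '0' + hex : hex);
	});
}

// Escape everything except letters and digits.
var strict = percentEncode('a b/c?d', /[^0-9a-zA-Z]/g);

// Same, but leave '/' alone so path separators survive.
var keepSlash = percentEncode('a b/c?d', /[^0-9a-zA-Z\/]/g);

console.log(strict);
console.log(keepSlash);
```

With the first expression the slash is escaped as %2F. With the second it passes through unchanged, which is the kind of choice escape() does not offer.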