URL Encoding: A Complete Developer's Guide

Everything you need to know about URL encoding, special characters, and proper URL formatting.

conv4me
October 12, 2025

Introduction

URL encoding (percent-encoding) converts unsafe characters into a format that can be transmitted over the internet. URLs can only contain a limited set of characters. Anything else must be encoded.

Failing to encode URLs properly leads to broken links, security vulnerabilities, and data corruption. This guide covers best practices, common pitfalls, and security considerations.

How It Works

Each unsafe character is replaced by %XX, where XX is the hexadecimal value of the byte. For ASCII characters, that byte is the character's ASCII code.

Character types:

Safe (never need encoding):
  A-Z a-z 0-9 - _ . ~

Reserved (have meaning in URLs):
  : / ? # [ ] @ ! $ & ' ( ) * + , ; =

Must be encoded:
  Space and anything not safe/reserved

Encoding process:

Character: @
ASCII code: 64 (decimal)
Hex: 40
Encoded: %40

Character: Space
ASCII code: 32 (decimal)
Hex: 20
Encoded: %20

Example transformation:

Input:  hello world & test
Step 1: Scan the input character by character
Step 2: Identify unsafe chars (space, &)
Step 3: Convert to hex
        space → 0x20 → %20
        & → 0x26 → %26
Output: hello%20world%20%26%20test
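
The steps above can be sketched in JavaScript. This is a minimal illustration for ASCII input only; `percentEncodeAscii` is a hypothetical helper name, not a built-in, and in real code encodeURIComponent() does this job.

```javascript
// Hypothetical helper: percent-encode every character outside the safe set.
// Handles ASCII input only; multi-byte characters need UTF-8 handling.
const SAFE = /^[A-Za-z0-9\-_.~]$/;

function percentEncodeAscii(input) {
  let out = "";
  for (const ch of input) {
    if (SAFE.test(ch)) {
      out += ch; // safe characters pass through unchanged
    } else {
      // character code -> two-digit uppercase hex -> %XX
      const hex = ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, "0");
      out += "%" + hex;
    }
  }
  return out;
}

console.log(percentEncodeAscii("hello world & test"));
// hello%20world%20%26%20test
console.log(percentEncodeAscii("@")); // %40
```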

Why reserved characters need encoding:

URLs have structure: scheme://host:port/path?query=value#fragment

Each symbol has meaning:

  • ? starts query string
  • & separates parameters
  • # starts fragment
  • / separates path segments
  • : separates the scheme from the rest of the URL, and the host from the port

If your data contains these characters, you must encode them. Otherwise the browser interprets them as structure, not data.
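
To see how a parser splits a URL along these symbols, here is a short sketch using the standard URL class (available in browsers and Node.js). The addresses are made up for illustration.

```javascript
// Parse a made-up URL and inspect each structural part.
const parsed = new URL("https://example.com:8080/docs/intro?topic=encoding&lang=en#summary");

console.log(parsed.protocol); // https:
console.log(parsed.hostname); // example.com
console.log(parsed.port);     // 8080
console.log(parsed.pathname); // /docs/intro
console.log(parsed.search);   // ?topic=encoding&lang=en
console.log(parsed.hash);     // #summary

// An unencoded # inside data is taken as the start of the fragment.
const broken = new URL("https://example.com/search?q=c#sharp");
console.log(broken.searchParams.get("q")); // c
console.log(broken.hash);                  // #sharp
```

The second URL shows the failure mode: the intended search term "c#sharp" is cut at the # because the parser treats it as structure.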

Example of what breaks:

// Bad: User searches for "cats & dogs"
const badUrl = `/search?q=cats & dogs`;
// Parser treats "&" as a parameter separator
// Query becomes: q="cats " plus a second, empty parameter named " dogs"
// The search term is cut off at "cats "

// Good: Encode the spaces and the &
const goodUrl = `/search?q=cats%20%26%20dogs`;
// Parser sees a single parameter
// Query becomes: q="cats & dogs"

UTF-8 and multi-byte characters:

Non-ASCII characters (emoji, Chinese, accents) encode to multiple %XX sequences:

café → caf%C3%A9  (é = two bytes: C3 A9)
🚀 → %F0%9F%9A%80  (four bytes)

encodeURIComponent() always encodes characters as UTF-8, so you never need to convert to bytes yourself.
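
The byte sequences can be checked directly. The sketch below uses the standard TextEncoder (which always produces UTF-8); `utf8PercentEncode` is a hypothetical helper written only to expose the bytes and, unlike encodeURIComponent(), it encodes every byte, including safe ones.

```javascript
// Show the UTF-8 bytes behind each percent-encoded sequence.
function utf8PercentEncode(text) {
  const bytes = new TextEncoder().encode(text); // always UTF-8
  return Array.from(bytes, (b) => "%" + b.toString(16).toUpperCase().padStart(2, "0")).join("");
}

console.log(utf8PercentEncode("é"));          // %C3%A9
console.log(utf8PercentEncode("🚀"));         // %F0%9F%9A%80
console.log(encodeURIComponent("café"));      // caf%C3%A9
console.log(decodeURIComponent("caf%C3%A9")); // café
```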

Best Practices

1. Always Encode User Input in URLs

Why: User input can contain special characters that break URL structure. Failure to encode creates security vulnerabilities and broken functionality.

How to implement:

const searchQuery = "cats & dogs";

// Bad: Direct string interpolation
const badUrl = `/search?q=${searchQuery}`;
// Result: /search?q=cats & dogs
// Broken: "&" interpreted as parameter separator

// Good: Encode user input
const goodUrl = `/search?q=${encodeURIComponent(searchQuery)}`;
// Result: /search?q=cats%20%26%20dogs
// Works correctly

Critical: Never trust user input. Always encode it before including in URLs.

2. Use encodeURIComponent for Query Parameters

Why: encodeURIComponent encodes every character except A-Z a-z 0-9 - _ . ! ~ * ' ( ). That makes it the right choice for query parameter values and path segments.

How to implement:

// Correct usage
const params = {
  name: "John Doe",
  email: "john+tag@example.com",
  url: "https://example.com/page?id=123"
};

const queryString = Object.entries(params)
  .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
  .join('&');
// Result: name=John%20Doe&email=john%2Btag%40example.com&url=https%3A%2F%2Fexample.com%2Fpage%3Fid%3D123

Don’t use encodeURI() for parameter values: it leaves &, =, and + unencoded, so a value containing them breaks the query string.
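
The difference between the two functions is easy to see side by side. The sketch below also includes `encodeRFC3986`, a hypothetical helper (not a built-in) for the case where a consumer expects ! ' ( ) * to be escaped as well, since encodeURIComponent leaves those alone.

```javascript
const value = "a&b=c+d";

console.log(encodeURI(value));          // a&b=c+d  (reserved characters untouched)
console.log(encodeURIComponent(value)); // a%26b%3Dc%2Bd

// Hypothetical stricter helper: also escape ! ' ( ) *
function encodeRFC3986(str) {
  return encodeURIComponent(str).replace(
    /[!'()*]/g,
    (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase()
  );
}

console.log(encodeRFC3986("it's (fine)!*")); // it%27s%20%28fine%29%21%2A
```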

3. Understand Reserved vs Unreserved Characters

Why: Knowing which characters need encoding prevents double-encoding and malformed URLs.

Character classification:

Unreserved (never need encoding):
  A-Z a-z 0-9 - _ . ~

Reserved (have special meaning, encode in data):
  : / ? # [ ] @ ! $ & ' ( ) * + , ; =

Must be encoded (unsafe in URLs):
  Space " < > % { } | \ ^ `
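
As a rough illustration, the three groups can be expressed as a small classifier. `classify` and its return labels are hypothetical names chosen for this sketch.

```javascript
// Hypothetical classifier based on the three groups listed above.
const UNRESERVED = /^[A-Za-z0-9\-_.~]$/;
const RESERVED = ":/?#[]@!$&'()*+,;=";

function classify(ch) {
  if (UNRESERVED.test(ch)) return "unreserved";
  if (ch.length === 1 && RESERVED.includes(ch)) return "reserved";
  return "must-encode";
}

for (const ch of ["a", "~", "@", "/", " ", "%", "|"]) {
  console.log(JSON.stringify(ch), classify(ch));
}
// "a" and "~" are unreserved, "@" and "/" are reserved,
// and " ", "%" and "|" fall into must-encode
```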

Example:

// Space must be encoded
"hello world" → "hello%20world"

// @ is reserved, encode in data
"user@example.com" → "user%40example.com"

// / is reserved, leave it unencoded when it separates path segments
"/api/users" → "/api/users" (no encoding)

4. Never Double-Encode URLs

Why: Double-encoding produces incorrect URLs that must be decoded twice. The result is hard to debug and corrupts data.

How to implement:

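// Bad: Encoding already-encoded data
const encoded = encodeURIComponent("hello world");  // "hello%20world"
const doubleEncoded = encodeURIComponent(encoded);  // "hello%2520world"
// %20 became %2520 - wrong!

// Good: Keep raw values raw and encode exactly once, when building the URL
const raw = "hello world";
const url = `/search?q=${encodeURIComponent(raw)}`;  // "/search?q=hello%20world"

// Last resort, when you cannot tell whether input is already encoded
function safeEncode(str) {
  try {
    // Decode first, then encode. Stable for well-formed input, but raw text
    // that merely looks encoded (a literal "%20") is treated as a space.
    return encodeURIComponent(decodeURIComponent(str));
  } catch {
    // decodeURIComponent throws URIError on malformed input such as "100%"
    return encodeURIComponent(str);
  }
}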

Detection: If you see %25 followed by two hex digits (for example %2520), the value is probably double-encoded (% became %25). A lone %25 can simply be an encoded percent sign.
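
A detection check along these lines might look like the sketch below. `looksDoubleEncoded` is a hypothetical helper and only a heuristic: raw text in which a user typed something like "%41" looks identical once it has been encoded correctly.

```javascript
// Heuristic only: %25 followed by two hex digits usually means an
// already-encoded %XX sequence was encoded a second time.
function looksDoubleEncoded(str) {
  return /%25[0-9A-Fa-f]{2}/.test(str);
}

console.log(looksDoubleEncoded("hello%2520world")); // true
console.log(looksDoubleEncoded("hello%20world"));   // false
console.log(looksDoubleEncoded("100%25"));          // false
```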

5. Use URL Builder Libraries for Complex URLs

Why: Manual string concatenation is error-prone. Libraries handle encoding, parameter ordering, and edge cases correctly.

How to implement:

const query = "cats & dogs";
const page = 1;

// Manual (error-prone)
const manualUrl = `https://api.example.com/search?q=${encodeURIComponent(query)}&page=${page}`;
// Result: https://api.example.com/search?q=cats%20%26%20dogs&page=1

// Using URL API (recommended)
const url = new URL('https://api.example.com/search');
url.searchParams.set('q', query);  // Automatically encoded
url.searchParams.set('page', page);  // Non-string values are converted to strings
// Result (url.href): https://api.example.com/search?q=cats+%26+dogs&page=1
// Note: searchParams uses form encoding, so spaces become +

# Python: Use urllib.parse
from urllib.parse import urlencode

params = {'q': 'search term', 'filter': 'active'}
query_string = urlencode(params)
# Result: q=search+term&filter=active
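
The same library approach works in both directions. As a small sketch with made-up parameter names, URLSearchParams can build a query string from an object and parse it back without any manual encoding or decoding.

```javascript
// Build a query string from a plain object.
const params = new URLSearchParams({ q: "cats & dogs", tag: "c++" });
console.log(params.toString()); // q=cats+%26+dogs&tag=c%2B%2B

// Parsing reverses the encoding, including + back to space.
const parsed = new URLSearchParams("q=cats+%26+dogs&tag=c%2B%2B");
console.log(parsed.get("q"));   // cats & dogs
console.log(parsed.get("tag")); // c++
```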

Common Pitfalls

Not Encoding at All

The problem:

// No encoding - dangerous
const username = "user@example.com";
fetch(`/api/users/${username}`);
// Result: /api/users/user@example.com
// Fragile: this value happens to parse, but a username containing / ? or # changes the path

Why it’s bad:

  • Breaks URL parsing (reserved chars have special meaning)
  • Security risk (URL injection attacks)
  • Data corruption (special chars lost or misinterpreted)

The fix:

// Properly encoded
const username = "user@example.com";
fetch(`/api/users/${encodeURIComponent(username)}`);
// Result: /api/users/user%40example.com
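
When a path has several dynamic parts, each segment needs its own encoding pass. A minimal sketch follows; `buildPath` is a hypothetical helper and the routes are made up.

```javascript
// Hypothetical helper: join path segments, encoding each one separately
// so a "/" or "?" inside a value cannot change the URL structure.
function buildPath(...segments) {
  return "/" + segments.map((s) => encodeURIComponent(s)).join("/");
}

console.log(buildPath("api", "users", "user@example.com"));
// /api/users/user%40example.com
console.log(buildPath("api", "files", "reports/2025?draft"));
// /api/files/reports%2F2025%3Fdraft
```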

Encoding the Entire URL

The problem:

// Wrong: Encoding the whole URL
const fullUrl = "https://example.com/search?q=hello world";
const encoded = encodeURIComponent(fullUrl);
// Result: https%3A%2F%2Fexample.com%2Fsearch%3Fq%3Dhello%20world
// Broken: Not a valid URL anymore

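Why it’s bad: Encodes structural characters (://, /, ?, &) that should remain as-is. Encoding a whole URL is only correct when that URL is itself a data value, such as a link passed inside a query parameter.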

The fix:

// Correct: Only encode the data parts
const baseUrl = "https://example.com/search";
const query = "hello world";
const fullUrl = `${baseUrl}?q=${encodeURIComponent(query)}`;
// Result: https://example.com/search?q=hello%20world
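
The one case where a complete URL should go through encodeURIComponent() is when it travels as data inside another URL. A sketch, assuming a hypothetical `next` redirect parameter:

```javascript
// A full URL passed as a parameter value (the "next" parameter is made up).
const target = "https://example.com/page?id=123&ref=home";
const loginUrl = `https://example.com/login?next=${encodeURIComponent(target)}`;
console.log(loginUrl);
// https://example.com/login?next=https%3A%2F%2Fexample.com%2Fpage%3Fid%3D123%26ref%3Dhome

// The receiver gets the original URL back, inner query string included.
console.log(new URL(loginUrl).searchParams.get("next") === target); // true
```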

Using + for Spaces (Outdated)

The problem:

// Old HTML form encoding
const encoded = "hello+world";  // + means space

// But in modern URLs:
const value = "1+1=2";
// Should be: 1%2B1%3D2
// Not: 1+1=2 (loses the + sign)

Why it’s bad: + is ambiguous. In query strings parsed as application/x-www-form-urlencoded it decodes to a space, but in path segments it stays a literal plus sign.
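
The ambiguity shows up as soon as the same string is decoded under both rules. A short sketch with a made-up parameter name `v`:

```javascript
const raw = "a+b%20c";

// decodeURIComponent follows plain percent-encoding: + stays a plus sign.
console.log(decodeURIComponent(raw)); // a+b c

// URLSearchParams follows form encoding: + becomes a space.
console.log(new URLSearchParams("v=" + raw).get("v")); // a b c

// A literal plus encoded as %2B is unambiguous under both rules.
console.log(decodeURIComponent("1%2B1%3D2"));             // 1+1=2
console.log(new URLSearchParams("v=1%2B1%3D2").get("v")); // 1+1=2
```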

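The fix: Use %20 for spaces when you encode by hand. It works in every part of a URL. Form-encoding helpers such as URLSearchParams and Python’s urlencode emit + in query strings, which is safe only when the receiver parses the query as form data.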

"hello world" → "hello%20world"  // Standard
"hello world" → "hello+world"    // Legacy, avoid

Quick Reference Checklist

Encoding user input:

  • Always encode user input before adding to URLs
  • Use encodeURIComponent() for query parameters and path segments
  • Use URL API / libraries instead of manual string concatenation
  • Never encode structural characters (://, /, ?, &)
  • Check for double-encoding (avoid encoding already-encoded data)

Special cases:

  • Use %20 for spaces (not +)
  • Encode @ as %40 in user data
  • Encode & as %26 to prevent parameter splitting
  • Encode # as %23 to prevent fragment interpretation

Security:

  • Never trust user input in URLs
  • Validate URLs after encoding
  • Use URL parsing libraries, not regex
  • Log suspicious encoding patterns (for example %25 followed by hex digits, a sign of double-encoding)
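
As one possible shape for the validation step, the sketch below checks a user-supplied link with the URL parser instead of a regex. `isSafeRedirect` and the host allowlist are assumptions made for illustration, not a complete defense.

```javascript
// Hypothetical allowlist check for a user-supplied redirect target.
const ALLOWED_HOSTS = new Set(["example.com", "www.example.com"]);

function isSafeRedirect(input) {
  let url;
  try {
    url = new URL(input); // throws TypeError on unparsable or relative input
  } catch {
    return false;
  }
  return url.protocol === "https:" && ALLOWED_HOSTS.has(url.hostname);
}

console.log(isSafeRedirect("https://example.com/account"));    // true
console.log(isSafeRedirect("https://example.com.evil.test/")); // false
console.log(isSafeRedirect("javascript:alert(1)"));            // false
console.log(isSafeRedirect("/relative/path"));                 // false
```

Comparing the parsed hostname exactly, rather than searching the raw string for "example.com", is what rejects the look-alike host in the second call.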

Language-Specific Functions

JavaScript:

encodeURIComponent(str)  // Use for data parts
encodeURI(str)          // Use for complete URLs (rare)
decodeURIComponent(str) // Decode data parts
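
One caveat when decoding: decodeURIComponent() throws a URIError on malformed input. A minimal sketch of a guard, where `tryDecode` is a hypothetical wrapper name:

```javascript
// Hypothetical wrapper: return null instead of throwing on malformed input.
function tryDecode(str) {
  try {
    return decodeURIComponent(str);
  } catch (err) {
    if (err instanceof URIError) return null; // signal "not decodable"
    throw err;
  }
}

console.log(tryDecode("user%40example.com")); // user@example.com
console.log(tryDecode("100%"));               // null
console.log(tryDecode("%E0%A4%A"));           // null
```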

Python:

from urllib.parse import quote, quote_plus, unquote
quote(s, safe='')  # Encode a data part (by default quote() leaves / unencoded)
quote_plus(s)      # Encode with + for spaces (forms)
unquote(s)         # Decode

PHP:

urlencode($str)      // Encode for query params (spaces become +)
rawurlencode($str)   // RFC 3986 compliant (recommended)
urldecode($str)      // Decode (also turns + into a space)

Go:

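import "net/url"
url.QueryEscape(str)   // Encode for query strings (spaces become +)
url.PathEscape(str)    // Encode for path segments (spaces become %20)
url.QueryUnescape(str) // Decode; returns (string, error)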

Standards and References

  • RFC 3986: Uniform Resource Identifier (URI) standard
  • RFC 1738: Uniform Resource Locators (legacy)
  • OWASP: URL Encoding Guide
  • MDN: encodeURIComponent() reference

Summary

URL encoding is mandatory for any user-provided data in URLs. Use encodeURIComponent() for query parameters and path segments. Don’t encode the full URL structure. Avoid double-encoding.

Key takeaways:

  1. Always encode user input: Prevents broken URLs and security bugs
  2. Use encodeURIComponent: Correct function for data parts
  3. Don’t encode structure: Keep ://, /, ?, & as-is
  4. Use libraries: URL API handles encoding automatically
  5. Test with special chars: @, &, #, +, = must encode correctly
