Encoding & Decoding for Pentesters — Base64, URL, Hex, and Beyond

Practical guide to encoding and decoding techniques used in penetration testing. Covers Base64, URL encoding, Hex, HTML entities, Unicode, and chaining techniques for WAF bypass.

Encoding and decoding are fundamental operations in penetration testing. Whether you're crafting payloads to bypass WAFs, decoding obfuscated data found during recon, or unwrapping multi-layer CTF challenges, understanding these encoding schemes is essential.

This guide covers the encoding formats you'll use most often during engagements, with practical examples for each. Every format covered here is available in our Encoding/Decoding Multi-Tool with chaining support.

Base64

Base64 is the most common encoding you'll encounter in security work. It represents binary data as ASCII text using a 64-character alphabet (A-Z, a-z, 0-9, +, /), adding = padding at the end whenever the input length is not a multiple of 3 bytes.

When you'll see it

  • JWTs (header and payload are Base64url-encoded, with the = padding stripped)
  • HTTP Basic Authentication headers
  • Email attachments (MIME encoding)
  • Encoded PowerShell payloads (powershell -e)
  • Data exfiltration over DNS or HTTP
  • Obfuscated malware payloads

Command line

# Encode
echo -n "payload" | base64

# Decode
echo "cGF5bG9hZA==" | base64 -d

# PowerShell (UTF-8 text; note that powershell -e expects UTF-16LE, i.e. [Text.Encoding]::Unicode)
[Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes("payload"))
[Text.Encoding]::UTF8.GetString([Convert]::FromBase64String("cGF5bG9hZA=="))
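
Two Base64 variants from the list above tend to trip up a plain base64 -d: JWT segments use the URL-safe alphabet without padding, and powershell -e blobs wrap UTF-16LE text rather than UTF-8. A minimal Python sketch of both follows; the helper names are made up for illustration and are not part of any tool mentioned here.

```python
import base64

def b64url_decode(segment: str) -> bytes:
    # JWT segments drop the trailing '=', so restore it before decoding.
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)

def powershell_encode(command: str) -> str:
    # powershell -e expects Base64 over the UTF-16LE bytes of the command.
    return base64.b64encode(command.encode("utf-16-le")).decode("ascii")

def powershell_decode(blob: str) -> str:
    return base64.b64decode(blob).decode("utf-16-le")

print(b64url_decode("cGF5bG9hZA"))            # b'payload'
print(b64url_decode("eyJhbGciOiJIUzI1NiJ9"))  # b'{"alg":"HS256"}'
print(powershell_encode("whoami"))            # dwBoAG8AYQBtAGkA
print(powershell_decode("dwBoAG8AYQBtAGkA"))  # whoami
```

If a decoded blob shows a null byte after every character, it was most likely UTF-16LE text decoded as UTF-8.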

URL Encoding

URL encoding (percent-encoding) replaces unsafe characters with %XX, where XX is the hex value of each byte of the character. This is critical for web application testing.

Standard vs Full encoding

Standard URL encoding only encodes characters that are unsafe in URLs (spaces, <, >, #, etc.). Full encoding converts every character to its %XX form, which is useful for obfuscation.

# Standard: <script>alert(1)</script>
%3Cscript%3Ealert(1)%3C%2Fscript%3E

# Full: alert(1)
%61%6C%65%72%74%28%31%29
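
The same two styles can be reproduced in Python. This is a rough sketch: url_encode_standard and url_encode_full are illustrative names, and safe="()" is an assumption chosen only so the output matches the example above, which leaves parentheses unencoded.

```python
from urllib.parse import quote, unquote

def url_encode_standard(text: str) -> str:
    # Encode only characters outside the unreserved set; keep "(" and ")".
    return quote(text, safe="()")

def url_encode_full(text: str) -> str:
    # Percent-encode every byte, including letters and digits.
    return "".join(f"%{byte:02X}" for byte in text.encode("utf-8"))

print(url_encode_standard("<script>alert(1)</script>"))
# %3Cscript%3Ealert(1)%3C%2Fscript%3E
print(url_encode_full("alert(1)"))
# %61%6C%65%72%74%28%31%29
print(unquote("%61%6C%65%72%74%28%31%29"))
# alert(1)
```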

Double URL encoding

Some applications decode URL input twice: once at the web server layer and once in application code. If a WAF decodes only once before inspecting the request, double encoding can slip past it:

# Single: <
%3C

# Double: <
%253C

The WAF receives %253C and decodes it once to %3C, which contains no literal < and so looks safe. The backend then decodes the same input twice and ends up with <.
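
The mismatch between one decode at the WAF and two decodes at the backend can be simulated in a few lines. This is a toy model: naive_waf_blocks is a hypothetical stand-in for a filter that decodes once and looks for a literal <, not the behavior of any specific product.

```python
from urllib.parse import quote, unquote

def double_url_encode(text: str) -> str:
    # Encode once, then encode the resulting "%" signs again.
    return quote(quote(text, safe=""), safe="")

def naive_waf_blocks(raw: str) -> bool:
    # Hypothetical filter: a single decode pass, then a literal match.
    return "<" in unquote(raw)

def backend_value(raw: str) -> str:
    # Models a stack that decodes at the server and again in app code.
    return unquote(unquote(raw))

wire = double_url_encode("<")
print(wire)                    # %253C
print(naive_waf_blocks(wire))  # False
print(backend_value(wire))     # <
```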

Hex Encoding

Hex encoding represents each byte as two hexadecimal characters. It's used extensively in exploit development, shellcode, and low-level analysis.

Formats you'll encounter

  • Plain hex: 48656c6c6f
  • Prefixed: \x48\x65\x6c\x6c\x6f (common in shellcode)
  • Spaced: 48 65 6c 6c 6f (hex dumps)
  • 0x prefix: 0x48 0x65 (programming)

# Encode
echo -n "Hello" | xxd -p

# Decode
echo "48656c6c6f" | xxd -r -p

# Python
"Hello".encode().hex()
bytes.fromhex("48656c6c6f").decode()
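
bytes.fromhex() only accepts plain or whitespace-separated hex, so the prefixed formats listed above need their prefixes stripped first. One possible normalizer is sketched below; parse_hex and to_shellcode are illustrative helper names, not standard functions.

```python
import re

def parse_hex(text: str) -> bytes:
    # Strip "\x" and "0x" prefixes, whitespace, and commas, then decode
    # the plain hex digits that remain.
    cleaned = re.sub(r"\\x|0x|[\s,]", "", text, flags=re.IGNORECASE)
    return bytes.fromhex(cleaned)

def to_shellcode(data: bytes) -> str:
    # Render bytes in the \xNN style commonly used for shellcode.
    return "".join(f"\\x{b:02x}" for b in data)

print(parse_hex("48656c6c6f"))                # b'Hello'
print(parse_hex(r"\x48\x65\x6c\x6c\x6f"))     # b'Hello'
print(parse_hex("48 65 6c 6c 6f"))            # b'Hello'
print(parse_hex("0x48 0x65"))                 # b'He'
print(to_shellcode(b"Hello"))                 # \x48\x65\x6c\x6c\x6f
```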

HTML Entity Encoding

HTML entities replace characters that have special meaning in HTML. This is both a defense mechanism (output encoding to prevent XSS) and an attack technique (encoding payloads to bypass filters).

Named vs Numeric entities

# Named entities
&lt;script&gt;  →  <script>
&amp;         →  &
&quot;        →  "

# Numeric (decimal)
&#60;&#115;&#99;&#114;&#105;&#112;&#116;&#62;  →  <script>

# Numeric (hex)
&#x3c;&#x73;&#x63;&#x72;&#x69;&#x70;&#x74;&#x3e;  →  <script>

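Encoding every character as a numeric entity can bypass XSS filters that look for literal strings like "script", provided the payload lands somewhere the browser decodes entities before using the value (for example, inside an attribute such as href or an event handler).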
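
Python's standard html module handles the named entities, and the numeric forms take one line each. In this sketch, to_decimal_entities and to_hex_entities are hypothetical helper names.

```python
import html

def to_decimal_entities(text: str) -> str:
    # Encode every character as &#NNN;
    return "".join(f"&#{ord(ch)};" for ch in text)

def to_hex_entities(text: str) -> str:
    # Encode every character as &#xHH;
    return "".join(f"&#x{ord(ch):x};" for ch in text)

print(html.escape("<script>"))
# &lt;script&gt;
print(to_decimal_entities("<script>"))
# &#60;&#115;&#99;&#114;&#105;&#112;&#116;&#62;
print(to_hex_entities("<script>"))
# &#x3c;&#x73;&#x63;&#x72;&#x69;&#x70;&#x74;&#x3e;
print(html.unescape(to_hex_entities("<script>")))
# <script>
```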

Unicode Escapes

Unicode escape sequences (\uXXXX) represent characters by their Unicode code point. These appear in JSON, JavaScript, Java, and other contexts.

# JavaScript
\u003cscript\u003e  →  <script>

# This can bypass filters that match on the raw input without decoding \u escapes first
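
A short Python sketch shows why a raw string match misses the escaped form. The helper names are illustrative, and from_json_escapes assumes the input contains no unescaped double quotes, since it simply wraps the text in a JSON string.

```python
import json

def to_unicode_escapes(text: str) -> str:
    # Emit one \uXXXX per UTF-16 code unit, so characters outside the
    # BMP come out as surrogate pairs, as JSON and JavaScript expect.
    data = text.encode("utf-16-be")
    return "".join(
        f"\\u{data[i]:02x}{data[i + 1]:02x}" for i in range(0, len(data), 2)
    )

def from_json_escapes(escaped: str) -> str:
    # Let the JSON parser resolve the escapes.
    return json.loads(f'"{escaped}"')

escaped = "\\u003cscript\\u003e"
print("<script" in escaped)                     # False
print("<script" in from_json_escapes(escaped))  # True
print(to_unicode_escapes("<"))                  # \u003c
```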

Chaining Encodings for WAF Bypass

The real power comes from combining encodings. Web application firewalls typically only decode one or two layers. By chaining encodings strategically, you can craft payloads that pass through the WAF but are decoded by the application.

Common chains

  • Base64 + URL encode — Encode your payload in Base64, then URL encode the result. The application URL-decodes first, then Base64-decodes.
  • Double URL encode — Two rounds of URL encoding. Bypasses WAFs that decode once before inspection.
  • HTML entity + URL encode — Useful when injecting into HTML attributes that are also URL-decoded.
  • Hex + Base64 — Obfuscate shellcode or payloads for delivery via text channels.
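
The first two chains can be modeled as a list of functions applied in order, keeping every intermediate result. This is only a sketch of the idea: apply_chain and the small wrappers are hypothetical names and do not reflect how the Multi-Tool is implemented.

```python
import base64
from urllib.parse import quote, unquote

def b64_encode(text: str) -> str:
    return base64.b64encode(text.encode("utf-8")).decode("ascii")

def b64_decode(text: str) -> str:
    return base64.b64decode(text).decode("utf-8")

def url_encode(text: str) -> str:
    return quote(text, safe="")

def apply_chain(text, steps):
    # Apply each step in order and return every intermediate result.
    results = [text]
    for step in steps:
        results.append(step(results[-1]))
    return results

# Sender side: Base64 first, then URL encode for transport.
print(apply_chain("payload", [b64_encode, url_encode]))
# ['payload', 'cGF5bG9hZA==', 'cGF5bG9hZA%3D%3D']

# Receiver side: undo the layers in reverse order.
print(apply_chain("cGF5bG9hZA%3D%3D", [unquote, b64_decode])[-1])
# payload

# Double URL encode.
print(apply_chain("<", [url_encode, url_encode]))
# ['<', '%3C', '%253C']
```

Decoding always runs in the reverse order of encoding, which is why the receiver URL-decodes before Base64-decoding.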

Our Encoding/Decoding Multi-Tool supports chaining any combination of these operations, showing intermediate results at each step.

ROT13

ROT13 rotates each letter by 13 positions in the alphabet. It's self-reversing (applying it twice returns the original). You'll encounter it in CTFs and occasionally in obfuscated malware.

Hello World  →  Uryyb Jbeyq
Uryyb Jbeyq  →  Hello World
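
In Python, ROT13 ships with the standard codecs module, so no custom code is needed. The rot13 wrapper below is just a convenience name.

```python
import codecs

def rot13(text: str) -> str:
    # Letters rotate by 13; digits, spaces, and punctuation are untouched.
    return codecs.encode(text, "rot_13")

print(rot13("Hello World"))         # Uryyb Jbeyq
print(rot13(rot13("Hello World")))  # Hello World
```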

Binary and Decimal

Binary (base-2) and decimal (base-10) representations of text are common in CTF challenges and steganography.

# "Hi" in binary
01001000 01101001

# "Hi" in decimal
72 105
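
Both representations are simple byte-level conversions. The sketch below assumes space-separated values and UTF-8 text; the function names are illustrative.

```python
def to_binary(text: str) -> str:
    # One 8-bit group per byte.
    return " ".join(f"{b:08b}" for b in text.encode("utf-8"))

def from_binary(bits: str) -> str:
    return bytes(int(chunk, 2) for chunk in bits.split()).decode("utf-8")

def to_decimal(text: str) -> str:
    # One decimal value (0-255) per byte.
    return " ".join(str(b) for b in text.encode("utf-8"))

def from_decimal(numbers: str) -> str:
    return bytes(int(n) for n in numbers.split()).decode("utf-8")

print(to_binary("Hi"))                   # 01001000 01101001
print(from_binary("01001000 01101001"))  # Hi
print(to_decimal("Hi"))                  # 72 105
print(from_decimal("72 105"))            # Hi
```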

Identifying Unknown Encodings

Tips for recognizing encoding formats at a glance:

  • Ends with = or == — Almost certainly Base64
  • Contains %XX — URL encoded
  • Only 0-9 and a-f — Hex encoded
  • Contains & and ; — HTML entities
  • Contains \u — Unicode escapes
  • Only 0 and 1 in groups of 8 — Binary
  • Looks like text but garbled — Try ROT13
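
These rules of thumb translate into a rough heuristic checker. The sketch below is illustrative only: guess_encodings is a hypothetical helper, the patterns are deliberately simple, and ambiguous inputs (for example, "deadbeef" is valid as both hex and Base64) return more than one guess.

```python
import re
from typing import List

def guess_encodings(sample: str) -> List[str]:
    s = sample.strip()
    guesses = []
    if re.fullmatch(r"(?:[01]{8}\s*)+", s):
        guesses.append("binary")
    if re.search(r"%[0-9A-Fa-f]{2}", s):
        guesses.append("url")
    if re.search(r"&(?:#\d+|#x[0-9A-Fa-f]+|[A-Za-z]+);", s):
        guesses.append("html-entity")
    if re.search(r"\\u[0-9A-Fa-f]{4}", s):
        guesses.append("unicode-escape")
    if re.fullmatch(r"(?:[0-9A-Fa-f]{2})+", s):
        guesses.append("hex")
    # Padded Base64 always has a length that is a multiple of 4.
    if re.fullmatch(r"[A-Za-z0-9+/]+={0,2}", s) and len(s) % 4 == 0:
        guesses.append("base64")
    return guesses

print(guess_encodings("cGF5bG9hZA=="))       # ['base64']
print(guess_encodings("48656c6c6f"))         # ['hex']
print(guess_encodings("%3Cscript%3E"))       # ['url']
print(guess_encodings("01001000 01101001"))  # ['binary']
```

An empty result does not mean the data is plain text; it is still worth trying ROT13 or a chained decode.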

CLI Version

Prefer working from the terminal? Install the CLI version via pip:

pip install offseckit-encode

Then encode and decode directly from your terminal:

encode -o base64-encode "Hello World"
encode -o base64-decode "SGVsbG8gV29ybGQ="
encode -o url-encode -o base64-encode "test payload"
echo "encoded data" | encode -o hex-decode

Chain operations with multiple -o flags and use --steps to see intermediate results:

encode -o base64-encode -o url-encode -o hex-encode "payload" --steps

Source code and full documentation are available on GitHub.

Quick Reference

Use our Encoding/Decoding Multi-Tool to instantly encode or decode in any of these formats. The chaining feature lets you combine operations, something most online encoders can't do. All processing happens in your browser, so it's safe for sensitive data.