Three Places Where Base64 and Base64URL Produce Different Output
Published: 2026-06-28
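When you pass a JWT segment to Python's base64.b64decode(), you'll often get a binascii.Error such as "Incorrect padding" or "Invalid base64-encoded string". The usual culprit is a missing = at the end. A - or _ character is a quieter problem: by default b64decode() discards characters outside the standard alphabet, so it can return wrong bytes instead of raising. "Just replace + with -" fixes one problem but leaves the others. Base64 and Base64URL differ in three specific places, and each one requires a separate fix.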
1. Character substitution: + and / become - and _
Of the 64 characters in the Base64 alphabet, only positions 62 and 63 change.
| Index | Base64 | Base64URL |
|---|---|---|
| 0–61 | A–Z, a–z, 0–9 | same |
| 62 | + | - |
| 63 | / | _ |
The output only differs when the input bytes contain 6-bit patterns that map to index 62 (111110) or 63 (111111). If neither pattern appears in the data, Base64 and Base64URL produce identical output.
import base64
data = b'\xfb\xef\xbe' # binary: 11111011 11101111 10111110
print(base64.b64encode(data)) # b'++++'
print(base64.urlsafe_b64encode(data)) # b'----'
The three bytes \xfb\xef\xbe split into four 6-bit chunks, each 111110 (index 62), which Base64 encodes as + and Base64URL encodes as -. The mapping is one-to-one, so decoding only needs the reverse substitution before the string is passed to a standard decoder.
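To see which inputs are affected, a short sketch can split the bytes into 6-bit indices and check for 62 or 63. The helper names sextets and variants_differ are made up for this illustration and are not part of the base64 module.

```python
import base64

def sextets(data: bytes) -> list[int]:
    """Split data into 6-bit indices, zero-filling the last chunk as Base64 does."""
    bits = ''.join(f'{byte:08b}' for byte in data)
    bits += '0' * (-len(bits) % 6)  # pad the final chunk up to 6 bits
    return [int(bits[i:i + 6], 2) for i in range(0, len(bits), 6)]

def variants_differ(data: bytes) -> bool:
    """True when some chunk lands on index 62 or 63, the only positions that differ."""
    return any(index >= 62 for index in sextets(data))

print(sextets(b'\xfb\xef\xbe'))          # [62, 62, 62, 62]
print(variants_differ(b'\xfb\xef\xbe'))  # True
print(variants_differ(b'hello'))         # False
```

For b'hello' every index is below 62, so both encoders return the same string.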
2. Padding: = present vs = absent
Base64 converts 3 bytes into 4 characters. When the input length isn't a multiple of 3, = padding is appended to reach the next multiple of 4.
1 byte input → 2 chars + "==" (2 padding chars)
2 bytes input → 3 chars + "=" (1 padding char)
3 bytes input → 4 chars (no padding)
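The three cases above can be checked directly. This is a small sketch; padding_chars is a hypothetical helper that computes the expected number of = characters from the input length.

```python
import base64

def padding_chars(n_bytes: int) -> int:
    """Number of '=' characters standard Base64 appends for n_bytes of input."""
    return (3 - n_bytes % 3) % 3

for raw in (b'a', b'ab', b'abc'):
    encoded = base64.b64encode(raw).decode('ascii')
    print(len(raw), encoded, padding_chars(len(raw)))
# 1 YQ== 2
# 2 YWI= 1
# 3 YWJj 0
```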
RFC 4648 Section 5 (Base64url) makes padding optional, with one condition: the receiver must already know the total data length. JWT treats its three segments (header, payload, signature) as satisfying this condition and omits = from all of them.
JWT segments are Base64URL without padding. Python's base64.urlsafe_b64decode() requires padding, so passing a raw JWT segment will fail whenever its length is not a multiple of 4.
import base64, json
segment = 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9'
# Re-add padding before decoding (this header is 36 characters long, so none is added here)
padding = (4 - len(segment) % 4) % 4
decoded = base64.urlsafe_b64decode(segment + '=' * padding)
print(json.loads(decoded)) # {'alg': 'HS256', 'typ': 'JWT'}
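The formula (4 - len(s) % 4) % 4 uses a second modulo so that a string whose length is already a multiple of 4 gets no padding instead of 4 characters: when len % 4 is 0, the result is (4 - 0) % 4 = 0. When it's 2, you get 2 padding characters. When it's 3, you get 1. A remainder of 1 cannot occur in valid Base64, since one leftover character carries only 6 bits, less than a full byte; the formula would add 3 characters and the decoder would still reject the input.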
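A round trip makes the padding step easier to see with a segment whose length is not a multiple of 4. This is a minimal sketch: encode_segment and decode_segment are hypothetical helpers, the payload is sample data, and no signature is created or verified.

```python
import base64
import json

def encode_segment(obj: dict) -> str:
    """Encode a dict the way a JWT header or payload is: Base64URL with '=' stripped."""
    raw = json.dumps(obj, separators=(',', ':')).encode('utf-8')
    return base64.urlsafe_b64encode(raw).rstrip(b'=').decode('ascii')

def decode_segment(segment: str) -> dict:
    """Restore the stripped padding, then decode and parse the JSON."""
    padding = (4 - len(segment) % 4) % 4
    return json.loads(base64.urlsafe_b64decode(segment + '=' * padding))

payload = {'sub': '1234567890', 'name': 'John Doe'}
segment = encode_segment(payload)

print(segment.endswith('='))               # False: the encoder stripped the padding
print(decode_segment(segment) == payload)  # True
```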
3. Embedding in URLs: = becomes %3D
When standard Base64 output is placed in a URL query string, +, / and = all have to be percent-encoded (as %2B, %2F and %3D) to survive intact.
# Standard Base64 dropped directly into a query string
?token=abc+def/ghi==
# After URL encoding (valid but verbose)
?token=abc%2Bdef%2Fghi%3D%3D
# Base64URL (URL-safe, no padding)
?token=abc-def_ghi
Some parsers interpret + as a space in query strings, and = as a key-value separator, which can silently corrupt the value before your code even sees it. The third form is why Base64URL exists.
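Python's own query-string parser shows the corruption. The sketch below uses urllib.parse from the standard library and reuses the illustrative token values from the example above (they are placeholders, not decodable Base64).

```python
from urllib.parse import parse_qs, urlencode

standard = 'abc+def/ghi=='
url_safe = 'abc-def_ghi'

# Dropped in unescaped, the '+' is read back as a space.
print(parse_qs('token=' + standard))   # {'token': ['abc def/ghi==']}

# Percent-encoding survives the round trip, at the cost of a longer URL.
query = urlencode({'token': standard})
print(query)                           # token=abc%2Bdef%2Fghi%3D%3D
print(parse_qs(query))                 # {'token': ['abc+def/ghi==']}

# Base64URL without padding needs no escaping at all.
print(urlencode({'token': url_safe}))  # token=abc-def_ghi
```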
Which format is used where
| Use case | Format | Padding |
|---|---|---|
| JWT (header, payload, signature) | Base64URL | none |
| OAuth 2.0 PKCE (code_challenge) | Base64URL | none |
| Data URIs (data:image/png;base64,...) | Standard Base64 | yes |
| MIME (email attachments) | Standard Base64 | yes |
| Embedding in URL query parameters | Base64URL | none recommended |
| Using as a filename | Base64URL | none recommended |
When decoding, identify the format first. The presence of - or _ indicates Base64URL; + or / indicates standard Base64. The presence or absence of = at the end is not a reliable indicator — it depends on whether the encoder chose to include padding, not on which variant was used.
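That rule can be written as a small classifier. This is a sketch only: detect_variant and its return labels are invented for illustration, and a string containing none of the four distinguishing characters is valid in both variants.

```python
def detect_variant(s: str) -> str:
    """Classify a string by which variant-specific characters it contains."""
    has_url = any(c in '-_' for c in s)
    has_std = any(c in '+/' for c in s)
    if has_url and has_std:
        return 'mixed'      # probably corrupted or not Base64 at all
    if has_url:
        return 'base64url'
    if has_std:
        return 'base64'
    return 'either'         # identical in both variants; '=' is deliberately ignored

print(detect_variant('----'))      # base64url
print(detect_variant('++++'))      # base64
print(detect_variant('aGVsbG8='))  # either
```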
Decoding in practice
When the format is uncertain, apply all three fixes in order:
import base64
def decode_base64_any(s: str) -> bytes:
    # 1. Convert Base64URL characters to standard Base64
    s = s.replace('-', '+').replace('_', '/')
    # 2. Restore padding
    s += '=' * ((4 - len(s) % 4) % 4)
    # 3. Decode
    return base64.b64decode(s)
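If the input comes from an untrusted source, a stricter variant may be preferable, because base64.b64decode() ignores characters outside the alphabet unless validate=True is passed. The sketch below is one possible approach, not a standard API: decode_base64_strict is a hypothetical name, and rejecting mixed alphabets is a design choice made for this example.

```python
import base64
import binascii

def decode_base64_strict(s: str) -> bytes:
    """Decode Base64 or Base64URL, raising ValueError on anything malformed."""
    if ('-' in s or '_' in s) and ('+' in s or '/' in s):
        raise ValueError('mixed Base64 and Base64URL alphabets')
    # Normalize to the standard alphabet and drop any existing padding
    s = s.replace('-', '+').replace('_', '/').rstrip('=')
    if len(s) % 4 == 1:
        raise ValueError('length cannot be 1 more than a multiple of 4')
    s += '=' * ((4 - len(s) % 4) % 4)
    try:
        # validate=True rejects characters outside the alphabet instead of skipping them
        return base64.b64decode(s, validate=True)
    except binascii.Error as exc:
        raise ValueError(f'not valid Base64: {exc}') from exc

print(decode_base64_strict('----'))  # b'\xfb\xef\xbe'
print(decode_base64_strict('YQ'))    # b'a'
```

Stripping and re-adding the padding lets the same function accept both padded and unpadded input.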
In JavaScript, atob() only accepts standard Base64 with +, /, and =, so the same substitution is needed before passing Base64URL input.
function decodeBase64Any(s) {
  s = s.replace(/-/g, '+').replace(/_/g, '/');
  while (s.length % 4 !== 0) s += '=';
  return atob(s);
}
Node.js added a 'base64url' encoding option in v15.7.0 (backported to v14.18.0), so Buffer.from(s, 'base64url') handles the full variant natively, with no character substitution or padding needed. That option is Node.js-specific, though, so the explicit normalization approach is more portable across runtimes and languages.