Path Traversal — How Attackers Read Files from Your Server
Path traversal (directory traversal) is one of the simpler web attacks, yet it still shows up in rankings of the most dangerous weaknesses, such as the CWE Top 25 (as CWE-22). It lets an attacker read arbitrary files from the server: configuration, SSH keys, passwords, application code. In some cases, it also allows writing.
How It Works
Imagine an application that displays images:
https://example.com/loadImage?filename=photo.jpg
The server takes the filename parameter and appends it to a base path:
/var/www/images/ + photo.jpg → /var/www/images/photo.jpg
What if instead of photo.jpg we supply ../../../etc/passwd?
/var/www/images/../../../etc/passwd → /etc/passwd
The ../ sequence moves up one directory. Three such sequences from /var/www/images/ reach the filesystem root. The server returns the contents of /etc/passwd.
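A minimal sketch of the vulnerable pattern described above. The helper names build_path and effective_path are hypothetical, and posixpath.normpath only approximates (lexically, ignoring symlinks) what the operating system does when it opens the file:

```python
import posixpath

BASE_DIR = "/var/www/images/"  # base path from the example above


def build_path(filename):
    """Vulnerable on purpose: appends user input to the base path unchecked."""
    return BASE_DIR + filename


def effective_path(path):
    """Collapse ../ segments lexically, approximating what the OS resolves."""
    return posixpath.normpath(path)


print(effective_path(build_path("photo.jpg")))
print(effective_path(build_path("../../../etc/passwd")))
```

The second call ends up outside the images directory even though the string handed to the file API still begins with /var/www/images/.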
What Do Attackers Want to Read?
On Linux, typical targets:
/etc/passwd # user list
/etc/shadow # password hashes (if server runs as root)
/etc/hosts # static hostname-to-IP mappings
/proc/self/environ # environment variables (may contain API keys)
/home/user/.ssh/id_rsa # private SSH key
/var/log/auth.log # authentication logs
/home/user/.bash_history # command history
In web applications:
/var/www/html/.env # database passwords, API keys
/var/www/html/config/database.yml # Rails configuration
/etc/nginx/nginx.conf # server configuration
Bypass Techniques
A naive defense is filtering ../ from the string. Attackers have several workarounds.
Nested Sequences
If the server strips ../ once, nesting works:
....//....//....//etc/passwd
After the inner ../ is removed, what remains is ../../../etc/passwd.
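The effect can be reproduced with a single-pass filter. This is a sketch, with strip_once standing in for the hypothetical server-side filter; strip_until_stable shows why repeating the pass changes the outcome:

```python
def strip_once(value):
    """Naive filter: one pass that removes every '../' it finds."""
    return value.replace("../", "")


def strip_until_stable(value):
    """Repeat the pass until the value stops changing."""
    previous = None
    while previous != value:
        previous = value
        value = value.replace("../", "")
    return value


payload = "....//....//....//etc/passwd"
print(strip_once(payload))
print(strip_until_stable(payload))
```

Even the repeated version only sanitizes the input. Rejecting any request that contains a traversal sequence is usually the simpler choice.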
URL Encoding
%2e%2e%2f → ../
%2e%2e/ → ../
..%2f → ../
Double Encoding
If the server decodes the URL twice:
%252e%252e%252f → %2e%2e%2f → ../
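Both encoding tricks rely on the filter looking at the value before the final decoding step. A small sketch using urllib.parse.unquote from the standard library (naive_filter_allows and decode_fully are hypothetical helpers, not part of any framework):

```python
from urllib.parse import unquote


def naive_filter_allows(raw_value):
    """Hypothetical filter that inspects the raw value before any decoding."""
    return "../" not in raw_value


def decode_fully(value, max_rounds=5):
    """Percent-decode repeatedly until the value stops changing."""
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            break
        value = decoded
    return value


print(naive_filter_allows("%2e%2e%2f"))   # the filter sees no '../'
print(unquote("%2e%2e%2f"))               # one decode reveals it
print(unquote("%252e%252e%252f"))         # one decode is not enough here
print(decode_fully("%252e%252e%252f"))    # decoding until stable reveals it
```

Any check has to run on the same, fully decoded value that eventually reaches the filesystem call.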
Non-standard Encoding
..%c0%af → ../ (UTF-8 overlong encoding, accepted only by lenient decoders)
..%ef%bc%8f → ../ (fullwidth slash, if the server normalizes Unicode)
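Whether these payloads work depends on how the server decodes and normalizes text. As an illustration only, here is how Python's standard library treats the two byte sequences (normalize_nfkc and strict_decode_accepts are hypothetical helper names):

```python
import unicodedata

# The bytes EF BC 8F decode to U+FF0F, the fullwidth solidus.
fullwidth_slash = bytes.fromhex("efbc8f").decode("utf-8")


def normalize_nfkc(value):
    """Apply Unicode compatibility normalization (NFKC)."""
    return unicodedata.normalize("NFKC", value)


def strict_decode_accepts(raw_bytes):
    """Return True if a strict UTF-8 decoder accepts the bytes."""
    try:
        raw_bytes.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(normalize_nfkc(".." + fullwidth_slash))  # normalization yields an ASCII slash
print(strict_decode_accepts(b"..\xc0\xaf"))    # the overlong form is rejected
```

A filter that runs before normalization would miss the fullwidth slash, so validation belongs after every decoding and normalization step.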
Null Byte
If the server enforces a file extension (e.g., .jpg):
../../../etc/passwd%00.jpg
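The null byte (%00) truncates the string: the extension check sees .jpg at the end, but the underlying C file API stops reading at the null byte and opens /etc/passwd. This trick mainly works on older PHP versions (before 5.3.4) and on other code that passes unchecked strings to C library functions.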
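The mismatch between the two views of the string can be simulated. This is a sketch: as_c_string only imitates a C API that stops at the first null byte, and extension_check_passes is a hypothetical check.

```python
def extension_check_passes(filename):
    """Hypothetical check run on the full string, null byte included."""
    return filename.endswith(".jpg")


def as_c_string(filename):
    """Simulate a C API that stops reading at the first null byte."""
    return filename.split("\x00", 1)[0]


def contains_null_byte(filename):
    """A simple guard: reject the request outright."""
    return "\x00" in filename


payload = "../../../etc/passwd\x00.jpg"  # what %00 becomes after URL decoding
print(extension_check_passes(payload))
print(as_c_string(payload))
print(contains_null_byte(payload))
```

The check and the file API disagree about where the name ends, and that gap is what the payload exploits.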
Absolute Path
Sometimes ../ isn’t needed — a full path is enough:
filename=/etc/passwd
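In Python this happens silently, because the path join functions discard the base when the second argument is absolute. A short sketch (join_unchecked is a hypothetical helper; posixpath is used so the result is the same on any platform):

```python
import posixpath

BASE_DIR = "/var/www/images"


def join_unchecked(filename):
    """Join without validation; an absolute filename replaces BASE_DIR."""
    return posixpath.join(BASE_DIR, filename)


def is_absolute(filename):
    """Guard that could be applied before joining."""
    return posixpath.isabs(filename)


print(join_unchecked("photo.jpg"))
print(join_unchecked("/etc/passwd"))
print(is_absolute("/etc/passwd"))
```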
Starting from the Required Directory
If the server checks that the path starts with /var/www/images:
filename=/var/www/images/../../../etc/passwd
Validation passes because the path begins with the required directory.
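A sketch of why the prefix check fails, and of the same check applied after normalization. Both function names are hypothetical, and normpath is purely lexical, so it does not account for symlinks:

```python
import posixpath

REQUIRED_PREFIX = "/var/www/images"


def prefix_check_raw(path):
    """The flawed check: compares the string as supplied."""
    return path.startswith(REQUIRED_PREFIX)


def prefix_check_normalized(path):
    """Collapse ../ first, then compare whole path components."""
    if not posixpath.isabs(path):
        return False
    normalized = posixpath.normpath(path)
    return posixpath.commonpath([REQUIRED_PREFIX, normalized]) == REQUIRED_PREFIX


payload = "/var/www/images/../../../etc/passwd"
print(prefix_check_raw(payload))
print(prefix_check_normalized(payload))
```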
How to Defend
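Four layers of defense, best used together. The first two belong in the application code and are both required; the last two limit the damage if they fail.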
1. Input Validation
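Best approach: whitelist allowed filenames. If that is not possible, verify that the input contains only letters, digits, dots, underscores, and hyphens, and that it starts with a letter or digit:
import re

def is_safe_filename(filename):
    # Must start with a letter or digit, so "." and ".." are rejected.
    # fullmatch is used because '$' would also accept a trailing newline.
    return bool(re.fullmatch(r'[a-zA-Z0-9][a-zA-Z0-9._-]*', filename))
This rejects /, %, and null bytes, so ../, %2e, and %00 payloads all fail.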
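The whitelist approach can be sketched as a lookup table, so that user input only ever selects an entry and never becomes part of a path. The ids and filenames below are made up for illustration:

```python
# Hypothetical catalogue; in practice it might come from a database.
ALLOWED_IMAGES = {
    "logo": "logo.png",
    "banner": "banner.jpg",
}


def filename_for(image_id):
    """Return the stored filename for a known id, or None for anything else."""
    return ALLOWED_IMAGES.get(image_id)


print(filename_for("logo"))
print(filename_for("../../../etc/passwd"))
```

With this design there is nothing to sanitize: an unknown id produces no filename at all.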
2. Path Canonicalization
Even with input validation — check the resolved path:
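import os

BASE_DIR = os.path.realpath('/var/www/images')

def safe_path(filename):
    full_path = os.path.realpath(os.path.join(BASE_DIR, filename))
    # Compare whole path components: a plain startswith() check
    # would also accept a sibling such as /var/www/images_backup.
    if os.path.commonpath([BASE_DIR, full_path]) != BASE_DIR:
        raise ValueError('Path traversal detected')
    return full_path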
os.path.realpath() resolves symlinks and ../, so even if an attacker bypasses input validation, canonicalization catches it.
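The same containment check can also be written with pathlib. This is a sketch rather than a drop-in replacement; resolve_inside is a hypothetical name, and the base directory is passed in so the function can be tried against a temporary directory:

```python
from pathlib import Path


def resolve_inside(base_dir, filename):
    """Resolve filename under base_dir, or raise if it escapes."""
    base = Path(base_dir).resolve()
    candidate = (base / filename).resolve()  # resolves symlinks and ../
    try:
        # relative_to() raises ValueError when candidate is not under base.
        candidate.relative_to(base)
    except ValueError:
        raise ValueError("Path traversal detected") from None
    return candidate
```

For example, resolve_inside('/var/www/images', 'photo.jpg') returns the resolved path, while '../../../etc/passwd' or an absolute '/etc/passwd' raises ValueError.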
3. Minimal Privileges
The web server should not run as root. If it runs as www-data, even a successful path traversal won’t read /etc/shadow.
# Check who the server runs as
ps aux | grep nginx
ps aux | grep apache
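An application can also check this itself at startup. A minimal sketch, assuming a POSIX system (os.geteuid is not available on Windows); is_root and refuse_root are hypothetical helpers:

```python
import os


def is_root(euid=None):
    """True when the effective user id is 0. Pass euid explicitly for testing."""
    if euid is None:
        euid = os.geteuid()  # POSIX only
    return euid == 0


def refuse_root(euid=None):
    """Abort startup instead of serving files with root privileges."""
    if is_root(euid):
        raise RuntimeError("Refusing to serve files as root")
```

Calling refuse_root() before the server starts listening turns a misconfiguration into an immediate, visible failure.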
4. chroot / Containers
In production environments — use chroot, Docker, or namespaces. An attacker can supply ../../../../, but if the root filesystem is isolated, they won’t escape beyond the container.
Testing
Quick test on your own application:
# Basic test
curl "https://your-site.com/api/file?name=../../../etc/passwd"
# With encoding
curl "https://your-site.com/api/file?name=%2e%2e%2f%2e%2e%2f%2e%2e%2fetc%2fpasswd"
# Null byte
curl "https://your-site.com/api/file?name=../../../etc/passwd%00.jpg"
If any of these return the contents of /etc/passwd — you have a problem.
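The payloads above can be generated rather than typed by hand. This sketch only builds the strings and inspects a response body; it sends no requests. The helper names are hypothetical and the detection rule is a rough heuristic:

```python
def traversal_payloads(target="etc/passwd", depth=3, suffix=".jpg"):
    """Build the three test payloads used in the curl commands above."""
    plain = "../" * depth + target
    return {
        "basic": plain,
        "encoded": plain.replace(".", "%2e").replace("/", "%2f"),
        "null_byte": plain + "%00" + suffix,
    }


def looks_like_passwd(body):
    """Heuristic: /etc/passwd normally has a root entry with uid and gid 0."""
    return "root:" in body and ":0:0:" in body


for name, payload in traversal_payloads().items():
    print(name, payload)
```

Each payload can be passed as the name parameter with any HTTP client, and the response body fed to looks_like_passwd. Only run this against applications you own or are authorized to test.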