Path traversal (directory traversal) is one of the simpler web attacks, yet it still shows up in top-10 vulnerability lists. It lets an attacker read any file the server process can access: configuration, SSH keys, passwords, application code. In some cases it also allows writing files.

How It Works

Imagine an application that displays images:

https://example.com/loadImage?filename=photo.jpg

The server takes the filename parameter and appends it to a base path:

/var/www/images/ + photo.jpg → /var/www/images/photo.jpg

What if instead of photo.jpg we supply ../../../etc/passwd?

/var/www/images/../../../etc/passwd → /etc/passwd

The ../ sequence moves up one directory. Three such sequences from /var/www/images/ reach the filesystem root. The server returns the contents of /etc/passwd.
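
The resolution step can be reproduced in a few lines of Python. This is a minimal sketch, not the code of any particular server: build_path is a hypothetical helper that mimics the naive concatenation described above, and posixpath is used so the result is the same on any operating system.

```python
import posixpath

BASE_DIR = "/var/www/images/"

def build_path(filename: str) -> str:
    """Hypothetical vulnerable helper: concatenate, then resolve the result."""
    raw = BASE_DIR + filename
    # normpath collapses ../ segments the way the filesystem would
    # (ignoring symlinks), which makes the final target visible.
    return posixpath.normpath(raw)

print(build_path("photo.jpg"))            # /var/www/images/photo.jpg
print(build_path("../../../etc/passwd"))  # /etc/passwd
```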

What Do Attackers Want to Read?

On Linux, typical targets:

/etc/passwd              # user list
/etc/shadow              # password hashes (if server runs as root)
/etc/hosts               # static hostname-to-IP mappings
/proc/self/environ       # environment variables (may contain API keys)
/home/user/.ssh/id_rsa   # private SSH key
/var/log/auth.log        # authentication logs
~/.bash_history          # command history

For web applications:

/var/www/html/.env             # database passwords, API keys
/var/www/html/config/database.yml  # Rails configuration
/etc/nginx/nginx.conf          # server configuration

Bypass Techniques

A naive defense is filtering ../ from the string. Attackers have several workarounds.

Nested Sequences

If the server strips ../ once, nesting works:

....//....//....//etc/passwd

After the inner ../ is removed, what remains is ../../../etc/passwd.
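
A short sketch of why a single-pass filter fails. Both functions are hypothetical filters written for illustration; neither is a recommended defense.

```python
def strip_once(value: str) -> str:
    """Hypothetical naive filter: remove every '../' in a single pass."""
    return value.replace("../", "")

def strip_until_stable(value: str) -> str:
    """Repeat the removal until the string stops changing."""
    previous = None
    while previous != value:
        previous = value
        value = value.replace("../", "")
    return value

payload = "....//....//....//etc/passwd"
print(strip_once(payload))          # ../../../etc/passwd
print(strip_until_stable(payload))  # etc/passwd
```

Looping until the string is stable defeats the nesting trick, but it still does nothing against the encoded or absolute-path variants described below.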

URL Encoding

%2e%2e%2f                    →  ../
%2e%2e/                      →  ../
..%2f                        →  ../

Double Encoding

If the server decodes the URL twice:

%252e%252e%252f  →  %2e%2e%2f  →  ../
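
The single and double encodings can be traced with Python's urllib.parse.unquote. In this sketch contains_traversal is a hypothetical filter that only looks for a literal ../; the point is that the check gives a different answer depending on how many decoding passes ran before it.

```python
from urllib.parse import unquote

def contains_traversal(value: str) -> bool:
    """Hypothetical filter: look for a literal '../' in the string."""
    return "../" in value

single = "%2e%2e%2fetc%2fpasswd"
double = "%252e%252e%252fetc%252fpasswd"

# A filter that runs before decoding sees nothing suspicious.
print(contains_traversal(single))           # False
print(contains_traversal(unquote(single)))  # True

# Double encoding survives one decoding pass and unfolds on the second.
once = unquote(double)
twice = unquote(once)
print(once)   # %2e%2e%2fetc%2fpasswd
print(twice)  # ../etc/passwd
```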

Non-standard Encoding

..%c0%af         →  ../  (UTF-8 overlong encoding, accepted only by lenient decoders)
..%ef%bc%8f      →  ../  (fullwidth slash U+FF0F, if the server normalizes Unicode)
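
Whether these two payloads work depends on how the server decodes and normalizes text. The sketch below shows Python's behavior only, as one data point: its strict UTF-8 decoder rejects the overlong form, while NFKC normalization folds the fullwidth slash into an ordinary one. strict_utf8 is a hypothetical helper name.

```python
import unicodedata
from urllib.parse import unquote, unquote_to_bytes

def strict_utf8(raw: str):
    """Percent-decode to bytes, then decode as strict UTF-8 (None if invalid)."""
    try:
        return unquote_to_bytes(raw).decode("utf-8")
    except UnicodeDecodeError:
        return None

# 0xC0 0xAF is an overlong encoding of '/'; a strict decoder refuses it.
overlong = strict_utf8("..%c0%af")

# U+FF0F is a different character from '/', until something normalizes it.
fullwidth = unquote("..%ef%bc%8f")
normalized = unicodedata.normalize("NFKC", fullwidth)

print(overlong)           # None
print("/" in fullwidth)   # False
print(normalized)         # ../
```

If the application normalizes Unicode at all, the traversal check has to run after normalization, not before it.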

Null Byte

If the server enforces a file extension (e.g., .jpg):

../../../etc/passwd%00.jpg

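The null byte (%00) truncates the string at the C level: the extension check sees .jpg, but the file API opens /etc/passwd. This trick mainly works on older PHP versions (it was fixed in PHP 5.3.4) and on other code that passes the string unchanged to C file functions.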
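
The mismatch between the two views of the string can be simulated in Python. Both helpers are hypothetical: passes_extension_check stands in for the server's validation, and as_c_string mimics a C API that stops reading at the first null byte.

```python
from urllib.parse import unquote

def passes_extension_check(filename: str) -> bool:
    """Hypothetical validation: the name must end in .jpg."""
    return filename.endswith(".jpg")

def as_c_string(filename: str) -> str:
    """Mimic a C file API: everything after the first NUL is ignored."""
    return filename.split("\x00", 1)[0]

decoded = unquote("../../../etc/passwd%00.jpg")

print(passes_extension_check(decoded))  # True
print(as_c_string(decoded))             # ../../../etc/passwd
```

Python itself is not affected in this way: open() raises ValueError when the path contains an embedded null byte.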

Absolute Path

Sometimes ../ isn’t needed — a full path is enough:

filename=/etc/passwd
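
One common reason this works: path-joining helpers in several languages discard the base directory when the second argument is absolute. Python's os.path.join behaves this way, as the sketch below shows (posixpath is the POSIX flavor of os.path; join_with_base is a hypothetical helper).

```python
import posixpath

BASE_DIR = "/var/www/images"

def join_with_base(filename: str) -> str:
    """Hypothetical helper that trusts join() to keep the result under BASE_DIR."""
    return posixpath.join(BASE_DIR, filename)

print(join_with_base("photo.jpg"))    # /var/www/images/photo.jpg
print(join_with_base("/etc/passwd"))  # /etc/passwd
```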

Starting from the Required Directory

If the server checks that the path starts with /var/www/images:

filename=/var/www/images/../../../etc/passwd

Validation passes because the path begins with the required directory.
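
The difference between checking the raw string and checking the resolved path, as a minimal sketch. Both function names are hypothetical, and normpath is used here only to collapse ../ segments; it does not resolve symlinks.

```python
import posixpath

REQUIRED_PREFIX = "/var/www/images"

def naive_prefix_check(path: str) -> bool:
    """Check the raw input string, before any ../ is resolved."""
    return path.startswith(REQUIRED_PREFIX)

def prefix_check_after_normalizing(path: str) -> bool:
    """Collapse ../ first, then require the directory itself or a child of it."""
    resolved = posixpath.normpath(path)
    return resolved == REQUIRED_PREFIX or resolved.startswith(REQUIRED_PREFIX + "/")

payload = "/var/www/images/../../../etc/passwd"
print(naive_prefix_check(payload))              # True
print(prefix_check_after_normalizing(payload))  # False
```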

How to Defend

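Defense works in layers. The first two belong in the application code and are both needed at the same time; the last two limit the damage if they fail.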

1. Input Validation

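Best approach: an allowlist of permitted filenames. If that is not possible, verify that the input contains only letters, digits, dots, underscores, and hyphens, and is not . or ..:

import re

def is_safe_filename(filename):
    # '.' and '..' consist only of allowed characters, so reject them explicitly.
    if filename in ('.', '..'):
        return False
    # fullmatch, unlike match with '$', does not accept a trailing newline.
    return re.fullmatch(r'[a-zA-Z0-9._-]+', filename) is not None

Slashes, backslashes, percent signs, and null bytes are all outside the allowed set, so ../, %2e, and %00 payloads are rejected.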
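
The allowlist (whitelist) approach mentioned at the start of this section can be sketched as follows. The set contents and the resolve_allowed name are made up for illustration; in a real application the set might come from a database or a directory listing taken at startup.

```python
import posixpath

BASE_DIR = "/var/www/images"

# Hypothetical allowlist: the only names the endpoint will ever serve.
ALLOWED_FILES = frozenset({"photo.jpg", "logo.png", "banner.png"})

def resolve_allowed(filename: str) -> str:
    """Return the full path for an allowed name, or raise ValueError."""
    if filename not in ALLOWED_FILES:
        raise ValueError(f"File not allowed: {filename!r}")
    return posixpath.join(BASE_DIR, filename)

print(resolve_allowed("photo.jpg"))  # /var/www/images/photo.jpg
```

Because the user input is only ever compared against known names, none of the bypass techniques above has anything to work with.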

2. Path Canonicalization

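Even with input validation, check the resolved path:

import os

BASE_DIR = os.path.realpath('/var/www/images')

def safe_path(filename):
    full_path = os.path.realpath(os.path.join(BASE_DIR, filename))
    # Compare whole path components: a plain startswith() check
    # would also accept a sibling such as /var/www/images_backup.
    if os.path.commonpath([BASE_DIR, full_path]) != BASE_DIR:
        raise ValueError('Path traversal detected')
    return full_path

os.path.realpath() resolves symlinks and ../, so even if an attacker bypasses input validation, canonicalization catches it. The comparison uses os.path.commonpath() rather than a string prefix so that sibling directories sharing the same prefix are not accepted.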
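
To see canonicalization handle both ../ and symlinks, the sketch below builds a throwaway directory tree and tries several inputs. It assumes a POSIX system where os.symlink is available. resolve_inside, is_allowed, and demo are hypothetical names; resolve_inside is a variant of the check that takes the base directory as a parameter.

```python
import os
import tempfile

def resolve_inside(base_dir: str, filename: str) -> str:
    """Return the canonical path for filename, or raise if it leaves base_dir."""
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, filename))
    # Component-wise comparison of the two canonical paths.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("Path traversal detected")
    return candidate

def is_allowed(base_dir: str, filename: str) -> bool:
    try:
        resolve_inside(base_dir, filename)
        return True
    except ValueError:
        return False

def demo() -> dict:
    with tempfile.TemporaryDirectory() as root:
        base = os.path.join(root, "images")
        secret_dir = os.path.join(root, "secret")
        os.makedirs(base)
        os.makedirs(secret_dir)
        # A symlink inside the base directory that points outside of it.
        os.symlink(secret_dir, os.path.join(base, "shortcut"))
        names = ["photo.jpg", "../secret/key", "shortcut/key", "/etc/passwd"]
        return {name: is_allowed(base, name) for name in names}

print(demo())
```

Only photo.jpg is accepted. The symlink case is the one a purely string-based check cannot catch, since shortcut/key contains no suspicious characters at all.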

3. Minimal Privileges

The web server should not run as root. If it runs as www-data, even a successful path traversal won’t read /etc/shadow.

# Check who the server runs as (for nginx, look at the worker processes; the master is usually root)
ps aux | grep nginx
ps aux | grep apache

4. chroot / Containers

In production, use chroot, containers (e.g. Docker), or namespaces. An attacker can still supply ../../../../, but with an isolated root filesystem the traversal stops at the container's root. Files inside the container, including its own secrets, remain readable.

Testing

Quick test on your own application:

# Basic test
curl "https://your-site.com/api/file?name=../../../etc/passwd"

# With encoding
curl "https://your-site.com/api/file?name=%2e%2e%2f%2e%2e%2f%2e%2e%2fetc%2fpasswd"

# Null byte
curl "https://your-site.com/api/file?name=../../../etc/passwd%00.jpg"

If any of these return the contents of /etc/passwd — you have a problem.