RTFM, and Python's os.path.join

I was recently reviewing an application written in Python for possible security vulnerabilities. I came across a file upload feature where the user had control over the filename. It looked like I might be able to overwrite arbitrary files on the system; however, the filename was sanitized to prevent directory traversal, and there was no obvious way around the mitigation.
I started by reading the documentation for each method and function that was used—a lesson that I learnt the hard way early in my career.

This was the code responsible for creating the upload path.

file_path = os.path.join("uploads", "static", file_name)

I had used os.path.join many times before, but I decided to read the manual again. I found something very interesting:

[Screenshot of Python's os.path.join documentation]

If a component is an absolute path, all previous components are thrown away and joining continues from the absolute path component.

So I fired up a Python interactive shell and tried a few things to clarify the above paragraph.
The above code outputs uploads/static/test.png for file_name="test.png".

[Screenshot: result of using test.png as file_name]

This is the expected behaviour and is probably what the developer wanted, but what happens if we pass an absolute path for the file_name e.g. /tmp/test.png?

[Screenshot: result of using an absolute path, e.g. /tmp/test.png, as file_name]

Boom. The result, unexpected as it may seem at first, is /tmp/test.png.
As the documentation has already said, if you pass an absolute path to os.path.join, it throws away every component that came before it and continues joining from the absolute path. Since file_name is the last component here, the absolute path becomes the final return value.
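To make the behaviour concrete, here is a small sketch of what I tried in the shell. It uses posixpath (the POSIX implementation behind os.path on Linux and macOS) so that the output is the same regardless of the platform it runs on; the third call is my own extra example, not something from the application.

```python
import posixpath

# Relative file name: every component is kept and joined with "/".
relative = posixpath.join("uploads", "static", "test.png")

# Absolute file name: "uploads" and "static" are thrown away.
absolute = posixpath.join("uploads", "static", "/tmp/test.png")

# An absolute component in the middle: joining continues from it.
continued = posixpath.join("uploads", "/tmp", "test.png")

print(relative)   # uploads/static/test.png
print(absolute)   # /tmp/test.png
print(continued)  # /tmp/test.png
```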
That was all I needed to create a POC and get a shell.
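For what it's worth, one way a developer might guard against this is to resolve the joined path and check that it still lives under the upload directory. The following is only a minimal sketch under my own assumptions: safe_upload_path is a hypothetical helper name, not something from the application I reviewed, and a real fix would depend on how the application handles uploads.

```python
import os


def safe_upload_path(base_dir, file_name):
    """Hypothetical helper: join file_name onto base_dir and refuse any escape."""
    # Resolve both paths so that "..", symlinks and absolute names are normalised.
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, file_name))

    # If the longest common sub-path is not the base directory itself,
    # the candidate points somewhere outside of it.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"refusing path outside upload directory: {file_name!r}")
    return candidate
```

With this check, a plain name such as test.png is accepted, while /tmp/test.png or ../../etc/passwd raises ValueError instead of silently escaping the upload directory.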

The main point of the story is that it’s important to read the manual for everything, even if you think you already know about it. There are two reasons for this.

One reason is that things change: what you think you know might be different in newer versions, or there could be new features or new defaults that make a significant difference. For example, the screenshot above says “Changed in version 3.6: Accepts a path-like object for path and paths.” While that is not exactly related to the current subject, this type of change could be a game changer in a different situation.

The second reason is that oftentimes when you read about something you thought you already knew, you realize that there were subtle details you were unaware of. This can be due to changes between versions or because you simply never had to deal with those specific details before. Either way, it’s important to always read the manual to be sure.