File system paths on Windows are stranger than you might think. On any Unix-derived system, a path is an admirably simple thing: if it starts with a /, it’s a path. Not so on Windows, which serves up a bewildering variety of schemes for composing a path.

When I implemented the path autocompletion feature in Fileside 1.7, I needed to take a closer look at this to make sure I had all bases covered. This blog post shares my findings.

At a glance

Absolute path formats

Relative path formats

Disallowed characters

Characters          Validity
< > " / \ | ? *     Never allowed
.                   Disallowed as final character
:                   Disallowed except with data streams

Length limits

Windows path schemes

There are three different kinds of absolute path, and three different kinds of relative path on Windows.

Absolute paths

Absolute, or fully qualified, paths are complete paths that, on their own, uniquely identify a location in the file system.

Drive paths

Drive paths are the good old-fashioned paths we all know and love, consisting of a drive letter and a sequence of directories.

D:\Doughnut preferences\With jam in

UNC paths

UNC stands for Universal Naming Convention and describes paths that start with \\, commonly used to refer to network drives. The first segment after the \\ is the host, which can be either a named server or an IP address, like so:

\\Work\Hard
\\192.168.1.15\Hard

UNC paths can also be used to access local drives in a similar way:

\\localhost\C$\Users\Andrew Fletcher
\\127.0.0.1\C$\Users\Alan Wilder

Or by using the computer name:

\\Pipeline\C$\Users\Martin Gore

UNC paths have a peculiar way of indicating the drive letter: we must use $ instead of :. Furthermore, accessing drives in this way will only work if you’re logged in as an administrator.

Note that \\Pipeline is not a valid directory path in itself, as it only identifies a server. The name of a share must be appended to arrive at an actual folder.

Device paths

A device path starts with either of these two beauties:

  • \\?\
  • \\.\

They can be used to address physical devices (disks, displays, printers etc) in addition to files and folders. Not something you’ll ever use in day-to-day file management, but useful to know about if you come across one in the wild.

The syntax for accessing a local folder using a device path is either of:

\\?\Z:\Animals\Cute
\\.\Z:\Animals\Cunning

If you really want to get obscure for the sake of it, you can also substitute an equivalent device identifier for Z::

\\?\Volume{59e01a55-88c5-411e-bf0a-92820bdb2549}\Animals\Cryptic

Here, Volume{59e01a55-88c5-411e-bf0a-92820bdb2549} happens to be the identifier for the disk volume on which Z: resides on the computer on which I’m writing this.

There’s also a special syntax for representing UNC paths as device paths:

\\?\UNC\localhost\Z$\Animals\Curious

With device paths, the bit that comes after the \\?\ or \\.\ is a name defined in Windows’s internal Object Manager namespace. For those curious to explore what’s available in this namespace, you can download the WinObj tool and take a look.

Normalised and literal device paths

So what’s the difference between \\?\ and \\.\, you ask.

Normally, when you pass a path to Windows, it starts off by cleaning it up before using it. This process is referred to as normalisation. But more about that later.

A \\?\ path skips this cleaning step, while a \\.\ path doesn’t. Hence, we can refer to \\?\ paths as literal device paths, and \\.\ paths as normalised device paths.

If, for whatever incomprehensible reason, you need to access a file named .., which would normally be resolved to the parent directory during normalisation, you can do so via a literal device path.

Relative paths

Relative paths are incomplete paths that need to be combined with another path to uniquely identify a location.

Paths relative to the current directory

These are paths that use the current directory as their starting point, e.g. .\Torquay refers to a subdirectory of the current directory, and ..\Wales refers to a subdirectory of the parent of the current directory.

In Fileside, the current directory is taken to mean the folder shown in the pane where you enter your path.

Paths relative to the root of the current drive

If you start a path with a single \, it’s interpreted relative to the root of the current drive. So, if you’re currently anywhere on the E: drive and type in \Africa, you’ll end up in E:\Africa.

When the current directory is accessed via a UNC path, a current drive-relative path is interpreted relative to the current root share, say \\Earth\Asia.

Paths relative to a drive’s current directory

Less commonly used, paths specifying a drive without a backslash, e.g. E:Kreuzberg, are interpreted relative to the current directory of that drive. This really only makes sense in the context of the command line shell, which keeps track of a current working directory for each drive.

This is the only type of path that Fileside does not support, as it has no concept of a current directory per drive. Only panes have a current directory.
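To make the six schemes concrete, here is a minimal Python sketch that sorts a path string into one of them by looking only at its prefix. The function name classify and the labels it returns are made up for this illustration. It does not touch the file system, and it is not how Fileside or Windows does it.

```python
import re

def classify(path):
    """Return a label for the kind of Windows path, judged by its prefix only."""
    # Literal device paths skip normalisation, so they are matched as typed.
    if path.startswith("\\\\?\\"):
        return "literal device path"
    # Everything else gets its forward slashes treated as backslashes.
    p = path.replace("/", "\\")
    if p.startswith("\\\\.\\"):
        return "normalised device path"
    if p.startswith("\\\\"):
        return "UNC path"
    if re.match(r"[A-Za-z]:\\", p):
        return "drive path"
    if re.match(r"[A-Za-z]:", p):
        return "relative to a drive's current directory"
    if p.startswith("\\"):
        return "relative to the root of the current drive"
    return "relative to the current directory"

for example in ["D:\\Doughnut preferences", "\\\\Work\\Hard", "\\Africa", "E:Kreuzberg"]:
    print(example, "->", classify(example))
```

The order of the checks matters: the device prefixes must be tested before the plain \\ of a UNC path, and the drive letter followed by a backslash before the bare drive letter.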

Normalisation

As alluded to above, all paths except literal device paths go through a normalisation process before they’re used. This process consists of the following steps:

  • Replace forward slashes (/) with backslashes (\)
  • Collapse repeated backslash separators into one
  • Resolve relative segments (. and ..)
  • Trim trailing spaces and periods

It’s thus generally possible to specify Windows paths using forward slashes as well.
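The four steps above can be approximated in Python, which ships a Windows-flavoured path module (ntpath) that works on any platform. This is only a sketch: the function name normalise is hypothetical, ntpath.normpath stands in for the first three steps, and the trimming of the final segment is my own simplification, not the routine Windows itself runs.

```python
import ntpath

def normalise(path):
    """Approximate the normalisation steps for a drive or UNC path."""
    # Steps 1-3: swap / for \, collapse repeated separators, resolve . and ..
    cleaned = ntpath.normpath(path)
    # Step 4: trim trailing spaces and periods from the final segment.
    head, tail = ntpath.split(cleaned)
    trimmed = tail.rstrip(" .")
    if not trimmed:
        # The tail was empty or consisted only of dots and spaces; leave it alone.
        return cleaned
    return ntpath.join(head, trimmed)

print(normalise("C:/Windows/../Users/../Chameleon. "))
```

Literal device paths are deliberately left out here, since they skip this process altogether.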

Windows naming rules

Now we’ll turn our focus to the individual segments that make up a path. There are a number of restrictions on what names you can use for files and folders.

Disallowed characters

You can’t use any of the following characters in a name:

< > " / \ | ? *

Any non-printable character with an ASCII value below 32 is equally out of the question.

The cunning colon

For the most part, : is also banned.

However, there is an exotic exception in the form of NTFS alternate data streams. It is a little known fact that you can store a hidden piece of data inside a file in certain contexts by appending a suffix preceded by a colon to its name.

The perilous period

A . is fine anywhere within, or at the start of, a name, but is disallowed as a final character.

Leading and trailing spaces

Curiously, Windows allows spaces at the beginning of names, but not at the end. As a name with spaces around it often looks identical to one without, these are generally a terrible idea, and Fileside automatically trims them off when you rename or create files.

Disallowed names

For historical reasons, you can’t use any of the following names either:

CON, PRN, AUX, NUL, COM0, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT0, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, and LPT9.

That includes usage with an extension tacked on. If you call a file COM1.txt for example, it gets converted into \\.\COM1 internally, and gets interpreted as a device by Windows. Not what you want.
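The naming rules covered so far can be gathered into a small checker. The sketch below is an illustration in Python under a few assumptions: the function name name_problems and its messages are invented, it treats any colon as a problem (ignoring the data stream exception), and it only looks at a single file or folder name, not a whole path.

```python
# Reserved device names, matched case-insensitively, with or without extension.
RESERVED_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{n}" for n in range(10)}
    | {f"LPT{n}" for n in range(10)}
)
FORBIDDEN_CHARS = set('<>"/\\|?*')

def name_problems(name):
    """Return a list of reasons why a single name would be rejected."""
    problems = []
    if any(ch in FORBIDDEN_CHARS for ch in name):
        problems.append("contains a disallowed character")
    if any(ord(ch) < 32 for ch in name):
        problems.append("contains a control character")
    if ":" in name:
        problems.append("contains a colon")
    if name.endswith("."):
        problems.append("ends with a period")
    if name.endswith(" "):
        problems.append("ends with a space")
    # COM1.txt counts as COM1, so compare only the part before the first period.
    if name.split(".", 1)[0].upper() in RESERVED_NAMES:
        problems.append("uses a reserved device name")
    return problems

for candidate in ["Chameleon.txt", "COM1.txt", "what?", "trailing."]:
    print(repr(candidate), name_problems(candidate))
```

An empty list means the name passed every check in this sketch; Windows may still refuse it for other reasons, such as length.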

Case sensitivity

For the most part, Windows doesn’t make a distinction between lower- and upper-case characters in paths.

C:\Polish hamlet, c:\polish Hamlet, C:\Polish Hamlet, or C:\POliSh hAMlET are considered exactly the same thing.

However, since the Windows 10 April 2018 Update, NTFS file systems do have the option of enabling case sensitivity on a per-folder basis.
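Leaving the per-folder option aside, a case-insensitive comparison of two path strings can be sketched in Python with ntpath.normcase, which lowercases the string and converts forward slashes to backslashes. The helper name same_ignoring_case is made up for this example, and it compares strings only, without consulting the file system.

```python
import ntpath

def same_ignoring_case(a, b):
    """Compare two Windows path strings the way a case-insensitive volume would."""
    return ntpath.normcase(a) == ntpath.normcase(b)

print(same_ignoring_case("C:\\Polish hamlet", "C:\\POliSh hAMlET"))
```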

Length limits

We’re not quite done yet: the lengths of things have limits too.

Paths

Traditionally, a path on Windows could not exceed a total of 260 characters. Even today, this is still the case for some apps, unless they have taken care to implement a workaround.

This workaround consists of transforming every path into a literal device path under the hood before passing it to Windows. Through doing so, we can bypass the 260 character limit, and increase it to a much more generous 32,767 characters instead. Fileside implements this workaround.
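As a rough illustration of that transformation, here is a Python sketch. It is not Fileside's actual code: the function name to_literal_device_path is hypothetical, and it assumes the input is already an absolute drive or UNC path. Because literal device paths skip normalisation, the path is cleaned up with ntpath.normpath before the prefix is added.

```python
import ntpath

LITERAL_PREFIX = "\\\\?\\"      # the literal device prefix
NORMALISED_PREFIX = "\\\\.\\"   # the normalised device prefix

def to_literal_device_path(path):
    """Turn an absolute drive or UNC path into a literal device path."""
    # Paths that are already device paths are returned untouched.
    if path.startswith((LITERAL_PREFIX, NORMALISED_PREFIX)):
        return path
    # Windows will not clean a literal path up for us, so do it first.
    cleaned = ntpath.normpath(path)
    if cleaned.startswith("\\\\"):
        # UNC paths swap their leading \\ for the special UNC device name.
        return LITERAL_PREFIX + "UNC\\" + cleaned[2:]
    return LITERAL_PREFIX + cleaned

print(to_literal_device_path("C:\\Chameleon"))
print(to_literal_device_path("\\\\localhost\\C$\\Chameleon"))
```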

Names

Individual file and folder names cannot be longer than 255 characters.

So many ways to say the same thing

Armed with all this knowledge, we realise that we can construct an almost unlimited number of different path strings that all refer to the same directory.

  • C:\CHAMELEON
  • c:\chameleon
  • C:/\//\///Chameleon
  • C:\Windows\..\Users\..\Chameleon
  • \\localhost\C$\Chameleon
  • \\127.0.0.1\C$\Chameleon
  • \\?\C:\Chameleon
  • \\.\C:\Chameleon
  • \\.\UNC\localhost\C$\Chameleon
  • \\?\Volume{59e01a55-88c5-411e-bf0a-92820bdb2549}\Chameleon
  • \\.\GLOBALROOT\Device\HarddiskVolume4\Chameleon
  • etc.

That’s what sticking to a policy of total backwards compatibility for several decades gets you!
