Some filenames are better than others. We enforce a certain pattern to ensure your website follows best practices for naming files on the web.
Back in the DOS days, filenames had some pretty strict rules. They couldn't be longer than eight characters plus a three-character extension. This was called the 8.3 filename, and it brings back a lot of memories for this ol' programmer.
In modern operating systems, filenames are much more flexible. These days, it's perfectly valid to have a file called Bob's Résumé.pages (note the capital letters, apostrophe, space, and accented characters). This is obviously better than bobsres.pgs, but it's probably not something you want to do on your website.
Because of the way URLs are encoded, your pretty filenames will end up looking like this.
https://example.com/files/Bob%27s%20R%C3%A9sum%C3%A9.pages
It would be much easier to read and type if your URL looked like this instead.
https://example.com/files/bobs-resume.pages
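To see where the encoded URL above comes from, here is a minimal Python sketch that percent-encodes the filename using the standard library. The function name `encode_filename` is made up for illustration; it is not part of Surreal CMS.

```python
from urllib.parse import quote


def encode_filename(name: str) -> str:
    """Percent-encode a filename for use as a URL path segment."""
    # safe="" means every reserved character is encoded, including "/".
    return quote(name, safe="")


# "\u00e9" is the accented "é" in "Résumé".
pretty_name = "Bob's R\u00e9sum\u00e9.pages"
print("https://example.com/files/" + encode_filename(pretty_name))
```

The apostrophe becomes %27, the space becomes %20, and each "é" becomes the two UTF-8 bytes %C3%A9.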
Web-safe filenames
To keep things consistent, we enforce certain rules to create web-safe filenames when working with files in Surreal CMS.
- Filenames must be lowercase
- Filenames must not contain spaces
- Filenames must only contain a-z, 0-9, dashes, and dots.
Yuck. This is starting to sound like those crazy password requirements you see on banking websites. Don't worry, though. We make this seamless for users by automatically:
- Lowercasing the filename
- Converting spaces to dashes
- "Latinizing" accented characters (e.g. "é" becomes "e")
- Removing symbols and similar characters
So if a user enters Bob's Résumé.pages, we'll convert it to bobs-resume.pages for them.
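The four steps above could be sketched roughly like this in Python. This is only an illustration of the idea, not the code Surreal CMS actually runs; the function name `web_safe_filename` and the exact order of the steps are assumptions.

```python
import re
import unicodedata


def web_safe_filename(name: str) -> str:
    """Hypothetical sketch of the conversion steps described above."""
    # "Latinize": split accented letters into base letter + accent mark,
    # then drop the accent marks ("é" becomes "e").
    decomposed = unicodedata.normalize("NFKD", name)
    latinized = "".join(ch for ch in decomposed if not unicodedata.combining(ch))

    # Lowercase the filename.
    lowered = latinized.lower()

    # Convert each run of whitespace to a single dash.
    dashed = re.sub(r"\s+", "-", lowered.strip())

    # Remove anything that isn't a-z, 0-9, a dash, or a dot.
    return re.sub(r"[^a-z0-9.-]", "", dashed)


print(web_safe_filename("Bob's R\u00e9sum\u00e9.pages"))
```

One caveat with this approach: letters that have no accent to strip (such as "ß" or "ø") don't decompose, so this sketch simply removes them. A production version would likely need a lookup table for those cases.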
Why is this considered a best practice?
In reality, you can name a file just about anything you want and browsers will be able to handle it, albeit with a really ugly encoded URL. It's not just about aesthetics, though.
Some operating systems are case sensitive (e.g. Linux) while others, by default, are not (e.g. Windows and macOS). That means, in some operating systems, FILE.EXT is the same as file.ext. If you ever move your website to a server running a case-sensitive OS, links that used to work have the potential to break. There's also the possibility of duplicate content issues. For this reason, it's a lot safer to stick with lowercase filenames.
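If you're curious whether an existing site has this problem, a quick check is to group filenames by their case-folded form and look for groups with more than one entry. The helper below is a hypothetical sketch, not a Surreal CMS feature.

```python
from collections import defaultdict


def case_collisions(filenames):
    """Return groups of names that differ only by letter case."""
    groups = defaultdict(list)
    for name in filenames:
        # casefold() is a more aggressive lower() meant for caseless comparison.
        groups[name.casefold()].append(name)
    return [names for names in groups.values() if len(names) > 1]


print(case_collisions(["FILE.EXT", "file.ext", "index.html"]))
```

Any group it reports is a set of names that a case-insensitive system treats as one file and a case-sensitive system treats as several.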
Spaces aren't pretty in URLs because they're encoded as %20, so we replace them with dashes. Dashes are simply less scary-looking in URLs, so users are more comfortable seeing them. We prefer dashes over underscores per Google's recommendation.
Accented characters get encoded in URLs too, so we convert them to their Latin equivalents. This makes URLs easier to read and type. Search engines do a really good job of understanding these conversions, so SEO points aren't deducted. (Your "crème brûlée" recipe will still appear in searches for "creme brulee.")
We don't enforce a length limit for filenames, but it's possible that search engines might frown upon excessively long filenames, especially if you're stuffing them with keywords. Short and concise is a good rule of thumb.
We follow these best practices and enforce them in Surreal CMS so our users don't have to worry about it. It's one of many behind-the-scenes things we do to ensure a great experience for both them and their visitors.