URL - Just Solve the File Format Problem
URL
File Format
Name: URL
Ontology: Electronic File Formats > Web > URL
Wikidata ID: Q42253, Q61694, Q424583
Released: 1990
A URL (Uniform Resource Locator) is an address of a resource as used on the World Wide Web, and is one of Tim Berners-Lee's original three pillars of the Web along with HTTP and HTML. Technically speaking, a URL is just one category of such addresses, a subset of URI (Uniform Resource Identifier) and parallel to URN (Uniform Resource Name), but such distinctions aren't always consistently maintained even by technical people, and URL has entered the popular language in a way those other terms have not.
Over time, the precise definitions of the various terms for Web-related addresses have changed and have been argued about extensively by technical people, and more terms have been added: an IRI (Internationalized Resource Identifier) is like a URI, but extended to allow non-ASCII characters so that languages other than English can be supported. However, the newer HTML 5 standards drafts take the more pragmatic approach of using "URL" to refer to anything that a browser is expected to resolve as an address, as one of many "willful violations" of earlier technical specifications. (The "techie" equivalent of social conservatives may consider this to be "defining deviancy down" and hence an "abomination".)
Use of URLs (and URIs, etc.) is not limited to the Web: there are a number of other technical usages, such as defining namespaces for file formats (e.g., XML) and identifying even non-Web-accessible objects for the purpose of expressing taxonomic relations. In less-technical usage, URLs turn up in all sorts of places like TV commercials, billboards, and the sides of vans, but often with the protocol portion left off when HTTP (or its secure variety, HTTPS, which eventually became the most commonly used protocol) is used. These days most browsers don't even show the "https://" part in the address bar, though it's still officially part of the URL.
Contents
Types of identifiers
Standard syntax
data: URLs
Example
See also
Official documents
Proposed documents
Official sites
Other links
Types of identifiers
URI: The official "parent term" for URLs, URNs, and other such identifiers, but limited to ASCII characters, with anything else needing to be specially encoded. Even within the ASCII range, some characters such as the space are prohibited, reserved, or designated to be used only for specific syntactic purposes, with percent-encoding necessary for all other uses.
IRI: The internationalized version of URIs, with more liberal rules about what characters in the entire Unicode range may be included. This allows text in non-English languages to be included without messy encoding, though various transfer protocols may still require the entire string to be encoded on transmission to produce an ASCII-based URI. This is really just a different representation of a URI; any IRI may be represented as a URI by percent-encoding the UTF-8 byte sequences of any non-ASCII characters it contains.
URL: Technically only the subset of URIs that are "locators", able to be used to retrieve resources because they designate a specific address for them, but in practice the distinction is very fuzzy and usually ignored. Some newer standards such as HTML 5.0 simply follow common non-techie usage and use URL to refer to the whole universe of Web-style addresses (encompassing URIs and IRIs, and anything else a browser can accept as an address even if it fails to comply with any of the standards).
URN: Uniform Resource Name. Another type of URI, which is supposed to provide a stable, permanent identifier for a resource without including a specific (and changeable) address for it. To resolve a URN, one needs a resolver such as a server or website that stores a table of current locations of items with URNs. Currently the standards call for all URNs to begin with the 'urn:' scheme identifier; the next item after this is a URN namespace, followed by another colon and the namespace-specific information. Some common naming schemes have been adopted as URNs, such as ISBNs (International Standard Book Number), which take the form "urn:isbn:1-234567-890". Unfortunately, browsers haven't been quick to implement URN resolvers as standard features, though add-ons can be installed to do it. URNs can also be used to refer to a UUID, and are also used to refer to hashes in magnet URIs.
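As a rough illustration of two points from the list above, the following Python sketch maps an IRI to a URI by percent-encoding UTF-8 bytes, and splits a URN into its namespace and namespace-specific part. The function names `iri_to_uri` and `split_urn` and the `example.org` address are made up for this example; the sketch assumes an ASCII host name (internationalized domain names need separate handling) and does not validate the result against the RFCs.

```python
from typing import Tuple
from urllib.parse import quote


def iri_to_uri(iri: str) -> str:
    """Percent-encode the UTF-8 bytes of non-ASCII characters and spaces.

    Hypothetical helper: characters that already have a syntactic role in
    URIs are left alone, and "%" is kept so existing escapes are not
    encoded a second time.
    """
    return quote(iri, safe=":/?#[]@!$&'()*+,;=%")


def split_urn(urn: str) -> Tuple[str, str]:
    """Split 'urn:<namespace>:<specific>' into (namespace, specific)."""
    # Unpacking raises ValueError if fewer than three parts are present.
    scheme, namespace, specific = urn.split(":", 2)
    if scheme.lower() != "urn":
        raise ValueError("not a URN: " + urn)
    return namespace.lower(), specific


print(iri_to_uri("https://example.org/wiki/Zürich café"))
print(split_urn("urn:isbn:1-234567-890"))
```

Running it prints `https://example.org/wiki/Z%C3%BCrich%20caf%C3%A9` followed by `('isbn', '1-234567-890')`.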
Standard syntax
URLs/URIs/etc. always start with a scheme (protocol). (At least, absolute URLs do; there are also relative URLs that leave off parts at the beginning because they are construed as being relative to the current URL they are accessed from.) The most common scheme was traditionally HTTP, but the encrypted variant HTTPS has since overtaken it; there are many other schemes too, although they are less common. The scheme part ends with a colon (:).
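To make the absolute/relative distinction concrete, here is a small sketch using `urljoin` from Python's standard library, which resolves a reference against a base URL. The base address and the references are invented for illustration.

```python
from urllib.parse import urljoin

# Hypothetical base URL: the page a relative reference is found on.
BASE = "https://example.org/docs/formats/url.html"

references = [
    "uri.html",                    # sibling document in the same directory
    "../index.html",               # one level up
    "/about",                      # path-absolute: keeps scheme and host only
    "//cdn.example.org/logo.png",  # scheme-relative: keeps the scheme only
    "mailto:someone@example.org",  # already absolute: has its own scheme
]

# Resolve every reference against the base.
resolved = {ref: urljoin(BASE, ref) for ref in references}

for ref, absolute in resolved.items():
    print(f"{ref!r:32} -> {absolute}")
```

The first four references inherit whatever they leave off from the base URL, while the last one is returned unchanged because it begins with its own scheme.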
After this, the rest of the URL is scheme-dependent; there are a number of different syntaxes used in different types of URLs. A common syntax, expected by the standards to be used in all schemes with hierarchical path structures, follows the scheme part with a double slash (//) which introduces a host or authority portion (usually a domain name), which is then followed by another slash and then the full path being addressed, which uses forward slashes to separate hierarchical levels (which may, but needn't, correspond to subdirectories in a filesystem).
There's a common misconception that URLs always have a double slash after the colon, sometimes causing developers of new schemes to put this in their syntax where the standards don't call for it; it is only supposed to be used if the following element is some sort of "authority" (most commonly the address to connect to) by which a following path is to be interpreted. There are a number of schemes with no such authority, and hence no double slash; for instance, "mailto:".
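The difference between schemes with and without an authority can be seen by splitting a few URLs with Python's standard `urlsplit`. This is only a sketch: the helper name `describe` and the sample addresses are assumptions made for this example.

```python
from urllib.parse import urlsplit


def describe(url: str) -> dict:
    """Return the generic components of a URL as a plain dict."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,      # text before the first colon
        "authority": parts.netloc,   # only present after a double slash
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


# Hierarchical syntax: scheme, double slash, authority, then a path.
web = describe("https://example.org/formats/web/url?action=view#syntax")

# No authority, so no double slash: everything after the colon is the path.
mail = describe("mailto:someone@example.org")
book = describe("urn:isbn:1-234567-890")

for label, parts in (("web", web), ("mail", mail), ("book", book)):
    print(label, parts)
```

Only the first URL yields a non-empty authority (`example.org`); for the `mailto:` and `urn:` examples the authority is an empty string and the remainder is reported as the path.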
data: URLs
One scheme, data:, is actually a file format in its own right, since it encodes the entire contents of a file within the URL instead of referencing an external resource as other schemes do.
Wikipedia article on data: URIs
RFC 2397
Data URL maker
Example
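As a minimal sketch of how a data: URL carries file contents inline, the Python below builds one from a short byte string and decodes it again, following the general layout in RFC 2397 (`data:`, an optional media type, an optional `;base64` marker, a comma, then the data). The function names `make_data_url` and `parse_data_url` are hypothetical, and the parser handles only this simple layout rather than every variation a browser may accept.

```python
import base64
from typing import Tuple
from urllib.parse import quote, unquote_to_bytes

# RFC 2397 default when the media type is omitted.
DEFAULT_MEDIA_TYPE = "text/plain;charset=US-ASCII"


def make_data_url(payload: bytes, media_type: str = "", use_base64: bool = True) -> str:
    """Embed payload in a data: URL, base64- or percent-encoded."""
    if use_base64:
        body = base64.b64encode(payload).decode("ascii")
        return f"data:{media_type};base64,{body}"
    return f"data:{media_type},{quote(payload)}"


def parse_data_url(url: str) -> Tuple[str, bytes]:
    """Return (media_type, payload) for a simple data: URL."""
    header, _, data = url.partition(",")
    if header[:5].lower() != "data:":
        raise ValueError("not a data: URL")
    meta = header[5:]
    if meta.endswith(";base64"):
        meta = meta[: -len(";base64")]
        payload = base64.b64decode(data)
    else:
        payload = unquote_to_bytes(data)
    return meta or DEFAULT_MEDIA_TYPE, payload


url = make_data_url(b"Hello, World!", "text/plain")
print(url)
print(parse_data_url(url))
```

This prints `data:text/plain;base64,SGVsbG8sIFdvcmxkIQ==` and then `('text/plain', b'Hello, World!')`. Base64 is the usual choice for binary content; percent-encoding (`use_base64=False`) tends to be shorter for mostly-ASCII text.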
See also
Magnet URI
URL encoding
Official documents
RFC 1738 (early absolute URL standard)
RFC 1808 (early relative URL standard)
RFC 2396 (early URI syntax standard)
RFC 3986 (later URI syntax standard)
RFC 3987 (IRI standard)
RFC 4395 (info on registering new URI schemes)
W3C clarification of URIs, URLs, and URNs (2001)
HTML 5.1 draft section on URLs (which intentionally disregards the distinctions in the other documents above)
Proposed documents
Internet draft for File scheme
Official sites
W3C: Naming and Addressing (old)
W3C: Identifiers (new)
IANA list of registered URI schemes
IANA list of URN namespaces
Other links
Dan's Web Tips: URLs
Well Designed URLs Blog
Cool URIs don't change (Tim Berners-Lee)
File URIs in Windows
File URI scheme update project wiki
URI Guidelines for publishing Linked Datasets on data.gov.au v0.1
Proposed Group: URI Specification Community Group
If you liked it then you should have put a URL on it
Beyond Robust Links: The case for robust urls and an Archival Url standard
OkHttp’s New URL Class
History of the URL: Domain, Protocol, and Port
Categories
File Formats
Electronic File Formats
Web
Naming and numbering systems
This page was last modified on 6 March 2026, at 19:02.
This page has been accessed 59,956 times.
Content is available under
Creative Commons 0