Where would the web be without links? Links are what hold together what we know as the World Wide Web. Without links, the World Wide Web would be more appropriately called the World Wide Set Of Unrelated Pages, or, incidentally, WWSOUP.
The process of “linking” pages together is wonderfully simple and effective, but I think there’s room for improvement.
If you’ve never heard of the term fragment identifier, well, that’s just the official name for the part of a URL that follows the hash symbol (“#”). Some people refer to links with fragment identifiers as “in page links”. So for example, in the following URL, the fragment identifier would be the string “scroll-to-fragid”:
If you visit the above URL (which is the WHATWG HTML5 spec that discusses fragment identifiers), the page will automatically jump to the section that’s been identified by the browser as the “scroll-to-fragid” section.
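To make the terminology concrete, here is a minimal TypeScript sketch that pulls the fragment identifier out of a URL using the standard URL class. The example.com address is a placeholder of my own (not the spec URL referred to above), and the function name fragmentOf is hypothetical.

```typescript
// Return the fragment identifier of a URL without the leading "#".
// An empty string means the URL has no fragment.
function fragmentOf(url: string): string {
  const hash = new URL(url).hash; // e.g. "#scroll-to-fragid", or "" if absent
  return hash.replace(/^#/, "");
}

// Placeholder URL, for illustration only.
const fragment = fragmentOf("http://www.example.com/spec.html#scroll-to-fragid");
console.log(fragment); // "scroll-to-fragid"
```

In a browser, the same value is available for the current page through location.hash.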
Fragment identifiers also come in handy for deep linking and preserving state in Ajax-based applications. So these certainly have an important role on the web.
How Are Fragments Identified?
In order for the browser to correctly identify which section of the page the window should scroll to, the fragment needs to be identified within the HTML by means of the id attribute. This means that if a web developer hasn’t done their job, then there will be no way to link to a specific fragment of that document.
So, if you were linking to a document that was five screens long and didn’t have any id attributes in the source, and you wanted to link to a specific section three screens down, you would have no way to do this. You’d have to link to the page, then add the words “Scroll down to section B” or something ridiculous like that.
The Problem With Fragment Identifiers
The simple problem that I see with fragment identifiers is that their existence and functionality rely completely on the developer rather than the browser. Yes, the browser needs to read and interpret the identifier and find the matching fragment. But if the developer doesn’t include any id attributes in the HTML of the page, then there will be no identifiable fragments.
Do you see why this is a problem? Whether the developer has coded identifiers into the HTML has nothing to do with whether or not the page actually has fragments. Virtually every web page has fragments. In fact, sectioning content as defined in the HTML5 spec implies as much. Every element on the page that can contain content can theoretically be categorized as a “fragment”.
So why is it up to the developer (or content creator) to define whether or not a specific portion of the content can be linked to? When any page of content is created, there is no way of knowing which sections of the page are worthy of being identified. The developer or content creator may have a general idea of how a page’s content might be divided up, but ultimately it is the linking resource that should have full control over what portion of the page it wants to highlight.
That, after all, is how linking works. A page that’s displayed as a result of a web-based hyperlink is displayed to the end user only because the referrer (i.e. the page linking to it) defined the link that way. This means that, regardless of what the developer has done behind the scenes in the HTML, all HTML fragments on that page should be identifiable by external referrers.
The Solution: Power to the Browser and User
The solution, as I see it, is for the HTML spec to require that browsers have an internal mechanism for identifying fragments that can optionally be overridden by the developer. Just as the browser, by default, makes all links blue and underlines them, and allows these styles to be changed via CSS, likewise the ability to link to specific sections of a page should be built into the browser, and then the developer should have the option to change this.
Here’s a simple example of how this might be implemented. Suppose you have the following HTML page:
<h1>Page Title</h1>
<p>Some introductory text.</p>
<h2>Page Subhead 1</h2>
<p>Some text for subhead 1.</p>
<h2>Page Subhead 2</h2>
<p>Some text for subhead 2.</p>
<h2>Page Subhead 3</h2>
<p>Some text for subhead 3.</p>
<h2>Page Subhead 4</h2>
<p>Some text for subhead 4.</p>
This type of structure is common on almost all blog posts. The post is divided into sections by means of headings, but unless the developer actually hard-codes id attributes onto each heading tag, there is no way to link to any of those unique sections of the page.
To solve this problem, the browser should allow native fragment identifiers that use the HTML elements themselves in a CSS selector-like fashion. So if you wanted to link to “Page Subhead 3” in that HTML page, you could do something like this:
<a href="http://www.example.com/example.html#h2:3">Check this out!</a>
Notice the string h2:3 that appears after the hash symbol. This tells the browser to link to the third <h2> element on the page. This example, of course, is just theoretical, and not meant to imply that this is the way it will be implemented. This is just to illustrate how it could be done without being dependent on developer-added attributes.
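As a rough illustration of how such an identifier might be resolved, here is a minimal TypeScript sketch. It assumes the purely hypothetical tag:n syntax from the link above (n counted from 1), and it models the page as a plain list of tag names in document order so that it can run outside a browser. The names parseElementFragment and findTarget are my own, not part of any spec or library.

```typescript
// Hypothetical fragment syntax from the example above: "<tag>:<n>".
interface ElementFragment {
  tag: string;   // lowercase element name, e.g. "h2"
  index: number; // 1-based position among elements with that tag
}

// Parse "h2:3" (with or without a leading "#"); return null for anything else,
// so that ordinary id-based fragments can still be handled the usual way.
function parseElementFragment(hash: string): ElementFragment | null {
  const match = /^#?([a-z][a-z0-9]*):([1-9][0-9]*)$/i.exec(hash);
  if (match === null) return null;
  return { tag: match[1].toLowerCase(), index: Number(match[2]) };
}

// Locate the target in a list of tag names given in document order.
// Returns its position in that list, or -1 if there is no such element.
function findTarget(tagsInDocumentOrder: string[], fragment: ElementFragment): number {
  let seen = 0;
  for (let i = 0; i < tagsInDocumentOrder.length; i++) {
    if (tagsInDocumentOrder[i].toLowerCase() === fragment.tag) {
      seen += 1;
      if (seen === fragment.index) return i;
    }
  }
  return -1;
}

// The sample page from above, reduced to its tag names.
const page = ["h1", "p", "h2", "p", "h2", "p", "h2", "p", "h2", "p"];
const target = parseElementFragment("#h2:3");
if (target !== null) {
  console.log(findTarget(page, target)); // 6, the third <h2> in the list
}
```

In a real browser the lookup itself would be short: document.getElementsByTagName("h2")[2] returns the third <h2>, and calling scrollIntoView() on it scrolls the window there. Returning null for unrecognized strings leaves room for developer-supplied id attributes to take precedence.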
Why Should Fragments Be Identified By Users?
Fragments should be identifiable by users because a user, not the content creator or the developer, will ultimately decide whether or not a portion of content is valuable or notable in some way.
Yes, the content creator should have the ability to decide how a page is generally divided, if they choose to do so. But the end user should not be restricted from linking to content fragments just because a developer couldn’t be bothered to add id attributes to every element on the page. And that’s aside from the fact that it would be a waste of time for a developer to do that or to have to build a CMS that does it automatically.
Blog Comments Get It Right
Linking directly to someone’s blog comment is very useful. Even if a blog doesn’t have an active link for each comment, it’s pretty easy to use developer tools to find the comment’s id and link to it. I’ve done this many times on Smashing Magazine (they don’t have live links on each comment).
If there were no way to link to an individual blog comment, this would be a great hindrance to linking on the web. It would not be enough to link to the “#comments” section and then hope for the best. So CMSs like WordPress do the right thing by dynamically adding a unique identifier to each comment.
As mentioned, this saves the content creator from having to do it themselves, and puts the identifiability (or, the decision on what’s valuable) in the hands of the user or the referring website.
It’s Already in the Works
Being fearful of writing an article like this and having someone smarter poke holes in my proposal, I ran a draft of this piece by Paul Irish and he pointed out that an improvement to fragment identifiers is already in the works, but in very early stages.
A developer named Simon St. Laurent is hosting an “unofficial draft” of a specification called Using CSS Selectors as Fragment Identifiers. The draft is authored by St. Laurent and Eric Meyer and seems to have been in the works for about a year (based on the date on that page). There’s even a jQuery script with a GitHub repo that attempts to implement this new type of fragment identifier. (Thanks to Ahmad for the GitHub link.)
And on a related note, media fragments (i.e. deep linking in audio and video, similar to what you can do on YouTube) have now been introduced and have some browser support (evidently WebKit and Firefox). Check out this part of the spec for the syntax.
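For a feel of what a media fragment looks like, here is a small TypeScript sketch that parses the temporal form, such as #t=10,20 for “play from second 10 to second 20”. I am assuming the syntax meant here is the one in the W3C Media Fragments URI specification, and the sketch covers only plain seconds (not the hh:mm:ss forms or multiple name=value pairs). The names TimeRange and parseTemporalFragment are hypothetical.

```typescript
interface TimeRange {
  start: number;      // seconds; 0 when the start is omitted
  end: number | null; // seconds; null means "until the end of the media"
}

// Parse "t=10,20", "t=10" or "t=,20" (an optional leading "#" and an
// optional "npt:" prefix are accepted). Returns null for anything else.
function parseTemporalFragment(hash: string): TimeRange | null {
  const match = /^#?t=(?:npt:)?(\d+(?:\.\d+)?)?(?:,(\d+(?:\.\d+)?))?$/.exec(hash);
  if (match === null) return null;
  if (match[1] === undefined && match[2] === undefined) return null; // bare "t="
  const start = match[1] !== undefined ? Number(match[1]) : 0;
  const end = match[2] !== undefined ? Number(match[2]) : null;
  if (end !== null && end <= start) return null; // the end must come after the start
  return { start, end };
}

console.log(parseTemporalFragment("#t=10,20")); // { start: 10, end: 20 }
console.log(parseTemporalFragment("#t=,20"));   // { start: 0, end: 20 }
console.log(parseTemporalFragment("#t=10"));    // { start: 10, end: null }
```

In supporting browsers, appending such a fragment to the src of a video or audio element is what triggers the deep link; the parsing above only illustrates how the string breaks down.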
All credit to Paul Irish for filling me in on these details.
Although implementing better fragment identifiers could be a challenge to support and publicize, for the reasons I’ve explained here, I think it’s a worthwhile addition to the HTML/CSS spec. I’m glad someone is already working on a proposal for this, and I hope this article serves to help make this known so that control of linking to content fragments ends up where it’s supposed to be: in the hands of users.