Thu, 1st Apr 2010

As with most things, there is more than one way to link to a page on the internet. The same goes for linking to content within a Drupal system. You can use absolute URLs, absolute paths, relative paths, node ids, or even special syntax provided by one or another Drupal module. So what should you use, and when?

Consider these three URLs:

  1. http://www.systemseed.com/about
    An absolute, or external URL to a piece of content.
  2. /about
    An absolute path to a piece of content.
  3. /node/157
    An absolute path to a Node ID.

Whilst they all link to the same thing, and clicking on any of them will take you to the same page, each of them poses a problem.

The problem with using absolute URLs in links

The SystemSeed website can be found at the address http://www.systemseed.com. But, as with most sites I work on, I have several copies of the site and each one is accessible at a different URL. There is dev.systemseed.com, my local development environment, where I am free to do what I like without fear of breaking things. Then there is stage.systemseed.com, our staging server, where we test new features for the website before sending them into the wild.

If I had used absolute URLs when linking to content on www.systemseed.com, then when I transferred that content over to dev.systemseed.com, the link within the content would still point at www.systemseed.com, and clicking on it would send me back over to the live site. Having links that point to the live version of the website from the development site can be very frustrating, and introduces the potential for serious mistakes.

Tip: If you have multiple development environments, install the Environment Indicator module

If I did have the Environment Indicator module installed, I would probably be OK because I'd instantly be able to see that I wasn't in the development environment any more. But if I didn't, and I didn't happen to glance at the URL bar and notice that I was now at www.systemseed.com rather than dev.systemseed.com, I may well have ended up doing something drastic on the live server by mistake!

The problem with using aliases in links

So, rather than linking to http://www.systemseed.com/about, it would be better if we excluded the domain portion of the URL, and simply used the absolute path to the content:

http://www.systemseed.com/about becomes /about
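If a lot of existing content already contains absolute URLs, the conversion can be scripted. Here is a minimal sketch in Python, assuming a hypothetical SITE_HOSTS set listing the hostnames that serve copies of the site; the function name to_absolute_path is my own and is not part of any Drupal module.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical: every hostname that serves a copy of the same site.
SITE_HOSTS = {"www.systemseed.com", "dev.systemseed.com", "stage.systemseed.com"}


def to_absolute_path(url):
    """Strip scheme and host from url if it points at one of our own hosts."""
    parts = urlsplit(url)
    if parts.netloc.lower() not in SITE_HOSTS:
        # External link, or already a path: leave it untouched.
        return url
    path = parts.path or "/"
    # Keep the path, query string and fragment; drop scheme and host.
    return urlunsplit(("", "", path, parts.query, parts.fragment))
```

Links to other sites pass through unchanged, so the function should be safe to run over every link in a piece of content.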

But what if I decide that /about should actually be located at /company/about? Well, I'd have to run a link checker to find all the broken links, and then fix them manually so that they all point at the new URL, /company/about. Or, I could set up a 301 redirect from /about to /company/about.

Tip: If you use the Pathauto module, install the Path Redirect module too - you can configure it to automatically set up 301 redirects every time a node alias changes.
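To illustrate what a 301 redirect table does when an alias moves, here is a rough Python sketch. The REDIRECTS dictionary and the resolve function are hypothetical and only model the idea; this is not how Path Redirect itself is implemented.

```python
# Hypothetical redirect table: old alias -> new alias.
REDIRECTS = {"/about": "/company/about"}


def resolve(path, redirects=REDIRECTS):
    """Follow redirects for path and return (status, final_path).

    The status is 301 if at least one redirect was followed, otherwise 200.
    """
    status = 200
    seen = set()
    while path in redirects:
        if path in seen:
            # Two aliases redirecting to each other would loop forever.
            raise ValueError("redirect loop at " + path)
        seen.add(path)
        path = redirects[path]
        status = 301
    return status, path
```

With this table, resolve("/about") returns (301, "/company/about"), while a path with no entry comes back unchanged with a 200.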

The problem with using a Node ID in links

So, rather than linking to /about, it would be even better if we just used the Node ID directly, since this will never change.

/about becomes /node/157

This way, that link will always get you to the node with the ID of 157, so regardless of the alias that node has, or whatever redirects to that node, our link will always get you to the page.

Using the Path Filter module to translate Node ID to Path Alias

If the exact same content is accessible at two different URLs, as it is here at /node/157 and /about, it can be a problem for SEO. Search engines call this duplicate content, and they may mark you down for it. You can use the canonical link element to hint to search engines that /node/157 and /about are the same thing, but not all search engines take notice of the canonical link element.

The motivation for the Path Filter module, which I now co-maintain, was to provide a robust way of linking to internal URLs from within content, so that your links do not break if you move your site to a different path (e.g. from a development site at http://example.com/dev/ to a production site at http://example.com/).

It does this by providing its own Input Format, which takes internal Drupal paths in single or double quotes, written as e.g. "internal:node/157", and replaces them with the appropriate absolute HTTP URL or path using Drupal's url() function. Version 2 of the module also works for files in your files directory using Drupal's file_create_url() function.

/node/157 becomes internal:node/157

This ensures that if you move your site from one server to another, or from one subdirectory to another, or if you change the URL alias of the node, the links throughout the site will still point to the right node, and they will point to it using the correct URL alias.
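The replacement step can be sketched in a few lines of Python. This is only an illustration of the idea, not the module's code: the ALIASES dictionary stands in for Drupal's alias lookup, and base_path stands in for the part that url() works out from the site's location. Both names are assumptions of mine.

```python
import re

# Hypothetical alias table, standing in for Drupal's path alias lookup.
ALIASES = {"node/157": "about"}

# Matches internal:PATH wrapped in matching single or double quotes.
INTERNAL_LINK = re.compile(r"""(["'])internal:([^"']*)\1""")


def filter_internal_links(html, base_path="/", aliases=ALIASES):
    """Replace quoted internal:PATH references with the aliased path."""
    def replace(match):
        quote, path = match.group(1), match.group(2)
        # Fall back to the unaliased path when no alias is known.
        return quote + base_path + aliases.get(path, path) + quote
    return INTERNAL_LINK.sub(replace, html)
```

Because the base path and the alias are both looked up when the content is rendered, the same stored text yields /about on a site at the web root and /dev/about on a copy installed under /dev/.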

Other things to consider

Nodewords and the Canonical URL

The Canonical URL is an HTML link element that tells search engines the preferred location of some content.

<link rel="canonical" href="http://example.com/page.html"/>

The Drupal Nodewords module can automatically generate a Canonical URL for your site's content.
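As a rough illustration of what generating that element involves, here is a small Python sketch. The canonical_link function and its arguments are hypothetical and say nothing about how Nodewords builds the tag internally.

```python
from html import escape


def canonical_link(base_url, alias):
    """Build a canonical link element for a page's preferred alias."""
    href = base_url.rstrip("/") + "/" + alias.lstrip("/")
    # Escape the URL so characters such as & are safe inside the attribute.
    return '<link rel="canonical" href="%s"/>' % escape(href, quote=True)
```

The point is that the same element is emitted whether the page was requested by its Node ID path or by its alias, so both responses name the one preferred URL.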

And some modules to help with links and SEO...

Pathologic is an Input Filter which can correct paths in links and images in your Drupal content in situations which would otherwise cause them to "break" (http://drupal.org/project/pathologic).

Link Node is similar to Path Filter and provides an Input Filter which allows you to use the syntax [node:NNN,title="Original version of the picture"], where NNN is a Node ID (http://drupal.org/project/link_node).

URL Replace Filter allows administrators to replace the base URL in <img> and <a> elements (http://drupal.org/project/url_replace_filter).

Absolute SRC translates absolute paths to absolute URLs (http://drupal.org/project/abssrc).

Global Redirect ensures that your content is only visible at the one, best URL possible (i.e. requests for node/2 are forwarded to "alias-for-node-2").

After years of developing all types of web solutions, Tom made the strategic decision to focus his efforts on making Drupal a better platform. In 2010, he led the successful exit of his KirkDesigns through a joint venture with Web at Ease. That event formed SystemSeed.