First of all, it is always better to bring in an SEO manager early in the development stage, so that there is no need for hard-to-implement tweaks afterwards.
Some content management systems bake poor URL structures right into their websites. Lax rules can be a culprit: for example, not encoding spaces or special characters.
From an SEO point of view, a site’s URL structure should be:
- Straightforward: URLs with duplicate content should have canonical URLs specified for them; there should be no confusing redirects on the site, etc.
- Meaningful: URL names should have keywords in them, not gibberish numbers and punctuation marks.
- With emphasis on the right URLs: SEO-wise, not all URLs on a site are of equal importance as a rule, and some should even be concealed from the search engines. At the same time, it is important to check that the pages that ought to be accessible to the search engines are actually open for crawling and indexing.
So, here is what one can do to achieve an SEO-friendly site URL structure:
1- Consolidate the www and non-www versions of your domain
As a rule, there are two major versions of your domain indexed in the search engines: the www and the non-www version. These can be consolidated in more than one way, but I’d mention the most widely accepted practices. Canonical issues, parameters that do not change page content, loose adherence to coding standards, or any number of other reasons can create duplicate content.
Options for dealing with duplicate content include:
- Reconfigure the content management platform to generate one consistent URL for each page of content.
- 301 redirect duplicate URLs to the correct version (see the sketch after this list).
- Add canonical tags to webpages that direct search engines to group duplicate content and combine their ranking signals.
- Set a preferred domain in Google Webmaster Tools (Configuration >> Settings >> Preferred Domain), and configure URL parameters to direct search engines to ignore any parameters that cause duplicate content.
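For instance, on an Apache server with mod_rewrite enabled, a 301 redirect consolidating the non-www version onto the www version might look like this (a minimal sketch for an .htaccess file; example.com stands in for your own domain):

    RewriteEngine On
    # Catch requests for the bare, non-www domain...
    RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
    # ...and permanently (301) redirect them to the www version
    RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]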
2- Avoid dynamic and relative URLs
Depending on your content management system, the URLs it generates may be “pretty” like this one:
www.example.com/topic-name
or “ugly” like this one:
www.example.com/?p=578544
As I said earlier, search engines have no problem with either variant, but for certain reasons it’s better to use static (prettier) URLs rather than dynamic (uglier) ones. The thing is, static URLs contain your keywords and are more user-friendly, since one can figure out what the page is about just by looking at the static URL’s name.
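For instance, WordPress, one popular CMS, generates the “ugly” ?p= style URLs by default; switching on its pretty permalinks routes static-looking URLs through index.php using its standard rewrite block (shown here as a sketch for an Apache .htaccess file):

    RewriteEngine On
    RewriteBase /
    RewriteRule ^index\.php$ - [L]
    # Serve real files and directories directly...
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    # ...and hand everything else to WordPress, which resolves the pretty URL
    RewriteRule . /index.php [L]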
Besides, Google recommends using hyphens (-) instead of underscores (_) in URL names, since a phrase in which the words are connected using underscores is treated by Google as one single word, e.g. one_single_word is onesingleword to Google.
Also, remember to check which other elements of your page (such as the title tag and headings) should carry the same keywords as your URLs.
In addition, some web devs make use of relative URLs. The problem with relative URLs is that they are dependent on the context in which they occur; once the context changes, the URL may not work. SEO-wise, it is better to use absolute URLs instead of relative ones, since the former are what search engines prefer.
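To illustrate, here is the same link written both ways (with a hypothetical page and domain):

    <!-- Relative URL: resolves differently depending on the page it appears on -->
    <a href="../topic-name">Read more</a>

    <!-- Absolute URL: unambiguous no matter where it appears -->
    <a href="http://www.example.com/topic-name">Read more</a>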
Now, sometimes different parameters are added to the URL for analytics tracking or other reasons (such as sid, utm, etc.). To make sure that these parameters don’t make the number of URLs with duplicate content grow over the top, you can do either of the following:
- Ask Google to disregard certain URL parameters in Google Webmaster Tools in Configuration > URL Parameters.
- See if your content management system allows you to consolidate URLs with additional parameters into their shorter counterparts.
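The canonical tags mentioned in step 1 can also help here: a page reached through a tracking parameter can point search engines back to the clean URL. A sketch with hypothetical URLs:

    <!-- Served at www.example.com/topic-name?utm_source=newsletter -->
    <head>
      <!-- Tell search engines to treat the clean URL as the canonical one -->
      <link rel="canonical" href="http://www.example.com/topic-name">
    </head>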
3- Avoid mixed case
URLs, in general, are case-sensitive (with the exception of the machine, or domain, name).
Mixed case URLs can be a source of duplicate content. For example, these are not the same URL:
- http://example.com/Welcome-Page
- http://example.com/welcome-page
The easiest way to deal with mixed case URLs is to have your website automatically rewrite all URLs to lower case. With this one change, you never have to worry about whether the search engines are dealing with it automatically or not.
Another great reason to rewrite all URLs to lower case is that it will simplify any case-sensitive SEO and analytics reports. That alone is pure gold.
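On an Apache server, one common way to do the rewrite is with a RewriteMap (a sketch; note that the RewriteMap directive must live in the main server configuration, e.g. the virtual host, not in .htaccess):

    # In the main server configuration: define a map that lowercases its input
    RewriteMap lowercase int:tolower

    RewriteEngine On
    # If the requested path contains any uppercase letter...
    RewriteCond %{REQUEST_URI} [A-Z]
    # ...301-redirect to the all-lowercase version of the same path
    RewriteRule (.*) ${lowercase:$1} [R=301,L]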
4- Create an XML Sitemap
An XML Sitemap is not to be confused with the HTML sitemap. The former is for the search engines, while the latter is mostly designed for human users.
What is an XML Sitemap? In plain words, it’s a list of your site’s URLs that you submit to the search engines. This serves two purposes:
- This helps search engines find your site’s pages more easily;
- Search engines can use the Sitemap as a reference when choosing canonical URLs on your site.
The word “canonical” simply means “preferred” in this case. Picking a preferred (canonical) URL becomes necessary when search engines see duplicate pages on your site.
So, as they don’t want any duplicates in the search results, search engines use a special algorithm to identify duplicate pages and pick just one URL to represent the group in the search results. The other webpages just get filtered out.
Now, back to sitemaps: one of the criteria search engines may use to pick a canonical URL for a group of webpages is whether this URL is mentioned in the website’s Sitemap.
So, which webpages should be included in your sitemap: all of your site’s pages or not? In fact, for SEO reasons, it’s recommended to include only the webpages you’d like to show up in search.
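A minimal XML Sitemap might look like this (a sketch with placeholder URLs and dates):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <!-- One <url> entry per page you want to show up in search -->
      <url>
        <loc>http://www.example.com/topic-name</loc>
        <lastmod>2015-06-01</lastmod>
      </url>
      <url>
        <loc>http://www.example.com/another-topic</loc>
        <lastmod>2015-06-15</lastmod>
      </url>
    </urlset>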
5- Close off irrelevant pages with robots.txt
There may be pages on your site that should be concealed from the search engines. These could be your “Terms and conditions” page, pages with sensitive information, etc. It’s better not to let these get indexed, since they usually don’t contain your target keywords and only dilute the semantic whole of your site.
The robots.txt file contains instructions for the search engines as to which pages of your site should not be crawled. Note that robots.txt only controls crawling; to keep a page out of the search results altogether, you can additionally give it a noindex robots meta tag.
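For instance, a robots.txt file (placed at the root of the site) that blocks a couple of hypothetical paths might look like this:

    # Applies to all crawlers
    User-agent: *
    # Hypothetical paths to keep out of the crawl
    Disallow: /terms-and-conditions
    Disallow: /private/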
Sometimes, however, unsavvy webmasters use noindex on pages where it should not be used. Hence, whenever you start doing SEO for a site, it is important to make sure that no pages that should be ranking in search carry the noindex attribute.
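The noindex directive itself is a robots meta tag in the page’s head, so when auditing a site, this is the line to look for (a sketch):

    <head>
      <!-- Tells search engines not to include this page in their index -->
      <meta name="robots" content="noindex">
    </head>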
Conclusion: Having an SEO-friendly URL structure on a site means having a URL structure that helps the site rank higher in the search results. While, from the point of view of web development, a particular site’s architecture may seem crystal-clear and error-free, for an SEO manager it could mean missing out on certain ranking opportunities.