Implementing redirects when upgrading a site to SharePoint

Late last month we launched the best site I’ve ever had the pleasure of working on, Education + Training International (ETI). The sheer size and scope of the site, particularly everything that drives it behind the scenes, is something that everyone who has worked on it (and will continue to work on it) should be proud of. Having taken the site from inception to launch, I could fill this blog for a year with the hurdles we came across along the way. This post will touch on just one small aspect of it – ensuring that the URLs from the old site accurately redirected to those in the new.

I touched on this topic a couple of years ago in my post 301 vs 302 Redirects in SharePoint, so it was good to get the opportunity to revisit it, explore the options and implement a solution. This post is quite long, so I’ve broken it up into headings outlining what is being discussed – feel free to skip ahead to what interests you.

Requirements and Caveats

There were 2 main considerations for requiring the redirects (and we wanted all URLs redirected – there wasn’t a ridiculous number we’d need to handle – around 130 or so). First and perhaps most importantly was user experience. The last thing I wanted was for our users to be consistently hitting a generic 404 page. The other consideration was for Search Engine Optimisation – ETI had previously invested in SEO on their existing site and we did not want to lose those search engine gains which had been realised.

There was a caveat to the decision making process (isn’t there always). The infrastructure team was not prepared to take on the burden of managing the redirects and had reservations anyway around the feasibility of managing them at the reverse proxy level.

I also personally wanted to avoid any manual deployment steps both out of principle and for disaster recovery and future maintenance reasons – those kinds of steps are often lost over time as people leave and documentation becomes outdated and ignored.

Finally, ideally the process would be able to be managed down the track, preferably through a SharePoint list within the site.

Rejected Solution #1 – custom web part on PageNotFound

My first investigation centred around the concept of placing a custom web part on the 404 PageNotFound page (having easy, editable access to this page from within SharePoint is a great improvement over previous versions). The idea was that a list would hold the old URLs and their mappings to the new URLs. If the ?requestUrl= value found a match, we’d serve up a 301 redirect to the new URL, otherwise we’d provide the next best thing – perhaps a search within the site for some of the terms within the URL. The user experience was actually seamless with this approach, however a quick look at Fiddler showed that the 404 response was still served before my code hijacked it and served up the 301, meaning the search engines would assume the pages were gone. Back to the drawing board.

Rejected Solution #2 – storing URL rewrite mappings in a separate file

It was at this point that I gave up on my desire to have a user-editable list of redirects (perhaps prematurely). Our specific requirement was simply to redirect the old URLs to the new; future redirections were more of a nicety I was trying to provide. I knew a bit about managing redirects in IIS (I referenced Jeremy Thake’s post How we did it: 301 Redirects in IIS 7.5 on Windows Server 2008 R2 in my previous blog post) so figured I’d explore that track. I did have some reservations about blowing out the web.config, particularly in relation to the 250 KB file size cap, so I intended to use Ruslan’s Storing URL rewrite mappings in a separate file – I figured this would also alleviate the ‘administrative burden’ of managing them through IIS. The approach worked great – I could manage a set of old URL/new URL pairs in a separate configuration file, but to avoid the manual deployment steps I’d want it managed and deployed via our solution.

Unfortunately, that separate file had to exist alongside the web.config in the IIS virtual directory for the site. I wasn’t able to reference an absolute path and deploy my configuration file somewhere within the SharePoint layouts directory. I investigated the ability to Deploy Files to SharePoint Web Application Virtual Directories At Feature Activation via Brian Jackett, but the noted flaws in that system and the complexity around it quickly turned me off. Back to the drawing board.
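
For reference, the separate-file approach boils down to something like the sketch below. The file name rewritemaps.config is an assumption for illustration. The configSource attribute is the standard .NET configuration mechanism the technique relies on, and as far as I know it only accepts a path relative to the web.config, which is the limitation described above.

```xml
<!-- web.config, inside <system.webServer> -->
<rewrite>
  <!-- The map entries live in a separate file next to web.config -->
  <rewriteMaps configSource="rewritemaps.config" />
  <!-- The <rules> section stays in web.config as usual -->
</rewrite>

<!-- rewritemaps.config (hypothetical file name), deployed alongside web.config -->
<rewriteMaps>
  <rewriteMap name="Redirects">
    <add key="/eti-overview/profile-of-education-training-international.html" value="/about-eti" />
  </rewriteMap>
</rewriteMaps>
```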

Implemented Solution – writing URL rewrite mappings to web.config via feature

My next approach was to take a look at managing the redirects within the web.config itself. I did have concerns regarding blowing out the file and the maximum size limit for it – but tests soon proved that we wouldn’t even get close to that figure, so it wasn’t as much of a concern as I had made it out to be. So this approach worked fine too; the challenge was to take the manual administration/deployment steps out of the process. Enter Using SPWebConfigModification to Update the Web.config in SharePoint 2013. I think I’ve explained how to do that reasonably well in that post (which I’ve also now updated to include some new information) so I’ll jump straight into some code snippets you can use in combination with that post.

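            // Each modification below ensures one node of the <rewrite> section exists, working from the outside in.
            // webApp is the SPWebApplication whose web.config is being modified.
            SPWebConfigModification modification = new SPWebConfigModification();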
            modification.Path = "configuration/system.webServer";
            modification.Name = "rewrite";
            modification.Sequence = 0;
            modification.Owner = "ETIWebConfigModifications";
            modification.Type = SPWebConfigModification.SPWebConfigModificationType.EnsureChildNode;
            modification.Value = "<rewrite />";
            webApp.WebConfigModifications.Add(modification);

            modification = new SPWebConfigModification();
            modification.Path = "configuration/system.webServer/rewrite";
            modification.Name = "rewriteMaps";
            modification.Sequence = 0;
            modification.Owner = "ETIWebConfigModifications";
            modification.Type = SPWebConfigModification.SPWebConfigModificationType.EnsureChildNode;
            modification.Value = "<rewriteMaps />";
            webApp.WebConfigModifications.Add(modification);

            modification = new SPWebConfigModification();
            modification.Path = "configuration/system.webServer/rewrite/rewriteMaps";
            modification.Name = "rewriteMap";
            modification.Sequence = 0;
            modification.Owner = "ETIWebConfigModifications";
            modification.Type = SPWebConfigModification.SPWebConfigModificationType.EnsureChildNode;
            modification.Value = "<rewriteMap name='Redirects' />";
            webApp.WebConfigModifications.Add(modification);

            modification = new SPWebConfigModification();
            modification.Path = "configuration/system.webServer/rewrite/rewriteMaps/rewriteMap";
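            // For EnsureChildNode, Name should be an XPath that uniquely identifies this node under Path,
            // so SharePoint can find (and later remove) it rather than adding a duplicate key.
            modification.Name = "add[@key='/eti-overview/profile-of-education-training-international.html']";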
            modification.Sequence = 0;
            modification.Owner = "ETIWebConfigModifications";
            modification.Type = SPWebConfigModification.SPWebConfigModificationType.EnsureChildNode;
            modification.Value = "<add key='/eti-overview/profile-of-education-training-international.html' value='/about-eti' />";
            webApp.WebConfigModifications.Add(modification);

            modification = new SPWebConfigModification();
            modification.Path = "configuration/system.webServer/rewrite";
            modification.Name = "rules";
            modification.Sequence = 0;
            modification.Owner = "ETIWebConfigModifications";
            modification.Type = SPWebConfigModification.SPWebConfigModificationType.EnsureChildNode;
            modification.Value = "<rules />";
            webApp.WebConfigModifications.Add(modification);

            modification = new SPWebConfigModification();
            modification.Path = "configuration/system.webServer/rewrite/rules";
            modification.Name = "rule";
            modification.Sequence = 0;
            modification.Owner = "ETIWebConfigModifications";
            modification.Type = SPWebConfigModification.SPWebConfigModificationType.EnsureChildNode;
            modification.Value = "<rule name='Redirect rule1 for Redirects' />";
            webApp.WebConfigModifications.Add(modification);

            modification = new SPWebConfigModification();
            modification.Path = "configuration/system.webServer/rewrite/rules/rule";
            modification.Name = "match";
            modification.Sequence = 0;
            modification.Owner = "ETIWebConfigModifications";
            modification.Type = SPWebConfigModification.SPWebConfigModificationType.EnsureChildNode;
            modification.Value = "<match url='.*' />";
            webApp.WebConfigModifications.Add(modification);

            modification = new SPWebConfigModification();
            modification.Path = "configuration/system.webServer/rewrite/rules/rule";
            modification.Name = "conditions";
            modification.Sequence = 0;
            modification.Owner = "ETIWebConfigModifications";
            modification.Type = SPWebConfigModification.SPWebConfigModificationType.EnsureChildNode;
            modification.Value = "<conditions />";
            webApp.WebConfigModifications.Add(modification);

            modification = new SPWebConfigModification();
            modification.Path = "configuration/system.webServer/rewrite/rules/rule/conditions";
            modification.Name = "add";
            modification.Sequence = 0;
            modification.Owner = "ETIWebConfigModifications";
            modification.Type = SPWebConfigModification.SPWebConfigModificationType.EnsureChildNode;
            modification.Value = "<add input='{Redirects:{REQUEST_URI}}' pattern='(.+)' />";
            webApp.WebConfigModifications.Add(modification);

            modification = new SPWebConfigModification();
            modification.Path = "configuration/system.webServer/rewrite/rules/rule";
            modification.Name = "action";
            modification.Sequence = 0;
            modification.Owner = "ETIWebConfigModifications";
            modification.Type = SPWebConfigModification.SPWebConfigModificationType.EnsureChildNode;
            modification.Value = "<action type='Redirect' url='{C:1}' appendQueryString='false' />";
            webApp.WebConfigModifications.Add(modification);
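
For orientation, and assuming the modifications above apply cleanly, the resulting section of web.config should look roughly like the sketch below (SharePoint may serialise the attributes with different quoting). Because the action doesn’t set a redirectType, the URL Rewrite module should fall back to its default, which to my knowledge is a permanent (301) redirect.

```xml
<system.webServer>
  <rewrite>
    <rewriteMaps>
      <rewriteMap name="Redirects">
        <!-- One <add> per old URL; key is the old path, value is the new one -->
        <add key="/eti-overview/profile-of-education-training-international.html" value="/about-eti" />
      </rewriteMap>
    </rewriteMaps>
    <rules>
      <rule name="Redirect rule1 for Redirects">
        <match url=".*" />
        <conditions>
          <!-- Looks the requested URI up in the map; the rule only fires when a value is found -->
          <add input="{Redirects:{REQUEST_URI}}" pattern="(.+)" />
        </conditions>
        <!-- {C:1} is the value captured by the condition above, i.e. the new URL -->
        <action type="Redirect" url="{C:1}" appendQueryString="false" />
      </rule>
    </rules>
  </rewrite>
</system.webServer>
```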

The only concerns I had left were around the performance implications of managing so many redirects within IIS/the web.config, but they were soon allayed via How to check for performance in URL Rewriting and IIS Rewrite Module rewrite map performance.
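
The snippet above registers a single mapping. With 130 or so redirects to manage, the per-mapping modification lends itself to a loop. The helper below is a sketch under assumptions rather than the code used on ETI: the class and method names are hypothetical, it needs a reference to Microsoft.SharePoint.dll, and it uses an XPath predicate on the key attribute for Name so that each modification identifies its own node. The modifications still need to be saved and applied as described in the SPWebConfigModification post.

```csharp
using System.Collections.Generic;
using System.Security;
using Microsoft.SharePoint.Administration;

// Hypothetical helper: queues one EnsureChildNode modification per old/new URL pair.
public static class RedirectModifications
{
    private const string Owner = "ETIWebConfigModifications";
    private const string MapPath =
        "configuration/system.webServer/rewrite/rewriteMaps/rewriteMap";

    public static void AddRedirects(SPWebApplication webApp, IDictionary<string, string> redirects)
    {
        foreach (KeyValuePair<string, string> redirect in redirects)
        {
            // Escape XML special characters before embedding the URLs in attribute values.
            string oldUrl = SecurityElement.Escape(redirect.Key);
            string newUrl = SecurityElement.Escape(redirect.Value);

            SPWebConfigModification modification = new SPWebConfigModification();
            modification.Path = MapPath;
            // Assumes the old URL contains no apostrophes, which would break the XPath literal.
            modification.Name = string.Format("add[@key='{0}']", redirect.Key);
            modification.Sequence = 0;
            modification.Owner = Owner;
            modification.Type = SPWebConfigModification.SPWebConfigModificationType.EnsureChildNode;
            modification.Value = string.Format("<add key='{0}' value='{1}' />", oldUrl, newUrl);
            webApp.WebConfigModifications.Add(modification);
        }
    }
}
```

Keeping the pairs in a dictionary also means the list of redirects sits in one place in the solution, which makes later additions less error-prone than copying the block by hand.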

Was there a better option?

In the end I was pretty happy with the result. I managed to take the burden off the infrastructure team (activating a web-app scoped feature is no significant task), managed to take the manual administrative steps (such as deploying files to specific locations, which would need to be repeated if new WFEs were added to the farm) out of the process and provided a seamless experience for the users while maintaining good SEO practices. The only thing I didn’t achieve was providing the user the ability to manage the redirects, and I couldn’t help but wonder if that was possible.

Thankfully I have the pleasure of working with someone who’s seen it all before and had a chance to try a few different approaches to this problem over time! Faced with this same challenge (but at a much larger scale), his solution was to create a custom IIS module which inspected the 404 requests going to the redirect page and converted them to 301s. A custom web part on that page then did another 301 if it found an item in the old-new URL list, otherwise it wrote 404 to the response header for the nicely branded 404 error page.

The only downside to this approach I could see was the need to install the custom IIS module on each server – however, is that really worse than having to install the IIS rewrite module on each server? Probably not. The approach also had a significant unexpected upside – by monitoring the analytics on the 404 page they were able to identify legacy links which were still being used across the web, redirect those to a new URL on the fly, and provide ongoing functionality to redirect merged/renamed/moved pages on the ever-evolving web site.

So hopefully this post has given you a few ideas regarding how to implement your redirects when upgrading a legacy site into SharePoint. As always there are pros and cons to each approach but there are definitely options available to achieve a decent result to meet a number of different requirements.

Search Engine Optimisation (SEO) for SharePoint Sites – Part 2

In Search Engine Optimisation (SEO) for SharePoint Sites – Part 1 of this series I looked at the implications that the content, naming, metadata and structure of your SharePoint site can have on search rankings. This part of the series will focus more on how content can be manipulated to improve search rankings and other factors that can be leveraged from within SharePoint. Finally, Search Engine Optimisation (SEO) for SharePoint Sites – Part 3 will take a step outside the SharePoint box into how you can boost rankings and traffic to your SharePoint site using other methods.

Optimise the Load Time of your Pages

This one works on a couple of levels. Firstly, there is some benefit in terms of SEO for faster loading pages. Take a read of Geoff Kenyon’s article Site Speed – Are You Fast? Does it Matter for SEO?. The reality however is that it isn’t a huge factor. More important is that the time it takes to load your pages has a huge impact on bounce rates and on visitors wanting to return to your site, and for that reason it is a critical task to undertake. SharePoint, especially MOSS 2007, doesn’t have the fastest load times, so anything you can do to reduce the load time of your pages is paramount. I intend on writing a post on this in the future so I’ll leave it at that for now, but it’s definitely something to keep in mind when creating your site.

Maximise your Content to Markup Ratio

There are a number of articles available discussing the importance (or lack thereof) of maximising your content to markup ratio. Whether you believe it’s a factor or not is largely irrelevant – there are a number of benefits to gain from minimising the amount of markup on a page for both performance and structural reasons. It’s also important to ensure the text content appears as high on the page as possible, and this can be affected by manipulating the markup structure of the page. SharePoint is notoriously bad in this regard. Improvements were made from MOSS 2007 to SharePoint 2010 but it’s still not perfectly clean HTML. There are things you can do to improve the situation – using controls instead of web parts in MOSS will minimise the number of tables used on the page. Control Adapters can be used to manipulate the HTML rendered by controls. You can ensure your Master Pages and Page Layouts structure content higher on the page and adjust their location using custom CSS. Often a lot of effort is required when focussing on this SEO technique, so unless it’s of critical importance sometimes it’s best to do whatever you can and just live with the rest.

Optimise your Images and Anchor tags

Images and anchor tags are 2 elements within the page content that can be slightly adjusted to provide an extra SEO boost. Images provide the benefit of also being indexed in the Google Images search engine, which can provide extra traffic. ALT text is essential when optimising an image. It should be short, sharp, relevant and if possible keyword-rich. File names can also provide some extra juice and should be similar to the ALT text and hyphen-delimited. The Title attribute is unlikely to hold much weight but it can’t hurt and should match the ALT text. Anchor tags can also make use of the Title attribute. The other important aspect of anchor tags, particularly intra-site linking which you have control over, is to ensure the text which is hyperlinked is descriptive and keyword-rich – preferably matching the key phrase that the target page is optimised for. In terms of SharePoint there are 2 features you can utilise to reap the rewards – the ALT text field for images and the Tooltip field for hyperlinks. If you want to utilise the Title of an image you’ll need to delve into the code-behind, which may not be worth the effort.

Use 301 redirects over 302

As I mentioned in my post 301 vs 302 Redirects in SharePoint there are a number of reasons why 301 redirects are preferred to 302 from an SEO perspective. Rather than going over everything again I’d encourage you to read through that post and the associated links to understand the benefits and potential solutions around it. It’s significant in a SharePoint context because when accessing a site by its URL directly rather than the specific page URL, a 302 redirect is used to take you to the Welcome Page. The Redirect Page layout in SharePoint also uses 302 redirects.

Don’t Destroy Old Content

No matter how old a page, article or news item is, it still has the ability to appear in the search results and drive people to your site. In fact, the age of a site often has a positive effect on its ranking. Why would anyone want to simply discard all the effort that went into optimising the page in the first place because it was deemed somewhat out of date? Archives are a great and suitable alternative which maintain the content online and hence the rankings of the page. SharePoint makes this quite easy to do – simply filter a Content Query Web Part to return all out of date pages on an ‘Archive’ page and you’re set. The process can either be automated by scheduling an end date for the page, or by manually configuring a custom column of the page. You could also move it into an ‘Archived’ sub-site, however if this option is chosen, keep in mind the redirect that will be required and the fact that it will be a 302 redirect by default.

Avoid Flash and Silverlight if you can

It’s common knowledge that using Flash or Silverlight on a website in place of indexable content is a search ranking destroyer. SEO improvements have been made to the Flash format but they are still limited. With HTML5, CSS3 and jQuery taking off and able to achieve a lot of the rich interactive functionality provided by Flash and Silverlight, you’d have to question the use of them on your search engine optimised site. This goes for SharePoint as much as any other platform – leveraging the jQuery library in SharePoint in particular is easy to do and leaves the site more SEO friendly than a Flash or Silverlight equivalent.

Create an XML Sitemap Automatically

I’m not sold on the benefits of XML Sitemaps to be perfectly honest. I tend to agree with an article written by Matt McGee titled XML Sitemaps: The Most Overrated SEO Tactic Ever in that all an XML Sitemap is really doing is masking and potentially even causing problems. There are many more however that argue the opposite, such as Bruce Clay’s XML Sitemaps in SEO – Part 1. If you’re going to go down the XML Sitemap path, and I’m not going to categorically say it’s something you shouldn’t do, I’d suggest that you ensure that it is constructed automatically so it is completely up to date. In SharePoint one way this can be achieved is via Waldek’s Imtech XML Sitemap or Mavention XML Sitemap for 2010.
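
To give a feel for what “constructed automatically” involves, here is a minimal sketch that builds a sitemap document with LINQ to XML. It is not how the Imtech or Mavention solutions work internally; the class name, the example.com URL and the way pages are passed in are all assumptions, and in SharePoint the list of pages would come from querying the Pages libraries (not shown).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Xml.Linq;

// Hypothetical helper: turns a set of page URLs and modified dates into a sitemap document.
public static class SitemapBuilder
{
    // Namespace defined by the sitemaps.org protocol.
    private static readonly XNamespace Ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // pages: absolute page URL mapped to its last modified date.
    public static XDocument Build(IEnumerable<KeyValuePair<string, DateTime>> pages)
    {
        XElement urlset = new XElement(Ns + "urlset");
        foreach (KeyValuePair<string, DateTime> page in pages)
        {
            urlset.Add(new XElement(Ns + "url",
                new XElement(Ns + "loc", page.Key),
                new XElement(Ns + "lastmod",
                    page.Value.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture))));
        }
        return new XDocument(new XDeclaration("1.0", "utf-8", null), urlset);
    }

    public static void Main()
    {
        // Illustrative data only.
        Dictionary<string, DateTime> pages = new Dictionary<string, DateTime>();
        pages.Add("http://www.example.com/about-eti", new DateTime(2013, 1, 15));

        // Writes the generated sitemap XML to the console.
        Build(pages).Save(Console.Out);
    }
}
```

Running something like this from a timer job or on page publish is what keeps the file in step with the site, which is the whole point of generating it rather than maintaining it by hand.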

Include a Robots.txt File

Having a Robots.txt file for your site is not so much about improving the rankings of pages in your site but more about ensuring pages you don’t want appearing in search results aren’t indexed. For more information about Robots.txt have a read of Robots.txt: All you need to know. This is particularly important for SharePoint because often there are a number of pages you simply wouldn’t want indexed – list views and the like – which can sometimes end up being crawled. I’d definitely suggest including this file for your SharePoint site as part of your overall SEO strategy.
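
As an illustration only, a Robots.txt for a SharePoint publishing site might start out like the sketch below. The paths are common SharePoint system locations but are assumptions here, so check them against the URLs you actually see being crawled before blocking anything, particularly if your branding files are served from one of these folders.

```text
# Applies to all crawlers
User-agent: *
# Application pages and web services
Disallow: /_layouts/
Disallow: /_vti_bin/
# Galleries such as master pages and web parts
Disallow: /_catalogs/
# List views and list forms
Disallow: /Lists/
```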

Structure your Content with Heading Tags

Using header tags (H1 through to H6) to structure content is another tool at your disposal to confer importance on particular terms and phrases. Header tags are easily applied in SharePoint via the content editor so it’s important to stress 2 main concepts. Firstly – use them only for the keywords or phrases. Too often heading tags are used unnecessarily throughout the page or on sub-headings which really convey no SEO benefit to the page. Secondly – use them for the keywords and phrases rather than styling DIVs or SPANs to achieve the same visual effect – it’s the tag which is recognised, not the size of the text on the page. You can have the desired visual effect on the page by using heading tags where appropriate and styling other text with CSS if the term or phrase is not a targeted keyword for the page.

Externalise your JavaScript

This one ties in to maximising your content to markup ratio – the less text on the page with no SEO benefit the better. There are also performance trade-offs however – you don’t want to be creating a bunch of extra page requests to pull down each individual JS file that contains your page’s code. There are also development implications – sometimes it’s easier to code the relevant JavaScript directly within a control rather than having to find it and work on it separately. Ultimately it comes down to what is more important to you. In a SharePoint context I’d recommend using one file to hold all JavaScript functions that will need to be called from various pages, reference it via the SharePoint:ScriptLink control and make the call to the function from within your control. This minimises the amount of JavaScript text on the page and minimises the page requests.

Add ‘Strength’ to Keywords

This one I think would be extremely minor if it has any influence at all – but I guess you never want to miss an opportunity so it’s something worth discussing. Take a read of Traian Neacsu’s article Bold or Strong Tag and SEO – Complete HTML Reference Guide for SEO for more information on the topic. The main takeaway is that styling via CSS won’t have any effect, while bolding keywords with the STRONG element may be recognised. It’s something worth considering anyway.

In Search Engine Optimisation (SEO) for SharePoint Sites – Part 3 of this series I’ll take a step outside the SharePoint box into how you can boost the rankings and traffic to your SharePoint site using other methods.

301 vs 302 Redirects in SharePoint

Recently I was faced with the request to implement some permanent 301 redirects for a publicly facing SharePoint website. It was more of an investigative request than a call for action and as yet there has been no resolution, so rather than presenting a solution to the issue, this post will serve more as a discussion around it.

Rather than focus the article on the differences between a 301 and 302 redirect I’d rather point you in the direction of some existing resources that explain the details better than I could myself. The article Redirects: Permanent 301 vs. Temporary 302 does a good job in doing so.

The main concern around the 301 vs 302 redirect in SharePoint is 2-fold. Predominantly, there is a lot of information floating around the net in regards to the SEO implications of the 2 redirect techniques. To cut a long story short, 301 redirects are generally preferred from an SEO standpoint to 302 redirects. The problem being, SharePoint uses 302 redirects for its ‘Redirect Page’ layout. You can get a more in depth explanation of this in Jeff Cate’s post MOSS for Internet and 301 Redirects.

Those wanting to implement 301 redirects rather than the default 302 redirects are left with 2 main options. Firstly, IIS can be used to rewrite the URLs (which serves as a 301 redirect). This can be done natively via an add-on in IIS7 and with a 3rd party add-on in IIS6. Secondly, the option exists to write a custom HttpModule to perform the 301 redirect as per Waldek Mastykarz’s post on Using 301 instead of 302 redirects. Waldek’s post focusses more on another MOSS 302 redirect phenomenon than on the Redirect Page layout, but the concepts are still applicable.
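
To make the HttpModule option a little more concrete, here is a minimal sketch of the general technique. It is not Waldek’s code: the class name and the hard-coded mapping are hypothetical, and the module would still need to be registered in the web.config of the web application (not shown). It sets the status and Location header by hand so that it doesn’t rely on anything newer than the .NET version MOSS runs on.

```csharp
using System;
using System.Collections.Generic;
using System.Web;

// Hypothetical module: answers requests for known old paths with a 301 instead of a 302.
public class PermanentRedirectModule : IHttpModule
{
    // Illustrative mapping of old path to new path; a real solution would load this from somewhere editable.
    private static readonly Dictionary<string, string> Redirects =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "/old-page.aspx", "/new-page.aspx" }
        };

    public void Init(HttpApplication application)
    {
        // Inspect every request before SharePoint starts processing it.
        application.BeginRequest += OnBeginRequest;
    }

    private static void OnBeginRequest(object sender, EventArgs e)
    {
        HttpApplication application = (HttpApplication)sender;
        HttpContext context = application.Context;

        string target;
        if (Redirects.TryGetValue(context.Request.Url.AbsolutePath, out target))
        {
            // Write the permanent redirect by hand and skip the rest of the pipeline.
            context.Response.StatusCode = 301;
            context.Response.StatusDescription = "Moved Permanently";
            context.Response.AddHeader("Location", target);
            application.CompleteRequest();
        }
    }

    public void Dispose()
    {
        // Nothing to clean up.
    }
}
```

Swapping the hard-coded dictionary for a cached read of a SharePoint list is where the extra complexity (and the maintainability concern) comes in.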

I have a bit of an issue with both solutions. I should be attracted to the custom code option being a developer by trade, but from a complexity and maintainability point of view it doesn’t really appeal to me. It may be possible to architect a solution whereby the redirects are maintained in a SharePoint list, improving the function’s extensibility, and that’s probably the angle I would have looked into had time allowed, but it still seems like a bit of an overkill considering the IIS solution exists. The problem with the IIS solution however is (aside from it not being native in IIS6, which was being used in my instance) that maintenance would need to be conducted on the server by IT rather than by the site owners.

Jeff posed the question in his post ‘Will it be in SharePoint 2010?’ – unfortunately, the answer is no. Waldek points out in a follow up post Using 301 instead of 302 (without code!) that they can be handled in IIS7 (required for SharePoint 2010) natively, which is true and how Jeremy Thake at NothingButSharePoint implemented their 301 Redirects in IIS7.5, but doesn’t solve the issue of giving access to site owners to perform the function. You really don’t want this administrative overhead placed on IT.

To be honest in my opinion we seem to be left with no ideal option. It would be nice (but still not ideal) if 301 redirects could be set and maintained via Central Administration (at least ensuring server access would not be necessary). It would be even better if they could be set and maintained in the Site Settings of the relevant site. Better still, an option on the redirect page to indicate whether it should be a 301 or 302 redirect would be perfect.

For now we’re left with the options presented. Having a personal interest in SEO myself it would be nice to see future versions of SharePoint tailored towards this goal in any way possible, and making it easier to perform 301 redirects out of the box would be one way of getting there. Assuming SharePoint vNext is based on .NET 4.0 we should be a lot more likely to have this feature considering the 301 Redirects in .NET 4.0.

If it so happens that this piece of work is given the green light I’ll post a follow up on how the solution was achieved.