In a CMS application it is very common for authors to add, move and delete pages. On a busy, popular website this can happen often, and it can have a negative effect on how search engines crawl and index your site. It is important to help search engines understand these changes so that they index your website correctly.
Every HTTP response carries a status code that gives us useful information. The most common are 200 (all is OK), 301 (moved permanently), 404 (page/file not found) and 500 (internal server error). For more information on HTTP status codes, check the Wikipedia article List of HTTP status codes. There are 5 groups of status codes, each with a specific context:
- 1xx Informational
- 2xx Success
- 3xx Redirection
- 4xx Client Error
- 5xx Server Error
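As a quick illustration of the grouping, here is a minimal sketch in Python (chosen only so it can be run standalone, outside ASP.NET). The helper name status_group is made up for this example; the group is simply read off the first digit of the code.

```python
# Hypothetical helper (not part of ASP.NET): map a status code to its group.
STATUS_GROUPS = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_group(code: int) -> str:
    """Return the group name for an HTTP status code, based on its first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_GROUPS[code // 100]


# The four codes this article cares about
for code in (200, 301, 404, 500):
    print(code, status_group(code))
```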
Using status codes we can help search engines understand what is going on with our pages. To make our ASP.NET application more SEO friendly with regard to content management, this article investigates the following cases:
- Missing pages and 404 status code.
- Application errors and 500 status code.
- Relocated pages and 301 status code.
By default, the first two cases are handled by IIS. If a request is for a page that doesn't exist, the web server returns the 404 status code. Likewise, if your application throws an error and the "Server Error in '/' Application" page appears, the status code of the response is 500. As far as search engines are concerned all is OK, and there is not much to be done here. But for your visitors it would be nice to see something more informative than the ugly error page or the IIS not-found message.
There are a few options in Web.config you can use to handle these issues. For example, you can do the following:
    <customErrors redirectMode="ResponseRewrite" defaultRedirect="error.aspx">
      <error statusCode="404" redirect="not-found.aspx"/>
    </customErrors>
We set a default error.aspx page to appear when an error occurs. For the 404 status code we want to handle the case differently, so we set a different page to appear for missing pages. The above setup works really well (I really like the option redirectMode="ResponseRewrite", because the URL doesn't change), but if you check the response headers, the status code of the page is 200. This behavior will confuse search engines, since they will treat these special pages as regular pages. 500 errors are something we must eliminate, so they are usually not permanent (once the bugs are fixed…). But for pages that have been removed, the 404 status code is important: it tells search engines to stop crawling and indexing those pages.
Note: If you have an ASP.NET MVC project and follow this approach, you might need to comment out the following line in your ~/App_Start/FilterConfig.cs file:
    filters.Add(new HandleErrorAttribute());
Setting the status code of the page that loads is very easy in ASP.NET. In the Page_Load(…) event of error.aspx and not-found.aspx you can set the matching status code (500 and 404 respectively). For not-found.aspx it looks like this:
    Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
        'The line below is required if you get IIS error code pages
        'when setting the status code programmatically
        Response.TrySkipIisCustomErrors = True
        Response.StatusCode = 404
        'Continue with loading the not-found page
    End Sub
This simple approach gives the desired behavior in your ASP.NET application. On the not-found page you could put a search form and suggest that your visitor use it (like WordPress does!).
Update: I noticed that on some servers, when you set the status code programmatically, the IIS default error pages appear. To overcome this, set the TrySkipIisCustomErrors property to True.
Finally, when you move your pages (e.g. change the URL that a page loads from), the old URL is probably still out there. With our previous approach, the application would show the not-found.aspx page for it. If we have a way to tell that the missing page is a page that has moved, we would like to redirect our users to the new URL. This is important for website statistics and for social media. But search engines don't like arbitrary redirections. The legitimate way to tell them that a page has moved for good is the 301 (moved permanently) status code.
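    Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
        ...
        If MustRedirect Then
            'Response.Redirect always sends a 302, even if StatusCode was set
            'to 301 first, so use RedirectPermanent (ASP.NET 4.0 and later),
            'which sends a real 301 with the Location header
            Response.RedirectPermanent("new-url-for-page.aspx")
        End If
    End Sub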
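How MustRedirect gets its value is left open above. One possible way to decide, sketched here in Python so it can be run standalone, is a lookup table of moved pages. Everything in it is an assumption made for illustration: the MOVED_PAGES table, the old-url-for-page.aspx entry, the resolve_missing_page name, and the case-insensitive matching. In a real CMS the table would more likely live in the database and be updated whenever an author moves a page.

```python
from typing import Optional, Tuple

# Hypothetical redirect table: old path -> new path.
MOVED_PAGES = {
    "/old-url-for-page.aspx": "/new-url-for-page.aspx",
}


def resolve_missing_page(path: str) -> Tuple[int, Optional[str]]:
    """Decide how to answer a request for a page that no longer exists at `path`.

    Returns (301, new_path) when the page is known to have moved,
    otherwise (404, None) so the not-found page is served.
    """
    # Assumption: old URLs are matched case-insensitively.
    new_path = MOVED_PAGES.get(path.lower())
    if new_path is not None:
        return 301, new_path
    return 404, None
```

The same shape carries over to the not-found page: look the requested path up first, issue the permanent redirect on a hit, and only fall through to the 404 response on a miss.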
The Page_Load redirect above makes your page redirects search engine friendly. But if you check the response status of the final page, it will be 200. To check that the 301 code is actually applied, you can use a website that offers this analysis, e.g. Redirect Checker, or the Live HTTP Headers Firefox plugin.
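If you prefer to check from a script, the key point is to look at the first response without following the redirect. Below is a minimal sketch using only the Python standard library; the function name fetch_status is made up for this example, and any URL you pass to it is yours to supply.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url: str, timeout: float = 10.0):
    """Request `url` once, WITHOUT following redirects.

    Returns (status_code, location_header_or_None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()  # drain the body so the connection closes cleanly
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

Calling fetch_status on the old URL of a moved page should give 301 together with the new location, and calling it on a URL that never existed should give 404 rather than 200 if the not-found page sets its status code as described earlier.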