Optimizing an AngularJS-driven website for search engines

I was building a static, public-facing website which didn’t need any dynamic functionality except for rendering views and layouts. So I figured, why not try one of the new JavaScript MVC frameworks such as AngularJS, which can do essentially the same thing without the need for a server-side framework?

After I was done building it, I realized it wasn’t going to work well with search engine crawlers and bots, which don’t play well with content rendered via JavaScript/AJAX. I did some research online but couldn’t find any good solution besides installing PhantomJS on the server, which you can set up to generate HTML snapshots of your site. That felt like overkill for such a simple site.

So here’s what I did. Consider it a poor man’s SEO fix. First, I added this meta tag to tell search engines that the site is AJAX-driven:

<meta name="fragment" content="!">

That makes the search engine request an alternate URL for your content. So for example, instead of going to http://www.mysite.com/about, it will go to http://www.mysite.com/?_escaped_fragment_=about
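To make that mapping concrete, here is a small JavaScript sketch. It assumes the site's links use the hashbang form (for example http://www.mysite.com/#!about), which is the URL style the _escaped_fragment_ scheme was designed around. The function name toCrawlerUrl is made up for illustration, and encodeURIComponent escapes more characters than the scheme strictly requires, so treat it as an approximation rather than an exact copy of what a crawler does.

```javascript
// Hypothetical helper: approximate the URL a crawler requests for a "#!" URL.
function toCrawlerUrl(url) {
  var marker = url.indexOf('#!');
  if (marker === -1) {
    return null; // not a hashbang URL, nothing to rewrite in this sketch
  }
  var base = url.slice(0, marker);        // everything before "#!"
  var fragment = url.slice(marker + 2);   // everything after "#!"
  // Append to an existing query string if there is one
  var separator = base.indexOf('?') === -1 ? '?' : '&';
  return base + separator + '_escaped_fragment_=' + encodeURIComponent(fragment);
}

console.log(toCrawlerUrl('http://www.mysite.com/#!about'));
```

Running this logs http://www.mysite.com/?_escaped_fragment_=about, which is the kind of URL the next step needs to handle.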

The next thing we’ll need to do is modify .htaccess to handle URLs with _escaped_fragment_ in them:

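DirectoryIndex index.html
RewriteEngine On

# Crawler requests for the site root go to crawler.php.
# The query string (including _escaped_fragment_) is passed along as-is.
RewriteCond %{QUERY_STRING} ^_escaped_fragment_=(.*)$
RewriteRule ^$ /crawler.php [QSA,L]

# Anything else that is not a real file or directory is handed to the AngularJS app.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !index
RewriteRule (.*) index.html [L]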
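If the rewrite rules are hard to read, the decision they make can be modelled in a few lines of JavaScript. This is only a simplified model for reasoning about the rules, not something you deploy: the function name route and its existsOnDisk parameter are made up, and real Apache behavior has more moving parts.

```javascript
// Simplified model of the .htaccess rules above (illustration only).
// path:         request path, e.g. "/" or "/about"
// query:        raw query string without the leading "?"
// existsOnDisk: true if the path is a real file or directory on the server
function route(path, query, existsOnDisk) {
  // Rule 1: root URL with an _escaped_fragment_ query goes to the crawler script
  if (path === '/' && /^_escaped_fragment_=(.*)$/.test(query)) {
    return '/crawler.php';
  }
  // Rule 2: unknown paths (that do not contain "index") go to the AngularJS app
  if (!existsOnDisk && path.indexOf('index') === -1) {
    return '/index.html';
  }
  // Otherwise Apache serves the file or directory as usual
  return path;
}

console.log(route('/', '_escaped_fragment_=about', true));
console.log(route('/about', '', false));
console.log(route('/shared/img/work.jpg', '', true));
```

The three calls log /crawler.php, /index.html and /shared/img/work.jpg, in that order.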

As you’ll notice, I’m sending all of those requests to a special PHP script (crawler.php) created just for the crawlers/bots:

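<?php
// Slug requested by the crawler, e.g. "about" for ?_escaped_fragment_=about
$request = isset($_GET['_escaped_fragment_']) ? $_GET['_escaped_fragment_'] : '';
$jsonurl = "./shared/data/pages.json";
$json = file_get_contents($jsonurl);
$json_output = json_decode($json);

// Defaults, so the template below never echoes an undefined variable
$title = $desc = $keywords = $image = $url = '';

// Escape a value before it is echoed into the HTML below
function esc($value)
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

if (is_array($json_output))
{
    foreach ($json_output as $page)
    {
        if ($page->slug == $request)
        {
            $title = esc($page->title);
            $desc = esc($page->description);
            $keywords = esc($page->tags);
            $image = esc($page->thumb);
            $url = esc("http://".$_SERVER['HTTP_HOST']."/".$request);
            break;
        }
    }
}
?>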
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title><?php echo $title;?></title>
  <meta http-equiv="content-type" content="text/html; charset=utf-8" />
  <meta name="title" content="<?php echo $title;?>" />
  <meta name="description" content="<?php echo $desc;?>" />
  <meta name="keywords" content="<?php echo $keywords;?>" />

  <meta property="og:url" content="<?php echo $url; ?>" />
  <meta property="og:site_name" content="My Site" />
  <meta property="og:type" content="website" />
  <meta property="og:title" content="<?php echo $title;?>" />
  <meta property="og:image" content="<?php echo $image; ?>" />
  <meta property="og:description" content="<?php echo $desc;?>" />
</head>
<body>
<!-- Optionally make this body content dynamic to comply with Google's TOS -->
</body>
</html>

That script looks up the requested URL in a JSON file and reads the meta tags for that page from it. Here’s an example:

[
    {
        "slug": "work",
        "title": "Work",
        "thumb": "/shared/img/work.jpg",
        "description": "Work bla bla",
        "tags": "some, keywords, go, here"
    },
    {
        "slug": "about",
        "title": "About",
        "thumb": "/shared/img/about.jpg",
        "description": "About bla bla",
        "tags": "some, keywords, go, here"
    },
    {
        "slug": "services",
        "title": "Services",
        "thumb": "/shared/img/services.jpg",
        "description": "Services bla bla",
        "tags": "some, keywords, go, here"
    }
]
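Since crawler.php matches pages on slug and echoes the other fields straight into the meta tags, a missing field or a duplicated slug fails silently. Here is a rough sanity check you could run over the file, for example with Node.js. The validatePages function and the REQUIRED_FIELDS list are made up for this sketch; the field names come from the example above.

```javascript
// Fields that crawler.php reads from each entry in pages.json.
var REQUIRED_FIELDS = ['slug', 'title', 'thumb', 'description', 'tags'];

// Hypothetical helper: return a list of human-readable problems (empty if none).
function validatePages(pages) {
  var problems = [];
  var seenSlugs = new Set();
  pages.forEach(function (page, index) {
    // Every field must be a non-empty string
    REQUIRED_FIELDS.forEach(function (field) {
      if (typeof page[field] !== 'string' || page[field] === '') {
        problems.push('entry ' + index + ': missing "' + field + '"');
      }
    });
    // Slugs must be unique, otherwise only the first match is ever used
    if (page.slug && seenSlugs.has(page.slug)) {
      problems.push('entry ' + index + ': duplicate slug "' + page.slug + '"');
    }
    seenSlugs.add(page.slug);
  });
  return problems;
}

var sample = [
  { slug: 'work', title: 'Work', thumb: '/shared/img/work.jpg', description: 'Work bla bla', tags: 'some, keywords' },
  { slug: 'work', title: '', thumb: '/shared/img/about.jpg', description: 'About bla bla', tags: 'some, keywords' }
];
console.log(validatePages(sample));
```

For the sample above it reports the empty title and the duplicated "work" slug on the second entry. To check the real file, parse pages.json with JSON.parse and pass the resulting array in.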

As a bonus, you could program AngularJS to read the meta tags for each page from this same JSON file, so that all the meta information lives in one central place.
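Reading the same JSON file from AngularJS could look roughly like the sketch below. Several things here are assumptions rather than part of the original setup: the module name "app", the use of ngRoute (for the $routeChangeSuccess event), the file being reachable at /shared/data/pages.json, and the helper names findPageMeta and slugFromPath.

```javascript
// Hypothetical helper: find the entry whose slug matches the current route.
function findPageMeta(pages, slug) {
  return pages.find(function (page) { return page.slug === slug; }) || null;
}

// Hypothetical helper: turn a path such as "/about" into the slug "about".
function slugFromPath(path) {
  return path.replace(/^\/+|\/+$/g, '');
}

// AngularJS wiring; skipped when AngularJS is not loaded (e.g. under Node.js).
// Assumes an existing module named "app" that uses ngRoute.
if (typeof angular !== 'undefined') {
  angular.module('app').run(['$rootScope', '$http', '$location',
    function ($rootScope, $http, $location) {
      // Load the JSON once and reuse the promise on every route change
      var pagesPromise = $http.get('/shared/data/pages.json').then(function (response) {
        return response.data;
      });
      $rootScope.$on('$routeChangeSuccess', function () {
        pagesPromise.then(function (pages) {
          // Expose the current page's metadata to the templates
          $rootScope.meta = findPageMeta(pages, slugFromPath($location.path())) || {};
        });
      });
    }]);
}
```

With ng-app placed on the html element, the head of index.html can then bind to the values, for example with ng-bind="meta.title" on the title tag.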
