Monthly Archive » November 2014

Why Did Google Not Crawl Our New Pages? – Format order matters in respond_to blocks!

Hello All,

So we bumped into a very interesting problem over the last 2 weeks. We rolled out our spanking-new Festival pages, which we thought would let our users quickly grab the best deals, coupons and forum discussions for all the festivals that happen in India all year round, right from GOSF to Diwali to smaller festivals like the Children's Day Sale.

http://www.desidime.com/festivals

http://www.desidime.com/festivals/gosf

We thought it was cool, and our users liked it too, but Google decided to remove that page from its index. Our SEO expert Suhan Shukla did everything he could to get that page into Google’s index, but Google kept ignoring it.

Today, Suhan found a solution. He came to a logical conclusion about why Google was ignoring the page: it turned out that Googlebot was seeing the page very differently from what we were seeing on desktop.

The “Aah” Moment:

That's some JavaScript code… There is definitely a hell of a lot of content on that page, so why does Google not see it?

This curl command gave more hints:


curl -D - -s -A 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)' http://www.desidime.com/festivals/diwali
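The same check can be sketched in Ruby with the standard library, which is handy if you want to script it. This is only a rough sketch: the helper names `googlebot_request` and `fetch_as_bot` are made up for illustration, and the `Accept: */*` header is set explicitly to mirror what curl sends by default.

```ruby
require "net/http"
require "uri"

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Build a GET request that mimics the curl call above:
# Googlebot's User-Agent plus curl's default "Accept: */*" header.
# (Hypothetical helper, not part of any library.)
def googlebot_request(url)
  request = Net::HTTP::Get.new(URI(url))
  request["User-Agent"] = GOOGLEBOT_UA
  request["Accept"] = "*/*"
  request
end

# Send the request and return [status code, Content-Type header].
# (Hypothetical helper; it performs a real HTTP request when called.)
def fetch_as_bot(url)
  uri = URI(url)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(googlebot_request(url))
  end
  [response.code, response["Content-Type"]]
end

# Example usage:
#   status, content_type = fetch_as_bot("http://www.desidime.com/festivals/diwali")
#   puts "#{status} #{content_type}"
```

If the Content-Type that comes back is a JavaScript type rather than text/html, the bot is not getting the HTML page.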

 

When we ran the same curl command against localhost, the problem became clearer.

class FestivalsController < ApplicationController
  def show
    @festival = Festival.find(params[:id])
    respond_to do |format|
      format.js { render "tags/show"}
      format.html
    end
  end
end

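If you look closely at the respond_to block, format.js appears before format.html. Bots crawl pages without asking for a particular format (curl, for instance, sends Accept: */* by default), and in that case Rails goes with the first format declared in the block, so only the JS gets rendered.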
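To make the effect of the ordering concrete, here is a tiny plain-Ruby model of the selection. It is a simplified sketch and not Rails' actual implementation: `pick_format` and `MIME_TO_FORMAT` are hypothetical names, and the browser header is cut down to two entries.

```ruby
# Simplified model of format negotiation, NOT Rails' real code.
# accepted: MIME types from the Accept header, in priority order
# declared: formats in the order they appear in the respond_to block
MIME_TO_FORMAT = {
  "text/html" => :html,
  "text/javascript" => :js,
  "application/javascript" => :js
}.freeze

def pick_format(accepted, declared)
  accepted.each do |mime|
    # "*/*" means "anything will do", so the first declared format wins.
    return declared.first if mime == "*/*"

    format = MIME_TO_FORMAT[mime]
    return format if declared.include?(format)
  end
  nil # nothing the client accepts was declared
end

bot_accept = ["*/*"]                  # what curl sends by default
browser_accept = ["text/html", "*/*"] # simplified browser header

pick_format(bot_accept, [:js, :html])     # => :js   (js declared first)
pick_format(bot_accept, [:html, :js])     # => :html (html declared first)
pick_format(browser_accept, [:js, :html]) # => :html (order does not matter here)
```

This is why the page looked fine in a browser but not to the bot: a client that names text/html gets HTML regardless of order, while a client that accepts anything gets whatever is listed first.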

The Correct Way:

class FestivalsController < ApplicationController
  def show
    @festival = Festival.find(params[:id])
    respond_to do |format|
      format.html
      format.js { render "tags/show"}
    end
  end
end

And yeah, this still behaves the same way in Rails 4, so be careful and always give html the first priority so everyone on the web (including the bots) can read your pages!

BIG thanks to Suhan for digging in and finding the problem using the “Fetch as Google” tool in Google Webmaster Tools.