Build your own WordTracker

Written by Greg Winiarski on May 23, 2008 in General, Must Reads - 8 Comments

Recently I was browsing the WordTracker website and started to wonder: how do they get keyword statistics that no one else has? So I did a little research and found out that “All search terms are collected from the major metacrawlers – Dogpile and Metacrawler”.

I went to the Dogpile website, but at first sight there seemed to be no statistics I would be interested in. Of all the links there, only one caught my eye: Search Spy. I launched it.

It turned out to be a Flash application which displays what people are currently searching for through Dogpile. I cannot pull content directly out of a Flash application, that is impossible. On the other hand, the Flash application cannot come up with this data on its own either, so there has to be some kind of third-party application which delivers the keywords to it.

Technically speaking, the Flash app is an application running on my PC, because it was downloaded along with the HTML content and is executed in the web browser. So if it is getting any data from some hidden third party (and I am sure it is), then it MUST be doing so over HTTP.

Fortunately I have the Live HTTP Headers plugin installed in Firefox, so I opened it, and after a few seconds it showed a request to an inner page on Dogpile. As you can see, it is an XML page with a list of queries that were recently made on Dogpile. So is this where WordTracker is getting their keywords from? I do not know. Maybe they have some other agreement with Dogpile and Metacrawler, but all in all … I do not care and I am not going to investigate that any further.
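For anyone who wants to read such a page from a script instead of the browser, here is a rough Python sketch. The post does not give the real address of the XML page or its element names, so `FEED_URL` and the `query` tag are placeholders I made up; check what Live HTTP Headers shows you and adjust both.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Placeholder only: the real address of the XML page is not given in this post.
FEED_URL = "http://example.com/searchspy.xml"


def parse_queries(xml_text, tag="query"):
    """Return the stripped text of every <tag> element in an XML document.

    The tag name "query" is an assumption about the feed's structure.
    """
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter(tag) if el.text and el.text.strip()]


def fetch_queries(url=FEED_URL, tag="query"):
    """Download the XML page over HTTP and extract the queries from it."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return parse_queries(resp.read(), tag)


if __name__ == "__main__":
    # Made-up sample document, used here so nothing is requested from the network.
    sample = "<queries><query>cheap flights</query><query>php tutorial</query></queries>"
    print(parse_queries(sample))
```

Parsing is kept separate from downloading so you can try `parse_queries` on a saved copy of the page before pointing `fetch_queries` at a live address.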

After 15 seconds the same request was made again, to the same Dogpile page, to get a list of new queries. I am not sure, but they are probably requesting this page periodically: the interval between requests stays the same, and it is logical to keep making requests to pick up new data, which is added every second.
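The same periodic request can be imitated with a small polling loop. This is only a sketch in Python: `poll` and its parameters are names I made up, `fetch` stands for whatever function returns the current list of queries, and the counting assumes each response contains only new queries (if responses overlap, the totals will be too high).

```python
import time
from collections import Counter


def poll(fetch, interval=15.0, rounds=4, sleep=time.sleep):
    """Call fetch() `rounds` times, pausing `interval` seconds between calls.

    fetch is any callable returning a list of query strings.
    Returns a Counter mapping each query to how many times it was seen.
    """
    counts = Counter()
    for i in range(rounds):
        counts.update(fetch())
        if i < rounds - 1:  # no need to wait after the last request
            sleep(interval)
    return counts


if __name__ == "__main__":
    # Stand-in for a real HTTP request: three canned batches of made-up queries.
    batches = iter([["cheap flights", "php tutorial"], ["php tutorial"], ["weather"]])
    totals = poll(lambda: next(batches), interval=0, rounds=3)
    for query, n in totals.most_common():
        print(n, query)
```

The 15 second default matches the interval I saw between requests; passing `sleep` as a parameter just makes the loop easy to try out without actually waiting.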

So now you know where to get keywords like WordTracker's. You can do the same investigation for MetaCrawler (they are also using Search Spy), find their “secret” XML file with fresh queries, and then maybe give WordTracker a little competition? :)

About the Author

Greg Winiarski is a freelance PHP and JavaScript programmer. He specializes in web applications and WordPress development.

8 Comments on "Build your own WordTracker"

  1. Greg August 22, 2008 at 7:13 pm ·

    Wow, it looks like they really made it complicated. Currently I have no other idea how to get it working again; I will have to dig deeper into their HTML.

    However, one thing is sure: if an HTML page can get those results, then so can any other application.

    I am a bit busy right now, but when I have some time I will take a look at it and probably write another post on this topic :)

  2. John September 29, 2008 at 12:29 pm ·

    Hi Greg,

    Love your blog and have learned quite a bit. I now wish to code my “first wordpress plugin” and would love it to be a “Wordtracker Plugin”… so I’m wondering if you got any further with the HTML to capture the keywords…

    I had wanted to write a Firefox plug-in but haven’t managed to find a good tutorial on that, hint hint!!

  3. Greg September 29, 2008 at 5:04 pm ·

    Hi John,
    I have not worked on it since they changed their system.

    However, I am still going to do that. I will start writing some quality posts when I finally get my degree ;> It should be done within two or three weeks.

    As for the Firefox plugin, there are some good resources on this topic, but maybe I will consider writing one myself.

  4. Joey September 29, 2008 at 6:48 pm ·

    Greg, I, too, would love to see an update on the Dogpile SearchSpy situation. There is a script floating around that a guy built for Linux systems, but that means nothing to me. I look forward to your update!

  5. Greg October 3, 2008 at 5:06 pm ·

    Hi Joey, I will do my best to get this data out of DogPile :)

  6. John October 3, 2008 at 5:58 pm ·

    He he …. I found that script too, and moved on! It meant nothing to me, but just in case you haven’t seen it, Greg, the link is

    http://www.elifulkerson.com/projects/save-results-from-dogpile-search-spy.php

    And many thanks for replying – it seems increasingly rare these days….

  7. Greg October 3, 2008 at 9:00 pm ·

    Ha! I got them. I do not have exact code yet, but the XML data can easily be retrieved in Firefox with the Live HTTP Headers plugin.

    Just go to the SearchSpy page, launch Live HTTP Headers and wait until SearchSpy asks for data (do not do anything in other tabs until something shows up in the Live HTTP Headers window).

    When you get some headers, click the Replay button and a new window will open. In it, click Replay again and you will be taken to the page with the XML results.

    Writing software to retrieve this data will be a bit hard, because they are using JSON, cookies and probably some other stuff to make data harvesting really complicated.

    @John thanks for the link, but it is not useful now, because he also uses the method I described in this post.

  8. Tosha Vleming October 5, 2010 at 9:52 am ·

    A lot of fantastic info here. A+
