
Author: Greg Winiarski

Developing Webbots, Spiders and Screen Scrapers with PHP cURL

For a long time I have wanted to write a detailed tutorial on how to use PHP and cURL to create bots, and I am still going to do that. While browsing Amazon, though, I found a very interesting book on this topic: A Guide to Developing Internet Agents with PHP/CURL, written by Mike Schrenk. I have not read it yet, but it looks very promising and interesting … at least to me, because it is not yet-another-php-for-beginners, but a book that focuses on and deeply explores one single topic: creating webbots.

In my opinion, knowing how to create web bots is one of the most important skills a web programmer can have. There is so much data on the internet that we need bots to take full advantage of the resources on the web and to automate boring day-to-day online tasks. Moreover, you will be surprised how often you will be asked at a job interview whether you can write a webbot.

Detecting search engine bots with PHP

I have to admit I am writing this post because I will soon publish a big article about the cURL extension and show you how to use it to create bots that can even log in to web pages and perform some actions there. It is nothing illegal, but I still feel the need to tell you how to detect “good” bots (search engine bots) and “bad” bots (scrapers), and how to protect against them.

Microsoft BrowseRank

Maybe you have already heard about it, or maybe not. BrowseRank is Microsoft's response to Google PageRank. It is not even close to being ready, but there is already some reliable data coming from Microsoft developers. Their document is only 8 pages long, but it contains a lot of text that you may not find very interesting, like mathematical calculations and probability theory, so I boiled it down to the most important facts to present to you.

Clone keyword in PHP

A lot of people do not read the PHP manual. That is a fact. They will look for answers to their questions on forums, in books, on blogs (like this one, for example) and so on, while most of the answers they are looking for are already there … in the PHP manual. Or maybe it is me who is weird, because I like to browse PHP.net to see “what’s new”. I can’t really tell.

Semantic Search Engines

I read an interesting article recently (by the way, I do not read my own news, but if I did, I would probably find out that I begin each of them with a similar sentence). The article was about the future of search engines, mainly semantic search engines. I do not want to get into details here because, as you know, English is not my native language and I am not sure I could explain the whole concept clearly enough. So if you need more information on semantic search, check out … correct, Wikipedia 🙂