During my normal website marketing activities, I typically build very large Google Sitemaps: XML files with 40,000 URLs each, and anywhere from 5-22 sitemap files per site. To build them, I run a website crawling application that pulls 15 URLs every 4 seconds. My server has dual quad-core processors and is capable of serving hundreds of pages per second.
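For context, the generation step itself is simple chunking: split the crawled URL list into files of at most 40,000 entries, then write a sitemap index that ties the 5-22 files together. Here is a minimal sketch of that, assuming the crawler has already produced a flat list of URLs; the file names and the example.com domain are hypothetical placeholders:

```python
# Minimal sketch: chunk a crawled URL list into sitemap files of at most
# 40,000 URLs each, plus a sitemap index pointing at them.
# (The sitemap protocol actually allows up to 50,000 per file.)
import math
from xml.sax.saxutils import escape

URLS_PER_FILE = 40_000

def write_sitemaps(urls, site="https://www.example.com"):  # hypothetical domain
    n_files = math.ceil(len(urls) / URLS_PER_FILE)
    for i in range(n_files):
        chunk = urls[i * URLS_PER_FILE:(i + 1) * URLS_PER_FILE]
        with open(f"sitemap-{i + 1}.xml", "w", encoding="utf-8") as f:
            f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
            f.write('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
            for url in chunk:
                f.write(f"  <url><loc>{escape(url)}</loc></url>\n")
            f.write("</urlset>\n")
    # The index is what you submit to Google; it lists the individual files.
    with open("sitemap-index.xml", "w", encoding="utf-8") as f:
        f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        f.write('<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
        for i in range(n_files):
            f.write(f"  <sitemap><loc>{site}/sitemap-{i + 1}.xml</loc></sitemap>\n")
        f.write("</sitemapindex>\n")
```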
The problem:
1. Building my sitemaps quickly, to get new websites indexed fast and to let Google know about the deeper pages of site content.
2. Making full use of my server's capacity.
3. Not running the crawling application on my desktop.
My Solution:
1. Ran multiple instances of my crawler on Windows machines via Amazon EC2.
2. Launched 5 instances, quickly and easily, right from my browser. (A scripted equivalent is sketched below.)
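The launch above was done by clicking through the browser console, but the same thing can be scripted. A minimal sketch using boto3, which is a modern equivalent rather than what was available at the time; the AMI ID and instance type are placeholders:

```python
# Minimal sketch: launch 5 EC2 instances in one API call instead of
# clicking through the console. AMI ID and instance type are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-xxxxxxxx",    # hypothetical Windows AMI with the crawler installed
    InstanceType="m1.small",   # pick a size that matches the crawl workload
    MinCount=5,                # launch all 5 crawler instances at once
    MaxCount=5,
)

for inst in response["Instances"]:
    print(inst["InstanceId"])
```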
What did I discover? The new world of cloud computing.