Web-Bot Technologies – Preparing the Data
Use a web crawler to index every bit of information:
1) Divide everything by file type
2) Using metadata, index keywords, sentences and paragraphs
3) Link each keyword directly to the info it came from
4) If your program cannot represent keywords, sentences and paragraphs in a form the machine can process, you have failed. Every bit of info can be represented this way, e.g. an email, a tweet or a link. As a further enhancement you could search across the languages of the world, but that is too complex to discuss here because we do not have the technologies now.
5) You need to write an algorithm that spawns hundreds of web crawlers to increase speed
6) Remember that after indexing all of your info you need a huge cache to link it to searches. Once a keyword search has been used, its result is kept in memory.
7) So in fact everyone is searching your snapshot of the info instead of searching the Internet directly
8) So the frequency of updates is very important, e.g. every hour
9) Every piece of info is timestamped to give the user the choice of picking the latest version.
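
The steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a real search engine: the names Document, FAKE_WEB, fetch, crawl and Snapshot are all made up for this example, the "web" is a small in-memory dictionary instead of real HTTP requests, and the index is a plain keyword-to-documents map (an inverted index) with a dictionary as the search cache.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
import re


@dataclass(frozen=True)
class Document:
    url: str
    file_type: str      # step 1: e.g. "html", "email", "tweet"
    text: str
    fetched_at: float   # step 9: timestamp of this copy


# Hypothetical stand-in for the Internet: url -> (file_type, text).
FAKE_WEB = {
    "http://example.com/a": ("html", "Web crawlers index the web."),
    "http://example.com/b": ("tweet", "Crawlers need a cache."),
    "http://example.com/c": ("email", "The index links keywords to pages."),
}


def fetch(url, now):
    """Pretend to download one page; a real crawler would make an HTTP request."""
    file_type, text = FAKE_WEB[url]
    return Document(url, file_type, text, fetched_at=now)


def crawl(urls, now, workers=4):
    """Step 5: run many fetches in parallel using a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda u: fetch(u, now), urls))


class Snapshot:
    """Steps 2, 3, 6 and 7: a keyword index plus a cache of past searches."""

    def __init__(self):
        self.index = defaultdict(set)   # keyword -> set of Documents
        self.cache = {}                 # keyword -> cached result list

    def add(self, doc):
        for word in re.findall(r"[a-z0-9]+", doc.text.lower()):
            self.index[word].add(doc)
        self.cache.clear()              # cached results may be stale after an update

    def search(self, keyword):
        key = keyword.lower()
        if key not in self.cache:
            # Newest copies first, so the user can pick the latest info.
            hits = sorted(self.index.get(key, ()),
                          key=lambda d: (-d.fetched_at, d.url))
            self.cache[key] = hits
        return self.cache[key]


snapshot = Snapshot()
for doc in crawl(sorted(FAKE_WEB), now=1000.0):
    snapshot.add(doc)

# Step 8: an hour later one page is crawled again and gets a newer timestamp.
snapshot.add(fetch("http://example.com/b", now=4600.0))

results = snapshot.search("crawlers")
for d in results:
    print(d.fetched_at, d.file_type, d.url)
```

Searches never touch FAKE_WEB; they only read the snapshot, which is the point of step 7. The cache is simply emptied on every update here, which is the crudest possible choice; a real system would need something smarter.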
– Contributed by Oogle.