Are Search Engine Robots Useful?
Sometimes referred to as ‘spiders’ or crawlers, automated search engine robots seek out web pages for the user. Just how do they accomplish this and is this of importance? What is the real purpose of these robots?
These spiders actually have a rather limited scope of understanding and power, far less than you would think considering they’re minions of such great and mighty names as Google and Yahoo. Many things are beyond them, such as frames, visual content such as movies or pictures, and scripting in JavaScript. Nor can they peek into password-protected parts of sites, or click buttons. Well, that’s what they can’t do. What CAN they do?
The robot builds a list of the web pages submitted through the search engine’s ‘submit a URL’ page, then visits those pages in list order the next time it goes out on the web. Sometimes a robot will find your page whether you have submitted it or not, because links on other sites may lead the robot to yours. That is why building your link popularity and getting links from other topical sites back to your site is important. The first thing a robot does when it arrives is check for a robots.txt file. This file tells robots which parts of the site are off-limits. Usually these are files that are of no concern to the robot, such as binaries or other files it does not need.
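As a rough illustration of the robots.txt check, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt rules, the robot name ExampleBot and the example.com URLs are all made up for the example; a real robot would download the file from the root of the site it is visiting.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for this sketch).
# It asks every robot to stay out of two directories.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /downloads/
"""


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(parser, user_agent, url):
    """Return True if the named robot may fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
for url in (
    "http://example.com/index.html",
    "http://example.com/downloads/setup.exe",
):
    print(url, allowed(parser, "ExampleBot", url))
```

With these rules the home page is open to the robot, while anything under /downloads/ is off-limits.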
Links are collected from every page that is visited, and the robot then follows them to other pages. This is how the robot gets around the World Wide Web: by following links from one place to another.
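The link-following behaviour can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the four pages live in a made-up in-memory dictionary (PAGES) rather than on the web, and the robot visits them in breadth-first order, which is one common choice and not necessarily what any particular search engine does.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical site used instead of real network requests.
PAGES = {
    "http://example.com/": '<a href="/about.html">About</a> <a href="/news.html">News</a>',
    "http://example.com/about.html": '<a href="/">Home</a>',
    "http://example.com/news.html": '<a href="/about.html">About</a> <a href="/archive.html">Archive</a>',
    "http://example.com/archive.html": "No links here.",
}


def crawl(start_url, pages):
    """Visit pages by following links, returning URLs in visiting order."""
    visited = []
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:
            continue  # page not available, nothing to read
        visited.append(url)
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return visited


print(crawl("http://example.com/", PAGES))
```

Note that archive.html is never submitted anywhere; the robot reaches it only because news.html links to it.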
To ensure that searchers get the most relevant results for their query, the search engine performs quick calculations at search time. You can check your server logs, or the output of a log statistics program, to see which pages robots have visited and how often. Some robots are easy to identify, such as Google’s ‘Googlebot’, while less well-known ones, such as Inktomi’s ‘Slurp’, are harder to recognise. Some robots even appear to be human-powered browsers.
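If you have no log statistics program to hand, a short script can do a rough count. The sketch below assumes your server writes the common "combined" log format and that robots announce themselves in the user-agent field. The sample log lines, including their IP addresses, are invented for the example, and the list of robot names is deliberately tiny.

```python
import re
from collections import Counter, defaultdict

# Robot names to look for in the user-agent field (a short, assumed list).
KNOWN_ROBOTS = ["Googlebot", "Slurp"]

# Picks the requested path and the user agent out of a combined-format line.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Invented sample lines for illustration only.
SAMPLE_LOG = [
    '192.0.2.10 - - [10/Oct/2008:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '198.51.100.7 - - [10/Oct/2008:14:02:11 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"',
    '203.0.113.5 - - [10/Oct/2008:14:05:40 +0000] "GET /about.html HTTP/1.1" 200 1200 "-" "Mozilla/5.0 (Windows NT 6.1; rv:10.0) Gecko/20100101 Firefox/10.0"',
    '192.0.2.10 - - [10/Oct/2008:14:09:02 +0000] "GET /about.html HTTP/1.1" 200 1200 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]


def identify_robot(agent):
    """Return the robot name found in a user-agent string, or None."""
    lowered = agent.lower()
    for name in KNOWN_ROBOTS:
        if name.lower() in lowered:
            return name
    return None


def robot_visits(lines):
    """Count, per robot, how often each page was requested."""
    visits = defaultdict(Counter)
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the expected format
        robot = identify_robot(match.group("agent"))
        if robot:
            visits[robot][match.group("path")] += 1
    return visits


for robot, pages in robot_visits(SAMPLE_LOG).items():
    print(robot, dict(pages))
```

A robot that pretends to be an ordinary browser will slip past a name check like this one, which is why the visitor's IP address is also worth looking at.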
There may be robots that you do not want visiting your website, such as aggressive, bandwidth-grabbing robots. Being able to identify individual robots and count their visits is therefore useful, as is information on the undesirable ones. IP names and addresses of search engine robots are listed at the end of this article in a resources section.
Robots read the pages on your website by visiting each page, looking at the visible text, and then looking at source code tags such as title tags, meta tags and others. They also look at the hyperlinks on your page. From the words and links it finds, the search engine robot can determine what your page is about. Each search engine has its own algorithm to determine what is important. Information is indexed and delivered to the search engine’s database according to how the search engine has set up its robot.
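To make the reading step more concrete, here is a small sketch of how a page's title, meta tags, hyperlinks and visible text might be pulled apart with Python's standard html.parser module. The sample page is invented, and real robots are far more elaborate; the class name PageReader is just a label chosen for this example.

```python
from html.parser import HTMLParser


class PageReader(HTMLParser):
    """Separates a page into title, meta tags, links and visible text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.links = []
        self.text_parts = []
        self._in_title = False
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


# Invented sample page.
SAMPLE_PAGE = """
<html><head>
<title>Garden Tools</title>
<meta name="description" content="Hand tools for small gardens">
<meta name="keywords" content="garden, tools, spade">
<script>var hidden = "not read as text";</script>
</head><body>
<h1>Choosing a spade</h1>
<p>A good spade lasts for years. See our <a href="/spades.html">spade guide</a>.</p>
</body></html>
"""

reader = PageReader()
reader.feed(SAMPLE_PAGE)
print(reader.title)
print(reader.meta)
print(reader.links)
print(" ".join(reader.text_parts))
```

The script content is skipped on purpose, which mirrors the point made earlier that robots do not understand scripting.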
Search engines don’t update from moment to moment; the exact timing of their database updates varies. However, once you’re in the database, the bots will make a point of visiting you frequently to pick up changes. If your site is down when the bot calls, it may not be able to update your site’s entry in the search engine database, so do keep that in mind. So, robots may be scary things in movies, but as you can see, as far as the internet goes they’re nothing but helpful tools that guide us from site to site. Embrace them, learn how to help them be more efficient, and work with them to get your web site highly ranked so that you can maximize your visitors.
Justin Harrison is an internationally recognised Internet Marketing Consultant who provides world class SEO Services to website owners. For more information visit: http://www.seorankings.co.za
A Search Engine Explanation
Search engines such as Google and Yahoo are tools that allow users to find information on the World Wide Web. The results of a search are organized into a list that includes web pages, information about them and links to them. In some cases, the list includes images. Search engines operate algorithmically, in conjunction with human editing.
Web crawling, indexing and searching combine, in that order, to produce the most accurate results. Vast amounts of information from millions of web sites are stored, then retrieved according to the user’s request. A web crawler, also known as a spider, analyzes every link and indexes all the information for faster retrieval.
Meta tags and even words from the webpage are studied to classify the webpage and its content. All this data is stored for future use.
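One common way to store words for fast retrieval is an inverted index, which maps each word to the pages that contain it. The sketch below is an assumption about how such an index might look in its simplest form, not a description of any particular search engine; the page texts and URLs are invented.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the pages it appears on, with a count per page."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word][url] = index[word].get(url, 0) + 1
    return index


# Invented page texts, as a crawler might have collected them.
PAGES = {
    "http://example.com/spades": "Spade guide: choosing a garden spade.",
    "http://example.com/forks": "Garden fork guide.",
}

index = build_index(PAGES)
print(index["garden"])
print(index["spade"])
```

Looking up a word is now a single dictionary access, however many pages were crawled, which is the point of indexing before searching.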
All the search engines work on more or less the same principle. Google stores all or part of each source page, referred to as a cache, along with information about the page. AltaVista differs slightly in operation, as it stores everything that a web page has on offer.
Search engine users normally enter a keyword or key phrase into the search field. The engine then looks up that keyword or key phrase in its index of the World Wide Web and returns an organized list of the best-matching web pages. A short summary describing the contents of each page is provided along with the list.
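A toy version of this query step is sketched below. It scores each page simply by counting how often the query words occur, and uses the opening characters of the page as its summary. Both choices are simplifying assumptions made for the example, as are the page texts; real engines use far richer scoring.

```python
import re


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(query, pages, summary_length=60):
    """Return (url, summary) pairs, best match first."""
    terms = tokenize(query)
    results = []
    for url, text in pages.items():
        words = tokenize(text)
        # Score: total number of times any query word appears on the page.
        score = sum(words.count(term) for term in terms)
        if score > 0:
            summary = text[:summary_length].rstrip()
            results.append((score, url, summary))
    # Highest score first; ties are broken alphabetically by URL.
    results.sort(key=lambda item: (-item[0], item[1]))
    return [(url, summary) for _, url, summary in results]


# Invented page texts.
PAGES = {
    "http://example.com/spades": "Spade guide: how to choose a garden spade for heavy soil.",
    "http://example.com/forks": "Garden fork guide for light soil.",
    "http://example.com/contact": "Contact us by post or telephone.",
}

for url, summary in search("garden spade", PAGES):
    print(url, "-", summary)
```

The contact page contains neither keyword, so it is left out of the results altogether.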
The goal of major search engines is to supply the most relevant results. Not all sites with the requested keywords are relevant to the search. Search engines use their spiders and indexing to filter out useless information, and each develops its own system for analyzing a website’s content.
Page rank is the latest addition to the techniques used by search engines to sort web pages and their contents. It judges a page partly on how well its meta tags, description and keywords match its actual content, and partly on its links: search engines rank a site highly when highly ranked web pages link to it. Page rank is essential for any web page or site, as it determines how likely the page is to feature at the top of a particular search.
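The idea that a page ranks highly when highly ranked pages link to it can be sketched with a simplified version of the link-based PageRank calculation. This is an illustration only: the four-page link graph is invented, the damping value of 0.85 is the commonly quoted figure rather than anything stated in this article, and the ranking actually used by search engines involves many more signals.

```python
def page_rank(links, damping=0.85, iterations=50):
    """Estimate a rank for each page from who links to whom.

    `links` maps every page to the list of pages it links to.
    Every linked-to page must also appear as a key.
    """
    pages = list(links)
    count = len(pages)
    rank = {page: 1.0 / count for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small base share.
        new_rank = {page: (1.0 - damping) / count for page in pages}
        for page, outgoing in links.items():
            # A page with no links spreads its rank over all pages.
            targets = outgoing if outgoing else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Invented link graph: nobody links to "orphan".
LINKS = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["home", "about"],
    "orphan": ["home"],
}

ranks = page_rank(LINKS)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this graph the home page comes out on top because every other page links to it, while the orphan page, which receives no links at all, ends up last.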