Photo (Courtesy) http://www.3idatascraping.com/web-data-crawler.php
A web crawler (also known as a web spider or web robot) is a program or automated script that browses the World Wide Web in a methodical, automated manner.
This process is called Web crawling or spidering.
Many legitimate sites, in particular search engines, use spidering as a means of providing up-to-date data.
Web crawlers are mainly used to create a copy of every visited page for later processing by a search engine, which indexes the downloaded pages to provide fast searches.
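To make the "copy of visited pages" idea concrete, here is a minimal sketch of a breadth-first crawler in Python. It is only an illustration, not how any particular search engine works. The names `LinkExtractor`, `fetch_http`, `crawl`, and the `max_pages` limit are assumptions made for this example, and a real crawler would also honor robots.txt and rate limits.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_http(url):
    """Download a page over HTTP (the default fetcher; hypothetical helper)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_http, max_pages=50):
    """Breadth-first crawl; returns {url: html} for every page visited."""
    pages = {}
    queue = deque([start_url])
    seen = {start_url}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that cannot be downloaded
        pages[url] = html  # the stored copy a search engine would index later
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts
            absolute, _fragment = urldefrag(urljoin(url, href))
            if absolute.startswith(("http://", "https://")) and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages
```

Passing the fetcher in as a parameter keeps the traversal logic separate from the network, so the crawl can be tried against an in-memory dictionary of pages before it is pointed at a real site.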
Crawlers can also be used for automating maintenance tasks on a Web site, such as checking links or validating HTML code.
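The link-checking task can be sketched in a few lines of Python. This is a rough illustration under stated assumptions: `HrefCollector`, `is_reachable`, and `find_broken_links` are hypothetical names chosen for this example, and treating any failed HEAD request as a broken link is a simplification (some servers reject HEAD even for pages that exist).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class HrefCollector(HTMLParser):
    """Gathers the href of each <a> tag found in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def is_reachable(url):
    """Return True if a HEAD request to url succeeds (assumed check)."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=10):
            return True
    except (OSError, ValueError):
        # URLError and HTTPError are OSError subclasses; ValueError covers bad URLs
        return False


def find_broken_links(page_url, html, is_reachable=is_reachable):
    """Return the absolute URLs linked from html that fail the reachability check."""
    collector = HrefCollector()
    collector.feed(html)
    broken = []
    for href in collector.hrefs:
        target = urljoin(page_url, href)  # resolve relative links against the page
        if not is_reachable(target):
            broken.append(target)
    return broken
```

A maintenance script could call `find_broken_links` on each page it downloads and report the results to the site owner.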
Finally, crawlers can gather specific types of information from Web pages, for example by harvesting e-mail addresses (usually for spam).
For more information about the topic Web crawler, read the full article at Wikipedia.org, or see the following related articles:
Search engine: A search engine or search service is a document retrieval system designed to help find information stored on a computer system, such as on the World ...
Search engine optimization: Search engine optimization (SEO) is a subset of search engine marketing, and deals with improving the number and/or quality of visitors to a web site ...
User interface design: User interface design or user interface engineering is the design of computers, gadgets, appliances, machines, mobile communication devices, software ...
HTTP cookie: An HTTP cookie is a packet of information sent by a server to a World Wide Web browser and then sent back by the browser each time it accesses that ...