
Image Search Engine


Mingjing Li and Wei-Ying Ma
Microsoft Research Asia, China

Definition: Image search engines are Web-based services that collect and index images available on the Internet.


Some commercial image search engines have indexed over one billion images so far. Like general web search, image searching is mostly based on text information associated with images, which can be automatically extracted from the containing web pages. In addition to text-based image search engines, there are also some content-based image retrieval systems that index images using their visual characteristics. Those systems are mainly developed for research purposes and are usually limited to small image collections.


Due to improved digital imaging technologies and the convenient access provided by the Internet, the popularity of digital images is rapidly increasing. As most of those images are not annotated with semantic descriptors, it can be a challenge for general users to find specific images on the Internet.

Image search engines are systems specially designed to help users find the images they are looking for. In general, image search engines may adopt one of two approaches to achieve this goal. One is text-based; the other is content-based.

Text-based image search engines index images using the words associated with the images. Depending on whether the indexing is done automatically or manually, image search engines adopting this approach may be further classified into two categories: Web image search engines and collection-based search engines. Web image search engines collect images embedded in Web pages from other sites on the Internet, and index them using the text automatically derived from the containing Web pages. Most commercial image search engines fall into this category. In contrast, collection-based search engines index image collections using keywords annotated by human indexers. Digital libraries and commercial stock photo collection providers are good examples of this kind of search engine.

Content-based image retrieval (CBIR) has been an active research topic since the 1990s. Such systems index images using their visual characteristics, such as color, texture and shape, which can be extracted automatically from the image itself. They can accept an example image as a query, and return a list of images that are similar in appearance to the query example. CBIR systems are mainly experimental and often limited to small image collections.

In the following, we briefly describe how those image search engines work.

Web image search engine

The World Wide Web (WWW) may be the largest repository of digital images in the world. The number of images available on the Internet is increasing rapidly and will continue to grow in the future. An image search engine is a Web-based service devoted to collecting and indexing those Web images.

There are a number of image search engines commercially available, such as AltaVista, Google Image Search and Yahoo! Image Search. AltaVista was the first search engine in the world to launch image search functionality. It also supports video and music search. As of August 2005, Yahoo! claims to have indexed over 1.6 billion images, while Google claims over 1 billion. Those engines are based on existing search engine technology in the sense that they index images using the text information associated with the images.

Such search engines take the text in the hosting Web pages as an approximate annotation of Web images, assuming that images are embedded into Web pages to complement the text information. Several sources of information might be relevant to the content of embedded images. These include, in decreasing order of usefulness, image file names, image captions, alternate text (an HTML attribute whose value replaces the image when it cannot be displayed), surrounding text, the page title and others. Surrounding text refers to the words or phrases that are close to the image, such as those above, below, or to the left or right of it. However, it is difficult to determine which area is more relevant and how much of it should be considered. Thus the extraction of surrounding text is somewhat heuristic and subjective. As such information can be extracted automatically from Web pages, the indexing process is automated.
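As a rough illustration of how some of these text sources could be pulled out of a page automatically, the following minimal sketch uses Python's standard html.parser module. The class name ImageTextExtractor and the dictionary keys are hypothetical, chosen only for this example; caption and surrounding-text extraction are deliberately left out because, as noted above, they depend on heuristics.

```python
import posixpath
import re
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImageTextExtractor(HTMLParser):
    """Hypothetical helper: collects the page title and, for each <img>,
    the words in its file name and its alternate text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.images = []        # one dict per <img> tag that has a src
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "img" and attrs.get("src"):
            # File name without directory or extension, split into words.
            path = urlparse(attrs["src"]).path
            stem = posixpath.splitext(posixpath.basename(path))[0]
            words = [w for w in re.split(r"[^A-Za-z0-9]+", stem.lower()) if w]
            self.images.append({
                "src": attrs["src"],
                "file_name_words": words,
                "alt": attrs.get("alt") or "",
            })

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
```

Feeding the parser a page that contains an image with src="/photos/monarch-butterfly_01.jpg" would yield the file name words monarch, butterfly and 01 for that image, which an indexer could then weight more heavily than the page title.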

To build a commercial image search engine, many functions have to be implemented. Among those, at least the following four should be provided.

The image crawler collects Web images and is usually implemented as software robots that run on many machines. Those robots scan the Web to identify images and then download the images and their hosting Web pages. As Web images are usually protected by copyright, image search engines keep only a thumbnail of each Web image. The original images can be accessed via links in the search results.

The page parser analyzes Web pages to find informative images and extract the associated text for indexing. Not all Web images are useful for search. Some are too small or are used only for decoration or function, such as background images, banners and buttons. Some are advertisements that are not relevant to the hosting Web pages at all. Those images should be excluded from the index. The page parser also tries to determine which parts of the hosting Web page are likely to be relevant to the contained images and extracts the corresponding text for indexing.
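One simple way to exclude small or decorative images is a size and shape test. The sketch below is only an assumption about how such a filter might look: the ImageInfo record, the 60-pixel minimum side and the 4:1 aspect ratio limit are illustrative values, not thresholds reported for any actual search engine.

```python
from dataclasses import dataclass


@dataclass
class ImageInfo:
    url: str
    width: int    # in pixels
    height: int   # in pixels


MIN_SIDE = 60            # assumed: anything smaller is an icon or button
MAX_ASPECT_RATIO = 4.0   # assumed: very elongated images are banners


def is_informative(img: ImageInfo) -> bool:
    """Return True if the image passes the size and shape heuristics."""
    shorter = min(img.width, img.height)
    longer = max(img.width, img.height)
    if shorter < MIN_SIDE:
        return False
    return longer / shorter <= MAX_ASPECT_RATIO
```

With these assumed thresholds, a 640 x 480 photograph is kept, while a 16 x 16 icon and a 728 x 90 banner are rejected. Detecting advertisements would need further signals, such as the link target, that this sketch does not cover.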

The index builder builds the indexing structure for efficient image search. The methods adopted are quite similar to those of general web search, except that each image, rather than each Web page, is treated as a text document.
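Treating each image as a text document means that a standard inverted index can be built over the text associated with it. The following is a minimal sketch of that idea, not a description of any particular engine's index; the function names tokenize, build_index and lookup are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(image_texts):
    """Map each term to the set of image ids whose associated text contains it.

    image_texts: dict mapping an image id to its associated text.
    """
    index = defaultdict(set)
    for image_id, text in image_texts.items():
        for term in tokenize(text):
            index[term].add(image_id)
    return index


def lookup(index, query):
    """Return the ids of images whose text contains every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

Because a lookup only touches the posting sets of the query terms, its cost depends on how many images contain those terms rather than on the total size of the collection.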

Image searching is processed on the server side. The server accepts users' queries and compares them with the indexed images. When generating the final search result, it considers a number of factors, e.g. the similarity between the query and an indexed image, the image quality, etc. Google claims to present high-quality images first so as to improve the perceived accuracy.
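How the factors are combined is not specified here, so the sketch below simply assumes a weighted sum of a text similarity score and a quality score. The similarity measure (the fraction of query words found in the image's text), the quality value in [0, 1] and the default weight of 0.2 are all illustrative assumptions.

```python
def text_similarity(query, text):
    """Fraction of distinct query words that occur in the image's text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & text_words) / len(query_words)


def rank(query, images, quality_weight=0.2):
    """Rank images by a weighted sum of text similarity and quality.

    images: list of (image_id, associated_text, quality) tuples, where
    quality is an assumed score in [0, 1]. Images that share no word
    with the query are dropped.
    """
    scored = []
    for image_id, text, quality in images:
        similarity = text_similarity(query, text)
        if similarity > 0:
            score = (1 - quality_weight) * similarity + quality_weight * quality
            scored.append((score, image_id))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored]
```

Under this scheme, two images whose text matches the query equally well are ordered by quality, which is one way to present high-quality images first.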

The search result is organized as a list of thumbnails, typically 20 per page, along with additional information about the retrieved images, such as the file name, the image resolution, the URL of the hosting Web page, etc. Some search engines provide advanced search options to limit the search result by size, file type, color or domain, and to exclude adult content.

The image searching service is usually provided via a Web-based user interface. Users may access an image search engine in a Web browser, such as Microsoft Internet Explorer.

Collection-based search engine

Unlike Web image search engines, collection-based search engines index image collections using manually annotated keywords. Images in such collections are usually of high quality, and the indexing is more accurate. Consequently, the search results from collection-based engines are more relevant and much better than those from Web search engines. Large digital libraries and commercial stock photo or clip art providers offer image searching facilities in this way.

Those image collections are often held in databases, and cannot be easily accessed by Web crawlers. Therefore, they are usually not covered by general search engines.

Among those engines, Corbis and Getty Images are probably the two largest ones that specialize in providing photography and fine art images to consumers. The Corbis collection currently contains 25 million images, with more than 2.1 million available online. Its search results can be limited by category and collection, or even by the date photographed or created, or by the number of people in the image. Getty Images offers localized image data and contextual search capabilities in six local languages.

Microsoft Office Online also provides a large collection of clip art and multimedia data with well-annotated keywords in multiple languages. The images in this collection can be used in creating documents. Figure 1 shows an example image from this site with its keyword annotations.

In fact, there are many stock photo or clip art collections available online. A list is provided in TASI’s image search engine review.

Content-based image retrieval

Content-based image retrieval was initially proposed in the 1990s to overcome the difficulties encountered in keyword-based image search. Since then, it has been an active research topic, and many algorithms have been published in the literature. In keyword-based image search, images have to be manually annotated with keywords. As keyword annotation is a tedious process, it is impractical to annotate the vast number of images on the Internet. Furthermore, annotation may be inconsistent. Because a single image can have multiple contents and human perception is subjective, it is difficult for different indexers to produce exactly the same annotations. In contrast, CBIR systems extract visual features, such as color, texture or shape, from images and use them to index the images. The color histogram is one of the most widely used features. It is essentially the statistics of the colors of the pixels in an image. As long as the content of an image does not change, the extracted features are always consistent. Moreover, the feature extraction can be performed automatically. Thus, the human labeling process can be avoided.
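To make the notion of a color histogram concrete, here is a minimal sketch that quantizes each RGB channel into a few levels and counts how many pixels fall into each combined bin. The pixel representation (a list of (r, g, b) tuples with values from 0 to 255) and the default of 4 levels per channel are assumptions made for this example; real systems may use other color spaces and bin layouts.

```python
def color_histogram(pixels, levels=4):
    """Return a normalized color histogram with levels**3 bins.

    pixels: list of (r, g, b) tuples, each value in 0..255.
    Each channel is quantized into `levels` equal ranges, and the three
    quantized values are combined into a single bin index.
    """
    hist = [0] * (levels ** 3)
    count = 0
    for r, g, b in pixels:
        r_bin = r * levels // 256
        g_bin = g * levels // 256
        b_bin = b * levels // 256
        hist[(r_bin * levels + g_bin) * levels + b_bin] += 1
        count += 1
    if count == 0:
        return hist
    # Normalize so that images of different sizes are comparable.
    return [h / count for h in hist]
```

The histogram depends only on which colors occur and how often, not on where the pixels are located, so rearranging the same pixels produces exactly the same feature.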

In a CBIR system, each image is represented as a vector, which is the feature automatically extracted from the image itself. During the retrieval process, the user may submit an example image as a query to the system. The system then calculates the similarity between the feature vector of the query and that of each database image, ranks the images in descending order of similarity, and returns the images with the highest similarities as the search result.
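The retrieval step can be sketched as a linear scan over the database. Cosine similarity is used here only as one common choice of similarity measure; the text does not say which measure a given system uses, and the function names and the top_k parameter are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(query_vector, database, top_k=3):
    """Return the ids of the top_k database images, most similar first.

    database: dict mapping an image id to its feature vector.
    """
    ranked = sorted(
        database.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [image_id for image_id, _ in ranked[:top_k]]
```

This scan compares the query with every stored vector, so its cost grows linearly with the size of the collection.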

However, those features often do not match human perception very well. Images with similar concepts may have totally different appearances, while images with similar features may be completely unrelated to each other. This is the so-called semantic gap, which limits the applicability of CBIR techniques. Figure 2 shows an example. Images A and B should be more semantically similar to each other since both are images of a butterfly. However, images A and C are closer in the feature space because they contain more similar colors. If A is used as a query, C is more likely than B to be retrieved as the search result.

Because the features used in CBIR are usually of high dimensionality and there is no efficient indexing method, current CBIR systems only index small image collections. So far, the largest CBIR system reported in the literature is Cortina, which indexes over 3 million images. The overall performance of CBIR systems is not satisfactory.


There are so many images available on the Internet that users need efficient tools to browse and search for them. The current image search engines can partially fulfill this need. In the future, a proper combination of textual and visual features may produce a better image searching experience.
