The World Wide Web (WWW) is considered to be a collection of a few billion web pages and has witnessed explosive growth in the past few years. Web search engines are used to extract query-specific information from this massive pool. A large number of search engines are available to users to satisfy their needs. Every search engine uses its own algorithm to rank the list of web pages it returns for the user's query, so that the most relevant page appears first in the list. In this paper, a comparative study of two popular search engines (i.e. Google and AltaVista) is done on twenty-four different query topics in terms of the first result returned. It shows that for 21% of the selected topics, Google and AltaVista returned the same first web page from the end user's point of view.
INTRODUCTION
In the past decade the world has witnessed the explosion of the World Wide Web from an information repository of a few million hyperlinked documents into a massive worldwide “organism” that serves the informational, transactional and communicational needs of people all over the globe. Though a latecomer in the Internet family, it has rapidly gained popularity and become the second most widely used application of the Internet. Search engines are specially designed for information retrieval: they extract information from the WWW according to the user's query. As argued by Marchionini, “end users want to achieve their goals with a minimum of cognitive load and a maximum of enjoyment”; correspondingly, in the context of web searches it is observed that a majority of search engine users tend to click on a result within the first page of the search results. In fact, a survey done by iProspect and Jupiter Research on the behavior of search engine users in January 2006 shows that 62% of search engine users click on a search result within the first page of results. Since search engines generally return a very large list of documents for the user's query, the list is ranked in accordance with importance and relevance to the query. Thus, for the user's query, the ranked list of results is displayed by the search engine, with a few results per page.
The general concept used by most search engines to find quality web pages and rank the list is that of the PageRank algorithm [5, 6], which assumes that if web page A has a hyperlink to web page B, then the author of web page A thinks that web page B contains valuable information. This opinion of A becomes more important if A is itself a very important web page. This means that the ranking of a web page is high if many highly ranked web pages point to it.
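The idea that a page is important when important pages link to it can be illustrated with a small sketch. The code below is a simplified power-iteration version of PageRank, not the algorithm actually used by any search engine; the damping factor of 0.85, the number of iterations, the function name `pagerank` and the four-page toy web are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Return {page: score} for a graph given as {page: [pages it links to]}."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform score
    for _ in range(iterations):
        # Score held by pages without out-links is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            if targets:
                # A page passes an equal share of its score to each page it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: A, C and D all link to B; B links back to A.
toy_web = {"A": ["B"], "B": ["A"], "C": ["B"], "D": ["B"]}
scores = pagerank(toy_web)
best = max(scores, key=scores.get)
```

In this toy web, B receives the most links and obtains the highest score, while A, whose single incoming link comes from the highly ranked B, scores above C and D, which receive no links at all.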
There is a wide variety of search engines available, and each one uses its own ranking algorithm to rank the list. These ranking algorithms are not made public because they are among the most important parts of any search engine, and no search engine shares its technology with others. This is also done to prevent misuse of the search engine: if the actual algorithm were made public, some authors might artificially increase the ranking of their web pages, which would defeat the purpose of ranking. PageRank is one of the most celebrated algorithms for ranking web pages, but the actual algorithms used by different search engines are unknown to the public.
This paper presents a comparative study of the effectiveness of the ranking algorithms of two search engines, Google and AltaVista, in terms of the first document returned, which, according to the corresponding search engine's ranking algorithm, is the best-quality document for the user's query.
The rest of the paper is organized as follows: Section II describes the parameter on which the documents were evaluated by the experts. Section III describes the methodology followed in carrying out the research. Section IV describes the key findings and their implications, and Section V concludes the paper.
QUALITY PARAMETERS
In this research paper, the returned results are evaluated on the basis of content to see the similarity between the results obtained from Google and AltaVista for the same query topic.
METHODOLOGY
To carry out the comparative study of the two selected search engines, four specialized areas were selected. From each of these four areas, three sub-specializations were selected, and from each of these, two topics were selected, giving twenty-four topics in all. The areas and topics are summarized in Table 1.
The topics were chosen so that they represent standard concepts and have a standard meaning. They do not represent a broader area and cannot be divided further into broad sub-areas.
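The selection scheme described above (four areas, three sub-areas per area, two topics per sub-area) can be written down as a nested structure. The sketch below only restates the entries of Table 1; the variable names `TOPICS` and `queries` are illustrative assumptions and are not part of the original study.

```python
# Areas, sub-areas and query topics exactly as listed in Table 1.
TOPICS = {
    "Computer science": {
        "Data structure": ["Quick sort", "Tower of Hanoi"],
        "Data base management system": ["3NF", "Primary key"],
        "Software Engineering": ["Black box testing", "Spiral model"],
    },
    "Physics": {
        "Mechanics": ["Impulse", "Law of inertia"],
        "Modern Physics": ["E=MC2", "Photon"],
        "Optics": ["Refraction", "Critical Angle"],
    },
    "Chemistry": {
        "Biochemistry": ["DNA", "RNA"],
        "Organic Chemistry": ["Benzene", "Picric acid"],
        "Physical Chemistry": ["PH Value", "Titration"],
    },
    "Mathematics": {
        "Abstract Algebra": ["SubGroup", "Ring"],
        "Graph Theory": ["Warshalls algorithm", "Strictly binary tree"],
        "Trigonometry": ["Area of triangle", "Identity"],
    },
}

# Flatten the hierarchy into the list of queries submitted to each engine.
queries = [
    topic
    for sub_areas in TOPICS.values()
    for topics in sub_areas.values()
    for topic in topics
]
print(len(queries))  # 4 areas x 3 sub-areas x 2 topics = 24
```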
Search was initiated on each of the topics on the two selected search engines, and the first document returned was saved from the ranked list of documents. Each of these documents was evaluated separately by experts of the corresponding area. If two search engines return the same set of web pages for a query topic, then, for the end user, the two search engines are the same.
Google Vs AltaVista Search Engine: Study about similarity from users’ perspective
SPECIALIZED AREAS   SUB SPECIALIZED AREAS         TOPICS
Computer science    Data structure                Quick sort; Tower of Hanoi
                    Data base management system   3NF; Primary key
                    Software Engineering          Black box testing; Spiral model
Physics             Mechanics                     Impulse; Law of inertia
                    Modern Physics                E=MC2; Photon
                    Optics                        Refraction; Critical Angle
Chemistry           Biochemistry                  DNA; RNA
                    Organic Chemistry             Benzene; Picric acid
                    Physical Chemistry            PH Value; Titration
Mathematics         Abstract Algebra              SubGroup; Ring
                    Graph Theory                  Warshalls algorithm; Strictly binary tree
                    Trigonometry                  Area of triangle; Identity
Table 1. The specialized areas, sub specialized areas and topics selected for comparison of the two search engines.
While searching the documents on the selected search engines for the selected topics, it was observed that in several cases the first document returned was the same. Table 2 shows the cases where the first document returned was the same. For each row corresponding to a topic in the table, a tick mark indicates that the first document returned by the two search engines is the same.
Table 3 shows that out of a total of twenty-four different topics, for five topics, or 21% of the time, Google and AltaVista returned the same first document. This implies that the results of the ranking algorithms and technology of these two search engines match for 21% of the topics.
KEY FINDINGS AND RESULTS
The key finding from Tables 2 and 3, and its implication, are as follows:
Table 3 shows that Google and AltaVista returned the same first document for five of the twenty-four query topics, i.e. in 21% of the cases. This implies that, from the end user's perspective, the two search engines are the same in 21% of the cases and can be used interchangeably for these query topics.
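The 21% figure follows directly from the tick marks in Table 2. The short calculation below lists the five topics ticked for both engines and divides by the twenty-four topics studied; the variable names are illustrative assumptions.

```python
TOTAL_TOPICS = 24

# Topics ticked for both Google and AltaVista in Table 2.
same_first_result = [
    "3NF",
    "Primary key",
    "Picric acid",
    "Warshalls algorithm",
    "Area of triangle",
]

matches = len(same_first_result)
share = matches / TOTAL_TOPICS
print(f"{matches} of {TOTAL_TOPICS} topics = {share:.1%}")  # prints "5 of 24 topics = 20.8%"
```

Rounded to the nearest whole percent, 20.8% gives the 21% reported in the text.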
CONCLUSION
There are billions of web pages, and new content is produced every day. The use of search engines is therefore becoming a primary Internet activity, and search engines have developed increasingly clever ranking algorithms in order to constantly improve their quality. There are many search engines available, and their actual ranking algorithms are not made available to the rest of the world. Because a majority of end users look only at the first few pages of the search results, it is these ranking algorithms that make search engines effective and popular.
Web search engines differ in various aspects from other well-established search tools. They therefore require a different evaluation methodology, and we have made an attempt with two search engines and twenty-four sample queries. In this research, we have evaluated the results of the ranking algorithms of the two search engines by examining the first document they returned for the sample queries. One important finding is that, for 21% of the twenty-four query topics, Google and AltaVista return the same first web page.
In the future, we plan to apply the proposed methodology to a wider scope, in the hope that our research findings will enable web users to select a search engine appropriate to their specific search needs and help web search engine developers design even better ones for the Internet community.
Topics                  Google   AltaVista
Quick sort
Tower of Hanoi
3NF                     ✓        ✓
Primary key             ✓        ✓
Black box testing
Spiral model
Impulse
Law of inertia
E=MC2
Photon
Refraction
Critical Angle
DNA
RNA
Benzene
Picric acid             ✓        ✓
PH Value
Titration
SubGroup
Ring
Warshalls algorithm     ✓        ✓
Strictly binary tree
Area of triangle        ✓        ✓
Identity
Table 2. (Tick mark shows similar return result corresponding to given query topic)
Search Engines        Total number of similar results (out of 24 different topics)
Google - AltaVista    05
Table 3. Total number of similar results for these two search engines (Google - AltaVista)
IPEM JOURNAL OF COMPUTER APPLICATION & RESEARCH Vol.1, No. 1, December 2016
REFERENCES
1. Courtois, Martin P., Baer, William M., and Stark, Marcella. Cool tools for searching the Web: A performance evaluation. Online, 19(6), 14-32.
2. Marchionini, G. 1992. “Interfaces for End-User Information Seeking.” Journal of the American Society for Information Science, 43(2):156-163.
3. Krishna Bharat, Monika R. Henzinger, “Improved Algorithms for Topic Distillation in a Hyperlinked Environment”, 21st ACM SIGIR Conference, 1998.
4. “iProspect Search Engine User Behavior Study”, a report by iProspect and Jupiter Research. January 2006. www.iprospect.com
5. L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank Citation Ranking: Bringing Order to the Web. Stanford Digital Libraries Working Paper, 1998.
6. T. H. Haveliwala. Topic-sensitive PageRank. In Proceedings of the Eleventh International World Wide Web Conference, 2002.
7. Thelwall, M.; Vaughan, L. & Björneborn, L. Webometrics. Ann. Rev. of Inf. Sci. & Tech., 2005, 39, 81-135.
8. Dwivedi, N.; Joshi, L. & Gupta, N. Statistical analysis of search engines (Google, Yahoo and Altavista) for their search result. Inter. J. of Comp. The. and Engi., 2013, 5(2), 298-301.
9. Dwivedi, N. & Bansal, A. Effect of advertisement and sponsored links on search engines: Comparative study, IEEE Xplore, 2014.
10. PageRank. Check PageRank of Web site pages instantly. http://www.prchecker.info/check_page_rank.php (accessed on 10 October 2015).
11. DomainAgeTool. http://www.webconfs.com/domainage.php (accessed on 10 October 2015).