We learned (briefly) how search engines work in our math class at uni... I don't know the English terms... eigenvalues (sp?) and stuff... Quite interesting.
___________________
insignificant cor member alliance
Jan-16-2004 22:23
whiskers
old skool
Registered: Sep 2001
Location: in your dreams
lol, eigenvalues? that's linear algebra and matrices... if that's the case, then screw programming, even though linear algebra was easy, i found it stupid and annoying.
___________________
Jan-16-2004 22:33
Cloudburst
I am the maximum
Registered: Oct 2003
Location: Jötebårj
quote:
Originally posted by whiskers
lol, eigenvalues? that's linear algebra and matrices... if that's the case, then screw programming, even though linear algebra was easy, i found it stupid and annoying.
I'm sure Google doesn't use that algorithm.
This Indian guy gave a guest lecture and went on and on about his work. He programmed search engines. It sounded good, but I don't think it had been released yet.
___________________
insignificant cor member alliance
Last edited by Cloudburst on Jan-16-2004 at 23:09
Jan-16-2004 22:45
Tranc3
tranceaddict in training
Registered: May 2002
Location: Santa Cruz, CA, US
quote:
Originally posted by whiskers
my logic teacher was talking about it once and said that they rate the sites by hits on them - the sites with most hits get to be on top of the search results...
Not quite, but almost. Google's system is based on how many sites link to your site, not how many hits your site gets. So, all else being equal, a site with 300 external sites linking to it will tend to beat out a site with 250 external sites linking to it.
quote:
Originally posted by SuperFarStucker
More like tens of thousands. They have huge farms of rack servers running linux. The real genius is in the algorithm they use to search their internal networks and stuff. About all the delay there is in common searches is internal network transit latency.
Once again, almost but not quite. They use tens of thousands of old PCs running Linux - not rackmount servers. That way they save tons of money.
Google's searching algorithm also makes good use of cached searches, i.e. the top X searches will be cached and ready for instant retrieval.
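To make the caching idea concrete, here is a minimal sketch of a query cache. It is only an illustration: nobody in this thread knows how Google's cache actually works, so this swaps in a plain least-recently-used (LRU) policy, and the names `QueryCache`, `slow_search` and `search` are made up for the example.

```python
from collections import OrderedDict


class QueryCache:
    """Tiny LRU cache for search results (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._store = OrderedDict()  # query -> results, oldest first

    def get(self, query):
        if query not in self._store:
            return None
        # Mark the query as recently used.
        self._store.move_to_end(query)
        return self._store[query]

    def put(self, query, results):
        self._store[query] = results
        self._store.move_to_end(query)
        # Evict the least recently used query when over capacity.
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)


def slow_search(query):
    # Hypothetical stand-in for the expensive index lookup.
    return ["result for " + query]


def search(cache, query):
    hit = cache.get(query)
    if hit is not None:
        return hit  # served straight from the cache
    results = slow_search(query)
    cache.put(query, results)
    return results
```

Popular queries keep getting moved to the fresh end of the cache, so they stay available for instant retrieval while rare queries fall out.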
On a side note, Vivisimo is a much better search engine than Google imo. It makes Google look primitive in comparison.
Jan-16-2004 23:05
Noisician
Harsh electronic purity
Registered: Aug 2001
Location:
search engines work fast because:
1. they organize information via *inverted* indexes, implemented through dynamic dual data-structures.
2. their inverted lists are organized in a way that allows the most frequently needed information to be retrieved more easily.
3. these inverted lists (and documents) are divided among a large number of servers.
4. they use various implementations of hash tables to store their entire lexicon in main memory.
5. they use heap structures (complete binary trees) for partial sorting and also to implement priority queues.
6. they use intensive compression, with different compression schemes for different types of data.
7. they use different query plans for different kinds of queries.
8. google, in particular, uses a number of patented technologies to rank its pages (and the search results) by order of relevance.
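Points 1 and 5 in the list above can be sketched in a few lines. This is a toy version under simple assumptions (whitespace tokenizing, term counts as the score, everything in one process), not how any real engine does it; `build_inverted_index` and `search` are names invented for the example.

```python
import heapq
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to {doc_id: number of occurrences}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index


def search(index, query, k=3):
    """Return up to k doc ids containing every query term, best first."""
    terms = query.lower().split()
    if not terms:
        return []
    postings = [index.get(term, {}) for term in terms]
    # Keep only documents that appear in every term's posting list.
    candidates = set(postings[0])
    for posting in postings[1:]:
        candidates &= set(posting)
    # Score = total occurrences of the query terms in the document.
    scored = ((sum(p[doc] for p in postings), doc) for doc in candidates)
    # The heap does a partial sort: only the top k are fully ordered.
    return [doc for _, doc in heapq.nlargest(k, scored)]
```

The point of the inverted index is that a query only touches the posting lists for its own terms instead of scanning every document, and the heap avoids sorting all matches when only the first page of results is needed.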
___________________
Last edited by Noisician on Jan-16-2004 at 23:40
Jan-16-2004 23:34
UglyDave
i ran a marathon : )
Registered: Jan 2003
Location: Buncrana, Éire
i was told it's some form of AI, but i don't really know too much about that.
i'm gonna do some research on this.. methinks me might have found a new interest...
Jan-17-2004 02:03
jpowelso
tranceaddict
Registered: Nov 2003
Location: from Denver to floori.d.a with love.
quote:
Originally posted by 3jaz
ok .. when i do a search on google for "trance" it shows trance.nu first in the list.. WHY? ..
why not tranceaddict ?
does trance.nu pay cash to be put first on the list ?
i think it's because they get more hits. it's shown by how relevant your search is: "trance" is more like trance.nu than tranceaddict.com
search for trance+addict and u will prolly get tranceaddict.com
that's my guess
It has to do with how many crawled sites link to trance.nu, and how popular each of those linking sites is. Google combines all this to determine the "popularity" or "rank" of a site. Hence T.NU > TA.
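Here is a rough sketch of that idea, based on the published PageRank scheme rather than on whatever Google actually runs in production (which uses many more signals). The damping factor of 0.85, the iteration count and the toy link graph are assumptions for illustration. The loop is a power iteration, so the ranks it converges to are the dominant eigenvector of the link matrix, which is where the eigenvalues mentioned earlier in the thread come in.

```python
def pagerank(links, damping=0.85, iterations=100):
    """links maps each page to the list of pages it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over everyone.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: three sites link to "hub", one links to "small".
toy_links = {
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub", "small"],
    "hub": ["a"],
    "small": [],
}
ranks = pagerank(toy_links)
```

In this toy graph "hub" ends up with the highest rank because more sites link to it, and a link from a high-ranked page passes on more rank than a link from an obscure one.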
___________________
quote:
Venus: And there are troops of savage giraffes whose necks are on fire, like
the starry ejaculations of fireworks in the very pale sky of childhood
... Venus: Enter, enter here - men of all kinds and races, victims of reality!
You who have the thirst for dreams.
... Venus: You, on life's bitter road, drenched in hard sunlight who have the
thirst that once more the dark marvel of dreams...