UPDATE: This post is now available as a completed white paper. Get the PDF download here. 
This blog post is a summary of the forthcoming white paper from OneRiot, “The Inner Workings of a Realtime Search Engine.” For an advance copy, just ping Tobias. In the meantime, please leave comments and ask questions on this blog post. Let us know whether we’ve covered enough ground and gone into enough depth. We will try to address each point both on the blog and while completing the white paper.
40% of users perform search queries which display an intent that is best satisfied by realtime search results. Industry numbers aside, Iran – the country, the situation and the search query – has conclusively proven that users want search results from the realtime web.
Users Want Realtime Search
Across all the major search engines, including Google, Yahoo, Bing and Ask, industry numbers indicate that 40% of users are performing search queries which display an intent that is best satisfied by realtime search results. Irrespective of industry numbers, Iran – the country, the situation, and the search query – has proved beyond doubt that there is huge demand for search results from the realtime web. The question on everybody’s lips is: “What’s going on right now?” To answer it, users need to find the news, images, conversations, stories and videos with the most social relevance right now. Realtime search results meet that need.
Every day, hundreds of millions of search engine users type something as heavyweight as “Obama,” or as entertaining as “Britney,” into the search box and expect to find out what’s going on right now for that topic. These types of searches are commonly called “Browse” searches, as people are browsing for information. They don’t have a particular URL in mind. They just want to know what’s going on right now; the source of the information matters less than the information itself. Those users are best satisfied by search results from the realtime web.
Making up the remaining 60% of searches on the web are “Navigation” searches (20%) and specific “Informative” searches (40%). An example of a Navigation search is when a user is trying to get to Sony.com or Yahoo.com. They will enter a search query in an attempt to find a recognized home page. An example of an Informative search is when a user is trying to find a specific recipe for Cabbage Soup that is definitely “out there somewhere.” They enter a query in an attempt to find that specific information.
The best traditional search engines are very good at finding Navigation search results and specific Informative results. The best realtime web search engines are very good at finding Browse search results, addressing fully 40% of the market. With 1% of the search market worth $1bn per year, 40% is a huge target to go after.
Traditional Search – A Broad Overview
Traditional search engines treat the web like a library. Web pages are crawled, and the content is stored in an index for efficient retrieval of information. Those web pages also build up a “Rank” over time (e.g. Google’s “PageRank”). Pages with the highest Rank percolate to the top of the results.
A page’s Rank is constructed from many factors, but one of the most important is citation importance: broadly, the number of inbound links to that web page. This approach tends to favor highly referenced resources like Wikipedia. For example, search for “Britney Spears” on a traditional search engine and the top result is likely to be a Wikipedia page. This approach produces dependable results, but those results do not necessarily reflect why the user is searching for Britney at any particular time (i.e. to find out what’s going on right now). Additionally, a page’s Rank is relatively static. It changes periodically, but not fast enough to keep up with the realtime world of changing interest in a topic. A page with a high Rank may have been tremendously relevant yesterday, but that does not make it relevant today or tomorrow. A traditional search engine is only able to return yesterday’s relevant result.
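To make the library analogy concrete, here is a minimal sketch of the traditional approach described above: pages go into an inverted index, and a rank is computed ahead of time from inbound links. Everything in it is an assumption made for illustration. The page URLs, text and links are invented, and a raw inbound-link count is only a crude stand-in for citation importance, not how PageRank or any real engine computes rank.

```python
from collections import defaultdict

# Hypothetical toy corpus: URL -> (page text, outbound links).
PAGES = {
    "wiki/britney": ("britney spears biography and discography", []),
    "news/britney-tour": ("britney spears announces tour dates today",
                          ["wiki/britney"]),
    "blog/pop": ("pop music roundup with britney and others",
                 ["wiki/britney", "news/britney-tour"]),
    "fan/site": ("britney fan page", ["wiki/britney"]),
}


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, (text, _links) in pages.items():
        for word in text.split():
            index[word].add(url)
    return index


def inbound_link_counts(pages):
    """Count inbound links per URL (a crude stand-in for citation importance)."""
    counts = {url: 0 for url in pages}
    for _url, (_text, links) in pages.items():
        for target in links:
            if target in counts:
                counts[target] += 1
    return counts


def search(query, index, rank):
    """Return URLs containing every query word, highest rank first."""
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(matches, key=lambda url: (-rank[url], url))


INDEX = build_index(PAGES)
RANK = inbound_link_counts(PAGES)  # computed ahead of time, not at query time
results = search("britney", INDEX, RANK)
print(results)
```

In this toy corpus the heavily linked reference page comes first for “britney” no matter when the query is run, and the fresher news page can only move up by accumulating links over time. That is the static behavior the paragraph above describes.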
Traditional search engines struggle to surface the hyper-fresh and socially relevant “realtime” results that satisfy users performing Browse searches. OneRiot, a realtime search engine, is focused exclusively on solving that problem and addressing that 40% of the market. To do that, we have had to:
Invent new ways to index the web: by harnessing the power of the realtime social web.
Invent new ways to rank the content in that index: at search time, to deliver the most relevant result right now.
We will now consider each of these two innovations in turn.