Back to main AEAChicago2007 post
internal search logs are a missing tool
Jakob Nielsen says 50% of users are search-dominated
Zipf curve - long tail distribution - for search results
in this case, try to optimize the short head
look for seasonal patterns
cluster types of queries to look for patterns
how to capture search queries: search logs, local database, commercial search solution

most frequent unique queries?
do they retrieve quality results?
which retrieve zero results?
click-through rates per frequent query?
most frequently clicked result?
what are the referrer pages for frequent queries?
which queries retrieve popular documents?
then you generate specific questions, e.g. Netflix: which most searched and clicked titles are least frequently added to the queue?
analytics won't tell you the answer to a problem, but they'll tell you the problem is there.

User Research
people type SKUs they looked up in the printed catalog into the catalog site
BBC has reports for "people who searched on X also searched on..." using session data
segment needs by security clearance, IP address, job function, account information
or extrapolate segments directly from the data
associate real queries with a persona - now you really know what they care about

Content Development
start from failed queries - does content exist? are there titling, wording, metadata, or indexing problems?
"best bets" results defined manually
identify points with no or way too many results where you could add help
query syntax helps select search features to expose
if people are using queries with boolean operators, make them more visible
if get zero results, could show options to broaden search
if get too many (200 or whatever), could show options to narrow

Interface Design: search entry interface, search results
consider what elements to include in search results - e.g. author name for books
get more clickthroughs on result 10 than on results 6-9 on a page with 10 results
Financial Times saw people entering dates, so let them sort results by date

Retrieval Algorithm Modification
Deloitte, Barnes & Noble, Vanguard show basic improvements (e.g. best bets) aren't enough
needed to go into more complicated and expensive customizations
add spell checking
weight company names in metadata highly

Navigation Design
if created "best bets" to show at top of query results, can also use them to generate an index
Michigan State University builds A-Z index automatically based on frequent queries
cuts across organizational silos
from what pages are searches initiated? those pages are failing and people are stuck.
what are the queries from those points?

Metadata Development
classify queries as types of metadata, then mark documents with that information
Netflix had movies, people, and genres
get possible values for those categories - natural language, jargon, localization (lorry)
most common queries are known-item - there's one correct answer
long tail is often research queries, more open-ended
do some sampling in long tail to check if it's very different from short head

Organizational Impact
bad search results demonstrate what happens when content authors don't follow guidelines
look at common queries and make sure good documents aren't falling in results
Google Analytics and others make it easy to email reports - viral spread of information
Financial Times looks for spikes in queries to find breaking stories
complements qualitative methods that can tell you *why* people do something
need better tools for parsing logs, generating reports - speaker thinks they will get good this year
Hitwise and Comscore can help you benchmark against other sites, but are expensive
Google Trends may also be helpful
having a hard time writing book because can't get data from people
middle area of the tail may have fast-rising or slowly falling items
has free template for analyzing queries
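
A rough sketch of how the basic log questions from the talk (most frequent unique queries, which retrieve zero results, click-through rate per query, most frequently clicked result, how much traffic sits in the short head) might be answered in Python. This is not from the talk: the record layout (query, result count, clicked URL), the sample data, and the function names are all made up for illustration, and a real search log will need its own parsing.

```python
from collections import Counter, defaultdict

# Hypothetical log records: (raw query, number of results, clicked URL or None).
# The layout and the data are assumptions, not a real log format.
LOG = [
    ("returns policy", 12, "/help/returns"),
    ("returns policy", 12, None),
    ("Returns  Policy ", 12, "/help/returns"),
    ("gift card", 3, "/gift-cards"),
    ("gift card", 3, "/gift-cards"),
    ("lorry hire", 0, None),
    ("store hours", 5, "/stores"),
]


def normalize(query):
    """Lowercase and collapse whitespace so trivial variants count as one query."""
    return " ".join(query.lower().split())


def summarize(log):
    """Return one row per unique query, most frequent first."""
    searches = Counter()           # how often each query was run
    clicks = Counter()             # how often a search led to a click
    zero = Counter()               # how often it returned nothing
    clicked = defaultdict(Counter) # which results were clicked, per query

    for raw_query, n_results, click in log:
        q = normalize(raw_query)
        searches[q] += 1
        if n_results == 0:
            zero[q] += 1
        if click is not None:
            clicks[q] += 1
            clicked[q][click] += 1

    report = []
    for q, count in searches.most_common():
        top = clicked[q].most_common(1)
        report.append({
            "query": q,
            "searches": count,
            "ctr": clicks[q] / count,  # click-through rate for this query
            "top_click": top[0][0] if top else None,
            "zero_results": zero[q] > 0,
        })
    return report


def head_share(log, top_n):
    """Fraction of all searches covered by the top_n most frequent queries."""
    counts = Counter(normalize(q) for q, _, _ in log)
    total = sum(counts.values())
    return sum(c for _, c in counts.most_common(top_n)) / total


if __name__ == "__main__":
    for row in summarize(LOG):
        print(row)
    print("share of searches in top 2 queries:", head_share(LOG, 2))
```

The zero-result rows are the starting point for the content questions (does the content exist, is it a wording or metadata problem), and `head_share` gives a quick feel for how far optimizing only the short head would go.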