Pattern Opportunities or Distractions?
August 16, 2010
Search platform vendor Endeca unveiled its patterns, attempting to simplify search for professional and entertainment purposes. While the results are impressive, the company might be overdoing things. A Findability.org profile, “Endeca’s Pattern Library,” detailed the patterns, which cover ranges, analytics and more. The profile also claimed, “each pattern includes a problem summary, usages, constraints and challenges, solution elements, examples, and links for further reading.” The patterns themselves are similar to skins that help guide searches. For example, a range finder for a wine search provides range sliders for price, flavor, vintage and other factors. Whittling down these options is supposed to streamline the search. However, in novice hands we fear these patterns are little more than novelty interfaces. Sure, you can choose a range of vintage years, but without the prior knowledge to go along with it, these patterns aren’t helping anyone. Search, in our opinion, has not made giant leaps in the last few years. It may be easier to fiddle with the interface than to deal with the deeper issues of relevance, precision, and recall.
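To make the range-slider idea concrete, here is a minimal sketch of what such a filter does behind the interface. It is not Endeca's implementation. The `Wine` record, the `filter_wines` helper, and the sample catalog are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Wine:
    # Hypothetical record; a real catalog would carry many more fields.
    name: str
    price: float
    vintage: int


def in_range(value, low, high):
    """Return True when value falls inside the inclusive slider range."""
    return low <= value <= high


def filter_wines(wines, price_range, vintage_range):
    """Keep only the wines that satisfy every range slider at once."""
    return [
        w for w in wines
        if in_range(w.price, *price_range) and in_range(w.vintage, *vintage_range)
    ]


# Invented sample data, used only to exercise the filter.
CATALOG = [
    Wine("Table Red", 12.0, 2008),
    Wine("Reserve Cabernet", 85.0, 2001),
    Wine("Everyday White", 9.5, 2009),
    Wine("Old Vine Zinfandel", 30.0, 2005),
]

matches = filter_wines(CATALOG, price_range=(10, 40), vintage_range=(2004, 2009))
print([w.name for w in matches])
```

The sliders only narrow the candidate set. They do nothing to rank what remains, which is where relevance, precision, and recall come back into play.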
Pat Roland, August 16, 2010
Yahoo: Booboo and Boohoo
August 16, 2010
I wanted to snag one passage and run it as a quote to note. Here’s the snippet:
But there are worse things than seeming irresponsible. Losing, for example.
“What Happened to Yahoo” contained a couple of other gems, and you will want to read the complete article, not my comments.
The author was a Yahooligan who makes clear he had an idea that Google later emulated. Too bad for Yahoo, of course. Messrs. Yang and Filo had other fish to fry. Search was not on the menu. Banner ads were.
The two gems in this write up, in my opinion, were:
First, the Internet bubble made Yahoo wallow in banner ad revenue. Life was good, so why focus on the more difficult aspects of running a business? As a result, Yahoo fell behind in its Overture revenue method. The Google was inspired. Yahoo complained and got $1 billion or so. Then Google raced right up to $24 billion as Yahoo stood on the sidelines and watched.
Source: http://techcrunch.com/2008/09/04/yahoo-stock-falls-off-the-cliff-when-will-jerry-give-up-and-quit/
Second, the article describes how Yahoo was, at its core, pretty indifferent to technology. Some camouflaging actions were needed; namely, focusing on easy sales to big dumb companies, describing itself as a media company, and hiring good enough programmers.
Is this an accurate description of Yahoo? Pretty much, but there were some twists and turns within the company based on my research into Yahoo for a large, clueless client over a period of three years. Here are the points that warrant some additional color:
- Yahoo bought companies, operated each in silos, and created a cost structure that was and still is difficult to control. What’s interesting is that Google seems to be heading down this slippery slope at this time.
- Yahoo management did not manage. In one interview, a Yahooligan suggested that units within Yahoo operated as separate companies. Now that’s a recipe for success. Just ask anyone who worked at Booz, Allen & Hamilton when folks were rewarded for taking clients from other BAH professionals. The shark thing. Fun.
- Yahoo’s own technical staff convinced itself that it was really into cutting edge stuff. Sorry. Yahoo had some winners and converted them into so-so outfits. There was the Delicious.com rewrite. Then there was the Flickr – Yahoo Pictures or whatever it was. Silly stuff.
Bottom line: Yahoo does not matter too much to this addled goose anymore.
Stephen E Arnold, August 16, 2010
Freebie
Data Centers for Facebook and Google: Juiciness in Alleged Facts
August 16, 2010
Navigate to “Two Data Centers Present a Study in Contrasts.” The information in the write up is germane to search and social networking. A happy quack to Theodoric Meyer, who did a very good job on this article for the Dalles (Oregon) Chronicle.
Dalles? Yep, that’s the town in which first Google, then Facebook, decided to set up power sucking data centers. In the olden days, data centers had lots of people. Today’s data centers are designed to be as close to people free as possible. Humans wandering around a data center filled with itty-bitty gizmos crunching lots of data can get screwed up in a heartbeat if a clumsy human does something like pull a plug or punch a button to see what happens.
You will want to read the full write up by Mr. Meyer. Here are the factoids that I noted:
- Google set up shop in scenic and struggling Dalles in 2006. Now Facebook with its Xooglers is on the same path.
- Data center managers have to make nice with city officials, particularly in places like scenic and struggling Dalles.
- Facebook is doing a better job of building bridges than Google’s Math Club crowd did.
- No Oregon taxes will be paid on the data centers for 15 years. (A big yes to the American market system.) A minimum number of hires and higher pay were the requirements Google and Facebook had to meet.
- Facebook’s facility will have 147,000 square feet or about 2.2 American football fields. That’s almost as big as a typical trailer here in Harrod’s Creek, Kentucky.
- Facebook power consumption will be at 30 megawatts with a need to access up to 90 megawatts of power. BGF (before Google and Facebook), the township used 30 megawatts of power.
- Google has done “a lot of good” in Dalles.
The key factoid. The fellow responsible for Google’s Dalles facility has been hired by Facebook. You can take the Xoogler out of Google but you can’t take the Google out of the Xoogler.
And that contrast? Math Club compared to making nice with political officials.
Stephen E Arnold, August 16, 2010
A New Concept or Buzzy Jargon?
August 16, 2010
Is the internet changing again? Will the term “Web 2.0” be about as useful as vacuum tubes in a television soon? The job is all but done, one data manipulating company claims. Data governance experts, Collibra (http://www.collibra.com/), who help clients better transform data into usable information, brought up some interesting and head-scratching points about the future of data management in a recent Collibra Inside blog post, “Social Semantics, Hybrid Ontologies and the Tri-Sortal Internet” (http://inside.collibra.com/?p=767). Providing slides from a recent conference about how we can “tackle the mass of (meta)data about communities (enterprises, business webs), people, and systems and the links in between,” the article went on to claim, “visual analysis of the linked data cloud reveals the same non-linear graph structure as found in social networks. Hence there is indeed a tri-sortal dynamics.” This is some heady stuff, but intriguing. The term “tri-sortal” is definitely one we’ll keep on file for the future. We may not use it, however.
Pat Roland, August 16, 2010
Solr Heats Up Search Landscape
August 16, 2010
“Enterprise Drupal Solr search|Achieve Internet: Enterprise Drupal Web Development & Drupal Web Design” reports that “Achieve Internet is now leading the Solr development portion of the project.” Solr is a popular search platform due to its speed and features, but did you know how many enterprises are using Solr? From Achieve Internet’s website, here are the top five public sites that use Solr to handle search:
- Internet Archive – Search this vast repository of music, documents, and video using Solr.
- Netflix – Solr powers basic movie searching on this extremely busy site.
- Smithsonian Institution – Search the Smithsonian’s collection of over four million items.
- StubHub.com – This ticket reseller uses Solr to help visitors search for concerts and sporting events.
- WhiteHouse.gov – The Obama administration’s keystone website is Drupal and Solr!
Solr’s gaining momentum and is already being used by industry leaders for searches. You can download Lucene/Solr from www.lucidimagination.com.
Bret Quinn, August 16, 2010
Amazon and A9: Still Kicking
August 15, 2010
Be sure to check out Jeff Dalton’s blog. A recent entry, entitled “Jeff’s Search Engine Caffe: SIGIR 2010 Industry Day: Lessons and Challenges from Product Search,” recaps some valuable lessons on search engines.
A9’s Daniel Rose presents a “buying funnel,” and relates this to Elias St. Elmo Lewis’ 1898 AIDA model: “awareness is followed by interest, then desire, and finally action.”
We need different tools for different stages, and not one facet fits all. Amazon is a marketplace, so the search must be designed for real time. Zero clicks indicates the search result contained all the information necessary.
We need to satisfy user needs before the user knows he has a need. We can offer different interaction mechanisms (not one facet fits all), and let the type of content influence the way the search works.
Bret Quinn, August 16, 2010
Dictionary Files for Free
August 15, 2010
These days, we’re all involved in a research project. Check out this Greek Translation Vortal site, appropriately titled “Download English Dictionary, Download Roget’s Thesaurus.” This remarkable site was created by Spiros Diokas, who is himself a translator for hire. At this site, you can download the English dictionary for free! It’s called “Project Gutenberg’s Etext of Webster’s Unabridged Dictionary.” Other downloadables include Roget’s Thesaurus, the King James Version of the Bible, and the Oxford Dictionary of Quotations. All of these are indispensable for building term lists for research projects and for general reference. The dictionary of quotations is really cool. It installs on your hard drive and runs in the background, so you can consult it only when you need it.
Bret Quinn, August 15, 2010
Need a Solr Primer?
August 15, 2010
You know about Apache Solr, right? The video “Introduction to Apache Solr” is a great demo of this open-source search wunderkind. It explains how queries are parsed, how facets increase findability, and how to interact with Solr, all of which are a snap.
Things really heat up when you enable Lucene faceting queries. Faceting is already being widely used in e-commerce and in libraries. Using Apache Solr, you can search within categories, modify your searches by range fields, and filter results using multiple parameters in single queries.
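For readers who want to see what a faceted request looks like, here is a rough sketch that assembles a Solr query URL by hand. The `q`, `fq`, `rows`, `wt`, `facet`, and `facet.field` parameters are standard Solr request parameters. The host, the `/solr/select` path, the field names (`category`, `price`, `region`, `vintage`), and the `build_solr_query` helper are assumptions made up for this example. The sketch only builds the URL and does not contact a server.

```python
from urllib.parse import urlencode


def build_solr_query(base_url, text, filters, facet_fields, rows=10):
    """Build a Solr select URL with filter queries and field facets.

    base_url and all field names are hypothetical; adjust to your schema.
    """
    params = [("q", text), ("rows", str(rows)), ("wt", "json")]
    # Each fq narrows the result set without affecting relevance scoring.
    params += [("fq", f) for f in filters]
    if facet_fields:
        params.append(("facet", "true"))
        # One facet.field per field whose value counts should be returned.
        params += [("facet.field", f) for f in facet_fields]
    return base_url + "?" + urlencode(params)


url = build_solr_query(
    "http://localhost:8983/solr/select",          # assumed local install
    "merlot",
    filters=["category:wine", "price:[10 TO 40]"],  # category plus a range filter
    facet_fields=["region", "vintage"],
)
print(url)
```

The two `fq` entries show a category restriction and a range restriction combined in a single request, which is the multi-parameter filtering described above.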
We’re impressed. Solr is a useful, user-friendly open-source search engine with an austere present and a wide-open future. You can download Lucene/Solr from www.lucidimagination.com.
Bret Quinn, August 15, 2010
Quote to Note: No Anonymity
August 14, 2010
Google had a rough Friday the 13th. From the land that gestated, “There is no privacy. Get over it.” comes a Mozart DuPont variation. Point your browser thingy at “Google CEO Schmidt: No Anonymity Is the Future of Web.” Here’s the quote I noted:
“Privacy is incredibly important,” Schmidt stated. “Privacy is not the same thing as anonymity. It’s very important that Google and everyone else respects people’s privacy. People have a right to privacy; it’s natural; it’s normal. It’s the right way to do things. But if you are trying to commit a terrible, evil crime, it’s not obvious that you should be able to do so with complete anonymity. There are no systems in our society which allow you to do that. Judges insist on unmasking who the perpetrator was. So absolute anonymity could lead to some very difficult decisions for our governments and our society as a whole.”
I seem to recall a bit of a snit with Cnet when that outfit published information about a certain Google executive.
I like the medieval approach. The kings and queens at the top operating in one way, and then the serfs digging potatoes and watching lords and ladies do pretty much what each wants. Seems fair to me.
Stephen E Arnold, August 14, 2010
Freebie.
Twitter: New Monetizing Play?
August 14, 2010
Data and text mining boffins like to crunch “big data.” The idea is that the more data one has, the less slop in the wonky “scores” that fancy math slaps on certain “objects.” Each individual thinks that his / her actions are unique. Not exactly. The more data one has about people, the easier it is to create some conceptual pig pens and push individuals in them. If you don’t know the name and address of the people, no matter. Once a pig pen has enough piggies in it (50 is a minimum I like to use as a lower boundary), I can push anonymous “users” into those pig pens. Once in a pig pen, the piggies do some predictable things. Since I am from farm country, piggies will move toward chow. You get the idea.
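The pig pen idea can be sketched in a few lines. This is a toy illustration, not anyone's production method. The `bucket` rule, the feature names (`posts_per_day`, `top_topic`), and the synthetic users are all invented. Only the lower boundary of 50 comes from the paragraph above.

```python
from collections import defaultdict

MIN_PEN_SIZE = 50  # the lower boundary mentioned in the text


def assign_pens(users, key_fn):
    """Group anonymous user ids by whatever pen key_fn assigns them to."""
    pens = defaultdict(list)
    for user_id, features in users.items():
        pens[key_fn(features)].append(user_id)
    return pens


def usable_pens(pens, min_size=MIN_PEN_SIZE):
    """Keep only pens with enough members to support predictions."""
    return {key: ids for key, ids in pens.items() if len(ids) >= min_size}


def bucket(features):
    # Hypothetical segmentation rule: activity level plus favorite topic.
    level = "heavy" if features["posts_per_day"] >= 5 else "light"
    return (level, features["top_topic"])


# Synthetic, anonymous users: no names or addresses needed.
users = {}
for i in range(60):
    users[f"anon-{i}"] = {"posts_per_day": 8, "top_topic": "sports"}
for i in range(60, 70):
    users[f"anon-{i}"] = {"posts_per_day": 1, "top_topic": "cooking"}

pens = usable_pens(assign_pens(users, bucket))
print({key: len(ids) for key, ids in pens.items()})
```

With only ten members, the light cooking pen falls below the threshold and is dropped. That is the practical argument for wanting more history than a few days' worth.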
When I read “Twitter Search History Dwindling, Now at Four Days,” I said to myself, “Twitter can charge for more data.” Who knows if I am right, but if I worked at Twitter, I could think of some interesting outfits who might be interested in paying for deep Twitter history. Who would want “deep Twitter history”? Good question. I have written about some outfits, and I have done some interviews in Search Wizards Speak and the Beyond Search interviews that shed some light on these folks.
What can a data or text miner do with four days’ data? Learn that he / she needs a heck of a lot more to do some not-so-fuzzy mathy stuff.
Stephen E Arnold, August 14, 2010
Freebie.