Enterprise Search: Baloney Six Ways, like Herring

December 21, 2010

When my team and I reviewed my write up about the shift of some vendors from search to business intelligence, quite a bit of discussion ensued.

The idea that a struggling vendor of search, most often an outfit with older technology, commonly “reinvents” itself as a purveyor of business intelligence systems evoked some strong reactions.

One side of the argument was that an established set of methods for indexing unstructured content could be extended. The words used to describe this digital alchemy were Web services, connectors, widgets, and federated content. Now these are or were useful terms. But what happens is that the synthetic nature of English makes it easy to use familiar sounding words in a way to perform an end run around the casual listener’s mental filters. It is just not polite to ask a vendor to define a phrase like business intelligence. The way people react is to nod in a knowing manner and say “for sure” or “I’ve got it.”

image

Have you taken steps to see through the baloney passed off as enterprise search, business intelligence, and knowledge management?

The other side of the argument was that companies are no longer willing to pay big money for key word retrieval. The information challenge requires a rethink of what information is available within and to an organization. Then a system developed to “unlock the nuggets” in that treasure trove is needed. This side of the argument points to the use of systems developed for certain government agencies. The idea is that a person wanting to know which supplier delivers the components with the fewest defects needs an entirely different type of system. I understand this side of the argument. I am not sure that I agree, but I have heard this case so often that the USB with the MP3 of the business intelligence sound file just runs.

As we approach 2011, I think a different way to look at the information access options is needed. To that end, I have created a tabular representation of information access. I call the table and its content “The Baloney Scorecard, 2011.”

Read more

Webinar Finder from Peelon

December 20, 2010

We’ve recently stumbled upon a promising new resource at Peelon.com.  To put it in their own words, Peelon.com “is a webinar directory and can be used as a webinar search engine” AND it is absolutely free of charge, not to mention free of advertising.  Peelon vows to do just two things: help find a webinar, or help promote a webinar.  After investigating the site for only a few minutes, we found the straightforward, no-frills functionality easy to harness.

The querying capability is there, allowing the user to sort all available records by date or time, industry and type of webinar.  It wouldn’t be surprising to see these initial option categories expand with increased traffic.  But for now, if those options are not sufficient to pinpoint the e-lecture of choice, there is a search box to enter any relevant words or phrases.  The results can be filtered by date, comments or even popularity.

Click on any webinar and one will find all the pertinent details spelled out: date, time, description etc.  Curiosity led me to check out the “Add new webinar” link which prompted a page of empty webinar details waiting for user input.  By the looks of it, the process to post a webinar can’t take longer than five minutes and even that includes one coffee break.

All in all, this site is free of clutter, hassle and just plain free.  You won’t hear any complaining here!

Sarah Rogers, December 20, 2010

Freebie

Big Data, CAP, and NoSQL

December 19, 2010

We came across an interesting series of Web write ups about big data. You may know about the CAP theorem. The idea is that it is “impossible for a distributed computer system to provide simultaneously” guarantees of “consistency (all nodes see the same data at the same time), availability (node failures do not prevent survivors from continuing to operate), [and] partition tolerance (the system continues to operate despite arbitrary message loss).” For more, read the Wikipedia entry here.

Nati Shalom’s blog series on the CAP theorem, Parts I, II, and III, postulates that if you are worried about CAP, then maybe you just need to re-define your needs.  Shalom’s general thesis is as follows:

One of the core principals behind the CAP theorem is that you must choose two out of the three CAP properties. In many of the transactional systems giving away consistency is either impossible or yields a huge complexity in the design of those systems. In this series of posts, I’ve tried to suggest a different set of tradeoffs in which we could achieve scalability without compromising on consistency. I also argued that rather than choosing only two out of the three CAP properties we could choose various degrees of all three.
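Here is a toy sketch of the trade-off. It is not taken from Shalom’s posts or from any real database. Two replicas hold one value, the link between them is cut, and the store either refuses the write (keeping consistency) or accepts it on one side (keeping availability). The class names, the "CP" and "AP" mode strings, and the demo function are all made up for illustration.

```python
class Replica:
    """One node holding a single value (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        self.value = None


class TwoNodeStore:
    """A made-up two-replica store that picks a side when partitioned."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False  # True means the replicas cannot talk

    def write(self, replica, value):
        """Write through one replica; return True if the write is accepted."""
        other = self.b if replica is self.a else self.a
        if not self.partitioned:
            # Healthy network: update both copies together.
            replica.value = value
            other.value = value
            return True
        if self.mode == "CP":
            # Consistency first: refuse rather than let the copies diverge.
            return False
        # Availability first: accept locally and let the copies diverge.
        replica.value = value
        return True

    def consistent(self):
        """True when both replicas hold the same value."""
        return self.a.value == self.b.value


def demo(mode):
    """Write once, cut the link, write again; report (accepted, consistent)."""
    store = TwoNodeStore(mode)
    store.write(store.a, "v1")
    store.partitioned = True
    accepted = store.write(store.a, "v2")
    return accepted, store.consistent()


print("CP:", demo("CP"))
print("AP:", demo("AP"))
```

In the CP run the second write is rejected and the replicas still agree. In the AP run the write goes through and the replicas disagree until something reconciles them. Real systems sit somewhere between these two extremes, which is the "various degrees" point in the quotation above.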

Some useful info to tuck away for future reference and consult before talking to a vendor who is pitching scale and the cloud when you need search or content processing.

Alice Wasielewski, December 19, 2010

Freebie

How Americans Spend Their Time

December 18, 2010

“Slurp, slurp.”

That is the sound of “real journalists” gobbling the latest Forrester confection. I read “Forrester: Americans Spend Equal time Online and Watching TV.” Great headline, but I am not sure I know what “time” means. Also, the pairing of online and watching TV is ambiguous.

I get the point. Web activity is now as popular as watching the boob tube. Great.

But what happens to the data if a person watches TV when online?

I think I know what the mid tier outfit is trying to accomplish: make sales for its consulting business. The “data” are the bait for the canny Forrester fishermen and fisherwomen.

Here’s the main idea. People are spending as much time watching TV as they are fiddling with their computers, which I think includes devices that are computers hauled around or tucked in a pocket.

Several observations:

  • What’s the sample size? What was the sampling method? Is the n=xxx such a big deal? Omit that from the stats homework in the lousy liberal college I attended as a dull normal and the prof awarded an automatic F. Guess that doesn’t apply to mid tier consulting outfits.
  • Online usage is growing. Okay, great to know since devices have been proliferating for several years. It makes sense that if there are more devices, usage would go up.
  • TV sucks. Well, the write up did not document that, but the TV crowd, like the newspaper and other publishers, are in a tizzy as people use their laptops and gizmos like the Apple TV to get the programming each user wants. With control, TV sucks less. If you want only shows you love, TV does not suck at all.
  • The features used by those online mirror the same Alexis-Charles-Henri Clérel de Tocqueville “average” that his travels in America documented. The only difference is that the stuff that pleases is pretty well known; for example, email, buying stuff, and socializing.

What’s not in the write up but may be in the “real” study available from Forrester? Facebook. My hunch is that the demographics of a statistically valid sample, rigorously surveyed, would reveal some nuances not in the article and maybe in the “real” study. Here’s a list:

  • In each demographic, which activity is growing more rapidly, which is decreasing more rapidly?
  • In the demographic with the heaviest TV usage, what’s the group doing? Using the TV as background, a way to feel loved, or as a primary activity?
  • In the demographic with the heaviest online usage, what amount of time is spent on Facebook versus any other social system?
  • Across the sample, what is the lean back versus lean forward behavior? How many in each sector use one mode as a primary and the other mode as a secondary?
  • Across demographics, who does the most buying? Under what conditions?

Our work in this field suggests some surprising behavioral shifts. The multitasking characteristic is covered in a Forrester blog post. Presumably that activity is documented rigorously in the “real” report.

But what about that sample? What confidence should I have in the oh-so-precise data? Without data about the mechanics of the study, not much I fear.
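To put a number on that doubt, here is a back-of-the-envelope sketch of the textbook margin of error for a surveyed proportion, z * sqrt(p * (1 - p) / n). It assumes a simple random sample and a 95 percent confidence level (z = 1.96). The sample sizes are invented, because I have no n from the write up, and the function name is my own.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Margin of error for a proportion p from a simple random sample of n.

    z = 1.96 corresponds to a 95 percent confidence level. This is the
    standard normal-approximation formula, not anything Forrester published.
    """
    if n <= 0:
        raise ValueError("sample size n must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("proportion p must be between 0 and 1")
    return z * math.sqrt(p * (1.0 - p) / n)


# Hypothetical sample sizes; p = 0.5 is the worst case for the margin.
for n in (100, 1000, 10000):
    moe = margin_of_error(0.5, n)
    print(f"n={n}: plus or minus {moe * 100:.1f} percentage points")
```

At n = 100 the margin is about 9.8 percentage points; at n = 1,000 it is about 3.1. A claim that two activities get “equal” time means little if the gap between them is smaller than the margin, which is why the missing n matters.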

Stephen E Arnold, December 18, 2010

Freebie unlike the full reports from mid tier consulting firms

US Search Share, November 2010

December 17, 2010

Short honk: Fancy dancing with search share is underway. I read “Bing Search Share Edges Up in November.” In theory, the link will work for a few days. Key point: The Google tallies a 66.2 percent share. The combined Microsoft-Yahoo search share is pretty much the rest of the traffic. No margin of error and no details of the method.

Stephen E Arnold, December 18, 2010

Freebie

Baidu to Invest in Search Results Filtering

December 17, 2010

Those expensive coffees at boutique coffee shops are filtered. People like filtered coffee.

The goose knows why Baidu is probably going to be successful in certain markets. “Baidu to Spend $15 Mln to Screen Search Engine Results” reports that “China’s leading search engine plans to deploy $15 million to expunge illicit material and false information from its search results.” The source? State media.

image

Filtering makes some things better. One example is revenue derived from for-fee services for the China market. Image source: Weekender.com

Observations:

  1. Filtering happens. What’s interesting is the price tag placed on the renewed effort. Details about the scale of the filtering or “expunging” appear in the Interfax.com write up.
  2. Happy government officials. Reading between the lines I could see modest smiles of happiness on the faces of some government officials.
  3. Unbeatable advantage in the fastest growing and largest market for online in the world. No further comment necessary because some Google shareholders may ask, “Tell us again why you are not making every effort to maximize shareholder value in the world’s largest market?”

In my opinion, the $15 million is irrelevant. The message is the investment. Message received in Harrod’s Creek. I am not sure about elsewhere.

Stephen E Arnold, December 17, 2010

Freebie

Price Cutting: An Online Mystery

December 17, 2010

One of the mysteries of online is the behavior of users. Individually the actions are idiosyncratic. Put the behaviors of many users together, and you get a completely different insight into what happens in online environments. The usage data don’t falsify online actions. The more data one has, the easier it is to identify what’s hot and what’s not and what’s working and what isn’t.

Traditional media is starting to get with the online program. The chatter about tracking user behavior is one signal of growing awareness of the value of online behavior.

Every once in a while, a story appears in the “real” publishing industry that highlights one of the mysteries of online. To get the information first hand, navigate to “Amazon Can’t Dent iTunes.” The online version of the story was live as I write this on December 17, 2010. If the link is a 404, you can chase down a hard copy of the December 17, 2010 newspaper. The main point of the story for me was that Apple’s iTunes has resisted Amazon’s price cutting.

image

Amazon is like the Energizer bunny, one of the great ad campaigns in my opinion.

Now in a normal business, a “sale” or “close out” will attract shoppers. In retail, lower prices are one of the standard items in the selling tool kit. A local store had a surplus of weird green sweaters. I saw a sign that said, “Sweaters. $10 each.” The shoppers took the bait like a hungry trout in late autumn.

The Wall Street Journal story told me that price cutting in digital music has not worked out too well for Amazon. The Apple iTunes and snazzy hardware ecosystem has kept its grip on music. I am not a music person, so the fascination with digital music is interesting, but of no consequence to me.

Read more

Leaks Becoming a River

December 16, 2010

“Openleaks Set to Rival WikiLeaks for Business” announces that one of WikiLeaks’ former employees is opening a new, rival company.  In sum: “Openleaks will be a ‘service provider for third parties that want to be able to accept material from anonymous sources’ and will be based in Germany.”  The third party aspect distinguishes it from WikiLeaks since it will be an intermediary and will not host the information for the public.  As these types of sites increase, governments are finding that the ability to gather electronic information is a two-way street:  they can gather information on citizens, but citizens can also find ways to gather it themselves.  And with the lack of current laws for adequately prosecuting Julian Assange, these kinds of leaks are not likely to be dammed up any time soon.

Alice Wasielewski, December 16, 2010

Freebie

Yolink from TigerLogic

December 16, 2010

TigerLogic offers a number of data and content solutions. The company (originally named Blyth Holdings, then Omnis Technology, and then Raining Data) uses proprietary methods to normalize data. The company refers to its method as Pick Universal Data Model (Pick UDM). The Pick UDM is a component across the XDMS and MDMS product lines. The approach looks similar to those used by other XML-centric transformation and access methods.

The company’s newest product is a Facebook user’s solution to the problem of aggregating FB content in one display. PostPost, a real-time Facebook newspaper, is described this way on the TigerLogic Web site:

PostPost enables users to quickly skim relevant passages of text shared by their Facebook friends and sort shared content by type. To access PostPost, users simply login using Facebook Connect, and in a matter of seconds, all shared links, pictures, videos, articles from their Facebook friends will populate the front page of their personal paper.

You can see a video and obtain more information at www.postpost.com.

image

See http://www.postpost.com

We learned about the firm’s Yolink product this summer. Yolink extracts information from behind links and inside documents. On the Yolink Web site, you can see examples of outputs from the system. The content sources include Craigslist, Google Patent Search, and Wikipedia.

Wikipedia included this comment sourced from CNet.com:

“Yolink searches within the pages of your engine’s results to find your search terms in context. Go beyond the links. Search Web pages and discover information conventional search tools may have never revealed. In addition to mining content on a webpage, yolink will mine all of the links on that page for information relevant to your search. Yolink highlights information in the context of its original Web page and on the right side of your browser. Eliminating the need to bounce between multiple windows. Share your findings effortlessly by clicking on the save and share link. An email message containing your valuable information and the original Web page address is instantly created and ready to send, or save in folders for future use. Go beyond conventional search and find commands. Yolink allows you to search lengthy reference manuals, PDFs, legal documents, contracts, and news sites quickly and effortlessly. Yolink is especially helpful with a multi-word search, because it can extract all of the relevant content surrounding any of your search terms and display it all at once.”

Yolink is a unit of TigerLogic. The company develops software and solutions for creating and improving software applications. In addition to Yolink, the company offers XML Data Management Servers (XDMS), Multidimensional Database Management Systems (MDMS) and Rapid Application Development (RAD) software tools.

We think that the emergence of Facebook centric content aggregation tools is an interesting development. Search without navigating to a FB page is part of the “search without search” shift some vendors are advocating.

Stephen E Arnold, December 16, 2010

Freebie

Online Shopping Reduces Hassles of the Mall

December 15, 2010

As reported on adage.com, Ad Age and Ipsos Observer recently conducted a survey studying the holiday buying habits of American consumers, and the results do not exactly bode well for physical retail stores as compared to virtual ones. The article detailed:

“Shoppers hunting for deals and convenience increasingly turned online during the Thanksgiving selling season. Coremetrics reported sales jumps of 19.4 percent on Cyber Monday, 9 percent on Black Friday and 28 percent on Thanksgiving Day when many retailers started rolling out their Black Friday deals early. These numbers far outpaced the nearly flat sales at physical store locations.”

Per the survey, daily discount websites such as Groupon provide an incentive for shoppers to head off to the tangible storefronts in the name of getting a good deal. These sites will even home in on members’ specific interests based on location, demographics and input on interests. While this may seem like a silver lining for the brick-and-mortar locales, the article states that only the consumer is benefiting. “SymphonyIRI Group data show that coupons are not driving incremental sales. They are more likely to offer discounts to those already planning to buy, thereby cutting at the margins for retailers.”

Something else new to the shopping arena is the plethora of iPhone apps suited for this purpose, ranging from instant price comparison programs and barcode scanners to online coupon books, alleviating the need for printing and cutting. Easy just got even easier. So I guess with the internet, ‘the customer is king’ takes on a whole new meaning.

Sarah Rogers, December 15, 2010

Freebie
