Exclusive Interview with the Founder of Hot Neuron

March 23, 2010

What happens when a theoretical physicist focuses his attention on the problems of content processing? One answer is the Hot Neuron technology. Dr. Bill Dimm, after a successful career in physics and finance, founded Hot Neuron to “develop innovative methods and algorithms that help people find and organize information that will make their companies more productive.”

In an exclusive interview for the ArnoldIT.com feature Search Wizards Speak, Dr. Dimm said:

Clustify analyzes the text of your documents and groups related documents together into clusters. Each cluster is labeled with a few keywords to tell you what it is about, providing an overview of what the document set is about, and allowing you to browse the clusters by keyword in a hierarchical fashion. The aim is to help the user more efficiently and consistently categorize documents, since he or she can categorize an entire cluster or a whole group of clusters with a single mouse click. Our approach to forming clusters is impacted by that goal. We use a modified agglomerative algorithm to ensure that the most similar documents get clustered together, and we allow the user to specify how similar documents must be in order to appear in the same cluster. By choosing a high similarity cutoff, the user can be confident that it is safe to categorize all documents in the cluster the same way. Clustify can also do automatic categorization by taking documents that have already been categorized, finding similar documents, and putting them in the same categories.
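
To make the idea of a similarity cutoff concrete, here is a minimal sketch of agglomerative clustering in Python. It is not Clustify's algorithm, which Dr. Dimm describes only as "modified" and which is not public. The bag-of-words vectors, cosine similarity, complete linkage rule, and the function names `vectorize`, `cosine`, `cluster`, and `label` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter
from itertools import combinations


def vectorize(text):
    """Bag-of-words term counts over lowercased alphabetic tokens (an assumption)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(docs, cutoff):
    """Merge the most similar clusters until no pair meets the cutoff.

    Complete linkage (hypothetical choice): two clusters are only as similar
    as their least similar pair of documents, so every pair of documents in
    a finished cluster has similarity >= cutoff.
    """
    vecs = [vectorize(d) for d in docs]
    clusters = [[i] for i in range(len(docs))]

    def linkage(c1, c2):
        return min(cosine(vecs[i], vecs[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        best_sim, x, y = max(
            (linkage(clusters[x], clusters[y]), x, y)
            for x, y in combinations(range(len(clusters)), 2)
        )
        if best_sim < cutoff:
            break
        clusters[x] = clusters[x] + clusters[y]  # x < y, so deleting y is safe
        del clusters[y]
    return clusters


def label(cluster_ids, docs, n=3):
    """Label a cluster with its n most frequent terms (no stop-word handling)."""
    counts = Counter()
    for i in cluster_ids:
        counts.update(vectorize(docs[i]))
    return [term for term, _ in counts.most_common(n)]


docs = [
    "merger agreement draft contract",
    "merger agreement final contract",
    "lunch menu friday pizza",
    "lunch menu monday pizza",
]
for ids in cluster(docs, cutoff=0.7):
    print(ids, label(ids, docs))
```

Raising `cutoff` yields smaller, tighter clusters, which is the trade-off Dr. Dimm describes: a high cutoff makes it safer to categorize a whole cluster with one click. This sketch recomputes similarities on every pass, so it is suitable only for small document sets.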

I asked Dr. Dimm about the intense competition in the text processing sector. He said:

For companies that do original research and adapt their products to their customers’ needs (like us, of course), there is a fair amount of opportunity for differentiation–customers really need to try the products and see what works in their situation. The companies that just pull an algorithm out of a book or mimic another product will be left competing on price.

You can see the technology in action at Dr. Dimm’s MagPortal.com site. For the full text of this exclusive interview with an innovative thinker in information retrieval, read the Hot Neuron interview. For more information, visit http://www.hotneuron.com.

Stephen E Arnold, March 23, 2010

A free write up and a free article. I will report this “free” stuff to the Department of Labor. I know the DOJ will care.

Infrastructure Ripple from SharePoint

March 22, 2010

Navigate to Thor Projects and read the article “Infrastructure Ripple Effect – The Story of Servers, Racks and Power.” I have about 48 inches of screen real estate and I needed all of it to read the article. The layout is – in a word – interesting. The point of the write up, in my opinion, is summarized in this passage from the article:

I am reminded that any change creates a ton of little ripples.

When an information technology pro runs into problems with a single server, I wonder what the impact of more massive on premises changes might be.

I thought about Mauro Cardarelli’s “Where Does SharePoint Still Fall Short?” when I thought about adding hardware. He wrote:

Let’s face it; the interface for security management is confusing and cumbersome… even for people who use it every day. What are the consequences? First, you increase the likelihood of security breaches (i.e. showing content to the wrong audience). Second, you increase the likelihood of giving users permissions greater than necessary. Finally, you increase the likelihood of a having a security model that is highly diluted and overly complex. This is probably why the 3rd party market for SharePoint administration has been so strong… someone needs to pay attention to what these folks are doing! But I would argue that this is reactive (versus proactive) management… and things need to be taken one step further.

Hardware and security. Hmmm.

Stephen E Arnold, March 22, 2010

No one paid me to write this article. I will report this to the Salvation Army, an outfit that knows about work without pay. Perhaps the cloud access to SharePoint will obviate the problem?

Coveo and GEICO Host Webinar on March 23, 2010

March 21, 2010

Fierce Media has asked Beyond Search to facilitate a discussion about “how GEICO thinks about leveraging its data-rich enterprise systems to generate real-time business value and intelligence.” The participants are GEICO and Coveo as well as Stephen E Arnold.

Topics include how the Coveo system can:

  • Enable improved business intelligence and decision making through dynamic dashboards and information mashups that provide actionable business information
  • Access structured and unstructured data from across enterprise systems and repositories without complex integration or data migration, improving efficiency and cost effectiveness through a unified indexing layer
  • Lower the cost of legacy system integrations and  upgrades, and reduce time-consuming data migration
  • Optimize social networks and incorporate the value of collaboration and just-in-time information exchange into the knowledge ecosystem

The audio program will be on Tuesday, March 23, 2010 beginning at 11:00am Eastern/8:00am Pacific. More information about Coveo may be found at http://www.coveo.com. You can register here.

Ben Kent, March 21, 2010, Beyond Search

This is a sponsored post.

InQuira Embraces the Cloud

March 19, 2010

I read “InQuira Puts It Knowledge Solutions in the Cloud” and learned that the approach “is in no way a light weight version.” On premises search systems can be tough to install, tune, and maintain. Blossom has been, in my opinion, one of the trail blazers for hosted search, and it offers a robust, powerful, and customizable solution. InQuira is moving in that direction as well.

According to the write up which quotes an InQuira officer:

InQuira has existing partnerships with Oracle CRM On Demand, Oracle’s Siebel offering, and Genesys Telecommunications Laboratories. The newest on-demand offering will extend the company’s reach… “[InQuira] has a really established reputation as the best-of-breed intelligent search vendor that quickly and easily integrates with everyone,” says John Ragsdale, vice president of technology research for the Technology Services Industry Association (TSIA).

One feature of the approach is that storage is provided in an “on demand” model.

You can get more information from www.inquira.com.

Stephen E Arnold, March 19, 2010

Freebie. No one paid me to write this. I will report non payment to the Bureau of Labor Statistics, an outfit that tracks work for no compensation each day, every day.

Pew Documents What Some Info Vendors Will Learn the Hard Way

March 17, 2010

There are some tricks to learning. To memorize a list, put each item in a room of your house and walk the rooms, recalling each item by association. One of my classmates remembered the names of the Great Lakes with a mnemonic word. I prefer to look at survey data and let the numbers do the talking. The write up “Pew: Readers Prefer Ad Supported News to Pay Walls” provides me with some evidence that the dreams of traditional publishers to make yesterday’s revenue from gizmos like the iPad and the Nook might be just a figment of the imagination.

According to Pew, the oh-so-reliable research outfit, the article reports:

when it comes to online news, getting people to pay for content they otherwise value is “like trying to force butterflies back into their cocoons.”

Yikes. People must not know this factoid which is pretty well understood among the savvy, but ageing commercial database publishing crowd.

I found this passage fascinating:

First things first: Pew notes that last year, online advertising saw its first decline since 2002. Numbers from eMarketer said that revenues fell by a total of $1 billion between 2008 and 2009. Still, a full 81 percent of Internet surfers say they’re cool with online ads if it means the content remains free, although “much of that is because they find them easy to ignore.” Further, 21 percent said they click on ads “at least sometimes”—much higher than we expected—and that number goes up when the user is more active. For example, among daily Internet surfers, 28 percent reported clicking on ads. For people who visit at least six sites per day, the click rate is as high as 37 percent.

Where’s the revenue going to originate? In my opinion, the former country club owners will be looking for regulatory help in the form of a “news tax” or some financial piece of the online revenue action from the new owners of the information country club. I caddied for peanuts and I don’t think the new country club proprietors will be too keen to give up too much cash to run “real news”.

Stephen E Arnold, March 16, 2010

Free, free as a goose. No one paid me to write this article. My reference to a goose reminded me of the Bethesda Country Club member who bludgeoned a swan to death decades ago to much fanfare. I will report my killing of this story to the new manager of that country club in suburban Washington.

Another Google Jibe

March 15, 2010

Poor, poor Google. From top of the world to a punching bag in less than three months. This new decade is proving to be a challenging one for Google. I just read “Six Delusions of Google’s Arrogant Leaders.” I want to disclose that I too have been accused of being arrogant. Now I don’t have any good reason to be arrogant. I just find that approach works for me, but, please, keep in mind that I am an addled goose, live in rural Kentucky, and am wandering slowly toward being 66 years old. I am no sports car in today’s NASCAR ego race.

But Google! According to the write up, Google is coming across as “cocky”. I don’t want to run down the six delusions. I invite you to go direct and suck up the juiciness yourself. However, I can point to two of the examples and offer a comment.

The first is “users are hungry for Google synergy.” I am not sure what synergy means. I know that the Google platform is one that works like a giant plastic bag wrapped around the earth. The idea is to put everyone in the bag and keep them there. This is mostly complete, but about 25 percent of Web users are outside of the bag and Google wants to get them in one way or another. The notion that users want this is irrelevant. What this delusion makes clear is that Google is retrofitting public relations baloney to match what the company has been working on for about a decade. What’s interesting is that it has taken mavens, pundits, and “real” journalists 360 months to figure out the Google game plan. Who’s delusional? Google, which has mostly accomplished its mission, or the folks just figuring out that Google has been and will continue to push the Google PR line?

The second delusion is that “Google is a worker’s utopia.” Okay, when you take money to do work, by definition, this situation is not utopia for the workers. Companies can make work less onerous or more meaningful, but it is work. I don’t think the Googlers I know are doing much more than drinking the Google Kool-Aid, trying to build their knowledge value, and get some money. Like Apple, Google operates a reality distortion field, and, let’s face it, having Google on one’s résumé is arguably more impressive than a degree in Harry Potter studies from Frostburg College. My view is that Google manipulates its workers as effectively as it manipulates the media. Like the media, Google employees play along. It’s a game with high stakes, but it is a game. Google knows exactly what it is doing.

Now what’s the arrogance? The arrogance is not unique to Google. I call this the Math Club Syndrome. Here’s how it works. A group of folks with specialized interests and skills bond, sort of like a golf foursome from Sigma Chi fraternity. The difference is that no one understands the Math Club and most people understand and envy the Sigma Chi golf foursome. As a method of coping with a world that simply does not understand math, the math club becomes insular. The club’s rules are insider rules and act like a protective barrier. No problem until the math club becomes the first next generation supra-national company jousting on an apparently equal footing with China, the Department of Justice, and giants like Microsoft.

What do we expect from the Math Club? I expect Math Club behavior, complete with the insider jokes about janitors in patent documents. (Oh, janitors is a way of describing Google’s semi autonomous agents which “clean up” statistical anomalies in petascale flows of data. Snort, snort, get it. Janitor equals Dilbert’s garbage collector, the smartest person in the comic strip. Oh, you don’t get it? Well, there you have it. A mismatch between Math Club humor and you, gentle reader.)

My view is that it is time to quit worrying about Google’s power and time to start figuring out how to surf on Google. My column for KMWorld and this month’s column for the Smart Business Network address two different ways to surf on Google. I don’t grouse. I accept that over the last decade Google has emerged as a new ecosystem. You can’t kill it because the Googlers who leave the company spawn Google-centric entities. My last count tallied a couple of hundred of these Xoogler ventures. And Facebook is not much more than a “legacy” of Google. Maybe Facebook will become the new Google, but that won’t change the arrogance.

Math Club is congruent with arrogance. Reality. Live within it; don’t deny it.

Stephen E Arnold, March 14, 2010

No one paid me to write this article. Because I have not been paid and I refer to psychological behavior, I will report my writing for no pay to the Surgeon General who understands such esoteric notions as delusions.

Newsosaurs

March 15, 2010

I read “It’s Hard To Watch The Newsosaurs Turn A Blind Eye To Their Own Extinction” right after I flipped through the New York Times’s Sunday magazine clone from the Wall Street Journal outfit. Let me comment on each information MIRV and offer a couple of observations from my search vantage point.

First, TechCrunch’s write up has a killer comment:

Everyone wants to wall off the Web and keep grazing on declining ad revenues.

I agree. This is a combination of fear, anger, and ostracism. I enjoy pointing out that in the information economy, the traditional giants no longer own the country club. Each day, the former owners find their future will be as caddies to the new information elite. This is, I suppose, a bitter pill to swallow. The TechCrunch article includes the much quoted “burn the boats” admonition from one of the early superstars of the zippy-doo Web that is not the cat’s pajamas. Like Google’s advice to struggling industry, the listeners think that their outfits have already burned the boats, embraced technology, and reinvented themselves. This mismatch between advice and its perception is characteristic of the domain collision that is now taking place. The passage that caught my attention in the TechCrunch write up was:

The longer media companies wait, the bigger disadvantage they will have when they cross over to the other side and find a whole new host of competitors who never had any print legacy businesses to protect. Those competitors right now are blogs and online news hubs who are still furry little rodents in the underbrush, but who won’t stay little forever. The sooner print media companies cross over, the sooner they can be on pure offense. Their online strategies and business models won’t be crippled by any allegiance, or need to protect, to the old print business. If they wait until their online revenues become 25 or 50 percent before they fully commit, it will be too late.

I don’t disagree with the thought. I disagree with the “will be too late.” It is too late.

The example to which I refer is the oversized, glossy, 80 plus page WSJ Magazine filled with “reading.” Well, that’s interesting. I just counted about 32 pages of ads plus a number of features that are tough for me to determine if these are placed for consideration or are actual editorial. The stories focused on cars and fashion with a profile tossed in for good measure.

I remember being told by my Financial Times’s delivery agent before I dropped my print subscription that he tossed the magazine insert because it was too much of a hassle. I wonder if my delivery person for my Saturday WSJ will follow the same path.

Did I read any of the stories? The answer is, “No.” None of them appealed to me. I have a person who works for me who drives a Mini Cooper and it seems to have constant tire problems. I am tired of with-it executives who overcame hardship. Who hasn’t? Fashion? Not interested. I wear black Travel Smith jackets, black never wrinkle pants, and black shoes that do not set off any alarms anywhere I travel. Spare me the trendy. Was there any financial info, business intelligence, or juicy insights into making money grow? Nope. The WSJ added sports and now it is adding a New York Times’s magazine type publication every couple of months.

What’s my take?

  1. WSJ is going after the NYT advertisers. That’s okay but the effectiveness of print ads has to be demonstrable. That might be tough unless the editorial product provides some content consideration. The boundary between an auto story and an advertiser might be getting a few molecules narrower, might it not?
  2. The problem with traditional media is not content; the problem is finance and business models. Offering me 30 pages of ads in 80 pages of paper is somewhat 17th century in today’s world.
  3. The Financial Times’s last home delivery offer to me was $50 a year. Will the Wall Street Journal face the same subscription challenge as readers discover that blending sports, Details magazine editorial, and business profiles might be out of step with what subscribers like me do on a Saturday?

Now search? How will I be able to locate the Gucci suit on the WSJ Web site? Answer: Not until the WSJ figures out image indexing and some other search tricks. I bet that when the iPad version of the WSJ Magazine comes out I will be able to click on a suit and see a map of locations where I can buy a suit that will fit most 20 year old soccer players. Maybe for some folks. Not for me.

Stephen E Arnold, March 14, 2010

No one paid me to write this article. I will report a failure to charge for my writing to the editor of the Army Times, an outfit focused on information in the modern world.

Hating Search

March 14, 2010

I think search sucks. I don’t hate search. Rupert Murdoch has a different view. Navigate to “Video: Rupert Murdoch Loves the iPad, Hates Search”. The idea is that the “legendary founder of News Corp.” is not happy with “the content stealer.” What’s interesting to me is that this message about hating search is delivered in electronic form and appears to reference a video. There’s a great quote in the write up as well in my opinion:

But search on the Internet whether it be Bing or Google, whatever, it’s free and they simply take all our expensive and we think very good content such as Wall Street Journal or whatever and what they call they scrape it and they use it for search, it gives them their raw material for nothing and then they have this very clever business model of charging for searching it, we don’t get any of that. And they are technologically brilliant, they are a long way ahead but they do not have the right to do it if we want to stop them.

My view is a little different. The question is, “What about Facebook and Twitter?”

Stephen E Arnold, March 13, 2010

No one paid me to point out that Google is not rowing its boat in the social media river quite as quickly as some other firms. Because I reference water, I will report non payment for this write up to the Maritime Administration * and * the Coast Guard.

Interfaces Put in the Corral

March 13, 2010

In the last six months, I have been flooded with user experience inputs. Books, emails, and conversations purport to tell me that Web sites have to be an “experience.” Sorry. I like command lines. I like to run queries with syntax along the lines “SS ESOP AND CC=76?? AND UD=9999.” The notion that I am going to scan a bunch of 8 pt links is nuts. I prefer to run queries, iterate, modify result sets, and then peruse content. I like to review short items—what I call information wieners and then if the source item has intellectual nutrition, examine the source document, data table, or other information object. I do research my way, and I resent having to figure out what the heck “smart software” is trying to do. The assumption is that I want to buy something like HP Trim 7 or I want to know about Lady Gaga’s most recent fashion moment. Nope, I want specific information on point to a query. For me, machine generated facets, suggestions, and what other people are seeking are irrelevant and often dorky.

I liked “Overdoing the Interface Metaphor.” The article tackles some interface issues with which I resonated. For me, the most important passage in the write up was:

Improving the product, not faithfully reproducing the physical object, always gets priority. I passed on a long, complex page-turning animation because it didn’t make sense (you’re paging up/down, not left/right) and it would have been distracting. And I opted for an extremely brief cross-fade, rather than a slide, because slides take longer and are more visually jarring.

One voice. Not enough on this subject. I know how I think about UX. It SUX. Just my opinion.

Stephen E Arnold, March 13, 2010

No one paid me to write this item and misspell suck. Because of that error, I will report non payment and spelling freedom to the Bureau of Engraving, where an error can have big consequences quickly.

Google, Data, and Owning the Golf Club

March 11, 2010

I remember when I first interviewed with Barry Bingham Jr., then the boss of the Courier Journal & Louisville Times Co. The newspaper was a monopoly and it had a document from a government official saying it was okay for us to own the morning and evening newspaper, the top TV station, the top AM and FM station, and to have other information related businesses such as high volume printing, door knob ad packet hanging, a ham company, and, oh, yes, an electronic information publishing company. That unit was also number one in revenue per record on the big commercial online systems.

One of my recollections of that interview was my question about revenue and profits. I spoke with him in 1980, so my memory of that meeting in his sunny office looking down Broadway in Louisville from the Courier Journal’s main building was along the lines: “Don’t worry. When people spend money for advertising in this area, we get most of it.” He went on to point out that he would subsidize my riding the bus to work because after working in New York and Washington DC, I was not an automobile buff. He also pointed out that I was expected to volunteer in various civic groups, which for me included helping with youth soccer and contributing to the Chamber of Commerce’s first “Advanced Technology Council.” Yep, 30 years ago I thought Louisville was an advanced technology center. Go figure.

After that initial meeting, Mr. Bingham and I became friends. When he had a computer problem, he would come over to my house and we would fix his Mac. After the sale of the Courier to Gannett, he and I would have lunch at least once a year. I had to make sure I had my wallet because Mr. Bingham would often arrive in his beat up Datsun without any money even though his worth was in the big money.

At one of those lunches, he made a comment about owning the country club. My memory locked on to this analogy, and I want to share it before I suggest you spend some time working through a Google presentation about newspapers and advertising. Here’s the country club story. His view was that newspapers were the owners of the country club. The newspapers decided what to emphasize and in the case of the Courier Journal what issues to push. The Courier Journal, when I joined as an officer of the company, was ranked on a couple of lists as among the Top 25 newspapers * in the world *. The Courier Journal employee ID pass was a sure fire ticket to many events. When I wanted carpet samples from the now defunct Stewart’s Department Store, I showed my Courier Journal ID and was able to haul off a dozen carpet samples. The salesperson told me, “I will come by your house, answer questions, and pick up the samples.” Now that’s power. In New York, a carpet sample to go required a driver’s license, a credit card, and a one page form. In Louisville, the Courier Journal was the owner of the country club, employees were members, and the influence of the company, its employee practices, and its reputation in the community was remarkable. I joined the Courier Journal after a stint at Booz, Allen & Hamilton and (if you can believe this), the Courier Journal had a bigger stick than the venerable firm founded in 1914 by Edwin Booz. Amazing. What’s become of the Courier Journal? It has been Gannettized, and I don’t recall the paper’s receiving any big writing awards. Most of the special sections are recycled wire service stories and photo essays showing people with shoes or Derby hats.

Now navigate to Scribd and download Google Newspaper Economics: Online and Offline by Google’s chief economist Hal Varian. After you work through the data, ask yourself these questions:

  1. Who owns the country club? Traditional newspapers or the “flat” world of Internet content?
  2. Who are the caddies in the age of the “digital Gutenberg”? Internet companies or newspaper publishers?
  3. What must newspapers do to regain their revenue flows?
  4. What can newspapers do to regain their commanding position in the information food chain?

I have my answers. Please, post yours in the comments section of this Web log. I won’t grade your responses, but I will look at them and see if I have missed anything in Dr. Varian’s data centric description of a business sector struggling to find a revenue stream.

Stephen E Arnold, March 11, 2010

No one paid me to write this. Since the presentation was given for an FTC group, I will just say, “Freebie.”
