Exclusive Interview with the Founder of Hot Neuron
March 23, 2010
What happens when a theoretical physicist focuses his attention on the problems of content processing? One answer is the Hot Neuron technology. Dr. Bill Dimm, after a successful career in physics and finance, founded Hot Neuron to “develop innovative methods and algorithms that help people find and organize information that will make their companies more productive.”
In an exclusive interview for the ArnoldIT.com feature Search Wizards Speak, Dr. Dimm said:
Clustify analyzes the text of your documents and groups related documents together into clusters. Each cluster is labeled with a few keywords to tell you what it is about, providing an overview of what the document set is about, and allowing you to browse the clusters by keyword in a hierarchical fashion. The aim is to help the user more efficiently and consistently categorize documents, since he or she can categorize an entire cluster or a whole group of clusters with a single mouse click. Our approach to forming clusters is impacted by that goal. We use a modified agglomerative algorithm to ensure that the most similar documents get clustered together, and we allow the user to specify how similar documents must be in order to appear in the same cluster. By choosing a high similarity cutoff, the user can be confident that it is safe to categorize all documents in the cluster the same way. Clustify can also do automatic categorization by taking documents that have already been categorized, finding similar documents, and putting them in the same categories.
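To make the similarity cutoff idea concrete, here is a minimal sketch of textbook agglomerative clustering with a user-supplied cutoff. It is not Clustify's method, which Dr. Dimm describes only as a "modified" agglomerative algorithm. The word-count vectors, the cosine similarity measure, the complete-linkage rule, and the names `cosine` and `cluster` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(docs, cutoff):
    """Group documents so every pair in a cluster has similarity >= cutoff.

    Hypothetical illustration only: plain complete-linkage agglomerative
    clustering over bag-of-words vectors, not the vendor's algorithm.
    Returns a list of clusters, each a list of document indexes.
    """
    vectors = [Counter(d.lower().split()) for d in docs]
    clusters = [[i] for i in range(len(docs))]

    def linkage(c1, c2):
        # Complete linkage: score two clusters by their LEAST similar pair.
        return min(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        best_sim, best_pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                sim = linkage(clusters[x], clusters[y])
                if sim > best_sim:
                    best_sim, best_pair = sim, (x, y)
        if best_sim < cutoff:
            break  # no remaining pair is similar enough to merge
        x, y = best_pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


docs = [
    "xml appliance hardware",
    "xml appliance hardware cost",
    "google china search",
    "google china search news",
]
print(cluster(docs, 0.8))
```

Under these assumptions, raising the cutoff yields smaller, tighter clusters, which is the property the quote relies on when it says a high cutoff makes it safe to categorize a whole cluster the same way.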
I asked Dr. Dimm about the intense competition in the text processing sector. He said:
For companies that do original research and adapt their products to their customers’ needs (like us, of course), there is a fair amount of opportunity for differentiation–customers really need to try the products and see what works in their situation. The companies that just pull an algorithm out of a book or mimic another product will be left competing on price.
You can see the technology in action at Dr. Dimm’s MagPortal.com site. For the rest of this exclusive conversation with an innovative thinker in information retrieval, read the full text of the Hot Neuron interview. For more information, visit http://www.hotneuron.com.
Stephen E Arnold, March 23, 2010
A free write up and a free article. I will report this “free” stuff to the Department of Labor. I know the DOJ will care.
The CFO – Information Technology Collision
March 23, 2010
No, it is not as “real media” interesting as the Google – China kerfuffle. I think for most of those involved in search and content processing the story “Australian CFOs let down by ICT and BI” has more right now implications for system professionals than posturing on a world stage. The write up makes a point that I think most azure chip consultants, mavens, and pundits slither around; namely:
Fewer than a third of Australia’s chief financial officers have the right mix of ICT and business analytics to let them swiftly recommend business responses to market changes prompted by the Henry review of taxation or an Emissions Trading System.
In sum, regulations need something and information technology cannot deliver. But there is a killer statement in the write up:
…60 per cent of CFOs were planning major changes in order to be able to better respond to change.
You don’t have to be much of a management whiz to figure out that “change” means thinking new thoughts about people, methods, and vendors.
The economic climate is uncertain and now the CFOs, if this study is spot on, have figured out that the people on whom these CFOs depend, cannot deliver.
What’s this mean for search and content processing? Some vendors will be in a pickle along with other vendors and IT managers who can talk but cannot deliver. Just my opinion.
Stephen E Arnold, March 23, 2010
Nope, a freebie. I will report to the bastion of IT efficiency, the GSA, that I wrote this without any pay.
Google Becomes a Noun Again!
March 22, 2010
ReadWriteWeb’s “Rulers of the Cloud: Google Becomes the Cloud, Search Is a Feature” may make high school English teachers wince but Google has moved from proper noun to verb to noun. From a trademark and brand name point of view, the morphing of the word “Google” could be good news or bad news.
For me, the most interesting comment in the write up was:
The shortest way to describe this is that Google is no longer a verb. It’s becoming a noun. Not just the few clicks to find information, but the information itself and the experience surrounding it.
Xerox took a militant approach to the word “xerox” as a general purpose descriptor of photocopying. Will Google attack or just let language take its course?
Stephen E Arnold, March 22, 2010
No compensation for this short item. I think legal hassles have to go to the ever efficient USPTO. Would you not agree?
XML Appliances
March 22, 2010
I worked my way through a run down of XML appliances. The write up was “The Modern XML Appliance: From Acceleration to Integration and into the Cloud.” The focus was dedicated hardware devices that allegedly cope with the size of XML files. XML can be more verbose than documents stored in an application’s native file format. In addition, XML is not * one thing *. There are flavors of XML and within XML documents one can encounter issues. For example, have you heard, “Where’s that referenced entity?”
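For readers who have not hit the missing-entity problem, here is a small sketch using Python's standard `xml.etree.ElementTree` parser. The helper name `check` and the sample documents are made up for illustration. A document that references an entity it never declares fails to parse, while the built-in `&amp;` entity is fine.

```python
import xml.etree.ElementTree as ET


def check(xml_text):
    """Return "ok" if the text parses, otherwise a short error message."""
    try:
        ET.fromstring(xml_text)
        return "ok"
    except ET.ParseError as err:
        return f"parse error: {err}"


# &amp; is predefined by the XML specification, so this parses.
print(check("<doc>fine &amp; dandy</doc>"))

# &missing; is never declared, so the parser rejects the document.
print(check("<doc>&missing;</doc>"))
```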
Years ago, I looked at a then-new method of coping with the numerous problems of Extensible Markup Language. That firm was Rocket XML, I believe. The approach was to implement some of these clean up functions via software. The article I read focused exclusively on hardware gizmos that are often expensive, tricky to configure, and narrow in their functionality. Some of the appliances have limits on XML file size. So what’s the fix? Manual rework or expensive off loading methods are the approaches some organizations have to follow.
I did a bit of poking around and found another XML appliance article by the same outfit (IT Business Edge), “XML Appliances Get New Mission: Integrating B2B and the Cloud.” This write up ignores:
- Increased interest in putting such functions as XML processing in firmware, not boxes
- The need for restrictive XML in order to deal with variances in files
- The costs of making the jump to the cloud at the present time with the current limitations.
I think the XML appliance sector is an interesting one, but like many specialized niche markets, change is coming, and it will come quickly.
Stephen E Arnold, March 22, 2010
A freebie. No one paid me to write this article. I don’t know to whom in the US government to report a no pay write up about XML. Maybe the Government Printing Office? Baffled am I.
eBook Search Engines
March 22, 2010
A happy quack to the reader who sent me a link to “The 5 Best Ebook Search Engines”. My eyes and eBook readers are not like peanut butter and chocolate. I know that a large number of people are getting on the eBook reader bandwagon. The search system that Work Up seems to favor is MegaPDF. I did some testing and got mixed results. For example, a search for Silesian Station returned some unusual results, including a one page PDF from The Book Thief. You are on your own. I prefer to buy hard copies.
Stephen E Arnold, March 22, 2010
A free write up. I suppose I can report “free” to the Free Public Library here in Louisville, Kentucky.
Infrastructure Ripple from SharePoint
March 22, 2010
Navigate to Thor Projects and read the article “Infrastructure Ripple Effect – The Story of Servers, Racks and Power.” I have about 48 inches of screen real estate and I needed all of it to read the article. The layout is – in a word – interesting. The point of the write up, in my opinion, is summarized in this passage from the article:
I am reminded that any change creates a ton of little ripples.
When an information technology pro runs into problems with a single server, I wonder what the impact of more massive on premises changes might be.
I thought about Mauro Cardarelli’s “Where Does SharePoint Still Fall Short?” when I thought about adding hardware. He wrote:
Let’s face it; the interface for security management is confusing and cumbersome… even for people who use it every day. What are the consequences? First, you increase the likelihood of security breaches (i.e. showing content to the wrong audience). Second, you increase the likelihood of giving users permissions greater than necessary. Finally, you increase the likelihood of having a security model that is highly diluted and overly complex. This is probably why the 3rd party market for SharePoint administration has been so strong… someone needs to pay attention to what these folks are doing! But I would argue that this is reactive (versus proactive) management… and things need to be taken one step further.
Hardware and security. Hmmm.
Stephen E Arnold, March 22, 2010
No one paid me to write this article. I will report this to the Salvation Army, an outfit that knows about work without pay. Perhaps the cloud access to SharePoint will obviate the problem?
ArnoldIT Expands Overflight
March 22, 2010
If you want one-click access to what’s new from leading vendors of search and content processing, navigate to ArnoldIT’s free Overflight service. Pick a company name, select a Google topic area, or run a query on Google’s own 70 plus Web logs. We have added three vendors to the watch service:
- Comperio, one of the Microsoft Fast support entities which has former FAST Search engineers on staff.
- Exorbyte, a vendor with a system that matches other eCommerce and databased content systems feature for feature.
- Funnelback, the Australian open source search system offered by Squiz, an open source content management company.
You will also find a list of three social network service providers: Facebook, Twitter, and LinkedIn. What’s interesting is to click through each of the autogenerated pages for the search and content processing vendors. You may be able to tell who is marketing with some savvy and who is clueless.
Stephen E Arnold, March 22, 2010
A shameless promotion of an ArnoldIT.com service. You now are reminded that Beyond Search is a marketing blog devoted to ArnoldIT.com and Stephen E Arnold.
Google Bombshell: Alleged Links to Intelligence Services Alleged
March 22, 2010
I was plonking along looking at ho hum headlines when I spotted “Chinese Media Hits Out at Google, Alleges Intelligence Links”. The addled goose does not know anything about this source nor about the subject of the article. But the addled goose is savvy enough to know that if this story is true, it is pretty darned important. The main point of the story in Economic Times / India Times is:
Xinhua said in an editorial: “Some Chinese Internet users who prefer to use Google still don’t realize perhaps that due to the links between Google and the American intelligence services, search histories on Google will be kept and used by the American intelligence agencies.”
Okay, that’s interesting. Several years ago, I heard a talk by a citizen in Washington, DC who made a similar comment. My recollection is that Google was pretty darned mad. I wondered if the citizen in Washington, DC was right or wrong. If another source comes up with more detail, the story becomes much more interesting.
Chinese intelligence agents are pretty savvy. And the Ministry of State Security is one of the best. I can’t remember whether Section 6 is the go-to bunch, but perhaps more information will surface.
Stephen E Arnold, March 22, 2010
A freebie. I will report non payment to the DC Chief of Police who is really clued into Google’s activities in Washington.
Quote to Note: Google and Its View of Exchange
March 21, 2010
ZDNet’s “An Interview with Timothy Bray” softened the new Google evangelist’s rhetoric. But the “interview” contained a quote that I don’t want to misplace:
For small and medium businesses, Microsoft Exchange is a “soft target”, Bray noted.
A soft target. I wonder how Microsoft will react.
Stephen E Arnold, March 20, 2010
An unpaid news item. I will report this to the local commandant of the National Guard. Here in Kentucky there are lots of soft targets I think.
Coveo and GEICO Host Webinar on March 23, 2010
March 21, 2010
Fierce Media has asked Beyond Search to facilitate a discussion about “how GEICO thinks about leveraging its data-rich enterprise systems to generate real-time business value and intelligence.” The participants are GEICO and Coveo as well as Stephen E Arnold.
Topics include how the Coveo system can:
- Enable improved business intelligence and decision making through dynamic dashboards and information mashups that provide actionable business information
- Access structured and unstructured data from across enterprise systems and repositories without complex integration or data migration, improving efficiency and cost effectiveness through a unified indexing layer
- Lower the cost of legacy system integrations and upgrades, and reduce time-consuming data migration
- Optimize social networks and incorporate the value of collaboration and just-in-time information exchange into the knowledge ecosystem
The audio program will be on Tuesday, March 23, 2010 beginning at 11:00am Eastern/8:00am Pacific. More information about Coveo may be found at http://www.coveo.com. You can register here.
Ben Kent, March 21, 2010, Beyond Search
This is a sponsored post.