Exclusive Silobreaker Interview: Mats Bjore, Silobreaker

November 25, 2013

With Google becoming more difficult to use, many professionals need a way to locate, filter, and obtain high value information that works. Silobreaker is an online service and system that delivers actionable information.

In an exclusive interview for Search Wizards Speak, the co-founder of Silobreaker says:

I learned that in most of the organizations, information was locked in separate silos. The information in those silos was usually kept under close control by the silo manager. My insight was that if software could make available to employees the information in different silos, the organization would reap an enormous gain in productivity. So the idea was to “break” down the information and knowledge silos that exist within companies, organizations and mindsets.

And knock down barriers the system has. Silobreaker’s popularity is surging. The most enthusiastic supporters of the system come from the intelligence community, law enforcement, analysts, and business intelligence professionals. A user’s query retrieves up-to-the-minute information from Web sources, commercial services, and open source content. The results are available as a series of summaries, full text documents, relationship maps among entities, and other report formats. The user does not have to figure out which item is an advertisement. The Silobreaker system delivers muscle, not fatty tissue.

Mr. Bjore, a former intelligence officer, adds:

Silobreaker is an Internet and a technology company that offers products and services which aggregate, analyze, contextualize and bring meaning to the ever-increasing amount of digital information.

Underscoring the difference between Silobreaker and other online systems, Mr. Bjore points out:

What sets us apart is not only the Silobreaker technology and our commitment to constant innovation. Silobreaker embodies the long term and active experience of having a team of users and developers who can understand the end user environment and challenges. Also, I want to emphasize that our technology is one integrated technology that combines access, content, and actionable outputs.

The ArnoldIT team uses Silobreaker in our intelligence-related work. We include a profile of the system in our lectures about next-generation information gathering and processing systems.

You can get more information about Silobreaker at www.silobreaker.com. A 2008 interview with Mr. Bjore is located on the Search Wizards Speak site at http://goo.gl/f7niAH.

Stephen E Arnold, November 25, 2013

The Future of Search: Incomprehensible Visualizations?

November 24, 2013

I have watched time shrink in the last 50 years. I recall having time in my first job. I did not feel pressured to do the rush rush thing. Now, when I accept an engagement, the work has to be done in double time in half the time, maybe faster.

As a result, reports have to be short. Graphics have to point out one key point. Presentations have to be six or eight PowerPoint slides. Big decisions are made in a heartbeat. The go go years were the slow slow years.

I took a look at the Kantar Information Is Beautiful Awards. I think I saw the future of search. Users want information presented with Hollywood style visuals. Does it matter that the visualizations are incomprehensible? I don’t think so. Style takes precedence over clarity. I can visualize senior managers telling their colleagues, “I want graphics like these Kantar winners in my next PowerPoint.”

Here’s a winning visual.

How to win an Oscar - Christian Tate

Source: http://www.informationisbeautifulawards.com/2013-winners/

The confusion of clarity with visual zing is interesting. As search vendors struggle to find a formula that generates top line revenue growth and yields net profits, are visualizations like the Kantar winners the future of search? I think the answer may be, “Absolutely.”

Vendors are not sure what they are selling. Whether it is BA Insight’s effort to get LinkedIn search group participants to explain the key attributes of search or other vendors slapping on buzzwords to activate a sales magnet, search is confused, lost maybe. Coveo is search, customer support and more. MarkLogic is XML data management, search, and business intelligence. Amazon, Google, IBM, and Microsoft search does everything one would want in the way of information access. Open source ElasticSearch, LucidWorks, and Searchdaimon are signaling a turn into the path that proprietary Verity blazed in 1988. Vendors do everything in an all out effort to close deals. Visualization may be the secret ingredient that gives search focus, purpose, and money.

Why not skip requiring a user to read, analyze, and synthesize? Boring. Why not present a predigested special effect? Exciting. Everyone will be happier.

Decision making seems to be in a crisis. Pictures instead of words may improve senior managers’ batting averages.

Relying on incomprehensible visuals to communicate will be more fun and prove to be more lucrative. I assume audiences will applaud, cheer, and stomp their feet. Conferences can sell popcorn and soft drinks to accompany the talks.

Go snappy graphics. Will I understand them at a glance? Nope.

Stephen E Arnold, November 24, 2013

Database Ranking Includes Search Engines

November 24, 2013

I read “DB-Engines Ranking.” What struck me is that search engines were included in the list. More remarkable, some of the search systems are not data management systems at all. One data management system bills itself as a search engine. I was surprised to find the Google Search Appliance listed. The system is expensive and garners only basic support from the “search experts” at Google.

Let me highlight the search related notes I made as I worked through the list of 171 systems.

  1. At position 12 is Solr. This is the open source faceted search engine that can be downloaded and installed—usually.
  2. At position 21 is ElasticSearch. The person who created Compass whipped up ElasticSearch and made some changes to enhance system performance. With $39 million in venture funding, ElasticSearch can be many things, but for me the company does search and retrieval.
  3. At position 27 is Sphinx Search. This system makes it easy to retrieve information from MySQL and some other databases without writing formal SQL queries.
  4. At position 38, MarkLogic is the polymath among the group. The company bills itself as enterprise search, XML data management system, and business intelligence vendor. The company also enjoys some notoriety due to its contributions to the exceptional Healthcare.gov project.
  5. In position 44 is the Google Search Appliance. The system is among the most expensive appliances I have examined. Is the GSA an end of life project? Is the GSA a database system? My view is that it is a somewhat limited way to get Google style results for users who see Google as the champion in the search derby.
  6. At position 104 is Xapian. Again, I don’t think of Xapian and its enthusiastic supporters as card carrying members of the database society. For me, Xapian evokes thoughts of Flax.
  7. At position 124 is CloudSearch. Amazon’s somewhat old fashioned search system. Frankly I think of Amazon as more of a database services outfit than a search outfit.
  8. At position 127 is the end of life Compass Search. This was the precursor to ElasticSearch. There are those who are happy with an old school open source solution. Good for them.
  9. At position 149 is SearchBlox. Now SearchBlox uses ElasticSearch. Interesting?
  10. At position 163 is SRCH2. This vendor is one that has some organizational challenges. The focus of the company seems to be shifting to mobile search.

Quite an eclectic list. Some of the systems mentioned are search engines; for example, Basho Riak. In terms of list “points”, ElasticSearch looks like the big winner. Shay Banon made the list with Compass. ElasticSearch is moving up the charts. SearchBlox uses ElasticSearch in its product. What happened to LucidWorks and reflexive search?

Which of these systems would you select for data management? My thought is that one should check out the software before taking a list at face value.

The confusion about search is evident in this list. No wonder the LinkedIn discussion groups want to do surveys to figure out what search means.

Stephen E Arnold

Search Tech May Shift West from Silicon Valley

November 22, 2013

I read “Chinese Supercomputer Retains ‘World’s Fastest’ Title, Beating US and Japanese Competition.” The contest may be nothing more than street racing with silicon. According to the write up:

A Chinese supercomputer has retained the crown of world’s fastest supercomputer, beating competitors from both Japan and the US.

There are several ideas to put the Chinese supercomputer in the back row. Questions about data transfer suggest the new champ has lousy lungs. It is also possible that the graphics card makers’ performance enhancing drug, jiggled and manipulated test suites, was involved. Yes! Winner!

The Chinese have won the race two years in a row. In terms of my interests, the Chinese performance is one more datum supporting the notion that engineers from other countries have some work to do.

In terms of search, the reality is that most search systems are pretty much the same in terms of what they deliver to users—frustration and off point results. To improve the search and retrieval systems, more computing horsepower is needed.

With zippy computers and their various technologies, will the innovations in search come from the traditional drag race winners? Perhaps faster machines will allow more sophisticated methods of processing text and the magical “Big Data” will come from the Middle Kingdom?

Fast computers are enablers. Worth watching? Probably.

Stephen E Arnold, November 22, 2013

Elasticsearch Boasts a Gild Cheerleader

November 22, 2013

Luca Bonmassar, the co-founder and chief product and technology officer of Gild, recently presented at RubyConf on the best ways to build Elasticsearch-backed search in Ruby. PRWeb details the talk in “Gild’s Luca Bonmassar Presents At RubyConf On The Future Of Search.”

Here is a summary of Bonmassar’s speech:

“A consummate innovator and serial entrepreneur, Bonmassar will discuss how to build an Elasticsearch cluster, create indexes, load data, and format and execute robust search features using the Ruby Tire library. With end users expecting a high-level search experience wherever they go, Elasticsearch allows developers to keep up with UX demands by incorporating auto-suggest, spell-correcting, and personalized search on Ruby applications more easily. For Ruby developers who want to incorporate the highest level search features on their platform…”

Gild uses Ruby in its Gild Source tech hiring software. Gild Source helps companies hire skilled IT professionals, especially Ruby experts. Bonmassar is very passionate about the Ruby Tire library and advocates for open source. In a personal quote he notes that search is a very difficult concept for engineers, but that available technology such as Elasticsearch makes search simpler to build. To further spread his love for open source, he started an open source project on how to utilize Elasticsearch’s power.
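The workflow described in the summary (create an index, load data, run a search) maps onto three Elasticsearch REST calls. The Python sketch below only builds the request bodies and never contacts a cluster. It is not the Ruby Tire code from the talk. The index name `candidates`, the field names, and the sample documents are hypothetical, and the exact fields accepted in each body vary by Elasticsearch version.

```python
import json

INDEX = "candidates"  # hypothetical index name, not from the talk


def index_settings(shards=1, replicas=0):
    """Body for PUT /<index>: basic shard and replica settings."""
    return {"settings": {"number_of_shards": shards,
                         "number_of_replicas": replicas}}


def bulk_body(index, docs):
    """Body for POST /_bulk: an action line followed by a source line
    for each document, newline-delimited (NDJSON)."""
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk endpoint expects the payload to end with a newline.
    return "\n".join(lines) + "\n"


def match_query(field, text, size=10):
    """Body for POST /<index>/_search: a full-text match query."""
    return {"size": size, "query": {"match": {field: text}}}


if __name__ == "__main__":
    # Invented sample documents, for illustration only.
    docs = [("1", {"name": "Ada", "skills": "ruby rails"}),
            ("2", {"name": "Lin", "skills": "python search"})]
    print(json.dumps(index_settings(), indent=2))
    print(bulk_body(INDEX, docs), end="")
    print(json.dumps(match_query("skills", "ruby"), indent=2))
```

Sending these bodies with an HTTP client, or generating them through a client library such as Tire, is a separate step that depends on the cluster and the library version in use.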

Whitney Grace, November 22, 2013

Sponsored by ArnoldIT.com, developer of Augmentext

Are Yahoo and PRWeb Confusing SEO and Enterprise Search?

November 21, 2013

I get a Yahoo Alert. My single Alert topic is “enterprise search.” I want a bound phrase match. As with the other alert services I use, there are usually some obvious “false hits.” A “false hit” is an off topic story. The problem with key word alerts is that words have different meanings. A story about the “search” for a new president often turns up next to a story about Oracle’s Secure Enterprise Search system. Most of these “false hits” are easily ignored. Another problem is that some “experts” want a user to see something, so the query is relaxed. That’s a problem for me. For you, maybe not. For spammers, relaxation means more content baloney whether generated by an azure chip consultant, search engine optimization maven, or an organization desperate for visibility. In case you have not noticed, traffic to most Web sites is undergoing quite a change. One Web site owner told me, “We averaged 250,000 uniques a month in 2012. This year we are down to 48,000. What am I going to do?”

Go out of business? Change your Web site? Get a different job?

Perhaps the answer is, “Anything.”

Desperation generates some darned interesting business actions in my experience.

There is another problem, particularly with the word “search.” I am interested in enterprise search, and I want to learn about new, substantive information related to information retrieval. The poor word “search” has been sucked dry of meaning. The wispy husk carries zero meaning. For most people search means Google or taking what an app delivers.

I noticed in my Yahoo Alert this morning these two items listed as the number one and number two most relevant stories for me:

[Image: screenshot of the two Yahoo Alert items]

Both of these are about an outfit that delivers search engine optimization services. The problem is that this sense of the word “search” is of little interest to me.

What is more interesting is that the outfit generating these items for Yahoo is called PRWeb. I don’t know much about PRWeb. My hunch is that one of the PR professionals I have used over the years knows about this firm.

I wanted to capture several thoughts about what I call “alert corruption.”

[Image: Gustave Doré illustration for Dante’s Inferno]

Lost and desperate for relevance. Those in the woods are probably evil. See Canto One of the Divine Comedy.

First, Yahoo is not doing a particularly good job providing me with new information about enterprise search. Today I saw items related to OpenText, an outfit that owns a number of search engines. The story, however, talks about enterprise information management. I do not know what that phrase means. There was a story about Imprezzo, a company that purports to “overcome the problem of traditional text based search.” Well, maybe that is worth a look. Of the five items sent to me, one was possibly of interest. Does a score of 20 percent warrant a pass or a fail?

Second, four of the items in the Yahoo Alert were from the PRWeb outfit. One thing is certain. PRWeb can get its clients’ content into the Yahoo system. The problem is that two of these stories are about practices that feel to me like tight shoes. I suppose the shoes look okay, but I am uncomfortable. SEO outfits and those who assist them make me uncomfortable in the same way. A buck is a buck, but content manipulation is like wearing small shoes that are damp.

Third, after 40 or 50 years of search innovation, endless surveys from outfits like azure chip consultants and morphing vendors like BA Insight, Smartlogic, and LucidWorks, I am not sure if significant information retrieval progress is evident. One would think that Yahoo would tap some super sophisticated new technology to filter out baloney, deliver on point alerts, and work with vendors who exercise some judgment about what passes for search related content.

My hunch is that PR is in a bit of a sticky wicket. It joins content management, governance, search, and Big Data. These disciplines have to find some way to call attention to themselves. Perhaps these “legitimate” disciplines should emulate the search engine optimization crowd. Visibility without a thought about precision and recall is their game.

I would like to receive alerts that actually match the string “enterprise search.” I think that is just too much for those who think that a user absolutely must have a “hit” whether that item is relevant or not.
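The gap between a bound phrase match and a relaxed key word match, along with the one-in-five score mentioned earlier, can be sketched in a few lines of Python. This is an illustration only. The headlines are invented, the function names are hypothetical, and nothing here describes how Yahoo actually filters alerts.

```python
import re


def phrase_match(text, phrase):
    """True only when the words of `phrase` appear adjacent and in order."""
    words = [re.escape(w) for w in phrase.split()]
    pattern = r"\b" + r"\s+".join(words) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def relaxed_match(text, phrase):
    """True when any single word of `phrase` appears anywhere in the text."""
    tokens = set(re.findall(r"\w+", text.lower()))
    return any(w.lower() in tokens for w in phrase.split())


def precision(relevant_hits, total_hits):
    """Fraction of delivered items that were relevant."""
    return relevant_hits / total_hits if total_hits else 0.0


# Invented headlines, for illustration only.
headlines = [
    "Vendor updates enterprise search connector",
    "University begins search for a new president",
    "Agency offers search engine optimization for enterprise clients",
]

strict = [h for h in headlines if phrase_match(h, "enterprise search")]
loose = [h for h in headlines if relaxed_match(h, "enterprise search")]

if __name__ == "__main__":
    print("phrase match:", strict)
    print("relaxed match:", loose)
    print("precision for 1 relevant item out of 5:", precision(1, 5))
```

The relaxed matcher returns every headline, including the SEO item, because each contains at least one of the two words. The phrase matcher returns only the headline in which the two words sit side by side.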

Search and marketing may be a match made in heaven. Those who are interested in precision and recall occupy one of Dante’s less salubrious regions.

Stephen E Arnold, November 21, 2013

Healthcare.gov Blog: Content Gap?

November 20, 2013

Healthcare.gov has a blog. You can find it at this link. There is a link for October posts. There is a link for September posts. I was not able to access the full set of posts for either month. Here’s what I saw:

[Image: screenshot of the Healthcare.gov blog archive page]

I thought the content would be at this link.

Oversight, content management problem, content removal, or my error? Interesting. It is tough to search when content is not available for indexing.

I wanted to read the posts to the blog before and after the launch. No joy. Should I be suspicious?

 

Stephen E Arnold

Users Seek Private Search Options After NSA Revelations

November 20, 2013

This is certainly no surprise. CSO reveals, “People Flock to Anonymizing Services After NSA Snooping Reports.” Writer Grant Gross highlights several anonymous search services that have seen usage soar since certain NSA practices have come to light. DuckDuckGo is on the list, as well as Tor and mobile solution Silent Circle. The brand new Disconnect Search saw over 400,000 searches within four days of its launch. Clearly, many people are beginning to cover their virtual tracks. But is it pointless, after all? The article points out:

“Disconnect Search’s FAQ includes information about possible government searches. ‘The reality is the U.S. government may force us to begin logging the search queries of a particular user or group of users,’ the FAQ said. ‘If served with a court order that includes a non-disclosure provision, we may not be able to tell our users about this change for some period of time, possibly forever. And the U.S. government may also have other methods of monitoring user searches which Disconnect Search cannot prevent.’”

Though we now know several prominent firms quietly complied with NSA demands to fork over their records, at least one service has elected to fold rather than cave. Email provider Lavabit made the tough choice to shut down their decade-old organization rather than comply with. . . something. Owner Ladar Levison’s explanation, which is all that is left of the site, laments that he can’t tell us exactly what was demanded of him, but his frustration and ire are apparent in the strongly worded note. He writes:

“I have been forced to make a difficult decision: to become complicit in crimes against the American people or walk away from nearly ten years of hard work by shutting down Lavabit. After significant soul searching, I have decided to suspend operations. I wish that I could legally share with you the events that led to my decision. I cannot. I feel you deserve to know what’s going on–the first amendment is supposed to guarantee me the freedom to speak out in situations like this. Unfortunately, Congress has passed laws that say otherwise.”

So, there’s that. Not exactly encouraging for fans of privacy. Levison seems to hold at least a sliver of hope for a favorable verdict as Lavabit takes their fight to court. Is even that too optimistic?

Cynthia Murrell, November 20, 2013


ZooKeeper for Search Applications

November 20, 2013

Looking for Google-style tech to speed up your search app? The AppScale Blog presents us with an affordable option in “Emulating Google Megastore Using Open Source Technologies.” The article tells us why the Bigtable model alone falls short and how Apache’s ZooKeeper fills the gap (links in the quote are PDFs):

“The BigTable model is not enough to fully emulate the Google App Engine Datastore API, as it is based on Megastore, which provides the added benefit of transactions on partitioned data. For this AppScale uses ZooKeeper, the open source implementation of Google’s Chubby. ZooKeeper provides a locking API using a variant of the Paxos algorithm.

“To emulate Megastore with open source software, AppScale automatically sets up a datastore for applications to use and provides the mappings from the Google App Engine Datastore API to the Cassandra and ZooKeeper APIs. With both ZooKeeper and Cassandra, whether its a one node, or an eight node deployment, AppScale will create the configuration files, and start the correct processes on each node. Optionally, the AppScalefile (the AppScale configuration file) can dictate the amount of replication the datastore does. This also makes AppScale a great tool to use to automatically set up a Cassandra or ZooKeeper cluster.”

The write-up goes on to address data layout in Cassandra, query types, and ZooKeeper locks. At the bottom are several helpful links for further investigation. Oh, and a brief, unexplained, lukewarm beer review that is apparently part 16 in a series. It is good to have diverse interests.

Cynthia Murrell, November 20, 2013


Thunderstone Thunders In With An Upgrade

November 18, 2013

While this might not be at the top of anyone’s Black Friday shopping list, it is good to know that “Thunderstone Offers Version 9 Of The Thunderstone Search Appliance,” according to PRWeb. Thunderstone is a little known research and development company that prides itself on providing comprehensive, intelligent information retrieval and management solutions. One might recognize their Texis software that provides high-grade text retrieval and publishing.

Thunderstone’s products are used in various fields, including multimedia management, help desk support, automated categorization, litigation support, and Web content searching.

The last field is of the greatest interest to us, because the Thunderstone Search Appliance could push the company into a wider range of clients. The upgrade promises to support all of its sister software with an improved administrative interface, faster searching, query auto complete, content caching, and a walk log for analysis. Those are just the basic upgraded features.

Thunderstone includes the following benefits with their search software:

· “A one-time, perpetual license that saves customers 40-60 percent (or more) compared to Thunderstone’s closest competitor.

· Two years of included maintenance, easily extended for additional years at affordable annual rates.

· Superior technical support from software engineers readily accessible to customers by phone, email and message board.

· No restrictions on indexing third-party websites for user-empowering applications and for competitive intelligence purposes.

· Ability to fully search targeted repositories (file servers, web servers, intranet/portal servers, database servers, application databases, etc.) and to handle files that exceed 30 MB in size.

· An attractive Product Investment Protection Program that makes upgrading a breeze, applying 100 percent of the initial Thunderstone product’s purchase price to any desired upgrade.

· Availability as a virtual appliance image to run under a hypervisor to allow for more efficient hardware utilization and manageability.”

These are not bad options. However, having never worked with Thunderstone or even heard of it before this press release, we have to question its performance capabilities. Does it really do as advertised, or is an extended amount of development needed for implementation?

Whitney Grace, November 18, 2013

