Silobreaker Takes Gold and Silver in Online Decathlon

July 4, 2015

Short honk: I have been a fan of the Silobreaker system, which is available for commercial and governmental content processing. I read Network Products Guide’s “New Products and Service: Winners 10th Annual 2015 IT Awards” recommended solutions league table this morning. Silobreaker was founded by a couple of wizards with military and commercial experience. According to the league table, the Silobreaker content processing and information access system is the top dog for applications centering on Europe, the Middle East and Asia. This means that the system’s multi-lingual capabilities were the best, according to the Network Products Guide’s editors. The company also nailed a silver medal for US focused solutions. You can get more information about Silobreaker at www.silobreaker.com. Sign up. Join the thousands of users who want to work with a winner.

Stephen E Arnold, July 4, 2015

Google: A Week with Low Power Fireworks

July 3, 2015

I don’t pay too much attention to Google. I find the endless squabbling over the notion that Google is a monopoly, a bad guy, the new Microsoft, yada yada, boring. I did notice this morning (July 3, 2015) three unrelated news items mentioning Google. Let me point you to the sources of this information or maybe disinformation and invite you to determine if Google is in the midst of a pre-Fourth of July fireworks test.

Item 1: “Former Google Worker Pleads Guilty to Raping Woman Inside his East Village Home.” My reaction is surprise because I thought Googlers remained at work. Is this Raping Woman story a caboose to  “Prostitute Pleads Guilty in Google Executive Heroin Overdose Death on Yacht”?

Item 2:  “Google’s Niantic Labs Sorry Over Death Camps In Smartphone Game” makes clear that Google allegedly included the concentration camps Dachau and Sachsenhausen in a game. According to the item, a Googler was apologetic.

Item 3: “Google Apologizes after Photo Software Tags Black People as ‘Gorillas’.” If true, Google’s algorithms display “drift” when performing metatagging functions.

I don’t know if these items are accurate or false. It is interesting to note that each can be a trigger for questions about Google’s management and Google’s algorithms. Perhaps the European Commission will perceive these items as germane to its investigation of Google’s business practices. Tess is heartbroken. She has deactivated the alert function on her Apple Watch until the news about Google is less disturbing for her. Calm down, Tess.

Stephen E Arnold, July 3, 2015

NewsBot: Autonomy Kinjin from 14 Years Ago Reinvented

July 3, 2015

Short honk: If you want to run a query as you browse, check out NewsBot from Lateral. The system displays “related articles” when you hit a hot key. The system reminded me of Autonomy’s kinjin service from a decade ago. You will need to be into the Chrome browser to use the service. Be sure to turn off your camera and microphone. More information is at https://getnewsbot.com/. Question: Does the user see the relevant articles? Question: Who defines relevance?

Stephen E Arnold, July 3, 2015

Need Semantic Search: Lucidworks Asserts It Is the Answer by Golly

July 3, 2015

If you read this blog, you know that I comment on semantic technology every month or so. In June I pointed to an article which had been tweeted as “new stuff.” Wrong. Navigate to “Semantic Search Hoohah: Hakia”; you will learn that Hakia is a quiet outfit. Quiet as in no longer on the Web. Maybe gone?

There are other write ups in my free and for fee columns about semantic search. The theme has been consistent. My view is that semantic technology is one component in a modern cybernized system. (To learn about my use of the term cyber, navigate to www.xenky.com/cyberosint.)

I find the promotion of search engine optimization as “semantic” amusing. I find the search service firms’ promotion of their semantic expertise amusing. I find the notion of open source outfits deep in hock to venture capitalists asserting their semantic wizardry amusing.

I don’t know if you are quite as amused as I am. Here’s an easy way to determine your semantic humor score. Navigate to this slideshare link and cruise through the 34 slide presentation made by one of Lucidworks’ search mavens. Lucidworks is a company I have followed since it fired up its jets with Marc Krellenstein on board. Dr. Krellenstein ejected in short order, and the company has consumed many venture dollars with management shifts, repositionings, and the Big Data thing.

We now have Lucidworks in the semantic search sector.

Here’s what I learned from the deck:

  1. The company has a new logo. I think this is the third or fourth.
  2. Search is about technology and language. Without Google’s predictive and personalized routines, words are indeed necessary.
  3. Buzzwords and jargon do not make semantic methods simple. Consider this statement from the deck, “Tokenization plus vector mathematics (TF/IDF) or one of its cousins—“bag of words” – Algorithmic tweaks – enhanced bag of words.” Got that, gentle reader. If not, check out “sausagization.”
  4. Lucidworks offers a “field cache.” Okay, I am not unfamiliar with caching in order to goose performance, which can be an issue with some open source search systems. But Searchdaimon, an open source search system developed in Norway, runs circles around Lucidworks. My team did the benchmark test of major open source systems. Searchdaimon was the speed champ and had other sector leading characteristics as well.
  5. Lucidworks does the ontology thing as well. The tie up of “category nodes” and “evidence nodes” may be one reason the performance goblin noses into the story.
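For readers who want the jargon in point 3 unpacked, here is a minimal Python sketch of tokenization plus TF/IDF weighting over a bag of words. It is an illustration only, not Lucidworks’ implementation: the function names (`tokenize`, `tf_idf`, `cosine`) and the three sample documents are made up for this example, and the whitespace tokenizer is an assumed simplification.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed simplification: lowercase and split on whitespace.
    return text.lower().split()


def tf_idf(docs):
    """Return one {term: weight} dict per document (bag of words, TF * IDF)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample documents, invented for illustration.
docs = [
    "semantic search engine",
    "open source search engine",
    "goose pond water",
]
vectors = tf_idf(docs)
print(cosine(vectors[0], vectors[1]))  # documents sharing terms
print(cosine(vectors[0], vectors[2]))  # documents sharing no terms
```

Nothing here understands meaning. The sketch counts words, discounts the common ones, and compares the resulting vectors.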

The problem I encountered is that the write up for the slide deck emphasized Fusion as a key component. I have been poking around the “fusion” notion as we put our new study of the Dark Web together. Fusion is a tricky problem and the US government has made fusion a priority. Keep in mind that content is more than text. There are images, videos, geocodes, cryptic tweets in Farsi, and quite a few challenging issues with making content available to a researcher or analyst.

It seems that Lucidworks has cracked a problem which continues to trouble some reasonably sophisticated folks in the content analysis business. Here’s the “evidence” that Lucidworks can do what others cannot:

[Image: diagram from the Lucidworks slide deck]

This diagram shows that after a connector is available, then “pipelines proliferate.” Well, okay.

I thought the goal was to process content objects with low latency, easily, and with semantic value adds. “Lots of stages” and “index pipelines: one way query pipelines: round trip” does not compute for this addled goose.

If the Lucidworks approach makes sense to you, go for it. My team and I will stick to here and now tools and open source technology which works without the semantic jargon which is pretty much incidental to the matter. We need to process more than text. CyberOSINT vendors deliver and most use open source search as a utility function. Yep, utility. Not the main event. The failure of semantic search vendors suggests that the buzzword is not the solution to marketing woes. Pop. (That’s a pre-Fourth of July celebratory ladyfinger.)

Stephen E Arnold, July 3, 2015

Open Source Boundaries

July 3, 2015

Now here is an interesting metaphor to explain how open source is sustainable. On OpenSource.com, Bryan Behrenshausen posted the article “Making Collaboration Sustainable,” which references the famous scene from Tom Sawyer in which the title character is forced by his Aunt Polly to whitewash a fence. He does not want to do it, but he persuades his friends that whitewashing is fun and has them pay him for the privilege.

Jim Whitehurst refers to it as the “Tom Sawyer” model, where organizations treat communities as gullible chumps who will work without proper compensation.  It is a type of crowdsourcing, where the organizations benefit from the communities’ resources to further their own goals.  Whitehurst continues that this is not a sustainable approach to crowdsourcing.  It could even backfire at some point.

He goes on to say that open source requires a different mindset, one in which contributors are committed and everyone is treated as an equal and respected for their efforts.

“Treating internal and external communities as equals, really listening to and understanding their shared goals, and locating ways to genuinely enhance those goals—that’s the key to successfully open sourcing a project. Crowdsourcing takes what it can; it turns people and their ideas into a resource. Open sourcing reciprocates where it can; it channels people and their ideas into a productive community.”

The entire goal of open source is to work with a community that coalesces around shared beliefs and passions. Behrenshausen concludes that an organization might find itself totally changed by engaging with an open source community, and the change could be for the better. Is that a good thing or a bad thing? It is, however, concerning for enterprise search solutions.

Whitney Grace, July 3, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

 

Dassault Systemes’ “Single View of the Truth” Problem-Solving Approach

July 3, 2015

The article on Today’s Medical Developments titled “Collaborative Design Software” uses the online collaborative design video game Minecraft to consider the possibilities for programmers working together in the future. Dassault Systemes is in the process of shifting its many design engineers to working more collaboratively off a master file. The article quotes Monica Menghini, a Dassault executive:

“Our platform of 12 software applications covers 3D modeling (SOLIDWORKS, CATIA, GEOVIA, BIOVIA); simulation (3DVIA, DELMIA, SIMULA); social and collaboration (3DSWYM, 3DXCITE, ENOVIA); and information intelligence (EXALEAD, NETVIBES)… These apps together create the experience. No single point solution can do it – it requires a platform capable of connecting the dots. And that platform includes cloud access and social apps, design, engineering, simulation, manufacturing, optimization, support, marketing, sales and distribution, communication…PLM – all aspects of a business; all aspects of a customer’s experience.”

The point is that Dassault wants to sell customers a dozen products to solve a problem, which seems like an interesting and complicated approach. The company believes new opportunities could include more efficient design-building and earlier chances for materials specialists to cut costs by opting for lighter materials. Marketing could begin earlier in the process, and financial planners would have the ability to follow the progress of a design, allowing for more transparency at every level of production.

Chelsea Kerwin, July 3, 2015


Understanding Smart Software: Two Useful Write Ups

July 2, 2015

I noted two articles about neural networks. The first purports to explain “Neural Networks in Plain English.” The article is helpful and includes a useful hyperlink to information about genetic algorithms. The explanation requires the reader to have a basic grasp of mathematics and programming. Compared to other explanations of neural networks, this approach is pretty good.

The second document is “Neural Networks, Manifolds, and Topology.” It covers methods not directly addressed in the Plain English document; for example, some information about manifolds and a reference to k-nearest neighbor. In my lectures about the weaknesses of some text analytics systems, I discuss k-nearest neighbor, which is a versatile and interesting method.
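For readers who have not bumped into k-nearest neighbor, a minimal Python sketch follows. It is not taken from either article: the function name `knn_classify`, the choice of Euclidean distance, and the sample points are assumptions made for illustration. The idea is simply to let the k closest labeled examples vote on the label of a new point.

```python
import math
from collections import Counter


def knn_classify(query, labeled_points, k=3):
    """Label `query` by majority vote among its k nearest labeled points.

    `labeled_points` is a list of (coordinates, label) pairs. Euclidean
    distance is an assumed choice; other distance measures are common.
    """
    by_distance = sorted(
        labeled_points,
        key=lambda item: math.dist(query, item[0]),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-cluster data, invented for illustration.
points = [
    ((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((1.0, 0.0), "a"),
    ((5.0, 5.0), "b"), ((5.0, 6.0), "b"), ((6.0, 5.0), "b"),
]
print(knn_classify((0.5, 0.5), points))
print(knn_classify((5.5, 5.5), points))
```

The method has no training step, which is part of its appeal. Its output depends heavily on the distance measure and on k, which is one reason it shows up in discussions of where text analytics systems can go wrong.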

Both write ups warrant some attention. Keep in mind that the notions of “plain English” and “simple” are likely to carry a freight of meaning which can be surprising to some readers.

Stephen E Arnold, July 2, 2015

How Does a Xoogler Address Search?

July 2, 2015

There are two ways to answer this question.

At Verizon AOL, the approach is to use Bing and the Microsoft ad platform. See “AOL Takes Over Majority of Microsoft’s Ad Business, Swaps Google Search For Bing.” You may have to pay with something other than Greek coded euros to view this article.

At Yahoo, the approach may be to use Google search results, not Microsoft Bing’s. Will Yahoo embrace the GOOG? According to “Yahoo Search Testing Google Search Results: Search PandaMonium”, this may be happening.

The write up states:

I am uncertain to what degree they [sic; the author seems to be referring to Yahoo] are testing search results from Google, but on some web browsers I am seeing Yahoo! organics and ads powered by Bing & in other browsers I am seeing Yahoo! organics and ads powered by Google. Here are a couple screenshots.

Will the change have an impact on the relevance of Yahoo search results? Jury is out.

Stephen E Arnold, July 2, 2015

More about Meta-Analysis

July 2, 2015

I read “Another Five Things to Know about Meta-Analysis.” The write up is a follow on to an earlier article called “5 Key Things to Know about Meta-Analysis.” The use of “5” in one headline and “five” in another is not my typing.

The points are interesting. I want to highlight three.

First, the article points out that meta-analyses are not more reliable than a single study. The notion that averaging five studies’ data can yield wonky results reminds me of comments made to me by my teachers. Basics are good.
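To make the averaging point concrete, here is a small Python sketch. The five effect sizes and standard errors are invented, and the function names are mine, not the article’s. It compares a naive average with inverse-variance weighting, a standard fixed-effect way to pool studies in which precise studies count for more than noisy ones.

```python
import math


def naive_mean(effects):
    """Plain average: every study counts the same, however noisy."""
    return sum(effects) / len(effects)


def fixed_effect_pool(effects, std_errors):
    """Inverse-variance weighted estimate and its standard error."""
    weights = [1.0 / (se ** 2) for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical data: four precise studies and one small, noisy outlier.
effects = [0.1, 0.1, 0.1, 0.1, 0.9]
std_errors = [0.05, 0.05, 0.05, 0.05, 0.5]

print(naive_mean(effects))
print(fixed_effect_pool(effects, std_errors))
```

With these made-up numbers, the single noisy study drags the naive average well away from the other four, while the weighted estimate barely moves. Neither number fixes stale or missing studies, which are the article’s other two points.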

Second, studies go out of date. Therefore, the analysis may not be timely. Make decisions on stale data and excitement can ensue. Another useful point.

Third, analyses which are not inclusive of germane studies can be misleading.

The write up is a reminder that old fashioned research, data analysis, and research planning are needed to make fancy math work as expected. My view is that most of the whippersnappers just want answers. Work is not sitting alone and figuring out what to do with what set of data and how best to move forward to get a result that is sort of useful.

Who has time for that?

Stephen E Arnold, July 2, 2015

Software Market Begs for Integration Issue Relief

July 2, 2015

A recent report proves what many users already know: integrating an existing CMS with new and emerging software solutions is difficult. As quickly as software emerges and changes, users are finding that hulking overgrown CMS solutions are lagging behind in terms of agility. SharePoint is no stranger to this criticism. Business Solutions offers more details in their article, “ISVs: Study Shows Microsoft SharePoint Is Open To Disruption.”

“A report from Software Advice surveyed employees that use content management systems (CMS) on a daily basis and found 48 percent had considerable problems integrating their CMS with their other software solutions. The findings mirror a recent AIIM report that found only 11 percent of companies experienced successful Microsoft SharePoint implementation . . . The results of this report indicate that the CMS market is ripe for disruption if a software vendor could solve the integration issues typically associated with SharePoint.”

No doubt, Microsoft understands the concerns and perceived threats, and will attempt to solve some of the issues with the upcoming release of SharePoint Server 2016. However, the fact remains that SharePoint is a big ship to turn, and change will not be dramatic or happen overnight. In the meantime, stay on top of the latest news for tips, tricks, and third-party solutions that may ease some of the pain. Look to Stephen E. Arnold and his SharePoint feed on ArnoldIT.com in order to stay in touch without a huge investment in time.

Emily Rae Aldridge, July 2, 2015

