The Noble Quest Behind Semantic Search

November 25, 2016

A brief write-up at the ontotext blog, “The Knowledge Discovery Quest,” presents a noble vision of the search field. Philologist and blogger Teodora Petkova observed that semantic search is the key to bringing together data from different sources and exploring connections. She elaborates:

On a more practical note, semantic search is about efficient enterprise content usage. As one of the biggest losses of knowledge happens due to inefficient management and retrieval of information. The ability to search for meaning not for keywords brings us a step closer to efficient information management.

If semantic search had a separate icon from the one traditional search has it would have been a microscope. Why? Because semantic search is looking at content as if through the magnifying lens of a microscope. The technology helps us explore large amounts of systems and the connections between them. Sharpening our ability to join the dots, semantic search enhances the way we look for clues and compare correlations on our knowledge discovery quest.

At the bottom of the post is a slideshow on this “knowledge discovery quest.” Sure, it also serves to illustrate how ontotext could help, but we can’t blame them for drumming up business through their own blog. We actually appreciate the company’s approach to semantic search, and we’d be curious to see how they manage the intricacies of content conversion and normalization. Founded in 2000, ontotext is based in Bulgaria.

Cynthia Murrell, November 25, 2016
Sponsored by, publisher of the CyberOSINT monograph

SearchBlox 8.5 Now Available

September 28, 2016

A brief write-up at DataQuest, “AI-Based Cognitive Business Reasoning with SearchBlox v8.5,” informs us about the latest release of the enterprise-search, sentiment-analysis, and text-analytics software. The press release describes this edition:

“Version 8.5 features the addition of new connectors including streaming, API and storage data sources bringing the total number of available sources to 75. This new release allows customers to use advanced entity extraction (person, organization, product, title, location, date, time, urls, identifiers, phone, email, money, distance) from 18 different languages within unstructured data streams on a real time basis. Use cases include advanced federated search, fraud or anomaly detection, content recommendations, smart business workflows, customer experience management and ecommerce optimization solutions. SearchBlox can use your existing data to build AI based cognitive learning models for your most complex use cases.”

The write-up describes the three key features of SearchBlox 8.5: The new connectors mentioned above include Magento, YouTube, ServiceNow, MS Exchange, Twilio, Office 365, Quandl, Cassandra, Google BigQuery, Couchbase, HBase, Solr, and Elasticsearch. Their entity extraction tool functions in 18 languages. And users can now leverage the AI to build learning models for specific use cases. The new release also fixes some bugs and implements performance improvements.

Cynthia Murrell, September 28, 2016

Who Will Connect the Internet of Things to Business

June 23, 2016

Remember when Nest Labs had all the hype a few years ago? An article from BGR reminds us how the tides have turned: Even Google views its Nest acquisition as a disappointment. Google purchased Nest Labs for $3.2 billion in 2014. Nest’s newly launched products, a Wi-Fi smoke alarm and thermostat, seemed at the time to position the company for greater and greater success. This article offers a look at the current state:

“Two and a half years later and Nest is reportedly in shambles. Recently, there have been no shortage of reports suggesting that Nest CEO Tony Fadell is something of a tyrannical boss cut from the same cloth as Steve Jobs (at his worst). Additionally, the higher-ups at Google are reportedly disappointed that Nest hasn’t been able to churn out more hardware. Piling it on, Re/Code recently published a report indicating that Nest generated $340 million in revenue last year, a figure that Google found disappointing given how much it spent to acquire the company. And looking ahead, particulars from Google’s initial buyout deal with Nest suggest that the pressure for Nest to ramp up sales will only increase.”

Undoubtedly there are challenges when it comes to expectations about acquired companies’ performance. But when it comes to the nitty-gritty details of the work happening in those acquisitions, aren’t managers supposed to solve problems, not simply agree the problem exists? How the success of “internet of things” companies will pan out seems to be predicated on their inherent interconnectedness, and that seems to apply at the levels of both product and business.


Megan Feil, June 23, 2016


Brown Dog Fetches Buried Data

February 25, 2016

Outdated file formats, particularly those with no metadata, are especially difficult to search and utilize. The National Science Foundation (NSF) reports on a new search engine designed to plumb the unstructured Web in “Brown Dog: A Search Engine for the Other 99 Percent (of Data).” With the help of a $10 million award from the NSF, a team at the National Center for Supercomputing Applications (NCSA), based at the University of Illinois, has developed two complementary services. Writer Aaron Dubrow explains:

“The first service, the Data Access Proxy (DAP), transforms unreadable files into readable ones by linking together a series of computing and translational operations behind the scenes. Similar to an Internet gateway, the configuration of the Data Access Proxy would be entered into a user’s machine settings and then forgotten. From then on, data requests over HTTP would first be examined by the proxy to determine if the native file format is readable on the client device. If not, the DAP would be called in the background to convert the file into the best possible format….

“The second tool, the Data Tilling Service (DTS), lets individuals search collections of data, possibly using an existing file to discover other similar files in the data. Once the machine and browser settings are configured, a search field will be appended to the browser where example files can be dropped in by the user. Doing so triggers the DTS to search the contents of all the files on a given site that are similar to the one provided by the user….  If the DTS encounters a file format it is unable to parse, it will use the Data Access Proxy to make the file accessible.”

See the article for more on these services, which NCSA’s Kenton McHenry likens to a DNS for data. Brown Dog conforms to NSF’s Data Infrastructure Building Blocks program, which supports development work that advances the field of data science.
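
As a rough illustration of the “linking together a series of computing and translational operations” idea quoted above, the sketch below searches for the shortest chain of conversions from a file’s native format to one the client can read. The converter table, the format names, and the function name are all hypothetical; this is not Brown Dog’s actual code or API, only one plausible way such chaining could work.

```python
from collections import deque

# Hypothetical conversion table (an assumption for illustration): each key is
# a source format, each value lists formats one conversion step can produce.
CONVERTERS = {
    "wpd": ["doc"],
    "doc": ["odt", "pdf"],
    "odt": ["pdf", "txt"],
    "pdf": ["txt"],
}


def conversion_chain(native, readable):
    """Return the shortest list of formats leading from `native` to any
    format in `readable`, or None if no chain of converters exists."""
    if native in readable:
        return [native]  # already readable, no conversion needed
    queue = deque([[native]])
    seen = {native}
    while queue:
        path = queue.popleft()
        # Breadth-first search: try every one-step conversion from the
        # last format on the current path.
        for nxt in CONVERTERS.get(path[-1], []):
            if nxt in seen:
                continue
            if nxt in readable:
                return path + [nxt]
            seen.add(nxt)
            queue.append(path + [nxt])
    return None


print(conversion_chain("wpd", {"pdf", "txt"}))
```

A breadth-first search is used here so that the first chain found is also the one with the fewest conversion steps, on the assumption that fewer steps mean less chance of losing information along the way.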


Cynthia Murrell, February 25, 2016


Cybercrime as a Service Drives Cyber Attacks on Uber Accounts and More

January 26, 2016

Several recent articles have shed light on the dynamics at play in the cybercriminal marketplaces of the Dark Web; “How much is your Uber account worth?,” recently published by the Daily Mail, is one example. Summarizing a report from security researchers at Trend Micro for CNBC, the article explains that this new information extends the research previously done in Intel Security’s The Hidden Data Economy report. Beyond describing the value hierarchy, in which Uber and PayPal logins cost more than Social Security numbers and credit cards, the article shares insights on the bigger picture:

“’Like any unregulated, efficient economy, the cybercrime ecosystem has quickly evolved to deliver many tools and services to anyone aspiring to criminal behavior,’ said Raj Samani, chief technology officer for Intel Security EMEA. ‘This “cybercrime-as-a-service” marketplace has been a primary driver for the explosion in the size, frequency, and severity of cyber attacks.

‘The same can be said for the proliferation of business models established to sell stolen data and make cybercrime pay.’”

Moving past the shock value of the going rates, this article draws our attention to the burgeoning business of cybercrime. Just as Google has expanded the online ecosystem by serving as a connector, marketplaces in the Dark Web appear to be carving out a similar position. That has quite the implications when you consider the size of the Dark Web.


Megan Feil, January 26, 2016


On Embedding Valuable Outside Links

July 21, 2015

If media websites take this suggestion from an article at Monday Note, titled “How Linking to Knowledge Could Boost News Media,” there will be no need to search; we’ll just follow the yellow brick links. Writer Frederic Filloux laments the current state of affairs, wherein websites mostly link to internal content, and describes how embedded links could be much, much more valuable. He writes:

“Now picture this: A hypothetical big-issue story about GE’s strategic climate change thinking, published in the Wall Street Journal, the FT, or in The Atlantic, suddenly opens to a vast web of knowledge. The text (along with graphics, videos, etc.) provided by the news media staff, is amplified by access to three books on global warming, two Ted Talks, several databases containing references to places and people mentioned in the story, an academic paper from Knowledge@Wharton, a MOOC from Coursera, a survey from a Scandinavian research institute, a National Geographic documentary, etc. Since (supposedly), all of the above is semanticized and speaks the same lingua franca as the original journalistic content, the process is largely automatized.”

Filloux posits that such a trend would be valuable not only for today’s Web surfers, but also for future historians and researchers. He cites recent work by a couple of French scholars, Fabian Suchanek and Nicoleta Preda, who have been looking into what they call “Semantic Culturonomics,” defined as “a paradigm that uses semantic knowledge bases in order to give meaning to textual corpora such as news and social media.” Web media that keeps this paradigm in mind will wildly surpass newspapers in the role of contemporary historical documentation, because good outside links will greatly enrich the content.

Before this vision becomes reality, though, media websites must be convinced that linking to valuable content outside their site is worth the risk that users will wander away. The write-up insists that a reputation for providing valuable outside links will more than make up for any amount of such drifting visitors. We’ll see whether media sites agree.

Cynthia Murrell, July 21, 2015


Centrifuge Says It Offers More Insights

May 29, 2014

According to a press release from Virtual Strategy, Centrifuge Systems, a company that develops big data software, has created four new data connectors within its visual link analysis software. The release, “Centrifuge Expands Their Big Data Discovery Integration Footprint,” explains that with the additional data connectors, software users will be able to make better business decisions.

“ ‘Without the ability to connect disparate data – the potential for meaningful insight and actionable business decisions is limited,’ says Stan Dushko, Chief Product Officer at Centrifuge Systems. ‘It’s like driving your car with a blindfold on. We all take the same route to the office every day, but wouldn’t it be nice to know that today there was an accident and we had the option to consider an alternate path.’ ”

The new connectors offer real time access to ANX file structure, JSON, LDAP, and Apache Hadoop with Cloudera Impala. Centrifuge’s entire goal is to add more data points that give users a broader and more detailed perspective of their data. Centrifuge likes to think of itself as the business intelligence tool of the future. Other companies, though, offer similar functions with their software. What makes Centrifuge different from the competition?

Whitney Grace, May 29, 2014
Sponsored by, developer of Augmentext

Splunk and Tableau Developed Connector to Analyze Machine-Generated Data

March 27, 2014

The TechWorld article “Tableau Folds Splunk Data Into Business Analysis” shares information on a new connector, developed in partnership by Tableau Software and Splunk, that enables the analysis of machine-generated data. The collaboration allows for a better understanding of product analytics and customer experience, since Splunk’s software collects data on what customers do when they visit a website. The article explains:

“The new driver for Tableau expands the scope of how Splunk data can be used by the enterprise. It imports data captured by Splunk into Tableau’s data processing and visualization environment. As a result, business analysts can merge the event data generated by servers with other sources of data, which would potentially provide new insights into customer behavior or corporate operations…The connector is a ODBC (Open Database Connectivity) driver that is included in the Tableau 8.1.4 maintenance release.”

Splunk’s software was initially used more for finding issues in a system, but with the addition of analysis tools the software’s abilities were broadened. Now, instead of just noting trouble spots on a website, the software is used to discover patterns in customer behavior. The article uses the example of users filling shopping carts on a website but not making purchases. Managers use Splunk’s software to pinpoint the issue that is causing that lack of follow-through. Whether or not the partnership of Tableau and Splunk will pay off remains to be seen.
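
To make the shopping-cart example concrete, here is a minimal sketch of how abandoned carts might be picked out of event data. The event records, field names, and function name are invented for illustration; this does not use or represent the Splunk or Tableau APIs.

```python
from collections import defaultdict

# Hypothetical event records standing in for machine-generated log data.
EVENTS = [
    {"session": "a1", "action": "add_to_cart"},
    {"session": "a1", "action": "purchase"},
    {"session": "b2", "action": "add_to_cart"},
    {"session": "c3", "action": "view"},
    {"session": "b2", "action": "view"},
]


def abandoned_sessions(events):
    """Return sorted session ids that added to a cart but never purchased."""
    actions = defaultdict(set)
    # Group the distinct actions seen in each session.
    for event in events:
        actions[event["session"]].add(event["action"])
    return sorted(
        session
        for session, seen in actions.items()
        if "add_to_cart" in seen and "purchase" not in seen
    )


print(abandoned_sessions(EVENTS))
```

In practice an analyst would then join these session ids against other sources, such as page load times or error logs, to look for whatever is causing the lack of follow-through.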

Chelsea Kerwin, March 27, 2014


SharePoint Business Data Connector Needed

October 18, 2013

Many organizations still see SharePoint as an internal enterprise tool and have yet to take advantage of any opportunity for external data integration. No doubt external integration is trickier and few organizations are willing to take risks. So, many are turning to the Layer2 Business Data List Connector to seamlessly integrate external data streams into an existing SharePoint infrastructure. OpenPR covers the product in their story, “Layer2 Business Data List Connector for SharePoint V5 Released To Close Gaps With External Data Integration.”

The article begins:

“Layer2 has announced version 5 of the SharePoint Business Data List Connector (BDLC) that connects almost any external corporate data source with native SharePoint lists and closes many gaps that still exist with SharePoint data integration.”

Add-ons are all too common when it comes to SharePoint deployments. Many gaps exist, just like the external data integration gap mentioned above. Stephen E. Arnold, of Arnold IT, is a longtime expert in search and a frequent critic of SharePoint. In a recent article, Arnold highlights that SharePoint is missing the mark on its critical functions, including search. Microsoft would do well to listen, but until a major redesign takes place, users will continue to rely on add-ons.

Emily Rae Aldridge, October 18, 2013

MaxxCAT Offers SQL Connector

August 30, 2013

Specialized hardware vendor MaxxCAT offers a SQL connector, allowing their appliances to directly access SQL databases. We read about that tool, named BobCAT, at the company’s Search Connect page. We would like to note that the company’s web site has made it easier to locate their expanding range of appliances for search and storage.

Naturally, BobCAT can be configured for use with Microsoft SQL Server, Oracle, and MySQL, among other ODBC databases. The connector’s integration with MaxxCAT’s appliances makes it easier to establish crawls and customize output using tools like JSON, HTML, and SQL. The write-up emphasizes:

“The results returned from the BobCAT connector can be integrated into web pages, applications, or other systems that use the search appliance as a compute server performing the specialized function of high performance search across large data sets.

“In addition to indexing raw data, The BobCAT connector provides the capability for raw integrators to index business intelligence and back office systems from disparate applications, and can grant the enterprise user a single portal of access to data coming from customer management, ERP or proprietary systems.”
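
For readers unfamiliar with the general pattern of pulling database rows into a form that web pages or a search index can consume, a minimal sketch follows. It uses Python’s built-in sqlite3 module and an invented in-memory table purely as a stand-in; the function name and sample data are assumptions, and nothing here reflects BobCAT’s actual interface.

```python
import json
import sqlite3


def rows_as_json(conn, query):
    """Run `query` and return the result rows as a JSON array of objects."""
    cursor = conn.execute(query)
    # cursor.description holds one tuple per column; the name comes first.
    columns = [col[0] for col in cursor.description]
    return json.dumps([dict(zip(columns, row)) for row in cursor.fetchall()])


# Hypothetical in-memory table standing in for a back-office database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")]
)
print(rows_as_json(conn, "SELECT id, name FROM customers ORDER BY id"))
```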

MaxxCAT does not stop with its SQL connector. Their Lynx Connector facilitates connection to their enterprise search appliances by developers, integrators, and connector foundries. The same Search Connect page explains:

“The connector consists of two components, the input bytestream and a subset of the MaxxCAT API that controls the processing of collections and the appliance.

“There are many applications of the Lynx Connector, including building plugins and connector modules that connect MaxxCAT to external software systems, document formats and proprietary cloud or application infrastructure. Users of the Lynx Connector have a straightforward path to take advantage of MaxxCAT’s specialized and high performance retrieval engine in building solutions.”

Developers interested in building around the Lynx framework are asked to email the company for more information, including a line on development hardware and support resources. MaxxCAT was founded in 2007 to capitalize on the high-performance, specialized hardware corner of the enterprise search market. The company manages to offer competitive pricing without sacrificing its focus on performance, simplicity, and ease of integration. We continue to applaud MaxxCAT’s recently launched program for nonprofits.

Cynthia Murrell, August 30, 2013

