
Attensity’s Semantic Annotation Tool “Understands” Emoticons

April 27, 2015


The PCWorld article “For Attensity’s BI Parsing Tool, Emoticons Are No Problem” describes recent attempts to fine-tune tools that monitor and relay online conversations about a particular organization or enterprise. The amount of data to be waded through is massive, and it is littered with non-traditional grammar, language and symbols. Attensity is one company working on the problem; Luminoso, with its Compass tool, is another. The article says,

“Attensity’s Semantic Annotation natural-language processing tool… Rather than relying on traditional keyword-based approaches to assessing sentiment and deriving meaning… takes a more flexible natural-language approach. By combining and analyzing the linguistic structure of words and the relationship between a sentence’s subject, action and object, it’s designed to decipher and surface the sentiment and themes underlying many kinds of common language—even when there are variations in grammatical or linguistic expression, emoticons, synonyms and polysemies.”

The article does not explain exactly how Attensity’s product works, only that it can somehow “understand” emoticons. That seems an odd term, though; it most likely refers to looking each emoticon up in a list rather than actually being able to “read” it. At any rate, Attensity promises that its tool will save hundreds of human work hours.
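To make the list-lookup guess concrete, here is a minimal sketch of what such a process might look like. It is our assumption only, not Attensity’s method: the lexicon, the scores, and the function name `emoticon_sentiment` are all hypothetical.

```python
import re

# Hypothetical emoticon lexicon; the scores are illustrative, not from any vendor.
EMOTICON_SCORES = {
    ":)": 1, ":-)": 1, ":D": 2, ";)": 1,
    ":(": -1, ":-(": -1, ":'(": -2, ">:(": -2,
}

# Try longer emoticons first so ">:(" is matched before the shorter ":(".
_PATTERN = re.compile(
    "|".join(re.escape(e) for e in sorted(EMOTICON_SCORES, key=len, reverse=True))
)

def emoticon_sentiment(text):
    """Return the emoticons found in text and their summed score."""
    found = _PATTERN.findall(text)
    return found, sum(EMOTICON_SCORES[e] for e in found)

if __name__ == "__main__":
    print(emoticon_sentiment("Great service :) but slow shipping >:("))
```

A lookup like this only counts symbols; it knows nothing about who or what the emoticon refers to, which is presumably where the linguistic analysis described in the quote would come in.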

Chelsea Kerwin, April 27, 2015

Sponsored by, publisher of the CyberOSINT monograph


Four Visualization Tools to Choose From

February 12, 2015

MakeUseOf offers us a list of graphic-making options in its “4 Data Visualization Tools for Captivating Data Journalism.” Writer Brad Jones describes four options, ranging from the quick and easy to more complex solutions. The first entry, Tableau Public, may be the best place for new users to start. The write-up tells us:

“Data visualization can be a very complex process, and as such the programs and tools used to achieve good results can be similarly complex. Tableau Public, at first glance, is not — it’s a very accommodating, intuitive piece of software to start using. Simply import your data as a text file, an Excel spreadsheet or an Access database, and you’re up and running.

“You can create a chart simply by dragging and dropping various dimensions and measures into your workspace. Figuring out exactly how to produce the sort of visualizations you’re looking for might take some experimentation, but there’s no great challenge in creating simple charts and graphs.

“That said, if you’re looking to go further, Tableau Public can cater to you. It’ll take some time on your part to really understand the breadth of what’s on offer, but it’s a matter of learning a skill rather than the program itself being difficult to use.”

The next entry is Google Fusion Tables, which helpfully links to other Google services and automates much of its process. The strengths of Infoactive are its ability to combine datasets and its wealth of options for creating cohesive longer content. Rounding out the list is R, which Jones warns is “obtuse and far from user friendly”; by his account, making the most of its capabilities even requires a working knowledge of JavaScript and of R’s own language. However, he says there is simply nothing better for producing exactly what one needs.

Cynthia Murrell, February 12, 2015

Sponsored by, developer of Augmentext

Fujitsu Creates its Own Hadoop Tool

January 19, 2015

Fujitsu has joined many other companies by taking Hadoop and creating its own software from it to leverage big data. IT Web Open Source’s article, “Fujitsu Makes It Easy For Customers To Reap The Benefits Of Big Data With PRIMEFLEX For Hadoop” divulges the details about the new software.

The new Hadoop application is part of Fujitsu’s PRIMEFLEX line of workload-specific integrated systems. Its purpose is similar to that of many other big data products on the market: harness big data and make use of actionable analytics. Fujitsu describes it as a wonder software:

“Fujitsu has developed PRIMEFLEX for Hadoop to simplify and tame big data. The powerful, dedicated all-in-one hardware cluster is designed to integrate with existing hardware infrastructures, introducing distributed parallel processing based on Cloudera Enterprise Hadoop. This is an open-source software framework which gathers, processes and analyses data from various sources, then puts together and presents the big picture on how to act on the information gathered.”

Fujitsu is a recognized and respected brand, but the big data market is saturated with other companies that offer comparable software. Other companies also started with a Hadoop-based application as part of their software line-up. Fujitsu is entering the Hadoop analytics market a little late.

Whitney Grace, January 19, 2015

Organizing Content is a Manual or Automated Pain

January 16, 2015

Organizing uploaded content is a pain in the rear. In order to catalog the content, users either have to add tags manually or use an automated system that requires several tedious fields to be filled out. CMS Wire explains the difficulties with document organization in “Stop Pulling Teeth: A Better Way To Classify Documents.” Manual tagging is the longer of the two processes, and if no one has created a set of tagging standards, tags will rain down from the cloud in a content mess. Automated fields are not that bad to work with if you have one or two documents to upload, but if you have a lot of files, you are more prone to enter the wrong information just to finish the job.

Apparently there is a happy medium:

“Encourage users to work with documents the way they normally do and use a third party tool such as an auto classification tool to extract text based content, products, subjects and terms out of the document. This will create good, standardized metadata to use for search refinement. It can even be used to flag sensitive information or report content detected with code names, personally identifiable information such as credit card numbers, social security numbers or phone numbers.”

While the suggestion is sound, we thought that auto-classification tools were normally built into collaborative content platforms like SharePoint. Apparently not. Third-party software to improve enterprise platforms once more saves the day for the digital paper pusher.
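The flagging of sensitive information mentioned in the quote can be sketched with a few pattern checks. This is a rough illustration under our own assumptions, not how any particular auto-classification product works: the patterns cover only US-style formats, and the names `PATTERNS`, `luhn_ok`, and `flag_sensitive` are hypothetical.

```python
import re

# Illustrative US-style patterns only; a real classifier would need far more.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def luhn_ok(number):
    """Luhn checksum, used here to discard random 16-digit runs."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def flag_sensitive(text):
    """Return a dict mapping each label to the matches found in text."""
    flags = {}
    for label, pattern in PATTERNS.items():
        hits = pattern.findall(text)
        if label == "credit_card":
            hits = [h for h in hits if luhn_ok(h)]
        if hits:
            flags[label] = hits
    return flags

if __name__ == "__main__":
    sample = "Call 555-867-5309 or bill card 4111 1111 1111 1111. SSN 078-05-1120."
    print(flag_sensitive(sample))
```

The checksum step shows why this kind of tool is more than a keyword search: a 16-digit number that fails the check is not reported as a card number.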

Whitney Grace, January 16, 2015

Roundup of Personalization Software for Search Improvement

January 8, 2015

The article titled 15 Website Personalization and Recommendation Software Tools on Smart Insights contains a roundup of personalization software. Think of an online store where groups of customers see vastly different suggestions, all based on what they have bought or looked at in the past and on what other people who bought or looked at similar items also considered. In the last few years, though, personalization software has become even more tailored to specific pursuits. The article names the winning brands in one category, B2B and publisher personalization tools:

“Evergage is mentioned as tool that fits best in this category. WP Greet Box is a personalisation plug-in used by WordPress blogging users, including me once, to deliver a welcome message to first time users depending on their referrers. It’s amazing this approach isn’t used more on commercial sites. WP Marketing Suite is another WordPress plugin that has been featured in the comments.”

The article also explores the best in the category of commerce management systems, stating that “both Sitecore and Kentico have built in tools to personalize content based on various rules, such as geo-location, search terms…” This is in addition to the more widely understood personalization based on user behavior. The idea behind all of these products is to improve search for consumers.

Chelsea Kerwin, January 08, 2015


Temis Attends University

December 30, 2014

Despite budget cuts in academic research with print materials, higher education is clamoring for more digital content. You do not need Google Translate to understand that means more revenue for companies in that industry. Virtual Strategy writes that someone wants in on the money: “With Luxid Content Enrichment Platform, Automates The Extraction Of Bibliographic References And The Linking To Corresponding Article.”

Temis is an industry leader in semantic content enrichment solutions for the enterprise, and it has signed a license and service agreement with a publishing portal for the social sciences and humanities that provides students with access to the usual research fare.

Taking note of the changes in academic research, the portal wants to upgrade its digital records for a more seamless user experience:

“To make its collection easier to navigate, and ahead of the introduction of an additional 20.000 books which will consolidate its role of reference SSH portal, decided to enhance the interconnectedness of SSH publications with semantic enrichment. Indeed, the body of SSH articles often features embedded bibliographic references that don’t include actual links to the target document. therefore chose to exploit the Luxid® Content Enrichment Platform, driven by a customized annotator (Skill Cartridge®), to automatically identify, extract, and normalize these bibliographic references and to link articles to the documents they refer to.”

A round of applause for the portal for realizing that making research easier will encourage more students to use its services. If only other academic databases would take ease of use into consideration and upgrade their UI dashboards.

Whitney Grace, December 30, 2014

WikiSummarizer Summarizes Wikipedia Articles in Visual Knowledge Map

December 26, 2014

WikiSummarizer is an interesting tool that presents summaries of Wikipedia articles, which may be particularly useful for students and consultants. Rather than reading the full text of a Wikipedia article (which is, yes, already a condensed text), you can now search for a summarized article to get the headlines of a given subject. The FAQs for WikiSummarizer explain,

“WikiSummarizer automatically summarizes the Wikipedia articles. The program identifies the most important keywords and ranks them by relevancy. For each keyword the most significant sentences in the original text are presented to the reader. You instantly get the headlines with the most important sentences and keywords. The blending of visualization with summarization, knowledge browsing, mind mapping provides you with a wide range of means to explore relevant content. At a glance, without much reading, you immediately spot the key information chunks.”

Perhaps someday soon, we will be able to read nothing at all and know… the “chunks.” For example, when you search the keyword Hamlet (the play), what WikiSummarizer decides to promote as the most relevant information is when Shakespeare wrote it and what the story was based on. This is followed by several blurbs summarizing the play itself and then a brief description of the critical reception among Romantics, providing what reads as a Sparknote of a Sparknote. WikiSummarizer offers visual summary maps, visual trees, and word clouds connected to the Wikipedia knowledge base.
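For the curious, the keyword-then-sentence approach described in the FAQ can be roughly approximated in a few lines. This is a sketch under our own assumptions, not WikiSummarizer’s actual algorithm: raw word frequency stands in for its relevancy ranking, and the stopword list and the function name `summarize` are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {
    "the", "a", "an", "of", "and", "to", "in", "is", "was",
    "it", "by", "on", "for", "as", "that", "with",
}

def _tokens(sentence):
    """Lowercased words in a sentence, minus stopwords."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]

def summarize(text, top_n=3):
    """Map each of the top_n most frequent keywords to the sentence using it most."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    counts = Counter(w for s in sentences for w in _tokens(s))
    summary = {}
    for keyword, _ in counts.most_common(top_n):
        # Frequency within the sentence is a crude stand-in for "most significant".
        summary[keyword] = max(sentences, key=lambda s: _tokens(s).count(keyword))
    return summary

if __name__ == "__main__":
    text = "Hamlet is a tragedy by Shakespeare. Hamlet seeks revenge. The play is long."
    print(summarize(text, top_n=1))
```

Even this crude version produces the “chunks” effect: one keyword, one sentence, and no need to read the rest.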

Chelsea Kerwin, December 26, 2014


New Version of Sail Labs Indexer

December 11, 2014

We’ve learned that Sail Labs has put out the next iteration of its Media Mining Indexer from the company’s post, “Sail Labs Announces Availability of Release Version 2014-2 and Media Mining Indexer 6.3.” The refreshingly straightforward press release offers bulleted lists of new features and major changes to be found throughout the new version. For the indexer, it lists:

    • Support for sentiment analysis, i.e. classification of text segments into positive, negative, neutral or mixed sentiment
    • Currently supported languages: US and International English, German and Russian
    • Support for continuous intermittent result output, without final XML result, which increases performance in cases where collective results are not required.
    • Support for licensing using a central license manager/server (LiMa), which is intended for use with cloud based use cases.
    • Script-based building of language models using lmtscript.

For those not already familiar with Media Mining Indexer, it processes speech from multiple sources into XML, which can then be uploaded into a range of digital-asset-management systems for subsequent search and retrieval. The software boasts automatic speech recognition, speaker ID, speaker change detection, story detection, and topic classification.

Sail Labs specializes in high-end software for speech and multimedia analysis for vertical markets. Its name derives from “Speech Artificial Intelligence Language Laboratories.” Sail Labs is located in Vienna, Austria, and was founded in 1999.

Cynthia Murrell, December 11, 2014


Secret Hiring Tool for Silicon Valley Types

December 8, 2014

It is harder than ever for young graduates and seasoned workers alike to find a job. Yet according to the FitFrnd blog, Silicon Valley is having trouble finding good employees. The post “Silicon Valley’s Best-Kept Secret: How AngelList Is Slowly Disrupting The Hiring Industry” explains that AngelList is proving more reliable for finding talent than “old-fashioned” job search engines.

AngelList is primarily a crowdfunding Web site used by startups to raise money for new endeavors. It is, however, also proving to be a new resource for finding a job or locating someone to fill a position, where other career Web sites fail to attract the right talent. The post explains how FitFrnd had trouble finding a blogger/content marketer:

“We finally decided to give AngelList a serious try. We had tried it before, but our efforts had been half-hearted. This time we improved our copy, added information such as why the company is such an amazing place to work (it is!), details about salary and equity ranges, and even screenshots of the app. Within a few days, we have received about 80 resumes, including some really compelling candidates.”

What makes AngelList different is that it allows applicants to apply privately and to know the salary up front. It also cuts out the middleman. The information is searchable, but you have to join AngelList first. Joining is free for now; it might eventually cost something, but you would be paying for a service that works, at least for the moment.

Whitney Grace, December 08, 2014

Ontotext Expands into North America with Strategic Hires

December 3, 2014

The article titled Semantic Technology Provider Ontotext Announces Strategic Hires for Ontotext USA on PRWeb discusses the expansion of Ontotext in North America. Tony Agresta, Brad Bogle and Tom Endyke have joined Ontotext as Senior VP of Worldwide Sales, Director of Marketing and Director of Solutions Architecture, respectively. Ontotext, the semantic search and text-mining leader, has laid out several main focuses for the near future, including the growth of worldwide marketing efforts and the development of relationships. The article quotes Tony Agresta on Ontotext’s product development,

“Our flagship product, GraphDB™ (formerly OWLIM) has been deployed across the globe and is widely known as a highly scalable enterprise RDF triplestore… But what makes Ontotext truly unique are three other essential elements: 1) a full complement of semantic enrichment, integration, curation and authoring tools that extend our platform approach, 2) a large critical mass of semantic engineers, professional services and support teams that represent the most experienced professionals in the world and 3) S4, the Self Service Semantic Suite.”

Ontotext has provided semantic solutions for such organizations as the BBC, AstraZeneca, John Wiley & Sons, and The British Museum. Its recent expansion efforts in North America are an attempt to reach more semantic technology users on the continent.

Chelsea Kerwin, December 03, 2014

