Ontotext Rank

December 5, 2018

Ontotext, a text processing vendor, has posted a demonstration of its ranking technology. You can find the demos at this link. The graphic below was generated by the system on December 3, 2018, at 9:00 am US Eastern time. I specified the industry as information technology and the sub industry as search. Here’s what the system displayed:

image

A few observations:

  1. I specified 25 companies. The system displayed 10. I assume someone from the company will send me an email explaining that the filters I applied did not have sufficient data to generate the desired result. Perhaps those data should be displayed?
  2. Neither Google Search nor Microsoft Bing appeared. Google, a search vendor, has been in the news in the countries I have visited recently.
  3. RightNow appeared. The company is (I thought) a unit of Oracle.
  4. Publishers Clearing House sells magazine subscriptions. PCH does not offer information retrieval in the sense that I understand the bound phrase.

Net net: I am not sure about the size of the data set or what the categories mean.

You need to decide for yourself whether to use this service or Google Trends or a similar “popularity” or “sentiment” analysis system.

Stephen E Arnold, December 5, 2018

Digital Reasoning: From Intelligence Centric Text Retrieval to Wealth Management

November 12, 2018

Vendors of text processing systems have had to find new ways to generate revenue. The early days of entity extraction and social graphs provided customers from the US government and specialized companies like Booz, Allen & Hamilton.

Today, different economic realities have forced change.

The capitalist tool published “Digital Reasoning Brings AI To Wealth Management.” The write up does little to put Digital Reasoning in context. The company was founded in 2000. The firm accepted outside financing which now amounts to about $100 million. The firm became cozy with IBM, labored in the vineyards of the star crossed Distributed Common Ground System, and then faced a fire storm of competition from companies big and small. The reason? Entity extraction and link analysis became commodities. The fancy math also migrated into a wide range of applications.

New buzzwords appeared and gained currency. These ranged from artificial intelligence (who knows what that phrase means?) to real time data analytics (Yeah, what is “real time”?).

Digital Reasoning’s response is interesting. The company, like Attivio and Coveo, has nosed into customer support. But the intriguing play is that the Digital Reasoning system, which was text centric, is now packaging its system to help wealth management firms.

Is this text based?

Sure is. I learned:

For advisors, Digital Reasoning helps them prioritize which customers to focus on, which can be useful when an adviser may have 200 or more clients. At the management level, Digital Reasoning can show if the firm has specific advisors getting a lot of complaints so it can respond with training and intervention. At a strategic level, it can sift through communications and identify if customers are looking for a specific offering or type of product.

Interesting approach.

The challenge, of course, will be to differentiate Digital Reasoning’s system from those available from dozens of vendors.

Digital Reasoning has investors who want a return on their $100 million. After 18 years, time may be compressing as solutions once perceived as sophisticated become more widely available and subject to price pressure.

Rumors of Amazon’s interest in this “wealth management” sector have reached us in Harrod’s Creek. That might be another reason why the low profile Digital Reasoning is stirring the PR waters using the capitalist’s tool, Forbes Magazine, once a source of “real” news.

Stephen E Arnold, November 12, 2018

Picking and Poking Palantir Technologies: A New Blood Sport?

April 25, 2018

My reaction to “Palantir Has Figured Out How to Make Money by Using Algorithms to Ascribe Guilt to People, Now They’re Looking for New Customers” is a sigh and a groan.

I don’t work for Palantir Technologies, although I have been a consultant to one of its major competitors. I do lecture about next generation information systems at law enforcement and intelligence centric conferences in the US and elsewhere. I also wrote a book called “CyberOSINT: Next Generation Information Access.” That study has spawned a number of “experts” who are recycling some of my views and research. A couple of government agencies have shortened my word “cyberosint” to “cyint.” In a manner of speaking, I have an information base which can be used to put the actions of companies which offer services similar to those available from Palantir in perspective.

The article in Boing Boing falls into the category of “yikes” analysis. Suddenly, it seems, people have discovered that cook book mathematical procedures can be used to make sense of a wide range of data. Let me assure you that this is not a new development, and Palantir is definitely not the first of the companies developing applications for law enforcement and intelligence professionals to land customers in financial and law firms.

A Palantir bubble gum card shows details about a person of interest and links to underlying data from which the key facts have been selected. Note that this is from an older version of Palantir Gotham. Source: Google Images, 2015

Decades ago, a friend of mine (Ev Brenner, now deceased) was one of the pioneers using technology and cook book math to make sense of oil and gas exploration data. How long ago? Think 50 years.

The focus of “Palantir Has Figured Out…” is that:

Palantir seems to be the kind of company that is always willing to sell magic beans to anyone who puts out an RFP for them. They have promised that with enough surveillance and enough secret, unaccountable parsing of surveillance data, they can find “bad guys” and stop them before they even commit a bad action.

Okay, that sounds good in the context of the article, but Palantir is just one vendor responding to the need for next generation information access tools from many commercial sectors.

CyberOSINT: Next Generation Information Access Explains the Tech Behind the Facebook, GSR, Cambridge Analytica Matter

April 5, 2018

In 2015, I published CyberOSINT: Next Generation Information Access. This is a quick reminder that the profiles of the vendors who have created software systems and tools for law enforcement and intelligence professionals remain timely.

The 200 page book provides examples, screenshots, and explanations of the tools which are available to analyze social media information. The book is the most comprehensive run down of the open source, commercial, and cloud based systems which can make sense of social media data, lawful intercept data, and general text and imagery content.

Companies described in this collection of “tools” include:

  • Cyveillance (now LookingGlass)
  • Decisive Analytics
  • IBM i2 (Analysts Notebook)
  • Geofeedia
  • Leidos
  • Palantir Gotham
  • and more than a dozen other vendors of commercial and open source, high impact cyberOSINT tools.

The book is available for $49. Additional information is available on my Xenky.com Web site. You can buy the PDF book online at gum.co/cyberosint.

Get the CyberOSINT monograph. It’s the standard reference for practical and effective analysis, text analytics, and next generation solutions.

Stephen E Arnold, April 5, 2018

Insight into the Value of Big Data and Human Conversation

April 5, 2018

Big data and AI have been tackling tons of written material for years. But actual spoken human conversation has been largely overlooked in this world, mostly due to the difficulty of collecting this information. However, that is on the cusp of changing, as we discovered from a white paper from the Business and Local Government Resource Center, “The SENSEI Project: Making Sense of Human Conversations.”

According to the paper:

“In the SENSEI project we plan to go beyond keyword search and sentence-based analysis of conversations. We adapt lightweight and large coverage linguistic models of semantic and discourse resources to learn a layered model of conversations. SENSEI addresses the issue of multi-dimensional textual, spoken and metadata descriptors in terms of semantic, para-semantic and discourse structures.”

While some people are excited about the potential for advancement this kind of big data research presents, others are a little more nervous; for example, one or two of the 87 million individuals whose Facebook data found its way into the capable hands of GSR and Cambridge Analytica.

In fact, there is a growing movement, according to the Guardian, to scale back big data intrusion. What makes this interesting is that advocates are demanding that companies that harvest our information for big data purposes give some of that money back to the people from whom the info originates, not unlike how songwriters are given royalties every time their music is used for film or television. Putting a financial stipulation on big data collection could cause SENSEI to tap its brake pedal. Maybe?

Patrick Roland, April 5, 2018

Can Factmata Do What Other Text Analytics Firms Cannot?

April 2, 2018

Consider it a sign of the times—Information Management reveals, “Twitter, Craigslist Co-Founders Back Fact-Check Startup Factmata.” Writer Jeremy Kahn reports:

“Twitter Inc. co-founder Biz Stone and Craigslist Inc. co-founder Craig Newmark are investing in London-based fact-checking startup Factmata, the company said Thursday. … Factmata aims to use artificial intelligence to help social media companies, publishers and advertising networks weed out fake news, propaganda and clickbait. The company says its technology can also help detect online bullying and hate speech.”

Particularly amid concerns about the influence of Russian-backed propaganda in the U.S. and the U.K., several tech firms and other organizations have taken aim at false information online. What about Factmata has piqued the interest of leading investors? We’re informed:

“Dhruv Ghulati, Factmata’s chief executive officer, said the startup’s approach to fact-checking differs from other companies. While some companies are looking at a wide range of content, Factmata is initially focused exclusively on news. Many automated fact-checking approaches rely primarily on metadata – the information behind the scenes that describe online news items and other posts. But Factmata is using natural language processing to assess the actual words, including the logic being used, whether assertions are backed up by facts and whether those facts are attributed to reputable sources.”

Ghulati goes on to predict Facebook will be supplanted as users’ number one news source within the next decade. Apparently, we can look forward to the launch of Factmata’s own news service sometime “later this year.”

We will wait. We do want to point out that based on the information available to the Beyond Search and DarkCyber research teams, no vendor has been able to identify text which is weaponized at a high level of accuracy without the assistance of expensive, human, and vacation hungry subject matter experts.

Maybe Factmata will “mata”?

Cynthia Murrell, April 2, 2018

What Happens When Intelligence Centric Companies Serve the Commercial and Political Sectors?

March 18, 2018

Here’s a partial answer:

image

And

image

Plus

image

Years ago, certain types of companies with specific LE and intel capabilities maintained low profiles and, in general, focused on sales to government entities.

How times have changed!

In the DarkCyber video news program for March 27, 2018, I report on the Madison Avenue type marketing campaigns. These will create more opportunities for a Cambridge Analytica “activity.”

Net net: Sometimes discretion is useful.

Stephen E Arnold, March 18, 2018

Searching Video and Audio Files is Now Easier Than Ever

February 7, 2018

While text-based search has been honed to near perfection in recent years, video and audio search still lags. However, a few companies are really beginning to chip away at this problem. One that recently caught our attention was VidDistill, a company that distills YouTube videos into an indexed list.

According to their website:

vidDistill first gets the video and captions from YouTube based off of the URL the user enters. The caption text is annotated with the time in the video the text corresponds to. If manually provided captions are available, vidDistill uses those captions. If manually provided captions are not available, vidDistill tries to fall back on automatically generated captions. If no captioning of any sort is available, then vidDistill will not work.

Once vidDistill has the punctuated text, it uses a text summarization algorithm to identify the most important sentences of the entire transcript of the video. The text summarization algorithm compresses the text as much as the user specifies.
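The flow the company describes can be sketched in a few lines of Python. This is a minimal illustration, not vidDistill's code: the caption format (a list of start time and sentence pairs), the function names, the stop word list, and the word frequency scoring are all assumptions made for the example. The company does not say which summarization algorithm it uses, and the punctuation step it mentions is assumed to have already happened.

```python
import re
from collections import Counter

# Assumed caption format: a list of (start_seconds, sentence) pairs.
# The stop word list is a tiny illustrative one, not a real lexicon.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
             "that", "this", "for", "on", "with", "as", "are", "was", "be"}


def choose_captions(manual, automatic):
    """Prefer manual captions, fall back to automatic ones, else fail."""
    if manual:
        return manual
    if automatic:
        return automatic
    raise ValueError("no captions of any sort are available")


def tokenize(text):
    """Lowercase the text and drop stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower())
            if w not in STOPWORDS]


def summarize(captions, keep_ratio):
    """Keep the highest scoring sentences and return them in video order.

    A sentence's score is the average corpus frequency of its words,
    a simple stand-in for whatever vidDistill actually does.
    keep_ratio plays the role of the user-specified compression.
    """
    freq = Counter(w for _, text in captions for w in tokenize(text))

    def score(text):
        words = tokenize(text)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    keep = max(1, round(len(captions) * keep_ratio))
    ranked = sorted(range(len(captions)),
                    key=lambda i: score(captions[i][1]), reverse=True)
    return [captions[i] for i in sorted(ranked[:keep])]
```

Calling `summarize(choose_captions(manual, automatic), 0.5)` keeps roughly half of the sentences, each still paired with its start time, which is what makes an indexed list possible.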

The service was interesting and did what the company claimed. However, we wish users could search for words and have them brought up in the index so they could skip directly to specific parts of a video. This has been done in audio, quite well. A service called Happy Scribe, which is aimed at journalists transcribing audio notes, takes an audio file and (for a small fee) transcribes it to text, which can then be searched. It’s pretty elegant and fairly accurate, depending on the audio quality. We could see VidDistill using this approach to great success.
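The word search we have in mind is not exotic. Here is a hedged Python sketch of a keyword-to-timestamp index built over timed captions. The caption format (start time and text pairs) and the function names are assumptions for illustration; neither vidDistill nor Happy Scribe has said it works this way.

```python
import re
from collections import defaultdict


def build_index(captions):
    """Map each lowercase word to the sorted start times where it occurs.

    captions is assumed to be a list of (start_seconds, text) pairs.
    """
    index = defaultdict(list)
    for start, text in captions:
        for word in set(re.findall(r"[a-z0-9']+", text.lower())):
            index[word].append(start)
    return {word: sorted(times) for word, times in index.items()}


def jump_points(index, query):
    """Return start times of caption segments containing every query word."""
    words = re.findall(r"[a-z0-9']+", query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return sorted(hits)
```

A player could then seek to each returned start time, so a viewer who types a word lands on the segments where it is spoken.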

Patrick Roland, February 7, 2018

AI Predictions for 2018

October 11, 2017

AI just keeps gaining steam, and is positioned to be extremely influential in the year to come. KnowStartup describes “10 Artificial Intelligence (AI) Technologies that Will Rule 2018.” Writer Biplab Ghosh introduces the list:

Artificial Intelligence is changing the way we think of technology. It is radically changing the various aspects of our daily life. Companies are now significantly making investments in AI to boost their future businesses. According to a Narrative Science report, just 38% of the companies surveyed used artificial intelligence in 2016—but by 2018, this percentage will increase to 62%. Another study performed by Forrester Research predicted an increase of 300% in investment in AI this year (2017), compared to last year. IDC estimated that the AI market will grow from $8 billion in 2016 to more than $47 billion in 2020. ‘Artificial Intelligence’ today includes a variety of technologies and tools, some time-tested, others relatively new.

We are not surprised that the top three entries are natural language generation, speech recognition, and machine learning platforms, in that order. Next are virtual agents (aka “chatbots” or “bots”), then decision management systems, AI-optimized hardware, deep learning platforms, robotic process automation, text analytics & natural language processing, and biometrics. See the write-up for details on each of these topics, including some top vendors in each space.
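To put the IDC estimate quoted above in perspective, the implied compound annual growth rate can be worked out directly. The short Python sketch below uses only the two figures from the quote ($8 billion in 2016, $47 billion in 2020); the function name is ours, and since IDC said “more than $47 billion,” the result is a floor rather than a forecast.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate implied by two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the KnowStartup quote, in billions of US dollars.
rate = cagr(8.0, 47.0, 2020 - 2016)
print(f"Implied annual growth: {rate:.1%}")
```

The answer comes out to about 56 percent a year, sustained for four years.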

Cynthia Murrell, October 11, 2017

New Beyond Search Overflight Report: The Bitext Conversational Chatbot Service

September 25, 2017

Stephen E Arnold and the team at Arnold Information Technology analyzed Bitext’s Conversational Chatbot Service. The BCBS taps Bitext’s proprietary Deep Linguistic Analysis Platform to provide greater accuracy for chatbots regardless of platform.

Arnold said:

The BCBS augments chatbot platforms from Amazon, Facebook, Google, Microsoft, and IBM, among others. The system uses specific DLAP operations to understand conversational queries. Syntactic functions, semantic roles, and knowledge graph tags increase the accuracy of chatbot intent and slotting operations.

One unique engineering feature of the BCBS is that specific Bitext content processing functions can be activated to meet specific chatbot applications and use cases. DLAP supports more than 50 languages. A BCBS licensee can activate additional language support as needed. A chatbot may be designed to handle English language queries, but Spanish, Italian, and other languages can be activated via an instruction.

Dr. Antonio Valderrabanos said:

People want devices that understand what they say and intend. BCBS (Bitext Chatbot Service) allows smart software to take the intended action. BCBS allows a chatbot to understand context and leverage deep learning, machine intelligence, and other technologies to turbo-charge chatbot platforms.

Based on ArnoldIT’s test of the BCBS, tagging accuracy jumped by as much as 70 percent. Another surprising finding was that the time required to perform content tagging decreased.

Paul Korzeniowski, a member of the ArnoldIT study team, observed:

The Bitext system handles a number of difficult content processing issues easily. Specifically, the BCBS can identify negation regardless of the structure of the user’s query. The system can understand double intent; that is, a statement which contains two or more intents. BCBS is one of the most effective content processing systems to deal correctly with variability in human statements, instructions, and queries.

Bitext’s BCBS and DLAP solutions deliver higher accuracy, enable more reliable sentiment analyses, and even output critical actor-action-outcome content processing. Such data are invaluable for disambiguation in Web and enterprise search applications, content processing for discovery solutions used in fraud detection and law enforcement, and consumer-facing mobile applications.

Because Bitext was one of the first platform solution providers, the firm was able to identify market trends and create its unique BCBS service for major chatbot platforms. The company focuses solely on solving problems common to companies relying on machine learning and, as a result, has done a better job delivering such functionality than other firms have.

A copy of the 22 page Beyond Search Overflight analysis is available directly from Bitext at this link on the Bitext site.

Once again, Bitext has broken through the barriers that block multi-language text analysis. The company’s Deep Linguistic Analysis Platform supports more than 50 languages at a lexical level and more than 20 at a syntactic level and makes the company’s technology available for a wide range of applications in Big Data, Artificial Intelligence, social media analysis, text analytics, and the new wave of products designed for voice interfaces supporting multiple languages, such as chatbots. Bitext’s breakthrough technology solves many complex language problems and integrates machine learning engines with linguistic features. Bitext’s Deep Linguistic Analysis Platform allows seamless integration with commercial, off-the-shelf content processing and text analytics systems. The innovative Bitext system reduces costs for processing multilingual text for government agencies and commercial enterprises worldwide. The company has offices in Madrid, Spain, and San Francisco, California. For more information, visit www.bitext.com.

Kenny Toth, September 25, 2017
