Enterprise Search: Pool Party and Philosophy 101

September 8, 2016

I noted this catchphrase: “An enterprise without a semantic layer is like a country without a map.” I immediately thought of this statement made by Polish-American scientist and philosopher Alfred Korzybski:

The map is not the territory.

When I think about enterprise search, I am thrilled to have an opportunity to do the type of thinking demanded in my college class in philosophy and logic. Great fun. I am confident that any procurement team will be invigorated by an animated discussion about representations of reality.

I did a bit of digging and located “Introducing a Graph-based Semantic Layer in Enterprises” as the source of the “country without a map” statement.

What is interesting about the article is that the payload appears at the end of the write up. The magic of information representation as a way to make enterprise search finally work is technology from a company called Pool Party.

Pool Party describes itself this way:

Pool Party is a semantic technology platform developed, owned and licensed by the Semantic Web Company. The company is also involved in international R&D projects, which continuously impact the product development. The EU-based company has been a pioneer in the Semantic Web for over a decade.

From my reading of the article and the company’s marketing collateral it strikes me that this is a 12 year old semantic software and consulting company.

The idea is that there is a pool of structured and unstructured information. The company performs content processing and offers such features as:

  • Taxonomy editor and maintenance
  • A controlled vocabulary management component
  • An audit trail to see who changed what and when
  • Link analysis
  • User role management
  • Workflows.

The write up with the catchphrase provides an informational foundation for the company’s semantic approach to enterprise search and retrieval; for example, the company’s four layered architecture:

image

The base is the content layer. There is a metadata layer which in Harrod’s Creek is called “indexing”. There is the “semantic layer”. At the top is the interface layer. The “semantic” layer seems to be the secret sauce in the recipe for information access. The phrase used to describe the value added content processing is “semantic knowledge graphs.” These, according to the article:

let you find out unknown linkages or even non-obvious patterns to give you new insights into your data.
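To make the "unknown linkages" claim concrete, here is a minimal sketch of what a graph lookup does. It is not Pool Party's implementation: the triples, the entity names, and the `build_graph` and `find_link` helpers are all hypothetical, and a production system would run SPARQL against a graph database rather than a Python dictionary.

```python
from collections import defaultdict, deque

# Hypothetical (subject, predicate, object) triples, invented for illustration.
TRIPLES = [
    ("Report-17", "mentions", "Acme Corp"),
    ("Acme Corp", "supplies", "Widget Ltd"),
    ("Widget Ltd", "owned_by", "Holding AG"),
    ("Memo-4", "mentions", "Holding AG"),
]


def build_graph(triples):
    """Index triples as an undirected adjacency map (predicates are ignored)."""
    graph = defaultdict(set)
    for subject, _predicate, obj in triples:
        graph[subject].add(obj)
        graph[obj].add(subject)
    return graph


def find_link(graph, start, goal):
    """Breadth-first search: return the shortest chain of nodes, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(graph[node]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


if __name__ == "__main__":
    # Two documents that share no keyword turn out to be connected.
    print(find_link(build_graph(TRIPLES), "Report-17", "Memo-4"))
```

The two documents share no term, so a keyword index would never pair them. The chain through the intermediate entities is the "non-obvious" linkage, and it is only as good as the humans or extractors who built the triples.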

The system performs entity extraction, supports custom ontologies (a concept designed to make subject matter experts quiver), text analysis, and “graph search.”

Graph search is, according to the company’s Web site:

Semantic search at the highest level: Pool Party Graph Search Server combines the power of graph databases and SPARQL engines with features of ‘traditional’ search engines. Document search and visual analytics: Benefit from additional insights through interactive visualizations of reports and search results derived from your data lake by executing sophisticated SPARQL queries.

To make this more clear, the company offers a number of videos via YouTube.

The idea reminded us of the approach taken in BAE NetReveal and Palantir Gotham products.

Pool Party emphasizes, as does Palantir, that humans play an important role in the system. Instead of “augmented intelligence,” the article describes the approach as methods which “combine machine learning and human intelligence.”

The company’s annual growth rate is more than 20 percent. The firm has customers in more than 20 countries. Customers include Pearson, Credit Suisse, the European Commission, Springer Nature, Wolters Kluwer, and the World Bank and “many other customers.” The firm’s projected “Euro R&D project volume” is 17 million (although I am not sure what this 17,000,000 number means). The company’s partners include Accenture, Complexible, Digirati, and EPAM, among others.

I noted that the company uses the catchphrase: “Semantic Web Company” and the catchphrase “Linking data to knowledge.”

The catchphrases, I assume, make it easier for some to understand the firm’s graph based semantic approach. I am still mired in figuring out that the map is not the territory.

Stephen E Arnold, September 8, 2016

Hewlett Packard: About Face

September 7, 2016

I read “Exclusive: HP Enterprise in Talks to Sell Software Unit to Thoma Bravo – Sources.” Who does not love a news story labeled “exclusive” and attributed to “sources” when the subject is Hewlett Packard Enterprise? The thrust of the story is that HPE, fresh from making marketing noises about its enterprise software business, is allegedly selling those software businesses.

Let’s assume that this is indeed accurate. The asking price is in the neighborhood of $8 to $10 billion or more if the excited buyer really wants this collection of software.

Why is HPE selling what it has been working hard to craft into a sustainable revenue stream with healthy profit margins? The write up reports:

HPE’s software unit generated $3.6 billion in net revenue in 2015, down from $3.9 billion in 2014. The company has said revenue growth in its software unit has been challenged by a market shift toward cloud subscription offerings.

Yep, these numbers will drive potential buyers into a frenzy.

The word in Harrod’s Creek, Kentucky, is that HPE is eager to find a way to make money, boost the company’s value to shareholders, and plug into the fluffy cloud opportunities. HPE’s present software may not be the answer for HPE. Another outfit should be able to release a flood of revenue.

One of the goslings (un-named, of course) thought that HPE was going cold turkey to kick its Autonomy habit. The shadow of the search business makes life chilly for the would be technology leader. In an “exclusive” comment to Beyond Search, HPE anticipates victory in its legal flap associated with the purchase of Autonomy for a modest $10 or $11 billion.

We don’t know if our un-named gosling is on the right track, but if HPE sells Autonomy and other assorted gems from its software vault, the difference between what HPE paid for Autonomy and the amount generated by the sale of Autonomy is only a couple billion dollars.

What’s a few billion dollars for a focused, consistent, well managed outfit like HPE? A pittance I say.

I wonder, “Does the buyer of HPE’s Autonomy-infused bundle recognize the excitement selling search and retrieval will engender?” Sure. These are savvy folks. Generating revenue from proprietary search and content processing software is really easy.

If Google can do it, anyone can, right? Oh, Google closed its enterprise search product. Well, what about Palantir? Oh, Palantir relies on open source for findability functions. How about IBM? Oh, shucks, IBM relies on Lucene with home brew code and acquired technology.

As I said, search is easy.

Stephen E Arnold, September 7, 2016

A Blurred In-Q-Tel X-Ray: Real Journalists Uncover Old News

September 4, 2016

I noted this write up by the Rupert Murdoch outfit, the Wall Street Journal: “The CIA’s Venture-Capital Firm, Like Its Sponsor, Operates in the Shadows.” You may have to buy a dead tree version of the Wall Street Journal, go to your public library, or subscribe to read the source itself. (Don’t hassle me if the link begs for dollars. Buzz Mr. Murdoch and express your views.)

The point of the article is that the US government’s intelligence outfit operates a venture capital firm. That investment entity does business as In-Q-Tel. The goal, in my opinion, is to identify promising technologies which may have application at the Central Intelligence Agency. Please, note that much of the work at the CIA is not public. That’s because it mostly operates in secrecy. The fact that a government has secret activities is not exactly news.

Furthermore, whom do you think advises the Central Intelligence Agency and its various units? Choose from the following list:

  1. Immigrants without US entry authorizations
  2. Felons recently released from prison to a half way house
  3. Individuals working for governments antithetical to the posture of the United States
  4. Investigative journalists looking for a gig
  5. Individuals with clearances or a track record of serving the US.

Okay, you picked one to three. You may qualify for work at a large, “real news” outfit. If you selected item four, you now understand why the news about the individuals and the companies exposed to In-Q-Tel is stale.

Obviously those in the spy game want folks who are in the same fox hole.

The write up reveals this stunning factoid: In-Q-Tel provides only limited information about its investments, and some of its trustees have ties to funded companies.

No kidding.

With considerable assiduity, the write up lists the companies in which In-Q-Tel has invested and notes:

Of about 325 investments In-Q-Tel says it has made since its founding, more than 100 weren’t announced, although the identities of some of those companies have leaked out. The absence of disclosure can be due to national-security concerns or simply because a startup company doesn’t want its financial ties to intelligence publicized, people familiar with the arrangements said. While moneymaking isn’t In-Q-Tel’s goal, when that happens, such as when a startup it funded goes public, In-Q-Tel can keep the profit and roll it into new projects. It doesn’t obtain rights to technology or inventions.

There you go. Why not let another nation’s intelligence services invest in high potential but little known innovators? The US government is trying to bring more business discipline to some of its activities. Therefore, is it not logical that an intelligence agency seeking high value products and services can use the proceeds from its investments to further the work of the intelligence agency?

I guess that’s a thought foreign to some real journalists.

What does one expect the CIA and In-Q-Tel to do? Publish a daily newspaper detailing the companies, people, and technologies the CIA is interested in? What about going on Fox News and explaining what’s hot and what’s not in advanced technology? Oh, right. Technology is not as much fun as pundits who over talk one another.

I know that an outfit owned by Rupert Murdoch is in the news business. I know that gathering information from the In-Q-Tel Web site is really difficult. For me, information about In-Q-Tel is a bit of a yawner.

I would much rather read about some of the management methods used in some major media entities. Government efforts to identify cutting edge technologies are just not that interesting to me. Where’s the beef? Why not consider why certain categories of investments have not yielded products and services which can be used across missions? Why not explore why Purple Yogi was a dead end and why Palantir is not? Oh, right. That’s harder than realizing that in certain types of work one wants to deal with individuals from that fox hole.

Stephen E Arnold, September 4, 2016

Business Intelligence: Four Generalized Hurdles

August 30, 2016

Business intelligence, like government intelligence, may be an oxymoron. Nevertheless, doing “intelligence” is a big business. That’s why Palantir Technologies is hoping lawyers can crack open the US Army’s coin purse.

I read “4 Huge Challenges Facing CIOs and IT Leaders.” I quite like the use of “chief information officer” and “information technology leaders” in the headline. CIOs seem to be struggling to meet their budgets, deal with security issues, and attend conferences. The notional “information technology leader” is busy reading reports from mid tier consulting firms, dealing with the all-too-frequent emergencies, and removing malware from senior executives’ computing devices.

The write up identifies four “challenges” these busy professionals must convert to opportunities in their spare time. What are these “challenges”? Here’s my translation of MBA speak into Harrod’s Creek, Kentucky lingo:

  1. Executives have to write checks and push aside bureaucratic baloney so that business intelligence can move forward. If the top dog doesn’t care, well, you can always check out Facebook and read Reddit.
  2. Get something done when you said you would complete the task. Good luck with that. Meetings, approvals, crashes [see the comment above about information technology professionals’ time allocation], and software that simply doesn’t work are enemies of finishing a job. I assume that the people performing business intelligence know what they are doing most of the time when they are not sure what the objective of the project is.
  3. Normalizing, vetting, and processing data. Yikes, this challenge has been in the fast lanes of the information superhighway for more than 50 years. Hey, that XML is just great, isn’t it?
  4. Getting users to use the business intelligence outputs. If the users don’t understand the outputs, don’t trust the outputs, or prefer their own methods, well, update that link graph thing on Microsoft LinkedIn.

When one steps back from this list of challenges, the issues are not new. The more chaotic the business environment is perceived to be, the less likely converting these opportunities into a career win may be.

Even when a system does deliver useful outputs like Palantir Gotham, getting acceptance is a very difficult challenge. A person without the resources of Palantir might find the conversion of these challenges a bit of a challenge in itself.

May I suggest that the solution is to start small, demonstrate value, and move forward? How popular is that approach? Not very.

Stephen E Arnold, August 30, 2016

Truth or Fiction: US Army Cannot Count Money

August 24, 2016

I believe everything I read on the Internet. When the information comes from a real journalism type outfit, I am no Doubting Thomas. I wish to point out that the write up “US Army Fudged Its Accounts by Trillions of Dollars, Auditor Finds” strikes me as fiction. Just to keep the math straight, here’s a summary of numbers:

  • 1,000 is one thousand
  • 10,000 is ten one thousands
  • 100,000 is ten ten thousands
  • Let’s jump up a bit.
  • One million is 1,000,000
  • A billion is 1,000,000,000
  • A trillion is 1,000,000,000,000.

In Zimbabwe there was a $10 trillion dollar bill. So misplacing a bill is easy to do:

image

My recollection from my days at Booz, Allen is that most humanoids have difficulty with quantities over 1,000. Imagine what happens when one has to think about trillions or a one followed by 12 zeros.

image

If the write up is on the money, the US Army is composed of individuals who cannot deal with big numbers or money. I learned:

The Defense Department’s Inspector General, in a June report, said the Army made $2.8 trillion in wrongful adjustments to accounting entries in one quarter alone in 2015, and $6.5 trillion for the year. Yet the Army lacked receipts and invoices to support those numbers or simply made them up.

How can a US federal entity make up numbers? The Department of Defense is into Windows and Excel. The US Army has a fancy data aggregation and analysis system called Distributed Common Ground or DCGS-A.

The write up stated:

The report affirms a 2013 Reuters series revealing how the Defense Department falsified accounting on a large scale as it scrambled to close its books. As a result, there has been no way to know how the Defense Department – far and away the biggest chunk of Congress’ annual budget – spends the public’s money. The new report focused on the Army’s General Fund, the bigger of its two main accounts, with assets of $282.6 billion in 2015. The Army lost or didn’t keep required data, and much of the data it had was inaccurate, the IG said.
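For those who want to check the scale, here is a back-of-the-envelope sketch using only the figures quoted above. The variable names are mine, and the ratio compares accounting adjustments with reported assets, so it indicates magnitude, not dollars spent or lost.

```python
# Figures as quoted from the source article; names are illustrative only.
BILLION = 10**9
TRILLION = 10**12

adjustments_2015 = 6.5 * TRILLION        # wrongful adjustments for the year
general_fund_assets = 282.6 * BILLION    # Army General Fund assets in 2015

# How many times over do the adjustments cover the fund's assets?
ratio = adjustments_2015 / general_fund_assets

if __name__ == "__main__":
    print(f"A trillion has {len(str(TRILLION)) - 1} zeros")
    print(f"Adjustments are about {ratio:.0f} times the General Fund's assets")
```

By this arithmetic the adjustments come to roughly 23 times the reported assets of the account being adjusted.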

I was surprised an auditor was able to assemble the needed information. I highlighted this statement from the source article:

The IG report also blamed DFAS [Defense Finance and Accounting Service], saying it too made unjustified changes to numbers. For example, two DFAS computer systems showed different values of supplies for missiles and ammunition, the report noted – but rather than solving the disparity, DFAS personnel inserted a false “correction” to make the numbers match. DFAS also could not make accurate year-end Army financial statements because more than 16,000 financial data files had vanished from its computer system. Faulty computer programming and employees’ inability to detect the flaw were at fault, the IG said.

Trillions. Hmmm. Why not put DCGS-A on the forensic team? If that system does not work, why not let Palantir Gotham have a go at figuring out where the money went? Another option is IBM i2 Analyst’s Notebook, right?

Yes, government integrity. There’s a Web site for that too: https://www.oge.gov/.

Did you know that?

Stephen E Arnold, August 24, 2016

Ami Albert Joined Bertin Technologies a Year Ago and Is Alive

August 22, 2016

I learned about the Ami search system called Albert a decade ago. My notes indicated that at that time the company was Swiss but had strong ties to France. Not surprisingly, when Ami’s market momentum dictated a sale, a French company stepped forward and bought Ami and its happy face identity:

image

Bertin Technologies has integrated Ami Albert into its market intelligence suite. Search appears to be a utility function. The company says that it is “a publisher and integrator of cutting edge software solutions.” The company offers cyber security, digital intelligence, and speech processing.

According to the deal description on the Bertin Web site:

The ability to offer Market Intelligence and Risk Intelligence sees the creation of a key player in Web Content Mining, whose international outlook is supported by an industrial group with a presence in 15 countries.

Ami, a search vendor, morphed into a market intelligence company. When the deal was announced in mid 2015, Ami had 150 clients. The company operated via two subsidiaries in the UK and Morocco. The unique value of Ami comes from Bertin’s capabilities.

In 2006, Ami counted LexisNexis, Sinequa, Lingway, and itself via the Go Albert unit as “partners.”

The company’s search interface looked like this before Ami pivoted to content scraping and “market intelligence.”

ami interface

Search results looked like this:

image

Ami emphasized that it could perform metasearch functions; that is, take a user’s query and send it to different systems with individual search interfaces. Here’s how Ami presented this idea to prospective customers:

image
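The metasearch idea described above can be sketched in a few lines. This is an illustration of the general technique, not Ami's code: the two backends are stubs I invented, the document identifiers are hypothetical, and the `metasearch` function simply keeps the best score per document.

```python
def news_backend(query):
    """Stub standing in for one remote search system (hypothetical results)."""
    return [("doc-a", 0.9), ("doc-b", 0.4)]


def patents_backend(query):
    """Stub standing in for a second remote search system (hypothetical results)."""
    return [("doc-b", 0.7), ("doc-c", 0.5)]


def metasearch(query, backends):
    """Send one query to every backend and merge the hits.

    backends maps a source name to a callable returning (doc_id, score) pairs.
    Duplicates are collapsed, keeping the highest score and the source that gave it.
    """
    best = {}
    for name, search in backends.items():
        for doc_id, score in search(query):
            if doc_id not in best or score > best[doc_id][0]:
                best[doc_id] = (score, name)
    merged = [(doc_id, score, name) for doc_id, (score, name) in best.items()]
    # Highest score first; ties broken by document id for a stable order.
    return sorted(merged, key=lambda row: (-row[1], row[0]))


if __name__ == "__main__":
    sources = {"news": news_backend, "patents": patents_backend}
    for row in metasearch("market intelligence", sources):
        print(row)
```

The hard part is hidden in the sort. Scores from different engines are not on the same scale, so a real metasearch system has to normalize or re-rank before merging, which this sketch does not attempt.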

Ami also emulated the analytic report methods found in i2 Analyst’s Notebook and Palantir Technologies, among others.

image

No details about the terms of the deal were announced. I did not include Ami Albert in any of the Enterprise Search Report profiles I created. The company seemed to be focused on building traction in Europe, not the US. In retrospect, Ami’s trajectory is similar to many other search vendors’. The company enters the market, moves forward for ten years, and then sells. A new owner is probably a better fate than locking the doors and turning off the lights.

Stephen E Arnold, August 22, 2016

An IBM Watson Retrospective

August 20, 2016

We love IBM Watson. We avidly devoured “A Look at IBM’s Watson 5 Years After Its Breathtaking Jeopardy Debut.” The “singularity” is associated in my mind with Google, but the write up is about IBM Watson. What’s not to like?

The review kicks off by reminding the reader (in this case, me) that Watson is a version of DeepQA software. (I added this mental footnote: Lucene, home brew code, and acquired technologies.)

I did not know that IBM wanted to create a Siri for business. IBM and Apple have formed a bit of a teamlet in the last year or so.

I highlighted this passage:

Watson shrunk from the size of a large bedroom to that of four pizza boxes and is now accessible via the cloud on tablet and smartphone. The system is 240% more powerful than its predecessor and can process 28 types (or modules) of data, compared to just 5 previously.

In 2013, IBM open-sourced Watson’s API and now offers IBM Bluemix, a comprehensive cloud platform for third-party developers to build and run apps on top of Watson’s many computing capabilities.

But one of the biggest moves that’s made Watson into what it is today was when, in 2014, IBM invested $1 billion into creating “IBM Watson Group,” a massive division dedicated to all things Watson and housing some 2,000 employees. This was the tipping point when Watson went from “startup mode” to making cognitive computing mainstream. It’s when Watson started to feel very, well, “IBM.” Fast-forward to 2016, and Watson has more enterprise services and solutions than I can list here—financial advisor, automated customer service representative, research compiler—you name the service, Watson can probably do it.

The confidence in Watson seems unbounded.

The write up explains the future of Watson. I learned:

IBM is aware of deep learning and last year told MIT Technology Review that the team is integrating the deep learning approach into Watson. The original system was already a bit of a mashup—combining natural language understanding with statistical analysis of large datasets. Deep learning may round it out further.

I was under the impression that training Watson with data was part of the plumbing. Deep learning, I conclude, is a bit of frosting on the cake.

The review ends with a reminder that Watson is an augmented intelligence system just like Palantir Technologies’ Gotham and Metropolitan systems, not an artificial intelligence system.

The future is “powerful ways” for IBM, humans, and Watson to work together. I believe this. I believe this. I believe this. I believe this. I believe… Sustainable revenues and profits will follow. I believe this too.

Stephen E Arnold, August 20, 2016

Library Search and Survival

July 29, 2016

I read “Library Systems Report 2016.” Interesting round up of niche player companies. The focus is upon library systems. This is today’s equivalent of the card catalogs I used when I was a wee lad in central Illinois.

Three points jumped out at me:

  • Most of the companies mentioned in the report are unknown outside of the library market. That’s okay. One can make a great deal of money serving niche markets. The takeaway for me was the technologies referenced struck me as decidedly 1990s-ish. There are no Palantir Technologies in this collection of “high tech” leaders.
  • The industry, which strikes me as small compared with Pokémon Go, is consolidating. I have no problem with this, but it suggests that library funding may be further constrained. With fewer libraries, there will be fewer customers; therefore, only the “big” will survive. MBAs threaten MLSs it seems.
  • Open source software and Web based and cloud solutions are beginning to have an impact. As I said, 1990s-ish thinking perhaps.

This quote sums up how one of the big dogs approaches the financial challenges it faces:

EBSCO Information Services stands as one of the major forces in the library technology sector, despite not offering its own comprehensive management product.

Unlike Google or Facebook, EBSCO, a company once known for making three ring binders, wants to be everyone’s connectivity pal.

Which of these vendors will become a billion dollar company? Which library start up will be the next big thing on Shark Tank?

I think I know the answer to these questions. Do you?

Stephen E Arnold, July 29, 2016

Thomson Reuters: Selling at Peak Value?

July 12, 2016

I read “Thomson Reuters Announces Definitive Agreement to Sell its Intellectual Property & Science Business to Onex and Baring Asia for $3.55 billion.” Thomson Reuters has been working hard to pump up revenue and generate a juicy profit for its stakeholders. Like IBM, it seems that the best way to get a large, established company in gear is to sell assets. According to the write up, Thomson Reuters’ management thinks:

“With the completion of this divestiture, Thomson Reuters will be even more focused on operating at the intersection of global commerce and regulation.”

What’s next for Thomson Reuters? More video? More Palantir repackaging? Higher fees for its professional information services?

Thomson Reuters has tried many things in the last two decades. The result is suggested in this chart:

image

The top line has been drifting down. The profit margin (the all important red line) has been a roller coaster. The net income has been a result of management moves and cost controls.

The question is: Is this collection of patent and IP related properties at peak value?

My hunch is that Thomson Reuters found the deal palatable. What will the new owners do with the properties? Both are investment outfits. The trajectory of these “services” like Compumark will be interesting to follow.

For Thomson Reuters, the hurdle remains growth. Isn’t that a problem with which IBM is struggling? Running specialist businesses with those who are not experts in each niche has been a challenge for many firms. Now the new owners Onex Partners and Baring Private Equity Asia have an opportunity to display their management expertise.

Selling is easier than innovating. Managing a bundle of businesses may be even more difficult.

Stephen E Arnold, July 12, 2016

Enterprise Search Vendors: A Partial List

June 24, 2016

I spoke with a confused and unbudgeted worker bee at a giant outfit this weekend. The stellar professional was involved in figuring out what to do about enterprise search. The story is one I have heard many times in the last 40 years. The system doesn’t meet the needs of the users. The system is over budget. The system does not index in real time. Yadda yadda yadda.

The big question was, “What are the enterprise search vendors offering a system which actually works and does not experience downtime, cost overruns, and user outrage?” Note that this is not the word “outage.” The word is “outrage.”

I don’t know of such a system. As a helpful 72 year old, I rattled off a list of vendors who purport to offer Big Data capable, next generation semantic-linguistic-NLP systems. True to form, I repeated the list twice. I thought he would cry.

For those of you who want to know the vendors I plucked from my list of outfits in the search and content processing game, I reproduce the list. If you want upsides, downsides, license fees, gotchas, and other assorted details, I will provide the information. But since you are not likely to buy me dinner this evening, you will have to pay for my thoughts.

Here’s the selected list. Reader, start your browser:

  • Attivio
  • Coveo
  • dtSearch
  • Elasticsearch (Lucene)
  • Fabasoft Mindbreeze
  • IBM Omnifind
  • IHS Goldfire
  • Lookeen
  • Lucid Works (Solr)
  • Marklogic
  • Maxxcat
  • Polyspot
  • Sinequa
  • Solcara
  • Squiz Funnelback
  • Thunderstone
  • X1
  • Yippy

There are quite a few outfits whose systems do search like Palantir, but I trimmed the list to companies for my worried pal.

What’s interesting is that most of these outfits explain that their systems are much, much more than search and retrieval. Believe it or not as Mr. Ripley used to say.

Factoid: Most of these outfits have been around for quite a few years. Only Elasticsearch has managed to become a “brand” in the search space. What happened to Autonomy, Convera, Endeca, Fast Search & Transfer, and Verity since I wrote the first three editions of the Enterprise Search Report between 2003 and 2007? Ugly for some.

Search is a tough problem and has yet to deliver what users expect. Remember Google killed its search appliance. Ads are a better business because they spell money for Alphabet.

Stephen E Arnold, June 24, 2016
