Crimping: Is the Method Used in Text Processing?
October 4, 2016
I read an article I found quite thought provoking. “Why Companies Make Their Products Worse” explains that reducing costs allows a manufacturer to expand the market for a product. The idea is that more people will buy a product if it is less expensive than a more sophisticated version. The example which I highlighted in eyeshade green explained that IBM introduced an expensive printer in the 1980s. IBM then manufactured a different version of the printer using cheaper labor. The folks from Big Blue added electronic components to make the cheaper printer slower. The result was a lower cost printer that was “worse” than the original.
Perhaps enterprise search and content processing is a hybrid of two or more creatures?
The write up explained that this approach to degrading a product to make more money has a name—crimping. The concept creates “product sabotage”; that is, intentionally degrading a product for business reasons.
The comments to the article offer additional examples and one helpful person with the handle Dadpolice stated:
The examples you give are accurate, but these aren’t relics of the past. They are incredibly common strategies that chip makers still use today.
I understand the hardware or tangible product application of this idea. I began to think about the use of the tactic by text processing vendors.
The Google Search Appliance may have been a product subject to crimping. As I recall, the most economical GSA was less than $2000, a price which was relatively easy to justify in many organizations. Over the years, the low cost option disappeared and the prices for the Google Search Appliances soared to Autonomy- and Fast Search-levels.
Other vendors introduced search and content processing systems, but the prices remained lofty. Search and content processing in an organization never seemed to get less expensive when one considered the resources required, the license fees, the “customer” support, the upgrades, and the engineering for customization and optimization.
My hypothesis is that enterprise content processing does not yield compelling examples like the IBM printer example.
Perhaps the adoption rate for open source content processing reflects a pent up demand for “crimping”? Perhaps some clever graduate student will take the initiative to examine content processing product prices? Licensees pay handsomely for sophisticated systems like those available from outfits like IBM and Palantir Technologies. The money comes from the engineering and what I call “soft” charges; that is, training, customer support, and engineering and consulting services.
At the other end of the content processing spectrum are open source solutions. The middle ground between free or low cost systems and high end solutions does not have too many examples. I am confident there are some, but I could identify only Funnelback, dtSearch, and a handful of other outfits.
Perhaps “crimping” is not a universal principle? On the other hand, perhaps content processing is an example of a technical software which has its own idiosyncrasies.
Content processing products, I believe, become worse over time. The reason is not “crimping.” The trajectory of lousiness comes from:
- Layering features on keyword retrieval in hopes of generating keen buyer interest
- Adding features to justify price increases
- Increasing system complexity so that the licensee is less able to fiddle with the system
- Refusing to admit that content processing is a core component of many other types of software, so “finding information” has become a standard component of other applications.
If content processing is idiosyncratic, that might explain why investors pour money into content processing companies which have little chance to generate sufficient revenue to pay off investors, generate a profit, and build a sustainable business. Enterprise search and content processing vendors seem to be in a state of reinventing or reimagining themselves. Guitar makers just pursue cost cutting and expand their market. It is not so easy for content processing companies.
Stephen E Arnold, October 4, 2016
US Government: Computer Infrastructure
September 26, 2016
Curious about the cost of maintaining a computer infrastructure? Companies may know how much they spend to maintain the plumbing, but those numbers are usually buried in accounting-speak within the company. Details rarely emerge.
Here’s a useful chart about how much spending for information technology goes to maintain the old stuff and the status quo versus how much goes to the nifty new technology:
The important line is the solid blue one. Notice that the US Federal government spent 68 cents of every IT dollar on operations and maintenance in 2010. Jump to the 2017 estimate. Notice that the status quo is likely to consume 77 cents of every IT dollar.
Progress? If you want to dig into the information behind this chart, you can find the report GAO 677454 by running queries on the Alphabet Google system. The title of the report is “Information Technology. Federal Agencies Need to Address Aging Legacy Systems.” Don’t bother trying the search box on the GAO.gov Web site. The document is not in the index.
If you are not too keen on running old school mobile queries or talking to your nifty voice enabled search system, you can find the document at this link.
I want to point out that Palantir Technologies may see these types of data as proof that the US government needs to approach information technology in a different manner.
Stephen E Arnold, September 26, 2016
Alphabet Google Faces a Secret Foe
September 21, 2016
I thought indexing the world’s information made it possible to put together disparate items of information. Once assembled, these clues from the world of real time online content would allow a person with access to answer a business question.
Apparently Alphabet Google faces a secret foe. I learned this by reading “Secretive Foe Attacks Google over Government Influence.” I learned:
Google has come under attack by a mysterious group that keeps mum about its sponsors while issuing scathing reports about the Mountain View search giant’s influence on government.
The blockbuster write up reported:
So far, only Redwood Shores-based Oracle has admitted to funding the Transparency Project, telling Fortune it wanted the public to know about its support for the initiative.
Yikes, a neighbor based at the now long gone Sea World.
The outfit going after the lovable Alphabet Google thing is called the Transparency Project. The excited syntax of the write up told me:
The Transparency Project commenced hostilities against Google in April, gaining national media attention with a report tracking the number of Googlers taking jobs in the White House and federal agencies, and the number of federal officials traveling in the other direction, into Google. Project researchers reported 113 “revolving door” moves between Google — plus its associated companies, law firms and lobbyists — and the White House and federal agencies.
Okay, but back to my original point. With the world’s information at one’s metaphorical fingertips, is it not possible to process email, Google Plus, user search histories, and similar data laden troves for clues about the Transparency Project?
Perhaps the Alphabet Google entity lacks the staff and software to perform this type of analysis? May I suggest a quick telephone call to Palantir Technologies. From what I understand by reading open source information about the Gotham product, Palantir can knit together disparate and fragmented data and put the members of the Transparency Group on the map in a manner of speaking.
I understand the concept of finding fault with a near perfect company. But the inability of a search giant to find out who, what, when, where, how, and why baffles me.
It does not, as an old school engineer with a pocket protector might say, compute.
Stephen E Arnold, September 21, 2016
OpenText: Content Marketing or Real News?
September 18, 2016
When I knew people at the original Malcolm Forbes’ Forbes, I learned that stories were meticulously researched and edited. I read “Advanced Analytics: Insights Produce New Wealth.” I was baffled, but, I thought, that’s the point.
The main point of the write up pivots on the assertion that an “insight” converts directly to “wealth.” I am not sure about the difference between old and new wealth. Wealth is wealth in my book.
The write up tells me:
Data is the foundation that allows transformative, digital change to happen.
The logic escapes me. The squirrels in Harrod’s Creek come and go. Change seems to be baked into squirreldom. The focus is “the capitalist tool,” and I accept that the notion of changing one’s business can be difficult. The examples are easy to spot: IBM is trying to change into a Watson powered sustainable revenue machine. HP is trying to change from a conglomeration of disparate parts into a smaller conglomeration of disparate parts. OpenText is trying to change from a roll up of old school search systems into a Big Data wealth creator. Tough jobs indeed.
I learned that visualization is important for business intelligence. News flash. Visualization has been important since a person has been able to scratch animals on a cave’s wall. But again I understand. Predictive analytics from outfits like Spotfire (now part of Tibco) provided a wake up call to some folks.
The write up informs me:
While devices attached to the Internet of Things will continue to throw out growing levels of structured data (which can be stored in files and databases), the amount of unstructured data being produced will also rise. So the next wave of analytics tools will inevitably be geared to dealing with both forms of information seamlessly, while also enabling you to embed the insights gleaned into the applications of your choosing. Now that’s innovation.
Let’s recap. Outfits need data to change. (Squirrels excepted.) Companies have to make sense of their data. The data come in structured and unstructured forms. The future will be software able to handle structured and unstructured data. Charts and graphs help. Once an insight is located and presented by algorithms which may or may not be biased, the “insights” will be easy to put into a PowerPoint.
BAE Systems’ “Detica” was poking around in this insight in the 1990s. There were antecedents, but BAE is a good example for my purpose. Palantir Technologies provided an application demo in 2004 which kicked the embedded analytics notion down the road. A click would display a wonky circular pop up, and the user could perform feats of analytic magic with a mouse click.
Now Forbes’ editors have either discovered something that has been around for decades or been paid to create a “news” article that reports information almost as timely as how Lucy died eons ago.
Back to the premise: Where exactly is the connection between insight and wealth? How does a roll up of unusual search vendors like Information Dimensions, BRS, Nstein, Recommind, and my favorite old timer Fulcrum Technologies produce evidence for the insight-to-wealth argument? If I NOT out these search vendors and focus on the Tim Bray SGML search engine, I still don’t see the connection. Delete Dr. Bray’s invention. What do we have? We have a content management company which sells content management as governance, compliance, and other consulting friendly disciplines.
Consultants can indeed amass wealth. But the insight comes not from Big Data. The wealth comes from selling time to organizations unable to separate the giblets from the goose feathers. Do you know the difference? The answer you provide may allow another to create wealth from that situation.
One doesn’t need Big Data to market complex and often interesting software to folks who require a bus load of consultants to make it work. For Forbes, the distinction between giblets and goose feathers may be too difficult to discern.
My hunch is that others, not trained in high end Manhattan journalism, may be able to figure out which one can be consumed and which one can ornament an outfit at an after party following a Fashion Week showing.
Stephen E Arnold, September 18, 2016
In-Q-Tel Wants Less Latency, Fewer Humans, and Smarter Dashboards
September 15, 2016
I read “The CIA Just Invested in a Hot Startup That Makes Sense of Big Data.” I love the “just.” In-Q-Tel investments are not like bumping into a friend in Penn Station. Zoomdata, founded in 2012, has been making calls, raising venture funding (more than $45 million in four rounds from 21 investors), and staffing up to about 100 full time equivalents. With its headquarters in Reston, Virginia, the company is not exactly operating from a log cabin west of Paducah, Kentucky.
The write up explains:
Zoom Data uses something called Data Sharpening technology to deliver visual analytics from real-time or historical data. Instead of a user searching through an Excel file or creating a pivot table, Zoom Data puts what’s important into a custom dashboard so users can see what they need to know immediately.
What Zoomdata does is offer hope to its customers for less human fiddling with data and faster outputs of actionable intelligence. If you recall how IBM i2 and Palantir Gotham work, humans are needed. IBM even snagged Palantir’s jargon of AI for “augmented intelligence.”
In-Q-Tel wants more smart software with less dependence on expensive, hard to train, and often careless humans. When incoming rounds hit near a mobile operations center, it is possible to lose one’s train of thought.
Zoomdata has some Booz, Allen DNA, some MIT RNA, and protein from other essential chemicals.
The write up mentions Palantir, but does not make explicit the need to reduce to some degree the human-centric approaches which are part of the major systems’ core architecture. You have nifty cloud stuff, but you have less nifty humans in most mission critical work processes.
To speed up the outputs, software should be the answer. An investment in Zoomdata delivers three messages to me here in rural Kentucky:
- In-Q-Tel continues to look for ways to move along the “less wait and less weight” requirement of those involved in operations. “Weight” refers to heavy, old-fashioned systems. “Wait” refers to the latency imposed by manual processes.
- Zoomdata and other investments apply whips to the flanks of the BAE Systems, IBMs, and Palantirs chasing government contracts. The investment focuses attention not on scope changes but on figuring out how to deal with the unacceptable complexity and latency of many existing systems.
- In-Q-Tel has upped the value of Zoomdata. With consolidation in the commercial intelligence business rolling along at NASCAR speeds, it won’t take long before Zoomdata finds itself going to big company meetings to learn what the true costs of being acquired are.
For more information about Zoomdata, check out the paid-for reports at this link.
Stephen E Arnold, September 15, 2016
HonkinNews, September 13, 2016 Now Available
September 13, 2016
Interested in having your polynomials probed? The Beyond Search weekly news explains this preventive action. In this week’s program you will learn about Google’s new enterprise search solution. Palantir is taking legal action against an investor in the company. IBM Watson helps out at the US Open. Catch up on the search, online, and content processing news that makes enterprise procurement teams squirm. Dive in with Springboard and Pool Party. To view the video, click this link.
Kenny Toth, September 13, 2016
Enterprise Search: Pool Party and Philosophy 101
September 8, 2016
I noted this catchphrase: “An enterprise without a semantic layer is like a country without a map.” I immediately thought of this statement made by Polish-American scientist and philosopher Alfred Korzybski:
The map is not the territory.
When I think about enterprise search, I am thrilled to have an opportunity to do the type of thinking demanded in my college class in philosophy and logic. Great fun. I am confident that any procurement team will be invigorated by an animated discussion about representations of reality.
I did a bit of digging and located “Introducing a Graph-based Semantic Layer in Enterprises” as the source of the “country without a map” statement.
What is interesting about the article is that the payload appears at the end of the write up. The magic of information representation as a way to make enterprise search finally work is technology from a company called Pool Party.
Pool Party describes itself this way:
Pool Party is a semantic technology platform developed, owned and licensed by the Semantic Web Company. The company is also involved in international R&D projects, which continuously impact the product development. The EU-based company has been a pioneer in the Semantic Web for over a decade.
From my reading of the article and the company’s marketing collateral it strikes me that this is a 12 year old semantic software and consulting company.
The idea is that there is a pool of structured and unstructured information. The company performs content processing and offers such features as:
- Taxonomy editor and maintenance
- A controlled vocabulary management component
- An audit trail to see who changed what and when
- Link analysis
- User role management
- Workflows
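The feature list above can be made concrete with a small sketch of controlled vocabulary tagging. The vocabulary, terms, and function below are hypothetical illustrations of how a preferred-term list might drive indexing; they do not reflect Pool Party’s actual implementation or API.

```python
# A tiny controlled vocabulary mapping preferred terms to synonyms.
# All terms are invented for illustration.
VOCABULARY = {
    "semantic web": ["linked data", "rdf"],
    "taxonomy": ["controlled vocabulary", "thesaurus"],
}

def tag_document(text):
    """Return preferred terms whose label or a synonym appears in the text."""
    lowered = text.lower()
    tags = [
        preferred
        for preferred, synonyms in VOCABULARY.items()
        if preferred in lowered or any(s in lowered for s in synonyms)
    ]
    return sorted(tags)

print(tag_document("A thesaurus helps manage linked data projects."))
# -> ['semantic web', 'taxonomy']
```

A production system would add phrase normalization, stemming, and the audit trail noted above; the sketch shows only the core lookup idea.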
The write up with the catchphrase provides an informational foundation for the company’s semantic approach to enterprise search and retrieval; for example, the company’s four layered architecture:
The base is the content layer. There is a metadata layer which in Harrod’s Creek is called “indexing”. There is the “semantic layer”. At the top is the interface layer. The “semantic” layer seems to be the secret sauce in the recipe for information access. The phrase used to describe the value added content processing is “semantic knowledge graphs.” These, according to the article:
let you find out unknown linkages or even non-obvious patterns to give you new insights into your data.
The system performs entity extraction, supports custom ontologies (a concept designed to make subject matter experts quiver), text analysis, and “graph search.”
Graph search is, according to the company’s Web site:
Semantic search at the highest level: Pool Party Graph Search Server combines the power of graph databases and SPARQL engines with features of ‘traditional’ search engines. Document search and visual analytics: Benefit from additional insights through interactive visualizations of reports and search results derived from your data lake by executing sophisticated SPARQL queries.
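The “graph search” idea in the passage above can be illustrated with a toy example. A SPARQL engine matches patterns against (subject, predicate, object) triples; the minimal Python sketch below mimics that matching with invented entities and predicates, and is in no way Pool Party’s engine.

```python
# A graph stored as (subject, predicate, object) triples.
# Entities and predicates are hypothetical.
TRIPLES = [
    ("acme_corp", "located_in", "vienna"),
    ("acme_corp", "partner_of", "globex"),
    ("globex", "located_in", "vienna"),
]

def match(pattern, triples=TRIPLES):
    """Return triples matching the pattern; None behaves like a SPARQL variable."""
    return [
        t for t in triples
        if all(p is None or p == v for p, v in zip(pattern, t))
    ]

# "Which entities are located in Vienna?" -- roughly analogous to the
# SPARQL query: SELECT ?s WHERE { ?s :located_in :vienna }
subjects = [s for s, _, _ in match((None, "located_in", "vienna"))]
print(subjects)  # -> ['acme_corp', 'globex']
```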
To make this more clear, the company offers a number of videos via YouTube.
The idea reminded us of the approach taken in BAE NetReveal and Palantir Gotham products.
Pool Party emphasizes, as does Palantir, that humans play an important role in the system. Instead of “augmented intelligence,” the article describes the approach methods which “combine machine learning and human intelligence.”
The company’s annual growth rate is more than 20 percent. The firm has customers in more than 20 countries. Customers include Pearson, Credit Suisse, the European Commission, Springer Nature, Wolters Kluwer, the World Bank, and “many other customers.” The firm’s projected “Euro R&D project volume” is 17 million (although I am not sure what this 17,000,000 number means). The company’s partners include Accenture, Complexible, Digirati, and EPAM, among others.
I noted that the company uses the catchphrases “Semantic Web Company” and “Linking data to knowledge.”
The catchphrases, I assume, make it easier for some to understand the firm’s graph based semantic approach. I am still mired in figuring out that the map is not the territory.
Stephen E Arnold, September 8, 2016
Hewlett Packard: About Face
September 7, 2016
I read “Exclusive: HP Enterprise in Talks to Sell Software Unit to Thoma Bravo – Sources.” Who does not love a news story labeled “exclusive” and attributed to “sources” when the subject is Hewlett Packard Enterprise? The thrust of the story is that HPE, fresh from making marketing noises about its enterprise software business, is allegedly selling those software businesses.
Let’s assume that this is indeed accurate. The asking price is in the neighborhood of $8 to $10 billion, or more if the excited buyer really wants this collection of software.
Why is HPE selling what it has been working hard to craft into a sustainable revenue stream with healthy profit margins? The write up reports:
HPE’s software unit generated $3.6 billion in net revenue in 2015, down from $3.9 billion in 2014. The company has said revenue growth in its software unit has been challenged by a market shift toward cloud subscription offerings.
Yep, these numbers will drive potential buyers into a frenzy.
The word in Harrod’s Creek, Kentucky, is that HPE is eager to find a way to make money, boost the company’s value to shareholders, and plug into the fluffy cloud opportunities. HPE’s present software may not be the answer for HPE. Another outfit should be able to release a flood of revenue.
One of the goslings (un-named, of course) thought that HPE was going cold turkey to kick its Autonomy habit. The shadow of the search business makes life chilly for the would be technology leader. In an “exclusive” comment to Beyond Search, HPE anticipates victory in its legal flap associated with the purchase of Autonomy for a modest $10 or $11 billion.
We don’t know if our un-named gosling is on the right track, but if HPE sells Autonomy and other assorted gems from its software vault, the difference between what HPE paid for Autonomy and the amount generated by the sale of Autonomy is only a couple billion dollars.
What’s a few billion dollars for a focused, consistent, well managed outfit like HPE? A pittance I say.
I wonder, “Does the buyer of HPE’s Autonomy-infused bundle recognize the excitement selling search and retrieval will engender?” Sure. These are savvy folks. Generating revenue from proprietary search and content processing software is really easy.
If Google can do it, anyone can, right? Oh, Google closed its enterprise search product. Well, what about Palantir? Oh, Palantir relies on open source for findability functions. How about IBM? Oh, shucks, IBM relies on Lucene with home brew code and acquired technology.
As I said, search is easy.
Stephen E Arnold, September 7, 2016
A Blurred In-Q-Tel X-Ray: Real Journalists Uncover Old News
September 4, 2016
I noted this write up by the Rupert Murdoch outfit, the Wall Street Journal: “The CIA’s Venture-Capital Firm, Like Its Sponsor, Operates in the Shadows.” You may have to buy a dead tree version of the Wall Street Journal, go to your public library, or subscribe to read the source itself. (Don’t hassle me if the link begs for dollars. Buzz Mr. Murdoch and express your views.)
The point of the article is that the US government’s intelligence outfit operates a venture capital firm. That investment entity does business as In-Q-Tel. The goal, in my opinion, is to identify promising technologies which may have application at the Central Intelligence Agency. Please note that much of the work at the CIA is not public. That’s because it mostly operates in secrecy. The fact that a government has secret activities is not exactly news.
Furthermore, whom do you think advises the Central Intelligence Agency and its various units? Choose from the following list:
- Immigrants without US entry authorizations
- Felons recently released from prison to a half way house
- Individuals working for governments antithetical to the posture of the United States
- Investigative journalists looking for a gig
- Individuals with clearances or a track record of serving the US.
Okay, you picked items one through three? You may qualify for work at a large, “real news” outfit. If you selected item four, you now understand why the news about the individuals and the companies exposed to In-Q-Tel is stale.
Obviously those in the spy game want folks who are in the same fox hole.
The write up reveals this stunning factoid: In-Q-Tel provides only limited information about its investments, and some of its trustees have ties to funded companies.
No kidding.
With considerable assiduity, the write up lists the companies in which In-Q-Tel has invested and notes:
Of about 325 investments In-Q-Tel says it has made since its founding, more than 100 weren’t announced, although the identities of some of those companies have leaked out. The absence of disclosure can be due to national-security concerns or simply because a startup company doesn’t want its financial ties to intelligence publicized, people familiar with the arrangements said. While moneymaking isn’t In-Q-Tel’s goal, when that happens, such as when a startup it funded goes public, In-Q-Tel can keep the profit and roll it into new projects. It doesn’t obtain rights to technology or inventions.
There you go. Why not let another nation’s intelligence services invest in high potential but little known innovators? The US government is trying to bring more business discipline to some of its activities. Therefore, is it not logical that an intelligence agency seeking high value products and services can use the proceeds from its investments to further the work of the intelligence agency?
I guess that’s a thought foreign to some real journalists.
What does one expect the CIA and In-Q-Tel to do? Publish a daily newspaper detailing the companies, people, and technologies the CIA is interested in? What about going on Fox News and explaining what’s hot and what’s not in advanced technology? Oh, right. Technology is not as much fun as pundits who over talk one another.
I know that an outfit owned by Rupert Murdoch is in the news business. I know that gathering information from the In-Q-Tel Web site is really difficult. For me, information about In-Q-Tel is a bit of a yawner.
I would much rather read about some of the management methods used in some major media entities. Government efforts to identify cutting edge technologies are just not that interesting to me. Where’s the beef? Why not consider why certain categories of investments have not yielded products and services which can be used across missions? Why not explore why Purple Yogi was a dead end and why Palantir is not? Oh, right. That’s harder than realizing that in certain types of work one wants to deal with individuals from that fox hole.
Stephen E Arnold, September 4, 2016
Business Intelligence: Four Generalized Hurdles
August 30, 2016
Business intelligence, like government intelligence, may be an oxymoron. Nevertheless, doing “intelligence” is a big business. That’s why Palantir Technologies is hoping lawyers can crack open the US Army’s coin purse.
I read “4 Huge Challenges Facing CIOs and IT Leaders.” I quite like the use of “chief information officer” and “information technology leader” in the headline. CIOs seem to be struggling to meet their budgets, deal with security issues, and attend conferences. The notional “information technology leader” is busy reading reports from mid tier consulting firms, dealing with the all-too-frequent emergencies, and removing malware from senior executives’ computing devices.
The write up identifies four “challenges” these busy professionals must convert to opportunities in their spare time. What are these “challenges”? Here’s my translation of MBA speak into Harrod’s Creek, Kentucky lingo:
- Executives have to write checks and push aside bureaucratic baloney so that business intelligence can move forward. If the top dog doesn’t care, well, you can always check out Facebook and read Reddit.
- Get something done when you said you would complete the task. Good luck with that. Meetings, approvals, crashes [see the comment above about information technology professionals’ time allocation], and software that simply doesn’t work are enemies of finishing a job. I assume that the people performing business intelligence know what they are doing most of the time when they are not sure what the objective of the project is.
- Normalizing, vetting, and processing data. Yikes, this challenge has been in the fast lanes of the information superhighway for more than 50 years. Hey, that XML is just great, isn’t it?
- Getting users to use the business intelligence outputs. If the users don’t understand the outputs, don’t trust the outputs, or prefer their own methods—update that link graph thing on Microsoft LinkedIn.
When one steps back from this list of challenges, the issues are not new. The more chaotic the business environment is perceived to be, the less likely converting these opportunities into a career win may be.
Even when a system does deliver useful outputs like Palantir Gotham, getting acceptance is a very difficult challenge. A person without the resources of Palantir might find the conversion of these challenges a bit of a challenge in itself.
May I suggest that the solution is to start small, demonstrate value, and move forward? How popular is that approach? Not very.
Stephen E Arnold, August 30, 2016