February 26, 2015
I am a simple person, gliding slowly into the assisted living facility. I know I cannot keep up with the management wizards in the search and content processing sectors. (I do bristle when “experts” address parental instructions toward me in their LinkedIn posts.)
I ran a query for SmartLogic. The Smartlogic I know a bit about is an outfit that performs automated indexing. The company’s hook is “the content intelligence company.” The idea is that if a document is indexed, then the content becomes smarter. This is a claim I have heard repeated from prescient thinkers like Dr. Ron Sacks Davis, the person making possible the TeraText system. Dr. Sacks Davis floated this idea in 1975. Down the line, CALS and then SGML advocates pitched the advantages of tagging structural elements, stuffing the components and the tags into a database, and discovering the joys of scripted content slicing and dicing. In the modern era, many companies, including Smartlogic, have dusted off the intelligent content moniker as a way to generate interest in automated indexing, the joys of taxonomies, and slipping a data management system into a company under the cover of metadata. LinkedIn experts are thrashing about this Trojan Horse maneuver as I write this blog item.
Run a query for Smartlogic, however, and one sees that there are two Smartlogics. One uses a lower case “l” in its spelling; the other, an upper case “L.” When I run the query for “smartlogic” on Google, this is what the GOOG displays:
Yep, two Smartlogics. One has a dot com domain and the other a dot io domain. The big “L” outfit is doing a much better job of getting its brand into the various electronic media. When I run the query “smartlogic Baltimore”, the heavens open and rain links to helping other companies, writing software, and making the Baltimore business scene vibrant.
Here’s the newer (upper case “L”) SmartLogic.io. Pretty snappy design I would suggest.
About one year ago, the content intelligence flavor of Smartlogic was the Big Dog in the Google index. Today, not so much. Here’s what the indexing Smartlogic’s Web site (lower case “l”) looks like on February 25, 2015:
Understated in comparison to the upper case “L” outfit I perceive.
Questions I formulated are:
- How has Smartlogic marketing squandered its grip on the name “smartlogic”?
- How is the SmartLogic.io company dominating social media?
- What happens to Web site traffic and over the transom questions from potential customers who want indexing and end up looking at a services firm in Baltimore?
When search vendors lose control of their brand, I often hear, as I did from Brainware before it was acquired by Lexmark, “You cannot provide links to videos via the keyword ‘brainware.’ The videos are inappropriate.” Mismanaging a company name is my fault?
Get real, Brainware.
I see this erosion when I search for Connotate, Thunderstone, and now Smartlogic and others. I track this via my public Overflight pages.
Fascinating insight into what content processing executives perceive as important.
Stephen E Arnold, February 26, 2015
February 21, 2015
I wanted to capture Antidot’s semi pivot from enterprise search to eCommerce search. The French company provides a useful description of its afs@store product. If you bang this product name into the GOOG, you find that the American Foundry Society, Associated Food Stores, and the American Fisheries Society push Antidot’s product down the results list. In general, names of search and content processing systems often disappear into search results. Perhaps Antidot has a way to make the use of the “@” sign somewhat less problematic.
The system, according to Antidot, delivers features that sidestep the unsticky nature of most eCommerce customer visits. Antidot asserts:
- Rich, tolerant and customizable auto complete featuring products, brands, categories…
- Fully typo-tolerant search
- Semantic search that understands your customer’s words
- Dynamic filtering facets to rapidly select desired products
- Web interface to simply monitor and manage your searchandising
The company offers a plug in for Magento, the open source eCommerce system, that enjoyed love from eBay. It is difficult to know if that love is growing stronger with time, however.
I did notice that the “See and read more” panel had zero information and no links. Hopefully this void will be addressed.
Stephen E Arnold, February 21, 2015
February 9, 2015
Is Silicon Valley a place or a state of mind? “An Analysis of San Francisco’s Startups Shows Where the Real Silicon Valley Is” attempts to peg the there there.
The map suggests that Silicon Valley embraces a chunk of territory in the San Francisco area.
Apart from geography, there was an interesting list of success characteristics.
I noted this passage:
Based on the researchers’ model, the best indicators of entrepreneurial quality were characteristics like a company’s name. Companies that have short names containing words associated with technology tend to grow better. Businesses that aren’t named after their founders also tend to do better over time. In addition, the model showed that corporations are six times more likely to do well compared with companies that aren’t incorporated. The same goes for trademarks; companies with trademarks are five times more likely to grow than non trademarked businesses. Patents are also important indicators of entrepreneurial quality…
In my view, one can use these characteristics to identify search and content processing vendors likely to have a greater chance of success. The company name issue seems to be important. When a company located outside of the Silicon Valley hot zones loses control of its name and brand, the company may face more friction. Trademarks are also important.
Does this mean that when a vendor’s attempts to boost its identity are ineffective, that vendor may be at risk? In short, marketing may be more important than some technologists believe. Also, Berkeley seems to be a better location than a trailer in Mt. Diablo.
Stephen E Arnold, February 9, 2015
February 8, 2015
Protecting the “name” of a company is important. I pointed out that several content processing vendors were losing control of their name in terms of a Bing or Google query. I noticed another vendor finding itself in the same pickle.
Navigate to YouTube. Run a query for Mondeca. What the query “mondeca” triggers is a spate of Joey Mondeca videos.
Now try the Twitter search for the string “mondeca”. Here’s what I see:
Mondeca, the French smart content outfit, may want to turn its attention to dealing with Joey Mondeca’s social media presence. On the other hand, maybe findability is not a priority.
A Google query for “mondeca” returns links to the company. But Joey is moving up in the results list. When vendors lose control of a name as Brainware did, getting back that semantic traction is difficult. Augmentext has a solution.
Stephen E Arnold, February 9, 2015
February 8, 2015
I read “Why It’s Never Been Harder to Be Seen on Social Networks (and What to Do about It: Hint, Buy Ads).” Google would certainly approve of this title’s message. The Twitter Google tie up is designed to deal with recalcitrant Twitter members like my dog Tess. She has a Twitter account and a Facebook page.
I noted a factoid:
tweets have an extremely short life span; a tweet’s half-life—that is, when half of a link’s total clicks occur—is 24 minutes, according to social media analytics firm Wisemetrics. So if a consumer doesn’t interact with a brand’s message shortly after it’s posted, chances are, he probably never will. Marketers are finding similar situations on Pinterest, Instagram, Tumblr and even the shopping-focused, advertising-supported Polyvore.
I have zero idea if this assertion is accurate. My hunch is that the time value of a tweet is even less. I, for example, do not read tweets; therefore, the tweets flowing out that I could view have a life span of zero.
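To put the 24 minute figure in perspective, here is a minimal Python sketch. It assumes, purely for illustration, that clicks on a tweeted link tail off exponentially; the quoted passage defines only the half-life and says nothing about the shape of the curve. The function name and the default parameter are hypothetical.

```python
def share_of_clicks_by(minutes: float, half_life_minutes: float = 24.0) -> float:
    """Fraction of a link's eventual clicks received within `minutes` of posting.

    ASSUMPTION: clicks decay exponentially. The only sourced input is the
    24 minute half-life attributed to Wisemetrics in the quoted article.
    """
    if minutes < 0 or half_life_minutes <= 0:
        raise ValueError("minutes must be >= 0 and half_life_minutes must be > 0")
    return 1.0 - 0.5 ** (minutes / half_life_minutes)


# Share of total clicks after one, two, and five half-lives.
for t in (24, 48, 120):
    print(f"{t:>3} minutes: {share_of_clicks_by(t):.1%} of clicks")
```

Under that assumption, three quarters of the clicks a tweet will ever get arrive within 48 minutes, and almost all of them within two hours.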
With Twitter a growth challenged and geographic-centric activity, tweets are ephemeral in my view.
One way to interpret this factoid is that a Twitter member who wants to be noticed faces an uphill climb. But Mother Google and Cousin Twitter are there to help—for a price.
Stephen E Arnold, February 8, 2015
February 7, 2015
With new senior managers and a hunt on for a new director of financial services, Attivio is definitely trying to shake ‘em up. I received some public relations spam about the most recent version of the Attivio system. The approach combines open source software with home brew code, an increasingly popular way to sell licenses, consulting, and services. To top it off, Attivio is an outfit that has the “best company culture” and Dave Schubmehl’s IDC report about Attivio with my name on it available for free. This was a $3,500 item on Amazon earlier this year. Now. Free.
Attivio’s February 3, 2015, news release explains that Attivio is in the enterprise search business. You can read the presser at this link. Not too long ago, Attivio was asserting that it was the solution to some business intelligence woes. I suppose search and business intelligence are related, but “real” intelligence requires more than keyword search and a report capability.
The release explains that Attivio is (I find this fascinating) “reinventing Big Data Search and Dexterity.” Not bad for open source, home brew, and Fast Search & Transfer flavoring. Search and dexterity. Definitely a Google Adword keeper.
Attivio’s presser says:
Attivio 4.3 delivers new functionality and improvements that make it dramatically easier to build, deploy, and manage contextually relevant applications that drive revolutionary insight. Companies with structured and unstructured data in disparate silos can now quickly gain immediate access to all information with universal contextual enrichment, all delivered from Attivio’s agile enterprise platform.
I like “revolutionary insight.” Keep in mind that Attivio was formed by former Fast Search & Transfer executives in 2007 and has ingested, according to Crunchbase, $71.1 million in seven years. That works out to roughly $10 million per year to do various technical things and sell products and services to generate money.
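The per-year figure is simple division. A minimal Python sketch of the arithmetic, using only the Crunchbase total and the seven year span quoted above; the function name is my own illustration, not anything from Attivio or Crunchbase:

```python
def annual_funding_rate(total_millions: float, years: float) -> float:
    """Average venture funding consumed per year, in millions of dollars."""
    if years <= 0:
        raise ValueError("years must be positive")
    return total_millions / years


# Figures quoted in the post: $71.1 million over seven years.
attivio_rate = annual_funding_rate(71.1, 7)
print(f"Attivio: about ${attivio_rate:.1f} million per year")
```

The result is a shade over $10 million per year, which matches the rounded figure in the paragraph above.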
More significant to me than money that may be difficult or impossible to repay with a hefty uptick is that in seven years, Attivio has released four versions of its flagship software. With open source providing a chunk of functionality, it strikes me that Attivio may be lagging behind the development curve of some other companies in the content processing sector. But with advisors like Dave Schubmehl and his colleagues, the pace of innovation is likely to be explained as just wonderful. At Cambridge University, one researcher pointed out that work done in 2014 is essentially part of ancient history. There is perhaps a difference between Cambridge in the UK and Cambridge in Massachusetts.
What does Attivio 4.3 offer as “key features”? Here’s what the news release offers:
- ASAP: Attivio Search Application Platform – a simple, intuitive user interface for non-technical users building search-based applications;
- SAIL: Search Analytics Interactive Layer – offers more robust functionality and an enhanced user experience;
- Advanced Entity Extraction: New machine-learning based entity extraction module enriches content with higher accuracy and improved disambiguation, enabling deeper discovery and providing a smart alternative to managing entity dictionaries;
- Simplified Management: Empowers business users to handle documents and manage settings in a code-free environment;
- Composite Documents: Unique ability to search across document fragments optimized to deliver sub-second response times;
- New Designer Tools: Simplifies Attivio management through Visual Workflow and Component Editors, enables all users to design and build custom processing logic in an integrated UI.
There are a couple of important features that are available in other vendors’ systems; for example, geographic functions, automated real-time content collection, automated content analytics, and automated outputs to a range of devices, humans, or other systems.
ASAP and SAIL are catchy acronyms, but I find them less than satisfying. The entity extraction function is interesting, but there is no detail about how it works in languages that do not use Roman character sets, how the system deals with variants, and how the system maps one version of an entity to another in content that is either static imagery or video.
I am not sure what a composite document is. If a document contains images and videos, what does the system do with these content objects? If the document is an XML representation, what’s the time penalty to convert content objects to well formed XML? With interfaces becoming the new black, Attivio is closing the gap with the Endeca interface toolkit. Endeca dates from the late 1990s and has blazed a trail through the same marketing jungle that Attivio is now retracing.
For more information about Attivio, visit the company’s Web site at www.attivio.com. The company will be better equipped to explain virtual, enterprise search, big data, and the company’s financial posture than I.
Stephen E Arnold, February 7, 2015
February 4, 2015
For months I have been commenting about the increasingly weird marketing pitches for IBM Watson. This is the Lucene and home grown script system positioned as the next big thing in information retrieval. The financial goals for this system were crazy. My recollection is that IBM wanted to generate a billion in revenue from open source search and bits and pieces of the IBM technology lumber.
Impossible. Having a system ingest bounded content and then answer “questions” about that content is neither new, remarkable, nor particularly interesting to me. When the system is presented as a way to solve the problem of cancer and generate barbeque sauce with tamarind, the silliness points to desperation.
IBM marketers were trying everything to make open source search into a billion dollar baby and pull off the stunt quickly. Keep in mind that Autonomy required 15 years and a number of pretty savvy acquisitions to nose into the $700 million range.
IBM, in its confused state, believed that it could do the trick in a fraction of the time. IBM apparently was unaware of the erratic thinking at Hewlett Packard that spent $11 billion for Autonomy and wanted to generate billions from that system at the same time IBM was going to collect a billion or more from the same market.
Both of these companies, dazed by a long term struggle with spreadsheet fever, were ignoring or simply did not understand the doldrums of the enterprise information access market. Big companies were quite happy to give open source solutions a try. Vendors of proprietary systems were pitching their keyword systems as everything from customer support “solutions” to business intelligence systems that would “predict” what the company should know.
I read with some sadness the posts at Alliance@IBM. The viewpoint is not that of IBM management, which is now firing or “resource allocating” away its people. I am not sure how many folks are going to be terminated, but the posts in this series of IBM employee comments suggest that the staff are unhappy. Some may not go gentle into that good night.
The point is that the underlying problems at IBM were evident in the silly Watson marketing. An organization that can with a straight face suggest that a next generation information access system can discover a new recipe provides a glimpse into an organization’s disconnect at a fundamental level.
Too bad. The stock buybacks, the sale of manufacturing assets, and the assertions that a mainframe is a mobile platform tell me that IBM stockholders may want to reevaluate those holdings.
If IBM asked Watson, I question the outputs.
Stephen E Arnold, February 4, 2015
February 2, 2015
I received one of those off the wall LinkedIn requests. Years ago the original LucidWorks (Really?) was a client of my advisory services. Marc Krellenstein, who left the company in an interesting, mysterious, and wave generating founder escape, mentioned me to another LucidWorks (Really?) employee. (Note: Dr. Krellenstein is now the senior vice president of technology development at Decision Resources.)
In the beginning, there was the dream of becoming the next RedHat of the enterprise search world.
Flash forward through two presidents and a legion of leaders to the departure of Paul Doscher, once involved with Exalead and Jaspersoft. Eric Gries left his CEO role after the first Lucene Revolution Conference. Yep, revolution. A new platoon of Horse Artillery arrived. I lost interest in the outfit.
Then the company morphed into a vendor who sold consulting that actually worked, often a rarity in the world of information access.
About half way through the almost eight year journey, Lucid Imagination morphed into LucidWorks (Really?). The company flip flopped from a consulting firm selling Lucene/Solr engineering into a Big Data company. The move was sparked by the company’s inability to generate a payback on the $40 million in venture capital pumped into the company since it opened for business in 2007.
Now the company has an off kilter logo in two shades of red and a lower case “w.” Marketing genius illuminates this substantive typographical maneuver. My goodness, the shift from blue to red is something I would associate with Dr. Einstein’s analysis of Brownian motion or Dr. Jon Kleinberg’s CLEVER algorithm or Dr. Jeffrey Dean’s work on Google Chubby.
The way I do math reveals that LucidWorks (Really?) is a seven year old company. The burn rate works out to about $6 million per year in venture funding plus whatever revenues the company has been able to generate on its 84 month journey. When LucidWorks (Really?) with Krellenstein on board set up shop, Bill Cowher resigned as head coach of the Pittsburgh Steelers and started his journey to seemingly low key Time Warner pitchman. Also in 2007 the Indianapolis Colts beat the Chicago Bears to win the Super Bowl. The first episode of Mad Men ran on a US pay for view channel. The number one song in 2007 was Beyonce’s “Irreplaceable.” Is this the tune Elasticsearch plays as it wins clients from LucidWorks (Really?)?
Now to the LinkedIn email:
A LucidWorks (Really?) employee wanted me to know that he was previously employed by Raritan, a connector and consulting company specializing in “federated search.” This person wanted to be my LinkedIn “amigo,” “BFF,” “Robin,” or who knows what else.
I pointed out that I did not want to be a LinkedIn friend with an outfit that may be the object of considerable attention from Granite Ventures, Shasta Ventures, Walden International, and In-Q-Tel, an outfit known for investments based on the US government’s curiosity, not payback.
My former Raritan federated search expert read my “no” and sent me this message:
Fair enough – we are after all a startup for chrissakes! I just published a blog on our Lucidworks site -( lower case ‘w’ please dude! that was from our Marketing Guys) called The Well Tempered Search Application – Prelude. Fusion 1.1 has a lot of gaps to fill – I have trying to help our whizz kids realize that this is somewhat wheel-reinvention … I would be interested in your thoughts on my blog/rant because you are one of my heroes: a real dyed in the wool crusty curmudgeon if you will (that is meant as a compliment!)
Okay, I took away a couple of factoids from this email: Cursing is a Sillycon Valley convention. I live in rural Kentucky where there are Baptists and others who get frisky when curse words are tossed around the Speedy Mart. Another factoid is that LucidWorks (Really?) is a startup. But now to the big deal at LucidWorks (Really?): Lucidworks with a lower case “w.” I had to reach for my blood pressure medicine. A lower case “w”. Oy vay. LucidWorks (Really?) has hit upon a significant and brilliant move. A. Lower. Case. W. I have to take a couple of deep breaths.
I pointed out that a seven year old company is not a startup as much as the marketing “guys” want it to be. I then learned this from my correspondent:
Point taken what I meant was that we are still VC funded. We have undergone a lot of transformation in the last year so your criticisms are totally valid say up to 2013, but we are working hard to redress these as we speak. So stay tuned sir, hope that we can make a convert but to be clear, I am NOT a sales or marketing guy thank you very much. But whatever the case, I share your cynicism in general – I have been doing this for about 15 years now – so I have seen hype cycles like Big Data come and go – FWIW our earlier claims for Big Data were BS but the re-tooling that we are doing now will hopefully change your mind somewhat. [emphasis added]
Fascinating is the phrase “still VC funded.” In my mind this raises the question, “After seven years of trying to generate revenue, when will LucidWorks (Really?) start to fund itself, pay back its stakeholders, and generate sufficient surplus to invest in research to deal with the demons of Big Data?”
Maybe LucidWorks (Really?) should update its information in stories like this: “Trouble at LucidWorks: Lawsuits, Lost Deals, & Layoffs Plague the Search Startup Despite Funding.” Isn’t the Big Data drum becoming noise? See, for example, “The Promise of Big Data Still Looms, but Execution Lags.”
Looking back over seven years, LucidWorks (Really?) has an intriguing pattern of hiring people, engaging in litigation, getting more venture funding, and repositioning itself. How many repackagers of Lucene/Solr does the world’s appetite demand?
Based on my monograph about open source search, the winner in the keyword search solutions is Elasticsearch. In terms of venture funding, staff stability, and developer support—Elasticsearch is the winner in this game.
LucidWorks (Really?) will have to do more than tell me that it is not a start up after telling me it is a startup, flip-flopping its value proposition, making substantive changes like the use of a lower case “w”, and asking me to give the company a hunting license for my LinkedIn contacts.
In short, as the revenue pressure mounts, I look forward to more amusing antics. I particularly like the slang phrase “We are after all a startup for chrissakes!”
No, dear LucidWorks (Really?), you are not a start up and you are not a player in the next generation information access market. If I were more like my old Halliburton/Booz Allen self, I would try to sell a briefing to your venture funding outfits. Now it is not my problem.
Enjoy your meetings to review your lower case “w” quarterly revenues. And, please, do not tell me that you cannot afford my CyberOSINT: Next Generation Information Access study. That’s okay. I cannot afford a McLaren P1. No one cares, including me. I prefer products that work, really.
Stephen E Arnold, February 2, 2015
January 31, 2015
I kid you not. I received a spam mail from an outfit called BA Insight. The spam was a newsletter published every three months. You know that regular flows of news are what ring Google’s PageRank chimes, right?
Here’s the missive:
The lead item is an invitation to:
Unstructured content – email, video, instant messages, documents and other formats accounts for 90% of all digital information.
View the IDC Infographic:
Unlock the Hidden Value of Information.
With my fully protected computer, I boldly clicked on the link. I don’t worry too much about keyword search vendors’ malware, but prudence is a habit my now deceased grandma drummed into me.
Here’s what greeted me:
Yep, a giant infographic cartoon stuffed with assertions and a meaningless chunk of jargon: knowledge quotient. Give me cyber OSINT any day.
The concept presented in this fascinating marketing play is that unstructured content has value waiting to be delivered. I learned:
This content is locked in variety locations [sic] and applications made up of separate repositories that don’t talk to each other—e.g., EMC Documentum, Salesforce.com, Google Drive, SharePoint, et al.
Now it looks to me as if the word “of” has been omitted between “variety” and “locations.” I also think that EMC Documentum has a new name. Oh, well. Let’s move on.
The key point in the cartoon is that “some organizations can and do unlock information’s hidden value. Organizations with a high knowledge quotient.”
I thought I addressed this silly phrase in this write up.
Let me be clear. IDC is the outfit that sold my information on Amazon without my permission. More embarrassing to me was the fact that the work was attributed to a fellow named Dave Schubmehl, who is one of the, if not the premier, IDC search expert. Scary I believe. Frightful.
What’s the point?
The world of information access has leapfrogged outfits like BA Insight and “experts” like IDC’s pride of pontificators.
The future of information access is automated collection, analysis, and reporting. You can learn about this new world in CyberOSINT: Next Generation Information Access. No cartoons but plenty of screenshots that show what the outputs of NGIA systems deliver to users who need to reduce risk and make decisions of considerable importance and time sensitivity.
In the meantime, if you want cartoons, flip through the New Yorker. More intelligent fare I would suggest.
How do you become a knowledge quotient leader? In my opinion, not by licensing a keyword search system or buying information from an outfit that surfs on my research. Just a thought.
Stephen E Arnold, January 31, 2015
January 30, 2015
After the disappointing and somewhat negative IBM financial reports, the Watson PR machine has lurched into action. Watson, as you may know, is the next big thing in content processing. Lucene plus home brew code converts search into an artificial intelligence powerhouse. Well, that’s what the Watson cheerleaders want me to believe. I wonder if cheerleading correlates with making sales of more than $1 billion in the next quarter or two or three or four or five.
I read two news items. One is indicative of the use of Watson on a bounded content set, not the big, wide, wonderful world of real time data flows. The other is somewhat troubling but not particularly surprising.
IBM Watson is now a lawyer. Navigate to “Meet Ross, the IBM Watson-Powered Lawyer.” The idea is that systems from LexisNexis and Thomson Reuters are not what lawyers or the thrifty legal searcher wants. Nope, Watson converts to a lawyer more easily than a graduate of a third tier law school chases accident victims. According to the write up:
University of Toronto team launches a cognitive computing application that helps lawyers conduct world-class case research.
If I understand the write up, Watson is a search system equipped with the magical powers that allowed the machine and software to win a TV game show. Is post production allowed in the court room? I know that post plays a part in prime time TV. Just asking.
A couple of thoughts. The current line up of legal research systems is struggling to keep revenues and make profits. The reason for the squeeze is that law firms are having some difficulty returning to the salad days of the Ling-Temco-Vought era. Lawyers are getting fired. Lawyers are suing law schools with allegations of false advertising about the employment picture for the newly minted JDs. Lawyers are becoming human resource, public relations, and school counselors. Others are just quitting. I know one Duke Law lawyer who has worked at several of the world’s most highly regarded law firms. Know what the Duke Law degree is doing for money? Running a health club. Interesting development for those embarking on a law degree.
Will Watson generate significant revenue and a profit from its legal research prowess? The answer, in my opinion, is, “No.” What is going to happen is that Watson’s usefulness on a bounded set of legal content can be compared to the outputs from the smart system offered by Thomson Reuters and the decidedly less smart system from LexisNexis. For an academic, this comparison will be an endless source of reputational zoomitude. For the person needing legal advice, hire an attorney. These folks advertise on TV now and offer 24×7 hotlines and toll free numbers.
The second item casts a shadow over my skeptical and extremely tiny intellectual capability. Navigate to “This Medical Supercomputer Isn’t a Pacemaker, IBM Tells Congress.” Excluding classified and closed hearings about next generation intelligence systems, this may be the first time a Lucene recycler is pitching Congress about search and retrieval. The write up says:
The effort to protect decision support tools like Watson from Food and Drug Administration regulation is part of a proposal by the Republican chairman of the House Energy and Commerce Committee, Michigan’s Fred Upton. Called the 21st Century Cures initiative, it’s a major overhaul in the pharmaceutical and medical-device world, and the possibility of its passage is boosted by Republican control of both chambers of Congress. Upton’s bill would give the FDA two years to come up with a verification process for what it calls “medical software.” Such programs wouldn’t require the strict approval process faced by makers of medical devices like heart stents. Another set of products defined as “health software” wouldn’t require FDA oversight at all.
I think an infusion of US government money will provide some revenue to the game show winner. Go for it. Remember I used to work at Halliburton Nuclear and Booz, Allen & Hamilton. But in terms of utility I think that if the Golden Fleece Award were still around, Watson might get a quick look by the 20 somethings filtering the government funding of interesting projects.
Net net: Watson is going to have to vie with HP Autonomy for the billions in revenue from their content processing technologies. Perhaps IBM should take a closer look at i2 and Cybertap? Those IBM owned content processing systems may deliver more value than the keyword centric, super smart Watson system. Just a suggestion from rural Kentucky.
The gray side of the cloud is that IBM may actually get government money. Will Watson bond with Mr. Obama’s health programs? That is an exciting notion.
Stephen E Arnold, January 30, 2015