Free Academic Journals? Maybe
March 10, 2016
I read “This Renowned Mathematician Is Bent On Proving Academic Journals Can Cost Nothing.” If you are not an academic, you may not know that some folks pay the publisher to publish their research report, journal article, or wild and crazy summary of non reproducible results.
Good business?
You betcha. I remember a meeting a decade ago at the Cornell Theory Center. I asked if a faculty member who published in an online journal would be recognized for the work. The answer, not surprisingly, was, “No.” Flash forward to today. Many institutions like the estimable University of Louisville prefer their wizards’ write ups to be in prestigious paper journals. Sure, maybe a short item in the Harvard Business School blog will get some blue or green stars. The gold ones, from what I have heard, go to the expensive, paper journals like those from the ever savvy Elsevier outfit.
The write up states:
Despite a decades-old “open access” movement — which aims to put research findings in the public domain instead of languishing behind expensive pay walls — the traditional approach to publishing remains firmly entrenched.
The Cambridge math whiz is launching Discrete Analysis. Sorry, no snaps of the new Bugatti Chiron or Maserati SUV.
The write up points out some of the realities of academic publishing. The arguments are somewhat tired. I highlighted this passage:
So far, these alternative ventures have had little success dismantling the knowledge fiefdoms like Elsevier. The ArXiv (which launched in 1991) and open-access publishers like PLoS (established in 2000) still haven’t displaced traditional journals. But maybe, as more and more mini ventures chip away at the incumbent publishers, the revolution will take shape.
The fellow leading the charge for no cost or low cost academic publishing may find the task more difficult than tackling one of Hilbert’s unsolved problems.
Stephen E Arnold, March 10, 2016
Microsoft Delve Described as Tainted
March 10, 2016
I read “Microsoft Delve Faces Challenges in Enterprise Search Role.” Seemed like old news to me. Fast Search never seemed to be in sync with what Fast marketers said the system could do.
In this write up, there is a darned remarkable statement. Here’s the quote that goes right into my “Did Someone Really Say This?” folder:
Delve is already a tainted product…
Gasp. Microsoft bought Fast Search & Transfer in 2008 for $1.2 billion. After the deal closed, the president of Fast Search found himself on the wrong end of Norwegian law. Microsoft killed the Unix version of Fast Search and seemed to be committed to making good on the promises Fast Search marketers offered. Check out the pre-sale presentation to CERN for the “future according to Fast Search.”
SharePoint search, the cloud thing, and Bing—Is Microsoft focused on enterprise search or any search application?
Any way that is quite a statement about Delve. Tainted ain’t a positive word.
Stephen E Arnold, March 10, 2016
Organized Cybercrime Continues to Evolve
March 10, 2016
In any kind of organized crime, operations take place on multiple levels and cybercrime is no different. A recent article from Security Intelligence, Dark Web Suppliers and Organized Cybercrime Gigs, describes the hierarchy and how the visibility of top-level Cybercrime-as-a-Service (CaaS) has evolved with heightened scrutiny from law enforcement. As recently as a decade ago, expert CaaS vendors were visible on forums and underground boards. Now they show up only on forums and community sites typically closed to newcomers, and their role involves more expertise and less information sharing and accomplice-gathering. The article describes their niche:
“Some of the most popular CaaS commodities in the exclusive parts of the Dark Web are the services of expert webinjection writers who supply their skills to banking Trojan operators.
Webinjections are code snippets that financial malware can force into otherwise legitimate Web pages by hooking the Internet browser. Once a browser has been compromised by the malware, attackers can use these injections to modify what infected users see on their bank’s pages or insert additional data input fields into legitimate login pages in order to steal information or mislead unsuspecting users.”
The cybercrime arena shows one set of organized crime professionals, preying on individuals and organizations while simultaneously being sought out by organized cyber security professionals and law enforcement. It will be most interesting to see how collisions and interactions between these two groups will play out — and how that shapes the organization of their rings.
Megan Feil, March 10, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Germany Launches Malware to Spy on Suspicious Citizens
March 10, 2016
The article titled German Government to Use Trojan Spyware to Monitor Citizens on DW explains the recent steps taken in Germany to utilize Trojans, software programs created to sneak into someone else’s computer. Typically they are used by hackers to gain access to someone’s data and steal valuable information. The article states,
“The approval will help officials get access to the suspect’s personal computer, laptop and smartphone. Once the spyware installs itself on the suspect’s device, it can skim data on the computer’s hard drive and monitor ongoing chats and conversations. Members of the Green party protested the launching of the Trojan, with the party’s deputy head Konstantin von Notz saying, ‘We do understand the needs of security officials, but still, in a country under the rule of law, the means don’t justify the end.’”
Exactly whom the German government wants to monitor is not discussed in the article, but obviously there is growing animosity towards not only the Syrian refugees but also all people of Middle Eastern descent. Some of this hostility is based in facts and targeted, but the growing prejudice towards innocent people who share nothing but history with terrorists is obviously cause for concern in Germany, Europe, and the United States as well. One can only imagine how President Trump might cavalierly employ malware to spy on an entire population that he has already stated his distrust of in the most general terms.
Chelsea Kerwin, March 10, 2016
Watson Weakly: Jargon and Resource Allocations
March 9, 2016
In case you missed the news, IBM seems to be trimming its workforce. Does anyone remember Robert X. Cringely’s “IBM Is So Screwed?” I do. I would wager that Mr. Cringely remembers IBM’s suggestion that Mr. Cringely was off base with his analysis.
Perhaps Mr. Cringely is vindicated. I read “IBM Job Cuts: US Tech Giant Begins Mass Firing One Third of Workforce.” Hmmm. One third of a workforce having an opportunity to find its future elsewhere? That sounds like a swell way to greet spring 2016. March in like a lion and march out like a lamb. Is the lamb heading to the local meat packers?
Against this cheerful seasonal background, I want to mention “Moving from Enterprise Search to Cognitive Exploration.” This is a recycling of an earlier white paper for which one must register in order to read or download the document. Please, note that you will have to jump through some hoops to get this March 2016 publication. Do not complain to me about the link, the involvement of a middleman, and the need to provide details about your interest in enterprise search. Take it up with IBM; that is, if someone will take your call or answer your email. Hey, good luck with that.
What’s notable about this white paper is this word pair: Cognitive Exploration. Original? Nah. The phrase turns up in the title of a collection of essays called Cognitive Exploration of Language and Linguistics in 1999. The phrase is some of the jingoism from the super reliable psychology linguistics disciplines. IBM has dallied with the phrase for a number of years but in the RA world, the phrase is getting a jump start. An example of IBM’s argument is that no one any longer runs a search across a customer service database. Nope, one cognitively explores that customer database.
Cognitive Exploration. It flows trippingly on the tongue, does it not? IBM does not fire people; IBM RA’s them. (RA. Resource allocation or termination or reduction in force.)
What is Cognitive Exploration? Well, it is Lucene search plus some home brew code and a dollop of acquired technology. IBM’s original commercial enterprise search system (STAIRS) is just not up to the task of cognitively exploring one’s information assets it seems.
The white paper is a tribute to the search buzzwords that have been used by marketers in the past. I just love Cognitive Exploration.
What is it? For the full answer, you will need to read the 13 pages of explanation. Here’s a sampling of the facts in the write up:
Analysts expect the total data created and copied to reach 44 ZB by the year 2020 (Analyst firm IDC). After all, there are more than 204,000,000 emails launched every minute every day (Mashable.com). How do you manage, search, and process that data and turn it into usable information?
Yep, that’s a lot of information. How is an organization going to deal with “all” those zeros and ones? I suppose I would begin by using a system designed to manipulate large data flows. How about Palantir, BAE Systems, Leidos for starters. What no IBM? Bummer.
The IBM argument advances:
To meet today’s expectations, a search system must be able to access all of your important data sources and filter results based on a user’s access permissions within the organization.
I love the “all”. IBM obviously has nailed video, audio, binaries of various types, disparate file types, and dynamic content flows from intercepts, social media, and interesting sources from the Dark Web. I love “all” type solutions. Too bad these are science fiction based on my experience.
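The permissions half of IBM’s sentence, sometimes called security trimming, is at least easy to sketch. What follows is a minimal illustration, assuming each hit carries a set of groups allowed to read it. The function `visible_results`, the field name `allowed_groups`, and the sample documents are hypothetical and are not taken from the white paper or from any IBM product.

```python
def visible_results(hits, user_groups):
    """Keep only the hits that at least one of the user's groups may read."""
    groups = set(user_groups)
    # A hit survives when the user's groups and the hit's allowed groups overlap.
    return [hit for hit in hits if groups & hit["allowed_groups"]]


# Hypothetical result list from some upstream search step.
hits = [
    {"title": "Cafeteria menu", "allowed_groups": {"everyone"}},
    {"title": "Layoff plan", "allowed_groups": {"hr", "executives"}},
    {"title": "Sales forecast", "allowed_groups": {"sales", "executives"}},
]

for hit in visible_results(hits, ["everyone", "sales"]):
    print(hit["title"])
```

A user in the “everyone” and “sales” groups sees the menu and the forecast but not the layoff plan. The sketch says nothing about the “all” part, which is where the cost and the pain live.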
The fix is Cognitive Exploration. Thank you, IBM. A new buzzword to explain what search and retrieval has flubbed for, what, 50 years? IBM explains:
Cognitive exploration is the combination of search, content analytics, and cognitive computing. Not only can cognitive exploration accelerate the rate at which users can find and navigate information; by leveraging advanced technologies such as content analytics, machine learning, and reasoning it has the potential to augment human expertise.
I don’t want to be a party pooper, but this is perilously close to Palantir’s “augmented intelligence” jargon. Attivio, BA Insight, and even the French folks at Sinequa use similar lingo. Me-too’ism at its finest? Nah, this is IBM, the outfit taking Groupon (a discount coupon business) to court for allegedly infringing on Prodigy patents. Prodigy? Remember that online service?
After snoozing through the white paper’s three pillars of Cognitive Exploration, I raced to the finish line.
Cognitive Exploration involves the i2 type of relationship analysis, some good old fashioned cuddling between search and cognitive computing (think Watson, gentle reader), and a unified view or what a popular novelist calls “God’s eye” view. Please note that IBM offers some examples, but gets the numbering wrong. Where is number one? Watson, Watson, can you assist me? Guess not. IBM’s cognitive exploration essay begins counting with number 2. I am okay with zero. I am okay with one. But I am not okay with an enumerated list beginning with the number two. Careless typo? Indifference? Rushing to the RA meeting? Don’t know. Cognitive Watson counts two, three, four, not one, two, three.
At the end of this remarkable description of Cognitive Exploration I learned:
The cognitive capabilities that can be leveraged by Watson Explorer are provided by the IBM Watson platform.
Isn’t this a recycling of some of the early 1990s marketing material from i2 Group Limited, which IBM bought? Isn’t this lingo influenced by Palantir’s explanations of its Gotham platform?
Omitted from the “all,” I assume, is the seamless interchange of Gotham files with i2 Analyst Notebook and i2 Analyst Notebook with Gotham. The users and customers have to learn that “all,” like Mr. Clinton’s “is,” may not be exactly congruent with one’s understanding of “federation” and “unified.”
Enough already. Go for the close:
IBM Watson Explorer unlocks the value within your data, utilizing that information to help employees make well-informed decisions, provide better support, and identify more customers and business opportunities. By reaching across multiple silos of information within your enterprise, search results will include information never previously integrated into single solutions. Users will benefit from search results from all the data in your company, structured and unstructured, and include data from outside as well. Rather than trying to make good decisions with limited insight, cognitive exploration users can now extract and understand all of the valuable information at their fingertips.
With such a wonderful tool at IBM’s disposal, why is IBM’s management unable to generate revenues? Perhaps the silliness of the marketing explanation of Cognitive Exploration does not deliver the results that someone at IBM obviously believes it does.
I am stuck on that error in numbering, the recycling of Palantir’s marketing lingo, and the somewhat silly phrase “Cognitive Exploration.”
I won’t sail my Nina, Pinta, and Santa Maria to that digital shore. I will use Google Earth and tools which I know sort of work.
Stephen E Arnold, March 9, 2016
Enterprise Search Morphs
March 9, 2016
I read “One Size Doesn’t Fit All with Enterprise Search.” The problem for me is that the article does not discuss enterprise search. Sure, there are buzzwords like knowledge discovery, but the focus is on a quite specific type of search and retrieval application: Customer service.
The idea behind search as a substitute for a human who knows a product is simple. Think money, headcount, and personnel hassles or churn in the parlance of the customer support world. Let software do a thankless job and move on with sales. That support thing? Hey, let the customer find the answer.
The focus of the write up is on what is called the “self service customer.” The person or persona in the write up has a couple of alter egos; namely, a call center agent and a call center analyst.
What this has to do with enterprise search baffles me. No wonder vendors of basic search and retrieval are struggling to close deals. Instead of describing a specific use case and what systems and methods are needed to deflect the customer yet keep ‘em buying, the once useful phrase “enterprise search” is further devalued.
Why not do what IBM has done and invent a new phrase for an enterprise solution which few love and many prefer to view as a utility and a commodity tool? Cognitive exploration, anyone?
Stephen E Arnold, March 9, 2016
Enterprise Search Revisionism: Can One Change What Happened
March 9, 2016
I read “The Search Continues: A History of Search’s Unsatisfactory Progress.” I noted some points which, in my opinion, underscore why enterprise search has been problematic and why the menagerie of experts and marketers have put search and retrieval on the path to enterprise irrelevance. The word that came to mind when I read the article was “revisionism” for the millennials among us.
The write up ignores the fact that enterprise search dates back to the early 1970s. One can argue that IBM’s Storage and Information Retrieval System (STAIRS) was the first significant enterprise search system. The point is that enterprise search as a productized service has a history of over promising and under delivering of more than 40 years.
Enterprise search with a touch of Stalinist revisionism.
Customers said they wanted to “find” information. What those individuals meant was have access to information that provided the relevant facts, documents, and data needed to deal with a problem.
Because providing on point information was and remains a very, very difficult problem, the vendors interpreted “find” to mean a list of indexed documents that contained the users’ search terms. But there was a problem. Users were not skilled in crafting queries which were essentially computer instructions between words the index actually contained.
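Here is a minimal sketch of what that vendor version of “find” boils down to, assuming a toy in-memory corpus. The names `build_index` and `search` and the three sample documents are invented for illustration; this is not code from STAIRS or from any vendor’s product.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every query term (Boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample corpus.
docs = {
    1: "pump failure report for plant seven",
    2: "quarterly sales report",
    3: "maintenance schedule for the cooling pump",
}
index = build_index(docs)

print(search(index, "pump report"))  # document 1 contains both terms
print(search(index, "broken pump"))  # empty: no document contains "broken"
```

The second query is the whole problem in miniature. The user wants the pump failure report, types “broken pump,” and gets nothing back because the index holds “failure,” not “broken.”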
After STAIRS came other systems, many other systems which have been documented reasonably well in Bourne and Bellardo-Hahn’s A History of Online Information Services 1963-1976. (The period prior to 1970 describes for-fee research centric online systems. STAIRS was among the most well known early enterprise information retrieval systems.) I provided some history in the first three editions of the Enterprise Search Report, published from 2003 to 2007. I have continued to document enterprise search in the Xenky profiles and in this blog.
The history makes painful reading for those who invested in many search and retrieval companies and for the executives who experienced the crushing of their dreams and sometimes career under the buzz saw of reality.
In a nutshell, enterprise search vendors heard what prospects, workers overwhelmed with digital and print information, and unhappy users of those early systems were saying.
The disconnect was that enterprise search vendors parroted back marketing pitches that assured enterprise procurement teams of these functions:
- Easy to use
- “All” information instantly available
- Answers to business questions
- Faster decision making
- Access to the organization’s knowledge.
The result was a steady stream of enterprise search product launches. Some of these, like Verity, were funded by US government money. Sure, the company struggled with the cost of infrastructure the Verity system required. The work arounds were okay as long as the infrastructure could keep pace with the new and changed word-centric documents. Tossing in other types of digital information, making the system perform ever faster indexing, and keeping the Verity system responding quickly was another kettle of fish.
Research oriented information retrieval experts looked at the Verity type system and concluded, “We can do more. We can use better algorithms. We can use smart software to eliminate some of the costs and indexing delays. We can [ fill in the blank ].”
The cycle of describing what an enterprise search system could actually deliver was disconnected from the promises the vendors made. As one moves through the decades from 1973 to the present, the failures of search vendors made it clear that:
- Companies and government agencies would buy a system, discover it did not do the job users needed, and buy another system.
- New search vendors picked up the methods taught at Cornell, Stanford, and other search-centric research centers and wrapped on additional functions like semantics. The core of most modern enterprise search systems is unchanged from what STAIRS implemented.
- Search vendors like Convera came, failed, and went away. Some hit revenue ceilings and sold to larger companies looking for a search utility. The acquisitions hit a high water mark with the sale of Autonomy (a 1990s system) to HP for $11 billion.
What about Oracle, as a representative outfit? Oracle database has included search as a core system function since the day Larry Ellison envisioned becoming a big dog in enterprise software. The search language was Oracle’s version of the structured query language. But people found that difficult to use. Oracle purchased Artificial Linguistics in order to make finding information more intuitive. Oracle continued to try to crack the find information problem through the acquisitions of Triple Hop, its in-house Secure Enterprise Search, and some other odds and ends until it bought in rapid succession InQuira (a company formed from the failure of two search vendors), RightNow (technology from a Dutch outfit RightNow acquired), and Endeca. Where is search at Oracle today? Essentially search is a utility and it is available in Oracle applications: customer support, ecommerce, and business intelligence. In short, search has shifted from the “solution” to a component used to get started with an application that allows the user to find the answer to business questions.
I mention the Oracle story because it illustrates the consistent pattern of companies which are actually trying to deliver information that the user of a search system needs to answer a business or technical question.
I don’t want to highlight the inaccuracies of “The Search Continues.” Instead I want to point out the problem buzzwords create when trying to understand why search has consistently been a problem and why today’s most promising solutions may relegate search to a permanent role of necessary evil.
In the write up, the notions of answering questions, analytics, federation (that is, running a single query across multiple collections of content and file types), the cloud, and system performance form the conclusion.
Wrong.
The use of open source search systems means that good enough is the foundation of many modern systems. Palantir-type outfits, essentially enterprise search vendors describing themselves as “intelligence” providing systems, use open source technology in order to reduce costs and shift bug chasing to a community. The good enough core is wrapped with subsystems that deal with the pesky problems of video, audio, data streams from sensors or similar sources. Attivio, formed by professionals who worked at the infamous Fast Search & Transfer company, delivers active intelligence but uses open source to handle the STAIRS-type functions. These companies have figured out that open source search is a good foundation. Available resources can be invested in visualizations, generating reports instead of results lists, and graphical interfaces which involve the user in performing tasks smart software at this time cannot perform.
For a low cost enterprise search system, one can download Lucene, Solr, SphinxSearch, or any one of a number of open source systems. There are low cost (keep in mind that costs of search can be tricky to nail down) appliances from vendors like Maxxcat and Thunderstone. One can make do with the craziness of the search included with Microsoft SharePoint.
For a serious application, enterprises have many choices. Some of these are highly specialized like BAE NetReveal and Palantir Metropolitan. Others are more generic like the Elastic offering. Some are free like the Effective File Search system.
The point is that enterprise search is not what users wanted in the 1970s when IBM pitched the mainframe centric STAIRS system, in the 1980s when Verity pitched its system, in the 1990s when Excalibur (later Convera) sold its system, in the 2000s when Fast Search shifted from Web search to enterprise search and put the company on the road to improper financial behavior, and in the efflorescence of search sell offs (Dassault bought Exalead, IBM bought iPhrase and other search vendors, and Lexmark bought Brainware and ISYS Search Software).
Where are we today?
Users still want on point information. The solutions on offer today are application and use case centric, not the silly one-size-fits-all approach of the period from 2001 to 2011 when Autonomy sold to HP.
Open source search has helped create an opportunity for vendors to deliver information access in interesting ways. There are cloud solutions. There are open source solutions. There are small company solutions. There are more ways to find information than at any other time in the history of search as I know it.
Unfortunately, the same problems remain. These are:
- As the volume of digital information goes up, so does the cost of indexing and accessing the sources in the corpus
- Multimedia remains a significant challenge for which there is no particularly good solution
- Federation of content requires considerable investment in data grooming and normalizing
- Multi-lingual corpuses require humans to deal with certain synonyms and entity names
- Graphical interfaces still are stupid and need more intelligence behind the icons and links
- Visualizations have to be “accurate” because a bad decision can have significant real world consequences
- Intelligent systems are creeping forward but crazy Watson-like marketing raises expectations and erodes the credibility of enterprise search’s capabilities.
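The federation item in the list above can be sketched in a few lines, assuming two hypothetical collections that store the same kind of record under different field names. Everything here (the collections, the field maps, `normalize`, and `federated_search`) is invented for illustration and stands in for no particular product.

```python
def normalize(record, field_map):
    """Rename source-specific fields to a shared schema (title, body)."""
    return {common: record[source] for common, source in field_map.items()}


def federated_search(query, collections):
    """Run one query against every collection and merge the normalized hits."""
    needle = query.lower()
    hits = []
    for name, (records, field_map) in collections.items():
        for record in records:
            doc = normalize(record, field_map)
            if needle in doc["title"].lower() or needle in doc["body"].lower():
                hits.append({"source": name, **doc})
    return hits


# Two hypothetical sources with different field names for the same ideas.
crm = [{"subject": "Pump warranty claim", "notes": "Customer reports leak"}]
wiki = [
    {"heading": "Pump maintenance", "text": "Replace seals yearly"},
    {"heading": "Holiday schedule", "text": "Office closed Friday"},
]

collections = {
    "crm": (crm, {"title": "subject", "body": "notes"}),
    "wiki": (wiki, {"title": "heading", "body": "text"}),
}

results = federated_search("pump", collections)
for hit in results:
    print(hit["source"], "|", hit["title"])
```

The query loop is trivial. The field maps are not: someone has to write and maintain one for every source, and that grooming and normalizing work is where the investment goes.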
I am okay with history. I am not okay with analyses that ignore some very real and painful lessons. I sure would like some of the experts today to know a bit more about the facts behind the implosions of Convera, Delphis, Entopia, and many other companies.
I also would like investors in search start ups to know a bit more about the risks associated with search and content processing.
In short, for a history of search, one needs more than 900 words mixing up what happened with what is.
Stephen E Arnold, March 9, 2016
Celebros Launches Natural Language Processing Ecommerce Extension with Seven Conversions
March 9, 2016
An e-commerce site search company, Celebros, shared a news release touting their new product. Celebros, First to Launch Natural Language Site Search Extension for Magento 2.0 announces their Semantic Site Search extension for Magento 2.0. Magento 2.0 boasts the largest marketplace of e-commerce extensions in the world. This product, along with other Magento extensions, is designed to help online merchants expand their marketing and e-commerce capabilities. Celebros CMO and President of Global Sales Jeffrey Tower states,
“Celebros is proud to add the new Magento 2 extension to our existing and very successful Magento 1 extension. Celebros will offer the new extension free of charge to our entire Magento client base to ensure an easy, fast and pain-free upgrade while providing free integrations to new Celebros clients world-wide. The new extension encompasses our Natural Language Site Search in seven languages along with eight additional features that include our advanced auto-complete, guided navigation, dynamic landing pages and merchandising engine, product recommendations and more.”
For online retailers, extension products like Celebros may make or break the platforms like Magento 2.0, as these products are what add value and drive e-commerce technologies forward. It is intriguing that the Celebros natural language processing technology offers conversions available in seven languages. We live in an increasingly globalized world.
Megan Feil, March 9, 2016
Facebook Exploits Dark Web to Avoid Local Censorship
March 9, 2016
The article on Nextgov titled Facebook Is Giving Users a New Way to Access It On the ‘Dark Web’ discusses the lesser-known services of the dark web such as user privacy. Facebook began taking advantage of the dark web in 2014, when it created a Tor address (recognizable through the .onion ending). The article explains the perks of this for global Facebook users,
“Facebook’s Tor site is one way for people to access their accounts when the regular Facebook site is blocked by governments—such as when Bangladesh cut off access to Facebook, its Messenger and Whatsapp chat platforms, and messaging app Viber for about three weeks in November 2015. As the ban took effect, the overall number of Tor users in Bangladesh spiked by about 10 times, to more than 20,000 a day. When the ban was lifted, the number dropped.”
Facebook has encountered its fair share of hostility from international governments, particularly Russia. Russia has a long history of censorship, and has even blocked Wikipedia in the past, among other sites. But even if a site is not blocked, governments can still prevent full access through filtering of domain names and even specific keywords. The Tor option can certainly help global users access their Facebook accounts, but however else they use Tor is not publicly known, and Facebook’s lips are sealed.
Chelsea Kerwin, March 9, 2016
Google, Contents, and Original Video
March 8, 2016
I read “Moon Shot: Google Teams with J.J. Abrams and XPrize for Space Documentary Series.” The nine part series will premiere on YouTube and Google Play this month. The idea of the original programming from a search vendor makes sense if you buy into Cnet’s inclusion of fiction in its content stream. I have a tough time figuring out who sponsors what YouTube video and which technical write up in Cnet is marketing fiction already.
Confusing, right? Alternatively, who cares? Well, that’s why there is a sales oriented documentary. Google wants to sell tickets for a rocket ride. I learned about the J.J. Abrams’ confection:
Fittingly, it is about the Google Lunar XPrize, the competition that Google and the XPrize Foundation started back in 2007, which promises $30 million to the first team able to land a privately funded robotic rover on the moon and drive it around — making history in the process (until now, only a few government space agencies have managed to put rovers on the moon). The new documentary looks appropriately cosmic and stirring in its scope, profiling the dedicated dreamers and entrepreneurs of all 16 remaining teams in the competition as they inch closer to the goal (so far, only two teams have booked tickets on spacecraft actually set to launch soon, both in 2017).
I thought virtual reality would eliminate the boundary between what’s made up and what’s here and now. Wrong am I again.
Stephen E Arnold, March 8, 2016