
Avast: Pirate Libraries

July 26, 2016

They are called “pirate libraries,” but one would be better served envisioning Robin Hood than Blackbeard. Atlas Obscura takes a look at these flouters of scientific-journal copyrights in “The Rise of Pirate Libraries.” These are not physical libraries, but virtual ones, where researchers and other curious folks can study articles otherwise accessible only through expensive scientific journal paywalls. Reporter Sarah Laskow writes:

“The creators of these repositories are a small group who try to keep a low profile, since distributing copyrighted material in this way is illegal. Many of them are academics. The largest pirate libraries have come from Russia’s cultural orbit, but the documents they collect are used by people around the world, in countries both wealthy and poor. Pirate libraries have become so popular that in 2015, Elsevier, one of the largest academic publishers in America, went to court to try to shut down two of the most popular, Sci-Hub and Library Genesis.

“These libraries, Elsevier alleged, cost the company millions of dollars in lost profits. But the people who run and support pirate libraries argue that they’re filling a market gap, providing access to information to researchers around the world who wouldn’t have the resources to obtain these materials any other way.”

The development of these illicit repositories traces back to Russia and its surrounds, where academics had a long history of secretly sharing documents under the repressive Soviet Union.  In the 1990s, this tradition began to move online; one of the first pirate-library websites was Lib.Ru. Since then, illegally shared knowledge from more parts of the world has been made available, particularly from Western publishers and universities. Furthermore, the speed with which materials make it online has increased considerably.

Which is more worthy: protecting the stranglehold academic journals have managed to legally establish, and profit from, on research and other information? Or allowing people who possess great curiosity, but who lack deep pockets, to access the latest research? The scholarly pirates have made their choice.

 

 

Cynthia Murrell, July 26, 2016

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

There is a Louisville, Kentucky Hidden Web/Dark Web meet up on July 26, 2016. Information is at this link: http://bit.ly/29tVKpx.

 

Sinequa: Now Seeking US Partners

July 25, 2016

Sinequa, a French search vendor, is hunting for partners in the US. The news appears in “Sinequa Partner Advantage Program Empowers the Channel to Capitalize on Leading Cognitive Search & Analytics Technology.” If you liked the title of this article, you will love the subtitle:

Company Launches New Partner Program to Drive Cross-Industry Adoption of Cognitive Search & Analytics and Address Growing Customer Demands

Keywords galore. What I noted was the euphony of “leading cognitive search and analytics technology.” A number of outfits are chasing the “cognitive search” pot of gold. Competitors include the champion in declining quarterly revenue, IBM. Then there are the assorted machine learning folks at the Alphabet Google thing. Plus there are various and sundry deep learning initiatives appearing on a daily basis from the money crucible in Sillycon Valley; for example, Indico, MetaMind, Ripjar, Synapsify, and, my favorite, Idibon. I just love “idibon.” So many associations from ichibon to bon bon. Good, right?


Partners flock like Zika bearing mosquitoes when there is big money in a reseller/OEM/integrator tie up.

I learned from the Sinequa write up about Sinequa:

Sinequa continues to grow its partnerships with leading global systems integrators and value-add resellers (VARs) as well vendors of enterprise application, cloud and Big Data. In an effort to address rising customer demands from Global Fortune 2000 organizations for turning data into actionable insights, Sinequa extends its worldwide network with partners seeking to enrich their Big Data/analytics offerings in key strategic markets such as banking, defense and security, life sciences, manufacturing, utilities and government. The Sinequa Partner Advantage Program enables channel and service partners to quickly capitalize on the high growth opportunity in cognitive search and analytics. Designed to empower partners with certification programs, technical support and world-class training, Sinequa also offers partners performance-based incentives and marketing support programs…Certified partners access the recently introduced Sinequa ES Version 10. Powered by Machine Learning capabilities at its core, this ground breaking version helps deliver deep analytics of contents and user behavior, offering information with continually improving relevance to users in their work environments.

A point I think is important: Sinequa was founded in 2002. That makes the company 14 years young. Not quite a start up but agile enough when it comes to cognitive technology.

I assume that in today’s economic environment, potential partners will be swarming like the Zika bearing mosquitoes in the river marsh near my home in Harrod’s Creek, Kentucky. These critters seem to fancy my chubby, 72 year old body.

I have noted, however, that some vendors of search are having to work extra hard to close deals. Examples range from Big Blue in Union Square to SLI Systems in New Zealand and parts in between.

The idea of partnering is a good one. Endeca rose to its legitimate $100 million plus in search revenue with its carefully crafted partnering program. On the other hand, the Google Search Appliance partners continue to regroup because the wiser minds at Mother Google killed off the pricey Google Search Appliance. I treasure my print out of the GSA schedule with the five and six digit license fees for the wonderful GB 7007 and 9009 models. Imagine a locked down appliance for the price of a pre acquisition Autonomy IDOL license. Then when the document capacity of the search appliance was reached, a customer could license more Google Search Appliances. I found this business model interesting because taxi meter pricing is often an issue for chief financial officers who want to budget for certain products and services.

The upside of partnerships is that, as Endeca learned, unusual opportunities can be discovered. Once the deal is closed, the lucky partner has an opportunity to tailor the search system to meet the needs of the customer. Once up and running, life is good. Renewals, customization, consulting, maintenance fees, and other oddments make a search vendor’s life one of comfort and joy. The downsides include lawsuits, squabbles, and disruptions from competitors.

Worth watching how Sinequa maneuvers in the US market. Other French search vendors have found the costs and cultural issues a bit of a headache. Examples include Antidot, Pertimm, and Exalead, among others. Do you use Qwant?

Stephen E Arnold, July 25, 2016

Content Marketing about Bing Changing Lives

July 25, 2016

I love content marketing. Stories which contain a mixture of facts and other information are amusing. Consider “How the Power of Search Has Changed the Way We Live.” I use Bing. I also use Yandex, the Google thing, Unbubble, MillionShort (when it is online), Gibiru, and a number of other systems. No one search system duplicates the result sets of other systems. The write up blithely ignores this observation.

I learned that I could learn about search in a Microsoft white paper (yep, another content marketing thing) called “The Humanization of Search.” I assume Microsoft has abandoned its effort to co-opt the phrase “beyond search.” Nice try, folks.

You can download this write up from this link and watch a video. The write up is 18 pages of juicy fruit. I noted three statements:

  1. Voice queries are longer than text queries
  2. People ask questions when entering a search via voice
  3. Questions use who, what, how, when, and where structures.

Okay, take a moment to catch your breath.

Microsoft wants to be the big dog in voice search. I understand. The hitch in the git along is that the big dog seems to be cross town neighbor Amazon with its weird black speaker gizmo. Then there is the persistent problem of the Alphabet Google. Microsoft is in the game, but I don’t see the company pushing the Messis and Ronaldos of voice search to the second team for a while if ever.

As IBM shows, saying things is much easier than delivering results. I find the parallel between IBM Watson cognitive computing marketing and IBM’s performance stark evidence that talk does not generate sustainable revenues and rising profits. Microsoft may be dazzled by its white paper lingo, but the company has to demonstrate that its mismanagement of the mobile market is an exception, not the steady pulse of missing shots in front of the goal.

Read the white paper. Watch how the shift from search leads to marketing; for example:

As experiences across platforms become more prevalent, marketers need to familiarize themselves with emerging technology, as well as the massive growth opportunities that stem from search being more incorporated into everyday human life.

Confused? So was I.

Stephen E Arnold, July 25, 2016

DuckDuckGo: Filtering

July 22, 2016

I read “Is DuckDuckGo.com Partially Enforcing the ‘Celebrity Threesome Injunction’?” The point of the write up is that information is filtered from search systems, including the privacy-centric system DuckDuckGo.com. I assume the queries summarized in the write up are spot on. If accurate, one cannot search that which is not in an index. That’s helpful for those who want to be thorough. It is also helpful for those who find themselves the subject of write ups already published and want to keep the links out of a search system’s results page. With folks loving the mobile research experience, who would know? The more interesting question: “Does anyone care?” A good example is the “artist” whose work disappeared from the Alphabet Google thing’s Blogger system. See “Google Deletes Artist’s Blog and a Decade of His Work along with It.” Back ups are good if not filtered by a helpful cloud service. Where did my music go anyway?

Stephen E Arnold, July 22, 2016

Alphabet Google Is Busy Reinventing

July 22, 2016

From Forbes in India (“Sundar Pichai to Reinvent Google with a Heavy Dose of Artificial Intelligence” which may require a proxy maneuver due to the digitally with it Forbes) or Switzerland (“Google’s New Research Lab in Zurich Is Inventing the Future of Search”) — the Alphabet Google thing is trying to reinvent search.

There you go: Stark evidence that Google’s information retrieval system is deeply flawed. The electric car does not reinvent the car. But search has to reinvent search.

This is a big and probably futile job. My view is that search is an evolutionary beastie. Incremental innovations from research labs, one man band coders, and start ups with one good idea and a couple of crazed investors do the job.

Google itself was a roll up of ideas from IBM Almaden (hello, Jon Kleinberg), AltaVista (hello, Jeff Dean, Simon Tong, and Sanjay Ghemawat), and the fumble bumbles of folks at precursors (hello, AskJeeves and Lycos).

The India angle states:

Think of it as Search 3.0—a new, interactive way to communicate with Google itself. With it you’ll be able to order a ticket, book a flight, play music, schedule a task, reply to a message; the Google assistant might even write it for you. It might prompt you to order flowers ahead of Mother’s Day or to pack for your upcoming trip, and it might be able to pick up an earlier conversation from where you left off. In other words, it will be there, ready to help, in your phone, your speakers, your television, your car, your watch and eventually everywhere. “You are trying to go about your day, and in an ambient way, things are there to help you,” Pichai says. Making sure this assistant lives up to its full potential will take years, and building it will be harder than it was for Page and co-founder Sergey Brin to create search itself. Adds Pichai: “In every dimension, it is more ambitious.”

Yep, ambitious.

From the Swiss side:

The new team has a distinct goal: to invent the future of Search, a voice-activated, human-like entity that can answer any query intelligently. “We are building the ultimate assistant. In two years, you can expect Google to become a personal life assistant across multiple surfaces, including your phone, Google Home, even cars,” Mogenet [Google wizard] said. Some of Google’s best-known products are already shaped by machine learning, the ability of computers to spot patterns in large datasets and learn by example. For instance, Google Photos uses it to understand the content of an image. This means you could search for “cardigan corgi” or “passport” or “birthday celebrations 2014” and the app will bring up the relevant photos.

There you go. Reinvent.

The challenge is to find a way to avoid the stagnation which seems to befall certain types of high technology outfits. Do you use your DEC Rainbow today?

I love the Google. It is just super. The problem is that as it has concentrated traffic, it has left itself unable to respond to opportunities such as those identified by Facebook and Amazon. By the way, both of these outfits face some challenges as well.

The investment in search will benefit some folks. But how likely is it that Google will come up with an “innovation” that matters? I think that when octopus companies do something, whether it is good or bad, it is easy to define whatever happens as success.

The problem is that information returned from Google is often off point. When I run queries for documents I have in my hand, I cannot find them without jumping through hoops. I documented this with a Dark Web paper from Denmark in this blog. Homonyms give the Google fits. Even though my search history is available to Mother Google, the system is tone deaf for my queries. When I look for certain information, the data are often disappeared. I noticed that indexing of pastesites, PDF files, and PowerPoint presentations has become laughable.

Innovation is more than a public relations campaign. How do I know? Google’s marketing is starting to remind me of IBM Watson. You know Watson, the revolutionary information access system from Big Blue. Yep, innovation.

Stephen E Arnold, July 22, 2016

Amazon: Not the Corner Store? Big Insight

July 21, 2016

I love Amazon almost as much as I love Google. I would have a tough time deciding which of these services warrants more of my affection, trust, and respect. I said to myself “Bummer” when I read “Amazon’s Dominance Is Bad for Your Business.” I recently ordered a paperback from Amazon and noticed that the 150 page monograph was $1,000, not $10. Anyone could have clicked the incorrect link between the correctly priced volume and the used discounted books. Amazon respected my klutziness, and I think I got my money back after I sent the $1,000 paperback back to the outstanding merchant. This firm obviously valued its paperback more highly than the half dozen vendors selling the same paperback for $10. What more could one want? (One of my goslings asked me, “Why does Amazon list certain products at vastly inflated prices?” I don’t know. I love Amazon. Love is blind.)

The write up includes a quote allegedly generated by the world’s smartest person, Jeff Bezos; to wit:

“…Amazon should approach these small publishers the way a cheetah would pursue a sickly gazelle.”

I like that. Google’s meat eating dinosaur is, after all, dead unless the team solving death brings T Rex back to life. A cheetah is a here and now creature able to snag small, sickly, or inept prey with a batting average a major league player would covet.

The write up also states:

“Amazon has done a very good job with search and discovery on mobile,” BloomReach marketing chief Joelle Kaufman said. “They are capturing the lion’s share of mobile revenue. Consumers said they start on a cellphone and they use it as a research tool. But 81 percent want to buy on that laptop/desktop.”

Google, it seems, is an also ran in the shopping search sector. But what about Amazon’s competitors and merchants who do not want to sell their products via Amazon?

The answer is, according to the write up:

There are still a plethora of avenues to make sales through, and portals to gain consumer attention. Despite Amazon’s utter dominance in the U.S. e-retail market, you can still grow your business, and become highly successful along the way. Just remember the importance of content, social media, and a great attitude. If David had submitted to Goliath’s size before the battle had begun, he never would have realized his own strength and capabilities.

This sounds like Google Adwords, Snapchat, and YouTube videos to me? Those work really well for mom and pop merchants (at least for the small number remaining in the good, old USA), small businesses, and unfunded start ups.

Is what’s good for Amazon good for us or was it “What’s good for General Motors is good for the USA”? When will Amazon address the shortcomings I find in Amazon search? Maybe never. If it is not broken, why try to fix it? That’s why suggested prices are irrelevant in the Amazon jungle.

Stephen E Arnold, July 21, 2016

Coveo Wins a Stevie. Congrats Coveo. What Is a Stevie?

July 21, 2016

The article titled “Coveo Sweeps Early 2016 Awards Programs” on Coveo promotes some of the many honors and recognitions that the Coveo company and its apps have earned. Among these is the Gold Stevie Award they earned for Sales and Customer Service through Coveo Reveal. The article details the competition for this prestigious yet unknown award:

“More than 2,100 nominations from organizations of all sizes and in virtually every industry were evaluated in this year’s competition, an increase of 11% over 2015. Finalists were determined by the average scores of 115 professionals worldwide, acting as preliminary judges. More than 60 members of several specialized judging committees determined the Gold, Silver and Bronze Stevie Award placements from among the Finalists during final judging.”

Coveo Reveal is the first cloud-based, machine learning search platform for the enterprise. Its main users are customer service professionals, who are able to gain a stronger understanding of areas that can be improved in the overall search process. No surprise that it is winning awards, but we are unfamiliar with this Stevie recognition. According to the American Stevie Awards website, the award has been around since 2002 and is named Stevie, as in Stephen, after the Greek derivation: “crowned.”

 

Chelsea Kerwin, July 21, 2016


Scholarship Evolving with the Web

July 21, 2016

Is big data good only for the hard sciences, or does it have something to offer the humanities? Writer Marcus A Banks thinks it does, as he states in “Challenging the Print Paradigm: Web-Powered Scholarship is Set to Advance the Creation and Distribution of Research” at the Impact Blog (a project of the London School of Economics and Political Science). Banks suggests that data analysis can lead to a better understanding of, for example, how the perception of certain historical events has evolved over time. He goes on to explain what the literary community has to gain by moving forward:

“Despite my confidence in data mining I worry that our containers for scholarly works — ‘papers,’ ‘monographs’ — are anachronistic. When scholarship could only be expressed in print, on paper, these vessels made perfect sense. Today we have PDFs, which are surely a more efficient distribution mechanism than mailing print volumes to be placed onto library shelves. Nonetheless, PDFs reinforce the idea that scholarship must be portioned into discrete units, when the truth is that the best scholarship is sprawling, unbounded and mutable. The Web is flexible enough to facilitate this, in a way that print could never do. A print piece is necessarily reductive, while Web-oriented scholarship can be as capacious as required.

“To date, though, we still think in terms of print antecedents. This is not surprising, given that the Web is the merest of infants in historical terms. So we find that most advocacy surrounding open access publishing has been about increasing access to the PDFs of research articles. I am in complete support of this cause, especially when these articles report upon publicly or philanthropically funded research. Nonetheless, this feels narrow, quite modest. Text mining across a large swath of PDFs would yield useful insights, for sure. But this is not ‘data mining’ in the maximal sense of analyzing every aspect of a scholarly endeavor, even those that cannot easily be captured in print.”

Banks does note that a cautious approach to such fundamental change is warranted, citing the development of the data paper in 2011 as an example.  He also mentions Scholarly HTML, a project that hopes to evolve into a formal W3C standard, and the Content Mine, a project aiming to glean 100 million facts from published research papers. The sky is the limit, Banks indicates, when it comes to Web-powered scholarship.

 

Cynthia Murrell, July 21, 2016


Coveo Changes Its Positioning

July 20, 2016

Short honk: Coveo, the Canadian enterprise search outfit, has changed its positioning. I should probably say “added to” its positioning as an information retrieval vendor. “Montreal Opening for Big Data Search Firm Coveo” reports that the company has a new office in Montréal. What I noticed was the description of Coveo as a “big data search firm.” The company has been describing itself as a customer support solution and a vendor of unified search. But Big Data is a thing, so it makes sense that an information processing outfit would embrace the moniker. The write up reports that a Coveo wizard said:

We have an amazing pipeline of cloud solutions, and the integration of machine learning, artificial intelligence and data-driven personalization to our technology creates huge market opportunities. We believe Montreal is the best place for us to build on this momentum and assert our position as market leader.

The write up does not mention if any provincial or national subsidies were provided to Coveo. I am no expert on Canada, but I have heard that incentives, including salary support, have been made available to firms meeting certain criteria.

Stephen E Arnold, July 20, 2016

Recommind Follows BRS, IDI Basis, Fulcrum, and Nstein

July 19, 2016

OpenText is, by golly, one of the outfits which “owns” more search and retrieval technology than any other firm I can name. I read “OpenText Lives Up to Promise, Acquires Recommind.” The write up points out:

Just a week after it announced it was selling off $600 million worth of senior debt notes to fund future acquisitions, OpenText dropped $163 million to acquire Recommind, an e-discovery and information analytics provider.

The write up explains that Recommind “could generate between $70 and $80 million of annualized revenues.” This is a hefty sum for a system which has in my mind been dumped into the Autonomy-type search system pigeon hole. (If anyone is interested, I have a profile of Recommind technology. Write benkent2020 at yahoo dot com for details.) Frankly I was surprised at the modest size of the deal. What would Recommind have been worth if it had added Big Data, advanced analytics, and artificial intelligence to its system? On the other hand, maybe Recommind did exactly that.

Several observations:

  • Search and content processing systems incur significant technological debt. This means that the software system has to be fed regular injections of real cash to work, keep customers happy, and keep pace with the competition.
  • A vendor with multiple systems has to figure out exactly what system to pitch to a potential customer. This is often difficult if the prospect asks such questions as, “What is Nstein’s capability in terms of Recommind’s functions?” Or, “What search system is included with RedDot and what other options are available to install today and use tomorrow?”
  • Portfolio search and content processing vendors are rare birds in today’s corporate jungle. IBM is similar, and its financial performance suggests that having numerous search and content processing arrows in its quiver does not seem to hit the financial bull’s eye.

OpenText, in my view, is a company which may have to make very hard decisions about what technology debt to retire. The interest on that debt, if left unmanaged, could lead to financial headaches.

Stephen E Arnold, July 19, 2016
