Governance Problem with Open Source Software
May 4, 2012
Open source is great, but licensees want everything to be easy. That’s our takeaway from ZDNet’s “Only 20% of Corporate OSS Users Manage Components.” The piece reports on a Sonatype survey of about 2,500 developers at companies that use open source components. Writer Paula Rooney notes:
“Roughly 20 percent, or 500 respondents, said they were locked down and could only use corporate-approved components, compared to 13 percent in a similar but smaller survey performed a year ago.
Fewer than 50 percent — 49 percent — indicated they had a corporate policy in place and 63 percent acknowledged that corporate standards are not enforced or there are none in place. But that’s still up from last year’s survey, in which almost 90 percent said there were no corporate policies at all.”
So, most companies are still too lax with their open source policies, but that status is gradually improving. That’s a good thing, since employment of open source projects is growing apace.
Founded in 2008, the relatively young Sonatype is a software company built on a powerful open source combination: Apache’s Maven Build System and Central Repository. The company is committed to supporting the open source community.
Cynthia Murrell, May 4, 2012
Sponsored by PolySpot
Black Duck Analyzes the Decline of Copyleft Licenses
May 3, 2012
IT World recently reported on a growing disuse of copyleft licenses in the article “GPL, Copyleft Use Declining Faster Than Ever.”
According to the article, recent data analyzed by Matthew Aslett of Black Duck Software has shown that while the use of the GPL, LGPL, and AGPL set of copyleft (the method of making software free) licenses dominates free and open source projects, that use is still on the decline.
When asked his opinion on the cause for the decline, Aslett said:
“The analysis indicated that the previous dominance of strong copyleft licenses was achieved and maintained to a significant degree due to vendor-led open source projects, and that the ongoing shift away from projects controlled by a single vendor toward community projects was in part driving a shift towards more permissive non-copyleft licenses.”
He also believes that the decline in GPL specifically is not a result of projects moving away from GPL, but that new vendors are choosing community approaches enabled by permissive licenses, rather than attempting to control projects using the GPL.
This is a very interesting interpretation of the causality of declining GPL. We’re interested to see how it continues to unfold.
Jasmine Ashton, May 3, 2012
Sponsored by PolySpot
Will Harvard Library Jettison Paid Access Academic Journals?
May 3, 2012
In what could be another step toward knowledge failure, BoingBoing reports “Harvard Library to Faculty: We’re Going Broke Unless You Go Open Access.” Struggling with the high costs of academic journal access fees, the Harvard Library Faculty Advisory Council has decided to cancel all the library’s paid scholarly subscriptions.
There’s no doubt that these charges are out of control, and steadily encroaching on the budgets for other acquisitions. Writer Cory Doctorow quotes the Council’s Memorandum on Journal Pricing:
“Harvard’s annual cost for journals from these providers now approaches $3.75M. . . . Some journals cost as much as $40,000 per year, others in the tens of thousands. Prices for online content from two providers have increased by about 145% over the past six years, which far exceeds not only the consumer price index, but also the higher education and the library price indices.”
We understand that the library must control costs. It is unfortunate, however, that that knowledge will no longer be at students’ fingertips. The open access academic world is still sparsely populated, and the Council makes this plea in hope of a richer open access community in the future:
“It’s suggesting that faculty make their research publicly available, switch to publishing in open access journals and consider resigning from the boards of journals that don’t allow open access.”
Perhaps the scholarly open access options will grow, in time. In the meanwhile, it will be the students who miss out on key knowledge.
Cynthia Murrell, May 3, 2012
Sponsored by PolySpot
Oracle and SAP: The Milagro Database War
May 3, 2012
I received an email inducing me to read “Hana and Exalytics: SAP’s Hype Versus Oracle’s FUD.” The write up takes a serious or at least semi serious look at the Milagro database war. If you are not familiar with the Milagro Beanfield War, you might find the write up a loose allegory of what’s happening between traditional data management companies and the NoSQL farmers.

The Information Week write up does not talk about the real story, however. What we get is two giants of traditional enterprise software squabbling over which traditional data management system is most likely to keep the Fortune 1000, government agencies, and big educational institutions within the traditional enterprise software corral.
With regard to Oracle, the write up asserts:
Oracle’s Larry Ellison and Safra Catz have missed few opportunities to discredit Hana in recent months. But executive VP Thomas Kurian took the slams a level deeper on Friday with a one-hour Webinar clearly intended to sow seeds of fear, uncertainty and doubt in the minds of would-be Hana customers. The session was billed as an Exalytics seminar, but each point set up a contrast with Hana. Kurian claimed, among other things, that SAP’s product costs five times to 50 times more than Exalytics and that it doesn’t support SQL (relational) or MDX (multidimensional) query languages, requiring apps to be rewritten to run on the new database.
The Information Week write up reports:
SAP’s hype about these apps is getting a little ahead of deployed market reality. Both Hana and Oracle Exalytics can point to dramatic before-and-after differences in query speeds. (Even SAP grants that Exalytics can accelerate queries.) SAP says the real payoff from Hana will be in transforming business processes, not just accelerating queries. But we haven’t seen enough solid, real-world customer examples documenting transformed business competitiveness.
Open Source: Researchers Get Frisky
May 2, 2012
Group of Researchers Demand Source Code be Made Public
Phys.org recently reported on a new policy paper written by a group of academic research scientists from across the U.S. in the post “Academic Group Says it’s Time for Researchers to Begin Sharing Source Code.”
According to the article, the group argues that because computer programs have become an integral part of scientific research, and because many researchers use public funds to conduct their research, entities that provide funding should require that the source code created be made public, as is the case with other resource materials.
The article states:
“Not providing source code, they say, is now akin to withholding parts of the procedural process, which results in a “black box” approach to science, which is of course, not tolerated in virtually every other area of research in which results are published. It’s difficult to imagine any other realm of scientific research getting such a pass and the fact that code is not published in an open source forum detracts from the credibility of any study upon which it is based.”
This is an interesting point, since as science continues to evolve, the use of computer code, both off the shelf and custom written, will likely become ever more present in research endeavors. If researchers don’t begin to share their findings with one another, it could easily stunt the progression of many scientific findings.
Jasmine Ashton, May 2, 2012
Sponsored by PolySpot
Constellio Profile Now Available
May 1, 2012
If you are tracking open source search, you can download a free profile of Doculibre’s Constellio system. Each week, ArnoldIT will make available a profile of an open source search vendor. You can request a copy of this week’s profile from our TheSeed2020 site. We leave each new profile “live” for one week. If you want the complete set, you will need to request each profile when the file becomes available.
The full collection will comprise 12 profiles. Once each profile has been available without charge, the full collection plus the market analysis and outlook sections will be available in a single PDF file for a service charge.
Stephen E Arnold, May 1, 2012
Sponsored by ArnoldIT
Ikanow and Carahsoft Bring Infinit.e to Government
May 1, 2012
Here at Beyond Search we often report on tech partnerships that will result in improved search technology. In this vein, Yahoo News recently published “IKANOW and Carahsoft Partner to Bring Open Analytics and Agile Intelligence to Government,” a news release on an important new partnership to look out for.
According to the article, government IT solutions provider Carahsoft Technology Corp and the open analytics platform developer IKANOW have joined together to bring IKANOW’s Infinit.e data and knowledge discovery platform, built on open source technologies, to the U.S. public sector market. Whereas IKANOW will be in charge of product development, Carahsoft will focus on sales and marketing activities to educate the market.
The article states:
“Infinit.e is an all-in-one knowledge discovery tool with an intuitive browser-like interface and powerful algorithms that automatically find relevant information, determine connections and present results in easily understood diagrams, saving time, increasing confidence and improving communications.”
This cloud based analyst platform will allow companies to quickly and easily sort and analyze mass amounts of structured and unstructured data and could make a big impact on our government’s ever expanding demand for research tools.
Jasmine Ashton, May 1, 2012
Sponsored by Augmentext
Oracle Google: An Interesting View
April 30, 2012
I read “My attitude on Oracle v Google.” I was confused. The “inside baseball” write up references folks by their first name. There is, therefore, some ambiguity, which in legal matters is often a quite effective rhetorical technique. I noted this passage:
Just because Sun didn’t have patent suits in our genetic code doesn’t mean we didn’t feel wronged. While I have differences with Oracle, in this case they are in the right. Google totally slimed Sun.
The post is by James Gosling, whom I associate with Sun’s rise and fall, Java, and some interesting wordsmithing.
My takeaway from the short item was that emotion, not engineering, seems to be important in the Oracle Google dust up. The other value of the post is that it triggered a quite interesting comment from Bruno Lowagie, who opined:
I hope Oracle wins, not regarding the ‘copyright on APIs’, but because Google doesn’t respect copyright in general. As for Apache, ASL stands for the Apache Slavery License doesn’t it? Large corporations encourage you to use the ASL instead of the GPL so that they can make money using the code, whereas the developer can hardly make any money writing it.
Yikes, slavery. Perhaps Mr. Lowagie is angling for a project at Oracle, an outstanding employer.
Stephen E Arnold, April 30, 2012
Sponsored by PolySpot
The Open Source Search Ostriches
April 30, 2012
ArnoldIT, located in Harrod’s Creek, Kentucky, has spotted a new species of search, content processing, and text mining vendor: The Scrutans Struthioniformes. Believed to be related to the ratites, this new subspecies is known to be indifferent to or ignorant of the predator from the open source jungle.
The proprietary search vendor, Scrutans Struthioniformes, ignores the impact of open source search and information retrieval systems.
ArnoldIT has completed a couple of exploratory expeditions through the wilds of open source search, clustering, and related disciplines. Sparked by the bimonthly feature on open source search which is currently appearing in Information Today’s Online Magazine, the discovery of the Scrutans Struthioniformes was unexpected.
For almost 50 years, information retrieval meant proprietary systems built upon innovations by academic researchers. Whether the influence was from the number crunching of the Cornell school or the semantic shenanigans from Stanford, search and retrieval translated to:
- Expensive to license, install, optimize, and maintain systems
- Licensing restrictions which prevented client-specific tailoring and fast cycle problem remediation or feature addition
- High levels of user dissatisfaction from the CFO’s office (the lady who pays the bills) to the user in the sales department (the person who has to find out what happened to a particular customer’s order).
What’s changed, according to ArnoldIT, is that open source options are readily available. Smart outfits like IBM killed off in house, brute force search efforts and embraced the open source Lucene/Solr technology. IBM is a proprietary outfit, but the use of Lucene/Solr allowed more effort to be put into value-adding projects such as the “wrappers” which make Watson a game show winner. IBM has also used its billions to purchase proprietary vendors to deliver “additional value.” The purchase of Vivisimo is a good example of a quick way to get clustering, deduping, and federating functions to bolt on the open source plumbing. IBM may disagree, but we have our views.
Other vendors have built businesses on open source search. One example is the emergence of Lucid Imagination and its Lucid Works Enterprise 2.0 solution. Licensees get speedy search and retrieval, a staff able to answer questions, and the rapid cycle innovation of the open source Lucene/Solr software.
Clever Amazon is a “sort of” open outfit. On one hand, the company uses open source software to make the Amazon cloud work. However, the CloudSearch solution is based on A9. Amazon, however, provides “sort of” open application programming interfaces. Open source as a business angle is part of the CloudSearch play along with making life easy for developers to deliver “good enough” search.
The Basho Riak Search angle is a variation. Riak Search is proprietary but Basho has made it open source. (A free profile of Basho is available by registering at TheSeed2020, an ArnoldIT content delivery Web site.) Good citizens and good marketing. For a company with a problem which requires Basho data management, the Riak Search solution is available, and it is open source.
There are other variations as well, and these are explained in the ArnoldIT briefing about open source search, its opportunities, and its challenges. Unlike the technology payloads delivered by blogs, the ArnoldIT briefing focuses on the business angle of open source search, and the research has delivered some shockers; for example:
- In a sample of 35 proprietary search vendors, 25 assert that their systems are in some way open source. Good marketing, better technology, or great hyperbole?
- In a sample of 100 search vendors, two thirds of those pinged by ArnoldIT know about or are on top of open source search. Quite an assertion as the Lucid Imagination Lucene Revolution approaches with dozens of case studies that reveal large companies’ willingness to shift from proprietary solutions to open source search. Are most vendors of proprietary search systems ignoring reality? Sure looks like some are confident the search world tomorrow will look the way it did in 2003.
- Hosted search is gaining traction in some specific niches. Two of these niches have long been dominated by proprietary systems. More surprising is the fact that the greatest inroads are being made among the Fortune 1000. That’s the market where money often is for enterprise software vendors.
Will vendors of proprietary search and retrieval systems be able to keep their investors and stakeholders happy as open source becomes a greater force in 2013? The briefing considers the scenario when firms pour more funds into open source search and content processing start ups. If this happens, life becomes more difficult for “on the bubble” vendors of taxonomy, clustering, search, and basic information retrieval systems.
Net net: Another search revolution is brewing. Is your proprietary search vendor a Scrutans Struthioniformes? A better question: Are you? For more information about the ArnoldIT open source search briefing, write seaky2000 at yahoo dot com for options and fees. ArnoldIT may create an open source search ostrich T shirt. Stay tuned. Max and Tess are working on this project now.
Stephen E Arnold, April 30, 2012
Sponsored by Ikanow
ElasticSearch Support from Sematext
April 26, 2012
Sematext has achieved a first, PRWeb announces in “Sematext Int’l First to Offer ElasticSearch Tech Support.” The company already offers Tech Support for Apache projects Solr and Lucene. The press release explains:
“ElasticSearch is a highly scalable open-source search solution that is rapidly being adopted by enterprises world-wide. Over the last 12 months Sematext has witnessed an increase in demand for ElasticSearch and has helped a number of clients build ElasticSearch-based search solutions. Some of Sematext clients are using ElasticSearch on a truly massive scale, ingesting many millions of documents per day and serving hundreds of queries per second.”
The company responded to the growing need, and ElasticSearch Technical Support was born. Sematext founder Otis Gospodnetić stated that his company is in the best position to offer such support, having captured more ElasticSearch engagements (with huge data and query volumes) than anyone. The information gathered at Beyond Search suggests that Lucid Imagination’s Lucene/Solr is the most widely used at this time, however. Lucid offers comprehensive technical and engineering support services via full time staff and a number of partners around the world; for example, Lemur Consulting/Flax.
With many high-profile client organizations around the world, Sematext provides search and data analytics products based on a variety of open source projects. The company started out as Lucene Consulting in 2005, but branched into Solr-based services the very next year. They are proud to have never taken on debt or external funding.
ElasticSearch has been resource constrained. Hopefully the tie up with Sematext will smooth some rough edges. Support is important. As with most Lucene variants, support and engineering services are needed.
Cynthia Murrell, April 26, 2012
Sponsored by PolySpot



