
dtSearch Locks Down a Deal with MaxxVault

November 11, 2011

“MaxxVault LLC and dtSearch Announce That the dtSearch® Engine is Now Embedded in MaxxDocs,” reports PR Newswire:

MaxxVault LLC, specialists in enterprise Electronic Document Management Systems, and dtSearch Corp., a leading supplier of enterprise and developer text retrieval and file parsing software, announce that the dtSearch Engine is now embedded in MaxxDocs 5. Adding dtSearch allows MaxxDocs users to accurately search the content of documents to find the information they need – quickly.

This is a bright feather in the cap of dtSearch. The company is a text retrieval veteran, in the game since 1991, and purveys its products worldwide. Besides the search engine now employed by MaxxVault, its offerings include spider-equipped desktop, network, and web search tools, as well as an application that publishes searchable documents.

MaxxVault’s open-system software helps clients manage, distribute, and control documents securely. The company prides itself on quickly incorporating the suggestions of its users and partners. This latest version of MaxxDocs employs dtSearch to incorporate full-text search capabilities.

Cynthia Murrell   November 11, 2011

Sponsored by Pandia.com

Will a Silver Bullet Save Sci-Tech Publishers?

November 11, 2011

I poked around my Overflight service and noticed a recent news release with the meaty title “Scientific Publisher Saving Hundreds of Thousands of Dollars with MarkLogic.” The subtitle was compelling as well: “New Mobile Applications Let Researchers Study in the Field.”

I thought a moment about the logic of the two statements. I am okay with the idea that a scientific publisher faces some significant challenges. The traditional markets for scientific and technical information in traditional journal form are under severe budget pressure. In response to some scientific publishers’ pricing policies, libraries and some not for profit outfits no longer renew certain journal subscriptions. Others have joined consortia in order to get better value for available budgets.

But STM (scientific, technical, and medical) publications have other issues with which to cope as well. First, technology may not be a core competency. Why would it be? Publishers get authors to write. Publishers package and sell. Technology is talked about, but even giants like Thomson Reuters buy print publishing companies in Argentina. So much for embracing the digital revolution. Even more interesting is that some STM publishers ask authors to pay for journal typesetting, correction, and maybe some production costs. As headcount comes under pressure in research institutes and universities, some scientific publishers are finding that authors are either not willing to pay or not able to get a third party to pony up the money. In short, STM in the traditional mode is fighting for oxygen.

The mobile angle baffled me as well.

In my experience, many scientists work in what might be called “controlled environments.” In the pharmaceutical sector, certain firms operate their research facilities the way South African gold mine superintendents monitor workers at the end of a shift. If this type of security does not resonate with you, you need to do some backfilling on gold and diamond mining security protocols. Think naked. Think weighing workers before and after a shift. Think requiring showers and filtering the gray water. You get the idea. Other types of research do require mobile devices; for example, cleaning up a gone-wrong nuclear reactor, which is not a job for an outfit like AtomicPR, in my experience. Public relations “experts” write about radiation and often have limited experience with micro-contamination and chemical decontamination. The point? Mobile often has specific requirements which stretch beyond creating an “app for that.”

In a nutshell, here’s the nub of the news release from my point of view:

Taking research into the field has a new, literal meaning with the launch of new mobile applications built on MarkLogic that are helping scientists better understand soil and crops. MarkLogic Corporation, the company empowering organizations to make high stakes decisions on Big Data in real time, today announced the American Society of Agronomy (ASA) launched Science Pubs, developed for iPad, iPhone, Android, and BlackBerry devices. Science Pubs utilizes MarkLogic to give subscribers and non-subscribers the freedom to dig deep into ASA’s journals, magazines, and eBooks while conducting first-hand research and observations in the field.

The point is that a markup language makes it possible to do an app. Puzzled, I plunged forward:

“MarkLogic will save us at least $150,000 per year. That is a lot of money for any publisher, especially a non-profit like the American Society of Agronomy,” said Ian Popkewitz, director, Information Technology & Operations, American Society of Agronomy. “We originally implemented MarkLogic to cut the cost of providing critical publications to our subscribers, but we quickly realized several intangible benefits such as speed, ease of use, and flexibility. The flexibility allowed us to focus on the deployment of Science Pubs. ASA is very pleased to be able to quickly launch these services for subscribers and non-subscribers, and we expect them to generate revenue.”

I understand. However, I want to offer several observations based on my modest experience in publishing. Note I did work for a newspaper that was once one of the Top 25 in the world, but the paper is a starved dog now. I also worked for Bill Ziff, mastermind of multiple empires and the magnate other New York publishers loved to loathe, which is what I learned when I was escorted from the New York Times’s president’s office when he learned I worked for the interesting Mr. Ziff.

First, publishers absolutely have to reduce their costs and in a big way. Saving $150,000 is great, but my question is, “How much does it cost to implement a cost saving system such as a MarkLogic or JSON solution (the fat free alternative to chubby XML), and then keep it up and running at a scientific publisher such as the American Society of Agronomy?” If a system costs $50,000, $100,000, or even $300,000, the publisher has to pay off the system, cover its maintenance fee, and whip out some products that sell. With revenues at many scientific publishers flat lining or shriveling, the savings are important and may light a fire under the agronomists to cope with a big expense in the name of cost savings. That type of race can be brutal. And it is one that I would be reluctant to enter.
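To make the payback question concrete, here is a back-of-the-envelope sketch. The $150,000 annual saving comes from the news release, and the three system prices are the ones floated above. The 20 percent annual maintenance rate is purely my assumption for illustration, not a MarkLogic figure, and `payback_years` is a hypothetical helper.

```python
def payback_years(system_cost, annual_savings, annual_maintenance=0.0):
    """Years needed for net annual savings to cover the up-front system cost."""
    net = annual_savings - annual_maintenance
    if net <= 0:
        return float("inf")  # the system never pays for itself
    return system_cost / net


ANNUAL_SAVINGS = 150_000  # figure quoted in the news release
MAINTENANCE_RATE = 0.20   # hypothetical: 20% of system cost per year

for cost in (50_000, 100_000, 300_000):
    years = payback_years(cost, ANNUAL_SAVINGS, MAINTENANCE_RATE * cost)
    print(f"${cost:,} system: about {years:.1f} years to break even")
```

Change the maintenance rate or add staff time and the break-even point moves quickly, which is why the savings number alone tells only part of the story.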

Second, many not for profit organizations and “charities” in the UK are facing declining memberships. Unthinkable five years ago, professional organizations have to market to their members and then spend money to collect on slow paying professionals. Even the certification angle in the UK is not working as it once did. Unemployment among professionals is making it difficult for some experts to pay to be in a must-have organization. Faced with rising costs across the board and decreasing or flat revenue, some not for profit outfits are looking at a nuclear winter, not AtomicPR with a very short half life.

Third, the notion that scientific research has to be peer reviewed in a lengthy, antiquated manner is under pressure. Also, the long publication cycles for some STM journals are out of step with the real time culture in fast moving fields. Not surprisingly, the no-cost or low-cost alternatives to traditional journal publishing refuse to go away. In some fields like mathematics and physics, blogs and even social media have become the important channels for dissemination of technical information and making or breaking careers. Even grants can be determined by a Facebook-type of presence. Quite a shift.

My take on this “news story” is that it makes a possibly compelling case that an XML repository can help reduce certain costs. But without the context of total cost burdens, I have a question, “Why not use JSON?” XML is darned useful, but so is JSON. My question for many scientific, technical, and medical publishers: is JSON a viable option?
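For readers who have not compared the two formats side by side, here is a small sketch that writes the same record both ways using only the Python standard library. The record and its field names are made up for illustration; they are not taken from ASA or MarkLogic.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical article record; field names are invented for this example.
record = {"journal": "Agronomy Journal", "title": "Soil and Crops", "year": "2011"}


def to_xml(rec):
    """Serialize a flat dict as child elements of an <article> root."""
    root = ET.Element("article")
    for key, value in rec.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


def to_json(rec):
    """Serialize the same dict as compact JSON."""
    return json.dumps(rec, separators=(",", ":"))


xml_text = to_xml(record)
json_text = to_json(record)
print(len(xml_text), "characters of XML")
print(len(json_text), "characters of JSON")
```

Character count is only one consideration. XML handles mixed content and has mature schema tooling, which matters for journal articles, so treat this as a sketch of the "fat free" point and not a verdict.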

The ArnoldIT team is finishing a report about the outlook for a major publishing company. With more than $5 billion in revenues, this well known firm may be forced to sell its STM business to generate cash. Not even cost cutting can prevent the dislocations that some publishing companies face. The digital revolution has arrived and is now moving in new directions. Many traditional publishers face stark choices and very difficult financial challenges. Alas, no silver bullets today in my opinion.

Stephen E Arnold, November 11, 2011


Mindbreeze Named Trendsetting Product

November 11, 2011

KM World lists Fabasoft Mindbreeze as one of its “Trend-Setting Products of 2011.”  Microsoft SharePoint’s main claim is that it makes it easier for people to work together.  However, we argue that a uniform platform does not guarantee ease, not unless the solution is customized to the organization and the situation.  Fabasoft Mindbreeze boasts a highly customized enterprise solution allowing for not only efficient searching, but also “finding” across an organization’s entire system.

“Mindbreeze understands, relates and combines information from all sources and presents intelligent search results. Information can be grouped and is classified. Users can scan the different categories and spot a particular document without having to click through a list of links themselves. The information’s semantic relationship is recognized and depicted, navigation elements and facets are provided as well as a preview of any result in the browser. With Fabasoft Mindbreeze Enterprise you get a 360 degree view of your business, customers, competitors and more.”

Any discussion of an entity’s entire platform would be incomplete without some attention to mobile devices. Fabasoft Mindbreeze not only supports search and retrieval from mobile devices, but also ensures that access rights are continually maintained and updated on these devices as well.

“Fabasoft Mindbreeze Mobile supports your enterprise to profit from new opportunities to e-mail, collaborate and work with documents from any location. With Fabasoft Mindbreeze Mobile you can deliver any information to your mobile device’s interface and enhance it with context and classification features. Again, approved security procedures ensure that users can only see information for which they have rights.”

Since search is often carried out under time constraints, an easy and intuitive interface is essential.  Explore the features of Fabasoft Mindbreeze to learn how this trend-setting solution can work for you.

Emily Rae Aldridge, November 11, 2011

Barnes and Noble: Crusher and Crushee

November 10, 2011

When I moved from the East Coast to rural Kentucky, I located a couple of local book stores. When Barnes & Noble came to town, the local book stores were crushed. I recall hearing one of the founders of a local book store saying that his company was “crushed like a bug.” The crusher was Barnes & Noble. Don’t get me wrong. Before we sold The Point to Lycos, we had an office in a building next to Barnes & Noble on Union Square. I bought some books there and even saw a celebrity looking at a book with pictures of big houses. Few words in that book, I recall.

Anyway, I want to point out that the crusher is now the crushee. Said crushee does not like the feeling of a giant company’s hob nail boot on one’s neck. Point your browser to “Barnes & Noble Cries Antitrust at Microsoft.” Here’s the passage I found fascinating:

“Microsoft’s exorbitant licenses for its patents entrench the dominant players in the relevant markets because those players can afford to take a license,” he continued, “while small players cannot”. The company took the complaints to the International Trade Commission. There, it is on the back foot against Microsoft in a patent infringement lawsuit. The book seller hasn’t disclosed how much Microsoft wants it to pay for licensing, however, it did mention it was too much.

Now back to the book store near Harrod’s Creek. What does a crusher do? Crushes. Okay, what does the crushee do? Whines. Oh, and fights with a PowerPoint deck. To see the Barnes & Noble attack on Redmond, navigate to this link.

Crushees with money  seek relief from the courts. Google may help out Barnes & Noble too.

My thought is that Barnes & Noble is learning another useful life lesson about technology. Publishing and retailing do not automatically equip some outfits to survive in the digital world. At lunch with a flock of goslings I pointed out that I lack sympathy for organizations who want to play in the technology world and then wish to apply some of the precepts from the clubby world of urban publishing mores from the early 20th century.

I like steam trains too. Guess what? An automated people mover can kill a person just as dead and in more ways than the locomotives I love. Listen. I hear that whine a comin’, a comin’ round the bend.

Time to get off the tracks.

Stephen E Arnold, November 11, 2011


SharePoint Architecture Can Be Mapped

November 10, 2011

It has been a while since we found a suitable SharePoint graphic that met our high standards, but we found one from the MSDN SharePoint Developer Team Blog in the article, “Business Connectivity Services High-Level Architecture in SharePoint.”

What we discovered is that Business Connectivity Services (BCS) was first introduced in SharePoint 2007 as the Business Data Catalog (the two products build upon each other, for those worried about updates and patches). It allows SharePoint to connect with all kinds of external data sources, including CRM and ERP databases.

“BCS provides a developer with a means of pre-defining all the information needed by an application to connect with and manipulate this external data through External Content Types (ECT). The most important aspect of ECTs is that once the developer creates it, the ECT will be available for use by SharePoint users to connect and use the external systems without knowing any code.”

Our new favorite graphic shows how BCS components work together with SharePoint. SharePoint is moving towards high-level architecture with more servers and third party software to make it work properly. Take a glimpse at this image and see how it is set up. It may give you an idea on how to implement SurfRay.

Whitney Grace, November 10, 2011

Gartner and Another Magic Quadrant

November 10, 2011

We don’t believe in magic, but consultants do. We came across an interesting write up from a Megan Feil. We contacted her and she said she was working on a test of a new blog. Alas, no details.

Here’s what she wrote:

Bizzdesign, a company specializing in enterprise architecture, recently published a news release on their website claiming that they are “positioned as a leader in Gartner Magic Quadrant for Enterprise Architecture 2011.”

One of Gartner’s research methodologies is the magic quadrant. This graphic released by the technology research giant shows competing players in various markets. With ability to execute on the y-axis and completeness of vision on the x-axis, this research can offer a comprehensive look at where companies are in the larger picture. Gartner’s analysts identify the challengers, leaders, niche-players, and visionaries.
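The two-axis layout described above can be sketched in a few lines. The 0-to-1 scores, the midpoint, and the vendor names below are assumptions made for illustration only; Gartner does not publish its placements as a simple numeric cutoff like this.

```python
def quadrant(ability_to_execute, completeness_of_vision, midpoint=0.5):
    """Map two scores (y-axis, x-axis) to one of the four quadrant labels."""
    high_execute = ability_to_execute >= midpoint
    high_vision = completeness_of_vision >= midpoint
    if high_execute and high_vision:
        return "leader"        # upper right
    if high_execute:
        return "challenger"    # upper left
    if high_vision:
        return "visionary"     # lower right
    return "niche player"      # lower left


# Hypothetical vendors with made-up (execute, vision) scores.
vendors = {"Vendor A": (0.8, 0.9), "Vendor B": (0.7, 0.3),
           "Vendor C": (0.2, 0.8), "Vendor D": (0.1, 0.2)}

for name, (execute, vision) in vendors.items():
    print(name, "->", quadrant(execute, vision))
```

The sketch makes one thing plain: the label depends entirely on how the two scores are assigned, which is where the interesting questions about the method sit.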

Gartner appears to be pretty transparent regarding the details of their process for collecting and evaluating data. Further information can be found in the Frequently Asked Questions section. It is important to note that each of these roles has a place in the ecosystem. It would be interesting to discover how much monetary success factors into the overall equation, and by how much those numbers vary, for each of these categories. 

Bizzdesign’s CEO Henry Franken said the following in regards to their placement in the leader quadrant: 

Only three years ago we set our first steps into the international market. From a solid Dutch base, with a lot of knowledge and experience, we have since then grown very fast and have made remarkable progress. To be able to achieve an international top position in a relatively short amount of time is a milestone for both our organization and customers. We are proud of the confidence that the market has given us and we are aiming to expand this confidence along with our growth strategy.

Seeing a relatively young company labeled as a leader in the enterprise architecture market is surprising since it is such a specific market, and one that includes a few larger companies with more seniority. However, it is great to see a company with support for open standards like the TOGAF method and the ArchiMate modeling language in such a position. Additionally, this shows hope for other smaller, younger, and innovative companies to pop up.

Open source architecture provides clients with an enterprise architecture that allows them to customize solutions to solve their business problems instead of adding to them. As 2011 draws to a close, we’ll be looking at more companies than just those mentioned on the quadrant for new leaders in the field.

We’re not so sure about magic. We have questions about azure chip consulting firms. We did find the information about Polyspot suggestive. Worth a look. No magic required.

Stephen E Arnold, November 10, 2011


Google TV Seems Shockingly Vivid

November 10, 2011

What is your definition of vivid, of evil, of family-appropriate video?

Don’t know. I was just asking.

We read “Can 24/7 Porn Rescue Google TV?” If you are interested in this alleged content type, you may want to read the article. We find it interesting that such companies as Thomson Reuters will be participating in Google’s new television initiative. What if the staid Thomson Reuters’ channel and the alleged off color channel are adjacent or easily confused? Google needs to sharpen its video search precision and recall or there will be some surprised financial TV watchers who will wonder what happened to their money channel.

Stephen E Arnold, November 10, 2011


Quote to Note: Google Plus Surely Doomed?

November 10, 2011

Quote to note: Today is November 9, 2011, and I just read an article in an online publication which I thought was going to shut down several times. So predictions about the death of an online service can be wrong. Nevertheless, point your browser thing at “Google+ Is Dead.” Absorb the information, but here’s the quote I have now safely tucked away into the Beyond Search archive:

But a social network isn’t a product; it’s a place. Like a bar or a club, a social network needs a critical mass of people to be successful—the more people it attracts, the more people it attracts. Google couldn’t have possibly built every one of Facebook’s features into its new service when it launched, but to make up for its deficits, it ought to have let users experiment more freely with the site. That freewheeling attitude is precisely how Twitter—the only other social network to successfully take on Facebook in the last few years—got so big. When Twitter users invented ways to reply to one another or echo other people’s tweets, the service didn’t stop them—it embraced and extended their creativity. This attitude marked Twitter as a place whose hosts appreciated its users, and that attitude—and all the fun people were having—pushed people to stick with the site despite its many flaws (Twitter’s frequent downtime, for example). Google+, by contrast, never managed to translate its initial surge into lasting enthusiasm. And for that reason, it’s surely doomed.

Yowza. “Surely doomed.” But aren’t we all?

Stephen E Arnold, November 10, 2011


Mindbreeze Delivers More Convenience More Relevance

November 10, 2011

Our previous stories have highlighted the rapid and broad adoption of SharePoint following the release of its 2010 version. However, SharePoint adoption by an organization does not always equal greater efficiency or productivity. Enterprise solutions, such as Fabasoft Mindbreeze Enterprise, are built with the user in mind, greatly increasing the user’s perception of convenience and relevance. “Stadium of More: The 2011 Summer Release” details the many ways that Mindbreeze improves the enterprise experience from the viewpoint of the user.

A highlight is the Microsoft SharePoint 2010 Connector:

“Fabasoft Mindbreeze Enterprise now supports Microsoft SharePoint 2010 out-of-the-box with its standard product functionality. All standard types of Microsoft SharePoint 2010 are indexed and line of business applications are supported as well.”

This tutorial briefly explains how Mindbreeze’s composite application allows users to search and access all items to which they have access while remaining within the application. Streamlining search and retrieval into the same platform saves time and aggravation. Mindbreeze constantly verifies and updates access rights, ensuring that each user retrieves only results that they have permission to view.

Continuing the company’s theme of “Not searching, but finding,” this tutorial highlights additional ways that Fabasoft Mindbreeze brings convenience and relevance to the forefront. Search restrictions are available via intuitive tabs. Mindbreeze thinks semantically, allowing a search for “records” to retrieve actual records and not simply file names containing the word “record.” More examples can be found in the brief tutorial.

Don’t make the mistake of assuming that “searching” via Microsoft SharePoint is enough.  Choose Fabasoft Mindbreeze and discover how “finding” can be the key to efficiency and user satisfaction for your enterprise needs.

Emily Rae Aldridge, November 10, 2011

Search Silver Bullets, Elixirs, and Magic Potions: Thinking about Findability in 2012

November 10, 2011

I feel expansive today (November 9, 2011), generous even. My left eye seems to be working at 70 percent capacity. No babies are screaming in the airport waiting area. In fact, I am sitting in a not too sticky seat, enjoying the announcements about keeping pets in their cage and reporting suspicious packages to law enforcement by dialing 250.

I wonder if the mother who left a pink and white plastic bag with a small bunny and box of animal crackers is evil. Much in today’s society is crazy marketing hype and fear mongering.

Whilst thinking about pets in cages and animal crackers which may be laced with rat poison, and plump, fabric bunnies, my thoughts turned to the notion of instant fixes for horribly broken search and content processing systems.

I think it was the association of the failure of societal systems that determined passengers at the gate would allow a pet to run wild or that a stuffed bunny was a threat. My thoughts jumped to the world of search, its crazy marketing pitches, and the satraps who have promoted themselves to “expert in search.” I wanted to capture these ideas, conforming to the precepts of the About section of this free blog. Did I say, “Free.”

A happy quack to http://www.alchemywebsite.com/amcl_astronomical_material02.html for this image of the 21st century azure chip consultant, a self appointed expert in search with a degree in English and a minor in home economics with an emphasis on finger sandwiches.

The Silver Bullets, Garlic Balls, and Eyes of Newts

First, let me list the instant fixes, the silver bullets,  the magic potions, the faerie dust, and the alchemy which makes “enterprise search” work today. Fasten your alchemist’s robe, lift your chin, and grab your paper cone. I may rain on your magic potion. Here are 14 magic fixes for a lousy search system. Oh, one more caveat. I am not picking on any one company or approach. The key to this essay is the collection of pixie dust, not a single firm’s blend of baloney, owl feathers, and goat horn.

  1. Analytics (The kind of equations some of us wrangled and struggled with in Statistics 101 or the more complex predictive methods which, if you know how to make the numerical recipes work, will get you a job at Palantir, Recorded Future, SAS, or one of the other purveyors of wisdom based on big data number crunching)
  2. Cloud (Most companies in the magic elixir business invoke the cloud. Not even Macbeth’s witches do as good a job with the incantation of Hadoop the Loop as Cloudera, but there are many contenders in this pixie concoction. Amazon comes to mind but A9 gives me a headache when I use A9 to locate a book for my trusty e Reeder.)
  3. Clustering (Which I associate with Clustify and Vivisimo, but Vivisimo has morphed clustering into “information optimization” and gets a happy quack for this leap)
  4. Connectors (One cannot search unless one can acquire content. I like the Palantir approach which triggered some push back but I find the morphing of ISYS Search Software a useful touchstone in this potion category)
  5. Discovery systems (My associative thought process offers up Clearwell Systems and Recommind. I like Recommind, however, because it is so similar to Autonomy’s method and it has been the pivot for the company’s flip flop from law firms to enterprise search and back to eDiscovery in the last 12 or 18 months)
  6. Federation (I like the approach of Deep Web Technologies and for the record, the company does not position its method as a magical solution, but some federating vendors do so I will mention this concept. Think mash up and data fusion too)
  7. Natural language processing (My candidate for NLP wonder worker is Oracle which acquired InQuira. InQuira is a success story because it was formed from the components of two antecedent search companies, pitched NLP for customer support, and got acquired by Oracle. Happy stakeholders all.)
  8. Metatagging (Many candidates here. I nominate the Microsoft SharePoint technology as the silver bullet candidate. SharePoint search offers almost flawless implementation of finding a document by virtue of knowing who wrote it, when, and what file type it is. Amazing. A first of sorts because the method has spawned third party solutions from Austria to the United States.)
  9. Open source (Hands down I think about IBM. From Content Analytics to the wild and crazy Watson, IBM has open source tattooed over large expanses of its corporate hide. Free? Did I mention free? Think again. IBM did not hit $100 billion in revenue by giving software away.)
  10. Relationship maps (I have to go with the Inxight Software solution. Not only was the live map an inspiration to every business intelligence and social network analysis vendor, it was cool to drag objects around. Now Inxight is part of Business Objects which is part of SAP, which is an interesting company occupied with reinventing itself and ignoring TREX, a search engine)
  11. Semantics (I have to mention Google as the poster child for making software know what content is about. I stand by my praise of Ramanathan Guha’s programmable search engine and the somewhat complementary work of Dr. Alon Halevy, both happy Googlers as far as I know. Did I mention that Google has oodles of semantic methods, but the focus is on selling ads and Pandas, which are somewhat related.)
  12. Sentiment analysis (The winner in the sentiment analysis sector is up for grabs. In terms of reinventing and repositioning, I want to acknowledge Attensity. But when it comes to making lemonade from lemons, check out Lexalytics (now a unit of Infonics). I like the Newssift case, but that is not included in my free blog posts and information about this modest multi-vehicle accident on the UK information highway is harder and harder to find. Alas.)
  13. Taxonomies (I am a traditionalist, so I quite like the pioneering work of Access Innovations. But firms run by individuals who are not experts in controlled vocabularies, machine assisted indexing, and ANSI compliance have captured the attention of the azure chip, home economics, and self appointed expert crowd. Access Innovations knows its stuff. Some of the boot camp crowd, maybe somewhat less? I read a blog post recently that said librarians are not necessary when one creates an enterprise taxonomy. My how interesting. When we did the ABI/INFORM and Business Dateline controlled vocabularies we used “real” experts and quite a few librarians with experience conceptualizing, developing, refining, and ensuring logical consistency of our word lists. It worked because even the shadow of the original ABI/INFORM still uses some of our terms 30 plus years later. There are so many taxonomy vendors, I will not attempt to highlight others. Even Microsoft signed on with Cognition Technologies to beef up its methods.)
  14. XML (There are Google and MarkLogic again. XML is now a genuine silver bullet. I thought it was a markup language. Well, not any more, pal.)

