The Future of Enterprise and Web Search: Worrying about a Very Frail Goose

May 28, 2015

For a moment, I thought search was undergoing a renascence. But I was wrong. I noted a chart which purports to illustrate that the future is not keyword search. You can find the illustration (for now) at this Twitter location. The idea is that keyword search is less and less effective as the volume of data goes up. I don’t want to be a spoil sport, but for certain queries key words and good old Boolean may be the only way to retrieve certain types of information. Don’t believe me? Log on to your organization’s network or to Google. Now look for the telephone number of a specific person whose name you know or a tire company located in a specific city with a specific name which you know. Would you prefer to browse a directory, a word cloud, a list of suggestions? I want to zing directly to the specific fact. Yep, key word search. The old reliable.
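For readers who want to see why Boolean lookup zings to a known fact, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the three sample listings, the phone numbers, and the helper names build_index and boolean_and are invented, and a production search system does far more than this.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            # Strip simple trailing/leading punctuation so "Louisville:" matches "louisville".
            index[token.strip(".,;:()")].add(doc_id)
    return index


def boolean_and(index, *terms):
    """Return ids of documents that contain every term (Boolean AND)."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


# Hypothetical directory listings, invented for this sketch.
docs = {
    1: "Acme Tire Company, Louisville: phone 555-0100",
    2: "Acme Hardware, Louisville: phone 555-0111",
    3: "Best Tire Company, Lexington: phone 555-0122",
}

index = build_index(docs)
matches = boolean_and(index, "acme", "tire", "louisville")
for doc_id in sorted(matches):
    print(docs[doc_id])
```

Three known terms, one intersection, one listing. No word cloud required.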

But the chart points out that the future is composed of three “webs”: The Social Web, the Semantic Web, and the Intelligent Web. The date for the Intelligent Web appears to be 2018 (the diagram at which I am looking is fuzzy). We are now perched half way through 2015. In 30 months, the Intelligent Web will arrive with these characteristics:

  • Web scale reasoning (Don’t we have Watson? Oh, right. I forgot.)
  • Intelligent agents (Why not tap Connotate? Agents ready to roll.)
  • Natural language search (Yep, talk to your phone. How is that working out on a noisy subway train?)
  • Semantics. (Embrace the OWL. Now.)

Now these benchmarks will arrive in the next 30 months, which implies a gradual emergence of Web 4.0.

The hitch in the git along, like most futuristic predictions about information access, is that reality behaves in some unpredictable ways. The assumption behind this graph is “Semantic technology help to regain productivity in the face of overwhelming information growth.”

Data Darkness

May 28, 2015

According to Datameer, organizations do not use a large chunk of their data, and this unused portion is commonly referred to as “dark data.”  “Shine Light On Dark Data” explains that organizations are trying to dig out the dark data and use it for business intelligence or, in more recent terms, big data.  Dark data is created from back end business processes as well as from regular business activities.  It is usually stored in a storage silo in a closet and kept only for compliance audits.

Dark data has a lot of hidden potential:

“Research firm IDC estimates that 90 percent of digital data is dark. This dark data may come in the form of machine or sensor logs that when analyzed help predict vacated real estate or customer time zones that may help businesses pinpoint when customers in a specific region prefer to engage with brands. While the value of these insights are very significant, setting foot into the world of dark data that is unstructured, untagged and untapped is daunting for both IT and business users.”

The article suggests making a plan to harness the dark data, but it does not offer much in the way of approaching a project other than tailoring it specifically to dark data: identifying sources, using Hadoop to mine them, and testing results against other data sets.

This article is really a puff piece highlighting dark data without going into much detail about it.  The authors are forgetting the biggest movement in IT from the past three years: big data!

Whitney Grace, May 28, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

SharePoint Is Back and Yammer Is Left Behind

May 28, 2015

Many old things become trendy and new again, and that holds true even with software, at least in principle. The old functions of SharePoint are withstanding the test of time, and the trendy new buzzwords that Microsoft worked so hard to push these last few years (cloud, social, collaborative) are fading out. Of course, some of it has to do with perception, but it does seem that Microsoft is harkening back to what the tried and true longtime users want. Read more in the CMS Wire article, “SharePoint is Back, Yammer… Not So Much.”

The article sums up the last few years:

“But these last few years, Microsoft seemingly didn’t want to talk about SharePoint. It wanted to talk about Office 365, the cloud, collaboration, social, mobile devices and perpetual monthly licensing models. Yet no one appears to have told many of the big traditional SharePoint customers of these shifts. These people are still running SharePoint 2007, 2010 and 2013 happily in-house and have no plans to change that for many years.”

So it seems that with the returned focus to on-premises SharePoint, users are pleased in theory. However, it remains to be seen how satisfying SharePoint Server 2016 will be in reality. To stay tuned to the latest reviews and feedback, keep an eye on ArnoldIT.com and its dedicated SharePoint feed. Stephen E. Arnold is a longtime leader in search with an interest in SharePoint. His reporting will shed light on the realities of user experience once SharePoint Server 2016 becomes available.

Emily Rae Aldridge, May 28, 2015

Altiar And dtSearch Combine

May 27, 2015

Sometimes when items are combined they create something even better, such as Oreos and peanut butter, Disney and Marvel, and Netflix and original series.  EContentMag alerted us that a new team-up is underway between two well known companies.  The press release title says it all, “Altiar Cloud-Based ECM Platform Is Embedding The dtSearch Engine.”  Altiar is an enterprise content management platform that has been specifically used by Microsoft Azure.  The popular dtSearch platform has been searching through terabytes since 1991 and is referred to as a powerful search tool.  Embedding dtSearch into the Altiar core will make it a more powerful ECM.

Altiar is a popular ECM and can only be improved by dtSearch:

“A cloud-based service, Altiar includes rapid setup, scalability, and storage. It can accept any type of file, from PowerPoint to streaming video, as well as providing a host of tools and services to create custom content pages, newsletters, personal zones, and the like. The platform lets users not only access content from any connected device, but also manage, share, and track content, including features like email alerts.”

Microsoft is not a main player in cloud computing, and Microsoft Azure is supposed to drive more customers to the company.  Anything, like Altiar improving its search, will make the platform more appealing.

Whitney Grace, May 27, 2015

Hijacking Semantics for Search Engine Optimization

May 26, 2015

I am just too old and cranky to get with the search engine optimization program. If a person cannot find your content, too bad. SEO has caused some of the erosion of relevance across public Web search engines.

The reason is that pages with lousy content are marketed as having other, more valuable content. The result is queries like this:

image

I want information about methods of digital reasoning. What I get is a company profile.

How do I get information for my specific requirement? I have to know how to work around the problems SEO puts in my face every day, over and over again.

This query works on Bing, Google, and Yandex: artificial intelligence decision procedures.

image

The results do not point to a small company in Tennessee, but to substantive documents from which other, pointed queries can be launched for individuals, industry associations, and methods.

When I read “Semantic Search Strategies That Work,” I became agitated. The notions of “forgetting about content” and “focusing on quality” miss the mark. Add the advice to “spend time on engagement,” and the result is a collection of unrelated assertions.

The goal of semantics for SEO is to generate traffic. The search systems suck in shaped content and persist in directing people to topics that may have little or nothing to do with the information a person needs to solve his or her problem.

In short, the bastardization of semantics in the name of SEO is ensuring that some users will define the world from the point of view of marketing, not objective information.

What’s the fix?

Here’s the shocker: There is no fix. As individuals abrogate their responsibility to demand high value, on point results, schlock becomes the order of the day.

So much for clear thinking. Semantic strategies that erode relevance do not “work” from my point of view. This type of semantics thickens the cloud of unknowing.

Stephen E Arnold, May 26, 2015

Computing Power Up a Trillion Fold in 60 Years. Search Remains Unchanged.

May 25, 2015

I get the Moore’s Law thing. The question is, “Why aren’t search and content processing improving?”

Navigate to “Processing Power Has Increased by One Trillion-Fold over the Past Six Decades” and check out the infographic. There are FLOPs and examples of devices which deliver them. I focused on the technology equivalents; for example, the Tianhe 2 Supercomputer is the equivalent of 18,400 PlayStation 4s.
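The arithmetic behind these comparisons is easy to sanity check. The sketch below is a back-of-the-envelope calculation in Python, not anything taken from the infographic. The two performance figures are assumptions drawn from commonly cited specifications (roughly 33.86 petaFLOPS on Linpack for the Tianhe 2 and roughly 1.84 teraFLOPS peak for the PlayStation 4), and the variable names are invented for illustration.

```python
import math

# Assumed figures, not stated in the post or verified against the infographic:
# Tianhe 2 Linpack performance and PlayStation 4 peak GPU performance.
TIANHE2_FLOPS = 33.86e15  # about 33.86 petaFLOPS
PS4_FLOPS = 1.84e12       # about 1.84 teraFLOPS

# How many PlayStation 4s add up to one Tianhe 2 under these assumptions.
ps4_equivalents = TIANHE2_FLOPS / PS4_FLOPS

# A trillion-fold gain over six decades, expressed as a doubling period.
GROWTH_FACTOR = 1e12
YEARS = 60
doublings = math.log2(GROWTH_FACTOR)
years_per_doubling = YEARS / doublings

print(f"PlayStation 4 equivalents: {ps4_equivalents:,.0f}")
print(f"Doublings in {YEARS} years: {doublings:.1f}")
print(f"Years per doubling: {years_per_doubling:.2f}")
```

Under these assumptions the ratio lands close to the 18,400 figure, and a trillion-fold gain works out to a doubling about every year and a half, which squares with the Moore’s Law thing.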

The problem is that search and content processing continue to bedevil users. Perhaps the limitations of the methods cannot be remediated by a bigger, faster assemblage of metal and circuits?

The improvement in graphics is evident. But allowing me to locate a single document in my multi petabyte archive continues to be a challenge. I have more search systems than the average squirrel in Harrod’s Creek.

Findability is creeping along. After 60 years, the benefits of information access systems are very difficult to tie to better decisions, increased revenues, and more efficient human endeavors even when a “team of teams” approach is used.

Wake up call for the search industry. Why not deliver some substantive improvements in information access which are not tied to advertising? Please, do not use the words metadata, semantics, analytics, and intelligence in your marketing. Just deliver something that provides me with the information I require without my having to guess key words, figure out odd ball clustering, or wait minutes or hours for a query to process.

I don’t want Hollywood graphics. I want on point information. In the last 60 years, my information access needs have not been met.

Stephen E Arnold, May 25, 2015

Yahoo Considers Options for Japanese Division

May 25, 2015

Despite a series of changes since former Googler Marissa Mayer took over at Yahoo, the search-and-entertainment company still struggles to find its footing in a tech landscape that shifted around it long ago. Bloomberg Business wonders whether Yahoo’s next steps in Japan will set it on a sturdier path in “Yahoo Weighs Options for Japan Stake; Sales Miss Estimates.” Writer Brian Womack reports that Mayer plans to make the most of her company’s Japanese assets. He posits:

“By telling investors she’s looking at options for Yahoo Japan, Mayer may be seeking to buy herself more time to jump-start growth at the company she’s been working to turn around for almost three years. Unless she can expand sales, investors may eventually lose patience with the strategy and question her leadership. Some analysts speculated earlier this year that Yahoo could become a takeover target for a larger Internet company after it spins off the Alibaba stake.

“Yahoo’s share of the U.S. online display ad market may slide to 3.5 percent in 2017 from 5.5 percent last year, according to EMarketer Inc. Quarterly revenue growth has come in at less than 4 percent or negative since the end of 2012.”

The success of China’s largest e-commerce firm, and Yahoo asset, Alibaba is responsible for much of the company’s recent growth, such as it is, but that boost will only last so long. Womack reports there has been investor pressure to spin off Yahoo’s Japanese division, but apparently Mayer prefers to consider a range of options. Will Yahoo find salvation in the land of the rising sun?

Cynthia Murrell, May 25, 2015

Google Puts Wood behind Enterprise Search

May 24, 2015

A couple of years ago, enterprise search at Google was not setting my world on fire. I enjoyed reporting on the cost of the Google Search Appliance, a fail over component, and the services required by a Google partner to make the GSA sing and dance the way the licensee wanted. I listened to Googlers for little bits of gossip. One of the items which I was not able to verify was that there were not enough engineers working on the GSA. Other activities at Google beckoned. See, for example, my write up about the robotic teddy bear or run a search for the Loon balloon thing.

I read “Google Labs for Enterprise Search” and learned that those rumors were wrong. Wrong, wrong, wrong. Google Enterprise Search is not just the exciting, job creating engine embodied in the Google Search Appliance. Enterprise search embraces Google Intranet search and the Google Mini. I thought the Mini was history.

Wait. A newspaper for enterprise search reported this story as if it were recent. Well, the undated story about Google is not recent. The article comes from something called “In the Googleplex” and it is a Google hobby site.

So maybe I was not wrong, wrong, wrong.

I highlight this item because folks writing and curating information about search do not date their articles. Google is date challenged, which is one reason the GOOG has funded Recorded Future and its time technology.

Pumping out old information as if it were “fresh” just confuses the already credibility challenged search and content processing sector. Maybe no one cares or most readers are content to accept baloney as sirloin.

Stephen E Arnold, May 24, 2015

A Bigger, Faster, Better Technology Innovation Pipeline: Think Corporate Funding of R&D

May 24, 2015

I read the opinion piece by MIT president Rafael Reif. This item appeared in Mr. Bezos’ pet newspaper, the Washington Post. You can find a version of the editorial in the MIT News. Dr. Reif is, if Wikipedia is spot on, on the board of Alcoa. He also has invented “Method of forming a multi-layer semiconductor structure having a seamless bonding interface” and more than a dozen other systems and methods. You can get the biographical details in Wikipedia and on the MIT Office of the President’s Web site. Neither of these sources references, as far as I could tell, “Trendy Reif Strikes Again” and this selfie:

The write up points out:

Together, the public and private sectors make investments in higher education and scientific research. (LiquiGlide emerged from research funded by the National Science Foundation and the Defense Department.) These investments spawn graduates and ideas that, through venture-capital-funded start-ups, pay off in innovations that serve society: the ultimate return on investment.

Okay, corporate funded academic research. The approach is a bit different from the now outmoded Bell Labs angle of attack. But corporate funding generates some nifty architecture, even niftier piles of money to use for various purposes, and some nifty opportunities for the students and faculty.

There is a downside. I was surprised to learn:

But this system leaves a category of innovation stranded: new ideas based on new science. Self-fertilizing plants. Bacteria that can synthesize biofuels. Safe nuclear energy technology. Affordable desalination at scale. It takes time for new-science technologies to make the journey from lab to market, often including time to invent new manufacturing processes. It may take 10 years, which is longer than most venture capitalists can wait. The result? As a nation, we leave a lot of innovation ketchup in the bottle.

What?

The problem has to be addressed. I assume that a hamburger without ketchup is not going to keep the conscientious, serious students and their mentors on the beam. MIT has to produce innovation. If ketchup is in the bottle, we need a vacuum device equipped with artificial intelligence and next generation features which perform non chaotic bottle maneuvers to remove the condiment while reporting data in real time. Yes.

What’s the fix?

There are two—count ‘em—two ways to tackle ketchup left in the bottle problems.

  1. Do the corporate funding of schools like MIT. That one percent silliness does not apply to academic institutions near the Charles River.
  2. Move faster. Hey, that nuclear bomb development which dragged on for a decade, old fashioned. We need to accelerate innovation.

The assumption is that innovation is the way to fix the challenges in today’s world. Nothing works, it seems, unless we have more, better, faster technology.

The only problem is that certain technologies like search and information access are not improving. I can identify a couple of other technological enhancements which are not having the desired impact. I wrote about the attention span of a goldfish and a 20 something.

The goldfish had the ability to concentrate for a longer period of time. MIT and other techno-havens are ecosystems. I lived in central Illinois in the winter. When I was a freshman in high school in 1958 I created a terrarium and grew some of the plants which overran my mother’s garden in Campinas, Brazil, before we returned to America.

I got the plants to thrive. I had a glass walled box that was just like Brazil until I left the lid off one day. The plants died. Whatever lived in the terrarium probably assumed the real world of Illinois in January was just like a tropical clime.

Bzzzz. Wrong.

Stephen E Arnold, May 24, 2015

Maana from Heaven: Sustaining Big Data Search

May 23, 2015

Need to search Big Data in Hadoop? Other data management systems? Maana is now ready to assist you. Fresh from stealth mode, the company received an infusion of venture capital which now totals $14.2 million. (You may have to pay to access the details of this cash injection.) Maana garnered only a fraction of the money pumped into search vendors Attivio ($71 million), Coveo ($34 million) or Palantir (hundreds of millions). But Maana has some big name backers; for example, GE Ventures and Intel Capital, among others.

Maana’s manna looks a lot like legal tender.

According to the company:

Maana is pioneering new search technology for big data. It helps corporations drive significant improvements in productivity, efficiency, safety, and security in the operations of their core assets.

This value proposition strikes me as familiar.

Maana is ready to enable customers to perform knowledge modeling, evaluation, data understanding, data shaping, and orchestration. Differentiation is likely to be a challenge. The company offers this diagram to assist prospects in understanding why Maana is different from other Big Data search solutions:

image

Image from www.maana.com

A key differentiator is that the company says:

Maana is not based on open source Solr/Lucene.

That should chop out the LuceneWorks (Really?) and other open source Big Data options in a competitive fray.

Will Maana’s positioning tactic thwart other proprietary Big Data information access solutions? Hewlett Packard, are you ready to rumble? Oracle. Wait. Oracle is always ready to rumble. Google and In-Q-Tel backed Recorded Future? Oops. Recorded Future is jammed with work and inquiries as I understand it. Whatever. Let the proprietary Big Data search Copa de Data off begin.

Stephen E Arnold, May 23, 2015
