Search 2020: Peering into the Future of Information Access
May 22, 2015
Shifts in search, user behaviors, and marketing are transforming bread-and-butter keyword search. Quite to my surprise, one of my two or three readers wrote to one of the goslings with a request. In a nutshell, the reader wanted my view of a write up which appeared in the TDWI online publication. TDWI, according to the Web site, is “your source for in depth education and research on all things data.” Okay, I can relate to a categorical affirmative: education, research, and data.
The article has a title which tickles my poobah bone: “The Future of Search.” The poobah bone is the part of the anatomy which emits signals about the future. I look at a new search system based on Lucene and other open source technology. My poobah bone tingles. Lots of folks have poobah bones, but these constructs of nerves and tissues are most highly developed in entrepreneurs who invent new ways to locate information, venture capitalists who seek the next Google, and managers who are hired to convert information access into billions and billions of dollars in organic revenue.
The write up identifies three predictions about drivers on the information retrieval utility access road:
- Big Data
- Cloud infrastructure
- Analytics.
Nothing unfamiliar in these three items. Each shares a common characteristic: None has a definition which can be explained in a clear concise way. These are the coat hooks in the search marketers’ cloakroom. Arguments and sales pitches are placed on these hooks because each connotes a “new” way to perform certain enterprise computer processes.
But what about these drivers: Mobile access, just-in-time temporary/contract workers, short attention spans of many “workers”, video, images, and real time information requirements? Perhaps these are subsets of the Big Data, cloud, and analytics generalities, but maybe, just maybe, could these realities be depleted uranium warheads when it comes to information access?
These are the present. What is the future? Here’s a passage I highlighted:
Enterprise search in 2020 will work much differently than it does today. Apple’s Siri, IBM’s Watson, and Microsoft’s Cortana have shown the world how enterprise search and text analytics can combine to serve as a personal assistant. Enterprise search will continue to evolve from being your personal assistant to being your personal advisor.
How are these systems actually working in noisy automobiles or in the kitchen?
I know that the vendors I profiled in CyberOSINT: Next Generation Information Access are installing systems which perform this type of content processing. The problem, as I point out in CyberOSINT, is that search is, at best, a utility. The heavy lifting comes from collection, automated content processing, and various output options. One of the most promising is to deliver specific types of outputs to both humans and to other systems.
The future does tailor information to a person or to a unit. Organizations are composed of teams of teams, a concept now getting a bit more attention. The idea is not a new one. What is important is that next generation information access systems operate in a more nuanced manner than a list of results from a Lucene based search query.
The article veers into an interesting high school teacher type application of Microsoft’s spelling and grammar checker. The article suggests that the future of search will be to alert the system user that his or her “tone” is inappropriate. Well, maybe. I turn off these inputs from software.
The future of search involves privacy issues which have to be “worked out.” No, privacy issues have been worked out via comprehensive, automated collection. The issue is how quickly organizations will make use of the features automated collection and real time processing deliver. Want to eliminate the risk of insider trading? Want to identify bad actors in an organization? One can, but this is not a search function. This is an NGIA function.
The write up touches on a few of the dozens of issues implicit in the emergence of next generation information access systems. But NGIA is not search. NGIA systems are a logical consequence of the failures of enterprise search. These failures are not addressed with generalizations. NGIA systems, while not perfect, move beyond the failures, disappointments, and constant legal hassles search vendors have created in the last 40 years.
My question, “What is taking so long?”
Stephen E Arnold, May 22, 2015
Lexmark Buys Kofax: Scanning and Intelligence Support
May 22, 2015
Short honk: I was fascinated when Lexmark purchased Brainware and ISYS Search Software a couple of years ago. Lexmark, based in Lexington, Kentucky, used to be an IBM unit. That, it seems, did not work out as planned. Now Lexmark is in the scanning and intelligence business. Kofax converts paper to digital images. Think health care and financial services, among other paper centric operations. Kofax bought Kapow, a Danish outfit that provides connectors and ETL software and services. ETL means extract, transform, and load. Kapow is a go to outfit in the intelligence and government services sector. You can read about Lexmark’s move in “Lexmark Completes Acquisition of Kofax, Announces Enterprise Software Leadership Change.”
According to the write up:
- This was a billion dollar deal
- The executive revolving door is spinning.
In my experience, it is easier to spend money than to make it. Will Lexmark be able to convert these content processing functions into billions of dollars in revenue? Another good question to watch the company try to answer in the next year or so. Printers are a tough business. Content processing may be even more challenging. But it is Kentucky. Long shots get some love until the race is run.
Stephen E Arnold, May 22, 2015
Peruse Until You Are Really Happy
May 22, 2015
Have you ever needed to quickly locate a file that you just know you made, but were unable to find it on your computer, cloud storage, tablet, smartphone, or company pool drive? What is even worse is if your search query does not pick up on any of your keywords! What are you supposed to do then? VentureBeat might have the answer to your problems as explained in the article, “Peruse Is A New Natural Language Search Tool For Your Dropbox And Box Files.” Peruse is a search tool that lets users find their files and information by asking questions in natural, conversational language.
Natural language querying is already a big market for business intelligence software, but it is not as common in file sharing services. Peruse is a startup with the ability to search Dropbox and Box accounts using a regular question. If you ask, “Where is the marketing data from last week?” the software will pull the file for you without your having to open it. Right now, Peruse can only find information in spreadsheets, but the company is working on expanding the supported file types.
“The way we index these files is we actually look at them visually — it understands them in a way a person would understand them,” said [co-founder and CEO Luke Gotszling], who is showing off Peruse…”
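Peruse’s internals are not public. As a toy illustration only (the file names, metadata fields, matching rules, and hard-coded “today” below are all invented for the sketch), a natural language file lookup can be reduced to filtering file metadata against phrases pulled from the question:

```python
import datetime

# Invented sample metadata; a real system would index actual Dropbox/Box files.
FILES = [
    {"name": "marketing_q1.xlsx", "topic": "marketing",
     "modified": datetime.date(2015, 5, 15)},
    {"name": "payroll.xlsx", "topic": "finance",
     "modified": datetime.date(2015, 3, 1)},
]

def find_files(question, today=datetime.date(2015, 5, 22)):
    """Naive sketch: match topic words and rough time phrases in the question."""
    q = question.lower()
    hits = FILES
    if "marketing" in q:
        hits = [f for f in hits if f["topic"] == "marketing"]
    if "last week" in q:
        # Loose interpretation: anything modified in the last two weeks.
        cutoff = today - datetime.timedelta(days=14)
        hits = [f for f in hits if f["modified"] >= cutoff]
    return [f["name"] for f in hits]

print(find_files("Where is the marketing data from last week?"))
```

The real product presumably does far more (visual understanding of spreadsheets, per the quote above), but the skeleton of question-to-filter translation is the same.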
Peruse’s goal is to change the way people use document search. Document search has remained pretty consistent since 1995; twenty years later, Gotszling believes it is time for a big change. Gotszling is right: document search remains the same, while Web search changes every day.
Whitney Grace, May 22, 2015
Stephen E Arnold, Publisher of CyberOSINT at www.xenky.com
Is Collaboration the Key to Big Data Progress?
May 22, 2015
The article titled “Big Data Must Haves: Capacity, Compute, Collaboration” on GCN offers insights into the best areas of focus for big data researchers. The Internet2 Global Summit is in D.C. this year with many exciting panelists who support the emphasis on collaboration in particular. The article mentions the work being presented by several people, including Clemson professor Alex Feltus:
“…his research team is leveraging the Internet2 infrastructure, including its Advanced Layer 2 Service high-speed connections and perfSONAR network monitoring, to substantially accelerate genomic big data transfers and transform researcher collaboration…Arizona State University, which recently got 100 gigabit/sec connections to Internet2, has developed the Next Generation Cyber Capability, or NGCC, to respond to big data challenges. The NGCC integrates big data platforms and traditional supercomputing technologies with software-defined networking, high-speed interconnects and visualization for medical research.”
Arizona’s NGCC provides the essence of the article’s claims, stressing capacity with Internet2, several types of computing, and of course collaboration between everyone at work on the system. Feltus commented on the importance of cooperation in Arizona State’s work, suggesting that personal relationships outweigh individual successes. He claims his own teamwork with network and storage researchers helped him find new potential avenues of innovation that might not have occurred to him without thoughtful collaboration.
Chelsea Kerwin, May 22, 2015
Stephen E Arnold, Publisher of CyberOSINT at www.xenky.com
Expert System Connects with PwC
May 21, 2015
Short honk: Expert System has become a collaborator with PwC, the accounting and professional services firm (its consulting arm was acquired by IBM in 2002). The article points out:
Expert System, a company active in the semantic technology for information management and big data listed on Aim Italy, announced the collaboration with PwC, global network of professional services and integrated audit, advisory, legal and tax, for Expo Business Matching, the virtual platform of business meetings promoted by Expo Milano 2015, the Chamber of Commerce of Milan, Promos, Fiera Milano and PwC.
Expert System is a content processing developer with offices in Italy and the United States.
Stephen E Arnold, May 21, 2015
Big Data: The Shrinky Dink Approach
May 21, 2015
I read “To Handle Big Data, Shrink It.” Years ago I did a job for a unit of a blue chip consulting firm. My task was to find a technology which allowed a financial institution to query massive data sets without bringing the computing system to its knees and causing the on-staff programmers to howl with pain.
I located an outfit in what is now someplace near a Prague-like location. The company was CrossZ, and it used a wonky method of compression and a modified version of SQL with a point and click interface. The idea was that a huge chunk of the bank’s data (for instance, the transactions in the week before Mother’s Day) could be queried for purchasing-related trends. Think fraud. Think flowers. Think special promotions that increased sales. I have not kept track of the low profile, secretive company. I did think of it when I read the “shrink Big Data” story.
This passage resonated and sparked my memory:
MIT researchers will present a new algorithm that finds the smallest possible approximation of the original matrix that guarantees reliable computations. For a class of problems important in engineering and machine learning, this is a significant improvement over previous techniques. And for all classes of problems, the algorithm finds the approximation as quickly as possible.
The point is that it is now 2015 and a mid 1990s notion seems to be fresh. My hunch is that the approach will be described as novel, innovative, and a solution to the problems Big Data poses.
Perhaps the MIT approach is those things. For me, the basic idea is that Big Data has to be approached in a rational way. Otherwise, how will queries of “Big Data” which has already been processed, plus a stream of new or changed “Big Data,” be handled in a way that is affordable, is computable, and is meaningful to a person who has no clue what is “in” the Big Data?
Fractal compression, recursive methods, mereological techniques, and other methods are a good idea. I am delighted with the notion that Big Data has to be made small in order to be more useful to companies with limited budgets and a desire to answer some basic business questions with small data.
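The MIT algorithm itself is not spelled out in the article, so as a hedged stand-in, the classic rank-k truncated SVD shows what “the smallest possible approximation of the original matrix” means in the low-rank setting: a few factors replace the full matrix while preserving the computations that matter.

```python
import numpy as np

def truncated_svd_approx(A, k):
    """Best rank-k approximation of A in the least-squares sense.

    This is a textbook stand-in, not the MIT algorithm described above.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep only the k largest singular values and their vectors.
    return U[:, :k] * s[:k] @ Vt[:k, :]

rng = np.random.default_rng(0)
# A "big" 200 x 100 matrix that is secretly rank 5.
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 100))
A_small = truncated_svd_approx(A, 5)

# Relative error is near machine precision, yet the rank-5 factors
# store roughly (200 + 100) * 5 numbers instead of 200 * 100.
err = np.linalg.norm(A - A_small) / np.linalg.norm(A)
```

The shrinking pays off exactly when the data has this kind of hidden structure, which is the bet both the mid-1990s CrossZ approach and the 2015 research make.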
Stephen E Arnold, May 21, 2015
Decrease from TrueView
May 21, 2015
The article on Business Insider titled “Google Has a New and Unexpected Explanation for Its Falling Ad Rates” places the blame on YouTube’s “TrueView” video ads. For some time there has been concern over Google’s falling cost-per-click (CPC), the amount earned each time a user clicks on an ad. The first quarter of this year saw CPC down 7%. The article quotes outgoing Google CFO Patrick Pichette on the real reason for these numbers. He states,
“TrueView ads currently monetize at a lower rate than ad clicks on Google.com. As you know, video ads generally reach people earlier in the purchase funnel, and so across the industry, they tend to have a different pricing profile than that of search ads,” Pichette explained. “Excluding the impact of YouTube TrueView ads, growth in Sites clicks would be lower, but still positive and CPCs would be healthy and growing Y/Y,” Pichette continued.
It is often thought that the increasing dependence on mobile internet access through smartphones is the reason for falling CPC. Google can’t charge as much for mobile ads as for PC ads, making it a logical leap that this is the area of concern. Pichette offers a different view, and one with an entirely positive spin.
Chelsea Kerwin, May 21, 2015
Stephen E Arnold, Publisher of CyberOSINT at www.xenky.com
Long-term Plans for SharePoint
May 21, 2015
Through all the iterations of SharePoint, it seems that Microsoft has wised up and is finally giving customers more of what they want. The release of SharePoint Server 2016 shows a shift back toward on-premises installations, and yet there will still be functions supported through the cloud. This new hybrid emphasis provides a third pathway through which users are experiencing SharePoint. The CMS Wire article, “3 SharePoint Paths for the Next 10 Years,” covers all the details.
The article begins:
“Microsoft Office 365 has proven to be a major disruption of how companies use SharePoint to meet business requirements. Rumors, fear, uncertainty and doubt proliferate around Microsoft’s plans for SharePoint’s future releases, as well as the support of critical features and functionality companies rely on . . . So, taking into account Office 365, the question is: How will companies be using SharePoint over the next 10 years?”
Stephen E. Arnold of ArnoldIT.com is a leader in SharePoint, with a lifelong career in search. His SharePoint feed is a great resource for users and managers alike, or anyone who needs to keep on top of the latest developments. It may be that the hybrid solution is a way to keep on-premises users happy while they still benefit from the latest cloud functions like Delve and OneDrive.
Emily Rae Aldridge, May 21, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Make Mine Mobile Search
May 21, 2015
It was only a matter of time, but Google searches on mobile phones and tablets have finally pulled ahead of desktop searches, says The Register in “Peak PC: ‘Most’ Google Web Searches ‘Come From Mobiles’ In US.” Google AdWords product management representative Jerry Dischler said that more Google searches now take place on mobile devices than on desktops in ten countries, including the US and Japan. Google owns 92.22 percent of the mobile search market and 65.73 percent of desktop searches. What do you think Google wants to do next? They want to sell more mobile apps!
The article says that Google has not shared data about the ten countries beyond the US and Japan, nor the size of the search differential between platforms. Google, however, is trying to get more people to buy more ads, and the search engine giant is making the technology and tools available:
“Google has also introduced new tools for marketers to track their advertising performance to see where advertising clicks are coming from, and to try out new ways to draw people in. The end result, Google hopes, is to bring up the value of its mobile advertising business that’s now in the majority, allegedly.”
Mobile ads are apparently cheaper than desktop ads, so Google will see lower revenues. What will probably happen is that as more users transition to making purchases via phones and tablets, ad revenue will increase via mobile platforms.
Whitney Grace, May 21, 2015
Stephen E Arnold, Publisher of CyberOSINT at www.xenky.com
Semantic Search: A Return to Hieroglyphics
May 20, 2015
I am so out of date, lost in time, and dumb that I experienced a touch of nausea when I read “Feeligo Expands Semantic Search for Branded Online Stickers.” Feeligo, I learned, is “a leading provider of branded stickers for online conversations.”
Source: http://blog.feeligo.com/wp-content/uploads/2014/06/vick_export_04.png
The leap from a sticker to semantic search is dazzling. According to the write up, Feeligo has 500 million users. These folks are doing semantic search. How does this sticker-semantic marriage work? The article says:
Feeligo has developed a platform that capitalizes on the growing awareness of marketing to online and mobile users through social conversations, including comment forums and user forums. Feeligo offers clients a plug-and-play solution for all messaging services, complete with generic and branded stickers, which are installed on client sites. Through Feeligo’s semantic recommendation algorithms, direct matches between words and phrases in users’ text conversations and stickers are made, enabling users to quickly find the appropriate sticker for a user’s message.
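Feeligo does not document its recommendation algorithms, so the following is a deliberately naive sketch of the idea the passage describes (the sticker names and tag sets are invented): map words in a conversation to tagged stickers and surface the matches.

```python
# Invented catalog for illustration; a real system would have branded
# stickers with richer semantic tags, not just literal keywords.
STICKER_TAGS = {
    "congrats.png": {"congrats", "congratulations", "bravo"},
    "pizza.png": {"pizza", "hungry", "dinner"},
}

def suggest_stickers(message):
    """Return stickers whose tags overlap the words in the message."""
    words = set(message.lower().split())
    return [name for name, tags in STICKER_TAGS.items() if words & tags]

print(suggest_stickers("Congrats on the launch!"))
```

Anything more "semantic" than this (synonyms, phrases, context) is where the vendor's proprietary algorithms presumably come in.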
I have watched enterprise search vendors distort language in their remarkable attempts to generate sales. I have watched the search engine optimization crowd trash relevance and then embrace the jargon of RDF and Owl. I have now seen how purveyors of digital stickers have tapped semantic technology to make hieroglyphics a brand message technique.
Does anyone notice that a digital sticker is a cartoon empowered to generate three views every second? Do these sticker consumers consider dipping into William James or Charles Dickens? Nah, no stickers for that irrelevant material.
Stephen E Arnold, May 20, 2015