Microsoft Bing in Edge is Baidu: Confused?

September 24, 2015

I received an alert about Bing. I usually ignore these. The headline did not reference search. The article is billed as “Windows 10 in China.” I am not sure why I scanned the item, but I noted that the Microsoft blog post contained an interesting factoid about Bing search.

Here’s the passage I noted:

Together [Baidu and Microsoft], we will make it easy for Baidu customers to upgrade to Windows 10 and we will deliver a custom experience for customers in China, providing local browsing and search experiences. Baidu.com will become the default homepage and search for the Microsoft Edge browser in Windows 10.

I wondered if I understood the message. The Windows 10 browser, called Edge, will include a Web and local search function. The search is going to be provided by Baidu for “local browsing and search experiences.”

I find this interesting because it raises two questions: Is Bing, assisted by a search wizard from Australia, now “funneling” queries to Baidu? And has Microsoft given up on the job of indexing Chinese-language content?

I recall reading “About Microsoft Research Asia,” and learning that one of the goals for Microsoft’s expanding research activities in Asia was:

Search and online advertising takes Web search and online advertising to the next level by applying data-mining, machine-learning and knowledge-discovery techniques to information analysis, organization, retrieval and visualization.

Now the company is relying on a third party for search. Is this a signal that Bing is not up to the search and retrieval job in China?

Stephen E Arnold, September 24, 2015

Rundown on Legal Knowledge Management

September 24, 2015

One of the new legal buzzwords is knowledge management, and not just old-fashioned knowledge management, but the quick, efficient, and effective kind. Time is an expensive commodity for legal professionals, especially given the amount of data they have to sift through for cases. Mondaq explains the importance of knowledge management for law professionals in the article, “United States: A Brief Overview Of Legal Knowledge Management.”

Knowledge management started as an effort to create an effective process for managing, locating, and searching relevant files, but it quickly evolved into implementing document management systems. While knowledge management companies offered law practices decent document management software to tackle the data hill, an even bigger problem arose. Law practices needed a dedicated person to serve as the software expert:

“Consequently, KM emphasis had to shift from finding documents to finding experts. The expert could both identify useful documents and explain their context and use. Early expertise location efforts relied primarily on self-rating. These attempts almost always failed because lawyers would not participate and, if they did, they typically under- or over-rated themselves.”

The biggest problem law professionals face is that they might invest a small fortune in a document management license, but they do not know how to use the software or do not have the time to learn. It is a reminder that someone might have all the knowledge and the best tools at their fingertips, but unless people know how to access and use them, the knowledge is useless.

Whitney Grace, September 24, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Funding Granted for American Archive Search Project

September 23, 2015

Here’s an interesting project: we received an announcement about funding for Pop Up Archive: Search Your Sound. A joint effort of the WGBH Educational Foundation and the American Archive of Public Broadcasting, the venture’s goal is nothing less than to make almost 40,000 hours of Public Broadcasting media content easily accessible. The American Archive, now under the care of WGBH and the Library of Congress, has digitized that wealth of sound and video. Now, the details are in the metadata. The announcement reveals:

As we’ve written before, metadata creation for media at scale benefits from both machine analysis and human correction. Pop Up Archive and WGBH are combining forces to do just that. Innovative features of the project include:

*Speech-to-text and audio analysis tools to transcribe and analyze almost 40,000 hours of digital audio from the American Archive of Public Broadcasting

*Open source web-based tools to improve transcripts and descriptive data by engaging the public in a crowdsourced, participatory cataloging project

*Creating and distributing data sets to provide a public database of audiovisual metadata for use by other projects.

“In addition to Pop Up Archive’s machine transcripts and automatic entity extraction (tagging), we’ll be conducting research in partnership with the HiPSTAS center at University of Texas at Austin to identify characteristics in audio beyond the words themselves. That could include emotional reactions like laughter and crying, speaker identities, and transitions between moods or segments.”

The project just received almost $900,000 in funding from the Institute of Museum and Library Services. This loot is on top of the grant received in 2013, from the Corporation for Public Broadcasting, that got the project started. But will it be enough money to develop a system that delivers on-point results? If not, we may be stuck with something clunky, something that resembles the old Autonomy Virage, Blinkx, Exalead video search, or Google YouTube search. Let us hope this worthy endeavor continues to attract funding so that, someday, anyone can reliably (and intuitively) find valuable Public Broadcasting content.

Cynthia Murrell, September 23, 2015

Exalead Gets a New Application

September 22, 2015

Exalead is Dassault Systèmes’ big data software targeted specifically at businesses. Exalead offers innovative data discovery and analytics solutions to manage information in real time across various servers and generate insightful reports to make better, faster decisions. It is the big data solution of choice for many businesses across various industries. The Exalead blog shares that “PricewaterhouseCoopers Is Launching Its Information Management Application, Based on Exalead CloudView.”

PricewaterhouseCoopers (PwC) analyzed the amount of time users spent trying to locate, organize, and disseminate information. When users spend that time on information management, they lose two valuable resources: time and money. PwC designed Pulse, a search and information tool, as a solution to the problem.

“The EXALEAD CloudView software solution from Dassault Systèmes facilitates the rapid search and use of sources of structured and unstructured information. In existence since 2007, this enterprise information management concept was integrated initially in other software applications. Since it was reworked as EXALEAD CloudView, the configuration of the queries has become easier and they are processed much faster. Furthermore, the results of the searches are more precise, significantly reducing the number of duplicates and the time wasted managing them. PwC has deliberately decided to roll out Pulse on an international scale gradually, in order to generate plenty of enthusiasm amongst users. A business case is prepared for each country on the basis of its needs, the benefits and the potential savings. PwC also intends to make the content in Pulse accessible by other internal systems (e.g., the project workspaces), to integrate the sources, and to make the search function even smarter.”

Pulse is supposed to cut costs and free resources for reinvestment in more fruitful ventures. One interesting aspect to note is that PwC did not build the Pulse upgrade; Exalead provided the plumbing.

Whitney Grace, September 22, 2015

New Search System for Comparing Companies

September 22, 2015

There is a new tool out to help companies compile information on their competitors: RivalSeek. This brainchild of entrepreneur Richard Brevig seeks to combat an issue he encountered when he turned to Google while researching the market for a different project: Google’s “personalized search” filters keep users from viewing the whole landscape of any particular field. Frustration led Brevig to develop some tools of his own, which he realized might appeal to others. The site’s homepage explains simply:

“Find your competitors that Google can’t. RivalSeek’s competitor search engine looks past filter bubbles, finding competitors you’ve never heard of.”

More information can be found in Brevig’s brief introductory video on YouTube. There’s also this “quick demo,” which can be found on YouTube or playing quietly on RivalSeek’s home page. While the tool is still in Beta, Brevig is confident enough in its usefulness to charge $29 a month for access. You can find an example success story, for the Dollar Shave Club, at the company’s blog.

This is a great idea. While Google’s filter bubbles can be convenient, it is clear that confirmation bias is not their only hazard. Perhaps Brevig would be interested in expanding this tool into other areas, like science, literature, or sociology. Just a suggestion.

Cynthia Murrell, September 22, 2015

Redundant Dark Data

September 21, 2015

Have you heard the one about how dark data hides within an organization’s servers and holds potential business insights? Wait, you have not? Then where have you been for the past three years? Datameer published an SEO-heavy post on its blog called “Shine Light On Dark Data.” The post features the same redundant song and dance about how dark data retained on servers holds valuable customer trends and business patterns that can put an organization ahead of the competition.

One new fact is presented: IDC reports that 90% of digital data is dark.  That is a very interesting fact and spurs information specialists to action to get a big data plan in place, but then we are fed this tired explanation:

“This dark data may come in the form of machine or sensor logs that when analyzed help predict vacated real estate or customer time zones that may help businesses pinpoint when customers in a specific region prefer to engage with brands. While the value of these insights are very significant, setting foot into the world of dark data that is unstructured, untagged and untapped is daunting for both IT and business users.”

The post ends on some less than thorough advice to create an implementation plan.  There are other guides on the Internet that better prepare a person to create a big data action guide.  The post’s only purpose is to serve as a search engine bumper for Datameer.  While Datameer is one of the leading big data software providers, one would think they wouldn’t post a “dark data definition” post this late in the game.

Whitney Grace, September 21, 2015

The Semantic Web Has Arrived

September 20, 2015

Short honk: If you want evidence of the impact of the semantic Web, you will find “What Happened to the Semantic Web?” useful. The author captures 10 examples of the semantic Web in action. I highlighted this passage in the narrative accompanying the screenshots:

there is no question that the Web already has a population of HTML documents that include semantically-enriched islands of structured data. This new generation of documents creates a new Web dimension in which links are no longer seen solely as document addresses, but can function as unambiguous names for anything, while also enabling the construction of controlled natural language sentences for encoding and decoding information [data in context] — comprehensible by both humans and machines (bots).

Structured data will probably play a large part in the new walled gardens now under construction.

The conclusion will thrill the search engine optimization folks who want to decide what is relevant to a user’s query; to wit:

A final note — The live demonstrations in this post demonstrate a fundamental fact: the addition of semantically-rich structured data islands to documents already being published on the Web is what modern SEO (Search Engine Optimization) is all about. Resistance is futile, so just get with the program — fast!

Be happy.

Stephen E Arnold, September 20, 2015

Google Play Serves as Make Up Letter from Google to China

September 18, 2015

The VentureBeat article titled “Google’s Return to China Won’t Be Easy” discusses Google’s ambitions to revisit China with the help of Google Play, the app store for its Android mobile operating system. If you don’t remember, about five years ago Google refused to self-censor search results and pulled its services from China to boot. But Google can’t help looking longingly over its shoulder at the world’s largest Internet market. The article explains,

“Apple Inc complies with local laws and made $13.2 billion last quarter in Greater China…, making it its second-biggest market. Some in the industry doubt whether Google can use the Play store to help get its other services into China as domestic rivals are now well established and Google would have to comply with Chinese law. That would mean storing all data in China, and meeting information access and censorship requests, a thorny issue, particularly if the U.S. government gets involved.”

Obviously, China did not heed Google’s advice on reforming its approach to business and government oversight. Some argue that the focus on Google Play may make the move toward China less threatening to Chinese regulators than Google’s other services, like search and Gmail. The article suggests that the lapse in Google’s presence in the market may prove fatal to its prospects there. Many mobile players believe the market has been working just fine without Google, thank you very much. At any rate, Google’s hopes are a long shot unless it is willing to do things the Chinese way.

Chelsea Kerwin, September 18, 2015

Wabion Pairs with Twigkit to Boost User Experience

September 18, 2015

We’ve learned of an interesting alliance from this announcement at OpenPR, “Strategic Partnership Between Wabion and Twigkit in the Enterprise Search Sector.” We predict that more and fancier interfaces will arise from this deal. Wabion works closely with Google, and was named “top Google for Work Partner” in the DACH (Germany, Austria, and Switzerland) region. Now the company will bring Twigkit’s user-experience prowess to its enterprise search offerings. The press release notes:

“By providing simple building blocks for traditionally complex problems, Twigkit strikes the perfect balance between out of the box experience and fine-grained control for GSA applications. Twigkit delivers customised, elegant, search-based applications that can be delivered in a fraction of the time when compared to bespoke development. The resulting applications delivers demonstrably better results and have been proven in the most demanding scenarios. The outcome is not just a better and more efficient experience for both administrators and users alike but the opportunity to allow businesses to realise the value of their information outside of the standard keyword search and list of results approach.”

Twigkit is excited for this chance to expand into the German-speaking market, while Wabion looks forward to providing a richer UI within the Google Search Appliance.

Founded in 2009, Twigkit splits its operations between Cambridge, UK, and Milpitas, California. As of this writing, the company is looking to hire some developers and engineers. The Wabion Group was founded in 2011 and maintains offices in Germany and Austria. It is currently seeking one developer to fill a vacancy in Switzerland.

Cynthia Murrell, September 18, 2015

Oracle Revenues: Implications for HP and IBM

September 17, 2015

Oracle is an interesting company because it owns a number of enterprise search and content processing technologies. For example, decades ago, the company bought the often overlooked Artificial Linguistics. Then Oracle complemented its “Text” and “Secure Enterprise Search” technology with Triple Hop. Gentle reader, I am confident you know about Triple Hop’s clustering methods. Then in a spate of content processing fury, Oracle bought RightNow (Dutch developed indexing technology), InQuira (natural language processing crafted from two early Sillycon Valley search vendors), and Endeca, the now long in the tooth, computationally intensive “Guided Navigation” outfit. And we must not forget the retrieval functions of PL/SQL. Oracle has almost as many search and retrieval systems to nurture as that high flying OpenText outfit in Canada.

With such a backpack of information access goodies, should we expect a revenue report bursting with good news? It struck me as I read “Oracle Beats Profit Estimates by a Penny a Share but Revenue Slides” that search and retrieval may not be a zoo with golden geese.

Oracle delivered earnings which made the fine Wall Street MBAs glow. However, the revenue did not win a gold star.

Set aside Oracle for the nonce.

Think about Hewlett Packard (Autonomy stuff) and IBM (Watson stuff). Both of these outfits are reporting declining revenues too. Both have bet large sums on information access.

My question is, “Will a payoff arrive?”

My other question is, “When the payoff arrives, will it make up for the loss in revenues from old line products and services?”

My hunch is that these big bets on search are current and future ponds of despair.

Now set aside these floundering blue chips.

What about the up and coming search vendors? Life is not easy for vendors of search and content processing technology. There are some bright spots, of course, but vendors with deep roots in traditional search craziness are likely to find revenues insufficient to pay for customer support, bug fixing, and implementation of new technical methods.

Google, before its founders did an arabesque into Alphabet, figured this out with the high-interest credit card of technical debt. When will HP, IBM, and Oracle get the message?

Stephen E Arnold, September 17, 2015
