The Noble Quest Behind Semantic Search

November 25, 2016

A brief write-up at the ontotext blog, “The Knowledge Discovery Quest,” presents a noble vision of the search field. Philologist and blogger Teodora Petkova observed that semantic search is the key to bringing together data from different sources and exploring connections. She elaborates:

On a more practical note, semantic search is about efficient enterprise content usage. As one of the biggest losses of knowledge happens due to inefficient management and retrieval of information. The ability to search for meaning not for keywords brings us a step closer to efficient information management.

If semantic search had a separate icon from the one traditional search has it would have been a microscope. Why? Because semantic search is looking at content as if through the magnifying lens of a microscope. The technology helps us explore large amounts of systems and the connections between them. Sharpening our ability to join the dots, semantic search enhances the way we look for clues and compare correlations on our knowledge discovery quest.

At the bottom of the post is a slideshow on this “knowledge discovery quest.” Sure, it also serves to illustrate how ontotext could help, but we can’t blame them for drumming up business through their own blog. We actually appreciate the company’s approach to semantic search, and we’d be curious to see how they manage the intricacies of content conversion and normalization. Founded in 2000, ontotext is based in Bulgaria.

Cynthia Murrell, November 25, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Alphabet Google: AI Acceleration

November 24, 2016

The Alphabet Google thing is upping the amps in its quest to be the big dog in artificial intelligence. After years of IBM public relations Watsonage, the GOOG wants the world to know that it is the leader in smart software. There is some news zipping around the fact checked, ever accurate online information sources; for example:

How does one know that Google’s smart software is really smart and not a response to the IBM Watson assertions?

Easy question. Google will make online demos available at AI Experiments, a new Web site.


You can “explore machine learning by playing with pictures, language, music, code, and more.” You can “visualize high dimensional space” which is not an easy trick for some folks here in rural Kentucky. Also, you can see “what neural networks see.” Well, sort of. If you are a coder, you can submit your smart software using Google goodies as well.

My recollection is that Google has been doing smart software for almost two decades. What’s new is the PR-ification of Google’s effort to reduce costs, create new services, and remain the top technology dog.

And what about search? Hey, that’s not part of the agenda unless you count Google’s intent to become the travel question answering machine. Precision, recall, relevance: Google has that baked in alongside its ads.

Stephen E Arnold, November 24, 2016

Keeping Current with Elastic.co

November 24, 2016

Short honk. If you want to keep up with Elastic and Elasticsearch, the company’s “This Week in Elasticsearch and Apache Lucene” may be of interest. The weekly posting includes information about commits, releases, and training. Unlike the slightly crazed, revenue challenged open source search vendors, Elastic.co provides factual information about the plumbing for the search and retrieval system. We found the “Ongoing Changes” section useful and interesting. The idea is that one can keep track of certain features, methods, and issues by scanning a list. The short description of an issue, for instance, includes a link to additional information. Highly recommended for those hooked on Elastic.co’s free and open source solution or the for fee products and services the company offers.

Stephen E Arnold, November 24, 2016

Do Not Forget to Show Your Work

November 24, 2016

Showing work is a messy but necessary step to prove how one arrived at a solution. Most of the time it is never reviewed, but with big data, people wonder how computer algorithms arrive at their conclusions. Engadget explains that computers are being forced to prove their results in “MIT Makes Neural Networks Show Their Work.”

Understanding neural networks is extremely difficult, but MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) has developed a way to map the complex systems. CSAIL tackled the task by splitting a network into two smaller modules. The first extracts text segments and scores them according to their length and accordance; the second predicts each segment’s subject and attempts to classify it. The mapping modules sound almost as complex as the actual neural networks. To alleviate the stress and add a giggle to their research, CSAIL had the modules analyze beer reviews:

For their test, the team used online reviews from a beer rating website and had their network attempt to rank beers on a 5-star scale based on the brew’s aroma, palate, and appearance, using the site’s written reviews. After training the system, the CSAIL team found that their neural network rated beers based on aroma and appearance the same way that humans did 95 and 96 percent of the time, respectively. On the more subjective field of “palate,” the network agreed with people 80 percent of the time.

One set of data is as good as another to test CSAIL’s network mapping tool. CSAIL hopes to fine-tune the machine learning project and use it in breast cancer research to analyze pathologist data.
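The two-module split described above can be illustrated with a toy sketch. This is not CSAIL's neural method: the word weights, the length penalty, and the function names `extract_segment` and `predict_stars` are all invented for illustration. The point is only that the rating step sees nothing but the short segment the first step hands over, so the segment doubles as the explanation.

```python
# Toy stand-in for the two-module idea. The word weights are invented
# for illustration and are not taken from the CSAIL work.
AROMA_WEIGHTS = {
    "citrus": 1.0,
    "floral": 0.8,
    "pine": 0.6,
    "stale": -1.0,
    "musty": -0.8,
}


def extract_segment(words, max_len=4):
    """Module 1: pick the short contiguous span with the strongest signal."""
    best_span, best_score = (0, 0), float("-inf")
    for start in range(len(words)):
        for end in range(start + 1, min(start + max_len, len(words)) + 1):
            signal = sum(abs(AROMA_WEIGHTS.get(w, 0.0)) for w in words[start:end])
            score = signal - 0.1 * (end - start)  # penalize longer spans
            if score > best_score:
                best_span, best_score = (start, end), score
    return words[best_span[0]:best_span[1]]


def predict_stars(segment):
    """Module 2: rate the aroma using only the extracted segment."""
    total = sum(AROMA_WEIGHTS.get(w, 0.0) for w in segment)
    return max(1, min(5, round(3 + total)))


review = "pours a hazy gold with a bright citrus and floral nose".split()
rationale = extract_segment(review)  # the "shown work"
stars = predict_stars(rationale)     # the rating derived from it
```

Here the extracted span is "citrus and floral," and the rating is computed from those three words alone. A real system would learn both modules from data rather than rely on a hand-written word list.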

Whitney Grace, November 24, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Dawn of Blockchain Technology

November 24, 2016

Although blockchain technology currently powers Bitcoin and other cryptocurrencies, it may soon find takers in mainstream commercial activities.

Blockgeeks, in an in-depth guide titled “What Is Blockchain Technology? A Step-By-Step Guide for Beginners,” says:

The blockchain is an incorruptible digital ledger of economic transactions that can be programmed to record not just financial transactions but virtually everything of value.

Without getting into how the technology works, it would be interesting to know how and where the revolutionary technology can be utilized. Because it is decentralized and inherently resistant to corruption by human intervention, blockchain has numerous applications in banking, remittances, the sharing economy, crowdfunding, and many more fields. The list is endless.
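For readers who do want a peek under the hood, the “incorruptible ledger” idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how Bitcoin or any production blockchain is implemented: the helper names `block_hash`, `append_block`, and `is_valid` are hypothetical, and the sketch leaves out distribution, consensus, and mining entirely. It shows only why editing an old record becomes detectable when each record carries a hash of the one before it.

```python
import hashlib
import json


def block_hash(index, data, prev_hash):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    """Add a record that is linked to the block before it."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append(
        {
            "index": index,
            "data": data,
            "prev_hash": prev_hash,
            "hash": block_hash(index, data, prev_hash),
        }
    )


def is_valid(chain):
    """Recompute every hash; any edited record breaks the links."""
    prev_hash = "0" * 64
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["data"], prev_hash):
            return False
        prev_hash = block["hash"]
    return True


ledger = []
append_block(ledger, {"from": "alice", "to": "bob", "amount": 5})
append_block(ledger, {"from": "bob", "to": "carol", "amount": 2})
valid_before = is_valid(ledger)

ledger[0]["data"]["amount"] = 500  # tamper with an old record
valid_after = is_valid(ledger)
```

The ledger checks out before the edit and fails afterward, because the stored hash of the first block no longer matches its contents. “Tamper-evident” is arguably a more precise word than “incorruptible” for what this property delivers.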

The technology will be especially helpful for people who transact over the Web and as the article points out:

Goldman Sachs believes that blockchain technology holds great potential especially to optimize clearing and settlements, and could represent global savings of up to $6bn per year.

Governments and commercial establishments, however, are apprehensive about it because blockchain might end their control over a multitude of things, simply because it never stores data in one location. This is also the reason why Bitcoin has yet to gain full acceptance. But can a driving force like blockchain technology, which will empower actual users, be stopped?

Vishal Ingole, November 24, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Hitachi Digs into Enterprise Search

November 23, 2016

Hitachi Data Systems has embraced “content intelligence.” My recollection is that the “search” underlying the Hitachi Content Platform is Perfect Search, a proprietary system which emphasized its performance features, not its ease of use for system administrators.

“Hitachi Adds Enterprise Search to Object Store” informs me that:

Hitachi Data Systems today debuted Content Intelligence, a new offering that adds a slew of enterprise search and analytic capabilities to its object-based file system.

Slew?

The system supports multi tenant, cloud scale deployments. The block diagram for the system looks like this:

image

According to a Hitachi professional, the new system will be “invaluable.” That is, I presume, a “slew” of value.

Hitachi was the second best system for object storage according to the big moon, mid tier consulting firm Gartner Group. The number one system was IBM Watson’s cell mate CleverSafe dsNet. (This is not the IBM Almaden Clever system for relevance determination.)

Other features, in addition to search, are a cloud gateway component, a file synchronization tool, and the ability to share access. For more information about the system, you can read “Better Object Storage with Hitachi Content Platform 2014.”

Stephen E Arnold, November 23, 2016

Hear That Bing Ding: A Warning for Google Web Search

November 23, 2016

Bing. Bing. Bing. The sound reminds me of a broken elevator door in the Block & Kuhl when I was but a wee lad. Bing. Bing. Bing. Annoying? You bet.

I read “Microsoft Corporation Can Defeat Alphabet Inc in Search.” I enjoy these odd, disconnected from the real world write ups predicting that Microsoft will trounce Google in a particular niche. This particular write up seizes upon the fluff about Microsoft having an “intelligence fabric.” Then, with a spectacular leap which ignores the fact that more than 90 percent of the humans use Google Web search, the write up suggests that Bing will be the next big thing in Web search.

Get real.

Bing, after two decades of floundering, allegedly is profitable. No word on how long it will take to pay back the money Microsoft has invested in Web search over these 4,000 days of stumbling.

I highlighted this passage in the write up:

Rik van der Kooi, corporate vice president of Microsoft Search Advertising, referred to Bing as an “intelligence fabric” that has been embedded into Windows 10, Cortana, Xbox and other products, including Hololens. He went on to say the future Bing will be personal, pervasive and offer a personal experience so much that it “might not be obvious users are even interacting with the search engine.”

I think I understand. Microsoft is everywhere. Microsoft Bing is embedded. Therefore, Microsoft beats Google Web search.

Great thinking.

I do like this passage:

This is a bold call considering that Google owned 89.38% of the global desktop search engine market, while Microsoft owned 4.2% as of July 2016, according to data provided by Statista. With MSFT’s endeavors to create an integrated ecosystem, however, the long-term scale is tipping in the favor of Microsoft stock. That’s because Microsoft’s traditional business is entrenched into many people’s lives as well as business operations. For instance, the majority of desktop devices run on Windows.

Yep, there are lots of desktops still. However, there are more mobile devices. If I am not mistaken, Google’s Android runs more than 80 percent of these devices. Add desktop and mobile and what do you get? No dominance of Web search by Bing the way I understand the situation.

Sure, I love the Bing thing. I have some affection for Qwant.com, Yandex.com, and Inxight.com too. But Microsoft has yet to demonstrate that it can deliver a Web search system which is able to change the behaviors of today’s users. Look at the Google in the word processing space. Microsoft continues to have an edge and Google has been trying for more than a decade to make Word an afterthought. That hasn’t happened. Inertia is a big factor.

Search for growing market share on Bing. What’s that answer look like? Less than five percent of the Web search market? Oh, do that query on Google by the way.

Stephen E Arnold, November 23, 2016

Writing That Is Never Read

November 23, 2016

In college, you were inevitably forced to write an essay. Writing an essay usually requires citing various sources from scholarly journals. As you perused the academic articles, the thought probably crossed your mind: who ever reads this stuff? Smithsonian Magazine tells us who in the article “Academics Write Papers Arguing Over How Many People Read (And Cite) Their Papers.” In other words, themselves.

Academic articles are read mostly by their authors, journal editors, and students forced to cite them for assignments. In perfect scholarly fashion, many academics do not believe that their work has a limited scope. So what do they do? They write about it, and they have done so for twenty years.

Most academics are not surprised that most written works go unread.  The common belief is that it is better to publish something rather than nothing and it could also be a requirement to keep their position.  As they are prone to do, academics complain about the numbers and their accuracy:

It seems like this should be an easy question to answer: all you have to do is count the number of citations each paper has. But it’s harder than you might think. There are entire papers themselves dedicated to figuring out how to do this efficiently and accurately. The point of the 2007 paper wasn’t to assert that 50 percent of studies are unread. It was actually about citation analysis and the ways that the internet is letting academics see more accurately who is reading and citing their papers. “Since the turn of the century, dozens of databases such as Scopus and Google Scholar have appeared, which allow the citation patterns of academic papers to be studied with unprecedented speed and ease,” the paper’s authors wrote.

Academics always need something to argue about, no matter how minuscule the topic. This particular article concludes on the note that someone should get the number straight so academics can move on to another item to argue about. Going back to the original thought, a student forced to write an essay with citations probably also thought: the reason this stuff does not get read is that it is so boring.

Whitney Grace, November 23, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Android Has No Competition in Mobile OS Market

November 23, 2016

Google’s Android OS currently powers roughly 88 percent of the smartphones in the world, leaving a minuscule 12.1 percent to Apple’s iOS and the remaining 0.3 percent to Windows Mobile, BlackBerry OS, and Tizen.

IBTimes, in an article titled “Android Rules! 9 out of Every 10 Phones Run Google’s OS,” says:

Google’s Android OS dominated the world by powering 88 percent of the world’s smartphone market in the third quarter of 2016. This means 9 out of every 10 mobile phones in the world are using Android, while the rest rely on iOS or other mobile OS such as BlackBerry OS, Tizen and Windows Phone.

The growth occurred despite the fact that smartphone shipments are falling. China and Africa, which were big markets, have performed poorly for the last three quarters. Android’s gain can thus be attributed to the fact that Android is an open source system that can be used by any device manufacturer.

Despite being the clear leader, the mobile OS is full of bugs and other inherent problems, as the article points out:

Android platform is getting overcrowded with hundreds of manufacturers, few Android device vendors make profits, and Google’s new Pixel range is attacking its own hardware partners that made Android popular in the first place.

At present, Samsung, Huawei, Oppo and Vivo are the leading Android phone makers. However, Google recently unveiled Pixel, its flagship phone for the premium category. Does this mean that Google has its eyes set on the premium handset market? Only time will tell.

Vishal Ingole, November 23, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

HonkinNews for November 22, 2016 Now Available

November 22, 2016

This week’s HonkinNews talks about a Thanksgiving surprise for Verizon AOL professionals. Shakespeare’s idea about what to do with lawyers is revisited with a 21st century twist: Hit the delete key. Is there disappearing information in the Google index? We report on one interesting unfindable. Video search may improve, but for now, it’s a meh. HonkinNews points out that “fake news” is now a thing and offers three variations on spoofing the “experts.” We identify the two French enterprise search vendors who have transformed themselves into artificial intelligence vendors in a remarkable demonstration of their flexibility. IBM Watson’s new initiative hopes to get “into” a host of new revenue opportunities. You can view the program at this link. The special Google Trilogy programs filmed live in Harrod’s Creek become available starting on December 20, 2016. There are three seven minute videos in the series, which summarizes the principal findings from The Google Legacy (2005), Google Version 2.0 (2007), and Google: The Digital Gutenberg (2009). The monographs are out of print, but the information remains timely as Alphabet Google spells out its future.

Kenny Toth, November 22, 2016
