Lousy Search Results. An Attention Span Issue?

May 15, 2015

I read the enervating “Humans Have Shorter Attention Span Than Goldfish, Thanks to Smartphones.” Yep, thanks. When I am working and someone speaks to me, I often let out a squeal and twitch. I concentrate on the task at hand to the exclusion of the world. Some folks may lack this old-school concentration.

According to the write up, short attention spans are due to smartphones, not stupidity, a failure to exercise discipline over the mind, or the cranial wiring which permits one to focus. I learned:

According to scientists, the age of smartphones has left humans with such a short attention span even a goldfish can hold a thought for longer. Researchers surveyed 2,000 participants in Canada and studied the brain activity of 112 others using electroencephalograms. The results showed the average human attention span has fallen from 12 seconds in 2000, or around the time the mobile revolution began, to eight seconds.

Right, 12 seconds. That is probably enough attention for pre-Millennials. Eight seconds is too darned long to concentrate on any one thing.

Is this the next Dark Web research specialist I will hire?

When one of the people lobbying me for work whips out a smartphone, scans an iPad, and lets his or her eyes roam around the room, that’s it. No work. The goldfish has a nine second attention span. The fish I have watched in the holding tank in a Chinese restaurant in Wu Han seemed to be able to fix their attention for far longer. One red fish just hovered in place and regarded me for 30 seconds, maybe more.

Instead of hiring humans, perhaps I should go with a giant koi? Are lousy search skills an example of what happens when one cannot concentrate? Nah, blame the vendor or the IT department. Entitlement management works well.

Stephen E Arnold, May 15, 2015

Hacking a Newspaper: Distancing and Finger Pointing

May 15, 2015

I read “This Is How the Syrian Electronic Army Hacked the Washington Post.” Hacking into a company’s computer system is not something I condone. The target of the hacking is not too keen on the practice either, I assume.

One of our Twitter accounts was compromised. We contacted Twitter. Even though we knew the CTO, it took a couple of days to sort out the problem. Apparently Miley Cyrus became a fan of Beyond Search and wanted to share her photographs via the blog’s newsfeed. One reader, an Exalead professional, was quite incensed that I was pumping out Miley snaps. I assume he found a better source of search and content processing news or left the field entirely due to the shock I imparted to him. I did not objectify the hacking incident. I don’t think I mentioned it until this moment. A script from somewhere in the datasphere got lucky.

In the aforementioned write up, I noted this passage:

Th3 Pr0, one of the members of the group, confirmed to Motherboard that they were indeed the group behind the attack, which appeared to last for around 30 minutes. Th3 Pr0 said that they were able to insert the alerts by hacking into Instart Logic, a content delivery network (CDN) used by the Washington Post. “We hacked InStart CDN service, and we were working on hacking the main site of Washington Post, but they took down the control panel,” Th3 Pr0 told Motherboard in an email. “We just wanted to deliver a message on several media sites like Washington Post, US News and others, but we didn’t have time :P.” The group often defaces media sites by hacking into other third parties, such as ad networks, that serve content on the sites.

The Washington Post, it seems, was not the problem. A content delivery network was the problem.

The article then reminded me:

This is the second time the hackers get to the Washington Post. The group briefly disrupted the site in 2013 with a phishing attack.

But the kicker for me is this statement:

This hack shows, once again, that a site is only as secure as its third-party resources, including ads, are.

Well, these problems are short lived. The problems are not the problems of the Washington Post. Bueno indeed. Perhaps Amazon’s Jeff Bezos will provide some security inputs to the Washington Post folks. Fool me once, shame on me. Fool me twice, well, blame the third party.

Works in Washington I assume.

Stephen E Arnold, May 15, 2015

Developing an NLP Semantic Search

May 15, 2015

Can you imagine a natural language processing semantic search engine?  It would be a lovely tool to use in your daily routines and make research a bit easier.  If you are working on such a project and are making progress, keep at that startup because this is a lucrative field at the moment.  Over at Stack Overflow, an enterprising spirit is trying to develop a “Semantic Search With NLP And Elasticsearch”:

“I am experimenting with Elasticsearch as a search server and my task is to build a “semantic” search functionality. From a short text phrase like “I have a burst pipe” the system should infer that the user is searching for a plumber and return all plumbers indexed in Elasticsearch.

Can that be done directly in a search server like Elasticsearch or do I have to use a natural language processing (NLP) tool like e.g. Maui Indexer. What is the exact terminology for my task at hand, text classification? Though the given text is very short as it is a search phrase.”
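One way to picture the task the questioner describes is as short-text classification followed by a filtered lookup. The sketch below is only an illustration under loud assumptions: the trade vocabulary, the `category` and `description` field names, and the keyword-overlap scoring are all hypothetical, and a production system would more likely train a classifier on labeled queries. It does not talk to an Elasticsearch server; it only builds a query body in the shape of Elasticsearch's `term` and `match` queries.

```python
import re

# Hypothetical vocabulary mapping a trade to words that hint at it.
# A real system would learn these associations from labeled search phrases.
TRADE_KEYWORDS = {
    "plumber": {"pipe", "leak", "burst", "drain", "faucet", "toilet"},
    "electrician": {"outlet", "wiring", "fuse", "breaker", "socket"},
    "locksmith": {"lock", "key", "locked", "deadbolt"},
}


def tokenize(text):
    """Lowercase the phrase and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def classify(phrase):
    """Return the trade whose keywords overlap the phrase most, or None."""
    tokens = tokenize(phrase)
    scores = {trade: len(tokens & words) for trade, words in TRADE_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def build_query(phrase):
    """Build a query body: filter on the inferred trade, else fall back to text match.

    The field names "category" and "description" are assumptions about the index.
    """
    trade = classify(phrase)
    if trade is None:
        return {"query": {"match": {"description": phrase}}}
    return {"query": {"term": {"category": trade}}}


if __name__ == "__main__":
    print(build_query("I have a burst pipe"))
```

The point of the split is that the inference step (phrase to trade) lives outside the search server, which then only has to do what it is good at: returning every document tagged with that trade.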

Given that this question was asked about three years ago, a lot has been done not only with Elasticsearch, but also NLP.  Search is moving towards a more organic experience, but accuracy is often muddled by different factors.  These include the quality of the technology, classification, taxonomies, ads in results, and even keywords (still!).

NLP semantic search is closer now than it was three years ago, but technology companies would invest a lot of money in a startup that can bridge the gap between natural language and machine learning.

Whitney Grace, May 15, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Popular and Problematic Hadoop

May 15, 2015

We love open source on principle, and Hadoop is indeed an open-source powerhouse. However, any organization considering a Hadoop system must understand how tricky implementation can be, despite the hype. A pair of writers at GCN asks and answers the question, “What’s Holding Back Hadoop?” The brief article reports on a recent survey of data management pros by data-researcher TDWI. Reporters Troy K. Schneider and Jonathan Lutton explain:

“Hadoop (the open-source, distributed programming framework that relies on parallel processing to store and analyze both structured and unstructured data) has been the talk of big data for several years now.  And while a recent survey of IT, business intelligence and data warehousing leaders found that 60 percent will [have] Hadoop in production by 2016, deployment remains a daunting task. TDWI (which, like GCN, is owned by 1105 Media) polled data management professionals in both the public and private sector, who reported that staff expertise and the lack of a clear business case topped their list of barriers to implementation.”

The write-up supplies a couple of bar graphs of survey results, including the top obstacles to implementation and the primary benefits of going to the trouble. Strikingly, only six percent of respondents say there’s no Hadoop in their organizations’ foreseeable future. Though not covered in the GCN write-up, the full, 43-page report includes word on best practices and implementation trends; it can be downloaded here (registration required).

Cynthia Murrell, May 15, 2015

Google and UK in the Spring Time of Cyber Crime

May 14, 2015

Elections are over. Rhyming is in season. New thoughts are in the spring time breeze wafting through the sward at New Scotland Yard. If you have visited the location, you will appreciate the sward thing.

I read “Google More Intrusive Than State, Says Britain’s Top Policeman.” The write up reports:

“Look at intrusion by commerce which is far greater than you would experience from the State,” Sir Bernard told a cybercrime conference organized by London First. “Google and Tesco’s intrusion into our lives is pretty remarkable for what is a commercial benefit.”

Will Google be under more scrutiny in the UK? Will the authorities in the UK want Google or companies in which it has a financial stake to be more helpful in addressing cyber crime?

Worth watching how the hedge is trimmed.

Stephen E Arnold, May 14, 2015

SAP and Business Intelligence: Simple Stuff, Really Simple

May 14, 2015

I came across an interesting summary of SAP’s business intelligence approach. Navigate to “SAP BI Suite Roadmap Strategy Update from ASUG SapphireNow.” ASUG, in case you are not into the SAP world, means Americas’ SAP Users’ Group. Doesn’t everyone know that acronym? I did not.

The article begins with a legal disclaimer, always a strange attractor to me. I find content on the Web which includes unreadable legal lingo sort of exciting.

It is almost as thrilling as some of the security methods which SAP employs across its systems and software. A former SAP advisor told me, as I recall the comment, “Security has never been a priority at SAP.”

The other interesting thing about the article is that it appears to be composed of images captured either from a low resolution screen capture program or a digital camera without a massive megapixel capability.

I worked through the slides and comments as best as I could. I noted several points in addition to the aforementioned lacunae regarding security; to wit:

  1. SAP wants to simplify the analytics landscape. This is a noble goal, but my experience has been that SAP is a pretty complex beastie. That may be my own ignorance coloring what is just an intuitive, tightly integrated example of enterprise software.
  2. SAP likes dedicating servers or clusters of servers to tasks. There is a server for the in memory database. There is a server for what I think used to be Business Objects. There is the SAP desktop. There are edge servers in case your SAP installation is not for a single user. There is the SAP cloud which, I assume, is an all purpose solution to computational and storage bottlenecks. Lots of servers.
  3. Business Objects is the business intelligence engine. I am not confident in my assessment of complexity, but, as I recall, Business Objects can be a challenge.

My reaction to the presentation is that for the faithful who owe their jobs and their consulting revenue to SAP’s simplified business intelligence solutions and servers, joy suffuses their happy selves.

For me, I keep wondering about security. And whatever happened to TREX? What happened to Inxight’s Thingfinder and related server technologies?

How simple can an enterprise solution be? Obviously really simple. Did I mention security?

Stephen E Arnold, May 14, 2015

Automated Search News: Lost in Link Land

May 14, 2015

I scanned Paper.li’s “The Enterprise Search Daily.” I spotted this item:

image

Curious, I clicked on it. Here’s what Sinequa displayed:

image

Isn’t Sinequa one of the vendors Gartner described as a leader of the search pack? Something was wrong with the Paper.li link submitted by Embedded, and the source URL is a 404.

So, how are those automated information systems supposed to work? See my write up about IBM’s burrito to get a glimpse of what happens when big ideas cannot be converted into workable components.

Yep, page not found. Reality is different from the marketing hoo hah.

Stephen E Arnold, May 14, 2015

Don’t Fear the AI

May 14, 2015

Will intelligent machines bring about the downfall of the human race? Unlikely, says The Technium, in “Why I Don’t Worry About a Super AI.” The blogger details four specific reasons he or she is unafraid: First, AI does not seem to adhere to Moore’s law, so no Terminators anytime soon. Also, we do have the power to reprogram any uppity AI that does crop up and (reason three) it is unlikely that an AI would develop the initiative to reprogram itself, anyway. Finally, we should see managing this technology as an opportunity to clarify our own principles, instead of a path to dystopia. The blog opines:

“AI gives us the opportunity to elevate and sharpen our own ethics and morality and ambition. We smugly believe humans – all humans – have superior behavior to machines, but human ethics are sloppy, slippery, inconsistent, and often suspect. […] The clear ethical programing AIs need to follow will force us to bear down and be much clearer about why we believe what we think we believe. Under what conditions do we want to be relativistic? What specific contexts do we want the law to be contextual? Human morality is a mess of conundrums that could benefit from scrutiny, less superstition, and more evidence-based thinking. We’ll quickly find that trying to train AIs to be more humanistic will challenge us to be more humanistic. In the way that children can better their parents, the challenge of rearing AIs is an opportunity – not a horror. We should welcome it.”

Machine learning as a catalyst for philosophical progress—interesting perspective. See the post for more details behind this writer’s reasoning. Is he or she being realistic, or naïve?

Cynthia Murrell, May 14, 2015

Explaining Big Data Mythology

May 14, 2015

Mythologies usually develop over the course of centuries, but big data has only been around for (arguably) a couple of decades, at least in its modern incarnation.  Recently big data has received a lot of media attention and product development, which gave the Internet enough time to create a big data mythology.  The Globe and Mail wanted to dispel some of the bigger myths in the article, “Unearthing Big Myths About Big Data.”

The article focuses on Prof. Joerg Niessing’s big data expertise and how he explains the truth behind many of the biggest big data myths.  One of the main things Niessing wants people to understand is that gathering data does not equal dollar signs; you have to be active with data:

“You must take control, starting with developing a strategic outlook in which you will determine how to use the data at your disposal effectively. “That’s where a lot of companies struggle. They do not have a strategic approach. They don’t understand what they want to learn and get lost in the data,” he said in an interview. So before rushing into data mining, step back and figure out which customer segments and what aspects of their behavior you most want to learn about.”

Niessing says that big data is not really big, but made up of many diverse data points.  Big data also does not have all the answers; instead it provides ambiguous results that need to be interpreted.  Have questions you want answered before gathering data.  Also, not all of the data returned is useful.  Some of it is actually garbage and cannot be used for a project.  Several other myths are uncovered, but the truth remains that having a strategic big data plan in place is the best way to make the most of big data.

Whitney Grace, May 14, 2015

The Latest SharePoint News from Ignite

May 14, 2015

The Ignite conference in Chicago has answered many of the questions that SharePoint users have been curious about for months now. Among them was the question of release timing and features for the latest iteration of SharePoint. CMS Wire gives a rundown in their article, “What’s Up With SharePoint? #MSIgnite.”

The article sums up the biggest news:

“Microsoft will continue to enhance the core offerings in the on-premises edition. It will also continue to develop SharePoint Online and update it as quickly as the updates are available. A preview version of SharePoint 2016 will be made available later this summer, with a beta version expected by the end of the year . . . In an afternoon session entitled Evolution of SharePoint Overview and Roadmap, the duo gave a rough outline of Microsoft’s plans, albeit without precise delivery dates.”

Having had to push back delivery dates once already, Microsoft is likely hesitant to announce anything solid until development is final. As for the new version, Microsoft is focusing on user experience, extensibility, and SharePoint management. The inclusion of user experience should be a welcome change for many. To stay in touch with developments as they become available, keep an eye on ArnoldIT.com, particularly the feed devoted to SharePoint. Stephen E. Arnold has made a lifelong career out of all things search, and he has a knack for distilling down the “need to know” facts to keep an organization on track.

Emily Rae Aldridge, May 14, 2015
