Honkin’ News: Beyond Search Video News Program Available Now
August 2, 2016
Honkin’ News is now online via YouTube at https://youtu.be/hf93zTSixgo. The weekly program tries to separate the giblets from the goose feathers in online search and content processing. Each program draws upon articles and opinion appearing in the Beyond Search blog.
The Beyond Search program is presented by Stephen E Arnold, who resides in rural Kentucky. The five-minute program highlights stories appearing in the daily Beyond Search blog and includes observations not appearing in the printed version of the stories. No registration is required to view the free video.
Arnold told Beyond Search:
Online search and content processing generate modest excitement. Honkin’ News comments on some of the more interesting and unusual aspects of information retrieval, natural language processing, and the activities of those working to make software understand digital content. The inaugural program highlights Verizon’s Yahoo AOL integration strategy, explores why search fails, and how manufacturing binders and fishing lures might boost an open source information access strategy.
The video is created using high tech found in the hollows of rural Kentucky; for example, eight mm black-and-white film and two coal-fired computing devices. One surprising aspect of the video is the window showing the vista outside the window of the Beyond Search facility. The pond filled with mine drainage is not visible, however.
Kenny Toth, August 2, 2016
A Big Data Disconnect. Who Knew
August 2, 2016
I read “Advisors and Big Data: The Disconnect.” Stunned am I. Consultants not listening to their clients. Systems with severed communication channels to those who pay their licensing bills. Unbelievable.
I learned:
But while many companies have big dough invested in this ongoing project, they still rely far too much on intuition and gut instinct instead of using their data to operate. This is often due to a fundamental disconnection between the actual needs of the business versus what the data analytics are designed to deliver.
The write up makes a number of statements which suggest there is some snake oil laced with ineptitude in the Big Data world; for example:
- Analytics enable. What if analytics enable poor decision making?
- Algorithms are not a “magic kit.” I thought algorithms were really smart.
- Bad data are bad. Really?
- Data are not insights. I thought data were chock full of insight.
- Moving big data from Point A to Point B is not a slam dunk. What about a three point shot?
If these points resonate with you, you are probably not getting with the Big Data program. I thought Big Data was a silver bullet and a magic potion blended in one tasty for-fee meal. Stunned, I tell you. Stunned. Imagine. Disconnected advisors.
Stephen E Arnold, August 2, 2016
Getty Images: Why Are Free Range Chickens Coming Home?
August 2, 2016
I don’t know anything about Getty Images. Well, I think I recall that someone involved with Microsoft may have applied the magic touch to the outfit. Frankly I don’t know and I don’t care.
I think about Getty Images when a snappy headline draws my attention to an outfit selling rights to art, images, and probably lots of other “intellectual property.”
Navigate to “Getty Sued for $1 Billion for Selling Publicly Donated Photos.” Someone believes that Getty did them wrong. Who knows if it is true. I find the idea interesting.
According to the write up:
The Seattle-based company, which owns and licenses a collection of over 80 million images, has been sued by documentary photographer Carol Highsmith for ‘gross misuse’, after it sold more than 18,000 of her photos despite having already donated them for public use. Highsmith’s photos which were sold via Getty Images had been available for free via the Library of Congress. Getty has now been accused of selling unauthorized licenses of the images, not crediting the author, and for also sending threatening warnings and fines to those who had used the pictures without paying for the falsely imposed copyright.
What a clever idea. Take images from a public source and charge money for their use.
Extending this idea, perhaps Getty-type outfits would like to dig through the printed volumes in the Vatican Library, scan them, and sell those. Why not reach out to the digital crowd and suck down the video snippets which are free to use? Heck, there are some free photo services out there too.
I think that clever is definitely a great business angle. Oh, that billion dollars. How long will an individual be able to feed legal eagles if Getty has some deep pockets outside its door?
I admire MBA think. I wonder if roosting chickens leave behind a mess. I know roosting chickens have a nifty odor on hot days in those comfy coops in Maryland.
Stephen E Arnold, August 2, 2016
Summize, an App with the Technology to Make Our Children Learn. But Is They?
August 2, 2016
The article on TheNextWeb titled “Teenagers Have Built a Summary App that Could Help Students Ace Exams” might be difficult to read over the sound of a million teachers weeping into their syllabi. It’s no shock that students hate to read, and there is even some cause for alarm over the sheer amount of reading that some graduate students are expected to complete. But for middle schoolers, high schoolers, and even undergrads in college, there is a growing concern about the average reading comprehension level. This new app can only make matters worse by removing a student’s incentive to absorb the material and decide for themselves what is important. The article describes the app:
“Available for iOS, Summize is an intelligent summary generator that will automatically recap the contents of any textbook page (or news article) you take a photo of with your smartphone. The app also supports concept, keyword and bias analysis, which breaks down the summaries to make them more accessible. With this feature, users can easily isolate concepts and keywords from the rest of the text to focus precisely on the material that matters the most to them.”
There is nothing wrong with any of this if it is really about time management instead of supporting illiteracy and lazy study habits. This app is the work of 18-year-old Rami Ghanem, who used optical character recognition software. Not coincidentally, he is a product of the era of No Child Left Behind, exposed to years of teaching to the test and forgetting the lesson, of rote memorization in place of analysis and understanding. Yes, with Summize, little Jimmy might ace the test. But shouldn’t an education be more than talking point McNuggets?
Chelsea Kerwin, August 2, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Is Resting Data Safe Data?
August 2, 2016
Have you ever wondered if the data resting on your hard drive is safe while you are away from your computer? Have you ever worried that a hacker could sneak into your system and steal everything even when the data is resting (not actively being used)? It is a worry that most computer users experience as they traverse the Internet, possibly leaving themselves exposed. Network World describes how a potential upgrade could protect data in databases in “A New Update To The NoSQL Database Adds Cryptsoft Technology.”
MarkLogic’s NoSQL database version nine will be released later in 2016 with an added security update that includes Cryptsoft’s KMIP (Key Management Interoperability Protocol). MarkLogic’s upgrade will use the flexibility, scalability, and agility of NoSQL with enterprise features, government-grade security, and high availability. Along with the basic upgrades, there will also be stronger augmentations to security, manageability, and data integration. MarkLogic is betting that companies will be integrating more data into their systems from dispersed silos. Data integration has its own series of security problems, but there are more solutions to protect data in transit than at rest, which is where the Cryptsoft KMIP enters:
“Data is frequently protected while in transit between consumers and businesses, MarkLogic notes, but the same isn’t always true when data is at rest within the business because of a variety of challenges associated with that task. That’s where Cryptsoft’s technology could make a difference. Rather than grappling with multiple key management tools, MarkLogic 9 users will be able to tap Cryptsoft’s embedded Key Management SDKs to manage data security from across the enterprise using a comprehensive, standards-compliant KMIP toolkit.”
Protecting data at rest is just as important as securing data in transit. This reminds me of Oracle’s secure enterprise search angle that came out a few years ago. Is it a coincidence?
Whitney Grace, August 2, 2016
From the Amusing Searches File: Trump to Hitler to Omission
August 1, 2016
I read a story which I assume is spot on, dead accurate, and 110 percent true. Navigate to “Google Search Connects Trump’s Book to Hitler’s ‘Mein Kampf’.” The story, in the best traditions of real journalism, reports:
…typing the name of Trump’s 2015 book “Crippled America” into a Google image search, in addition to bringing up images of that book, displayed images of Adolf Hitler’s 1920s manifesto “Mein Kampf.” Google has been in the spotlight before for a connection between Trump and the infamous Nazi leader. In June, Googling the phrase “When was Hitler born” also produced an image of Donald Trump and listed his birthday. In that case, Google said it removed the Trump image, and a recent search confirms that the candidate’s image is no longer connected with Hitler’s birthday.
If you find the Hitler thing amusing, check out “Google Tweaks System after Trump Left Off Search Results for Presidential Candidates.” The write up, which I am sure is right as rain, states:
According to Google, the omissions were the result of a “technical bug” in the Knowledge Graph, the massive information-mapping system that provides the top results bar under many fact-based searches. “Only the presidential candidates participating in an active primary election were appearing in a Knowledge Graph result,” a Google spokesperson said in a statement. “Because the Republican and Libertarian primaries have ended, those candidates did not appear. This bug was resolved early this morning.”
Was this self correcting or did an analog entity make the fix? I recall from some time and place that Google did not fiddle search results. It must, therefore, be algorithms. Why worry about algorithms driving autos, performing surgery, or filtering information? I don’t worry. I believe everything I read on the Internet.
Those algorithms have a sense of humor. How was this linkage fixed? Maybe a human intervened, but I thought Google’s smart system worked all by its lonesome. I know that relevance is a struggle. Is it mine or others’?
Stephen E Arnold, August 1, 2016
Verizon: From Baby Bell to Online Gong
August 1, 2016
I don’t have many thoughts about Verizon’s purchase of Yahoo. I am tired of the melting ice approach to Yahoo’s problems. I am bored with old-school online systems. I am disappointed that a Baby Bell is now a portal wannabe. Haven’t these folks heard of Snapchat and Pokémon Go?
I read “Verizon to Buy Yahoo’s Core Business for $4.8 Billion in Digital Ad Push.” The write up explains:
Verizon could combine data from AOL and Yahoo users in addition to its more than 100 million wireless customers to help advertisers target users based on online behavior and preferences.
Advertisers may want to have their products in Snapchat and Pokémon Go type environments.
As a former contractor to Bell Labs, I understand the problems of the “old” AT&T. But a Baby Bell doing the portal thing and aiming at Facebook and Google as digital ad competitors? Interesting.
Verizool. Catchy. Rhymes with drool.
Stephen E Arnold, August 1, 2016
Jurors for Google v. Oracle Case Exposed to Major Privacy Violation Potential
August 1, 2016
The article titled “Judge Doesn’t Want Google to Google the Favorite Books and Songs of Potential Jurors” on Billboard provides some context on the difficulties of putting Google on trial. Oracle is currently suing Google for copyright violations involving Java API code. The federal judge presiding over the case, William Alsup, is trying to figure out how to protect the privacy of the jurors from both parties, but mostly Google. The article quotes Alsup:
“For example, if a search found that a juror’s favorite book is To Kill A Mockingbird, it wouldn’t be hard for counsel to construct a copyright jury argument (or a line of expert questions) based on an analogy to that work and to play upon the recent death of Harper Lee, all in an effort to ingratiate himself or herself into the heartstrings of that juror,” he writes. “The same could be done… with any number of other juror attitudes…”
Alsup considered a straightforward ban on researching jurors, but this would put both sides’ attorneys at a disadvantage. Instead, Google and Oracle have until the end of the month to either consent to a voluntary ban, or agree to clearly inform the jurors of their intentions regarding social media research.
Chelsea Kerwin, August 1, 2016
Big Data Is Just a Myth
August 1, 2016
Remember in the 1979 hit The Muppet Movie there was a running gag where Kermit the Frog kept saying, “It’s a myth. A myth!” Then a woman named Myth would appear out of nowhere and say, “Yes?” It was a funny random gag, but while it is a myth that frogs give warts, most of the myths related to big data may or may not be. Data Science Central decided to explain some of the myths in “Debunking The 68 Most Common Myths About Big Data-Part 2.”
Some of the prior myths debunked in the first part were that big data was the newest power word, an end all solution for companies, only meant for big companies, and that it was complicated and expensive. In truth, anyone can benefit from big data with a decent implementation plan and with someone who knows how to take charge of it.
Big data, in fact, can be integrated with preexisting systems, although it takes time and knowledge to link the new and the old together (it is not as difficult as it seems). Keeping on that same thought, users need to realize that there is not a one size fits all big data solution. Big data is a solution that requires analytical, storage, and other software. It cannot be purchased like other proprietary software and it needs to be individualized for each organization.
One myth that has turned into truth is that big data relies on Hadoop storage. It used to be that Hadoop was one option in a market of many, but now it is an integral bit of software needed to get the big data job done. One of the most prevalent myths is that big data belongs only in the IT department:
“Here’s the core of the issue. Big Data gives companies the greatly enhanced ability to reap benefits from data-driven insights and to make better decisions. These are strategic issues.
You know who is most likely to be clamoring for Big Data? Not IT. Most likely it’s sales, marketing, pricing, logistics, and production forecasting. All areas that tend to reap outsize rewards from better forward views of the business.”
Big data is becoming more of an essential tool for organizations in every field as it tells them more about how they operate and their shortcomings. Big data offers a very detailed examination of these issues; the biggest issue users need to deal with is how they will use it.
Whitney Grace, August 1, 2016