FTC and Google: Never Complain, Never Explain Usually

March 26, 2015

I read “FTC Addresses Its Choice Not to Sue Google.” The write up reports that the FTC is explaining its decision not to chase Google around the conference table. Heck, would that tire out the Googlers, making it tough to stay awake in a White House meeting?

According to the write up:

“All five Commissioners (three Democrats and two Republicans) agreed that there was no legal basis for action with respect to the main focus of the investigation — search,” the statement released on Wednesday read. “The Commission’s decision on the search allegations was in accord with the recommendations of the F.T.C.’s Bureau of Competition, Bureau of Economics, and Office of General Counsel.”

I think this means, “No problemo.”

I also found this statement about the FTC’s expertise in information governance interesting:

In the final paragraph of the commissioners’ statement, the agency once more expressed regret at the inadvertent release of its internal document. “We are taking additional steps to ensure that such a disclosure does not occur in the future,” it said.

That’s good. The future. Many search vendors point out that the functions their marketers say are available today really mean in the “future.” Is this a characteristic of our digital era?

Stephen E Arnold, March 26, 2015

Study Finds Millennials Willing to Pay for News to a Point

March 26, 2015

The article titled Millennials Say Keeping Up With the News Is Important To Them—But Good Luck Getting Them To Pay For It on NiemanLab explores the findings of a recent study by the Media Insight Project in partnership with the American Press Institute. A great many respondents get their news from Facebook, although the majority (88%) said it was only occasionally. Twitter and Reddit also made the list. Interestingly, millennials claimed multiple access methods to news categories across the board. The article states,

“The survey asked respondents how they accessed 24 different news topics, from national politics and government to style, beauty, and fashion. Facebook was either the number one or two source of information for 20 of the 24 topics, and in nine of those topics it was the only source cited by a majority of respondents. Search was the second most popular source of information, ranking first or second in 13 of the 24 news topics.”

In spite of the title of the article, most millennials in the study were willing to pay for at least one subscription, either digital or print. The article doesn’t mention the number of people involved in the study, but deeper interviews were held with 23 millennials, which is the basis for the assumptions about broader unwillingness to pay for the news, whether out of entitlement or a belief that access to free news is a fundamental pillar of democracy.

Chelsea Kerwin, March 26, 2015

Stephen E Arnold, Publisher of CyberOSINT at www.xenky.com

Glimpses of SharePoint 2016 on the Way

March 26, 2015

The tech world is excited for the upcoming SharePoint 2016 release. Curious parties will be glad to hear that sneak peeks will be coming this spring. Read more in the CMS Wire article, “Microsoft Leaks Offer a Glimpse of SharePoint 2016.”

The article lays out some of the details:

“Microsoft has started leaking news about SharePoint 2016 — and they suggest the company plans to showcase an early edition at Ignite, its upcoming all-in-one conference for everyone from senior decision makers, IT pros and “big thinkers” and to enterprise developers and architects. In a just released podcast, Bill Baer, senior product manager for SharePoint, said the company will offer a look at the latest version of SharePoint at the conference, which will be held in Chicago from May 4 through 8.”

Some experts have already weighed in with predictions for SharePoint 2016 features: hybrid search and improved user experience among them. Stephen E. Arnold will also be keeping an eye on the new version, reporting his findings on his dedicated SharePoint feed. He has devoted his career to all things search, including SharePoint, and keeps readers informed on his Web site ArnoldIT.com. Stay tuned for more updates on SharePoint 2016 as they become available.

Emily Rae Aldridge, March 26, 2015

A Little Lucene History

March 26, 2015

Instead of venturing to Wikipedia to learn about Lucene’s history, visit the Parse.ly blog and read the post, “Lucene: The Good Parts.” After detailing how Doug Cutting created Lucene in 1999, the post describes how searching through SQL in the early 2000s was a huge task. SQL databases are not the best when it comes to unstructured search, so developers installed Lucene to make SQL document search more reliable. What is interesting is how much it has been adopted:

“At the time, Solr and Elasticsearch didn’t yet exist. Solr would be released in one year by the team at CNET. With that release would come a very important application of Lucene: faceted search. Elasticsearch would take another 5 years to be released. With its recent releases, it has brought another important application of Lucene to the world: aggregations. Over the last decade, the Solr and Elasticsearch packages have brought Lucene to a much wider community. Solr and Elasticsearch are now being considered alongside data stores like MongoDB and Cassandra, and people are genuinely confused by the differences.”

The post is worth reading if you need a refresher or a brief overview of how Lucene works, related jargon, tips for using it in big data projects, and a few more tricks. Lucene might just be a Java library, but it makes using databases much easier. We have said for a while, information is only useful if you can find it easily. Lucene made information search and retrieval much simpler and more accurate. It laid the groundwork for the current big data boom.

Whitney Grace, March 26, 2015

IBM Methods Are Alive and Well: Google Goes with Lock In for Search in Africa

March 25, 2015

I assume the information in “Facebook and Google Are Locking In African Customers with Freebie Deals” is accurate. Let me be upfront. I don’t worry too much about Facebook. Folks using that service make a decision to post information, build friend lists, and do other social functions.

Search is different. A person enters a query and assumes, maybe believes, that the results are objective, accurate, and related to the query itself. I am not sure this utopia exists or that most users, even with graduate degrees, can figure out the difference between information, disinformation, misinformation, or reformation of information. I know it takes considerable work. To see the depth of the problem, run a query for the seemingly innocuous phrase “concept searching.” Check out the results. Nifty, eh?

The article states:

Both companies [Facebook and Google] are rolling out programs in some African countries that give people free internet access—but the complimentary access is contingent on people using their services. Their large-scale world-connectivity projects are tailored to ensure that Facebook and Google become the go-to on-ramps for accessing Internet.

Is the objective market control for the purpose of generating revenue?

The story explains:

It’s hard for companies to compete with Facebook and Google in the US; in Africa, where these tech giants will have a huge leg up on local competitors, it will be even harder. By establishing themselves as home bases for the internet, Facebook and Google are elbowing control over the online experiences of a continent away from would-be domestic entrepreneurs and local startups.

Perhaps Facebook and Google will merge, sort of a Kraft and Heinz type deal. That will provide even more freebies to the markets in Africa, right? Is this article getting close to explaining how a Belgium-type deal was such a plus for the Congo? Absolutely not. The parallels are specious. Neither Facebook nor Google is a monarchy. Neither Facebook nor Google is interested in natural resources. Neither Facebook nor Google wishes to prevent others from serving the markets in Africa.

This is just great marketing for those with a dog in the fight, especially advertisers.

Stephen E Arnold, March 25, 2015

Lexmark Jumps into Intelligence Work

March 25, 2015

I read “Lexmark Buys Software Maker Kofax at 47% Premium in $1B Deal.” The write up focuses on Kofax’s content management services. Largely overlooked is Kofax’s Kapow Tech unit. This company provides specialized services to intelligence, law enforcement, and security entities. How will a printer company in Lexington manage the ageing Kofax technology and the more promising Kapow entity? This should be interesting. Lexmark already owns the Brainware technology and the ISYS Search Software system. Lexmark is starting to look a bit like IBM and OpenText. These companies have rolled up promising firms, only to lose their focus. Will Lexmark follow in IBM’s footsteps and cook up a Watson? I think there is still some IBM DNA in the pale blue veins of the Lexmark outfit. On the other hand, Lexmark seems to be emulating some of the dance steps emerging from the Hewlett Packard ballroom as well. Fascinating. The mid-tier consultants with waves, quadrants, and paid for webinars will have to find a way to shoehorn hardware, health care, intelligence, and document scanning into one overview. Confused? Just wait.

Stephen E Arnold, March 25, 2015

Big Data and Their Interesting Processes

March 25, 2015

I love it when mid-tier consultants wax enthusiastic about Big Data. Search your data lake, enjoins one clueless marketer. Big Data is the future, sings a self-appointed expert. Yikes.

To get a glimpse of exactly what has to be done to process certain types of Big Data in an economical yet timely manner, I suggest you read “Analytics on the Cheap.” The author is 0X74696D. Get it?

The write up explains the procedures required to crunch data and manage the budget. The work flow process I found interesting is:

  • Incoming message passes through our CDN to pick up geolocation headers
  • Message has its session authenticated (this happens at our routing layer in Nginx/OpenResty)
  • Message is routed to an ingest server
  • Ingest server transforms message and headers into a single character-delimited querystring value
  • Ingest server makes a HTTP GET to a 0-byte file on S3 with that querystring
  • The bucket on S3 has S3 logging turned on.
  • We ingest the S3 logs directly into Redshift on a daily basis.
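The ingest step in the list above can be sketched roughly as follows. This is an illustration only, not the code from “Analytics on the Cheap.” The pipe delimiter, the query parameter name `d`, the bucket URL, and the function names are all assumptions made for the example. The sketch packs a message's fields into one character-delimited querystring value and builds the URL that the ingest server would request. The GET itself is omitted so the example runs without network access.

```python
from urllib.parse import parse_qs, quote, unquote, urlencode, urlsplit

DELIMITER = "|"  # assumed delimiter; the article does not specify one here


def encode_event(fields):
    """Join fields into one delimited string, escaping each field first
    so a delimiter inside a value cannot break the record apart."""
    return DELIMITER.join(quote(str(value), safe="") for value in fields)


def build_tracking_url(base_url, fields):
    """Build the URL of the 0-byte object with the event in the querystring.
    The parameter name 'd' is hypothetical."""
    return f"{base_url}?{urlencode({'d': encode_event(fields)})}"


def decode_event(url):
    """Recover the original fields from a logged request URL."""
    packed = parse_qs(urlsplit(url).query)["d"][0]
    return [unquote(part) for part in packed.split(DELIMITER)]


if __name__ == "__main__":
    # Hypothetical bucket and object name, for illustration only.
    url = build_tracking_url(
        "https://example-bucket.s3.amazonaws.com/t.gif",
        ["US", "session-42", "page|view", "a b"],
    )
    print(url)
    print(decode_event(url))
```

Because S3 access logging records the full request URI, a decoder like `decode_event` hints at how the fields could be pulled back out before or after the daily load into Redshift.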

The write up then provides code snippets and some business commentary. The author also identifies the upside of the approach used.

Why is this important? It is easy to talk about Big Data. Looking at what is required to make use of Big Data reveals the complexity of the task.

Keep this hype versus real world split in mind the next time you listen to a search vendor yak about Big Data.

Stephen E Arnold, March 25, 2015

SAS Text Miner Provides Valuable Predictive Analytics

March 25, 2015

If you are searching for predictive analytics software that provides in-depth text analysis with advanced linguistic capabilities, you may want to check out “SAS Text Miner.” Predictive Analytics Today runs down the features of SAS Text Miner and details how it works.

It is user-friendly software with data visualization, flexible entity options, document theme discovery, and more.

“The text analytics software provides supervised, unsupervised, and semi-supervised methods to discover previously unknown patterns in document collections.  It structures data in a numeric representation so that it can be included in advanced analytics, such as predictive analysis, data mining, and forecasting.  This version also includes insightful reports describing the results from the rule generator node, providing clarity to model training and validation results.”

SAS Text Miner’s other features include automatic Boolean rule generation to categorize documents, and other rules can be exported as Boolean rules. Data sets can be made from a directory or crawled from the Web. The visual analysis feature highlights the relationships between discovered patterns and displays them using a concept link diagram. SAS Text Miner has received high praise as predictive analytics software, and it might be the solution your company is looking for.

Whitney Grace, March 25, 2015

Elasticsearch Becomes Elastic, Acquires Found

March 25, 2015

The article on Forbes.com titled Elasticsearch Changes Its Name, Enjoys An Amazing Open Source Ride and Hopes to Avoid Mistakes explains the latest acquisition and the reasons behind the name change to simply Elastic. The choice is surmised to stem from Elastic’s wish to avoid confusion between the open source product Elasticsearch and the company itself. It also signals the company’s movement beyond solely providing search technology. The article also discusses the acquisition of Found, a Norwegian company:

“Found provides hosted and fully-managed Elasticsearch clusters with technology that automates processes such as installation, configuration, maintenance, backup, and high-availability. Doing all of this heavy-lifting enables developers to integrate a search engine into their database, website or app quickly. In addition, Found has created a turnkey process to scale Elasticsearch clusters up or down at any time and without any downtime. Found’s Elasticsearch as a Service offering is being used by companies like Docker, Gild… and the New York Public Library.”

Elasticsearch has raised almost $105 million since its start after being created by Shay Banon in 2010. The article posits that they have been doing the right things so far, such as the acquisition of Kibana, the visualization vendor. Although some startups relying on Elasticsearch may throw shade at the Found acquisition, there are no foreseeable threats to Elastic’s future.

Chelsea Kerwin, March 25, 2015

Image and Video Recognition: A Bump in the Road

March 24, 2015

I read “Images That Fool Computer Vision Raise Security Concerns.” I found the write up a reminder that marketing and venture capitalists’ hype is one thing. Real world software performance is another thing.

The article states:

Cornell researchers have found that computers, like humans, can be fooled by optical illusions, which raises security concerns and opens new avenues for research in computer vision.

The passage I highlighted in a mellow yellow says:

But computers don’t process images the way humans do, Yosinski [a Cornell wizard] said. “We realized that the neural nets did not encode knowledge necessary to produce an image of a fire truck, only the knowledge necessary to tell fire trucks apart from other classes,” he explained. Blobs of color and patterns of lines might be enough. For example, the computer might say “school bus” given just yellow and black stripes, or “computer keyboard” for a repeating array of roughly square shapes.

So what?

It turns out that this diagram looks exactly like a penguin.

[image]

The smart software sees the abstraction as what most grade school children know as a lovable penguin. I did not smell a penguin until after I left grade school. Someone should have warned me.

[image]

And the challenge? I have no comment about the expectations of a government professional who relies on image recognition as part of an ongoing investigation.

Stephen E Arnold, March 24, 2015
