The Alphabet Google Thing: The Crunch Cometh

October 22, 2015

I am in a remote location with so-so Internet, sometimes. I wanted to capture this write up, “Google’s Growing Problem: 50% of People Do Zero Searches per Day on Mobile.” It is not the good old days from 2002 to 2006 for the GOOG. What happens when most of the folks in this third world country in which I sojourn get online? Well, I don’t think that the users will be doing the 2002-2006 search for information. I also think that zippy new users will embrace social media, apps, or maybe not search at all. Smart software can be convenient. According to the write up:

Thus where someone using a desktop/laptop might fulfil their “average” one or two searches per day by typing “Facebook” when they open their browser, on mobile that doesn’t happen because it doesn’t need to happen; they just open the app. For Google, that means it’s losing out, even though Google search is front and centre on every Android phone (as per Google’s instructions as part of its Mobile Application Device Agreement, MADA). People don’t, on average, search very much on mobile.

Is this a cup half full or half empty issue? Maybe Google can sustain its top line revenue growth. I suppose it has little choice, since the company after 15 years is almost completely dependent on a single revenue stream. On the other hand, perhaps the engine which floats the Loon balloons will run out of hot air?

Stephen E Arnold, October 22, 2015

Concept Searching SharePoint White Paper

October 22, 2015

I saw a reference to “2015 SharePoint and Office 365 State of the Market Survey White Paper.” If you are interested in things SharePoint and Office 365, you can (as of October 15, 2015) download the 40 page document at this Concept Searching link. A companion webinar is also available.

The most interesting portion of the white paper is its Appendix A. A number of buzzwords are presented as “Priorities by Application.” Note that the Appendix is graphical and presents the result of a “survey.” Goodness, SharePoint seems to have some holes in its digital fabric.

The data for enterprise search are interesting.

image

Source: Concept Searching, 2015

It appears that fewer than 20 percent of those included in the sample do not see enterprise search as a high priority issue. (The white paper offers few details about the mechanics of the survey, the data for which were gathered via the Web.) About 30 percent of the respondents perceive search as working as intended. An equal number, however, are beavering away to improve their enterprise search system.

Unlike some enterprise search and content processing vendors, Concept Searching is squarely in the Microsoft camp. With third party vendors providing “solutions” for SharePoint and Office 365, I ask myself, “Why doesn’t Microsoft address the shortcomings third parties attack?”

Stephen E Arnold, October 22, 2015

University Partners up with Leidos to Investigate How to Cut Costs in Healthcare with Big Data Usage

October 22, 2015

The article on News360 titled Gulu Gambhir: Leidos Virginia Tech to Research Big Data Usage for Healthcare Field explains the partnership, which will research how big data might reduce healthcare costs. Obviously, healthcare costs in this country have gotten out of control, and perhaps that is clearer to students who grew up watching the cost of a single pain pill grow larger and larger without regulation. The article doesn’t go into detail on how the application of big data from electronic health records might ease costs, but Leidos CTO Gulu Gambhir sounds optimistic.

“The company said Thursday the team will utilize technical data from healthcare providers to develop methods that address the sector’s challenges in terms of cost and the quality of care. Gulu Gambhir, chief technology officer and a senior vice president at Leidos, said the company entered the partnership to gain knowledge for its commercial and federal healthcare business.”

The partnership also affords excellent opportunities for Virginia Tech students to gain real-world, hands-on knowledge of data research, hopefully while innovating the healthcare industry. Leidos has supplied funding to the university’s Center for Business Intelligence and Analytics as well as a fellowship program for grad students studying advanced information systems related to healthcare research.

Chelsea Kerwin, October 22, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Genentech Joins the Google Enterprise Crew

October 22, 2015

Enterprise search offers customizable solutions for organizations to locate and organize their data.  Most of the time, organizations purchase a search solution to become more efficient, to comply with quality procedures, or to further their business development.  The latter usually revolves around sales operation planning, program research, customer service, contracts, and tech sales collateral.

Life sciences companies are just one kind of organization that can benefit from enterprise search solutions.  Genentech recently deployed the Google Search Appliance to improve the three areas listed above.  Perficient explains the benefits of enterprise search for a life science company in the video, “Why Life Sciences Leader Genentech Adopted Google Enterprise Search.”

“…we explore why life sciences leader Genentech executed Google Search Appliance. ‘No company is or should ever be static. You have to evolve,’ said CEO Ian Clark.”

Perficient helps companies like Genentech by customizing a search solution by evaluating the company and identifying the areas where it can be improved the most.  They host workshops to evaluate where people in different areas must stop to search for information before returning to the task.  From the workshops, Perficient can create a business prototype to take their existing business process and improve upon it.  Perficient follows this procedure when it deploys enterprise search in new companies.

The video only explains a short version of the process Perficient deployed at Genentech to improve their business operations with search.  A full webinar was posted on their Web site: “Google Search For Life Sciences Companies.”

 

Whitney Grace, October 22, 2015

 

Searching Tweets: Just $24,000 per Year

October 21, 2015

Short honk: Love Twitter. Want to search tweets and sort of make sense of the short messages? A new service from Union Metrics is now available, according to “Union Metrics Debuts Search Engine That Gives You Access to Twitter’s Entire Archive.” This is a link from News360 which is available, but slowly, to me in South Africa. For you? Who knows?

Here’s the pricing, which I assume is spot on:

Although available to all Social Suite subscribers today, the service costs extra. For $500 per month, companies can access up to 30 days of data from Twitter’s archive. For $1,000 per month, Union Metrics’ Echo 365 plan grants unlimited access to up to a year’s worth of data. Finally, for $2,000 per month, the company’s Echo Full Archive plan grants full access to everything.
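
For those who want to check the headline number, here is a minimal Python sketch of the arithmetic. The monthly prices come from the passage above. The label "30-day access" is made up for this sketch because the write up does not name that tier, and the flat twelve-month calculation with no discounts is my assumption, not a quoted term.

```python
# Monthly prices quoted in the write up, in US dollars.
# "30-day access" is a label invented for this sketch; the article does not name that tier.
MONTHLY_PRICE_USD = {
    "30-day access": 500,
    "Echo 365": 1_000,
    "Echo Full Archive": 2_000,
}


def annual_cost(monthly_price: int, months: int = 12) -> int:
    """Assume a flat monthly fee with no discounts (an assumption, not a quoted term)."""
    return monthly_price * months


ANNUAL_COST_USD = {plan: annual_cost(price) for plan, price in MONTHLY_PRICE_USD.items()}

for plan, cost in ANNUAL_COST_USD.items():
    print(f"{plan}: ${cost:,} per year")
```

The full archive tier is the one that works out to $24,000 per year. The write up says the service costs extra, so the Social Suite subscription would sit on top of these figures.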

Twitter is looking for revenue and customer love. Will this type of tie up help?

Stephen E Arnold, October 21, 2015

Attensity: Discover Now

October 21, 2015

I read “Speedier Data Analysis Focus of Attensity’s DiscoverNow.” Attensity is one of the firms processing content for information signals. The company has undergone some management turnover. The company has rolled out DiscoverNow, a product that runs from the cloud and features “built in integration with the Informatica cloud.” The write up reports:

According to the company, DiscoverNow connects to more than 150 internal and external text-based data sources, including popular enterprise apps and databases such as Salesforce.com, SAP, Oracle/Siebel, Box, Concur, Dropbox, Datasift, Eloqua, JIRA, MailChimp, Marketo, NetSuite, Hadoop, MySQL and Thomson Reuters. It combines insights from these internal data sources with external text sources such as Twitter, Facebook, Google+, YouTube, Reddit, forums and review sites, to offer a robust view of customer activities.

Attensity is, according to the article, different and outperforms its competitors. According to Cary Fulbright, Attensity’s chief strategy officer:

Attensity outperforms competing text analytics systems that rely more heavily on keywords. “We parse sentences by subject, noun and object, so we can identify the context used,” he said. “For example, DiscoverNow understands the difference between the Venetian Hotel, Venetian blinds and Venetian gondolas, or ‘uber cool’ and Uber ridesharing. Our team of linguists is constantly updating our generic and industry-specific libraries with new terms, including slang.”

A number of companies offer text processing systems. Attensity is a mash up of several organizations. DiscoverNow may be the breakthrough product the company has been seeking. According to Crunchbase, the company has ingested $90 million since 2000.

Stephen E Arnold, October 21, 2015

Algorithmic Bias and the Unintentional Discrimination in the Results

October 21, 2015

The article titled When Big Data Becomes Bad Data on Tech In America discusses the legal ramifications for companies of relying on algorithms. The “disparate impact” theory has been used in the courtroom for some time to ensure that discriminatory policies are struck down whether or not they were created with the intention to discriminate. Algorithmic bias occurs all the time, and according to the spirit of the law, it discriminates, although unintentionally. The article states,

“It’s troubling enough when Flickr’s auto-tagging of online photos label pictures of black men as “animal” or “ape,” or when researchers determine that Google search results for black-sounding names are more likely to be accompanied by ads about criminal activity than search results for white-sounding names. But what about when big data is used to determine a person’s credit score, ability to get hired, or even the length of a prison sentence?”

The article also reminds us that data can often be a reflection of “historical or institutional discrimination.” The only thing that matters is whether the results are biased. This is where the question of human bias becomes irrelevant. There are legal scholars and researchers arguing on behalf of ethical machine learning design that roots out algorithmic bias. Stronger regulations and better oversight of the algorithms themselves might be the only way to prevent time in court.

Chelsea Kerwin, October 21, 2015


Reclaiming Academic Publishing

October 21, 2015

Researchers and writers are at the mercy of academic publishers who control the venues to print their work, select the content of their work, and often control the funds behind their research.  Even worse is that academic research is locked behind database walls that require a subscription well beyond the price range of a researcher not associated with a university or research institute.  One researcher was fed up enough with academic publishers that he decided to return publishing and distributing work back to the common people, says Nature in “Leading Mathematician Launches arXiv ‘Overlay’ Journal.”

The new mathematics journal Discrete Analysis peer reviews and publishes papers free of charge on the preprint server arXiv.  Timothy Gowers started the journal to avoid the commercial pressures that often distort scientific literature.

“ ‘Part of the motivation for starting the journal is, of course, to challenge existing models of academic publishing and to contribute in a small way to creating an alternative and much cheaper system,’ he explained in a 10 September blog post announcing the journal. ‘If you trust authors to do their own typesetting and copy-editing to a satisfactory standard, with the help of suggestions from referees, then the cost of running a mathematics journal can be at least two orders of magnitude lower than the cost incurred by traditional publishers.’ ”

Some funds are required to keep Discrete Analysis running.  Costs are ten dollars per submitted paper to pay for the software that manages peer review and the journal Web site, and arXiv requires an additional ten dollars a month to keep running.

Gowers hopes to extend the journal model to other scientific fields and he believes it will work, especially for fields that only require text.  The biggest problem is persuading other academics to adopt the model, but things move slowly in academia so it will probably be years before it becomes widespread.

Whitney Grace, October 21, 2015

Spark Burns Down Hadoop

October 20, 2015

I read “Apache Spark vs Hadoop.” I conceptualized Ronda Rousey climbing in the octagon with Ramazan Emeev. A big gate. As a certain presidential candidate might say, “Huge.”

Alas, the dust up between Spark (MapReduce on steroids) and Hadoop (a batch operation clustering system) was not much of a contest, according to the article.

I highlighted this passage:

With Apache Spark, you can act on your data in whatever way you want. Want to look for interesting tidbits in your data? You can perform some quick queries. Want to run something you know will take a long time? You can use a batch job. Want to process your data streams in real time? You can do that too.

The key to the Spark wonderfulness is RDDs or resilient distributed datasets. I underlined this definition:

They’re fine-grained, keeping track of all changes that have been made from other transformations such as map or join. This means that it’s possible to recover from failures by rebuilding from these transformations (which is why they’re called Resilient Distributed Datasets).
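
The rebuild-from-transformations idea in that definition can be sketched in a few lines of plain Python. This is a toy, not Spark's API: the class name `ToyRDD` and the method `lose_data` are hypothetical names invented for illustration, and everything runs in one process on a small list rather than across a cluster.

```python
from typing import Callable, List, Optional


class ToyRDD:
    """A toy stand-in for an RDD: it remembers how it was derived, not just its data."""

    def __init__(self, source: Optional[List[int]] = None,
                 parent: Optional["ToyRDD"] = None,
                 transform: Optional[Callable[[List[int]], List[int]]] = None):
        self._source = source        # only set for the root dataset
        self._parent = parent        # lineage: the dataset this one was derived from
        self._transform = transform  # lineage: the step that derives it from the parent
        self._cache: Optional[List[int]] = None

    def map(self, fn: Callable[[int], int]) -> "ToyRDD":
        return ToyRDD(parent=self, transform=lambda rows: [fn(r) for r in rows])

    def filter(self, pred: Callable[[int], bool]) -> "ToyRDD":
        return ToyRDD(parent=self, transform=lambda rows: [r for r in rows if pred(r)])

    def collect(self) -> List[int]:
        # If the results are missing, rebuild them by replaying the recorded steps.
        if self._cache is None:
            if self._parent is None:
                self._cache = list(self._source)
            else:
                self._cache = self._transform(self._parent.collect())
        return list(self._cache)

    def lose_data(self) -> None:
        """Simulate a failure that wipes the computed results (hypothetical helper)."""
        self._cache = None


base = ToyRDD(source=[1, 2, 3, 4, 5])
squares_of_evens = base.filter(lambda n: n % 2 == 0).map(lambda n: n * n)

before = squares_of_evens.collect()
squares_of_evens.lose_data()         # pretend a node died
after = squares_of_evens.collect()   # rebuilt by replaying filter, then map
print(before, after)
```

The point of the sketch is that nothing has to be backed up ahead of time. As long as the source data and the list of steps survive, the lost result can be recomputed.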

My goodness, with these features, poor old Hadoop may not stand a chance. Now who would win a fight between Rousey and Emeev? One could, I assume, input data about the two fighters, perform some quick queries, and get an “answer.”

Like most NoSQL confections, will the answer match what happens in the ring?

Stephen E Arnold, October 20, 2015

Weakly Watson: IBM Doubles Down on Its Smart Software Future

October 20, 2015

InfoWorld ran a remarkable article about Watson, IBM’s TV game show winning search and retrieval system. Since the victory, which may or may not have required some post production touch ups, Watson has gotten chubbier. The Watson moniker now embraces applications, APIs, Bob Dylan, analytics, and 100 million lines of code. (Yep, that’s a lot of code. No word on the number of bugs in that code, however.)

The write up is “First Jeopardy, Next the World: IBM’s Plans for Watson.” Spoiler alert: Watson is going to give IBM “a new lease on life.”

IBM’s plan is to generate $1 billion in annual revenue. I am not sure when the accountants at IBM will finish their work to roll in content management, i2, and other lines of business to hit this magical number. I would point out that IBM is in the $100 billion range, so the goal for Watson is modest, a mere one percent of IBM’s annual revenue. True, IBM has tallied a stunning record: more than 12 consecutive quarters of declining revenue. But Big Blue is giving Watson the ball and a chance to score a touchdown in the revenue game.

The InfoWorld story reveals some gems which I did not have in my “weakly Watson” file. Permit me to highlight several of these informational nuggets.

First, Watson has created “a 2,000-person business unit that will draw on the expertise of consulting pros who bring backgrounds in machine learning, advanced analytics, data science and development, and industry and change management.” Staff additions are good, particularly when one tracks the commentary in Alliance@IBM.

Second, Watson wants developers. Presumably these folks will use the APIs like F____d. I kid you not. This is IBM lingo for face detection. Here’s the icon from the multi page ads which ran in the New York Times and Wall Street Journal on  October 14, 2015.

image

Look closely. The acronym for face detection is F____d. I like the happy face and the hexagon. Very techno-geodesic.

Third, Watson is no longer encumbered with “mini-van” sized IBM computers. Watson resides in the cloud. The mini van sized computers are there, just not in the licensee’s computer room.

Fourth, IBM Watson has “launched about 100 new applications.” I did not know that.

Fifth, I learned that IBM Watson “released new and enhanced Watson cognitive services across four areas: language, vision, speech, and data insights. They’re meant to reduce the time required to combine Watson APIs and data sets, as well as to embed Watson APIs in mobile devices, cloud services, and connected systems.” Interesting. I thought the point was for IBM to sell consulting and engineering services. Guess not. Those are higher margin lines of business. Getting big money for APIs may be a bit more challenging.

Sixth, IBM rolled out “Expert Storybooks.” I heard about this, but I am not sure I understand the concept. I learned: “An Expert Storybook built with the Weather Co. is designed to help users incorporate weather data into revenue analysis; a Twitter Expert Storybook helps analyze social data to, among other things, measure reputational risk.” Okay, Twitter. Maybe not the most stable social content source, but I still am not grasping the cognitive computing / analytics thing via Expert Storybooks. Doesn’t one need to be an expert to create an Expert Storybook?

I put double red boxes around the paragraphs explaining to me that Watson is the future of IBM. Here’s the passage that caused me to chuckle:

How big is Watson? IBM CEO Ginny Rometti has said it could be a $10 billion business. John Kelly, IBM’s senior vice president in charge of Watson and related businesses, told a Bloomberg reporter that he expects it to be a $1 billion business fairly soon. When I asked [Stephen] Gold [Stephen Gold, vice president of the Watson group] the same question, he didn’t want to touch it.

I like that “didn’t want to touch it” comment. No kidding. I wouldn’t want to be cheerleading for $1 billion in revenue from what is essentially open source software, home brew code, and acquired technology. Maybe Watson can be an Endeca sized $100 million? After some years of travail, Watson might nose into Autonomy’s $700 million range. The $1 billion number strikes me as a long shot without some Fancy Dan shuffling of lines of business and a bit of accounting effort.

And I wish to note, in ending, that the InfoWorld story worked in curing cancer. All in a day’s work for Watson.

Stephen E Arnold, October 20, 2015
