Attensity: Downplaying Automated Collection and Analysis

November 7, 2014

I read “Do What I Mean, Not What I Say: The Text Analytics Paradox.” The write up made a comment which I found interesting; to wit:

Now, before you start worrying about robots replacing humans (relax—that’s at least a couple of years away), understand this: context and disambiguation within these billions of daily social posts, tweets, comments, and online surveys is the key to viable, business-relevant data. The way humans use language is replete with nuance, idiomatic expressions, slang, typos, and of course, context. This underscores the magnitude of surfacing actionable intelligence in data for any industry.

Based on information my research team has collected, the notion of threat detection via automated collection and analysis of Internet-accessible information is quite advanced. In fact, some of the technology has been undergoing continuous refinement since the late 1990s. Rutgers University has been one of the academic outfits in the forefront of this approach to the paradox puzzling Attensity.

The more recent entrants in this important branch (perhaps a new redwood in the search forest) of information access are keeping a low profile. There is a promising venture-funded company in Baltimore as well as a China-based firm operating from offices in Hong Kong. Neither of these companies has captured the imagination of traditional content processing vendors for three reasons:

First, the approach does not derive from traditional information retrieval methodologies.

Second, the companies generate most of their revenue from organizations demanding “quiet service.” (This means that when there is no marketing hoo hah, the most interesting companies are simply not visible to the casual, MBA-inspired analyst.)

Third, the outputs are of stunning utility. Information about quite particular subjects is presented without recourse to traditional human-intermediated fiddling.

I want to float an idea: The next-generation firms delivering state-of-the-art solutions have yet to hit the wall that requires the type of marketing that now characterizes some content processing efforts.

I am trying to figure out how to present these important but little known players. I will write about one in my next Info Today article. The challenge is that there are two dozen firms pushing “search” in a new and productive direction.

Stephen E Arnold, November 7, 2014

Palantir Is a Fund-Raising Leader

November 7, 2014

We have been following the progress of content processing firm Palantir, a business that seems to have both a strong vision and robust follow-through. Now, the Silicon Valley Business Journal highlights Palantir’s fundraising chops in “Q3 VC Update: Who Did the Most Deals, Got the Most Money in Silicon Valley.” Their senior tech reporter Cromwell Schubarth gives a funding rundown from the quarter that spans July through September of this year; he reports:

“Palo Alto-based Palantir Technologies, the $9 billion Big Data analytics company that counts U.S. government intelligence agencies among its backers, had the Bay Area’s biggest funding round in the quarter. It raised $337 million in the quarter, according to CB Insights.

“CB Insights reported on Tuesday that venture dollars raised in the third quarter in the U.S. dropped 30 percent from the post-dotcom high they hit in the second quarter, and the number of deals done declined by 10 percent.”

So, Palantir excelled despite a downturn that quarter. The article goes on to list more details about each entry in their list (see the piece for the four runners-up), and this is what Schubarth says about Palantir:

“This Palo Alto-based Big Data analytics company is led by CEO Alex Karp and has raised about $1 billion since it launched in 2004. It is valued at about $9 billion and was co-founded by Karp, Peter Thiel, Joe Lonsdale and others. Its backers include Thiel’s Founders Fund and In-Q-Tel, the venture arm of U.S. intelligence agencies.”

Palantir’s founding members came from such promising pools as PayPal alumni and Stanford computer science grads. The firm is famous for serving government intelligence agencies, but maintains clients in a range of fields. Its massive-scale data platforms allow even the largest organizations to integrate, manage, and secure all sorts of data. The company is based in Palo Alto, California, but has offices around the world.

Cynthia Murrell, November 07, 2014

Sponsored by ArnoldIT.com, developer of Augmentext

Insights from Search Pro Dave Hawking

November 7, 2014

Search-technology expert Dave Hawking is now working with Microsoft to improve Bing. Our own Stephen Arnold spoke to Mr. Hawking when he was still helping propel Funnelback to great heights. Now, IDM Magazine interviews the search wizard about his new gig, some search history, and challenges currently facing enterprise search in, “To Bing and Beyond.”

Anyone interested in the future of Bing, Microsoft, or enterprise search, or in Australian computer-science history, should check out the article. I was interested in this bit Hawking had to say about ways that tangled repository access can affect enterprise search:

“Access controls for particular repositories are often out of date, inappropriate, and inconsistent, and deployment of enterprise search exposes these problems. They can arise from organisational restructuring, staff changes or knee-jerk responses to unauthorised accesses. As there are usually a large number of repositories, rationalising access controls to ensure that search results respect policies is a lot of work.

“Organisations vary widely in their approach to security: some want security enforced with early binding (recording permissions at indexing time), others want late binding, where current permissions are applied when query results are displayed, or a hybrid of the two.

“This choice has a major impact on performance. Another option is ‘translucency’, where users may see the title of a document but not its content, or receive an indication that documents matching the query exist but that they need to request permission to access them. As well as these security model variations, organisations vary in their requirements for customization, integration and presentation, and how results from multiple repositories should be prioritized, tending to make enterprise search projects quite complex.”
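The early-binding versus late-binding distinction Hawking describes can be made concrete with a small sketch. The data model and function names below are hypothetical illustrations, not taken from any product he mentions: early binding trusts permissions captured when the index was built, while late binding consults the repository’s current ACLs at query time.

```python
# Early binding: the "allowed" set was recorded at indexing time and may be stale.
index = [
    {"doc": "budget.xlsx", "title": "FY15 Budget", "allowed": {"alice"}},
    {"doc": "memo.docx", "title": "Reorg Memo", "allowed": {"alice", "bob"}},
]

def early_binding_search(user, query):
    """Filter results using permissions captured when the index was built."""
    return [e["doc"] for e in index
            if query in e["title"].lower() and user in e["allowed"]]

def current_acl(doc):
    """Stand-in for a live permission lookup against the source repository."""
    live = {"budget.xlsx": {"alice", "bob"}, "memo.docx": {"bob"}}
    return live[doc]

def late_binding_search(user, query):
    """Filter results against the repository's current ACLs at query time."""
    return [e["doc"] for e in index
            if query in e["title"].lower() and user in current_acl(e["doc"])]

print(early_binding_search("bob", "budget"))  # [] -- the stale index denies bob
print(late_binding_search("bob", "budget"))   # ['budget.xlsx'] -- live ACL permits him
```

The performance trade-off Hawking notes follows directly: early binding filters against data already in the index, while late binding pays for a permission lookup per candidate result at query time.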

Eventually, standards and best practices may spread that will reduce these complexities. Then again, perhaps technology now changes too fast for such guidelines to take root. For now, at least, experts who can skillfully navigate this obstacle-strewn field will continue to command a pretty penny.

Cynthia Murrell, November 07, 2014


Google and Axel Springer: Traffic Means Power

November 6, 2014

In the summer of 2014, Axel Springer acquired 20 percent of the Pertimm-powered Qwant. As you may know, Qwant, which I profile in my current Information Today article, is a Web search engine with features. Believe me, lots of features. What Qwant does not have is traffic. Google’s Eric Schmidt believes the quirky system is a threat. From my lookout on top of the crest of the hill near the hollow in which I live in rural Kentucky, that strikes me as a very rotten red herring.

Axel Springer now understands the difference between the traffic generated by Qwant and other Web search engines and the Google, if I understand “German Publishing Giant Axel Springer Caves in over Google News Snippets Row.” The article reports:

Announcing the free license for Google yesterday, Axel Springer said that traffic to the sites had declined by nearly 40 percent since Google stopped producing snippets and thumbnails on October 23. It also claimed that traffic to the German sites from Google News was down by almost 80 percent.

You can work through the “real” journalistic approach to this point when you read the original article.

What’s important to me is that Google traffic flows are a powerful tool in Google’s negotiating arsenal. Even if you own a search engine, if you are not in Google, you don’t exist. I wonder how Edmund Gustav Albrecht Husserl would view this fact.

Stephen E Arnold, November 6, 2014

IBM Watson Has a Tough Question to Answer

November 6, 2014

In a sense, the erosion of a well-known company is interesting to watch. Some might use the word “bittersweet.” IBM has been struggling. Its “profits” come from stock buybacks, reductions in force, cost cutting, and divestitures. Coincident with the company’s quarterly financial reports, I heard two messages.

  1. We are not going to hit the 2015 targets we said we were going to hit.
  2. IBM paid another company money to “acquire” one of IBM’s semiconductor units.

I may have these facts wrong, but what’s important is that the messaging about IBM’s strategic health sends signals which I find troubling. IBM is a big company, and it will take time for its ultimate trajectory to be discernible. But from my vantage point in rural Kentucky, IBM has its work cut out for its thousands of professionals.

I read “Does Watson Know the Answer to IBM’s Woes?” Compared to other Technology Review write ups about IBM’s projected $10 billion revenue juggernaut, the article finally suggests that IBM’s Watson may not be the unit that produces billions in new revenue.

Here’s a passage I highlighted with my trusty yellow marker:

Watson is still a work in progress. Some companies and researchers testing Watson systems have reported difficulties in adapting the technology to work with their data sets. IBM’s CEO, Virginia Rometty, said in October last year that she expects Watson to bring in $10 billion in annual revenue in 10 years, even though that figure then stood at around $100 million.

Let’s consider this $100 million number. If it is accurate, Watson is now one eighth the size of Autonomy when HP paid $11 billion for the company. It took Autonomy more than 14 years to hit this figure. To produce $800 million in revenue, Autonomy had to invest in, license, and acquire technology and businesses. In total, Autonomy was more like an information processing holding company, not a company built on a one-trick pony like Google’s search and advertising technology.

Autonomy’s revenue was diversified for one good reason: it has been very difficult to build multi-billion-dollar businesses on basic search and retrieval. Google hit $60 billion because it hooked search to advertising. Autonomy generated roughly five times Endeca’s revenue because it was diversified. Endeca never broke out of three main product lines: ecommerce, search, and business intelligence. And Endeca never generated more than an estimated $160 million in revenue per year at the time of its sale to Oracle. Even Google’s search appliance fell short of Autonomy’s revenues.

Now IBM wants to generate more money from search than Autonomy did, and in one third the time. Perhaps IBM could emulate Mike Lynch’s business approach, but even then this seems like a bridge too far. (This is a more gentle way of saying, “Not possible in 60 months.”)
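The arithmetic behind the comparison is worth spelling out. The figures below are the rounded numbers cited in the post, not audited financials:

```python
# Revenue figures as cited in the post (rounded, in US dollars).
watson_revenue = 100_000_000     # Watson's reported annual revenue
autonomy_revenue = 800_000_000   # Autonomy's revenue around the HP acquisition
watson_target = 10_000_000_000   # Rometty's stated ten-year goal

# Watson today versus Autonomy at its sale.
print(autonomy_revenue / watson_revenue)  # 8.0 -- Watson is one eighth of Autonomy

# The growth Watson must deliver to hit the target.
print(watson_target / watson_revenue)     # 100.0 -- a hundredfold increase
```

A hundredfold increase in ten years is the scale of bet IBM is describing, which is why the Autonomy comparison stings.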

It is very difficult to generate billions of dollars from search without some amazing luck and an angle.

If IBM has $100 million in revenue, how will the company generate $1 billion and then an additional $9 billion? The PR razzle dazzle that has involved TV game shows, recipes with tamarind, and an all-out assault on mainstream media about Watson has been impressive. In search, $100 million is a pretty good achievement. But $100 million does not beget $1 billion without some significant breakthroughs in marketing, technology, must-have applications, and outstanding management.

From my point of view, Technology Review and other high profile “real” news outfits have parroted the IBM story about Watson, artificial intelligence, and curing cancer. To IBM’s credit, it has refrained from trying to cure death. Google has this task in hand.

The story includes a modest but refreshing statement about the improbability of Watson’s financial goal:

“It’s not taking off as quickly as they would like,” says Robert Austin, a professor of management at Copenhagen Business School who has studied IBM’s strategy over the years. “This is one of those areas where turning demos into real business value depends on the devils in the details. I think there’s a bold new world coming, but not as fast as some people think.”

As the story points out, “Watson is still a work in progress.”

Hey, no kidding?

Stephen E Arnold, November 6, 2014

Textio Is a Promising Text Analysis Startup

November 6, 2014

Here’s an interesting development from the world of text-processing technology. GeekWire reports, “Microsoft and Amazon Vets Form Textio, a New Startup Looking to Discover Patterns in Documents.” The new company expects to release its first product next spring. Writer John Cook tells us:

“Kieran Snyder, a linguistics expert who previously worked at Amazon and Microsoft’s Bing unit, and Jensen Harris, who spent 16 years at Microsoft, including stints running the user experience team for Windows 8, have formed a new data visualization startup by the name of Textio.

“The Seattle company’s tagline: ‘Turn business text into insights.’ The emergence of the startup was first reported by Re/code, which noted that the Textio tool could be used by companies to scour job descriptions, performance reviews and other corporate HR documents to uncover unintended discrimination. In fact, Textio was formed after Snyder conducted research on gender bias in performance reviews in the tech industry.”

That is an interesting origin, especially amid the discussions about gender that currently suffuse the tech community. Textio sees much room for improvement in text analytics, and hopes to help clients reach insights beyond those that competing platforms can divine. CEO Snyder’s doctorate and experience in linguistics and cognitive science should give the young company an edge in the competitive field.

Cynthia Murrell, November 06, 2014


A More Transparent Twitch

November 6, 2014

As many of you probably know, the website Twitch is a video platform for the gaming community. There, one can watch live streams and recordings of gameplay from a plethora of video games and, of course, chat about them. There is also sponsored content in the mix. Now, the Next Web tells us that “Twitch Promises ‘Complete Transparency’ with New Sponsored Content Policies.” The article relates:

“As Twitch has continued to grow, it has had to worry more and more about how its broadcasters behave. Today, the video game streaming service is addressing how sponsored content will live on the site going forward. Beginning today, all sponsored content on Twitch will have a Sponsored Channel badge applied to the stream. If you’re a subscriber to the Twitch newsletter you will also see a banner signifying sponsored content. Twitch wants to make sure that when a brand is sponsoring a stream — usually by offering up pre-release games or new games to popular broadcasters — viewers are aware of the deal between the broadcaster and the brand.”

Writer Roberto Baldwin adds that the site hopes to sidestep criticism with this move. He notes that, because Amazon acquired Twitch in August, we can expect more “grown-up corporation” moves from the service. Is that a good thing or a bad thing?

Cynthia Murrell, November 06, 2014


The Current Relevance of SharePoint

November 6, 2014

SharePoint is still very much alive in terms of number of deployments. However, the proverbial jury has pretty much decided that it is out-of-date software that needs a lot of customization to remain functional. CMS Wire covers it in their latest article, “SharePoint is Already Legacy.”

The article reflects on SharePoint’s history and legacy:

“It was built in a world that needed a better enterprise solution for basic document management capabilities than the big enterprise content management (ECM) vendors were offering. And it spread like wildfire because it was easier to deploy and was more end-user focused than the large ECM tools . . . the lack of functionality was exactly what made SharePoint so dangerous. It provided document management functionality that was good enough for end-users and IT with a much lower cost of deployment.”

So the low-cost solution grew legs and took over the enterprise. Now managers are struggling with how to keep it functional. One way to stay up-to-date is to keep an eye on ArnoldIT.com, particularly its SharePoint feed. Stephen E. Arnold is the expert behind the site; he has made a career out of providing thorough coverage of all things search to end users and managers alike.

Emily Rae Aldridge, November 6, 2014

Enterprise Search: Is It Really a Loser?

November 5, 2014

I read “Enterprise Search: Despite Benefits, Few Organizations Use Enterprise Search.” The headline caught my attention. In my experience, most organizations have information access systems. Let me give you several recent examples:

  • US government agency. This agency licenses technology from a start up called Red Owl Analytics. That system automatically gathers and makes available information pertinent to the licensing agency. One of the options available to the licensee is to process information that is available within the agency. The system generates outputs and there are functions that allow a user to look for information. I am reasonably confident that the phrase “enterprise search” would not be applied to this company’s information access system. Because Red Owl fits into a process for solving a business problem, the notion of “enterprise search” would be inappropriate.
  • Small accounting firm. This company uses Microsoft Windows 7. The six person staff uses a “workgroup” method that is easy to set up and maintain. The Windows 7 user can browse the drives to which access has been granted by the part time system administrator. When a person needs to locate a document, the built in search function is used. The solution is good enough. I know that when Windows-centric, third party solutions were made known to the owner, the response was, “Windows 7 search is good enough.”
  • Large health care company with dozens of operating units. The company has been working to integrate certain key systems. The largest ongoing project is deploying an electronic health care system. Each of the units has legacy search technology. The most popular search systems are those built into applications used every day. Database access is provided by these applications. One unit experimented with a Google Appliance and found that it was useful to the marketing department. Another unit has a RedDot content management system and has an Autonomy stub. The company has no plans, as I understand it, to make federated enterprise search a priority. There is no single reason. Other projects have higher priority and include a search function.

If my experience is representative (and I am not suggesting what I have encountered is what you will encounter), enterprise search is a tough sell. When I read this snippet, I was a bit surprised:

Enterprise search tools are expected to improve, and that may increase uptake of the technology. Steven Nicolaou, Principal Consultant at Microsoft, commented that “enterprise search products will become increasingly and more deeply integrated with existing platforms, allowing more types of content to be searchable and in more meaningful ways. It will also become increasingly commoditized, making it less of a dark art and more of a platform for discovery and analysis.”

What this means is that when a company provides “good enough” search baked into an operating system (think Windows) or an application (think of the search function in an electronic health record), there will be little room for a third party to land a deal in most cases.

The focus in enterprise search has been off the mark for many years. In fact, today’s vendors are recycling the benefits and features hawked 30 years ago. I posted a series of enterprise search vendor profiles at www.xenky.com/vendor-profiles. If you work through that information, you will find that the marketing approaches today are little more than demonstrations of recycling.

The opportunity in information access has shifted. The companies making sales and delivering solid utility to licensees are NOT the companies that beat the drum for customer support, indexing, and federated search.

The future belongs to information access systems that fit into mission critical business processes. Until the enterprise search vendors embrace a more innovative approach to information access, their future looks a bit cloudy.

In cooperation with Telestrategies, we may offer a seminar that talks about new directions in information access. The key is automation, analytics, and outputs that alert, not the old model of fiddling with a query until the index is unlocked and potentially useful information is available.

If you want more information about this invitation only seminar, write me at seaky2000 at yahoo dot com, and I will provide more information.

Stephen E Arnold, November 5, 2014

Google and Containers

November 5, 2014

I have been following the container technology now associated closely with Docker. The Docker Web site offers useful information about the innovation. In a nutshell, if you love virtualization, you know that it has some portability issues. If the notion of “portability” does not resonate with you, you probably won’t need to dig into containers. Containers eliminate most, but not all, of the hassles of creating an application, sticking it on a virtual machine somewhere, and then moving it, changing it, or integrating it. With containers, life gets a little easier.
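For readers new to the idea, the portability Docker offers comes from describing an application and its dependencies in a single image recipe, so the same container runs on a laptop, a virtual machine, or a cloud service. The Dockerfile below is a hypothetical sketch (the file name, base image, and application are illustrative, not taken from Google’s announcement):

```dockerfile
# Hypothetical example: bundle a small Python app with its runtime
# so the resulting image runs identically wherever Docker runs.

# Base image providing the language runtime.
FROM python:2.7

# Copy the application into the image.
COPY app.py /app/app.py
WORKDIR /app

# Command executed when the container starts.
CMD ["python", "app.py"]
```

Building this once (`docker build`) yields an image that moves between hosts without the re-provisioning a virtual machine image would require, which is the hassle reduction described above.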

In the write up “Google Cloud Platform Live: Introducing Container Engine, Cloud Networking and Much More,” Google embraces containers. The article is long by Google’s standards. You can work through it and learn one surprising thing: Google did not wrest container leadership from Docker.

I find this interesting because in the period prior to the run up for Google’s initial public offering, Google was the leader in distributed processing. I can recall Jeff Dean explaining some of Google’s innovations in a couple of lectures. I thought that Google had snagged the best ideas from research computing and productized them for the Google Web search system. Google looked like the leader in next generation cloud-centric processing.

Docker’s emergence as the go-to container company illustrates that Google has matured. The company is supporting containers in the manner of Docker. Google explains this in the article. Has Google lost its ability to spot and commercialize innovative ways to deliver apps and services?

This container move is not taking place in a vacuum. Amazon and others are eager to “me too” containers. And what of Docker? Good question.

Stephen E Arnold, November 5, 2014
