Attivio: Search and Almost Everything Else
October 24, 2016
I spent a few minutes catching up with the news on the Attivio blog. You can find the information at this link. As I worked through the write ups over the past five weeks, I was struck by the diversity of Attivio’s marketing messages. Here are the ones which I noted:
- Attivio is a cognitive computing company, not a search or database company
- Attivio has an interest in governance and risk / compliance
- Attivio is involved in Big Data management
- Attivio is active in anti fraud solutions
- Attivio embraces NoSQL
- Attivio knows about modernizing an organization’s data architecture
- Attivio is a business intelligence solution.
My reaction to these capabilities is twofold:
First, for a company which has its roots in Fast Search & Transfer type of software, Attivio has added a number of applications to basic content processing and information access. Attivio embodies the vision Fast Search articulated before the company ran into some “challenges” and sold to Microsoft in 2008. Fast Search, as I understood the vision, was a platform upon which information applications could be built. Attivio appears to be heading in that direction.
The second reaction is that Attivio is churning out capabilities which embody buzzwords, jargon, and trends. Like a fisherman in a bass boat, the Attivio approach is to use different lures in order to snag a decent sized bass. I find it difficult to accept the assertion that a company rooted in search can deliver in the array of technical niches the blog posts reference.
The major takeaway for me was that Attivio has hired a new Chief Revenue Officer whose job is to generate revenue from the company’s “data catalog” business. I learned from “Attivio Names New Chief Revenue Officer”:
Connon [the insider who took over the revenue job] sees his new role as a reflection of the growing demand for technology that can break down data silos and help successful companies answer, not just the question of “what” the data is reporting, but identify correlation and patterns to answer critical “why” questions. Connon is passionate when he talks about the value of Attivio’s newest technology solution—the Semantic Data Catalog–and its ability to unify a wide array of data for a diverse customer base. “The Semantic Data Catalog is not just for financial service industries. It’s truly a horizontal technology solution that can benefit companies in any industry with data—in other words, with any company, in any industry,” explains Connon. “Our established Cognitive Search and Insight technology provides the foundation for our Semantic Data Catalog to provide companies with a self-service, permission-based ability to locate, sort, and analyze key information across an unlimited number of data applications,” adds Connon.
For me, Attivio’s “momentum” in marketing has to be converted to sustainable revenue. My assumption is that almost every professional at a software / services company sells and generates revenue. When a company lags in revenue, will one person be able to generate revenue?
I don’t have an answer. Worth monitoring to learn if the Chief Revenue Officer can deliver the money.
Stephen E Arnold, October 24, 2016
Partnership Aims to Establish AI Conventions
October 24, 2016
Artificial intelligence research has been booming, and it is easy to see why: recent advances in the field have opened some exciting possibilities, both for business and society as a whole. Still, it is important to proceed carefully, given the potential dangers of relying too much on the judgement of algorithms. The Philadelphia Inquirer reports on a joint effort to develop some AI principles and best practices in its article, “Why This AI Partnership Could Bring Profits to These Tech Titans.” Writer Chiradeep BasuMallick explains:
Given this backdrop, the grandly named Partnership on AI to Benefit People and Society is a bold move by Alphabet, Facebook, IBM and Microsoft. These globally revered companies are literally creating a technology Justice League on a quest to shape public/government opinion on AI and to push for friendly policies regarding its application and general audience acceptability. And it should reward investors along the way.
The job at hand is very simple: Create a wave of goodwill for AI, talk about the best practices and then indirectly push for change. Remember, global laws are still obscure when it comes to AI and its impact.
Curiously enough, this elite team is missing two major heavyweights. Apple and Tesla Motors are notably absent. Apple Chief Executive Tim Cook, always secretive about AI work, though we know about the estimated $200 million Turi project, is probably waiting for a more opportune moment. And Elon Musk, co-founder, chief executive and product architect of Tesla Motors, has his own platform to promote technology, called OpenAI.
Along with representatives of each participating company, the partnership also includes some independent experts in the AI field. To say that technology is advancing faster than the law can keep up with is a vast understatement. This ongoing imbalance underscores the urgency of this group’s mission to develop best practices for companies and recommendations for legislators. Their work could do a lot to shape the future of AI and, by extension, society itself. Stay tuned.
Cynthia Murrell, October 24, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Half of the Largest Companies: Threat Vulnerable
October 24, 2016
Compromised Credentials, a research report by Digital Shadows, reveals that around 1,000 of the Forbes Global 2000 companies are at risk because their employees’ credentials have been leaked or compromised.
As reported by Channel EMEA in Digital Shadows Global Study Reveals UAE Tops List in Middle East for…
The report found that 97 percent of those 1000 of the Forbes Global 2000 companies, spanning all businesses sectors and geographical regions, had leaked credentials publicly available online, many of them from third-party breaches.
Owing to large-scale data breaches in recent times, the credentials of 5.5 million employees are publicly available for anyone to see. Breaches at social networks like LinkedIn, MySpace and Tumblr were the source of these leaks, the report states.
Analyzed geographically, companies in the Middle East seem to be the most affected:
The report revealed that the most affected country in the Middle East – with over 15,000 leaked credentials was the UAE. Saudi Arabia (3360), Kuwait (203) followed by Qatar (99) made up the rest of the list. This figure is relatively small as compared to the global figure due to the lower percentage of organizations that reside in the Middle East.
Affected organizations may not be able to contain the damage simply by resetting employees’ passwords. They also need to determine whether the leaked information is current, unique, and not merely reposted from an earlier breach. Moreover, blanket password resets can cause a lot of friction within the organizations’ IT departments.
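The triage step described above, checking whether leaked credentials are unique and current rather than reposted, can be sketched in a few lines. This is a minimal illustration, not the Digital Shadows method: the record layout (email, password hash, date seen), the sample values, and the function names are all assumptions made for the example.

```python
from datetime import date

# Hypothetical leak records: (email, password_hash, date the entry was seen).
leaked = [
    ("a@example.com", "h1", date(2012, 6, 5)),
    ("a@example.com", "h1", date(2016, 5, 18)),  # same pair, reposted later
    ("b@example.com", "h2", date(2016, 9, 1)),
]


def unique_credentials(records):
    """Collapse reposts: keep the earliest sighting of each (email, hash) pair."""
    first_seen = {}
    for email, pw_hash, seen in records:
        key = (email, pw_hash)
        if key not in first_seen or seen < first_seen[key]:
            first_seen[key] = seen
    return first_seen


def recent_only(first_seen, cutoff):
    """Keep only credentials whose first sighting is on or after the cutoff."""
    return {key: seen for key, seen in first_seen.items() if seen >= cutoff}


unique = unique_credentials(leaked)
fresh = recent_only(unique, date(2016, 1, 1))
print(len(leaked), len(unique), len(fresh))
```

In this toy data set, three raw entries shrink to two unique credentials, and only one of those first appeared in 2016. The point is that a raw count of leaked entries can overstate how many passwords actually need attention.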
Without proper analysis, it will be difficult for the affected companies to gauge the extent of the damage. But considering the PR nightmare it leads to, will these companies come forward and acknowledge the breaches?
Vishal Ingole, October 24, 2016
Rocket Software: Video Marketing Moment
October 23, 2016
I did a quick, routine check of Rocket Software’s search and text analytics Web page at this link. I saw a snippet of text and then a link to a new video:
Rocket is a player in the five day IBM Watson conference later this month. What’s interesting about the WOW 2016 event is that no list of participating companies is available via a search on IBM.com or via public Web search systems. Interesting. A five day event with many luminaries I surmise.
Stephen E Arnold, October 23, 2016
Computerworld Becomes ComputerWatson
October 22, 2016
I followed a series of links to three articles about IBM Watson. Here are the stories I read:
- Watson’s the Name, Data’s the Game, dated October 10, 2016
- Milestones Along the Way in Watson’s Colorful History, October 10, 2016
- It’s (Not) Elementary: How Watson Works, October 10, 2016
The publication running these three write ups is Computerworld, which I translated as “ComputerWatson.”
Intrigued by the notion of “news,” I learned:
Watson uses some 50 technologies today, tapping artificial-intelligence techniques such as machine learning, deep learning, neural networks, natural language processing, computer vision, speech recognition and sentiment analysis.
But IBM does not like the idea of artificial intelligence even though I have spotted such synonyms in other “real” news write ups; for example, “augmented intelligence.”
There are factoids like “Watson can read more than 800 pages a second.” Figure 125 words per “page” and that works out to 100,000 words per second which is a nice round number. Does Watson perform this magic on a basic laptop? Probably not. What are the bandwidth and storage requirements? Oh, not a peep.
Computerworld—I mean ComputerWatson—provides a complete timeline of the technology too. The future begins in 1997. Imagine that. Boom. Watson wins at chess.
The “history” of Watson is embellished with a fanciful account of how IBM trained via humans assembling information. How much does corpus assembly cost? ComputerWatson—oh, I meant “Computerworld”—does not dive into investment.
To make Watson’s inner workings clear, the “real” news write up provides a link to an IBM video. Here’s an example of the cartoonish presentation:
These three write ups strike me as a public relations exercise. If IBM paid Computerworld to write and run these stories, the three articles are advertising. Who wrote these “news stories”? The byline is Katherine Noyes, who describes herself as “an ardent geek.” Her beat? Enterprise software, cloud computing, big data, analytics, and artificial intelligence.
Remarkable stuff but I had several thoughts:
- Not much “news” is included in the articles. It seems to me that the information has appeared in other writings.
- IBM Watson is working overtime to be recognized as the leader in the smart software game. That’s okay, but IBM seems to be pushing big markets with no easy way to monetize its efforts; for example, education, cancer, and game show participation.
- The Computerworld IBM Watson content party strikes me as eroding the credibility of both outfits.
Oh, I remember. Dave Schubmehl, the fellow who tried to sell on Amazon reports containing my research without my consent, was hooked up with IDG. I have lost track of the wizard, but I do recall the connection. More information is here.
Yep, credibility for possible content marketing and possible presentation of “news” as marketing collateral. Fascinating. Perhaps I should ask Watson: “What’s up?”
Stephen E Arnold, October 22, 2016
Google Finds That Times Change: Privacy Redefined
October 21, 2016
I read “Google Has Quietly Dropped Ban on Personally Identifiable Web Tracking.” The main idea is that an individual can be mapped to just about anything in the Google-verse. The write up points out that in 2007, one of the chief Googlers said that privacy was a “number one priority when we [the Google] contemplate new kinds of advertising products.”
That was before Facebook saddled up with former Googlers (aka Xooglers) and started to ride the ad pony, detailed user information, and the interstellar beast of user generated content. Googlers knew that social was a big deal, probably more important than offering Boolean operators and time stamp metadata for users of its index. But that was then and this is now.
The write up reveals:
But this summer, Google quietly erased that last privacy line in the sand – literally crossing out the lines in its privacy policy that promised to keep the two pots of data separate by default. In its place, Google substituted new language that says browsing habits “may be” combined with what the company learns from the use Gmail and other tools. The change is enabled by default for new Google accounts. Existing users were prompted to opt-in to the change this summer.
I must admit that when I saw the information, I ignored it. I don’t use too many Google services, and I am not one of the cats in the bag that Google is carrying to and fro. I am old (73), happy with my BlackBerry, and I don’t use mobile search. But the shift is an important part of the “new” Alphabet Google thing.
Tracking users 24×7 is the new black in Sillycon Valley. The yip yap about privacy, ethics, and making explicit what data are gathered is noise. Buy a new Pixel phone and live the dream, gentle reader.
You can work through the story cited above for more details. My thoughts went a slightly different direction:
- Facebook poses a significant challenge to Google, and today Google does not have a viable alternative to offer its users.
- The shift to mobile means that Google has to — note the phrase “has to” — find a way to juice up ad revenues. Sure, these are okay, but to keep the Loon balloons aloft more dough is needed.
- Higher value data boils down to detailed information about specific users, their cohorts, their affinity groups, and their behaviors. As the economy continues to struggle, the Alphabet Google thing will have data to buttress the Google ad sales professionals’ pitches to customers.
- Offering nifty data to nation states like China-type countries may allow Google to enter a new market with the Pixel and mobile search as Trojan horses.
In my monograph “Google Version 2.0: The Calculating Predator,” I described some of the technical underpinnings of Google’s acquisitions and inventors. With more data, the value of these innovations may begin to pay off. If the money does not flow, Google Version 3.0 may be a reprise of the agonies of the Yahooligans. Those Guha and Halevy “inventions” are fascinating in their scope and capabilities. Think about an email for which one can know who wrote it, who received it, who read it, who changed it, what the changes were, who the downstream recipients were, and other assorted informational gems.
Allow me to leave you with a single question:
Do you think the Alphabet Google thing was not collecting fine grained data prior to the official announcement?
Although the Google 2.0 monograph is out of print, I have a pre publication copy available as a PDF. If you want a copy, write my intrepid sales manager, Ben Kent, at benkent2020 at yahoo dot com. Yep, Yahoo. Inept as it may be, Yahoo is not the GOOG. The Facebook, however, remains the Facebook, and that’s one of Google’s irritants.
Stephen E Arnold, October 21, 2016
The Thrill of Rising Yahoo Traffic
October 21, 2016
I love the Gray Lady. The Bits column is chock full of technology items which inspire, excite, and sometimes implant silly ideas in readers’ minds. That’s real journalism.
Navigate to “Daily Report: Explaining Yahoo’s Unexpected Rise in Traffic.”
The write up pivots on the idea that Internet traffic can be monitored in a way that is accurate and makes sense. A click is a click. A packet is a packet. Makes sense. Then there are the “minor” points of figuring out which clicks are from humans and which clicks are from automated scripts performing some function like probing for soft spots. There are outfits which generate clicks for various reasons including running down a company’s advertising “checkbook.” There are clicks which ask such questions as, “Are you alive?” or “What’s the response time?” You get the idea because you have a bit of doubt about traffic generated by a landing page, a Web site, or even an ad. The counting thing is difficult.
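To make the counting problem concrete, here is a minimal sketch that labels logged requests by their user-agent string. The marker list, the sample hits, and the function names are assumptions for illustration only. No real analytics product is being described, and scripts that imitate browsers would slip through unnoticed.

```python
from collections import Counter

# Hypothetical user-agent fragments; a real list would be far longer
# and would still miss scripts that pretend to be browsers.
BOT_MARKERS = ("bot", "crawler", "spider", "curl", "monitor")


def classify_hit(user_agent: str) -> str:
    """Label one logged request as 'automated' or 'unknown'."""
    ua = user_agent.lower()
    if any(marker in ua for marker in BOT_MARKERS):
        return "automated"
    # No marker found: this does NOT prove a human clicked.
    return "unknown"


def summarize_traffic(user_agents):
    """Count hits per label for an iterable of user-agent strings."""
    return Counter(classify_hit(ua) for ua in user_agents)


sample_hits = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Googlebot/2.1 (+http://www.google.com/bot.html)",
    "curl/7.50.1",
    "UptimeMonitor/1.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12)",
]
counts = summarize_traffic(sample_hits)
print(counts)
```

The sketch deliberately has no "human" label. The best a filter like this can say about the remaining hits is "unknown," which is the mushiness at issue.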
The write up in the Gray Lady assumes that these “minor” points are irrelevant in the Yahoo scheme of things; for example:
an increased number of people were drawn to Yahoo in September. The reason may have been Yahoo’s disclosure that month that hackers stole data on 500 million users in 2014.
“People”? How do we know that the traffic is people?
The Gray Lady states:
Yahoo’s traffic has been declining for a long time, overtaken by more adept, varied and apparently secure places to stay on the internet.
Let’s think about this. We don’t know if the traffic data are counting humans, software scripts, or utility functions. We do know that Yahoo has been on a glide path to a green field without rocks and ruts. We know that Yahoo is a bit of a hoot in terms of management.
My hunch is that Yahoo’s traffic is pretty much what it has been; that is, oscillating a bit but heading in for a landing, either hard or soft.
Suggesting that Yahoo may be growing is interesting but unfounded. That traffic stuff is mushy. What’s the traffic to the New York Times’s pay walled subsite? How does the Times know that a click is a human from a “partner” and not a third party scraping content?
And maybe the traffic spike is a result of disenchanted Yahoo users logging in to change their password or cancel their accounts.
Stephen E Arnold, October 21, 2016
Twitter: A Security Breach
October 21, 2016
Several years ago, the Beyond Search Twitter account was compromised. I received emails about tweets relating to a pop singer named Miley Cyrus. We knew the Twitter CTO at the time and it took about 10 days to fix the issue. At that time, I knew that Twitter had an issue.
I read “Passwords for 32 Million Twitter Accounts May Have Been Hacked and Leaked.” I learned:
the data comes from a Twitter hack in which 32 million Twitter accounts may have been compromised. The incident and the news comes from a rather unusual source that lets you download such data and even lets you remove yourself from the listing for free.
No word about how many days will be consumed addressing affected accounts.
Stephen E Arnold, October 21, 2016
Picking Away at Predictive Programs
October 21, 2016
I read “Predicting Terrorism From Big Data Challenges U.S. Intelligence.” I assume that Bloomberg knows that Thomson Reuters licenses the Palantir Technologies Metropolitan suite to provide certain information to Thomson Reuters’ customers. Nevertheless, I was surprised at some of the information presented in this “real” journalism write up.
The main point is that numerical recipes cannot predict what, when, where, why, and how bad actors will do bad things. Excluding financial fraud, which seems to be a fertile field for wrong doing, the article chases the terrorist angle.
I learned:
- Connect the dots is a popular phrase, but connecting the dots to create a meaningful picture of bad actors’ future actions is tough
- Big data is a “fundamental fuel”
- Intel, PredPol, and Global Intellectual Property Enforcement Center are working in the field of “predictive policing”
- The buzzword “total information awareness” is once again okay to use in public
I highlighted this passage attributed to a big thinker at the Brennan Center for Justice at NYU School of Law:
Computer algorithms also fail to understand the context of data, such as whether someone commenting on social media is joking or serious,
Several observations:
- Not a single peep about Google Deep Mind and Recorded Future, outfits which I consider the leaders in the predictive ball game
- Not a hint that Bloomberg was itself late to the party because Thomson Reuters, not exactly an innovation speed demon, saw value in Palantir’s methods
- Not much about what “predictive technology” does.
In short, the write up delivers a modest payload in my opinion. I predict that more work will be needed to explain the interaction of math, data, and law enforcement. I don’t think a five minute segment with talking heads on Bloomberg TV will do it.
Stephen E Arnold, October 21, 2016
Falcon Searches Through Browser History
October 21, 2016
Have you ever visited a Web site and then lost the address or could not find a particular section on it? You know that the page exists, but no matter how often you use an advanced search feature or scour through your browser history it cannot be found. If you use Google Chrome as your main browser, then there is a solution, says GHacks in the article, “Falcon: Full-Text history Search For Chrome.”
Falcon is a Google Chrome extension that adds full-text history search to the browser. Chrome usually suggests previously visited Web sites by matching their titles and URLs against what you type into the address bar. The Falcon extension augments the default behavior to match text found on previously visited Web sites.
Falcon is a search option within a search feature:
The main advantage of Falcon over Chrome’s default way of returning results is that it may provide you with better results. If the title or URL of a page don’t contain the keyword you entered in the address bar, it won’t be displayed by Chrome as a suggestion even if the page is full of that keyword. With Falcon, that page may be returned as well in the suggestions.
The new Chrome extension acts as a filter on recorded Web history and improves the search experience, so users do not have to sift through results one by one.
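The difference between title/URL matching and full-text matching can be shown with a toy history list. This is a sketch under stated assumptions, not Falcon's code or Chrome's API: the `VisitedPage` record, the stored `body` text, and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VisitedPage:
    url: str
    title: str
    body: str  # page text captured at visit time (assumed to be stored)


def title_url_matches(pages, keyword):
    """Approximate the default behavior: match on title or URL only."""
    kw = keyword.lower()
    return [p for p in pages if kw in p.title.lower() or kw in p.url.lower()]


def full_text_matches(pages, keyword):
    """Also return pages whose body text contains the keyword."""
    kw = keyword.lower()
    return [
        p for p in pages
        if kw in p.title.lower() or kw in p.url.lower() or kw in p.body.lower()
    ]


history = [
    VisitedPage("https://example.com/recipes", "Weeknight Dinners",
                "A slow cooker chili that feeds six."),
    VisitedPage("https://example.com/chili-guide", "Chili Guide",
                "Peppers ranked by heat."),
]

default_hits = [p.url for p in title_url_matches(history, "chili")]
falcon_like_hits = [p.url for p in full_text_matches(history, "chili")]
print(default_hits)
print(falcon_like_hits)
```

The recipes page mentions the keyword only in its body, so title/URL matching misses it while full-text matching returns it. A production tool would presumably use an index rather than a linear scan, but the matching rule is the same.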
Whitney Grace, October 21, 2016