Chrome Restricts Extensions amid Security Threats
June 22, 2015
Despite efforts to maintain an open Internet, malware seems to be pushing online explorers into walled gardens, akin to the old AOL setup. The trend is illustrated by a story at PandoDaily, “Security Trumps Ideology as Google Closes Off its Chrome Platform.” Beginning this July, Chrome users will only be able to download extensions for that browser from the official Chrome Web Store. This change follows one made in March, when apps submitted to Google’s Play Store began to require a review. These are extreme measures to combat an extreme problem with malicious software.
The company tried a middle-ground approach last year, when it imposed the our-store-only policy on all users except those on Chrome’s development build. The makers of malware, though, are adaptable creatures; they found a way to force users into the development channel, then slip in their pernicious extensions. Writer Nathaniel Mott welcomes the changes, given the realities:
“It’s hard to convince people that they should use open platforms that leave them vulnerable to attack. There are good reasons to support those platforms—like limiting the influence tech companies have on the world’s information and avoiding government backdoors—but those pale in comparison to everyday security concerns. Google seems to have realized this. The chaos of openness has been replaced by the order of closed-off systems, not because the company has abandoned its ideals, but because protecting consumers is more important than ideology.”
Better safe than sorry? Perhaps.
Cynthia Murrell, June 22, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Expert System Acquires TEMIS
June 22, 2015
In a move to improve its product offerings, Expert System acquired TEMIS. The two companies will combine their assets to create a leading semantic provider for cognitive computing. Reuters described the acquisition in very sparse detail: “Expert System Signs Agreement To Acquire French TEMIS SA.”
Reuters described the merger as follows:
“Reported on Wednesday that it [Expert System] signed binding agreement to buy 100 percent of TEMIS SA, a French company offering solutions in text analytics
- Deal value is 12 million euros ($13.13 million)”
TEMIS creates technology that helps organizations leverage, manage, and structure their unstructured information assets. It is best known for Luxid, which identifies and extracts information to semantically enrich content with domain-specific metadata.
Expert System, on the other hand, is another semantically inclined company, and its flagship product is Cogito. The Cogito software is designed to understand content within unstructured text, systems, and analytics. The goal is to give organizations a complete picture of their information, because Cogito actually understands what it is processing.
TEMIS and Expert System have similar goals: to make unstructured data useful to organizations. Other than the actual acquisition deal, details on how Expert System plans to use TEMIS have not been revealed. Expert System, of course, plans to use TEMIS to improve its own semantic technology and increase revenue. Both companies are pleased with the acquisition, but if you consider other recent buyouts, the cost to Expert System is very modest. A $13 million price tag puts the valuations of other text analysis companies in perspective; most would cost considerably more than TEMIS.
Whitney Grace, June 22, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
How to Search Craigslist
June 21, 2015
Short honk. Looking for an item on Craigslist.org? The main Craigslist.org site wants you to look in your area and then manually grind through listings for other areas region by region. I read “How to Search All Craigslist at Once.” The article does a good job of explaining how to use Google and Ad Huntr. The write up lists some other Craigslist search tools as well. A happy quack for Karar Halder, who assembled the article.
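For the impatient, the core trick in the article is a site-restricted Google query. Below is a minimal sketch of that approach; the helper function and its name are my own illustration, not something from the write up.

```python
# A minimal sketch of the Google approach the article describes:
# restrict a Google search to craigslist.org so Google's index does the
# cross-region searching that craigslist.org itself will not.
import urllib.parse

def craigslist_google_url(item):
    """Build a Google search URL covering all Craigslist regions."""
    query = 'site:craigslist.org "{}"'.format(item)
    return "https://www.google.com/search?q=" + urllib.parse.quote_plus(query)

print(craigslist_google_url("vintage typewriter"))
# https://www.google.com/search?q=site%3Acraigslist.org+%22vintage+typewriter%22
```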
Stephen E Arnold, June 21, 2015
Amazon, Pages, and Research
June 21, 2015
I read “What If Authors Were Paid Every Time Someone Turned a Page.” As you may know, I have complained directly and through my attorney because IDC and its wizard Dave Schubmehl sold a report containing my information on Amazon. The mid-tier consulting firm pegged a $3,500 price tag on an eight-page report based on my work. Well, as Jack Benny used to say. Well.
The publisher/consultant behavior annoyed me, but I do not sell my content via Amazon. I would rather give away a report than get tangled in the Bezos buzz saw. Sure, I buy talcum powder from the Zon, but that’s because the grocery in Harrod’s Creek does not sell any talcum powder. The Zon gets the product to me in a few days. Sometimes.
My thoughts about Amazon ramped up a notch when I read this passage in the article from The Atlantic:
Soon, the maker of the Kindle is going to flip the formula used for reimbursing some of the authors who depend on it for sales. Instead of paying these authors by the book, Amazon will soon start paying authors based on how many pages are read—not how many pages are downloaded, but how many pages are displayed on the screen long enough to be parsed. So much for the old publishing-industry cliché that it doesn’t matter how many people read your book, only how many buy it. For the many authors who publish directly through Amazon, the new model could warp the priorities of writing: A system with per-page payouts is a system that rewards cliffhangers and mysteries across all genres. It rewards anything that keeps people hooked, even if that means putting less of an emphasis on nuance and complexity.
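To make the incentive shift concrete, here is a back-of-the-envelope comparison of the two payout models. The figures are invented for illustration; at the time of the announcement, Amazon had not published a per-page rate.

```python
# Hypothetical comparison of per-book vs. per-page author payouts.
# All rates below are invented for illustration only.
per_book_royalty = 2.00   # assumed flat payout per copy under the old model
per_page_rate = 0.006     # assumed payout per page actually read

book_pages = 300
pages_read = 45           # a reader who abandons the book early

print("Per-book model: $%.2f" % per_book_royalty)              # $2.00
print("Per-page model: $%.2f" % (pages_read * per_page_rate))  # $0.27
print("Fully read:     $%.2f" % (book_pages * per_page_rate))  # $1.80
# Under per-page payouts, an abandoned book earns pennies; only books that
# keep readers turning pages approach the old flat royalty.
```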
Several observations:
- I often buy digital and hard copy books because I need access to a specific passage. I recently ordered a book about law enforcement and the Web. I was interested in two chapters and the bibliographies for those chapters. The notion of paying the author, a police professional, for only the pages I examined rubs me the wrong way. I have the book, and I may need to access other chapters at a different point in time. But I want the author to be paid for this very good work. If I understand the write up, Amazon wants to move in a different direction.
- When I got a book via Amazon for my Kindle, I thought I could use the book as long as I had the device. Well. (There’s the Benny word again.) I have experienced disappearing content. My wife asked me where a title was, and I said, “In the archive.” Nope. The title was disappeared. Nifty. I contacted Amazon via a form and heard nothing back. Who got paid? Amazon, but I no longer have the digital book. Nifty, but I probably made a mistake, or at least that’s what outfits operating like Time Warner-type companies tell me. My fault.
- Amazon, like the Google, is faced with cost projections that are likely to give accountants headaches and sleepless nights. Amazon, a digital Wal-Mart type operation, is going to squeeze revenue any way possible. Someone has to pay for the Amazon phone and other Amazon adventures. Same day groceries, anyone?
Net net: No wonder the second hand book stores in Louisville, Kentucky are crowded. Physical books work the way they have for centuries, thank you. You will be able to buy my new study from the electronic store we have set up. The book will even be available in hard copy if a person wants a tangible instance. Maybe I will sell fewer copies. That’s okay. I prefer to avoid being clever and to make my work available to anyone who wants to access it. None of that IDC-like behavior either. $3,500 for eight pages. Crazy, right?
I often purchase fiction books, read a few pages, and then decide the book is not in my wheelhouse. I want the author to get paid whether I read every page or not. I think the author wants to get paid as well. The only outfit that doesn’t want to pay may be the Zon.
Stephen E Arnold, June 21, 2015
Sprylogics Repositioned to Mobile Search
June 20, 2015
I learned about Cluuz.com in a briefing in a gray building in a gray room with gray carpeting. The person yapping explained how i2 Ltd.-type relationship analysis was influencing certain intelligence-centric software. I jotted down some URLs the speaker mentioned.
When I returned to my office, I checked out the URLs. I found the Cluuz.com service interesting. The system allowed me to run a query, review results with inline extracts, and explore relationship visualizations among entities. In that 2007 version of Cluuz.com’s system, I found the presentation and the inclusion of emails, phone numbers, and parent-child relationships quite useful. The demonstration used queries passed against Web indexes. Technically, Cluuz.com belonged to the category of search systems which I call “metasearch” engines. The Googles and Yahoos indexed the Web; Cluuz.com added value. Nifty.
I chased down Alex Zivkovic, the individual then identified as the chief technical professional at Sprylogics. You can read my 2008 interview with Zivkovic in my Search Wizards Speak collection. The Cluuz.com system originated with a former military professional’s vision for information analysis. According to Zivkovic, the prime mover for Cluuz.com was Avi Shachar. At the time of the interview, the company focused on enterprise customers.
Zivkovic told me in 2008:
We have clustering. We have entity extraction. We have relationship analysis in a graph format. I want to point out that for enterprise applications, the Cluuz.com functions are significantly more rich. For example, a query can be run across internal content and external content. The user sees that the internal information is useful but not exactly on point. Our graph technology makes it easy for the user to spot useful information from an external source such as the Web in conjunction with the internal information. With a single click, the user can be looking into those information objects. We think we have come up with a very useful way to allow an organization to give its professionals an efficient way to search for content that is behind the firewall and on the Web. The main point, however, is that the user does not have to be trained. Our graphical interface makes it obvious what information is available from which source. Instead of formulating complex queries, the person doing the search can scan, click, and browse. Trips back to the search box are optional, not mandatory.
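As an aside, the relationship-graph idea Zivkovic describes can be illustrated with a toy co-occurrence graph. The sketch below is my own illustration, not Cluuz.com’s actual pipeline; a real system would use a trained entity extractor rather than a capitalization heuristic.

```python
# Toy illustration of entity-relationship graphing over search results:
# pull "entities" from result snippets, then link entities that co-occur.
from itertools import combinations
from collections import Counter

snippets = [
    "Avi Shachar founded Sprylogics in Toronto.",
    "Alex Zivkovic of Sprylogics discussed Cluuz.com.",
    "Cluuz.com visualizes entities extracted from Web results.",
]

def extract_entities(text):
    # Stand-in extractor: treat capitalized tokens as entities.
    return {tok.strip(".,") for tok in text.split() if tok[0].isupper()}

edges = Counter()
for snippet in snippets:
    for pair in combinations(sorted(extract_entities(snippet)), 2):
        edges[pair] += 1  # edge weight = number of co-occurrences

for (a, b), weight in edges.most_common():
    print("%s -- %s (weight %d)" % (a, b, weight))
```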
I visited the Sprylogics.com Web site the other day and learned that the Cluuz.com-type technology has been repackaged as a mobile search solution and real time sports application.
There is a very good explanation of the company’s use of its technology in a more consumer-friendly presentation. You can find that presentation at this link, but the material can be removed at any time, so don’t blame me if the link is dead when you try to review the explanation of the 2015 version of Sprylogics.
From my point of view, the Sprylogics’ repositioning is an excellent example of how a company with technology designed for intelligence professionals can be packaged into a consumer application. The firm has more than a dozen patents, which some search and content processing companies cannot match. The semantic functions and the system’s ability to process Web content in near real time make the firm’s Poynt product interesting to me.
Sprylogics’ approach, in my opinion, is a far more innovative way to leverage advanced content processing capabilities than the approaches taken by most search vendors. It is easier to slap a customer relationship management, customer support, or business intelligence label on what is essentially search and retrieval software than to create a consumer-facing app.
Kudos to Sprylogics. The ArnoldIT team hopes their stock, which is listed on the Toronto Stock Exchange, takes wing.
Stephen E Arnold, June 20, 2015
Content Grooming: An Opportunity for Tamr
June 20, 2015
Think back. Vivisimo asserted that it deduplicated and presented federated search results. There are folks at Oracle who have pointed to Outside In and other file conversion products available from the database company as a way to deal with different types of data. There are specialist vendors, which I will not name, who are today touting their software’s ability to turn a basket of data types into well-behaved rows and columns complete with metatags.
Well, not so fast.
Unifying structured and unstructured information is a time consuming, expensive process. That is the reason for the obese exception files where objects which cannot be processed go to live out their short, brutish lives.
I read “Tamr Snaps Up $25.2 Million to Unify Enterprise Data.” The stakeholders know, as do I, that unifying disparate types of data is an elephant in any indexing or content analytics conference room. Only the naive believe that software whips heterogeneous data into Napoleonic War parade formations. Today’s software processing tools cannot get undercover police officers to look shipshape for the mayor.
Ergo, an outfit with an aversion to the vowel “e” plans to capture the flag on top of the money pile available for data normalization and information polishing. The write up states:
Tamr can create a central catalogue of all these data sources (and spreadsheets and logs) spread out across the company and give greater visibility into what exactly a company has. This has value on so many levels, but especially on a security level in light of all the recent high-profile breaches. If you do lose something, at least you have a sense of what you lost (unlike with so many breaches).
Tamr is correct. Organizations don’t know what data they have. I could mention a US government agency which does not know what data reside on the server next to another server managed by the same system administrator. But I shall not. The problem is common and it is not confined to bureaucratic blenders in government entities.
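What does a “central catalogue” look like in practice? A crude but concrete version appears below. It is my own sketch, not Tamr’s method, and the directory path is hypothetical.

```python
# Crude data-source cataloguing: walk a directory tree, fingerprint each
# CSV file's header, and group files that share a schema. A sketch only;
# real catalogue tools also profile databases, logs, and spreadsheets.
import csv
import os
from collections import defaultdict

def catalogue_csv_sources(root):
    """Map a schema fingerprint (tuple of column names) to file paths."""
    catalogue = defaultdict(list)
    for dirpath, _dirs, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(".csv"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, newline="") as handle:
                header = next(csv.reader(handle), [])
            schema = tuple(col.strip().lower() for col in header)
            catalogue[schema].append(path)
    return catalogue

for schema, paths in catalogue_csv_sources("/data").items():  # hypothetical root
    print(len(paths), "file(s) with columns:", schema)
```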
Tamr, despite the oddball spelling, has Michael Stonebraker, a true wizard, on the task. The write up mentions an outfit which might be politely described as “database challenged” as a customer. If Thomson Reuters cannot figure out data after decades of effort and millions upon millions in investment, believe me when I point out that Tamr may be on to something.
Stephen E Arnold, June 20, 2015
Watson and Coffee Shops. Smart Software Needs More Than a Latte
June 19, 2015
I read “IBM Watson Analytics Helps Grind Big Data in Unmanned Coffee Shops.” I promised myself I would not call attention to the wild and wonderful Watson public relations efforts. But coffee shops?
The main idea is that:
IBM has worked with Revive Vending to create systems for unmanned coffee shops that tap into the cognitive computing technology of Watson Analytics for data analysis.
Note the verb: past tense. I would have preferred “is working,” but presumably Watson is no longer sipping its latte at Revive.
According to the article:
IBM’s cloud-powered analytics service is used to crunch the vending machine data and form a picture of customers. Summerill [a Revive executive] explained that Watson Analytics allows Honest Café to understand which customers sit and have a drink with friends, and which ones dash in to grab a quick coffee while on the move. Transactional data is analyzed to see how people pay for their food and drinks at certain times of the day so that Honest Café can automatically offer relevant promotions and products to individual customers.
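For perspective, the analysis described, bucketing transactions by time of day and payment type, is a few lines of ordinary code over POS logs. The sketch below uses invented records and field names.

```python
# Toy version of the described analysis: count purchases by daypart and
# payment method from ordinary POS transaction records (fields invented).
from collections import Counter
from datetime import datetime

transactions = [
    {"time": "2015-06-10T08:05", "item": "latte",    "payment": "card"},
    {"time": "2015-06-10T08:12", "item": "espresso", "payment": "cash"},
    {"time": "2015-06-10T13:40", "item": "latte",    "payment": "card"},
]

counts = Counter()
for t in transactions:
    hour = datetime.strptime(t["time"], "%Y-%m-%dT%H:%M").hour
    daypart = "morning" if hour < 12 else "afternoon/evening"
    counts[(daypart, t["payment"])] += 1

for (daypart, payment), n in counts.most_common():
    print("%s / %s: %d purchase(s)" % (daypart, payment, n))
```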
The write up also includes a galling statement from my pals at IDC, the outfit which sold my content without my permission on Amazon courtesy of the wizard Dave Schubmehl:
Miya Knights, senior research analyst at IDC, said that the mass of data generated by retailers through networked systems that cover retail activity can be used to support increasingly complex and sophisticated customer interactions.
Okay, but don’t point-of-sale systems (whether manual or automated) track these data? With a small operation, why not use what’s provided by the POS vendor?
The answer to the question is that IBM is chasing demo customers even to small coffee shops. IDC, ever quick to offer obvious comments without facts to substantiate the assertion, is right there. Why? Maybe IDC sells professional services to IBM?
Where are the revenue reports which substantiate Watson’s market success? Where are substantive case examples from major firms? Where is a public demonstration of Watson using Wikipedia information?
Think about these questions as you sip your cheap 7-11 coffee, gentle reader.
Ponder this: there may be nothing substantive to report, which is why I learn about unmanned coffee shops unable to figure out who bought what without IBM Watson. Overkill? Yep.
Stephen E Arnold, June 19, 2015
Big Data and Old, Incomplete Listicles
June 19, 2015
I enjoy lists of the most important companies, the top 25 vendors of a specialized service, and a list of companies I should monitor. Wonderful stuff because I encounter firms about which I have zero information in my files and about which I have heard nary a word.
An interesting list appears in “50 Big Data Companies to Follow.” The idea is that I should set up a Google Alert for each company and direct my Overflight system to filter content mentioning these firms. The problem with this post is that the information does not originate with Datamation or Data Science Center. The list was formulated by Sand Hill.com in a story called “Sand Hill 50 ‘Swift and Strong’ in Big Data.” The list was compiled prior to its publication in January 2014, which makes it 18 months old. With the speed of change in Big Data, the list, in my opinion, is stale.
A similar list appears in “CRN 50 Big Data Business Analytics Companies,” which appears on the KDnuggets.com Web site. This list appears to date from the middle of 2014, which makes it about a year old. Better, but not fresh.
I did locate an update called “2015 Big Data 100: Business Analytics.” Locating a current list of Big Data companies was not easy. Presumably my search skills are subpar. Nevertheless, the list is interesting.
Here are some firms in Big Data which were new to me:
- Guavus
- Knime
- Zoomdata
But the problem was that the CRN Web site presented only 46 vendors, not 100.
Observations:
- Datamation is pushing out, via its feed, links to old content originating on other publishers’ Web sites
- The obscurity of the names is the defining characteristic of these lists
- Getting a comprehensive, current list of Big Data vendors is difficult. Data Science just listed 15 companies and backlinked to Sand Hill. CRN displayed 46 companies but forced me to click on each listing. I could not view the entire list.
Not too useful, folks.
Stephen E Arnold, June 19, 2015
Cloud Search: Are Data Secure?
June 19, 2015
I have seen a flurry of news announcements about Coveo’s cloud based enterprise search. You can review a representative example by reading “Coveo Lassos the Cloud for Enterprise Search.” Coveo is also aware of the questions about security. See “How Does Coveo Secure Your Data and Services.”
With Coveo’s me-too cloud service, I thought about other vendors which offer cloud-based solutions. The most robust, based on our tests, is Blossom Search. The company was founded by Dr. Alan Feuer, a former Bell Labs wizard. When my team was active in government work, we used the Blossom system to index a Federal law enforcement agency’s content shortly after Blossom opened for business in 1999. As government procurements unfolded, Blossom was nosed out by an established government contractor, but the experience made three things clear:
- Blossom’s indexing method delivered near real time updates
- Creating and building an initial index was four times faster than the reference systems against which we tested Dr. Feuer’s solution. (The two reference systems were Fast Search & Transfer and Verity.)
- The Blossom security method conformed to the US government guidelines in effect at the time we did the work.
I read “Billions of Records at Risk from Mobile App Data Flow.” With search shifting from the desktop to other types of computing devices, I formulated several questions:
- Are vendors deploying search on clouds similar to Amazon’s system and method taking steps to ensure the security of their customers’ data? Open source vendors like resellers of Elastic and proprietary vendors like MarkLogic are likely to be giving some additional thought to the security of their customers’ data.
- Are licensees of cloud based search systems performing security reviews as we did when we implemented the Blossom search system? I am not sure if the responsibility for this security review rests with the vendor, the licensee, or a third party contracted to perform the work.
- How secure are hybrid systems; that is, an enterprise search or content processing system which pulls, processes, and stores customer data across disparate systems? Google, based on my experience, does a good job of handling search security for the Google Search Appliance and for Site Search. Other vendors may be taking similar steps, but that information is not presented alongside their basic marketing materials.
My view is that certain types of enterprise search may benefit from a cloud based solution. There will be other situations in which the licensee has a contractual or regulatory obligation to maintain indexes and content in systems which minimize the likelihood of alarmist headlines like “Billions of Records at Risk from Mobile App Data Flow.”
Security is a search industry topic which is moving up to number one with a “bullet.”
Stephen E Arnold, June 19, 2015
Latest Version of DataStax Enterprise Now Available
June 19, 2015
A post over at SD Times informs us, “DataStax Enterprise 4.7 Released.” Enterprise is DataStax’s platform that helps organizations manage Apache Cassandra databases. Writer Rob Marvin tells us:
“DataStax Enterprise (DSE) 4.7 includes a production-certified version of Cassandra 2.1, and it adds enhanced enterprise search, analytics, security, in-memory, and database monitoring capabilities. These include a new certified version of Apache Solr and Live Indexing, a new DSE feature that makes data immediately available for search by leveraging Cassandra’s native ability to run across multiple data centers. …
“DSE 4.7 also adds enhancements to security and encryption through integration with the DataStax OpsCenter 5.2 visual-management and monitoring console. Using OpsCenter, developers can store encryption keys on servers outside the DSE cluster and use the Lightweight Directory Access Protocol to manage admin security.”
Four main features/updates are listed in the write-up: extended search analytics, intelligent query routing, fault-tolerant search operations, and upgraded analytics functionality. See the article for details on each of these improvements.
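For the curious, here is a minimal sketch of how an application might query a DSE Search-indexed table from Python using the DataStax driver. The keyspace, table, and field names are invented, and the sketch assumes a DSE cluster with Search enabled and a Solr core built on the table.

```python
# Minimal sketch: query a DSE Search-indexed Cassandra table via CQL.
# Requires the DataStax Python driver (pip install cassandra-driver).
# Keyspace, table, and columns below are hypothetical.
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])        # address of a DSE node
session = cluster.connect("demo_ks")    # hypothetical keyspace

# DSE Search exposes the Solr index through the solr_query pseudo-column.
rows = session.execute(
    "SELECT id, title FROM articles WHERE solr_query = %s",
    ["title:cassandra AND body:search"],
)
for row in rows:
    print(row.id, row.title)

cluster.shutdown()
```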
Founded in 2010, DataStax is headquartered in San Mateo, California. Clients for their Cassandra-management software (and related training and professional services) range from young startups to Fortune 100 companies.
Cynthia Murrell, June 19, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph