July 27, 2016
Salesforce.com is a cloud computing company that earns most of its revenue from customer relationship management software and from acquired commercial social networking applications. According to PC World, Salesforce recently suffered an outage, detailed in “Salesforce Outage Continues in Some Parts of the US.” In early May, Salesforce was down for more than twelve hours due to a file integrity issue in the NA14 database.
The outage occurred in the morning, with limited service restored later in the evening. Salesforce divides its customers into instances; the NA14 instance serves North America, and many of the customers who complained via Twitter are located in the US.
The exact details were:
“The database failure happened after “a successful site switch” of the NA14 instance “to resolve a service disruption that occurred between 00:47 to 02:39 UTC on May 10, 2016 due to a failure in the power distribution in the primary data center,” the company said. Later on Tuesday, Salesforce continued to report that users were still unable to access the service. It said it did not believe “at this point” that it would be able to repair the file integrity issue. Instead, it had shifted its focus to recovering from a prior backup, which had not been affected by the file integrity issues.”
Power outages like this are to be expected, and they will recur. Technology is only as reliable as its circuit breakers and power feeds. This is why it is recommended to back up your files in more than one place.
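The backup advice can be made concrete with a small sketch. This is not how Salesforce recovers an instance; it is a minimal illustration of keeping verified copies in more than one place, with invented file and directory names:

```python
import hashlib
import shutil
from pathlib import Path

def sha256(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def backup_to(src: Path, destinations: list[Path]) -> None:
    """Copy src to every destination directory and verify each copy's
    checksum, so a silent file integrity problem is caught at backup
    time rather than at restore time."""
    expected = sha256(src)
    for dest_dir in destinations:
        dest_dir.mkdir(parents=True, exist_ok=True)
        copy = dest_dir / src.name
        shutil.copy2(src, copy)
        if sha256(copy) != expected:
            raise IOError(f"Integrity check failed for {copy}")
```

In a real deployment the destinations would be separate machines or data centers, not two directories on the same disk; the checksum step is the part that matters.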
July 22, 2016
A company with a long history is getting fresh scrutiny. An article at Fortune reports, “This Little-Known Firm Is Getting Rich Off Your Medical Data.” Writer Adam Tanner informs us:
“A global company based in Danbury, Connecticut, IMS buys bulk data from pharmacy chains such as CVS , doctor’s electronic record systems such as Allscripts, claims from insurers such as Blue Cross Blue Shield and from others who handle your health information. The data is anonymized—stripped from the identifiers that identify individuals. In turn, IMS sells insights from its more than half a billion patient dossiers mainly to drug companies.
“So-called health care data mining is a growing market—and one largely dominated by IMS. Last week, the company reported 2015 net income of $417 million on revenue of $2.9 billion, compared with a loss of $189 million in 2014 (an acquisition also boosted revenue over the year). ‘The outlook for this business remains strong,’ CEO Ari Bousbib said in announcing the earnings.”
IMS Health dates back to the 1950s, when a medical ad man sought to make a buck on drug-sales marketing reports. In the 1980s and ‘90s, the company thrived selling profiles of specific doctors’ prescribing patterns to pharmaceutical marketing folks. Later, it moved into aggregating information on individual patients—anonymized, of course, in accordance with HIPAA rules.
Despite those rules, some are concerned about patient privacy. IMS does not disclose how it compiles its patient dossiers, and it is possible that records could someday become re-identifiable. One solution would be to allow patients to opt out of contributing their records to the collection, anonymized or not, as marketing data firm Acxiom began doing in 2013.
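To see why “anonymized” records can still link a patient across sources, consider a common de-identification pattern: drop the direct identifiers and replace the patient ID with a salted one-way hash. This is a generic sketch, not IMS’s actual pipeline (which the company does not disclose); the field names are invented:

```python
import hashlib

# Hypothetical direct identifiers; a real HIPAA Safe Harbor list is longer.
DIRECT_IDENTIFIERS = {"name", "ssn", "address", "phone"}

def deidentify(record: dict, salt: str) -> dict:
    """Drop direct identifiers and replace the patient ID with a salted
    one-way hash. The same patient hashes to the same token across
    records, so dossiers can be assembled without storing who the
    patient is -- which is exactly why re-identification concerns
    persist."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    token = hashlib.sha256((salt + record["patient_id"]).encode()).hexdigest()
    out["patient_id"] = token
    return out
```

The token is stable by design, which is what makes half a billion longitudinal dossiers possible; the privacy question is whether the remaining fields (diagnoses, dates, locations) narrow a token down to one person.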
Of course, it isn’t quite so simple for the consumer. Each health record system makes its own decisions about data sharing, so opting out could require changing doctors. On the other hand, many of us have little choice in our insurance provider, and a lot of those firms also share patient information. Will IMS move toward transparency, or continue to keep patients in the dark about the paths of their own medical data?
Cynthia Murrell, July 22, 2016
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on July 26, 2016. Information is at this link: http://bit.ly/29tVKpx.
July 20, 2016
The article titled “An Intranet Success Story” on BA Insight asserts that search is less about finding information than it is about user experience. In the context of intranets and search, the article discusses what makes for an effective search engine. Nationwide Insurance, for example, built a strong, award-winning intranet, which the article details:
“Their “Find Anything” locator, navigation search bar, and extended refiners are all great examples of the proven patterns we preach at BA Insight…The focus for SPOT was clear. It’s expressed in three points: Simple consumer-like experience, One-stop shop for knowledge, Things to make our jobs easier… All three of these connect directly to search that actually works. The Nationwide project has generated clear, documented business results.”
The results include engagement, efficiency, and cost savings, in the form of $1.5 million saved each year. What is most interesting about this article is the assumption that user experience trumps search results, or at least that search results are merely one aspect of search, not the alpha and omega. Rather, providing an intuitive, user-friendly experience should be the target. For Nationwide, part of that targeting process included identifying user experience as a priority. SPOT, Nationwide’s social intranet, is built on Yammer and SharePoint, and it remains one of the few successful and engaging intranet platforms.
Chelsea Kerwin, July 20, 2016
July 16, 2016
Just a factoid: there is now a version of Elasticsearch integrated with Cassandra. You can get the code for version 2.1.1-14 via GitHub. Just another example of the diffusion of the Elasticsearch system.
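Part of the appeal of such integrations is that the Elasticsearch REST search API stays the same regardless of where the data physically lives. A minimal sketch of building a standard match-query body (index, field, and values here are purely illustrative):

```python
import json

def match_query(field: str, text: str, size: int = 10) -> str:
    """Build the JSON body for a basic Elasticsearch match query.
    The same body is POSTed to /<index>/_search whether the backing
    store is Elasticsearch's own indices or, in the Cassandra-backed
    build, Cassandra tables."""
    body = {"size": size, "query": {"match": {field: text}}}
    return json.dumps(body)
```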
Stephen E Arnold, July 16, 2016
July 15, 2016
I read “Lessons To Learn From How Google Stores Its Data.” I noted a couple of interesting factoids (which I assume are spot on). The source is an “independent consultant and entrepreneur based out of Bangalore, India.”
- Google could be holding as much as 15 exabytes on their servers. That’s 15 million terrabytes [sic] of data which would be the equivalent of 30 million personal computers.
- “A typical database contains tables that perform specific tasks.”
- According to a paper published on the Google File System (GFS), the company duplicates each data indexed as many as three times. What this means is that if there are 20 petabytes of data indexed each day, Google will need to store as much as 60 petabytes of data.
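The arithmetic in that last bullet is easy to check: with the three-way replication the GFS paper describes, raw storage is simply the indexed volume times the replica count.

```python
def replicated_storage(daily_pb: float, replicas: int = 3) -> float:
    """Raw storage needed when each indexed byte is kept `replicas`
    times, as with GFS's default of three copies. 20 PB indexed per
    day at three replicas means 60 PB of raw storage per day."""
    return daily_pb * replicas
```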
As you digest these factoids, keep in mind the spelling issues, the obvious, and the reference to a decade old Google article.
Now the baloney. Google keeps its code in one big thing. Google scatters other data hither and yon. Google struggles to retrieve specific items from its helter-skelter setup when asked to provide something to a person with a legitimate request.
In short, Google is like other large companies wrestling with new, old, and changed data. The difference is that Google has the money and almost enough staff to deal with the bumps in the information superhighway.
The Google sells online ads; it does not lead the world in each and every technology, including data management. Bummer, right?
Stephen E Arnold, July 15, 2016
July 7, 2016
I was cruising through the outputs of my Overflight system and spotted a write up with the fetching title “Big Data Services | @CloudExpo #BigData #IoT #M2M #ML #InternetOfThings.” Unreadable? Nah. Just a somewhat interesting attempt to get a marketing write up indexed by a Web search engine. Unfortunately, humans have to get involved at some point. Thus, in my quest to learn what the heck Big Data is, I explored the content of the write up. What the article presents is mini summaries of slide decks developed by assorted mavens, wizards, and experts. I dutifully viewed most of the information but tired quickly as I moved through a truly unusual article about a conference held in early June. I assume the “news” is that the post-conference publicity is going to provide me with high value information in exchange for the time I invested in trying to figure out what the heck the title means.
I viewed a slide deck from an outfit called Cazena. You can view “Tech Primer: Big Data in the Cloud.” I want to highlight this deck because it contains one of the most amazing diagrams I have seen in months. Here’s the image:
Not only is the diagram enhanced by the colors and lines, the world it depicts is a listing of data management products. The image was produced in June 2015 by a consulting firm and recycled in “Tech Primer” a year later.
I assume the folks in the audience benefited from the presentation of information from mid tier consulting firms. I concluded that the title of the article is actually pretty clear.
I wonder: is a T shirt available with the database graphic? If so, I want one. Perhaps I can search for the strings “#M2M #ML.”
Stephen E Arnold, July 7, 2016
June 30, 2016
I read “Google Tools Up with Its Spanner Database, Looks for a Fight with AWS.” Interesting. Google continues to innovate in data management systems. Its MapReduce tool helped “spark” the Hadoopers. Now Spanner is being positioned as a cloud war-fighting machine. The write up reports:
Google has gone on the record to talk about Spanner in the past, saying it’s an SQL-like database that can run across multiple data centers, and is capable of scaling up to millions of machines in hundreds of data centers and trillions of database rows. It is “the first system to distribute data at global scale and support externally-consistent distributed transactions,” Google has said. Spanner’s most appealing feature is that it supports synchronous replication, which means that any changes made to the database will automatically be replicated across every data center in real-time, so the data stays consistent regardless of where it’s accessed from.
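The synchronous replication claim can be illustrated with a toy sketch: a write is acknowledged only after every replica has applied it, so a read from any replica returns the same value. This is a conceptual illustration only; Spanner’s actual design (Paxos groups, TrueTime timestamps) is far more involved:

```python
class SyncReplicatedStore:
    """Toy model of synchronous replication. A write blocks until all
    replicas have applied it, so reads are consistent regardless of
    which replica (data center) serves them. Asynchronous replication,
    by contrast, would acknowledge before the loop finished, leaving a
    window where replicas disagree."""

    def __init__(self, replica_count: int):
        self.replicas = [dict() for _ in range(replica_count)]

    def write(self, key, value) -> bool:
        for replica in self.replicas:   # apply to every copy before ack
            replica[key] = value
        return True                     # acknowledged only after all applied

    def read(self, key, replica_index: int = 0):
        return self.replicas[replica_index].get(key)
```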
But what is interesting to me is the headline: “A fight with AWS.” Let’s see how the Amazon fight is progressing. Amazon has a big cloud business. Amazon has a number of options to expand its enterprise services. Amazon has a big ecommerce business the costs of which are partially offset by the Amazon cloud business. Amazon has a search system which in my opinion is a work in progress.
Google has a fight with the EU and the challenge of those Facebookers’ surging ad business. Google also has the task of solving death and getting the Loon balloons aloft and generating revenue. Now the company, according to the write up, wants to fight with Amazon.
Fascinating. Oh, and details of the new data management system and its application to folks with real world problems? Not much info. I love to sit on the sidelines when companies allegedly engage in a multi-front war.
Stephen E Arnold, June 30, 2016
June 27, 2016
Ever wonder about the difference in the noise a bowhead whale makes versus a humpback whale? This is yet another query Google can answer. Tech Insider informed us that Google Search has a secret feature that shouts animal noises at you. This feature allows users to listen to 20 different animal sounds, but according to the article, it is not a well-known service yet. Available on mobile devices as well, this feature appears with a simple query of “what noise does an elephant make?” The post tells us,
“Ever wondered what noise a cow makes? Or a sheep? Or an elephant? No, of course you haven’t because you’re a normal adult with some grasp of reality. You know what noise a sheep makes. But let’s assume for a minute that you don’t. Well, not to worry: Google has got your back. That’s because as well as being a calculator, a tool for researching coworkers, and a portal for all the world’s information, Google has another, little-known feature … It’s capable of making animal noises. Lots of them.”
I don’t know if we would call 20 animal noises “a lot” considering the entirety of the animal kingdom, but it’s definitely a good start. As the article alludes to, the usefulness of this feature is questionable for adults, but perhaps it could be educational for kids or of some novelty interest to animal lovers of all ages. Search is always searching to deliver more.
Megan Feil, June 27, 2016
June 15, 2016
I read “Data Lakes vs Data Streams: Which Is Better?” The answer seems to me to be “both.” Streams are now. Lakes are “were.” Who wants to make decisions based on historical data? On the other hand, real time data may mislead the unwary data sailor. The write up states:
The availability of these new ways [lakes and streams] of storing and managing data has created a need for smarter, faster data storage and analytics tools to keep up with the scale and speed of the data. There is also a much broader set of users out there who want to be able to ask questions of their data themselves, perhaps to aid their decision making and drive their trading strategy in real-time rather than weekly or quarterly. And they don’t want to rely on or wait for someone else such as a dedicated business analyst or other limited resource to do the analysis for them. This increased ability and accessibility is creating whole new sets of users and completely new use cases, as well as transforming old ones.
Good news for self-appointed lake and stream experts. Bad news for a company trying to figure out how to generate new revenues.
The first step may be to answer some basic questions about what data are available, how reliable they are, and who actually knows data wrangling. Finding out whether the water is polluted is a good idea before diving into the murky depths of lakes and streams.
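The “both” answer is easy to demonstrate: the same aggregation can be computed lake-style, as a batch pass over the historical record, or stream-style, updated as each event arrives, and the two should agree. A generic sketch with invented event data:

```python
from collections import defaultdict

def batch_totals(events):
    """'Lake' style: recompute per-user totals over the full stored
    history. Simple and auditable, but always a snapshot of the past."""
    totals = defaultdict(float)
    for user, amount in events:
        totals[user] += amount
    return dict(totals)

class StreamTotals:
    """'Stream' style: fold each event into a running total the moment
    it arrives, so the answer is current without re-reading history."""

    def __init__(self):
        self.totals = defaultdict(float)

    def on_event(self, user, amount):
        self.totals[user] += amount
        return self.totals[user]
```

Run over the same events, the two converge; the trade-off is freshness versus the ability to re-run, audit, and correct, which is why most shops end up wanting both.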
Stephen E Arnold, June 15, 2016
June 8, 2016
Discrimination or wise precaution? Perhaps both? MakeUseOf tells us, “This Is Why Tor Users Are Being Blocked by Major Websites.” A recent study (PDF) by the University of Cambridge; University of California, Berkeley; University College London; and International Computer Science Institute, Berkeley confirms that many sites are actively blocking users who approach through a known Tor exit node. Writer Philip Bates explains:
“Users are finding that they’re faced with a substandard service from some websites, CAPTCHAs and other such nuisances from others, and in further cases, are denied access completely. The researchers argue that this: ‘Degraded service [results in Tor users] effectively being relegated to the role of second-class citizens on the Internet.’ Two good examples of prejudice hosting and content delivery firms are CloudFlare and Akamai — the latter of which either blocks Tor users or, in the case of Macys.com, infinitely redirects. CloudFlare, meanwhile, presents CAPTCHA to prove the user isn’t a malicious bot. It identifies large amounts of traffic from an exit node, then assigns a score to an IP address that determines whether the server has a good or bad reputation. This means that innocent users are treated the same way as those with negative intentions, just because they happen to use the same exit node.”
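The IP reputation scheme the researchers describe can be sketched in a few lines. This is a toy illustration of the general idea, not CloudFlare’s actual (undisclosed) scoring; the thresholds and IPs are invented:

```python
class IPReputation:
    """Toy per-IP reputation tracker. Traffic from one IP (such as a
    Tor exit node) is scored by the share of its requests flagged as
    malicious, and every user behind that IP gets the same treatment --
    which is how innocent Tor users end up facing CAPTCHAs or blocks."""

    def __init__(self):
        self.seen = {}  # ip -> (total_requests, flagged_requests)

    def record(self, ip: str, flagged: bool) -> None:
        total, bad = self.seen.get(ip, (0, 0))
        self.seen[ip] = (total + 1, bad + (1 if flagged else 0))

    def action(self, ip: str) -> str:
        total, bad = self.seen.get(ip, (0, 0))
        if total == 0:
            return "allow"          # unknown IP: no reputation yet
        ratio = bad / total
        if ratio > 0.5:
            return "block"          # mostly malicious traffic
        if ratio > 0.1:
            return "captcha"        # mixed traffic: challenge everyone
        return "allow"
```

The key point the study makes falls out directly: the score attaches to the exit node’s IP, not to any individual user behind it.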
The article goes on to discuss legitimate reasons users might want the privacy Tor provides, as well as reasons companies feel they must protect their websites from anonymous users. Bates notes that there is not much one can do about such measures. He does point to Tor’s own Don’t Block Me project, which is working to convince sites to stop blocking people just for using Tor. It is also developing a list of best practices that concerned sites can follow instead. One site, GameFAQs, has reportedly lifted its block, and CloudFlare may be considering a similar move. Will the momentum build, or must those who protect their online privacy resign themselves to being treated with suspicion?
Cynthia Murrell, June 8, 2016