April 29, 2016
There is a new tool for organizations to more quickly detect whether their sensitive data has been hacked. The Atlantic discusses “The Spider that Crawls the Dark Web Looking for Stolen Data.” Until now, it was often many moons before an organization realized it had been hacked. Matchlight, from Terbium Labs, offers a more proactive approach. The service combs the corners of the Dark Web looking for the “fingerprints” of its clients’ information. Writer Kaveh Waddell reveals how it is done:
“Once Matchlight has an index of what’s being traded on the Internet, it needs to compare it against its clients’ data. But instead of keeping a database of sensitive and private client information to compare against, Terbium uses cryptographic hashes to find stolen data.
“Hashes are functions that create an effectively unique fingerprint based on a file or a message. They’re particularly useful here because they only work in one direction: You can’t figure out what the original input was just by looking at a fingerprint. So clients can use hashing to create fingerprints of their sensitive data, and send them on to Terbium; Terbium then uses the same hash function on the data its web crawler comes across. If anything matches, the red flag goes up. Rogers says the program can find matches in a matter of minutes after a dataset is posted.”
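The fingerprint matching described in the quote can be sketched in a few lines. This is only an illustration of the general idea, not Terbium's implementation: the article does not name the hash function or say how records are split up, so SHA-256, the whole-record matching, the normalization step, and every function name here are assumptions.

```python
import hashlib


def fingerprint(record: str) -> str:
    """Return a SHA-256 hex digest of a record (hash choice is an assumption)."""
    # Normalizing case and whitespace is an illustrative choice so that
    # trivially different copies of the same value still match.
    normalized = record.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def client_fingerprints(records):
    """Client side: hash the sensitive records; only the digests are shared."""
    return {fingerprint(r) for r in records}


def find_matches(crawled_items, known_fingerprints):
    """Service side: hash each crawled item and flag those whose digest is known."""
    return [item for item in crawled_items if fingerprint(item) in known_fingerprints]


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    shared = client_fingerprints(["alice@example.com", "4111 1111 1111 1111"])
    crawled = ["  Alice@Example.com ", "bob@example.com"]
    print(find_matches(crawled, shared))
```

The service only ever holds digests, which is the point the quote makes. One caveat: a plain hash of short or guessable values can be reversed by brute force, so a real system would need more care than this sketch shows.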
What an organization does with this information is, of course, up to them; but whatever the response, now they can implement it much sooner than if they had not used Matchlight. Terbium CEO Danny Rogers reports that, each day, his company sends out several thousand alerts to their clients. Founded in 2013, Terbium Labs is based in Baltimore, Maryland. As of this writing, they are looking to hire a software engineer and an analyst, in case anyone here is interested.
Cynthia Murrell, April 29, 2016
April 27, 2016
It looks like some hackers are no longer afraid of the proverbial light, we learn from “Sony Hackers Still Active, ‘Darkhotel’ Checks Out of Hotel Hacking” at InformationWeek. Writer Kelly Jackson Higgins cites Kaspersky security researcher Juan Andres Guerrero-Saade, who observes that those behind the 2014 Sony hack, thought to be based in North Korea, did not vanish from the scene after that infamous attack. Higgins continues:
“There has been a noticeable shift in how some advanced threat groups such as this respond after being publicly outed by security researchers. Historically, cyber espionage gangs would go dark. ‘They would immediately shut down their infrastructure when they were reported on,’ said Kurt Baumgartner, principal security researcher with Kaspersky Lab. ‘You just didn’t see the return of an actor sometimes for years at a time.’
“But Baumgartner says he’s seen a dramatic shift in the past few years in how these groups react to publicity. Take Darkhotel, the Korean-speaking attack group known for hacking into WiFi networks at luxury hotels in order to target corporate and government executives. Darkhotel is no longer waging hotel-targeted attacks — but they aren’t hiding out, either.
“In July, Darkhotel was spotted employing a zero-day Adobe Flash exploit pilfered from the HackingTeam breach. ‘Within 48 hours, they took the Flash exploit down … They left a loosely configured server’ exposed, however, he told Dark Reading. ‘That’s unusual for an APT [advanced persistent threat] group.’”
Seeming to care little about public exposure, Darkhotel has moved on to other projects, like reportedly using Webmail to attack targets in Southeast Asia.
On the other hand, one group which experts had expected to see more of has remained dark for some time. We learn:
“Kaspersky Lab still hasn’t seen any sign of the so-called Equation Group, the nation-state threat actor operation that the security firm exposed early last year and that fell off its radar screen in January of 2014. The Equation Group, which has ties to Stuxnet and Flame as well as clues that point to a US connection, was found with advanced tools and techniques including the ability to hack air gapped computers, and to reprogram victims’ hard drives so its malware can’t be detected nor erased. While Kaspersky Lab stopped short of attributing the group to the National Security Agency (NSA), security experts say all signs indicate that the Equation Group equals the NSA.”
The Kaspersky team doesn’t think for a minute that this group has stopped operating, but believes the group has changed up its communications. Whether a group continues to lurk in the shadows or walks boldly in the open may be cultural, they say; those in the Far East seem to care less about leaving tracks. Interesting.
Cynthia Murrell, April 27, 2016
April 21, 2016
Is Google trying to emulate BAE Systems’ NetReveal, IBM i2, and systems from Palantir? Looking back at an older article from Search Engine Watch, “How the Semantic Web Changes Everything for Search,” may provide insight. At the time, Knowledge Graph had just launched, and along with it came a wave of communications generating buzz about a new era of search moving from string-based queries to a semantic approach, organizing by “things.” The write-up explains:
“The cornerstone of any march to a semantic future is the organization of data and in recent years Google has worked hard in the acquisition space to help ensure that they have both the structure and the data in place to begin creating “entities”. In buying Wavii, a natural language processing business, and Waze, a business with reams of data on local traffic and by plugging into the CIA World Factbook, Freebase and Wikipedia and other information sources, Google has begun delivering in-search info on people, places and things.”
The article noted what Knowledge Graph implied for Google: this semantic approach could deliver stronger, more relevant advertising. Even today, we see the Alphabet Google thing continuing to shift from search to other interesting information access functions in order to sell ads.
Megan Feil, April 21, 2016
April 12, 2016
I read “With Government Data Unlocked, MIT Tries to Make It Easier to Sift Through.” I came away from the write up a bit confused. I recall that Palantir Technologies offered, for a short period of time, a site called AnalyzeThe.US. It disappeared. I also recall seeing a job posting for a person with a top secret clearance who knew Tableau (Excel on steroids) and Palantir Gotham (augmented intelligence). I am getting old, but I thought that Michael Kim, once a Deloitte wizard, gave a lecture about how one can use Palantir for analytics.
Why is this important?
The write up points out that MIT worked with Deloitte which, I learned:
provided funding and expertise on how people use government data sets in business and for research.
The Gray Lady’s article does not see any DNA linking AnalyzeThe.US, Deloitte, and the “new” Data USA site. Palantir’s Stephanie Yu gave a talk at MIT. I wonder if those in that session perceive any connection between Palantir and MIT. Who knows. I wonder if the MIT site makes use of AngularJS.
With regard to US government information, www.data.gov is still online. In my experience, the information can be a challenge to wrangle for a person without Tableau and Palantir expertise. For those who don’t think Palantir is into sales, my view is that Palantir sells via intermediaries. The deal, in this type of MIT case, is to try to get some MIT students bitten by the Gotham and Metropolis fever. Thank goodness I am not a real journalist trying to figure out who provides what to whom and for what reason. Okay, back to contemplating the pond filled with Kentucky mine run off water.
Stephen E Arnold, April 12, 2016
April 8, 2016
I read “Business Analytics Is a Big Sham and Over Rated.” My hunch is that the write up is a bit of April fool baloney. But, maybe not?
Many vendors are changing their marketing collateral to proclaim one very special outfit can make sense out of oodles of data and textual information.
The write up makes some interesting statements; for example:
Analysts waste every one’s time. Perhaps the statement should be “often are too busy to deal with requests for their services.”
But the write up is an April Fool joke. The problem is that large organizations and government entities want a silver bullet. Who has witnessed the implosion of a massive enterprise software project?
In my experience, business analytics are becoming a must have function. The problem is that the hoo haa tossed around by vendors and pundits seems reasonably accurate.
Humor and reality are one.
Stephen E Arnold, April 8, 2016
April 6, 2016
I read “IBM Launches Mainframe Platform for Spark.” This is an announcement which makes sense to me. The Watson baloney annoys; the mainframe news thrills.
According to the write up:
IBM is expanding its embrace of Apache Spark with the release of a mainframe platform that would allow the emerging open-source analytics framework to run natively on the company’s mainframe operating system.
I noted this passage as well:
The IBM platform also seeks to leverage Spark’s in-memory processing approach to crunching data. Hence, the z Systems platform includes data abstraction and integration services so that z/OS analytics applications can leverage standard Spark APIs. That approach eliminates processing and security issues associated with ETL while allowing organizations to analyze data in-place.
Hopefully IBM will play to its strengths, not chase rainbows.
Stephen E Arnold, April 6, 2016
April 1, 2016
According to the capitalist tool:
A new survey of data scientists found that they spend most of their time massaging rather than mining or modeling data.
The point is that few wizards want to come to grips with the problem of figuring out what’s wrong with data in a set or a stream and then getting the data into a form that can be used with reasonable confidence.
Those exception folders, annoying, aren’t they?
The write up points out that a data scientist spends 80 percent of his or her time doing housecleaning. Skip the job and the house becomes unpleasant indeed.
The survey also reveals that data scientists have to organize the data to be analyzed. Imagine that. The baloney about automatically sucking in a wide range of data does not match the reality of the survey sample.
Another grim bit of drudgery emerges from the sample, which we assume was conducted with the appropriate textbook procedures: the skills most in demand were for SQL. Yep, old school.
Consider that most of the companies marketing next generation data mining and analytics systems never discuss grunt work and old fashioned data management.
Why the disconnect?
My hunch is that it is the sizzle, not the steak, which sells. Little wonder that some analytics outputs might be lab-made hamburger.
Stephen E Arnold, April 1, 2016
March 30, 2016
Here is a helpful list from Street Fight that could help small and mid-sized businesses find a data analysis platform that is right for them—“5 Self-Service Predictive Analytics Platforms.” Writer Stephanie Miles notes that, with nearly a quarter of small and mid-sized organizations reporting plans to adopt predictive analytics, vendors are rolling out platforms for companies with smaller pockets than those of multinational corporations. She writes:
“A 2015 survey by Dresner Advisory Services found that predictive analytics is still in the early stages of deployment, with just 27% of organizations currently using these techniques. In a separate survey by IDG Enterprise, 24% of small and mid-size organizations said they planned to invest in predictive analytics to gain more value from their data in the next 12 months. In an effort to encourage this growth and expand their base of users, vendors with business intelligence software are introducing more self-service platforms. Many of these platforms include predictive analytics capabilities that business owners can utilize to make smarter marketing and operations decisions. Here are five of the options available right now.”
Here are the five platforms listed in the write-up: Versium’s Datafinder; IBM’s Watson Analytics; Predixion, which can run within Excel; Canopy Labs; and Spotfire from TIBCO. See the article for Miles’ description of each of these options.
Cynthia Murrell, March 30, 2016
March 29, 2016
Years ago I was a rental to an outfit called i2 Group in the UK. Please, don’t confuse the UK i2 with the ecommerce i2 which chugged along in the US of A.
The UK i2 had a product called Analysts Notebook. At one time it was basking in a 95 percent share of the law enforcement and intelligence market for augmented investigatory software. Analysts Notebook is still alive and kicking in the loving arms of IBM.
I thought of the vagaries of product naming when I read “Expert System USA Launches Analysts’ Workspace.”
According to the write up:
Analysts’ Workspace features comprehensive enterprise search and case management software integrated with a customizable semantic engine. It incorporates a sophisticated and efficient workflow process that enables team-wide collaboration and rapid information sharing. The product includes an intuitive dashboard allowing analysts to monitor, navigate, and access information using different taxonomies, maps, and worldviews, as well as intelligent workflow features specifically designed to proactively support analysts and investigators in the different phases of their activities.
The lingo reminds me of the early i2 Group marketing collateral. The terminology has surfaced in some of Palantir’s marketing statements and, quite recently, in the explanation of the venture funded Digital Shadows’ service.
I love me-too products. Where would one be if Mozart had not heard and remembered the note sequences of other composers?
Now the trick will be to make some money. Mozart, though a very good me-too innovator, struggled in that department. Expert System, according to Google Finance, is going to have to find a way to keep that share price climbing. Today’s (March 22, 2016) share price is in penny stock territory.
Stephen E Arnold, March 29, 2016
March 29, 2016
Short honk: Put your code hat on. “Mining Mailboxes with Elasticsearch and Kibana” walks a reader through using open source technology to do text analysis. The example under the microscope is email, but the method will work for any text corpus ingested by Elasticsearch. The write up includes code samples and enough explanation to get the Elastic system moving forward. Visualizations are included. These make it easy to spot certain trends; for example, the top recipients of the email analyzed for the tutorial. Worth a look.
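For a flavor of the “top recipients” analysis, here is a rough sketch of the kind of Elasticsearch request involved. It is not taken from the tutorial: the field name `to`, the aggregation name `top_recipients`, and both helper functions are hypothetical, and the field is assumed to be indexed as an exact-value (non-analyzed) field so that whole addresses are counted rather than tokens. The code only builds the request body and reads a response; sending it to a cluster is left out.

```python
def top_recipients_query(field="to", size=10):
    """Build a request body for a terms aggregation over the recipient field."""
    return {
        "size": 0,  # return no individual documents, only the aggregation
        "aggs": {
            "top_recipients": {
                "terms": {"field": field, "size": size}
            }
        },
    }


def parse_top_recipients(response):
    """Pull (recipient, message count) pairs out of an aggregation response."""
    buckets = response["aggregations"]["top_recipients"]["buckets"]
    return [(bucket["key"], bucket["doc_count"]) for bucket in buckets]


if __name__ == "__main__":
    print(top_recipients_query(size=5))
    # A made-up response in the shape Elasticsearch returns for a terms
    # aggregation; the addresses and counts are illustrative only.
    sample = {
        "aggregations": {
            "top_recipients": {
                "buckets": [
                    {"key": "team@example.com", "doc_count": 42},
                    {"key": "boss@example.com", "doc_count": 17},
                ]
            }
        }
    }
    print(parse_top_recipients(sample))
```

Kibana builds this sort of aggregation behind the scenes when one sets up a bar chart of the most frequent values in a field.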
Stephen E Arnold, March 29, 2016