Automated Tools for Dark Web Data Tracking
September 15, 2016
Naturally, tracking stolen data through the dark web is a challenge. Investigators have traditionally infiltrated chatrooms and forums, a tedious procedure with no guarantee of success. Now, automated tools may give organizations a leg up, we learn from the article, “Tools to Track Stolen Data Through the Dark Web” at GCN. Reporter Mark Pomerleau informs us:
“The Department of Veterans Affairs last month said it was seeking software that can search the dark web for exploited VA data improperly outside its control, distinguish between VA data and other data, and create a ‘one-way encrypted hash’ of VA data to ensure that other parties cannot ascertain or use it. The software would also use VA’s encrypted data hash to search the dark web for VA content.” We learned:
Some companies, such as Terbium Labs, have developed similar hashing technologies. ‘It’s not code that’s embedded in the data so much as a computation done on the data itself,’ Danny Rogers, a Terbium Labs co-founder, told Defense One regarding its cryptographic hashing. This capability essentially enables a company or agency to recognize its stolen data if discovered. Bitglass, a cloud access security broker, uses watermarking technology to track stolen data. A digital watermark or encryption algorithm is applied to files such as spreadsheets, Word documents or PDFs that requires users to go through an authentication process in order to access it.
We’re told such watermarks can even thwart hackers who try to copy-and-paste content into a new document, and that Bitglass tests its technology by leaking false data onto the dark web and tracking it. Pomerleau notes that regulations can make it difficult to implement commercial solutions within a government agency. However, government personnel are highly motivated to find solutions that will let them work securely outside the office.
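Neither GCN nor the vendors publish implementation details, but the one-way hash idea is simple enough to sketch. Below is a minimal illustration in Python, with invented sample records and a hypothetical key: keyed HMAC-SHA-256 fingerprints of sensitive records are compared against tokens scraped from a dark web page, so matches can be detected without the fingerprints themselves revealing the underlying data.

```python
import hashlib
import hmac

# Hypothetical agency-held key; because the hash is keyed, an outside
# party who obtains the fingerprints cannot ascertain or reproduce the data.
SECRET_KEY = b"agency-held-secret"

def fingerprint(record: str) -> str:
    """One-way keyed hash (HMAC-SHA-256) of a normalized record."""
    normalized = record.strip().lower()
    return hmac.new(SECRET_KEY, normalized.encode("utf-8"), hashlib.sha256).hexdigest()

# Fingerprints of the agency's own sensitive records (sample data, invented).
known = {fingerprint(r) for r in ["123-45-6789", "jane.doe@example.gov"]}

# Tokens scraped from a dark web page (also invented).
scraped_tokens = ["lorem", "123-45-6789", "ipsum"]

for token in scraped_tokens:
    if fingerprint(token) in known:
        print(f"Match found: scraped page contains agency data ({token!r})")
```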
The article wraps up with a mention of DARPA’s Memex search engine, designed to plumb the even-more-extensive deep web. Law enforcement is currently using Memex, but the software is expected to eventually make it to the commercial market.
Cynthia Murrell, September 15, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on September 27, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233599645/
Mobile Data May Help Fight Disease
September 14, 2016
Data from smartphones and other mobile devices may give us a new tool in the fight against communicable diseases. Penn State News reports, “Walking and Talking Behaviors May Help Predict Epidemics and Trends.” A recent study, completed by an impressive roster of academics at several institutions, reveals a strong connection between our movements and our communications. So strong, in fact, that a dataset on one can pretty accurately predict the other. The article cites one participant, researcher Dashun Wang of Penn State:
[Wang] added that because movement and communication are connected, researchers may only need one type of data to make predictions about the other phenomenon. For instance, communication data could reveal information about how people move. …
The equation could better forecast, among other things, how a virus might spread, according to the researchers, who report their findings today (June 6) in the Proceedings of the National Academy of Sciences. In the study, they tested the equation on a simulated epidemic and found that either location or communication datasets could be used to reliably predict the movement of the disease.
Perhaps not as dramatic but still useful, the same process could be used to predict the spread of trends and ideas. The research drew on three databases of messages from users in Portugal and another (mysteriously unidentified) country, and on four years of Rwandan mobile-phone data. These data sets document who contacted whom, when, and where.
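The article does not reproduce the researchers’ equation, but the core intuition, that a who-called-whom graph can stand in for a who-met-whom graph, is easy to illustrate. Here is a toy sketch in Python, with invented call records and illustrative transmission rates (not taken from the paper), simulating disease spread over a contact network built purely from communication data:

```python
import random

random.seed(42)

# Hypothetical call records: (caller, callee) pairs standing in for
# face-to-face contacts, per the movement/communication connection.
calls = [("a", "b"), ("b", "c"), ("c", "d"), ("b", "e"), ("e", "f"), ("d", "f")]

# Build an undirected contact graph from the communication data alone.
neighbors = {}
for u, v in calls:
    neighbors.setdefault(u, set()).add(v)
    neighbors.setdefault(v, set()).add(u)

infected, recovered = {"a"}, set()
TRANSMIT, RECOVER = 0.5, 0.3  # illustrative probabilities, not from the study

for step in range(10):
    newly_infected = set()
    for person in infected:
        for contact in neighbors[person]:
            if contact not in infected and contact not in recovered:
                if random.random() < TRANSMIT:
                    newly_infected.add(contact)
    newly_recovered = {p for p in infected if random.random() < RECOVER}
    infected = (infected | newly_infected) - newly_recovered
    recovered |= newly_recovered
    print(f"step {step}: infected={sorted(infected)}, recovered={sorted(recovered)}")
```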
Containing epidemics is a vital cause, and the potential to boost its success is worth celebrating. However, let us take note of who is funding this study: the U.S. Army Research Laboratory, the Office of Naval Research, the Defense Threat Reduction Agency, and the James S. McDonnell Foundation’s program, Studying Complex Systems. Note the first three organizations in the list; it will be interesting to learn what other capabilities derive from this research (once they are declassified, of course).
Cynthia Murrell, September 14, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on September 27, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233599645/
Toshiba Amps up Vector Indexing and Overall Data Matching Technology
September 13, 2016
The article on MyNewsDesk titled Toshiba’s Ultra-Fast Data Matching Technology is 50 Times Faster than its Predecessors relates the bold claims swirling around Toshiba and its Vector Indexing Technology. By skipping the computation of distances between vectors, Toshiba has slashed the time it takes to identify vectors (it claims). The article states:
Toshiba initially intends to apply the technology in three areas: pattern mining, media recognition and big data analysis. For example, pattern mining would allow a particular person to be identified almost instantly among a large set of images taken by surveillance cameras, while media recognition could be used to protect soft targets, such as airports and railway stations, by automatically identifying persons wanted by the authorities.
In sum, Toshiba’s technology can quickly and accurately recognize faces in a crowd. The specifics are even more interesting: current technology takes around 20 seconds to identify an individual among 10 million images, while Toshiba claims to do it in under a second. The precision rate Toshiba reports is also outstanding, at 98%. The world of Minority Report, where ads recognize and target random individuals, seems increasingly within reach. Perhaps more importantly, this technology should be of dire concern to the criminal, and perceived criminal, populations of the world.
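Toshiba has not published its method, so any code here is guesswork; but the general trick of skipping exhaustive distance computation is well known. One standard approach is locality-sensitive hashing: bucket vectors by random hyperplanes so a query only needs exact distances against a handful of candidates. A minimal sketch in Python with NumPy, all data invented:

```python
import numpy as np

rng = np.random.RandomState(0)
DIM, N_PLANES = 64, 16

database = rng.normal(size=(10000, DIM))   # e.g., face-feature vectors
planes = rng.normal(size=(N_PLANES, DIM))  # random hyperplanes define the hash

def lsh_key(vec):
    """Bucket a vector by which side of each random hyperplane it falls on."""
    bits = (planes @ vec) > 0
    return tuple(bits)

# Index once: bucket -> list of vector ids. No pairwise distances computed here.
buckets = {}
for i, vec in enumerate(database):
    buckets.setdefault(lsh_key(vec), []).append(i)

# A noisy copy of vector 1234 stands in for a new surveillance image.
query = database[1234] + 0.01 * rng.normal(size=DIM)

# Exact distances are computed only over the small candidate bucket,
# not over all 10,000 vectors in the database.
candidates = buckets.get(lsh_key(query), [])
best = min(candidates, key=lambda i: np.linalg.norm(database[i] - query), default=None)
print(f"candidates checked: {len(candidates)}, best match id: {best}")
```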
Chelsea Kerwin, September 13, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on September 27, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233599645/
How Collaboration and Experimentation Are Key to Advancing Machine Learning Technology
September 12, 2016
The article on CIO titled Machine Learning “Still a Cottage Industry” conveys the sentiments of a man at the heart of the industry in Australia, Professor Bob Williamson, chief scientist of the Commonwealth Scientific and Industrial Research Organisation’s (CSIRO’s) Data61 group. His work in machine learning and data analytics has led him to the conclusion that for machine learning to truly move forward, scientists must find a way to collaborate. He is quoted in the article:
“There’s these walled gardens: ‘I’ve gone and coded my models in a particular way, you’ve got your models coded in a different way, we can’t share.’ This is a real challenge for the community. No one’s cracked this yet.” A number of start-ups have entered the “machine-learning-as-a-service” market, such as BigML, Wise.io, and Precog, and the big names, including IBM, Microsoft, and Amazon, haven’t been far behind. Though these MLaaS offerings herald some impressive results, Williamson warned businesses to be cautious.
Williamson speaks to the possibility of stagnation in machine learning due to the emphasis on data mining as opposed to experimenting. He hopes businesses will do more with their data than simply look for patterns. It is a refreshing take on the industry from an outsider/insider, a scientist more interested in the science of it all than the massive stacks of cash at stake.
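Williamson’s walled-garden complaint is concrete: models coded one way cannot be loaded by a stack coded another way. A hedged sketch of the simplest possible case, assuming a plain linear model and toy data: serialize only the learned parameters to a neutral JSON format that any stack can reload. Real models are rarely this portable, which is rather Williamson’s point.

```python
import json

import numpy as np
from sklearn.linear_model import LinearRegression

# Team A trains a model inside its own stack (toy data for illustration).
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.1, 3.9, 6.2, 8.1])
model = LinearRegression().fit(X, y)

# Export only the learned parameters in a framework-neutral format.
portable = json.dumps({
    "type": "linear_regression",
    "coef": model.coef_.tolist(),
    "intercept": float(model.intercept_),
})

# Team B, on a different stack, reloads and predicts without sklearn.
params = json.loads(portable)

def predict(features):
    return sum(c * f for c, f in zip(params["coef"], features)) + params["intercept"]

print(predict([5.0]))  # close to model.predict([[5.0]])
```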
Chelsea Kerwin, September 12, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on September 27, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233599645/
Revenue Takes a Backseat to Patent Filings at IBM
September 9, 2016
The post on Slashdot titled IBM Has Been Awarded an Average of 24 Patents Per Day So Far in 2016 compares the patent development emphasis of major companies, with IBM coming out on top with 3,617 patent awards so far in 2016, according to a Quartz report. Patents are the by-product of IBM’s focus on scientific research, as the report finds:
The company is in the middle of a painful reinvention that sees the company shifting further away from hardware sales into cloud computing, analytics, and AI services. It’s also plugging away on a myriad of fundamental scientific research projects — many of which could revolutionize the world if they can come to fruition — which is where many of its patent applications originate. IBM accounted for about 1% of all US patents awarded in 2015.
Samsung claimed a close second (with just over 3,000 patents), and on the next rung down sit Google (with roughly 1,500 patents for the same period), Intel, Qualcomm, Microsoft, and Apple. Keep in mind, though, that IBM and Samsung have each been awarded roughly twice as many patents as Google and the rest, making IBM look like an unstoppable patent machine. You may well ask: what about revenue? IBM will get back to you on that score later.
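The headline arithmetic is easy to sanity-check, assuming the Quartz tally runs through roughly the end of May:

```python
patents_awarded = 3617               # IBM's 2016 tally per the Quartz report
days_implied = patents_awarded / 24  # the "24 per day" headline rate
print(round(days_implied))           # about 151 days, i.e., January through May
```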
Chelsea Kerwin, September 9, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on September 27, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233599645/
Is Google Biotech Team Overreaching?
September 9, 2016
Science reality is often inspired by science fiction, and Google’s biotech research division, Verily Life Sciences, is no exception. Business Insider posts, “‘Silicon Valley Arrogance’? Google Misfires as It Strives to Turn Star Trek Fiction Into Reality.” The “Star Trek” reference points to Verily’s Tricorder project, announced three years ago, which set out to create a cancer-early-detection device. Sadly, that hopeful venture may be sputtering out. STAT reporter Charles Piller writes:
Recently departed employees said the prototype didn’t work as hoped, and the Tricorder project is floundering. Tricorder is not the only misfire for Google’s ambitious and extravagantly funded biotech venture, now named Verily Life Sciences. It has announced three signature projects meant to transform medicine, and a STAT examination found that all of them are plagued by serious, if not fatal, scientific shortcomings, even as Verily has vigorously promoted their promise.
Piller cites two projects, besides the Tricorder, that underwhelm. We’re told that independent experts are dubious about the development of a smart contact lens that can detect glucose levels for diabetics. Then there is the very expensive Baseline study, an attempt to define what it means to be healthy and to catch diseases earlier, which critics call “lofty” and “far-fetched.” Not surprisingly, Google being Google, there are also privacy concerns about the data being collected to feed the study.
There are several criticisms and specific examples in the lengthy article, and interested readers should check it out. There seems to be one central notion, though: that Verily Life Sciences is attempting to approach the human body like a computer, when medicine is much, much more complicated than that. The impressive roster of medical researchers on the team seems to provide little solace to critics. The write-up relates:
It’s axiomatic in Silicon Valley’s tech companies that if the math and the coding can be done, the product can be made. But seven former Verily employees said the company’s leadership often seems not to grasp the reality that biology can be more complex and less predictable than computers. They said Conrad, who has a PhD in anatomy and cell biology, applies the confident impatience of computer engineering, along with extravagant hype, to biotech ideas that demand rigorous peer review and years or decades of painstaking work.
Are former employees the most objective source? I suspect ex-Googlers and third-party scientists are underestimating Google. The company has a history of reaching the moon by shooting for the stars, and of enduring a few failures as the price of success. I would not be surprised to see Google emerge on top of the biotech field. (As sci-fi fans know, biotech is the medicine of the future. You’ll see.) The real question is how the company will treat privacy, data rights, and patient safety along the way.
Cynthia Murrell, September 9, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on September 27, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233599645/
Palantir: More Legal Excitement
September 6, 2016
One of the Beyond Search goslings directed my attention to a legal document, “Palantir Technologies Inc. (‘Palantir’) Sues Defendants Marc L. Abramowitz…” The 20-page complaint asserts that a Palantir investor sucked in proprietary information and then used that information outside the boundaries of Sillycon Valley norms of behavior. These norms apply to the one percent of the one percent, in my opinion.
The legal “complaint” points to several patent documents which embodied Palantir’s proprietary information. Locating the documents requires the Justia system; specifically, Provisional Application No. 62/072,36; Provisional Application No. 62/066,716; and Provisional Application No. 62/094,888. These provisional applications, I concluded, reveal that Palantir seeks to enter insurance and health care markets. Their disclosure appears to put Palantir Technologies at a competitive disadvantage.
Who is the individual named in the complaint?
Marc Abramowitz, who is associated with an outfit named KT4. KT4 does not have much of an online presence. The sparse information available to me about Abramowitz is that he is a Harvard-trained lawyer connected to Stanford’s Hoover econo-think unit. Abramowitz’s link to Palantir is that he invested in the company and made visits to the Hobbits’ Palo Alto “shire” part of his work routine.
Despite the legalese, the annoyance of Palantir with Abramowitz seeps through the sentences.
For me, what is interesting is that IBM i2 asserted several years ago that Palantir Technologies improperly tapped into proprietary methods used in the Analyst’s Notebook software product and system. See “i2 and Palantir: Resolved Quietly.”
One new twist is that the Palantir complaint against Abramowitz includes a reference to Abramowitz’s appropriation of the word “Shire.” For those not in the know in Sillycon Valley, Palantir refers to its Palo Alto office as the shire.
When I read the document, I did not spot a reference to Hobbits or seeing stones.
When I checked this morning (September 6, 2016), the document was still publicly accessible at the link above. However, Palantir’s complaint about the US Army’s procurement system was sealed shortly after it was filed. This Abramowitz complaint may go away for some folks as well. If you can’t locate the Abramowitz document, you will have to up your legal research game. My hunch is that neither Palantir nor Mr. Abramowitz will respond to your request for a copy.
There are several hypothetical, Tolkienesque cyclones swirling out of this dust-up between an investor and the Palantir outfit, which is alleged to be a mythical unicorn:
- Trust seems to need a more precise definition when dealing with either Palantir or Abramowitz
- Some folks use Tolkien’s jargon and don’t want anyone else to “horn in” on this appropriation
- Filing patents on relatively narrow “new” concepts when one does not have a software engineering track record goes against the accepted norms of innovation
- IBM i2’s team may watch the trajectory of this Abramowitz matter more attentively than the next IBM Watson marketing innovation.
Worth monitoring just for the irony molecules in this Palantir complaint. WWTT, or what would Tolkien think? Perhaps a quick check of the seeing stone is appropriate.
Stephen E Arnold, September 6, 2016
Government Seeks Sentiment Analysis on Its PR Efforts
September 6, 2016
Sentiment analysis is taking off; government agencies are using it for PR purposes. Nextgov released a story, Spy Agency Wants Tech that Shows How Well Its PR Team Is Doing, which covers the National Geospatial-Intelligence Agency’s request for information about sentiment analysis. The NGA hopes to use this technology to assess its PR efforts to increase public awareness of the agency and communicate its mission, especially to groups such as college students, recruits, and those in the private sector. Commenting on the bigger picture, the author writes,
The request for information appears to be part of a broader effort within the intelligence community to improve public opinion about its operations, especially among younger, tech-savvy citizens. The CIA has been using Twitter since 2014 to inform the public about the agency’s past missions and to demonstrate that it has a sense of humor, according to a Nextgov interview last year with its social media team. The CIA’s social media director said at the time there weren’t plans to use sentiment analysis technology to analyze the public’s tweets about the CIA because it was unclear how accurate those systems are.
The technologies underpinning sentiment analysis, such as natural language processing and computational linguistics, are attractive in many sectors for PR and other purposes; the government is no exception. Especially now that the CIA and other organizations are using social media, the space is certainly ripe for government sentiment analysis. Still, we must echo the accuracy question posed by the CIA’s social media director.
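The NGA’s request describes capabilities, not implementations, but off-the-shelf tooling gives a flavor of what it is asking for. A minimal sketch using NLTK’s VADER analyzer, with invented sample posts, scores text as positive or negative; its accuracy limits are exactly what the CIA’s social media director worried about:

```python
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)  # one-time lexicon fetch
analyzer = SentimentIntensityAnalyzer()

# Invented public posts mentioning the agency.
posts = [
    "Great talk by the NGA recruiters at the career fair!",
    "Another opaque spy agency vacuuming up data, no thanks.",
]

for post in posts:
    scores = analyzer.polarity_scores(post)  # neg/neu/pos plus compound in [-1, 1]
    label = "positive" if scores["compound"] >= 0 else "negative"
    print(f"{label:8s} {scores['compound']:+.2f}  {post}")
```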
Megan Feil, September 6, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on September 27, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233599645/
Google Enables Users to Delete Search History, Piece by Piece
August 31, 2016
The article on CIO titled Google Quietly Brings Forgetting to the U.S. draws attention to the fact that Google has enabled Americans to view and edit their search histories. Simply visit My Activity and log in to witness the mind-boggling amount of data Google has collected over your search career. Deleting an item takes just two clicks. But the article points out that deleting a lot of searches will require an afternoon dedicated to cleaning up your history. And afterward, you might find that your searches are less customized, as are your ads and autofills. But the article emphasizes a more communal concern,
There’s something else to consider here, though, and this has societal implications. Google’s forget policy has some key right-to-know overlaps with its takedown policy. The takedown policy allows people to request that stories about or images of them be removed from the database. The forget policy allows the user to decide on his own to delete something…I like being able to edit my history, but I am painfully aware that allowing the worst among us to do the same can have undesired consequences.
Of course, by “the worst among us” he means terrorists. But for many people, the right to privacy is more important than whatever hypothetical setbacks terrorists might suffer under a more totalitarian, Big Brother state. Indeed, Google’s claim that the search history information is entirely private is already suspect. If Google personnel or Google partners can see this data, doesn’t that mean it is no longer private?
Chelsea Kerwin, August 31, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Technical Debt and Technical Wealth
August 29, 2016
I read “Forget Technical Debt. Here’s How to Build Technical Wealth.” Lemons? Make lemonade. Works almost every time.
The write up begins with a reminder that recent code which is tough to improve is a version of legacy code. I understand. I highlighted this statement:
Legacy code isn’t a technical problem. It’s a communication problem.
I am not sure I understand. But let’s move forward in the write up. I noted this statement:
“It’s the law that says your codebase will mirror the communication structures across your organization. If you want to fix your legacy code, you can’t do it without also addressing operations, too. That’s the missing link that so many people miss.”—Andrea Goulet, CEO of Corgibytes
So what’s the fix for legacy code at an outfit like Delta Airlines or the US air traffic control system or the US Internal Revenue Service or a Web site crafted in 1995?
I highlighted this advice:
Forget debt, build technical wealth.
Very MBA-ish. I trust MBAs. Heck, I have affection for some, well, one or two. The mental orientation struck me as quite Wordsworthian:
Stop thinking about your software as a project. Start thinking about it as a house you will live in for a long time…
Just like with a house, modernization and upkeep happens in two ways: small, superficial changes (“I bought a new rug!”) and big, costly investments that will pay off over time (“I guess we’ll replace the plumbing…”). You have to think about both to keep your product current and your team running smoothly. This also requires budgeting ahead — if you don’t, those bigger purchases are going to hurt. Regular upkeep is the expected cost of home ownership. Shockingly, many companies don’t anticipate maintenance as the cost of doing business.
Okay, let’s think about legacy code in something like a “typical” airline or a “typical” agency of the US Executive Branch. Efforts have been made over the last 20 years to improve the systems. Yet these outfits, like many commercial enterprises, are a digital Joseph’s coat of many systems, software, hardware, and methods. The idea is to keep the IRS up and running; that is, good enough to remain dry when it rains and pours.
There is, in my opinion, not enough money to “fix” the IRS systems. Even if there were money, the problem of code written by many hands over many years would remain intractable. The idea of “menders” is a good one. But where does one find enough menders to remediate the systems at a big outfit?
Google’s approach is to minimize “legacy” code in some situations. See “Google Is in a Vicious Build Retire Cycle.”
The MBA charts, graphs, and checklists do not deliver wealth. The approach sidesteps a very important fact: there are legacy systems which, if they crash, are increasingly difficult to get back up and running. The thought of remediating systems coded by folks long since retired or deceased is something that few people, including me, have a desire to contemplate. Legacy code is a problem, and there is no quick, easy, business-school fix that I know about.
Maybe somewhere? Maybe someplace? Just not in Harrod’s Creek.
Stephen E Arnold, August 29, 2016