
Geoparsing Is More Magical Than We Think

September 23, 2016

The term geoparsing sounds like it has something to do with cartography, but according to the Directions Magazine article “Geoparsing Maps The Future Of Text Documents,” it is more like an alchemical spell.  Geoparsing refers to turning text documents into a geospatial database through entity extraction and disambiguation (also known as geotagging).  It relies on natural language processing and is generally used to analyze collections of text documents.

While it might appear that geoparsing is magical, it actually is a complex technological process that relies on data to put information into context.  Places often share the same name, so a geoparser can struggle to disambiguate them and apply the correct tags.  Geoparsing has important applications, such as:

Military users will not only want to exploit automatically geoparsed documents, they will require a capability to efficiently edit the results to certify that the place names in the document are all geotagged, and geotagged correctly. Just as cartographers review and validate map content prior to publication, geospatial analysts will review and validate geotagged text documents. Place checking, like spell checking, allows users to quickly and easily edit the content of their documents.
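The extraction and disambiguation steps described above can be sketched in a few lines of Python. This is a toy illustration, not how GeoDoc or any production geoparser works: the `GAZETTEER` table, its coordinates, and the rule of preferring a candidate whose region is also mentioned in the text are all assumptions made for the example.

```python
import re

# Hypothetical mini-gazetteer: place name -> candidate locations.
# Coordinates are approximate and only illustrative.
GAZETTEER = {
    "Springfield": [
        {"region": "Illinois", "lat": 39.80, "lon": -89.64},
        {"region": "Missouri", "lat": 37.21, "lon": -93.29},
    ],
    "Louisville": [
        {"region": "Kentucky", "lat": 38.25, "lon": -85.76},
    ],
}


def geoparse(text):
    """Return (place name, chosen candidate) pairs for names found in text."""
    results = []
    for name, candidates in GAZETTEER.items():
        # Entity extraction: a whole-word match against the gazetteer.
        if not re.search(r"\b" + re.escape(name) + r"\b", text):
            continue
        # Disambiguation: prefer a candidate whose region also appears
        # in the text; otherwise fall back to the first candidate.
        best = candidates[0]
        for candidate in candidates:
            if candidate["region"] in text:
                best = candidate
                break
        results.append((name, best))
    return results


tags = geoparse("The convoy left Springfield, Missouri, bound for Louisville.")
for name, place in tags:
    print(name, place["region"], place["lat"], place["lon"])
```

Because the fallback simply takes the first candidate, a human "place check" of the kind the quote describes would still be needed whenever the text offers no regional hint.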

The article acts as a promo piece for the GeoDoc application; however, it does delve into how geoparsing works and its benefits.

Whitney Grace, September 23, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on September 27, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233599645/

Watch out for Falling Burritos

September 22, 2016

Amazon and Wal-Mart are already trying to deliver packages by drones, but now a Mexican restaurant wants in on the automated delivery game.  Bloomberg Technology tells the story in “Alphabet And Chipotle Are Bringing Burrito Delivery Drones To Campus.”  If you think you can now order a burrito and have it delivered to you via drone, sorry to tell you that the service is only available on the Virginia Tech campus.  Alphabet Inc. unit Project Wing has teamed up with Chipotle Mexican Grill for the food delivery service.

Self-guided hybrid drones will deliver the burritos.  The burritos will come from a nearby food truck, which keeps the navigation accurate and the food fresh.  The best part is that when the drones are making the delivery, they will hover and lower the burritos with a winch.

While the drones will be automated, human pilots will be nearby to protect people on campus from falling burritos and to step in if the drones veer from their flight pattern.  The FAA approved the burrito-delivering drone test, but the agency is hesitant to clear unmanned drones for bigger delivery routes.

…the experiment will not assess one of the major technology hurdles facing drone deliveries: creation of a low-level air-traffic system that can maintain order as the skies become more crowded with unmanned vehicles. NASA is working with Project Wing and other companies to develop the framework for such a system. Data from the tests will be provided to the FAA to help the agency develop new rules allowing deliveries…

The drone burrito delivery at Virginia Tech is believed to be the most complex delivery flight operation in the US.  It is a test for a not too distant future when unmanned drones deliver packages and food.  It will increase the number of vehicles in the sky, but it will also put the delivery business in jeopardy.  Once more, things change and more jobs become obsolete.

Whitney Grace, September 22, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph


Open Source Log File Viewer Glogg

September 21, 2016

Here is an open source solution for those looking to dig up information within large and complex log files; BetaNews shares, “View and Search Huge Log Files with Glogg.”  The software reads directly from your drive, saving time and keeping memory free (or at least as free as it was before). Reviewer Mike Williams tells us:

Glogg’s interface is simple and uncluttered, allowing anyone to use it as a plain text viewer. Open a log, browse the file, and the program grabs and displays new log lines as they’re added. There’s also a search box. Enter a plain text keyword, a regular or extended regular expression and any matches are highlighted in the main window and displayed in a separate pane. Enable ‘auto-refresh’ and glogg reruns searches as lines are added, ensuring the matches are always up-to-date. Glogg also supports ‘filters’, essentially canned searches which change text color in the document window. You could have lines containing ‘error’ displayed as black on red, lines containing ‘success’ shown black on green, and as many others as you need.

Williams spotted some more noteworthy features, like a quick-text search, highlighted matches, and helpful Next and Previous buttons. He notes the program is not exactly chock-full of fancy features, but suggests that is probably just as well for this particular task. Glogg runs on 64-bit Windows 7 and later, and on Linux.
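For readers who want a feel for the search-and-filter behavior described above, here is a minimal Python sketch. It is not glogg's code; the `FILTERS` rules, the color labels, and the function names are hypothetical stand-ins. It reads lines lazily, so a file object can be passed in without loading the whole log into memory.

```python
import re

# Hypothetical filter rules: (pattern, label standing in for a color scheme).
FILTERS = [
    (re.compile(r"error", re.IGNORECASE), "black-on-red"),
    (re.compile(r"success", re.IGNORECASE), "black-on-green"),
]


def classify(line):
    """Return the label of the first filter that matches, or None."""
    for pattern, label in FILTERS:
        if pattern.search(line):
            return label
    return None


def search_lines(lines, expression):
    """Lazily yield (line_number, line) for lines matching a regex.

    `lines` can be any iterable of strings, including an open file,
    so the log is scanned one line at a time.
    """
    pattern = re.compile(expression)
    for number, line in enumerate(lines, start=1):
        if pattern.search(line):
            yield number, line.rstrip("\n")


sample = ["boot ok\n", "ERROR disk full\n", "backup success\n", "error: retry 2\n"]
matches = list(search_lines(sample, r"(?i)error"))
for number, line in matches:
    print(number, classify(line), line)
```

To scan a real file, something like `with open(path) as log: list(search_lines(log, r"timeout"))` would work, where `path` is whatever log you have on hand.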

Cynthia Murrell, September 21, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Featurespace Raises Capital for Bank Fraud Monitoring Technology

September 21, 2016

Monitoring online fraud has become an increasingly popular application for machine learning and search technology. The Telegraph reported, “Cambridge AI fraud detection group raises £6.2m.” The company, Featurespace, grew out of Cambridge University, and its ARIC technology goes beyond rule-based fraud detection. It scans all activity on a network and thus learns what registers as fraudulent or suspicious. The write-up tells us,

The company has now raised $9m (£6.2m), which it will use to open a US office after signing two big stateside deals. The funding is led by US fintech investor TTV Capital – the first time it has backed a UK company – and early stage investors Imperial Innovations and Nesta.

Mike Lynch, the renowned technology investor who founded software group Autonomy before its $11.7bn sale to Hewlett Packard, has previously invested in the company and sits on its board. Ms King said Featurespace had won a contract with a major US bank, as well as payments company TSYS, which processes MasterCard and Visa transactions.

Overall, the company aims to protect consumers from credit and debit card fraud. The article reminds us that millions of consumers have been affected by stolen credit and debit card information. Betfair, William Hill and VocaLink are current customers of Featurespace and several banks are using its technology too. Will this become a big ticket application for these machine learning technologies?

Megan Feil, September 21, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

The Case for Algorithmic Equity

September 20, 2016

We know that AI algorithms are skewed by the biases of both their creators and, depending on the application, their users. Social activist Cathy O’Neil addresses the broad consequences to society in her book, Weapons of Math Destruction. Time covers her views in its article, “This Mathematician Says Big Data is Causing a ‘Silent Financial Crisis’.” O’Neil studied mathematics at Harvard, worked in quantitative trading at a hedge fund, and introduced a targeted-advertising startup. It is fair to say she knows what she is talking about.

More and more businesses and organizations rely on algorithms to make decisions that have big impacts on people’s lives: choices about employment, financial matters, scholarship awards, and where to deploy police officers, for example. Yet, the processes are shrouded in secrecy, and lawmakers are nowhere close to being on top of the issue. There is currently no way to ensure these decisions are anything approaching fair. In fact, the algorithms can create a sort of feedback loop of disadvantage. Reporter Rana Foroohar writes:

Using her deep technical understanding of modeling, she shows how the algorithms used to, say, rank teacher performance are based on exactly the sort of shallow and volatile type of data sets that informed those faulty mortgage models in the run up to 2008. Her work makes particularly disturbing points about how being on the wrong side of an algorithmic decision can snowball in incredibly destructive ways—a young black man, for example, who lives in an area targeted by crime fighting algorithms that add more police to his neighborhood because of higher violent crime rates will necessarily be more likely to be targeted for any petty violation, which adds to a digital profile that could subsequently limit his credit, his job prospects, and so on. Yet neighborhoods more likely to commit white collar crime aren’t targeted in this way.

Yes, unsurprisingly, it is the underprivileged who bear the brunt of algorithmic aftermath; the above is just one example. The write-up continues:

Indeed, O’Neil writes that WMDs [Weapons of Math Destruction] punish the poor especially, since ‘they are engineered to evaluate large numbers of people. They specialize in bulk. They are cheap. That’s part of their appeal.’ Whereas the poor engage more with faceless educators and employers, ‘the wealthy, by contrast, often benefit from personal input. A white-shoe law firm or an exclusive prep school will lean far more on recommendations and face-to-face interviews than a fast-food chain or a cash-strapped urban school district. The privileged… are processed more by people, the masses by machines.’

So, algorithms add to the disparity between how the wealthy and the poor experience life. Compounding the problem, algorithms also allow the wealthy to isolate themselves online as well as in real life, through curated news and advertising that make it ever easier to deny that poverty is even a problem. See the article for its more thorough discussion.

What does O’Neil suggest we do about this? First, she proposes a “Hippocratic Oath for mathematicians.” She also joins the calls for much more thorough regulation of the AI field and for updating existing civil-rights laws to include algorithm-based decisions. Such measures will require the cooperation of legislators, who, as a group, are hardly known for their technical understanding. It is up to those of us who do comprehend the issues to inform them that action must be taken. Sooner rather than later, please.

Cynthia Murrell, September 20, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Hundreds of Thousands of Patient Records Offered up on the Dark Web

September 19, 2016

Some of us suspected this was coming, despite many assurances to the contrary. Softpedia informs us, “Hacker Selling 651,894 Patient Records on the Dark Web.” Haughtily going by the handle TheDarkOverlord, the hacker responsible is looking to make over seven hundred grand off the data. Reporter Catalin Cimpanu writes:

The hacker is selling the data on The Real Deal marketplace, and he [or she] says he breached these companies using an RDP (Remote Desktop Protocol) bug. TheDarkOverlord has told DeepDotWeb, who first spotted the ads, that it’s ‘a very particular bug. The conditions have to be very precise for it.’ He has also provided a series of screenshots as proof, showing him accessing the hacked systems via a Remote Desktop connection. The hacker also recalls that, before putting the data on the Dark Web, he contacted the companies and informed them of their problems, offering to disclose the bug for a price, in a tactic known as bug poaching. Obviously, all three companies declined, so here we are, with their data available on the Dark Web. TheDarkOverlord says that all databases are a one-time sale, meaning only one buyer can get their hands on the stolen data.

The three databases contain information on patients in Farmington, Missouri; Atlanta, Georgia; and the Central and Midwest areas of the U.S. TheDarkOverlord asserts that the data includes details like contact information, Social Security numbers, and personal facts like gender and race. The collection does not, apparently, include medical history. I suppose that is a relief, for now.

Cynthia Murrell, September 19, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Enterprise Technology Perspective on Preventing Security Breaches

September 16, 2016

When it comes to the Dark Web, enterprises want solutions that prevent security breaches. Fortscale released an article, “Dark Web — Tor Use is 50% Criminal Activity — How to Detect It,” speaking to this audience. The write-up explains that the anonymizer Tor stands for The Onion Router, a name that refers to the multiple layers used to hide an IP address and therefore the user’s identity. How does security software detect Tor users? We learned,

There are a couple of ways security software can determine if a user is connecting via the Tor network. The first way is through their IP address. The list of Tor relays is public, so you can check whether the user is coming from a known Tor relay. It’s actually a little bit trickier than that, but a quality security package should be able to alert you if user behaviors include connecting via a Tor network. The second way is by looking at various application-level characteristics. For example, a good security system can distinguish the differences between a standard browser and a Tor Browser because among other things, Tor software won’t respond to certain history requests or JavaScript queries.
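The first technique in the quote, checking an address against a public relay list, can be sketched as follows. This is a simplified illustration and not any vendor's product: it assumes you already have a locally saved list with one IP address per line (the format and the function names are assumptions), and the addresses shown are reserved documentation addresses, not real relays. As the quote warns, real detection is trickier than a simple lookup.

```python
import ipaddress


def load_relays(lines):
    """Parse an iterable of text lines into a set of IP address objects.

    Assumed format: one address per line; blank lines and lines
    starting with '#' are ignored.
    """
    relays = set()
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            relays.add(ipaddress.ip_address(line))
    return relays


def is_known_relay(address, relays):
    """Return True if the connecting address appears in the relay set."""
    return ipaddress.ip_address(address) in relays


# Placeholder entries drawn from documentation ranges, not real relays.
relay_list = ["# saved relay list", "192.0.2.10", "", "198.51.100.7"]
relays = load_relays(relay_list)

for client in ["192.0.2.10", "203.0.113.5"]:
    print(client, "known relay" if is_known_relay(client, relays) else "not listed")
```

Storing parsed address objects in a set keeps each lookup fast and avoids mismatches caused by differently formatted strings.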

Many cybersecurity software companies offer solutions that monitor the Dark Web for sensitive data, which is more of a recovery strategy. However, this article highlights the importance of cybersecurity solutions that monitor enterprise system usage to identify users connecting through Tor. While this appears to be a sound strategy for understanding how often users connect through Tor, it will be important to know whether these data-producing software solutions facilitate action, such as removing Tor users from the network.

Megan Feil, September 16, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Automated Tools for Dark Web Data Tracking

September 15, 2016

Naturally, tracking stolen data through the dark web is a challenge. Investigators have traditionally infiltrated chatrooms and forums in the effort, a tedious procedure with no guarantee of success. Now, automated tools may give organizations a leg up, we learn from the article, “Tools to Track Stolen Data Through the Dark Web” at GCN. Reporter Mark Pomerleau informs us:

The Department of Veterans Affairs last month said it was seeking software that can search the dark web for exploited VA data improperly outside its control, distinguish between VA data and other data and create a ‘one-way encrypted hash’ of VA data to ensure that other parties cannot ascertain or use it. The software would also use VA’s encrypted data hash to search the dark web for VA content.

We learned:

Some companies, such as Terbium Labs, have developed similar hashing technologies.  ‘It’s not code that’s embedded in the data so much as a computation done on the data itself,’ Danny Rogers, a Terbium Labs co-founder, told Defense One regarding its cryptographic hashing.  This capability essentially enables a company or agency to recognize its stolen data if discovered. Bitglass, a cloud access security broker, uses watermarking technology to track stolen data.  A digital watermark or encryption algorithm is applied to files such as spreadsheets, Word documents or PDFs that requires users to go through an authentication process in order to access it.
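The idea of a one-way hash as “a computation done on the data itself” can be illustrated with a short Python sketch. This is not Terbium Labs’ or the VA’s method, neither of which the article details. It swaps in plain SHA-256 with exact matching, and the normalization step, the function names, and the sample records are all made up for the example.

```python
import hashlib


def fingerprint(record):
    """Return a one-way SHA-256 fingerprint of a text record.

    The record is lowercased and its whitespace collapsed first, so
    trivial formatting changes do not alter the fingerprint.
    """
    normalized = " ".join(record.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_leaks(own_fingerprints, scraped_records):
    """Return scraped records whose fingerprint matches one of ours."""
    return [r for r in scraped_records if fingerprint(r) in own_fingerprints]


# The organization shares only fingerprints, never the records themselves.
own_fingerprints = {fingerprint("Jane Doe, ID 0001")}

# Made-up text standing in for material found by a crawler.
scraped = ["jane doe,   id 0001", "John Roe, ID 0002"]
leaks = find_leaks(own_fingerprints, scraped)
print(leaks)
```

The searching party only ever sees the fingerprints, which cannot be reversed into the original records. The trade-off of this simple version is that any substantive change to a record produces a completely different hash, so it only recognizes near-verbatim copies.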

We’re told such watermarks can even thwart hackers trying to copy-and-paste into a new document, and that Bitglass tests its tech by leaking and following false data onto the dark web. Pomerleau notes that regulations can make it difficult to implement commercial solutions within a government agency. However, government personnel are very motivated to find solutions that will allow them to work securely outside the office.

The article wraps up with a mention of DARPA’s Memex search engine, designed to plumb the even-more-extensive deep web. Law enforcement is currently using Memex, but the software is expected to eventually make it to the commercial market.

Cynthia Murrell, September 15, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Mobile Data May Help Fight Disease

September 14, 2016

Data from smartphones and other mobile devices may give us a new tool in the fight against communicable diseases.  Penn State News reports, “Walking and Talking Behaviors May Help Predict Epidemics and Trends.” A recent study, completed by an impressive roster of academics at several institutions, reveals a strong connection between our movements and our communications. So strong, in fact, that a dataset on one can pretty accurately predict the other. The article cites one participant, researcher Dashun Wang of Penn State:

[Wang] added that because movement and communication are connected, researchers may only need one type of data to make predictions about the other phenomenon. For instance, communication data could reveal information about how people move. …

The equation could better forecast, among other things, how a virus might spread, according to the researchers, who report their findings today (June 6) in the Proceedings of the National Academy of Sciences. In the study, they tested the equation on a simulated epidemic and found that either location or communication datasets could be used to reliably predict the movement of the disease.

Perhaps not as dramatic but still useful, the same process could be used to predict the spread of trends and ideas. The research was performed on three databases full of messages from users in Portugal and another (mysteriously unidentified) country and on four years of Rwandan mobile-phone data. These data sets document who contacted whom, when, and where.

Containing epidemics is a vital cause, and the potential to boost its success is worth celebrating. However, let us take note of who is funding this study: The U.S. Army Research Laboratory, the Office of Naval Research, the Defense Threat Reduction Agency and the James S. McDonnell Foundation’s program, Studying Complex Systems. Note the first three organizations in the list; it will be interesting to learn what other capabilities derive from this research (once they are declassified, of course).

Cynthia Murrell, September 14, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Toshiba Amps up Vector Indexing and Overall Data Matching Technology

September 13, 2016

The article on MyNewsDesk titled “Toshiba’s Ultra-Fast Data Matching Technology is 50 Times Faster than its Predecessors” relates the bold claims swirling around Toshiba and its Vector Indexing Technology. By skipping the step involving computation of the distance between vectors, Toshiba has slashed the time it takes to identify vectors (the company claims). The article states,

Toshiba initially intends to apply the technology in three areas: pattern mining, media recognition and big data analysis. For example, pattern mining would allow a particular person to be identified almost instantly among a large set of images taken by surveillance cameras, while media recognition could be used to protect soft targets, such as airports and railway stations, by automatically identifying persons wanted by the authorities.

In sum, Toshiba technology is able to quickly and accurately recognize faces in the crowd. But the specifics are much more interesting. Current technology takes around 20 seconds to identify an individual out of 10 million, and Toshiba can do it in under a second. The precision rate that Toshiba reports is also outstanding at 98%. The world of Minority Report, where ads recognize and direct themselves to random individuals, seems to be increasingly within reach. Perhaps more importantly, this technology should be of pressing concern to the criminal and perceived criminal populations of the world.

Chelsea Kerwin, September 13, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
