CyberOSINT: Next Generation Information Access Explains the Tech Behind the Facebook, GSR, Cambridge Analytica Matter
April 5, 2018
In 2015, I published CyberOSINT: Next Generation Information Access. This is a quick reminder that the profiles of the vendors who have created software systems and tools for law enforcement and intelligence professionals remain timely.
The 200-page book provides examples, screenshots, and explanations of the tools which are available to analyze social media information. The book is the most comprehensive rundown of the open source, commercial, and cloud-based systems which can make sense of social media data, lawful intercept data, and general text and imagery content.
Companies described in this collection of “tools” include:
- Cyveillance (now LookingGlass)
- Decisive Analytics
- IBM i2 (Analyst’s Notebook)
- Geofeedia
- Leidos
- Palantir Gotham
- and more than a dozen other vendors of commercial and open source, high-impact cyberOSINT tools.
The book is available for $49. Additional information is available on my Xenky.com Web site. You can buy the PDF book online at gum.co/cyberosint.
Get the CyberOSINT monograph. It’s the standard reference for practical and effective analysis, text analytics, and next generation solutions.
Stephen E Arnold, April 5, 2018
Multi-purpose Search Tool Is Like Magic
March 2, 2018
The Internet of things has evolved from an entertaining gimmick for instantly accessing information into an indispensable tool for daily life. Search engines like Google and DuckDuckGo make searching the Internet simple, but in closed systems like databases and storage silos, searching is still complicated. Usually, individual systems have their own out-of-the-box search engines, but their accuracy is so-so. Cloud computing complicates search even more. Instead of searching just one system, cloud computing requires search software that can handle multiple systems at once. The search technology is out there, but can it really perform as well as Google or even DuckDuckGo?
The Code Project wrote about a new, multi-faceted search tool in the post, “Multidatabase Text Search Tool.” Searching text in all files across many systems is one of the most complicated procedures for a search engine, especially if you want accuracy and curated results. That is what DBTextFinder was developed for:
DBTextFinder is a simple tool that helps you to perform a precise search in all the stored procedures, functions, triggers, packages and views code, or a selected subset of them, using regular expressions. Additionally, you can search for a given text in all the text fields of a selected set of tables, using regular expressions too. The application provides connections to MySQL, SQL Server and Oracle servers, and supports remote connections via WCF services. You can easily extend the list of available DBMS writing your own connectors without having to change the application code.
DBTextFinder appears to have it all. It is programmable, gets along well with other computer languages, and was designed to be user-friendly. What more could you ask for?
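For readers curious what a regular expression search over "all the text fields of a selected set of tables" looks like in practice, here is a minimal sketch. It is not DBTextFinder's code. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, SQL Server, or Oracle, and the function names (search_text_columns, search_stored_sql, build_demo_db) and the demo tables are hypothetical, invented for illustration.

```python
import re
import sqlite3


def _is_text_type(declared_type):
    """Treat columns declared as TEXT, CHAR/VARCHAR, or CLOB as text fields."""
    declared = (declared_type or "").upper()
    return any(tag in declared for tag in ("TEXT", "CHAR", "CLOB"))


def search_text_columns(conn, pattern, tables=None):
    """Return (table, column, rowid, value) for each text cell matching pattern.

    If tables is None, every user table in the database is searched.
    """
    regex = re.compile(pattern)
    cur = conn.cursor()
    if tables is None:
        cur.execute("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
        tables = [row[0] for row in cur.fetchall()]
    hits = []
    for table in tables:
        # Table names should come from the catalog or another trusted source,
        # because identifiers cannot be passed as bound parameters.
        cur.execute(f'PRAGMA table_info("{table}")')
        # Each row is (cid, name, type, notnull, dflt_value, pk).
        text_cols = [col[1] for col in cur.fetchall() if _is_text_type(col[2])]
        for col in text_cols:
            cur.execute(f'SELECT rowid, "{col}" FROM "{table}"')
            for rowid, value in cur.fetchall():
                if isinstance(value, str) and regex.search(value):
                    hits.append((table, col, rowid, value))
    return hits


def search_stored_sql(conn, pattern):
    """Return (type, name) for each view or trigger whose SQL matches pattern."""
    regex = re.compile(pattern)
    cur = conn.execute(
        "SELECT type, name, sql FROM sqlite_master "
        "WHERE type IN ('view', 'trigger') ORDER BY name"
    )
    return [(kind, name) for kind, name, sql in cur if sql and regex.search(sql)]


def build_demo_db():
    """Create a small in-memory database (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, notes TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, memo TEXT, total REAL);
        INSERT INTO customers VALUES (1, 'Acme Corp', 'Invoice INV-1001 overdue');
        INSERT INTO customers VALUES (2, 'Globex', 'No open issues');
        INSERT INTO orders VALUES (1, 'Refers to INV-2002', 99.5);
        CREATE VIEW overdue AS
            SELECT name FROM customers WHERE notes LIKE '%overdue%';
        """
    )
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    for hit in search_text_columns(demo, r"INV-\d{4}"):
        print(hit)
    print(search_stored_sql(demo, r"(?i)overdue"))
```

The sketch pulls every text cell back to the client and filters there, which is simple but slow on large tables. A production tool would presumably push the match into the database where the server supports regular expressions.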
Whitney Grace, March 2, 2018
Universal Text Translation Is the Next Milestone for AI
February 9, 2018
As the globe gets smaller, individuals are in more contact with people who don’t speak their language. Or, we are reading information written in a foreign language. Programs like Google Translate are flawed at best and it is clear this is a niche waiting to be filled. With the rise of AI, it looks like that is about to happen, according to a recent GCN article, “IARPA Contracts for Universal Text Translator.”
According to the article:
The Intelligence Advanced Research Projects Activity is a step closer to developing a universal text translator that will eventually allow English speakers to search through multilanguage data sources — such as social media, newswires and press reports — and retrieve results in English.
The intelligence community’s research arm awarded research and performance monitoring contracts for its Machine Translation for English Retrieval of Information in Any Language program to teams headed by leading research universities paired with federal technology contractors.
Intelligence agencies, said IARPA project managers in a statement in late December, grapple with an increasingly multilingual, worldwide data pool to do their analytic work. Most of those languages, they said, have few or no automated tools for cross-language data mining.
This sounds like a very promising opportunity to get everyone speaking the same language. However, we think there is still a lot of room for error. We are placing our bets on Unbabel’s AI translation software that is backed up by human editors. (They raised $23M, so they must be doing something right.) That human angle seems to be the hinge on which success in this rich field will turn.
Patrick Roland, February 9, 2018
Big Data Used to Confirm Bad Science
November 30, 2017
I had thought we had moved beyond harnessing big data and were now focusing on AI and machine learning, but Forbes has some possible new insights in, “Big Data: Insights Or Illusions?”
Big data is a tool that can generate new business insights or it can reinforce a company’s negative aspects. The article consists of an interview with Christian Madsbjerg of ReD Associates. It opens with how Madsbjerg and his colleagues studied credit card fraud by living like a fraudster for a while. They learned some tricks and called their experience contextual analytics. This leads to an important discussion topic:
Dryburgh: This is really interesting, because it seems to me that big data could be a very two-edged sword. On the one hand you can use it in the way that you’ve described to validate hypotheses that you’ve arrived at by very subjective, qualitative means. I guess the other alternative is that you can use it simply to provide confirmation for what you already think.
Madsbjerg: Which is what’s happening, and with the ethos that we’ve got a truth machine that you can’t challenge because it’s big data. So you’ll cement and intensify the toxic assumptions you have in the company if you don’t use it to challenge and explore, rather than to confirm things you already know.
This topic is not new. We are seeing unverified news stories reach the airwaves and circulate on the Internet for the pure sake of generating views and profit. Corporate entities do the same when they would rather churn more money into their coffers than think of their workers or their actual customers. It is also like Hollywood executives making superhero movies based on comic heroes when they have no idea about the medium’s integrity.
In other words, do not forget context and the human factor!
Whitney Grace, November 30, 2017
AI Changes Power Delivery
November 13, 2017
Artificial intelligence has already changed the way our lives progress daily, and it will continue to do so as more advances are made in the field. What is amazing is how AI concepts can be applied to nearly every industry in the world. T&D World takes a look at how AI has affected the power grid in “Artificial Intelligence Is Changing The Power-Delivery Grid.” The author opens by noting how often people say Thomas Edison would be familiar with today’s innovations, and how that remark is a put-down to the industry.
In truth, Edison would be hard pressed to make sense of today’s real-time mechanics and AI structure. AI is an important development in all fields from medicine to finance, and it plays an especially important role in the modern power grid:
Some of us call situational awareness technology machine learning, but most of us use the more common term of artificial intelligence (AI). AI is being used on the grid right now and it is more widespread than you might think. Case in point: In June, the U.S. Department of Energy held its 2017 Transmission Reliability Program in Washington, D.C., and there were several AI presentations. The presentation that caught my attention was about using advanced machine learning with synchrophasor technology. Synchrophasor technology generates a great deal of data using phasor measurement units (PMUs) to measure voltage and current of the grid, which can be used to calculate parameters such as frequency and phase angle.
Edison would not feel at home. Instead, he would want to get his toolset and tear the power grid apart to figure out how it worked. Geniuses may be ahead of their time, but they still are products of their own time.
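The quoted passage says PMU measurements "can be used to calculate parameters such as frequency and phase angle." One common way to get frequency from reported phase angles is the relation f = f0 + (1/2π)(dθ/dt), where θ is the phasor angle measured against a reference rotating at the nominal frequency f0. The sketch below applies that relation with a simple finite difference. It is an illustration only, not the method from the presentation mentioned in the article. The function names, the 60 Hz nominal frequency, the 30 reports per second rate, and the simulated data are all assumptions.

```python
import math


def wrap_angle(angle_rad):
    """Wrap an angle into the interval [-pi, pi)."""
    return (angle_rad + math.pi) % (2 * math.pi) - math.pi


def unwrap(angles_rad):
    """Remove the 2*pi jumps that appear when reported angles wrap around."""
    unwrapped = [angles_rad[0]]
    for angle in angles_rad[1:]:
        step = wrap_angle(angle - unwrapped[-1])
        unwrapped.append(unwrapped[-1] + step)
    return unwrapped


def estimate_frequency(angles_rad, report_rate_hz, nominal_hz=60.0):
    """Estimate frequency between consecutive reports.

    Uses f = f0 + (1 / (2*pi)) * d(theta)/dt with a first-order difference.
    Returns one estimate per pair of consecutive samples.
    """
    dt = 1.0 / report_rate_hz
    unwrapped = unwrap(angles_rad)
    return [
        nominal_hz + (later - earlier) / (2 * math.pi * dt)
        for earlier, later in zip(unwrapped, unwrapped[1:])
    ]


def simulate_angles(actual_hz, report_rate_hz, count, nominal_hz=60.0):
    """Simulate wrapped phase angles for a grid running at a constant frequency."""
    drift = 2 * math.pi * (actual_hz - nominal_hz) / report_rate_hz
    return [wrap_angle(drift * k) for k in range(count)]


if __name__ == "__main__":
    # Hypothetical scenario: the grid runs slightly fast at 60.05 Hz.
    angles = simulate_angles(actual_hz=60.05, report_rate_hz=30, count=400)
    estimates = estimate_frequency(angles, report_rate_hz=30)
    print(f"first estimate: {estimates[0]:.4f} Hz")
```

Real PMU data is noisy, so a practical estimator would filter or fit over several samples rather than difference two neighbors.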
Whitney Grace, November 13, 2017
Treating Google Knowledge Panel as Your Own Home Page
November 8, 2017
Now, this is interesting. Mike Ramsey at AttorneyAtWork suggests fellow lawyers leverage certain Google tools in, “Three Reasons Google Is Your New Home Page.” He points out that Google now delivers accessibility to many entities directly on the results page, reducing the number of clicks potential clients must perform to contact a firm. He writes:
[Google] has rolled out three products that provide potential clients with information about your law firm before they get to your site:
*Messages (on mobile)
*Questions and Answers (on mobile)
*Optional URLs for booking appointments (both mobile and desktop)
This means that Google search results are becoming your new ‘home page.’ All three products — Messages, Questions and Answers and URLs for appointments — are accessible from your Google My Business dashboard. They appear in your local Knowledge Panel in Google. If Google really is becoming your home page, but also giving you a say in providing potential clients with information about your firm, you will definitely want to take advantage of it.
The article explains how to best leverage each tool. For example, Messages let you incorporate text messages into your Knowledge Panel; Ramsey notes that customers prefer using text messages to resolve customer service issues. Questions and Answers will build an FAQ-like dialogue for the panel, while optional URLs allow clients to schedule appointments right from the results page. Ramsey predicts it should take about an hour to set up these tools for any given law firm, and emphasizes it is well worth that investment to make it as easy as possible for potential clients to get in touch.
Cynthia Murrell, November 8, 2017
Privacy Is Lost in Translation
October 30, 2017
Online translation tools are a wonder! Instead of having to rely on limited dictionaries and grammars, online translation tools deliver real-time, nearly accurate translations of documents and other text. It is usually good to double check the translation because sometimes the tools do make mistakes. Translation tools, however, can also cost users their privacy. Quartz tells an alarming story in, “If You Value Your Privacy, Be Careful With Online Translation Tools.”
Norwegian state oil company Statoil used Translate.com to translate sensitive company documents. One would think that would not be a problem, except Translate.com stored the data in the cloud. The sensitive documents included dismissal letters, contracts, workforce reduction plans, and more. News traveled fast in Norway, resulting in the Oslo Stock Exchange blocking employees’ access to Translate.com and Google Translate.
It was dubbed a massive privacy breach as private documents from other organizations and individuals were discovered. Translate.com views the incident differently:
Translate.com sees things a little differently, however, saying it was straight with users about the fact that it was crowdsourcing human translations to improve on machine work. In a Sept. 6 blog post responding to the news reports, the company explained that in the past, they were using human volunteer translators to improve their algorithm, and during that time, had made documents submitted for translation public so that any human volunteers could easily access them. ‘As a precaution, there was a clear note on our homepage stating: “All translations will be sent to our community to improve accuracy.”’
Translate.com also offered to remove any documents upon request, but sensitive documents were still available when the Quartz article was written. Vice president of Sales for Translate.com Maria Burud pointed out that they offer paid translation software intended for businesses to maintain their privacy. Burud notes that anything translated using a free web tool is bound to have privacy issues, but that there is a disclaimer on her company’s Web site. It is up to the user to de-identify the information or watch what they post in a translation box.
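What might "de-identify the information" look like before text goes into a translation box? Here is a rough sketch that masks email addresses, phone-like numbers, and a caller-supplied list of names. The patterns, the placeholder labels, and the deidentify function are assumptions for illustration. They are deliberately crude, will miss plenty of identifying details, and are no substitute for a proper review of sensitive documents.

```python
import re

# Simple illustrative patterns; real documents need far more careful handling.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Matches runs of digits with common separators. This over-redacts on purpose:
# dates and long reference numbers will also be masked.
PHONE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def deidentify(text, names=()):
    """Replace emails, phone-like numbers, and the given names with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    # Names are masked last so the digits in their placeholders are not
    # picked up by the phone pattern.
    for index, name in enumerate(names, start=1):
        text = re.sub(re.escape(name), f"[NAME_{index}]", text, flags=re.IGNORECASE)
    return text


if __name__ == "__main__":
    # Hypothetical sample text.
    sample = "Contact Kari Nordmann at kari@example.com or +47 22 33 44 55."
    print(deidentify(sample, names=["Kari Nordmann"]))
```

The placeholders keep the sentence structure intact, so the translated text can still be read and the masked details restored by hand afterward.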
In other words, watch what you translate and post online. It will come back to haunt you.
Whitney Grace, October 30, 2017
Learn About Machine Learning
August 30, 2017
For an in-depth look at the technology behind Google Translate, turn to Stats and Bots’ write-up, “Machine Learning Translation and the Google Translate Algorithm.” Part of a series that aims to educate users about the technology behind machine learning (ML), the illustrated article delves into the details behind Google’s deep learning translation tools. Writer Daniil Korbut explains the factors that make it problematic to “teach” human language to an AI, then describes Long Short-Term Memory (LSTM) networks, bidirectional RNNs, sequence-to-sequence models, and how Google put those tools together. See the article for those details that are a bit above this writer’s head. There’s just one thing missing—any acknowledgment of the third parties that provide Google with language technology. Oh well.
Another valuable resource on machine learning, found at YCombinator, is Google researcher Jeff Dean’s Lecture for YC AI. The post includes a video that is over an hour long, but it also shares the informative slides from Dean’s presentation. They touch on scientific and medical applications for machine learning, then examine sequence-to-sequence models, automated machine learning, and “higher performance” ML models. One early slide reproduces a Google blog post in which Dean gives a little history (and several relevant links):
Allowing computers to better understand human language is one key area for our research. In late 2014, three Brain team researchers published a paper on Sequence to Sequence Learning with Neural Networks, and demonstrated that the approach could be used for machine translation. In 2015, we showed that this approach could also be used for generating captions for images, parsing sentences, and solving computational geometry problems. In 2016, this previous research (plus many enhancements) culminated in Brain team members working closely with members of the Google Translate team to wholly replace the translation algorithms powering Google Translate with a completely end-to-end learned system (research paper). This new system closed the gap between the old system and human quality translations by up to 85% for some language pairs. A few weeks later, we showed how the system could do “zero-shot translation”, learning to translate between languages for which it had never seen example sentence pairs (research paper). This system is now deployed on the production Google Translate service for a growing number of language pairs.
These surveys of Google’s machine translation tools offer a lot of detailed information for those interested in the topic. Just remember that Google is not (yet?) the only game in town.
Cynthia Murrell, August 30, 2017
Time to Ditch PowerPoint?
August 23, 2017
For decades, Microsoft PowerPoint has been used for making presentations. That is all set to change as a recent study indicates that PowerPoint presentations are ineffective.
An article published by Quartz, titled “The Scientific Reason No One Wants to See Your PowerPoint Presentation,” says:
Because the human brain processes information both visually (using shapes and colors) and spatially (using location and distance), the researchers said, ZUI helps audiences by locating the information in a place, allowing them to mentally retrieve it later.
The problem with the study is that it appears to be too promotional. For instance, the article says tools like Prezi are better for making presentations because they offer a lot of animated options. Why not use Gifographics or stock videos then?
The effectiveness of a presentation mostly depends on the person presenting it. Many speakers completely do away with any type of tool so that their audience can concentrate on what the speaker says. Moreover, the presentation can be made effective if the slides are designed professionally. Don’t be surprised if, in the near future, all presentations are made using VR headsets for that truly immersive experience.
Vishal Ingole, August 23, 2017
Fake News Is Here to Stay
August 22, 2017
Morphed pictures and videos were once the realm of experts. New tools, however, are making it easier for people with average computer skills to create hyper-realistic content.
Mashable reports in an article titled “This Scary Video Tool Makes Fake News Look Legit”:
Researchers at the University of Washington recently announced a new video-editing tool that they used to superimpose audio — with realistic lip movements — onto a video of former U.S. president Barack Obama, making it appear as though he’s saying whatever they want him to.
The intention behind this tool was to help special effects artists in the entertainment industry. However, as is the case with any other tool, its test run was used to create fake news content. Couple this tool with other available tools like Google DeepMind AI and Lyrebird, and a single person sitting in a dungeon could produce any number of fake videos.
Social media platforms are already fighting the menace of fake news. However, such tools make their task tougher. Facebook, for instance, employs an army of analysts to weed out fake news. It seems the problem of fake news is going to get worse.
Vishal Ingole, August 22, 2017