Do Not Forget to Show Your Work
November 24, 2016
Showing your work is a messy but necessary step to prove how one arrived at a solution. Most of the time it is never reviewed, but in the era of big data people wonder how computer algorithms arrive at their conclusions. Engadget explains that computers are now being made to prove their results in "MIT Makes Neural Networks Show Their Work."
Understanding neural networks is extremely difficult, but MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) has developed a way to map these complex systems. CSAIL accomplished this by splitting a network into two smaller modules: one extracts segments of text and scores them according to their length and coherence, while the second predicts each segment’s subject and attempts to classify it. The mapping modules sound almost as complex as the actual neural networks. To alleviate the stress and add a giggle to their research, CSAIL had the modules analyze beer reviews:
For their test, the team used online reviews from a beer rating website and had their network attempt to rank beers on a 5-star scale based on the brew’s aroma, palate, and appearance, using the site’s written reviews. After training the system, the CSAIL team found that their neural network rated beers based on aroma and appearance the same way that humans did 95 and 96 percent of the time, respectively. On the more subjective field of “palate,” the network agreed with people 80 percent of the time.
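To make the two-module idea concrete, here is a minimal sketch in PyTorch of such a setup: a generator scores tokens for inclusion in the "rationale," and an encoder predicts a rating from only the kept tokens. The class names, dimensions, and soft-masking shortcut are our own illustrative assumptions, not CSAIL's actual code.

```python
# Hypothetical sketch of a two-module "rationale" network: a generator selects
# text spans, and an encoder classifies using only the selected spans.
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Scores each token; high scores mark tokens kept as the rationale."""
    def __init__(self, vocab_size, embed_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.score = nn.Linear(embed_dim, 1)

    def forward(self, tokens):                 # tokens: (batch, seq_len)
        h = self.embed(tokens)                 # (batch, seq_len, embed_dim)
        probs = torch.sigmoid(self.score(h))   # per-token keep probability
        return probs.squeeze(-1)               # (batch, seq_len)

class Encoder(nn.Module):
    """Predicts a rating from the masked (rationale-only) text."""
    def __init__(self, vocab_size, embed_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.out = nn.Linear(embed_dim, 1)

    def forward(self, tokens, keep_probs):
        h = self.embed(tokens) * keep_probs.unsqueeze(-1)  # soft-mask tokens
        return self.out(h.mean(dim=1))         # pooled rating prediction

# Usage: in the real work the generator's mask is regularized to be short and
# contiguous, so the kept tokens read as a human-checkable justification.
gen, enc = Generator(vocab_size=1000), Encoder(vocab_size=1000)
tokens = torch.randint(0, 1000, (2, 20))       # two toy "reviews"
rating = enc(tokens, gen(tokens))
```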
One set of data is as good as another for testing CSAIL’s network-mapping tool. CSAIL hopes to fine-tune the machine learning project and use it in breast cancer research to analyze pathologists’ data.
Whitney Grace, November 24, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Writing That Is Never Read
November 23, 2016
It is inevitable that at some point in college you were forced to write an essay. Writing an essay usually requires citing various sources from scholarly journals. As you perused the academic articles, the thought probably crossed your mind: who ever reads this stuff? Smithsonian Magazine tells us who in the article "Academics Write Papers Arguing Over How Many People Read (And Cite) Their Papers." In other words: the academics themselves.
Academic articles, the study’s authors write, are read mostly by their own authors, journal editors, and students forced to cite them for assignments. In perfect scholarly fashion, many academics do not believe that their work has a limited scope. So what do they do? They write about it, and they have done so for twenty years.
Most academics are not surprised that most written works go unread. The common belief is that it is better to publish something rather than nothing, and publishing may also be a requirement for keeping one’s position. As they are prone to do, academics complain about the numbers and their accuracy:
It seems like this should be an easy question to answer: all you have to do is count the number of citations each paper has. But it’s harder than you might think. There are entire papers themselves dedicated to figuring out how to do this efficiently and accurately. The point of the 2007 paper wasn’t to assert that 50 percent of studies are unread. It was actually about citation analysis and the ways that the internet is letting academics see more accurately who is reading and citing their papers. “Since the turn of the century, dozens of databases such as Scopus and Google Scholar have appeared, which allow the citation patterns of academic papers to be studied with unprecedented speed and ease,” the paper’s authors wrote.
Academics always need something to argue about, no matter how minuscule the topic. This particular article concludes on the note that someone should get the numbers straight so academics can move on to another item to argue about. Going back to the original thought, a student forced to write an essay with citations probably also thought: the reason this stuff does not get read is that it is so boring.
Whitney Grace, November 23, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Facebook Still Having Trouble with Trending Topics
October 28, 2016
Despite taking action to fix its problems with Trending Topics, Facebook is still receiving criticism on the issue. A post at Slashdot tells us, “The Washington Post Tracked Facebook’s Trending Topics for 3 Weeks, Found 5 Fake Stories and 3 Inaccurate Articles.” The Slashdot post by msmash cites a Washington Post article. (There’s a paywall if, like me, you’ve read your five free WP articles for this month.) The Post monitored Facebook’s Trending Topics for three weeks and found the issue far from resolved. Msmash quotes the report:
The Megyn Kelly incident was supposed to be an anomaly. An unfortunate one-off. A bit of (very public, embarrassing) bad luck. But in the six weeks since Facebook revamped its Trending system — and a hoax about the Fox News Channel star subsequently trended — the site has repeatedly promoted ‘news’ stories that are actually works of fiction. As part of a larger audit of Facebook’s Trending topics, the Intersect logged every news story that trended across four accounts during the workdays from Aug. 31 to Sept. 22. During that time, we uncovered five trending stories that were indisputably fake and three that were profoundly inaccurate. On top of that, we found that news releases, blog posts from sites such as Medium and links to online stores such as iTunes regularly trended. Facebook declined to comment about Trending on the record.
It is worth noting that the team may not have caught every fake story, since it only checked in with Trending Topics once every hour. Quite the quandary. We wonder—would a tool like Google’s new fact-checking feature help? And, if so, will Facebook admit its rival is on to something?
Cynthia Murrell, October 28, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Google Introduces Fact Checking Tool
October 26, 2016
If it works as advertised, a new Google feature will be welcomed by many users. World News Report tells us, “Google Introduced Fact Checking Feature Intended to Help Readers See Whether News Is Actually True—Just in Time for US Elections.” The move is part of a trend among websites, which seem to have recognized that savvy readers do not simply believe everything they read. Writer Peter Woodford reports:
Through an algorithmic process from schema.org known as ClaimReview, live stories will be linked to fact checking articles and websites. This will allow readers to quickly validate or debunk stories they read online. Related fact-checking stories will appear onscreen underneath the main headline. The example Google uses shows a headline over passport checks for pregnant women, with a link to Full Fact’s analysis of the issue. Readers will be able to see if stories are fake or if claims in the headline are false or being exaggerated. Fact check will initially be available in the UK and US through the Google News site as well as the News & Weather apps for both Android and iOS. Publishers who wish to become part of the new service can apply to have their sites included.
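In practice, ClaimReview is structured markup that publishers embed in their fact-check pages so crawlers can match claims to verdicts. Here is a minimal sketch of what such markup might look like; every field value below is invented for illustration, not taken from Google's documentation.

```python
# A hedged sketch of schema.org ClaimReview markup emitted as JSON-LD.
# The URL, claim text, and rating scale are placeholders.
import json

claim_review = {
    "@context": "https://schema.org",
    "@type": "ClaimReview",
    "url": "https://example.org/fact-checks/passport-checks",  # hypothetical
    "claimReviewed": "Pregnant women face new passport checks",
    "author": {"@type": "Organization", "name": "Full Fact"},
    "reviewRating": {
        "@type": "Rating",
        "ratingValue": 1,        # assumed scale: 1 = false, 5 = true
        "bestRating": 5,
        "alternateName": "False",
    },
}

# Publishers would embed this in the article page as a JSON-LD script tag:
print('<script type="application/ld+json">' + json.dumps(claim_review) + "</script>")
```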
Woodford points to Facebook’s recent trouble with the truth within its Trending Topics feature and observes that many people are concerned about the lack of honesty on display this particular election cycle. Google, wisely, did not mention any candidates, but Woodford notes that Politifact rates 71% of Trump’s statements as false (and, I would add, 27% of Secretary Clinton’s statements as false; everything is relative). If the trend continues, it will be prudent for all citizens to rely on (unbiased) fact-checking tools on a regular basis.
Cynthia Murrell, October 26, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Trending Topics: Google and Twitter Compared
October 25, 2016
For those with no time to browse through the headlines, tools that aggregate trending topics can provide a cursory way to keep up with the news. The blog post from communications firm Cision, “How to Find Trending Topics Like an Expert,” examines the two leading trending topic tools—Google’s and Twitter’s. Each approaches its task differently, so the best choice depends on the user’s needs.
Though the Google Trends homepage is limited, according to writer Jim Dougherty, one can get further with its extension, the Google Trends Explore page. He elaborates:
If we go to the Google Trends Explore page (google.com/trends/explore), our sorting options become more robust. We can sort by the following criteria:
*By country (or worldwide)
*By time (search within a customized date range – minimum: past hour, maximum: since 2004)
*By category (arts and entertainment, sports, health, et cetera)
*By Google Property (web search, image search, news search, Google Shopping, YouTube)
You can also use the search feature via the trends page or explore the page to search the popularity of a search term over a period (custom date ranges are permitted), and you can compare the popularity of search terms using this feature as well. The Explore page also allows you to download any chart to a .csv file, or to embed the table directly to a website.
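As the write-up notes next, Google has not published an official Trends API; unofficial Python wrappers such as pytrends nonetheless scrape the same Explore endpoints. A hedged sketch of the kind of query described above, where the search terms, date range, and geography are all placeholder choices:

```python
# A hedged sketch using pytrends, an *unofficial* wrapper around the Google
# Trends Explore endpoints (there is no official public API).
from pytrends.request import TrendReq

pytrends = TrendReq(hl="en-US", tz=360)
pytrends.build_payload(
    ["machine learning", "deep learning"],   # compare two terms' popularity
    timeframe="2016-01-01 2016-10-01",       # custom date range, as in Explore
    geo="US",                                # restrict by country
)
print(pytrends.interest_over_time())          # a DataFrame, exportable to .csv
```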
The write-up goes on to note that there are no robust third-party tools to parse data found with Google Trends/Explore, because the company has not made the API publicly available.
Unlike Google, we’re told, Twitter does not make it intuitive to find and analyze trending topics. However, its inclusion of location data can make Twitter a valuable source for this information, if you know how to find it. Dougherty suggests a work-around:
To ‘analyze’ current trends on the native Twitter app, you have to go to the ‘home’ page. In the lower left of the home page you’ll see ‘trending topics’ and immediately below that a ‘change’ button which allows you to modify the location of your search.
Location is a huge advantage of Twitter trends compared to Google: Although Google’s data is more robust and accessible in general, it can only be parsed by country. Twitter uses Yahoo’s GeoPlanet infrastructure for its location data so that it can be exercised at a much more granular level than Google Trends.
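Because that API is public, a few lines of Python are enough to pull location-scoped trends. A hedged sketch using the tweepy library; the credentials are placeholders, WOEID 1 is the documented worldwide ID, and GeoPlanet city-level IDs drop in the same way:

```python
# Pull trending topics for a location via Twitter's public trends API.
import tweepy

auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")  # placeholders
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
api = tweepy.API(auth)

# WOEID 1 = worldwide; substitute a Yahoo GeoPlanet city ID for local trends.
for trend in api.trends_place(1)[0]["trends"][:10]:
    print(trend["name"], trend.get("tweet_volume"))
```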
Since Twitter does publicly share its trending-topics API, there are third-party tools one can use with Twitter Trends, like TrendoGate, TrendsMap, and ttHistory. The post concludes with a reminder to maximize the usefulness of data with tools that “go beyond trends,” like (unsurprisingly) the monitoring software offered by Dougherty’s company. Paid add-ons may be worth it for some enterprises, but we recommend you check out what is freely available first.
Cynthia Murrell, October 25, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Artificial Intelligence Is Only a Download Away
October 17, 2016
Artificial intelligence remains a thing of imagination in most people’s minds because we do not understand how much it actually impacts our daily lives. If you use a smartphone of any kind, it is programmed with software, apps, and a digital assistant teeming with artificial intelligence. We are just so used to thinking of AI as the product of robots that we are unaware our phones, tablets, and other mobile devices are little robots of their own.
Artificial intelligence programming and development is also on the daily task list of many software technicians. If you happen to have any technical background, you might be interested to know that there are many open source options for beginning to experiment with artificial intelligence. Datamation rounded up the “15 Top Open Source Artificial Intelligence Tools,” and one of these might be the next tool you use to complete your machine learning project. The article shares that:
Artificial Intelligence (AI) is one of the hottest areas of technology research. Companies like IBM, Google, Microsoft, Facebook and Amazon are investing heavily in their own R&D, as well as buying up startups that have made progress in areas like machine learning, neural networks, natural language and image processing. Given the level of interest, it should come as no surprise that a recent artificial intelligence report from experts at Stanford University concluded that ‘increasingly useful applications of AI, with potentially profound positive impacts on our society and economy are likely to emerge between now and 2030.’
The statement reiterates what I already wrote. The list runs down open source tools, including PredictionIO, Oryx 2, OpenNN, MLlib, Mahout, H2O, Distributed Machine Learning Toolkit, Deeplearning4j, CNTK, Caffe, SystemML, TensorFlow, and Torch. The use of each tool is described, and most of them rely on some sort of Apache software. Perhaps your own artificial intelligence project can contribute to further development of these open source tools.
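To give a flavor of one tool on the list, here is a minimal TensorFlow example using the 1.x graph-and-session API that was current when these tools were rounded up. The toy linear model is our own illustration, not something from the article.

```python
# Build and run a tiny computation graph with TensorFlow 1.x.
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None])   # input vector of any length
w = tf.Variable(3.0)                           # a single trainable weight
y = w * x + 1.0                                # a toy linear model

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(y, feed_dict={x: [1.0, 2.0]}))  # -> [4. 7.]
```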
Whitney Grace, October 17, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Malware with Community on the Dark Web
October 14, 2016
While Mac malware is perhaps less common than attacks designed for PCs, it is not entirely absent. The Register covers this in a recent article, “EasyDoc malware adds Tor backdoor to Macs for botnet control.” The malware is disguised as a software application called EasyDoc Converter, which is supposed to be a file converter but does not actually perform that function. Instead, it allows hackers to control the hacked Mac via Tor. The details of the software are explained as follows:
The malware, dubbed Backdoor.MAC.Eleanor, sets up a hidden Tor service and PHP-capable web server on the infected computer, generating a .onion domain that the attacker can use to connect to the Mac and control it. Once installed, the malware grants full access to the file system and can run scripts given to it by its masters. Eleanor’s controllers also uses the open-source tool wacaw to take control of the infected computer’s camera. That would allow them to not only spy on the victim but also take photographs of them, opening up the possibility of blackmail.
A Computerworld article on EasyDoc expands on an additional aspect of this enabled by the Dark Web. Namely, there is a Pastebin agent which takes the infected system’s .onion URL, encrypts it with an RSA public key, and posts it on Pastebin, where attackers can find it and use it. This certainly seems to point to the strengthening of hacking culture and community, however counterintuitive a form of community it may seem to those on the outside.
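The reporting scheme itself is just standard public-key cryptography. A hedged sketch of the idea: encrypt the .onion address with a hard-coded RSA public key so only the key's owner can recover it from the public paste. The key file, onion address, and posting step below are all placeholders, not Eleanor's actual code.

```python
# Illustration only: RSA-encrypt a string so it can be posted publicly.
import base64
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding

with open("operator_public.pem", "rb") as f:       # hypothetical key file
    public_key = serialization.load_pem_public_key(f.read())

ciphertext = public_key.encrypt(
    b"exampleonionaddr.onion",                     # placeholder address
    padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                 algorithm=hashes.SHA256(), label=None),
)
paste_body = base64.b64encode(ciphertext)          # what would be posted
```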
Megan Feil, October 14, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Reverse Image Searching Is Easier Than You Think
October 6, 2016
One of the newest forms of search uses actual images. All search engines from Google to Bing to DuckDuckGo have an image search option, where you can use keywords to find an image to your specifications. Using an actual image to power a search seemed to be a thing of the future, but it has actually been around for a while. The only problem was that reverse image searching sucked, returning poor results.
Now the technology has improved, but very few people actually know how to use it. ZDNet explains how to use this search feature in the article “Reverse Image Searching Made Easy…”. It explains that Google and TinEye are the best places to begin with reverse image search. Google has the larger image database, but TinEye has the better photo expertise. TinEye is better because:
TinEye’s results often show a variety of closely related images, because some versions have been edited or adapted. Sometimes you find your searched-for picture is a small part of a larger image, which is very useful: you can switch to searching for the whole thing. TinEye is also good at finding versions of images that haven’t had logos added, which is another step closer to the original.
TinEye does have its disadvantages, such as outdated results that can no longer be found on the Web. In some cases Google is the better choice, as one can search by usage rights. Browser extensions for image searching are another option. Lastly, if you are a Reddit user, Karma Decay is a useful image search tool, and users often post comments on an image’s origin.
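For the curious, a script can kick off a reverse image search simply by building the query URL each service accepts. A small sketch; the image URL is a placeholder, and these are the publicly used web endpoints rather than official APIs:

```python
# Build reverse-image-search URLs for Google and TinEye.
from urllib.parse import quote

image_url = "https://example.com/photo.jpg"   # hypothetical image
google = "https://www.google.com/searchbyimage?image_url=" + quote(image_url, safe="")
tineye = "https://tineye.com/search?url=" + quote(image_url, safe="")
print(google)
print(tineye)
```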
The future of image searching is now.
Whitney Grace, October 6, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Geoparsing Is More Magical Than We Think
September 23, 2016
The term geoparsing sounds like it has something to do with cartography, but according to Directions Magazine in the article “Geoparsing Maps The Future Of Text Documents,” it is more like an alchemical spell. Geoparsing refers to turning text documents into a geospatial database that allows entity extraction and disambiguation (also known as geotagging). It relies on natural language processing and is generally used to analyze text document collections.
While it might appear that geoparsing is magical, it actually is a complex technological process that relies on data to put information into context. Places often share the same name, so disambiguation can have difficulty assigning the correct tags. Geoparsing has important applications, such as:
Military users will not only want to exploit automatically geoparsed documents, they will require a capability to efficiently edit the results to certify that the place names in the document are all geotagged, and geotagged correctly. Just as cartographers review and validate map content prior to publication, geospatial analysts will review and validate geotagged text documents. Place checking, like spell checking, allows users to quickly and easily edit the content of their documents.
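To make the pipeline concrete, here is a minimal sketch of naive geoparsing: named-entity recognition finds place names, and a gazetteer lookup attaches coordinates. It assumes spaCy's small English model and the free Nominatim geocoder, and it deliberately skips the disambiguation step the article describes.

```python
# Naive geoparsing: NER for place names, then geocoding to coordinates.
import spacy
from geopy.geocoders import Nominatim

nlp = spacy.load("en_core_web_sm")
geocoder = Nominatim(user_agent="geoparse-demo")

text = "The convoy moved from Paris, Texas toward Springfield."
for ent in nlp(text).ents:
    if ent.label_ == "GPE":                  # geopolitical entity
        loc = geocoder.geocode(ent.text)     # naive: no disambiguation
        if loc:
            print(ent.text, "->", (loc.latitude, loc.longitude))
# Note how "Paris" alone resolves to France by default; that is exactly the
# disambiguation problem the article says real geoparsers must solve.
```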
The article acts as a promo piece for the GeoDoc application; however, it does delve into the details of how geoparsing works and its benefits.
Whitney Grace, September 23, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on September 27, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233599645/
Open Source Log File Viewer Glogg
September 21, 2016
Here is an open source solution for those looking to dig up information within large and complex log files; BetaNews shares, “View and Search Huge Log Files with Glogg.” The software reads directly from your drive, saving time and keeping memory free (or at least as free as it was before). Reviewer Mike Williams tells us:
Glogg’s interface is simple and uncluttered, allowing anyone to use it as a plain text viewer. Open a log, browse the file, and the program grabs and displays new log lines as they’re added. There’s also a search box. Enter a plain text keyword, a regular or extended regular expression and any matches are highlighted in the main window and displayed in a separate pane. Enable ‘auto-refresh’ and glogg reruns searches as lines are added, ensuring the matches are always up-to-date. Glogg also supports ‘filters’, essentially canned searches which change text color in the document window. You could have lines containing ‘error’ displayed as black on red, lines containing ‘success’ shown black on green, and as many others as you need.
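What glogg's auto-refresh search does can be pictured in a few lines: follow a growing file and flag lines matching a regular expression. A toy sketch, where the log path and pattern are placeholders and glogg itself does this far more efficiently by reading from disk on demand:

```python
# Follow a growing log file and print lines matching a regex (tail -f style).
import re
import time

pattern = re.compile(r"error", re.IGNORECASE)   # placeholder search term

with open("app.log") as log:                    # hypothetical log file
    log.seek(0, 2)                              # start at the end of the file
    while True:                                 # Ctrl-C to stop
        line = log.readline()
        if not line:
            time.sleep(0.5)                     # wait for new lines
            continue
        if pattern.search(line):
            print(line, end="")
```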
Williams spotted some more noteworthy features, like a quick-text search, highlighted matches, and helpful Next and Previous buttons. He notes the program is not exactly chock-full of fancy features, but suggests that is probably just as well for this particular task. Glogg runs on 64-bit Windows 7 and later, and on Linux.
Cynthia Murrell, September 21, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph