Writing That Is Never Read
November 23, 2016
If you went to college, you were inevitably forced to write an essay. Writing an essay usually requires the citation of various sources from scholarly journals. As you perused the academic articles, the thought probably crossed your mind: who ever reads this stuff? Smithsonian Magazine tells us who in the article, “Academics Write Papers Arguing Over How Many People Read (And Cite) Their Papers.” In other words, themselves.
Academic articles are read mostly by their authors, journal editors, and students forced to cite them for assignments. In perfect scholarly fashion, many academics do not believe that their work has a limited reach. So what do they do? They write about it, and have done so for twenty years.
Most academics are not surprised that most written works go unread. The common belief is that it is better to publish something rather than nothing, and publishing may also be a requirement for keeping their positions. As they are prone to do, academics complain about the numbers and their accuracy:
It seems like this should be an easy question to answer: all you have to do is count the number of citations each paper has. But it’s harder than you might think. There are entire papers themselves dedicated to figuring out how to do this efficiently and accurately. The point of the 2007 paper wasn’t to assert that 50 percent of studies are unread. It was actually about citation analysis and the ways that the internet is letting academics see more accurately who is reading and citing their papers. “Since the turn of the century, dozens of databases such as Scopus and Google Scholar have appeared, which allow the citation patterns of academic papers to be studied with unprecedented speed and ease,” the paper’s authors wrote.
Academics always need something to argue about, no matter how minuscule the topic. This particular article concludes on the note that someone should get the number straight so academics can move on to another item to argue about. Going back to the original thought, a student forced to write an essay with citations probably also thought: the reason this stuff does not get read is that it is so boring.
Whitney Grace, November 23, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Exit Shakespeare, for He Had a Coauthor
November 22, 2016
Shakespeare is regarded as the greatest writer in the English language. Many studies, however, are devoted to the theory that he did not pen all of his plays and poems. Some attribute them to Francis Bacon, Edward de Vere, Christopher Marlowe, and others. Whether Shakespeare was a singular author or one of many, two facts remain: he was a dirty, old man and it could be said he plagiarized his ideas from other writers. Shall he still be regarded as the figurehead for English literature?
Philly.com calls Shakespeare’s authorship into question in the article, “Penn Engineers Use Big Data To Show Shakespeare Had Coauthor On ‘Henry VI’ Plays.” Editors of a new edition of Shakespeare’s complete works listed Marlowe as a coauthor on the Henry VI plays due to a recent study at the University of Pennsylvania. Alejandro Ribeiro realized that his experience researching networks could be applied to the Shakespeare authorship question using big data.
Ribeiro learned that Henry VI was among the works for which scholars thought Shakespeare might have had a co-author, so he and lab members Santiago Segarra and Mark Eisen tackled the question with the tools of big data. Working with Shakespeare expert Gabriel Egan of De Montfort University in Leicester, England, they analyzed the proximity of certain target words in the playwright’s works, developing a statistical fingerprint that could be compared with those of other authors from his era.
Two other research groups reached the same conclusion using other analytical techniques. The results from all three studies were enough to convince the lead general editor of the New Oxford Shakespeare, Gary Taylor, who decided to list Marlowe as a coauthor of Henry VI. More research has been conducted to determine other potential Shakespeare coauthors, and six more will also be credited in the New Oxford editions.
Ribeiro and his team created “word-adjacency networks” that discovered patterns in Shakespeare’s writing style and that of six other dramatists. They discovered that many scenes in Henry VI were not written in Shakespeare’s style, enough to indicate a coauthor.
Some Shakespeare purists remain against the theory that Shakespeare did not pen all of his plays, but big data analytics supports theories that other academics have advanced for generations. The dirty old man was not alone as he wrote his ditties.
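For readers curious what a “word-adjacency” fingerprint might look like in practice, here is a minimal Python sketch. It is not the Penn team’s method: the tiny function-word list, the five-word window, and the simple absolute-difference distance are all assumptions chosen for illustration. The idea is only to show how counting which common words tend to follow which others yields a profile that can be compared across texts.

```python
from collections import defaultdict
import re

# Hypothetical, tiny function-word list; a real study would use far more words.
FUNCTION_WORDS = ["the", "and", "of", "to", "in", "a", "that", "with"]


def adjacency_profile(text, window=5):
    """For each function word, count the function words that appear within
    `window` words after it, then normalize each row so it sums to 1."""
    words = re.findall(r"[a-z']+", text.lower())
    targets = set(FUNCTION_WORDS)
    counts = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(words):
        if word not in targets:
            continue
        for other in words[i + 1 : i + 1 + window]:
            if other in targets:
                counts[word][other] += 1.0
    profile = {}
    for word, row in counts.items():
        total = sum(row.values())
        profile[word] = {other: c / total for other, c in row.items()}
    return profile


def profile_distance(p, q):
    """Sum of absolute differences between two profiles.
    Smaller values mean the two texts use function words more alike.
    (An assumed, simplistic distance; not the measure used in the study.)"""
    dist = 0.0
    for word in FUNCTION_WORDS:
        for other in FUNCTION_WORDS:
            dist += abs(p.get(word, {}).get(other, 0.0)
                        - q.get(word, {}).get(other, 0.0))
    return dist
```

In use, one would build a profile from text of known authorship and another from a disputed scene, then check which known profile sits closest to the disputed one. Real attribution work needs much larger samples and a more careful statistical comparison than this toy distance.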
Whitney Grace, November 22, 2016
All the Things Watson Could Do
November 21, 2016
One of our favorite artificial intelligence topics has made the news again: Watson. Technology Review focuses on Watson’s job descriptions and his emergence in new fields, “IBM’s Watson Is Everywhere-But What Is It?” We all know that Watson won Jeopardy and has been deployed as the ultimate business intelligence solution, but what exactly does Watson do for a company?
The truth about Watson’s Jeopardy appearance is that very little of the technology was used. In reality, Watson is an umbrella name IBM uses for an entire group of its machine learning and artificial intelligence technologies. The Watson brand is employed in a variety of ways, from medical disease interpretation to creating new recipes via experimentation. The technology can be used in many industries and applied to a variety of scenarios. It all depends on what the business needs resolved. There is another problem:
Beyond the marketing hype, Watson is an interesting and potentially important AI effort. That’s because, for all the excitement over the ways in which companies like Google and Facebook are harnessing AI, no one has yet worked out how AI is going to fit into many workplaces. IBM is trying to make it easier for companies to apply these techniques, and to tap into the expertise required to do so.
IBM is experiencing problems of its own, but beyond those, another consideration is Watson’s expense. Businesses are usually eager to incorporate new technology if the benefit is huge. However, they are reluctant to make the initial outlay, especially if the technology is still experimental and not yet standard. Nobody wants to be a guinea pig, but someone needs to set the pace for everyone else. So who wants to deploy Watson?
Whitney Grace, November 21, 2016
Hacking the Internet of Things
November 17, 2016
Readers may recall October’s DDoS attack against internet-performance-management firm Dyn, which disrupted web traffic at popular sites like Twitter, Netflix, Reddit, and Etsy. As it turns out, the growing “Internet of Things (IoT)” facilitated that attack; specifically, thousands of cameras and DVRs were hacked and used to bombard Dyn with page requests. CNet examines the issue of hacking through the IoT in, “Search Engine Shodan Knows Where Your Toaster Lives.”
Reporter Laura Hautala informs us that it is quite easy for those who know what they’re doing to access any and all internet-connected devices. Skilled hackers can do so using search engines like Google or Bing, she tells us, but tools created for white-hat researchers, like Shodan, make the task even easier. Hautala writes:
While it’s possible hackers used Shodan, Google or Bing to locate the cameras and DVRs they compromised for the attack, they also could have done it with tools available in shady hacker circles. But without these legit, legal search tools, white hat researchers would have a harder time finding vulnerable systems connected to the internet. That could keep cybersecurity workers in a company’s IT department from checking which of its devices are leaking sensitive data onto the internet, for example, or have a known vulnerability that could let hackers in.
Even though sites like Shodan might leave you feeling exposed, security experts say the good guys need to be able to see as much as the bad guys can in order to be effective.
Indeed. As with every tool ever invented, the impact of Shodan depends on the intentions of the people using it.
Cynthia Murrell, November 17, 2016
Dark Web Marketplaces Are Getting Customer Savvy
November 17, 2016
Offerings on Dark Web marketplaces are getting weirder by the day. Apart from guns, ammo, porn, and fake identities, products like forged train tickets are now available for sale.
The Guardian, in an investigative article titled “Dark Web Departure: Fake Train Tickets Go on Sale Alongside AK-47s,” reveals that:
At least that’s the impression left by an investigation into the sale of forged train tickets on hidden parts of the internet. BBC South East bought several sophisticated fakes, including a first-class Hastings fare, for as little as a third of their face value. The tickets cannot fool machines but barrier staff accepted them on 12 occasions.
According to the group selling these tickets, the counterfeiting was done to inflict financial losses on the operators who are providing deficient services. Of course, it is also possible that the fake tickets are used by people (without criminal inclinations) who simply do not want to pay full fare.
One school of thought also says that, like online marketplaces on the Open Web, Dark Web marketplaces are getting customer-savvy and are providing products and services that customers need or want. This becomes apparent in this portion of the article:
The academics say the sites, once accessed by invitation or via dark-web search engines (there’ll be no hyperlinks here) resemble typical marketplaces such as Amazon or eBay, and that customer service is improving. “Agora was invitation-only but many of these marketplaces are easily accessible if you know how to search,” Dr Lee adds. “I think any secondary school student who knows how to use Google could get access – and that’s the danger of it.”
One of the most active consumer groups on the Dark Web happens to be students, who are purchasing anything from fake certificates to hacker services to improve their grades and attendance records. Educational institutions, as well as law enforcement officials, are worried about this trend. And as more people get savvy with the Dark Web, this trend is going to strengthen, creating a parallel e-commerce ecosystem, albeit a dark one.
Vishal Ingole, November 17, 2016
Black-Hat SEO Tactics Google Hates
November 16, 2016
The article on Search Engine Watch titled “Guide to Black Hat SEO: Which Practices Will Earn You a Manual Penalty?” follows up on a prior article that listed some of the sob stories of companies caught by Google using black-hat practices. Google does not take kindly to such activities, strangely enough. This article goes through some of those practices, which are meant to “falsely manipulate a website’s search position.”
Any kind of scheme where links are bought and sold is frowned upon, however money doesn’t necessarily have to change hands… Be aware of anyone asking to swap links, particularly if both sites operate in completely different niches. Also stay away from any automated software that creates links to your site. If you have guest bloggers on your site, it’s good idea to automatically Nofollow any links in their blog signature, as this can be seen as a ‘link trade’.
Other practices that earned a place on the list include automatically generated content, cloaking and irrelevant redirects, hidden text and links, and doorway pages, which are multiple pages for a key phrase that lead visitors to the same end destination. If you think these activities don’t sound so terrible, you are in great company. Mozilla, BMW, and the BBC have all been caught and punished by Google for such tactics. Good or bad? You decide.
Chelsea Kerwin, November 16, 2016
AI to Profile Gang Members on Twitter
November 16, 2016
Researchers from the Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis) claim that an algorithm they developed can identify gang members on Twitter.
Vice.com recently published an article titled “Researchers Claim AI Can Identify Gang Members on Twitter,” which describes:
A deep learning AI algorithm that can identify street gang members based solely on their Twitter posts, and with 77 percent accuracy.
The article then points out the shortcomings of the algorithm or AI by saying this:
According to one expert contacted by Motherboard, this technology has serious shortcomings that might end up doing more harm than good, especially if a computer pegs someone as a gang member just because they use certain words, enjoy rap, or frequently use certain emojis—all criteria employed by this experimental AI.
The shortcomings do not end here. The data on Twitter is being analyzed in a silo. For example, let us assume that a few gang members are identified using the algorithm (remember, no location information is taken into consideration by the AI). What next?
Is it not necessary then to also identify other social media profiles of the supposed gang members, look at Big Data generated by them, analyze their communication patterns and then form some conclusion? Unfortunately, none of this is done by the AI. It, in fact, would be a mammoth task to extrapolate data from multiple sources just to identify people with certain traits.
And most importantly, what if the AI is put in place, and someone, just for fun, makes an innocent person look like a gang member? As the article rightly points out, machines trained on prejudiced data tend to reproduce those same, very human, prejudices.
Vishal Ingole, November 16, 2016
French Smart Software Companies: Some Surprises
November 15, 2016
I read “French AI Ecosystem.” Most of the companies have zero or a low profile in the United States. The history of French high technology outfits remains a project for an enterprising graduate student with one foot in La Belle France and one in the USA. This write up is a bit of a sales pitch for venture capital in my opinion. The reason that VC inputs are needed is that raising money in France is (how shall I put this?) not easy. There is no Silicon Valley. There is Paris and a handful of other acceptable places to be intelligent. In the Paris high tech setting, there are a handful of big outfits and lots and lots of institutions which keep the French one percent in truffles and the best the right side of the Rhone has to offer. The situation is dire unless the start up is connected by birth, connected by education at one of the acceptable institutions, or hooked up with a government entity. I want to mention that there is a bit of French ethnocentrism at work in the French high tech scene. I won’t go into detail, but you can check it out yourself if you attend a French high tech conference in one of the okay cities. Ars-en-Ré and Gémenos do not qualify. Worth a visit, however.
Now to the listings. You will have to work through the almost unreadable graphic or contact the outfit creating the listing, which is why the graphic is unreadable I surmise. From the version of the graphic I saw, I did discern a couple of interesting points. Here we go:
Three outfits were identified as having natural language capabilities. These are Proxem, syJLabs (no, I don’t know how to pronounce this “syjl” string. I can do “abs”, though.), and Yseop (maybe, Aesop from the fable?). Proxem offers its Advanced Natural Language Object-Oriented Processing Environment (Antelope). The company was founded in 2007. syJLabs does not appear in my file of French outfits, and we drew a blank when looking for the company’s Web site. Sigh. Yseop has been identified as a “top IT innovator” by an objective, unimpeachable, high value, super credible, wonderful, and stellar outfit (Ventana Research). Yseop, also founded in 2007, offers a system which “turns data into narrative in English, French, German, and Spanish, all at the speed of thousands of pages per second.”
As I worked through a graphic containing lots of companies, I spotted two interesting inclusions. The first is Sinequa, a vendor of search founded in 2002, now positioned as an important outfit in Big Data and machine learning. Fascinating. The reinvention of Sinequa is a logical reaction to the implosion of the market for search and retrieval for the enterprise. The other company I noted was Antidot, which mounted a push to the US market several years ago. Antidot, like Sinequa, focused on information access. It too is “into” Big Data and machine learning.
I noted some omissions; for example, Hear&Know, among others. Too bad the listing is almost unreadable and does not include a category for law enforcement, surveillance, and intelligence innovators.
Stephen E Arnold, November 15, 2016
Oh No! The Ads Are Becoming Smarter
November 15, 2016
I love Christmas and the holiday season that follows, although I am tired of it starting in October. Thankfully the holiday music does not start playing until Thanksgiving week, and neither do the ads, although they have been sneaking in earlier and earlier each year. I like the fact that commercials and Internet ads are inanimate objects, so I can turn them off. IT Pro Portal tells me, however, that I might be in for a Christmas nightmare: “IBM’s Watson Now Used In Native Advertising,” or, the ads are becoming smarter!
While credit card expenditures, browsing history, and other factors are already used for individualized, targeted ads, they still remain a static tool dependent on external factors. Watson is going to be tried in the advertising game to improve targeting in native advertising. Watson will add an aesthetic quality too:
The difference is – it’s not just looking at keywords as the practice was so far – it’s actually looking at the ad, determining what it’s about and then places it where it believes is a good fit. According to the press release, Watson “looks at where, why and how the existing editorial content on each site is ‘talking about’ subjects”, and then makes sure best ads are placed to deliver content in proper context.
Another description of Watson’s implementation in advertising is “semantic targeting AI for native advertising.” It will work in real time and deliver more individualized, targeted ads, going beyond your recent Amazon, eBay, and other Web site shopping. It is interesting that Watson can process all this information for one person, but if you imagine that the same technology is being used in the medical and law fields, it does inspire hope.
Whitney Grace, November 15, 2016
Most Dark Web Content Is Legal and Boring
November 15, 2016
Data crunching done by an information security firm reveals that around 55% of Dark Web content is legal and mundane, like the clear or Open Web.
Digital Journal, which published the article “Despite its Nefarious Reputation, New Report Finds Majority of Activity on the Dark Web is Totally Legal and Mundane,” says that:
“What we’ve found is that the dark web isn’t quite as dark as you may have thought,” said Emily Wilson, Director of Analysis at Terbium Labs. “The vast majority of dark web research to date has focused on illegal activity while overlooking the existence of legal content. We wanted to take a complete view of the dark web to determine its true nature and to offer readers of this report a holistic view of dark web activity — both good and bad.”
The findings have been collected in a report, “The Truth About the Dark Web: Separating Fact from Fiction,” that puts the Dark Web in a new light. According to this report, around 55% of the content on the Dark Web is legal; porn makes up 7% of content on the Dark Web, and most of it is legal. Though drugs are a favorite topic, only 45% of the content related to them can be termed illegal. Fraud, extremism, and illegal weapons trading, on the other hand, make up just 5-7% of the Dark Web.
The research used a mix of machine intelligence and human intelligence, as pointed out in the article:
“Conducting research on the dark web is a difficult task because the boundaries between categories are unclear,” said Clare Gollnick, Chief Data Scientist at Terbium Labs. “We put significant effort into making sure this study was based on a representative, random sample of the dark web. We believe the end result is a fair and comprehensive assessment of dark web activity, with clear acknowledgment of the limitations involved in both dark web data specifically and broader limitations of data generally.”
The Dark Web is slowly gaining traction as users of the Open Web find utility in this hidden portion of the Internet. Though the study is illuminating indeed, it fails to address how much of the illegal activity or content on the Dark Web affects the real world. For instance, what quantity of the drug trade takes place over the Dark Web? Any answers?
Vishal Ingole, November 15, 2016