Fighting the Academic Publishers Gets You Fired

September 11, 2015

Academic publishers such as Springer and Elsevier hold a near-monopoly on scholarly publishing, and they do not want to loosen their grip.  A report from The Guardian, posted to Slashdot’s science forum as “Paywalled Science Journals Under Fire Again,” describes how the academic publishers won a battle in Australia.

The Medical Journal of Australia (MJA) fired its editor, Professor Stephen Leeder, after he expressed his displeasure over the journal outsourcing its functions to Elsevier.  Leeder may have lost his job, but he will speak at a symposium at the State Library of NSW about ways academic communities can fight the commoditization of knowledge.

What is concerning is that academic publishers are more interested in turning a profit than in expanding humanity’s knowledge base:

“Alex Holcombe, an associate professor of psychology who will also be presenting at the symposium, said the business model of some of the major academic publishers was more profitable than owning a gold mine. Some of the 1,600 titles published by Elsevier charged institutions more than $19,000 for an annual subscription to just one journal. The Springer group, which publishes more than 2,000 titles, charges more than $21,000 for access to some of its titles. ‘The mining giant Rio Tinto has a profit margin of about 23%,’ Holcombe said. ‘Elsevier consistently comes in at around 37%. Open access publishing is catching on, but it requires researchers to pay up to $3000 to get a single open access article published.’”

Where does the pursuit of knowledge actually take place if researchers are at the mercy of academic publishers?  One might say that researchers could publish their work for free on the Web, but remember that anyone can do that.  Being published under a reputable banner adds to a study’s credibility and helps it get cited in support of other research.  The problem lies in the fact that the big academic publishers limit access to subscription holders, and those subscriptions are often too expensive for the average researcher to afford on their own.  Researchers want access to more academic content, but it is being locked down.

Whitney Grace, September 11, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

A Fun Japanese Elasticsearch Promotion Video

September 10, 2015

Elasticsearch is one of the top open source search engines and is employed by many companies, including Netflix, Wikipedia, GitHub, and Facebook.  Elasticsearch wants to gain a foothold in the Japanese technology market, and no wonder: Japan is one of the world’s top producers of advanced technology and has a huge consumer base.  Once a technology is adopted in Japan, you can bet its adoption will spread even further.

The company has launched a Japanese promotional campaign and uploaded a video entitled “Elasticsearch Product Video” to its YouTube channel.  The video comes with Japanese subtitles and features appearances by CEO Steven Schuurman, VP of Engineering Kevin Kluge, Elasticsearch creator Shay Banon, and VP of Sales Justin Hoffman.  It showcases how Elasticsearch is open source software, how it has been integrated into many companies’ frameworks, its worldwide reach, its ongoing product improvement, and the good it can do.

Justin Hoffman said, “I think the concept of an open source company bringing a commercial product to market is very important to our company.  Because the customers want to know on one hand that you have the open source community and its evolution and development at the top of your priority list.  On the other hand, they appreciate that you’re innovating and bringing products to market that solve real problems.”

It is a neat video that runs down what Elasticsearch is capable of; the only complaint is the bland music in the background.  The company could benefit from licensing the Jive Aces’ “Bring Me Sunshine,” which conveys the proper mood.

Whitney Grace, September 10, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

The AI Evolution

September 10, 2015

An article at WT Vox announces, “Google Is Working on a New Type of Algorithm Called ‘Thought Vectors’.” It sounds like a good use for a baseball cap with electrodes, a battery pack, WiFi, and a person who thinks great thoughts. In actuality, it’s a project based on the work of esteemed computer scientist Geoffrey E. Hinton, who has been exploring the idea of neural networks for decades. Hinton is now working with Google to create the sophisticated algorithm of our dreams (or nightmares, depending on one’s perspective).

Existing language processing software has come a very long way; Google Translate, for example, searches dictionaries and previously translated documents to translate phrases. The app usually does a passably good job of giving one the gist of a source document, but results are far from reliably accurate (and are often grammatically comical). Thought vectors, on the other hand, will allow software to extract meanings, not just correlations, from text.

Continuing to use translation software as the example, reporter Aiden Russell writes:

“The technique works by ascribing each word a set of numbers (or vector) that define its position in a theoretical ‘meaning space’ or cloud. A sentence can be looked at as a path between these words, which can in turn be distilled down to its own set of numbers, or thought vector….

“The key is working out which numbers to assign each word in a language – this is where deep learning comes in. Initially the positions of words within each cloud are ordered at random and the translation algorithm begins training on a dataset of translated sentences. At first the translations it produces are nonsense, but a feedback loop provides an error signal that allows the position of each word to be refined until eventually the positions of words in the cloud captures the way humans use them – effectively a map of their meanings.”
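
As an aside, here is a toy sketch of the word-vector idea the quote describes: random starting positions in a small “meaning space,” a sentence distilled to a single vector, and a feedback loop that nudges word positions toward a target. Everything here (the vocabulary, the eight dimensions, the averaging, the update rule) is invented for illustration; it is not Google’s actual algorithm.

```python
# Toy illustration of the "meaning space" idea: each word gets a vector, a
# sentence is distilled to one vector, and an error signal refines positions.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "mat"]

# Start with random positions in a small, hypothetical "meaning space".
vectors = {word: rng.normal(size=8) for word in vocab}

def sentence_vector(sentence):
    """Distill a sentence down to a single vector by averaging its word vectors."""
    return np.mean([vectors[w] for w in sentence.split()], axis=0)

def training_step(sentence, target, learning_rate=0.1):
    """Crude feedback loop: nudge each word toward reducing the error signal.
    In a real system the target would come from the training objective,
    e.g. the translated sentence."""
    error = target - sentence_vector(sentence)
    for w in sentence.split():
        vectors[w] += learning_rate * error
    return np.linalg.norm(error)

target = rng.normal(size=8)  # stand-in for a training signal
for _ in range(20):
    residual = training_step("the cat sat on the mat", target)
print(f"residual error after training: {residual:.4f}")
```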

But, won’t all efficient machine learning lead to a killer-robot-ruled dystopia? Hinton bats away that claim as a distraction; he’s actually more concerned about the ways big data is already being (mis)used by intelligence agencies. The man has a point.

Cynthia Murrell, September 10, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Datameer Declares a Celebration

September 8, 2015

The big data analytics and visualization company Datameer, Inc. has cause to celebrate: it has received a huge investment.  How happy is Datameer?  CEO Stefan Groschupf explains on the company blog in the post, “Time To Celebrate The Next Stage Of Our Journey.”

Datameer received $40 million in a round of financing from ST Telemedia, Top Tier Capital Partners, Next World Capital, Redpoint, Kleiner Perkins Caufield & Byers, Software AG, and Citi Ventures.  Groschupf recounts how Datameer entered the market in 2009 with a vision to democratize analytics.  Since then, Datameer has helped solve problems across the globe and, he says, is even helping make the world a better place.  He continues that he is humbled by the trust investors and clients place in Datameer, and he stresses the importance of analytics not only for companies but for anyone who wants supportable truth.

Datameer has big plans for the funding:

“We’ll be focusing on expanding globally, with an eye toward APAC and Latin America as well as additional investment in our existing teams. I’m looking forward to continuing our growth and building a long-term, sustainable company that consistently provides value to our customers. Our vision has been the same since day one – to make big data analytics easy for everyone. Today, I’m happy to say we’re still where we want to be.”

Datameer was one of the early contenders in big data, and it has consistently managed to outshine and outperform its bigger-name competitors.  Despite its record growth, Datameer remains true to its open source roots.  The company wants to make analytics available to every industry and everyone.  What is particularly impressive is that Datameer’s products find applications in fields from gaming to healthcare, a range that is rarely seen.  Congratulations to Datameer!

Whitney Grace, September 8, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Content Matching Helps Police Bust Dark Web Sex Trafficking Ring

September 4, 2015

The Dark Web is not only used to buy and sell illegal drugs; it is also used to perpetuate sex trafficking, especially of children.  The efforts of law enforcement agencies to prevent the abuse of sex trafficking victims are detailed in a report by the Australian Broadcasting Corporation called “Secret ‘Dark Net’ Operation Saves Scores Of Children From Abuse; Ringleader Shannon McCoole Behind Bars After Police Take Over Child Porn Site.”  For ten months, Argos, the Queensland police anti-pedophile taskforce, tracked usage on an Internet bulletin board whose 45,000 members viewed and uploaded child pornography.

The Dark Web is notorious for encrypting user information, and that is one of its main draws: users can conduct business or other illegal activities, such as viewing child pornography, without fear of retribution.  Even the Dark Web, however, leaves a digital trail, and Argos was able to track down the Web site’s administrator.  The administrator turned out to be an Australian childcare worker, who was subsequently sentenced to 35 years in jail for sexually abusing seven children in his care and sharing child pornography.

Argos was able to catch the perpetrator by noticing patterns in his language usage in posts he made to the bulletin board (he used the greeting “hiya”). Using advanced search techniques, the police sifted through results and narrowed them down to a Facebook page and a photograph.  From the Facebook page, they got the administrator’s name and made an arrest.

After arresting the ringleader, Argos took over the community and started to track down the rest of the users.

“‘Phase two was to take over the network, assume control of the network, try to identify as many of the key administrators as we could and remove them,’ Detective Inspector Jon Rouse said.  ‘Ultimately, you had a child sex offender network that was being administered by police.’”

When they took over the network, the police had to work in real time to interact with the users and gather information to make arrests.

Even though the Queensland police were able to end one Dark Web child pornography ring and save many children from abuse, there are still many Dark Web sites centered on child sex trafficking.


Whitney Grace, September 4, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Freedom Versus Fear

September 4, 2015

The Ashley Madison data breach has understandably been getting a lot of press, but what does it portend for the future of the Internet? Computerworld’s Tech Decoder predicts far-reaching consequences in “Here’s Why the Dark Web Just Got a Lot Darker.” Security experts predict a boom in phishing scams connected to this data breach, as well as copycat hackers poised to attack other (more legit) companies.

Reporter John Brandon suspects such activity will lead to the government stepping in to create two separate Internet channels: one “wild and unprotected” side and a “commercial” side, perhaps sponsored by big-name communications companies, that comes with an expectation of privacy. Great, one might think, we won’t have to worry if we’re not up to anything shady! But there’s more to it. Brandon explains:

“The problem is that I’m a big proponent of entrepreneurship. I won’t comment on whether I think Ashley Madison is a legitimate business. … However, I do want to defend the rights of some random dude in Omaha who wants to sell smartphone cables. He won’t have a chance to compete on the ‘commercial’ side of the Internet, so he’ll probably have to create a site on the unprotected second-tier channel, the one that is ‘free and open’ for everyone. Good luck with that.

“Is it fair? Is it even (shudder) moral? The commercial side will likely be well funded, fast, reliable, government-sanctioned, and possibly heavily taxed. The free side will be like drinking water at the local cesspool. In the end, the free and open Internet is that way for a reason. It’s not so you can cheat on your wife. Frankly, people will do that with or without the Internet. The ‘free and open’ bit is intended to foster ideas. It’s meant to level the playing field. It’s meant to help that one guy in Omaha.”

Yes, security is important, but so is opportunity. Can our society strike a balance, or will fear reign? Stay tuned.

Cynthia Murrell, September 4, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Dark Web Drug Trade Unfazed by Law Enforcement Crackdowns

September 3, 2015

When Silk Road was taken down in 2013, the Dark Web took a big hit, but it was only a few months before black marketers found alternate means to sell their wares, including illegal drugs.  The Dark Web provides an anonymous and often secure means to purchase everything from heroin to prescription narcotics with, apparently, few worries about the threat of prosecution.  Wired explains that “Crackdowns Haven’t Stopped The Dark Web’s $100M Yearly Drug Sale,” proving that if there is a demand, the Internet will provide a means for illegal sales.

In an effort to determine whether the Dark Web markets had grown or declined, Carnegie Mellon researchers Nicolas Christin and Kyle Soska studied thirty-five Dark Web markets from 2013 to January 2015.  They discovered that the Dark Web markets are no longer growing explosively; instead, the market has remained stable, fluctuating between $100 million and $180 million a year.

The researchers concluded that the Dark Web market is able to survive any “economic” shifts, including law enforcement crackdowns:

“More surprising, perhaps, is that the Dark Web economy roughly maintains that sales volume even after major disasters like thefts, scams, takedowns, and arrests. According to the Carnegie Mellon data, the market quickly recovered after the Silk Road 2 market lost millions of dollars of users’ bitcoins in an apparent hack or theft. Even law enforcement operations that remove entire marketplaces, as in last year’s purge of half a dozen sites in the Europol/FBI investigation known as Operation Onymous, haven’t dropped the market under $100 million in sales per year.”

Christin and Soska’s study is the most comprehensive yet to measure the size and trajectory of the Dark Web’s drug market.  It ended prematurely because two Web sites grew so big that the researchers’ software was unable to track their content.  The study showed that most Dark Web vendors are using more encryption tools, make profits of less than $1,000, and mostly sell MDMA and marijuana.

Soska and Christin also argue that the Dark Web drug trade decreases violence in the retail drug trade; that is, it keeps transactions digital rather than adding to violence on the streets.  They urge law enforcement officials to rethink shutting down Dark Web markets, because doing so does not seem to have any lasting effect.

Whitney Grace, September 3, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Watson Speaks Naturally

September 3, 2015

While there are many companies that offer accurate natural language comprehension software, completely understanding the complexities of human language still eludes computers.  IBM reports that it is close to overcoming the natural language barrier with IBM Watson Content Analytics, as described in “Discover And Use Real-World Terminology With IBM Watson Content Analytics.”

The tutorial points out that any analytics program relying only on structured data loses about four fifths of the available information, a big disadvantage in the big data era, especially when insights are supposed to be hidden in the unstructured data.  Watson Content Analytics is a search and analytics platform that uses rich text analysis to extract actionable insights from sources such as email, social media, Web content, and databases.

Watson Content Analytics can be used in two ways:

  • “Immediately use WCA analytics views to derive quick insights from sizeable collections of contents. These views often operate on facets. Facets are significant aspects of the documents that are derived from either metadata that is already structured (for example, date, author, tags) or from concepts that are extracted from textual content.
  • Extracting entities or concepts, for use by WCA analytics view or other downstream solutions. Typical examples include mining physician or lab analysis reports to populate patient records, extracting named entities and relationships to feed investigation software, or defining a typology of sentiments that are expressed on social networks to improve statistical analysis of consumer behavior.”

The tutorial then walks through a domain-specific terminology application for Watson Content Analytics.  The exercise gets quite involved, but it shows how Watson Content Analytics can go beyond the typical big data application.
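
For readers who want a concrete picture of the facet idea in the quoted passage, here is a minimal sketch: facets counted from structured metadata, plus “concepts” pulled from unstructured text with a toy keyword dictionary.  The field names and keyword list are invented for the example; this is not the Watson Content Analytics API.

```python
# Hypothetical facet-style analysis over a tiny document collection.
from collections import Counter

documents = [
    {"author": "smith", "date": "2015-08", "text": "Patient reports chest pain and shortness of breath."},
    {"author": "jones", "date": "2015-08", "text": "Lab analysis shows elevated glucose levels."},
    {"author": "smith", "date": "2015-09", "text": "Follow-up: chest pain resolved after treatment."},
]

# Facets from metadata that is already structured (author, date).
author_facet = Counter(doc["author"] for doc in documents)
date_facet = Counter(doc["date"] for doc in documents)

# "Concepts" extracted from the unstructured text with a toy keyword dictionary.
concept_terms = {"chest pain": "cardiac", "glucose": "metabolic"}
concept_facet = Counter(
    label
    for doc in documents
    for term, label in concept_terms.items()
    if term in doc["text"].lower()
)

print("By author:", dict(author_facet))
print("By month:", dict(date_facet))
print("By concept:", dict(concept_facet))
```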

Whitney Grace, September 3, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Suggestions for Developers to Improve Functionality for Search

September 2, 2015

The article on SiteCrafting titled “MaxxCAT Pro Tips” lays out some guidelines for improved functionality when it comes to deep search. Limiting Your Crawls is the first suggestion. Since not all links are created equal, it is wise to avoid runaway crawls on links where there will always be a “Next” button; the article suggests hand-selecting the links you want to use. The second tip is Specify Your Snippets. The article explains,

“When MaxxCAT returns search results, each result comes with four pieces of information: url, title, meta, and snippet (a preview of some of the text found at the link). By default, MaxxCAT formulates a snippet by parsing the document, extracting content, and assembling a snippet out of that content. This works well for binary documents… but for webpages you wanted to trim out the content that is repeated on every page (e.g. navigation…) so search results are as accurate as possible.”
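
To make the snippet-trimming idea concrete, here is a minimal sketch that drops lines repeated on every page (navigation, footers) before building a preview.  It illustrates the general technique only; it is not MaxxCAT’s actual pipeline or configuration.

```python
# Build a snippet for one page after removing site-wide boilerplate lines.
def build_snippet(pages, page_index, max_chars=160):
    """Return a preview of one page with lines shared by every page removed."""
    line_sets = [set(page.splitlines()) for page in pages]
    boilerplate = set.intersection(*line_sets)  # lines repeated on every page

    unique_lines = [
        line for line in pages[page_index].splitlines()
        if line not in boilerplate and line.strip()
    ]
    return " ".join(unique_lines)[:max_chars]

pages = [
    "Home | Products | About\nWidgets on sale this week.\nCopyright 2015",
    "Home | Products | About\nHow to install your widget.\nCopyright 2015",
]
print(build_snippet(pages, page_index=1))  # -> "How to install your widget."
```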

The third suggestion is to Implement Meta-Tag Filtering. Each suggestion is followed by step-by-step instructions. These handy tips come from a partnership between SiteCrafting, a web design company founded in 1995 by Brian Forth, and MaxxCAT, a company acknowledged for its achievements in high-performance search since 2007.

Chelsea Kerwin, September 2, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Forbes Bitten by Sci-Fi Bug

September 1, 2015

The Forbes article titled “Semantic Technology: Building the HAL 9000 Computer” runs with the gossip from this year’s Smart Data Conference: namely, that semantic technology has finally landed. The article examines several leaders in the field, including Maana, Loop AI Labs, and Blazegraph. The article mentions,

“Computers still can’t truly understand human language, but they can make sense out of certain aspects of textual content. For example, Lexalytics (www.lexalytics.com) is able to perform sentiment analysis, entity extraction, and ambiguity resolution. Sentiment analysis can determine whether some text – a tweet, say, expresses a positive or negative opinion, and how strong that opinion is. Entity extraction identifies what a paragraph is actually talking about, while ambiguity resolution solves problems like the Paris Hilton one above.”

(The “Paris Hilton problem” referred to is distinguishing between the hotel and the person in semantic search.) In spite of the excitable tone of the article’s title, its conclusion is much more measured. HAL, the sentient computer from 2001: A Space Odyssey, remains in our imaginations. In spite of the exciting work being done, the article reminds us that even Watson, IBM’s supercomputer, is still without the “curiosity or reasoning skills of any two-year-old human.” For the more paranoid among us, this might be good news.
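
For the curious, here is a toy lexicon-based sentiment scorer in the spirit of the capability the quote describes.  The word lists are invented for illustration; commercial engines such as Lexalytics rely on far richer models.

```python
# Toy sentiment scorer: count positive and negative words, return a score in [-1, 1].
POSITIVE = {"great", "love", "excellent", "good"}
NEGATIVE = {"terrible", "hate", "awful", "bad"}

def sentiment(text):
    """Positive scores lean positive, negative scores lean negative, 0.0 is neutral."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    hits = [1 for w in words if w in POSITIVE] + [-1 for w in words if w in NEGATIVE]
    return sum(hits) / len(hits) if hits else 0.0

print(sentiment("I love this excellent phone"))  # 1.0
print(sentiment("The battery is bad, awful"))    # -1.0
```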

Chelsea Kerwin, September 1, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
