July 25, 2016
The article titled Microsoft CaptionBot: AI Image Guessing App Really Isn’t Sure Who Barack Obama Is on International Business Times assesses Microsoft’s latest attempt at AI following the catastrophic Twitter robot Tay which quickly “learned” and repeated some pretty darn offensive ideas about Hitler and Obama. The newly released version named CaptionBot is more focused on image descriptions. The article states,
“Users are asked to upload any photo to the site, then Microsoft’s AI system attempts to describe what is in the image. The system can recognise celebrities and understands the basics of image composition but…, it isn’t yet perfect… You know when you recognise someone, but can’t quite put your finger on who it is? Caption Bot doesn’t do that, it just fails to even describe what a photo of Barack Obama is, never mind who he might be.”
From the examples, it is clear that while CaptionBot is much better at understanding and defining objects than people, objects often create difficulty as well. An image of a yellow vehicle from Cars was described (without confidence) as a white toilet next to a yellow building. To be sure, if you stare at the image long enough, the toilet shape emerges.
Chelsea Kerwin, July 25, 2016
There is a Louisville, Kentucky Hidden Web/DarkWeb meet up on July 26, 2016. Information is at this link: http://bit.ly/29tVKpx.2
June 22, 2016
The latest version of Savanna, the collaborative data-visualization platform from Thetus Corporation, has an important new feature—it can now link to external content. The press release at PR Newswire, “Savanna 4.7 Introduces Plugins, Opening ‘A World of New Content’ to Visual Analysis Software,” tells us:
“With Savanna, users can visualize data to document insights mined from complexity and analyze relationships. New in this release are Savanna Plugins. Plugins do more than allow users to import data. The game changer is in the ability to link to external content, leaving the data in its original source. Data lives in many places. Analyzing data from many sources often means full data transformation and migration into a new program. This process is daunting and exactly what Savanna 4.7 Plugins address. Whether on databases or on the web, users can search all of their sources from one application to enrich a living knowledge base. Plugins also enable Savanna to receive streams of information from sources like RSS, Twitter, geolocators, and others.”
Thetus’ CTO is excited about this release, calling the new feature “truly transformative.” The write-up notes that Plugins open new opportunities for Thetus to partner with other organizations. For example, the company is working with the natural language processing firm Basis Technology to boost translation and text mining capacities. Founded in 2003, Thetus is based in Portland, Oregon.
Cynthia Murrell, June 22, 2016
June 20, 2016
A recent report from social analytics firm Parse.ly examined the relationship between Twitter and digital publishers. Nieman Lab shares a few details in, “Twitter Has Outsized Influence, but It Doesn’t Drive Much Traffic for Most News Orgs, a New Report Says.” Parse.ly tapped into data from a couple hundred of its clients, a group that includes digital publishers like Business Insider, the Daily Beast, Slate, and Upworthy.
Naturally, news sites that make the most of Twitter do so by knowing what their audience wants and supplying it. The study found there are two main types of Twitter news posts, conversational and breaking, and each drives traffic in its own way. While conversations can engage thousands of users over a period of time, breaking news produces traffic spikes.
Neither of those findings is unexpected, but some may be surprised that Twitter feeds are not inspiring more visits to publishers’ sites. Writer Joseph Lichterman reports:
“Despite its conversational and breaking news value, Twitter remains a relatively small source of traffic for most publishers. According to Parse.ly, less than 5 percent of referrals in its network came from Twitter during January and February 2016. Twitter trails Facebook, Google, and even Yahoo as sources of traffic, the report said (though it does edge out Bing!)”
Still, publishers are unlikely to jettison their Twitter accounts anytime soon, because that platform offers a different sort of value, one that is, perhaps, more important for consumers. Lichterman quotes the report:
“Though Twitter may not be a huge overall source of traffic to news websites relative to Facebook and Google, it serves a unique place in the link economy. News really does ‘start’ on Twitter.”
And the earlier a news organization knows about a situation, the better. That is an advantage few publishers will want to relinquish.
Cynthia Murrell, June 20, 2016
April 7, 2016
Once more we turn to the Fuzzy Notepad’s advice and their Pokémon mascot, Eevee. This time we visited the fuzz pad for tips on Twitter. The 140-character social media platform has a slew of hidden features that do not have a button on the user interface. Check out “Twitter’s Missing Manual” to read more about these tricks.
It is inconceivable for every feature to have a shortcut on the user interface. Twitter relies on its users to understand basic features, while the experienced user will have picked up tricks that only come with experience or reading tips on the Internet. The problem is:
“The hard part is striking a balance. On one end of the spectrum you have tools like Notepad, where the only easter egg is that pressing F5 inserts the current time. On the other end you have tools like vim, which consist exclusively of easter eggs.
One of Twitter’s problems is that it’s tilted a little too far towards the vim end of the scale. It looks like a dead-simple service, but those humble 140 characters have been crammed full of features over the years, and the ways they interact aren’t always obvious. There are rules, and the rules generally make sense once you know them, but it’s also really easy to overlook them.”
Twitter is a great social media platform, but a headache to use because it never came with an owner’s manual. Fuzzy Notepad has lined up hints for every conceivable problem, including the elusive advanced search page.
March 28, 2016
The Internet is often described as the world’s biggest library containing all the world’s knowledge that someone dumped on the floor. The Internet is the world’s biggest information database as well as the world’s biggest data mess. Librarians were once the gateway to knowledge management, but they now need to ramp up their skills beyond the Dewey Decimal System and database searching. Librarians need to do more and Christian Lauersen’s personal blog explains how in, “Data Scientist Training For Librarians-Re-Skilling Libraries For The Future.”
DST4L is a boot camp for librarians and other information professionals to learn new skills and stay relevant. Lauersen describes last year’s event:
“DST4L has been held three times in The States and was to be set for the first time in Europe at Library of Technical University of Denmark just outside of Copenhagen. 40 participants from all across Europe were ready to get there hands dirty over three days marathon of relevant tools within data archiving, handling, sharing and analyzing. See the full program here and check the #DST4L hashtag at Twitter.”
Over the course of three days, the participants learned about OpenRefine, a spreadsheet-like application that can be used for data cleanup and transformation. They also learned about the benefits of GitHub and how to program using Python. These skills are well beyond the classes taught in library graduate programs, but it is a good sign that the profession is evolving even if the academic side lags behind.
March 11, 2016
Academics are no strangers to the shadowy corners of the Dark Web. In fact, as the article The Research Pirates of the Dark Web, published by The Atlantic, reports, one university student in Kazakhstan populated the Dark Web with free access to academic research after her website, Sci-Hub, was shut down in accordance with a legal case brought to court by the publisher Elsevier. Sci-Hub has existed under a few different domain names on the web since then, continuing its service of opening the floodgates to release paywalled papers for free. The article tells us,
“Soon, the service popped up again under a different domain. But even if the new domain gets shut down, too, Sci-Hub will still be accessible on the dark web, a part of the Internet often associated with drugs, weapons, and child porn. Like its seedy dark-web neighbors, the Sci-Hub site is accessible only through Tor, a network of computers that passes web requests through a randomized series of servers in order to preserve visitors’ anonymity.”
The open source philosophy continues to emerge in various sectors: technology, academia, and beyond. And while the Dark Web appears to be primed for open source proponents to prosper, it will be interesting to see what takes shape. As the article points out, other avenues exist; scholars may make public requests for paywalled research via Twitter using the hashtag #icanhazpdf.
Megan Feil, March 11, 2016
March 11, 2016
The article on Insight Data Engineering titled Building a Streaming Search Platform offers a glimpse into the Fellows Program wherein grad students and software engineers alike build data platforms and learn cutting-edge open source technologies. The article delves into the components of the platform, which enables close to real-time search of a streaming text data source, with Twitter as an example. It also explores the usefulness of such a platform,
“On average, Twitter users worldwide generate about 6,000 tweets per second. Obviously, there is much interest in extracting real-time signal from this rich but noisy stream of data. More generally, there are many open and interesting problems in using high-velocity streaming text sources to track real-time events. … Such a platform can have many applications far beyond monitoring Twitter… All code for the platform I describe here can be found on my github repository Straw.”
Ryan Walker, a Casetext Data Engineer, describes how these products might deliver major results in the hands of a skilled developer. He uses the example of a speech-to-text monitor that transcribes radio or TV feeds and sends the transcriptions to the platform. The platform would then seek key phrases and could even be set up to respond with real-time event management. Many industries will find this capability intriguing because they depend on real-time information processing, including finance and marketing.
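The idea of seeking key phrases in a stream of text can be illustrated with a small sketch. This is not the Straw code base or its design; it is a minimal in-memory example, assuming that subscribers register phrases ahead of time and that each incoming message is checked against all of them. The names `PhraseMonitor`, `register`, `match`, and `monitor_stream` are hypothetical and exist only for this illustration.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Set, Tuple


def normalize(text: str) -> str:
    """Lowercase the text and keep only word tokens, joined by single spaces."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


class PhraseMonitor:
    """Hypothetical registry of key phrases matched against incoming messages."""

    def __init__(self) -> None:
        # Maps each normalized phrase to the subscribers watching for it.
        self._subscribers: Dict[str, Set[str]] = defaultdict(set)

    def register(self, subscriber: str, phrase: str) -> None:
        key = normalize(phrase)
        if not key:
            raise ValueError("phrase must contain at least one word")
        self._subscribers[key].add(subscriber)

    def match(self, message: str) -> List[Tuple[str, str]]:
        """Return sorted (subscriber, phrase) pairs whose phrase occurs in message."""
        # Pad with spaces so a phrase only matches on whole-word boundaries.
        padded = f" {normalize(message)} "
        hits = [
            (subscriber, phrase)
            for phrase, subscribers in self._subscribers.items()
            if f" {phrase} " in padded
            for subscriber in subscribers
        ]
        return sorted(hits)


def monitor_stream(
    monitor: PhraseMonitor, stream: Iterable[str]
) -> Iterator[Tuple[str, str, str]]:
    """Yield (subscriber, phrase, message) for every match, one message at a time."""
    for message in stream:
        for subscriber, phrase in monitor.match(message):
            yield subscriber, phrase, message


if __name__ == "__main__":
    monitor = PhraseMonitor()
    monitor.register("finance-desk", "rate hike")
    monitor.register("marketing", "product launch")
    transcripts = [
        "Fed signals a Rate Hike in June.",
        "Weather stays mild through the weekend.",
        "Product launch delayed; rate hike blamed.",
    ]
    for alert in monitor_stream(monitor, transcripts):
        print(alert)
```

A linear scan over every registered phrase is fine for a handful of subscribers. A platform handling thousands of messages per second would need an indexed or distributed matcher, which is the harder problem the article addresses.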
Chelsea Kerwin, March 11, 2016
February 4, 2016
Despite attempts to improve Bing, it still remains the laughing stock of search engines. Google has run it over with its self-driving cars multiple times. DuckDuckGo tagged it as the “goose,” outran it, and forced Bing to sit in the proverbial pot. Facebook even has unfriended Bing. Microsoft has not given up on its search engine, so while there has been a list of novelty improvements (that Google already did or copied not long after their release) it has a ways to go.
Windows Central tells about the most recent Bing development: a bandwidth speed test in “Bing May Be Building A Speed Test Widget Within Search Results.” Now that might be a game changer for a day, until Google releases its own version. Usually to test bandwidth, you have to search for a Web site that provides the service. Bing might do it on command within every search results page. Not a bad idea, especially if you want to see how quickly your Internet runs, how long it takes to process your query, or if you are troubleshooting your Internet connection.
The bandwidth test widget is not available just yet:
“A reader of the site Kabir tweeted a few images displaying widget like speed test app within Bing both on the web and their phone (in this case an iPhone). We were unable to reproduce the results on our devices when typing ‘speed test’ into Bing. However, like many new features, this could be either rolling out or simply A/B testing by Microsoft.”
Keep your fingers crossed that Microsoft releases a useful and practical widget. If not, just go to Google and search for “bandwidth test.”
January 29, 2016
Propaganda from the Islamic State (Isis) exists not only in the Dark Web, but is also infiltrating the familiar internet. Wired discusses the best-case scenario for stopping such information from spreading in its article Google: ISIS must be ‘contained to the Dark Web’. Google describes ISIS existing only in the Dark Web as success. This information helps explain why,
“As Isis has become more prominent in Syria and Iraq, social media, alongside traditional offline methods, have been used to spread the group’s messages and recruit members. In 2014 analysis of the group’s online activity showed that they routinely hijack hashtags, use bots, and post gruesome videos to Twitter, Facebook, and YouTube. The UK’s internet counter terrorism unit claims to remove 1,000 illegal pieces of terrorism related content from the internet each week; it says that roughly 800 of these are to do with Syria and Iraq. The group claims in the 12 months before June 2012 that 39,000 internet takedowns were completed.”
The director of Google Ideas is quoted as describing ISIS’ tactics as ranging from communication to spamming to typical email scams; he explains they are not “tech-savvy.” Unfortunately, tech chops are not a requirement for effective marketing, so the question remains whether containing this group and its messages to the Dark Web is possible, and whether that would count as success given the growing numbers of people using the Dark Web.
Megan Feil, January 29, 2016
December 28, 2015
It used to be that if you wanted to be an enemy of western civilization you had to have ties to a derelict organization or even visit an enemy nation. It was difficult, especially with the limits of communication in pre-Internet days. Western Union and secret radio signals only went so far, but now, with the Internet, insurgent recruitment is just a few mouse clicks or even an app download away. The Telegraph reports that the “Islamic State Releases Its Own Smartphone App” to spread propaganda and pollute Islam’s true message.
Islamic State (Isil) released an Android app to disseminate the terrorist group’s radical propaganda. The app was brought to light by hacktivist Ghost Security Group, who uncovered directions to install the app on the encrypted message service Telegram. Ghost Security says that the app publishes propaganda from Amaq News Agency, the Islamic State’s propaganda channel, such as beheadings and warnings about terrorist attacks. It goes to show that despite limited resources, if one is tech savvy and has an Internet connection the possibilities are endless.
“ ‘They want to create a broadcast capability that is more secure than just leveraging Twitter and Facebook,’ Michael Smith of Kronos Advisory, a company that acts as a conduit between GhostSec and the US government, told CS Monitor. ‘[Isil] has always been looking for a way to provide easy access to all of the material.’ ”
Isil might have the ability to create propaganda and an app, but its reach is limited. In order to find this app, one has to dig within the Internet and find instructions. Hacktivist organizations like Ghost Security and Anonymous are using their technology skills to combat terrorist organizations with success. Most terrorist group propaganda will not be found within the first page of search results; one has to work to find it, but not that hard.