Torrent Anonymously with AlphaReign
December 12, 2016
Peer-to-peer file sharing gets a boost with AlphaReign, a new torrent sharing site that enables registered users to share files anonymously using a Distributed Hash Table (DHT).
TorrentFreak in an article titled Alphareign: DHT Search Engine Takes Public Torrents Private says:
AlphaReign.se is a new site that allows users to find torrents gathered from BitTorrent’s ‘trackerless’ Distributed Hash Table, or DHT for short. While we have seen DHT search engines before, this one requires an account to gain access.
The biggest issue with most torrent sites is the Digital Millennium Copyright Act (DMCA), under which rights holders can compel sites and search engines to remove infringing results from their result pages. Because the torrent indexes on AlphaReign are accessible only to registered users, seeders and leechers can share files with less risk of exposure.
Though most files shared through torrents are copyrighted materials like movies, music, software and books, torrents are also used by people who want to share large files without being spied upon.
AlphaReign also manages to address a persistent issue faced by torrent sites:
AlphaReign with new software allows users to search the DHT network on their own devices, with help from peers. Such a system would remain online, even if the website itself goes down.
In the past, popular torrent search engines like YTS, KickAssTorrents, The Pirate Bay, and Torrentz, among many others, have been shut down owing to pressure from law enforcement agencies. However, if AlphaReign manages to do what it claims, torrent users are going to be delighted.
Vishal Ingole, December 12, 2016
IDOL Is Back and with NLP
December 11, 2016
I must admit that I am confused. Hewlett Packard bought Autonomy, wrote off billions, and then seemed to sell the Autonomy software (IDOL and DRE) to an outfit in England. Oh, and HPE, the part of the Sillycon Valley icon which sells “enterprise” products and services, owns part of the UK outfit which now owns Autonomy. Got that? I am not sure I have the intricacies of this stunning series of management moves straight in my addled goose brain.
Close enough for horseshoes, however.
I read what looks like a content marketing flufferoo called “HPE Boosts IDOL Data Analytics Engine with Natural Language Processing Tools.”
I thought that IDOL had NLP functions, but obviously I am wildly off base. To get my view of the Autonomy IDOL system, check out the free analysis at this link. (Nota bene: I have done a tiny bit of work for Autonomy and wrestled with the system as a contractor on US government projects. I know that this is no substitute for the whizzy analysis included in the aforementioned write up. But, hey, it is what it is.)
The write up states in reasonably clear marketing lingo:
HPE has added natural language processing capabilities to its HPE IDOL data analytics engine, which could improve how humans interact with computers and data, the company announced Tuesday. By using machine learning technology, HPE IDOL will be able to improve the context around data insights, the company said.
A minor point: IDOL is based on machine learning processes. The guts of the black box are the Bayesian, Laplacian, and Markovian methods in the Digital Reasoning Engine, which underpins the Integrated Data Operating Layer of the Autonomy system.
Here’s the killer statement:
… the company [I presume this outfit is Hewlett Packard Enterprise and not MicroFocus] has introduced HPE Natural Language Question Answering to its IDOL platform to help solve the problem. According to the release, the technology seeks to determine the original intent of the question and then “provides an answer or initiates an action drawing from an organization’s own structured and unstructured data assets in addition to available public data sources to provide actionable, trusted answers and business critical responses.”
I love the actionable, trusted bit. Autonomy’s core approach is based on probabilities. Trust is okay, but it is helpful to understand that probabilities are — well — probable. The notion of “trusted answers” is a quaint one to those who drink deep from the statistical springs of data.
I highlighted this quotation, presumably from the wizards at HPE:
“IDOL Natural Language Question Answering is the industry’s first comprehensive approach to delivering enterprise class answers,” Sean Blanchflower, vice president of engineering for big data platforms at HPE, said in the release. “Designed to meet the demanding needs of data-driven enterprises, this new, language-independent capability can enhance applications with machine learning powered natural language exchange.”
My hunch is that HPE or MicroFocus or an elf has wrapped a query system around the IDOL technology. The write up does not provide too many juicy details about the plumbing. I did note these features, however:
- An IDOL Answer Bank. Ah, ha. Ask Jeeves style canned questions. There is no better way to extract information than the use of carefully crafted queries. None of the variable, real life stuff that system users throw at search and retrieval systems. My experience is that maintaining canned queries can become a bit tedious and also expensive.
- IDOL Fact Bank. Ah, ha. A query that processes content to present “factoids.” Much better than a laundry list of results. What happens when the source data return factoids which are [a] not current, [b] not accurate, or [c] without context? Hey, don’t worry about the details. Take your facts and decide, folks.
- IDOL Passage Extract. Ah, ha. A snippet or key words in context! Not exactly new, but a time proven way to provide some context to the factoid. Now wasn’t that an IDOL function in 2001? Guess not.
- IDOL Answer Server. Ah, ha. A Google style wrapper; that is, leave the plumbing alone and provide a modernized paint job.
If you match these breakthroughs against the diagrams in the HP IDOL write up, you will note that these capabilities already appear in the IDOL/DRE system diagram and feature list.
What’s important in this content marketing piece? The write up provides a takeaway section to help out those who are unaware of the history of IDOL, which dates from the late 1990s. Here you go. Revel in new features, enjoy NLP, and recognize that HPE is competing with IBM Watson.
There you go. Factual content in action. Isn’t modern technology analysis satisfying? IBM Watson, your play.
Stephen E Arnold, December 11, 2016
Alphabet Google: Cutbacks in Translation Will Not Change the Business Climate for Alphabet
December 10, 2016
The Alphabet Google thing is twitching its Godzilla like tail again. This time, the effect is more noticeable than the cancellation of Web Accelerator or Orkut. Googzilla has neutered its free online translation service. Of course, I assume the information in “Google Translate Now Has a 5,000 Character Limit” is accurate. Remember the good old days of November 2016, when Google announced an improvement in its translation service? Well, lucky are we that the entire service has not been killed off.
The write up I scanned reported:
there is a small counter at the bottom right corner of the text box that counts down the characters as you type them. This means that the maximum number of characters allowed per translation is set at 5,000.
How much text is 5,000 characters? That works out to about 2.15 pages of text in 12 point Times Roman on a default Word page.
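A quick back-of-the-envelope check of that figure. The characters-per-page constant below is an assumption on my part, roughly what fits on a default Word page in 12 point Times Roman, chosen to match the article's estimate:

```python
# Rough page estimate for Google Translate's new character limit.
# CHARS_PER_PAGE is an assumed capacity of one default Word page
# in 12 point Times Roman; it is not an official figure.
CHARS_PER_PAGE = 2325
TRANSLATE_LIMIT = 5000

pages = TRANSLATE_LIMIT / CHARS_PER_PAGE
print(round(pages, 2))  # → 2.15
```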
What does one do if more text has to be translated? Well, translate in segments, silly.
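Segmenting is easy enough to script. Here is a minimal sketch, assuming plain text and a preference for breaking after a sentence-ending period so no piece exceeds the 5,000 character cap:

```python
def chunk_text(text, limit=5000):
    """Split text into pieces of at most `limit` characters,
    preferring to break after a sentence-ending period."""
    chunks = []
    while len(text) > limit:
        # Find the last ". " sentence boundary that fits within the limit.
        cut = text.rfind(". ", 0, limit)
        if cut == -1:
            cut = limit       # no sentence boundary found; hard split
        else:
            cut += 1          # keep the period with its sentence
        chunks.append(text[:cut])
        text = text[cut:].lstrip()
    if text:
        chunks.append(text)
    return chunks
```

Each piece could then be pasted into Translate, or fed to whatever translation API one has on hand, one segment at a time.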
I have noticed that when feeding Google Translate complex pages in non Roman languages, the system was taking longer and returning truncated translations. Now the size limit is enforced, and a character counter appears below the text box.
I have a number of translation tools, so the loss of the Google service is no biggie here in Harrod’s Creek. I did give this shift a bit of thought. I have some ideas about why this change has been made.
First, the new financial discipline in effect at Google is going to curtail some services which are not used by large numbers of Google “customers.” I know that among the trillion plus searches run on Google each year, only a tiny fraction need online translations. It is, therefore, logical to allocate computational resources to other activities. The good old days of spending freely to support less popular services are over due to cost controls. Let’s face it. The shift to mobile, the crazy moon shots, and ill fated ventures like Google Fiber don’t produce big bucks. Logical. Go with ad spending, not feel good spending.
Second, the Alphabet Google tips its hand when it kills off products and services like the Google Search Appliance. There is no future in some services. That means that the hassle of charging people to use Translate is not going to generate sufficient money to offset the erosion in the core business of ad sales. The shift to mobile puts more pressure on the only money making business capable of supporting the Alphabet Google system. The cost of slipstreaming ads into Translate is not going to produce enough money to justify the investment. Who at Google wants to run an ad service delivering English to Ukrainian translations when a bonus depends on getting more Ukrainian pizza ads displayed alongside the translation results?
Third, the shift from search on a big desktop device to search on a tiny mobile device is a problem. Google may have more than two thirds of the queries in the US and more than 90 percent of the queries in the rest of the world. However, when only one or two ads can be shoehorned into the available screen real estate, that’s an earnings problem for the Google. Right now the revenue pump is chugging along, but the continued robustness of Amazon, Apple, Facebook, and other online vendors presages more pressure for the Google.
What’s the common thread running through my three “ideas” or “observations”? The answer is a threat to Google’s ad model, which is predicated on the desktop search model so generously influenced by GoTo.com/Overture.com/Yahoo. The only way to keep this dog from getting hit by speeding competitors is to change the kennel set up.
That’s what the shift in the GSA, the thrashing about YouTube, and the crippling of Google Translate signal. The joy ride of the free spending, we can do anything Google is ending. It had to happen. The founders are getting older. The legal hassles are going up. The silly little Facebook thing is now a very big thing chock full of Xooglers who know how to probe the soft parts of Googzilla.
I am probably wildly off base in my focus on Google and its revenue. Nevertheless, I am willing to go into a part of the undergrowth that some in Harrod’s Creek avoid. The undergrowth, in this case, faces an environmental threat. Google has to find a source of revenue to make up the delta between mobile and desktop revenue, tough to control infrastructure cost inflation, and an unfavorable new product flop to success ratio.
Times are changing. Googzilla is being affected by its environment. Environments can be tough to change. Monocultures can be vulnerable. Very vulnerable.
Stephen E Arnold, December 10, 2016
At Last an Academic Search, but How Much Does It Cost?
December 9, 2016
I love Google. You love Google. Everyone loves Google so much that it has become a verb in practically every language. Google does present problems, however, especially the inclusion of paid ads in search results, and Google searches are not academically credible. Researchers love Google’s ease of use, but no search engine yet exists that returns results answering a simple question based on a few keywords, NLP, and citations (those are extremely important).
It is possible that a search engine designed for academia could exist, especially if it can be subject specific and allow full-text access to all results. The biggest barrier in the way of a complete academic search engine is that scholarly research is protected by copyright, and most research sits behind paywalls belonging to academic publishers, like Elsevier.
Elsevier is a notorious academic publisher because, while its journals are prestigious, its digital subscriptions are expensive. The Mendeley Blog shares that Elsevier has answered the academic search engine cry: “Introducing Elsevier DataSearch.” Elsevier DataSearch promises to search through reputable information repositories and help researchers accelerate their work.
DataSearch is still in the infant stage and there is an open call for beta testers:
DataSearch offers a new and innovative approach. Most search engines don’t actively involve their users in making them better; we invite you, the user, to join our User Panel and advise how we can improve the results. We are looking for users in a variety of fields, no technical expertise is required (though welcomed). In order to join us, visit https://datasearch.elsevier.com and click on the button marked ‘Join Our User Panel’.
This is the right step forward for any academic publisher! There is one thing I am worried about and that is: how much is the DataSearch engine going to cost users? I respect copyright and the need to make a profit, but I wish there was one all-encompassing academic database that was free or had a low-cost subscription plan.
Whitney Grace, December 9, 2016
GE Now Manufactures Artificial Intelligence
December 9, 2016
GE (General Electric) makes appliances, such as ovens, ranges, microwaves, washers, dryers, and refrigerators. Outside the appliance market, its expertise appears to end. Fast Company tells us that GE wants to branch out into new markets in the story “GE Wants to Be the Next Artificial Intelligence Powerhouse.”
GE is a multi-billion-dollar company with the resources to invest in the burgeoning artificial intelligence market. It plans to employ two new acquisitions to bring machine learning to the markets it already dominates. GE first used machine learning in 2015 with Predix Cloud, which recorded industrial machinery sensor patterns. It was, however, more of a custom design for GE than one with universal application.
GE purchased Bit Stew Systems, a company similar to Predix Cloud except that it collected industrial data, and Wise.io, a company that used machine learning techniques developed for astronomy to streamline customer support systems. Predix already has a string of customers and has seen much growth:
Though young, Predix is growing fast, with 270 partner companies using the platform, according to GE, which expects revenue on software and services to grow over 25% this year, to more than $7 billion. Ruh calls Predix a “significant part” of that extra money. And he’s ready to brag, taking a jab at IBM Watson for being a “general-purpose” machine-learning provider without the deep knowledge of the industries it serves. “We have domain algorithms, on machine learning, that’ll know what a power plant is and all the depth of that, that a general-purpose machine learning will never really understand,” he says.
GE is tackling healthcare and energy issues with Predix. GE is proving it can do more than make a device that can heat up a waffle. The company can affect the energy, metal, plastic, and computer systems used to heat the waffle. It is a bit like a maker of mason jars going on to build tools used in space.
Whitney Grace, December 9, 2016
Digital Reasoning Releases Synthesys Version 4
December 9, 2016
Digital Reasoning has released the latest iteration of its Synthesys platform, we learn from Datanami’s piece, “Cognitive Platform Sharpens Focus on Unstructured Data.” Readers may recall that Digital Reasoning provides tools to the controversial US Army intelligence system known as DCGS. The write-up specifies:
Version 4 of the Digital Reasoning platform released on Tuesday (June 21) is based on proprietary analytics tools that apply deep learning neural network techniques across text, audio and images. Synthesys 4 also incorporates behavioral analytics based on anomaly detection techniques.
The upgrade also reflects the company’s push into user and ‘entity’ behavior analytics, a technique used to leverage machine learning in security applications such as tracking suspicious activity on enterprise networks and detecting ransomware attacks. ‘We are especially excited to expand into the area of entity behavior analytics, combining the analysis of structured and unstructured data into a person-centric, prioritized profile that can be used to predict employees at risk for insider threats,’ Bill DiPietro, Digital Reasoning’s vice president of product management noted in a statement.
The platform has added Spanish and Chinese to its supported languages, which come with syntactic parsing. There is also now support for Elasticsearch, included in the pursuit of leveraging unstructured data in real time. The company emphasizes the software’s ability to learn from context, as well as enhanced tools for working with reports.
Digital Reasoning was founded in 2000, and makes its primary home in Nashville, Tennessee, with offices in Washington, DC, and London. The booming company is also hiring, especially in the Nashville area.
Cynthia Murrell, December 9, 2016
Yahoo Freedom of Information Case Punted Back to Lower German Courts
December 9, 2016
The article on DW titled Germany’s Highest Court Rejects Yahoo Content Payment Case reports that Yahoo’s fight against paying publishers for publishing their content has been sent back to the lower courts. Yahoo claims that the new copyright laws limit access to information. The article explains,
The court, in the western city of Karlsruhe, said on Wednesday that Yahoo hadn’t exhausted its legal possibilities in lower courts and should turn to them first. The decision suggests Yahoo could now take its case to the civil law courts. The judges didn’t rule on the issue itself, which also affects rival search engine companies…. Germany revised its copyright laws in August 2013 allowing media companies to request payment from search engines that use more than snippets of their content.
The article points out that the new law fails to define “snippet.” Does it mean a few sentences or a few paragraphs? The article doesn’t go into much detail on how this major oversight was possible. The outcome of the case will certainly affect Google as well as Yahoo. Since Yahoo’s summer sale of its principal online assets to Verizon, a new direction has emerged: Verizon aims to forge a Yahoo brand that can compete in online advertising with the likes of Google and Facebook.
Chelsea Kerwin, December 9, 2016
Zo Tay! Piz Daint. Microsoft Talks Quantum But Goes Cray
December 8, 2016
There’s a new Microsoft chatbot coming. Microsoft wants to deploy smarter, less racist chatbots, I assume. To achieve that goal, Microsoft talks quantum computing and qubits (not Quberts). However, when it comes to crunching data, Microsoft is embracing the ever popular and somewhat iconic Cray outfit.
Navigate to “Microsoft Goes Cray for Deep Learning on Supercomputers.” The write up informed me that:
The deep learning process could be about to change dramatically thanks to work being carried out [by] Cray, Microsoft and the Swiss National Supercomputing Centre. In existing architectures and conventional systems, deep learning requires a slow training process that can take months, something that can lead to significantly higher costs and delays in making scientific discoveries. Cray believes that its work with Microsoft and CSSC could have solved this problem by applying supercomputing architectures to accelerate the training process.
The names of my servers are derived from dogs owned by my friends. Yes, there was an Oreo and a Biscuit.
But what is the name of the pricey, complex, and semi fast Cray supercomputer?
Give up? Here’s a clue: “A prominent peak in Grisons that overlooks the Fuorn pass.”
Need more time?
Give up?
Okay.
The answer is…
Piz Daint
There you go. Tay, Zo, and Piz Daint. Outstanding.
The write up told me:
According to the supercomputer manufacturer, deep learning problems share algorithmic similarities with applications that are traditionally run on a massively parallel supercomputer. So by optimizing inter-node communication using the Cray XC Aries network and a high performance MPI library, each training job is said to be able to leverage more compute resources and therefore reduce the amount of time required to train them.
I too believe everything computer assemblers tell me. I recall an online demonstration system which boasted fancy Dan machines from Sun Microsystems. The high powered, expensive hardware could support four—yep, four—simultaneous users. Another example is a system designed to search video news which boasted a five minute response time. Flashy hardware. Software seemed to be the problem. And Microsoft rarely distributes software which does not work as advertised. I wish I knew how to get that Word numbering system to work. Oh, well.
Keep in mind that Cray is providing some Microsoft hardware with its machines. Plus, Cray is based in Seattle. Microsoft’s and Cray’s partner in the test is the Swiss National Supercomputing Centre (CSCS). I like Switzerland, and I assume there will be some meetings there. The Swiss also enjoy US holiday shopping. I assume there will be or have been some visits by Swiss wizards to Seattle. I am not sure how many meetings will be scheduled in Chippewa Falls, Wisconsin, however. I thought Cray was owned by Tera Computer. I did a quick check on the financial health of the Cray outfit. I concluded that the tie up will definitely be a plus for the Cray folks. By the way, Cray was founded in 1972.
Stephen E Arnold, December 8, 2016
Oh, Canada: Censorship Means If It Is Not Indexed, Information Does Not Exist
December 8, 2016
I read “Activists Back Google’s Appeal against Canadian Order to Censor Search Results.” The write up appears in a “real” journalistic endeavor, a newspaper in fact. (Note that newspapers are facing an ad revenue Armageddon if the information in “By 2020 More Money Will Be Spent on Online Ads Than on Radio or Newspapers” is accurate.)
The point of the “real” journalistic endeavor’s write up is to point out that censorship could get a bit of a turbo boost. I highlighted this passage:
In an appeal heard on Tuesday [December 6, 2016] in the supreme court of Canada, Google Inc took aim at a 2015 court decision that sought to censor search results beyond Canada’s borders.
If the appeal goes south, a government could instruct the Google and presumably any other indexing outfit to delete pointers to content. If information cannot be found online, it may cease to exist for one of the search savvy wizards holding a mobile phone or struggling to locate a US government document.
The “real” journalistic endeavor offers:
A court order to remove worldwide search results could threaten free expression if it catches on globally – where it would then be subject to wildly divergent standards on freedom of speech.
It is apparently okay for a “real” journalistic endeavor to prevent information from appearing in its information flows as long as the newspaper is doing the deciding. But when a third party like a mere government makes the decision, the omission is a very bad thing.
I don’t have a dog in this fight because I live in rural Kentucky, am an actual addled goose (honk!), and find that so many folks are now realizing the implications of indexing digital content. Let’s see. Online Web indexes have been around and free for 20, maybe 30 years.
There is nothing like the howls of an animal caught in a trap. The animal wandered into or was lured into the trap. Let’s howl.
Stephen E Arnold, December 8, 2016
The One Percent Have Privately Disappeared
December 8, 2016
People like to think that their lives are not always monitored, especially inside their domiciles. However, if you have installed any type of security camera, especially a baby monitor, the bad news is that it is easily hacked. Malware can also be downloaded onto a computer to spy on you through the built-in camera. Mark Zuckerberg covers his laptop’s camera with a piece of electrical tape. With all these conveniences for spying on the average individual, it is not surprising that the rich one percent are literally buying their privacy by disappearing. FT.com takes a look at “How The Super-Rich Are Making Their Homes ‘Invisible.’”
The article opens by describing how an entire high-end California neighborhood exists, yet is digitally “invisible” on Google Street View. Celebrities live in this affluent neighborhood, and its management company does not even give interviews. Privacy is one of the greatest luxuries money can buy in this age, and demand will grow as mobile Internet usage increases. The number of cameras grows in proportion to Internet usage.
People who buy privacy by hiding their homes want to avoid prying eyes, such as paparazzi, and to protect themselves from burglars. The same people are also discreet about their wealth. They do not flaunt it, unlike in previous eras. In the business sector, more and more clients want to remain anonymous, so corporations are creating shell businesses to protect their identities.
There is an entire market for home designs that hide the actual building from prying eyes. The ultimate way to disappear, however, is to live off the grid:
For extra stealth, property owners can take their homes off the grid — generating their own electricity and water supply avoids tell-tale pipes and wires heading on to their land. Self-sufficient communities have become increasingly popular for privacy, as well as ecological, reasons; some estimates suggest that 180,000 households are living off the grid in the US alone.
Those people who live off the grid will also survive during a zombie apocalypse, but I digress.
It is understandable that celebrities and others in the public eye require more privacy than the average citizen, but we all deserve the same privacy rights. But it brings up another question: information needs to be found in order to be used. Why should some be able to disappear while others cannot?
Whitney Grace, December 8, 2016