February 12, 2016
The Dark Web is an intriguing and mysterious phenomenon, but rumors about what can be found there are exaggerated. Infomania examines what is and what is not readily available in that murky realm in, “Murder-for-Hire on the Dark Web? It Can’t Be True!”
Anonymity is the key factor in whether certain types of criminals hang out their shingles on the TOR network. Crimes that can be more easily committed without risking identification include drug trafficking, fraud, and information leaks. On the other hand, contract assassins, torture-as-entertainment, and human trafficking are not actually to be found, despite reports to the contrary. See the article for details on each of these, and more. The article cites independent researcher Chris Monteiro as it summarizes:
The dark web is rife with cyber crime. But it’s more rampant with sensationalized myths about assassination and torture schemes — which, as Chris can attest, simply aren’t true. “What’s interesting is so much of the coverage of these scam sites is taken at face value. Like, ‘There is a website. Therefore its contents must be true.’ Even when mainstream media picks it up, very few pick it up skeptically,” he says.
Take the Assassination Market, for example. When news outlets got wind of its alleged existence in 2013, they ran with the idea of “Murder-for-hire!!” on the Internet underground. Although Chris has finally demonstrated that these sites are not real, their legend lives on in Internet folklore. “Talking about the facts — this is how cybercrime works, this is how Tor and Bitcoin work — is a lot less sexy than saying, ‘If you click on the wrong link, you’ll be kidnapped, and you’ll end up in a room where you’ll be livestreamed, murdered, and you’re all over the internet!’” Chris says. “All I can do is point out what’s proven and what isn’t.”
So, next time someone spins a scary tale about killers-for-hire who are easily found online, you can point them to this article. Yes, drug trafficking, stolen data, and other infractions are big problems associated with the Dark Web, but let us not jump at shadows.
Cynthia Murrell, February 12, 2016
February 11, 2016
“On the TOR network you can find various websites just like you find on the ‘normal web.’ The websites which are hosted on the TOR network are not indexed by search engines like Google, Bing and Yahoo, but the search engines which are listed below, do index the TOR websites which are hosted via the TOR network. It is important to remember that you do need the TOR client on your device in order to access the TOR network, if you cannot use a TOR client on your device, you can use one of the free TOR gateways which are listed below in the web TOR providers tab.”
The article warns about malicious TOR clients and strongly suggests readers download the client found at the official TOR website. Four search engines are listed: https://Ahmia.fi, https://Onion.cab, https://onion.link/, and http://thehiddenwiki.org/. CWZ also lists Web TOR gateways, through which one can connect to TOR services with a standard Web browser instead of a TOR client. See the end of the article for that information.
Cynthia Murrell, February 11, 2016
February 11, 2016
The article on e27 titled “5 Asian Artificial Intelligence Startups that Caught Our Eye” lists several exciting new companies working to unleash AI technology, often for quotidian tasks. For example, Arya.ai provides for speedier and more productive decision-making, while Mad Street Den and Niki.ai offer AI shopping support! The article goes into detail about the latter,
“Niki understands human language in the context of products and services that a consumer would like to purchase, guides her along with recommendations to find the right service and completes the purchase with in-chat payment. It performs end-to-end transactions on recharge, cab booking and bill payments at present, but Niki plans to add more services including bus booking, food ordering, movie ticketing, among others.”
Mad Street Den, on the other hand, is more focused on object recognition. Users input an image and the AI platform seeks matches on e-commerce sites. Marketers will be excited to hear about Appier, a Taiwan-based business offering cross-screen insights; in layman’s terms, it can link separate devices belonging to one person and estimate how users switch devices during the day and what each device will be used for. These capabilities allow marketers to make targeted ads for each device and gain a better understanding of who will see what, and via which device.
Chelsea Kerwin, February 11, 2016
February 10, 2016
Big data was a popular buzzword a few years ago, making it seem like a brand new innovation. The eDiscovery process, however, has been around for several decades; recent technology advancements have allowed it to take off and be implemented in more industrial fields. While many big data startups have sprung up, ZyLab, a leading innovator in eDiscovery and information governance, began its big data venture in 1983. ZyLab created a timeline detailing its history called, “ZyLab’s Timeline Of Technical Ingenuity.”
Even though ZyLab was founded in 1983 and introduced the ZyIndex, its big data products did not really take off until the 1990s, when personal computers became an indispensable industry tool. In 1995, ZyLab made history when its software was used in the O.J. Simpson and Unabomber investigations. Three years later it introduced text search in images, which is now a standard feature for search engines.
Things really began to take off for ZyLab in the 2000s, as technology advanced to the point where it became easier for companies to create and store data, generating masses of unstructured data in the process. Advanced text analytics were added in 2005, and ZyLab made history again by becoming the standard for United Nations war crimes tribunals.
During 2008 and later years, ZyLab’s milestones were more technological: the ZyImage SharePoint connector and Google Web search engine integration, the introduction of the ZyLab Information Management Platform, the first integrated machine translation offering in eDiscovery, the addition of audio search, and the incorporation of true native visual search and categorization.
ZyLab continues to make historical as well as market innovations for eDiscovery and big data.
February 5, 2016
Elasticsearch is one of the most popular open source search applications, deployed for personal as well as corporate use. Built on another popular open source project, Apache Lucene, Elasticsearch was designed for horizontal scalability, reliability, and ease of use. It has become so pervasive that people do not realize just how often they rely on it. eWeek takes the opportunity to discuss the search application’s uses in “9 Ways Elasticsearch Helps Us, From Dawn To Dusk.”
“With more than 45 million downloads since 2012, the Elastic Stack, which includes Elasticsearch and other popular open-source tools like Logstash (data collection), Kibana (data visualization) and Beats (data shippers) makes it easy for developers to make massive amounts of structured, unstructured and time-series data available in real-time for search, logging, analytics and other use cases.”
How is Elasticsearch being used? The Guardian’s readers interact with its content through Elasticsearch daily, Microsoft Dynamics ERP and CRM use it to index and analyze social feeds, it powers Yelp, and, here is a big one, Wikimedia uses it to power the well-loved and well-used Wikipedia. We can already see how much of an impact Elasticsearch makes on our daily lives without us being aware. Other companies that use Elasticsearch for our and their benefit include HotelTonight, Dell, Groupon, Quizlet, and Netflix.
Elasticsearch will continue to grow as an inexpensive alternative to proprietary software, and the number of Web services and companies that use it will only increase.
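For readers who have never touched Elasticsearch, here is a minimal sketch of the kind of full-text “match” query those companies run against it. The query body follows Elasticsearch’s JSON query DSL; the field name, index name, and endpoint in the comments are hypothetical examples of ours, not drawn from the eWeek article.

```python
import json

def build_search(field, text, size=10):
    """Build the JSON body for a basic Elasticsearch full-text 'match'
    query. The field name passed in is a hypothetical example."""
    return {"size": size, "query": {"match": {field: text}}}

# One would POST this body to an endpoint such as
# http://localhost:9200/articles/_search (index name assumed).
body = build_search("title", "open source search")
print(json.dumps(body, indent=2))
```

Part of Elasticsearch’s appeal is visible even in this toy: the entire search interface is plain JSON over HTTP, so any language with an HTTP client can use it.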
February 4, 2016
I read an article that I dismissed. The title nagged at my ageing mind and dwindling intellect. “This is Why Dictators Love Big Data” did not ring my search, content processing, or Dark Web chimes.
Nagged by my inner voice, I returned to the story, annoyed with the “This Is Why” phrase in the headline.
Predictive analytics are not new. The packaging is better.
I think this is the main point of the write up, but I am never sure with online articles. The articles can be ads or sponsored content. The authors could be looking for another job. The doubts about information today plague me.
The circled passage is:
Governments and government agencies can easily use the information every one of us makes public every day for social engineering — and even the cleverest among us is not totally immune. Do you like cycling? Have children? A certain breed of dog? Volunteer for a particular cause? This information is public, and could be used to manipulate you into giving away more sensitive information.
The only hitch in the git along is that this is not just old news. The systems and methods for making decisions based on the munching of math in numerical recipes have been around for a while. Autonomy? A pioneer in the 1990s. Nope. Not even the super secret use of Bayesian, Markov, and related methods during World War II reaches back far enough. Nudge the ball hundreds of years farther back on the timeline. Not new, in my opinion.
I also noted this comment:
In China, the government is rolling out a social credit score that aggregates not only a citizen’s financial worthiness, but also how patriotic he or she is, what they post on social media, and who they socialize with. If your “social credit” drops below a certain level because you post anti-government messages online or because you’re socially associated with other dissidents, you could be denied credit approval, financial opportunities, job promotions, and more.
Just China? I fear not, gentle reader. Once again the “real” journalists are taking an approach which does not do justice to the wide diffusion of certain mathy applications.
Net net: I should have skipped this write up. My initial judgment was correct. Not only is the headline annoying to me, the information is par for the Big Data course.
Stephen E Arnold, February 4, 2016
February 2, 2016
The infographic on the IBM Big Data & Analytics Hub titled “Extracting Business Value From the 4 V’s of Big Data” quantifies Volume (scale of data), Velocity (speed of data), Veracity (certainty of data), and Variety (diversity of data). In a time when big data may have been largely demystified, IBM makes an argument for its current relevance and import, not to mention its mystique, with reminders of the tremendous amounts of data being created and consumed on a daily basis. Ultimately, the graphic is an ad for the IBM Analytics Technology Platform. The infographic also references a fifth “V”,
“Big data = the ability to achieve greater Value through insights from superior analytics. Case Study: A US-based aircraft engine manufacturer now uses analytics to predict engine events that lead to costly airline disruptions, with 97% accuracy. If this prediction capability had been available in the previous year, it would have saved $63 million.”
IBM struggles for revenue. But, judging from this infographic, IBM knows how to create Value with a capital “V”, if not revenue. The IBM Analytics Technology Platform promises speedier insights and actionable information from trustworthy sources. The infographic reminds us that poor quality in data leads to sad executives, and that data is growing exponentially, with 90% of all data forged in only the last two years.
Chelsea Kerwin, February 2, 2016
February 2, 2016
A friend recently told me how she can go months avoiding suspicious emails, spyware, and Web sites on her computer, but the moment she hands her laptop over to her father, he downloads a virus within an hour. Despite the technology gap between generations, the story goes to show how easy it is to deceive and steal information these days. ExpertClick thinks that metadata might hold the future of cyber security in “What Metadata And Data Analytics Mean For Data Security-And Beyond.”
The article uses a biological analogy to explain metadata’s importance: “One of my favorite analogies is that of data as proteins or molecules, coursing through the corporate body and sustaining its interrelated functions. This analogy has a special relevance to the topic of using metadata to detect data leakage and minimize information risk — but more about that in a minute.”
This plays into new companies like Ayasdi, which use data to reveal new correlations via methods different from the standard statistical ones. The article compares this to getting down to data’s atomic level, where data scientists will be able to separate data into its elements and increase the complexity of the analysis.
“The truly exciting news is that this concept is ripe for being developed to enable an even deeper type of data analytics. By taking the ‘Shape of Data’ concept and applying to a single character of data, and then capturing that shape as metadata, one could gain the ability to analyze data at an atomic level, revealing a new and unexplored frontier. Doing so could bring advanced predictive analytics to cyber security, data valuation, and counter- and anti-terrorism efforts — but I see this area of data analytics as having enormous implications in other areas as well.”
There are more devices connected to the Internet than ever before, and 2016 could be the year we see a significant rise in cyber attacks. New ways of interpreting data will leverage predictive and proactive analytics to fight security breaches.
February 1, 2016
The article on Fortune titled “Has Big Data Gone Mainstream?” asks whether big data is now an expected part of data analysis. The “merger,” as Deloitte advisor Tom Davenport puts it, makes big data an indistinguishable aspect of data crunching. Only a few years ago, it was a scary buzzword that executives scrambled to understand and few experts specialized in. The article shows what has changed lately,
“Now, however, universities offer specialized master’s degrees for advanced data analytics and companies are creating their own in-house programs to train talent in data science. The Deloitte report cites networking giant Cisco as an example of a company that created an internal data science training program that over 200 employees have gone through. Because of media reports, consulting services, and analysts talking up “big data,” people now generally understand what big data means…”
Davenport sums up the trend nicely with the statement that people are tired of reading about big data and ready to “do it.” So what will replace big data as the current mysterious buzzword that irks laypeople and the C-suite simultaneously? The article suggests “cognitive computing” or computer systems using artificial intelligence for speech recognition, object identification, and machine learning. Buzz, buzz!
Chelsea Kerwin, February 1, 2016
February 1, 2016
Computer programmers who specialize in machine learning, artificial intelligence, data mining, data visualization, and statistics are smart individuals, but even they sometimes get stumped. Using the same form of communication as Reddit and old-fashioned forums, Cross Validated is a question-and-answer site run by Stack Exchange. People can post questions on data and related topics and then wait for a response. One user posted a question about “Machine Learning Classifiers”:
“I have been trying to find a good summary for the usage of popular classifiers, kind of like rules of thumb for when to use which classifier. For example, if there are lots of features, if there are millions of samples, if there are streaming samples coming in, etc., which classifier would be better suited in which scenarios?”
The response the user received was that the question was too broad. Which classifier performs best depends on the data and the process that generates it. It is kind of like asking for the best way to organize books or your taxes: it depends on the content of the items in question.
Another user replied that there was an easy way to explain the general process of choosing a classifier, and pointed to scikit-learn’s “Choosing the right estimator” chart. Other users say the chart is incomplete, because it does not include deep learning, decision trees, and logistic regression.
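To give a flavor of what that chart encodes, here is a drastically simplified sketch of its first few branches as a toy Python function. The thresholds and estimator names echo the scikit-learn flowchart, but this function is our own illustration, not part of scikit-learn, and it omits most of the chart (regression, clustering, and the fallbacks the critics mention).

```python
def suggest_classifier(n_samples, text_data=False):
    """Toy rule-of-thumb chooser sketching the first few branches of a
    classifier-selection flowchart. Illustrative only; real selection
    should be validated empirically on the data at hand."""
    if n_samples < 50:
        return "get more data"          # the chart's starting question
    if n_samples < 100_000:
        # smaller labeled data sets: linear SVM, or Naive Bayes for text
        return "Naive Bayes" if text_data else "LinearSVC"
    return "SGDClassifier"              # very large data sets favor SGD
```

Even this toy shows why the critics have a point: any finite decision tree over two inputs will leave out whole families of methods.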
We say create some other diagrams and share those. Classifiers are complex, but they are a necessity in the artificial intelligence and big data craze.