Australian Software Developer Revealed the Panama Papers

May 23, 2016

The Panama Papers have unleashed a slew of scandals whose ripples we will be dealing with for years to come.  The leak also marks another notch for the power of software and another reminder that nothing is private anymore.  But how were the Panama Papers analyzed?  Reuters reports that a “Small Australian Software Firm Helps Join The Dots On The Panama Papers”.

Nuix Pty Ltd. is a Sydney-based software development company that donated its document analysis program to the International Consortium of Investigative Journalists (ICIJ) to delve through the data from Mossack Fonseca, the Panamanian law firm at the center of the leak.  Reporters searched through the data for some time and discovered within the 2.6 terabytes the names of politicians and public figures with questionable offshore financial accounts.

“By using the software, the Washington-based ICIJ was able to make millions of scanned documents, some decades old, text-searchable and help its network of journalists cross reference Mossack Fonseca’s clients across these documents.  The massive leak has prompted global investigations into suspected illegal activities by the world’s wealthy and powerful. Mossack Fonseca, the firm at the center of the leaks, denies any wrongdoing.  The use of advanced document and data analysis technology shows the growing importance of technology’s role in helping journalists make better sense of increasingly bigger news discoveries.”

Nuix Pty is a ten-year-old company, and its products have been used for data analysis in investigations of child pornography rings, people trafficking, and high-end tax evasion.  Another selling point for the company is its dedication to its clients’ privacy.  Nuix did not allow itself access to the information within the Panama Papers.  That is an interesting fact, considering how some tech companies need total access to their clients’ information.

Nuix sounds like the Swiss bank of software companies, promising high-quality services and products that deliver results, plus undeniable privacy.

 

Whitney Grace, May 23, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Search Sink Hole Identified and Allegedly Paved and Converted to a Data Convenience Store

May 20, 2016

I try to avoid reading more than one write up a day about alleged revolutions in content processing and information analytics. My addled goose brain cannot cope with the endlessly recycled algorithms dressed up in Project Runway finery.

I read “Ryft: Bringing High Performance Analytics to Every Enterprise,” and I was pleased to see a couple of statements which resonated with my dim view of information access systems. There is an accompanying video in the write up. I, as you may know, gentle reader, am not into video. I prefer reading, which is the old fashioned way to suck up useful factoids.

Here’s the first passage I highlighted:

Any search tool can match an exact query to structured data—but only after all of the data is indexed. What happens when there are variations? What if the data is unstructured and there’s no time for indexing? [Emphasis added]

The answer to the question is increasing costs for sales and marketing. The early warnings of amped up baloney are the presentations given at conferences and pumped out via public relations firms. (No, Buffy, no, Trent, I am not interested in speaking with the visionary CEO who hired you.)

I also highlighted:

With the power to complete fuzzy search 600X faster at scale, Ryft has opened up tremendous new possibilities for data-driven advances in every industry.

I circled the 600X. Gentle reader, I struggle to comprehend a 600X increase in content processing. Dear Mother Google has invested to create a new chip to get around the limitations of our friend Von Neumann’s approach to executing instructions. I am not sure Mother Google has this nailed because Mother Google, like IBM, announces innovations without too much real world demonstration of the nifty “new” things.

I noted this statement too:

For the first time, you can conduct the most accurate fuzzy search and matching at the same speed as exact search without spending days or weeks indexing data.

Okay, this strikes me as a capability I would embrace if I could get over or around my skepticism. I was able to take a look at the “solution” which delivers the astounding performance and information access capability. Here’s an image from Ryft’s engineering professionals:

image

Notice that we have Spark and pre built components. I assume there are myriad other innovations at work.
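For readers who want the exact versus fuzzy distinction from the quoted passages made concrete, here is a minimal sketch of a fuzzy scan over raw text with no index. It is not Ryft's method, which the write up does not describe; the function names `edit_distance` and `fuzzy_scan` and the sample data are my own assumptions for illustration. The sketch compares every word against the query, which is why unindexed fuzzy matching is expensive on conventional hardware.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character from a
                            curr[j - 1] + 1,      # insert a character into a
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def fuzzy_scan(lines, query, max_edits=1):
    """Scan raw lines without an index; return (line_no, word) near matches.

    Hypothetical helper: every word in every line is compared to the query,
    so cost grows with the total amount of text scanned.
    """
    target = query.lower()
    hits = []
    for line_no, line in enumerate(lines, start=1):
        for word in line.lower().split():
            word = word.strip(".,;:!?\"'()")
            if edit_distance(word, target) <= max_edits:
                hits.append((line_no, word))
    return hits


# Made-up sample data: an exact match and a spelling variation.
sample = ["Jon Smith filed the report.", "John Smyth signed it."]
matches = fuzzy_scan(sample, "smith", max_edits=1)
```

An exact search would find only the first line; allowing one edit also picks up the variant spelling in the second. Whether any vendor can do this at exact-match speed is the claim under dispute, not something this sketch demonstrates.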

The hitch in the git along is that in order to deal with certain real world information processing challenges, the inputs come from disparate systems, each generating substantial data flows in real time.

Here’s an example of a real world information access and understanding challenge, which, as far as I know, has not been solved in a cost effective, reliable, or usable manner.

image

Image source: Plugfest 2016 Unclassified.

This unclassified illustration makes clear that the little things in the sky pump out lots of data into operational theaters. Each stream of data must be normalized and then converted to actionable intelligence.

The assertion about 600X sounds tempting, but my hunch is that the latency in normalizing, transferring, and processing will not meet the need for real time, actionable, accurate outputs when someone is shooting at a person with a hardened laptop in a threat environment.

In short, perhaps the spark will ignite a fire of performance. But I have my doubts. Hey, that’s why I spend my time in rural Kentucky where reasonable people shoot squirrels with high power surplus military equipment.

Stephen E Arnold, May 20, 2016

The Kardashians Rank Higher Than Yahoo

May 20, 2016

I avoid the Kardashians and other fame chasers, because I have better things to do with my time.  I never figured that I would actually write about the Kardashians, but the phrase “never say never” comes into play.  Vanity Fair’s “Marissa Mayer Vs. ‘Kim Kardashian’s Ass’: What Sunk Yahoo’s Media Ambitions?” tells a bleak story about the current happenings at Yahoo.

Yahoo has ended many of its services and let go fifteen percent of its staff, and there are very few journalists left on the team.  The remaining journalists are not worried about producing golden content; they have to compete with a lot that is already on the Web, especially “Kim Kardashian’s ass,” as they say.

When Marissa Mayer took over Yahoo as the CEO in 2012, she was determined to carve out Yahoo’s identity as a tech company.  Mayer, however, also wanted Yahoo to be a media powerhouse, so she hired many well-known journalists to run specific niche projects in popular areas from finance to beauty to politics.  It was not a successful move, and now Yahoo is tightening its belt one more time.  The Yahoo news algorithm did not mesh with the big name journalists.  The hope was that their names would soar above popular content such as Kim Kardashian’s ass.  They did not.

Much of Yahoo’s current worth comes from its stake in Alibaba.  Vanity Fair observes:

“But the irony is that Mayer, a self-professed geek from Silicon Valley, threw so much of her reputation behind high-profile media figures and went with her gut, just like a 1980s magazine editor—when even magazine editors, including those who don’t profess to “get” technology, have long abandoned that practice themselves, in favor of what the geeks in Silicon Valley are doing.”

Mayer was trying to create a premier media company, but lower quality content is more popular than the work of top-of-the-line journalists.  The masses prefer junk food in their news.

 

Whitney Grace, May 20, 2016

Signs of Life from Funnelback

May 19, 2016

Funnelback has been silent as of late, according to our research, but the search company has emerged from the tomb with eyes wide open and a heartbeat.  The Funnelback blog has shared some new updates with us.  The first bit of news, “Searchless In Seattle? (AKA We’ve Just Opened A New Office!),” explains that Funnelback opened a new office in Seattle, Washington.  The search company already has offices in Poland, the United Kingdom, and New Zealand, but now it wants to establish a branch in the United States.  Given its successful track record with the finance, higher education, and government sectors in the other countries, it stands a chance to offer more competition in the US.  Seattle also has a reputable technology center, and Funnelback will not have to deal with the Silicon Valley group.

The second piece of Funnelback news deals with “Driving Channel Shift With Site Search.”  Channel shift is the process of creating the most efficient and cost effective way to deliver information access and usage to users.  It can be difficult to implement a channel shift, but increasing the effectiveness of a Web site’s search can have a huge impact.

Being able to quickly and effectively locate information on a Web site not only saves users time, it can also drive sales, further a reputation, etc.

“You can go further still, using your search solution to provide targeted experiences; outputting results on maps, searching by postcode, allowing for short-listing and comparison baskets and even dynamically serving content related to what you know of a visitor, up-weighting content that is most relevant to them based on their browsing history or registered account.

Couple any of the features above with some intelligent search analytics, that highlight the content your users are finding and importantly what they aren’t finding (allowing you to make the relevant connections through promoted results, metadata tweaking or synonyms), and your online experience is starting to become a lot more appealing to users than that queue on hold at your call centre.”

I have written about it many times, but a decent Web site search function can make or break a site.  A poor one not only suggests that the Web site is unprofessional, it also fails to inspire confidence in the business.  It is a very big rookie mistake to make.

 

Whitney Grace, May 19, 2016

Travel to South Africa Virtually with Google’s Mzansi Experience

May 18, 2016

The article on Elle titled Google SA Launches the Mzansi Experience On Maps illustrates the new Google Street View collection for South Africa. For people without the ability to travel, or scared of malaria or Oscar Pistorius, this collection offers an in-depth platform to view some of South Africa’s natural wonders and parks. The article explains,

“Using images collected by the Street View Tripod and Trekker, Google has created 360-degree imagery of some of South Africa’s most beautiful locations, and created virtual tours that enable visitors to see the sights for themselves on their phones, tablets or computers. Visitors will be able to, for the first time, visit a family of elephants in the Kruger National Park, take a virtual walk on Table Mountain, admire Cape Point, or take a walk along Durban’s Golden Mile.”

For South Africa, this initiative might spark increased tourism once people realize just how much the country has to offer. So many of the images of Africa that we are exposed to in the US are reductive and patronizing, like those ceaseless commercials depicting all of Africa as a small, poverty-stricken village. Google’s new collection helps to promote a more diverse and appealing look at one African country: South Africa. Whether you want to go in person or virtually, this is worth checking out!

Chelsea Kerwin, May 18, 2016

 

Tech Savvy Users Turn to DuckDuckGo

May 18, 2016

A recent report from SimilarWeb tells us what sorts of people turn to Internet search engine DuckDuckGo, which protects users’ privacy, over a more prominent engine, Microsoft’s Bing. The Search Engine Journal summarizes the results in, “New Research Reveals Who is Using DuckDuckGo and Why.”

The study drew its conclusions by looking at the top five destinations of DuckDuckGo users: Whitehatsec.com, Github.com, NYtimes.com, 4chan.org, and YCombinator.com. Note that four of these five sites have pretty specific audiences, and compare them to the top five, more widely used, sites accessed through Bing: MSN.com, Amazon.com, Reddit.com, Google.com, and Baidu.com.

Writer Matt Southern observes:

“DuckDuckGo users also like to engage with their search engine of choice for longer periods of time — averaging 9.38 minutes spent on DuckDuckGo vs. Bing.

“Despite its growth over the past year, DuckDuckGo faces a considerable challenge when it comes to getting found by new users. Data shows the people using DuckDuckGo are those who already know about the search engine, with 93% of its traffic coming from direct visits. Only 1.5% of its traffic comes from organic search.

“Roy Hinkis of SimilarWeb concludes by saying the loyal users of DuckDuckGo are those who love tech, and they use DuckDuckGo as an alternative because they’re concerned about having their privacy protected while they search online.”

Though Southern agrees DuckDuckGo needs to do some targeted marketing, he notes traffic to the site has been rising by 22% per year.  It is telling that the privacy-protecting engine is most popular among those who understand the technology.

 

Cynthia Murrell, May 18, 2016

Enterprise Search: The Valiant Fight On

May 17, 2016

I read “VirtualWorks and Language Tools Announce Merger.” I ran across Language Tools several years ago. The company was working to create components for ElasticSearch’s burgeoning user base. The firm espoused natural language processing as a core technology. NLP is useful, but it imposes some computational burdens on some content processing functions. ElasticSearch works pretty well, and there are a number of companies optimizing, integrating, and creating widgets to make life with ElasticSearch better, faster, and presumably more impressive than the open source system is.

This news release highlights the fact that VirtualWorks and Language Tools have merged. The financial details are not explicit, and it appears that a company founded by a wizard from Citrix will make Language Tools the R&D hub for the Florida-based VirtualWorks’ operation.

According to the story:

The combined organization brings together best of breed core technologies in the areas of enterprise search, data management, text analytics, discovery techniques and analytics to enable the development of new and exciting next generation applications in the business intelligence space.

VirtualWorks is or was a SharePoint centric solution. Like other search vendors, the company uses connectors to suck data into a central indexing point. Users then search the content and have access to the content without having to query separate systems.

This idea has fueled enterprise search since the days of Verity, Autonomy, Fast Search, Convera, et al. The real money today seems to be in the consulting and engineering services required to make enterprise search useful.
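The connector-and-central-index pattern described above can be sketched in a few lines. This is a toy illustration under my own assumptions, not VirtualWorks' implementation: the connector names, document identifiers, and the `build_index` and `search` helpers are hypothetical, and a real system would add security trimming, incremental updates, and relevance ranking.

```python
from collections import defaultdict


def build_index(connectors):
    """Pull documents from every connector into one inverted index.

    `connectors` maps a source name to a zero-argument callable that
    returns (doc_id, text) pairs. Both are hypothetical stand-ins for
    real repository connectors.
    """
    index = defaultdict(set)
    for source, fetch in connectors.items():
        for doc_id, text in fetch():
            for term in text.lower().split():
                index[term].add((source, doc_id))
    return index


def search(index, query):
    """Return the (source, doc_id) pairs that contain every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


# Made-up content standing in for two separate repositories.
connectors = {
    "sharepoint": lambda: [("sp-1", "Quarterly budget report")],
    "fileshare": lambda: [("fs-7", "Budget memo draft")],
}
central_index = build_index(connectors)
hits = search(central_index, "budget")
```

One query against `central_index` returns hits from both repositories, which is the whole appeal: the user never has to query the separate systems. The hard and billable part is everything the sketch leaves out.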

SharePoint is certainly widely used, and it is fraught with interesting challenges. Will the lash up of these two firms generate the type of revenue once associated with Autonomy and Fast Search & Transfer?

My hunch is that enterprise search continues to be a tough market. There are functional solutions to locating information available as open source or at comparatively modest license fees. I am thinking of dtSearch and Maxxcat. Both of these work well within Microsoft centric environments.

Stephen E Arnold, May 17, 2016

Google Moonshot Targets Disease Management, but Might Face Obstacle with Google Management Methods

May 17, 2016

The article on STAT titled Google’s Bold Bid to Transform Medicine Hits Turbulence Under a Divisive CEO explores Google management methods for one of its “moonshot” projects. Namely, the massive company has directed its considerable resources toward overhauling medicine. Verily Life Sciences is the three-year-old startup with a mysterious mission and a controversial leader in Andrew Conrad. So far, roughly a dozen Verily players have abandoned the project.

“But “if they are getting off the roller coaster before it gets to the first dip,” something looks seriously wrong, said Rob Enderle, a technology analyst who has tracked Google since its inception. Those who depart well-financed startups usually forsake potential financial windfalls down the line, which further suggests that the people leaving Verily “are losing confidence in the leadership,” he said. No similar brain drain has occurred at Calico, another ambitious Google spinoff, which is focused on increasing the human lifespan.”

Given the scope of the Verily project, which Google co-founder Sergey Brin said he hoped would significantly change the way we identify, avoid, and handle illness, perhaps Conrad is cracking under the stress. He has maintained complete radio silence, and rumors abound that his employees operate under threat of termination for speaking to reporters.

Chelsea Kerwin, May 17, 2016

Extensive Cultural Resources Available at Europeana Collections

May 17, 2016

Check out this valuable cultural archive, highlighted by Open Culture in the piece, “Discover Europeana Collections, a Portal of 48 Million Free Artworks, Books, Videos, Artifacts & Sounds from across Europe.” Writer Josh Jones is clearly excited about the Internet’s ability to place information and artifacts at our fingertips, and he cites the Europeana Collections as the most extensive archive he’s discovered yet. He tells us the works are:

“… sourced from well over 100 institutions such as The European Library, Europhoto, the National Library of Finland, University College Dublin, Museo Galileo, and many, many more, including contributions from the public at large. Where does one begin?

“In such an enormous warehouse of cultural history, one could begin anywhere and in an instant come across something of interest, such as the stunning collection of Art Nouveau posters like that fine example at the top, ‘Cercle Artistique de Schaerbeek,’ by Henri Privat-Livemont (from the Plandiura Collection, courtesy of Museu Nacional d’Art de Catalunya, Barcelona). One might enter any one of the available interactive lessons and courses on the history of World War I or visit some of the many exhibits on the period, with letters, diaries, photographs, films, official documents, and war propaganda. One might stop by the virtual exhibit, ‘Photography on a Silver Plate,’ a fascinating history of the medium from 1839-1860, or ‘Recording and Playing Machines,’ a history of exactly what it sounds like, or a gallery of the work of Swiss painter Jean Antoine Linck. All of the artifacts have source and licensing information clearly indicated.”

Jones mentions the archive might be considered “endless,” since content is being added faster than anyone could hope to keep up with.  While such a wealth of information and images could easily overwhelm a visitor, he advises us to look at it as an opportunity for discovery. We concur.

 

Cynthia Murrell, May 17, 2016

Excite and Ask: Where Are They Now?

May 14, 2016

I learned a factoid from “Yahoo Stock: Analyzing 5 Key Suppliers.” Here’s the passage with the items I noted in bold face:

Excite Japan Co., Ltd. was established in 1997 as a joint venture with Excite, Inc., which is wholly owned by IAC/InterActiveCorp. At the time, Excite, Inc., which is known in 2016 as Ask.com, was among the largest and most popular Web portals offering personalized home pages for searching content. In 2015, Excite Japan generated 9.91% of its revenues from Yahoo through a revenue-sharing agreement for ad-clicks going through Yahoo’s search engine. In 2015, the company had revenue of $66.47 million in U.S. dollars and a market capitalization of $3.77 billion.

Interesting about Excite. About Yahoo? Not so much.

Stephen E Arnold, May 14, 2016
