Deusu or Deutsche Suchmaschine
September 20, 2016
An open source Web search system is available. You can locate Deusu at this link. We ran a number of test queries and learned that the index is less robust than Qwant’s and Yandex’s. But it is early days. The results of our queries were okay. A search for “enterprise search” returned a page about search engine optimization as the first hit. There were links to pundits, mavens, and Datastax. Like Unbubble and Gibiru, Deusu joins a number of outfits responding to the need for a non-US search engine. The hurdle will be the cost of building and updating a 25 billion page index. It is expensive, and we wish to point out that certain dominant Web search outfits are trimming their indexes in an effort to cut costs. Here are the results of our query on Deusu for “European Web search engine”:
Zero relevant hits in the first page of results.
Stephen E Arnold, September 20, 2016
Microsoft and Both Hewlett Packards Are Chums
September 20, 2016
I read “Microsoft Beats Out Rivals for HP Software Deal.” The write up does not answer the following questions:
- Did Microsoft or HP’s public relations advisers bring this story to Fortune Magazine?
- How much will HP save by using Microsoft’s sales management and database software instead of Oracle’s and Salesforce’s software?
- How much will the transition from the Oracle and Salesforce systems to the Microsoft system cost?
- Why couldn’t HP use its hardware with the Oracle and Salesforce systems?
- Why did HP choose a proprietary solution when there are satisfactory open source options available?
- Whose back was injured after the frenzy of scratching ended?
What the write up reveals is that Oracle and Salesforce lost a big customer. I also highlighted this passage:
This deal adds another dimension to HP-Microsoft partnerships. HP is a huge and longtime hardware partner—its PCs ship with Microsoft Windows and often with its Office applications as well. There is significant overlap between the two companies’ reseller partners. And since most Microsoft partners run Dynamics CRM already, HP’s use of the product could simplify collaboration and data exchange. HP claims about 100,000 partners worldwide.
I will not comment about the “claims” about partners. Let’s see. HPQ buys hardware from HPE. Microsoft is a partner for HPQ and HPE. Looks like a friendly group. Add one person and the companies have a golf foursome. Will Google get asked to join the group? We know Oracle and Salesforce won’t.
Stephen E Arnold, September 20, 2016
The Case for Algorithmic Equity
September 20, 2016
We know that AI algorithms are skewed by the biases of both their creators and, depending on the application, their users. Social activist Cathy O’Neil addresses the broad consequences to society in her book, Weapons of Math Destruction. Time covers her views in its article, “This Mathematician Says Big Data is Causing a ‘Silent Financial Crisis’.” O’Neil studied mathematics at Harvard, worked in quantitative trading at a hedge fund, and launched a targeted-advertising startup. It is fair to say she knows what she is talking about.
More and more businesses and organizations rely on algorithms to make decisions that have big impacts on people’s lives: choices about employment, financial matters, scholarship awards, and where to deploy police officers, for example. Yet, the processes are shrouded in secrecy, and lawmakers are nowhere close to being on top of the issue. There is currently no way to ensure these decisions are anything approaching fair. In fact, the algorithms can create a sort of feedback loop of disadvantage. Reporter Rana Foroohar writes:
Using her deep technical understanding of modeling, she shows how the algorithms used to, say, rank teacher performance are based on exactly the sort of shallow and volatile type of data sets that informed those faulty mortgage models in the run up to 2008. Her work makes particularly disturbing points about how being on the wrong side of an algorithmic decision can snowball in incredibly destructive ways—a young black man, for example, who lives in an area targeted by crime fighting algorithms that add more police to his neighborhood because of higher violent crime rates will necessarily be more likely to be targeted for any petty violation, which adds to a digital profile that could subsequently limit his credit, his job prospects, and so on. Yet neighborhoods more likely to commit white collar crime aren’t targeted in this way.
Yes, unsurprisingly, it is the underprivileged who bear the brunt of algorithmic aftermath; the above is just one example. The write-up continues:
Indeed, O’Neil writes that WMDs [Weapons of Math Destruction] punish the poor especially, since ‘they are engineered to evaluate large numbers of people. They specialize in bulk. They are cheap. That’s part of their appeal.’ Whereas the poor engage more with faceless educators and employers, ‘the wealthy, by contrast, often benefit from personal input. A white-shoe law firm or an exclusive prep school will lean far more on recommendations and face-to-face interviews than a fast-food chain or a cash-strapped urban school district. The privileged… are processed more by people, the masses by machines.’
So, algorithms add to the disparity between how the wealthy and the poor experience life. Compounding the problem, algorithms also allow the wealthy to isolate themselves online as well as in real life, through curated news and advertising that make it ever easier to deny that poverty is even a problem. See the article for its more thorough discussion.
What does O’Neil suggest we do about this? First, she proposes a “Hippocratic Oath for mathematicians.” She also joins the calls for much more thorough regulation of the AI field and for updating existing civil-rights laws to include algorithm-based decisions. Such measures will require the cooperation of legislators, who, as a group, are hardly known for their technical understanding. It is up to those of us who do comprehend the issues to inform them that action must be taken. Sooner rather than later, please.
Cynthia Murrell, September 20, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on September 27, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233599645/
Paris Police Face Data Problem in Google Tax Evasion Investigation
September 20, 2016
Google has been under scrutiny for suspected tax evasion. Yahoo published a brief piece updating us on the investigation: “Data analysis from Paris raid on Google will take months, possibly years: prosecutor.” French police raided Google’s office in Paris, taking the tax avoidance inquiry to a new level. This comes after much pressure from across Europe to prevent multinational corporations from using their worldwide presence to pay less tax. Financial prosecutor Eliane Houlette is quoted stating,
“We have collected a lot of computer data,” Houlette said in an interview with Europe 1 radio, TV channel iTele and newspaper Le Monde, adding that 96 people took part in the raid. “We need to analyze (the data) … (it will take) months, I hope that it won’t be several years, but we are very limited in resources.” Google, which said it is complying fully with French law, is under pressure across Europe from public opinion and governments angry at the way multinationals exploit their global presence to minimize tax liabilities.
While big data search technology exists, government and law enforcement agencies may not have the funds to utilize such technologies. Or perhaps they are simply not aware of the open source solutions available. If nothing else, Houlette’s comments show the need for increased focus on upgrading systems for real-time and rapid data analysis.
Megan Feil, September 20, 2016
HonkinNews for September 20, 2016 Available
September 20, 2016
Stories in the Beyond Search weekly video news program “HonkinNews” include LinkedIn’s censorship of a former CIA professional’s post about the 2016 election. Documentum, founded in 1990, has moved to the frozen wilds of Canada. A Microsoft and Nvidia sponsored online beauty contest may have embraced algorithmic bias. Google can write a customer’s ad automatically and may be able to alter users’ thoughts and actions. Which vendors of intelligence-centric software may be shown the door to the retirement home? The September 20, 2016, edition of “HonkinNews,” filmed with old-fashioned technology in the wilds of rural Kentucky, is online at this link.
Kenny Toth, September 20, 2016
Lousy Backlog? Sell with Interesting Assertions.
September 19, 2016
If you are struggling to fill the sales pipeline, you will feel some pressure. If you really need to make sales, marketing collateral may be an easy weapon to seize.
I read “Examples of False Claims about Self-Service Analytics.” The write up singles out interesting sales assertions and offers them up in a listicle. I loved the write up. I lack the energy to sift through the slices of baloney in my enterprise search files. Therefore, let’s highlight the work of the brave person who singled out eight vendors’ marketing statements as containing what the author called “false claims.” Personally I think each of these claims is probably rock solid when viewed from the point of view of the vendors’ legal advisers.
Here are three examples of false claims about self service analytics. (For the other five, consult the article cited in the preceding paragraph.) Keep in mind that I find these statements as good as the gold for sale in the local grocery in Harrod’s Creek. Come to think of it the gold is foil wrapped around a lousy piece of ersatz chocolate. But gold is gold sort of.
Example 1 from Information Builders. Information Builders loves New York. The company delivers “integrated solutions for data management.” Here’s the item from the article which contains a “false claim.”
Self-service BI and analytics isn’t just about giving tools to analysts; it’s about empowering every user with actionable and relevant information for confident decision-making. (link). Self-service Analytics for Everyone…Who’s Everyone? Your entire universe of employees, customers, and partners. Our WebFOCUS Business Intelligence (BI) and Analytics platform empowers people inside and outside your organization to attain insights and make better decision.
I see a subject verb agreement error. I see a semicolon which puts me on my rhetorical guard. I see the universal “everyone”. I see the fact that WebFOCUS empowers.
What’s not to like? Information Builders is repeating facts which I accept. The fact that the company is in New York enhances the credibility of the statements. Footnotes, evidence? Who needs them?
Example 2 from SAP, the outfit that delivered R3 to Westinghouse and Trex to the enterprise search market. Here’s the “false assertion” which looks as solid as a peer reviewed journal containing data related to drug trials. Remember. This quote comes from the source article. I believe absolutely whatever SAP communicates to me. Don’t you?
This tool is intended for those who need to do analysis but are not Analysts nor wish to become them.
Why study math, statistics, and related disciplines? Why get a degree? I know that I can embrace the SAP way (which is a bit like the IBM way) and crunch numbers until the cows return to my pasture in Harrod’s Creek. Who needs to worry about data integrity, sample size, threshold settings, and algorithmic sequencing? Not me. Gibraltar does not stand on such solid footing as SAP’s tool for those who eschew analysts and do not want to wake up like Kafka’s protagonist as an analyst.
Example 3 from ZoomData, a company which has caught the attention of some folks in the DC area. I love those cobblestones in Reston too.
ZoomData brings the power of self-service BI to the 99%—the non-data geeks of the world who thirst for a simple, intuitive, and collaborative way to visually interact with data to solve business problems.
To me this looks better than the stone tablets Moses hauled down from the mountain. I love the notion of non geeks who thirst for pointing and clicking. I would have expressed the idea as drink deep of data’s Empyrean spring, but I am okay with the split infinitive “to visually interact” because the statement is a fact. I tell you that it is a fact.
For the other five allegedly false assertions, please, consult the original article. I have to take a break. When my knowledge is confirmed in these brilliant assertions, I need a moment to congratulate myself on my wisdom. Wait. I am an addled goose. Maybe these examples really are hogwash? Because I live in rural Kentucky, I will step outside and seek inputs from Henrietta, my hog.
Stephen E Arnold, September 19, 2016
Dr. Mike Lynch: After Dark Trace, Luminance
September 19, 2016
I read “Time for Robo-lawyer? Mike Lynch backs Cambridge Law-Tech Start-Up Luminance.” The founder of Autonomy worked his magic on Dark Trace. I write a short description of Dark Trace as part of the Commercial Tools section of Dark Web Notebook. With that firm up and growing, Dr. Lynch is now backing smart software to replace human lawyers. With Dr. Lynch’s experience in the rarified atmosphere of the legal eagles, his new venture makes sense. Use software to trim the wings and perhaps the legal fees of the savvy litigators, tort specialists, and interpreters of wild and crazy laws.
According to the write up:
Founded by a combination of lawyers, experts in M&A and mathematicians Luminance’s technology is based on R&D from Cambridge University, and is anchored in Recursive Bayesian Estimation theory. Obviously. It harnesses the power of artificial intelligence to automatically read and understand hundreds of pages of detailed and complex legal documentation every minute. This offers companies the ability to carry out essential due diligence work with much greater speed.
Yep, Bayesian, Markovian, and Laplacian methods are about to fatten Dr. Lynch’s bank account again.
I highlighted this passage:
“Luminance has been trained to think like a lawyer,” said CEO Emily Foges. “With Slaughter and May’s help, we are designing the system to understand how lawyers think, and to draw out key findings without the need to be told what to look for. This will transform document analysis and enhance the entire transaction process for law firms and their clients. Highly-trained lawyers who would otherwise be scanning through thousands of pages of repetitive documents can spend more of their time analyzing the findings and negotiating the terms of the deal.”
One wonders how Hewlett Packard would have turned out if HP had kept Dr. Lynch and let him fix the old time Sillycon Valley icon. Well, I wonder. I don’t think Meg Whitman spends much time thinking about Dr. Lynch until the court date in 2017. Perhaps Dr. Lynch will license Luminance technology to HPE so Meg Whitman can understand the value of Dr. Lynch’s approach to business. On the other hand, HPE may embrace OpenText Recommind. That new Luminance stuff may not make Meg Whitman comfortable.
Stephen E Arnold, September 19, 2016
Hundreds of Thousands of Patient Records Offered up on the Dark Web
September 19, 2016
Some of us suspected this was coming, despite many assurances to the contrary. Softpedia informs us, “Hacker Selling 651,894 Patient Records on the Dark Web.” Haughtily going by the handle TheDarkOverlord, the hacker responsible is looking to make over seven hundred grand off the data. Reporter Catalin Cimpanu writes:
The hacker is selling the data on The Real Deal marketplace, and he [or she] says he breached these companies using an RDP (Remote Desktop Protocol) bug. TheDarkOverlord has told DeepDotWeb, who first spotted the ads, that it’s ‘a very particular bug. The conditions have to be very precise for it.’ He has also provided a series of screenshots as proof, showing him accessing the hacked systems via a Remote Desktop connection. The hacker also recalls that, before putting the data on the Dark Web, he contacted the companies and informed them of their problems, offering to disclose the bug for a price, in a tactic known as bug poaching. Obviously, all three companies declined, so here we are, with their data available on the Dark Web. TheDarkOverlord says that all databases are a one-time sale, meaning only one buyer can get their hands on the stolen data.
The three databases contain information on patients in Farmington, Missouri; Atlanta, Georgia; and the Central and Midwest areas of the U.S. TheDarkOverlord asserts that the data includes details like contact information, Social Security numbers, and personal facts like gender and race. The collection does not, apparently, include medical history. I suppose that is a relief, for now.
Cynthia Murrell, September 19, 2016
Ancient History Tumblr Hack Still Beats Myspace Passwords Sale
September 19, 2016
Personal information remains a hot ticket item on the darknet. Metro shared an article highlighting the latest breach, “More than 65million Tumblr emails sold on the darknet.” While the leak happened in 2013, Tumblr has now reported the magnitude of the database that was hacked. As a call to action, the article reports Tumblr’s recommendation for users to change their passwords and look out for phishing attempts. The article reports,
The database includes email addresses and passwords. These are heavily protected by a procedure which makes it extremely difficult to reproduce the passwords. The database has turned up on the darknet marketplace The Real Deal at a price of £102, reports Motherboard.
Troy Hunt, who runs the security research site Have I Been Pwned, said the leak is an example of a ‘historical mega breach’. Users who fear their credentials were involved in the Tumblr hack can find out here.
Let’s not forget the more recent hack, potentially the largest theft of login credentials: “Hacker offers 427 million MySpace passwords for just $2,800.” Many are commenting on the low price tag for such a huge quantity of personal information as a sign of MySpace’s lack of appeal even on the Dark Web. When login information, including passwords, is stolen, phishing attempts on the site are not the only issue for victims to be concerned with; many individuals use the same login credentials for multiple accounts.
Megan Feil, September 19, 2016
OpenText: Content Marketing or Real News?
September 18, 2016
When I knew people at the original Malcolm Forbes Forbes, I learned that stories were meticulously researched and edited. I read “Advanced Analytics: Insights Produce New Wealth.” I was baffled, but, I thought, that’s the point.
The main point of the write up pivots on the assertion that an “insight” converts directly to “wealth.” I am not sure about the difference between old and new wealth. Wealth is wealth in my book.
The write up tells me:
Data is the foundation that allows transformative, digital change to happen.
The logic escapes me. The squirrels in Harrod’s Creek come and go. Change seems to be baked into squirreldom. The focus is “the capitalist tool,” and I accept that the notion of changing one’s business can be difficult. The examples are easy to spot: IBM is trying to change into a Watson powered sustainable revenue machine. HP is trying to change from a conglomeration of disparate parts into a smaller conglomeration of disparate parts. OpenText is trying to change from a roll up of old school search systems into a Big Data wealth creator. Tough jobs indeed.
I learned that visualization is important for business intelligence. News flash. Visualization has been important since a person has been able to scratch animals on a cave’s wall. But again I understand. Predictive analytics from outfits like Spotfire (now part of Tibco) provided a wake up call to some folks.
The write up informs me:
While devices attached to the Internet of Things will continue to throw out growing levels of structured data (which can be stored in files and databases), the amount of unstructured data being produced will also rise. So the next wave of analytics tools will inevitably be geared to dealing with both forms of information seamlessly, while also enabling you to embed the insights gleaned into the applications of your choosing. Now that’s innovation.
Let’s recap. Outfits need data to change. (Squirrels excepted.) Companies have to make sense of their data. The data come in structured and unstructured forms. The future will be software able to handle structured and unstructured data. Charts and graphs help. Once an insight is located, found, and presented by algorithms which may or may not be biased, the “insights” will be easy to put into a PowerPoint.
BAE Systems’ “Detica” was poking around in this insight in the 1990s. There were antecedents, but BAE is a good example for my purpose. Palantir Technologies provided an application demo in 2004 which kicked the embedded analytics notion down the road. A click would display a wonky circular pop up, and the user could perform feats of analytic magic with a mouse click.
Now Forbes’ editors have either discovered something that has been around for decades or been paid to create a “news” article that reports information almost as timely as how Lucy died eons ago.
Back to the premise: Where exactly is the connection between insight and wealth? How does a roll up of unusual search vendors like Information Dimension, BRS, Nstein, Recommind, and my favorite old time Fulcrum Technologies produce evidence for the insight to wealth argument? If I NOT out these search vendors and focus on the Tim Bray SGML search engine, I still don’t see the connection. Delete Dr. Bray’s invention. What do we have? We have a content management company which sells content management as governance, compliance, and other consulting friendly disciplines.
Consultants can indeed amass wealth. But the insight comes not from Big Data. The wealth comes from selling time to organizations unable to separate the giblets from the goose feathers. Do you know the difference? The answer you provide may allow another to create wealth from that situation.
One doesn’t need Big Data to market complex and often interesting software to folks who require a bus load of consultants to make it work. For Forbes, the distinction between giblets and goose feathers may be too difficult to discern.
My hunch is that others, not trained in high end Manhattan journalism, may be able to figure out which one can be consumed and which one can ornament an outfit at an after party following a Fashion Week showing.
Stephen E Arnold, September 18, 2016