Machine Learning Solution Would Help Keep Wikipedia Entries Updated
February 27, 2020
In a development that could ease the burden on Wikipedia volunteers, Eurasia Review reports, “Automated System Can Rewrite Outdated Sentences in Wikipedia Articles.” Researchers at MIT have created a system that could greatly simplify the never-ending process of keeping articles up to date on the site. Instead of having to rewrite sentences or paragraphs, volunteers could just insert the updated information into an unstructured sentence. The system would then generate “humanlike” text. Here’s how:
“Behind the system is a fair bit of text-generating ingenuity in identifying contradictory information between, and then fusing together, two separate sentences. It takes as input an ‘outdated’ sentence from a Wikipedia article, plus a separate ‘claim’ sentence that contains the updated and conflicting information. The system must automatically delete and keep specific words in the outdated sentence, based on information in the claim, to update facts but maintain style and grammar. …
We noted:
“The system was trained on a popular dataset that contains pairs of sentences, in which one sentence is a claim and the other is a relevant Wikipedia sentence. Each pair is labeled in one of three ways: ‘agree,’ meaning the sentences contain matching factual information; ‘disagree,’ meaning they contain contradictory information; or ‘neutral,’ where there’s not enough information for either label. The system must make all disagreeing pairs agree, by modifying the outdated sentence to match the claim. That requires using two separate models to produce the desired output. The first model is a fact-checking classifier — pretrained to label each sentence pair as ‘agree,’ ‘disagree,’ or ‘neutral’ — that focuses on disagreeing pairs. Running in conjunction with the classifier is a custom ‘neutrality masker’ module that identifies which words in the outdated sentence contradict the claim.”
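To make the mask-then-fuse idea concrete, here is a toy sketch in Python. It is not the MIT researchers' system: the learned fact-checking classifier and "neutrality masker" are replaced by a crude, hypothetical heuristic that treats any number in the outdated sentence that is missing from the claim as the contradicting word. The function names and the example sentences are invented for illustration.

```python
import re


def tokenize(sentence):
    # Split into word tokens and single punctuation marks.
    return re.findall(r"\w+|[^\w\s]", sentence)


def detokenize(tokens):
    # Join tokens with spaces, then remove the space before punctuation.
    return re.sub(r"\s+([^\w\s])", r"\1", " ".join(tokens))


def mask_conflicts(outdated, claim):
    # Toy stand-in for the masker: a numeric token that does not appear
    # in the claim is assumed to contradict it and is replaced by None.
    claim_tokens = set(tokenize(claim))
    return [
        None if (tok.isdigit() and tok not in claim_tokens) else tok
        for tok in tokenize(outdated)
    ]


def fuse(masked, claim):
    # Toy stand-in for the fusion step: fill each masked slot with the
    # next number from the claim that the outdated sentence did not keep.
    kept = {tok for tok in masked if tok is not None}
    fillers = [t for t in tokenize(claim) if t.isdigit() and t not in kept]
    result = []
    for tok in masked:
        if tok is None:
            result.append(fillers.pop(0) if fillers else "[MASK]")
        else:
            result.append(tok)
    return result


def update_sentence(outdated, claim):
    # Keep the outdated sentence's wording; swap in only the new facts.
    return detokenize(fuse(mask_conflicts(outdated, claim), claim))


print(update_sentence(
    "The company has 500 employees.",
    "The company now employs 650 people.",
))
```

The point of the sketch is the division of labor: one step decides which words to delete, and a second step rewrites the sentence around what is kept, so the original style and grammar survive the update.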
Note this process still requires people to decide what needs updating, but researchers look forward to a time when even that human input could be sidestepped. (Is that a good thing?) Another hope is that the tool could be used to eliminate bias in the training of “fake news” detection bots. Researchers point out the system could be used on text-generating applications beyond Wikipedia, as well. See the write-up for more information.
Cynthia Murrell, February 27, 2020
British Maps Online: Finding a Map Is Challenging
February 26, 2020
The British Royal Collection recently expanded its online collection. The blog Ian Visits explores the new material in the post “Huge Archive Of Old Military Maps Published.” The post explains that more than three thousand maps that King George III collected have been digitized. Dr. Yolande Hodson headed the project and spent ten years cataloging George III’s collection. This is the first time in history that these documents have been available free to the public.
Scholars are amazed at the breadth and wealth of information available in the maps, but the average user will find the maps fun due to their age and information. The map collection contains items from the sixteenth to eighteenth centuries, consisting of maps drawn in the field, uniform depictions, fortification plans, and presentation maps of sieges, battles, and marches. King George III loved maps:
“Maps were an important part of George’s early life and education, and he built up a huge collection of more than 55,000 topographical, maritime and military prints, drawings, maps and charts. Upon the King’s death, his son, George IV, gave his father’s collections of topographical views and maritime charts to the British Museum (now in the British Library), but retained the military plans due to their strategic value and his own keen interest in the tactics of warfare.”
These maps offer a window to the past. They show how common soldiers and ordinary people went about their daily lives. They are not photographs, but they offer more details than many a picture can.
Keep in mind that browsing may be needed to locate a particular map.
Whitney Grace, February 26, 2020
Elastic App to Stretch Finding
February 26, 2020
Elasticsearch is one of the most used open source search applications. While Elasticsearch is free for open source developers to download, the company offers subscriptions for customer support and enhanced software. Street Insider shares that the company has added a new option to its service: “Elastic Announces The General Availability Of Elastic App Search On Elasticsearch Service.”
Starting now, Elasticsearch Service users can deploy App Search directly from their dashboard. A powerful search experience is available on mobile devices that harness the Elastic Cloud. The new Elastic App Search also includes new geolocation options and pricing:
“This milestone also unlocks a whole new choice of geolocation options for Elastic App Search users: from São Paulo to Singapore and California to Germany, App Search can be hosted everywhere you find our Elasticsearch Service.
Elastic didn’t just make getting started on App Search easier — they’ve also simplified pricing by switching to the same resource-based pricing model that Elasticsearch Service uses. With App Search on Elasticsearch Service, users only pay for the resources they consume, without worrying about artificial constraints around the number of users, documents, or operations made. It’s a whole new approach to pricing search that’s transparent and fair.”
Elastic, the parent company, is dedicated to making its software available to anyone who needs powerful search. Elastic offers free trials and opportunities to build prototypes.
Whitney Grace, February 26, 2020
Amazon Pursues the Ipanema Way
February 26, 2020
Foreign technology investment is a booming industry. Most major technology investments appear to occur in Asia and Western countries. Amazon Web Services is looking south for technology investments, specifically to Brazil. ZDNet reports on the plan in “AWS Plants Multimillion-Dollar Investment In Brazil.”
AWS plans to invest $233 million (1 billion reais) to expand its infrastructure in São Paulo. The investment will be made over two years. Governor of São Paulo João Doria predicts that AWS’s investment will create more jobs and opportunities for startups within the state. AWS and the Brazilian officials did not share anything about the deal beyond an official press release. AWS first came to São Paulo in 2012, when it built its first datacenter. That was just the start of the new investment:
“A few years later, the company announced that it would be using the customer cost-consciousness driven by economic instability to grow its business in Brazil and increase its influence in the local technology community. Cloud computing and artificial intelligence will be the core areas of focus when it comes to investment in technology in Brazil in 2020, according to a study released last month by technology firm CI&T.”
AWS’s major challenge in Brazil will be guaranteeing that this sector of the market can keep up with the rest of the world. Cloud computing technology advancements are driving AWS to invest in Brazil because it is a new, although volatile, market.
Whitney Grace, February 26, 2020
Amazon: Buying More Innovation
February 26, 2020
DarkCyber noted the article “Amazon Acquires Turkish Startup Datarow.” The word “startup” is rather loosely applied. Datarow was founded in 2016. Not a spring chicken in DarkCyber’s view is a four year old outfit.
What’s interesting about this acquisition is that it provides the sometimes unartful Amazon with an outfit that specializes in making easier-to-use data tools. The firm appears to have been built around AWS Redshift.
The company’s quite wonky Web site says:
We’re proud to have created an innovative tool that facilitates data exploration and visualization for data analysts in Amazon Redshift, providing users with an easy to use interface to create tables, load data, author queries, perform visual analysis, and collaborate with others to share SQL code, analysis, and results. Together with AWS, we look forward to taking our tool to the next level for customers.
The company provides what it calls “data governance,” a term which DarkCyber takes to mean “get your act together” with regard to information. This is easier said than done, but it is a hot button among companies struggling to reduce costs, comply with assorted rules and regulations, and figure out what’s actually happening in their lines of business. Profit and loss statements are not up to the job of dealing with diverse content, audio, video, real time data, and tweets. Well, neither is Amazon, but that’s not germane.
Will Amazon AWS Redshift (love the naming, don’t you?) become easier to use? Perhaps Datarow will become responsible for the AWS Web site?
Stephen E Arnold, February 26, 2020
Does Amazon Have Dark User Interface Patterns?
February 25, 2020
The question “Does Amazon make use of interfaces intentionally designed to generate revenue?” is an interesting one. Amazon does have a boatload of features, functions, and services. There are — what? — more than a half dozen different databases, including the quantum thing.
Consider the article “My First AWS Free Tier Hosting Bill Was $900.” The idea is that “free” did not mean exactly free. This is akin to the word “unlimited” when it appears in mobile data plans. Is Amazon following a path blazed by telecommunications giants, truly models of consumer centric behavior in DarkCyber’s narrow view of the economic world?
The write up states:
A major part of AWS marketing is pay-per-use for their services:
“You only pay for the services you consume, and once you stop using them, there are no additional costs or termination fees.” – AWS Pricing
They also market “free tier” products, less powerful instances that are free for the first year of use.
The article reports that a slow roll out allowed the system to “sit around for a month.”
That decision cost about $1,000.
The article points out the mistake of assuming that an “idle server” would not cost anything. Also, the Amazon jargon did not make sense, so the developer ignored the Amazon speak.
The write up goes through the Amazon lingo to alert other individuals of Amazon’s approach to “free.”
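A back-of-the-envelope check shows why an “idle” month adds up. The sketch below takes only the roughly $1,000 figure and the one month from the write up; the 30-day month and the assumption that charges accrued evenly around the clock are mine, and the function name is hypothetical.

```python
HOURS_PER_DAY = 24


def implied_hourly_rate(total_bill, days_running):
    # Spread the bill evenly over every hour the resources existed.
    # Assumes continuous, constant billing, which is a simplification.
    return total_bill / (days_running * HOURS_PER_DAY)


# About $1,000 over an assumed 30-day month.
rate = implied_hourly_rate(1000.0, 30)
print(f"Implied cost: about ${rate:.2f} per hour")
```

Under those assumptions, the implied rate is under a dollar and a half per hour. A small hourly number looks harmless on a pricing page, but a month contains 720 billable hours.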
Several observations:
- Amazon is confusing. DarkCyber thinks this is partly due to the vaunted two pizza team approach to programming and partly due to clever marketers who really want to match up to the founder’s principles.
- Amazon pitches itself hard as the logical, best, and superior choice for cloud anything. Individuals who buy this pig in a poke are going to pay.
- Amazon, if one makes a good case to the customer service unit staffed with people who sort of speak like those in rural Kentucky, will modify the charge.
Are there some lessons one can learn from this write up? Maybe, for example:
- Learn to speak Amazon
- Think before clicking
- Amazon became really big for a reason: Avoid becoming a third party merchant whose hot product became part of Amazon Basics.
Your mileage may vary from the drive through the Tunnel of Love that the author of the article took.
Stephen E Arnold, February 25, 2020
Intel: Getting Serious about Artificial Intelligence
February 25, 2020
Intel has tried to impact the AI market for years, but it has lagged behind IBM’s Watson. Intel’s biggest weakness with AI is that while its technology is decent for inferencing, it is lacking in the training department. Training and large amounts of data are how AI learns and is programmed. Without these, AI is only expensive software. Datamation explores how Intel is expanding its AI edge in “Intel Buys Habana And Gets Serious About Keeping Learning AI.”
Intel purchased Habana entirely, not simply its technology, and is leaving the company alone to continue its work. Intel’s recent acquisition was inspired by rival companies developing GPUs with intense training capabilities and powerful inference engines. It appears Intel wants to go further by focusing on deep learning AI. Habana has closely guarded its technology to prevent theft. The company is in trials with Facebook because of the social media firm’s presence in politics. Intel had already sunk $75 million into Habana to gear it up for competition:
“The move was designed specifically to provide stronger competition for the market leader, NVIDIA, in the space as it ramps to its $25+ estimated billion-dollar potential by 2024. Habana’s Chairman Avigdor Willenz has an impressive success record, having sold Galileo Technologies to Marvell Technology Group for $2.7B in 2001 and Annapurna Labs to Amazon for $370M in 2015. Intel paid $2B for Habana.”
With Habana, Intel should advance its AI developments. Intel currently does not have strong training engines so it will pull on Habana’s technology to bolster that, while Habana will rely on Intel to diversify its client base.
Intel has all the tools to succeed in becoming NVIDIA’s rival, but it lags behind. Intel needs to create a thorough development plan to get its AI projects on track to become market leaders.
Whitney Grace, February 25, 2020
Did PopSockets Slow the Bezos Bulldozer?
February 25, 2020
At a recent hearing of the House Subcommittee on Antitrust, Commercial, and Administrative Law, major tech players Amazon, Apple, Facebook, and Google were all accused of anticompetitive practices. Representatives listened to executives from Sonos, Basecamp, Tile, and PopSockets describe how those larger companies have unfairly wielded their market dominance against smaller players. For one CEO, Amazon’s behavior was especially egregious. Mashable reports, “PopSockets CEO Calls Out Amazon’s ‘Bullying with a Smile’ Tactics.” Reporter Jack Morse writes:
“Barnett, under oath, told the gathered members of the House that Amazon initially played nice only to drop the hammer when it believed no one was watching. After agreeing to a written contract stipulating a price at which PopSockets would be sold on Amazon, the e-commerce giant would then allegedly unilaterally lower the price and demand that PopSockets make up the difference.”
When asked how Amazon could ignore their contract, Barnett elaborated:
“‘With coercive tactics, basically,’ he replied. ‘And these are tactics that are mainly executed by phone. It’s one of the strangest relationships I’ve ever had with a retailer.’ Barnett emphasized that, on paper, the contract ‘appears to be negotiated in good faith.’ However, he claimed, this is followed by ‘… frequent phone calls. And on the phone calls we get what I might call bullying with a smile. Very friendly people that we deal with who say, “By the way, we dropped the price of X product last week. We need you to pay for it.”’ Barnett said he would push back and that’s when ‘the threats come.’ He asserted that Amazon representatives would tell him over the phone: ‘If we don’t get it, then we’re going to source product from the gray market.’”
PBS, the TV outfit, took a close look at Amazon and saw George Orwell, not the smiling box. So with JEDI stalled, Amazon seems to be popping and socketing to the rhythm of ringing cash registers.
Cynthia Murrell, February 25, 2020
DarkCyber for February 25, 2020, Now Available
February 25, 2020
This week’s DarkCyber video news program features an interview with Dr. Rado Kotorov, the chief executive officer of Trendalyze. The company provides time series analytics on steroids.
Most professionals are aware that some Wall Street traders analyze time series data for stocks. In the last decade, the business of buying and selling stocks has evolved. Today there are more data available and the importance of obtaining certain data in near real time has skyrocketed. Plugging numbers into Excel is useful; however, more sophisticated analytic systems are required to deal with financial data.
The era of the hard working broker making trades before heading to the Kiwanis club has ended. The focus is now on high frequency trading and using advanced analytics, pattern analysis, machine learning, deep learning, AI, and a suite of tools designed to exploit price fluctuations in nanoseconds.
In this interview, Dr. Kotorov explains that the methods of Wall Street high frequency traders have now moved into other business sectors. Examples range from health care to companies like Amazon, Tesla, and Walmart. Time series analyses provide high-value results for policeware and government systems.
Dr. Kotorov reviews a theory of intelligence which relies on time series analysis of real time flows of large volumes of data. Specifically, the approach enables more refinement in certain machine learning applications as well as adding precision to some artificial intelligence approaches.
Dr. Kotorov, who holds a law degree and a Ph.D. in philosophy, heads one of the fastest growing analytics firms in the world.
For more information about Trendalyze, navigate to the url presented in the interview.
DarkCyber is a video news program produced by Stephen E Arnold, publisher of DarkCyber blog. The blog and the twice-a-month video news program are provided without advertising or sponsored content.
Kenny Toth, February 25, 2020
An Uncanny Blind Alley
February 24, 2020
I subscribe to the dead tree edition of the New York Times. I spend less time with the expensive reminder of a bygone era than I did when I was an eager beaver working at a nuclear consulting company. One never knew when a hot event (no pun intended) would break like Three Mile Island.
Now to the New York Times Magazine, a pinnacle of content. Am I right? Clarity in titling, hard facts, and helpful analysis based on those facts. Am I right?
I read either “RE: Working the System. In an economy with few protections for employees, how do you gain power on the job? (Very Carefully)” or “the Young and the Restless. Generational consultants believe that Millennial and Gen Z professionals have different values—and that to recruit and keep them, companies need a whole new approach” or “Yaaass! We’re HIRING!”
Note: I think the “them” in the second odd ball title refers to “employees”, not “values.” Well, maybe not? The notion of a title that makes sense is just sooo! OLD FASHIONED!
If you want to read the story which ran in the NYT Magazine, yep, Sunday’s graphically and bibliographically challenged NYT Magazine, hunt up the February 23, 2020 edition. The story appeared on February 19, 2020, at this paywalled link of which the NYT is quite proud. Note: To keep subscribers, why not put the story online after the dead tree customers receive the newspaper? Oh, right. It’s a generational thing.
Now to the write up.
As soon as I saw the graphics, which continue to baffle me because my mobile phone does not present information in the manner depicted, I thought of Anna Wiener’s best selling book Uncanny Valley, published either by Macmillan Publishers or Farrar, Straus and Giroux. Yep, another outfit which worries not about useless trivia like bibliographic references. You can buy a copy, which I recommend, at a Barnes & Noble if there’s one left in your neighborhood, Google Play, Kobo (what? who?) and the Bezos bulldozer’s book store and policeware company.
The NYT Magazine’s approach lacks three characteristics of Ms. Wiener’s book.
First, the humor in the NYT Magazine missed its mark with me. I was not sure if “phigital” was a joke or a real-live word used in the Big Apple. For me, the jury’s out or hung.
Second, the examples used to characterize the different “generations” identified in the article struck me as outliers. Ms. Wiener offered context. Consider the NYT example of a person who wanted a day off and lied about the death of a relative. When the boss found out, it was like you really sort of okay. (I would not advise trying this approach at Bain, BCG, Booz Allen, or McKinsey when a deadline is fast approaching.) Not funny, by the way, that death lie.
Third, the author who lets me know that he/she is a member of one of these generations learned how to do term papers, not write in a manner as compelling as Ms. Wiener’s. There are references to hot consulting firms like GenGuru and academic-sounding books like “The Remix: How to Lead and Succeed in the Multigenerational Workplace,” and presumably validated statistics. For instance, I did not know nor do I necessarily believe that Gen Zers live below the poverty line. I thought the members of this group live with their parents or use the old fogies as meatware infused automatic teller machines. Source of the number? Nope. Sample size? Nope. Context of the survey? Nope. Oh, well, it is the New York Times. “Yaaass!”
Now don’t get me wrong. DarkCyber reads, filters, and pays attention to a wide range of content. This particular article struck the team as an attempt to ride the interest in the book by Ms. Wiener, who writes for the often highly regarded New Yorker Magazine. That outfit usually uses one title on an article and restrains absolutely too-hip graphics professionals from creating an article with three possible titles for librarians and the wizards at Google to index. And the colors? Don’t rev DarkCyber’s engines, please.
Several observations:
- Originality is a useful characteristic of some writing. Would this ingredient be useful at the NYT? Maybe less “Yaaass”?
- Quasi clever is okay on a blog or a TikTok video. Maybe not so much in the Gray Lady’s venerable magazine? Techno-viral fluency? Less “Yaaass”?
- The graphics consume more space than the article itself. Maybe three pages of content, data, and analysis. Maybe less “Yaaass”?
DarkCyber noted this statement in the article:
“Until, that is, these generations start to see the forest and not just the trees.”
Trees become wood pulp and some facilitate the dead tree NYT’s goals.
Stephen E Arnold, February 24, 2020