Factualities for September 26, 2018

September 26, 2018

Believe ‘em or not:

58 million. The number of new jobs artificial intelligence will create by 2022. Source: Forbes
183.9. The top speed, in miles per hour, a woman reached riding a bicycle. Source: National Public Radio
55 percent of millennials prefer learning via YouTube. 59 percent of Generation Z prefer learning via YouTube. Source: Axios
71 percent. The percentage of startups in Israel focusing on business to business applications based on artificial intelligence. Source: Forbes
557,000. Number of backlogged US security clearances in the last 90 days. Source: FAS.org
$783. Average monthly payment of an Uber or Yelp driver in 2017. Source: Technology Review
2.0 billion euros. Amount raised by startups in France in the first six months of 2018. Amount raised in same period in 2017: 2.6 billion euros. Source: Bloomberg
Everyone. The number of people Twitter will ask about its policy changes. Source: TechCrunch

Yep, numbers one can trust. Like “everyone.”

Stephen E Arnold, September 26, 2018

Written by Stephen E. Arnold · Filed Under News, Statistics, Technology | Comments Off on Factualities for September 26, 2018

Change Is Difficult: Especially So for the Big Search Folks

September 25, 2018

There is a pretty good reminder that plumbing is an issue. Most users of smart search just assume that systems will continue to work. Hey, it is 2018. This Moore’s Law stuff, free services, and nifty new “old” phones are slam dunks.

Not quite.

I found “The Woes of Incremental Resource Drains in Big Systems” useful because it offers practical information. Most of the content snagged by my monitoring systems is what I consider marketing craziness.

The write up explains that an attempt to implement a necessary change can go off the rails. It’s like magic. One day Gmail or Amazon doesn’t work. Hand waving. Tweets. Then the problem is solved and forgotten.

The write up explains:

Let’s say you have a tier of 100,000 web servers. Every one of them opens one connection to your database. That connection is shared with all of the concurrent hits/code executing on those web servers. It just gets multiplexed down the pipe and off it goes. Then, one fine day, someone decides to write a change that makes the web servers open four connections to the database. They’ve invented this new “pooling” strategy, such that requests grab the least-busy connection instead of always sharing the single one. It’s supposed to help latency by n% (and get them a promotion, but let’s not get into that now).

Do the math. Dead system.
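For those who want the math spelled out, here is a minimal sketch. The server count and the jump from one to four connections come from the quoted passage. The database connection limit is a hypothetical number picked for illustration; the write up does not state one.

```python
# Back-of-the-envelope version of the quoted scenario.
# The connection limit is a made-up figure for illustration only;
# the quoted article does not give one.

WEB_SERVERS = 100_000                   # from the quoted passage
ASSUMED_DB_CONNECTION_LIMIT = 150_000   # hypothetical database capacity


def total_connections(servers: int, connections_per_server: int) -> int:
    """Connections the database must hold open at the same time."""
    return servers * connections_per_server


def fits(servers: int, connections_per_server: int, limit: int) -> bool:
    """True if the database can accept every requested connection."""
    return total_connections(servers, connections_per_server) <= limit


if __name__ == "__main__":
    for per_server in (1, 4):
        total = total_connections(WEB_SERVERS, per_server)
        ok = fits(WEB_SERVERS, per_server, ASSUMED_DB_CONNECTION_LIMIT)
        print(f"{per_server} per server: {total:,} connections, fits: {ok}")
```

Whatever the real limit is, the point holds: a small per-server change is multiplied by the size of the tier, so one connection becoming four means 100,000 connections becoming 400,000.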

Worth remembering because as certain companies become like the timesharing systems of the past, excitement will be inevitable. Some can be hidden. Some will surface in unexpected ways.

Bing, Facebook, Google, and maybe Amazon may face this type of challenge more often than one believes.

Stephen E Arnold, September 25, 2018

Written by Stephen E. Arnold · Filed Under News, Technology | Comments Off on Change Is Difficult: Especially So for the Big Search Folks

Rapprochement: Russia, a Copyright Defender

September 25, 2018

I know there are “exciting” search announcements from Bing and Google. Sigh. I don’t have the energy to tilt at the relevance windmill today. However, I noted an interesting search development regarding copyright protection in Russia. Yep, Russia, home of fancy bears and other digital creatures.

“Google, Yandex Discuss Creation of Anti-Piracy Database” explains the “real” news, which I assume is the truth:

Google, Yandex and other prominent Internet companies in Russia are discussing the creation of a database of infringing content including movies, TV shows, games, and software. The idea is that the companies will automatically query this database every five minutes with a view to removing such content from search results within six hours, no court order required.

I learned:

Takedowns like this are common in the West, with Google removing billions of links upon request. In Russia, however, search engine Yandex found itself in hot water recently after refusing to remove links on the basis that the law does not require it to do so. This prompted the authorities to suggest that a compromise agreement needs to be made, backed up by possible changes in the law. It now appears that this event, which could’ve led to Yandex being blocked by ISPs, has prompted both Internet companies and copyright holders to consider a voluntary agreement. Discussions currently underway suggest a unique and potentially ground-breaking plan.

I particularly enjoyed this explanation of the downside related to a failure to cooperate:

Talks appear to be fairly advanced, with agreements on the framework for the database potentially being reached by the middle of this week. If that’s the case, a lawsuit recently filed by Gazprom Media against Yandex could be settled amicably. It’s understood that Yandex wants all major Internet players to become involved, including social networks. With the carrot comes the possibility of the stick, of course. Gazprom Media indicates that if a voluntary agreement cannot be reached, it will seek amendments to copyright law that will achieve the same end results.

From my vantage point in bourbon saturated Harrod’s Creek, Kentucky, I blurrily perceive several implications:

Copyright compliance leadership appears to be centered in Russia. Yep, Russia. You know, the bear place.
Yandex seems to be taking the lead, possibly because its management understands the downsides of Putin pushback.
American companies are participating, but after decades of copyright hassles, I think I see a certain reluctance because an outfit like Google doesn’t want a reprise of its diplomatic performance with China. Ad dollars, perhaps?

Worth watching. Some queries on Yandex return some interesting copy protected content. If you have a moment, try this query at Yandex.ru:

(Sorry, but I can only show the Russian word for crack via screenshot without the text being corrupted when we disseminate this write up. You can get the Russian version of crack using a translation service.)

Stephen E Arnold, September 25, 2018

Written by Stephen E. Arnold · Filed Under Copyright, News, Search | Comments Off on Rapprochement: Russia, a Copyright Defender

DarkCyber for September 25, 2018, Now Available

September 25, 2018

DarkCyber for September 25, 2018, is now available at www.arnoldit.com/wordpress and on Vimeo at https://vimeo.com/291347184

Stephen E Arnold’s DarkCyber is a weekly video news and analysis program about the Dark Web and lesser known Internet services.

This week’s program covers four Dark Web and security related stories.

The first story answers the question, “What are some essential programs for my hacking toolkit?” DarkCyber identifies eight tools used by an ethical hacker and provides links to these programs. Each program performs a specific function and delivers information about passwords, system configuration, and other items of information associated with a target.

The second story explores a money laundering method implemented via online games. By exploiting the allegedly lax credit card verification methods used by Apple and other online game sellers, bad actors can use a stolen card to purchase digital assets sold within an online game. The assets can enhance the game play of the purchaser by activating special powers and other features. These digital assets can then be resold with the payments directed to an encrypted and allegedly anonymous digital currency wallet. DarkCyber notes that many parents and some game players are unaware of this scam.

The third story takes a look at Verizon’s detailed analysis of cyber crime exploits. The free report provides “how to” instructions for undertaking social engineering, hardware attacks, and malware attacks. The report includes detailed tables and appendices with additional cyber crime information. Stephen E Arnold, author of Dark Web Notebook, said, “The Verizon report contains information of value for security and law enforcement personnel. Unfortunately, this type of explanatory information provides bad actors with important insights into which specific methods are effective when attacking an organization or an individual.”

The final story explains how to create a custom Tor Onion URL. Instead of a string of incomprehensible letters and numbers, DarkCyber reviews a method for generating a more easily recognized URL like “bobsbankxxxxxxxx.” The procedure taps an open source software program and specific operational types created by a security expert. The video includes the site locations for the software and the instructional article.
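A rough sense of the effort involved can be sketched with arithmetic. This summary does not name the tool, so the following assumes the usual approach for this kind of software: keep generating keys until the resulting address happens to start with the wanted text. Onion addresses are written in base32, so each character has 32 possible values. The guess rate below is a hypothetical figure, not a benchmark.

```python
# Rough cost of finding an onion address with a chosen prefix by trial and error.
# Assumption: the tool simply retries random keys until the prefix matches.
# Each base32 character has 32 possible values.

BASE32_ALPHABET_SIZE = 32
ASSUMED_GUESSES_PER_SECOND = 1_000_000  # hypothetical hardware speed
SECONDS_PER_DAY = 86_400


def expected_attempts(prefix: str) -> int:
    """Average number of random addresses tried before one matches the prefix."""
    return BASE32_ALPHABET_SIZE ** len(prefix)


def expected_days(prefix: str, guesses_per_second: int) -> float:
    """Average waiting time in days at the given guess rate."""
    return expected_attempts(prefix) / guesses_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    for prefix in ("bob", "bobs", "bobsbank"):
        attempts = expected_attempts(prefix)
        days = expected_days(prefix, ASSUMED_GUESSES_PER_SECOND)
        print(f"{prefix}: about {attempts:,} attempts, {days:.4f} days")
```

Each extra character multiplies the work by 32, which is why custom addresses tend to carry a short readable prefix followed by filler characters.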

Beginning with the program for October 30, 2018, and then for programs released on November 6, November 13, and November 20, Stephen will issue a series of four DarkCyber programs about Amazon’s policeware initiative. Each video will be about three minutes. The standard news format will resume on November 27.

The DarkCyber team has developed a for fee one hour briefing about the little known facet of Amazon’s product and services initiative. To set up a video conference, email benkent2020 at yahoo dot com. Please, put “Amazon policeware” in the subject line.

Kenny Toth, September 25, 2018

Written by Stephen E. Arnold · Filed Under Dark Web, DarkCyber, News, Video | Comments Off on DarkCyber for September 25, 2018, Now Available

Scripts and Rules: The Future Is Not Fully Automatic

September 24, 2018

I wanted to capture an item of information which may be lost in the flood of marketing craziness. The subject is smart software, autonomous systems, and why humans may have to push buttons and turn knobs.

The write up is “Deep Learning Is Inferior to Scripting for Teaching Robots.” The main idea, as I understood the article, is that creating useful robots (hardware and software) may benefit from old school methods.

The method is for one or more smart humans to create data sets and train robots. But the humans are not out of a job. The robots have to be retrained with more rules, updated rules, and fresh data sets.

The article points out:

They [A. Rupam Mahmood, Dmytro Korenkevych, Gautham Vasan, William Ma and James Bergstra] conclude that deep learning lags behind the traditional way of training robots with scripts ‘by a large margin in some tasks, where such solutions were well established or easy to script‘. However, RL was ‘more competitive’ in complex tasks (for instance, docking to a charging station). The report also shows that the RL algorithms need careful tuning to their variables before being useful for any task – but after tuning, those same variables were applicable across a range of tasks.

In short, autonomous methods are not as efficient as humans doing the rule work and creating scripts or training sets.

The point is an important one. Smart software is not the cost and accuracy silver bullet that marketers have described as a reality.

I am not disputing that for certain specific operations on bounded data smart software can be magical.

But bootstrapping intelligence and learning with zeros and ones has not yet docked at the port, unloaded, and moved the goods to the user.

Stephen E Arnold, September 24, 2018

Written by Stephen E. Arnold · Filed Under AI, News | Comments Off on Scripts and Rules: The Future Is Not Fully Automatic

Two Internets: Preparing for a Google Centric Walled Garden

September 24, 2018

In my “The Google Legacy” I included a diagram of a Google centric Internet. Here’s that diagram from the monograph, written in 2004:

This is an old diagram. To modernize it, we need to add more closed systems such as a Russian Internet, an EU Internet, and maybe an Amazon Internet.

The idea is that information flows into a Google construct. Users use the system. Think of the diagram as illustrating an Internet which is, in effect, inside Google.

I thought about this now 15 year old diagram when I read “Former Google CEO Predicts the Internet Will Split in Two and One Part Will Be Led by China.” The write up reports:

If you think of China as like ‘Oh yeah, they’re good with the Internet,’ you’re missing the point. Globalization means that they get to play too. I think you’re going to see fantastic leadership in products and services from China. There’s a real danger that along with those products and services comes a different leadership regime from government, with censorship, controls, etc.

The subtext, of course, is that Google wanted China to change, a laughable moment like solving death.

Today Google finds itself faced with losing its walled garden. Think of the problem as AMP and online advertising under pressure from outsiders. These are people who can withdraw cash from an ATM but not understand its plumbing.

I interpreted the remark in the context of “The Google Legacy.”

Two Internets means that Google will have to find a way to be a big dog in the other Internets which are likely to try and exclude the Silicon Valley scooter riders.
An explanation of why Google’s smart search system has to be given some special training
A signal that Google will change, maybe taking another step down the road that Hewlett Packard followed. (Remember HP owned AltaVista? I do.)

To sum up, I think more allegedly helpful insights are revealing more than paying for coffee with a mobile phone.

Stephen E Arnold, September 24, 2018

Written by Stephen E. Arnold · Filed Under Business strategy, Google, News | Comments Off on Two Internets: Preparing for a Google Centric Walled Garden

The Semantic Web: Technology Roadkill or a Roadside Snack?

September 24, 2018

I spotted a quote to note. Here it is:

The Semantic Web is as dead as last year’s roadkill.

The statement appears in “Whatever Happened to the Semantic Web?” The write up provides a run through of the starts and stops associated with making the Web into a more organized place.

I would point out that the state of the Semantic Web can be glimpsed in the TweetedTimes’ auto generated list of articles called “Semantic Search.” The collection of items focuses on a range of topics, but the thrust seems to be getting traffic for a Web site; for example, “How to Optimize Content for Semantic SEO.”

If you are an adherent of the Semantic Web, check out the included footnotes. I would point out that the Google has a number of Guha patents in its portfolio. I think the Semantic Web may be of interest to some at the online ad search giant.

Guha’s patents plus the work by Alon Halevy may suggest some interesting use cases for the mark up, triplet, smart agent system and methods.

Stephen E Arnold, September 24, 2018

Written by Stephen E. Arnold · Filed Under Google, Indexing, News | 3 Comments

Smart Software and Clever Humans

September 23, 2018

Online translation works pretty well. If you want 70 to 85 percent accuracy, you are home free. Most online translation systems handle routine communications like short blog posts written in declarative sentences and articles written in technical jargon just fine. Stick to mainstream languages, and the services work okay.

But if you want an online system to translate my pet phrases like HSSCM or azure chip consultant, you have to attend more closely. HSSCM refers to the way in which some Silicon Valley outfits run their companies. You know. Like a high school science club which decides that proms are for goofs and football players are not smart. The azure chip thing refers to consulting firms which lack the big time reputation of outfits like Bain, BCG, Booz, etc. (Now don’t get me wrong. The current incarnations of these blue chip outfits are far from stellar. Think questionable practices. Maybe criminal behavior.) The azure chip crowd means second string, maybe third string, knowledge work. Just my opinion, but online translation systems don’t get my drift. My references to Harrod’s Creek are geocoding nightmares when I reference squirrel hunting and bourbon in cereal. Savvy?

I was, therefore, not surprised when I read “AI Company Accused of Using Humans to Fake Its AI.” The main point seems to be:

[An] interpreter accuses leading voice recognition company of ripping off his work and disguising it as the efforts of artificial intelligence.

There are rumors that some outfits use Amazon’s far from mechanical Turk or just use regular employees who can translate that which baffles the smart software.

The allegation comes from a human who says he was disguised as smart software. Sixth Tone, the outlet publishing the article, reports:

In an open letter posted on Quora-like Q&A platform Zhihu, interpreter Bell Wang claimed he was one of a team of simultaneous interpreters who helped translate the 2018 International Forum on Innovation and Emerging Industries Development on Thursday. The forum claimed to use iFlytek’s automated interpretation service.

Trust me, you zippy millennials, smart software can be fast. It can be efficient. It can be less expensive than manual methods. But it can be wrong. Not just off base. Playing a different game with expensive Ronaldo types.

Why not run this blog post through Google Translate and check out the French or Spanish the system produces? Better yet, aim the system at a poor quality surveillance video or a VoIP call laden with insider talk between a cartel member and the Drug Llama?

Stephen E Arnold, September 23, 2018

Written by Stephen E. Arnold · Filed Under News, Text processing, Translation | Comments Off on Smart Software and Clever Humans

HSSCM Methods: Hey, Enough of This Already

September 22, 2018

I read an allegedly “real journalism” story called “Google Suppresses Memo Revealing Plans to Closely Track Search Users in China.” I won’t call attention to the split infinitive, which is not popular among the sensitive set.

And now to the “real” story:

The write up reveals that allegedly Google’s high school science club management methods include forcing employees to “delete a confidential memo circulating inside the company.”

But here’s the juicy bit:

The memo, authored by a Google engineer who was asked to work on the project, disclosed that the search system, codenamed Dragonfly, would require users to log in to perform searches, track their location — and share the resulting history with a Chinese partner who would have “unilateral access” to the data.

Okay, now that’s management: Confidential material circulating. Info must be deleted. Data from the alleged memo gets leaked to a “real” news outfit.

The reaction is classic HSSCM: Anger, possible governance goofs, and saying one thing and maybe, just maybe, doing something else.

Well, the HSSCM method includes what the write up says is an interesting angle:

Google reportedly maintains an aggressive security and investigation team known as “stopleaks,” which is dedicated to preventing unauthorized disclosures. The team is also said to monitor internal discussions.

I particularly like the phrase “moral agency.”

Hey, hey, the HSSCM method means that the science club sets the rules. “Moral agency?” Can that be measured in mendacity?

Stephen E Arnold, September 22, 2018

Written by Stephen E. Arnold · Filed Under Management, News | Comments Off on HSSCM Methods: Hey, Enough of This Already

Modern Journalism: Fact or Fiction or Something Quite New

September 21, 2018

I read the information on this New York Times’ Web page. The intent is to explain how the NYT can learn a “secret” from a helpful reader. I noted this statement:

Each tip, be it from a submission or from a source, is rigorously vetted and probed.

I find that interesting. I wonder how the anonymous editorial about the “soft” revolt within President Donald Trump’s staff was verified.

Probably not important. “Real” journalists do not have to reveal sources when a secret tip is rigorously vetted and probed.

From my vantage point in Harrod’s Creek, where real journalism is embodied in the local newspaper, the difference between reality and fiction is blurred. It is not thinking here; it is bourbon that does the trick.

Stephen E Arnold, September 21, 2018

Written by Stephen E. Arnold · Filed Under News, Publishing | 2 Comments

Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.
