Amazon and Halliburton: A Tie Up to Watch? Yep
September 11, 2020
DarkCyber noted “Explor, Halliburton, AWS Collaborate to Achieve Breakthrough with Seismic Data Processing in the Cloud.” The write up explains that crunching massive seismic data sets works. Among the benchmarks reported by the online bookstore and the environmentally-aware engineering and services companies are:
- An 85% decrease in CDP sort order times: Tested by sorting 308 million traces comprising 1.72 TB from shot domain to CDP domain, completing the flow in an hour.
- An 88% decrease in CDP FK Filtering times: Tested with a 57 million-trace subset of the data comprising 318 GB, completing the flow in less than 6 minutes.
- An 82% decrease in pre-stack time migration times: Tested on the full 165 million-trace dataset comprising 922 GB, completing the flow in 54 minutes.
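For the curious, the throughput implied by each quoted benchmark is easy to back out. A back-of-the-envelope sketch in Python, using only the figures reported above:

```python
# Back-of-the-envelope throughput implied by the quoted benchmark figures.
# Inputs are exactly the numbers from the press release; nothing else is assumed.
def throughput(traces: int, gigabytes: float, minutes: float):
    """Return (millions of traces per minute, GB per minute)."""
    return traces / minutes / 1e6, gigabytes / minutes

benchmarks = {
    "CDP sort order":           (308_000_000, 1720, 60),  # 1.72 TB in one hour
    "CDP FK filtering":         (57_000_000,   318,  6),  # 318 GB in under 6 min
    "Pre-stack time migration": (165_000_000,  922, 54),  # 922 GB in 54 min
}

for name, args in benchmarks.items():
    mtraces, gbpm = throughput(*args)
    print(f"{name}: {mtraces:.1f}M traces/min, {gbpm:.1f} GB/min")
```

The FK filtering run, for instance, works out to roughly 9.5 million traces and 53 GB per minute.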
What do these data suggest? Better, faster, and cheaper processing?
We noted this paragraph in the write up:
“The collaboration with AWS and Explor demonstrates the power of digital investments that Halliburton is making, in this instance to bring high-density surveys to market faster and more economically than ever before. By working with industry thought leaders like Explor and AWS, we have been able to demonstrate that digital transformation can deliver step-change improvements in the seismic processing market.” – Philip Norlund, Geophysics Domain Manager, Halliburton, Landmark
Keep in mind that these data are slightly more difficult to manipulate than a couple hundred thousand tweets.
Stephen E Arnold, September 11, 2020
IBM, Canned Noise, but No Moan from the Injured Line Judge: IBM Watson, What Is Happening?
September 10, 2020
As it has done since 2015, IBM has shared details about the AI tech it is using to support the US Open tennis championship. This year’s tournament, though, is different from most due to the pandemic. VentureBeat reports, “IBM Will Use AI to Pipe in Simulated Crowd Noise During the U.S. Open.” We think IBM is making the project more complicated than necessary. A sound bed can be accomplished with sound clips on a laptop. Need sound, click a link. IBM’s solution? Bring an F-35, its support team, and a truck filled with spares. Writer Kyle Wiggers reports:
“The first [addition] is AI Sounds, which aims to recreate the ambient noise normally emanating from the stadium. IBM says it leveraged its AI Highlights platform to digest video from last year’s U.S. Open and rank the ‘excitement level’ of various clips, which it compiled into a reel and classified to give each a crowd reaction score. IBM used hundreds of hours of footage to extract crowd sounds, which it plans to make available to ESPN production teams that will serve it dynamically based on play. How natural these AI-generated sounds will be remains a question. Some fans have taken issue with the artificiality of noises produced by platforms like Electronic Arts’ Sounds of the Stands, which simulates crowd sounds using technology borrowed from the publishers’ FIFA game series. The NBA has reportedly considered mixing in audio from NBA 2K during its broadcasts, and the NFL is expected to use artificial fan noise for its live games this year if they’re played in empty stadiums.”
Whether or not fake crowd noise sounds authentic, do viewers really want to pretend there is a live crowd when there is not? Perhaps; the pandemic is affecting people in strange ways. There has even been a call to bring back canned laughter while it is too risky to gather live studio audiences for sitcoms.
The other technology IBM hopes will garner attention at the Open is Watson Discovery, which we’re told will facilitate tennis debates between online viewers by feeding them questions and researching the validity of resulting arguments. The same platform will supply factoids about upcoming matches through the smartphone app. It seems Watson is auditioning for the job of sport commentator.
Ouch! Was that Watson or the line judge? Watson? Watson?
Cynthia Murrell, September 10, 2020
Elastic: Making Improvements
August 27, 2020
Elasticsearch is one of the most popular open-source enterprise search platforms. While Elasticsearch is free for developers to download, Elastic offers subscriptions for customer support and enhanced software. Now the company offers some new capabilities and features, HostReview reveals in, “Elastic Announces a Single, Unified Agent and New Integrations to Bring Speed, Scale, and Simplicity to Users Everywhere.” The press release tells us:
“With this launch, portions of Elastic Workplace Search, part of the Elastic Enterprise Search solution, have been made available as part of the free Basic distribution tier, enabling organizations to build an intuitive internal search experience without impacting their bottom line. Customers can access additional enterprise features, such as single sign-on capabilities and enhanced support, through a paid subscription tier, or can deploy as a managed service on Elastic Cloud. This launch also marks the first major beta milestone for Elastic in delivering comprehensive endpoint security fully integrated into the Elastic Stack, under a unified agent. This includes malware prevention that is provided under the free distribution tier. Elastic users gain third-party validated malware prevention on-premises or in the cloud, on Windows and macOS systems, centrally managed and enabled with one click.”
The upgrades are available across the company’s enterprise search, observability, and security solutions as well as Elastic Stack and Elastic Cloud. (We noted Elastic’s welcome new emphasis on security last year.) See the write-up for the specific updates and features in each area. Elasticsearch underpins operations in thousands of organizations around the world, including the likes of Microsoft, the Mayo Clinic, NASA, and Wikipedia. Founded in 2012, Elastic is based in Silicon Valley. They also happen to be hiring for many locations as of this writing, with quite a few remote (“distributed”) positions available.
Cynthia Murrell, August 27, 2020
Amazon and Toyota: Tacoma Connects to AWS
August 20, 2020
This is just a very minor story. For most people, the information reported in “Toyota, Amazon Web Services Partner On Cloud-Connected Vehicle Data” will be irrelevant. The value of the data collected by the respective firms and their partners is trivial and will not have much impact. Furthermore, any data processed within Amazon’s streaming data marketplace and made available to some of the firm’s customers will be of questionable value. That’s why I am not immediately updating my Amazon reports to include the Toyota and insurance connection.
Now to the minor announcement:
Toyota will use AWS’ services to process and analyze data “to help Toyota engineers develop, deploy, and manage the next generation of data-driven mobility services for driver and passenger safety, security, comfort, and convenience in Toyota’s cloud-connected vehicles. The MSPF and its application programming interfaces (API) will enable Toyota to use connected vehicle data to improve vehicle design and development, as well as offer new services such as rideshare, full-service lease, proactive vehicle maintenance notifications and driving behavior-based insurance.”
Are there possible implications from this link up? Sure, but since few people care about Amazon’s commercial, financial, and governmental services, why think about issues like:
- Value of the data to the AWS streaming data marketplace
- Link analytics related to high risk individuals or fleet owners
- Significance of the real time data to predictive analytics, maybe to insurance carriers and others?
Nope, not much of a big deal at all. Who cares? Just mash that Buy Now button and move on. Curious about how Amazon ensures data integrity in such a system? If you are, you can purchase our 50-page report about Amazon’s advanced data security services. Just write darkcyber333 at yandex dot com.
But I know firsthand after two years of commentary that shopping is more fun than thinking about Amazon examined from a different viewshed.
Stephen E Arnold, August 20, 2020
IBM: A New PR Direction without Recipes and TV Game Shows?
August 18, 2020
IBM appears to be shifting its marketing in an interesting way. IBM announced its Power10 chips. Representative of the coverage is Forbes Magazine’s “IBM POWER10 Mega Chip For Hybrid Cloud Is Revealed.” The write up is not written by Forbes’ staff. The article is from an outfit called Tirias Research, a member of a contributor group. I am not sure what a contributor group is. The article seems like marketing speak to me, but you judge for yourself. Here’s a snippet:
To handle the ever more complex cloud workloads, the POWER10 improves capacity (socket throughput) and efficiency by about 3x over the POWER9. The energy efficiency gains were critical because IBM increased CPU core count over the POWER9 but kept the socket power roughly the same. All in all, the POWER10 [is a] big step forward for the architecture.
Next, I noticed write ups about IBM’s mainframe business. Navigate to “COBOL Still Handles 70% of Global Business Transactions.” The content, with its “analysis” and “news” about the next big future, strikes me as a recycling of IBM-prepared visuals.
Several observations:
- It was not that long ago that IBM was touting IBM Watson as capable of matching pets with potential owners. Now IBM is focusing on semiconductors and “workhorse” mainframes
- There are chips using technology more advanced than IBM’s 7 and 14 nanometer chips. Like Intel, IBM makes no reference to manufacturing techniques which may offer more advantages. That’s understandable. But three nanometer fabs are approaching, and IBM appears to be following, not leading.
- The cheerleading for hybrid clouds is different from cheerleading for “the cloud.” Has IBM decided that its future pivots on getting companies to build data centers and hire IBM to maintain them?
The craziness of the state unemployment agencies with COBOL based systems is fresh in my mind. For me, emphasizing the dependence of organizations upon COBOL is interesting. This statement caught my attention:
COBOL still handle [sic] more than 70% of the business transactions that take place in the world today.
Is this a good thing? Are Amazon, Microsoft, and Google embracing mainframes? My hunch is that companies are unable to shift from legacy systems. Inertia, not innovation, may be creating what some people seeking unemployment benefits from COBOL-centric systems perceive as a dysfunctional approach.
Net net: At least IBM is not talking about recipes created by Watson.
Stephen E Arnold, August 18, 2020
A Former Science Club Member Critiques Google, THE Science Club
August 17, 2020
I truly enjoy posts from former insiders at giant technology monopolies. Each of them hires from the other. The meta-revolving door spin is fascinating to watch. Some get tossed out of the mechanism because of some career negative factor: Family, health, mental orientation, or some other exogenous, non-technical event. Others go through a Scientological reformation and realize that the world of the high-technology nation-states is a weird place: Language, food customs, expectations of non-conformist “norm” behavior, and other cultural suckerfish. What gives me a chuckle are revelations like Tim Bray’s or Steve Yegge’s “Dear Google Cloud: Your Deprecation Policy is Killing You.” In my opinion, Mr. Yegge’s thoughtful, calm, and “in the moment” essay about the GOOG is more intriguing than the Financial Times’ story that reports Google has predicted the end of the world as we know it in Australia. What? Australia? Yep, for those receiving this warning from the Oracle at Mountain View, life amidst the kangaroos will mean no “free search” and — gasp! — curtains for “a dramatically worse” YouTube. Search can’t get much worse, so the YouTube threat means angry kids. Yikes! YouTube. Will Australia, a mere country, end up at the wrong end of a Google phaser strike?
Back to Mr. Yegge: In his essay, the phrase “deprecation treadmill” appears. This is the key insight. Googlers have to have something to do it seems. The bright science club members interact via an acceptable online service and make decisions informed by data. As Mr. Yegge points out, the data fueling insights may not be comprehensive or processed by some master intelligence. He notes that a Bigtable storage technology had been running for many years before any smart science club member or smart Google software noticed. (So much for attention to detail.)
Mr. Yegge points out:
One is that running a Bigtable was so inconsequential to Google’s scale that it took 2 years before anyone even noticed it, and even then, only because the version was old. As a point of comparison, I considered using Google Cloud Bigtable for my online game, but it cost (at the time) an estimated $16,000/year for an empty Bigtable on GCP. I’m not saying they’re gouging you, but in my own personal opinion, that feels like a lot of money for an empty [censored] database.
This paragraph underscores the lack of internal controls which operate in real time, although every two years could be considered near real time if one worked in a data center at Dialog Information Services in the mid 1980s. Today? Two years means a number of TikToks can come and go along with IPOs, unicorns, and Congressional hearings live streamed.
Mr. Yegge also uses a phrase I find delicious: “Deprecation treadmill.” The Google science club members use data (some old, some new, and some selected to support a lateral arabesque to a hotter team) to make changes. Examples range from the Dodgeball wackiness to change in cloud APIs which Mr. Yegge mentions in his essay. He notes:
Google engineers pride themselves on their software engineering discipline, and that’s actually what gets them into trouble. Pride is a trap for the unwary, and it has ensnared many a Google team into thinking that their decisions are always right, and that correctness (by some vague fuzzy definition) is more important than customer focus.
I wish to point out that Mr. Yegge is overlooking the key tenet of high school science club management methods: The science club is ALWAYS right. Since Mr. Yegge no longer works at the Google, Mr. Yegge is WRONG. Anyone who is not a right-now Googler is WRONG. Anyone who fails to understand the core of this mindset cannot work at the Google. Therefore, that individual is mentally unable to understand that Mother Google’s right-now brood is RIGHT. Australia and the European Union, for example, do not understand the logic of the Google. And they are obviously WRONG.
How simple this is.
Mr. Yegge points out how some activities are supposed to be carried out in the “real world” as opposed to the cultural norms of the techno-monopolies. He writes:
Successful long-lived open systems owe their success to building decades-long micro-communities around extensions/plugins, also known as a marketplace.
This is indeed amusing. Google delivers advertising, and that is a game within a casino hotel. I think the online advertising game run by the Google blends the best of a 1950s Las Vegas outfit on the Strip and the mythical Hotel California of the Eagles’ song. Compare my metaphor with Mr. Yegge’s. Which is more accurate?
Google’s pride in their software engineering hygiene is what gets them into trouble here. They don’t like it when there are lots of different ways to do the same thing, with older, less-desirable ways sitting alongside newer fancier ways. It increases the learning curve for newcomers to the system, it increases the burden of supporting the legacy APIs, it slows down new feature velocity, and the worst sin of all: it’s ugly. Google is like Lady Ascot in Tim Burton’s Alice in Wonderland:
Lady Ascot: Alice, do you know what I fear most?
Alice Kingsley: The decline of the aristocracy?
Lady Ascot: Ugly grandchildren.
Mr. Yegge’s point is a brilliant one: The Google wants its customers to operate like “here and now” Googlers. But customers do not understand and they are WRONG.
The disconnect between the Google and mere customers is nailed in this statement by Mr. Yegge:
But after all these years, Google Cloud is still #3
Yes, Google does make decisions based on data. Those decisions are RIGHT. If this seems like a paradox, it is obvious that the customer is once again proving that he or she is not capable of working for Google. Achieving third prize in the cloud race is RIGHT, at least to some real Googlers. For Mr. Yegge, the crummy third place ranking is evidence of the mismatch between the techno-monopoly and cloud users and, I might add, Australia and the EU.
Mr. Yegge points out what may be a hint of the tension between the Google science club and its “wanna be” members. He writes about a Percona-centric, ready-to-use solution. He calmly points out:
Go ahead, I dare you. Follow the link and click the button. Choose “yes” to get all the default parameters and deploy the cluster to your Google Cloud project. Haha, joke’s on you; it doesn’t work. None of that [censored] works. It’s never tested, starts bit-rotting the minute they roll it out, and it wouldn’t surprise me if over half the click-to-deploy “solutions” (now we understand the air quotes) don’t work at all. It’s a completely embarrassing dark alley that you don’t want to wander down. But Google is straight-up encouraging you to use it. They want you to buy it. It’s transactional for them. They don’t want to support anything.
DarkCyber looks forward to Mr. Yegge’s next essay about the Google. Perhaps he will tackle the logic of reporting an offensive advertisement to the online monopoly. That process helps one understand a non-deprecation method in use at the A Number One science club. The management method is breathtaking.
As Eugène Ionesco noted:
“Realism falls short of reality. It shrinks it, attenuates it, falsifies it; it does not take into account our basic truths and our fundamental obsessions: love, death, astonishment. It presents man in a reduced and estranged perspective. Truth is in our dreams, in the imagination.”
The “our” is Google’s reality. If you are not a “here and now” Googler, you cannot understand.
Stephen E Arnold, August 17, 2020
Amazon Alexa Is Sensitive
August 17, 2020
The gimmick behind digital assistants is that, with a simple vocal command to a smart speaker like the Amazon Echo, users have access to knowledge and can complete small tasks. Amazon is either very clever or very clumsy when it comes to its digital assistant Alexa. VentureBeat shares a recent study in “Researchers Identify Dozens Of Words That Accidentally Trigger Amazon Echo Speakers.”
The problem with digital assistants, beyond their recording conversations and storing them in the cloud, is who will have access to these conversations. LeakyPick is a platform that probes microphone-equipped devices and monitors network traffic for audio transmissions. It was developed by researchers at the University of Darmstadt, North Carolina State University, and the University of Paris-Saclay.
LeakyPick was designed to test when smart speakers are activated:
“LeakyPick — which the researchers claim is 94% accurate at detecting speech traffic — works for both devices that use a wakeword and those that don’t, like security cameras and smoke alarms. In the case of the former, it’s preconfigured to prefix probes with known wakewords and noises (e.g., “Alexa,” “Hey Google”), and on the network level, it looks for “bursting,” where microphone-enabled devices that don’t typically send much data cause increased network traffic. A statistical probing step serves to filter out cases where bursts result from non-audio transmissions.”
To identify words that could activate smart speakers, LeakyPick tests all the words in a phoneme dictionary. After testing the Echo Dot, HomePod, and Google Home smart speakers over fifty-two days, LeakyPick discovered that eighty-nine words, some phonetically quite different from “Alexa,” activated the Echo Dot.
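The network-level “bursting” idea quoted above is simple enough to illustrate. The function name, threshold, and sample traffic below are our own illustrative assumptions, not LeakyPick’s actual implementation:

```python
# Illustrative sketch of the "bursting" heuristic: a normally quiet,
# microphone-equipped device is flagged when its traffic jumps well above
# its typical rate. The 10x-median threshold is an invented placeholder.
from statistics import median

def detect_bursts(bytes_per_second, factor=10.0):
    """Return indices of seconds whose traffic exceeds `factor` times the median rate."""
    baseline = median(bytes_per_second)
    return [i for i, b in enumerate(bytes_per_second) if b > factor * baseline]

# A smart speaker idles at ~200 B/s, then streams audio after a trigger word.
traffic = [210, 195, 205, 198, 200, 9500, 11000, 10200, 205, 199]
print(detect_bursts(traffic))  # → [5, 6, 7]
```

The real system adds a statistical probing step to filter out bursts caused by non-audio transmissions, which this toy version omits.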
Amazon responded that it built privacy deep into Alexa and that smart speakers sometimes respond to words other than their wake words.
LeakyPick, however, does show potential for testing smart home devices and preventing privacy leaks.
Whitney Grace, August 17, 2020
Object Detection AI Offers Deal
August 13, 2020
Object detection AI projects are currently big in the technology community. AI developers are using large datasets to teach computers how to learn and reason like a human infant. It is hard to imagine that AI object detection software is available for consumers, but Product Hunt recently ranked Priceless AI as a number one product of the day.
Priceless AI describes itself as offering “‘all-you-can-eat’ image object detection at a fixed monthly price.” What is astonishing is how cheap Priceless AI is compared to its counterparts, AWS Rekognition and Google Cloud Vision. AWS Rekognition starts at $5/month for 10,000 monthly predictions and climbs all the way to $8,200 for 10,000,000 image predictions. Google Cloud Vision, on the other hand, starts at $20 and goes up to $18,700. These prices look outrageous next to Priceless AI’s flat $99 for any number of image predictions a month.
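To see where the flat $99 starts to win, here is a rough break-even sketch. It uses only the price points quoted above and assumes, purely for illustration, that per-image cost scales linearly between them (real AWS Rekognition pricing is tiered, so treat the result as ballpark):

```python
# Rough break-even sketch using only the price points quoted in the write up,
# ASSUMING cost scales linearly between them (a simplification; actual AWS
# Rekognition pricing is tiered per image).
def aws_cost(images, lo=(10_000, 5.0), hi=(10_000_000, 8_200.0)):
    """Interpolated monthly cost in dollars for `images` predictions."""
    (n0, c0), (n1, c1) = lo, hi
    return c0 + (c1 - c0) * (images - n0) / (n1 - n0)

FLAT = 99.0  # Priceless AI's quoted flat monthly rate

# Find the monthly volume beyond which the flat rate is cheaper.
n = 10_000
while aws_cost(n) < FLAT:
    n += 1_000
print(f"Flat $99 beats the interpolated AWS price above ~{n:,} images/month")
```

Under these assumptions the crossover lands around 125,000 predictions a month, which suggests why a flat rate appeals to heavy users.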
How can they do this?
“How do you offer such a cheaper alternative to AWS Rekognition and Google Cloud Vision?
We do clever low-level optimization allowing us to make a more efficient use of the hardware.”
Priceless AI allows its customers to purchase more than one concurrent request, and requests can be issued from as many clients/devices as desired. The devices/clients do need to coordinate, because only the purchased number of concurrent requests is allowed at any time.
Priceless AI wants to be as transparent as possible with customers, but they do keep some things secret:
“What model are you using to run object detection?
We can’t disclose the exact model, but we can tell you it’s a state-of-the-art deep convolutional neural network.”
Companies do need to try to keep their secrets.
Whitney Grace, August 13, 2020
Data Federation: K2View Seizes Lance, Mounts Horse, and Sallies Forth
August 13, 2020
DarkCyber noted “K2View Raises $28 million to Automate Enterprise Data Unification.”
Here’s the write up’s explanation of the K2View:
K2View’s “micro-database” Fabric technology connects virtually to sources (e.g., internet of things devices, big data warehouses and data lakes, web services, and cloud apps) to organize data around segments like customers, stores, transactions, and products while storing it in secure servers and exposing it to devices, apps, and services. A graphical interface and auto-discovery feature facilitate the creation of two-way connections between app data sources and databases via microservices, or loosely coupled software systems. K2View says it leverages in-memory technology to perform transformations and continually keep target databases up to date.
The write up contains a block diagram.
Observations:
- It is difficult to determine how much manual (human) work will be required to deal with content objects not recognized by the K2View system
- What happens if the Internet connection to a data source goes down?
- What is the fall back when a microservice is not available or removed from service?
Many organizations offer solutions to disparate types of data scattered across many systems. Perhaps K2View will slay the digital windmills of silos, different types of data, and unstable connections? Silos have been part of the data landscape as long as Don Quixote has been spearing windmills.
Stephen E Arnold, August 13, 2020
SlideShare: Some Work to Do
August 12, 2020
DarkCyber noted “Scribd Acquires Presentation Sharing Service SlideShare from LinkedIn.” In 2004, one could locate presentations on Google by searching for the extension ppt and its variants. In 2006, SlideShare became available. Then something happened. PowerPoints became more difficult to locate. When an online search pointed to a PowerPoint deck, the content was:
- Marketing fluff
- Incorrectly rendered with weird typography and wonky graphics
- Corrupted files.
What about today? DarkCyber’s most recent foray into the slide deck content wilderness produced zero useful results; for example, SlideShare search produced identical pages of search results. The query retrieved slide decks on unrelated topics. Even worse, a query would result in SlideShare’s sending email after email pointing to other slide decks. The one characteristic of these related slide decks was/is that they were unrelated to the information we sought.
There are online presentation services. There are open source presentation tools like SoftMaker’s. There is the venerable Keynote, which never quite converts a PowerPoint file correctly.
Is there a future in a searchable collection of slide decks? In theory, yes. In reality, the cost of finding, indexing, and making searchable presentations faces some big hurdles; for example:
- Many organizations — for example, DARPA — standardize on PDF file formats. These are okay, but indexing them can be an interesting challenge
- Some presenters put their talks in the cloud, hoping that an Internet connection will allow their slides to display
- The Zoom world puts PowerPoints and other presentation materials on the user’s computer, never to make it into a more publicly accessible repository.
Like the dream of collecting conferences, presentations, and poster sessions, some content remains beyond the reach of researchers and analysts. The desire to get anyone looking for a slide deck to subscribe to a service gives operators of this service a chance to engage in spreadsheet fever. Here’s how this works: if there are X researchers, and we get Y percent of them, we can charge Z per year. By substituting guesstimates for the variables, the service becomes a winner.
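The spreadsheet-fever arithmetic is simple enough to write down. All inputs below are hypothetical guesstimates, exactly as the scheme demands:

```python
# The X-researchers, Y-percent, Z-per-year arithmetic described above.
def projected_revenue(researchers: int, capture_rate: float, price: float) -> float:
    """X researchers * Y (as a fraction) captured * Z dollars per year."""
    return researchers * capture_rate * price

# Guesstimates only -- none of these numbers are real.
print(projected_revenue(1_000_000, 0.02, 99.0))  # a "winner" on paper
```

Plug in optimistic enough values and any such service pencils out, which is rather the point.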
The reality is that finding information in slide decks is more difficult today than it was in 2004. Access to information is becoming more difficult. DarkCyber would like to experience a SlideShare with useful content, more effective search and retrieval, and far less one page duplicates of ads for books.
Someday. Maybe?
Stephen E Arnold, August 12, 2020