Microsoft Partners Up for Smarter Security

May 13, 2021

I noted “Microsoft Partners with Darktrace to Help Customers Combat Cyber Threats with AI.” You may know that Microsoft has been the subject of some attention. No, I am not talking about Windows 10 updates which cause printers to become doorstops. Nope. I am not talking about the fate of a leaner, meaner version of Windows. Yep, I am making a reference to the SolarWinds’ misstep and the alleged manipulation of Microsoft Exchange Server to create a reprise of “waiting on line for fuel.” This was a popular side show in the Washington, DC, area in the mid-1970s.

How does Microsoft address its security PR challenge? There are white papers from Microsoft threat experts. There are meetings in DC ostensibly about JEDI but which may — just by happenstance — bring up the issue of security. No big deal, of course. And Microsoft forms new security-centric partnerships.

The partner mentioned in the write up is Darktrace. The company relies on technology somewhat related to the systems and methods packaged in the Autonomy content processing system. That technology included Bayesian methods, was at one time owned by Cambridge Neurodynamics, and licensed to Autonomy. (A summary of Autonomy is available at this link. The write up points out that Bayesian methods are centuries old and often criticized because humans have to set thresholds for some applications of the numerical recipes. Thus, outputs are not “objective” and can vary as the method iterates.) Darktrace’s origins are in Cambridge and some of the firm’s funding came from Michael Lynch-affiliated Invoke Capital. The firm’s Web page states:

Founded by celebrated technologist and entrepreneur, Dr Mike Lynch OBE, Invoke Capital founds, invests in and advises fast-growing fundamental technology companies in Europe. With deep expertise in identifying and commercializing artificial intelligence research and a close relationship with the University of Cambridge, Invoke exists to realize the commercial possibilities of Britain’s extraordinary science and deep technology base. Since 2012, Invoke has been instrumental in founding, creating and developing prominent technologies, and then finding the right teams to scale them into global businesses. Invoke’s companies include Darktrace, a world-leading cyber AI company that employs more than 1,500 people globally, Luminance, an award-winning machine learning platform for the legal industry, and AI fraud-detection engine, Featurespace. Invoke exited data-driven medicine experts, Sophia Genetics, in 2020.

(The Register provides a run down of some of the legal activity associated with Mr. Lynch at this link.)

The item presenting the tie up of Microsoft and Darktrace states:

Microsoft announced today a new partnership with Darktrace, a UK-based cyber security AI firm that works with customers to address threats using what it describes as “self-learning artificial intelligence”. Darktrace’s threat response system is designed to counter insider threats, espionage, supply chain attacks, phishing, and ransomware. The partnership between Microsoft and Darktrace is meant to give organizations an automated way of investigating threats across multiple platforms. Darktrace’s system works by learning the data within a specific environment as well as how users behave. The goal is to tell which activity is benign or malicious.

For more information about Darktrace, one can consult the firm’s Web site. For a different view, an entity with the handle OneWithCommonSense provides his/her assessment of the system. You can find that document (verified online on May 13, 2021) at this link.

Why is this interesting?

  1. The use of a system and method which may be related to how the Autonomy system operates may be an example of how one mathematical method can be extended to a different suite of use cases; specifically, cyber security.
  2. The Darktrace disclosures about its technology make it clear that the technology is in the category of “artificial intelligence” or what I call smart software. Systems and methods which are more efficient, economical, and more effective are reasons why smart software is an important product category to watch.
  3. Darktrace (to my knowledge) may have the capability to recognize and issue an alert about SolarWinds-type incursions. Other cyber security firms’ smart software dropped the ball and many were blindsided by the subsequent Microsoft Exchange Server and shell exploits.
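The earlier point about Bayesian methods, that a human has to set thresholds so the outputs are not “objective,” can be illustrated with a toy calculation. This is a minimal sketch under invented assumptions: the prior, the likelihoods, and both thresholds are hypothetical numbers, and nothing here describes how Darktrace or Autonomy actually work.

```python
def posterior_malicious(prior, p_signal_given_malicious, p_signal_given_benign):
    """Bayes' rule: P(malicious | signal observed)."""
    evidence = (p_signal_given_malicious * prior
                + p_signal_given_benign * (1.0 - prior))
    return p_signal_given_malicious * prior / evidence


def should_alert(posterior, threshold):
    """The threshold is a human choice; the math does not supply it."""
    return posterior >= threshold


# Hypothetical numbers: 1% of sessions are malicious; an odd signal shows up
# in 90% of malicious sessions and in 5% of benign ones.
p = posterior_malicious(prior=0.01,
                        p_signal_given_malicious=0.90,
                        p_signal_given_benign=0.05)

# Same posterior, two analysts, two thresholds, two different outputs.
cautious = should_alert(p, threshold=0.10)
relaxed = should_alert(p, threshold=0.50)
```

The posterior is identical in both calls; only the person picking the threshold changes whether an alert fires.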

As a side note, Microsoft acquired the Fast Search & Transfer company after there were legal inquiries into the company. That was a company based in Norway. With the Darktrace deal, Microsoft is again looking offshore for a solution to what on the surface seems to be the Achilles’ heel of the company’s product portfolio: Its operating system and related services.

Will Darktrace’s technology address the debilitating foot injury Microsoft has suffered? Worth watching because bad actors are having a field day with free ice cream as a result of the revelations related to Microsoft’s security engineering. Windows Defender may get an injection of a technology that caught Dr. Lynch’s eye. Quick is better in my opinion.

Stephen E Arnold, May 13, 2021

More Search Explaining: Will It Help an Employee Locate an Errant PowerPoint?

May 13, 2021

“Semantics, Ambiguity, and the role of Probability in NLU” is a search-and-retrieval explainer. After half a century of search explaining, one would think that the technology required to enter a keyword and get a list of documents in which the key word appears would be nailed down. Wrong.

“Search” in 2021 embraces many sub disciplines. These range from explicit index terms like the date of a document to more elusive tags like “sentiment” and “aboutness.” Boolean has been kicked to the curb. Users want to talk to search, at least to Alexa and smartphones. Users want smart software to deliver results without the user having to enter a query. When I worked at Booz, Allen & Hamilton, one of my colleagues (I think his name was Harvey Poppel, the smart person who coined the phrase “paperless office”) suggested that someday a smart system would know when a manager walked into his or her office. The smart software would display what the person needed to know for that day. The idea, I think, was that whilst drinking herbal tea, the smart person would read the smart outputs and be more smart when meeting with a client. That was in the late 1970s, and where are we? On Zooms and looking at smartphones. Search is an exercise in frustration, and I think that is why venture firms continue to pour money into ideas, methods, concepts, and demos which have been recycled many times.

I once reproduced a chunk of Autonomy’s marketing collateral in a slide in one of my presentations. I asked those in the audience to guess at what company wrote the text snippet. There were many suggestions, but none was Autonomy. I doubt that today’s search experts are familiar with the lingo of search vendors like Endeca, Verity, InQuire, et al. That’s too bad because the prose used to describe those systems could be recycled with little or no editing for today’s search system prospects.

The write up in question is serious. The author penned the report late last year, but Medium emailed me a link to it a day ago along with a “begging for dollars” plea. Ah, modern online blogs. Works of art indeed.

The article covers these topics as part of the “search” explainer:

  • Ambiguity
  • Understanding
  • Probability

Ambiguity is interesting. One example is a search for the word “terminal.” Does the person submitting the query want information about a computer terminal, a bus terminal, or some other type of terminal; for instance the post terminal on the transformer to my model train set circa 1951? Smart software struggles with this type of ambiguity. I want to point out that a subject matter expert can assign a “field code” to the term and eliminate the ambiguity, but SMEs are expensive and they lose their index precision capability as the work day progresses.
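The field code idea can be sketched in a few lines. This is an illustration only: the three documents, the field names, and the helper functions are hypothetical, and the sketch assumes a human indexer has already assigned one field code per document.

```python
from collections import defaultdict

# Hypothetical documents; "field" is the code a human indexer assigned.
DOCS = [
    {"id": 1, "field": "computing", "text": "Configure the terminal emulator"},
    {"id": 2, "field": "transport", "text": "The bus terminal opens at dawn"},
    {"id": 3, "field": "electrical", "text": "Attach the wire to the post terminal"},
]


def build_index(docs):
    """Map (field code, lowercased word) -> list of document ids."""
    index = defaultdict(list)
    for doc in docs:
        for word in set(doc["text"].lower().split()):
            index[(doc["field"], word)].append(doc["id"])
    return index


def search(index, word, field=None):
    """Without a field code, every sense of the word matches."""
    word = word.lower()
    if field is not None:
        return sorted(index.get((field, word), []))
    return sorted(doc_id
                  for (f, w), ids in index.items() if w == word
                  for doc_id in ids)


index = build_index(DOCS)
all_senses = search(index, "terminal")                    # ambiguous query
bus_only = search(index, "terminal", field="transport")   # field-coded query
```

The bare query returns all three documents; adding the field code narrows the result to the bus terminal. The cost sits outside the code: someone has to assign those field values accurately, all day long.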

To deal with the “terminal” example, the modern system has to understand [a] what the user wants and [b] what the content objects are about. Yep, aboutness. Today’s smart software does an okay job with technical text because jargon like Octanitrocubane allows relatively on point identification of a document relevant to a chemist in Columbus, Ohio. Toss in a chemical structure diagram, and the precision of the aboutness ticks up a notch. However, if you search for a word replete with social justice meaning, smart software often has a difficult time figuring out the aboutness. One example is a reference to Skokie, Illinois. Is that a radical right wing code word or a town loved for Potawatomi linguistic heritage?

Probability is a bit more specific, usually. The idea in search is that numbers can illuminate some of the dark corners of text’s meaning. Examples are plentiful. Curious about Miley Cyrus on SNL and then at the after party? The search engine will display the most probable content based on whatever data is sluiced through the query matcher and stored in a cache. If others looked at specific articles, then, by golly, a query about Miley is likely or highly probable to be just what the searcher wanted. The difference between ambiguity, understanding, and probability is, in my opinion, part of the problem search vendors face. No one can explain why, after 50 years of SMART, and Personal Library Software, STAIRS, et al, finding on point information remains frustrating, expensive, and ineffective.
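The “if others looked at it, it must be what you want” logic can be sketched as a ranking by share of past clicks. This is a minimal sketch, assuming a made-up click log; the document names and the function are hypothetical, and real engines blend many more signals.

```python
from collections import Counter

# Hypothetical click log: (query, document clicked by an earlier searcher).
CLICK_LOG = [
    ("miley cyrus", "snl-recap"),
    ("miley cyrus", "snl-recap"),
    ("miley cyrus", "after-party-photos"),
    ("miley cyrus", "snl-recap"),
    ("miley cyrus", "tour-dates"),
]


def rank_by_click_share(click_log, query):
    """Return (document, share of past clicks) pairs, most clicked first."""
    clicks = Counter(doc for q, doc in click_log if q == query)
    total = sum(clicks.values())
    if total == 0:
        return []  # no history for this query, so nothing is "probable"
    return [(doc, count / total) for doc, count in clicks.most_common()]


ranking = rank_by_click_share(CLICK_LOG, "miley cyrus")
```

Note what the sketch does not do: it never looks at what the searcher meant. A query with no click history gets an empty list, and a searcher with an unusual need gets what the crowd wanted.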

The write up states:

ambiguity was not invented to create uncertainty — it was invented as a genius compression technique for effective communication. And it works like magic, because on the receiving end of the message, there is a genius decoding and decompression technique/algorithm to uncover all that was not said to get at the intended thought behind the message. Now we know very well how we compress our thoughts into a message using a genius encoding scheme, let us now concentrate on finding that genius decoding scheme — a task that we all call now ‘natural language understanding’.

Sounds great. Now try this test. You have a recollection of viewing a PowerPoint a couple of weeks ago at an offsite. You know who the speaker was, and you want the slide with the number of instant messages sent per day on WhatsApp. How do you find that data?

[a] Run a query on your Fabasoft, SearchUnify, or Yext system?

[b] Run a query on Google in the hopes that the GOOG will point you to Statista, a company you believe will have the data?

[c] Send an email to the speaker?

[d] All of the above.

I would just send the speaker a text message and hope for an answer. If today’s search systems were smart, wouldn’t the single PowerPoint slide be in my email anyway? Sure, someday.

Stephen E Arnold, May 13, 2021

Amazing Moments in Cyber Security: The SolarWinds Awards

May 5, 2021

Believe it or not.

In a gem of an understatement, SolarWinds’ Sojung Lee called 2020 a “challenging year.” Lee made this assessment at his company’s recent APJ Q2 Virtual Partner Briefing where, as ChannelLife reports, “SolarWinds Celebrates Channel Partners in APJ Channel Awards.” Yes, that company gives out awards. We’re told:

“The awards recognize SolarWinds’ partners and distributors for their achievements in delivering services and expertise to customers. SolarWinds Asia Pacific and Japan vice president sales, Sojung Lee, says that 2020 was a challenging year but SolarWinds partners remained resilient.”

Resilient—yes, they would have to be. Readers can navigate to the brief write-up for the list of recipients, if curious. We just find it remarkable this list even exists at this point in time. What about these “winners’” security? We don’t know and maybe SolarWinds does not either. Sales, not security, could be job one.

Cynthia Murrell, May 5, 2021

Content Bot Employs Powerful OpenAI Tech for Marketing Content

May 4, 2021

Great news. Now companies can launch spam fiestas, no humans required, for as little as $29 per month. Content Bot offers tools to generate persuasive marketing copy, powerful slogans, smooth landing pages, improved blog posts, and even something it calls “automated inspiration.” The site promises “human-like text.” If that sounds familiar, it should—the site’s developers must have scored one of the coveted OpenAI beta API slots, as its FAQ reveals:

“We make use of a variety of AI models, with the main model being GPT-3 by OpenAI. GPT-3, or Generative Pre-trained Transformer 3 is an autoregressive language model which uses deep learning to produce human-like text. It’s a game changer for content creators.”

Indeed it is. This is exactly the sort of thing we expected to see when OpenAI began releasing its API to a select few last year. Well, one of the things—the fake news will likely be less publicized. Content Bot’s FAQ also specifies:

“95% of the content generated by the AI is unique and original. We also provide a uniqueness score for longer form content generated so you can have peace of mind to know that the content you have received is unique.”

So one must trust their metric to verify their promise of uniqueness. Interesting. We also learn the platform is relying on Google Translate as a stopgap measure:

“We currently support all languages supported by Google Translate. We understand that although Google Translate may not be the best translation for your needs, we are currently exploring other options such as IBM Watson and OpenAI to provide better, or multiple translations at once.”

Will the price go up if they find a better translation option? The service currently costs $29 per month for the basic version, $79 for the one geared toward agencies—the latter generates three times as many blog posts and supplies a “paraphrase rewriter.” There is a free trial available, and non-profits are invited to write in for a discount. It was no surprise to learn Content Bot workers are fully remote, but the company maintains licenses and operating addresses in Florida and in South Africa. Who will be next to launch a product based on GPT-3?

Cynthia Murrell, May 4, 2021

You Know about CDPs, Right? Good, These Are the In Thing

April 28, 2021

A CDP is a customer data platform. The jargon embraces the idea of a customer list, information about those who are spending money and who might spend money, and the myriad software utilities which are desperate to reimagine themselves as digital tigers, not kitty cats.

Years ago, Vivisimo or another IBM entity came up with the V idea. Big Data is fast which became velocity. Big data became big which morphed into volume. Big Data is a confection which became variety. But three Vs were not enough. Former physical training majors combined with some art history grads and added value and veracity. Data quality became veracity. Yeah, got it.

Now the sales crowd is in the game with CDPs.

ZDNet discusses “The Five Vs of Customer Data Platforms.” Quite a coincidence to me. For the ZD professionals, this overlap with the Vs of Big Data is logical, possibly brilliant.

Research from CRM platform vendor Salesforce indicates customer data platforms were a high-priority investment for marketing executives last year. To describe the unique challenges of wrangling data from multiple channels for these systems, Writer Vala Afshar quotes from the book Customer Data Platforms by Martin Kihn and Salesforce’s Chris O’Hara:

“When it comes to marketing, customers expect the interactions they have on a company’s website to translate to their mobile app experiences and even in-store visits. The problem is that, for most companies, those environments operate off of different datasets—even though the customer is the same. Customers also expect their experiences as they move from channel to channel to be consistent, and ‘in the moment.’ Most customer journeys involve over three different channels (e.g., email, web, and mobile app), and customers tend to move seamlessly and quickly between these channels. Most companies, however, don’t have these data environments connected in real-time. The result is disconnected experiences for consumers and the lack of a single source of truth about customers for the marketer.”

Benefits of gathering all this data and putting it at the fingertips of customer service include more personalized interactions and the ability to prioritize calls from the most loyal buyers, for example. We learn:

“The good news is that this is happening today. Large enterprises with sophisticated IT departments, in-house developers, and large software budgets are connecting these systems to create such results. The bad news is that it’s very expensive, requires constant vigilance and development to keep it working, and it’s dependent upon licensing solutions from dozens of software vendors for data ingestion to data activation, and everything in between.”

This is where Afshar’s version of the five Vs comes in. For more insight into original thinking, please, see the write-up for his suggestions on how to use those as guidelines for more efficiently creating a comprehensive customer data platform.

Cynthia Murrell, April 28, 2021

Google Ad Auctions: An Interesting Phrase

April 16, 2021

I am no legal eagle. I don’t do online advertising. I don’t care too much about the antics of art history majors laboring in the vineyards of pay to play.

I did read “When Google’s Fancy Lawyers Screw Up and Jeopardize Sheryl Sandberg, at $1500/Hour.”

Here’s the passage I found suggestive:

… now we know a few important new details about the Texas adtech case. This case includes an allegation that Google’s large online advertising marketplace – think stock market but instead of stocks they trade ad slots – is riddled with secret rigged auctions.

And what, pray tell, was the phrase which snagged my attention? Here she be:

secret rigged auctions

Why is this interesting?

  1. The phrase resonates with me because “secrecy” and algorithmic, objective methods are trustworthy. But then there’s the word “rigged.” Yikes!
  2. The assertion backed by legal documents with faulty redaction makes clear that getting useful information is a pretty complicated process. Who did what, when, and why?
  3. Online advertising is big money for some outfits. Could the information widen cracks in the digital foundation?

Interesting.

Stephen E Arnold, April 16, 2021

Checklist of Shady Digital Marketing Tactics

April 13, 2021

I think the author of “The Problem With Digital Marketing” wanted to make a positive contribution to the art and science of paying to get attention. The write up identifies four categories of marketing wizards which may cast a shadow over the well intentioned efforts of companies desperate for revenue.

The four buckets of bad things are:

  1. Gunning for a quick payoff
  2. Thinking about money now
  3. Shady search engine optimization methods
  4. Unprofessional behavior or what I call MBA ethical practices.

These four groups of activities are interesting for three reasons. First, the mixture of big things like the lack of an ethical command center and tiny things like using Dark Patterns to snooker a Web site visitor into spending money when the user thought he/she was NOT making a purchase is jarring.

The lack of the ethics thing opens the door to many activities not included in the three other buckets; for example, apps which are designed to snag a user’s financial information or the use of email to lure the recipient into divulging access credentials.

Items one and two are essentially the fabric of anyone who has bills to pay, a habit to feed, or a keen desire to ride to the bank in a new Bronco with an M1 MacBook under his/her arm.

Item three is actually the focal point of the write up. If an entity is not in the Google and easily findable by those with a limited vocabulary, that entity does not exist. The same need for findability applies to tweet things, Facebook craziness, and even the hopelessly weird Microsoft LinkedIn.

Distorting relevance, using assorted tricks like buying backlinks from clueless Web site owners, and dabbling in the sale of endorsements from YouTube influencers are probably not helpful to someone looking for an objective results list in response to a query.

So what do I make of this write up?

First, it makes clear that SEO is the way to go.

Second, the use of Dark Patterns or closely allied methods works and often works quite well.

Third, payoffs come when ethics are kicked into the trash surrounding the youth soccer field and email (phishing), apps (vectors of malware), and rhetorical tricks are used. The problem with digital content is a combination of tricks and bad content.

What works is buying Google online ads or becoming famous on YouTube or TikTok. Twitter is a minnow compared to the Google thing.

Stephen E Arnold, April 12, 2021

The Alphabet Google YouTube Thing Explains Good Old Outcome Centered Design

April 8, 2021

If you have tried to locate information on a Google Map, you know what good design is, right? What about trying to navigate the YouTube upload interface to add or delete a “channel”? Perfection, okay. What if you have discovered an AMP error email and tried to figure out how a static Web site generated by an AMP approved “partner” can be producing a single flawed Web page? Intuitive and helpful, don’t you think?

Truth is: Google Maps are almost impossible to use regardless of device. The YouTube interface is just weird and better for a 10-year-old video game player than a person over 30, and the AMP messages? Just stupid.

I read “Waymo’s 7 Principles of Outcome-Centered Design Are What Your Product Needs” and thought I stumbled upon a listicle crafted by Stephen Colbert and Jo Koy in the O’Hare Airport’s Jazz Bar.

Waymo (so named because one gets way more with Alphabet Google YouTube, hereinafter AGYT, technology) is managed by co-CEOs. It is semi famous for hiring uber engineer Anthony Levandowski. Plus the company has been beavering away to make driving down 101 semi fun since 2009. The good news is that Waymo seems to be making more headway than the Google team trying to solve death. The Wikipedia entry for Waymo documents 12 collisions, but the exact number of smart errors by the Alphabet Google YouTube software is not known even to some Googlers. Need to know, you know.

What are the rules for outcome centered design; that is, ads but no crashes, I presume? The write up presents seven. Here are three and you can let your Chrome browser steer you to the full list. Don’t run into the Tesla Web site either, please.

Principle 2. Create focus by clarifying your purpose.

Okay, focus. Let’s see. When riding in a vehicle with no human in charge, the idea is to avoid a crash. What about filtering YouTube for okay content? Well, that only works some of the time. The Waymo crashes appear to underscore the fuzz in the statistical routines.

And Principle 4. Clue in to your customer’s context.

Yep, a vehicle which knows one’s browsing history and has access to nifty profiles with probabilities can just get going. Forget what the humanoid may want. Alphabet Google YouTube is ahead of the humanoid. Sometimes. The AGYT approach is to trim down what the humanoid wants to three options. Close enough for horse shoes. Waymo, like Alphabet Google YouTube, knows best. Just like a digital mistress. The humanoid, however, is going to a previously unvisited location. Another humanoid told the rider face to face about an emergency. The AGYT system cannot figure out context. Not to worry. Those AGYT interfaces will make everything really easy. One can talk to the Waymo equipped smart vehicle. Just speak clearly, slowly, and in a language which Waymo parses in an acceptable manner. Bororo won’t work.

Finally, Principle 7: Edit edit edit.

I think this means revisions. Those are a great idea. Alphabet Google YouTube does an outstanding job with dots, hamburger menus, and breezy writing in low contrast colors. Oh, content? If you don’t get it, you are not Googley. Speak up and you may get the Timnit treatment or the Congressional obfuscation rhetoric. I also like ignoring the antics of senior managers.

Yep, outcome centered. Great stuff. Were Messrs. Colbert and Koy imbibing something other than Sprite at the airport when possibly conjuring this list of really good tips? What’s the outcome? How about ads displayed to passengers in Waymo infused vehicles? Context centered, relevant, and a feature one cannot turn off.

Stephen E Arnold, April 8, 2021

HPE Machine Learning: A Benefit of the Autonomy Tech?

April 8, 2021

This sounds like an optimal solution from HPE (formerly known as HP); too bad it was not available back when the company evaluated the purchase of Autonomy. Network World reports, “HPE Debuts New Opportunity Engine for Fast AI Insights.” The machine-learning platform is called the Software Defined Opportunity Engine, or SDOE. It is based in the cloud, and will greatly reduce the time it takes to create custom sales proposals for HPE channel partners and customers. Citing a blog post from HPE’s Tom Black, writer Andy Patrizio explains:

“It takes a snapshot of the customer’s workloads, configuration, and usage patterns to generate a quote for the best solution for the customer in under a minute. The old method required multiple visits by resellers or HPE itself to take an inventory and gather usage data on the equipment before finally coming back with an offer. That meant weeks. SDOE uses HPE InfoSight, HPE’s database which collects system and use information from HPE’s customer installed base to automatically remediate infrastructure issues. InfoSight is primarily for technical support scenarios. Started in 2010, InfoSight has collected 1,250 trillion data points in a data lake that has been built up from HPE customers. Now HPE is using it to move beyond technical support to rapid sales prep.”

The write-up describes Black’s ah-ha moment when he realized that data could be used for this new purpose. The algorithm-drafted proposals are legally binding—HPE must have a lot of confidence in the system’s accuracy. Besides HPE’s existing database and servers, the process relies on the assessment tool recently acquired when the company snapped up CloudPhysics. We learn that the tool:

“… analyzes on-premises IT environments much in the same way as InfoSight but covers all of the competition as well. It then makes recommendations for cloud migrations, application modernization and infrastructure. The CloudPhysics data lake—which includes more than 200 trillion data samples from more than one million virtual machines—combined with HPE’s InfoSight can provide a fuller picture of their IT infrastructure and not just their HPE gear.”

As of now, SDOE is only for storage systems, but we are told that could change down the road. Black, however, was circumspect on the details.

Cynthia Murrell, April 8, 2021

Alphabet Google YouTube: We Are Doing Darned Good Work

April 7, 2021

I read a peculiar item of information about the mom-and-pop outfit Alphabet Google YouTube. You may have a different reaction to the allegedly accurate data. Just navigate to “YouTube Claims It’s Getting Better at Enforcing Its Own Moderation Rules.” The “real news” story reports:

In the final months of 2020, up to 18 out of every 10,000 views on YouTube were on videos that violate the company’s policies and should have been removed before anyone watched them. That’s down from 72 out of every 10,000 views in the fourth quarter of 2017, when YouTube started tracking the figure.

Apparently the mom-and-pop outfit calculates a “violative view rate.” This is a metric possible only if a free video service accepts, indexes, and makes available “videos that contain graphic violence, scams, or hate speech.”
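The arithmetic behind the two reported figures is easy to check. The sketch below uses only the numbers quoted in the write up, except for the daily view count, which is a hypothetical round number chosen for illustration and is not a YouTube statistic.

```python
def violative_view_rate(violative_views, total_views):
    """Share of views that landed on policy-violating videos."""
    return violative_views / total_views


# Figures quoted in the write up (the 2020 number is an upper bound).
rate_2017_q4 = violative_view_rate(72, 10_000)
rate_2020_q4 = violative_view_rate(18, 10_000)

# Relative drop between the two reported figures.
relative_drop = (rate_2017_q4 - rate_2020_q4) / rate_2017_q4

# Hypothetical scale, NOT a YouTube figure: one billion views in a day.
ASSUMED_DAILY_VIEWS = 1_000_000_000
violative_views_per_day = round(rate_2020_q4 * ASSUMED_DAILY_VIEWS)
```

The rate falls from 0.72 percent to 0.18 percent, a 75 percent relative drop. Under the assumed one billion daily views, 0.18 percent would still be 1.8 million violative views per day, which is why a small rate and a small problem are not the same thing.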

About the system, the write up reports:

YouTube’s team uses the figure internally to understand how well they’re doing at keeping users safe from troubling content. If it’s going up, YouTube can try to figure out what types of videos are slipping through and prioritize developing its machine learning to catch them.

A few questions spring to mind:

  • What specifically is “violative” content? An interview I conducted with a former CIA operative was removed a year after the interview appeared as a segment in my 10 to 15 minute twice monthly video news program. An interview with a retired spy was deemed violative. I hope YouTube learned something from this take down. I remain puzzled.
  • How does content depicting graphic violence, scams, and hate speech get on the YouTube system? After I upload a video, a message appears to tell me if the video is okay or not okay. I think Google’s system is getting better from the mom-and-pop outfit’s point of view. From other points of view? I am not sure.
  • Why trust metrics generated within the Alphabet Google YouTube outfit? By definition, the data collection methods, the sample, and the techniques used to identify what’s important are not revealed. FAANG-type outfits are not exactly the gold standard in ethical behavior for some people. I, of course, believe everything I read online like transcripts of senior executives’ remarks to Congressional committees.
  • Why release these data now? What’s the point? Apple is tossing cores at Facebook. Alphabet Google YouTube is reminding some that Microsoft’s security is interesting. Amazon wants to pay tax. Maybe these actions and the violative metric are PR.

The write up contains charts. Low contrast colors show just how much better Alphabet Google YouTube is getting in the violative content game. I love the violative view rate phrase. Delicious.

Stephen E Arnold, April 7, 2021
