Semantic Search for arXiv Papers

January 12, 2023

An artificial intelligence research engineer named Tom Tumiel (InstaDeep) created a Web site called arXivxplorer.com.

According to his Twitter message (posted on January 7, 2023), the system is a “semantic search engine.” The service uses OpenAI’s embedding model. The idea is that this search method allows a user to “find the most relevant papers.” There is a stream of tweets at this link about the service. Mr. Tumiel states:

I’ve even discovered a few interesting papers I hadn’t seen before using traditional search tools like Google or arXiv’s own search function or even from the ML twitter hive mind… One can search for similar or “more like this” papers by “pasting the arXiv url directly” in the search box or “click the More Like This” button.
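To make the idea concrete, here is a minimal sketch of how an embedding-based search and a “more like this” lookup can work. It is not Mr. Tumiel’s code. The three-number vectors, the paper titles, and the function names `cosine`, `rank`, and `more_like_this` are all made up for illustration; a real service would get much longer vectors from an embedding model.

```python
import math

# Hypothetical toy corpus: each paper carries a hand-made "embedding".
PAPERS = [
    {"title": "Eigenvector centrality for web ranking", "vec": [0.9, 0.1, 0.0]},
    {"title": "Protein folding with transformers", "vec": [0.0, 0.2, 0.9]},
    {"title": "Spectral methods for link analysis", "vec": [0.8, 0.3, 0.1]},
]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank(query_vec, papers, top_k=3):
    """Return the titles of the top_k papers most similar to query_vec."""
    scored = [(cosine(query_vec, p["vec"]), p["title"]) for p in papers]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [title for _, title in scored[:top_k]]

def more_like_this(title, papers, top_k=2):
    """Use one paper's own vector as the query, excluding the paper itself."""
    seed = next(p for p in papers if p["title"] == title)
    others = [p for p in papers if p["title"] != title]
    return rank(seed["vec"], others, top_k)
```

The point of the sketch: the query and the documents live in the same vector space, so “more like this” is the same ranking step with a document’s vector standing in for the query.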

I ran several test queries, including this one: “Google Eigenvector.” The system surfaced generally useful papers, including one from January 2022. However, when I included the date 2023 in the search string, arXiv Xplorer did not return a null set. The system displayed hits which did not include the date.

Several quick observations:

  1. The system seems to be “time blind,” which is a common feature of modern search systems
  2. The system provides the abstract when one clicks on a link. The “view” button in the pop up displays the PDF
  3. Adding search terms to the query reduces the result set size, a refreshing change from queries which display “infinite scrolling” of irrelevant documents.
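The “time blind” behavior could be addressed with a simple date control layered on top of the ranked results. The sketch below is an assumption about how such a control might look, not a description of arXiv Xplorer. The helper names `split_year` and `filter_by_year` and the `year` field on each hit are hypothetical.

```python
import re

def split_year(query):
    """Pull a four-digit year (19xx or 20xx) out of the query, if present.

    Returns (query_without_year, year_or_None). This is a naive pattern:
    it would also fire on other four-digit tokens that look like years.
    """
    match = re.search(r"\b(?:19|20)\d{2}\b", query)
    if not match:
        return query.strip(), None
    year = int(match.group(0))
    rest = query[:match.start()] + query[match.end():]
    return " ".join(rest.split()), year

def filter_by_year(hits, year):
    """Keep only hits published in the requested year; no year means no filter."""
    if year is None:
        return hits
    return [h for h in hits if h["year"] == year]
```

With a filter like this, a query for “Google Eigenvector 2023” against a corpus whose only matching paper is from 2022 would return an empty list instead of undated hits.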

For those interested in academic or research papers, will OpenAI become aware of the value of dates, limiting queries to endnotes, and displaying a relationship map among topics or authors in a manner similar to Maltego? By combining more search controls with the OpenAI content and query processing, the service might leapfrog the Lucene/Solr type methods. I think that would be a good thing.

Will the implementation of this system add to Google’s search anxiety? My hunch is that Google is not sure what causes the Google system to perturbate. It may well be that the twitching, the sudden changes in direction, and the coverage of OpenAI itself in blogs may be the equivalent of tremors, soft speaking, and managerial dizziness. Oh, my, that sounds serious.

Stephen E Arnold, January 12, 2023

Spammers, Propagandists, and Phishers Rejoice: ChatGPT Is Here

January 12, 2023

AI-generated art is already receiving tons of backlash from the artistic community, and now writers should tread lightly because, according to No Film School, “You Will Be Impacted By AI Writing…Here Is How.” Hollywood is not a friendly place, but it is certainly weird. Scriptwriters deal with all personalities, especially bad actors, who comment on their work. Now AI algorithms will offer notes on their scripts too.

ChatGPT is a new AI tool that blurs the line between art and aggregation because it can “help” scriptwriters with their work, aka make writers obsolete:

“ChatGPT, and programs like it, scan the internet to help people write different prompts. And we’re seeing it begin to be employed by Hollywood as well. Over the last few days, people have gone viral on Twitter asking the AI interface to write one-act plays based on sentences you type in, as well as answer questions….This is what the program spat back out at me:

‘There is concern among some writers and directors in Hollywood that the use of AI in the entertainment industry could lead to the creation of content that is indistinguishable from human-generated content. This could potentially lead to the loss of jobs for writers and directors, as AI algorithms could be used to automate the process of creating content. Additionally, there is concern that the use of AI in Hollywood could result in the creation of content that is formulaic and lacks the creativity and uniqueness that is typically associated with human-generated content.’”

Egads, that is some good copy! AI automation, however, lacks the spontaneity of human creativity. But the machine generated prose is good enough for spammers, propagandists, phishers, and college students.

Humans are still needed to break the formulaic status quo, but Hollywood bigwigs only see dollar signs and not art. AI creates laughable stories, but they are getting better all the time. AI could and might automate the industry, but the human factor is still needed. The bigger question is: How will humanity’s role change in entertainment?

Whitney Grace, January 12, 2023

Internet Archive Scholar: Will Publishers Find a Way to Stomp This Free Knowledge Beast?

January 12, 2023

Here is a new search service worth noting. The Internet Archive Scholar was built to search the extensive, non-profit Internet Archive. The tool introduces itself:

“This full text search index includes over 25 million research articles and other scholarly documents preserved in the Internet Archive. The collection spans from digitized copies of eighteenth century journals through the latest Open Access conference proceedings and pre-prints crawled from the World Wide Web.”

Yes, that is a lot of information and a dedicated search system is a welcome addition. If only it were easier to find what one is looking for; the search leaves some on the Arnold IT team wanting more functionality. But the service is young, and the page notes that “Metadata is being improved and features have not been finalized.”

The About page tells us more about how the tool works, where the metadata comes from (fatcat.wiki), and where to direct certain queries. It also addresses the issue of text and data mining:

“We intend to provide researcher access to the full corpus for text and data mining purposes. Derived datasets may also be posted publicly for analysis, for example a citation graph or N-gram frequencies by year. If you are interested or would like to see specific datasets made available, please contact us.

Currently snapshots of the full fatcat metadata corpus and upstream metadata sources are uploaded periodically to the Bulk Bibliographic Metadata collection on archive.org. Read more in the Fatcat Guide.”

We look forward to seeing what functionality improvements the team implements as the Scholar is developed further. Readers may want to check it out for themselves and/or bookmark the site for future use. We are also curious about publishers’ reactions.

Cynthia Murrell, January 12, 2023

FAA Software: Good Enough?

January 11, 2023

Is today’s software good enough? For many, the answer is, “Absolutely.” I read “The FAA Grounded Every Single Domestic Flight in the U.S. While It Fixed Its Computers.” The article states what many people in affected airports know:

The FAA alerted the public to a problem with the system at 6:29 a.m. ET on Twitter and announced that it had grounded flights at 7:19 a.m. ET. While the agency didn’t provide details on what had gone wrong with the system, known as NOTAM, Reuters reported that it had apparently stopped processing updated information. As explained by the FAA, pilots use the NOTAM system before they take off to learn about “closed runways, equipment outages, and other potential hazards along a flight route or at a location that could affect the flight.” As of 8:05 a.m. ET, there were 3,578 delays within, out, and into the U.S., according to flight-tracking website FlightAware.

NOTAM, for those not into government speak, means “Notice to Air Missions.”

Let’s go back in history. In the 1990s I think I was on the Board of the National Technical Information Service. One of our meetings was in a facility shared with the FAA. I wanted to move my rental car from the direct sunlight to a portion of the parking lot which would be shaded. I left the NTIS meeting, moved my vehicle, and entered through a side door. Guess what? I still remember my surprise when I was not asked for my admission key card. The door just opened and I was in an area which housed some FAA computer systems. I opened one of those doors and poked my nose in and saw no one. I shut the door, made sure it was locked, and returned to the NTIS meeting.

I recall thinking, “I hope these folks do software better than they do security.”

Today’s (January 11, 2023) FAA story reminded me that security procedures provide a glimpse to such technical aspects of a government agency as software. I had an engagement for the blue chip consulting firm for which I worked in the 1970s and early 1980s to observe air traffic control procedures and systems at one of the busy US airports. I noticed that incoming aircraft were monitored by printing out tail numbers and details of the flight, using a rubber band to affix these data to wooden blocks which were stacked in a holder on the air traffic control tower’s wall. A controller knew the next flight to handle by taking the bottom most block, using the data, and putting the unused block back in a box on a table near the bowl of antacid tablets.

I recall that discussions were held about upgrading certain US government systems; for example, the IRS and the FAA computer systems. I am not sure if these systems were upgraded. My hunch is that legacy machines are still chugging along in facilities which hopefully are more secure than the door to the building referenced above.

My point is that “good enough” or “close enough for government work” is not a new concept. Many administrations have tried to address legacy systems and their propensity to [a] fail like the Social Security Agency’s mainframe to Web system, [b] not work as advertised; that is, output data that just doesn’t jibe with other records of certain activities (sorry, I am not comfortable naming that agency), or [c] be unstable because funds for training staff, money for qualified contractors, or investments in infrastructure to keep the as-is systems working in an acceptable manner are lacking.

I think someone other than a 78 year old should be thinking about the issue of technology infrastructure that, unlike Southwest Airlines’ systems or the FAA’s system, does not fail.

Why are these core systems failing? Here’s my list of thoughts. Note: Some of these will make anyone between 45 and 23 unhappy. Here goes:

  1. The people running agencies and their technology units don’t know what to do
  2. The consultants hired to do the work agency personnel should do don’t deliver top quality work. The objective may be a scope change or a new contract, not a healthy system
  3. The programmers don’t know what to do with IBM-type mainframe systems or other legacy hardware. These are not zippy mobile phones which run apps. These are specialized systems whose quirks and characteristics often have to be learned with hands on interaction. YouTube videos or a TikTok instructional video won’t do the job.

Net net: Failures are baked into commercial and government systems. The simultaneous failure of several core systems will generate more than annoyed airline passengers. Time to shift from “good enough” to “do the job right the first time”. See. I told you I would annoy some people with my observations. Well, reality is different from thinking that smart software will write itself.

Stephen E Arnold, January 11, 2023

Cyber Investigators: Feast, Famine, or Poisoned Data in 2023

January 11, 2023

At this moment in time, the hottest topic among some cyber investigators is open source intelligence or OSINT. In 2022, the number of free and for-fee OSINT tools and training sessions grew significantly. Plus, each law enforcement and intelligence conference I attended in 2022 was awash with OSINT experts, exhibitors, and investigators eager to learn about useful sites, Web and command line techniques, and intelware solutions combining OSINT information with smart software. I anticipate that 2023 will be a bumper year for DYOR or do your own research. No collegial team required, just a Telegram group or a Twitter post with comments. The Ukraine-Russia conflict has become the touchstone for the importance of OSINT.

Over pizza, my team and I have been talking about how the OSINT “revolution” will unwind in 2023. On the benefit side of the cyber investigative ledger, OSINT is going to become even more important. After 30 years in the background, OSINT has become the next big thing for investigators, intelligence professionals, entrepreneurs, and Beltway bandits. Systems developed in the US, Israel, and other countries continue to bundle sophisticated analytics plus content. The approach is to migrate basic investigative processes into workflows. A button click automates certain tasks. Some of the solutions have proven themselves to be controversial. Voyager Lab and the Los Angeles Police Department generated attention in late 2021. The Brennan Center released a number of once-confidential documents revealing the capabilities of a modern intelware system. Many intelware vendors have regrouped and appear to be ready to return to aggressive marketing of their systems, built-in data, and smart software. These tools are essential for certain types of investigations whether in US agencies like Homeland Security or in financial crime investigations at FINCEN. Even state and city entities have embraced the mantra of better, faster, easier, and, in some cases, cheaper investigations.

Another development in 2023 will be more tension between skilled human investigators and increasingly smarter software. The bean counters (accountants) see intelware as a way to reduce the need for headcount (full time equivalents) and up the amount of smart software and OSINT information. Investigators will face an increase in cyber crime. Some involved in budgeting will emphasize smart software instead of human officers. The crypto imbroglio is just one facet of the factors empowering online criminal behavior. Some believe that the Dark Web, CSAM, and contraband have faded from the scene. That’s a false idea. In the last year or so, what my team and I call the “shadow Web” has become a new, robust, yet hard-to-penetrate infrastructure for cyber crime. Investigators now face an environment into which a digital Miracle-Gro has been injected. Its components are crypto, encryption, and specialized software that moves Web sites from Internet host to Internet host in the click of a mouse. Chasing shadows is a task even the most recent intelware systems find difficult to accomplish.

However, my team and I believe that there is another downside for law enforcement and a major upside for bad actors: the wide availability of smart software capable of generating misinformation in the form of text, videos, and audio. Unfortunately today’s intelware is not yet able to flag and filter weaponized information in real time or in a reliable way. OSINT advocates and marketers unfamiliar with the technical challenges of ignoring “fake” information downplay the risk of weaponized or poisoned information. A smart software system ingesting masses of digital information can, at this time, learn from bogus data and, therefore, output misleading or incorrect recommendations. In 2023, poisoned data will continue to derail many intelware systems as well as traditional investigations when insufficient staff are available to determine provenance and accuracy. Our research has identified 10 widely-used mathematical procedures particularly sensitive to bogus information. Few want to discuss these out-of-sight sinkholes in public forums. Hopefully the reluctance to talk about OSINT blindspots will fade in 2023.
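A small illustration of why bogus inputs matter. This sketch is not one of the 10 procedures mentioned above, which are not named here; it simply assumes a plain average as a stand-in for a sensitive procedure and a median as a more robust one. The numbers and the function name `poison_shift` are made up.

```python
import statistics

def poison_shift(clean, bogus):
    """Return how far the mean and the median move when bogus values are added."""
    poisoned = clean + bogus
    mean_shift = statistics.mean(poisoned) - statistics.mean(clean)
    median_shift = statistics.median(poisoned) - statistics.median(clean)
    return mean_shift, median_shift

# Hypothetical readings: seven plausible values, then two injected outliers.
clean_readings = [10, 11, 9, 10, 12, 10, 11]
bogus_readings = [500, 480]

mean_shift, median_shift = poison_shift(clean_readings, bogus_readings)
print(f"mean moved by {mean_shift:.1f}, median moved by {median_shift}")
```

Two injected values drag the average by more than 100 while the median barely moves. A system that learns from averages of unvetted data inherits that fragility.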

The feast? Smart software. Masses of information.

The famine? Funds to expand the hiring of full time (not part time) investigators and the money needed to equip these professionals with high-value, timely instruction about tools, sources, pitfalls, and methods for verification of data.

The poison? The ChatGPT and related tools which can make anyone with basic scripting expertise into a volcano of misinformation.

Let me suggest four steps to begin to deal with the feast, famine, and poison challenges.

First, individuals, trade groups, and companies marketing intelware to law enforcement and intelligence entities must stick to the facts about their systems. The flowery language and the truth-stretching lingo must be decreased. Why do intelware vendors experience brutal churn among licensees? The distance between the reality of the system and the assertions made to sell the system.

Second, procurement processes and procurement professionals must become advocates for reform. Vendors often provide “free” trials and then work to get “on the budget.” The present procurement methods can lead to wasted time, money, and contracting missteps. Outside-the-box ideas like a software sandbox require consideration. (If you want to know more about this, message me.)

Third, consulting firms which are often quick to offer higher salaries to cyber investigators need to evaluate the impact of their actions on investigative units. There is no regulatory authority monitoring the behavior of these firms. The Wild West of cyber investigator poaching hampers some investigations. Legislation perhaps? More attention from the Federal Trade Commission maybe? Putting the needs of the investigators ahead of the needs of the partners in the consulting firms?

Fourth, a stepped up recruitment effort is needed to attract investigators to the agencies engaged in dealing with cyber crime. In my years of work for the US government and related entities, I learned that government units are not very good at identifying, enlisting, and retaining talent. This is an administrative function that requires more attention from individuals with senior administrative responsibilities. Perhaps 2023 will generate some progress in this core personnel function.

Don’t get me wrong. I am optimistic about smart software. I believe techniques to identify and filter weaponized information can be enhanced and improved. I am confident that forward leaning professionals in government agencies can have a meaningful impact on institutionalized procedures and methods associated with fighting cyber crime.

My team and I are committed to conducting research and sharing our insights with law enforcement and intelligence professionals in 2023. My hope is that others will adopt a similar “give back” and “pay it forward” approach in 2023 in the midst of feasts, famines, and poisoned data.

Thank you for reading. — Stephen E Arnold, January 11, 2023

China Orders AI Into the Courtroom

January 11, 2023

China is simply delighted with the possibilities of AI technology. In fact, it is now going so far as to hand most of its legal services over to algorithms. ZDNet reports, “China Wants Legal Sector to be AI-Powered by 2025.” Yep, once those algorithms are set up justice in China can be automated. Efficient. Objective. What’s not to like? Writer Eileen Yu explains:

“The country’s highest court said all courts were required to implement a ‘competent’ AI system in three years, according to a report by state-owned newspaper China Daily, pointing to guidelines released by the Supreme People’s Court.  The document stated that a ‘better regulated’ and more effective infrastructure for AI use would support all processes needed in handling legal cases. This should encompass in-depth integration of AI, creation of smart courts, and higher level of ‘digital justice’, the high court said.  A more advanced application of AI, however, should not adversely affect national security or breach state secrets as well as violate personal data security, the document noted, stressing the importance of upholding the legitimacy and security of AI in legal cases.  It added that rulings would remain decisions made by human judges, with AI tapped as supplemental references and tools to improve judges’ efficiency and ease their load in trivial matters. An AI-powered system also would offer the public greater access to legal services and help resolve issues more effectively, the Supreme People’s Court said.”

We’re sure that is exactly how it will work out, making life better for citizens caught up in the legal system. The directive also instructs courts to train their workers on using AI, specifically on learning to spot irregularities. What could go wrong? At least final decisions will be made by humans. For now. To make matters even trickier, the Supreme People’s Court is planning to use blockchain technology to link courts to other sectors in support of socioeconomic development. Because what is more important in matters of justice than how they affect the almighty yuan?

Cynthia Murrell, January 11, 2023

US AI Legal Decisions: Will They Matter?

January 10, 2023

I read an interesting essay called “This Lawsuit against Microsoft Could Change the Future of AI.” It is understandable that the viewpoint is US centric. The technology is the trendy discontinuity called ChatGPT. The issue is harvesting data, lots of it from any source reachable. The litigation concerns Microsoft’s use of open source software to create a service which generates code automatically in response to human or system requests.

The essay uses a compelling analogy. Here’s the passage with the metaphor:

But there’s a dirty little secret at the core of AI — intellectual property theft. To do its work, AI needs to constantly ingest data, lots of it. Think of it as the monster plant Audrey II in Little Shop of Horrors, constantly crying out “Feed me!” Detractors say AI is violating intellectual property laws by hoovering up information without getting the rights to it, and that things will only get worse from here.

One minor point: I would add the word “quickly” after the final word here.

I think there is another issue which may warrant some consideration. Will other countries — for instance, China, North Korea, or Iran — be constrained in their use of open source or proprietary content when training their smart software? One example is the intake of Galmon open source satellite data to assist in guiding anti satellite weapons should the need arise. What happens when compromised telecommunications systems allow streams of real time data to be pumped into ChatGPT-like smart systems? Smart systems with certain types of telemetry can take informed, direct action without too many humans in the process chain.

I suppose I should be interested in Microsoft’s use of ChatGPT. I am more interested in weaponized AI operating outside the span of control of the US legal decisions. Control of information and the concomitant lack of control of information is more than adding zest to a Word document.

As a dinobaby, I am often wrong. Maybe what the US does will act like a governor on a 19th century steam engine? As I recall, some of the governors failed with some interesting consequences. Worry about Google, Microsoft, or some other US company’s application of constrained information could be worrying about a lesser issue.

Stephen E Arnold, January 10, 2023

Insight about Software and Its Awfulness

January 10, 2023

Software is great, isn’t it? Try to do hanging indents with numbers in Microsoft Word. If you want this function without wasting time with illogical and downright weird controls, call a Microsoft Certified Professional to code what you need. Law firms are good customers. What about figuring out which control in BlackMagic DaVinci delivers the effect you want? No problem. Hire someone who specializes in the mysteries of this sort of free software. No expert in Princeton, Illinois, or Bear Dance, Montana? Do the Zoom thing with a gig worker. That’s efficient. There are other examples; for instance, do you want to put your MP3 on an iPhone? Yeah, no problem. Just ask a 13 year old. She may do the transfer for less than an Apple Genius.

Why is software awful?

“There Is No Software Maintenance” takes a step toward explaining what’s going on and what’s going to get worse. A lot worse. The write up states:

Software maintenance is simply software development.

I think this means that a minimal viable product is forever. What changes are wrappers, tweaks, and new MVP functions. Yes, that’s user friendly.

The essay reports:

The developers working on the product stay with the same product. They see how it is used, and understand how it has evolved.

My experience suggests that the mindset apparent in this article is the new normal.

The advantages are faster and cheaper, quicker revenue, and a specific view of the customer as irrelevant even if he, she, or it pays money.

The downsides? I jotted down a few which occurred to me:

  1. Changes may or may not “work”; that is, printing is killed. So what? Just fix it later.
  2. Users’ needs are secondary to what the product wizards are going to do. Oh, well, let’s take a break and not worry about today. Let’s plan for new features for tomorrow. Software is a moving target for everyone now.
  3. Assumptions about who will stick around to work on a system or software are meaningless. Staff quit, staff are RIFed, and staff are just an entity on the other end of an email with a contract working in Bulgaria or Pakistan.

What’s being lost with this attitude or mental framing? How about trust, reliability, consistency, and stability?

Stephen E Arnold, January 10, 2023

The EU Has the Google in Targeting Range for 2023

January 10, 2023

Unlike the United States, the European Union does not allow Google to collect user data. The EU has passed several laws to protect its citizens’ privacy; however, Google can still deploy tools like Google Analytics with stipulations. Tutanota explains how Google operates inside the EU laws in “Is Google Analytics Illegal In The EU? Yes And No, But Mostly Yes.”

Max Schrems is a lawyer who successfully sued Facebook for violating the privacy of Europeans. He won again, this time against Google. France and Austria decided that Google Analytics is illegal to use in Europe, but Denmark’s and Norway’s data protection authorities developed legally compliant ways to use the analytics service.

Organizations were using Google Analytics to collect user information, but that violated Europeans’ privacy rights because it exposed them to American surveillance. The tech industry did not listen to the ruling, so Schrems sued:

“However, the Silicon Valley tech industry largely ignored the ruling. This has now led to the ruling that Google Analytics is banned in Europe. NOYB says:

‘While this (=invalidation of Privacy Shield) sent shock waves through the tech industry, US providers and EU data exporters have largely ignored the case. Just like Microsoft, Facebook or Amazon, Google has relied on so-called ‘standard Contract Clauses’ to continue data transfers and calm its European business partners.’

Now, the Austrian Data Protection Authority strikes the same chord as the European court when declaring Privacy Shield as invalid: It has decided that the use of Google Analytics is illegal as it violates the General Data Protection Regulation (GDPR). Google is ‘subject to surveillance by US intelligence services and can be ordered to disclose data of European citizens to them’. Therefore, the data of European citizens may not be transferred across the Atlantic.”

There are alternatives to Google services, including Gmail and Google Analytics, based in Europe, Canada, and the United States. This appears to be one more example of the EU lining up financial missiles to strike the Google.

Whitney Grace, January 10, 2023

The Pain of Prabhakar Becomes a Challenge for Microsoft

January 9, 2023

A number of online “real” news outfits have reported and predicted that ChatGPT will disrupt the Google’s alleged monopoly in online advertising. The excitement is palpable because it is now fashionable to beat up the technology giants once assumed to have feet made of superhero protein.

The financial information service called Seeking Alpha published “Bing & ChatGPT Might Work Together, Could Be Revolutionary.” My mind added “We Hope!” to the headline. Even the allegedly savvy Guardian Newspaper weighed in with “Microsoft Reportedly to Add ChatGPT to Bing Search Engine.”  Among the examples I noted is the article in The Information (registration required, thank you) called “Ghost Writer: Microsoft Looks to Add OpenAI’s Chatbot Technology to Word, Email.”

The origin of this “Bing will kill Google” boomlet may be in the You.com Web search system which includes this statement. I have put in bold face the words and phrases revealing Microsoft’s awareness of You.com:

YouChat does not use Microsoft Bing web, news, video or other Microsoft Bing APIs in any manner. Other Web links, images, news, and videos on you.com are powered by Microsoft Bing. Read Microsoft Bing Privacy Policy

I am not going to comment on the usefulness of the You.com search results. Instead, navigate to www.you.com and run some queries. I am a dinobaby, and I like command line searching. You do not need to criticize me for my preference for Stone Age search tools. I am 78 and will be in one of Dante’s toasty environments. Boolean search? Burn for eternity. Okay with me.

I would not like to be Google’s alleged head of search (maybe the word “nominal” is preferable to some). That individual is a former Verity wizard named Prabhakar Raghavan. His domain of Search, Google Assistant, Ads, Commerce, and Payments has been expanded by the colorful Code Red activity at the Google. Mr. Raghavan’s expertise and that of his staff appear to be ill-equipped to deal with one of the least secret of Microsoft’s activities. Allegedly more Google wizards have been enlisted to deal with this existential threat to Google’s search and online ad business. Well, Google is two decades old, over staffed, and locked in its aquarium. It presumably watched Microsoft invest a billion into ChatGPT and did not respond. Hello, Prabhakar?

The “value” has looked like adding ChatGPT-like functions and maybe some of its open sourciness to Microsoft’s ubiquitous software. One can envision typing a dot point in PowerPoint and the smart system will create a number of slides. The PowerPoint user fiddles with the words and graphics and rushes to make a pitch at a conference or a recession-proof venture capital firm.

Imagine a Microsoft application which launches ChatGPT-type of smart search in a Word document. This type of function might be useful to crypto bros who want to explain how virtual tokens will become the Yellow Brick Road to one of the seven cities of Cibola. Sixth graders writing an essay and MBAs explaining how their new business will dominate a market will find this type of functionality a must-have. No LibreOffice build offers this type of value…yet.

What if one thinks about Outlook? (I would prefer not to know anything about Outlook, but there are individuals who spend hours each day fiddling around in email.) Writing email can become a task for ChatGPT-like software. Spammers will love this capability, particularly combined with VBScript.

The ultimate, of course, will be the integration of Teams and ChatGPT. The software can generate an instance of a virtual person and the search function can generate responses to questions directed at the construct presented to others in a Teams session. This capability is worth big bucks.

Let’s step back from the fantasies of killing Google and making Microsoft Office apps interesting.

Microsoft faces a handful of challenges. (I will not mention Microsoft’s excellent judgment in referencing the Federal Trade Commission as unconstitutional. Such restraint.)

First, the company has a somewhat disappointing track record in enterprise security. Enough said.

Second, Microsoft has a fascinating series of questionable engineering decisions. One example is the weirdness of old code in Windows 11. Remember that Windows 10 was to be the last version of Windows. Then there is the chaos of updates to Windows 11, particularly missteps like making printing difficult. Again enough said.

Third, Google has its own smart software. Either Mr. Raghavan is asleep at the switch and missed the signal from Microsoft’s 2019 one billion dollar investment in OpenAI or Google’s lawyers have stepped on the smart software brake. Who owns outputs built from the content of Web sites? What happens when content from the European Union appears in outputs? (You know the answer to that question. I think it is even bigger fines which will make Facebook’s recent half a billion dollar invoice look somewhat underweight.)

When my research team and I talked about the You.com-type search and the use of ChatGPT or other OpenAI technology in business, law enforcement, legal, healthcare, and other use cases — we hypothesized that:

  1. Time will be required to get the gears and wheels working well enough to deliver consistently useful outputs
  2. Google has responded and no one noticed much except infinite scrolling and odd “cards” of allegedly accurate information in response to a user’s query.
  3. Legal issues will throw sand in the gears of the machinery once the ambulance chasers tire of Camp Lejeune litigation
  4. Aligning costs of resources with the to-be revenue will put some potholes on this off-ramp of the information superhighway.

Net net: The world of online services is often described as being agile. A company can turn on a dime. New products and services can be issued and fixes can make a system better over time. I know Boolean works. The ChatGPT thing seems promising. I don’t know if it replaces human thought and actions in certain use cases. Assume you have cancer. Do you want your oncologist to figure out what to do using Bing.com, Google.com, or You.com?

Stephen E Arnold, January 9, 2023
