Listen Notes: A Podcast Search Engine

October 18, 2017

I read “This Podcast Search Engine Helps You Discover New Shows You’ll Love.” The search engine is called Listen Notes. The content pool is 350,000 podcasts and about 20 million episodes. I ran a query for the popular Twit.tv podcast Security Now. No hits. I then ran a query for the unpopular HonkinNews program which I did for one year. No hits. Your mileage may vary. As horrible as the iTunes search system is, it sort of works for podcasts. I am still in search of a good enough podcast search tool. Maybe Listen Notes should just use one of the remarkable enterprise search systems which handle “all” content. Yeah, that will work.

Stephen E Arnold, October 18, 2017

Google and Video Search: Still a Challenge

August 31, 2017

I read “How YouTube Perfected the Feed.” The main idea is that Google used smart software to make YouTube videos easier to find. The trick is not keyword search. Google’s YouTube, which the write up calls a “pillar of the Internet,” uses signals to identify what a person wants. Then smart software delivers recommendations. The “new” YouTube’s secret sauce is described this way:

McFadden [a Google wizard] revealed the source of YouTube’s suddenly savvy recommendations: Google Brain, the parent company’s artificial intelligence division, which YouTube began using in 2015. Brain wasn’t YouTube’s first attempt at using AI; the company had applied machine-learning techniques to recommendations before, using a Google-built system known as Sibyl. Brain, however, employs a technique known as unsupervised learning: its algorithms can find relationships between different inputs that software engineers never would have guessed. “One of the key things it does is it’s able to generalize,” McFadden said. “Whereas before, if I watch this video from a comedian, our recommendations were pretty good at saying, here’s another one just like it. But the Google Brain model figures out other comedians who are similar but not exactly the same — even more adjacent relationships. It’s able to see patterns that are less obvious.”

The point of the exercise is to generate more ad revenue. With competition from Facebook and others, Google is facing another crack in its control of search. Amazon may generate three times the number of product searches as Google. That’s another problem for the GOOG.

Now the talk about smart software is thrilling to many. For me, I highlighted this statement in the article as quite suggestive about the method:

YouTube’s emphasis on videos related to ones you might like means that its feed consistently seems broader in scope — more curious — than its peers. The further afield YouTube looks for content, the more it feels like an escape from other feeds.

The smart software is not about search. Google is processing signals and looking for similarities. I don’t want to be a grouser, but these themes have peppered Google patent documents and technical papers for many years. In my Google: The Digital Gutenberg I reviewed some of the wonkier video ideas. (By the way, the “Gutenberg” metaphor refers to the automatically generated content which Google outputs in response to user actions. Facebook may be more prolific today, but when I was working on Google: The Digital Gutenberg, Google had the distinction of being the world’s largest digital artifact producer.)

Several observations:

First, finding videos remains a difficult information retrieval task. I recall the promising approach of Exalead, before Dassault bought the company and used the technology to reduce its dependence on Autonomy and deploy a way to find nuts and bolts. Exalead converted speech to text, generated some semi-useful metadata, and allowed me to search for a word or phrase. The system would then display links to videos which contained the string. The problem with video search is that it is visual and, to my knowledge, no one has figured out how to have software convert an image to a searchable string. Years ago, I saw a demo from an Israeli company whose software could “watch” a soccer match and flag the goals sometimes. Google’s video search is useful when one looks for words in video titles, video descriptions, video channel names, or the entity producing or starring in the video.
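For the curious, here is a minimal sketch of the transcript approach described above: once the audio track has been turned into text, video search reduces to string matching. This is not Exalead’s code. The `Segment` class, the `search_transcripts` function, and the sample data are hypothetical names I made up for illustration, and the speech recognition step is assumed to have happened already.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One chunk of transcribed audio (hypothetical structure)."""
    video_id: str
    start_seconds: float
    text: str


def search_transcripts(segments, phrase):
    """Return (video_id, start_seconds) for each segment containing the phrase."""
    needle = phrase.casefold()  # case-insensitive comparison
    return [
        (s.video_id, s.start_seconds)
        for s in segments
        if needle in s.text.casefold()
    ]


# Made-up transcript data, standing in for speech-to-text output.
segments = [
    Segment("vid-001", 0.0, "Welcome to the program about enterprise search"),
    Segment("vid-001", 12.5, "Today we look at podcast search engines"),
    Segment("vid-002", 3.0, "A soccer match with no spoken commentary"),
]

print(search_transcripts(segments, "podcast search"))  # [('vid-001', 12.5)]
```

Note what happens to the third segment: a video with nothing useful on the audio track yields nothing to match, which is the visual problem in a nutshell.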

Second, recommendations work reasonably well for digital Walmart-type shoppers. However, many recommendations are off the wall. I bought a bottle of itch reliever spray for my dog. The product was designed for saddle horses. Now Amazon happily shows me boots, bits, and bridles. Other recommendation systems will work the same way. The reason? Signals are given incorrect “weights” and the clustering methods drift away because many smart software methods are “greedy.” (I have a for fee lecture on this subject which is pretty darned interesting and important. Curious? Write benkent2020 at yahoo dot com for info.)
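A toy illustration of the weighting half of that complaint, assuming a recommender that simply adds up weighted signals per product category. This is not Amazon’s method, which I have not seen. The `recommend` function, the signal names, and the weights are all invented for the example, and it does not attempt to model clustering drift.

```python
from collections import defaultdict


def recommend(events, weights, top_n=1):
    """Rank categories by the sum of their weighted signals (hypothetical scoring)."""
    scores = defaultdict(float)
    for category, signal in events:
        scores[category] += weights[signal]
    return sorted(scores, key=lambda c: scores[c], reverse=True)[:top_n]


# Five dog-related page views and one purchase of a product filed under horses.
events = [("dog care", "view")] * 5 + [("horse tack", "purchase")]

# A purchase weighted ten times a view swamps the browsing history.
print(recommend(events, {"view": 1.0, "purchase": 10.0}))  # ['horse tack']

# A more modest purchase weight keeps the dog in the picture.
print(recommend(events, {"view": 1.0, "purchase": 3.0}))  # ['dog care']
```

One purchase, one heavy weight, and the boots, bits, and bridles arrive.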

Third, Google’s smart software for video continues to struggle with uploads that are on some pretty dicey topics. I routinely get links to YouTube videos which require me to be over 18. You can check out Google’s filtering for certain content by running queries on both YouTube.com and GoogleVideo.com for “nasheed.” Yep, interesting “promotional” videos are in evidence.

Net net: Talk about smart software creates the impression that great progress in video content access is being made. I agree. There is progress; however, finding videos remains a work in progress.

I suppose Amazon will sell me a horse when it runs out of farm fresh Echoes. Google is recommending videos to me which don’t match what I usually look for. I was curious about non Newtonian fluids. Guess what Google suggested I view? A Chinese table tennis match and my own video.

There you go.

Stephen E Arnold, August 31, 2017

HonkinNews for April 11, 2017, Now Available

April 11, 2017

This week’s HonkinNews video program leads with information about Bitext, a company providing breakthrough deep linguistic analysis solutions. In order to put the comments of Dr. Antonio Valderrabanos in perspective, HonkinNews takes a look at the “promo” article discussing IBM’s cognitive computing activities. There is one key difference highlighted in HonkinNews: IBM talks jargon in recycled marketing language and Bitext’s CEO talks about the company’s rapid growth and licensing deals with companies like Audi, Renault, and one of the largest players in the mobile device and mobile services market. The program also looks at the remarkable 9,000 word Fortune Magazine article about Palantir Technologies’ interaction with US government procurement agencies. The very long article does not describe Palantir’s technical innovations nor does the Fortune analysis explain why using commercial off-the-shelf software for intelligence work makes sense. News about the Dark Web Notebook team’s three presentations at the prestigious TechnoSecurity & Digital Forensics Conference in June 2017 complements a special offer for the only handbook to Dark Web investigations available. For discount information, check out the links displayed in the video. The video also takes a look at the new Yahoo. Once the transformation of Yahoo into Oath with a punctuation mark no less takes place, the Yahoo yodel will become a faint auditory memory. Does the HonkinNews item trigger an auditory memory? Watch this week’s video to find out. You can watch the video at this link.

Kenny Toth, April 11, 2017

HonkinNews for 28 Feb 2017 Now Available

February 28, 2017

This week’s HonkinNews considers the Facebook “manifesto.” Our interpretation is that companies like Facebook are countries too. Aren’t we lucky? The IBM security conference is scheduled for March 2017 and Beyond Search was invited. We assume that the data science root access breach will be one highlighted case study. The program also comments on the Pinterest Lens technology. Now after “pintering”, one can locate and buy a product. No words required. Two stories illustrate the depth or shallowness of thinking about online research. We present a list of “must use” search engines and note some notable omissions. Then we consider a comparison of conducting research on an ad supported system versus the commercial databases, books, and journals at a first-rate research library like Dartmouth’s. The subject of Google’s Loon balloons drifts in as well. We consider the question: Will Facebook free Internet drones engage in combat with Google’s free Internet Loon balloons? You can find it at this link.

Kenny Toth, February 28, 2017

HonkinNews for 21 February Now Available

February 21, 2017

Hang onto your lightweight mobile. HonkinNews lets you watch recall, precision, and relevance being kicked to pieces by a real live SEO expert and famed author. We love that “famed” thing. You will also get a peek at how to visualize innovation. Inside the box and outside the box look tame compared to our view of the real world. We give you a tip for searching for an image in the Metropolitan Museum of Art’s 375,000-image digital collection. You may not like the answer. We did not. If you have a mainframe in your home office, you can load Watson and let it index your significant other’s recipes, or you can process a local bank’s overnight cash transactions. Either way, IBM gives you some Watson juice. And you will get a bit of information about Yahoo’s most recent security issue. Yep, yabba dabba hoot.

Kenny Toth, February 21, 2017

HonkinNews for 14 February 2017 Now Available

February 14, 2017

Want some tax love? HonkinNews explains that you can visit an H&R Block store front and “touch” IBM Watson. Sounds inviting, doesn’t it? You will also learn about the fate of Lexmark’s search and content businesses under the firm’s new ownership. Denmark has appointed an ambassador to Sillycon Valley. Perhaps Apple, Facebook, and Google really are nation states? Google’s cloud wizard has some job advice for the newly terminated. Perhaps dog training collars are a breakthrough for those eager to acquire new skills. Lucid Imagination became Lucidworks. Now the company has positioned itself to deliver Exalead style search based applications. The play did not work too well for Exalead, which wrote the book about SBAs. Will Lucidworks make the me-too strategy pay off for the company’s backers and their tens of millions of dollars? We also catalog the many ways to search using the Pixel phone. Whatever happened to universal search? We reveal where to live if you want easy access to old fashioned book stores. No, it is not Harrod’s Creek, Kentucky. You can view the video at this link.

Kenny Toth, February 14, 2017

Metropolitan Museum of Art: Images There but Findability Not

February 14, 2017

I recall the Google Life Magazine image collection. I noted the BBC archive of programs. I checked out the Internet Archive’s rich media collection. Years ago I worked on the Library of Congress’ American Memory project. These have a unifying thread:

The content is essentially unfindable.

I read “Metropolitan Museum of Art Puts 375,000 Public Domain Images in Creative Commons.” The write up explains:

As part of a new initiative it’s calling Open Access, the Metropolitan Museum of Art in New York has placed 375,000 images of public-domain works in the Creative Commons. This major, though not unprecedented, move by one of the world’s most important museums means that users can now access pictures of many of the Met’s holdings on Wikimedia…

You can try out the search system at this link. Good luck finding images. Remember Caravaggio is spelled with two g’s. Oh, the query returned a number of false drops.

The way to find images is to browse. Fun. Time consuming. Not good.

Stephen E Arnold, February 14, 2017

HonkinNews for 7 February 2017 Now Available

February 7, 2017

This week’s program highlights Google’s pre school and K-3 robot innovation from Boston Dynamics. In June 2016 we thought Toyota was purchasing the robot reindeer company. We think Boston Dynamics may still be part of the Alphabet letter set. Also, curious about search vendor pivots? Learn about two shuffles (Composite Software and CopperEye) which underscore why plain old search is a tough market. You will learn about the Alexa Conference and the winner of the Alexathon. Alexa seems to be a semi hot product. When will we move “beyond Alexa”? Social media analysis has strategic value? What vendor seems to have provided “inputs” to the Trump campaign and the Brexit now crowd? HonkinNews reveals the hot outfit making social media data output slick moves. We provide a run down of some semantic “news” which found its way to Harrod’s Creek. SEO, writing tips, and a semantic scorecard illustrate the enthusiasm some have for semantics. We’re not that enthusiastic, however. Google is reducing its losses from its big bets like the Loon balloon. How much? We reveal the savings, and it is a surprising number. And those fun and friendly robots. Yes, the robots. You can view the video at this link. Google Video provides a complete run down of the HonkinNews programs too. Just search for HonkinNews.

Kenny Toth, February 7, 2017

Synthetic Datasets: Reality Bytes

February 5, 2017

Years ago I did a project for an outfit specializing in an esoteric math space based on mereology. No, I won’t define it. You can check out the explanation in the Stanford Encyclopedia of Philosophy. The idea is that sparse information can yield useful insights. Even better, if mathematical methods were used to populate missing cells in a data system, one could analyze the data as if it were more than probability generated items. Then when real time data arrived to populate the sparse cells, the probability component would generate revised data for the cells without data. Nifty idea, just tough to explain to outfits struggling to move freight or sell off lease autos.
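Since the idea is tough to explain, here is a bare-bones sketch of the fill-then-revise loop. It is an assumption-laden stand-in, not the company’s mereology-based math: empty cells get the plain average of the observed cells, and the estimates are recomputed whenever a real value arrives. The `SparseColumn` class and its method names are hypothetical.

```python
from statistics import mean


class SparseColumn:
    """A column of cells where None marks a cell with no real data (hypothetical)."""

    def __init__(self, values):
        self.observed = list(values)

    def filled(self):
        """Return the column with empty cells replaced by the current estimate."""
        known = [v for v in self.observed if v is not None]
        # Stand-in estimator: the mean of observed cells, not the original method.
        estimate = mean(known) if known else None
        return [v if v is not None else estimate for v in self.observed]

    def arrive(self, index, value):
        """Record a real value; later calls to filled() revise the remaining estimates."""
        self.observed[index] = value


col = SparseColumn([10.0, None, 14.0, None])
print(col.filled())  # [10.0, 12.0, 14.0, 12.0]

col.arrive(1, 18.0)  # real time data lands in a formerly empty cell
print(col.filled())  # [10.0, 18.0, 14.0, 14.0]
```

The last empty cell moves from 12.0 to 14.0 without anyone touching it. That is the revision step in miniature.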

I thought of this company’s software system when I read “Synthetic Datasets Are a Game Changer.” Once again youthful wizards happily invent the future even though some of the systems and methods have been around for decades. For more information about the approach, the journal articles and books of Dr. Zbigniew Michalewicz may be helpful.

The “Synthetic Datasets…” write up triggered some yellow highlighter activity. I found this statement interesting:

Google researchers went as far as to say that even mediocre algorithms received state-of-the-art results given enough data.

The idea is that algorithms can output “good enough” results when volumes of data are available to the number munching algorithms.

I also noted:

there are recent successes using a new technique called ‘synthetic datasets’ that could see us overcome those limitations. This new type of dataset consists of images and videos that are solely rendered by computers based on various parameters or scenarios. The process through which those datasets are created fall into 2 categories: Photo realistic rendering and Scenario rendering for lack of better description.

The focus here is not on figuring out how to move nuclear fuel rods around a reactor core or adjusting coal fired power plant outputs to minimize air pollution. The synthetic databases have an application in image related disciplines.

The idea of using rendering engines to create images for facial recognition or for video games is interesting. The write up mentions a number of companies pushing forward in this field; for example, Cvedia.

However, the use of NuTech’s methods populated databases of fact. I think the use of synthetic methods has a bright future. Oh, NuTech was acquired by Netezza. Guess what company owns the prescient NuTech Solutions’ technology? Give up? IBM, a company which has potent capabilities but does the most unusual things with those important systems and methods.

I suppose that is one reason why old wine looks like new IBM Holiday Spirit rum.

Stephen E Arnold, February 5, 2017

HonkinNews for January 31, 2017 Now Available

January 31, 2017

This week’s seven minute HonkinNews includes some highlights from the Beyond Search coverage of Alphabet Google. If you have not followed Sergey Brin’s participation at the World Economic Forum, you may have missed the opportunity that Google did not recognize. More surprising is that Alphabet Google owns a stake in a company which specializes in predicting the future. IBM Watson had a busy holiday season. The company which has compiled 19 consecutive quarters of declining revenue invented a new alcoholic “spirit”, sometimes referred to as booze, hooch, the bane of Mothers Against Drunk Driving. How did Watson, a software system, jump from reading text to inventing rum? We tell what Watson really did. How did Palantir Technologies respond to a protest in front of its Palo Alto headquarters, known by some as the Shire? Think free coffee, and we reveal what the Beyond Search goose wants when she attends a protest. Beyond Search has an interest in voice search, which seems to be more than an oddity. Learn about the battle between Amazon and Google. The stakes are high because Amazon is not a big player in search, but Alexa technology may be about to kick one of the legs from Google’s online hegemony. DuckDuckGo honked loudly that it experienced significant growth in online search traffic. How close is DuckDuckGo to Google? Find out. Mind that gap. Microsoft has “invented”, rediscovered, or simply copied Autonomy’s Kenjin service from the 1990s. The lucky Word users will experience automatic search and the display of third party information in an Outlook style paneled interface. HonkinNews believes that those writing term papers will be happy with the new “Research.” Yahoot or Yabba Dabba Hoot warrants a mention. The US Securities & Exchange Commission is allegedly poking into Yahoo’s ill timed public release of information about losing its users’ information. Yep, Yabba Dabba Hoot.
Enjoy Beyond Search which is filmed on 8 mm film from the Beyond Search cabin in rural Kentucky.

If you are looking for previous HonkinNews videos, you can find them by navigating to www.googlevideo.com and running the query HonkinNews. Watch for Stephen E Arnold’s new information service, Beyond Alexa. Who wants to type a search query? That’s like real work and definitely not the future.

Kenny Toth, January 31, 2017
