Another New Web Search Engine: Zajjle

July 15, 2021

The DarkCyber research team tries to keep up with the Web search engines. Rarely does one come along with an index of content not generally included in the Bing, Google, Yandex systems or the new-kids-on-the-block like Metager or Neeva. According to “Arabic Search Engine plus Webmail and Data Analytics is Now Available at Zajjle”:

Zajjle is an Arabic search engine with many advantages. Besides offering the primary function as a search engine in the Arabic language, it also has additional features like webmail and website statistics & data analytics. People in Middle East countries can access the website in the Arabic language for searching the current news, website address, videos, and photos.

The write up states:

Ahmad A Najar, the Zajjle founder, is an entrepreneur who established CatchFood and Mazra3a.net. CatchFood is successful web-based restaurant management and food delivery platform. It has served countries like the United Arab Emirates, Jordan, Palestine, Iraq, Syria, and Lebanon, America, and Canada. Mazra3a.net is a popular Arabic agricultural platform connecting people interested in the agriculture industry to change ideas, increase knowledge, and communicate with professionals in the agriculture industry.

Will Zajjle index content deep in the US Department of Energy’s public facing Web sites? Will it snag content in Streamgun? What about content censored by mainstream systems?

You can explore the site at this link: http://www.zajjle.com/

Stephen E Arnold, July 15, 2021

Written by Stephen E. Arnold · Filed Under News, Search | Comments Off on Another New Web Search Engine: Zajjle

Brave Privacy-Centric Search Arrives

July 5, 2021

Several online services that emphasize privacy have emerged in recent years, including the Brave browser. The San Francisco-based company is not stopping there. We learn from TechCrunch that “Brave’s Nontracking Search Engine is Now in Beta.” Earlier this year, Brave acquired Cliqz, a company which had developed an anti-tracking browser with its own search platform. That system’s Tailcat technology will underpin Brave Search. The company also offers Brave Ads, a way to make money while preserving users’ privacy. Brave is different from other non-tracking Google alternatives like DuckDuckGo because it is using its own, independent index. Reporter Natasha Lomas writes:

“Brave touts its eponymous search offering as having a number of differentiating features versus rivals (including smaller rivals) — such as its own index, which it also says gives it independence from other search providers. Why is having an independent index important? We put that question to Josep M. Pujol, chief of search at Brave, who told us: ‘… More choices will entail more freedom and also get back to real competition, with checks and balances. Choice can only be achieved by being independent, as if we do not have our own index, then we are just a layer of paint on top of Google and Bing, unable to change much or anything in the results for users’ queries. Not having your own index, as with certain search engines, gives the impression of choice, but in reality such engine “skins” are the same players as the big-two. Only by building our own index, which is a costly proposition, will we be in a position to offer true choice to the users for the benefit of all, whether they are Brave Search users or not.’”

It should be noted that for certain functions, like image searches, Brave currently relies on other search providers to ensure relevant results. For now. Otherwise it turns to anonymized community contributions to refine its index’s results and will soon provide “community-curated open ranking models” in an effort to combat censorship and AI bias. The company plans to offer both a free option supported by ads and a paid, ad-free version we are told will be “affordable.”

We are running test queries, and the results are promising. There are other services becoming available too. I like Swisscows.ch. But I like cows.

Let us hope Brave’s efforts result in an index that is better than what has gone before. The increasing number of search options is a signal that Google search has failed in its basic mission. The problem is that millions don’t know what they are missing. Undisclosed omissions and obfuscated distortions are worse than guessing or asking friends.

Cynthia Murrell, July 5, 2021

Written by Stephen E. Arnold · Filed Under News, Search | Comments Off on Brave Privacy-Centric Search Arrives

Google and Unreliable Results: Like the Jack Benny One Liner, I Am Thinking, I Am Thinking

June 25, 2021

I read a “real” news story called “Google Is Starting to Warn Users When It Doesn’t Have a Reliable Answer.” (No, I will not ask, What’s reliable mean.)

Here’s the statement which snagged my attention in the write up:

“When anybody does a search on Google, we’re trying to show you the most relevant, reliable information we can,” said Danny Sullivan, a public liaison for Google Search. “But we get a lot of things that are entirely new,” Sullivan said the notice isn’t saying that what you’re seeing in search results is right or wrong — but that it’s a changing situation, and more information may come out later.

I think Mr. Sullivan, a former search engine optimization guru and conference organizer, is the “new” Matt Cutts, a Google professional helping to point the way to the digital future at the US government. Is key word packing the path to more patents than China?

I loved this statement which I know is pretty Tasmanian devil like: “Most relevant, reliable information we can.” I did a query for garage floor epoxy coating in Louisville. I gathered that about 20 businesses display on the first two pages of Google search results. Two companies were in this business. Others were out of business. One “company” called me back and said, “My loser son has been gone for two years.”

I have other examples as well of search either being out of date, spoofed, or just weird.

Let’s look at some of the reasons why Google made a statement about “reliable answers.”

First, I think the difficulty of providing real-time indexing is beyond Google’s capabilities: Outfits with real time content won’t play ball with Google unless Google pays up and works out a mechanism to move the content to a Google indexing queue. (Yep, queue as in long line at the McDonald’s drive through.)

Second, Google is not set up to do real time. I think the notion of having a short list of “must ping frequently sites” may be a hold over from the distant past. The reason? Given the cost of indexing, updating, and making the Google indexes “consistent,” some of the practices no longer fit the current iteration of “relevant” and “reliable.” Google is not Twitter, and it is not Facebook. Therefore, the pipelines for real time content simply don’t exist. Googlers tried but seemed to be better at selling ads than dealing with new content types.

Third, hot info appears in non text form on Instagram, TikTok, and even places like DailyMotion and Vimeo sometimes days before the content plops into YouTube. Ever try to locate a video using the creator assigned index terms? That’s an exercise in futility. Ads, gentle reader, not relevant and reliable information.

From my vantage point on the porch overlooking a mine drainage pond, I have some hypotheses:

Google is under financial pressure, a competitive pressure from Amazon and Facebook, and a legal pressure. Almost any nation state with an appetite to drag the Google into court is in gear.
Google is just not able to handle the real time flows of content, either textual or imagery. Too bad, but that’s the excitement of Hegel’s these, antithese, synthese which “real” Googlers learn along with search engine optimization marketing methods.
Google’s propagandistic and jingoistic assurances that it returns relevant and reliable results are more and more widely seen as key word spam.
Google’s management methods are not tuned for the current business environment. I may be alone in noticing that high school science club thinking and management from assumed superiority is out of favor. (If Sergey Brin were to ride a Russian rocket into space, would he attract more signatures than Jeff Bezos? The quasi referendum did not want Mr. Bezos to return to earth. Mr. Brin’s ride did not materialize, so I won’t know who “won” the most votes.)

Net net: Relevant and reliable. That’s a line worthy of Jack Benny when he is asked about Fred Allen. I give up: “What does ‘reliable’ mean, Googlers?” My suggestion is marketing hoo haa with metatags.

Stephen E Arnold, June 25, 2021

Written by Stephen E. Arnold · Filed Under Google, Marketing, News, Search | 5 Comments

X1 Embraces Social Media and Collections Search for eDiscovery

June 22, 2021

It looks like eDiscovery firm X1 is moving beyond enterprise search into collection-centric search. They have sent around a memo announcing their “Defensible Social Media and Web Collections On-Demand.” The notice explains:

“Taking on the process of capturing evidence in a forensically sound manner can be challenging, time consuming and sometimes impossible with ever-increasing workloads. Why not outsource the collection portion of the process by letting our team of experts perform the job for you? With X1 Social Discovery On-Demand, X1’s forensic experts capture the data you need in a legally defensible manner, alleviating any headaches or worry about ESI collection.

Social Media and Web Capture Collection – save critical time by leveraging the expertise of X1 for efficient and accurate collections

Defensible Metadata Collection – unlike the ‘print screen’ approach, with X1 key metadata is included with all captures and deliverables with chain of custody preserved throughout

Experienced Service Support – X1 works with you to understand your collection scope and deliverable requirements up front bringing timely, authenticated results

Choose from Several Different Export Options – Concordance Load File, CSV, PDF, HTML for clear and accurate output

Get Started Right Away – engage with an X1 Solutions Consultant and start the collection process same-day”

Information on the X1 Social Discovery and the X1E Remote Collection On-Demand can be found on the company’s products page. At the time of this writing, new customers can save 50% off their first collection for up to 10 accounts. We do not know how long the “limited time” offer will last. We also cannot speak for or against the solutions since we have not tried them ourselves. We find the development interesting, though. Founded in 2003, the evolving small business is based in Los Angeles.

Search is becoming policeware.

Cynthia Murrell, June 22, 2021

Written by Stephen E. Arnold · Filed Under News, Search | Comments Off on X1 Embraces Social Media and Collections Search for eDiscovery

Ninfex Is a New Take on Internet Search

June 10, 2021

The creator of “experimental” search site Ninfex is trying a different approach that uses neither crawlers nor AI. The site’s Hello page explains:

“We rely on 2 proxies for search relevance. First: URL score (user votes). Second: Links to discussions on major forums. All submissions on Ninfex are votable by users. When you submit a link, you can also submit up to 5 supporting links (to discussions about that link) from external forums. Among the current submissions you are most likely to see forum links to reddit, hackernews, lobsters, stackexchange & tildes because those are the forums that I frequent most often.”

Yes, the young site still leans heavily toward material based on its maker’s interests, but that could change as its usership grows. The founder, who goes by the name traindreams, writes:

“I am the maker of Ninfex and right now I’m actively pushing to build the index around my personal wiki / research notes / bookmarks. That is why, the home feed mostly contains topics of my interest. The following is a list of those topics.”

See the page to explore that list of diverse and interesting topics, from Art to Psychology to Startups. Perhaps you will be inspired to vote or add a link. Traindreams has already made some changes based on user feedback, like cleaning up the UI, and promises more to come as the index grows. It looks like the idea is quality over quantity; we are curious to see if the enterprise takes off.
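For the curious, the two relevance proxies the Hello page describes (user votes plus links to forum discussions) can be sketched in a few lines. This is a guess at the general idea, not Ninfex's actual code: the `Submission` class, the `relevance` function, and both weights are hypothetical. Only the cap of five supporting links comes from the site's own description.

```python
from dataclasses import dataclass, field
from typing import List

MAX_SUPPORTING_LINKS = 5  # "up to 5 supporting links" per the Hello page


@dataclass
class Submission:
    """A submitted URL with its vote score and supporting forum links (hypothetical model)."""
    url: str
    votes: int = 0
    forum_links: List[str] = field(default_factory=list)

    def add_forum_link(self, link: str) -> None:
        # Enforce the five-link cap described on the Hello page.
        if len(self.forum_links) >= MAX_SUPPORTING_LINKS:
            raise ValueError("at most 5 supporting links per submission")
        self.forum_links.append(link)


def relevance(sub: Submission, vote_weight: float = 1.0, link_weight: float = 2.0) -> float:
    """Combine the two proxies. The weights are invented for illustration."""
    return vote_weight * sub.votes + link_weight * len(sub.forum_links)


def rank(subs: List[Submission]) -> List[Submission]:
    """Order submissions from most to least relevant."""
    return sorted(subs, key=relevance, reverse=True)
```

No crawler and no AI appear anywhere in the sketch, which is the point: the index is only as good as what people submit and vote on.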

Cynthia Murrell, June 10, 2021

Written by Stephen E. Arnold · Filed Under News, Search | Comments Off on Ninfex Is a New Take on Internet Search

More Microsoft Finger Pointing: Not 1,000 Programmers, Just One

June 9, 2021

I got a kick out of “Microsoft Blames Human Error Amid Suspicion It Censored Bing Results for Tiananmen Square Tank Man.” The tank man reference refers to an individual who stood in front of a tank. Generally this is not a good idea because visibility within tanks is similar to that from a Honda CR-Z. Hold that. The tank has better visibility. Said tank continued forward, probably without noticing a slight impediment.

The story talks not about visibility; its focus is on Microsoft (yep, the SolarWinds’ and new Windows’ outfit). I read:

Throughout Friday afternoon, using the image search function on Microsoft-operated Bing using the words “Tank Man” returned the message, “There are no results for tank man / Check your spelling or try different keywords.” (According to Motherboard, the same is true in other countries outside the U.S., including France and Switzerland.)

DuckDuck and Yahoo search presented a similar no results message. These are metasearch systems eager to portray themselves as much, much more.

Why? The article reports:

Microsoft has done business in China for decades, and Bing is accessible there. Like competitors such as Apple, the company has long complied with the whims of Chinese censors to maintain access to the country’s massive market, and it purges Bing results within China of information its government deems sensitive. However, the company said that blocking image results for “Tank Man” in the U.S. was not intentional and the issue was being addressed. “This is due to an accidental human error and we are actively working to resolve this…”

Could a similar error have been responsible for recent security lapses at the Redmond Defender office?

And no smart software, no rules-based instruction, and no filters involved in this curious search result?

Nope. I believe everything I read online about Microsoft. Call me silly.

Stephen E Arnold, June 9, 2021

Written by Stephen E. Arnold · Filed Under Business strategy, Microsoft, News, Search | Comments Off on More Microsoft Finger Pointing: Not 1,000 Programmers, Just One

Technical Debt: Paying Down Despite Disaster Waiting in the Wings

June 7, 2021

Some interesting ideas appear in “10 Ways to Prevent and Manage Technical Debt—Tips from Developers.” The listicle is not particularized on a specific application or service. Let me convert a few of the points in the article to the challenges that vendors of information retrieval software have to meet in a successful manner.

I am not tracking innovations in search the way I did when I wrote the first three editions of the Enterprise Search Report many years ago. Despite the hooting of marketers and “innovators” who don’t know much about the 50 year plus history of finding, search technology has not made much progress. In fact, if I were still giving talks at search-related events, I would present data showing that “findability” has regressed. Now to the matter at hand.

I am not sure most people understand what technical debt is in general and even fewer apply the concept to search and retrieval. To keep it simple, technical debt is not repairing and servicing your auto. You do just enough to keep the Nash Rambler on the road. Then it dies. You find that parts are tough to find and expensive to get. If you want to do the job “right,” you will find that specialists are on hand to make that hunk of junk gleam. Get out your checkbook and write small. You will be filling in some big numbers. Search is that Nash Rambler but you have a couple of Metropolitans and a junker of a 1951 Nash Ambassador sitting in your data center. You can get stuff from A to B, but each trip becomes more agonizing. Then you have to spend.

Technical debt is the amount you have to spend to get back up and running plus the lost revenue or estimated opportunity cost. These numbers are the cost of the hardware, software, knick knacks, and humans who sort of know what to do.

What about search? Let’s take three of the items identified in the article and consider them in terms of what is often incorrectly described as “enterprise search.” My work over the years has documented the fact that there is no enterprise search. Shocking? Think about it. Employees cannot find the video of that Zoom meeting or the transcript automagically prepared this morning. And that sales presentation with the new pricing? Oh, right, that’s on the VP’s laptop and it won’t connect to the cloud archiving system because the wizard executive has trouble opening a hotel room with the keycard. Like I said, “Wizard.”

Item number 2 in the article is “Embed technical debt management into the company culture.” Ho ho ho. The present state of play is to get something up and running, dump on features, and generate revenue, some revenue, any revenue. In many organizations, the pressure to move the needle trumps any weird ideas to go back and fix the plumbing. How often is the core of Google’s search and retrieval reworked? Yeah, not often and every year the job becomes less and less desirable. The legions of Xooglers who worked on the system are unlikely to return to the digital Disneyland to do this work even for dump trucks filled with Ethereum.

Item number 5 is “Make technical debt a priority in open source culture.” Okay, let’s think about open source search. Have you thought about Sphinx recently? What about Xapian? The big dogs are under intense pressure from the real champions of open source like Amazon and everyone’s favorite security company Microsoft. The individuals who do the bulk of the work struggle to make the darned thing work on the latest and greatest platforms and operating systems. The more outfits like Amazon pressure Elastic, the less likely the humans who work on Lucene and Solr will be able to fend off complete commercialization. Hey, there’s always consulting work or a job at IBM, another cheerleader for open source. So priority? Right.

Now item number 6 in the article. It is “Choose a flexible architecture.” What does this mean for search and retrieval? Most search and search centric applications like policeware and intelware are mashups of open source, legacy code left over from another project, and intern-infused scripts. The “architecture” is whatever was easiest and most financially acceptable. Once those initial decisions are made or simply allowed to happen because someone knew someone, the systems are unlikely to change. Fixing up something that sort of works is similar to the stars of VanWives repairing their ageing vehicle while driving in the rain. Ain’t gonna happen.

Net net: Technical debt for most organizations is what will bring down the company. Innovation slows to a crawl and becomes a series of add ons, wrappers, and strapping tape patches. Then boom. A competitor has blown the doors off the incumbent, customers just cancel contracts for enterprise search systems, or the once valued function becomes a feature for a more important application. Technical debt, like a college grad’s student loan, is a stress inducer. Stress can shorten one’s life and kill. The enterprise search market is littered with the corpses of outfits terminated from technical debt denial syndrome.

Stephen E Arnold, June 7, 2021

Written by Stephen E. Arnold · Filed Under News, Search | Comments Off on Technical Debt: Paying Down Despite Disaster Waiting in the Wings

Search: Still Struggling with Synonyms

June 3, 2021

I read “How AI Can Help Resolve Complex Fashion Taxonomies.” The write up states:

ecommerce retailers are struggling to find a system for managing the growing fashion taxonomy. For reference, fashion taxonomy is defined as the science of naming, describing, and classifying items into categories. And it affects every component of the customer experience, from search and discovery to product recommendations.

I agree. The problem has been a persistent one for decades. Statistical methods, manual methods, smart software methods: none works particularly well. Statistics drift as the language changes. What’s a slang word for sneakers; is it “kicks”? The idea is that an ecommerce site might not recognize this term unless a human entered it in a list of synonyms. Smart software might miss the nuances of pickle ball shoes that are wavy or nifty ice for a B.

If a person cannot locate a product, will that person enter synonyms or just click away to another site? That’s bad.
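The manual approach is easy to picture. Here is a minimal sketch, assuming a hand-maintained synonym list and a toy catalog of product titles; the `SYNONYMS` table, `expand`, and `search` are hypothetical names, not taken from the article or from any vendor's product.

```python
# Hand-entered synonym list: slang term -> catalog vocabulary.
# Someone has to add each entry; that is the maintenance burden.
SYNONYMS = {
    "kicks": {"sneakers"},
    "trainers": {"sneakers"},
}


def expand(query):
    """Return the query terms plus any synonyms a human has entered for them."""
    terms = set(query.lower().split())
    for term in list(terms):
        terms |= SYNONYMS.get(term, set())
    return terms


def search(query, catalog):
    """Return catalog titles that share at least one word with the expanded query."""
    terms = expand(query)
    return [title for title in catalog if terms & set(title.lower().split())]
```

With a catalog of `["White Leather Sneakers", "Brown Belt"]`, a query for "kicks" finds the sneakers only because a human added the entry. A query using slang nobody has entered yet returns nothing, and the shopper clicks away.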

The article asserts:

it’s also becoming increasingly difficult for customers to find what they’re looking for, regardless of search intent.

The phrase “increasingly difficult” does not quite capture what’s happening in online information access. Locating online information which is timely, accurate, and relevant is extraordinarily difficult.

The write up, however, has a possible fix:

Tackling complex fashion taxonomy is a heavy task, but with artificial intelligence, retailers now have different approaches to try. Through text-based and visual search tools, retailers have the power to change the way customers experience their products, leading to higher engagement rates and more conversions. The future of artificial intelligence as a remedy to complex fashion taxonomies is bright – and you can expect to see more products in the market in the future.

But the purpose of the write up is to explain that YesPlz is the way to deliver a “user initiated search experience, combined with artificial intelligence and visual search.”

Possibly, but I think the solutions which have rolled down the cash flow pipelines have not delivered. Language is a moving target and shoppers want the system to “know” what they want without having to speak, type, or do anything.

The big dog in ecommerce is Amazon. Bing and Google are working overtime to make their “shopping” search functions work better than the Bezos bulldozer’s. The problem is that, despite the tricks, the cohorts, the user fingerprint, and the rest of the methods, divining what a shopper wants and will buy is clumsy.

Marketing talk is a heck of a lot easier than solving what is becoming a problem too big to resolve. I don’t want a fashion item. I want a belt which does not look stupid. Woo woo.

Stephen E Arnold, June 4, 2021

Written by Stephen E. Arnold · Filed Under News, Search | Comments Off on Search: Still Struggling with Synonyms

Alation Releases New Version of its Enterprise Search Platform

June 3, 2021

Alation announces the latest release of its platform in its post, “How 2021.2 Is Remaking the Future of Enterprise Search.” This version comes with some handy features, like its table view, metadata search, and lexicon pairing. The post contains helpful screenshots. It is the tool’s boosted search ranking system, though, that writer Linh Nguyen puts at the top of the list. The platform’s AI now considers user input in establishing each resource’s worth. She tells us:

“Search results ranking and relevance now takes clues from social indicators. Alation catalog users have always been able to endorse or deprecate a given asset or dataset, signaling to their peers, ‘this asset is trustworthy’ or, by contrast, ‘warning! Use at your own risk! This asset is deprecated.’ With this update, we’re leveraging that tribal knowledge to influence all search rankings, illuminating the best assets that people trust. Specifically, user-created endorsements will boost ranking scores while deprecations will penalize rankings scores. Admins also have the option to customize the score associated with these trust flags (endorsements & deprecations). This empowers admins to effectively ‘endorse the endorsements’, further influencing rankings to promote the best assets to the right people.”

This sounds helpful, but we wonder whether it means content that is difficult to index will become even more difficult to find. What about audio, video, and comments in Slack, Teams, or Zoom; chemical structure and engineering diagrams; legal information within secured repositories; the PowerPoint data on sales professionals’ laptops? Improved UI and other nice-to-haves are well and good, but in our view comprehensive enterprise search remains elusive. Even with the power of AI.
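To make the trust-flag idea concrete, here is a minimal sketch of how endorsements and deprecations might nudge a base relevance score. The post does not publish Alation's formula, so the linear form, the function name `adjusted_score`, and the default weights are all assumptions; the only parts taken from the quote are that endorsements boost, deprecations penalize, and admins can tune the values.

```python
def adjusted_score(base, endorsements, deprecations,
                   endorse_weight=0.5, deprecate_weight=1.0):
    """Boost a base relevance score for endorsements and penalize it for deprecations.

    The two weights stand in for the admin-configurable trust-flag scores
    mentioned in the post; the linear combination is an assumption.
    """
    return base + endorse_weight * endorsements - deprecate_weight * deprecations


def rank_assets(assets):
    """Sort (name, base, endorsements, deprecations) tuples by adjusted score, best first."""
    return sorted(assets, key=lambda a: adjusted_score(a[1], a[2], a[3]), reverse=True)
```

Note what the sketch cannot do: an asset that was never indexed has no base score to adjust, which is the worry raised above.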

Cynthia Murrell, June 3, 2021

Written by Stephen E. Arnold · Filed Under News, Search | Comments Off on Alation Releases New Version of its Enterprise Search Platform

Need to Tame the Information Tsunamis in Databases? DbSurfer May Be Your Deviled Egg

June 2, 2021

An interesting article “DbSurfer: A Search and Navigation Tool for Relational Databases” describes a novel way to locate information in Codd databases. Nope, I won’t make a reference to codfish. The surfing metaphor is good enough today.

The write up states:

We present a new application for keyword search within relational databases, which uses a novel algorithm to solve the join discovery problem by finding Memex-like trails through the graph of foreign key dependencies. It differs from previous efforts in the algorithms used, in the presentation mechanism and in the use of primary-key only database queries at query-time to maintain a fast response for users.

The Memex reference is not to the mostly forgotten Australian search and retrieval system. The Memex in this paper is a nod to everyone’s information hero Vannevar Bush’s fanciful “memex device.” (No, Google is not a memex device.)

The method involves “joins” and “trails.” The result is a system that allows keyword search and navigation through relational databases.
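The join discovery problem can be illustrated with a toy example: treat tables as nodes, foreign keys as edges, and look for a path connecting the table where one keyword hit lives to the table where another lives. The sketch below uses a plain breadth-first search, which is a stand-in and not the paper's trail-finding algorithm; the table names and the helpers `build_fk_graph` and `find_join_trail` are hypothetical.

```python
from collections import deque


def build_fk_graph(foreign_keys):
    """Build an undirected adjacency map from (child_table, parent_table) pairs."""
    graph = {}
    for child, parent in foreign_keys:
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set()).add(child)
    return graph


def find_join_trail(graph, start, goal):
    """Return the shortest list of tables linking start to goal, or None if unconnected."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        # Sorted only to make the traversal order deterministic.
        for neighbor in sorted(graph.get(path[-1], ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None
```

Given foreign keys from `orders` to `customers`, `order_items` to `orders`, and `order_items` to `products`, the trail from `customers` to `products` runs through `orders` and `order_items`. Each hop is a join the user never has to write.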

The paper includes a useful list of references. (Some recent computer science graduates who are billing themselves as search experts might find reading a few of the citations helpful. Just a friendly suggestion to the AI, NLP, and semantic whiz types.)

Is this a product? Nope, not yet. Interesting idea, however.

Stephen E Arnold, June 2, 2021

Written by Stephen E. Arnold · Filed Under Database, News, Search | Comments Off on Need to Tame the Information Tsunamis in Databases? DbSurfer May Be Your Deviled Egg

Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.

Beyond Search
