Dark Web Search: Specialized Services Are Still Better
March 26, 2020
Free Dark Web search is a hit-and-miss solution. In fact, “free” Dark Web search is often useless. Some experts do not agree with DarkCyber’s view, however. The reason is that these experts may not be aware of the specialized services available to government agencies and qualified licensees.
Here’s a recent example of cheerleading for a limited Dark Web search system.
A search engine did not exist for the Dark Web until now, says Digital Shadows in the article, “Dark Web Search Engine Kilos: Tipping The Scales In Favor Of Cybercrime.” Back in 2017, there was a search engine dubbed Grams that specialized in searching the Dark Web. It was taken down, and its creator is said to be Larry Harmon, supposed operator of Helix, the Bitcoin tumbling service. The Dark Web was search engine free until November 2019, when Kilos debuted.
Kilos piggybacks on the same concept as Grams: using a Google-like search structure to locate illegal goods and services, bad actors, and cybercriminal marketplaces. Kilos has indexed more platforms, offers more search functions, and includes many ways to ensure that users remain anonymous. Grams and Kilos are clearly linked by their names, both of which are units of measure.
Grams was the prominent search engine for the Dark Web because it searched everywhere, including Dream Market, Hansa, and AlphaBay, and users could also hide their Bitcoin transactions via Helix. Grams did not have a powerful structure to crawl and index the Internet. It was also expensive to maintain. This resulted in it going dark in 2017.
The argument is that Kilos is killing the Dark Web search scene as a more robust and powerful crawler/indexer. It already has indexed Samsara, Versus, Cannazon, CannaHome, and Cryptonia. Plus it has way more search functions to filter search results. Every day Kilos indexes more of the Dark Web’s content and has a unique feature Grams did not:
“Since the site’s creation in November 2019, the Kilos administrator has not only focused on increasing the site’s index but has also implemented updates and added new features and services to the site. These updates and features ensure the security and anonymity of its users but have also added a human element to the site not previously seen on dark web-based search engines, by allowing direct communication between the administrator and the users, and also between the users themselves.”
Kilos is adding more services to keep its users happy and anonymous. Among the upgrades are a CAPTCHA ranking system, faster search algorithm, a new Bitcoin mixer service, live chat, and ways to directly communicate with the administration.
Kilos sounds like an impressive search application startup, but wipe away the technology and it’s another tool to help bad actors hurt and break the system.
So what’s the issue? Kilos focuses on Dark Web storefronts, not the higher-value content in other, difficult-to-index Dark Web content pools.
But PR is PR, even in the Dark Web world.
Whitney Grace, March 26, 2020
Cloud Search Magic
March 26, 2020
Storing files in the cloud is a marvelous way to back up files and also free up valuable storage on devices. There is one big problem if you offload files to the cloud: finding them. There are various platforms for storing files in the cloud, but Popular Science explains in the article “Find Any File In The Cloud” that if you are unfamiliar with a platform, it will be harder to find files.
The article explores popular cloud hosting platforms and walks readers through how to locate and search for files. The platforms examined are Google Drive, Dropbox, iCloud, and OneDrive. Each platform has its intricacies, and these are important to master:
“But if you haven’t taken the time to explore a platform in depth, or if you use several and often get confused, you might find it harder to track down particular files compared to having them on a local hard drive. It doesn’t have to be this way, though. All the big cloud storage providers have useful tools for searching through your files and folders, whether you’re using a web browser, a desktop computer, or your phone.”
Be aware that these platforms can change based on the device accessing them. Many platforms have separate mobile and desktop interfaces, so things move around if you switch from one machine to another. None of these platforms is superior to the others, but users will prefer one over another based on the type of machine they are using.
Another thing to consider when selecting a platform is the set of security parameters each one uses. The platform could be easy to use, but it also might be easy to hack.
Whitney Grace, March 26, 2020
Daedalus Enterprise Search Appliance with ElasticSearch Inside
March 25, 2020
Open source software is a boon to companies and organizations that cannot afford the steep price tag of proprietary software. Open source, however, does have its drawbacks, including a lack of customer support, security issues, and the fact that the software is only as good as its developer. PR Web describes how the Department of Defense is getting an overdue search upgrade: “PSSC Labs Launches Daedalus Enterprise Search Appliance.”
The Department of Defense relied on Elasticsearch for many digital tasks, including cybersecurity and logistics. Elasticsearch was providing the one-and-done solution the Department of Defense needed for its advanced workloads. Enter PSSC Labs with its Daedalus Enterprise Search Appliance to the rescue. PSSC Labs designs and builds custom big data and high performance computing solutions. Daedalus Enterprise Search Appliance is a new platform powered by Elastic and compatible with Elastic Cloud Enterprise.
The Daedalus Enterprise Search Appliance will upgrade the Department of Defense’s system components. It also will not require a huge investment; the upgrade cost will be reasonable. The Department of Defense went with PSSC Labs because:
“ ‘We chose Elasticsearch as the foundation of the platform because it offers the flexibility and simplicity other application packages do not. With Elastic, everything is included in one simple per node price. This means companies can utilize the high-performance Elastic Stack for a variety of workloads including log analysis, cybersecurity, simple distributed storage, geospatial data analysis, and other concepts that are still yet to be discovered,’ said Alex Lesser, PSSC Labs Vice President.”
Other than the reasonable cost and product quality, the Department of Defense selected PSSC Labs’ Daedalus Enterprise Search Appliance because it was built on Elastic. Elastic is open source software, but many proprietary software companies build their own products on free technology. The move to the Daedalus Enterprise Search Appliance should be relatively simple, as the current Department of Defense system is based on Elasticsearch.
Whitney Grace, March 25, 2020
Semantic Sci-Fi: Search Is Great
March 23, 2020
I read “Keyword Search is DEAD; Semantic Search Is Smart.” I assume the folks at Medium consider each article, weigh its value, and then release only the highest value content.
Semantic search is better than any other type of search in the galaxy.
Let’s assume that the write up is correct and keyword search is dead. Further, we shall ignore the syntax of SQL queries, the dependence of policeware and intelware systems on users’ looking for named entities, and overlook the interaction of people using an automobile’s navigation service by saying, “Home.” These are examples of keyword search, and I decided to give a few examples, skipping how keyword search functions in desktop search, chemical structure systems, medical research, and good old, bandwidth trimming YouTube.
Okay, what’s the write up say beyond “keyword search is dead”?
Here are some points I extracted as I worked my way through the write up. I required more than three minutes (the Medium estimate) because my blood pressure was spiking, and I was hyperventilating.
Factoid 1 from the write up:
If you do semantic search, you can get all information as per your intent.
What’s with this “all”? Content domains, no matter what the clueless believe, are incomplete. There is no “all” when it comes to online information which is indexed.
Factoid 2 from the write up:
semantic search seeks to understand natural language the way a human would.
Yep, natural language queries are possible within certain types of content domains. However, the systems I have worked with and have had an opportunity to use in controlled situations exhibit a number of persistent problems. These start with computational constraints: one system could support only four simultaneous users on a corpus of fewer than 100,000 text documents. Others simply output “good enough” results. Not surprisingly, when a physician needs an antitoxin to save a child’s life, keywords work better than “good enough” in my experience. NLP has been getting better, but the idea that systems can integrate widely different data which may be incomplete, incorrect, or stale and return a useful output is a big hurdle. So far no one has gotten over it on a consistent, affordable basis. Shortcuts to reduce index look ups can be packaged as semantics and NLP, but mostly these are clever ways to improve “efficiency.” Understanding? Sometimes. Precision and recall? Not yet.
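For readers who want the two yardsticks spelled out, here is a minimal sketch of how precision and recall are computed for a single query. The document IDs are hypothetical and the helper name `precision_recall` is an assumption made for illustration; it is not taken from any system discussed in the write up.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = share of retrieved documents that are relevant
    recall    = share of relevant documents that were retrieved
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical document IDs for a single query.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d2", "d4", "d7", "d9", "d11"]

p, r = precision_recall(retrieved_docs, relevant_docs)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.50 recall=0.40
```

A system that returns “good enough” results can score well on one measure and poorly on the other, which is why both numbers matter.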
Semantic Search Allegedly Adds A Boost To Product Discovery
March 20, 2020
Semantic search is one of the old reliable pieces of jargon for improving a search application, but it appears to be old hat. Semantic search, however, can, when correctly implemented, add a much needed boost for product discovery.
Grid Dynamics explains semantic magic in the article, “Boosting Product Discovery With Semantic Search.” We all know that human language is a complicated beast, which is why it has taken decades to develop decent voice-to-text and automated foreign language translation algorithms.
Humans learn from infancy to process speech based on context and life experience. As technology has progressed, search engines are expected to perform the same actions, which is where semantic search enters the game. Semantic search not only matches key words and phrases, but it brings meaning to them. Ecommerce Web sites require more than keyword and phrase search. Customers want to sort products based on price, brands, ratings, etc.
I am a librarian, and I know that irrelevant results often appear in any search. There are two types of these results: obviously irrelevant values and values with subtle differences. A simple solution does not exist to fix all the irrelevant results.
Solutions are usually built as a hybrid of semantic search and unstructured data. For the semantic search part, several requirements apply: single words that belong to multi-word phrases must be treated as part of an unbreakable unit, business domain knowledge must restrict or enhance query options, and ambiguous matches need to be resolved by the salience of the matched attributes. Boolean queries also can be implemented in new ways to alter searches. Semantic search can also be used with different physical properties and merchandising rules.
Semantic search is a powerful tool for ecommerce Web sites, but:
“However, the power of semantic search largely depends on the richness and quality of the domain data – product attribution as well as synonyms. If your customers often perform out-of-dictionary search, then semantic search quality will suffer. It can include
• searches by subjective features like occasion of clothing (church dress) or age group for hi-tech device (laptops for kids)
• searches for brands which aren’t carried by your site, but it has similar products which can be suggested instead of just dropping the brand value from a query”
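The phrase and attribute requirements described above, along with the out-of-dictionary caveat in the quote, can be sketched in a few lines. This is a minimal illustration and not Grid Dynamics’ implementation: the vocabulary, the attribute names, and the function `tag_query` are all hypothetical, and greedy longest-match is just one simple way to keep multi-word phrases unbroken.

```python
# Hypothetical domain vocabulary: phrase -> (attribute, value).
VOCAB = {
    "dress shirt": ("category", "dress shirt"),
    "dress": ("category", "dress"),
    "shirt": ("category", "shirt"),
    "blue": ("color", "blue"),
    "navy blue": ("color", "navy blue"),
    "calvin klein": ("brand", "Calvin Klein"),
}
# Longest phrase in the vocabulary, measured in words.
MAX_PHRASE_LEN = max(len(phrase.split()) for phrase in VOCAB)


def tag_query(query):
    """Tag a query with greedy longest-match against the vocabulary.

    Multi-word phrases win over their single-word parts, so "dress shirt"
    is never split into "dress" + "shirt". Words missing from the
    vocabulary fall back to plain keywords.
    """
    tokens = query.lower().split()
    tagged = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(MAX_PHRASE_LEN, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in VOCAB:
                tagged.append(VOCAB[phrase])
                i += n
                break
        else:
            # Out-of-dictionary word: no attribute can be assigned.
            tagged.append(("keyword", tokens[i]))
            i += 1
    return tagged


print(tag_query("navy blue dress shirt"))
# [('color', 'navy blue'), ('category', 'dress shirt')]
print(tag_query("church dress"))
# [('keyword', 'church'), ('category', 'dress')]
```

The second query shows the weakness the quote points out: “church” is a subjective feature the vocabulary does not know, so the system has only a bare keyword to work with.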
Never doubt how semantic search can improve an ecommerce search engine, but be sure to instill proper parameters for it to work correctly. Semantic search will remain a favorite of marketing whether a system is helping the person looking for information or hindering relevancy.
Whitney Grace, March 20, 2020
A New Horizon for Verizon: Swizzled Search Results
March 19, 2020
DarkCyber read “Yahoo, AOL, OneSearch Results Biased in Favor of Parent Company Verizon Media’s Web Sites.” The main idea seems to be that, as for bakers in 11th century France, a thumb on the scale could pay dividends. A gram here, a gram there.
The article asserts:
You may not be surprised to learn that the search results from all three of Verizon Media’s search engines are biased in favor of Verizon Media websites. Yahoo!, AOL, and OneSearch all boosts the ranking of Verizon Media brands in organic search results. That is to say, regular web results excluding ads, news, shopping, image, and video search results.
Surprised? Nope. The revelatory factoid is that Bing indexes the Verizon content. Neither Bing nor Google reveals exactly how many Web sites their respective systems index. Useless information like how many links the crawlers follow in a Web site is not made explicit.
DarkCyber’s test queries suggest that Bing indexes only sites with a higher probability of being clicked. We have noted that for some queries, the Bing results closely parallel Google’s. Bing search administrators, are you monitoring Mother Google?
Therefore, what a happy coincidence that Bing indexes and displays in a favorable position the Verizon owned sites. In the good old days, the approach was called hit boosting. Today it probably has the words artificial intelligence and semantic technology obfuscating the shaping of content to meet a specific business need.
Progress in search? Absolutely. Just search engine optimization, however.
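For those who never met hit boosting in the wild, here is a minimal sketch of the general idea: multiply the relevance score of results from favored domains before sorting. The domains, the scores, the 1.5 multiplier, and the function `boost_hits` are all assumptions for illustration; nothing here describes how Bing or Verizon Media actually rank results.

```python
from urllib.parse import urlparse

# Hypothetical domains the operator wants to favor.
BOOSTED_DOMAINS = {"example-owned-news.com", "example-owned-finance.com"}
BOOST_FACTOR = 1.5  # assumed multiplier, chosen only for the example


def boost_hits(results):
    """Re-rank (url, relevance_score) pairs, favoring boosted domains."""

    def adjusted(item):
        url, score = item
        host = urlparse(url).hostname or ""
        if host.startswith("www."):
            host = host[4:]
        return score * BOOST_FACTOR if host in BOOSTED_DOMAINS else score

    return sorted(results, key=adjusted, reverse=True)


# Hypothetical organic results, best score first.
organic = [
    ("https://independent.example.org/review", 0.90),
    ("https://another.example.net/post", 0.80),
    ("https://www.example-owned-news.com/story", 0.70),
]

for url, score in boost_hits(organic):
    print(url, score)
```

With these numbers the lowest-scoring organic hit moves to the top of the list because it lives on a favored domain. The user sees only the final order, not the thumb on the scale.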
Stephen E Arnold, March 19, 2020
https://www.ctrl.blog/entry/verizon-media-search.html
A Guide to Finding Cloudy Files
March 11, 2020
Just a brief honk to describe this handy reference we have found. Popular Science tells us how to “Find Any File in the Cloud.” Writer David Nield describes platform-specific search functionality at Dropbox, Google Drive, iCloud, and OneDrive. He observes:
“Keeping your files in an online cloud locker means you can free up some space on your computer and get at your files from anywhere, using any device. But if you haven’t taken the time to explore a platform in depth, or if you use several and often get confused, you might find it harder to track down particular files compared to having them on a local hard drive. It doesn’t have to be this way, though. All the big cloud storage providers have useful tools for searching through your files and folders, whether you’re using a web browser, a desktop computer, or your phone.”
For each option, Nield tells us where to find a basic search box as well as all filtering options. He also notes each platform’s limitations, if any. Naturally, the descriptions are illustrated with screenshots. See the writeup if you use, or are considering using, any of these cloud storage options.
Cynthia Murrell, March 11, 2020
Import.io and Connotate: One Year Later
March 3, 2020
There has been an interesting shift in search and content processing. Import.io, founded in 2012, purchased Connotate. Before you ask, “Connotate what?”, let me say that Connotate was a content scraping and analysis firm. I paid some attention to Connotate when it acquired Fetch, an outfit with an honest-to-goodness Xoogler on its team. Fetch processed structured data, and Connotate was mostly an unstructured data outfit. I asked a Connotate professional when the company would process Dark Web content, only to be told, “We can’t comment on that.” Secretive, right?
Connotate was founded in 2000 and required about $25 million in funding. The amount Import.io paid was not revealed in a source to which DarkCyber has access. Import.io has ingested about $38 million. DarkCyber assumes that the stakeholders are confident that 1 + 1 will equal 3 or more.
Import.io says:
We are funded by some of the greatest minds in technology.
The great minds include AME Cloud Ventures, Open Ocean, IP Group, and several others.
The company explains:
Starting from a simple web data extractor and evolving to an enterprise level solution for concurrently getting data that drives business, industry, and goodness.
What’s the company provide? The answer is Web data integration: Identify, extract, prepare, integrate, and consume content from a user-provided list of URLs. To illustrate the depth of the company’s capabilities, Import.io defines “prepare” this way:
Integrate prepared data with a library of APIs to support seamless integration with internal business systems and workflows or deliver it to any data repository to develop robust data sets for advanced analytics capabilities.
The firm’s Web site makes it clear that it serves the online travel, retail, manufacturing, hedge fund, advisory services, data scientists, analysts, journalists, marketing and product, hospitality, and media producers. These are a mix of sectors and industries, and DarkCyber did not create the grammatically inconsistent listing.
Import.io offers videos which provide some information about one of its important innovations, “interactive extractors.” The idea is to convert script editing to point-and-click choices.
The company is growing. About a year ago, Import.io said that it experienced record sales growth. The company provided a link to its Help Center, but a number of panels contained neither information nor links to content.
The company offers a free version and a premium version. Price quotes are provided by the company.
Like Amplyfi and maybe ServiceMaster, Import.io is a company providing search and content processing with a 21st century business positioning. A new buzzword is needed to convey what Import.io, Amplyfi, and ServiceMaster are providing. DarkCyber believes that these companies are examples of where search and content processing has begun to coalesce.
The question is, “Is acquiring, indexing, and analyzing OSINT content a truck stop or a destination like Miami Beach?”
Worth monitoring the trajectory of the company.
Stephen E Arnold, March 3, 2020
Microsoft Azure: Search, Artificial Intelligence, and Some Mystical Magic
March 3, 2020
DarkCyber spotted “Microsoft Announcements on Azure Artificial Intelligence.” The article is a summary of assorted Microsoft Azure assertions. Note that the article did not offer any information about the semi-failure of Cortana and Windows 10 search to thrill users. But Azure is different. Microsoft does Azure better than Windows 10 updates… sometimes.
There were several highlights in the article.
First, Azure has artificial intelligence. The approach is open, interoperable, workflow, and “easy adaptation.” Is this why certified Microsoft Azure professionals are buying new houses and fancier automobiles?
Second, Azure does machine learning. The idea is that there are agents, applications, a machine learning model engine, support for R, and an enterprise edition. DarkCyber does not know a single person running Azure to make life better, faster, and cheaper except Azure consultants. But the big assertion is that Azure’s ML “delivers a unified data science experience.” DarkCyber wonders, “Does this include Outlook attachments?”
Third, Azure has updated some of its “old” features. There’s nothing like constant improvement like the flow of Windows 10 updates, uninstalls, and reinstalls. Now Azure does better decision making. Sentiment analysis has more deep learning and natural language processing. The system can do image analysis, and it has some of that Cortana goodness which has been repositioned in Windows 10 because it was so darned wonderful.
Fourth, Azure does knowledge mining. Azure does cognitive search. Azure recognizes forms.
The showcase client is a publishing company. The Atlantic has gone all in on the Azure systems. Another happy camper is AutoTrader.ca. Plus Archive 360 is tickled with the ability to use Azure cognitive search quickly and cost effectively. Yep, DarkCyber believes this was a smooth, easy implementation.
If you doubt that Microsoft is number one, read the article. If not, you will enjoy some of the ironies. How many search systems does Microsoft offer? How many of them are super? Who remembers Fast Search & Transfer?
Yep, super search the Azure way. It’s just like using Word’s numbering feature or figuring out PowerPoint backgrounds.
Stephen E Arnold, March 3, 2020
Trellis Research Gets Money And New Technical Co-Founder
February 27, 2020
If there is one industry that needs a powerful and accurate search and analytics tool, it is the court system. Los Angeles startup Trellis Research, which specializes in software for state court data, recently made news with a big funding round and an addition to its team. TechCrunch explains the details in “Building A Search Tool For State Court Data And Analytics, Trellis Adds Alon Shwartz As Co-Founder.”
Trellis Research is a fire starter startup, known for designing analytics and search software for state legal systems. Its new co-founder, Alon Shwartz, is best known for Docstoc, an online store and electronic document depository for financial, legal, and professional documents, and unGlue, a startup that regulates screen time for families.
Craft Ventures recently raised $4.4 million in funding for Trellis. The company also added a new technical co-founder, Alon Shwartz. Shwartz’s new role will be chief product officer. He will work side by side with the company founder, Nicole Clark.
Trellis’s home office is in California, where it covers California Superior Court records and judicial analytics. With the new round of funding, the company hopes to expand to Florida, Delaware, Texas, and New York. Clark founded Trellis when she discovered a need for better search and analytics software in the courts:
“ ‘I was customer one,’ says Clark of the product. A former litigator in Los Angeles, the entrepreneur developed Trellis to serve her own research needs. ‘I used this data for two years and during those years I won every motion that I had,’ says Clark. ‘It made it so obvious what a competitive advantage this was. It’s a way to analyze how a judge thinks about issues and a lawyer can draft their motions with a particular judge in mind.’”
Trellis offers a freemium search service for state trial decisions and filings, but to access the actual documents people need to become paying users. There is a $100 fee for individuals, and pricing for enterprise users is negotiable. Once beyond the paywall, users can file documents, download, print, and analyze them.
Clark promises that attorneys will double their win rate with Trellis Research software.
Whitney Grace, February 27, 2020