March 24, 2017
Will search-and-discovery firm Diffeo’s recent acquisition give it the edge? Yahoo Finance shares, “Diffeo Acquires Meta Search and Launches New Offering.” Startup Meta Search developed a local computer and cloud search system that uses smart indexing to assign index terms and keep the terms consistent. Diffeo provides a range of advanced content processing services based on collaborative machine intelligence. The press release specifies:
Diffeo’s content discovery platform accelerates research analysts by applying text analytics and machine intelligence algorithms to users’ in-progress files, so that it can recommend content that fills in knowledge gaps — often before the user thinks of searching. Diffeo acts as a personal research assistant that scours both the user’s files and the Internet. The company describes its technology as collaborative machine intelligence.
Diffeo and Meta’s services complement each other. Meta provides unified search across the content on all of a user’s cloud platforms and devices. Diffeo’s Advanced Discovery Toolbox displays recommendations alongside in-progress documents to accelerate the work of research analysts by uncovering key connections.
Meta’s platform integrates cloud environments into a single keyword search interface, enabling users to search their files on all cloud drives, such as Dropbox, Google Drive, Slack and Evernote all at once. Meta also improves search quality by intelligently analyzing each document, determining the most important concepts, and automatically applying those concepts as ‘Smart Tags’ to the user’s documents.
This seems like a promising combination. Founded in 2012, Diffeo made Meta Search its first acquisition on January 10 of this year. The company is currently hiring. Meta Search, now called Diffeo Cloud Search, is based in Boston.
Cynthia Murrell, March 24, 2017
March 17, 2017
As institutions like banks and law enforcement come to grips with the flow of Bitcoin, another cyber currency is suddenly gaining ground. Bloomberg Technology reveals, “New Digital Currency Spikes as Drug Dealers Get More Secrecy.” The coin in question, Monero, has been around for a couple of years, but was recently given a boost by the marketplace AlphaBay, one of the most popular destinations for buyers of illicit drugs on the Dark Web. In the two weeks after the site announced it would soon accept Monero, the total worth of that currency in circulation jumped to over $100 million (from about $25 million the previous month). Writer Yuji Nakamura explains why a shift may be underway:
Bitcoin, the most popular digital currency in the world with a total value of $9.1 billion, also allows users to move funds discreetly and uses a network of miners to verify the authenticity of each trade. But its privacy has come under threat as governments and private investigators increase their ability to track transactions across the bitcoin network and trace funds to bank accounts ultimately used to convert digital assets to and from traditional currencies like U.S. dollars.
Monero similarly uses a network of miners to verify its trades, but mixes multiple transactions together to make it harder to trace the genesis of the funds. It also adopts ‘dual-key stealth’ addresses, which make it difficult for third-parties to pinpoint who received the funds.
‘For any two outputs, from the same or different transactions, you cannot prove they were sent to the same person,’ Riccardo Spagni, a lead developer of Monero, wrote by e-mail. Jumbling trades together makes it ‘impossible to tell which transaction, of a set of transactions, a particular input comes from. It appears to come from all of them.’
Though Monero has yet to withstand the trials of AlphaBay-level volumes for long, its security features received praise from investor and prominent digital-currency-advocate Roger Ver. As of this writing, Monero is ranked fifth among digital currencies in overall market value. Click here for a list of digital currencies ranked, in real time, by market cap.
Cynthia Murrell, March 17, 2017
March 16, 2017
We noticed that Attivio is back to enterprise search, and now uses the fetching catchphrase, “data dexterity company.” Their News page announces, “Attivio Chosen as Enterprise Search Platform for World’s Largest Repository of Foreign Language Media.” We’ve been keeping an eye on Attivio as it grows. With this press release, Attivio touts a large, recent feather in their cap—providing enterprise search services to SCOLA, a non-profit dedicated to helping different peoples around the world learn about each other. This tool enables SCOLA’s subscribers to find any content in any language, we’re told. The organization regards today’s information technology as crucial to their efforts. The write-up explains:
SCOLA provides a wide range of online language learning services, including international TV programming, videos, radio, and newspapers in over 200 native languages, via a secure browser-based application. At 85 terabytes, it houses the largest repository of foreign language media in the world. With its users asking for an easier way to find and categorize this information, SCOLA chose Attivio Enterprise Search to act as the primary access point for information through the web portal. This enables users, including teachers and consumers, to enter a single keyword and find information across all formats, languages and geographical regions in a matter of seconds. After looking at several options, SCOLA chose Attivio Enterprise Search because of its multi-language support and ease of customization. ‘When you have 84,000 videos in 200 languages, trying to find the right content for a themed lesson is overwhelming,’ said Maggie Artus, project manager at SCOLA. ‘With the Attivio search function, the user only sees instant results. The behind-the-scenes processing complexity is completely hidden.’
Attivio was founded in 2007 and is headquartered in Newton, Massachusetts. The company’s client roster includes prominent organizations like UBS, Cisco, Citi, and DARPA. They are also hiring for several positions as of this writing.
Cynthia Murrell, March 16, 2017
March 15, 2017
Apparently ahead of a rumored IPO launch, Russian search firm Yandex is introducing “Spectrum,” a semantic search feature. We learn of the development from “Russian Search Engine Yandex Gets a Semantic Injection” at the Association of Internet Research Specialists’ Articles Share pages. Writer Wushe Zhiyang observes that, though Yandex claims Spectrum can read users’ minds, the tech appears to be a mix of semantic technology and machine learning. He specifies:
The system analyses users’ searches and identifies objects like personal names, films or cars. Each object is then classified into one or more categories, e.g. ‘film’, ‘car’, ‘medicine’. For each category there is a range of search intents. [For example] the ‘product’ category will have search intents such as buy something or read customer reviews. So we have a degree of natural language processing, taxonomy, all tied into ‘intent’, which sounds like a very good recipe for highly efficient advertising.
But what if a search query has many potential meanings? Yandex says that Spectrum is able to choose the category and the range of potential user intents for each query to match a user’s expectations as close as possible. It does this by looking at historic search patterns. If the majority of users searching for ‘gone with the wind’ expect to find a film, the majority of search results will be about the film, not the book.
‘As users’ interests and intents tend to change, the system performs query analysis several times a week’, says Yandex. This amounts to Spectrum analysing about five billion search queries.
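The majority-intent idea in the quoted passage can be sketched in a few lines. This is only an illustration, not Yandex’s method: the `HISTORY` data, the function names, and the rule of handing leftover result slots to the dominant category are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical history: for each query, the category past users chose.
HISTORY = {
    "gone with the wind": ["film", "film", "book", "film", "film", "book"],
}


def category_shares(query, history=HISTORY):
    """Return each category's share of past selections for a query."""
    counts = Counter(history.get(query.lower(), []))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.items()}


def allocate_results(query, slots=10, history=HISTORY):
    """Split result slots across categories in proportion to past intent."""
    shares = category_shares(query, history)
    ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    allocation = {category: int(share * slots) for category, share in ranked}
    if ranked:
        # Assumption: slots lost to rounding go to the dominant category.
        allocation[ranked[0][0]] += slots - sum(allocation.values())
    return allocation


print(allocate_results("gone with the wind"))
```

With this made-up history, most of the ten slots go to the film and the remainder to the book, which is the behavior the passage describes.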
Yandex has been busy. The site recently partnered with VKontakte, Russia’s largest social network, and plans to surface public-facing parts of VKontakte user profiles, in real time, in Yandex searches. If the rumors of a plan to go public are true, will these added features help make Yandex’s IPO a success?
Cynthia Murrell, March 15, 2017
March 9, 2017
In the wake of Amazon’s glitch, a number of publications rushed to report on the who, what, where, and why. ZDNet took a different approach in “Which Cloud Will Give You the Biggest Bang for the Buck?” The write up recycled, in the best tradition of “real” journalism, a report from a vendor named Cloud Spectator. I won’t ask too many questions about sample size, methodology, the meaning assigned to “value,” and statistical validity. I will assume that the information is not Facebook news.
The guts of the write up is this chart, which is impossible to read in this blog post, but the original is reasonably legible:
What this chart reveals about hosting is that the 1&1 system is the big dog. I would point out that the naming of the service is “1+1” in the chart; the “real” name of the company is “1&1”, a real joy to search using free Web search systems.
Okay, 1+1 was on my radar as a very low cost provider of Web page hosting and other services. Now the company remains a low cost provider and has added a range of new services. Cloud Spectator finds the company A Number One. I was tempted to type ANo1, another keen string to plug into a Web search system.
What interested me was the cluster of outfits which the Cloud Spectator survey pegged as small dogs; for example, Amazon Web Services, the very same outfit that nuked some major Web sites. (Send in a two pizza team, Mr. Bezos.)
Close to Amazon’s lower third ranking was Microsoft Azure. Somehow that seems par for the new Microsoft. Google and the financially challenged Rackspace were in the middle of the pack. (What happened to Rackspace’s love affair with Robert Scoble, recently removed from the Gillmor Gang?)
But the major news for me was that IBM, yep, the owner of the famed and much admired Watson thing, was darn near last. IBM nosed out Dimension Data for the “Also Participated” badge.
Net net: Maybe 1&1 should get more attention. Perhaps the company will change its name to minimize the likelihood of misspellings. Alternatively 1&1 can hire Recode to endlessly repeat that one spells embarrassed with two r’s and two esses.
When it comes to search in the cloud, the question becomes, “How does one deploy an enterprise class search and content processing on the 1&1 system?” Good question.
Stephen E Arnold, March 9, 2017
March 9, 2017
Here is the story of another successful Dark Web bust. Motherboard reports, “Undercover FBI Agent Busts Alleged Explosives Buyer on the Dark Web.” The 50-year-old suspect was based in Houston, and reporter Joseph Cox examined the related documents from the Southern District of Texas court. We are not surprised to learn that the FBI found this suspect through its infiltration of AlphaBay. Cox writes:
The arrest was largely due to the work of an undercover agent who posed as an explosives seller on the dark web marketplace AlphaBay, showing that, even in the age of easy-to-use anonymization technology, old-school policing tactics are still highly effective at catching suspects.
According to the complaint, on August 21, an FBI Online Covert Employee (OCE)—essentially an undercover agent—located outside Houston logged into an AlphaBay vendor account they were running and opened an unsolicited private message from a user called boatmanstv. ‘looking for wireless transmitter with detonator,’ the message read. ‘Everything I need to set of a 5 gallon can of gas from a good distance away [sic].’ The pair started a rapport, and boatmanstv went into some detail about what he wanted to do with the explosives.
One thing led to another, and the buyer and “seller” agreed to an exchange after communicating for a couple of weeks. (Dark Web sting operations require patience. Lots of patience.) It became clear that boatmanstv had some very specific plans in mind for a very specific target, and that he’d made plenty of purchases from AlphaBay before. The FBI was able to connect the suspect’s email account to other accounts, and finally to his place of business. He was arrested shortly after receiving and opening the FBI’s package, so it would appear there is one fewer violent criminal on the streets of Houston.
It is clear that the FBI, and other intelligence organizations, are infiltrating the Dark Web more and more. Let the illicit buyer be wary.
Cynthia Murrell, March 9, 2017
March 8, 2017
I read “Ontologies: Practical Applications.” The main idea in the write up is that indexing is important. Now indexing is labeled in different ways today; for example, metadata, entity extraction, concepts, etc. I agree that indexing is important, but the challenge is that most people are happy with tags, keywords, or systems which return a result that has made a high percentage of users happy. Maybe semi-happy. Who really knows? Asking about search and content processing system satisfaction returns the same grim news year after year; that is, most users (roughly two thirds) are not thrilled with the tools available to locate information. Not much progress in 50 years it seems.
The write up informs me:
Ontologies are a critical component of the enterprise information architecture. Organizations must be capable of rapidly gathering and interpreting data that provides them with insights, which in turn will give their organization an operational advantage. This is accomplished by developing ontologies that conceptualize the domain clearly, and allows transfer of knowledge between systems.
This seems to mean a classification system which makes sense to those who work in an organization. The challenge which we have encountered over the last half century is that the content and data flowing into an organization change, often rapidly, over time. At any one point in time, the information needed today is not available. The organization sucks in what’s needed and hopes the information access system indexes the new content right away and makes it findable and usable in other software.
That’s the hope anyway.
The reality is that a gap exists between what’s accessible to a person in an organization and what information is being acquired and used by others in the organization. Search fails for most system users because what’s needed now is not indexed or if indexed, the information is not findable.
An ontology is a fancy way of saying that a consultant and software can cook up a classification system and use those terms to index content. Nifty idea, but what about that gap?
This is the killer for most indexing outfits. They make a sale because people are dissatisfied with the current methods of information access. An ontology or some other jazzed up indexing component is sold as the next big thing.
When an ontology, taxonomy, or other solution does not solve the problem, the company grouses about search and content processing again.
Is there a fix? Who knows. But after 50 years in the information access sector, I know that jargon is not an effective way to solve very real problems. Money, know how, and old school methods are needed to make certain technologies deliver useful applications.
Ontologies. Great. Silver bullet. Nah. Practical applications? Nifty concept. Reality is different.
Stephen E Arnold, March 8, 2017
March 3, 2017
Trying to sell a state of the art, next-gen search and content processing system can be tough. In the article, “Most Companies Slow to Adopt New Business Tech Even When It Can Help,” Digital Trends demonstrates that a reluctance to invest in something new is not confined to Search. Writer Bruce Brown cites the Trends vs. Technologies 2016 report (PDF) from Capita Technology Solutions and Cisco. The survey polled 125 ICT [Information and Communications Tech] decision-makers working in insurance, manufacturing, finance, and the legal industry. More in-depth interviews were conducted with a dozen of these folks, spread evenly across those fields.
Most higher-ups acknowledge the importance of keeping on top of, and investing in, worthy technological developments. However, that awareness does not inform purchasing and implementation decisions as one might expect. Brown specifies:
The survey broke down tech trends into nine areas, asking the surveyed execs if the trends were relevant to their business, if they were being implemented within their industry, and more specifically if the specific technologies were being implemented within their own businesses. Regarding big data, for example, 90 percent said it was relevant to their business, 64 percent said it was being applied in their industry, but only 39 percent reported it being implemented in their own business. Artificial intelligence was ranked as relevant by 50 percent, applied in their industry by 25 percent, but implemented in their own companies by only 8 percent. The Internet of Things had 70 percent saying it is relevant, with 50 percent citing industry applications, but a mere 30 percent use it in their own business. The study analyzed why businesses were not implementing new technologies that they recognized could improve their bottom line. One of the most common roadblocks was a lack of skill in recognizing opportunities within organizations for the new technology. Other common issues were the perception of security risks, data governance concerns, and the inertia of legacy systems.
The survey also found the stain of mistrust, with 82 percent of respondents sure that much of what they hear about tech trends is pure hype. It is no surprise, then, that they hesitate to invest resources and impose change on their workers until they are convinced benefits will be worth the effort. Perhaps vendors would be wise to dispense with the hype and just lay out the facts as clearly as possible; potential customers are savvier than some seem to think.
Cynthia Murrell, March 3, 2017
March 2, 2017
You may have heard about Google X’s Project Loon, which aims to bring Internet access to underserved, rural areas using solar-powered balloons. The post, “Here’s How Google Makes its Giant, Internet-Beaming Balloons,” at Business Insider takes us inside that three-year-old project, describing some of how the balloons are made and used. The article is packed with helpful photos and GIFs. We learn that the team has turned to hot-air-balloon manufacturer Raven Aerostar for their expertise. The write-up tells us:
The balloons fly high in the stratosphere at about 60,000 to 90,000 feet above Earth. That’s two to three times as high as most commercial airplanes. Raven Aerostar creates a special outer shell for the balloons, called the film, that can hold a lot of pressure — allowing the balloons to float in the stratosphere for longer. The film is as thin as a typical sandwich bag. … The film is made of a special formulation of polyethylene that allows it to retain strength when facing extreme temperatures of up to -112 degrees Fahrenheit.
We like the sandwich bag comparison. The balloons are tested in sub-freezing conditions at the McKinley Climatic Lab; see the article for dramatic footage of one of their test subjects bursting. We also learn about the “ballonet,” an internal compartment in each balloon that controls altitude and, thereby, direction. Each balloon is equipped with a GPS tracker, of course, and all electronics are secured in a tiny basket below.
One caveat is a bit disappointing—users cannot expect to stream high-quality videos through the balloons. Described as “comparable to 3G,” the service should be enough for one to visit websites and check email. That is certainly far better than nothing and could give rural small-business owners and remote workers the Internet access they need.
Cynthia Murrell, March 2, 2017
February 28, 2017
I enjoy the “next frontier”-type article about search and retrieval. Consider “The Next Frontier of Internet and Search,” a write up in the estimable “real” journalism site Huffington Post. As I read the article, I heard “Scotty, give me more power.” I thought I heard 20 somethings shouting, “Aye, aye, captain.”
The write up told me, “Search is an everyday part of our lives.” Yeah, maybe in some demographics and geo-political areas. In others, search is associated with finding food and water. But I get the idea. The author, Gianpiero Lotito of FacilityLive, is talking about people with computing devices, an interest in information like finding a pizza, and the wherewithal to pay the fees for zip zip connectivity.
And the future? I learned:
The future of search appears to be in the algorithms behind the technology.
I understand algorithms applied to search and content processing. Since humans are expensive beasties, numerical recipes are definitely the go to way to perform many tasks. For indexing, humans are still fact checking, curating, and indexing textual information. The math does not work the way some expect when algorithms are applied to images and other rich media. Hey, sorry about that false drop in the face recognition program used by Interpol.
I loved this explanation of keyword search:
The difference among the search types is that: the keyword search only picks out the words that it thinks are relevant; the natural language search is closer to how the human brain processes information; the human language search that we practice is the exact matching between questions and answers as it happens in interactions between human beings.
This is as fascinating as the fake information about Boolean being a probabilistic method. What happened to string matching and good old truncation? The truism about people asking questions is intriguing as well. I wonder how many mobile users ask questions like, “Do manifolds apply to information spaces?” or “What is the chemistry allowing multi-layer ion deposition to take place?”
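For readers who have forgotten what string matching and truncation look like, here is a minimal sketch. It assumes a trailing asterisk marks right-hand truncation, a common convention in older retrieval systems; the function name and the crude punctuation stripping are illustrative choices, not any particular vendor’s implementation.

```python
def truncation_match(pattern, text):
    """Return the words in text that match pattern.

    A trailing '*' means right-hand truncation: any word starting with
    the stem matches. Without it, only an exact word match counts.
    """
    # Crude tokenizer: split on whitespace, drop surrounding punctuation.
    words = [w.strip('.,;:!?"\'()').lower() for w in text.split()]
    pattern = pattern.lower()
    if pattern.endswith("*"):
        stem = pattern[:-1]
        return [w for w in words if w.startswith(stem)]
    return [w for w in words if w == pattern]


sample = "Indexing, indexes and the index matter."
print(truncation_match("index*", sample))
print(truncation_match("index", sample))
```

No guessing about which words the system “thinks are relevant” is involved; the string either matches or it does not.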
The write up drags in the Internet of Things. Talk to one’s Alexa or one’s thermostat via Google Home. That’s sort of natural language; for example, Alexa, play Elvis.
Here’s the paragraph I highlighted in NLP crazy red:
Ultimately, what the future holds is unknown, as the amount of time that we spend online increases, and technology becomes an innate part of our lives. It is expected that the desktop versions of search engines that we have become accustomed to will start to copy their mobile counterparts by embracing new methods and techniques like the human language search approach, thus providing accurate results. Fortunately these shifts are already being witnessed within the business sphere, and we can expect to see them being offered to the rest of society within a number of years, if not sooner.
Okay. No one knows the future. But we do know the past. There is little indication that mobile search will “copy” desktop search. Desktop search is a bit like digging in an archeological pit on Cyprus: Fun, particularly for the students and maybe a professor or two. For the locals, there often is a different perception of the diggers.
There are shifts in “the business sphere.” Those shifts are toward monopolistic, choice limited solutions. Users of these search systems are unaware of content filtering and lack the training to work around the advertising centric systems.
I will just sit here in Harrod’s Creek and let the future arrive courtesy of a company like FacilityLive, an outfit engaged in changing Internet searching so I can find exactly what I need. Yeah, right.
Stephen E Arnold, February 28, 2017