Lucidworks Sees Watson as a Savior

December 21, 2016

Lucidworks (really?). A vision has appeared to the senior managers of Lucidworks, an open source search outfit which has ingested $53 million and sucked in another $6 million in debt financing in June 2016. Yep, that Lucidworks. The “really” which the name invokes is an association I form when someone tells me that commercializing open source search is going to knock off the pesky Elastic of Elasticsearch fame while returning a juicy payoff to the folks who coughed up the funds to keep the company founded in 2007 chugging along. Yep, Lucid works. Sort of, maybe.

I read “Lucidworks Integrates IBM Watson into Fusion Enterprise Discovery Platform.” The write up explains that Lucidworks is “tapping into” the IBM Watson developer cloud and that Lucidworks has:

an application framework that helps developers to create enterprise discovery applications so companies can understand their data and take action on insights.

Ah, so many buzzwords. Search has become applications. “Action on insights” puts some metaphorical meat on the bones of Solr, the marrow of Lucidworks. Really?

With Watson in the company’s back pocket, Lucidworks will deliver. I learned:

Customers can rely on Fusion to develop and deploy powerful discovery apps quickly thanks to its advanced cognitive computing features and machine learning from Watson. Fusion applies Watson’s machine learning capabilities to an organization’s unique and proprietary mix of structured and unstructured data so each app gets smarter over time by learning to deliver better answers to users with each query. Fusion also integrates several Watson services such as Retrieve and Rank, Speech to Text, Natural Language Classifier, and AlchemyLanguage to bolster the platform’s performance by making it easier to interact naturally with the platform and improving the relevance of query results for enterprise users.

But wait. Doesn’t Watson perform these functions already? And if Watson comes up a bit short in one area, isn’t IBM-infused Yippy ready to take up the slack?

That question is not addressed in the write up. It seems that the differences among Watson, its current collection of partners, and affiliated entities like Yippy are vast. The write up tells me:

customers looking for hosted, pre-tuned machine learning and natural language processing capabilities can point and click their way to building sophisticated applications without the need for additional resources. By bringing Watson’s cognitive computing technology to the world of enterprise data apps, these discovery apps made with Fusion are helping professionals understand the mountain of data they work with in context to take action.

This sounds like quite a bit of integration work. Lucidworks. Really?

Stephen E Arnold, December 21, 2016

Creativity for Search Vendors

December 18, 2016

If you scan the marketing collateral from now defunct search giants like Convera, DR LINK, Fulcrum Technologies or similar extinct beasties, you will notice a similarity of features and functions. Let’s face it. Search and retrieval has been stuck in the mud for decades. Some wizards point to the revolution of voice search, emoji based queries, and smart software which knows what you want before you know you need some information.

Typing key words, indexing systems which add concept labels, and shouting at a mobile phone whilst standing between cars on a speeding train returns semi-useful links to what amount to homework: Open link, scan for needed info, close link, and do it again.

Eureka, California is easy to find. Get inspired.

Now there is a solution to search and content processing vendors’ inability to be creative. These methods appear to fuel the fanciful flights of fancy emanating from predictive analytics, Big Data, and semantic search companies.

Navigate to “8 Tried-and-Tested Ways to Unlock Your Creativity.” Now you too can emulate the breakthroughs, insights, and juxtapositions of Leonardo, Einstein, Mozart, and, of course, Facebook’s design team.

Let’s take a look at these eight ideas.

  1. Set up a moodboard. I have zero idea what a moodboard is. I am not sure it would fit into the work methods of Beethoven. He seemed a bit volatile and prone to “bad” moods.
  2. Talk it out. That’s a great idea for companies engaged in classified projects for nation states. Why not have those conversations in a coffee shop or better yet on an airplane with strangers sitting cheek by jowl.
  3. Brainstorming. My recollection of brainstorming is that it can be fun, but without one person who doesn’t get with the program, the “ideas” are often like recycled plastic bottles. Not always, of course. But the donuts can be a motivator.
  4. Mindmapping. Yep, diagrams. These are helpful, particularly when equations are included for the home economics and failed webmasters who wrangle a job at a search or content processing vendor. What’s that pitchfork looking thing mean?
  5. Doodling. Works great. The use of paper and pencils is popular. One can use a Microsoft Surface or a giant iPad thing. Profilers and psychologists enjoy doodles. Venture capitalists who invested in a search and content processing company often sketch somewhat dark images.
  6. Music. Forget that Mozart and fighter pilot stuff. Go for Gregorian chants, heavy metal, and mindfulness tunes. Here in Harrod’s Creek, we love Muzak featuring the Whites and John Lomax.
  7. Lucid dreaming. This idea is popular among some of the visionaries working at high profile Sillycon Valley companies. Loon balloons, solar powered Internet aircraft, and trips to Mars. Apply that thinking to search and what do you get? Tay, search by sketch, and smart maps which identify pizza joints.
  8. Imagine what a great innovator would do. That works. People sitting on a sofa playing a video game can innovate between button pushes.

Why are search and content processing vendors more creative? Now these folks can go in new directions armed with these tips and the same eight or nine algorithms in wide use. Peak search? Not by a country mile.

Stephen E Arnold, December 18, 2016

Use Google on Itself to Search Your Personal Gmail Account

December 16, 2016

The article titled 9 Secret Google Search Tricks on Field Guide includes a shortcut to checking on your current and recent deliveries, your flight plans, and your hotels. Google provides this information by pulling keywords from your Gmail account inbox. Perhaps the best one for convenience is searching “my bills” and being reminded of upcoming payments. Of course, this won’t work for bills that you receive via snail mail. The article explains,

Google is your portal to everything out there on the World Wide Web…but also your portal to more and more of your personal stuff, from the location of your phone to the location of your Amazon delivery. If you’re signed into the Google search page, and you use other Google services, here are nine search tricks worth knowing. It probably goes without saying but just in case: only you can see these results.

Yes, search is getting easier. Trust Mother Google. She will hold all your information in her hand and you just need to ask for it. Other tricks include searching “I’ve lost my phone.” Google might not be Find My iPhone, but it can tell you the last place you had your phone, given that your phone was linked to your Google account. Hotels, Events, Photos, Google will have your back.

Chelsea Kerwin, December 16, 2016

Big Data Needs to Go Public

December 16, 2016

Big Data touches every part of our lives, and we are unaware. Have you ever noticed when you listen to the news, read an article, or watch a YouTube video that people say things such as “experts claim,” “science says,” etc.? In the past, these statements relied on less than trustworthy sources, but now they can use Big Data to back up their claims. However, popular opinion and puff pieces still need to back up their big data with hard fact. The article “More Accountability For Big-Data Algorithms” says that transparency is a big deal for Big Data and that algorithm designers need to work on it.

One of the hopes is that big data will be used to bridge the divide between one bias and another, except that the opposite can happen. In other words, Big Data algorithms can be designed with a bias:

There are many sources of bias in algorithms. One is the hard-coding of rules and use of data sets that already reflect common societal spin. Put bias in and get bias out. Spurious or dubious correlations are another pitfall. A widely cited example is the way in which hiring algorithms can give a person with a longer commute time a negative score, because data suggest that long commutes correlate with high staff turnover.

Even worse is that people and organizations can design an algorithm to support science or facts they want to pass off as the truth. There is a growing demand for “algorithm accountability,” mostly in academia. The demands are that data sets fed into the algorithms are made public. There are also plans to make algorithms that monitor algorithms for bias.

Big Data is here to stay, but relying too much on algorithms can distort the facts. This is why the human element is still needed to distinguish between fact and fiction. Minority Report is closer to being our present than ever before.

Whitney Grace, December 16, 2016

Costs of the Cloud

December 15, 2016

The cloud was supposed to save organizations a bundle on servers, but now we learn from Datamation that “Enterprises Struggle with Managing Cloud Costs.” The article cites a recent report from Dimensional Research and cloud-financial-management firm Cloud Cruiser, which tells us, for one thing, that 92 percent of organizations surveyed now use the cloud. Researchers polled 189 IT pros at Amazon Web Services (AWS) Global Summit in Chicago this past April, where they also found that 95 percent of respondents expect their cloud usage to expand over the next year.

However, organizations may wish to pause and reconsider their approach before throwing more money at cloud systems. Writer Pedro Hernandez reports:

Most organizations are suffering from a massive blind spot when it comes to budgeting for their public cloud services and making certain they are getting their money’s worth. Nearly a third of respondents said that they aren’t proactively managing cloud spend and usage, the study found. A whopping 82 percent said they encountered difficulties reconciling bills for cloud services with their finance departments.

‘The top challenge with the continuously growing public cloud resource is the ability to manage allocation usage and costs,’ stated the report. ‘IT and Finance continue to have difficulty working together to ascertain and allocate public cloud usage, and IT continues to struggle with technologies that will gather and track public cloud usage information.’ …

David Gehringer, principal at Dimensional Research, believes it’s time for enterprises to quit treating the cloud differently and adopt IT monitoring and cost-control measures similar to those used in their own data centers.

The report also found that top priorities for respondents included cost and reporting at 54 percent, performance management at 46 percent, and resource optimization at 45 percent. It also found that cloud demand is driven by application development and testing, at 59 percent, and big data/analytics at 31 percent.

The cloud is no longer a shiny new invention, but rather an integral part of most organizations. We would do well to approach its management and funding as we would any other resource. The original report is available, with registration, here.

Cynthia Murrell, December 15, 2016

On the Hunt for Thesauri

December 15, 2016

How do you create a taxonomy? These curated lists do not just write themselves, although they seem to do that these days.  Companies that specialize in file management and organization develop taxonomies.  Usually they offer customers an out-of-the-box option that can be individualized with additional words, categories, etc.  Taxonomies can be generalized lists, think of a one size fits all deal.  Certain industries, however, need specialized taxonomies that include words, phrases, and other jargon particular to that field.  Similar to the generalized taxonomies, there are canned industry specific taxonomies, except the more specialized the industry the less likely there is a canned list.

This is where the taxonomy lists need to be created from scratch. Where do the taxonomy writers get the content for their lists? They turn to the tried-and-true resources that have aided researchers for generations: dictionaries, encyclopedias, technical manuals, and thesauri. Thesauri are perhaps among the most important tools for taxonomy writers, because they include not only words and their meanings, but also synonyms and antonyms within a field.

If you need to write a taxonomy and are at a loss, check out MultiTes. It is a Web site that includes tools and other resources to get your taxonomy job done. Multisystems built MultiTes and they:

…developed our first computer program for Thesaurus Management on PC’s in 1983, using dBase II under CPM, predecessor of the DOS operating system.  Today, more than three decades later, our products are as easy to install and use. In addition, with MultiTes Online all that is needed is a web connected device with a modern web browser.

In other words, they have experience and know their taxonomies.

Whitney Grace, December 15, 2016

The Robots Are Not Taking over Libraries

December 14, 2016

I once watched a Japanese anime that featured a robot working in a library.  The robot shelved, straightened, and maintained order of the books by running on a track that circumnavigated all the shelves in the building.  The anime took place in a near-future Japan, when all paper documents were rendered obsolete.  While we are a long way off from having robots in public libraries (budget constraints and cuts), there is a common belief that libraries are obsolete as well.

Libraries are the furthest thing from being obsolete, but robots have apparently gained enough artificial intelligence to find lost books. Popsci shares the story in “Robo Librarian Tracks Down Misplaced Book.” It explains a situation that librarians hate to deal with: people misplacing books on shelves instead of letting the experts put them back. Libraries rely on books being in precise order, and if they are in the wrong place, they are as good as lost. Fancy libraries, like a research library at the University of Chicago, have automated the process, but that approach is too expensive and unrealistic for most libraries to deploy. There is another option:

A*STAR roboticists have created an autonomous shelf-scanning robot called AuRoSS that can tell which books are missing or out of place. Many libraries have already begun putting RFID tags on books, but these typically must be scanned with hand-held devices. AuRoSS uses a robotic arm and RFID scanner to catalogue book locations, and uses laser-guided navigation to wheel around unfamiliar bookshelves. AuRoSS can be programmed to scan the library shelves at night and instruct librarians how to get the books back in order when they arrive in the morning.
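For the curious, the "missing or out of place" check can be pictured as a comparison between the catalog and a nightly scan. The following is only a rough sketch under stated assumptions, not how AuRoSS actually works: the function name `audit_shelves`, the tag ids, and the shelf labels are all hypothetical, and a real system would also have to deal with read errors and ordering within a shelf.

```python
from typing import Dict, List, Tuple


def audit_shelves(
    catalog: Dict[str, str], scan: List[Tuple[str, str]]
) -> Tuple[List[str], List[Tuple[str, str, str]]]:
    """Compare a nightly RFID scan against the catalog (illustrative only).

    catalog: tag id -> shelf where the book belongs (hypothetical layout)
    scan:    (shelf, tag id) pairs in the order the scanner read them
    Returns (missing, misplaced):
      missing   - sorted tag ids that are in the catalog but were never read
      misplaced - (tag, shelf found on, home shelf) for books on the wrong shelf
    """
    seen = {tag for _, tag in scan}
    missing = sorted(tag for tag in catalog if tag not in seen)
    misplaced = [
        (tag, shelf, catalog[tag])
        for shelf, tag in scan
        if tag in catalog and catalog[tag] != shelf
    ]
    return missing, misplaced


# Made-up data: T2 was never read, and T3 turned up on shelf A instead of B.
catalog = {"T1": "A", "T2": "A", "T3": "B", "T4": "B"}
scan = [("A", "T1"), ("A", "T3"), ("B", "T4")]
missing, misplaced = audit_shelves(catalog, scan)
```

With this made-up data, `missing` holds the one tag that was never read and `misplaced` holds the one book found on the wrong shelf, which is the sort of to-do list a librarian could work through in the morning.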

Manual labor is still needed to put the books in order after the robot does its work at night.   But what happens when someone needs help with research, finding an obscure citation, evaluating information, and even using the Internet correctly?  Yes, librarians are still needed.  Who else is going to interpret data, guide research, guard humanity’s knowledge?

Whitney Grace, December 14, 2016

DuckDuckGo Makes Search Enhancements by Leveraging Yahoo Partnership

December 13, 2016

The article titled “New Features from a Stronger Yahoo Partnership” relates the continuation of the relationship between DuckDuckGo and Yahoo. DuckDuckGo has gained fame for its unique privacy policy of not tracking its users, which of course flies in the face of the Google Goliath, which is built on learning about its users by monitoring their habits and improving the search engine using that data. Instead, DuckDuckGo insists on forgetting its users and letting them search without fear of it being recorded somewhere. The article conveys some of the ways that Yahoo is mingled with the David of search engines:

In addition to the existing technology we’ve been using, DuckDuckGo now has access to features you’ve been requesting for years: Date filters let you filter results from the last day, week and month. Site links help you quickly get to subsections of sites. Of course our privacy policy remains the same: we don’t track you. In addition, we’re happy to announce that Yahoo has published a privacy statement to the same effect.

Paranoid internet users and people with weird secretive fetishes alike, rejoice! DuckDuckGo will soon be vastly improved. The article does not state an exact date for this new functionality to be revealed, but it is coming soon.

Chelsea Kerwin, December 13, 2016

Ten Search Engines That Are Not Google

December 13, 2016

Business-design firm Vandelay Design shares their 10 favorite alternatives to Google Search in their blog post titled, “Alternative Search Engines for Designers and Developers.” Naturally, writer Jake Rocheleau views these resources from a designer’s point of view, but don’t let that stop you from checking out the list. The article states:

New intriguing search engines frequently pop up as a replacement to the juggernaut that is Google. But it’s tough to find alternative search engines that actually work and provide real value to your workflow. I’d like to cover a handful of alternatives that work well for designers and developers. These aren’t all web crawler search engines because I did throw in a few obscure choices for design resources too. But the sites in this list may be better replacements for Google no matter what you’re searching for. …

All 10 of these search engines are viable choices to add into your workflow, or even replace existing sites you already use. Designers are always looking for new tools and I think these sites fit the bill.

Rocheleau describes his selections and gives tips for getting the most out of each. He leads with DuckDuckGo—come for the privacy, stay for the easter eggs. StartPage also promises privacy as it pulls results from other search engines. Designers will like Instant Logo Search, for locating SVG vector logos, and Vecteezy for free vector designs. Similarly, Iconfinder and DryIcons both offer collections of free icons.

For something a little different, try The WayBack Machine at the Internet Archive, where you can comb the archives for any previously existing domain. Rocheleau suggests designers use it to research competitors and gain inspiration, but surely anyone can find interesting artifacts here.

We are reminded that one can get a lot from WolframAlpha if one bothers learning to use it. Then there is Ecosia, which uses ad revenue to plant trees across the globe. (They have planted over four million trees since the site launched in December of 2009.) The final entry is Qwant, another engine that promises privacy, but also offers individual search features for categories like news, social-media channels, and shopping. For anyone tired of Google and Bing, even non-designers, this list points the way to several good alternatives.

Cynthia Murrell, December 13, 2016

Add Free Search to the Free Tibet Slogan

December 13, 2016

China is notorious for censoring its people’s access to the Internet. I have heard and made more than one pun about the Great Firewall of China. There is a new search engine in China, but it will not be in Chinese, says Quartz: “How Censored Is China’s First Tibetan Language Search Engine? It Omits The Dalai Lama’s Web Site.”

Yongzin is the first Tibetan language search engine. It is supposed to act as a unified portal for all the major Tibetan language Web sites in China. There are seven million Tibetan people in China, but the two big Chinese search engines, Baidu and Sogou, do not include the Tibetan language. Google is banned in China.

Yongzin rips off Google in colors and function.  The Chinese government has dealt with tense issues related to the country of Tibet for decades:

The Chinese government wants the service to act as a propaganda tool too. In the future, Yongzin will provide data for the government to guide public opinion across Tibet, and monitor information in Tibetan online for “information security” purposes, Tselo, who’s in charge of Yongzin’s development, told state media (link in Chinese) at Monday’s (Aug. 22) launch event.

When people search Yongzin with Tibet related keywords, such as Dalai Lama and Tibetan tea, China’s censorship shows itself at work. Nothing related to the Dalai Lama is shown, not even his Web site; instead, an article about illegal publications appears.

China wants to position itself as guardian of the Tibetan culture, but instead it proffers a Chinese-washed version of Tibet rather than the true thing. It is another reason why the Free Tibet campaign is still important.

Whitney Grace, December 13, 2016
