October 10, 2016
“Computer Scientists Discover 14 key Components of Creativity” tries to reveal what differentiates the creative person from the average individual. Let’s look at the attributes of creativity:
- Active involvement and persistence. Yes, this is two attributes packaged as one.
- Dealing with uncertainty
- Domain competence
- General intellectual ability
- Generation of results
- Independence and freedom. Another two’fer.
- Intention and emotional involvement. The bundling approach seems semi-creative. Maybe a cop out?
- Progression and development. Again!
- Social interaction and communication. Obviously a pattern of creating one “new” idea by sticking two things together.
- Spontaneity/subconscious processing. Again two to make one.
- Thinking and evaluation. Isn’t evaluating a component of thinking?
- Variety, Divergence as well as Experimentation. Now three attributes combine to produce one attribute. Does this overlap with other components?
Reflecting on this list, my hunch is that a bit more creativity in making the attributes clear might be needed. The list is interesting, but it lacks, how shall I phrase it, creativity.
How does this apply to search?
For old school Boolean systems like ip.com, one has to know what one is looking for before the search system is of much help. Thus, the system can only respond to inputs from a human. The more creative the human, the less likely the cut and paste snippets function will be of use. Other systems with this less than creative approach include other Boolean systems and Lucene.
For modern predictive systems, the creativity shifts from the human to the software. The idea is that the software will look at the user’s history, similar users’ behaviors, GPS coordinates, and other observable information and produce outputs like Alexa or Google mobile search. The human does not have to think. The creativity seems somewhat limited because when one is looking for pizza via a mobile phone, some of the attributes seem less than creative.
Search systems which try to respond to the thoughts and notions of the human user and software delivering results based on rules are elusive.
Creativity may be difficult to generate and deal with. Perhaps that is why the list of 14 attributes includes multiple word descriptions which try to get at a single notion.
Cleverness is not on the list. Why not? I find clever approaches to search more interesting than creative searches. Clever?
Stephen E Arnold, October 10, 2016
October 10, 2016
I have concluded that finding information is entering a mini Dark Ages. The evidence I have gathered suggests that young folks will speak to their mobile devices to get pizza and information for their PhD research projects. I have a folder of examples of applications of smart software which produces remarkable marketing assertions and black box outputs.
I have added to my collection the write up “Postgres Full Text Search Is Good Enough.” I learned that “good enough” search has these features:
- Ranking / Boost Support
- Multiple languages
- Fuzzy search for misspelling
- Accent support
I assume, which is risky, that keywords are part of the basic feature set. But in a world of “good enough”, who knows?
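For readers wondering what “fuzzy search for misspelling” might involve, here is a minimal Python sketch. It assumes trigram overlap as the matching technique, which is my illustration and not something this post confirms about the write up. The function names `trigrams` and `similarity` are hypothetical helpers, not a Postgres API.

```python
def trigrams(word: str) -> set[str]:
    """Split a padded, lower-cased word into its set of three-character chunks."""
    padded = f"  {word.lower()} "  # padding lets word boundaries count as characters
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a: str, b: str) -> float:
    """Share of trigrams the two words have in common (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


# A misspelling should score closer to the intended word than to an unrelated one.
for candidate in ("search", "pizza"):
    print(candidate, round(similarity("serach", candidate), 2))
```

A system built this way would return the candidates whose score clears some threshold. Picking that threshold is a tuning decision the sketch leaves open.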
The write up provides code snippets for and details regarding the implementation of Postgres’ search function. The explanation of Postgres’ internal methods may require that you keep some Postgres manuals handy and have a browser pointed at Bing or Google to chase down some of the jargon; for instance:
A tsvector value is a sorted list of distinct lexemes which are words that have been normalized to make different variants of the same word look alike. For example, normalization almost always includes folding upper-case letters to lower-case and often involves removal of suffixes (such as ‘s’, ‘es’ or ‘ing’ in English). This allows searches to find variant forms of the same word without tediously entering all the possible variants.
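The normalization described in the quotation can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the suffix list comes straight from the quoted passage, the three-character minimum is my own guard, and `normalize` and `to_lexemes` are hypothetical names. The real system relies on far more capable language-specific dictionaries and stemmers.

```python
import re

# Toy suffix list taken from the quoted passage; a real stemmer handles far more cases.
SUFFIXES = ("ing", "es", "s")


def normalize(word: str) -> str:
    """Fold to lower case and strip one common English suffix (illustrative only)."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Assumed guard: keep at least three characters so short words survive.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def to_lexemes(text: str) -> list[str]:
    """Return a sorted list of distinct normalized words."""
    words = re.findall(r"[A-Za-z]+", text)
    return sorted({normalize(w) for w in words})


print(to_lexemes("Searching searches Search"))  # ['search']
```

Three variants collapse into one lexeme, which is the point of the passage: a query for one form can match documents containing the others.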
The section on optimization and indexation provides some useful guidelines. Trouble may result from mismatching one’s data with the types of indices Postgres offers.
If you are using Postgres and interested in “good enough” search, you will find the write up helpful. If you are an entrepreneur and want to tap into an underserved market for a graphical administrative interface for Postgres “good enough” search, you will find that the write up provides a checklist for you to follow.
For me in rural Kentucky, I marvel at the happy acceptance of “good enough” search. Once “good enough” takes hold, where does one find the impetus to deliver outstanding search? Do I look to dtSearch? IBM OmniFind (aka, Watson)? A whizzy cloud service like Amazon’s?
I suppose I can ask Siri or Cortana. Good enough search and good enough answers. Except when the answers are off point or just incorrect. A “C” is good enough for today’s business and technical approaches.
Stephen E Arnold, October 10, 2016
October 7, 2016
Bing is the redheaded stepchild of search engines, but according to the Motley Fool the Microsoft owned search engine started to earn a profit during its last fiscal year. The Motley Fool shares the story in “Bing Became Profitable Last Year. Can It Keep Up?” Bing’s search advertising generated $5.5 billion in estimated revenue, which is more than what Twitter and Tencent earned. Into 2016, Bing continues to turn a profit.
Bing’s revenue grew in Microsoft’s last fiscal quarter, and in June 40% of the search revenue came from Windows 10 devices. The free Windows 10 upgrade ends soon, and that will end this source of growth because Bing will no longer see a high adoption rate. Microsoft will continue to grow Bing and profit is predicted to continue to rise:
One important factor is that Microsoft outsourced its display advertising business at the beginning of fiscal 2016. That has allowed the company to focus its sales team on its search advertisements, which generally carry higher prices and margins than display ads. That makes the sales team more cost-efficient for Microsoft to run while it collects high-margin revenue from outsourcing its display ads.
This means Microsoft will raise its ad prices and will focus on selling more ads to appear with search results. Bing will never compete with Google’s massive revenue, but it has proven that it is less of a copycat and more of a stable, profit generating search engine.
October 6, 2016
I read two articles about the future of search. The first was a series of remarks in a podcast by Christopher Isaac Stone, aka Biz. In a nutshell, one finds information by asking people.
The other write up was “If I Ran Google (Why the Future of Search Will Diverge from Its Present and Past).” The author of this article is a “multi time bestselling author.” The “A” was capitalized.
The two views of the future of search underscore the perception that keyword search is dead. Text is uninteresting. Search systems are bouncing like a dead cat; that is, typing words in a search box and looking for germane information is not where the world of users wants to go. Hence, search is going to change.
Left: clay tablet from the 4th millennium BCE. Right: clay tablets from 2016. Not much change, it seems.
Here’s a statement which hits at the future of search. The quote comes from the multi time bestselling “Author”:
A lot of younger people don’t use Google as much as we might expect. They find things on YouTube (an Alphabet company), Snapchat, Facebook, Instagram or the like.
I agree. Pizza, cat videos, and even information about the future of search by many people will be sought and found using something other than the digital equivalent of a library card catalog. Thump. That’s the sound of the dead cat bouncing or hitting the pavement.
The thump means Google, the game changer, is going to have the game changed for itself.
The future is actionable intelligence. Ask a question and get an answer. Then order a pizza or watch a living cat video. Dead cats are not interesting.
First, there are numerous ways to look for information needed to answer a question. There are search boxes when one presumably is working on a research paper or maybe an article destined for publication. That is the old fashioned work which requires attention, note taking, and thinking about a topic and how to answer questions for which there is no single journal article or reliable data set. This type of research will not appeal to some people.
Second, there is the convenience of asking others for information. This is a useful type of information collection. Sometimes it works, and other times it forces the questioner to drag himself or herself back to the old fashioned method described in item one above.
Third, there is smart software which looks at behaviors and makes a best guess about what the person needs to know. When I drive to the airport, I want my GPS to show me which parking garage has an open space. No typing and no talking, please. Just the map with the answer.
In each of these broad categories of access — typing keywords, asking via text or voice, or smart software making best guesses — useful information can be located.
Most of the folks with whom I interact are not happy with search, a broad term used to describe a remarkable range of information access systems.
The problem with the future is that it is not going to bounce like the dead cat of the present.
If I have learned one thing in my years in the information access sector it is:
Information access methods do not die. Options become available.
Regardless of the future, some reading is necessary. Some talking to humans is necessary. Some smart software inputs are necessary.
Heck, here in Harrod’s Creek, people still use clay tablets to communicate. The message about the future is that “good enough” information access is more important than old fashioned checkpoints like precision, recall, provenance, and understanding.
Stephen E Arnold, October 6, 2016
October 6, 2016
One of the newest forms of search is using actual images. All search engines from Google to Bing to DuckDuckGo have an image search option, where using keywords you can find an image to your specifications. It seemed to be a thing of the future to use an actual image to power a search, but it has actually been around for a while. The only problem was that reverse image searching sucked and returned poor results.
Now the technology has improved, but very few people actually know how to use it. ZDNet explains how to use this search feature in the article, “Reverse Image Searching Made Easy…”. It explains that Google and TinEye are the best places to begin a reverse image search. Google has the larger image database, but TinEye has the better photo experts. TinEye is better because:
TinEye’s results often show a variety of closely related images, because some versions have been edited or adapted. Sometimes you find your searched-for picture is a small part of a larger image, which is very useful: you can switch to searching for the whole thing. TinEye is also good at finding versions of images that haven’t had logos added, which is another step closer to the original.
TinEye does have its disadvantages, such as outdated results that point to images one can no longer find on the Web. In some cases Google is the better choice as one can search by usage rights. Browser extensions for image searching are another option. Lastly, if you are a Reddit user, Karma Decay is a useful image search tool and users often post comments on the image’s origin.
The future of image searching is now.
October 4, 2016
This week’s HonkinNews is available at this link. The feature story explores Palantir Technologies’ love-less love relationship with the US Army. Palantir’s approach to keeping its government customers happy is innovative. We also comment about Google’s blurring of cow faces in StreetView. Learn why SearchBlox is giving vendors of expensive, proprietary enterprise search systems cramps in their calves. Microsoft continues to pay users to access the Internet via Edge and use Bing to search for information. How much does the US government spend for operations and maintenance of its systems? The figure is surprising, if not shocking. This and more in HonkinNews for October 4, 2016.
Kenny Toth, October 4, 2016
October 4, 2016
One would think that in the days of instant information, we all would be expert searchers and know how to find any fact. The problem is that most people type entire questions into search engines and allow natural language processing to do the hard labor. There is a smarter way to search than lazy question typing and Geek Squad has a search literacy guide you might find useful: “Search Engine Secrets: Find More With Google’s Hidden Features.”
What very few people know (except us search gurus) is that search engines have hidden tricks you can use to find your results quicker and make search easier. While Google is the standard search engine and all these tricks are geared towards that search engine, they will also work with other ones. The standard way to search is by typing a query into the search bar and some of these typing tricks are old school, such as using quotation marks for an exact phrase, searching one specific Web site, wildcards, Boolean operators, and using a minus sign (-) to exclude terms.
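The typing tricks can be combined in a single query. The following Python sketch shows one way to assemble such a string. The helper `build_query` and its parameter names are hypothetical, the domain is a placeholder, and the operators shown (quotation marks, OR, the minus sign, and site:) are the widely documented ones. Individual search engines may treat them differently.

```python
def build_query(phrase=None, any_of=(), exclude=(), site=None):
    """Compose a search-box query string from common typed operators."""
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')            # quotation marks: exact phrase
    if any_of:
        parts.append(" OR ".join(any_of))      # Boolean OR between alternatives
    parts.extend(f"-{term}" for term in exclude)  # minus sign: exclude a term
    if site:
        parts.append(f"site:{site}")           # restrict results to one Web site
    return " ".join(parts)


print(build_query(
    phrase="enterprise search",
    any_of=["Solr", "Lucene"],
    exclude=["jobs"],
    site="example.com",  # placeholder domain
))
# "enterprise search" Solr OR Lucene -jobs site:example.com
```

A wildcard can ride along inside the phrase argument, for example `phrase="good * search"`, since the helper passes the text through unchanged.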
Searching for pictures is a much newer search form and is usually done by clicking on the image search on a search engine. However, did you know that most search engines have the option to search with an image itself? With Google, simply drag and drop an image into the search bar to start the process. There are also delimiters on image search to filter results by specifics, such as GIFs, size, color, and others.
Even newer than image search is voice search with a microphone. Usually, voice search is employed with a digital assistant like Cortana and Siri. Some voice search commands are:
- Find a movie: What movies are playing tonight? or Where’s Independence Day playing?
- Find nearby places: Where’s the closest cafe?
- Find the time: What time is it in Melbourne?
- Answer trivia questions: Where was Albert Einstein born? or How old is Beyonce?
- Translate words or phrases: How do you say milk in Spanish?
- Define a word: What does existentialism mean?
- Convert between units: What’s 16 ounces in grams?
- Solve a math problem: What’s the square root of 2,209?
- Book a restaurant table: Book a table for two at Dorsia on Wednesday night.
The only problem is that only the typing tricks transfer to professional research. They are used at universities, research institutes, and even large companies. The biggest problem is that people do not know how to use them in those organizations.
October 3, 2016
Google allows customers to save digital content to its Drive service. The hitch in the git along is that finding content can be difficult. Google, the search company which pays the bills selling ads, has introduced an information access utility to address this need.
I read “Google Updates Drive, Smarter Search Bar, Natural Language Processing and More.” The write up reminded me that with Google Drive I could “keep everything, share anything.” The article likes Google—a lot. I noted the word “fantastic.” That’s good. Fantastic.
The idea is that one no longer has to use key words. A person can ask a question; for example, I can “search like you talk.” Google is making search like “talk” because users requested this function.
The write up points out that [Google] Docs:
will also now automatically save a copy of the non-Google file you open, convert and edit in Docs, Sheets or Slides, in its original format. This feature has been introduced considering the fact that work can happen across a spectrum of formats. Meanwhile, you can view or download the non-Google source file in its original format directly from Revision History in Docs, Sheets and Slides on the web.
The search function sounds perfect for the mobile user who finds keywords troublesome.
- Dumping digital content into a pile makes locating the specific item difficult when date, time, and other constraints are not available. Google does not “do time” very well.
- The notion of heterogeneous document types is an interesting one. What content types are supported and searchable? FrameMaker or Analyst’s Notebook files perhaps?
- A search vendor introducing improved search is interesting to me. With precision and recall for Google Web search apparently eroded by other considerations, Google appears to be supporting multiple methods of locating information. Is this “let many flowers bloom” or a signal that search is Balkanizing?
As each user’s “pile” of digital artifacts grows, how will a person locate related content; for example, text, image, third party content in a way that makes sense? I do not see great progress in findability, but I like the marketing that inspires words like “fantastic.”
The convenience of Google Drive offers some useful information to those able to analyze a user’s content. Add in archived messages and search histories, and the “new” search creates an interesting concoction.
Stephen E Arnold, October 3, 2016
September 30, 2016
Beyond Search learned that open source search and retrieval solution Solr won a Bossie Award. The outfit involved in the awards said that Solr was “a trusted and mature search engine technology.” Big outfits using Solr include Zappos, Comcast, and DuckDuckGo.
Also bringing home an award was the Lucene-based Elasticsearch. The description of Elasticsearch pointed out:
As part of the ELK stack (Elasticsearch, Logstash, and Kibana, all developed by Elasticsearch’s creators, Elastic), Elasticsearch has found its killer app as an open source Splunk replacement for log analysis.
Users of Lucene include Microsoft and LinkedIn. (What’s the problem with SharePoint Search? What prevents Microsoft from using Fast Search & Transfer technology in lieu of open source search?)
Why are Solr and Lucene the go to search utilities? Free? Actual bug fixes and not excuses? No licensing leg shackles? Did I mention free?
Stephen E Arnold, September 30, 2016
September 30, 2016
Enterprise search has taken a back seat to search news regarding Google’s next endeavor and what the next big thing is in big data. Enterprise search may have taken a back seat in my news feed, but it is still a major component in enterprise systems. You can even speculate that without a search function, enterprise systems are useless.
Lexmark, one of the largest suppliers of printers and business solutions in the country, understands the importance of enterprise search. This is why it recently updated the description of its Perceptive Enterprise Search in its system’s technical specifications:
Perceptive Enterprise Search is a suite of enterprise applications that offer a choice of options for high performance search and mobile information access. The technical specifications in this document are specific to Perceptive Enterprise Search version 10.6…
A required amount of memory and disk space is provided. You must meet these requirements to support your Perceptive Enterprise Search system. These requirements specifically list the needs of Perceptive Enterprise Search and do not include any amount of memory or disk space you require for the operating system, environment, or other software that runs on the same machine.
Some technical specifications also provide recommendations. While requirements define the minimum system required to run Perceptive Enterprise Search, the recommended specifications serve as suggestions to improve the performance of your system. For maximum performance, review your specific environment, network, and platform capabilities and analyze your planned business usage of the system. Your specific system may require additional resources above these recommendations.
It is pretty standard fare when it comes to technical specifications; in other words, not that interesting but necessary to make the enterprise system work correctly.