February 19, 2015
These days, anyone can be a private investigator; all you need are the Internet and some know-how. CNet lays out “5 Tips for Finding Anything, About Anyone, Online.” Writer Sarah Jacobsson Purewal begins:
“I think everyone should have decent online stalking skills. Not because I condone stalking, but because knowledge is power—if you don’t know how to find people online, how do you know what people can find about you online? Googling yourself is like checking your credit report for inaccuracies: it’s only effective as a preventative measure if you do it thoroughly and routinely. Whether you’re looking for yourself or a friend (no judgment), here are five tips for finding out anything, about anyone, online.”
Purewal begins with the logical first step—Google. She helpfully links to a video on advanced Google search techniques. She also advises do-it-yourself sleuths to “type in everything you know about the person in keyword format.” Next is Facebook’s People Search tool. Here, the write-up reminds us we can go through friends and family to find someone who’s using a fake Facebook name. Under the heading “Make connections,” Purewal advises searchers they will have to do some thinking:
“Once you have several facts about your subject, you’ll need to use your brain to make connections and fill in the blanks. For example, if you know your subject’s name, job title, and location, you can probably find their LinkedIn profile. On their LinkedIn profile, they’ve probably listed their undergraduate degree and when they graduated from college, which means you can work backward to figure out approximately how old they are.”
The list goes on to note, “Remember people are not very creative;” many unwisely use the same username and even password at many websites. One can use that clue to hunt someone down in social networks and community forums. Finally, we’re reminded that “A picture is worth a thousand words.” Purewal recommends snagging a Facebook or Twitter profile picture and taking it over to TinEye or Google Images for a reverse lookup. Isn’t technology wonderful?
Cynthia Murrell, February 19, 2015
February 18, 2015
A big push for universities is teaching undergraduate students how to conduct research. Most of them simply go to Google or Wikipedia and think their work is done. Wrong! Research involves more than a few spins around the search engine and most students find themselves deficient in that area. LearnU wrote an article, “Get Your Search On: 40 Of The Best Search Engines For Students And How To Use Them.”
The article includes a brief spiel about information literacy and its importance. Helpfully, it also explains how search engines work:
1. “Internet user enters desired inquiry into the search engine’s search bar.
2. The search engine’s software gets to work and starts sorting through the millions of pages residing within its database in an attempt to find the best fit for the original inquiry.
3. Once all the action takes place behind the scenes, all relevant results are generated for the searcher and presented. Results are listed in order with the most relevant first.”
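The three steps above can be sketched in a few lines of Python. This is only a toy illustration, not how any real search engine is built: the sample pages, the function names `build_index` and `search`, and the scoring rule (count how often the query words appear) are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def build_index(pages):
    """Map each lowercase word to a Counter of {page_id: occurrences}."""
    index = defaultdict(Counter)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word][page_id] += 1
    return index


def search(index, query):
    """Score pages by how often the query words occur; best match first."""
    scores = Counter()
    for word in query.lower().split():
        # .get() avoids creating empty entries for unknown words
        for page_id, count in index.get(word, {}).items():
            scores[page_id] += count
    # Highest score first; ties broken alphabetically by page id
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [page_id for page_id, _ in ranked]


# Hypothetical "database" of three tiny pages
pages = {
    "a": "solr search engine tutorial",
    "b": "search engine search tips for students",
    "c": "cooking recipes",
}
index = build_index(pages)
results = search(index, "search engine")
print(results)
```

Page "b" mentions the query words three times and page "a" twice, so "b" is listed first; page "c" never appears because it matches nothing. Real engines weigh far more signals than raw word counts.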
What is even better is that the article does not ban Google entirely from the approved academic search engine list. Google used to be a no-no in academic research, but now it is a useful tool with different features specifically geared towards academics. There are even nods to video and social media search engines. The article does, however, leave Blekko and Yandex off its list of major search engines, and it omits Google Video as a media search system.
Oh well, nothing is perfect and at least most of these are free compared to scholarly databases.
February 18, 2015
According to a recent report, it appears that people trust the aggregator more than the sources it aggregates. Wait, what? Search Engine Journal (SEJ) informs us that “Google Is a More Trusted Source of News than Traditional Media [Report].” Of the 27,000 people surveyed for the report (published by Quartz), 72% trust “online search engines,” 64% trust traditional media, and 59% trust social media. (Personally, I find that last figure most troubling; but I digress.) Writer Matt Southern tells us:
“Where the trust stems from is a search engine’s ability to give users an at-a-glance look at news and information from a variety of sources. That is, apparently, more dependable for most people than getting news and/or information from a single source.
“What this really means is getting news from more than one source is preferable compared to putting sole trust in the reporting of one publication….
“On the other hand, you must also consider that Google’s algorithm takes into account your search history when serving up search results. So, for example, if you often come to SEJ for your SEO news then you’ll see more results from SEJ when conducting a search.”
We seem to be choosing ease-of-use over being well-informed from a wide range of viewpoints. Some people make that trade-off knowingly. Others, apparently, believe their personalized “variety” of sources actually gives them the full picture. Let us not mistake convenience for trustworthiness; being well-informed sometimes takes a little effort.
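To see how quickly history-based personalization can narrow that “variety,” here is a minimal sketch. It is not Google’s algorithm, which is not public; the function `personalize`, the `boost` value, the source names, and the base scores are all hypothetical, chosen only to show the effect Southern describes.

```python
from collections import Counter


def personalize(results, history, boost=0.5):
    """Re-rank (source, base_score) pairs, adding `boost` per past visit."""
    visits = Counter(history)
    rescored = [(source, score + boost * visits[source])
                for source, score in results]
    # sorted() is stable, so equal scores keep their original order
    return [source for source, _ in sorted(rescored, key=lambda pair: -pair[1])]


# Hypothetical base relevance scores for one news query
results = [("reuters", 1.0), ("sej", 0.8), ("bbc", 0.9)]

# A reader who has visited SEJ three times before
history = ["sej", "sej", "sej"]

print(personalize(results, []))       # no history: ranked by base score
print(personalize(results, history))  # with history: SEJ moves to the top
```

With no history the lowest-scored source stays last; after three visits it jumps to first place, even though nothing about its relevance to the query has changed.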
Cynthia Murrell, February 18, 2015
February 17, 2015
In order to build a fantastic Web site these days, you need eye-catching graphics. While you can get a logo made through Fiverr, making daily images for your content feed is a little bit more difficult. It is not cost efficient to hire a graphic designer for every image (unless you have deep pockets), so it helps to have an image library to draw on. The problem with typing “image library” into a search engine is that you have to sift through the results and assess each possible source.
Graphic designer Ash Stallard-Phillips collected “25 Awesome Sites With Stunning Free Stock Photos.” He rounded up the image libraries, because:
“As a web designer myself, I always find it handy to have an image library that I can use for dummy images and testing. I have compiled a list of the best sites offering free stock photos that you can use for your projects.”
Ash evaluates each resource, listing the pros and cons. Many of the image Web sites he lists are ones we have not used before, and they will be useful as we create content. Articles like Ash’s are becoming more common on the Internet, and they are not just for photo libraries. These lists gather helpful information that you would otherwise have to dig out of search results yourself, which saves time on both searching and evaluation.
February 16, 2015
In order to give citizens more access to research from the U.S. Department of Agriculture (USDA), the National Agricultural Library (NAL) has launched a new, public-facing search engine called PubAg. The USDA’s Agricultural Research Service tells us about the tool in, “NAL Unveils New Search Engine for Published USDA Research.” It looks a lot like a Lucene/Solr system to us; that choice would not be at all surprising. The post tells us:
“PubAg, which can be found at PubAg.nal.usda.gov, is a new portal for literature searches and full-text access of more than 40,000 scientific journal articles by USDA researchers, mostly from 1997 to 2014. New articles by USDA researchers will be added almost daily, and older articles may be added if possible. There is no access fee for PubAg.
“Phase I of PubAg provides access for searches of 340,000 peer-reviewed agriculturally related scientific literature, mostly from 2002 to 2012, each entry offering a citation, abstract and a link to the article if available from the publisher. This initial group of highly relevant, high-quality literature was taken from the 4 million bibliographic citations in NAL’s database.”
The agency has worked to make the system easy to use for folks from farmers to academicians. So easy, in fact, that there’s no registration — no user name or password is needed. We’re told that NAL maintains “one of the world’s largest and most comprehensive compilations of agricultural information.” Now they’ve made that wealth of knowledge available to us all.
Cynthia Murrell, February 16, 2015
February 13, 2015
As a Pinterest user myself, I know how important the site’s search function is. Now, as Gigaom informs us, “Pinterest Explains How It’s Making Its Search Work Better.” It sounds like an approach to semantic machine learning inspired by the crowdsourcing phenomenon. Writer Jonathan Vanian tells us:
“Dong Wang, the Pinterest software engineer who wrote the post, explained that even though a user may search for the word ‘turkey,’ it’s unclear what exactly that person may be looking for. Does he want to find turkey recipes, is he planning a trip to Turkey or is he just interested in poultry — it’s hard to say without some context.
“If that person decides to search for ‘turkey recipes’ as part of his next query, Pinterest takes that into account and can assume that the next person who may be searching for ‘turkey’ might also be craving some turkey recipes as well; maybe it’s holiday season and everyone’s hungry. Pinterest learned that ‘the information extracted from previous query log has shown to be effective in understanding the user’s search intent’ and this can be applied to other Pinterest users as well.”
Pinterest’s data-collection workflow is called QueryJoin, and engineers use it to draw conclusions like the one about turkey recipes, above. Factors analyzed also include data like pins’ image signatures and “engagement stats” like the number of clicks and re-pins it has received. For more information, see Dong Wang’s original post.
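The turkey example can be sketched as a simple count over query logs. This is not QueryJoin, whose internals are described only in Dong Wang’s post; the session data and the functions `learn_refinements` and `likely_intent` are hypothetical, and the sketch uses nothing but the “what did people search for next?” signal.

```python
from collections import Counter, defaultdict


def learn_refinements(sessions):
    """Count how often each query is immediately followed by another."""
    followups = defaultdict(Counter)
    for queries in sessions:
        for first, second in zip(queries, queries[1:]):
            followups[first][second] += 1
    return followups


def likely_intent(followups, query):
    """Return the most common follow-up to `query`, or None if unseen."""
    counts = followups.get(query)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical query logs: each inner list is one user's session
sessions = [
    ["turkey", "turkey recipes"],
    ["turkey", "turkey recipes"],
    ["turkey", "turkey travel"],
]
followups = learn_refinements(sessions)
print(likely_intent(followups, "turkey"))
```

Two of the three sessions refined “turkey” to “turkey recipes,” so that is the guess offered to the next person who types the ambiguous word. A production system would also fold in the image signatures and engagement stats mentioned above.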
Cynthia Murrell, February 13, 2015
February 10, 2015
Microsoft is doing its best to maintain relevancy in the technology market. Its rivals, Google and Apple, are eating up all the customers and smacking their lips at the deliciousness of their success. Microsoft has not given up the battle and according to PC World, “Microsoft OneDrive Adds Super-Intelligent Searching Of Document Text, Photos.” OneDrive is Microsoft’s cloud service, and it has been upgraded with Microsoft Research and Bing techniques that examine, tag, and analyze photos; in other words, intelligent photo search.
Once photos are uploaded into OneDrive they will be scanned by OCR to gather information and apply tags. This feature is part of Microsoft’s new automated image recognition technology. Microsoft will also make the cloud easier to use:
“Microsoft also will make it easier to actually get your photos into the cloud through a new “Camera Imports” folder, which will be rolling out over the next month. Once you connect a camera or USB stick to your Windows 7 or 8 computer, photos will be automatically siphoned off and stored in Microsoft’s cloud. Likewise, if you snap a screenshot on a Windows 7 or Windows 8 machine, it too will be stored in OneDrive—a feature that’s already in Windows Phone today.”
The Internet has always been a visual medium, but the ubiquity of cameras has made it more so, and people want to organize and find their photos as easily as they can their text files. Good move, Microsoft.
February 9, 2015
We all have those weird relatives who drop by during the holidays or at odd times during the year. Ken Starks of Foss Force used the metaphor in “Desktop Search: KDE’s Crazy Uncle” to explain his views about KDE desktop search. Starks says that you can rely on KDE’s desktop search to be unreliable, like that crazy uncle who can’t hold down a job or a marriage.
KFind, the default search tool, could not find his files, even when Starks knew they were there. After some grumbling, he shares his experiences with KDE search software that does work. He liked using GNOME, Nepomuk for indexing, Dolphin, and his current search tool of choice: Catfish. He stresses that he loves working with KDE; he just wants the out-of-the-box search tools to work well so that he does not have to download a third-party app:
“I installed a search app I use in Xfce and I didn’t have to drag in too many GTK dependencies to do it. It’s called “Catfish”…. find it a bit odd that a third party app surpasses the native KDE search application. Catfish gets it right. It’s a darned shame that it isn’t native to KDE.”
He has gotten some comments about using the command line and fussing with the code, but Starks’ retort is that most users do not know how to use those lines. They want to log into a system and have it work right through the user interface. Crazy idea, is it not?
February 6, 2015
An oddball TechWars graphic suggests that Lucene is making life difficult for vendors of proprietary search systems. In the site’s head-to-head “dtSearch vs Lucene” comparison, the open source solution seems to handily trounce dtSearch. Of course, for us, Lucene means Elasticsearch. For those unfamiliar with TechWars, here’s the site’s description of what it does:
Data-driven: TechWars shows objective data gathered from the web to help you make the right decision when choosing technology for your projects.
Up-to-date: TechWars scans the web to catch the latest trends, so you can sit back and relax while we keep you updated.
Professional: TechWars is built for professionals, by professionals. Let’s build the best tech comparison tool together!
Community: TechWars serves the developer community by opening case studies for discussion. We are always open to requests and feedback via Facebook and Twitter.
The graphic compares dtSearch and Lucene in several areas. We’re told that 196 TechWars users use Lucene, versus just 15 who use dtSearch. Under the “which companies use it?” heading, sixteen companies (several high-profile) are listed for Lucene, but “no companies found” for dtSearch. Um, it seems like a pretty shallow dataset they’re tapping into there. The site does use Google data for one comparison: a graph that shows how very many more folks have searched for information on Lucene than on dtSearch. At a glance, Lucene would seem to be coming out ahead.
Cynthia Murrell, February 06, 2015
February 4, 2015
Short Honk: One of my two or three readers alerted me to a useful summary of Google’s search parameters. The list is available from BackLinkSentry. Google’s advanced search page is helpful but not particularly fine grained. This list of switches will unlock content that is in the Google index but not findable with a two or three word query slapped in the Google search box. Worth having at one’s fingertips in my opinion.
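For readers who want to experiment, here is a small sketch of building a parameterized Google search URL with Python’s standard library. The parameters shown (`q`, `hl`, `start`) and the `site:` and `filetype:` operators are long-standing, widely documented ones; whether each appears on the BackLinkSentry list is an assumption on my part, Google can change its behavior at any time, and the helper name `google_url` is made up for this example.

```python
from urllib.parse import urlencode


def google_url(terms, site=None, filetype=None, lang="en", start=0):
    """Build a Google search URL from keywords plus optional restrictions."""
    query = terms
    if site:
        query += f" site:{site}"          # restrict results to one domain
    if filetype:
        query += f" filetype:{filetype}"  # restrict results to one file type
    params = {
        "q": query,      # the search terms and operators
        "hl": lang,      # interface language
        "start": start,  # offset of the first result (0 = first page)
    }
    return "https://www.google.com/search?" + urlencode(params)


url = google_url("pubag", site="usda.gov", filetype="pdf")
print(url)
```

The `urlencode` call takes care of escaping spaces and colons, so the operators survive the trip into the address bar intact.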
Stephen E Arnold, February 3, 2015