Amazon and Elasticsearch

May 29, 2015

If you are curious about the utility of Elastic’s technology, you will find “Indexing Common Crawl Metadata on Amazon EMR Using Cascading and Elasticsearch” a useful article to review. The main idea is that Amazon made Elasticsearch do some circus tricks. The write up explains the approach, provides code snippets, and includes a couple of nifty graphics which help those zany Zonies figure out the implications of the data crunched. In short, Elasticsearch did something useful with content in everyone’s favorite magic wand, Hadoop. Why didn’t Amazon use LucidWorks (Really?)? Hmm. Good question.

Stephen E Arnold, May 29, 2015

JobSamurai Offers Alternative Job Search Method (Without the Search)

May 29, 2015

The article titled Take the Search Out of Job Hunting with JobSamurai on MakeUseOf describes the perks of using JobSamurai next time you are out of work. A lot of people rely on services like Craigslist, but anyone who has searched for a job there knows that a good portion of the listings are frauds, or just non-existent. The number of irrelevant posts is also high, and weeding through them all is time-consuming and frustrating. JobSamurai claims to have the answers, with a job website that minimizes the search factor. The article explains,

“JobSamurai uses your information to find jobs around the web that match your profile, then shows them to you as banner adverts on the websites you visit most often. They do this by leaving a tracking cookie in your web browser that sends data back to JobSamurai to notify them of where to display their content. It typically takes 10-15 days for their internal search engines to find all the jobs that match a candidate.”

While this means that users will need to exercise some patience before seeing results, it is balanced out by the absence of those terrible spam emails that job search websites love to litter your inbox with. JobSamurai promises to limit itself to one email every two months, which really seems like no emails at all.

Chelsea Kerwin, May 29, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Search Functionality for the Roku 2

May 29, 2015

In with search, out with the remote-based headphone jack. Roku has had to weigh its priorities while considering user-friendly features, we learn from “Roku 2 Gets a Facelift with New Search Engine” at ITProPortal. The need for an affordable price point required the Roku 2 media-streaming player to drop some features so new ones could be added. Writer Sead Fadilpašić reports:

“The new remote will work on IR, meaning you’ll need a clear line of sight to switch channels. The remote has also lost the headphone jack, which some will find quite saddening, as well as the motion sensor. Both remotes will now feature four dedicated buttons, which can’t be reprogrammed, giving users quick access to Netflix, YouTube, Google Play, and Rdio. New features also include a search engine and show notifications, letting people know when a certain show is available. The new Roku 2 will cost as much as the Apple TV after its price drop – a very competitive £69. Aside from improved hardware specs Roku has confirmed to Pocket-lint the new box will come with improved software that should have a dramatic affect in speeding up accessing your favorite channels, shows and movies.”

All Roku devices will be getting the revised interface, which adds a couple of features and is expected to speed boot times. The write-up reminds us that the Roku has a mobile app, with a new version due out soon. So if you really miss that headphone jack, just swap their remote for your smart phone. I leave the motion-sensor hack to you.

Cynthia Murrell, May 29, 2015


The Future of Enterprise and Web Search: Worrying about a Very Frail Goose

May 28, 2015

For a moment, I thought search was undergoing a renascence. But I was wrong. I noted a chart which purports to illustrate that the future is not keyword search. You can find the illustration (for now) at this Twitter location. The idea is that keyword search is less and less effective as the volume of data goes up. I don’t want to be a spoilsport, but for certain queries key words and good old Boolean may be the only way to retrieve certain types of information. Don’t believe me? Log on to your organization’s network or to Google. Now look for the telephone number of a specific person whose name you know or a tire company located in a specific city with a specific name which you know. Would you prefer to browse a directory, a word cloud, a list of suggestions? I want to zing directly to the specific fact. Yep, key word search. The old reliable.
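To make the point concrete, here is a minimal sketch of the old reliable: an inverted index queried with a Boolean AND. The directory records, the names, and the helper functions are hypothetical, invented for illustration; this is not any vendor's implementation.

```python
from collections import defaultdict

# Hypothetical directory records, invented for illustration only.
RECORDS = {
    1: "Jane Doe sales 502-555-0100",
    2: "John Doe engineering 502-555-0101",
    3: "Acme Tire Company Louisville 502-555-0199",
}


def build_index(records):
    """Map each lowercased token to the set of record ids containing it."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for token in text.lower().split():
            index[token].add(record_id)
    return index


def boolean_and(index, *terms):
    """Return the ids of records that contain every term (Boolean AND)."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


index = build_index(RECORDS)
hits = boolean_and(index, "jane", "doe")
for record_id in sorted(hits):
    print(RECORDS[record_id])
```

A query for two known terms goes straight to the one matching record. No word cloud, no list of suggestions.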

But the chart points out that the future is composed of three “webs”: The Social Web, the Semantic Web, and the Intelligent Web. The date for the Intelligent Web appears to be 2018 (the diagram at which I am looking is fuzzy). We are now perched halfway through 2015. In 30 months, the Intelligent Web will arrive with these characteristics:

  • Web scale reasoning (Don’t we have Watson? Oh, right. I forgot.)
  • Intelligent agents (Why not tap Connotate? Agents ready to roll.)
  • Natural language search (Yep, talk to your phone. How is that working out on a noisy subway train?)
  • Semantics. (Embrace the OWL. Now.)

Now these benchmarks will arrive in the next 30 months, which implies a gradual emergence of Web 4.0.

The hitch in the git along, like most futuristic predictions about information access, is that reality behaves in some unpredictable ways. The assumption behind this graph is “Semantic technology help to regain productivity in the face of overwhelming information growth.”


Data Darkness

May 28, 2015

According to Datameer, organizations do not use a large chunk of their data, which is commonly referred to as “dark data.” “Shine Light On Dark Data” explains that organizations are trying to dig out the dark data and use it for business intelligence or, in more recent terms, big data. Dark data is created from back end business processes as well as from regular business activities. It is usually stored in a storage silo in a closet and only kept for compliance audits.

Dark data has a lot of hidden potential:

“Research firm IDC estimates that 90 percent of digital data is dark. This dark data may come in the form of machine or sensor logs that when analyzed help predict vacated real estate or customer time zones that may help businesses pinpoint when customers in a specific region prefer to engage with brands. While the value of these insights are very significant, setting foot into the world of dark data that is unstructured, untagged and untapped is daunting for both IT and business users.”

The article suggests making a plan to harness the dark data, but it does not offer much guidance on approaching a project beyond tailoring it specifically to dark data: identify the sources, use Hadoop to mine them, and test the results against other data sets.

This article is really a puff piece highlighting dark data without going into much detail about it. It forgets the biggest movement in IT of the past three years: big data!

Whitney Grace, May 28, 2015


SharePoint Is Back and Yammer Is Left Behind

May 28, 2015

Many old things become trendy and new again, and that holds true even with software, at least in principle. The old functions of SharePoint are withstanding the test of time, and the trendy new buzzwords that Microsoft worked so hard to push these last few years (cloud, social, collaborative) are fading out. Of course, some of it has to do with perception, but it does seem that Microsoft is harkening back to what the tried and true longtime users want. Read more in the CMS Wire article, “SharePoint is Back, Yammer… Not So Much.”

The article sums up the last few years:

“But these last few years, Microsoft seemingly didn’t want to talk about SharePoint. It wanted to talk about Office 365, the cloud, collaboration, social, mobile devices and perpetual monthly licensing models. Yet no one appears to have told many of the big traditional SharePoint customers of these shifts. These people are still running SharePoint 2007, 2010 and 2013 happily in-house and have no plans to change that for many years.”

So it seems that with the returned focus on on-premises SharePoint, users are pleased in theory. However, it remains to be seen how satisfying SharePoint Server 2016 will be in reality. To stay tuned to the latest reviews and feedback, keep an eye on ArnoldIT.com and its dedicated SharePoint feed. Stephen E. Arnold is a longtime leader in search with an interest in SharePoint. His reporting will shed light on the realities of user experience once SharePoint Server 2016 becomes available.

Emily Rae Aldridge, May 28, 2015


Altiar And dtSearch Combine

May 27, 2015

Sometimes when items are combined they create something even better, such as Oreos and peanut butter, Disney and Marvel, and Netflix and original series. EContentMag alerted us that a new team-up is underway between two well known companies. The press release title says it all, “Altiar Cloud-Based ECM Platform Is Embedding The dtSearch Engine.” Altiar is an enterprise content management platform that has been specifically used by Microsoft Azure. The popular dtSearch platform has been searching through terabytes since 1991 and is referred to as a powerful search tool. Embedding dtSearch into the Altiar core will make it a more powerful ECM.

Altiar is a popular ECM and can only be improved by dtSearch:

“A cloud-based service, Altiar includes rapid setup, scalability, and storage. It can accept any type of file, from PowerPoint to streaming video, as well as providing a host of tools and services to create custom content pages, newsletters, personal zones, and the like. The platform lets users not only access content from any connected device, but also manage, share, and track content, including features like email alerts.”

Microsoft is not a main player in cloud computing, and Microsoft Azure is supposed to drive more customers to it. Anything that improves the offering, such as Altiar upgrading its search, will make it more appealing.

Whitney Grace, May 27, 2015


Hijacking Semantics for Search Engine Optimization

May 26, 2015

I am just too old and cranky to get with the search engine optimization program. If a person cannot find your content, too bad. SEO has caused some of the erosion of relevance across public Web search engines.

The reason is that pages with lousy content are marketed as having other, more valuable content. The result is queries like this:

[Image: screenshot of search results]

I want information about methods of digital reasoning. What I get is a company profile.

How do I get information for my specific requirement? I have to know how to work around the problems SEO puts in my face every day, over and over again.

This query works on Bing, Google, and Yandex: artificial intelligence decision procedures.

[Image: screenshot of search results]

The results do not point to a small company in Tennessee, but to substantive documents from which other, pointed queries can be launched for individuals, industry associations, and methods.

When I read “Semantic Search Strategies That Work,” I became agitated. The notions of “forgetting about content” and “focusing on quality” miss the mark. Add the advice to “spend time on engagement,” and the result is a collection of unrelated assertions.

The goal of semantics for SEO is to generate traffic. The search systems suck in shaped content and persist in directing people to topics that may have little or nothing to do with the information a person needs to solve his or her problem.

In short, the bastardization of semantics in the name of SEO is ensuring that some users will define the world from the point of view of marketing, not objective information.

What’s the fix?

Here’s the shocker: There is no fix. As individuals abrogate their responsibility to demand high value, on point results, schlock becomes the order of the day.

So much for clear thinking. Semantic strategies that erode relevance do not “work” from my point of view. This type of semantics thickens the cloud of unknowing.

Stephen E Arnold, May 26, 2015

Computing Power Up a Trillion Fold in 60 Years. Search Remains Unchanged.

May 25, 2015

I get the Moore’s Law thing. The question is, “Why aren’t search and content processing improving?”

Navigate to “Processing Power Has Increased by One Trillion-Fold over the Past Six Decades” and check out the infographic. There are FLOPs and examples of devices which deliver them. I focused on the technology equivalents; for example, the Tianhe 2 Supercomputer is the equivalent of 18,400 PlayStation 4s.
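The equivalence is nothing more than division. Here is a minimal sketch, assuming the commonly cited figures of roughly 33.86 petaFLOPS (Linpack) for the Tianhe 2 and roughly 1.84 teraFLOPS for a PlayStation 4. I am supplying both numbers myself as assumptions; they are not quoted in this post.

```python
# Assumed figures, not quoted from the infographic:
TIANHE_2_FLOPS = 33.86e15  # ~33.86 petaFLOPS (Linpack benchmark)
PS4_FLOPS = 1.84e12        # ~1.84 teraFLOPS (peak GPU throughput)


def device_equivalents(big_flops: float, small_flops: float) -> float:
    """Return how many of the smaller device match the larger one's FLOPS."""
    return big_flops / small_flops


ps4_count = device_equivalents(TIANHE_2_FLOPS, PS4_FLOPS)
print(f"Tianhe 2 is roughly {ps4_count:,.0f} PlayStation 4s")
```

Under those assumptions the ratio lands within a rounding error of the 18,400 figure cited in the infographic.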

The problem is that search and content processing continue to bedevil users. Perhaps the limitations of the methods cannot be remediated by a bigger, faster assemblage of metal and circuits?

The improvement in graphics is evident. But allowing me to locate a single document in my multi-petabyte archive continues to be a challenge. I have more search systems than the average squirrel in Harrod’s Creek.

Findability is creeping along. After 60 years, the benefits of information access systems are very difficult to tie to better decisions, increased revenues, and more efficient human endeavors even when a “team of teams” approach is used.

Wake up call for the search industry. Why not deliver some substantive improvements in information access which are not tied to advertising? Please, do not use the words metadata, semantics, analytics, and intelligence in your marketing. Just deliver something that provides me with the information I require without my having to guess key words, figure out odd ball clustering, or wait minutes or hours for a query to process.

I don’t want Hollywood graphics. I want on point information. In the last 60 years, my information access needs have not been met.

Stephen E Arnold, May 25, 2015

Yahoo Considers Options for Japanese Division

May 25, 2015

Despite a series of changes since former Googler Marissa Mayer took over at Yahoo, the search-and-entertainment company still struggles to find its footing in a tech landscape that shifted around it long ago. Bloomberg Business wonders whether Yahoo’s next steps in Japan will set it on a sturdier path in “Yahoo Weighs Options for Japan Stake; Sales Miss Estimates.” Writer Brian Womack reports that Mayer plans to make the most of her company’s Japanese assets. He posits:

“By telling investors she’s looking at options for Yahoo Japan, Mayer may be seeking to buy herself more time to jump-start growth at the company she’s been working to turn around for almost three years. Unless she can expand sales, investors may eventually lose patience with the strategy and question her leadership. Some analysts speculated earlier this year that Yahoo could become a takeover target for a larger Internet company after it spins off the Alibaba stake.

“Yahoo’s share of the U.S. online display ad market may slide to 3.5 percent in 2017 from 5.5 percent last year, according to EMarketer Inc. Quarterly revenue growth has come in at less than 4 percent or negative since the end of 2012.”

The success of China’s largest e-commerce firm, and Yahoo asset, Alibaba is responsible for much of the company’s recent growth, such as it is, but that boost will only last so long. Womack reports there has been investor pressure to spin off Yahoo’s Japanese division, but apparently Mayer prefers to consider a range of options. Will Yahoo find salvation in the land of the rising sun?

Cynthia Murrell, May 25, 2015


 
