
Where’s the Finish Line Enterprise Search?

September 16, 2015

What never ceases to amaze me is that people are always perplexed when goals for technology change. The changes always arrive with a big hullabaloo, but rather than complaining about them, people would do better to spend their time learning to adapt. Enterprise search is one of those technology topics that sees slow growth, but when changes occur they are huge. Digital Workplace Group tracks the newest changes in enterprise search, explains why they happened, and describes how to adapt in “7 Ways The Goal Posts On Enterprise Search Have Moved.”

After an inordinate amount of time spent explaining how the author arrived at the seven ways enterprise search has changed, we are finally treated to the bulk of the article. Among the seven are obvious insights that have been discussed in prior articles on Beyond Search, but there are also new ideas to ruminate about. Among the obvious: users want direct answers, they expect search to do more than find information, and they want search to understand their intent. While the obvious insights are already implemented in search engines, enterprise search lags behind.

Enterprise search should work on a more personalized level because it is part of a closed network and because people rely on it to fulfill an immediate need. A social filter could be applied to display a user’s personal data in search results, and users also rely on the search filter as a quick shortcut feature. Enterprise search is way behind in taking advantage of search analytics and in understanding how users consume and manipulate data.

“To summarize everything above: Search isn’t about search; it’s about finding, connecting, answers, behaviors and productivity. Some of the above changes are already here within enterprises. Some are still just being tested in the consumer space. But all of them point to a new phase in the life of the Internet, intranets, computer technology and the experience of modern, digital work.”

As always, there is a lot of room for enterprise search improvement, but these changes need to be made for an updated and better work experience.

Whitney Grace, September 16, 2015
Sponsored by, publisher of the CyberOSINT monograph

Mondeca Has a Sandbox

September 15, 2015

French semantic tech firm Mondeca has their own research arm, Mondeca Labs. Their website seems to be going for a playful, curiosity-fueled vibe. The intro states:

“Mondeca Labs is our sandbox: we try things out to illustrate the potential of Semantic Web technologies and get feedback from the Semantic Web community. Our credibility in the Semantic Web space is built on our contribution to international standards. Here we are always looking for new challenges.”

The page links to details on several interesting projects. One entry we noticed right away is for an inference engine; they say it is “coming soon,” but a mouse click reveals that no info is available past that hopeful declaration. The site does supply specifics about other projects; some notable examples include linked open vocabularies, a SKOS reader, and a temporal search engine. See their home page, above, for more.

Established in 1999, Mondeca has delivered pragmatic semantic solutions to clients in Europe and North America for over 15 years. The firm is based in Paris, France.

Cynthia Murrell, September 15, 2015


Search and Find Love but Maybe Not

September 13, 2015

My trusty alert service delivered me this search gem: “13 Apps 17 Dates 30 Days: I Tried 13 Dating Apps in 30 Days in Search of Love.” Shadows of Ashley Madison could not obscure this clickable topic.

Here’s what I learned:

  • There are sites with interesting names with which I was not familiar. Here are two examples: Jack’D and Scruff.
  • Dating apps can be used to deliver ads. Love for someone I assume.
  • Finding love takes “time and energy.” Yep, just like my notions for information access.
  • Some love search apps ask users to involve their Twitter followers. Now that’s a great idea for some folks.

Main point of the write up: Love apps don’t work. Wow, revelation.

Stephen E Arnold, September 13, 2015

Social Consensus: Control Becomes a Big Thing

September 13, 2015

I read “The Cable Industry Faces the Perfect Storm: Apps, App Stores, and Apple.” I think the idea is a valid one. I am not sure about the Apple thing.

Let’s go to the Web page. (Shades of Warner Wolf.)

The write up states:

the average US consumer is spending 198 minutes per day inside apps compared to 168 minutes on TV. Please note that the 198 minutes per day spent inside apps on smart phones and tablets don’t include time spent in the mobile browser. In fact, if we add that time, the total time spent on mobile devices by the average US consumer is now 220 minutes (or 3 hours and 40 minutes) per day…

In the good old days, people were supposed to be watching the fire burn in their caves. Then folks listened to Jack Benny on Sunday night. When I was a wee lad, we had a black and white television which sort of worked. My progeny had color TV to watch. Today lots of people look at tiny screens, checking Facebook or looking for pizza via Google or Alphabet or whatever the company is.

Bad news for cable companies it seems.

Forget the cable folks. My view is that the bad news is what I call the consensus problem. Shared experiences are blockbusters in the James Twitchell sense of the word in Adcult USA.

Cohesiveness comes from the Super Bowl and similar constructs. The implications of this tiny screen shift are significant. Losers will be the organizations constructed to serve the mass markets of mass media.

Apple, bless its innovative heart, makes gizmos. The powerhouses are the outfits which deliver micro-content and micro-experiences to the Ore-Idas walking around or sitting in coffee shops with their mobile devices.

Search and retrieval? A loser. Sustained concentration? A loser. Consensus? Interesting about that.

Stephen E Arnold, September 13, 2015

Coveo: A Real Life Search Implementation Success

September 11, 2015

If we detect some serious Coveo cheerleading in this recent article found on RT Insights, that might be because the story originated at that company. Still, “How Real-Time Enterprise Search Helps Seal Financial Deals” does illustrate the advantages of consolidating data resources into a more easily-used system.

The write-up describes challenges faced by London investment firm 3i Group. The global company had been collecting an abundance of data about its clients’ deals, but was spending many worker hours retrieving that information from scattered repositories. Coveo Enterprise Search to the rescue! The platform implementation included a user-friendly UI, actionable analytics, and security measures. The article continues:

“As a result of the implementation, 3i Group reports 90 percent faster access to deal-related intelligence as well as a 20 percent reduction in staff and resources required to respond to compliance requests. 3i Group’s staff members use the platform to search across 3.66 million file share documents, 6.39 million Exchange emails, 897,000 SharePoint documents, and 107 million Enterprise Vault records. For the first time, 3i Group staff members are able to perform a single search across all of the company’s knowledge repositories by using either a browser-based interface or an integrated search interface within SharePoint. 3i Group’s compliance team was provided with a dashboard that enabled them to search and correlate content from across 3i Group’s entire data set, and quickly evaluate permissions and user access rights for every 3i Group record or knowledge asset.”

Founded in 2005, Coveo maintains offices in California and the Netherlands, with its R&D headquarters in Quebec. (The company is also hiring as of this writing.) There is no doubt that being able to reach and analyze all data from one dashboard can be a huge time-saver, especially for a large organization. Just remember that Coveo is but one of several strong options; some are even open source.

Cynthia Murrell, September 11, 2015


A Fun Japanese Elasticsearch Promotion Video

September 10, 2015

Elasticsearch is one of the top open source search engines and is employed by many companies including Netflix, Wikipedia, GitHub, and Facebook. Elasticsearch wants to get a foothold in the Japanese technology market, presumably because Japan is one of the world’s top producers of advanced technology and has a huge consumer base. Once a technology is adopted in Japan, you can bet that it will have an even bigger adoption rate.

The company has launched a Japanese promotional campaign and uploaded a video entitled “Elasticsearch Product Video” to its YouTube channel. The video comes with Japanese subtitles and features appearances by CEO Steven Schuurman, VP of Engineering Kevin Kluge, Elasticsearch creator Shay Banon, and VP of Sales Justin Hoffman. The video showcases how Elasticsearch is open source software, how it has been integrated into many companies’ frameworks, its worldwide reach, its product improvements, and the good it can do.

Justin Hoffman said, “I think the concept of an open source company bringing a commercial product to market is very important to our company.  Because the customers want to know on one hand that you have the open source community and its evolution and development at the top of your priority list.  On the other hand, they appreciate that you’re innovating and bringing products to market that solve real problems.”

It is a neat video that runs down what Elasticsearch is capable of. The only complaint is the bland music in the background. They could benefit from licensing the Jive Aces’ “Bring Me Sunshine,” which conveys the proper mood.

Whitney Grace, September 10, 2015

Google and Alta Vista: Who Remembers?

September 9, 2015

A lifetime ago, I did some work for an outfit called Persimmon IT. We fooled around with ways to take advantage of memory, which was a tricky devil in my salad days. The gizmos we used were manufactured by Digital Equipment. The processors were called “hot”, “complex”, and AXP. You may know this foot warmer as the Alpha. Persimmon operated out of an office in North Carolina. We bumped into wizards from Cambridge University (yep, that outfit again), engineers housed on the second floor of a usually warm office in Palo Alto, and individuals whom I never met but I had to slog through their email.

So what?

A person forwarded me a link to what seems to be an aged write up called “Why Did Alta Vista Search Engine Lose Ground so Quickly to Google?” The write up was penned by a UCLA professor. I don’t have too much to say about the post. I was lucky to finish grade school. I missed the entire fourth and fifth grades because my Calvert Course instructor in Brazil died of yellow jaundice after my second lesson.

I scanned the write up. You may need to register in order to read the article and the comments thereto. I love walled gardens. They are so special.

I did notice that one reason Alta Vista went south was not mentioned. Due to the brilliant management of the company by Hewlett Packard/Compaq, Alta Vista created some unhappy campers. Few at HP knew about Persimmon, and none of these MBAs had the motivation to learn anything about the use of Alta Vista as a demonstration of the toasty Alpha chips, the clever use of lots of memory, and the speed with which certain content operations could be completed.

Unhappy with the state of affairs, the Palo Alto Alta Vista workers began to sniff for new opportunities. One scented candle burning in the information access night was a fledgling outfit Google, formerly Backrub. Keep in mind that intermingling of wizards was and remains a standard operating procedure in Plastic Fantastic (my name for Sillycon Valley).

The baby Google benefited from HP’s outstanding management methods. The result was the decampment from the HP Way. If my memory serves me, the Google snagged Jeff Dean, Simon Tong, Monika Henzinger, and others. Keep in mind that I am no “real” academic, but my research revealed to me and those who read my three monographs about Google that Google’s “speed” and “scaling” benefited significantly from the work of the Alta Vista folks.

I think this is important because few people in the search business pay much attention to the turbo boost HP unwittingly provided the Google.

In the comments to the “Why Did Alta Vista…” post, there were some other comments which I found stimulating.

  1. One commenter named Rajesh offered, “I do not remember the last time I searched for something and it did not end up in page 1.” My observation is, “Good for you.” Try this query and let me know how Google delivers on point information: scram action. I did not see any hits to nuclear safety procedures. Did you, Rajesh? I assume your queries are different from mine. By the way, “scram local events” will produce a relevant hit half way down the Google result page.
  2. Phillip observed that the “time stamp is irrelevant in this modern era, since sub second search is the norm.” I understand that “time” is not one of Google’s core competencies. Also, many results are returned from caches. The larger point is that Google remains time blind. Google invested in a company that does time well, but sophisticated temporal operations are out of reach for the Google.
  3. A number of commenting professionals emphasized that Google delivered clutter free, simple, clear results. Last time I looked at a Google results page for the query katy perry, the presentation was far from a tidy blue list of relevant results.
  4. Henry pointed out that the Alta Vista results were presented without logic. I recall that relevant results did appear when a query was appropriately formed.
  5. One comment pointed out that it was necessary to cut and paste results for the same query processed by multiple search engines. The individual reported that it took a half hour to do this manual work. I would point out that metasearch solutions became available in the early 1990s. Information is available here and here.

Enough of the walk down memory lane. Revisionism is alive and well. Little wonder that folks at Alphabet and other searchy type outfits continue to reinvent the wheel.

Isn’t a search app for a restaurant a “stored search”? Who cares? Very few.

Stephen E Arnold, September 9, 2015

Bing Snapshots for In-App Searches

September 9, 2015

Developers have a new tool for incorporating search data directly into apps, we learn in “Bing Snapshots First to Bring Advanced In-App Search to Users” at Search Engine Watch. Apparently Google announced a similar feature, Google Now on Tap, earlier this year, but Microsoft’s Bing has beaten them to the consumer market. Of course, part of Snapshot’s goal is to keep users from wandering out of “Microsoft territory,” but many users are sure to appreciate the convenience nevertheless. Reporter Mike O’Brien writes:

“With Bing Snapshots, developers will be able to incorporate all of the search engine’s information into their apps, allowing users to perform searches in context without navigating outside. For example, a friend could mention a restaurant on Facebook Messenger. When you long-press the Home button, Bing will analyze the contents of the screen and bring up a snapshot of a restaurant, with actionable information, such as the restaurant’s official website and Yelp reviews, as well Uber.”

Bing officials are excited about the development (and, perhaps, scoring a perceived win over Google), declaring this the start of a promising relationship with developers. The article continues:

“Beyond making sure Snapshots got a headstart over Google Now on Tap, Bing is also able to stand out by becoming the first search engine to make its knowledge graph available to developers. That will happen this fall, though some APIs are already available on the company’s online developer center. Bing is currently giving potential users sneak peeks on its Android app.”

Hmm, that’s a tad ironic. I look forward to seeing how Google positions the launch of Google Now on Tap when the time comes.

Cynthia Murrell, September 9, 2015



dtSearch Chases Those Pesky PDFs

September 7, 2015

Predictive analytics and other litigation software are more important than ever for legal professionals who must sift through mounds of documents and discover patterns. Several companies have come to the rescue, especially dtSearch. Inside Counsel explains how a “New dtSearch Release Offers More Support To Lawyers.”

The latest dtSearch release is not only able to search through terabytes of information in online and offline environments, but its document filters have broadened to search encrypted PDFs, including those with a password. While PDFs are a universally accepted document format, they are a pain to deal with if they ever have to be edited or are password protected.

Also included in the dtSearch release are other beneficial features:

“Additionally, dtSearch products can parse, index, search, display with highlighted hits, and extract content from full-text and metadata in several data types, including: Web-ready content; other databases; MS Office formats; other “Office” formats, PDF, compression formats; emails and attachments; Recursively embedded objects; Terabyte Indexer; and Concurrent, Multithreaded Searching.”

The new PDF search feature, with its ability to delve into encrypted PDF files, puts dtSearch a huge leap ahead of its rivals. Being able to explore PDFs without Adobe Acrobat or another PDF editor will make pursuing litigation much simpler.

Whitney Grace, September 7, 2015

Shades of CrossZ: Compress Data to Speed Search

September 3, 2015

I have mentioned in my lectures a start up called CrossZ. Before whipping out your smartphone and running a predictive query on the Alphabet GOOG thing, sit tight.

CrossZ hit my radar in 1997. The concept behind the company was to compress extracted chunks of data. The method, as I recall, made use of fractal compression, which was the rage at that time. The queries were converted to fractal tokens. The system then quickly pulled out the needed data and displayed them in human readable form. The approach was called, as I recall, “QueryObject.” By 2002, the outfit dropped off my radar. The downside of the CrossZ approach was that the compression was asymmetric; that is, slow when preparing the fractal chunk but really fast when running a query and extracting the needed data.

Flash forward to Terbium Labs, which has a patent on a method of converting data to tokens or what the firm calls “digital fingerprints.” The system matches patterns and displays high probability matches. Terbium is a high potential outfit. The firm’s methods may be a short cut for some of the Big Data matching tasks some folks in the biology lab have.

For me, the concept of reducing the size of a content chunk and then querying it to achieve faster response time is a good idea.

What do you think I thought when I read “Searching Big Data Faster”? Three notions flitter through my aged mind:

First, the idea is neither new nor revolutionary. Perhaps the MIT implementation is novel? Maybe not?

Second, the main point that “evolution is stingy with good designs” strikes me as a wild and crazy generalization. What about the genome of the octopus, gentle reader?

Third, MIT is darned eager to polish the MIT apple. This is okay as long as the whiz kids take a look at companies which used this method a couple of decades ago.

That is probably not important to anyone but me and to those who came up with the original idea, maybe before CrossZ popped out of Eastern Europe and closed a deal with a large financial services firm years ago.

Stephen E Arnold, September 3, 2015
