
Maverick Search and Match Platform from Exorbyte

August 31, 2015

The article titled Input Management: Exorbyte Automates the Determination of Identities on Business On (a primarily German language website) promotes the Full Page Entity Detect from Exorbyte. Exorbyte is a world leader in search and match for large volumes of data. They boast clients in government, insurance, input management and ICT firms, really any business with identity resolution needs. The article stresses the importance of pulling information from masses of data in the modern office. They explain,

“With Full Page Entity Detect provides exorbyte a solution to the inbox of several million incoming documents. This identity data of the digitized correspondence (can be used for correspondence definition) extract with little effort from full-text documents such as letters and emails and efficiently compare them with reference databases. The input management tool combines a high fault tolerance with accuracy, speed and flexibility. Gartner, the software company from Konstanz was recently included in the Magic Quadrant for Enterprise Search.”

The company promises that their Matchmaker technology is unrivaled in searching text without restrictions, even without language, allowing for more accurate search. Full Page Entity Detect is said to be particularly useful when it comes to missing information or overlooked errors, since the search is so thorough.
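To make the idea of fault-tolerant matching against a reference database concrete, here is a minimal sketch. It is not Exorbyte's Matchmaker technology, whose internals the article does not describe. It simply uses Python's standard `difflib` similarity ratio, and the reference names, the `best_match` helper, and the 0.8 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical reference database of known identities (illustrative data only).
REFERENCE = ["Johanna Schmidt", "Peter Maier", "Karl-Heinz Becker"]


def normalize(name):
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(name.lower().split())


def best_match(candidate, reference=REFERENCE, threshold=0.8):
    """Return (name, score) for the closest reference entry, or None if no
    entry reaches the (assumed) similarity threshold."""
    scored = [
        (ref, SequenceMatcher(None, normalize(candidate), normalize(ref)).ratio())
        for ref in reference
    ]
    name, score = max(scored, key=lambda pair: pair[1])
    return (name, score) if score >= threshold else None


if __name__ == "__main__":
    # A name extracted from a scanned letter, with two typing errors.
    print(best_match("Johana Schmitd"))
    print(best_match("Completely Different"))
```

A production system would likely use an index rather than comparing against every reference entry, but the principle of scoring near matches instead of demanding exact equality is the same.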

Chelsea Kerwin, August 31, 2015

Sponsored by, publisher of the CyberOSINT monograph

Beyond Google, How to Work Your Search Engine

August 28, 2015

The article on Funnelback titled Five Ways to Improve Your Website Search offers tips that may seem obvious, but could always stand to be reinforced. Sometimes a Google site:<url> query is not enough. The first tip, for example, is simply to be helpful. That means recognizing synonyms and perhaps adding an autocomplete function in case your site users think in different terms than you do. The worst-case scenario in search is typing in a term and getting no results, especially when the problem is just language and the thing being searched for is actually present, just not found. The article goes into the importance of the personal touch as well,

“You can use more than just the user’s search term to inform the results your search engine delivers… For example, if you search for ‘open day’ on a university website, it might be more appropriate to promote and display an ‘International Open Day’ event result to prospective international students instead of your ‘Domestic Student Open Day’ counterpart event. This change in search behavior could be determined by the user’s location – even if it wasn’t part of their original search query.”

The article also suggests learning from the search engine. Obviously, analyzing what customers are most likely to search for on your website will tell you a lot about what sort of marketing is working, and what sort of customers you are attracting. Don’t underestimate search.
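The synonym and autocomplete tips can be sketched in a few lines. This is only an illustration, not Funnelback's implementation: the synonym table, the indexed terms, and the function names are all hypothetical.

```python
# Hypothetical site vocabulary and synonym table (illustrative data only).
SYNONYMS = {"tuition": ["fees"], "dorm": ["accommodation", "housing"]}
INDEXED_TERMS = ["accommodation", "admissions", "fees", "housing", "open day"]


def expand_query(term):
    """Return the term plus any configured synonyms, so that a vocabulary
    mismatch does not lead straight to zero results."""
    term = term.lower().strip()
    return [term] + SYNONYMS.get(term, [])


def autocomplete(prefix, limit=5):
    """Suggest indexed terms that start with what the user has typed so far."""
    prefix = prefix.lower().strip()
    return [t for t in INDEXED_TERMS if t.startswith(prefix)][:limit]


def search(term):
    """Return the indexed terms matched by the query or one of its synonyms."""
    return [t for t in expand_query(term) if t in INDEXED_TERMS]


if __name__ == "__main__":
    print(search("dorm"))     # the user's word is not indexed, its synonyms are
    print(autocomplete("a"))  # suggestions while typing
```

Logging the queries for which `search` returns an empty list is one simple way to find the synonyms a site is missing.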

Chelsea Kerwin, August 28, 2015


Lexmark: Signs of Trouble?

August 27, 2015

I read “Shares of Lexmark International Inc. Sees Large Outflow of Money.”

The main point of the write up in my opinion was:

The company shares have dropped 41.65% in the past 52 Weeks. On August 25, 2014 The shares registered one year high of $50.63 and one year low was seen on August 21, 2015 at $29.11.

Today as I write this (August 26, 2015), Lexmark is trading at $28.25.

Why do I care?

The company acquired several search and content processing systems in the firm’s effort to find a replacement for the firm’s traditional business, printers. As you know, Lexmark is one of the IBM units which had an opportunity to find its future outside of IBM.

The company purchased three vendors which were among the companies I monitored:

  • Brainware, the trigram folks
  • ISYS Search Software, the 1988 old school search and retrieval system
  • Kapow (via Lexmark’s purchase of Kofax), the data normalization outfit.

Also, the company’s headquarters are about an hour from my cabin next to the pond filled with mine run off. Cutbacks at Lexmark may spell more mobile homes in my neck of the woods.

Stephen E Arnold, August 27, 2015

Insights into the Cut and Paste Coding Crowd

August 26, 2015

I read “How Developers Search for Code.” Interesting. The write up points out what I have observed. Programmers search for existing — wait for it — code.

Why write something when there are wonderful snippets to recycle? Here’s the paragraph I highlighted:

We also learn that a search session is generally just one to two minutes in length and involves just one to two queries and one to two file clicks.

Yep, very researchy. Very detailed. Very shallow. Little wonder that most software rolls out in endless waves of fixes. Good enough is the sort of sigma way.

Encouraging. Now why did that air traffic control crash happen? Where are the back ups to the data in Google’s Belgium server center? Why does that wonderful Windows 10 suck down data to mobile devices with little regard for data caps? Why does malware surface in Android apps?

Good enough: the new approach to software QA/QC.

Stephen E Arnold, August 26, 2015

How to Search the Ashley-Madison Data and Discover If You Had an Affair Too

August 26, 2015

If you haven’t heard about the affair-promoting website Ashley Madison’s data breach, you might want to crawl out from under that rock and learn about the millions of email addresses exposed by hackers to be linked to the infidelity site. In spite of claims by parent company Avid Life Media that users’ discretion was secure, and that the servers were “kind of untouchable,” as many as 37 million customers have been exposed. Perhaps unsurprisingly, a huge number of government and military personnel have been found on the list. The article on Reuters titled Hacker’s Ashley Madison Data Dump Threatens Marriages, Reputations also mentions that the dump has divorce lawyers clicking their heels with glee at their good luck. As for the motivation of the hackers? The article explains,

“The hackers’ move to identify members of the marital cheating website appeared aimed at maximum damage to the company, which also runs websites such as, causing public embarrassment to its members, rather than financial gain. “Find yourself in here?,” said the group, which calls itself the Impact Team, in a statement alongside the data dump. “It was [Avid Life Media] that failed you and lied to you. Prosecute them and claim damages. Then move on with your life. Learn your lesson and make amends. Embarrassing now, but you’ll get over it.”

If you would like to “find yourself” or at least check to see if any of your email addresses are part of the data dump, you are able to do so. The original data was put on the dark web, which is not easily accessible for most people. But the website Trustify lets people search for themselves and their partners to see if they were part of the scandal. The website states,

“Many people will face embarrassment, professional problems, and even divorce when their private details were exposed. Enter your email address (or the email address of your spouse) to see if your sexual preferences and other information was exposed on Ashley Madison or Adult Friend Finder. Please note that an email will be sent to this address.”

It’s also important to keep in mind that many of the email accounts registered to Ashley Madison seem to be stolen. However, the ability to search the data has already yielded some embarrassment for public officials and, of course, “family values” activist Josh Duggar. The article on the Daily Mail titled Names of 37 Million Cheating Spouses Are Leaked Online: Hackers Dump Huge Data File Revealing Clients of Adultery Website Ashley Madison- Including Bankers, UN and Vatican Staff goes into great detail about the company, the owners (married couple Noel and Amanda Biderman) and how hackers took it upon themselves to be the moral police of the internet. But the article also mentions,

“Ashley Madison’s sign-up process does not require verification of an email address to set up an account. This means addresses might have been used by others, and doesn’t prove that person used the site themselves.”

Some people are already claiming that they had never heard of Ashley Madison in spite of their emails being included in the data dump. Meanwhile, the Errata Security Blog entry titled Notes on the Ashley-Madison Dump defends the cybersecurity of Ashley Madison. The article says,

“They tokenized credit card transactions and didn’t store full credit card numbers. They hashed passwords correctly with bcrypt. They stored email addresses and passwords in separate tables, to make grabbing them (slightly) harder. Thus, this hasn’t become a massive breach of passwords and credit-card numbers that other large breaches have lead to. They deserve praise for this.”

Praise for this, if for nothing else. The impact of this data breach is still only beginning, with millions of marriages and reputations in the most immediate trouble, and the public perception of the cloud and cybersecurity close behind.
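For readers wondering what hashing passwords "correctly" looks like in practice, here is a minimal sketch of salted, deliberately slow password hashing. Note the substitution: the Errata Security post says Ashley Madison used bcrypt, which is not in Python's standard library, so this example swaps in PBKDF2-HMAC-SHA256 from `hashlib` to show the same general idea. The iteration count and function names are assumptions for illustration, not anyone's production settings.

```python
import hashlib
import hmac
import os

# Assumed work factor for illustration; real deployments tune this value.
ITERATIONS = 100_000


def hash_password(password, salt=None):
    """Return (salt, digest). A fresh random salt is generated per password
    unless one is supplied (as happens during verification)."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


if __name__ == "__main__":
    salt, stored = hash_password("correct horse")
    print(verify_password("correct horse", salt, stored))
    print(verify_password("wrong guess", salt, stored))
```

Only the salt and digest are stored, never the password itself, which is why a leaked table of such hashes is less immediately damaging than a leaked table of plain passwords.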


Chelsea Kerwin, August 26, 2015


SLI Share Price: Headwinds for Search Evident

August 26, 2015

I read “SLI CEO Ryan Bemoans Low Share price, Says It Should Be $2-Plus.” This is a woulda, coulda, shoulda write up. Reality seems to ignore this somewhat lame mantra.

The write up says:

SLI Systems chief executive Shaun Ryan says the company’s share price is “significantly underpriced” and could be at least four times higher based on other public software-as-a-service valuations.

The write up included this bit of information:

The company today reported a loss of $7.1 million in the year ended June 30, widening from a loss of $5.7 million a year earlier. Operating revenue increased 27 percent to $28.1 million, in line with the $28 million guidance given in April, when it flagged that second-half sales would be lower than expected. Annualized recurring revenue (ARR), its preferred financial measure based on forward subscription revenue, rose 39 percent to $34.6 million.

SLI says its system

… helps you increase e-commerce revenue by connecting your online and mobile shoppers with the products they’re most likely to buy. SLI solutions include SaaS-based learning search, navigation, merchandising, mobile, recommendations and user-generated SEO.

Other publicly traded search vendors are struggling with their financial performance too. For example, Sprylogics, a Canadian vendor, sees its shares trading at $0.33. Lexmark shares are at $28 and change.

Search is a tough niche as Hewlett Packard and IBM are learning.

Stephen E Arnold, August 26, 2015

What Might be Left Out of SharePoint 2016

August 25, 2015

When a new version of any major software is released, users get nervous as to whether their favorite features will continue to be supported or will be phased out. Deprecation is the process of phasing out certain components, and users are warily eyeing SharePoint Server 2016. Read all the details in the Search Content Management article, “Where Can We Expect Deprecation in SharePoint 2016?”

The article begins:

“New versions of Microsoft products always include a variety of additional tools and capabilities, but the flip side of updating software is that familiar features are retired or deprecated. We can expect some changes with SharePoint 2016.”

While Microsoft has yet to officially release the list of what will make the cut and what will be deprecated, they have made it known that InfoPath is being let go. To stay on top of future developments as they happen, stay tuned to Stephen E. Arnold has made a lifetime career out of all things search, and he lends his expertise to SharePoint on a dedicated feed. It is a great resource for SharePoint tips and tricks at a glance.

Emily Rae Aldridge, August 25, 2015


Insights Into SharePoint 2013 Search

August 25, 2015

It has been a while since we have discussed SharePoint 2013 and enterprise search. Upon reading “SharePoint 2013: Some Observations On Enterprise Search” from Steven Van de Craen’s Blog, we noticed some new insights into how users can locate information on the collaborative content platform.

The first item he brings to our attention is the “content source,” an out-of-the-box managed property option that creates result sources that aggregate content from different content sources, i.e., different storehouses on SharePoint. Content source can become a crawled property. Meta elements from Web pages made on SharePoint can be added to crawled properties and made searchable:

“After crawling this Web site with SharePoint 2013 Search it will create (if new) or use (if existing) a Crawled Property and store the content from the meta element. The Crawled Property can then be mapped to Managed Properties to return, filter or sort query results.”

Another useful option was made possible by a user’s request: adding query string parameters to crawled properties. This allows more information to be included in the search index. Unfortunately, this option is not available out-of-the-box and has to be programmed using content enrichment.
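The core of that custom step is turning a crawled URL's query string into name/value pairs that can then be mapped to properties. The sketch below shows only that parsing step, using Python's standard `urllib.parse`. It is not SharePoint's content enrichment API, and the function name, the `qs_` prefix, and the example URL are assumptions for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def query_params_as_properties(url, prefix="qs_"):
    """Map each query string parameter of a crawled URL to a property
    name/value pair. Repeated parameters keep all of their values."""
    params = parse_qs(urlsplit(url).query)
    return {
        prefix + name: (values[0] if len(values) == 1 else values)
        for name, values in params.items()
    }


if __name__ == "__main__":
    # Hypothetical crawled page address.
    url = "https://intranet.example.com/page.aspx?dept=legal&year=2015"
    print(query_params_as_properties(url))
```

In a real deployment, a web service registered for content enrichment would receive the item's URL, run logic along these lines, and hand the resulting values back to be stored as managed properties.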

Enterprise search on SharePoint 2013 still needs to be tweaked and fine-tuned, especially as users’ search demands become more complex. It makes us wonder when Microsoft will release the next SharePoint installment, and whether the next upgrade will resolve some of these issues or unleash a brand new slew of problems. We cannot wait for that can of worms.

Whitney Grace, August 25, 2015


The Integration of Elasticsearch and Sharepoint Adds Capabilities

August 24, 2015

The article on the IDM Blog titled BA Insight Brings Together Elasticsearch and Sharepoint describes yet another vendor embracing Elasticsearch and falling in love again with Sharepoint. The integration of Elasticsearch and Sharepoint enables customers to use Elasticsearch through Sharepoint portals. The integration also made BA Insight’s portfolio accessible through open source Elasticsearch as well as Logstash and Kibana, Elastic’s data retrieval and reporting systems, respectively. The article quotes the Director of Product Management at Elastic,

“BA Insight makes it possible for Elasticsearch and SharePoint to work seamlessly together…By enabling Elastic’s powerful real-time search and analytics capabilities in SharePoint, enterprises will be able to optimize how they use data within their applications and portals.” “Combining Elasticsearch and SharePoint opens up a world of exciting applications for our customers, ranging from geosearch and pattern search through search on machine data, data visualization, and low-latency search,” said Jeff Fried, CTO of BA Insight.

Specific capabilities that the integration will enable include connectors to over fifty systems, auto-classification, federation to improve the presentation of results within the Sharepoint framework, and applications like Smart Previews and Matter Comparison. Users also have the ability to decide for themselves whether they want to use the Sharepoint search engine or Elastic’s, or combine them and put the results together into a set. Empowering users to make the best choice for their data is at the heart of the integration.

Chelsea Kerwin, August 24, 2015



Search Does Not Work: Maybe, Sometimes

August 23, 2015

I read “Feds Keep Magically Finding Documents They Insisted Didn’t Previously Exist.” I noted that the US government struggles with finding content, if the article is on the money. I learned:

Gawker had sought the email communications of Hillary Clinton deputy Philippe Reines, focused on his conversations with journalists. The State Department came back with a no responsive records reply, which was clearly bullshit, since Reines was known for regularly emailing reporters. So Gawker sued and guess what just happened: the State Department just magically found 17,855 emails that are likely responsive. How about that?

Obviously the US government is not aware of the search systems which can find documents. But what if the US government has these systems? Isn’t the finding issue an indication that basic search and retrieval does not work? Interesting thought.

Stephen E Arnold, August 23, 2015
