
Enterprise Search: A Confused Stew

November 29, 2015

Every culture has a stew. A stew is a mélange of ingredients; for example, tripe, brains, chicken fat, water, rutabaga, etc. Your parental beacon probably used an off-the-shelf product and added grocery store goodies. Heat and serve. Yummy.

I read two articles which make vivid the sad state of enterprise search. I think the experts who cooked up these write ups are doing the best with the ingredients in the fridge. I am not sure about the palatability of the meals the items presage.

Get the Right Search Tool

First, I read “Choosing an Enterprise Search Tool.” Okay, tool, not utility. The word tool does not deliver the calories I need. Let’s move forward. I learned that search has these important ingredients:

  • Text analytics functions, including “entity co reference resolution”. Got it.
  • Pipeline architecture, which means no reprogramming. Love that if it is indeed Grade A chuck.

That’s it. The write up wants me to be sure the solution is scalable. Okay. No problem in cloud land unless there are some pesky contractual and security requirements to keep the system close at hand and in conformance with rules, regulations, and laws.

The notion of security is good. I am all for a secure enterprise search tool. The problem is that security is a slippery fish. Toss it into this mix with the analytics and the pipeline. It will be just fine, a mantra crock pot manufacturers use in their advertisements.

Go for the open standards. None of that Kobe beef technology.

Then the write up enjoins me to do what I think is a pretty tough task in the overheated kitchen under the gaze of the chief financial officer:

you need to make sure you first know what type of information your users will need to find.

I don’t know what information I need until I encounter a problem which I cannot resolve with what I know, what I have in my archive, or what a colleague with, hopefully, a clue can tell me. Where did I put that venison bone? Right, I don’t have a venison bone, and I don’t know anyone who has one. Is there a difference between a cow bone and a deer bone? Help, I need a food centric search system without Yelp and TripAdvisor filters.

I am not sure about this recipe.

Do the Enterprise Search Engine Optimization Thing

The second article I read was “Kickstart Your Enterprise Search Program.” The approach I am urged to follow involves getting on the road to better search. Happy users are important.

To reach this goal:

You need to undertake your own Enterprise SEO, or ESEO. I first wrote about ESEO in 2009, and it’s as relevant today as it was then — and suffers the same lack of tools even now. However, there are methodologies you can use.

How do I do that? It’s as easy as microwaving a burrito. Just look at the search terms the users employ. In my experience, those search terms often mislead when users are hunting for their own documents or tackling a topic about which the user lacks information and vocabulary.
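For the curious, here is a minimal sketch of what "look at the search terms" might involve in practice. The log format, the sample queries, and the function names are all assumptions made up for illustration; no particular search system exposes its logs this way. The sketch counts the most frequent queries and the queries that returned nothing, which is about as far as raw terms can take you.

```python
from collections import Counter

# Hypothetical query log: (query text, number of results returned).
# The format is an assumption for illustration, not any vendor's log schema.
QUERY_LOG = [
    ("expense report template", 12),
    ("Expense Report Template", 12),
    ("q3 forecast", 0),
    ("venison bone", 0),
    ("q3 forecast", 0),
    ("holiday schedule", 4),
]


def normalize(query: str) -> str:
    """Lowercase and collapse whitespace so trivial variants count together."""
    return " ".join(query.lower().split())


def summarize(log):
    """Return (counts of all queries, counts of zero-result queries)."""
    all_queries = Counter()
    zero_hits = Counter()
    for query, result_count in log:
        key = normalize(query)
        all_queries[key] += 1
        if result_count == 0:
            zero_hits[key] += 1
    return all_queries, zero_hits


if __name__ == "__main__":
    top, misses = summarize(QUERY_LOG)
    print("Most frequent queries:", top.most_common(3))
    print("Queries with no results:", misses.most_common())
```

Note what the counts cannot show: whether the user who typed "q3 forecast" knew the right vocabulary for the document he or she actually needed.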

Once I know the terms, the rest of the enterprise search task is even easier. I don’t have to puncture the cellophane wrap after microwaving the El Monterey wonder.


Stew and microwaved SEOs. Enterprise search requires more substantive fare than these two write ups deliver. Little wonder that enterprise search vendors are struggling to find purchase in a business environment looking for Red Bull solutions.

My thought: Do not rely on either of these chefs’ suggestions. Analytics and not precision and recall. SEO and not substantive nutrition? Not for me.
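For readers who have not met the two measures, a minimal sketch follows. Precision is the share of returned documents that are relevant; recall is the share of relevant documents that were returned. The document identifiers and the function name are invented for illustration, and the sketch assumes someone has already judged which documents are relevant, which is the hard part.

```python
def precision_recall(retrieved, relevant):
    """Compute (precision, recall) for one query from two collections of IDs."""
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant documents the system actually returned
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical judgments: the system returned four documents, five are relevant.
    returned = ["d1", "d2", "d3", "d4"]
    judged_relevant = ["d2", "d4", "d7", "d8", "d9"]
    p, r = precision_recall(returned, judged_relevant)
    print(f"precision={p:.2f} recall={r:.2f}")
```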

Stephen E Arnold, November 29, 2015

The New Real Journalism: Bezos a WaPo to the Gray Lady

November 29, 2015

I read “Jeff Bezos Says The Washington Post’s Goal Is to Become the New Paper of Record.” As Jack Benny used to say when someone mentioned $1 million, “Yipes.”

My hunch is that the sports at the New York Times probably had other exclamations to share among themselves.

We know that Mr. Bezos seems to have made the overhead reducing cloud computing thing a money maker. We know Mr. Bezos has pulled off a 1950s style rocket ship landing which suggests the visionary inventor of the Tesla has some catching up to do in the space craft landing field. We know Mr. Bezos has lots of money.

I noted this quote, which suggests he knows his achievement factoids as well:

Well, you know, what we’re doing with the Post is we’re working on becoming the new paper of record, Charlie. We’ve always been a local paper, and just this month The Washington Post passed The New York Times in terms of number of viewers online. This is a gigantic accomplishment for the Post team. We’re just gonna keep after that. The reason that that’s working is because we have such a talented team at the Post. It’s all about quality journalism. And even here in the Internet age, in the 21st century, people really care about quality journalism.

What will the New York Times do? Gee, I don’t know. In the third quarter of 2015, the Gray Lady generated $9 million in profit. What do you think building rockets for fun costs? Probably a lot more than real journalism Bezos style.

Stephen E Arnold, November 29, 2015

Quote to Note: Wolfram on Artificial Intelligence

November 28, 2015

There’s a long interview with Stephen Wolfram in “Interview with Stephen Wolfram on AI and the Future,” which I found when pruning my archives. Here’s one of the quotes I noted:

Recently, computers, and GPUs, and all that kind of thing became fast enough that, really—there are a bunch of engineering tricks that have been invented, and they’re very clever, and very nice, and very impressive, but fundamentally, the approach is 50 years old, of being able to just take one of these neural network–like systems, and just show it a whole bunch of examples and have it gradually learn distinctions between examples, and get to the point where it can, for example, recognize different kinds of objects and images.

Hmm. Half a century. Progress comes from faster chips and clever implementations of well known methods. Interesting.
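To make the "show it a bunch of examples" idea concrete, here is a minimal sketch of a perceptron, one of the oldest learn-from-examples methods. It is a swapped-in toy, not the deep networks the interview discusses, and the tiny data set (the logical AND of two inputs) and the function names are assumptions chosen for illustration.

```python
def predict(weights, bias, inputs):
    """Return 1 if the weighted sum is positive, otherwise 0."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train_perceptron(examples, epochs=20):
    """Adjust weights a little each time a prediction is wrong."""
    weights = [0, 0]
    bias = 0
    for _ in range(epochs):
        for inputs, label in examples:
            error = label - predict(weights, bias, inputs)
            # No change when the prediction is right (error == 0).
            weights = [w + error * x for w, x in zip(weights, inputs)]
            bias += error
    return weights, bias


# Toy training set: output is 1 only when both inputs are 1 (logical AND).
EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

if __name__ == "__main__":
    weights, bias = train_perceptron(EXAMPLES)
    for inputs, label in EXAMPLES:
        print(inputs, "expected", label, "got", predict(weights, bias, inputs))
```

The method is a loop and some arithmetic. What changed in half a century is how many examples and how much hardware one can throw at loops like this.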

Enterprise search is also old. Improvements have been slow and seem to be lagging behind other fields. Is it the vendors or is it the nature of the problem? The self appointed experts, failed webmasters, and former middle school teachers now working as taxonomy experts are pitching governance, semantics, and assorted packets of artificial butter.

Stephen E Arnold, November 28, 2015

Visualization Tool Round Up

November 28, 2015

Want to make a snappy visualization to impress your manager or a one star general? Navigate to “Top 5 Visualisation Tools” and explore the five recommendations. These systems output Hollywood-style charts. Just remember to know where a particular data point came from and how the number was produced. Well, if you are briefing a CEO or a four star general, you might not have to stick too close to the facts. Just make each chart shout, “Good news.”

Here are the five systems the write up explains and illustrates:

  • Gephi. Yep, free to use with a couple of caveats
  • Tom Sawyer Perspectives. Not for the Huck Finns eager to kick back on a raft
  • Keylines. You too can do geospatial integration
  • Linkurious. Sharpen your query language skills
  • GraphX. Open source and Spark. What could be more wonderful?

PowerPoint away. Just remember to make sure you can answer the question, “Where did that come from?”

Stephen E Arnold, November 28, 2015

Money Laundering: Digital Currency or Old Fashioned Methods?

November 27, 2015

Online is zeros and ones. I worked for a number of years for a fellow with lots of money who explained, “Money is information.” He was mostly correct. However, in the world of big time money laundering, online does not yet have the NFL lineman muscles to do the entire job of keeping financial transactions secret.

The challenge with digital currencies boils down to a search and retrieval problem. Actionable information is embedded in transaction data. Bad actors may not be Bitcoin fans for certain types of unregulated cash transfer tasks.

Navigate to “‘White Gloves,’ ‘VIP Boxes:’ How It’s Done at China’s Underground Banks,” which does a good job of explaining how more traditional money laundering is handled. Bitcoin is okay for moving assets if one has the time, the operational security, and expertise to make the system work.

For folks with JP Morgan-style funds, something more robust and reliable may be needed. Oh, the ability to keep the activity hard to find, hidden from regulators and tax authorities, and reliable is important.

The article states:

In one case Xinhua highlighted this week, state investigators accused a longtime general manager, surnamed Dai, in a state-owned engineering company, Beijing-based China Harbour Engineering, of helping to move $3 million of corruption-tainted gains via a Chinese underground bank onshore. The underground bank used a technique that regulators called an “audit hedge,” essentially depositing 18 million yuan in Mr. Dai’s onshore account in exchange for an equivalent amount of foreign exchange placed in the underground bank’s offshore account. No money crosses the border physically or electronically, making the transaction almost perfectly undetectable — hence “a hedge against audits.”

Another method is an old fave: Shell accounts. The article stated:

In Ningxia, a small northwestern region home to China’s Hui ethnic minority, criminal gangs in the provincial capital Yinchuan set up 12 trading shells that did nothing but generate false export data as a means to move money in or out of the country under the guise of legitimate corporate payments, according to Xinhua. Companies are allowed to move foreign exchange exceeding China’s $50,000 annual limit for legitimate purposes. Police found that the gangs marked the funds that moved through their shell accounts as “national export incentive awards” obtained from the Yinchuan City Bureau of Finance. Investigators alleged that the gangs used the scheme to defraud the Ningxia government since 2013 of export incentives worth 38.6 million yuan ($6 million). Export scams like these usually facilitate illegally moving funds onshore, rather than offshore. China controls foreign exchange coming onshore just as it does money trying to move offshore. The Ningxia case stemmed from 2013, when China was experiencing a high level of net capital inflows.

When will digital currencies facilitate money laundering on this supersized scale? Not surprisingly, verifiable data about the volume of money laundering via digital currencies is tough to obtain.

I would point out that old fashioned methods still have their use. Investigators, therefore, have to rely on useful software like Maltego and add ins and have the resources to dig out information the old fashioned way. This is not just feet on the street; it is humans pulling information threads.

Stephen E Arnold, November 27, 2015

IBM Cognos 2015 Pricing

November 27, 2015

IBM offers many products and services. Getting a firm, fixed cost for some of these can be tough. Asking Watson may not result in too many useful IBM cost outputs. A company’s IBM representative may be able to deliver the goods.

Imagine my delight when I read a semi content marketing item called “IBM Cognos business intelligence offers Self-Service BI.”

Here are the data I found interesting:

Cognos BI on Cloud offers three levels of user pricing and four levels of administrator pricing. User pricing is as follows:

  • A workgroup license is $75 per user, per month, with a minimum subscription of 50 users and a minimum six-month term. It is renewed semi-annually with monthly billing.
  • A standard license is $95 per user, per month, with a minimum subscription of 100 users and a minimum one-year term. It is billed monthly and renewed annually.
  • An enterprise license is $125 per user, per month, with a minimum subscription of 150 users and a minimum one-year term. It is billed monthly and renewed annually.

Administrator pricing is as follows:

  • Analytics Administrator (authorized user [AU]): List price is $15,100 per AU; typical discount is 30% and annual support percentage is 20%.
  • Analytics Explorer (authorized user and processor [PVU]): $2,500 per AU; typical discount is 30% and annual support percentage is 20%.
  • Analytics User (authorized user and processor [PVU]): $1,350 per AU; typical discount is 30% and annual support percentage is 20%.
  • Information Distribution (processor [PVU]): $500 per PVU; typical discount is 30% and annual support percentage is 20%.
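The numbers above turn into commitments quickly. The following is a rough sketch of the arithmetic, assuming the minimum user count and minimum term apply together and that the "typical discount" comes straight off list price. The write up does not say whether the 20% support figure applies to the list or the discounted price, so the sketch leaves support out. The dictionary layout and function names are made up for illustration.

```python
# Figures quoted in the write up; the structure is an illustrative assumption.
USER_TIERS = {
    "workgroup": {"per_user_month": 75, "min_users": 50, "min_months": 6},
    "standard": {"per_user_month": 95, "min_users": 100, "min_months": 12},
    "enterprise": {"per_user_month": 125, "min_users": 150, "min_months": 12},
}

ADMIN_LIST_PRICES = {
    "Analytics Administrator": 15_100,
    "Analytics Explorer": 2_500,
    "Analytics User": 1_350,
    "Information Distribution": 500,
}


def minimum_commitment(tier: str) -> int:
    """Smallest possible spend: price x minimum users x minimum term in months."""
    t = USER_TIERS[tier]
    return t["per_user_month"] * t["min_users"] * t["min_months"]


def discounted_price(list_price: int, discount_pct: int = 30) -> float:
    """List price after the 'typical' discount (assumed to come off list)."""
    return list_price * (100 - discount_pct) / 100


if __name__ == "__main__":
    for tier in USER_TIERS:
        print(f"{tier}: minimum commitment ${minimum_commitment(tier):,}")
    for name, price in ADMIN_LIST_PRICES.items():
        print(f"{name}: ${discounted_price(price):,.0f} after typical discount")
```

Under these assumptions the smallest workgroup deal is $22,500, the smallest standard deal is $114,000, and the smallest enterprise deal is $225,000, before any administrator licenses or support.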

The “menu” includes the variable pricing elements which IBM has used for decades. When we licensed ABI/INFORM document delivery to IBM, I happily implemented the same pricing scheme. Wow, does that approach yield revenue? Yep, it does.

I would point out that the write up does not beat the Watson drum. I find this amusing because Watson is marketed by IBM as an analytics champion. See, for example, “It’s Come to This for IBM: Watson Is Now a Gimmick App on the iPhone.” But never fear, Big Blue fans, IBM said in October 2015 that it was tweaking Cognos. How? According to eWeek, “IBM Redesigns Cognos Analytics to Resemble Watson Analytics.”

IBM has a bit of a revenue and profits hill to climb. IBM has the analytics tools to track its financial progress. Tools, however, do not equal sustainable, organic revenues.

Storm clouds remain even with the Weather Channel data.

Stephen E Arnold, November 27, 2015

IBM and Digital Piracy: Just Three Ways?

November 27, 2015

I read “Preventing Digital Piracy: 3 Ways to Use Big Data to Protect Content.” I love making complicated issues really easy. Remember the first version of the spreadsheet? Easy. Just get a terminal, wrangle the team to install LANPAR, and have at it. Easy as 1-2-3, which came after VisiCalc.

Ah, LANPAR. You remember that, right? I have fond memories of Language for Programming Arrays at Random, don’t you? I still think the approach embodied in that software was a heck of a lot more user friendly than filling in tiny rectangular areas with a No. 4 pencil and adding and subtracting columns using an adding machine.

IBM has cracked digital piracy by preventing it. Now I find that notion fascinating. On a recent trip, I noted that stolen software and movies were more difficult to find. However, a question or two of the helpful folks at a computer store in Cape Town revealed a number of tips for snagging digital content. One involved a visit to a storefront in a township. Magic. Downloaded stuff on a USB stick. Cheap, fast, unmonitored.

IBM’s solution involves streaming data. Okay, but maybe streaming for some content is not available; for example, a list of firms identified by an intelligence agency as “up and comers.”

IBM also wants me to build a real time feedback loop. That sounds great, but the angle is not rules. The IBM approach is social media. This fix also involves live streaming. Not too useful when the content is not designed to entertain.

The third step wants me to perform due diligence. I am okay with this, but then what? When I worked at a blue chip consulting firm, the teams provided specific recommendations. The due diligence is useless without informed, affordable options and the resources to implement, maintain, and tune the monitoring activity.

I am not sure what IBM expects me to do with these three steps. My initial reaction is that I would do what charm school at Booz, Allen taught decades ago; that is, figure out the problem, identify the options, and implement the approach that had the highest probability of resolving the issue. The job is not to generalize. Proper scope helps ensure success.

If I wanted to prevent digital piracy, I would look to companies which have sophisticated, automatable methods to identify and remediate issues.

IBM, for example, does not possess the functionality of a company like Terbium Labs. There are other innovators dealing with leaking data. I could use LANPAR to do certain types of spreadsheet work. But why? Forward looking solutions do more than offer trivial 1-2-3.

Stephen E Arnold, November 27, 2015

Individualized Facebook Search

November 27, 2015

Facebook search is a puzzle.  If you want to find a specific post that you remember seeing on a person’s profile, you cannot find it unless it is posted to their timeline.  It is a consistent headache, especially if you become obsessed with finding that post.  Mashable alerts us to a new Facebook pilot program, “Facebook May Soon Let You Search Individual Profile Pages.”  Facebook’s new pilot program allows users to search for posts within a profile.

The new search feature is only available to pilot program participants.  Based on the feedback, Facebook will evaluate the search function and announce a potential release date.

“Facebook says it’s a small pilot program going around the U.S. for iPhone and desktop and that users have requested an easier way to search for posts within a person’s profile. The feature is limited in nature and only showing up for a select group of people who are part of the pilot program. The social network will be evaluating feedback based on the pilot. No plans for an official rollout have been announced at this time.”

The search feature shows up on user profiles as a basic search box with the description “search this profile” with the standard magnifying glass graphic.  It is a simple addition to a profile’s dashboard and it does not take up much space, but it does present a powerful tool.

Facebook is a social media platform that has ingrained itself into functions ranging from business intelligence to regular socialization. As we rely more on it for daily functions, information needs to be easy to recall and access.  The profile search feature will probably be a standard Facebook dashboard function by 2016.

Whitney Grace, November 27, 2015
Sponsored by, publisher of the CyberOSINT monograph

How Semantic Technology Will Revolutionize Education

November 27, 2015

Will advanced semantic technology return us to an age of Socratic education? In a guest post at Forbes, Declara’s Nelson González suggests that’s exactly where we’re heading; the headline declares, “The Revolution Will Be Semantic: Web3.0 and the Emergence of Collaborative Intelligence.” In today’s world, stuffing a lot of facts into each of our heads is much less important than the ability to find and share information effectively. González writes:

“Most importantly, Web3.0 is opening paths to collaborative intelligence. Isolated individual learning is increasingly irrelevant to organizational health, which is measured largely through group metrics. Today, public and private institutions live or die based on the efficiency, innovation, and impact of corporate efforts.”

The post points to content curators like Flipboard and Pinterest as examples of such collective adaptive  capacity, then looks at effects this shift is already beginning to have on education. González gives a couple of examples he’s seen around the world, and discusses ways collaboration software like his company’s can facilitate new ways of learning. See the article for details. He writes:

“Web 3.0 is unleashing a kind of ‘back to the future’ innovation, the digital democratization of what élites have always practiced: deep learning through imitative apprenticeship, humanistic personalization via real-time observation, and mastery through crowdsourced validation. Silicon Valley is thus enabling us all to become the sons and daughters of Socrates.”

Launched in 2012, Declara set out to build better bridges between online sources of knowledge. The company is based in Palo Alto, California.

Cynthia Murrell, November 27, 2015

Sponsored by, publisher of the CyberOSINT monograph

Turkey, Watson. Turkey Meatballs Now

November 26, 2015

I read an article which is not a sketch for Saturday Night Live.

The juxtaposition of Watson and turkey strikes me as something a comedy writing team would craft on a pressure packed Wednesday afternoon for a demanding Jimmy Kimmel or Stephen Colbert type entertainer.

I now believe “This Is IBM Watson’s Favorite Turkey Recipe” is an honest, down home, authentic effort from IBM’s tireless Watson researchers. It is difficult to get open source code, home brew scripts, and acquired technology to put Julia Child’s recipe for turkey back in the drawer.

I asked, “Wow, Watson does turkey recipes?” Who knew?

The write up explains:

In honor of the U.S. Thanksgiving holiday, Fortune reached out to IBM to find out Watson’s perfect turkey recipe. What came back wasn’t exactly a Thanksgiving classic: meatballs.

Well, the savvy editors at Fortune were not happy with plain old meatballs. Watson ran another string of zeros and ones through its magical system and generated a hard cider sauce for those IBM meatballs. Inspiration may have come from Bob Dylan, the Watson icon, and the Dylan song Copper Kettle:

Get you a copper kettle, get you a copper coil
Fill it with new made corn mash and never more you’ll toil
You’ll just lay there by the juniper while the moon is bright
Watch them just a-filling in the pale moonlight.

The alcoholic touch adds zest to the bacon infused mix on an American holiday. I am not sure how this bacon and booze thing works for some religious groups, people with dependency problems, and health nuts who associate bacon with cancer and liquor with liver issues. Well, trivial issues I assume from the point of view of the Watson kitchen wizards.

Thanks to Fortune and IBM Watson for a delightful alternative to deep fried turkey on “a day of giving thanks for the blessing of the harvest and of the preceding year.” (Wikipedia)

Stream a Dylan tune and dig in, gentle reader. (Aren’t meatballs Swedish?) Ask Watson.

Stephen E Arnold, November 26, 2015
