Stache for Advanced Bookmarking

July 7, 2014

There’s another way to bookmark Web pages. Stache from d3i offers several features that go beyond those offered by the browser basics. The app has been designed for Macs and Apple’s i-devices; Android and Windows users, I’m afraid, are out of luck. The description pledges:

“If a page is useful, Stache it! Stache turns cluttered browser bookmarks and overwhelming reading lists in to a beautiful, visual and fully searchable collection of useful pages. No more digging through endless lists of page titles, or spending your precious time organising your bookmarks into folders. In one click a web page becomes part of your personal repository of useful information, archived, searchable and accessible in seconds from all of your devices.”

Features include one-click bookmarking; a stored screenshot of each marked page; entire-page search (they call this “complete” search); and, of course, syncing to the (i) cloud. Mac users get additional features, like full-page archiving and bookmark importing/exporting. The app can be downloaded for Macs here, for iPhones, iPads, and iPods here, and for Safari or Chrome on a Mac here.

Founded in 2008, d3i specializes in designing apps for the iPhone and iPad. The company also developed the journaling app Momento, which integrates with Web services like Facebook, Twitter, Flickr, YouTube, and RSS feeds. D3i is based in Buckinghamshire, U.K.

Cynthia Murrell, July 07, 2014

Sponsored by ArnoldIT.com, developer of Augmentext

Spreadsheet Fever May Suffer Spreadsheet Goofs

July 7, 2014

The data-analysis work of recently prominent economist Thomas Piketty receives another whack, this time from computer scientist and blogger Daniel Lemire in, “You Shouldn’t Use a Spreadsheet for Important Work (I Mean It).” Piketty is not alone in Lemire’s reproach; last year, he took Harvard-based economists Carmen Reinhart and Kenneth Rogoff to task for building their influential 2010 paper on an Excel spreadsheet.

The article begins by observing that Piketty’s point, that in today’s world the rich get richer and the poor poorer, is widely made but difficult to prove. Though he seems to applaud Piketty’s attempt to do so, Lemire really wishes the economist had chosen specialized software, like STATA, SAS, or “even” R or Fortran. He writes:

“What is remarkable regarding Piketty’s work, is that he backed his work with comprehensive data and thorough analysis. Unfortunately, like too many people, Piketty used speadsheets instead of writing sane software. On the plus side, he published his code… on the negative side, it appears that Piketty’s code contains mistakes, fudging and other problems….

“I will happily use a spreadsheet to estimate the grades of my students, my retirement savings, or how much tax I paid last year… but I will not use Microsoft Excel to run a bank or to compute the trajectory of the space shuttle. Spreadsheets are convenient but error prone. They are at their best when errors are of little consequence or when problems are simple. It looks to me like Piketty was doing complicated work and he bet his career on the accuracy of his results.”

The write-up notes that Piketty admits there are mistakes in his work, but asserts they are “probably inconsequential.” That’s missing the point, says Lemire, who insists that a responsible data analyst would have taken more time to ensure accuracy. My parents always advised me to use the right tool for a job: that initial choice can make a big difference in the outcome. It seems economists may want to heed that common (and common sense) advice.

Cynthia Murrell, July 07, 2014


13 Career Jeopardizing Enterprise Search Issues

July 6, 2014

The ArnoldIT team has combed through our archive of enterprise search data. We have identified the top 13 surprises that enterprise search delivers to licensees. Get hit with several of these surprises and you might find yourself seeking a future in a different line of work.

13 Users don’t like the new system or the old system, for that matter. Dissatisfaction with enterprise search, regardless of vendor, runs at 55 to 70 percent. See Successful Enterprise Search Management at http://www.galatea.co.uk/index.php?option=com_content&task=view&id=35&Itemid=53

12 No one pays attention to search costs until the CFO conducts an audit. Cost overruns plague nine out of 10 enterprise search deployments. The reason is that a comprehensive compilation of costs is not part of an enterprise search deployment. When a system crashes, the costs for emergency and rush work come from a different line item. Customization is usually taken from a different budget allocation. Consultants and contractors are paid from another budget allocation. When the costs are added up, everyone seems surprised at the money spent for a system few find satisfactory. The hunt for a scapegoat is on.

11 Open source search and proprietary search solutions differ little in costs. Aside from an initial licensing fee, the costs for customization, optimization, contractors, programming, and enhancement are essentially the same.

10 Many major enterprise search systems are now based on open source software. The reason is that the costs for the basic functions are rising and are difficult to control. Therefore, vendors use open source and concentrate on extra cost add-ons.

9 Every enterprise search system struggles to process content without human intervention, additional “connectors,” extract transform load activities, and original scripting. When content cannot be acquired, the few who notice will squawk, often loudly.

8 Latency creates a problem because new or changed content imposes significant costs on the licensee, who must process content in near real time and then refresh the indexes, whether these are state-of-the-art in-memory systems or old-style spinning discs and cached methods. When a user asks, “Why is the system too slow?” it may be difficult to make improvements when budgets are constrained.

7 Modern systems include add-ons that permit faceting, query expansion, and linguistic functions. Unless these are “tuned” by subject matter experts or analysts, the systems can generate irrelevant or off-query outputs. The notion of “smart, automatic” search and retrieval is often a chimera.

6 Users typically do not conduct a thorough, research librarian type of investigation of query results. Enterprise search systems that generate laundry lists of results that are stale or irrelevant will be used by the person running the query. The assumption that online systems are “correct” is held by 95 percent of an enterprise system’s users. Only when a user cannot find a document that is supposed to be in the search index will the user realize that the system is not working as assumed. If the issue arises during a crunch, the prudent search manager will polish that résumé.

5 Enterprise search scaling is expensive and complex. The idea that scaling is seamless and economical is false. Improving the “performance” of an enterprise search system requires correct identification of the particular factor or factors creating latency. More frequent updates may not be possible without re-engineering an enterprise search system’s infrastructure. How much has Google’s core method changed in 14 years? What about Amazon? What about Autonomy, Endeca, Lucene, etc.? The answer is, “Not too much.” Search is very, very complicated.

4 Interface does not improve precision and recall of a search system. Interface and cosmetic design changes are easy to talk about and “more fun” to work on than figuring out how to process content more quickly and update the searchable indexes with significantly less latency. If users grouse, an interface change won’t silence the critics or slow the proliferation of bootleg systems in units that are dissatisfied with the search status quo.

3 Search with text mining functions often relies on standard methods and algorithm configurations that licensees cannot modify without specialist training. As a result, many systems output results that may be based on assumptions not germane to the licensee’s content. Hence, outputs purporting to provide insight into business intelligence or predictions may be incorrect. Search is not text mining. Search is not a Silver Bullet for Big Data. Search is pretty much type a query and get a laundry list of stuff that must be reviewed by a human. Automatic reports are often off point.

2 Search appliances are not money savers. The Google Search Appliance costs as much as Autonomy or Endeca to deploy. The cloud is not the big money saver marketers want me to believe it is. Cloud search solutions reduce the need for capital expense, but the ongoing costs are comparable to on premises solutions. A search appliance may be like handcuffs. The cloud may be overly complicated. No highway leads to the Magic Kingdom for search. If you think search is a slam dunk, you are misinformed.

1 Enterprise search systems are more alike than different. The reason is that computational methods have not changed much since the first commercial systems became available in the late 1960s. The differences are created by marketing, not by significantly different numerical recipes. Most users cannot differentiate between or among systems. The concepts of precision and recall are unknown. Users believe that search systems are right almost all the time. Yikes.
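
For readers to whom precision and recall are indeed unknown, here is a minimal Python sketch of the two measures for a single query. Precision is the share of returned documents that are relevant; recall is the share of relevant documents that were returned. The function name and the document IDs are made up for illustration and do not come from any vendor's system.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: document IDs the search system returned
    relevant:  document IDs a human judged relevant to the query
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set  # relevant documents actually returned

    # Guard against empty inputs so the function never divides by zero.
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical document IDs, for illustration only.
retrieved = ["doc1", "doc2", "doc3", "doc4"]
relevant = ["doc2", "doc4", "doc7", "doc8", "doc9"]

p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.50 recall=0.40
```

In this toy case half of what the system returned is useful, and more than half of the useful documents never surfaced. A user who trusts the results list sees neither number.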

Stephen E Arnold, July 6, 2014

MillionShort: A Path around Search Engine Optimization

July 5, 2014

In my lectures for members of the intelligence community, I talk about how to move “Beyond Google.” I rely on several online search services that are not embraced by the unwashed millions who perceive Google as the Alpha and the Omega of search and retrieval. Google is not objective. The more quickly online users accept the pervasiveness of subjectivity in search results, the more likely a mobile user will be able to locate the Cuba Libre Restaurant in Washington, DC, near the Google offices and pin down the whereabouts of a person like eBay’s chief technical officer. The MillionShort.com system allows me to jump over the irrelevant baloney generated by heavily SEO’ed sites. Man, I hate SEO. Does Matt Cutts’ leave of absence suggest that he too cannot cope with the rigors of eroding objective, relevant results to a Google query?

I noticed a few days ago that the MillionShort.com search system was returning no results, a blank screen, or a message saying that the service was down. I was worried. MillionShort uses a combination of Bing application programming interface calls, some proprietary scripts, and its own index to chop out the Web sites that I love to hate. The name “million short” means that I can NOT out the offensive entertainment sites that pump Justin Bieber information to me, the wildly distorted search engine optimized sites that display useless, off point content, and sites that I really never want to see again. Do you too feel this way about www.about.com or www.wikipedia.com or Google.com?

Here’s what MillionShort.com lets me do. I can run a query and narrow the results with a single click to sites that are not in the Top 1000 most popular Web sites. Try the service and run a query. Instead of showing me the drivel that passes for news from Yahoo.com or CNN.com, I can pinpoint gems like YouTube videos that provide specific information about certain illicit activities, identify blogs that contain information about moderators (if you don’t know what this is, then you won’t appreciate the value of the links), and similar topics that often cannot be found using Blekko, Exalead search, Google, or Yandex in .com and .ru flavors.
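
The general idea of removing the most popular sites from a results list can be sketched in a few lines of Python. This is an illustration only, not MillionShort's actual method, which has not been disclosed to me. The popularity ranking, the cutoff, the function names, and the example URLs are all hypothetical.

```python
from urllib.parse import urlparse


def is_blocked(url, blocked_hosts):
    """True if the URL's host is a blocked host or a subdomain of one."""
    host = urlparse(url).hostname or ""
    return any(host == b or host.endswith("." + b) for b in blocked_hosts)


def drop_popular(results, ranked_hosts, cutoff):
    """Remove results whose site is among the `cutoff` most popular hosts.

    ranked_hosts is assumed to be ordered from most to least popular.
    """
    blocked = set(ranked_hosts[:cutoff])
    return [url for url in results if not is_blocked(url, blocked)]


# Hypothetical popularity ranking and result list, for illustration only.
ranked_hosts = ["google.com", "yahoo.com", "cnn.com", "wikipedia.org"]
results = [
    "http://www.cnn.com/story",
    "http://smallblog.example/post",
    "http://news.yahoo.com/item",
    "http://www.wikipedia.org/wiki/Search",
]

filtered = drop_popular(results, ranked_hosts, cutoff=3)
print(filtered)
```

With a cutoff of 3, the CNN and Yahoo links drop out, while the small blog and the fourth-ranked site remain. A real service would set the cutoff at 1,000 or 1,000,000 and would need a maintained ranking of sites.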

MillionShort.com is operated by an entrepreneur whom I am chasing for more details about the system. If I uncover something useful via MillionShort or one of the other “off the radar” services I profile in my intel lectures, I may share some information nuggets in this blog. In the meantime, check out the service. If you get a “not available” message, check back every hour or so. The service comes back up, which is a very good thing for intrepid researchers. For those who want their pizza from the microwave, MillionShort.com may not fit your info life style. Your loss, I fear.

Stephen E Arnold, July 5, 2014

Information Manipulation: Accountability Pipe Dream

July 5, 2014

I read an article with what I think is the original title: “What does the Facebook Experiment Teach us? Growing Anxiety About Data Manipulation.” I noted that the title presented on Techmeme was “We Need to Hold All Companies Accountable, Not Just Facebook, for How They Manipulate People.” In my view, this mismatch of titles is a great illustration of information manipulation. I doubt that the writer of the improved headline is aware of the irony.

The ubiquity of information manipulation is far broader than Facebook twirling the dials of its often breathless users. Navigate to Google and run this query:

cloud word processing

Note anything interesting in the results list displayed for me on my desktop computer:

image

The number one ad is for Google. In the first page of results, Google’s cloud word processing system is listed three more times. I did not spot Microsoft Office in the cloud except in item eight: Is Google Docs Making Microsoft Word Redundant.

For most Google search users, the results are objective. No distortion evident.

Here’s what Yandex displays for the same query:

image

No Google word processing and no Microsoft word processing whether in the cloud or elsewhere.

When it comes to searching for information, the notion that a Web indexing outfit is displaying objective results is silly. The Web indexing companies are in the forefront of distorting information and manipulating users.

Flash back to the first year of the Bush administration when Richard Cheney was vice president. I was in a meeting where the request was considered to make sure that the vice president’s office Web site would appear in FirstGov.gov hits in a prominent position. This, gentle reader, is a request that calls for hit boosting. The idea is to write a script or configure the indexing plumbing to make darned sure a specific URL or series of documents appears when and where they are required. No problem, of course. We created a stored query for the Fast Search & Transfer search system and delivered what the vice president wanted.
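
To show how little machinery hit boosting requires, here is a minimal Python sketch. It is a generic illustration and not the Fast Search & Transfer stored query mechanism. The function name, the pinned table, and the example URLs are assumptions made up for this example.

```python
def boost_hits(query, organic_results, pinned):
    """Place pinned URLs for this query ahead of the organic ranking.

    pinned maps a normalized query string to the URLs that must appear first.
    """
    promoted = list(pinned.get(query.strip().lower(), []))
    # Keep the organic order for everything else, without duplicating pins.
    rest = [url for url in organic_results if url not in promoted]
    return promoted + rest


# Hypothetical pinned table and organic ranking, for illustration only.
pinned = {"vice president": ["http://vp.example.gov/"]}
organic = ["http://a.example/", "http://vp.example.gov/", "http://b.example/"]

print(boost_hits("Vice President", organic, pinned))
# ['http://vp.example.gov/', 'http://a.example/', 'http://b.example/']
```

The user sees an ordinary results list. Nothing on the page indicates that the first hit was placed there by hand.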

This type of results manipulation is more common than most people accept. Fiddling Web search, like shaping the flow of content on a particular semantic vector, is trivial. Search engine optimization is a fools’ game compared with the tried and true methods of weighting or just buying real estate on a search results page, a Web site from a “real” company.

The notion that disinformation, reformation, and misinformation will be identifiable, rectified, and used to hold companies accountable is not just impossible. The notion itself reveals how little awareness there is of how the actual methods of digital content injection work.

How much of the content on Facebook, Twitter, and other widely used social networks is generated by intelligence professionals, public relations “professionals,” and folks who want to be perceived as intellectual luminaries? Whatever your answer, what data do you have to back up your number? At a recent intelligence conference in Dubai, one specialist estimated that half of the traffic on social networks is shaped or generated by law enforcement and intelligence entities. Do you believe that? Probably not. So good for you.

Amusing, but as someone once told me, “Ignorance is bliss.” So, hello, happy idealists. The job is identifying, interpreting, and filtering. Tough, time consuming work. Most of the experts prefer to follow the path of least resistance and express shock that Facebook would toy with its users. Be outraged. Call for action. Invent an algorithm to detect information manipulation. Let me know how that works out when you look for a restaurant and it is not findable from your mobile device.

Stephen E Arnold, July 5, 2014

Social Silliness: Search, Collaboration, Fear

July 5, 2014

I am not able to recall which conference featured a speaker who said, “Social search is the future of search.” At this same event, social was the solution to cost control, competitive intelligence, and silos of information. I napped through the first day’s events, delivered a keynote on the second day, and disappeared as quickly as my fat, flat feet could carry me. That was in 2005 or 2006.

Between World Cup games, I read a classic IDG “real” news story with the fetching headline “Many Employees Won’t Mingle with Enterprise Social Software.” My immediate reaction was, “Is this reporter just figuring this out?”

The write up wends its way across four pages of page view goodness. Here are the diamond like insights that I noted. But you need to read the article yourself. You may have a different view because you are unaware of the value of tracking and processing each and every social network click, mouse movement, and dwell time, among dozens and dozens of useful user activities.

Wow, Real Value

Here’s the quote:

Implemented properly, ESN [employee social networks] can be beneficial, analysts say. “It’s great for breaking down geographical barriers and harnessing collective action,” said Rob Koplowitz, a Forrester Research analyst. “Their value can be astronomical.” The siren song of ESN is hard to resist. Spending on this type of software is expected to grow from US$4.77 billion this year to $8.14 billion in 2019, according to MarketsandMarkets.

I wonder how those social networks at the azure chip (lower tier) consulting firms are working. Well, no word on that. And the market size estimate? Just about any crazy number is okay. The notion of accurate market forecasts is essentially irrelevant in the world of a publishing company which puts its employees’ names on other people’s information.

Oh, Oh, Advisory Firms See Problems Ahead

Another quote from the article. (I wonder if I am reading an Onion parody.)

Gartner predicts that through 2015, 80 percent of social business efforts will not achieve their intended benefits due to inadequate leadership and an overemphasis on technology, she [another expert] said. Charlene Li, an Altimeter Group analyst, shares a similar view. “It’s not a situation where if you build it, they will come. That’s not how it works at all,” she said. “Adoption definitely continues to be a problem.”

Yep, fear does that I assume.

Training to the Rescue

Here’s a roasted chestnut:

It’s also important to provide proper training to show employees how they can switch some — or many — email and IM interactions over to the ESN software, and be more productive and efficient. It’s also key for managers and top company executives to endorse the use of the ESN software and lead by example through their own participation. Experts also say it helps when the ESN software is integrated at a technology level with the other tools employees use on a daily basis to do their jobs, whether its their email and calendaring client, their CRM and ERP suites or their office productivity applications.

Yep, let’s train workers to use the company’s social network. No problemo.

GE Is So Well Managed

When I worked at Booz, Allen & Hamilton, I had the experience of meeting Neutron Jack. Well, the management slickness was not evident that day. I recall he threw papers at my boss who was trying to get Neutron Jack to pay an invoice. Let’s say the meeting was a bit like the 4th of July. Here’s the IDG take:

All that can be done in a way that works as intended. GE, which has made use of many of these best practices when rolling out ESN software in recent years, achieved success where other companies have stumbled. GE has a primary ESN suite that’s available to all 300,000 employees globally and that’s known internally as GE Colab, and it has other ESN tools in place for specific teams and departments.

That sounds a lot like GE public relations. Why interview any GE users? Why ask if the social network activity is part of the employee review process? Why ask about the intersection of info on the network and GE security? Too much hassle I suppose. Hey, a case is needed and GE is a gold mine of just so special business cases.

Ah, Deep Integration

I found this quote fascinating:

For Alan Lepofsky, a Constellation Research analyst, the meshing of ESNs with business processes is essential. “If an ESN is not integrated with tools like file-sharing, CRM, marketing automation, support tracking or project management, then it becomes just another tool, and that is where adoption issues begin,” he said via email. Organizations need to ensure that ESNs are woven deeply in to their core business processes in areas such as sales, marketing and engineering, according to Lepofsky.

I have a headache.

Net Net

I am not sure about every US company, but the ones with which I am familiar are not particularly well organized. The notion that electronic systems can create a cohesive work force does not match my experience. What makes employees do stuff is the employee’s executive compensation plan tied to specific actions. What creates organization is informed management, experienced middle managers who know their jobs and the customers, and consistency.

Most companies today are looking for silver bullets. Search vendors promise customer relationship management miracles and business intelligence magic. Senior managers are desperate, often fearful, sometimes clueless, and looking for a bigger payday. One crazy software or technical trend follows on another. As companies struggle to make sales and control costs, problems ranging from ignition switches to shaving funds from investors are characteristic of many businesses.

Let’s not confuse cost control and the shift to contract workers with technology that delivers management miracles.

More to the point, let’s have “real” news sources present information that is not blather, glittering generalities, and MBA baloney.

Does IDG need a Neutron Jack? Pat McGovern left quite a legacy. Is fear an underrated management technique?

Stephen E Arnold, July 5, 2014

ThoughtSpot: Another Search Vendor with Aspirations of Big Bucks

July 4, 2014

One of the two or three readers of this blog reported a new and revolutionary search and Big Data vendor called ThoughtSpot. I navigated to the site and enjoyed the wolf / dog. The headline is:

Your business is fast and data hungry.

I really liked the wolf / dog. I found the various links kept pointing to the wolf / dog. I am no longer fast or data hungry. I am outta here. Maybe a reader will let me know when the Web site is working again. The company has captured $30 million in funding according to Venture Beat. I assume the Web site will be fattened in the days ahead. This should be easy. According to Google Maps, ThoughtSpot is very near the In and Out Burger in Redwood City. Presumably the Google-like search for Big Data will be the next double double cheeseburger. My dogs like In and Out Burgers. Neither is fast nor data hungry.

Stephen E Arnold, July 4, 2014

Presentation by a NoSQL Leader

July 4, 2014

The purported father of NoSQL, Norman T. Kutemperor, made an appearance at this year’s Enterprise Search & Discovery conference, we learn from “Scientel Presented Advanced Big Data Content Management & Search With NoSQL DB at Enterprise Search Summit in NY on May 13” at IT Business Net. The press release states:

“Norman T. Kutemperor, President/CEO of Scientel, presented on Scientel’s Enterprise Content Management & Search System (ECMS) capabilities using Scientel’s Gensonix NoSQL DB on May 13 at the Enterprise Search & Discovery 2014 conference in NY. Mr. Kutemperor, who has been termed the Father of NoSQL, was quoted as saying, When it comes to Big Data, advanced content management and extremely efficient searchability and discovery are key to gaining a competitive edge. The presentation focused on: The Power of Content – More power in a NoSQL environment.”

According to the write-up, Kutemperor spoke about the growing need to manage multiple types of unstructured data within a scalable system, noting that users now expect drag-and-drop functionality. He also asserted that any NoSQL system should automatically extract text and build an index that can be searched by both keywords and sentences. Of course, no discussion of databases would be complete without a note about the importance of security, and Kutemperor emphasized that point as well.

The veteran info-tech company Scientel has been in business since 1977. These days, they focus on NoSQL database design; however, it should be noted that they also design and produce optimized, high-end servers to go with their enterprise Gensonix platform. The company makes its home in Bingham Farms, Michigan.

Cynthia Murrell, July 04, 2014


Watson Wants To Help Vets

July 4, 2014

US veteran medical care is an ongoing topic of debate in the federal government. One of the biggest arguments is how to maintain veterans’ medical records. Business Insider reports in “IBM Overhauls Military Medical Records” that the company may instill order in the digital chaos. According to the article, IBM and Epic Systems are competing for an $11 billion defense contract to revamp the Military Health Systems. The planned overhaul would affect 9.7 million veterans, active military, and their families.

Whom do IBM and Epic Systems want running the program? Watson. IBM has been working on Watson’s medical diagnosis capabilities as well as his cooking skills. The idea is that Watson will clean up an Obamacare portion that demands healthcare contractors deliver flawless results.

“The Defense Healthcare Management Systems Modernization (DHMSM) contract would update the Pentagon’s record system and allow for easier sharing with the Veterans Affairs Administration (VA), which was recently engulfed in a patient-care scandal that culminated with the resignation of VA Secretary Eric Shinseki.”

IBM toots its horn with regard to healthcare technology and its team of 300 consultants. The company is also proud of its Epic Systems partnership. IBM has been working with the government for decades; it provided translation technology for the Nuremberg Trials and more recently won a $1 billion cloud computing contract. IBM is known for reliable, quality products. If Watson the supercomputer cannot fix the health records problem, who will?

Whitney Grace, July 04, 2014

The Big Project News Page Updated

July 3, 2014

If you are a fan of the news page provided by The Big Project, you will already know about the update. If you are not familiar with this collection of news sources, you may be interested in the useful service at http://bit.ly/1qr2qVX. I highlight this portal in my lectures for the intelligence conferences featuring my presentation “Beyond Google.” A Newsnow.co.uk style news stream appears. The main links have been shifted left, but the usability has not been impaired. If you find that your country blocks access to The Big Project, you can use a Web proxy to follow news in English and other languages for most countries in the world. Highly recommended.

Stephen E Arnold, July 3, 2014
