Wolfram Bids for Dominance in the Search Pack

March 9, 2009

A happy quack to the reader who alerted me to this news story: “Wolfram Alpha: Next Major Search Breakthrough?” here. The system, according to Dan Farber, is called Alpha: Computational Knowledge Engine. The name alone puts to rest any consultant baloney about simplicity and stability in search. Stephen Wolfram is the author of Mathematica, the gold standard in equation crunching. He has whipped out a couple of two-pound books that will give most liberal arts grads a migraine. A New Kind of Science has 1,200 pages and lots of equations. Yummy. More detail than a William Carlos Williams poem too.

The new system becomes available in May 2009. Not surprisingly, Dr. Wolfram uses lots of math to make the computational knowledge engine sit up and roll over. I don’t have any information in my files about Alpha. You can get the facts from Mr. Farber’s write up.

One item that caught my attention was:

Google would like to own it [Alpha].

With Twitter deemed an also-ran by the GOOG, maybe Dr. Wolfram’s math will catch the company’s eye. More information as I find it.

Stephen Arnold,  March 9, 2009

Searching Twitter

March 9, 2009

At dinner on Saturday night, the conversation turned to Twitter. One of the guests asked, “Why would I want to use Twitter?” Another asked, “What’s it good for?” I listened. I will forward to each person in the dinner party Chris Allison’s “Welcome to the Hive Mind: Learn How to Search Twitter” here. Mr. Allison does a good job of documenting Twitter’s real time search system. If you too are baffled by Twitter, read the article and give Twitter a whirl. Join the growing number of intelligence, law enforcement, and business intelligence professionals who are also learning about real time search. Note: most of the information in a Tweet is inconsequential. Aggregated, the micro blog posts are useful.
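
For those who want to poke at the plumbing, the mechanics are simple enough to script. Below is a minimal sketch in Python against Twitter’s public search API as documented at the time (the search.twitter.com JSON endpoint with its q and rpp parameters). It illustrates the idea only; it is not Mr. Allison’s code, and the endpoint details are Twitter’s to change.

```python
import json
import urllib.parse
import urllib.request

def search_twitter(query, results_per_page=15):
    """Fetch tweets matching a query from the public search endpoint
    (search.twitter.com as documented in 2009; subject to change)."""
    params = urllib.parse.urlencode({"q": query, "rpp": results_per_page})
    url = "http://search.twitter.com/search.json?" + params
    with urllib.request.urlopen(url) as response:
        data = json.load(response)
    # Each result carries the author, timestamp, and tweet text.
    return [(r["from_user"], r["created_at"], r["text"])
            for r in data.get("results", [])]

if __name__ == "__main__":
    # Aggregate view: one tweet is trivia; a screenful shows a trend.
    for user, when, text in search_twitter("enterprise search"):
        print(f"{when}  @{user}: {text}")
```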

Stephen Arnold, March 9, 2009

Search: Still in Its Infancy

March 9, 2009

Click here and read the job postings for intelligence professionals. Notice that the skills required include an ability to manipulate information, not just in English but in other languages. Here’s a portion of one posting:

Core Collector-certified Collection Management Officers (CMO’s) oversee and facilitate the collection, evaluation, classification, and dissemination of foreign intelligence developed from clandestine sources. CMO’s play a critical role in ensuring that foreign intelligence collected by clandestine sources is relevant,

I keep reading that search is stable and search is simple. I don’t think so. Because language is complex, the challenge for search and content processing vendors is significant. With more than 150 systems available to manipulate information, one would think that software could handle basic collection and analysis, right? Software helps, but search is still in its infancy. The source of the jobs? The US Central Intelligence Agency, which is reasonably well equipped with search, text processing, and content analysis systems. Too bad the reality of search is complex, but some find it easy to say the problem is solved and move on in a fog of wackiness.

Stephen Arnold, March 9, 2009

Microsoft Bets on Improved Web Search

March 9, 2009

I saw this story on March 4, 2009, and I came back to it today (March 8, 2009). I thought I could locate my Microsoft Web search timeline. Alas, it eludes me. I have been keeping track of the “improvements” and other Web search initiatives for a number of years. The list is of modest interest. The entries are little more than a sequence of dates and the Web search actions Microsoft took. When Microsoft bought Powerset, a provider of semantic search demonstrated on Wikipedia (a popular corpus for vendors), I made a note in July 2008: Powerset technology is based in part on older Xerox PARC semantic components.

The story “Microsoft Eyes Better Searches, Bigger Market Share”, via Newsfactor but available to me here, said:

Microsoft is testing features that will give searchers organized results to save time, according to Nadella [the Microsoft search wizard du jour]. A feature has been added on the left side of the results pages to give users access to tools to help complete various tasks. The company has also added other features like single-session history and hover preview.

What I found more interesting was the data (maybe assertions?) included in the write up; for example:

  • 40 percent of search queries go unanswered
  • Half of the queries are about searchers returning to previous tasks
  • 46 percent of sessions are longer than 20 minutes.

As I read this, I thought back to the phone call I received when I pointed out that search was pretty awful. The person on that call, whose name I can’t recall, told me that Microsoft had a system that made my criticism of search in general inapplicable to Microsoft. That call was in 2006, when I was finishing the third and final edition of the Enterprise Search Report that I wrote. (Hooray! I was done with a 600-page encyclopedia.)

But this news story made it clear, to me at least, that search is a work in progress. And the issues addressed in the article and highlighted with the data above suggest to me that Microsoft wants to move from Web search to some richer information-centric application; for example, “tools”.

Google enjoys a big lead over Ask.com, Microsoft, and Yahoo in Web search. Over the last year, Google has maintained its lead and in some sectors increased it as Ask.com and Microsoft lost share and Yahoo held steady or experienced fractional increases in usage.

The secret to Web search is anchored in traffic. Lots of traffic dilutes many search sins. Microsoft has to generate traffic. That’s a tough job, and I don’t think a new brand and tools will do the job. Microsoft has tried this before. I remember weird little butterflies stuck to buildings and sidewalks in New York. If I had my timeline, I would have the date. Seems like only yesterday.

Stephen Arnold, March 9, 2009

Digital Reef: A Similarity Search Engine

March 9, 2009

Straightaway, there are two “digital reefs”. One is an e-learning company. The other (www.digitalreefinc.com) is a content processing company. In my notes, I described the company as offering an “unstructured data management platform.” The headline on the content processing company’s Web site here is “massively scalable”, which is a good thing. The company, according to my notes, was originally Auraria Networks. When an infusion of venture funding arrived, the Digital Reef name was adopted. I’m grateful. I didn’t know how to spell or pronounce “Auraria”. I filed the company under Aura, which was close enough for horseshoes.

Organizations are awash in data, and most are clueless about the nuggets within and about the potential risks the data contain. To get a peek under the hood, you will want to download the company’s white paper here. The document is 13 pages long. You can review it at your leisure. The company’s news release here said:

Digital Reef (www.digitalreefinc.com), one of Matrix Partners’ and Pilot House Ventures’ premier portfolio companies, today announces a new approach to discovering and managing unstructured and semi-structured data. The Digital Reef solution helps large enterprises deal with key business issues that cannot be properly addressed using traditional solutions. These issues include eDiscovery, data risk mitigation, knowledge reuse, and strategic storage initiatives—all of which stem from lack of control over unstructured data, and require a degree of scalability and performance that traditional solutions cannot provide.

The company’s system “was designed to rapidly address very large stores of unstructured data, without manual effort or disruption to data center or business activity.” With the company’s analysis and classification tools, a licensee can do the following (a rough sketch of the first capability appears after the list):

  • Locate specific kinds of data, including sensitive data like Social Security and credit card numbers
  • Identify regulated data for compliance
  • Pinpoint relevant documents for pending legal action
  • Find intellectual property that can be reused for competitive advantage.
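
Digital Reef does not disclose how its classifiers work, so treat this Python fragment as a toy sketch of what the first bullet implies: regular-expression pattern matching for US Social Security numbers and credit card numbers, with a Luhn checksum to weed out look-alike digit strings. The patterns, names, and sample text are my illustration, not the company’s code.

```python
import re

# Hypothetical patterns for illustration; a real policy would be tuned.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")  # 13-16 digits

def luhn_valid(digits):
    """Luhn checksum: weeds out digit strings that merely look like cards."""
    total, parity = 0, len(digits) % 2
    for i, ch in enumerate(digits):
        d = int(ch)
        if i % 2 == parity:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_sensitive(text):
    """Return (label, match) pairs for apparent SSNs and card numbers."""
    hits = [("ssn", m.group()) for m in SSN_PATTERN.finditer(text)]
    for m in CARD_PATTERN.finditer(text):
        if luhn_valid(re.sub(r"[ -]", "", m.group())):
            hits.append(("card", m.group()))
    return hits

print(find_sensitive("SSN 078-05-1120, card 4111 1111 1111 1111"))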

The company’s Web log with posts from founder and president Steve Askers (a former Lucent executive) is here. Entries are sparse at this time.

Despite the lousy economy, new entrants continue to pursue the content processing sector. With each new system, I chuckle when I read about “simple” and “stable” market conditions. Crazy. I don’t have screenshots in my files, nor do I have pricing. On the surface, Digital Reef seems to offer tools that overlap with Inxight Software‘s and Megaputer‘s offerings. I will add the company to my watch list.

Stephen Arnold, March 9, 2009

Site Search Done without Big Bucks

March 9, 2009

I want to call your attention to a useful article by Christian Heilmann called “Site Search on a Shoestring” here. Site search refers to a search box on a single Web site. For example, you can limit your query to my site on Google with the qualifier site:. Mr. Heilmann covers a number of options in his write up. What I liked was his inclusion of scripts, which can be edited to taste. Highly recommended.
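
Mr. Heilmann’s scripts are the main course; as a side illustration only, here is the site: qualifier trick reduced to a few lines of Python. The engine URL and q parameter follow Google’s standard query string; the domain is a placeholder you would swap for your own.

```python
import urllib.parse
import webbrowser

def site_search_url(site, terms, engine="https://www.google.com/search"):
    """Build a query URL restricted to one site via the site: qualifier."""
    return engine + "?" + urllib.parse.urlencode({"q": f"site:{site} {terms}"})

# Placeholder domain; swap in your own site.
url = site_search_url("example.com", "enterprise search")
print(url)
webbrowser.open(url)  # hand the query to the default browser
```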

Stephen Arnold, March 9, 2009

Microsoft Fast Pricing Strategy Hint

March 8, 2009

A news release issued in South Africa with the title “3fifteen Gets Inside Track on Microsoft Enterprise Search Expertise” here is a ho-hum Certified Partner bag of buzz words except for one statement. The comment in the news release that caught my attention was:

“Given the price tag, and strong focus on the technology sets, Microsoft is clearly investing heavily in information access and representation as the next technology wave that businesses will need to increase business productivity and profitability in many instances.”

The conjunction of “price tag” and “clearly investing” was interpreted by my addled goose brain as “low cost” and “buying market share”.

One of the threats Microsoft poses to established vendors such as Autonomy and Endeca is making Fast search technology an item included with larger Microsoft SharePoint and server buys. With this one action, no money-conscious company will turn up its nose at Microsoft’s assurances that Fast ESP (enterprise search platform) is the greatest thing since SharePoint.

A low-price deal for Fast ESP would put significant pressure on the more expensive solutions. Most organizations are deeply dissatisfied with their search solutions, so giving a low-cost product a spin makes perfect sense.

If the 3fifteen wording is just breezy writing, that’s one thing. If the 3fifteen statement is meaty, I think a price skirmish may be likely. Many search and content processing companies are working overtime to hit their revenue targets; a bargain in enterprise search could affect a number of companies in the present financial thunderstorm.

Not too much stability in the SharePoint search sector in my opinion.

Stephen Arnold, March 8, 2009

Twitter: SWAT or Sissy

March 8, 2009

Farhad Manjoo’s “What the Heck Is Twitter?” here joins the team suggesting that Twitter is a sissy; that is, Twitter can’t kill Google. Google is a tough customer. Underneath those primary colors, Google has a dark core. Mr. Manjoo points out that some of the bloggerati see Twitter as a SWAT team able to take out Google. Google has “special” search engines. Real time search is a category of search. Twitter has “a great future” (maybe), but it does have the T-shirt that says, “Fail whale.”

You should read the Slate story because the online publication has considerable clout, certainly much more than the feather duster the addled goose brandishes.

I would offer several observations:

First, Twitter has a content stream, and search is a relatively recent trendlet for Twitter. Twitter is primarily about inconsequential content that, when passed through a user filter (that is, a query), can yield timely information. The point, therefore, is that the content can yield nuggets. These are not necessarily “correct”. Google does not have the content flow at this time. Real time search is a logical jump to information that offers the pre-cognitive insights much loved by some analysts (business and intelligence).

Second, Google has been a company with great potential and game changing technology. Twitter may flop. But it has become for me an example of a segment that Google has not been quick to seize, either with its own technology or with its Google bucks. Twitter is not my go-to search engine, but it has become a case example of a company that has made clear Google’s inability to decide what to do and then do it with the force of will the company demonstrated between 2003 (pre-Yahoo Overture settlement) and 2006. Since 2007, Google has been, in my opinion, showing signs of bureaucratic indigestion.

Third, users of Twitter see the utility of the service. My hunch is that if I showed Twitter to my father’s friends at his Independent Village lunch group, no one would know what the heck Twitter is, why anyone would send a message, or what possible value a Tweet like “I am stuck in traffic” has. Show Twitter to a group of sixth graders, and I think the uptake will be different. That’s what’s important. Who cares if someone over 25 understands Twitter? The demographics point to a shift in users’ expectations of timeliness. To me, Twitter is making clear an opportunity in micro blog message traffic.

To be clear, I am not a Twitter user. I have an expert on staff who sends Tweets as Ben Kent, so we can see how the system interacts with the Twitter-sphere. I am an addled goose, but I am coherent enough to look at the service and see possibilities. I would opine that unless Google, Microsoft, and Yahoo respond to this opportunity, Twitter may become much more than a wonky service with a “Fail whale” T-shirt.

Stephen Arnold, March 8, 2009

Deep Peep

March 7, 2009

A happy quack to the reader who sent me a link to the Deep Peep beta. You can try the beta of the deep Web search engine here. The site said here:

DeepPeep is a search engine specialized in Web forms. The current beta version tracks 13,000 forms across 7 domains. DeepPeep helps you discover the entry points to content in Deep Web (aka Hidden Web) sites, including online databases and Web services. This search engine is designed to cater to the needs of casual Web users in search of online databases (e.g., to search for forms related to used cars), as well as expert users whose goal is to build applications that access hidden-Web information (e.g., to obtain forms in job domain that contain salary, or discover common attribute names in a domain). The development of DeepPeep has been funded by National Science Foundation award #0713637 III-COR: Discovering and Organizing Hidden-Web Sources.

Deep Web is one of those buzz words that wax and wane. For many years Bright Planet and Deep Web Technologies have been the systems I associated with indexing content behind passwords and user names. I wrote a report about Google’s programmable search engine in 2007. The PSE contains some “deep Web” functionality, but the GOOG exposes only a fraction of its “deep Web” capabilities to the adoring millions who use the Google search system. An example of a typical “deep Web” data set might be the flight information and prices available at an airline site or the information available to a registered user of an online service. Dealing with “deep Web” issues is a lot of work. Manual fixes to spider scripts are expensive and time consuming. The better “deep Web” systems employ sophisticated methods that eliminate most of the human fiddling required to navigate certain services.
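
To make “entry points” concrete: the first chore for a deep Web system is finding the query forms themselves. DeepPeep’s crawler is far more sophisticated, and its internals are not published in the announcement; the Python fragment below is a toy sketch of that first step only, collecting each form’s action URL and field names from a page.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

class FormFinder(HTMLParser):
    """Collect each form's action URL and field names from an HTML page;
    these are the 'entry points' a deep Web index catalogs."""
    def __init__(self):
        super().__init__()
        self.forms = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.forms.append({"action": attrs.get("action", ""), "fields": []})
        elif tag in ("input", "select", "textarea") and self.forms:
            if attrs.get("name"):
                self.forms[-1]["fields"].append(attrs["name"])

def find_forms(url):
    parser = FormFinder()
    with urlopen(url) as page:
        parser.feed(page.read().decode("utf-8", errors="replace"))
    return parser.forms  # e.g., [{'action': '/search', 'fields': ['q']}]
```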

Today quite a few systems have “deep Web” capability but don’t use that phrase to describe themselves. Here’s a screen shot from my test query for “search”. I used the single word “search” because the word pair “enterprise search” returned results that were not useful to me.

[Screenshot: DeepPeep results for the test query “search”]

Give the new system a spin and share your opinions in the comments section of this Web log.

Stephen Arnold, March 7, 2009

Censoring Search

March 7, 2009

The Japan Today Web site ran “Google, Yahoo!, Microsoft Urged Not to Censor Search” here. The article does a good job of summarizing the hoo hah over various Internet filtering efforts. The most interesting paragraph to me was:

RSF [Reporters without Borders] and Amnesty said that currently, “there are more than two dozen countries restricting Internet access on a regular basis.” They said they “understand the challenges of operating in countries that restrict Internet access; these countries are trying to pressure you to obey local laws that do not comport with international law and standards that protect freedom of expression.” “But complying with local demands that violate international law does not justify your actions,” they said.

The point that struck me was the implicit assumption that Web indexes are not now filtered or in some way shaped. The broader filtering is not so much new as newly in the public eye. Consequently, writers who want a free Internet with every site available may want to do a bit more digging into what Web indexing and directory outfits have been doing for a long time.

At The Point (Top 5% of the Internet) in 1993 (yep, that’s 16 years ago, folks) we built a business on filtering out porn, hate, and other types of sites we defined as inappropriate in our editorial policy. Since those early days of online directories and indexes, content is either not processed, skipped by the crawler, or blocked in the indexes.

Free and open. Sounds great. Not part of the fabric of most indexing operations. If you can’t figure out why, you qualify as an azure chip consultant, fully equipped to advise government entities, non-profit institutions, and commercial entities about search, online access, and content. For me, filtering is the *only* way to approach online content. I filter for behind-the-firewall search with a vengeance. Why? You want the stuff in your laptop’s folders in the organization’s index? I filter with the force of legal guidance for eDiscovery. Why? You want to run afoul of the applicable laws as they apply to eDiscovery and redacting? I filter for libraries. Why? You want the library to create problems for patrons with problematic Web sites or malware? No, I didn’t think so.
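
My filtering setups are not spelled out in this post, so take the following Python fragment as a toy sketch of the general idea only: an index-time gate that applies an editorial policy before a document is admitted. The path patterns and blocked terms are hypothetical placeholders, not anyone’s actual policy.

```python
import fnmatch

# Hypothetical editorial policy; real rules come from legal guidance.
BLOCKED_PATHS = ["*/Personal/*", "*.pst", "*/Drafts/*"]
BLOCKED_TERMS = {"privileged", "attorney-client"}

def admit_to_index(path, text):
    """Gate applied before indexing: True only if the document passes."""
    if any(fnmatch.fnmatch(path, pat) for pat in BLOCKED_PATHS):
        return False
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)

docs = [("/share/Reports/q1.doc", "Quarterly results look solid."),
        ("/laptop/Personal/diary.txt", "Today I grumbled about search.")]
indexable = [(p, t) for p, t in docs if admit_to_index(p, t)]
print(indexable)  # only the report survives the filter
```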

Free and open. Silliness. Poke around and find out what the guidelines are for content at some of the high profile Web indexing and content companies. If you find a free and open index other than a dark net, shoot me an email at seaky2000 at yahoo dot com. I will check it out.

Stephen Arnold, March 7, 2009
