Comets and Dinosaurs

March 16, 2010

I wrote about “newsosaurs” over the weekend, and this article caught my eye: “NetSuite Calls Microsoft ‘ERP Dinosaur.’” The write up is not about search, although it could have been shaped to cover that technology as well. I wanted to capture this line from the write up:

The memo, with the subject line “The Netsuite comet officially hits the Microsoft ERP dinosaur,” calls Microsoft’s announcement “an obvious act of desperation as Microsoft’s customers and partners defect en masse for NetSuite and the cloud.” Microsoft’s bid, Nelson wrote, “tries to convince NetSuite customers to move backwards 20 years to try Great Plains, Navision or Solomon”– the names of Dynamics GP, NAV and SL before Microsoft acquired them. “Microsoft has no cloud-based ERP answer to NetSuite, and Microsoft’s statement that ‘hosting’ Great Plains is their response to the cloud is so absurd as to be laughable,” Nelson said in his memo. “This is the old ‘ASP’ approach of hosting client/server products that failed as a delivery mechanism even before we entered the Year 2000.”

Strong words. Ever try to find an item in Microsoft’s ERP solutions?

Stephen E Arnold, March 16, 2010

Free, free as a bird. No one paid me to write this. Since it is an uncompensated bit of work, I must report this to the zoologist responsible for the National Zoo’s aviary ad unit.

Written by Stephen E. Arnold · Filed Under Cloud computing, Marketing, Microsoft, News | Comments Off on Comets and Dinosaurs

Mainframe Cost: Migration Motive

March 16, 2010

You are babysitting a mainframe. The iPod listening 20 somethings don’t want to dig into the legacy code. You are reluctant to involve the IBM-savvy specialists and their new BMW work wagons. And for good reason. Navigate to “An Expert View on Mainframe Migration” at http://www.computing.co.uk/. The article provides a useful business case for dumping big iron. The source for the write up is DFA president Francis Feldman. He provides some useful factoids, including this gem:

“We expected to see a drop of between 30 and 70 per cent,” he said.

Particularly interesting was the list of the nine steps or checkpoints for migrating an application from the legacy system to a newer, much cheaper modern platform. I don’t want to recycle the list, but I can highlight three items and urge you to visit Computing UK for the full write up.

The three highlights of the write up I noted were:

First, the rework did require recoding and tweaking. The method involved recompiling into code that conformed to the ANSI standard.

Second, the legacy system and the new system were operated in parallel for a period of time. How many organizations bother with this step today?

Finally, the items on the checklist provide a solid anchoring in what one should consider. The first item is the key one in my opinion: “Asset catalogue and consistency assessment.”
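The parallel run in the second highlight can be pictured as a reconciliation check: feed the same inputs to both systems and report every record where the outputs differ. The Python sketch below is purely illustrative. The functions legacy_interest and migrated_interest and the sample records are made-up stand-ins, not anything described in the Computing article.

```python
from decimal import Decimal, ROUND_HALF_UP


def legacy_interest(balance, rate):
    """Hypothetical legacy routine: rounds half up to whole cents."""
    return (balance * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def migrated_interest(balance, rate):
    """Hypothetical migrated routine: uses Decimal's default rounding (half even)."""
    return (balance * rate).quantize(Decimal("0.01"))


def parallel_run(records, old_fn, new_fn):
    """Run both implementations on every record and collect the disagreements."""
    mismatches = []
    for record in records:
        old_result = old_fn(*record)
        new_result = new_fn(*record)
        if old_result != new_result:
            mismatches.append((record, old_result, new_result))
    return mismatches


# Made-up sample inputs: (balance, rate).
records = [
    (Decimal("100.00"), Decimal("0.05")),
    (Decimal("0.50"), Decimal("0.05")),
]

for record, old_result, new_result in parallel_run(
    records, legacy_interest, migrated_interest
):
    print("MISMATCH", record, "legacy:", old_result, "migrated:", new_result)
```

The point of the exercise is that small behavioral differences, such as a rounding rule that changed during recompilation, show up only when both systems chew on the same data.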

I think this article is a download and save candidate. One question, “How much cheaper would a cloud solution be?” My view is that on premises installations are tomorrow’s mainframes. Just my opinion.

Stephen E Arnold, March 16, 2010

A freebie. No one paid me to write about mainframes. Because the subject is a mainframe, I will report non payment to the IRS.

Written by Stephen E. Arnold · Filed Under Financial, News, Technology | 1 Comment

Limitations of MSFT Exchange 2010

March 16, 2010

I am not sure how one of my goslings came across this spreadsheet tucked away on the Microsoft Exchange Web log. When I tried to access the file, the system did not recognize my “official” Microsoft MSDN user ID nor my Windows Live credentials. So you may have to register to access the blog. Once there, you need to look for the download section and visually inspect the file names for the one that points to the Exchange Performance Excel spreadsheet. Running a query in the blog’s search box produced zero hits for me. But with some persistence and patience I was able to get a copy of the spreadsheet. Latency was a problem when I was fiddling with this download. (Note: if the link is dead, write one of the goslings at benkent2020 at yahoo dot com, and maybe he will email you a copy of this document.)

Once you get the document “Scalability Limitations”, you will see some pretty interesting information. One quick example is that the spreadsheet includes three columns of specifics about scaling amidst the more marketing oriented data on the spreadsheet. These three juicy columns are:

Limitation
Issue
Mitigation.

Here’s the information for the row Database Size:

Limitation–Exchange 2007 – 200GB; Exchange 2010 – 2TB or 1 disk, whichever is less
Issue–The DB size guidance changed from 200GB (if you are in CCR) to 2TB or 1 disk, whichever is greater (if you have 2+ copies of the DB in question)
Mitigation—Blank. No information.

Okay.

I hope you are able to locate this document. For those of you eager to install Exchange 2010, SharePoint 2010, and Fast Search 2010, you will want to make sure you have this type of spreadsheet at your fingertips before you jump on the Microsoft Enterprise steam engine. The information in the spreadsheet makes clear why some types of email content processing may be expensive to implement.

Stephen E Arnold, March 16, 2010

This is the equivalent of the free newspaper Velocity in Louisville. Read it for nothing. I will report working for no dough to the Jefferson County agency that thinks I work in Louisville when I spend most of my time in the warm embrace of airlines.

Written by Stephen E. Arnold · Filed Under Enterprise, Microsoft, Real time search, Technology, Text processing | Comments Off on Limitations of MSFT Exchange 2010

More XML Expertise to Google

March 16, 2010

According to ZDNet, Tim Bray, founder of OpenText and collaborator with Ramanathan Guha on things XML, is now a Googler. The story “Ex-Sun Director Bray Joins Google’s Android Team” notes that Mr. Bray will work on the Android. The addled goose wants to point out that there are some big semantic Web guns in the Google arsenal now. Is Google becoming the big gun in the semantic Web or just the semantic Web?

Stephen E Arnold, March 16, 2010

Nope, a free one. No one paid me to reference semantic weapons. I will report this free write up to the FCC.

Written by Stephen E. Arnold · Filed Under Google, Mobile, News, Semantic, Technology | 1 Comment

SQL Does Too Scale

March 16, 2010

The Dennis Forbes on Software and Technology blog published “Getting Real about NoSQL and the SQL-Isn’t-Scalable Lie”. The article caught my attention because it expresses the viewpoint that SQL does scale. I found the write up interesting, and I wanted to highlight several of the arguments presented.

First, the article points out that bashing SQL is an increasingly popular sport. Mr. Forbes writes:

In the case of the NoSQL hype, it isn’t generally the inventors over-stating its relevance — most of them are quite brilliant, pragmatic devs — but instead it is loads and loads of terrible-at-SQL developers who hope this movement invalidates their weakness.

Second, he makes clear that SQL does scale. He offers:

Such a solution — even on a stodgy old RDBMS — is scalable far beyond any real world need because you’ve built a system for a large corporation, deployed in your own datacenter, with few constraints beyond the limits of technology and the platform. Your solution will cost hundreds of thousands of dollars (if not millions) to deploy, but that isn’t a critical blocking point for most enterprises. This sort of scaling that is at the heart of virtually every bank, trading system, energy platform, retailing system, and so on. To claim that SQL systems don’t scale, in defiance of such obvious and overwhelming evidence, defies all reason.

Third, he points out that progress is being made:

Scalability noise based upon the limitations of a cloud vendor’s offerings needs to be put into context: They don’t apply to most of the users of relational databases. MySQL isn’t the vanguard of the RDBMS world. Issues and concerns with it on high load sites have remarkably little relevance to other database systems. And of course the SQL/RDBMS world is changing (side note: Few love SQL, but I’ve yet to see a viable replacement). Wouldn’t it be a grand world where every desktop (platforms that spend about 99% of their time completely idle) in a corporation was a part of the corporate cloud, all seamlessly acting as a part of the corporate information system in a reliable, redundant way? A simple SQL statement silently and transparently fulfilled by hundreds of distributed systems?

But the really interesting part of the write up is the comment section of the Web log. Some comments are clever, and others, like Alex Popescu’s, are thought provoking. Excellent write up.

Stephen E Arnold, March 17, 2010

No one paid me to write this. Because the article is about relational databases, I will report non payment to DHS, an outfit with quite fascinating RDBMS challenges. Those folks and their consultants get paid I believe.

Written by Stephen E. Arnold · Filed Under Database, News, Technology | Comments Off on SQL Does Too Scale

Another Google Jibe

March 15, 2010

Poor, poor Google. From top of the world to a punching bag in less than three months. This new decade is proving to be a challenging one for Google. I just read “Six Delusions of Google’s Arrogant Leaders.” I want to disclose that I too have been accused of being arrogant. Now I don’t have any good reason to be arrogant. I just find that approach works for me, but, please, keep in mind that I am an addled goose, live in rural Kentucky, and am wandering slowly toward being 66 years old. I am no sports car in today’s NASCAR ego race.

But Google! According to the write up, Google is coming across as “cocky”. I don’t want to run down the six delusions. I urge you to go direct and suck up the juiciness yourself. However, I can point to two of the examples and offer a comment.

The first is “users are hungry for Google synergy.” I am not sure what synergy means. I know that the Google platform is one that works like a giant plastic bag wrapped around the earth. The idea is to put everyone in the bag and keep them there. This is mostly complete, but about 25 percent of Web users are outside of the bag and Google wants to get them in one way or another. The notion that users want this is irrelevant. What this delusion makes clear is that Google is retrofitting public relations baloney to match what the company has been working on for about a decade. What’s interesting is that it has taken mavens, pundits, and “real” journalists 360 months to figure out the Google game plan. Who’s delusional? Google, which has mostly accomplished its mission, or the folks just figuring out that Google has been and will continue to push the Google PR line?

The second delusion is that “Google is a worker’s utopia.” Okay, when you take money to do work, by definition, this situation is not utopia for the workers. Companies can make work less onerous or more meaningful, but it is work. I don’t think the Googlers I know are doing much more than drinking the Google Kool-Aid, trying to build their knowledge value, and get some money. Like Apple, Google operates a reality distortion field, and, let’s face it, having Google on one’s résumé is arguably more impressive than a degree in Harry Potter studies from Frostburg College. My view is that Google manipulates its workers as effectively as it manipulates the media. Like the media, Google employees play along. It’s a game with high stakes, but it is a game. Google knows exactly what it is doing.

Now what’s the arrogance? The arrogance is not unique to Google. I call this the Math Club Syndrome. Here’s how it works. A group of folks with specialized interests and skills bond, sort of like a golf foursome from Sigma Chi fraternity. The difference is that no one understands the Math Club and most people understand and envy the Sigma Chi golf foursome. As a method of coping with a world that simply does not understand math, the math club becomes insular. The club’s rules are insider rules and act like a protective barrier. No problem until the math club becomes the first next generation supra-national company jousting on an apparently equal footing with China, the Department of Justice, and giants like Microsoft.

What do we expect from the Math Club? I expect Math Club behavior, complete with the insider jokes about janitors in patent documents. (Oh, janitors is a way of describing Google’s semi autonomous agents which “clean up” statistical anomalies in petascale flows of data. Snort, snort, get it. Janitor equals Dilbert’s garbage collector, the smartest person in the comic strip. Oh, you don’t get it? Well, there you have it. A mismatch between Math Club humor and you, gentle reader.)

My view is that it is time to quit worrying about Google’s power and time to start figuring out how to surf on Google. My column for KMWorld and this month’s column for the Smart Business Network address two different ways to surf on Google. I don’t grouse. I accept that over the last decade Google has emerged as a new ecosystem. You can’t kill it because the Googlers who leave the company spawn Google-centric entities. My last count tallied a couple of hundred of these Xoogler ventures. And Facebook is not much more than a “legacy” of Google. Maybe Facebook will become the new Google, but that won’t change the arrogance.

Math Club is congruent with arrogance. Reality. Live within it; don’t deny it.

Stephen E Arnold, March 14, 2010

No one paid me to write this article. Because I have not been paid and I refer to psychological behavior, I will report my writing for no pay to the Surgeon General who understands such esoteric notions as delusions.

Written by Stephen E. Arnold · Filed Under Business strategy, Google, News, Online (general), Technology | Comments Off on Another Google Jibe

Why Bing Is Gaining on Google

March 15, 2010

eWeek ran an interesting story a week or so ago: “10 Reasons Why Microsoft Bing Is Gaining on Google.” I read the article when it became available and I sat on it. I wanted to see what happened to Google. The market share data for Web search is squishy, and my view is that Google is contributing to Bing’s success more than Bing is contributing to Bing’s success. This idea is not the focus of the eWeek article.

The eWeek article points to such factors as integration, partnerships, and social networks plus seven other factors. I think these are indicative of buzzword juggling, not what is actually happening.

My view is:

Google is in the midst of a one two punch, backlash and fatigue. Microsoft is benefiting.
Google’s management is becoming increasingly aggressive and causing even the fuzzy wuzzy Google followers to wonder if they should accept a Google mouse pad and write happy things about the company.
Google is in the news but most of the news is either negative or sort of crazy. One example is that Google is giving China the evil eye. Hey, China is a country. Google is a new type of company but when countries and companies collide, unless the company owns the country, the country will win.

Google has passed a line in the digital sand. Bing, despite its heritage and its weaknesses, looks a lot more appealing to some folks, I opine.

Stephen E Arnold, March 15, 2010

Nope, a freebie. Who would pay me to write this type of comment which goes against the straight and true grain of a “real” publishing company. Ah, grain. I must report non payment to the Department of Agriculture.

Written by Stephen E. Arnold · Filed Under Business strategy, Google, Government, News | 1 Comment

Newsosaurs

March 15, 2010

I read “It’s Hard To Watch The Newsosaurs Turn A Blind Eye To Their Own Extinction” right after I flipped through the New York Times’s Sunday magazine clone from the Wall Street Journal outfit. Let me comment on each information MIRV and offer a couple of observations from my search vantage point.

First, TechCrunch’s write up has a killer comment:

Everyone wants to wall off the Web and keep grazing on declining ad revenues.

I agree. This is a combination of fear, anger, and ostracism. I enjoy pointing out that in the information economy, the traditional giants no longer own the country club. Each day, the former owners find their future will be as caddies to the new information elite. This is, I suppose, a bitter pill to swallow. The TechCrunch article includes the much quoted “burn the boats” admonition from one of the early superstars of the zippy-doo Web that is now the cat’s pajamas. Like Google’s advice to struggling industry, the listeners think that their outfits have already burned the boats, embraced technology, and reinvented themselves. This mismatch between advice and its perception is characteristic of the domain collision that is now taking place. The passage that caught my attention in the TechCrunch write up was:

The longer media companies wait, the bigger disadvantage they will have when they cross over to the other side and find a whole new host of competitors who never had any print legacy businesses to protect. Those competitors right now are blogs and online news hubs who are still furry little rodents in the underbrush, but who won’t stay little forever. The sooner print media companies cross over, the sooner they can be on pure offense. Their online strategies and business models won’t be crippled by any allegiance, or need to protect, to the old print business. If they wait until their online revenues become 25 or 50 percent before they fully commit, it will be too late.

I don’t disagree with the thought. I disagree with the “will be too late.” It is too late.

The example to which I refer is the oversized, glossy, 80 plus page WSJ Magazine filled with “reading.” Well, that’s interesting. I just counted about 32 pages of ads plus a number of features where it is tough for me to determine if these are placed for consideration or are actual editorial. The stories focused on cars and fashion with a profile tossed in for good measure.

I remember being told by my Financial Times’s delivery agent before I dropped my print subscription that he tossed the magazine insert because it was too much of a hassle. I wonder if my delivery person for my Saturday WSJ will follow the same path.

Did I read any of the stories? The answer is, “No.” None of them appealed to me. I have a person who works for me who drives a Mini Cooper and it seems to have constant tire problems. I am tired of “with it” executives who overcame hardship. Who hasn’t? Fashion? Not interested. I wear black Travel Smith jackets, black never wrinkle pants, and black shoes that do not set off any alarms anywhere I travel. Spare me the trendy. Was there any financial info, business intelligence, or juicy insights into making money grow? Nope. The WSJ added sports and now it is adding a New York Times’s magazine type publication every couple of months.

What’s my take?

WSJ is going after the NYT advertisers. That’s okay but the effectiveness of print ads has to be demonstrable. That might be tough unless the editorial product provides some content consideration. The boundary between an auto story and an advertiser might be getting a few molecules narrower, might it not?
The problem with traditional media is not content; the problem is finance and business models. Offering me 30 pages of ads in 80 pages of paper is somewhat 17th century in today’s world.
The Financial Times’s last home delivery offer to me was $50 a year. Will the Wall Street Journal face the same subscription challenge as readers discover that blending sports, Details magazine editorial, and business profiles might be out of step with what subscribers like me do on a Saturday?

Now search? How will I be able to locate the Gucci suit on the WSJ Web site? Answer: Not until the WSJ figures out image indexing and some other search tricks. I bet that when the iPad version of the WSJ Magazine comes out I will be able to click on a suit and see a map of locations where I can buy a suit that will fit most 20 year old soccer players. Maybe for some folks. Not for me.

Stephen E Arnold, March 14, 2010

No one paid me to write this article. I will report a failure to charge for my writing to the editor of the Army Times, an outfit focused on information in the modern world.

Written by Stephen E. Arnold · Filed Under News, Online (general), Publishing, Search | Comments Off on Newsosaurs

Indexing Craziness

March 15, 2010

I read “Folksonomy and Taxonomy – do you have to choose?,” which takes the position that a SharePoint administrator can use a formal controlled term list or just let the users slap their own terms into an index field. The buzzword for allowing users to index documents is part of a larger 20 something invention—folksonomy. The key segment for me in the SharePoint centric Jopx blog was:

The way that SharePoint 2010 supports the notion of promoting free tags into a managed taxonomy demonstrates that a folksonomy can be used as a source to define a taxonomy as well.

Let me try and save you a lot of grief. Indexing must be normalized. The idea is to use certain terms to retrieve documents with reasonable reliability. Humans who are not trained indexers do a lousy job of applying terms. Even professional indexers working in production settings fall into some well known ruts. For example, unless care is exercised in management and making the term list available, humans will work from memory. The result is indexing that is wrong about 15 percent of the time. Machine indexing when properly tuned can hit that rate. The problem is that the person looking for information assumes that indexing is 100 percent accurate. It is not.

The idea behind controlled term lists is that these are logically consistent. When changes are made such as the addition of a term such as “webinar” as a related term to “seminar”, a method exists to keep the terms consistent and a system is in place to update the index terms for the corpus.
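Here is a minimal sketch, in Python, of what normalizing against a controlled term list could look like. The term list, the variant mappings, and the function names normalize_terms and expand_query are hypothetical illustrations, not anything taken from SharePoint 2010 or the Jopx post. User supplied tags are mapped to a preferred term, and anything not on the list is set aside for review instead of being dropped straight into the index.

```python
# Hypothetical controlled term list: preferred term -> accepted variants.
CONTROLLED_TERMS = {
    "seminar": {"seminar", "seminars"},
    "webinar": {"webinar", "webinars", "web seminar", "online seminar"},
    "purchase order": {"purchase order", "purchase orders", "po", "p.o."},
}

# Related terms, following the "webinar" / "seminar" example in the text.
RELATED_TERMS = {"seminar": {"webinar"}, "webinar": {"seminar"}}

# Inverted lookup: variant -> preferred term.
VARIANT_TO_PREFERRED = {
    variant: preferred
    for preferred, variants in CONTROLLED_TERMS.items()
    for variant in variants
}


def _clean(text):
    """Lowercase and collapse whitespace so lookups are consistent."""
    return " ".join(text.lower().split())


def normalize_terms(raw_tags):
    """Split user tags into (preferred terms, tags not on the controlled list)."""
    accepted = []
    rejected = []
    for tag in raw_tags:
        preferred = VARIANT_TO_PREFERRED.get(_clean(tag))
        if preferred is None:
            rejected.append(tag)  # candidate for review, not for the index
        elif preferred not in accepted:
            accepted.append(preferred)
    return accepted, rejected


def expand_query(term):
    """Return the preferred term plus its related terms, sorted."""
    key = _clean(term)
    preferred = VARIANT_TO_PREFERRED.get(key, key)
    return sorted({preferred} | RELATED_TERMS.get(preferred, set()))


accepted, rejected = normalize_terms(["Webinars", "  PO ", "web  seminar", "misc stuff"])
print(accepted)   # preferred terms that go into the index
print(rejected)   # free tags that need an indexer's decision
print(expand_query("Seminars"))
```

The rejected list is where a folksonomy could feed a taxonomy: someone has to decide whether each leftover tag becomes a new preferred term, a variant of an existing one, or noise.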

When there is a mix of indexing methods, the likelihood of having a mess is pretty high. The way around this problem is to throw an array of “related” links in front of the user and invite the user to click around. This approach to discovery entertains the clueless but leads to the potential for rat holes and wasted time.

Most organizations don’t have the appetite to create a controlled term list and keep it current. The result is an approach I encounter frequently. I see a mix of these methods:

A controlled term list from someplace (old Oracle or Convera term list, a version of the ABI/INFORM or some other commercial database controlled vocabulary, or something from a specialty vendor)
User assigned terms; that is, uncontrolled terms. (This approach works when you have big data like Google but it is not so good when there are little data, which is how I would characterize most SharePoint installations.)
Indexes based on parsing the content.

A user may enter a term such as “Smith purchase order” and get a bunch of extra work. Users are not too good at searching, and this patchwork of indexing terms ensures that some users will have to do the Easter egg drill; that is, look for the specific information needed. When it is located, some users like me make a note card and keep it handy. No more Easter egg hunts for that item for me.

What about third party SharePoint metadata generators? These generate metadata but they don’t solve the problem of normalizing index terms.

SharePoint and its touting of metadata as the solution to search woes are interesting. In my opinion, the approach implemented within SharePoint will make it more difficult for some users to find data, not easier. And, in my opinion, the resulting index term list will be a mess. When a search engine uses these flawed index terms, the search results force the user to look for information the old fashioned way.

Stephen E Arnold, March 15, 2010

A free write up. No one paid me to write this article. I will report non payment to the SharePoint fans at the Department of Defense. Metadata works first time every time at the DoD I assume.

Written by Stephen E. Arnold · Filed Under Enterprise, News, Search, SharePoint, Text processing | Comments Off on Indexing Craziness

Database Skirmishes: Relational versus Non-Relational

March 15, 2010

A happy quack to the reader who sent me a link to “SCALE 8x: Relational vs. Non-relational.” I was able to access the link, but the page was marked “subscribers only.” The main point of the write up was to explain some of the performance differences between Codd databases like SQL Server and Oracle and the non relational data management systems like those in use at Google or Digg.com. You can download the slides upon which the article was based at http://www.pgexperts.com/document.html?id=40. The article also includes a useful write up about the issue at http://ossdbsurvey.org/.

Stephen E Arnold, March 15, 2010

No one paid me to write this news item. I will report this sad fact to Health & Human Services because I used a variant of the word “relation”.

Written by Stephen E. Arnold · Filed Under Database, News, Technology | Comments Off on Database Skirmishes: Relational versus Non-Relational

Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.