Autonomy Knipsel

March 12, 2009

A news release turned up in my newsreader with an interesting set of tags. You can read the story about Autonomy, the meaning based computing company, here. If the link goes dead, you will be able to find the original story on the Autonomy Web site here. My newsreader presented me with this headline, “Autonomy Powers Pioneering News Portal – MSN MoneyCentral”. What I think happened is that the news release title, with the source “MSN Money Central” appended, was treated as the full title. I don’t know if the parser jammed the two separate fields together or if it was some other type of human or system error. I was expecting to learn that Autonomy sold its search system to MSN Money Central. What the item told me was that Autonomy landed a news service about which I knew nothing. I found this interesting because my Overflight service makes some assumptions about what is a title and what is not a title. I will have to revisit that logic.
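
One way to revisit that title logic is sketched below in Python. It is a minimal sketch, not the Overflight code: the function name split_title_and_source, the KNOWN_SOURCES list, and the separator rule are all assumptions made for illustration. The idea is to peel off a trailing segment only when it matches a known source name, so that dashes inside a real headline are left alone.

```python
import re

# Hypothetical list of source names; a real feed parser would maintain its own.
KNOWN_SOURCES = {"MSN MoneyCentral", "MSN Money Central"}

# A hyphen, en dash (\u2013), or em dash (\u2014) with whitespace on both sides.
_SEPARATOR = re.compile(r"\s+[-\u2013\u2014]\s+")


def split_title_and_source(headline, known_sources=KNOWN_SOURCES):
    """Return (title, source). source is None when no known source is appended."""
    headline = headline.strip()
    matches = list(_SEPARATOR.finditer(headline))
    if matches:
        last = matches[-1]
        candidate = headline[last.end():].strip()
        # Only split when the trailing segment is a recognized source name.
        if candidate in known_sources:
            return headline[:last.start()].strip(), candidate
    return headline, None


if __name__ == "__main__":
    print(split_title_and_source(
        "Autonomy Powers Pioneering News Portal \u2013 MSN MoneyCentral"
    ))
```

A lookup against known sources is the conservative choice here. Splitting on every dash would mangle headlines that legitimately contain one.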

Stephen Arnold, March 12, 2009

Written by Stephen E. Arnold · Filed Under News, Online (general), Text analytics, Text processing | Comments Off on Autonomy Knipsel

Media Cloud: Foggy Payoff

March 12, 2009

I wrote about Calais in 2008. You can find that article here. Calais makes use of ClearForest technology to perform semantic tagging. I am cautious when large companies make services available at low or no cost. Now, Calais was pegged to a project at Harvard University. You can read the ReadWriteWeb.com story here. The Media Cloud project delivers some of the Google Trends or Compete.com type outputs from content processed with Calais. For me, the most interesting comment in the write up was:

we see this as an example of how the Internet is driving traditional media to change and respond in new ways. We are excited by the scope and potential that Media Cloud brings to anyone interested in following news and media trends.

I have a different view. A university demo project is just that: a demo with an academic spin. Traditional media need to do more than a demo before the money in the checking and savings accounts runs dry.

Stephen Arnold, March 12, 2009

Written by Stephen E. Arnold · Filed Under News, Semantic, Technology, Text analytics, Text processing | 2 Comments

Gmail: Security Issue, Not a Big Issue

March 12, 2009

I recall a professor in college describing how one can win a debate by defining terms to leave the opponent without a leg upon which to stand. Try this tactic when you talk about search, and today’s trophy and entitlement crowd usually responds with “Knock it off” or “You are wrong.” That’s my experience. Yours may differ because you are exposed to a more enlightened crowd than I. I thought of this “redefining terms” tactic when I read Dancho Danchev’s “Google Downplays Severity of Gmail CSRF Flaw” here. As a former high school and college debate team member, I am appreciative of the utility of defining terms “my way”. Mr. Danchev’s article includes a snippet of Google’s response to yet another Gmail security glitch. Google’s response, if it is accurately presented, explains the security issue in part this way:

We’ve been aware of this report for some time, and we do not consider this case to be a significant vulnerability … Despite the very low chance of guessing a password in this way, we will explore ways to further mitigate the issue. We always encourage users to choose strong passwords, and we have an indicator to help them do this.

The key to this is the definition of “significant vulnerability”. Without defining terms, who can say whether the security issue is a big deal or a little deal? In my opinion, wordsmithing may address perception but it does not answer the questions this article raised in my mind.

Stephen Arnold, March 12, 2009

Written by Stephen E. Arnold · Filed Under Google, News, Online (general), Technology | Comments Off on Gmail: Security Issue, Not a Big Issue

Database Content: Take or Use

March 12, 2009

You may want to read Out-Law.com’s “Database Infringements Depend on Taking, Not Usage of Data” here. The article tackles an issue that has triggered a European Court of Justice ruling. For me the key statement in the Out-Law.com synopsis of the ruling was:

The Directive protects against “extraction and/or re-utilisation of the whole or of a substantial part…of the contents of that database”. The ECJ said that infringement was independent of the use to which someone wants to put the information.

Does this ruling matter in the US or elsewhere?

In my opinion, the ruling underscores the difference between how a person who compiles a database and provides access to it perceives the value of the data and how a person who wants to repurpose some of the data in that database perceives it. I am no lawyer, but I do work with clients who can click to a Web site and find useful information; for example, the data available from a government Web site or the patent information I have compiled for my Google patent search service.

Software can now slice and dice data. A programmer can make many information “meals” with these amazing software tools.

There are different ways to view the structured data such as airline flight information or condos for sale in Baltimore, Maryland or loosely structured data such as an RSS feed or well formed XML documents.

An innovator / entrepreneur can see these data as raw material for something new. The idea is that individual data items may gain utility when assembled or organized in a way different from the way the information appears on a specific Web site. Because the information is viewable in a browser, it seems to the innovator / entrepreneur that the data or their constituent elements like a phone number are like molecules in a mixture. These can be combined without losing their original chemical structure. The data are publicly available, so the data are meant to be used.

Written by Stephen E. Arnold · Filed Under Business strategy, Feature, Online (general), Technology, Text analytics, Text processing | 1 Comment

Tiscali Sinking

March 12, 2009

Most of my pals in Harrods Creek don’t know about Tiscali. The company rolled up a number of European Internet Service Providers and, for a time, offered a wide range of interesting services. The Wikipedia write up is here. For example, the UK version once had a nifty run down of European shareware and freeware. The selection was moderated by some people who knew what was interesting and what was a loser. Alas, the service was discontinued. Tiscali slipped off my radar until I heard about its financial troubles. The gloom is official. Reuters ran “Italy’s Tiscali Suspends Long-Term Debt Payments”, making the gloom sufficiently deep to suggest that the company may go away at some point. I wish the old Tiscali was back, however. The fact I noticed, if it is indeed accurate, was this one:

Tiscali has long-term debt of 500 million euros, it said last Friday, with the next interest payments due on March 11 and 13 for a total of 11 million euros.

If you have any bright ideas, send them to Tiscali. The Italian outfit struck me as innovative and open not too long ago if you use human years. In Internet years, maybe the glory days were early 19th century. Pundits pushing for user fees might want to dig into the various monetization efforts the Italian company explored over the last decade. Instructive was that exercise for me.

Stephen Arnold, March 11, 2009

Written by Stephen E. Arnold · Filed Under Business strategy, News, Online (general) | Comments Off on Tiscali Sinking

Dead Trees Form an Ad Forest of Seedlings

March 11, 2009

Nicholas Carlson’s “27 Huge Publishers Join to Replace the Banner” caught my eye. The story explains that well known publishing outfits are cooperating to generate revenue. For me the most interesting comment in the write up was:

27 publishers with a reach of about 109 million unique visitors per month — that’s 66% of the total U.S. Internet audience — have agreed to try one of three new online ad formats…

I think this is a good idea, just a bit late to the party. The flaw in the effort is traffic. Publishers’ Web sites get traffic from other places. I no longer navigate to a specific site like Forbes.com. The site is too annoying. I look at headlines and then decide whether I am willing to put up with the wacky intrusions that that publication thinks will catch my attention.

Mr. Carlson’s article goes into great detail about the way my eyeballs traverse a Web site, which makes the fatal assumption that I go to a publisher’s Web site to see what’s on offer. You may find the diagrams useful. I did.

I think the publishers may want to revisit their traffic assumptions. Some Web search engine vendors might want those ad revenues to be invested in the Web search engines’ ad systems. Ads that stumble over a technical hurdle might cost the site some traffic. The assumptions collapse. Traffic, not the publishers’ brand, is the secret to making ads deliver more than a trickle of revenue when a flood is needed–and quickly. I wonder if this group will figure out a way to do mobile and Twitter ads next?

Stephen Arnold, March 11, 2009

Written by Stephen E. Arnold · Filed Under Business strategy, News, Online (general), Publishing | Comments Off on Dead Trees Form an Ad Forest of Seedlings

Arnold to Return to Information World Review

March 11, 2009

In the late 1990s, I wrote a series of columns for Information World Review, a print and online publication, published in the United Kingdom. I grew tired of the monthly grind. In a couple of months, I will again submit a column to IWR. I will not recycle either the information in this Web log or the persona of the addled goose. This is a free Web log of recycled information. The columns will contain original research. I may conclude each column with a comment, but my goal will be to highlight technology or developments of interest to those interested in electronic information and online. I do not write many articles about Microsoft’s online activities, and I may dive into the Redmond doings in the months to come. I am also interested in the new social publishing products like the novels written for mobile phones by some sharp eyed, accurate typing Tokyo residents. Just a note to myself that I have to write a KMWorld column each month and now one for Information World Review. With the vast sums these publishing companies pay me, I think I will buy Tess a new raw hide chew stick. She’s my white rescued boxer. Stone deaf. Her dog hearing aid will have to wait until I strike gold in Harrods Creek.

Stephen Arnold, March 11, 2009

Written by Stephen E. Arnold · Filed Under Business strategy, News, Online (general), Publishing | 2 Comments

Search May Not Mean Search

March 11, 2009

Last week, I had a disturbing conversation with a very confident 30 something. After more than a year of planning, I learned that the company had decided to deploy a key word search system from a big name vendor. I asked, “What do the employees need? Keyword retrieval? Reports? Alerts?”

The answer was, “We have that information from informal discussions. Keyword search.”

I thanked the person for lunch and walked away shaking my head. Businesses are struggling for revenue, and employees in the organizations I have visited since October 2008 strike me as wanting to make their companies successful. Employees are savvy and know that if their employer goes down the drain, finding another job might not be easy. For some, there will be increased competition. Darwinianism is an abstract concept until a person can’t find work.

The 30 something had a job. An important job. The information technology unit at this services firm had search systems but employees did not use them. The IT budget was getting scrutiny, so the manager and tech staff decided it was time to get a “new” search system.

The problem was that I had in 2003 and 2004 conducted interviews with a number of senior managers at this organization. I even knew the president of one of the operating units socially. Although my lunch took place in 2009, I realized that the IT department was going to make the same errors it had with its previous search procurements. Every two or three years, the company licensed another system. After a honeymoon of six months, the results were predictable in my opinion. Grousing and declining usage.

Vendors have a tough time breaking the cycle. Some search companies pitch a “simple solution” that is like a One a Day vitamin. Others deliver a toolkit that is far too complicated for the IT team to get working, and scarce budget dollars cannot be pumped into what amounts to a customized search system.

If this scenario resonates, you may want to navigate to LLrX and read the article, “Knowledge Discovery Resources 2009: An Internet MiniGuide Annotated Link Compilation” here. The listing was compiled by the prolific Marcus P. Zillman, Internet expert. What I liked about the meaty listing was it made clear to me one point: Search does not mean keyword retrieval. The list provided me with a meaty link burger. I discovered a number of useful resources. You will want to download it and do some exploration.

I did not send the list to my lunch pal, the 30 something who knows what his users want without bothering with surveys, interviews, focus groups, and observation of users in action. As long as organizations hire information technology professionals who know what “search” means, a list won’t make much difference.

You might have a more open mind. I hope so. Search defined as keyword retrieval is about as relevant today as a bronze surgical instrument in an emergency room in a big city hospital. Access to information in a way that meets the needs of individual users is, in my opinion, what search means.

Stephen Arnold, March 11, 2009

Written by Stephen E. Arnold · Filed Under Enterprise, News, Online (general), Search, Semantic, Text analytics, Text processing | 1 Comment

Google Software: A Glimpse at the Future of Google Applications

March 11, 2009

In my analyses of Google Patent documents, I documented the number of inventions that have applicability to online advertising. This makes sense. If your $20 billion in annual revenue depended on online advertisers bidding to get in front of potential customers, you would invest in ad R&D as well.

Most of the 40 percent of Google’s inventions have an ad hook. But some of the wizardry operates beneath a digital kimono. Few outside of the GOOG itself get to see the hidden charms that Google’s billions have purchased.

You can glimpse some of the technology by exploring what Google calls its “Agency Toolkit”. You can locate the page here, but I am not sure if you have to be an authorized Google-holic to access the tools. My own goslings wrought the necessary magic for this addled goose. Your goslings may vary in capabilities, of course.

Here’s what Google said about its Agency Toolkit:

We know how busy you are planning, creating and measuring success for your clients. That’s why we’ve created this site: your one-stop shop for Google tools to make your job a little easier. Build effective advertising programs, optimize your performance, and uncover market insights using the resources outlined here. And each of these free tools is easy to use, helping you to efficiently support your clients.

What’s on offer? Quite a few interesting services and functions. I don’t want to spoil your fun when you work through the 18 tools and links for training on the Web page. Let me highlight two:

A placement tool. Google described it this way: “Find and choose websites, RSS feeds and other placements in the Google content network where you want your clients’ ads to run. Identify placements by URLs, topics, or demographics.”
An SEO troubleshooter. The Google wordsmiths wrote: “Find and choose websites, RSS feeds and other placements in the Google content network where you want your clients’ ads to run. Identify placements by URLs, topics, or demographics.”

The toolkit is important for three reasons:

First, it is making powerful functions available to non programmers found in advertising agencies. In my opinion, this approach to what once were script based tasks makes Google a potential disruptive force for information contractors.

Second, the metaphor “tools” implies that the ad exec, or you if you are your own ad agency, can pick and choose the right tool for the task at hand. Unlike the notion of tools used by Microsoft or Oracle, Google wants its tools to be used by “regular” people, not techies who have passed tests to prove they are worthy of the secrets that unlock the usefulness of software.

Third, the tools themselves are pure Google. That is, clean with just enough eye candy available for those client presentations. Don’t believe me? Refresh your memory with http://trends.google.com, which is one of the tools in the kit.

Google is one of the leaders in making arcane and technically complex operations easy and, for some, interactive. Whether the tools provide the amount of control a user assumes he or she gets, the perception of control is what is important. Think of tools as a never ending supply of ice cream and snacks for the Mad Ave type Google customer. The process is so much fun, in my opinion, that it is easy to forget that those are real deflated dollars one is spending. Not even a newspaper’s haggard, desperate ad rep can match the Google Ad Toolkit for fun and results.

Stephen Arnold, March 11, 2009

Written by Stephen E. Arnold · Filed Under Business strategy, Cloud computing, Google, News, Online (general), Technology | 1 Comment

YAGG: Gmail Down

March 11, 2009

If you are a Gmail user, you don’t need the Beyond Search Web log to point out another Google glitch, what I call a YAGG (yet another Google glitch). You can read the story here. If accurate, the ComputerWorld headline tells the tale: “Gmail Down; Outage Could Last 36 Hours for Some.” If that link is dead by the time you read the addled goose’s write up, you can find tons of fun at this link to Google News’s own coverage. So what? Not much to add to my earlier comments. The vaunted technical prowess of the Googlers is not that vaunted. Organizations trying to call Google to license its enterprise solutions may well find that Google will put in place humans who will promptly and eagerly field calls and explain why a potential customer should put its information on Google’s cloud systems.

Stephen Arnold, March 11, 2009

Written by Stephen E. Arnold · Filed Under Cloud computing, Google, News, Online (general), Technology | Comments Off on YAGG: Gmail Down

Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.
