UK Government Web Archive Progress

February 26, 2010

Short honk: I found “UK Web Archive Will Offer Just 1% of Web Sites by 2011” interesting for two reasons. First, taking a year to build a Web archive struck me as slow progress. At that rate, it may be difficult to build a repository large enough to make its usefulness evident to me. New pages and changes are important in my work. Second, I found this comment interesting:

Its UK Web Archive, which was officially launched today, contains just 6,000 of around 8 million UK websites. According to the British Library, on average UK websites have a lifespan of between 44 and 75 days. It also said that at least 10 percent of all UK websites were either lost or replaced by new material every six months. However, under the 2003 Legal Deposit Library Act the group needs to gain permission from the website owner before it can archive the site, which is slowing down the process dramatically.

When I read this, I can understand the frustration that some commercial companies must feel when government agencies decide to create certain collections of information. On the other hand, I see only opportunity. With the UK approach, the job seems to be behind the eight ball before it begins.

Why not chat with the Internet Archive and just NOT (filter) out the non-UK content? Not perfect, but somewhat more robust than 6,000 sites, if I understand the write up.
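A rough sketch of that filtering idea, in Python. It assumes a list of already-harvested URLs and treats a hostname ending in .uk as "UK content." That is a crude proxy, since plenty of UK sites live on .com addresses. The function names and sample URLs are made up for illustration; this is not an Internet Archive interface.

```python
from urllib.parse import urlparse


def is_uk_host(url: str) -> bool:
    """Crude test: call a URL 'UK' if its hostname ends in .uk."""
    host = urlparse(url).hostname or ""
    return host.lower().endswith(".uk")


def keep_uk(urls):
    """Filter out everything that is NOT on a .uk hostname."""
    return [u for u in urls if is_uk_host(u)]


# Hypothetical sample of harvested URLs
sample = [
    "http://www.example.co.uk/news",
    "http://www.example.com/about",
    "http://archive.example.org.uk/",
]
uk_only = keep_uk(sample)
```

A production filter would need more signals than the domain suffix, for example hosting location or registrant data, but the suffix test shows how little work a first cut requires.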

Stephen E Arnold, February 26, 2010

No one paid me to point out that there is a disconnect between government time and Internet time. I think I have to report geophysical issues to the timekeepers in the US, NIST.

Internet Metrics

February 24, 2010

One of my goslings submitted a write up about Internet Metrics. Here’s his item to me:

The data came from Defensetech.org, a military.com site about “how technology is shaping how wars are fought, borders are protected, crooks are caught and individual rights are defined.”

In 2009, there were 90 trillion e-mails sent over the Internet with an estimated 81 percent being spam.  There were over one trillion unique URLs in Google’s index. YouTube served over one billion videos per day, on average. There were over 47 million new Web sites added.

By the end of 2010, there will be over 1.9 billion Internet users worldwide. The volume of daily email will be greater than 100 billion messages per day. There will be over 3.3 billion cell phone users with Internet access. Web sites will jump to more than 265 million. The number of blogs will hit 130 million.

Ignore that, if you can.

The challenges of the internet, its social networking implications and our fast-evolving digital world are more than shared by today’s marketers. They are also a first line of defense concern, as you can imagine. And like a run-away train, it will not be stopped.

That, my friends, is the world we live in. A little scary, isn’t it? But (and I have been waiting so very many years to use this poem that I was forced to memorize as a high school freshman), “If we buckle right in, with a bit of a grin… without any doubting or ‘quit-it.’ If we tackle this thing that cannot be done… we’ll soon be able to say, ‘We did it!’”

So if you need impetus to ‘buckle right in,’ consider yourself pushed.

Jerry Constantino, February 24, 2010

ArnoldIT.com paid Mr. Constantino for his write up.

Google and Screwups

February 24, 2010

PCWorld is certainly getting frisky. Consider the story “2010 Is Becoming the Year of Google Screwups.” The article, written by Robert X. Cringely, is going to get lots of clicks. Even the addled goose exercises goose judgment when writing about Googzilla. For example, I wanted to cover the Google mistress article, but I had a tough time figuring out how to hook search into the story.

Not PCWorld. For me, the most interesting point was:

So far, 2010 is shaping up to be the year Google discovered it had feet of clay — and those feet have been spending a lot of time in Google’s mouth.

The Screwups article provides a particularly useful discussion of Google and its handling of copyright violation claims. MBAs are going to love this write up.

In my view, the year is young.

Stephen E Arnold, February 24, 2010

No one paid me to write this. Because of the reference to copyright, I will alert the Copyright Office that I am working like a beaver chewing down potentially useful raw material for paper suitable for ink jet use.

Digital Dorks: Maybe Lots of Them?

February 24, 2010

I was amused with the Wall Street Journal’s “free” article called “Nearly 20% of US Is Digitally Uncomfortable or Digitally Distant, FCC Says.” Some folks, according to the Federal Communications Commission, are not hip to online. If not hip to online, I wonder if these segments read books, subscribe to magazines, and frequent the local library to lap up information.

In the Commonwealth of Kentucky, the schools struggle to crank out students who can read. The low level of reading skill may be different where you live, but I find it disturbing. Also Harrod’s Creek is enlivened on some mornings with the sound of gunfire. Avid hunters pursue squirrel (tree rats in the local jargon), deer (rats on hooves), and geese (not sure if I am a rat or not) with enthusiasm. My network access is flaky and the throughput sucks. I have two high speed lines, so unless it snows or rains or is windy or is foggy, the data creeps through.

Why then would it be surprising that in Kentucky and similar states, there would be a large chunk of folks who are digitally different? I think that online for some folks means using an automatic teller machine for cash, not surfing LexisNexis.

Among the factoids in the write up that impressed me were:

  • The digitally distant make up 10 percent of the US population. Figure 320 million people. The distant folks amount to 32 million.
  • The digitally uncomfortable account for another 10 percent. That’s another 32 million people.
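The arithmetic behind those two bullets, as a quick check in Python. It assumes the round 320 million population figure used above and a flat 10 percent for each segment; the variable names are mine, not the FCC's.

```python
US_POPULATION = 320_000_000  # round figure used in the bullets above
SEGMENT_PERCENT = 10         # each segment is 10 percent of the population

# Integer math avoids floating point fuzz in the head counts
digitally_distant = US_POPULATION * SEGMENT_PERCENT // 100
digitally_uncomfortable = US_POPULATION * SEGMENT_PERCENT // 100

not_into_online = digitally_distant + digitally_uncomfortable
share_percent = not_into_online * 100 // US_POPULATION

print(f"Distant: {digitally_distant:,}")
print(f"Uncomfortable: {digitally_uncomfortable:,}")
print(f"Combined: {not_into_online:,} ({share_percent}% of the population)")
```

Swap in a different population estimate and the head counts move, but the combined share stays at 20 percent.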

That means that 64 million people are not going to be into online like this addled goose. What happens when we consider the appetite for reading and the unemployment rate? What about the dependence on video games, Twitter, and TV for information? What about the loss of local bookstores? What about the decline in local radio and TV news coverage? Yikes!

For me, it means that knowledge workers comprise an important but probably very tiny segment of society. So when we talk about search and nifty gadgets like the forthcoming Apple iPad, I wonder if the spectacular growth numbers associated with online and tech-based services are sustainable.

Even more interesting a question is, “Will search systems shift from user generated queries to predictive pushing of information?” The idea is that some people may be too busy or simply unable to formulate a query for the information needed to perform a task. Why search? Just take what gets pushed to a person with a question.

Finally, with the information revolution going on for decades, what happens to the organizations that have not yet mastered electronic information? Into what category do these people and their organizations fit? Maybe we need a new segment for digital dorks? Just a question, not a real assertion.

Stephen E Arnold, February 24, 2010

No one paid me to think up the phrase digital dorks. I will report non payment to the Council on Literacy.

Baffled about Real Journalists

February 23, 2010

I am not a journalist. I don’t even know how one becomes a “real” journalist. I learned when I read “Why We Don’t Trust Devil Mountain Software (and Neither Should You)” that big publishing companies don’t know either, assuming the information in the write up is accurate. I guess I should not be surprised. I learned last week that a person who writes about electronic information is officially an “expert” on electronic information. I suppose that means that if I were a crime reporter and I wrote about an alleged illegal activity, I would be qualified to talk about wrongdoing as an “expert”. I wonder how the professionals in law enforcement, military intelligence, and related disciplines feel about “real” journalists becoming experts by virtue of talking to people and reading news items? I suppose faux expertise and “real” journalists are products of the modern world. I find the footprints of these types of folks when I work to mop up after search and content processing disasters. There is a downside to the lack of information about complex subjects even at outfits who are supposed to “know” what’s “real” and what’s not. One thing is sure. This flap is great for the search engine optimization crowd.

Stephen E Arnold, February 23, 2010

No one paid me to write this. I do use a persona—namely, the addled goose—when I write this column. But I received no crumbs for this article. As a fowl, I will report this bedraggled condition to Fish & Wildlife. I wonder if the “real” journalist was paid for his dual roles?

Google and Energy

February 22, 2010

I left the power generation industry in 1975 (I think). I did a study of the online transaction service rolled out for Enron’s energy trading. That project forced me to look at how other companies dabbled in this once little-known niche in the US energy sector. Anyone remember Aquila, a name derived from the Latin word for eagle? Aquila is still around, but it does business as Black Hills Energy. The other companies in this sector now have some competition.

The basics of energy trading are a variant of online search and retrieval. Information is indexed, and then either analysts or smart software work through the data and their changes. The algorithms stipulate that when A happens, B should occur if the probability is X.
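A toy version of that "when A happens, B should occur if the probability is X" rule, sketched in Python. I am assuming X works as a minimum threshold on an estimated probability. The Rule class, the event names, and the 0.8 figure are hypothetical illustrations, not anyone's actual trading logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    trigger: str            # event "A"
    action: str             # response "B"
    min_probability: float  # threshold "X" (assumed to be a minimum)


def decide(rule: Rule, event: str, probability: float) -> Optional[str]:
    """Return the action if the trigger fired and the estimate clears the threshold."""
    if event == rule.trigger and probability >= rule.min_probability:
        return rule.action
    return None


# Hypothetical rule: sell when a price spike looks at least 80 percent likely to hold
rule = Rule(trigger="price_spike", action="sell", min_probability=0.8)

decision_strong = decide(rule, "price_spike", 0.9)
decision_weak = decide(rule, "price_spike", 0.5)
decision_other = decide(rule, "outage_report", 0.99)
```

The hard part in a real system is producing the probability estimate from the indexed data, not the rule check itself.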

In short, energy trading is just another application running on a computer. The reason I mention this is that Google is now in the buying and selling of energy business. You can get the basics in “Google Energy Can Now Buy and Sell Electricity.”

Most of the commentary I have scanned suggests that Google will be able to save money on its own electricity bills. That’s partially correct. My view is that the Google platform is going to take the “old” Enron model and improve it. Just as search and retrieval in the late 1990s was stuck in a rut, energy trading is similarly encumbered with inefficiencies.

My view: Google could be a bigger and better Enron. I do hope that its managers exercise somewhat better judgment than the “old” Enron group did. Worth watching because prior to this announcement I think the power generators, energy traders, and Wall Street mavens did not perceive Google as a mover or shaker in financial markets.

Well, that group of pundits will regroup once the light bulb goes on. Will the power be intermediated through Google’s trading desk? Buying and selling stuff based on digital data is just another Google application. Simple statement. Big implications in my opinion. Those janitor methods at Google are going to be busy little beavers.

Stephen E Arnold, February 21, 2010

I wish to report to the DOE that I was not paid to write this article. I was thinking of making the disclosure to the SEC, but I think that group has its hands full with traditional publicly traded power generation companies.

PointCast Version 2?

February 19, 2010

I read “Did Google Reader Just Turn on the Firehose?” I don’t use the Google Reader. The addled goose does not read. But when he scanned with one eye the story in Stay N’ Alive, he had one thought, “Is PointCast back?” Different coat of paint maybe but possibly the same squeaking wheel?

Stephen E Arnold, February 19, 2010

No one paid me to write this. Non payment means that I must report this to the IMF, an outfit aware of such sad situations.

Tinfoil Hat Department: UK UFO Archives Now Online

February 19, 2010

The Ministry of Defence (UK) and the National Archives have an online resource for those who believe in unidentified flying objects or UFOs. You can get the details in “MoD UFO Files on the National Archives Site.” For me the telling statement was:

The files are just the latest round of information about UFO sightings released by the MoD.

Who needs Google Books?

Stephen E Arnold, February 19, 2010

No one paid me to write this. I am not sure which agency controls the writing of uncompensated news items about UFOs? Probably NASA. I am now reporting. Dit dah dah.

Search Patterns: User Experience Explained

February 19, 2010

The addled goose does not do book reviews. I was asked if I wanted a copy of Search Patterns by Peter Morville and Jeffery Callender. I said, “Sure.” I read the book, and I think that anyone mired in user interface for search and content processing systems will want to snag a copy. For me, the most useful section was Chapter 4, Design Patterns. The O’Reilly production value is good. The book is stuffed with screenshots. I am not sure when the book will be in the Harrod’s Creek bookstore. You can chase down a copy on Amazon.

After finishing the 180 page book, I kept thinking about the thrashing that goes on among procurement teams and vendors. The procurement teams know what they like when they see it and, in my experience, do not have much information about what is required to make a particular interface feasible. The vendors do quite a bit of borrowing from one another. It is possible that some procurement teams will focus on the UX, user experience in the lingo of Microsoft. Maybe that approach will reduce the dissatisfaction among enterprise users of search and content processing systems?

Worth a look.

Stephen E. Arnold, February 19, 2010

No one paid me to read this 180 page book, examine the screenshots, and do some thinking about the shift from search plumbing to the UX. I am not sure to which government agency I report such uncompensated work. Maybe the Library of Congress whose interfaces knock my socks off each time I use LOC.gov.

The Southwest, Smith, Social Media Storm Front

February 18, 2010

Beyond Search does not cover the social media space. Our companion Web log, Strategic Social Networking, does. You can view our new Social Media video by navigating to http://ssnblog.com and clicking on the video graphic.

The subject of this week’s two minute video is the storm front triggered by the interaction of Southwest Airlines, movie director Kevin Smith, and social media. Our take? Quite a mess, and most organizations are powerless because social media is moving more quickly than management. We had two emails about the carved bird featured in the video. That’s the inspiration for the SSNBlog’s logo… a social and technical tern (you know, one of those social birds that flock near your restaurant table in Cassis).

Okay, I paid myself with money from my own pocket to write about my video. I am not sure how this disclosure of self compensation strikes you, but I think Ralph Waldo Emerson would probably have whipped up one of his exciting essays were he alive and fresh from penning “Compensation.” I think I report this type of payment to the Bureau of Labor Statistics. That outfit understands zeros.
