Facebook Streams

June 25, 2009

You will want to work through this somewhat disjointed discussion of Facebook in ReadWriteWeb’s “The Day Facebook Changed Forever: Messages to Become Public By Default.” For me the most important point was:

In time, though, people may very well decide they are comfortable with their social networking being public by default. That will be a different world, and today will have been one of the most important days in that new world’s unfolding.

The reason? More content flows to monitor and mine. Goodie. Love those social postings.

Stephen Arnold, June 26, 2009

Text Mining and Predicting Doom

June 23, 2009

The New Scientist does not cover the information retrieval sector. Occasionally the publication runs an article like “Email Patterns Can Predict Impending Doom” which gets into a content processing issue. I quite liked the confluence of three buzzwords in today’s ever thrilling milieu: “predict”, “email”, and “doom”. What’s the New Scientist’s angle? The answer is that as tension within an organization increases, communication patterns in email can be discerned via text mining. The article hints that analysis of email is tough, with privacy a concern. The article offers a suggestive reference to an email project at Yahoo, but provides few details. With monitoring of real time data flows available to anyone with an Internet connection, message patterns seem to be quite useful to those lucky enough to have the tools needed to ferret out the nuggets. Nothing about fuzzification of data, however. Nothing about which vendors are leaders in the space except for the Yahoo and Enron comments. I think there is more to be said on this topic.

Stephen Arnold, June 23, 2009

Data Tables Contain Deleted Data. Yikes. Revelation.

May 21, 2009

First it was spies on Facebook. Then it was the LA Times spoofed via a year-old Prop 8 story. Now – news flash – the issue is privacy on social networking sites. Yikes. What a scoop. Sky News in the UK published “Fears over Privacy on Social Networking Sites” here. The intrepid news hounds at Sky News reported:

Researchers from the University of Cambridge say that many social networking sites maintain copies of user photos even after users delete them.

I wonder if the wizards in the groves of academe figured out that quite a bit of other information and data lurk on these sites. In fact, unless the indexes have been rebuilt, my hunch is that my team could find some interesting stuff not searchable but available to those poking around with forensic savvy.

I am waiting for one of these intrepid reporters to define “delete” and “remove”.
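
Here is one way to picture the difference. This is a minimal sketch, not a description of how any particular social networking site stores its data. The `photos` table, the `deleted` flag, and the function names are all hypothetical, and the definitions of “remove” (hide the row) and “delete” (drop the row) are my assumptions for illustration.

```python
import sqlite3

# Hypothetical schema: a photo table with a soft-delete flag.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE photos ("
    "id INTEGER PRIMARY KEY, owner TEXT, url TEXT, deleted INTEGER DEFAULT 0)"
)
conn.execute(
    "INSERT INTO photos (owner, url) VALUES (?, ?)",
    ("alice", "http://example.com/a.jpg"),
)


def remove_photo(conn, photo_id):
    """'Remove' (assumed meaning): hide the row from users; the data stays."""
    conn.execute("UPDATE photos SET deleted = 1 WHERE id = ?", (photo_id,))


def delete_photo(conn, photo_id):
    """'Delete' (assumed meaning): drop the row from the table."""
    conn.execute("DELETE FROM photos WHERE id = ?", (photo_id,))


def visible_photos(conn):
    """What a user-facing query would return."""
    return conn.execute(
        "SELECT id, url FROM photos WHERE deleted = 0"
    ).fetchall()


def all_rows(conn):
    """What someone poking around the table directly would see."""
    return conn.execute("SELECT id, url, deleted FROM photos").fetchall()
```

After `remove_photo(conn, 1)` the photo vanishes from `visible_photos(conn)` but is still sitting in `all_rows(conn)`. Only `delete_photo(conn, 1)` takes the row out of the table.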

Stephen Arnold, May 22, 2009

Google Health

May 20, 2009

A battle is shaping up among some heavy hitters for digital health services. If you want a useful summary of what Googzilla has been doing, you can click here to read Mark Gibbs’s overview of the service. For me the most interesting comment was:

Google Health provides an API based on a subset of the “Continuity of Care Record” API described as “a standard format for transferring snapshots of a patient’s medical history.” This API allows developers to build software that can create and read consumer’s medical records with sophisticated authorization and access controls.

Not much about search and data mining in the story, however. Keep in mind that Google products and services have search baked in. Google seems to be pursuing a consumer strategy whilst Microsoft is chasing the health enterprise. Lots of excitement coming in this sector. Health information is in the same sorry state as the US health care system. My thought is that it will evolve along the same lines as the US auto and airline industries. That’s a comforting notion, isn’t it?

Stephen Arnold, May 20, 2009

Security: Search a Factor

May 10, 2009

Security of online information is critical to any organization that operates on the Internet, from large corporations to medical institutions to the federal government. Remember the stolen laptop? Security online, especially when setting up a database of searchable, confidential material, is a herculean task, because if it’s online, someone can search and find it. Case in point, a headline from May 7: US Med Data Held Hostage by Hackers; Ransom: $10M. See the article at http://bit.ly/16IoZi. Hackers stole over eight million cases of drug prescription records, social security numbers, and driver’s license details from Virginia on April 30. It was reported that several layers of protection failed and allowed the hackers access. It’s not the first time something like this has happened. Data security online must be improved, or we’re all going to be facing a lot more fraud in the future.

Jessica Bratcher, May 10, 2009

LexisNexis, Its Data and Fraud

May 3, 2009

Robert McMillan’s “LexisNexis Says Its Data Was Used by Fraudsters” here caught my attention. The story reported that “LexisNexis acknowledged Friday [May 1, 2009] that criminals used its information retrieval service for more than three years to gather data that was used to commit credit card fraud.” Mr. McMillan added that “LexisNexis has tightened up the way it verifies customers.” The article noted that LexisNexis “was involved in other data breaches in 2005 and 2006.” Interesting. So 2005, 2006, 2009. Perhaps the third time will be the charm?

Stephen Arnold, May 2, 2009

The Individual: The Fount of Crime

May 3, 2009

Most people don’t think of bad guys as the fellow who lives in the next flat or the nice girl with the lawn service. Those in bank security, law enforcement, and insurance investigation know that the individual is the key to certain interesting activities. The numerous comments about Spock’s sale to Naveen Jain, the founder of Intelius and InfoSpace, were quite tame. You can read “Intelius Buys Spock, the People Search Engine” here. My thoughts on this deal included these musings:

  • Mr. Jain is a canny lad. I think he senses that the value of a people-centric service will rise. If this takes place, Mr. Jain could make a tidy profit.
  • In the near term, the market demand for people information is likely to rise.
  • People data can be sliced and diced, and I think that Mr. Jain will make an attempt to generate some revenue from this property.

And crime? Well, I don’t have much to say about that.

Stephen Arnold, May 2, 2009

Cyberwarfare Shoots Down an Aircraft

April 21, 2009

Short honk: Everyone from the Wall Street Journal to Slashdot is reacting to the reality of cyberwarfare. You can read the Wall Street Journal’s tabloidesque coverage here. For those of you with a more scholarly approach to what’s been going on for many years, click here to buy and then read Information Warfare by Winn Schwartau. Although the book is more than 15 years old, you may as well start with it; it is one of the best discussions I have examined. Put the energies and hand waving into practices that close security loopholes. The barn, the horse: you know the aphorism. Search is an important function in these escapades. Unlike some enterprise search vendors, some bad guys use sophisticated findability methods.

Stephen Arnold, April 21, 2009

Google Health: Two New Deals

April 6, 2009

Googzilla has revealed some new tie ups in its Google Health initiative. At my lecture a couple of weeks ago in Houston, a big medical center with a city wrapped around it, there was quite a bit of interest in electronic medical records. The real issue, however, was consistency. I thought privacy and security were the cat’s pajamas. I was wrong. The medical types kept circling around the issue of data management, data transformation, and moving bits from Point A to Point B with the people at Point B able to use the information.

Google announced two interesting tie ups. The first is a partnership with CVS, a retail chain. You can get the details here. The Reuters story provides a few details. But the big point to me was that the GOOG is thinking retail and retail pharmacies.

The second tie up is with the giant Medco Health Solutions Inc. outfit. You can read this Reuters story here. Same deal: some facts but not much on the way the tie up will affect customers. The news story asserts that Google has more than 100 million people who can get access to prescription data. For me, the point is that the GOOG is thinking consumers via a partnership.

Microsoft thinks the same types of thoughts for HealthVault. The appearance of the two stories is either a coincidence or part of a health push. With the Obama Administration’s support of electronic medical records, the Google may be shifting gears. If so, the company will accelerate its surround and seep strategy in an effort to capture the market sector.

Can Google do this? Right now I think it is a wide open sector. Google’s chances are neither better nor worse than the other companies fighting for a handhold.

Stephen Arnold, April 6, 2009

Passwords List

April 1, 2009

Short honk: you can get three lists of common passwords here. These lists often come in handy when filtering government information prior to putting documents online. ArnoldIT.com has used this method for years. If you are indexing an organization’s documents, you might want to run your corpus through a filter test against these lists. Might be helpful.
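
For those who want to try the idea, here is a minimal sketch of such a filter test. It is not the ArnoldIT.com method, just one way it might look. The tiny `COMMON_PASSWORDS` sample, the tokenizer pattern, and the function names are assumptions for illustration; a real run would load one of the published lists instead.

```python
import re

# Hypothetical sample; a real run would load one of the published lists.
# Entries are lowercase because tokens are lowercased before matching.
COMMON_PASSWORDS = {"123456", "password", "letmein", "qwerty"}

# Assumed tokenizer: runs of letters, digits, and a few symbols.
TOKEN_RE = re.compile(r"[A-Za-z0-9_!@#$%^&*]+")


def find_password_hits(text, passwords=COMMON_PASSWORDS):
    """Return the sorted common-password tokens found in text."""
    tokens = {tok.lower() for tok in TOKEN_RE.findall(text)}
    return sorted(tokens & passwords)


def filter_corpus(docs, passwords=COMMON_PASSWORDS):
    """Map each document name to its hits, keeping only documents with hits."""
    flagged = {}
    for name, text in docs.items():
        hits = find_password_hits(text, passwords)
        if hits:
            flagged[name] = hits
    return flagged
```

Feed `filter_corpus` a dictionary of document names and text, and it hands back the documents that need a second look before they go online. Expect false positives, since an ordinary word like “password” is on most lists, so a human should review the flagged items.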

Stephen Arnold, April 1, 2009
