Are Privacy Issues Still Plaguing Google?

July 31, 2011

It’s hard to believe that Google is continually putting consumer privacy in question; you would think they would learn. While I’m all for a good Google roast, this is borderline overkill. Matthew DeCarlo’s TechSpot article “Google’s Street View Cars Collect Locations of Wi-Fi Devices” is an interesting look at what appears to be Google’s latest trouble.

Google’s Street View Cars collected information about the Wi-Fi locations of many European users and their “previous locations”. We learned:

“For instance, someone could use the data to show you were at a specific place during a specific time, and that’s something you might not want to share with the world.”

What I don’t get is how this is any different than the check in applications that millions of people are already utilizing on their Facebook and Twitter accounts. It’s also no different than the hundreds of millions of social media users that post their whereabouts on statuses and feeds worldwide.

In order for users to cry “privacy infringement,” the data should have been private in the first place. But it sure looks as if privacy issues, whether grounded or not, are an albatross around Googzilla’s neck. In our opinion, it is tough to search if one cannot connect.

Leslie Radcliff, July 31, 2011

Sponsored by Pandia.com, an outfit which published my new study of enterprise search and a chapter that provides some of my analysis of the Google Search Appliance.

Ardentia Search Now Connexica

July 29, 2011

Short honk: We were updating the Overflight links today and noted that Ardentia Search, which had positioned itself as a “business intelligence company,” is now redirecting to Connexica.com. The About Us page references Ardentia Search. The managing director of the company is Richard Lewis. Here’s the important bit:

As CTO at Ardentia, [Richard Lewis] was responsible for the development of BI and Data Warehouse products which are now used in over 100 NHS organizations as well as providing analysis and extract services for the National Program for IT. Richard founded Connexica in 2006 by buying the IPR for his latest BI and search product from Ardentia.

If you don’t recall the Ardentia system, here’s a block diagram I unearthed from the Overflight archive:

ardentia overview

A number of search and content processing companies are repositioning, not disappearing.

Stephen E Arnold, July 29, 2011

Freebie

Exclusive Interview with Margie Hlava, Access Innovations

July 19, 2011

Access Innovations has been a leader in the indexing, thesaurus, and value-added content processing space for more than 30 years. Margie Hlava’s company has worked for most of the major commercial database publishers, the US government, and a number of professional societies.

See www.accessinn.com for more information about MAI and the firm’s other products and services.

When I worked at the database unit of the Courier-Journal & Louisville Times, we relied on Access Innovations for a number of services, including thesaurus guidance. Ms. Hlava’s firm’s MAI system and its supporting products deliver what most of the newly-minted “discovery” systems need: indexing that is accurate, consistent, and makes it easy for a user to find the information needed to answer a research or consumer level question. What few realize is that the value of using the systems and methods developed by the taxonomy experts at Access Innovations lies in standards. Specifically, the Access Innovations approach generates an ANSI standard term list. Without getting bogged down in details, the notion of an ANSI compliant controlled term list embodies logical consistency and adherence to strict technical requirements. See the Z39.19 ANSI/NISO standard. Most of the 20 somethings hacking away at indexing fall far short of the quality of the Access Innovations implementations. Quality? Not in my book. Give me the Access Innovations (Data Harmony) approach.

Care to argue? I think you need to read the full interview with Margie Hlava in the ArnoldIT.com Search Wizards Speak series. Then we can interact enthusiastically.

On a rare visit to Louisville, Kentucky, on July 15, 2011, I was able to talk with Ms. Hlava about the explosion of interest in high quality content tagging, the New Age word for indexing. Our conversation ranged from the roots of indexing to the future systems which will be available from Access Innovations in the next few months.

Let me highlight three points from our conversation, interview, and enthusiastic discussion. (How often do I in rural Kentucky get to interact with one of the, if not the, leading figures in taxonomy development and smart, automated indexing? Answer: Not often enough.)

First, I asked how her firm fit into the landscape of search and retrieval.

She said:

I have always been fascinated with logic and the application of it to the search algorithms was a perfect match for my intellectual interests. When people have an information need, I believe there are three levels to the resources which will satisfy them. First, the person may just need a fact checked. For this they can use encyclopedia, dictionary etc. Second, the person needs what I call “discovery.” There is no simple factual answer and one needs to be created or inferred. This often leads to a research project and it is certainly the beginning point for research. Third, the person needs updating, what has happened since I last gathered all the information available. Ninety five percent of search is either number one or number two. These three levels are critical to answering properly the user questions and determining what kind of search will support their needs. Our focus is to change search to found.

Second, I probed why indexing is such a hot topic.

She said:

Indexing, which I define as the tagging of records with controlled vocabularies, is not new. Indexing has been around since before Cutter and Dewey. My hunch is that librarians in Ephesus put tags on scrolls thousands of years ago. What is different is that it is now widely recognized that search is better with the addition of controlled vocabularies. The use of classification systems, subject headings, thesauri and authority files certainly has been around for a long time. When we were just searching the abstract or a summary, the need was not as great because those content objects are often tightly written. The hard sciences went online first and STM [scientific, technical, medical] content is more likely to use the same terms worldwide for the same things. The coming online of social sciences, business information, popular literature and especially full text has made search overwhelming, inaccurate, and frustrating. I know that you have reported that more than half the users of an enterprise search system are dissatisfied with that system. I hear complaints about people struggling with Bing and Google.

Third, I queried her about her firm’s approach, which I know to be anchored in personal service and obsessive attention to detail to ensure the client’s system delivers exactly what the client wants and needs.

She said:

The data processed by our systems are flexible and free to move. The data are portable. The format is flexible. The interfaces are tailored to the content via the DTD for the client’s data. We do not need to do special programming. Our clients can use our system and perform virtually all of the metadata tasks themselves through our systems’ administrative module. The user interface is intuitive. Of course, we would do the work for a client as well. We developed the software for our own needs and that includes needing to be up running and in production on a new project very quickly. Access Innovations does not get paid for down time. So our staff are trained. The application can be set up, fine tuned, deployed in production mode in two weeks or less. Some installations can take a bit longer. But as soon as we have a DTD, we can have the XML application up in two hours. We can create a taxonomy really quickly as well. So the benefits are fast, flexible, accurate, high quality, and fun!

You will want to read the complete interview with Ms. Hlava. Skip the pretend experts in indexing and taxonomy. The interview answers the question, “Where’s the beef in the taxonomy burger?”

Answer: http://www.arnoldit.com/search-wizards-speak/access-innovations.html

Stephen E Arnold, July 19, 2011

It pains me to say it, but this is a freebie.

Belgium and Google: A Messy Waffle

July 18, 2011

I saw this headline: “Belgian Newspapers Claim Retaliation By Google After Copyright Victory” and I was nervous clicking on the link. SEO news services make me nervous. The idea of any “retaliation” story makes me think of long lists of words on a watch list somewhere.

I clicked on the link, and the story seemed okay, just a bit thin on substantive details. Quoting the Associated Press is in and of itself reason for concern.

Here’s the main idea:

Publishers in Belgium did not want their content indexed by Google. (That strikes me as less than informed, but forget the knowledge value angle.) So publishers get the fluid legal system to notify the Google. Shortly thereafter, some Belgian publishers note that their content is tough to find at the top of a Google results list. Bottom line: Some folks believe Google is jiggling the results to make some Belgian content familiar with the tedium of clicking through lots of pages to find the desired hit.

My view is that accusations are definitely good for “real” news outfits like the publisher of the retaliation story. I also think that considerable care must be taken before yip yapping about why a particular results list does not show what one wants, expects, believes, or hopes will appear.

Google has lots of people working on the search system. I once believed that these teams were coordinated and working like a well oiled robot arm assembling nuclear fuel rods. Now I know that the method is more like “get it working”. Good enough is going to earn a search wizard an A from the Google system.

Messy waffles. Image source: http://eatbakelove-todayistheday.blogspot.com/2011/06/weekend-warrior.html

If there is a difference between publishers’ expectations and what is in a particular result list, I suggest several things:

First, get a trained and expert online searcher to run queries in a methodical manner to verify what is and what is not “findable.” Keep in mind that 99.9 percent of the people who claim to be search experts are not. If you don’t believe me, give Ulla de Stricker a buzz. You can also try Anne Mintz, former director of the Forbes Magazine information center. You can also ping Marydee Ojala, editor of Online. Folks, trust me. These individuals are certifiable online search experts and can get the information needed to put some data behind the hot air. Data needed.

Vivisimo Assumes Leadership Position in Twitter Chat

July 15, 2011

I noted the write up “Customer Service Industry Luminaries Participate in Vivisimo’s Unique Twitter Chat Experience: Vivisimo Twitter Chat #CXO is Named one of ‘The 12 Most Stimulating Twitter Chats.’” The CXO is a business publication. My recollection is that I wrote something for the company years ago. I lost track of the outfit after I shifted to the Enterprise Technology Management publication. (If you click the link today, you may see the addled goose himself. I wrote about Google, not Twitter Chat. I did not know what a Twitter Chat was.)

The write up informed me that I was wrong about CXO, which means Customer Experience Optimization and not the publication CXO. I also learned that Twitter Chat has been an ongoing activity since April 2011.

Here’s the passage I noted:

"We’ve grown to over 100 participants in just over 10 weeks and have had A-list participants join us because Vivisimo has established residency as the leader in customer experience," said Tracey Mustacchio, Vice President of Marketing at Vivisimo. "The #CXO Twitter chats are an interactive, convenient, and highly effective medium to collaborate and learn from peers how to improve their customer experience initiatives. Also, being named one of the 12 Most Stimulating Twitter Chats was an honor and will help to set us apart from our competitors."

Previous session topics included:

  • What is Customer Experience Optimization?
  • Another Officer in the C-Suite: Chief Customer Officer [CCO]
  • The Intersection Between Innovation and the Customer Experience
  • Customer Experience for the Gen Y, Digital Native
  • Trust in the Customer Experience
  • "Who Comes First in Shaping #CustExp – Customers or Employees?"
  • Social Media and the Customer Experience
  • What’s the best way to make customer experience metrics actionable?
  • Going Mobile with Customer Experience?
  • Customer Data: from Overload to Insight
  • Supply Chain or Customer Value Chain

To join in on future Vivisimo #CXO Twitter chats, visit the following link every Monday at 12:00 pm ET: http://tweetchat.com/room/cxo. If you use an application like Tweetdeck or Seesmic Desktop, create a search column for the term "#CXO" and follow the Vivisimo chat.

If you want more information, visit the Vivisimo Web site at www.vivisimo.com. Interesting.

Stephen E Arnold, July 15, 2011

Sponsored by Pandia.com, publisher of The New Landscape of Enterprise Search

Digital Reasoning Forges Ahead With New Select Partner Program

July 10, 2011

Digital Reasoning has solidified its reputation thanks to its data analytics solutions which help companies dig deep into their data and really get the most from their resources. According to the Red Javelin Communications press release “Digital Reasoning Introduces Select Partner Program for Big Data Analytics” Digital Reasoning recently announced the formation of a new Select Partner Program for technology partners. “The program has been created to support the growing ecosystem of leading technology vendors building the next generation of analytic solutions for Big Data.” Digital Reasoning hopes to integrate its Synthesys business intelligence platform in not only commercial enterprises but also the federal government’s intelligence sectors.

Data analytics is a way of life for companies such as Cloudera, DataStax, and Fetch Technologies, and they make up the initial member lineup for the Select Partner Program. Cloudera provides clients with Apache Hadoop based data management software: Cloudera Distribution, which is a commercial distribution of the Hadoop platform. Apache Hadoop is an open-source platform designed to efficiently analyze both structured and complex data. Many companies use Hadoop because of its ability to handle different data sets, the flexibility to add or remove servers, and its capability to detect errors or problems and fix them while continuing to process data. According to the Cloudera Web site, Cloudera Distribution “integrates the most popular projects related to Hadoop into a single package, which is run through a suite of rigorous tests to ensure reliability during production.”

Digital Reasoning’s latest version of its popular intelligence platform Synthesys will be fully integrated with Cloudera’s Distribution, specifically Apache Hadoop (CDH3) and HBase. According to the Marketwire.com article “Cloudera and Digital Reasoning Partner to Provide Complex Data Analytics for Government Intelligence and Enterprise Markets” this new integration will allow “Digital Reasoning to achieve extreme scale capabilities and provide complex data analytics to government and commercial markets.”

DataStax provides another valuable open source program, called Apache Cassandra. DataStax is the commercial leader for this open-source platform, which allows customers to build, deploy, and operate high volume real time operations across various servers, databases, computers, etc. without interruption. Fetch Technologies brings another important piece to the puzzle with its real time technology-based solutions. The technology allows companies to not only access their data but also compile it while accessing real-time Web data. Clients can access valuable websites and get important information such as data analysis.

Though each company’s products have some similarities, they each bring something different and valuable to the data analytics table. According to a statement from the CEO of Digital Reasoning, Tim Estes, “We strongly believe that working with the best-in-class partners is the ideal way to help customers solve their Big Data analytic challenges.” Digital Reasoning understands the power of teamwork and is extremely optimistic about the partnerships. That is proof the company believes “there is power in numbers.”

A happy quack for the Digital Reasoning team.

April Holmes, July 11, 2011

You can read more about enterprise search and retrieval in The New Landscape of Enterprise Search, published by Pandia in Oslo, Norway, in June 2011.

Google+: How Does One Measure Success?

July 10, 2011

I keep trying to avoid the Google+ “thing.” I focus on a different slice of the online world, but my newsreader and Overflight system overfloweth with baloney about a service that is in trial, less than two weeks old, and pretty much a knock off of Facebook. This “me too” stuff is becoming the focus of innovation, and I am not too interested in learning how to use a wheel or light a fire, thank you.

How does one measure success? Tweets, revenue, number of products and services offered customers? Source: http://goo.gl/9yBS9

I did notice two items this morning. The first was a report about the number of short messages mentioning Google+. Yep, let’s get on the Tweeter thing. I find the data illustrative of the spike nature of information. Where’s that Fukushima thing? Long gone for today’s news consumers, but the aftermath of the event is good for a couple thousand years.

The second item came from my favorite techno-management guru, Eric Schmidt. The view that caught my eye appeared in “Eric Schmidt on Gauging Google+’s Success.” There is a great deal of information in the write up and the embedded video. Here’s the passage that caught my attention:

When asked by reporters whether Google planned eventually to fill out Google+ with other products, Schmidt answered, “Yeah, and there’s a lot coming,” saying that business accounts and ads are expected, assuming Google+ continues to grow. “We test stuff and when it works we put a lot more emphasis on it,” he said.

Okay, “a lot” and “test stuff”. Let’s reflect.

Google is juggling a large number of balls, just like IBM, Microsoft, and Oracle. What’s interesting to me is that the company’s principal—dare I say, “only”—revenue stream is advertising. The purpose of Google+, Android, and the other major initiatives is generating advertising revenue.

I urge Google to move forward and generate ad revenue from these services. The reason is that once again I talked with a number of companies and heard one message, “We need traffic. Something’s changed at Google, and we don’t know what to do.”

My response was, “Adapt.”

Google has shifted its attention from the brute force AltaVista.com approach to information to a new, compound model. The focus of that model is not delivering better results. The focus is producing revenue. And the company needs the new sources. With Web traffic shifting from desktop access to mobile access, Google has to find a way to sustain revenue and then grow significant new streams. Generating new revenue streams is not easy. The case example? Google itself. After 11 or 12 years in business, Google has a giant footprint but it is now good at one thing: selling ads.

I think there are companies with better products in search. Example: Yandex.com. I think there are companies with better social network services: Facebook.com. I think there are companies with better mobile devices: Apple. I think there are companies with better online shopping: Amazon.com. In short, unlike the early days, Google has competition in packaging, technology, and innovation.

So, the buzz about Google+ tells me three things:

  1. Google excites significant interest, particularly among a certain sector of the online community. That indeed is a marketing advantage. Marketing now has to turn into revenue. Where’s that $100 billion a year company now? About $70 billion to go.
  2. Google faces significant competition across a wide range of business facets. I suppose Alexander the Great could handle a multi front war, but he died at an early age, and his empire fizzled. I won’t press the metaphor, however.
  3. The legal opponents Google faces are likely to see the sprawl of Google as an indication of the company’s ambitions. At a time when concern about monopoly and certain business practices is increasing, Google demonstrates that it can pretty much do what it wants, dismissing questions with remarks like “a lot” and “test stuff”.

The upcoming hearings in Washington, DC, will be interesting. I wonder if there will be a Google+ service for those involved? I hope so. That might be a way to measure success, just not in terms of revenue.

Stephen E Arnold, July 10, 2011

You can read more about enterprise search and retrieval in The New Landscape of Enterprise Search, published by Pandia in Oslo, Norway, in June 2011.

Google Filters Ads but Decries Censorship

July 7, 2011

We like acrobats who can walk two tightropes at one time.

Let’s compare “Google Chairman Warns of Censorship after Arab Spring,” at Datamation to ZDNet’s “Google Takes Down 93,000 Scam Ads.”

The first piece cites Eric Schmidt’s concerns regarding consequences of this year’s uprisings across the Middle East:

Google chairman Eric Schmidt expressed his belief that dictatorial governments will begin restricting free expression on the Web following the democratic uprisings in Arab countries that occurred this spring. Many of these movements used Web-based technologies to organize protests and communicate with others. Schmidt says that as a result, government censorship of the Internet is going to ‘get worse,’ and he said that some technology executives could face arrest and torture.

That’s bad, alright. Isn’t censorship always bad?

Apparently not. The ZDNet article reports that, in cooperation with the U.K.’s Office of Fair Trading, Google removed 93,000 ads that linked to “scam sites” between July and December 2010:

Digital fraud is a significant problem for British consumers, according to Action Fraud. The UK’s national scam reporting centre said 23 percent of incidents submitted were carried out online. That figure includes cons such as fraudulent adverts, ticketing scams and dodgy online auctions.

Here’s an idea. How about educating consumers about fraud tactics? Nothing will dissuade the perpetrators better than wasted effort.

Isn’t taking down ads censorship? I get it; maybe this is different? Because in this case, isn’t Google able to decide what to filter? Is this also in cooperation with a government? Interesting.

Stephen E Arnold, July 7, 2011

The addled goose is the author of The New Landscape of Enterprise Search

Are Webinars the Backbone of Concept Searching Marketing?

July 5, 2011

On the surface, Concept Searching looks like some of the other analytics companies that assert steady growth. What is interesting is that while some value-adding software companies treat webinars, online lectures, and demos as one component of a broader marketing program, Concept Searching seems to rely heavily on webinars. We find this interesting.

We looked into one search company which was using Twitter to make the text processing service a hot trend. From our vantage point, it seems that Concept Searching is using social media in a more modest way.

Though it sounds like Spiderman should be involved, a webinar is simply an online seminar or workshop. The great thing about a webinar is that it is usually interactive and allows all participants to give, receive, and discuss the topics at hand. Additionally, geographical boundaries are not an issue and these presentations are very low in cost.

When perusing Concept Searching’s Web site, you will find an entire events page dedicated to their upcoming exhibitions and a list and description of their current webinars. Some titles include: “Designing Information Architecture for SharePoint: Making Sense in a World of SharePoint Architecture”  and “De-mystifying Content Types: Four Key Content Types of Leverage.” You simply register and voilà, you join in on all the fun. They also have a page dedicated to previously recorded webinars that you can access at your leisure.

I moderate webinars for a couple of outfits, and these are often expensive programs. There is time, often lots of time, required to prepare the text, create the graphics and demos, and then build an audience. I participate in webinars when I am paid to do so. Otherwise, I do not participate in webinars. The reason is that I am receiving inputs, experiencing interruptions even when the door is closed, and working to respond to ad hoc requests from clients.

I do think that webinars are somewhat more useful than attending certain conferences. Over the last couple of years, conferences are more like fraternity and sorority parties. But that perception may be a function of my age and distaste for rock and roll, mixed media events with lots of 20 somethings opining about social media and organic search. Yikes, digital bonsai.

This leads me to the question, “Who has time to participate in webinars?” If these are buyers of high end solutions, great. However, if I were the boss of a company where webinars consumed staff time, I would be asking some questions about the efficacy of the method.

I find reading a Web page and using an online demo or downloading code useful. Webinars may be too zippy for an old goose like me. One thing for sure: lots of companies are using webinars to hold down the cost of on site sales calls and getting individuals “interested” in a product or service to cough up an email address.

Stephen E Arnold, July 5, 2011

Sponsored by Pandia.com, publishers of the New Landscape of Enterprise Search

Security and Open Source: A Delicate Mix Requires a Deft Hand

July 5, 2011

I recall reading a very unusual write up with a “learning” hook. The story was “The 10 Worst Cloud Outages (and What We can Learn from Them).” The article makes lemonade from the faults at Amazon, whose service is a combination of home grown, open source, and commercial software. The lemonade bucket is full because the same recipe is used for cloud outages at Microsoft Sidekick, Google’s Gmail (with no reference to the Blogger.com crash during this year’s Inside Search conference which focused on cloud stuff), Microsoft’s Hotmail issues, Intuit’s flubs, Microsoft’s business productivity online standard suite stumbles, Salesforce.com’s outage, Terremark’s troubles, PayPal’s hiccups, and Rackspace’s wobblies.

What the article taught me was that this cloud stuff is pretty difficult even for folks with deep pockets, lots of engineers, and oodles of customers who swallow the pitch hook, line, and sinker.

My hope is that US government funding of research into the use of open source software for security applications can route around cloud dependencies. See “DHS, Georgia Tech Seek to Improve Security with Open Source Tools.” The article said:

Although parts of the government, such as the Defense Department, have embraced open-source software for a variety of applications, many agencies still view it as suspect. As a resource, Davis hopes HOST will help to dispel the “hippie in the basement” view of open-source programs — that it’s cobbled together by enthusiasts rather than teams of professional programmers. The advantage of open-source software is that users can vet the source code themselves to make an application more secure. “Having something in a cellophane wrapped box doesn’t make it safer,” he said.

A combination of cloud technology and open source might prove the undoing of a well conceived program based on open source technology. Intertwining the cloud and open source tools for security might create an interesting and difficult to troubleshoot situation. Let’s hope the approach delivers lemonade with just the right amount of sugar, not a sour concoction.

Stephen E Arnold, July 5, 2011

From the leader in next-generation analysis of search and content processing, Beyond Search.
