High Speed Searching Possible with Jazzy Algorithm

May 16, 2012

The first in a series where the official Avadis NGS blog describes algorithms used in their product, “Elegant Exact String Match Using BWT” explains a fast algorithm used to perform an exact string match. The article acknowledges that many readers want to know why they needed another string matching algorithm.

Summarizing the problem, the article informs us that the issue lies in the need to match billions of short strings to a 3 billion character long text. Currently, the way sequencing technology works is too convoluted.

The article continues:

“We need an algorithm that allows repeatedly searching on a text as fast as possible. We are allowed to perform some preprocessing on the text once if that will help us achieve this goal. BWT search is one such algorithm. It requires a one-time preprocessing of the reference to build an index, after which the query time is of the order of the length of the query (instead of the reference).”
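To make the quoted idea concrete, here is a minimal Python sketch of exact matching by backward search over a Burrows-Wheeler transform. It is an illustration only, not the Avadis NGS implementation. The function names `build_index` and `find` are hypothetical, the suffix array is built naively, and the `$` sentinel is assumed not to occur in the text and to sort before every other character.

```python
from collections import Counter


def build_index(text):
    """One-time preprocessing: suffix array plus the tables backward search needs."""
    text += "$"  # sentinel: assumed absent from the text and smaller than all other characters
    # Naive suffix array (fine for a demo, far too slow for a 3 billion character reference).
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (text[-1] is the sentinel for i == 0).
    bwt = "".join(text[i - 1] for i in sa)
    # first[c] = number of characters in the text that sort strictly before c.
    counts = Counter(text)
    first, total = {}, 0
    for c in sorted(counts):
        first[c] = total
        total += counts[c]
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return sa, first, occ


def find(pattern, index):
    """Return the sorted start positions of pattern, one loop step per pattern character."""
    sa, first, occ = index
    lo, hi = 0, len(sa)  # half-open range of suffix-array rows that still match
    for c in reversed(pattern):
        if c not in first:
            return []
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


index = build_index("banana")
print(find("ana", index))  # [1, 3]
```

The loop in `find` runs once per query character and never scans the reference, which is the "order of the length of the query" behavior the quotation describes. Production tools replace the naive suffix array and the full `occ` table with compressed, sampled structures.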

Overall, the article does a solid job providing the contextual technology revolving around their product and explaining it with diagrams and concise language. It is a recommended read for those who would like to know more about this interesting idea for high speed searching.

Megan Feil, May 16, 2012

Sponsored by PolySpot

No Joke: SEO and Panda

April 1, 2012

Another lesson for the SEO mavens. Search Engine Journal presents “The One Question Google Panda has Taught Us to Ask Ourselves.” The question suggested by the article: “am I adding value?” Hmm, that’s actually a very good query, and one that could render the whole SEO field moot. Perhaps the Panda is working as designed; could it be?

Writer Eric Siu seems to think so, and that this is a good thing. He emphasizes:

The time wasted trying to figure out how to manipulate the system would probably be better spent on creating something remarkable for users. Besides, who doesn’t like the added benefit of engagement and new relationships from great content? From becoming a better writer to establishing your brand on other Web sites, the benefits are countless.

What a novel concept.

I wonder, though, how many search engine optimization professionals will take Siu’s advice. They have built their careers on gaming search engines with such tools (ploys) as keywords, anchor text, and link networks.

As Upton Sinclair famously declared, “It is difficult to get a man to understand something, when his salary depends on his not understanding it.”

Cynthia Murrell, April 1, 2012

Sponsored by Pandia.com

Excellence in Search or Anything Else Slipping?

January 25, 2012

US lawmakers are called out for being willing to sabotage American products in “American Corporate Software Can No Longer Be Trusted For Anything” at Falkvinge & Co. The arguments over SOPA have revealed representatives’ true colors, charges Rick Falkvinge. The article asserts:

In the debate around the American Stop Online Piracy Act, American legislators have demonstrated a clear capability and willingness to interfere with the technical operations of American products, when doing so furthers American political interests regardless of the policy situation in the customer’s country. Actually, it’s even worse: American legislators have demonstrated a willingness to do this just because of the different laws in the customer’s country, outside of the United States.

The ramifications of such an attitude may be greater than might appear at first glance; most of the world’s governments and global entities are built on hardware and software from American companies.

Other countries seem to be moving to curb their reliance on American technology. Extreme Tech announces, “Russia Building 10-Petaflop Supercomputer, Joins China in Search of Less US Tech Dependence.” Now, the Russian computer of this headline is still built on US hardware. However, the article mentions that China is now building a supercomputer entirely out of home-grown tech.

Hmm. Are these countries really trying to protect themselves from dependence on US made products? Maybe Russia and China are just moving ahead of the US. It isn’t always about us.

By the way, Russia’s Yandex is an excellent search engine for both Russian and English content.

Cynthia Murrell, January 25, 2012

Sponsored by Pandia.com

Facebook, Google, and Evil: Standard Operating Procedure?

January 23, 2012

One of the most over-used and little-understood words attached to online is “evil.” Long before Google, I was in a meeting at which ABI/INFORM announced per online type pricing. I think the person who described the decision to charge $0.25 per online type for Format 5 on Dialog was Martha Williams, one of the stalwarts of the online industry and a respected figure at the University of Illinois science and engineering libraries.

A tip of the trident to http://reinventingtheeventhorizon.wordpress.com/2011/10/06/midnight-in-the-garden-of-good-and-evil%E2%80%94mafia-style/

Evil, according to Dictionary.com–which is tough to use because of the ads for Zoho, InetSoft, and RingCentral–iterates through 10 definitions:

  1. morally wrong or bad; immoral; wicked: evil deeds; an evil life.
  2. harmful; injurious: evil laws.
  3. characterized or accompanied by misfortune or suffering; unfortunate; disastrous: to be fallen on evil days.
  4. due to actual or imputed bad conduct or character: an evil reputation.
  5. marked by anger, irritability, irascibility, etc.: He is known for his evil disposition.
  6. that which is evil; evil quality, intention, or conduct: to choose the lesser of two evils.
  7. the force in nature that governs and gives rise to wickedness and sin.
  8. the wicked or immoral part of someone or something: The evil in his nature has destroyed the good.
  9. harm; mischief; misfortune: to wish one evil.
  10. anything causing injury or harm: Tobacco is considered by some to be an evil.

Like many words in everyday use, evil can denote or connote different shades of meaning.

I thought about these 10 definitions after I read “Facebook to Google: Don’t Be Evil, Focus on the User.” The write up presents a respected real journalist’s report about information exchanged in a meeting. The main point of the write up describes a way to make Google work the way it did before the social bonus program kicked in and the Google Plus avalanche rumbled down the roof of the Googleplex.


Search Engine Optimization Billing

January 7, 2012

I saw a graphic which purports to answer the question, “How Much Does SEO Cost?” The guts of the write up is more along the lines of how a client pays for the allegedly high-value, must-have ministrations of SEO experts. Here’s an example:

cost-per-project is the most common pricing model and is offered by 70% of the agencies and consultancies surveyed. A monthly retainer was the second most common cost model offered (60%), followed by hourly rates at 55%.

The big summary of data explains what services the alleged experts offer the clients who pay. The bulk of the work appears to be involved in making recommendations and suggesting key words. Okay, librarians, are you on alert? SEO experts are recommending key words. I wonder if home economics majors, those skilled in political science, and various unemployed high school teachers are trained in indexing? MBAs? Hey, MBAs are born able to manage anything. Key words are a piece of cake. Just look at the indexing of Lehman Brothers’ and Bear Stearns’ content.

But the big factoid in the write up is the monthly retainer section. One learns that the fees are in the “buy a Toyota Camry” range; that is, from hundreds a month up to the $2,501 to $5,000 a month range. The use of blue bars without “real” numbers makes this observation suspect, but I concluded that with advisory services and some key word fiddling, a good salesperson could snag six or seven clients a month. Even at $2,000 per month, the enterprising SEO expert can move up to a baby Lexus.

Project pricing is, it appears, mostly in the $1,500 to $7,000 range. My hunch is that projects drag out over several time chunks. The hourly rate section pegs the experts in the $75 to $150 per hour range. Compared to blue chip consulting work or expert witness work, SEO experts are billing at a rate which probably keeps the lights on and maybe makes it possible to enjoy a holiday each year.

The infographic suggests that making a living as an SEO expert is possible, but probably not particularly easy. The chart is worth checking out if you are in the SEO game. There is no information about the productization of the alleged SEO services. That would be interesting to me.

By the way, the “real cost” of SEO is the friction added to the spending of Bing and Google to deal with the craziness, spoofing, and coding horrors the SEO clan visits on the hapless residents of rural Kentucky. Google’s Matt Cutts has a job because of SEO. SEO costs a great deal of money, and when I consider how relevance has become a thing of the past, SEO has consumed more dough than it has generated for those looking for on point information.

Stephen E Arnold, January 7, 2012

Sponsored by Pandia.com

January 2012 and a 2009 Meet Up: Spoof or Goof?

January 1, 2012

The idea of accuracy is on my mind. I did a quick look at what our Overflight service “saw” in the last eight hours, and I noticed “SEO Meet Up and Its Future Potential.” The source for the document is Ontosearch which has the subtitle “Ontology Search Engine.” Since I don’t know what an ontology is, I was interested in how I might search such a system.

Get your goof T shirt from Zazzle. Image source: http://www.zazzle.com/ya_dun_goofd_tshirt-235540199656793547

I noted this passage in a write up that seemed to be reporting on a meet up in Mumbai, India, in August 2009. Since it is now 2012, the idea of “news” flowing from an event held more than two years ago caught my attention. Here’s the passage I noted:

Keeping the potential of a SEO analyst in mind and in general the SEO vertical, a SEO meet up was organized in Mumbai on the 1st Aug 2009. Scores of SEO specialists, content experts, web designers etc. met to discuss the changing landscape of the web, and latest trends in the SEO services. This meet up was undoubtedly an eye opener for everybody and they left with a plethora of understanding. They also discussed the future of SEO. The web world has made a transition from the traditional Web 1. to Web 2.. And there are already talks of Web three. in the pipeline. The future is semantic indexing and collaborative development. A excellent SEO must have the flexibility to recognize and implement the nuances of making use of a semantic technology to link different sites and come across a way to promote his own. So adaptability and openness are going to the keys of Web 3.. Agility and continuous improvement would be the hallmark of Web three..

Hmmm. I think this is too sophisticated for an addled goose. Is this a spoof or goof? My view is that this is an example of content which looks as if it were the product of a person who graduated from a junior college. Then again, when an addled goose cannot figure out “agility”, I think we have another example of fancy words and meaning free content. Are Bing and Google fooled? I think so.

A quick review of other posts on the Web site reveals other write ups which baffle. If you are looking for information about a taxonomy, Pandia and ArnoldIT will publish in 2012 a monograph on the subject. No spoof, and we hope that we don’t goof. That’s a useful New Year’s resolution: Write about sources, ideas, and developments which sort of make sense “ontologically”, of course. I think it is time for content to “relevel up”, a phrase used by a political candidate.

But not for the owner of the domain in Timur, Indonesia.

Stephen E Arnold, January 1, 2012

Start Your Year with Your Content Radar On

January 2, 2011

I am concerned about the quality of information which appears in public Web search results. I was fooling around with queries for the “new” silver bullet, which is made of Fool’s Gold. You know this search revolution as taxonomy. Everyone wants a taxonomy because key word indexing usually disappoints the inept searcher. A taxonomy, therefore, is one way to allow a user to slam in a word and maybe get a “use for” or “broader term” to make the results more “relevant.”

But a taxonomy goes only so far. The depleted uranium bullet is one that uses “facets”, another faerie dust term. The hapless user clicks on a descriptor or bound phrase that is broader than a taxonomy entry and magic happens. The results will contain something even the junior college graduate can use.

There is a level above taxonomy and facets too. This is the Disneyworld of predictive search. The idea is that the “system knows best.” The user does not have to do much more than fire up the app or poke her nose against the touch pad’s icon and the system predicts and delivers the needed information. Sounds great.

The problem, gentle reader, is that indexing systems don’t know when the content is addled, wrong, shaped, or just chock full of crapola. Let me illustrate with two examples from an outfit with a Web site at JazdTech.com. Yep, “Jazd”, not “jazzed.” That’s a clue that I notice. Some search systems are not as picky.

I use the little known metasearch system Devilfinder.com. Be alert. Turn on “safe search.” Now run the query for taxonomy software vendors and in the results list you find these promising links:

image

There you go. “2011 Top Taxonomy Software Companies in Pharma.” Right on the money. The problem is that the results are not germane to anything remotely close to taxonomy software narrowed to pharmaceutical applications. When I clicked on the link on New Year’s Eve, I saw this Web page:

image

It looks okay but the links are useless and so far off the keywords I used for the query that I laughed out loud. Okay, a metasearch system can make mistakes.

I ran the query “2011 Top Taxonomy Software Companies” on Google and I was greeted with a display that contained not one or two entries to JazdTech.com’s lousy content but there were many listings.

image

After the ads that Google feeds upon were 11 hits to pages which contained irrelevant information which superficially looks like content.

What’s my point?

It is easy to run queries which return hits to Web pages which are like the sugar free candy for dieters. The goodies look like the real thing, but are not. That’s okay when fooling the snack addict. For online searching, users expect nutritious information.

JazdTech.com is one outfit benefiting from the indifference of “real” search and metasearch systems. The screenshot below contains lots of information which I find questionable. I can guard myself against most flawed Web content. Others may not be so equipped.

image

The domain is registered to an outfit called JAZD Markets, allegedly operating out of Hampstead, New Hampshire. There appears to be a reference to a street address in Andover, Massachusetts on Dundee Park Drive. The “service” is hosted on my favorite outfit Hostgator.com. The staff at JAZD Markets list themselves on LinkedIn, but provide modest information about the quality control in use for the firm’s software listings. Perhaps one purchases a listing and selects a category in which to appear? I will have to check out Firehouse BBQ and Pig Roast when I am next in Andover, a lovely place.

The problem is that some researchers may waste valuable time or use information that will make their search and retrieval cannon explode in their face.

Stephen E Arnold, January 1, 2012

Sponsored by Pandia.com
