Google Nails Duplicate Detection Invention

December 3, 2009

I know that most of my two or three readers do not give a goose feather for duplicate detection. Pretty boring stuff. Google result lists seem to be just one list with few repeated items. Even in the Google News service, identical stories rarely slip through the digital net.

The ever reliable USPTO has granted a patent to the Google for its duplicate detection method. If you want to know a bit more about the Google approach, you will want to download US7,627,613, “Duplicate Document Detection in a Web Crawler System”. Before my pals at various search and content processing companies email me to explain that their duplicate detection is better, save that energy. No one at the Beyond Search goose pond is asserting “better”. The Google invention deals with scale, petabytes of digital crapola deduped quickly and reasonably effectively. The “scale” idea is one clue to Google’s technology. The challenges of scale are not well understood unless you have to figure out what to do with trillions of instances of digital crapola.

Google says in its glorious prose:

Duplicate documents are detected in a web crawler system. Upon receiving a newly crawled document, a set of documents, if any, sharing the same content as the newly crawled document is identified. Information identifying the newly crawled document and the selected set of documents is merged into information identifying a new set of documents. Duplicate documents are included and excluded from the new set of documents based on a query independent metric for each such document. A single representative document for the new set of documents is identified in accordance with a set of predefined conditions.

Notice what’s left out? Now read the patent document. Notice what’s left out? Google does not make explicit how these separate inventions interlock. Those interlocks are sort of important, particularly if you are a competitor and one of your 20 somethings says, “That’s obvious. I can code that up myself.” Scale. Remember scale. Remember that Google can convert speech to text and then dedupe those outputs too. Scale. Performance. Cost. Useful Google concepts all.
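For readers who want a rough feel for what the abstract describes, here is a minimal sketch of grouping newly crawled pages by shared content and picking one representative per group with a query independent score. It is not Google’s code; the fingerprint, the score, and the merge logic are my own illustrative stand-ins.

```python
import hashlib

def fingerprint(text: str) -> str:
    """Crude content fingerprint; the patent does not disclose Google's actual method."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()

class DuplicateGroups:
    """Groups documents sharing the same content fingerprint and keeps one
    representative per group, chosen by a query independent score (imagine
    a link-based importance value)."""

    def __init__(self):
        self.groups = {}  # fingerprint -> list of (url, score)

    def add(self, url: str, text: str, score: float) -> str:
        """Merge the newly crawled document into its group and return the
        group's current representative."""
        group = self.groups.setdefault(fingerprint(text), [])
        group.append((url, score))
        return max(group, key=lambda item: item[1])[0]

if __name__ == "__main__":
    dupes = DuplicateGroups()
    print(dupes.add("http://a.example/story", "Ducks land in the Kentucky pond", 0.2))
    print(dupes.add("http://b.example/mirror", "Ducks land in the  Kentucky pond", 0.9))
```

At Google scale the interesting part is doing this merge across billions of URLs without stalling the crawler, which is exactly the part the patent leaves fuzzy.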

Stephen Arnold, December 3, 2009

I wish to disclose to the National Constitution Center that I was not paid to write this essay with its implicit reference to the constitutional right of Google competitors to misunderstand the notion of “scale” in Google’s weird vocabulary.

Some Thoughts About Real Time Content Processing

December 2, 2009

I wanted to provide my two or three readers with a summary of my comments about real time content processing at the Incisive international online information conference. I arrived more addled than normal due to three mechanical failures on America’s interpretation of a joint venture between Albanian and Galapagos Airlines. That means Delta Airlines, I think.

What I wanted to accomplish in my talk was to make one point—real time search is here to stay. Why?

First, real time means lots of noise and modest information payload. To deal with lots of content requires a robust and expensive lineup of hardware, software, and network resources. Marketers have been working overtime by slapping “real time” on any software product conceivable in the hopes of making another sale. And big-time search vendors essentially ignored the real time information challenge. Plain vanilla search on content updated when the vendor decided was an easier game.

Real time can mean almost anything. In fact, most search and content processing systems are not even close to real time. The reason is that slowdowns can occur in any component of a large, complex content processing system. As long as the user gets some results, for many of the too-busy 30 somethings that is just fine. Any information is better than no information. Based on the performance of some commercial and governmental organizations, the approach is not working particularly well in my opinion.

Let me give you an example of real time. In the 1920s, America decided that no booze was good news. Rum runners filled the gap. The US Coast Guard learned that it could tune a radio receiver to a frequency used by the liquor smugglers. The intercepts were in real time, and the Coast Guard increased its interdiction rate. The idea was that a bad guy talked and the Coast Guard listened in real time even though there was a slight delay in wireless transmissions. The same idea is operative today when good guys intercept mobile conversations or listen to table talk at a restaurant.

The problem is that communications and content believed to be real time are not. SMS may be delivered quickly, but I have received SMS sent a day or more earlier. The telco takes considerable license in billing for SMS and delivering SMS. No one seems to be the wiser.

A content management system often creates this type of conversation in an organization. Jack: “I can’t find my document.” Jill: “Did you put it in the system with the ‘index me’ metatag?” Jack: “Yes.” Jill: “Gee, that happens to me all the time.” The reason is that the CMS indexes when it can or on a specific schedule. Content in some CMSs is not findable. So much for real time in the organization.

An early version of the Google Search Appliance could index so aggressively that the network was choked by the googlebot. System administrators solved the problem by indexing once a day, maybe twice a day. Again, the user perceives one thing and the system is doing another.

This means that real time will have a specific definition depending on the particular circumstances in which the system is installed and configured.

Several business sectors are gung ho for real time information.

Financial services firms will pay $500,000 for a single Exegy high-speed content processing server. When that machine is saturated, just buy another Exegy server. Microsoft is working on a petascale real time content processing system for the financial services industry, which will compete with such established vendors as Connotate and Relegence. But a delay of a millisecond or two can spoil the fun.

Accountants want to know exactly what money is where. Purchase order systems and accounts receivable have to be fast. Speed does not prevent accidents. The implosion of such corporate giants as Enron and Tyco makes it clear that going faster does not make information or management decisions better.

Intelligence agencies want to know immediately when a term on a watch list appears in a content stream. A good example is “Bin Ladin” or “Bin Laden” or a variant. A delay can cost lives. Systems from Exalead and SRA can handle this type of problem and a range of other real time tasks without breaking a sweat.
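To make the variant problem concrete, here is a toy matcher. The watch list, the regular expression, and the alert format are mine for illustration; production systems from these vendors use far richer name matching than a single hand-written pattern.

```python
import re

# Hypothetical watch list; real systems add transliteration tables, phonetic
# keys, and entity extraction rather than one regular expression per name.
WATCH_TERMS = {
    "bin laden": re.compile(r"\bbin\s+lad[ei]n\b", re.IGNORECASE),
}

def scan(stream):
    """Yield (term, line) for every watch list hit in an iterable of text lines."""
    for line in stream:
        for term, pattern in WATCH_TERMS.items():
            if pattern.search(line):
                yield term, line

if __name__ == "__main__":
    sample = ["routine chatter", "message mentions Bin Ladin and a meeting time"]
    for term, line in scan(sample):
        print(f"ALERT [{term}]: {line}")
```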

The problem is that there is no certifying authority for “real time”. Organizations trying to implement real time may be falling for a pig in a poke or buying a horse without checking to see if it has been enhanced at a horse beauty salon.

In closing, real time is here to stay.

First, Google, Microsoft, and other vendors are jumping into indexing content from social networks, RSS feeds, and Web sites that update when new information is written to their databases. Like it or not, real time links or what appear to be real time links will be in these big commercial systems.

Second, enterprise vendors will provide connectors to handle RSS and other real time content. This geyser of information will be creating wet floors in organizations worldwide.

Third, vendors in many different enterprise sectors will be working to make fresh data available. You may not be able to escape real time information even if you work with an inventory control system.

Finally, users—particularly recent college graduates—will get real time information their own way, like it or not.

To wrap up, “what’s happening now, baby?” is going to be an increasingly common question you will have to answer.

Stephen Arnold, December 2, 2009

Oyez, oyez, I disclose to the National Intelligence Center that the Incisive organization paid me to write about real time information. In theory, I will get some money in eight to 12 weeks. Am I for sale to the highest bidder? I guess it depends on how good looking you are.

Social, Real Time, Content Intelligence

November 29, 2009

I had a long talk this morning about finding useful nuggets from the social content streams. The person with whom I spoke was making a case for tools designed for the intelligence community. My phone pal mentioned JackBe.com, Kapow, and Kroll. None of these outfits is a household word. I pointed to services and software available from NetBase, Radian6, and InsideView.

What came out of this conversation were several broad points of agreement:

First, most search and content processing procurement teams have little or no information about these firms. The horizons of most people working in information technology and content processing are neither wide nor far.

Second, none of these companies has a chance of generating significant traction with their current marketing programs. Sure, the companies make sales, but these are hard won and usually anchored in some type of relationship or a serendipitous event.

Third, users need the type of information these firms can deliver. Those same users cannot explain what they need, so the procurement teams fall back into a comfortable and safe bed like a “brand name” search vendor or some fuzzy wuzzy one-size-fits-all solution like the wondrous SharePoint.

We also disagreed on four points:

First, I don’t think these specialist tools will find broad audiences. The person with whom I was discussing these social content software vendors believed that one would be a break out company.

Second, I think Google will add social content “findability” a baby step at a time. One day, I will arise from my goose nest and the Google will simply be “there”. The person at the other end of my phone call sees Google’s days as being numbered. Well, maybe.

Third, I think that social content is a more far-reaching change than most publishers and analysts realize. My adversary thinks that social content is going to become just another type of content. It’s not revolutionary; it’s mundane. Well, maybe.

Finally, I think that these systems—despite their fancy Dan marketing lingo—offer functions not included in most search and content processing systems. The person disagreeing with me thinks that companies like Autonomy offer substantially similar services.

In short, how many of these vendors’ products do you know? Not many, I wager. So what’s wrong with the coverage of search and content processing by the mavens, pundits, and azure chip consultants? Quite a bit, because these folks may know less about these vendors’ systems than they know about spoofing Google or sounding informed by repeating marketing lingo.

Have a knowledge gap? Better fill it.

Stephen Arnold, November 29, 2009

I want to disclose to the National Intelligence Center that no one paid me to comment on these companies. These outfits are not secret but don’t set the barn on fire with their marketing acumen.

Microsoft and News Corp.: A Tag Team of Giants Will Challenge Google

November 23, 2009

Government regulators are powerless when it comes to online. The best bet, in my opinion, is for large online companies to act as if litigation and regulator hand holding was a cost of doing business. While the legal eagles flap and the regulators meet bright, chipper people, the business of online moves forward.

The news that News Corp. and Microsoft are talking, reported in “Microsoft Offers To Pay News Corp To ‘De-List’ Itself From Google” and by other “experts”, suggests these two giants want to form a digital World Wrestling Federation tag team. In the “fights” to come, these champions, Steve Ballmer and Rupert Murdoch, will take on the unlikely upstarts, Sergey the Algorithm Guy and Larry the Math Whiz.


Which of these two tag teams will grace the cover of the WWF marketing collateral? What will their personas become? Source: http://www.x-entertainment.com/pics5/wwe11click.jpg

The idea is to “pull” News Corp. content from Google or make Google pay through its snout for the right to index News Corp. content. The deal will probably encompass any News Corp. content. Whatever Google deal is in place with News Corp. would be reworked. News Corp., like other traditional media companies, is struggling to regain its revenue traction.

For Microsoft, a new wrestling partner makes sense. Bing is gaining market share, but at the expense of Yahoo’s search share. Microsoft now faces Google’s 1,001 tiny cuts. The most recent is the browser-based operating system. There is the developer problem, with Microsoft’s former employees rallying the Google faithful. There’s the pesky Android phone thing that went from a joke to a coalition of telephone-centric outfits. There’s the annoyance of Google in the US government. On and on. No single Google nick has to kill Microsoft. Nope. Google just needs to let a trickle of revenue slip away from the veins of Microsoft. The company’s rising blood pressure will do the rest. Eventually, the losses from the 1,001 tiny cuts will force the $70 billion Redmond wrestler to take a break. That “rest” may be what gives Google the opportunity to do significant damage with its as-yet-unappreciated play for the TV, cable, and independent motion picture business. Silverlight 4.0 may not be enough to overcome the structural changes in rich media. That’s real money. Almost as much as the telephony play promises to deliver to the somewhat low-key team of Sergey the Algorithm Guy and Larry the Math Whiz.


Sergey the Algorithm Guy and Larry the Math Whiz take a break from discussing the Kolmogorov-Smirnov test of normality. Training is tough for this duo. Long hours of solitary computation may exhaust the team before it tackles the Ballmer-Murdoch duo, which may be the most dangerous opponent the Math Guys have faced.

I look forward to the fight promoter pulling out all the stops. One of the Buffers will be the announcer. The cut man will be the master, Stitch Duran. The venue will be Las Vegas, followed by other world capitals of money, power, and citizen concern.

Nicholas Carlson reported:

Still, if News Corp were to “de-list” from Google, we’d expect to see all kinds of ads touting Bing as the only place to find the Wall Street Journal and MySpace pages online. Maybe that’d swing search engine share some, but we doubt it.

Read more

Google and Artificial Anchors

November 20, 2009

Folks are blinded by Chrome and may miss what is often overlooked: Google’s plumbing. Once you have tired of the shiny, bright chatter about Microsoft’s latest reason for its fear and loathing of Google, you may want to navigate to the USPTO and download 20090287698, “Artificial Anchor for a Document.” Google said:

Methods, systems, and apparatus, including computer program products, for linking to an intra-document portion of a target document includes receiving an address for a target document identified by a search engine in response to a query, the target document including query-relevant text that identifies an intra-document portion of the target document, the intra-document portion including the query relevant text. An artificial anchor is generated, the artificial anchor corresponding to the intra-document portion. The artificial anchor is appended to the address.
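Here is a rough sketch of the general idea, offered as illustration rather than as the claimed method: locate the query relevant passage in the target document and append an artificial anchor that points at it. The “#snippet=” fragment name is my own placeholder, not the scheme in the filing.

```python
from urllib.parse import quote

def artificial_anchor(url: str, document_text: str, query: str, window: int = 60) -> str:
    """Append a fragment identifying the intra-document portion that contains
    the query terms. The fragment syntax here is purely illustrative."""
    position = document_text.lower().find(query.lower())
    if position == -1:
        return url  # no query relevant portion found; return the plain address
    passage = document_text[position:position + window]
    return f"{url}#snippet={quote(passage)}"

if __name__ == "__main__":
    page = "Long introduction ... duplicate documents are detected in a web crawler system ..."
    print(artificial_anchor("http://example.com/page", page, "web crawler"))
```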

The system and method has a multiplicity of uses, and these are spelled out in Googley detail in the claims made for this patent application. In this free Web log, I won’t dive into the implications of artificial anchors. I will let you don your technical scuba gear and surf on the implications of artificial anchors. Chrome is the surface of the Google ocean. Artificial anchors are part of the Google ocean. Big, big difference.

Stephen Arnold, November 21, 2009

I want to disclose to the USPTO itself that no one paid me to be cryptic in this article.

MarkLogic Tames Big Data

November 20, 2009

I spent several hours at the MarkLogic client conference held in Washington, DC, on November 18, 2009. I was expecting another long day of me-too presentations. What a surprise! The conference attracted about 250 people and featured presentations by a number of MarkLogic customers and engineers. There were several points that struck me:

First, unlike the old-fashioned trade show, this program was a combination of briefings, audience interaction, and informal conversations fueled by genuine enthusiasm. Much of that interest came from the people who had used the MarkLogic platform to deliver solutions in very different big data situations. Booz, Allen & Hamilton was particularly enthusiastic. As a former laborer in the BAH knowledge factory, I know that enthusiasm originates in one place—the client. BAH professionals are upbeat *only* when the firm’s customers are happy. BAH described using the MarkLogic platform as a way to solve a number of different client problems.


MarkLogic’s platform applied to an email use case caught the attention of audiences involved in certain types of investigative and data forensics work. Shown is the default interface, which can be customized to the licensee’s requirements.

Second, those in the audience were upfront about their need to find solutions to big data problems—scale, analytics, performance. I assumed that those representing government entities would be looking for ways to respond to President Obama’s mandates. There was an undercurrent of responding to the Administration, but the imperative was the realization that tools like relational databases were not delivering solutions. Some in the audience, based on my observations, were actively looking for new ways to manipulate data. In my view, the MarkLogic system had blipped the radar in some government information technology shops, and the people with problems showed up to learn.

Read more

Google Books, The Nov 14 Edition

November 15, 2009

If you were awake at 11:54 pm Eastern time, you would have seen Google’s “Modifications to the Google Books Settlement.” Prime time for low profile information distribution. I find it interesting that national libraries provided Google an opportunity to do their jobs. Furthermore, despite the revisionism in Sergey Brin’s New York Times editorial, the Google has been chugging away at Google Books for a decade. With many folks up in arms about Google’s pumping its knowledge base and becoming the de facto world library, the Google continues to move forward. Frankly, I am surprised that it has taken those Google users so long to connect Google dots. Google Books embraces more than publishing. Google Books is a small cog in a much larger information system, but the publishing and writing angles have center stage. In my opinion, looking at what the spotlight illuminates may be the least useful place toward which to direct attention. Maybe there’s a knowledge value angle to the Google Books project? You can catch up with Google’s late Friday announcement and enjoy this type of comment:

The changes we’ve made in our amended agreement address many of the concerns we’ve heard (particularly in limiting its international scope), while at the same time preserving the core benefits of the original agreement: opening access to millions of books while providing rights holders with ways to sell and control their work online. You can read a summary of the changes we made here, or by reading our FAQ.

Yep, more opportunities for you, gentle reader, to connect Google dots. What is the knowledge value to Google of book information? Maybe one of the search engine optimization experts will illuminate this dark corner for me? Maybe one of the speakers at an information conference will peek into the wings of the Google Information Theatre?

Stephen Arnold, November 15, 2009

I wish to report to the Advisory Council on Historic Preservation that I was not paid to point out that national libraries abrogated their responsibilities to their nations’ citizens. For this comment, I have received no compensation, either recent or historic. Historical revisionism is an art, not a science. That’s a free editorial comment.

The Google Treadmill System

November 12, 2009

The Google is not in the gym business. The company’s legal eagles find ways of converting wizard whimsy into patents. The tokenspace suite of patent documents does not excite the “Sergey and Larry eat pizza” style of Google watcher. For those who want to get a glimpse of the nuts and bolts in Google’s data management system, check out the treadmill invention by ace Googler, Jeffrey Dean. He had help, of course. The Google likes teams, small teams, but teams nevertheless. Here’s how the invention is described in US7,617,226, “Document Treadmilling System and Method for Updating Documents in a Document Repository and Recovering Storage Space from Invalidated Documents.”

A tokenspace repository stores documents as a sequence of tokens. The tokenspace repository, as well as the inverted index for the tokenspace repository, uses a data structure that has a first end and a second end and allows for insertions at the second end and deletions from the front end. A document in the tokenspace repository is updated by inserting the updated version into the repository at the second end and invalidating the earlier version. Invalidated documents are not deleted immediately; they are identified in a garbage collection list for later garbage collection. The tokenspace repository is treadmilled to shift invalidated documents to the front end, at which point they may be deleted and their storage space recovered.
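To give the quoted mechanism a bit of shape, here is a toy version: new document versions are appended at the tail, superseded versions are flagged invalid rather than removed, and space is reclaimed lazily once invalid entries reach the head. This is my simplification, not the tokenspace design itself; in the patent, still-valid documents are also shifted so that invalidated ones migrate toward the front.

```python
from collections import deque

class TreadmillRepository:
    """Toy repository: updates are appended at one end, stale versions are
    invalidated (not deleted), and storage is recovered only when invalid
    entries reach the other end."""

    def __init__(self):
        self.entries = deque()   # arrival order, oldest at the head
        self.latest = {}         # doc_id -> its current (valid) entry

    def update(self, doc_id, content):
        old = self.latest.get(doc_id)
        if old is not None:
            old["valid"] = False              # invalidate; defer space recovery
        entry = {"doc_id": doc_id, "content": content, "valid": True}
        self.entries.append(entry)            # insert at the "second end"
        self.latest[doc_id] = entry

    def reclaim(self):
        """Pop invalidated entries off the head; return how many were recovered.
        (The treadmilling step that re-appends valid head entries is omitted.)"""
        recovered = 0
        while self.entries and not self.entries[0]["valid"]:
            self.entries.popleft()
            recovered += 1
        return recovered

if __name__ == "__main__":
    repo = TreadmillRepository()
    repo.update("doc1", "version 1")
    repo.update("doc1", "version 2")   # version 1 stays in place, flagged invalid
    print(repo.reclaim())              # -> 1 entry's storage recovered
```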

There are some interesting innovations in this patent document. Manual steps to reclaim storage space are not the main focus. The big idea is that a digital treadmill allows the Google to perform some magic for content updates. The tokenspace is a nifty idea, but the Google has added the endless chain notion. Oh, and there is scale, compression, and access associated with the invention. You can locate the document at http://www.uspto.gov. In my opinion, the tokenspace technology is pretty important. Ah, what’s a tokenspace you ask? Sorry, not in the blog, gentle reader.

Stephen Arnold, November 11, 2009

I don’t think my AdSense check this month was intended for me to write a short blog post calling attention to a system and method that Google would prefer to remain off the radar. Report me to the USPTO. That outfit pushed the info via RSS to me. So, a freebie.

Google Pressures eCommerce Search Vendors

November 6, 2009

Companies like Dieselpoint, Endeca, and Omniture Mercado face a new competitor. The Google has, according to Internet News, “launched Commerce Search, a cloud-based enterprise search application for e-tailers that promises to improve sales conversion rates and simplify the online shopping experience for their customers.” For me the most significant passage in the write up was:

Commerce Search not only integrates the data submitted to Google’s Product Center and Merchant Center but also ties into its popular Google Analytics application, giving e-tailers an opportunity to not only track customer behavior but the effectiveness of the customized search application. Once an e-tailer has decided to give Commerce Search a shot, it uploads an API with all its product catalog, descriptions and customization requirements and then Google shoots back an API with those specifications that’s installed on the Web site. Google also offers a marketing and administration consultation to highlight a particular brand of camera or T-shirt that the retailer wants to prominently place on its now customized search results. It also gives e-tailers full control to create their own merchandising rules so that it can, for example, always display Canon cameras at the top of its digital camera search results or list its latest seasonal items by descending price order.
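The merchandising rules mentioned in that passage amount to a re-ranking layer on top of the result list. A hypothetical sketch of the idea follows; the field names and rule parameters are mine, not Google’s Commerce Search API.

```python
# Illustrative only: a tiny merchandising layer that pins a brand to the top
# of results and orders items by descending price, as the quoted examples suggest.
results = [
    {"title": "Nikon D3000", "brand": "Nikon", "price": 499},
    {"title": "Canon EOS Rebel", "brand": "Canon", "price": 599},
    {"title": "Canon PowerShot", "brand": "Canon", "price": 229},
]

def apply_rules(items, pinned_brand=None, price_descending=False):
    """Re-rank search results according to simple merchandising rules."""
    if price_descending:
        items = sorted(items, key=lambda r: r["price"], reverse=True)
    if pinned_brand:
        # Stable sort: the pinned brand floats to the top, relative order preserved.
        items = sorted(items, key=lambda r: r["brand"] != pinned_brand)
    return items

for r in apply_rules(results, pinned_brand="Canon", price_descending=True):
    print(r["title"], r["price"])
```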

Google’s technical investments in its programmable search engine, context server, and shopping cart service chug along within this new service. Google’s system promises to be fast. Most online shopping services are sluggish. Google knows how to deliver high speed performance. Combining Google’s semantic wizardry with low latency results puts some of the leading eCommerce vendors in a technology arm lock.

Some eCommerce vendors have relied on Intel to provide faster CPUs to add vigor to older eCommerce architectures. There are some speed gains, but Google delivers speed plus important semantic enhancements that offer other performance benefits. One example is content processing. Once changes are pushed to Google or spidered by Google from content exposed to Google, the indexes update quickly. Instead of asking a licensee of a traditional eCommerce system to throw hardware at a performance bottleneck or pay for special system tuning, the Google just delivers speed for structured content processed from the Google platform.

In my opinion, competitors will point out that Google is inexperienced in eCommerce. Google may appear to be a beginner in this important search sector. Looking more deeply into the engineering resources responsible for Commerce Search, however, one finds that Google has depth. I hate to keep mentioning folks like Ramanathan Guha, but he is one touchstone whose deep commercial experience has influenced this Google product.

How will competitors like Dieselpoint, Endeca, and Omniture Mercado respond? The first step will be to downplay the importance of this Google initiative. Next, I expect to learn that Microsoft Fast ESP has a better, faster, and cheaper eCommerce solution that plays well with SharePoint and Microsoft’s own commerce server technology. Finally, search leaders such as Autonomy will find a marketing angle to leave Google in the shadow of clever positioning. But within a year, my hunch is that Google’s Commerce Search will have helped reshape the landscape for eCommerce search. Google may not be perfect, but its products are often good enough, fast, and much loved by those who cannot imagine life without Google.

Stephen Arnold, November 6, 2009

I want to disclose to the Department of the Navy that none of these vendors offered me so much as a how de doo to write this article.

Metadata Now Fair Game

November 2, 2009

The US legal system has spoken. I saw the ZDNet UK story “Watch Out, Your Metadata Is Showing” and chuckled. Not long ago in goose years, legal eagles realized that the Word fast save function preserved text that had once been in a document. Sending a document with fast save activated could allow the curious to see bits and pieces believed to have been deleted. Exciting stuff. Now the Arizona Supreme Court, according to Simon Bisson and Mary Branscombe, “has decided that the metadata of a document is governed by the same rules as the document.” With value-added indexing coming to most SharePoint systems, there will be some interesting discussions about which metadata belong to the document and which metadata are part of another, broader system. If you read vendors’ enthusiastic descriptions of what their smart software will assign to documents, users, and system processes, you will enter into an interesting world. How exciting will it be? Consider a document that has metadata such as date of creation, file format, and the name of the author. Now consider a document that has metadata pertaining to the “aboutness” of a document, who looked at the document, who made which change and when, and who opened the document and for how long. Interesting stuff in my opinion. The courts will be entering data space soon, and I think that journey will be difficult. Next up? A metadata specialist at your local Top 10 law firm. Get your checkbook ready.
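To make that distinction concrete, here is a small sketch contrasting plain file metadata with the kind of value-added metadata a smart indexing layer might attach. The field names and the enrichment examples are illustrative assumptions, not any vendor’s schema.

```python
import datetime
import os

def basic_metadata(path):
    """The plain, everyone-expects-it metadata: format, dates, size."""
    info = os.stat(path)
    return {
        "file_format": os.path.splitext(path)[1],
        "modified": datetime.datetime.fromtimestamp(info.st_mtime).isoformat(),
        "size_bytes": info.st_size,
    }

# Hypothetical value-added metadata of the kind discovery teams may now argue over:
# "aboutness", who viewed the document, who changed what and when.
enriched_metadata = {
    "aboutness": ["metadata", "discovery", "records management"],
    "viewed_by": [("jsmith", "2009-11-01T09:14:00"), ("counsel", "2009-11-02T16:40:00")],
    "edits": [{"user": "jsmith", "when": "2009-10-30", "change": "revised section 2"}],
}

if __name__ == "__main__":
    print(basic_metadata(__file__))
    print(sorted(enriched_metadata.keys()))
```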

Stephen Arnold, November 2, 2009

I say, no pay.

