Autonomy Upgrades Investigative System

November 15, 2008

Autonomy, based in Cambridge, England, continues to be one of the most agile information access and services companies. The firm has updated its Intelligent Investigator & Early Case Assessment software. You can read about the story here or visit the Autonomy Web site for more details. Autonomy asserts that its software can understand the meaning of large volumes of data collected in an investigation or similar procedure. Once the structured and unstructured data are processed, an investigator can use the Autonomy system:

to reconstruct what occurred, develop informed case strategies and sweep aside non-responsive data. A seamless link with Autonomy Legal Hold software automatically provides a legally defensible preservation and collection process.

Features of the investigative system include:

  • A case-centric view of the data. The idea is that an investigator can get a bird’s eye view of information, events, persons of interest, and time in a matter
  • A new feature to analyze data where it resides and provide answers to queries without building a collection and performing some of the manual tasks other systems require
  • A risk component
  • Enhanced entity extraction and alias identification

Other companies offer case management and investigative tools. Autonomy’s broad sweep of software and systems allows the company to provide a solution that can mesh with almost any organizational or legal requirement. Will Autonomy sweep the field in this market? I know the company will try. The challenge will be to convince investigative units and lawyers to try new methods. Investigators and lawyers can be like my grandmother, set in her ways. A number of search and content processing companies are looking closely at these specialized markets. When the economy goes south, legal activity goes north. Autonomy has demonstrated it knows which way the compass is spinning.

Stephen Arnold, November 15, 2008

Yasni: People Search

November 14, 2008

yasni, a people search engine, just launched in the U.S. If you’re on the web, yasni supposedly will find you. But the search is on first and last names, and there are lots of “Jessica Bratcher”s out there. My yasni search returned 30 results, including hits on amazon.com, Facebook, MySpace, Google News and Blogs, Technorati, even criminal searches. But for more listings, they’ll send me an e-mail list within 24 hours.

People search has been and remains very important. Zoom Info, LinkedIn, and other sites provide useful information. I have found Cluuz.com useful as well. Cluuz.com displays relationship charts. I did some ego surfing to test yasni and I ran the same queries on Cluuz.com. On Cluuz.com, I found an interview I did in 2005. Cluuz.com also surfaced several articles about newspaper awards I’ve received. On my test queries, I did not find yasni as useful. But it is early in the game for yasni. I will check back in a month or so to see how the service develops. I do recommend that you give it a whirl.

Jessica Bratcher, November 14, 2008

Webinar: Open Standards and Semantic Technology

November 14, 2008

The worldwide economic downturn bodes poorly for budgets to add more search technology to the enterprise, but the umbrella in this thunderstorm may be found in a movement quietly readying for a download launch. When will a standardized, semantic IT infrastructure be the basis of the enterprise’s entire IT framework for operations across all divisions?

There is a growing discussion in Europe, now spilling over into the US, regarding the SMILA project, the SeMantic Information Logistics Architecture. For more detail, click here or navigate to http://eccenca.broxblogs.de. This open source solution comes from a partnership of brox IT-Solutions and empolis in Germany, hosted through Eclipse.org.

Semantic Technologies

Semantic technologies continue to gain ground in discussions among researchers and companies investing in their own search frameworks across the organization, because it is unstructured data that remains the elephant in the room. Proponents in several large IT companies believe an answer is available in SMILA. Consider this white paper (in German; use translate.google.com): http://www.heise.de/open/Union-Investment-Integrationsplattform-auf-Basis-offener-Standards–/artikel/118395. The paper contends that:

“Open standards make applications more quickly realized and flawless.”

Eccenca is the commercial version for the enterprise, deployed with professional services and support. At brox, the company is building commercial-grade architecture and applications for the enterprise under the Eccenca Foundation, based on the SMILA codebase. Eccenca products will reflect expertise gained from existing customer engagements, including Theseus startups, Volkswagen, and others. See more information in the response to this blog’s recent discussion (November 4) at http://h3lge.de/weblog/. Eccenca.com and the first download of SMILA are anticipated in short order. At Eccenca.com, brox will set up and manage a marketplace for standards-based plug-ins, solutions, and expertise.

Webinar

A webinar in English to discuss this approach further is coming up on December 17, 2008. The seminar will run about one hour and take place at 8:00 am PST / 11:00 am EST / 4:00 pm GMT. The seminar will be given by Georg Schmidt (brox IT-Solutions) and Igor Novakovic (empolis). The title of the webinar is “SMILA – SeMantic Information Logistics Architecture.” This webinar will present the SMILA project (emphasizing the integration possibilities), provide a status report on the latest project developments, and give a short demonstration of currently implemented features.

The webinar will discuss the challenge posed by the exponentially growing amount and diversity of information, mainly unstructured data such as emails, text files, blogs, and images. Poor data accessibility, user rights integration, and the lack of semantic metadata are constraining factors for building next generation enterprise search and other document centric applications. Missing standards result in proprietary solutions with huge short- and long-term costs. SMILA is an extensible framework for building search solutions to access unstructured information in the enterprise. Besides providing essential infrastructure components and services, SMILA also delivers ready-to-use add-on components, such as connectors to the most relevant data sources. Using the framework as their basis will enable developers to concentrate on creating higher value solutions, such as semantically driven applications.
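The framework description suggests a familiar pattern: connectors pull records from data sources and hand each one to a chain of pluggable processing components. The sketch below illustrates that pattern in the abstract; the function names and record shapes are my own invention and do not reflect SMILA’s actual interfaces:

```python
def run_pipeline(connector, components):
    """Pull records from a data-source connector and pass each one
    through an ordered chain of processing components."""
    processed = []
    for record in connector():
        for component in components:
            record = component(record)
        processed.append(record)
    return processed

# Hypothetical connector and components for illustration only.
def file_connector():
    yield {"id": 1, "text": "An Email About Budgets"}

def lowercase(record):
    record["text"] = record["text"].lower()
    return record

def tokenize(record):
    record["tokens"] = record["text"].split()
    return record

docs = run_pipeline(file_connector, [lowercase, tokenize])
```

The appeal of this shape is that a new data source or a new enrichment step (say, an entity extractor) plugs in without touching the rest of the pipeline, which is the standardization argument in a nutshell.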

An article authored by Dawn Marie Yankeelov, president of ASPectx.

Google and Novel Content

November 13, 2008

On November 11, 2008, Google received a patent for the invention “Detecting Novel Content,” US7451120. In my opinion, this is an important Google invention. The system and method make it possible for Google to identify a segment of a document that contains interesting information. “Novel” is a code word for distinctive information. The abstract for the invention is:

A system determines an ordered sequence of documents and determines an amount of novel content contained in each document of the ordered sequence of documents. The system assigns a novelty score to each document based on the determined amount of novel content.

Let’s assume that Google uses this invention. What can the method deliver? My thought was a compilation of novel content on a user-specified subject. Traditional publishers cut and paste to create anthologies. In the 15th and 16th centuries, books that were collections of snippets were used to teach students Latin and Greek. Another possible use of the method would be to snip content from one document and place that snippet and its metadata into a dataspace.
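The abstract describes scoring each document in an ordered sequence by how much new material it contributes. As a thought experiment, here is a minimal sketch of that idea using word shingles; this is my own illustration of the general technique, not the method disclosed in the patent:

```python
def shingles(text, n=3):
    """Break text into overlapping word n-grams ("shingles")."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def novelty_scores(ordered_docs, n=3):
    """Score each document by the fraction of its shingles not seen
    in any earlier document in the sequence."""
    seen = set()
    scores = []
    for doc in ordered_docs:
        s = shingles(doc, n)
        if not s:
            scores.append(0.0)
            continue
        scores.append(len(s - seen) / len(s))
        seen |= s
    return scores
```

A document whose shingles have all appeared earlier in the sequence scores 0.0; a wholly new document scores 1.0, which is roughly the behavior an anthology-builder would need.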

Stephen Arnold, November 13, 2008

ISYS:web 9 Now Available

November 11, 2008

A happy quack to the reader in Colorado who alerted me to the new release of ISYS Search Software Version 9.0. I had a pre-release version, and I found that its speed and date features were particularly useful. According to ISYS Search Software:

ISYS:web 9 offers customers several major enhancements, all designed to deliver the speed, efficiency and accuracy required to find information fast. More importantly, ISYS has expanded its content mining capabilities using predictive and reliable methods that help customers better understand their content. Through its Intelligent Content Analysis, ISYS notes key characteristics about a content collection, such as metadata patterns and entities, and leverages these facets in the interface to provide a more fluid search and discovery process.

Among the new features are:

  • Intelligent Query Expansion. Designed to give users greater context and avenues to pursue, Intelligent Query Expansion offers suggestions based on your query and the document. For example, a search for “SharePoint” might suggest “SharePoint search web part.”
  • ContextCogs are snippets of relevant and contextual information pulled from third-party sources and displayed alongside standard ISYS results. When a search is executed, the query is also passed to each registered Cog, which could include enterprise-level applications, Internet search engines or Active Directory Contacts.
  • Intelligence Clouds enable rapid navigation of key information. The tag cloud appears as a collection of search terms and phrases, with the various terms shown in larger or smaller fonts depending on their density within the index.
  • Improved Performance and Scalability. ISYS:web handles most search requests concurrently with a higher throughput. Additionally, we’ve increased index capacity from 24 gigabytes to 384 gigabytes per index. With indexed data representing, on average, 10 to 20 percent of the total data size, ISYS can now index two to four terabytes of information per index.
  • Search Form Customization. ISYS now offers both automatic and custom designed search forms. For automatic search forms, users point the wizard at their indexes and ISYS creates a search form automatically by analyzing the content and structure of the information. ISYS also offers a point-and-click method for creating forms for searching structured information.
  • Index Biasing. ISYS told me that the company wanted to enhance ISYS:web’s tuning capabilities. “Tuning” in this context means giving administrators the ability to adjust the weighting on entire collections of documents. This option enables an organization to further tune relevance to suit specific situations; for example, boosting specific content across result sets.
  • De-Duplication. ISYS automatically identifies identical documents and either removes them from the results or visually marks them. This capability is of particular importance to legal professionals conducting discovery work, or any user attempting to conduct analysis of a given content collection.
  • ISYS:web Federator allows customers to federate their searches across both ISYS and non-ISYS content sources. ISYS:web displays results from each source separately, allowing users to navigate between the sets of results without compromising relevance.
  • Exchange Indexing. Particularly important for responding in a timely manner to discovery requests, ISYS:web enables administrators to centrally create and manage individual indexes for each user’s email account. Administrators can also opt to make these indexes available to end users, relying on Active Directory permissions to ensure users can only search the email indexes for which they are authorized.
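The de-duplication feature is easy to picture in the simplest possible terms: collapse identical documents by content hash, keep the first occurrence, and mark later copies. The sketch below is my own toy example of that general approach, not ISYS’s implementation:

```python
import hashlib

def dedupe(results):
    """Collapse identical documents in a result list, keeping the first
    occurrence and labeling later duplicates instead of dropping them."""
    seen = {}  # content digest -> id of first document with that content
    out = []
    for doc_id, text in results:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen:
            out.append((doc_id, text, "duplicate of " + seen[digest]))
        else:
            seen[digest] = doc_id
            out.append((doc_id, text, "unique"))
    return out
```

For discovery work, the "mark, don't drop" behavior matters: a reviewer can still see that the same memo appeared in three custodians' mailboxes.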

I ran several queries on the new system. You can read about my tests and examine a sample screen shot here. In my April 2008 study for the Gilbane Group, I identified ISYS Search Software as a “company to watch.” In fact, I highlighted the company in a lecture about enterprise search in 2009 here. For more information about the company, navigate to the ISYS Search Software Web site here. You can download a trial version of the software here. If you want to get a flavor for the company’s commitment to search, you may find the interview I conducted with Ian Davies, founder of ISYS Search Software, a way to understand the firm’s approach to information access. I conducted the interview in March 2008, but it is quite relevant today (November 11, 2008).

Stephen Arnold, November 11, 2008

Disturbing Data, Possible Parallel for Search

October 30, 2008

After wrapping up another section of my forthcoming monograph Google Publishing Technology for Infonortics Ltd. in Tetbury, England, I scanned the content sucked in by my crawlers. Another odd duck greeted me with the off-point headline “Outlook: Don’t Panic It’s Not 2001” here. (This is a wacky URL, so you may have to navigate to the parent site www.commsdesign.com and hunt for the author Bolaji Ojo.)

For me, one telling paragraph was:

In 2001, for instance, the wireline communications equipment market sank 18 percent to $69.6 billion, from $85.3 billion in the previous year. Semiconductor sales to the segment tumbled 37 percent on a combination of sagging demand and severe pricing declines. Seven years later, wired communications equipment sales have yet to recover to the 2000 level, and estimates indicate the market won’t bounce back fully until sometime in the next decade. ISuppli expects 2009 wired communications sales to be approximately $76.6 billion, improving from an estimated $72.5 billion in 2008, but still below the record 2000 figure of $85 billion.


Source: http://thesaleswars.wordpress.com/2008/02/

Another interesting point was:

The entire semiconductor market wasn’t as fortunate. Chip sales plunged 43 percent in 2001, to $101.8 billion from $178.9 billion in 2000, according to the Semiconductor Industry Association. The industry resumed growth in 2002, but it wasn’t until 2004 before global sales finally crawled past the previous record. By then, dozens of semiconductor, passives, interconnect and electromechanical companies and electronic manufacturing services providers had disappeared, some merging with stronger rivals. A few others went under, unable to finance operations as customers froze purchases or exited the embattled networking equipment market.

What these data suggested to me was that the search, content processing, and search enabled application sectors may face significant revenue declines and could take years to recover. The loss of companies that have no revenue is understandable. Funding sources may dry up or cut off the flow of money. Large firms may shed staff, but these vendors will, for the most part, remain in business. The real pressure falls on what I call “tweeners”. Tweeners are organizations that are in growth mode but the broader downturn can reduce their sales and squeeze the companies’ available cash. Slow payment from customers adds to the problem.

Read more

Amazon’s iTunes Like Interface

October 28, 2008

Amazon has developed a new interface. You can read the news story on TechCrunch here. The graphical presentation is intended to make it easier and more fun to browse Amazon’s products. Jason Kincaid’s article does a very good job of explaining the features of this interface. For me, the most important comment in the write-up was:

The site seems geared towards shoppers who are just looking for ideas, as there isn’t a search feature. Users can scroll through the site using their arrow keys, zooming in on individual products by hitting the spacebar. Each product includes a demo video (in the case of movies, songs, and video games) or an excerpt (from books).

I have often asserted that search is dead. I did not say that search was not useful. Amazon believes it has cracked the code on information retrieval without asking the user to type in the title of a book or an author’s name. Amazon wants to be a combination of Apple and Google. Amazon may have to keep trying to manage this transition.

Stephen Arnold, October 28, 2008

Exalead: Making Headway in the US

October 25, 2008

Exalead, based in Paris, has been increasing its footprint in the US. The company has expanded its US operation and now it is making headlines in information technology publications. The company has updated its enterprise search system CloudView. Peter Sayer’s “Exalead Updates Enterprise Search to Explore Data Cloud” here provides a good summary of the system’s new features. For me, the most important passage in the Network World article was this comment:

“Our approach is very different from Google’s in that we’re interested in conversational search,” he [the president of Exalead] said. That “conversation” takes the form of a series of interactions in which Exalead invites searchers to refine their request by clicking on related terms or links that will restrict the search to certain kinds of site (such as blogs or forums), document format (PDF, Word) or language.

Exalead’s engineering, however, is the company’s “secret sauce.” My research revealed that Exalead uses many of the techniques first pioneered by AltaVista.com, Google, and Amazon. As a result, Exalead delivers performance on content and query processing comparable to Google’s. The difference is that the Exalead platform has been engineered to mesh with existing enterprise applications. Google’s approach, on the other hand, requires a dedicated “appliance.” Microsoft takes another approach, requiring customers to adopt dozens of Microsoft servers to build a search enabled application.
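The “conversational” refinement Exalead describes can be pictured as successive filtering of a result set by facet values: each click narrows the candidates. The sketch below is a toy illustration under that assumption; the field names are invented and this is not Exalead’s API:

```python
def refine(results, **facets):
    """Narrow a result set by facet values (e.g. site type, document
    format, language), applying each facet filter in turn."""
    out = results
    for key, value in facets.items():
        out = [r for r in out if r.get(key) == value]
    return out

# Hypothetical result records for illustration.
results = [
    {"title": "A", "format": "PDF", "site": "blog"},
    {"title": "B", "format": "Word", "site": "forum"},
    {"title": "C", "format": "PDF", "site": "forum"},
]
pdf_forum_hits = refine(results, format="PDF", site="forum")
```

Each refinement click corresponds to another keyword argument, which is what gives the interaction its back-and-forth, conversational feel.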

On a recent trip to Europe, I learned that Exalead is working to make it easy for a licensee to process content from an organization’s servers as well as certain Internet content. Exalead is an interesting company, and I want to dig into its technical innovations. If I unearth some useful information, I will post the highlights. In the meantime, you can get a feel for the company’s engineering from its Web search and retrieval system. The company has indexed eight to nine billion Web pages. You can find the service here.

Stephen Arnold, October 25, 2008

Twine’s Semantic Spin on Bookmarks

October 25, 2008

Twine is a company committed to semantic technology. Semantics can be difficult to define. I keep it simple and suggest that semantic technology allows software to understand the meaning of a document. Semantic technology finds a home inside of many commercial search and content processing systems. Users, however, don’t tinker with the semantic plumbing. Users take advantage of assisted navigation, search suggestions, or a system’s ability to take a single word query and automatically hook the term to a concept or make a human-type connection without a human having to do the brain work.
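The “hook a term to a concept” behavior can be pictured as a lookup that expands a one-word query with related concepts before retrieval runs. The following toy sketch uses an invented concept map; it is not Twine’s or any vendor’s actual method:

```python
# A hypothetical concept map; the entries are invented for illustration.
CONCEPTS = {
    "jaguar": ["big cat", "Jaguar Cars"],
    "python": ["snake", "Python programming language"],
}

def expand_query(term):
    """Return the query term plus any concepts linked to it, so the
    search engine can match documents that never use the literal word."""
    return [term] + CONCEPTS.get(term.lower(), [])
```

The user still types one word; the semantic plumbing quietly widens the net, which is exactly why users never need to see it.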

Twine, according to the prestigious MIT publication Technology Review, is breaking new ground. Erica Naone’s article “Untangling Web Information: The Semantic Web Organizer Twine Offers Bookmarking with Built In AI” stops just short of a brass-band-enhanced endorsement but makes Twine’s new service look quite good. You must read the two-part article here. For me, the most significant comment was:

But Jim Hendler, a professor of computer science at Rensselaer Polytechnic Institute and a member of Twine’s advisory board, says that Semantic Web technologies can set Twine apart from other social-networking sites. This could be true, so long as users learn to take advantage of those technologies by paying attention to recommendations and following the threads that Twine offers them. Users could easily miss this, however, by simply throwing bookmarks into Twine without getting involved in public twines or connecting to other users.

Radar Networks developed Twine. The metaphor of twine reminds me of the trouble I precipitated when I tangled my father’s ball of hairy, fibrous string. My hunch is that others will think of twine as tying things together.

You will want to look at the Twine service here. Be sure to compare it to the new Microsoft service U Rank. The functions of Twine and U Rank are different, yet both struck me as sharing a strong commitment to sharing and saving Web information that is important to a user. Take a look at IBM’s Dogear. This service has been around for almost a year, yet it is almost unknown. Dogear’s purpose is to give social bookmarking more oomph for the enterprise. You can try this service here.

As I explored the Twine service and refreshed my memory of U Rank and Dogear, several thoughts occurred to me:

  1. Exposing semantic technology in new services is a positive development. The more automatic functions can be a significant time saver. A careless user, however, could lose sight of what’s happening and shift into cruise control mode, losing sight of the need to think critically about who recommends what and from where information comes.
  2. Semantic technology may be more useful in the plumbing. As search enabled applications supplant key word search, putting too much semantic functionality in front of a user could baffle some people. Google has stuck with its 1950s, white refrigerator interface because it works. The Google semantic technology hums along out of sight.
  3. The new semantic services, regardless of the vendor developing them, have not convinced me that they can generate enough cash to stay alive. The Radar Networks and Microsofts of the world will have to do more than provide services that are almost impossible to monetize. IBM’s approach is to think about the enterprise, which may be a better revenue bet.

I am enthusiastic about semantic technology. User facing applications are in their early days. More innovation will be coming.

Stephen Arnold, October 25, 2008

SurfRay Round Up

October 24, 2008

SurfRay and its products have triggered a large number of comments on this Web log. On my recent six-day trip to Europe, I was fortunate to be in a position to talk with people who knew about the company’s products. I also toted my Danish language financial statements along, and I was able to find some people to walk me through the financials. Finally, I sat down and read the dozens of postings that have accumulated about this company.

I visited the company on a trip to Copenhagen five or six years ago. I wrote some profiles about the market for SharePoint centric search, sent bills, got paid, and then drifted away from the company. I liked the Mondosoft folks, but I live in rural Kentucky. One of my friends owned a company which ended up in the SurfRay portfolio. I lost track of that product. I recall learning that SurfRay gobbled up an outfit called Ontolica. My recollection was that, like Interse and other SharePoint centric content processing companies’ technology, Ontolica put SharePoint on life support. What this means is that some of SharePoint’s functions work but not too well. Third party vendors pay Microsoft to certify one or more engineers in the SharePoint magic. Then those “certified” companies can sell products to SharePoint customers. If Microsoft likes the technology, a Microsoft engineer may facilitate a deal for a “certified” vendor. I am hazy on the ways in which the Microsoft certification program works, but I have ample data from interviews I have conducted that “certification” yields sales.


An Ontolica results list.

Why is this important? It’s background for the points I want to set forth as “believed to be accurate” so the SurfRay folks can comment, correct, clarify, and inform me on what the heck is going on at SurfRay. Here are the points about which comments are in bounds.

Read more
