Analysis of Google Wave

June 8, 2009

Radovan Semančík’s Weblog published “Storm Alert”, a quite interesting discussion of Google Wave. You should read his write up here. Among the points I noted were:

  1. Lack of security
  2. Inconsistent nomenclature
  3. Assumptions about performance when there are large numbers of users

For me, the most telling comment in this article was:

Google Wave architecture does not adhere to architectural best practice. It is not minimal. The robots are described to communicate with Wave by HTTP/JSONRPC (robot is server), Client apparently communicates by HTTP (as AJAX application?) , while the wave federation protocol is described as XMPP-based. Why do we need so many protocols? Is there any reason why robot protocol and client-server protocol needs to be different? The non-minimalistic approach can be seen in the OT operations as well. The antidocumentelementstart and endantidocumentelementstart operations seems redundant to me. If they are not redundant, their existence should be explained in the architectural documents.

Highly recommended. In my opinion, Wave is a variant of the dataspace technologies. As with the first Searchology, I think Google built a demo and rushed it out the door in order to blunt the PR impact of Microsoft Bing.com’s roll out.

Stephen Arnold, June 8, 2009

MarkLogic: The Shift Beyond Search

June 5, 2009

Editor’s note: I gave a talk at a recent user group meeting. My actual remarks were extemporaneous, but I did prepare a narrative from which I derived my speech. I am reproducing my notes so I don’t lose track of the examples. I did not mention specific company names. The Successful Enterprise Search Management (SESM) reference is to the new study Martin White and I wrote for Galatea, a publishing company in the UK. MarkLogic paid me to show up and deliver a talk, and the addled goose wishes other companies would turn to Harrod’s Creek for similar enlightenment. MarkLogic is an interesting company because it goes “beyond search”. The firm addresses the thorny problem of information architecture. Once that issue is confronted, search, reports, repurposing, and other information transformations become much more useful to users. If you have comments or corrections to my opinions, use the comments feature for this Web log. The talk was given in early May 2009, and the Tyra Banks example is now a bit stale. Keep in mind this is my working draft, not my final talk.

Introduction

Thank you for inviting me to be at this conference. My topic is “Multi-Dimensional Content: Enabling Opportunities and Revenue.” A shorter title would be repurposing content to save and make money from information. That’s an important topic today. I want to make a reference to real time information, present two brief cases I researched, offer some observations, and then take questions.

Let me begin with a summary of an event that took place in Manhattan less than a month ago.

Real Time Information

America’s Next Top Model wanted to add some zest to its popular television reality program. The idea was to hold an audition for short models, not the lanky male and female prototypes with whom we are familiar.

The short models gathered in front of a hotel on Central Park South. In a matter of minutes, the crowd began to grow. A police cruiser stopped, and the two officers found themselves watching a full fledged mêlée in progress, complete with swinging shoulder bags, spike heels, and hair spray. Every combatant was five feet six inches tall or shorter.

The officers called for the SWAT team but the police were caught by surprise.

I learned in the course of the nine months’ research for the new study written by Martin White (a UK-based information governance expert) and me that a number of police and intelligence groups have embraced one of MarkLogic’s systems to prevent this type of surprise.

Real-time information flows from Twitter, Facebook, and other services are, at their core, publishing methods. The messages may be brief, no more than 140 characters or about 12 to 14 words, but they pack a wallop.


MarkLogic’s slicing and dicing capabilities open new revenue opportunities.

Here’s a screenshot of the product about which we heard quite positive comments. This is MarkMail, and it makes it possible to take content from real-time systems such as mail and messaging, process that content, and use the resulting information to create opportunities.

Intelligence professionals use the slicing and dicing capabilities to generate intelligence that can save lives and reduce to some extent the type of reactive situation in which the NYPD found itself with the short models disturbance.

Financial services and consulting firms can use MarkMail to produce high value knowledge products for their clients. Publishing companies may have similar opportunities to produce high grade materials from high volume, low quality source material.


Microsoft Fast ESP Revealed

June 3, 2009

Need an enterprise search system? Have four months? Microsoft Fast Enterprise 360 is for you. You can read a case study of a lightning fast (no pun intended) search implementation in a new Microsoft Fast white paper called “Enterprise Search 360: Achieving a Single Search Platform across the Enterprise”. I had a bit of trouble locating the document, but I am an addled goose, and you, if my Web log usage analysis system is working, are a 40 something, proud, confident, and an expert in search. If you have a user name and password for ZDAsia, you can download it here. If you get a 404, shave the url, register, and search for “enterprise 360”. If you are persistent, you can snag this one page write up. Here are some keepers from this remarkable “white paper”:

  1. “National Instruments was quickly attracted to the FAST platform’s versatility, flexibility, and capability to expand. Inside four months the FAST Enterprise Search Platform solution was fully implemented by National Instruments’ team without the assistance and added cost of professional services personnel for the installation.” My thought, “Wow.”
  2. “FAST’s impact was apparent from the get-go.” My thought, “Wow again.”
  3. “When customers do seek out customer support, the FAST-supported online support request functionality gives application engineers vital information about the customers and their needs before the support conversation even begins.” My thought, “ESP. Extra sensory perception. I knew the meaning of the acronym and thought it meant enterprise search platform.”

I downloaded this document. Much food for thought and analysis.

Stephen Arnold, June 3, 2009

Digital Reef Makes Microsoft Fast Work

May 25, 2009

I puzzled over “Digital Reef Partners with FAST, Helps Manage SharePoint Content” in CMSWire here. The article covers a number of content functions that I try to keep separate; for example, unstructured data, “out of the box support for eDiscovery, compliance, Office SharePoint Server management, data security, and storage initiatives”, and analytic tools. Oh, I almost omitted manipulation of structured data. Who provides this happy family of services? Digital Reef. You can read more about this company here. The company asserts that it handles these different functions. My view is that the company knows how to tame SharePoint and implement Fast Search’s ESP “out of the box”. In my experience, prior to the acquisition of Fast Search & Transfer, implementing Fast ESP as an “out of the box” solution was time consuming, difficult, and required a Fast engineer with email and phone access to senior Fast Search wizards in Boston and Oslo. Dark days may be ahead for third-party vendors of alternatives to Microsoft SharePoint services.

Stephen Arnold, May 25, 2009

SirsiDynix Search Plus Discovery for Libraries

May 24, 2009

Brainware landed a deal to provide search and discovery to SirsiDynix. After a bit of poking around, I learned that SirsiDynix wanted to move beyond keyword search and provide users of its library systems with discovery functions. “Discovery”, as used in this sense, refers to giving a person looking for information easy-to-use methods to find related information and suggested information also germane to the user’s query. Endeca hooked up with Ebsco to provide “guided navigation” to Ebsco customers. Most online public access catalogs and library-centric search systems match the users’ query terms or force the user to search by entering an author’s name. Change, at long last, seems to be coming to the library for search of an institution’s textual information. I wrote about some of the Brainware system’s capabilities in my 2008 study “Beyond Search” for the Gilbane Group here. I also did a short write up about Brainware in this Web log in early 2008 here.

A reader alerted me to an announcement here that SirsiDynix will roll out an enhanced enterprise search and discovery system to over 30 libraries. You can read that announcement here. The system includes such features as:

  • Trigram analysis, or “fuzzy logic” which evaluates each trigram in a word to allow for typos, diacritics and more: a first in the library search and discovery market
  • “Did you mean” suggestions which are based on terms in the catalog (rather than a generic third-party dictionary)
  • Dynamic search suggestions
  • Delivery of saved searches through an RSS web feed
  • Email and print options for search results
  • Built-in “Library Favorites”
  • The capability for libraries to define their own “Favorites”, profiles, languages and filters.
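The trigram approach in the first bullet can be sketched in a few lines. This is a generic illustration of trigram matching, not Brainware’s actual algorithm; the boundary padding and the Jaccard scoring are my assumptions.

```python
# Generic sketch of trigram-based "fuzzy" matching. A word is broken
# into overlapping three-character sequences; two words are similar if
# their trigram sets overlap heavily, which tolerates typos and
# spelling variants without an exact match.

def trigrams(word: str) -> set[str]:
    """Return the set of overlapping three-character slices of a word."""
    padded = f"  {word.lower()} "  # padding weights the word boundaries
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two words' trigram sets (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)

print(similarity("catalog", "catalogue"))  # high overlap despite spelling
print(similarity("catalog", "zebra"))      # near zero for unrelated words
```

A real system would also normalize diacritics before computing trigrams, which is how accented and unaccented spellings can be made to match.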

You can test the Brainware-powered “Enterprise” service at the Wells County Public Library here.

The library market has been under severe price competition. This information sector is coming under more and more pressure from Google. The world’s largest search provider has been slowly expanding its services, including the controversial Google Books program. So far, specialized vendors of library information systems have been able to maintain their grip on today’s slippery economic one-lane highway. The impact of Google on this market will be interesting to observe.

Stephen Arnold, May 24, 2009

Autonomy Expands into Marketing

May 17, 2009

Attensity has been moving into marketing and marketing-related search applications. Autonomy, which has offered tools that provide insights into market behavior, announced at the eMetrics Marketing Optimization Summit a deal that indicates Autonomy is serious about this application of its search technology. Autonomy announced that Optimost Adaptive Targeting is now powered by Autonomy’s Meaning Based Marketing engine. Autonomy is showing agility in its leveraging of its Interwoven acquisition. The company said here:

Optimost Adaptive Targeting mines all major types of customer attributes to create customer segments, including context (how the visitor arrives at the Website, e.g. search keyword), geography, time of day, and demographic, behavior, and account profile information. Once customer segments are created, multivariate tests are conducted on an unlimited number of copy ideas, offers, and layouts to determine the best solution for each audience segment. By adding the Meaning Based Computing capabilities of IDOL to Adaptive Targeting, marketers gain a unique ability to understand and effectively serve their customers. By leveraging IDOL, Optimost Adaptive Targeting now includes unique keyword clustering capabilities that automatically identifies concepts and patterns as they are emerging on the web. For instance, an online pet store might discover that an unusually high number of “long-tail” searches relate to vacationing with pets. The solution could then automatically serve up more content and special offers around travel tote bags and kits.
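The per-segment multivariate testing the quotation describes boils down to tallying which variant converts best for each customer segment. Here is a minimal sketch of that selection step; the segments, variants, and numbers are entirely made up for illustration, and Autonomy’s actual engine is not public.

```python
# Hypothetical sketch of per-segment variant selection in a
# multivariate test. Each (segment, variant) pair maps to
# (impressions, conversions); we pick the highest-converting
# variant for each segment. All data below is invented.

from collections import defaultdict

results = {
    ("search:pet travel", "tote-bag offer"): (1000, 80),
    ("search:pet travel", "generic offer"):  (1000, 35),
    ("direct visit",      "tote-bag offer"): (1000, 20),
    ("direct visit",      "generic offer"):  (1000, 45),
}

def best_variant_per_segment(results):
    """Return the highest-converting variant for each segment."""
    by_segment = defaultdict(list)
    for (segment, variant), (imps, convs) in results.items():
        by_segment[segment].append((convs / imps, variant))
    return {seg: max(rates)[1] for seg, rates in by_segment.items()}

print(best_variant_per_segment(results))
# {'search:pet travel': 'tote-bag offer', 'direct visit': 'generic offer'}
```

The point of the example: the visitors who arrived via a pet-travel search get the travel offer, while direct visitors get a different winner, which is exactly the segment-by-segment behavior the press release claims.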

The addled goose predicts that other vendors of search and content processing technology will increase their efforts to blend search and content processing with online and traditional marketing functions. Google is active in online marketing, and could increase its presence in this sector quickly and without warning. A stampede may be forming on the search prairie.

Stephen Arnold, May 17, 2009

Wolfram Alpha and Niche Ville

May 17, 2009

I have been grabbing quick looks at the write ups about the Wolfram Alpha search system. One of the more interesting essays is by Larry Dignan. You can read “Wolfram Alpha Launches: Can It Break Out of Niche Ville?” here. I liked the idea of a “Niche Ville”. The phrase connoted a small town which may be interesting, but it is not likely to suck MBAs into its maw the way Manhattan does. Aside from the suggestion that Wolfram Alpha was a “hamburger and fries” type of search and content processing system, I found this comment quite suggestive:

Overall, Wolfram/Alpha reads like an encyclopedia. It’s handy at times, but the big question is whether the search engine can break out of niche-ville. Sure, geeks like the presentation and it Wolfram/Alpha can be handy for deep dives, but the average person will want some sort of results every time. In that regard, Wolfram/Alpha may be a disappointment.

My thought was, “Wolfram Alpha is like other question answering systems. It’s a bit like an advanced search function because the user has to do more thinking than typing ‘pizza’ into Google Maps. As a result, a small percentage of Web users may have the mental energy to tackle these systems.” Niche is a useful term.

Stephen Arnold, May 17, 2009

Text Analytics Data from Hurwitz and Associates

May 13, 2009

IT Analysis published Dr. Fern Halper’s “2009 Text Analytics Survey” here. The core of the essay was data from a longer Hurwitz and Associates study, which I have not seen. Based on the data in the article, you may want to get the full study. Two items jumped out at me.

First, customer and competitive intelligence were text analytics drivers. The also-ran? Compliance. Second, and more surprising, the common implementation model was software as a service.

Interesting data.

Stephen Arnold, May 13, 2009

Some Google in the White House

May 13, 2009

A month ago, I received a call from a journalist asking about the Obama White House’s uses of Google. I did not answer the question because big time journalists ask me questions, and I am not a public library reference desk worker any more.

One insight can be found here. Google said:

App Engine supports White House town hall meeting
In late March, the White House hosted an online town hall meeting, soliciting questions from concerned citizens directly through its website. To manage the large stream of questions and votes, the White House used Google Moderator, which runs on App Engine. At its peak, the application received 700 hits per second, and across the 48-hour voting window, accepted over 104,000 questions and 3,600,000 votes. Despite this traffic, App Engine continued to scale and none of the other 50,000 hosted applications were impacted. For more on this project, including a graph of the traffic and more details on how App Engine was able to cope with the load, see the Google Code blog.

How Googley is the Obama White House? Pretty Googley I hear.

Stephen Arnold, May 13, 2009


Google Time

May 13, 2009

Searchology strikes me as a forum for Google to remind journalists, the faithful, unbelievers, and competitors that the GOOG is the big dog in search. You can read dozens of reports about Google’s search enhancements. A good round up was “Google Unveils New Search Features” here. Don’t like AFP? Run this query on Google News and pick a more useful summary. For me, the key announcements had to do with time. The date of a document and the time of an event are important but different concepts. Time is a difficult problem, and Google’s announcements underscore the firm’s time expertise. Timelines? No problem. Date sort? No problem. For me what’s important is that time prowess is the tiny tip of much deeper underlying technical capabilities. The Google has some muscles it is just starting to flex.

Stephen Arnold, May 13, 2009
