Enterprise Search: Can Word Choice Rescue a Dogpaddling Business?
May 23, 2013
I read “Ontology Slays Data Integration and Ignites Semantic Search Revolution.” I found several things interesting about the write up.
First, there is the word choice: “slays,” “ignites,” and “revolution.” In case you have forgotten, an ontology is, according to the Catholic Encyclopedia:
Though the term is used in this literal meaning by Clauberg (1625-1665) (Opp., p. 281), its special application to the first department of metaphysics was made by Christian von Wolff (1679-1754) (Philos. nat., sec. 73). Prior to this time “the science of being” had retained the titles given it by its founder Aristotle: “first philosophy”, “theology”, “wisdom”. The term “metaphysics” (q.v.) was given a wider extension by Wolff, who divided “real philosophy” into general metaphysics, which he called ontology, and special, under which he included cosmology, psychology, and theodicy. This programme has been adopted with little variation by most Catholic philosophers. The subject-matter of ontology is usually arranged thus:
- The objective concept of being in its widest range, as embracing the actual and potential, is first analyzed, the problems concerned with essence (nature) and existence, “act” and “potency” are discussed, and the primary principles — contradiction, identity, etc. — are shown to emerge from the concept of entity.
- The properties coextensive with being — unity, truth, and goodness, and their immediately associated concepts, order and beauty — are next explained.
- The fundamental divisions of being into the finite and the infinite, the contingent and the necessary, etc., and the subdivisions of the finite into the categories (q.v.) substance and its accidents (quantity, quality, etc.) follow in turn — the objective reality of substance, the meaning of personality, the relation of accidents to substance being the most prominent topics.
- The concluding portion of ontology is usually devoted to the concept of cause and its primary divisions — efficient and final, material and formal — the objectivity and analytical character of the principle of causality receiving most attention.
My reaction? The use of the term ontology in the context of “slays,” “ignites,” and “revolution” seems a little frisky.
Second, the product referenced in the news release offers some relief. I find the explanation of the product in terms of what it is not quite interesting; to wit:
Ontology 4 is built to five key principles that separate it from traditional data integration technologies:
- No schema – Ontology uses a searchable, semantic model built on proven graph-based technology.
- No Integration – Ontology uses a semantic model to find and combine data relating to business entities fragmented across the enterprise.
- No Big Bang – Ontology’s semantic model embraces on-going changes while delivering value early and iteratively over the duration of a project.
- No Search Restriction – Ontology’s semantic search finds information across application data, documents and emails.
- No Upfront Risk. – No integration to data sources, No unnecessary tying up of team resources, No feasibility surprises, and No problem changing project requirements.
“The Internet is the world’s largest source of data, yet no one integrates it. They search it,” concluded Enweani. “So, when it comes to enterprise data, we say ‘Search, don’t Integrate.’”
Third, at two enterprise search summits in the last two weeks, the vendors engaged in the discipline demonstrated a strong shift away from the use of the word “search.” Substitutes included customer relationship management, discovery, search based applications, and similar distancing terms.
Perhaps more colorful word choice and the use of old style rhetorical flourishes will breathe life into a dogpaddling business sector. As one vendor, which recently experienced a CEO shuffle because the firm once again missed its numbers, put it, “We are now a platform.”
Will word choice deliver revenue? Investors hope so.
Stephen E Arnold, May 23, 2013
Sponsored by Augmentext
Facial Recognition Technology Is A WIP
May 23, 2013
Watch any crime-solving show on TV and the forensics department has facial recognition technology that can take a blurry photo and make it as clear as pure water. Sadly, Ars Technica points out that facial recognition technology is more fantasy than truth: “Why Facial Recognition Tech Failed In The Boston Bombing Manhunt.” The article points out the faults in facial recognition, citing how the suspected Boston bombers’ photos were in a database but cameras around the area failed to pick them up. The technology can work, but it almost needs the right person at the right time:
“Under the best circumstances, facial recognition can be extremely accurate, returning the right person as a potential match more than 99 percent of the time with ideal conditions. But to get that level of accuracy almost always requires some skilled guidance from humans, plus some up-front work to get a good image.”
Improved graphic quality and cloud computing make the process more reliable and accurate, even deployable to mobile devices. Multiple mobile devices with cameras from different angles can actually cobble together an image, but more cameras are not a solution. The current systems are not complex enough to handle it, but the technology is well on its way. Facial recognition is more science-fiction than reality. It exists, but only in the beta phase.
Whitney Grace, May 23, 2013
Sponsored by ArnoldIT.com, developer of Beyond Search
The Negative Side Of Enterprise Software
May 23, 2013
You like it, you hate it, you love it, you loathe it. These seem to be the common conceptions when it comes to enterprise software. Despite all the praise enterprise software has garnered, Glider takes a look at “Why Enterprise Software Sucks: 6 years Later,” a retrospective on an article from 2007.
Back in 2007, enterprise software’s biggest problem was that the software buyers were not the end users. The buyers just needed to fulfill the requirements, and a good user experience was optional. Fast forward to the present day and things are better…somewhat. Users are able to cut out the middleman and buy their own products, and the software is more user-friendly. Companies are still facing slow adoption of the better product. Why? They are running on legacy systems and are afraid to touch them in case they fail. Then there is the trust factor: companies hear about the next technology but are reluctant to try it. Once the crowd migrates over, so will everyone else.
Does enterprise software have a future? Yes, it does:
“The world at large is quickly growing accustomed to consumer internet (and mobile) applications. Everybody in the world is on Facebook. The average person has over 50 apps on their phone. It’s just a matter of time until they expect the same quality in the tools they use at work. The consumerization of enterprise will only grow stronger. The same can be said for bottom-up adoption.”
Enterprise software is wanted; the mentality of the users just has to change to adopt it. If enterprise is “back,” are there lessons in this article for vendors of search, content processing and analytics systems, aka the Big Data crowd? Or have they already learned from where enterprise software failed in the past?
Whitney Grace, May 23, 2013
Sponsored by ArnoldIT.com, developer of Beyond Search
LucidWorks Raises $10 Million in Capital
May 23, 2013
LucidWorks continues to raise capital, helping the company build and support open source software that empowers organizations to manage their multi-structured data. Venture Beat covers this latest round of venture capital in its story, “LucidWorks Pulls in $10M to Turn Open Source Data Into ‘Business Gold.’”
The article states:
“‘Big data’ startup LucidWorks has raised $10 million to help enterprise companies ‘turn multistructured data into business gold’ . . . According to a form filed with the SEC, existing investors Shasta Ventures, Granite Ventures, and Walden International contributed to this third round of funding. It brings LucidWorks’ total capital raised to $26 million.”
The company employs one-fourth of the committers on the Apache Lucene/Solr project, upon which their LucidWorks Search and LucidWorks Big Data offerings are built. Big customers include AT&T, Elsevier, Cisco, Nike, Sears, and Ford, among others. The company is truly doing well, and this additional capital will help improve their scope and reach. Their support offerings set them apart from the pack, and their investment in open source is sincere, sponsoring multiple training and development events across the country. If they stay on this path, good things will continue to happen to LucidWorks.
Emily Rae Aldridge, May 23, 2013
Sponsored by ArnoldIT.com, developer of Beyond Search
Phone Data Value And What Companies Are Doing With It
May 23, 2013
Smartphones are an extension of a person’s life, and they record that life every time they are used. Smithsonian Magazine takes a look at how phone companies are tracking and using the data from phones in “What Phone Companies Are Doing With All That Data From Your Phone.” Verizon Wireless is aware of the phone data goldmine and has added a new division called Precision Market Insights, and Telefonica is adding a new business unit, Telefonica Dynamic Insights, to do the same thing. Phone data is being used for market, medical, and social science research. The biggest usage is tracking how people move in real time. The data collected is supposed to remain anonymous, but it may not stay that way.
People can be tracked:
“But a study published in Scientific Reports in March found that even data made anonymous may not be so anonymous after all. A team of researchers from Louvain University in Belgium, Harvard and M.I.T. found that by using data from 15 months of phone use by 1.5 million people, together with a similar dataset from Foursquare, they could identify about 95 percent of the cell phone users with just four data points and 50 percent of them with just two data points. A data point is an individual’s approximate whereabouts at the approximate time they’re using their cell phone.”
People’s travel and cell phone patterns are repetitive and unique, making it easy to narrow down results to an individual user. Anonymity is a hard thing to achieve with a smartphone. To confuse the data, a person could get two mobile phones, but then does that increase the fun or increase the risk?
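The re-identification idea in the quoted passage can be sketched in a few lines of Python. This is a toy illustration on synthetic data, not the researchers’ method or dataset: the function names, the uniform random traces, and the sizes used are all assumptions made for the example. It counts how many users are singled out when an observer knows k (place, hour) points from a user’s trace.

```python
import random

def make_traces(n_users, n_points, n_cells, n_hours, rng):
    """Build synthetic traces: one set of (cell, hour) points per user.

    Real mobility data is repetitive rather than uniformly random,
    so this is only a stand-in for illustration.
    """
    traces = []
    for _ in range(n_users):
        points = set()
        while len(points) < n_points:
            points.add((rng.randrange(n_cells), rng.randrange(n_hours)))
        traces.append(points)
    return traces

def unique_fraction(traces, k, rng):
    """Return the fraction of users whose trace is the only one
    containing k points drawn at random from that user's own trace."""
    unique = 0
    for trace in traces:
        known = rng.sample(sorted(trace), k)
        # Count every trace (including the user's own) that matches.
        matches = sum(1 for other in traces if all(p in other for p in known))
        if matches == 1:
            unique += 1
    return unique / len(traces)

if __name__ == "__main__":
    rng = random.Random(0)
    traces = make_traces(200, 10, 20, 24, rng)
    for k in (1, 2, 4):
        print(k, unique_fraction(traces, k, rng))
```

The fractions printed depend entirely on the synthetic setup and should not be compared with the study’s figures; the point is only that each extra known data point shrinks the set of matching users.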
Whitney Grace, May 23, 2013
Sponsored by ArnoldIT.com, developer of Beyond Search
Demographics and Another Daunting Challenge for Search
May 22, 2013
I read “Pew: 94% Of Teenagers Use Facebook, Have 425 Facebook Friends, But Twitter & Instagram Adoption Way Up.” The main point is that Facebook has what I would call a monopolistic position when it comes to teens and their friends. I am not sure Facebook is the home run play in places like rural Chile, but where there is money, infrastructure, and gizmos, Facebook is on top.
The point which struck me is, “What happens when an outfit is on top?” Revenue accrues and so does attention.
The research which the write up summarizes contains an interesting factoid or two. For example, teens, if the data are correct, are shifting from online services which use words to online services which use pictures. (Will video be far behind?) Here’s the passage I noted:
Twitter and Instagram are far behind Facebook, with usage of 26% and 11%, but both have made impressive gains. Twitter was used by only 12% of teens in 2011 but more than doubled that to 26% in 2012. Instagram doesn’t appear to have been measured in 2011, so surveyed growth can’t be determined. But it comes in with an impressive third place at 11%.
Several observations are warranted.
First, search is somewhat of a disappointment when one tries to locate specific information in text form. Last night at dinner, a prominent New York attorney said, “It may just be me but I am having more difficulty finding exactly what I am looking for.” The problem bedevils quite a few people. I suggested that the prominent attorney hire a legal researcher. The prominent attorney replied, “I suppose I will have to.” Lesson: Finding information is getting more difficult, not easier. Keep in mind that the problem exists for words. Search is a challenge for some folks, and vendors have been trying to crack the code for 40, maybe 50 years.
Second, what information is embedded in digital images? What “metamessages” are teens sending when a snapshot is launched into the Twitter or Instagram world? More important, what search system is needed to locate and figure out the information in an image? My view is that geocoding and personal information may offer some important clues. But do we have a search system for these content repositories which works for the hapless attorney, a marketer, or a person looking for information about a runaway teen? In my view, not yet, and not by a long shot.
Third, is the shift from text to images by the teen demographic in the study sample a signal that text is losing its usefulness or relevance? That those entering the workforce in a few years are wedded to Tweets and snapshots may be an important cultural shift in some parts of the developed world.
The big question remains, “How will one find information to answer a question?” Text search is a problem. The brave new world hinted at in the Pew study poses more findability challenges. I am not sure the current crop of search and content processing systems can resolve the problem to my satisfaction. The marketers will assert the opposite. The reality is that findability will remain a central problem for the foreseeable future.
Search is most easily resolved by ignoring its problems or reducing the problem to predictive algorithms in a “mother knows best” approach to information. That may work for some, but not everyone.
Stephen E Arnold, May 21, 2013
Sponsored by Augmentext
LucidWorks to Participate in OSCON
May 22, 2013
OSCON, the Open Source Convention, will take place in Portland, Oregon in July. Themes of the conference include not just innovation and the exchange of ideas, but also how open source can give back to the community and support upcoming developers. This year, LucidWorks will support the conference. Read more on the LucidWorks Events page.
The event overview begins:
“OSCON is the best place on the planet to prepare for what comes next, from learning new skills to understanding how new and emerging open source technologies are going to impact how we live, work, and do business. In keeping with its O’Reilly heritage, OSCON is a unique gathering of all things open source, where participants find inspiration, confront new challenges, share their expertise, renew bonds to community, make significant connections, and find ways to give back to the open source movement. Erik Hatcher from LucidWorks will be presenting at the event.”
Stay tuned for more details about what Hatcher will present in his Solr Quick Start session. Attendees can expect information regarding installing and running Solr, indexing data, configuring schema, tuning and scaling, and more. LucidWorks offers some of the best value-added open source software with its LucidWorks Search and LucidWorks Big Data offerings. Perhaps more importantly, LucidWorks has a long track record of investing in open source development, training, and support, including employing one-quarter of the committers on the Apache Lucene/Solr project.
Emily Rae Aldridge, May 22, 2013
Sponsored by ArnoldIT.com, developer of Beyond Search
Yahoo Bids Goodbye To Microsoft
May 22, 2013
When Marissa Mayer took charge of Yahoo, she flipped the failing company upside down with strategic changes and she is about to make another one, says CNet in the article, “Yahoo Reportedly Looking To Dump Microsoft Search Pact.” Mayer has been unhappy with Yahoo’s partnership with Microsoft and has been searching for a way to end the arrangement.
Both companies made the deal in good faith:
“The two companies entered into a 10-year search partnership in 2010 in which Microsoft would power Yahoo search and Yahoo would become the sales force for Microsoft’s premium properties. However, the relationship hasn’t yielded the revenue-per-search guaranteed by the partnership, prompting Microsoft to extend the RPS guarantee for another year, Yahoo disclosed in a regulatory filing Tuesday.”
The partnership has failed to hit the RPS targets, and Microsoft keeps extending the guarantee in hopes of generating some profits. Mayer wants to grow Yahoo; she does not want the stagnation the deal is bringing. Yahoo still considers Microsoft an important partner, but back in 2008 Google courted Yahoo with an ad-search deal, and Google may come back. Yahoo will probably find a way out of the deal, and if the purpose is to make money, which Google is good at, Yahoo just might join the Google family. Is it time to drink the Kool-Aid?
Whitney Grace, May 22, 2013
Sponsored by ArnoldIT.com, developer of Beyond Search
Potentially New Web Page Data Mining Tool
May 22, 2013
Extracting content from a Web page can be a maddening process, requiring specialized scripts and time spent coding them. Taking a look at available tools, Softpedia touts “FMiner Pro 7.05.” FMiner Pro is advertised as a reliable application that allows users to easily handle Web content without scripts. The software can pull data from any page type, including https, plugins, JavaScript, and even complete data structures.
After the data is extracted much can be done with it:
“Extracted results can be saved to csv, Excel(xls), SQLite, Access, SQL Server, MySQL, PostgreSQL, and can specify the database fields’ types and attributes(eg, UNIQUE can avoid duplication of the extracted data). According to the setting, program can build, rebuild or load the database structure, and save the data to an existing database. Professional edition support incremental extraction, clear extraction and schedule extraction.”
FMiner Pro is available for a free fifteen-day trial to see how well it can perform. After viewing the specs, FMiner Pro is worth a shot. It can probably save coders hours of script writing, and organizing Web content is a tedious job no one likes to do. Having a program do it is preferable.
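The quoted note that UNIQUE “can avoid duplication of the extracted data” describes a general database technique, which can be sketched with Python’s standard sqlite3 module. This is not FMiner’s code or interface; the table name, column names, and URLs below are hypothetical, and the example only shows how a UNIQUE column keeps repeated extractions from piling up duplicate rows.

```python
import sqlite3

def save_rows(conn, rows):
    """Insert (url, title) rows, silently skipping URLs already stored.

    The UNIQUE constraint on url is what blocks the duplicates;
    INSERT OR IGNORE tells SQLite to skip the offending row rather
    than raise an error.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages (url TEXT UNIQUE, title TEXT)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO pages (url, title) VALUES (?, ?)", rows
    )
    conn.commit()
    # Report how many rows the table holds after this batch.
    return conn.execute("SELECT COUNT(*) FROM pages").fetchone()[0]

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    first = save_rows(conn, [
        ("http://example.com/a", "A"),
        ("http://example.com/b", "B"),
    ])
    # The second run repeats one URL; only the new one is added.
    second = save_rows(conn, [
        ("http://example.com/b", "B again"),
        ("http://example.com/c", "C"),
    ])
    print(first, second)
```

Running the same extraction twice against such a table leaves one row per URL, which is roughly what an incremental extraction needs.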
Whitney Grace, May 22, 2013
Sponsored by ArnoldIT.com, developer of Beyond Search




