LexisNexis: Riding the Patent Pony

April 25, 2015

Need patent information? Lots of folks believed that making sense of the public documents available from the USPTO was the road to riches. Before I kicked back to enjoy the sylvan life in rural Kentucky, I did some work on Fancy Dan patent systems. There was a brush with the IBM Intelligent Patent Miner system. For those who do not recall their search history, you can find a chunk of information in “Information Mining with the IBM Intelligent Miner Family.” Keep in mind that the write up is about 20 years old. (Please notice that the LexisNexis system discussed below uses many of the same time worn techniques.)

Patented dog coat.

Then there was the Manning & Napier “smart” patent analysis system with its analyses displayed in three-D visualizations. I bumped into Derwent (now Intellectual Property & Science) and other Thomson Corp. solutions as well. And, of course, there was my work for an unnamed, mostly clueless multi billion dollar outfit related to Google’s patent documents. I summarized the results of this analysis in my Google Version 2.0 monograph, portions of which were published by BearStearns before it met its thrilling end seven years ago. (Was my boss the fellow carrying a box out of the Midtown BearStearns’ building?)

Why the history?

Well, patents are expensive to litigate. For some companies, intellectual property is a revenue stream.

There is a knot in the headphone cable. Law firms are not the go go business they were 15 or 20 years ago. Law school grads are running gyms; some are Uber drivers. Like many modern post Reagan businesses, concentration is the name of the game. For the big firms with the big buck clients, money is no object.

The problem in the legal information business is that smaller shops, including the one and two person outfits operating in Dixie Highway type real estate, do not want to pay the $200 and up per search that commercial online services charge. Even when I was working for some high rollers, the notion of a five or six figure online charge elicited what I would diplomatically describe as gentle push back.

I read “LexisNexis TotalPatent Keeps Patent Research out of the Black Box with Improved Version of Semantic Search.” For those out of touch with online history, I worked for a company in the 1980s which provided commercial databases to LexisNexis. I knew one of the founders (Don Wilson). I even had reasonably functional working relationships with Dan Prickett and people named “Jim” and “Sharon.” In one bizarre incident, a big wheel from LexisNexis wanted to meet with me in the Cherry Hill Mall’s parking lot across from the old Bell Labs’ facility where I was a consultant at the time. Err, no thanks. I was okay with the wonky environs of Bell Labs. I was not okay with the lash up of a Dutch and British company.

Snippet of code from a Ramanathan Guha invention. Guha used to be at IBM Almaden and he is a bright fellow. See US7593939 B2.

What does LexisNexis TotalPatent deliver for a fee? According to the write up:

TotalPatent, a web-based patent research, retrieval and analysis solution powered by the world’s biggest assortment of searchable full-text and bibliographic patent authorities, allows researchers to enter as much as 32,000 characters (comparable to more than 10 pages of text)—much over along a whole patent abstract—into its search industry. The newly enhanced semantic brains, pioneered by LexisNexis during 2009 and continually improved upon utilizing contextual information supplied by the useful patent data offered to the machine, current results in the form of a user-adjustable term cloud, where the weighting and positioning of terms may be managed for lots more precise results. And countless full-text patent documents, TotalPatent in addition utilizes systematic, technical also non-patent literature to go back the deepest, most comprehensive serp’s.

Ontotext Pursues Visibility

April 23, 2015

Do you know Ontotext? The company is making an effort to become more visible. Navigate to “Vassil Momtchev talks Insights with the Bloor Group.” The interview provides a snapshot of the company’s history which dates from 2001. After 14 years, the interview reports that Ontotext “keeps its original company spirit.”

Other points from the write up:

  • The company’s technology makes use of semantic and ontology modeling
  • A knowledge base represents complex information and makes asking questions better
  • Semantic applications can deliver complete applications.

For more information about Ontotext and its “ontological” approach, visit the company’s Web site at www.ontotext.com.

Stephen E Arnold, April 23, 2015

Expert System Webinar: Sharepoint and Semantics Add Value for Users

April 20, 2015

Expert System offers a system capable of turbo-charging information access in SharePoint installations. The company has developed a fact-based webinar to demonstrate the power of Expert System’s semantic technology.

The company’s Cogito Connected for SharePoint features a document library, complete with metadata enrichment for files to increase their visibility as well as their content. The library will also be retained in SharePoint and be available for use by other files, and the accurate time and date of the most recent tagging will be captured for each file. Users will also be able to process multiple attachments in the Document List, and the search function is enhanced with fully integrated Web components.

With Cogito, users can locate content via a custom taxonomy, entities, or faceted search options. SharePoint users can locate needed information via point-and-click, eDiscovery, and traditional keyword search enriched with organization-specific metadata. Expert System’s Cogito allows users to browse content organized by topics, people, and concepts, which makes SharePoint more useful to a busy professional.

SharePoint is one of the most popular collaborative content platforms for enterprise systems, but like many proprietary software programs it has its limits. The good news is that companies like Expert System discover SharePoint’s weaknesses and create solutions to fix them.

Using its patented technology Cogito, Expert System addresses one of the main user concerns when looking for information housed in SharePoint. Cogito sharply reduces the difficulty of navigating and locating content in SharePoint. This problem stems from creators improperly tagging content or not tagging it at all.

In an exclusive interview, Maurizio Mencarini of Expert System had this to say about Expert System’s Cogito Connected for SharePoint:

“Cogito Connected for SharePoint addresses these two areas by providing the power of Cogito semantics to the application of consistent, automated tagging of SharePoint content. With the addition of fully integrated web parts that expose the granularity of content generated metadata, Cogito enhanced SharePoint optimizes the management of content for the SharePoint administrator. For the user, Cogito Connected for SharePoint significantly improves the SharePoint search experience by enhancing the search capabilities beyond the list to include faceted search including category, entity and topic.”

Expert System’s solution delivers a better SharePoint experience for the user and improves work productivity for employees, since they will be able to locate information more quickly. Expert System knows what many users don’t realize: the value of being able to locate and recognize content quickly. In this case, Expert System applied this knowledge to SharePoint, but it can be used for other programs in any field. On April 28, 2015 from 12:00 PM-1:00 PM EST, Expert System will host a free webinar called “Implementing a Better Search Experience” where attendees will “learn how to make SharePoint more than a place where you put documents and start transforming your collected knowledge in your collective knowledge.”

Expert System was founded in 1989 and its flagship product is Cogito. Solutions based on the Cogito software include semantic search, natural language search, text analytics, development and management of taxonomies and ontologies, automatic categorization, extraction of data and metadata, and natural language processing. Expert System is working on exciting new developments on everything from enterprise systems to security and intelligence.

Expert System wants to share its knowledge with users so they can have a better user experience, apply the knowledge to other areas, and, of course, make daily tasks simpler.

The new “Implementing a Better Search Experience” webinar will be offered on April 28, 2015, from 12 to 1 pm Eastern Time. You will learn how you can transform your organization’s collected knowledge into actionable collective knowledge.

Sign up for the April webinar at http://bit.ly/1FalGjH.

Stephen E Arnold, April 20, 2015

Yahoo: A Portion of Its Fantastical Search History

April 15, 2015

I have a view of Yahoo. Sure, it was formed when I was part of the team that developed The Point (Top 5% of the Internet). Yahoo had a directory. We had a content processing system. We spoke with Yahoo’s David Filo. Yahoo had a vision, he said. We said, No problem.

The Point became part of Lycos, embracing Fuzzy and his round ball chair. Yahoo, well, Yahoo just got bigger and generally went the way of general purpose portals. CEOs came and went. Stakeholders howled and then sulked.

I read or rather looked at “Yahoo. Semantic Search From Document Retrieval to Virtual Assistants.” You can find the PowerPoint “essay” or “revisionist report” on SlideShare. The deck was assembled by the director of research at Yahoo Labs. I don’t think this outfit is into balloons, self driving automobiles, and dealing with complainers at the European Commission. Here’s the link. Keep in mind you may have to sign up with the LinkedIn service in order to do anything nifty with the content.

The premise of the slide deck is that Yahoo is into semantic search. After some stumbles, semantic search started to become a big deal with Google and rich snippets, Bing and its tiles, and Facebook with its Like button and the magical Open Graph Protocol. The OGP has some fascinating uses. My book CyberOSINT can illuminate some of these uses.

And where is Yahoo in the 2008 to 2010 interval when semantic search was abloom? Patience, grasshopper.

Yahoo was chugging along with its Knowledge Graph. If this does not ring a bell, here’s the illustration used in the deck:

image

The date is 2013, so Yahoo has been busy since Facebook, Google, and Microsoft were semanticizing their worlds. Yahoo has a process in place. Again from the slide deck:

image

I was reminded of the diagrams created by other search vendors. These particular diagrams echo the descriptions of the now defunct Siderean Software server’s set up. But most content processing systems are more alike than different.

A Kettle of Search Fish

April 6, 2015

We have heard a lot about the semantic Web and search engine optimization (SEO), and both have the common thread of making information more accessible and increasing its use. One would think this would be the same kettle of fish, but sometimes it is hard to make SEO and the semantic Web work together for a platonic Web experience. On Slideshare.net, Eric Franzon’s “SEO Meets Semantic Web-Saint Patrick’s Day 2015-Meetup” tries to consolidate the two into one happy fish taco. The presentation tries to explain how the two work together, but here is the official description:

“Schema.org didn’t just appear out of thin air in 2011. It was built upon a foundation of web standards and technologies that have been in development for decades. In this presentation, Eric Franzon, Managing Partner of SemanticFuse provides an introduction to Semantic Web standards such as RDF and SPARQL. He explores who’s using them today and why (hint: it involves money), and takes a look at how Semantic Web, Linked Data, and schema.org are related.”

The problem with the presentation is that we do not have the audio to accompany it, but by flipping through the slides we can understand the general idea. The semantic Web is full of relationships that are connected by ideas and require coding and other fancy stuff to make it one big kettle. In fact, this appears to have too much of the semantic Web flavor and not enough of the SEO spice. One is a catfish for a fine meal and the other is a fish fry without the oil.

Whitney Grace, April 6, 2015
Stephen E Arnold, Publisher of CyberOSINT at www.xenky.com

Search Engine Optimization: Chasing Semantic Search

April 4, 2015

I have read a number of articles about search engine optimization (SEO) and Web search. From my point of view, the SEO sector wants to do more than destroy relevance. SEO seeks to undermine the meaning of discourse. For some marketers, the destruction of meaning is a good thing. A Web site and its content will be disconnected from the information the user seeks. The user, particularly a recent high school grad, is probably ill equipped to differentiate among reformation of information, disinformation, and misinformation. Instead of identifying Jacques Ellul’s touch points, the person will ask, “Is he Taylor Swift’s hair stylist?” As I said, erosion of meaning is a good thing when a client’s Web page appears in a list of Google search results or is predictively presented as what the user wants, needs, and desires.

Examples of these SEO learned analyses include:

Sigh.

The basic idea is that concepts and topics rise above mere words. In this blog, when I use the phrase “azure chip consultant,” Bing, Google, or Yandex will know that I really am talking about consulting companies that are not in the top tier of expertise centric consulting firms. There is a difference between an IDC- or Gartner-type firm and outfits like Booz, Allen, Boston Consulting Group, and McKinsey type firms. The notion is that via appropriate content processing and value-added metadata enrichment, the connection will be established between my terminology and the consulting firms which are second or third class.

The reason I use this terminology is to provide my readers with a nudge to their funny bone. Bing, Google, et al do not make these types of connections without help. The help ranges from explicit links to the functions of various numerical recipes.

In my experience, marketers describe concept magic but usually deliver a puff of stage fog like that used by rock and roll bands. Fog hides age and other surface defects.

Does anyone (marketer, user, vendor) care about the loss of relevance? Sure. Each of these sectors will define relevance in terms of its own phenomenological position. The marketer wants to close a sale or keep a client. The user wants a pizza or a parking place. The vendor wants to be found, get leads, and sell.

When meaning is disconnected from relevance and precision, those filtering information are in control. If a company wants traffic, buy ads. Unfortunately for the SEO crowd, mumbo jumbo is its most recent reaction to the challenge of controlling what a Bing, Google, or Yandex displays.

I am not confident the search engines are able to present what they want to display. Search is broken. In my experience it is more difficult today to get on point information than at any other point in my professional life.

Here’s a simple example. Run a query on Bing or Google for Dark Web index. The result is zero relevant information. What the query should display is TOR domain. Hmm. Wonder why? Now how does one find that information? Good question.

Now look for Lady Gaga. There you go. Now try “low airfares.” Interesting indeed.

Stephen E Arnold, April 4, 2015

Watson Goes Blekko

March 28, 2015

I read “Goodbye Blekko: Search Engine Joins IBM’s Watson Team.” According to the write up, “Blekko’s home page says its team and technology are now part of IBM’s Watson technology.” I would not know this. I do not use the service. I wrestled with the implementation of Blekko on a news service and then wondered if Yandex was serious about the company. Bottom line: Blekko is not one of my go to search systems, and I don’t cover it in my Alternatives to Google lectures for law enforcement and intelligence professionals.

The write up asserts:

Blekko came out of stealth in 2008 with Skrenta promising to create a search engine with “algorithmic editorial differentiation” compared to Google. Its public search engine finally opened in 2010, launching with what the site called “slashtags” — a personalization and filtering tool that gave users control over the sites they saw in Blekko’s search results.

Another search system becomes part of the puzzling Watson service. How many information access systems does IBM require to make Watson the billion dollar revenue generator or at least robust enough to pay the rent for the Union Square offices?

IBM “owns” the Clementine system which arrived with the SPSS purchase. IBM owns Vivisimo, which morphed into a Big Data system in the acquisition news release, iPhrase, and the wonky search functions in DB2. Somewhere along the line, IBM snagged the Illustra system. From its own labs, IBM has Web Fountain. There is the decades old STAIRS system which may still be available as Service Master. And, of course, there is the Lucene system which provides the dray animals for Watson. Whew. That is a wealth of information access technology, and I am not sure it is comprehensive.

My point is that Blekko and its razzle dazzle assertions now have to provide something that delivers a payoff for IBM. On the other hand, maybe IBM Watson executives are buying technology in the hopes that one of the people “acquihired” or the newly bought zeros and ones will generate massive cash flows.

Watson has morphed from a question answering game show winner into all manner of fantastic information processing capabilities. For me, Watson is an example of what happens when a lack of focus blends with money, executive compensation schemes, and a struggling $100 billion outfit.

Lots of smoke. Not much revenue fire. Stakeholders hope it will change. I am looking forward to a semantically enriched recipe for barbeque sauce that includes tamarind and other spices not available in Harrod’s Creek, Kentucky. Yummy. A tasty addition to the quarterly review menu: Blekko with revenue and a piquant profit sauce.

Perhaps IBM next will acquire Pertimm and the Qwant search system which terrifies Eric Schmidt? Surprises ahead. I prefer profitable, sustainable revenues however.

Stephen E Arnold, March 28, 2015

Semantic Search Becomes Search Engine Optimization: That Is Going to Improve Relevance

March 27, 2015

I read “The Rapid Evolution of Semantic Search.” It must be my age or the fact that it is cold in Harrod’s Creek, Kentucky, this morning. The write up purports to deliver “an overview of the history of semantic search and what this means for marketers moving forward.” I like that moving forward stuff. It reminds me of Project Runway’s “fashion forward.”

The write up includes a wonky graphic that equates via an arrow Big Data and metadata, volume, smart content, petabytes, data analysis, vast, structured, and framework. Big Data is a cloud with five little arrows pointing down. Does this mean Big Data is pouring from the sky like yesterday’s chilling rain?

The history of the Semantic Web begins in 1998. Let’s see, that is 17 years ago. The milestone, in the context of the article, is the report “Semantic Web Road Map.” I learned that Google was less than a month old. I thought that Google was Backrub and that the work on what was named Google began a couple, maybe three years, earlier. Who cares?

The Big Idea is that the Web is an information space. That sounds good.

Well, in 2012, something Big happened. According to the write up, Google figured out that 20 percent of its searches were “new.” Aren’t those pesky humans annoying? The article reports:

long tail keywords made up approximately 70 percent of all searches. What this told Google was that users were becoming interested in using their search engine as a tool for answering questions and solving problems, not just looking up facts and finding individual websites. Instead of typing “Los Angeles weather,” people started searching “Los Angeles hourly weather for March 1.” While that’s an extremely simplified explanation, the fact is that Google, Bing, Facebook, and other internet leaders have been working on what Colin Jeavons calls “the silent semantic revolution” for years now. Bing launched Satori, a knowledge storehouse that’s capable of understanding complex relationships between people, things, and entities. Facebook built Knowledge Graph, which reveals additional information about things you search, based on Google’s complex semantic algorithm called Hummingbird.

Yep, a new age dawned. The message in the article is that marketers have a great new opportunity to push their message in front of users. In my book, this is one reason why running a query on any of the ad supported Web search engines returns so much irrelevant information. In my just submitted Information Today column, I report how a query for the phrase “concept searching” returned results littered with a vendor’s marketing hoo-hah.

I did not want information about a vendor. I wanted information about a concept. But, alas, Google knows what I want. I don’t know what I want in the brave new world of search. The article ignores the lack of relevance in results, the dust binning of precision and recall, and the bogus information many search queries generate. Try to find current information about Dark Web onion sites and let me know how helpful the search systems are. In fact, name the top TOR search engines. See how far you get with Bing, Google, and Yandex. (DuckDuckGo and Ixquick seem to be aware of TOR content by the way.)

So semantic in the context of this article boils down to four points:

  1. Think like an end user. I suppose one should not try to locate an explanation of “concept searching.” I guess Google knows I care about a company with a quite narrow set of technology focused on SharePoint.
  2. Invest in semantic markup. Okay, that will make sense to the content marketers. What if the system used to generate the content does not support the nifty features of the Semantic Web? OWL, who? RDF what?
  3. Do social. Okay, that’s useful. Facebook and Twitter are the go to systems for marketing products I assume. Who on Facebook cares about cyber OSINT or GE’s cratering petrochemical business?
  4. And the keeper, “Don’t forget about standard techniques.” This means search engine optimization. That SEO stuff is designed to make relevance irrelevant. Great idea.

Net net: The write up underscores some of the issues associated with generating buzz for a small business like the ones INC Magazine tries to serve. With write ups like this one about Semantic Search, INC may be confusing its core constituency. Can confused executives close deals and make sense of INC articles? I assume so. I know I cannot.

Stephen E Arnold, March 27, 2015

An Incomplete History of the Semantic Web

March 3, 2015

The article on the blog Realizing Semantic Web titled Semantic Web – Story So Far explores where exactly credit is due for the current state of Semantic Web technology. The author notes that as of 2004, there were very few tools for developers interested in investing time and money. Between then and 2010, quite a leap forward took place, with major improvements in the standards and practices of the Semantic Web technology. The article aims to acknowledge the people and companies that did the most important work. The list includes,

“Tim Berners Lee for believing when we all thought Semantic web might not work and will be another AI failure. And of course for his His work at the W3C. James Handler – in addition to his continued work on Semantic Web, for coming up with gems such as the definition of Semantics/Linked Data Cloud that is most effective….DBPedia & Linked Data Cloud…OWL/RDF/SKOS…Google Refine and similar efforts…BBC & other case studies…”

This list does, however, still seem incomplete and somewhat partial. The author even suggests that more input might be needed, but he only allows for two or so more additions. Is this an accurate reflection of the development of the Semantic Web?

Chelsea Kerwin, March 03, 2015

Sponsored by ArnoldIT.com, developer of Augmentext

Whatever Happened to Social Search?

January 7, 2015

Social search was supposed to integrate social media and regular semantic search to create a seamless flow of information. This was one of the major search points for a while, yet it has not come to fruition. So what happened? TechCrunch reports that it is “Good Riddance To Social Search” and with good reason, because the combination only cluttered up search results.

TechCrunch explains that Google tried Social Search back in 2009, using its regular search engine and Google+. Now the search engine mogul is not putting forth much effort in promoting social search. Bing tried something by adding more social media features, but it is not present in most of its search results today.

Why did this endeavor fail?

“I think one of the reasons social search failed is because our social media “friendships” don’t actually represent our real-life tastes all that well. Just because we follow people on Twitter or are friends with old high school classmates on Facebook doesn’t mean we like the same restaurants they do or share the politics they do. At the end of the day, I’m more likely to trust an overall score on Yelp, for example, than a single person’s recommendation.”

It makes sense considering how many people feel their social media feeds are filled with too much noise. Having search results free of the noise makes them more accurate and helpful to users.

Whitney Grace, January 07, 2015
Sponsored by ArnoldIT.com, developer of Augmentext