Google and Fact Ranking: Close but No SEO Cigar

June 3, 2015

Based on my experience gleaned in rural Kentucky, home of Pappy Van Winkle, Google ranks more than Web pages, people, news stories, and links. Google ranks with lots and lots of factors. The Googlers are busy lads and lasses in the ranking department. There are many reasons. One of them may be ad-centric.

I read “Beyond Links: Why Google Will Rank Facts in the Future.” In my opinion, the write up is close but no cigar, and certainly not an SEO cigar. Here’s a passage I highlighted in dramatic orange:

Since web pages can be littered with factual inaccuracies and still appear credible because of a high number of quality links, the Google team is pursuing a future where endogenous signals carry far more weight than exogenous signals. In short, Google may soon be more concerned with the information your website contains than the level of trust people have in your website. New websites could immediately be ranked higher than established competitor sites just by hosting content that is more factually accurate than theirs.

I have quite a bit of confidence in the GOOG; however, there is one sticky wicket: What is a fact? Facts can be tricky in math; for example, infinity or zero, fact or fanciful notion. One whiz kid went crazy noodling the infinity issue, infinities of infinities, and sets of infinities on the left and the right of the good old decimal point.

Rah rah for the Google Knowledge Vault. There are some statistical tools to rank a fact as more or less correct. Will the SEO crowd be able to game the system so their clients’ Web pages are more factual? Will Google use facts to drive ad sales? Will the user know what is and is not correct? Is there a factual answer to this question: Which is more sophisticated technically? Facebook or Google. What does the Knowledge Vault think?

Stephen E Arnold, June 3, 2015

Written by Stephen E. Arnold · Filed Under algorithms, Google, News | Comments Off on Google and Fact Ranking: Close but No SEO Cigar

Sinequa and Systran Partner on Cyber Defense

May 20, 2015

Enterprise search firm Sinequa and translation tech outfit Systran are teaming up on security software. “Systran and Sinequa Combine in the Field of Cyber Defense,” announces ITRmanager.com. (The article is in French, but Google Translate is our friend.) The write-up explains:

“Sinequa and Systran have indeed decided to cooperate to develop a solution for detecting and processing of critical information in multiple languages and able to provide investigators with a panoramic view of a given subject. On one side Systran provides safe instant translation in over 45 languages, and the other Sinequa provides big data processing platform to analyze, categorize and retrieve relevant information in real time. The integration of the two solutions should thus facilitate the timely processing of structured and unstructured data from heterogeneous sources, internal and external (websites, audio transcripts, social media, etc.) and provide a clear and comprehensive view of a subject for investigators.”

Launched in 2002, Sinequa is a leader in the Enterprise Search field; the company boasts strong business analytics, but also emphasizes user-friendliness. Based in Paris, the firm maintains offices in Frankfurt, London, and New York City. Systran has a long history of providing innovative translation services to defense and security organizations around the world. The company’s headquarters are in Seoul, with other offices located in Daejeon, South Korea; Paris; and San Diego.

Cynthia Murrell, May 20, 2015

Stephen E Arnold, Publisher of CyberOSINT at www.xenky.com

Written by Stephen E. Arnold · Filed Under algorithms, Database, News, Search, Search quality, Security | Comments Off on Sinequa and Systran Partner on Cyber Defense

Data Mining Algorithms Explained

May 18, 2015

In plain English too. Navigate to “Top 10 Data Mining Algorithms in Plain English.” When you fire up an enterprise content processing system, the algorithms beneath the user experience layer are chestnuts. Universities do a good job of teaching students about some reliable methods to perform data operations. In fact, the universities do such a good job that most content processing systems include almost the same old chestnuts in their solutions. The decision to use some or all of the top 10 data mining algorithms has some interesting consequences, but you will have to attend one of my lectures about the weaknesses of these numerical recipes to get some details.

The write up is worth a read. The article includes a link to information which underscores the ubiquitous nature of these methods. This is the Xindong Wu et al. write up “Top 10 Algorithms in Data Mining.” Our research reveals that dependence on these methods is more widespread now than it was seven years ago when the paper first appeared.

The implication then and now is that content processing systems are more alike than different. The use of similar methods means that the differences among some systems are essentially cosmetic. There is a flub in the paper. I am confident that you, gentle reader, will spot it easily.

Now to the “made simple” write up. The article explains quite clearly the what and why of 10 widely used methods. The article also identifies some of the weaknesses of each method. If there is a weakness, do you think it can be exploited? This is a question worth considering, I suggest.

Example: What is a weakness of k-means?

Two key weaknesses of k-means are its sensitivity to outliers, and its sensitivity to the initial choice of centroids. One final thing to keep in mind is k-means is designed to operate on continuous data — you’ll need to do some tricks to get it to work on discrete data.

Note the key word “tricks.” When one deals with math, the way to solve problems is to be clever. It follows that some of the differences among content processing systems boil down to the cleverness of the folks working on a particular implementation. Think back to your high school math class. Was there a student who just spit out an answer and then said, “It’s obvious”? Well, that’s the type of cleverness I am referencing.
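The two k-means weaknesses quoted above are easy to see in a few lines. What follows is a minimal sketch, not anything taken from the article: the function name `kmeans_1d`, the toy data, and the starting centroids are all invented for illustration, and the code is a plain Lloyd's-style loop on one-dimensional numbers.

```python
def kmeans_1d(points, centroids, iterations=20):
    """Plain k-means on 1-D data (hypothetical helper); returns sorted centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return sorted(centroids)


# Three obvious groups around 2, 12, and 22 (made-up data).
three_groups = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0, 21.0, 22.0, 23.0]

# Sensitivity to the initial centroids: same data, different starting guesses.
good_start = kmeans_1d(three_groups, centroids=[2.0, 12.0, 22.0])
bad_start = kmeans_1d(three_groups, centroids=[1.0, 2.0, 3.0])

# Sensitivity to outliers: two groups plus one wild value, k = 2.
with_outlier = kmeans_1d([1.0, 2.0, 3.0, 11.0, 12.0, 13.0, 100.0], centroids=[2.0, 12.0])

print(good_start)    # centroids land on the three groups
print(bad_start)     # centroids get stuck splitting the first group
print(with_outlier)  # the outlier takes a centroid for itself
```

With the poor starting guess, two centroids end up sharing the first group while the third is stretched across the other two groups. In the outlier run, the single value of 100 claims its own cluster and the two real groups are merged into one.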

The author does not dig too deeply into PageRank, but it too has some flaws. An easy way to identify one is to attend a search engine optimization conference. One flaw turbocharges these events.
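The post does not name the flaw. One commonly discussed candidate, and this is my assumption rather than something stated above, is that link-based scores can be inflated by manufactured links. The sketch below is a textbook-style power iteration, not Google's production method; the `pagerank` function, the page names, and the link farm are all hypothetical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank over {page: [pages it links to]}.

    Every link target must also appear as a key in `links`.
    """
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, outlinks in links.items():
            # A page with no outlinks spreads its rank evenly over all pages.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Two shops with identical link profiles (made-up pages).
web = {
    "news": ["shop_a", "shop_b"],
    "shop_a": ["news"],
    "shop_b": ["news"],
}
before = pagerank(web)

# shop_b buys five throwaway pages that link only to shop_b (assumed scenario).
farmed_web = dict(web)
for i in range(5):
    farmed_web[f"farm_{i}"] = ["shop_b"]
after = pagerank(farmed_web)

print(before["shop_a"], before["shop_b"])  # equal scores
print(after["shop_a"], after["shop_b"])    # shop_b now outranks shop_a
```

Nothing about shop_b's content changed between the two runs. Only the link graph did, which is the kind of lever an optimization crowd would care about.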

My relative Vladimir Arnold, whom some of the Arnolds called Vlad the Annoyer, would have liked the paper. So do I. The write up is a keeper. Plus there is a video, perfect for the folks whose attention span is better than a goldfish’s.

Stephen E Arnold, May 18, 2015

Written by Stephen E. Arnold · Filed Under algorithms, Analytics, Data mining, News | 1 Comment

Don’t Fear the AI

May 14, 2015

Will intelligent machines bring about the downfall of the human race? Unlikely, says The Technium, in “Why I Don’t Worry About a Super AI.” The blogger details four specific reasons he or she is unafraid: First, AI does not seem to adhere to Moore’s law, so no Terminators anytime soon. Also, we do have the power to reprogram any uppity AI that does crop up and (reason three) it is unlikely that an AI would develop the initiative to reprogram itself, anyway. Finally, we should see managing this technology as an opportunity to clarify our own principles, instead of a path to dystopia. The blog opines:

“AI gives us the opportunity to elevate and sharpen our own ethics and morality and ambition. We smugly believe humans – all humans – have superior behavior to machines, but human ethics are sloppy, slippery, inconsistent, and often suspect. […] The clear ethical programing AIs need to follow will force us to bear down and be much clearer about why we believe what we think we believe. Under what conditions do we want to be relativistic? What specific contexts do we want the law to be contextual? Human morality is a mess of conundrums that could benefit from scrutiny, less superstition, and more evidence-based thinking. We’ll quickly find that trying to train AIs to be more humanistic will challenge us to be more humanistic. In the way that children can better their parents, the challenge of rearing AIs is an opportunity – not a horror. We should welcome it.”

Machine learning as a catalyst for philosophical progress—interesting perspective. See the post for more details behind this writer’s reasoning. Is he or she being realistic, or naïve?

Cynthia Murrell, May 14, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Written by Stephen E. Arnold · Filed Under AI, algorithms, News, Search, Technology | Comments Off on Don’t Fear the AI

The Philosophy of Semantic Search

May 13, 2015

The article “Taking Advantage of Semantic Search NOW: Understanding Semiotics, Signs, & Schema” on Lunametrics delves into semantics on a philosophical and linguistic level as well as with regard to business. The author traces the emergence of semantic search, beginning with Ray Kurzweil’s interest in machines learning meaning as opposed to simpler keyword search. To help readers fully grasp this concept, the author provides a brief refresher on Saussure’s semiotics.

“a Sign is comprised of a signifier, or the name of a thing, and the signified, what that thing represents… Say you sell iPad accessories. “iPad case” is your signifier, or keyword in search marketing speak. We’ve abused the signifier to the utmost over the years, stuffing it onto pages, calculating its density with text tools, jamming it into title tags, in part because we were speaking to robot who read at a 3-year-old level.”

In order to create meaning, we must go beyond even the addition of a price tag and picture to create a sign. The article suggests the need for schema: the addition of some indication of whom and what the thing is for. The author, Michael Bartholow, has a background in linguistics, marketing, and search engine optimization. His article ends with the question of when linguists, philosophers, and humanists will be invited into the conversation with businesses, perhaps making him a true visionary in a field populated by data engineers with tunnel vision.

Chelsea Kerwin, May 13, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Written by Stephen E. Arnold · Filed Under algorithms, Data, Marketing, News, Search, Semantic, Technology | 1 Comment

Cloud Adoption Is Like a Lead Balloon

May 8, 2015

According to Datamation’s article “Deflating The Cloud BI Hype Balloon,” the mad, widespread adoption of enterprise cloud computing is deflating like helium out of a balloon. While the metaphor is apt for any flash-in-the-pan fad, it should also be remembered that Facebook and email were once considered passing trends. It could have been said that when their “newness” wore off they would sink faster than a lead balloon, if we want to continue with the balloon metaphor. If you are a fan of Mythbusters, however, you know that lead balloons, in fact, do float.

What the article and we are getting at here is that, like the Mythbusters’ lead balloon, cloud adoption can be troublesome, but it will work, or float, in the end. Datamation points out that the urgency for immediate adoption has faded as security risks and the difficulty of integrating with proprietary systems become apparent.

Howard Dresner wrote a report called “Cloud Computing And Business Intelligence” that explains his observations on enterprise cloud demand. Dresner says that making legacy systems adaptable to the cloud will be a continuous challenge, but he stresses that some data does not belong in the cloud, while some data needs to be floating about. The challenge is making the perfect hybrid system.

He makes the same apt observation about the lead balloon:

“Dresner, who was a Gartner fellow and has 34 years in the IT industry, takes a longer-term perspective about the integration challenges. “We have to solve the same problems we solved on premise,” he explains, and then adds that these problems “won’t persist forever in the enterprise, but they will take a while to solve.”

In other words, it takes time to assemble, but the lead balloon will keep floating around until the next big thing replaces the cloud. Maybe it will be direct data downloads into the head.

Whitney Grace, May 8, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Written by Stephen E. Arnold · Filed Under algorithms, Business intelligence, Business strategy, Cloud computing, Data, News, Search quality | Comments Off on Cloud Adoption Is Like a Lead Balloon

How do You use Your Email?

April 28, 2015

Email is still a relatively new concept in the grander scheme of technology, having only been in widespread use since the 1990s. As with any human activity, people want to learn more about the trends and habits people have with email. Popular Science has an article called “Here’s What Scientists Learned In The Largest Systematic Study Of Email Habits” with a self-explanatory title. Even though email has been in common use for over twenty years, no one is quite sure how people use it.

So someone decided to study email usage:

“…researchers from Yahoo Labs looked at emails of two million participants who sent more than 16 billion messages over the course of several months–by far the largest email study ever conducted. They tracked the identities of the senders and the recipients, the subject lines, when the emails were sent, the lengths of the emails, and the number of attachments. They also looked at the ages of the participants and the devices from which the emails were sent or checked.”

The results were said to be so predictable that an algorithm could have predicted them. Usage correlates strongly with age group and gender. The young write short, quick responses, and men are also brief in their emails. People also respond more quickly during work hours, and the more emails they receive, the less likely they are to write a reply. People might already be familiar with these trends, but the data is brand new to data scientists. The article predicts that developers will take the data and design better email platforms.

How about creating an email platform that merges a to-do list with emails, so people don’t form their schedules and tasks from the inbox?

Whitney Grace, April 28, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Written by Stephen E. Arnold · Filed Under algorithms, Data, News, Technology, Tools, Web Services, Yahoo | Comments Off on How do You use Your Email?

EnterpriseJungle Launches SAP-Based Enterprise Search System

April 27, 2015

A new enterprise search system startup is leveraging the SAP HANA Cloud Platform, we learn from “EnterpriseJungle Tames Enterprise Search” at SAP’s News Center. The company states that its goal is to make collaboration easier and more effective with a feature it is calling “deep people search.” Writer Susan Galer cites EnterpriseJungle Principal James Sinclair when she tells us:

“Using advanced algorithms to analyze data from internal and external sources, including SAP Jam, SuccessFactors, wikis, and LinkedIn, the applications help companies understand the make-up of its workforce and connect people quickly….

“Who Can Help Me is a pre-populated search tool allowing employees to find internal experts by skills, location, project requirements and other criteria which companies can also configure, if needed. The Enterprise Q&A tool lets employees enter any text into the search bar, and find experts internally or outside company walls. Most companies use the prepackaged EnterpriseJungle solutions as is for Human Resources (HR), recruitment, sales and other departments. However, Sinclair said companies can easily modify search queries to meet any organization’s unique needs.”

EnterpriseJungle users manage their company’s data through SAP’s Lumira dashboard. Galer shares Sinclair’s example of one company in Germany, which used EnterpriseJungle to match employees to appropriate new positions when it made a whopping 3,000 jobs obsolete. Though the software is now designed primarily for HR and data-management departments, Sinclair hopes the collaboration tool will permeate the entire enterprise.

Cynthia Murrell, April 27, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Written by Stephen E. Arnold · Filed Under algorithms, Analytics, Cloud computing, Enterprise, Enterprise search, News | Comments Off on EnterpriseJungle Launches SAP-Based Enterprise Search System

Contextual Search Recommended for Sales Pros

April 14, 2015

Sales-productivity pro Doug Winter penned “Traditional Search is Dying as Sales Organizations Make Way for ‘Context’” for Entrepreneur. He explains how companies like Google, Apple, and Yahoo have long been developing “contextual” search, which simply means using data they have gathered about the user to deliver more relevant answers to queries, instead of relying on keywords alone. Consumers have been benefiting from this approach online for years now, and Winter says it’s time for salespeople to apply contextual search to their internal content. He writes:

“The key to how contextual search delivers on its magic is the fact that the most advanced ECM systems are, like Google’s search algorithms, much more knowledgeable about the person searching than we care to admit. What you as a sales rep see is tailored to you because when you sign in, the system knows what types of products you sell and in what geographic areas.”

“Tie in customer data from your customer relationship management (CRM) system and now the ECM knows what buying stage and industry your prospect is in. Leveraging that data, you as a rep shouldn’t then see a universe of content you have to manually sort through. Instead, according to Ring DNA, you should see just a handful of useful pieces you otherwise would have spent 30 hours a month searching for on your own.”
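The idea in the quoted passage can be sketched in a few lines. This is a toy illustration under stated assumptions, not how any particular ECM or CRM product works: the `Doc` fields, the `contextual_search` function, and the scoring rule (one point per matching context attribute) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    title: str
    product: str
    region: str
    stage: str  # buying stage the document is meant for


def contextual_search(query, docs, rep_products, rep_region, prospect_stage):
    """Keyword match first, then rank by how well a document fits the context."""
    query_terms = set(query.lower().split())
    results = []
    for doc in docs:
        if not query_terms & set(doc.title.lower().split()):
            continue  # a keyword hit is still required
        # One point for each context attribute that matches (assumed scoring rule).
        score = (
            int(doc.product in rep_products)
            + int(doc.region == rep_region)
            + int(doc.stage == prospect_stage)
        )
        results.append((score, doc.title))
    # Best contextual fit first; ties broken alphabetically by title.
    results.sort(key=lambda r: (-r[0], r[1]))
    return [title for _, title in results]


library = [
    Doc("Pricing guide for gadgets", "gadgets", "APAC", "discovery"),
    Doc("Widget pricing case study", "widgets", "APAC", "negotiation"),
    Doc("Pricing guide for widgets", "widgets", "EMEA", "negotiation"),
    Doc("Holiday party memo", "none", "EMEA", "none"),
]

ranked = contextual_search(
    "pricing",
    library,
    rep_products={"widgets"},
    rep_region="EMEA",
    prospect_stage="negotiation",
)
print(ranked)
```

A keyword-only search would return the three pricing documents in arbitrary order. With the sign-in and CRM context applied, the widget guide for the rep's own region and the prospect's buying stage comes first.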

As long as the chosen algorithm succeeds in catching what a salesperson needs in its net, this shift could be a terrific time saver. Sales departments should do their research, however, before investing in any contextual-search tools.

Cynthia Murrell, April 14, 2015

Stephen E Arnold, Publisher of CyberOSINT at www.xenky.com

Written by Stephen E. Arnold · Filed Under algorithms, Business intelligence, Data, Enterprise search, Google, Marketing, News, Yahoo | Comments Off on Contextual Search Recommended for Sales Pros

Predicting Plot Holes Isn’t So Easy

April 10, 2015

According to The Paris Review’s blog post “Man In Hole II: Man In Deeper Hole,” Matthew Jockers created an analysis tool to predict archetypal book plots:

“A rough primer: Jockers uses a tool called “sentiment analysis” to gauge “the relationship between sentiment and plot shape in fiction”; algorithms assign every word in a novel a positive or negative emotional value, and in compiling these values he’s able to graph the shifts in a story’s narrative. A lot of negative words mean something bad is happening, a lot of positive words mean something good is happening. Ultimately, he derived six archetypal plot shapes.”

Academics, however, found some problems with Jockers’s tool: Is it possible to assign every word an emotional value, and can all plots really take basic forms? The problem is that words are as nuanced as human emotion, perspectives change in an instant, and sentiments are subjective. How would the tool rate sarcasm?
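The word-scoring approach described in the quote, and the sarcasm objection, can both be shown with a toy version. To be clear, this is not Jockers’s tool: the ten-word `LEXICON`, the function names, and the sample sentences are invented for illustration, and the real work presumably uses far larger lexicons and smoothing.

```python
import re

# Tiny hand-made lexicon (an assumption for this sketch): +1 positive, -1 negative.
LEXICON = {
    "happy": 1, "love": 1, "hope": 1, "wonderful": 1, "great": 1,
    "sad": -1, "death": -1, "fear": -1, "lost": -1, "ruin": -1,
}


def word_scores(text):
    """Give every word its lexicon value, or 0 if the word is unknown."""
    words = re.findall(r"[a-z']+", text.lower())
    return [LEXICON.get(word, 0) for word in words]


def plot_shape(text, segments=3):
    """Sum the word scores over consecutive chunks to trace the story's arc."""
    scores = word_scores(text)
    size = max(1, -(-len(scores) // segments))  # ceiling division
    return [sum(scores[i:i + size]) for i in range(0, len(scores), size)]


story = (
    "They were happy and full of love and hope. "
    "Then came fear and death and ruin. "
    "In the end hope and love made them happy again."
)
shape = plot_shape(story)
print(shape)  # high, low, high: a "man in hole" arc

sarcastic = "Oh great, another wonderful day of ruin."
sarcasm_score = sum(word_scores(sarcastic))
print(sarcasm_score)  # net positive, although the speaker is plainly unhappy
```

The sarcastic sentence nets out positive because two of its three scored words are “great” and “wonderful.” Word-level scoring has no way to see that the speaker means the opposite.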

All stories have been broken down into seven basic plots, so why should it not be possible to do the same for book plots? Jockers has already identified six basic book plots, and there are some who are curiously optimistic about his analysis tool. It does raise the question of whether the tool will stanch authors’ creativity or make English professors derive even more subjective meaning from Ulysses.

Whitney Grace, April 10, 2015

Stephen E Arnold, Publisher of CyberOSINT at www.xenky.com

Written by Stephen E. Arnold · Filed Under algorithms, Analytics, Big data, Corporate Concerns, Google, News, Text analytics, Web Services | Comments Off on Predicting Plot Holes Isn’t So Easy

Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.
