Chan and Zuckerberg Invest in Science Research Search Engine, Meta

March 1, 2017

Mark Zuckerberg and his wife Priscilla Chan have dedicated a portion of their fortune to philanthropic causes through their own organization, the Chan Zuckerberg Initiative. TechCrunch shares that one of their first acquisitions supports scientific research: “Chan Zuckerberg Initiative Acquires And Will Free Up Science Search Engine Meta.”

Meta is a search engine dedicated to science research papers and it is powered by artificial intelligence.  Chan and Zuckerberg plan to make Meta free in a few months, but only after they have enhanced it.  Once released, Meta will help scientists find the latest papers in their study fields, which is awesome as these papers are usually blocked behind paywalls.  What is even better is that Meta will also assist funding organizations with research and areas with potential for investment/impact.  What makes Meta different from other search engines or databases is quite fantastic:

What’s special about Meta is that its AI recognizes authors and citations between papers so it can surface the most important research instead of just what has the best SEO. It also provides free full-text access to 18,000 journals and literature sources.

Meta co-founder and CEO Sam Molyneux writes that “Going forward, our intent is not to profit from Meta’s data and capabilities; instead we aim to ensure they get to those who need them most, across sectors and as quickly as possible, for the benefit of the world.”

CZI invested $3 billion dedicated to curing all diseases and they already built the Biohub in San Francisco for medical research.  Meta works like this:

Meta, formerly known as Sciencescape, indexes entire repositories of papers like PubMed and crawls the web, identifying and building profiles for the authors while analyzing who cites or links to what. It’s effectively Google PageRank for science, making it simple to discover relevant papers and prioritize which to read. It even adapts to provide feeds of updates on newly published research related to your previous searches.
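The “PageRank for science” comparison can be illustrated with a small sketch. This is not Meta’s actual method, which the write up does not describe in detail; it is a plain power-iteration PageRank run over a hypothetical citation graph. The paper names, the damping factor of 0.85, and the iteration count are all assumptions chosen for illustration.

```python
def pagerank(citations, damping=0.85, iterations=50):
    """Rank papers by citations.

    `citations` maps each paper to the list of papers it cites.
    All names and parameter values here are illustrative assumptions.
    """
    # Collect every paper that cites or is cited.
    papers = set(citations)
    for cited in citations.values():
        papers.update(cited)
    n = len(papers)

    # Start with an equal score for every paper.
    rank = {p: 1.0 / n for p in papers}

    for _ in range(iterations):
        # Every paper gets a small baseline score.
        new_rank = {p: (1.0 - damping) / n for p in papers}
        for paper in papers:
            cited = citations.get(paper, [])
            if cited:
                # A paper passes its score to the papers it cites.
                share = damping * rank[paper] / len(cited)
                for target in cited:
                    new_rank[target] += share
            else:
                # A paper that cites nothing spreads its score evenly.
                share = damping * rank[paper] / n
                for target in papers:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical citation graph: two papers cite paper_c, which cites paper_d.
citations = {
    "paper_a": ["paper_c"],
    "paper_b": ["paper_c"],
    "paper_c": ["paper_d"],
    "paper_d": [],
}
ranks = pagerank(citations)
for paper, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(paper, round(score, 3))
```

In this toy graph the cited papers outrank the two papers nobody cites, which is the behavior the quoted passage describes: importance comes from who cites what, not from how well a page is optimized.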

Meta is an ideal search engine, because it crawls the entire Web (supposedly) and returns verified information, not to mention potential research partnerships and breakthroughs.  This is the type of database researchers have dreamed of for years.  Would CZI be willing to fund something similar for fields other than science?  Will they run into trouble with other organizations less interested in philanthropy?

Whitney Grace, March 1, 2017

A Famed Author Talks about Semantic Search

February 24, 2017

I read “An Interview with Semantic Search and SEO Expert David Amerland.” Darned fascinating. I enjoyed the content marketing aspect of the write up. I found the explanation of semantic search intriguing as well.

image

This is the famed author. Note the biceps and the wrist gizmos.

The background of the “famed author” is, according to the write up:

David Amerland, a chemical engineer turned semantic search and SEO expert, is a famed author, speaker and business journalist. He has been instrumental in helping startups as well as multinational brands like Microsoft, Johnson & Johnson, BOSCH, etc. create their SMM and SEO strategies. David writes for high-profile magazines and media organizations such as Forbes, Social Media Today, Imassera and journalism.co.uk. He is also part of the faculty in Rutgers University, and is a strategic advisor for Darebee.com.

Darebee.com is a workout site. Since I don’t work out, I was unaware of the site. You can explore it at Darebee.com. I think the name means that a person can “dare to be muscular” or “dare to be physically imposing.” I ran a query for Darebee.com on Gibiru, Mojeek, and Unbubble. I learned that the name “Darebee” does come up in the index. However, the pointers in Unbubble are interesting because the links identify other sites which are using the “darebee” string to get traffic. Here’s the Unbubble results screen for my query “darebee.”

image

 

What I found interesting is the system administrator for Darebee.com is none other than David Amerland, whose email is listed in the Whois record as david@amerland.co.uk. Darebee is apparently a part of Amerland Enterprises Ltd. in Hertfordshire, UK. The traffic graph for Darebee.com is listed by Alexa. It shows about 26,000 “visitors” per month which is at variance with the monthly traffic data of 3.2 million on W3Snoop.com.

image

When I see this type of search result, I wonder if the sites have been working overtime to spoof the relevance components of Web search and retrieval systems.

I noted these points in the interview which appeared in the prestigious site Kamkash.com.

On relevance: Data makes zero sense if you can’t find what you want very quickly and then understand what you are looking for.

On semantic search’s definition: Semantic search essentially is trying to understand at a very nuanced level, and then it is trying to give us the best possible answer to our query at that nuanced level of our demands or our intent.

On Boolean search: Boolean search essentially looks at something probabilistically.

On Google’s RankBrain: [Google RankBrain] has nothing to do with ranking.

On participating in Google Plus: Google+ actually allows you to be pervasively enough very real in a very digital environment where we are synchronously connected with lot of people from all over the world and yet the connection feels very…very real in terms of that.

I find these statements interesting.


Bing Improvements

February 17, 2017

Online marketers are usually concerned with the latest Google algorithm, but Microsoft’s Bing is also a viable SEO target. Business2Community shares recent upgrades to that Internet search engine in its write-up, “2016 New Bing Features.” The section on the mobile app seems to be the most relevant to those interested in Search developments. Writer Asaf Hartuv tells us:

For search, product and local results were improved significantly. Now when you search using the Bing app on an iPhone, you will get more local results with more information featured right on the page. You won’t have to click around to get what you want.

Similarly, when you search for a product you want to buy, you will get more options from more stores, such as eBay and Best Buy. You won’t have to go to as many websites to do the comparison shopping that is so important to making your purchase decision.

While these updates were made to the app, the image and video search results were also improved. You get far more options in a more user-friendly layout when you search for these visuals.

The Bing app also includes practical updates that go beyond search. For example, you can choose to follow a movie and get notified when it becomes available for streaming. Or you can find local bus routes or schedules based on the information you select on a map.

Hartuv also discusses upgrades to Bing Ads (a bargain compared to Google Ads, apparently), and the fact that Bing is now powering AOL’s search results (after being dropped by Yahoo). He also notes that, while not a new feature, Bing Trends is always presenting newly assembled, specialized content to enhance users’ understanding of current events. Hartuv concludes by prompting SEO pros to remember the value of Bing.

Cynthia Murrell, February 17, 2017

More Semantic Search Cheerleading: My Ears Hurt

February 8, 2017

I read “Semantic Search. The Present and Future of Search Engine Optimization.” Let’s be clear. The point of this write up has zero to do with precision and recall. The goal strikes me as generating traffic. Period. Wrapping the blunt truth in semantic tinsel does not change the fact that providing on point information is not on the radar.

I noted this statement and circled it in wild and crazy pink:

SEO in the current times involves user intent to provide apt results which can help you to improve your online presence. Improvement is possible by emphasizing on various key psychological principles to attract readers; rank well and eventually expand business.

When I look for information, my intent is pretty clear to me. I have learned over the last 50 years that software is not able to assist me. May I give you an example from yesterday, gentle reader? I wanted information about Autonomy Kenjin, which became available in the late 1990s. It disappeared. Online was useless and the search systems I used either pointed me to board games, rock music, or Japanese culture. My intent is pretty clear to me. Today’s search systems suck at intent when it comes to my queries.
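For readers who want the Kenjin example in the terms of precision and recall, here is a minimal sketch. Precision is the share of returned results that are on point; recall is the share of on point documents that were returned. The document identifiers below are hypothetical stand-ins, not real search results.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant results returned / all results returned
    recall    = relevant results returned / all relevant documents
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical: the documents a searcher actually wanted.
relevant = ["kenjin_review", "kenjin_press_release"]

# Hypothetical: a result list with nothing but off-topic pages.
off_topic = ["board_game_page", "rock_band_page", "japanese_culture_page"]
print(precision_recall(off_topic, relevant))  # (0.0, 0.0)

# Hypothetical: one useful result buried among the off-topic pages.
one_hit = off_topic + ["kenjin_review"]
print(precision_recall(one_hit, relevant))  # (0.25, 0.5)
```

A result list of board games, rock music, and Japanese culture scores zero on both measures, no matter how much traffic the pages attract.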

The write up points out that semantics will help out with “customer personality guiding SEO.” Maybe for Lady Gaga queries. For specialized, highly variable search histories, not a chance. Systems struggle to recognize the intent of highly idiosyncratic queries. Systems do best with big statistical globs. College students like pizza. This user belongs to a cluster of users labeled college students. Therefore, anyone in this cluster gets… pizza ads. Great stuff. Double cheese with two slices of baloney. Then there are keywords. Create a cluster, relate terms to it. Bingo. Job done. Close enough for today’s good enough approach to indexing.

The real gems of the write up consist of admonitions to write about a relevant topic. Relevant to whom, gentle reader? The author, the reader, the advertiser? Include concepts. No problem. A concept to you might be a lousy word to describe something to me; for example, games and kenjin. And, of course, use keywords. Right, double talk and babble.

Semantic SEO. Great stuff. Cancel that baloney pizza order. I don’t feel well.

Stephen E Arnold, February 8, 2017

Google Semantics Sort of Explained by an SEO Expert

February 1, 2017

I know that figuring out how Google’s relevance ranking works is tough. But why not simplify the entire 15 year ball of wax for those without a grasp of Messrs. Brin and Page, their systems and methods, and the wrapper software glued on the core engine. Keep in mind that it is expensive and time consuming to straighten a bent frame when one’s automobile experiences a solid T bone impact. Google’s technology foundation is that frame, and over the years, it has had some blows, but the old girl keeps on delivering advertising revenue.

I read “Semantic Search for Rookies. How Does Google Search Work.” The write up does not provide the obvious answer; to wit:

Well enough for the company to continue to show revenue growth and profits.

The write up takes a different tack toward the winds of relevance. I highlighted this passage:

Google’s semantic algorithm hasn’t developed overnight. It’s a product of continuous work:

  • Knowledge Graph (2012)
  • Hummingbird (2013)
  • RankBrain (2015)
  • Related Questions and Rich Answers (ongoing)

The work began many years before 2012, but that is of no consequence to the SEO whiz explaining how Google search works.

The write up then brings up the idea of semantic and relevance obstacles. I won’t drag in issues such as disambiguation, a user’s search history, and Google’s method of dealing with repetitive queries. I won’t comment on Ramanathan Guha’s inventions nor bring up the work in semantics which began when Jeff Dean revealed how many versions of Britney Spears’ name were in one of Google’s suggested search subsystems.

The way to take advantage of where Google is today boils down to writing an article, a blog post similar to this one you are reading, or any textual information employing user oriented phrasing and algorithm oriented phrasing. The explanation of these two types of phrasing was too sophisticated for me. I urge you, gentle reader, to consult the source document and learn for yourself by sipping from the font of knowledge. (I would have used the phrase “Pierian spring” but that would have forced me to decide whether I was using a bound phrase, semantic oriented phrase, or algorithm oriented phrase. That’s too much work for me.)

The write up concludes with these injunctions:

If you wish to create well-optimized content, you shouldn’t focus on text in the traditional sense. Instead, you should focus on words and word formation which Google expects to see. In this day and age, users’ feedback plays a crucial role in determining the importance of content. You will have to cater to both sides. Create content with lots of synonyms and semantically related words incorporated in it. Try to be provocative and readable at the same time.

I don’t want to rain on the SEO poobah’s parade, but there are some issues that this semantic write up does not address; namely, the challenge of rich media. How does one get one’s video indexed in a correct way in YouTube.com, GoogleVideo.com, Vimeo.com, or one of the other video search systems. What about podcasts, still images, Twitter outputs, public Facebook goodies, and social media image sharing sites?

My point is that defining semantics in terms of a particular content type suggests that Google has a limited repertoire of indexing, metatagging, and cross linking methods. Perhaps a quick look at Dr. Guha’s semantic server would shed some light on the topic? Well, maybe not. This is, after all, SEO oriented with semantic and algorithmic phrasing I suppose.

Stephen E Arnold, February 1, 2017

PageRank Revealed with a Superficial Glance

December 24, 2016

I read “19 Confirmed Google Ranking Factors.” The table below comes from my 2004 monograph The Google Legacy. You will be able to view a seven minute summary on December 20, 2016. The table in The Google Legacy consists of more than 100 factors used in the Google relevance system. Each of the PageRank elements was extracted from open source information; for example, journal articles, Google technical papers which were once easily available from Google, patents, various speeches, and blog posts. We estimated that the factors are tuned and modified to deal with hacks, tricks, and new developments. Here is an extract from the tables in The Google Legacy:

image

Imagine my surprise when I worked through the 19 factors in the article “19 Confirmed Google Ranking Factors.” My research suggested that by 2004, Google had layered on and inserted many factors which the company did not document. These adjustments have continued since 2004 when production of The Google Legacy began and changes could no longer be made to the book text.

The idea that one can influence PageRank by paying attention to a handful of content, format, update, and technical requirements is interesting for two reasons:

  1. It continues the simplification of the way people think about Google and its methods
  2. Google faces “inertia” when it comes to making changes in its core relevance methods; that is, it is easier for Google to “wrap” the core with new features than it is to make certain changes. That’s the reason there is an index for mobile search and an index for desktop search.

Here’s an example of the current thinking about Google’s relevance ranking methods from the article cited in this blog post: Links. Yep, PageRank relies on links. Think about IBM Almaden Clever and you get a good idea how this works. What Google has added were methods which pay attention to less crude signals. Google also pays attention to signals which “deduct” or “down check” a page or site. Transgressions include duplicate content and crude tricks to fool Google’s algorithms; for example, you click on a link that says “White House” and you see porn. This issue and thousands of others have been “fixed” by Google engineers. My 2004 listing of 100 factors is a fraction of the elements the Google relevance systems process.

Another example of relevance simplification appears in “10 Google Search Ranking Factors Content Marketers Should Prioritize (And 3 You Shouldn’t).” Yep, almost 20 years of relevance tweaks boil down to a dozen rules. Hey, if these worked, why isn’t everyone in the SEO game generating oodles of traffic? Answer: The Google system is a bit more slippery and requires methods with more rubber studs on the SEO gym shoe.

The problem with boiling down Google’s method to a handful of checkpoints is that the simplification can impart false confidence. Do this and the traffic does NOT materialize. What happened? The answer is that a misstep has been introduced while doing the “obvious” search engine optimization tweak. To give one example, consider making changes to one’s site. Google notes frequency and type of changes to a Web site. How about those frequent and radical redesigns. How does Mother Google interpret that information?

Manipulating relevance in order to boost a site’s ranking in a results list can have some interesting consequences. Over the years, I have stated repeatedly that if a webmaster wants traffic, buy AdWords. The other path is to concentrate on producing content which other people want to read. Shortcuts and tricks can lead to some fascinating consequences and, of course, work for the so called search engine optimization experts.

Matt, Matt, where are you now? Oh, that’s right…

Stephen E Arnold, December 24, 2016

SEO Craziness: Design Is the Most Important Ranking Factor

December 3, 2016

I love the search engine optimization “experts.” The concepts, the ideas, the confections—Amazing. I read “Design: The Top SEO Ranking Factor.” The main idea is that arts and crafts are more important than accuracy, useful information, clear presentation, and the other factors identified by Google itself.

I learned:

The only way to sustain organic search ranking is to offer an experience that users engage with after landing on it.

I love statements containing the word “only.” The idea is that there is one, unique, remarkable, go-to thing to do to pump up a result in a list of results or in an app designed to provide search even though the user does not know he/she is performing a search. The write up explains:

User-centered design of organic search landing experiences is pretty simple: provide clearly scannable [sic], relevant, compelling text above the fold. Anything that distracts from that mission defeats the purpose.

Cart before horse and ready-fire-aim are essential tools for this approach to providing useful and usable information.

Stephen E Arnold, December 3, 2016

Black-Hat SEO Tactics Google Hates

November 16, 2016

The article on Search Engine Watch titled “Guide to Black Hat SEO: Which Practices Will Earn You a Manual Penalty?” follows up on a prior article that listed some of the sob stories of companies caught by Google using black-hat practices. Google does not take kindly to such activities, strangely enough. This article goes through some of those practices, which are meant to “falsely manipulate a website’s search position.”

Any kind of scheme where links are bought and sold is frowned upon, however money doesn’t necessarily have to change hands… Be aware of anyone asking to swap links, particularly if both sites operate in completely different niches. Also stay away from any automated software that creates links to your site. If you have guest bloggers on your site, it’s good idea to automatically Nofollow any links in their blog signature, as this can be seen as a ‘link trade’.

Other practices that earned a place on the list include automatically generated content, cloaking and irrelevant redirects, and hidden text and links. Doorway pages are multiple pages for a key phrase that lead visitors to the same end destination. If you think these activities don’t sound so terrible, you are in great company. Mozilla, BMW, and the BBC have all been caught and punished by Google for such tactics. Good or bad? You decide.

Chelsea Kerwin, November 16, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Google Biases: Real, Hoped For, or Imagined?

November 10, 2016

I don’t have a dog in this fight. Here at Beyond Search we point to open source documents and offer comments designed to separate the giblets from the goose feathers. Yep, that’s humor, gentle reader. Like it or not.

The write up “Opinion: Google Is Biased Toward Reputation-Damaging Content” pokes into an interesting subject. When I read the article, I thought, “Is this person a user of Proton Mail?”

The main point of the write up is that the Google relevance ranking method responds in an active manner to content which the “smart” software determines is negative. But people wrote the software, right? What’s up, people writing relevance ranking modules?

The write up states:

Google has worked very hard to interpret user intent when searches are conducted. It’s not easy to fathom what people may be seeking when they submit a keyword or a keyword phrase.

Yep, Google did take this approach prior to its initial public offering in 2004. Since then, I ask, “What changes did Google implement in relevance in the post IPO era?” I ask, “Did Google include some of the common procedures which have known weaknesses with regard to what lights the fires of the algorithms’ interests?”

The write up tells me:

Since Google cannot always divine a specific intention when a user submits a search query, it’s evolved to using something of a scattergun approach — it tries to provide a variety of the most likely sorts of things that people are generally seeking when submitting those keywords. When this is the name of a business or a person, Google commonly returns things like the official website of the subject, resumes, directory pages, profiles, business reviews and social media profiles. Part of the search results variety Google tries to present includes fresh content — newly published things like news articles, videos, images, blog posts and so on. [Emphasis added.]

Perhaps “fresh” content triggers the following relevance components? For example, fresh content signals change and change may mean that the “owner” of the Web page may be interested in buying AdWords. A boost for “new stuff” means that when a search result drifts lower over a span of a week or two, the willingness to buy AdWords goes up? I think about this question because it suggests that tuning certain methods provides a signal to the AdWords’ subsystems of people and code. I have described how such internal “janitors” within Google modules perform certain chores. Is this a “new” chore designed to create a pool of AdWords’ prospects? Alas, the write up does not explore this matter.

The write up points to a Googler’s public explanation of some of the relevance ranking methods in use today. That’s good information. But with the public presentations of Google systems and methods with which I am familiar, what’s revealed is like touching an elephant when one is blind. There is quite a bit more of the animal to explore and understand. In fact “understand” is pretty tough unless one is a Googler with access to other Googlers, the company’s internal database system, and the semi clear guidelines from whoever seems to be in charge at a particular time.

I highlighted this passage from the original write up as interesting:

I’ve worked on a number of cases in which all my research indicates my clients’ names have extremely low volumes of searches.  The negative materials are likely to receive no more clicks than the positive materials, according to my information, and, in many cases, they have fewer links.

Okay, so there’s no problem? If so, why is the write up headed down the Google distorts results path? My hunch is that the assurance is a way to keep Googzilla at bay. The author may want to work at the GOOG someday. Why be too feisty and remind the reader of the European Commission’s view of Google’s control of search results?

The write up concludes with a hope that Google says more about how it handles relevance. Yep, that’s a common request from the search engine optimization crowd.

My view from rural Kentucky is that there are a number of ways to have an impact on what Google presents in search results. Some of these methods exploit weaknesses in the most common algorithms used for basic functions within the Google construct. Other methods are available as well, but these are identified by trial and error by SEO wizards who flail for a way to make their clients’ content appear in the optimum place for one of the clients’ favorite keywords.

Three observations:

  • The current crop of search mavens at Google are in the business of working with what is already there. Think in terms of using a large, frequently modified, and increasingly inefficient system for determining relevance. That’s what the new hires confront. Fun stuff.
  • The present climate for relevance at Google is focused on dealing with the need to win in mobile search. The dominant market share in desktop search is not a given in the mobile world. Google is fragmenting its index for a reason. The old desktop model looks a bit like a 1990s Corvette. Interesting. Powerful. Old.
  • The need for revenue is putting more and more pressure on Google to make up for the mobile user behavior and the desktop user behavior in terms of search. Google is powerful, but different methods are needed to get closer to that $100 billion in revenue Eric Schmidt referenced in 2006. Relevance may be an opportunity.

My view is that Google is more than 15 years down the search road. Relevance is no longer defined by precision and recall. What’s important is reducing costs, increasing revenue, and dealing with the problems posed by Amazon, Facebook, Snapchat, et al.

Relevance is not high on the list of to dos in some search centric companies. Poking Google about relevance may produce some reactions. But not from me. I love the Google. Proton Mail is back in the index because Google allegedly made a “fix.” See. Smart algorithms need some human attention. If you buy a lot of AdWords, I would wager that some human Googlers will pay attention to you. Smart software isn’t everything once it alerts a Googler to activate the sensitivity function in the wetware.

Stephen E Arnold, November 10, 2016

Web Marketers: Get Ready for the Google Disruption

October 28, 2016

The GOOG is shifting from desktop search to mobile search. The transition will take time and make life exciting for the Web marketers who have to [a] justify their budgets, [b] generate traffic, [c] keep their jobs. The search engine optimization wizards will be looking at McMansions and BMW convertibles. Business is likely to boom for the purveyors of fairy dust and jargon.

Navigate to “50+ Web Measurement KPIs – Analytics Demystified.” The write up presents four dozen ways to accomplish your objectives. The write up groups the analytics some folks view like the Rosetta Stone. The principal categories are:

  • Key Performance Indicators to Measure Return on Investment
  • KPIs to Measure Lead Generation Campaigns
  • KPIs to Measure Intent to Purchase
  • KPIs to Measure Website Engagement

I worked through the long write up, complete with mini MBA comments and screenshots of the magic data. The thought I had was that some folks are reaching for straws to build their career. The number that matters is the revenue produced by a digital marketing program.

Intent? Probably to sell consulting.

Stephen E Arnold, October 28, 2016
