Google and Disappeared Streets

July 14, 2015

If you spend any time with Google Maps (civilian edition), you will find blurred areas, gaps, and weird distortions that make me ask, “Where did that building go?”

If you really spend a lot of time with Google Maps, you will be able to see my two dogs, Max and Tess, in a street view scene.

image

And zoomed in. Aren’t the doggies wonderful?

image

The article “The Curious Case of Google Street View’s Missing Streets” is not interested in seeing what the wonky Google autos capture. The write up pokes at me with this line:

Many towns and cities are littered with small gaps in the Street View imagery.

The write up explains that Google recognizes that gaps are a known issue. The article gets close to something quite interesting when it states:

In extreme cases, whole countries are affected. Privacy has been a particular issue in Germany, where many people objected to the roll-out of Street View. Google now has Street View images only for big cities in Germany, like Berlin and Frankfurt, and appears to have given up on the rest of the country completely. Zoom out over Europe in Street View mode and Germany is mostly a blank island in a sea of blue.

Want to do something fun the author of the write up did not bother to do? Locate a list of military facilities in the UK. Then try to locate those on a Google Map. Next try to locate those on a Bing.com map (oops, Uber map).

Notice anything unusual? Now relate your thoughts to the article’s list of causes.

If not, enjoy the snap of Max and Tess.

Stephen E Arnold, July 14, 2015

Short Honk: Saudi Supercomputer

July 14, 2015

In order to crunch text and do large scale computations, a fast computer is a useful tool. Engineering & Technology Magazine reported in “Saudi Machine Makes It on to World’s Top Ten Supercomputer List”:

The Shaheen II is the first supercomputer based in the Middle East to enter the world’s top ten list, debuting at number seven. The Saudi supercomputer is based at King Abdullah University of Science and Technology and is the seventh most powerful computer on the planet, according to the Top 500 organization that monitors high-performance machines. China’s Tianhe-2 kept its position as the most powerful supercomputer in the world in the latest rankings.

If you are monitoring the supercomputer sector, this announcement, if accurate, is important in my opinion. There are implications for content processing, social graph generation, and other interesting applications.

Stephen E Arnold, July 14, 2015

Page Load Speed: Let Us Blame Those in Suits

July 14, 2015

I read “News Sites Are Fatter and Slower Than Ever.” Well, I am not sure about “ever.” I recall when sites simply did not work. Those sites never worked. You can check out the turtles if you can grab a peek at a crawler’s log file. Look for nifty codes like 2000, 4, or 12. Your mileage may vary, but the log file tells the tale.
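Peeking at a log file is not hard to do at home. Here is a minimal sketch in Python (the log lines are invented for illustration; real crawler logs vary in format) that tallies HTTP status codes so the broken pages stand out:

```python
import re
from collections import Counter

# Hypothetical access-log lines in roughly common log format.
log_lines = [
    '66.249.66.1 - - [14/Jul/2015:10:01:02] "GET /about HTTP/1.1" 200 5120',
    '66.249.66.1 - - [14/Jul/2015:10:01:09] "GET /old-page HTTP/1.1" 404 312',
    '66.249.66.1 - - [14/Jul/2015:10:01:15] "GET /search HTTP/1.1" 500 0',
]

def status_counts(lines):
    """Tally the three-digit HTTP status codes found in the log."""
    pattern = re.compile(r'" (\d{3}) ')
    return Counter(m.group(1) for line in lines if (m := pattern.search(line)))

print(status_counts(log_lines))  # Counter({'200': 1, '404': 1, '500': 1})
```

A pile of 404s and 500s in the tally is the log-file version of “this site never worked.”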

The write up aims at news sites. My hunch is that the definition of a news site is one of those top one percent things: The user is looking for information from a big name and generally clueless outfit like The Daily Whatever or a mash up of content from hither and yon.

Enter latency, lousy code, crazy ads from half baked ad servers, and other assorted craziness.

The write up acknowledges that different sites deliver different response times. Okay.

If you are interested in data, the article presents an interesting chart. You can see home page load times with and without ads. There’s a chart which shows page load times via different mobile connections.

The main point, in my opinion, is a good one:

Since its initial release 22 years ago, the Hyper Text Markup Language (HTML) has gone through many iterations that make web sites richer and smarter than ever. But this evolution also came with loads of complexity and a surfeit of questionable features. It’s time to swing the pendulum back toward efficiency and simplicity. Users are asking for it and will punish those who don’t listen.

My hunch is that speed is a harsh task master. In our work, we have found that with many points in a process, resources are often constrained or poorly engineered. As a result, each new layer of digital plaster contributes to the sluggishness of a system.

Unless one has sufficient resources (money and expertise and time), lousy performance is the new norm. The Google rails and cajoles because slowdowns end up costing my favorite search engine big bucks.

Most news sites do not get the message and probably never will. The focus is on another annoying overlay, pop up, or inline video.

Click away, gentle reader, click away. Many folks see the browser as the new Windows 3.11. Maybe browsers are the new Windows 3.11?

Stephen E Arnold, July 14, 2015

Kashman to Host Session at SharePoint Fest Seattle

July 14, 2015

Mark Kashman, Senior Product Manager at Microsoft, will deliver a presentation at the upcoming SharePoint Fest Seattle in August. All eyes remain peeled for any news about the new SharePoint Server 2016 release, so his talk entitled, “SharePoint at the Core of Reinventing Productivity,” should be closely watched. Benzinga gives a sneak peek with their article, “Microsoft’s Mark Kashman to Deliver Session at SharePoint Fest Seattle.”

The article begins:

“Mark Kashman will deliver a session at SharePoint Fest Seattle on August 19, 2015. His session will be held at the Washington State Convention Center in downtown Seattle. SharePoint Fest is a two-day training conference (plus an optional day of workshops) that will have over 70 sessions spread across multiple tracks that brings together SharePoint enthusiasts and practitioners with many of the leading SharePoint experts and solution providers in the country.”

Stephen E. Arnold is also keeping an eye out for the latest news surrounding SharePoint and its upcoming release. His Web service ArnoldIT.com efficiently synthesizes and summarizes essential tips, tricks, and news surrounding all things search, including SharePoint. The dedicated SharePoint feed can save users time by serving as a one-stop-shop for the most pertinent pieces for users and managers alike.
Emily Rae Aldridge, July 14, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Algorithmic Art Historians

July 14, 2015

Apparently, creativity itself is no longer subjective. MIT Technology Review announces, “Machine Vision Algorithm Chooses the Most Creative Paintings in History.” Traditionally, art historians judge how creative a work is based on its novelty and its influence on subsequent artists. The article notes that this is a challenging task, requiring an encyclopedic knowledge of art history and the judgment to decide what is novel and what has been influential. Now, a team at Rutgers University has developed an algorithm they say is qualified for the job.

Researchers Ahmed Elgammal and Babak Saleh credit several developments with bringing AI to this point. First, we’ve recently seen several breakthroughs in machine understanding of visual concepts, called classemes, which include recognition of factors from colors to specific objects. Another important factor: there now exist well-populated online artwork databases that the algorithms can, um, study. The article continues:

“The problem is to work out which paintings are the most novel compared to others that have gone before and then determine how many paintings in the future use similar features to work out their influence. Elgammal and Saleh approach this as a problem of network science. Their idea is to treat the history of art as a network in which each painting links to similar paintings in the future and is linked to by similar paintings from the past. The problem of determining the most creative is then one of working out when certain patterns of classemes first appear and how these patterns are adopted in the future. …

“The problem of finding the most creative paintings is similar to the problem of finding the most influential person on a social network, or the most important station in a city’s metro system or super spreaders of disease. These have become standard problems in network theory in recent years, and now Elgammal and Saleh apply it to creativity networks for the first time.”
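The “most influential node” idea is easy to try at home. Here is a toy PageRank-style centrality pass in plain Python; the painting names and links are invented for illustration, and a real creativity network would link works by shared classemes rather than by hand. Edges point from a later painting back to the earlier painting it echoes, citation-style:

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of node -> list of outgoing links."""
    nodes = list(links)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for node, targets in links.items():
            if targets:
                share = damping * rank[node] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:  # dangling node: spread its rank evenly across the network
                for n in nodes:
                    new_rank[n] += damping * rank[node] / len(nodes)
        rank = new_rank
    return rank

# Hypothetical network: later works "cite" the earlier works they resemble.
influence = {
    "Painting A": [],
    "Painting B": ["Painting A"],
    "Painting C": ["Painting A"],
    "Painting D": ["Painting A", "Painting B"],
}

ranks = pagerank(influence)
print(max(ranks, key=ranks.get))  # Painting A
```

The painting that many later works point back to accumulates the most rank, which is the network-theory version of “influential.”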

Just what we needed. I have to admit the technology is quite intriguing, but I wonder: Will all creative human endeavors eventually have their algorithmic counterparts and, if so, how will that affect human expression?

Cynthia Murrell, July 14, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Algorithms: Bias and Shoes for the Shoemaker

July 13, 2015

I read the Slashdot item “CSTA: Google Surveying Educators on Unconscious Biases of Students, Parents.” Interesting, but I think there are biases in curricula, textbooks, and instructors. My hunch is that some schools have biases baked into the walls like the odor of a grade school hall at 10 am on a snowy day in November.

I thought of a shoemaker whose children had lousy shoes. Did the family focus its attention on meeting the needs of customers? Did the family just forget that the children’s shoes might help sell more shoes if those shoes were collectible sneaks with the image of a basketball star on them?

I thought about this item from the gray lady: “When Algorithms Discriminate.” You may have to pay for this gem. Don’t hassle me if the link goes dead. Collar a New York Times’ executive and express your opinion about disappeared content, pay walls, and the weird blend of blog and “real” content. I don’t care.

The gray lady’s write up points out that

Google’s online advertising system, for instance, showed an ad for high-income jobs to men much more often than it showed the ad to women, a new study by Carnegie Mellon University researchers found.

So the idea is that Google’s algorithms discriminate because humans wrote the code?

Will Google (the shoemaker) turn its attention to its children’s shoes?

In my forthcoming column for Information Today, “The Bing Summer Fling,” I point out that Bing also fiddles with search results.

I know search systems reflect their human side. Perhaps Microsoft and Google can cooperate to determine how much discrimination surfaces in their next generation, state of the art, smart, objective, super duper systems?

My hunch is that the financial requirements may make such introspection unpopular. That’s why it is far safer to research students and parents. Who wants to look closely at his and her shoes?

Stephen E Arnold, July 13, 2015

SAS Explains Big Data. Includes Cartoon, Excludes Information about Cost

July 13, 2015

I know that it is easy to say Big Data. It is easy to say Hadoop. It is easy to make statements in marketing collateral, in speeches, and in blogs written by addled geese. Honk!


I wish to point out that any use of these terms in the same sentence requires an important catalyst: Money. Money that has been, in the words of a government procurement officer, “allocated, not just budgeted.”

Here are the words:

  1. Big Data
  2. Hadoop
  3. Unstructured data.

Point your monitored browser at “Marketers Ask: What Can Hadoop Do That My Data Warehouse Can’t?” The write up originates with SAS. When a company is anchored in statistics, I expect some familiarity with numbers. (Yep, just like the class you have blocked from your mind. The mid term? What mid term?)

The write up points out that unstructured data comes in many flavors. This chart, complete with cartoon, identifies 15 content types. I was amazed. Just 15. What about the data in that home brew content management system or tucked in the index of the no longer supported DEC 20 TIPS system? Yes, that data.

image

How does Hadoop deal with the orange and blue? Pretty well, but you and the curious marketer must attend to three steps. Count ‘em off, please:

  1. Identify the business issue. I think this means know what problem one is trying to solve. This is a good idea, but I think most marketing problems boil down to generating revenue and proving it to senior management. Marketing looks for silver bullets when the sales are not dropping from the sky like packages for the believers in the Cargo Cult.
  2. Get top management support. Yep, this is a good idea because the catalyst—money—has to be available to clean, acquire, and load the goodies in the blue boxes and the wonky stuff from the home brew CMS.
  3. Develop a multi play plan. I think this means that the marketer has zero clue how complicated the Hadoop magic is. The excitement of extract, transform, and load. The thrill of batch processing awaits. Then the joy of looking at outputs which baffle the marketer more comfortable selecting colors and looking at Adwords’ reports than Hadoop data.
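For the marketer who wants a peek at the “excitement of extract, transform, and load,” here is the core Hadoop idea (map, then reduce) shrunk to a toy word count in plain Python. No cluster, no batch queue; a real Hadoop job distributes these two phases across many machines, which is where the money goes:

```python
from collections import defaultdict
from itertools import chain

def mapper(document):
    # Map phase: emit a (word, 1) pair for every token in the document.
    for word in document.lower().split():
        yield (word, 1)

def reducer(pairs):
    # Reduce phase: sum the counts emitted for each word.
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)

# Invented sample "unstructured" content.
documents = ["big data big hadoop", "hadoop loves big data"]
counts = reducer(chain.from_iterable(mapper(d) for d in documents))
print(counts["big"])  # 3
```

Everything before the map phase — cleaning, acquiring, and loading the wonky CMS content — is the part the cartoon leaves out.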

My thought is that SAS understands data, statistical methods, and the reality of a revolution which is taking place without the strictures of SAS approaches.

I do like the cartoon. I do not like the omission of the money part of the task. Doing the orange and blue thing for marketers is expensive. Do the marketers know this?

Nope.

Stephen E Arnold, July 13, 2015

IBM Can See What Is Next in Search: Voice to Text and APIs. Yes, APIs. APIs Do You Hear?

July 13, 2015

I just don’t believe it. You may. Navigate to “IBM Sees the Next Phase of Search.” The write up trots out the cognitive computing thing. That’s jargon buzz for smart software. (Google, as you may know, is making its band play this artificial intelligence tune as well.)

Here’s the paragraph next to which I put a multi stroke red exclamation point:

IBM has released several applications that will help move search services into the next phase as voice search replaces the act of typing keywords into a search box, and the need for something in a specific moment in time replaces intent. Humans will express a need and devices from cars to inanimate objects like refrigerators will respond in a more natural way, according to IBM VP of Watson Core Technology Jerome Pesenti. Many of these platforms will allow consumers to interact with Internet-connected devices.

Now Pesenti is one of the founders of Vivisimo, an outfit with nifty on the fly clustering and deduplicating technology. Scaling was not a core competency. Vivisimo also magically and instantly morphed into a Big Data company as soon as IBM purchased Vivisimo for about one year’s projected revenues (estimated at $20 million).

I am reasonably confident that Pesenti, who was hooked up with Carnegie Mellon, is aware of voice to text, gestures, and software which “watches” a user to try and figure out what the wacky human wants: To call a significant other, order a pizza, or look for clarification on the mathematical procedures for calculating an Einstein manifold.

The write up explains that IBM is offering application programming interfaces to Watson. Be still my heart. My goodness, APIs.

I find it interesting that IBM’s expensive, 24×7 Watson marketing campaign is reaching “real” journalists who are really excited about APIs. APIs!

Vivisimo used to make available a number of wacky demonstrations of its technology. Perhaps Pesenti and Watson will make available a public demo on the corpus of Wikipedia or the Hacker News content.

I don’t need wonky jargon; I need to bang on a system to see how others react. I want metrics for content processing. I want latency data. I want system resource consumption data. I want something other than hints and API talk.
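The numbers I am asking for are not exotic. Here is a minimal sketch in plain Python of how one might gather latency data against a system; the process function is a stand-in for whatever content processing call is being benchmarked, not anything of IBM’s:

```python
import statistics
import time

def process(document):
    # Placeholder workload; swap in a real API call to the system under test.
    return document.lower().split()

# Invented test corpus.
docs = ["sample document %d with some words" % i for i in range(1000)]

latencies = []
for doc in docs:
    start = time.perf_counter()
    process(doc)
    latencies.append((time.perf_counter() - start) * 1000.0)  # milliseconds

print("median ms:", statistics.median(latencies))
print("p95 ms:", sorted(latencies)[int(len(latencies) * 0.95)])
```

Median and 95th percentile latency over a public corpus would say more about Watson than any number of API press releases.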

Stephen E Arnold, July 13, 2015

Watson Based Tradeoff Analytics Weighs Options

July 13, 2015

IBM’s Watson now lends its considerable intellect to helping users make sound decisions. In “IBM Watson Tradeoff Analytics—General Availability,” the Watson Developer Community announces that the GA release of this new tool can be obtained through the Watson Developer Cloud platform. The release follows an apparently successful Beta run that began last February. The write-up explains that the tool:

“… Allows you to compare and explore many options against multiple criteria at the same time. This ultimately contributes to a more balanced decision with optimal payoff.

“Clients expect to be educated and empowered: ‘don’t just tell me what to do,’ but ‘educate me, and let me choose.’ Tradeoff Analytics achieves this by providing reasoning and insights that enable judgment through assessment of the alternatives and the consequent results of each choice. The tool identifies alternatives that represent interesting tradeoff considerations. In other words: Tradeoff Analytics highlights areas where you may compromise a little to gain a lot. For example, in a scenario where you want to buy a phone, you can learn that if you pay just a little more for one phone, you will gain a better camera and a better battery life, which can give you greater satisfaction than the slightly lower price.”
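The phone scenario in the quote is, at bottom, a Pareto-frontier screen: drop any option that is beaten on every axis, keep the genuine tradeoffs. Here is a toy version in plain Python; the phones and their numbers are invented, and Tradeoff Analytics itself is a hosted service with its own machinery:

```python
# Hypothetical candidate phones: lower price is better, higher specs are better.
phones = [
    {"name": "Phone A", "price": 199, "camera_mp": 8,  "battery_hours": 10},
    {"name": "Phone B", "price": 229, "camera_mp": 12, "battery_hours": 14},
    {"name": "Phone C", "price": 249, "camera_mp": 8,  "battery_hours": 9},
]

def dominates(a, b):
    """True if a is no worse than b on every axis and strictly better on one."""
    no_worse = (a["price"] <= b["price"]
                and a["camera_mp"] >= b["camera_mp"]
                and a["battery_hours"] >= b["battery_hours"])
    strictly = (a["price"] < b["price"]
                or a["camera_mp"] > b["camera_mp"]
                or a["battery_hours"] > b["battery_hours"])
    return no_worse and strictly

# Keep only options no other option dominates: the interesting tradeoffs.
pareto = [p for p in phones if not any(dominates(q, p) for q in phones)]
print([p["name"] for p in pareto])  # ['Phone A', 'Phone B']
```

Phone C falls out because Phone B beats it on price, camera, and battery at once; A versus B is the “pay a little more, gain a better camera and battery” tradeoff the write-up describes.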

For those interested in the technical details behind this Watson iteration, the article points you to Tradeoff Analytics documentation. Those wishing to glimpse the visualization capabilities can navigate to this demo. The write-up also lists post-beta updates and explains pricing, so check it out for more information.

Cynthia Murrell, July 13, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Elsevier and Its Business Model May Be Ageing Fast

July 13, 2015

If you need to conduct research and are not attached to a university or academic library, then you are going to get hit with huge subscription fees to have access to quality material.  This is especially true for the scientific community, but on the Internet if there is a will there most certainly is a way.  Material often locked behind a subscription service can be found if you dig around the Internet long enough, mostly from foreign countries, but the material is often pirated.  Gizmodo shares in the article, “Academic Publishing Giant Fights To Keep Science Paywalled” that Elsevier, one of the largest academic publishers, is angry about its content being stolen and shared on third party sites.  Elsevier recently filed a complaint with the New York District Court against Library Genesis and SciHub.org.

“The sites, which are both popular in developing countries like India and Indonesia, are a treasure trove of free pdf copies of research papers that typically cost an arm and a leg without a university library subscription. Most of the content on Libgen and SciHub was probably uploaded using borrowed or stolen student or faculty university credentials. Elsevier is hoping to shut both sites down and receive compensation for its losses, which could run in the millions.”

Gizmodo acknowledges Elsevier has a right to complain, but they also flip the argument in the other direction by pointing out that access to quality scientific research material is expensive.  The article brings up Netflix’s entertainment offerings: Netflix users pay a flat fee every month and have access to thousands of titles.  Netflix remains popular because it remains cheap, and the company openly acknowledges that it sets its prices to be competitive against piracy sites.

Publishers and authors should be compensated for their work and it is well known that academics do not rake in millions, but access to academic works should be less expensive.  Following Netflix’s model or having a subscription service like Amazon Prime might be a better business model to follow.

Whitney Grace, July 13, 2015
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
