October 21, 2015
The article titled “When Big Data Becomes Bad Data” on Tech In America discusses the legal ramifications for companies of relying on algorithms. The “disparate impact” theory has been used in the courtroom for some time to ensure that discriminatory policies are struck down whether or not they were created with the intention to discriminate. Algorithmic bias occurs all the time, and according to the spirit of the law, it discriminates, albeit unintentionally. The article states,
“It’s troubling enough when Flickr’s auto-tagging of online photos label pictures of black men as “animal” or “ape,” or when researchers determine that Google search results for black-sounding names are more likely to be accompanied by ads about criminal activity than search results for white-sounding names. But what about when big data is used to determine a person’s credit score, ability to get hired, or even the length of a prison sentence?”
The article also reminds us that data can often be a reflection of “historical or institutional discrimination.” Under the disparate impact theory, the only thing that matters is whether the results are biased; whether a human intended the bias becomes irrelevant. There are legal scholars and researchers arguing on behalf of ethical machine learning design that roots out algorithmic bias. Stronger regulations and better oversight of the algorithms themselves might be the only way to avoid time in court.
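To make the "results, not intent" idea concrete, here is a minimal sketch of how one might compare outcomes across groups. The group names and counts are invented for illustration, and the 0.8 cutoff is the "four-fifths" guideline used in US employment selection guidance; it is an assumption here, not something the article cites.

```python
from collections import Counter

def selection_rates(decisions):
    """Return the share of favorable outcomes per group.

    `decisions` is an iterable of (group, approved) pairs and is
    assumed to contain at least one record per group.
    """
    totals, approved = Counter(), Counter()
    for group, ok in decisions:
        totals[group] += 1
        if ok:
            approved[group] += 1
    return {g: approved[g] / totals[g] for g in totals}

def disparate_impact_ratio(decisions):
    """Lowest group selection rate divided by the highest.

    Assumes at least one group has a non-zero selection rate.
    """
    rates = selection_rates(decisions)
    return min(rates.values()) / max(rates.values())

# Hypothetical outcomes from an automated screening tool.
sample = (
    [("group_a", True)] * 60 + [("group_a", False)] * 40
    + [("group_b", True)] * 30 + [("group_b", False)] * 70
)

ratio = disparate_impact_ratio(sample)
flagged = ratio < 0.8  # four-fifths guideline (assumed threshold)
print(f"ratio={ratio:.2f} flagged={flagged}")
```

Nothing in the calculation looks at how the algorithm reached its decisions. Only the outcomes are measured, which is the point the article makes.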
Chelsea Kerwin, October 21, 2015
July 24, 2015
Humans are visual creatures and they learn and absorb information better when pictures accompany it. In recent years, the graphic novel medium has gained popularity amongst all demographics. The amount of information a picture can communicate is astounding, but unless it is looked for it can be hard to find. It also cannot be searched by a search engine…or can it? Synaptica is in the process of developing “OASIS Deep Image Indexing Using Linked Data.”
OASIS is an acronym for Open Annotation Semantic Imaging System, an application that unlocks image content by letting users examine an image more closely than before and by highlighting data points. OASIS is a linked data application that enables parts of an image to be identified by linked data URIs, which can then be semantically indexed to controlled vocabulary lists. It builds an interactive map of an image with its features and conceptual ideas.
“With OASIS you will be able to pan-and-zoom effortlessly through high definition images and see points of interest highlight dynamically in response to your interaction. Points of interest will be presented along with contextual links to associated images, concepts, documents and external Linked Data resources. Faceted discovery tools allow users to search and browse annotations and concepts and click through to view related images or specific features within an image. OASIS enhances the ability to communicate information with impactful visual + audio + textual complements.”
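The idea of tying parts of an image to concept URIs can be sketched as a small data structure. This is only an illustration: the class names, fields, and example.org URIs below are hypothetical and do not describe Synaptica's actual design.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Region:
    """A rectangular area of an image, in pixel coordinates."""
    x: int
    y: int
    width: int
    height: int

    def contains(self, px, py):
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)

@dataclass
class Annotation:
    """Links one region to a concept in a controlled vocabulary."""
    region: Region
    concept_uri: str  # hypothetical vocabulary URI
    label: str

@dataclass
class AnnotatedImage:
    image_uri: str
    annotations: list = field(default_factory=list)

    def concepts_at(self, px, py):
        """Labels of all annotations covering a given point."""
        return [a.label for a in self.annotations
                if a.region.contains(px, py)]

    def regions_for(self, concept_uri):
        """Regions indexed under a given concept URI."""
        return [a.region for a in self.annotations
                if a.concept_uri == concept_uri]

# Hypothetical example: a painting with two points of interest.
painting = AnnotatedImage("http://example.org/images/painting-1")
painting.annotations.append(Annotation(
    Region(100, 50, 200, 150), "http://example.org/vocab/dog", "dog"))
painting.annotations.append(Annotation(
    Region(400, 300, 80, 80), "http://example.org/vocab/lantern", "lantern"))

print(painting.concepts_at(150, 100))
print(painting.regions_for("http://example.org/vocab/lantern"))
```

Once regions carry concept URIs, a "search inside the image" becomes a lookup by concept rather than a guess from the file name.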
OASIS is advertised as a discovery and interactive tool that gives users the chance to fully engage with an image. It can be applied to any field or industry, which might mean the difference between success and failure. People want to fully immerse themselves in their data or images these days. Being able to do so on a much richer scale is the future.
Whitney Grace, July 24, 2015
July 14, 2015
If you spend any time with Google Maps (civilian edition), you will find blurred areas, gaps, and weird distortions which cause me to ask, “Where did that building go?”
If you really spend a lot of time with Google Maps, you will be able to see my two dogs, Max and Tess, in a street view scene.
And zoomed in. Aren’t the doggies wonderful?
The article “The Curious Case of Google Street View’s Missing Streets” is not interested in seeing what the wonky Google autos capture. The write up pokes at me with this line:
Many towns and cities are littered with small gaps in the Street View imagery.
The write up explains that Google recognizes that gaps are a known issue. The article gets close to something quite interesting when it states:
In extreme cases, whole countries are affected. Privacy has been a particular issue in Germany, where many people objected to the roll-out of Street View. Google now has Street View images only for big cities in Germany, like Berlin and Frankfurt, and appears to have given up on the rest of the country completely. Zoom out over Europe in Street View mode and Germany is mostly a blank island in a sea of blue.
Want to do something fun the author of the write up did not bother to do? Locate a list of military facilities in the UK. Then try to locate those on a Google Map. Next try to locate those on a Bing.com map (oops, Uber map)?
Notice anything unusual? Now relate your thoughts to the article’s list of causes.
If not, enjoy the snap of Max and Tess.
Stephen E Arnold, July 14, 2015
June 25, 2015
The main point of an advertisement is to get your attention and persuade you to buy a good or service. So why would ads be hiding themselves in a public venue? Gizmodo reports that in Russia certain ads are hiding from law enforcement in the article: “This Ad For Banned Food In Russia Hides Itself From The Cops.” Russian authorities have banned imported food from the United States and European Union. Don Giulio Salumeria is a Russian food store that makes its income by selling imported Italian food, but in light of the recent ban the store has had to come up with some creative advertising:
“Websites are already able to serve up ads customized for whoever happens to be viewing a page. Now an ad agency in Russia is taking that idea one step further with an outdoor billboard that’s able to automatically hide when it spots the police coming.”
Using a camera equipped with facial recognition software programmed to recognize symbols and logos on officers’ uniforms, the billboard switches from the Don Giulio Salumeria ad to one for a doll store. While the ad does change when it “sees” the police coming, they still have enough time to see it. The article argues that the billboard’s idea is more interesting than its execution. It then points out how advertising will become more personally targeted in the future, such as a billboard recognizing a sports logo and advertising an event related to your favorite team, or recognizing your car on a weekly commute and then recommending a vacation. While Web sites are already able to do this by tracking cookies on your browser, it is another thing to be tracked in the real world by targeted ads.
Whitney Grace, June 25, 2015
June 23, 2015
MIT did not discover object recognition, but its researchers did show that a deep-learning system designed to recognize and classify scenes can also be used to recognize individual objects. Kurzweil describes the exciting development in the article, “MIT Deep-Learning System Autonomously Learns To Identify Objects.” The MIT researchers realized that deep learning could be used for object identification while they were training a machine to identify scenes. They compiled a library of seven million entries categorized by scene, and then learned that object recognition and scene recognition could work in tandem.
“ ‘Deep learning works very well, but it’s very hard to understand why it works — what is the internal representation that the network is building,’ says Antonio Torralba, an associate professor of computer science and engineering at MIT and a senior author on the new paper.”
When the deep-learning network was processing scenes, it was fifty percent accurate compared to a human’s eighty percent accuracy. While the network was busy identifying scenes, at the same time it was learning how to recognize objects as well. The researchers are still trying to work out the kinks in the deep-learning process and have decided to start over. They are retraining their networks on the same data sets, but taking a new approach to see how scene and object recognition tie in together or if they go in different directions.
Deep-learning networks have major ramifications, including improvements for many industries. However, will deep learning be applied to basic search? Image search still does not work well when you search by an actual image.
June 10, 2015
Online shopping is supposed to drive physical stores out of business, but that might not be the case if online shopping is too difficult. The Ragtrader article, “Why They Abandon,” explains that 45 percent of Australian consumers will not make an online purchase if they experience Web site difficulties. The consumers, instead, are returning to physical stores to make the purchase. The article mentions that 44 percent believe that traditional shopping is quicker if they know what to look for and 43 percent prefer in-store service.
The research comes from a Rackspace survey to determine shopping habits in New Zealand and Australia. The survey also asked participants what other problems they experienced shopping online:
“42 percent said that there were too many pop-up advertisements, 34 percent said that online service is not the same as in-store and 28 percent said it was too time consuming to narrow down options available.”
These are understandable issues. People don’t want to be hounded to purchase other products when they have a specific item in mind, and thousands of options are overwhelming to search through. A digital wall is also daunting for people who prefer personal interaction when they shop. The survey may pinpoint online shopping weaknesses, but it also helps online stores determine the best ways to improve.
“ ‘This survey shows that not enough retailers are leveraging powerful and available site search and navigation solutions that give consumers a rewarding shopping experience.’ ”
People shop online for convenience, variety, lower prices, and deals. Search is vital for consumers to narrow down their needs, but if they can’t navigate a Web site then search proves as useless as an expired coupon.
May 15, 2015
Image search means having software which can figure out from a digital photo that a cow is a cow. In more complex photos, the software identifies what it can. I recall one demonstration which recognized me as a 20 year old criminal. Close but no cigar.
I received an email from a former clandestine professional. The link provided informed me that Baidu was better at image recognition than the Google. The alleged error rate is 4.58 percent. I love the two decimal accuracy.
Not to be outdone, WolframAlpha is in the image recognition game as well. Navigate to “Wolfram Alpha Image Identification Identifies Steven Wolfram as Podium.” The write up points out:
Speaking of which, a picture of Steven Wolfram returned the answer ‘podium’. So no recognition for the creator. Unfortunately, it couldn’t identify a map of France at all and just came back with a big question mark. Sorry, France.
You can try the system at this page.
I uploaded the image of the cover of my new CyberOSINT study. The system returned this result:
My book cover is a piece of electronic equipment that mixes two or more input signals to give a single output signal.
I did not know that. I thought it was a book cover with a blue hand.
Stephen E Arnold, May 15, 2015
April 26, 2015
Need a GIF file? Check out “5 GIF Search Engines & Tools You Haven’t Heard Of Yet.” Searching for GIFs using some Web search engines can yield interesting results.
Stephen E Arnold, April 26, 2015
April 8, 2015
Anyone interested in the mechanics behind image search should check out the description of PicSeer: Search Into Images from YangSky. The product write-up goes into surprising detail about what sets their “cognitive & semantic image search engine” apart, complete with comparative illustrations. The page’s translation seems to have been done either quickly or by machine, but don’t let the awkward wording in places put you off; there’s good information here. The text describes the competition’s approach:
“Today, the image searching experiences of all major commercial image search engines are embarrassing. This is because these image search engines are
- Using non-image correlations such as the image file names and the texts in the vicinity of the images to guess what are the images all about;
- Using low-level features, such as colors, textures and primary shapes, of image to make content-based indexing/retrievals.”
With the first approach, they note, trying to narrow the search terms is inefficient because the software is looking at metadata instead of inspecting the actual image; any narrowed search excludes many relevant entries. The second approach above simply does not consider enough information about images to return the most relevant, and only most relevant, results. The write-up goes on to explain what makes their product different, using for their example an endearing image of a smiling young boy:
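The two approaches the write-up criticizes can be sketched in a few lines. This is a simplified illustration, not any search engine's actual code: the function names, the sample file name, and the tiny hand-made "images" (lists of RGB tuples) are all hypothetical, and histogram intersection stands in for whatever low-level comparison a real system would use.

```python
def metadata_match(query, filename, nearby_text):
    """Approach 1: guess content from the file name and nearby text."""
    haystack = f"{filename} {nearby_text}".lower()
    return all(term in haystack for term in query.lower().split())

def color_histogram(pixels, bins=4):
    """Approach 2: a coarse, normalized RGB histogram.

    `pixels` is a non-empty list of (r, g, b) tuples with values 0-255.
    """
    step = 256 // bins
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1
    return [count / len(pixels) for count in hist]

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical color distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# A photo of a smiling boy is invisible to approach 1 if nothing says so.
print(metadata_match("smiling boy", "IMG_0042.jpg", "Our trip to the coast"))

# Hand-made stand-ins for three images.
red_apple = [(200, 20, 20)] * 90 + [(30, 120, 30)] * 10
fire_truck = [(210, 25, 25)] * 85 + [(40, 40, 40)] * 15
green_apple = [(40, 180, 40)] * 100

apple_hist = color_histogram(red_apple)
print(histogram_intersection(apple_hist, color_histogram(fire_truck)))
print(histogram_intersection(apple_hist, color_histogram(green_apple)))
```

In this toy data the red apple scores as far more similar to the fire truck than to the green apple, which is the kind of result that color-only indexing produces when it has no notion of what an object is.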
“How can PicSeer have this kind of understanding towards images? The Physical Linguistic Vision Technologies have can represent cognitive features into nouns and verbs called computational nouns and computational verbs, respectively. In this case, the image of the boy is represented as a computational noun ‘boy’ and the facial expression of the boy is represented by a computational verb ‘smile’. All these steps are done by the computer itself automatically.”
See the write-up for many more details, including examples of how Google handles the “boy smiles” query. (Be warned: there’s a very brief section about porn filtering that includes a couple of censored screenshots and adult keyword examples.) It looks like image search technology is progressing apace.
Cynthia Murrell, April 08, 2015
Stephen E Arnold, Publisher of CyberOSINT at www.xenky.com
March 24, 2015
I read “Images That Fool Computer Vision Raise Security Concerns.” I found the write up a reminder that marketers’ and venture capitalists’ hype is one thing. Real world software performance is another thing.
The article states:
Cornell researchers have found that computers, like humans, can be fooled by optical illusions, which raises security concerns and opens new avenues for research in computer vision.
The passage I highlighted in a mellow yellow says:
But computers don’t process images the way humans do, Yosinski [a Cornell wizard] said. “We realized that the neural nets did not encode knowledge necessary to produce an image of a fire truck, only the knowledge necessary to tell fire trucks apart from other classes,” he explained. Blobs of color and patterns of lines might be enough. For example, the computer might say “school bus” given just yellow and black stripes, or “computer keyboard” for a repeating array of roughly square shapes.
It turns out that this diagram looks exactly like a penguin.
The smart software sees the abstraction as what most grade school children know as a lovable penguin. I did not smell a penguin until after I left grade school. Someone should have warned me.
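Yosinski’s point, that a classifier only needs enough knowledge to tell classes apart, can be shown with a toy. The rule-based function below is a deliberately crude stand-in for a neural network, and its color thresholds are invented for illustration; it is not the Cornell researchers’ method.

```python
def color_fractions(pixels):
    """Share of yellow and of black pixels in a non-empty pixel list."""
    yellow = sum(1 for r, g, b in pixels if r > 200 and g > 180 and b < 80)
    black = sum(1 for r, g, b in pixels if max(r, g, b) < 50)
    return yellow / len(pixels), black / len(pixels)

def classify(pixels):
    """Toy discriminative rule: yellow plus black means 'school bus'.

    The rule knows nothing about wheels, windows, or shape. It only
    knows what separated buses from other classes in its (imagined)
    training data.
    """
    yellow, black = color_fractions(pixels)
    if yellow > 0.4 and black > 0.2:
        return "school bus"
    return "other"

# An abstract pattern of yellow and black stripes, with no bus in sight.
stripes = ([(255, 210, 0)] * 6 + [(10, 10, 10)] * 4) * 10
gray_wall = [(128, 128, 128)] * 100

print(classify(stripes))
print(classify(gray_wall))
```

The stripes are labeled a school bus because the rule was never asked to know what a bus looks like, only how to separate it from everything else.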
And the challenge? I have no comment about the expectations of a government professional who relies on image recognition as part of an ongoing investigation.
Stephen E Arnold, March 24, 2015