Penn State Research Team Uses Big Data to Explore Crime Rates

February 2, 2017

The article on E&T titled "Social Media and Taxi Data Improve Crime Pattern Picture" delves into a fascinating study that uses big data, taxi routes and social media location labels from sites like Foursquare, to uncover correlations between taxi traffic, locations of interest, and crime. The study was conducted by Penn State researchers seeking a more useful way to estimate crime rates than the traditional approach, which relies on demographic and geographic data alone. The article explains,

The researchers say that the analysis of crime statistics that encompass population, poverty, disadvantage index and ethnic diversity can provide more accurate estimates of crime rates … the team’s approach likens taxi routes to internet hyperlinks, connecting different communities with each other… One surprising discovery is that the data suggests areas with nightclubs tend to experience lower crime rates – at least in Chicago.  The explanation may be that it reflects people’s choices to be there.

This research will be especially useful to city planners interested in how certain spaces are being used and whether people want to go to them. Researcher Jessie Li, an assistant professor of information sciences, cautioned, however, that while the correlation is clear, the underlying cause is not yet known.
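The "taxi routes as hyperlinks" idea can be illustrated with a toy sketch. This is not the researchers' actual model; the trip data and community names below are hypothetical, and the connectivity measure is just the simplest graph feature one could derive:

```python
from collections import defaultdict

# Hypothetical taxi trips: (pickup_community, dropoff_community).
trips = [
    ("Loop", "River North"), ("Loop", "River North"),
    ("River North", "Loop"), ("Loop", "Hyde Park"),
    ("Hyde Park", "Loop"), ("River North", "Hyde Park"),
]

# Treat each trip like a hyperlink between two communities and
# count how strongly each pair is connected.
edge_weight = defaultdict(int)
for src, dst in trips:
    edge_weight[(src, dst)] += 1

# A simple per-community feature: total inbound and outbound trips,
# analogous to a node's degree in a hyperlink graph. Features like
# this could sit alongside demographic variables in a crime model.
connectivity = defaultdict(int)
for (src, dst), weight in edge_weight.items():
    connectivity[src] += weight
    connectivity[dst] += weight

print(dict(connectivity))  # {'Loop': 5, 'River North': 4, 'Hyde Park': 3}
```

In a real analysis, the connectivity features would be joined with the demographic data the researchers mention and fed into a regression against observed crime rates.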

Chelsea Kerwin, February 2, 2017


Fight Fake News with Science

February 1, 2017

With all the recent chatter around “fake news,” one researcher has decided to approach the problem scientifically. An article at Fortune reveals “What a Map of the Fake-News Ecosystem Says About the Problem.” Writer Mathew Ingram introduces us to data-journalism expert and professor Jonathan Albright, of Elon University, who has mapped the fake-news ecosystem. Facebook and Google are just unwitting distributors of faux facts; Albright wanted to examine the network of sites putting this stuff out there in the first place. See the article for a description of his methodology; Ingram summarizes the results:

More than anything, the impression one gets from looking at Albright’s network map is that there are some extremely powerful ‘nodes’ or hubs, that propel a lot of the traffic involving fake news. And it also shows an entire universe of sites that many people have probably never heard of. Two of the largest hubs Albright found were a site called Conservapedia—a kind of Wikipedia for the right wing—and another called Rense, both of which got huge amounts of incoming traffic. Other prominent destinations were sites like Breitbart News, DailyCaller and YouTube (the latter possibly as an attempt to monetize their traffic).

Albright said he specifically stayed away from trying to determine what or who is behind the rise of fake news. … He just wanted to try and get a handle on the scope of the problem, as well as a sense of how the various fake-news distribution or creation sites are inter-connected. Albright also wanted to do so with publicly-available data and open-source tools so others could build on it.

Albright also pointed out the folly of speculating on sources of fake news; such guesswork only “adds to the existing noise,” he noted. (Let’s hear it for common sense!) Ingram points out that, armed with Albright’s research, Google, Facebook, and other outlets may be better able to combat the problem.
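Since Albright worked with publicly available data and open-source tools, the core of his hub analysis is easy to sketch. The site names below are placeholders, and in-degree is only the simplest of the centrality measures a real network map would use:

```python
from collections import Counter

# Hypothetical hyperlink records: (linking_site, linked_site).
links = [
    ("site-a.example", "hub-one.example"),
    ("site-b.example", "hub-one.example"),
    ("site-c.example", "hub-one.example"),
    ("site-a.example", "hub-two.example"),
    ("site-c.example", "hub-two.example"),
    ("site-b.example", "site-a.example"),
]

# A node's in-degree (how many links point at it) is the simplest
# measure of how "hub-like" it is in the network map.
in_degree = Counter(dst for _, dst in links)

for site, degree in in_degree.most_common(2):
    print(site, degree)
```

Run over millions of scraped links, the same counting exercise surfaces the powerful hubs, like Conservapedia and Rense in Albright's map, that drive most of the traffic.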

Cynthia Murrell, February 1, 2017

Hewlett Packard Enterprise Releases Q4 Earnings of First Year After Split from HP

January 30, 2017

The article on Business Insider titled "Hewlett Packard Enterprise Misses Its Q4 Revenue Expectations But Beats on Profit" discusses HPE's first year following its separation from HP. The article reports fiscal fourth-quarter revenue of $12.5B, just short of the expected $12.85B, and provides all of the nitty-gritty details of the fourth-quarter segment results, including:

Software revenue was $903 million, down 6% year over year, flat when adjusted for divestitures and currency, with a 32.1% operating margin. License revenue was down 5%, down 1% when adjusted for divestitures and currency, support revenue was down 7%, up 1% when adjusted for divestitures and currency, professional services revenue was down 7%, down 4% adjusted for divestitures and currency, and software-as-a-service (SaaS) revenue was down 1%, up 11% adjusted for divestitures and currency.

Additionally, Enterprise Services revenue was reported as $4.7B, down 6% year over year, and Enterprise Group revenue was down 9% at $6.7B. Financial Services revenue was up 2% at $814M.  According to HPE President and CEO Meg Whitman, all of this amounts to a major win for the standalone company. She emphasized the innovation and financial performance and called FY16 a “historic” year for the company.
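To put the percentage changes above in context, a reported figure and its year-over-year change imply the prior-year figure. A quick sketch of that arithmetic (using the software-revenue numbers from the post; the rounding is mine):

```python
# Back out the implied prior-year figure from a reported value and
# its year-over-year change: prior = current / (1 + change).
def implied_prior(current, yoy_change):
    return current / (1 + yoy_change)

# Software revenue: $903 million, down 6% year over year,
# implying prior-year software revenue of roughly $961 million.
print(round(implied_prior(903, -0.06)))
```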

Chelsea Kerwin, January 30, 2017

Some Web Hosting Firms Overwhelmed by Scam Domains

January 27, 2017

An article at Softpedia should be a wakeup call to anyone who takes the issue of online security lightly—“One Crook Running Over 120 Tech Support Scam Domains on GoDaddy.” Writer Catalin Cimpanu explains:

A crook running several tech support scam operations has managed to register 135 domains, most of which are used in his criminal activities, without anybody preventing him from doing so, which shows the sad state of Web domain registrations today. His name and email address are tied to 135 domains, as MalwareHunterTeam told Softpedia. Over 120 of these domains are registered and hosted via GoDaddy and have been gradually registered across time.

The full list is available at the end of this article (text version here), but most of the domains look shady just based on their names. Really, how safe do you feel navigating to ‘security-update-needed-sys-filescorrupted-trojan-detected[.]info’? How about ‘personal-identity-theft-system-info-compromised[.]info’?

Those are ridiculously obvious, but it seems that GoDaddy’s abuse department is too swamped to flag and block even such flagrant examples. At least that hosting firm has an abuse department; many, it seems, can only be reached through national CERT teams. Other hosting companies, though, respond with the proper urgency when abuse is reported—Cimpanu holds up Bluehost and PlanetHoster as examples. That is something to consider for anyone who thinks the choice of hosting firm is unimportant.
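Just how obvious these names are can be shown with a naive keyword heuristic. To be clear, this is not how GoDaddy or any registrar actually screens domains; a real abuse desk combines many more signals. The term list and threshold are illustrative only:

```python
# Naive heuristic: count scam-flavored terms in a domain name and
# flag it when enough of them appear. Illustrative only; term list
# and threshold are arbitrary choices, not a registrar's real policy.
SCAM_TERMS = ("security", "trojan", "virus", "identity-theft",
              "compromised", "corrupted", "update-needed")

def looks_scammy(domain, threshold=2):
    normalized = domain.lower()
    hits = sum(1 for term in SCAM_TERMS if term in normalized)
    return hits >= threshold

print(looks_scammy("security-update-needed-sys-filescorrupted-trojan-detected.info"))
print(looks_scammy("example.com"))
```

Even a filter this crude flags both domains quoted in the article, which underscores the point that the abuse problem here is capacity, not detection difficulty.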

We are reminded that educating ourselves is the best protection. The article links to a valuable tech support scam guide provided by veteran Internet security firm Malwarebytes, and suggests studying the wikis or support pages of other security vendors.

Cynthia Murrell, January 27, 2017

Declassified CIA Data Makes History Fun

January 26, 2017

One piece of advice I have always heard for getting kids interested in the past is "making it come alive."  Textbooks suck at making anything come alive other than naps.  What really bring history to life are documentaries, eyewitnesses, and actual artifacts.  The CIA has a wealth of history, and History Tech shares some rare finds in "Tip Of The Week: 8 Decades Of Super Cool Declassified CIA Maps."  While the CIA Factbook is one of the best history and geography tools on the Web, the CIA Flickr account is chock-full of declassified goodies, such as spy tools, maps, and more.

The article’s author shared that:

The best part of the Flickr account for me is the eight decades of CIA maps starting back in the 1940s prepared for the president and various government agencies. These are perfect for helping provide supplementary and corroborative materials for all sorts of historical thinking activities. You’ll find a wide variety of map types that could also easily work as stand-alone primary source.

These declassified maps were actually used by CIA personnel, political advisors, and presidents to make decisions that continue to impact our lives today.  The CIA Flickr account is only one example of how the Internet is a wonderful tool for making history come to life.  While you usually need to be cautious about where online information comes from, these are official CIA records, so they are primary sources.

Whitney Grace, January 26, 2017

Obey the Almighty Library Laws

January 23, 2017

Recently I was speaking with someone and the conversation turned to libraries.  I complimented the library collection in his hometown, and he asked, "You mean they still have a library?" That response told me two things: one, this person was not a reader, and two, he did not know the value of a library.  The Lucidea blog asked, "Do The Original 5 Laws Of Library Science Hold Up In A Digital World?" and apparently they still do.

In 1931, before computers dominated information and research, S.R. Ranganathan wrote five principles of library science.  The post examines how the laws are still relevant.  The first law states that books are meant to be used, meaning that information is meant to be used and shared.  The biggest point of this rule is accessibility, which remains extremely relevant.  The second law states, "Every reader his/her book," meaning that libraries serve diverse groups and deliver unbiased services.  That still fits, considering the expansion of knowledge dissemination and how many people access it.

The third law is also still important:

Dr. Ranganathan believed that a library system must devise and offer many methods to “ensure that each item finds its appropriate reader”. The third law, “every book his/her reader,” can be interpreted to mean that every knowledge resource is useful to an individual or individuals, no matter how specialized and no matter how small the audience may be. Library science was, and arguably still is, at the forefront of using computers to make information accessible.

The fourth law is “save time for the reader” and it refers to being able to find and access information quickly and easily.  Search engines anyone?  Finally, the fifth law states that “the library is a growing organism.”  It is easy to interpret this law.  As technology and information access changes, the library must constantly evolve to serve people and help them harness the information.

The wording is a little outdated, but the five laws are still important.  However, we also need to consider how people's use of the library has changed.

Whitney Grace, January 23, 2017

Another Untraceable Dark Web Actor Put Behind Bars

January 19, 2017

A prison librarian in England who purchased drugs and weapons over the Dark Web to supply them to prisoners was sentenced to seven years in prison.

The Register reports in "Prison Librarian Swaps Books for Bars After Dark-Web Gun Buy Caper":

Dwain Osborne, of Avenue Road, Penge, in London, was nabbed in October of 2015 after he sought to procure a Glock 19 – a staple of police and security forces worldwide – and 100 rounds of ammunition on the dark web. A search of Osborne’s house revealed the existence of a storage device, two stolen passports, and a police uniform.

Osborne was under the impression that, like other Dark Web actors, he too was untraceable. What made the sleuths suspicious is not known; however, the swift action and prosecution are commendable. Law enforcement agencies are challenged by this new facet of crime, wherein most perpetrators manage to remain anonymous.

Most arrests related to the purchase of arms and drugs over the Dark Web have been the result of undercover operations. Going beyond that modus operandi, however, is the need of the hour.

Systems like Apache Tika seem promising, but it is premature to say how such systems will evolve and, more importantly, how they will be implemented.

Vishal Ingole, January 19, 2017

The Software Behind the Web Sites

January 17, 2017

Have you ever visited an awesome Web site and wondered how an organization manages its Web presence?  While we know the answer is some type of software, we usually are not given a specific name.  VentureBeat reports that it is possible to find out in the article, "SimilarTech's Profiler Tells You All Of The Technologies That Web Companies Are Using."

SimilarTech is a tool designed to crawl the Internet and analyze the technologies, including software, that Web site operators use.  SimilarTech can also detect which online payment tools are the most popular.  It comes as no surprise that PayPal is the most widely used, with PayPal Subscribe and Alipay in second and third places.
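The core technique, matching a page's HTML against known technology signatures, can be sketched in a few lines. SimilarTech's actual signature set and crawler are proprietary; the fingerprints below are illustrative stand-ins:

```python
import re

# Hypothetical fingerprints: regexes that hint a technology is present
# in a page's HTML. A real profiler would use hundreds of signatures
# covering headers, cookies, and script URLs as well.
FINGERPRINTS = {
    "WordPress": re.compile(r"/wp-content/", re.I),
    "Google Analytics": re.compile(r"www\.google-analytics\.com", re.I),
    "PayPal": re.compile(r"www\.paypal\.com/cgi-bin", re.I),
}

def detect_technologies(html):
    # Return the sorted names of all technologies whose signature matches.
    return sorted(name for name, pattern in FINGERPRINTS.items()
                  if pattern.search(html))

sample = ('<link href="/wp-content/themes/x/style.css">'
          '<script src="https://www.google-analytics.com/analytics.js">')
print(detect_technologies(sample))  # ['Google Analytics', 'WordPress']
```

Aggregating these per-site detections across a large crawl is what yields the adoption statistics, such as the payment-tool rankings, that the article describes.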

Tracking the technology and software companies use on the Web is a boon for salespeople, recruiters, and business development professionals who want a competitive edge:

Overall, SimilarTech provides big data insights about technology adoption and usage analytics for the entire internet, providing access to data that simply wasn’t available before. The insights are used by marketing and sales professionals for website profiling, lead generation, competitive analysis, and business intelligence.

SimilarTech can also locate contact information for the personnel responsible for Web operations, in other words, potential new clients.

This tool is kind of like the mailing houses of the past.  Mailing houses have data about people, places, organizations, etc., and can generate lists of contact information for specific clientele.  SimilarTech offers the contact information, but goes one better by identifying the technologies people use for Web site operation.

Whitney Grace, January 17, 2017

BAE Lands US Air Force Info Fusion Job

January 6, 2017

I read “BAE Systems Awarded $49 Million Air Force Research Lab Contract to Enhance Intelligence Sharing.” The main point is that the US Air Force has a pressing need for integrating, analyzing, and sharing text, audio, images, and data. The write up states:

The U.S. Air Force Research Lab (AFRL) has awarded BAE Systems a five-year contract worth up to $49 million to develop, deploy, and maintain cross domain solutions for safeguarding the sharing of sensitive information between government networks.

The $49 million contract will enhance virtualization, boost data processing, and support the integration of machine learning solutions.

I recall reading that the Distributed Common Ground System performs some, if not most, of these "fusion" type functions. The $49 million seems a pittance compared to the multi-billion-dollar investments in DCGS.

My hunch is that Palantir Technologies may point to this new project as an example of the US government’s penchant for inventing, not using commercial off the shelf software.

Tough problem it seems.

Stephen E Arnold, January 6, 2017

Google Looks to Curb Hate Speech with Jigsaw

January 6, 2017

No matter how advanced technology becomes, certain questions continue to vex us. For example, where is the line between silencing expression and prohibiting abuse? Wired examines Google’s efforts to walk that line in its article, “Google’s Digital Justice League: How Its Jigsaw Projects are Hunting Down Online Trolls.” Reporter Merjin Hos begins by sketching the growing problem of online harassment and the real-world turmoil it creates, arguing that rampant trolling serves as a sort of censorship — silencing many voices through fear. Jigsaw, a project from Google, aims to automatically filter out online hate speech and harassment. As Jared Cohen, Jigsaw founder and president, put it, “I want to use the best technology we have at our disposal to begin to take on trolling and other nefarious tactics that give hostile voices disproportionate weight, to do everything we can to level the playing field.”

The extensive article also delves into Cohen’s history, the genesis of Jigsaw, how the team is teaching its AI to identify harassment, and problems they have encountered thus far. It is an informative read for anyone interested in the topic.

Hos describes how the Jigsaw team has gone about instructing their algorithm:

The group partnered with The New York Times (NYT), which gave Jigsaw’s engineers 17 million comments from NYT stories, along with data about which of those comments were flagged as inappropriate by moderators.

Jigsaw also worked with the Wikimedia Foundation to parse 130,000 snippets of discussion around Wikipedia pages. It showed those text strings to panels of ten people recruited randomly from the CrowdFlower crowdsourcing service and asked whether they found each snippet to represent a ‘personal attack’ or ‘harassment’. Jigsaw then fed the massive corpus of online conversation and human evaluations into Google’s open source machine learning software, TensorFlow. …

By some measures Jigsaw has now trained Conversation AI to spot toxic language with impressive accuracy. Feed a string of text into its Wikipedia harassment-detection engine and it can, with what Google describes as more than 92 per cent certainty and a ten per cent false-positive rate, come up with a judgment that matches a human test panel as to whether that line represents an attack.
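The training pipeline described above, human-labeled comments in, a toxicity judgment out, can be illustrated at toy scale. Jigsaw's Conversation AI uses TensorFlow trained on millions of labeled comments; the tiny Naive Bayes sketch below, with made-up training examples, shows only the general shape of the idea:

```python
import math
from collections import Counter

# Toy labeled corpus standing in for the NYT/Wikipedia annotations:
# 1 = flagged as attack/harassment, 0 = acceptable. Illustrative only.
train = [
    ("you are an idiot", 1),
    ("nobody wants you here idiot", 1),
    ("go away you troll", 1),
    ("thanks for the helpful edit", 0),
    ("great article well sourced", 0),
    ("i appreciate the clarification", 0),
]

word_counts = {0: Counter(), 1: Counter()}
class_counts = Counter()
for text, label in train:
    class_counts[label] += 1
    word_counts[label].update(text.split())

vocab = set(word_counts[0]) | set(word_counts[1])

def toxic_score(text):
    # Naive Bayes: log-probability of each class with add-one smoothing.
    scores = {}
    for label in (0, 1):
        total = sum(word_counts[label].values())
        logp = math.log(class_counts[label] / len(train))
        for word in text.split():
            logp += math.log((word_counts[label][word] + 1)
                             / (total + len(vocab)))
        scores[label] = logp
    return scores[1] > scores[0]  # True means "flag as attack"

print(toxic_score("you idiot"))        # likely flagged
print(toxic_score("helpful article"))  # likely not flagged
```

A production system adds the pieces this sketch omits: a large neural model, panels of human raters to produce the labels, and calibration of the accuracy and false-positive trade-off the article quotes.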

There is still much to be done, but soon Wikipedia and the New York Times will be implementing Jigsaw, at least on a limited basis. At first, the AI’s judgments will be checked by humans. This is important, partially because the software still returns some false positives—an inadvertent but highly problematic overstep. Though a perfect solution may be impossible, it is encouraging to know Jigsaw’s leader understands how tough it will be to balance protection with freedom of expression. “We don’t claim to have all the answers,” Cohen emphasizes.

Cynthia Murrell, January 6, 2017
