New York Begins Asking If Algorithms Can Be Racist

December 27, 2017

The whole point of algorithms is to be blind to everything except data. However, it is becoming increasingly clear that in the wrong hands, algorithms and AI could have a very negative impact on users. We learned more in a recent ACLU post, “New York Takes on Algorithm Discrimination.”

According to the story:

A first-in-the-nation bill, passed yesterday in New York City, offers a way to help ensure the computer codes that governments use to make decisions are serving justice rather than inequality.


Algorithms are often presumed to be objective, infallible, and unbiased. In fact, they are highly vulnerable to human bias. And when algorithms are flawed, they can have serious consequences.


The bill, which is expected to be signed by Mayor Bill de Blasio, will provide a greater understanding of how the city’s agencies use algorithms to deliver services while increasing transparency around them. This bill is the first in the nation to acknowledge the need for transparency when governments use algorithms…
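The quoted point about bias is easy to demonstrate. Here is a minimal, hypothetical sketch (all records invented) showing how a system trained on skewed historical decisions reproduces the skew without ever seeing a protected attribute:

```python
# Toy illustration: a "model" fit to skewed historical decisions simply
# reproduces the skew. Every record here is invented.
historical_decisions = [
    # (neighborhood, approved) -- past human decisions, skewed by neighborhood
    ("north", True), ("north", True), ("north", True), ("north", False),
    ("south", False), ("south", False), ("south", False), ("south", True),
]

def train(records):
    """'Learn' an approval rate per neighborhood from past decisions."""
    tallies = {}
    for hood, approved in records:
        yes, total = tallies.get(hood, (0, 0))
        tallies[hood] = (yes + int(approved), total + 1)
    return {hood: yes / total for hood, (yes, total) in tallies.items()}

print(train(historical_decisions))  # {'north': 0.75, 'south': 0.25}
```

The code never asks who lives in which neighborhood; the disparity rides in on the data. Surfacing exactly this kind of pass-through is what the transparency requirements in the New York bill are for.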

This is a promising step toward solving a very real problem. From racist coding to discriminatory AI, this topic is creeping into the national conversation. We hope others will follow in New York’s footsteps and find ways to prevent this injustice from going further.

Patrick Roland, December 27, 2017

Data Analysis Startup Primer Already Well-Positioned

December 22, 2017

A new startup believes it has something unique to add to the AI data-processing scene, we learn from VentureBeat’s article, “Primer Uses AI to Understand and Summarize Mountains of Text.” The company’s software automatically summarizes (what it considers to be) the most important information from huge collections of documents. Filters then allow users to drill into the analyzed data. Of course, the goal is to reduce or eliminate the need for human analysts to produce such a report; whether Primer can soar where others have fallen short on this tricky task remains to be seen. Reporter Blair Hanley Frank observes:

Primer isn’t the first company to offer a natural language understanding tool, but the company’s strength comes from its ability to collate a massive number of documents with seemingly minimal human intervention and to deliver a single, easily navigable report that includes human-readable summaries of content. It’s this combination of scale and human readability that could give the company an edge over larger tech powerhouses like Google or Palantir. In addition, the company’s product can run inside private data centers, something that’s critical for dealing with classified information or working with customers who don’t want to lock themselves into a particular cloud provider.
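Primer has not published its method, but the core of extractive summarization over large document sets can be sketched in a few lines: score each sentence by the frequency of its significant terms and keep the top few. A minimal sketch (our toy, not Primer’s pipeline):

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that", "for"}

def summarize(text: str, max_sentences: int = 3) -> str:
    """Naive extractive summary: keep the sentences richest in frequent terms."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
    freq = Counter(words)

    def score(sentence: str) -> int:
        return sum(freq[w] for w in re.findall(r"[a-z']+", sentence.lower()))

    top = set(sorted(sentences, key=score, reverse=True)[:max_sentences])
    # Emit the picked sentences in their original order for readability
    return " ".join(s for s in sentences if s in top)
```

The hard part, and where the article says Primer claims its edge, is everything layered on top: collating massive document sets, resolving entities across them, and generating readable prose rather than raw extracts.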

Primer is sitting pretty with $14.7 million in funding (from the likes of Data Collective, In-Q-Tel, Lux Capital, and Amplify Partners) and, perhaps more importantly, a contract with In-Q-Tel that connects it with the U.S. intelligence community. We’re told the software is being used by several agencies, but that Primer itself does not know which ones. On the commercial side, retail giant Walmart is now a customer. Primer emphasizes it is working to enable more complex reports, like automatically generated maps that pinpoint the locations of important events. The company is based in San Francisco and is hiring for several prominent positions as of this writing.

Cynthia Murrell, December 22, 2017

Search System from UAEU Simplifies Life Science Research

December 21, 2017

Help is on hand, in the form of a new platform called Biocarian, for scientific researchers tired of being bogged down in databases. The Middle East’s ITP.net reports, “UAEU Develops New Search Engine for Life Sciences.” Semantic search is the key to the more efficient and user-friendly process. Writer Mark Sutton reports:

The UAEU [United Arab Emirates University] team said that Biocarian was developed to address the problem of large and complex databases for healthcare and life science, which can result in researchers spending more than a third of their time searching for data. The new search engine uses Semantic Web technology, so that researchers can easily create targeted searches to find the data they need in a more efficient fashion. … It allows complex queries to be constructed and entered, and offers additional features such as the capacity to enter ‘facet values’ according to specific criteria. These allow users to explore collated information by applying a range of filters, helping them to find what they are looking for quicker.
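Biocarian’s internals are not documented in the article, but “Semantic Web technology” plus “facet values” points at SPARQL queries over RDF data with filters attached. A hypothetical sketch of how such a faceted query might be assembled (the property URIs and schema below are invented for illustration, not Biocarian’s actual model):

```python
# Hypothetical: compose a faceted SPARQL query of the kind a semantic
# life-science search engine might run. Property names are invented.
def build_faceted_query(keyword: str, facets: dict) -> str:
    facet_triples = "\n  ".join(
        f'?item <http://example.org/schema/{prop}> "{value}" .'
        for prop, value in facets.items()
    )
    return f"""SELECT ?item ?label WHERE {{
  ?item <http://www.w3.org/2000/01/rdf-schema#label> ?label .
  {facet_triples}
  FILTER(CONTAINS(LCASE(?label), "{keyword.lower()}"))
}} LIMIT 50"""

print(build_faceted_query("kinase", {"organism": "Homo sapiens", "tissue": "liver"}))
```

Each facet the user clicks becomes another triple pattern, which is how a range of filters narrows results without the researcher writing any query syntax.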

Project lead Nazar Zaki expects that simplifying the search process will open up this data to many talented researchers (who don’t happen to also be computer-science experts), leading to significant advances in medicine and healthcare. See the article for more on the Biocarian platform.

Cynthia Murrell, December 21, 2017

Plan for 100,000 Examples When Training an AI

December 19, 2017

Just what is the magic number when it comes to the amount of data needed to train an AI? See VentureBeat’s article, “Google Brain Chief: Deep Learning Takes at Least 100,000 Examples,” for an answer. Reporter Blair Hanley Frank cites Jeff Dean, a Google senior fellow, who spoke at this year’s VB Summit. Dean figures that deep learning systems need at least 100,000 examples to perform well on most types of problems. Frank writes:

Dean knows a thing or two about deep learning — he’s head of the Google Brain team, a group of researchers focused on a wide-ranging set of problems in computer science and artificial intelligence. He’s been working with neural networks since the 1990s, when he wrote his undergraduate thesis on artificial neural networks. In his view, machine learning techniques have an opportunity to impact virtually every industry, though the rate at which that happens will depend on the specific industry. There are still plenty of hurdles that humans need to tackle before they can take the data they have and turn it into machine intelligence. In order to be useful for machine learning, data needs to be processed, which can take time and require (at least at first) significant human intervention. ‘There’s a lot of work in machine learning systems that is not actually machine learning,’ Dean said.
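Dean’s figure is a rule of thumb, but it is concrete enough to turn into a pre-project sanity check, per class as well as in total. A small sketch (the threshold is Dean’s; the helper is ours):

```python
from collections import Counter

DEEP_LEARNING_THRESHOLD = 100_000  # Dean's rule of thumb for labeled examples

def readiness_report(labels: list) -> str:
    """Rough check: is a labeled dataset big and balanced enough for deep learning?"""
    counts = Counter(labels)
    total = sum(counts.values())
    lines = [f"total examples: {total:,} (threshold {DEEP_LEARNING_THRESHOLD:,})"]
    for label, n in counts.most_common():
        lines.append(f"  {label}: {n:,} ({n / total:.1%})")
    verdict = ("likely enough for deep learning" if total >= DEEP_LEARNING_THRESHOLD
               else "consider classical ML or transfer learning first")
    return "\n".join(lines + [f"verdict: {verdict}"])

print(readiness_report(["cat"] * 60_000 + ["dog"] * 50_000))
```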

Perhaps poetically, Google is using machine learning to explore how best to perform this non-machine-learning work. The article points to a couple of encouraging projects, including Google DeepMind’s AlphaGo, which seems to have mastered the ancient game of Go simply by playing against itself.

Cynthia Murrell, December 19, 2017

Everyone Should Know the Term Cognitive Computing

December 19, 2017

Cognitive computing is a term everyone in the AI world should already be familiar with. If not, it’s time for a crash course. This is the DNA of machine learning, and it is a fascinating field, as we learned from a recent Information Age story, “RIP Enterprise Search – AI-Based Cognitive Insight is the Future.”

According to the story:

The future of search is linked directly to the emergence of cognitive computing, which will provide the framework for a new era of cognitive search. This recognizes intent and interest and provides structure to the content, capturing more accurately what is contained within the text.


Context is king, and the four key elements of context detection (NOTE: we only included the most important two) are as follows:


Who – which user is looking for information? What have they looked for previously, and what are they likely to be interested in finding in future? Who the individual is determines what results are delivered to them.
What – the nature of the information is also highly important. Search has moved on from structured or even unstructured text within documents and web pages. Users may be looking for information in any number of different forms, from data within databases and in formats ranging from video and audio, to images and data collected from the internet-of-things (IOT).
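Those two elements map naturally onto a query-enrichment step: before the query hits the index, the engine folds in a user profile (the “who”) and a content-type filter (the “what”). A schematic sketch, with field names invented rather than drawn from any particular product:

```python
from dataclasses import dataclass, field

@dataclass
class UserContext:
    user_id: str
    recent_topics: list = field(default_factory=list)  # the "who": search history

def enrich_query(raw_query: str, user: UserContext, content_types: list) -> dict:
    """Fold the 'who' (user history) and the 'what' (content type) into one request."""
    return {
        "query": raw_query,
        "boost_topics": user.recent_topics,  # bias ranking toward prior interests
        "content_types": content_types,      # documents, video, audio, IoT data...
    }

request = enrich_query(
    "turbine failure rates",
    UserContext("analyst-7", recent_topics=["maintenance", "sensors"]),
    content_types=["document", "iot-telemetry"],
)
print(request)
```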

Who and what are incredibly important, but focusing on them might be putting the cart before the horse. First, we must convince CEOs how important AI is to their business…any business. Thankfully, folks like the Huffington Post are already ahead of us and rallying the troops.

Patrick Roland, December 19, 2017


Google’s Data Police Fail with Creepy Videos

December 13, 2017

YouTube is suffering from a really strange problem lately. In various children’s programming feeds, inappropriate knockoff videos of popular cartoon characters keep appearing. It has parents outraged, as we learned in a Fast Company article, “Creepy Kids Videos Like These Keep Popping Up on YouTube.”

The videos feature things like Elsa from “Frozen” firing machine guns. According to the story:

A YouTube policy imposed this year says that videos showing “family entertainment characters” being “engaged in violent, sexual, vile, or otherwise inappropriate behavior” can’t be monetized with ads on the platform. But on Monday evening Fast Company found at least one violent, unlicensed superhero video, entitled “Learn Colors With Superheroes Finger Family Song Johny Johny Yes Papa Nursery Rhymes Giant Syringe,” still included ads. A YouTube spokesperson didn’t immediately comment, but by Tuesday the video’s ads had been removed.
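Part of the problem is technical: keyword screening of titles is trivially easy to evade, and the title Fast Company cites reads like a string of innocuous search terms. A toy sketch of why naive filtering misses it:

```python
BLOCKLIST = {"violent", "gun", "blood", "gore"}  # illustrative screening terms

def title_flagged(title: str) -> bool:
    """Flag a title only when it contains an obvious blocklisted word."""
    return bool(set(title.lower().split()) & BLOCKLIST)

# The actual title cited by Fast Company sails straight through:
title = ("Learn Colors With Superheroes Finger Family Song "
         "Johny Johny Yes Papa Nursery Rhymes Giant Syringe")
print(title_flagged(title))  # False -- nothing here matches a keyword list
```

Catching this material means analyzing the video itself, or relying on human review, which is exactly where enforcement at YouTube scale keeps breaking down.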

The videos may well draw ire from legislators, as Congress takes an increasingly close look at user-generated content online in the wake of Russian election manipulation.

It feels like YouTube really needs to keep a tighter rein on content. But it would surprise us if this Congress imposed too much on YouTube’s parent company, Google. With net neutrality likely being erased by the FCC, the idea of any deeper oversight is unlikely. If anything, we think Google will be given less oversight.

Patrick Roland, December 13, 2017

Watson Works with AMA, Cerner to Create Health Data Model

December 1, 2017

We see IBM Watson is doing the partner thing again, this time with the American Medical Association (AMA). I guess they were not satisfied with blockchain applications and the i2 line of business after all. Forbes reports, “AMA Partners With IBM Watson, Cerner on Health Data Model.” Contributor Bruce Japsen cites James Madara of the AMA when he reports that though the organization has been collecting a lot of valuable clinical data, it has not yet been able to make the most of it. Of the new project, we learn:

The AMA’s ‘Integrated Health Model Initiative’ is designed to create a ‘shared framework for organizing health data, emphasizing patient-centric information and refining data elements to those most predictive of achieving better outcomes.’ Those already involved in the effort include IBM, Cerner, Intermountain Healthcare, the American Heart Association, the American Academy of Family Physicians and the American Medical Informatics Association. The initiative is open to all healthcare and information stakeholders and there are no licensing fees for participants or potential users of what is eventually created. Madara described the AMA’s role as being like that of Switzerland: working to tell companies like Cerner and IBM what data elements are important and encouraging best practices, particularly when patient care and clinical information is involved. The AMA, for example, would provide ‘clinical validation review to make sure there is an evidence base under it because we don’t want junk,’ Madara said.
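The AMA has not published the IHMI schema, but a patient-centric, clinically validated data element of the kind Madara describes might look something like this hypothetical structure (every field name here is our illustration, not the initiative’s actual model):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class HealthDataElement:
    """Hypothetical patient-centric element of the sort IHMI describes."""
    name: str                      # e.g., "systolic_blood_pressure"
    value: float
    unit: str                      # e.g., "mmHg"
    patient_id: str
    clinically_validated: bool     # the AMA's "evidence base" gate, per Madara
    source_system: Optional[str] = None  # e.g., a Cerner or IBM system

reading = HealthDataElement(
    name="systolic_blood_pressure", value=128.0, unit="mmHg",
    patient_id="p-001", clinically_validated=True, source_system="Cerner",
)
```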

IBM and Cerner each have their own healthcare platforms, of course, but each is happy to work with the AMA. Japsen notes that as the healthcare industry shifts from the fee-for-service approach to value-based pricing models, accurate and complete information becomes more crucial than ever.

Cynthia Murrell, December 1, 2017

Experts Desperately Seeking the Secret to Big Data Security

November 28, 2017

As machine learning and AI become more prevalent in our day-to-day lives, the risk of a security breach grows by the day. This is a major concern for AI experts, and you should be concerned too. We learned how scary the fight feels from a recent Tech Target article, “Machine Learning’s Training is Security Vulnerable.”

According to the story:

To tune machine learning algorithms, developers often turn to the internet for training data — it is, after all, a virtual treasure trove of the stuff. Open APIs from Twitter and Reddit, for example, are popular training data resources. Developers scrub them of problematic content and language, but the data-cleansing techniques are no match for the methods used by adversarial actors…
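The mismatch the article describes is structural: cleansing pipelines apply static patterns, while adversaries adapt to whatever the patterns catch. A toy sketch (the rules are illustrative, not any vendor’s actual pipeline):

```python
import re

# Typical static cleansing rules: strip URLs, markup, and known bad words.
CLEANSING_PATTERNS = [r"https?://\S+", r"<[^>]+>", r"\b(badword1|badword2)\b"]

def cleanse(sample: str) -> str:
    """Apply pattern-based cleaning, as many training pipelines do."""
    for pattern in CLEANSING_PATTERNS:
        sample = re.sub(pattern, "", sample, flags=re.IGNORECASE)
    return sample.strip()

# A poisoned sample crafted to look benign passes every static rule:
poisoned = "perfectly ordinary sentence that subtly mislabels entity X as entity Y"
print(cleanse(poisoned) == poisoned)  # True -- nothing for the rules to catch
```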

What could mitigate that risk? Some experts have been proposing a very interesting solution: a global security framework. While this seems like a great way to roadblock hackers, it may also pose a threat. As the Tech Target piece states, hacking technology usually advances at the same pace as legitimate technology. So, a global security framework would look like a mighty tempting prize for hackers looking to cause global chaos. Proceed with caution!

Patrick Roland, November 28, 2017

Nothing New as UK Continues to Spy on Citizens

November 27, 2017

People in the United States appear to always be up in arms about their civil liberties. While it can be annoying, this is a good thing because it shows that citizens are trying to keep their government in check. The United States pales in comparison to the United Kingdom, however, when it comes to curtailing civil liberties and spying on citizens. TechCrunch shares the article, “UK Spies Using Social Media Data for Mass Surveillance.”

Why does it come as a surprise that governments are using social media to collect information on their citizens? Many social media users, including US President Trump, have no filter and post everything online. Governments take advantage of this, so it only makes sense when Privacy International says it has evidence that UK spy agencies use social media to gather information on suspects.

What is interesting is that the evidence shows UK agencies shared their information databases with foreign governments and law enforcement. On the other hand, given that the UK has been a target for terrorist attacks, this makes sense. Privacy International is challenging UK intelligence’s use of the personal data as an investigation tool. This is the biggest concern, and rightly so:

A key concern of the committee at the time was that rules governing use of the datasets had not been defined in legislation (although the UK government has since passed a new investigatory powers framework that enshrines various state surveillance bulk powers in law).  But at the time of the report, privacy issues and other safeguards pertaining to BPDs had not been considered in public or parliament.

There are not any legal ramifications if the data is misused. This is a big deal, and there need to be penalties if the data is used in harmful ways. It raises the question, however: what about financial and retail industries that collect data on customers to sell more products? Is that akin to this? Also, if people put less of their lives online, they would have less to worry about.

Whitney Grace, November 27, 2017

Analytics Tips on a Budget

November 23, 2017

Self-service analytics is another way to say “analytics on a budget.” Many organizations, especially non-profits, do not have the funds to invest in a big data plan and technology, so they decide to take on the task themselves. With the right person behind the project, self-service analytics is a great way to save a few bucks. IT Pro Portal shares some ways to improve an analytics project in “Three Rules For Adopting Self-Service Analytics.” Another benefit of self-service analytics is that, theoretically, anyone in the organization can make use of the data and find some creative outlet for it. The tips come with the warning label:

Any adoption of new technology requires a careful planning, consultation, and setup process to be successful: it must be comprehensive without being too time-consuming, and designed to meet the specific goals of your business end-users. Accordingly, there’s no one-size-fits-all approach: each business will need to consider its specific technological, operational and commercial requirements before they begin.

What are the three tips?

  1. Define your business requirements
  2. Collaborate and integrate
  3. Create and implement a data governance policy

All I can say to this is, duh! These are standard tips that can be applied not only to self-service analytics but also to BI plans and any IT plan. Maybe there are a few tips directly geared at the analytics field, but stick to fewer listicles and more practical handbooks. Was this a refined form of clickbait?

Whitney Grace, November 23, 2017
