Measuring Classifiers by a Rule of Thumb
February 1, 2016
Computer programmers who specialize in machine learning, artificial intelligence, data mining, data visualization, and statistics are smart individuals, but even they sometimes get stumped. Cross Validated is a question-and-answer site run by Stack Exchange that works much like Reddit and old-fashioned forums. People post questions about data and related topics, then wait for a response. One user posted a question about “Machine Learning Classifiers”:
“I have been trying to find a good summary for the usage of popular classifiers, kind of like rules of thumb for when to use which classifier. For example, if there are lots of features, if there are millions of samples, if there are streaming samples coming in, etc., which classifier would be better suited in which scenarios?”
The response the user received was that the question was too broad. Which classifier performs best depends on the data and the process that generates it. It is like asking for the best way to organize books or your taxes: the answer depends on what those items contain.
Another user replied that there was an easy way to explain the general process of choosing a classifier. That user pointed readers to the scikit-learn.org chart about “choosing the right estimator”. Other users said that the chart is incomplete because it does not include deep learning, decision trees, and logistic regression.
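To make the idea of a rule-of-thumb chart concrete, here is a minimal sketch in plain Python. It loosely paraphrases the classification branch of the scikit-learn chart as we understand it. The function name `suggest_classifiers`, its parameters, and the exact cutoffs (50 and 100,000 samples) are assumptions for illustration, not an official API or a definitive recommendation.

```python
def suggest_classifiers(n_samples, is_text=False):
    """Return an ordered list of classifier families to try.

    Hypothetical helper: the thresholds and ordering are assumptions
    loosely based on the classification branch of the scikit-learn
    "choosing the right estimator" chart.
    """
    # Too little data: no classifier choice will rescue the model.
    if n_samples < 50:
        return ["collect more data"]

    # Small to medium data: start linear, then fall back.
    if n_samples < 100_000:
        candidates = ["linear SVM"]
        if is_text:
            candidates.append("naive Bayes")
        else:
            candidates.append("k-nearest neighbors")
            candidates.append("kernel SVM or ensemble methods")
        return candidates

    # Large data: prefer methods that train incrementally.
    return ["SGD classifier", "kernel approximation"]


if __name__ == "__main__":
    for n, text in [(30, False), (5_000, True), (5_000, False), (2_000_000, False)]:
        print(n, text, suggest_classifiers(n, is_text=text))
```

Each list is ordered from "try first" to "try if the previous one does not work". As the critics on Cross Validated noted, a chart like this leaves out deep learning, decision trees, and logistic regression, so treat the output as a starting point, not a verdict.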
Our suggestion: create some other diagrams and share them. Classifiers are complex, but they are essential to the artificial intelligence and big data craze.
Whitney Grace, February 1, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph