NTechLab as David to the Google Goliath of Facial Recognition
October 27, 2016
The Business Insider article “A Russian Startup is Beating Google with Eerily Accurate Facial Recognition Technology” positions NTechLab as the company leading the industry in facial recognition. In 2015, the startup beat Google to win the “MegaFace” competition. The article explains:
NTechLab sets itself apart from its competitors with its high level of accuracy and its ability to search an extensive database of photographs. At the MegaFace Championship, NTechLab achieved a 73 percent accuracy with a database of 1 million pictures. When the number dropped to 10,000 images, the system achieved a jaw-dropping accuracy of 95 percent. “We are the first to learn how to efficiently handle large picture databases,” said NTechLab founder Artem Kukharenko to Intel iQ.
The startup bases its technology on deep learning and neural networks. The company has held several public demonstrations at festivals and amusement parks. Attendees share selfies with the system, then receive pictures of themselves when the system “finds” them in the crowd. Kukharenko touts the “real-world” problem-solving capabilities of his system. While there isn’t a great deal of substantive backup for his claims, the company is certainly worth keeping an eye on.
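Those MegaFace figures illustrate a general effect: one-to-many identification gets harder as the photo database grows, because every additional face is another chance for a close impostor match. Here is a minimal toy sketch of that effect; it is not NTechLab’s method, and the embedding dimension, noise level, and gallery sizes are arbitrary stand-ins for a real system’s learned face embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, NOISE, TRIALS = 128, 0.18, 300   # arbitrary toy parameters

def normalize(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

for gallery_size in (1_000, 10_000, 100_000):
    # Pretend each row is the enrolled face embedding of one person.
    gallery = normalize(rng.normal(size=(gallery_size, DIM)))
    hits = 0
    for _ in range(TRIALS):
        target = rng.integers(gallery_size)
        # A "new photo" of the target: the enrolled embedding plus noise.
        probe = normalize(gallery[target] + NOISE * rng.normal(size=DIM))
        # Identify by nearest neighbor under cosine similarity.
        hits += int(np.argmax(gallery @ probe) == target)
    print(f"gallery of {gallery_size:>7,}: rank-1 accuracy ~ {hits / TRIALS:.2f}")
```

Accuracy declines as the gallery grows even though the matcher never changes, which is why 73 percent on a database of 1 million pictures is a stronger claim than 95 percent on 10,000.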
Chelsea Kerwin, October 27, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Artificial Intelligence: Is More Better? China Thinks So
October 26, 2016
I read “China Overtakes US in Quantity of AI Research.” The idea seems to be that more is better. We know that China has more table tennis players than most countries. China wins more tournaments than most countries. Therefore, more is better. Does the same spinny logic apply to artificial intelligence or smart software?
The write up states:
The Obama administration has a new strategic plan aimed at spurring US development of artificial intelligence. What’s striking is that while the US was an early leader on deep learning research, China has effectively eclipsed it in terms of the number of papers published annually on the subject. The rate of increase is remarkably steep, reflecting how China’s research priorities have changed. The quality of China’s research is also striking.
As HonkinNews pointed out on October 18, 2016, the White House plan calls for standards. Companies are, however, moving forward and are unlikely to be slowed by a standards-setting process. Researchers outside the United States will pay attention to standards, but in the meantime, China and other nation-states are pressing forward. As HonkinNews also noted, the White House end-of-term paper about artificial intelligence was handed in late.
The article cited above says:
The American government is pushing for a major role for itself in AI research, because becoming a leader in artificial intelligence R&D puts the US in a better position to establish global norms on how AI should be safely used.
Nice idea for a discipline that has been chugging along for half a century.
Stephen E Arnold, October 26, 2016
Silicon Valley Inspires Potomac Puzzlement
October 26, 2016
I read “Washington’s Version of Silicon Valley Startup Founders.” The story, which is probably tinged with some Silicon Valley envy for the wizards in Washington, DC, describes the 18F government entrepreneur, consulting, and startup project. The report is not findable on the Office of the Inspector General’s Web site. I did some poking around, and as of October 25, 2016, the report was available at this link. (If this puppy leaves the public dog park, contact the OIG. Those folks are ready to answer their phones and respond to email.)
I noted this passage in the SFGate article:
[The] “18F” program to create its own version of a high-tech startup for government digital projects has foundered since its launch in 2014, losing nearly $32 million as its staff spent most of its time on unbillable work…
The issues identified in the SFGate write up include:
- Lousy revenue estimates. “Senior 18F managers overestimated the amount of money their projects would recoup”
- Adding staff. “Increased hiring using special rules every three months since April 2014” and “18F hired a full-time head of state and local government practice at an annual salary of $152,780, even though at the time, 18F was not authorized to work directly for state and local governments.”
- Missed billability targets. “Less than half the program’s staff time on projects for which it could bill other federal agencies.”
18F seems to be allied with the GSA, a fine outfit. One clue about the organization’s management appears in this passage from the SFGate write up:
The 10-month investigation by the GSA’s inspector general found that 52 percent of 18F’s work was unbillable and included an internal project to change its logo by altering its font, alignment and background color. In all, 727 staff hours, or $140,104, were spent on developing the brand, including that logo change.
Yep, logos are easy and fun. Lots of meetings. The billable consulting work is tougher. In the news release about the report, I highlighted this passage:
18F had projected annual revenue of $84.18 million for fiscal year 2016; however, through the third quarter 18F only generated $27.82 million in revenue, leaving 18F one quarter to generate $56.37 million in revenue to meet its projections.
There was an online component, too, revealed in the SFGate article. I noted this statement:
The report said 18F spent about 20 hours or $4,148 on two customized “bots” for Slack, an online messaging application. One of the automated programs would monitor users’ messages for the words “guys,” ”guyz” and “dudes,” which could have been perceived as being not inclusive for women. It prompted users to consider replacing those words with 21 options that included buds, compatriots, fellow humans, posse, team, mateys, persons of any kind, organic carbon-based life-forms living on the third planet from the sun, comrades and cats.
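For readers wondering what roughly 20 hours of bot work looks like, here is a minimal, hypothetical sketch of the message-scanning logic the report describes. It is not 18F’s code: the word list, the inclusive_nudge helper, and the sample message are illustrative, and a real bot would receive messages from and post suggestions through Slack’s messaging APIs.

```python
import random
import re

# Hypothetical reimplementation of the behavior described in the OIG report:
# watch a message for "guys"-style words and suggest a more inclusive option.
FLAGGED = re.compile(r"\b(guys|guyz|dudes)\b", re.IGNORECASE)
ALTERNATIVES = [
    "buds", "compatriots", "fellow humans", "posse", "team", "mateys",
    "persons of any kind",
    "organic carbon-based life-forms living on the third planet from the sun",
    "comrades", "cats",
]

def inclusive_nudge(message):
    """Return a suggestion if the message uses a flagged word, else None."""
    match = FLAGGED.search(message)
    if match is None:
        return None
    return "Instead of '{}', consider '{}'.".format(
        match.group(0), random.choice(ALTERNATIVES))

print(inclusive_nudge("Hey guys, standup in five minutes"))
```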
Yo, dudes. Love the search engine for US OIG reports.
Stephen E Arnold, October 26, 2016
Google Introduces Fact Checking Tool
October 26, 2016
If it works as advertised, a new Google feature will be welcomed by many users—World News Report tells us, “Google Introduced Fact Checking Feature Intended to Help Readers See Whether News Is Actually True—Just in Time for US Elections.” The move is part of a trend for websites, which seem to have recognized that savvy readers don’t just believe everything they read. Writer Peter Woodford reports:
Through an algorithmic process from schema.org known as ClaimReview, live stories will be linked to fact checking articles and websites. This will allow readers to quickly validate or debunk stories they read online. Related fact-checking stories will appear onscreen underneath the main headline. The example Google uses shows a headline over passport checks for pregnant women, with a link to Full Fact’s analysis of the issue. Readers will be able to see if stories are fake or if claims in the headline are false or being exaggerated. Fact check will initially be available in the UK and US through the Google News site as well as the News & Weather apps for both Android and iOS. Publishers who wish to become part of the new service can apply to have their sites included.
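For publishers curious what that “algorithmic process from schema.org known as ClaimReview” amounts to, it is structured markup embedded in the fact-checking page itself. A minimal sketch follows, built here as a Python dictionary and serialized to JSON-LD; the claim, rating, and organization names are invented for illustration.

```python
import json

# Hypothetical ClaimReview markup for an invented fact check (illustrative only).
claim_review = {
    "@context": "https://schema.org",
    "@type": "ClaimReview",
    "datePublished": "2016-10-26",
    "url": "https://example.org/fact-checks/passport-checks",
    "claimReviewed": "Pregnant women face extra passport checks.",
    "author": {"@type": "Organization", "name": "Example Fact Check"},
    "itemReviewed": {
        "@type": "CreativeWork",
        "author": {"@type": "Organization", "name": "Example News Site"},
    },
    "reviewRating": {
        "@type": "Rating",
        "ratingValue": 1,
        "bestRating": 5,
        "worstRating": 1,
        "alternateName": "False",
    },
}

# A crawler would read this from a <script type="application/ld+json"> block
# on the fact-checking article and link it to the related news story.
print(json.dumps(claim_review, indent=2))
```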
Woodford points to Facebook’s recent trouble with the truth in its Trending Topics feature and observes that many people are concerned about the lack of honesty on display during this particular election cycle. Google, wisely, did not mention any candidates, but Woodford notes that PolitiFact rates 71% of Trump’s statements as false (and, I would add, 27% of Secretary Clinton’s statements as false; everything is relative). If the trend continues, it will be prudent for all citizens to rely on (unbiased) fact-checking tools on a regular basis.
Cynthia Murrell, October 26, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Machine Learning Changes the Way We Learn from Data
October 26, 2016
The technology blog post from Daniel Miessler titled “Machine Learning is the New Statistics” strives to convey how crucial Machine Learning has become to the way we gather information about the world. Rather than dismissing Machine Learning as a buzzword, the author heralds it as an advancement in our ability to engage with the world around us. The article states:
So Machine Learning is not merely a new trick, a trend, or even a milestone. It’s not like the next gadget, instant messaging, or smartphones, or even the move to mobile. It’s nothing less than a foundational upgrade to our ability to learn about the world, which applies to nearly everything else we care about. Statistics greatly magnified our ability to do that, and Machine Learning will take us even further.
The article breaks down the steps in our ability to analyze our own reality: from randomly explaining events, to explanations based on the past, to explanations based on comparisons with numerous trends and metadata. It positions Machine Learning as the next step, an approach that still compares events but advances the comparison by generating new models of its own. The difference, of course, is that Machine Learning offers continuous model improvement, as sketched below. If you are interested, the blog also offers a Machine Learning Primer.
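As a rough illustration of what continuous model improvement looks like in practice, here is a minimal sketch using scikit-learn’s SGDClassifier as an assumed stand-in (a recent scikit-learn and invented synthetic data are assumed): the model is updated incrementally as each new batch of observations arrives rather than being fit once and frozen.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(42)
model = SGDClassifier(loss="log_loss")   # logistic regression trained by SGD
classes = np.array([0, 1])

for step in range(1, 9):
    # Pretend each batch is a fresh slice of observations about the world.
    X = rng.normal(size=(200, 5))
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
    if step > 1:
        # Score on the incoming batch before learning from it.
        print(f"step {step}: accuracy on new data = {model.score(X, y):.2f}")
    # partial_fit updates the existing model instead of refitting from scratch.
    model.partial_fit(X, y, classes=classes)
```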
Chelsea Kerwin, October 26, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
BA Insight and Its Ideas for Enterprise Search Success
October 25, 2016
I read “Success Factors for Enterprise Search.” The write up spells out a checklist to make certain that an enterprise search system delivers what the users want—on-point answers to their business information needs. The reason a checklist is necessary after more than 50 years of enterprise search adventures is a disconnect between what software can deliver and what the licensee and the users expect. Imagine figuring out how to get across the Grand Canyon only to encounter the Iguazu Falls.
The preamble states:
I’ll start with what absolutely does not work. The “dump it in the index and hope for the best” approach that I’ve seen some companies try, which just makes the problem worse. Increasing the size of the haystack won’t help you find a needle.
I think I agree, but the challenge is multiple piles of data. Some data are in haystacks; some are in oddball piles from the AS/400 that the old guy in accounting uses for an inventory report.
Now the checklist items:
- Metadata. To me, that’s indexing. Lousy indexing produces lousy search results in many cases. But “good indexing,” like the best pie at the state fair, is a matter of opinion. When the licensee, the users, and the search vendor talk about indexing, some parties in the conversation don’t know indexing from oatmeal. The cost of indexing can be high. Improving the indexing requires more money. The magic of metadata often leads back to a discussion of why the system delivers off-point results. Then there is talk about improving the indexing and its cost. The cycle can be more repetitive than a Kenmore 28132’s.
- Provide the content the user requires. Yep, that’s easy to say. But if content lives on a distributed network, it disappears or never gets fed into the search system. Putting the content into a repository creates another opportunity for spending money. Enterprise search that “federates” is easy to say, but users quickly discover what is missing from the index or what is stale.
- Deliver off-point results. The results create work by not answering the user’s question. From the days of STAIRS III to the latest whiz-kid solution from Sillycon Valley, users find that search and retrieval systems provide an opportunity to go back to traditional research tools such as asking the person in the next cube, calling a self-appointed expert, guessing, digging through paper documents, or hiring an information or intelligence professional to gather the needed information.
The checklist concludes with a good question, “Why is this happening?” The answer does not reside in the checklist. The answer does not reside in my Enterprise Search Report, The Landscape of Search, or any of the journal and news articles I have written in the last 35 years.
The answer is that vendors directly or indirectly reassure prospects that their software will provide the information a user needs. That’s an easy hook to plant in the customer who behaves like a tuna. The customer has a search system or experience with a search system that does not work. Pitching a better, faster, cheaper solution can close the deal.
The reality is that even the most sophisticated search and content processing systems end up in trouble. Search remains a very difficult problem. Today’s solutions do a few things better than STAIRS III did. But in the end, search software crashes and burns when it has to:
- Work within a budget
- Deal with structured and unstructured data
- Meet user expectations for timeliness, precision, recall, and accuracy (see the sketch after this list)
- Operate without specialized training
- Deliver zippy response time
- Avoid crashes and downtime due to maintenance
- Output usable, actionable reports without involving a programmer
- Provide an answer to a question.
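To make the precision and recall expectation in the list above concrete, here is a minimal sketch with an invented result list and invented relevance judgments; the document identifiers are hypothetical.

```python
# Hypothetical query: five documents returned, four documents actually relevant.
retrieved = {"doc1", "doc2", "doc3", "doc4", "doc5"}   # what the engine returned
relevant = {"doc2", "doc5", "doc7", "doc9"}            # what the user needed

true_positives = retrieved & relevant
precision = len(true_positives) / len(retrieved)   # share of results that are useful
recall = len(true_positives) / len(relevant)       # share of useful items found

print(f"precision = {precision:.2f}, recall = {recall:.2f}")
# precision = 0.40, recall = 0.50: a system can look acceptable on one measure
# and still send the user back to the person in the next cube.
```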
Smart software can solve some of these problems for specific types of queries. Enterprise search will benefit incrementally. For now, baloney about enterprise search continues to create churn. The incumbent loses the contract, and a new search vendor inks a deal. Months later, the new incumbent loses the contract, and the next round of vendors competes for it. This cycle has eroded the credibility of search and content processing vendors.
A checklist with three items won’t do much to close the credibility gap between what vendors say, what licensees hope will occur, and what users expect. The Grand Canyon is a big hole to fill. The Iguazu Falls can be tough to cross. Same with enterprise search.
Stephen E Arnold, October 25, 2016
HonkinNews for 25 October 2016 Now Available
October 25, 2016
This week’s video roundup of search, online, and content processing news is now available. Navigate to this link for seven minutes of plain talk about the giblets and goose feathers in the datasphere. This week’s program links Google’s mobile search index with the company’s decision to modify its privacy policies for tracking user actions. The program includes an analysis of Marissa Mayer’s managerial performance at Yahoo. Better browser history search swoops into the program too. Almost live from Harrod’s Creek in rural Kentucky. HonkinNews is semi educational, semi informative, and semi fun. Three programs at the end of the year will focus on Stephen E Arnold’s three monographs about Google.
Kenny Toth, October 25, 2016
IBM: Financial Report Keeps Up with Its Predecessors
October 25, 2016
IBM’s financial results for the third quarter of 2016 kept up with their predecessors.
The Wall Street Journal, October 18, 2016, said “IBM Profit, Sales Slip But New Units Grow.” (See page B1 in the Harrod’s Creek edition of the venerable business newspaper.) I noted this passage:
The Armonk, NY Company said Monday that third quarter revenue was $19.23 billion, down 0.3%…Big Blue said its profit fell 4% to $2.9 billion during the quarter ended in September amid weakness in its systems segment, which includes mainframe computer hardware and operating system software.
According to “Barclays Says IBM’s Q3 Not Much to Get Excited About”:
Going by the weakness in third quarter margins, Barclays said it’s led into believing that the company’s cloud business doesn’t have the scale to achieve margin at or above the corporate average. The firm termed it as not good, as the company’s strategic imperatives, including cloud, are starting to reaccelerate.
Here in Harrod’s Creek, the fans of blue chip stocks are wondering when IBM will reverse its streak of 18 consecutive quarters of revenue decline.
I suggested to the folks hanging out at the local car repair shop that they should ask Watson. Their response:
What’s Watson? Ain’t he the guy who lives in the next town over?
I was going to explain but decided to put oil in my car instead. It is getting old. I don’t think it can hang on much longer. I call it “Big Blue 2 too.” That sounds like blue tutu, doesn’t it?
Stephen E Arnold, October 25, 2016
Online Drugs Trade Needs Surgical Strikes
October 25, 2016
Despite the FBI’s shutdown of Silk Road in 2013, the online drug trade on the Dark Net is thriving. Only surgical strikes of military precision on vendors and marketplaces, using technological methods, can solve this problem.
The RAND Corporation, in its research paper titled Taking Stock of the Online Drugs Trade, says:
Illegal drug transactions on cryptomarkets have tripled since 2013, with revenues doubling. But at $12-21 (€10.5-18.5) million a month, this is clearly a niche market compared to the traditional offline market, estimated at $2.3 (€2) billion a month in Europe alone.
The research paper’s primary goals were, first, to determine the size and scope of cryptomarkets and, second, to devise avenues for law enforcement agencies to intervene in these illegal practices. Though the report covered all of Europe, the role of the Netherlands in particular was studied, because the Netherlands has the highest rate of consumption of drugs acquired through cryptomarkets.
Some interesting findings of the report include:
- Though revenues have doubled, drug cryptomarkets remain a niche, generating revenues of about $21 million per month compared with an estimated $2.1 billion for the offline trade
- Cannabis is still the most in demand, followed by stimulants such as cocaine and ecstasy-type drugs
- Vendors from the US, Australia, Canada, and Western Europe dominate the online marketplaces
Apart from the conventional methods of disrupting the drug trade (dismantling logistics, undercover operations, and taking down marketplaces), the only new method suggested is the use of Big Data techniques.
Cryptomarkets are going to thrive, and the only way to tackle this threat is by following the money (in this case, the cryptocurrencies). But who is going to bell the cat?
Vishal Ingole, October 25, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Trending Topics: Google and Twitter Compared
October 25, 2016
For those with no time to browse through the headlines, tools that aggregate trending topics can provide a cursory way to keep up with the news. The blog post from communications firm Cision, “How to Find Trending Topics Like an Expert,” examines the two leading trending topic tools—Google’s and Twitter’s. Each approaches its tasks differently, so the best choice depends on the user’s needs.
Though the Google Trends homepage is limited, according to writer Jim Dougherty, one can get further with its extension, the Google Trends Explore page. He elaborates:
If we go to the Google Trends Explore page (google.com/trends/explore), our sorting options become more robust. We can sort by the following criteria:
*By country (or worldwide)
*By time (search within a customized date range – minimum: past hour, maximum: since 2004)
*By category (arts and entertainment, sports, health, et cetera)
*By Google Property (web search, image search, news search, Google Shopping, YouTube)
You can also use the search feature via the trends page or explore the page to search the popularity of a search term over a period (custom date ranges are permitted), and you can compare the popularity of search terms using this feature as well. The Explore page also allows you to download any chart to a .csv file, or to embed the table directly to a website.
The write-up goes on to note that there are no robust third-party tools to parse data found with Google Trends/Explore, because the company has not made the API publicly available.
Unlike Google, we’re told, Twitter does not make it intuitive to find and analyze trending topics. However, its inclusion of location data can make Twitter a valuable source for this information, if you know how to find it. Dougherty suggests a work-around:
To ‘analyze’ current trends on the native Twitter app, you have to go to the ‘home’ page. In the lower left of the home page you’ll see ‘trending topics’ and immediately below that a ‘change’ button which allows you to modify the location of your search.
Location is a huge advantage of Twitter trends compared to Google: Although Google’s data is more robust and accessible in general, it can only be parsed by country. Twitter uses Yahoo’s GeoPlanet infrastructure for its location data so that it can be exercised at a much more granular level than Google Trends.
Since Twitter does publicly share its trending-topics API, there are third-party tools one can use with Twitter Trends, like TrendoGate, TrendsMap, and ttHistory. The post concludes with a reminder to maximize the usefulness of data with tools that “go beyond trends,” like (unsurprisingly) the monitoring software offered by Dougherty’s company. Paid add-ons may be worth it for some enterprises, but we recommend you check out what is freely available first.
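As a rough sketch of how that public API can be queried, here is a minimal example using Python’s requests library against the classic trends/place endpoint. The bearer token is a placeholder, and the WOEID of 1 means “worldwide”; a Yahoo GeoPlanet WOEID for a city or country can be substituted for finer granularity.

```python
import requests

BEARER_TOKEN = "YOUR_APP_BEARER_TOKEN"   # placeholder: issued for a Twitter app
WOEID = 1                                # Yahoo GeoPlanet id; 1 = worldwide

resp = requests.get(
    "https://api.twitter.com/1.1/trends/place.json",
    params={"id": WOEID},
    headers={"Authorization": f"Bearer {BEARER_TOKEN}"},
    timeout=10,
)
resp.raise_for_status()

# The endpoint returns a one-element list; "trends" holds the topic entries.
for trend in resp.json()[0]["trends"]:
    print(trend["name"], trend.get("tweet_volume"))
```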
Cynthia Murrell, October 25, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph