March 23, 2017
The article titled Silicon Valley Hedge Fund Takes Over Wall Street With AI Trader on Bloomberg explains how Sentient Technologies Inc. plans to take the human error out of the stock market. Babak Hodjat co-founded the company and spent the past 10 years building an AI system capable of reviewing billions of pieces of data and learning trends and techniques to make money by trading stocks. The article states that the system is based on evolution,
According to patents, Sentient has thousands of machines running simultaneously around the world, algorithmically creating what are essentially trillions of virtual traders that it calls “genes.” These genes are tested by giving them hypothetical sums of money to trade in simulated situations created from historical data. The genes that are unsuccessful die off, while those that make money are spliced together with others to create the next generation… Sentient can squeeze 1,800 simulated trading days into a few minutes.
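For readers curious what "genes" that die off or get spliced together look like in practice, here is a minimal sketch of a generic evolutionary loop. It is not Sentient's system. The synthetic price series, the weighted-returns trading rule, and every function name (`make_prices`, `fitness`, `splice`, `mutate`, `evolve`) are assumptions made purely for illustration.

```python
import random

def make_prices(days, seed=0):
    """Synthetic random-walk prices; a hypothetical stand-in for historical data."""
    rng = random.Random(seed)
    prices = [100.0]
    for _ in range(days - 1):
        prices.append(max(1.0, prices[-1] * (1 + rng.gauss(0.0005, 0.01))))
    return prices

def fitness(gene, prices, cash=10_000.0):
    """Give a gene hypothetical money, replay the history, return final wealth."""
    k = len(gene)
    shares = 0.0
    for t in range(k, len(prices)):
        # Most recent k daily returns, newest first.
        returns = [prices[t - i] / prices[t - i - 1] - 1 for i in range(k)]
        signal = sum(w * r for w, r in zip(gene, returns))
        if signal > 0 and cash > 0:        # buy with all available cash
            shares, cash = cash / prices[t], 0.0
        elif signal < 0 and shares > 0:    # sell the whole position
            cash, shares = shares * prices[t], 0.0
    return cash + shares * prices[-1]

def splice(a, b, rng):
    """Single-point crossover: head of one parent, tail of the other."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(gene, rng, rate=0.1):
    """Randomly nudge some weights so new behavior can appear."""
    return [w + rng.gauss(0, 0.5) if rng.random() < rate else w for w in gene]

def evolve(prices, pop_size=30, gene_len=5, generations=10, seed=1):
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(gene_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda g: fitness(g, prices), reverse=True)
        survivors = ranked[: pop_size // 2]          # unsuccessful genes die off
        children = [
            mutate(splice(rng.choice(survivors), rng.choice(survivors), rng), rng)
            for _ in range(pop_size - len(survivors))
        ]
        population = survivors + children            # the next generation
    return max(population, key=lambda g: fitness(g, prices))

if __name__ == "__main__":
    history = make_prices(200)
    best = evolve(history)
    print("best gene:", [round(w, 2) for w in best])
    print("final wealth:", round(fitness(best, history), 2))
```

Because the top half of each generation survives unchanged, the best score never gets worse from one generation to the next. Scoring well on historical data does not guarantee the same result in live trading.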
Hodjat believes that handing the reins over to a machine is wise because it eliminates bias and emotions. But outsiders wonder whether investors will be willing to put their trust entirely in a system. Other hedge funds like Man AHL rely on machine learning too, but to nowhere near the extent of Sentient. As Sentient brings in outside investors later this year, the success of the platform will become clearer.
Chelsea Kerwin, March 23, 2017
March 14, 2017
An article at Recode, “Watson Claims to Predict Cancer, but Who Trained It To Think,” reminds us that even the most successful AI software was trained by humans, using data collected and input by humans. We have developed high hopes for AI, expecting it to help us cure disease, make our roads safer, and put criminals behind bars, among other worthy endeavors. However, we must not overlook the datasets upon which these systems are built, and the human labor used to create them. Writer (and CEO of DaaS firm Captricity) Kuang Chen points out:
The emergence of large and highly accurate datasets have allowed deep learning to ‘train’ algorithms to recognize patterns in digital representations of sounds, images and other data that have led to remarkable breakthroughs, ones that outperform previous approaches in almost every application area. For example, self-driving cars rely on massive amounts of data collected over several years from efforts like Google’s people-powered street canvassing, which provides the ability to ‘see’ roads (and was started to power services like Google Maps). The photos we upload and collectively tag as Facebook users have led to algorithms that can ‘see’ faces. And even Google’s 411 audio directory service from a decade ago was suspected to be an effort to crowdsource data to train a computer to ‘hear’ about businesses and their locations.
Watson’s promise to help detect cancer also depends on data: decades of doctor notes containing cancer patient outcomes. However, Watson cannot read handwriting. In order to access the data trapped in the historical doctor reports, researchers must have had to employ an army of people to painstakingly type and re-type (for accuracy) the data into computers in order to train Watson.
Chen notes that more and more workers in regulated industries, like healthcare, are mining for gold in their paper archives—manually inputting the valuable data hidden among the dusty pages. That is a lot of data entry. The article closes with a call for us all to remember this caveat: when considering each new and exciting potential application of AI, ask where the training data is coming from.
Cynthia Murrell, March 14, 2017
March 13, 2017
It will require training Canada’s youth in design and the arts, as well as STEM subjects, if that country is to excel in today’s big-data world. That is the advice of a trio of academic researchers in that country, Patricio Davila, Sara Diamond, and Steve Szigeti, who declare, “There’s No Big Data Without Intelligent Interface” at the Globe and Mail. The article begins by describing why data management is now a crucial part of success throughout society, then emphasizes that we need creative types to design intuitive user interfaces and effective analytics representations. The researchers explain:
Here’s the challenge: For humans, data are meaningless without curation, interpretation and representation. All the examples described above require elegant, meaningful and navigable sensory interfaces. Adjacent to the visual are emerging creative, applied and inclusive design practices in data “representation,” whether it’s data sculpture (such as 3-D printing, moulding and representation in all physical media of data), tangible computing (wearables or systems that manage data through tactile interfaces) or data sonification (yes, data can make beautiful music).
Infographics is the practice of displaying data, while data visualization or visual analytics refers to tools or systems that are interactive and allow users to upload their own data sets. In a world increasingly driven by data analysis, designers, digital media artists, and animators provide essential tools for users. These interpretive skills stand side by side with general literacy, numeracy, statistical analytics, computational skills and cognitive science.
We also learn about several specific projects undertaken by faculty members at OCAD University, where our three authors are involved in the school’s Visual Analytics Lab. For example, the iCity project addresses transportation network planning in cities, and the Care and Condition Monitor is a mobile app designed to help patients and their healthcare providers better work together in pursuit of treatment goals. The researchers conclude with an appeal to their nation’s colleges and universities to develop programs that incorporate data management, data numeracy, data analysis, and representational skills early and often. Good suggestion.
Cynthia Murrell, March 13, 2017
March 2, 2017
One of Google’s biggest rivals is Yandex, at least in Russia. Yandex is a Russian-owned and operated search engine and, depending on the statistics, is more popular in Russia than Google. It goes without saying that a search engine built and designed by native speakers has a significant advantage over foreign competition, and it looks like France wants a chance to beat Google. Search Engine Journal reports that, “Qwant, A French Search Engine, Thinks It Can Take On Google-Here’s Why.”
Qwant was only founded in 2013, and it has grown to serve twenty-one million monthly users in thirty countries. The French search engine has seen 70% growth each year and it will see more with its recent integration with Firefox and a soon-to-be launched mobile app. Qwant is very similar to DuckDuckGo in that it does not collect user data. It also boasts more search categories than just news, images, and video; these include music, social media, cars, health, and others. Qwant has an interesting philosophy:
The company also has a unique philosophy that artificial intelligence and digital assistants can be educated without having to collect data on users. That’s a completely different philosophy than what is shared by Google, which collects every bit of information it can about users to fuel things like Google Home and Google Allo.
Qwant still wants to make a profit with pay-per-click and future partnerships with eBay and TripAdvisor, but it will do so without compromising a user’s privacy. Qwant has a unique approach to search and building AI assistants, but it has a long way to go before it reaches Google heights.
They need to engage more users not only on laptops and computers but also mobile devices. They also need to form more partnerships with other browsers.
Bonne chance, Qwant! But could you share how you plan to make AI assistants without user data?
Whitney Grace, March 2, 2017
February 24, 2017
We have good news and bad news for fans of government transparency. In its Secrecy News blog, the Federation of American Scientists reports, “Number of New Secrets in 2015 Near Historic Low.” Writer Steven Aftergood explains:
The production of new national security secrets dropped precipitously in the last five years and remained at historically low levels last year, according to a new annual report released today by the Information Security Oversight Office.
There were 53,425 new secrets (‘original classification decisions’) created by executive branch agencies in FY 2015. Though this represents a 14% increase from the all-time low achieved in FY 2014, it is still the second lowest number of original classification actions ever reported. Ten years earlier (2005), by contrast, there were more than 258,000 new secrets.
The new data appear to confirm that the national security classification system is undergoing a slow-motion process of transformation, involving continuing incremental reductions in classification activity and gradually increased disclosure. …
Meanwhile, ‘derivative classification activity,’ or the incorporation of existing secrets into new forms or products, dropped by 32%. The number of pages declassified increased by 30% over the year before.
A marked decrease in government secrecy—that’s the good news. On the other hand, the report reveals some troubling findings. For one thing, costs are not going down alongside classifications; in fact, they rose by eight percent last year. Also, response times to mandatory declassification requests (MDRs) are growing, leaving over 14,000 such requests to languish for over a year each. Finally, fewer newly classified documents carry the “declassify in ten years or less” specification, which means fewer items will become declassified automatically down the line.
Such red-tape tangles notwithstanding, the reduction in secret classifications does look like a sign that the government is moving toward more transparency. Can we trust the trajectory?
February 22, 2017
Oh, the wonders of modern technology. Now, TechCrunch informs us, “This Amazing Search Engine Automatically Face Swaps You Into Your Image Results.” Searching may never be the same. Writer Devin Coldewey introduces us to Dreambit, a search engine that automatically swaps your face into select image-search results. The write-up includes some screenshots, and the results can be a bit surreal.
The system analyzes the picture of your face and determines how to intelligently crop it to leave nothing but your face. It then searches for images matching your search term — curly hair, for example — and looks for ‘doppelganger sets, images where the subject’s face is in a similar position to your own.
A similar process is done on the target images to mask out the faces and intelligently put your own in their place — and voila! You with curly hair, again and again and again. […]
It’s not limited to hairstyles, either: put yourself in a movie, a location, a painting — as long as there’s a similarly positioned face to swap yours with, the software can do it. A few facial features, like beards, make the edges of the face difficult to find, however, so you may not be able to swap with Rasputin or Gandalf.
Behind the nifty technology is the University of Washington’s Ira Kemelmacher-Shlizerman, a researcher in computer vision, facial recognition, and augmented reality. Her work could have more sober applications, too, like automated age-progressions to help with missing-person cases. Though the software is still in beta, it is easy to foresee a wide array of uses ahead. Now, more than ever, don’t believe everything you see.
Cynthia Murrell, February 22, 2017
February 21, 2017
A recent study seems to confirm what some have suspected: “Research Shows Gender Bias in Google’s Voice Recognition,” reports the Daily Dot. Not that this is anything new. Writer Selena Larson reminds us that voice recognition tech has a history of understanding men better than women, from a medical tracking system to voice-operated cars. She cites a recent study by linguist researcher Rachael Tatman, who found that YouTube’s auto captions performed better on male voices than female ones by about 13 percent—no small discrepancy. (YouTube is owned by Google.)
Though no one is accusing the tech industry of purposely rendering female voices less effective, developers probably could have avoided this problem with some forethought. The article explains:
‘Language varies in systematic ways depending on how you’re talking,’ Tatman said in an interview. Differences could be based on gender, dialect, and other geographic and physical attributes that factor into how our voices sound. To train speech recognition software, developers use large datasets, either recorded on their own, or provided by other linguistic researchers. And sometimes, these datasets don’t include diverse speakers.
Tatman recommends a purposeful and organized approach to remedying the situation. Larson continues:
Tatman said the best first step to address issues in voice tech bias would be to build training sets that are stratified. Equal numbers of genders, different races, socioeconomic statuses, and dialects should be included, she said.
Automated technology is developed by humans, so our human biases can seep into the software and tools we are creating to supposedly to make lives easier. But when systems fail to account for human bias, the results can be unfair and potentially harmful to groups underrepresented in the field in which these systems are built.
Indeed, that’s the way bias works most of the time—it is more often the result of neglect than of malice. To avoid it requires realizing there may be a problem in the first place, and working to avoid it from the outset. I wonder what other technologies could benefit from that understanding.
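The stratified training sets Tatman recommends can be illustrated with a short sketch. This is one simple reading of "stratified" (an equal number of recordings drawn from each group), not the method used by Tatman or by Google. The record fields (`clip`, `gender`) and the function name `stratified_sample` are hypothetical.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(records, key, per_group, seed=0):
    """Draw the same number of records from every group defined by `key`."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec)

    # Refuse to silently under-represent a group that has too few examples.
    too_small = {g: len(members) for g, members in groups.items()
                 if len(members) < per_group}
    if too_small:
        raise ValueError(f"not enough samples for: {too_small}")

    sample = []
    for group in sorted(groups):
        sample.extend(rng.sample(groups[group], per_group))
    return sample

if __name__ == "__main__":
    # Hypothetical, deliberately lopsided pool: 80 male clips, 20 female clips.
    pool = ([{"clip": f"m{i}.wav", "gender": "male"} for i in range(80)]
            + [{"clip": f"f{i}.wav", "gender": "female"} for i in range(20)])
    balanced = stratified_sample(pool, key="gender", per_group=20)
    print(Counter(rec["gender"] for rec in balanced))
```

To balance on several attributes at once, such as gender and dialect, the same idea applies with a combined key. The smallest group caps the sample size, so collecting more recordings from underrepresented speakers matters as much as the sampling step.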
Cynthia Murrell, February 21, 2017
February 20, 2017
Analytics are catching up to content. In a recent ZDNet article, Digimind partners with Ditto to add image recognition to social media monitoring, we are reminded that images reign supreme on social media. Between Pinterest, Snapchat and Instagram, messages are often conveyed through images as opposed to text. Capitalizing on this, intelligence software company Digimind has announced a partnership with Ditto Labs to introduce image-recognition technology into its social media monitoring software, called Digimind Social. We learned,
The Ditto integration lets brands identify the use of their logos across Twitter no matter the item or context. The detected images are then collected and processed on Digimind Social in the same way textual references, articles, or social media postings are analysed. Logos that are small, obscured, upside down, or in cluttered image montages are recognised. Object and scene recognition means that brands can position their products exactly where there customers are using them. Sentiment is measured by the amount of people in the image and counts how many of them are smiling. It even identifies objects such as bags, cars, car logos, or shoes.
It was only a matter of time before these types of features emerged in social media monitoring. For years now, images have been shown to increase engagement even on platforms that began focused more on text. Will we see more watermarked logos on images? More creative ways to visually identify brands? Both are likely and we will be watching to see what transpires.
Megan Feil, February 20, 2017
February 17, 2017
The article and delightful infographic on BA Insight titled Stats Show Enterprise Search is Still a Challenge build an interesting picture of the present challenges and opportunities surrounding enterprise search, or at least allude to them with the numbers offered. The article states,
As referenced by AIIM in an Industry Watch whitepaper on search and discovery, three out of four people agree that information is easier to find outside of their organizations than within. That is startling! With a more effective enterprise search implementation, these users feel that better decision-making and faster customer service are some of the top benefits that could be immediately realized.
What follows is a collection of random statistics about enterprise search. We would like to highlight one stat in particular: 58% of those investing in enterprise search get no payback after one year. In spite of the clear need for improvements, it is difficult to argue for a technology that is so long-term in its ROI, and so shaky where it is in place. However, there is a massive impact on efficiency when employees waste time looking for the information they need to do their jobs. In sum: you can’t live with it, and you can’t live (productively) without it.
Chelsea Kerwin, February 17, 2017
February 16, 2017
Enterprises could be doing so much more to protect themselves from cyber attacks, asserts Auriga Technical Manager James Parry in his piece, “The Dark Side: Mining the Dark Web for Cyber Intelligence” at Information Security Buzz. Parry informs us that most businesses fail to do even the bare minimum they should to protect against hackers. This minimum, as he sees it, includes monitoring social media and underground chat forums for chatter about their company. After all, hackers are not known for their modesty, and many do boast about their exploits relatively openly. Most companies just aren’t bothering to look in that direction. Such an effort can also reveal those impersonating a business by co-opting its slogans and trademarks.
Companies who wish to go beyond the bare minimum will need to expand their monitoring to the dark web (and expand their data-processing capacity). From “shady” social media to black markets to hacker libraries, the dark web can reveal much about compromised data to those who know how to look. Parry writes:
Yet extrapolating this information into a meaningful form that can be used for threat intelligence is no mean feat. The complexity of accessing the dark web combined with the sheer amount of data involved, correlation of events, and interpretation of patterns is an enormous undertaking, particularly when you then consider that time is the determining factor here. Processing needs to be done fast and in real-time. Algorithms also need to be used which are able to identify and flag threats and vulnerabilities. Therefore, automated event collection and interrogation is required and for that you need the services of a Security Operations Centre (SOC).
The next generation SOC is able to perform this type of processing and detect patterns, from disparate data sources, real-time, historical data etc. These events can then be threat assessed and interpreted by security analysts to determine the level of risk posed to the enterprise. Forewarned, the enterprise can then align resources to reduce the impact of the attack. For instance, in the event of an emerging DoS attack, protection mechanisms can be switched from monitoring to mitigation mode and network capacity adjusted to weather the attack.
Note that Parry’s company, Auriga, supplies a variety of software and R&D services, including a Security Operations Center platform, so he might be a tad biased. Still, he has some good points. The article notes SOC insights can also be used to predict future attacks and to prioritize security spending. Typically, SOC users have been big businesses, but, Parry notes, scalable and entry-level packages are making such tools available to smaller companies.
From monitoring mainstream social media to setting up an SOC to comb through dark web data, tools exist to combat hackers. The question, Parry observes, is whether companies will face the growing need to embrace those methods.
Cynthia Murrell, February 16, 2017