Text Classification: Established Methods Deliver Good Enough Results

April 26, 2018

Short honk: If you are a cheerleader for automatic classification of text centric content objects, you are convinced that today’s systems are home run hitters. If you have some doubts, you will want to scan the data in “Machine Learning for Text Categorization: Experiments Using Clustering and Classification.” The paper was free when I checked at 9:20 am US Eastern time. For the test sets, Latent Dirichlet Allocation performed better than other widely used methods. Worth a look. From my vantage point in Harrod’s Creek, automated processes, regardless of method, perform in a manner one expert explained to me at Cebit several years ago: “Systems are good enough.” Improvements are now incremental and, like getting the last few percentage ticks of pollutants out of a catalytic converter, an expensive and challenging engineering task.

Stephen E Arnold, April 26, 2018

Does Smart Software Understand Kid Vids?

April 26, 2018

The growth of AI and predictive analytics across the spectrum has become a universal rah rah. Super powered computers and their data crunching power are being utilized by industries great and small. However, the producers of AI technology might not be getting rich off this revolution. We learned more from a recent Market Watch story, “IBM Earnings Show AI is Not Paying Off Yet.”

According to the story:

“’The bulls were hoping for a clean modest beat on this key growth segment, which represents the underpinnings of the IBM turnaround story in 2018 and beyond,’ Ives said in a note to clients. In an email, Ives said he does not have an estimate for Watson itself. ‘It’s a major contributing factor to strategic imperatives and helping drive double-digit growth…’”

Despite these less than stellar results, the big names in tech aren’t getting scared away by AI yet. In fact, it is still a boom investment time. Intel, for one, is betting a large chunk of cash on AI. We will be watching this development more closely, since we all know that AI can be the greatest product in the world, but if it keeps losing money it might just end up in the graveyard. (Unlikely, we know.)

But, and there seems to be a “but” when it comes to the capabilities of smart software, we noticed that Google seems to be relying on humans to make sure that children’s videos are not violent, chock full of objectionable material, or inappropriate for kiddie viewing. According to “For the First Time, Parents Will Be Able to Limit YouTube Kids to Human-Reviewed Channels and Recommendations,”

The new features will allow parents to lock down the YouTube Kids app so it only displays those channels that have been reviewed by humans, not just algorithms. And this includes both the content displayed within the app itself, as well as the recommended videos. A later update will allow parents to configure which videos and channels, specifically, can be viewed.

A few observations seem to be warranted:

  1. Google’s vaunted smart software cannot determine what’s appropriate for children. Therefore, Google is now assuming the role that old school, chain smoking, ink stained editors once performed. Back to the past?
  2. If the smart software cannot figure out what video is okay for children, how accurate is Google’s ad matching software? Is it possible that the ad matching system is able to perform in a “good enough” manner? Will advertisers lose confidence that their money is putting messages in front of the “right” eyeballs?
  3. Perhaps Google has caught the same case of sniffles that IBM Watson has been suffering? The failure of smart software with regard to kid vids suggests that hyperbole is not the same as actual performance.

The kid vid matter is as significant as the Facebook Cambridge Analytica matter. Could these be different facets of the same assumption that technology is a go getter?

Stephen E Arnold, April 26, 2018

Listening and Voice Search: A Happy Tech Couple

April 26, 2018

Voice search is the next big thing in the search industry. This is a pretty universally accepted trend among tech thinkers. With that in mind, it’s a good time to look at your own personal use and your business uses for search and inquire whether or not you are ready. Chances are, you aren’t. We learned more from a recent article in The Next Web, “By 2020 30% of Search Will Be Voice Conducted. Here’s What That Means for Your Business.”

According to the story:

“I would also invest in trying to get clients to review my restaurant on Yelp and Tripadvisor so that when people click through, they will see relevant and recent information on my restaurant. If I were providing services, I would make an effort to get listed in Yelp and Google My Business to increase my chances of showing up.”

Another big way to prepare that experts are recommending is to think about SEO in a totally different way. The ways we search through our fingertips and through our voiceboxes are totally different. In short, we tend to say less than we type when searching so SEO will have to be even more precise than before.

However, “Amazon’s Alexa Had a Flaw That Let Eavesdroppers Listen In” reminds Beyond Search that in order to answer a question, the devices have to listen. Amazon’s Alexa had a “flaw” which allowed third parties to use the device like an old school “bug.” According to the write up, Amazon fixed this problem.

How many other always on listening devices are just listening, analyzing, and sending data into a federated database?

Toss in online search and cross correlation, and one has an intriguing way to gather intelligence.

Stephen E Arnold, April 26, 2018

Social Data Donated to Internet Archive

April 25, 2018

The plot thickened around Facebook’s user-data sharing practices when it was revealed they gave access to social-network search platform Profile Engine. Now, the company itself announces, “The Profile Engine Has Now Been Donated to the Internet Archive.” The post relates the company’s perspective on its subsequent dispute with Facebook, then explains why they are making this donation. It asserts:

“We sued Facebook, fought hard in a David and Goliath battle and won a good settlement. One day, maybe we’ll have time to tell the whole story – you’d be utterly shocked what goes on inside Facebook – what you’ve already heard is just the tip of the iceberg. If you have a Facebook account, we strongly recommend that you delete it completely, without delay. Learn more about Facebook.”

We also noted this statement:

“We are freely and lawfully transferring this database to the Internet Archive (archive.org) as they have a long track record as a suitable, responsible long term custodian and we have the legal right to do so.

“Making this data freely available and preserving it serves many purposes. Here are a few:

* Helping to reunite old friends with powerful search tools (Facebook don’t provide powerful search tools because if you have to search through hundreds of pages of profiles then you view more ads than if the tools take you straight to who you want).

* Helping you to find and meet new people with common interests

* Exposing the interests and group memberships of politicians and public figures (What did they really like ten years ago before they were famous?)

* This snapshot of the early days of social networking will be invaluable to Genealogists, Social Historians and perhaps even Archaeologists in ten, fifty or even 1000 years time.

* Most importantly, this will break Facebook’s monopoly over social data. People chose to make this data free and public, yet Facebook still charge for it. Not any more!”

The balance between privacy and rights to information seems to get trickier every year, and many are vexed that the Profile Engine was allowed to collect user profiles in the first place. Whether the value of this archived information outweighs the perceived violations of trust remains to be seen. For its part, Profile Engine asks us to use this data “responsibly and respectfully.” Eyerys once reported the platform to be partly owned by the Auckland University of Technology when it began as a Facebook search tool. Profile Technology was founded in 2007, and is based in Auckland, New Zealand.

Cynthia Murrell, April 25, 2018

IBM at Bat with Blockchain

April 25, 2018

What’s the difference between innovation and desperation?

About a month ago, I read “IBM Hit With Massive Age Discrimination Charges, Undermining CEO Rometty.”

According to the story:

“The news once again will raise the question about the tenure of CEO Ginni Rometty, who has presided over the demise of IBM. The company has suffered quarter after quarter of falling revenue. She has tried unsuccessfully to make IBM a leader in cloud computing. In the meantime, its older software, services and hardware businesses have suffered.”

Is the idea that old timers are not able to deliver the zip zip ideas that IBM needs? One of the Beyond Search team said at lunch that management has delivered another setback for IBM. A recent story said that as the company aims to position its enterprise search for the future, it is acting as its own worst enemy in the planning stages.

I noticed a story this morning which illustrates another home run swing for Big Blue. “Blockchain Gets Real? IBM Advances Projects With Walmart & Others” explains that:

IBM has been working on blockchain technology for about three years, and it officially launched a blockchain business about 16 months ago, Gopinath [a vice president of blockchain solutions and research at IBM] says. About 1,500 IBMers are now working on blockchain products and consulting services, he says. Big Blue has developed a blockchain software platform built on open-source Hyperledger software from the Linux Foundation; IBM also helps clients set up and manage their blockchain systems. Thus far, IBM has worked on 400-plus blockchain projects spanning retail, financial services, healthcare, media, the supply chain, and more.

Watson was supposed to be a revenue game changer at IBM. Now IBM is beating the blockchain drum. Can IBM leverage open source technology to make the company a revenue and earnings engine? Let’s ask Watson. Who’s on first?

Patrick Roland, April 25, 2018

An Interesting Use of Instagram

April 24, 2018

There is an opioid dealer nearby. In fact, this drug kingpin is not standing on the corner or lurking on college campuses; this supplier is right at your fingertips. Thanks to a recent article, the plague of drug sales through popular and public social media platforms has caught the attention of some powerful people. We learned about these developments in a recent Wired article, “One Woman Got Facebook to Police Opioid Sales on Instagram.”

While it’s a little confusing, the basic story goes that one woman who discovered opioid sales on Instagram (which is owned by Facebook) reached out to Facebook, urging them to take action, through a rival social platform, Twitter. The tactic worked, even getting the FDA involved.

According to the story:

“It shouldn’t take this much effort to get people to realize that you have some responsibility for the stuff on your platform…A 13 year old could do this search and realize there’s bad stuff on your platform — and probably has — you don’t need the commissioner of the FDA to tell you that.”

However, the act of policing drug sales on social media platforms and the dark web is not as easy as one might think. Yes, the platforms shut down offending accounts, but beyond that there is little that can be done. According to the story, Instagram outlawed certain hashtags, as it had done before: “Instagram previously restricted the drug-related hashtags, #Xanax and #Xanaxbar and banned #weedforsale and #weed4sale.”

It’s a small step, but hopefully one that will lead to greater and greater progress. For more information, learn more about CyberOSINT: Next Generation Information Access here.

Patrick Roland, April 24, 2018

Terror Database Enriched with Social Media Pix

April 24, 2018

A question is surging through the tech and espionage communities after a recent article that has some big implications in both worlds. That’s because a company formed by ex-spies is using facial recognition software to create a database of images from social networks like Facebook. This raises a ton of questions, but they all start with the recent Daily Mail piece, “Surveillance Company Run by Ex-Spies is Harvesting Facebook Photos.”

According to the story, the program is called Face-Int and they have a specific goal in mind:

“Its creators say the software could lead to the identification of terror suspects, captured in promotional and other material posted online…

“Experts are concerned that the company’s efforts extend beyond this remit, however, and into the political realm…’It raises the stakes of face recognition – it intensifies the potential negative consequences,’ Jay Stanley, senior policy analyst at the American Civil Liberties Union, told Forbes.”

While it is admirable that a company is aiming to help capture terrorists through social media, it leaves one to worry about several things. For starters, it’s pretty safe to assume many terrorists will not appear on social media or, at the least, not without something covering their face. Thus, accuracy becomes a concern. However, the larger concern is that private, law abiding citizens are also getting funneled into this database. The opportunities for invading one’s privacy are alarmingly high. Time will tell how this shakes out, but we have a hunch the general public will never be told.

Patrick Roland, April 24, 2018

DarkCyber for April 24, 2018, Now Available

April 24, 2018

DarkCyber for April 24, 2018, is now available at www.arnoldit.com/wordpress and on Vimeo at https://vimeo.com/266003727 .

Stephen E Arnold’s DarkCyber is a weekly video news and analysis program about the Dark Web and lesser known Internet services.

This week’s lead story focuses on universities as unwitting accomplices for student cyber criminals. Five students at Manchester University began selling drugs via SilkRoad. The students “graduated” to their own brand and branched out. Before UK law enforcement shut down the students’ operation, more than 6,000 drug sales were completed. Plus, university computer systems have become targets for malicious crypto currency mining operations. A student can take classes in computer science and be up and scamming quickly.

Stephen E Arnold, producer of DarkCyber and author of “CyberOSINT: Next Generation Information Access” said: “The combination of easy access to high-value information about programming and computer systems plus the lure of easy money can turn a good student into a good criminal. Universities, despite their effort to implement more robust security, are targets for bad actors. Students can operate Dark Web businesses from their campus residence. Outsiders can exploit the institution’s computer system in order to install crypto currency mining software. At this time, colleges and universities are in a cat and mouse game with high stakes and stiff penalties for students, administrators, and school security professionals.”

DarkCyber revisits the security of virtual private networks. This week’s program answers a viewer’s question about improving the security of a VPN. In addition to changing the ports the VPN uses, DarkCyber points out that a tech savvy individual can operate his or her own VPN or use additional specialized software to shore up the often leaky security many VPN services provide.

Vendors of “policeware” are generally unknown to most tech professionals. DarkCyber highlights a new, UK based company doing business as Grey Heron. The company offers a range of cyber security services. The firm’s staff appears to include individuals once affiliated with the Hacking Team, another policeware vendor which found itself the victim of a cyber attack two years ago. If Grey Heron taps the Hacking Team’s technical talent, the firm may make an impact in this little known sector of the software market.

The final story in DarkCyber for April 24, 2018, highlights several findings from a study sponsored by Bromium, a cyber security company. The researchers at a UK university gathered data which provide some surprising and interesting information about the Dark Web. For example, the new report asserts that more than $200 billion is laundered on the Dark Web in a single year. If true, these newly revealed research data provide hard metrics about the role of digital currency in today’s online economy.

Beginning in May 2018, coverage of the Dark Web and related subjects will be increased within Beyond Search.

Kenny Toth, April 24, 2018

LinkedIn Identifies Worker Weakness

April 23, 2018

I read “LinkedIn CEO Jeff Weiner Just Revealed Employees Lack This 1 Surprising Job Skill More Than Any Other.” I found the write up amazing. The table below identifies the steps one must take to become an effective communicator. Note that there are two columns in the table. I have commented on each of the “tips” with a reference to LinkedIn’s own service, used by an estimated 200 million professionals. How is LinkedIn doing? Decide for yourself.

Weiner Tip | Addled Goose View
Really listen | LinkedIn does its best to distract via annoying notifications, a “dark” interface, and obscured functions
Exude confidence | How many “contacts” do you have? You need more. Without more, your confidence will suffer.
Be a non verbal ninja | Nothing is better than zero feedback from LinkedIn regarding issues with its system
Be concise | Nothing is more concise than zero way to reach LinkedIn staff
Start from a place of respect | Nothing shows respect more than zero user support

Can you locate a list of groups so that you can find those which interest you? Start there. Oh, don’t forget to sign up for the monthly “real” service. A combination of Microsoft and LinkedIn—an ideal couple.

Stephen E Arnold, April 23, 2018

Regulating Facebook and Unexpected Consequences

April 23, 2018

After Mark Zuckerberg’s mostly frothy and somewhat entertaining testimony before the House and the Senate, what are we left with? Some tea leaves are saying that Facebook will likely be permitted to self regulate.

What happens if governments step in? One commentator worries not just for our privacy, but for society as a whole. We learned more from a recent Guardian story, “Facebook is a Tyranny and Our Government Isn’t Built to Stop it.”

According to the story:

“Many ideas for regulatory reforms to protect privacy fail to address the governance problems we face. Our government was not built to counter the tyranny of the global corporation….

“With the fervor of the early US founders, we need to debate and adopt a new structure for self-government that is strong enough to counter the global monopolies of the 21st century. Our liberty is at stake.”

Is Facebook really that serious of a threat? We’re ones to pump the brakes a little on this subject. However, that doesn’t mean that social media doesn’t need to change. Many people are inventing suggestions for ways in which Washington can regulate this world. Many are bunk, but some are legitimately solid. One that we have been leaning toward is a Digital Consumer Protection Agency. This keeps the senators and members of Congress, who proved how shockingly little they know about social media when they grilled Zuckerberg, out of the fray.

Allegedly accurate information surfaced in Buzzfeed. The article “Cambridge Analytica Data Scientist Aleksandr Kogan Wants You To Know He’s Not A Russian Spy” will certainly spark some additional discussion of governance at Facebook and Cambridge University.

Aleksandr Kogan, a Russian, who appears to have been a key module in the Cambridge Analytica data service, is quoted as saying, “I am not a Russian spy.” That’s good to know. The academic asserts that he was doing research. He wrote journal papers about that research. In fact, he wrote papers with Facebook professionals. He also “believes” that his work had no impact on elections. The information in the article is interesting.

Four observations:

  1. Government officials who do not understand Facebook are likely to find themselves relying on Facebook lobbyists for guidance.
  2. Facebook itself can continue to operate and use clever maneuvers to sidestep some regulations.
  3. With more than two billion users, Facebook has the capability of becoming a messaging system for itself.
  4. The story will continue to have momentum.

One unintended consequence is that it will be business as usual for Facebook.

Patrick Roland, April 23, 2018
