AI Startups Use Advanced Technology to Improve Daily Chores

February 11, 2016

The article on e27 titled “5 Asian Artificial Intelligence Startups that Caught Our Eye” lists several exciting new companies working to unleash AI technology, often for quotidian tasks. For example, Arya.ai provides for speedier and more productive decision-making, while Mad Street Den and Niki.ai offer AI shopping support. The article goes into detail about the latter:

“Niki understands human language in the context of products and services that a consumer would like to purchase, guides her along with recommendations to find the right service and completes the purchase with in-chat payment. It performs end-to-end transactions on recharge, cab booking and bill payments at present, but Niki plans to add more services including bus booking, food ordering, movie ticketing, among others.”

Mad Street Den, on the other hand, is more focused on object recognition: users input an image, and the AI platform seeks matches on e-commerce sites. Marketers will be excited to hear about Appier, a Taiwan-based business offering cross-screen insights; in layman’s terms, it can link separate devices belonging to one person and estimate how that person switches devices during the day and what each device will be used for. These capabilities allow marketers to target ads to each device and to better understand who will see what, and via which device.
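How might that image matching work? Here is a deliberately crude sketch of the underlying idea, not Mad Street Den’s actual method: represent each image as a color-histogram vector and rank catalog images by their distance from the query. Production systems use learned visual features rather than raw histograms, and the file paths and function names below are invented for illustration.

```python
# Toy image matching: rank catalog images by color-histogram distance.
# A crude stand-in for the learned visual features real products use.
import numpy as np
from PIL import Image

def histogram(path, bins=16):
    """Normalized per-channel color histogram, flattened into one vector."""
    img = np.asarray(Image.open(path).convert("RGB").resize((128, 128)))
    channels = [np.histogram(img[..., c], bins=bins, range=(0, 255))[0]
                for c in range(3)]
    vec = np.concatenate(channels).astype(float)
    return vec / vec.sum()

def best_matches(query_path, catalog_paths, k=5):
    """Return the k catalog images closest to the query in L1 distance."""
    q = histogram(query_path)
    scored = sorted((np.abs(q - histogram(p)).sum(), p) for p in catalog_paths)
    return [path for _, path in scored[:k]]

# Usage: best_matches("query.jpg", ["dress1.jpg", "dress2.jpg", "shoe1.jpg"])
```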


Chelsea Kerwin, February 11, 2016

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

The History of ZyLab

February 10, 2016

Big data was a popular buzzword a few years ago, making it seem like a brand-new innovation.  The eDiscovery process, however, has been around for several decades; recent technology advancements have simply allowed it to take off and spread to more industries.  While many big data startups have sprung up, ZyLab, a leading innovator in eDiscovery and information governance, started its big data venture in 1983.  ZyLab created a timeline detailing its history called “ZyLab’s Timeline Of Technical Ingenuity.”

Even though ZyLab was founded in 1983, when it introduced ZyIndex, its big data products did not really take off until the 1990s, when personal computers became an indispensable industry tool.  In 1995, ZyLab made history when its technology was used in the O.J. Simpson and Unabomber investigations.  Three years later, it introduced text search in images, now a standard feature for search engines.

Things really began to take off for ZyLab in the 2000s, as technology advanced to the point where companies could easily create and store data, and masses of unstructured data began to accumulate.  Advanced text analytics were added in 2005, and ZyLab made history again by becoming the standard for United Nations war crimes tribunals.

From 2008 onward, ZyLab’s milestones were more technological: creating the ZyImage SharePoint connector and Google Web search engine integration, introducing the ZyLab Information Management Platform, becoming the first to offer integrated machine translation in eDiscovery, adding audio search, and incorporating true native visual search and categorization.

ZyLab continues to make history, as well as market innovations, in eDiscovery and big data.

Whitney Grace, February 10, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Elasticsearch Works for Us 24/7

February 5, 2016

Elasticsearch is one of the most popular open source search applications, deployed for personal as well as corporate use.  It is built on the popular open source search library Apache Lucene and was designed for horizontal scalability, reliability, and ease of use.  Elasticsearch has become so ubiquitous that people do not realize just how useful it is.  eWeek takes the opportunity to discuss the search application’s uses in “9 Ways Elasticsearch Helps Us, From Dawn To Dusk.”

“With more than 45 million downloads since 2012, the Elastic Stack, which includes Elasticsearch and other popular open-source tools like Logstash (data collection), Kibana (data visualization) and Beats (data shippers) makes it easy for developers to make massive amounts of structured, unstructured and time-series data available in real-time for search, logging, analytics and other use cases.”
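To make that concrete, here is a minimal sketch of the index-then-search round trip at the heart of those use cases, assuming a node running at http://localhost:9200 and the official elasticsearch Python client (8.x); the index name and sample document are invented for illustration.

```python
# Index one document, then run a relevance-ranked full-text query on it.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Elasticsearch creates the "articles" index on first write.
es.index(index="articles", id=1, document={
    "title": "Elasticsearch Works for Us 24/7",
    "body": "Open source search built on Apache Lucene.",
})
es.indices.refresh(index="articles")  # make the new document searchable now

resp = es.search(index="articles", query={"match": {"body": "lucene"}})
for hit in resp["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["title"])
```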

How is Elasticsearch being used?  The Guardian uses it daily to help readers interact with content, Microsoft Dynamics ERP and CRM use it to index and analyze social feeds, it powers Yelp, and, here is a big one, Wikimedia uses it to power the well-loved and much-used Wikipedia.  We can already see how much of an impact Elasticsearch makes on our daily lives without our being aware of it.  Other companies that use Elasticsearch for our and their benefit include HotelTonight, Dell, Groupon, Quizlet, and Netflix.

Elasticsearch will continue to mature as an inexpensive alternative to proprietary software, and the number of Web services and companies that use it will only continue to grow.

Whitney Grace, February 5, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Big Data: A Shopsmith for Power Freaks?

February 4, 2016

I read an article that I dismissed. The title nagged at my ageing mind and dwindling intellect. “This is Why Dictators Love Big Data” did not ring my search, content processing, or Dark Web chimes.

Annoyed at my inner voice, I returned to the story, still put off by the “This Is Why” phrase in the headline.


Predictive analytics are not new. The packaging is better.

I think this is the main point of the write up, but I am never sure with online articles. The articles can be ads or sponsored content. The authors could be looking for another job. The doubts about information today plague me.

The circled passage is:

Governments and government agencies can easily use the information every one of us makes public every day for social engineering — and even the cleverest among us is not totally immune.  Do you like cycling? Have children? A certain breed of dog? Volunteer for a particular cause? This information is public, and could be used to manipulate you into giving away more sensitive information.

The only hitch in the git along is that this is not just old news; it is ancient news. The systems and methods for making decisions based on the munching of math in numerical recipes have been around for a while. Autonomy? A pioneer in the 1990s. Nope. Not even the super secret use of Bayesian, Markov, and related methods during World War II reaches back far enough. Nudge the ball hundreds of years farther back on the timeline. Not new, in my opinion.

I also noted this comment:

In China, the government is rolling out a social credit score that aggregates not only a citizen’s financial worthiness, but also how patriotic he or she is, what they post on social media, and who they socialize with. If your “social credit” drops below a certain level because you post anti-government messages online or because you’re socially associated with other dissidents, you could be denied credit approval, financial opportunities, job promotions, and more.

Just China? I fear not, gentle reader. Once again the “real” journalists are taking an approach which does not do justice to the wide diffusion of certain mathy applications.

Net net: I should have skipped this write up. My initial judgment was correct. Not only is the headline annoying to me, the information is par for the Big Data course.

Stephen E Arnold, February 4, 2016

IBM Sells Technology Platform with a Throwback to Big Data’s Mysteries

February 2, 2016

The infographic on the IBM Big Data & Analytics Hub titled “Extracting Business Value From the 4 V’s of Big Data” involves quantifying Volume (scale of data), Velocity (speed of data), Veracity (certainty of data), and Variety (diversity of data). In a time when big data may have been largely demystified, IBM makes an argument for its current relevance and import, not to mention its mystique, with reminders of the tremendous amounts of data being created and consumed on a daily basis. Ultimately the graphic is an ad for the IBM Analytics Technology Platform. The infographic also references a fifth “V”:

“Big data = the ability to achieve greater Value through insights from superior analytics. Case Study: A US-based aircraft engine manufacturer now uses analytics to predict engine events that lead to costly airline disruptions, with 97% accuracy. If this prediction capability had been available in the previous year, it would have saved $63 million.”

IBM struggles for revenue. But obviously, judging from this infographic, IBM knows how to create Value with a capital “V,” if not revenue. The IBM Analytics Technology Platform promises speedier insights and actionable information from trustworthy sources. The infographic reminds us that poor data quality leads to sad executives, and that data is growing exponentially, with 90 percent of all data created in just the last two years.

Chelsea Kerwin, February 2, 2016

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Metadata Could Play Integral Role in Data Security

February 2, 2016

A friend recently told me how she can go months avoiding suspicious emails, spyware, and Web sites on her computer, but the moment she hands her laptop over to her father, he downloads a virus within an hour.  Despite the technology gap between generations, the story goes to show how easy it is to be deceived and to have information stolen these days.  ExpertClick thinks that metadata might hold the future of cyber security in “What Metadata And Data Analytics Mean For Data Security-And Beyond.”

The article uses a biological analogy to explain metadata’s importance: “One of my favorite analogies is that of data as proteins or molecules, coursing through the corporate body and sustaining its interrelated functions. This analogy has a special relevance to the topic of using metadata to detect data leakage and minimize information risk — but more about that in a minute.”

This plays into the work of new companies like Ayasdi, which use data to reveal new correlations using methods different from the standard statistical ones.  The article compares this to getting down to the atomic level of data, where data scientists will be able to separate data into different elements and increase the complexity of the analysis.

“The truly exciting news is that this concept is ripe for being developed to enable an even deeper type of data analytics. By taking the ‘Shape of Data’ concept and applying to a single character of data, and then capturing that shape as metadata, one could gain the ability to analyze data at an atomic level, revealing a new and unexplored frontier. Doing so could bring advanced predictive analytics to cyber security, data valuation, and counter- and anti-terrorism efforts — but I see this area of data analytics as having enormous implications in other areas as well.”
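The article stays at the concept level, so here is a toy illustration of the flavor of metadata-driven leak detection. This is our sketch, not the author’s “Shape of Data” method: profile each file by two crude metadata features, size and byte entropy, and flag statistical outliers for a closer look.

```python
# Flag files whose metadata profile (size, byte entropy) is an outlier.
import math
import os
from collections import Counter

def byte_entropy(path, chunk=65536):
    """Shannon entropy in bits per byte over the first chunk of the file."""
    with open(path, "rb") as f:
        data = f.read(chunk)
    if not data:
        return 0.0
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def metadata_outliers(paths, z_cutoff=3.0):
    """Return files whose size or entropy z-score exceeds the cutoff."""
    feats = [(p, float(os.path.getsize(p)), byte_entropy(p)) for p in paths]
    flagged = set()
    if not feats:
        return flagged
    for col in (1, 2):  # column 1 = file size, column 2 = entropy
        vals = [f[col] for f in feats]
        mean = sum(vals) / len(vals)
        std = (sum((v - mean) ** 2 for v in vals) / len(vals)) ** 0.5 or 1.0
        flagged.update(f[0] for f in feats if abs(f[col] - mean) / std > z_cutoff)
    return flagged
```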

There are more devices connected to the Internet than ever before, and 2016 could be the year we see a significant rise in cyber attacks.  New ways of interpreting data will leverage predictive and proactive analytics to fight security breaches.

Whitney Grace, February 2, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Big Data Is So Last Year, Data Analysts Inform Us

February 1, 2016

The article on Fortune titled “Has Big Data Gone Mainstream?” asks whether big data is now an expected part of data analysis. The “merger,” as Deloitte advisor Tom Davenport puts it, makes big data an indistinguishable aspect of data crunching. Only a few years ago, it was a scary buzzword that executives scrambled to understand and that few experts specialized in. The article shows what has changed lately:

“Now, however, universities offer specialized master’s degrees for advanced data analytics and companies are creating their own in-house programs to train talent in data science. The Deloitte report cites networking giant Cisco as an example of a company that created an internal data science training program that over 200 employees have gone through. Because of media reports, consulting services, and analysts talking up “big data,” people now generally understand what big data means…”

Davenport sums up the trend nicely with the statement that people are tired of reading about big data and ready to “do it.” So what will replace big data as the mysterious buzzword that irks laypeople and the C-suite alike? The article suggests “cognitive computing,” or computer systems using artificial intelligence for speech recognition, object identification, and machine learning. Buzz, buzz!

Chelsea Kerwin, February 1, 2016

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Measuring Classifiers by a Rule of Thumb

February 1, 2016

Computer programmers who specialize in machine learning, artificial intelligence, data mining, data visualization, and statistics are smart individuals, but even they sometimes get stumped.  Using the same form of communication as Reddit and old-fashioned forums, Cross Validated is a question-and-answer site run by Stack Exchange.  People can post questions on data and related topics and then wait for a response.  One user posted a question about machine learning classifiers:

“I have been trying to find a good summary for the usage of popular classifiers, kind of like rules of thumb for when to use which classifier. For example, if there are lots of features, if there are millions of samples, if there are streaming samples coming in, etc., which classifier would be better suited in which scenarios?”

The response the user received was that the question was too broad: which classifier performs best depends on the data and the process that generates it.  It is kind of like asking for the best way to organize books or taxes; it depends on the content of the items in question.

Another user replied that there is an easy way to convey the general process of choosing a classifier and pointed to scikit-learn.org’s “Choosing the Right Estimator” chart. Other users say that the chart is incomplete because it does not include deep learning, decision trees, or logistic regression.
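The chart’s underlying caveat, that the right estimator depends on the data, is easy to demonstrate. Below is a minimal scikit-learn sketch that runs three classifiers, including two the chart omits, on one built-in dataset; the printed accuracies are a rule of thumb for that data only, not a general ranking of the methods.

```python
# Compare three classifiers on one dataset; results are data-dependent.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

classifiers = {
    "logistic regression": LogisticRegression(max_iter=5000),
    "linear SVM": LinearSVC(max_iter=10000),
    "decision tree": DecisionTreeClassifier(random_state=0),
}

for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    print(f"{name}: {clf.score(X_test, y_test):.3f}")  # held-out accuracy
```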

We say: create some other diagrams and share those.  Classifiers are complex, but they are a necessity for the artificial intelligence and big data craze.

Whitney Grace, February 1, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Anonymity Not Always Secured for Tor and Dark Web Users

January 28, 2016

From the Washington Post comes an article pertinent to investigative security technologies called “This is how the government is catching people who use child porn sites.” This piece outlines the process the FBI used to identify a Tor user, despite the anonymity Tor provides. The article explains how this occurred in one case, unmasking the user “Pewter”:

“In order to uncover Pewter’s true identity and location, the FBI quietly turned to a technique more typically used by hackers. The agency, with a warrant, surreptitiously placed computer code, or malware, on all computers that logged into the Playpen site. When Pewter connected, the malware exploited a flaw in his browser, forcing his computer to reveal its true Internet protocol address. From there, a subpoena to Comcast yielded his real name and address.”

Some are concerned about the privacy of the thousands of users whose computers are also hacked in processes such as the one described above. The user who was caught in this case argues that the government’s use of such tools violated the Fourth Amendment. One federal prosecutor quoted in the article describes the search processes used in this case as a “gray area in the law.” His point, that technology is eclipsing the law, is one that deserves more attention from all angles: the public, governmental agencies, and private companies.

Megan Feil, January 28, 2016

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Pearson: Revenue Challenges and Digital Initiatives

January 26, 2016

I used to follow Pearson when it owned a wax museum and a number of other fascinating big revenue opportunities. Today the company is still big: $8 billion in revenue, 40,000 employees, and offices in 70 countries. (Lots of reasons for senior executives to do field trips, I assume.)

I noted that Pearson plans to RIF (reduce in force) 4,000 employees. Let’s see. Yep, that works out to 10 percent of the “team.” Without the wax museum as a job option, will these folks become entrepreneurs?

I read “Turning Digital Learning Into Intellectual Property.” The title snagged me, and I assume that some of the 4,000 folks now preparing to find their future elsewhere were intrigued.

The write up reported:

Pearson is also positioning itself as a major center for the analysis of educational big data.

Ah, ha. A publishing outfit involved in education is getting with the Big Data thing.

How is a traditional publishing company going to respond to the digital opportunities it now perceives?

big data analysis methods will enable researchers to “capture stream or trace data from learners’ interactions” with learning materials, detect “new patterns that may provide evidence about learning,” and “more clearly understand the micro-patterns of teaching and learning by individuals and groups.” Big data methods of pattern recognition are at the heart of its activities, and Pearson ambitiously aims to use pattern recognition to identify generalizable insights into learning processes not just at the level of the individual learner but at vast scale.

Yes, vast. Micro patterns. Big Data.

My mouth is watering and my ageing brain cells hunger for the new learning.

Big questions have to be answered. For example, who owns learning theory?

I recall my brush with the education department. Ugly. I thought that most of the information to which I was exposed was baloney. For evidence, I think back to my years in Brazil with my hit and miss involvement with the Calvert Course, the “English not spoken here” approach of the schools in Campinas, and the seamless transition I made back to my “regular” US school after having done zero in the learning aquaria for several years.

I also recall the look of befuddlement on the faces of checkout clerks when I point out that a cash register tally is incorrect, or the consternation that furrows their brows when I provide bills and two pennies.

My hunch is that the education thing is a juicy business, but I am not confident in Pearson’s ability to catch up with the folks who are not saddled with the rich legacy of printing books and charging lots of money for them.

This is a trend worth watching. Will it match the success of EBSCO’s “discovery” system? Will it generate the payoff Thomson Reuters is getting by reselling Palantir? Will it allow Pearson to make the bold moves that so many traditional publishing companies have made after they embraced XML as the silver bullet and incantation to ward off collapsing revenues?

I for one will be watching. Who knows? Maybe I will return to school to brighten the day of an adjunct professor at the local university. (This institution I might add is struggling with FBI investigations, allegations of sexual misconduct, and a miasma of desperation.)

Education. Great stuff.

Stephen E Arnold, January 26, 2016
