Why Is MiningLamp Getting Ink?

December 3, 2019

The question “Why is MiningLamp getting ink?” is an interesting one to some people. The firm was founded in 2014. The company was a product of bunsha practiced by Miaozhen Systems, a company engaged in advertising “analysis.” The company is funded by Tencent, China Renaissance, and Sequoia Capital China. The firm may have revenues in the hundreds of millions of dollars. Data about the influence of the Chinese government is not available to the DarkCyber team at this time. MiningLamp may have received as much as $290 million from its backers.

Companies want publicity to get sales leads, attract investors, create buzz to lure new hires, and become known to procurement professionals in government agencies.

We noted talk about MiningLamp at a couple of law enforcement and intelligence conferences. The company provides policeware and intelware to customers in China and elsewhere. You can read about the firm on its Web site at this link. (Be patient. The service seems to provide a high latency experience.) Product pages also seem to be missing in action.

Nevertheless, “Chinese Data Mining Firm MiningLamp, Now a National AI Champion, Began by Helping Police Solve Crimes” does not talk about a dearth of public information. The write up states that “MiningLamp’s business analytics tools are used by more than 200 companies in the Fortune 200.” That’s a lot of big companies embracing investigative software. Judging from the attendees at law enforcement and intelligence conferences, these big companies are finding out about a Chinese company somehow.

The news story states that “Like Palantir, this Chinese start up uses AI to help corporate clients convert huge volumes of data into actionable information.” Palantir is a big ticket item. Perhaps price is a factor or Fortune 200 companies want to rely on a business intelligence system operated by a company located outside the span of control of some government authorities.

The company has been named a Chinese champion. The article reveals:

Although not as well known as US equivalent Palantir Technologies, which reportedly contributed to America’s success in hunting down Osama bin Laden, MiningLamp’s data mining software is used to spot crime patterns, track drug dealers and prevent human trafficking.

DarkCyber thinks that any company which has 200 Fortune listed companies as customers is reasonably well known.

We learned:

“Cases are being resolved on our platforms every day” in more than 60 cities and regions in China, said founder and CEO Wu Minghui. “We can run fast analysis on potential drug dealers or major suspects, improving the overall case-solving efficiency several hundred times.”

Federating Data: Easy, Hard, or Poorly Understood Until One Tries It at Scale?

March 8, 2019

I read two articles this morning.

One article explained that there’s a new way to deal with data federation. Always optimistic, I took a look at “Data-Driven Decision-Making Made Possible using a Modern Data Stack.” The revolution is to load data and then aggregate. The old way is to transform, aggregate, and model. Here’s a diagram from DAS42. A larger version is available at this link.

Hard to read. Yep, New Millennial colors. Is this a breakthrough?

I don’t know.
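For readers who want the distinction made concrete, here is a minimal sketch of the two orderings, using Python’s built-in sqlite3 module as a stand-in for a data warehouse. The table name, the column names, and the sample rows are invented for illustration; nothing here comes from the DAS42 write up.

```python
import sqlite3

# Hypothetical raw records. Amounts arrive as strings, as raw exports often do.
RAW_ROWS = [
    ("east", "10.50"),
    ("east", "4.50"),
    ("west", "7.00"),
]


def old_way(rows):
    """Transform first (in application code), then aggregate."""
    totals = {}
    for region, amount in rows:
        # The cleanup step happens before anything is stored or summed.
        totals[region] = totals.get(region, 0.0) + float(amount)
    return totals


def new_way(rows):
    """Load the raw rows untouched, then aggregate inside the data store."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE raw_sales (region TEXT, amount TEXT)")
    con.executemany("INSERT INTO raw_sales VALUES (?, ?)", rows)
    # The type conversion and the aggregation both happen after loading.
    cur = con.execute(
        "SELECT region, SUM(CAST(amount AS REAL)) FROM raw_sales GROUP BY region"
    )
    totals = dict(cur.fetchall())
    con.close()
    return totals
```

Both functions produce the same totals. The difference is where the work happens and whether the raw records are kept around for the next question someone thinks to ask.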

When I read “2 Reasons a Federated Database Isn’t Such a Slam-Dunk,” it seemed that the solution outlined by DAS42 and the view of the InfoWorld expert are not in sync.

There are two reasons. Count ‘em.

One: performance

Two: security.

Yeah, okay.

Some may suggest that there are a handful of other challenges. These range from deciding how to index audio, video, and images to figuring out what to do with different languages in the content to determining what data are “good” for the task at hand and what data are less “useful.” Date, time, and geocode metadata are needed, but that introduces the not so easy to solve indexing problem.

So where are we with the “federation thing”?

Exactly the same place we were years ago…start ups and experts notwithstanding. But then one has to wrangle a lot of data. That’s cost, gentle reader. Big money.

Stephen E Arnold, March 8, 2019

Data Science Gets Political

November 20, 2018

With the near ubiquitous use of big data science in every industry short of rock hunting, it was inevitable that there would be blowback. Recently, many tech companies began to feel some political heat due to their involvement with immigration agencies. We learned more from a recent Mercury News story, “Bay Area Cities May Boycott Tech Giants Contracting With ICE.”

According to the story:

“The policy comes as the local immigration debate shifts toward several prominent tech companies — including Palo Alto’s Palantir Technologies, Vigilant Solutions in Livermore and Amazon, which have been criticized for contracting with federal immigration agencies. Last week, advocates descended on Salesforce’s annual conference in San Francisco with an 14-foot-tall cage symbolizing ICE detention to protest the company’s contract with Customs and Border Protection.”

If this sounds a little farfetched or even unlikely, pay close attention to similar actions in Europe. There, when people pushed back against the intersection of politics and big data, it began to impact finances. And when pocketbooks begin to suffer, you can guarantee companies take notice. We don’t yet know if the same will happen in America, but we have a hunch this issue won’t vanish quietly.

Patrick Roland, November 20, 2018

Amazon Intelligence Gets a New Data Stream

June 28, 2018

I read “Amazon’s New Blue Crew.” The idea is that Amazon can disintermediate FedEx, UPS (the outfit with the double parking brown trucks), and the US Postal Service.

On the surface, the idea makes sense. Push down delivery to small outfits. Subsidize them indirectly and directly. Reduce costs and eliminate intermediaries not directly linked to Amazon.

FedEx, UPS, and the USPS are not the most nimble outfits around. I used to get FedEx envelopes every day or two. I haven’t seen one of those for months. Shipping via UPS is a hassle. I fill out forms and have to manage odd slips of paper with arcane codes on them. The US Postal Service works well for letters, but I have noticed some returns for “addresses not found.” One was an address in the city in which I live. I put the letter in the recipient’s mailbox. That worked.

The write up reports:

The new program lets anyone run their own package delivery fleet of up to 40 vehicles with up to 100 employees. Amazon works with the entrepreneurs — referred to as “Delivery Service Partners” — and pays them to deliver packages while providing discounts on vehicles, uniforms, fuel, insurance, and more. They operate their own businesses and hire their own employees, though Amazon requires them to offer health care, paid time off, and competitive wages. Amazon said entrepreneurs can get started with as low as $10,000 and earn up to $300,000 annually in profit.

Now what’s the connection to Amazon streaming data services and the company’s intelligence efforts? Several hypotheses come to mind:

  • Amazon obtains fine grained detail about purchases and delivery locations. These are data which no longer can be captured in a non Amazon delivery service system.
  • The data can be cross correlated; for example, the purchase of a Kindle title with the delivery of a particular product such as hydrogen peroxide.
  • Amazon’s delivery data make it possible to capture metadata about delivery time, whether a person accepted the package or the package was left at the door, and other location details such as a blocked entrance.

A few people dropping off packages is not particularly useful. Scale up the service across Amazon operations in the continental states or a broader swath of territory and the delivery service becomes a useful source of high value information.
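The cross correlation hypothesis can be sketched in a few lines of Python. This is an illustration only: the record layouts, field names, and sample values are invented, and nothing is known publicly about how Amazon actually stores or joins these data.

```python
from collections import defaultdict

# Invented sample records. Real schemas are not public.
purchases = [
    {"customer": "c1", "item": "kindle title A"},
    {"customer": "c2", "item": "kindle title B"},
]
deliveries = [
    {"customer": "c1", "item": "hydrogen peroxide", "left_at_door": True},
    {"customer": "c3", "item": "garden hose", "left_at_door": False},
]


def cross_correlate(purchases, deliveries):
    """Pair each customer's digital purchases with that customer's deliveries."""
    # Index deliveries by customer so each purchase needs only one lookup.
    by_customer = defaultdict(list)
    for delivery in deliveries:
        by_customer[delivery["customer"]].append(delivery)

    pairs = []
    for purchase in purchases:
        for delivery in by_customer.get(purchase["customer"], []):
            pairs.append(
                (
                    purchase["customer"],
                    purchase["item"],
                    delivery["item"],
                    delivery["left_at_door"],
                )
            )
    return pairs
```

With two purchases and two deliveries the output is a single matched pair. The point of the scale argument is that the same join becomes far more revealing when both lists run to millions of rows.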

FedEx and UPS are ripe for disruption. But so is the streaming intelligence sector. Worth monitoring this ostensible common sense delivery play.

Stephen E Arnold, June 28, 2018

UK Surveillance Backlash

February 9, 2018

Recently, the UK attempted to fight a variety of criminal activity by developing a mass data unit that used analytics and AI to fight crime. If it sounds like science fiction, that’s because it doesn’t really exist. The task force was ruled illegal recently, we discovered in a Guardian story, “UK Mass Digital Surveillance Regime Ruled Illegal.”

According to the story, security minister Ben Wallace responded to the ruling, saying:

“Communications data is used in the vast majority of serious and organized crime prosecutions and has been used in every major security service counter-terrorism investigation over the last decade. It is often the only way to identify pedophiles involved in online child abuse as it can be used to find where and when these horrendous crimes have taken place.”

While the British police are crying for more freedom, they are not the only ones being restricted. A better solution, in our mind, comes from Crime Report, who are advocating for a balanced system of big data policing. According to their report, “acceptable boundaries” must be set in order to protect citizen privacy, but also increase the police’s ability to do their job through technology. It’s likely to be a debate that rages on for a while, but we are hoping for an acceptable solution.

Patrick Roland, February 9, 2018

MarkLogic Aims to Take on Oracle in Enterprise Class Data Hub Frameworks

October 10, 2017

MarkLogic is trying to give Oracle a run for its money in the world of enterprise-class data hubs. According to a recent press release on ITWire, “MarkLogic Releases New Enterprise Class Data Hub Framework to Enhance Agility and Speed Digital Transformations.”

How does the company plan on doing this? According to the release:

Traditionally, integrating data from silos has been very costly and time consuming for large organizations looking to make faster and better decisions based on their data assets. The Data Hub Framework simplifies and speeds the process of building a MarkLogic solution by providing a framework around how to data model, load data, harmonize data, and iterate with new data and compliance requirements.

But is that enough to unseat Oracle, which has long had a seat at the head of the table, especially since Oracle has its own new framework hitting the market? That is still up for debate, but MarkLogic is confident in its ability to compete. According to the piece:

Unlike other databases, NoSQL was specifically designed to ingest and integrate all types of disparate data to find relationships among data, and drive searches and analytics—within seconds.

This battle is just beginning and we have no indication of who has the edge, but you can bet it will be an interesting fight in the marketplace between these two titans.

Patrick Roland, October 10, 2017

Yet Another Digital Divide

September 8, 2017

Recommind sums up what happened at a recent technology convention in the article, “Why Discovery & ECM Haven’t, Must Come Together (CIGO Summit 2017 Recap).” Author Hal Marcus first notes that he used to be a staunch challenger of anyone who said they could provide a complete information governance solution. He recently spoke at CIGO Summit 2017 about how to make information governance a feasible goal for organizations.

The problem with information governance is that there is no one simple solution and projects tend to be self-contained with only one goal: data collection, data reduction, etc. When he spoke, he explained that there are five main reasons why there is not one comprehensive solution: defining a project’s parameters takes a while, data can come from multiple streams, mass-scale indexing is challenging, analytics will only help if there are humans to interpret the data, and risk and cost put a damper on projects.

Yet we are closer to a solution:

Corporations seem to be dedicating more resources for data reduction and remediation projects, triggered largely by high profile data security breaches.

Multinationals are increasingly scrutinizing their data sharing and retention practices, spurred by the impending May 2018 GDPR deadline.

ECA for data culling is becoming more flexible and mature, supported by the growing availability and scalability of computing resources.

Discovery analytics are being offered at lower, all-you-can-eat rates, facilitating a range of corporate use cases like investigations, due diligence, and contract analysis.

Tighter, more seamless and secure integration of ECM and discovery technology is advancing and seeing adoption in corporations, to great effect.

And it always seems farther away.

Whitney Grace, September 8, 2017

Big Data Too Is Prone to Human Bug

August 2, 2017

Conventional wisdom says Big Data, being a realm of machines, is immune from human behavioral traits like discrimination. Insights from data scientists, however, are different.

An article published by PHYS.ORG titled “Discrimination, Lack of Diversity, and Societal Risks of Data Mining Highlighted in Big Data” says:

Despite the dramatic growth in big data affecting many areas of research, industry, and society, there are risks associated with the design and use of data-driven systems. Among these are issues of discrimination, diversity, and bias.

The crux of the problem is the way data is mined and processed and the way decisions are made. At every step, humans need to be involved in order to tell machines how each of these processes is executed. If the person guiding the system is biased, these biases are bound to seep into the subsequent processes in some way.

Apart from decisions like granting credit, human resources, which is also being automated, may have diversity issues. The fundamental problem remains the same in this case too.

Big Data was touted as the next big thing and may turn out to be so, but most companies have yet to figure out how to utilize it. Streamlining the processes and making them efficient would be the next step.

Vishal Ingole, August 2, 2017

Big Data in Biomedical

July 19, 2017

The biomedical field, which is replete with unstructured data, is all set to take a giant leap towards standardization with the Biological Text Mining Unit.

According to PHYS.ORG, in an article about a peer review titled “Researchers Review the State-Of-The-Art Text Mining Technologies for Chemistry,” the author states:

Being able to transform unstructured biomedical research data into structured databases that can be more efficiently processed by machines or queried by humans is critical for a range of heterogeneous applications.

Scientific data has a fixed vocabulary, which makes standardization and indexing easier. However, most big names in Big Data and enterprise search are concentrating their efforts on e-commerce.

Hundreds of new compounds are discovered every year. If the data pertaining to these compounds is made available to other researchers, advancements in this field will be very rapid. The major hurdle is that the data is in an unstructured format, which Biological Text Mining Unit standards intend to overcome.
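To make “unstructured to structured” concrete, here is a toy sketch in Python. It uses a plain regular expression in place of a real text mining system and assumes, purely for illustration, that compounds appear in the text as a name followed by a molecular formula in parentheses. The pattern, the function name, and the sample sentence are invented; they are not taken from the article or from the Biological Text Mining Unit.

```python
import re

# Matches "name (FORMULA)" where FORMULA is a run of element symbols
# with optional counts, for example C9H8O4. This is a deliberately
# narrow assumption about how compounds are written.
COMPOUND = re.compile(r"\b([A-Za-z]+) \(((?:[A-Z][a-z]?\d*)+)\)")


def extract_compounds(text):
    """Turn free text into a list of structured name/formula records."""
    return [
        {"name": name.lower(), "formula": formula}
        for name, formula in COMPOUND.findall(text)
    ]


sample = "We compared aspirin (C9H8O4) with caffeine (C8H10N4O2) in vitro."
records = extract_compounds(sample)
```

Real chemical text is far messier than this sample, which is the reason the field needs shared standards rather than one-off patterns.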

Vishal Ingole, July 19, 2017

Does This Count As Irony?

May 16, 2017

Does this count as irony?

Palantir, which has built its data-analysis business largely on its relationships with government organizations, has a Department of Labor analysis to thank for recent charges of discrimination. No word on whether that Department used Palantir software to “sift through” the reports. Now, Business Insider tells us, “Palantir Will Shell Out $1.7 Million to Settle Claims that It Discriminated Against Asian Engineers.” Writer Julie Bort tells us that, in addition to that payout, Palantir will make job offers to eight unspecified Asians. She also explains:

The issue arose because, as a government contractor, Palantir must report its diversity statistics to the government. The Labor Department sifted through these reports and concluded that even though Palantir received a huge number of qualified Asian applicants for certain roles, it was hiring only small numbers of them. Palantir, being the big data company that it is, did its own sifting and produced a data-filled response that it said refuted the allegations and showed that in some tech titles 25%-38% of its employees were Asians. Apparently, Palantirs protestations weren’t enough on to satisfy government regulators, so the company agreed to settle.

For its part, Palantir insists on its innocence but says it settled in order to put the matter behind it. Bort notes the unusual nature of this case: according to the Equal Employment Opportunity Commission, African-Americans, Latin-Americans, and women are more underrepresented in tech fields than Asians. Is the Department of Labor making it a rule to analyze the hiring patterns of companies required to report diversity statistics? If the Department is consistent, there should soon be a number of such cases regarding discrimination against other groups. We shall see.

Cynthia Murrell, May 16, 2017
