Ottawa Law Enforcement and Reasonable Time for Mobile Phone Access

February 5, 2024

This essay is the work of a dumb dinobaby. No smart software required.

The challenge of mobile phones is that accessing their data takes time when a password is not available to law enforcement. The more mobiles obtained from alleged bad actors, the more time is required. The backlog can be onerous because many law enforcement agencies have a limited number of cyber investigators and only so many forensic software licenses or specialized machines necessary to extract data from a mobile device.

Time is not on their side. The Ottawa Citizen reports, “Police Must Return Phones After 175 Million Passcode Guesses, Judge Says.” The story is not actually about the number of guesses, but about how long investigators can retain a suspect’s property. After a year of trying to crack the passcodes on a suspect’s phones, Ottawa police asked Ontario Superior Court Justice Ian Carter to let them retain the devices for another two years. But even that was a long shot. Writer Andrew Duffy tells us:

“Ontario Superior Court Justice Ian Carter heard that police investigators tried about 175 million passcodes in an effort to break into the phones during the past year. The problem, the judge was told, is that more than 44 nonillion potential passcodes exist for each phone. To be more precise, the judge said, there are 44,012,666,865,176,569,775,543,212,890,625 potential alpha-numeric passcodes for each phone. It means, Carter said, that even though 175 million passcodes were attempted, those efforts represented ‘an infinitesimal number’ of potential answers.”
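To make the judge’s point concrete, a bit of arithmetic helps. A quick sketch in Python, using the two figures quoted in the ruling (the only assumption is that the guessing continues at the same pace):

```python
# Figures quoted in the ruling
keyspace = 44_012_666_865_176_569_775_543_212_890_625  # possible passcodes per phone
tried = 175_000_000                                    # passcodes attempted in about a year

print(f"Share of keyspace searched: {tried / keyspace:.1e}")  # ~4.0e-24

# Assuming the same pace of 175 million guesses per year:
print(f"Years to exhaust the keyspace: {keyspace / tried:.1e}")  # ~2.5e+23
```

Roughly 250 sextillion years at that rate. “Infinitesimal” seems fair.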

The article describes the brute-force dictionary attacks police had used so far and defines the term leetspeak for curious readers. Though investigators recently added the password-generating tool Mentalist to their arsenal, the judge determined their chances of breaking into the phone were too slim. We learn:

“In his ruling, Carter said the court had to balance the property rights of an individual against the state’s legitimate interest in preserving evidence in an investigation. The phones, he said, have no evidentiary value unless the police succeed in finding the right passcodes. ‘While it is certainly possible that they may find the needle in the next two years, the odds are so incredibly low as to be virtually non-existent,’ the judge wrote. ‘A detention order for a further six months, two years, or even a decade will not alter the calculus in any meaningful way.’ He denied the Crown’s application to retain the phones and ordered them returned or destroyed.”

The judge suggested investigators instead formally request more data from Google, which supplied the information that led to the warrants in the first place. Good idea, but techno-feudal outfits are often not set up to handle a large number of often-complex requests. The result is that law enforcement is expected to perform certain tasks while administrative procedures and business processes slam on the brakes. One would hope the realities of accessing mobile devices were better understood and supported.

Cynthia Murrell, February 5, 2024

Easy Monitoring for Web Site Deltas

February 9, 2023

We have been monitoring selected Clear Web pages for a research project. We looked at a number of solutions and turned to VisualPing.io. The system is easy to use. Enter the URL of the Web page for which you want a notification of a delta (change), enter an email address, and the system will send you a notification. The service is free if you want to monitor five Web pages per day. The company has a pricing FAQ which explains the cost of more notifications. The Visual Ping service assumes a user wants to monitor the same Web site or sites on a continuous basis. Killing the monitoring for one site requires a bit of effort. Our approach was to have a different team member sign up and enter the revised monitor list. There may be an easier way, but without an explicit method, a direct solution worked for us.
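For the curious, the core of such a service is simple to sketch. The Python fragment below is not how VisualPing works (the company also compares rendered screenshots); it simply fetches a page, hashes the bytes, and reports when the hash changes. The URL is a placeholder.

```python
import hashlib
import time
import urllib.request

URL = "https://example.com/page-to-watch"  # placeholder

def page_hash(url):
    """Fetch the page and return a digest of its raw bytes."""
    with urllib.request.urlopen(url) as resp:
        return hashlib.sha256(resp.read()).hexdigest()

baseline = page_hash(URL)
while True:
    time.sleep(3600)  # check hourly
    current = page_hash(URL)
    if current != baseline:
        print(f"Delta detected at {URL}")  # a real service would email here
        baseline = current
```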

Stephen E Arnold, February 9, 2023

Consumer Image Manipulation: Deep Fakes or Yes, That Is Granny!

September 7, 2022

I find deep fake services interesting. Good actors can create clever TikTok and YouTube videos. Bad actors can whip up a fake video résumé and chase a work-from-home job. There are other uses as well; for example, a zippy video professional can create a deep fake of a “star” who may be dead or just stubborn and generate a scene. Magic, and maybe cheaper.

I read “Use This Free Tool to Restore Faces in Old Family Photos.” The main idea is that a crappy old photo with blurry faces can be made almost like “new.” The write up says:

This online tool—called GFPGAN—first made it onto our radar when it was featured in the August 28 edition of the (excellent) Recomendo newsletter, specifically, a post by Kevin Kelly. In it, he says that he uses this free program to restore his own old family photos, noting that it focuses solely on the faces of those pictured, and “works pretty well, sometimes perfectly, in color and black and white.”

The service has another trick amidst its zeros and ones:

According to the ByteXD post, in addition to fixing or restoring faces in old photos, you can also use GFPGAN to increase the resolution of the entire image. Plus, because the tool works using artificial intelligence, it can also come in handy if you need to fix AI art portraits. ByteXD provides instructions for both upscaling and improving the quality of AI art portraits, for people interested in those features.
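For readers who want to try GFPGAN from Python rather than a hosted demo, the project also ships a package. A minimal sketch, assuming the gfpgan and opencv-python packages and a downloaded weight file; the file names are illustrative and the parameters follow the project’s documentation at the time:

```python
import cv2
from gfpgan import GFPGANer  # pip install gfpgan

# Model path is illustrative; download the weights from the project's releases.
restorer = GFPGANer(
    model_path="GFPGANv1.3.pth",  # pretrained face-restoration weights
    upscale=2,                    # also upscale the whole image 2x
    arch="clean",
    channel_multiplier=2,
    bg_upsampler=None)            # plug in an upsampler here to improve backgrounds

img = cv2.imread("old_family_photo.jpg", cv2.IMREAD_COLOR)
_, _, restored = restorer.enhance(
    img, has_aligned=False, only_center_face=False, paste_back=True)
cv2.imwrite("restored_photo.jpg", restored)
```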

Will it work on passport photos and other types of interesting documents? We will have to wait until the bad actors explore and innovate.

Stephen E Arnold, September 8, 2022

Why Investigative Software Is Expensive

December 3, 2020

In a forthcoming interview, I explore industrial-strength policeware and intelware with a person who was named Intelligence Officer of the Year. In that interview, which will appear in a few weeks, the question of the cost of policeware and intelware is addressed. Systems like those from IBM’s i2, Palantir Technologies, Verint, and similar vendors are pricey. Not only is there a six- or seven-figure license fee; the client also has to pay for training, often months of instruction. Plus, these i2-type systems require systems and engineering support. One tip-off to the fully loaded costs is the phrase “forward deployed engineer.” The implicit message is that these i2-type systems require an outside expert to keep the digital plumbing humming along. But who is responsible for the data? The user. If the user fumbles the data bundle, bad outputs are indeed possible.

What’s the big deal? Why not download Maltego? Why not use one of the $100 to $3,000 solutions from jazzy startups founded by former intelligence officers? These are “good enough,” some may assert. One facet of the cost of industrial-strength systems available to qualified licensees is a little-appreciated function: dealing with data.

“Keep Data Consistency During Database Migration” does a good job of explaining what has to happen in a reliable, consistent way when one of the multiple data sources contributes “new” or “fresh” data to an intelware or policeware system. The number of companies providing middleware to perform these functions is growing. Why?

Most companies wanting to get into the knowledge extraction business have to deal with the issues identified in the article. Most organizations do not handle these tasks elegantly, rapidly, or accurately.

Injecting incorrect, stale, inaccurate data into a knowledge centric process like those in industrial strength policeware causes those systems to output unreliable results.

What’s the consequence?

Investigators and analysts learn to ignore certain outputs.

Why? Flawed outputs can be more serious than a bad diagram whipped up by an MBA who worries only about the impression he or she makes on a group of prospects attending a Zoom meeting.

Data consistency is a big deal.
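What does “dealing with data” look like in practice? Here is a minimal sketch, in Python, of one common check: fingerprinting each record on the source and target sides of a migration and diffing the results. The table and data are hypothetical; real policeware middleware does far more (type coercion, deduplication, freshness checks), but the idea is the same.

```python
import hashlib
import sqlite3

def row_fingerprints(conn, table):
    """Hash every row into a stable fingerprint so two stores can be diffed."""
    rows = conn.execute(f"SELECT * FROM {table} ORDER BY 1")
    return {
        hashlib.sha256("|".join(map(str, row)).encode("utf-8")).hexdigest()
        for row in rows
    }

# Hypothetical source and target stores; in-memory SQLite keeps the sketch runnable.
source, target = sqlite3.connect(":memory:"), sqlite3.connect(":memory:")
for db in (source, target):
    db.execute("CREATE TABLE persons (id INTEGER, name TEXT, last_seen TEXT)")
source.executemany("INSERT INTO persons VALUES (?, ?, ?)",
                   [(1, "A. Smith", "2020-11-30"), (2, "B. Jones", "2020-12-01")])
target.executemany("INSERT INTO persons VALUES (?, ?, ?)",
                   [(1, "A. Smith", "2020-11-30")])  # one record lost in transit

missing = row_fingerprints(source, "persons") - row_fingerprints(target, "persons")
print(f"{len(missing)} record(s) failed to migrate")  # 1 record(s) failed to migrate
```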

Stephen E Arnold, December 2, 2020

Update for TemaTres, a Taxonomy Tool

March 25, 2020

In order to create and maintain a Web site, database, or other information source, a powerful knowledge management application is needed. There are numerous proprietary knowledge management applications on the market, but the problem is often the price tag, and the solutions are rarely usable out of the box. Open source software is the best way to save money and tailor a knowledge management application to your specifications. The question remains: what open source knowledge management software should you download?

One of the top open source knowledge management tools is TemaTres, which is described as a:

“Web application for management formal representations of knowledge, thesauri, taxonomies and multilingual vocabularies.”

TemaTres allows users to manage, publish, and share ontologies, taxonomies, thesauri, and glossaries. TemaTres includes numerous features that are designed for the best taxonomy development experience. Among these features are: MARC21 XML Schema, search function, keyword suggestions, user management, multilingual interface, scope notes, relationship visualizations, term reports, terminology mapping, unique code for each term, free terms control, vocabulary harmonization features, no limits on delimiters, integration into web tools, and more.
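To make that feature list concrete: the central object TemaTres manages is a term carrying the standard thesaurus relationships, broader (BT), narrower (NT), and related (RT), plus a scope note and a unique code. A generic sketch of that data model in Python follows; it is an illustration, not TemaTres’s actual schema.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Term:
    """A thesaurus term with the relationship types a tool like TemaTres manages."""
    label: str
    code: str                 # unique code for each term
    scope_note: str = ""
    broader: list[Term] = field(default_factory=list)    # BT
    narrower: list[Term] = field(default_factory=list)   # NT
    related: list[Term] = field(default_factory=list)    # RT

vehicles = Term("Vehicles", code="T001")
cars = Term("Cars", code="T002",
            scope_note="Passenger automobiles only.", broader=[vehicles])
vehicles.narrower.append(cars)
print(f"{cars.label} BT -> {cars.broader[0].label}")  # Cars BT -> Vehicles
```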

TemaTres requires some programming knowledge to make it functional. Data governance is an important part of knowledge management, and a tool like TemaTres gives editorial control over content. It is an underrated but valuable tool.

Whitney Grace, March 25, 2020

Into R? A List for You

May 12, 2019

Computerworld, which runs some pretty unusual stories, published “Great R Packages for Data Import, Wrangling and Visualization.” “Great” is an interesting word. In the lingo of Computerworld, a real journalist did some searching, talked to some people, and created a list. As it turns out, the effort is useful. Looking at the Computerworld table is quite a bit easier than trying to dig information out of assorted online sources. Plus, people are not too keen on the phone and email thing now.

The listing includes a mixture of different tools, software, and utilities. There are more than 80 listings. I wasn’t sure what to make of XML’s inclusion in the list, but the source is Computerworld, and I assume that the “real” journalist knows much more than I.

Two observations:

  • Earthworm lists without classification or alphabetization are less useful to me than listings sorted by tags and alphabetized within categories. Excel performs this helpful trick; so do a few lines of code, as the sketch after this list shows.
  • Some items in the earthworm list have links and others do not. Consistency, I suppose, is the hobgoblin of some types of intellectual work.
  • An indication of which items are free and which are for fee would be useful too.
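Here is the Excel trick done in a few lines of Python. The entries are a hypothetical slice of the Computerworld list, tagged by hand:

```python
# A hypothetical slice of the list, each entry tagged with a category
tools = [
    ("ggplot2", "visualization"),
    ("dplyr", "wrangling"),
    ("readr", "import"),
    ("data.table", "wrangling"),
    ("rio", "import"),
]

# Sort by tag, then alphabetize within each category
for name, tag in sorted(tools, key=lambda t: (t[1], t[0].lower())):
    print(f"{tag:15s} {name}")
```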

Despite these shortcomings, you may want to download the list and tuck it into your “Things I love about R” folder.

Stephen E Arnold, May 12, 2019

Machine Learning Frameworks: Why Not Just Use Amazon?

September 16, 2018

A colleague sent me a link to “The 10 Most Popular Machine Learning Frameworks Used by Data Scientists.” I found the write up interesting despite the author’s failure to define the word popular and the bound phrase data scientists. But few folks in an era of “real” journalism fool around with my quaint notions.

According to the write up, the data come from an outfit called Figure Eight. I don’t know the company, but I assume its professionals adhere to the basics of Statistics 101: you know, the boring stuff like sample size, objectivity of the sample, sample selection, data validity, etc. In our time of “real” news and “real” journalists, these annoying aspects of churning out data are what allow an old geezer like me to have some confidence in the numbers. You know, like the 70 percent accuracy of some US facial recognition systems. Close enough for horseshoes, I suppose.

Here’s the list, with a short usage sketch after it. My comments about each “learning framework” appear in italics after each “learning framework’s” name:

  1. Pandas — an open source, BSD-licensed library
  2. Numpy — a package for scientific computing with Python
  3. Scikit-learn — another BSD-licensed collection of tools for data mining and data analysis
  4. Matplotlib — a Python 2D plotting library for graphics
  5. TensorFlow — an open source machine learning framework
  6. Keras — a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano
  7. Seaborn — a Python data visualization library based on matplotlib
  8. Pytorch & Torch
  9. AWS Deep Learning AMI — infrastructure and tools to accelerate deep learning in the cloud. Not to be annoying, but defining AMI as Amazon Machine Image might be useful to some
  10. Google Cloud ML Engine — neural-net-based ML service with a typically Googley lineup of Googley services.
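The promised usage sketch: a minimal scikit-learn workflow (item three on the list, with NumPy working under the hood) showing the canonical load, split, fit, and score loop:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load a toy dataset, hold out a quarter of it for evaluation
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# Fit a model and score it on the held-out data
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
print(f"Held-out accuracy: {model.score(X_test, y_test):.2f}")
```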

Stepping back, I noticed a handful of what I am sure are irrelevant points, points of little interest to “real” journalists creating “real” news.

First, notice that the list is self-referential with Python love. Frameworks depend on other Python-loving frameworks. There’s nothing inherently bad about this self-referential approach to whipping up a list, and it makes the list a heck of a lot easier to create in the first place.

Second, the information about Amazon is slightly misleading. In my lecture in Washington, DC on September 7, I mentioned that Amazon’s approach to machine learning supports Apache MXNet and Gluon, TensorFlow, Microsoft Cognitive Toolkit, Caffe, Caffe2, Theano, Torch, PyTorch, Chainer, and Keras. I found this approach interesting, but it is apparently of little interest to those creating a survey or developing an informed list about machine learning frameworks. Amazon is executing a quite clever play. In bridge, I think the phrase “trump card” suggests what the Bezos momentum machine has cooked up. Notice the past tense: this Amazon stuff has been chugging along in at least one US government agency for about four to four and one-half years.

Third, Google comes in dead last. What about IBM? What about Microsoft and its CNTK? Ah, another acronym, but I, as a non-“real” journalist, will reveal that this acronym means Microsoft Cognitive Toolkit. More information is available in Microsoft’s wonderful prose at this link. By the way, the Amazon machine learning spinning momentum thing supports CNTK. Imagine that? Right, I didn’t think so.

Net net: The machine learning framework list may benefit from a bit of refinement. On the other hand, just use Amazon and move down the road to a new type of smart software lock in. Want to know more? Write benkent2020 @ yahoo dot com and inquire about our for-fee Amazon briefing about machine learning, real-time data marketplaces, and a couple of other mostly off-the-radar activities. Have you seen Amazon’s facial recognition camera? It’s part of the Amazon machine learning initiative, and it has some interesting capabilities.

Stephen E Arnold, September 16, 2018

Useful AI Tools and Frameworks

July 6, 2018

We have found a useful resource: DZone shares “10 Open-Source Tools/Frameworks for Artificial Intelligence.” We do like open-source software. The write-up discusses the advantages offered by each entry in detail, so navigate there to compare and contrast the options. For example, regarding the popular TensorFlow, writer Somanath Veettil describes:

“TensorFlow is an open-source software library, which was originally developed by researchers and engineers working on the Google Brain Team. TensorFlow is for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API. TensorFlow provides multiple APIs. The lowest level API — TensorFlow Core — provides you with complete programming control. The higher level APIs are built on top of TensorFlow Core. These higher level APIs are typically easier to learn and use than TensorFlow Core. In addition, the higher level APIs make repetitive tasks easier and more consistent between different users. A high-level API like tf.estimator helps you manage data sets, estimators, training, and inference. The central unit of data in TensorFlow is the tensor. A tensor consists of a set of primitive values shaped into an array of any number of dimensions. A tensor’s rank is its number of dimensions.”
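The last two sentences of the quote are easy to see in code. A minimal sketch, assuming a modern TensorFlow install (the 2.x eager API, which hides the explicit graph-building the quote describes):

```python
import tensorflow as tf

scalar = tf.constant(3.0)                       # rank 0: a single value
vector = tf.constant([1.0, 2.0, 3.0])           # rank 1: one dimension
matrix = tf.constant([[1.0, 2.0], [3.0, 4.0]])  # rank 2: two dimensions

for t in (scalar, vector, matrix):
    print(int(tf.rank(t)), t.shape)  # rank and shape of each tensor
# 0 ()
# 1 (3,)
# 2 (2, 2)
```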

The rest of Veettil’s entries are these: Apache SystemML, Caffe, Apache Mahout, OpenNN, Torch, Neuroph, Deeplearning4j (the “j” is for Java), Mycroft, and OpenCog. I note that several options employ a neural network, but approach that technology in different ways. It is nice to have so many choices for implementing AI; now the challenge is to determine which system is best for one’s particular needs. This list could help with that.

Cynthia Murrell, July 6, 2018

The Spirit of 1862 Seems to Live On

July 2, 2018

Years ago I learned about a Confederate spy who worked in a telegraph office used by General Henry Halleck and General US Grant. The Confederate spy allegedly “filtered” orders. This man-in-the-middle exploit took place in 1862. You can find some information about this incident at this link. The Verge dipped into history for its 2013 write up “How Lincoln Used the Telegraph Office to Spy on Citizens Long Before the NSA.” Information about the US Signal Corps and Bell Telephone / AT&T is abundant.

Why am I dipping into history?

The reason is that I read several articles similar to “8 AT&T Buildings That Are Central to NSA Spying.” The Intercept’s story, which was treated as a bit of a surprise, triggered this cascade of “wow, what a surprise” copycat articles.

Even though I live in rural Kentucky, the “spy hubs” did not strike me as news, a surprise, or different from systems and methods in use in many countries. Just as Cairo, Illinois, was important to General Grant, cities with large populations and substantial data flows are important today.

Stephen E Arnold, July 2, 2018

CyberOSINT: Next Generation Information Access Explains the Tech Behind the Facebook, GSR, Cambridge Analytica Matter

April 5, 2018

In 2015, I published CyberOSINT: Next Generation Information Access. This is a quick reminder that the profiles of the vendors who have created software systems and tools for law enforcement and intelligence professionals remain timely.

The 200-page book provides examples, screenshots, and explanations of the tools available to analyze social media information. It is the most comprehensive rundown of the open source, commercial, and cloud-based systems which can make sense of social media data, lawful intercept data, and general text and imagery content.

Companies described in this collection of “tools” include:

  • Cyveillance (now LookingGlass)
  • Decisive Analytics
  • IBM i2 (Analysts Notebook)
  • Geofeedia
  • Leidos
  • Palantir Gotham
  • and more than a dozen other commercial and open source, high-impact cyberOSINT tool vendors.

The book is available for $49. Additional information is available on my Xenky.com Web site. You can buy the PDF book online at this link gum.co/cyberosint.

Get the CyberOSINT monograph. It’s the standard reference for practical and effective analysis, text analytics, and next generation solutions.

Stephen E Arnold, April 5, 2018
