OSINT Is Popular. Just Exercise Caution

November 2, 2022

Many have embraced open source intelligence as the solution to competitive intelligence, law enforcement investigations, and “real” journalists’ data gathering tasks.

In many situations, most of those disciplines can benefit from OSINT, as open source intelligence is called. However, as we work on my follow-up monograph to CyberOSINT and the Dark Web Notebook, we have identified some potential blind spots for OSINT enthusiasts.

I want to mention one example of what happens when clever technologists mesh hungry OSINT investigators with some online trickery.

Navigate to privtik.com (78.142.29.185). At this site you will find:

image

But there is a catch, and a not too subtle one:

image

The site includes mandatory choices in order to access the “secret” TikTok profile.

How many OSINT investigators use this service? Not too many at this time. However, we have identified other, similar services. Many of these reside on what we call “ghost ISPs.” If you are not aware of these services, that’s not surprising. As the frenzy about the “value” of open source investigations increases, geotag spoofing, fake data, and scams will escalate. What happens if those doing research do not verify what is provided or look into the behind-the-scenes data gathering?

That’s a good question and one that gets little attention in much OSINT training. If you want to see useful OSINT resources, check www.osintfix.com. Each click displays one of the OSINT resources we find interesting.

Stephen E Arnold, November 2, 2022

What Do Quasi Monopolies Do? What Big Outfits Have Done for Decades: Keep On Keeping On

November 2, 2022

The race is on. With the advertising money machines making some unpleasant sounds, the big tech companies are doing what big companies do.

Google’s ad revenues softened. The Zuckbook whines about Apple’s ad plays. Apple is gearing up to suck in ad dollars. Amazon is posting so many ads that when I search for T shirts, I can’t figure out what’s what.

And this is just the beginning.

What’s coming? Ah, you don’t care. I don’t either. Here are some prognostications from the Beyond Search team:

  1. More ads than ever. Everywhere. Constantly. (Why bother with objective content? Do advertorials.)
  2. The dunce advertisers have no choice but a few big outfits; thus, advertisers will choke down questions about ad fraud and fee manipulation.
  3. Consumers will pay for these less and less effective ads with higher and higher prices. Zero gravity, right? The money floats out of individuals’ wallets. Zip zip.
  4. Government regulators will do what they do best: have meetings and maybe hold a hearing or two so we can hear, “Senator, thank you for that question…”

Pretty bleak, right? Want to push back? You will be fighting what sure look like monopolies, legions of attorneys, and probably some other folks as well.

Is this the attention revolution? Nope. You will have less and less attention between more and more advertising.

Stephen E Arnold, November 2, 2022

OpenAI and The Evolution of Academic Cheating

November 2, 2022

Once considered too dangerous for public release, OpenAI’s text generator first ventured forth as a private beta. Now a version called Playground is available to everyone and is even free for the first three months (or the first 1,200,000 characters, whichever comes first). Leave it to the free market to breeze past considerations of misuse. We learn from Vice Motherboard that one key concern has materialized: “Students Are Using AI to Write Their Papers, Because Of Course They Are.” It did not take students long to realize this cheat slips right past plagiarism-detecting software because it is not technically plagiarism. Reporter Claire Woodcock writes:

“George Veletsianos, Canada Research Chair in Innovative Learning & Technology and associate professor at Royal Roads University says this is because the text generated by systems like OpenAI API are technically original outputs that are generated within a black box algorithm. ‘[The text] is not copied from somewhere else, it’s produced by a machine, so plagiarism checking software is not going to be able to detect it and it’s not able to pick it up because the text wasn’t copied from anywhere else,’ Veletsianos told Motherboard. ‘Without knowing how all these other plagiarism checking tools quite work and how they might be developed in the future, I don’t think that AI text can be detectable in that way.’ It’s unclear whether the companies behind the AI tools have the ability to detect or prevent students from using them to do their homework. OpenAI did not comment in time for publication.”

It was inevitable, really. One writing instructor quoted in the story recognizes today’s students can easily accumulate more knowledge than ever before. However, he laments losing the valuable process of gaining that knowledge through exploration if writing assignments become moot. The tutor has a point, but there is likely no turning back now. Perhaps there is a silver lining: academic institutions may finally be forced to teach like they exist in the 21st century. Students are already there. One cited only as innovate_rye states:

“I still do my homework on things I need to learn to pass, I just use AI to handle the things I don’t want to do or find meaningless. If AI is able to do my homework right now, what will the future look like? These questions excite me.”

That is one way to look at it. Perhaps the spirit of exploration is not dead, but rather evolving. Colleges and universities must find a way to keep up or risk becoming irrelevant.

A failure means that students will learn that cheating is the norm. Such progress.

Cynthia Murrell, November 2, 2022

New Hardware for Smart Software from IBM

November 2, 2022

IBM is getting into the AI hardware acceleration game with its new Artificial Intelligence Unit (AIU), we learn from VentureBeat’s piece, “IBM Announces System-On-Chip AI Hardware.” Each AIU holds 32 cores similar to the Telum chip’s AI core. Rather than a CPU or GPU, the new component is an application-specific integrated circuit (ASIC) designed with AI in mind. This allows it to perform tasks not part of many AI accelerators, we’re told, like the ability to virtualize AI acceleration services. We are assured it is compatible with “the vast majority” of software commonly used by data scientists.

So far so good, but we noticed something in a passage tucked at the end of the write-up—it almost seems results are merely close enough for horseshoes and hand grenades. In order to work faster, the AIU practices “approximate computing.” Kerner tells us:

“Approximate computing is really the recognition that AI is not 100% correct,’ Leland Chang, principal research staff member and senior manager, AI hardware, at IBM Research, told VentureBeat. Chang explained that AI often works by recognizing a pattern and could well be just 99% accurate, meaning that 1% of results are incorrect. The concept of approximate computing is the recognition that within the AI algorithm it is possible to cut some corners. While Chang admitted that this can reduce precision, he explained that if information is lost in the right places, it doesn’t affect the result — which, more often than not, will still be 99% correct. ‘Approximate computing … is simply recognizing that it doesn’t have to be 100% exact,’ Chang said. ‘You’re losing some information, but you’re losing in places where it doesn’t matter.'”

You don’t say. Can we get a guarantee of that? Who makes the electronic components? Oh, right. Bad question.
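For readers who want to see what “losing information in places where it doesn’t matter” can look like, here is a minimal sketch. It is not IBM’s method and says nothing about how the AIU works; it simply rounds some made-up classifier scores to 16-bit half precision using Python’s standard `struct` module and checks whether the top-scoring class changes. The scores and the helper names `to_half` and `argmax` are assumptions invented for the illustration.

```python
import struct

def to_half(x: float) -> float:
    """Round a 64-bit Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]

def argmax(values):
    """Return the index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])

# Hypothetical classifier scores, invented for this illustration.
scores = [0.1, 2.5, 0.3]
approx = [to_half(s) for s in scores]

print(approx)                          # low-precision copies of the scores
print(argmax(scores), argmax(approx))  # winning class index, exact vs. approximate
```

The winner is the same in both cases only because these scores are far apart. When two scores are nearly tied, rounding can flip the result, which is why the question about a guarantee is a fair one.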

Cynthia Murrell, November 2, 2022

The Failure of Search: Let Many Flowers Bloom and… Die Alone and Sad

November 1, 2022

I read “Taxonomy is Hard.” No argument from me. Yesterday (October 31, 2022) I spoke with a longtime colleague and friend. Our conversations usually include some discussion about the loss of the expertise embodied in the early commercial database firms. The old frameworks, work processes, and shared beliefs among the top 15 or 20 for-fee online database companies seem to have scattered and recycled in a quantum crazy digital world. We did not mention Google once, but we could have. My colleague and I agreed on several points:

  • Those who want to make digital information must have an informing editorial policy; that is, what’s the content space, what’s included, what’s excluded, and what problem does the commercial database solve
  • Finding information today is more difficult than it has been at any time in our two professional lives. We don’t know if the data are current and accurate (are online corrections made when publications issue fixes?), or whether the data fit within an editorial policy, if there is one, or reflect a lack of policy shaped by the invisible hand of politics, advertising, and indifference to intellectual nuances. In some services, “old” data are disappeared, presumably due to the cost of maintaining, updating (if that is actually done), and working out how to make in-depth queries work within available time and budget constraints
  • The steady erosion of precision and recall as reliable yardsticks for determining what a search system can find within a specific body of content
  • Professional indexing and content curation is being compressed or ignored by many firms. The process is expensive, time consuming, and intellectually difficult.
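Since precision and recall come up as yardsticks, a short sketch of how the two are computed may help. Precision is the share of retrieved items that are relevant; recall is the share of relevant items that were retrieved. The function name `precision_recall` and the document IDs are hypothetical, chosen only for the example.

```python
def precision_recall(retrieved, relevant):
    """Compute (precision, recall) for one query.

    retrieved: IDs the search system returned
    relevant:  IDs a subject matter expert judged relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant items the system actually found
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical result list and relevance judgments for one query.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"])
print(f"precision={p:.2f} recall={r:.2f}")
```

Both numbers depend on knowing the full set of relevant items in the collection. That is one reason the yardsticks erode when no one controls or documents what the content space includes.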

The cited article reflects some of these issues. However, the mirror is shaped by the systems and methods in use today. The approaches pivot on metadata (index terms) and tagging (more indexing). The approach is understandable. The shift to technology which slashes the need for subject matter experts, manual methods, meetings about specific terms or categories, and the other impedimenta is the new normal.

Several observations:

  1. The problems of social media boil down to editorial policies. Without these guard rails and the specialists needed to maintain them, finding specific items of information on widely used platforms like Facebook, TikTok, or Twitter, among others is difficult
  2. The challenges of processing video are enormous. The obvious fix is to gate the volume and implement specific editorial guidelines before content is made available to a user. Skipping this basic work task leads to the craziness evident in many services today
  3. Indexing can be supplemented by smart software. However, that smart software can drift off course, so specialists have to intervene and recalibrate the system.
  4. Semantic, statistical, or behavior centric methods for identifying and suggesting possible relevant content require the same expert centric approach. There is no free lunch in automated indexing, even for narrow vocabulary technical fields like nuclear physics or engineered materials. What smart software knows how to deal with new breakthroughs in physics which emerge from the study of inter cell behavior among proteins in the human brain?

Net net: Is it time to re-evaluate some discarded systems and methods? Is it time to accept the fact that technology cannot solve in isolation certain problems? Is it time to recognize that close enough for horseshoes and good enough are not appropriate when it comes to knowledge centric activities? Search engines die when the information garden cannot support the buds and shoots of finding useful information the user seeks.

Stephen E Arnold, November 1, 2022

Threats of a Digital Death: Legal Eagles, Pay Attention

November 1, 2022

I read “Twitter Users Plot Revenge on Elon Musk by Killing the Platform.” I don’t have a dog (real or a digital Zuck confection) in this fight. Frankly I never thought that individuals would admit that their actions harmed a commercial enterprise. Nor did I think these individuals would use their Twitter handles and the tweeter system to organize a group action to harm the Fail Whale. Plus, these people displayed their photographs and probably have profiles available online. (How difficult will it be for some to identify these actors and provide that information to a legal eagle known to chase ambulances?)

The write up states:

…tweeters are conspiring to help Twitter suffer a similar fate by sh*tposting and furry-frying Musk.

“Furry-frying?” Sounds interesting. Will there be a TikTok video?

There are illustrative tweets about what to do and how to accomplish the goal of causing the tweeter to die while trying to get aloft.

Remarkable information if accurate. I wonder if social media analytics systems can pinpoint these actors and take action; for example, emailing tweets and “personas” to a musky law firm?

My hunch is that this idea may occur to someone on the Musk team.

Stephen E Arnold, November 1, 2022

TikTok: Is Ad Integrity Job Number One?

November 1, 2022

Nope.

Syrian refugees are still in desperate need of support, and responding to pleas on TikTok is an understandable impulse. However, one should consider how much of any donation will actually help intended recipients and how much will slide into other pockets along the way. The BBC reveals, “TikTok Profits from Livestreams of Families Begging.” Reporters Hannah Gelbart, Mamdouh Akbiek and Ziad Al-Qattan write:

“Children are livestreaming on the social media app for hours, pleading for digital gifts with a cash value. The BBC saw streams earning up to $1,000 (£900) an hour, but found the people in the camps received only a tiny fraction of that.”

In fact, BBC researchers found TikTok owner ByteDance was taking up to 70% of donations meant for Syrian refugees. But wait, there’s more. Of the remaining 30%, 10% went to the local equivalent of Western Union and a hefty 35% of the last fifth went to a middleman, leaving the actual family with a paltry sum. For middlemen, though, this is quite the opportunity. We learn:

“In the camps in north-west Syria, the BBC found that the trend was being facilitated by so-called ‘TikTok middlemen,’ who provided families with the phones and equipment to go live. The middlemen said they worked with agencies affiliated to TikTok in China and the Middle East, who gave the families access to TikTok accounts. … Hamid, one of the TikTok middlemen in the camps, told the BBC he had sold his livestock to pay for a mobile phone, SIM card and wi-fi connection to work with families on TikTok. He now broadcasts with 12 different families, for several hours a day. Hamid said he uses TikTok to help families make a living. He pays them most of the profits, minus his running costs, he said.”

Yes, we are sure he has quite the overhead. Note it is the families putting in the most effort here, pouring their hearts out to strangers for hours each day. Yet TikTok insists none of its Terms of Use are being violated, including the provision to “prevent the harm, endangerment or exploitation” of minors. Unfortunately, residents of many of these camps have few options because local charities are stretched way too thin. For now, TikTok and its middlemen seem to be the only place many can turn.
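The split of each donation described above can be worked through in a few lines. This is a rough sketch and not the BBC’s calculation. The percentages are the ones cited in this post, and the only assumption is how to read the 10% money transfer fee: as 10 points of the whole donation (which leaves the “last fifth”) or as 10% of the 30% that remains after TikTok’s cut. The function `family_share` and its flag are hypothetical names.

```python
from fractions import Fraction

PLATFORM_CUT = Fraction(70, 100)   # "up to 70%", per the BBC figures cited in the post
TRANSFER_CUT = Fraction(10, 100)   # local money transfer shop
MIDDLEMAN_CUT = Fraction(35, 100)  # taken from whatever is left at that point

def family_share(transfer_cut_is_of_total: bool) -> Fraction:
    """Return the fraction of a donation that reaches the family."""
    remaining = 1 - PLATFORM_CUT
    if transfer_cut_is_of_total:
        remaining -= TRANSFER_CUT              # 10 points of the whole donation
    else:
        remaining -= remaining * TRANSFER_CUT  # 10% of what TikTok left behind
    remaining -= remaining * MIDDLEMAN_CUT
    return remaining

for label, flag in (("fee on whole donation", True), ("fee on remainder", False)):
    share = family_share(flag)
    print(f"{label}: {float(share):.2%} of each donation reaches the family")
```

Under the first reading the family keeps 13 cents of each donated dollar; under the second, about 17.5 cents. Either way, the paltry sum is real.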

Cynthia Murrell, November 1, 2022

Microsoft Downplays Revelation of Massive Data Leak

November 1, 2022

Microsoft customers have reason to be annoyed despite the company’s insistence there is nothing to see here. “Microsoft Under Fire After Leaking 2.4TB of Data from Customers Including Contracts, Emails, and More,” reveals Tech Times. Citing a report by cybersecurity firm SOCRadar, writer Joseph Henry tells us:

“According to SOCRadar post, 2.4TB of confidential data from more than 65,000 entities has been leaked because of the misconfiguration in the data bucket. The cybersecurity firm confirms that the data involved in the leak include State of Work (SoW) documents, PII (Personally Identifiable Information) data, Proof-of-Execution (PoE) data, customer emails, project details, product offers, and more. SOCRadar also notes that the above mentioned data spanned five years, particularly from 2017 to August 2022. It should be noted that Microsoft did not include the number of affected customers in its announcement. Unfortunately, instead of acknowledging SOCRadar’s finding, the Redmond giant downplayed the statement by disapproving of its post. Microsoft added that its investigation showed that no customer accounts were compromised in the process.”

Really? What a stroke of good fortune. Henry goes on to share some customer comments regarding the data leak as collected by Ars Technica. Apparently few are reassured by the company’s insistence SOCRadar is exaggerating. If nothing else, some note, this incident highlights Microsoft’s policy of retaining sensitive information in perpetuity. That is not exactly a security best practice. See the SOCRadar post for its description of the misconfiguration that caused this kerfuffle and its potential ramifications.

Which big tech giant will be the next one to get an F in security? My hunch is that it is Amazon’s turn to lose the game of cyber security musical chairs.

Cynthia Murrell, November 1, 2022
