Downloading Web Sites: Some Useful Information Available

February 20, 2020

Do you want to download a Web site or the content available from a specific URL? What seems easy can become a tricky problem. For example, Google offers “feature” content which is more difficult to download than our DarkCyber video news program. Presumably flowing acrylic paint has more value than information about policeware software.

There are tools available; for example, Cyotek WebCopy and HTTrack, among others. But many of the available Web site downloaders run into problems with modern Web sites that any “regular” browser can display. The challenges come from the general Wild West in which Internet accessible content resides.

One site ripping program goes an extra step. Whether you download the free version of Microsys’ A1 Web Site Downloader or pay for it, the developers have created a quite useful series of help pages. Many of the problems one can encounter when trying to suck down text, images, videos, or other content are addressed.

Navigate to the Microsys help pages and browse the list of topics available. Note that the help is written for the A1 Web Site Downloader, but the information is likely to be useful if you are using another program or if you are trying to code your own site ripper.

The topics addressed by Microsys include:

  • Some basics like how to restrict how many pages are grabbed
  • Frequent problems encountered; for example, no urls located
  • The types of “options” available; for instance, dealing with Unicode. These “options” provide a useful checklist of important functions to include if you are rolling your own downloader. If you are trying to decide which alternative to A1 Web Site Downloader to use, the list is useful.
  • A rundown of frequently encountered errors and response codes; for example, hard and soft 404s
  • A summary of common platforms. (We liked the inclusion of information about eBay store data.)
  • General questions about the A1 software.
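
For those rolling their own downloader, a few of the checklist items above can be sketched in Python using only the standard library. This is a minimal illustration, not how A1 Web Site Downloader works. The names crawl, http_fetch, looks_like_soft_404, and max_pages are hypothetical, and the soft 404 test is a crude phrase match that a real site ripper would need to tune.

```python
import urllib.error
import urllib.request
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def looks_like_soft_404(status, body):
    """Heuristic (an assumption): a 200 response that reads like an error page."""
    text = body.lower()
    return status == 200 and ("page not found" in text or "404 not found" in text)


def http_fetch(url):
    """Fetch one URL and return (status, text); bad bytes are replaced, not fatal."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        return err.code, ""


def crawl(start_url, fetch=http_fetch, max_pages=50):
    """Breadth-first crawl of a single host that stops after max_pages fetches."""
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    pages, problems = {}, {}
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        status, body = fetch(url)
        fetched += 1
        if status == 404:
            problems[url] = "hard 404"
            continue
        if looks_like_soft_404(status, body):
            problems[url] = "soft 404"
            continue
        if status != 200:
            problems[url] = f"status {status}"
            continue
        pages[url] = body
        parser = LinkParser()
        parser.feed(body)
        for href in parser.links:
            # Resolve relative links and drop #fragments before de-duplicating.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages, problems
```

Passing the fetch function as a parameter lets one try the crawl logic against a canned set of pages before pointing it at a live site. If problems comes back full and pages comes back nearly empty, that is the “no urls located” situation the help pages discuss.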

You can access the software and the useful help information via the Microsys Web site at this link. Version 1.0 is free. The current version is about US$40.

DarkCyber pays some attention to software which purports to download Web sites. If you want to download Dark Web sites or content accessible via an obfuscation system, you will have to look elsewhere or do your own programming.

Stephen E Arnold, February 20, 2020

