mirror of https://github.com/StrawberryMaster/wayback-machine-downloader.git (synced 2025-12-17 17:56:44 +00:00)
Merge branch 'master' of https://github.com/StrawberryMaster/wayback-machine-downloader
commit 01f0fc9587
32
README.md
@@ -7,7 +7,7 @@ This is a fork of the [Wayback Machine Downloader](https://github.com/hartator/w
 Included here is partial content from other forks, namely those @ [ShiftaDeband](https://github.com/ShiftaDeband/wayback-machine-downloader) and [matthid](https://github.com/matthid/wayback-machine-downloader) — attributions are in the code and go to the original authors; as well as a few additional (future) features.
 
 ## Installation
-Note: You need to install Ruby on your system (>= 2.3) to run ths program — if you don't already have it.
+Note: You need to install Ruby on your system (>= 2.3) to run this program — if you don't already have it.
 
 1. Clone/download this repository
 2. In your terminal (e.g. Command Prompt, PowerShell, Windows Terminal), navigate to the directory where you cloned/downloaded this repository
@@ -44,7 +44,7 @@ There are a few constants that can be edited in the `wayback_machine_downloader.
 
 Run wayback_machine_downloader with the base url of the website you want to retrieve as a parameter (e.g., https://example.com):
 
-wayback_machine_downloader https://example.com
+ruby wayback_machine_downloader https://example.com
 
 ## How it works
 
@@ -52,7 +52,7 @@ It will download the last version of every file present on Wayback Machine to `.
 
 ## Advanced Usage
 
-Usage: wayback_machine_downloader https://example.com
+Usage: ruby wayback_machine_downloader https://example.com
 
 Download an entire website from the Wayback Machine.
 
@@ -82,7 +82,7 @@ Optional. By default, Wayback Machine Downloader will download files to `./websi
 
 Example:
 
-wayback_machine_downloader https://example.com --directory downloaded-backup/
+ruby wayback_machine_downloader https://example.com --directory downloaded-backup/
 
 ## All Timestamps
 
@@ -92,7 +92,7 @@ Optional. This option will download all timestamps/snapshots for a given website
 
 Example:
 
-wayback_machine_downloader https://example.com --all-timestamps
+ruby wayback_machine_downloader https://example.com --all-timestamps
 
 Will download:
 websites/example.com/20060715085250/index.html
@@ -109,7 +109,7 @@ Wayback Machine Downloader will then fetch only file versions on or after the ti
 
 Example:
 
-wayback_machine_downloader https://example.com --from 20060716231334
+ruby wayback_machine_downloader https://example.com --from 20060716231334
 
 ## To Timestamp
 
@@ -120,7 +120,7 @@ Wayback Machine Downloader will then fetch only file versions on or before the t
 
 Example:
 
-wayback_machine_downloader https://example.com --to 20100916231334
+ruby wayback_machine_downloader https://example.com --to 20100916231334
 
 ## Exact Url
 
@@ -130,7 +130,7 @@ Optional. If you want to retrieve only the file matching exactly the url provide
 
 For example, if you only want to download only the html homepage file of example.com:
 
-wayback_machine_downloader https://example.com --exact-url
+ruby wayback_machine_downloader https://example.com --exact-url
 
 
 ## Only URL Filter
@@ -141,11 +141,11 @@ Optional. You may want to retrieve files which are of a certain type (e.g., .pdf
 
 For example, if you only want to download files inside a specific `my_directory`:
 
-wayback_machine_downloader https://example.com --only my_directory
+ruby wayback_machine_downloader https://example.com --only my_directory
 
 Or if you want to download every images without anything else:
 
-wayback_machine_downloader https://example.com --only "/\.(gif|jpg|jpeg)$/i"
+ruby wayback_machine_downloader https://example.com --only "/\.(gif|jpg|jpeg)$/i"
 
 ## Exclude URL Filter
 
@@ -155,11 +155,11 @@ Optional. You may want to retrieve files which aren't of a certain type (e.g., .
 
 For example, if you want to avoid downloading files inside `my_directory`:
 
-wayback_machine_downloader https://example.com --exclude my_directory
+ruby wayback_machine_downloader https://example.com --exclude my_directory
 
 Or if you want to download everything except images:
 
-wayback_machine_downloader https://example.com --exclude "/\.(gif|jpg|jpeg)$/i"
+ruby wayback_machine_downloader https://example.com --exclude "/\.(gif|jpg|jpeg)$/i"
 
 ## Expand downloading to all file types
 
@@ -169,7 +169,7 @@ Optional. By default, Wayback Machine Downloader limits itself to files that res
 
 Example:
 
-wayback_machine_downloader https://example.com --all
+ruby wayback_machine_downloader https://example.com --all
 
 ## Only list files without downloading
 
@@ -179,7 +179,7 @@ It will just display the files to be downloaded with their snapshot timestamps a
 
 Example:
 
-wayback_machine_downloader https://example.com --list
+ruby wayback_machine_downloader https://example.com --list
 
 ## Maximum number of snapshot pages to consider
 
@@ -189,7 +189,7 @@ Optional. Specify the maximum number of snapshot pages to consider. Count an ave
 
 Example:
 
-wayback_machine_downloader https://example.com --snapshot-pages 300
+ruby wayback_machine_downloader https://example.com --snapshot-pages 300
 
 ## Download multiple files at a time
 
@@ -199,7 +199,7 @@ Optional. Specify the number of multiple files you want to download at the same
 
 Example:
 
-wayback_machine_downloader https://example.com --concurrency 20
+ruby wayback_machine_downloader https://example.com --concurrency 20
 
 ## Contributing
 