mirror of
https://github.com/StrawberryMaster/wayback-machine-downloader.git
synced 2025-12-29 16:16:06 +00:00
Add downloading in multi threads
31
README.md
@@ -24,9 +24,9 @@ It will download the last version of every file present on Wayback Machine to `.

## Advanced Usage

    Usage: wayback_machine_downloader http://example.com

    Download an entire website from the Wayback Machine.

    Optional options:
        -f, --from TIMESTAMP          Only files on or after timestamp supplied (ie. 20060716231334)
        -t, --to TIMESTAMP            Only files on or before timestamp supplied (ie. 20100916231334)
@@ -34,12 +34,13 @@ It will download the last version of every file present on Wayback Machine to `.
        -x, --exclude EXCLUDE_FILTER  Skip downloading of urls that match this filter (use // notation for the filter to be treated as a regex)
        -a, --all                     Expand downloading to error files (40x and 50x) and redirections (30x)
        -l, --list                    Only list file urls in a JSON format with the archived timestamps. Won't download anything.
            --threads NUMBER          Number of threads to use while downloading website (ie. 20)
        -v, --version                 Display version

## From Timestamp

    -f, --from TIMESTAMP

Optional. You may want to supply a from timestamp to lock your backup to a specific version of the website. Timestamps can be found inside the URLs of the regular Wayback Machine website (e.g., http://web.archive.org/web/20060716231334/http://example.com). You can also use a year (2006), a year and month (200607), etc. It can be used in combination with To Timestamp.
Wayback Machine Downloader will then fetch only file versions on or after the timestamp specified.
@@ -50,7 +51,7 @@ Example:

## To Timestamp

    -t, --to TIMESTAMP

Optional. You may want to supply a to timestamp to lock your backup to a specific version of the website. Timestamps can be found inside the URLs of the regular Wayback Machine website (e.g., http://web.archive.org/web/20100916231334/http://example.com). You can also use a year (2010), a year and month (201009), etc. It can be used in combination with From Timestamp.
Wayback Machine Downloader will then fetch only file versions on or before the timestamp specified.
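
To illustrate how the From and To timestamps can bound a download when they are shortened to a year or a year and month, here is a minimal Ruby sketch. It is not the tool's actual code: the `in_range?` helper is hypothetical, and the padding rule (short `from` values rounded down, short `to` values rounded up) is an assumption about how inclusive bounds on 14-digit timestamps could be interpreted.

```ruby
# Hypothetical helper, not part of wayback_machine_downloader.
# Assumption: a short `from` (e.g. "2006") means the start of that period and
# a short `to` (e.g. "201009") means the end of it. Both bounds are inclusive.
def in_range?(timestamp, from: nil, to: nil)
  ts = timestamp.to_s
  # Pad to the full 14-digit form so plain string comparison works.
  return false if from && ts < from.to_s.ljust(14, '0')
  return false if to && ts > to.to_s.ljust(14, '9')
  true
end

snapshots = %w[20051231235959 20060716231334 20100916231334 20101001000000]
kept = snapshots.select { |ts| in_range?(ts, from: '2006', to: '201009') }
puts kept.inspect
```

Because every padded value has the same length and contains only digits, comparing the strings gives the same ordering as comparing the dates.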
@@ -67,9 +68,9 @@ Optional. You may want to retrieve files which are of a certain type (e.g., .pdf
For example, if you only want to download files inside a specific `my_directory`:

    wayback_machine_downloader http://example.com --only my_directory

Or if you want to download every image and nothing else:

    wayback_machine_downloader http://example.com --only "/\.(gif|jpg|jpeg)$/i"

## Exclude URL Filter
@@ -81,11 +82,11 @@ Optional. You may want to retrieve files which aren't of a certain type (e.g., .
For example, if you want to avoid downloading files inside `my_directory`:

    wayback_machine_downloader http://example.com --exclude my_directory

Or if you want to download everything except images:

    wayback_machine_downloader http://example.com --exclude "/\.(gif|jpg|jpeg)$/i"
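
The `--only` and `--exclude` examples accept either a plain string or a filter written in `//` notation. The Ruby sketch below shows one way such a filter could be interpreted. The helper names `filter_to_regex` and `matches_filter?` are hypothetical, and the case-insensitive substring fallback is an assumption rather than the tool's confirmed behavior.

```ruby
# Hypothetical helpers, not part of wayback_machine_downloader.

# Returns a Regexp when the filter is written as /pattern/flags, else nil.
def filter_to_regex(filter)
  match = filter.match(%r{\A/(.*)/([imx]*)\z}m)
  return nil unless match

  options = 0
  options |= Regexp::IGNORECASE if match[2].include?('i')
  options |= Regexp::MULTILINE if match[2].include?('m')
  options |= Regexp::EXTENDED if match[2].include?('x')
  Regexp.new(match[1], options)
end

# True when the URL matches the filter, as a regex or as a plain substring.
def matches_filter?(url, filter)
  regex = filter_to_regex(filter)
  if regex
    !regex.match(url).nil?
  else
    # Assumption: plain filters are compared case-insensitively.
    url.downcase.include?(filter.downcase)
  end
end

urls = %w[
  http://example.com/index.html
  http://example.com/my_directory/page.html
  http://example.com/img/logo.JPG
]
only_images = urls.select { |u| matches_filter?(u, '/\.(gif|jpg|jpeg)$/i') }
without_dir = urls.reject { |u| matches_filter?(u, 'my_directory') }
puts only_images.inspect
puts without_dir.inspect
```

With this shape, `--only` keeps the URLs for which `matches_filter?` is true and `--exclude` drops them, so both options can share the same matching code.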

## Expand downloading to all file types

    -a, --all
@@ -106,6 +107,16 @@ Example:

    wayback_machine_downloader http://example.com --list

## Download using Ruby threads

    --threads NUMBER

Optional. Default is 1. Number of threads to use while downloading the website.

Example:

    wayback_machine_downloader http://example.com --threads 20
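
As a rough picture of what downloading with several threads involves, the Ruby sketch below has a fixed number of worker threads pull items from a shared queue. It is an illustration only: `each_in_threads` is a hypothetical helper, not the code added by this commit, and the "download" step is replaced by a stand-in so the example needs no network access.

```ruby
# Hypothetical helper, not the tool's real code: calls the block once per
# item, spreading the work across up to `thread_count` worker threads.
# Items must not be nil or false, since nil marks the end of the queue.
def each_in_threads(items, thread_count = 1, &block)
  queue = Queue.new
  items.each { |item| queue << item }
  queue.close # once closed and drained, pop returns nil (Ruby 2.3+)

  workers = Array.new([thread_count, 1].max) do
    Thread.new do
      while (item = queue.pop)
        block.call(item)
      end
    end
  end
  workers.each(&:join) # wait for every worker to finish
end

# Stand-in for a download: record each path instead of fetching it.
fetched = Queue.new # Queue is thread-safe, so workers can push to it freely
each_in_threads(%w[index.html style.css app.js], 2) { |path| fetched << path }
puts "processed #{fetched.size} files"
```

In the standard Ruby interpreter, threads of this kind mainly help with work that waits on the network, which is why a larger thread count can speed up a download without using more CPU cores.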

## Using the Docker image

As an alternative installation method, we have a Docker image! Retrieve the wayback-machine-downloader Docker image this way:
@@ -124,7 +135,7 @@ To run the tests:

    bundle install
    bundle exec rake test

## Donation

Wayback Machine Downloader is free and open source.