mirror of
https://github.com/StrawberryMaster/wayback-machine-downloader.git
synced 2025-12-18 10:16:47 +00:00
Add new maximum snapshot page option
This commit is contained in:
parent
188b285848
commit
45f5f29826
46
README.md
@@ -26,20 +26,28 @@ It will download the last version of every file present on Wayback Machine to `.
 ## Advanced Usage
 
     Usage: wayback_machine_downloader http://example.com
 
     Download an entire website from the Wayback Machine.
 
    Optional options:
-        -d, --directory PATH            Directory to save the downloaded files to. Default is ./websites/ plus the domain name.
-        -f, --from TIMESTAMP            Only files on or after timestamp supplied (ie. 20060716231334)
-        -t, --to TIMESTAMP              Only files on or before timestamp supplied (ie. 20100916231334)
-        -o, --only ONLY_FILTER          Restrict downloading to urls that match this filter (use // notation for the filter to be treated as a regex)
-        -x, --exclude EXCLUDE_FILTER    Skip downloading of urls that match this filter (use // notation for the filter to be treated as a regex)
-        -a, --all                       Expand downloading to error files (40x and 50x) and redirections (30x)
-        -c, --concurrency NUMBER        Number of multiple files to download at a time. Default is one file at a time. (ie. 20)
-        -l, --list                      Only list file urls in a JSON format with the archived timestamps. Won't download anything.
-        -v, --version                   Display version
+        -d, --directory PATH            Directory to save the downloaded files into
+                                        Default is ./websites/ plus the domain name
+        -f, --from TIMESTAMP            Only files on or after timestamp supplied (ie. 20060716231334)
+        -t, --to TIMESTAMP              Only files on or before timestamp supplied (ie. 20100916231334)
+        -o, --only ONLY_FILTER          Restrict downloading to urls that match this filter
+                                        (use // notation for the filter to be treated as a regex)
+        -x, --exclude EXCLUDE_FILTER    Skip downloading of urls that match this filter
+                                        (use // notation for the filter to be treated as a regex)
+        -a, --all                       Expand downloading to error files (40x and 50x) and redirections (30x)
+        -c, --concurrency NUMBER        Number of multiple files to download at a time
+                                        Default is one file at a time (ie. 20)
+        -p, --snapshot-pages NUMBER     Maximum snapshot pages to consider (Default is 100)
+                                        Count an average of 150,000 snapshots per page
+        -l, --list                      Only list file urls in a JSON format with the archived timestamps, won't download anything.
+        -v, --version                   Display version
 
 ## Specify directory to save files to
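The `--only` and `--exclude` options above use a "// notation" rule: a filter wrapped in slashes is treated as a regular expression, anything else as a plain substring match. A minimal Ruby sketch of that rule (the method name `match_filter?` is hypothetical, not the gem's internal API):

```ruby
# Hypothetical sketch of the "// notation" filter rule described above.
# A filter like "/blog\/.*\.html/" is treated as a regex; "blog" as a substring.
def match_filter?(url, filter)
  if filter.length > 1 && filter.start_with?("/") && filter.end_with?("/")
    # Strip the surrounding slashes and compile the remainder as a regex.
    !!(url =~ Regexp.new(filter[1..-2]))
  else
    # Plain filters are simple substring matches.
    url.include?(filter)
  end
end

puts match_filter?("http://example.com/blog/a.html", "/\\.html$/")  # prints true
puts match_filter?("http://example.com/img/x.png", "/\\.html$/")    # prints false
```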
@@ -121,6 +129,16 @@ Example:
 
     wayback_machine_downloader http://example.com --list
 
+## Maximum number of snapshot pages to consider
+
+    -p, --snapshot-pages NUMBER
+
+Optional. Specify the maximum number of snapshot pages to consider. Count an average of 150,000 snapshots per page. 100 is the default maximum number of snapshot pages and should be sufficient for most websites. Use a bigger number if you want to download a very large website.
+
+Example:
+
+    wayback_machine_downloader http://example.com --snapshot-pages 300
+
 ## Download multiple files at a time
 
     -c, --concurrency NUMBER
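The `--snapshot-pages` cap bounds how many pages of snapshot listings are requested before downloading begins: the downloader stops early when a page comes back empty, otherwise at the cap. A minimal Ruby sketch of that pattern (method and variable names are illustrative, not the gem's internals):

```ruby
# Hypothetical sketch: collect snapshot listings page by page, stopping at the
# first empty page or when max_pages (the --snapshot-pages value) is reached.
def collect_snapshot_pages(max_pages, &fetch_page)
  snapshots = []
  max_pages.times do |page|
    batch = fetch_page.call(page)
    break if batch.empty?  # an empty page means no further results
    snapshots.concat(batch)
  end
  snapshots
end

# Simulated fetcher: three non-empty pages, then an empty one.
pages = [[:a, :b], [:c], [:d], []]
puts collect_snapshot_pages(100) { |i| pages[i] || [] }.length  # prints 4
```

With a small cap (e.g. `collect_snapshot_pages(2)`), only the first two pages are collected even though more exist, which is why very large sites may need a value above the default of 100.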