150 Commits

Author SHA1 Message Date
Felipe
0c701ee890
Fetching API calls sequentially
Although the Wayback Machine (WM) API is particularly wonky and this will not prevent all errors, this aligns better with what we have here.
2025-03-29 22:27:01 +00:00
Felipe
2243958643
Fixes in cases of too many redirects or files not found 2025-02-09 16:48:52 +00:00
Felipe
9283f04a57 Added ability to download rewritten Wayback Archive files 2025-01-02 12:17:20 +00:00
Felipe
b38d528656 typo fix 2025-01-01 12:20:06 +00:00
Felipe
4d5f187f15 Proper connection pool lifecycle management 2024-12-31 16:48:29 +00:00
Felipe
7de1c5a028 typo fix 2024-12-31 15:03:28 +00:00
Felipe
75617060d7 Workflow fixes, pt.3
You've gotta be squidding me. How did I never notice this
2024-12-05 12:11:16 +00:00
Felipe
02785b2eba Workflow fixes, pt. 1 2024-12-05 12:00:44 +00:00
Felipe
d1b70d83b1 Minor cleanup 2024-12-05 11:53:38 +00:00
Felipe
45fa2be573 Significant refactoring
including extra config settings, a proper rate limit, and a logger. Fixes: #307 #291 #281 #269 and probably others too
2024-12-03 00:23:47 +00:00
Felipe
a3ac4e0341 Minor cleanup 2024-06-26 20:30:59 +00:00
Felipe
93a6fb3c1b typo 2024-06-26 19:52:34 +00:00
Felipe
509d7034a1 Setting file modified time to value reported by Wayback Machine
Implements 937306712c564e5757d898feacc14fbabd10722d, fixes #174 (Maintain original creation/modified dates of files while downloading)
2024-06-26 19:52:12 +00:00
Felipe
0a7752eb41 Minor cleanup 2024-06-26 19:47:19 +00:00
Felipe
cff30f529e Using Net::HTTP and decompressing gzip content
see https://github.com/ShiftaDeband/wayback-machine-downloader and bf6e33c2fe
2024-06-26 16:54:55 +00:00
hartator
cf770c2e55 Bump gem version 2021-09-04 01:51:08 -05:00
Paul Wise
9da87bfa74
Make URI#open cross Ruby versions compatible
Inspired-by: commit 30475c5c9e1d92d63b75dc5f22a40dd16c1aa23a
2021-06-08 07:59:38 +08:00
hartator
83b4f880b1 Bump Gem version 2021-06-06 19:47:48 -05:00
Paul Wise
ba4ca60377
Do not emit a comma for the final item in JSON output
This avoids producing JSON that is not parsable.
2021-05-03 20:54:29 +08:00
Paul Wise
06e25957b6
Print progress messages to stderr when printing JSON
This avoids the messages breaking JSON parsing when
the output is being redirected to a file and parsed.
2021-05-03 20:52:28 +08:00
Paul Wise
cd29f79fd0
Switch to the JSON output format for easier parsing 2021-05-03 17:44:56 +08:00
DessertArbiter
15edae6a92 updated deprecated calls, changed URI to https 2020-05-27 20:28:06 -04:00
hartator
0a2ae60378 Bump Gem version 2017-10-26 20:09:23 -05:00
hartator
c360d4621f Merge branch 'master' of https://github.com/niklasjansson/wayback-machine-downloader into niklasjansson-master 2017-10-26 20:05:10 -05:00
hartator
aa5977c53a Bump Gem version 2017-10-26 19:44:00 -05:00
hartator
c06ab067aa Fix missing comma issue 2017-10-26 19:43:48 -05:00
hartator
a7c3d9b6c1 Merge branch 'pr/89'
# Conflicts:
#	lib/wayback_machine_downloader.rb
2017-10-26 19:39:36 -05:00
hartator
d80b51f502 Use more explicit variable name all_timestamps 2017-10-26 19:35:29 -05:00
hartator
3179faca2e Bump Gem version 2017-06-11 22:18:07 -05:00
hartator
123c2f3024 Bump Gem version 2017-06-11 22:06:57 -05:00
hartator
246441ff17 Replace exact match by exact url 2017-06-11 21:53:13 -05:00
hartator
4eca581257 Remove too verbose comment 2017-06-11 13:19:50 -05:00
hartator
28fd1e10a2 Fix length of arguments per line 2017-06-11 13:19:36 -05:00
hartator
62f424b6d1 Merge branch 'master' of https://github.com/p/wayback-machine-downloader into p-master 2017-06-11 12:46:00 -05:00
hartator
f465c83a80 Bump Gem version 2017-06-10 15:49:35 -05:00
Niklas Jansson
662ab9eeb7 fixed bad characters in directories for Windows 2017-06-05 12:32:37 +02:00
Oleg Pudeyev
4af80adca6 Fix exact match logic and add a test 2017-06-03 17:45:06 -04:00
Oleg Pudeyev
e73a88ab56 File.exists? causes a warning in Ruby 2.4.1, use exist? 2017-06-03 16:53:37 -04:00
Oleg Pudeyev
d926f965f9 Add exact_match option.
With this option set, Wayback Machine Downloader
will only look for snapshots matching the exact base_url
passed in rather than base_url and its children.

This is useful when trying to download a single file
rather than mirroring a site.
2017-03-15 17:58:05 -04:00
Oleg Pudeyev
65b1948517 Avoid interleaving status output with file listing.
Before:

[
Getting snapshot pages.. found 1 snaphots to consider.

{"file_url":"http://www.trackpedia.com:80/forums/archive/index.php/f-115.html","timestamp":20131221124252,"file_id":"forums/archive/index.php/f-115.html"},
]

After:

Getting snapshot pages.. found 1 snaphots to consider.

[
{"file_url":"http://www.trackpedia.com:80/forums/archive/index.php/f-115.html","timestamp":20131221124252,"file_id":"forums/archive/index.php/f-115.html"},
]
2017-03-15 17:19:34 -04:00
Oleg Pudeyev
6b8c1aa194 Remove list attribute from the downloader.
Whether to list or download is a program option external to the downloader
2017-03-15 17:12:43 -04:00
Oleg Pudeyev
ea73ed5ed6 Shorten some lines for readability 2017-03-15 17:10:16 -04:00
hartator
e95ade8079 Bump Gem version 2017-02-17 12:55:58 -06:00
hartator
4830913ed3 Add explicit variable current encoding 2017-02-17 12:54:12 -06:00
Ian Kirker
132e3fa5f8 Alters encoding of file_url to fix encoding incompatibilities 2017-02-15 11:01:26 +00:00
insaner
5bd9fbffdd Update wayback_machine_downloader.rb 2017-01-24 05:03:07 -05:00
hartator
734521803a Bump Gem version 2016-11-14 18:20:16 -06:00
hartator
7383bdfadc Bump Gem version 2016-10-17 12:42:05 -05:00
hartator
9fc60b5385 Require thread library explicitly for 1.9.x Ruby versions 2016-10-17 12:42:05 -05:00
hartator
c9be8ae945 Bump Gem version 2016-09-26 10:36:05 -07:00