Felipe
1f4202908f
Fixes for tidy_bytes
...
Admittedly not the cleanest way to do this, although it works for #25.
2025-07-31 12:58:22 -03:00
adampweb
801fb77f79
Perf: Refactored a huge function into smaller subprocesses
2025-07-29 21:12:20 +02:00
Felipe
bc868e6b39
Refactor tidy_bytes.rb
...
I'm not sure if we can easily determine the encoding behind each site (and I don't think Wayback Machine does that), *but* we can at least translate it and get it to download. This should be mostly useful for other, non-Western European languages. See #25
2025-07-29 10:10:56 -03:00
Felipe
2bf04aff48
Sanitize base_url and directory parameters
...
This might be the cause of #25, at least from what it appears.
2025-07-27 17:18:57 +00:00
adampweb
e4487baafc
Fix: Handle default case in tidy_bytes
2025-07-20 17:13:36 +02:00
Felipe
038785557d
Ability to recursively download across subdomains
...
This is quite experimental. Fixes #15, but still needs more testing.
2025-07-09 12:53:58 +00:00
Eli Dickinson
c3c5b8446a
don’t append /* when --exact-url
2025-06-15 13:26:11 -04:00
Eli Dickinson
1681a12579
workaround for API only showing html files for some domains
...
See https://github.com/StrawberryMaster/wayback-machine-downloader/issues/6
2025-05-30 12:50:48 -04:00
Felipe
febffe5de4
Added support for resuming incomplete downloads
2025-04-19 13:40:14 +00:00
Felipe
46450d7c20
Refactoring tidy_bytes, part 2
2025-02-09 16:47:29 +00:00
Felipe
019534794c
Taking care of empty responses
...
fixes "unexpected token at ''" appearing after fetching a list of snapshots
2025-02-09 16:24:02 +00:00
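A minimal sketch of the kind of guard this commit describes, assuming the snapshot list arrives as a JSON string. The method name `parse_snapshot_list` is hypothetical and is not taken from the repository; the idea is only that an empty or blank body is never handed to `JSON.parse`, which would otherwise raise a parser error.

```ruby
require "json"

# Hypothetical helper (not the project's actual code): returns an empty
# list instead of raising when the API answers with an empty body.
def parse_snapshot_list(body)
  # An empty string is not valid JSON, so bail out before parsing.
  return [] if body.nil? || body.strip.empty?

  JSON.parse(body)
rescue JSON::ParserError
  # Treat any other malformed payload as "no snapshots" in this sketch.
  []
end
```

Whether malformed (rather than merely empty) responses should also be swallowed is a design choice; a real client might prefer to log or retry in that case.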
Felipe
fdcb81f1a0
Refactoring
2024-12-31 16:50:50 +00:00
Felipe
9bbb67cd90
More testing
2024-12-31 00:11:58 +00:00
Felipe
466228fee4
Refactoring the archive API
2024-06-26 16:53:08 +00:00
hartator
30475c5c9e
Make URI#open cross Ruby versions compatible
2021-06-06 19:47:11 -05:00
Paul Wise
ea15965d6d
Fix typos
...
Suggested-by: codespell, spellintian
2021-05-03 20:20:09 +08:00
Paul Wise
cd29f79fd0
Switch to the JSON output format for easier parsing
2021-05-03 17:44:56 +08:00
Paul Wise
afab72c894
Construct the cdx API query using a URI object
...
This avoids problems related to URL encoding.
Obsoletes: https://github.com/hartator/wayback-machine-downloader/pull/116
2021-05-03 17:44:36 +08:00
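The approach named in this commit can be sketched as follows, under stated assumptions: the helper name `build_cdx_uri` and the parameter set are illustrative and not copied from the repository. The point is that `URI.encode_www_form` escapes each value, so characters such as `&`, `?` and spaces in the target URL cannot corrupt the query string.

```ruby
require "uri"

# Hypothetical helper (not the project's actual code): builds a CDX API
# request URI with every query parameter percent-encoded.
def build_cdx_uri(url, params = {})
  uri = URI("https://web.archive.org/cdx/search/cdx")
  # encode_www_form escapes keys and values, so the target URL can
  # safely contain "&", "?", "=" or spaces.
  uri.query = URI.encode_www_form({ url: url, output: "json" }.merge(params))
  uri
end
```

By contrast, interpolating the target URL into a string by hand would let an `&` inside it be read as the start of a new API parameter.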
DessertArbiter
15edae6a92
updated deprecated calls, changed URI to https
2020-05-27 20:28:06 -04:00
Oleg Pudeyev
aab9a49509
Get rid of assigned but unused variable warnings under ruby 2.4
2017-06-03 17:00:50 -04:00
Oleg Pudeyev
e6157c21b9
Parens are required before * when used for splatting.
...
https://stackoverflow.com/questions/41821628/ruby-how-can-i-kill-warning-interpreted-as-argument-prefix
2017-06-03 16:59:08 -04:00
Oleg Pudeyev
6779971dc9
Fix whitespace
2017-03-15 17:08:40 -04:00
hartator
8d5be7a89e
Fix compatibility with Ruby 1.9.x and proxies
2016-11-14 18:18:58 -06:00
Anton Eliasson
54bd5d3852
Support http(s)_proxy ENV variables
...
Closes issue #65
2016-10-31 17:49:30 +01:00
hartator
7eedc1a183
Get snapshot result page per page index
2016-09-24 10:04:57 -07:00
hartator
87eee70969
Show 404 archives when a resource had 200 response previously
2016-09-18 12:24:29 -05:00
hartator
21dd22f581
Disable gzip compression on API calls
2016-09-17 14:42:32 -05:00
hartator
2e7f8611ef
Load early net/http library for Ruby 1.9.x
2016-09-17 14:08:29 -05:00
hartator
95eaa91715
Refactor archive API calls to own module
2016-09-17 13:37:17 -05:00
hartator
205e0da48b
Add to_regex library to treat complex regex cases
2015-11-19 15:28:27 -06:00
hartator
4c712d4614
Move TidyBytes to fix executable issue #3
2015-08-19 12:02:08 -05:00