Felipe
1f4202908f
Fixes for tidy_bytes
...
Admittedly not the cleanest way to do this, although it works for #25.
2025-07-31 12:58:22 -03:00
Felipe
bed3f6101c
Added missing gemspec file
2025-07-31 12:57:03 -03:00
Felipe
754df6b8d6
Merge pull request #27 from adampweb/master
...
Refactored huge functions & cleanup
2025-07-29 18:09:51 -03:00
adampweb
801fb77f79
Perf: Refactored a huge function into smaller subprocesses
2025-07-29 21:12:20 +02:00
adampweb
e9849e6c9c
Cleanup: I removed the obsolete options.
...
The classic way provides more flexibility
2025-07-29 20:55:10 +02:00
Felipe
bc868e6b39
Refactor tidy_bytes.rb
...
I'm not sure if we can easily determine the encoding behind each site (and I don't think Wayback Machine does that), *but* we can at least translate it and get it to download. This should be mostly useful for other, non-Western European languages. See #25
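
A rough sketch of the "translate it and get it to download" idea, not the actual contents of tidy_bytes.rb. The helper name `to_valid_utf8` and the ISO-8859-1 fallback are assumptions made for illustration; the commit does not say which encodings are tried.

```ruby
# Hypothetical helper (not from the repository): returns a valid UTF-8
# copy of +text+. The fallback encoding is an assumption; a site in a
# non-Western European language would need a different one.
def to_valid_utf8(text, fallback: Encoding::ISO_8859_1)
  utf8 = text.dup.force_encoding(Encoding::UTF_8)
  return utf8 if utf8.valid_encoding?

  # The bytes are not valid UTF-8: reinterpret them in the fallback
  # encoding and transcode, replacing anything that cannot be mapped.
  text.dup
      .force_encoding(fallback)
      .encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
end
```

Because the true encoding is only guessed, the result is always downloadable and valid UTF-8, but it may not be the text the site originally intended.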
2025-07-29 10:10:56 -03:00
Felipe
2bf04aff48
Sanitize base_url and directory parameters
...
This might be the cause of #25, at least from what it appears.
2025-07-27 17:18:57 +00:00
Felipe
51becde916
Minor fix
2025-07-26 21:01:40 +00:00
Felipe
c30ee73977
Sanitize file_id
...
we were not consistently handling non-UTF-8 characters here, especially after commit e4487baafcab64d2b81a5fd7a6b572ac8fa772e2. This also fixes #25
2025-07-26 20:58:50 +00:00
Felipe
d3466b3387
Bumping version
...
normally I would've yanked the old gem, but that's not working here
v2.3.12
2025-07-22 12:41:26 +00:00
Felipe
0250579f0e
Added missing file
2025-07-22 12:38:12 +00:00
Felipe
0663c1c122
Merge pull request #23 from adampweb/master
...
Fixed base image vulnerability
2025-07-21 14:44:43 -03:00
adampweb
93115f70ec
Merge pull request #5 from adampweb/snyk-fix-88576ceadf7e0c41b63a2af504a3c8ae
...
[Snyk] Security upgrade ruby from 3.4.4-alpine to 3.4.5-alpine
2025-07-21 18:46:03 +02:00
snyk-bot
3d37ae10fd
fix: Dockerfile to reduce vulnerabilities
...
The following vulnerabilities are fixed with an upgrade:
- https://snyk.io/vuln/SNYK-ALPINE322-OPENSSL-10597997
2025-07-21 16:45:10 +00:00
Felipe
bff10e7260
Initial implementation of a composite snapshot
...
see issue #22. TBF
2025-07-21 15:30:49 +00:00
Felipe
3d181ce84c
Bumped version
v2.3.11
2025-07-21 13:48:34 +00:00
Alfonso Corrado
999aa211ae
fix match filters
2025-07-21 13:42:44 +00:00
adampweb
ffdce7e4ec
Exclude dev environment config
2025-07-20 17:14:09 +02:00
adampweb
e4487baafc
Fix: Handle default case in tidy_bytes
2025-07-20 17:13:36 +02:00
Felipe
82ff2de3dc
Added brief note for users with both WMD gems here
2025-07-14 08:12:38 -03:00
Felipe
fd329afdd2
Merge pull request #20 from underarchiver/rfc3968-url-validity-check
...
Prevent fetching of non-RFC 3986-compliant URLs
2025-07-11 10:55:12 -03:00
Felipe
038785557d
Ability to recursively download across subdomains
...
this is quite experimental. Fixes #15 but still needs more testing
2025-07-09 12:53:58 +00:00
Felipe
2eead8cc27
Bumping version
v2.3.10
2025-06-27 19:50:39 +00:00
cybercode3
7e5cdd54fb
Fix: path sanitizer and timestamp sorting errors
...
(I encountered these errors when running the script on Windows 11. Changing these two lines got the script to work for me.)
- Fixed a bug in Windows path sanitizer where String#gsub was incorrectly called with a Proc as the replacement. Replaced with block form to ensure proper character escaping for Windows-incompatible file path characters.
- Fixed an ArgumentError in file sorting when a file snapshot’s timestamp was nil. Updated sort logic to safely handle nil timestamps by converting them to strings or integers, preventing comparison errors between NilClass and String/Integer.
These changes prevent fatal runtime errors when downloading files with certain URLs or incomplete metadata, improving robustness for sites with inconsistent archive data.
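
A minimal sketch of the two fixes described above, not the actual patch. The forbidden-character set, the `%xx` escaping scheme, and the helper names `sanitize_windows_path` and `sort_by_timestamp` are all assumptions made for illustration.

```ruby
# Hypothetical set of characters that Windows rejects in file names.
WINDOWS_FORBIDDEN = /[:*?"<>|]/

# Block form of String#gsub: the block receives each match and returns
# its replacement. A Proc passed as the second positional argument is
# not accepted, because gsub expects a String or a Hash there.
def sanitize_windows_path(path)
  path.gsub(WINDOWS_FORBIDDEN) { |char| format('%%%02x', char.ord) }
end

# Nil-safe sort: every timestamp is coerced to a String, so nil is
# never compared directly with a String or an Integer.
def sort_by_timestamp(snapshots)
  snapshots.sort_by { |snapshot| snapshot[:timestamp].to_s }
end
```

With string coercion, entries whose timestamp is nil sort first, since nil becomes an empty string; whether that ordering is what the downloader wants is a design choice this sketch does not settle.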
2025-06-25 02:07:20 +00:00
Felipe
4160ff5e4a
Bumping version
v2.3.9
2025-06-18 18:05:31 +00:00
underarchiver
f03d92a3c4
Prevent fetching of non-RFC 3986-compliant URLs
2025-06-17 13:27:10 +02:00
Felipe
2490109cfe
Merge pull request #17 from elidickinson/fix-exact-url
...
don’t append /* when using --exact-url
2025-06-15 22:18:40 -03:00
Eli Dickinson
c3c5b8446a
don’t append /* when --exact-url
2025-06-15 13:26:11 -04:00
Felipe
18357a77ed
Correct file path and sanitization in Windows
...
Not only were we not normalizing the file directories, we were also aggressively sanitizing incorrect characters, leading to some funny stuff on Windows. Fixes #16
2025-06-15 13:48:11 +00:00
Felipe
3fdfd70fc1
Bump version
v2.3.8
2025-06-05 22:34:40 +00:00
Felipe
2bf74b4173
Merge pull request #14 from elidickinson/fix-bracket-urls
...
Fix bug with archive urls containing square brackets
2025-06-03 23:12:07 -03:00
Eli Dickinson
79cbb639e7
Fix bug with archive urls containing square brackets
2025-06-03 16:36:03 -04:00
Felipe
071d208b31
Merge pull request #13 from elidickinson/master
...
workaround for API only showing html files for some domains (fixes #6)
2025-05-30 14:34:32 -03:00
Eli Dickinson
1681a12579
workaround for API only showing html files for some domains
...
See https://github.com/StrawberryMaster/wayback-machine-downloader/issues/6
2025-05-30 12:50:48 -04:00
Felipe
f38756dd76
Correction for downloaded data folder
...
If you downloaded content from example.org/*, it would be saved in a folder titled * instead of one named after the site. See #6 (and thanks to elidickinson for pointing it out!)
2025-05-30 14:00:32 +00:00
Felipe
9452411e32
Added nil checks
2025-05-30 13:52:25 +00:00
Felipe
61e22cfe25
Bump versions
v2.3.7
2025-05-27 18:10:09 +00:00
Felipe
183ed61104
Attempt at fixing --all
...
I honestly don't recall if this was implemented in the original code, and I'm guessing this worked at *some point* during this fork. It seems to work correctly now, however. See #6 and #11
2025-05-27 17:17:34 +00:00
Felipe
e6ecf32a43
Dockerfile test 2
...
I really should not be using deprecated parameters.
2025-05-21 21:34:36 -03:00
Felipe
375c6314ad
Dockerfile test
...
...again
2025-05-21 21:26:37 -03:00
Felipe
6e2739f5a8
Testing
2025-05-18 18:00:10 +00:00
Felipe
caba6a665f
Rough attempt to make this more efficient
2025-05-18 17:52:28 +00:00
Felipe
ab4324c0eb
Bumping to 2.3.6
v2.3.6
2025-05-18 16:49:44 +00:00
Felipe
e28d7d578b
Experimental ability to rewrite URLs to local browsing
2025-05-18 16:48:50 +00:00
Felipe
a7a25574cf
Merge pull request #10 from adampweb/master
...
Using ghcr.io for pulling Docker image
2025-05-15 08:50:33 -03:00
Felipe
23cc3d69b1
Merge pull request #9 from adampweb/feature/increase-performance
...
Increase performance of Bundler processes
2025-05-15 08:50:04 -03:00
adampweb
01fa1f8c9f
Merge pull request #2 from vitaly-zdanevich/patch-1
...
README.md: add docker example without cloning the repo
2025-05-14 21:19:11 +02:00
adampweb
d2f98d9428
Merge remote-tracking branch 'upstream/master' into feature/increase-performance
2025-05-14 15:41:07 +02:00
adampweb
c7a5381eaf
Using nproc in Bundler processes
2025-05-14 15:03:22 +02:00
Felipe
9709834e20
Merge pull request #8 from adampweb/master
...
Fix: delete empty files, Compose command fixes
2025-05-12 10:36:10 -03:00