Fix: path sanitizer and timestamp sorting errors

Fix: path sanitizer and timestamp sorting errors

( I encountered these errors issues with the script using Windows 11. Changing these two lines got the script to work for me. )

- Fixed a bug in Windows path sanitizer where String#gsub was incorrectly called with a Proc as the replacement. Replaced with block form to ensure proper character escaping for Windows-incompatible file path characters.
- Fixed an ArgumentError in file sorting when a file snapshot’s timestamp was nil. Updated sort logic to safely handle nil timestamps by converting them to strings or integers, preventing comparison errors between NilClass and String/Integer.

These changes prevent fatal runtime errors when downloading files with certain URLs or incomplete metadata, improving robustness for sites with inconsistent archive data.
This commit is contained in:
cybercode3 2025-06-22 19:45:05 -05:00 committed by GitHub
parent 4160ff5e4a
commit 7e5cdd54fb

View File

@ -384,7 +384,7 @@ class WaybackMachineDownloader
end
else
file_list_curated = get_file_list_curated
file_list_curated = file_list_curated.sort_by { |k,v| v[:timestamp] }.reverse
file_list_curated = file_list_curated.sort_by { |_,v| v[:timestamp].to_s }.reverse
file_list_curated.map do |file_remote_info|
file_remote_info[1][:file_id] = file_remote_info[0]
file_remote_info[1]
@ -649,7 +649,7 @@ class WaybackMachineDownloader
# for Windows, we need to sanitize path components to avoid invalid characters
# this prevents issues with file names that contain characters not allowed in
# Windows file systems. See # https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file#naming-conventions
element.gsub(/[:\*?"<>\|\&\=\/\\]/, ->(match) { '%' + match.ord.to_s(16).upcase })
element.gsub(/[:\*?"<>\|\&\=\/\\]/) { |match| '%' + match.ord.to_s(16).upcase }
else
element
end