41 Commits

Author | SHA1 | Message | Date
Yuvi9587 | 0316813792 | Delete dist directory | 2025-05-26 13:55:54 +05:30
Yuvi9587 | d201a5396c | Delete build/Kemono Downloader directory | 2025-05-26 13:55:25 +05:30
Yuvi9587 | 86f9396b6c | Commit | 2025-05-26 13:52:34 +05:30
Yuvi9587 | 0fb4bb3cb0 | Commit | 2025-05-26 13:52:07 +05:30
Yuvi9587 | 1528d7ce25 | Update Read.png | 2025-05-26 09:54:26 +05:30
Yuvi9587 | 4e7eeb7989 | Commit | 2025-05-26 09:52:06 +05:30
Yuvi9587 | 7f2976a4f4 | Commit | 2025-05-26 09:48:00 +05:30
Yuvi9587 | 8928cb92da | readme.md | 2025-05-26 01:39:39 +05:30
Yuvi9587 | a181b76124 | Update main.py | 2025-05-25 17:18:11 +05:30
Yuvi9587 | 8f085a8f63 | Commit | 2025-05-25 21:52:04 +05:30
Yuvi9587 | 93a997351b | Update readme.md | 2025-05-25 21:22:47 +05:30
Yuvi9587 | b3af6c1c15 | Commit | 2025-05-25 21:21:00 +05:30
Yuvi9587 | 4a65263f7d | Commit | 2025-05-25 19:49:17 +05:30
Yuvi9587 | 1091b5b9b4 | Commit | 2025-05-25 19:48:08 +05:30
Yuvi9587 | f6b3ff2f5c | Update main.py | 2025-05-25 11:36:35 +05:30
Yuvi9587 | b399bdf5cf | readme.md | 2025-05-25 16:54:35 +05:30
Yuvi9587 | 9ace161bc8 | Update downloader_utils.py | 2025-05-25 11:22:04 +05:30
Yuvi9587 | 66e52cfd78 | Commit | 2025-05-25 12:27:15 +05:30
Yuvi9587 | e665fd3cde | Commit | 2025-05-25 11:38:38 +05:30
Yuvi9587 | fc94f4c691 | Commit | 2025-05-24 22:55:23 +05:30
Yuvi9587 | 78e2012f04 | Commit | 2025-05-24 13:30:06 +05:30
Yuvi9587 | 3fe9dbacc6 | Commit | 2025-05-24 13:15:08 +05:30
Yuvi9587 | 004dea06e0 | Commit | 2025-05-24 16:22:47 +05:30
Yuvi9587 | 8994a69c34 | Add files via upload | 2025-05-24 10:36:15 +05:30
Yuvi9587 | f4a692673e | main.py | 2025-05-24 10:35:46 +05:30
Yuvi9587 | 4cb5f14ef6 | Delete Known.txt | 2025-05-23 21:01:05 +05:30
Yuvi9587 | a596c4f350 | Update main.py | 2025-05-23 20:59:35 +05:30
Yuvi9587 | e091c60d29 | Commit | 2025-05-23 20:23:36 +05:30
Yuvi9587 | d2ea026a41 | Commit | 2025-05-23 19:11:52 +05:30
Yuvi9587 | bb3d5c20f5 | Commit | 2025-05-23 18:24:42 +05:30
Yuvi9587 | a13eae8f16 | Commit | 2025-05-23 18:19:30 +05:30
Yuvi9587 | 7e5dc71720 | Commit | 2025-05-23 18:06:47 +05:30
Yuvi9587 | d7960bbb85 | Commit | 2025-05-23 17:22:54 +05:30
Yuvi9587 | c4d5ba3040 | Commit | 2025-05-22 07:40:10 +05:30
Yuvi9587 | fd84de7bce | Commit | 2025-05-22 07:03:05 +05:30
Yuvi9587 | a6383b20a4 | Commit | 2025-05-21 17:20:16 +05:30
Yuvi9587 | 651f9d9f8d | Update main.py | 2025-05-18 16:17:40 +05:30
Yuvi9587 | decef6730f | Commit | 2025-05-18 16:12:19 +05:30
Yuvi9587 | 32a12e8a09 | Commit | 2025-05-17 11:41:43 +05:30
Yuvi9587 | 62007d2d45 | Update readme.md | 2025-05-16 16:08:48 +05:30
Yuvi9587 | f1e592cf99 | Update readme.md | 2025-05-16 12:50:32 +05:30
10 changed files with 3338 additions and 1029 deletions

BIN  Kemono.png  Normal file  Binary file not shown.  Size: 12 KiB
BIN  Read.png  Normal file  Binary file not shown.  Size: 162 KiB
BIN  assets/discord.png  Normal file  Binary file not shown.  Size: 17 KiB
BIN  assets/github.png  Normal file  Binary file not shown.  Size: 13 KiB
BIN  assets/instagram.png  Normal file  Binary file not shown.  Size: 59 KiB

File diff suppressed because it is too large

2787
main.py

File diff suppressed because it is too large

@@ -13,61 +13,64 @@ DOWNLOAD_CHUNK_SIZE_ITER = 1024 * 256 # 256KB for iter_content within a chunk d
 def _download_individual_chunk(chunk_url, temp_file_path, start_byte, end_byte, headers,
-                               part_num, total_parts, progress_data, cancellation_event, skip_event, logger,
-                               signals=None, api_original_filename=None): # Added signals and api_original_filename
+                               part_num, total_parts, progress_data, cancellation_event, skip_event, pause_event, global_emit_time_ref, cookies_for_chunk, # Added cookies_for_chunk
+                               logger_func, emitter=None, api_original_filename=None): # Renamed logger, signals to emitter
     """Downloads a single chunk of a file and writes it to the temp file."""
     if cancellation_event and cancellation_event.is_set():
-        logger(f" [Chunk {part_num + 1}/{total_parts}] Download cancelled before start.")
+        logger_func(f" [Chunk {part_num + 1}/{total_parts}] Download cancelled before start.")
         return 0, False # bytes_downloaded, success
     if skip_event and skip_event.is_set():
-        logger(f" [Chunk {part_num + 1}/{total_parts}] Skip event triggered before start.")
+        logger_func(f" [Chunk {part_num + 1}/{total_parts}] Skip event triggered before start.")
         return 0, False
+    if pause_event and pause_event.is_set():
+        logger_func(f" [Chunk {part_num + 1}/{total_parts}] Download paused before start...")
+        while pause_event.is_set():
+            if cancellation_event and cancellation_event.is_set():
+                logger_func(f" [Chunk {part_num + 1}/{total_parts}] Download cancelled while paused.")
+                return 0, False
+            time.sleep(0.2) # Shorter sleep for responsive resume
+        logger_func(f" [Chunk {part_num + 1}/{total_parts}] Download resumed.")
     chunk_headers = headers.copy()
-    # end_byte can be -1 for 0-byte files, meaning download from start_byte to end of file (which is start_byte itself)
     if end_byte != -1 : # For 0-byte files, end_byte might be -1, Range header should not be set or be 0-0
         chunk_headers['Range'] = f"bytes={start_byte}-{end_byte}"
     elif start_byte == 0 and end_byte == -1: # Specifically for 0-byte files
-        # Some servers might not like Range: bytes=0--1.
-        # For a 0-byte file, we might not even need a range header, or Range: bytes=0-0
-        # Let's try without for 0-byte, or rely on server to handle 0-0 if Content-Length was 0.
-        # If Content-Length was 0, the main function might handle it directly.
-        # This chunking logic is primarily for files > 0 bytes.
-        # For now, if end_byte is -1, it implies a 0-byte file, so we expect 0 bytes.
         pass
     bytes_this_chunk = 0
-    last_progress_emit_time_for_chunk = time.time()
     last_speed_calc_time = time.time()
     bytes_at_last_speed_calc = 0
     for attempt in range(MAX_CHUNK_DOWNLOAD_RETRIES + 1):
         if cancellation_event and cancellation_event.is_set():
-            logger(f" [Chunk {part_num + 1}/{total_parts}] Cancelled during retry loop.")
+            logger_func(f" [Chunk {part_num + 1}/{total_parts}] Cancelled during retry loop.")
             return bytes_this_chunk, False
         if skip_event and skip_event.is_set():
-            logger(f" [Chunk {part_num + 1}/{total_parts}] Skip event during retry loop.")
+            logger_func(f" [Chunk {part_num + 1}/{total_parts}] Skip event during retry loop.")
             return bytes_this_chunk, False
+        if pause_event and pause_event.is_set():
+            logger_func(f" [Chunk {part_num + 1}/{total_parts}] Paused during retry loop...")
+            while pause_event.is_set():
+                if cancellation_event and cancellation_event.is_set():
+                    logger_func(f" [Chunk {part_num + 1}/{total_parts}] Cancelled while paused in retry loop.")
+                    return bytes_this_chunk, False
+                time.sleep(0.2)
+            logger_func(f" [Chunk {part_num + 1}/{total_parts}] Resumed from retry loop pause.")
         try:
             if attempt > 0:
-                logger(f" [Chunk {part_num + 1}/{total_parts}] Retrying download (Attempt {attempt}/{MAX_CHUNK_DOWNLOAD_RETRIES})...")
+                logger_func(f" [Chunk {part_num + 1}/{total_parts}] Retrying download (Attempt {attempt}/{MAX_CHUNK_DOWNLOAD_RETRIES})...")
                 time.sleep(CHUNK_DOWNLOAD_RETRY_DELAY * (2 ** (attempt - 1)))
-                # Reset speed calculation on retry
                 last_speed_calc_time = time.time()
                 bytes_at_last_speed_calc = bytes_this_chunk # Current progress of this chunk
-            # Enhanced log message for chunk start
             log_msg = f" 🚀 [Chunk {part_num + 1}/{total_parts}] Starting download: bytes {start_byte}-{end_byte if end_byte != -1 else 'EOF'}"
-            logger(log_msg)
-            print(f"DEBUG_MULTIPART: {log_msg}") # Direct console print for debugging
-            response = requests.get(chunk_url, headers=chunk_headers, timeout=(10, 120), stream=True)
+            logger_func(log_msg)
+            response = requests.get(chunk_url, headers=chunk_headers, timeout=(10, 120), stream=True, cookies=cookies_for_chunk)
             response.raise_for_status()
-            # For 0-byte files, if end_byte was -1, we expect 0 content.
             if start_byte == 0 and end_byte == -1 and int(response.headers.get('Content-Length', 0)) == 0:
-                logger(f" [Chunk {part_num + 1}/{total_parts}] Confirmed 0-byte file.")
+                logger_func(f" [Chunk {part_num + 1}/{total_parts}] Confirmed 0-byte file.")
                 with progress_data['lock']:
                     progress_data['chunks_status'][part_num]['active'] = False
                     progress_data['chunks_status'][part_num]['speed_bps'] = 0
@@ -77,17 +80,24 @@ def _download_individual_chunk(chunk_url, temp_file_path, start_byte, end_byte,
                 f.seek(start_byte)
                 for data_segment in response.iter_content(chunk_size=DOWNLOAD_CHUNK_SIZE_ITER):
                     if cancellation_event and cancellation_event.is_set():
-                        logger(f" [Chunk {part_num + 1}/{total_parts}] Cancelled during data iteration.")
+                        logger_func(f" [Chunk {part_num + 1}/{total_parts}] Cancelled during data iteration.")
                         return bytes_this_chunk, False
                     if skip_event and skip_event.is_set():
-                        logger(f" [Chunk {part_num + 1}/{total_parts}] Skip event during data iteration.")
+                        logger_func(f" [Chunk {part_num + 1}/{total_parts}] Skip event during data iteration.")
                         return bytes_this_chunk, False
+                    if pause_event and pause_event.is_set():
+                        logger_func(f" [Chunk {part_num + 1}/{total_parts}] Paused during data iteration...")
+                        while pause_event.is_set():
+                            if cancellation_event and cancellation_event.is_set():
+                                logger_func(f" [Chunk {part_num + 1}/{total_parts}] Cancelled while paused in data iteration.")
+                                return bytes_this_chunk, False
+                            time.sleep(0.2)
+                        logger_func(f" [Chunk {part_num + 1}/{total_parts}] Resumed from data iteration pause.")
                     if data_segment:
                         f.write(data_segment)
                         bytes_this_chunk += len(data_segment)
                         with progress_data['lock']:
-                            # Increment both the chunk's downloaded and the overall downloaded
                             progress_data['total_downloaded_so_far'] += len(data_segment)
                             progress_data['chunks_status'][part_num]['downloaded'] = bytes_this_chunk
                             progress_data['chunks_status'][part_num]['active'] = True
@@ -100,45 +110,42 @@ def _download_individual_chunk(chunk_url, temp_file_path, start_byte, end_byte,
                                 progress_data['chunks_status'][part_num]['speed_bps'] = current_speed_bps
                                 last_speed_calc_time = current_time
                                 bytes_at_last_speed_calc = bytes_this_chunk
-                            # Emit progress more frequently from within the chunk download
-                            if current_time - last_progress_emit_time_for_chunk > 0.1: # Emit up to 10 times/sec per chunk
-                                if signals and hasattr(signals, 'file_progress_signal'):
-                                    # Ensure we read the latest total downloaded from progress_data
-                                    # Send a copy of the chunks_status list
-                                    status_list_copy = [dict(s) for s in progress_data['chunks_status']] # Make a deep enough copy
-                                    signals.file_progress_signal.emit(api_original_filename, status_list_copy)
-                                last_progress_emit_time_for_chunk = current_time
+                            if emitter and (current_time - global_emit_time_ref[0] > 0.25): # Max ~4Hz for the whole file
+                                global_emit_time_ref[0] = current_time # Update shared last emit time
+                                status_list_copy = [dict(s) for s in progress_data['chunks_status']] # Make a deep enough copy
+                                if isinstance(emitter, queue.Queue):
+                                    emitter.put({'type': 'file_progress', 'payload': (api_original_filename, status_list_copy)})
+                                elif hasattr(emitter, 'file_progress_signal'): # PostProcessorSignals-like
+                                    emitter.file_progress_signal.emit(api_original_filename, status_list_copy)
             return bytes_this_chunk, True
         except (requests.exceptions.ConnectionError, requests.exceptions.Timeout, http.client.IncompleteRead) as e:
-            logger(f" ❌ [Chunk {part_num + 1}/{total_parts}] Retryable error: {e}")
+            logger_func(f" ❌ [Chunk {part_num + 1}/{total_parts}] Retryable error: {e}")
             if attempt == MAX_CHUNK_DOWNLOAD_RETRIES:
-                logger(f" ❌ [Chunk {part_num + 1}/{total_parts}] Failed after {MAX_CHUNK_DOWNLOAD_RETRIES} retries.")
+                logger_func(f" ❌ [Chunk {part_num + 1}/{total_parts}] Failed after {MAX_CHUNK_DOWNLOAD_RETRIES} retries.")
                 return bytes_this_chunk, False
         except requests.exceptions.RequestException as e: # Includes 4xx/5xx errors after raise_for_status
-            logger(f" ❌ [Chunk {part_num + 1}/{total_parts}] Non-retryable error: {e}")
+            logger_func(f" ❌ [Chunk {part_num + 1}/{total_parts}] Non-retryable error: {e}")
             return bytes_this_chunk, False
         except Exception as e:
-            logger(f" ❌ [Chunk {part_num + 1}/{total_parts}] Unexpected error: {e}\n{traceback.format_exc(limit=1)}")
+            logger_func(f" ❌ [Chunk {part_num + 1}/{total_parts}] Unexpected error: {e}\n{traceback.format_exc(limit=1)}")
             return bytes_this_chunk, False
-    # Ensure final status is marked as inactive if loop finishes due to retries
     with progress_data['lock']:
         progress_data['chunks_status'][part_num]['active'] = False
         progress_data['chunks_status'][part_num]['speed_bps'] = 0
     return bytes_this_chunk, False # Should be unreachable
-def download_file_in_parts(file_url, save_path, total_size, num_parts, headers,
-                           api_original_filename, signals, cancellation_event, skip_event, logger):
+def download_file_in_parts(file_url, save_path, total_size, num_parts, headers, api_original_filename,
+                           emitter_for_multipart, cookies_for_chunk_session, # Added cookies_for_chunk_session
+                           cancellation_event, skip_event, logger_func, pause_event):
     """
     Downloads a file in multiple parts concurrently.
     Returns: (download_successful_flag, downloaded_bytes, calculated_file_hash, temp_file_handle_or_None)
     The temp_file_handle will be an open read-binary file handle to the .part file if successful, otherwise None.
     It is the responsibility of the caller to close this handle and rename/delete the .part file.
     """
-    logger(f"⬇️ Initializing Multi-part Download ({num_parts} parts) for: '{api_original_filename}' (Size: {total_size / (1024*1024):.2f} MB)")
+    logger_func(f"⬇️ Initializing Multi-part Download ({num_parts} parts) for: '{api_original_filename}' (Size: {total_size / (1024*1024):.2f} MB)")
     temp_file_path = save_path + ".part"
     try:
@@ -146,7 +153,7 @@ def download_file_in_parts(file_url, save_path, total_size, num_parts, headers,
             if total_size > 0:
                 f_temp.truncate(total_size) # Pre-allocate space
     except IOError as e:
-        logger(f" ❌ Error creating/truncating temp file '{temp_file_path}': {e}")
+        logger_func(f" ❌ Error creating/truncating temp file '{temp_file_path}': {e}")
         return False, 0, None, None
     chunk_size_calc = total_size // num_parts
@@ -167,7 +174,7 @@ def download_file_in_parts(file_url, save_path, total_size, num_parts, headers,
             chunk_actual_sizes.append(end - start + 1)
     if not chunks_ranges and total_size > 0:
-        logger(f" ⚠️ No valid chunk ranges for multipart download of '{api_original_filename}'. Aborting multipart.")
+        logger_func(f" ⚠️ No valid chunk ranges for multipart download of '{api_original_filename}'. Aborting multipart.")
         if os.path.exists(temp_file_path): os.remove(temp_file_path)
         return False, 0, None, None
@@ -178,7 +185,8 @@ def download_file_in_parts(file_url, save_path, total_size, num_parts, headers,
             {'id': i, 'downloaded': 0, 'total': chunk_actual_sizes[i] if i < len(chunk_actual_sizes) else 0, 'active': False, 'speed_bps': 0.0}
             for i in range(num_parts)
         ],
-        'lock': threading.Lock()
+        'lock': threading.Lock(),
+        'last_global_emit_time': [time.time()] # Shared mutable for global throttling timestamp
     }
     chunk_futures = []
@@ -191,8 +199,9 @@ def download_file_in_parts(file_url, save_path, total_size, num_parts, headers,
             chunk_futures.append(chunk_pool.submit(
                 _download_individual_chunk, chunk_url=file_url, temp_file_path=temp_file_path,
                 start_byte=start, end_byte=end, headers=headers, part_num=i, total_parts=num_parts,
-                progress_data=progress_data, cancellation_event=cancellation_event, skip_event=skip_event, logger=logger,
-                signals=signals, api_original_filename=api_original_filename # Pass them here
+                progress_data=progress_data, cancellation_event=cancellation_event, skip_event=skip_event, global_emit_time_ref=progress_data['last_global_emit_time'],
+                pause_event=pause_event, cookies_for_chunk=cookies_for_chunk_session, logger_func=logger_func, emitter=emitter_for_multipart,
+                api_original_filename=api_original_filename
             ))
         for future in as_completed(chunk_futures):
@@ -201,32 +210,29 @@ def download_file_in_parts(file_url, save_path, total_size, num_parts, headers,
             total_bytes_from_chunks += bytes_downloaded_this_chunk
             if not success_this_chunk:
                 all_chunks_successful = False
-            # Progress is emitted from within _download_individual_chunk
     if cancellation_event and cancellation_event.is_set():
-        logger(f" Multi-part download for '{api_original_filename}' cancelled by main event.")
+        logger_func(f" Multi-part download for '{api_original_filename}' cancelled by main event.")
         all_chunks_successful = False
-    # Ensure a final progress update is sent with all chunks marked inactive (unless still active due to error)
-    if signals and hasattr(signals, 'file_progress_signal'):
+    if emitter_for_multipart:
         with progress_data['lock']:
-            # Ensure all chunks are marked inactive for the final signal if download didn't fully succeed or was cancelled
             status_list_copy = [dict(s) for s in progress_data['chunks_status']]
-            signals.file_progress_signal.emit(api_original_filename, status_list_copy)
+            if isinstance(emitter_for_multipart, queue.Queue):
+                emitter_for_multipart.put({'type': 'file_progress', 'payload': (api_original_filename, status_list_copy)})
+            elif hasattr(emitter_for_multipart, 'file_progress_signal'): # PostProcessorSignals-like
+                emitter_for_multipart.file_progress_signal.emit(api_original_filename, status_list_copy)
     if all_chunks_successful and (total_bytes_from_chunks == total_size or total_size == 0):
-        logger(f" ✅ Multi-part download successful for '{api_original_filename}'. Total bytes: {total_bytes_from_chunks}")
+        logger_func(f" ✅ Multi-part download successful for '{api_original_filename}'. Total bytes: {total_bytes_from_chunks}")
         md5_hasher = hashlib.md5()
         with open(temp_file_path, 'rb') as f_hash:
             for buf in iter(lambda: f_hash.read(4096*10), b''): # Read in larger buffers for hashing
                 md5_hasher.update(buf)
         calculated_hash = md5_hasher.hexdigest()
-        # Return an open file handle for the caller to manage (e.g., for compression)
-        # The caller is responsible for closing this handle and renaming/deleting the .part file.
         return True, total_bytes_from_chunks, calculated_hash, open(temp_file_path, 'rb')
     else:
-        logger(f" ❌ Multi-part download failed for '{api_original_filename}'. Success: {all_chunks_successful}, Bytes: {total_bytes_from_chunks}/{total_size}. Cleaning up.")
+        logger_func(f" ❌ Multi-part download failed for '{api_original_filename}'. Success: {all_chunks_successful}, Bytes: {total_bytes_from_chunks}/{total_size}. Cleaning up.")
         if os.path.exists(temp_file_path):
             try: os.remove(temp_file_path)
-            except OSError as e: logger(f" Failed to remove temp part file '{temp_file_path}': {e}")
+            except OSError as e: logger_func(f" Failed to remove temp part file '{temp_file_path}': {e}")
         return False, total_bytes_from_chunks, None, None
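
Review note: the pause handling added in this diff repeats the same wait loop in three places (before the chunk starts, inside the retry loop, and during data iteration). The following is a minimal sketch of one way that loop could be factored out. The helper name `wait_while_paused` and its `poll_interval` parameter are hypothetical and do not appear in the diff; only the 0.2 s sleep and the pause/cancel event semantics are taken from the code above.

```python
import threading
import time


def wait_while_paused(pause_event, cancellation_event, poll_interval=0.2):
    """Block while pause_event is set.

    Returns True when the caller may continue and False when the download
    was cancelled during the pause. Hypothetical helper, not part of the diff.
    """
    while pause_event is not None and pause_event.is_set():
        if cancellation_event is not None and cancellation_event.is_set():
            return False
        time.sleep(poll_interval)
    return True


# Usage: another thread clears the pause shortly after it was set.
pause = threading.Event()
cancel = threading.Event()
pause.set()
threading.Timer(0.3, pause.clear).start()
resumed = wait_while_paused(pause, cancel, poll_interval=0.05)
```

Each call site would then reduce to a single check such as `if not wait_while_paused(pause_event, cancellation_event): return bytes_this_chunk, False`, with the log messages kept at the call site if they need to differ.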

396
readme.md

@@ -1,204 +1,368 @@
-# Kemono Downloader v3.2.0
-A feature-rich GUI application built with PyQt5 to download content from **Kemono.su** or **Coomer.party**.
-Offers robust filtering, smart organization, manga-specific handling, and performance tuning.
-This version introduces:
-- Multi-part downloads
-- Character filtering by comments
-- Filename word removal
-- Various UI/workflow enhancements
+<h1 align="center">Kemono Downloader v4.1.1</h1>
+<div align="center">
+<img src="https://github.com/Yuvi9587/Kemono-Downloader/blob/main/Read.png" alt="Kemono Downloader"/>
+</div>
 ---
-## 🚀 What's New in v3.2.0
-### 🔹 Character Filter by Post Comments (Beta)
-- New "Comments" scope for the 'Filter by Character(s)' feature.
-**How it works:**
-1. Checks if any **filenames** match your character filter. If yes → downloads the post (skips comment check).
-2. If no filename matches → scans the **post's comments**. If matched → downloads the post.
-- Prioritizes filename-matched character name for folder naming, otherwise uses comment match.
-- Cycle through filter scopes with the `Filter: [Scope]` button next to the character input.
+A powerful, feature-rich GUI application for downloading content from **[Kemono.su](https://kemono.su)** and **[Coomer.party](https://coomer.party)**.
+Built with **PyQt5**, this tool is ideal for users who want deep filtering, customizable folder structures, efficient downloads, and intelligent automation — all within a modern, user-friendly graphical interface.
 ---
-### ✂️ Remove Specific Words from Filenames
-- Input field: `"✂️ Remove Words from name"`
-- Enter comma-separated words (e.g., `patreon, kemono, [HD], _final`)
-- These are removed from filenames (case-insensitive) to improve organization.
+## What's New in v4.1.1?
+Version 4.1.1 introduces a smarter way to capture images that might be embedded directly within post descriptions, enhancing content discovery.
+### "Scan Content for Images" Feature
+- **Enhanced Image Discovery:** A new checkbox, "**Scan Content for Images**," has been added to the UI (grouped with "Download Thumbnails Only" and "Compress Large Images").
+- **How it Works:**
+- When enabled, the downloader scans the HTML content of posts (e.g., the description area).
+- It looks for images embedded via HTML `<img>` tags or as direct absolute URL links (e.g., `https://.../image.png`).
+- It intelligently resolves relative image paths found in `<img>` tags (like `/data/image.jpg`) into full, downloadable URLs.
+- This is particularly useful for capturing images that are part of the post's narrative but not formally listed in the API's file or attachment sections.
+- **Default State:** This option is **unchecked by default**.
+- **Interaction with "Download Thumbnails Only":**
+- If you check "Download Thumbnails Only":
+- The "Scan Content for Images" checkbox will **automatically become checked and disabled** (locked).
+- In this combined mode, the downloader will **only download images found by the content scan**. API-listed thumbnails will be ignored, prioritizing images from the post's body.
+- If you uncheck "Download Thumbnails Only":
+- The "Scan Content for Images" checkbox will become **enabled again and revert to being unchecked**. You can then manually enable it if you wish to scan content without being in thumbnail-only mode.
+This feature ensures a more comprehensive download experience, especially for posts where images are integrated directly into the text.
 ---
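
Review note: the content scan described in the v4.1.1 section (collect `<img>` sources, pick up direct absolute image links, resolve relative paths) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code in `main.py`, whose diff is suppressed: the function name `find_content_images`, the `IMAGE_EXTENSIONS` list, and the base URL used in the usage lines are all hypothetical.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed extension list; the readme does not say which types are accepted.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".webp")


class _ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_content_images(html, base_url):
    """Return absolute image URLs found in a post's HTML content."""
    collector = _ImgSrcCollector()
    collector.feed(html)
    # Relative paths such as /data/image.jpg are resolved against base_url.
    found = [urljoin(base_url, src) for src in collector.sources]
    # Direct absolute links that end in a known image extension.
    for link in re.findall(r"https?://[^\s\"'<>]+", html):
        if link.lower().endswith(IMAGE_EXTENSIONS) and link not in found:
            found.append(link)
    return found


sample = '<p>Intro <img src="/data/image.jpg"> and https://example.com/a/image.png</p>'
urls = find_content_images(sample, "https://kemono.su/")
```

An `<img>` tag whose `src` is already absolute is matched by both passes, so the second pass skips links that are already in the result.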
-### 🧩 Multi-part Downloads for Large Files
-- Toggle multi-part downloads (OFF by default).
-- Improves speed on large files (e.g., >10MB videos, zips).
-- Falls back to single-stream on failure.
-- Toggle via `Multi-part: ON/OFF` in the log header.
+## Previous Update: What's New in v4.0.1?
+Version 4.0.1 focuses on enhancing access to content and providing even smarter organization:
+### Cookie Management
+- **Access Content:** Seamlessly download from Kemono/Coomer as if you were logged in by using your browser's cookies.
+- **Flexible Input:**
+- Directly paste your cookie string (e.g., `name1=value1; name2=value2`).
+- Browse and load cookies from a `cookies.txt` file (Netscape format).
+- Automatic fallback to a `cookies.txt` file in the application directory if "Use Cookie" is enabled and no other source is specified.
+- **Easy Activation:** A simple "Use Cookie" checkbox in the UI controls this feature.
+- *Important Note: Cookie settings (text, file path, and enabled state) are configured per session and are not saved when the application is closed. You will need to re-apply them on each launch if needed.*
 ---
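
Review note: the pasted cookie string format shown in the readme (`name1=value1; name2=value2`) maps naturally onto the dictionary that the multipart downloader passes as `cookies=cookies_for_chunk` to `requests.get`. A minimal sketch of that conversion follows; `parse_cookie_string` is a hypothetical name and the application's real parsing in `main.py` is not visible in this diff.

```python
def parse_cookie_string(cookie_string):
    """Turn 'name1=value1; name2=value2' into {'name1': 'value1', ...}.

    Pairs without an '=' or without a name are ignored.
    """
    cookies = {}
    for pair in cookie_string.split(";"):
        name, separator, value = pair.strip().partition("=")
        if separator and name:
            cookies[name] = value
    return cookies


cookies_for_chunk = parse_cookie_string("name1=value1; name2=value2")
```

For the Netscape-format `cookies.txt` case, the standard library's `http.cookiejar.MozillaCookieJar` is one option for loading the file.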
-### 🧠 UI and Workflow Enhancements
-- **Updated Welcome Tour**
-Shows on first launch, covers all new and core features.
-- **Smarter Cancel/Reset**
-Cancels active tasks and resets UI — but retains URL and Download Directory fields.
-- **Simplified Interface**
-- Removed "Skip Current File" and local API server for a cleaner experience.
+### Advanced `Known.txt` and Character Filtering
+The `Known.txt` system has been revamped for improved performance and stability. The previous method of handling known names could become resource-intensive with large lists, potentially leading to application slowdowns or crashes. This new, streamlined system offers more direct control and robust organization.
+The `Known.txt` file and the "Filter by Character(s)" input field work together to provide powerful and flexible content organization. The `Known.txt` file itself has a straightforward syntax, while the UI input allows for more complex session-specific grouping and alias definitions that can then be added to `Known.txt`.
+**1. `Known.txt` File Syntax (Located in App Directory):**
+`Known.txt` stores your persistent list of characters, series, or keywords for folder organization. Each line is an entry:
- **Simple Entries:**
- A line like `My Awesome Series` or `Nami`.
- **Behavior:** Content matching this term will be saved into a folder named "My Awesome Series" or "Nami" respectively (if "Separate Folders" is enabled).
**2. "Filter by Character(s)" UI Input Field:**
This field allows for dynamic filtering for the current download session and provides options for how new entries are added to `Known.txt`.
- **Standard Names:**
- Input: `Nami, Robin`
- Session Behavior: Filters for "Nami" OR "Robin". If "Separate Folders" is on, creates folders "Nami" and "Robin".
- `Known.txt` Addition: If "Nami" is new and selected for addition in the confirmation dialog, it's added as `Nami` on a new line in `Known.txt`.
- **Grouped Aliases for a Single Character (using `(...)~` syntax):**
- Input: `(Boa, Hancock)~`
- Meaning: "Boa" and "Hancock" are different names/aliases for the *same character*. The names are listed within parentheses separated by commas (e.g., `name1, alias1, alias2`), and the entire group is followed by a `~` symbol. This is useful when a creator uses different names for the same character.
- Session Behavior: Filters for "Boa" OR "Hancock". If "Separate Folders" is on, creates a single folder named "Boa Hancock".
- `Known.txt` Addition: If this group is new and selected for addition, it's added to `Known.txt` as a grouped alias entry, typically `(Boa Hancock)`. The first name in the `Known.txt` entry (e.g., "Boa Hancock") becomes the primary folder name.
- **Combined Folder for Distinct Characters (using `(...)` syntax):**
- Input: `(Vivi, Uta)`
- Meaning: "Vivi" and "Uta" are *distinct characters*, but for this download session, their content should be grouped into a single folder. The names are listed within parentheses separated by commas. This is useful for grouping art of less frequent characters without creating many small individual folders.
- Session Behavior: Filters for "Vivi" OR "Uta". If "Separate Folders" is on, creates a single folder named "Vivi Uta".
- `Known.txt` Addition: If this "combined group" is new and selected for addition, "Vivi" and "Uta" are added to `Known.txt` as *separate, individual simple entries* on new lines:
```
Vivi
Uta
```
The combined folder "Vivi Uta" is a session-only convenience; `Known.txt` stores them as distinct entities for future individual use.
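
The three input forms above (standard names, `(...)~` alias groups, and `(...)` combined groups) can be sketched as a small parser. This is an illustration only: the function name `parse_character_filter`, the `kind` labels, and the returned dictionary shape are assumptions, not the application's actual data model. The folder names follow the examples given in the text.

```python
import re

# One token is either a parenthesised group with an optional trailing "~",
# or a bare name that contains no commas or parentheses.
_TOKEN = re.compile(r"\(([^)]*)\)(~?)|([^,()]+)")


def parse_character_filter(text):
    """Split a 'Filter by Character(s)' input into entries.

    Each entry records its kind, the names to match (any of them), and the
    single folder name used when "Separate Folders" is on.
    """
    entries = []
    for group, tilde, bare in _TOKEN.findall(text):
        if group:
            names = [n.strip() for n in group.split(",") if n.strip()]
            kind = "alias_group" if tilde else "combined"
        else:
            names = [bare.strip()] if bare.strip() else []
            kind = "simple"
        if names:
            entries.append({"kind": kind, "names": names, "folder": " ".join(names)})
    return entries


parsed = parse_character_filter("Nami, (Boa, Hancock)~, (Vivi, Uta)")
```

Here `parsed` holds one simple entry with folder "Nami", one alias group with folder "Boa Hancock", and one combined group with folder "Vivi Uta".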
**3. Interaction with `Known.txt`:**
- **Adding New Names from Filters:** When you use the "Filter by Character(s)" input, if any names or groups are new (not already in `Known.txt`), a dialog will appear after you start the download. This dialog allows you to select which of these new names/groups should be added to `Known.txt`, formatted according to the rules described above.
- **Intelligent Fallback:** If "Separate Folders by Name/Title" is active, and content doesn't match the "Filter by Character(s)" UI input, the downloader consults your `Known.txt` file for folder naming.
- **Direct Management:** You can add simple entries directly to `Known.txt` using the list and "Add" button in the UI's `Known.txt` management section. For creating or modifying complex grouped alias entries directly in the file, or for bulk edits, click the "Open Known.txt" button. The application reloads `Known.txt` on startup or before a download process begins.
- **Using Known Names to Populate Filters (via "Add to Filter" Button):**
- Next to the "Add" button in the `Known.txt` management section, a "⤵️ Add to Filter" button provides a quick way to use your existing known names.
- Clicking this opens a popup window displaying all entries from your `Known.txt` file, each with a checkbox.
- The popup includes:
- A search bar to quickly filter the list of names.
- "Select All" and "Deselect All" buttons for convenience.
- After selecting the desired names, click "Add Selected".
- The chosen names will be inserted into the "Filter by Character(s)" input field.
- **Important Formatting:** If a selected entry from `Known.txt` is a group (e.g., originally `(Boa Hancock)` in `Known.txt`, which implies aliases "Boa" and "Hancock"), it will be added to the filter field as `(Boa, Hancock)~`. Simple names are added as-is.
---
## What's in v3.5.0? (Previous Update)
This version brought significant enhancements to manga/comic downloading, filtering capabilities, and user experience:
### Enhanced Manga/Comic Mode
- **Optional Filename Prefix:**
- When using the "Date Based" or "Original File Name" manga styles, an optional prefix can be specified in the UI.
- This prefix will be prepended to each filename generated by these styles.
- **Example (Date Based):** If prefix is `MySeries_`, files become `MySeries_001.jpg`, `MySeries_002.png`, etc.
- **Example (Original File Name):** If prefix is `Comic_Vol1_`, an original file `page_01.jpg` becomes `Comic_Vol1_page_01.jpg`.
- This input field appears automatically when either of these two manga naming styles is selected.
- **New "Date Based" Filename Style:**
- Perfect for truly sequential content! Files are named numerically (e.g., `001.jpg`, `002.jpg`, `003.ext`...) across an *entire creator's feed*, strictly following post publication order.
- **Smart Numbering:** Automatically resumes from the highest existing number found in the series folder (and subfolders, if "Subfolder per Post" is enabled).
- **Guaranteed Order:** Disables multi-threading for post processing to ensure sequential accuracy.
- Works alongside the existing "Post Title" and "Original File Name" styles.
- **New "Title+G.Num (Post Title + Global Numbering)" Filename Style:**
- Ideal for series where you want each file to be prefixed by its post title but still maintain a global sequential number across all posts from a single download session.
- **Naming Convention:** Files are named using the cleaned post title as a prefix, followed by an underscore and a globally incrementing number (e.g., `Post Title_001.ext`, `Post Title_002.ext`).
- **Example:**
- Post "Chapter 1: The Adventure Begins" (contains 2 files: `imageA.jpg`, `imageB.png`) -> `Chapter 1 The Adventure Begins_001.jpg`, `Chapter 1 The Adventure Begins_002.png`
- Next Post "Chapter 2: New Friends" (contains 1 file: `cover.jpg`) -> `Chapter 2 New Friends_003.jpg`
- **Sequential Integrity:** Multithreading for post processing is automatically disabled when this style is selected to ensure the global numbering is strictly sequential.
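The numbering rules for the "Date Based" and "Title+G.Num" styles can be illustrated with a short sketch. It is not taken from the application: `clean_title`, `next_number` and `title_gnum_names` are hypothetical helpers, and the set of characters stripped from titles is an assumption.

```python
import os
import re
from typing import Iterable, List, Tuple

# Assumption: characters that are commonly invalid in filenames.
_ILLEGAL = re.compile(r'[<>:"/\\|?*]')


def clean_title(title: str) -> str:
    """Drop characters that are invalid in filenames and collapse whitespace."""
    return " ".join(_ILLEGAL.sub("", title).split())


def next_number(existing_files: Iterable[str], prefix: str = "") -> int:
    """Return one more than the highest NNN found in names like '<prefix>NNN.ext'."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)\.[^.]+$")
    numbers = []
    for name in existing_files:
        match = pattern.match(name)
        if match:
            numbers.append(int(match.group(1)))
    return max(numbers, default=0) + 1


def title_gnum_names(posts: Iterable[Tuple[str, List[str]]], start: int = 1) -> List[str]:
    """Name files as '<clean title>_<global number><ext>', posts in publication order."""
    counter = start
    result = []
    for title, files in posts:
        stem = clean_title(title)
        for original in files:
            ext = os.path.splitext(original)[1]
            result.append("{}_{:03d}{}".format(stem, counter, ext))
            counter += 1
    return result


posts = [
    ("Chapter 1: The Adventure Begins", ["imageA.jpg", "imageB.png"]),
    ("Chapter 2: New Friends", ["cover.jpg"]),
]
print(title_gnum_names(posts))
print(next_number(["MySeries_001.jpg", "MySeries_002.png"], prefix="MySeries_"))
```

Because a single counter is shared across all posts, the posts have to be processed one after another, which is consistent with multithreading being disabled for these styles.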
---
### "Remove Words from Filename" Feature
- Specify comma-separated words or phrases (case-insensitive) that will be automatically removed from filenames.
- Example: `patreon, [HD], _final` transforms `AwesomeArt_patreon` and `Hinata_Hd` into `AwesomeArt.jpg` and `Hinata.jpg`.
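A minimal sketch of how the "Remove Words from Filename" option could work. This is an assumption about the behavior, not the application's code: `remove_words` is a hypothetical helper, and trimming leftover separators from the stem is a guess.

```python
import os
import re


def remove_words(filename: str, words_csv: str) -> str:
    """Remove each comma-separated word or phrase from the filename stem, ignoring case."""
    stem, ext = os.path.splitext(filename)
    words = [w.strip() for w in words_csv.split(",") if w.strip()]
    cleaned = stem
    # Longest first, so a longer phrase is removed before a shorter one it contains.
    for word in sorted(words, key=len, reverse=True):
        cleaned = re.sub(re.escape(word), "", cleaned, flags=re.IGNORECASE)
    # Assumption: tidy up separators left dangling after removal.
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" _-.")
    # Keep the original stem if every character would otherwise be removed.
    return (cleaned or stem) + ext


print(remove_words("AwesomeArt_patreon.jpg", "patreon, [HD], _final"))
print(remove_words("Hinata [hd]_final.png", "patreon, [HD], _final"))
```

`re.escape` matters here because entries such as `[HD]` contain regular-expression metacharacters.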
---
### New "Only Archives" File Filter Mode
- Exclusively downloads `.zip` and `.rar` files.
- Automatically disables conflicting options like "Skip .zip/.rar" and external link logging.
---
### Improved Character Filter Scope - "Comments (Beta)"
- **File-First Check:** Prioritizes matching filenames before checking post comments for character names.
- **Comment Fallback:** Only checks comments if no filename match is found, reducing unnecessary API calls.
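The file-first, comment-fallback order can be sketched as follows. The function name `find_character` and the `fetch_comments` callback are hypothetical, and simple case-insensitive substring matching is an assumption. The point of the sketch is that the comments request is only made when no filename matches.

```python
from typing import Callable, Iterable, List, Optional


def _first_match(texts: Iterable[str], names: List[str]) -> Optional[str]:
    """Return the first name that appears (case-insensitively) in any of the texts."""
    for text in texts:
        lowered = text.lower()
        for name in names:
            if name.lower() in lowered:
                return name
    return None


def find_character(
    filenames: Iterable[str],
    fetch_comments: Callable[[], List[str]],
    characters: Iterable[str],
) -> Optional[str]:
    names = list(characters)
    # Step 1: filenames are already known, so checking them costs nothing extra.
    match = _first_match(filenames, names)
    if match is not None:
        return match
    # Step 2: only now pay for the (hypothetical) comments API request.
    return _first_match(fetch_comments(), names)


calls = []


def fake_fetch() -> List[str]:
    calls.append("request")
    return ["Love this Aerith piece!"]


print(find_character(["tifa_01.jpg"], fake_fetch, ["Tifa", "Aerith"]), len(calls))
print(find_character(["img_02.jpg"], fake_fetch, ["Tifa", "Aerith"]), len(calls))
```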
---
### Refined "Missed Character Log"
- Displays a capitalized, alphabetized list of key terms from skipped post titles.
- Makes it easier to spot patterns or characters that might be unintentionally excluded.
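As a rough illustration of what a capitalized, alphabetized list of key terms might look like, here is a small sketch. How the application actually chooses "key terms" is not stated, so the stop-word list and the minimum word length below are assumptions, and `missed_terms` is a hypothetical name.

```python
import re
from typing import Iterable, List

# Hypothetical stop words; the real selection rule is not documented here.
STOPWORDS = {"the", "and", "for", "with", "from"}


def missed_terms(skipped_titles: Iterable[str]) -> List[str]:
    """Collect distinct words from skipped post titles, capitalized and sorted."""
    terms = set()
    for title in skipped_titles:
        for word in re.findall(r"[A-Za-z]+", title):
            if len(word) > 2 and word.lower() not in STOPWORDS:
                terms.add(word.capitalize())
    return sorted(terms)


print(missed_terms(["yor forger sketch", "Sketch of ANYA"]))
```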
---
### Enhanced Multi-part Download Progress
- Granular visibility into active chunk downloads and combined speed for large files.
---
### Updated Onboarding Tour
- Improved guide for new users, covering v4.0.0 features and existing core functions.
---
### Robust Configuration Path
- Settings and `Known.txt` are now stored in the same folder as the app.
---
## Core Features
---
### User Interface & Workflow
- **Clean PyQt5 GUI:** Simple, modern, and dark-themed.
- **Persistent Settings:** Saves preferences between sessions.
- **Download Modes:**
- Single Post URL
- Entire Creator Feed
- **Flexible Options:**
- Specify Page Range (disabled in Manga Mode)
- Custom Folder Name for single posts
---
### Smart Filtering
- **Character Name Filtering:**
- Use `Tifa, Aerith` or group `(Boa, Hancock)` → folder `Boa Hancock`
- Flexible input for current session and for adding to `Known.txt`.
- Examples:
- `Nami` (simple character)
- `(Boa Hancock)~` (aliases for one character, session folder "Boa Hancock", adds `(Boa Hancock)` to `Known.txt`)
- `(Vivi, Uta)` (distinct characters, session folder "Vivi Uta", adds `Vivi` and `Uta` separately to `Known.txt`)
- A "⤵️ Add to Filter" button (near the `Known.txt` management UI) allows you to quickly populate this field by selecting from your existing `Known.txt` entries via a popup with search and checkbox selection.
- See "Advanced `Known.txt` and Character Filtering" for full details.
- **Filter Scopes:**
- `Files`
- `Title`
- `Both (Title then Files)`
- `Comments (Beta - Files first)`
- **Skip with Words:**
- Exclude with `WIP, sketch, preview`
- **Skip Scopes:**
- `Files`
- `Posts`
- `Both (Posts then Files)`
- **File Type Filters:**
- `All`, `Images/GIFs`, `Videos`, `📦 Only Archives`, `🔗 Only Links`
- **Filename Cleanup:**
- Remove illegal and unwanted characters or phrases
---
### Manga/Comic Mode (Creator Feeds Only)
- **Chronological Processing:** Oldest posts first
- **Filename Style Options:**
- `Name: Post Title (Default)`
- `Name: Original File`
- `Name: Date Based (New)`
- `Name: Title+G.Num (Post Title + Global Numbering)`
- **Best With:** Character filters set to manga/series title
---
### Folder Structure & Naming
- **Subfolders:**
- Auto-created based on character name, post title, or `Known.txt`
- "Subfolder per Post" option for further nesting
- **Smart Naming:** Cleans invalid characters and structures logically
---
### Thumbnail & Compression Tools
- **Download Thumbnails Only:**
- Downloads small preview images from the API instead of full-sized files (if available).
- **Interaction with "Scan Content for Images" (New in v4.1.1):** When "Download Thumbnails Only" is active, "Scan Content for Images" is auto-enabled, and only images found by the content scan are downloaded. See "What's New in v4.1.1" for details.
- **Scan Content for Images (New in v4.1.1):**
- A UI option to scan the HTML content of posts for embedded image URLs (from `<img>` tags or direct links).
- Resolves relative paths and helps capture images not listed in the API's formal attachments.
- See the "What's New in v4.1.1?" section for a comprehensive explanation.
- **Compress to WebP** (via Pillow)
- Converts large images to smaller WebP versions
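The "Scan Content for Images" idea (collect image URLs from post HTML and resolve relative paths) can be sketched with the standard library alone. This is not the application's implementation: `scan_content_for_images` is a hypothetical name, the extension list is an assumption, and "direct links" is interpreted here as `<a href>` targets that end in an image extension.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin

# Assumption: extensions treated as images for direct links.
IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".gif", ".webp")


class _ImageCollector(HTMLParser):
    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.urls = []  # type: List[str]

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img":
            src = attributes.get("src")
            if src:
                self._add(src)
        elif tag == "a":
            href = attributes.get("href") or ""
            # Ignore any query string when checking the extension.
            if href.lower().split("?")[0].endswith(IMAGE_EXTS):
                self._add(href)

    def _add(self, url: str) -> None:
        absolute = urljoin(self.base_url, url)  # resolves relative paths
        if absolute not in self.urls:
            self.urls.append(absolute)


def scan_content_for_images(html: str, base_url: str) -> List[str]:
    """Return absolute image URLs found in <img> tags and direct image links."""
    collector = _ImageCollector(base_url)
    collector.feed(html)
    return collector.urls


html = '<p><img src="/data/a.png"><a href="https://cdn.example.com/b.JPG?x=1">b</a></p>'
print(scan_content_for_images(html, "https://example.com/user/1/post/2"))
```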
---
### Performance Features
- **Multithreading:**
- For both post processing and file downloading
- **Multi-part Downloads:**
- Toggleable in GUI
- Splits large files into chunks
- Granular chunk-level progress display
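Splitting a large file into chunks usually means assigning each chunk an inclusive byte range and requesting it with an HTTP `Range` header. The sketch below shows only that arithmetic. `chunk_ranges` and `range_header` are hypothetical helpers, and how the application actually sizes its chunks is not stated here.

```python
from typing import Dict, List, Tuple


def chunk_ranges(total_size: int, parts: int) -> List[Tuple[int, int]]:
    """Split total_size bytes into `parts` contiguous, inclusive (start, end) ranges."""
    if total_size <= 0 or parts <= 0:
        raise ValueError("total_size and parts must be positive")
    parts = min(parts, total_size)  # never create empty chunks
    base, extra = divmod(total_size, parts)
    ranges = []
    start = 0
    for index in range(parts):
        # The first `extra` chunks take one additional byte each.
        length = base + (1 if index < extra else 0)
        ranges.append((start, start + length - 1))
        start += length
    return ranges


def range_header(start: int, end: int) -> Dict[str, str]:
    """Build the HTTP Range header for one chunk."""
    return {"Range": "bytes={}-{}".format(start, end)}


for start, end in chunk_ranges(10, 3):
    print(range_header(start, end))
```

Each chunk could then be downloaded by its own worker and written at its `start` offset, which is also what makes per-chunk progress reporting possible.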
---
### Logging & Progress
- **Real-time Logs:** Activity, errors, skipped posts
- **Missed Character Log:** Shows skipped keywords in an easy-to-read list
- **External Links Log:** Shows links (unless disabled in some modes)
- **Export Links:** Save `.txt` of links (Only Links mode)
---
### Config System
- **`Known.txt` for Smart Folder Naming (Located in App Directory):**
- A user-editable file that stores a list of preferred names, series titles, or keywords.
- It's primarily used as an intelligent fallback for folder creation when "Separate Folders by Name/Title" is enabled.
- **Syntax:**
- Simple entries: `My Favorite Series` (creates folder "My Favorite Series", matches "My Favorite Series").
- Grouped entries: `(Desired Folder Name, alias1, alias2)` (creates folder "Desired Folder Name"; matches "Desired Folder Name", "alias1", or "alias2").
- **Settings Stored in App Directory**
- **Editable Within GUI**
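The `Known.txt` syntax above maps naturally to a folder name plus a list of match terms. The following sketch assumes one entry per line and case-insensitive substring matching. `parse_known_line` and `folder_for` are hypothetical names, not functions from the application.

```python
from typing import Iterable, List, Optional, Tuple


def parse_known_line(line: str) -> Optional[Tuple[str, List[str]]]:
    """Return (folder_name, match_terms) for one Known.txt line, or None if blank."""
    text = line.strip()
    if not text:
        return None
    if text.startswith("(") and text.endswith(")"):
        # Grouped entry: the first term names the folder, all terms are used for matching.
        terms = [t.strip() for t in text[1:-1].split(",") if t.strip()]
        if not terms:
            return None
        return terms[0], terms
    # Simple entry: the name is both the folder and the only match term.
    return text, [text]


def folder_for(title: str, known_lines: Iterable[str]) -> Optional[str]:
    """Pick the folder of the first entry with a term that appears in the title."""
    lowered = title.lower()
    for line in known_lines:
        entry = parse_known_line(line)
        if entry is None:
            continue
        folder, terms = entry
        if any(term.lower() in lowered for term in terms):
            return folder
    return None


known = ["My Favorite Series", "(Desired Folder Name, alias1, alias2)"]
print(folder_for("New alias2 pinup", known))
```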
---
## Installation
---
### Requirements
- Python 3.6 or higher
- pip
---
### Install Dependencies
```bash
pip install PyQt5 requests Pillow
```
***
## Build a Standalone Executable (Optional)
1. Install PyInstaller:
```bash
pip install pyinstaller
```
2. Run:
```bash
pyinstaller --name "Kemono Downloader" --onefile --windowed --icon="Kemono.ico" main.py
```
3. Output will be in the `dist/` folder.
***
## Config Files
- `settings.json` — Stores your UI preferences and settings.
- `Known.txt` — Stores character names, series titles, or keywords for organizing downloaded content into specific folders.
- Supports simple entries (e.g., `My Series`) and grouped entries for aliases (e.g., `(Folder Name, alias1, alias2)` where "Folder Name" is the name of the created folder, and all terms are used for matching).
***
## Feedback & Support
Issues? Suggestions?
Open an issue on the [GitHub repository](https://github.com/Yuvi9587/kemono-downloader) or join our community.