Mirror of https://github.com/Yuvi9587/Kemono-Downloader.git (synced 2025-12-29 16:14:44 +00:00)

Compare commits: a5cb04ea6f ... v6.4.3 (12 commits)
Commits in this range:

- d9364f4f91
- 9cd48bb63a
- d0f11c4a06
- 26fa3b9bc1
- f7c4d892a8
- 661b97aa16
- 3704fece2b
- bdb7ac93c4
- 76d4a3ea8a
- ccc7804505
- 4ee750c5d4
- e9be13c4e3
readme.md (145 changed lines)
```diff
@@ -1,4 +1,4 @@
-<h1 align="center">Kemono Downloader v6.0.0</h1>
+<h1 align="center">Kemono Downloader </h1>
 
 <div align="center">
 
```
```diff
@@ -41,108 +41,53 @@ Built with PyQt5, this tool is designed for users who want deep filtering capabi
 </div>
 
+<h2><strong>Core Capabilities Overview</strong></h2>
+
----
+<h3><strong>High-Performance Downloading</strong></h3>
+<ul>
+<li><strong>Multi-threading:</strong> Processes multiple posts simultaneously to greatly accelerate downloads from large creator profiles.</li>
+<li><strong>Multi-part Downloading:</strong> Splits large files into chunks and downloads them in parallel to maximize speed.</li>
+<li><strong>Resilience:</strong> Supports pausing, resuming, and restoring downloads after crashes or interruptions.</li>
+</ul>
 
-## Feature Overview
+<h3><strong>Advanced Filtering & Content Control</strong></h3>
+<ul>
+<li><strong>Content Type Filtering:</strong> Select whether to download all files or limit to images, videos, audio, or archives only.</li>
+<li><strong>Keyword Skipping:</strong> Automatically skips posts or files containing certain keywords (e.g., "WIP", "sketch").</li>
+<li><strong>Character Filtering:</strong> Restricts downloads to posts that match specific character or series names.</li>
+</ul>
 
-Kemono Downloader offers a range of features to streamline your content downloading experience:
+<h3><strong>File Organization & Renaming</strong></h3>
+<ul>
+<li><strong>Automated Subfolders:</strong> Automatically organizes downloaded files into subdirectories based on character names or per post.</li>
+<li><strong>Advanced File Renaming:</strong> Flexible renaming options, especially in Manga Mode, including:
+<ul>
+<li><strong>Post Title:</strong> Uses the post's title (e.g., <code>Chapter-One.jpg</code>).</li>
+<li><strong>Date + Original Name:</strong> Prepends the publication date to the original filename.</li>
+<li><strong>Date + Title:</strong> Combines the date with the post title.</li>
+<li><strong>Sequential Numbering (Date Based):</strong> Simple sequence numbers (e.g., <code>001.jpg</code>, <code>002.jpg</code>).</li>
+<li><strong>Title + Global Numbering:</strong> Uses post title with a globally incrementing number across the session.</li>
+<li><strong>Post ID:</strong> Names files using the post’s unique ID.</li>
+</ul>
+</li>
+</ul>
 
-- **User-Friendly Interface:** A modern PyQt5 GUI for easy navigation and operation.
+<h3><strong>Specialized Modes</strong></h3>
+<ul>
+<li><strong>Manga/Comic Mode:</strong> Sorts posts chronologically before downloading to ensure pages appear in the correct sequence.</li>
+<li><strong>Favorite Mode:</strong> Connects to your account and downloads from your favorites list (artists or posts).</li>
+<li><strong>Link Extraction Mode:</strong> Extracts external links from posts for export or targeted downloading.</li>
+<li><strong>Text Extraction Mode:</strong> Saves post descriptions or comment sections as <code>PDF</code>, <code>DOCX</code>, or <code>TXT</code> files.</li>
+</ul>
 
-- **Flexible Downloading:**
-  - Download content from Kemono.su (and mirrors) and Coomer.party (and mirrors).
-  - Supports creator pages (with page range selection) and individual post URLs.
-  - Standard download controls: Start, Pause, Resume, and Cancel.
-
-- **Powerful Filtering:**
-  - **Character Filtering:** Filter content by character names. Supports simple comma-separated names and grouped names for shared folders.
-  - **Keyword Skipping:** Skip posts or files based on specified keywords.
-  - **Filename Cleaning:** Remove unwanted words or phrases from downloaded filenames.
-  - **File Type Selection:** Choose to download all files, or limit to images/GIFs, videos, audio, or archives. Can also extract external links only.
-
-- **Customizable Downloads:**
-  - **Thumbnails Only:** Option to download only small preview images.
-  - **Content Scanning:** Scan post HTML for `<img>` tags and direct image links, useful for images embedded in descriptions.
-  - **WebP Conversion:** Convert images to WebP format for smaller file sizes (requires Pillow library).
-
-- **Organized Output:**
-  - **Automatic Subfolders:** Create subfolders based on character names (from filters or `Known.txt`) or post titles.
-  - **Per-Post Subfolders:** Option to create an additional subfolder for each individual post.
-
-- **Manga/Comic Mode:**
-  - Downloads posts from a creator's feed in chronological order (oldest to newest).
-  - Offers various filename styling options for sequential reading (e.g., post title, original name, global numbering).
-
-- **⭐ Favorite Mode:**
-  - Directly download from your favorited artists and posts on Kemono.su.
-  - Requires a valid cookie and adapts the UI for easy selection from your favorites.
-  - Supports downloading into a single location or artist-specific subfolders.
-
-- **Performance & Advanced Options:**
-  - **Cookie Support:** Use cookies (paste string or load from `cookies.txt`) to access restricted content.
-  - **Multithreading:** Configure the number of simultaneous downloads/post processing threads for improved speed.
-
-- **Logging:**
-  - A detailed progress log displays download activity, errors, and summaries.
-
-- **Multi-language Interface:** Choose from several languages for the UI (English, Japanese, French, Spanish, German, Russian, Korean, Chinese Simplified).
-
-- **Theme Customization:** Selectable Light and Dark themes for user comfort.
-
----
-
-## ✨ What's New in v6.0.0
-
-This release focuses on providing more granular control over file organization and improving at-a-glance status monitoring.
-
-### New Features
-
-- **Live Error Count on Button**
-  The **"Error" button** now dynamically displays the number of failed files during a download. Instead of opening the dialog, you can quickly see a live count like `(3) Error`, helping you track issues at a glance.
-
-- **Date Prefix for Post Subfolders**
-  A new checkbox labeled **"Date Prefix"** is now available in the advanced settings.
-  When enabled alongside **"Subfolder per Post"**, it prepends the post's upload date to the folder name (e.g., `2025-07-11 Post Title`).
-  This makes your downloads sortable and easier to browse chronologically.
-
-- **Keep Duplicates Within a Post**
-  A **"Keep Duplicates"** option has been added to preserve all files from a post — even if some have the same name.
-  Instead of skipping or overwriting, the downloader will save duplicates with numbered suffixes (e.g., `image.jpg`, `image_1.jpg`, etc.), which is especially useful when the same file name points to different media.
-
-### Bug Fixes
-
-- The downloader now correctly renames large `.part` files when completed, avoiding leftover temp files.
-- The list of failed files shown in the Error Dialog is now saved and restored with your session — so no errors get lost if you close the app.
-- Your selected download location is remembered, even after pressing the **Reset** button.
-- The **Cancel** button is now enabled when restoring a pending session, so you can abort stuck jobs more easily.
-- Internal cleanup logs (like "Deleting post cache") are now excluded from the final download summary for clarity.
-
----
-
-## 📅 Next Update Plans
-
-### 🔖 Post Tag Filtering (Planned for v6.1.0)
-
-A powerful new **"Filter by Post Tags"** feature is planned:
-
-- Filter and download content based on specific post tags.
-- Combine tag filtering with current filters (character, file type, etc.).
-- Use tag presets to automate frequent downloads.
-
-This will provide **much greater control** over what gets downloaded, especially for creators who use tags consistently.
-
-### 📁 Creator Download History (.json Save)
-
-To streamline incremental downloads, a new system will allow the app to:
-
-- Save a `.json` file with metadata about already-downloaded posts.
-- Compare that file on future runs, so only **new** posts are downloaded.
-- Avoids duplication and makes regular syncs fast and efficient.
-
-Ideal for users managing large collections or syncing favorites regularly.
-
----
-
+<h3><strong>Utility & Advanced Features</strong></h3>
+<ul>
+<li><strong>Cookie Support:</strong> Enables access to subscriber-only content via browser session cookies.</li>
+<li><strong>Duplicate Detection:</strong> Prevents saving duplicate files using content-based comparison, with configurable limits.</li>
+<li><strong>Image Compression:</strong> Automatically converts large images to <code>.webp</code> to reduce disk usage.</li>
+<li><strong>Creator Management:</strong> Built-in creator browser and update checker for downloading only new posts from saved profiles.</li>
+<li><strong>Error Handling:</strong> Tracks failed downloads and provides a retry dialog with options to export or redownload missing files.</li>
+</ul>
 
 ## 💻 Installation
 
```
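The "Multi-part Downloading" bullet in the new README text refers to splitting one large file into byte ranges and fetching those ranges in parallel. As a rough sketch of that technique (illustration only, not the app's actual `download_file_in_parts` implementation; retries, pause handling, and resumable `.part` files are omitted):

```python
import requests
from concurrent.futures import ThreadPoolExecutor

def download_in_parts(url, dest_path, num_parts=4):
    # Probe the total size; the server must support HTTP Range requests
    # and report Content-Length for this to work.
    total = int(requests.head(url, allow_redirects=True).headers["Content-Length"])
    part_size = total // num_parts

    def fetch_range(index):
        start = index * part_size
        # The last part absorbs any remainder.
        end = total - 1 if index == num_parts - 1 else start + part_size - 1
        resp = requests.get(url, headers={"Range": f"bytes={start}-{end}"}, timeout=(30, 300))
        resp.raise_for_status()
        return resp.content

    with ThreadPoolExecutor(max_workers=num_parts) as pool:
        parts = list(pool.map(fetch_range, range(num_parts)))  # order is preserved
    with open(dest_path, "wb") as f:
        for part in parts:
            f.write(part)
```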
````diff
@@ -154,7 +99,7 @@ Ideal for users managing large collections or syncing favorites regularly.
 ### Install Dependencies
 
 ```bash
-pip install PyQt5 requests Pillow mega.py
+pip install PyQt5 requests Pillow mega.py fpdf2 python-docx
 ```
 
 ### Running the Application
````
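The two dependencies added to this line, `fpdf2` and `python-docx`, back the Text Extraction Mode's `PDF` and `DOCX` output. A minimal sketch of how post text could be written out with them (the function and file names here are placeholders, not the app's own API):

```python
from fpdf import FPDF        # installed as fpdf2
from docx import Document    # installed as python-docx

def save_text_as_pdf(text, path):
    pdf = FPDF()
    pdf.add_page()
    pdf.set_font("Helvetica", size=12)
    pdf.multi_cell(0, 10, text)  # wraps long lines automatically
    pdf.output(path)

def save_text_as_docx(text, path):
    doc = Document()
    doc.add_paragraph(text)
    doc.save(path)

save_text_as_pdf("Post description goes here.", "post.pdf")
save_text_as_docx("Post description goes here.", "post.docx")
```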
```diff
@@ -197,7 +142,7 @@ Feel free to fork this repo and submit pull requests for bug fixes, new features
 
 ## License
 
-This project is under the Custom Licence
+This project is under the MIT Licence
 
 ## Star History
 
```
src/config/constants.py

```diff
@@ -60,6 +60,7 @@ DOWNLOAD_LOCATION_KEY = "downloadLocationV1"
 RESOLUTION_KEY = "window_resolution"
 UI_SCALE_KEY = "ui_scale_factor"
 SAVE_CREATOR_JSON_KEY = "saveCreatorJsonProfile"
+FETCH_FIRST_KEY = "fetchAllPostsFirst"
 
 # --- UI Constants and Identifiers ---
 HTML_PREFIX = "<!HTML!>"
@@ -97,7 +98,7 @@ FOLDER_NAME_STOP_WORDS = {
     "for", "he", "her", "his", "i", "im", "in", "is", "it", "its",
     "me", "my", "net", "not", "of", "on", "or", "org", "our",
     "s", "she", "so", "the", "their", "they", "this",
-    "to", "ve", "was", "we", "were", "with", "www", "you", "your",
+    "to", "ve", "was", "we", "were", "with", "www", "you", "your", "nsfw", "sfw",
     # add more according to need
 }
 
@@ -111,7 +112,9 @@ CREATOR_DOWNLOAD_DEFAULT_FOLDER_IGNORE_WORDS = {
     "may", "jun", "june", "jul", "july", "aug", "august", "sep", "september",
     "oct", "october", "nov", "november", "dec", "december",
     "mon", "monday", "tue", "tuesday", "wed", "wednesday", "thu", "thursday",
-    "fri", "friday", "sat", "saturday", "sun", "sunday"
+    "fri", "friday", "sat", "saturday", "sun", "sunday", "Pack", "tier", "spoiler",
+
+
     # add more according to need
 }
```
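Both word sets feed automatic folder naming: candidate folder names derived from post titles are stripped of filler words before a subfolder is created, which is why generic tokens like "nsfw" and "tier" are worth adding. A simplified sketch of that cleaning step (illustration only; the real logic lives in the app's folder-naming utilities and handles many more cases):

```python
# Abbreviated stand-in for the word sets above.
STOP_WORDS = {"the", "a", "of", "nsfw", "sfw", "tier"}

def clean_folder_name(title: str) -> str:
    kept = [w for w in title.split() if w.lower() not in STOP_WORDS]
    return " ".join(kept) or "untitled"

print(clean_folder_name("The NSFW Tier of Alice"))  # -> "Alice"
```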
src/core/api_client.py

```diff
@@ -1,7 +1,7 @@
 import time
 import traceback
 from urllib.parse import urlparse
-import json # Ensure json is imported
+import json
 import requests
 from ..utils.network_utils import extract_post_info, prepare_cookies_for_request
 from ..config.constants import (
@@ -41,9 +41,14 @@ def fetch_posts_paginated(api_url_base, headers, offset, logger, cancellation_ev
     try:
         response = requests.get(paginated_url, headers=headers, timeout=(15, 60), cookies=cookies_dict)
         response.raise_for_status()
+        response.encoding = 'utf-8'
         return response.json()
 
     except requests.exceptions.RequestException as e:
+        if e.response is not None and e.response.status_code == 400:
+            logger(f" ✅ Reached end of posts (API returned 400 Bad Request for offset {offset}).")
+            return []
+
         logger(f" ⚠️ Retryable network error on page fetch (Attempt {attempt + 1}): {e}")
         if attempt < max_retries - 1:
             delay = retry_delay * (2 ** attempt)
@@ -81,9 +86,12 @@ def fetch_single_post_data(api_domain, service, user_id, post_id, headers, logge
             response_body += chunk
 
         full_post_data = json.loads(response_body)
 
         if isinstance(full_post_data, list) and full_post_data:
             return full_post_data[0]
+        if isinstance(full_post_data, dict) and 'post' in full_post_data:
+            return full_post_data['post']
         return full_post_data
 
     except Exception as e:
         logger(f" ❌ Failed to fetch full content for post {post_id}: {e}")
@@ -101,6 +109,7 @@ def fetch_post_comments(api_domain, service, user_id, post_id, headers, logger,
     try:
         response = requests.get(comments_api_url, headers=headers, timeout=(10, 30), cookies=cookies_dict)
         response.raise_for_status()
+        response.encoding = 'utf-8'
         return response.json()
     except requests.exceptions.RequestException as e:
         raise RuntimeError(f"Error fetching comments for post {post_id}: {e}")
@@ -120,7 +129,8 @@ def download_from_api(
     selected_cookie_file=None,
     app_base_dir=None,
     manga_filename_style_for_sort_check=None,
-    processed_post_ids=None
+    processed_post_ids=None,
+    fetch_all_first=False
 ):
     headers = {
         'User-Agent': 'Mozilla/5.0',
@@ -140,12 +150,9 @@ def download_from_api(
     parsed_input_url_for_domain = urlparse(api_url_input)
     api_domain = parsed_input_url_for_domain.netloc
 
-    # --- START: MODIFIED LOGIC ---
-    # This list is updated to include the new .cr and .st mirrors for validation.
     if not any(d in api_domain.lower() for d in ['kemono.su', 'kemono.party', 'kemono.cr', 'coomer.su', 'coomer.party', 'coomer.st']):
         logger(f"⚠️ Unrecognized domain '{api_domain}' from input URL. Defaulting to kemono.su for API calls.")
         api_domain = "kemono.su"
-    # --- END: MODIFIED LOGIC ---
 
     cookies_for_api = None
     if use_cookie and app_base_dir:
@@ -159,6 +166,7 @@ def download_from_api(
     try:
         direct_response = requests.get(direct_post_api_url, headers=headers, timeout=(10, 30), cookies=cookies_for_api)
         direct_response.raise_for_status()
+        direct_response.encoding = 'utf-8'
         direct_post_data = direct_response.json()
         if isinstance(direct_post_data, list) and direct_post_data:
             direct_post_data = direct_post_data[0]
@@ -183,7 +191,8 @@ def download_from_api(
     logger("⚠️ Page range (start/end page) is ignored when a specific post URL is provided (searching all pages for the post).")
 
     is_manga_mode_fetch_all_and_sort_oldest_first = manga_mode and (manga_filename_style_for_sort_check != STYLE_DATE_POST_TITLE) and not target_post_id
-    api_base_url = f"https://{api_domain}/api/v1/{service}/user/{user_id}"
+    should_fetch_all = fetch_all_first or is_manga_mode_fetch_all_and_sort_oldest_first
+    api_base_url = f"https://{api_domain}/api/v1/{service}/user/{user_id}/posts"
     page_size = 50
     if is_manga_mode_fetch_all_and_sort_oldest_first:
         logger(f" Manga Mode (Style: {manga_filename_style_for_sort_check if manga_filename_style_for_sort_check else 'Default'} - Oldest First Sort Active): Fetching all posts to sort by date...")
```
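For orientation, the pagination that `fetch_posts_paginated` implements follows the site API's convention of an `o` (offset) query parameter, stepping by the page size until an empty batch, or now a 400 response, marks the end. The loop in isolation looks roughly like this (simplified: no retries, cookies, or pause handling):

```python
import requests

def iter_all_posts(api_base_url, page_size=50):
    """Yield every post from a paginated Kemono-style endpoint."""
    offset = 0
    while True:
        resp = requests.get(f"{api_base_url}?o={offset}", timeout=(15, 60))
        if resp.status_code == 400:  # treated as "past the last page"
            return
        resp.raise_for_status()
        batch = resp.json()
        if not batch:
            return
        yield from batch
        offset += page_size
```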
src/core/discord_client.py (new file, 80 lines)

```diff
@@ -0,0 +1,80 @@
+import time
+import requests
+import json
+from urllib.parse import urlparse
+
+
+def fetch_server_channels(server_id, logger, cookies=None, cancellation_event=None, pause_event=None):
+    """
+    Fetches the list of channels for a given Discord server ID from the Kemono API.
+    UPDATED to be pausable and cancellable.
+    """
+    domains_to_try = ["kemono.cr", "kemono.su"]
+    for domain in domains_to_try:
+        if cancellation_event and cancellation_event.is_set():
+            logger(" Channel fetching cancelled by user.")
+            return None
+        while pause_event and pause_event.is_set():
+            if cancellation_event and cancellation_event.is_set(): break
+            time.sleep(0.5)
+
+        lookup_url = f"https://{domain}/api/v1/discord/channel/lookup/{server_id}"
+        logger(f" Attempting to fetch channel list from: {lookup_url}")
+        try:
+            response = requests.get(lookup_url, cookies=cookies, timeout=15)
+            response.raise_for_status()
+            channels = response.json()
+            if isinstance(channels, list):
+                logger(f" ✅ Found {len(channels)} channels for server {server_id}.")
+                return channels
+        except (requests.exceptions.RequestException, json.JSONDecodeError):
+            # This is a silent failure, we'll just try the next domain
+            pass
+
+    logger(f" ❌ Failed to fetch channel list for server {server_id} from all available domains.")
+    return None
+
+
+def fetch_channel_messages(channel_id, logger, cancellation_event, pause_event, cookies=None):
+    """
+    Fetches all messages from a Discord channel by looping through API pages (pagination).
+    Uses a page size of 150 and handles the specific offset logic.
+    """
+    offset = 0
+    page_size = 150  # Corrected page size based on your findings
+    api_base_url = f"https://kemono.cr/api/v1/discord/channel/{channel_id}"
+
+    while not (cancellation_event and cancellation_event.is_set()):
+        if pause_event and pause_event.is_set():
+            logger(" Message fetching paused...")
+            while pause_event.is_set():
+                if cancellation_event and cancellation_event.is_set(): break
+                time.sleep(0.5)
+            logger(" Message fetching resumed.")
+
+        if cancellation_event and cancellation_event.is_set():
+            break
+
+        paginated_url = f"{api_base_url}?o={offset}"
+        logger(f" Fetching messages from API: page starting at offset {offset}")
+
+        try:
+            response = requests.get(paginated_url, cookies=cookies, timeout=20)
+            response.raise_for_status()
+            messages_batch = response.json()
+
+            if not messages_batch:
+                logger(f" ✅ Reached end of messages for channel {channel_id}.")
+                break
+
+            logger(f" Fetched {len(messages_batch)} messages...")
+            yield messages_batch
+
+            if len(messages_batch) < page_size:
+                logger(f" ✅ Last page of messages received for channel {channel_id}.")
+                break
+
+            offset += page_size
+            time.sleep(0.5)
+
+        except (requests.exceptions.RequestException, json.JSONDecodeError) as e:
+            logger(f" ❌ Error fetching messages at offset {offset}: {e}")
+            break
```
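Because `fetch_channel_messages` is a generator yielding one batch per API page, a caller drains it incrementally rather than waiting for the whole channel. A usage sketch (assuming, as the lookup response suggests, that each channel dict carries an `id` field; the server ID is a placeholder):

```python
import threading

cancel_event = threading.Event()
pause_event = threading.Event()

channels = fetch_server_channels("123456789012345678", print)
if channels:
    channel_id = channels[0]["id"]  # assumed key; illustration only
    for batch in fetch_channel_messages(channel_id, print, cancel_event, pause_event):
        for message in batch:
            print(message.get("id"), len(message.get("attachments", [])))
```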
(PostProcessorWorker / DownloadThread module)

```diff
@@ -37,7 +37,7 @@ try:
 except ImportError:
     Document = None
 from PyQt5 .QtCore import Qt ,QThread ,pyqtSignal ,QMutex ,QMutexLocker ,QObject ,QTimer ,QSettings ,QStandardPaths ,QCoreApplication ,QUrl ,QSize ,QProcess
-from .api_client import download_from_api, fetch_post_comments
+from .api_client import download_from_api, fetch_post_comments, fetch_single_post_data
 from ..services.multipart_downloader import download_file_in_parts, MULTIPART_DOWNLOADER_AVAILABLE
 from ..services.drive_downloader import (
     download_mega_file, download_gdrive_file, download_dropbox_file
```
```diff
@@ -124,7 +124,8 @@ class PostProcessorWorker:
                  processed_post_ids=None,
                  multipart_scope='both',
                  multipart_parts_count=4,
-                 multipart_min_size_mb=100
+                 multipart_min_size_mb=100,
+                 skip_file_size_mb=None
     ):
         self.post = post_data
         self.download_root = download_root
```
```diff
@@ -189,6 +190,7 @@ class PostProcessorWorker:
         self.multipart_scope = multipart_scope
         self.multipart_parts_count = multipart_parts_count
         self.multipart_min_size_mb = multipart_min_size_mb
+        self.skip_file_size_mb = skip_file_size_mb
         if self.compress_images and Image is None:
             self.logger("⚠️ Image compression disabled: Pillow library not found.")
             self.compress_images = False
```
```diff
@@ -276,7 +278,25 @@ class PostProcessorWorker:
         cookies_to_use_for_file = None
         if self.use_cookie:
             cookies_to_use_for_file = prepare_cookies_for_request(self.use_cookie, self.cookie_text, self.selected_cookie_file, self.app_base_dir, self.logger)
+
+        if self.skip_file_size_mb is not None:
+            api_original_filename_for_size_check = file_info.get('_original_name_for_log', file_info.get('name'))
+            try:
+                # Use a stream=True HEAD request to get headers without downloading the body
+                with requests.head(file_url, headers=file_download_headers, timeout=15, cookies=cookies_to_use_for_file, allow_redirects=True) as head_response:
+                    head_response.raise_for_status()
+                    content_length = head_response.headers.get('Content-Length')
+                    if content_length:
+                        file_size_bytes = int(content_length)
+                        file_size_mb = file_size_bytes / (1024 * 1024)
+                        if file_size_mb < self.skip_file_size_mb:
+                            self.logger(f" -> Skip File (Size): '{api_original_filename_for_size_check}' is {file_size_mb:.2f} MB, which is smaller than the {self.skip_file_size_mb} MB limit.")
+                            return 0, 1, api_original_filename_for_size_check, False, FILE_DOWNLOAD_STATUS_SKIPPED, None
+                    else:
+                        self.logger(f" ⚠️ Could not determine file size for '{api_original_filename_for_size_check}' to check against size limit. Proceeding with download.")
+            except requests.RequestException as e:
+                self.logger(f" ⚠️ Could not fetch file headers to check size for '{api_original_filename_for_size_check}': {e}. Proceeding with download.")
+
         api_original_filename = file_info.get('_original_name_for_log', file_info.get('name'))
         filename_to_save_in_main_path = ""
         if forced_filename_override:
```
```diff
@@ -488,19 +508,18 @@ class PostProcessorWorker:
         except requests.RequestException as e:
             self.logger(f" ⚠️ Could not verify size of existing file '{filename_to_save_in_main_path}': {e}. Proceeding with download.")
 
+        max_retries = 3
         retry_delay = 5
         downloaded_size_bytes = 0
         calculated_file_hash = None
         downloaded_part_file_path = None
-        total_size_bytes = 0
         download_successful_flag = False
         last_exception_for_retry_later = None
         is_permanent_error = False
         data_to_write_io = None
 
-        response_for_this_attempt = None
         for attempt_num_single_stream in range(max_retries + 1):
-            response_for_this_attempt = None
+            response = None
             if self._check_pause(f"File download attempt for '{api_original_filename}'"): break
             if self.check_cancel() or (skip_event and skip_event.is_set()): break
             try:
```
```diff
@@ -519,12 +538,24 @@ class PostProcessorWorker:
                 new_url = self._find_valid_subdomain(current_url_to_try)
                 if new_url != current_url_to_try:
                     self.logger(f" Retrying with new URL: {new_url}")
-                    file_url = new_url # Update the main file_url for subsequent retries
+                    file_url = new_url
+                    response.close() # Close the old response
                     response = requests.get(new_url, headers=file_download_headers, timeout=(30, 300), stream=True, cookies=cookies_to_use_for_file)
 
                 response.raise_for_status()
 
+                # --- REVISED AND MOVED SIZE CHECK LOGIC ---
                 total_size_bytes = int(response.headers.get('Content-Length', 0))
+
+                if self.skip_file_size_mb is not None:
+                    if total_size_bytes > 0:
+                        file_size_mb = total_size_bytes / (1024 * 1024)
+                        if file_size_mb < self.skip_file_size_mb:
+                            self.logger(f" -> Skip File (Size): '{api_original_filename}' is {file_size_mb:.2f} MB, which is smaller than the {self.skip_file_size_mb} MB limit.")
+                            return 0, 1, api_original_filename, False, FILE_DOWNLOAD_STATUS_SKIPPED, None
+                    # If Content-Length is missing, we can't check, so we no longer log a warning here and just proceed.
+                # --- END OF REVISED LOGIC ---
+
                 num_parts_for_file = min(self.multipart_parts_count, MAX_PARTS_FOR_MULTIPART_DOWNLOAD)
 
                 file_is_eligible_by_scope = False
```
```diff
@@ -548,9 +579,7 @@ class PostProcessorWorker:
                 if self._check_pause(f"Multipart decision for '{api_original_filename}'"): break
 
                 if attempt_multipart:
-                    if response_for_this_attempt:
-                        response_for_this_attempt.close()
-                        response_for_this_attempt = None
+                    response.close() # Close the initial connection before starting multipart
                     mp_save_path_for_unique_part_stem_arg = os.path.join(target_folder_path, f"{unique_part_file_stem_on_disk}{temp_file_ext_for_unique_part}")
                     mp_success, mp_bytes, mp_hash, mp_file_handle = download_file_in_parts(
                         file_url, mp_save_path_for_unique_part_stem_arg, total_size_bytes, num_parts_for_file, file_download_headers, api_original_filename,
```
```diff
@@ -576,7 +605,6 @@ class PostProcessorWorker:
                 current_attempt_downloaded_bytes = 0
                 md5_hasher = hashlib.md5()
                 last_progress_time = time.time()
-                single_stream_exception = None
                 try:
                     with open(current_single_stream_part_path, 'wb') as f_part:
                         for chunk in response.iter_content(chunk_size=1 * 1024 * 1024):
```
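The single-stream path above hashes as it writes: every chunk from `iter_content` goes through the MD5 hasher and into the `.part` file in one pass, so a content hash is available for duplicate detection without re-reading the finished file. The core of that pattern, extracted for clarity (a standalone sketch, not the worker's code):

```python
import hashlib
import requests

def stream_with_hash(url, part_path):
    """Download to a .part file while computing its MD5 in the same pass."""
    md5 = hashlib.md5()
    written = 0
    with requests.get(url, stream=True, timeout=(30, 300)) as resp:
        resp.raise_for_status()
        with open(part_path, "wb") as f:
            for chunk in resp.iter_content(chunk_size=1024 * 1024):
                if chunk:  # skip keep-alive chunks
                    f.write(chunk)
                    md5.update(chunk)
                    written += len(chunk)
    return md5.hexdigest(), written
```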
```diff
@@ -643,8 +671,8 @@ class PostProcessorWorker:
                     is_permanent_error = True
                     break
             finally:
-                if response_for_this_attempt:
-                    response_for_this_attempt.close()
+                if response:
+                    response.close()
                 self._emit_signal('file_download_status', False)
 
         final_total_for_progress = total_size_bytes if download_successful_flag and total_size_bytes > 0 else downloaded_size_bytes
```
```diff
@@ -826,37 +854,91 @@ class PostProcessorWorker:
             return 0, 1, filename_to_save_in_main_path, was_original_name_kept_flag, FILE_DOWNLOAD_STATUS_FAILED_RETRYABLE_LATER, details_for_failure
 
     def process(self):
+        # --- START: REFACTORED PROCESS METHOD ---
+
+        # 1. DATA MAPPING: Map Discord Message or Creator Post fields to a consistent set of variables.
+        if self.service == 'discord':
+            # For Discord, self.post is a MESSAGE object from the API.
+            post_title = self.post.get('content', '') or f"Message {self.post.get('id', 'N/A')}"
+            post_id = self.post.get('id', 'unknown_id')
+            post_main_file_info = {} # Discord messages don't have a single main file
+            post_attachments = self.post.get('attachments', [])
+            post_content_html = self.post.get('content', '')
+            post_data = self.post # Keep a reference to the original message object
+            log_prefix = "Message"
+        else:
+            # Existing logic for standard creator posts
+            post_title = self.post.get('title', '') or 'untitled_post'
+            post_id = self.post.get('id', 'unknown_id')
+            post_main_file_info = self.post.get('file')
+            post_attachments = self.post.get('attachments', [])
+            post_content_html = self.post.get('content', '')
+            post_data = self.post # Reference to the post object
+            log_prefix = "Post"
+
+        # --- FIX: FETCH FULL POST DATA IF CONTENT IS MISSING BUT NEEDED ---
+        content_is_needed = (
+            self.show_external_links or
+            self.extract_links_only or
+            self.scan_content_for_images or
+            (self.filter_mode == 'text_only' and self.text_only_scope == 'content')
+        )
+
+        if content_is_needed and self.post.get('content') is None and self.service != 'discord':
+            self.logger(f" Post {post_id} is missing 'content' field, fetching full data...")
+            parsed_url = urlparse(self.api_url_input)
+            api_domain = parsed_url.netloc
+            headers = {'User-Agent': 'Mozilla/5.0'}
+            cookies = prepare_cookies_for_request(self.use_cookie, self.cookie_text, self.selected_cookie_file, self.app_base_dir, self.logger, target_domain=api_domain)
+
+            full_post_data = fetch_single_post_data(api_domain, self.service, self.user_id, post_id, headers, self.logger, cookies_dict=cookies)
+
+            if full_post_data:
+                self.logger(" ✅ Full post data fetched successfully.")
+                # Update the worker's post object with the complete data
+                self.post = full_post_data
+                # Re-initialize local variables from the new, complete post data
+                post_title = self.post.get('title', '') or 'untitled_post'
+                post_main_file_info = self.post.get('file')
+                post_attachments = self.post.get('attachments', [])
+                post_content_html = self.post.get('content', '')
+                post_data = self.post
+            else:
+                self.logger(f" ⚠️ Failed to fetch full content for post {post_id}. Content-dependent features may not work for this post.")
+        # --- END FIX ---
+
+        # 2. SHARED PROCESSING LOGIC: The rest of the function now uses the consistent variables from above.
         result_tuple = (0, 0, [], [], [], None, None)
+        total_downloaded_this_post = 0
+        total_skipped_this_post = 0
+        determined_post_save_path_for_history = self.override_output_dir if self.override_output_dir else self.download_root
+
         try:
-            if self._check_pause(f"Post processing for ID {self.post.get('id', 'N/A')}"):
-                result_tuple = (0, 0, [], [], [], None, None)
-                return result_tuple
+            if self._check_pause(f"{log_prefix} processing for ID {post_id}"):
+                return (0, 0, [], [], [], None, None)
             if self.check_cancel():
-                result_tuple = (0, 0, [], [], [], None, None)
-                return result_tuple
+                return (0, 0, [], [], [], None, None)
 
             current_character_filters = self._get_current_character_filters()
             kept_original_filenames_for_log = []
             retryable_failures_this_post = []
             permanent_failures_this_post = []
-            total_downloaded_this_post = 0
-            total_skipped_this_post = 0
             history_data_for_this_post = None
 
             parsed_api_url = urlparse(self.api_url_input)
-            post_data = self.post
-            post_id = post_data.get('id', 'unknown_id')
+            # CONTEXT-AWARE URL for Referer Header
+            if self.service == 'discord':
+                server_id = self.user_id
+                channel_id = self.post.get('channel', 'unknown_channel')
+                post_page_url = f"https://{parsed_api_url.netloc}/discord/server/{server_id}/{channel_id}"
+            else:
+                post_page_url = f"https://{parsed_api_url.netloc}/{self.service}/user/{self.user_id}/post/{post_id}"
 
-            post_page_url = f"https://{parsed_api_url.netloc}/{self.service}/user/{self.user_id}/post/{post_id}"
             headers = {'User-Agent': 'Mozilla/5.0', 'Referer': post_page_url, 'Accept': '*/*'}
             link_pattern = re.compile(r"""<a\s+.*?href=["'](https?://[^"']+)["'][^>]*>(.*?)</a>""", re.IGNORECASE | re.DOTALL)
-            post_data = self.post
-            post_title = post_data.get('title', '') or 'untitled_post'
-            post_id = post_data.get('id', 'unknown_id')
-            post_main_file_info = post_data.get('file')
-            post_attachments = post_data.get('attachments', [])
 
             effective_unwanted_keywords_for_folder_naming = self.unwanted_keywords.copy()
             is_full_creator_download_no_char_filter = not self.target_post_id_from_initial_url and not current_character_filters
```
```diff
@@ -874,9 +956,9 @@ class PostProcessorWorker:
                 self.logger(f" Applying creator download specific folder ignore words ({len(self.creator_download_folder_ignore_words)} words).")
                 effective_unwanted_keywords_for_folder_naming.update(self.creator_download_folder_ignore_words)
 
-            post_content_html = post_data.get('content', '')
             if not self.extract_links_only:
-                self.logger(f"\n--- Processing Post {post_id} ('{post_title[:50]}...') (Thread: {threading.current_thread().name}) ---")
+                self.logger(f"\n--- Processing {log_prefix} {post_id} ('{post_title[:50]}...') (Thread: {threading.current_thread().name}) ---")
 
             num_potential_files_in_post = len(post_attachments or []) + (1 if post_main_file_info and post_main_file_info.get('path') else 0)
 
             post_is_candidate_by_title_char_match = False
```
```diff
@@ -920,7 +1002,7 @@ class PostProcessorWorker:
                 if original_api_att_name:
                     all_files_from_post_api_for_char_check.append({'_original_name_for_log': original_api_att_name})
 
-            if current_character_filters and self.char_filter_scope == CHAR_SCOPE_COMMENTS:
+            if current_character_filters and self.char_filter_scope == CHAR_SCOPE_COMMENTS and self.service != 'discord':
                 self.logger(f" [Char Scope: Comments] Phase 1: Checking post files for matches before comments for post ID '{post_id}'.")
                 if self._check_pause(f"File check (comments scope) for post {post_id}"):
                     result_tuple = (0, num_potential_files_in_post, [], [], [], None, None)
```
```diff
@@ -943,7 +1025,7 @@ class PostProcessorWorker:
                     if post_is_candidate_by_file_char_match_in_comment_scope: break
                 self.logger(f" [Char Scope: Comments] Phase 1 Result: post_is_candidate_by_file_char_match_in_comment_scope = {post_is_candidate_by_file_char_match_in_comment_scope}")
 
-            if current_character_filters and self.char_filter_scope == CHAR_SCOPE_COMMENTS:
+            if current_character_filters and self.char_filter_scope == CHAR_SCOPE_COMMENTS and self.service != 'discord':
                 if not post_is_candidate_by_file_char_match_in_comment_scope:
                     if self._check_pause(f"Comment check for post {post_id}"):
                         result_tuple = (0, num_potential_files_in_post, [], [], [], None, None)
```
```diff
@@ -1007,10 +1089,10 @@ class PostProcessorWorker:
                 return result_tuple
 
             if not self.extract_links_only and self.manga_mode_active and current_character_filters and (self.char_filter_scope == CHAR_SCOPE_TITLE or self.char_filter_scope == CHAR_SCOPE_BOTH) and not post_is_candidate_by_title_char_match:
                 self.logger(f" -> Skip Post (Manga Mode with Title/Both Scope - No Title Char Match): Title '{post_title[:50]}' doesn't match filters.")
                 self._emit_signal('missed_character_post', post_title, "Manga Mode: No title match for character filter (Title/Both scope)")
                 result_tuple = (0, num_potential_files_in_post, [], [], [], None, None)
                 return result_tuple
 
             if not isinstance(post_attachments, list):
                 self.logger(f"⚠️ Corrupt attachment data for post {post_id} (expected list, got {type(post_attachments)}). Skipping attachments.")
```
```diff
@@ -1143,29 +1225,50 @@ class PostProcessorWorker:
                 suffix_counter = 0
+                folder_creation_successful = False
                 final_post_subfolder_name = ""
+                post_id_for_folder = str(self.post.get('id', 'unknown_id'))
 
-                while True:
+                while not folder_creation_successful:
                     if suffix_counter == 0:
                         name_candidate = original_cleaned_post_title_for_sub
                     else:
                         name_candidate = f"{original_cleaned_post_title_for_sub}_{suffix_counter}"
 
                     potential_post_subfolder_path = os.path.join(base_path_for_post_subfolder, name_candidate)
-                    try:
-                        os.makedirs(potential_post_subfolder_path, exist_ok=False)
-                        final_post_subfolder_name = name_candidate
-                        if suffix_counter > 0:
-                            self.logger(f" Post subfolder name conflict: Using '{final_post_subfolder_name}' instead of '{original_cleaned_post_title_for_sub}' to avoid mixing posts.")
-                        break
-                    except FileExistsError:
-                        suffix_counter += 1
-                        if suffix_counter > 100:
-                            self.logger(f" ⚠️ Exceeded 100 attempts to find unique subfolder name for '{original_cleaned_post_title_for_sub}'. Using UUID.")
-                            final_post_subfolder_name = f"{original_cleaned_post_title_for_sub}_{uuid.uuid4().hex[:8]}"
-                            os.makedirs(os.path.join(base_path_for_post_subfolder, final_post_subfolder_name), exist_ok=True)
-                            break
-                    except OSError as e_mkdir:
-                        self.logger(f" ❌ Error creating directory '{potential_post_subfolder_path}': {e_mkdir}. Files for this post might be saved in parent or fail.")
-                        final_post_subfolder_name = original_cleaned_post_title_for_sub
-                        break
+                    id_file_path = os.path.join(potential_post_subfolder_path, f".postid_{post_id_for_folder}")
+
+                    if not os.path.isdir(potential_post_subfolder_path):
+                        # Folder does not exist, create it and its ID file
+                        try:
+                            os.makedirs(potential_post_subfolder_path)
+                            with open(id_file_path, 'w') as f:
+                                f.write(post_id_for_folder)
+
+                            final_post_subfolder_name = name_candidate
+                            folder_creation_successful = True
+                            if suffix_counter > 0:
+                                self.logger(f" Post subfolder name conflict: Using '{final_post_subfolder_name}' to avoid mixing posts.")
+                        except OSError as e_mkdir:
+                            self.logger(f" ❌ Error creating directory '{potential_post_subfolder_path}': {e_mkdir}.")
+                            final_post_subfolder_name = original_cleaned_post_title_for_sub
+                            break
+                    else:
+                        # Folder exists, check if it's for this post or a different one
+                        if os.path.exists(id_file_path):
+                            # ID file matches! This is a restore scenario. Reuse the folder.
+                            self.logger(f" ℹ️ Re-using existing post subfolder: '{name_candidate}'")
+                            final_post_subfolder_name = name_candidate
+                            folder_creation_successful = True
+                        else:
+                            # Folder exists but ID file does not match (or is missing). This is a normal name collision.
+                            suffix_counter += 1
+                            if suffix_counter > 100: # Safety break
+                                self.logger(f" ⚠️ Exceeded 100 attempts to find unique subfolder for '{original_cleaned_post_title_for_sub}'.")
+                                final_post_subfolder_name = f"{original_cleaned_post_title_for_sub}_{uuid.uuid4().hex[:8]}"
+                                os.makedirs(os.path.join(base_path_for_post_subfolder, final_post_subfolder_name), exist_ok=True)
+                                break
                 determined_post_save_path_for_history = os.path.join(base_path_for_post_subfolder, final_post_subfolder_name)
 
             if self.skip_words_list and (self.skip_words_scope == SKIP_SCOPE_POSTS or self.skip_words_scope == SKIP_SCOPE_BOTH):
```
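The `.postid_<id>` marker file is what separates "same post, restored session" from "different post that happens to share a title": reuse is decided by a single existence check. The decision, condensed (a sketch of the logic above, not a drop-in replacement; returning `None` stands in for retrying with a numbered suffix, as the worker does):

```python
import os

def resolve_post_folder(base_dir, cleaned_title, post_id):
    candidate = os.path.join(base_dir, cleaned_title)
    marker = os.path.join(candidate, f".postid_{post_id}")
    if os.path.isdir(candidate):
        # Reuse on restore; collide (caller retries with a suffix) otherwise.
        return candidate if os.path.exists(marker) else None
    os.makedirs(candidate)
    with open(marker, "w") as f:
        f.write(str(post_id))
    return candidate
```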
```diff
@@ -1214,7 +1317,6 @@ class PostProcessorWorker:
                 parsed_url = urlparse(self.api_url_input)
                 api_domain = parsed_url.netloc
                 cookies = prepare_cookies_for_request(self.use_cookie, self.cookie_text, self.selected_cookie_file, self.app_base_dir, self.logger, target_domain=api_domain)
-                from .api_client import fetch_single_post_data
                 full_data = fetch_single_post_data(api_domain, self.service, self.user_id, post_id, headers, self.logger, cookies_dict=cookies)
                 if full_data:
                     final_post_data = full_data
```
```diff
@@ -1807,14 +1909,23 @@ class PostProcessorWorker:
                             permanent_failures_this_post, history_data_for_this_post,
                             None)
 
+        except Exception as main_thread_err:
+            self.logger(f"\n❌ Critical error within Worker process for {log_prefix} {post_id}: {main_thread_err}")
+            self.logger(traceback.format_exc())
+            # Ensure we still return a valid tuple to prevent the app from stalling
+            result_tuple = (0, 1, [], [], [{'error': str(main_thread_err)}], None, None)
         finally:
+            # This block ALWAYS executes, ensuring that every task signals its completion.
+            # This is critical for the main thread to know when all work is done.
             if not self.extract_links_only and self.use_post_subfolders and total_downloaded_this_post == 0:
                 path_to_check_for_emptiness = determined_post_save_path_for_history
                 try:
+                    # Check if the path is a directory and if it's empty
                     if os.path.isdir(path_to_check_for_emptiness) and not os.listdir(path_to_check_for_emptiness):
                         self.logger(f" 🗑️ Removing empty post-specific subfolder: '{path_to_check_for_emptiness}'")
                         os.rmdir(path_to_check_for_emptiness)
                 except OSError as e_rmdir:
+                    # Log if removal fails for any reason (e.g., permissions)
                    self.logger(f" ⚠️ Could not remove potentially empty subfolder '{path_to_check_for_emptiness}': {e_rmdir}")
 
             self._emit_signal('worker_finished', result_tuple)
```
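The new `except`/`finally` arrangement guarantees that `worker_finished` fires exactly once per task no matter how `process()` exits; without it, a single crashed worker would leave the pool's completion count short and stall the app. Reduced to its shape (with `_do_all_the_work` as a hypothetical stand-in for the method body):

```python
def process(self):
    result = (0, 0, [], [], [], None, None)  # safe default
    try:
        result = self._do_all_the_work()
    except Exception as err:
        # Record the failure, but never let it escape the worker thread.
        result = (0, 1, [], [], [{'error': str(err)}], None, None)
    finally:
        # Runs on success, failure, and early return alike.
        self._emit_signal('worker_finished', result)
```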
```diff
@@ -1881,7 +1992,10 @@ class DownloadThread(QThread):
                  single_pdf_mode=False,
                  project_root_dir=None,
                  processed_post_ids=None,
-                 start_offset=0):
+                 start_offset=0,
+                 fetch_first=False,
+                 skip_file_size_mb=None
+                 ):
         super().__init__()
         self.api_url_input = api_url_input
         self.output_dir = output_dir
```
```diff
@@ -1947,6 +2061,8 @@ class DownloadThread(QThread):
         self.project_root_dir = project_root_dir
         self.processed_post_ids_set = set(processed_post_ids) if processed_post_ids is not None else set()
         self.start_offset = start_offset
+        self.fetch_first = fetch_first
+        self.skip_file_size_mb = skip_file_size_mb
 
         if self.compress_images and Image is None:
             self.logger("⚠️ Image compression disabled: Pillow library not found (DownloadThread).")
```
```diff
@@ -1993,7 +2109,8 @@ class DownloadThread(QThread):
             selected_cookie_file=self.selected_cookie_file,
             app_base_dir=self.app_base_dir,
             manga_filename_style_for_sort_check=self.manga_filename_style if self.manga_mode_active else None,
-            processed_post_ids=self.processed_post_ids_set
+            processed_post_ids=self.processed_post_ids_set,
+            fetch_all_first=self.fetch_first
         )
 
         for posts_batch_data in post_generator:
```
@@ -2066,6 +2183,7 @@ class DownloadThread(QThread):
             'single_pdf_mode': self.single_pdf_mode,
             'multipart_parts_count': self.multipart_parts_count,
             'multipart_min_size_mb': self.multipart_min_size_mb,
+            'skip_file_size_mb': self.skip_file_size_mb,
             'project_root_dir': self.project_root_dir,
         }

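The new `'skip_file_size_mb'` entry implies a per-file size cap checked before a download begins. A self-contained sketch of such a gate (the helper name is hypothetical; the hunk does not show the consumer of this dict):

```python
def should_skip_by_size(file_size_bytes, skip_file_size_mb):
    """Return True when a cap is configured and the file exceeds it.

    A cap of None means the feature is off; sizes arrive in bytes,
    e.g. from a Content-Length response header.
    """
    if skip_file_size_mb is None:
        return False
    return file_size_bytes > skip_file_size_mb * 1024 * 1024

# With a 250 MB cap: a 300 MB file is skipped, a 10 MB file is not.
assert should_skip_by_size(300 * 1024 * 1024, 250)
assert not should_skip_by_size(10 * 1024 * 1024, 250)
```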
@@ -7,8 +7,6 @@ import base64
 import time
 from urllib.parse import urlparse, urlunparse, parse_qs, urlencode

-# --- Third-Party Library Imports ---
-# Make sure to install these: pip install requests pycryptodome gdown
 import requests

 try:
@@ -23,11 +21,8 @@ try:
 except ImportError:
     GDRIVE_AVAILABLE = False

-# --- Constants ---
 MEGA_API_URL = "https://g.api.mega.co.nz"

-# --- Helper Functions (Original and New) ---
-
 def _get_filename_from_headers(headers):
     """
     Extracts a filename from the Content-Disposition header.
@@ -16,7 +16,8 @@ from ..main_window import get_app_icon_object
 from ...config.constants import (
     THEME_KEY, LANGUAGE_KEY, DOWNLOAD_LOCATION_KEY,
     RESOLUTION_KEY, UI_SCALE_KEY, SAVE_CREATOR_JSON_KEY,
-    COOKIE_TEXT_KEY, USE_COOKIE_KEY
+    COOKIE_TEXT_KEY, USE_COOKIE_KEY,
+    FETCH_FIRST_KEY ### ADDED ###
 )


@@ -36,7 +37,7 @@ class FutureSettingsDialog(QDialog):

         screen_height = QApplication.primaryScreen().availableGeometry().height() if QApplication.primaryScreen() else 800
         scale_factor = screen_height / 800.0
-        base_min_w, base_min_h = 420, 360 # Adjusted height for new layout
+        base_min_w, base_min_h = 420, 390
         scaled_min_w = int(base_min_w * scale_factor)
         scaled_min_h = int(base_min_h * scale_factor)
         self.setMinimumSize(scaled_min_w, scaled_min_h)
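For example, on a display with 1040 px of available height the scale factor is 1040 / 800 = 1.3, so the new 420 × 390 base becomes a 546 × 507 minimum size; the taller base accommodates the extra checkbox row added below.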
@@ -49,7 +50,6 @@ class FutureSettingsDialog(QDialog):
         """Initializes all UI components and layouts for the dialog."""
         main_layout = QVBoxLayout(self)

-        # --- Group 1: Interface Settings ---
         self.interface_group_box = QGroupBox()
         interface_layout = QGridLayout(self.interface_group_box)

@@ -76,36 +76,32 @@ class FutureSettingsDialog(QDialog):

         main_layout.addWidget(self.interface_group_box)

-        # --- Group 2: Download & Window Settings ---
         self.download_window_group_box = QGroupBox()
         download_window_layout = QGridLayout(self.download_window_group_box)

-        # Window Size (Resolution)
         self.window_size_label = QLabel()
         self.resolution_combo_box = QComboBox()
         self.resolution_combo_box.currentIndexChanged.connect(self._display_setting_changed)
         download_window_layout.addWidget(self.window_size_label, 0, 0)
         download_window_layout.addWidget(self.resolution_combo_box, 0, 1)

-        # Default Path
         self.default_path_label = QLabel()
         self.save_path_button = QPushButton()
-        # --- START: MODIFIED LOGIC ---
         self.save_path_button.clicked.connect(self._save_cookie_and_path)
-        # --- END: MODIFIED LOGIC ---
         download_window_layout.addWidget(self.default_path_label, 1, 0)
         download_window_layout.addWidget(self.save_path_button, 1, 1)

-        # Save Creator.json Checkbox
         self.save_creator_json_checkbox = QCheckBox()
         self.save_creator_json_checkbox.stateChanged.connect(self._creator_json_setting_changed)
         download_window_layout.addWidget(self.save_creator_json_checkbox, 2, 0, 1, 2)

+        self.fetch_first_checkbox = QCheckBox()
+        self.fetch_first_checkbox.stateChanged.connect(self._fetch_first_setting_changed)
+        download_window_layout.addWidget(self.fetch_first_checkbox, 3, 0, 1, 2)
+
         main_layout.addWidget(self.download_window_group_box)

         main_layout.addStretch(1)

-        # --- OK Button ---
         self.ok_button = QPushButton()
         self.ok_button.clicked.connect(self.accept)
         main_layout.addWidget(self.ok_button, 0, Qt.AlignRight | Qt.AlignBottom)
@@ -113,17 +109,27 @@ class FutureSettingsDialog(QDialog):
     def _load_checkbox_states(self):
         """Loads the initial state for all checkboxes from settings."""
         self.save_creator_json_checkbox.blockSignals(True)
-        # Default to True so the feature is on by default for users
         should_save = self.parent_app.settings.value(SAVE_CREATOR_JSON_KEY, True, type=bool)
         self.save_creator_json_checkbox.setChecked(should_save)
         self.save_creator_json_checkbox.blockSignals(False)

+        self.fetch_first_checkbox.blockSignals(True)
+        should_fetch_first = self.parent_app.settings.value(FETCH_FIRST_KEY, False, type=bool)
+        self.fetch_first_checkbox.setChecked(should_fetch_first)
+        self.fetch_first_checkbox.blockSignals(False)
+
     def _creator_json_setting_changed(self, state):
         """Saves the state of the 'Save Creator.json' checkbox."""
         is_checked = state == Qt.Checked
         self.parent_app.settings.setValue(SAVE_CREATOR_JSON_KEY, is_checked)
         self.parent_app.settings.sync()

+    def _fetch_first_setting_changed(self, state):
+        """Saves the state of the 'Fetch First' checkbox."""
+        is_checked = state == Qt.Checked
+        self.parent_app.settings.setValue(FETCH_FIRST_KEY, is_checked)
+        self.parent_app.settings.sync()
+
     def _tr(self, key, default_text=""):
         if callable(get_translation) and self.parent_app:
             return get_translation(self.parent_app.current_selected_language, key, default_text)
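The load/save pair above is the standard QSettings round-trip: block the widget's signals while restoring so `setChecked()` does not immediately re-trigger the save slot, then persist and `sync()` on every user toggle. A condensed, self-contained sketch of the same pattern (the key string and organization/application names are placeholders, not the project's actual values):

```python
from PyQt5.QtCore import QSettings, Qt
from PyQt5.QtWidgets import QApplication, QCheckBox

FETCH_FIRST_KEY = "fetch_first"  # placeholder; the real key lives in config.constants

app = QApplication([])
settings = QSettings("ExampleOrg", "ExampleApp")  # placeholder org/app names
checkbox = QCheckBox("Fetch First")

# Restore: block signals so setChecked() cannot re-trigger the save slot.
checkbox.blockSignals(True)
checkbox.setChecked(settings.value(FETCH_FIRST_KEY, False, type=bool))
checkbox.blockSignals(False)

# Persist every user toggle immediately.
def on_state_changed(state):
    settings.setValue(FETCH_FIRST_KEY, state == Qt.Checked)
    settings.sync()

checkbox.stateChanged.connect(on_state_changed)
```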
@@ -132,33 +138,31 @@ class FutureSettingsDialog(QDialog):
     def _retranslate_ui(self):
         self.setWindowTitle(self._tr("settings_dialog_title", "Settings"))

-        # Group Box Titles
         self.interface_group_box.setTitle(self._tr("interface_group_title", "Interface Settings"))
         self.download_window_group_box.setTitle(self._tr("download_window_group_title", "Download & Window Settings"))

-        # Interface Group Labels
         self.theme_label.setText(self._tr("theme_label", "Theme:"))
         self.ui_scale_label.setText(self._tr("ui_scale_label", "UI Scale:"))
         self.language_label.setText(self._tr("language_label", "Language:"))

-        # Download & Window Group Labels
         self.window_size_label.setText(self._tr("window_size_label", "Window Size:"))
         self.default_path_label.setText(self._tr("default_path_label", "Default Path:"))
         self.save_creator_json_checkbox.setText(self._tr("save_creator_json_label", "Save Creator.json file"))

-        # --- START: MODIFIED LOGIC ---
-        # Buttons and Controls
+        self.fetch_first_checkbox.setText(self._tr("fetch_first_label", "Fetch First (Download after all pages are found)"))
+        self.fetch_first_checkbox.setToolTip(self._tr("fetch_first_tooltip", "If checked, the downloader will find all posts from a creator first before starting any downloads.\nThis can be slower to start but provides a more accurate progress bar."))
+
         self._update_theme_toggle_button_text()
         self.save_path_button.setText(self._tr("settings_save_cookie_path_button", "Save Cookie + Download Path"))
         self.save_path_button.setToolTip(self._tr("settings_save_cookie_path_tooltip", "Save the current 'Download Location' and Cookie settings for future sessions."))
         self.ok_button.setText(self._tr("ok_button", "OK"))
-        # --- END: MODIFIED LOGIC ---

-        # Populate dropdowns
         self._populate_display_combo_boxes()
         self._populate_language_combo_box()
         self._load_checkbox_states()

+        # --- (The rest of the file remains unchanged) ---
+
     def _apply_theme(self):
         if self.parent_app and self.parent_app.current_theme == "dark":
             scale = getattr(self.parent_app, 'scale_factor', 1)
@@ -285,14 +289,12 @@ class FutureSettingsDialog(QDialog):
         path_saved = False
         cookie_saved = False

-        # --- Save Download Path Logic ---
         if hasattr(self.parent_app, 'dir_input') and self.parent_app.dir_input:
             current_path = self.parent_app.dir_input.text().strip()
             if current_path and os.path.isdir(current_path):
                 self.parent_app.settings.setValue(DOWNLOAD_LOCATION_KEY, current_path)
                 path_saved = True

-        # --- Save Cookie Logic ---
         if hasattr(self.parent_app, 'use_cookie_checkbox'):
             use_cookie = self.parent_app.use_cookie_checkbox.isChecked()
             cookie_content = self.parent_app.cookie_text_input.text().strip()
@@ -301,7 +303,7 @@ class FutureSettingsDialog(QDialog):
                 self.parent_app.settings.setValue(USE_COOKIE_KEY, True)
                 self.parent_app.settings.setValue(COOKIE_TEXT_KEY, cookie_content)
                 cookie_saved = True
-            else: # Also save the 'off' state
+            else:
                 self.parent_app.settings.setValue(USE_COOKIE_KEY, False)
                 self.parent_app.settings.setValue(COOKIE_TEXT_KEY, "")

@@ -319,4 +321,4 @@ class FutureSettingsDialog(QDialog):
                 self._tr("settings_save_nothing_message", "The download location is not a valid directory and no cookie was active."))
             return

         QMessageBox.information(self, self._tr("settings_save_success_title", "Settings Saved"), message)
146  src/ui/dialogs/discord_pdf_generator.py  (new file)
@@ -0,0 +1,146 @@
+import os
+import re
+import datetime
+
+try:
+    from fpdf import FPDF
+    FPDF_AVAILABLE = True
+
+    class PDF(FPDF):
+        """Custom PDF class for Discord chat logs."""
+        def __init__(self, server_name, channel_name, *args, **kwargs):
+            super().__init__(*args, **kwargs)
+            self.server_name = server_name
+            self.channel_name = channel_name
+            self.default_font_family = 'DejaVu' # Can be changed to Arial if font fails
+
+        def header(self):
+            if self.page_no() == 1:
+                return # No header on the title page
+            self.set_font(self.default_font_family, '', 8)
+            self.cell(0, 10, f'{self.server_name} - #{self.channel_name}', 0, 0, 'L')
+            self.cell(0, 10, 'Page ' + str(self.page_no()), 0, 0, 'R')
+            self.ln(10)
+
+        def footer(self):
+            pass # No footer needed, header has page number
+
+except ImportError:
+    FPDF_AVAILABLE = False
+    FPDF = None
+    PDF = None
+
+
+def create_pdf_from_discord_messages(messages_data, server_name, channel_name, output_filename, font_path, logger=print):
+    """
+    Creates a single PDF from a list of Discord message objects, formatted as a chat log.
+    UPDATED to include clickable links for attachments and embeds.
+    """
+    if not FPDF_AVAILABLE:
+        logger("❌ PDF Creation failed: 'fpdf2' library is not installed.")
+        return False
+
+    if not messages_data:
+        logger("   No messages were found or fetched to create a PDF.")
+        return False
+
+    logger("   Sorting messages by date (oldest first)...")
+    messages_data.sort(key=lambda m: m.get('published', ''))
+
+    pdf = PDF(server_name, channel_name)
+    default_font_family = 'DejaVu'
+
+    try:
+        bold_font_path = font_path.replace("DejaVuSans.ttf", "DejaVuSans-Bold.ttf")
+        if not os.path.exists(font_path) or not os.path.exists(bold_font_path):
+            raise RuntimeError("Font files not found")
+
+        pdf.add_font('DejaVu', '', font_path, uni=True)
+        pdf.add_font('DejaVu', 'B', bold_font_path, uni=True)
+    except Exception as font_error:
+        logger(f"   ⚠️ Could not load DejaVu font: {font_error}. Falling back to Arial.")
+        default_font_family = 'Arial'
+        pdf.default_font_family = 'Arial'
+
+    # --- Title Page ---
+    pdf.add_page()
+    pdf.set_font(default_font_family, 'B', 24)
+    pdf.cell(w=0, h=20, text="Discord Chat Log", align='C', new_x="LMARGIN", new_y="NEXT")
+    pdf.ln(10)
+    pdf.set_font(default_font_family, '', 16)
+    pdf.cell(w=0, h=10, text=f"Server: {server_name}", align='C', new_x="LMARGIN", new_y="NEXT")
+    pdf.cell(w=0, h=10, text=f"Channel: #{channel_name}", align='C', new_x="LMARGIN", new_y="NEXT")
+    pdf.ln(5)
+    pdf.set_font(default_font_family, '', 10)
+    pdf.cell(w=0, h=10, text=f"Generated on: {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}", align='C', new_x="LMARGIN", new_y="NEXT")
+    pdf.cell(w=0, h=10, text=f"Total Messages: {len(messages_data)}", align='C', new_x="LMARGIN", new_y="NEXT")
+
+    pdf.add_page()
+
+    logger(f"   Starting PDF creation with {len(messages_data)} messages...")
+
+    for i, message in enumerate(messages_data):
+        author = message.get('author', {}).get('global_name') or message.get('author', {}).get('username', 'Unknown User')
+        timestamp_str = message.get('published', '')
+        content = message.get('content', '')
+        attachments = message.get('attachments', [])
+        embeds = message.get('embeds', [])
+
+        try:
+            # Handle timezone information correctly
+            if timestamp_str.endswith('Z'):
+                timestamp_str = timestamp_str[:-1] + '+00:00'
+            dt_obj = datetime.datetime.fromisoformat(timestamp_str)
+            formatted_timestamp = dt_obj.strftime('%Y-%m-%d %H:%M:%S')
+        except (ValueError, TypeError):
+            formatted_timestamp = timestamp_str
+
+        # Draw a separator line
+        if i > 0:
+            pdf.ln(2)
+            pdf.set_draw_color(200, 200, 200) # Light grey line
+            pdf.cell(0, 0, '', border='T')
+            pdf.ln(2)
+
+        # Message Header
+        pdf.set_font(default_font_family, 'B', 11)
+        pdf.write(5, f"{author} ")
+        pdf.set_font(default_font_family, '', 9)
+        pdf.set_text_color(128, 128, 128)
+        pdf.write(5, f"({formatted_timestamp})")
+        pdf.set_text_color(0, 0, 0)
+        pdf.ln(6)
+
+        # Message Content
+        if content:
+            pdf.set_font(default_font_family, '', 10)
+            pdf.multi_cell(w=0, h=5, text=content)
+
+        # --- START: MODIFIED ATTACHMENT AND EMBED LOGIC ---
+        if attachments or embeds:
+            pdf.ln(1)
+            pdf.set_font(default_font_family, '', 9)
+            pdf.set_text_color(22, 119, 219) # A nice blue for links
+
+            for att in attachments:
+                file_name = att.get('name', 'untitled')
+                file_path = att.get('path', '')
+                # Construct the full, clickable URL for the attachment
+                full_url = f"https://kemono.cr/data{file_path}"
+                pdf.write(5, text=f"[Attachment: {file_name}]", link=full_url)
+                pdf.ln() # New line after each attachment
+
+            for embed in embeds:
+                embed_url = embed.get('url', 'no url')
+                # The embed URL is already a full URL
+                pdf.write(5, text=f"[Embed: {embed_url}]", link=embed_url)
+                pdf.ln() # New line after each embed
+
+            pdf.set_text_color(0, 0, 0) # Reset color to black
+        # --- END: MODIFIED ATTACHMENT AND EMBED LOGIC ---
+
+    try:
+        pdf.output(output_filename)
+        logger(f"✅ Successfully created Discord chat log PDF: '{os.path.basename(output_filename)}'")
+        return True
+    except Exception as e:
+        logger(f"❌ A critical error occurred while saving the final PDF: {e}")
+        return False
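A hypothetical invocation of the new generator, showing the message shape it consumes (`author`, `published`, `content`, `attachments`, `embeds`) and the 'Z'-suffixed ISO timestamps that the loop above rewrites to `+00:00` before `fromisoformat`. All names, paths, and values below are illustrative:

```python
messages = [
    {
        "author": {"username": "alice", "global_name": "Alice"},
        "published": "2024-05-01T12:30:00Z",  # 'Z' is rewritten to '+00:00'
        "content": "First message!",
        "attachments": [{"name": "page1.png", "path": "/aa/bb/page1.png"}],
        "embeds": [],
    },
    {
        "author": {"username": "bob"},
        "published": "2024-05-01T12:31:05Z",
        "content": "Reply with a link.",
        "attachments": [],
        "embeds": [{"url": "https://example.com/post"}],
    },
]

# The font path is an assumption; on failure the function falls back to Arial.
create_pdf_from_discord_messages(
    messages,
    server_name="My Server",
    channel_name="general",
    output_filename="chat_log.pdf",
    font_path="fonts/DejaVuSans.ttf",
)
```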
(File diff suppressed because it is too large.)
@@ -141,12 +141,15 @@ def prepare_cookies_for_request(use_cookie_flag, cookie_text_input, selected_coo
 def extract_post_info(url_string):
     """
     Parses a URL string to extract the service, user ID, and post ID.
+    UPDATED to support Discord server/channel URLs.

     Args:
         url_string (str): The URL to parse.

     Returns:
-        tuple: A tuple containing (service, user_id, post_id). Any can be None.
+        tuple: A tuple containing (service, id1, id2).
+            For posts: (service, user_id, post_id).
+            For Discord: ('discord', server_id, channel_id).
     """
     if not isinstance(url_string, str) or not url_string.strip():
         return None, None, None
@@ -155,7 +158,15 @@ def extract_post_info(url_string):
     parsed_url = urlparse(url_string.strip())
     path_parts = [part for part in parsed_url.path.strip('/').split('/') if part]

-    # Standard format: /<service>/user/<user_id>/post/<post_id>
+    # Check for new Discord URL format first
+    # e.g., /discord/server/891670433978531850/1252332668805189723
+    if len(path_parts) >= 3 and path_parts[0].lower() == 'discord' and path_parts[1].lower() == 'server':
+        service = 'discord'
+        server_id = path_parts[2]
+        channel_id = path_parts[3] if len(path_parts) >= 4 else None
+        return service, server_id, channel_id
+
+    # Standard creator/post format: /<service>/user/<user_id>/post/<post_id>
     if len(path_parts) >= 3 and path_parts[1].lower() == 'user':
         service = path_parts[0]
         user_id = path_parts[2]
@@ -174,7 +185,6 @@ def extract_post_info(url_string):

     return None, None, None

-
 def get_link_platform(url):
     """
     Identifies the platform of a given URL based on its domain.
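Given the branches above, the supported URL shapes map to return tuples roughly as follows. The Discord IDs are the sample values from the diff's own comment; the creator-page result is inferred from the docstring, since the middle of the function is elided in this hunk:

```python
# Creator page (no /post/ segment): post ID comes back as None.
extract_post_info("https://kemono.cr/patreon/user/123456")
# -> ('patreon', '123456', None)

# Single post.
extract_post_info("https://kemono.cr/patreon/user/123456/post/987654")
# -> ('patreon', '123456', '987654')

# Discord server/channel (the new branch).
extract_post_info("https://kemono.cr/discord/server/891670433978531850/1252332668805189723")
# -> ('discord', '891670433978531850', '1252332668805189723')
```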
@@ -391,6 +391,10 @@ def setup_ui(main_app):
     main_app.link_search_button.setVisible(False)
     main_app.link_search_button.setFixedWidth(int(30 * scale))
     log_title_layout.addWidget(main_app.link_search_button)
+    main_app.discord_scope_toggle_button = QPushButton("Scope: Files")
+    main_app.discord_scope_toggle_button.setVisible(False) # Hidden by default
+    main_app.discord_scope_toggle_button.setFixedWidth(int(140 * scale))
+    log_title_layout.addWidget(main_app.discord_scope_toggle_button)
     main_app.manga_rename_toggle_button = QPushButton()
     main_app.manga_rename_toggle_button.setVisible(False)
     main_app.manga_rename_toggle_button.setFixedWidth(int(140 * scale))
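The scope button is created hidden, like the manga rename toggle beside it, and its caption suggests it switches what gets downloaded from a Discord channel. A hedged sketch of such a two-state toggle; the second caption and the click handler are assumptions, since this diff only shows the button's creation:

```python
from PyQt5.QtWidgets import QPushButton

SCOPES = ("Scope: Files", "Scope: Messages")  # second caption is a guess

def make_discord_scope_toggle():
    """Build the toggle; assumes a QApplication is already running."""
    button = QPushButton(SCOPES[0])
    button.setVisible(False)  # shown only when a Discord URL is active, presumably

    def _flip():
        # Advance to the other caption on each click.
        button.setText(SCOPES[(SCOPES.index(button.text()) + 1) % 2])

    button.clicked.connect(_flip)
    return button
```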