18 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Yuvi9587 | f41f354737 | Update main_window.py | 2025-07-13 21:46:34 -07:00 |
| Yuvi9587 | 6b57ee099d | Commit | 2025-07-13 21:45:30 -07:00 |
| Yuvi9587 | 21ecb60cb5 | commit | 2025-07-13 20:21:17 -07:00 |
| Yuvi9587 | ee00019f2e | Update workers.py | 2025-07-13 18:42:56 -07:00 |
| Yuvi9587 | d49c739fe4 | Commit | 2025-07-13 10:36:52 -07:00 |
| Yuvi9587 | dbdf82a079 | Commit | 2025-07-13 10:22:06 -07:00 |
| Yuvi9587 | f0bf74da16 | Update readme.md | 2025-07-11 01:30:07 -07:00 |
| Yuvi9587 | e8b655e492 | Update readme.md | 2025-07-11 01:28:48 -07:00 |
| Yuvi9587 | 4f383910d2 | Update readme.md | 2025-07-11 01:26:57 -07:00 |
| Yuvi9587 | 404c4ca59a | commit | 2025-07-11 01:24:56 -07:00 |
| Yuvi9587 | bcf26bea20 | Commit | 2025-07-11 01:24:12 -07:00 |
| Yuvi9587 | fa198c41c1 | Commit | 2025-07-10 09:59:51 -07:00 |
| Yuvi9587 | f214d2452e | Update features.md | 2025-07-08 13:14:46 +05:30 |
| Yuvi9587 | f39b510577 | Update features.md | 2025-07-08 13:03:24 +05:30 |
| Yuvi9587 | 2c45c14696 | Commit | 2025-07-08 13:01:21 +05:30 |
| Yuvi9587 | aa2305c10e | Commit | 2025-07-07 14:10:52 +05:30 |
| Yuvi9587 | 568c687f98 | Update note.md | 2025-07-06 17:34:04 +05:30 |
| Yuvi9587 | c8b77fb0d7 | Commit | 2025-07-05 06:02:21 +05:30 |
25 changed files with 7902 additions and 1812 deletions


@@ -0,0 +1,97 @@
Fonts are (c) Bitstream (see below). DejaVu changes are in public domain.
Glyphs imported from Arev fonts are (c) Tavmjong Bah (see below)
Bitstream Vera Fonts Copyright
------------------------------
Copyright (c) 2003 by Bitstream, Inc. All Rights Reserved. Bitstream Vera is
a trademark of Bitstream, Inc.
Permission is hereby granted, free of charge, to any person obtaining a copy
of the fonts accompanying this license ("Fonts") and associated
documentation files (the "Font Software"), to reproduce and distribute the
Font Software, including without limitation the rights to use, copy, merge,
publish, distribute, and/or sell copies of the Font Software, and to permit
persons to whom the Font Software is furnished to do so, subject to the
following conditions:
The above copyright and trademark notices and this permission notice shall
be included in all copies of one or more of the Font Software typefaces.
The Font Software may be modified, altered, or added to, and in particular
the designs of glyphs or characters in the Fonts may be modified and
additional glyphs or characters may be added to the Fonts, only if the fonts
are renamed to names not containing either the words "Bitstream" or the word
"Vera".
This License becomes null and void to the extent applicable to Fonts or Font
Software that has been modified and is distributed under the "Bitstream
Vera" names.
The Font Software may be sold as part of a larger software package but no
copy of one or more of the Font Software typefaces may be sold by itself.
THE FONT SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF COPYRIGHT, PATENT,
TRADEMARK, OR OTHER RIGHT. IN NO EVENT SHALL BITSTREAM OR THE GNOME
FOUNDATION BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, INCLUDING
ANY GENERAL, SPECIAL, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES,
WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF
THE USE OR INABILITY TO USE THE FONT SOFTWARE OR FROM OTHER DEALINGS IN THE
FONT SOFTWARE.
Except as contained in this notice, the names of Gnome, the Gnome
Foundation, and Bitstream Inc., shall not be used in advertising or
otherwise to promote the sale, use or other dealings in this Font Software
without prior written authorization from the Gnome Foundation or Bitstream
Inc., respectively. For further information, contact: fonts at gnome dot
org.
Arev Fonts Copyright
------------------------------
Copyright (c) 2006 by Tavmjong Bah. All Rights Reserved.
Permission is hereby granted, free of charge, to any person obtaining
a copy of the fonts accompanying this license ("Fonts") and
associated documentation files (the "Font Software"), to reproduce
and distribute the modifications to the Bitstream Vera Font Software,
including without limitation the rights to use, copy, merge, publish,
distribute, and/or sell copies of the Font Software, and to permit
persons to whom the Font Software is furnished to do so, subject to
the following conditions:
The above copyright and trademark notices and this permission notice
shall be included in all copies of one or more of the Font Software
typefaces.
The Font Software may be modified, altered, or added to, and in
particular the designs of glyphs or characters in the Fonts may be
modified and additional glyphs or characters may be added to the
Fonts, only if the fonts are renamed to names not containing either
the words "Tavmjong Bah" or the word "Arev".
This License becomes null and void to the extent applicable to Fonts
or Font Software that has been modified and is distributed under the
"Tavmjong Bah Arev" names.
The Font Software may be sold as part of a larger software package but
no copy of one or more of the Font Software typefaces may be sold by
itself.
THE FONT SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT
OF COPYRIGHT, PATENT, TRADEMARK, OR OTHER RIGHT. IN NO EVENT SHALL
TAVMJONG BAH BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
INCLUDING ANY GENERAL, SPECIAL, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL
DAMAGES, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF THE USE OR INABILITY TO USE THE FONT SOFTWARE OR FROM
OTHER DEALINGS IN THE FONT SOFTWARE.
Except as contained in this notice, the name of Tavmjong Bah shall not
be used in advertising or otherwise to promote the sale, use or other
dealings in this Font Software without prior written authorization
from Tavmjong Bah. For further information, contact: tavmjong @ free
. fr.

9 binary files changed (contents not shown).


@@ -1,391 +1,192 @@
# Kemono Downloader - Feature Guide

This guide provides a comprehensive overview of all user interface elements, input fields, buttons, popups, and functionalities available in the Kemono Downloader.

## 1. Main Interface & Workflow

These are the primary controls you'll interact with to initiate and manage downloads.

### 1.1. Core Inputs
**🔗 Creator/Post URL Input Field**
- **Purpose**: Paste the URL of the content you want to download.
- **Supported Sites**: Kemono.su, Coomer.party, Simpcity.su.
- **Supported URL Types**:
  - Creator pages (e.g., `https://kemono.su/patreon/user/12345`).
  - Individual posts (e.g., `https://kemono.su/patreon/user/12345/post/98765`).
- **Note**: When ⭐ Favorite Mode is active, this field is disabled. For Simpcity.su URLs, the "Use Cookie" option is mandatory and auto-enabled.

**🎨 Creator Selection Button**
- **Icon**: 🎨 (Artist Palette)
- **Purpose**: Opens the "Creator Selection" dialog to browse and queue downloads from known creators.
- **Dialog Features**:
  - Loads creators from `creators.json`.
  - **Search Bar**: Filter creators by name.
  - **Creator List**: Displays creators with their service (e.g., Patreon, Fanbox).
  - **Selection**: Checkboxes to select one or more creators.
  - **Download Scope**: Organize downloads by Characters or Creators.
  - **Add to Queue**: Adds selected creators or their posts to the download queue.

**Page Range (Start to End) Input Fields**
- **Purpose**: Specify a range of pages to fetch for creator URLs.
- **Usage**: Enter the starting and ending page numbers.
- **Behavior**:
  - If blank, all pages are processed.
  - Disabled for single post URLs.

**📁 Download Location Input Field & Browse Button**
- **Purpose**: Specify the main directory for downloaded files.
- **Usage**: Type the path or click "Browse..." to select a folder.
- **Requirement**: Mandatory for all download operations.
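The two supported URL shapes (creator page vs. single post) can be told apart with a small parser. The helper below is a hypothetical sketch for illustration, not the app's actual code:

```python
import re

# Hypothetical helper: classify a supported URL into its parts.
# URL shapes are taken from the guide, e.g.
#   https://kemono.su/patreon/user/12345            (creator page)
#   https://kemono.su/patreon/user/12345/post/98765 (single post)
_URL_RE = re.compile(
    r"https?://(?P<host>[^/]+)/(?P<service>[^/]+)/user/(?P<user>[^/?#]+)"
    r"(?:/post/(?P<post>[^/?#]+))?"
)

def classify_url(url):
    """Return host/service/user/post for a creator or post URL, else None."""
    m = _URL_RE.match(url.strip())
    if not m:
        return None
    parts = m.groupdict()
    parts["kind"] = "post" if parts["post"] else "creator"
    return parts
```
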
### 1.2. Action Buttons

**⬇️ Start Download / 🔗 Extract Links Button**
- **Purpose**: Initiates downloading or link extraction.
- **Behavior**:
  - Shows "🔗 Extract Links" if "Only Links" is selected.
  - Otherwise, shows "⬇️ Start Download".
  - Supports single-threaded or multi-threaded downloads based on settings.

**🔄 Restore Download Button**
- **Visibility**: Appears if an incomplete session is detected on startup.
- **Purpose**: Resumes a previously interrupted download session.

**⏸️ Pause / ▶️ Resume Download Button**
- **Purpose**: Pause or resume the ongoing download.
- **Behavior**: Toggles between "Pause" and "Resume". Some UI settings can be changed while paused.

**❌ Cancel & Reset UI Button**
- **Purpose**: Stops the current operation and performs a "soft" reset.
- **Behavior**: Halts background threads, preserves URL and Download Location inputs, resets other settings.

**🔄 Reset Button (in the log area)**
- **Purpose**: Performs a "hard" reset when no operation is active.
- **Behavior**: Clears all inputs, resets options to default, and clears logs.
## 2. Filtering & Content Selection

These options allow precise control over downloaded content.

### 2.1. Content Filtering

**🎯 Filter by Character(s) Input Field**
- **Purpose**: Download content related to specific characters or series.
- **Usage**: Enter comma-separated character names.
- **Advanced Syntax**:
  - `Nami`: Simple filter.
  - `(Vivi, Ulti)`: Grouped filter. Matches posts with "Vivi" OR "Ulti". Creates a shared folder like `Vivi Ulti` if subfolders are enabled.
  - `(Boa, Hancock)~`: Aliased filter. Treats "Boa" and "Hancock" as the same entity.

**Filter: [Type] Button (Character Filter Scope)**
- **Purpose**: Defines where the character filter is applied. Cycles on click.
- **Options**:
  - **Filter: Title** (Default): Matches post titles.
  - **Filter: Files**: Matches filenames.
  - **Filter: Both**: Checks title first, then filenames.
  - **Filter: Comments (Beta)**: Checks filenames, then post comments.

**🚫 Skip with Words Input Field**
- **Purpose**: Exclude posts/files with specified keywords (e.g., `WIP`, `sketch`).

**Scope: [Type] Button (Skip Words Scope)**
- **Purpose**: Defines where skip words are applied. Cycles on click.
- **Options**:
  - **Scope: Posts** (Default): Skips posts if the title contains a skip word.
  - **Scope: Files**: Skips files if the filename contains a skip word.
  - **Scope: Both**: Applies both rules.

**✂️ Remove Words from Name Input Field**
- **Purpose**: Remove unwanted text from filenames (e.g., `patreon`, `[HD]`).

### 2.2. File Type Filtering

**Filter Files (Radio Buttons)**
- **Purpose**: Select file types to download.
- **Options**:
  - **All**: All file types.
  - **Images/GIFs**: Common image formats.
  - **Videos**: Common video formats.
  - **🎧 Only Audio**: Common audio formats.
  - **📦 Only Archives**: Only `.zip` and `.rar` files.
  - **🔗 Only Links**: Extracts external links without downloading files.

**Skip .zip / Skip .rar Checkboxes**
- **Purpose**: Skip downloading `.zip` or `.rar` files.
- **Behavior**: Disabled when "📦 Only Archives" is active.
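The advanced character-filter syntax can be sketched as a small parser. This is an illustrative reading of the syntax only; the function name and output shape are assumptions, not the app's actual implementation:

```python
import re

def parse_character_filter(text):
    """Parse a comma-separated filter string into entries.

    Syntax, per the guide:
      Nami             -> a single term
      (Vivi, Ulti)     -> grouped terms, shared folder "Vivi Ulti"
      (Boa, Hancock)~  -> aliased group, treated as one entity
    """
    entries = []
    # Match either a parenthesized group (optionally ~-suffixed) or a bare term.
    for part in re.findall(r"\([^)]*\)~?|[^,()]+", text):
        part = part.strip()
        if not part:
            continue
        aliased = part.endswith("~")
        if part.startswith("("):
            names = [n.strip() for n in part.strip("~()").split(",") if n.strip()]
        else:
            names = [part]
        entries.append({"names": names, "folder": " ".join(names), "aliased": aliased})
    return entries
```
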
## 3. Download Customization

Options to refine the download process and output.

- **Download Thumbnails Only**: Downloads small preview images instead of full-resolution files.
- **Scan Content for Images**: Scans post HTML for `<img>` tags, crucial for images embedded in descriptions.
- **Compress to WebP**: Converts images to WebP format (requires the Pillow library).
- **Keep Duplicates**: Normally, if a post contains multiple files with the same name, only the first is downloaded. Checking this option downloads all of them, renaming subsequent files with a numeric suffix (e.g., `image_1.jpg`).
- **🗄️ Custom Folder Name (Single Post Only)**: Specify a custom folder name for a single post's content (appears only if subfolders are enabled).
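The "Keep Duplicates" renaming rule can be sketched as follows. This is a hypothetical helper (the app's real code may differ), and for simplicity it does not guard against a post that already contains a file literally named `image_1.jpg`:

```python
import os

def dedupe_name(name, seen):
    """Return a unique filename, tracking occurrences of each base name in `seen`.

    First occurrence keeps its name; later ones get _1, _2, ... suffixes.
    """
    count = seen.get(name, 0)
    seen[name] = count + 1
    if count == 0:
        return name
    stem, ext = os.path.splitext(name)
    return f"{stem}_{count}{ext}"
```
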
## 4. 📖 Manga/Comic Mode

A mode for downloading creator feeds in chronological order, ideal for sequential content.

- **Activation**: Active when downloading a creator's entire feed (not a single post).
- **Core Behavior**: Fetches all posts, processing from oldest to newest.
- **Filename Style Toggle Button (in the log area)**:
  - **Purpose**: Controls file naming in Manga Mode. Cycles on click.
  - **Options**:
    - **Name: Post Title**: First file named after the post title; others keep original names.
    - **Name: Original File**: Files keep server-provided names, with an optional prefix.
    - **Name: Title+G.Num**: Global numbering with post title prefix (e.g., `Chapter 1_001.jpg`).
    - **Name: Date Based**: Sequential naming by post date (e.g., `001.jpg`), with an optional prefix.
    - **Name: Post ID**: Files named after the post ID to avoid clashes.
    - **Name: Date + Title**: Combines post date and title for filenames.
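Several of these filename styles reduce to simple formatting rules over a global file counter. A minimal sketch, with the style keys and function signature assumed for illustration only:

```python
def manga_filename(style, post_title, post_date, ext, counter, prefix=""):
    """Build a filename for the given style; `counter` is the global file index."""
    if style == "title_gnum":      # e.g. "Chapter 1_001.jpg"
        return f"{post_title}_{counter:03d}{ext}"
    if style == "date_based":      # e.g. "001.jpg" (posts processed oldest-first)
        return f"{prefix}{counter:03d}{ext}"
    if style == "date_title":      # e.g. "2025-07-11 Chapter 1.jpg"
        return f"{post_date} {post_title}{ext}"
    raise ValueError(f"unknown style: {style}")
```

Because `title_gnum` and `date_based` depend on a single counter running across all posts, the guide notes that these styles disable post-level multithreading: numbering must be assigned sequentially.
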
## 5. Folder Organization & Known.txt

Controls for structuring downloaded content.

- **Separate Folders by Name/Title Checkbox**: Enables automatic subfolder creation.
- **Subfolder per Post Checkbox**: Creates subfolders for each post, named after the post title.
- **Date Prefix for Post Subfolders Checkbox**: When used with "Subfolder per Post," prefixes the folder name with the post's upload date (e.g., `2025-07-11 Post Title`), allowing chronological sorting.
- **Known.txt Management UI (Bottom Left)**:
  - **Purpose**: Manages a local `Known.txt` file of series, characters, or terms used in folder creation.
  - **List Display**: Shows primary names from `Known.txt`.
  - **Add Button**: Adds names or groups (e.g., `(Character A, Alias B)~`).
  - **⤵️ Add to Filter Button**: Select names from `Known.txt` for the character filter.
  - **🗑️ Delete Selected Button**: Removes selected names from `Known.txt`.
  - **Open Known.txt Button**: Opens the file in the default text editor.
  - **❓ Help Button**: Opens this feature guide.
  - **📜 History Button**: Views recent download history.
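The date-prefixed post subfolder naming can be sketched like this; the sanitization rules shown are an assumption (the app's exact character handling may differ):

```python
import re

def post_subfolder(title, date, date_prefix=True):
    """Build a filesystem-safe subfolder name like '2025-07-11 Post Title'.

    Assumption: strip characters invalid on common filesystems, then
    optionally prepend the post's upload date for chronological sorting.
    """
    safe = re.sub(r'[<>:"/\\|?*]', "", title).strip() or "untitled"
    return f"{date} {safe}" if date_prefix else safe
```
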
## 6. ⭐ Favorite Mode (Kemono.su Only)

Download from favorited artists/posts on Kemono.su.

- **Enable Checkbox ("⭐ Favorite Mode")**:
  - Switches to Favorite Mode.
  - Disables the main URL input.
  - Changes the action buttons to "Favorite Artists" and "Favorite Posts".
  - Requires cookies.
- **🖼️ Favorite Artists Button**: Select and download from favorited artists.
- **📄 Favorite Posts Button**: Select and download specific favorited posts.
- **Favorite Download Scope Button**:
  - **Scope: Selected Location**: Downloads favorites directly to the main directory.
  - **Scope: Artist Folders**: Creates a subfolder per artist.
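The two download scopes amount to a choice of target directory. A trivial sketch (function name assumed for illustration):

```python
import os

def favorite_target_dir(base_dir, artist, scope):
    """Return the directory a favorite download goes into for the given scope."""
    if scope == "artist_folders":
        # e.g. DownloadLocation/ArtistName/
        return os.path.join(base_dir, artist)
    # "selected_location": everything goes straight into the main directory.
    return base_dir
```
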
## 7. Advanced Settings & Performance

- **🍪 Cookie Management**:
  - **Use Cookie Checkbox**: Enables cookies for restricted content.
  - **Cookie Text Field**: Paste a cookie string.
  - **Browse... Button**: Select a `cookies.txt` file (Netscape format).
- **Use Multithreading Checkbox & Threads Input**:
  - **Purpose**: Configures the number of simultaneous operations.
  - **Behavior**: Sets concurrent post processing (creator feeds) or file downloads (single posts).
- **Multi-part Download Toggle Button**:
  - **Purpose**: Enables/disables multi-segment downloading for large files.
  - **Note**: Best for large files; less efficient for many small files.
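Netscape-format `cookies.txt` files can be read with Python's standard library; this is one plausible way an app like this could load them (the file contents below are a made-up sample):

```python
import http.cookiejar
import os
import tempfile

# Write a minimal sample cookies.txt in Netscape format:
# domain <TAB> include-subdomains <TAB> path <TAB> secure <TAB> expiry <TAB> name <TAB> value
sample = "\n".join([
    "# Netscape HTTP Cookie File",
    "kemono.su\tFALSE\t/\tTRUE\t2147483647\tsession\tabc123",
    "",
])
path = os.path.join(tempfile.mkdtemp(), "cookies.txt")
with open(path, "w") as f:
    f.write(sample)

# MozillaCookieJar understands the Netscape format directly.
jar = http.cookiejar.MozillaCookieJar(path)
jar.load()  # raises LoadError on malformed files
cookies = {c.name: c.value for c in jar}
```
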
## 8. Logging, Monitoring & Error Handling

- **📜 Progress Log Area**: Displays messages, progress, and errors.
- **👁️ / 🙈 Log View Toggle Button**: Switches between the Progress Log and the Missed Character Log (skipped posts).
- **Show External Links in Log**: Displays external links (e.g., Mega, Google Drive) in a secondary panel.
- **Export Links Button**: Saves extracted links to a `.txt` file in "Only Links" mode.
- **Download Extracted Links Button**: Downloads files from supported external links in "Only Links" mode.
- **🆘 Error Button & Dialog**:
  - **Purpose**: Active if files fail to download. The button displays a live count of failed files (e.g., **(3) Error**).
  - **Dialog Features**:
    - Lists failed files.
    - Retry failed downloads.
    - Export failed URLs to a text file.
- **Separate Folders by Name/Title Checkbox:** ## 9. Application Settings (⚙️)
- **Purpose:** Creates subfolders within the main "Download Location" based on matching criteria. - **Appearance**: Switch between Light and Dark themes.  
- **Behavior:** - **Language**: Change UI language (restart required).
- If "**Filter by Character(s)**" is used, folders are named after the matched character(s)/group(s).
- If no character filter matches (or no filter is active), but the post title matches an entry in `Known.txt`, a folder named after the `Known.txt` entry is created.
- If neither of the above, and this option is checked, folders might be created based on post titles directly (behavior can vary).
- **Subfolder per Post Checkbox:**
- **Purpose:** Creates an additional layer of subfolders, where each individual post's content goes into its own subfolder.
- **Behavior:** Only active if "**Separate Folders by Name/Title**" is also checked. The post subfolder will be created *inside* the character/title folder. Folder names are typically derived from sanitized post titles or IDs.
- **`Known.txt` Management UI (Bottom Left of UI):**
- **Purpose:** Manages a local list (`Known.txt` file in the app directory) of series, characters, or general terms used for automatic folder organization and character filter suggestions.
- **Elements:**
- **List Display:** Shows the primary names from your `Known.txt` file.
- **Add New Input Field:** Enter a new name or group to add to `Known.txt`.
- Simple Name: e.g., `My Series`
- Group (creates separate entries in `Known.txt`): e.g., `(Vivi, Ulti, Uta)`
- Group with Aliases (single entry in `Known.txt` with `~`): e.g., `(Boa, Hancock)~`
- ** Add Button:** Adds the entry from the "Add New" field to `Known.txt` and refreshes the list.
- **⤵️ Add to Filter Button:** Opens a dialog displaying all entries from `Known.txt` (with a search bar). Select one or more entries to add them to the "**🎯 Filter by Character(s)**" input field. Grouped names from `Known.txt` are added with the `~` syntax if applicable.
- **🗑️ Delete Selected Button:** Removes the currently selected name(s) from the list display and from the `Known.txt` file.
- **Open Known.txt Button:** Opens your `Known.txt` file in the system's default text editor for manual editing.
- **❓ Help Button:** Opens a guide or tooltip explaining the app feature
---
## ⭐ Favorite Mode (Kemono.su Only)
Download directly from your favorited artists and posts on Kemono.su.
- **Enable Checkbox ("⭐ Favorite Mode"):**
- **Location:** Usually near the "🔗 Only Links" filter option.
- **Purpose:** Switches the downloader to operate on your Kemono.su favorites.
- **UI Changes upon Enabling:**
- The "🔗 Kemono Creator/Post URL" input field is disabled/replaced with a "Favorite Mode active" message.
- The main action buttons change to "**🖼️ Favorite Artists**" and "**📄 Favorite Posts**".
- The "**🍪 Use Cookie**" option is automatically enabled and locked, as cookies are required to access your favorites.
- **🖼️ Favorite Artists Button & Dialog:**
- **Purpose:** Fetches and allows you to download content from artists you have favorited on Kemono.su.
- **Dialog Features:**
- Fetches the list of your favorited artists.
- **Search Bar:** Filter artists by name.
- **Artist List:** Displays favorited artists.
- **Select All / Deselect All:** Convenience buttons for selection.
- **"Download Selected" Button:** Queues all posts from the selected artists for download, respecting current filter settings.
- **📄 Favorite Posts Button & Dialog:**
- **Purpose:** Fetches and allows you to download specific posts you have favorited on Kemono.su.
- **Dialog Features:**
- Fetches the list of your favorited posts, usually grouped by artist and sorted by date.
- **Search Bar:** Filter posts by title, creator name, ID, or service.
- **Post List:** Displays favorited posts. Known names from your `Known.txt` may be highlighted in post titles for easier identification.
- **Select All / Deselect All:** Convenience buttons for selection.
- **"Download Selected" Button:** Queues the selected individual posts for download, respecting current filter settings.
- **Favorite Download Scope Button (Location may vary, often near Favorite Posts button):**
- **Purpose:** Determines the folder structure for downloads initiated via Favorite Mode.
- **Options:**
- `Scope: Selected Location`: All selected favorites (artists or posts) are downloaded directly into the main "📁 Download Location". Global filters apply.
- `Scope: Artist Folders`: A subfolder is created for each artist within the main "📁 Download Location" (e.g., `DownloadLocation/ArtistName/`). Content from that artist (whether a full artist download or specific favorited posts from them) goes into their respective subfolder. Filters apply within each artist's context.
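The two scope options map onto output directories roughly as below; `resolve_target_dir` and `sanitize` are hypothetical helpers, not the app's actual API:

```python
import re
from pathlib import Path

def sanitize(name: str) -> str:
    """Strip characters that are invalid in folder names on most filesystems."""
    return re.sub(r'[<>:"/\\|?*]', "", name).strip() or "untitled"

def resolve_target_dir(base: Path, scope: str, artist_name: str) -> Path:
    if scope == "artist_folders":
        # Scope: Artist Folders -> DownloadLocation/ArtistName/
        return base / sanitize(artist_name)
    # Scope: Selected Location -> everything directly in DownloadLocation/
    return base
```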
---
## Advanced & Performance
- **🍪 Cookie Management:**
- **Use Cookie Checkbox:** Enables the use of browser cookies for accessing content that might be restricted or require login (e.g., certain posts, Favorite Mode).
- **Cookie Text Field:**
- **Purpose:** Directly paste your cookie string.
- **Format:** Standard HTTP cookie string format (e.g., `name1=value1; name2=value2`).
- **Browse... Button (for Cookies):**
- **Purpose:** Select a `cookies.txt` file from your system.
- **Format:** Must be in Netscape cookie file format.
- **Behavior:**
- The text field takes precedence if filled.
- If "Use Cookie" is checked and both the text field and browsed file path are empty, the application will attempt to automatically load a `cookies.txt` file from its root directory.
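The precedence rules above might look like this in code; `load_cookies` and its parameters are illustrative assumptions, not the app's actual API:

```python
import os
from http.cookiejar import MozillaCookieJar

def load_cookies(cookie_string: str, browsed_path: str, app_dir: str):
    """Resolve cookies: pasted string > browsed file > app-dir cookies.txt."""
    if cookie_string.strip():
        # "name1=value1; name2=value2" -> {"name1": "value1", "name2": "value2"}
        return dict(pair.strip().split("=", 1)
                    for pair in cookie_string.split(";") if "=" in pair)
    path = browsed_path or os.path.join(app_dir, "cookies.txt")
    if os.path.isfile(path):
        jar = MozillaCookieJar(path)  # Netscape cookie file format
        jar.load(ignore_discard=True, ignore_expires=True)
        return {cookie.name: cookie.value for cookie in jar}
    return None  # "Use Cookie" checked but nothing usable found

print(load_cookies("name1=value1; name2=value2", "", "."))
```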
- **Use Multithreading Checkbox & Threads Input Field:**
- **Purpose:** Enable and configure the number of simultaneous operations to potentially speed up downloads.
- **Behavior:**
- **Creator Feeds:** The "Threads" input controls how many posts are processed concurrently.
- **Single Post URLs:** The "Threads" input controls how many files from that single post are downloaded concurrently.
- **Note:** Setting too high a number might lead to API rate-limiting or instability.
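The threading behavior described above — N posts (or N files of a single post) in flight at once — can be sketched with the standard library's `ThreadPoolExecutor`; `download_post` is a hypothetical callable, and this is not the app's actual worker code:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def process_posts(posts, download_post, num_threads=4):
    """Run download_post over posts with num_threads concurrent workers."""
    results = []
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = {pool.submit(download_post, post): post for post in posts}
        for future in as_completed(futures):
            try:
                results.append(future.result())
            except Exception as exc:
                # One failed post should not abort the whole session.
                print(f"Post {futures[future]!r} failed: {exc}")
    return results
```

A modest `num_threads` (e.g. 4) is usually plenty; very high values invite the rate limiting mentioned in the note above.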
- **Multi-part Download Toggle Button (located in the log area):**
- **Purpose:** Enables/disables multi-segment downloading for individual large files.
- **Options:**
- `Multi-part: ON`: Large files are split into multiple parts that are downloaded simultaneously and then reassembled. Can significantly speed up downloads for single large files but may increase UI choppiness or log spam with many small files.
- `Multi-part: OFF` (Default): Files are downloaded as a single stream.
- **Behavior:** Disabled if "🔗 Only Links" or "📦 Only Archives" mode is active.
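Conceptually, multi-part mode splits one large file into byte ranges, fetches them in parallel with HTTP `Range` requests, and reassembles them in order. A stdlib-only sketch (assuming the server supports range requests; not the app's actual implementation):

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor

def split_ranges(total_size: int, parts: int):
    """Split [0, total_size) into `parts` contiguous, ordered byte ranges."""
    chunk = total_size // parts
    return [(i * chunk,
             total_size - 1 if i == parts - 1 else (i + 1) * chunk - 1)
            for i in range(parts)]

def download_multipart(url: str, out_path: str, parts: int = 4) -> None:
    head = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(head) as resp:
        total = int(resp.headers["Content-Length"])

    def fetch(byte_range):
        start, end = byte_range
        req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
        with urllib.request.urlopen(req) as part:
            return part.read()

    with ThreadPoolExecutor(max_workers=parts) as pool:
        segments = list(pool.map(fetch, split_ranges(total, parts)))  # map keeps order
    with open(out_path, "wb") as f:
        for segment in segments:
            f.write(segment)
```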
---
## Logging & Monitoring
- **📜 Progress Log / Extracted Links Log Area:**
- **Purpose:** The main text area displaying detailed messages about the ongoing process.
- **Content:** Shows download progress for each file, errors encountered, skipped items, summary information, or extracted links (if in "🔗 Only Links" mode).
- **👁️ / 🙈 Log View Toggle Button:**
- **Purpose:** Switches the content displayed in the main log area.
- **Views:**
- `👁️ Progress Log` (Default): Shows all download activity, errors, and general progress messages.
- `🙈 Missed Character Log`: Shows a list of key terms intelligently extracted from post titles or content that were skipped due to the "**🎯 Filter by Character(s)**" not matching. Useful for identifying characters you might want to add to your filter or `Known.txt`.
- **Show External Links in Log Checkbox & Panel:**
- **Purpose:** If checked, a secondary, smaller log panel appears (usually below the main log) that specifically displays any external links (e.g., to Mega, Google Drive) found in post descriptions.
- **Behavior:** Disabled if "🔗 Only Links" or "📦 Only Archives" mode is active (as "Only Links" uses the main log, and archives typically don't have such external links processed).
- **Export Links Button:**
- **Visibility:** Appears when the "**🔗 Only Links**" filter mode is active.
- **Purpose:** Saves all the links extracted and displayed in the main log area to a `.txt` file.
- **Progress Labels/Bars:**
- **Purpose:** Provide a visual and textual representation of the download progress.
- **Typically Includes:**
- Overall post progress (e.g., "Post 5 of 20").
- Individual file download status (e.g., "Downloading file.zip... 50% at 1.2 MB/s").
- Summary statistics at the end of a session (total downloaded, skipped, failed).
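The status line format in the example above can be reproduced with a tiny helper (illustrative only):

```python
def fmt_progress(filename: str, received: int, total: int, speed_bps: float) -> str:
    """Render a line like 'Downloading file.zip... 50% at 1.2 MB/s'."""
    pct = received * 100 // total if total else 0
    return f"Downloading {filename}... {pct}% at {speed_bps / 1_048_576:.1f} MB/s"

print(fmt_progress("file.zip", 524288, 1048576, 1.2 * 1_048_576))
# Downloading file.zip... 50% at 1.2 MB/s
```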
---
## Error Handling & Retries
- **🆘 Error Button (Main UI):**
- **Location:** Typically near the main action buttons (e.g., Start, Pause, Cancel).
- **Purpose:** Becomes active if files failed to download during the last session (and were not successfully retried). Clicking it opens the "Files Skipped Due to Errors" dialog.
- **"Files Skipped Due to Errors" Dialog:**
- **File List:** Displays the files that encountered download errors. Each entry shows the filename and the post it came from (title and ID).
- **Checkboxes:** Allows selection of individual files from the list.
- **"Select All" Button:** Checks all files in the list.
- **"Retry Selected" Button:** Attempts to re-download all checked files.
- **"Export URLs to .txt" Button:**
- Opens an "Export Options" dialog.
- **"Link per line (URL only)":** Exports only the direct download URL for each failed file, one URL per line.
- **"Export with details (URL [Post, File info])":** Exports the URL followed by details like Post Title, Post ID, and Original Filename in brackets.
- Prompts the user to save the generated `.txt` file.
- **"OK" Button:** Closes the dialog.
- **Note:** Files successfully retried or skipped due to hash match during a retry attempt are removed from this error list.
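The two export formats might be generated like this; the dictionary keys and exact bracket layout are illustrative assumptions:

```python
def export_failed(files, path, with_details=False):
    """Write failed-download URLs to a .txt file, optionally with post details."""
    with open(path, "w", encoding="utf-8") as f:
        for item in files:
            if with_details:
                # "URL [Post: Title (ID: 123), File: name.ext]" per line
                f.write(f"{item['url']} [Post: {item['post_title']} "
                        f"(ID: {item['post_id']}), File: {item['filename']}]\n")
            else:
                f.write(item["url"] + "\n")  # one URL per line
```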
---
## ⚙️ Application Settings
These settings allow you to customize the application's appearance and language.
- **⚙️ Settings Button (Icon may vary, e.g., a gear ⚙️):**
- **Location:** Typically located in a persistent area of the UI, possibly near other global controls or in a menu.
- **Purpose:** Opens the "Settings" dialog.
- **Tooltip Example:** "Open application settings (Theme, Language, etc.)"
- **"Settings" Dialog:**
- **Title:** "Settings"
- **Purpose:** Provides options to configure application-wide preferences.
- **Sections:**
- **Appearance Group (`Appearance`):**
- **Theme Toggle Buttons/Options:**
- `Switch to Light Mode`
- `Switch to Dark Mode`
- **Purpose:** Allows users to switch between a light and dark visual theme for the application.
- **Tooltips:** Provide guidance on switching themes.
- **Language Settings Group (`Language Settings`):**
- **Language Selection Dropdown/List:**
- **Label:** "Language:"
- **Options:** Includes, but is not limited to:
- English (`English`)
- 日本語 (`日本語 (Japanese)`)
- Français (French)
- Español (Spanish)
- Deutsch (German)
- Русский (Russian)
- 한국어 (Korean)
- 简体中文 (Chinese Simplified)
- **Purpose:** Allows users to change the display language of the application interface.
- **Restart Prompt:** After changing the language, a dialog may appear:
- **Title:** "Language Changed"
- **Message:** "The language has been changed. A restart is required for all changes to take full effect."
- **Informative Text:** "Would you like to restart the application now?"
- **Buttons:** "Restart Now", "OK" (or similar to defer restart).
- **"OK" Button:** Saves the changes made in the Settings dialog and closes it.
---
## Other UI Elements
- **Retry Failed Downloads Prompt:**
- **Trigger:** Appears at the end of a download session if there were files that failed to download due to recoverable errors (e.g., network interruption, IncompleteRead).
- **Action:** Prompts the user if they want to attempt downloading the failed files again.
- **New Name Confirmation Dialog (for Character Filter & `Known.txt`):**
- **Trigger:** When new, unrecognized names or groups are used in the "**🎯 Filter by Character(s)**" field that are not present in `Known.txt`.
- **Action:** Prompts the user to confirm if they want to add these new names/groups to `Known.txt` with the appropriate formatting (simple, grouped, or aliased).
- **Onboarding Tour / Help Guide Button (❓):**
- **Purpose:** Opens a built-in help guide or an onboarding tour that explains the basic functionalities and UI elements of the application. Often linked to this detailed feature guide.
---
This guide should cover all interactive elements of the Kemono Downloader. If you have further questions or discover elements not covered, please refer to the main `readme.md` or consider opening an issue on the project's repository.

### main.py (29 lines changed)
```diff
@@ -9,24 +9,26 @@ from PyQt5.QtWidgets import QApplication, QDialog
 from PyQt5.QtCore import QCoreApplication

 # --- Local Application Imports ---
-# These imports reflect the new, organized project structure.
 from src.ui.main_window import DownloaderApp
 from src.ui.dialogs.TourDialog import TourDialog
 from src.config.constants import CONFIG_ORGANIZATION_NAME, CONFIG_APP_NAME_MAIN

+# --- Define APP_BASE_DIR globally and make available early ---
+if getattr(sys, 'frozen', False) and hasattr(sys, '_MEIPASS'):
+    APP_BASE_DIR = sys._MEIPASS
+else:
+    APP_BASE_DIR = os.path.abspath(os.path.dirname(__file__))
+
+# Optional: Set a global variable or pass it into modules if needed
+# Or re-export it via constants.py for cleaner imports
+
 def handle_uncaught_exception(exc_type, exc_value, exc_traceback):
     """
     Handles uncaught exceptions by logging them to a file for easier debugging,
     especially for bundled applications.
     """
-    # Determine the base directory for logging
-    if getattr(sys, 'frozen', False):
-        base_dir_for_log = os.path.dirname(sys.executable)
-    else:
-        base_dir_for_log = os.path.dirname(os.path.abspath(__file__))
-    log_dir = os.path.join(base_dir_for_log, "logs")
+    # Use APP_BASE_DIR to determine logging location
+    log_dir = os.path.join(APP_BASE_DIR, "logs")
     log_file_path = os.path.join(log_dir, "uncaught_exceptions.log")
     try:
@@ -57,41 +59,35 @@ def main():
     qt_app = QApplication(sys.argv)

-    # Create the main application window from its new module
+    # Create the main application window
     downloader_app_instance = DownloaderApp()

     # --- Window Sizing and Positioning ---
-    # Logic moved from the old main.py to set an appropriate initial size
     primary_screen = QApplication.primaryScreen()
     if not primary_screen:
-        # Fallback for systems with no primary screen detected
         downloader_app_instance.resize(1024, 768)
     else:
         available_geo = primary_screen.availableGeometry()
         screen_width = available_geo.width()
         screen_height = available_geo.height()

-        # Define minimums and desired ratios
         min_app_width, min_app_height = 960, 680
         desired_width_ratio, desired_height_ratio = 0.80, 0.85

         app_width = max(min_app_width, int(screen_width * desired_width_ratio))
         app_height = max(min_app_height, int(screen_height * desired_height_ratio))

-        # Ensure the window is not larger than the screen
         app_width = min(app_width, screen_width)
         app_height = min(app_height, screen_height)

         downloader_app_instance.resize(app_width, app_height)

-    # Show the main window and center it
+    # Show and center the main window
     downloader_app_instance.show()
     if hasattr(downloader_app_instance, '_center_on_screen'):
         downloader_app_instance._center_on_screen()

     # --- First-Run Welcome Tour ---
-    # Check if the tour should be shown and run it.
-    # This static method call keeps the logic clean and contained.
     if TourDialog.should_show_tour():
         tour_dialog = TourDialog(parent_app=downloader_app_instance)
         tour_dialog.exec_()
@@ -102,7 +98,6 @@ def main():
         sys.exit(exit_code)
     except SystemExit:
-        # Allow sys.exit() to work as intended
         pass
     except Exception as e:
         print("--- CRITICAL APPLICATION STARTUP ERROR ---")
```

### main_window_old.py (new file, 5529 lines)

File diff suppressed because it is too large.

````diff
@@ -13,10 +13,9 @@ This project used to be one giant messy App Script. It worked, but it was hard t
 ```
 KemonoDownloader/
 ├── main.py                  # Where the app starts
-├── requirements.txt         # List of Python libraries used
 ├── assets/                  # Icons and other static files
 │   └── Kemono.ico
-├── data/                    # Stuff that gets saved (user config, cookies, etc.)
+├── data/
 │   └── creators.json
 ├── logs/                    # Error logs and other output
 │   └── uncaught_exceptions.log
````

### readme.md (122 lines changed)

````diff
@@ -1,46 +1,45 @@
-<h1 align="center">Kemono Downloader v5.5.0</h1>
+<h1 align="center">Kemono Downloader v6.0.0</h1>

-<table align="center">
+<div align="center">
+<table>
   <tr>
     <td align="center">
-      <img src="Read/Read.png" alt="Default Mode" width="400"/><br>
+      <img src="Read/Read.png" alt="Default Mode" width="400"><br>
       <strong>Default</strong>
     </td>
     <td align="center">
-      <img src="Read/Read1.png" alt="Favorite Mode" width="400"/><br>
-      <strong>Favorite mode</strong>
+      <img src="Read/Read1.png" alt="Favorite Mode" width="400"><br>
+      <strong>Favorite Mode</strong>
     </td>
   </tr>
   <tr>
     <td align="center">
-      <img src="Read/Read2.png" alt="Single Post" width="400"/><br>
+      <img src="Read/Read2.png" alt="Single Post" width="400"><br>
       <strong>Single Post</strong>
     </td>
     <td align="center">
-      <img src="Read/Read3.png" alt="Manga/Comic Mode" width="400"/><br>
+      <img src="Read/Read3.png" alt="Manga/Comic Mode" width="400"><br>
       <strong>Manga/Comic Mode</strong>
     </td>
   </tr>
 </table>
+</div>

 ---

 A powerful, feature-rich GUI application for downloading content from **[Kemono.su](https://kemono.su)** (and its mirrors like kemono.party) and **[Coomer.party](https://coomer.party)** (and its mirrors like coomer.su).

-Built with PyQt5, this tool is designed for users who want deep filtering capabilities, customizable folder structures, efficient downloads, and intelligent automation, all within a modern and user-friendly graphical interface.
+Built with PyQt5, this tool is designed for users who want deep filtering capabilities, customizable folder structures, efficient downloads, and intelligent automation — all within a modern and user-friendly graphical interface.

-*This v5.0.0 release marks a significant feature milestone. Future updates are expected to be less frequent, focusing on maintenance and minor refinements.*
-*Update v5.2.0 introduces multi-language support, theme selection, and further UI refinements.*
-
-<p align="center">
-  <a href="features.md">
-    <img alt="Features" src="https://img.shields.io/badge/📚%20Full%20Feature%20List-FFD700?style=for-the-badge&logoColor=black&color=FFD700">
-  </a>
-  <a href="LICENSE">
-    <img alt="License" src="https://img.shields.io/badge/📝%20License-90EE90?style=for-the-badge&logoColor=black&color=90EE90">
-  </a>
-  <a href="note.md">
-    <img alt="Note" src="https://img.shields.io/badge/⚠️%20Important%20Note-FFCCCB?style=for-the-badge&logoColor=black&color=FFCCCB">
-  </a>
-</p>
+<div align="center">
+
+[![](https://img.shields.io/badge/📚%20Full%20Feature%20List-FFD700?style=for-the-badge&logoColor=black&color=FFD700)](features.md)
+[![](https://img.shields.io/badge/📝%20License-90EE90?style=for-the-badge&logoColor=black&color=90EE90)](LICENSE)
+[![](https://img.shields.io/badge/⚠️%20Important%20Note-FFCCCB?style=for-the-badge&logoColor=black&color=FFCCCB)](note.md)
+
+</div>

 ---
@@ -49,76 +48,109 @@ Built with PyQt5, this tool is designed for users who want deep filtering capabi
 Kemono Downloader offers a range of features to streamline your content downloading experience:

 - **User-Friendly Interface:** A modern PyQt5 GUI for easy navigation and operation.
 - **Flexible Downloading:**
   - Download content from Kemono.su (and mirrors) and Coomer.party (and mirrors).
   - Supports creator pages (with page range selection) and individual post URLs.
   - Standard download controls: Start, Pause, Resume, and Cancel.
 - **Powerful Filtering:**
   - **Character Filtering:** Filter content by character names. Supports simple comma-separated names and grouped names for shared folders.
   - **Keyword Skipping:** Skip posts or files based on specified keywords.
   - **Filename Cleaning:** Remove unwanted words or phrases from downloaded filenames.
   - **File Type Selection:** Choose to download all files, or limit to images/GIFs, videos, audio, or archives. Can also extract external links only.
 - **Customizable Downloads:**
   - **Thumbnails Only:** Option to download only small preview images.
   - **Content Scanning:** Scan post HTML for `<img>` tags and direct image links, useful for images embedded in descriptions.
   - **WebP Conversion:** Convert images to WebP format for smaller file sizes (requires Pillow library).
 - **Organized Output:**
   - **Automatic Subfolders:** Create subfolders based on character names (from filters or `Known.txt`) or post titles.
   - **Per-Post Subfolders:** Option to create an additional subfolder for each individual post.
 - **Manga/Comic Mode:**
   - Downloads posts from a creator's feed in chronological order (oldest to newest).
   - Offers various filename styling options for sequential reading (e.g., post title, original name, global numbering).
 - **⭐ Favorite Mode:**
   - Directly download from your favorited artists and posts on Kemono.su.
   - Requires a valid cookie and adapts the UI for easy selection from your favorites.
   - Supports downloading into a single location or artist-specific subfolders.
 - **Performance & Advanced Options:**
   - **Cookie Support:** Use cookies (paste string or load from `cookies.txt`) to access restricted content.
   - **Multithreading:** Configure the number of simultaneous downloads/post processing threads for improved speed.
 - **Logging:**
   - A detailed progress log displays download activity, errors, and summaries.
 - **Multi-language Interface:** Choose from several languages for the UI (English, Japanese, French, Spanish, German, Russian, Korean, Chinese Simplified).
 - **Theme Customization:** Selectable Light and Dark themes for user comfort.

 ---

-## ✨ What's New in v5.3.0
+## ✨ What's New in v6.0.0

-- **Multi-Creator Post Fetching & Queuing:**
-    - The **Creator Selection popup** (🎨 icon) has been significantly enhanced.
-    - After selecting multiple creators, you can now click a new "**Fetch Posts**" button.
-    - This will retrieve and display posts from all selected creators in a new view within the popup.
-    - You can then browse these fetched posts (with search functionality) and select individual posts.
-    - A new "**Add Selected Posts to Queue**" button allows you to add your chosen posts directly to the main download queue, streamlining the process of gathering content from multiple artists.
-    - The traditional "**Add Selected to URL**" button is still available if you prefer to populate the main URL field with creator names.
-- **Improved Favorite Download Queue Handling:**
-    - When items are added to the download queue from the Creator Selection popup, the main URL input field will now display a placeholder message (e.g., "{count} items in queue from popup").
-    - The queue is now more robustly managed, especially when interacting with the main URL input field after items have been queued from the popup.
+This release focuses on providing more granular control over file organization and improving at-a-glance status monitoring.
+
+### New Features
+
+- **Live Error Count on Button**
+  The **"Error" button** now dynamically displays the number of failed files during a download. Instead of opening the dialog, you can quickly see a live count like `(3) Error`, helping you track issues at a glance.
+
+- **Date Prefix for Post Subfolders**
+  A new checkbox labeled **"Date Prefix"** is now available in the advanced settings.
+  When enabled alongside **"Subfolder per Post"**, it prepends the post's upload date to the folder name (e.g., `2025-07-11 Post Title`).
+  This makes your downloads sortable and easier to browse chronologically.
+
+- **Keep Duplicates Within a Post**
+  A **"Keep Duplicates"** option has been added to preserve all files from a post — even if some have the same name.
+  Instead of skipping or overwriting, the downloader will save duplicates with numbered suffixes (e.g., `image.jpg`, `image_1.jpg`, etc.), which is especially useful when the same file name points to different media.
+
+### Bug Fixes
+
+- The downloader now correctly renames large `.part` files when completed, avoiding leftover temp files.
+- The list of failed files shown in the Error Dialog is now saved and restored with your session — so no errors get lost if you close the app.
+- Your selected download location is remembered, even after pressing the **Reset** button.
+- The **Cancel** button is now enabled when restoring a pending session, so you can abort stuck jobs more easily.
+- Internal cleanup logs (like "Deleting post cache") are now excluded from the final download summary for clarity.

 ---

-## ✨ What's New in v5.1.0
-- **Enhanced Error File Management**: The "Error" button now opens a dialog listing files that failed to download. This dialog includes:
-    - An option to **retry selected** failed downloads.
-    - A new **"Export URLs to .txt"** button, allowing users to save links of failed downloads either as "URL only" or "URL with details" (including post title, ID, and original filename).
-    - Fixed a bug where files skipped during retry (due to existing hash match) were not correctly removed from the error list.
-- **Improved UI Stability**: Addressed issues with UI state management to more accurately reflect ongoing download activities (including retries and external link downloads). This prevents the "Cancel" button from becoming inactive prematurely while operations are still running.
-
-## ✨ What's New in v5.2.0
-- **Multi-language Support:** The interface now supports multiple languages: English, Japanese, French, Spanish, German, Russian, Korean, and Chinese (Simplified). Select your preferred language in the new Settings dialog.
-- **Theme Selection:** Choose between Light and Dark application themes via the Settings dialog for a personalized viewing experience.
-- **Centralized Settings:** A new Settings dialog (accessible via a settings button, often with a gear icon) provides a dedicated space for language and appearance customizations.
-- **Internal Localization:** Introduced `languages.py` for managing UI translations, streamlining the addition of new languages by contributors.
+## 📅 Next Update Plans
+
+### 🔖 Post Tag Filtering (Planned for v6.1.0)
+
+A powerful new **"Filter by Post Tags"** feature is planned:
+
+- Filter and download content based on specific post tags.
+- Combine tag filtering with current filters (character, file type, etc.).
+- Use tag presets to automate frequent downloads.
+
+This will provide **much greater control** over what gets downloaded, especially for creators who use tags consistently.
+
+### 📁 Creator Download History (.json Save)
+
+To streamline incremental downloads, a new system will allow the app to:
+
+- Save a `.json` file with metadata about already-downloaded posts.
+- Compare that file on future runs, so only **new** posts are downloaded.
+- Avoid duplication, making regular syncs fast and efficient.
+
+Ideal for users managing large collections or syncing favorites regularly.

 ---

-## Installation
+## 💻 Installation

 ### Requirements
 - Python 3.6 or higher
 - pip (Python package installer)

 ### Install Dependencies
+Open your terminal or command prompt and run:
 ```bash
 pip install PyQt5 requests Pillow mega.py
````


```diff
@@ -9,6 +9,7 @@ STYLE_ORIGINAL_NAME = "original_name"
 STYLE_DATE_BASED = "date_based"
 STYLE_DATE_POST_TITLE = "date_post_title"
 STYLE_POST_TITLE_GLOBAL_NUMBERING = "post_title_global_numbering"
+STYLE_POST_ID = "post_id"  # Add this line
 MANGA_DATE_PREFIX_DEFAULT = ""

 # --- Download Scopes ---
@@ -94,6 +95,7 @@ FOLDER_NAME_STOP_WORDS = {
     "me", "my", "net", "not", "of", "on", "or", "org", "our",
     "s", "she", "so", "the", "their", "they", "this",
     "to", "ve", "was", "we", "were", "with", "www", "you", "your",
+    # add more according to need
 }

 # Additional words to ignore specifically for creator-level downloads
@@ -107,4 +109,5 @@ CREATOR_DOWNLOAD_DEFAULT_FOLDER_IGNORE_WORDS = {
     "oct", "october", "nov", "november", "dec", "december",
     "mon", "monday", "tue", "tuesday", "wed", "wednesday", "thu", "thursday",
     "fri", "friday", "sat", "saturday", "sun", "sunday"
+    # add more according to need
 }
```


@@ -1,12 +1,10 @@
# --- Standard Library Imports ---
import time import time
import traceback import traceback
from urllib.parse import urlparse from urllib.parse import urlparse
import json # Ensure json is imported
# --- Third-Party Library Imports ---
import requests import requests
# --- Local Application Imports --- # (Keep the rest of your imports)
from ..utils.network_utils import extract_post_info, prepare_cookies_for_request from ..utils.network_utils import extract_post_info, prepare_cookies_for_request
from ..config.constants import ( from ..config.constants import (
STYLE_DATE_POST_TITLE STYLE_DATE_POST_TITLE
@@ -15,36 +13,24 @@ from ..config.constants import (
 def fetch_posts_paginated(api_url_base, headers, offset, logger, cancellation_event=None, pause_event=None, cookies_dict=None):
     """
-    Fetches a single page of posts from the API with retry logic.
+    Fetches a single page of posts from the API with robust retry logic.
+    NEW: Requests only essential fields to keep the response size small and reliable.
+
+    Args:
+        api_url_base (str): The base URL for the user's posts.
+        headers (dict): The request headers.
+        offset (int): The offset for pagination.
+        logger (callable): Function to log messages.
+        cancellation_event (threading.Event): Event to signal cancellation.
+        pause_event (threading.Event): Event to signal pause.
+        cookies_dict (dict): A dictionary of cookies to include in the request.
+
+    Returns:
+        list: A list of post data dictionaries from the API.
+
+    Raises:
+        RuntimeError: If the fetch fails after all retries or encounters a non-retryable error.
     """
     if cancellation_event and cancellation_event.is_set():
-        logger("   Fetch cancelled before request.")
         raise RuntimeError("Fetch operation cancelled by user.")
     if pause_event and pause_event.is_set():
         logger("   Post fetching paused...")
         while pause_event.is_set():
             if cancellation_event and cancellation_event.is_set():
-                logger("   Post fetching cancelled while paused.")
-                raise RuntimeError("Fetch operation cancelled by user.")
+                raise RuntimeError("Fetch operation cancelled by user while paused.")
             time.sleep(0.5)
         logger("   Post fetching resumed.")
-    paginated_url = f'{api_url_base}?o={offset}'
+    # --- MODIFICATION: Added `fields` to the URL to request only metadata ---
+    # This prevents the large 'content' field from being included in the list, avoiding timeouts.
+    fields_to_request = "id,user,service,title,shared_file,added,published,edited,file,attachments,tags"
+    paginated_url = f'{api_url_base}?o={offset}&fields={fields_to_request}'
     max_retries = 3
     retry_delay = 5
@@ -52,22 +38,18 @@ def fetch_posts_paginated(api_url_base, headers, offset, logger, cancellation_ev
         if cancellation_event and cancellation_event.is_set():
             raise RuntimeError("Fetch operation cancelled by user during retry loop.")
-        log_message = f"   Fetching: {paginated_url} (Page approx. {offset // 50 + 1})"
+        log_message = f"   Fetching post list: {api_url_base}?o={offset} (Page approx. {offset // 50 + 1})"
         if attempt > 0:
             log_message += f" (Attempt {attempt + 1}/{max_retries})"
         logger(log_message)
         try:
-            response = requests.get(paginated_url, headers=headers, timeout=(15, 90), cookies=cookies_dict)
+            # We can now remove the streaming logic as the response will be small and fast.
+            response = requests.get(paginated_url, headers=headers, timeout=(15, 60), cookies=cookies_dict)
             response.raise_for_status()
-            if 'application/json' not in response.headers.get('Content-Type', '').lower():
-                logger(f"⚠️ Unexpected content type from API: {response.headers.get('Content-Type')}. Body: {response.text[:200]}")
-                return []
             return response.json()
-        except (requests.exceptions.Timeout, requests.exceptions.ConnectionError) as e:
+        except requests.exceptions.RequestException as e:
             logger(f"   ⚠️ Retryable network error on page fetch (Attempt {attempt + 1}): {e}")
             if attempt < max_retries - 1:
                 delay = retry_delay * (2 ** attempt)
@@ -76,18 +58,46 @@ def fetch_posts_paginated(api_url_base, headers, offset, logger, cancellation_ev
                 continue
             else:
                 logger(f"   ❌ Failed to fetch page after {max_retries} attempts.")
-                raise RuntimeError(f"Timeout or connection error fetching offset {offset}")
-        except requests.exceptions.RequestException as e:
-            err_msg = f"Error fetching offset {offset}: {e}"
-            if e.response is not None:
-                err_msg += f" (Status: {e.response.status_code}, Body: {e.response.text[:200]})"
-            raise RuntimeError(err_msg)
-        except ValueError as e:  # JSON decode error
-            raise RuntimeError(f"Error decoding JSON from offset {offset}: {e}. Response: {response.text[:200]}")
+                raise RuntimeError(f"Network error fetching offset {offset}")
+        except json.JSONDecodeError as e:
+            logger(f"   ❌ Failed to decode JSON on page fetch (Attempt {attempt + 1}): {e}")
+            if attempt < max_retries - 1:
+                delay = retry_delay * (2 ** attempt)
+                logger(f"   Retrying in {delay} seconds...")
+                time.sleep(delay)
+                continue
+            else:
+                raise RuntimeError(f"JSONDecodeError fetching offset {offset}")
     raise RuntimeError(f"Failed to fetch page {paginated_url} after all attempts.")
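The retry loop above uses a fixed base delay doubled on each attempt. A minimal standalone sketch of that pattern, decoupled from `requests` (the `fetch_with_backoff` and `flaky` names are illustrative, not part of this codebase; `ConnectionError` stands in for the network exception):

```python
import time

def fetch_with_backoff(fetch_once, max_retries=3, retry_delay=5, sleep=time.sleep):
    """Call fetch_once() until it succeeds, doubling the delay after each failure."""
    last_error = None
    for attempt in range(max_retries):
        try:
            return fetch_once()
        except ConnectionError as e:  # stand-in for a retryable network error
            last_error = e
            if attempt < max_retries - 1:
                sleep(retry_delay * (2 ** attempt))  # 5s, 10s, 20s, ...
    raise RuntimeError(f"Failed after {max_retries} attempts: {last_error}")

# Simulate a server that fails twice, then succeeds.
attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("reset by peer")
    return {"posts": []}

result = fetch_with_backoff(flaky, sleep=lambda s: None)  # skip real sleeping in the demo
```

Injecting the `sleep` callable keeps the demo instant and makes the backoff schedule testable.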
def fetch_single_post_data(api_domain, service, user_id, post_id, headers, logger, cookies_dict=None):
    """
    --- NEW FUNCTION ---
    Fetches the full data, including the 'content' field, for a single post.
    """
    post_api_url = f"https://{api_domain}/api/v1/{service}/user/{user_id}/post/{post_id}"
    logger(f"   Fetching full content for post ID {post_id}...")
    try:
        # Use streaming here as a precaution for single posts that are still very large.
        with requests.get(post_api_url, headers=headers, timeout=(15, 300), cookies=cookies_dict, stream=True) as response:
            response.raise_for_status()
            response_body = b""
            for chunk in response.iter_content(chunk_size=8192):
                response_body += chunk
            full_post_data = json.loads(response_body)
            # The API sometimes wraps the post in a list; handle that.
            if isinstance(full_post_data, list) and full_post_data:
                return full_post_data[0]
            return full_post_data
    except Exception as e:
        logger(f"   ❌ Failed to fetch full content for post {post_id}: {e}")
        return None
def fetch_post_comments(api_domain, service, user_id, post_id, headers, logger, cancellation_event=None, pause_event=None, cookies_dict=None):
    """Fetches all comments for a specific post."""
    if cancellation_event and cancellation_event.is_set():


@@ -20,6 +20,26 @@ try:
try:
    from PIL import Image
except ImportError:
    Image = None
try:
    from fpdf import FPDF
    # Add a simple class to handle the header/footer for stories
    class PDF(FPDF):
        def header(self):
            pass  # No header
        def footer(self):
            self.set_y(-15)
            self.set_font('Arial', 'I', 8)
            self.cell(0, 10, 'Page %s' % self.page_no(), 0, 0, 'C')
except ImportError:
    FPDF = None
try:
    from docx import Document
except ImportError:
    Document = None
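These imports follow the optional-dependency pattern: try the import at module load, fall back to `None`, and feature-check at call sites so the app degrades gracefully instead of crashing. A minimal sketch of the same idea (`some_pdf_lib` and `export_pdf` are made-up names for illustration):

```python
# Optional-dependency pattern: import once, fall back to None, check at use.
try:
    import some_pdf_lib  # hypothetical optional dependency, not a real package
except ImportError:
    some_pdf_lib = None

def export_pdf(text):
    """Use the optional library if present, otherwise report the feature as disabled."""
    if some_pdf_lib is None:
        return "PDF export disabled: library not installed"
    return some_pdf_lib.render(text)

print(export_pdf("hello"))
```

The check happens where the feature is used, so unrelated features keep working when the library is absent.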
# --- PyQt5 Imports ---
from PyQt5.QtCore import Qt, QThread, pyqtSignal, QMutex, QMutexLocker, QObject, QTimer, QSettings, QStandardPaths, QCoreApplication, QUrl, QSize, QProcess
# --- Local Application Imports ---
@@ -48,6 +68,7 @@ class PostProcessorSignals (QObject ):
    file_progress_signal = pyqtSignal(str, object)
    file_successfully_downloaded_signal = pyqtSignal(dict)
    missed_character_post_signal = pyqtSignal(str, str)
    worker_finished_signal = pyqtSignal(tuple)

class PostProcessorWorker:
    def __init__(self, post_data, download_root, known_names,
@@ -77,8 +98,14 @@ class PostProcessorWorker:
                 scan_content_for_images=False,
                 creator_download_folder_ignore_words=None,
                 manga_global_file_counter_ref=None,
                 use_date_prefix_for_subfolder=False,
                 keep_in_post_duplicates=False,
                 session_file_path=None,
                 session_lock=None,
                 text_only_scope=None,
                 text_export_format='txt',
                 single_pdf_mode=False,
                 project_root_dir=None,
                 ):
        self.post = post_data
        self.download_root = download_root
@@ -128,8 +155,14 @@ class PostProcessorWorker:
        self.override_output_dir = override_output_dir
        self.scan_content_for_images = scan_content_for_images
        self.creator_download_folder_ignore_words = creator_download_folder_ignore_words
        self.use_date_prefix_for_subfolder = use_date_prefix_for_subfolder
        self.keep_in_post_duplicates = keep_in_post_duplicates
        self.session_file_path = session_file_path
        self.session_lock = session_lock
        self.text_only_scope = text_only_scope
        self.text_export_format = text_export_format
        self.single_pdf_mode = single_pdf_mode
        self.project_root_dir = project_root_dir
        if self.compress_images and Image is None:
            self.logger("⚠️ Image compression disabled: Pillow library not found.")
@@ -167,6 +200,7 @@ class PostProcessorWorker:
        if self.dynamic_filter_holder:
            return self.dynamic_filter_holder.get_filters()
        return self.filter_character_list_objects_initial

    def _download_single_file(self, file_info, target_folder_path, headers, original_post_id_for_log, skip_event,
                              post_title="", file_index_in_post=0, num_files_in_this_post=1,
                              manga_date_file_counter_ref=None,
@@ -273,6 +307,15 @@ class PostProcessorWorker:
                    self.logger(f"⚠️ Manga Title+GlobalNum Mode: Counter ref not provided or malformed for '{api_original_filename}'. Using original. Ref: {manga_global_file_counter_ref}")
                    filename_to_save_in_main_path = cleaned_original_api_filename
                    self.logger(f"⚠️ Manga mode (Title+GlobalNum Style Fallback): Using cleaned original filename '{filename_to_save_in_main_path}' for post {original_post_id_for_log}.")
            elif self.manga_filename_style == STYLE_POST_ID:
                if original_post_id_for_log and original_post_id_for_log != 'unknown_id':
                    base_name = str(original_post_id_for_log)
                    # Always append the file index for consistency (e.g., xxxxxx_0, xxxxxx_1)
                    filename_to_save_in_main_path = f"{base_name}_{file_index_in_post}{original_ext}"
                else:
                    # Fallback if post_id is somehow not available
                    self.logger(f"⚠️ Manga mode (Post ID Style): Post ID missing. Using cleaned original filename '{cleaned_original_api_filename}'.")
                    filename_to_save_in_main_path = cleaned_original_api_filename
            elif self.manga_filename_style == STYLE_DATE_POST_TITLE:
                published_date_str = self.post.get('published')
                added_date_str = self.post.get('added')
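The new `STYLE_POST_ID` branch names each file `<post_id>_<index><ext>`, falling back to the cleaned original name when the ID is missing. A standalone sketch of that naming rule (the helper name and sample IDs are illustrative):

```python
import os

def manga_post_id_filename(post_id, file_index, original_filename):
    """Build '<post_id>_<index><ext>'; fall back to the original name if the ID is missing."""
    _, ext = os.path.splitext(original_filename)
    if post_id and post_id != 'unknown_id':
        return f"{post_id}_{file_index}{ext}"
    return original_filename

named = manga_post_id_filename("987654", 0, "page01.png")
fallback = manga_post_id_filename(None, 2, "page03.jpg")
print(named)     # 987654_0.png
print(fallback)  # page03.jpg
```

Always appending the index (even for single-file posts) keeps names sortable and collision-free within a post.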
@@ -543,6 +586,26 @@ class PostProcessorWorker:
            final_total_for_progress = total_size_bytes if download_successful_flag and total_size_bytes > 0 else downloaded_size_bytes
            self._emit_signal('file_progress', api_original_filename, (downloaded_size_bytes, final_total_for_progress))

        # Rescue the download if an IncompleteRead error occurred but the file is actually complete
        if (not download_successful_flag and
                isinstance(last_exception_for_retry_later, http.client.IncompleteRead) and
                total_size_bytes > 0 and downloaded_part_file_path and os.path.exists(downloaded_part_file_path)):
            try:
                actual_size = os.path.getsize(downloaded_part_file_path)
                if actual_size == total_size_bytes:
                    self.logger(f"   ✅ Rescued '{api_original_filename}': IncompleteRead error occurred, but file size matches. Proceeding with save.")
                    download_successful_flag = True
                    # The hash must be recalculated now that we've verified the file
                    md5_hasher = hashlib.md5()
                    with open(downloaded_part_file_path, 'rb') as f_verify:
                        for chunk in iter(lambda: f_verify.read(8192), b""):  # Read in chunks
                            md5_hasher.update(chunk)
                    calculated_file_hash = md5_hasher.hexdigest()
            except Exception as rescue_exc:
                self.logger(f"   ⚠️ Failed to rescue file despite matching size. Error: {rescue_exc}")

        if self.check_cancel() or (skip_event and skip_event.is_set()) or (self.pause_event and self.pause_event.is_set() and not download_successful_flag):
            self.logger(f"   ⚠️ Download process interrupted for {api_original_filename}.")
            if downloaded_part_file_path and os.path.exists(downloaded_part_file_path):
@@ -550,51 +613,12 @@ class PostProcessorWorker:
            except OSError: pass
            return 0, 1, filename_to_save_in_main_path, was_original_name_kept_flag, FILE_DOWNLOAD_STATUS_SKIPPED, None

+        # This logic block now correctly handles all outcomes: success, failure, or rescued.
-        if not download_successful_flag:
-            self.logger(f"❌ Download failed for '{api_original_filename}' after {max_retries + 1} attempts.")
+        if download_successful_flag:
+            # --- This is the success path ---
+            if self._check_pause(f"Post-download hash check for '{api_original_filename}'"):
+                return 0, 1, filename_to_save_in_main_path, was_original_name_kept_flag, FILE_DOWNLOAD_STATUS_SKIPPED, None
-            is_actually_incomplete_read = False
-            if isinstance(last_exception_for_retry_later, http.client.IncompleteRead):
-                is_actually_incomplete_read = True
-            elif hasattr(last_exception_for_retry_later, '__cause__') and isinstance(last_exception_for_retry_later.__cause__, http.client.IncompleteRead):
-                is_actually_incomplete_read = True
-            elif last_exception_for_retry_later is not None:
-                str_exc = str(last_exception_for_retry_later).lower()
-                if "incompleteread" in str_exc or (isinstance(last_exception_for_retry_later, tuple) and any("incompleteread" in str(arg).lower() for arg in last_exception_for_retry_later if isinstance(arg, (str, Exception)))):
-                    is_actually_incomplete_read = True
-            if is_actually_incomplete_read:
-                self.logger(f"   Marking '{api_original_filename}' for potential retry later due to IncompleteRead.")
-                retry_later_details = {
-                    'file_info': file_info,
-                    'target_folder_path': target_folder_path,
-                    'headers': headers,
-                    'original_post_id_for_log': original_post_id_for_log,
-                    'post_title': post_title,
-                    'file_index_in_post': file_index_in_post,
-                    'num_files_in_this_post': num_files_in_this_post,
-                    'forced_filename_override': filename_to_save_in_main_path,
-                    'manga_mode_active_for_file': self.manga_mode_active,
-                    'manga_filename_style_for_file': self.manga_filename_style,
-                }
-                return 0, 1, filename_to_save_in_main_path, was_original_name_kept_flag, FILE_DOWNLOAD_STATUS_FAILED_RETRYABLE_LATER, retry_later_details
-            else:
-                self.logger(f"   Marking '{api_original_filename}' as permanently failed for this session.")
-                permanent_failure_details = {
-                    'file_info': file_info,
-                    'target_folder_path': target_folder_path,
-                    'headers': headers,
-                    'original_post_id_for_log': original_post_id_for_log,
-                    'post_title': post_title,
-                    'file_index_in_post': file_index_in_post,
-                    'num_files_in_this_post': num_files_in_this_post,
-                    'forced_filename_override': filename_to_save_in_main_path,
-                }
-                return 0, 1, filename_to_save_in_main_path, was_original_name_kept_flag, FILE_DOWNLOAD_STATUS_FAILED_PERMANENTLY_THIS_SESSION, permanent_failure_details
-        if self._check_pause(f"Post-download hash check for '{api_original_filename}'"): return 0, 1, filename_to_save_in_main_path, was_original_name_kept_flag, FILE_DOWNLOAD_STATUS_SKIPPED, None
            with self.downloaded_file_hashes_lock:
                if calculated_file_hash in self.downloaded_file_hashes:
                    self.logger(f"   -> Skip Saving Duplicate (Hash Match): '{api_original_filename}' (Hash: {calculated_file_hash[:8]}...).")
@@ -620,6 +644,148 @@ class PostProcessorWorker:
            filename_after_compression = filename_after_styling_and_word_removal
            is_img_for_compress_check = is_image(api_original_filename)
            if is_img_for_compress_check and self.compress_images and Image and downloaded_size_bytes > (1.5 * 1024 * 1024):
                self.logger(f"   Compressing '{api_original_filename}' ({downloaded_size_bytes / (1024 * 1024):.2f} MB)...")
                if self._check_pause(f"Image compression for '{api_original_filename}'"): return 0, 1, filename_to_save_in_main_path, was_original_name_kept_flag, FILE_DOWNLOAD_STATUS_SKIPPED, None
                img_content_for_pillow = None
                try:
                    with open(downloaded_part_file_path, 'rb') as f_img_in:
                        img_content_for_pillow = BytesIO(f_img_in.read())
                    with Image.open(img_content_for_pillow) as img_obj:
                        if img_obj.mode == 'P': img_obj = img_obj.convert('RGBA')
                        elif img_obj.mode not in ['RGB', 'RGBA', 'L']: img_obj = img_obj.convert('RGB')
                        compressed_output_io = BytesIO()
                        img_obj.save(compressed_output_io, format='WebP', quality=80, method=4)
                        compressed_size = compressed_output_io.getbuffer().nbytes
                        if compressed_size < downloaded_size_bytes * 0.9:
                            self.logger(f"   Compression success: {compressed_size / (1024 * 1024):.2f} MB.")
                            data_to_write_io = compressed_output_io
                            data_to_write_io.seek(0)
                            base_name_orig, _ = os.path.splitext(filename_after_compression)
                            filename_after_compression = base_name_orig + '.webp'
                            self.logger(f"   Updated filename (compressed): {filename_after_compression}")
                        else:
                            self.logger(f"   Compression skipped: WebP not significantly smaller.")
                            if compressed_output_io: compressed_output_io.close()
                except Exception as comp_e:
                    self.logger(f"❌ Compression failed for '{api_original_filename}': {comp_e}. Saving original.")
                finally:
                    if img_content_for_pillow: img_content_for_pillow.close()
            final_filename_on_disk = filename_after_compression
            temp_base, temp_ext = os.path.splitext(final_filename_on_disk)
            suffix_counter = 1
            while os.path.exists(os.path.join(effective_save_folder, final_filename_on_disk)):
                final_filename_on_disk = f"{temp_base}_{suffix_counter}{temp_ext}"
                suffix_counter += 1
            if final_filename_on_disk != filename_after_compression:
                self.logger(f"   Applied numeric suffix in '{os.path.basename(effective_save_folder)}': '{final_filename_on_disk}' (was '{filename_after_compression}')")
            if self._check_pause(f"File saving for '{final_filename_on_disk}'"):
                return 0, 1, final_filename_on_disk, was_original_name_kept_flag, FILE_DOWNLOAD_STATUS_SKIPPED, None
            final_save_path = os.path.join(effective_save_folder, final_filename_on_disk)
            try:
                if data_to_write_io:
                    with open(final_save_path, 'wb') as f_out:
                        time.sleep(0.05)
                        f_out.write(data_to_write_io.getvalue())
                    if downloaded_part_file_path and os.path.exists(downloaded_part_file_path):
                        try:
                            os.remove(downloaded_part_file_path)
                        except OSError as e_rem:
                            self.logger(f"   -> Failed to remove .part after compression: {e_rem}")
                else:
                    if downloaded_part_file_path and os.path.exists(downloaded_part_file_path):
                        time.sleep(0.1)
                        os.rename(downloaded_part_file_path, final_save_path)
                    else:
                        raise FileNotFoundError(f"Original .part file not found for saving: {downloaded_part_file_path}")
                with self.downloaded_file_hashes_lock: self.downloaded_file_hashes.add(calculated_file_hash)
                with self.downloaded_files_lock: self.downloaded_files.add(filename_to_save_in_main_path)
                final_filename_saved_for_return = final_filename_on_disk
                self.logger(f"✅ Saved: '{final_filename_saved_for_return}' (from '{api_original_filename}', {downloaded_size_bytes / (1024 * 1024):.2f} MB) in '{os.path.basename(effective_save_folder)}'")
                downloaded_file_details = {
                    'disk_filename': final_filename_saved_for_return,
                    'post_title': post_title,
                    'post_id': original_post_id_for_log,
                    'upload_date_str': self.post.get('published') or self.post.get('added') or "N/A",
                    'download_timestamp': time.time(),
                    'download_path': effective_save_folder,
                    'service': self.service,
                    'user_id': self.user_id,
                    'api_original_filename': api_original_filename,
                    'folder_context_name': folder_context_name_for_history or os.path.basename(effective_save_folder)
                }
                self._emit_signal('file_successfully_downloaded', downloaded_file_details)
                time.sleep(0.05)
                return 1, 0, final_filename_saved_for_return, was_original_name_kept_flag, FILE_DOWNLOAD_STATUS_SUCCESS, None
            except Exception as save_err:
                self.logger(f"->>Save Fail for '{final_filename_on_disk}': {save_err}")
                if os.path.exists(final_save_path):
                    try: os.remove(final_save_path)
                    except OSError: self.logger(f"   -> Failed to remove partially saved file: {final_save_path}")
                # --- FIX: Report as a permanent failure so it appears in the error dialog ---
                permanent_failure_details = {'file_info': file_info, 'target_folder_path': target_folder_path, 'headers': headers, 'original_post_id_for_log': original_post_id_for_log, 'post_title': post_title, 'file_index_in_post': file_index_in_post, 'num_files_in_this_post': num_files_in_this_post, 'forced_filename_override': filename_to_save_in_main_path}
                return 0, 1, final_filename_saved_for_return, was_original_name_kept_flag, FILE_DOWNLOAD_STATUS_FAILED_PERMANENTLY_THIS_SESSION, permanent_failure_details
            finally:
                if data_to_write_io and hasattr(data_to_write_io, 'close'):
                    data_to_write_io.close()
        else:
            # --- This is the failure path ---
            self.logger(f"❌ Download failed for '{api_original_filename}' after {max_retries + 1} attempts.")
            is_actually_incomplete_read = False
            if isinstance(last_exception_for_retry_later, http.client.IncompleteRead):
                is_actually_incomplete_read = True
            elif hasattr(last_exception_for_retry_later, '__cause__') and isinstance(last_exception_for_retry_later.__cause__, http.client.IncompleteRead):
                is_actually_incomplete_read = True
            elif last_exception_for_retry_later is not None:
                str_exc = str(last_exception_for_retry_later).lower()
                if "incompleteread" in str_exc or (isinstance(last_exception_for_retry_later, tuple) and any("incompleteread" in str(arg).lower() for arg in last_exception_for_retry_later if isinstance(arg, (str, Exception)))):
                    is_actually_incomplete_read = True
            if is_actually_incomplete_read:
                self.logger(f"   Marking '{api_original_filename}' for potential retry later due to IncompleteRead.")
                retry_later_details = {'file_info': file_info, 'target_folder_path': target_folder_path, 'headers': headers, 'original_post_id_for_log': original_post_id_for_log, 'post_title': post_title, 'file_index_in_post': file_index_in_post, 'num_files_in_this_post': num_files_in_this_post, 'forced_filename_override': filename_to_save_in_main_path, 'manga_mode_active_for_file': self.manga_mode_active, 'manga_filename_style_for_file': self.manga_filename_style}
                return 0, 1, filename_to_save_in_main_path, was_original_name_kept_flag, FILE_DOWNLOAD_STATUS_FAILED_RETRYABLE_LATER, retry_later_details
            else:
                self.logger(f"   Marking '{api_original_filename}' as permanently failed for this session.")
                permanent_failure_details = {'file_info': file_info, 'target_folder_path': target_folder_path, 'headers': headers, 'original_post_id_for_log': original_post_id_for_log, 'post_title': post_title, 'file_index_in_post': file_index_in_post, 'num_files_in_this_post': num_files_in_this_post, 'forced_filename_override': filename_to_save_in_main_path}
                return 0, 1, filename_to_save_in_main_path, was_original_name_kept_flag, FILE_DOWNLOAD_STATUS_FAILED_PERMANENTLY_THIS_SESSION, permanent_failure_details
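The save path above resolves name collisions by appending `_1`, `_2`, ... until an unused name is found. That loop in isolation, runnable against a temp directory (the `dedupe_filename` name is illustrative):

```python
import os
import tempfile

def dedupe_filename(folder, filename):
    """Append _1, _2, ... before the extension until the name is free on disk."""
    base, ext = os.path.splitext(filename)
    candidate, n = filename, 1
    while os.path.exists(os.path.join(folder, candidate)):
        candidate = f"{base}_{n}{ext}"
        n += 1
    return candidate

folder = tempfile.mkdtemp()
open(os.path.join(folder, "img.png"), "w").close()
open(os.path.join(folder, "img_1.png"), "w").close()
deduped = dedupe_filename(folder, "img.png")
print(deduped)  # img_2.png
```

Note the check-then-create window: in a concurrent downloader the existence check and the eventual write are separate steps, which is why the worker also tracks saved names under a lock.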
        with self.downloaded_file_hashes_lock:
            if calculated_file_hash in self.downloaded_file_hashes:
                self.logger(f"   -> Skip Saving Duplicate (Hash Match): '{api_original_filename}' (Hash: {calculated_file_hash[:8]}...).")
                with self.downloaded_files_lock: self.downloaded_files.add(filename_to_save_in_main_path)
                if downloaded_part_file_path and os.path.exists(downloaded_part_file_path):
                    try: os.remove(downloaded_part_file_path)
                    except OSError as e_rem: self.logger(f"   -> Failed to remove .part file for hash duplicate: {e_rem}")
                return 0, 1, filename_to_save_in_main_path, was_original_name_kept_flag, FILE_DOWNLOAD_STATUS_SKIPPED, None
        effective_save_folder = target_folder_path
        filename_after_styling_and_word_removal = filename_to_save_in_main_path
        try:
            os.makedirs(effective_save_folder, exist_ok=True)
        except OSError as e:
            self.logger(f"   ❌ Critical error creating directory '{effective_save_folder}': {e}. Skipping file '{api_original_filename}'.")
            if downloaded_part_file_path and os.path.exists(downloaded_part_file_path):
                try: os.remove(downloaded_part_file_path)
                except OSError: pass
            # --- FIX: Report as a permanent failure so it appears in the error dialog ---
            permanent_failure_details = {'file_info': file_info, 'target_folder_path': target_folder_path, 'headers': headers, 'original_post_id_for_log': original_post_id_for_log, 'post_title': post_title, 'file_index_in_post': file_index_in_post, 'num_files_in_this_post': num_files_in_this_post, 'forced_filename_override': filename_to_save_in_main_path}
            return 0, 1, api_original_filename, False, FILE_DOWNLOAD_STATUS_FAILED_PERMANENTLY_THIS_SESSION, permanent_failure_details
        data_to_write_io = None
        filename_after_compression = filename_after_styling_and_word_removal
        is_img_for_compress_check = is_image(api_original_filename)
        if is_img_for_compress_check and self.compress_images and Image and downloaded_size_bytes > (1.5 * 1024 * 1024):
            self.logger(f"   Compressing '{api_original_filename}' ({downloaded_size_bytes / (1024 * 1024):.2f} MB)...")
            if self._check_pause(f"Image compression for '{api_original_filename}'"): return 0, 1, filename_to_save_in_main_path, was_original_name_kept_flag, FILE_DOWNLOAD_STATUS_SKIPPED, None
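The compression branch only keeps the WebP re-encode when it is at least 10% smaller than the original (`compressed_size < downloaded_size_bytes * 0.9`), and swaps the extension to `.webp` only in that case. A sketch of that decision rule without Pillow (the helper name is illustrative):

```python
import os

def maybe_swap_to_webp(filename, original_size, compressed_size):
    """Apply the 10%-smaller rule; rename to .webp only when compression wins."""
    if compressed_size < original_size * 0.9:
        base, _ = os.path.splitext(filename)
        return base + ".webp", True
    return filename, False

won = maybe_swap_to_webp("photo.png", 2_000_000, 1_200_000)
lost = maybe_swap_to_webp("photo.png", 2_000_000, 1_950_000)
print(won)   # ('photo.webp', True)
print(lost)  # ('photo.png', False)
```

The threshold avoids churning files where WebP barely helps; keeping the original also sidesteps a needless quality-lossy re-encode.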
@@ -717,10 +883,9 @@ class PostProcessorWorker:
        if data_to_write_io and hasattr(data_to_write_io, 'close'):
            data_to_write_io.close()

    def process(self):
-        if self._check_pause(f"Post processing for ID {self.post.get('id', 'N/A')}"): return 0, 0, [], [], [], None
+        if self._check_pause(f"Post processing for ID {self.post.get('id', 'N/A')}"): return 0, 0, [], [], [], None, None
-        if self.check_cancel(): return 0, 0, [], [], [], None
+        if self.check_cancel(): return 0, 0, [], [], [], None, None
        current_character_filters = self._get_current_character_filters()
        kept_original_filenames_for_log = []
        retryable_failures_this_post = []
@@ -728,6 +893,7 @@ class PostProcessorWorker:
        total_downloaded_this_post = 0
        total_skipped_this_post = 0
        history_data_for_this_post = None
        temp_filepath_for_return = None
        parsed_api_url = urlparse(self.api_url_input)
        referer_url = f"https://{parsed_api_url.netloc}/"
@@ -856,23 +1022,23 @@ class PostProcessorWorker:
            if self.char_filter_scope == CHAR_SCOPE_TITLE and not post_is_candidate_by_title_char_match:
                self.logger(f"   -> Skip Post (Scope: Title - No Char Match): Title '{post_title[:50]}' does not match character filters.")
                self._emit_signal('missed_character_post', post_title, "No title match for character filter")
-               return 0, num_potential_files_in_post, [], [], [], None
+               return 0, num_potential_files_in_post, [], [], [], None, None
            if self.char_filter_scope == CHAR_SCOPE_COMMENTS and not post_is_candidate_by_file_char_match_in_comment_scope and not post_is_candidate_by_comment_char_match:
                self.logger(f"   -> Skip Post (Scope: Comments - No Char Match in Comments): Post ID '{post_id}', Title '{post_title[:50]}...'")
                if self.emitter and hasattr(self.emitter, 'missed_character_post_signal'):
                    self._emit_signal('missed_character_post', post_title, "No character match in files or comments (Comments scope)")
-               return 0, num_potential_files_in_post, [], [], [], None
+               return 0, num_potential_files_in_post, [], [], [], None, None
        if self.skip_words_list and (self.skip_words_scope == SKIP_SCOPE_POSTS or self.skip_words_scope == SKIP_SCOPE_BOTH):
            if self._check_pause(f"Skip words (post title) for post {post_id}"): return 0, num_potential_files_in_post, [], [], [], None
            post_title_lower = post_title.lower()
            for skip_word in self.skip_words_list:
                if skip_word.lower() in post_title_lower:
                    self.logger(f"   -> Skip Post (Keyword in Title '{skip_word}'): '{post_title[:50]}...'. Scope: {self.skip_words_scope}")
-                   return 0, num_potential_files_in_post, [], [], [], None
+                   return 0, num_potential_files_in_post, [], [], [], None, None
        if not self.extract_links_only and self.manga_mode_active and current_character_filters and (self.char_filter_scope == CHAR_SCOPE_TITLE or self.char_filter_scope == CHAR_SCOPE_BOTH) and not post_is_candidate_by_title_char_match:
            self.logger(f"   -> Skip Post (Manga Mode with Title/Both Scope - No Title Char Match): Title '{post_title[:50]}' doesn't match filters.")
            self._emit_signal('missed_character_post', post_title, "Manga Mode: No title match for character filter (Title/Both scope)")
-           return 0, num_potential_files_in_post, [], [], [], None
+           return 0, num_potential_files_in_post, [], [], [], None, None
        if not isinstance(post_attachments, list):
            self.logger(f"⚠️ Corrupt attachment data for post {post_id} (expected list, got {type(post_attachments)}). Skipping attachments.")
            post_attachments = []
@@ -994,6 +1160,20 @@ class PostProcessorWorker:
else:
original_cleaned_post_title_for_sub = cleaned_post_title_for_sub
if self.use_date_prefix_for_subfolder:
# Prioritize 'published' date, fall back to 'added' date
published_date_str = self.post.get('published') or self.post.get('added')
if published_date_str:
try:
# Extract just the date part (YYYY-MM-DD)
date_prefix = published_date_str.split('T')[0]
# Prepend the date to the folder name
original_cleaned_post_title_for_sub = f"{date_prefix} {original_cleaned_post_title_for_sub}"
self.logger(f" Applying date prefix to subfolder: '{original_cleaned_post_title_for_sub}'")
except Exception as e:
self.logger(f" ⚠️ Could not parse date '{published_date_str}' for prefix. Using original name. Error: {e}")
else:
self.logger(" ⚠️ 'Date Prefix' is checked, but post has no 'published' or 'added' date. Omitting prefix.")
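The date-prefix branch above can be sketched in isolation. It assumes `published`/`added` are ISO-8601 timestamps (e.g. `2025-07-13T21:46:34`), so the date part is everything before the `T`; `date_prefixed_folder_name` is a hypothetical helper name, not a function in this codebase:

```python
def date_prefixed_folder_name(post: dict, folder_name: str) -> str:
    """Prepend the post's publish date (YYYY-MM-DD) to a folder name.

    Falls back to the 'added' date, and leaves the name unchanged when
    neither field is present or the timestamp cannot be split.
    """
    timestamp = post.get('published') or post.get('added')
    if not timestamp:
        return folder_name
    try:
        # '2025-07-13T21:46:34' -> '2025-07-13'
        date_prefix = timestamp.split('T')[0]
        return f"{date_prefix} {folder_name}"
    except Exception:
        return folder_name

print(date_prefixed_folder_name({'published': '2025-07-13T21:46:34'}, 'My Post'))
# 2025-07-13 My Post
```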
base_path_for_post_subfolder = determined_post_save_path_for_history
@@ -1027,6 +1207,160 @@ class PostProcessorWorker:
break
determined_post_save_path_for_history = os.path.join(base_path_for_post_subfolder, final_post_subfolder_name)
if self.filter_mode == 'text_only' and not self.extract_links_only:
self.logger(f" Mode: Text Only (Scope: {self.text_only_scope})")
# --- Apply Title-based filters to ensure post is a candidate ---
post_title_lower = post_title.lower()
if self.skip_words_list and (self.skip_words_scope == SKIP_SCOPE_POSTS or self.skip_words_scope == SKIP_SCOPE_BOTH):
for skip_word in self.skip_words_list:
if skip_word.lower() in post_title_lower:
self.logger(f" -> Skip Post (Keyword in Title '{skip_word}'): '{post_title[:50]}...'.")
return 0, num_potential_files_in_post, [], [], [], None, None
if current_character_filters and not post_is_candidate_by_title_char_match and not post_is_candidate_by_comment_char_match and not post_is_candidate_by_file_char_match_in_comment_scope:
self.logger(f" -> Skip Post (No character match for text extraction): '{post_title[:50]}...'.")
return 0, num_potential_files_in_post, [], [], [], None, None
# --- Get the text content based on scope ---
raw_text_content = ""
final_post_data = post_data
# Fetch full post data if content is missing and scope is 'content'
if self.text_only_scope == 'content' and 'content' not in final_post_data:
self.logger(f" Post {post_id} is missing 'content' field, fetching full data...")
parsed_url = urlparse(self.api_url_input)
api_domain = parsed_url.netloc
cookies = prepare_cookies_for_request(self.use_cookie, self.cookie_text, self.selected_cookie_file, self.app_base_dir, self.logger, target_domain=api_domain)
from .api_client import fetch_single_post_data # Local import to avoid circular dependency issues
full_data = fetch_single_post_data(api_domain, self.service, self.user_id, post_id, headers, self.logger, cookies_dict=cookies)
if full_data:
final_post_data = full_data
if self.text_only_scope == 'content':
raw_text_content = final_post_data.get('content', '')
elif self.text_only_scope == 'comments':
try:
parsed_url = urlparse(self.api_url_input)
api_domain = parsed_url.netloc
comments_data = fetch_post_comments(api_domain, self.service, self.user_id, post_id, headers, self.logger, self.cancellation_event, self.pause_event)
if comments_data:
comment_texts = []
for comment in comments_data:
user = comment.get('user', {}).get('name', 'Unknown User')
timestamp = comment.get('updated', 'No Date')
body = strip_html_tags(comment.get('content', ''))
comment_texts.append(f"--- Comment by {user} on {timestamp} ---\n{body}\n")
raw_text_content = "\n".join(comment_texts)
except Exception as e:
self.logger(f" ❌ Error fetching comments for text-only mode: {e}")
if not raw_text_content or not raw_text_content.strip():
self.logger(" -> Skip Saving Text: No content/comments found or fetched.")
return 0, num_potential_files_in_post, [], [], [], None, None
# --- Robust HTML-to-TEXT Conversion ---
paragraph_pattern = re.compile(r'<p.*?>(.*?)</p>', re.IGNORECASE | re.DOTALL)
html_paragraphs = paragraph_pattern.findall(raw_text_content)
cleaned_text = ""
if not html_paragraphs:
self.logger(" ⚠️ No <p> tags found. Falling back to basic HTML cleaning for the whole block.")
text_with_br = re.sub(r'<br\s*/?>', '\n', raw_text_content, flags=re.IGNORECASE)
cleaned_text = re.sub(r'<.*?>', '', text_with_br)
else:
cleaned_paragraphs_list = []
for p_content in html_paragraphs:
p_with_br = re.sub(r'<br\s*/?>', '\n', p_content, flags=re.IGNORECASE)
p_cleaned = re.sub(r'<.*?>', '', p_with_br)
p_final = html.unescape(p_cleaned).strip()
if p_final:
cleaned_paragraphs_list.append(p_final)
cleaned_text = '\n\n'.join(cleaned_paragraphs_list)
cleaned_text = cleaned_text.replace('…', '...')  # normalize the Unicode ellipsis to ASCII
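The HTML-to-text conversion above can be condensed into one standalone function. This is a sketch mirroring that logic (prefer `<p>` blocks joined by blank lines, fall back to stripping all tags), with one small tightening: the fallback path also unescapes HTML entities; `html_to_plain_text` is a hypothetical helper name:

```python
import html
import re

def html_to_plain_text(raw: str) -> str:
    """Convert post HTML to plain text, paragraph by paragraph."""
    paragraphs = re.findall(r'<p.*?>(.*?)</p>', raw, re.IGNORECASE | re.DOTALL)
    if not paragraphs:
        # No <p> tags: strip <br> and remaining tags from the whole block.
        text = re.sub(r'<br\s*/?>', '\n', raw, flags=re.IGNORECASE)
        return html.unescape(re.sub(r'<.*?>', '', text)).strip()
    cleaned = []
    for p in paragraphs:
        p = re.sub(r'<br\s*/?>', '\n', p, flags=re.IGNORECASE)
        p = html.unescape(re.sub(r'<.*?>', '', p)).strip()
        if p:
            cleaned.append(p)
    # Paragraphs are separated by a blank line in the output.
    return '\n\n'.join(cleaned)

print(html_to_plain_text('<p>Hello<br>world</p><p><b>Bye</b></p>'))
```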
# --- Logic for Single PDF Mode (File-based) ---
if self.single_pdf_mode:
if not cleaned_text:
return 0, 0, [], [], [], None, None
content_data = {
'title': post_title,
'content': cleaned_text,
'published': self.post.get('published') or self.post.get('added')
}
temp_dir = os.path.join(self.app_base_dir, "appdata")
os.makedirs(temp_dir, exist_ok=True)
temp_filename = f"tmp_{post_id}_{uuid.uuid4().hex[:8]}.json"
temp_filepath = os.path.join(temp_dir, temp_filename)
try:
with open(temp_filepath, 'w', encoding='utf-8') as f:
json.dump(content_data, f, indent=2)
self.logger(f" Saved temporary text for '{post_title}' for single PDF compilation.")
self._emit_signal('worker_finished', (0, 0, [], [], [], None, temp_filepath))
return (0, 0, [], [], [], None, temp_filepath)
except Exception as e:
self.logger(f"   ❌ Failed to write temporary file for single PDF: {e}")
self._emit_signal('worker_finished', (0, 0, [], [], [], None, None))
return (0, 0, [], [], [], None, None)
# --- Logic for Individual File Saving ---
else:
file_extension = self.text_export_format
txt_filename = clean_filename(post_title) + f".{file_extension}"
final_save_path = os.path.join(determined_post_save_path_for_history, txt_filename)
try:
os.makedirs(determined_post_save_path_for_history, exist_ok=True)
base, ext = os.path.splitext(final_save_path)
counter = 1
while os.path.exists(final_save_path):
final_save_path = f"{base}_{counter}{ext}"
counter += 1
if file_extension == 'pdf':
if FPDF:
self.logger(f" Converting to PDF...")
pdf = PDF()
font_path = ""
if self.project_root_dir:
font_path = os.path.join(self.project_root_dir, 'data', 'dejavu-sans', 'DejaVuSans.ttf')
try:
if not os.path.exists(font_path): raise RuntimeError(f"Font file not found: {font_path}")
pdf.add_font('DejaVu', '', font_path, uni=True)
pdf.set_font('DejaVu', '', 12)
except Exception as font_error:
self.logger(f" ⚠️ Could not load DejaVu font: {font_error}. Falling back to Arial.")
pdf.set_font('Arial', '', 12)
pdf.add_page()
pdf.multi_cell(0, 5, cleaned_text)
pdf.output(final_save_path)
else:
self.logger(f" ⚠️ Cannot create PDF: 'fpdf2' library not installed. Saving as .txt.")
final_save_path = os.path.splitext(final_save_path)[0] + ".txt"
with open(final_save_path, 'w', encoding='utf-8') as f: f.write(cleaned_text)
elif file_extension == 'docx':
if Document:
self.logger(f" Converting to DOCX...")
document = Document()
document.add_paragraph(cleaned_text)
document.save(final_save_path)
else:
self.logger(f" ⚠️ Cannot create DOCX: 'python-docx' library not installed. Saving as .txt.")
final_save_path = os.path.splitext(final_save_path)[0] + ".txt"
with open(final_save_path, 'w', encoding='utf-8') as f: f.write(cleaned_text)
else: # Default to TXT
with open(final_save_path, 'w', encoding='utf-8') as f:
f.write(cleaned_text)
self.logger(f"✅ Saved Text: '{os.path.basename(final_save_path)}' in '{os.path.basename(determined_post_save_path_for_history)}'")
return 1, num_potential_files_in_post, [], [], [], history_data_for_this_post, None
except Exception as e:
self.logger(f" ❌ Critical error saving text file '{txt_filename}': {e}")
return 0, num_potential_files_in_post, [], [], [], None, None
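The rename-on-collision loop used when saving the text file above can be factored into a small helper. A sketch under the same convention (append `_1`, `_2`, … before the extension until the path is free); `collision_safe_path` is a hypothetical name, not a function in this codebase:

```python
import os
import tempfile

def collision_safe_path(path: str) -> str:
    """Return `path` if unused, else the first `base_N.ext` variant that is."""
    base, ext = os.path.splitext(path)
    candidate, counter = path, 1
    while os.path.exists(candidate):
        candidate = f"{base}_{counter}{ext}"
        counter += 1
    return candidate

d = tempfile.mkdtemp()
first = os.path.join(d, 'post.txt')
open(first, 'w').close()  # simulate an existing file
print(os.path.basename(collision_safe_path(first)))  # post_1.txt
```

Note this check-then-create pattern has a small race window between `os.path.exists` and the eventual write; that is acceptable here because each post is handled by a single worker.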
if not self.extract_links_only and self.use_subfolders and self.skip_words_list:
if self._check_pause(f"Folder keyword skip check for post {post_id}"): return 0, num_potential_files_in_post, [], [], [], None, None
@@ -1035,7 +1369,7 @@ class PostProcessorWorker:
if any(skip_word.lower() in folder_name_to_check.lower() for skip_word in self.skip_words_list):
matched_skip = next((sw for sw in self.skip_words_list if sw.lower() in folder_name_to_check.lower()), "unknown_skip_word")
self.logger(f"   -> Skip Post (Folder Keyword): Potential folder '{folder_name_to_check}' contains '{matched_skip}'.")
return 0, num_potential_files_in_post, [], [], [], None, None
if (self.show_external_links or self.extract_links_only) and post_content_html:
if self._check_pause(f"External link extraction for post {post_id}"): return 0, num_potential_files_in_post, [], [], [], None, None
try:
@@ -1182,6 +1516,14 @@ class PostProcessorWorker:
return 0, 0, [], [], [], None, None
files_to_download_info_list = []
processed_original_filenames_in_this_post = set()
if self.keep_in_post_duplicates:
# If we keep duplicates, just add every file to the list to be processed.
# The downstream hash check and rename-on-collision logic will handle them.
files_to_download_info_list.extend(all_files_from_post_api)
self.logger(f" 'Keep Duplicates' is on. All {len(all_files_from_post_api)} files from post will be processed.")
else:
# This is the original logic that skips duplicates by name within a post.
    for file_info in all_files_from_post_api:
        current_api_original_filename = file_info.get('_original_name_for_log')
        if current_api_original_filename in processed_original_filenames_in_this_post:
@@ -1191,7 +1533,9 @@ class PostProcessorWorker:
        files_to_download_info_list.append(file_info)
        if current_api_original_filename:
            processed_original_filenames_in_this_post.add(current_api_original_filename)
if not files_to_download_info_list:
self.logger(f"   All files for post {post_id} were duplicate original names or skipped earlier.")
return 0, total_skipped_this_post, [], [], [], None, None
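The two branches above (keep every file vs. skip repeated original names within a post) can be sketched as one order-preserving filter. `filter_in_post_duplicates` is a hypothetical helper name; the `_original_name_for_log` key follows the code above:

```python
def filter_in_post_duplicates(files: list, keep_duplicates: bool) -> list:
    """Drop files whose original name repeats within one post,
    unless keep_duplicates is set (then everything passes through)."""
    if keep_duplicates:
        return list(files)
    seen, result = set(), []
    for info in files:
        name = info.get('_original_name_for_log')
        if name and name in seen:
            continue  # duplicate original name within this post
        result.append(info)
        if name:
            seen.add(name)
    return result
```

Files with no recorded original name are always kept, since they cannot be compared by name; the downstream hash check still catches true duplicates.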
@@ -1341,12 +1685,24 @@ class PostProcessorWorker:
with open(self.session_file_path, 'r', encoding='utf-8') as f:
session_data = json.load(f)
# Modify in memory
if 'download_state' not in session_data:
session_data['download_state'] = {}
# Add processed ID
if not isinstance(session_data['download_state'].get('processed_post_ids'), list):
session_data['download_state']['processed_post_ids'] = []
session_data['download_state']['processed_post_ids'].append(self.post.get('id'))
# Add any permanent failures from this worker to the session file
if permanent_failures_this_post:
if not isinstance(session_data['download_state'].get('permanently_failed_files'), list):
session_data['download_state']['permanently_failed_files'] = []
# To avoid duplicates if the same post is somehow re-processed
existing_failed_urls = {f.get('file_info', {}).get('url') for f in session_data['download_state']['permanently_failed_files']}
for failure in permanent_failures_this_post:
if failure.get('file_info', {}).get('url') not in existing_failed_urls:
session_data['download_state']['permanently_failed_files'].append(failure)
# Write to a temp file, then atomically replace the session file
temp_file_path = self.session_file_path + ".tmp"
with open(temp_file_path, 'w', encoding='utf-8') as f_tmp:
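The session update above follows the classic write-temp-then-replace pattern (the `os.replace` call sits below this hunk). An end-to-end sketch, assuming the session file is plain JSON; `append_processed_id` is a hypothetical helper name:

```python
import json
import os
import tempfile

def append_processed_id(session_path: str, post_id: str) -> None:
    """Read-modify-write the session file, then atomically swap it into
    place so a crash mid-write never leaves truncated JSON behind."""
    with open(session_path, 'r', encoding='utf-8') as f:
        data = json.load(f)
    state = data.setdefault('download_state', {})
    ids = state.setdefault('processed_post_ids', [])
    ids.append(post_id)
    tmp_path = session_path + '.tmp'
    with open(tmp_path, 'w', encoding='utf-8') as f:
        json.dump(data, f, indent=2)
    os.replace(tmp_path, session_path)  # atomic rename on POSIX and Windows

session = os.path.join(tempfile.mkdtemp(), 'session.json')
with open(session, 'w', encoding='utf-8') as f:
    json.dump({}, f)
append_processed_id(session, 'post_1')
```

`os.replace` (unlike `os.rename` on Windows) overwrites the destination in one step, which is why the worker pairs it with a lock to serialize whole read-modify-write cycles.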
@@ -1389,7 +1745,13 @@ class PostProcessorWorker:
except OSError as e_rmdir:
self.logger(f"   ⚠️ Could not remove empty post-specific subfolder '{path_to_check_for_emptiness}': {e_rmdir}")
result_tuple = (total_downloaded_this_post, total_skipped_this_post,
                kept_original_filenames_for_log, retryable_failures_this_post,
                permanent_failures_this_post, history_data_for_this_post,
                None)  # 7th item is None: any single-PDF temp file was already saved above
self._emit_signal('worker_finished', result_tuple)
return result_tuple
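When `single_pdf_mode` is active, the seventh slot of the worker's result carries the path of a small per-post JSON payload written earlier for later compilation. A sketch of that handoff file's write/read round trip, with the `tmp_<post_id>_<hex>.json` naming taken from the code above (the directory name is an assumption for this example):

```python
import json
import os
import tempfile
import uuid

def write_post_payload(temp_dir: str, post_id: str, title: str, text: str) -> str:
    """Write one post's text to a uniquely named JSON file and return its path."""
    os.makedirs(temp_dir, exist_ok=True)
    # uuid suffix keeps concurrent workers from colliding on the same post id
    path = os.path.join(temp_dir, f"tmp_{post_id}_{uuid.uuid4().hex[:8]}.json")
    with open(path, 'w', encoding='utf-8') as f:
        json.dump({'title': title, 'content': text}, f, indent=2)
    return path

tmp = tempfile.mkdtemp()
p = write_post_payload(tmp, '123', 'A Post', 'body text')
with open(p, encoding='utf-8') as f:
    print(json.load(f)['title'])  # A Post
```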
class DownloadThread(QThread):
progress_signal = pyqtSignal(str)
@@ -1434,9 +1796,15 @@ class DownloadThread (QThread ):
use_cookie=False,
scan_content_for_images=False,
creator_download_folder_ignore_words=None,
use_date_prefix_for_subfolder=False,
keep_in_post_duplicates=False,
cookie_text="",
session_file_path=None,
session_lock=None,
text_only_scope=None,
text_export_format='txt',
single_pdf_mode=False,
project_root_dir=None,
):
super().__init__()
self.api_url_input = api_url_input
@@ -1486,10 +1854,17 @@ class DownloadThread (QThread ):
self.manga_date_file_counter_ref = manga_date_file_counter_ref
self.scan_content_for_images = scan_content_for_images
self.creator_download_folder_ignore_words = creator_download_folder_ignore_words
self.use_date_prefix_for_subfolder = use_date_prefix_for_subfolder
self.keep_in_post_duplicates = keep_in_post_duplicates
self.manga_global_file_counter_ref = manga_global_file_counter_ref
self.session_file_path = session_file_path
self.session_lock = session_lock
self.history_candidates_buffer = deque(maxlen=8)
self.text_only_scope = text_only_scope
self.text_export_format = text_export_format
self.single_pdf_mode = single_pdf_mode
self.project_root_dir = project_root_dir
if self.compress_images and Image is None:
self.logger("⚠️ Image compression disabled: Pillow library not found (DownloadThread).")
self.compress_images = False
@@ -1512,13 +1887,21 @@ class DownloadThread (QThread ):
self.logger("⏭️ Skip requested for current file (single-thread mode).")
self.skip_current_file_flag.set()
else: self.logger("   Skip file: No download active or skip flag not available for current context.")
def run(self):
    """
    Main execution method for the single-threaded download process.
    Unpacks the seven values returned by each worker and passes the
    'single_pdf_mode' setting through to every PostProcessorWorker.
    """
grand_total_downloaded_files = 0
grand_total_skipped_files = 0
grand_list_of_kept_original_filenames = []
was_process_cancelled = False
# Initialize the manga-mode date counter on first run
if self.manga_mode_active and self.manga_filename_style == STYLE_DATE_BASED and not self.extract_links_only and self.manga_date_file_counter_ref is None:
# Determine the directory to scan for existing numbered files
series_scan_dir = self.output_dir
if self.use_subfolders:
if self.filter_character_list_objects_initial and self.filter_character_list_objects_initial[0] and self.filter_character_list_objects_initial[0].get("name"):
@@ -1527,35 +1910,46 @@ class DownloadThread (QThread ):
elif self.service and self.user_id:
creator_based_folder_name = clean_folder_name(str(self.user_id))
series_scan_dir = os.path.join(series_scan_dir, creator_based_folder_name)
highest_num = 0
if os.path.isdir(series_scan_dir):
self.logger(f"   [Thread] Manga Date Mode: Scanning for existing files in '{series_scan_dir}'...")
for dirpath, _, filenames_in_dir in os.walk(series_scan_dir):
for filename_to_check in filenames_in_dir:
# Check for an optional prefix defined by the user
prefix_to_check = clean_filename(self.manga_date_prefix.strip()) if self.manga_date_prefix and self.manga_date_prefix.strip() else ""
name_part_to_match = filename_to_check
if prefix_to_check and name_part_to_match.startswith(prefix_to_check):
name_part_to_match = name_part_to_match[len(prefix_to_check):].lstrip()
# Use regex to find the number at the start of the filename
base_name_no_ext = os.path.splitext(name_part_to_match)[0]
match = re.match(r"(\d+)", base_name_no_ext)
if match:
    highest_num = max(highest_num, int(match.group(1)))
# Initialize the shared counter to the next number, protected by a thread lock
self.manga_date_file_counter_ref = [highest_num + 1, threading.Lock()]
self.logger(f"   [Thread] Manga Date Mode: Initialized date-based counter at {self.manga_date_file_counter_ref[0]}.")
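The scan that seeds the date-based counter boils down to: walk the series folder, strip an optional user prefix, and take the largest leading number found in any filename. A standalone sketch (`next_file_number` is a hypothetical helper name):

```python
import os
import re
import tempfile

def next_file_number(scan_dir: str, prefix: str = "") -> int:
    """Return 1 + the largest leading number found in any filename under
    scan_dir, after stripping an optional user-defined prefix."""
    highest = 0
    for dirpath, _, filenames in os.walk(scan_dir):
        for name in filenames:
            if prefix and name.startswith(prefix):
                name = name[len(prefix):].lstrip()
            stem = os.path.splitext(name)[0]
            m = re.match(r"(\d+)", stem)  # number must be at the start
            if m:
                highest = max(highest, int(m.group(1)))
    return highest + 1

d = tempfile.mkdtemp()
for fname in ('001.jpg', '017 cover.png', 'notes.txt'):
    open(os.path.join(d, fname), 'w').close()
print(next_file_number(d))  # 18
```

In the thread above, the resulting value is stored alongside a `threading.Lock()` so concurrent workers can increment the shared counter safely.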
if self.manga_mode_active and self.manga_filename_style == STYLE_POST_TITLE_GLOBAL_NUMBERING and not self.extract_links_only and self.manga_global_file_counter_ref is None:
# Initialize the shared counter at 1, protected by a thread lock
self.manga_global_file_counter_ref = [1, threading.Lock()]
self.logger(f"   [Thread] Manga Title+GlobalNum Mode: Initialized global counter at {self.manga_global_file_counter_ref[0]}.")
worker_signals_obj = PostProcessorSignals()
try:
# Connect signals
worker_signals_obj.progress_signal.connect(self.progress_signal)
worker_signals_obj.file_download_status_signal.connect(self.file_download_status_signal)
worker_signals_obj.file_progress_signal.connect(self.file_progress_signal)
worker_signals_obj.external_link_signal.connect(self.external_link_signal)
worker_signals_obj.missed_character_post_signal.connect(self.missed_character_post_signal)
worker_signals_obj.file_successfully_downloaded_signal.connect(self.file_successfully_downloaded_signal)
worker_signals_obj.worker_finished_signal.connect(lambda result: None)  # No-op slot; results are consumed from process()'s return value
self.logger("   Starting post fetch (single-threaded download process)...")
post_generator = download_from_api(
self.api_url_input,
@@ -1571,12 +1965,17 @@ class DownloadThread (QThread ):
app_base_dir=self.app_base_dir,
manga_filename_style_for_sort_check=self.manga_filename_style if self.manga_mode_active else None
)
for posts_batch_data in post_generator:
if self.isInterruptionRequested():
was_process_cancelled = True
break
for individual_post_data in posts_batch_data:
if self.isInterruptionRequested():
was_process_cancelled = True
break
# Create the worker for this post
post_processing_worker = PostProcessorWorker(
post_data=individual_post_data,
download_root=self.output_dir,
@@ -1618,14 +2017,25 @@ class DownloadThread (QThread ):
manga_global_file_counter_ref=self.manga_global_file_counter_ref,
use_cookie=self.use_cookie,
manga_date_file_counter_ref=self.manga_date_file_counter_ref,
use_date_prefix_for_subfolder=self.use_date_prefix_for_subfolder,
keep_in_post_duplicates=self.keep_in_post_duplicates,
creator_download_folder_ignore_words=self.creator_download_folder_ignore_words,
session_file_path=self.session_file_path,
session_lock=self.session_lock,
text_only_scope=self.text_only_scope,
text_export_format=self.text_export_format,
single_pdf_mode=self.single_pdf_mode,
project_root_dir=self.project_root_dir
)
try:
# Unpack the seven values returned by the worker
(dl_count, skip_count, kept_originals_this_post,
 retryable_failures, permanent_failures,
 history_data, temp_filepath) = post_processing_worker.process()
grand_total_downloaded_files += dl_count
grand_total_skipped_files += skip_count
if kept_originals_this_post:
grand_list_of_kept_original_filenames.extend(kept_originals_this_post)
if retryable_failures:
@@ -1635,26 +2045,33 @@ class DownloadThread (QThread ):
self.post_processed_for_history_signal.emit(history_data)
if permanent_failures:
self.permanent_file_failed_signal.emit(permanent_failures)
# In single-threaded text mode, pass the temp file path back to the main window
if self.single_pdf_mode and temp_filepath:
self.progress_signal.emit(f"TEMP_FILE_PATH:{temp_filepath}")
except Exception as proc_err:
post_id_for_err = individual_post_data.get('id', 'N/A')
self.logger(f"❌ Error processing post {post_id_for_err} in DownloadThread: {proc_err}")
traceback.print_exc()
num_potential_files_est = len(individual_post_data.get('attachments', [])) + (1 if individual_post_data.get('file') else 0)
grand_total_skipped_files += num_potential_files_est
if self.skip_current_file_flag and self.skip_current_file_flag.is_set():
self.skip_current_file_flag.clear()
self.logger("   Skip current file flag was processed and cleared by DownloadThread.")
self.msleep(10)
if was_process_cancelled:
break
if not was_process_cancelled and not self.isInterruptionRequested():
self.logger("✅ All posts processed or end of content reached by DownloadThread.")
except Exception as main_thread_err:
self.logger(f"\n❌ Critical error within DownloadThread run loop: {main_thread_err}")
traceback.print_exc()
finally:
try:
# Disconnect signals
if worker_signals_obj:
worker_signals_obj.progress_signal.disconnect(self.progress_signal)
worker_signals_obj.file_download_status_signal.disconnect(self.file_download_status_signal)
@@ -1662,10 +2079,13 @@ class DownloadThread (QThread ):
worker_signals_obj.file_progress_signal.disconnect(self.file_progress_signal)
worker_signals_obj.missed_character_post_signal.disconnect(self.missed_character_post_signal)
worker_signals_obj.file_successfully_downloaded_signal.disconnect(self.file_successfully_downloaded_signal)
except (TypeError, RuntimeError) as e:
self.logger(f"   Note during DownloadThread signal disconnection: {e}")
# Emit the final signal with all collected results
self.finished_signal.emit(grand_total_downloaded_files, grand_total_skipped_files, self.isInterruptionRequested(), grand_list_of_kept_original_filenames)
def receive_add_character_result(self, result):
with QMutexLocker(self.prompt_mutex):
self._add_character_response = result

View File

@@ -22,13 +22,17 @@ def get_app_icon_object():
if _app_icon_cache and not _app_icon_cache.isNull():
return _app_icon_cache
# Declare a single variable to hold the base directory path.
app_base_dir = ""
# Determine the project's base directory, whether running from source or as a bundled app
if getattr(sys, 'frozen', False):
# The application is frozen (e.g., with PyInstaller).
# The base directory is the one containing the executable.
app_base_dir = os.path.dirname(sys.executable)
else:
# The application is running from a .py file.
# This path navigates up from src/ui/assets.py to the project root.
app_base_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '..'))
icon_path = os.path.join(app_base_dir, 'assets', 'Kemono.ico')
@@ -36,6 +40,13 @@ def get_app_icon_object():
    if os.path.exists(icon_path):
        _app_icon_cache = QIcon(icon_path)
    else:
        # If the icon isn't found, especially in a frozen app, check the _MEIPASS directory as a fallback.
        if getattr(sys, 'frozen', False) and hasattr(sys, '_MEIPASS'):
            fallback_icon_path = os.path.join(sys._MEIPASS, 'assets', 'Kemono.ico')
            if os.path.exists(fallback_icon_path):
                _app_icon_cache = QIcon(fallback_icon_path)
                return _app_icon_cache
        print(f"Warning: Application icon not found at {icon_path}")
        _app_icon_cache = QIcon()  # Return an empty icon as a fallback
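The frozen-vs-source branch above is the standard PyInstaller pattern. Factored into a helper (`resolve_base_dir` is a hypothetical name, not in the diff), with the frozen decision injectable so the logic can be exercised without actually bundling the app:

```python
import os
import sys

def resolve_base_dir(module_file, frozen=None, executable=None):
    """Return the application's base directory.

    Frozen (PyInstaller) builds anchor on the directory containing the
    executable; source runs walk up from the module (here src/ui/assets.py)
    to the project root, matching the two '..' hops in the diff."""
    if frozen is None:
        frozen = getattr(sys, 'frozen', False)
    if frozen:
        return os.path.dirname(executable or sys.executable)
    return os.path.abspath(os.path.join(os.path.dirname(module_file), '..', '..'))
```

A one-file PyInstaller build also unpacks bundled data to `sys._MEIPASS`, which is why the icon lookup above keeps that directory as a second fallback.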

View File

@@ -144,7 +144,7 @@ class EmptyPopupDialog (QDialog ):
        self.setMinimumSize(int(400 * scale_factor), int(300 * scale_factor))
        self.parent_app = parent_app_ref
        self.current_scope_mode = self.SCOPE_CREATORS
        self.app_base_dir = app_base_dir
        app_icon = get_app_icon_object()

View File

@@ -126,6 +126,21 @@ class FavoriteArtistsDialog (QDialog ):
        self.artist_list_widget.setVisible(show)

    def _fetch_favorite_artists(self):
        if self.cookies_config['use_cookie']:
            # Check if we can load cookies for at least one of the services.
            kemono_cookies = prepare_cookies_for_request(True, self.cookies_config['cookie_text'], self.cookies_config['selected_cookie_file'], self.cookies_config['app_base_dir'], self._logger, target_domain="kemono.su")
            coomer_cookies = prepare_cookies_for_request(True, self.cookies_config['cookie_text'], self.cookies_config['selected_cookie_file'], self.cookies_config['app_base_dir'], self._logger, target_domain="coomer.su")
            if not kemono_cookies and not coomer_cookies:
                # If cookies are enabled but none could be loaded, show help and stop.
                self.status_label.setText(self._tr("fav_artists_cookies_required_status", "Error: Cookies enabled but could not be loaded for any source."))
                self._logger("Error: Cookies enabled but no valid cookies were loaded. Showing help dialog.")
                cookie_help_dialog = CookieHelpDialog(self.parent_app, self)
                cookie_help_dialog.exec_()
                self.download_button.setEnabled(False)
                return  # Stop further execution
        kemono_fav_url = "https://kemono.su/api/v1/account/favorites?type=artist"
        coomer_fav_url = "https://coomer.su/api/v1/account/favorites?type=artist"
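The gate above reduces to: with cookies enabled, at least one of the two services must yield a usable jar before fetching favorites. A minimal predicate capturing that (`any_cookies_loaded` and the loader callable are assumptions standing in for `prepare_cookies_for_request`):

```python
def any_cookies_loaded(load_for_domain, domains=("kemono.su", "coomer.su")):
    """Return True if the loader produces a non-empty cookie jar for at
    least one target domain; the dialog aborts the fetch otherwise."""
    return any(load_for_domain(domain) for domain in domains)
```

Checking both domains independently means a user with cookies for only one of the sites can still fetch that site's favorites.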

View File

@@ -4,7 +4,7 @@ import sys
# --- PyQt5 Imports ---
from PyQt5.QtCore import QUrl, QSize, Qt
from PyQt5.QtGui import QIcon, QDesktopServices
from PyQt5.QtWidgets import (
    QApplication, QDialog, QHBoxLayout, QLabel, QPushButton, QVBoxLayout,
    QStackedWidget, QScrollArea, QFrame, QWidget

View File

@@ -0,0 +1,83 @@
from PyQt5.QtWidgets import (
    QDialog, QVBoxLayout, QRadioButton, QDialogButtonBox, QButtonGroup,
    QLabel, QComboBox, QHBoxLayout, QCheckBox
)
from PyQt5.QtCore import Qt


class MoreOptionsDialog(QDialog):
    """A dialog for selecting a scope, an export format, and the single-PDF option."""

    SCOPE_CONTENT = "content"
    SCOPE_COMMENTS = "comments"

    def __init__(self, parent=None, current_scope=None, current_format=None, single_pdf_checked=False):
        super().__init__(parent)
        self.setWindowTitle("More Options")
        self.setMinimumWidth(350)
        layout = QVBoxLayout(self)
        self.description_label = QLabel("Please choose the scope for the action:")
        layout.addWidget(self.description_label)
        self.radio_button_group = QButtonGroup(self)
        self.radio_content = QRadioButton("Description/Content")
        self.radio_comments = QRadioButton("Comments")
        self.radio_button_group.addButton(self.radio_content)
        self.radio_button_group.addButton(self.radio_comments)
        layout.addWidget(self.radio_content)
        layout.addWidget(self.radio_comments)
        if current_scope == self.SCOPE_COMMENTS:
            self.radio_comments.setChecked(True)
        else:
            self.radio_content.setChecked(True)
        export_layout = QHBoxLayout()
        export_label = QLabel("Export as:")
        self.format_combo = QComboBox()
        self.format_combo.addItems(["PDF", "DOCX", "TXT"])
        if current_format and current_format.upper() in ["PDF", "DOCX", "TXT"]:
            self.format_combo.setCurrentText(current_format.upper())
        else:
            self.format_combo.setCurrentText("PDF")
        export_layout.addWidget(export_label)
        export_layout.addWidget(self.format_combo)
        export_layout.addStretch()
        layout.addLayout(export_layout)
        self.single_pdf_checkbox = QCheckBox("Single PDF")
        self.single_pdf_checkbox.setToolTip("If checked, all text from matching posts will be compiled into one single PDF file.")
        self.single_pdf_checkbox.setChecked(single_pdf_checked)
        layout.addWidget(self.single_pdf_checkbox)
        self.format_combo.currentTextChanged.connect(self.update_single_pdf_checkbox_state)
        self.update_single_pdf_checkbox_state(self.format_combo.currentText())
        self.button_box = QDialogButtonBox(QDialogButtonBox.Ok | QDialogButtonBox.Cancel)
        self.button_box.accepted.connect(self.accept)
        self.button_box.rejected.connect(self.reject)
        layout.addWidget(self.button_box)
        self.setLayout(layout)

    def update_single_pdf_checkbox_state(self, text):
        """Enable the Single PDF checkbox only if the format is PDF."""
        is_pdf = (text.upper() == "PDF")
        self.single_pdf_checkbox.setEnabled(is_pdf)
        if not is_pdf:
            self.single_pdf_checkbox.setChecked(False)

    def get_selected_scope(self):
        if self.radio_comments.isChecked():
            return self.SCOPE_COMMENTS
        return self.SCOPE_CONTENT

    def get_selected_format(self):
        return self.format_combo.currentText().lower()

    def get_single_pdf_state(self):
        """Return the state of the Single PDF checkbox."""
        return self.single_pdf_checkbox.isChecked() and self.single_pdf_checkbox.isEnabled()
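The coupling between the format combo and the Single PDF checkbox reduces to a single predicate: the option only counts when the export format is PDF. A stand-alone sketch of that rule (`single_pdf_allowed` is a hypothetical name, mirroring `update_single_pdf_checkbox_state` combined with `get_single_pdf_state`):

```python
def single_pdf_allowed(export_format, checkbox_checked):
    """A checked Single PDF box is only honored when the export format
    is PDF; DOCX/TXT exports ignore (and disable) it."""
    is_pdf = export_format.upper() == "PDF"
    return checkbox_checked and is_pdf
```

Keeping the rule in one place means the dialog's getter can never report a "single PDF" request for a DOCX or TXT export, even if the checkbox was checked before the format changed.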

View File

@@ -0,0 +1,77 @@
# SinglePDF.py
import os

try:
    from fpdf import FPDF
    FPDF_AVAILABLE = True
except ImportError:
    FPDF_AVAILABLE = False


class PDF(FPDF):
    """Custom PDF class to handle headers and footers."""

    # Tracks which family was actually loaded; swapped to 'Arial' when the
    # Unicode DejaVu font cannot be registered, so the footer never
    # references a font that was not added.
    doc_font = 'DejaVu'

    def header(self):
        pass  # No header

    def footer(self):
        # Page number, positioned 1.5 cm from the bottom.
        self.set_y(-15)
        self.set_font(self.doc_font, '', 8)
        self.cell(0, 10, 'Page ' + str(self.page_no()), 0, 0, 'C')


def create_single_pdf_from_content(posts_data, output_filename, font_path, logger=print):
    """
    Create a single PDF from a list of post titles and contents.

    Args:
        posts_data (list): A list of dictionaries, each with 'title' and 'content' keys.
        output_filename (str): The full path for the output PDF file.
        font_path (str): Path to the DejaVuSans.ttf font file.
        logger (callable, optional): A function to log progress and errors. Defaults to print.
    """
    if not FPDF_AVAILABLE:
        logger("❌ PDF creation failed: the 'fpdf2' library is not installed. Please run: pip install fpdf2")
        return False
    if not posts_data:
        logger("   No text content was collected to create a PDF.")
        return False
    pdf = PDF()
    try:
        if not os.path.exists(font_path):
            raise RuntimeError("Font file not found.")
        pdf.add_font('DejaVu', '', font_path, uni=True)
        pdf.add_font('DejaVu', 'B', font_path, uni=True)  # Reuse the regular face for bold.
    except Exception as font_error:
        logger(f"   ⚠️ Could not load DejaVu font: {font_error}")
        logger("   PDF may not support all characters. Falling back to the built-in Arial font.")
        pdf.doc_font = 'Arial'  # Core font; Latin-1 coverage only.
    logger(f"   Starting PDF creation with content from {len(posts_data)} posts...")
    for post in posts_data:
        pdf.add_page()
        # Post title, left-aligned.
        pdf.set_font(pdf.doc_font, 'B', 16)
        pdf.multi_cell(w=0, h=10, text=post.get('title', 'Untitled Post'), align='L')
        pdf.ln(5)  # A little space after the title.
        # Post content.
        pdf.set_font(pdf.doc_font, '', 12)
        pdf.multi_cell(w=0, h=7, text=post.get('content', 'No Content'))
    try:
        pdf.output(output_filename)
        logger(f"✅ Successfully created single PDF: '{os.path.basename(output_filename)}'")
        return True
    except Exception as e:
        logger(f"❌ A critical error occurred while saving the final PDF: {e}")
        return False
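The font-loading fallback above is the interesting decision point: use the Unicode DejaVu face when the TTF is on disk, otherwise degrade to fpdf's built-in Arial family, which only covers Latin-1. Isolated as a helper (`pick_pdf_font` is a hypothetical name, with an injectable existence check so it runs without the font file):

```python
import os

def pick_pdf_font(font_path, exists=os.path.exists):
    """Return (family, unicode_ok): DejaVu with full Unicode coverage when
    the TTF file exists, else fpdf's core Arial family (Latin-1 only)."""
    if font_path and exists(font_path):
        return ('DejaVu', True)
    return ('Arial', False)
```

Bundling DejaVuSans.ttf with the app is what makes non-Latin post content render at all; the Arial path exists only so a missing font degrades to mojibake-free ASCII output instead of a crash.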

File diff suppressed because it is too large.