feat: add admin panel, Replicate AI translation, and document translation

- Admin panel (/admin) with JWT auth: configure Replicate API token,
  JigsawStack API key, model version, enable/disable AI translation,
  change admin password. Settings persisted in data/settings.json.

- Replicate AI translation: POST /api/translate/replicate uses
  JigsawStack text-translate model via Replicate API. Main page
  switches to client-side AI translation when enabled.

- Document translation tab: supports PDF, DOCX, XLSX, XLS, CSV.
  Excel/Word formatting fully preserved (SheetJS + JSZip XML manipulation).
  PDF uses pdf-parse extraction + pdf-lib reconstruction.
  Column selector UI for tabular data (per-sheet, All/None toggles).

- Updated README with full implementation documentation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-10 07:43:54 +01:00
parent 0190ea5da9
commit 0799101da3
23 changed files with 18595 additions and 261 deletions

397
README.md

@@ -1,241 +1,256 @@
# Lingva Translate
# LingvAI
<img src="public/logo.svg" width="128" align="right">
[![Travis Build](https://travis-ci.com/thedaviddelta/lingva-translate.svg?branch=main)](https://travis-ci.com/thedaviddelta/lingva-translate)
[![Vercel Status](https://img.shields.io/github/deployments/thedaviddelta/lingva-translate/Production?label=vercel&logo=vercel&color=f5f5f5)](https://lingva.ml/)
[![Cypress Tests](https://img.shields.io/endpoint?url=https://dashboard.cypress.io/badge/simple/qgjdyd&style=flat&logo=cypress)](https://dashboard.cypress.io/projects/qgjdyd/runs)
[![License](https://img.shields.io/github/license/thedaviddelta/lingva-translate)](./LICENSE)
[![Awesome Humane Tech](https://raw.githubusercontent.com/humanetech-community/awesome-humane-tech/main/humane-tech-badge.svg?sanitize=true)](https://github.com/humanetech-community/awesome-humane-tech)
[<img src="https://www.datocms-assets.com/31049/1618983297-powered-by-vercel.svg" alt="Powered by Vercel" height="20">](https://vercel.com?utm_source=lingva-team&utm_campaign=oss)
Alternative front-end for Google Translate, serving as a Free and Open Source translator with over a hundred languages available
**LingvAI** is an enhanced fork of [Lingva Translate](https://github.com/thedaviddelta/lingva-translate), a privacy-respecting alternative front-end for Google Translate, extended with **Replicate AI-powered translation**, an **admin panel**, and **document translation** (PDF, Word, Excel, CSV) that preserves formatting for Word and Excel files and is best-effort for PDF.
---
## How does it work?
## Features
Inspired by projects like [NewPipe](https://github.com/TeamNewPipe/NewPipe), [Nitter](https://github.com/zedeus/nitter), [Invidious](https://github.com/iv-org/invidious) or [Bibliogram](https://git.sr.ht/~cadence/bibliogram), *Lingva* scrapes through Google Translate and retrieves the translation without directly accessing any Google-related service, preventing them from tracking.
### Core (from Lingva Translate)
- 100+ languages via Google Translate scraper (no tracking)
- Audio playback for source and translated text
- Auto-translate mode
- GraphQL and REST API
- PWA support (installable)
- Dark/light mode
For this purpose, *Lingva* is built, among others, with the following Open Source resources:
### New in LingvAI
+ [Lingva Scraper](https://github.com/thedaviddelta/lingva-scraper), a Google Translate scraper built and maintained specifically for this project, which obtains all kinds of information from this platform.
+ [TypeScript](https://www.typescriptlang.org/), the JavaScript superset, as the language.
+ [React](https://reactjs.org/) as the main front-end framework.
+ [Next.js](https://nextjs.org/) as the complementary React framework, that provides Server-Side Rendering, Static Site Generation or serverless API endpoints.
+ [ChakraUI](https://chakra-ui.com/) for the in-component styling.
+ [Jest](https://jestjs.io/), [Testing Library](https://testing-library.com/) & [Cypress](https://www.cypress.io/) for unit, integration & E2E testing.
+ [Apollo Server](https://www.apollographql.com/docs/apollo-server/) for handling the GraphQL endpoint.
+ [Inkscape](https://inkscape.org/) for designing both the logo and the banner.
#### Admin Panel (`/admin`)
- Password-protected settings dashboard (gear icon in header)
- Configure Replicate API token and JigsawStack API key
- Select/change the Replicate model version
- Enable/disable AI translation per-instance
- Test translation button
- Change admin password
- Settings stored server-side in `data/settings.json` (never committed)
#### Replicate AI Translation
- When enabled in admin, uses [Replicate](https://replicate.com) + [JigsawStack](https://jigsawstack.com) text-translate model
- Replaces Google Translate scraper with AI translation when active
- Falls back to original lingva-scraper when Replicate is disabled
- Batch translation with separator trick for efficiency
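
The README does not spell out the separator trick, so the following is only a minimal sketch of one way it could work: several strings are joined with a delimiter, translated in a single request, and split apart afterwards. The names `SEPARATOR`, `joinBatch` and `splitBatch` are hypothetical and are not taken from `utils/replicate-translate.ts`.

```typescript
// Hypothetical delimiter; the real helper may use a different one.
const SEPARATOR = "\n|||\n";

// Join several strings into one payload so a single API call can translate them all.
function joinBatch(texts: string[]): string {
    return texts.join(SEPARATOR);
}

// Split a translated payload back into its pieces. Splitting on the bare "|||"
// and trimming tolerates whitespace changes around the delimiter.
// Returns null when the piece count does not match, so the caller can
// fall back to translating each string individually.
function splitBatch(translated: string, expectedCount: number): string[] | null {
    const parts = translated.split("|||").map(part => part.trim());
    return parts.length === expectedCount ? parts : null;
}
```

A scheme like this breaks if the source text itself contains the delimiter or if the model rewrites it, which is why the sketch reports a count mismatch instead of guessing.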
## Deployment
#### Document Translation (new "Document" tab)
Translate whole documents while preserving original formatting:
As *Lingva* is a [Next.js](https://nextjs.org/) project you can deploy your own instance anywhere Next is supported.
| Format | Formatting | Notes |
|--------|-----------|-------|
| `.xlsx` / `.xls` | **Fully preserved** | Cell styles, formulas, column widths intact. Select which columns to translate. |
| `.docx` | **Fully preserved** | Fonts, tables, images, paragraph styles preserved via XML manipulation |
| `.csv` | Structure preserved | Column selection supported |
| `.pdf` | Best-effort | Text extracted, translated, new formatted PDF generated |
The only requirement is to set an environment variable called `NEXT_PUBLIC_SITE_DOMAIN` with the domain you're deploying the instance under. This is used for the canonical URL and the meta tags.
- Drag-and-drop or click-to-upload (up to 50 MB)
- **Column selector** for Excel/CSV: choose individual columns, use All/None toggles per sheet
- Download translated file (named `original_<lang>.ext`)
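
The `original_<lang>.ext` naming rule can be illustrated with a small helper. This is a sketch under assumptions: the function name `translatedFileName` is hypothetical, and the handling of names without an extension is a guess, since the README only shows the common case.

```typescript
// Build the download name for a translated file: "<base>_<lang><.ext>".
function translatedFileName(originalName: string, targetLang: string): string {
    const dot = originalName.lastIndexOf(".");
    // No extension (or a leading-dot name such as ".env"): append the language code only.
    if (dot <= 0) return `${originalName}_${targetLang}`;
    const base = originalName.slice(0, dot);
    const ext = originalName.slice(dot); // includes the leading "."
    return `${base}_${targetLang}${ext}`;
}
```

For example, `translatedFileName("report.xlsx", "es")` yields `report_es.xlsx`.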
Optionally, there are other environment variables available:
+ `NEXT_PUBLIC_FORCE_DEFAULT_THEME`: Force a certain theme over the system preference set by the user. The accepted values are `light` and `dark`.
+ `NEXT_PUBLIC_DEFAULT_SOURCE_LANG`: Set an initial *source* language instead of the default `auto`.
+ `NEXT_PUBLIC_DEFAULT_TARGET_LANG`: Set an initial *target* language instead of the default `en`.
---
### Docker
## Getting Started
An [official Docker image](https://hub.docker.com/r/thedaviddelta/lingva-translate) is available to ease the deployment using Compose, Kubernetes or similar technologies. Remember to also include the environment variables (simplified to `site_domain`, `force_default_theme`, `default_source_lang` and `default_target_lang`) when running the container.
### Prerequisites
- Node.js 16+
- npm
#### Docker Compose:
```
version: '3'
services:
lingva:
container_name: lingva
image: thedaviddelta/lingva-translate:latest
restart: unless-stopped
environment:
- site_domain=lingva.ml
- force_default_theme=light
- default_source_lang=auto
- default_target_lang=en
ports:
- 3000:3000
```
#### Docker Run
### Installation
```bash
docker run -p 3000:3000 -e site_domain=lingva.ml -e force_default_theme=light -e default_source_lang=auto -e default_target_lang=en thedaviddelta/lingva-translate:latest
git clone https://devops.cloudhost.es/CloudHost/LingvAI.git
cd LingvAI
npm install
```
### Vercel
### Environment Variables
Another easy way is to use the Next.js creators' own platform, [Vercel](https://vercel.com/), where you can deploy it for free with the following button.
Create a `.env.local` file:
[![Deploy with Vercel](https://vercel.com/button)](https://vercel.com/new/git/external?repository-url=https%3A%2F%2Fgithub.com%2Fthedaviddelta%2Flingva-translate%2Ftree%2Fmain&env=NEXT_PUBLIC_SITE_DOMAIN&envDescription=Your%20domain&utm_source=lingva-team&utm_campaign=oss)
```env
# Admin panel
ADMIN_PASSWORD=your_secure_password # Default: admin
ADMIN_JWT_SECRET=random_secret_string # Used to sign admin session tokens
# Replicate AI (optional - can also be set via admin panel)
REPLICATE_API_TOKEN=r8_...
JIGSAWSTACK_API_KEY=sk_...
## Instances
These are the currently known *Lingva* instances. Feel free to make a Pull Request including yours (please remember to add `[skip ci]` to the last commit).
| Domain | Hosting | SSL Provider |
|:---------------------------------------------------------------------------:|:-----------------------------------------:|:-----------------------------------------------------------------------------------------------:|
| [lingva.ml](https://lingva.ml/) (Official) | [Vercel](https://vercel.com/) | [Let's Encrypt](https://www.ssllabs.com/ssltest/analyze.html?d=lingva.ml) |
| [translate.igna.wtf](https://translate.igna.wtf/) | [Vercel](https://vercel.com/) | [Let's Encrypt](https://www.ssllabs.com/ssltest/analyze.html?d=translate.igna.wtf) |
| [translate.plausibility.cloud](https://translate.plausibility.cloud/) | [Hetzner](https://hetzner.com/) | [Let's Encrypt](https://www.ssllabs.com/ssltest/analyze.html?d=translate.plausibility.cloud) |
| [lingva.lunar.icu](https://lingva.lunar.icu/) | [Lansol](https://lansol.de/) | [Cloudflare](https://www.ssllabs.com/ssltest/analyze.html?d=lingva.lunar.icu) |
| [translate.projectsegfau.lt](https://translate.projectsegfau.lt/) | Self-hosted | [Let's Encrypt](https://www.ssllabs.com/ssltest/analyze.html?d=translate.projectsegfau.lt) |
| [translate.dr460nf1r3.org](https://translate.dr460nf1r3.org/) | [Netcup](https://netcup.eu/) | [Cloudflare](https://www.ssllabs.com/ssltest/analyze.html?d=translate.dr460nf1r3.org) |
| [lingva.garudalinux.org](https://lingva.garudalinux.org/) | [Hetzner](https://hetzner.com/) | [Cloudflare](https://www.ssllabs.com/ssltest/analyze.html?d=lingva.garudalinux.org) |
| [translate.jae.fi](https://translate.jae.fi/) | Self-hosted | [Let's Encrypt](https://www.ssllabs.com/ssltest/analyze.html?d=translate.jae.fi) |
## Public APIs
Nearly all the *Lingva* instances should supply a pair of public developer APIs: a RESTful one and a GraphQL one.
*Note: both APIs return the translation audio as a `Uint8Array` (served as `number[]` in JSON and `[Int]` in GraphQL) with the contents of the audio buffer.*
### REST API v1
+ GET `/api/v1/:source/:target/:query`
```typescript
{
translation: string
info?: TranslationInfo
}
# Optional: override default languages
NEXT_PUBLIC_DEFAULT_SOURCE_LANG=auto
NEXT_PUBLIC_DEFAULT_TARGET_LANG=en
```
+ GET `/api/v1/audio/:lang/:query`
```typescript
{
audio: number[]
}
### Running
```bash
# Development
npm run dev
# Production
npm run build
npm start
```
+ GET `/api/v1/languages/?:(source|target)`
```typescript
{
languages: [
{
code: string,
name: string
}
]
}
The app runs on [http://localhost:3000](http://localhost:3000) by default.
---
## Configuration
### Setting up Replicate AI Translation
1. Open the app and click the **gear icon** (⚙) in the top-right header
2. Log in with your admin password (default: `admin`)
3. Enter your **Replicate API token** — get one at [replicate.com/account/api-tokens](https://replicate.com/account/api-tokens)
4. Enter your **JigsawStack API key** — get one at [jigsawstack.com](https://jigsawstack.com)
5. Optionally change the **model version** (default is the JigsawStack text-translate model)
6. Toggle **Enable Replicate Translation** on
7. Click **Save Settings**
8. Use **Test Translation** to verify it works
Default model:
```
jigsawstack/text-translate:454df4c49941c05dea05175bd37686d0872c73c1f9366d1c2505db32ade52a89
```
In addition, every endpoint can return an error message with the following structure instead.
```typescript
{
error: string
}
```
### Replicate API call format
### GraphQL API
+ `/api/graphql`
```graphql
query {
translation(source: String target: String query: String!) {
source: {
lang: {
code: String!
name: String!
}
text: String!
audio: [Int]!
detected: {
code: String
name: String
}
typo: String
pronunciation: String
definitions: {
type: String
list: {
definition: String
example: String
field: String
synonyms: [String]
}
}
examples: [String]
similar: [String]
}
target: {
lang: {
code: String!
name: String!
}
text: String!
audio: [Int]!
pronunciation: String
extraTranslations: {
type: String
list: {
word: String
article: String
frequency: Int
meanings: [String]
}
}
}
```bash
curl -X POST \
-H "Authorization: Bearer $REPLICATE_API_TOKEN" \
-H "Content-Type: application/json" \
-H "Prefer: wait" \
-d '{
"version": "jigsawstack/text-translate:454df4c49941c05dea05175bd37686d0872c73c1f9366d1c2505db32ade52a89",
"input": {
"text": "Hello, world!",
"api_key": "<jigsawstack_api_key>",
"target_language": "es"
}
audio(lang: String! query: String!) {
lang: {
code: String!
name: String!
}
text: String!
audio: [Int]!
}
languages(type: SOURCE|TARGET) {
code: String!
name: String!
}
}
}' \
https://api.replicate.com/v1/predictions
```
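
The same request can be assembled in TypeScript. The sketch below only builds the URL, headers and JSON body shown in the curl call; the names `buildReplicateRequest`, `ReplicateRequest` and `DEFAULT_MODEL` are hypothetical and may not match what `utils/replicate-translate.ts` actually does.

```typescript
interface ReplicateRequest {
    url: string;
    init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Default model version as listed in this README.
const DEFAULT_MODEL =
    "jigsawstack/text-translate:454df4c49941c05dea05175bd37686d0872c73c1f9366d1c2505db32ade52a89";

// Mirror the curl call: same endpoint, same headers, same JSON body.
function buildReplicateRequest(
    text: string,
    targetLanguage: string,
    replicateToken: string,
    jigsawApiKey: string,
    version: string = DEFAULT_MODEL
): ReplicateRequest {
    return {
        url: "https://api.replicate.com/v1/predictions",
        init: {
            method: "POST",
            headers: {
                "Authorization": `Bearer ${replicateToken}`,
                "Content-Type": "application/json",
                "Prefer": "wait"
            },
            body: JSON.stringify({
                version,
                input: { text, api_key: jigsawApiKey, target_language: targetLanguage }
            })
        }
    };
}
```

Passing `url` and `init` to `fetch` would send the request. The response shape is not documented in this README, so parsing is left out of the sketch.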
---
## Related projects
## API Endpoints
+ [Lingva Scraper](https://github.com/thedaviddelta/lingva-scraper) - Google Translate scraper built and maintained specifically for this project
+ [SimplyTranslate](https://codeberg.org/SimpleWeb/SimplyTranslate-Web) - Very simple translation front-end with multi-engine support
+ [LibreTranslate](https://github.com/LibreTranslate/LibreTranslate) - FOSS translation service that uses the open [Argos](https://github.com/argosopentech/argos-translate) engine
+ [Lentil for Android](https://github.com/yaxarat/lingvaandroid) - Unofficial native client for Android that uses Lingva's public API
+ [Arna Translate](https://github.com/MahanRahmati/translate) - Unofficial cross-platform native client that uses Lingva's public API
+ [Translate-UT](https://github.com/walking-octopus/translate-ut) - Unofficial native client for Ubuntu Touch that uses Lingva's public API
### Original REST API
```
GET /api/v1/:source/:target/:query → { translation, info }
GET /api/v1/audio/:lang/:text → { audio: number[] }
GET /api/v1/languages → { languages }
```
### GraphQL
```
POST /api/graphql
```
## Contributors
### New Endpoints
Thanks goes to these wonderful people ([emoji key](https://allcontributors.org/docs/en/emoji-key)):
```
POST /api/translate/replicate
Body: { text: string, targetLanguage: string }
Returns: { translation: string }
<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section -->
<!-- prettier-ignore-start -->
<!-- markdownlint-disable -->
<table>
<tr>
<td align="center"><a href="https://thedaviddelta.com/"><img src="https://avatars.githubusercontent.com/u/6679900?v=4?s=100" width="100px;" alt=""/><br /><sub><b>David</b></sub></a><br /><a href="#a11y-TheDavidDelta" title="Accessibility">️️️️♿️</a> <a href="https://github.com/TheDavidDelta/lingva-translate/commits?author=TheDavidDelta" title="Code">💻</a> <a href="https://github.com/TheDavidDelta/lingva-translate/commits?author=TheDavidDelta" title="Documentation">📖</a> <a href="#design-TheDavidDelta" title="Design">🎨</a> <a href="https://github.com/TheDavidDelta/lingva-translate/commits?author=TheDavidDelta" title="Tests">⚠️</a></td>
<td align="center"><a href="https://github.com/mhmdanas"><img src="https://avatars.githubusercontent.com/u/32234660?v=4?s=100" width="100px;" alt=""/><br /><sub><b>Mohammed Anas</b></sub></a><br /><a href="https://github.com/TheDavidDelta/lingva-translate/commits?author=mhmdanas" title="Code">💻</a></td>
<td align="center"><a href="https://PussTheCat.org/"><img src="https://avatars.githubusercontent.com/u/47571719?v=4?s=100" width="100px;" alt=""/><br /><sub><b>TheFrenchGhosty</b></sub></a><br /><a href="https://github.com/TheDavidDelta/lingva-translate/commits?author=TheFrenchGhosty" title="Documentation">📖</a></td>
</tr>
</table>
POST /api/translate/document
Body: multipart/form-data
file: <file>
targetLanguage: string
action: "translate" | "getColumns"
columnSelections?: JSON string (for Excel/CSV)
Returns: file download (translate) or { columns } (getColumns)
<!-- markdownlint-restore -->
<!-- prettier-ignore-end -->
GET /api/admin/auth → { authenticated: boolean }
POST /api/admin/auth body: { password } → sets session cookie
DELETE /api/admin/auth → clears session cookie
<!-- ALL-CONTRIBUTORS-LIST:END -->
GET /api/admin/settings → { replicateApiToken, jigsawApiKey, modelVersion, replicateEnabled }
POST /api/admin/settings body: { ...settings, newPassword? }
```
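
A client calling `POST /api/translate/replicate` might build and validate its payloads as sketched below. The helper names are hypothetical. The `{ error: string }` failure shape is an assumption borrowed from the original REST API; the README does not state how the new endpoint reports errors.

```typescript
type TranslateResult =
    | { ok: true; translation: string }
    | { ok: false; error: string };

// Serialize the documented request body: { text, targetLanguage }.
function buildTranslateBody(text: string, targetLanguage: string): string {
    return JSON.stringify({ text, targetLanguage });
}

// Validate an already-parsed JSON response before trusting its fields.
function parseTranslateResponse(payload: unknown): TranslateResult {
    if (typeof payload === "object" && payload !== null) {
        const record = payload as Record<string, unknown>;
        if (typeof record.translation === "string") {
            return { ok: true, translation: record.translation };
        }
        // Assumed error shape, not confirmed for this endpoint.
        if (typeof record.error === "string") {
            return { ok: false, error: record.error };
        }
    }
    return { ok: false, error: "Unexpected response shape" };
}
```

The body from `buildTranslateBody` would be sent with a `Content-Type: application/json` header, and the decoded JSON response passed to `parseTranslateResponse`.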
This project follows the [all-contributors](https://github.com/all-contributors/all-contributors) specification. Contributions of any kind welcome!
---
## Architecture
```
pages/
[[...slug]].tsx Main translation page (Text + Document tabs)
admin/index.tsx Admin settings panel
api/
admin/
auth.ts JWT-based admin authentication
settings.ts Settings read/write (requires admin auth)
translate/
replicate.ts Replicate AI translation endpoint
document.ts Document upload & translation endpoint
v1/[[...slug]].ts Original REST API
graphql.ts Original GraphQL API
components/
DocumentTranslator.tsx File upload UI, progress, download
ColumnSelector.tsx Per-sheet column selection for Excel/CSV
Header.tsx + admin gear icon link
utils/
settings-store.ts Read/write data/settings.json
admin-auth.ts JWT sign/verify helpers
replicate-translate.ts Replicate API calls + batch helper
document-processors/
excel.ts SheetJS Excel/CSV processor
docx.ts JSZip + XML DOCX processor
pdf.ts pdf-parse + pdf-lib PDF processor
```
---
## Docker
```dockerfile
FROM node:18-alpine
WORKDIR /app
COPY . .
RUN npm install && npm run build
EXPOSE 3000
CMD ["npm", "start"]
```
Or using the included `Dockerfile`:
```bash
docker build -t lingvai .
docker run -p 3000:3000 \
-e ADMIN_PASSWORD=secret \
-e ADMIN_JWT_SECRET=random \
-v "$(pwd)/data:/app/data" \
lingvai
```
> Mount `./data` as a volume to persist admin settings across container restarts.
---
## Tech Stack
| Layer | Technology |
|-------|-----------|
| Framework | Next.js 12, React 18, TypeScript |
| UI | Chakra UI 2, Framer Motion |
| Translation (default) | lingva-scraper (Google Translate) |
| Translation (AI) | Replicate + JigsawStack text-translate |
| Document processing | SheetJS (xlsx), JSZip, pdf-lib, pdf-parse |
| Admin auth | jose (JWT), HTTP-only cookie |
| File uploads | formidable v3 |
| API | REST + GraphQL (Apollo Server) |
---
## License
[![](https://www.gnu.org/graphics/agplv3-with-text-162x68.png)](https://www.gnu.org/licenses/agpl-3.0.html)
[AGPL-3.0](./LICENSE) — same as the upstream Lingva Translate project.
Copyright © 2021 [thedaviddelta](https://github.com/thedaviddelta) & contributors.
This project is [GNU AGPLv3](./LICENSE) licensed.
Original project by [thedaviddelta](https://github.com/thedaviddelta/lingva-translate).
LingvAI enhancements: admin panel, Replicate AI integration, document translation.