Chinese documentation: README_zh.md
ruoyi_talk · v1.0.0 · MIT
Downloads the articles you actually care about.
payload is stage 2 of the ruoyi_talk knowledge pipeline. It reads the article queue produced by an upstream source (e.g. fairing), fetches full content locally — PDFs for papers, Markdown for blog posts — and manages a searchable local knowledge base. An interactive shell lets you browse, fetch, star, and open articles without leaving the terminal.
    upstream (fairing / any producer)
      └─ QUEUE_DIR/payload_queue.json

    payload (this project)
      ├─ reads   QUEUE_DIR/payload_queue.json
      ├─ tracks  PAYLOAD_DATA_DIR/downloaded.jsonl
      │                          /failed.jsonl
      └─ writes  KNOWLEDGE_DIR/
                   <slug>.pdf / .md
                   preferred/<slug>.pdf / .md
                   index.json
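
The flow above implies that an article is "pending" when it appears in the queue but not yet in `downloaded.jsonl`. A minimal sketch of that check follows. It assumes the queue is a JSON list of article dicts and that each log line is a JSON object carrying an `article_id`; neither file format is documented here, and `pending_articles` and the sample URLs are hypothetical.

```python
import json
from typing import Iterable


def pending_articles(queue: list[dict], downloaded_lines: Iterable[str]) -> list[dict]:
    """Return queued articles whose IDs do not appear in the download log."""
    done = set()
    for line in downloaded_lines:
        line = line.strip()
        if line:
            # Assumption: one JSON object per line, each with "article_id".
            done.add(json.loads(line)["article_id"])
    return [a for a in queue if a["article_id"] not in done]


# Hypothetical sample data; real queue entries may carry more fields.
queue = [
    {"article_id": "a" * 16, "url": "https://arxiv.org/abs/0000.00000"},
    {"article_id": "b" * 16, "url": "https://example.com/post"},
]
log = ['{"article_id": "aaaaaaaaaaaaaaaa"}']
print([a["url"] for a in pending_articles(queue, log)])
```

Keeping the function free of file access makes it easy to test; a caller would pass in the parsed contents of `payload_queue.json` and the lines of `downloaded.jsonl`.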
    git clone https://github.com/JiekerTime/payload.git
    cd payload

    # macOS / Linux
    python3 -m venv .venv && source .venv/bin/activate
    ./run.sh

    # Windows
    .\run.bat

run.sh / run.bat create the virtualenv, install dependencies, and launch the interactive shell automatically.
Copy .env.example to .env and set at minimum QUEUE_DIR:
| Variable | Required | Default | Description |
|---|---|---|---|
| `QUEUE_DIR` | Yes | (none) | Directory containing payload_queue.json |
| `PAYLOAD_DATA_DIR` | No | `~/Documents/payload` | payload's own state files |
| `KNOWLEDGE_DIR` | No | `~/files/OneDrive/ruoyi_knowledge` | Downloaded articles and index |
| `FIRECRAWL_API_KEY` | No | (none) | Enables high-fidelity web → Markdown (optional) |
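
As an illustration of how these variables and defaults fit together, here is a hedged sketch of a settings loader. It is not payload's actual code: `load_settings` is a hypothetical name, and it assumes the values from `.env` have already been exported into the process environment.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve the four variables from the table, applying its defaults."""
    env = os.environ if env is None else env
    if not env.get("QUEUE_DIR"):
        # QUEUE_DIR is the only required variable.
        raise ValueError("QUEUE_DIR is required")
    return {
        "queue_dir": Path(env["QUEUE_DIR"]).expanduser(),
        "data_dir": Path(env.get("PAYLOAD_DATA_DIR", "~/Documents/payload")).expanduser(),
        "knowledge_dir": Path(
            env.get("KNOWLEDGE_DIR", "~/files/OneDrive/ruoyi_knowledge")
        ).expanduser(),
        # An empty or missing key means the optional Firecrawl path is disabled.
        "firecrawl_api_key": env.get("FIRECRAWL_API_KEY") or None,
    }
```

Passing a plain dict instead of `os.environ` keeps the function testable, for example `load_settings({"QUEUE_DIR": "/tmp/queue"})`.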
Launch the shell: python main.py (or ./run.sh)
Non-interactive: python main.py run [--dry-run]
| Shortcut | Command | Description |
|---|---|---|
| `\r` | `run [--dry-run]` | Fetch all pending articles from the queue |
| `\ls` | `list` | Browse queue with pagination; [1-N] to fetch, [q] to quit |
| `\fs` | `search <kw…>` | Filter queue by title keyword; select to fetch |
| `\dl` | `download <id>` | Fetch a queued article by its 16-char article ID |
| `\rt` | `retry` | Re-fetch any previously fetched article (overwrite) |
| `\f` | `failed` | Show all failed fetch attempts |
| `\l` | `log [N]` | Show fetch history (default: last 20) |
| `\i` | `index [kw]` | Browse knowledge index · [N] star/unstar · o[N] open · d[N] delete |
| `\fv` | `fav` | Show preferred (★) articles |
| `\o` | `open <id>` | Open a downloaded article with the system default app |
| `\st` | `stats` | Knowledge base and queue statistics |
| `\e` | `env [set K V]` | View or update .env variables |
| `\li` | `license` | Show MIT license |
| `\h` / `\?` | `shortcuts` | Show this help |
| `\q` | `quit` | Exit |
Inside \i and \fv:
| Input | Action |
|---|---|
| `[1-N]` | Toggle ★ preferred for that row |
| `o[N]` | Open file with system default app |
| `d[N]` | Delete article (file + index entry); confirm required |
| `[n]` / `[p]` | Next / previous page |
| `[q]` | Quit |
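
To make the input grammar above concrete, the following sketch parses one line of user input into an action and an optional row number. It is an illustration only: `parse_browse_input` and the action names are hypothetical, and the real shell may parse input differently.

```python
import re
from typing import Optional, Tuple


def parse_browse_input(text: str, rows: int) -> Optional[Tuple[str, Optional[int]]]:
    """Map input such as '3', 'o3', 'd3', 'n', 'p', 'q' to (action, row)."""
    text = text.strip().lower()
    if text in ("n", "p", "q"):
        return {"n": "next", "p": "prev", "q": "quit"}[text], None
    # Optional 'o' or 'd' prefix followed by a row number.
    match = re.fullmatch(r"([od]?)(\d+)", text)
    if not match:
        return None
    row = int(match.group(2))
    if not 1 <= row <= rows:
        return None  # row number outside the current page
    action = {"": "toggle_star", "o": "open", "d": "delete"}[match.group(1)]
    return action, row
```

Returning `None` for anything unrecognized lets the caller re-prompt instead of raising; a `delete` result would still need the confirmation step the table mentions.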
Marking an article as preferred moves its file to KNOWLEDGE_DIR/preferred/.
Unmarking moves it back to KNOWLEDGE_DIR/.
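
The move described above can be sketched as follows. This assumes a flat layout where a file sits either directly in `KNOWLEDGE_DIR` or in `KNOWLEDGE_DIR/preferred/`; the function names are hypothetical, and any bookkeeping in `index.json` is left out because its schema is not documented here.

```python
import shutil
from pathlib import Path


def toggled_location(path: Path, knowledge_dir: Path) -> Path:
    """Return where the file should live after toggling its preferred state."""
    preferred_dir = knowledge_dir / "preferred"
    if path.parent == preferred_dir:
        return knowledge_dir / path.name  # unmark: back to the top level
    return preferred_dir / path.name      # mark: into preferred/


def toggle_preferred(path: Path, knowledge_dir: Path) -> Path:
    """Move the file to its toggled location and return the new path."""
    target = toggled_location(path, knowledge_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(path), str(target))
    return target
```

Separating the path computation from the move keeps the decision logic testable without touching the filesystem.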
| Domain | Output | Notes |
|---|---|---|
| `arxiv.org/abs/*` | Full paper | |
| `arxiv.org/pdf/*` | Direct PDF | |
| Any web page | Markdown | Firecrawl if key set, else requests + markdownify |
Web pages: footnote anchor links are rewritten to local #anchor references.
Images are downloaded to <article_id>/images/ and paths updated in the Markdown.
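
One way to picture the footnote rewrite is a regular-expression pass over the converted Markdown. This is a sketch under assumptions, not payload's implementation: it only handles inline links whose target is the page's own URL plus a fragment, and `localize_anchors` is a hypothetical name.

```python
import re


def localize_anchors(markdown: str, page_url: str) -> str:
    """Rewrite links to page_url#fragment so they point at #fragment."""
    base = page_url.split("#", 1)[0]
    # Match "](<base>#fragment)" and capture only the fragment part.
    pattern = re.compile(r"\]\(" + re.escape(base) + r"(#[^)\s]+)\)")
    return pattern.sub(r"](\1)", markdown)


text = "Claim[1](https://example.com/post#fn1), see [other](https://example.com/else#x)."
print(localize_anchors(text, "https://example.com/post"))
```

Links to other pages are left untouched, so only same-page footnote references become local anchors.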
    # payload/handlers/myhandler.py
    from pathlib import Path

    from payload.handlers.base import BaseHandler, DownloadResult


    class MyHandler(BaseHandler):
        # URLs containing any of these patterns are routed to this handler.
        patterns = ["example.com"]

        def download(self, url: str, dest_dir: Path, article: dict) -> DownloadResult:
            ...  # fetch the content and write it to `dest`, a file under dest_dir
            return DownloadResult(
                article_id=article["article_id"],
                url=url, path=dest, source="example", format="md",
            )

Register in payload/router.py before WebHandler.
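
The ordering matters because a catch-all handler would otherwise claim every URL first. The sketch below shows first-match routing with stand-in classes; they are simplified stand-ins, not the real contents of `payload/router.py`, and the `matches` method and `HANDLERS` list are assumptions made for illustration.

```python
class BaseHandler:
    """Stand-in base class: matches when any pattern occurs in the URL."""
    patterns: list = []

    def matches(self, url: str) -> bool:
        return any(p in url for p in self.patterns)


class MyHandler(BaseHandler):
    patterns = ["example.com"]


class WebHandler(BaseHandler):
    """Stand-in catch-all: accepts any web page."""

    def matches(self, url: str) -> bool:
        return True


# Specific handlers first; the catch-all goes last.
HANDLERS = [MyHandler(), WebHandler()]


def route(url: str) -> BaseHandler:
    """Return the first handler that claims the URL."""
    for handler in HANDLERS:
        if handler.matches(url):
            return handler
    raise LookupError(f"no handler for {url}")
```

With this ordering, `route("https://example.com/a")` returns the `MyHandler` instance, while any other URL falls through to `WebHandler`.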
    pip install pytest
    pytest tests/ -v

MIT © JiekerTime (若呓)