Feat/meetup eventalways #23
Caution: Review failed
Failed to post review comments
📝 Walkthrough
Removes six debug scripts and HTML artifacts. Adds five new event scrapers (ClickTheCity, SISTIC, EventAlways, Eventsize, Meetup) and registers them in the scraper registry. Introduces a two-layer deduplication system: an inline post-save hook in the scraper base and a standalone CLI script backed by a pure-Python utility module. Adds backend endpoints for venue detail, organizer CSV export, dedup trigger, and script trigger. Extends the frontend with a venue detail page, clickable event links, an organizer export button, and scraper control buttons.
Changes
Scrapers, Dedup System, and UI Expansion
Sequence Diagram(s)
sequenceDiagram
participant Frontend as Scrapers Page
participant BackendView as api_dedup_trigger / api_script_trigger
participant Subprocess as deduplicate.py / AI script
participant DB as PostgreSQL
rect rgba(59, 130, 246, 0.5)
note over Frontend,DB: Deduplication flow
Frontend->>BackendView: POST /api/scrapers/dedup/
BackendView->>BackendView: acquire _DEDUP_LOCK
BackendView->>Subprocess: subprocess.run deduplicate.py --entity all
Subprocess->>DB: find_*_duplicates(cursor)
DB-->>Subprocess: duplicate groups
Subprocess->>DB: merge_*(cursor, winner_id, loser_ids)
Subprocess-->>BackendView: stdout (summary table)
BackendView-->>Frontend: { output, entity }
end
rect rgba(16, 185, 129, 0.5)
note over Frontend,DB: Script trigger flow
Frontend->>BackendView: POST /api/scripts/classify-events/run/
BackendView->>BackendView: validate against _ALLOWED_SCRIPTS
BackendView->>Subprocess: Popen(detached AI script)
Subprocess-->>BackendView: pid
BackendView-->>Frontend: { started: true, pid }
end
sequenceDiagram
participant Scraper as BaseScraper (save_events/venues/organizers)
participant Hook as _dedup_after_save
participant DB as Django ORM
Scraper->>DB: upsert events/venues/organizers
Scraper->>Hook: _dedup_after_save("events", event_ids)
Hook->>DB: query rows by ids
Hook->>Hook: normalize URLs, group by key
alt duplicates found
Hook->>DB: fill missing fields on winner
Hook->>DB: UPDATE Event FK references
Hook->>DB: DELETE loser rows
end
Hook-->>Scraper: returns (exceptions swallowed)
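The normalize, group, and merge steps of the hook can be illustrated on plain dictionaries. This is a rough sketch under stated assumptions, not the PR's `_dedup_after_save`: the normalization rules, the "lowest id wins" policy, and the helper names `normalize_url` and `dedup_rows` are all hypothetical. Updating foreign-key references and deleting rows is left to the caller.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize_url(url):
    """Build a grouping key: lowercase host without 'www.', path without
    trailing slash; scheme, query string, and fragment are ignored."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def dedup_rows(rows):
    """Group rows by normalized URL and merge each group into one winner.

    Returns (winners, loser_ids). Assumption: the row with the lowest id wins.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[normalize_url(row["url"])].append(row)

    winners, loser_ids = [], []
    for group in groups.values():
        group.sort(key=lambda r: r["id"])
        winner = dict(group[0])  # copy so the input rows stay untouched
        for loser in group[1:]:
            # Fill only fields the winner is missing; never overwrite data.
            for field, value in loser.items():
                if winner.get(field) in (None, "") and value not in (None, ""):
                    winner[field] = value
            loser_ids.append(loser["id"])
        winners.append(winner)
    return winners, loser_ids
```

In a database-backed version, the returned `loser_ids` would drive the "UPDATE Event FK references" and "DELETE loser rows" steps shown in the diagram.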
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
🚥 Pre-merge checks | ✅ 3 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)