refactor: Economy focus, CX via HKG, Telegram bot, anti-bot
Complete realignment of the scanner:
- Premium Economy → Economy (CX via HKG as the main route)
- Telegram bot (@CX_HKG_Alert_bot) with /preis, /best, /status
- SeleniumBase 4.34 → 4.47 (better UC/CDP mode)
- Scrape URL (.de) / booking URL (.com) separation
- GDPR consent handling for Kayak/Momondo
- NODE_SCANNER_SKIP: geo-blocked scanners configurable per node
- Alert counter per node (no spam from known geo-blocks)
- .env files removed from the repo (secrets)
- STATE.md with the current state

Made-with: Cursor
parent
8c6eb7128a
commit
a9cb83871c
12 changed files with 1492 additions and 339 deletions
7
.gitignore
vendored
Normal file
|
|
@ -0,0 +1,7 @@
|
||||||
|
.env
|
||||||
|
*.env
|
||||||
|
.env.*
|
||||||
|
__pycache__/
|
||||||
|
*.pyc
|
||||||
|
*.bak
|
||||||
|
data/
|
||||||
243
STATE.md
Normal file
|
|
@ -0,0 +1,243 @@
|
||||||
|
# STATE: Flugpreisscanner
**As of: 26.02.2026**

---

## Status

🚀 **In operation since 25.02.2026**

| Component | Status |
|-----------|--------|
| flugscanner-hub | ✅ Running (Docker: web + scheduler) |
| flugscanner-asia | ✅ Running (Docker: agent + noVNC) |
| flugscanner-mu | ✅ Running (Docker: agent + noVNC) |
| Forgejo repo | ✅ http://100.89.246.60:3000/orbitalo/flugpreisscanner |
| Dashboard | ✅ http://100.92.161.97:8080 |
| Telegram bot | ✅ @CX_HKG_Alert_bot: alerts + /preis + /best + /status |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Core idea

Automatically find the cheapest flights **FRA → KTI (Frankfurt → Phnom Penh)** every day.
Cabin: **Economy** · Baggage: 1 checked bag + carry-on · Stay: ~2 months
Focus: **Cathay Pacific (CX) via Hong Kong**, the best value for money in Economy.
The AI evaluates: book now or wait?
Scraping deliberately runs from home-network IPs, not from Hetzner (datacenter IPs get blocked).

**Route: 🇭🇰 HKG stopover**, multi-city FRA→HKG (1–2 nights) → KTI → FRA.
Realistic price: **900–1,050 EUR** roundtrip Economy.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Containers

| CT | Name | Server | LAN IP | Tailscale IP | Role |
|----|------|--------|--------|--------------|------|
| 115 | `flugscanner-hub` | pve-hetzner | 10.10.10.115 | 100.92.161.97 | Brain: dashboard + scheduler + AI evaluation (OpenRouter) + DB + job coordination |
| 115 | `flugscanner-asia` | pve1 Cambodia | 192.168.0.131 | 100.112.190.22 | Scraping node A: SeleniumBase CDP + noVNC, home-network IP in Asia |
| 145 | `flugscanner-mu` | helmut-pve Muldenstein | 192.168.178.130 | 100.75.182.15 | Scraping node B: SeleniumBase CDP + noVNC, home-network IP in Germany |
|
||||||
|
|
||||||
|
**Access:**
|
||||||
|
- Hub (pve-hetzner): `ssh root@100.88.230.59` PW: Astral-Proxmox!2026 → `pct exec 115`
|
||||||
|
- Asia (pve1): `ssh root@192.168.0.197` PW: astral66 → `pct exec 115`
|
||||||
|
- Muldenstein: `ssh root@100.75.182.15` PW: astral66 (direkt, kein pct nötig)
|
||||||
|
- helmut-pve: `ssh root@100.87.235.11` PW: astral66
|
||||||
|
|
||||||
|
**Important:**
- Scraping NEVER runs from CT 115 / Hetzner
- CT 115 only coordinates; the nodes execute
- Muldenstein = German IP (best results for Kayak, Momondo)
- Cambodia = Asian IP (Momondo/Traveloka are skipped because of a geo-block)
- **Tailscale on all containers**: secure communication over the tailnet
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## CT 115: Flugpreisscanner hub

**Coordination, evaluation, and dashboard only. NO scraping, NO noVNC here.**

### Services (Docker)

| Service | Container | Port | Role |
|---------|-----------|------|------|
| web | `flugscanner-web` | 8080 | Flask dashboard |
| scheduler | `flugscanner-scheduler` | - | Distribute jobs, trigger the AI, Telegram bot |
|
||||||
|
|
||||||
|
### Paths

```
/opt/flugscanner/
├── hub/
│   ├── docker-compose.yml
│   ├── .env
│   ├── Dockerfile
│   ├── data/
│   │   └── flugscanner.db   ← SQLite database
│   └── src/
│       ├── web.py           ← Flask dashboard + API
│       ├── scheduler.py     ← job coordination + Telegram bot
│       ├── ki.py            ← OpenRouter evaluation + plausibility
│       ├── db.py            ← DB access + init
│       └── requirements.txt
└── node/                    ← (checked out on the nodes)
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Scraping nodes (asia + mu)

### Services (Docker)

| Service | Container | Port | Role |
|---------|-----------|------|------|
| agent | `flugscanner-agent` | 5010 | Receive jobs, start Selenium |
| novnc | `flugscanner-novnc` | 6080 | Watch Chrome live in the browser |

### Paths

```
/opt/flugscanner/node/
├── docker-compose.yml
├── .env                 ← NODE_NAME=flugscanner-asia/mu
├── Dockerfile
└── src/
    ├── agent.py         ← Flask API (POST /job, GET /status)
    ├── worker.py        ← SeleniumBase CDP scraper
    └── requirements.txt
```
|
||||||
|
|
||||||
|
### Communication

```
Hub scheduler → POST http://[Node-Tailscale-IP]:5010/job
{ "scanner": "kayak_multicity", "von": "FRA", "nach": "KTI", "kabine": "economy", ... }

Node replies:
{ "results": [...], "node": "flugscanner-mu", "count": 10, "screenshot_b64": "..." }
```
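
The hub-to-node exchange above could be sketched roughly as follows. This is a minimal illustration, not the code in scheduler.py: `build_job`, `dispatch_job`, and the injected `post` transport are hypothetical names, and only the JSON fields shown above are used.

```python
from typing import Callable

NODE_PORT = 5010  # agent port, taken from the services table


def build_job(scanner: str, von: str = "FRA", nach: str = "KTI",
              kabine: str = "economy") -> dict:
    """Build the JSON body for POST /job (only the fields shown above)."""
    return {"scanner": scanner, "von": von, "nach": nach, "kabine": kabine}


def dispatch_job(node_ip: str, job: dict,
                 post: Callable[[str, dict], dict]) -> list:
    """Send one job to a node and return its result list.

    `post` is a hypothetical transport hook, (url, payload) -> decoded JSON,
    so the sketch runs without network access.
    """
    url = f"http://{node_ip}:{NODE_PORT}/job"
    reply = post(url, job)
    results = reply.get("results", [])
    # Assumption: a mismatch between "count" and the list length is only logged.
    if reply.get("count", len(results)) != len(results):
        print(f"warning: {reply.get('node')} reported an inconsistent count")
    return results
```

In a real deployment `post` might wrap `requests.post(url, json=payload, timeout=...)` and return `.json()`; injecting it keeps the dispatch logic testable without a running node.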
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Scanners

| Scanner | Status | Note |
|---------|--------|------|
| Kayak (roundtrip) | ✅ Active | Best data source, GDPR consent automated |
| **Kayak multi-city CX via HKG** | ✅ Active | Primary scanner: FRA→HKG→KTI→FRA |
| Trip.com | ✅ Active | Good complement, also with CX filter |
| Momondo | ✅ Active | Muldenstein only (geo-block from Asia) |
| Google Flights | ⚠ Limited | Few results, consent problems |
| Traveloka | ⚠ Muldenstein only | Geo-block from Asia |
| Wego | ❌ Disabled | |
| Skyscanner | ❌ Disabled | Bot detection |

### Node-specific restrictions

Momondo and Traveloka are skipped automatically on `flugscanner-asia` (geo-block).
Configuration: `NODE_SCANNER_SKIP` in scheduler.py.
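
A per-node skip list like this could look roughly as follows. The dictionary shape, the lowercase scanner keys, and the helper `scanners_for_node` are assumptions for illustration; the actual `NODE_SCANNER_SKIP` in scheduler.py may be structured differently.

```python
# Assumed shape: node name -> set of scanner keys that must not run there.
NODE_SCANNER_SKIP = {
    "flugscanner-asia": {"momondo", "traveloka"},  # geo-blocked from Asia
}


def scanners_for_node(node: str, scanners: list[str]) -> list[str]:
    """Return the scanners that may run on `node`, preserving order."""
    skipped = NODE_SCANNER_SKIP.get(node, set())
    return [s for s in scanners if s not in skipped]
```

Nodes without an entry, such as `flugscanner-mu`, get the full scanner list unchanged.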
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Anti-bot strategy

- Scan interval: random, **25–45 minutes** (not regular)
- SeleniumBase **UC/CDP mode** (undetected Chromium)
- Dismiss the GDPR consent dialog automatically (Kayak, Momondo)
- **Two different geo-locations** (Cambodia + Germany)
- Scrape URL (.de) kept separate from booking URL (.com), so the user sees international prices
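
The randomized scan interval from the first bullet could be drawn like this. It is a small sketch: the function name and the choice of whole-second granularity are assumptions, not the scheduler's actual code.

```python
import random


def next_scan_delay_seconds(rng=random, min_minutes: int = 25,
                            max_minutes: int = 45) -> int:
    """Pick a uniformly random delay between 25 and 45 minutes, in seconds.

    `rng` can be the `random` module or a seeded `random.Random` instance.
    """
    return rng.randint(min_minutes * 60, max_minutes * 60)
```

A scheduler loop would then sleep for `next_scan_delay_seconds()` between runs, so consecutive scans never follow a fixed rhythm.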
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Telegram Bot
|
||||||
|
|
||||||
|
**Bot:** @CX_HKG_Alert_bot
|
||||||
|
**Token:** `8693839370:AAEPG0t2gA5jkLFH3J8UmstZMkHPdp0aTG4`
|
||||||
|
**Chat-ID:** 674951792
|
||||||
|
|
||||||
|
### Commands
| Command | Function |
|---------|----------|
| /preis | Current CX price via HKG |
| /best | Top 3 cheapest today |
| /status | System status (nodes, last scan time) |

### Automatic messages
| When | What |
|------|------|
| Daily 07:00 | Morning report with price overview |
| CX < 900 € | Price alert |
| Increase > 50 € | Price-increase warning |
| After 3 zero-result runs | Scanner-problem alert (per node) |
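
The per-node zero-result alert could be tracked with a small counter like the one below. This is a sketch under assumptions: the class name, keying by `(node, scanner)`, and firing exactly once when the streak reaches the threshold are illustrative choices, not confirmed behaviour of scheduler.py.

```python
from collections import defaultdict

ZERO_RESULT_THRESHOLD = 3  # "after 3 zero-result runs" from the table above


class ZeroResultTracker:
    """Count consecutive empty scans per (node, scanner) pair."""

    def __init__(self, threshold: int = ZERO_RESULT_THRESHOLD):
        self.threshold = threshold
        self._streak = defaultdict(int)

    def record(self, node: str, scanner: str, result_count: int) -> bool:
        """Record one run; return True only when an alert should be sent."""
        key = (node, scanner)
        if result_count > 0:
            self._streak[key] = 0  # any successful run resets the streak
            return False
        self._streak[key] += 1
        # Fire once, exactly when the streak reaches the threshold.
        return self._streak[key] == self.threshold
```

Because the streak is stored per node, a scanner that returns nothing on one node does not push the counter for the same scanner on another node.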
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Database (SQLite on CT 115)

Path: `/opt/flugscanner/hub/data/flugscanner.db`

| Table | Contents |
|-------|----------|
| jobs | Scheduled scraping jobs (route, provider, interval, airline filter) |
| prices | Raw price data (price, date, provider, node, booking URL, plausible) |
| screenshots | Vision-AI screenshots with cabin-class detection |
| analyses | AI evaluations with timestamp |
| prompts | Editable AI prompts |
| nodes | Registered scraping nodes + status |
| logs | System logs |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## AI evaluation

- Runs automatically after every scraping run
- **Vision AI**: screenshots are classified by gpt-4o-mini (Economy/PE/Business)
- **Plausibility check**: prices of 700–12,000 € for Economy roundtrip
- **Market analysis**: prompt editable in the dashboard
- OpenRouter credit is shown in the dashboard
|
||||||
|
|
||||||
|
### OpenRouter
|
||||||
|
|
||||||
|
| Variable | Value |
|
||||||
|
|----------|------|
|
||||||
|
| OPENROUTER_API_KEY | `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` |
|
||||||
|
| AI_MODEL | `openai/gpt-4o-mini` |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Price expectation (as of 26.02.2026)

**FRA → HKG → Phnom Penh → FRA: Cathay Pacific Economy roundtrip**

| Metric | Value |
|--------|-------|
| Cheapest | ~726 EUR |
| Realistic average | **900–1,050 EUR** |
| Good airlines (CX/SQ/TG) average | ~1,030 EUR |
| For comparison: travel agency VA PE | ~2,000 EUR |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Repo
|
||||||
|
|
||||||
|
`git.orbitalo.net/orbitalo/flugpreisscanner`
|
||||||
|
API-Token (cursor-deploy-3): `a6dd1ee58e091c894169c5ae15f6b74bb9461c56`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Changelog

| Date | What |
|------|------|
| 25.02.2026 | System went live |
| 25.02.2026 | Cookie-banner fix + screenshot improvements |
| 26.02.2026 | Switch PE → Economy, CX via HKG as main route |
| 26.02.2026 | Telegram bot @CX_HKG_Alert_bot set up |
| 26.02.2026 | SeleniumBase 4.34 → 4.47 (CDP improvements) |
| 26.02.2026 | _scrape_url / _booking_url separation (scrape .de, booking .com) |
| 26.02.2026 | GDPR consent handling for Kayak/Momondo |
| 26.02.2026 | NODE_SCANNER_SKIP: Momondo/Traveloka disabled on Asia |
| 26.02.2026 | Alert counter now per node (no spam from geo-blocks) |
| 26.02.2026 | SSH fix Muldenstein (PermitRootLogin yes) |
| 26.02.2026 | Docs added in CT999 (ct-145-flugscanner-mu.md + index.md) |
|
||||||
5
hub/.env
|
|
@ -1,5 +0,0 @@
|
||||||
OPENROUTER_API_KEY=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
|
|
||||||
AI_MODEL=openai/gpt-4o-mini
|
|
||||||
TELEGRAM_BOT_TOKEN=
|
|
||||||
TELEGRAM_CHAT_ID=674951792
|
|
||||||
DB_PATH=/data/flugscanner.db
|
|
||||||
|
|
@ -29,7 +29,7 @@ services:
|
||||||
- AI_MODEL=${AI_MODEL}
|
- AI_MODEL=${AI_MODEL}
|
||||||
- TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN:-}
|
- TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN:-}
|
||||||
- TELEGRAM_CHAT_ID=${TELEGRAM_CHAT_ID:-}
|
- TELEGRAM_CHAT_ID=${TELEGRAM_CHAT_ID:-}
|
||||||
command: python /app/src/scheduler.py
|
command: python -u /app/src/scheduler.py
|
||||||
depends_on:
|
depends_on:
|
||||||
- web
|
- web
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -1,120 +0,0 @@
|
||||||
"""
|
|
||||||
Einmalig ausführen um laufende DB zu migrieren.
|
|
||||||
docker exec flugscanner-web python3 /app/src/db_migrate.py
|
|
||||||
"""
|
|
||||||
import sqlite3, os
|
|
||||||
|
|
||||||
DB_PATH = os.environ.get("DB_PATH", "/data/flugscanner.db")
|
|
||||||
conn = sqlite3.connect(DB_PATH)
|
|
||||||
|
|
||||||
# 1. Neue Spalten nachrüsten
|
|
||||||
for sql, desc in [
|
|
||||||
("ALTER TABLE jobs ADD COLUMN airline_filter TEXT DEFAULT ''", "airline_filter"),
|
|
||||||
("ALTER TABLE jobs ADD COLUMN layover_min INTEGER DEFAULT 120", "layover_min"),
|
|
||||||
("ALTER TABLE jobs ADD COLUMN layover_max INTEGER DEFAULT 300", "layover_max"),
|
|
||||||
("ALTER TABLE jobs ADD COLUMN max_flugzeit_h INTEGER DEFAULT 22","max_flugzeit_h"),
|
|
||||||
("ALTER TABLE jobs ADD COLUMN max_stops INTEGER DEFAULT 2", "max_stops"),
|
|
||||||
]:
|
|
||||||
try:
|
|
||||||
conn.execute(sql)
|
|
||||||
print(f" ✓ Spalte hinzugefügt: {desc}")
|
|
||||||
except Exception:
|
|
||||||
print(f" — Spalte existiert: {desc}")
|
|
||||||
|
|
||||||
# 2. Bestehende Jobs mit vernünftigen Standardwerten befüllen
|
|
||||||
conn.execute("""
|
|
||||||
UPDATE jobs SET
|
|
||||||
layover_min = 120,
|
|
||||||
layover_max = 300,
|
|
||||||
max_flugzeit_h = 22,
|
|
||||||
max_stops = 2
|
|
||||||
WHERE layover_min IS NULL OR layover_min = 0
|
|
||||||
""")
|
|
||||||
conn.execute("UPDATE jobs SET airline_filter = '' WHERE airline_filter IS NULL")
|
|
||||||
print(" ✓ Bestehende Jobs aktualisiert")
|
|
||||||
|
|
||||||
# 3. Airline-spezifische Jobs anlegen (nur wenn noch nicht vorhanden)
|
|
||||||
airlines = [
|
|
||||||
("CZ", "China Southern"),
|
|
||||||
("CX", "Cathay Pacific"),
|
|
||||||
("SQ", "Singapore Airlines"),
|
|
||||||
("TG", "Thai Airways"),
|
|
||||||
]
|
|
||||||
for code, name in airlines:
|
|
||||||
exists = conn.execute(
|
|
||||||
"SELECT id FROM jobs WHERE scanner='kayak' AND airline_filter=?", (code,)
|
|
||||||
).fetchone()
|
|
||||||
if not exists:
|
|
||||||
conn.execute("""
|
|
||||||
INSERT INTO jobs
|
|
||||||
(scanner, von, nach, tage, aufenthalt_tage, trip_type, kabine, gepaeck,
|
|
||||||
airline_filter, layover_min, layover_max, max_flugzeit_h, max_stops, intervall)
|
|
||||||
VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?)
|
|
||||||
""", ("kayak","FRA","KTI",30,60,"roundtrip","premium_economy",
|
|
||||||
"1koffer+handgepaeck", code, 120, 300, 22, 2, "daily"))
|
|
||||||
print(f" ✓ Job angelegt: Kayak [{code}] {name}")
|
|
||||||
else:
|
|
||||||
print(f" — Job existiert: [{code}] {name}")
|
|
||||||
|
|
||||||
# 4. Prompt aktualisieren
|
|
||||||
PROMPT = """Du bist ein Flugpreis-Analyst. Analysiere Preisdaten für diesen Flug:
|
|
||||||
|
|
||||||
STRECKE: ROUNDTRIP Frankfurt (FRA) → Phnom Penh Techo Airport (KTI)
|
|
||||||
KABINE: Premium Economy (echte PE-Sitze mit extra Beinfreiheit, NICHT Economy mit anderem Namen!)
|
|
||||||
GEPÄCK: 1 großer Aufgabekoffer + Handgepäck (zwingend inklusive)
|
|
||||||
AUFENTHALT: ca. 2 Monate
|
|
||||||
|
|
||||||
BEVORZUGTE AIRLINES (nach Preis-Leistung):
|
|
||||||
- China Southern (CZ): Umstieg Guangzhou (CAN) — meist günstigste Option
|
|
||||||
- Cathay Pacific (CX): Umstieg Hongkong (HKG) — sehr gutes PE-Produkt
|
|
||||||
- Singapore Airlines (SQ): Umstieg Singapur (SIN) — bestes PE-Produkt
|
|
||||||
- Thai Airways (TG): Umstieg Bangkok (BKK) — gutes Netz nach KTI
|
|
||||||
- Vietnam Airlines (VN): Umstieg Hanoi (HAN) — direktester Weg nach KTI
|
|
||||||
|
|
||||||
HARTE FILTER (Flüge außerhalb dieser Grenzen ablehnen):
|
|
||||||
- Umstiegszeit an asiatischen Hubs: MUSS 2–5 Stunden sein (120–300 Min)
|
|
||||||
→ Unter 2h = Gepäck-Transfer-Risiko / Über 5h = Hotelübernachtung nötig
|
|
||||||
- Gesamtreisezeit: MAX 22 Stunden (FRA→KTI oder KTI→FRA)
|
|
||||||
→ Flüge mit 30+ Stunden (z.B. mehrere Stopps mit langen Wartezeiten) ausschließen
|
|
||||||
- Maximale Stopps: 2 (idealerweise 1)
|
|
||||||
|
|
||||||
WICHTIG: Preise unter 1000 EUR für diesen Roundtrip sind fast immer unplausibel.
|
|
||||||
Mögliche Gründe: Economy statt PE, nur Hinflug, kein Freigepäck, falsche Route.
|
|
||||||
|
|
||||||
Aktuelle Preise (Anbieter | Node | Airline | Preis):
|
|
||||||
{preise_heute}
|
|
||||||
|
|
||||||
Preisverlauf letzte 30 Tage:
|
|
||||||
{preisverlauf}
|
|
||||||
|
|
||||||
Statistik: Ø {avg} EUR | Min {min} EUR | Max {max} EUR
|
|
||||||
|
|
||||||
Antworte auf Deutsch:
|
|
||||||
EMPFEHLUNG: [JETZT BUCHEN / WARTEN / NEUTRAL]
|
|
||||||
BEGRUENDUNG: [1-2 Sätze]
|
|
||||||
BESTER_PREIS: [Anbieter + Airline + Preis + Node]
|
|
||||||
BESTE_AIRLINE: [welche der Airlines aktuell am günstigsten und warum]
|
|
||||||
TREND: [STEIGEND / FALLEND / STABIL]
|
|
||||||
GEO_UNTERSCHIED: [DE-Scanner vs. KH-Scanner Preisdifferenz und Erklärung]
|
|
||||||
FILTER_WARNUNG: [Welche gefundenen Preise gegen Flugzeit/Stopps/Umstieg-Regeln verstoßen]
|
|
||||||
PLAUSI_CHECK: [Preise unter 1000 EUR einzeln einordnen was da nicht stimmt]"""
|
|
||||||
|
|
||||||
conn.execute("UPDATE prompts SET inhalt=?, updated_at=datetime('now') WHERE name='ki_auswertung'",
|
|
||||||
(PROMPT,))
|
|
||||||
print(" ✓ KI-Prompt aktualisiert")
|
|
||||||
|
|
||||||
conn.commit()
|
|
||||||
|
|
||||||
# 5. Status anzeigen
|
|
||||||
print("\n=== Aktuelle Jobs ===")
|
|
||||||
jobs = conn.execute("""
|
|
||||||
SELECT id, scanner, airline_filter, layover_min, layover_max,
|
|
||||||
max_flugzeit_h, max_stops, aktiv
|
|
||||||
FROM jobs ORDER BY id
|
|
||||||
""").fetchall()
|
|
||||||
for j in jobs:
|
|
||||||
al = f" [{j[2]}]" if j[2] else ""
|
|
||||||
print(f" #{j[0]} {j[1]}{al} | Umstieg {j[3]}-{j[4]}min | max {j[5]}h | {j[6]} Stopps | {'✓' if j[7] else '✗'}")
|
|
||||||
|
|
||||||
conn.close()
|
|
||||||
print("\n✅ Migration abgeschlossen")
|
|
||||||
187
hub/src/ki.py
|
|
@ -1,5 +1,6 @@
|
||||||
import os
|
import os
|
||||||
import json
|
import json
|
||||||
|
import requests
|
||||||
from openai import OpenAI
|
from openai import OpenAI
|
||||||
from db import get_conn, log
|
from db import get_conn, log
|
||||||
|
|
||||||
|
|
@ -10,45 +11,39 @@ client = OpenAI(
|
||||||
|
|
||||||
MODEL = os.environ.get("AI_MODEL", "openai/gpt-4o-mini")
|
MODEL = os.environ.get("AI_MODEL", "openai/gpt-4o-mini")
|
||||||
|
|
||||||
PLAUSI_PROMPT = """Du bist ein Flugpreis-Experte. Pruefe jeden der folgenden Preise auf Plausibilitaet.
|
PLAUSI_PROMPT = """Du bist ein Flugpreis-Experte. Prüfe jeden der folgenden Preise auf Plausibilität.
|
||||||
|
|
||||||
KONTEXT:
|
KONTEXT:
|
||||||
- Strecke: Roundtrip Frankfurt (FRA) → Phnom Penh Techo (KTI), ca. 2 Monate Aufenthalt
|
- Strecke: Roundtrip Frankfurt (FRA) → Phnom Penh/Siem Reap (KTI), ca. 2 Monate Aufenthalt
|
||||||
- Kabinenklasse: PREMIUM ECONOMY (nicht Economy!)
|
- Kabinenklasse: ECONOMY (normales Economy mit Gepäck)
|
||||||
- Gepaeck: 1 grosser Koffer + Handgepaeck MUSS inklusive sein
|
- Gepäck: 1 großer Koffer + Handgepäck muss inklusive sein
|
||||||
- Bevorzugte Airlines: China Southern (CZ), Cathay Pacific (CX), Singapore Airlines (SQ), Thai Airways (TG), Vietnam Airlines (VN)
|
- Ziel-Airlines: Cathay Pacific (CX), Singapore Airlines (SQ), Emirates (EK), Qatar Airways (QR)
|
||||||
|
|
||||||
PREISREFERENZ fuer Premium Economy Roundtrip FRA-KTI mit Gepaeck:
|
PREISREFERENZ für Economy Roundtrip FRA-KTI mit Gepäck:
|
||||||
- Sehr guenstig: 900-1200 EUR (seltene Deals, plausibel wenn bekannte Airline)
|
- Sehr günstig: 700-900 EUR (seltene Deals, plausibel wenn bekannte Airline)
|
||||||
- Normal: 1200-1800 EUR
|
- Normal: 900-1200 EUR
|
||||||
- Teuer: 1800-2500 EUR
|
- Teuer: 1200-1600 EUR
|
||||||
- Ueber 2500 EUR: zu teuer oder Business Class
|
- Über 1600 EUR: möglicherweise falsche Kabine oder Business
|
||||||
- UNTER 700 EUR: fast sicher ECONOMY, nicht Premium Economy!
|
- Unter 500 EUR: fast sicher Economy Light (ohne Gepäck) — NICHT PLAUSIBEL
|
||||||
- 700-900 EUR: sehr verdaechtig, wahrscheinlich Economy oder ohne Gepaeck
|
- 500-700 EUR: verdächtig, wahrscheinlich ohne Gepäck
|
||||||
|
|
||||||
PRUEFREGELN:
|
PRÜFREGELN:
|
||||||
1. Preis unter 700 EUR → NICHT PLAUSIBEL (Economy ohne Gepaeck)
|
1. Preis unter 500 EUR → NICHT PLAUSIBEL (Economy Light ohne Gepäck)
|
||||||
2. Preis 700-900 EUR → VERDAECHTIG (pruefen ob Economy oder ohne Gepaeck)
|
2. Preis 500-700 EUR → VERDÄCHTIG (prüfen ob ohne Gepäck)
|
||||||
3. Preis 900-2500 EUR mit bekannter Airline → PLAUSIBEL
|
3. Preis 700-1600 EUR mit bekannter Airline → PLAUSIBEL
|
||||||
4. Preis ueber 2500 EUR → VERDAECHTIG (eventuell Business Class)
|
4. Preis über 1600 EUR → VERDÄCHTIG (möglicherweise Business oder falsche Kabine)
|
||||||
5. Scanner "kayak_multicity" (HKG Stopover): Preise 100-200 EUR hoeher als Direkt ist normal
|
5. kayak_multicity (HKG Stopover): 50-150 EUR teurer als Direkt ist normal
|
||||||
6. Wenn ein Scanner deutlich guenstigere Preise zeigt als alle anderen: VERDAECHTIG
|
6. Wenn ein Scanner deutlich günstiger als alle anderen: VERDÄCHTIG
|
||||||
|
|
||||||
PREISE ZU PRUEFEN:
|
PREISE ZU PRÜFEN:
|
||||||
{preise_liste}
|
{preise_liste}
|
||||||
|
|
||||||
Antworte NUR mit gueltigem JSON-Array. Fuer jeden Preis:
|
Antworte NUR mit gültigem JSON-Array. Für jeden Preis:
|
||||||
{{"id": <price_id>, "plausibel": true/false, "grund": "<kurze Begruendung auf Deutsch>"}}
|
{{"id": <price_id>, "plausibel": true/false, "grund": "<kurze Begründung auf Deutsch>"}}"""
|
||||||
|
|
||||||
Beispiel:
|
|
||||||
[
|
|
||||||
{{"id": 123, "plausibel": true, "grund": "1350 EUR fuer CX PE Roundtrip ist marktgerecht"}},
|
|
||||||
{{"id": 124, "plausibel": false, "grund": "436 EUR ist Economy-Preis, nicht PE mit Gepaeck"}}
|
|
||||||
]"""
|
|
||||||
|
|
||||||
|
|
||||||
def plausibilitaetspruefung(von="FRA", nach="KTI"):
|
def plausibilitaetspruefung(von="FRA", nach="KTI"):
|
||||||
"""Prüft alle ungeprüften Preise des aktuellen Laufs via KI."""
|
"""Prüft alle ungeprüften Economy-Preise des aktuellen Laufs via KI."""
|
||||||
log("KI-Plausibilitätsprüfung gestartet")
|
log("KI-Plausibilitätsprüfung gestartet")
|
||||||
conn = get_conn()
|
conn = get_conn()
|
||||||
|
|
||||||
|
|
@ -58,28 +53,29 @@ def plausibilitaetspruefung(von="FRA", nach="KTI"):
|
||||||
WHERE von=? AND nach=?
|
WHERE von=? AND nach=?
|
||||||
AND plausibel IS NULL
|
AND plausibel IS NULL
|
||||||
AND date(scraped_at) = date('now')
|
AND date(scraped_at) = date('now')
|
||||||
|
AND kabine_erkannt IN ('Economy', 'Economy Light', 'Unbekannt')
|
||||||
|
OR (von=? AND nach=? AND plausibel IS NULL
|
||||||
|
AND date(scraped_at) = date('now')
|
||||||
|
AND kabine_erkannt IS NULL)
|
||||||
ORDER BY preis ASC
|
ORDER BY preis ASC
|
||||||
""", (von, nach)).fetchall()
|
""", (von, nach, von, nach)).fetchall()
|
||||||
|
|
||||||
if not ungepruefte:
|
if not ungepruefte:
|
||||||
log("Keine ungeprüften Preise — Plausibilitätsprüfung übersprungen")
|
log("Keine ungeprüften Economy-Preise — Plausibilitätsprüfung übersprungen")
|
||||||
conn.close()
|
conn.close()
|
||||||
return
|
return
|
||||||
|
|
||||||
# In Batches aufteilen (max 25 Preise pro KI-Call)
|
|
||||||
BATCH_SIZE = 25
|
BATCH_SIZE = 25
|
||||||
batches = [ungepruefte[i:i+BATCH_SIZE] for i in range(0, len(ungepruefte), BATCH_SIZE)]
|
batches = [ungepruefte[i:i+BATCH_SIZE] for i in range(0, len(ungepruefte), BATCH_SIZE)]
|
||||||
|
|
||||||
plausibel_total = 0
|
plausibel_total = verdaechtig_total = 0
|
||||||
verdaechtig_total = 0
|
|
||||||
|
|
||||||
for batch_nr, batch in enumerate(batches):
|
for batch_nr, batch in enumerate(batches):
|
||||||
preise_liste = "\n".join([
|
preise_liste = "\n".join([
|
||||||
f" ID {p['id']}: {p['preis']:.0f} EUR — Scanner: {p['scanner']} — "
|
f" ID {p['id']}: {p['preis']:.0f} EUR — Scanner: {p['scanner']} — "
|
||||||
f"Node: {p['node']} — Airline: {p['airline'] or 'k.A.'} — Abflug: {p['abflug']}"
|
f"Airline: {p['airline'] or 'k.A.'} — Abflug: {p['abflug']}"
|
||||||
for p in batch
|
for p in batch
|
||||||
])
|
])
|
||||||
|
|
||||||
prompt = PLAUSI_PROMPT.format(preise_liste=preise_liste)
|
prompt = PLAUSI_PROMPT.format(preise_liste=preise_liste)
|
||||||
|
|
||||||
try:
|
try:
|
||||||
|
|
@ -90,28 +86,22 @@ def plausibilitaetspruefung(von="FRA", nach="KTI"):
|
||||||
temperature=0.1,
|
temperature=0.1,
|
||||||
)
|
)
|
||||||
antwort = response.choices[0].message.content.strip()
|
antwort = response.choices[0].message.content.strip()
|
||||||
|
|
||||||
if "```" in antwort:
|
if "```" in antwort:
|
||||||
antwort = antwort.split("```")[1]
|
antwort = antwort.split("```")[1]
|
||||||
if antwort.startswith("json"):
|
if antwort.startswith("json"):
|
||||||
antwort = antwort[4:]
|
antwort = antwort[4:]
|
||||||
|
|
||||||
ergebnisse = json.loads(antwort)
|
ergebnisse = json.loads(antwort)
|
||||||
|
|
||||||
for e in ergebnisse:
|
for e in ergebnisse:
|
||||||
pid = e.get("id")
|
pid = e.get("id")
|
||||||
ist_plausibel = 1 if e.get("plausibel") else 0
|
ist_plausibel = 1 if e.get("plausibel") else 0
|
||||||
grund = e.get("grund", "")[:200]
|
grund = e.get("grund", "")[:200]
|
||||||
|
|
||||||
conn.execute(
|
conn.execute(
|
||||||
"UPDATE prices SET plausibel=?, plausi_grund=? WHERE id=?",
|
"UPDATE prices SET plausibel=?, plausi_grund=? WHERE id=?",
|
||||||
(ist_plausibel, grund, pid)
|
(ist_plausibel, grund, pid)
|
||||||
)
|
)
|
||||||
if ist_plausibel:
|
if ist_plausibel: plausibel_total += 1
|
||||||
plausibel_total += 1
|
else: verdaechtig_total += 1
|
||||||
else:
|
|
||||||
verdaechtig_total += 1
|
|
||||||
|
|
||||||
conn.commit()
|
conn.commit()
|
||||||
|
|
||||||
except json.JSONDecodeError as e:
|
except json.JSONDecodeError as e:
|
||||||
|
|
@ -127,25 +117,52 @@ def plausibilitaetspruefung(von="FRA", nach="KTI"):
|
||||||
|
|
||||||
|
|
||||||
def _regelbasierte_plausi(conn, preise):
|
def _regelbasierte_plausi(conn, preise):
|
||||||
"""Fallback wenn KI nicht erreichbar: einfache Regeln."""
|
"""Fallback wenn KI nicht erreichbar: regelbasiert für Economy."""
|
||||||
log("Regelbasierte Plausibilitätsprüfung als Fallback")
|
log("Regelbasierte Plausibilitätsprüfung (Economy) als Fallback")
|
||||||
for p in preise:
|
for p in preise:
|
||||||
preis = p["preis"]
|
preis = p["preis"]
|
||||||
if preis < 700:
|
if preis < 500:
|
||||||
conn.execute("UPDATE prices SET plausibel=0, plausi_grund=? WHERE id=?",
|
conn.execute("UPDATE prices SET plausibel=0, plausi_grund=? WHERE id=?",
|
||||||
("Preis unter 700€ — sehr wahrscheinlich Economy", p["id"]))
|
("Unter 500€ — wahrscheinlich Economy Light ohne Gepäck", p["id"]))
|
||||||
elif preis < 900:
|
elif preis < 700:
|
||||||
conn.execute("UPDATE prices SET plausibel=0, plausi_grund=? WHERE id=?",
|
conn.execute("UPDATE prices SET plausibel=0, plausi_grund=? WHERE id=?",
|
||||||
("Preis 700-900€ — verdächtig, wahrscheinlich Economy oder ohne Gepäck", p["id"]))
|
("500-700€ — verdächtig, wahrscheinlich ohne Gepäck", p["id"]))
|
||||||
elif preis > 3000:
|
elif preis > 1800:
|
||||||
conn.execute("UPDATE prices SET plausibel=0, plausi_grund=? WHERE id=?",
|
conn.execute("UPDATE prices SET plausibel=0, plausi_grund=? WHERE id=?",
|
||||||
("Preis über 3000€ — möglicherweise Business Class", p["id"]))
|
("Über 1800€ — möglicherweise Business Class", p["id"]))
|
||||||
else:
|
else:
|
||||||
conn.execute("UPDATE prices SET plausibel=1, plausi_grund=? WHERE id=?",
|
conn.execute("UPDATE prices SET plausibel=1, plausi_grund=? WHERE id=?",
|
||||||
("Preis im erwarteten PE-Bereich", p["id"]))
|
("Preis im Economy-Roundtrip-Bereich", p["id"]))
|
||||||
conn.commit()
|
conn.commit()
|
||||||
|
|
||||||
|
|
||||||
|
def get_openrouter_guthaben() -> dict:
|
||||||
|
"""Fragt OpenRouter-Guthaben ab."""
|
||||||
|
api_key = os.environ.get("OPENROUTER_API_KEY", "")
|
||||||
|
if not api_key:
|
||||||
|
return {"fehler": "Kein API-Key konfiguriert"}
|
||||||
|
try:
|
||||||
|
r = requests.get(
|
||||||
|
"https://openrouter.ai/api/v1/auth/key",
|
||||||
|
headers={"Authorization": f"Bearer {api_key}"},
|
||||||
|
timeout=10
|
||||||
|
)
|
||||||
|
if r.status_code == 200:
|
||||||
|
d = r.json().get("data", {})
|
||||||
|
limit = d.get("limit")
|
||||||
|
usage = d.get("usage", 0)
|
||||||
|
verbleibend = round((limit - usage), 4) if limit else None
|
||||||
|
return {
|
||||||
|
"limit": limit,
|
||||||
|
"usage": round(usage, 4),
|
||||||
|
"verbleibend": verbleibend,
|
||||||
|
"is_free": d.get("is_free_tier", False),
|
||||||
|
}
|
||||||
|
return {"fehler": f"HTTP {r.status_code}"}
|
||||||
|
except Exception as e:
|
||||||
|
return {"fehler": str(e)}
|
||||||
|
|
||||||
|
|
||||||
def get_prompt():
|
def get_prompt():
|
||||||
conn = get_conn()
|
conn = get_conn()
|
||||||
row = conn.execute(
|
row = conn.execute(
|
||||||
|
|
@ -159,20 +176,34 @@ def auswerten(von="FRA", nach="KTI"):
|
||||||
log("KI-Auswertung gestartet")
|
log("KI-Auswertung gestartet")
|
||||||
conn = get_conn()
|
conn = get_conn()
|
||||||
|
|
||||||
|
# Nur Economy-Preise die plausibel sind
|
||||||
preise_heute = conn.execute("""
|
preise_heute = conn.execute("""
|
||||||
SELECT scanner, node, preis, airline, abflug
|
SELECT scanner, node, preis, airline, abflug, kabine_erkannt
|
||||||
FROM prices
|
FROM prices
|
||||||
WHERE von=? AND nach=?
|
WHERE von=? AND nach=?
|
||||||
AND date(scraped_at) = date('now')
|
AND date(scraped_at) = date('now')
|
||||||
AND (plausibel = 1 OR plausibel IS NULL)
|
AND plausibel = 1
|
||||||
|
AND kabine_erkannt IN ('Economy', 'Economy Light', 'Unbekannt')
|
||||||
ORDER BY preis ASC
|
ORDER BY preis ASC
|
||||||
""", (von, nach)).fetchall()
|
""", (von, nach)).fetchall()
|
||||||
|
|
||||||
|
qualitaet = conn.execute("""
|
||||||
|
SELECT
|
||||||
|
COUNT(*) as gesamt,
|
||||||
|
SUM(CASE WHEN kabine_erkannt='Economy' THEN 1 ELSE 0 END) as eco,
|
||||||
|
SUM(CASE WHEN kabine_erkannt='Economy Light' THEN 1 ELSE 0 END) as light,
|
||||||
|
SUM(CASE WHEN kabine_erkannt='Premium Economy' THEN 1 ELSE 0 END) as pe
|
||||||
|
FROM prices
|
||||||
|
WHERE von=? AND nach=? AND date(scraped_at) = date('now')
|
||||||
|
""", (von, nach)).fetchone()
|
||||||
|
|
||||||
preisverlauf = conn.execute("""
|
preisverlauf = conn.execute("""
|
||||||
SELECT date(scraped_at) as tag, MIN(preis) as min_preis, AVG(preis) as avg_preis
|
SELECT date(scraped_at) as tag, MIN(preis) as min_preis, AVG(preis) as avg_preis
|
||||||
FROM prices
|
FROM prices
|
||||||
WHERE von=? AND nach=?
|
WHERE von=? AND nach=?
|
||||||
AND scraped_at >= datetime('now', '-30 days')
|
AND scraped_at >= datetime('now', '-30 days')
|
||||||
|
AND kabine_erkannt IN ('Economy', 'Economy Light', 'Unbekannt')
|
||||||
|
AND plausibel = 1
|
||||||
GROUP BY date(scraped_at)
|
GROUP BY date(scraped_at)
|
||||||
ORDER BY tag
|
ORDER BY tag
|
||||||
""", (von, nach)).fetchall()
|
""", (von, nach)).fetchall()
|
||||||
|
|
@ -182,30 +213,43 @@ def auswerten(von="FRA", nach="KTI"):
|
||||||
FROM prices
|
FROM prices
|
||||||
WHERE von=? AND nach=?
|
WHERE von=? AND nach=?
|
||||||
AND scraped_at >= datetime('now', '-30 days')
|
AND scraped_at >= datetime('now', '-30 days')
|
||||||
|
AND kabine_erkannt IN ('Economy', 'Economy Light', 'Unbekannt')
|
||||||
|
AND plausibel = 1
|
||||||
""", (von, nach)).fetchone()
|
""", (von, nach)).fetchone()
|
||||||
|
|
||||||
conn.close()
|
conn.close()
|
||||||
|
|
||||||
if not preise_heute:
|
if not preise_heute:
|
||||||
log("Keine Preise für heute — KI-Auswertung übersprungen", "WARN")
|
log("Keine plausiblen Economy-Preise heute — KI-Auswertung übersprungen", "WARN")
|
||||||
return
|
return
|
||||||
|
|
||||||
|
qualitaet_hinweis = (
|
||||||
|
f"DATENQUALITÄT HEUTE: {qualitaet['eco'] or 0} Economy, "
|
||||||
|
f"{qualitaet['light'] or 0} Economy Light gescannt. "
|
||||||
|
f"Nur plausible Roundtrip-Preise mit Gepäck werden ausgewertet.\n"
|
||||||
|
)
|
||||||
|
|
||||||
preise_heute_str = "\n".join([
|
preise_heute_str = "\n".join([
|
||||||
f" {p['scanner']} ({p['node']}): {p['preis']} EUR — {p['airline'] or 'k.A.'}"
|
f" {p['scanner']}: {p['preis']} EUR — {p['airline'] or 'k.A.'} "
|
||||||
|
f"({p['kabine_erkannt'] or '?'})"
|
||||||
for p in preise_heute
|
for p in preise_heute
|
||||||
])
|
])
|
||||||
verlauf_str = "\n".join([
|
verlauf_str = "\n".join([
|
||||||
f" {p['tag']}: min {p['min_preis']:.0f} EUR, avg {p['avg_preis']:.0f} EUR"
|
f" {p['tag']}: min {p['min_preis']:.0f} EUR, avg {p['avg_preis']:.0f} EUR"
|
||||||
for p in preisverlauf
|
for p in preisverlauf
|
||||||
])
|
]) or " (noch keine Verlaufsdaten)"
|
||||||
|
|
||||||
prompt_template = get_prompt()
|
prompt_template = get_prompt()
|
||||||
prompt = prompt_template.format(
|
if not prompt_template:
|
||||||
|
log("Kein KI-Auswertungs-Prompt in DB — übersprungen", "WARN")
|
||||||
|
return
|
||||||
|
|
||||||
|
prompt = qualitaet_hinweis + "\n" + prompt_template.format(
|
||||||
preise_heute=preise_heute_str,
|
preise_heute=preise_heute_str,
|
||||||
preisverlauf=verlauf_str,
|
preisverlauf=verlauf_str,
|
||||||
avg=f"{stats['avg']:.0f}" if stats['avg'] else "?",
|
avg=f"{stats['avg']:.0f}" if stats and stats['avg'] else "?",
|
||||||
min=f"{stats['min']:.0f}" if stats['min'] else "?",
|
min=f"{stats['min']:.0f}" if stats and stats['min'] else "?",
|
||||||
max=f"{stats['max']:.0f}" if stats['max'] else "?"
|
max=f"{stats['max']:.0f}" if stats and stats['max'] else "?"
|
||||||
)
|
)
|
||||||
|
|
||||||
try:
|
try:
|
||||||
|
|
@@ -215,28 +259,19 @@ def auswerten(von="FRA", nach="KTI"):
|
||||||
max_tokens=500
|
max_tokens=500
|
||||||
)
|
)
|
||||||
analyse = response.choices[0].message.content
|
analyse = response.choices[0].message.content
|
||||||
log(f"KI-Antwort erhalten: {analyse[:100]}...")
|
|
||||||
|
|
||||||
guenstigster = preise_heute[0]
|
guenstigster = preise_heute[0]
|
||||||
empfehlung = ""
|
if "JETZT BUCHEN" in analyse: empfehlung = "JETZT BUCHEN"
|
||||||
if "JETZT BUCHEN" in analyse:
|
elif "WARTEN" in analyse: empfehlung = "WARTEN"
|
||||||
empfehlung = "JETZT BUCHEN"
|
else: empfehlung = "NEUTRAL"
|
||||||
elif "WARTEN" in analyse:
|
|
||||||
empfehlung = "WARTEN"
|
|
||||||
else:
|
|
||||||
empfehlung = "NEUTRAL"
|
|
||||||
|
|
||||||
conn = get_conn()
|
conn = get_conn()
|
||||||
conn.execute("""
|
conn.execute("""
|
||||||
INSERT INTO analyses
|
INSERT INTO analyses
|
||||||
(von, nach, guenstigster_preis, guenstigster_anbieter, ki_empfehlung, ki_analyse)
|
(von, nach, guenstigster_preis, guenstigster_anbieter, ki_empfehlung, ki_analyse)
|
||||||
VALUES (?, ?, ?, ?, ?, ?)
|
VALUES (?, ?, ?, ?, ?, ?)
|
||||||
""", (
|
""", (von, nach, guenstigster["preis"],
|
||||||
von, nach,
|
f"{guenstigster['scanner']}", empfehlung, analyse))
|
||||||
guenstigster["preis"],
|
|
||||||
f"{guenstigster['scanner']} ({guenstigster['node']})",
|
|
||||||
empfehlung, analyse
|
|
||||||
))
|
|
||||||
conn.commit()
|
conn.commit()
|
||||||
conn.close()
|
conn.close()
|
||||||
log("KI-Auswertung gespeichert")
|
log("KI-Auswertung gespeichert")
|
||||||
|
|
|
||||||
|
|
@@ -1,13 +1,120 @@
|
||||||
import os
|
import os
|
||||||
import time
|
import time
|
||||||
|
import random
|
||||||
import threading
|
import threading
|
||||||
import requests
|
import requests
|
||||||
import schedule
|
import schedule
|
||||||
from datetime import datetime, timedelta
|
from datetime import datetime, timedelta
|
||||||
from db import init_db, get_conn, log
|
from db import init_db, get_conn, log
|
||||||
from ki import auswerten, plausibilitaetspruefung
|
from ki import auswerten, plausibilitaetspruefung
|
||||||
|
from openai import OpenAI
|
||||||
|
|
||||||
# Verhindert dass zwei Läufe gleichzeitig laufen
|
# ── OpenRouter Vision Client ──────────────────────────────────────────────────
|
||||||
|
_vision_client = OpenAI(
|
||||||
|
base_url="https://openrouter.ai/api/v1",
|
||||||
|
api_key=os.environ.get("OPENROUTER_API_KEY")
|
||||||
|
)
|
||||||
|
|
||||||
|
# ── Telegram ──────────────────────────────────────────────────────────────────
|
||||||
|
TELEGRAM_TOKEN = os.environ.get("TELEGRAM_BOT_TOKEN", "")
|
||||||
|
TELEGRAM_CHAT_ID = os.environ.get("TELEGRAM_CHAT_ID", "")
|
||||||
|
|
||||||
|
def telegram_send(msg: str):
|
||||||
|
"""Sendet Nachricht via Telegram. Fehler werden nur geloggt, nie geworfen."""
|
||||||
|
if not TELEGRAM_TOKEN or not TELEGRAM_CHAT_ID:
|
||||||
|
return
|
||||||
|
try:
|
||||||
|
requests.post(
|
||||||
|
f"https://api.telegram.org/bot{TELEGRAM_TOKEN}/sendMessage",
|
||||||
|
json={"chat_id": TELEGRAM_CHAT_ID, "text": msg, "parse_mode": "HTML"},
|
||||||
|
timeout=10
|
||||||
|
)
|
||||||
|
except Exception as e:
|
||||||
|
log(f"Telegram-Fehler: {e}", "WARN")
|
||||||
|
|
||||||
|
# ── Zero-Result-Tracking (in-memory, pro Job-ID) ─────────────────────────────
|
||||||
|
_null_ergebnis_zaehler: dict[str, int] = {} # key = "node:job_id"
|
||||||
|
ALERT_NACH_N_NULLLAEUFEN = 3
|
||||||
|
|
||||||
|
# Scanner die aus Asien (Cambodia) nicht funktionieren - Geo-Block
|
||||||
|
NODE_SCANNER_SKIP = {
|
||||||
|
"flugscanner-asia": {"momondo", "traveloka"},
|
||||||
|
}
|
||||||
|
|
||||||
|
# ── Vision Prompt (angepasst für Economy) ────────────────────────────────────
|
||||||
|
VISION_PROMPT = """Du siehst einen Screenshot einer Flugsuche-Website (Kayak, Momondo etc.).
|
||||||
|
|
||||||
|
AUFGABE: Bestimme welche Kabinenklasse in den SUCHERGEBNISSEN gezeigt wird.
|
||||||
|
|
||||||
|
WICHTIG — WAS ZÄHLT:
|
||||||
|
✅ Die Kabinenklasse direkt UNTER den einzelnen Flügen in der Ergebnisliste
|
||||||
|
✅ Der aktive Kabinenfilter-Button in der Suche
|
||||||
|
✅ Labels neben dem Preis jedes Flugergebnisses
|
||||||
|
|
||||||
|
IGNORIERE:
|
||||||
|
❌ Werbebanner
|
||||||
|
❌ Empfehlungsboxen oben auf der Seite
|
||||||
|
❌ Texte die nicht zu konkreten Flugergebnissen gehören
|
||||||
|
|
||||||
|
KLASSIFIZIERUNG:
|
||||||
|
- "Economy Light" → "Economy Light", "Basic", "Light", "Nur Handgepäck", "Hand baggage"
|
||||||
|
- "Economy" → "Economy" ohne "Premium" davor
|
||||||
|
- "Premium Economy" → "Premium Economy" oder "W Class" bei Flugergebnissen
|
||||||
|
- "Business" → "Business" bei Flugergebnissen
|
||||||
|
- "Unbekannt" → Ladescreen, Captcha, Cookie-Banner, keine Ergebnisse
|
||||||
|
|
||||||
|
Antworte NUR mit dem einen passenden Begriff. Keine Erklärung."""
|
||||||
|
|
||||||
|
|
||||||
|
def klassifiziere_screenshot(screenshot_b64: str) -> str:
|
||||||
|
"""Vision-KI klassifiziert Kabine im Screenshot."""
|
||||||
|
if not screenshot_b64:
|
||||||
|
return "Unbekannt"
|
||||||
|
try:
|
||||||
|
response = _vision_client.chat.completions.create(
|
||||||
|
model="openai/gpt-4o-mini",
|
||||||
|
max_tokens=15,
|
||||||
|
messages=[{
|
||||||
|
"role": "user",
|
||||||
|
"content": [
|
||||||
|
{"type": "text", "text": VISION_PROMPT},
|
||||||
|
{"type": "image_url", "image_url": {
|
||||||
|
"url": f"data:image/jpeg;base64,{screenshot_b64}"
|
||||||
|
}}
|
||||||
|
]
|
||||||
|
}]
|
||||||
|
)
|
||||||
|
antwort = response.choices[0].message.content.strip().lower()
|
||||||
|
if "premium" in antwort: return "Premium Economy"
|
||||||
|
if "light" in antwort or "basic" in antwort: return "Economy Light"
|
||||||
|
if "economy" in antwort: return "Economy"
|
||||||
|
if "business" in antwort: return "Business"
|
||||||
|
if "first" in antwort: return "First"
|
||||||
|
return "Unbekannt"
|
||||||
|
except Exception as e:
|
||||||
|
log(f"Vision-Klassifizierung fehlgeschlagen: {e}", "WARN")
|
||||||
|
return "Unbekannt"
|
||||||
|
|
||||||
|
|
||||||
|
# ── Cleanup ───────────────────────────────────────────────────────────────────
|
||||||
|
def cleanup_alte_screenshots(tage=30):
|
||||||
|
"""Löscht Screenshots die älter als `tage` Tage sind."""
|
||||||
|
try:
|
||||||
|
conn = get_conn()
|
||||||
|
cur = conn.execute("""
|
||||||
|
DELETE FROM screenshots
|
||||||
|
WHERE created_at < datetime('now', ?)
|
||||||
|
""", (f"-{tage} days",))
|
||||||
|
deleted = cur.rowcount
|
||||||
|
conn.commit()
|
||||||
|
conn.close()
|
||||||
|
if deleted > 0:
|
||||||
|
log(f"Cleanup: {deleted} Screenshots älter als {tage} Tage gelöscht")
|
||||||
|
except Exception as e:
|
||||||
|
log(f"Cleanup-Fehler: {e}", "WARN")
|
||||||
|
|
||||||
|
|
||||||
|
# ── Lock ──────────────────────────────────────────────────────────────────────
|
||||||
_scan_lock = threading.Lock()
|
_scan_lock = threading.Lock()
|
||||||
_lauf_aktiv = False
|
_lauf_aktiv = False
|
||||||
|
|
||||||
|
|
@@ -21,11 +128,7 @@ def get_nodes():
|
||||||
return [dict(n) for n in nodes]
|
return [dict(n) for n in nodes]
|
||||||
|
|
||||||
|
|
||||||
def get_aktive_jobs(flex=False):
|
def get_aktive_jobs():
|
||||||
"""
|
|
||||||
flex=False → normale Jobs (datum_flex IS NULL or 0)
|
|
||||||
flex=True → alle Jobs, tage wird durch Aufrufer variiert
|
|
||||||
"""
|
|
||||||
conn = get_conn()
|
conn = get_conn()
|
||||||
jobs = conn.execute("SELECT * FROM jobs WHERE aktiv = 1").fetchall()
|
jobs = conn.execute("SELECT * FROM jobs WHERE aktiv = 1").fetchall()
|
||||||
conn.close()
|
conn.close()
|
||||||
|
|
@@ -58,7 +161,7 @@ def dispatch_job(node, job, tage_override=None):
|
||||||
"tage": tage_override if tage_override is not None else job["tage"],
|
"tage": tage_override if tage_override is not None else job["tage"],
|
||||||
"aufenthalt_tage": job.get("aufenthalt_tage", 60),
|
"aufenthalt_tage": job.get("aufenthalt_tage", 60),
|
||||||
"trip_type": job.get("trip_type", "roundtrip"),
|
"trip_type": job.get("trip_type", "roundtrip"),
|
||||||
"kabine": job.get("kabine", "premium_economy"),
|
"kabine": job.get("kabine", "economy"),
|
||||||
"gepaeck": job.get("gepaeck", "1koffer+handgepaeck"),
|
"gepaeck": job.get("gepaeck", "1koffer+handgepaeck"),
|
||||||
"airline_filter": job.get("airline_filter", ""),
|
"airline_filter": job.get("airline_filter", ""),
|
||||||
"layover_min": job.get("layover_min", 120),
|
"layover_min": job.get("layover_min", 120),
|
||||||
|
|
@@ -69,6 +172,7 @@ def dispatch_job(node, job, tage_override=None):
|
||||||
"stopover_min_h": job.get("stopover_min_h", 20),
|
"stopover_min_h": job.get("stopover_min_h", 20),
|
||||||
"stopover_max_h": job.get("stopover_max_h", 30),
|
"stopover_max_h": job.get("stopover_max_h", 30),
|
||||||
}
|
}
|
||||||
|
job_id = job["id"]
|
||||||
try:
|
try:
|
||||||
r = requests.post(
|
r = requests.post(
|
||||||
f"http://{node['tailscale_ip']}:5010/job",
|
f"http://{node['tailscale_ip']}:5010/job",
|
||||||
|
|
@@ -79,13 +183,45 @@ def dispatch_job(node, job, tage_override=None):
|
||||||
data = r.json()
|
data = r.json()
|
||||||
results = data.get("results", [])
|
results = data.get("results", [])
|
||||||
screenshot_b64 = data.get("screenshot_b64", "")
|
screenshot_b64 = data.get("screenshot_b64", "")
|
||||||
|
|
||||||
via_label = f" via {job.get('via','')}" if job.get('via') else ""
|
via_label = f" via {job.get('via','')}" if job.get('via') else ""
|
||||||
|
airline_label = f" [{job.get('airline_filter','')}]" if job.get('airline_filter') else ""
|
||||||
log(f"{node['name']}: {len(results)} Preise ← {job['scanner']}"
|
log(f"{node['name']}: {len(results)} Preise ← {job['scanner']}"
|
||||||
f"{' ['+job.get('airline_filter','')+']' if job.get('airline_filter') else ''}"
|
f"{airline_label}{via_label}"
|
||||||
f"{via_label}"
|
|
||||||
f"{' +'+str(tage_override)+'T' if tage_override else ''}")
|
f"{' +'+str(tage_override)+'T' if tage_override else ''}")
|
||||||
|
|
||||||
|
# ── Zero-Result-Alert ─────────────────────────────────────────
|
||||||
|
if len(results) == 0:
|
||||||
|
zkey = f"{node['name']}:{job_id}"
|
||||||
|
_null_ergebnis_zaehler[zkey] = _null_ergebnis_zaehler.get(zkey, 0) + 1
|
||||||
|
zaehler = _null_ergebnis_zaehler[zkey]
|
||||||
|
log(f"⚠ {job['scanner']} liefert 0 Preise ({zaehler}/{ALERT_NACH_N_NULLLAEUFEN})", "WARN")
|
||||||
|
if zaehler >= ALERT_NACH_N_NULLLAEUFEN:
|
||||||
|
telegram_send(
|
||||||
|
f"⚠️ <b>Flugscanner-Alert</b>\n"
|
||||||
|
f"Scanner <b>{job['scanner']}</b> (Job #{job_id}) liefert "
|
||||||
|
f"seit {zaehler} Läufen <b>0 Preise</b>.\n"
|
||||||
|
f"Möglicherweise Anti-Bot-Erkennung oder Seite verändert."
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
zkey = f"{node['name']}:{job_id}"
|
||||||
|
_null_ergebnis_zaehler[zkey] = 0 # Reset bei Erfolg
|
||||||
|
# ─────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
screenshot_id = speichere_screenshot(screenshot_b64, node["name"], job)
|
screenshot_id = speichere_screenshot(screenshot_b64, node["name"], job)
|
||||||
speichere_preise(results, node["name"], job, screenshot_id)
|
|
||||||
|
# ── Vision-Wahrheitsfilter ────────────────────────────────────
|
||||||
|
kabine_erkannt = klassifiziere_screenshot(screenshot_b64)
|
||||||
|
log(f"{node['name']}/{job['scanner']}: Vision → {kabine_erkannt}")
|
||||||
|
# Für Economy-Suche: Business/First/PE sind Fehlklassifizierungen
|
||||||
|
FALSCHE_KABINEN = ("Premium Economy", "Business", "First")
|
||||||
|
if kabine_erkannt in FALSCHE_KABINEN:
|
||||||
|
log(f"⚠ Vision zeigt {kabine_erkannt} statt Economy — Preise markiert", "WARN")
|
||||||
|
# ─────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
pruefe_preis_alert(results, job)
|
||||||
|
pruefe_preisanstieg(results, job)
|
||||||
|
speichere_preise(results, node["name"], job, screenshot_id, kabine_erkannt)
|
||||||
return True
|
return True
|
||||||
else:
|
else:
|
||||||
log(f"{node['name']}: Fehler {r.status_code} bei {job['scanner']}", "ERROR")
|
log(f"{node['name']}: Fehler {r.status_code} bei {job['scanner']}", "ERROR")
|
||||||
|
|
@@ -97,7 +233,6 @@ def dispatch_job(node, job, tage_override=None):
|
||||||
|
|
||||||
|
|
||||||
def speichere_screenshot(screenshot_b64, node_name, job):
|
def speichere_screenshot(screenshot_b64, node_name, job):
|
||||||
"""Speichert Screenshot in DB, gibt screenshot_id zurück (oder None)."""
|
|
||||||
if not screenshot_b64:
|
if not screenshot_b64:
|
||||||
return None
|
return None
|
||||||
try:
|
try:
|
||||||
|
|
@@ -115,13 +250,47 @@ def speichere_screenshot(screenshot_b64, node_name, job):
|
||||||
return None
|
return None
|
||||||
|
|
||||||
|
|
||||||
def speichere_preise(results, node_name, job, screenshot_id=None):
|
ALERT_SCHWELLE_EUR = 900 # Telegram-Alert wenn CX via HKG unter diesen Preis fällt
|
||||||
|
|
||||||
|
def pruefe_preis_alert(results, job):
|
||||||
|
"""Sendet Telegram-Alert wenn kayak_multicity unter Schwelle fällt."""
|
||||||
|
if job.get("scanner") != "kayak_multicity":
|
||||||
|
return
|
||||||
|
for r in results:
|
||||||
|
if r.get("preis", 9999) < ALERT_SCHWELLE_EUR:
|
||||||
|
preis = r["preis"]
|
||||||
|
abflug = r.get("abflug", "?")
|
||||||
|
url = r.get("booking_url", "")
|
||||||
|
telegram_send(
|
||||||
|
f"✈️ <b>CX via HKG unter {ALERT_SCHWELLE_EUR}€!</b>\n\n"
|
||||||
|
f"💰 Preis: <b>{preis:.0f} EUR</b> Roundtrip\n"
|
||||||
|
f"📅 Abflug: {abflug}\n"
|
||||||
|
f"🔗 <a href='{url}'>Jetzt buchen</a>"
|
||||||
|
)
|
||||||
|
log(f"💰 PREIS-ALERT: {preis:.0f}EUR via HKG — Telegram gesendet")
|
||||||
|
break # Nur einmal pro Job-Lauf
|
||||||
|
|
||||||
|
|
||||||
|
def speichere_preise(results, node_name, job, screenshot_id=None, kabine_erkannt=None):
|
||||||
|
# Economy-Suche: PE/Business/First sind Fehlkabinen → disqualifizieren
|
||||||
|
FALSCHE_KABINEN = ("Premium Economy", "Business", "First")
|
||||||
|
ist_disqualifiziert = kabine_erkannt in FALSCHE_KABINEN
|
||||||
|
|
||||||
conn = get_conn()
|
conn = get_conn()
|
||||||
for r in results:
|
for r in results:
|
||||||
|
plausibel_init = None
|
||||||
|
plausi_grund_init = ""
|
||||||
|
if ist_disqualifiziert:
|
||||||
|
plausibel_init = 0
|
||||||
|
plausi_grund_init = (
|
||||||
|
f"[Vision-Filter] Screenshot zeigt {kabine_erkannt} — kein Economy"
|
||||||
|
)
|
||||||
|
|
||||||
conn.execute("""
|
conn.execute("""
|
||||||
INSERT INTO prices
|
INSERT INTO prices
|
||||||
(job_id, scanner, node, preis, waehrung, airline, abflug, ankunft, von, nach, booking_url, screenshot_id)
|
(job_id, scanner, node, preis, waehrung, airline, abflug, ankunft,
|
||||||
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
|
von, nach, booking_url, screenshot_id, kabine_erkannt, plausibel, plausi_grund)
|
||||||
|
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
|
||||||
""", (
|
""", (
|
||||||
job["id"], r.get("scanner", job["scanner"]), node_name,
|
job["id"], r.get("scanner", job["scanner"]), node_name,
|
||||||
r["preis"], r.get("waehrung", "EUR"), r.get("airline", ""),
|
r["preis"], r.get("waehrung", "EUR"), r.get("airline", ""),
|
||||||
|
|
@@ -129,16 +298,15 @@ def speichere_preise(results, node_name, job, screenshot_id=None):
|
||||||
job["von"], job["nach"],
|
job["von"], job["nach"],
|
||||||
r.get("booking_url", ""),
|
r.get("booking_url", ""),
|
||||||
screenshot_id,
|
screenshot_id,
|
||||||
|
kabine_erkannt,
|
||||||
|
plausibel_init,
|
||||||
|
plausi_grund_init,
|
||||||
))
|
))
|
||||||
conn.commit()
|
conn.commit()
|
||||||
conn.close()
|
conn.close()
|
||||||
|
|
||||||
|
|
||||||
def scraping_lauf(label="Standard", flex_tage_liste=None):
|
def scraping_lauf(label="Standard", flex_tage_liste=None):
|
||||||
"""
|
|
||||||
Führt alle aktiven Jobs auf allen Nodes aus.
|
|
||||||
Übersprungen wenn ein anderer Lauf noch aktiv ist (Lock-Schutz).
|
|
||||||
"""
|
|
||||||
global _lauf_aktiv
|
global _lauf_aktiv
|
||||||
|
|
||||||
if not _scan_lock.acquire(blocking=False):
|
if not _scan_lock.acquire(blocking=False):
|
||||||
|
|
@@ -158,16 +326,16 @@ def scraping_lauf(label="Standard", flex_tage_liste=None):
|
||||||
return
|
return
|
||||||
|
|
||||||
tage_varianten = flex_tage_liste or [None]
|
tage_varianten = flex_tage_liste or [None]
|
||||||
|
online = fehler = 0
|
||||||
online = 0
|
|
||||||
fehler = 0
|
|
||||||
preise_gesamt = 0
|
|
||||||
|
|
||||||
for node in nodes:
|
for node in nodes:
|
||||||
if node_ping(node):
|
if node_ping(node):
|
||||||
update_node_status(node["name"], "online")
|
update_node_status(node["name"], "online")
|
||||||
online += 1
|
online += 1
|
||||||
for job in jobs:
|
for job in jobs:
|
||||||
|
skip_set = NODE_SCANNER_SKIP.get(node["name"], set())
|
||||||
|
if job["scanner"] in skip_set:
|
||||||
|
continue
|
||||||
for tage_var in tage_varianten:
|
for tage_var in tage_varianten:
|
||||||
try:
|
try:
|
||||||
ok = dispatch_job(node, job, tage_override=tage_var)
|
ok = dispatch_job(node, job, tage_override=tage_var)
|
||||||
|
|
@@ -182,7 +350,7 @@ def scraping_lauf(label="Standard", flex_tage_liste=None):
|
||||||
|
|
||||||
dauer = round((datetime.now() - start).total_seconds())
|
dauer = round((datetime.now() - start).total_seconds())
|
||||||
log(f"Scraping [{label}] fertig — {online}/{len(nodes)} Nodes | "
|
log(f"Scraping [{label}] fertig — {online}/{len(nodes)} Nodes | "
|
||||||
f"{fehler} Fehler | {dauer}s Laufzeit")
|
f"{fehler} Fehler | {dauer}s")
|
||||||
|
|
||||||
try:
|
try:
|
||||||
plausibilitaetspruefung()
|
plausibilitaetspruefung()
|
||||||
|
|
@@ -202,7 +370,6 @@ def scraping_lauf(label="Standard", flex_tage_liste=None):
|
||||||
|
|
||||||
|
|
||||||
def standard_lauf():
|
def standard_lauf():
|
||||||
"""30-Minuten-Takt — in eigenem Thread damit der Scheduler nicht blockiert."""
|
|
||||||
threading.Thread(
|
threading.Thread(
|
||||||
target=scraping_lauf,
|
target=scraping_lauf,
|
||||||
kwargs={"label": datetime.now().strftime("%H:%M")},
|
kwargs={"label": datetime.now().strftime("%H:%M")},
|
||||||
|
|
@@ -211,7 +378,6 @@ def standard_lauf():
|
||||||
|
|
||||||
|
|
||||||
def flex_lauf():
|
def flex_lauf():
|
||||||
"""Di/Mi 23:30 — ±3 Tage Datumsfenster."""
|
|
||||||
wochentag = datetime.now().weekday()
|
wochentag = datetime.now().weekday()
|
||||||
if wochentag not in (1, 2):
|
if wochentag not in (1, 2):
|
||||||
log("Flex-Lauf: heute kein Di/Mi — übersprungen")
|
log("Flex-Lauf: heute kein Di/Mi — übersprungen")
|
||||||
|
|
@@ -226,17 +392,269 @@ def flex_lauf():
|
||||||
).start()
|
).start()
|
||||||
|
|
||||||
|
|
||||||
|
def vorlauf_lauf():
|
||||||
|
"""Täglich 08:00 — scannt 45/60/84 Tage vorab für Buchungsvorlauf-Kurve."""
|
||||||
|
threading.Thread(
|
||||||
|
target=scraping_lauf,
|
||||||
|
kwargs={"label": "Vorlauf-84T", "flex_tage_liste": [45, 60, 84]},
|
||||||
|
daemon=True
|
||||||
|
).start()
|
||||||
|
|
||||||
|
|
||||||
|
def cleanup_lauf():
|
||||||
|
"""Tägliche Wartung: alte Screenshots löschen."""
|
||||||
|
cleanup_alte_screenshots(tage=30)
|
||||||
|
|
||||||
|
|
||||||
|
# ── Telegram Bot Befehle ──────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
def _cx_preise_jetzt() -> dict:
|
||||||
|
"""Holt aktuellen CX-Multicity-Preis und Vergleichswerte aus DB."""
|
||||||
|
conn = get_conn()
|
||||||
|
cx = conn.execute("""
|
||||||
|
SELECT MIN(preis) as min_p, MAX(scraped_at) as zuletzt
|
||||||
|
FROM prices WHERE scanner='kayak_multicity'
|
||||||
|
AND (kabine_erkannt NOT IN ('Premium Economy', 'Business', 'First') OR kabine_erkannt IS NULL)
|
||||||
|
AND scraped_at >= datetime('now','-3 hours')
|
||||||
|
""").fetchone()
|
||||||
|
direkt = conn.execute("""
|
||||||
|
SELECT MIN(preis) as min_p
|
||||||
|
FROM prices WHERE scanner='kayak'
|
||||||
|
AND (kabine_erkannt NOT IN ('Premium Economy', 'Business', 'First') OR kabine_erkannt IS NULL)
|
||||||
|
AND scraped_at >= datetime('now','-3 hours')
|
||||||
|
""").fetchone()
|
||||||
|
gestern_cx = conn.execute("""
|
||||||
|
SELECT MIN(preis) as min_p
|
||||||
|
FROM prices WHERE scanner='kayak_multicity'
|
||||||
|
AND date(scraped_at) = date('now','-1 day')
|
||||||
|
AND (kabine_erkannt NOT IN ('Premium Economy', 'Business', 'First') OR kabine_erkannt IS NULL)
|
||||||
|
""").fetchone()
|
||||||
|
ki = conn.execute("""
|
||||||
|
SELECT ki_empfehlung, ki_analyse FROM analyses
|
||||||
|
ORDER BY id DESC LIMIT 1
|
||||||
|
""").fetchone()
|
||||||
|
conn.close()
|
||||||
|
return {
|
||||||
|
"cx_min": cx["min_p"] if cx else None,
|
||||||
|
"cx_zuletzt": cx["zuletzt"][:16] if cx and cx["zuletzt"] else "?",
|
||||||
|
"direkt_min": direkt["min_p"] if direkt else None,
|
||||||
|
"gestern_cx": gestern_cx["min_p"] if gestern_cx else None,
|
||||||
|
"ki_empf": ki["ki_empfehlung"] if ki else "—",
|
||||||
|
"ki_text": ki["ki_analyse"][:200] if ki else "—",
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def _top3_heute() -> list:
|
||||||
|
"""Top 3 günstigste Multicity-Treffer heute."""
|
||||||
|
conn = get_conn()
|
||||||
|
rows = conn.execute("""
|
||||||
|
SELECT preis, abflug, ankunft, booking_url
|
||||||
|
FROM prices WHERE scanner='kayak_multicity'
|
||||||
|
AND (kabine_erkannt NOT IN ('Premium Economy', 'Business', 'First') OR kabine_erkannt IS NULL)
|
||||||
|
AND scraped_at >= datetime('now','-3 hours')
|
||||||
|
ORDER BY preis ASC LIMIT 3
|
||||||
|
""").fetchall()
|
||||||
|
conn.close()
|
||||||
|
return [dict(r) for r in rows]
|
||||||
|
|
||||||
|
|
||||||
|
def handle_bot_command(text: str, chat_id: str):
|
||||||
|
"""Verarbeitet eingehende Bot-Befehle."""
|
||||||
|
cmd = text.strip().lower().split()[0].split("@")[0] if text.strip() else ""
|
||||||
|
d = _cx_preise_jetzt()
|
||||||
|
|
||||||
|
if cmd == "/preis":
|
||||||
|
cx = d["cx_min"]
|
||||||
|
direkt = d["direkt_min"]
|
||||||
|
gestern = d["gestern_cx"]
|
||||||
|
trend = ""
|
||||||
|
if cx and gestern:
|
||||||
|
diff = cx - gestern
|
||||||
|
trend = f"{'↗️' if diff > 0 else '↘️' if diff < 0 else '➡️'} {diff:+.0f}€ vs. gestern"
|
||||||
|
aufpreis = f"{cx-direkt:+.0f}€ vs. Direktflug" if cx and direkt else ""
|
||||||
|
msg = (
|
||||||
|
f"✈️ <b>CX via HKG — aktueller Preis</b>\n\n"
|
||||||
|
f"💰 <b>{cx:.0f} EUR</b> Roundtrip {trend}\n"
|
||||||
|
f"🔵 Direktflug: {f'{direkt:.0f} EUR ({aufpreis})' if direkt else 'k.A.'}\n"
|
||||||
|
f"🕐 Letzter Scan: {d['cx_zuletzt']}\n\n"
|
||||||
|
f"KI: <b>{d['ki_empf']}</b>"
|
||||||
|
) if cx else "⏳ Noch keine Daten im aktuellen Scan-Fenster."
|
||||||
|
|
||||||
|
elif cmd == "/best":
|
||||||
|
top3 = _top3_heute()
|
||||||
|
if not top3:
|
||||||
|
msg = "⏳ Noch keine Treffer im aktuellen Scan-Fenster."
|
||||||
|
else:
|
||||||
|
zeilen = "\n".join([
|
||||||
|
f"{i+1}. <b>{r['preis']:.0f}€</b> — Abflug {r['abflug']} "
|
||||||
|
f"<a href='{r['booking_url']}'>buchen</a>"
|
||||||
|
for i, r in enumerate(top3)
|
||||||
|
])
|
||||||
|
msg = f"🏆 <b>Top 3 CX via HKG heute</b>\n\n{zeilen}"
|
||||||
|
|
||||||
|
elif cmd == "/status":
|
||||||
|
conn = get_conn()
|
||||||
|
nodes = conn.execute("SELECT name, status, last_seen FROM nodes").fetchall()
|
||||||
|
jobs_n = conn.execute("SELECT COUNT(*) FROM jobs WHERE aktiv=1").fetchone()[0]
|
||||||
|
letzter = conn.execute("SELECT MAX(scraped_at) FROM prices").fetchone()[0]
|
||||||
|
conn.close()
|
||||||
|
from ki import get_openrouter_guthaben
|
||||||
|
gut = get_openrouter_guthaben()
|
||||||
|
node_str = "\n".join([
|
||||||
|
f" {'🟢' if n['status']=='online' else '🔴'} {n['name']} ({n['last_seen'][:16] if n['last_seen'] else '?'})"
|
||||||
|
for n in nodes
|
||||||
|
])
|
||||||
|
msg = (
|
||||||
|
f"🖥️ <b>Flugscanner Status</b>\n\n"
|
||||||
|
f"<b>Nodes:</b>\n{node_str}\n\n"
|
||||||
|
f"<b>Aktive Jobs:</b> {jobs_n}\n"
|
||||||
|
f"<b>Letzter Scan:</b> {letzter[:16] if letzter else '?'}\n"
|
||||||
|
f"<b>OpenRouter:</b> {gut.get('verbleibend','?')} USD verbleibend"
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
msg = (
|
||||||
|
"✈️ <b>CX HKG Alert Bot</b>\n\n"
|
||||||
|
"/preis — Aktueller CX-Preis + Trend\n"
|
||||||
|
"/best — Top 3 günstigste Treffer heute\n"
|
||||||
|
"/status — Nodes, Scans, Guthaben"
|
||||||
|
)
|
||||||
|
|
||||||
|
telegram_send_to(chat_id, msg)
|
||||||
|
|
||||||
|
|
||||||
|
def telegram_send_to(chat_id: str, msg: str):
|
||||||
|
"""Sendet an spezifische Chat-ID."""
|
||||||
|
if not TELEGRAM_TOKEN:
|
||||||
|
return
|
||||||
|
try:
|
||||||
|
requests.post(
|
||||||
|
f"https://api.telegram.org/bot{TELEGRAM_TOKEN}/sendMessage",
|
||||||
|
json={"chat_id": chat_id, "text": msg,
|
||||||
|
"parse_mode": "HTML", "disable_web_page_preview": True},
|
||||||
|
timeout=10
|
||||||
|
)
|
||||||
|
except Exception as e:
|
||||||
|
log(f"Telegram-Send-Fehler: {e}", "WARN")
|
||||||
|
|
||||||
|
|
||||||
|
def telegram_polling():
|
||||||
|
"""Background-Thread: empfängt Bot-Befehle via Long Polling."""
|
||||||
|
if not TELEGRAM_TOKEN:
|
||||||
|
log("Telegram-Token fehlt — Bot-Polling deaktiviert", "WARN")
|
||||||
|
return
|
||||||
|
log("Telegram Bot Polling gestartet")
|
||||||
|
offset = 0
|
||||||
|
while True:
|
||||||
|
try:
|
||||||
|
r = requests.get(
|
||||||
|
f"https://api.telegram.org/bot{TELEGRAM_TOKEN}/getUpdates",
|
||||||
|
params={"offset": offset, "timeout": 30, "allowed_updates": ["message"]},
|
||||||
|
timeout=40
|
||||||
|
)
|
||||||
|
if r.status_code == 200:
|
||||||
|
updates = r.json().get("result", [])
|
||||||
|
for u in updates:
|
||||||
|
offset = u["update_id"] + 1
|
||||||
|
msg = u.get("message", {})
|
||||||
|
text = msg.get("text", "")
|
||||||
|
chat_id = str(msg.get("chat", {}).get("id", ""))
|
||||||
|
if text and chat_id:
|
||||||
|
log(f"Bot-Befehl von {chat_id}: {text[:30]}")
|
||||||
|
threading.Thread(
|
||||||
|
target=handle_bot_command,
|
||||||
|
args=(text, chat_id),
|
||||||
|
daemon=True
|
||||||
|
).start()
|
||||||
|
except Exception as e:
|
||||||
|
log(f"Polling-Fehler: {e}", "WARN")
|
||||||
|
time.sleep(5)
|
||||||
|
|
||||||
|
|
||||||
|
# ── Morgenbericht ─────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
def morgenbericht():
|
||||||
|
"""Täglich 07:00: Tagesüberblick per Telegram."""
|
||||||
|
d = _cx_preise_jetzt()
|
||||||
|
top3 = _top3_heute()
|
||||||
|
|
||||||
|
cx = d["cx_min"]
|
||||||
|
gestern = d["gestern_cx"]
|
||||||
|
|
||||||
|
if not cx:
|
||||||
|
telegram_send("☀️ Guten Morgen — noch keine Scan-Daten für heute.")
|
||||||
|
return
|
||||||
|
|
||||||
|
trend_emoji = "📈" if (cx and gestern and cx > gestern) else "📉" if (cx and gestern and cx < gestern) else "➡️"
|
||||||
|
trend_str = f"{trend_emoji} {cx-gestern:+.0f}€ vs. gestern" if gestern else ""
|
||||||
|
|
||||||
|
empf_farbe = {"JETZT BUCHEN": "🟢", "WARTEN": "🔴", "NEUTRAL": "🟡"}.get(d["ki_empf"], "⚪")
|
||||||
|
|
||||||
|
top_str = ""
|
||||||
|
if top3:
|
||||||
|
top_str = "\n🏆 <b>Beste Angebote:</b>\n" + "\n".join([
|
||||||
|
f" {i+1}. {r['preis']:.0f}€ — {r['abflug']} <a href='{r['booking_url']}'>buchen</a>"
|
||||||
|
for i, r in enumerate(top3)
|
||||||
|
])
|
||||||
|
|
||||||
|
msg = (
|
||||||
|
f"☀️ <b>Guten Morgen — CX via HKG</b>\n\n"
|
||||||
|
f"💰 Heute ab <b>{cx:.0f} EUR</b> {trend_str}\n"
|
||||||
|
f"{empf_farbe} KI-Empfehlung: <b>{d['ki_empf']}</b>\n"
|
||||||
|
f"{top_str}"
|
||||||
|
)
|
||||||
|
telegram_send(msg)
|
||||||
|
log("Morgenbericht gesendet")
|
||||||
|
|
||||||
|
|
||||||
|
# ── Preisanstieg-Alert ────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
_letzter_cx_preis: float = 0.0
|
||||||
|
|
||||||
|
def pruefe_preisanstieg(results, job):
|
||||||
|
"""Alert wenn CX via HKG um mehr als 50€ gestiegen ist."""
|
||||||
|
global _letzter_cx_preis
|
||||||
|
if job.get("scanner") != "kayak_multicity" or not results:
|
||||||
|
return
|
||||||
|
aktuell = min(r["preis"] for r in results)
|
||||||
|
if _letzter_cx_preis > 0 and aktuell > _letzter_cx_preis + 50:
|
||||||
|
diff = aktuell - _letzter_cx_preis
|
||||||
|
telegram_send(
|
||||||
|
f"📈 <b>Preisanstieg CX via HKG!</b>\n\n"
|
||||||
|
f"Vorher: {_letzter_cx_preis:.0f}€ → Jetzt: <b>{aktuell:.0f}€</b> "
|
||||||
|
f"(+{diff:.0f}€)\n\n"
|
||||||
|
f"Falls du buchen wolltest: jetzt könnte es teurer werden."
|
||||||
|
)
|
||||||
|
log(f"📈 Preisanstieg-Alert: {_letzter_cx_preis:.0f}→{aktuell:.0f}EUR")
|
||||||
|
if aktuell > 0:
|
||||||
|
_letzter_cx_preis = aktuell
|
||||||
|
|
||||||
|
|
||||||
def run():
|
def run():
|
||||||
init_db()
|
init_db()
|
||||||
log("Scheduler gestartet — alle 30 Minuten + Flex Di/Mi 23:30")
|
log("Scheduler gestartet")
|
||||||
|
|
||||||
# Alle 30 Minuten Standard-Scan
|
# Zufälliges Intervall 25-45 Minuten — Anti-Detection
|
||||||
schedule.every(30).minutes.do(standard_lauf)
|
schedule.every(25).to(45).minutes.do(standard_lauf)
|
||||||
|
|
||||||
# Di + Mi um 23:30: erweiterter Flex-Lauf mit ±3 Tage Datumsfenster
|
# Di + Mi 23:30: Flex-Lauf ±3 Tage
|
||||||
schedule.every().day.at("23:30").do(flex_lauf)
|
schedule.every().day.at("23:30").do(flex_lauf)
|
||||||
|
|
||||||
|
# Täglich 03:00: Screenshots aufräumen
|
||||||
|
schedule.every().day.at("03:00").do(cleanup_lauf)
|
||||||
|
|
||||||
|
# Täglich 08:00: Buchungsvorlauf-Scan 45/60/84 Tage
|
||||||
|
schedule.every().day.at("08:00").do(vorlauf_lauf)
|
||||||
|
log("Vorlauf-Scan: täglich 08:00 für 45/60/84 Tage vorab")
|
||||||
|
|
||||||
|
# Täglich 07:00: Morgenbericht
|
||||||
|
schedule.every().day.at("07:00").do(morgenbericht)
|
||||||
|
log("Morgenbericht: täglich 07:00 Uhr")
|
||||||
|
|
||||||
log(f"Nächster Lauf: {str(schedule.jobs[0].next_run)[:16]}")
|
log(f"Nächster Lauf: {str(schedule.jobs[0].next_run)[:16]}")
|
||||||
|
log("Scan-Intervall: zufällig 25-45 Minuten (Anti-Bot)")
|
||||||
|
|
||||||
|
# Telegram Bot Polling in eigenem Thread
|
||||||
|
threading.Thread(target=telegram_polling, daemon=True).start()
|
||||||
|
|
||||||
while True:
|
while True:
|
||||||
schedule.run_pending()
|
schedule.run_pending()
|
||||||
|
|
|
||||||
517
hub/src/web.py
|
|
@@ -45,6 +45,14 @@ BASE_HTML = """<!DOCTYPE html>
|
||||||
.btn-green:hover { background: #047857; }
|
.btn-green:hover { background: #047857; }
|
||||||
.btn-red { background: #dc2626; }
|
.btn-red { background: #dc2626; }
|
||||||
.btn-red:hover { background: #b91c1c; }
|
.btn-red:hover { background: #b91c1c; }
|
||||||
|
.fb { background:#1e293b;border:1px solid #334155;color:#94a3b8;padding:0.25rem 0.7rem;border-radius:6px;cursor:pointer;font-size:0.78rem;transition:all .15s; }
|
||||||
|
.fb:hover { border-color:#60a5fa;color:#bfdbfe; }
|
||||||
|
.fb-active { background:#1e3a5f;border-color:#2563eb;color:#93c5fd; }
|
||||||
|
.sortable { cursor:pointer;user-select:none; }
|
||||||
|
.sortable:hover { background:rgba(255,255,255,0.05); }
|
||||||
|
.sortable.asc .sort-icon::after { content:"↑"; }
|
||||||
|
.sortable.desc .sort-icon::after { content:"↓"; }
|
||||||
|
.sort-icon { font-size:0.7rem;color:#475569;margin-left:3px; }
|
||||||
.grid-2 { display: grid; grid-template-columns: 1fr 1fr; gap: 1.5rem; }
|
.grid-2 { display: grid; grid-template-columns: 1fr 1fr; gap: 1.5rem; }
|
||||||
.grid-3 { display: grid; grid-template-columns: repeat(3, 1fr); gap: 1.5rem; }
|
.grid-3 { display: grid; grid-template-columns: repeat(3, 1fr); gap: 1.5rem; }
|
||||||
.stat-box { text-align: center; }
|
.stat-box { text-align: center; }
|
||||||
|
|
@ -66,6 +74,7 @@ BASE_HTML = """<!DOCTYPE html>
|
||||||
<nav>
|
<nav>
|
||||||
<h1>✈️ Flugpreisscanner</h1>
|
<h1>✈️ Flugpreisscanner</h1>
|
||||||
<a href="/" class="{{ 'active' if page == 'overview' else '' }}">Übersicht</a>
|
<a href="/" class="{{ 'active' if page == 'overview' else '' }}">Übersicht</a>
|
||||||
|
<a href="/markt" class="{{ 'active' if page == 'markt' else '' }}">🏷 Marktübersicht</a>
|
||||||
<a href="/quellen" class="{{ 'active' if page == 'quellen' else '' }}">Quellen</a>
|
<a href="/quellen" class="{{ 'active' if page == 'quellen' else '' }}">Quellen</a>
|
||||||
<a href="/jobs" class="{{ 'active' if page == 'jobs' else '' }}">Jobs</a>
|
<a href="/jobs" class="{{ 'active' if page == 'jobs' else '' }}">Jobs</a>
|
||||||
<a href="/prompts" class="{{ 'active' if page == 'prompts' else '' }}">Prompts</a>
|
<a href="/prompts" class="{{ 'active' if page == 'prompts' else '' }}">Prompts</a>
|
||||||
|
|
@ -79,7 +88,7 @@ BASE_HTML = """<!DOCTYPE html>
|
||||||
|
|
||||||
OVERVIEW_HTML = BASE_HTML.replace("{% block content %}{% endblock %}", """
|
OVERVIEW_HTML = BASE_HTML.replace("{% block content %}{% endblock %}", """
|
||||||
<div style="background:#0c1a3a;border:1px solid#1e40af;border-radius:8px;padding:0.6rem 1rem;margin-bottom:0.6rem;font-size:0.85rem;color:#93c5fd;display:flex;justify-content:space-between;align-items:center;flex-wrap:wrap;gap:0.5rem">
|
<div style="background:#0c1a3a;border:1px solid#1e40af;border-radius:8px;padding:0.6rem 1rem;margin-bottom:0.6rem;font-size:0.85rem;color:#93c5fd;display:flex;justify-content:space-between;align-items:center;flex-wrap:wrap;gap:0.5rem">
|
||||||
<span>✈️ <strong>FRA → KTI</strong> · Roundtrip · Premium Economy · 1 Koffer + Handgepäck · ~2 Monate · max 22h / 2 Stopps / Umstieg 2-5h · 🇭🇰 <strong>HKG Stopover</strong> Variante aktiv</span>
|
<span>✈️ <strong>FRA → KTI</strong> · Roundtrip · Economy mit Gepäck · ~2 Monate · 🇭🇰 <strong>Cathay Pacific via HKG</strong> — 2 Nächte Stopover</span>
|
||||||
<span id="schedule-info" style="font-size:0.78rem;color:#60a5fa">Lade Zeitplan...</span>
|
<span id="schedule-info" style="font-size:0.78rem;color:#60a5fa">Lade Zeitplan...</span>
|
||||||
</div>
|
</div>
|
||||||
<div style="background:#451a03;border:1px solid#92400e;border-radius:8px;padding:0.5rem 1rem;margin-bottom:1.2rem;font-size:0.82rem;color:#fcd34d">
|
<div style="background:#451a03;border:1px solid#92400e;border-radius:8px;padding:0.5rem 1rem;margin-bottom:1.2rem;font-size:0.82rem;color:#fcd34d">
|
||||||
|
|
@ -89,8 +98,8 @@ OVERVIEW_HTML = BASE_HTML.replace("{% block content %}{% endblock %}", """
|
||||||
<div class="grid-3" style="margin-bottom:1.5rem">
|
<div class="grid-3" style="margin-bottom:1.5rem">
|
||||||
<div class="card stat-box">
|
<div class="card stat-box">
|
||||||
<div class="value" id="min-preis">—</div>
|
<div class="value" id="min-preis">—</div>
|
||||||
<div class="label">Günstigster PE-Preis heute (EUR)</div>
|
<div class="label">Günstigster Economy-Preis heute (EUR)</div>
|
||||||
<div id="min-preis-warnung" style="display:none;margin-top:0.4rem;font-size:0.75rem;color:#34d399">✓ KI-geprüft: nur plausible PE-Preise</div>
|
<div id="min-preis-warnung" style="display:none;margin-top:0.4rem;font-size:0.75rem;color:#34d399">✓ KI-geprüft · Roundtrip mit Gepäck</div>
|
||||||
</div>
|
</div>
|
||||||
<div class="card stat-box">
|
<div class="card stat-box">
|
||||||
<div class="value" id="avg-preis">—</div>
|
<div class="value" id="avg-preis">—</div>
|
||||||
|
|
@ -102,6 +111,22 @@ OVERVIEW_HTML = BASE_HTML.replace("{% block content %}{% endblock %}", """
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
|
<div class="card" style="margin-bottom:1.5rem;padding:0.9rem 1.5rem;display:flex;align-items:center;gap:2rem;flex-wrap:wrap">
|
||||||
|
<span style="font-size:0.8rem;color:#64748b;text-transform:uppercase;letter-spacing:0.05em">OpenRouter KI-Guthaben</span>
|
||||||
|
<span style="font-size:1.4rem;font-weight:700" id="or-guthaben">…</span>
|
||||||
|
<span style="font-size:0.85rem;color:#64748b">von <span id="or-limit">—</span> USD</span>
|
||||||
|
<div style="flex:1;min-width:120px;background:#0f172a;border-radius:6px;height:8px;overflow:hidden">
|
||||||
|
<div id="or-bar" style="height:100%;background:#38bdf8;border-radius:6px;width:0%;transition:width .5s"></div>
|
||||||
|
</div>
|
||||||
|
<span style="font-size:0.75rem;color:#475569">verbraucht: <span id="or-usage">—</span> USD</span>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="card" style="margin-bottom:1.5rem">
|
||||||
|
<h2>Wann buchen? — Preis nach Buchungsvorlauf <span style="font-size:0.75rem;color:#475569;font-weight:400;text-transform:none">(Direkt vs. via HKG)</span></h2>
|
||||||
|
<div id="vorlauf-hinweis" style="color:#64748b;font-size:0.85rem;margin-bottom:0.5rem"></div>
|
||||||
|
<canvas id="vorlaufChart" height="120"></canvas>
|
||||||
|
</div>
|
||||||
|
|
||||||
<div class="grid-2">
|
<div class="grid-2">
|
||||||
<div class="card">
|
<div class="card">
|
||||||
<h2>Preisverlauf (30 Tage)</h2>
|
<h2>Preisverlauf (30 Tage)</h2>
|
||||||
|
|
@ -143,8 +168,27 @@ OVERVIEW_HTML = BASE_HTML.replace("{% block content %}{% endblock %}", """
|
||||||
|
|
||||||
<div class="card">
|
<div class="card">
|
||||||
<h2>Alle Preise heute (Detail)</h2>
|
<h2>Alle Preise heute (Detail)</h2>
|
||||||
<table>
|
<div style="display:flex;gap:0.5rem;flex-wrap:wrap;margin-bottom:0.75rem;align-items:center">
|
||||||
<thead><tr><th>Anbieter</th><th>Node</th><th>Preis</th><th>Plausibilität</th><th>Abflug</th><th>Rückflug</th><th>Buchen</th><th>Screenshot</th></tr></thead>
|
<span style="color:#64748b;font-size:0.8rem">Filter Kabine:</span>
|
||||||
|
<button onclick="filterKabine('alle')" id="fb-alle" class="fb fb-active">Alle</button>
|
||||||
|
<button onclick="filterKabine('pe')" id="fb-pe" class="fb" style="display:none">PE</button>
|
||||||
|
<button onclick="filterKabine('eco')" id="fb-eco" class="fb">⚠ Economy</button>
|
||||||
|
<button onclick="filterKabine('light')" id="fb-light" class="fb">🚫 Eco Light</button>
|
||||||
|
<button onclick="filterKabine('unk')" id="fb-unk" class="fb">⏳ Unbekannt</button>
|
||||||
|
</div>
|
||||||
|
<table id="preise-table">
|
||||||
|
<thead><tr>
|
||||||
|
<th class="sortable" data-col="0">Anbieter <span class="sort-icon">⇅</span></th>
|
||||||
|
<th class="sortable" data-col="1">Node <span class="sort-icon">⇅</span></th>
|
||||||
|
<th class="sortable" data-col="2" data-type="num">Preis <span class="sort-icon">⇅</span></th>
|
||||||
|
<th class="sortable" data-col="3">Kabine (KI) <span class="sort-icon">⇅</span></th>
|
||||||
|
<th>Plausibilität</th>
|
||||||
|
<th class="sortable" data-col="5">Abflug <span class="sort-icon">⇅</span></th>
|
||||||
|
<th>Rückflug</th>
|
||||||
|
<th class="sortable" data-col="7">Gescrapt um <span class="sort-icon">⇅</span></th>
|
||||||
|
<th>Buchen</th>
|
||||||
|
<th>Screenshot</th>
|
||||||
|
</tr></thead>
|
||||||
<tbody id="preise-tbody"></tbody>
|
<tbody id="preise-tbody"></tbody>
|
||||||
</table>
|
</table>
|
||||||
</div>
|
</div>
|
||||||
|
|
@ -204,6 +248,28 @@ async function ladeUebersicht() {
|
||||||
document.getElementById('avg-preis').textContent = stats.avg_30d ? Math.round(stats.avg_30d) : '—';
|
document.getElementById('avg-preis').textContent = stats.avg_30d ? Math.round(stats.avg_30d) : '—';
|
||||||
document.getElementById('node-count').textContent = nodes.filter(n=>n.status==='online').length;
|
document.getElementById('node-count').textContent = nodes.filter(n=>n.status==='online').length;
|
||||||
|
|
||||||
|
// OpenRouter credit balance
|
||||||
|
try {
|
||||||
|
const or = await fetch('/api/openrouter/guthaben').then(r=>r.json());
|
||||||
|
if (or.fehler) {
|
||||||
|
document.getElementById('or-guthaben').textContent = 'Fehler';
|
||||||
|
document.getElementById('or-guthaben').style.color = '#f87171';
|
||||||
|
} else {
|
||||||
|
const verb = or.verbleibend != null ? or.verbleibend : '?';
|
||||||
|
const limit = or.limit || 20;
|
||||||
|
const pct = or.verbleibend != null ? Math.round((or.verbleibend / limit) * 100) : 0;
|
||||||
|
const farbe = pct > 50 ? '#34d399' : pct > 20 ? '#fbbf24' : '#f87171';
|
||||||
|
document.getElementById('or-guthaben').textContent = verb + ' USD';
|
||||||
|
document.getElementById('or-guthaben').style.color = farbe;
|
||||||
|
document.getElementById('or-limit').textContent = limit;
|
||||||
|
document.getElementById('or-usage').textContent = or.usage || '—';
|
||||||
|
document.getElementById('or-bar').style.width = pct + '%';
|
||||||
|
document.getElementById('or-bar').style.background = farbe;
|
||||||
|
}
|
||||||
|
} catch(e) {
|
||||||
|
document.getElementById('or-guthaben').textContent = 'n/a';
|
||||||
|
}
|
||||||
|
|
||||||
if (ki.ki_empfehlung) {
|
if (ki.ki_empfehlung) {
|
||||||
const farben = {'JETZT BUCHEN':'#34d399','WARTEN':'#fbbf24','NEUTRAL':'#60a5fa'};
|
const farben = {'JETZT BUCHEN':'#34d399','WARTEN':'#fbbf24','NEUTRAL':'#60a5fa'};
|
||||||
document.getElementById('ki-empfehlung').textContent = ki.ki_empfehlung;
|
document.getElementById('ki-empfehlung').textContent = ki.ki_empfehlung;
|
||||||
|
|
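The OpenRouter block in this hunk derives a percentage and a traffic-light colour from the remaining credit. A small Python sketch of the same arithmetic, assuming the `/api/openrouter/guthaben` response carries `verbleibend` and `limit` as the page script reads them; `credit_bar` is a hypothetical name used only for illustration:

```python
import math
from typing import Optional, Tuple


def credit_bar(remaining: Optional[float], limit: Optional[float]) -> Tuple[int, str]:
    """Return (percent_left, bar_colour) following the thresholds in the page script."""
    limit = limit or 20  # same fallback as `or.limit || 20`
    if remaining is None:
        pct = 0
    else:
        # JavaScript's Math.round rounds halves up, so use floor(x + 0.5)
        # instead of Python's round(), which rounds halves to even.
        pct = math.floor(remaining / limit * 100 + 0.5)
    if pct > 50:
        colour = "#34d399"  # green
    elif pct > 20:
        colour = "#fbbf24"  # amber
    else:
        colour = "#f87171"  # red
    return pct, colour


print(credit_bar(15.0, 20.0))
print(credit_bar(None, None))
```

Like the page script, the sketch does not clamp the percentage, so a remaining balance above the limit would yield a value over 100.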
@ -255,16 +321,15 @@ async function ladeUebersicht() {
|
||||||
}
|
}
|
||||||
|
|
||||||
// Detail table
|
// Detail table
|
||||||
const tbody = document.getElementById('preise-tbody');
|
|
||||||
const HOTEL_HKG = 150; // estimated HKG hotel cost in EUR
|
const HOTEL_HKG = 150; // estimated HKG hotel cost in EUR
|
||||||
tbody.innerHTML = preise.map(p => {
|
_allePreiszeilen = preise.map(p => {
|
||||||
const isMulticity = p.scanner === 'kayak_multicity';
|
const isMulticity = p.scanner === 'kayak_multicity';
|
||||||
// AI plausibility status: 1=plausible, 0=suspicious, -1/null=unchecked
|
// AI plausibility status: 1=plausible, 0=suspicious, -1/null=unchecked
|
||||||
const ps = p.plausi_status !== undefined ? p.plausi_status : (p.plausibel !== undefined ? p.plausibel : -1);
|
const ps = p.plausi_status !== undefined ? p.plausi_status : (p.plausibel !== undefined ? p.plausibel : -1);
|
||||||
const grund = p.plausi_info || p.plausi_grund || '';
|
const grund = p.plausi_info || p.plausi_grund || '';
|
||||||
let plausi;
|
let plausi;
|
||||||
if (ps === 1) {
|
if (ps === 1) {
|
||||||
plausi = `<span title="${grund}" style="background:#064e3b;color:#34d399;padding:0.15rem 0.5rem;border-radius:4px;font-size:0.75rem;cursor:help">✓ PE bestätigt</span>`;
|
plausi = `<span title="${grund}" style="background:#064e3b;color:#34d399;padding:0.15rem 0.5rem;border-radius:4px;font-size:0.75rem;cursor:help">✓ plausibel</span>`;
|
||||||
} else if (ps === 0) {
|
} else if (ps === 0) {
|
||||||
plausi = `<span title="${grund}" style="background:#7f1d1d;color:#fca5a5;padding:0.15rem 0.5rem;border-radius:4px;font-size:0.75rem;cursor:help">✗ ${grund.substring(0,40) || 'verdächtig'}</span>`;
|
plausi = `<span title="${grund}" style="background:#7f1d1d;color:#fca5a5;padding:0.15rem 0.5rem;border-radius:4px;font-size:0.75rem;cursor:help">✗ ${grund.substring(0,40) || 'verdächtig'}</span>`;
|
||||||
} else {
|
} else {
|
||||||
|
|
@ -290,17 +355,42 @@ async function ladeUebersicht() {
|
||||||
const rowStyle = verdaechtig
|
const rowStyle = verdaechtig
|
||||||
? ' style="background:rgba(239,68,68,0.08);border-left:3px solid #ef4444;opacity:0.7"'
|
? ' style="background:rgba(239,68,68,0.08);border-left:3px solid #ef4444;opacity:0.7"'
|
||||||
: (isMulticity ? ' style="background:rgba(99,102,241,0.06);border-left:3px solid #6366f1"' : '');
|
: (isMulticity ? ' style="background:rgba(99,102,241,0.06);border-left:3px solid #6366f1"' : '');
|
||||||
return `<tr${rowStyle}>
|
const scrapedAt = p.scraped_at
|
||||||
|
? (() => { const d = new Date(p.scraped_at.replace(' ','T')+'Z');
|
||||||
|
return d.toLocaleTimeString('de-DE',{hour:'2-digit',minute:'2-digit',second:'2-digit',timeZone:'Europe/Berlin'}); })()
|
||||||
|
: '—';
|
||||||
|
const kabine = p.kabine_erkannt || '⏳';
|
||||||
|
const kabineBadge = (() => {
|
||||||
|
const k = (p.kabine_erkannt || '').toLowerCase();
|
||||||
|
if (k.includes('premium'))
|
||||||
|
return `<span style="background:#14532d;color:#86efac;padding:0.15rem 0.45rem;border-radius:4px;font-size:0.72rem;white-space:nowrap">✅ CX</span>`;
|
||||||
|
if (k.includes('light') || k.includes('basic'))
|
||||||
|
return `<span style="background:#7f1d1d;color:#fca5a5;padding:0.15rem 0.45rem;border-radius:4px;font-size:0.72rem;white-space:nowrap">🚫 Eco Light</span>`;
|
||||||
|
if (k.includes('economy'))
|
||||||
|
return `<span style="background:#78350f;color:#fcd34d;padding:0.15rem 0.45rem;border-radius:4px;font-size:0.72rem;white-space:nowrap">⚠ Economy</span>`;
|
||||||
|
if (k.includes('business'))
|
||||||
|
return `<span style="background:#1e1b4b;color:#a5b4fc;padding:0.15rem 0.45rem;border-radius:4px;font-size:0.72rem;white-space:nowrap">💼 Business</span>`;
|
||||||
|
return `<span style="color:#475569;font-size:0.72rem">⏳</span>`;
|
||||||
|
})();
|
||||||
|
return {
|
||||||
|
_html: `<tr${rowStyle}>
|
||||||
<td>${scannerLabel}</td>
|
<td>${scannerLabel}</td>
|
||||||
<td style="font-size:0.8rem;color:#64748b">${p.node}</td>
|
<td style="font-size:0.8rem;color:#64748b">${p.node}</td>
|
||||||
<td>${gesamtHtml}</td>
|
<td>${gesamtHtml}</td>
|
||||||
|
<td>${kabineBadge}</td>
|
||||||
<td>${plausi}</td>
|
<td>${plausi}</td>
|
||||||
<td style="font-size:0.85rem">${p.abflug||'—'}</td>
|
<td style="font-size:0.85rem">${p.abflug||'—'}</td>
|
||||||
<td style="font-size:0.85rem">${p.ankunft||'—'}</td>
|
<td style="font-size:0.85rem">${p.ankunft||'—'}</td>
|
||||||
|
<td style="font-size:0.78rem;color:#64748b;white-space:nowrap">${scrapedAt}</td>
|
||||||
<td>${buchBtn}</td>
|
<td>${buchBtn}</td>
|
||||||
<td>${ssBtn}</td>
|
<td>${ssBtn}</td>
|
||||||
</tr>`;
|
</tr>`,
|
||||||
}).join('') || '<tr><td colspan="8" style="color:#475569;text-align:center">Noch keine Daten heute</td></tr>';
|
_kabine: p.kabine_erkannt || '',
|
||||||
|
_cells: [p.scanner||'', p.node||'', String(p.preis||0), p.kabine_erkannt||'', '',
|
||||||
|
p.abflug||'', p.ankunft||'', scrapedAt, '', ''],
|
||||||
|
};
|
||||||
|
});
|
||||||
|
renderPreisTabelle(_aktiveZeilen());
|
||||||
|
|
||||||
const ntbody = document.getElementById('nodes-tbody');
|
const ntbody = document.getElementById('nodes-tbody');
|
||||||
ntbody.innerHTML = nodes.map(n => `
|
ntbody.innerHTML = nodes.map(n => `
|
||||||
|
|
@ -309,7 +399,7 @@ async function ladeUebersicht() {
|
||||||
<td>${n.last_seen||'—'}</td></tr>
|
<td>${n.last_seen||'—'}</td></tr>
|
||||||
`).join('');
|
`).join('');
|
||||||
|
|
||||||
// Chart
|
// Price history chart (30 days)
|
||||||
const verlauf = await fetch('/api/preise/verlauf').then(r=>r.json());
|
const verlauf = await fetch('/api/preise/verlauf').then(r=>r.json());
|
||||||
const ctx = document.getElementById('priceChart').getContext('2d');
|
const ctx = document.getElementById('priceChart').getContext('2d');
|
||||||
new Chart(ctx, {
|
new Chart(ctx, {
|
||||||
|
|
@ -326,6 +416,71 @@ async function ladeUebersicht() {
|
||||||
scales: { x: { ticks: { color: '#64748b' }, grid: { color: '#1e293b' }},
|
scales: { x: { ticks: { color: '#64748b' }, grid: { color: '#1e293b' }},
|
||||||
y: { ticks: { color: '#64748b' }, grid: { color: '#1e293b' }}}}
|
y: { ticks: { color: '#64748b' }, grid: { color: '#1e293b' }}}}
|
||||||
});
|
});
|
||||||
|
|
||||||
|
// Booking lead-time chart
|
||||||
|
const vorlaufDaten = await fetch('/api/preise/vorlauf').then(r=>r.json());
|
||||||
|
const vCtx = document.getElementById('vorlaufChart').getContext('2d');
|
||||||
|
|
||||||
|
// Split into direct and HKG
|
||||||
|
const direktMap = {}, hkgMap = {};
|
||||||
|
vorlaufDaten.forEach(d => {
|
||||||
|
if (d.typ === 'direkt') direktMap[d.tage_vorab] = d.min_preis;
|
||||||
|
else hkgMap[d.tage_vorab] = d.min_preis;
|
||||||
|
});
|
||||||
|
|
||||||
|
const alleLabels = [...new Set(vorlaufDaten.map(d=>d.tage_vorab))].sort((a,b)=>b-a);
|
||||||
|
|
||||||
|
if (alleLabels.length < 2) {
|
||||||
|
document.getElementById('vorlauf-hinweis').textContent =
|
||||||
|
'⏳ Noch zu wenig Daten — wird täglich befüllt. Erster aussagekräftiger Chart in ca. 1 Woche.';
|
||||||
|
} else {
|
||||||
|
document.getElementById('vorlauf-hinweis').textContent =
|
||||||
|
`${alleLabels.length} Vorlauf-Punkte erfasst (${alleLabels[alleLabels.length-1]}–${alleLabels[0]} Tage vor Abflug)`;
|
||||||
|
}
|
||||||
|
|
||||||
|
new Chart(vCtx, {
|
||||||
|
type: 'line',
|
||||||
|
data: {
|
||||||
|
labels: alleLabels.map(t => t + 'T'),
|
||||||
|
datasets: [
|
||||||
|
{
|
||||||
|
label: 'Direkt FRA→KTI',
|
||||||
|
data: alleLabels.map(t => direktMap[t] || null),
|
||||||
|
borderColor: '#38bdf8', backgroundColor: 'rgba(56,189,248,0.08)',
|
||||||
|
tension: 0.3, fill: false, spanGaps: true,
|
||||||
|
pointRadius: 4, pointHoverRadius: 6
|
||||||
|
},
|
||||||
|
{
|
||||||
|
label: 'Via HKG (Cathay)',
|
||||||
|
data: alleLabels.map(t => hkgMap[t] || null),
|
||||||
|
borderColor: '#a78bfa', backgroundColor: 'rgba(167,139,250,0.08)',
|
||||||
|
tension: 0.3, fill: false, spanGaps: true,
|
||||||
|
pointRadius: 4, pointHoverRadius: 6, borderDash: [4,3]
|
||||||
|
}
|
||||||
|
]
|
||||||
|
},
|
||||||
|
options: {
|
||||||
|
plugins: {
|
||||||
|
legend: { labels: { color: '#94a3b8' }},
|
||||||
|
tooltip: {
|
||||||
|
callbacks: {
|
||||||
|
label: ctx => ctx.dataset.label + ': ' + Math.round(ctx.parsed.y) + ' €'
|
||||||
|
}
|
||||||
|
}
|
||||||
|
},
|
||||||
|
scales: {
|
||||||
|
x: {
|
||||||
|
reverse: true,
|
||||||
|
title: { display: true, text: '← früher buchen später buchen →', color: '#64748b' },
|
||||||
|
ticks: { color: '#64748b' }, grid: { color: '#1e293b' }
|
||||||
|
},
|
||||||
|
y: {
|
||||||
|
title: { display: true, text: 'EUR (Roundtrip)', color: '#64748b' },
|
||||||
|
ticks: { color: '#64748b' }, grid: { color: '#1e293b' }
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
});
|
||||||
}
|
}
|
||||||
|
|
||||||
async function pruefeScanStatus() {
|
async function pruefeScanStatus() {
|
||||||
|
|
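The lead-time chart in this hunk splits the `/api/preise/vorlauf` rows into a direct series and a via-HKG series that share one label axis. A Python sketch of that reshaping, assuming each row has the `typ`, `tage_vorab` and `min_preis` fields the page script reads; `split_lead_time` and the sample rows are made up for illustration:

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, object]


def split_lead_time(rows: List[Row]) -> Tuple[List[int], List[Optional[float]], List[Optional[float]]]:
    """Return (labels, direct_series, hkg_series) aligned on the same labels."""
    direct: Dict[int, float] = {}
    hkg: Dict[int, float] = {}
    for row in rows:
        # Anything that is not typ == 'direkt' counts as via HKG, as in the page script
        target = direct if row["typ"] == "direkt" else hkg
        target[row["tage_vorab"]] = row["min_preis"]
    # Largest lead time first, matching the descending sort in the page script
    labels = sorted({row["tage_vorab"] for row in rows}, reverse=True)
    # None marks a gap; the chart bridges it because spanGaps is enabled
    return labels, [direct.get(t) for t in labels], [hkg.get(t) for t in labels]


sample = [  # invented values, not real scan results
    {"typ": "direkt", "tage_vorab": 84, "min_preis": 910.0},
    {"typ": "hkg", "tage_vorab": 84, "min_preis": 780.0},
    {"typ": "hkg", "tage_vorab": 60, "min_preis": 805.0},
    {"typ": "direkt", "tage_vorab": 45, "min_preis": 990.0},
]
labels, direct_series, hkg_series = split_lead_time(sample)
print(labels, direct_series, hkg_series)
```

One difference from the page script: `direktMap[t] || null` also turns a price of 0 into a gap, whereas `dict.get` only returns `None` for a missing key.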
@ -343,6 +498,73 @@ async function pruefeScanStatus() {
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// ── Table sorting ─────────────────────────────────────────────────────────────
|
||||||
|
let _sortCol = null, _sortAsc = true;
|
||||||
|
let _allePreiszeilen = [];
|
||||||
|
|
||||||
|
function renderPreisTabelle(zeilen) {
|
||||||
|
const tbody = document.getElementById('preise-tbody');
|
||||||
|
tbody.innerHTML = zeilen.length
|
||||||
|
? zeilen.map(z => z._html).join('')
|
||||||
|
: '<tr><td colspan="10" style="color:#475569;text-align:center">Keine Einträge für diesen Filter</td></tr>';
|
||||||
|
}
|
||||||
|
|
||||||
|
function sortiereTabelle(colIdx, isNum) {
|
||||||
|
// Update header icons
|
||||||
|
document.querySelectorAll('#preise-table th.sortable').forEach(th => {
|
||||||
|
th.classList.remove('asc','desc');
|
||||||
|
th.querySelector('.sort-icon').textContent = '⇅';
|
||||||
|
});
|
||||||
|
const th = document.querySelector(`#preise-table th[data-col="${colIdx}"]`);
|
||||||
|
if (_sortCol === colIdx) { _sortAsc = !_sortAsc; }
|
||||||
|
else { _sortCol = colIdx; _sortAsc = true; }
|
||||||
|
th.classList.add(_sortAsc ? 'asc' : 'desc');
|
||||||
|
th.querySelector('.sort-icon').textContent = '';
|
||||||
|
|
||||||
|
const aktiv = _aktiveZeilen();
|
||||||
|
aktiv.sort((a, b) => {
|
||||||
|
const va = a._cells[colIdx] || '', vb = b._cells[colIdx] || '';
|
||||||
|
if (isNum) return _sortAsc ? parseFloat(va||0)-parseFloat(vb||0) : parseFloat(vb||0)-parseFloat(va||0);
|
||||||
|
return _sortAsc ? va.localeCompare(vb) : vb.localeCompare(va);
|
||||||
|
});
|
||||||
|
renderPreisTabelle(aktiv);
|
||||||
|
}
|
||||||
|
|
||||||
|
let _kabineFilter = 'alle';
|
||||||
|
function _aktiveZeilen() {
|
||||||
|
return _allePreiszeilen.filter(z => {
|
||||||
|
const k = (z._kabine || '').toLowerCase();
|
||||||
|
if (_kabineFilter === 'pe') return k.includes('premium');
|
||||||
|
if (_kabineFilter === 'eco') return k.includes('economy') && !k.includes('light') && !k.includes('premium');
|
||||||
|
if (_kabineFilter === 'light') return k.includes('light') || k.includes('basic');
|
||||||
|
if (_kabineFilter === 'unk') return !k || k === 'unbekannt' || k === 'fehler';
|
||||||
|
return true;
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
function filterKabine(typ) {
|
||||||
|
_kabineFilter = typ;
|
||||||
|
document.querySelectorAll('.fb').forEach(b => b.classList.remove('fb-active'));
|
||||||
|
document.getElementById('fb-' + typ).classList.add('fb-active');
|
||||||
|
_sortCol = null; _sortAsc = true;
|
||||||
|
document.querySelectorAll('#preise-table th.sortable').forEach(th => {
|
||||||
|
th.classList.remove('asc','desc');
|
||||||
|
const si = th.querySelector('.sort-icon');
|
||||||
|
if (si) si.textContent = '⇅';
|
||||||
|
});
|
||||||
|
renderPreisTabelle(_aktiveZeilen());
|
||||||
|
}
|
||||||
|
|
||||||
|
// Attach sort listeners to the headers (once, after DOM ready)
|
||||||
|
document.addEventListener('DOMContentLoaded', () => {
|
||||||
|
document.querySelectorAll('#preise-table th.sortable').forEach(th => {
|
||||||
|
th.addEventListener('click', () => {
|
||||||
|
sortiereTabelle(parseInt(th.dataset.col), th.dataset.type === 'num');
|
||||||
|
});
|
||||||
|
});
|
||||||
|
});
|
||||||
|
// ──────────────────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
function zeigeScreenshot(id, label) {
|
function zeigeScreenshot(id, label) {
|
||||||
const modal = document.getElementById('ss-modal');
|
const modal = document.getElementById('ss-modal');
|
||||||
const img = document.getElementById('ss-img');
|
const img = document.getElementById('ss-img');
|
||||||
|
|
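The cabin filter added in this hunk (`_aktiveZeilen`) matches on substrings of the detected cabin name, and the `eco` case has to exclude both "light" and "premium" because those names also contain "economy". A Python sketch of the same predicate; `matches_cabin` is a hypothetical name, while the filter keys and cabin strings are the ones used in the diff:

```python
from typing import Optional


def matches_cabin(cabin: Optional[str], active_filter: str) -> bool:
    """Mirror the substring rules of the dashboard's cabin filter."""
    k = (cabin or "").lower()
    if active_filter == "pe":
        return "premium" in k
    if active_filter == "eco":
        # "Economy Light" and "Premium Economy" both contain "economy"
        return "economy" in k and "light" not in k and "premium" not in k
    if active_filter == "light":
        return "light" in k or "basic" in k
    if active_filter == "unk":
        return not k or k in ("unbekannt", "fehler")
    return True  # 'alle' and any unknown filter key show every row


cabins = ["Economy", "Economy Light", "Premium Economy", None]
eco_only = [c for c in cabins if matches_cabin(c, "eco")]
print(eco_only)
```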
@ -416,6 +638,7 @@ def api_preise_heute():
|
||||||
(SELECT MAX(scraped_at) FROM prices WHERE date(scraped_at) = date('now')),
|
(SELECT MAX(scraped_at) FROM prices WHERE date(scraped_at) = date('now')),
|
||||||
'-20 minutes'
|
'-20 minutes'
|
||||||
)
|
)
|
||||||
|
AND (kabine_erkannt != 'Premium Economy' OR kabine_erkannt IS NULL)
|
||||||
ORDER BY preis ASC LIMIT 100
|
ORDER BY preis ASC LIMIT 100
|
||||||
""").fetchall()
|
""").fetchall()
|
||||||
if not rows:
|
if not rows:
|
||||||
|
|
@ -424,6 +647,7 @@ def api_preise_heute():
|
||||||
COALESCE(plausi_grund, '') as plausi_info
|
COALESCE(plausi_grund, '') as plausi_info
|
||||||
FROM prices
|
FROM prices
|
||||||
WHERE date(scraped_at) = date('now')
|
WHERE date(scraped_at) = date('now')
|
||||||
|
AND (kabine_erkannt != 'Premium Economy' OR kabine_erkannt IS NULL)
|
||||||
ORDER BY preis ASC LIMIT 100
|
ORDER BY preis ASC LIMIT 100
|
||||||
""").fetchall()
|
""").fetchall()
|
||||||
conn.close()
|
conn.close()
|
||||||
|
|
@ -444,6 +668,7 @@ def api_preise_vergleich():
|
||||||
'-20 minutes'
|
'-20 minutes'
|
||||||
)
|
)
|
||||||
AND (plausibel = 1 OR plausibel IS NULL)
|
AND (plausibel = 1 OR plausibel IS NULL)
|
||||||
|
AND (kabine_erkannt != 'Premium Economy' OR kabine_erkannt IS NULL)
|
||||||
GROUP BY scanner, node
|
GROUP BY scanner, node
|
||||||
ORDER BY scanner, preis
|
ORDER BY scanner, preis
|
||||||
""").fetchall()
|
""").fetchall()
|
||||||
|
|
@ -453,6 +678,7 @@ def api_preise_vergleich():
|
||||||
FROM prices
|
FROM prices
|
||||||
WHERE date(scraped_at) = date('now')
|
WHERE date(scraped_at) = date('now')
|
||||||
AND (plausibel = 1 OR plausibel IS NULL)
|
AND (plausibel = 1 OR plausibel IS NULL)
|
||||||
|
AND (kabine_erkannt != 'Premium Economy' OR kabine_erkannt IS NULL)
|
||||||
GROUP BY scanner, node
|
GROUP BY scanner, node
|
||||||
ORDER BY scanner, preis
|
ORDER BY scanner, preis
|
||||||
""").fetchall()
|
""").fetchall()
|
||||||
|
|
@ -493,6 +719,7 @@ def api_preise_verlauf():
|
||||||
rows = conn.execute("""
|
rows = conn.execute("""
|
||||||
SELECT date(scraped_at) as tag, MIN(preis) as min_preis, AVG(preis) as avg_preis
|
SELECT date(scraped_at) as tag, MIN(preis) as min_preis, AVG(preis) as avg_preis
|
||||||
FROM prices WHERE scraped_at >= datetime('now','-30 days')
|
FROM prices WHERE scraped_at >= datetime('now','-30 days')
|
||||||
|
AND (kabine_erkannt != 'Premium Economy' OR kabine_erkannt IS NULL)
|
||||||
GROUP BY date(scraped_at) ORDER BY tag
|
GROUP BY date(scraped_at) ORDER BY tag
|
||||||
""").fetchall()
|
""").fetchall()
|
||||||
conn.close()
|
conn.close()
|
||||||
|
|
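Each query in these hunks adds `AND (kabine_erkannt != 'Premium Economy' OR kabine_erkannt IS NULL)`. The `IS NULL` branch matters because in SQL a comparison with NULL evaluates to NULL rather than true, so rows without a detected cabin would otherwise be dropped. A self-contained `sqlite3` sketch with an in-memory table; the two-column schema and the prices are invented for illustration and are not the project's real `prices` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (preis REAL, kabine_erkannt TEXT)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?)",
    [(650.0, "Economy"), (980.0, "Premium Economy"), (700.0, None)],
)

# Without the IS NULL branch, the row with no detected cabin is filtered out
naive = [r[0] for r in conn.execute(
    "SELECT preis FROM prices "
    "WHERE kabine_erkannt != 'Premium Economy' ORDER BY preis"
)]

# With the branch, only the Premium Economy row is excluded
with_null = [r[0] for r in conn.execute(
    "SELECT preis FROM prices "
    "WHERE (kabine_erkannt != 'Premium Economy' OR kabine_erkannt IS NULL) "
    "ORDER BY preis"
)]
conn.close()

print(naive, with_null)
```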
@ -629,8 +856,239 @@ def api_screenshot(screenshot_id):
|
||||||
headers={"Cache-Control": "public, max-age=3600"})
|
headers={"Cache-Control": "public, max-age=3600"})
|
||||||
|
|
||||||
|
|
||||||
|
MARKT_HTML = """
|
||||||
|
<!-- Screenshot lightbox (shared) -->
|
||||||
|
<div id="m-ss-modal" onclick="this.style.display='none'"
|
||||||
|
style="display:none;position:fixed;top:0;left:0;width:100%;height:100%;
|
||||||
|
background:rgba(0,0,0,0.93);z-index:9999;overflow:auto;cursor:zoom-out">
|
||||||
|
<div style="text-align:center;padding:1.5rem">
|
||||||
|
<button onclick="document.getElementById('m-ss-modal').style.display='none'"
|
||||||
|
style="background:#ef4444;color:white;border:none;padding:0.4rem 1.2rem;
|
||||||
|
border-radius:6px;cursor:pointer;margin-bottom:0.8rem">✕ Schließen</button>
|
||||||
|
<div id="m-ss-label" style="color:#94a3b8;font-size:0.85rem;margin-bottom:0.5rem"></div>
|
||||||
|
<img id="m-ss-img" src="" style="max-width:100%;border-radius:8px;
|
||||||
|
box-shadow:0 0 40px rgba(0,0,0,0.8)" onclick="event.stopPropagation()">
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="card" style="margin-bottom:1rem">
|
||||||
|
<div style="display:flex;justify-content:space-between;align-items:center">
|
||||||
|
<h2 style="margin:0">🏷 Marktübersicht — letzte 24h</h2>
|
||||||
|
<span id="markt-stand" style="font-size:0.8rem;color:#475569"></span>
|
||||||
|
</div>
|
||||||
|
<p style="color:#64748b;font-size:0.85rem;margin:0.4rem 0 0">
|
||||||
|
Top-3 günstigste Angebote pro Kabinenklasse · mit Screenshot zur direkten Verifikation
|
||||||
|
</p>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div id="markt-grid" style="display:grid;grid-template-columns:1fr 1fr 1fr;gap:1.2rem;margin-bottom:1.5rem">
|
||||||
|
<div id="col-light" class="markt-col"></div>
|
||||||
|
<div id="col-eco" class="markt-col"></div>
|
||||||
|
<div id="col-pe" class="markt-col"></div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="card">
|
||||||
|
<h2>Preisdifferenz-Analyse</h2>
|
||||||
|
<div id="diff-analyse" style="display:grid;grid-template-columns:1fr 1fr;gap:1rem"></div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<style>
|
||||||
|
.markt-col { background:#0f172a;border:1px solid #1e293b;border-radius:12px;overflow:hidden; }
|
||||||
|
.markt-header { padding:1rem 1.2rem 0.8rem;border-bottom:1px solid #1e293b; }
|
||||||
|
.markt-preis-card { padding:0.9rem 1.2rem;border-bottom:1px solid #0f172a;transition:background .15s; }
|
||||||
|
.markt-preis-card:hover { background:rgba(255,255,255,0.03); }
|
||||||
|
.markt-preis-card:last-child { border-bottom:none; }
|
||||||
|
.markt-empty { padding:2rem;text-align:center;color:#334155;font-size:0.85rem; }
|
||||||
|
.ss-verify-btn { background:#1e3a5f;border:1px solid #2563eb;color:#93c5fd;
|
||||||
|
padding:0.2rem 0.5rem;border-radius:5px;cursor:pointer;
|
||||||
|
font-size:0.75rem;margin-left:4px; }
|
||||||
|
.ss-verify-btn:hover { background:#1d4ed8; }
|
||||||
|
.rank-badge { display:inline-block;width:20px;height:20px;border-radius:50%;
|
||||||
|
text-align:center;line-height:20px;font-size:0.7rem;font-weight:700;
|
||||||
|
margin-right:6px; }
|
||||||
|
</style>
|
||||||
|
|
||||||
|
<script>
|
||||||
|
function zeigeMarktScreenshot(id, label) {
|
||||||
|
document.getElementById('m-ss-modal').style.display = 'block';
|
||||||
|
document.getElementById('m-ss-label').textContent = label;
|
||||||
|
const img = document.getElementById('m-ss-img');
|
||||||
|
img.src = '';
|
||||||
|
img.src = '/api/screenshot/' + id;
|
||||||
|
}
|
||||||
|
|
||||||
|
function preisKarte(p, rank) {
|
||||||
|
const rankColors = ['#fbbf24','#94a3b8','#b45309'];
|
||||||
|
const rankLabels = ['1','2','3'];
|
||||||
|
const airline = p.airline || 'k.A.';
|
||||||
|
const abflug = p.abflug || '—';
|
||||||
|
const seit = p.scraped_at ? (() => {
|
||||||
|
const diff = Math.round((Date.now() - new Date(p.scraped_at.replace(' ','T')+'Z')) / 60000);
|
||||||
|
return diff < 60 ? `vor ${diff} Min` : `vor ${Math.floor(diff/60)}h`;
|
||||||
|
})() : '';
|
||||||
|
const ssBtn = p.screenshot_id
|
||||||
|
? `<button class="ss-verify-btn" onclick="zeigeMarktScreenshot(${p.screenshot_id},'${p.scanner} · ${p.kabine_erkannt} · ${abflug}')">📷 Prüfen</button>`
|
||||||
|
: '<span style="color:#334155;font-size:0.72rem">kein SS</span>';
|
||||||
|
const buchBtn = p.booking_url
|
||||||
|
? `<a href="${p.booking_url}" target="_blank"
|
||||||
|
style="background:#059669;color:white;padding:0.2rem 0.6rem;border-radius:5px;
|
||||||
|
font-size:0.75rem;text-decoration:none;margin-left:4px">↗ Buchen</a>`
|
||||||
|
: '';
|
||||||
|
const plausiIcon = p.plausibel === 1 ? '✓' : p.plausibel === 0 ? '✗' : '⏳';
|
||||||
|
const plausiColor = p.plausibel === 1 ? '#34d399' : p.plausibel === 0 ? '#f87171' : '#fbbf24';
|
||||||
|
|
||||||
|
return `<div class="markt-preis-card">
|
||||||
|
<div style="display:flex;justify-content:space-between;align-items:flex-start">
|
||||||
|
<div>
|
||||||
|
<span class="rank-badge" style="background:${rankColors[rank]};color:#0f172a">${rankLabels[rank]}</span>
|
||||||
|
<strong style="font-size:1.15rem;color:#f1f5f9">${p.preis} €</strong>
|
||||||
|
<span style="color:${plausiColor};font-size:0.72rem;margin-left:4px">${plausiIcon}</span>
|
||||||
|
</div>
|
||||||
|
<div style="text-align:right;font-size:0.72rem;color:#64748b">${seit}</div>
|
||||||
|
</div>
|
||||||
|
<div style="margin-top:0.4rem;font-size:0.8rem;color:#94a3b8">
|
||||||
|
<span style="color:#e2e8f0">${p.scanner}</span>
|
||||||
|
· ${p.node.replace('flugscanner-','')}
|
||||||
|
· ✈ ${airline}
|
||||||
|
</div>
|
||||||
|
<div style="margin-top:0.25rem;font-size:0.78rem;color:#64748b">
|
||||||
|
Abflug: <span style="color:#cbd5e1">${abflug}</span>
|
||||||
|
${p.ankunft ? `· Rück: <span style="color:#cbd5e1">${p.ankunft}</span>` : ''}
|
||||||
|
</div>
|
||||||
|
<div style="margin-top:0.5rem;display:flex;align-items:center;flex-wrap:wrap;gap:4px">
|
||||||
|
${ssBtn}${buchBtn}
|
||||||
|
</div>
|
||||||
|
</div>`;
|
||||||
|
}
|
||||||
|
|
||||||
|
function marktKolumne(kabine, id, farbe, icon, beschreibung, eintraege) {
|
||||||
|
const header = `<div class="markt-header">
|
||||||
|
<div style="display:flex;justify-content:space-between;align-items:center">
|
||||||
|
<span style="font-size:1rem;font-weight:700;color:${farbe}">${icon} ${kabine}</span>
|
||||||
|
<span style="font-size:0.72rem;background:${farbe}22;color:${farbe};
|
||||||
|
padding:0.1rem 0.5rem;border-radius:10px">${eintraege.length} Treffer</span>
|
||||||
|
</div>
|
||||||
|
<div style="font-size:0.75rem;color:#475569;margin-top:0.2rem">${beschreibung}</div>
|
||||||
|
${eintraege.length > 0
|
||||||
|
? `<div style="font-size:1.6rem;font-weight:800;color:${farbe};margin-top:0.5rem">
|
||||||
|
${eintraege[0].preis} €
|
||||||
|
<span style="font-size:0.8rem;font-weight:400;color:#475569">günstigster</span>
|
||||||
|
</div>`
|
||||||
|
: ''}
|
||||||
|
</div>`;
|
||||||
|
const karten = eintraege.length > 0
|
||||||
|
? eintraege.map((p, i) => preisKarte(p, i)).join('')
|
||||||
|
: `<div class="markt-empty">Keine Daten letzte 24h</div>`;
|
||||||
|
return header + karten;
|
||||||
|
}
|
||||||
|
|
||||||
|
function diffAnalyse(daten) {
|
||||||
|
const pe = daten['Premium Economy']?.[0]?.preis;
|
||||||
|
const eco = daten['Economy']?.[0]?.preis;
|
||||||
|
const light = daten['Economy Light']?.[0]?.preis;
|
||||||
|
const cards = [];
|
||||||
|
|
||||||
|
if (pe && eco) {
|
||||||
|
const diff = Math.round(pe - eco);
|
||||||
|
cards.push(`<div class="card" style="margin:0">
|
||||||
|
<div style="font-size:0.8rem;color:#64748b">PE vs. Economy</div>
|
||||||
|
<div style="font-size:1.4rem;font-weight:700;color:#f59e0b">+${diff} €</div>
|
||||||
|
<div style="font-size:0.78rem;color:#94a3b8">Aufpreis für breiteren Sitz + mehr Service</div>
|
||||||
|
</div>`);
|
||||||
|
}
|
||||||
|
if (pe && light) {
|
||||||
|
const diff = Math.round(pe - light);
|
||||||
|
cards.push(`<div class="card" style="margin:0">
|
||||||
|
<div style="font-size:0.8rem;color:#64748b">PE vs. Economy Light</div>
|
||||||
|
<div style="font-size:1.4rem;font-weight:700;color:#f59e0b">+${diff} €</div>
|
||||||
|
<div style="font-size:0.78rem;color:#94a3b8">Inkl. Koffer (~50-100€) + PE-Komfort</div>
|
||||||
|
</div>`);
|
||||||
|
}
|
||||||
|
if (eco && light) {
|
||||||
|
const diff = Math.round(eco - light);
|
||||||
|
cards.push(`<div class="card" style="margin:0">
|
||||||
|
<div style="font-size:0.8rem;color:#64748b">Economy vs. Economy Light</div>
|
||||||
|
<div style="font-size:1.4rem;font-weight:700;color:#60a5fa">+${diff} €</div>
|
||||||
|
<div style="font-size:0.78rem;color:#94a3b8">Aufpreis für Koffer inklusive</div>
|
||||||
|
</div>`);
|
||||||
|
}
|
||||||
|
return cards.join('');
|
||||||
|
}
|
||||||
|
|
||||||
|
async function ladenMarkt() {
|
||||||
|
const daten = await fetch('/api/markt').then(r => r.json());
|
||||||
|
|
||||||
|
document.getElementById('col-light').innerHTML = marktKolumne(
|
||||||
|
'Economy Light', 'col-light', '#f87171', '🚫',
|
||||||
|
'Nur Handgepäck · kein Koffer inklusive',
|
||||||
|
daten['Economy Light'] || []
|
||||||
|
);
|
||||||
|
document.getElementById('col-eco').innerHTML = marktKolumne(
|
||||||
|
'Economy', 'col-eco', '#fbbf24', '⚠',
|
||||||
|
'Koffer inklusive · Standard-Sitz',
|
||||||
|
daten['Economy'] || []
|
||||||
|
);
|
||||||
|
document.getElementById('col-pe').innerHTML = marktKolumne(
|
||||||
|
'Premium Economy', 'col-pe', '#34d399', '✅',
|
||||||
|
'Breiterer Sitz · mehr Beinfreiheit · Koffer inkl.',
|
||||||
|
daten['Premium Economy'] || []
|
||||||
|
);
|
||||||
|
|
||||||
|
document.getElementById('diff-analyse').innerHTML = diffAnalyse(daten);
|
||||||
|
|
||||||
|
const stats = daten['_stats'] || [];
|
||||||
|
const total = stats.reduce((s, r) => s + r.n, 0);
|
||||||
|
document.getElementById('markt-stand').textContent =
|
||||||
|
`${total} Scans letzte 24h · Stand: ${new Date().toLocaleTimeString('de-DE')}`;
|
||||||
|
}
|
||||||
|
|
||||||
|
ladenMarkt();
|
||||||
|
setInterval(ladenMarkt, 60000);
|
||||||
|
</script>
|
||||||
|
"""
|
||||||
|
|
||||||
|
# ─── Marktübersicht API ────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
@app.route("/api/markt")
|
||||||
|
def api_markt():
|
||||||
|
"""Top-3 günstigste Preise pro Kabinenklasse aus den letzten 24h."""
|
||||||
|
conn = get_conn()
|
||||||
|
result = {}
|
||||||
|
for kabine in ("Economy Light", "Economy", "Premium Economy"):
|
||||||
|
rows = conn.execute("""
|
||||||
|
SELECT p.id, p.preis, p.scanner, p.node, p.airline,
|
||||||
|
p.abflug, p.ankunft, p.booking_url, p.scraped_at,
|
||||||
|
p.screenshot_id, p.plausibel, p.kabine_erkannt
|
||||||
|
FROM prices p
|
||||||
|
WHERE p.kabine_erkannt = ?
|
||||||
|
AND p.scraped_at >= datetime('now', '-24 hours')
|
||||||
|
ORDER BY p.preis ASC
|
||||||
|
LIMIT 3
|
||||||
|
""", (kabine,)).fetchall()
|
||||||
|
result[kabine] = [dict(r) for r in rows]
|
||||||
|
|
||||||
|
# Gesamtstatistik letzte 24h
|
||||||
|
stats = conn.execute("""
|
||||||
|
SELECT kabine_erkannt, COUNT(*) as n, MIN(preis) as min_p
|
||||||
|
FROM prices
|
||||||
|
WHERE scraped_at >= datetime('now', '-24 hours')
|
||||||
|
AND kabine_erkannt IS NOT NULL
|
||||||
|
GROUP BY kabine_erkannt
|
||||||
|
""").fetchall()
|
||||||
|
result["_stats"] = [dict(r) for r in stats]
|
||||||
|
conn.close()
|
||||||
|
return jsonify(result)
|
||||||
|
|
||||||
|
|
||||||
# ─── Seiten ────────────────────────────────────────────────────────────────────
|
# ─── Seiten ────────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
@app.route("/markt")
|
||||||
|
def markt():
|
||||||
|
return render_template_string(BASE_HTML.replace(
|
||||||
|
"{% block content %}{% endblock %}", MARKT_HTML
|
||||||
|
), page="markt")
|
||||||
|
|
||||||
|
|
||||||
@app.route("/")
|
@app.route("/")
|
||||||
def overview():
|
def overview():
|
||||||
return render_template_string(OVERVIEW_HTML, page="overview")
|
return render_template_string(OVERVIEW_HTML, page="overview")
|
||||||
|
|
@ -791,6 +1249,41 @@ setInterval(laden, 10000);
|
||||||
return render_template_string(html, page="logs")
|
return render_template_string(html, page="logs")
|
||||||
|
|
||||||
|
|
||||||
|
# ─── OpenRouter Guthaben ───────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
@app.route("/api/openrouter/guthaben")
|
||||||
|
def api_openrouter_guthaben():
|
||||||
|
from ki import get_openrouter_guthaben
|
||||||
|
return jsonify(get_openrouter_guthaben())
|
||||||
|
|
||||||
|
|
||||||
|
# ─── Buchungsvorlauf-Kurve ─────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
@app.route("/api/preise/vorlauf")
|
||||||
|
def api_preise_vorlauf():
|
||||||
|
"""
|
||||||
|
Gibt pro Vorlauf-Fenster (tage_vorab) den günstigsten Preis zurück.
|
||||||
|
Separat für direkte Routen und Multicity (via HKG).
|
||||||
|
"""
|
||||||
|
conn = get_conn()
|
||||||
|
rows = conn.execute("""
|
||||||
|
SELECT
|
||||||
|
CAST(julianday(abflug) - julianday(date(scraped_at)) AS INT) as tage_vorab,
|
||||||
|
CASE WHEN scanner = 'kayak_multicity' THEN 'hkg' ELSE 'direkt' END as typ,
|
||||||
|
MIN(preis) as min_preis,
|
||||||
|
COUNT(*) as n
|
||||||
|
FROM prices
|
||||||
|
WHERE abflug != ''
|
||||||
|
AND (kabine_erkannt != 'Premium Economy' OR kabine_erkannt IS NULL)
|
||||||
|
AND plausibel != 0
|
||||||
|
GROUP BY tage_vorab, typ
|
||||||
|
HAVING tage_vorab BETWEEN 7 AND 100
|
||||||
|
ORDER BY tage_vorab DESC
|
||||||
|
""").fetchall()
|
||||||
|
conn.close()
|
||||||
|
return jsonify([dict(r) for r in rows])
|
||||||
|
|
||||||
|
|
||||||
# ─── Start ─────────────────────────────────────────────────────────────────────
|
# ─── Start ─────────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
if __name__ == "__main__":
|
if __name__ == "__main__":
|
||||||
|
|
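Review note on the `/api/markt` endpoint above: the per-cabin "top 3 in the last 24h" query can be tried in isolation against an in-memory SQLite database. The sketch below is only an illustration under assumptions: `top_prices` and `demo_db` are hypothetical names, the table has just the four columns needed here, and `scraped_at` is assumed to be stored as UTC text in `YYYY-MM-DD HH:MM:SS` form so that the string comparison with `datetime('now', '-24 hours')` behaves as intended.

```python
import sqlite3

CABINS = ("Economy Light", "Economy", "Premium Economy")


def top_prices(conn, cabins=CABINS, limit=3):
    """Return the `limit` cheapest rows per cabin from the last 24 hours."""
    conn.row_factory = sqlite3.Row  # rows become dict-convertible
    result = {}
    for cabin in cabins:
        rows = conn.execute(
            """
            SELECT preis, scanner
            FROM prices
            WHERE kabine_erkannt = ?
              AND scraped_at >= datetime('now', '-24 hours')
            ORDER BY preis ASC
            LIMIT ?
            """,
            (cabin, limit),
        ).fetchall()
        result[cabin] = [dict(r) for r in rows]
    return result


def demo_db():
    """Hypothetical fixture: four fresh Economy rows and one stale row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE prices "
        "(preis REAL, scanner TEXT, kabine_erkannt TEXT, scraped_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO prices VALUES (?, ?, ?, datetime('now', ?))",
        [
            (812.0, "kayak", "Economy", "-1 hours"),
            (799.0, "momondo", "Economy", "-2 hours"),
            (845.0, "kayak_multicity", "Economy", "-3 hours"),
            (901.0, "kayak", "Economy", "-4 hours"),
            (650.0, "kayak", "Economy", "-3 days"),  # outside the 24h window
        ],
    )
    return conn
```

The stale 650 row is excluded by the time filter, so the cheapest fresh row wins even though an older, cheaper one exists. A cabin without data yields an empty list, which is what the dashboard's "Keine Daten letzte 24h" branch expects.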
|
||||||
|
|
@ -1 +0,0 @@
|
||||||
NODE_NAME=flugscanner-asia
|
|
||||||
|
|
@ -1 +0,0 @@
|
||||||
NODE_NAME=flugscanner-mu
|
|
||||||
|
|
@ -1,2 +1,2 @@
|
||||||
flask==3.1.0
|
flask==3.1.0
|
||||||
seleniumbase==4.34.4
|
seleniumbase>=4.46.5
|
||||||
|
|
|
||||||
|
|
@ -53,7 +53,7 @@ def scrape(scanner, von, nach, tage=30, aufenthalt_tage=60,
|
||||||
screenshot_b64 = JPEG Full-Page Screenshot als base64-String (leer wenn Fehler)
|
screenshot_b64 = JPEG Full-Page Screenshot als base64-String (leer wenn Fehler)
|
||||||
"""
|
"""
|
||||||
dispatcher = {
|
dispatcher = {
|
||||||
"google_flights": _scrape_disabled,
|
"google_flights": scrape_google_flights,
|
||||||
"kayak": scrape_kayak,
|
"kayak": scrape_kayak,
|
||||||
"kayak_multicity": scrape_kayak_multicity,
|
"kayak_multicity": scrape_kayak_multicity,
|
||||||
"momondo": scrape_momondo,
|
"momondo": scrape_momondo,
|
||||||
|
|
@ -96,23 +96,14 @@ def _take_screenshot(sb):
|
||||||
def _booking_url_google(von, nach, abflug, rueck, kc):
|
def _booking_url_google(von, nach, abflug, rueck, kc):
|
||||||
# Hash-Fragment wird von headless Chrome ignoriert → tfs-Parameter nutzen
|
# Hash-Fragment wird von headless Chrome ignoriert → tfs-Parameter nutzen
|
||||||
if rueck:
|
if rueck:
|
||||||
return (f"https://www.google.com/travel/flights?hl=de&curr=EUR"
|
return (f"https://www.google.com/travel/flights?hl=en&curr=EUR"
|
||||||
f"#flt={von}.{nach}.{abflug}*{nach}.{von}.{rueck};c:EUR;e:1;sd:1;t:r;sc:{kc}")
|
f"#flt={von}.{nach}.{abflug}*{nach}.{von}.{rueck};c:EUR;e:1;sd:1;t:r;sc:{kc}")
|
||||||
return (f"https://www.google.com/travel/flights?hl=de&curr=EUR"
|
return (f"https://www.google.com/travel/flights?hl=en&curr=EUR"
|
||||||
f"#flt={von}.{nach}.{abflug};c:EUR;e:1;sd:1;t:f;sc:{kc}")
|
f"#flt={von}.{nach}.{abflug};c:EUR;e:1;sd:1;t:f;sc:{kc}")
|
||||||
|
|
||||||
|
|
||||||
def _booking_url_kayak(von, nach, abflug, rueck, kc, bags=1,
|
def _kayak_filters(bags, layover_min, layover_max, max_flugzeit_h, max_stops, airline):
|
||||||
layover_min=120, layover_max=300, airline="",
|
"""Gemeinsame Filter-Logik für alle Kayak-URL-Funktionen."""
|
||||||
max_flugzeit_h=22, max_stops=2):
|
|
||||||
"""
|
|
||||||
Kayak fs-Filter:
|
|
||||||
bfc=1 → min. 1 Freigepäck inklusive
|
|
||||||
ctr=120,300 → Umstiegszeit 2–5 Stunden (Minuten)
|
|
||||||
duration=-1320 → Max. Gesamtflugzeit (Minuten, hier 22h)
|
|
||||||
s=2 → Max. 2 Stopps
|
|
||||||
airlines=XX → Airline-Code (CZ, CX, SQ, TG …)
|
|
||||||
"""
|
|
||||||
filters = []
|
filters = []
|
||||||
if bags:
|
if bags:
|
||||||
filters.append(f"bfc%3D{bags}")
|
filters.append(f"bfc%3D{bags}")
|
||||||
|
|
@ -124,13 +115,47 @@ def _booking_url_kayak(von, nach, abflug, rueck, kc, bags=1,
|
||||||
filters.append(f"s%3D{max_stops}")
|
filters.append(f"s%3D{max_stops}")
|
||||||
if airline:
|
if airline:
|
||||||
filters.append(f"airlines%3D{airline}")
|
filters.append(f"airlines%3D{airline}")
|
||||||
fs = ("&fs=" + "%3B".join(filters)) if filters else ""
|
return ("&fs=" + "%3B".join(filters)) if filters else ""
|
||||||
|
|
||||||
|
|
||||||
|
def _scrape_url_kayak(von, nach, abflug, rueck, kc, bags=1,
|
||||||
|
layover_min=120, layover_max=300, airline="",
|
||||||
|
max_flugzeit_h=22, max_stops=2):
|
||||||
|
"""Interne Scraping-URL (kayak.de — bekannte HTML-Struktur)."""
|
||||||
|
fs = _kayak_filters(bags, layover_min, layover_max, max_flugzeit_h, max_stops, airline)
|
||||||
base = f"https://www.kayak.de/flights/{von}-{nach}/{abflug}"
|
base = f"https://www.kayak.de/flights/{von}-{nach}/{abflug}"
|
||||||
if rueck:
|
if rueck:
|
||||||
return f"{base}/{rueck}?sort=price_a&cabin={kc}&currency=EUR{fs}"
|
return f"{base}/{rueck}?sort=price_a&cabin={kc}&currency=EUR{fs}"
|
||||||
return f"{base}?sort=price_a&cabin={kc}&currency=EUR{fs}"
|
return f"{base}?sort=price_a&cabin={kc}&currency=EUR{fs}"
|
||||||
|
|
||||||
|
|
||||||
|
def _booking_url_kayak(von, nach, abflug, rueck, kc, bags=1,
|
||||||
|
layover_min=120, layover_max=300, airline="",
|
||||||
|
max_flugzeit_h=22, max_stops=2):
|
||||||
|
"""User-facing Booking-URL (kayak.com international, kein DE-Aufschlag)."""
|
||||||
|
fs = _kayak_filters(bags, layover_min, layover_max, max_flugzeit_h, max_stops, airline)
|
||||||
|
base = f"https://www.kayak.com/flights/{von}-{nach}/{abflug}"
|
||||||
|
if rueck:
|
||||||
|
return f"{base}/{rueck}?sort=price_a&cabin={kc}&currency=EUR{fs}"
|
||||||
|
return f"{base}?sort=price_a&cabin={kc}&currency=EUR{fs}"
|
||||||
|
|
||||||
|
|
||||||
|
def _consent_kayak(sb):
|
||||||
|
"""Kayak/Momondo GDPR-Consent wegklicken."""
|
||||||
|
for sel in ['#didomi-notice-agree-button', 'button[class*="accept"]',
|
||||||
|
'button[class*="agree"]', '[data-testid*="accept"]',
|
||||||
|
'button[id*="accept"]', '.RxNS-button-content',
|
||||||
|
'button[aria-label*="akzeptieren"]', 'button[aria-label*="Alle"]']:
|
||||||
|
try:
|
||||||
|
sb.find_element(sel, timeout=2).click()
|
||||||
|
print(f"[CONSENT] Kayak Consent geklickt: {sel}")
|
||||||
|
sb.sleep(3)
|
||||||
|
return True
|
||||||
|
except Exception:
|
||||||
|
pass
|
||||||
|
return False
|
||||||
|
|
||||||
|
|
||||||
def _booking_url_momondo(von, nach, abflug, rueck, kc, bags=1,
|
def _booking_url_momondo(von, nach, abflug, rueck, kc, bags=1,
|
||||||
layover_min=120, layover_max=300, airline="",
|
layover_min=120, layover_max=300, airline="",
|
||||||
max_flugzeit_h=22, max_stops=2):
|
max_flugzeit_h=22, max_stops=2):
|
||||||
|
|
@ -147,12 +172,28 @@ def _booking_url_momondo(von, nach, abflug, rueck, kc, bags=1,
|
||||||
if airline:
|
if airline:
|
||||||
filters.append(f"airlines%3D{airline}")
|
filters.append(f"airlines%3D{airline}")
|
||||||
fs = ("&fs=" + "%3B".join(filters)) if filters else ""
|
fs = ("&fs=" + "%3B".join(filters)) if filters else ""
|
||||||
base = f"https://www.momondo.de/flight-search/{von}-{nach}/{abflug}"
|
base = f"https://www.momondo.com/flight-search/{von}-{nach}/{abflug}"
|
||||||
if rueck:
|
if rueck:
|
||||||
return f"{base}/{rueck}?sort=price_a&cabin={kc}&currency=EUR{fs}"
|
return f"{base}/{rueck}?sort=price_a&cabin={kc}&currency=EUR{fs}"
|
||||||
return f"{base}?sort=price_a&cabin={kc}&currency=EUR{fs}"
|
return f"{base}?sort=price_a&cabin={kc}&currency=EUR{fs}"
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
def _scrape_url_momondo(von, nach, abflug, rueck, kc, bags=1,
|
||||||
|
layover_min=120, layover_max=300, airline="",
|
||||||
|
max_flugzeit_h=22, max_stops=2):
|
||||||
|
filters = []
|
||||||
|
if bags: filters.append(f"bfc%3D{bags}")
|
||||||
|
if layover_min and layover_max: filters.append(f"ctr%3D{layover_min}%2C{layover_max}")
|
||||||
|
if max_flugzeit_h: filters.append(f"duration%3D-{max_flugzeit_h * 60}")
|
||||||
|
if max_stops is not None and max_stops < 10: filters.append(f"s%3D{max_stops}")
|
||||||
|
if airline: filters.append(f"airlines%3D{airline}")
|
||||||
|
fs = ("&fs=" + "%3B".join(filters)) if filters else ""
|
||||||
|
base = f"https://www.momondo.de/flight-search/{von}-{nach}/{abflug}"
|
||||||
|
if rueck:
|
||||||
|
return f"{base}/{rueck}?sort=price_a&cabin={kc}&currency=EUR{fs}"
|
||||||
|
return f"{base}?sort=price_a&cabin={kc}&currency=EUR{fs}"
|
||||||
|
|
||||||
def _booking_url_trip(von, nach, abflug_fmt, rueck_fmt, kc, von_name, nach_name):
|
def _booking_url_trip(von, nach, abflug_fmt, rueck_fmt, kc, von_name, nach_name):
|
||||||
if rueck_fmt:
|
if rueck_fmt:
|
||||||
return (f"https://www.trip.com/flights/{von_name}-to-{nach_name}/"
|
return (f"https://www.trip.com/flights/{von_name}-to-{nach_name}/"
|
||||||
|
|
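Review note on the URL helpers above: `_scrape_url_momondo` repeats the filter construction that `_kayak_filters` already centralises for Kayak. One possible consolidation, sketched under assumptions, is a single builder that takes the host and path as parameters, so the scrape host (`.de`) and the booking host (`.com`) differ only in one argument. `kayak_filters` and `search_url` are hypothetical names, and the cabin code `"e"` in the usage example is a placeholder, not a value confirmed by this diff.

```python
from urllib.parse import quote


def kayak_filters(bags=1, layover_min=120, layover_max=300,
                  max_flight_h=22, max_stops=2, airline=""):
    """Build the percent-encoded `fs` query fragment shared by Kayak and Momondo."""
    parts = []
    if bags:
        parts.append(f"bfc={bags}")
    if layover_min and layover_max:
        parts.append(f"ctr={layover_min},{layover_max}")
    if max_flight_h:
        parts.append(f"duration=-{max_flight_h * 60}")
    if max_stops is not None and max_stops < 10:
        parts.append(f"s={max_stops}")
    if airline:
        parts.append(f"airlines={airline}")
    # quote(..., safe="") encodes "=", ";" and "," as %3D, %3B and %2C
    return ("&fs=" + quote(";".join(parts), safe="")) if parts else ""


def search_url(host, path, von, nach, abflug, rueck, kc, **filters):
    """Assemble a one-way or round-trip search URL for the given host."""
    base = f"https://www.{host}/{path}/{von}-{nach}/{abflug}"
    if rueck:
        base += f"/{rueck}"
    return f"{base}?sort=price_a&cabin={kc}&currency=EUR{kayak_filters(**filters)}"
```

With this shape, `search_url("momondo.de", "flight-search", ...)` would serve as the scrape URL and `search_url("momondo.com", "flight-search", ...)` as the booking URL, using identical filters for both.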
@ -189,8 +230,10 @@ def _parse_preis(text):
|
||||||
def _preise_aus_body(body, scanner, abflug):
|
def _preise_aus_body(body, scanner, abflug):
|
||||||
results = []
|
results = []
|
||||||
seen = set()
|
seen = set()
|
||||||
for m in re.finditer(r'(\d[\d\s\.]{1,5})\s?€|€\s?(\d[\d\s\.]{1,5})', body):
|
# Normalisierung: thin/non-breaking spaces → reguläre Leerzeichen
|
||||||
raw = (m.group(1) or m.group(2)).replace(' ', '').replace('.', '')
|
body_norm = body.replace('\xa0', ' ').replace('\u202f', ' ').replace('\u2009', ' ')
|
||||||
|
for m in re.finditer(r'(\d{1,2}[.,]\d{3}|\d[\d\s\.]{1,5})\s?€|€\s?(\d[\d\s\.]{1,5})', body_norm):
|
||||||
|
raw = (m.group(1) or m.group(2)).strip().replace(' ', '').replace('.', '').replace(',', '')
|
||||||
try:
|
try:
|
||||||
v = float(raw)
|
v = float(raw)
|
||||||
if 300 < v < 12000 and v not in seen:
|
if 300 < v < 12000 and v not in seen:
|
||||||
|
|
@ -207,10 +250,13 @@ def _preise_aus_body(body, scanner, abflug):
|
||||||
|
|
||||||
def _consent_google(sb):
|
def _consent_google(sb):
|
||||||
"""Google Consent-Seite (DSGVO) behandeln."""
|
"""Google Consent-Seite (DSGVO) behandeln."""
|
||||||
if "consent" in sb.get_current_url() or "Bevor Sie" in sb.get_title():
|
title = sb.get_title()
|
||||||
|
url = sb.get_current_url()
|
||||||
|
if "consent" in url or "Bevor Sie" in title or "Before you" in title:
|
||||||
print("[CONSENT] Google Consent erkannt")
|
print("[CONSENT] Google Consent erkannt")
|
||||||
for sel in ['form[action*="save"] button', 'button[jsname="tHlp8d"]',
|
for sel in ['form[action*="save"] button', 'button[jsname="tHlp8d"]',
|
||||||
'.lssxud button', 'button[aria-label*="kzeptieren"]']:
|
'.lssxud button', 'button[aria-label*="kzeptieren"]',
|
||||||
|
'button[aria-label*="Accept all"]', 'button[aria-label*="Accept"]']:
|
||||||
try:
|
try:
|
||||||
sb.click(sel, timeout=3)
|
sb.click(sel, timeout=3)
|
||||||
sb.sleep(4)
|
sb.sleep(4)
|
||||||
|
|
@ -277,52 +323,66 @@ def scrape_google_flights(von, nach, tage=30, aufenthalt_tage=60,
|
||||||
print(f"[GF] Suche: {von_name}→{nach_name} {abflug_de}")
|
print(f"[GF] Suche: {von_name}→{nach_name} {abflug_de}")
|
||||||
|
|
||||||
with SB(uc=True, headless=True, chromium_arg="--no-sandbox --disable-dev-shm-usage") as sb:
|
with SB(uc=True, headless=True, chromium_arg="--no-sandbox --disable-dev-shm-usage") as sb:
|
||||||
# ── Strategie 1: Direkte URL mit Datums-Parametern ─────────────────
|
# Hash-Fragment URL (wird nach Consent-Redirect verloren — daher 2-Schritt)
|
||||||
# Google Flights verarbeitet den Hash-Fragment erst nach JS-Ausführung
|
|
||||||
direct_url = (
|
direct_url = (
|
||||||
f"https://www.google.com/travel/flights?hl=de&curr=EUR"
|
f"https://www.google.com/travel/flights?hl=en&curr=EUR"
|
||||||
f"#flt={von}.{nach}.{abflug}*{nach}.{von}.{rueck}"
|
f"#flt={von}.{nach}.{abflug}*{nach}.{von}.{rueck}"
|
||||||
f";c:EUR;e:1;sd:1;t:r;sc:w"
|
f";c:EUR;e:1;sd:1;t:r;sc:e"
|
||||||
) if rueck else (
|
) if rueck else (
|
||||||
f"https://www.google.com/travel/flights?hl=de&curr=EUR"
|
f"https://www.google.com/travel/flights?hl=en&curr=EUR"
|
||||||
f"#flt={von}.{nach}.{abflug};c:EUR;e:1;sd:1;t:f;sc:w"
|
f"#flt={von}.{nach}.{abflug};c:EUR;e:1;sd:1;t:f;sc:e"
|
||||||
)
|
)
|
||||||
sb.open(direct_url)
|
|
||||||
sb.sleep(8)
|
# ── Schritt 1: Consent zuerst auf der Basis-URL akzeptieren ─────────
|
||||||
_consent_google(sb)
|
# Consent-Redirect von consent.google.com strippt den #-Fragment.
|
||||||
|
# Lösung: Consent einmal auf Basisseite akzeptieren, dann Hash-URL öffnen.
|
||||||
|
sb.open("https://www.google.com/travel/flights?hl=en&curr=EUR")
|
||||||
|
sb.sleep(6)
|
||||||
|
consented = _consent_google(sb)
|
||||||
|
if consented:
|
||||||
|
print("[GF] Consent akzeptiert — öffne jetzt Hash-URL")
|
||||||
sb.sleep(3)
|
sb.sleep(3)
|
||||||
|
|
||||||
|
# ── Schritt 2: Jetzt Hash-URL mit Suchparametern öffnen ─────────────
|
||||||
|
sb.open(direct_url)
|
||||||
|
sb.sleep(12)
|
||||||
title_direct = sb.get_title()
|
title_direct = sb.get_title()
|
||||||
print(f"[GF] URL-Ansatz: {title_direct[:60]}")
|
url_now = sb.get_current_url()
|
||||||
|
print(f"[GF] Titel: {title_direct[:60]}")
|
||||||
|
print(f"[GF] URL: {url_now[:80]}")
|
||||||
|
|
||||||
# Wenn direkte URL Ergebnisse liefert (Titel enthält Städtenamen)
|
# Wenn Hash-Deeplink Ergebnisse liefert
|
||||||
url_erfolgreich = any(kw in title_direct for kw in
|
url_erfolgreich = any(kw in title_direct for kw in
|
||||||
[von, nach, "FRA", "KTI", "Frankfurt", "Phnom", "Flüge"])
|
[von, nach, "FRA", "KTI", "Frankfurt", "Phnom", "Flights to", "Flüge"])
|
||||||
if not url_erfolgreich:
|
if not url_erfolgreich:
|
||||||
# ── Strategie 2: Startseite + Formular befüllen ─────────────────
|
# ── Fallback: Formular manuell befüllen ─────────────────────────
|
||||||
print("[GF] Direktlink kein Ergebnis — wechsle zu Formular-Ansatz")
|
print("[GF] Hash-URL kein Ergebnis — wechsle zu Formular-Ansatz")
|
||||||
sb.open("https://www.google.com/travel/flights?hl=de&curr=EUR")
|
sb.open("https://www.google.com/travel/flights?hl=en&curr=EUR")
|
||||||
sb.sleep(5)
|
sb.sleep(4)
|
||||||
_consent_google(sb)
|
|
||||||
sb.sleep(2)
|
|
||||||
|
|
||||||
# ── 1. Kabine auf "Premium Economy" setzen ──────────────────────────
|
# ── 1. Kabine auf Economy setzen (Standard — meist schon vorausgewählt) ──
|
||||||
|
# Economy = data-value="1" in Google Flights Dropdown
|
||||||
|
# Nur klicken falls aktuell etwas anderes ausgewählt ist
|
||||||
try:
|
try:
|
||||||
# VfPpkd-Buttons: [0]=Hin+Rück [1]=Economy(Klasse)
|
|
||||||
btns = sb.find_elements('button[class*="VfPpkd"]')
|
btns = sb.find_elements('button[class*="VfPpkd"]')
|
||||||
if len(btns) >= 2:
|
if len(btns) >= 2:
|
||||||
btns[1].click()
|
cabin_btn = btns[1]
|
||||||
|
cabin_text = cabin_btn.text.lower()
|
||||||
|
if "economy" not in cabin_text or "premium" in cabin_text:
|
||||||
|
cabin_btn.click()
|
||||||
sb.sleep(1)
|
sb.sleep(1)
|
||||||
# Option "Premium Economy" im Dropdown auswählen
|
for opt_sel in ['[data-value="1"]',
|
||||||
for opt_sel in ['[data-value="2"]',
|
'li[class*="economy"]:first-child',
|
||||||
'li[class*="premium"]',
|
'[role="option"]:nth-child(2)']:
|
||||||
'[role="option"]:nth-child(3)']:
|
|
||||||
try:
|
try:
|
||||||
sb.find_element(opt_sel, timeout=2).click()
|
sb.find_element(opt_sel, timeout=2).click()
|
||||||
sb.sleep(0.5)
|
sb.sleep(0.5)
|
||||||
print(f"[GF] Kabine gesetzt via {opt_sel}")
|
print(f"[GF] Economy gesetzt via {opt_sel}")
|
||||||
break
|
break
|
||||||
except Exception:
|
except Exception:
|
||||||
pass
|
pass
|
||||||
|
else:
|
||||||
|
print("[GF] Economy bereits ausgewählt")
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
print(f"[GF] Kabine: {e}")
|
print(f"[GF] Kabine: {e}")
|
||||||
|
|
||||||
|
|
@ -505,23 +565,29 @@ def scrape_kayak(von, nach, tage=30, aufenthalt_tage=60,
|
||||||
rueck = (datetime.now() + timedelta(days=tage + aufenthalt_tage)).strftime("%Y-%m-%d") if trip_type == "roundtrip" else ""
|
rueck = (datetime.now() + timedelta(days=tage + aufenthalt_tage)).strftime("%Y-%m-%d") if trip_type == "roundtrip" else ""
|
||||||
kc = KABINE_KAYAK.get(kabine, "w")
|
kc = KABINE_KAYAK.get(kabine, "w")
|
||||||
bags = 1 if "koffer" in gepaeck else 0
|
bags = 1 if "koffer" in gepaeck else 0
|
||||||
|
scrape_url = _scrape_url_kayak(von, nach, abflug, rueck, kc, bags,
|
||||||
|
layover_min, layover_max, airline_filter,
|
||||||
|
max_flugzeit_h, max_stops)
|
||||||
booking_url = _booking_url_kayak(von, nach, abflug, rueck, kc, bags,
|
booking_url = _booking_url_kayak(von, nach, abflug, rueck, kc, bags,
|
||||||
layover_min, layover_max, airline_filter,
|
layover_min, layover_max, airline_filter,
|
||||||
max_flugzeit_h, max_stops)
|
max_flugzeit_h, max_stops)
|
||||||
airline_label = f" [{airline_filter}]" if airline_filter else ""
|
airline_label = f" [{airline_filter}]" if airline_filter else ""
|
||||||
print(f"[KY{airline_label}] URL: {booking_url}")
|
print(f"[KY{airline_label}] Scrape: {scrape_url[:80]}")
|
||||||
|
|
||||||
results = []
|
results = []
|
||||||
|
|
||||||
with SB(uc=True, headless=True, chromium_arg="--no-sandbox --disable-dev-shm-usage") as sb:
|
with SB(uc=True, headless=True, chromium_arg="--no-sandbox --disable-dev-shm-usage") as sb:
|
||||||
sb.open(booking_url)
|
sb.open(scrape_url)
|
||||||
|
sb.sleep(8)
|
||||||
|
_consent_kayak(sb)
|
||||||
sb.sleep(15)
|
sb.sleep(15)
|
||||||
|
|
||||||
title = sb.get_title()
|
title = sb.get_title()
|
||||||
body = sb.get_text("body")
|
body = sb.get_text("body")
|
||||||
print(f"[KY] Title: {title[:80]}")
|
print(f"[KY] Title: {title[:80]}")
|
||||||
|
|
||||||
for sel in ['.price-text', '.f8F1-price-text', 'div[class*="price"] span',
|
for sel in ['div[class*="hYzH-price"]', 'div[class*="e2GB-price-text"]',
|
||||||
|
'.price-text', '.f8F1-price-text', 'div[class*="price"] span',
|
||||||
'span[class*="price"]', '.Iqt3', 'div.nrc6-price', '.price']:
|
'span[class*="price"]', '.Iqt3', 'div.nrc6-price', '.price']:
|
||||||
try:
|
try:
|
||||||
elems = sb.find_elements(sel, timeout=2)
|
elems = sb.find_elements(sel, timeout=2)
|
||||||
|
|
@ -630,8 +696,15 @@ def scrape_trip(von, nach, tage=30, aufenthalt_tage=60,
|
||||||
def _booking_url_kayak_multicity(von, nach, via, abflug, via_datum, rueck, kc, bags=1, airline=""):
|
def _booking_url_kayak_multicity(von, nach, via, abflug, via_datum, rueck, kc, bags=1, airline=""):
|
||||||
"""
|
"""
|
||||||
Kayak Multi-City URL: FRA→HKG/DATE1 → HKG→KTI/DATE2 → KTI→FRA/DATE3
|
Kayak Multi-City URL: FRA→HKG/DATE1 → HKG→KTI/DATE2 → KTI→FRA/DATE3
|
||||||
Kabinen-Code: w=Premium Economy
|
For CX: link to a Google Flights multi-city search filtered to CX instead of Kayak (cheaper, no markup).
|
||||||
"""
|
"""
|
||||||
|
if airline.upper() == "CX":
|
||||||
|
# Google Flights Multi-City mit CX-Filter — präziser Deeplink, kein Aufschlag
|
||||||
|
return (
|
||||||
|
f"https://www.google.com/travel/flights?hl=en&curr=EUR"
|
||||||
|
f"#flt={von}.{via}.{abflug}*{via}.{nach}.{via_datum}*{nach}.{von}.{rueck}"
|
||||||
|
f";c:EUR;e:1;sd:1;t:m;a:CX"
|
||||||
|
)
|
||||||
filters = []
|
filters = []
|
||||||
if bags:
|
if bags:
|
||||||
filters.append(f"bfc%3D{bags}")
|
filters.append(f"bfc%3D{bags}")
|
||||||
|
|
@ -639,13 +712,25 @@ def _booking_url_kayak_multicity(von, nach, via, abflug, via_datum, rueck, kc, b
|
||||||
filters.append(f"airlines%3D{airline}")
|
filters.append(f"airlines%3D{airline}")
|
||||||
fs = ("&fs=" + "%3B".join(filters)) if filters else ""
|
fs = ("&fs=" + "%3B".join(filters)) if filters else ""
|
||||||
# Kayak Multi-City Format: /flights/FRA-HKG/DATE/HKG-KTI/DATE/KTI-FRA/DATE
|
# Kayak Multi-City Format: /flights/FRA-HKG/DATE/HKG-KTI/DATE/KTI-FRA/DATE
|
||||||
return (f"https://www.kayak.de/flights"
|
return (f"https://www.kayak.com/flights"
|
||||||
f"/{von}-{via}/{abflug}"
|
f"/{von}-{via}/{abflug}"
|
||||||
f"/{via}-{nach}/{via_datum}"
|
f"/{via}-{nach}/{via_datum}"
|
||||||
f"/{nach}-{von}/{rueck}"
|
f"/{nach}-{von}/{rueck}"
|
||||||
f"?sort=price_a&cabin={kc}&currency=EUR{fs}")
|
f"?sort=price_a&cabin={kc}&currency=EUR{fs}")
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
def _scrape_url_kayak_multicity(von, nach, via, abflug, via_datum, rueck, kc, bags=1, airline=""):
|
||||||
|
filters = []
|
||||||
|
if bags: filters.append(f"bfc%3D{bags}")
|
||||||
|
if airline: filters.append(f"airlines%3D{airline}")
|
||||||
|
fs = ("&fs=" + "%3B".join(filters)) if filters else ""
|
||||||
|
return (f"https://www.kayak.de/flights"
|
||||||
|
f"/{von}-{via}/{abflug}"
|
||||||
|
f"/{via}-{nach}/{via_datum}"
|
||||||
|
f"/{nach}-{von}/{rueck}"
|
||||||
|
f"?sort=price_a&cabin={kc}&currency=EUR{fs}")
|
||||||
|
|
||||||
def scrape_kayak_multicity(von, nach, tage=30, aufenthalt_tage=60,
|
def scrape_kayak_multicity(von, nach, tage=30, aufenthalt_tage=60,
|
||||||
kabine="premium_economy",
|
kabine="premium_economy",
|
||||||
gepaeck="1koffer+handgepaeck",
|
gepaeck="1koffer+handgepaeck",
|
||||||
|
|
@ -662,23 +747,28 @@ def scrape_kayak_multicity(von, nach, tage=30, aufenthalt_tage=60,
|
||||||
bags = 1 if "koffer" in gepaeck else 0
|
bags = 1 if "koffer" in gepaeck else 0
|
||||||
airline_label = f" [{airline_filter}]" if airline_filter else ""
|
airline_label = f" [{airline_filter}]" if airline_filter else ""
|
||||||
|
|
||||||
|
scrape_url = _scrape_url_kayak_multicity(von, nach, via, abflug, via_datum, rueck,
|
||||||
|
kc, bags, airline_filter)
|
||||||
booking_url = _booking_url_kayak_multicity(von, nach, via, abflug, via_datum, rueck,
|
booking_url = _booking_url_kayak_multicity(von, nach, via, abflug, via_datum, rueck,
|
||||||
kc, bags, airline_filter)
|
kc, bags, airline_filter)
|
||||||
|
|
||||||
print(f"[MC{airline_label}] Multi-City via {via}: {abflug} → +1T → {rueck}")
|
print(f"[MC{airline_label}] Multi-City via {via}: {abflug} → +1T → {rueck}")
|
||||||
print(f"[MC{airline_label}] URL: {booking_url}")
|
print(f"[MC{airline_label}] Scrape: {scrape_url[:80]}")
|
||||||
|
|
||||||
results = []
|
results = []
|
||||||
|
|
||||||
with SB(uc=True, headless=True, chromium_arg="--no-sandbox --disable-dev-shm-usage") as sb:
|
with SB(uc=True, headless=True, chromium_arg="--no-sandbox --disable-dev-shm-usage") as sb:
|
||||||
sb.open(booking_url)
|
sb.open(scrape_url)
|
||||||
|
sb.sleep(8)
|
||||||
|
_consent_kayak(sb)
|
||||||
sb.sleep(15)
|
sb.sleep(15)
|
||||||
|
|
||||||
title = sb.get_title()
|
title = sb.get_title()
|
||||||
body = sb.get_text("body")
|
body = sb.get_text("body")
|
||||||
print(f"[MC] Title: {title[:80]}")
|
print(f"[MC] Title: {title[:80]}")
|
||||||
|
|
||||||
for sel in ['.price-text', '.f8F1-price-text', 'div[class*="price"] span',
|
for sel in ['div[class*="hYzH-price"]', 'div[class*="e2GB-price-text"]',
|
||||||
|
'.price-text', '.f8F1-price-text', 'div[class*="price"] span',
|
||||||
'span[class*="price"]', '.Iqt3', 'div.nrc6-price', '.price']:
|
'span[class*="price"]', '.Iqt3', 'div.nrc6-price', '.price']:
|
||||||
try:
|
try:
|
||||||
elems = sb.find_elements(sel, timeout=2)
|
elems = sb.find_elements(sel, timeout=2)
|
||||||
|
|
@ -725,38 +815,32 @@ def scrape_momondo(von, nach, tage=30, aufenthalt_tage=60,
|
||||||
if trip_type == "roundtrip" else ""
|
if trip_type == "roundtrip" else ""
|
||||||
kc = KABINE_KAYAK.get(kabine, "w")
|
kc = KABINE_KAYAK.get(kabine, "w")
|
||||||
bags = 1 if "koffer" in gepaeck else 0
|
bags = 1 if "koffer" in gepaeck else 0
|
||||||
|
scrape_url = _scrape_url_momondo(von, nach, abflug, rueck, kc, bags,
|
||||||
|
layover_min, layover_max, airline_filter,
|
||||||
|
max_flugzeit_h, max_stops)
|
||||||
booking_url = _booking_url_momondo(von, nach, abflug, rueck, kc, bags,
|
booking_url = _booking_url_momondo(von, nach, abflug, rueck, kc, bags,
|
||||||
layover_min, layover_max, airline_filter,
|
layover_min, layover_max, airline_filter,
|
||||||
max_flugzeit_h, max_stops)
|
max_flugzeit_h, max_stops)
|
||||||
airline_label = f" [{airline_filter}]" if airline_filter else ""
|
airline_label = f" [{airline_filter}]" if airline_filter else ""
|
||||||
print(f"[MO{airline_label}] URL: {booking_url}")
|
print(f"[MO{airline_label}] Scrape: {scrape_url[:80]}")
|
||||||
|
|
||||||
results = []
|
results = []
|
||||||
screenshot_b64 = ""
|
screenshot_b64 = ""
|
||||||
|
|
||||||
with SB(uc=True, headless=True, chromium_arg="--no-sandbox --disable-dev-shm-usage") as sb:
|
with SB(uc=True, headless=True, chromium_arg="--no-sandbox --disable-dev-shm-usage") as sb:
|
||||||
sb.open(booking_url)
|
sb.open(scrape_url)
|
||||||
sb.sleep(8)
|
sb.sleep(8)
|
||||||
|
_consent_kayak(sb)
|
||||||
|
|
||||||
# Momondo Cookie-Consent wegklicken
|
# Nach Consent: Ergebnisse laden lassen
|
||||||
for sel in ['button[class*="accept"]', '.RxNS-button-content',
|
|
||||||
'#onetrust-accept-btn-handler', 'button[title*="akzeptieren"]',
|
|
||||||
'button[title*="Alle akzeptieren"]', '.evidon-banner-acceptbutton']:
|
|
||||||
try:
|
|
||||||
sb.find_element(sel, timeout=2).click()
|
|
||||||
print(f"[MO] Consent geklickt: {sel}")
|
|
||||||
sb.sleep(3)
|
|
||||||
break
|
|
||||||
except Exception:
|
|
||||||
pass
|
|
||||||
|
|
||||||
# Nach Consent: Seite muss neu laden / Ergebnisse warten
|
|
||||||
sb.sleep(12)
|
sb.sleep(12)
|
||||||
title = sb.get_title()
|
title = sb.get_title()
|
||||||
body = sb.get_text("body")
|
body = sb.get_text("body")
|
||||||
print(f"[MO] Title: {title[:80]} | Body: {len(body)} chars")
|
print(f"[MO] Title: {title[:80]} | Body: {len(body)} chars")
|
||||||
|
|
||||||
for sel in ['.price-text', '.f8F1-price-text', 'div[class*="price"] span',
|
for sel in ['div[class*="hYzH-price"]', 'div[class*="e2GB-price-text"]',
|
||||||
|
'div[class*="ixMA-price"]',
|
||||||
|
'.price-text', '.f8F1-price-text', 'div[class*="price"] span',
|
||||||
'span[class*="price"]', '.Iqt3', 'div.nrc6-price', '.price',
|
'span[class*="price"]', '.Iqt3', 'div.nrc6-price', '.price',
|
||||||
'[class*="resultPrice"]', '.lowest-price']:
|
'[class*="resultPrice"]', '.lowest-price']:
|
||||||
try:
|
try:
|
||||||
|
|
|
||||||
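Review note on `_preise_aus_body`: the price extraction relies on normalising exotic spaces before matching and on stripping thousands separators afterwards. The following is a minimal standalone sketch of that idea, not the commit's implementation. `extract_prices` is a hypothetical name, the pattern is a simplified variant of the one in the diff, and the 300 to 12000 plausibility window is copied from the code above.

```python
import re

# Non-breaking, narrow non-breaking and thin spaces are mapped to a plain space.
_SPACES = str.maketrans({"\u00a0": " ", "\u202f": " ", "\u2009": " "})

# A number is either "1.234" / "1,234" (thousands separator) or 3-5 plain digits;
# the euro sign may follow or precede it.
_NUM = r"\d{1,2}[.,]\d{3}|\d{3,5}"
_PRICE_RE = re.compile(rf"({_NUM})\s?€|€\s?({_NUM})")


def extract_prices(body, low=300, high=12000):
    """Return distinct plausible euro amounts in order of first appearance."""
    text = body.translate(_SPACES)
    seen, prices = set(), []
    for m in _PRICE_RE.finditer(text):
        raw = (m.group(1) or m.group(2)).replace(".", "").replace(",", "")
        value = int(raw)
        if low < value < high and value not in seen:
            seen.add(value)
            prices.append(value)
    return prices
```

Treating both `.` and `,` as thousands separators matches the diff, which means a decimal amount such as `1.234,50 €` is outside the scope of this sketch.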