Thorsten Bus e7ad1b3cce chore: commit pint style fix, CCLI-API.md planning doc, and npm lock update

2026-05-11 11:00:50 +02:00

24 KiB

CCLI SongSelect Partner API — Doc Pointer

Where to get the docs

Postman documentation (only public source, no PDF/OpenAPI mirror): https://documenter.getpostman.com/view/604633/TzseGkmA

The page is JS-rendered. Two ways to read it:

Open in a browser (Chrome/Firefox), wait for the Postman documenter to render.
Click the "Run in Postman" button top-right to import the full collection + environment into a Postman workspace — then inspect every endpoint, params, headers, sample requests/responses.

The collection name is "SongSelect Partner API" under owner id 604633.

Status (read first!)

NOTICE: CCLI has retired the SongSelect API Partner Program and is no longer accepting new API partners.

Existing partners keep working. New access requires contacting CCLI directly (partners@ccli.com / regional CCLI office) to request reinstatement or special arrangement.

Key facts (from the docs)

Auth: OpenID Connect / OAuth 2.0, Authorization Code with PKCE, refresh tokens supported
- Authorize: https://identityservices.ccli.com/connect/authorize
- Token: https://identityservices.ccli.com/connect/token
- Scope: openid cclipartnerapi.read offline_access
Subscription Key: every request needs header Ocp-Apim-Subscription-Key: <key> (dev key for testing, prod key for live)
Tokens: access token 1h, refresh token 60-day sliding (one-time use, new refresh returned on each refresh)
Rate limits: 100 calls / 10s short term, 300 calls / 5min long term. 429 returns JSON {statusCode, message}.
Dev restrictions: dev client only sees content for users linked to the "SongSelect API Partners" test organization.
Endpoint reference (search, song detail, lyrics, chord chart, etc.) lives inside the Postman collection — load it to see exact paths/params, not summarized in the public preview.
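
As a rough illustration of the Authorization Code + PKCE start, the sketch below derives a verifier/challenge pair and assembles the authorize URL from the endpoint and scope listed above. The function names and the `clientId`, `redirectUri` and `state` values are hypothetical; the exact parameters CCLI expects should be confirmed against the Postman collection.

```javascript
// Sketch only: PKCE pair + authorize URL for the endpoints listed above.
// clientId, redirectUri and state are placeholders supplied by the caller.
const AUTHORIZE_URL = 'https://identityservices.ccli.com/connect/authorize';
const SCOPE = 'openid cclipartnerapi.read offline_access';

// base64url without padding, as PKCE (RFC 7636) requires
function base64url(buf) {
  return buf.toString('base64').replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}

async function createPkcePair() {
  // dynamic import keeps the snippet usable from CommonJS and ES modules
  const { createHash, randomBytes } = await import('node:crypto');
  const verifier = base64url(randomBytes(32)); // 43 characters
  const challenge = base64url(createHash('sha256').update(verifier).digest());
  return { verifier, challenge };
}

function buildAuthorizeUrl({ clientId, redirectUri, challenge, state }) {
  const url = new URL(AUTHORIZE_URL);
  url.search = new URLSearchParams({
    response_type: 'code',
    client_id: clientId,
    redirect_uri: redirectUri,
    scope: SCOPE,
    code_challenge: challenge,
    code_challenge_method: 'S256',
    state,
  }).toString();
  return url.toString();
}
```

The verifier has to be kept (for example in the session) until the callback, because the token request must send it as `code_verifier`.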

Credentials needed before coding

CCLI Partner ClientId + ClientSecret
Development Subscription Key (Ocp-Apim-Subscription-Key)
Production Subscription Key (later)
A CCLI user account linked to the Partner test organization (for dev refresh-token bootstrap)

Store in .env:

CCLI_PARTNER_CLIENT_ID=
CCLI_PARTNER_CLIENT_SECRET=
CCLI_PARTNER_SUBSCRIPTION_KEY_DEV=
CCLI_PARTNER_SUBSCRIPTION_KEY_PROD=
CCLI_PARTNER_REDIRECT_URI=https://pp-planer.ddev.site/oauth/ccli/callback

Bootstrap flow for a new agent

Load Postman collection from URL above → list every endpoint with its path, params, sample response.
Mirror existing ChurchToolsService pattern (app/Services/ChurchToolsService.php) — closure-injectable fetcher, logApiCall, classifyError, German error messages, ApiRequestLog row per call.
Implement OAuth2 PKCE handshake → persist refresh token (encrypted) in a ccli_tokens table. Auto-refresh on 401.
Always send Ocp-Apim-Subscription-Key header alongside Authorization: Bearer <access_token>.
Respect rate limits (Laravel RateLimiter::for('ccli', ...) with 100/10s + 300/5min buckets).
Map result to existing schema: Song.ccli_id, arrangements + global Labels (Strophe 1 / Refrain / Bridge), SongSlide.text_content. See ProImportService::upsertSong for the upsert template.
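
The two rate-limit buckets mentioned above (100 calls / 10s and 300 calls / 5min) can be sketched as a single limiter that checks several sliding windows at once. This is an in-memory illustration in JavaScript, not the Laravel `RateLimiter` configuration; the class name `MultiWindowLimiter` is hypothetical.

```javascript
// Sketch: sliding-window limiter that enforces several windows at once.
class MultiWindowLimiter {
  constructor(windows) {
    this.windows = windows; // [{ limit, ms }]
    this.stamps = [];       // timestamps (ms) of accepted calls, oldest first
  }

  // Returns true and records the call if every window still has room.
  tryAcquire(now = Date.now()) {
    const longest = Math.max(...this.windows.map(w => w.ms));
    // forget calls that are older than the longest window
    while (this.stamps.length && this.stamps[0] <= now - longest) this.stamps.shift();
    for (const { limit, ms } of this.windows) {
      const used = this.stamps.filter(t => t > now - ms).length;
      if (used >= limit) return false;
    }
    this.stamps.push(now);
    return true;
  }
}

// Limits taken from the "Key facts" list
const ccliLimiter = new MultiWindowLimiter([
  { limit: 100, ms: 10_000 },
  { limit: 300, ms: 300_000 },
]);
```

A caller would check `ccliLimiter.tryAcquire()` before each request and delay or queue the call when it returns `false`.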

Fallback if API access denied

Manual paste flow → parser splits on Verse N, Chorus, Bridge, Pre-Chorus, Tag, Ending headings.
.pro import already implemented (POST /api/songs/import-pro).

Alternative: Headless-browser scraping (NO official API)

Use this when the Partner API is not available (current default for new projects). It drives songselect.ccli.com with a real browser session using a normal CCLI SongSelect subscription. Same data the user would download manually, just automated.

ToS / legal note

CCLI's SongSelect ToS forbids "automated retrieval" without partner agreement. A church-internal tool that only acts on behalf of an authenticated subscriber and respects rate limits is a gray area many open-source projects (OpenLP, FreeShow community fork, gwonamfromkoradai/SongSelectSave) operate in. Document the risk in README and let the church decide.

Required credentials

CCLI_SONGSELECT_USER=         # CCLI account email
CCLI_SONGSELECT_PASSWORD=     # CCLI account password
CCLI_SONGSELECT_BASE_URL=https://songselect.ccli.com

Single shared app account (chosen). Encrypt the password at rest (Crypt::encryptString) — never log it.

Tech stack pick

Three viable headless-browser options for Laravel:

| Tool | Pros | Cons |
| --- | --- | --- |
| `spatie/browsershot` (Puppeteer + Chromium via Node) | Already in Laravel ecosystem; simple PHP API; supports cookies, headers, screenshots | Heavyweight; needs Node + Chromium in container |
| `laravel/dusk` (ChromeDriver) | Pure Laravel; auth helpers; assertion DSL | Built for testing, awkward for prod scraping |
| Playwright via Node side-script (`tests/e2e` already uses it) | Best automation API; persistent storage state; identical to existing E2E setup | Crosses PHP↔Node boundary (CLI exec or queue worker) |

Recommendation: Playwright — already a dev dep, tests/e2e/auth.setup.ts proves the pattern. Run as a queue job that shells out to a Node script, returns JSON.

DDEV needs Chromium installed. Add the following to .ddev/web-build/Dockerfile (Dockerfile.example is only DDEV's template and is not built):

RUN apt-get update && apt-get install -y chromium fonts-liberation
RUN npx --yes playwright install --with-deps chromium

Endpoints / DOM contract (observed)

These are not an "API" — they are URL + selector contracts that can change. Re-verify quarterly.

1. Login

URL: https://profile.ccli.com/account/signin?appContext=SongSelect
Form fields: input[name="EmailAddress"], input[name="Password"], button[type="submit"]
Success: redirect to https://songselect.ccli.com/
Persist cookies (profile.ccli.com, songselect.ccli.com) in storage/app/ccli/state.json (Playwright storageState). Re-login when cookies expire.
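
To decide when a re-login is needed without launching a browser, one option is to inspect the persisted storageState file first. The sketch below assumes the usual Playwright layout (`cookies[]` with `domain` and `expires` in Unix seconds, `-1` for session cookies); `hasLiveSession` is a hypothetical helper, and an unexpired cookie does not guarantee the server still accepts the session.

```javascript
// Sketch: cheap pre-check on a Playwright storageState object.
const REQUIRED_HOSTS = ['profile.ccli.com', 'songselect.ccli.com'];

// A cookie set for ".ccli.com" also covers "songselect.ccli.com".
function cookieCovers(cookieDomain, host) {
  const d = cookieDomain.replace(/^\./, '');
  return host === d || host.endsWith('.' + d);
}

function hasLiveSession(state, nowSeconds = Date.now() / 1000) {
  const cookies = state?.cookies ?? [];
  return REQUIRED_HOSTS.every(host =>
    cookies.some(c =>
      cookieCovers(c.domain, host) &&
      (c.expires === -1 || c.expires > nowSeconds) // -1 = session cookie
    )
  );
}
```

If this returns `false`, the login flow runs again; if it returns `true`, a redirect to the sign-in page during scraping is still the authoritative signal that the session is gone.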

2. Search by keyword

URL: https://songselect.ccli.com/search/results?Keyword={url-encoded-query}
Result rows: .song-result (or current class — verify with DevTools)
Fields per row: .song-title a (link + title), .song-authors (authors), .song-ccli-number or attribute data-id (CCLI #)
Pagination: ?Keyword=...&CurrentPage=2
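
A small helper keeps the query encoding and pagination in one place. This is a sketch; `buildSearchUrl` is a hypothetical name, and the `Keyword` / `CurrentPage` parameters are the observed ones listed above, which may change.

```javascript
// Sketch: build the observed search URL, including pagination.
const SEARCH_URL = 'https://songselect.ccli.com/search/results';

function buildSearchUrl(keyword, page = 1) {
  const url = new URL(SEARCH_URL);
  url.searchParams.set('Keyword', keyword); // URLSearchParams handles the encoding
  if (page > 1) url.searchParams.set('CurrentPage', String(page));
  return url.toString();
}
```

Note that `URLSearchParams` encodes spaces as `+` rather than `%20`; both are valid in a query string, but it is worth confirming that the site treats them the same.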

3. Search by CCLI number

URL: https://songselect.ccli.com/Songs/{ccliId} → redirects to canonical song page

4. Song detail

URL: https://songselect.ccli.com/Songs/{ccliId}/{slug}
Metadata in <dl> or schema.org JSON-LD <script type="application/ld+json"> (preferred — stable):
- name → title
- author[].name → authors
- copyrightYear, copyrightHolder
Themes / publishers in side panel.
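
The JSON-LD fields listed above could be normalised roughly as follows. This is a sketch: `mapJsonLd` and the output key `copyright_holder` are hypothetical, and it assumes `author` may be a single object, a string, or an array, as schema.org allows.

```javascript
// Sketch: normalise schema.org JSON-LD into the fields used for import.
function asArray(v) {
  return v == null ? [] : Array.isArray(v) ? v : [v];
}

function mapJsonLd(ld) {
  const holder = ld.copyrightHolder;
  return {
    title: ld.name ?? null,
    authors: asArray(ld.author)
      .map(a => (typeof a === 'string' ? a : a?.name))
      .filter(Boolean),
    copyright_year: ld.copyrightYear != null ? Number(ld.copyrightYear) : null,
    // copyrightHolder may be a plain string or an object with a name
    copyright_holder: holder && typeof holder === 'object' ? holder.name ?? null : holder ?? null,
  };
}
```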

5. Lyrics download (the "parts" the user wants)

URL: https://songselect.ccli.com/Songs/{ccliId}/{slug}/viewlyrics
Trigger: click #lyricsDownloadButton (gives .txt) OR fetch hidden link a[data-download-format="txt"]

The .txt payload is structured by part, e.g.:

Verse 1
Amazing grace, how sweet the sound
...

Chorus
My chains are gone...

Verse 2
...

Bridge
...

CCLI Song # 22025
© Public Domain
CCLI License # 12345

Headers to detect (regex): ^(Verse \d+|Chorus( \d+)?|Pre-Chorus|Bridge( \d+)?|Tag|Ending|Intro|Interlude|Refrain|Coda)\s*$

6. ChordPro download (optional, if account has chord access)

URL: https://songselect.ccli.com/Songs/{ccliId}/{slug}/chordpro → click .chordpro-download
Format is industry-standard ChordPro — easier to parse than HTML.

Mapping to existing schema

SongSelect part header  →  global Label name
─────────────────────────────────────────────
Verse N                 →  Strophe N
Chorus / Refrain        →  Refrain
Pre-Chorus              →  Pre-Refrain
Bridge                  →  Bridge
Tag / Ending / Coda     →  Outro
Intro / Interlude       →  Intro / Zwischenspiel

Lookup labels case-insensitive (SongService::createDefaultGroups already does LOWER(name)); create new global label if no match.

Persistence template (mirror ProImportService::upsertSong):

Song::withTrashed()->firstOrNew(['ccli_id' => $ccliId]), then restore() if the match was soft-deleted (a plain firstOrNew does not see trashed rows)
Update title / author / copyright_text / copyright_year / publisher
Wipe existing arrangements for clean re-import (or skip if user opted "merge")
Create one SongArrangement(name='Normal', is_default=true)
For each parsed part → find/create Label, create SongSlide(label_id, order, text_content), attach via SongArrangementLabel(order)

Service skeleton

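// app/Services/SongSelectScraperService.php
namespace App\Services;

use App\Models\Song;               // assumed model namespace (Laravel default)
use Illuminate\Support\Collection;

final class SongSelectScraperService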
{
    public function __construct(
        private readonly SongImportService $importer,
    ) {}

    public function search(string $query): Collection { /* runs node script: search */ }

    public function fetchByCcliId(int $ccliId): array { /* runs node script: detail+lyrics */ }

    public function importToDb(int $ccliId): Song
    {
        $payload = $this->fetchByCcliId($ccliId);
        return $this->importer->upsertFromSongSelect($payload); // mirrors ProImportService
    }
}

Run scraper inside a queue job (ScrapeSongSelectJob) — never block HTTP request. Frontend polls or uses Inertia partial reload.

Node side-script (Playwright)

scripts/songselect-fetch.mjs:

import { chromium } from 'playwright';
import fs from 'node:fs';

const [, , action, arg] = process.argv;          // e.g. 'search' 'amazing grace'  OR  'detail' 22025
const STATE = 'storage/app/ccli/state.json';

const browser = await chromium.launch({ headless: true });
const ctx = fs.existsSync(STATE)
  ? await browser.newContext({ storageState: STATE })
  : await browser.newContext();
const page = await ctx.newPage();

// auto-login if cookies missing
await page.goto('https://songselect.ccli.com/');
if (await page.locator('text=Sign In').isVisible().catch(() => false)) {
  await page.goto('https://profile.ccli.com/account/signin?appContext=SongSelect');
  await page.fill('input[name="EmailAddress"]', process.env.CCLI_SONGSELECT_USER);
  await page.fill('input[name="Password"]',     process.env.CCLI_SONGSELECT_PASSWORD);
  await page.click('button[type="submit"]');
  await page.waitForURL('**/songselect.ccli.com/**');
  await ctx.storageState({ path: STATE });
}

let result;
if (action === 'search') {
  await page.goto(`https://songselect.ccli.com/search/results?Keyword=${encodeURIComponent(arg)}`);
  result = await page.$$eval('.song-result', rows => rows.map(r => ({
    ccli_id: r.dataset.id ?? r.querySelector('.song-ccli-number')?.textContent?.trim(),
    title:   r.querySelector('.song-title')?.textContent?.trim(),
    authors: r.querySelector('.song-authors')?.textContent?.trim(),
    url:     r.querySelector('a')?.href,
  })));
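} else if (action === 'detail') {
  await page.goto(`https://songselect.ccli.com/Songs/${arg}`);
  const url   = page.url();                       // canonical URL after the redirect
  const meta  = await page.$eval('script[type="application/ld+json"]', s => JSON.parse(s.textContent));
  await page.goto(url.replace(/\/?$/, '/viewlyrics'));
  // .first() avoids a strict-mode error when both selectors match
  const lyrics = await page.locator('pre, .lyrics-content').first().innerText();
  result = { ccli_id: arg, ...meta, lyrics };
} else {
  console.error(`Unknown action: ${action}`);
  process.exitCode = 1;
}

console.log(JSON.stringify(result ?? null));      // always emit valid JSON for the PHP side
await browser.close();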

PHP side calls via Symfony\Component\Process\Process and decodes JSON.

Lyrics → parts parser (PHP)

final class SongSelectLyricsParser
{
    private const HEADER = '/^(Verse \d+|Chorus(?: \d+)?|Pre-Chorus|Bridge(?: \d+)?|Tag|Ending|Intro|Interlude|Refrain|Coda)\s*$/i';
    private const LABEL_MAP = [
        'verse'      => 'Strophe',   // suffix the number
        'chorus'     => 'Refrain',
        'refrain'    => 'Refrain',
        'pre-chorus' => 'Pre-Refrain',
        'bridge'     => 'Bridge',
        'tag'        => 'Outro',
        'ending'     => 'Outro',
        'coda'       => 'Outro',
        'intro'      => 'Intro',
        'interlude'  => 'Zwischenspiel',
    ];

    /** @return array<int, array{label: string, text: string}> */
    public function parse(string $raw): array { /* split on HEADER, map via LABEL_MAP */ }
}
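
The stubbed `parse()` could work along these lines. The sketch is in JavaScript so it can be tried next to the Node script; it reuses the header regex and label map from above. Cutting off at the "CCLI Song #" footer, dropping text before the first header, and keeping the number on labels other than Verse (e.g. "Refrain 2") are assumptions, and `parseLyrics` / `toLabel` are hypothetical names.

```javascript
// Sketch: split a SongSelect .txt payload into labelled parts.
const HEADER = /^(Verse \d+|Chorus(?: \d+)?|Pre-Chorus|Bridge(?: \d+)?|Tag|Ending|Intro|Interlude|Refrain|Coda)\s*$/i;
const FOOTER = /^CCLI Song #/i; // assumed start of the copyright/licence footer
const LABEL_MAP = {
  verse: 'Strophe', chorus: 'Refrain', refrain: 'Refrain', 'pre-chorus': 'Pre-Refrain',
  bridge: 'Bridge', tag: 'Outro', ending: 'Outro', coda: 'Outro',
  intro: 'Intro', interlude: 'Zwischenspiel',
};

// "Verse 2" -> "Strophe 2", "Chorus" -> "Refrain"
function toLabel(header) {
  const m = header.trim().match(/^(.*?)(?: (\d+))?$/);
  const base = LABEL_MAP[m[1].toLowerCase()] ?? m[1];
  return m[2] ? `${base} ${m[2]}` : base;
}

function parseLyrics(raw) {
  const parts = [];
  let current = null;
  for (const line of raw.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (FOOTER.test(trimmed)) break;            // stop before the footer
    if (HEADER.test(trimmed)) {
      current = { label: toLabel(trimmed), lines: [] };
      parts.push(current);
    } else if (current) {
      current.lines.push(line.trimEnd());       // text before the first header is ignored
    }
  }
  return parts.map(p => ({ label: p.label, text: p.lines.join('\n').trim() }));
}
```

The same structure translates directly to the PHP class: iterate lines, start a new part on each header match, and join the collected lines per part.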

Rate limiting & politeness

Cap to 30 requests/minute per app instance (RateLimiter::for('ccli-scrape', fn () => Limit::perMinute(30))).
One concurrent scrape job (ScrapeSongSelectJob with WithoutOverlapping middleware).
Cache result for 30 days (songs.ccli_id already keyed). User can force-refresh via "Re-import" button.
Random jitter 500-1500ms between page loads.
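
The jitter between page loads might look like this in the Node script. A minimal sketch: `jitterMs` and `politePause` are hypothetical helper names, and the random source is injectable so the bounds can be tested.

```javascript
// Sketch: random delay of 500-1500 ms between page loads.
function jitterMs(min = 500, max = 1500, rng = Math.random) {
  // rng() is in [0, 1), so the result is an integer in [min, max]
  return Math.floor(min + rng() * (max - min + 1));
}

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

async function politePause() {
  await sleep(jitterMs());
}
```

In the script, `await politePause();` would go before each `page.goto(...)` after the first.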

UI integration

Songs/Index.vue — top-bar search input "CCLI Lookup" → POST /api/ccli/search { q } → modal with results → "Import" button per row.
SongAgendaItem.vue (unmatched row) — new button "SongSelect suchen" next to existing Request/Assign → opens same modal pre-filled with CTS song name.
Preview modal before save — show parsed parts grouped by detected Label, allow drag-reassign / rename, then confirm import.
All German text, Du-form: "Suche bei CCLI…", "Importieren", "Als Strophe 1 zuweisen", etc.

Failure modes & detection

| Symptom | Cause | Action |
| --- | --- | --- |
| Redirect to `/account/signin` mid-session | Cookie expired | Re-run login flow, retry once |
| Empty `.song-result` list | DOM changed OR query 0 hits | Save HTML snapshot to `storage/logs/ccli/` for inspection |
| HTTP 429 / "Too many requests" page | Rate limit hit | Back off 5min, alert admin |
| Captcha (`recaptcha` iframe) | CCLI flagged automation | Stop, surface admin notice, fall back to manual paste |
| Login fails | Wrong creds OR account suspended | German error to admin |
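
The detection side of this table can be sketched as a pure function over what the scraper observed. The names (`classifyScrape`, the `kind` and `action` strings) are hypothetical, and the "Login fails" row is left out because it is detected inside the login flow rather than from a fetched page.

```javascript
// Sketch: map an observed page state to a failure kind and follow-up action.
const ACTIONS = {
  session_expired: 'relogin_and_retry_once',
  empty_results: 'save_html_snapshot',
  rate_limited: 'backoff_5min_and_alert',
  captcha: 'stop_and_fallback_to_manual_paste',
  ok: 'continue',
};

function classifyScrape({ url = '', status = 200, html = '', rowCount = null }) {
  let kind = 'ok';
  if (status === 429 || /too many requests/i.test(html)) kind = 'rate_limited';
  else if (url.includes('/account/signin')) kind = 'session_expired';
  else if (/recaptcha/i.test(html)) kind = 'captcha';
  else if (rowCount === 0) kind = 'empty_results'; // DOM change or genuinely no hits
  return { kind, action: ACTIONS[kind] };
}
```

The sign-in check runs before the captcha check on purpose, since the login page itself embeds a challenge widget.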

Log every scrape into api_request_logs (existing table) with service='songselect' so the existing log UI shows them alongside CTS calls.

Testing

Unit-test the parser with fixtures in tests/Fixtures/songselect/*.txt.
Mock the Playwright invocation in service tests via constructor closure (mirror ChurchToolsService pattern).
E2E test against a sandbox public-domain song (e.g. CCLI #22025 "Amazing Grace") — gated by CCLI_SONGSELECT_USER env, skip if missing.

Bootstrap checklist for a new agent

Confirm CCLI subscription credentials are in .env.
Add Chromium to DDEV web container.
Create scripts/songselect-fetch.mjs.
Create app/Services/SongSelectScraperService.php + SongSelectLyricsParser.php + SongImportService::upsertFromSongSelect() (refactor common parts out of ProImportService).
Create ScrapeSongSelectJob (queued, WithoutOverlapping).
Add routes POST /api/ccli/search, POST /api/ccli/import/{ccliId}.
Add Vue search modal + integrate into Songs/Index.vue + SongAgendaItem.vue.
Write parser unit tests + service feature test (mock Process).
Document the ToS gray area in README.

Reference: How OpenLP imports from CCLI

Source: openlp/plugins/songs/lib/songselect.py on https://gitlab.com/openlp/openlp (LGPL).

Approach: embedded Qt WebEngine (= real Chromium) + JS injection

OpenLP does NOT do headless HTTP scraping. It opens a QWebEngineView (PySide6 Qt Chromium) inside the desktop app on https://profile.ccli.com/account/signin?appContext=SongSelect&returnUrl=https%3a%2f%2fsongselect.ccli.com%2f. The user signs in manually in that embedded browser (so they solve any captcha themselves). After login the same webview holds the authenticated cookies.

OpenLP then drives the page via webview.page().runJavaScript(...) to:

Detect current page by URL (Login / Home / Search / Song / Other).
Navigate by setting document.location = "<url>".

Pre-fill login fields:

document.getElementById("EmailAddress").value = "<email>";
document.getElementById("Password").value     = "<password>";

(User still clicks Sign-In manually so Turnstile sees a real interaction.)

Fetch any URL with the page's session cookies by injecting:
```
var openlp_page_data = null;
fetch("<url>")
  .then(r => r.text())
  .then(t => { openlp_page_data = t; });
```
then polls openlp_page_data != null and reads the result back into Python. This is the clever bit — they bypass cookie-export entirely, using the already-authenticated browser context as the HTTP client.
Parse HTML → song dict → write into the OpenLP DB via SQLAlchemy (Song, Author, Topic, SongXML verses with VerseType.tags).

URL constants in OpenLP:

BASE_URL   = 'https://songselect.ccli.com'
LOGIN_PAGE = 'https://profile.ccli.com/account/signin?appContext=SongSelect&returnUrl=https%3a%2f%2fsongselect.ccli.com%2f'
LOGIN_URL  = 'https://profile.ccli.com'
LOGOUT_URL = BASE_URL + '/account/logout'
SEARCH_URL = BASE_URL + '/search/results'
SONG_PAGE  = BASE_URL + '/Songs/'
CCLI_NUMBER_REGEX = r'.*?Songs\/([0-9]+).*'
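
For the Node side, the CCLI number extraction translates to a short helper. This is a sketch mirroring `CCLI_NUMBER_REGEX`; the function name `ccliIdFromUrl` is hypothetical.

```javascript
// Sketch: JavaScript counterpart of OpenLP's CCLI_NUMBER_REGEX.
const CCLI_NUMBER_REGEX = /Songs\/([0-9]+)/;

function ccliIdFromUrl(url) {
  const m = CCLI_NUMBER_REGEX.exec(url);
  return m ? Number(m[1]) : null; // null when the URL is not a song page
}
```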

Lesson for a Laravel server-side port: OpenLP succeeds because it ships a full GUI Chromium and pushes the captcha problem onto the user. A server-side scraper has to solve the same captcha non-interactively — see next section.

Login page protection: Cloudflare Turnstile

Confirmed by fetching https://profile.ccli.com/account/signin?appContext=SongSelect:

<script src="https://challenges.cloudflare.com/turnstile/v0/api.js"></script>
<div class="cf-turnstile sr-only"
     data-sitekey="0x4AAAAAAA1USwfe0YamenZA"
     data-appearance="interaction-only"
     data-callback="enableSubmit" inert></div>

Mode: interaction-only (Managed/Invisible — silent unless trust score drops, then escalates to checkbox click)
Sitekey: 0x4AAAAAAA1USwfe0YamenZA
Submit button is disabled until Turnstile callback fires, then a hidden cf-turnstile-response input is added to the POST body
Form also includes ASP.NET __RequestVerificationToken (CSRF) — must be scraped from the GET response and sent back
CCLI also injects Cloudflare Bot Management JSD (/cdn-cgi/challenge-platform/scripts/jsd/main.js) — additional passive fingerprinting on every page
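
Reading the CSRF token out of the GET response could look like the sketch below. It assumes the standard ASP.NET hidden input (`<input name="__RequestVerificationToken" ... value="...">`); `extractVerificationToken` is a hypothetical helper, and a real implementation would more likely use a DOM parser or Playwright's own form handling. The token alone is not enough to log in, because the Turnstile response is still required.

```javascript
// Sketch: pull the ASP.NET anti-forgery token out of the login page HTML.
function extractVerificationToken(html) {
  // find the hidden input first, then its value, so attribute order does not matter
  const input = html.match(/<input\b[^>]*\bname=["']__RequestVerificationToken["'][^>]*>/i);
  if (!input) return null;
  const value = input[0].match(/\bvalue=["']([^"']*)["']/i);
  return value ? value[1] : null;
}
```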

Can Turnstile be bypassed WITHOUT a real Chrome?

Short answer: No. Turnstile requires a JavaScript runtime + canvas + WebGL + AudioContext + matching TLS/JA3 fingerprint to mint a valid token. A real browser engine must run somewhere — locally, in a queue worker, or in the cloud.

The realistic option matrix:

| Approach | "Real Chrome" needed? | Cost | Reliability for CCLI | Notes |
| --- | --- | --- | --- | --- |
| Pure HTTP (Guzzle / curl / requests) | none | free | Will not work | Cannot execute the Turnstile JS that mints the token. Hard wall. |
| `curl-impersonate` / `curl_cffi` (TLS-fingerprint spoofing) | none | free | Will not work alone | Solves JA3 fingerprint but still no JS engine for the Turnstile widget. Useful only AFTER a session cookie exists. |
| Patched headless Chromium (Playwright + `playwright-stealth`, `puppeteer-extra-plugin-stealth`, `nodriver`, `patchright`) | yes (local) | free | Medium for `interaction-only` mode | Stealth plugins hide `navigator.webdriver`, fix canvas/WebGL leaks. Often passes Turnstile silently. Breaks under residential-IP requirement or escalation to interactive. |
| `undetected-chromedriver` + SeleniumBase UC Mode | yes (local) | free | Medium-High | Has built-in `uc_gui_click_captcha()` that uses pyautogui to click the checkbox if Turnstile escalates. Python-only. |
| Camoufox (patched Firefox, fingerprint injection at C++ level) | yes (local) | free | Medium-High | Different signature from Chromium-based detection profiles; useful when stealth-Chromium gets flagged. |
| CAPTCHA-solving service (2Captcha, CapSolver, NextCaptcha, Anti-Captcha) | none locally; service runs browsers | ≈$1.45/1k tokens | Low for CCLI specifically | They return a Turnstile token bound to the sitekey + your IP. CCLI also fingerprints the browser env + JSD beacon, so token alone often fails to authenticate. Token TTL ≈ 5min, single-use. |
| Cloud browser API (Scrapfly ASP, Browserless, Bright Data Scraping Browser, Scrapeless, ZenRows, Oxylabs Web Unblocker) | yes (remote) | ≈$5-50/1k pages | High | Real Chromium + residential proxy + automatic challenge solving in one call. The only "no local Chrome" option that actually works at scale. |
| Manual one-time login + persisted cookies (OpenLP model) | yes (one-time, in user's own browser) | free | High | User logs in once via popup/embedded view, app stores `.AspNet.ApplicationCookie` + Cloudflare `cf_clearance` cookies, reuses them for HTTP scraping until they expire (typically 30 days; `cf_clearance` is shorter ≈ 1 hour but auto-refreshes if you keep the same browser fingerprint via `curl-impersonate`). |

cf_clearance cookie pitfall: even with a valid .AspNet.ApplicationCookie, Cloudflare checks cf_clearance on every request and ties it to the originating browser's TLS+UA fingerprint. Reusing the cookie from raw curl will give 403 / cf_chl_* because the JA3 fingerprint won't match. Use curl-impersonate-chrome or curl_cffi (curl_cffi.requests with impersonate="chrome120") so the TLS handshake matches the browser that minted the cookie.

Recommended architecture for pp-planer

A hybrid: OpenLP-style user-driven login combined with server-side scraping:

Admin panel "CCLI Session" page
- "Sign in to CCLI" button opens a popup window pointed at https://profile.ccli.com/account/signin?appContext=SongSelect&returnUrl=https://pp-planer.ddev.site/api/ccli/oauth-callback.
- User logs in normally. Their own browser handles Turnstile (usually silently on residential IPs).
- This cannot work as described: on the redirect back to our callback, JS can only read document.cookie for our own domain, while the session cookies belong to profile.ccli.com and songselect.ccli.com. The cookies have to be captured by a browser we control instead.
Better: bundled headless browser inside a queue worker
- Use Playwright (already a dev dep) + playwright-extra with the stealth plugin (puppeteer-extra-plugin-stealth, which playwright-extra can load), headed for the first login and headless for reuse.
- Persist storageState to storage/app/ccli/state.json (encrypted at rest).
- First-time setup: admin runs php artisan ccli:login → opens a non-headless Playwright browser on the server's display (or via VNC/X11 forwarding in DDEV) → admin types credentials and solves any escalated Turnstile checkbox.
- All subsequent fetches use saved cookies in headless mode. Re-prompt admin when cookies expire.
For ongoing fetches: once authenticated, requests can drop down to plain HTTP with a Chrome TLS (JA3) fingerprint, in the style of curl_cffi. Symfony HttpClient cannot spoof the TLS handshake on its own, so shell out to a curl-impersonate binary (lwthiker/curl-impersonate) from PHP or from a Node script. This is much faster than relaunching a browser per request.
Fallback if Turnstile escalates beyond stealth limits: route through a cloud browser (Scrapfly ASP asp=true flag handles it). Make it pluggable behind SongSelectClient interface.
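
The pluggable fallback could be wired up roughly as below. The sketch assumes every client exposes a `name` and an async `fetchByCcliId(ccliId)`; `FallbackSongSelectClient` and that interface shape are hypothetical, and the PHP `SongSelectClient` interface would follow the same idea.

```javascript
// Sketch: try each SongSelect client in order, e.g. local Playwright first,
// cloud browser second. Each client must provide { name, fetchByCcliId(id) }.
class FallbackSongSelectClient {
  constructor(clients) {
    this.clients = clients;
  }

  async fetchByCcliId(ccliId) {
    const errors = [];
    for (const client of this.clients) {
      try {
        return await client.fetchByCcliId(ccliId);
      } catch (err) {
        errors.push(`${client.name}: ${err.message}`); // remember why this client failed
      }
    }
    throw new Error(`All SongSelect clients failed: ${errors.join('; ')}`);
  }
}
```

Keeping the collected error messages makes it easier to log which backend failed and why before the fallback took over.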

Honest recommendation

For a church-internal tool used by a handful of staff, scraping at all is overkill. Realistic ranking:

1. Manual paste flow + lyric parser: 2 days of work, zero external deps, zero ToS risk.
2. .pro import (already done): staff can download .pro files from SongSelect manually and drop them in the existing upload area.
3. OpenLP-style embedded webview: only works for desktop; doesn't fit a Laravel web app.
4. Server-side stealth Playwright + persisted cookies: works, but ~1-2 weeks of fragile glue code that breaks on every CCLI redesign or Cloudflare ruleset bump.
5. Cloud browser API (Scrapfly etc.): most reliable, costs €€, still ToS-gray.

If automation is mandatory: option 4 with option 5 as fallback when the local browser fails.
