Migrate scheduled jobs to Apalis with Postgres backend #447

## Why

Right now we have one periodic job (`resolve_taxa`, scheduled via Cloud Scheduler → Cloud Run Job after #446 lands). That setup is the right fit for a single idempotent task. Once we have a second or third job, the per-cron Terraform / console wiring stops paying off and a job framework starts to.

[Apalis](https://github.com/geofmureithi/apalis) with the Postgres backend is the leading candidate:

- No new infrastructure — reuses our existing Cloud SQL.
- Durable job state in the same DB as the data the jobs operate on.
- Built-in retries, dead-letter, scheduled / cron / one-off triggers.
- HTTP-triggerable jobs come nearly for free, since any request handler with access to the job storage can enqueue one (one of the main reasons we'd graduate from Cloud Scheduler).

## When to do this

Trigger conditions, any one of which makes the migration worth doing:

- A second periodic job lands (likely candidates: Wikidata thumbnail refresh, IUCN conservation-status sync, GBIF backbone freshness check on hot `taxa` rows).
- We need an ad-hoc / API-triggered job ("re-resolve taxon X now", "backfill kingdom Y").
- Retry semantics beyond "next scheduled pass picks it up" are needed for any job.

Until at least one of these is true, the Cloud Scheduler + one-shot binary pattern stays the right call.
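
To make the third trigger concrete, this is the kind of per-job retry policy that "next scheduled pass picks it up" cannot express. It is a stdlib-only sketch: `RetryPolicy`, `NextStep`, and `after_failure` are hypothetical names for illustration, not Apalis types, and the exponential-backoff-with-cap scheme is an assumption about what we would want rather than a description of what Apalis does by default.

```rust
use std::time::Duration;

/// Hypothetical per-job retry policy (illustration only, not an Apalis type).
#[derive(Debug, Clone, Copy)]
struct RetryPolicy {
    /// Total attempts allowed before the job is dead-lettered.
    max_attempts: u32,
    /// Delay before the first retry.
    base_delay: Duration,
    /// Upper bound on any single retry delay.
    max_delay: Duration,
}

/// What should happen to a job after a failed attempt.
#[derive(Debug, PartialEq)]
enum NextStep {
    RetryAfter(Duration),
    DeadLetter,
}

impl RetryPolicy {
    /// `failed_attempts` is the number of attempts that have failed so far
    /// (1 after the first failure).
    fn after_failure(&self, failed_attempts: u32) -> NextStep {
        if failed_attempts >= self.max_attempts {
            return NextStep::DeadLetter;
        }
        // Exponential backoff: base_delay * 2^(failed_attempts - 1),
        // capped at max_delay. The exponent is clamped so the shift cannot overflow.
        let exponent = failed_attempts.saturating_sub(1).min(31);
        let factor = 1u32 << exponent;
        let delay = self
            .base_delay
            .checked_mul(factor)
            .unwrap_or(self.max_delay);
        NextStep::RetryAfter(delay.min(self.max_delay))
    }
}
```

With `max_attempts: 4`, a 30 s `base_delay`, and a 5 min `max_delay`, a failing job would be retried after 30 s, 60 s, and 120 s, then dead-lettered on the fourth failure.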

## Scope

1. Add Apalis (`apalis` + `apalis-sql` with the `postgres` feature) as a workspace dependency.
2. New crate `observing-jobs` (or fold into an existing crate) hosting:
   - The Apalis runtime / worker entry point binary.
   - Job definitions as `apalis::Job` impls.
   - Migration for the Apalis tables (separate sqlx migration, or via Apalis's own migrate fn; to be decided).
3. Migrate `resolve_taxa` from a standalone binary to an Apalis job. Keep the same logic; cron triggers replace Cloud Scheduler.
4. Replace the Cloud Run Job + Cloud Scheduler wiring with a single long-lived Cloud Run *Service* hosting the Apalis worker. Tear down the cron infra in IaC.
5. Document how to add a new job (single example in CONTRIBUTING / docs).

## Open questions

- **Schema location**: Apalis's tables in an `apalis` schema vs `public` vs reuse of `appview` / `ingester`. Probably its own schema for clarity; revoke runtime-role write access on it from non-job services.
- **Workers per service**: one Cloud Run Service for all jobs, or one per job class? Start with one; split if a noisy neighbor emerges.
- **Backwards compatibility**: do we keep `resolve_taxa` as a one-shot CLI for ops use, or fully replace it? Probably keep it, since it's useful for ad-hoc backfills outside the job system.
- **Postgres connection pressure**: Apalis polls the DB for jobs. Tune polling interval and worker count, and verify that polling doesn't measurably increase Cloud SQL load.
- **Observability**: how to surface job state in our existing tracing/Cloud Logging stack. Apalis has `apalis-prometheus` and tracing layers; pick one.
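
For the connection-pressure question, a back-of-envelope estimate helps decide where to start tuning. The sketch below assumes each worker issues one fetch query per poll interval while idle; that is a simplification for sizing, not a statement about how Apalis actually polls, and `idle_polls_per_second` is a hypothetical helper name.

```rust
use std::time::Duration;

/// Idle polling load, in queries per second, that a set of workers puts on
/// the database.
///
/// Assumption: every worker issues exactly one fetch query per poll interval.
/// A backend that batches fetches or uses notifications would come in lower.
/// A zero interval yields infinity, so callers should pass a non-zero value.
fn idle_polls_per_second(workers: u32, poll_interval: Duration) -> f64 {
    f64::from(workers) / poll_interval.as_secs_f64()
}
```

Under that assumption, 4 workers polling every 2 seconds add about 2 queries per second of idle load, and dropping the interval to 250 ms raises that to 16.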

## Acceptance criteria

- [ ] At least two distinct jobs running on Apalis (the trigger condition for this work).
- [ ] `resolve_taxa` runs on the same cadence as today, via Apalis, with the same idempotency guarantees.
- [ ] Cloud Scheduler trigger for `resolve_taxa` removed.
- [ ] An HTTP endpoint (or admin CLI) exists that can ad-hoc-trigger any job.
- [ ] Job retries / failures are visible in our existing logs.
- [ ] Docs or a README section showing how to add a new job in <50 lines.
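
One possible shape for the "ad-hoc-trigger any job" criterion is a name-keyed registry that both an HTTP handler and an admin CLI could call into. This is a stdlib-only sketch under assumptions: `JobRegistry`, `JobFn`, and the string-in / string-out handler signature are hypothetical, and the `resolve_taxa` function here is a stand-in that only returns a message. In the real system the handler would enqueue into the Apalis storage instead.

```rust
use std::collections::BTreeMap;

/// Hypothetical handler signature: takes a raw argument string and returns
/// a human-readable summary or an error message.
type JobFn = fn(&str) -> Result<String, String>;

/// Name-keyed table of triggerable jobs (illustration only).
struct JobRegistry {
    jobs: BTreeMap<&'static str, JobFn>,
}

impl JobRegistry {
    fn new() -> Self {
        Self { jobs: BTreeMap::new() }
    }

    /// Register a job under a stable name; a later registration with the
    /// same name replaces the earlier one.
    fn register(&mut self, name: &'static str, run: JobFn) {
        self.jobs.insert(name, run);
    }

    /// Look the job up by name and run it, or report an unknown name.
    fn trigger(&self, name: &str, arg: &str) -> Result<String, String> {
        match self.jobs.get(name) {
            Some(run) => run(arg),
            None => Err(format!("unknown job: {name}")),
        }
    }

    /// Registered job names in sorted order, e.g. for a `list` subcommand.
    fn names(&self) -> Vec<&'static str> {
        self.jobs.keys().copied().collect()
    }
}

/// Stand-in for the real job: it only reports what it would enqueue.
fn resolve_taxa(arg: &str) -> Result<String, String> {
    Ok(format!("would enqueue resolve_taxa for {arg}"))
}
```

Adding a new job then means writing one handler and one `register` call, which keeps the "new job in <50 lines" criterion within reach.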
