XCrawler is a Laravel 12 operational crawler application for metadata crawling, normalization, Elasticsearch indexing, and React/Inertia crawl management.
It crawls movie metadata through the local Modules/Crawler parser and adapter module, stores normalized catalog data in MySQL, uses Redis/Horizon for queue work, records crawl logs, indexes movies into Elasticsearch, and provides an admin-focused UI for dashboard, movies, performers, sites, crawl operations, and search.
The companion ../xcrawler-observability service is a separate NestJS/React/PostgreSQL observability app. XCrawler's Laravel observability (OBS) integration lives in Modules/Observability and can send signed downstream copies of events through observability_outbox for dashboards and timelines. It must not replace XCrawler's local crawl_logs, Redis backoff/throttle state, Horizon queues, MySQL source data, or Elasticsearch indexing.
- PHP 8.5, Laravel 12
- React 19, Inertia 3, TypeScript 6, Vite 7, Tailwind CSS 4
- MySQL 9, Redis 7, Elasticsearch 9.x, MongoDB 8.3, Qdrant 1.13
- Laravel Horizon
- JOOservices packages: client, dto, laravel-config, laravel-controller, laravel-repository, useragent
Use Docker Desktop with the desktop-linux context.
docker context use desktop-linux
cp .env.example .env
docker compose up -d --build
docker compose exec app composer install
docker compose exec app npm install
docker compose exec app php artisan key:generate
docker compose exec app php artisan migrate
docker compose exec app php artisan search:ensure-index
docker compose exec app php artisan vector:ensure-collection
docker compose exec app npm run build
For a new disposable local DB only:
docker compose exec app php artisan db:seed
App: http://127.0.0.1:8080
Horizon: http://127.0.0.1:8080/horizon
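After the setup commands finish, it can take a few seconds before the containers answer HTTP requests. The following is a minimal sketch of a readiness check, not part of the repository: `wait_for_url` is a hypothetical helper name, and it assumes `curl` is installed on the host and that both URLs answer with a status below 400 in a local environment.

```shell
# Hypothetical helper: poll a URL until it answers without an HTTP error,
# or give up after a fixed number of attempts.
# Usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_url() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: treat HTTP >= 400 as failure, -sS: quiet but show errors,
    # -o /dev/null: discard the response body
    if curl -fsS -o /dev/null --max-time 5 "$url"; then
      echo "ready: $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not reachable after $attempts attempts: $url" >&2
  return 1
}
```

A typical call would be `wait_for_url http://127.0.0.1:8080 && wait_for_url http://127.0.0.1:8080/horizon`. If Horizon is configured to redirect or deny access in your environment, the second check will fail even though the service is running.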
docker compose exec app php artisan crawl:dispatch-due --limit=5
docker compose exec app php artisan crawl:site onejav
docker compose exec app php artisan search:reindex-movies
docker compose exec app php artisan vector:ensure-collection
docker compose exec app php artisan vector:sync-movies
docker compose exec app php artisan horizon:status
docker compose exec app php artisan schedule:list
composer test
npm run typecheck
npm run lint
npm run format:check
npm run build
- Use syncWithoutDetaching() for movie performer/genre pivots.
- Use TaxonomyService for performer/genre names.
- Use App\Support\UrlHash for URL hashes.
- Do not run search:ensure-index --force or search:reindex-movies --recreate without environment confirmation.
- The primary local quality gate is the full Docker quality gate via bash scripts/docker-testing.sh (run before every commit/push). Before push, also run a narrow real crawl smoke test (crawl:probe-site onejav) in the isolated docker-compose.test.yml stack. Individual checks (composer test, etc.) are for partial/troubleshooting runs only.
- Do not use Colima for this repository; use Docker Desktop desktop-linux.
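The rule about destructive search commands can be backed by a small confirmation step. The sketch below is one possible way to do that and is not part of the repository: `confirm_env` is a hypothetical helper, and it assumes the operator knows which environment they intend to target and passes that name explicitly.

```shell
# Hypothetical guard: succeed only if the operator types the expected
# environment name back. Intended to run before a destructive command.
# Usage: confirm_env ENV_NAME
confirm_env() {
  env_name="${1:?usage: confirm_env ENV_NAME}"
  printf 'About to run a destructive command against "%s".\n' "$env_name" >&2
  printf 'Type the environment name to continue: ' >&2
  # Abort if stdin is closed or no line could be read
  IFS= read -r answer || return 1
  [ "$answer" = "$env_name" ]
}
```

It could then gate a dangerous command, for example `confirm_env local && docker compose exec app php artisan search:reindex-movies --recreate`. The guard only checks what the operator types; it does not verify which environment the container is actually configured for.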
Start with the Documentation Hub.
Important links:
- Architecture
- Docker Installation
- Crawl Pipeline
- Search Indexing
- Vector DB Operations
- Release Flow
- Deploy Flow
- Known Dangerous Commands
- Observability Service
- Modules/Crawler README - Adapter and parser engine documentation
- Modules/Observability README - Downstream telemetry and outbox documentation
- AGENTS.md
- Claude instructions
- AI skills usage
- Canonical skills: .github/skills