fix(xiaoyuzhou): macOS BSD grep compat - replace grep -oP with perl -ne #290

Open
hymansun wants to merge 1 commit into Panniantong:main from hymansun:fix/macos-bsd-grep-xiaoyuzhou

Conversation

hymansun commented May 4, 2026

Summary

  • Fixes transcribe_xiaoyuzhou.sh failing on every macOS install with grep: invalid option -- P
  • Root cause: grep -oP (PCRE) is a GNU extension and is not available in the BSD grep that ships with macOS
  • Replaces the three affected call sites with perl -ne. macOS ships perl by default, so this adds no new dependency, and the replacement behaves identically on Linux
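
The root cause can be confirmed on any machine with a short probe. This is a minimal sketch, not part of the patch: the function name grep_supports_pcre and the GREP_FLAVOR and PERL_STATUS variables are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the repository): succeeds only when the grep
# on PATH accepts -P. BSD grep rejects the flag and exits non-zero.
grep_supports_pcre() {
  printf 'probe\n' | grep -qP 'pro\Kbe' 2>/dev/null
}

if grep_supports_pcre; then
  GREP_FLAVOR="pcre"      # GNU grep built with PCRE support
else
  GREP_FLAVOR="no-pcre"   # e.g. the BSD grep on macOS
fi

# The fix relies on perl being installed, so check that as well.
if command -v perl >/dev/null 2>&1; then
  PERL_STATUS="available"
else
  PERL_STATUS="missing"
fi

echo "grep flavor: $GREP_FLAVOR, perl: $PERL_STATUS"
```

On a stock macOS install this should report no-pcre with perl available, which is the situation the patch targets.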

Closes #289

What changed

agent_reach/scripts/transcribe_xiaoyuzhou.sh, lines 38, 39, 114:

-AUDIO_URL=$(echo "$PAGE" | grep -oP 'https://media\.xyzcdn\.net/[^"]*\.(m4a|mp3)' | head -1)
-TITLE=$(echo "$PAGE" | grep -oP '"title":"[^"]*"' | head -1 | sed 's/"title":"//;s/"//')
+AUDIO_URL=$(echo "$PAGE" | perl -ne 'while (/(https:\/\/media\.xyzcdn\.net\/[^"]*\.(?:m4a|mp3))/g) { print "$1\n" }' | head -1)
+TITLE=$(echo "$PAGE" | perl -ne 'if (/"title":"([^"]*)"/) { print "$1\n"; last }' | head -1)
-WAIT_SEC=$(echo "$BODY" | grep -oP 'in \K[0-9]+m' | sed 's/m//' | head -1)
+WAIT_SEC=$(echo "$BODY" | perl -ne 'if (/in (\d+)m/) { print "$1\n"; exit }')

Net change: 3 insertions, 3 deletions. No behavior change on Linux; fixes the macOS path entirely.

Test plan

  • Lint: shell still parses (bash -n agent_reach/scripts/transcribe_xiaoyuzhou.sh)

  • End-to-end on macOS 14 (Apple Silicon, bash 3.2, BSD grep, ffmpeg 8.1, Python 3.14, agent-reach 1.4.0): full transcription of https://www.xiaoyuzhoufm.com/episode/69f2d432bb3ffa11e59cc5b0 (9:05, 8.4MB → 4MB mono mp3 → 1 chunk → Groq Whisper large-v3 → 2561-character markdown). Sample run:

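    📻 Xiaoyuzhou podcast transcription
    🔍 Parsing page...
    📝 Title: 五一前夜,5A景区集体翻车背后
    🔗 Audio: https://media.xyzcdn.net/.../xxx.m4a
    ⬇️  Downloading audio...
    📦 File size: 8.4M
    ⏱️  Duration: 9 min 5 s
    🔄 Transcoding...
    📦 After transcoding: 4MB
    📎 No chunking needed
    🎙️  Transcribing (Groq Whisper large-v3)...
       Chunk 1/1... ✅ (2561 characters)
    ✅ Done!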
    
  • Linux smoke test (any maintainer with a Linux box): the same command should still work, since the perl regexes are equivalent to the prior PCRE patterns.

🤖 Generated with Claude Code

…mpat

`grep -oP` (PCRE) is a GNU extension and not available in BSD grep
shipped with macOS, causing transcribe_xiaoyuzhou.sh to fail at the
page-parsing step on every macOS install:

  grep: invalid option -- P
  ❌ Unable to extract the audio link from the page

Switch the three affected sites to `perl -ne`. macOS ships perl by
default, so this introduces no new dependency and works identically
on Linux.

Closes Panniantong#289

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

transcribe_xiaoyuzhou.sh fails on macOS: grep -oP (PCRE) is not supported by BSD grep