Scrape a static listing site into a structured dataset
When to use: You need a dataset from a public site that doesn't have an API.
Prerequisites
- Skill installed — git clone https://github.com/yfe404/web-scraper ~/.claude/skills/web-scraper
- Node 20 for Apify Actors — nvm install 20
Steps
- Let the skill do recon
  Prompt: Use web-scraper. Target: https://example.com/listings. I want name + URL + category. Recon first: tell me the cheapest extraction path.
  → Skill reports: 'sitemap.xml available, use Cheerio'
- Scaffold the Apify Actor
  Prompt: Scaffold a TypeScript Apify Cheerio actor for that extraction.
  → Actor tree + main.ts ready to run
- Run and iterate
  Prompt: Run locally on 10 pages; tighten selectors if needed.
  → Clean JSON output
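
As a rough illustration of what the steps above amount to when the recon finds a sitemap, here is a minimal TypeScript sketch. It is not the code the skill or the scaffolded Actor produces. The `Listing` shape, both function names, and the `/listings/` path prefix are assumptions made for the example, and the sitemap parser is a plain regular expression that does not decode XML entities.

```typescript
// Hypothetical record shape for the three requested fields.
interface Listing {
  name: string;
  url: string;
  category: string;
}

// Collect <loc> URLs from a sitemap document, keeping only those whose path
// starts with `pathPrefix` (an assumed convention for listing pages) and
// stopping once `limit` URLs have been collected.
function listingUrlsFromSitemap(xml: string, pathPrefix: string, limit: number): string[] {
  const urls: string[] = [];
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  let match: RegExpExecArray | null;
  while ((match = locPattern.exec(xml)) !== null && urls.length < limit) {
    const url = match[1];
    // Strip scheme and host so only the path is compared with the prefix.
    const path = url.replace(/^https?:\/\/[^/]+/, "");
    if (path.startsWith(pathPrefix)) urls.push(url);
  }
  return urls;
}

// Collapse whitespace, drop rows with an empty field, and de-duplicate by URL
// so the output is one clean record per listing.
function cleanListings(rows: Listing[]): Listing[] {
  const seen = new Set<string>();
  const cleaned: Listing[] = [];
  for (const row of rows) {
    const name = row.name.replace(/\s+/g, " ").trim();
    const url = row.url.trim();
    const category = row.category.replace(/\s+/g, " ").trim();
    if (!name || !url || !category || seen.has(url)) continue;
    seen.add(url);
    cleaned.push({ name, url, category });
  }
  return cleaned;
}
```

Calling `listingUrlsFromSitemap(xml, "/listings/", 10)` mirrors the "run locally on 10 pages" step, and `cleanListings` is the kind of post-processing you might tighten while iterating on selectors.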
Result: An Apify Actor you can deploy for scheduled scrapes.
Pitfalls
- Jumping to Playwright when Cheerio would do: trust the recon. A headful browser multiplies the cost roughly 10x for no benefit.
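
One way to sanity-check a "use Cheerio" verdict by hand is to fetch the page without a browser and look for a few values you can see on the rendered page. The sketch below assumes that approach. `staticHtmlSuffices` and the marker strings are hypothetical, and this is not how the web-scraper skill itself decides.

```typescript
// Returns true when every expected marker already appears in the raw,
// un-rendered HTML, which suggests a plain HTTP fetch plus Cheerio is enough.
// Returns false when a marker is missing (the content may be rendered by
// JavaScript) or when no markers were supplied, since that gives no evidence.
function staticHtmlSuffices(rawHtml: string, markers: string[]): boolean {
  if (markers.length === 0) return false;
  const haystack = rawHtml.toLowerCase();
  return markers.every((marker) => haystack.includes(marker.toLowerCase()));
}
```

If this returns false for values that are clearly visible in the browser, that is a hint the data arrives via JavaScript and a browser-based crawler may be justified after all.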