AI Scraper Engineer

  • Posted 20 hours ago

Job Description

AI Scraper Engineer (Hybrid)

We're hiring an engineer to design and run server-side scraping systems, including custom scrapers, that stay reliable as targets change and defenses tighten. You'll work closely with leadership (in English only) and help us evolve toward AI-orchestrated scraping (e.g. GPT-style models generating or coordinating scripts, proxy choices, and runs).

What you'll do

  • Build and maintain production-grade scrapers in Python (and supporting tooling), from HTTP/API clients to headless browser flows where needed.
  • Reverse-engineer sites and mobile/web apps: discover internal/private APIs, headers, auth flows, pagination, and rate limits; document findings for the team.
  • Operate across the full scraping stack: proxies (residential/datacenter/mobile rotation), TLS/JA3, browser fingerprints, cookies/sessions, and CAPTCHA/mitigation strategies where ethical and legal.
  • Use Playwright (or equivalent) for high-defense targets where pure HTTP isn't enough; keep runs stable, observable, and cost-aware.
  • Collaborate remotely in fluent English (written and spoken) on specs, incidents, and architecture discussions with our team.
  • Use AI coding assistants (e.g. Cursor, Codex, ChatGPT-class tools) as a normal part of workflow—shipping fixes and features fast while preserving code quality, security, and performance.
  • Lay groundwork for generative-AI orchestration: prompts/tooling that propose or adjust scraper logic, schedules, proxy usage, and failure recovery (within compliance boundaries).

What we're looking for

Must-have

  • 2+ years of professional experience building scrapers, crawlers, or data-ingestion pipelines against real, changing websites.
  • Strong Python for networking, async I/O, parsing (HTML/JSON), error handling, retries, and structured logging.
  • Practical grasp of HTTP/HTTPS, REST/JSON (and common variations), cookies, redirects, CORS (from a client perspective), and basic TLS implications for scraping.
  • Hands-on experience with proxies, IP rotation, session stickiness, and anti-bot concepts (fingerprints, headers, behavioral signals).
  • Playwright or Selenium-class automation for difficult sites.
  • Comfortable debugging with DevTools, mitmproxy/Charles-class tools, HAR captures, and reading minified JS when tracing API calls.
  • Professional English for daily collaboration with English-only stakeholders.
  • Openness to hybrid work: remote initially, with the expectation of working on-site with the team later (details to align during hiring).

Nice-to-have

  • Experience with queue/workers (Celery, RQ, Dramatiq), containers, and basic observability (metrics, tracing, alerting).
  • Familiarity with LLM APIs, prompt design, and safe patterns for AI-generated code (review, tests, sandbox runs).
  • Exposure to classification/extraction (rules + ML) for messy HTML or unstructured text.
  • Understanding of legal and ethical boundaries (robots.txt, ToS, regional rules)—we expect judgment and compliance-minded design.

How we work

  • Hybrid model: remote to start; later, a combination of remote and in-person work as agreed.
  • We value engineers who iterate quickly with AI tools but still own outcomes: correctness, edge cases, security (secrets, customer data), and operational stability.

To apply:

  • Submit your application here: https://link.lntech.ai/h-ase
  • Send a short note on two scraping projects you've shipped (stack, defenses faced, how you kept them running), plus anything relevant on Playwright, proxy/fingerprint setups, or AI-assisted development.

Job ID: 148397563