Design, build, and operate the LLM-driven services that power patent ingestion, classification (IPC/CPC), claim and entity extraction, semantic search, and prior-art retrieval across the Digital IP platform.
Own the lifecycle of LLM applications on patent data: retrieval design, prompt and chain engineering, structured-output design, evaluation, and continuous improvement on top of managed LLMs.
Build retrieval systems over Ingersoll Rand's internal corpus and external patent databases (USPTO, EPO, WIPO), including embedding strategy, vector indexes, and hybrid lexical/semantic search.
Develop rigorous offline and online evaluation: gold sets co-curated with the IP Council, regression suites, hallucination and citation-faithfulness checks, and human-in-the-loop review workflows.
Productionise LLM-driven systems end-to-end — retrieval indexes, prompt and chain wiring, serving, monitoring, latency and cost control — on Snowflake (Cortex, Snowpark).
What we offer
Direct experience with patent or other long-form legal/technical documents: claim structure, IPC/CPC taxonomies, USPTO/EPO/WIPO data, or PATSTAT.
Fine-tuning of LLMs or embedding models for domain-specific tasks, in cases where fine-tuning was clearly justified.
Classical NLP or machine-learning background (training models, MLO