Petey turns PDFs into structured data. Drop one in, get a row out. Drop a thousand in, get a table. No code. No upload.
PDF data extraction is a problem nearly everywhere, and it is easier to solve than most people think. The pieces (parsers, AI models, structured outputs) have existed for years; the hard part is getting them to work together. Petey is the desktop app that does the wiring, so you get that capability without writing code.
Under the hood, the same four-step pipeline runs everywhere — desktop app, web app, Docker container, Python library. Pluggable parsers, pluggable LLMs. Same blueprints, same outputs. Open source, AGPL, free.
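As a rough illustration of what "pluggable parsers, pluggable LLMs" can look like in code, here is a minimal Python sketch. It is not Petey's actual API or its four-step pipeline: `Parser`, `Extractor`, `run_pipeline`, and both stand-in classes are hypothetical names invented for this example, and the "model" is a trivial key-value reader rather than a real LLM.

```python
from typing import Protocol


class Parser(Protocol):
    """Hypothetical interface: anything that turns raw bytes into text."""
    def parse(self, data: bytes) -> str: ...


class Extractor(Protocol):
    """Hypothetical interface: anything that pulls named fields out of text."""
    def extract(self, text: str, fields: list[str]) -> dict[str, str]: ...


class Utf8Parser:
    """Stand-in parser: treats the input bytes as UTF-8 text."""
    def parse(self, data: bytes) -> str:
        return data.decode("utf-8", errors="replace")


class KeyValueExtractor:
    """Stand-in for a model: reads 'field: value' lines."""
    def extract(self, text: str, fields: list[str]) -> dict[str, str]:
        found = {}
        for line in text.splitlines():
            key, sep, value = line.partition(":")
            if sep and key.strip() in fields:
                found[key.strip()] = value.strip()
        # Every requested field appears in the row, empty if not found
        return {field: found.get(field, "") for field in fields}


def run_pipeline(documents, parser: Parser, extractor: Extractor, fields):
    """One row per document; swapping parser or extractor leaves this unchanged."""
    return [extractor.extract(parser.parse(doc), fields) for doc in documents]


docs = [b"name: Ada\ntotal: 12.50", b"name: Grace\nnotes: none"]
rows = run_pipeline(docs, Utf8Parser(), KeyValueExtractor(), ["name", "total"])
```

The point of the sketch is the shape: the list of fields plays the role of a blueprint, and any parser or extractor that satisfies the interface produces rows with the same columns.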
Need sources, evaluation, destinations, custom blueprints, or on-prem help? We'll prioritize building what a customer actually needs. The license stays AGPL; once a feature is built, it's in the package for everyone.
The app is built on a library you can use directly — same parsers, same blueprints, same extraction logic. Drop Petey into your existing pipeline, or wire it into agents and automations.
# $ pip install petey
from petey import extract

results = extract(
    blueprint="blueprints/cms1500.bpt",
    files="claims/*.pdf",
)
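The snippet above doesn't show what `extract` returns. Assuming, purely for illustration, that it yields one dict per PDF keyed by the blueprint's fields, turning the results into a table could look like the sketch below. The `rows_to_csv` helper and the sample field names are hypothetical, and stand-in rows are used so the block runs without Petey installed.

```python
import csv
import io


def rows_to_csv(rows):
    """Render a list of dicts as CSV text, one line per row."""
    # Collect column names in first-seen order so rows with
    # missing or extra fields still fit in one table.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Stand-in for extraction results (assumed shape: one dict per file)
sample_rows = [
    {"file": "claims/a.pdf", "patient_name": "Jane Doe", "total_charge": "120.00"},
    {"file": "claims/b.pdf", "patient_name": "John Roe"},
]
table = rows_to_csv(sample_rows)
```

Fields missing from a row are written as empty cells, so a document where a value could not be found still gets its own line in the table.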
Same code, same blueprints, three deployment shapes. Pick the one that fits the environment your work actually runs in.
Petey is a tool, not a service. We never see your documents — and you decide how much of your data leaves your environment, if any.