We give LLMs superpowers.
A low-cost LLM paired with PortMem hits 99% accuracy on regulated questions, compared to 65% for the same LLM on its own. If you prefer the latest agentic LLM, which already gets most answers right, PortMem makes querying it much cheaper. You win whichever LLM you pick.
The two dominant ways to "fix" RAG today are agentic loops (the model thinks, calls a tool, re-reads, repeats) and long-context stuffing (drop the whole corpus in the prompt and pray). Both pay a tax in tokens, dollars, and seconds, and neither solves currency or authority.
PortMem returns a verified passage in one retrieval round. The LLM, if you use one, only sees the right document.
Sub-second retrieval against your corpus, regardless of size. Stuffing 100k tokens into a frontier model takes 20 to 40 seconds, and that time grows with the corpus.
A retrieval round is fractions of a cent. A long-context call on a frontier model is dollars per query. Multiply by daily volume.
Not five agent steps, not three rerank passes. Cost and latency are predictable, which is what compliance and procurement care about.
Instead of asking an LLM to summarize what it found, PortMem returns the exact passage that should govern the answer, along with the trail of how it got there. The LLM, if you use one at all, becomes a presentation layer over a verified source.
The answer is a passage from a real document, with a citation. Nothing is generated on the critical path.
Superseded, recalled, and overruled documents are removed at retrieval time. The result is the one that is still in force.
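As a rough illustration of what filtering at retrieval time can look like, here is a minimal sketch. It assumes each document carries explicit `supersedes` links and a `withdrawn` flag; the `Doc` type, the field names, and the `in_force` helper are hypothetical and are not PortMem's actual data model or API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Doc:
    doc_id: str
    supersedes: Tuple[str, ...] = ()  # ids of documents this one replaces (assumed field)
    withdrawn: bool = False           # recalled or overruled with no replacement (assumed field)


def in_force(candidates: List[Doc], corpus: List[Doc]) -> List[Doc]:
    """Drop candidates that are withdrawn or replaced by any document in the corpus."""
    superseded = {old for doc in corpus for old in doc.supersedes}
    return [d for d in candidates if d.doc_id not in superseded and not d.withdrawn]


corpus = [
    Doc("rule-v1"),
    Doc("rule-v2", supersedes=("rule-v1",)),
    Doc("rule-v3", supersedes=("rule-v2",)),
    Doc("letter-a", withdrawn=True),
]
# A similarity search might return the old rule, the newest rule, and a withdrawn letter.
candidates = [corpus[0], corpus[2], corpus[3]]
print([d.doc_id for d in in_force(candidates, corpus)])  # ['rule-v3']
```

The supersession set is built from the whole corpus rather than from the candidates alone, so a stale document is dropped even when its replacement was not among the retrieved candidates.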
When the evidence is weak or contradictory, the system says so. Better silence than a confident wrong answer in regulated work.
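One common way to implement this kind of abstention is a score threshold plus a margin check against the runner-up. The sketch below assumes retrieval yields `(passage_id, score)` pairs with scores in a comparable range; the function name and both thresholds are hypothetical placeholders, not PortMem's actual decision rule.

```python
from typing import List, Optional, Tuple


def answer_or_abstain(
    scored: List[Tuple[str, float]],
    min_score: float = 0.6,   # hypothetical floor for "strong enough" evidence
    min_margin: float = 0.1,  # hypothetical gap required over the runner-up
) -> Optional[str]:
    """Return the top passage id, or None when evidence is weak or contested."""
    if not scored:
        return None
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    top_id, top_score = ranked[0]
    if top_score < min_score:
        return None  # weak evidence
    if len(ranked) > 1 and top_score - ranked[1][1] < min_margin:
        return None  # two passages compete too closely to pick one
    return top_id


print(answer_or_abstain([("a", 0.9), ("b", 0.5)]))   # a
print(answer_or_abstain([("a", 0.9), ("b", 0.85)]))  # None
```

Returning `None` instead of the best available guess lets the caller surface an explicit "no reliable answer" state.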
For questions that need two or three connected documents, PortMem chains the retrievals and shows the bridge between them.
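To make the idea of chaining concrete, here is a toy two-hop sketch. It uses plain token overlap as a stand-in for a real retriever, ranks the second hop against the first-hop document rather than the original question, and reports the shared terms as the bridge. The `two_hop` function, the scoring, and the sample corpus are all illustrative assumptions, not PortMem's method.

```python
from typing import Dict, Set, Tuple


def tokens(text: str) -> Set[str]:
    return set(text.lower().split())


def overlap(a: str, b: str) -> int:
    return len(tokens(a) & tokens(b))


def two_hop(question: str, corpus: Dict[str, str]) -> Tuple[str, str, Set[str]]:
    """Pick a first-hop document for the question, then a second hop linked to it."""
    first = max(corpus, key=lambda d: overlap(question, corpus[d]))
    remaining = {d: text for d, text in corpus.items() if d != first}
    # Second hop is scored against the first-hop document, not the question.
    second = max(remaining, key=lambda d: overlap(corpus[first], remaining[d]))
    # The bridge is what the two documents share beyond the question's own words.
    bridge = (tokens(corpus[first]) & tokens(corpus[second])) - tokens(question)
    return first, second, bridge


corpus = {
    "memo": "policy memo cites rule 2210 for retail communications",
    "rule": "rule 2210 amended in 2023 requires principal approval",
    "other": "cafeteria menu for friday",
}
question = "what does the policy memo on retail communications require"
first, second, bridge = two_hop(question, corpus)
print(first, second, sorted(bridge))  # memo rule ['2210', 'rule']
```

Here the rule document shares no words with the question, so a single similarity pass would not surface it; it is reached only through the memo that cites it.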
Different question types need different strategies. A router picks the right one without you having to label queries.
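The dispatch step can be pictured with a deliberately simple rule-based stand-in. The sketch below routes on cue words and phrases; the strategy names, cue lists, and `route` function are hypothetical, and a production router would likely be learned rather than keyword-based.

```python
import re

# Hypothetical cue lists, chosen only for illustration.
CURRENCY_CUES = {"current", "currently", "latest", "still", "superseded", "overruled", "recalled"}
MULTI_HOP_PHRASES = ("cited by", "referenced in", "that amended")


def route(query: str) -> str:
    """Pick a retrieval strategy name from surface cues in the query."""
    lowered = query.lower()
    words = set(re.findall(r"[a-z0-9]+", lowered))
    if words & CURRENCY_CUES:
        return "currency"    # which version is in force?
    if any(phrase in lowered for phrase in MULTI_HOP_PHRASES):
        return "multi_hop"   # needs a chain of documents
    return "similarity"      # plain nearest-passage lookup


print(route("Is FINRA Rule 2210 still in force?"))                # currency
print(route("Which rule is cited by the 2021 compliance memo?"))  # multi_hop
print(route("Describe the revenue recognition guidance"))         # similarity
```

The caller never supplies a query type; the router infers it from the query text alone.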
Sits above your vector store. Vendor-neutral, model-agnostic, no rip-and-replace.
PortMem sits above any vector store. Buyers add the regulated-domain ranking layer that long-context LLMs, agentic frameworks, and generic rerankers do not have.
| Approach | Currency | Audit | Multi-hop | Model-agnostic | Regulated | Speed | Cost |
|---|---|---|---|---|---|---|---|
| Long-context LLMs (Sonnet 1M, GPT-4) | ✗ | ✗ | ~ | ✗ | ✗ | Slow | $$$ |
| Naive RAG (LangChain, LlamaIndex) | ✗ | ~ | ✗ | ✓ | ✗ | Fast | $ |
| Agentic RAG (LangGraph, CrewAI, AutoGen) | ✗ | ~ | ✓ | ✓ | ✗ | Slow | $$ |
| Compilation-stage knowledge (Pinecone Nexus, PageIndex) | ✗ | ~ | ✓ | ✓ | ✗ | Med | $$ |
| Vertical legal AI (Harvey, Casetext, Hebbia) | ~ | ✓ | ~ | ✗ | legal only | Med | $$$ |
| Enterprise search (Glean, Sana) | ✗ | ~ | ✗ | ✓ | ✗ | Fast | $$ |
| Vector store + rerank (Vectara, Pinecone, Cohere) | ✗ | ~ | ~ | ✓ | ✗ | Fast | $$ |
| PortMem (Retrieval for regulated markets) | ✓ | ✓ | ✓ | ✓ | ✓ | Fast | $ |
We built three test corpora where the right answer is not the most similar document. In every one, a workhorse frontier model working alone fails most of the time. PortMem passes all three.
Across SEC filings, FASB updates, FINRA rule changes, and SEC no-action letters, the latest superseding document wins. PortMem picks it deterministically. A frontier LLM working from raw text gets it right 65% of the time.
SCOTUS overrules are semantically distant from the cases they replace. PortMem finds them by authority structure, not by similarity.
FDA's recall database overlaps with marketing-approval text. Standard retrieval pulls the approval; we surface the recall. Pharma demo →
Datasets, code, and the single-command evaluator are published with each paper. The "workhorse frontier model alone" baseline is Claude Haiku reading the same corpus directly, without PortMem retrieval. Method, prompts, and full result tables are in the CAR paper (paper 4) under github.com/andremir/car-retrieval.
The pain is sharpest where a stale rule means an SEC enforcement action, a withdrawn no-action letter means a deficient supervisory procedure, or a missed supersession means a restated filing. We are starting with finance because the buyer has the budget, the audit clock is recurring, and the data (10-Ks, FASB ASUs, FINRA rule amendments, no-action letters) is structurally version-controlled and machine-readable.
SEC filings, FASB Accounting Standards Updates, FINRA rule amendments, SEC no-action letters, internal compliance memos. Knowing which version is currently in force is the entire job, and the cost of being wrong is regulatory exposure measured in seven figures.
PortMem hits 100% accuracy on financial supersessions. A workhorse frontier model alone hits 65%. The 35-point gap is the gap between "tooling" and "audit-defensible."
Case law, precedent tracking, statutory authority. One mis-cited brief costs 80 to 200 hours of associate time and creates malpractice exposure. PortMem catches overruled precedent that frontier LLMs miss 93% of the time.
Buyers: AmLaw partners, in-house GC, litigation support, legal AI platforms.
FDA labels, clinical trial protocols, drug recall notices. A recalled product retrieved as "approved" is a patient safety event.
Buyers: medical affairs, regulatory affairs, pharmacovigilance. Pharma demo →
CVE entries, GHSA advisories, vendor patch notes. "Is this CVE patched?" is a controlling-authority question. Standard RAG gets it wrong 39% of the time.
Buyers: secops, vulnerability management, AppSec platforms.
PortMem is the productization of four sole-authored research contributions. Each one fixes a specific failure mode that standard retrieval has in regulated content.
Vector search and graph search produce scores on different scales. PhaseGraph maps them onto a common rank-based scale before fusing, which lifts last-hop recall on MuSiQue and 2Wiki without discarding magnitude information.
arXiv 2603.28886 →
For questions that need a chain of evidence, the second-hop document should be ranked by usefulness given the first hop, not by similarity to the original question. Training-free, graph-free, beats published baselines on three standard benchmarks.
arXiv 2604.03384 →
Different question types need different retrieval strategies. A lightweight router picks the right mode per query, with measurable gains in domain and graceful behavior out of domain. No hand-labeled query types required.
arXiv 2604.09019 →
Finding the currently valid document is a different problem from finding the most similar one, and the two metrics are formally decoupled. CAR validates the framework on FDA, SCOTUS, and security advisories with large gains over dense baselines.
arXiv 2604.14488 · GitHub →
Currently in active conversations with finance and legal teams across mid-market RIAs, regional banks, AmLaw firms, and large-cap controller offices. We onboard four to six design partners before commercial GA.
Hosted endpoint, integration support, and a co-built ingestion adapter for one corpus.
Design-partner license at a discount from list. Roadmap input. No long-term lock-in.
Two-week scoping. Six-week integration. Twelve-week paid pilot decision.
Or write directly to andre@portmem.com.