Semantic Search
at the Edge
Intelligent query caching that understands meaning — not just keywords. Vector similarity powered by nomic-embed-text, served from Akamai Functions KV at the network edge.
Defense & Guardrails
Production-grade safety layers — a blueprint for deploying LLMs responsibly.
Schema Validation
Zod validates all request params and every KV read. Wrong shape or missing fields → rejected before any logic runs.
Injection Blocklist
20+ regex patterns reject instruction overrides, jailbreak keywords ("DAN", "act as"), model delimiter tokens, and explicit code/exploit generation requests.
System Prompt Lock
A hardcoded system prompt on every LLM call explicitly forbids code generation, hate speech, and persona hijacks. User input cannot override it.
KV Data Integrity
Every value read from the KV store is parsed with a Zod schema. Corrupt data is silently dropped — treated as a cache miss, not a runtime crash.
Network Lockdown
Spin's allowed_outbound_hosts restricts all egress to the single configured Ollama endpoint. The Wasm sandbox makes arbitrary network calls impossible.
Query flow
Step 1
Query Arrives
Input is validated, sanitized against prompt injection, and checked against the exact-match KV cache.
Step 2
Embed & Search
Ollama generates a 768-dim vector via nomic-embed-text. Cosine similarity finds semantically equivalent queries.
Step 3
Cache Hit?
Similarity > 85%? Return the cached answer instantly — zero GPU cycles, single-digit milliseconds.
Step 4
Generate & Cache
On a miss the LLM generates a fresh answer, stores it with its vector embedding for all future semantic lookups.