Koca Ventures Ltd
71-75 Shelton Street
Covent Garden, London
WC2H 9JQ, United Kingdom
Registered in England & Wales, No. 16231043

AGENTIC AI & ON-PREMISE SYSTEMS

Private AI Agents That Run Your Operations, Not Just Your Inbox

Custom agent harnesses, Claude / OpenAI Agent SDK deployments, Kuzu knowledge graphs, and self-hosted inference — built around the workflow your team already uses. NVIDIA PSIRT-credited founder; security-sensitive edge AI is our daily work.

WHAT WE BUILD

Six pieces, all production-grade

01

Custom Agent Harnesses

Bespoke agent runtimes built around your workflows — not generic LLM wrappers. Tool calling, memory architectures, retry semantics, and approval gates designed for the specific class of work your team actually does.

02

Agent SDK Integration

Production deployments on the Claude Agent SDK, OpenAI Agents framework, and the Model Context Protocol (MCP). Multi-agent orchestration, tool use, structured output — wired into your existing services and databases.

03

Document Agents + Kuzu Knowledge Graphs

RAG over your contracts, specs, SOPs, RFIs, and PDFs with citations. Backed by Kuzu — an embedded property-graph database — so the agent reasons across entity relationships, not just chunk similarity.

04

Data & CRM Pipelines

ETL and ingestion pipelines that feed agents the structured context they need. CRM automation (HubSpot, Salesforce, Pipedrive), inbound-lead routing, auto-summary of enquiries, and reliable follow-up workflows.

05

On-Premise AI Deployment

Run agents on your hardware. Self-hosted inference (vLLM, Ollama, llama.cpp), encrypted local storage, signed updates, role-based access, audit logs. No customer data ever leaves your infrastructure.

06

Edge AI & Computer Vision

NVIDIA Jetson Orin deployments, DeepStream pipelines, TensorRT/ONNX optimisation, and air-gapped operation. Edge inference near cameras for real-time alerts without cloud dependency.

FLAGSHIP USE CASE

WhatsApp AI Operations Organizer

Most companies already run a large part of daily operations through WhatsApp: managers ask for updates, field teams send photos and voice notes, sales discuss customer requests, and procurement shares supplier prices. WhatsApp is fast, but it is not organised. Important tasks get buried, ownership is unclear, follow-ups are forgotten, and leadership has no reliable operational memory.

The Operations Organizer is an AI layer that sits on top of company-approved WhatsApp channels and converts unstructured conversations into tasks, reminders, summaries, decisions, and reports — without forcing teams to learn a new app.

01

AI Message Understanding

Classifies conversations into tasks, issues, approvals, risks, customer requests, procurement needs, and scheduling updates — automatically.

02

Voice Note & Image Processing

Transcribes voice notes, extracts useful information from images and documents, and connects each item to the right project, customer, team, or location.

03

Task & Follow-Up Engine

Creates tasks, assigns owners, tracks due dates, reminds responsible people, and escalates ignored or overdue items.

04

Daily & Weekly Reports

Generates concise summaries for managers: completed work, open issues, delays, risks, spending items, unresolved customer requests, and team performance.

05

Company Knowledge Memory

A searchable internal memory (backed by Kuzu) of past decisions, recurring problems, customer history, supplier notes, and operational patterns.

06

Manager Dashboard

A web-based control panel where leadership can see what is happening across teams without reading hundreds of messages.
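The remind-and-escalate behaviour described under Task & Follow-Up Engine can be sketched as a small rule. This is an illustrative sketch only: the Task fields, the next_action function, and the 24-hour and 48-hour thresholds are assumptions made for the example, not a description of the shipped product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Task:
    """A task extracted from a conversation (hypothetical fields)."""
    title: str
    owner: str
    due: datetime
    done: bool = False


def next_action(
    task: Task,
    now: datetime,
    remind_before: timedelta = timedelta(hours=24),   # assumed threshold
    escalate_after: timedelta = timedelta(hours=48),  # assumed threshold
) -> str:
    """Return "none", "remind", or "escalate" for a task at time `now`."""
    if task.done:
        return "none"
    # Long overdue: hand the item to a manager.
    if now >= task.due + escalate_after:
        return "escalate"
    # Due soon, or overdue but not yet past the escalation window.
    if now >= task.due - remind_before:
        return "remind"
    return "none"


task = Task("Order rebar for site B", "procurement", due=datetime(2025, 1, 10, 9, 0))
early = next_action(task, datetime(2025, 1, 8, 9, 0))
soon = next_action(task, datetime(2025, 1, 9, 12, 0))
late = next_action(task, datetime(2025, 1, 13, 9, 0))
```

Keeping the rule a pure function of the task and the current time makes it easy to test and to re-run on a schedule.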

Manager queries the system can answer

  • “Which jobs are delayed today?”
  • “Who is waiting on approval?”
  • “Summarise all site updates from this week.”
  • “Which customer requests have not been followed up?”
  • “Create a task list from yesterday's WhatsApp messages.”
  • “Show me procurement items mentioned but not ordered yet.”

SECURITY CREDIBILITY

Edge AI is security-sensitive. We treat it that way.

Founder Ozgur Ogul Koca is publicly credited by NVIDIA PSIRT for the responsible disclosure of CVE-2026-24148 — a CWE-1188 vulnerability in NVIDIA Jetson Linux. NIST/NVD scored it 9.4 CRITICAL; NVIDIA scored it 8.3 HIGH.

That track record informs how we approach edge AI, on-premise deployment, hardened runtimes, signed updates, device identity, and air-gapped operation.

DEPLOYMENT MODEL

Your infrastructure, your data, your control.

  • Camera streams stay local where possible.
  • Documents indexed in private vector + Kuzu graph stores.
  • Answers include citations to reduce hallucinations.
  • Security controls: network isolation, encrypted storage, audit logs, signed updates, role-based access.
  • External LLM APIs are optional, not the foundation.

FREQUENTLY ASKED

Agentic AI FAQ

What is an agentic AI system?

An agentic AI system is software where an LLM autonomously calls tools, reads data, and executes multi-step workflows toward a goal — not a chatbot that only generates text. Koca Ventures builds agentic systems that read documents, query databases, file tickets, send emails, and call APIs under structured human approval where needed.

How is this different from a chatbot?

A chatbot answers questions. An agent acts. Our agent harnesses include tool calling, retry semantics, memory persistence (often backed by a Kuzu property graph), approval workflows, and integration with your real systems (ERP, CRM, POS, file storage). The output is actions performed, not just answers displayed.
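To make the difference concrete, here is a minimal sketch of how a harness might wrap a single tool call with an approval gate and bounded retries. Everything in it (Tool, run_tool, ApprovalDenied, the example tools) is hypothetical and stands in for whatever a real deployment would use; it is not the API of any particular agent SDK.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Tool:
    """A callable the agent may invoke (hypothetical structure)."""
    name: str
    func: Callable[..., Any]
    needs_approval: bool = False  # gate side-effecting actions behind a human
    max_attempts: int = 3


class ApprovalDenied(Exception):
    """Raised when the approver rejects a gated tool call."""


def run_tool(
    tool: Tool,
    args: Dict[str, Any],
    approve: Callable[[str, Dict[str, Any]], bool],
    base_delay: float = 0.0,
) -> Any:
    """Run one tool call: check approval first, then retry with backoff."""
    if tool.needs_approval and not approve(tool.name, args):
        raise ApprovalDenied(f"{tool.name} was not approved")
    last_error = None
    for attempt in range(1, tool.max_attempts + 1):
        try:
            return tool.func(**args)
        except Exception as exc:  # a real harness would catch narrower errors
            last_error = exc
            if attempt < tool.max_attempts:
                time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
    raise RuntimeError(
        f"{tool.name} failed after {tool.max_attempts} attempts"
    ) from last_error


# Example: a lookup that fails twice with a transient error, then succeeds.
calls = {"n": 0}


def flaky_lookup(order_id: str) -> Dict[str, str]:
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("transient")
    return {"order_id": order_id, "status": "shipped"}


lookup = Tool("lookup_order", flaky_lookup)
send = Tool("send_email", lambda to, body: f"sent to {to}", needs_approval=True)

result = run_tool(lookup, {"order_id": "A1"}, approve=lambda name, a: True)
```

The approval check runs before any attempt, so a rejected action is never executed and never retried.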

Can we run this on our own servers?

Yes. On-premise deployment is a first-class option, not a fallback. We deploy self-hosted inference (vLLM, Ollama, llama.cpp depending on hardware), encrypted local vector and graph stores, signed update channels, and role-based access. Camera streams and documents stay inside your network. External LLM APIs are optional, not foundational.

Do you use Claude, GPT, or local models?

We use whichever fits the workload. For sensitive on-premise deployments: local models (Llama, Qwen, Mistral) served via vLLM. For complex reasoning where data sensitivity permits: Claude (via the Claude Agent SDK) or GPT (via the OpenAI Agents framework). Many production deployments are hybrid: local models for ingestion, hosted models for hard reasoning.
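The hybrid split can be expressed as a simple routing rule. The sketch below is an assumption about how such a rule might look; the Request fields, the choose_backend function, and the "local" / "hosted" labels are illustrative names, not part of any SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Facts about one unit of work (hypothetical fields)."""
    contains_sensitive_data: bool
    needs_hard_reasoning: bool


def choose_backend(req: Request, on_premise_only: bool = False) -> str:
    """Pick "local" or "hosted" for a request.

    Sensitivity always wins: sensitive data, or a deployment pinned to
    on-premise, never goes to a hosted model regardless of difficulty.
    """
    if on_premise_only or req.contains_sensitive_data:
        return "local"
    if req.needs_hard_reasoning:
        return "hosted"
    return "local"  # default to the cheaper, private path


ingest = choose_backend(Request(contains_sensitive_data=False, needs_hard_reasoning=False))
analysis = choose_backend(Request(contains_sensitive_data=False, needs_hard_reasoning=True))
contract = choose_backend(Request(contains_sensitive_data=True, needs_hard_reasoning=True))
```

Checking sensitivity before difficulty means a hard but sensitive task stays on local models, which matches the on-premise-first stance above.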

What is Kuzu DB and why does it matter for agents?

Kuzu is an embedded property-graph database with a Cypher-like query language. For document agents, vector search alone is not enough — you need to reason across relationships (this RFI is about this spec, which is owned by this engineer, who signed off on this revision). Kuzu lets agents traverse those relationships in milliseconds without an external graph service.
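The relationship chain in that example (RFI, spec, engineer, revision) can be illustrated with a tiny in-memory traversal. This sketch deliberately uses plain Python rather than Kuzu's API, and the node IDs and relationship names (ABOUT, OWNED_BY, SIGNED_OFF) are invented for illustration. In a graph database the same question would typically be a single Cypher pattern match over an equivalent schema.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical edges: (source node, relationship type, target node)
EDGES = [
    ("RFI-17", "ABOUT", "SPEC-4"),
    ("SPEC-4", "OWNED_BY", "eng:alice"),
    ("eng:alice", "SIGNED_OFF", "REV-C"),
    ("RFI-18", "ABOUT", "SPEC-9"),  # SPEC-9 has no recorded owner
]

Index = Dict[Tuple[str, str], List[str]]


def build_index(edges: List[Tuple[str, str, str]]) -> Index:
    """Index targets by (source, relationship) for constant-time hops."""
    index: Index = defaultdict(list)
    for src, rel, dst in edges:
        index[(src, rel)].append(dst)
    return index


def follow(index: Index, start: str, path: List[str]) -> List[str]:
    """Follow a fixed sequence of relationship types from a start node."""
    frontier = [start]
    for rel in path:
        frontier = [dst for node in frontier for dst in index.get((node, rel), [])]
    return frontier


index = build_index(EDGES)
# "Which revision did the owner of the spec behind RFI-17 sign off on?"
revisions = follow(index, "RFI-17", ["ABOUT", "OWNED_BY", "SIGNED_OFF"])
no_owner = follow(index, "RFI-18", ["ABOUT", "OWNED_BY", "SIGNED_OFF"])
```

Vector similarity alone would retrieve chunks that mention RFI-17; the hop-by-hop traversal is what connects it to the revision that was actually signed off.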

What does a typical deployment timeline look like?

A focused pilot around one workflow (e.g., site updates, customer service, procurement, or document Q&A) typically takes four weeks after an initial discovery step (week 0): data intake (week 1), demo build (weeks 2–3), and roadmap review (week 4). Full production rollout depends on scope but commonly runs 8–16 weeks after the pilot.

GOOD CANDIDATE WORKFLOWS

Where this lands fastest

Construction & field service

Site updates, safety incidents, RFI/submittal flow, equipment history. Voice notes from the field become structured tickets with photos attached.

Restaurants & hospitality

Service operations, supplier prices, customer feedback. Pairs naturally with Kobor AI for the camera + chat operational picture.

Manufacturing & logistics

Shift updates, machine downtime, procurement, customer ETAs. Operator WhatsApp groups become inventory + KPI dashboards.

Distributors & dealer networks

Customer enquiries, price requests, technical questions. CRM-bound auto-summarisation + follow-up routing.

READY TO TALK?

Start with one pain point

The strongest opening is not a generic AI pitch. Share one workflow that hurts — site updates, customer follow-up, document Q&A, procurement chaos — and we build a small demo around your real data before any larger commitment.