Everything AI
ConveyorAI accuracy, nuance, and complexity in plain language
ConveyorAI’s job is to pull the best information you already have (policies, past answers, wikis, etc.) and draft security-questionnaire or RFP answers for you. Most of the time it’s 90%+ correct, but mistakes can still happen. Here is the simple version of why that happens and what we do about it:
| Why an answer can be wrong | How ConveyorAI reduces the risk |
| --- | --- |
| Out-of-date or conflicting content in your knowledge base. | • Sync directly with “source-of-truth” systems (Confluence, Google Drive, websites) so you edit in one place. • Automatic reminders to verify any Q&A you store in Conveyor. • “AI Librarian” spots duplicates or conflicts and retires or flags them. • If you manually correct an answer, that correction becomes the new source going forward. |
| The question was misread or ambiguous. | Ongoing improvements to question extraction and context capture. If extraction is wrong you can correct the answer, and that correction is remembered indefinitely by default. |
| LLM hallucinations (the model makes something up). | Retrieval-Augmented Generation (RAG) forces the model to ground answers only in the cited sources, plus automatic checks that the draft actually matches those sources. Occurs < 0.1% of the time in internal tests. |
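To make the RAG idea in the last row concrete, here is a minimal, purely illustrative sketch of what “grounding” a draft in retrieved sources can look like. The function name, prompt wording, and example snippets are assumptions for illustration, not Conveyor’s actual implementation:

```python
# Hypothetical sketch: grounding a draft answer in retrieved snippets (RAG).
# The prompt wording and helper name are illustrative assumptions.

def build_grounded_prompt(question: str, snippets: list[str]) -> str:
    """Assemble a prompt that restricts the model to the retrieved sources."""
    numbered = "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(snippets))
    return (
        "Answer the question using ONLY the sources below. "
        "If the sources do not contain the answer, say you don't know.\n\n"
        f"Sources:\n{numbered}\n\nQuestion: {question}\nAnswer:"
    )

snippets = [
    "Customer data is encrypted at rest with AES-256.",
    "Encryption keys are rotated every 90 days.",
]
print(build_grounded_prompt("How is data encrypted at rest?", snippets))
```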
What are confidence scores and how do they affect whether and when ConveyorAI answers?
When ConveyorAI drafts an answer it tags it with a traffic-light confidence color:
Green – exact match from a past, already-approved answer (safe to auto-send).
Blue – high-confidence AI answer grounded in good sources (often safe to auto-send).
Yellow – lower confidence (partial match, mixed sources, or uncertain context). These are routed to a human for review before anyone external sees them.
Admins can set a policy such as “only send Green or Blue automatically; hold Yellow for review,” so the confidence score directly gates whether the answer is released immediately, held for escalation, or never shown.
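In practice, that gating policy amounts to a small lookup from confidence color to action. The sketch below is a hypothetical illustration (the policy sets, labels, and fallback behavior are assumptions), not Conveyor’s actual routing logic:

```python
# Hypothetical sketch of how a confidence label could gate the answer flow.
# The labels mirror the article; the policy values are illustrative assumptions.

AUTO_SEND = {"green", "blue"}   # example admin policy: auto-send Green and Blue
HOLD_FOR_REVIEW = {"yellow"}    # Yellow always waits for a human reviewer

def route_answer(confidence: str) -> str:
    """Decide what happens to a drafted answer based on its confidence color."""
    label = confidence.lower()
    if label in AUTO_SEND:
        return "send automatically"
    if label in HOLD_FOR_REVIEW:
        return "hold for human review"
    return "do not show"  # anything unrecognized is never released

for color in ("Green", "Blue", "Yellow"):
    print(color, "->", route_answer(color))
```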
How do we protect customers from hallucinations?
Grounding in your data (RAG). The model is instructed to answer only with the snippets it just retrieved.
Source-consistency guardrail. If the AI’s draft strays from those snippets, the system deletes or flags the answer instead of sending it.
Modern, higher-accuracy models. Switching to state-of-the-art LLMs further lowers the base hallucination rate.
Visibility of evidence. Every answer cites its sources, so reviewers (or customers, if you allow it) can quickly verify the claim.
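Here is a rough, hypothetical sketch of what a source-consistency guardrail can look like: every sentence of the draft must overlap strongly with at least one retrieved snippet, otherwise it is flagged for review. The word-overlap heuristic and the 0.5 threshold are stand-ins chosen for illustration, not Conveyor’s actual check:

```python
# Minimal sketch of a source-consistency check: every sentence in the draft must
# share enough vocabulary with at least one retrieved snippet, or it is flagged.
# The overlap heuristic and threshold are assumptions for illustration only.

def is_supported(sentence: str, snippets: list[str], threshold: float = 0.5) -> bool:
    words = {w.lower().strip(".,") for w in sentence.split()}
    for snippet in snippets:
        snippet_words = {w.lower().strip(".,") for w in snippet.split()}
        if words and len(words & snippet_words) / len(words) >= threshold:
            return True
    return False

def check_draft(draft: str, snippets: list[str]) -> list[str]:
    """Return the draft sentences that are not grounded in any source snippet."""
    sentences = [s.strip() for s in draft.split(".") if s.strip()]
    return [s for s in sentences if not is_supported(s, snippets)]

snippets = ["Backups are taken daily and stored for 30 days."]
draft = "Backups are taken daily. Backups are replicated to three regions."
print(check_draft(draft, snippets))  # -> flags the unsupported second sentence
```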
What is our AI grading mechanism and why does it matter?
Conveyor runs internal evaluations (“evals”) on both lab test sets and real production traffic:
Each new model, retrieval tweak, or prompt change is graded against a gold-standard answer set before release.
Live answers are continuously sampled and scored so the team can catch quality dips early.
These scores feed the product roadmap—areas with lower grades get prioritized for improvement.
For a user, this means accuracy isn’t left to chance; it is measured, trended, and used to drive ongoing improvements.
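As a rough illustration of what an eval run involves, the sketch below grades candidate answers against a gold-standard set and reports an average score. The naive token-overlap grader and the example data are assumptions for readability; real graders are typically far more sophisticated:

```python
# Illustrative sketch of an offline "eval": grade candidate answers against a
# gold-standard set before a model or prompt change ships. The token-overlap
# scorer is a simple stand-in, not Conveyor's actual grading method.

def score(candidate: str, gold: str) -> float:
    cand, ref = set(candidate.lower().split()), set(gold.lower().split())
    return len(cand & ref) / len(ref) if ref else 0.0

def run_eval(predictions: dict[str, str], gold_set: dict[str, str]) -> float:
    """Average score across all questions in the gold-standard set."""
    scores = [score(predictions.get(q, ""), a) for q, a in gold_set.items()]
    return sum(scores) / len(scores)

gold_set = {"Is data encrypted at rest?": "Yes, data is encrypted at rest with AES-256."}
predictions = {"Is data encrypted at rest?": "Yes, all data is encrypted at rest using AES-256."}
print(f"eval score: {run_eval(predictions, gold_set):.2f}")
```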
Hallucinations vs. inaccuracies—what’s the difference and how are they handled?
| Term | What it means | Typical cause | How ConveyorAI mitigates |
| --- | --- | --- | --- |
| Hallucination | The AI invents facts that do not appear in any source. | LLM reasoning quirks. | RAG grounding + source-consistency checks + modern models keep the rate under 0.1%. |
| Inaccuracy | The AI faithfully uses the source, but the source itself is wrong, outdated, conflicting, or not the right context. | Content-maintenance issues; retrieval misses; question misunderstood. | Source-of-truth sync, verification reminders, AI Librarian conflict detection, product-line scoping, confidence scoring, and human review for edge cases. |
Take-away for users
Confidence scores tell you when it is safe to let ConveyorAI respond automatically. Rigorous guardrails and grading make sure both hallucinations and simple inaccuracies stay rare and visible. As your team updates or corrects answers, ConveyorAI learns, pushing overall accuracy even higher over time.