Trust Score: How I Judge an AI Tool's Reliability in 2026
My evaluation methodology for AI tools at Trust-Vault. 4 pillars, 15+ criteria, a 0-100 score. A field report after two years of cataloging.
In short: The Trust Score is Trust-Vault's methodology for rating an AI tool's reliability on a 0-100 scale. It rests on four pillars marked out of 25 points each: Privacy, Reliability, Security and Transparency. The final score is the straight sum, recalculated weekly from verified reviews, policy changes, certifications and public incidents, with no sponsored ranking and full editorial independence.
Ever since I started cataloging AI tools for Trust-Vault, people have kept asking me the same thing: "how do you tell the good ones from the bad ones?" With several tens of thousands of tools on the market and aggressive marketing from every vendor, I figured out pretty fast that I needed a rigorous framework. Otherwise I'd spend the rest of my life comparing sales sheets. This article lays out the Trust Score methodology exactly as I apply it today, after two years of cataloging and several hundred evaluations.
Why I built this framework
Three symptoms pushed me to put structure around all of this. First, the comparison articles you find online often read like content marketing in disguise, with zero transparency about affiliate links or the author's own biases. Second, user reviews are sometimes manipulated: I've come across platforms where 50% of the accounts were fake, and vendors quietly trading reviews with one another. Third, the structural criteria that actually matter (GDPR, security, transparency) almost never get audited seriously.
The Trust Score is my attempt to bring something objective, measurable and reproducible to the table. It's not an oracle. It's just a grid that forces you to look at what genuinely counts.
The 4 pillars, 25 points each
My score is built on four dimensions, each marked out of 25 points, for a total of 100. I picked these pillars because they map onto what a professional buyer actually checks before signing a contract.
1. Privacy — 25 points
Privacy comes first, because that's where a company is most exposed. I rate effective GDPR compliance, where the servers physically sit, encryption at rest and in transit, how clear the privacy policy is, whether user rights actually work in practice (access, rectification, deletion, portability), and how transparent the vendor is about its subprocessors and transfers outside the EU. A tool that hosts everything in the United States with no Standard Contractual Clauses loses points — even if it's technically excellent.
2. Reliability — 25 points
A tool has to keep its promises. I look at how the advertised features hold up against reality, the quality and consistency of the outputs (I test them myself), the satisfaction of verified users weighted by review volume, availability (uptime, response time, incident handling) and how responsive support actually is. A vendor that answers a critical ticket within 48 hours beats one that takes a week.
3. Security — 25 points
Technical security is what protects you against leaks, vulnerabilities and abuse. I check certifications (SOC 2 Type II, ISO 27001, HIPAA depending on the sector), encryption (TLS 1.2+ in transit, AES-256 at rest), multi-factor authentication, access control, logging, whether there's a bug bounty or third-party audits, and the incident history. A vendor that's open about past incidents earns more trust from me than one claiming to have never had a single one.
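The in-transit part of the encryption check can be verified from the outside. Here is a minimal sketch using Python's standard `ssl` module, not Trust-Vault's actual tooling: the function names are hypothetical, and the host you pass in is whatever endpoint the vendor exposes. The handshake only succeeds if the server accepts TLS 1.2 or newer and presents a valid certificate.

```python
import socket
import ssl
from typing import Optional


def strict_tls_context() -> ssl.SSLContext:
    """Client context that refuses anything older than TLS 1.2."""
    ctx = ssl.create_default_context()  # certificate and hostname checks are on by default
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def negotiated_tls_version(host: str, port: int = 443, timeout: float = 5.0) -> Optional[str]:
    """Return the protocol version negotiated with the server, e.g. 'TLSv1.3'.

    Raises ssl.SSLError if the server cannot meet the TLS 1.2 minimum
    or if its certificate does not validate.
    """
    ctx = strict_tls_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()
```

A check like this says nothing about encryption at rest, which cannot be observed from outside and has to be confirmed through documentation, certifications or audit reports.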
4. Transparency — 25 points
A transparent vendor inspires confidence and makes the buyer's job far easier. I rate the clarity of access conditions (no hidden fees, cancellation terms you can actually read), documentation quality (guides, API docs, a public changelog), code openness where it applies, how regularly they communicate (blog, public roadmap) and how legible the terms of service are. A dated changelog at every release is worth a lot more to me than a fuzzy commercial roadmap.
The formula
The Trust Score is the straight sum of the four pillars. Final score: a whole number between 0 and 100. I recalculate it every week to fold in fresh data — verified reviews, policy changes, updated certifications, public incidents. No hidden weighting, no secret coefficient. If a vendor drops 5 points on Security, the score drops 5 points. End of story.
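The arithmetic is simple enough to sketch in a few lines of Python. The class and function names below are illustrative assumptions, and the pillar values are made-up sample numbers, not a real evaluation.

```python
from dataclasses import dataclass

PILLAR_MAX = 25  # each pillar is marked out of 25 points


@dataclass(frozen=True)
class PillarScores:
    privacy: int
    reliability: int
    security: int
    transparency: int

    def __post_init__(self) -> None:
        # Reject any pillar outside the 0-25 range.
        for name, value in vars(self).items():
            if not 0 <= value <= PILLAR_MAX:
                raise ValueError(f"{name} must be between 0 and {PILLAR_MAX}, got {value}")


def trust_score(pillars: PillarScores) -> int:
    """Straight sum of the four pillars: no weighting, no coefficient."""
    return pillars.privacy + pillars.reliability + pillars.security + pillars.transparency


# Hypothetical tool before and after losing 5 points on Security.
before = PillarScores(privacy=20, reliability=18, security=22, transparency=15)
after = PillarScores(privacy=20, reliability=18, security=17, transparency=15)

print(trust_score(before))  # 75
print(trust_score(after))   # 70
```

Because the sum is unweighted, a change on any one pillar moves the total by exactly the same amount.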
How to read a score
| Range | Level | My reading |
|---|---|---|
| 85-100 | Excellent | Maximum trust, strong across all 4 pillars |
| 70-84 | Very good | Solid guarantees, minor areas to improve |
| 50-69 | Good | Reliable, but identified gaps on 1 or 2 pillars |
| 25-49 | Average | A few positive signals, significant weaknesses |
| 1-24 | Low | Few guarantees, use with caution |
| 0 | Not rated | Not enough data to calculate a score |
For sensitive professional use, I never go below 70. For personal exploration, a more modest score can be fine — as long as you're going in with eyes open about the trade-offs.
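The reading grid above translates directly into a lookup. This is a minimal sketch with hypothetical function names. The thresholds come from the table, and the 70-point floor is the rule of thumb for sensitive professional use.

```python
def score_level(score: int) -> str:
    """Map a 0-100 Trust Score to its level in the reading grid."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score == 0:
        return "Not rated"  # 0 means not enough data, not a bad tool
    if score >= 85:
        return "Excellent"
    if score >= 70:
        return "Very good"
    if score >= 50:
        return "Good"
    if score >= 25:
        return "Average"
    return "Low"


def ok_for_sensitive_use(score: int) -> bool:
    """Rule of thumb from the text: never go below 70 for sensitive professional use."""
    return score >= 70


print(score_level(72), ok_for_sensitive_use(72))  # Very good True
print(score_level(64), ok_for_sensitive_use(64))  # Good False
```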
My process, in 4 steps
Every tool goes through the same steps before it gets a Trust Score:
- Submission or discovery: the tool lands in my evaluation queue, either because someone flagged it to me or because I picked it up during active monitoring.
- Expert analysis: I work through the 15+ criteria across the 4 pillars, reading the official documentation, testing the tool whenever that's possible, and going through the terms of service in full.
- Community verification: verified user reviews get folded into the score with logarithmic weighting, so a flood of lukewarm reviews can't drown out a handful of genuinely detailed ones.
- Continuous monitoring: weekly tracking of changes (policy, certifications, incidents), with the score automatically recalculated.
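The exact weighting formula is not spelled out here, so the following is only one plausible way to implement logarithmic weighting. The confidence given to the community rating grows with the logarithm of the review count, and the rating is pulled toward a neutral prior when volume is low. The 1-5 rating scale, the saturation point of 1,000 reviews, the neutral prior of 3.0 and the use of mean rating as a stand-in for review quality are all assumptions made for illustration.

```python
import math


def review_volume_weight(n_reviews: int, saturation: int = 1000) -> float:
    """Weight in [0, 1] that grows with log(1 + n) and caps at `saturation` reviews."""
    if n_reviews <= 0:
        return 0.0
    return min(1.0, math.log1p(n_reviews) / math.log1p(saturation))


def weighted_satisfaction(mean_rating: float, n_reviews: int, prior: float = 3.0) -> float:
    """Blend the mean rating (assumed 1-5 scale) with a neutral prior by volume weight."""
    w = review_volume_weight(n_reviews)
    return w * mean_rating + (1.0 - w) * prior


# Hypothetical tools: many lukewarm reviews versus a few strong ones.
lukewarm = weighted_satisfaction(mean_rating=3.2, n_reviews=1000)
detailed = weighted_satisfaction(mean_rating=4.6, n_reviews=50)
print(detailed > lukewarm)  # True
```

With a logarithmic curve, going from 50 to 500 reviews raises the weight by far less than a factor of ten, which is what keeps sheer volume from dominating.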
My independence commitments
The integrity of my evaluation is Trust-Vault's core asset. There are three commitments I spell out to every vendor who reaches out to me:
- No sponsored ranking: no vendor can pay to bump up their score. When someone offers to "talk about the ranking", the answer is always, unfailingly, no.
- Affiliate disclosure: my affiliate links are visually flagged and have zero influence on the score. A tool I dislike can carry an affiliate link, and the other way around too.
- Editorial independence: the people writing are kept separate from the commercial side of the site. In practice, I keep the two hats apart myself whenever I write up a profile.
How I verify reviews
Because user reviews feed directly into the score, vetting them is critical. I rely on four safeguards: mandatory authentication via a verified email, automatic detection of suspicious patterns (reviews posted in bursts from the same IP, copy-pasted phrasing), human moderation before anything goes live, and community flagging with a 48-hour review window. On top of that comes the logarithmic weighting I mentioned earlier: a tool with 1,000 lukewarm reviews does not mechanically beat a tool with 50 detailed ones.
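Burst detection is the easiest of these patterns to illustrate. The sketch below flags any IP address that posts several reviews inside a sliding time window. The one-hour window, the threshold of three reviews and the input shape are assumptions, not the production rules, and the IP addresses are reserved documentation addresses.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Iterable, Set, Tuple


def flag_burst_ips(
    reviews: Iterable[Tuple[str, datetime]],
    window: timedelta = timedelta(hours=1),
    threshold: int = 3,
) -> Set[str]:
    """Return the IPs that posted at least `threshold` reviews within any `window`."""
    by_ip = defaultdict(list)
    for ip, posted_at in reviews:
        by_ip[ip].append(posted_at)

    flagged = set()
    for ip, times in by_ip.items():
        times.sort()
        start = 0
        for end, current in enumerate(times):
            # Slide the left edge forward until the window fits again.
            while current - times[start] > window:
                start += 1
            if end - start + 1 >= threshold:
                flagged.add(ip)
                break
    return flagged


t0 = datetime(2026, 1, 5, 9, 0)
sample = [
    ("203.0.113.7", t0),
    ("203.0.113.7", t0 + timedelta(minutes=10)),
    ("203.0.113.7", t0 + timedelta(minutes=20)),
    ("198.51.100.4", t0),
    ("198.51.100.4", t0 + timedelta(days=1)),
]
print(flag_burst_ips(sample))  # {'203.0.113.7'}
```

A flag like this is only a signal for the human moderation step, since shared office or campus networks can legitimately produce several reviews from one address.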
How to actually use it
A few rules I follow myself and recommend to others. A high score is a great starting point, but it never lets you off the hook for your own due diligence in your specific context — a tool rated 90 for marketing work might be completely wrong for your healthcare use case. An average score doesn't disqualify a tool either: it just tells you to dig deeper. For sensitive professional use, I stay above 70. For personal or exploratory use, more modest scores can do the job.
The Trust Score is a reading grid, not a verdict. Cross-reference it with your own business constraints (sector, company size, regulatory demands), the feedback from your DPO and your CISO, and a limited proof of concept before you roll anything out at scale.
A living methodology
My methodology keeps evolving. I tune it regularly based on community feedback, regulatory shifts (GDPR, the AI Act — see EU Regulation 2024/1689), and emerging threats or use cases. When the AI Act spelled out the GPAI obligations, I added three criteria to the Transparency pillar. Every major change is documented publicly.
If you publish an AI tool and would like it evaluated, get in touch through the Trust-Vault contact form. Every submission goes through the exact same process — no financial strings attached, no fast-tracking on offer.
---
Sources: EU Regulation 2024/1689 (AI Act); CNIL, generative AI recommendations 2024; AICPA SOC 2 framework; ISO 27001 standard; HuggingFace model cards methodology.
Further reading
Compare AI tools
Compare tools by use case, category, and trust signals.
Trust Ranking
Review reliability, transparency, and product maturity signals.
GDPR and AI tools: compliance guide
A practical framework for checking data, vendors, DPAs, transfers, and AI governance.
AI security: protecting data
A method for assessing risks, access, confidentiality, and sensitive uses.
Official sources and method
Trust-Vault combines field usage with institutional sources to strengthen verification, compliance, and comparison clarity.
- AI Act policy overview - European Commission. Official overview of the European framework for safe, human-centric AI.
- AI and GDPR recommendations (Recommandations IA et RGPD) - CNIL. French authority guidance on AI system development and GDPR compliance.
- AI Risk Management Framework - NIST. US federal framework for assessing and managing AI risks.
- Artificial Intelligence - CISA. US federal resources on AI security, governance, and risk.
Laurent Duplat
Editor-in-Chief — Trust-Vault