The evolution of Keystone: Earthly’s best-in-class nature project assessment

How Earthly's project assessment framework grew from 43 indicators in 2020 to 168 in 2025 to become the most comprehensive nature project assessment on the market, and why every version matters.

Jenny Hyndman, Earthly


14 May, 2026


The voluntary carbon market has changed a lot since 2020. Consequently, so has the way we assess projects.

What started as a groundbreaking tool for screening carbon projects holistically has evolved through four distinct versions into a rigorous, evidence-based framework that evaluates over 160 quality indicators across carbon, biodiversity, and people, and applies to a wide range of nature-based solutions.

This blog walks through that journey: what drove each iteration, what each version was built to catch, and why the evolution of Keystone reflects the evolution of the market itself.

TL;DR - Key takeaways

  • Keystone has gone through four versions since 2020

  • Every version remains valid: evolution reflects rising standards 

  • The framework has evolved from a detailed checklist to a dual-axis scoring system

  • Keystone 3.0 introduces 168 context-specific indicators with TNFD alignment and red-flag detection

  • Outputs are now audit-ready for disclosure demands including CSRD and TNFD

Why assessment frameworks have to evolve

In 2020, the voluntary carbon market was growing fast and scrutiny was growing with it. Businesses wanted to invest in nature, but most had no structured way to evaluate whether a project was doing what it said it would. Quantifiable carbon dominated; biodiversity and community outcomes were hard to measure, so they were often an afterthought.

Crucially, there were no standardised due diligence tools. Earthly's response was to establish a leading framework for the market that would assess a project’s impact on carbon, nature, and people together.

We believe it set an integrity standard no one else had attempted. It laid the foundation for everything that came next, with each subsequent version building on proven thinking while responding to new science, new regulation, and a market that continually raises the bar.


“Independent, science-based assessments” were ranked as the most influential factor improving confidence in nature-based solutions investments in Earthly’s January 2026 survey of sustainability decision-makers.

Why independent, third-party verification is absolutely essential

In our recent survey (January 2026) of sustainability decision-makers, respondents ranked "independent, science-based assessments" as the most influential factor in improving their confidence to invest in nature projects. Projects of all types, from carbon credits and biodiversity credits to tree planting and marine-based innovations, present the evidence that their relevant standard, methodology or registry requires.

Keystone’s role as an independent assessor is to look beyond existing qualifications, verifications or standards of a project and score it completely objectively, with people, climate and biodiversity as equal priorities.

Keystone 0.0

The industry's first holistic project assessment, 2020

The first iteration of our project assessment looked at 43 core quality indicators: a comprehensive scope for its time and a genuine first in an industry that was still thinking almost exclusively in terms of tonnes of carbon. It introduced something the market was missing: a whole-system perspective, combining carbon, biodiversity, and social outcomes in a single tool before any standard required it. It assessed biodiversity across multiple levels including genetic, species, community, ecosystem and connectivity, and identified tangible social benefits like jobs, land rights and inclusion of Indigenous Peoples and Local Communities (IPLCs).

Its red-flag system surfaced risks clearly: overestimated emissions, monoculture restoration, exclusion of women from governance. For buyers navigating an opaque market, it provided a structured, expert-led lens that simply didn't exist elsewhere. Like any pioneering tool, it had boundaries. Because it used a binary pass/fail system rather than graduated scoring, it offered limited comparability between projects.

As the science of project evaluation advanced and market expectations grew, those limitations became the brief for the next version.

Rainforest conservation Malaysia Kuamut

“The world is facing an interconnected crisis of unprecedented biodiversity loss, food insecurity, and environmental degradation that can no longer be tackled through fragmented and piecemeal solutions.” IPBES assessment commentary on holistic approaches to nature and climate challenges.

Keystone 1.0

Bringing scientific rigour to due diligence, June 2022

By mid-2022, trust in carbon markets was under pressure. High-profile project failures involving weak additionality, inflated baselines and governance gaps were attracting serious scrutiny. The market needed more than a broad scan. It needed a reporting-ready framework grounded in science.

Our first revision of Keystone delivered exactly that. The indicator count grew to 57, reflecting both expanded knowledge of what good project design actually looks like and the growing complexity of nature markets. It introduced maturity scoring and confidence scoring, aligned with established scientific standards such as those of the IUCN, and created a clear distinction between a project making a claim and a project providing verified evidence of impact.

That distinction became a cornerstone of the framework going forward. It was also, at the time, ahead of where most of the industry was operating. Keystone 1.0 was built for the project type most common in the market: high-biodiversity, rural, Global South land-use projects where carbon, biodiversity, and community outcomes intersect.

As the market diversified into new geographies and project types, that focus became the design brief for Keystone 2.0.

Keystone 2.0

Scalable, comparable, decision-ready, September 2022

Keystone 2.0 arrived three months after Keystone 1.0 with a clear mission: make the best-in-class framework work at scale across projects worldwide. Research into assessment design and feedback from real-world use showed that depth alone wasn't enough. Assessors needed structure, and buyers needed a way to compare projects against each other with confidence. The indicator count expanded to 106 and was organised into clear sections within each pillar.

The dual-axis scoring matrix was formalised: a four-level Maturity scale (from ‘not addressed’ to ‘best practice’) crossed with a four-level Confidence scale (from ‘no explicit evidence’ to ‘verified by trusted third party’). A minimum passing score made go/no-go decisions objective. A ‘spider diagram’ data visualisation made performance gaps visible at a glance.
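To make the matrix idea concrete, here is a minimal illustrative sketch in Python. It is not Earthly's actual scoring method. The two end points of each scale come from the description above; the intermediate level names, the rule for combining the two axes (multiplying the levels and normalising to 0-1), the averaging across indicators, and the 0.5 pass threshold are all assumptions made purely for illustration.

```python
from enum import IntEnum


class Maturity(IntEnum):
    NOT_ADDRESSED = 0   # lowest level, named in the text
    EMERGING = 1        # hypothetical intermediate label
    ESTABLISHED = 2     # hypothetical intermediate label
    BEST_PRACTICE = 3   # highest level, named in the text


class Confidence(IntEnum):
    NO_EXPLICIT_EVIDENCE = 0   # lowest level, named in the text
    SELF_REPORTED = 1          # hypothetical intermediate label
    DOCUMENTED = 2             # hypothetical intermediate label
    THIRD_PARTY_VERIFIED = 3   # highest level, named in the text


# Highest possible cell value in the 4 x 4 matrix (3 * 3 = 9).
MAX_CELL = int(Maturity.BEST_PRACTICE) * int(Confidence.THIRD_PARTY_VERIFIED)

# Hypothetical minimum passing score; the real threshold is not published here.
PASS_THRESHOLD = 0.5


def indicator_score(maturity, confidence):
    """Assumed combination rule: product of the two levels, scaled to 0-1."""
    return (int(maturity) * int(confidence)) / MAX_CELL


def project_score(ratings):
    """Assumed aggregation: unweighted mean over all rated indicators."""
    if not ratings:
        raise ValueError("at least one rated indicator is required")
    return sum(indicator_score(m, c) for m, c in ratings) / len(ratings)


def passes(ratings, threshold=PASS_THRESHOLD):
    """Go/no-go decision against the (hypothetical) minimum passing score."""
    return project_score(ratings) >= threshold


# One indicator that is best practice and verified, one that claims best
# practice but offers no evidence: the unevidenced claim contributes nothing.
example = [
    (Maturity.BEST_PRACTICE, Confidence.THIRD_PARTY_VERIFIED),
    (Maturity.BEST_PRACTICE, Confidence.NO_EXPLICIT_EVIDENCE),
]
print(project_score(example), passes(example))  # prints: 0.5 True
```

Under this assumed rule, a strong claim with no supporting evidence scores zero. That is one simple way to encode the distinction between a project making a claim and a project providing verified evidence of impact.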

This was the version that made meaningful portfolio analysis possible. It also introduced Systemic Leadership as an assessment section, asking whether a project actively promotes best practice across the nature-based solutions sector, not just within its own boundaries.

At a time when the market was reckoning with double-counting problems introduced by the Paris Agreement and growing criticism of REDD+ crediting, Keystone 2.0’s emphasis on additionality, leakage and baseline integrity made it the most commercially useful assessment tool available.


Keystone 3.0 introduces 168 context-specific project assessment indicators across carbon, biodiversity, and people - built for the disclosure era and designed to assess nature projects with unprecedented depth and rigour.

Keystone 3.0

Built for the disclosure era, September 2025

The regulatory landscape of 2025 is fundamentally different from 2020.

CSRD, TNFD and the ICVCM Core Carbon Principles have moved from emerging frameworks to active corporate requirements. Corporate buyers now need disclosure-ready data, not just a quality signal, and the assessment framework had to evolve to match.

Keystone 3.0 has 168 quality indicators across three pillars: carbon, biodiversity and people, each redesigned with context-specific questions that adapt based on project type and nuances of the environment. The carbon pillar now covers baseline, additionality, leakage, monitoring, reporting and verification (MRV), accuracy and permanence, with a custom version for biodiversity and tree planting projects that recognises how diverse project types and units of sale require distinct approaches to evaluating carbon impact.

The biodiversity pillar has expanded significantly. It now covers eight distinct sections: benchmark, beyond business-as-usual, suitability and risks, MRV, impact, water, spillover, and resilience. TNFD mapping runs throughout the biodiversity pillar, connecting the distinct sections to nature-related disclosure requirements. Buyers can use Keystone outputs directly in nature-related disclosure reporting without needing to translate assessment results into a different framework.

The people pillar introduces deeper mapping across livelihoods, gender equity, human rights, education, health, and local governance. Every assessment now begins with an independent contextual analysis of land tenure, Indigenous rights, and community governance structures. SDG contributions are mapped at indicator level, not loosely claimed. The red-flag system, introduced in Keystone 0.0 and retired in Keystone 1.0 as scoring nuance took precedence, has been reintroduced in response to a market demanding clearer risk signals.

It surfaces issues like greenwashing in carbon claims, violations of free, prior and informed consent (FPIC), benefit-sharing inequity, and community-level risks including gender-based violence (GBV) and land tenure conflicts. Detailed scoring and red-flag signalling now work together: nuance for those who need depth, clarity for those who need speed.

The result is a framework that is registry-agnostic and geography-agnostic, capable of assessing virtually any nature project type, anywhere in the world, to a standard the market has never previously had access to.

What's stayed the same

Across all four versions, the core intent hasn't changed: to give businesses an independent, evidence-based way to evaluate nature projects that goes beyond marketing claims. The pillars of carbon, biodiversity and people have always been there. The commitment to identifying both what a project does well and what it gets wrong has always been there. What's evolved is the precision, the comparability, and the alignment with a regulatory environment that now demands verifiable proof. Keystone has always been the best tool available for its moment. Keystone 3.0 is built for this one.


The Boothby Wildland project was assessed under Keystone 3.0. Only around 10% of projects meet Earthly’s assessment standards, which gives businesses greater confidence that they are investing in a rigorously vetted, high-integrity nature project delivering measurable benefits for climate, biodiversity, and local communities.

Q&As

Are projects assessed using older versions of Keystone still valid?

All versions of Keystone represent a rigorous, best-in-class assessment at the time they were conducted. The framework has evolved in response to rising market standards and new regulatory requirements, not because earlier versions were flawed. A project assessed under Keystone 1.0 was assessed to the highest standard the market had in 2022, and those results remain meaningful. Buyers should consider the version used when interpreting results, but should not discount older assessments.

What's the most significant change between Keystone 2.0 and 3.0?

The biggest shifts are in scope, specificity and disclosure readiness. Keystone 3.0 introduces context-specific questions that adapt based on project type, expands the biodiversity pillar to include water, resilience and spillover effects, and adds deep social-impact mapping across livelihoods, gender equity and health. It also maps outputs directly to TNFD metrics and reintroduces a structured red-flag system, making it significantly more powerful for corporate buyers who need both detail and clarity.

How does Keystone 3.0 help businesses meet CSRD or TNFD requirements?

TNFD metrics are mapped throughout Keystone 3.0’s biodiversity pillar. As each project is assessed, relevant metrics are extracted directly, giving buyers the quantifiable data they need for their TNFD reporting. On the social side, the detailed indicators around consent, Indigenous rights, and labour rights align directly with the due diligence requirements of the EU's CSRD and the UN Guiding Principles on Business and Human Rights. Keystone 3.0 was built so that the connection to disclosure frameworks isn't an add-on; it's embedded.

Why did Earthly bring back the red-flag system that was removed after Keystone 0.0?

The red-flag system in Keystone 0.0 was an effective way to surface project risks clearly and quickly. As the framework matured through Keystone 1.0 and Keystone 2.0, the focus shifted toward nuanced, evidence-based scoring, and red flags were retired in favour of that depth. As the market matured further, it became clear that corporate buyers needed both: the full scoring picture for due diligence, and clear risk signals for faster decision-making. Keystone 3.0 provides both.

What types of projects can Keystone 3.0 assess?

Keystone 3.0 is designed to be registry-agnostic and ecosystem-agnostic, with context-specific questions that adapt to different environments, geographies and community contexts. It includes dedicated sections for different project types, including a specific carbon section for biodiversity and tree planting projects, so that all nature-based solutions are assessed against criteria that actually fit them, rather than being forced into a single mould.