
Validating with Confidence: Building Defensible Candidate Key Assessments

  • kelly93055
  • Feb 12
  • 3 min read

Updated: Feb 12

Accreditation and state approval conversations rarely hinge on whether an assessment exists; they hinge on whether it’s defensible. For Educator Preparation Programs (EPPs), a “key assessment” must do more than align to standards. It must show credible evidence that (1) the instrument measures what it claims to measure, (2) scores are consistent and interpretable, and (3) different evaluators can apply the scoring process with acceptable agreement.


Accreditors such as CAEP and AAQEP, along with NASDTEC and state agencies (e.g., GaPSC), require EPPs to submit valid, reliable evidence of program impact, candidate proficiency, and continuous improvement.


The good news? Building a defensible validation process doesn’t have to be overwhelming.

What follows is a practical look at how EPPs can gather and document validity and reliability evidence patterned after the EPiC™ Key Assessment.



Step 1: Design a Clear, Defensible Validation Plan

During the four-year development of the EPiC Key Assessment, the Georgia-based EPiC team focused on key measurement properties that would lay the groundwork for future adjustments to the 10-item rubric and set in motion an annual validation cycle to maintain its statistical integrity.


The most recent EPiC validation studies (2023 & 2025) were triggered by a reality most EPPs recognize: once a rubric is revised (in EPiC’s case, adjustments to rubric categories and structure), it is essential to re-establish validity and reliability evidence for the updated version. A validation protocol was subsequently designed to re-evaluate the “core four” statistical components for the EPiC Key Assessment: content validity, construct validity, internal consistency (reliability), and inter-rater reliability.


Step 2: Establish Content and Construct Validity

Two foundational questions guided this work:

  • Content validity: Are the rubric components representative, important, and clear?

  • Construct validity: Do score patterns support the intended underlying structure?


EPiC used a conventional structured content review process in four stages:

  • Selecting content review experts,

  • Conducting content validation,

  • Collecting expert ratings for each rubric,

  • Calculating the Content Validity Index (CVI).


Documenting content validity through a CVI process produces the kind of evidence accreditors appreciate because it shows (a) structured expert judgment, (b) transparent decision rules, and (c) traceable improvement actions when items fall below expectations. A key "defensibility move" here was that EPiC didn't just report a global average; it identified specific rubric targets for improvement to strengthen representativeness and clarity. Across rubrics, EPiC reported strong CVI results overall.
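To make the calculation concrete, here is a minimal sketch of a CVI computation in Python. The ratings, rubric names, and the 0.78 benchmark (a commonly cited item-level criterion for small expert panels) are illustrative assumptions, not EPiC's actual data or decision rules.

```python
import numpy as np
import pandas as pd

# Hypothetical ratings: 8 experts rate 10 rubric items on a 4-point
# relevance scale (1 = not relevant ... 4 = highly relevant).
rng = np.random.default_rng(42)
ratings = pd.DataFrame(
    rng.integers(2, 5, size=(8, 10)),
    columns=[f"rubric_{i + 1}" for i in range(10)],
)

# Item-level CVI (I-CVI): proportion of experts rating the item 3 or 4.
i_cvi = (ratings >= 3).mean(axis=0)

# Scale-level CVI, averaging method (S-CVI/Ave): mean of the I-CVIs.
s_cvi_ave = i_cvi.mean()

# Flag items below an illustrative I-CVI benchmark (0.78) so they can
# be targeted for revision rather than hidden in a global average.
needs_review = i_cvi[i_cvi < 0.78]

print(i_cvi.round(2))
print(f"S-CVI/Ave = {s_cvi_ave:.2f}")
print("Items flagged for revision:", list(needs_review.index))
```

The point of the flagging step is the same defensibility move described above: the item-level results, not just the scale-level average, drive documented follow-up actions.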


The EPiC Key Assessment tested construct validity using principal component analysis (PCA) and supporting diagnostics, including:

  • A correlation matrix review,

  • Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy, and

  • Bartlett’s Test of Sphericity to confirm the correlation structure was appropriate for component/factor extraction.


EPiC’s results supported a two-component structure consistent with the assessment’s intended design: Part A: Planning & Instruction rubrics and Part B: Assessment & Analysis rubrics.
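For readers who want to run the same diagnostics on their own data, the sketch below applies Bartlett's test, the KMO measure, and a two-component PCA to a generic score file, assuming the factor_analyzer and scikit-learn packages. The file name and item layout are hypothetical; this is not the EPiC team's analysis code.

```python
import pandas as pd
from factor_analyzer.factor_analyzer import calculate_bartlett_sphericity, calculate_kmo
from sklearn.decomposition import PCA

# scores: one row per candidate, one column per rubric item (shared scale).
scores = pd.read_csv("key_assessment_scores.csv")  # placeholder file name

# Bartlett's Test of Sphericity: is the correlation matrix significantly
# different from an identity matrix (i.e., is there structure to extract)?
chi_square, p_value = calculate_bartlett_sphericity(scores)

# Kaiser-Meyer-Olkin measure of sampling adequacy; values around 0.70 or
# higher are usually treated as adequate for component/factor extraction.
kmo_per_item, kmo_overall = calculate_kmo(scores)

print(f"Bartlett chi-square = {chi_square:.1f}, p = {p_value:.4f}")
print(f"Overall KMO = {kmo_overall:.2f}")

# Principal component analysis: how much variance do the first two
# components explain, and how do items weight onto them?
pca = PCA(n_components=2).fit(scores)
print("Explained variance ratio:", pca.explained_variance_ratio_.round(2))

weights = pd.DataFrame(
    pca.components_.T, index=scores.columns, columns=["PC1", "PC2"]
)
print(weights.round(2))
```

In a two-part instrument like the one described above, a defensible result is one where the component weights group the Part A items together and the Part B items together.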


Step 3: Check Internal Consistency and Inter-Rater Reliability

Key questions surrounding internal consistency and inter-rater reliability included:

  • Internal Consistency (Reliability): Do rubric components cohere as a set?

  • Inter-rater Reliability: Do independent evaluators produce consistent results?


EPiC calculated Cronbach’s alpha, a measure of internal consistency, separately for Part A and Part B of the EPiC Key Assessment. Calculating alpha for each part rather than for the instrument as a whole is an important design choice because the EPiC Key Assessment is intentionally multi-construct. Results provided evidence that the rubric sets were sufficiently consistent for program use, especially when paired with the construct validity evidence supporting the two-part structure (i.e., Part A: Planning & Instruction and Part B: Assessment & Analysis).
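Cronbach's alpha is compact enough to compute directly. The sketch below implements the standard formula and applies it to each part separately, assuming a hypothetical score file whose columns are prefixed A_ and B_ to mirror the two-part structure.

```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for a set of rubric items (rows = candidates)."""
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    k = items.shape[1]
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical layout: Part A and Part B items identified by column prefix.
scores = pd.read_csv("key_assessment_scores.csv")  # placeholder file name
part_a = scores[[c for c in scores.columns if c.startswith("A_")]]
part_b = scores[[c for c in scores.columns if c.startswith("B_")]]

print(f"Part A alpha = {cronbach_alpha(part_a):.2f}")
print(f"Part B alpha = {cronbach_alpha(part_b):.2f}")
```

Reporting alpha per part keeps the reliability claim aligned with the construct validity claim: each coefficient describes one intended construct, not a blend of the two.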


EPiC’s study design emphasized rater preparation before inter-rater reliability testing: raters received training aligned to the EPiC Key Assessment evidence markers, practiced on benchmark performances, scored independently, and reconciled inconsistencies to sharpen shared interpretation. Inter-rater reliability was then quantified using Intraclass Correlation Coefficients (ICC) for each rubric. The ICC values for the individual rubrics, as well as for the combined Part A & B rubrics of the EPiC Key Assessment, indicated good to excellent inter-rater reliability.
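A minimal sketch of an ICC calculation using the pingouin package is shown below. The file and column names are hypothetical, and the ICC form to report (here, two-way random effects) should follow the study design rather than this example.

```python
import pandas as pd
import pingouin as pg

# Long-format data for one rubric: one row per (candidate, rater) pair.
# Column names here are hypothetical.
df = pd.read_csv("rater_scores_long.csv")  # columns: candidate, rater, score

icc = pg.intraclass_corr(
    data=df, targets="candidate", raters="rater", ratings="score"
)

# ICC2 / ICC2k (two-way random effects, single and average measures) are
# common choices when raters are treated as a random sample; the output
# includes 95% confidence intervals and significance tests for each form.
print(icc.set_index("Type").loc[["ICC2", "ICC2k"], ["ICC", "CI95%", "pval"]])
```

Running this rubric by rubric, and again on the combined Part A & B scores, yields the kind of granular reliability table described in the next paragraph.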


This is exactly the kind of reporting that strengthens defensibility: rubric-by-rubric reliability, confidence intervals, and statistical significance, rather than a single global claim.


Step 4: Summarize Findings for Accreditation Reports

Accreditors aren’t looking for raw data dumps. They want:

  • Clear summaries

  • Evidence of reflection

  • Documentation of follow-up actions


Well-structured validation summaries explain what was examined, what was learned, and how the program responded. EPiC-aligned documentation makes it easier to translate technical findings into reviewer-friendly language.


Validation as Confidence—Not Compliance

At its best, validation is not about satisfying an external requirement. It’s about knowing your assessments are fair, meaningful, and defensible.


By adopting structured, key assessment models such as the EPiC Key Assessment, EPPs can move from reacting to accreditation questions to answering them with confidence.


Join Us!

Next month’s webinar, Turning Key Assessment Data into Reviewer-Ready Evidence, takes a closer look at EPiC™ validation practices and how EPPs document validity and reliability in a clear, consistent, and accreditation-ready manner.


