Index formulas (v1)

Versioned formulas for the derived indexes DQI, CCI, AFuI, CRI, OVI, AFI, and ACI (indexes/v1)

Derived index formulas (v1)

This document defines version 1 formulas for OpenDecisions derived indexes ("decision labels"). All derived outputs must include:

  • value: a number in 0–100, or a structured value (e.g. band + interval for ACI)
  • uncertainty: at least a 90% confidence interval (CI90), or a defined confidence rule
  • explanation: drivers, missingness, warnings
  • citations: Source references (computedFrom)

Formula version: indexes/v1
Schema version: odg/v1
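
As an illustration of the required output fields, the sketch below models one derived output as a Python dataclass. It is not the actual DerivedIndex schema type: the class name DerivedIndexOutput, the field types, and the validation rules are assumptions made for this example; only the four required parts (value, uncertainty, explanation, citations) and the two version strings come from the text above.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


@dataclass
class DerivedIndexOutput:
    """Hypothetical container for one derived index result."""
    index: str                                   # e.g. "CCI"
    value: Union[float, dict]                    # 0-100, or structured (band + interval)
    computed_from: list                          # citations: source references
    ci90: Optional[tuple] = None                 # (low, high) 90% interval
    confidence_rule: Optional[str] = None        # alternative to ci90
    explanation: dict = field(default_factory=dict)  # drivers, missingness, warnings
    formula_version: str = "indexes/v1"
    schema_version: str = "odg/v1"

    def __post_init__(self):
        # Numeric values must stay on the 0-100 scale.
        if isinstance(self.value, (int, float)) and not 0 <= self.value <= 100:
            raise ValueError("numeric value must be in 0-100")
        # Either a CI90 or a named confidence rule is required.
        if self.ci90 is None and self.confidence_rule is None:
            raise ValueError("uncertainty is required")
        # At least one source reference is required.
        if not self.computed_from:
            raise ValueError("citations are required")


out = DerivedIndexOutput(index="CCI", value=72, computed_from=["source-1"],
                         ci90=(71.4, 72.6))
```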


Versioning

  • Schema: odg/v1
  • Formula version: indexes/v1
  • (Optional later) Model version: e.g. aci_model/v1

Any change to formulas must:

  • Bump formula_version
  • Update release notes / changelog
  • Include migration notes if the semantic meaning of an index changes

Notation

  • clamp(x, a, b): clamp x into the interval [a, b]
  • sigmoid(z) = 1 / (1 + exp(-z))
  • zscore(x): computed within an appropriate peer group (e.g. same level and Carnegie band)
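
The three helpers above can be sketched in Python as follows. The zscore signature (a value plus an explicit list of peer values) and the use of the population standard deviation are assumptions; the document only says the z-score is computed within an appropriate peer group.

```python
import math
from statistics import mean, pstdev


def clamp(x: float, a: float, b: float) -> float:
    """Clamp x into the closed interval [a, b]."""
    return max(a, min(b, x))


def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)          # safe: z < 0, so exp(z) < 1
    return e / (1.0 + e)


def zscore(x: float, peer_values: list) -> float:
    """Standardize x against a peer group (assumption: population std dev)."""
    mu = mean(peer_values)
    sd = pstdev(peer_values)
    # A peer group with no spread carries no signal; return 0 by convention.
    return 0.0 if sd == 0 else (x - mu) / sd
```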

1. CCI: Competitiveness / Crowd Index (0–100)

Question: "How competitive is this cycle independent of a given applicant?"

Inputs (cycle-level when available):

  • applications, admitted, enrolled
  • Optional: year-over-year application growth

Reference formula:

acc = admitted / applications
acc_score = 1 - acc
yield = enrolled / admitted
growth_z = zscore(log(applications) - log(prev_year_applications))  // clipped
yield_z = zscore(yield)
raw = 0.55*acc_score + 0.25*sigmoid(growth_z) + 0.20*sigmoid(yield_z)
CCI = round(100 * raw)
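
A minimal sketch of the point estimate, assuming growth_z and yield_z have already been standardized against the peer group and passed in. The clipping bounds for growth_z are not specified in this document, so no clipping is applied here; the function name cci and its signature are illustrative.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def cci(applications: int, admitted: int, growth_z: float, yield_z: float) -> int:
    """Competitiveness index from cycle counts and precomputed z-scores."""
    acc = admitted / applications      # acceptance rate
    acc_score = 1 - acc                # lower acceptance -> more competitive
    raw = 0.55 * acc_score + 0.25 * sigmoid(growth_z) + 0.20 * sigmoid(yield_z)
    return round(100 * raw)


# 10% acceptance rate with peer-average growth and yield (both z-scores 0):
# raw = 0.55*0.9 + 0.25*0.5 + 0.20*0.5 = 0.72
example = cci(applications=50_000, admitted=5_000, growth_z=0.0, yield_z=0.0)
```

Because the three weights sum to 1 and each component lies in [0, 1], the result always stays within 0–100.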

Uncertainty: Compute binomial confidence intervals for acc and yield, then propagate them through the formula (e.g. by bootstrap resampling).
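
One way to propagate that uncertainty is a parametric bootstrap. The sketch below is an assumption-laden illustration, not the reference method: it approximates the binomial sampling error of each rate with a normal distribution, holds growth_z fixed, and takes the peer-group yield mean and standard deviation (yield_mu, yield_sd, hypothetical parameters) as given and nonzero.

```python
import math
import random


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def cci_ci90(applications, admitted, enrolled, growth_z,
             yield_mu, yield_sd, n_draws=2000, seed=0):
    """Approximate 90% interval for the unrounded CCI."""
    rng = random.Random(seed)          # seeded for reproducibility
    acc = admitted / applications
    yld = enrolled / admitted
    acc_se = math.sqrt(acc * (1 - acc) / applications)
    yld_se = math.sqrt(yld * (1 - yld) / admitted)
    draws = []
    for _ in range(n_draws):
        # Normal approximation to binomial sampling error, kept in [0, 1].
        a = min(1.0, max(0.0, rng.gauss(acc, acc_se)))
        y = min(1.0, max(0.0, rng.gauss(yld, yld_se)))
        yield_z = (y - yield_mu) / yield_sd
        raw = 0.55 * (1 - a) + 0.25 * sigmoid(growth_z) + 0.20 * sigmoid(yield_z)
        draws.append(100 * raw)
    draws.sort()
    # 5th and 95th percentiles of the simulated index values.
    return draws[int(0.05 * n_draws)], draws[int(0.95 * n_draws) - 1]


low, high = cci_ci90(50_000, 5_000, 2_000, growth_z=0.0,
                     yield_mu=0.4, yield_sd=0.1)
```

With large application counts the interval is narrow; small programs with few admits produce visibly wider intervals.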


2. DQI: Data Quality Index (0–100)

Question: "How much should you trust what you see?"

Components (each in [0, 1]):

  • staleness: 0 if current; approaches 1 when beyond policy threshold
  • missingness: fraction of required inputs missing for the requested label view
  • auditability: 1 for officially reported data; lower for scraped-only data or unclear provenance
  • inconsistency: cross-source conflicts / unreconciled mismatches

Reference formula:

DQI = 100 - (25*staleness + 45*missingness + 20*(1 - auditability) + 10*inconsistency)
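
The penalty weights sum to 100 (25 + 45 + 20 + 10), so DQI ranges from 0 (every component at its worst) to 100 (every component at its best). A minimal sketch, with the range check added as an assumption about how invalid inputs should be handled:

```python
def dqi(staleness: float, missingness: float,
        auditability: float, inconsistency: float) -> float:
    """Data Quality Index: 100 minus a weighted penalty."""
    components = {
        "staleness": staleness,
        "missingness": missingness,
        "auditability": auditability,
        "inconsistency": inconsistency,
    }
    for name, v in components.items():
        # Each component is defined on [0, 1]; reject anything else.
        if not 0.0 <= v <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {v}")
    penalty = (25 * staleness + 45 * missingness
               + 20 * (1 - auditability) + 10 * inconsistency)
    return 100 - penalty


# Slightly stale, 10% of inputs missing, official source, no conflicts:
# penalty = 25*0.2 + 45*0.1 = 9.5
example = dqi(staleness=0.2, missingness=0.1, auditability=1.0, inconsistency=0.0)
```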

Rules:

  • DQI MUST be returned whenever any other index is returned.
  • DQI MUST influence presentation (warnings) and the widening of reported uncertainty.

3. AFI: Academic Fit Index (0–100)

Question: "How aligned are my academics with the reported admitted range?"

Inputs:

  • Applicant academics (GPA, tests where used)
  • Admitted distributions (often from institution documents; "submitted-only" caveats in test-optional years)

Reference (simple, explainable):

  • Convert each applicant value to a percentile against the admitted distribution: p_gpa in [0, 1], p_test in [0, 1]
  • prereq_match in [0, 1] (optional; requirements checklist)

AFI = round(100 * mean(p_gpa, p_test, prereq_match))
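
A sketch of the calculation, under two assumptions that the reference does not spell out: the percentile is taken empirically from a sample of admitted values (in practice only summary percentiles may be published), and components that are not available (no test score, no prerequisite checklist) are simply left out of the mean.

```python
from statistics import mean
from typing import Optional


def empirical_percentile(value: float, admitted_values: list) -> float:
    """Fraction of admitted values at or below the applicant's value."""
    return sum(v <= value for v in admitted_values) / len(admitted_values)


def afi(p_gpa: float, p_test: Optional[float] = None,
        prereq_match: Optional[float] = None) -> int:
    """Academic Fit Index: mean of the available [0, 1] components, scaled to 0-100."""
    parts = [p for p in (p_gpa, p_test, prereq_match) if p is not None]
    return round(100 * mean(parts))


p = empirical_percentile(3.7, [3.2, 3.5, 3.7, 3.9])   # 3 of 4 values <= 3.7
full = afi(p_gpa=0.6, p_test=0.8, prereq_match=1.0)   # mean = 0.8
gpa_only = afi(p_gpa=0.5)                             # test-optional, no checklist
```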

Note: AFI is an alignment signal with explicit data caveats, not a "worth" score.


4. AFuI: Affordability & Funding Index (0–100)

Question: "Can I afford this, given my budget?"

Inputs:

  • COA components and/or net price distributions
  • User's stated ability to pay

Reference formula:

expected_net_cost = median(net_cost_distribution)   // or another robust estimator
AFuI = 100 * (1 - min(1, expected_net_cost / ability_to_pay))
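
A minimal sketch of the formula. Two guards are added as assumptions, since the reference formula does not cover these cases: a non-positive ability_to_pay returns 0 instead of dividing by zero, and a negative expected net cost is treated as a ratio of 0 so the index cannot exceed 100.

```python
from statistics import median


def afui(net_cost_distribution: list, ability_to_pay: float) -> float:
    """Affordability & Funding Index on a 0-100 scale."""
    if ability_to_pay <= 0:
        return 0.0                      # assumption: nothing is affordable
    expected_net_cost = median(net_cost_distribution)
    ratio = expected_net_cost / ability_to_pay
    ratio = min(1.0, max(0.0, ratio))   # lower bound is an added guard
    return 100 * (1 - ratio)


# Median net cost 30,000 against a 60,000 budget -> ratio 0.5
example = afui([20_000, 30_000, 40_000], ability_to_pay=60_000)
```

An expected cost at or above the stated budget yields 0, which matches the min(1, ...) cap in the reference formula.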

Applicability:

  • If net price is defined for domestic aid regimes only, publish with an applicability warning.
  • Do not claim "international net cost" unless the source supports it.

5. CRI: Completion Risk Index (0–100)

Question: "What's the downside risk of not completing?"

Reference formula:

CRI = 100 * (1 - graduation_rate_T)

  • graduation_rate_T: cohort graduation rate within T years, as a fraction in [0, 1]
  • T = 6 years for the US undergraduate benchmark

Note: Cohort-level signal, not an individual prediction.


6. OVI: Outcome Value Index (0–100)

Question: "What is the expected upside, conservatively?"

Reference (v1-safe):

OVI = scale_0_100(0.6*grad_rate + 0.4*earnings_score)

Rules:

  • If earnings/outcomes are not auditable or comparable, reduce weight and lower DQI.
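
The helper scale_0_100 is not defined in the Notation section. The sketch below assumes that grad_rate and earnings_score are both fractions in [0, 1], so that scaling reduces to clamping, multiplying by 100, and rounding. The earnings_weight parameter is also an assumption: it shows one possible way to "reduce weight" for non-auditable earnings by shifting the remainder to grad_rate.

```python
def scale_0_100(x: float) -> int:
    """Assumed scaling: clamp a [0, 1] score and express it on 0-100."""
    return round(100 * min(1.0, max(0.0, x)))


def ovi(grad_rate: float, earnings_score: float,
        earnings_weight: float = 0.4) -> int:
    """Outcome Value Index; the default weights match the v1 reference (0.6 / 0.4)."""
    if not 0.0 <= earnings_weight <= 1.0:
        raise ValueError("earnings_weight must be in [0, 1]")
    grad_weight = 1.0 - earnings_weight   # weights always sum to 1
    return scale_0_100(grad_weight * grad_rate + earnings_weight * earnings_score)


reference = ovi(grad_rate=0.8, earnings_score=0.5)                      # 0.48 + 0.20
no_earnings = ovi(grad_rate=0.8, earnings_score=0.5, earnings_weight=0.0)
```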

7. ACI: Admission Chance Index (band + interval; v1-safe)

Question: "Given my profile and this cycle's competitiveness, is this Reach / Target / Likely?"

v1-safe posture: Default to bands and wide uncertainty unless labeled outcomes exist and governance permits probabilistic outputs.

Inputs:

  • AFI rescaled to [0, 1] (i.e. the section 3 value divided by 100); the formulas below use this rescaled value
  • CCI in [0, 100]
  • missingness in [0, 1]
  • acc: cycle acceptance rate when available

Reference (composite score and bands):

s = 0.6*AFI + 0.4*(1 - CCI/100)

Band thresholds:

  • likely if s >= 0.75
  • target if 0.55 <= s < 0.75
  • reach if s < 0.55

Conservative probability interval:

p_mid = clamp(acc * (0.5 + 1.0*AFI), 0.01, 0.95)
width = 0.25 + 0.35*missingness
p_low = clamp(p_mid - width/2, 0, 1)
p_high = clamp(p_mid + width/2, 0, 1)

Publish band (reach | target | likely) and the interval (p_low, p_mid, p_high) with clear uncertainty messaging.
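
The band and interval rules above can be sketched together as follows. The function name aci, the return shape, and the choice to return no interval when acc is unavailable are assumptions for illustration; the weights, thresholds, and clamps are taken from the reference formulas.

```python
from typing import Optional


def clamp(x: float, a: float, b: float) -> float:
    return max(a, min(b, x))


def aci(afi_0_100: float, cci: float, missingness: float,
        acc: Optional[float] = None):
    """Return (band, interval); interval is (p_low, p_mid, p_high) or None."""
    afi = afi_0_100 / 100                      # rescale AFI to [0, 1]
    s = 0.6 * afi + 0.4 * (1 - cci / 100)      # composite score
    if s >= 0.75:
        band = "likely"
    elif s >= 0.55:
        band = "target"
    else:
        band = "reach"
    if acc is None:
        return band, None                      # assumption: no interval without acc
    p_mid = clamp(acc * (0.5 + 1.0 * afi), 0.01, 0.95)
    width = 0.25 + 0.35 * missingness          # more missing data -> wider interval
    p_low = clamp(p_mid - width / 2, 0, 1)
    p_high = clamp(p_mid + width / 2, 0, 1)
    return band, (p_low, p_mid, p_high)


# AFI 80, CCI 72, 20% missing inputs, 10% acceptance rate:
# s = 0.48 + 0.112 = 0.592, p_mid = 0.13, width = 0.32
band, interval = aci(afi_0_100=80, cci=72, missingness=0.2, acc=0.10)
```

In this example the lower bound clamps to 0 because half the width (0.16) exceeds p_mid (0.13), which shows how wide the v1 interval is by design for selective cycles.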


8. International mobility / work pathway (flags; v1)

This is not a numeric index in v1. Publish structured flags and citations only:

  • OPT/CPT policy notes (general; not individualized)
  • STEM designation flag (where auditable)
  • "Policy can change" warnings
  • Visa issuance stats as country-level context, not school-level odds

Summary

Index | Name                   | Question                         | Scale
DQI   | Data Quality Index     | How much to trust what you see?  | 0–100
CCI   | Competitiveness Index  | How competitive is this cycle?   | 0–100
AFuI  | Affordability Index    | Can I afford this?               | 0–100
CRI   | Completion Risk Index  | What's the downside risk?        | 0–100
OVI   | Outcome Value Index    | What's the expected upside?      | 0–100
AFI   | Academic Fit Index     | How aligned are my academics?    | 0–100 (profiled)
ACI   | Admission Chance Index | Reach / Target / Likely?         | Band + interval (profiled)

See also: DerivedIndex type (schema), Schema reference.