ICU scoring systems and prognostication

APACHE II

Acute Physiology and Chronic Health Evaluation II is the most widely used severity of illness score internationally.

Components

A. Acute physiology score (APS): 12 physiological variables. Eleven are scored 0–4 based on deviation from normal in either direction; the GCS contributes 15 − GCS (0–12). Creatinine points are doubled in acute renal failure.

Variable | Normal range (scores 0)
Temperature | 36–38.4°C
Mean arterial pressure | 70–109 mmHg
Heart rate | 70–109 bpm
Respiratory rate | 12–24/min
Oxygenation | A-aDO₂ <200 mmHg (if FiO₂ ≥0.5) or PaO₂ >70 mmHg (if FiO₂ <0.5)
Arterial pH | 7.33–7.49
Serum sodium | 130–149 mmol/L
Serum potassium | 3.5–5.4 mmol/L
Creatinine | 53–133 µmol/L
Haematocrit | 30–45.9%
WBC | 3–14.9 ×10⁹/L
Glasgow Coma Scale (GCS) | Scored as 15 − GCS

B. Age points: 0 (≤44) to 6 (≥75)

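C. Chronic health points: 2 or 5 points for severe organ dysfunction or immunocompromise (hepatic, cardiovascular, renal, pulmonary, immunological): 5 for non-operative or emergency postoperative admissions, 2 for elective postoperative admissions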

Scoring

Total range: 0–71 (in practice rarely exceeds 55)
Calculated within the first 24 hours of ICU admission using worst values.
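
As a rough illustration of how the three components add up, here is a minimal sketch. The intermediate age bands (45–54 = 2, 55–64 = 3, 65–74 = 5) and the allowed chronic health values (0, 2 or 5) are taken from the published APACHE II table as commonly reproduced and are not listed in full above; `other_aps_points` is a hypothetical input standing in for the 11 non-GCS physiology lookups, which are not implemented here.

```python
def age_points(age_years: int) -> int:
    """APACHE II age points: 0 (<=44) up to 6 (>=75)."""
    if age_years <= 44:
        return 0
    if age_years <= 54:
        return 2
    if age_years <= 64:
        return 3
    if age_years <= 74:
        return 5
    return 6


def gcs_points(gcs: int) -> int:
    """Neurological component of the APS: 15 minus the GCS."""
    if not 3 <= gcs <= 15:
        raise ValueError("GCS must be between 3 and 15")
    return 15 - gcs


def apache_ii_total(other_aps_points: int, gcs: int, age_years: int,
                    chronic_health_points: int) -> int:
    """Sum the acute physiology, age and chronic health components.

    other_aps_points is a hypothetical pre-computed total for the 11
    non-GCS physiological variables (look these up in the full table).
    """
    if chronic_health_points not in (0, 2, 5):
        raise ValueError("chronic health points are 0, 2 or 5")
    return (other_aps_points + gcs_points(gcs)
            + age_points(age_years) + chronic_health_points)


# Illustrative patient (made-up values): 12 + 2 + 5 + 5
example_total = apache_ii_total(other_aps_points=12, gcs=13,
                                age_years=68, chronic_health_points=5)
print(example_total)  # 24
```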

An APACHE II score of 25 corresponds to approximately 50% predicted hospital mortality in the original derivation cohort (Knaus 1985).
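
The conversion from score to predicted mortality can be sketched as below. The coefficients are those of the Knaus 1985 logistic equation as it is commonly reproduced (intercept −3.517, 0.146 per APACHE II point, +0.603 after emergency surgery); they are not stated in these notes and should be checked against the original paper before any real use. The diagnostic category weight is set to 0 here purely for illustration; in the published model it varies with the admission diagnosis.

```python
import math


def apache_ii_predicted_mortality(score: int,
                                  diagnostic_weight: float = 0.0,
                                  emergency_surgery: bool = False) -> float:
    """Predicted hospital mortality (0-1) from the APACHE II logistic model.

    Assumed form: logit = -3.517 + 0.146*score + 0.603 (emergency surgery)
                          + diagnostic category weight
    """
    logit = -3.517 + 0.146 * score + diagnostic_weight
    if emergency_surgery:
        logit += 0.603
    return 1.0 / (1.0 + math.exp(-logit))


risk_at_25 = apache_ii_predicted_mortality(25)
print(f"{risk_at_25:.2f}")  # 0.53, i.e. roughly 50%
```

With a zero diagnostic weight the equation gives about 53% at a score of 25, in line with the approximate 50% figure quoted above.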

Area Under the ROC Curve (AUROC)

~0.80–0.85 for predicting hospital mortality in the original validation; performance varies across populations and healthcare systems.
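
To make the AUROC concrete: it is the probability that a randomly chosen non-survivor has a higher score than a randomly chosen survivor (ties count as half). A minimal sketch using made-up scores, not real patient data:

```python
def auroc(scores_died, scores_survived):
    """Discrimination as a pairwise comparison of non-survivors and survivors.

    Each (non-survivor, survivor) pair counts 1 if the non-survivor has the
    higher score, 0.5 for a tie and 0 otherwise.
    """
    wins = 0.0
    for d in scores_died:
        for s in scores_survived:
            if d > s:
                wins += 1.0
            elif d == s:
                wins += 0.5
    return wins / (len(scores_died) * len(scores_survived))


# Hypothetical APACHE II scores for illustration only
died = [30, 25, 18]
survived = [10, 20, 25, 12]
print(round(auroc(died, survived), 3))  # 0.792
```

An AUROC of 0.5 means the score discriminates no better than chance; 1.0 means every non-survivor scored higher than every survivor.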


SOFA Score

Sequential Organ Failure Assessment (Vincent 1996) quantifies organ dysfunction across six organ systems. Used in real-time for clinical monitoring.

Components (each 0–4):

System | Measure | Score 0 | Score 1 | Score 2 | Score 3 | Score 4
Respiratory | PaO₂/FiO₂ (mmHg) | ≥400 | 300–399 | 200–299 | 100–199 + MV | <100 + MV
Coagulation | Platelets (×10⁹/L) | ≥150 | 100–149 | 50–99 | 20–49 | <20
Liver | Bilirubin (µmol/L) | <20 | 20–32 | 33–101 | 102–204 | >204
Cardiovascular | MAP (mmHg) / vasopressor dose (µg/kg/min) | MAP ≥70 | MAP <70 | Dopamine ≤5 or dobutamine (any dose) | Dopamine >5 or NA/A ≤0.1 | Dopamine >15 or NA/A >0.1
Neurological | GCS | 15 | 13–14 | 10–12 | 6–9 | <6
Renal | Creatinine (µmol/L) or UO | <110 | 110–170 | 171–299 | 300–440 (or UO <500 mL/24h) | >440 (or UO <200 mL/24h)

MV = mechanical ventilation; NA/A = noradrenaline/adrenaline; UO = urine output.

Total score: 0–24
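
The table can be turned into a small calculator. The sketch below is an illustration, not a validated tool. It makes three assumptions: urine output thresholds follow the published SOFA table (<500 mL/24h scores 3, <200 mL/24h scores 4); a PaO₂/FiO₂ below 200 without mechanical ventilation is capped at 2; and vasopressor doses are in µg/kg/min. All function and parameter names are hypothetical.

```python
def respiratory(pf_mmhg, ventilated):
    if pf_mmhg >= 400:
        return 0
    if pf_mmhg >= 300:
        return 1
    if pf_mmhg >= 200 or not ventilated:
        return 2  # assumption: scores 3-4 require mechanical ventilation
    return 3 if pf_mmhg >= 100 else 4


def coagulation(platelets):
    for score, lower_bound in enumerate((150, 100, 50, 20)):
        if platelets >= lower_bound:
            return score
    return 4


def liver(bilirubin_umol):
    for score, upper_bound in enumerate((20, 33, 102)):
        if bilirubin_umol < upper_bound:
            return score
    return 3 if bilirubin_umol <= 204 else 4


def cardiovascular(map_mmhg, dopamine=0.0, dobutamine=0.0,
                   noradrenaline=0.0, adrenaline=0.0):
    if dopamine > 15 or noradrenaline > 0.1 or adrenaline > 0.1:
        return 4
    if dopamine > 5 or noradrenaline > 0 or adrenaline > 0:
        return 3
    if dopamine > 0 or dobutamine > 0:
        return 2
    return 1 if map_mmhg < 70 else 0


def neurological(gcs):
    if gcs == 15:
        return 0
    for score, lower_bound in ((1, 13), (2, 10), (3, 6)):
        if gcs >= lower_bound:
            return score
    return 4


def renal(creatinine_umol, urine_ml_24h=None):
    by_creatinine = 4
    for score, upper_bound in enumerate((110, 171, 300)):
        if creatinine_umol < upper_bound:
            by_creatinine = score
            break
    else:
        by_creatinine = 3 if creatinine_umol <= 440 else 4
    by_urine = 0
    if urine_ml_24h is not None:
        if urine_ml_24h < 200:
            by_urine = 4
        elif urine_ml_24h < 500:
            by_urine = 3
    return max(by_creatinine, by_urine)  # the worse of the two criteria


# Made-up patient for illustration: 3 + 2 + 1 + 3 + 1 + 2
example_sofa = (respiratory(180, ventilated=True) + coagulation(90)
                + liver(25) + cardiovascular(65, noradrenaline=0.05)
                + neurological(14) + renal(180))
print(example_sofa)  # 12
```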

Uses in Critical Care

  • Sepsis-3 definition: SOFA increase ≥2 points from baseline in suspected infection = organ dysfunction (sepsis)
  • SOFA score ≥2 in suspected infection correlates with ~10% in-hospital mortality
  • Day-to-day tracking of organ failure trajectory: rising SOFA = deterioration; falling = response to treatment
  • SOFA is better than APACHE for tracking changes over time
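
The Sepsis-3 criterion and day-to-day tracking can be sketched as follows. The default baseline of 0 is an assumption for patients with no known pre-existing organ dysfunction; the function names and trajectory labels are hypothetical.

```python
def meets_sepsis3_organ_dysfunction(current_sofa, baseline_sofa=0,
                                    suspected_infection=True):
    """True if SOFA has risen by >= 2 from baseline with suspected infection."""
    return suspected_infection and (current_sofa - baseline_sofa) >= 2


def sofa_trajectory(daily_scores):
    """Compare the latest daily SOFA with the first recorded value."""
    change = daily_scores[-1] - daily_scores[0]
    if change > 0:
        return "rising (deterioration)"
    if change < 0:
        return "falling (response to treatment)"
    return "unchanged"


print(meets_sepsis3_organ_dysfunction(current_sofa=5, baseline_sofa=2))  # True
print(sofa_trajectory([8, 9, 7, 5]))  # falling (response to treatment)
```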

qSOFA

Quick SOFA — bedside tool for identifying patients at risk of poor outcome from sepsis, particularly outside the ICU.

Three variables (1 point each):

  1. Respiratory rate ≥22/min
  2. Altered mentation (GCS <15)
  3. SBP ≤100 mmHg

Score ≥2: associated with significantly increased risk of in-hospital death and ICU admission; should prompt assessment for organ dysfunction and initiation of sepsis management.
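
The three criteria translate directly into code. A minimal sketch, with hypothetical function and parameter names:

```python
def qsofa(resp_rate, gcs, systolic_bp):
    """One point each for RR >= 22/min, GCS < 15 and SBP <= 100 mmHg."""
    return int(resp_rate >= 22) + int(gcs < 15) + int(systolic_bp <= 100)


def qsofa_positive(resp_rate, gcs, systolic_bp):
    """A score of 2 or more flags a patient at higher risk of poor outcome."""
    return qsofa(resp_rate, gcs, systolic_bp) >= 2


print(qsofa(24, 14, 95))            # 3
print(qsofa_positive(24, 15, 110))  # False (score is 1)
```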

Limitations:

  • Sensitivity ~50–60% compared to SOFA ≥2 for identifying organ dysfunction from sepsis
  • Better used as a clinical prompt than a diagnostic criterion
  • Poor sensitivity means a score of 0 or 1 does not exclude sepsis

SAPS II

Simplified Acute Physiology Score II (Le Gall 1993) — the European alternative to APACHE II.

  • 17 variables: 12 physiological, age, admission type, and 3 chronic disease variables (AIDS, metastatic cancer, haematological malignancy)
  • Validated primarily in European ICU populations
  • Similar AUROC to APACHE II (~0.85)
  • Predicts hospital mortality using the total score; converted to risk of death via a regression equation
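
The regression step can be sketched as below. The equation is the Le Gall 1993 conversion as commonly reproduced, logit = −7.7631 + 0.0737 × score + 0.9971 × ln(score + 1); it is not given in these notes, so treat the coefficients as an assumption to verify against the original publication.

```python
import math


def saps_ii_predicted_mortality(score: int) -> float:
    """Predicted hospital mortality (0-1) from a SAPS II total score."""
    logit = -7.7631 + 0.0737 * score + 0.9971 * math.log(score + 1)
    return 1.0 / (1.0 + math.exp(-logit))


risk_at_40 = saps_ii_predicted_mortality(40)
print(f"{risk_at_40:.2f}")  # 0.25
```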

NEWS2

National Early Warning Score 2 (Royal College of Physicians, 2017) — the UK standard for detecting clinical deterioration on general wards.

Parameters (each 0–3):

Respiratory rate, SpO₂ (two scales: standard and target 88–92% for hypercapnic COPD risk), systolic BP, pulse rate, consciousness (ACVPU: Alert, Confusion, Voice, Pain, Unresponsive), temperature.

Supplemental oxygen use also scores +2.

Thresholds for escalation:

  • NEWS ≥5 (aggregate): urgent clinical review; consider ICU/HDU review
  • Any single parameter scoring 3: urgent review
  • NEWS ≥7: emergency response
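
The escalation thresholds above can be expressed as a simple rule. This sketch takes already-assigned parameter scores as input and does not implement the NEWS2 chart itself; the dictionary keys and the "routine monitoring" label for lower scores are hypothetical.

```python
def news2_response(parameter_scores):
    """Map NEWS2 parameter scores to the escalation level listed above.

    parameter_scores: dict of parameter name -> points (0-3), including
    2 points for supplemental oxygen where applicable.
    """
    total = sum(parameter_scores.values())
    if total >= 7:
        return "emergency response"
    if total >= 5 or any(points == 3 for points in parameter_scores.values()):
        return "urgent review"
    return "routine monitoring"  # hypothetical label for scores below threshold


# Made-up observations for illustration: aggregate score 6
obs = {"respiratory_rate": 2, "spo2": 1, "supplemental_oxygen": 2,
       "systolic_bp": 0, "pulse": 1, "consciousness": 0, "temperature": 0}
print(news2_response(obs))  # urgent review
```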

NEWS2 is not designed for the ICU but governs ward-based escalation and triggers Critical Care Outreach.


Clinical Frailty Scale (CFS)

The Clinical Frailty Scale (Rockwood 2005; originally 7 points, later expanded to 9) is a 9-point scale assessing functional independence and frailty:

Score | Category | Description
1 | Very fit | Robust, active, energetic
2 | Well | No active disease; exercises regularly
3 | Managing well | Well-treated medical conditions; not regularly active
4 | Living with very mild frailty | Not dependent on others for daily help, but symptoms limit activities ("slowed up")
5 | Mildly frail | Dependent in IADLs (finances, transport)
6 | Moderately frail | Help with all outside activities and housekeeping
7 | Severely frail | Completely dependent on others for personal care
8 | Very severely frail | Completely dependent; approaching end of life
9 | Terminally ill | Life expectancy <6 months; not otherwise evidently frail

CFS in Critical Care

  • CFS ≥5 at ICU admission is associated with significantly higher ICU and hospital mortality
  • COVID-19 pandemic data showed CFS to be an independent predictor of 30-day mortality in ICU patients (more predictive than age alone)
  • CFS ≥7: ICU admission may be associated with very high mortality and poor long-term functional recovery; should inform shared decision-making discussions
  • CFS should be assessed based on pre-morbid function (within 2 weeks before acute illness onset, not during acute illness)
  • CFS is not a ceiling-of-care tool but informs prognosis and helps clinicians have better-informed goals of care conversations

Limitations of Scoring Systems

Designed for Populations, Not Individuals

Prognostic scores generate probability statements for groups of similar patients — they cannot predict outcomes for individual patients with precision. An APACHE II score of 30 might correspond to 60% predicted mortality — but that specific patient either survives or dies; the score cannot determine which.

Lead-Time Bias

APACHE scores are calculated from the first 24 hours in ICU, so their meaning depends on when in the illness the patient is admitted. If one unit intervenes earlier in an illness (lower severity at ICU admission), or its patients are resuscitated before they arrive, those patients will have lower APACHE scores and better apparent outcomes, not because care is better but because of the timing of measurement. Comparing units solely on APACHE-derived mortality ratios is therefore methodologically problematic.

The First-Day Problem

APACHE II scores worst physiological values in the first 24 hours. A patient admitted after a large haemorrhage, resuscitated en route, and improving rapidly may have a very high first-day score despite an excellent trajectory — the score captures a single moment rather than the illness dynamic.

Limited Variables

APACHE II does not capture:

  • Pre-morbid functional status (frailty)
  • Quality of life
  • Aetiology of illness (type of malignancy, reversibility of underlying condition)
  • Patient preferences
  • Socioeconomic or care-home context

Frailty (CFS) independently predicts outcomes beyond physiological severity scores — combining CFS with APACHE II improves prognostic accuracy.

Temporal Calibration Drift

Models derived from populations in 1985 or 1993 do not perfectly predict outcomes in modern ICUs with different case mixes, treatments, and demographics. APACHE IV and SAPS 3 attempt to address this but still require local recalibration.


Using Scores in Clinical Practice

Appropriate Uses

  • Unit-level benchmarking: comparing standardised mortality ratios (SMR = observed/expected mortality) across ICUs; useful quality improvement tool
  • Research stratification: ensuring comparable disease severity in clinical trials
  • Clinical trial enrolment criteria: SOFA ≥2 for sepsis trial eligibility; P:F <150 for ARDS trials
  • Communication: gives a structured language for discussing severity with patients/families ("your loved one has a severity score of X, which is associated with approximately Y% mortality in similar patients")
  • Day-to-day monitoring: SOFA changes over time track organ failure trajectory
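
For unit-level benchmarking, the SMR is the number of observed deaths divided by the number expected, where the expected number is the sum of each patient's predicted risk. A minimal sketch with made-up data (the outcomes and risks below are illustrative only):

```python
def standardised_mortality_ratio(died, predicted_risks):
    """SMR = observed deaths / expected deaths.

    died: list of 0/1 outcomes, one per patient.
    predicted_risks: model-predicted probability of death per patient.
    """
    if len(died) != len(predicted_risks):
        raise ValueError("one outcome and one predicted risk per patient")
    observed = sum(died)
    expected = sum(predicted_risks)
    return observed / expected


# Ten hypothetical admissions: 3 observed deaths, 2.5 expected
outcomes = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
risks = [0.1, 0.2, 0.3, 0.4, 0.5, 0.1, 0.2, 0.3, 0.2, 0.2]
smr = standardised_mortality_ratio(outcomes, risks)
print(f"{smr:.2f}")  # 1.20
```

An SMR above 1 means more deaths than the model predicted and below 1 fewer, but the ratio inherits every limitation of the underlying score, including calibration drift and differences in admission timing.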

Inappropriate Uses

  • Individual prognosis: a score is not a prediction for this patient; never present a score as a definitive mortality estimate
  • Ceiling of care decisions: scores should inform, not determine, decisions to withhold or withdraw treatment; clinical judgement, patient wishes, and best interests assessment are essential
  • Admission triage in major incidents: in mass casualty scenarios, scoring tools are sometimes used; their limitations are even more significant under these conditions

Viva Questions

1. What is the SOFA score and how is it used in the definition of sepsis?

The Sequential Organ Failure Assessment (SOFA) score quantifies organ dysfunction across six systems: respiratory (PaO₂/FiO₂), coagulation (platelets), liver (bilirubin), cardiovascular (MAP and vasopressor dose), neurological (GCS), and renal (creatinine and urine output). Each system scores 0–4, total 0–24. In the Sepsis-3 definitions (Singer 2016), sepsis is defined as life-threatening organ dysfunction caused by a dysregulated host response to infection, operationalised as an acute increase in SOFA score of ≥2 points from baseline in the context of suspected infection. A SOFA score ≥2 corresponds to approximately 10% in-hospital mortality in a general hospital population with suspected infection. The score shifts the definition away from the non-specific SIRS criteria towards organ dysfunction as the hallmark of sepsis. In clinical practice, SOFA is also used to track organ failure trajectory over time: a rising SOFA despite treatment indicates deterioration; a falling score tracks response. qSOFA (three rapid bedside criteria) is used outside the ICU to identify patients at risk of poor outcome.


2. What are the limitations of APACHE II for prognosticating individual ICU patients?

APACHE II was designed and validated for predicting mortality in populations: it generates probability estimates that may be accurate for groups of 100 similar patients, but cannot reliably predict what will happen to any individual patient. Several additional limitations apply. Lead-time bias: patients admitted earlier in their illness, or resuscitated before ICU admission, have lower APACHE scores, so apparent differences between units may reflect the timing of admission rather than genuinely better care. The first-day problem: APACHE uses the worst values in the first 24 hours, which may overestimate severity in rapidly improving patients. The score does not capture pre-morbid function or frailty: two patients with identical APACHE II scores may have completely different trajectories depending on whether one is frail (CFS 6) or fit (CFS 2). It does not account for patient wishes, the aetiology or reversibility of the acute illness, or socioeconomic context. Finally, the original derivation cohort (Knaus 1985) predates modern ICU care, so calibration has drifted and predictions from the original equations may not match contemporary outcomes. Scores should inform, not determine, individual clinical decisions or ceiling-of-care discussions.


3. How does the Clinical Frailty Scale add prognostic information beyond physiological severity scores?

Physiological severity scores like APACHE II capture the acute severity of illness but not the patient's pre-morbid reserve. Two patients with identical APACHE scores may have very different outcomes if one is a fit 70-year-old (CFS 2) and the other is a frail dependent care home resident (CFS 7). The CFS quantifies pre-existing functional reserve — independent of acute illness — by assessing functional dependence across a spectrum from very fit (CFS 1) to terminally ill (CFS 9). Multiple studies, robustly confirmed during the COVID-19 pandemic, show CFS ≥5 is independently associated with significantly higher ICU and hospital mortality, longer ICU stay, and worse functional recovery, even after adjusting for physiological severity. CFS adds discriminatory power to APACHE II. Critically, CFS informs the nature of clinical discussions: a fit patient with a high APACHE score has good prognosis if they survive the acute insult; a frail patient with the same APACHE score may survive the immediate illness but may not recover to meaningful functional independence. CFS should be assessed based on pre-morbid function, within 2 weeks before the acute illness.


4. A family asks you to give them a survival percentage for their relative, based on the ICU scoring system. How do you respond?

I would explain the meaning and limitations of the score carefully. I might say: "These scores tell us how severe his illness is and how similar patients do on average — but they cannot tell us exactly what will happen to him as an individual. The score suggests that patients with this level of illness have [X]% chance of surviving to discharge, but he may do better or worse than that average. What I can tell you is how he looks today and whether things seem to be improving. The score is one of many tools we use, but your loved one is not a statistic." I would then focus the conversation on trajectory (is he improving or deteriorating?), on what recovery would look like if he survives (functional outcome, quality of life), and on understanding what he himself would want if he could tell us. Giving a precise percentage without context risks creating false certainty in either direction. The family often want honesty about prognosis, and acknowledging uncertainty while sharing the clinical picture is more useful than a number that they may interpret as a definitive prediction.