Core Architecture & SDWA Compliance Taxonomy

Modern water utilities operate at the intersection of continuous process control and stringent federal regulation. Translating high-frequency SCADA telemetry into auditable Safe Drinking Water Act (SDWA) compliance artifacts requires more than ad hoc scripting or manual spreadsheet reconciliation; it demands a deterministic, version-controlled data pipeline anchored in a unified regulatory taxonomy. This section of the WaterUtility.org engineering reference establishes the structural blueprint for bridging operational technology (OT) with environmental compliance systems, ensuring that every sensor reading, sampling event, and analytical result maps directly to enforceable EPA and state requirements. It is the architectural anchor for the more focused topics that follow — SCADA data ingestion upstream and the Violation Detection Rule Engine downstream — and defines the shared vocabulary those systems consume.

The audience for this architecture is concrete: water utility operations staff who own the historian, environmental compliance teams who sign the reports, and the municipal Python developers who build the automation between them. The governing constraint is that a compliance dataset is legal evidence. Under a primacy agency review, a utility must be able to reconstruct exactly which raw register value produced which reported result, through which transformation, at which timestamp, with which calibration record attached. Every design decision below serves that reconstructability.

Foundational Pipeline Architecture

A production-grade compliance pipeline must enforce strict separation of concerns across ingestion, transformation, validation, and reporting layers while maintaining unbroken data lineage. Raw telemetry from PLCs, RTUs, and online analyzers enters the architecture via industrial protocols — Modbus TCP, DNP3, and OPC UA — and is immediately serialized into immutable event streams. While time-series databases serve as the operational source of truth, raw values alone lack regulatory context. Every data point must be enriched with calibration certificates, analytical method codes, chain-of-custody identifiers, and temporal alignment markers before entering the compliance evaluation engine.

SDWA compliance data pipeline, from SCADA telemetry to EPA reporting.

The five stages map to distinct responsibilities that should never be collapsed into a single process. Ingestion is concerned only with capturing bytes off the wire and stamping them with a monotonic receive time; it makes no compliance judgment. Transformation decodes protocol payloads, normalizes engineering units, and attaches the taxonomy identifiers that give a number meaning — a task that depends on clean timestamps produced by time-series alignment strategies. Validation evaluates each enriched record against regulatory thresholds and averaging rules. Reporting formats validated records for the appropriate submission channel, and the audit trail captures every state transition as an append-only record. Keeping these boundaries crisp is what allows a single stage to be reprocessed — for example, re-running validation after a rule update — without corrupting the others.

Data integrity is enforced through strict schema validation, cryptographic checksums, and idempotent processing. Because exactly-once delivery is impractical across distributed systems, pipeline orchestration should combine at-least-once delivery with idempotent writes so that backfill or recovery operations cannot produce duplicate compliance calculations. In practice this means every event carries a deterministic key — typically {device_id}:{register}:{source_timestamp} — and every downstream write is an upsert keyed on that value. A replayed message therefore overwrites its own prior result instead of generating a phantom second sample. Because compliance data traverses both OT and enterprise IT networks, network segmentation and least-privilege access controls are mandatory; those controls are elaborated in the OT boundary section below and in the dedicated Security Boundary Design topic. Resilient architectures must also account for telemetry gaps; fallback routing during SCADA outages preserves continuous compliance logging by buffering data and queuing missed intervals until the primary control network recovers.
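
The idempotent-write idea can be sketched in a few lines. This is a minimal illustration, not the reference implementation: the `InMemoryStore` class, the `KEY_NAMESPACE` value, and the sample device name are all hypothetical, and a real deployment would rely on a database upsert against a unique constraint instead of a dictionary.

```python
import uuid

# Hypothetical namespace for this sketch; any fixed UUID works if it never changes.
KEY_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "compliance.example.invalid")


def event_key(device_id: str, register: int, source_timestamp: str) -> str:
    """Build the deterministic {device_id}:{register}:{source_timestamp} key."""
    return f"{device_id}:{register}:{source_timestamp}"


def record_id_for(key: str) -> uuid.UUID:
    """Derive a stable UUID from the key (UUIDv5 is name-based and repeatable)."""
    return uuid.uuid5(KEY_NAMESPACE, key)


class InMemoryStore:
    """Stand-in for a real table with a unique constraint on the event key."""

    def __init__(self) -> None:
        self._rows = {}

    def upsert(self, key: str, row: dict) -> None:
        # Writing the same key twice replaces the row instead of adding one.
        self._rows[key] = row

    def __len__(self) -> int:
        return len(self._rows)


store = InMemoryStore()
key = event_key("plc-07", 40011, "2024-03-01T12:00:00Z")
for _ in range(3):  # simulate at-least-once redelivery of one message
    store.upsert(key, {"record_id": str(record_id_for(key)), "value": 0.21})
```

After three deliveries of the same message the store still holds one row, which is the property that keeps a replay from creating a phantom second sample.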

Regulatory & Standards Foundation

The entire architecture exists to satisfy a finite, well-defined body of federal regulation, and every subsystem should trace its behavior back to a specific rule. The EPA Safe Drinking Water Act framework and its implementing regulations in 40 CFR Part 141 define the contaminant limits, monitoring frequencies, and reporting obligations that the pipeline enforces. Rather than embedding these rules as scattered constants, a durable design captures them as versioned, machine-readable rule sets so that a regulatory change is a data change, not a code deployment.

The regulations most relevant to real-time automation fall into a handful of families. The Total Coliform Rule and its Revised Total Coliform Rule successor govern microbial monitoring and treatment-technique triggers. The Surface Water Treatment Rules constrain turbidity and disinfection, driving the continuous NTU monitoring handled in Modbus TCP Parsing Workflows. The Stage 1 and Stage 2 Disinfectants and Disinfection Byproducts Rules impose running-average limits on total trihalomethanes (TTHM) and haloacetic acids (HAA5). The Lead and Copper Rule adds action-level logic based on 90th-percentile tap results. Each family contributes a distinct averaging shape — instantaneous maximum, monthly average, or locational running annual average — that the validation engine must implement precisely.

Regulatory limits themselves are captured in a reference table that the pipeline resolves at runtime. The SDWA MCL Reference Mapping is the authoritative source for these values; a representative extract illustrates the shape of the data:

Contaminant	EPA method	Limit type	Threshold	Averaging basis
Turbidity (filtered)	180.1	Treatment technique	0.3 NTU (95th pct)	Monthly, combined filter effluent
Total trihalomethanes (TTHM)	524.2	MCL	0.080 mg/L	Locational running annual avg
Haloacetic acids (HAA5)	552.3	MCL	0.060 mg/L	Locational running annual avg
Free chlorine (residual)	—	MRDL	4.0 mg/L	Running annual average
Nitrate (as N)	353.2	MCL	10 mg/L	Single sample
Lead	200.8	Action level	0.015 mg/L	90th percentile of tap samples
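
One way to hold this extract as data is a small immutable rule object keyed by contaminant. The sketch below is illustrative only: the `LimitRule` class, the identifier strings such as `TURBIDITY_FILTERED`, and the `averaging_basis` labels are assumptions made for the example, and a production system would presumably load the rows from version-controlled configuration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LimitRule:
    contaminant_id: str          # hypothetical identifier scheme for this sketch
    method_code: Optional[str]   # EPA analytical method; None where the table lists none
    limit_type: str              # "MCL", "MRDL", "TT" (treatment technique), "AL" (action level)
    threshold: float
    unit: str
    averaging_basis: str         # hypothetical label naming the aggregation to apply


# Values copied from the representative extract above.
RULES = {
    rule.contaminant_id: rule
    for rule in (
        LimitRule("TURBIDITY_FILTERED", "180.1", "TT", 0.3, "NTU", "monthly_95th_percentile"),
        LimitRule("TTHM", "524.2", "MCL", 0.080, "mg/L", "lraa"),
        LimitRule("HAA5", "552.3", "MCL", 0.060, "mg/L", "lraa"),
        LimitRule("FREE_CHLORINE", None, "MRDL", 4.0, "mg/L", "running_annual_average"),
        LimitRule("NITRATE_N", "353.2", "MCL", 10.0, "mg/L", "single_sample"),
        LimitRule("LEAD", "200.8", "AL", 0.015, "mg/L", "tap_90th_percentile"),
    )
}


def resolve_rule(contaminant_id: str) -> LimitRule:
    """Return the rule for a contaminant, failing loudly on unmapped identifiers."""
    try:
        return RULES[contaminant_id]
    except KeyError:
        raise LookupError(f"no limit rule mapped for {contaminant_id!r}") from None


def exceeds(rule: LimitRule, aggregated_value: float) -> bool:
    """Compare an already-aggregated value (for example an LRAA) with the threshold."""
    return aggregated_value > rule.threshold
```

Failing on an unmapped identifier, instead of returning a default, keeps a mis-tagged sensor from being silently evaluated against no limit at all.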

Because the same contaminant can be governed by different averaging windows depending on system size and source type, the taxonomy stores the averaging rule as data alongside the numeric limit. The locational running annual average that governs the disinfection byproduct rules, for instance, is computed per monitoring location $\ell$ as the mean of the four most recent quarterly averages:

$$\text{LRAA}_{\ell,q} = \frac{1}{4}\sum_{i=q-3}^{q} \bar{C}_{\ell,i}$$

where $\bar{C}_{\ell,i}$ is the arithmetic mean of all samples collected at location $\ell$ during quarter $i$. Encoding the summation window and the per-location grouping as configuration, rather than as a hard-coded four-quarter loop, is what lets the same engine evaluate a monthly turbidity percentile and a four-quarter DBP average without branching logic. Method codes (the EPA analytical method used to produce a result) are likewise stored on every record so that a laboratory result and an online-analyzer result for the same parameter can be distinguished during review.
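
The LRAA definition translates into a short calculation with the window size as a parameter. The following is a sketch under stated assumptions: quarters are represented as consecutive integers, samples arrive as `(location_id, quarter_index, value)` tuples, and the function returns `None` when any quarter in the window has no samples. That last choice is a simplification for illustration; how incomplete windows are treated in practice is a regulatory question this example does not settle.

```python
from collections import defaultdict
from statistics import fmean
from typing import Optional

# (location_id, quarter_index, value in mg/L); the tuple layout is an assumption.
Sample = tuple[str, int, float]


def quarterly_means(samples: list[Sample]) -> dict:
    """Arithmetic mean of all samples per (location, quarter)."""
    grouped = defaultdict(list)
    for location_id, quarter, value in samples:
        grouped[(location_id, quarter)].append(value)
    return {key: fmean(values) for key, values in grouped.items()}


def lraa(samples: list[Sample], location_id: str, quarter: int,
         window: int = 4) -> Optional[float]:
    """Mean of the `window` most recent quarterly means ending at `quarter`."""
    means = quarterly_means(samples)
    needed = [(location_id, i) for i in range(quarter - window + 1, quarter + 1)]
    if any(key not in means for key in needed):
        return None  # incomplete window; left for policy to decide
    return fmean(means[key] for key in needed)


samples = [
    ("A", 1, 0.070), ("A", 1, 0.090),   # quarter 1 mean: 0.080
    ("A", 2, 0.060),
    ("A", 3, 0.100),
    ("A", 4, 0.080),
    ("B", 4, 0.500),                    # other locations never mix in
]
tthm_lraa = lraa(samples, "A", 4)       # (0.080 + 0.060 + 0.100 + 0.080) / 4
```

Because `window` and the grouping key are arguments and data, the same function shape can serve other averaging periods by changing configuration instead of code.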

Component Architecture & Data Contracts

The architecture decomposes into four cooperating subsystems, each owning a stage of the compliance lifecycle and each documented in its own topic. The SDWA MCL Reference Mapping supplies the threshold lookup layer; Monitoring Frequency Scheduling enforces the temporal calendar of when samples are due; Security Boundary Design governs how data crosses the OT/IT perimeter; and Violation Code Classification translates exceedances into standardized regulatory codes. What binds them is a set of explicit data contracts — schemas that every record must satisfy as it moves between subsystems.

At the foundation of these contracts is the deterministic mapping of operational data tags to regulatory constructs. Each sensor or laboratory result must resolve to a specific contaminant identifier and the applicable Maximum Contaminant Level (MCL), Maximum Residual Disinfectant Level (MRDL), or treatment technique requirement.

Taxonomy mapping from an operational data tag to a reporting obligation.

The enriched compliance record is the central data contract of the whole architecture — the object that leaves transformation and enters validation. Defining it explicitly, and validating it at each boundary, is what prevents malformed or context-free data from reaching a report. A minimal contract carries the fields below:

Field	Type	Purpose
`record_id`	UUID	Idempotency key and audit anchor
`contaminant_id`	string	Resolves to the MCL reference row
`value`	float	Measured result in canonical units
`unit`	enum	Canonical unit (mg/L, NTU, …)
`method_code`	string	EPA analytical method that produced the value
`sample_ts`	datetime (UTC)	Sampling/measurement time
`location_id`	string	Monitoring location for LRAA grouping
`quality_flag`	enum	GOOD / SUSPECT / BAD / INTERPOLATED / OFFLINE
`calibration_ref`	string	Certificate binding value to instrument state
`source_hash`	string	SHA-256 of the raw payload

The quality_flag field is the contract point between ingestion and validation: only GOOD and, conditionally, INTERPOLATED records are eligible for compliance calculation, while SUSPECT, BAD, and OFFLINE records are routed to a quarantine buffer for human review. This is the same flag vocabulary produced upstream when parsing Modbus registers for turbidity sensors, which keeps a single quality taxonomy consistent from the wire all the way to the report. Once a record is validated, its results feed the Violation Detection Rule Engine, where exceedance logic, severity scoring, and monitoring-gap detection turn measurements into actionable compliance state.
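
The routing rule can be illustrated with a small partition function. This is a sketch, not the pipeline's actual code: records are plain dictionaries here, and the `allow_interpolated` switch is a hypothetical stand-in for whatever condition governs INTERPOLATED readings. Sending disallowed INTERPOLATED records to quarantine is likewise an assumption of the example.

```python
from enum import Enum


class QualityFlag(str, Enum):
    GOOD = "GOOD"
    SUSPECT = "SUSPECT"
    BAD = "BAD"
    INTERPOLATED = "INTERPOLATED"
    OFFLINE = "OFFLINE"


def route(records: list, allow_interpolated: bool = False) -> tuple:
    """Split records into (eligible, quarantine) by quality flag."""
    accepted = {QualityFlag.GOOD}
    if allow_interpolated:
        accepted.add(QualityFlag.INTERPOLATED)

    eligible, quarantine = [], []
    for record in records:
        flag = QualityFlag(record["quality_flag"])  # raises ValueError on unknown flags
        (eligible if flag in accepted else quarantine).append(record)
    return eligible, quarantine


batch = [
    {"record_id": "r1", "quality_flag": "GOOD"},
    {"record_id": "r2", "quality_flag": "INTERPOLATED"},
    {"record_id": "r3", "quality_flag": "SUSPECT"},
    {"record_id": "r4", "quality_flag": "OFFLINE"},
]
strict_ok, strict_held = route(batch)
lenient_ok, lenient_held = route(batch, allow_interpolated=True)
```

An unrecognized flag raises immediately, so a new upstream flag value cannot slip through as implicitly eligible.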

Regulatory compliance is also inherently time-bound. Sampling intervals, rolling averages, and reporting windows must be programmatically enforced rather than manually tracked. The Monitoring Frequency Scheduling subsystem operationalizes these temporal constraints, generating automated sampling calendars and triggering validation checks against EPA-mandated windows. This eliminates the risk of missed samples or misaligned averaging periods that frequently trigger administrative violations, and it is the mechanism behind automating monthly versus quarterly monitoring schedules for systems whose obligations shift with population served or source water classification.
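
A sampling calendar of this kind can be approximated with calendar-aligned periods. The sketch below assumes monitoring periods begin on the first day of a month or quarter and that one sample inside a period satisfies it; both are simplifications, and the function names are hypothetical. Actual obligations depend on the applicable rule and the primacy agency.

```python
from datetime import date

_STEP_MONTHS = {"monthly": 1, "quarterly": 3}


def period_starts(year: int, frequency: str) -> list:
    """First day of every monitoring period in a calendar year."""
    return [date(year, month, 1) for month in range(1, 13, _STEP_MONTHS[frequency])]


def missed_periods(year: int, frequency: str, sample_dates: list, as_of: date) -> list:
    """Start dates of closed periods that contain no sample."""
    starts = period_starts(year, frequency)
    bounds = starts + [date(year + 1, 1, 1)]
    missed = []
    for start, end in zip(bounds, bounds[1:]):
        if end > as_of:
            continue  # period is still open, so it cannot be missed yet
        if not any(start <= sampled < end for sampled in sample_dates):
            missed.append(start)
    return missed


collected = [date(2024, 2, 10), date(2024, 8, 1)]
gaps = missed_periods(2024, "quarterly", collected, as_of=date(2024, 10, 1))
```

Switching a system from quarterly to monthly monitoring is then a change to the `frequency` value, and the same gap check applies unchanged.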

Four subsystems exchanging schema fields with the central enriched compliance record contract.

Security & Operational Technology (OT) Boundaries

Because compliance telemetry originates inside live control networks, the architecture is inseparable from its security posture. Real-time SCADA telemetry governs pump actuation, valve positioning, and chemical dosing on tight control loops where added latency or unexpected traffic can disrupt treatment. A compliance pipeline must therefore be a strictly one-directional consumer of OT data — never a source of commands or polling load that could perturb those loops. A properly implemented Security Boundary Design ensures that compliance telemetry flows in a single direction from control systems to reporting environments, limiting the exposure of critical infrastructure to lateral movement or query injection.

The reference topology places three zones in series: an OT zone containing the control network, a demilitarized zone (DMZ) where protocol translation and validation occur, and an IT zone where reporting and EPA submission run. Data moves outward through unidirectional gateways (data diodes) or hardened API proxies; nothing routes back inward. In the DMZ, industrial payloads are parsed, reduced to read-only measurement data, and re-serialized into a structured format such as JSON or Parquet before they are allowed to cross into the IT zone. Ingress and egress are constrained by strict access control lists, mutual TLS, and scoped API tokens, and the whole arrangement should be validated against recognized operational technology security guidance such as NIST SP 800-82 Rev. 3. The concrete continuous-verification patterns for this perimeter (device posture checks, per-flow authentication, and micro-segmentation) are detailed in Implementing Zero-Trust Boundaries for SCADA Networks.

The boundary is an ongoing operational control, not a one-time configuration. Certificate expiration, PLC firmware updates, and network topology changes each represent moments where an incorrectly managed boundary can reopen a lateral-movement path or silently drop telemetry. Treating certificate rotation, ACL review, and diode health as scheduled, automated tasks — with a tested recovery runbook — is what keeps the perimeter from degrading between audits.

Audit Trail & Data Lineage Requirements

Auditability is the property that distinguishes a compliance dataset from ordinary operational data, and it must be engineered in rather than bolted on. Every validation pass, transformation step, and submission event generates an immutable record, so that a reviewer can trace any reported number back through every intermediate state to the raw register value that produced it. This lineage is enforced through append-only storage, cryptographic hashing, and version-controlled dataset snapshots.

The mechanism is straightforward but must be applied without exception. Each raw payload is hashed with SHA-256 at ingestion, and that digest — the source_hash on the compliance record — travels with the data through every stage. Any transformation writes a new record referencing its input’s hash, producing a verifiable chain from raw byte to reported result. Because the store is append-only, a correction is a new superseding record, never an in-place edit, which preserves the original for review while still driving the corrected value forward. Suitable storage substrates include append-only Parquet files on object storage and insert-only tables in a relational store, both of which support efficient point-in-time reconstruction.
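
The hash chain can be sketched with the standard library alone. The record layout below (`parent_hash`, `step`, `payload`, `record_hash`), the canonical-JSON serialization, and the sample frame bytes are assumptions made for illustration; the source specifies only that SHA-256 is applied at ingestion and that each transformation references its input's hash.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def _canonical(body: dict) -> bytes:
    # Sorted keys and fixed separators make the serialization repeatable.
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def derive(parent_hash: str, step: str, payload: dict) -> dict:
    """Create a new lineage record that points at its input by hash."""
    body = {"parent_hash": parent_hash, "step": step, "payload": payload}
    return {**body, "record_hash": sha256_hex(_canonical(body))}


def verify_chain(raw_payload: bytes, records: list) -> bool:
    """Walk from the raw bytes forward, recomputing every link."""
    expected_parent = sha256_hex(raw_payload)
    for record in records:
        body = {k: record[k] for k in ("parent_hash", "step", "payload")}
        if record["parent_hash"] != expected_parent:
            return False
        if record["record_hash"] != sha256_hex(_canonical(body)):
            return False
        expected_parent = record["record_hash"]
    return True


raw = bytes.fromhex("0001000000050103020015")  # made-up frame bytes for the example
source_hash = sha256_hex(raw)
decoded = derive(source_hash, "decode", {"ntu": 0.21})
reported = derive(decoded["record_hash"], "report", {"ntu": 0.21, "unit": "NTU"})
chain = [decoded, reported]
```

Changing any payload, or substituting different raw bytes, makes `verify_chain` return `False`, which is the check a reviewer would run to confirm that a reported value still traces to its original register read.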

Retention is itself a regulated obligation, and the architecture must retain each artifact for at least the statutory minimum. Representative periods drawn from 40 CFR Part 141 recordkeeping requirements:

Record class	Statutory minimum	Notes
Microbiological analytical results	5 years	Bacteriological sampling records
Chemical analytical results	10 years	Including DBP and inorganic results
Sanitary survey reports	10 years	Or as directed by the primacy agency
Records of corrective action	3 years after correction	Treatment-technique follow-up
Variance / exemption documentation	5 years after expiry	Retain through the granting period

Storing raw payload hashes alongside their metadata in a write-once repository gives environmental compliance teams defensible evidence during state or federal reviews and lets the pipeline prove, cryptographically, that a reported value was not altered after the fact.
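
Retention periods lend themselves to the same data-driven treatment as limits. The sketch below encodes the representative periods from the table as a lookup and computes the earliest date a record could leave the store. The class names and the `may_dispose` helper are hypothetical, and the anchor date must be chosen per class (result date, correction date, or expiry date) by the caller.

```python
from datetime import date

# Minimum retention in years, keyed by record class (values from the table above).
RETENTION_YEARS = {
    "microbiological_result": 5,
    "chemical_result": 10,
    "sanitary_survey": 10,
    "corrective_action": 3,    # counted from the correction date
    "variance_exemption": 5,   # counted from the expiry date
}


def add_years(anchor: date, years: int) -> date:
    """Shift a date by whole years, mapping 29 February to 28 February if needed."""
    try:
        return anchor.replace(year=anchor.year + years)
    except ValueError:
        return anchor.replace(year=anchor.year + years, day=28)


def earliest_disposal(record_class: str, anchor: date) -> date:
    return add_years(anchor, RETENTION_YEARS[record_class])


def may_dispose(record_class: str, anchor: date, today: date) -> bool:
    """True once the minimum retention period has fully elapsed."""
    return today >= earliest_disposal(record_class, anchor)
```

A primacy agency may direct longer retention, so a lookup like this is best read as a floor that configuration can raise.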

Implementation Standards & Tooling

For municipal developers and automation engineers, implementing this architecture requires strict adherence to data engineering and cybersecurity best practices. The reference stack is Python-centric because the surrounding ecosystem — protocol libraries, dataframe tooling, and schema validators — is mature and well understood by utility automation teams. Validation layers should use schema enforcement libraries such as Pydantic to enforce type safety and required regulatory fields before data reaches the reporting tier, so that a record missing a method_code or carrying an out-of-range value is rejected at the boundary rather than in a report.

A compact contract model makes the enriched compliance record self-validating:

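```python
from datetime import datetime, timedelta
from enum import Enum

from pydantic import BaseModel, Field, field_validator


class QualityFlag(str, Enum):
    GOOD = "GOOD"
    SUSPECT = "SUSPECT"
    BAD = "BAD"
    INTERPOLATED = "INTERPOLATED"
    OFFLINE = "OFFLINE"


class ComplianceRecord(BaseModel):
    record_id: str        # UUID string; idempotency key and audit anchor
    contaminant_id: str   # resolves to the MCL reference row
    value: float          # measured result in canonical units
    unit: str
    method_code: str      # EPA analytical method that produced the value
    sample_ts: datetime
    location_id: str      # monitoring location for LRAA grouping
    quality_flag: QualityFlag
    calibration_ref: str
    # A SHA-256 hex digest is always 64 characters long.
    source_hash: str = Field(min_length=64, max_length=64)

    @field_validator("sample_ts")
    @classmethod
    def _must_be_utc(cls, ts: datetime) -> datetime:
        # utcoffset() is None for naive timestamps and non-zero for any zone
        # other than UTC; both cases are rejected.
        if ts.utcoffset() != timedelta(0):
            raise ValueError("sample_ts must be timezone-aware UTC")
        return ts

    def eligible_for_compliance(self) -> bool:
        """Only clean or interpolated readings may drive a calculation.

        Any further condition on INTERPOLATED readings is left to the caller.
        """
        return self.quality_flag in {QualityFlag.GOOD, QualityFlag.INTERPOLATED}
```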

Beyond schema enforcement, three practices form the operational baseline for any production deployment. First, version-controlled configuration management: the MCL reference values, averaging rules, and primacy overrides all live in tracked configuration so that a regulatory change is reviewable and reversible. Second, automated regression testing against historical compliance datasets, so that a rule or code change is proven not to alter previously reported results before it ships; this is especially important for the shared logic consumed by the Violation Detection Rule Engine. Third, immutable audit logging wired into CI/CD, so that every deployment is itself an auditable event. Long-running enrichment and validation jobs are typically dispatched through an asynchronous worker tier — see async batch processing setup — which lets backfill and reprocessing run without blocking real-time ingestion.

Jurisdictional & Primacy Variations

Water utilities frequently navigate overlapping federal mandates and state-specific primacy requirements. Because states authorized to implement the SDWA may adopt limits more stringent than the federal baseline, a compliance architecture must abstract jurisdictional variations into configurable rule sets rather than hard-coded logic. State primacy parameters — MCL values, approved treatment techniques, monitoring frequencies, and reporting formats — should be stored in version-controlled configuration tables that the pipeline resolves at runtime for each service area’s governing primacy agency.

Parameter	Federal baseline	State override example	Resolution
Arsenic MCL	0.010 mg/L	0.005 mg/L (NJ)	Per-primacy table lookup
DBP reporting format	EPA SDWIS flat file	State portal XML	Per-primacy formatter
Turbidity reporting frequency	Monthly	Monthly + state quarterly summary	Additive schedule merge
Public notification tier timing	Federal Tier 1/2/3	State-shortened Tier 2 window	Min of federal / state deadline

Runtime resolution follows a clear precedence: the pipeline loads the federal rule, then applies the primacy overlay for the record’s primacy_id, taking the more stringent value or the more demanding deadline wherever the two disagree. This keeps the core pipeline logic unchanged when a primacy agency updates a limit without an accompanying federal rulemaking — the change is a new configuration row, reviewed and version-controlled, not a code deployment. The same resolution pattern governs how exceedances become reportable events in Violation Code Classification, where a raw exceedance is mapped to both an EPA reporting category and any state-specific enforcement code, a translation examined in detail in translating EPA violation codes to internal alerts. Together, configurable rule sets and runtime primacy resolution let a single codebase serve utilities across jurisdictions without forking, while preserving the deterministic lineage that every SDWA review demands.
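
The precedence rule can be sketched as a merge that keeps the stricter of two values. Everything named here is illustrative: `STATE_X` and `STATE_Y` are placeholder primacy identifiers, the overlay numbers are invented for the example, and the sketch assumes that a lower number is the stricter one for every parameter shown. A parameter where higher is stricter, such as a minimum residual, would need its own comparison.

```python
# Federal baseline for two parameters where a lower number is stricter.
FEDERAL = {"arsenic_mcl_mg_l": 0.010, "tier2_notice_days": 30}

# Hypothetical primacy overlays; identifiers and values are placeholders.
OVERLAYS = {
    "STATE_X": {"arsenic_mcl_mg_l": 0.005, "tier2_notice_days": 14},
    "STATE_Y": {"arsenic_mcl_mg_l": 0.020},  # looser than federal, must not win
}


def resolve(primacy_id: str) -> dict:
    """Apply a primacy overlay on the federal rule, keeping the stricter value."""
    resolved = dict(FEDERAL)
    for name, state_value in OVERLAYS.get(primacy_id, {}).items():
        if name in resolved:
            resolved[name] = min(resolved[name], state_value)
        else:
            resolved[name] = state_value  # state-only parameter
    return resolved


state_x_rules = resolve("STATE_X")
state_y_rules = resolve("STATE_Y")
```

Taking the minimum, instead of letting the overlay overwrite the baseline, means a mistyped or outdated state row can never loosen a federal limit.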
