Hardening RUT acceptance: strict mode and safe processing
Chilean RUT acceptance gate that survives placeholders, unicode dashes, and Ley 19.628 PII logging — with `strict` mode and rut.ts safe-mode helpers.
A permissive Chilean RUT validator is a foothold. When the acceptance layer treats "looks like a RUT" as "is a valid RUT", it lets placeholders bypass identity checks, duplicate accounts pile up, and analytics pipelines accumulate garbage silently. The hardening recipe is short and explicit: strict mode at the boundary, and safe-mode helpers everywhere downstream where you cannot guarantee the input has already been vetted.
The acceptance pipeline in three lines

    import { isRutLike, validate } from "rut.ts";

    export const acceptRutStrict = (input: unknown): string | null =>
      typeof input === "string" && isRutLike(input) && validate(input, { strict: true })
        ? input
        : null;

Type guard, shape check, strict semantic gate, composed into a single zero-dependency function. Drop this into every API endpoint that accepts a RUT, and the rest of the post explains why each layer is load-bearing.
Threat model
Real production systems encounter several categories of RUT abuse, none of them theoretical.
Placeholder RUTs are the most common. Values like 11.111.111-1, 22.222.222-2, and 99.999.999-9 have valid Modulo 11 verifier digits — bots, test scripts, and users who want to skip identity checks all reach for them. The plain validate() call returns true for all of them.
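To see why the checksum alone cannot catch these, here is a minimal standalone sketch of the standard Modulo 11 computation. The name `computeVerifier` is made up for this example and is not part of rut.ts; it only illustrates that repeated-digit bodies produce perfectly ordinary verifier digits.

```typescript
// Standalone illustration of the Modulo 11 verifier; not rut.ts's own code.
// `body` is the numeric part of the RUT with dots and verifier removed.
function computeVerifier(body: string): string {
  if (!/^\d+$/.test(body)) throw new Error("body must contain only digits");
  let sum = 0;
  let multiplier = 2;
  // Walk the digits right to left, cycling the multiplier 2,3,4,5,6,7,2,3,...
  for (let i = body.length - 1; i >= 0; i--) {
    sum += Number(body[i]) * multiplier;
    multiplier = multiplier === 7 ? 2 : multiplier + 1;
  }
  const result = 11 - (sum % 11);
  if (result === 11) return "0";
  if (result === 10) return "K";
  return String(result);
}

// The placeholder bodies from the paragraph above.
const placeholderBodies = ["11111111", "22222222", "99999999"];
const placeholderVerifiers = placeholderBodies.map(computeVerifier); // ["1", "2", "9"]
```

Each placeholder's verifier matches its repeated digit, which is exactly why a checksum-only validator waves them through.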
Encoding games are subtler. A string that passes a naive regex can contain unicode dashes (‐, –, —) instead of the ASCII hyphen, zero-width characters between digits, or trailing whitespace that breaks an exact match against a stored canonical form.
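When a RUT-looking string keeps failing an exact match, it can help to list the characters that are not plain ASCII. The helper below is a hypothetical debugging aid, not a rut.ts function, and it reports only the offending code points, never the digits themselves.

```typescript
// Hypothetical diagnostic: labels every character outside printable ASCII
// as "U+XXXX". Digits and letters are never echoed back.
function listNonAsciiCodePoints(input: string): string[] {
  const found: string[] = [];
  for (const ch of input) {
    const cp = ch.codePointAt(0);
    if (cp === undefined) continue;
    if (cp < 0x20 || cp > 0x7e) {
      const hex = cp.toString(16).toUpperCase();
      // Left-pad to at least four hex digits, the usual U+ notation.
      found.push("U+" + (hex.length < 4 ? "0".repeat(4 - hex.length) + hex : hex));
    }
  }
  return found;
}

const withUnicodeHyphen = "12.345.678\u20105"; // U+2010 HYPHEN instead of "-"
const withZeroWidth = "12.345\u200B.678-5"; // zero-width space between digits

const hyphenReport = listNonAsciiCodePoints(withUnicodeHyphen); // ["U+2010"]
const zeroWidthReport = listNonAsciiCodePoints(withZeroWidth); // ["U+200B"]
```

Both strings look identical to a clean RUT when rendered, but neither contains the ASCII hyphen-and-digits sequence an exact comparison expects.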
Type confusion happens at API boundaries. "rut": 12345678 in a JSON body is a number, and calling .toUpperCase() or .replace() on it without a type guard throws a TypeError. The same goes for null, undefined, or an array value.
Case and verifier confusion round out the picture. The letter verifier K is sometimes entered as k, and whether that passes or fails depends on where normalization runs relative to validation. These categories combine: a single submission using k, a unicode dash, and a placeholder body defeats any validator that does not address all three layers.
The acceptance pipeline
The reliable pattern is three layers applied in order, each cheap enough that the cost is negligible.
The first layer is a type check. Reject anything that is not a string. This protects every downstream call from type errors and ensures your validator never touches null, undefined, or a numeric body parsed from JSON.
The second layer is a shape check using isRutLike. It confirms the string has RUT structure — digits, optional dot grouping, a hyphen, a verifier character — without computing the checksum. Encoding garbage and clearly malformed strings are eliminated before the heavier check runs.
The third layer is the semantic gate: validate with { strict: true }. This verifies the Modulo 11 checksum and rejects repeated-digit placeholders. A value that passes all three layers is well-formed, checksum-valid, and not a placeholder. The gate returns it exactly as submitted, so it is not yet in canonical form.

    import { isRutLike, validate } from "rut.ts";

    export function acceptRutStrict(input: unknown): string | null {
      if (typeof input !== "string") return null;
      if (!isRutLike(input)) return null;
      if (!validate(input, { strict: true })) return null;
      return input;
    }

Each rejection is a null return rather than a throw. The function is designed for contexts where invalid input is expected and handled without exception overhead. When you need an exception, throw at the call site, not inside the gate.
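Throwing at the call site might look like the sketch below. `requireRut` and `InvalidRutError` are hypothetical names introduced for illustration; the gate is passed in as a parameter so the wrapper works with acceptRutStrict or any function of the same shape.

```typescript
// Signature shared by acceptRutStrict: the accepted string, or null on rejection.
type RutGate = (input: unknown) => string | null;

// Hypothetical error type for this sketch; any application error would do.
class InvalidRutError extends Error {
  constructor() {
    // The rejected value is deliberately left out of the message
    // so it cannot leak into logs or error reporters.
    super("Invalid RUT");
    this.name = "InvalidRutError";
  }
}

// Call-site wrapper: converts the gate's null into an exception.
function requireRut(input: unknown, gate: RutGate): string {
  const accepted = gate(input);
  if (accepted === null) throw new InvalidRutError();
  return accepted;
}
```

A request handler would call something like `requireRut(body.rut, acceptRutStrict)` and map the exception to a 400 response, while batch code keeps using the null-returning gate directly.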
Safe-mode helpers for downstream code
Once a RUT clears the acceptance pipeline, clean() and decompose() normalize it for storage or display. In those code paths the input is trusted, and throwing on invalid input is reasonable — the value should never have reached that point.
But not all code paths start from a trusted source. Log parsers, message-queue consumers, ETL pipelines, and reporting jobs often receive RUT-shaped strings from external systems that were not validated by your acceptance layer. In those contexts you want normalization that degrades gracefully rather than throwing.
All rut.ts decomposition helpers accept { throwOnError: false }. With that option, an invalid or malformed input returns null instead of raising an exception. The caller branches on null and continues processing other records.

    import { clean, decompose } from "rut.ts";

    export function safeNormalize(input: string) {
      const cleaned = clean(input, { throwOnError: false });
      if (!cleaned) return null;
      return decompose(cleaned, { throwOnError: false });
    }

The pattern is: attempt to clean, short-circuit on null, then decompose. If either step fails the function returns null and the caller decides whether to skip the record, log a warning, or increment a counter. No try/catch is needed for expected failure modes.
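On the caller's side, branching on null could look like this sketch. `normalizeBatch` is a hypothetical helper; the normalizer is injected so the example stays self-contained, and in practice it would be safeNormalize or something with the same null-on-failure contract.

```typescript
// Shape of a normalizer such as safeNormalize: a result, or null on bad input.
type Normalizer<T> = (input: string) => T | null;

interface BatchResult<T> {
  normalized: T[];
  skipped: number;
}

// Processes every record; a malformed one bumps a counter instead of aborting the run.
function normalizeBatch<T>(records: string[], normalize: Normalizer<T>): BatchResult<T> {
  const normalized: T[] = [];
  let skipped = 0;
  for (const record of records) {
    const result = normalize(record);
    if (result === null) {
      skipped += 1; // count only; the raw value is never logged
      continue;
    }
    normalized.push(result);
  }
  return { normalized, skipped };
}
```

The skipped count gives operators a signal about upstream data quality without a single exception or raw RUT reaching the logs.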
Logging without leaking PII
RUT is personally identifiable information under Ley 19.628, Chile's data protection law. That means logging a raw RUT in error messages, request traces, or analytics events is a compliance issue, not just a hygiene preference.
The practical rule: never log the raw value. If you need to correlate a log entry with a specific RUT for debugging, mask it down to the last two digits of the body plus the verifier digit, something like ••••••78-5. That fragment is not unique, but combined with a timestamp or request ID it is usually enough to find the record in a database, and it reveals little on its own.
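The masking rule above can be sketched as a small pure function. `maskRut` and the `[invalid-rut]` fallback are assumptions made for this example rather than rut.ts exports.

```typescript
const BULLET = "\u2022"; // "•"
const INVALID_MASK = "[invalid-rut]";

// Keeps the last two body digits and the verifier; masks everything else.
// Anything that is not a string, or is too short to mask, gets a fixed
// placeholder so the original value never appears in the output.
function maskRut(input: unknown): string {
  if (typeof input !== "string") return INVALID_MASK;
  // Keep only digits and the K verifier; dots, dashes, and spaces are dropped.
  const compact = input.replace(/[^0-9kK]/g, "").toUpperCase();
  if (compact.length < 4) return INVALID_MASK;
  const body = compact.slice(0, -1);
  const verifier = compact.slice(-1);
  return BULLET.repeat(body.length - 2) + body.slice(-2) + "-" + verifier;
}

const maskedExample = maskRut("12.345.678-5"); // "••••••78-5"
```

Because the function accepts unknown and never throws, it is safe to call inside error handlers, where the value being masked may be the very thing that failed validation.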
Error reporters like Sentry require explicit configuration. The two safe approaches are scrubbing the field at the SDK layer using beforeSend, or stripping the value before it is serialized into an error object. Relying on Sentry's default scrubbing is not sufficient: the defaults target passwords and credit card numbers, not RUTs. A raw rut property on any error object will be captured.
The same principle applies to structured logging. A logger that serializes the full request body will include the RUT. Mask before passing values to any logging call.
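One way to apply this to both error reporters and structured loggers is a scrubber that runs before serialization. The sketch below is hypothetical: `scrubRutFields` and the default field name `rut` are assumptions about your payload shape, and it expects plain, acyclic, JSON-like data. Its output could be assigned back inside a Sentry `beforeSend` callback or passed to a logger in place of the raw request body.

```typescript
const REDACTED = "[redacted-rut]";

// Returns a deep copy of `value` in which every property whose name is in
// `keys` is replaced by a fixed marker. The input is not mutated.
// Assumes plain JSON-like data: no cycles, no class instances.
function scrubRutFields(
  value: unknown,
  keys: ReadonlySet<string> = new Set(["rut"])
): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => scrubRutFields(item, keys));
  }
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      out[key] = keys.has(key) ? REDACTED : scrubRutFields(source[key], keys);
    }
    return out;
  }
  return value; // primitives pass through unchanged
}
```

Matching on field names is a deliberate simplification: it catches the common case of a `rut` property but not a RUT embedded in a free-text message, which still needs masking at the point where the message is built.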
Pitfalls
Forgetting strict and accepting placeholders. validate('11.111.111-1') returns true. The Modulo 11 algorithm does not distinguish real bodies from repeated-digit sequences. Without strict: true, your signup form, your identity verification flow, and your deduplication logic will all see 11.111.111-1 as a valid RUT and create real records for it.
Client-only validation. The client-side check is a UX aid. Any caller with curl or a browser DevTools console can send an arbitrary body. Re-validate with strict: true on every server endpoint that accepts a RUT, regardless of what the client already checked.
Throwing in code paths where invalid input is expected by design. Log scanners, queue consumers, and ETL jobs process input from systems that may not have validated it. Using the default throwing behavior in those contexts means one malformed record halts processing of everything that follows it. Use { throwOnError: false } and branch on null.
Caching invalid normalizations. If you cache the result of clean() keyed on the raw input string, and the raw input is invalid, you will store a null value in the cache and serve it to subsequent callers. Run validate() first; only cache the cleaned result when validation passes.
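A validate-then-cache wrapper might be sketched as follows. `makeCleanCache` is a hypothetical helper; the validator and cleaner are injected, standing in for rut.ts's validate() in strict mode and clean() in safe mode.

```typescript
type Validator = (input: string) => boolean;
type Cleaner = (input: string) => string | null;

// Builds a cache that stores a cleaned value only after validation passes.
function makeCleanCache(isValid: Validator, cleanRut: Cleaner) {
  const cache = new Map<string, string>();
  return {
    get(raw: string): string | null {
      const hit = cache.get(raw);
      if (hit !== undefined) return hit;
      if (!isValid(raw)) return null; // invalid input is never cached
      const cleaned = cleanRut(raw);
      if (cleaned === null) return null; // defensive: cleaner disagreed with validator
      cache.set(raw, cleaned);
      return cleaned;
    },
    size: (): number => cache.size,
  };
}
```

Since the Map is typed to hold strings only, a null can never be stored, so a later caller cannot be served a cached failure.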
Further reading
- Install rut.ts: pnpm add rut.ts, zero dependencies
- Quick start: wire acceptRutStrict into a real endpoint in five minutes
- Security guide
- validate() reference
- isRutLike() reference
- Validate Chilean RUT in NestJS with class-validator
- The right way to store Chilean RUTs in your database
- Stop hardcoding "11.111.111-1": generating test RUTs with rut.ts