Releases · ekscrypto/SwiftEmailValidator

26 Apr 14:52

1.7.0

94fb51b

1.7.0 — SwiftEmailValidatorIDNA + UTS #46 §4 V1-V7

New opt-in companion target that layers UTS #46 IDNA Compatibility Processing on the host portion of the address. It mirrors the SwiftEmailValidatorUTS39 architecture: it is imported separately (import SwiftEmailValidatorIDNA), so the ~385 KB of UCD-derived data is not bundled into callers that don't need it.

Highlights

Full UTS #46 §4 pipeline — Map / NFC / Break into labels / Validate / ToASCII.
All V1-V7 validity criteria enforced by default: V1 NFC, V2 hyphen rules (with xn-- carve-out), V3 leading combining mark rejection, V4/V5 per-scalar status + STD3 LDH gate, V6 CheckBidi, V7 CheckJoiners.
RFC 3492 Punycode encoder/decoder with overflow guards on every multiply/add.
RFC 5893 §2 Bidi rule (UTS #46 V6). Domain-wide trigger per §1.4 — pure-LTR siblings of RTL labels are still gated.
RFC 5892 §A.1 / §A.2 CONTEXTJ rules (UTS #46 V7). ZWJ/ZWNJ allowed only in legitimate joining contexts; misuse — a known homograph attack vector — is rejected.
RFC 5892 §A.3-§A.9 CONTEXTO layered on top of UTS #46 as a default-on security extension. Catches Catalan middle dot, Greek keraia, Hebrew geresh/gershayim, Katakana middle dot, and mixed Arabic-Indic / Extended Arabic-Indic digit homographs. Opt out with IDNA.Options.checkContextO: false for strict UTS #46-only conformance.
VerifyDnsLength + UseSTD3ASCIIRules enforced at the validator layer. STD3 is specifically required because the modern preprocessed mapping table marks non-LDH ASCII as valid (with NV8).
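
The overflow-guarded Punycode encoding listed above can be illustrated with a minimal, self-contained sketch of the RFC 3492 encoding procedure for a single label. This is not the package's implementation: the function name punycodeEncode and the choice to return nil on overflow are assumptions made for illustration, and the sketch omits the xn-- prefix as well as every UTS #46 mapping and validation step.

```swift
/// Minimal RFC 3492 Punycode encoder for one label (illustrative sketch).
/// Returns nil if the delta arithmetic would overflow UInt32.
func punycodeEncode(_ label: String) -> String? {
    let base: UInt32 = 36, tMin: UInt32 = 1, tMax: UInt32 = 26
    let skew: UInt32 = 38, damp: UInt32 = 700
    let input = label.unicodeScalars.map { $0.value }

    // Digit values 0...25 map to "a"..."z", 26...35 map to "0"..."9".
    func digit(_ d: UInt32) -> Character {
        Character(Unicode.Scalar(d < 26 ? d + 97 : d - 26 + 48)!)
    }
    // Bias adaptation, RFC 3492 section 6.1.
    func adapt(_ delta: UInt32, _ numPoints: UInt32, _ first: Bool) -> UInt32 {
        var d = first ? delta / damp : delta / 2
        d += d / numPoints
        var k: UInt32 = 0
        while d > ((base - tMin) * tMax) / 2 {
            d /= base - tMin
            k += base
        }
        return k + (base - tMin + 1) * d / (d + skew)
    }

    // Copy the basic (ASCII) code points first, then the delimiter.
    var output = ""
    for scalar in label.unicodeScalars where scalar.isASCII {
        output.unicodeScalars.append(scalar)
    }
    let basicCount = UInt32(output.unicodeScalars.count)
    var handled = basicCount
    if basicCount > 0 { output.append("-") }

    var n: UInt32 = 128, delta: UInt32 = 0, bias: UInt32 = 72
    while handled < UInt32(input.count) {
        // Smallest not-yet-handled code point.
        guard let m = input.filter({ $0 >= n }).min() else { return nil }
        // delta += (m - n) * (handled + 1), with both steps guarded.
        let (product, mulOverflow) = (m - n).multipliedReportingOverflow(by: handled + 1)
        let (sum, addOverflow) = delta.addingReportingOverflow(product)
        if mulOverflow || addOverflow { return nil }
        delta = sum
        n = m
        for c in input {
            if c < n {
                let (next, overflow) = delta.addingReportingOverflow(1)
                if overflow { return nil }
                delta = next
            }
            if c == n {
                // Emit delta as a generalized variable-length integer.
                var q = delta
                var k = base
                while true {
                    let t = k <= bias ? tMin : (k >= bias + tMax ? tMax : k - bias)
                    if q < t { break }
                    output.append(digit(t + (q - t) % (base - t)))
                    q = (q - t) / (base - t)
                    k += base
                }
                output.append(digit(q))
                bias = adapt(delta, handled + 1, handled == basicCount)
                delta = 0
                handled += 1
            }
        }
        let (next, overflow) = delta.addingReportingOverflow(1)
        if overflow { return nil }
        delta = next
        n += 1
    }
    return output
}

let encoded = punycodeEncode("m\u{FC}nchen")   // "mnchen-3ya"
```

Prefixing the result with xn-- gives the xn--mnchen-3ya label shown in the API surface example. The reporting-overflow arithmetic from the standard library keeps each guard explicit instead of relying on a runtime trap.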

Conformance

IdnaTestV2DriverTests runs the official Unicode IdnaTestV2.txt (v17.0.0) end-to-end across toUnicode, toAsciiN (Nontransitional), and toAsciiT (Transitional). All status code families (Pn, Vn, An, Bn, Cn, Xn, U1) are in scope: any row carrying any of them must be rejected. 0 failures across >1000 vectors × 3 operations.

API surface

import SwiftEmailValidatorIDNA

// Direct UTS #46 use:
IDNA.toAscii("münchen.de")     // "xn--mnchen-3ya.de"
IDNA.toUnicode("xn--mnchen-3ya.de")  // "münchen.de"

// Plug into EmailSyntaxValidator:
EmailSyntaxValidator.correctlyFormatted(
    "user@münchen.de",
    idna: IDNA.Options())  // defaults: nontransitional, all checks on

Compatibility

No breaking changes. 1.6.x remains source-compatible — the new IDNA target is opt-in via a separate import.

Full changelog

See CHANGELOG.md for the complete entry, including the tooling and benchmark scope-grading changes.

26 Apr 05:22

ekscrypto

1.6.1

abac85c

1.6.1 — Default_Ignorable hardening + RFC compliance fixes

Security/correctness release. All findings reduce permissiveness; no API surface changed. Users on 1.6.0 should upgrade.

Security

Default_Ignorable spoofing closure (RFC 5892 §2.6)

CharacterSet.letters on Darwin admits a number of Default_Ignorable scalars that produce no glyph and are DISALLOWED in IDNA2008. Several slipped through both the local-part and domain-label gates in 1.6.0:

Domain label path now rejects U+3164 HANGUL FILLER, U+FE0F VS-16, U+E0100 VS-17 (SSP), U+180B–U+180F MONGOLIAN FREE VARIATION SELECTORS, U+115F/U+1160 HANGUL CHOSEONG/JUNGSEONG FILLERS, U+17B4–U+17B5 KHMER VOWEL INHERENT, and U+FFA0 HALFWIDTH HANGUL FILLER. PVALID combining marks (e.g. U+05B0 HEBREW POINT SHEVA) remain accepted; a canary test pins this.
Local-part path now rejects U+034F COMBINING GRAPHEME JOINER, the SMP U+1BCA0–U+1BCA3 SHORTHAND FORMAT controls, the U+1D173–U+1D17A MUSICAL SYMBOL formatting controls, and the reserved U+FFF0–U+FFF8 block.
Leading combining marks rejected. A label starting with an Mn/Mc/Me scalar (e.g. lone U+0301) is now rejected per-label. Mid-label combining marks remain accepted so legitimate diacritics still validate.
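
A leading-combining-mark check of the kind described above can be sketched with the standard library's Unicode properties. The function name startsWithCombiningMark is hypothetical and the library's actual gate may be structured differently; the sketch only shows that the test applies to the first scalar of a label, so mid-label marks are unaffected.

```swift
/// Returns true when the label's first scalar has General_Category Mn, Mc, or Me.
/// Hypothetical helper for illustration; not the library's API.
func startsWithCombiningMark(_ label: String) -> Bool {
    guard let first = label.unicodeScalars.first else { return false }
    switch first.properties.generalCategory {
    case .nonspacingMark, .spacingMark, .enclosingMark:
        return true
    default:
        return false
    }
}

let loneAccent = startsWithCombiningMark("\u{0301}abc")   // true: label begins with U+0301
let midLabel = startsWithCombiningMark("cafe\u{0301}")    // false: the mark follows a base letter
```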

Validator behaviour fixes

Empty quoted local part rejected for parity with the dot-atom path (RFC 5321 §3.3).
.arpa rejected by TLDDomainValidator per RFC 3172 (DNS infrastructure only).
TLDDomainValidator.isPubliclyDeliverable(_:) two-layer split. The public form now trims surrounding whitespace and folds U+3002/U+FF0E/U+FF61 to ASCII . before dispatching to a raw _isPubliclyDeliverable(_:) worker. Hardens the API in isolation; end-to-end pipeline through EmailSyntaxValidator was already protected.
IPv6 regex case + leading-zero gaps. _matchIPv6 accepted embedded-IPv4 octets with leading zeros (e.g. 192.168.001.001) and rejected [IPv6:::FFFF:1.2.3.4] because the IPv4-mapped prefix was hardcoded lowercase. Both fixed (RFC 4291 §2.2 / RFC 3986 §3.2.2).
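
The two-layer split described for isPubliclyDeliverable(_:) can be pictured as a small normalization step that runs before the raw worker. The sketch below is an assumption-laden illustration: normalizedDomainInput is a hypothetical name, and trimming newlines along with spaces is a guess, since the release note only says "surrounding whitespace".

```swift
import Foundation

/// Trims surrounding whitespace and folds the ideographic, fullwidth, and
/// halfwidth full stops (U+3002, U+FF0E, U+FF61) to ASCII ".".
/// Hypothetical helper; the library's public wrapper may differ in detail.
func normalizedDomainInput(_ raw: String) -> String {
    let trimmed = raw.trimmingCharacters(in: .whitespacesAndNewlines)
    var folded = String.UnicodeScalarView()
    for scalar in trimmed.unicodeScalars {
        switch scalar.value {
        case 0x3002, 0xFF0E, 0xFF61:
            folded.append(".")
        default:
            folded.append(scalar)
        }
    }
    return String(folded)
}

let cleaned = normalizedDomainInput("  example\u{3002}com ")   // "example.com"
```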

RFC 2047 encoder/decoder hardening

75-octet cap enforced on encoder output. The decoder rejected over-length encoded-words but the encoder did not, so long Unicode inputs auto-encoded then re-decoded silently failed to round-trip. encode() now returns nil if the assembled =?utf-8?b?<base64>?= exceeds 75 chars.
Base64 residue==1 rejected explicitly instead of relying on Foundation to reject the malformed === padding.
Encoded-text grammar tightened to 1*<text> per RFC 2047 §2 — rejects empty payloads (=?utf-8?b??=) and literal ? in encoded-text.
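
The 75-character cap on encoder output can be sketched as follows. This is a simplified stand-in for the library's encode(), assuming a single B-encoded word with a fixed utf-8 charset; the function name encodeWord and the empty-input guard are assumptions made for illustration.

```swift
import Foundation

/// Builds one RFC 2047 B-encoded word and refuses to return it when it
/// exceeds the 75-character limit for an encoded-word.
/// Hypothetical helper; not the library's encode() implementation.
func encodeWord(_ text: String) -> String? {
    // Assumption: an empty payload is rejected, mirroring the 1*<text> grammar.
    guard !text.isEmpty else { return nil }
    let payload = Data(text.utf8).base64EncodedString()
    let word = "=?utf-8?b?\(payload)?="
    return word.utf8.count <= 75 ? word : nil
}

let short = encodeWord("\u{E9}")                            // "=?utf-8?b?w6k=?="
let tooLong = encodeWord(String(repeating: "a", count: 46)) // nil: 64 + 12 = 76 characters
```

The wrapper contributes 12 characters, so at most 60 Base64 characters (45 payload octets) fit in a single encoded-word under this scheme.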

UTS #39 hardening

§5.2 Moderately Restrictive: second-script pool restricted to UAX #31 Recommended scripts. Closes a lenient-mode bypass where Latin + Phoenician/Limbu/etc. passed when rejectRestrictedIdentifiers=false.
§5.1 Augmented_Script_Set applied in Single Script analysis. Pure-Japanese (Han+Hira+Kana, no Latin), pure-Korean (Han+Hang), pure-Chinese-with-Bopomofo (Han+Bopo) and Hira+Kana strings were misclassified as multi-script at .singleScript.

Documentation

.asciiWithUnicodeExtension documented as a project convention (whole-address RFC 2047 wrap), not standards-conformant SMTPUTF8.
domainLabelCharacterSet documented as a coarse Letter+digit gate, not RFC 5891 §4.2.3.2 PVALID.
UTS #39 docstrings corrected (RestrictionLevel.highlyRestrictive combos now match §5.2.2 Table 1; non-existent §5.2.1/§5.2.2/§5.2.3 anchors switched to §5.2's named-bullet form).
UTS #39 out-of-scope sections documented (§5.6.1 Whole-Script Confusables, §5.7.1 Mixed-Numbers, Identifier_Type=Not_NFKC).
Domain-length comment now cites RFC 5321 §4.5.3.1.2 as the headline (255-octet wire cap) and explains how 253 is the derived presentation-form ceiling.

Tooling

Tools/generate_tlds.py switched to PyPI idna (IDNA2008 + RFC 3492 Punycode) from stdlib encodings.idna (deprecation-flagged). Generated output is byte-identical today; no consumer impact.

Tests

Test count grew from 272 to 299 (all passing).
Multiple weak assertions tightened across the suite.
DemoApp test corpus extended with 34 Default_Ignorable spoofing cases.

See CHANGELOG.md for the full entry.

26 Apr 05:22

ekscrypto

1.6.0

568dc0f

1.6.0 — drop SwiftPublicSuffixList for bundled IANA TLD validator

Removed

SwiftPublicSuffixList dependency. The package no longer pulls any third-party Swift dependency. The Public Suffix List was the wrong primitive for email validation: it was designed for cookie scoping and its multi-level / PRIVATE-section entries are policy artifacts of specific registries, with weekly churn driven by non-email concerns.

Added

TLDDomainValidator (new public type). Default domain validator used by EmailSyntaxValidator. Confirms the rightmost DNS label is a currently-delegated IANA TLD (ACE xn--… and Unicode U-label forms both accepted) and rejects names reserved by the IETF Special-Use Domain Names registry:
- .test (RFC 6761 §6.2)
- .example, example.com, example.net, example.org (RFC 6761 §6.5)
- .invalid (RFC 6761 §6.4)
- .localhost (RFC 6761 §6.3)
- .local (RFC 6762 — mDNS)
- .onion (RFC 7686 — Tor)
- .alt (RFC 9476)
- home.arpa (RFC 8375)
Subdomains under any of these are also rejected.
Sources/SwiftEmailValidator/Generated/IANATLDs.swift — bundled IANA TLD set (~1,400 ACE + ~150 U-label entries). Auto-generated; do not edit by hand.
Tools/generate_tlds.py — generator that fetches https://data.iana.org/TLD/tlds-alpha-by-domain.txt, expands ACE TLDs to U-labels, and writes the Swift source. Records source URL, fetch timestamp, and SHA-256.
.github/workflows/update-tlds.yml — nightly workflow that refreshes the bundled TLD list and opens a PR if it changed.
TLDDomainValidatorTests — new test class covering real TLDs, fake TLDs, special-use rejection, IDN handling, case insensitivity, trailing root dot, and wiring as the validator default.
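
The special-use rejection described for TLDDomainValidator can be approximated by a suffix check over the domain's labels. The sketch below covers only the reserved names listed above, not the IANA TLD lookup; isSpecialUse and specialUseSuffixes are hypothetical names, and the real validator may organise this differently.

```swift
// Reserved names from the list above (hypothetical table for illustration).
let specialUseSuffixes: Set<String> = [
    "test", "example", "invalid", "localhost", "local", "onion", "alt",
    "example.com", "example.net", "example.org", "home.arpa",
]

/// Returns true when the domain is, or sits under, one of the reserved names.
func isSpecialUse(_ domain: String) -> Bool {
    var labels = domain.lowercased()
        .split(separator: ".", omittingEmptySubsequences: false)
        .map(String.init)
    // Tolerate a single trailing root dot.
    if labels.last == "" { labels.removeLast() }
    // Check every right-anchored suffix, so subdomains are caught too.
    for start in labels.indices {
        let suffix = labels[start...].joined(separator: ".")
        if specialUseSuffixes.contains(suffix) { return true }
    }
    return false
}

let reserved = isSpecialUse("mail.example.com")   // true
let real = isSpecialUse("iana.org")               // false
```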

Changed

Default domainValidator closure on EmailSyntaxValidator.correctlyFormatted and mailbox(from:) switched from { PublicSuffixList.isUnrestricted(PublicSuffixList.ace($0)) } to { TLDDomainValidator.isPubliclyDeliverable($0) }.
UTS39.domainValidator(_:base:) default base closure likewise switched from PSL to TLDDomainValidator.
EmailSyntaxValidator.correctlyFormatted(_:uts39:) and mailbox(from:uts39:) convenience overloads likewise switched.
README & benchmark output rewritten to describe the new default and the rationale for moving off the PSL.

Migration notes

Drop the dependency: remove SwiftPublicSuffixList from your Package.swift. SwiftEmailValidator no longer requires it.
@example.com / @example.net / @example.org now fail the default validator (RFC 6761 §6.5). If your tests or sample addresses used these, switch to a real public domain (@iana.org is stable) or pass a permissive domainValidator: { _ in true }.
@localhost, @host.local, intranet domains also fail the default. Pass a custom domainValidator closure if your application accepts these — see "Domain validation" in the README.
PSL-based custom rules: if you were calling PublicSuffixList.isUnrestricted($0, rules: customRules), replace with your own closure (the test suite has examples of a simple TLD-allowlist closure in LocalPartValidatorHookTests).
Newly-delegated TLDs: the bundled list ships frozen at the release SHA. The nightly GitHub workflow keeps the canonical copy current; downstream consumers waiting for a tagged release can override domainValidator with their own check or run python3 Tools/generate_tlds.py and ship the regenerated file.
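
For applications that accept intranet domains, a custom domainValidator closure might look like the sketch below. The suffix corp.internal and the names allowedIntranetSuffixes and intranetDomainValidator are hypothetical; only the closure shape (String) -> Bool and the domainValidator: parameter label come from these notes.

```swift
// Hypothetical allowlist of internal suffixes your deployment trusts.
let allowedIntranetSuffixes = ["corp.internal"]

/// Accepts a domain only if it equals, or is a subdomain of, an allowed suffix.
let intranetDomainValidator: (String) -> Bool = { domain in
    let lowered = domain.lowercased()
    return allowedIntranetSuffixes.contains { suffix in
        lowered == suffix || lowered.hasSuffix("." + suffix)
    }
}

// Intended call site, assuming the package is imported:
// EmailSyntaxValidator.correctlyFormatted(address, domainValidator: intranetDomainValidator)
let accepted = intranetDomainValidator("mail.corp.internal")   // true
let rejected = intranetDomainValidator("evilcorp.internal")    // false: not a label-boundary match
```

Matching on a leading dot keeps look-alike names such as evilcorp.internal from passing a plain suffix comparison.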

See CHANGELOG.md for the full entry.

23 Apr 20:40

ekscrypto

1.5.0

d12c418

1.5.0 — UTS #39 Unicode Security Mechanisms

SwiftEmailValidatorUTS39 — opt-in companion target

New second library product layering UTS #39 on top of the core validator. Import only what you need:

import SwiftEmailValidator       // unchanged, zero new cost
import SwiftEmailValidatorUTS39  // adds ~280 KB of UCD data + checks

What the addon enforces

Identifier_Status filter — rejects Restricted scripts (Linear B, Runic, Deseret, etc.)
Mixed-script detection — Single Script / Highly Restrictive / Moderately Restrictive per §5.2
§4 confusable skeletons — skeleton-equality against caller-supplied protected forms (opt-in per call)

Usage

Convenience API — one call, both sides of the address checked:

let ok = EmailSyntaxValidator.correctlyFormatted(
    "alice@example.com",
    uts39: UTS39.Policy()  // default: Highly Restrictive
)

Lower-level — compose closures yourself:

EmailSyntaxValidator.correctlyFormatted(
    candidate,
    domainValidator: UTS39.domainValidator(policy),
    localPartValidator: UTS39.localPartValidator(policy)
)

Main-library change (non-breaking)

One new closure parameter on correctlyFormatted and mailbox(from:):

localPartValidator: (String) -> Bool = { _ in true }

It receives the semantic local-part string (dot-atom as-is, quoted-string unescaped). Default preserves existing behavior.

Implementation notes

All Unicode tables generated from UCD 17.0.0 via Sources/SwiftEmailValidatorUTS39/Tools/generate.py
Skeleton algorithm iterates map+NFD to a fixed point (13 of ~6500 confusables.txt entries require up to 3 iterations for idempotence)
Multi-scalar NFD sources (48 entries, e.g. U+01A1 → [U+006F U+031B]) handled via a longest-match prefix table
Restriction Level classification uses per-scalar Script_Extensions ∩ target ≠ ∅ per §5.1 (not union-based)
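
The fixed-point iteration of map + NFD mentioned in the notes above can be illustrated with a toy table. The three Cyrillic-to-Latin entries below are a tiny hand-picked subset used only for illustration, and the names toyConfusables and skeleton, as well as the iteration cap of 8, are assumptions; the real tables are generated from confusables.txt.

```swift
import Foundation

// Toy subset of confusable mappings (illustration only).
let toyConfusables: [Unicode.Scalar: String] = [
    "\u{0430}": "a",   // CYRILLIC SMALL LETTER A
    "\u{043E}": "o",   // CYRILLIC SMALL LETTER O
    "\u{0440}": "p",   // CYRILLIC SMALL LETTER ER
]

/// Applies NFD, then repeats "map every scalar, NFD again" until the scalar
/// sequence stops changing (or a small safety cap is reached).
func skeleton(_ text: String, table: [Unicode.Scalar: String]) -> String {
    var current = text.decomposedStringWithCanonicalMapping
    for _ in 0..<8 {
        var mapped = ""
        for scalar in current.unicodeScalars {
            if let replacement = table[scalar] {
                mapped += replacement
            } else {
                mapped.unicodeScalars.append(scalar)
            }
        }
        let next = mapped.decomposedStringWithCanonicalMapping
        // Compare scalar-by-scalar; String == would treat canonically
        // equivalent forms as equal and could stop too early.
        if Array(next.unicodeScalars) == Array(current.unicodeScalars) {
            return next
        }
        current = next
    }
    return current
}

let spoof = skeleton("p\u{0430}yp\u{0430}l", table: toyConfusables)
let sameSkeleton = spoof == skeleton("paypal", table: toyConfusables)   // true
```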

Test count

242 total (was 164). Bulk regression test walks every entry in the generated confusables table; restriction-level edge cases cover UTS #39 §5.2 examples and ICU itspoof.cpp patterns including Arabic-Indic digits, combining marks, and the Japanese/Korean/Chinese whitelist combos.

23 Apr 19:14

ekscrypto

1.4.1

a219a59

1.4.1 — accept RFC 4291 §2.2 format-2 IPv6 literals

Small patch release. Closes the single syntax-level gap surfaced by the reverse-check added in 1.4.0's Benchmarks harness.

Fixed

IPv6 literal regex now accepts RFC 4291 §2.2 format 2. Six uncompressed hex groups followed by a trailing IPv4-in-dotted-decimal (e.g. aaaa:aaaa:aaaa:aaaa:aaaa:aaaa:127.0.0.1) are now recognised as valid. The upstream regex this validator was derived from only recognised the compressed / IPv4-mapped forms (::ffff:x.x.x.x, 1::5:x.x.x.x). Email addresses such as valid.ipv6v4.addr@[IPv6:aaaa:aaaa:aaaa:aaaa:aaaa:aaaa:127.0.0.1] now validate as expected.

Maximum IPv6 literal length remains 45 octets, which is exactly the existing IPAddressSyntaxValidator public-API length cap, so no guard changes were needed.
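
As a regex-free illustration of the format-2 shape (six hex groups plus a dotted-decimal IPv4 tail), the sketch below parses the literal by splitting on separators. It is not the library's regex: isFormat2IPv6 is a hypothetical name, and rejecting leading zeros in the IPv4 octets is an assumption borrowed from the later 1.6.1 note.

```swift
/// Checks the RFC 4291 §2.2 format-2 shape: x:x:x:x:x:x:d.d.d.d
/// Hypothetical helper for illustration; the library uses a regex instead.
func isFormat2IPv6(_ text: String) -> Bool {
    // Longest possible form is 6 * 4 hex digits + 6 colons + 15 = 45 octets.
    guard text.utf8.count <= 45 else { return false }
    let parts = text.split(separator: ":", omittingEmptySubsequences: false)
    guard parts.count == 7 else { return false }

    let hexDigits = Set("0123456789abcdefABCDEF")
    for group in parts.prefix(6) {
        guard (1...4).contains(group.count),
              group.allSatisfy({ hexDigits.contains($0) }) else { return false }
    }

    let octets = parts[6].split(separator: ".", omittingEmptySubsequences: false)
    guard octets.count == 4 else { return false }
    for octet in octets {
        guard (1...3).contains(octet.count),
              octet.allSatisfy({ $0.isASCII && $0.isNumber }),
              octet == "0" || octet.first != "0",   // assumption: no leading zeros
              let value = Int(octet), value <= 255 else { return false }
    }
    return true
}

let longest = "aaaa:aaaa:aaaa:aaaa:aaaa:aaaa:255.255.255.255"
let atBoundary = longest.utf8.count == 45 && isFormat2IPv6(longest)   // true
```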

Tests

Added testIPv6Format2UncompressedWithEmbeddedIPv4 (5 positive cases including the length boundary) and testIPv6Format2RejectsWrongGroupCount (5 negative cases for wrong hex-group count, wrong IPv4 octet count, and out-of-range octets). Test count is 164.

Reverse benchmark

After this fix the --reverse benchmark mode reports 140/144 agreement with competitor test assertions. The remaining four disagreements are all syntax-vs-policy differences (127.0.0.1, 127.0.0.256, 127.0.0.1.26, mailserver as unbracketed hostnames): our syntax layer accepts them because they are legal RFC 1035 labels, and the shipped default domainValidator rejects them as domain policy.

Full changes: 1.4.0...1.4.1

23 Apr 18:59

ekscrypto

1.4.0

a54ff77

1.4.0 — IP validator DoS hardening + benchmark harness

Additive minor release. No behavior change for users of EmailSyntaxValidator.

Added

IPAddressSyntaxValidator public length-capped wrappers. match(_:), matchIPv4(_:), and matchIPv6(_:) now apply a utf8.count guard (15 octets for IPv4, 45 for IPv6) before dispatching to the regex engine. Prior to 1.4.0 a caller invoking IPAddressSyntaxValidator directly with a multi-megabyte string would spend O(n) inside NSRegularExpression before the $ anchor failed, which is a denial-of-service vector. Internal raw matchers _match(_:) / _matchIPv4(_:) / _matchIPv6(_:) retain the pre-1.4.0 behavior for EmailSyntaxValidator's hot path, which is already bounded by the upstream 254 UTF-8 octet address cap.

Benchmarks/ SPM package. A standalone harness runs the 195-case DemoApp corpus through every SPM-consumable Swift email validator we could locate (evanrobertson, MimeEmailParser, bdolewski's regex, jwelton-equivalent via NSDataDetector) and emits a Markdown accuracy table. Published results are in the new "Comparison with other Swift email validators" section of the README. Kept out of the main Package.swift so consumers don't transitively pull competitor deps.
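
The length-capped wrapper pattern described above can be sketched as a cheap utf8.count guard in front of a regex match. The pattern and the names ipv4Pattern, rawMatchIPv4, and cappedMatchIPv4 are illustrative assumptions, not the library's actual regex or method bodies.

```swift
import Foundation

// Illustrative dotted-decimal pattern (an assumption, not the library's regex).
let ipv4Pattern = try! NSRegularExpression(
    pattern: #"^(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}$"#)

/// Raw matcher: hands the whole input to the regex engine, whatever its size.
func rawMatchIPv4(_ text: String) -> Bool {
    let range = NSRange(text.startIndex..., in: text)
    return ipv4Pattern.firstMatch(in: text, options: [], range: range) != nil
}

/// Capped wrapper: rejects anything longer than "255.255.255.255" (15 octets)
/// before the regex engine is ever invoked.
func cappedMatchIPv4(_ text: String) -> Bool {
    guard text.utf8.count <= 15 else { return false }
    return rawMatchIPv4(text)
}

let ok = cappedMatchIPv4("192.168.1.1")                                 // true
let oversized = cappedMatchIPv4(String(repeating: "1", count: 1_000_000)) // false, regex never runs
```

Keeping the raw matcher separate lets a caller that has already bounded its input skip the extra check, which is the split the release describes for EmailSyntaxValidator's hot path.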

Security

The length-capped public wrappers close the only input-length DoS vector found in a manual audit of the library's public API surface. EmailSyntaxValidator users were never exposed. No crashes introduced; correctlyFormatted(_:) behavior is unchanged.

Compatibility

Purely additive: the new public methods wrap previously-internal behavior. No existing API signature or return value changed.
Test suite: 162 passing (was 157).
Minimum Swift / platforms unchanged (Swift 5.5, macOS 10.12+, iOS 11+, tvOS 11+).

Full changes: 1.3.1...1.4.0

23 Apr 05:47

ekscrypto

1.3.1

ed99f92

1.3.1 — SwiftPublicSuffixList 3.1.0 compatibility

Changed

SwiftPublicSuffixList dependency bumped to 3.1.0. v3.0 tightened PublicSuffixList.isUnrestricted(_:) / match(_:) to reject non-ASCII hostnames — IDN labels must be in ACE (Punycode) form. The default domainValidator closure now calls PublicSuffixList.ace(_:) on the domain before dispatching to isUnrestricted(_:), so Unicode IDN domains continue to validate exactly as they did on 1.3.0 with PSL 2.x.
Mailbox.Host.domain(...) still carries the original user-facing string; only the validator dispatch uses the ACE form.

Migration

Callers who pass a custom domainValidator closure to correctlyFormatted(_:) / mailbox(from:) and rely on the PSL default behavior via PublicSuffixList.isUnrestricted(_:) should wrap their call site with PublicSuffixList.ace(_:) if the closure receives Unicode IDN domains — e.g. { PublicSuffixList.isUnrestricted(PublicSuffixList.ace($0), rules: myRules) }.

23 Apr 02:02

ekscrypto

1.3.0

b338577

1.3.0 — EmailNormalizer.nfc + RFC 6532 audit

Highlights

New API: EmailNormalizer.nfc(_:)

RFC 6532 §3.1-compliant, name-preserving NFC normalization. Use this when you need a spec-compliant comparison form, or when you are normalizing an address you intend to preserve for display, forwarding, or reply-to. NFC collapses canonically-equivalent scalar sequences (e.g. decomposed e + U+0301 → precomposed é) but does not fold compatibility variants (fullwidth, ligatures, superscripts).

EmailNormalizer.nfkc(_:) remains available for anti-spoofing / account de-duplication use cases (Gmail-style), as a documented deliberate deviation from RFC 6532 §3.1's "SHOULD NOT".
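
The difference between the two normalization forms can be seen with Foundation's own normalization properties. This sketch does not call EmailNormalizer; it only demonstrates the underlying NFC and NFKC behaviour on two sample strings, and the helper scalarValues is a hypothetical name.

```swift
import Foundation

/// Lists scalar values so results can be compared exactly. Swift's String ==
/// treats canonically equivalent strings as equal, which would hide the change.
func scalarValues(_ text: String) -> [UInt32] {
    text.unicodeScalars.map { $0.value }
}

let decomposed = "e\u{0301}"   // "e" + COMBINING ACUTE ACCENT
let fullwidth = "\u{FF21}"     // FULLWIDTH LATIN CAPITAL LETTER A

// NFC composes canonically equivalent sequences...
let nfcAccent = decomposed.precomposedStringWithCanonicalMapping
// ...but leaves compatibility variants such as fullwidth letters alone.
let nfcFullwidth = fullwidth.precomposedStringWithCanonicalMapping
// NFKC additionally folds compatibility variants.
let nfkcFullwidth = fullwidth.precomposedStringWithCompatibilityMapping

let composed = scalarValues(nfcAccent) == [0xE9]          // true
let preserved = scalarValues(nfcFullwidth) == [0xFF21]    // true
let folded = scalarValues(nfkcFullwidth) == [0x41]        // true
```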

RFC compatibility audit

Conducted an in-depth audit of the normalizer against RFC 6532 §3.1, RFC 5198 §3, RFC 5891 §4.1 (IDNA2008), RFC 5321 §2.4, RFC 5321 §4.1.2 (quoted strings), and UAX #15. No spec violations found. nfc(_:) is fully compliant; nfkc(_:) deviations are documented in-source and pinned by tests.

Docstring improvements

nfkc(_:) IDNA2008 section now names UTS#46 transitional mode and RFC 3492 Punycode explicitly
Dot-folding example list extended (U+FF0E, U+3002, U+FF61)
New section enumerating compatibility folds that introduce ASCII SPACE (U+00A0, U+2003, U+FDFA)

New tests

testNfcIsIdempotent / testNfkcIsIdempotent: UAX #15 D8/D9 stability
testNfkcOutputIsAlsoInNfc: every NFKC-normalized string is also in NFC (the set of NFKC forms ⊆ the set of NFC forms)

Test count: 157 (was 154 in 1.2.2), 0 failures.

No breaking changes

Pure additive release. Existing nfkc(_:) callers are unaffected.

23 Apr 01:14

ekscrypto

1.2.2

4100793

1.2.2 — internal simplification

What changed

Follow-up to 1.2.1. A post-release adversarial re-review verified empirically that the 17-case inline BMP guard in extractQuotedString was fully redundant with the per-scalar check at the end of the loop, so the guard has been removed. The RFC 2047 candidate gate was also tightened to reject supplementary-plane Tag/PUA/noncharacter scalars upfront (previously they were caught after encode + re-parse).

No behavior change for valid input. All 129 tests pass.

Details

extractQuotedString: removed the 17-case inline BMP contains block (U+00AD, U+00A0, U+1680, U+2000–U+200A, U+200B–U+200D, U+202F, U+205F, U+2060–U+2065, U+3000, U+FEFF, U+FE00–U+FE0F, U+2028–U+2029, U+FDD0–U+FDEF, U+FFFE, U+FFFF). All 26 scalars confirmed rejected by qtextUnicodeSMTPCharacterSet's per-scalar allSatisfy. The supplementary-plane guard (isRejectedSupplementaryScalar) stays in the loop — SSP scalars like U+E0100 can attach as grapheme extenders and must be rejected per-scalar, since they are unioned into the set and cannot be subtracted (Foundation .subtracting() bug on supplementary planes).
candidateForRfc2047: first gate now also rejects isRejectedSupplementaryScalar, bringing it in line with extractDotAtom / extractQuotedString.
Replaced a misleading comment that claimed allSatisfy "only examines the first scalar of each Character" (it iterates all scalars).

Upgrade

Drop-in patch. No API or behavior change for valid input.

23 Apr 00:29

ekscrypto

1.2.1

8ff3b67

Internal refactor + invariant tests

A patch release with no behavior change. Adversarial re-review of the post-1.2.0 design observations confirmed two items and retracted three; this release lands the confirmed work and adds clarifying comments around the retracted items so future reviewers reach the same conclusion faster.

Confirmed (implemented)

Extracted isRejectedSupplementaryScalar(_:) — the 6 supplementary-plane conditions genuinely shared between extractDotAtom and extractQuotedString are now in one place. The BMP block in extractQuotedString stays put — it is path-specific (per-grapheme-cluster iteration), not a duplicate of dot-atom's per-scalar check.
Added InvariantTests.swift with 4 deterministic sweeps:
- correctlyFormatted ⇔ mailbox != nil agreement across historically buggy scalar ranges
- dot-atom and quoted-string parity for supplementary-plane scalars
- ASCII acceptance implies Unicode acceptance (subset invariant)
- No crashes on short byte strings (8000 probes)
The whole suite runs in ~90ms with no random seed (reproducible CI).

Retracted (with anti-confusion comments)

Options enum: doc comment now explains the [Options] array shape is intentional forward-compatibility.
CharacterSet block: MARK section header documents that co-location with parsing is deliberate (the subtract-before-union ordering invariant is load-bearing and easy to violate if split).
extractQuotedString BMP guard: comment now states it is not a duplicate of extractDotAtom — different iteration model, different requirements.

Tests

125 → 129 (all passing).
