Skip to content

Add comprehensive specification-derived test suite#73

Merged
eywalker merged 3 commits intodevfrom
claude/comprehensive-testing-plan-uN9E4
Mar 5, 2026
Merged

Add comprehensive specification-derived test suite#73
eywalker merged 3 commits intodevfrom
claude/comprehensive-testing-plan-uN9E4

Conversation

@eywalker
Copy link
Contributor

@eywalker eywalker commented Mar 5, 2026

Summary

This PR introduces a new test-objective/ directory at the project root containing a comprehensive, specification-first test suite for the orcapod-python codebase. The tests are derived purely from design documents, protocol definitions, and interface contracts—not from reading implementation code—to avoid "self-affirmation" testing patterns.

Key Changes

  • TESTING_PLAN.md: Added a 680-line comprehensive test specification catalog documenting all planned test cases organized by module and test type (unit, integration, property-based)

  • test-objective/ directory structure: Created a new test suite with:

    • conftest.py: Shared fixtures for sources, streams, functions, and operators
    • unit/: 20+ unit test modules covering individual components
      • Core types: test_types.py, test_datagram.py, test_tag.py, test_packet.py
      • Streams & sources: test_stream.py, test_sources.py, test_source_registry.py
      • Functions & pods: test_packet_function.py, test_function_pod.py
      • Operators: test_operators.py (Join, Filter, Select, Drop, MapTags, MapPackets, etc.)
      • Nodes: test_nodes.py (FunctionNode, OperatorNode, Persistent variants)
      • Infrastructure: test_hashing.py, test_databases.py, test_schema_utils.py, test_arrow_utils.py, test_arrow_data_utils.py, test_semantic_types.py, test_contexts.py, test_tracker.py, test_lazy_module.py
    • integration/: 5 integration test modules covering end-to-end scenarios
      • test_pipeline_flows.py: Source → Operator/FunctionPod → Stream pipelines
      • test_caching_flows.py: DB-backed caching with PersistentFunctionNode/OperatorNode
      • test_hash_invariants.py: Hash stability and Merkle chain properties
      • test_provenance.py: System tag lineage through pipelines
      • test_column_config_filtering.py: ColumnConfig visibility across components
    • property/: 3 property-based test modules using Hypothesis
      • test_schema_properties.py: Schema algebra properties
      • test_hash_properties.py: Hash determinism and collision resistance
      • test_operator_algebra.py: Operator commutativity, associativity, idempotency

Implementation Details

  • Tests are organized by specification source (design documents, protocol definitions, type annotations, docstrings)
  • Each test module includes docstrings explaining which specification sections it derives from
  • Helper functions and fixtures are centralized in conftest.py for reusability
  • Tests cover both happy paths and error conditions (e.g., missing columns, type mismatches)
  • Property-based tests use Hypothesis to verify invariants hold across arbitrary inputs
  • Integration tests verify cross-component behaviors like system tag evolution and caching semantics

https://claude.ai/code/session_01FQ9vEEdmUZjB4D7YL6mJuM

Copilot AI review requested due to automatic review settings March 5, 2026 07:41
@codecov-commenter
Copy link

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a new “specification-first” test suite under test-objective/ plus a root TESTING_PLAN.md catalog, aiming to validate OrcaPod behavior against documented contracts rather than implementation details.

Changes:

  • Added TESTING_PLAN.md describing a comprehensive, spec-derived test catalog and execution guidance.
  • Introduced test-objective/ with unit, integration, and property-based tests covering core types, datagrams/streams/sources, operators/nodes, hashing, databases, and contexts.
  • Added shared pytest fixtures in test-objective/conftest.py to standardize test inputs across modules.

Reviewed changes

Copilot reviewed 31 out of 35 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
TESTING_PLAN.md Adds the specification-derived test plan and run instructions
test-objective/init.py Marks test-objective as a package
test-objective/conftest.py Shared fixtures/helpers for objective tests
test-objective/integration/init.py Marks integration tests package
test-objective/integration/test_caching_flows.py Integration coverage for DB-backed caching flows
test-objective/integration/test_column_config_filtering.py Integration coverage for ColumnConfig across components
test-objective/integration/test_hash_invariants.py Integration coverage for content/pipeline hash invariants
test-objective/integration/test_pipeline_flows.py Integration coverage for end-to-end pipeline scenarios
test-objective/integration/test_provenance.py Integration coverage for system tag lineage rules
test-objective/property/init.py Marks property tests package
test-objective/property/test_hash_properties.py Hypothesis property tests for hashing and ContentHash roundtrips
test-objective/property/test_operator_algebra.py Property-style algebraic tests for operators
test-objective/property/test_schema_properties.py Hypothesis property tests for Schema algebra
test-objective/unit/init.py Marks unit tests package
test-objective/unit/test_arrow_data_utils.py Unit tests for system tag/source-info Arrow helpers
test-objective/unit/test_arrow_utils.py Unit tests for Arrow schema/table utilities
test-objective/unit/test_contexts.py Unit tests for DataContext resolution and defaults
test-objective/unit/test_databases.py Unit tests for database implementations
test-objective/unit/test_datagram.py Unit tests for Datagram semantics and conversions
test-objective/unit/test_function_pod.py Unit tests for FunctionPod/FunctionPodStream behaviors
test-objective/unit/test_hashing.py Unit tests for semantic hashing and registries
test-objective/unit/test_lazy_module.py Unit tests for LazyModule deferred import behavior
test-objective/unit/test_nodes.py Unit tests for FunctionNode/OperatorNode and persistent variants
test-objective/unit/test_operators.py Unit tests for operator semantics and validation
test-objective/unit/test_packet.py Unit tests for Packet source-info behavior
test-objective/unit/test_packet_function.py Unit tests for PythonPacketFunction and CachedPacketFunction
test-objective/unit/test_schema_utils.py Unit tests for schema utility functions
test-objective/unit/test_semantic_types.py Unit tests for type conversion behavior
test-objective/unit/test_source_registry.py Unit tests for SourceRegistry behaviors
test-objective/unit/test_sources.py Unit tests for source types and edge cases
test-objective/unit/test_stream.py Unit tests for ArrowTableStream behavior and conversions
test-objective/unit/test_tag.py Unit tests for Tag system-tags behavior
test-objective/unit/test_tracker.py Unit tests for tracker manager / graph tracker

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +10 to +12
from hypothesis import given, settings
from hypothesis import strategies as st

Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This module imports hypothesis, but pyproject.toml does not declare Hypothesis in dependencies/dev dependency groups. As-is, the test suite will fail to import in a clean environment. Add hypothesis to the test/dev dependencies (or skip these tests when it isn’t installed).

Suggested change
from hypothesis import given, settings
from hypothesis import strategies as st
hypothesis = pytest.importorskip("hypothesis")
given = hypothesis.given
settings = hypothesis.settings
st = hypothesis.strategies

Copilot uses AI. Check for mistakes.
Comment on lines +18 to +21
def test_schema_acts_as_mapping(self):
s = Schema({"x": pa.int64(), "y": pa.utf8()})
assert "x" in s
assert s["x"] == pa.int64()
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Schema in orcapod.types is defined as a mapping from field names to Python types (e.g., int, str), not PyArrow DataType instances. Using pa.int64()/pa.utf8() here makes the tests diverge from the documented contract and from how schemas are produced elsewhere (e.g., via the type converter). Update these tests to use Python types (and reserve Arrow typing assertions for the Arrow converter utilities).

Copilot uses AI. Check for mistakes.
Comment on lines +26 to +30
tag = Tag({"x": 1, "y": "hello"}, data_context=ctx, system_tags={"_tag::src:abc": "val"})
keys = list(tag.keys())
assert "x" in keys
assert "y" in keys
assert not any(k.startswith("_tag::") for k in keys)
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

System tag conventions in the codebase use constants.SYSTEM_TAG_PREFIX ("tag"). Hard-coding _tag::... here bakes in an incorrect prefix and can reduce coverage vs real provenance tags. Prefer building the key/assertions from constants.SYSTEM_TAG_PREFIX (and/or using realistic system tag key shapes like _tag_source_id::...).

Copilot uses AI. Check for mistakes.
Comment on lines +377 to +379
# System tag columns are prefixed with _tag:: internally
assert all(not k.startswith("_tag_") for k in tag_keys)

Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment/documentation claims system tag columns are prefixed with _tag::, but the actual prefix used throughout the codebase is constants.SYSTEM_TAG_PREFIX ("tag") and system tag columns are typically _tag_source_id::... / _tag_record_id::.... Update the comment to match the implemented naming so future readers aren’t misled.

Copilot uses AI. Check for mistakes.
Comment on lines +648 to +652
## Dependencies

- **hypothesis** — added as a test dependency for property-based testing in `test-objective/property/`
- **pytest** — test runner (already present)
- DeltaTableDatabase tests marked with `@pytest.mark.slow` (skip with `-m "not slow"`)
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TESTING_PLAN.md states that Hypothesis is “added as a test dependency”, but the PR doesn’t update pyproject.toml to include it. Either add Hypothesis to the declared test/dev dependencies or adjust this plan so it matches the repository configuration.

Copilot uses AI. Check for mistakes.

from __future__ import annotations

import pytest
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This module imports hypothesis, but the repo doesn’t declare it as a dependency in pyproject.toml, so running tests will fail on import. Add hypothesis to the test/dev dependencies (or gate these tests behind an optional extra/skip).

Suggested change
import pytest
import pytest
# Skip this test module if Hypothesis is not installed
_hypothesis = pytest.importorskip("hypothesis")

Copilot uses AI. Check for mistakes.
Comment on lines +67 to +93
"""Per design, system_tags=True includes _tag:: columns in Tag."""

def test_system_tags_excluded_by_default(self):
t = Tag(
{"id": 1},
system_tags={"_tag::source:abc": "rec1"},
)
keys = t.keys()
assert "_tag::source:abc" not in keys

def test_system_tags_included_with_config(self):
t = Tag(
{"id": 1},
system_tags={"_tag::source:abc": "rec1"},
)
keys_default = t.keys()
keys_with_tags = t.keys(columns=ColumnConfig(system_tags=True))
assert len(keys_with_tags) > len(keys_default)
assert "_tag::source:abc" in keys_with_tags

def test_all_info_includes_system_tags(self):
t = Tag(
{"id": 1},
system_tags={"_tag::source:abc": "rec1"},
)
keys = t.keys(all_info=True)
assert "_tag::source:abc" in keys
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test uses _tag::source:abc as a system tag key, but the codebase’s system-tag prefix is constants.SYSTEM_TAG_PREFIX ("tag") and the provenance tags generated by sources/operators follow _tag_source_id::... / _tag_record_id::.... Use constants.SYSTEM_TAG_PREFIX (and realistic key shapes) so the test validates production behavior.

Suggested change
"""Per design, system_tags=True includes _tag:: columns in Tag."""
def test_system_tags_excluded_by_default(self):
t = Tag(
{"id": 1},
system_tags={"_tag::source:abc": "rec1"},
)
keys = t.keys()
assert "_tag::source:abc" not in keys
def test_system_tags_included_with_config(self):
t = Tag(
{"id": 1},
system_tags={"_tag::source:abc": "rec1"},
)
keys_default = t.keys()
keys_with_tags = t.keys(columns=ColumnConfig(system_tags=True))
assert len(keys_with_tags) > len(keys_default)
assert "_tag::source:abc" in keys_with_tags
def test_all_info_includes_system_tags(self):
t = Tag(
{"id": 1},
system_tags={"_tag::source:abc": "rec1"},
)
keys = t.keys(all_info=True)
assert "_tag::source:abc" in keys
"""Per design, system_tags=True includes system-tag-prefixed columns in Tag."""
def test_system_tags_excluded_by_default(self):
system_tag_key = f"{constants.SYSTEM_TAG_PREFIX}source_id::abc"
t = Tag(
{"id": 1},
system_tags={system_tag_key: "rec1"},
)
keys = t.keys()
assert system_tag_key not in keys
def test_system_tags_included_with_config(self):
system_tag_key = f"{constants.SYSTEM_TAG_PREFIX}source_id::abc"
t = Tag(
{"id": 1},
system_tags={system_tag_key: "rec1"},
)
keys_default = t.keys()
keys_with_tags = t.keys(columns=ColumnConfig(system_tags=True))
assert len(keys_with_tags) > len(keys_default)
assert system_tag_key in keys_with_tags
def test_all_info_includes_system_tags(self):
system_tag_key = f"{constants.SYSTEM_TAG_PREFIX}source_id::abc"
t = Tag(
{"id": 1},
system_tags={system_tag_key: "rec1"},
)
keys = t.keys(all_info=True)
assert system_tag_key in keys

Copilot uses AI. Check for mistakes.
Comment on lines +145 to +148
assert path not in {
tuple(k.split("/")) for k in db._tables
}, "Record should not be in committed store before flush"

Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test asserts pending-vs-committed behavior by reading InMemoryArrowDatabase private fields (_tables). That’s brittle to internal refactors and doesn’t test the public database contract. Prefer assertions via public methods (or add a small public inspection hook if committed/pending distinction is part of the spec).

Copilot uses AI. Check for mistakes.
Comment on lines +656 to +659
Run the full test suite with:
```bash
uv run pytest test-objective/ -v
```
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: repo pytest.ini sets testpaths = tests, so running plain pytest (as CI commonly does) will not discover test-objective/ tests. If these are meant to run in CI by default, pytest.ini (or CI command) needs updating; otherwise it’s worth calling out explicitly here that they are opt-in via an explicit path.

Copilot uses AI. Check for mistakes.
Comment on lines +9 to +13
import pytest
from hypothesis import given, settings
from hypothesis import strategies as st

from orcapod.types import Schema
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These property tests require the hypothesis package, but the repository's pyproject.toml does not declare it in dependencies/dev dependency groups. As-is, the test suite will fail to import. Add hypothesis to the test/dev dependencies (or skip these tests when Hypothesis isn’t installed).

Copilot uses AI. Check for mistakes.
claude added 3 commits March 5, 2026 08:14
Create a new `test-objective/` directory with 552 tests derived from design
documents, protocol definitions, and interface contracts rather than
implementation code. This provides independent validation against the
specification to catch self-affirmation gaps in existing tests.

Structure:
- test-objective/unit/ — 20 test modules covering all core components
- test-objective/integration/ — 5 modules for end-to-end pipeline scenarios
- test-objective/property/ — 3 modules with Hypothesis property-based tests
- TESTING_PLAN.md — comprehensive test case catalog (250+ cases)

Key coverage areas:
- Schema, ContentHash, ColumnConfig type contracts
- Datagram/Tag/Packet immutability and lazy conversion
- Stream construction, iteration, and format conversions
- All source types and SourceRegistry CRUD
- PythonPacketFunction and CachedPacketFunction
- FunctionPod, FunctionNode, PersistentFunctionNode caching
- All operators (Join, MergeJoin, SemiJoin, Batch, filters, mappers)
- Hash stability, Merkle chain properties, pipeline hash invariants
- System tag evolution rules (name-preserving/extending/evolving)
- ColumnConfig filtering consistency across components
- Operator algebra properties (commutativity, associativity, idempotency)

https://claude.ai/code/session_01FQ9vEEdmUZjB4D7YL6mJuM
Runs test-objective/ suite on the same Python version matrix (3.11, 3.12)
as the existing test workflow, with identical setup and coverage reporting.

https://claude.ai/code/session_01FQ9vEEdmUZjB4D7YL6mJuM
- Add hypothesis>=6.0 to dev dependency group and regenerate lockfile
  (fixes ModuleNotFoundError in CI for property-based tests)
- Replace Arrow types (pa.int64(), pa.utf8()) with Python types (int, str)
  in Schema tests to match documented DataType = type | UnionType contract
- Replace hardcoded _tag:: prefix with constants.SYSTEM_TAG_PREFIX (_tag_)
  across test_tag.py, test_column_config_filtering.py, test_arrow_data_utils.py,
  and test_stream.py
- Remove private field access (db._tables, db._get_record_key) in database
  tests, using only public API methods (get_record_by_id, get_all_records)

https://claude.ai/code/session_01FQ9vEEdmUZjB4D7YL6mJuM
@eywalker eywalker force-pushed the claude/comprehensive-testing-plan-uN9E4 branch from e07b6d9 to 3047a01 Compare March 5, 2026 08:15
@eywalker eywalker merged commit f290932 into dev Mar 5, 2026
4 checks passed
@eywalker eywalker deleted the claude/comprehensive-testing-plan-uN9E4 branch March 11, 2026 04:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants