Add comprehensive specification-derived test suite by eywalker · Pull Request #73 · walkerlab/orcapod-python

eywalker · 2026-03-05T07:41:05Z

Summary

This PR introduces a new test-objective/ directory at the project root containing a comprehensive, specification-first test suite for the orcapod-python codebase. The tests are derived purely from design documents, protocol definitions, and interface contracts—not from reading implementation code—to avoid "self-affirmation" testing patterns.

Key Changes

TESTING_PLAN.md: Added a 680-line comprehensive test specification catalog documenting all planned test cases organized by module and test type (unit, integration, property-based)
test-objective/ directory structure: Created a new test suite with:
- conftest.py: Shared fixtures for sources, streams, functions, and operators
- unit/: 20+ unit test modules covering individual components
  - Core types: test_types.py, test_datagram.py, test_tag.py, test_packet.py
  - Streams & sources: test_stream.py, test_sources.py, test_source_registry.py
  - Functions & pods: test_packet_function.py, test_function_pod.py
  - Operators: test_operators.py (Join, Filter, Select, Drop, MapTags, MapPackets, etc.)
  - Nodes: test_nodes.py (FunctionNode, OperatorNode, Persistent variants)
  - Infrastructure: test_hashing.py, test_databases.py, test_schema_utils.py, test_arrow_utils.py, test_arrow_data_utils.py, test_semantic_types.py, test_contexts.py, test_tracker.py, test_lazy_module.py
- integration/: 5 integration test modules covering end-to-end scenarios
  - test_pipeline_flows.py: Source → Operator/FunctionPod → Stream pipelines
  - test_caching_flows.py: DB-backed caching with PersistentFunctionNode/OperatorNode
  - test_hash_invariants.py: Hash stability and Merkle chain properties
  - test_provenance.py: System tag lineage through pipelines
  - test_column_config_filtering.py: ColumnConfig visibility across components
- property/: 3 property-based test modules using Hypothesis
  - test_schema_properties.py: Schema algebra properties
  - test_hash_properties.py: Hash determinism and collision resistance
  - test_operator_algebra.py: Operator commutativity, associativity, idempotency

Implementation Details

Tests are organized by specification source (design documents, protocol definitions, type annotations, docstrings)
Each test module includes docstrings explaining which specification sections it derives from
Helper functions and fixtures are centralized in conftest.py for reusability
Tests cover both happy paths and error conditions (e.g., missing columns, type mismatches)
Property-based tests use Hypothesis to verify invariants hold across arbitrary inputs
Integration tests verify cross-component behaviors like system tag evolution and caching semantics

https://claude.ai/code/session_01FQ9vEEdmUZjB4D7YL6mJuM

codecov-commenter · 2026-03-05T07:42:36Z

⚠️ Please install the to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Copilot

Pull request overview

This PR adds a new “specification-first” test suite under test-objective/ plus a root TESTING_PLAN.md catalog, aiming to validate OrcaPod behavior against documented contracts rather than implementation details.

Changes:

Added TESTING_PLAN.md describing a comprehensive, spec-derived test catalog and execution guidance.
Introduced test-objective/ with unit, integration, and property-based tests covering core types, datagrams/streams/sources, operators/nodes, hashing, databases, and contexts.
Added shared pytest fixtures in test-objective/conftest.py to standardize test inputs across modules.

Reviewed changes

Copilot reviewed 31 out of 35 changed files in this pull request and generated 10 comments.

Show a summary per file

File	Description
TESTING_PLAN.md	Adds the specification-derived test plan and run instructions
test-objective/init.py	Marks `test-objective` as a package
test-objective/conftest.py	Shared fixtures/helpers for objective tests
test-objective/integration/init.py	Marks integration tests package
test-objective/integration/test_caching_flows.py	Integration coverage for DB-backed caching flows
test-objective/integration/test_column_config_filtering.py	Integration coverage for ColumnConfig across components
test-objective/integration/test_hash_invariants.py	Integration coverage for content/pipeline hash invariants
test-objective/integration/test_pipeline_flows.py	Integration coverage for end-to-end pipeline scenarios
test-objective/integration/test_provenance.py	Integration coverage for system tag lineage rules
test-objective/property/init.py	Marks property tests package
test-objective/property/test_hash_properties.py	Hypothesis property tests for hashing and ContentHash roundtrips
test-objective/property/test_operator_algebra.py	Property-style algebraic tests for operators
test-objective/property/test_schema_properties.py	Hypothesis property tests for Schema algebra
test-objective/unit/init.py	Marks unit tests package
test-objective/unit/test_arrow_data_utils.py	Unit tests for system tag/source-info Arrow helpers
test-objective/unit/test_arrow_utils.py	Unit tests for Arrow schema/table utilities
test-objective/unit/test_contexts.py	Unit tests for DataContext resolution and defaults
test-objective/unit/test_databases.py	Unit tests for database implementations
test-objective/unit/test_datagram.py	Unit tests for Datagram semantics and conversions
test-objective/unit/test_function_pod.py	Unit tests for FunctionPod/FunctionPodStream behaviors
test-objective/unit/test_hashing.py	Unit tests for semantic hashing and registries
test-objective/unit/test_lazy_module.py	Unit tests for LazyModule deferred import behavior
test-objective/unit/test_nodes.py	Unit tests for FunctionNode/OperatorNode and persistent variants
test-objective/unit/test_operators.py	Unit tests for operator semantics and validation
test-objective/unit/test_packet.py	Unit tests for Packet source-info behavior
test-objective/unit/test_packet_function.py	Unit tests for PythonPacketFunction and CachedPacketFunction
test-objective/unit/test_schema_utils.py	Unit tests for schema utility functions
test-objective/unit/test_semantic_types.py	Unit tests for type conversion behavior
test-objective/unit/test_source_registry.py	Unit tests for SourceRegistry behaviors
test-objective/unit/test_sources.py	Unit tests for source types and edge cases
test-objective/unit/test_stream.py	Unit tests for ArrowTableStream behavior and conversions
test-objective/unit/test_tag.py	Unit tests for Tag system-tags behavior
test-objective/unit/test_tracker.py	Unit tests for tracker manager / graph tracker

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-05T07:47:22Z

test-objective/property/test_schema_properties.py

+from hypothesis import given, settings
+from hypothesis import strategies as st
+


This module imports hypothesis, but pyproject.toml does not declare Hypothesis in dependencies/dev dependency groups. As-is, the test suite will fail to import in a clean environment. Add hypothesis to the test/dev dependencies (or skip these tests when it isn’t installed).

Suggested change

from hypothesis import given, settings

from hypothesis import strategies as st

hypothesis = pytest.importorskip("hypothesis")

given = hypothesis.given

settings = hypothesis.settings

st = hypothesis.strategies

Copilot · 2026-03-05T07:47:22Z

test-objective/unit/test_types.py

+    def test_schema_acts_as_mapping(self):
+        s = Schema({"x": pa.int64(), "y": pa.utf8()})
+        assert "x" in s
+        assert s["x"] == pa.int64()


Schema in orcapod.types is defined as a mapping from field names to Python types (e.g., int, str), not PyArrow DataType instances. Using pa.int64()/pa.utf8() here makes the tests diverge from the documented contract and from how schemas are produced elsewhere (e.g., via the type converter). Update these tests to use Python types (and reserve Arrow typing assertions for the Arrow converter utilities).

Copilot · 2026-03-05T07:47:22Z

test-objective/unit/test_tag.py

+        tag = Tag({"x": 1, "y": "hello"}, data_context=ctx, system_tags={"_tag::src:abc": "val"})
+        keys = list(tag.keys())
+        assert "x" in keys
+        assert "y" in keys
+        assert not any(k.startswith("_tag::") for k in keys)


System tag conventions in the codebase use constants.SYSTEM_TAG_PREFIX ("tag"). Hard-coding _tag::... here bakes in an incorrect prefix and can reduce coverage vs real provenance tags. Prefer building the key/assertions from constants.SYSTEM_TAG_PREFIX (and/or using realistic system tag key shapes like _tag_source_id::...).

Copilot · 2026-03-05T07:47:23Z

test-objective/unit/test_stream.py

+        # System tag columns are prefixed with _tag:: internally
+        assert all(not k.startswith("_tag_") for k in tag_keys)
+


This comment/documentation claims system tag columns are prefixed with _tag::, but the actual prefix used throughout the codebase is constants.SYSTEM_TAG_PREFIX ("tag") and system tag columns are typically _tag_source_id::... / _tag_record_id::.... Update the comment to match the implemented naming so future readers aren’t misled.

Copilot · 2026-03-05T07:47:23Z

TESTING_PLAN.md

+## Dependencies
+
+- **hypothesis** — added as a test dependency for property-based testing in `test-objective/property/`
+- **pytest** — test runner (already present)
+- DeltaTableDatabase tests marked with `@pytest.mark.slow` (skip with `-m "not slow"`)


TESTING_PLAN.md states that Hypothesis is “added as a test dependency”, but the PR doesn’t update pyproject.toml to include it. Either add Hypothesis to the declared test/dev dependencies or adjust this plan so it matches the repository configuration.

Copilot · 2026-03-05T07:47:23Z

test-objective/property/test_hash_properties.py

+
+from __future__ import annotations
+
+import pytest


This module imports hypothesis, but the repo doesn’t declare it as a dependency in pyproject.toml, so running tests will fail on import. Add hypothesis to the test/dev dependencies (or gate these tests behind an optional extra/skip).

Suggested change

import pytest

import pytest

# Skip this test module if Hypothesis is not installed

_hypothesis = pytest.importorskip("hypothesis")

Copilot · 2026-03-05T07:47:23Z

test-objective/integration/test_column_config_filtering.py

+    """Per design, system_tags=True includes _tag:: columns in Tag."""
+
+    def test_system_tags_excluded_by_default(self):
+        t = Tag(
+            {"id": 1},
+            system_tags={"_tag::source:abc": "rec1"},
+        )
+        keys = t.keys()
+        assert "_tag::source:abc" not in keys
+
+    def test_system_tags_included_with_config(self):
+        t = Tag(
+            {"id": 1},
+            system_tags={"_tag::source:abc": "rec1"},
+        )
+        keys_default = t.keys()
+        keys_with_tags = t.keys(columns=ColumnConfig(system_tags=True))
+        assert len(keys_with_tags) > len(keys_default)
+        assert "_tag::source:abc" in keys_with_tags
+
+    def test_all_info_includes_system_tags(self):
+        t = Tag(
+            {"id": 1},
+            system_tags={"_tag::source:abc": "rec1"},
+        )
+        keys = t.keys(all_info=True)
+        assert "_tag::source:abc" in keys


This test uses _tag::source:abc as a system tag key, but the codebase’s system-tag prefix is constants.SYSTEM_TAG_PREFIX ("tag") and the provenance tags generated by sources/operators follow _tag_source_id::... / _tag_record_id::.... Use constants.SYSTEM_TAG_PREFIX (and realistic key shapes) so the test validates production behavior.

Suggested change

"""Per design, system_tags=True includes _tag:: columns in Tag."""

def test_system_tags_excluded_by_default(self):

t = Tag(

{"id": 1},

system_tags={"_tag::source:abc": "rec1"},

)

keys = t.keys()

assert "_tag::source:abc" not in keys

def test_system_tags_included_with_config(self):

t = Tag(

{"id": 1},

system_tags={"_tag::source:abc": "rec1"},

)

keys_default = t.keys()

keys_with_tags = t.keys(columns=ColumnConfig(system_tags=True))

assert len(keys_with_tags) > len(keys_default)

assert "_tag::source:abc" in keys_with_tags

def test_all_info_includes_system_tags(self):

t = Tag(

{"id": 1},

system_tags={"_tag::source:abc": "rec1"},

)

keys = t.keys(all_info=True)

assert "_tag::source:abc" in keys

"""Per design, system_tags=True includes system-tag-prefixed columns in Tag."""

def test_system_tags_excluded_by_default(self):

system_tag_key = f"{constants.SYSTEM_TAG_PREFIX}source_id::abc"

t = Tag(

{"id": 1},

system_tags={system_tag_key: "rec1"},

)

keys = t.keys()

assert system_tag_key not in keys

def test_system_tags_included_with_config(self):

system_tag_key = f"{constants.SYSTEM_TAG_PREFIX}source_id::abc"

t = Tag(

{"id": 1},

system_tags={system_tag_key: "rec1"},

)

keys_default = t.keys()

keys_with_tags = t.keys(columns=ColumnConfig(system_tags=True))

assert len(keys_with_tags) > len(keys_default)

assert system_tag_key in keys_with_tags

def test_all_info_includes_system_tags(self):

system_tag_key = f"{constants.SYSTEM_TAG_PREFIX}source_id::abc"

t = Tag(

{"id": 1},

system_tags={system_tag_key: "rec1"},

)

keys = t.keys(all_info=True)

assert system_tag_key in keys

Copilot · 2026-03-05T07:47:24Z

test-objective/unit/test_databases.py

+        assert path not in {
+            tuple(k.split("/")) for k in db._tables
+        }, "Record should not be in committed store before flush"
+


This test asserts pending-vs-committed behavior by reading InMemoryArrowDatabase private fields (_tables). That’s brittle to internal refactors and doesn’t test the public database contract. Prefer assertions via public methods (or add a small public inspection hook if committed/pending distinction is part of the spec).

Copilot · 2026-03-05T07:47:24Z

TESTING_PLAN.md

+Run the full test suite with:
+```bash
+uv run pytest test-objective/ -v
+```


Note: repo pytest.ini sets testpaths = tests, so running plain pytest (as CI commonly does) will not discover test-objective/ tests. If these are meant to run in CI by default, pytest.ini (or CI command) needs updating; otherwise it’s worth calling out explicitly here that they are opt-in via an explicit path.

Copilot · 2026-03-05T07:47:24Z

test-objective/property/test_schema_properties.py

+import pytest
+from hypothesis import given, settings
+from hypothesis import strategies as st
+
+from orcapod.types import Schema


These property tests require the hypothesis package, but the repository's pyproject.toml does not declare it in dependencies/dev dependency groups. As-is, the test suite will fail to import. Add hypothesis to the test/dev dependencies (or skip these tests when Hypothesis isn’t installed).

Create a new `test-objective/` directory with 552 tests derived from design documents, protocol definitions, and interface contracts rather than implementation code. This provides independent validation against the specification to catch self-affirmation gaps in existing tests. Structure: - test-objective/unit/ — 20 test modules covering all core components - test-objective/integration/ — 5 modules for end-to-end pipeline scenarios - test-objective/property/ — 3 modules with Hypothesis property-based tests - TESTING_PLAN.md — comprehensive test case catalog (250+ cases) Key coverage areas: - Schema, ContentHash, ColumnConfig type contracts - Datagram/Tag/Packet immutability and lazy conversion - Stream construction, iteration, and format conversions - All source types and SourceRegistry CRUD - PythonPacketFunction and CachedPacketFunction - FunctionPod, FunctionNode, PersistentFunctionNode caching - All operators (Join, MergeJoin, SemiJoin, Batch, filters, mappers) - Hash stability, Merkle chain properties, pipeline hash invariants - System tag evolution rules (name-preserving/extending/evolving) - ColumnConfig filtering consistency across components - Operator algebra properties (commutativity, associativity, idempotency) https://claude.ai/code/session_01FQ9vEEdmUZjB4D7YL6mJuM

Runs test-objective/ suite on the same Python version matrix (3.11, 3.12) as the existing test workflow, with identical setup and coverage reporting. https://claude.ai/code/session_01FQ9vEEdmUZjB4D7YL6mJuM

- Add hypothesis>=6.0 to dev dependency group and regenerate lockfile (fixes ModuleNotFoundError in CI for property-based tests) - Replace Arrow types (pa.int64(), pa.utf8()) with Python types (int, str) in Schema tests to match documented DataType = type | UnionType contract - Replace hardcoded _tag:: prefix with constants.SYSTEM_TAG_PREFIX (_tag_) across test_tag.py, test_column_config_filtering.py, test_arrow_data_utils.py, and test_stream.py - Remove private field access (db._tables, db._get_record_key) in database tests, using only public API methods (get_record_by_id, get_all_records) https://claude.ai/code/session_01FQ9vEEdmUZjB4D7YL6mJuM

Copilot AI review requested due to automatic review settings March 5, 2026 07:41

Copilot started reviewing on behalf of eywalker March 5, 2026 07:41 View session

Copilot AI reviewed Mar 5, 2026

View reviewed changes

claude added 3 commits March 5, 2026 08:14

ci: add GitHub Actions workflow for objective tests

aebd932

Runs test-objective/ suite on the same Python version matrix (3.11, 3.12) as the existing test workflow, with identical setup and coverage reporting. https://claude.ai/code/session_01FQ9vEEdmUZjB4D7YL6mJuM

eywalker force-pushed the claude/comprehensive-testing-plan-uN9E4 branch from e07b6d9 to 3047a01 Compare March 5, 2026 08:15

eywalker merged commit f290932 into dev Mar 5, 2026
4 checks passed

eywalker deleted the claude/comprehensive-testing-plan-uN9E4 branch March 11, 2026 04:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add comprehensive specification-derived test suite#73

Add comprehensive specification-derived test suite#73
eywalker merged 3 commits intodevfrom
claude/comprehensive-testing-plan-uN9E4

eywalker commented Mar 5, 2026

Uh oh!

codecov-commenter commented Mar 5, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Mar 5, 2026

Uh oh!

Copilot AI Mar 5, 2026

Uh oh!

Copilot AI Mar 5, 2026

Uh oh!

Copilot AI Mar 5, 2026

Uh oh!

Copilot AI Mar 5, 2026

Uh oh!

Copilot AI Mar 5, 2026

Uh oh!

Copilot AI Mar 5, 2026

Uh oh!

Copilot AI Mar 5, 2026

Uh oh!

Copilot AI Mar 5, 2026

Uh oh!

Copilot AI Mar 5, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

		from hypothesis import given, settings
		from hypothesis import strategies as st

-from hypothesis import given, settings
-from hypothesis import strategies as st
+hypothesis = pytest.importorskip("hypothesis")
+given = hypothesis.given
+settings = hypothesis.settings
+st = hypothesis.strategies

		# System tag columns are prefixed with _tag:: internally
		assert all(not k.startswith("_tag_") for k in tag_keys)

Conversation

eywalker commented Mar 5, 2026

Summary

Key Changes

Implementation Details

Uh oh!

codecov-commenter commented Mar 5, 2026

Codecov Report

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants