Add comprehensive specification-derived test suite#73
Conversation
|
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Pull request overview
This PR adds a new “specification-first” test suite under test-objective/ plus a root TESTING_PLAN.md catalog, aiming to validate OrcaPod behavior against documented contracts rather than implementation details.
Changes:
- Added
TESTING_PLAN.mddescribing a comprehensive, spec-derived test catalog and execution guidance. - Introduced
test-objective/with unit, integration, and property-based tests covering core types, datagrams/streams/sources, operators/nodes, hashing, databases, and contexts. - Added shared pytest fixtures in
test-objective/conftest.pyto standardize test inputs across modules.
Reviewed changes
Copilot reviewed 31 out of 35 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| TESTING_PLAN.md | Adds the specification-derived test plan and run instructions |
| test-objective/init.py | Marks test-objective as a package |
| test-objective/conftest.py | Shared fixtures/helpers for objective tests |
| test-objective/integration/init.py | Marks integration tests package |
| test-objective/integration/test_caching_flows.py | Integration coverage for DB-backed caching flows |
| test-objective/integration/test_column_config_filtering.py | Integration coverage for ColumnConfig across components |
| test-objective/integration/test_hash_invariants.py | Integration coverage for content/pipeline hash invariants |
| test-objective/integration/test_pipeline_flows.py | Integration coverage for end-to-end pipeline scenarios |
| test-objective/integration/test_provenance.py | Integration coverage for system tag lineage rules |
| test-objective/property/init.py | Marks property tests package |
| test-objective/property/test_hash_properties.py | Hypothesis property tests for hashing and ContentHash roundtrips |
| test-objective/property/test_operator_algebra.py | Property-style algebraic tests for operators |
| test-objective/property/test_schema_properties.py | Hypothesis property tests for Schema algebra |
| test-objective/unit/init.py | Marks unit tests package |
| test-objective/unit/test_arrow_data_utils.py | Unit tests for system tag/source-info Arrow helpers |
| test-objective/unit/test_arrow_utils.py | Unit tests for Arrow schema/table utilities |
| test-objective/unit/test_contexts.py | Unit tests for DataContext resolution and defaults |
| test-objective/unit/test_databases.py | Unit tests for database implementations |
| test-objective/unit/test_datagram.py | Unit tests for Datagram semantics and conversions |
| test-objective/unit/test_function_pod.py | Unit tests for FunctionPod/FunctionPodStream behaviors |
| test-objective/unit/test_hashing.py | Unit tests for semantic hashing and registries |
| test-objective/unit/test_lazy_module.py | Unit tests for LazyModule deferred import behavior |
| test-objective/unit/test_nodes.py | Unit tests for FunctionNode/OperatorNode and persistent variants |
| test-objective/unit/test_operators.py | Unit tests for operator semantics and validation |
| test-objective/unit/test_packet.py | Unit tests for Packet source-info behavior |
| test-objective/unit/test_packet_function.py | Unit tests for PythonPacketFunction and CachedPacketFunction |
| test-objective/unit/test_schema_utils.py | Unit tests for schema utility functions |
| test-objective/unit/test_semantic_types.py | Unit tests for type conversion behavior |
| test-objective/unit/test_source_registry.py | Unit tests for SourceRegistry behaviors |
| test-objective/unit/test_sources.py | Unit tests for source types and edge cases |
| test-objective/unit/test_stream.py | Unit tests for ArrowTableStream behavior and conversions |
| test-objective/unit/test_tag.py | Unit tests for Tag system-tags behavior |
| test-objective/unit/test_tracker.py | Unit tests for tracker manager / graph tracker |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| from hypothesis import given, settings | ||
| from hypothesis import strategies as st | ||
|
|
There was a problem hiding this comment.
This module imports hypothesis, but pyproject.toml does not declare Hypothesis in dependencies/dev dependency groups. As-is, the test suite will fail to import in a clean environment. Add hypothesis to the test/dev dependencies (or skip these tests when it isn’t installed).
| from hypothesis import given, settings | |
| from hypothesis import strategies as st | |
| hypothesis = pytest.importorskip("hypothesis") | |
| given = hypothesis.given | |
| settings = hypothesis.settings | |
| st = hypothesis.strategies |
test-objective/unit/test_types.py
Outdated
| def test_schema_acts_as_mapping(self): | ||
| s = Schema({"x": pa.int64(), "y": pa.utf8()}) | ||
| assert "x" in s | ||
| assert s["x"] == pa.int64() |
There was a problem hiding this comment.
Schema in orcapod.types is defined as a mapping from field names to Python types (e.g., int, str), not PyArrow DataType instances. Using pa.int64()/pa.utf8() here makes the tests diverge from the documented contract and from how schemas are produced elsewhere (e.g., via the type converter). Update these tests to use Python types (and reserve Arrow typing assertions for the Arrow converter utilities).
test-objective/unit/test_tag.py
Outdated
| tag = Tag({"x": 1, "y": "hello"}, data_context=ctx, system_tags={"_tag::src:abc": "val"}) | ||
| keys = list(tag.keys()) | ||
| assert "x" in keys | ||
| assert "y" in keys | ||
| assert not any(k.startswith("_tag::") for k in keys) |
There was a problem hiding this comment.
System tag conventions in the codebase use constants.SYSTEM_TAG_PREFIX ("tag"). Hard-coding _tag::... here bakes in an incorrect prefix and can reduce coverage vs real provenance tags. Prefer building the key/assertions from constants.SYSTEM_TAG_PREFIX (and/or using realistic system tag key shapes like _tag_source_id::...).
test-objective/unit/test_stream.py
Outdated
| # System tag columns are prefixed with _tag:: internally | ||
| assert all(not k.startswith("_tag_") for k in tag_keys) | ||
|
|
There was a problem hiding this comment.
This comment/documentation claims system tag columns are prefixed with _tag::, but the actual prefix used throughout the codebase is constants.SYSTEM_TAG_PREFIX ("tag") and system tag columns are typically _tag_source_id::... / _tag_record_id::.... Update the comment to match the implemented naming so future readers aren’t misled.
| ## Dependencies | ||
|
|
||
| - **hypothesis** — added as a test dependency for property-based testing in `test-objective/property/` | ||
| - **pytest** — test runner (already present) | ||
| - DeltaTableDatabase tests marked with `@pytest.mark.slow` (skip with `-m "not slow"`) |
There was a problem hiding this comment.
TESTING_PLAN.md states that Hypothesis is “added as a test dependency”, but the PR doesn’t update pyproject.toml to include it. Either add Hypothesis to the declared test/dev dependencies or adjust this plan so it matches the repository configuration.
|
|
||
| from __future__ import annotations | ||
|
|
||
| import pytest |
There was a problem hiding this comment.
This module imports hypothesis, but the repo doesn’t declare it as a dependency in pyproject.toml, so running tests will fail on import. Add hypothesis to the test/dev dependencies (or gate these tests behind an optional extra/skip).
| import pytest | |
| import pytest | |
| # Skip this test module if Hypothesis is not installed | |
| _hypothesis = pytest.importorskip("hypothesis") |
| """Per design, system_tags=True includes _tag:: columns in Tag.""" | ||
|
|
||
| def test_system_tags_excluded_by_default(self): | ||
| t = Tag( | ||
| {"id": 1}, | ||
| system_tags={"_tag::source:abc": "rec1"}, | ||
| ) | ||
| keys = t.keys() | ||
| assert "_tag::source:abc" not in keys | ||
|
|
||
| def test_system_tags_included_with_config(self): | ||
| t = Tag( | ||
| {"id": 1}, | ||
| system_tags={"_tag::source:abc": "rec1"}, | ||
| ) | ||
| keys_default = t.keys() | ||
| keys_with_tags = t.keys(columns=ColumnConfig(system_tags=True)) | ||
| assert len(keys_with_tags) > len(keys_default) | ||
| assert "_tag::source:abc" in keys_with_tags | ||
|
|
||
| def test_all_info_includes_system_tags(self): | ||
| t = Tag( | ||
| {"id": 1}, | ||
| system_tags={"_tag::source:abc": "rec1"}, | ||
| ) | ||
| keys = t.keys(all_info=True) | ||
| assert "_tag::source:abc" in keys |
There was a problem hiding this comment.
This test uses _tag::source:abc as a system tag key, but the codebase’s system-tag prefix is constants.SYSTEM_TAG_PREFIX ("tag") and the provenance tags generated by sources/operators follow _tag_source_id::... / _tag_record_id::.... Use constants.SYSTEM_TAG_PREFIX (and realistic key shapes) so the test validates production behavior.
| """Per design, system_tags=True includes _tag:: columns in Tag.""" | |
| def test_system_tags_excluded_by_default(self): | |
| t = Tag( | |
| {"id": 1}, | |
| system_tags={"_tag::source:abc": "rec1"}, | |
| ) | |
| keys = t.keys() | |
| assert "_tag::source:abc" not in keys | |
| def test_system_tags_included_with_config(self): | |
| t = Tag( | |
| {"id": 1}, | |
| system_tags={"_tag::source:abc": "rec1"}, | |
| ) | |
| keys_default = t.keys() | |
| keys_with_tags = t.keys(columns=ColumnConfig(system_tags=True)) | |
| assert len(keys_with_tags) > len(keys_default) | |
| assert "_tag::source:abc" in keys_with_tags | |
| def test_all_info_includes_system_tags(self): | |
| t = Tag( | |
| {"id": 1}, | |
| system_tags={"_tag::source:abc": "rec1"}, | |
| ) | |
| keys = t.keys(all_info=True) | |
| assert "_tag::source:abc" in keys | |
| """Per design, system_tags=True includes system-tag-prefixed columns in Tag.""" | |
| def test_system_tags_excluded_by_default(self): | |
| system_tag_key = f"{constants.SYSTEM_TAG_PREFIX}source_id::abc" | |
| t = Tag( | |
| {"id": 1}, | |
| system_tags={system_tag_key: "rec1"}, | |
| ) | |
| keys = t.keys() | |
| assert system_tag_key not in keys | |
| def test_system_tags_included_with_config(self): | |
| system_tag_key = f"{constants.SYSTEM_TAG_PREFIX}source_id::abc" | |
| t = Tag( | |
| {"id": 1}, | |
| system_tags={system_tag_key: "rec1"}, | |
| ) | |
| keys_default = t.keys() | |
| keys_with_tags = t.keys(columns=ColumnConfig(system_tags=True)) | |
| assert len(keys_with_tags) > len(keys_default) | |
| assert system_tag_key in keys_with_tags | |
| def test_all_info_includes_system_tags(self): | |
| system_tag_key = f"{constants.SYSTEM_TAG_PREFIX}source_id::abc" | |
| t = Tag( | |
| {"id": 1}, | |
| system_tags={system_tag_key: "rec1"}, | |
| ) | |
| keys = t.keys(all_info=True) | |
| assert system_tag_key in keys |
| assert path not in { | ||
| tuple(k.split("/")) for k in db._tables | ||
| }, "Record should not be in committed store before flush" | ||
|
|
There was a problem hiding this comment.
This test asserts pending-vs-committed behavior by reading InMemoryArrowDatabase private fields (_tables). That’s brittle to internal refactors and doesn’t test the public database contract. Prefer assertions via public methods (or add a small public inspection hook if committed/pending distinction is part of the spec).
| Run the full test suite with: | ||
| ```bash | ||
| uv run pytest test-objective/ -v | ||
| ``` |
There was a problem hiding this comment.
Note: repo pytest.ini sets testpaths = tests, so running plain pytest (as CI commonly does) will not discover test-objective/ tests. If these are meant to run in CI by default, pytest.ini (or CI command) needs updating; otherwise it’s worth calling out explicitly here that they are opt-in via an explicit path.
| import pytest | ||
| from hypothesis import given, settings | ||
| from hypothesis import strategies as st | ||
|
|
||
| from orcapod.types import Schema |
There was a problem hiding this comment.
These property tests require the hypothesis package, but the repository's pyproject.toml does not declare it in dependencies/dev dependency groups. As-is, the test suite will fail to import. Add hypothesis to the test/dev dependencies (or skip these tests when Hypothesis isn’t installed).
Create a new `test-objective/` directory with 552 tests derived from design documents, protocol definitions, and interface contracts rather than implementation code. This provides independent validation against the specification to catch self-affirmation gaps in existing tests. Structure: - test-objective/unit/ — 20 test modules covering all core components - test-objective/integration/ — 5 modules for end-to-end pipeline scenarios - test-objective/property/ — 3 modules with Hypothesis property-based tests - TESTING_PLAN.md — comprehensive test case catalog (250+ cases) Key coverage areas: - Schema, ContentHash, ColumnConfig type contracts - Datagram/Tag/Packet immutability and lazy conversion - Stream construction, iteration, and format conversions - All source types and SourceRegistry CRUD - PythonPacketFunction and CachedPacketFunction - FunctionPod, FunctionNode, PersistentFunctionNode caching - All operators (Join, MergeJoin, SemiJoin, Batch, filters, mappers) - Hash stability, Merkle chain properties, pipeline hash invariants - System tag evolution rules (name-preserving/extending/evolving) - ColumnConfig filtering consistency across components - Operator algebra properties (commutativity, associativity, idempotency) https://claude.ai/code/session_01FQ9vEEdmUZjB4D7YL6mJuM
Runs test-objective/ suite on the same Python version matrix (3.11, 3.12) as the existing test workflow, with identical setup and coverage reporting. https://claude.ai/code/session_01FQ9vEEdmUZjB4D7YL6mJuM
- Add hypothesis>=6.0 to dev dependency group and regenerate lockfile (fixes ModuleNotFoundError in CI for property-based tests) - Replace Arrow types (pa.int64(), pa.utf8()) with Python types (int, str) in Schema tests to match documented DataType = type | UnionType contract - Replace hardcoded _tag:: prefix with constants.SYSTEM_TAG_PREFIX (_tag_) across test_tag.py, test_column_config_filtering.py, test_arrow_data_utils.py, and test_stream.py - Remove private field access (db._tables, db._get_record_key) in database tests, using only public API methods (get_record_by_id, get_all_records) https://claude.ai/code/session_01FQ9vEEdmUZjB4D7YL6mJuM
e07b6d9 to
3047a01
Compare
Summary
This PR introduces a new
test-objective/directory at the project root containing a comprehensive, specification-first test suite for the orcapod-python codebase. The tests are derived purely from design documents, protocol definitions, and interface contracts—not from reading implementation code—to avoid "self-affirmation" testing patterns.Key Changes
TESTING_PLAN.md: Added a 680-line comprehensive test specification catalog documenting all planned test cases organized by module and test type (unit, integration, property-based)
test-objective/ directory structure: Created a new test suite with:
conftest.py: Shared fixtures for sources, streams, functions, and operatorsunit/: 20+ unit test modules covering individual componentstest_types.py,test_datagram.py,test_tag.py,test_packet.pytest_stream.py,test_sources.py,test_source_registry.pytest_packet_function.py,test_function_pod.pytest_operators.py(Join, Filter, Select, Drop, MapTags, MapPackets, etc.)test_nodes.py(FunctionNode, OperatorNode, Persistent variants)test_hashing.py,test_databases.py,test_schema_utils.py,test_arrow_utils.py,test_arrow_data_utils.py,test_semantic_types.py,test_contexts.py,test_tracker.py,test_lazy_module.pyintegration/: 5 integration test modules covering end-to-end scenariostest_pipeline_flows.py: Source → Operator/FunctionPod → Stream pipelinestest_caching_flows.py: DB-backed caching with PersistentFunctionNode/OperatorNodetest_hash_invariants.py: Hash stability and Merkle chain propertiestest_provenance.py: System tag lineage through pipelinestest_column_config_filtering.py: ColumnConfig visibility across componentsproperty/: 3 property-based test modules using Hypothesistest_schema_properties.py: Schema algebra propertiestest_hash_properties.py: Hash determinism and collision resistancetest_operator_algebra.py: Operator commutativity, associativity, idempotencyImplementation Details
conftest.pyfor reusabilityhttps://claude.ai/code/session_01FQ9vEEdmUZjB4D7YL6mJuM