SDAP-542: Updated collection manager and granule ingester build to python 3.11#118

Open
ngachung wants to merge 3 commits into apache:develop from ngachung:SDAP-542

Conversation

@ngachung
Contributor

Caveat: Built and tested locally only. Have not verified that the builds work in an AWS deployment.

The granule ingester build has to use https://github.com/apache/sdap-nexusproto/tree/protobuf-bump; otherwise it fails with AttributeError: module 'collections' has no attribute 'MutableMapping'
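For context on that AttributeError: Python 3.10 removed the abstract base class aliases (such as MutableMapping) from the top-level collections module, so code that still looks them up there fails on 3.11. The supported location is collections.abc. The following is a minimal sketch of the difference, not code from this PR or from nexusproto; the FieldMap class and its contents are hypothetical.

```python
import collections
import collections.abc
import sys

# The old alias `collections.MutableMapping` exists only on Python < 3.10.
legacy_alias_present = hasattr(collections, "MutableMapping")


class FieldMap(collections.abc.MutableMapping):
    """Hypothetical dict-like container built on the supported ABC location."""

    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


fields = FieldMap()
fields["tile_id"] = "abc"
print(sys.version_info[:2], legacy_alias_present, dict(fields))
```

Code that imports from collections.abc works on both old and new interpreters, which is presumably why a dependency bump resolves the failure here.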

Confirmed that a local build of the collection manager detects new files and that the granule ingester ingests them into Cassandra and Solr.

Removed pip uninstall -y chardet from the collection manager build and confirmed that chardet is not installed in the new build:

docker run --rm --entrypoint bash sdap-local/sdap-collection-manager:1.5.0 -c "pip show chardet"
WARNING: Package(s) not found: chardet
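The same check can also be made from inside the container's Python interpreter, which may be handy in a build smoke test. This is only a sketch using the standard library's importlib.metadata; the installed_version helper is a hypothetical name, not part of the collection manager.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints "chardet: None" when chardet is not installed in this environment.
print("chardet:", installed_version("chardet"))
```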

@ngachung ngachung requested a review from RKuttruff May 22, 2026 04:53
Contributor

@RKuttruff RKuttruff left a comment

Hi @ngachung, sorry it took me so long to get to this.

I built without issues but had problems deploying to EKS:

2026-05-26 15:02:55,973 [ERROR] [__main__::135] module 'aioboto3' has no attribute 'resource'
Traceback (most recent call last):
  File "/collection_manager/collection_manager/main.py", line 127, in main
    await collection_watcher.start_watching()
  File "/usr/local/lib/python3.11/site-packages/collection_manager/services/CollectionWatcher.py", line 74, in start_watching
    await self._observer.start()
  File "/usr/local/lib/python3.11/site-packages/collection_manager/services/S3Observer.py", line 61, in start
    await self._run_periodically(loop=None,
  File "/usr/local/lib/python3.11/site-packages/collection_manager/services/S3Observer.py", line 89, in _run_periodically
    await func(*args, **kwargs)
  File "/usr/local/lib/python3.11/site-packages/collection_manager/services/S3Observer.py", line 101, in _poll
    new_cache_for_watch = await self._get_s3_files(watch.path)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/collection_manager/services/S3Observer.py", line 135, in _get_s3_files
    async with aioboto3.resource("s3") as s3:
               ^^^^^^^^^^^^^^^^^
AttributeError: module 'aioboto3' has no attribute 'resource'

It looks like granule ingester will suffer from this issue as well: https://github.com/ngachung/incubator-sdap-ingester/blob/1dc06d00a53e86235915398f60fad7697b0dc05f/granule_ingester/granule_ingester/granule_loaders/GranuleLoader.py#L102

Additionally, since a specific nexusproto branch is required for this build, we should probably merge it into dev and publish a prerelease to PyPI before proceeding with any releases containing this PR. I opened a PR on that branch.

@ngachung
Contributor Author

@RKuttruff Thank you for building and testing in EKS! Because we bumped the aioboto3 version, we also have to change

        async with aioboto3.resource("s3") as s3:

to

        session = aioboto3.Session()
        async with session.resource("s3") as s3:

That error should be resolved now.
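For anyone applying the same change elsewhere (for example in the granule loader linked above), here is a rough sketch of the session-based pattern in a small standalone helper. It is not the actual S3Observer code: list_s3_keys, the bucket name, and the prefix are hypothetical, and the helper accepts any object that behaves like an aioboto3.Session so it can be exercised with a stub.

```python
import asyncio


async def list_s3_keys(session, bucket_name, prefix=""):
    """Collect object keys under `prefix` using a session-scoped S3 resource."""
    keys = []
    # Newer aioboto3 releases expose resource() on a Session, not on the module.
    async with session.resource("s3") as s3:
        bucket = await s3.Bucket(bucket_name)
        async for obj in bucket.objects.filter(Prefix=prefix):
            keys.append(obj.key)
    return keys


def main():
    # Imported lazily so the helper above stays importable without aioboto3.
    import aioboto3

    session = aioboto3.Session()
    # "example-bucket" and "granules/" are placeholders, not real SDAP settings.
    print(asyncio.run(list_s3_keys(session, "example-bucket", "granules/")))
```

Calling main() requires aioboto3 to be installed and AWS credentials to be available; the helper itself only depends on the session object passed in.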

Contributor

@RKuttruff RKuttruff left a comment

Retested with the aioboto3 changes; the components are functioning correctly in EKS.
