AMQP Context Tracking connection error handling#2106

Open
sanikache wants to merge 1 commit into
AcademySoftwareFoundation:mainfrom
dreamworksanimation:issue_1968

@sanikache
Contributor

@JeanChristopheMorinPerso and I discussed this in #1968.

The vendored pika library was emitting ERROR-level log messages whenever a broker connection failed, even when context tracking was not configured. This produced spurious output in normal rez usage.

Changes:

  • __init__.py: Set the rez.vendor.pika logger to CRITICAL at startup, silencing all pika output by default.

  • amqp.py:

    • Only call set_pika_log_level() when debug_context_tracking is enabled, rather than unconditionally on every publish attempt.
    • Catch AMQPConnectionError in addition to socket.error on connection failure; pika can raise either, depending on the failure mode.
    • Downgrade the connection failure message from ERROR to WARNING when context_tracking_host is explicitly set (unexpected failure), or to DEBUG when it is not (normal/expected when tracking is not configured).
    • Remove the dead else branch in set_pika_log_level() that reset the level to WARNING; the CRITICAL default is now set centrally in __init__.py.

  • rezconfig.py: Document the warning/debug behaviour on broker unreachability, and note that debug_context_tracking also enables pika DEBUG logging.
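
For reviewers who want the logging policy in one place, here is a minimal sketch of the behaviour described in the list above. It is not the code in this PR: `silence_vendor_logger`, `connect_quietly`, and the local `AMQPConnectionError` class are hypothetical stand-ins so the snippet runs without rez or pika installed.

```python
import logging
import socket


class AMQPConnectionError(Exception):
    """Hypothetical stand-in for rez.vendor.pika.exceptions.AMQPConnectionError."""


def silence_vendor_logger(name="rez.vendor.pika"):
    """Set the named logger to CRITICAL so lower-level output is dropped."""
    vendor_logger = logging.getLogger(name)
    vendor_logger.setLevel(logging.CRITICAL)
    return vendor_logger


def connect_quietly(connect, host_explicitly_set, logger):
    """Call connect(); on a connection failure, log it and return None.

    The failure is a WARNING when the host was explicitly configured
    (unexpected) and a DEBUG message otherwise (tracking not configured).
    """
    try:
        return connect()
    except (socket.error, AMQPConnectionError) as exc:
        level = logging.WARNING if host_explicitly_set else logging.DEBUG
        logger.log(level, "Cannot connect to the message broker: %s", exc)
        return None
```

Passing the connection attempt in as a callable keeps the sketch independent of pika; in the real module the equivalent call is `BlockingConnection(params)`, as shown in the traceback further down.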

Here's the new behavior with this change after pointing context_tracking_host to a fake host:

rez-env dev_build
13:11:39 WARNING  Cannot connect to the message broker: [Errno -2] Name or service not known

To see the full stack trace, enable debug_context_tracking:

setenv REZ_DEBUG_CONTEXT_TRACKING 1

or set debug_context_tracking to True in rezconfig.

Exception in thread Thread-1 (_publish_messages_async):
Traceback (most recent call last):
  File "/mmfs1/sasha/pipeline/rez_packages/python/3.11.12.a1/platform-linux/arch-x86_64/os-Rocky-8.7/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/mmfs1/sasha/pipeline/rez_packages/python/3.11.12.a1/platform-linux/arch-x86_64/os-Rocky-8.7/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/mmfs1/sasha/home/ben.hawkyard/rez/lib/python3.11/site-packages/rez/utils/amqp.py", line 133, in _publish_messages_async
    _publish_message(**kwargs)
  File "/mmfs1/sasha/home/ben.hawkyard/rez/lib/python3.11/site-packages/rez/utils/amqp.py", line 103, in _publish_message
    conn = BlockingConnection(params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mmfs1/sasha/home/ben.hawkyard/rez/lib/python3.11/site-packages/rez/vendor/pika/adapters/blocking_connection.py", line 361, in __init__
    self._impl = self._create_connection(parameters, _impl_class)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mmfs1/sasha/home/ben.hawkyard/rez/lib/python3.11/site-packages/rez/vendor/pika/adapters/blocking_connection.py", line 452, in _create_connection
    raise self._reap_last_connection_workflow_error(error)
rez.vendor.pika.exceptions.AMQPConnectionError

Signed-off-by: Ibrahim Sani <Ibrahim.Sani@dreamworks.com>
@sanikache sanikache requested a review from a team as a code owner May 14, 2026 21:41
@codecov

codecov Bot commented May 14, 2026

Codecov Report

❌ Patch coverage is 20.00000% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 60.62%. Comparing base (d415b96) to head (eeaa7e7).
⚠️ Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/rez/utils/amqp.py | 20.00% | 8 Missing ⚠️ |
| src/rez/__init__.py | 20.00% | 4 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2106      +/-   ##
==========================================
- Coverage   60.65%   60.62%   -0.03%     
==========================================
  Files         164      164              
  Lines       20584    20594      +10     
  Branches     3579     3583       +4     
==========================================
  Hits        12485    12485              
- Misses       7224     7233       +9     
- Partials      875      876       +1     

@maxnbk maxnbk added the devdays26 ASWF Dev Days 2026 label May 14, 2026
@maxnbk
Contributor

maxnbk commented May 14, 2026

I like the change, and it seems solid overall, but I'd like to wait for @JeanChristopheMorinPerso to comment.

@sanikache: Since the coverage bot is mostly complaining that nothing exercises the logging-configuration block, I'm wondering whether a mocked test module could hit this code path without needing an actual AMQP server?
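
One possible shape for such a test, as a rough sketch only: patch the connection class so it raises, then assert on the log level with `assertLogs`. The `fake_amqp` namespace and `publish` function below are hypothetical stand-ins so the example runs without rez; a real test would presumably patch `BlockingConnection` where rez.utils.amqp looks it up and call the real publish function instead.

```python
import logging
import socket
import types
import unittest
from unittest import mock

# Hypothetical stand-in for the module under test (rez.utils.amqp in rez).
fake_amqp = types.SimpleNamespace(
    BlockingConnection=lambda params: None,
    logger=logging.getLogger("fake_amqp"),
)


def publish(host_explicitly_set):
    """Hypothetical stand-in for the publish path described in this PR."""
    try:
        fake_amqp.BlockingConnection(None)
    except socket.error as exc:
        level = logging.WARNING if host_explicitly_set else logging.DEBUG
        fake_amqp.logger.log(
            level, "Cannot connect to the message broker: %s", exc
        )
        return False
    return True


class TestBrokerUnreachable(unittest.TestCase):
    def _publish_with_failure(self, host_explicitly_set):
        # The mocked connection class raises instead of opening a socket,
        # so no AMQP server is needed.
        error = socket.gaierror(-2, "Name or service not known")
        with mock.patch.object(
            fake_amqp, "BlockingConnection", side_effect=error
        ):
            with self.assertLogs("fake_amqp", level="DEBUG") as captured:
                self.assertFalse(publish(host_explicitly_set))
        return captured.records[0]

    def test_warning_when_host_is_set(self):
        record = self._publish_with_failure(host_explicitly_set=True)
        self.assertEqual(record.levelno, logging.WARNING)

    def test_debug_when_host_is_not_set(self):
        record = self._publish_with_failure(host_explicitly_set=False)
        self.assertEqual(record.levelno, logging.DEBUG)
```

Saved as a test module, this can be run with `python -m unittest`. The same pattern with `side_effect` set to an AMQPConnectionError instance would cover the second exception type.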
