perf: Expand Benchmarks vs Upstream OpenTelemetry & CI Regression by JacksonWeber · Pull Request #1500 · microsoft/ApplicationInsights-node.js

JacksonWeber · 2026-05-22T02:36:22Z

Adds four new perf scenarios so we can measure overhead of this package against equivalent upstream OpenTelemetry calls:

- AzureMonitorSpanTest / AzureMonitorLogTest (useAzureMonitor + direct OTel API)
- OtelSpanTest / OtelLogTest (plain @opentelemetry/sdk-trace-base & sdk-logs reference, informational only)

Introduces a deterministic benchmark runner (bench.mjs + runBenchmarks.mjs) that bypasses the @azure-tools/test-perf worker pool, runs each scenario in a fresh Node child process to avoid OTel global-state contamination, and emits structured JSON with median/mean/stdev across N samples.
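As a rough illustration of the per-scenario summary described above, the sketch below reduces a list of throughput samples to median, mean, and standard deviation. The function name `summarize`, the output field names, and the choice of the sample (n - 1) standard deviation are assumptions for illustration; the actual JSON shape emitted by runBenchmarks.mjs is not shown in this PR description.

```javascript
// Hypothetical helper: summarize throughput samples (ops/s) for one scenario.
// Field names are illustrative, not the runner's real output schema.
function summarize(samples) {
  if (samples.length === 0) {
    throw new Error("need at least one sample");
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  // Odd count: middle element. Even count: average of the two middle elements.
  const median =
    sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  const mean = samples.reduce((sum, x) => sum + x, 0) / samples.length;
  // Assumption: sample standard deviation (n - 1); 0 when there is one sample.
  const variance =
    samples.length > 1
      ? samples.reduce((sum, x) => sum + (x - mean) ** 2, 0) / (samples.length - 1)
      : 0;
  return { samples: samples.length, median, mean, stdev: Math.sqrt(variance) };
}

// One slow outlier (4000) drags the mean down but leaves the median at 11900.
const summary = summarize([12100, 11900, 4000]);
console.log(JSON.stringify(summary));
```

The example input shows why a median is attractive for a CI gate: a single stalled sample moves the mean by thousands of ops/s while the median stays on a typical run.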

Adds .github/workflows/performance.yml: packs both the PR and base branch as tarballs via npm pack, installs each in turn under the PR's perf harness, runs the benchmark suite, and fails the job (blocking merge when set as a required check) if any gating scenario regresses beyond the configured threshold. Posts a sticky PR comment with the comparison table.

Regression limits

The gate is driven by PERF_REGRESSION_THRESHOLD (percent, set in .github/workflows/performance.yml).

- Default threshold: 15%. A gating scenario fails the build only when its median throughput (ops/s) drops by more than 15% relative to the base branch. Anything from 0% down to -15% (inclusive) is treated as within acceptable noise and passes.
- The threshold is compared against the median ops/s across samples (not the mean), to reduce sensitivity to single-run outliers and GC jitter.
- Improvements (positive Δ%) never fail the gate, regardless of magnitude.
- Only scenarios marked tier: "gating" can fail the build:
  - TrackDependencyTest (gating)
  - TrackTraceTest (gating)
  - AzureMonitorSpanTest (gating)
  - AzureMonitorLogTest (gating)
- OtelSpanTest / OtelLogTest are informational only: they are reported in the PR comment for like-for-like comparison against upstream OpenTelemetry but never block merge, since regressions there are not owned by this repo.
- The threshold can be tightened or loosened per-branch by editing PERF_REGRESSION_THRESHOLD in the workflow env (e.g. set it to 10 for a stricter 10% gate); no code change to the runner is required.
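The rules above can be sketched as a small comparison function. This is a minimal illustration, not the contents of comparePerf.mjs: the function name `evaluateGate` and the fields `baseMedian`, `candidateMedian`, and `failed` are hypothetical, and the threshold is passed as a plain argument rather than read from PERF_REGRESSION_THRESHOLD.

```javascript
// Hypothetical gate check; names are illustrative, not taken from comparePerf.mjs.
function evaluateGate(scenarios, thresholdPercent = 15) {
  const rows = scenarios.map(({ name, tier, baseMedian, candidateMedian }) => {
    // Positive delta means the candidate is faster; negative means a regression.
    const deltaPercent = ((candidateMedian - baseMedian) * 100) / baseMedian;
    // Strictly "more than" the threshold: exactly -threshold still passes.
    const regressed = deltaPercent < -thresholdPercent;
    // Only gating-tier scenarios can fail the build.
    return { name, tier, deltaPercent, failed: tier === "gating" && regressed };
  });
  return { rows, exitCode: rows.some((row) => row.failed) ? 1 : 0 };
}

const result = evaluateGate([
  { name: "TrackTraceTest", tier: "gating", baseMedian: 1000, candidateMedian: 850 },
  { name: "AzureMonitorLogTest", tier: "gating", baseMedian: 1000, candidateMedian: 2000 },
  { name: "OtelSpanTest", tier: "informational", baseMedian: 1000, candidateMedian: 500 },
]);
console.log(result.exitCode); // 0: -15% is inclusive, improvements pass, informational never gates
```

Lowering `candidateMedian` for TrackTraceTest to 849 (a 15.1% drop) would flip the exit code to 1, which is the case the workflow treats as a blocking regression.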

… gate

Adds four new perf scenarios so we can measure overhead of this package against equivalent upstream OpenTelemetry calls:
- AzureMonitorSpanTest / AzureMonitorLogTest (useAzureMonitor + direct OTel API)
- OtelSpanTest / OtelLogTest (plain @opentelemetry/sdk-trace-base & sdk-logs reference, informational only)

Introduces a deterministic benchmark runner (bench.mjs + runBenchmarks.mjs) that bypasses the @azure-tools/test-perf worker pool, runs each scenario in a fresh Node child process to avoid OTel global-state contamination, and emits structured JSON with median/mean/stdev across N samples.

Adds .github/workflows/performance.yml: packs both PR and base branch as tarballs via npm pack, installs each in turn under the PR's perf harness, runs the benchmark suite, and fails the job (blocking merge when set as a required check) if any gating scenario regresses by more than PERF_REGRESSION_THRESHOLD percent (default 15%). Posts a sticky PR comment with the comparison table.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

This PR expands the test/performanceTests harness to add new OpenTelemetry-based span/log scenarios, introduces a deterministic multi-scenario benchmark runner (per-scenario child process + JSON output), and adds a GitHub Actions workflow to compare candidate vs baseline performance and gate PRs on regressions.

Changes:

Add four new perf scenarios: AzureMonitorSpanTest/AzureMonitorLogTest (gating) and OtelSpanTest/OtelLogTest (informational baseline).
Add bench.mjs, runBenchmarks.mjs, and comparePerf.mjs to run isolated benchmarks and produce/compare structured results.
Add .github/workflows/performance.yml to run baseline vs candidate benchmarks on PRs and post a sticky comparison comment.

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| test/performanceTests/test/otelSpan.spec.ts | New upstream OTel span baseline perf scenario. |
| test/performanceTests/test/otelLog.spec.ts | New upstream OTel log baseline perf scenario. |
| test/performanceTests/test/azureMonitorSpan.spec.ts | New `useAzureMonitor()` + OTel span perf scenario (gating). |
| test/performanceTests/test/azureMonitorLog.spec.ts | New `useAzureMonitor()` + OTel log perf scenario (gating). |
| test/performanceTests/test/index.spec.ts | Registers new scenarios and updates perf output capture + telemetry reporting. |
| test/performanceTests/test/appInsightsShim.spec.ts | Makes shim startup idempotent across multiple test instantiations. |
| test/performanceTests/bench.mjs | Single-scenario deterministic benchmark runner (no worker pool). |
| test/performanceTests/runBenchmarks.mjs | Multi-scenario runner: per-scenario child process + sampling + JSON summary. |
| test/performanceTests/comparePerf.mjs | Compares baseline vs candidate JSON and returns a gating exit code + Markdown. |
| test/performanceTests/README.md | Documents scenarios, tiers, manual runs, and regression CI behavior. |
| test/performanceTests/package.json | Updates perf harness deps and adds benchmark/compare scripts. |
| test/performanceTests/package-lock.json | Lockfile updates for new/updated perf harness dependencies. |
| .github/workflows/performance.yml | Adds PR perf regression workflow (pack+install both versions, benchmark, compare, comment). |

Files not reviewed (1)

test/performanceTests/package-lock.json: Language not supported


Fixes CI: add explicit npm run build of the perf harness before running benchmarks; previous run died with ERR_MODULE_NOT_FOUND because dist-esm was never produced.

Review feedback:
- workflow: switch perf harness install to npm ci; add --no-package-lock to tarball installs so the lockfile is not rewritten mid-run
- AzureMonitor scenarios: acquire @opentelemetry/api and @opentelemetry/api-logs via createRequire resolved from the installed applicationinsights, so the Tracer/Logger we benchmark is backed by the SAME api / api-logs instance that useAzureMonitor() mutated (otherwise a duplicate hoisted copy at the harness level would yield a no-op proxy and we'd silently measure nothing)
- OTel reference scenarios: use provider.getTracer / provider.getLogger directly instead of going through the global registry, eliminating dual-instance concerns for these
- index.spec.ts: capture console.log with rest args + util.format so multi-arg / non-string calls are formatted the same way Node would print them
- runBenchmarks.mjs: propagate child.error and child.signal in failure messages (spawnSync status can be null on spawn error or signal exit)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
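To illustrate the runBenchmarks.mjs point about `status` being null, here is a minimal sketch of turning a spawnSync-style result into a failure message. The helper name `describeChildFailure` and the message wording are hypothetical; the only thing taken as given is the documented shape of the object `child_process.spawnSync` returns (`status`, `signal`, `error`).

```javascript
// Hypothetical helper: explain why a benchmark child process failed.
// `result` mirrors the object returned by child_process.spawnSync.
function describeChildFailure(scenario, result) {
  if (result.error) {
    // The process could not be spawned (or timed out); status is null here.
    return `${scenario}: failed to run child process: ${result.error.message}`;
  }
  if (result.signal) {
    // Terminated by a signal; status is also null in this case.
    return `${scenario}: child killed by signal ${result.signal}`;
  }
  if (result.status !== 0) {
    return `${scenario}: child exited with code ${result.status}`;
  }
  return null; // exit code 0, nothing to report
}

console.log(describeChildFailure("OtelLogTest", { status: null, signal: "SIGKILL" }));
console.log(describeChildFailure("OtelLogTest", { status: 1, signal: null }));
```

Checking `error` and `signal` before `status` avoids the uninformative "exited with code null" message that a status-only check would produce.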

Previous run cancelled at 25min job timeout: candidate ran 12min (5 samples x 6 scenarios x ~24s/sample on CI runners; AzureMonitor scenarios pay a 5-10s SDK init cost per fresh child process), baseline got 12min in before cancellation.

Cut samples 5 -> 3, duration 8s -> 5s, warmup 2s -> 1s. New estimate: ~5min per side, ~12min total with install/build. Bumped job timeout 25 -> 40 min for safety margin.

Median of 3 samples is still robust to a single outlier (the main source of CI flake), and 5s is enough sustained measurement time for even the slowest scenario (AzureMonitorLog at ~12k ops/s yields ~60k ops per sample).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
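The arithmetic behind these numbers can be checked with a short sketch. The helper names below are hypothetical, and the inputs are the approximate figures quoted in the commit message, so the results are estimates rather than measurements.

```javascript
// Hypothetical helpers for the time budget quoted in the commit message.
function estimateMinutes({ samples, scenarios, secondsPerSample }) {
  // Each sample runs in its own child process, so the costs add up linearly.
  return (samples * scenarios * secondsPerSample) / 60;
}

function opsPerSample(opsPerSecond, durationSeconds) {
  return opsPerSecond * durationSeconds;
}

// Old settings: 5 samples x 6 scenarios x ~24 s per sample.
const oldMinutesPerSide = estimateMinutes({ samples: 5, scenarios: 6, secondsPerSample: 24 });
console.log(oldMinutesPerSide); // 12

// Slowest scenario at ~12k ops/s over the new 5 s measurement window.
const slowestOps = opsPerSample(12000, 5);
console.log(slowestOps); // 60000
```

With two sides (baseline and candidate) at about 12 minutes each, the old configuration needed roughly 24 minutes of benchmarking alone, which is consistent with the run being cancelled at the 25-minute timeout once install and build time are included.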

JacksonWeber requested a review from Copilot May 22, 2026 02:36

Merge branch 'main' into perf/expand-tests-and-regression-gate

4ecabfc

Copilot started reviewing on behalf of JacksonWeber May 22, 2026 02:36

JacksonWeber changed the title ~~perf: expand benchmarks vs upstream OpenTelemetry + add CI regression…~~ perf: Expand Benchmarks vs Upstream OpenTelemetry & CI Regression May 22, 2026

Copilot AI reviewed May 22, 2026

Comment thread .github/workflows/performance.yml

Comment thread test/performanceTests/package.json Outdated

Comment thread test/performanceTests/test/index.spec.ts Outdated

Comment thread test/performanceTests/runBenchmarks.mjs

Comment thread .github/workflows/performance.yml

JacksonWeber and others added 3 commits May 21, 2026 21:36

perf: align OTel package versions with applicationinsights to dedupe …

7ad2383

…api-logs Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

JacksonWeber requested review from hectorhdzg and rads-1996 May 22, 2026 20:03

hectorhdzg approved these changes May 22, 2026

JacksonWeber merged commit a4c4943 into microsoft:main May 22, 2026
13 checks passed

This was referenced May 23, 2026

perf: Add pytest-benchmark suite and PR regression gate JacksonWeber/opentelemetry-distro-python#1

Closed

perf: Add pytest-benchmark suite and PR regression gate microsoft/opentelemetry-distro-python#165

Draft

JacksonWeber merged 5 commits into microsoft:main from JacksonWeber:perf/expand-tests-and-regression-gate

3 participants
