perf(gui): implement batched UI rendering#2514
perf(gui): implement batched UI rendering#2514githubawn wants to merge 18 commits intoTheSuperHackers:mainfrom
Conversation
|
| Filename | Overview |
|---|---|
| GeneralsMD/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DDisplay.cpp | Core of the batching implementation: adds setup2DRenderState, onBeginBatch/onEndBatch/onFlush, and reworks all draw primitives to defer Render() calls. Ref-counting is correct; P2 concern about blend-state restoration after onFlush. |
| GeneralsMD/Code/GameEngine/Include/GameClient/Display.h | Adds public beginBatch/endBatch/flush/isBatching interface and protected virtual hooks; m_isBatching properly initialised in Display.cpp constructor. |
| GeneralsMD/Code/GameEngineDevice/Include/W3DDevice/GameClient/W3DDisplay.h | Adds TextureClass forward declaration and four new batch-state members; all members initialised in the W3DDisplay constructor. |
| GeneralsMD/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DDisplayString.cpp | Flushes the 2D batch before rendering text to preserve correct draw order; hotkey renderer render moved outside the needNewPolys block to ensure it always fires. |
| GeneralsMD/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DInGameUI.cpp | Wraps the entire InGameUI draw pass in beginBatch/endBatch; P2 concern about non-reentrant batch semantics if ever called within W3DView's own batch region. |
| Core/GameEngineDevice/Source/W3DDevice/GameClient/W3DView.cpp | Minimal change: wraps iterateDrawablesInRegion post-draw callbacks in beginBatch/endBatch to batch drawable 2D overlay rendering. |
| GeneralsMD/Code/GameEngine/Source/GameClient/Display.cpp | Implements beginBatch/endBatch/flush with guard against nested calls; m_isBatching initialised to FALSE in constructor. |
Sequence Diagram
sequenceDiagram
participant Caller as W3DInGameUI / W3DView
participant D as Display::beginBatch/endBatch/flush
participant W3D as W3DDisplay (onBeginBatch/onFlush/onEndBatch)
participant R2D as Render2DClass (m_2DRender)
participant TS as W3DDisplayString
Caller->>D: beginBatch()
D->>W3D: onBeginBatch()
W3D->>R2D: Reset()
Note over W3D: m_batchNeedsInit=TRUE, m_batchTexture=nullptr
loop Per draw call (drawImage / drawLine / drawFillRect)
Caller->>W3D: drawXxx()
W3D->>W3D: setup2DRenderState(tex, mode, grayscale)
alt State unchanged
W3D-->>W3D: early return (no flush)
else State changed
W3D->>W3D: onFlush() → Render2D.Render() + Reset()
W3D->>W3D: Add_Ref on new tex, Release_Ref on old tex
end
W3D->>R2D: Add_Quad / Add_Tri / Add_Line
Note over W3D: m_isBatching=TRUE → skip immediate Render()
end
opt Text rendering (W3DDisplayString::draw)
TS->>D: isBatching() → TRUE
TS->>D: flush()
D->>W3D: onFlush() → Render() pending batch
TS->>TS: m_textRenderer.Render()
end
Caller->>D: endBatch()
D->>W3D: onEndBatch()
W3D->>W3D: onFlush() → final Render() + Reset()
W3D->>W3D: Release_Ref(m_batchTexture)
Note over W3D: m_isBatching=FALSE
Prompt To Fix All With AI
This is a comment left during a code review.
Path: GeneralsMD/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DDisplay.cpp
Line: 2749
Comment:
**Clipping inner-check fires for images fully inside clip region**
After the outer early-return guard was correctly expanded (`startX >= hi.x || startY >= hi.y` is new), the inner condition now triggers for any image that merely *touches* a boundary — including images that are fully inside the clip region:
```cpp
if (screen_rect.Left < m_clipRegion.lo.x || // left edge is to the left of clip (outside)
screen_rect.Right > m_clipRegion.hi.x || // right edge is to the right of clip (outside)
screen_rect.Top < m_clipRegion.lo.y || // top edge is above clip (outside)
screen_rect.Bottom > m_clipRegion.hi.y) // bottom edge is below clip (outside)
```
Because `m_clipRegion.lo`/`hi` define the *inside* boundary, an image that sits fully within `[lo, hi]` satisfies none of those conditions, so the block is skipped and `screen_rect`/`uv_rect` are passed unchanged to `Add_Quad`/`Add_Tri`. This is the correct fast-path — no clipping computation is performed.
The concern is the opposite case: an image partially outside on its **right or bottom edge only** (e.g., `screen_rect.Right > m_clipRegion.hi.x`). In the original code this branch was also reached (via the `else` after the old early-return), so the effective behaviour is unchanged. The fix is correct; this is just a note for reviewers.
No change needed — confirmed correct.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: GeneralsMD/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DDisplay.cpp
Line: 613-621
Comment:
**`onFlush` does not restore blend states after render**
`onFlush()` calls `m_2DRender->Render()` then `m_2DRender->Reset()`, but it does not reset `Enable_Grayscale`, `Enable_Alpha`, or `Enable_Additive` to their defaults. The *next* `setup2DRenderState` call will explicitly re-apply these states anyway, so there is no observable bug in the current batched callers.
The latent risk appears when `flush()` is called *outside* a batch (the method is `public virtual` and `m_batchNeedsInit` starts as `FALSE`, so the guard `!m_batchNeedsInit` passes). The subsequent non-batched draw call would inherit whatever blend state was set by the last batch item rather than starting from a clean slate.
Consider adding a state reset at the end of `onFlush` to make the invariant explicit:
```cpp
void W3DDisplay::onFlush()
{
if (m_2DRender && !m_batchNeedsInit)
{
m_2DRender->Render();
m_2DRender->Reset();
m_2DRender->Enable_Grayscale(FALSE);
m_2DRender->Enable_Alpha(TRUE);
m_2DRender->Enable_Additive(FALSE);
m_batchNeedsInit = TRUE;
}
}
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: GeneralsMD/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DInGameUI.cpp
Line: 390-443
Comment:
**`endBatch` inside W3DInGameUI will prematurely terminate an outer batch**
`Display::endBatch` unconditionally ends the batch whenever `m_isBatching == TRUE`. There is no nesting depth counter — only the first `endBatch` call matters.
`W3DView::draw()` now also calls `beginBatch()`/`endBatch()` around `iterateDrawablesInRegion`. If `W3DInGameUI::draw()` is ever invoked from within that region iteration (e.g., via a post-draw callback that triggers UI repainting), the inner `endBatch()` here would flush and close the outer batch prematurely. Any subsequent 2D draws in `W3DView` would then fall through to the non-batched path, silently losing the performance benefit and potentially mis-ordering rendered geometry.
Today the two sites appear to be separate render passes and this path is not taken, but the current implementation has no guard against future refactors that introduce such nesting. A reference-counted `beginBatch`/`endBatch` (or a push/pop stack) would make the semantics explicit and safe.
How can I resolve this? If you propose a fix, please make it concise.Reviews (13): Last reviewed commit: "Move HUD batching from GameWindowManager..." | Re-trigger Greptile
GeneralsMD/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DDisplay.cpp
Outdated
Show resolved
Hide resolved
GeneralsMD/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DDisplay.cpp
Outdated
Show resolved
Hide resolved
split single line IF
|
Significantly improves #109 |
optimize clipping check
remove obsolete else
GeneralsMD/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DDisplay.cpp
Outdated
Show resolved
Hide resolved
return statement on seperate line
centralized flush
|
"One P1 render-state initialization bug in onBeginBatch() should be fixed before merging to avoid potential cross-frame visual corruption." The P1 finding could be a false positive based on a local variable analysis that omits the downstream hardware synchronization in Render2DClass::Render() (render2d.cpp:601). Render2DClass is a self-synchronizing state machine. Its Render() call (line 628 for textures, line 694 for shaders) explicitly pushes its internal members to the DX8Wrapper on every execution. The tracking variables in W3DDisplay (m_batchTexture, etc.) are mirror-images of Render2DClass's internal members. happy? |
|
No, the P1 finding stands. Your argument about Render2DClass being self-synchronizing addresses hardware state consistency, but not the C++ undefined behavior in W3DDisplay. The issue:
Even if Render2DClass correctly synchronizes hardware state on every
An uninitialized bool means these decisions are based on garbage. The fact that the hardware might recover downstream doesn't make reading uninitialized memory defined behavior. The fix is one line. Keep it. |
endBatch() is re-entrant-safe Clipping fast-path in drawImage remove Redundant m_isBatching = FALSE
…sGameCode into perf/gui-batch
|
I see greptile woke up on the sassy side of the motherboard. |
Another test would be to get a lot of units on screen then select them all so their unit info is being drawn. The health bars make use of the line drawing classes which i believe are inherently affected by this change. |


Reduces draw calls by batching 2D elements by texture state in setup2DRenderState.