[llm][1/4] Add Jinja2Cpp-based chat template formatter library #19533
seyeong-han wants to merge 1 commit into
Conversation
Foundation PR for the chat-template support stack. Adds the Jinja2Cpp-based `JinjaChatFormatter`, supporting chat types, embedded Llama3/Llama3.2/Gemma3 templates, build glue (CMake/Buck), and a focused C++ unit-test suite. This PR is reviewable in isolation — it has no behavior change for any existing runner; downstream PRs (B/C/D) plug it in.

This is part 1 of a 4-PR stack split out of pytorch#16987 per reviewer request from @kirklandsign:

* 1/4 (this PR) Library + tests
* 2/4 TextLLMRunner echo-gated special-token filter + EOS merge
* 3/4 Python bindings + Python LlamaRunner integration
* 4/4 llama_main CLI flags + chat_formatter wrapper + docs

What this PR adds
-----------------

* `extension/llm/chat_template/{chat_templates.h, BUCK, CMakeLists.txt, targets.bzl}` — embedded Llama3/Llama3.2/Gemma3 templates and the `ChatTemplateType` enum + `ModelTokens`. The CMake file pulls in Jinja2Cpp 1.3.2 via FetchContent, with `SUPPORT_REGEX_LOOKAHEAD` set BEFORE `FetchContent_MakeAvailable` so it propagates correctly, plus header staging for `nonstd` headers that some Jinja2Cpp installations omit. Installs `chat_templates.h` so SDK consumers can include it.
* `extension/llm/runner/{chat_types.h, jinja_chat_formatter.{h,cpp}}` — the universal Jinja chat formatter that supports any HuggingFace / vLLM chat template, not just the embedded ones. Loadable via `fromTemplate` (built-in), `fromString` (any string), or `fromFile` (any `.jinja` file). `formatConversation` injects vLLM/HuggingFace-standard params (`tools=[]`, `tool_choice=None`, `date_string`, `chat_template_kwargs`) so any template that references those variables renders correctly.
* `normalizeTemplate` handles vLLM/HF template quirks for Jinja2Cpp: notably, `not tools is none` maps to `tools` (truthy check), preserving the intent of `tools is not none` for empty-list defaults.
* `extension/llm/runner/{CMakeLists.txt, targets.bzl}` — link `extension_llm_runner` against `jinja2cpp` (PRIVATE) and define `EXECUTORCH_USE_JINJA2CPP`.
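The FetchContent ordering called out above can be sketched roughly as follows. Only the option name and the 1.3.2 version come from the PR text; the repository URL and the rest of the `FetchContent_Declare` arguments are illustrative assumptions, not the actual `CMakeLists.txt`:

```cmake
# Sketch only: the cache variable must be set BEFORE FetchContent_MakeAvailable,
# because MakeAvailable runs Jinja2Cpp's own CMake and reads the option then.
include(FetchContent)

set(SUPPORT_REGEX_LOOKAHEAD ON CACHE BOOL "" FORCE)  # must precede MakeAvailable

FetchContent_Declare(
  jinja2cpp
  GIT_REPOSITORY https://github.com/jinja2cpp/Jinja2Cpp.git  # assumed URL
  GIT_TAG 1.3.2
)
FetchContent_MakeAvailable(jinja2cpp)
```

Setting the option after `FetchContent_MakeAvailable` would leave Jinja2Cpp already configured with the default, which is the propagation bug the PR guards against.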
* `extension/llm/runner/test/{test_jinja_chat_formatter.cpp, CMakeLists.txt, targets.bzl, BUCK}` — unit tests covering Llama3 / Llama3.2 / Gemma3 embedded templates, `parseChatTemplateType` (case-insensitive), and three universal-Jinja regression tests:
  - a generic HuggingFace-style template (proves it's not Llama-specific)
  - a tools-aware template (validates the `tools=[]` default)
  - a `not tools is none` normalization regression test
* `CMakeLists.txt` — adds `add_subdirectory(extension/llm/chat_template)` guarded by `EXECUTORCH_BUILD_EXTENSION_LLM_RUNNER`.
* `shim_et/xplat/executorch/build/build_variables.bzl` — adds `jinja_chat_formatter.cpp` to the runner sources.

Notes
-----

* No behavior change for existing `TextLLMRunner` / `MultimodalRunner` users: the formatter is opt-in, only invoked when downstream code calls it.
* Sample vLLM templates are NOT checked in (per reviewer feedback); documentation in the follow-up CLI PR points users to vLLM's examples directory and HuggingFace `tokenizer_config.json` files.

Original PR (full stack): pytorch#16987
Universal Jinja support
Any HuggingFace / vLLM-style Jinja template works:
Test Plan
* `cmake --workflow llm-release`
* `make llama-cpu`
* `extension/llm/runner/test/test_jinja_chat_formatter`

Original PR
Splitting #16987 into 4 reviewable PRs.
cc @kirklandsign @larryliu0820 @metascroy @lucylq @mergennachin