LoRA support for LLM & VLM #4051

Open
dkalinowski wants to merge 22 commits into main from lora_2026

Conversation

@dkalinowski
Collaborator

🛠 Summary

CVS-182580

@dkalinowski dkalinowski added the WIP Do not merge until resolved label Mar 10, 2026
Comment thread versions.mk Outdated
@dkalinowski dkalinowski marked this pull request as ready for review April 13, 2026 07:14
Copilot AI review requested due to automatic review settings April 13, 2026 07:14
@dkalinowski dkalinowski changed the title [WIP] LoRA support for LLM & VLM LoRA support for LLM & VLM Apr 13, 2026
Contributor

Copilot AI left a comment

Pull request overview

Adds initial LoRA adapter support to OVMS GenAI-based LLM/VLM pipelines by extending MediaPipe node options and wiring adapter configuration into GenAI plugin configuration during servable initialization.

Changes:

  • Introduces LoraAdapter options in LLMCalculatorOptions and applies adapters during servable initialization.
  • Extends GenAiServableProperties to carry LoRA adapter configuration state.
  • Updates GenAI dependency pin and wires in a new (currently missing) unit test target.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| versions.mk | Updates GenAI source pin and changes GenAI org (now pointing to a fork). |
| src/llm/llm_calculator.proto | Adds LoraAdapter message and repeated lora_adapter option. |
| src/llm/servable.hpp | Adds ov::genai::AdapterConfig to servable properties. |
| src/llm/servable_initializer.hpp | Declares initializeLoraAdapters(...) helper. |
| src/llm/servable_initializer.cpp | Implements LoRA adapter loading and injects adapters into pluginConfig. |
| src/llm/language_model/legacy/servable_initializer.cpp | Calls initializeLoraAdapters(...) during LM legacy init. |
| src/llm/language_model/continuous_batching/servable_initializer.cpp | Calls initializeLoraAdapters(...) during LM CB init. |
| src/llm/visual_language_model/legacy/servable_initializer.cpp | Calls initializeLoraAdapters(...) during VLM legacy init. |
| src/BUILD | Adds test/llm/lora_adapter_test.cpp to cc_test sources. |

Comment thread versions.mk Outdated
Comment thread src/llm/servable.hpp
Comment thread src/llm/servable_initializer.cpp
Comment thread src/BUILD Outdated
optional LLMCalculatorOptions ext = 113473750;
}

message LoraAdapter {
Collaborator

Collaborator Author

We don't need an alias in LLM/VLM, since a request has no way to interact with it. What do you think?

Comment thread src/llm/servable.hpp Outdated
@dkalinowski dkalinowski removed the WIP Do not merge until resolved label Apr 22, 2026
Comment thread src/BUILD Outdated
"test/multipart_calculator_test.cpp",
"test/llm/assisted_decoding_test.cpp",
"test/llm/llmnode_test.cpp",
"test/llm/lora_adapter_test.cpp",
Collaborator

Separate target?

Collaborator Author

done

Comment thread src/test/llm/lora_adapter_test.cpp Outdated
Comment thread src/test/llm/lora_adapter_test.cpp Outdated
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 8 comments.

}
}
// since it is only applied once at initialization, static mode is sufficient and more efficient.
adapterConfig.set_mode(ov::genai::AdapterConfig::MODE_STATIC);
#include <vector>

#include <gtest/gtest.h>
#include <openvino/genai/lora_adapter.hpp>
Comment on lines +328 to +332
SPDLOG_INFO("Processing LoRA adapter number {} with model path: {} alpha: {}", i, loraAdapterOption.model_path(), loraAdapterOption.alpha());
if (loraAdapterOption.alpha() <= 0.0f || loraAdapterOption.alpha() > 1.0f) {
    SPDLOG_ERROR("LoRA adapter alpha value {} is out of valid range (0.0, 1.0]", loraAdapterOption.alpha());
    return StatusCode::LLM_NODE_RESOURCE_STATE_INITIALIZATION_FAILED;
}
Comment on lines +333 to +337
auto fsLoraPath = std::filesystem::path(loraAdapterOption.model_path());
std::string loraPath;
if (fsLoraPath.is_relative()) {
    loraPath = (std::filesystem::path(graphPath) / fsLoraPath).string();
} else {
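
The path handling in this hunk can be illustrated with a small self-contained sketch. `resolveLoraPath` is a hypothetical name, and the else-branch is cut off in the hunk, so the sketch assumes an absolute path is simply used as given.

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper mirroring the hunk above: a relative adapter path is
// joined onto the graph directory.
// Assumption: the truncated else-branch keeps an absolute path unchanged.
inline std::string resolveLoraPath(const std::string& graphPath, const std::string& modelPath) {
    const std::filesystem::path fsLoraPath(modelPath);
    if (fsLoraPath.is_relative()) {
        return (std::filesystem::path(graphPath) / fsLoraPath).string();
    }
    return fsLoraPath.string();
}
```

On a POSIX system, `resolveLoraPath("/models/llm", "adapters/a.safetensors")` yields `/models/llm/adapters/a.safetensors`. An empty `graphPath` leaves a relative path unchanged, since `operator/` adds no separator after an empty left-hand side.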
Comment on lines +79 to 86
status = initializeLoraAdapters(nodeOptions, graphPath, properties);
if (!status.ok()) {
    return status;
}

status = JsonParser::parsePluginConfig(nodeOptions.plugin_config(), properties->pluginConfig);
if (!status.ok()) {
    SPDLOG_ERROR("Error during llm node plugin_config option parsing to JSON: {}", nodeOptions.plugin_config());
Comment on lines +201 to 208
status = initializeLoraAdapters(nodeOptions, graphPath, properties);
if (!status.ok()) {
    return status;
}

status = JsonParser::parsePluginConfig(nodeOptions.plugin_config(), properties->pluginConfig);
if (!status.ok()) {
    SPDLOG_ERROR("Error during llm node plugin_config option parsing to JSON: {}", nodeOptions.plugin_config());
Comment on lines +78 to 85
status = initializeLoraAdapters(nodeOptions, graphPath, properties);
if (!status.ok()) {
    return status;
}

status = JsonParser::parsePluginConfig(nodeOptions.plugin_config(), properties->pluginConfig);
if (!status.ok()) {
    SPDLOG_ERROR("Error during llm node plugin_config option parsing to JSON: {}", nodeOptions.plugin_config());
a2->set_model_path(loraFilePath);
a2->set_alpha(0.7f);
ASSERT_EQ(initializeLoraAdapters(nodeOptions, "", properties), StatusCode::OK);
EXPECT_EQ(properties->pluginConfig.count("adapters"), 1);
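
The assertion in this hunk expects a single "adapters" entry even though two adapters were configured. A minimal sketch of why that count is 1, using stand-in types rather than the OpenVINO GenAI API: all adapters are assumed to be collected into one config object, which is then stored under one key. `FakeAdapterConfig`, `FakePluginConfig` and `storeAdapters` are hypothetical names.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Stand-in types, NOT the OpenVINO GenAI API: a list of (path, alpha) pairs
// plays the role of ov::genai::AdapterConfig, and a plain map plays the role
// of pluginConfig.
using FakeAdapterConfig = std::vector<std::pair<std::string, float>>;
using FakePluginConfig = std::map<std::string, FakeAdapterConfig>;

// Stores the whole adapter collection under a single "adapters" key, so the
// number of map entries does not grow with the number of adapters.
inline void storeAdapters(FakePluginConfig& pluginConfig, const FakeAdapterConfig& adapters) {
    pluginConfig["adapters"] = adapters;
}
```

Under that assumption, the key count stays at 1 for any number of adapters, and a test that wants to confirm both adapters were loaded would need to inspect the stored config rather than the key count.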
@dtrawins dtrawins added this to the 2026.2_rc milestone May 8, 2026
4 participants