[RFC] Support DistillT5 #652

Draft
idostyle wants to merge 3 commits into leejet:master from idostyle:dt

Conversation

@idostyle
Contributor

idostyle commented Apr 9, 2025

Add support for DistillT5 (model: https://huggingface.co/LifuWang/DistillT5, code: https://github.com/LifuWang-66/DistillT5).

It can be tested with https://huggingface.co/Eviation/DistillT5, e.g. https://huggingface.co/Eviation/DistillT5/blob/main/DistillT5-F32.safetensors, which removes the additional wrapping.

  • Probably needs to be integrated alongside T5 XXL, ideally with automatic detection
  • Check whether the unmodified LifuWang DistillT5 model can be used as-is, despite the additional encoder wrapping on its tensor names

I would appreciate any ideas on how best to incorporate this in a less intrusive way.
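
As one possible starting point for the second item, the wrapping could be detected and stripped by name when the tensors are loaded. The following is only a minimal sketch and not code from this PR: the function names are hypothetical, the `wrapper.` prefix in the usage note is a made-up placeholder (the real wrapping prefix would have to be read from the checkpoint), and it assumes the usual Hugging Face T5 layout in which encoder tensors contain `encoder.block.` in their names.

```python
def detect_wrapper_prefix(names, marker="encoder.block."):
    """Guess a wrapping prefix from tensor names (hypothetical helper).

    Assumption: every encoder tensor name contains `marker`, and any
    wrapping shows up as text in front of it. Returns "" when there is
    no wrapping or when the names disagree about the prefix.
    """
    prefixes = {name[: name.index(marker)] for name in names if marker in name}
    return prefixes.pop() if len(prefixes) == 1 else ""


def strip_wrapper_prefix(tensors, prefix):
    """Return a new dict with `prefix` removed from the keys that carry it."""
    renamed = {}
    for name, value in tensors.items():
        new_name = name[len(prefix):] if name.startswith(prefix) else name
        if new_name in renamed:
            # Refuse to silently overwrite a tensor after renaming.
            raise ValueError(f"name collision after stripping prefix: {new_name}")
        renamed[new_name] = value
    return renamed
```

With a placeholder checkpoint such as `{"wrapper.encoder.block.0.layer.0.SelfAttention.q.weight": ..., "wrapper.shared.weight": ...}`, `detect_wrapper_prefix` would report `wrapper.` and `strip_wrapper_prefix` would map both keys back to their unwrapped names. An unwrapped checkpoint passes through unchanged, so the same path could serve both files.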

@stduhpf
Contributor

stduhpf commented Apr 10, 2025

Nice, just as I was thinking of adding it.

@github-actions

This PR has been inactive for 365 days. If there is no new activity within 7 days, it will be closed automatically. Comment, push new commits, or remove the pr:inactive label to keep it open. Add pr:keep-open to exempt it from future inactive PR cleanup.

github-actions bot added the pr:inactive label (Inactive PR pending automatic closure) on May 17, 2026
Labels

pr:inactive Inactive PR pending automatic closure

2 participants