feat(minimax-multimodal-toolkit): update models, simplify scripts, and standardize docs #57

Open
BLUE-coconut wants to merge 1 commit into MiniMax-AI:main from BLUE-coconut:feat/update-multimodal-toolkit-models-and-docs
@BLUE-coconut

Summary

  • Add default models table & error handling: New section documenting default models for each capability (TTS, Music, Image, Video) so the agent uses them without prompting the user. Added auto-fallback logic for video quota exhaustion (MiniMax-Hailuo-2.3 → MiniMax-Hailuo-2.3-Fast).
  • Remove deprecated models & simplify constraints: Removed speech-2.6, T2V-01, I2V-01, S2V-01, MiniMax-Hailuo-02 references. Unified all video modes (t2v/i2v/sef/ref) to use MiniMax-Hailuo-2.3 with 6s + 768P only. Changed default video duration from 10s to 6s.
  • Translate Chinese text to English: All Chinese descriptions in docs, prompt examples, camera instructions, and aspect ratio tables are now in English for international accessibility.
  • Simplify image generation script: Replaced the Windows-oriented temp-file payload builder with inline jq piping. Changed base64 -w 0 to plain base64 for macOS compatibility. Removed verbose error output.
  • Add model validation to video scripts: Both generate_video.sh and generate_long_video.sh now validate model/duration/resolution combinations upfront with clear error messages.
  • Remove plan limits section: Quota tables were removed from SKILL.md as they are subject to frequent changes and add maintenance burden.
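
The upfront validation described above could look roughly like the sketch below. This is an illustration only, not the code from generate_video.sh: the function name `validate_video_args` is hypothetical, and accepting MiniMax-Hailuo-2.3-Fast under the same 6s + 768P constraint is an assumption, since the description only states that constraint for MiniMax-Hailuo-2.3.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from the PR's scripts): reject unsupported
# model/duration/resolution combinations before any API call is made.
validate_video_args() {
  local model="$1" duration="$2" resolution="$3"

  case "$model" in
    # Assumption: the fallback model shares the primary model's constraints.
    MiniMax-Hailuo-2.3|MiniMax-Hailuo-2.3-Fast) ;;
    *)
      echo "Error: unsupported model '$model'" >&2
      return 1
      ;;
  esac

  if [ "$duration" != "6" ] || [ "$resolution" != "768P" ]; then
    echo "Error: $model supports only 6s at 768P (got ${duration}s at $resolution)" >&2
    return 1
  fi
}

# Example call with the defaults named in the summary.
validate_video_args "MiniMax-Hailuo-2.3" 6 768P && echo "ok"
```

Failing early this way keeps a bad combination from costing a request, and the message names the offending values rather than surfacing a raw API error.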

Test plan

  • Verify TTS generation works with speech-2.8-hd default model
  • Verify image generation (t2i and i2i modes) with simplified payload builder
  • Verify video generation in all modes (t2v, i2v, sef, ref) defaults to MiniMax-Hailuo-2.3 at 6s + 768P
  • Verify model validation rejects unsupported model/duration/resolution combinations
  • Verify long video generation defaults to 6s segments
  • Confirm base64 command works on macOS without -w 0 flag
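
For the base64 check, one point is worth keeping in mind: GNU coreutils base64 wraps its output at 76 characters by default, so dropping -w 0 changes the output on Linux for longer inputs. A minimal sketch of an encoder that yields single-line output on both platforms is shown below. The helper name `encode_b64` is hypothetical, and whether the PR's script strips newlines this way is not stated in the description.

```shell
#!/usr/bin/env bash
# Hypothetical helper: base64-encode a file as a single line on both
# GNU/Linux and macOS, without relying on the GNU-only -w flag.
encode_b64() {
  # Reading from stdin avoids differences in how each base64 takes a
  # file argument; tr removes any line wrapping GNU base64 adds.
  base64 < "$1" | tr -d '\n'
}

# Example: encode a small temporary file.
tmp_file="$(mktemp)"
printf 'hello' > "$tmp_file"
encode_b64 "$tmp_file"   # prints aGVsbG8=
echo
rm -f "$tmp_file"
```

Stripping newlines after the fact is slightly more work than a flag, but it behaves the same regardless of which base64 implementation is on the PATH.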

🤖 Generated with Claude Code

…d standardize docs

- Add default models table and error handling section with auto-fallback for video quota exhaustion
- Remove deprecated models (speech-2.6, T2V-01, I2V-01, S2V-01, MiniMax-Hailuo-02) and unify all video modes to MiniMax-Hailuo-2.3
- Simplify video model constraints to 6s + 768P only (remove 10s/1080P options)
- Remove plan limits/quotas section from SKILL.md (subject to change, reduces maintenance burden)
- Translate all Chinese text in docs and prompts to English for international accessibility
- Simplify image generation script: replace Windows temp-file payload builder with inline jq, fix macOS base64 compatibility
- Add model validation to video generation scripts with clear error messages
- Change default video duration from 10s to 6s to match actual model constraints
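
The auto-fallback for video quota exhaustion could be sketched as follows. Everything here other than the two model names is an assumption made for illustration: `request_video` is a stand-in for the real API call, `QUOTA_EXIT_CODE` is a made-up status value, and `SIMULATE_QUOTA_EXHAUSTED` exists only to exercise the fallback path. The description does not say whether the fallback lives in a script or in the agent instructions.

```shell
#!/usr/bin/env bash
PRIMARY_MODEL="MiniMax-Hailuo-2.3"
FALLBACK_MODEL="MiniMax-Hailuo-2.3-Fast"
QUOTA_EXIT_CODE=42   # hypothetical status meaning "quota exhausted"

# Stand-in for the real request: fails with QUOTA_EXIT_CODE for the primary
# model when SIMULATE_QUOTA_EXHAUSTED=1, otherwise prints the model it used.
request_video() {
  local model="$1"
  if [ "$model" = "$PRIMARY_MODEL" ] && [ "${SIMULATE_QUOTA_EXHAUSTED:-0}" = "1" ]; then
    return "$QUOTA_EXIT_CODE"
  fi
  echo "$model"
}

# Try the primary model; retry once with the fallback only on quota errors.
generate_with_fallback() {
  local status=0
  request_video "$PRIMARY_MODEL" || status=$?
  if [ "$status" -eq "$QUOTA_EXIT_CODE" ]; then
    echo "Quota exhausted for $PRIMARY_MODEL; retrying with $FALLBACK_MODEL" >&2
    request_video "$FALLBACK_MODEL"
    return $?
  fi
  return "$status"
}
```

Limiting the retry to the quota status keeps other failures (bad input, network errors) visible instead of silently switching models.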

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>