Skip to content

Lastgenre: Cleanup existing genres#6317

Open
Nukesor wants to merge 2 commits intobeetbox:masterfrom
Nukesor:canonicalize-existing
Open

Lastgenre: Cleanup existing genres#6317
Nukesor wants to merge 2 commits intobeetbox:masterfrom
Nukesor:canonicalize-existing

Conversation

@Nukesor
Copy link
Contributor

@Nukesor Nukesor commented Jan 24, 2026

Description

Fixes #6305

Implements a new cleanup_existing config flag. The added documentation should explain its behavior. If there're any unclarities, we need to adjust the docs :)

To Do

  • Documentation.
  • Changelog.
  • Tests.

@Nukesor Nukesor requested a review from a team as a code owner January 24, 2026 21:35
Copilot AI review requested due to automatic review settings January 24, 2026 21:35
@Nukesor Nukesor changed the base branch from master to lastgenre_item_fallback January 24, 2026 21:35
@github-actions
Copy link

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

@Nukesor Nukesor changed the base branch from lastgenre_item_fallback to master January 24, 2026 21:36
@Nukesor Nukesor force-pushed the canonicalize-existing branch 2 times, most recently from 4296305 to 64a9c41 Compare January 24, 2026 21:38
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements a new canonicalize_existing configuration flag for the lastgenre plugin to enable whitelist canonicalization of existing genres when force: no, canonical: yes, and whitelist: yes are set.

Changes:

  • Adds a new canonicalize_existing boolean configuration option with a default value of False
  • Updates the genre resolution logic to canonicalize existing genres before returning them when the new flag is enabled
  • Adds documentation explaining the new flag's behavior and requirements

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
beetsplug/lastgenre/init.py Adds the canonicalize_existing config option and implements canonicalization logic for existing genres
docs/plugins/lastgenre.rst Documents the new canonicalize_existing configuration option
docs/changelog.rst Adds changelog entry for the new feature
test/plugins/test_lastgenre.py Adds test case to verify canonicalization of existing genres

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@codecov
Copy link

codecov bot commented Jan 24, 2026

Codecov Report

❌ Patch coverage is 77.77778% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.96%. Comparing base (83f1671) to head (40675ae).
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
beetsplug/lastgenre/__init__.py 77.77% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #6317   +/-   ##
=======================================
  Coverage   68.96%   68.96%           
=======================================
  Files         140      140           
  Lines       18685    18693    +8     
  Branches     3056     3058    +2     
=======================================
+ Hits        12886    12892    +6     
- Misses       5155     5156    +1     
- Partials      644      645    +1     
Files with missing lines Coverage Δ
beetsplug/lastgenre/__init__.py 70.25% <77.77%> (+0.14%) ⬆️
🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@Nukesor Nukesor changed the title Canonicalize existing [LastGenre] Canonicalize existing genres Jan 24, 2026
@Nukesor Nukesor force-pushed the canonicalize-existing branch 2 times, most recently from 1c99cf1 to c796161 Compare January 29, 2026 19:32
@JOJ0 JOJ0 force-pushed the canonicalize-existing branch from c796161 to c5eb470 Compare January 31, 2026 12:32
@JOJ0
Copy link
Member

JOJ0 commented Jan 31, 2026

Hm interesting feature and useful, but I'm not sure if this should maybe be called something containing the word "cleanup". It might even include whitelist filtering without canonicalization too.

I see it as a "cleanup existing only" kind of functionality.

And are you sure this can't be included in the force branch somehow? We are moving away from force:no being what one would expect from it. let's brainstorm some more here please

@JOJ0
Copy link
Member

JOJ0 commented Feb 1, 2026

@Nukesor I adjusted my wording above slightly. I think it was weird. Maybe still is...

@Nukesor
Copy link
Contributor Author

Nukesor commented Feb 3, 2026

I didn't read your first response, but the edited doesn't seem weird at all :D

And are you sure this can't be included in the force branch somehow?

The force-branch already canonicalizes local genres (if keep_existing is set), so that case is already covered. Right now, it's just not possible to canonicalize genres in non-force mode.

I see it as a "cleanup existing only" kind of functionality.

It's more of a "cleanup stuff from lastfm, but please also cleanup local stuff that isn't force-overwritten/merged" :D

Regarding "cleanup", I would be up for it, although I think that canonicalization better conveys what actually happens. If we were to change the wording, we should do it for the whole plugin though.

We are moving away from force:no being what one would expect from it.

Neat. How would that look like?
The current flag handling is definitely a bit confusing. A different approach to configuration would probably be easier to understand and use. Although I'm not sure how that would look like yet. Maybe enum-based configuration, where sensible combinations of behavior are given a dedicated name?

@JOJ0 JOJ0 changed the title [LastGenre] Canonicalize existing genres Lastgenre: Canonicalize existing genres Feb 8, 2026
@JOJ0
Copy link
Member

JOJ0 commented Feb 8, 2026

I didn't read your first response, but the edited doesn't seem weird at all :D

And are you sure this can't be included in the force branch somehow?

The force-branch already canonicalizes local genres (if keep_existing is set), so that case is already covered. Right now, it's just not possible to canonicalize genres in non-force mode.

Yes you are right, my bad, I didn't think it through.

Still: We are introducing an option that allows the user / beets to manipulate something even though they said: DO NOT FORCE, that is why I want to be supercareful to make this most obvious and understandable.

I see it as a "cleanup existing only" kind of functionality.

It's more of a "cleanup stuff from lastfm, but please also cleanup local stuff that isn't force-overwritten/merged" :D

You are right

Regarding "cleanup", I would be up for it, although I think that canonicalization better conveys what actually happens. If we were to change the wording, we should do it for the whole plugin though.

We are moving away from force:no being what one would expect from it.

Neat. How would that look like? The current flag handling is definitely a bit confusing. A different approach to configuration would probably be easier to understand and use. Although I'm not sure how that would look like yet. Maybe enum-based configuration, where sensible combinations of behavior are given a dedicated name?

I agree that it could be easier and we discussed it way back when -k was introduced and the initial force behaviour was fixed. An option like --mode overwrite, --mode combine, and so on was on the table but I was against it because it felt tedious from a cli usability point of view. Using beet lastgenre -f -k term or beet lastgenre -f -K seemed more practcally usable.

Each way has it's strenghtes though, I agree, but still I don't want to change the usability concept now again. It's established and well documented (I hope!) now.

What I'm trying to say with a cleanup option is something like this:

beet lastgenre -F --cleanup-existing -> whitelist filtering and/or canonicalization (and in the future normalizing of spelling / applying of regex aliases) will take place if configured.

A shortform -c would be available.

No better idea for option naming currently, please help me out! :-)

--help could state something like this:

  -c, --cleanup-existing   even with --no-force provided, existing genres get cleaned up (whitelist, canoncialitation, ...)

maybe make this shorter more concise help text...

We should not restrict this no-force-cleanup to canonicalization only.

@JOJ0 JOJ0 added lastgenre plugin Pull requests that are plugins related labels Feb 11, 2026
@Nukesor
Copy link
Contributor Author

Nukesor commented Feb 18, 2026

Damn. I started writing an answer and then got sidetracked. Sorry for the long wait.

beet lastgenre -F --cleanup-existing -> whitelist filtering and/or canonicalization (and in the future normalizing of spelling / applying of regex aliases) will take place if configured.

I see. In that context, cleanup makes a lot more sense. I like the idea where lastgenre is heading! Rather than a "Pull genres from lastfm", it feels more like a general canonicalization/cleanup/completion tool, which is pretty much what I'm using it for :D.

Tbh. now that I know the planned changes, I'm perfectly fine with --cleanup-existing. It'll be a bit confusing in the beginning, but it'll make sense in the long term.

So just to double-check, regarding this PR my todos would be:

  • Rename the config flag
  • Anything else :D?

We should not restrict this no-force-cleanup to canonicalization only.

What other cleanup would be there to perform? The current impl does _try_resolve_stage, which performs the other cleanup as well, right?

@Nukesor Nukesor force-pushed the canonicalize-existing branch from c5eb470 to 16d4df8 Compare February 18, 2026 16:25
@Nukesor
Copy link
Contributor Author

Nukesor commented Feb 18, 2026

btw: The CI is complaining about docstring formatting.
But when I'm running the lint checks and on my machine it looks perfectly fine:

docstrfmt docs README_kr.rst CONTRIBUTING.rst CODE_OF_CONDUCT.rst README.rst
121 files was checked.

I'm using docstrfmt v2 though, as that's the only thing running with python 3.14
See: #6377

@snejus
Copy link
Member

snejus commented Feb 18, 2026

I'm using docstrfmt v2 though, as that's the only thing running with python 3.14

Beets does not support 3.14 yet - use 3.13 instead and install the dependencies with poe install to make sure they are in sync.

@JOJ0 JOJ0 changed the title Lastgenre: Canonicalize existing genres Lastgenre: Cleanup existing genres Feb 18, 2026
# No fallback configured.
return None, "fallback unconfigured"

# Without force pre-populated tags are returned as-is.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please adjust this comment: without force and ...

Copy link
Member

@JOJ0 JOJ0 Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or maybe remove it here and extend the larger comment above, right after the condition to include a concise description right at the entry point of the if genres and force not true) branch

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept the split, but adjusted the wording to match the code flow

Copy link
Member

@snejus snejus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work on this! One thought on the condition here - if the user has set cleanup_existing, they've already opted in, so silently doing nothing when whitelist or canonical aren't set seems like a footgun.

_try_resolve_stage already handles whether whitelist filtering or canonicalization applies based on its own config, so wouldn't the following be sufficient?

            if self.config["cleanup_existing"]:

@JOJ0
Copy link
Member

JOJ0 commented Feb 18, 2026

We should not restrict this no-force-cleanup to canonicalization only.

What other cleanup would be there to perform? The current impl does _try_resolve_stage, which performs the other cleanup as well, right?

whitelist checking as mentioned in the inline comments. and in the future maybe normalizing genre names. eg. DnB -> Drum And Bass, and removing blacklisted genres.

@Nukesor Nukesor force-pushed the canonicalize-existing branch 2 times, most recently from 394a76a to f5f6346 Compare February 20, 2026 14:51
@Nukesor
Copy link
Contributor Author

Nukesor commented Feb 20, 2026

@JOJ0 @snejus I adjusted according your review :)

The tests fail now however. From what I can see, this is an issue with the way configuration is handled in beets. Test state (specifically the config values) persist once set.

It looks like each plugin has a view to the global persisted beets.config field? As a result, variables that aren't explicitly overwritten are persisted in that global config view.

Should this be fixed in a specific way or should I just work around this? The tests can be "fixed" by adding a cleanup_existing: False to the test fixture right after the one that sets cleanup_existing: True.

@Nukesor Nukesor force-pushed the canonicalize-existing branch 2 times, most recently from 37b001a to e1791e4 Compare February 20, 2026 15:30
@Nukesor
Copy link
Contributor Author

Nukesor commented Feb 20, 2026

I went ahead and added a commit on top that exposes the default config as a static method and uses that default config to reset the state in between fixture calls.

There were 5 tests that depended on count: 10 being set on previous test fixtures, which I adjusted accordingly.

Let me know what you think of that change :)

@Bergbok
Copy link

Bergbok commented Feb 20, 2026

Hey, thanks for your work on this @Nukesor !

One thing I'd like to see changed is it being untied from force: no, I feel like it's a bit confusing.
It's changing force's behavior unnecessarily.

I'd much prefer another solution. What if running beet lastgenre --cleanup-existing ran through your library's genres without interacting with Last.fm and updated them based on your whitelist/canonicalization tree? This would be very useful for when you modify your whitelist or tree file and want to quickly apply those changes to your library without re-fetching from Last.fm.

You added a section in the docs for the new option but I think you should probably also update the Handling pre-populated tags section and force's description and if we're going this route.

@Nukesor
Copy link
Contributor Author

Nukesor commented Feb 21, 2026

Hey @Bergbok

If I understood you correctly, --cleanup-existing would then imply a --no-force and behave somewhat like a subcommand.

IMO, that behavior would makes sense, if we were to have multiple subcommands that represent different "stages".

  • lastgenre fetch -> Pull metadata from lastfm, potentially overwriting or merging existing genres, depending on --keep-existing and --force
  • lastgenre cleanup -> Go through all genres and normalize them (capitalization, aliasing, whitelisting, canonicalization)

Right now, existing tags are also "cleaned up" in force: yes + keep-existing: yes mode, which is why I decided to go for another flag. Having lastgenre --keep-existing behave like a subcommand that cleans up all existing genres, but then also having local genres cleaned up in another mode is a bit confusing.

Is there a way to easily introduce subcommands to plugins?
I guess a split into such subcommands would also require a quite large refactoring, as the fetch+merge and normalization logic should then probably be separated and those two processes are currently very closely tied together.

@JOJ0
Copy link
Member

JOJ0 commented Feb 21, 2026

I went ahead and added a commit on top that exposes the default config as a static method and uses that default config to reset the state in between fixture calls.

There were 5 tests that depended on count: 10 being set on previous test fixtures, which I adjusted accordingly.

Let me know what you think of that change :)

nice catch!!! would it be too much to ask if you could split out this test refactor/fix into a separate PR. We have so much lastgenre stuff in flight currently and it would be easier for us that way! thank you very much!! 🙏

@Nukesor
Copy link
Contributor Author

Nukesor commented Feb 21, 2026

@JOJ0 created #6386 :)
I left this MR on top of the other one so the tests pass :)

(Did a bit of a oopsie on the rebase+cherrypick, but the history should be clean now again)

@Nukesor Nukesor force-pushed the canonicalize-existing branch 4 times, most recently from 02d51c4 to 51db511 Compare February 21, 2026 15:35
Fixes a bug in the lastgenre, where test state bled into the following fixtures.

Each plugin has a view to the global persisted beets.config field. As a result,
variables that aren't explicitly overwritten are persisted in that global config view.

This commit exposes the default config as a static method and uses that default
config to reset the state in between fixture calls.

There were 5 tests that depended on count: 10 being set on previous test fixtures,
which I adjusted accordingly.
Introduce a new lastgenre `cleanup_existing` flag.

It handles the case where canonicalization is desired on existing tags.
The new logic triggers if:
- `force`: False
- `cleanup_existing: True

Depending on whether `whitelist: True` or `canonical: True`, the genres
are then canonicalized and/or whitelisting is applied
@Nukesor Nukesor force-pushed the canonicalize-existing branch from 51db511 to 40675ae Compare February 21, 2026 15:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lastgenre plugin Pull requests that are plugins related

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow canonicalization of existing genres

5 participants