Fix AssertionError hashing HTML blocks spread over multiple lines (#686)#687

Merged
nicholasserra merged 2 commits into trentm:master from Crozzers:assertionerror-issue686
Mar 12, 2026

@Crozzers
Contributor

This PR fixes #686.

We were using a regex to look for HTML tags, but it did not account for newlines inside a tag. Adding the re.S flag lets the pattern match across line breaks.
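
To illustrate the effect of re.S, here is a minimal sketch. The pattern and sample below are made up for this example and are not the regex used in markdown2; they only show how a tag spread over several lines is missed without the flag.

```python
import re

# Hypothetical pattern (NOT the one in markdown2): match an opening tag
# and capture its name.
TAG_PATTERN = r'<(\w+).*?>'

# An opening tag whose attributes start on a new line.
sample = '<p\nclass="note">\nhello\n</p>'

# By default "." does not match "\n", so the tag is never completed.
without_flag = re.match(TAG_PATTERN, sample)

# With re.S (re.DOTALL), "." also matches "\n" and the whole tag is found.
with_flag = re.match(TAG_PATTERN, sample, re.S)

print(without_flag)        # None
print(with_flag.group(1))  # p
```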

@Crozzers Crozzers force-pushed the assertionerror-issue686 branch from e1b8dff to a53a611 Compare March 11, 2026 20:19
@Crozzers
Contributor Author

Crozzers commented Mar 11, 2026

Root cause seems to be bf0a1e2 where the _tag_is_closed helper was changed.
Previously it would do a simple check for number of opening tags == closing tags, whereas now it checks that the closing tag is AFTER the opening one.

In both iterations, the simple tag-check regex is tripped up by the given sample, so the function thinks that the tag is "closed". However, the extra diligence in the new version catches that:

        close_index = text.find(f'</{tag_name}')
        open_index = text.find(f'<{tag_name}')
        # text = '<p\n'  - so open and close index == -1
        return open_index != -1 and close_index != -1 and open_index < close_index
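
The difference between the two checks can be sketched as follows. This is a simplified illustration, not the library's code: the function names closed_by_count and closed_by_position are hypothetical, and the count-based version uses plain substring counts as an assumption about how the older check behaved.

```python
def closed_by_count(text: str, tag_name: str) -> bool:
    """Older-style check (assumed): same number of opening and closing tags."""
    return text.count(f'<{tag_name}') == text.count(f'</{tag_name}')


def closed_by_position(text: str, tag_name: str) -> bool:
    """Newer-style check: both tags exist and the closing one comes later."""
    close_index = text.find(f'</{tag_name}')
    open_index = text.find(f'<{tag_name}')
    return open_index != -1 and close_index != -1 and open_index < close_index


# Both checks agree on a well-formed element.
print(closed_by_count('<p>hi</p>', 'p'), closed_by_position('<p>hi</p>', 'p'))

# Neither tag is found: 0 == 0 passes the count check, but both indexes
# are -1, so the positional check rejects it.
print(closed_by_count('plain', 'p'), closed_by_position('plain', 'p'))

# Closing tag before the opening tag: the counts match, the order does not.
print(closed_by_count('</p> text <p>', 'p'), closed_by_position('</p> text <p>', 'p'))
```

The stricter check means that text a looser caller treats as "closed" can now be reported as not closed, which is consistent with the mismatch described above surfacing as an error.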

@nicholasserra
Collaborator

Thank you!

@nicholasserra nicholasserra merged commit 29a9d78 into trentm:master Mar 12, 2026
15 checks passed

Development

Successfully merging this pull request may close these issues.

AssertionError from markdown2.markdown when an opening html tag is multiline, and the closing tag is on yet another line
