Support matching bitstrings that are in fact binaries (size divisible by 8)#1978
Support matching bitstrings that are in fact binaries (size divisible by 8)#1978mat-hek wants to merge 5 commits intoatomvm:mainfrom
Conversation
bb34fe8 to
7bb58f2
Compare
576fb54 to
ffcc68d
Compare
…408) Closes #393 Once atomvm/AtomVM#1978 is merged and downstreamed, the eval_bits patch won't be needed (unless there are other unrelated patches in there OFC)
ffcc68d to
89e5cf8
Compare
5aa8f8c to
c2b0ede
Compare
bettio
left a comment
There was a problem hiding this comment.
I think there is still an issue.
build-and-test (arm-linux-gnueabihf-gcc, -mcpu=cortex-a7 -mfloat-abi=hard -O2 -mthumb -mthumb-int is failing and this tells me there is a problem with JIT implementation.
We have to fix this before merging this PR. Maybe @pguyot has some advice here.
c2b0ede to
6c9843d
Compare
bettio
left a comment
There was a problem hiding this comment.
Tests keep failing:
test_binary_to_term:
CRASH
======
pid: <0.1.0>
Stacktrace:
[{test_binary_to_term,test_function,0,[{file,"/home/runner/work/AtomVM/AtomVM/tests/erlang_tests/test_binary_to_term.erl"},{line,316}]},{test_binary_to_term,start,0,[{file,"/home/runner/work/AtomVM/AtomVM/tests/erlang_tests/test_binary_to_term.erl"},{line,170}]}]
cp: #CP<module: 0, label: 52, offset: 2272>
x[0]: error
x[1]: {badmatch,<<1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,0,0,0,1,119,19,116,101,115,116,95,98,105,110,97,114,121,95,116,111,95,116,101,114,109,97,5,98,5,26,215,111,88,119,13,110,111,110,111,100,101,64,110,111,104,111,115,116,0,0,0,0,0,0,0,0,0,0,0,0,97,2>>}
x[2]: {2,2,74,1,[{0,8240},{0,27350}],error}
Stack
-----
<<88,119,13,110,111,110,111,100,101,64,110,111,104,111,115,116,0,0,0,0,0,0,0,0,0,0,0,0>>
85645167
4
85
21
<<119,19,116,101,115,116,95,98,105,110,97,114,121,95,116,111,95,116,101,114,109>>
<<131,108,0,0,0,2,112,0,0,0,85,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,0,0,0,0,119,19,116,101,115,116,95,98,105,110,97,114,121,95,116,111,95,116,101,114,109,97,4,98,5,26,215,111,88,119,13,110,111,110,111,100,101,64,110,111,104,111,115,116,0,0,0,0,0,0,0,0,0,0,0,0,112,0,0,0,87,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,0,0,0,1,119,19,116,101,115,116,95,98,105,110,97,114,121,95,116,111,95,116,101,114,109,97,5,98,5,26,215,111,88,119,13,110,111,110,111,100,101,64,110,111,104,111,115,116,0,0,0,0,0,0,0,0,0,0,0,0,97,2,106>>
#CP<module: 0, label: 2, offset: 2760>
#CP<module: 0, label: 0, offset: 5424>
Mailbox
-------
Monitors
--------
**End Of Crash Report**
Expected 0 but result is not an integer
test_binary_to_term:FAILED
|
@bettio I'll try looking into it. There's also https://github.com/atomvm/AtomVM/actions/runs/20231091694/job/58073962162?pr=1978#step:18:596 failing, with disabled JIT 🤔 Instead of |
libs/jit/src/jit.erl
Outdated
| end, | ||
| MSt12 = cond_jump_to_label({{free, SizeReg}, '<', BSOffsetReg1}, Fail, MMod, MSt11), | ||
| {MSt12, Size bsl 4}; | ||
| {MSt12, (Size * Unit) div 8}; |
There was a problem hiding this comment.
You still need to bsl 4 here as well.
There was a problem hiding this comment.
Actually it seems there is a bug in this opcode. Working on it.
There was a problem hiding this comment.
See #2113
You may want to rebase once it is merged.
libs/jit/src/jit.erl
Outdated
| ctx, jit_state, offset, ?UNSUPPORTED_ATOM | ||
| ]); | ||
| true -> | ||
| MMod:sub(MSt10, SizeReg, (Size * Unit) div 8) |
There was a problem hiding this comment.
You still need to bsl 4 the (Size * Unit) div 8. You could write (Size * Unit) bsl 1.
Indeed, SizeReg contains the size in bytes as a term ((Imm bsl 4) bor 16#F).
We could add a comment here to clarify.
… by 8) Signed-off-by: Mateusz Front <mateusz.front@swmansion.com>
6c9843d to
9a1b28e
Compare
|
Per AMP https://ampcode.com/threads/T-019cafb8-53f3-71de-9d8e-6def785206e8: all caveats apply, the recommended fixes looks legit. PR Review: Support matching bitstrings that are in fact binariesCommits: Oracle Code Review Summary🔴 Bug 1 — Interpreter sub-binary offset (
|
libs/jit/src/jit.erl
Outdated
| {MSt11, SizeReg}; | ||
| is_integer(Size) -> | ||
| % SizeReg is binary size | ||
| % SizeReg contains binary size in bytes as a term |
There was a problem hiding this comment.
I don't think this is true.
SizeReg is just the size in bytes AFAIK, not a term. We got it from get_array_element(...,1): it's the second word of the boxed binary.
There was a problem hiding this comment.
Sorry, I accidentally pushed a half baked thing. But I got suggested by
Indeed, SizeReg contains the size in bytes as a term
libs/jit/src/jit.erl
Outdated
| MSt11 = MMod:sub(MSt10, SizeReg, SizeBytes), | ||
| MSt11 = | ||
| if | ||
| (Size * Unit) rem 8 =/= 0 -> |
There was a problem hiding this comment.
Size is a tagged integer so you don't want to multiply it by Unit.
Since Unit may not be 8, you want to rename SizeBytes into SizeInUnits.
libs/jit/src/jit.erl
Outdated
| ]); | ||
| true -> | ||
| % Equivalent of SizeReg - (((SizeBytes * Unit) div 8) bsl 4) | ||
| MMod:sub(MSt10, SizeReg, (SizeBytes * Unit) bsl 1) |
There was a problem hiding this comment.
Since SizeReg is the size in bytes I'm not sure this is correct
|
had tokens surplus - llm claims these two commits: https://github.com/petermm/AtomVM/commits/test1978/ all caveats apply. |
Fix several issues in the JIT OP_BS_GET_BINARY2 code generation
introduced by the bitstring-as-binary matching support.
- Rename `SizeBytes` to `SizeInUnits` to reflect that `Size bsr 4`
gives the count in units, not bytes. The actual byte count is
`(SizeInUnits * Unit) div 8`.
- Use `SizeInUnits * Unit` (not `Size * Unit`) in the alignment check,
since `Size` is a tagged integer and must be untagged first.
- Subtract raw bytes `(SizeInUnits * Unit) div 8` from `SizeReg`
instead of using `bsl 1`, since `SizeReg` holds a raw byte count
(not a tagged term).
- Add fast-path for `Unit =:= 8` that skips the `mul`/`if_block`/
`shift_right` sequence entirely, since size-in-units already equals
size-in-bytes. This avoids unnecessary codegen on ARM32 Thumb where
register pressure is high.
- Fix register leak for `Unit =/= 8`: pass `{free, SizeValReg}` to
`shift_right/3` so it reuses the register in-place instead of
allocating a new one. Without this, `SizeValReg` (holding bits)
remains allocated while the new result register is never freed,
which can exhaust available GPRs on ARM32 and cause incorrect
code generation.
Fix tagged/raw integer confusion in JIT OP_BS_GET_BINARY2
constant-size path.
- SizeReg holds raw bytes, not a tagged term. Subtract raw bytes
`(SizeBytes * Unit) div 8` instead of `(SizeBytes * Unit) bsl 1`.
- Use untagged `SizeBytes` in the alignment check instead of tagged
`Size`, since `Size * Unit` on a tagged integer gives a wrong result.
- Fix comment: SizeReg is "raw bytes (not a term)".
Signed-off-by: Peter M <petermm@gmail.com>
Signed-off-by: Mateusz Front <mateusz.front@swmansion.com>
Expand test_bs_match_bitstring_modifier with additional coverage for edge cases identified during code review: - Non-zero offset with dynamic size - Size=0 (empty binary match) - Negative dynamic size (should error, not crash) - Constant size 16/bitstring (compile-time path) - Non-zero offset with constant size These exercise both the JIT constant-size (is_integer) and dynamic-size (runtime variable) code paths, as well as the interpreter's bs_offset byte-alignment logic. Signed-off-by: Peter M <petermm@gmail.com> Signed-off-by: Mateusz Front <mateusz.front@swmansion.com>
|
Thanks @petermm - it looks fine to me, I cherry-picked these commits with minor changes |
| MMod:free_native_registers(BSt4, [SizeValReg]) | ||
| {BSt2, SizeValReg2} = | ||
| if | ||
| is_integer(SizeValReg) -> |
There was a problem hiding this comment.
I believe this cannot happen.
In some cases, there's generic code that uses the
bitstringmodifier because it needs to work for both bitstrings and binaries, for example in erl_eval. Currently such code doesn't work even if no actual bitstrings are involved, and this PR attempts to fix it. In other words, this now works:TODO:
jit.erl