escape data strings according to WebAssembly spec , Fixes #37#947
escape data strings according to WebAssembly spec , Fixes #37#947RayanVSS wants to merge 2 commits intoOCamlPro:mainfrom
Conversation
| @@ -0,0 +1,8 @@ | |||
| test data special chars round-trip: | |||
| $ owi fmt data_special_chars.wat > /tmp/owi_test_output.wat | |||
There was a problem hiding this comment.
Putting the file in /tmp is likely going to cause various issues. Doing ... > ./owi_test_output.wat is fine (dune takes care of having the files in the right place).
| (memory 1) | ||
| (data (memory 0) (offset i32.const 0) "hello\n\t\u{0d}\"\'\\world") | ||
| ) | ||
| $ rm /tmp/owi_test_output.wat |
There was a problem hiding this comment.
no need to do this if you put it in the current directory
| | '\\' -> string fmt "\\\\" | ||
| | c -> | ||
| let ci = Char.code c in | ||
| if 0x20 <= ci && ci < 0x7f then char fmt c else pp_hex_char fmt c |
There was a problem hiding this comment.
Can be rewritten as:
| '\x20' .. '\x7e' as c -> char fmt c
| c -> pp_hex_char fmt c | in | ||
| let pp_unicode_char fmt = function | ||
| | (0x09 | 0x0a) as c -> pp_char fmt (Char.chr c) | ||
| | uc when 0x20 <= uc && uc < 0x7f -> pp_char fmt (Char.chr uc) |
There was a problem hiding this comment.
Same here, you could use a character interval for the pattern (and it's likely not necessary to use integers for this function, using char should be enough)
|
Thanks for the review, I've made the suggested changes. |
|
Thanks. Sorry, I forgot about this one... Could you rebase the PR? I tried to convince myself this is actually fixing the issue but I failed to do so. I'd need to spend a little bit of time to check a test that was failing before and is now passing. In particular, you have a roundtrip test (which is nice), but it is not checking if the data are actually the same after the roundtrip! Do you think you can add something like this so I can easily check if this is correct? Otherwise, I suggest not to put generated AI garbage in PR description, this is really not helpful (if you could even remove it and try to write something, even short, you wrote yourself, I'd be more confident about the change). |
Fix for Issue #37: WebAssembly Data Pretty Printing
The Initial Problem
Issue #37 (#37) reported a problem in the pretty-printing of the
datainstruction in WebAssembly Text format.The original code used OCaml's
%Stag to display the initialization string:pf fmt {|(data%a %a %S)|} pp_id_opt d.id Mode.pp d.mode d.initWhy Was This a Problem?
OCaml's
%Stag escapes characters according to OCaml conventions, not according to WebAssembly Text Format conventions. This caused issuesConcrete Example
With the old code, a string containing special characters like
\n,\r,\t, etc. was not escaped correctly, making it impossible to re-read the file.Modifications in
src/ir/text.mlI added two new pretty-printing functions that comply with the WebAssembly Text Format specification:
New Test Files in
test/fmt/To validate the fix, I added three new test files:
data_special_chars.wat: Test with special characters (\n,\t,\r,",',\)data_bytes.wat: Test with raw bytes and non-printable charactersdata_roundtrip.t: Round-trip test to verify format stabilityprint.t: Added test cases for the two files aboveThese tests are automatically executed with
dune runtest test/fmt.Added Functions
1.
pp_name_inner- Correct EscapingThis function implements WebAssembly escaping rules:
\n,\r,\t,\',\",\\are explicitly escaped\xx\u{xx}notation for tab (0x09) and newline (0x0a)2.
pp_name- Wrapper with QuotesSimple wrapper that surrounds the result of
pp_name_innerwith double quotes.3. Modification to
Data.ppChange:
%S→pp_nameto use our custom escaping function.Specification
The solution follows the WebAssembly Text Format specification:
The implemented escaping rules exactly match the spec.
Inspiration
PR #391 (which attempted to solve this problem) referenced a PR in the official WebAssembly repo:
this implementation aligns with these changes.
Non-regression Tests
All existing tests pass:
Stability Property (Idempotence)
The round-trip test verifies the idempotence property:
Case Analysis
I manually tested these cases:
\n\n\r\u{0d}\t\t"\"'\'\\\\xxReferences
Modified and Created Files
Modified Files
src/ir/text.ml: Addedpp_name_innerandpp_namefunctions, modifiedData.pptest/fmt/dune: Added new test files to dependenciestest/fmt/print.t: Added tests for special characters and raw bytesNew Test Files
test/fmt/data_special_chars.wat: Test with special characterstest/fmt/data_bytes.wat: Test with raw bytestest/fmt/data_roundtrip.t: Round-trip test