optimizations#190
Open
HoneyryderChuck wants to merge 28 commits into
Open
Conversation
this removes one less extra string compared to the previous strategy
this replaces multiple calls to #read_str (which allocates a string per byte popped) by an accounting loop using String#getbyte and a single #read_str call at the end algo should be roughly the same speed, with less GC pressure as a result
this comes at the cost of having an extra global hash, albeit a small one. to compensate, DEFINE_SETTINGS was turned into an array. there's the space saving, but Array#include?, despite O(n), should be more performant than Hash#key.
frame containers would store flag information as arrays of symbols, which would then be passed up and down the encoding/decoding chain in that format before being translated to/from the actual bits that are encoded into the buffer. this would cause some overhead, given that 1) arrays of symbols generate more GC pressure than raw integers (tagged objects); 2) array operations vs bitwise operations; 3) translation overhead (and given it used bit positions rather than the bit representation, bit shifting). this approach is replaced by carrying the bits around, which solves all problems, at the expense of making the flag less readable when exposed to the end user via the frame_sent/received callbacks. for that reason, one can consider it a "regression", but IMO this is an internal representation that end users shouldn't be relying on for anything.
this seems like a leftover from a WIP version of the HTTP/2 RFC that never made it to the final spec
avoids the whole array resizing logic
while ruby allows sharing the underlying buffer string when slicing to the end of the string, it still allocates the 40b extra ruby objects, and the code doesn't get more complicated through it, so why not
…id O(n)'ing on every size check
…ve operations the prior version of this would append the huffman string THEN add the length prefix via String#insert, which needs to resize the string in the middle thereby doing a memmove operation, which is bad. this avoids it by either appending the length prefix before the string when possible, pre-append a null byte then change it after generating the huffman string, and in the rare case where the length may require more than one byte, remove the appended byte and use the old strategy relying on String#insert, because we can't have nice things
by keeping a side hash which keeps the offset of matched entries, lookups don't require table scans anymore, which makes this more scalable across different dynamic table sizes (at the expense of an extra container where to store things)
no intermediate 1 char string
the buffer will be already force-encoded after the method is called
these were identified while running rbs runtime tests against the record union type for frames, something which is still making steep block... but is nevertheless where this should land
…n already encoded settings list
which support appending byte numbers
the resulting hash from merge was bigger than it needed to be
…closed management with a twist though; instead of traversing the whole hash, break the loop as early as possible, thereby having the same algo advantage of Hash#delete_while while not having to tear down the list each time
…ning max size chunks this uses #read_str helper, which will use String#bytesplice when available
some of the functions are raising type errors instead of the error which would close the connection or stream.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
this is a series of optimizations around memory usage / doing less in a series of path both hot and cold.
The main change is how flags are represented in frames: instead of arrays of symbols, like
[:end_stream], they're now integers representing bitflags, i.e.1 (0000 0001). It's a semi-breaking change, in the sense that frames get exposed to end users viaon(:frame_received | :frame_sent)callbacks, but it's internal representation regardless.