GH-1038 Trim object memory for ArrowBuf #1044

Open
lriggs wants to merge 4 commits into apache:main from lriggs:arrowBufMemory

lriggs commented Feb 27, 2026

What's Changed

Certain workloads create a very large number of ArrowBuf and BufferLedger objects. Saving a few bytes per instance can add up to a meaningful reduction in heap usage, allocation cost, and garbage collection work.

The id field, a sequential value used only when logging object information, is replaced with the identity hash code. This should still give enough information for debugging without the per-instance memory overhead. Identity hash codes are not guaranteed to be unique, so two objects may occasionally report the same value, but that shouldn't matter for logging purposes.
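As a rough illustration of the id change, the sketch below uses a hypothetical `Buf` class (a stand-in, not the real ArrowBuf) whose `getId()` is derived from `System.identityHashCode` instead of a stored sequential field. The class and method layout are assumptions made for the example only.

```java
public class IdentityIdSketch {
    /** Hypothetical stand-in for a buffer class; not the real ArrowBuf. */
    public static final class Buf {
        // No per-instance id field and no static AtomicLong generator:
        // the id is computed on demand from the object's identity hash.
        public long getId() {
            return System.identityHashCode(this);
        }

        @Override
        public String toString() {
            return "Buf[" + getId() + "]";
        }
    }

    public static void main(String[] args) {
        Buf a = new Buf();
        Buf b = new Buf();
        // The value is stable for the lifetime of each object, but two
        // distinct objects are not guaranteed to get different values.
        System.out.println(a + " " + b);
        System.out.println("stable: " + (a.getId() == a.getId()));
    }
}
```

The trade-off is that ids no longer reflect allocation order, so log lines can be correlated per object but not sorted by creation time.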

Each atomic field can be replaced by a volatile primitive field plus a static field updater shared by all instances, which saves several bytes per instance.
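A minimal sketch of that pattern, assuming a hypothetical `Ledger` class with a `refCnt` field (the names are illustrative and not taken from the PR diff). It uses the standard `java.util.concurrent.atomic.AtomicIntegerFieldUpdater`, so each instance carries only a 4-byte `int` rather than a reference to a separate `AtomicInteger` object.

```java
import java.util.concurrent.atomic.AtomicIntegerFieldUpdater;

public class RefCountSketch {
    /** Hypothetical ledger-like class; names are illustrative only. */
    public static final class Ledger {
        // One updater per class, shared by every instance.
        private static final AtomicIntegerFieldUpdater<Ledger> REF_CNT =
            AtomicIntegerFieldUpdater.newUpdater(Ledger.class, "refCnt");

        // The updater requires the target field to be a volatile int.
        private volatile int refCnt;

        public int retain() {
            return REF_CNT.incrementAndGet(this);
        }

        public int release() {
            return REF_CNT.decrementAndGet(this);
        }

        public int count() {
            return refCnt;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Ledger ledger = new Ledger();
        Runnable work = () -> {
            for (int i = 0; i < 10_000; i++) {
                ledger.retain();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        // Updates go through the atomic updater, so no increments are lost.
        System.out.println(ledger.count()); // prints 20000
    }
}
```

The updater gives the same atomic semantics as `AtomicInteger`, at the cost of slightly more verbose call sites.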

ArrowBuf

| Component | Before | After | Savings |
| --- | --- | --- | --- |
| idGenerator (static) | AtomicLong | Removed | 24 bytes globally |
| id field (per instance) | long (8 bytes) | Removed | 8 bytes per instance |
| getId() | Returns id field | Returns System.identityHashCode(this) | |

BufferLedger

| Component | Before | After | Savings |
| --- | --- | --- | --- |
| LEDGER_ID_GENERATOR (static) | AtomicLong | Removed | 24 bytes globally |
| ledgerId (per instance) | long (8 bytes) | Removed | 8 bytes per instance |
| bufRefCnt | AtomicInteger (24 bytes) | volatile int + static updater | 20 bytes per instance |

Total Savings

| Instance count | ArrowBuf | BufferLedger | Combined |
| --- | --- | --- | --- |
| 100K | 800 KB | 2.8 MB | ~3.6 MB |
| 1M | 8 MB | 28 MB | ~36 MB |
| 10M | 80 MB | 280 MB | ~360 MB |
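The totals follow from the per-instance savings listed for each class: 8 bytes per ArrowBuf, and 8 + 20 = 28 bytes per BufferLedger. The sketch below reproduces that arithmetic, assuming equal numbers of ArrowBuf and BufferLedger instances and decimal units (1 MB = 1,000,000 bytes); the class and method names are made up for the example.

```java
public class SavingsEstimate {
    // Per-instance savings taken from the component tables.
    static final long ARROW_BUF_BYTES = 8;   // removed long id
    static final long LEDGER_BYTES = 8 + 20; // removed long ledgerId + bufRefCnt change

    /** Combined bytes saved, assuming one BufferLedger per ArrowBuf. */
    public static long combinedBytes(long instances) {
        return instances * (ARROW_BUF_BYTES + LEDGER_BYTES);
    }

    public static void main(String[] args) {
        long[] scales = {100_000L, 1_000_000L, 10_000_000L};
        for (long n : scales) {
            System.out.println(n + " instances: " + combinedBytes(n) + " bytes saved");
        }
    }
}
```

In practice one ledger is often shared by several buffers, so the BufferLedger column is probably best read as an upper bound for a given ArrowBuf count.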

Benchmarking

I ran the added benchmark before and after the metadata trimming.

Metadata Trimmed

| Benchmark | Mode | Score | Error | Units |
| --- | --- | --- | --- | --- |
| MemoryFootprintBenchmarks.measureAllocationPerformance | avgt | 456.831 | ± 36.059 | us/op |
| MemoryFootprintBenchmarks.measureArrowBufMemoryFootprint | ss | 161.085 | ± 35.596 | ms/op |
| Created 100000 ArrowBuf instances. Heap memory used | sum | 35631520 bytes (33.98 MB) | 0 bytes | |
| Average memory per ArrowBuf | sum | 356.32 bytes | 0 bytes | |

Previous Object Layout

| Benchmark | Mode | Score | Error | Units |
| --- | --- | --- | --- | --- |
| MemoryFootprintBenchmarks.measureAllocationPerformance | avgt | 466.171 | ± 16.233 | us/op |
| MemoryFootprintBenchmarks.measureArrowBufMemoryFootprint | ss | 176.790 | ± 17.943 | ms/op |
| Created 100000 ArrowBuf instances. Heap memory used | sum | 38817480 bytes (37.02 MB) | 0 bytes | |
| Average memory per ArrowBuf | sum | 388.17 bytes | 0 bytes | |
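For comparison with the estimate, the measured heap figures from the two runs can be differenced directly. The sketch below does only that arithmetic on the reported totals; the class name is hypothetical, and the result comes from a single pair of runs, so it is indicative rather than precise.

```java
public class MeasuredSavings {
    static final long INSTANCES = 100_000L;
    static final long BEFORE_BYTES = 38_817_480L; // "Previous Object Layout" run
    static final long AFTER_BYTES = 35_631_520L;  // "Metadata Trimmed" run

    public static long savedBytes() {
        return BEFORE_BYTES - AFTER_BYTES;
    }

    public static double savedBytesPerInstance() {
        return (double) savedBytes() / INSTANCES;
    }

    public static double savedPercent() {
        return 100.0 * savedBytes() / BEFORE_BYTES;
    }

    public static void main(String[] args) {
        System.out.println("saved bytes total: " + savedBytes());
        System.out.println("saved bytes per ArrowBuf: " + savedBytesPerInstance());
        System.out.println("saved percent: " + savedPercent());
    }
}
```

This works out to roughly 32 bytes per ArrowBuf, or about 8% of the previous footprint, which is in the same range as the 36 bytes per buffer and ledger pair estimated from the field sizes.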

Closes #1038.
