- Heap-backed reader: Read sparkey files into JVM heap byte[] arrays instead of memory-mapped files. Useful when the JVM has a large unused heap but limited page cache. Open is slower (must read entire file), but avoids page cache pressure entirely. Heap-backed lookups are 2-11% slower than mmap when data is resident in memory.
- SparkeyReaderBuilder: Fluent builder API for configuring readers: Sparkey.reader().file(f).useHeap(true).open(). Supports explicit indexFile()/logFile() for non-standard file names. Options: useHeap(), singleThreaded(), poolSize().
- SparkeyImplSelector consolidation: Internal reader selection reduced to a single open(builder) method, overridden via MRJAR for Java 22+.
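
The trade-off behind the heap-backed reader can be illustrated with plain JDK calls. This is a minimal sketch and not Sparkey's implementation: `HeapVsMmap` and its methods are hypothetical names, and `Files.readAllBytes` only handles files that fit in a single array.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration; not part of the Sparkey API.
final class HeapVsMmap {
    // Heap-backed: the whole file is copied into a byte[] when opened.
    static byte[] readIntoHeap(Path file) {
        try {
            return Files.readAllBytes(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Memory-mapped: pages are faulted in lazily and live in the OS page cache.
    static MappedByteBuffer mapReadOnly(Path file) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the given bytes to a fresh temporary file and returns its path.
    static Path writeSample(byte[] data) {
        try {
            Path p = Files.createTempFile("heap-vs-mmap", ".bin");
            Files.write(p, data);
            return p;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Both views expose the same bytes; the difference is when the I/O happens and which memory budget (heap or page cache) pays for it.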
- mlock support: Added INDEX_MLOCK, LOG_MLOCK, ALL_MLOCK to LoadMode for pinning pages in RAM via mlock(2), preventing eviction under memory pressure. Requires Java 22+ and --enable-native-access=ALL-UNNAMED. Falls back silently to advisory prefetch if unavailable. Use LoadResult.locked() to check whether mlock succeeded.
- Public LoadResult API: LoadResult.completed(), create(), and combine() are now public, enabling external composition (e.g. aggregating results from multiple sharded readers).
- LoadResult.lockedFuture(): Returns a CompletableFuture<Boolean> for async composition on the locked state.
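
As an illustration of async composition on the locked state, the sketch below folds several `CompletableFuture<Boolean>` values into one that is true only if every shard was locked. It uses only `java.util.concurrent`; `LockedStates.allLocked` is a hypothetical helper, and in real code the inputs would presumably come from `lockedFuture()` on each shard's result.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical helper for illustration; not part of the Sparkey API.
final class LockedStates {
    // Completes with true only if every input future completes with true.
    // An empty list yields true.
    static CompletableFuture<Boolean> allLocked(List<CompletableFuture<Boolean>> shards) {
        CompletableFuture<Boolean> result = CompletableFuture.completedFuture(true);
        for (CompletableFuture<Boolean> shard : shards) {
            result = result.thenCombine(shard, (a, b) -> a && b);
        }
        return result;
    }
}
```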
- Page cache prefetch API: Added SparkeyReader.load(LoadMode) for best-effort prefetching of mapped file data into memory. This can significantly reduce page faults for large sparkey files on network-attached storage (e.g. GCP Hyperdisk) by requesting the OS to make data resident before random lookups begin.
  - LoadMode.ALL: prefetch both index and log
  - LoadMode.INDEX: prefetch only the index (cheap, typically tens of MB)
  - LoadMode.LOG: prefetch only the log
  - LoadMode.NONE: no-op, for configuration-driven call sites
  - Custom Executor support via load(LoadMode, Executor)
  - Returns LoadResult with isDone(), await(), requestedBytes(), and toCompletableFuture()
  - Default executor: dedicated daemon thread pool (2 threads), configurable via system property sparkey.load.parallelism
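
The general shape of a best-effort prefetch on a daemon thread pool can be sketched with the JDK alone. This is an assumption-laden illustration, not Sparkey's code: `PrefetchSketch` is a hypothetical class, and it relies on `MappedByteBuffer.load()`, which only makes a best effort to bring the mapped pages into memory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;

// Hypothetical helper for illustration; not part of the Sparkey API.
final class PrefetchSketch {
    // Daemon threads, so a pending prefetch never keeps the JVM alive.
    static Executor daemonPool(int threads) {
        return Executors.newFixedThreadPool(threads, r -> {
            Thread t = new Thread(r, "prefetch");
            t.setDaemon(true);
            return t;
        });
    }

    // Maps the file and asks the OS to make it resident.
    // Completes with the number of bytes that were requested.
    static CompletableFuture<Long> prefetch(Path file, Executor executor) {
        return CompletableFuture.supplyAsync(() -> {
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
                buf.load(); // best effort: residency is not guaranteed
                return (long) buf.capacity();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, executor);
    }

    // Creates a temporary file with the given content, for trying the sketch out.
    static Path tempFileOf(byte[] data) {
        try {
            Path p = Files.createTempFile("prefetch", ".bin");
            Files.write(p, data);
            return p;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Callers that only want to warm the cache can ignore the returned future; callers that want to delay traffic until the data is resident can wait on it.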
- Performance fix for network-attached storage: Replaced parkNanos exponential backoff in PooledSparkeyReader with a recursive overflow pool. When all CAS slots are busy (e.g. during slow I/O on Hyperdisk), the reader now lazily creates additional pool capacity instead of sleeping. This fixes a 3x performance regression observed on Google Cloud Hyperdisk when upgrading from 3.2.5 to 3.5.0. JMH benchmark with 200μs simulated I/O shows 4x throughput improvement at 64 threads and 11x at 256 threads.
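
The idea of growing capacity instead of sleeping can be sketched as follows. This is a simplified stand-in, not the actual PooledSparkeyReader: `OverflowPool` is a hypothetical class, and it uses a flat `ConcurrentLinkedQueue` for overflow where the changelog describes a recursive overflow pool.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReferenceArray;
import java.util.function.Supplier;

// Hypothetical sketch for illustration; not the Sparkey implementation.
final class OverflowPool<R> {
    private final AtomicReferenceArray<R> slots;
    private final ConcurrentLinkedQueue<R> overflow = new ConcurrentLinkedQueue<>();
    private final Supplier<R> factory;
    final AtomicInteger created = new AtomicInteger();

    OverflowPool(int size, Supplier<R> factory) {
        this.slots = new AtomicReferenceArray<>(size);
        this.factory = factory;
        for (int i = 0; i < size; i++) {
            slots.set(i, create());
        }
    }

    private R create() {
        created.incrementAndGet();
        return factory.get();
    }

    // Never blocks or sleeps: if every slot is taken, fall back to the
    // overflow queue and, failing that, create a fresh resource.
    R acquire() {
        int n = slots.length();
        int start = (int) (Thread.currentThread().getId() % n);
        for (int i = 0; i < n; i++) {
            R r = slots.getAndSet((start + i) % n, null); // atomic take
            if (r != null) {
                return r;
            }
        }
        R r = overflow.poll();
        return r != null ? r : create();
    }

    // Returns a resource to an empty slot, or to the overflow queue if none is empty.
    void release(R r) {
        for (int i = 0; i < slots.length(); i++) {
            if (slots.compareAndSet(i, null, r)) {
                return;
            }
        }
        overflow.add(r); // keep the extra capacity for later bursts
    }
}
```

With slow I/O, a thread that would otherwise back off and retry gets its own resource immediately, at the cost of extra memory while the burst lasts.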
- Java 22+ optimizations: Added MemorySegment-based implementations for zero-copy I/O and values larger than 2GB. On Java 22+, getValueAsStream() now returns a zero-copy stream that reads directly from memory-mapped files. Performance improvements for uncompressed files:
  - Single-threaded: ~5% faster (469 ns/op vs 493 ns/op on Java 8)
  - 8 threads: 38% faster (400 ns/op vs 646 ns/op)
  - 16 threads: 10% faster (937 ns/op vs 1,045 ns/op)
  - 32 threads: 15% faster (1,919 ns/op vs 2,245 ns/op)

  Compressed files continue to work on all Java versions with no performance regression. Automatically enabled when running on Java 22+; older Java versions use existing implementations.
- Architecture simplification: Separated read-only optimized uncompressed reader (UncompressedSparkeyReaderJ22) from general-purpose reader (SingleThreadedSparkeyReaderJ22), eliminating dual-mode complexity while maintaining performance.
- \>2GB value support: Values larger than 2GB are now supported on Java 22+ via getValueAsStream(). The getValue() method (which returns byte[]) throws IllegalStateException for values exceeding the byte array limit, with guidance to use streaming instead.
- Zero-allocation empty streams: Added EmptyInputStream singleton for DELETE entries and empty values, eliminating unnecessary allocations during log iteration.
- VLQ optimization: Inlined variable-length quantity (VLQ) reads in Java 22+ uncompressed hash lookups, eliminating double-parsing overhead and Entry object allocation.
- Type safety improvements: Added covariant return types to duplicate() methods and changed asSlice() parameters from int to long, eliminating casts and better matching the MemorySegment API.
- Performance baseline (Intel Xeon @ 2.2GHz, 100K entries, Java 25):
  - Uncompressed single-threaded: 469 ns/op
  - Uncompressed 8 threads: 400 ns/op
  - Uncompressed 16 threads: 937 ns/op
  - Compressed SNAPPY 8 threads: 3,367 ns/op (8.4x slower than uncompressed)
  - Compressed ZSTD 8 threads: 9,987 ns/op (25x slower than uncompressed)
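
For readers unfamiliar with VLQ, the sketch below decodes a variable-length quantity straight from a byte array without allocating any intermediate object. It assumes a little-endian base-128 layout (7 payload bits per byte, high bit set on every byte except the last); the exact on-disk layout Sparkey uses is not stated in these notes, and `Vlq` is a hypothetical class.

```java
// Hypothetical sketch for illustration; the byte layout is an assumption.
final class Vlq {
    // Decodes an unsigned little-endian base-128 value starting at offset.
    static long decode(byte[] buf, int offset) {
        long value = 0;
        int shift = 0;
        int pos = offset;
        while (true) {
            byte b = buf[pos++];
            value |= (long) (b & 0x7F) << shift; // low 7 bits carry the payload
            if ((b & 0x80) == 0) {
                return value; // high bit clear: this was the last byte
            }
            shift += 7;
            if (shift > 63) {
                throw new IllegalArgumentException("VLQ too long");
            }
        }
    }

    // Number of bytes the encoding of value occupies, so a caller can advance its cursor.
    static int encodedLength(long value) {
        int n = 1;
        while ((value >>>= 7) != 0) {
            n++;
        }
        return n;
    }
}
```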
- Performance optimization: New readFullyCompare() method combines reading and comparing bytes in a single operation, avoiding temporary buffer allocation and data copying. Provides 11-17% improvement in high-concurrency uncompressed workloads and 2-8% improvement in most other scenarios.
- SIMD optimization: Byte array comparisons now use vectorized instructions (AVX2/AVX-512) on Java 9+ via Multi-Release JAR. Provides 4-13% improvement in single-threaded and low-concurrency scenarios (up to 8 threads). Java 8 continues to use the standard byte-by-byte comparison.
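
Both ideas can be sketched with JDK calls. `CompareSketch` is a hypothetical class: `regionEquals` compares a key against a buffer region in place rather than copying it out first, and `rangeEquals` uses `Arrays.mismatch` (Java 9+), which HotSpot can compile to vectorized code. Whether Sparkey uses these exact calls is an assumption.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical sketch for illustration; not the Sparkey implementation.
final class CompareSketch {
    // Compares key against buf[pos, pos + key.length) without a temporary array.
    // Uses absolute gets, so the buffer's position is left untouched.
    static boolean regionEquals(ByteBuffer buf, int pos, byte[] key) {
        if (pos < 0 || buf.limit() - pos < key.length) {
            return false; // region would run past the end of the buffer
        }
        for (int i = 0; i < key.length; i++) {
            if (buf.get(pos + i) != key[i]) {
                return false;
            }
        }
        return true;
    }

    // Java 9+: range comparison of two arrays; mismatch returns -1 when the ranges are equal.
    static boolean rangeEquals(byte[] a, int aFrom, byte[] b, int bFrom, int len) {
        return Arrays.mismatch(a, aFrom, aFrom + len, b, bFrom, bFrom + len) == -1;
    }
}
```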
- PooledSparkeyReader improvements: Optimized with lock-free atomic operations for better performance under high contention.
- Fix thread-safety bug in PooledSparkeyReader.getAsEntry() where the entry object could be shared across threads.
- Internal simplification: ThreadLocalSparkeyReader now extends PooledSparkeyReader to reduce code duplication.
- Build improvements: Added comprehensive JMH benchmarking infrastructure and performance testing tools.
- Dependency updates:
- Bump commons-io from 2.7 to 2.14.0
- Bump guava from 29.0-jre to 32.0.0-jre
- Bump snappy-java from 1.1.7.2 to 1.1.10.4
- Bump logback-classic from 1.2.3 to 1.2.13
- Upgrade zstd-jni from 1.5.2-2 to 1.5.2-5
- Upgrade slf4j-api from 1.7.2 to 1.7.36
- New PooledSparkeyReader: Sparkey.open() now returns PooledSparkeyReader instead of ThreadLocalSparkeyReader by default. This provides better memory safety for Java 21+ applications using virtual threads, where ThreadLocalSparkeyReader can cause unbounded memory growth. Performance is on par with ThreadLocalSparkeyReader.
- PooledSparkeyReader uses thread-ID-based striping with a bounded pool of readers, providing O(pool size) memory usage instead of O(threads).
- ThreadLocalSparkeyReader is now deprecated but still available via Sparkey.openThreadLocalReader() for backward compatibility.
- New factory methods: Sparkey.openPooledReader(file) and Sparkey.openPooledReader(file, poolSize) for explicit pool configuration.
- Fix ByteBufferCleaner to avoid deprecated sun.misc.Unsafe.invokeCleaner on Java 19+. Uses version-specific cleaners: Java 8 uses sun.reflect, Java 9-18 uses sun.misc.Unsafe.invokeCleaner, and Java 19+ uses a no-op implementation (automatic cleanup via JVM's internal Cleaner API).
- Performance improvement on Java 19+: Skip both cleanup loop and 100ms sleep when manual cleanup is not needed, providing faster close operations.
- Fix for ThreadLocalSparkeyReader due to changed behavior of ThreadLocal for tasks in ForkJoinPool.commonPool() in Java 16+
- Fixed bug where calling isLoaded on a duplicated MappedByteBuffer threw an exception.
- Updated version of zstd-jni. See #52
- Fixed bug where creating hash files would erroneously lose some keys. The bug only applies to cases where the construction mode is SORTING and hash collisions are present (so typically only when hash mode is 32 bits and the number of keys is more than 100000). The bug was introduced along with SORTING in 2.3.0
- Added methods to reader: getLoadedBytes() and getTotalBytes()
- Added support for zstd compression
- Fix bug where file-descriptors are not closed after using a Sparkey iterator.
- Compiles as Java 8, up from Java 6.
- Removed Guava as a dependency.
- Upgraded Snappy dependency.
- Optimized sorting-based hash file creation slightly.
- Changed API from ListenableFuture to CompletionStage in ReloadableSparkeyReader.
- Add automatic-module-name for better module support.
- Running close() on a SparkeyReader will now synchronously unmap the files. This may block for 100 ms if the reader has been duplicated (typically when used from multiple threads).
- Fixed some bugs with unclosed files in certain cases. As an effect of this, it is more stable on Windows.
- Fix bug where file creation didn't properly close the mmap immediately, causing subsequent rename failures on Windows
- New method of creating hash indexes added: Sorting. By presorting the hash entries, the hash table construction can be done using less memory than the size of the hash table. The cost is extra CPU time and temporary disk usage. As a result, some new API methods have been added:
  - SparkeyWriter.setHashSeed() to manually set which seed to use, for deterministic hash indexes. (Default: random)
  - SparkeyWriter.setMaxMemory() to set how much memory to use for index construction. (Default: free memory / 2)
  - SparkeyWriter.setConstructionMethod() to allow for explicit configuration of creation method to use. (Default: AUTO)
- Various minor optimizations
- Add dependency on com.fasterxml.util:java-merge-sort
- Performance differences:
  - Writing hash index is 2-3x slower when using sorting (but this can be avoided by setting a large max memory or explicitly setting construction method to IN_MEMORY).
  - Random lookups are 6% faster than in 2.2.1 for compressed data and 5-17% faster for uncompressed (more improvement for larger data sets).
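
The reason presorting helps can be shown with a small in-memory model. This is a sketch under assumptions, not Sparkey's algorithm: `SortedHashBuild` is a hypothetical class, the table is a plain open-addressing array with linear probing, and 0 is reserved as the empty marker. Sorting the hashes by their home slot means the writes sweep the table mostly front to back, so a real implementation could stream sorted runs from disk and touch the table sequentially rather than at random.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical in-memory model for illustration; not the Sparkey implementation.
final class SortedHashBuild {
    // Builds an open-addressing table (0 marks an empty slot) using linear probing.
    // Requires non-zero hashes and at least one free slot.
    static long[] build(long[] hashes, int capacity) {
        if (hashes.length >= capacity) {
            throw new IllegalArgumentException("capacity must exceed the number of hashes");
        }
        // Presort by home slot so insertion order follows table order.
        long[] sorted = Arrays.stream(hashes)
                .boxed()
                .sorted(Comparator.comparingLong((Long h) -> Long.remainderUnsigned(h, capacity)))
                .mapToLong(Long::longValue)
                .toArray();
        long[] table = new long[capacity];
        for (long h : sorted) {
            if (h == 0) {
                throw new IllegalArgumentException("0 is reserved for empty slots");
            }
            int slot = (int) Long.remainderUnsigned(h, capacity);
            while (table[slot] != 0) {
                slot = (slot + 1) % capacity; // linear probing with wrap-around
            }
            table[slot] = h;
        }
        return table;
    }

    // Probes from the home slot until the hash or an empty slot is found.
    static boolean contains(long[] table, long h) {
        int slot = (int) Long.remainderUnsigned(h, table.length);
        while (table[slot] != 0) {
            if (table[slot] == h) {
                return true;
            }
            slot = (slot + 1) % table.length;
        }
        return false;
    }
}
```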
- Minor bug fix to avoid stack overflow for large read and write operations.
- Make Sparkey.open return a thread local (i.e. thread safe) reader by default.
- Update snappy dependency
- Fix bug with replacing files on Windows systems.
- Fix minor bug related to generating filenames.
- Always close files in case of exceptions upon creation.
- Removed IOExceptions for close() for reader types
- Make SparkeyReader and SparkeyWriter implement Closeable to support try-with-resources
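
What these two changes mean for callers can be shown with a stand-in type. `QuietResource` is a hypothetical class, not a Sparkey type: it implements `Closeable` but overrides `close()` without a `throws` clause, so a try-with-resources block needs no `catch` for `IOException`.

```java
import java.io.Closeable;

// Hypothetical stand-in for a reader type; not part of the Sparkey API.
final class QuietResource implements Closeable {
    private boolean closed;

    // Closeable.close() declares IOException, but an override may declare none.
    @Override
    public void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }

    // Opens a resource, uses it in try-with-resources, and reports whether it was closed.
    static boolean useAndClose() {
        QuietResource resource = new QuietResource();
        try (QuietResource scoped = resource) {
            if (scoped.isClosed()) {
                throw new IllegalStateException("should still be open inside the block");
            }
        } // no catch needed: close() declares no checked exception
        return resource.isClosed();
    }
}
```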
- Fix bug which triggers a BufferUnderflowException in some rare cases.
- Add support for using fsync in the writer.
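
For context, this is what an fsync looks like from Java. The sketch is generic JDK code, not the SparkeyWriter option itself, and `FsyncSketch` is a hypothetical class: `FileChannel.force(true)` asks the OS to flush both file content and metadata to the storage device before the call returns.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration; not part of the Sparkey API.
final class FsyncSketch {
    // Writes data and forces it to the storage device; returns the resulting file size.
    static long writeDurably(Path file, byte[] data) {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf); // a single write may be partial
            }
            ch.force(true); // flush content and metadata, similar to fsync(2)
            return ch.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Creates an empty temporary file to write into.
    static Path tempFile() {
        try {
            return Files.createTempFile("fsync-sketch", ".bin");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Forcing on every write is slow, so writers typically expose it as an opt-in for data that must survive a crash.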
- Improve detection of corrupt files.
- Fix file-descriptor leaks when closing readers and writers.
- Use JMH for benchmarks.
- Make hash sparsity configurable.
- Fix minor bug in isAt()
- Fix race condition in ThreadLocalSparkeyReader.close() and improve GC of thread-local readers.
- Initial public release