Roydl.Text

Roydl.Text provides a simple, generic way to encode and decode binary data as text. Extension methods are available for string and byte[], and a growing set of encodings is offered — all of which are performance-optimized and parallelized across available CPU cores, with AVX2 and AVX-512 SIMD acceleration where applicable.


Prerequisites

  • .NET 10 LTS or higher
  • Supported platforms: Windows, Linux, macOS
  • Hardware acceleration (optional): AVX2 or AVX-512 capable CPU

Install

$ dotnet add package Roydl.Text

Binary-To-Text Encodings

Type     Character Set                                          Output Ratio  Hardware Support
Base-2   0 and 1                                                              AVX-512, AVX2
Base-8   0–7                                                                  AVX-512, AVX2
Base-10  0–9                                                                  AVX-512, AVX2
Base-16  0–9 and a–f                                                          AVX-512, AVX2
Base-32  A–Z and 2–7; = for padding                             1.6×          AVX2 ¹
Base-64  A–Z, a–z, 0–9, + and /; = for padding                  1.33×         AVX2 ²
Base-85  ASCII printable range !–u; z shortcut for null groups  1.25×         AVX2
Base-91  A–Z, a–z, 0–9 and !#$%&()*+,-.:;<=>?@[]^_`{|}~"        ~1.23×        None ³

¹ AVX2 is used for the alphabet lookup phase only. The non-power-of-two 5-bit group width prevents full SIMD vectorization of the bit-extraction phase.

² Delegates to .NET's built-in System.Buffers.Text.Base64 which is internally AVX2-accelerated. Parallelization and double-buffered I/O are layered on top.

³ The algorithm maintains a serial bit-accumulator state across every byte, making it fundamentally incompatible with SIMD vectorization or parallel processing. Any optimization that would break this dependency chain would also break compatibility with existing encoded data.
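
The serial dependency described in note ³ is easiest to see in code. The following is a minimal sketch of the reference basE91 encoding scheme, not this library's implementation: the function name `EncodeBase91` is hypothetical, and the alphabet is the one used by the reference basE91 tool, which may differ from the character set listed in the table above. Every loop iteration reads and updates the same accumulator (`bits`, `count`), so the position of each output group depends on all bytes before it.

```csharp
using System;
using System.Text;

// Hypothetical helper: encodes bytes with the reference basE91 scheme.
// Assumption: reference basE91 alphabet, not necessarily the library's.
static string EncodeBase91(ReadOnlySpan<byte> data)
{
    const string alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789" +
        "!#$%&()*+,./:;<=>?@[]^_`{|}~\"";

    var sb = new StringBuilder(data.Length * 16 / 13 + 2);
    uint bits = 0;  // bit accumulator carried from one byte to the next
    int count = 0;  // number of valid bits currently held in the accumulator

    foreach (byte b in data)
    {
        bits |= (uint)b << count;
        count += 8;
        if (count <= 13) continue;  // not enough bits for a group yet

        uint v = bits & 0x1FFF;     // take 13 bits first
        if (v > 88)
        {
            bits >>= 13;
            count -= 13;
        }
        else
        {
            // A small 13-bit value leaves room for a 14th bit below 91 * 91.
            v = bits & 0x3FFF;
            bits >>= 14;
            count -= 14;
        }
        sb.Append(alphabet[(int)(v % 91)]).Append(alphabet[(int)(v / 91)]);
    }

    // Flush whatever is left in the accumulator.
    if (count > 0)
    {
        sb.Append(alphabet[(int)(bits % 91)]);
        if (count > 7 || bits > 90) sb.Append(alphabet[(int)(bits / 91)]);
    }
    return sb.ToString();
}

Console.WriteLine(EncodeBase91(Encoding.ASCII.GetBytes("test"))); // fPNKd
```

Because the choice between a 13-bit and a 14-bit group depends on the data, the bit offset at which any later group starts cannot be known without processing everything before it, which is why the input cannot simply be split into independent chunks.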

For general binary-to-text encoding, Base-85 and Base-91 offer better compactness than Base-64 — Base-85 produces ~6% smaller output, and Base-91 ~9% smaller. Base-85 is the better practical choice of the two: it is over 7× faster than Base-91 while sacrificing only marginal compactness.
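
The output ratios in the table follow from how many input bytes each encoding packs into one group of output characters. The arithmetic below is a worked sketch that ignores padding, line breaks, and the Base-85 z shortcut; the Base-91 range assumes the reference basE91 scheme, where two output characters carry either 13 or 14 input bits depending on the data.

```latex
\begin{align*}
\text{Base-32:}\quad & \frac{8\ \text{chars}}{5\ \text{bytes}} = 1.6 \\
\text{Base-64:}\quad & \frac{4\ \text{chars}}{3\ \text{bytes}} \approx 1.33 \\
\text{Base-85:}\quad & \frac{5\ \text{chars}}{4\ \text{bytes}} = 1.25 \\
\text{Base-85 vs. Base-64:}\quad & 1 - \frac{5/4}{4/3} = 1 - \frac{15}{16} = 6.25\% \\
\text{Base-91:}\quad & \frac{16}{14} \approx 1.14 \ \text{(best case)} \quad\text{to}\quad \frac{16}{13} \approx 1.23 \ \text{(typical)} \\
\text{Base-91 vs. Base-64:}\quad & 1 - \frac{16/13}{4/3} = 1 - \frac{12}{13} \approx 7.7\% \quad\text{to}\quad 1 - \frac{16/14}{4/3} = 1 - \frac{6}{7} \approx 14.3\%
\end{align*}
```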


Encoding Performance

Base-64 and Base-16 are the fastest encodings in this library. Base-91 is a known outlier — its serial design makes parallelization impossible without breaking the algorithm.

Encoding Throughput
Base-2 2.2 GiB/s
Base-8 1.0 GiB/s
Base-10 1.2 GiB/s
Base-16 7.5 GiB/s
Base-32 1.3 GiB/s
Base-64 9.6 GiB/s
Base-85 2.8 GiB/s
Base-91 380 MiB/s
Benchmark methodology
Component Details
CPU AMD Ryzen 5 7600 (6C/12T, 5.1 GHz boost)
RAM 32 GB DDR5
OS Manjaro Linux (Kernel 6.19.2-1)
Runtime .NET 10
Build Release (dotnet run -c Release)

Each encoding is benchmarked using stream reuse to eliminate allocation overhead. Four input patterns are tested per encoding: random bytes, all-zeros, sequential, and mixed (25% zero groups). Each pattern runs five cycles of three seconds each. The reported throughput is the median across all patterns and cycles, which avoids cache-warmup bias and reflects sustained real-world performance. The benchmark test is available in the project repository.
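
The measurement scheme described above can be sketched roughly as follows. This is not the project's benchmark code: the helper names (`Median`, `MeasureMiBPerSecond`, `BuildPatterns`) are hypothetical, `Convert.ToBase64String` from the .NET base library stands in for the encoder under test (so, unlike the real benchmark, it allocates on every call), the cycles are shortened to 50 ms so the sketch finishes quickly, and the 4-byte group size used for the mixed pattern is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Median of the samples; averages the two middle values for even counts.
static double Median(IReadOnlyList<double> values)
{
    var sorted = new List<double>(values);
    sorted.Sort();
    int mid = sorted.Count / 2;
    return sorted.Count % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
}

// Calls `encode` on `input` repeatedly for roughly `duration`; returns MiB/s.
static double MeasureMiBPerSecond(Func<byte[], int> encode, byte[] input, TimeSpan duration)
{
    long bytes = 0;
    var sw = Stopwatch.StartNew();
    do
    {
        encode(input);
        bytes += input.Length;
    } while (sw.Elapsed < duration);
    return bytes / (1024.0 * 1024.0) / sw.Elapsed.TotalSeconds;
}

// The four input patterns: random, all-zeros, sequential, mixed.
static byte[][] BuildPatterns(int size, int seed)
{
    var random = new byte[size];
    new Random(seed).NextBytes(random);
    var zeros = new byte[size];
    var sequential = new byte[size];
    for (int i = 0; i < size; i++) sequential[i] = (byte)i;
    var mixed = (byte[])random.Clone();
    // Zero every fourth 4-byte group, giving about 25% zero groups (assumed group size).
    for (int i = 0; i + 4 <= size; i += 16) Array.Clear(mixed, i, 4);
    return new[] { random, zeros, sequential, mixed };
}

// Five cycles per pattern; the reported figure is the median of all 20 samples.
var throughputs = new List<double>();
foreach (var pattern in BuildPatterns(1 << 20, seed: 42))
    for (int cycle = 0; cycle < 5; cycle++)
        throughputs.Add(MeasureMiBPerSecond(
            b => Convert.ToBase64String(b).Length, pattern, TimeSpan.FromMilliseconds(50)));

Console.WriteLine($"Median throughput: {Median(throughputs):F0} MiB/s");
```

Taking the median rather than the mean keeps one unusually slow cycle (for example the first, cold one) or one unusually fast pattern from dominating the reported number.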


Usage

// Encode: value can be a string or a byte[]
// BinToTextEncoding defaults to Base64 if not specified
string encoded = value.Encode(BinToTextEncoding.Base85);

// Decode to raw bytes or directly to a string
byte[] decodedBytes = encoded.Decode(BinToTextEncoding.Base85);
string decodedText = encoded.DecodeString(BinToTextEncoding.Base85);

// File encoding via extension methods
// For large files, use the instance-based approach below instead
string encodedFile = path.EncodeFile(BinToTextEncoding.Base85);
byte[] decodedFile = path.DecodeFile(BinToTextEncoding.Base85);

// Instance-based: recommended for large files or repeated use
// GetDefaultInstance() returns a cached singleton per encoding type
var encoder = BinToTextEncoding.Base85.GetDefaultInstance();

// Stream-based: most efficient for large files
using var input = new FileStream(srcPath, FileMode.Open, FileAccess.Read);
using var output = new FileStream(destPath, FileMode.Create);
encoder.EncodeStream(input, output);

// Line length: inserts Environment.NewLine after every N encoded chars
string wrapped = value.Encode(BinToTextEncoding.Base64, lineLength: 76);

// All public methods are available on every encoding instance
string fromBytes = encoder.EncodeBytes(bytes);
string fromText  = encoder.EncodeString(text);
string fromFile  = encoder.EncodeFile(path);
byte[] toBytes   = encoder.DecodeBytes(encoded);
string toText    = encoder.DecodeString(encoded);
byte[] toFile    = encoder.DecodeFile(path);
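
To illustrate what a line length of 76 means for the encoded text, here is a small standalone sketch that uses only the .NET base library. `WrapLines` is a hypothetical helper, not part of Roydl.Text, and whether the library also emits a line break after a final full line is an assumption this sketch does not settle (it emits none).

```csharp
using System;
using System.Text;

// Hypothetical helper: breaks `encoded` into lines of at most `lineLength` chars.
static string WrapLines(string encoded, int lineLength)
{
    if (lineLength <= 0 || encoded.Length <= lineLength) return encoded;

    var sb = new StringBuilder(encoded.Length + encoded.Length / lineLength * Environment.NewLine.Length);
    for (int i = 0; i < encoded.Length; i += lineLength)
    {
        if (i > 0) sb.Append(Environment.NewLine);  // separator between lines only
        sb.Append(encoded, i, Math.Min(lineLength, encoded.Length - i));
    }
    return sb.ToString();
}

// 120 input bytes become 160 Base-64 characters: two lines of 76 and one of 8.
string base64 = Convert.ToBase64String(new byte[120]);
Console.WriteLine(WrapLines(base64, 76));
```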

Would you like to help?

  • Star this Project ⭐ and show me that this project interests you 🤗
  • Open an Issue ☕ to give me your feedback and tell me your ideas and wishes for the future 😎
  • Open a Ticket 📫 if you don't have a GitHub account; you can contact me directly on my website 😉
  • Donate by PayPal 💸 to buy me some cakes 🍰
