SeaweedFS is a distributed storage system for object storage (S3), file systems, and Iceberg tables, designed to handle billions of files with O(1) disk access and effortless horizontal scaling.
-
Updated
Mar 20, 2026 - Go
SeaweedFS is a distributed storage system for object storage (S3), file systems, and Iceberg tables, designed to handle billions of files with O(1) disk access and effortless horizontal scaling.
Ceph is a distributed object, block, and file storage platform
JuiceFS is a distributed POSIX file system built on top of Redis and S3.
Utils for streaming large files (S3, HDFS, gzip, bz2...)
The Universal Storage Engine
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Functions, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Real Time Analytics and Data Pipelines based on Spark Streaming
Deprecated - See Lenses.io Community Edition
CloudEon uses Kubernetes to install and deploy open-source big data components, enabling the containerized operation of an open-source big data platform. This allows you to reduce your focus on underlying resource management and maintenance.
Fundamentals of Spark with Python (using PySpark), code examples
Add a description, image, and links to the hdfs topic page so that developers can more easily learn about it.
To associate your repository with the hdfs topic, visit your repo's landing page and select "manage topics."