diff --git a/config.md b/config.md
index 1b818ba..f7fe002 100644
--- a/config.md
+++ b/config.md
@@ -18,7 +18,7 @@ config JSON file. Model config JSON files MUST be valid JSON objects.
 Contains metadata describing the model.
 - **format**: string, REQUIRED
-  The packaging format of the model file(s). Currently the only supported value is `gguf`.
+  The packaging format of the model file(s). Supported values are `gguf` and `safetensors`.
 - **format_version**: string, OPTIONAL
@@ -30,6 +30,13 @@ config JSON file. Model config JSON files MUST be valid JSON objects.
   standardized [key-value pairs](https://github.com/ggml-org/ggml/blob/master/docs/gguf.md#general) defined in the GGUF specification.
+- **safetensors**: object, OPTIONAL
+
+  Contains metadata specific to the `safetensors` format. May include fields such as:
+  - **architecture**: string, OPTIONAL - The model architecture (e.g., "llama", "qwen2", "mistral")
+  - **parameter_count**: string, OPTIONAL - The total number of parameters (e.g., "7.24 B", "13 B")
+
+
 - **size**: string, REQUIRED
   The total size of the model in bytes.
@@ -45,7 +52,9 @@ config JSON file. Model config JSON files MUST be valid JSON objects.
   The media type of the file. This indicates the type of the file and how it should be interpreted.
-## Example
+## Examples
+
+### GGUF Model
 ```json
 {
@@ -79,3 +88,38 @@ config JSON file. Model config JSON files MUST be valid JSON objects.
 }
 ```
+### Safetensors Model (Sharded)
+
+```json
+{
+  "descriptor": {
+    "createdAt": "2025-01-01T00:00:00Z"
+  },
+  "config": {
+    "format": "safetensors",
+    "safetensors": {
+      "architecture": "qwen2",
+      "parameter_count": "3.09 B"
+    },
+    "size": "6171926992"
+  },
+  "files": [
+    {
+      "diffID": "sha256:67347b23fb4165b652eb6611f5e1f2a06dfcddba8e909df1b2b0b1857bee06c2",
+      "type": "application/vnd.docker.ai.safetensors"
+    },
+    {
+      "diffID": "sha256:a40d941d0e7e0b966ad8b62bb6d6b7c88cce1299197b599d9d0a4ce59aabfc1d",
+      "type": "application/vnd.docker.ai.safetensors"
+    },
+    {
+      "diffID": "sha256:5acfb0cc82593273b8c9032239bbe897b80d17b185d8e7ae148afe21cb188067",
+      "type": "application/vnd.docker.ai.vllm.config.tar"
+    },
+    {
+      "diffID": "sha256:d0ce8fae4da6de6e5a4b85ebee156ac8f3ab6d8407caf4493968d34e9bc3939e",
+      "type": "application/vnd.docker.ai.license"
+    }
+  ]
+}
+```
diff --git a/spec.md b/spec.md
index 152e593..073c2e0 100644
--- a/spec.md
+++ b/spec.md
@@ -11,11 +11,20 @@ All layers blobs SHOULD contain the contents of a single file. Layers SHOULD NOT
 - `application/vnd.docker.ai.gguf.v3` - A file adhering to version 3 of the [GGUF specification](https://github.com/ggml-org/ggml/blob/master/docs/gguf.md), containing a tensor model.
 - `application/vnd.docker.ai.gguf.v3.lora` - A file adhering to version 3 of the GGUF specification, containing a LoRA adapter.
 - `application/vnd.docker.ai.gguf.v3.mmproj` - A file containing multimodal projector weights in GGUF format, used to bridge vision and language models by projecting visual features into the language model's embedding space.
+- `application/vnd.docker.ai.safetensors` - A file adhering to the [safetensors specification](https://github.com/huggingface/safetensors), a safe and fast serialization format for machine learning tensors.
+- `application/vnd.docker.ai.vllm.config.tar` - A tar archive containing configuration files (*.json) and metadata files (e.g., merges.txt) used by inference engines.
 - `application/vnd.docker.ai.license` - Plain text file containing a software license.
 - `application/vnd.docker.ai.chat.template.jinja` - A text file containing a [Jinja](https://jinja.palletsprojects.com/en/stable/) prompt template, used to define chat/inference formatting.
-## Example Manifest
+### Sharded Models
+Both GGUF and safetensors formats support sharded models where the model weights are split across multiple files. In such cases:
+- Multiple layers with the same media type (e.g., `application/vnd.docker.ai.safetensors` or `application/vnd.docker.ai.gguf.v3`) represent different shards of the same model.
+- The order of layers in the manifest defines the shard sequence.
+- Shards typically follow naming conventions such as `model-00001-of-00002.safetensors` or `model-00001-of-00002.gguf`.
+## Example Manifests
+
+### GGUF Model
 ```json
 {
   "schemaVersion": 2,
@@ -39,3 +48,38 @@ All layers blobs SHOULD contain the contents of a single file. Layers SHOULD NOT
   ]
 }
 ```
+
+### Safetensors Model (Sharded)
+```json
+{
+  "schemaVersion": 2,
+  "mediaType": "application/vnd.oci.image.manifest.v1+json",
+  "config": {
+    "mediaType": "application/vnd.docker.ai.model.config.v0.1+json",
+    "size": 465,
+    "digest": "sha256:2ea258562df7df57407d739f3215419dd1827093d4a8057386b0c723fa011305"
+  },
+  "layers": [
+    {
+      "mediaType": "application/vnd.docker.ai.safetensors",
+      "size": 3968658944,
+      "digest": "sha256:67347b23fb4165b652eb6611f5e1f2a06dfcddba8e909df1b2b0b1857bee06c2"
+    },
+    {
+      "mediaType": "application/vnd.docker.ai.safetensors",
+      "size": 2203268048,
+      "digest": "sha256:a40d941d0e7e0b966ad8b62bb6d6b7c88cce1299197b599d9d0a4ce59aabfc1d"
+    },
+    {
+      "mediaType": "application/vnd.docker.ai.vllm.config.tar",
+      "size": 11530752,
+      "digest": "sha256:5acfb0cc82593273b8c9032239bbe897b80d17b185d8e7ae148afe21cb188067"
+    },
+    {
+      "mediaType": "application/vnd.docker.ai.license",
+      "size": 13,
+      "digest": "sha256:d0ce8fae4da6de6e5a4b85ebee156ac8f3ab6d8407caf4493968d34e9bc3939e"
+    }
+  ]
+}
+```
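
Reviewer note on the sharding rule added to spec.md: the patch states that layers sharing a weight media type are shards of one model and that manifest order defines the shard sequence. The sketch below shows one way a client might apply that rule to the sharded safetensors example. The helper names `shards` and `shard_bytes` are hypothetical and not part of the spec. The final comparison relies on an observation about this one example (the two safetensors layer sizes add up to the config `size`), which the spec text does not state as a requirement.

```python
# Layer entries copied from the "Safetensors Model (Sharded)" manifest example
# (digests omitted for brevity).
LAYERS = [
    {"mediaType": "application/vnd.docker.ai.safetensors", "size": 3968658944},
    {"mediaType": "application/vnd.docker.ai.safetensors", "size": 2203268048},
    {"mediaType": "application/vnd.docker.ai.vllm.config.tar", "size": 11530752},
    {"mediaType": "application/vnd.docker.ai.license", "size": 13},
]

# "size" from the matching config example; the config stores it as a string.
CONFIG_SIZE = "6171926992"


def shards(layers, media_type):
    """Return (shard_number, layer) pairs in manifest order, numbered from 1.

    Hypothetical helper: it only filters by media type and preserves order,
    which is how the patch says the shard sequence is defined.
    """
    matching = [layer for layer in layers if layer["mediaType"] == media_type]
    return list(enumerate(matching, start=1))


def shard_bytes(layers, media_type):
    """Sum the sizes of all layers that carry the given media type."""
    return sum(layer["size"] for _, layer in shards(layers, media_type))


if __name__ == "__main__":
    weights = "application/vnd.docker.ai.safetensors"
    for number, layer in shards(LAYERS, weights):
        print(f"shard {number}: {layer['size']} bytes")
    # Assumption drawn from the example only, not from normative spec text.
    print("matches config size:", shard_bytes(LAYERS, weights) == int(CONFIG_SIZE))
```

If the intent is that `size` always equals the combined size of the weight shards, it may be worth saying so explicitly in config.md, since the current wording ("The total size of the model in bytes") leaves open whether auxiliary layers such as the config tar or license are counted.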