GGUF endianness cannot be determined from GGUF itself · ggml-org/llama.cpp#3957

20 comments · 4 reactions · 0 assignees · C++ · 18,202 forks

Labels: breaking change, enhancement, good first issue

Repository metrics

Stars: 110,169
PR merge metrics: average merge time 6d 8h, 389 merged PRs in 30d

Description

As of the time of writing, the big-endian support that was added in https://github.com/ggerganov/llama.cpp/pull/3552 doesn't encode the endianness within the file itself:

https://github.com/ggerganov/llama.cpp/blob/3d48f42efcd05381221654376e9f6f69d76af739/gguf-py/gguf/gguf.py#L689-L698

This means that there is no way to distinguish a big-endian GGUF file from a little-endian file, which may cause some degree of consternation in the future if these files get shared around 😅
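
To make the ambiguity concrete, here is a minimal sketch of what a reader sees when it guesses the byte order wrongly. It assumes, as the linked `gguf.py` snippet suggests, that the magic bytes are identical in both kinds of file and only the fields after them change byte order. The version value `3` and the helper names are chosen purely for illustration.

```python
import struct

GGUF_VERSION = 3  # example value for illustration only


def pack_version(version: int, big_endian: bool) -> bytes:
    """Pack the uint32 version field in the requested byte order."""
    fmt = ">I" if big_endian else "<I"
    return struct.pack(fmt, version)


def read_version_as_little_endian(raw: bytes) -> int:
    """Decode a uint32 the way a reader that assumes little-endian would."""
    return struct.unpack("<I", raw)[0]


le_bytes = pack_version(GGUF_VERSION, big_endian=False)
be_bytes = pack_version(GGUF_VERSION, big_endian=True)

print(read_version_as_little_endian(le_bytes))  # 3
print(read_version_as_little_endian(be_bytes))  # 50331648
```

Nothing in the bytes says which interpretation is intended. A reader can only notice that the second value looks implausibly large, which is a heuristic rather than a definition in the format.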

The cleanest solution would be to add the endianness to the header, but that would be a breaking change. Ideally it would live in the metadata, but reading the metadata itself depends on knowing the endianness.

Given that, my suggestion would be to use FUGG as the header magic for big-endian files, so that a little-endian executor won't attempt to read such a file at all unless it knows how to deal with it. The same goes the other way as well (a big-endian executor won't attempt to read a little-endian file).
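
A rough sketch of how the suggestion could look on the reading and writing side, in Python. The `FUGG` magic is only the proposal from this issue, not something the format defines, and `byte_order_from_magic`, `write_header_prefix` and `read_header_prefix` are hypothetical helpers rather than functions from `gguf.py`. Only the first two header fields (4-byte magic, then a uint32 version) are handled.

```python
import io
import struct
from typing import BinaryIO, Tuple

MAGIC_LE = b"GGUF"  # existing magic bytes
MAGIC_BE = b"FUGG"  # hypothetical big-endian magic, as proposed in this issue


def byte_order_from_magic(magic: bytes) -> str:
    """Map the 4 magic bytes to a struct byte-order prefix."""
    if magic == MAGIC_LE:
        return "<"
    if magic == MAGIC_BE:
        return ">"
    raise ValueError(f"not a GGUF file: bad magic {magic!r}")


def write_header_prefix(out: BinaryIO, version: int, big_endian: bool) -> None:
    """Write the magic and version, picking the magic from the byte order."""
    out.write(MAGIC_BE if big_endian else MAGIC_LE)
    out.write(struct.pack(">I" if big_endian else "<I", version))


def read_header_prefix(src: BinaryIO) -> Tuple[str, int]:
    """Read the magic, infer the byte order from it, then decode the version."""
    order = byte_order_from_magic(src.read(4))
    (version,) = struct.unpack(order + "I", src.read(4))
    return order, version


# Round trip through an in-memory buffer.
buf = io.BytesIO()
write_header_prefix(buf, version=3, big_endian=True)
buf.seek(0)
print(read_header_prefix(buf))  # ('>', 3)
```

Because the magic is compared as raw bytes, an executor that only knows `GGUF` rejects a `FUGG` file up front instead of misreading every field that follows.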

Contributor guide

Research direction: Examine the GGUF header definition in gguf.h and the Python implementation in gguf.py. Study the linked PR #3552 to understand how big endian support was added. Then implement a solution such as using 'FUGG' as the magic number for big endian files, ensuring backward compatibility and updating both reading and writing code paths.
Tech stack: Python
Domain: backend, machine learning
Issue type: Feature
Difficulty: 3
Estimated time: 1-2 days
Activity status: Active
Clarity: Clear
Prerequisites: understanding of endianness, file format basics, C++ or Python
Newbie friendliness: 65
