M4A
How musefs scans and synthesizes MP4-container audio (.m4a, .m4b). Only
unfragmented files with exactly one track, which must be an audio (soun)
track, are accepted; anything else is skipped at scan time. For how these
layouts plug into the wider design, see the segment model.
What round-trips
- Canonical text tags map to their standard ilst atoms (©nam, ©ART, aART, ©alb, ©day, …) via the shared vocabulary (musefs-format/src/tagmap.rs).
- Vocabulary freeform keys (ReplayGain fields, MusicBrainz album/artist ids, ISRC, COPYRIGHT, …) round-trip through ---- freeform atoms under the com.apple.iTunes mean, matched case-insensitively.
- Other text freeform atoms round-trip keyed by their verbatim name, original casing preserved.
- Track and disc numbers, with totals: the binary trkn/disk atoms are decoded to tracknumber/discnumber as "N" or "N/M" (the "N of M" total, matching ID3 TRCK/TPOS) and rebuilt as binary atoms with the total filled in.
- Integer atoms: tmpo/cpil/pgap map to the canonical bpm/compilation/gapless keys (shared with ID3 TBPM/TCMP and Vorbis) and are rebuilt as type-21 integer atoms.
- Multi-value atoms: every data sub-box of an atom is read (the iTunes multiple-data convention), so a multi-valued atom round-trips all its values, not just the first.
- Opaque binary freeform atoms, byte-exact: a ---- atom whose payload is binary-typed is captured verbatim under the key ----:<mean>:<name> (so the mean survives) and re-emitted streamed from the DB (BinaryTag segment).
- Cover art: every data child of a covr atom (the iTunes multiple-artwork convention) is ingested; synthesis emits one covr atom with one data child per stored art row, in order, image bytes streamed.
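The track/disc number round-trip described above can be sketched as follows. This is a minimal illustration, not the code in musefs-format: the function names decode_pair and encode_pair are hypothetical, and the payload layout (two reserved bytes, a big-endian 16-bit number, a big-endian 16-bit total, plus two trailing pad bytes for trkn only) is the common iTunes convention rather than something this page specifies.

```rust
/// Decode the payload of a trkn or disk data box into "N" or "N/M".
/// Assumed layout: 2 reserved bytes, big-endian u16 number,
/// big-endian u16 total, then optional padding.
fn decode_pair(payload: &[u8]) -> Option<String> {
    if payload.len() < 6 {
        return None; // too short to hold number and total
    }
    let number = u16::from_be_bytes([payload[2], payload[3]]);
    let total = u16::from_be_bytes([payload[4], payload[5]]);
    if total == 0 {
        Some(number.to_string())
    } else {
        Some(format!("{}/{}", number, total))
    }
}

/// Rebuild the binary payload from "N" or "N/M".
/// `trailing_pad` is true for trkn (8-byte payload), false for disk (6 bytes).
fn encode_pair(text: &str, trailing_pad: bool) -> Option<Vec<u8>> {
    let (number, total) = match text.split_once('/') {
        Some((n, m)) => (n.trim().parse::<u16>().ok()?, m.trim().parse::<u16>().ok()?),
        None => (text.trim().parse::<u16>().ok()?, 0),
    };
    let mut out = vec![0u8, 0]; // reserved bytes
    out.extend_from_slice(&number.to_be_bytes());
    out.extend_from_slice(&total.to_be_bytes());
    if trailing_pad {
        out.extend_from_slice(&[0, 0]);
    }
    Some(out)
}
```

In this sketch a zero total is treated as "no total", which is what lets a plain "N" survive the round trip unchanged.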
Lossy edges
- A text freeform atom under a mean other than com.apple.iTunes is re-emitted with the com.apple.iTunes mean (the scan keys text freeform by name only). Binary freeform atoms keep their mean via the ----:<mean>:<name> key.
- Binary ilst atoms outside the handled set (trkn/disk, the tmpo/cpil/pgap integer atoms, and ---- freeform) are dropped at scan time, since they are not re-emitted on synthesis.
- covr ingestion accepts only JPEG (type 13) and PNG (type 14) artwork; other type codes are skipped. MP4 has no picture-type or description fields: scanned art becomes "front cover" with an empty description, and any non-PNG stored art is emitted with the JPEG type code.
- A covr image or binary ---- value larger than its size cap is skipped at scan time, before the image is materialized out of a potentially large moov, and logged (a warn line on stderr) so the lossy drop is explained rather than silent.
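The artwork type-code rule above is asymmetric between scan and synthesis, which a small sketch may make clearer. The ArtKind enum and both function names are hypothetical stand-ins, not the crate's types; only the codes 13 (JPEG) and 14 (PNG) and the fallback-to-JPEG rule come from the text.

```rust
/// Hypothetical image kinds; `Other` stands for stored art that is
/// neither JPEG nor PNG (for example, art scanned from another format).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ArtKind {
    Jpeg,
    Png,
    Other,
}

const TYPE_JPEG: u32 = 13;
const TYPE_PNG: u32 = 14;

/// Scan side: only JPEG (13) and PNG (14) data boxes are ingested;
/// any other type code is skipped (None).
fn ingest_kind(type_code: u32) -> Option<ArtKind> {
    match type_code {
        TYPE_JPEG => Some(ArtKind::Jpeg),
        TYPE_PNG => Some(ArtKind::Png),
        _ => None,
    }
}

/// Synthesis side: PNG keeps its code; every non-PNG image is
/// emitted with the JPEG type code.
fn emit_type_code(kind: ArtKind) -> u32 {
    match kind {
        ArtKind::Png => TYPE_PNG,
        _ => TYPE_JPEG,
    }
}
```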
How synthesis works
mp4::synthesize_layout (musefs-format/src/mp4.rs) regenerates the moov
box and serves [ftyp][regenerated moov][mdat header][mdat payload]:
offset 0
┌──────────────────────────────────────────────┐ ┐
│ █ ftyp, copied verbatim (Inline) │ │
│ █ moov: kept structural children, (Inline) │ │ regenerated
│ █ stco/co64 offset values += Δ │ │ front
│ █ fresh udta/meta/ilst framing (Inline) │ │
│ █ ---- framing + ▒ freeform body (BinaryTag) │ │
│ █ covr framing + ▒ image bytes (ArtImage) │ │
│ █ mdat header (Inline) │ │
├──────────────────────────────────────────────┤ ┘
│ ░ mdat payload, verbatim (BackingAudio) │
└──────────────────────────────────────────────┘
EOF █ inline-generated ▒ DB-streamed ░ untouched backing
Δ = new mdat payload offset − old
- The scan keeps moov's structural children and drops its old udta. A fresh udta/meta/ilst is built from the DB: inline box framing, with each opaque ---- value and each cover image spliced in as streamed BinaryTag/ArtImage segments. Every enclosing box size accounts for the streamed lengths, so the spliced bytes land exactly where the sizes say.
- The mdat payload is served verbatim (BackingAudio), merely relocated: every chunk offset in stco (32-bit) or co64 (64-bit) shifts by one constant delta. Only offset values are patched, never box sizes, so the new moov size is computable before the delta is known and there is no circular dependency. A 32-bit stco offset that would overflow fails synthesis rather than producing a corrupt offset.
- A moov that sits after mdat (common for faststart-less files) is handled by a streaming reader that skips the mdat payload, so the potentially hundreds-of-MB payload is never read at resolve time.
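The relocation arithmetic can be sketched in a few lines. This is an illustration under stated assumptions, not the body of mp4::synthesize_layout: relocation_delta and patch_stco are hypothetical names, the error is a plain String, and offsets are passed as a slice rather than patched inside a box.

```rust
use std::convert::TryFrom;

/// Delta = new mdat payload offset - old mdat payload offset.
/// The new payload starts right after [ftyp][regenerated moov][mdat header].
fn relocation_delta(
    ftyp_len: u64,
    new_moov_len: u64,
    mdat_header_len: u64,
    old_payload_offset: u64,
) -> i64 {
    let new_payload_offset = ftyp_len + new_moov_len + mdat_header_len;
    new_payload_offset as i64 - old_payload_offset as i64
}

/// Shift every 32-bit stco chunk offset by `delta`. Any result outside
/// the u32 range is reported as an error instead of wrapping.
fn patch_stco(offsets: &[u32], delta: i64) -> Result<Vec<u32>, String> {
    offsets
        .iter()
        .map(|&off| {
            i64::from(off)
                .checked_add(delta)
                .and_then(|shifted| u32::try_from(shifted).ok())
                .ok_or_else(|| format!("stco offset {} shifted by {} does not fit in 32 bits", off, delta))
        })
        .collect()
}
```

Because the patch rewrites only fixed-width offset values, the moov length fed into relocation_delta does not change when the offsets are patched, which is why the delta can be computed in a single pass.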
Quirks & invariants
- The structural metadata read at resolve time is capped (MAX_MP4_METADATA_BYTES, 256 MiB); a file declaring more is refused with a controlled error instead of ballooning memory.
- MP4 box sizes are 32-bit: oversized synthesized metadata (e.g. enormous art) fails with TooLarge at the format boundary rather than emitting a truncated size field.
- Byte-identical audio and structural validity are asserted by musefs-format/tests/proptest_mp4.rs, an offset-patching oracle test (mp4_oracle.rs), and the mutagen interop suite (musefs-core/tests/interop_emit.rs).
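The 32-bit size check can be illustrated with a box-header builder that refuses to truncate. The TooLarge struct here is a hypothetical stand-in for the crate's error of the same name, and box_header is an assumed helper; the sketch only shows the idea of failing at the boundary instead of writing a wrapped size.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for the crate's TooLarge error; carries the
/// length that did not fit.
#[derive(Debug, PartialEq, Eq)]
struct TooLarge(u64);

/// Build an 8-byte box header: big-endian u32 size, then the fourcc.
/// `total_len` is the whole box length, including the header and any
/// streamed bytes that follow the inline framing.
fn box_header(fourcc: [u8; 4], total_len: u64) -> Result<[u8; 8], TooLarge> {
    // Fail instead of silently truncating to the low 32 bits.
    let size = u32::try_from(total_len).map_err(|_| TooLarge(total_len))?;
    let mut header = [0u8; 8];
    header[..4].copy_from_slice(&size.to_be_bytes());
    header[4..].copy_from_slice(&fourcc);
    Ok(header)
}
```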