JPEG XL

Info

rules 57
github 35276
reddit 647

JPEG XL

tools 4225
website 1655
adoption 20712
image-compression-forum 0

General chat

welcome 3810
introduce-yourself 291
color 1414
photography 3435
other-codecs 23765
on-topic 24923
off-topic 22701

Voice Channels

General 2147

Archived

bot-spam 4380

jxl

Anything JPEG XL related

_wb_
2022-05-19 08:45:20
So gcc bug?
veluca
2022-05-19 08:45:23
you could perhaps try compiling with `-mpreferred-stack-boundary=5` ?
Traneptora
2022-05-19 08:48:23
trying with ` -mpreferred-stack-boundary=5 -fno-ipa-stack-alignment`
2022-05-19 08:49:30
what about `-mstackrealign`
veluca
2022-05-19 08:51:13
worth trying
Traneptora
2022-05-19 08:52:03
```cc1: error: '-mpreferred-stack-boundary=5' is not between 3 and 4```
2022-05-19 08:52:06
<:Madge:859968580996956200>
veluca
2022-05-19 08:52:17
eh
2022-05-19 08:52:28
well then.
2022-05-19 08:52:34
what gcc is that?
Traneptora
2022-05-19 08:55:30
x86_64-w64-mingw32-g++ 11.2.0
2022-05-19 08:56:06
... oh?
2022-05-19 08:56:08
` -mavx256-split-unaligned-load -mavx256-split-unaligned-store`
2022-05-19 08:56:33
``` -mavx256-split-unaligned-load -mavx256-split-unaligned-store Split 32-byte AVX unaligned load and store. ```
2022-05-19 08:56:36
definitely wroth testing.
2022-05-19 08:57:22
If this does what I think it does, it splits load and store into two loads and stores if it's unaligned
veluca
2022-05-19 09:02:20
maybe worth a try... but I doubt it
2022-05-19 09:02:29
the compiler thinks it's aligned after all
Traneptora
2022-05-19 09:10:05
it still crashes in wine, yea
2022-05-19 09:15:26
is there a way to force 32-byte alignment inside highway?
veluca
2022-05-19 09:16:51
alignment *for what*?
2022-05-19 09:16:59
for sure not for compiler-generated spills...
2022-05-19 09:17:19
this is pretty much a mingw / gcc bug, I'd say
Traneptora
2022-05-19 09:17:58
can we work around it?
veluca this is pretty much a mingw / gcc bug, I'd say
2022-05-19 09:20:30
https://github.com/msys2/MSYS2-packages/issues/1209#issuecomment-379547589
2022-05-19 09:20:34
known, apparently
veluca
2022-05-19 09:23:19
well, it's at least not a msys2 problem
2022-05-19 09:24:00
and I think the issue ppl found was something else, since it's not from code built with gcc at all
Traneptora
veluca ok, then can you do 2 things... (a) print the pointer ๐Ÿ˜„ (b) switch to impl 2
2022-05-19 09:59:21
would it be better to define `HWY_LOADUP_ASM` or to switch to impl2
veluca
2022-05-19 10:00:20
mhhh given that your code gets miscompiled in entirely new and exciting ways I'm not sure it would be incredibly useful to do either...
Traneptora
2022-05-19 10:00:36
what does defining `HWY_LOADUP_ASM` do?
veluca
2022-05-19 11:04:15
it uses assembly directly (the inline asm thingy there)
2022-05-19 11:04:29
why does that exist? unclear, I couldn't find anything that defines that flag to true
Traneptora
2022-05-20 12:05:35
defining it manually didn't seem to fix anything
veluca
2022-05-20 12:09:31
you need to define it to 1... but yeah, I doubt it'll change much
Traneptora
2022-05-20 12:09:37
that's what I meant
2022-05-20 12:09:52
`-DHWY_LOADUP_ASM=1`
veluca
2022-05-20 12:09:56
you'd have to compile with a different compiler, or at least one that isn't as broken as mingw-gcc
Traneptora
2022-05-20 12:10:49
in unrelated news, what's the difference between lcms2 and skcms?
veluca
2022-05-20 12:10:51
(_mm256_broadcast_ps compiles to vbroadcastf128 anyway)
Traneptora
2022-05-20 12:10:57
practical difference I mean
2022-05-20 12:11:09
why use one or the other
veluca
Traneptora in unrelated news, what's the difference between lcms2 and skcms?
2022-05-20 12:11:09
as far as I understand, lcms2 is slower but supports more features
Traneptora
2022-05-20 12:11:14
ah
2022-05-20 12:11:42
in that case, is there a way to use skcms if it supports what we want it to do, but lcms2 if skcms does not?
veluca
2022-05-20 12:11:56
I think skcms doesn't support converting *to* CMYK or to LUT-based profiles, and probably doesn't support some of the more uncommon rendering intents
Traneptora in that case, is there a way to use skcms if it supports what we want it to do, but lcms2 if skcms does not?
2022-05-20 12:12:26
we actually discussed this today xD not yet, but we're thinking about it
Hello71
2022-05-23 05:45:52
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412#c40 says it can be worked around with -muse-unaligned-vector-move
2022-05-23 05:46:01
which is an absolutely terrible hack
2022-05-23 05:47:02
note that this would be CXXFLAGS+=' -Wa,-muse-unaligned-vector-move'
veluca
2022-05-23 06:11:52
Reported: 2012-08-30 01:10 UTC by R Copley that's not very reassuring
Traneptora
2022-05-24 04:59:44
It does what you want though, it produces `vmoveups` instead of `vmoveaps` and similar
veluca
2022-05-24 06:28:49
well yes, but *always*
Hello71
2022-05-24 06:40:28
is that bad
Traneptora
2022-05-24 06:41:55
I don't believe there's a major performance difference between these two instructions tbh
veluca
2022-05-24 06:43:18
that's true on new enough CPUs
Hello71
2022-05-24 06:44:15
well it's `vmovups` and `vmovaps`, not `vmoveups` and `vmoveaps`, but setting that aside, https://stackoverflow.com/questions/20259694/is-the-sse-unaligned-load-intrinsic-any-slower-than-the-aligned-load-intrinsic-o claims that they're about the same, and https://stackoverflow.com/questions/52147378/choice-between-aligned-vs-unaligned-x86-simd-instructions claims that "On pre-Nehalem and Bonnell and pre-Bulldozer: [...] movups/vmovups can be up to twice as slow as movaps/vmovaps"
Traneptora
Hello71 well it's `vmovups` and `vmovaps`, not `vmoveups` and `vmoveaps`, but setting that aside, https://stackoverflow.com/questions/20259694/is-the-sse-unaligned-load-intrinsic-any-slower-than-the-aligned-load-intrinsic-o claims that they're about the same, and https://stackoverflow.com/questions/52147378/choice-between-aligned-vs-unaligned-x86-simd-instructions claims that "On pre-Nehalem and Bonnell and pre-Bulldozer: [...] movups/vmovups can be up to twice as slow as movaps/vmovaps"
2022-05-24 06:45:08
none of those support avx2 so I don't believe it will matter
Hello71
2022-05-24 06:45:27
yes, i don't understand why they wrote movups/vmovups
2022-05-24 06:45:36
it should only apply to movups vs movaps
Traneptora
2022-05-24 06:46:19
and `-muse-unaligned-vector-move` should only affect vmov[au]ps, not the nonvector version
veluca
2022-05-24 06:48:21
iirc v is for vex-encoded, not for vector
Traneptora
2022-05-24 06:48:26
oh, nvm
Hello71
2022-05-24 07:08:16
i thought v was for aVx ๐Ÿคท
veluca
2022-05-24 07:18:14
it might
2022-05-24 07:18:41
IIRC the instruction encoding prefix is called VEX for AVX and EVEX for AVX512+
Traneptora
2022-05-24 07:49:17
VEX is Vector Extensions iirc
2022-05-26 02:49:42
how hard would it be to implement the code path in https://github.com/libjxl/libjxl/pull/1366
2022-05-26 02:49:49
because it seems like it's going nowhere
2022-05-26 02:49:58
and I'm interested in possibly taking a look at it
_wb_
2022-05-26 04:06:02
Probably the easiest way to do it would be to call the cms in the final writing stage of the render pipeline, just after interleaving RGB but before converting it to integers.
2022-05-26 04:06:43
Only tricky bit will be to get gray and cmyk right too, but can start with rgb...
Traneptora
_wb_ Probably the easiest way to do it would be to call the cms in the final writing stage of the render pipeline, just after interleaving RGB but before converting it to integers.
2022-05-26 05:29:41
does the CMS take packed and not planar RGB?
2022-05-26 05:31:49
oh right, ugh, the project is in C++
2022-05-26 05:31:53
and I don't know C++, I only know C
2022-05-26 05:32:14
I'll be taking a class this summer on C++ so I guess I'll look at it after that
_wb_
2022-05-26 06:06:31
CMS takes an interleaved row
veluca
2022-05-26 11:54:03
one of the harder parts (I think) will be figuring out when you don't *need* the CMS
2022-05-26 11:54:41
and also developing this in a somewhat long-term maintanable way
fab
2022-05-27 12:46:24
https://it.wikipedia.org/wiki/JPEG_XL
Traneptora
Hello71 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412#c40 says it can be worked around with -muse-unaligned-vector-move
2022-05-27 06:16:56
I just tested this and it works
2022-05-27 06:17:01
i.e. it prevents the crash
veluca
2022-05-27 06:51:00
good
2022-05-27 06:51:09
ideally though gcc would fix their bugs ๐Ÿ˜›
Traneptora
2022-05-27 07:22:43
yea ikr
BlueSwordM
2022-05-27 09:50:14
I think there is another way to fix this.
2022-05-27 09:50:22
Use mingw based on Clang instead of GCC.
Kleis Auke
BlueSwordM Use mingw based on Clang instead of GCC.
2022-05-28 12:35:24
I also recommend this. In case anyone is looking for a way to do this with Docker: https://discord.com/channels/794206087879852103/848189884614705192/934838441647173812
fab
2022-05-28 01:58:20
avif is ugly
2022-05-28 01:58:20
2022-05-28 01:58:42
86 kb image
2022-05-28 01:59:57
you can't really say it is at jxl level
yurume
2022-05-28 02:00:36
what on the earth is it
fab
2022-05-28 02:01:14
2022-05-28 02:01:48
now is woking
yurume
2022-05-28 02:01:58
is it corrupted? looks fine to me (via my Firefox)
fab
2022-05-28 02:02:11
is a windows seven error
2022-05-28 02:02:28
or because there is jpg extension before avif
2022-05-28 02:02:49
why mozilla on pages that uses avif on my windows seven shows unusable images
2022-05-28 02:02:51
always
2022-05-28 02:02:54
and always
2022-05-28 02:03:59
source
2022-05-28 02:03:59
Traneptora
fab
2022-05-28 02:26:28
win7 shows the dimensions as incorrect too
2022-05-28 02:26:41
since it's vertical
fab
2022-05-28 02:27:12
the esolution was high
Traneptora
2022-05-28 02:27:25
oh nvm there's a scrollbar there
2022-05-28 02:27:34
but yea that looks like a decoder error to me
Yari-nyan
fab
2022-05-29 05:53:04
that's kind of amazing
Traneptora
2022-05-31 07:10:56
does libjxl populate `JxlColorEncoding->white_point_xy` even when the enum is not set to `JXL_WHITE_POINT_CUSTOM`? for the decoder API, I mean.
2022-05-31 07:14:46
looks like it, in `void ConvertInternalToExternalColorEncoding` but this is not documented
2022-05-31 08:13:18
also, another question
2022-05-31 08:24:11
do all Jpeg Reconstructions have `info.uses_original_profile` set to true?
Traneptora does libjxl populate `JxlColorEncoding->white_point_xy` even when the enum is not set to `JXL_WHITE_POINT_CUSTOM`? for the decoder API, I mean.
2022-05-31 08:24:36
also same with the corresponding primaries
veluca
Traneptora do all Jpeg Reconstructions have `info.uses_original_profile` set to true?
2022-05-31 08:25:11
I think so
_wb_
Traneptora do all Jpeg Reconstructions have `info.uses_original_profile` set to true?
2022-05-31 08:25:36
Yes, because XYB jpegs are not a thing in the wild
Traneptora
2022-05-31 08:26:17
I'm working on my wrapper in FFmpeg to make colors more correct, which is why I ask about `white_point_xy` btw
2022-05-31 08:27:14
rather than have a big switch statement that maps combinations of enums to the corresponding FFmpeg enums, I figure it's simpler to take the actual values and feed it to a function in `libavutil/csp.h` which outputs FFmpeg's enum based on the values
2022-05-31 08:30:38
but I was wondering if those values were correct even if libjxl uses enum values
2022-05-31 08:31:15
it appears that is how it currently works but it's unclear to me that it will always stay that way, given the documentation doesn't specify it
2022-05-31 08:31:21
should I send a PR that documents this more explicitly?
yurume
2022-05-31 08:35:01
is the `libjxl/conformance` repo meant to be a part of ISO/IEC 18181-3 (conformance testing)? or will be there a separate conformance test suite?
_wb_
Traneptora should I send a PR that documents this more explicitly?
2022-05-31 08:38:32
Yes, I think this is useful behavior and we should document it and keep doing it :)
yurume is the `libjxl/conformance` repo meant to be a part of ISO/IEC 18181-3 (conformance testing)? or will be there a separate conformance test suite?
2022-05-31 08:39:12
Yes, basically part 3 is just a snapshot of that repo with some boilerplate text
yurume
2022-05-31 08:40:23
the repo looked like a handful number of very high-level integration tests right now
_wb_
2022-05-31 08:40:25
And we should probably add way more conformance bitstreams but this seemed to be a reasonable start, if a decoder can handle all that stuff, it probably handles most things in the spec, at least
yurume
2022-05-31 08:41:08
I'm starting to think about making some sort of test suite, at least for jxsml
_wb_
2022-05-31 08:41:15
That would be great
2022-05-31 08:42:24
Feel free to add stuff to the conformance repo too, we can't really have too many tests there
yurume
2022-05-31 08:43:48
I'm not sure how to design the test suite, especially what to compare against, that's why I'm holding that idea right now, but that's something I love to have
_wb_
2022-05-31 08:43:58
Especially some set of smaller tests that are just a small bitstream encoding a small image that checks one or a small number of coding tools
yurume
2022-05-31 08:44:22
yeah, that kind of things
2022-05-31 08:44:44
right now that information is deeply embedded in libjxl unit tests and can't be easily extracted
_wb_
2022-05-31 08:45:54
Well currently we just used libjxl to produce the reference decodes, but of course if we hit a case where your decoder fails something, we'll have to check who is correct: libjxl, jxsml, and/or the spec :)
Traneptora
2022-05-31 09:08:07
did spot ever get fixed?
2022-05-31 09:08:14
it's a level 10 codestream by itself without a container
2022-06-03 06:42:32
Is there an updated draft of the spec?
2022-06-03 06:42:39
with the bugs that yurume found fixed I mean
yurume
2022-06-03 06:43:48
I think wb did a round of fixups weeks ago but I still keep finding bugs as I implement it further so there is not exactly a fully updated spec with no known bugs
Traneptora
2022-06-03 06:44:33
I'm planning on writing a rudimentary ANS decoder
2022-06-03 06:44:42
this is why I ask
2022-06-03 06:45:03
but I don't want to start on it if there's lots of bugs
yurume
2022-06-03 06:46:17
so essentially the full entropy decoder? (which would involve Brotli huffman trees and stuffs)
2022-06-03 06:47:16
maybe I can paraphrase it in the way I understand (and hopefully correctly) to guide other implementations
2022-06-03 06:48:23
I think that part of the spec has been sorted out in the past round anyway, so an updated spec might be still enough for you (I'll DM you soon)
Arcane
2022-06-03 06:48:34
yurume
2022-06-03 06:48:50
(click miss...)
Traneptora
yurume so essentially the full entropy decoder? (which would involve Brotli huffman trees and stuffs)
2022-06-03 06:49:26
just ANS. for example, the spec draft says this:
2022-06-03 06:49:41
> the decoder reads permutation (F.3.2) from a single entropy coded stream with 8 clustered distributions, as specified in C.2, with size equal to the number of TOC entries and skip = 0.
2022-06-03 06:50:01
where Section C.2 is ANS and Prefix Coding
yurume
2022-06-03 06:50:25
ah, that still requires you to implement everything in Section C
2022-06-03 06:52:29
the entropy coding is either done by ANS or Huffman, both requiring you to decode the frequency distribution and cluster/context map, and that decoding uses another level of compression which can be also done by either ANS or Huffman
2022-06-03 06:53:24
so even though the final code itself is ANS, the metadata needed for constructing that code might use Huffman and vice versa
Traneptora
2022-06-03 06:56:42
does that mean you might need Brotli too?
yurume
2022-06-03 06:57:03
the tree encoding of Brotli, not the entirety of it, but yeah
Traneptora
2022-06-03 06:57:11
yuck, so that's a lot to implement
yurume
2022-06-03 06:57:17
indeed
2022-06-03 06:57:50
I had to implement it when I finished the image header and then have to go through an (compressed) ICC profile
Traneptora
2022-06-03 06:58:14
do you know of any shortcut if I only want to determine the length, rather than the contents?
yurume
2022-06-03 06:58:16
it turned out that implementing entropy codes that early was a good decision
2022-06-03 06:58:35
which length do you mean?
Traneptora
2022-06-03 06:58:43
the length of the encoded stream, so I can skip over it
2022-06-03 06:59:01
for example, if I want to skip to the FrameHeader of a JXL codestream that contains an ICC Profile
2022-06-03 06:59:23
I'd need to know the compressed length of the ICC profile, which is not signaled by the codestream
yurume
2022-06-03 06:59:50
I think it is impossible, you at least have to decode all the symbols (the ICC codec then uses those symbols, which you don't have to implement if you just want to skip them)
Traneptora
2022-06-03 07:00:13
and to decode all those symbols would I need to implement basically all of section C?
yurume
2022-06-03 07:00:41
yeah. I think (but am not sure) that there is technically an index to all the frames, in the `jxli` box in the container, but if you don't have a container it's no use
Traneptora
2022-06-03 07:00:58
Yea, this is for a raw codestream
2022-06-03 07:01:17
and the jxli box is frequently not present
yurume
2022-06-03 07:02:11
it's annoying that the ICC stream is not skipable but placed before any frame and/or TOC
Traneptora
2022-06-03 07:02:29
Yea, that's what I'm trying to work around
yurume
2022-06-03 07:03:10
back in early this year I thought it would be neat to find out the exact amount of data you need to render a certain pass
2022-06-03 07:03:32
so that you can "fake" the progressive rendering
Traneptora
2022-06-03 07:04:29
so what actually do I need to implement to be able to decode (or rather skip) over the ICC profile? It looks like I have to implement section C.2 but section C.1 to me is kind of unclear, it just says "see these RFCs" which is a little bit eh?
yurume
2022-06-03 07:04:59
C.1 refers to certain sections in RFC 7932 (Brotli), which you also need to implement
2022-06-03 07:05:36
specifically you need the section 3 of RFC 7932 for tree coding
Traneptora
2022-06-03 07:05:56
blech, this is gonna take a bunch of time
2022-06-03 07:06:01
I'm not used to reading specs either
yurume
2022-06-03 07:06:08
let me look at jxsml to list exactly what are needed
2022-06-03 07:09:58
- cluster/context map decoding (C.2.5); this can recursively decode another entropy stream - canonical Huffman tree decoding (C.1, referring to RFC 7932 Sec. 3) - rANS distribution decoding (C.2.4) - rANS alias table reconstruction (C.2.2) - hybrid integer coding parameter decoding (C.2.7) - hybrid integer coding aka `DecodeHybridVarLenUint` (C.2.3 + C.2.6)
2022-06-03 07:10:27
to my knowledge you can't even simplify hybrid integer coding because it can get used by C.2.5
2022-06-03 07:11:35
feel free to check out my jxsml (which is public domain btw) if you feel fine with that: https://gist.github.com/lifthrasiir/137a3bf550f5a8408d4d9450d026101f
2022-06-03 07:12:23
I believe it at least got entropy coding right
Traneptora
2022-06-03 07:20:41
I can't use your code cause the goal is to implement this in FFmpeg
2022-06-03 07:20:47
which requires me to copyright it and LGPL it
2022-06-03 07:20:58
while *you* may be fine with integrating public domain code, they won't
2022-06-03 07:21:10
since I don't believe I can copyright code I didn't write
2022-06-03 07:21:32
this will take a bunch of time, I see
yurume
2022-06-03 07:24:16
Well I will almost surely put it in CC0, which I believe is compatible with LGPL without requiring me to do anything else
_wb_
2022-06-03 07:40:00
You can also just be listed as one of the authors in ffmpeg, and since you wrote the code you can license it lgpl in ffmpeg and cc0 in your own project if you want
2022-06-03 07:41:04
With hindsight, the non-skippable icc profiles and entropy coded TOC permutation was a bit silly, we should have made that stuff parseable without implementing real entropy decode
2022-06-03 07:41:12
Too late now though
2022-06-03 07:42:26
It's one of the collateral damages caused by our urge to have minimalistic headers, so no redundant size fields or missed opportunity to entropy code some header data...
Traneptora
2022-06-03 07:44:38
is there a maximum number of `num_dist`?
2022-06-03 07:44:46
like 256 or something?
2022-06-03 07:44:53
or is there no cap
_wb_
2022-06-03 07:45:13
For Exif/XMP I did consider parsing-only applications (like exiftool/exiv2), which is why those can be found by just parsing isobmf (though we did add brob, which complicates stuff)... But for the codestream itself, I hadn't considered the need for no-decode parsing
2022-06-03 07:45:38
Uh, is that the one before or after clustering?
Traneptora
2022-06-03 07:45:54
`num_dist` is the number of clustered distributions (before clustering)
2022-06-03 07:46:03
I know the number of clusters is capped at 256
_wb_
2022-06-03 07:46:17
Before clustering isn't really capped
Traneptora
2022-06-03 07:46:36
is there a max that is actually used though? something low like 4096
_wb_
2022-06-03 07:46:42
Nothing low
Traneptora
2022-06-03 07:46:46
yuck
_wb_
2022-06-03 07:47:01
AC uses quite a lot of pre-clustering contexts
Traneptora
2022-06-03 07:47:01
I'm trying to stack allocate an array of clusters without reading num_dist
2022-06-03 07:47:11
but looks like I gotta heap allocate it after reading num_dist
_wb_
2022-06-03 07:47:19
And so can modular when it has a large enough tree
yurume
2022-06-03 07:47:35
after clustering it *is* capped to 256, this is one thing the spec was unclear about
_wb_
2022-06-03 07:47:48
Yeah, after clustering it is capped
Traneptora
2022-06-03 07:47:48
the spec is clear to me about that
_wb_
2022-06-03 07:48:07
But the cluster mapping itself can be large
yurume
Traneptora the spec is clear to me about that
2022-06-03 07:49:01
was it? I was confused about that for a while until I found that inverse MTF implicitly caps it to 256 anyway.
Traneptora
2022-06-03 07:49:19
it states it explicitly, yea
2022-06-03 07:49:20
> All integers in [0, num_clusters) are present in this array, and num_clusters < 256. Position i in this array indicates that context i is merged into the corresponding cluster.
_wb_
2022-06-03 07:49:32
What version is that?
yurume
2022-06-03 07:49:34
ah, you are looking at the fixed version ๐Ÿ™‚
_wb_
2022-06-03 07:49:43
The clarified version:)
Traneptora
2022-06-03 07:49:54
yea, an updated draft
yurume
_wb_ With hindsight, the non-skippable icc profiles and entropy coded TOC permutation was a bit silly, we should have made that stuff parseable without implementing real entropy decode
2022-06-03 07:50:36
maybe we are still able to partially fix it by extending the image header bundle
2022-06-03 07:51:39
say, if the extension bit 0 is set, the next xx bits are the ICC size and it is guaranteed that no TOC permutation is in use
_wb_
2022-06-03 07:51:41
Well not really unless we invalidate existing bitstreams that don't use that extension but have an icc profile or toc permutation
yurume
2022-06-03 07:53:03
I think this is mostly for the optimization (that wants to read groups as soon as possible) or optimistic partial decoders (it suffices for them to work for majority of bitstreams, can fall back otherwise)
_wb_
2022-06-03 07:53:07
One thing we could check though is what parts of entropy coding are actually used for icc compression and toc permutations
Traneptora
2022-06-03 07:53:34
I asked yurume earlier, apparently basically all of it
_wb_
2022-06-03 07:53:44
Theoretically, yes
yurume
2022-06-03 07:53:53
what bits are actually used in libjxl, I assume
Traneptora
2022-06-03 07:53:59
ah but all JXL files atm are produced by libjxl or fjxl atm
_wb_
2022-06-03 07:54:14
But maybe some of it doesn't end up getting used in existing bitstreams, and then we could add restrictions in the spec
2022-06-03 08:00:24
Then again if there is a short implementation of the entropy coding stuff given by jxsml, that can be used in parsing-only applications, maybe that's good enough
Traneptora
2022-06-03 08:00:33
for example, do MTF-clustered distributions end up used?
_wb_
2022-06-03 08:00:57
Skipping over icc/toc will of course be faster, but it's not like it's typically a large amount of stuff to entropy decode
2022-06-03 08:01:12
Only for CMYK profiles can it get large
2022-06-03 08:01:48
In the more typical case of RGB profiles, it's typically a few kb, and probably less than that to entropy decode
2022-06-03 08:03:10
Same for TOC permutations: typically not used, and if used, it's data proportional to the number of groups only, i.e. at most 16 symbols per megapixel
Traneptora
2022-06-03 08:03:50
I was under the impression that TOC permutations were going to be very common
_wb_
2022-06-03 08:03:58
Maybe
2022-06-03 08:04:08
They're not done by default atm
2022-06-03 08:04:29
It's useful though, so that could change
Traneptora
2022-06-03 08:04:34
interesting
_wb_
2022-06-03 08:05:13
But a parsing application might want to decode the TOC anyway, including permutation info
Traneptora
2022-06-03 08:05:22
well you'd have to
2022-06-03 08:05:28
since the actual blocks are after the TOC
_wb_
2022-06-03 08:05:41
I mean decode, not just parse/skip
Traneptora
2022-06-03 08:06:12
I care about the TOC because it lets me find the end of the codestream
2022-06-03 08:06:46
by adding the lengths together
2022-06-03 08:06:53
the actual permutation doesn't affect that
_wb_
2022-06-03 08:06:55
Because you may want to know where different pixel regions are located in the codestream, e.g. for region of interest decode
Traneptora
2022-06-03 08:07:00
possibly
2022-06-03 08:07:07
but for me that's not what I need, I need something more basic
yurume
2022-06-03 08:07:14
ah that makes sense
_wb_
2022-06-03 08:07:26
Yeay to just find the end you only need the toc without the permutation
yurume
2022-06-03 08:07:45
so you want something like `jxli` but that always works even when it is not present
Traneptora
2022-06-03 08:17:44
what is the context `ctx` in `DecodeHybridVarLenUint(ctx)`
yurume
2022-06-03 08:18:03
it is supplied by the caller
Traneptora
2022-06-03 08:18:11
is it an integer?
yurume
2022-06-03 08:18:17
yes, non-negative one (and also `< num_dist`)
Traneptora
2022-06-03 08:18:44
> For each pre-clustered distribution i, the decoder reads an integer as specified in C.2.6.
2022-06-03 08:19:19
Does this mean that for each pre-clustered distribution `i` it populates `clusters[i]` with `DecodeHybridVarLenUint(i)`?
yurume
2022-06-03 08:20:04
to my knowledge the context is always 0, because it has already read a single clustered distribution (i.e. `num_dist` is 1)
2022-06-03 08:20:10
but yeah that needs to be clarified
Traneptora
2022-06-03 08:20:56
when did it read that single clustered distribution?
yurume
2022-06-03 08:21:17
> The decoder then initializes a symbol decoder using a single distribution D, as specified in C.2.1.
2022-06-03 08:21:43
ah wait
Traneptora
2022-06-03 08:21:44
but I'm implementing section 2.5, which is required by section 2.1
2022-06-03 08:22:22
where do I get the various `configs[clusters[ctx]]` that I need?
2022-06-03 08:22:28
I don't recall it telling me to initiatlize a config
yurume
2022-06-03 08:23:24
I'm not sure if my implementation works correctly for those nested ones, but if I'm correct 2.5 recursively invokes 2.1
2022-06-03 08:23:49
but the nesting ends there because 2.5 is only used when 2+ pre-clustering distributions are in use
Traneptora
2022-06-03 08:24:22
This is in section 2.1: > The decoder then reads the mapping clusters of the `num_dist` distributions into `num_clusters` clusters as specified in C.2.5. It then reads `use_prefix_code` as a Bool(). If `use_prefix_code` is false, the decoder sets `log_alphabet_size` to `5 + u(2)`; otherwise, it sets `log_alphabet_size` to 15. Then for each post-clustered distribution, in increasing order of index i in `[0, num_clusters - 1)`, the decoder reads `configs[i] = HybridUintConfig(log_alphabet_size)` as specified in C.2.7.
2022-06-03 08:24:29
So you're saying C2.5 invokes 2.1 again
yurume
2022-06-03 08:24:33
correct
Traneptora
2022-06-03 08:24:56
argh
2022-06-03 08:24:59
this is getting more complicated
yurume
2022-06-03 08:25:18
yeah, no doubt why the corresponding spec was pretty buggy
Traneptora
2022-06-03 08:25:36
> If num_dist == 1, then num_clusters = 1 and clusters[0] = 0, and the remainder of this subclause > is skipped. Eventually you have to hit this case, or it recurses forever.
2022-06-03 08:37:28
in `DecodeHybridVarLenUint` where is `window` defined?
2022-06-03 08:38:08
likewise where is `copy_pos` defined?
2022-06-03 08:38:15
it appears they aren't
yurume
2022-06-03 08:38:17
it's not defined, and one of the reasons I proposed a restructuring
Traneptora
2022-06-03 08:38:43
the first thing you do is a read from that array, not a write
yurume
2022-06-03 08:39:12
`window` is implicitly a `1<<20`-element array of output symbols, while `copy_pos` is undefined until LZ77 is in active (`num_to_copy > 0`)
2022-06-03 08:40:03
in the actual implementation I guess `window` should be flexible, since not every stream has more than 2^20 symbols
Traneptora
2022-06-03 08:40:26
is `window` the LZ77 sliding window of backreference buffer?
yurume
2022-06-03 08:40:45
yes, that's the reason that it can be initially undefined just fine
Traneptora
2022-06-03 08:40:55
and in this case is it a revolving window?
2022-06-03 08:41:23
it looks like it's circular so the array doesn't need to be shifted
2022-06-03 08:41:56
`0xFFFFF` is `(1 << 20) - 1` as well, right?
yurume
2022-06-03 08:42:22
sort of, you can implement it with a non-revolving window but implementing it as written in the spec would be simpler
2022-06-03 08:42:28
also yes for `0xFFFFF`
Traneptora
2022-06-03 08:42:42
I mean the spec looks like it's treating the window as a revolving window
2022-06-03 08:43:08
but I see how this works, okay
yurume
2022-06-03 08:43:08
(typo from mine)
Traneptora
2022-06-03 08:43:27
I think I understand how this works now
2022-06-03 08:44:00
initially, `num_to_copy` is zero and if `lz77.enabled && token >= lz77.min_symbol` then it has discovered a length/distance pair
2022-06-03 08:44:08
if it's not greater than that, it's a literal and just copies it as is
yurume
2022-06-03 08:44:23
yup
Traneptora
2022-06-03 08:44:50
so in the event where lz77.enabled is false, the window part could honestly just be skipped altogether
yurume
2022-06-03 08:44:51
the reason why it is structured in that way is that the decoding routine is not active but passively called from the caller
2022-06-03 08:45:12
so a single LZ77 sequence may persist across multiple calls to `DecodeHybridVarLenUint`
Traneptora
2022-06-03 08:45:33
`token = /* Read symbol from entropy coded stream using D[clusters[ctx]] */ ;` in this case, D is the entire distribution read from section 2.1 initialized earlier, right?
yurume
2022-06-03 08:46:38
if `use_prefix_code` is false, `D` is indeed an array of frequencies and read according to C.2.3 (you have to reconstruct an alias table before); if `use_prefix_code` is true it doesn't exactly exist but you instead read a number using a previously decoded Huffman tree
Traneptora
2022-06-03 08:48:00
oh that is not clear at all
2022-06-03 08:48:40
so section 2.3 specifies how to read a symbol from a stream given an already-constructed distribution D
2022-06-03 08:49:01
so to construct that distribution D, you follow the instructions in section 2.1
2022-06-03 08:49:12
part of following those instructions in 2.1 requires you to read the clustering, by invoking 2.5
2022-06-03 08:49:29
and sometimes, invoking 2.5 requires you to read an entire distribution E
2022-06-03 08:50:22
so basically, to do section 2.5, you read an entire distribution called E as described in section 2.1
2022-06-03 08:50:44
then you read symbols from the codestream, using E as the distribution
2022-06-03 08:50:56
that lets you populate the clustering
2022-06-03 08:51:16
and then when 2.5 returns, and then 2.1 returns you end up with a distribution D that is the one you *actually* care about
2022-06-03 08:51:47
I see
yurume
2022-06-03 08:52:10
yeah, it is quite hard to understand at the first glance
Traneptora
2022-06-03 08:53:13
so here's what I have so far in pseudo-code
2022-06-03 09:11:27
```c // assuming here for simplicity that use_prefix_code = false /* section C.2.1 */ JxlSymbolD *read_distribution() { read_lz77_params(); clusters = read_dist_clustering(); read_hybrid_uint_confs(); populate_d_array(); } /* section C.2.2 */ AliasMapping *gen_alias_func(JxlSymbolD *d, int log_alphabet_size); /* section C.2.3 */ int read_from_distribution(JxlSymbolD *d, AliasMapping *mapping); /* section C.2.4 */ void populate_d_array(); /* section C.2.5 */ int *read_dist_clustering() { if (simple) return simple_clustering(); JxlSymbolD *e = read_distribution(); for (int i = 0; i < num_dist; i++) clusters[i] = read_hybrid_integer(e, 0); return clusters; } /* section C.2.6 */ int read_hybrid_integer(JxlSymbolD *d, HybridUintConf *conf, int ctx) { token = read_from_distribution(d); do_lz77_stuff(); return do_hybrid_stuff(token, conf); } /* section C.2.7 */ HybridUintConf *read_hybrid_uint_conf(int log_alphabet_size); ```
2022-06-03 09:11:39
it looks like it works roughly like this
2022-06-03 09:11:41
is this correct?
yurume
2022-06-03 09:12:52
`read_hybrid_integer(e, i)` is missing `ctx == 0`, and you don't read `token` when lz77 sequence is ongoing
Traneptora
2022-06-03 09:13:43
changed that, so now it's `clusters[i] = read_hybrid_integer(e, 0)`
2022-06-03 09:13:59
also w/r/t the lz77 stuff I just put that all under do_lz77_stuff()
yurume
2022-06-03 09:15:19
okay
2022-06-03 09:15:41
`read_hybrid_uint_confs` and `populate_d_array` should be repeated `num_clusters` times as well
Traneptora
2022-06-03 09:15:58
it's populating an array so it's included
2022-06-03 09:16:02
this is psuedo-code
2022-06-03 09:16:11
just trying to get a *general* flowchart here
yurume
2022-06-03 09:16:20
ah good
Traneptora
2022-06-03 09:16:37
one thing I'm still confused about is where is `index` defined in section C.2.3
yurume
2022-06-03 09:16:38
I think `gen_alias_func` should be also called at the very end of `read_distribution`, which is not obvious from the spec
Traneptora
2022-06-03 09:16:58
oh yea it does appear that way, but you are correct it's not obvious
2022-06-03 09:17:03
should be explicit
2022-06-03 09:17:35
oh wait I lied it's defined
yurume
2022-06-03 09:17:43
for C.2.3 `index` is a temporary variable I think?
Traneptora
2022-06-03 09:18:48
hm, so at the end of section 2.1, it says > The decoder reads integers from entropy coded streams as specified in C.2.6.
2022-06-03 09:19:07
what I'm wondering is, what `ctx` does it pass?
yurume
2022-06-03 09:19:12
that sentence is quite confusing indeed
2022-06-03 09:19:46
the decoder *will* read integers by calling `DecodeHybridVarLenUint` at the later moment in decoding
Traneptora
2022-06-03 09:20:21
it says, the decoder will read from this generated D by calling section C.2.6, but where is `ctx` specified
2022-06-03 09:20:34
is it something the decoder must do?
yurume
2022-06-03 09:20:59
uh, the last paragraph of C.2.1 should be considered a summary of what C.2.6 specifies
2022-06-03 09:21:40
more accurately speaking, that paragraph explains `/* Read symbol from entropy coded stream using D[clusters[ctx]] */` in C.2.6
2022-06-03 09:22:02
I don't like that arrangement and that paragraph should go straight into C.2.6
Traneptora
2022-06-03 09:22:26
So when, I see this: > the decoder reads permutation (F.3.2) from a single entropy coded stream with 8 clustered distributions, as specified in C.2, with size equal to the number of TOC entries and skip = 0. what I do is I call section C.2.1 with `num_dist = 8` to generate `D`, and then when I need a symbol I invoke section C.2.3
2022-06-03 09:22:52
assuming `use_prefix_code = false` for simplicity
yurume
2022-06-03 09:23:11
you should invoke C.2.6
Traneptora
2022-06-03 09:23:17
what `ctx` do I pass?
_wb_
2022-06-03 09:23:20
Please remember these unclear passages so they can get clarified in the spec.
yurume
2022-06-03 09:24:09
(I'll try to summarize this discussion into an issue, but I already have a backlog of issues I need to write down)
Traneptora
2022-06-03 09:24:37
what I'm trying to figure out is, when I need a symbol from the stream what ctx do I pass to `DecodeHybridVarLenUint`
yurume
Traneptora So when, I see this: > the decoder reads permutation (F.3.2) from a single entropy coded stream with 8 clustered distributions, as specified in C.2, with size equal to the number of TOC entries and skip = 0. what I do is I call section C.2.1 with `num_dist = 8` to generate `D`, and then when I need a symbol I invoke section C.2.3
2022-06-03 09:25:10
ah wait, I realized that this is a part of TOC
Traneptora
2022-06-03 09:25:17
Yes
yurume
2022-06-03 09:26:58
hmm, nevermind, I recalled a peculiar bit of TOC (namely, F.3.2 gets reused elsewhere in the spec) but that doesn't make the decoding process any different
2022-06-03 09:27:42
so yeah, you do C.2.1 with `num_dist = 8`, read `D`, then go to F.3.2 and whenever you see `DecodeHybridVarLenUint` calls you invoke C.2.6
2022-06-03 09:27:52
and `ctx` will be given as arguments to `DecodeHybridVarLenUint`
Traneptora
2022-06-03 09:28:36
Oh I see, `F.3.2` specifies what to pass to that
2022-06-03 09:32:43
> The decoder reads enc_size as U64(). The decoder reads 41 pre-clustered distributions as specified > in C.2. The decoder then reads enc_size integers as specified in C.2.6 to obtain decompressed bytes
2022-06-03 09:33:04
does that mean in the ICC Profile stream, the alphabet size is `[0, 256)`
2022-06-03 09:34:17
also, I think I finally understand the general structure of this
2022-06-03 09:34:21
but that took some effort
yurume
2022-06-03 09:35:14
they are meant to be bytes, but it's unclear to me if the spec actually caps the maximum symbol
2022-06-03 09:35:54
hybrid integer decoding can technically output an enormous integer which is very likely to be invalid but not guaranteed
Traneptora
2022-06-03 09:37:46
I see
2022-06-03 09:38:01
not that it really matters here if you just assign it to a `uint8_t` and truncate it
yurume
2022-06-03 09:38:04
another item to report!
Traneptora
2022-06-03 09:38:10
oh?
yurume
2022-06-03 09:38:23
I mean, I was filing spec issues for a while
2022-06-03 09:38:35
and my backlog just grew to... uh something like at least 4 issues? ...
Traneptora
2022-06-03 09:38:50
It looks like the TOC permutation is only 8 pre-clustered distributions, and the ICC Profile is 41. so if I'm only decoding those two I can cap the number of pre-clustered distributions to 42.
2022-06-03 09:39:12
because of the possible lz77 +1
yurume
2022-06-03 09:39:29
that was my gut feeling, but look at H.4.2 and I.3.1.2
2022-06-03 09:39:45
the number of pre-clustering distributions can be much higher
Traneptora
2022-06-03 09:39:50
I don't need to decode modular mode
2022-06-03 09:39:56
my goal is to determine the end of the file
yurume
2022-06-03 09:40:11
ah yeah that's enough for a specialized code path
Traneptora
2022-06-03 09:40:11
if I can read the TOC then I can add up the block lengths (ignoring the actual contents of the permutation)
2022-06-03 09:40:27
(since that won't change the length)
yurume
2022-06-03 09:40:46
I think 41 (+1) is the maximum for fixed-sized distributions
Traneptora
2022-06-04 12:17:49
> The ANS decoder keeps an internal 32-bit unsigned integer state. Upon creation of the decoder (immediately > before reading the first symbol from a new ANS stream), state is initialized with a u(32) from the codestream. Does this mean that the first 32 bits of any entropy coded stream are these u(32), or is it after the clustering etc. but before the first symbol
yurume
2022-06-04 12:23:14
before the first symbol.
2022-06-04 12:23:53
so if the distribution is read but the first symbol is preceded by some other data u(32) will be read after that
Traneptora
2022-06-04 12:43:36
C.2.3 doesn't actually say what it returns
2022-06-04 12:43:43
does it return the updated value of `state`?
2022-06-04 12:49:13
> After the decoder reads the last symbol in a given stream, state is 0x130000. is this the *definition* of the last symbol?
yurume
Traneptora C.2.3 doesn't actually say what it returns
2022-06-04 01:12:37
indeed, the return value is `symbol` and only `state` is going to persist here
Traneptora > After the decoder reads the last symbol in a given stream, state is 0x130000. is this the *definition* of the last symbol?
2022-06-04 01:13:28
this is a way to say that the state *should* be 0x130000 after the last symbol. the spec is written from a perspective of the decoder so such a sentence is actually an assertion.
Traneptora
yurume this is a way to say that the state *should* be 0x130000 after the last symbol. the spec is written from a perspective of the decoder so such a sentence is actually an assertion.
2022-06-04 02:10:56
I mean, does the state equalling 0x130000 signify EOF, or is EOF signalled elsewhere
2022-06-04 02:11:10
> `if (logcounts[i] == 13) { rle = U8(); same[i] = rle + 5; i += rle + 3; continue; }` what is `same[i]`?
yurume
Traneptora I mean, does the state equalling 0x130000 signify EOF, or is EOF signalled elsewhere
2022-06-04 02:11:56
there is no separate EOF, but the stream only would read a finite amount of symbols and that last state acts as a checksum
Traneptora
2022-06-04 02:13:17
I see
yurume
2022-06-04 02:15:14
that `same[i]` is kinda annoying, so the frequency distribution is read in two passes, the first will read exponents + run lengths if that log is 13, then the second will read mantissas
2022-06-04 02:16:03
since run lengths are read in the first pass (I don't know why), the second pass needs to know where runs occur and thus skip reading mantissas
2022-06-04 02:16:40
that's what `same[i]` does; if non-zero, it means a run starts at `i` and ends at `i + same[i] + some const I can't remember`
Traneptora
2022-06-04 02:48:59
`symbols.push_back(s); offsets.push_back(bucket_size ร— i); cutoffs.push_back(0);`
2022-06-04 02:49:03
what is `push_back`
yurume
2022-06-04 02:56:10
they are (implicitly) supposed to be a C++ vector
2022-06-04 02:56:57
pseudocodes in the spec rarely define variables and leave them to a common sense (at least for C++ coders)
Traneptora
2022-06-04 03:40:59
I'm not a C++ programmer, that's why I asked
2022-06-04 03:41:10
it looked like C++ stuff
2022-06-04 03:43:00
alright, looks like I got this thing done! well mostly done. still have to implement brotli trees
2022-06-04 03:43:01
https://github.com/thebombzen/FFmpeg/blob/jpegxl-ans/libavcodec/jpegxl_parser.c
yurume
2022-06-04 03:45:28
does ffmpeg have its own huffman decoder? (just out of curiosity)
Traneptora
2022-06-04 03:45:45
it links zlib, I dont' know if zlib can do it
2022-06-04 03:46:00
and it probably does tbh
yurume does ffmpeg have its own huffman decoder? (just out of curiosity)
2022-06-04 03:48:56
I believe it has its own VLC decoder, now that I think about it
yurume
2022-06-04 03:49:29
if it has a common infrastructure for optimized huffman decoder (as opposed to having separate decoders for each codec) I guess it's a way to go
Traneptora
2022-06-04 03:49:36
I believe it does yea
2022-06-04 03:49:41
that way I don't have to implement it
2022-06-04 03:49:48
atm I just abort out and go "oops"
_wb_
2022-06-04 05:13:18
Well just getting the dimensions is not _that_ hard
veluca
2022-06-04 09:20:31
also definitely not trivial though
_wb_
2022-06-04 09:27:08
No, but still just bit reading, no ANS alias tables etc
Traneptora
2022-06-04 04:05:06
does JXL support extended size ISOBMFF boxes?
2022-06-04 04:05:28
that is the `size` field is set to `1` and then it is immediately followed by an 8-byte box-size
veluca
2022-06-04 04:30:58
it should in theory
2022-06-04 04:31:20
I think
_wb_
2022-06-04 04:37:06
Yes, at least the spec says it does
2022-06-04 04:37:22
I assume libjxl also allows it
JendaLinda
2022-06-05 04:49:03
As JXL supports images in CMYK color space, is there any program that can do that? For example, transcoding CMYK TIFF losslessly to CMYK JXL?
_wb_
2022-06-05 05:05:40
I wrote a libjxl application that can reversibly encode PSD (including cmyk ones) but I don't have clearance yet to release it
2022-06-05 05:10:21
I wish there was a simple PPM style format for cmyk, then it would be easier to add support for that to cjxl/djxl
JendaLinda
2022-06-05 05:23:03
TIFF seems to be the most widely supported, general purpose format capable of CMYK.
2022-06-05 05:26:20
I tried Imagemagick, but either it's broken or I don't know how to use it. It just takes the CMY channels and encodes them like they were RGB, so the result is completely wrong.
_wb_
2022-06-05 05:30:22
Yeah that's not going to work if they didn't implement a code path for cmyk encoding specifically
2022-06-05 05:32:07
The annoying thing about tiff is that it is even more complicated than psd, and it's just as proprietary (though it has been standardized for a while)
2022-06-05 05:33:00
Even just baseline tiff is already pretty annoying to implement for a format that's mostly uncompressed
Fraetor
2022-06-05 05:50:03
Are CMYK JPEGs non-standardised, or are you really looking for a lossless format?
JendaLinda
2022-06-05 05:56:42
cjxl doesn't support CMYK JPEG. These are not standard and they can't be losslessly transcoded to JXL.
_wb_ Even just baseline tiff is already pretty annoying to implement for a format that's mostly uncompressed
2022-06-05 05:59:03
Could libtiff be used for that?
_wb_
2022-06-05 06:05:46
Yes, libtiff could be used
JendaLinda
2022-06-05 08:54:51
Speaking of PPM format, it looks like cjxl and djxl assume sRGB in PPM.
_wb_
2022-06-05 09:31:42
Yes but you can change that assumption, the syntax is somewhat arcane though
JendaLinda
2022-06-05 09:53:02
Indeed, Netpbm formats are kinda confusing. The official Netpbm specification requires to use some BT.709 gamma transfer function. I don't know why they thought it's a good idea. Everybody seem to ignore this requirement anyway.
_wb_
2022-06-05 09:56:05
I suppose that was before sRGB became a thing...
2022-06-05 09:57:15
I wonder why they didn't change it though
2022-06-05 09:57:36
Everyone considers untagged images to be sRGB now
2022-06-05 09:58:31
And ppm has no way to tag a colorspace or attach an icc profile, so interpreting it as sRGB makes the most sense
2022-06-05 09:59:02
That's also what is done with untagged jpegs, pngs, and gifs.
2022-06-05 10:00:08
Who knows how untagged images were rendered before sRGB was a thing. I suppose some would assume 709, some would assume gamma 2.2, and some would assume gamma 1.8
2022-06-05 10:01:22
Which lead to the famous "images look too dark" thing when taking Mac images to PC, etc
JendaLinda
2022-06-05 10:02:08
I'd say, before sRGB, 24bpp on PC was kind of luxury.
_wb_
2022-06-05 10:02:47
Sure, that too
2022-06-05 10:03:19
And monitors would often have brightness and saturation knobs anyway, so who knows what is right and what is wrong then
JendaLinda
2022-06-05 10:05:56
For the reasons, I'm avoiding netpbm formats and I prefer BMP as a simple image format.
2022-06-05 10:07:10
It has its own quirks and limitations but works quite well.
Traneptora
_wb_ I assume libjxl also allows it
2022-06-06 04:48:34
it doesn't on the decode side, that's why I ask
2022-06-06 04:50:12
2022-06-06 04:50:16
here's a sample
_wb_
2022-06-06 05:20:25
Oh. That's a bug then. Is it just djxl or also libjxl?
Traneptora
_wb_ Oh. That's a bug then. Is it just djxl or also libjxl?
2022-06-06 06:04:49
I tested both and produced the same issue
2022-06-06 06:05:48
libjxl reported JXL_DEC_NEED_MORE_INPUT even after reading the full file
2022-06-06 06:06:07
djxl reported "decoding container format failed"
2022-06-06 06:07:09
as far as i'm aware, the box size by spec is supposed to include the box tag, headers, and payload all at once
2022-06-06 06:07:13
like the *full* box size
2022-06-06 06:07:40
and it's possible that libjxl isn't subtracting the extra 8 bytes in the box size, so it thinks there's supposed to be 8 more bytes than there actually are
2022-06-06 06:07:57
if I'm wrong about that then my sample is corrupt, but I believe that's what's going on here
_wb_
2022-06-06 06:20:03
https://github.com/libjxl/libjxl/blob/main/lib/jxl/decode.cc#L1768
2022-06-06 06:20:25
That's what the code does
Traneptora
_wb_ https://github.com/libjxl/libjxl/blob/main/lib/jxl/decode.cc#L1768
2022-06-06 06:26:08
it looks like it's reading the tag before the extended size
2022-06-06 06:26:48
on Line 1766 it populates the `type` field and then it reads the extended size on line 1768
2022-06-06 06:28:19
I believe this is happening in the wrong order
2022-06-06 06:28:55
compare to the code here
2022-06-06 06:28:55
https://github.com/ffmpeg/FFmpeg/blob/master/libavformat/mov.c#L7813
2022-06-06 06:29:03
FFmpeg reads the extended size before it reads the tag
BlueSwordM
2022-06-06 06:29:53
I see that durandal has an interesting opinion about JXL ๐Ÿ˜‚
Traneptora
2022-06-06 06:30:15
yea, Paul B. Mahol has lots of opinions on lots of things
2022-06-06 06:30:22
he's a quite rude individual
BlueSwordM
2022-06-06 06:32:47
Anyway, I don't understand the issue about the parser too much. Can you explain it for me a bit tbombz?
Traneptora
2022-06-06 06:33:13
a parser has to solve the issue "where does this frame end and the next frame begin"
2022-06-06 06:33:45
which for concatenated raw JXL files, requires you to find the end of the codestream. There's no reliable signature since 0xff0a can easily occur in the middle of a codestream.
2022-06-06 06:34:30
The problem is that JXL's file format does not make this easy. For something like PNG, every chunk contains a chunk_size so you can just, skip over all the chunks and when you find IEND, then you know you've found the end.
2022-06-06 06:35:37
JXL *does* contain a table of contents of blocks in the Frame Header, and in theory you could just add up the block sizes to determine the end of the frame. The *problem* with this is that the table of contents can contain a permutation.
2022-06-06 06:35:46
And the permutation, if present, is encoded with ANS.
2022-06-06 06:35:59
Likewise, if there is an ICC Profile present in the stream, that is also encoded with ANS.
2022-06-06 06:37:00
You don't actually need to *know* the contents of the ICC profile or the permutation table (since the permutation of the blocks won't affect their compressed length), but you do need to skip over these ANS streams. And the only way to determine the length of an ANS stream is to decode it.
2022-06-06 06:37:26
So this means that parsing a JXL codestream requires you to be able to decode ANS, which is ~500 lines of code if done naively, and more if you want to do it efficiently.
2022-06-06 06:38:32
So the issue with a parser is the fact that you need to include hundreds of lines of code to determine file boundaries, and whether or not that's something that you'd actually want to include.
2022-06-06 06:38:49
<@321486891079696385> does that make sense?
yurume
2022-06-06 06:42:53
I'm finally starting to look at VarDCT, it would take a time to wrap my head around it
Traneptora
2022-06-06 06:43:01
~~there's also a bunch of bugs in the JPEG XL spec so this is a lot of trial by fire~~
BlueSwordM
Traneptora <@321486891079696385> does that make sense?
2022-06-06 06:43:27
Yeah. I perfectly understand now :)
Traneptora
2022-06-06 06:43:31
bugs in the code of the JXL spec is a problem because it makes it harder to determine if *you* followed the spec incorrectly or if the spec is wrong
2022-06-06 06:43:55
*wrong* in this case means does not match libjxl behavior, since the spec is derived from libjxl code and not the other way around
yurume
2022-06-06 06:44:04
I learned not to believe the spec at a face value ๐Ÿ˜‰
Traneptora
2022-06-06 06:45:09
in hindsight, not spending 4 bytes on compressed ICC length was a mistake
2022-06-06 06:45:15
or even three
yurume
2022-06-06 06:45:50
I would say it's not too late, just put it into extensions
Traneptora
2022-06-06 06:46:29
yea, you can always include an extension with ID=1 which is "this is the size of the ICC profile" or ID=2 which is "this is the size of all the frames combined"
2022-06-06 06:46:30
or something
2022-06-06 06:46:41
I'd like that to be the case
2022-06-06 06:46:46
that's what extensions are for, tbf.
yurume
2022-06-06 06:47:36
why do you need frame boundaries by the way, say, seeking?
Traneptora
2022-06-06 06:47:40
concatenated files
2022-06-06 06:47:50
the end of the file is the end of a frame
yurume
2022-06-06 06:48:30
uh, there may be multiple frames necessary to produce a single image (even when not animated)
Traneptora
2022-06-06 06:48:40
Yea, but if you can find the frame boundary you can skip to the next frame
2022-06-06 06:48:52
since they're just concatenated
yurume
2022-06-06 06:49:02
yeah, so I assumed you needed them for seeking
Traneptora
2022-06-06 06:49:07
frames are always marked as "is_last" or not
2022-06-06 06:49:24
so you just keep skipping frames until you find one marked as is_last, and then the end of that frame is the end of the codestream
yurume
2022-06-06 06:49:45
frames can refer to previous frames so the actual seeking process would be complicated but it can be done without reading all the frames, technically
Traneptora
2022-06-06 06:50:03
well you would just need to read the frame header and then the TOC