|
Traneptora
|
2022-06-06 06:50:20
|
and then the TOC contains sizes for all the blocks in that frame
|
|
2022-06-06 06:50:36
|
skip the total, and you should end up at the frame boundary
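
Roughly, assuming the TOC entries are already decoded into byte sizes, the frame boundary is just a running sum. A minimal C sketch, not FFmpeg's or libjxl's actual code; `frame_end_offset` and `sections_start` (the byte offset right after the TOC) are made-up names:

```c
#include <stddef.h>
#include <stdint.h>

/* Adds every TOC entry size to the offset where the sections begin.
 * Returns 0 and stores the frame end offset in *end, or -1 on overflow.
 * Assumption: toc_sizes[] holds already-decoded section sizes in bytes. */
int frame_end_offset(uint64_t sections_start, const uint32_t *toc_sizes,
                     size_t n, uint64_t *end)
{
    uint64_t pos = sections_start;
    for (size_t i = 0; i < n; i++) {
        if (pos > UINT64_MAX - toc_sizes[i])
            return -1; /* sum would not fit in 64 bits */
        pos += toc_sizes[i];
    }
    *end = pos;
    return 0;
}
```

The order of the entries doesn't matter for this, so a permuted TOC gives the same total.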
|
|
|
yurume
|
2022-06-06 06:51:13
|
yeah, calculating them is not hard once you have an ANS decoder. my question was why you need them in the first place
|
|
2022-06-06 06:51:24
|
since I know nothing about ffmpeg internals
|
|
|
Traneptora
|
2022-06-06 06:51:25
|
to determine file boundaries
|
|
|
yurume
|
2022-06-06 06:51:50
|
can multiple JXL *bitstreams* be concatenated? (as opposed to frames)
|
|
|
Traneptora
|
2022-06-06 06:52:16
|
most image file formats can be worked with as a concatenated stream of images
|
|
2022-06-06 06:52:37
|
JPEG XL can in theory, *if* you can determine where an image ends and where the next one starts.
|
|
|
yurume
|
2022-06-06 06:52:37
|
that's a technicality and not a necessity AFAIK
|
|
|
_wb_
|
2022-06-06 06:53:57
|
Ffmpeg is a bit weird in having this convention that it can just take concatenated still images
|
|
|
yurume
|
|
_wb_
|
2022-06-06 06:56:28
|
While jxl was kind of designed with the assumption that we don't really need end of codestream markers, nor start markers that are unique enough to be unlikely to happen mid-stream
|
|
2022-06-06 06:57:18
|
Extensions could help, but maybe easier is to just require a convention for this use case
|
|
2022-06-06 06:57:47
|
E.g. the convention that box format has to be used
|
|
2022-06-06 06:58:19
|
(or even box format with either a jxlc or one jxlp per frame)
|
|
|
yurume
|
2022-06-06 06:59:10
|
`image/jxl-sequence`?
|
|
|
Traneptora
|
2022-06-06 06:59:38
|
yea, if the box format has to be used then that works
|
|
2022-06-06 06:59:47
|
but you have to forbid box_size=0 in that case
|
|
2022-06-06 06:59:49
|
since that gains you nothing
|
|
|
_wb_
|
|
Traneptora
I believe this is happening in the wrong order
|
|
2022-06-06 07:00:19
|
Please open an issue, could very well be that we're doing it wrong, have to check spec too. I don't think anyone ever needed jxl boxes bigger than 4 GB so we probably just didn't encounter this case yet
|
|
|
yurume
|
2022-06-06 07:00:21
|
I realized there are `image/heif-sequence` and `image/heic-sequence` already, ~~is it just a concatenation of multiple `image/heif` or `image/heic` bitstreams~~ it seems that they are just a single file with multiple images?
|
|
|
_wb_
|
2022-06-06 07:01:29
|
Nah, those can actually be a single hevc codestream too
|
|
2022-06-06 07:01:42
|
Can also be multiple codestreams
|
|
2022-06-06 07:02:02
|
Or even one codestream where multiple crops and stuff are defined
|
|
2022-06-06 07:02:14
|
Heif is relatively expressive
|
|
2022-06-06 07:02:36
|
For a container it's quite expressive
|
|
2022-06-06 07:04:28
|
Can do layers, orientation, compositing (with special case of grid), cropping, alpha and depth channels (as separate yuv400 codestreams), etc.
|
|
2022-06-06 07:07:23
|
(all stuff that we do at the codestream level but they had to do it at the container level since they want to use payload codestreams like hevc and av1)
|
|
|
Traneptora
|
|
_wb_
Please open an issue, could very well be that we're doing it wrong, have to check spec too. I don't think anyone ever needed jxl boxes bigger than 4 GB so we probably just didn't encounter this case yet
|
|
2022-06-06 07:08:15
|
https://github.com/libjxl/libjxl/issues/1478
|
|
|
_wb_
|
|
Traneptora
yea, if the box format has to be used then that works
|
|
2022-06-06 07:09:19
|
We could have an 'ffmpeg-friendly' encode mode that produces only (easy to parse) self-delimiting output
|
|
|
Traneptora
|
2022-06-06 07:09:36
|
honestly `--container` already does that
|
|
2022-06-06 07:09:46
|
since it produces `jxlc` boxes with nonzero sizes
|
|
|
_wb_
|
2022-06-06 07:09:54
|
Ah does it?
|
|
|
Traneptora
|
2022-06-06 07:10:00
|
yea, it populates the size field
|
|
2022-06-06 07:10:06
|
at least it does when I encoded lenna.png
|
|
2022-06-06 07:10:10
|
and I haven't seen it do otherwise
|
|
|
_wb_
|
2022-06-06 07:10:15
|
Ah but will it still in cjxl_ng? We should check
|
|
2022-06-06 07:12:53
|
Anyway, I think it's probably best done as a convention (and maybe having a jxli too); adding an extension also doesn't help you since there is no guarantee that the extension will be present
|
|
|
BlueSwordM
I see that durandal has an interesting opinion about JXL 😂
|
|
2022-06-06 07:14:36
|
Where is that?
|
|
|
Traneptora
|
2022-06-06 07:14:56
|
probably #ffmpeg-devel on IRC
|
|
|
_wb_
Ah but will it still in cjxl_ng? We should check
|
|
2022-06-06 07:15:12
|
```
$ cjxl_ng --container ../lenna.png ./lenna-ng.jxl
Warning: This is work in progress, consider using cjxl instead!
$ hexdump -C ./lenna-ng.jxl | head -n3
00000000 00 00 00 0c 4a 58 4c 20 0d 0a 87 0a 00 00 00 14 |....JXL ........|
00000010 66 74 79 70 6a 78 6c 20 00 00 00 00 6a 78 6c 20 |ftypjxl ....jxl |
00000020 00 00 bc 2d 6a 78 6c 63 ff 0a f8 1f 90 50 5c 08 |...-jxlc.....P\.|
```
|
|
2022-06-06 07:15:15
|
looks good so far
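
For anyone reading the dump: each box starts with a 32-bit big-endian size and a four-character type. A minimal C sketch of reading one header, assuming the usual ISOBMFF rules (size 1 means a 64-bit size follows, size 0 means "until end of file"). `BoxHeader` and `parse_box_header` are names made up for illustration, not libjxl or FFmpeg API:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint64_t size;       /* total box size in bytes; 0 = extends to EOF */
    char     type[5];    /* four-character type, NUL-terminated */
    size_t   header_len; /* 8, or 16 when the 64-bit size is present */
} BoxHeader;

/* Reads a 32-bit big-endian integer. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the buffer is too short or the size is bogus. */
int parse_box_header(const uint8_t *buf, size_t len, BoxHeader *out)
{
    if (len < 8)
        return -1;
    uint32_t size32 = be32(buf);
    memcpy(out->type, buf + 4, 4);
    out->type[4] = '\0';
    out->header_len = 8;
    if (size32 == 1) {           /* 64-bit size follows the type */
        if (len < 16)
            return -1;
        out->size = ((uint64_t)be32(buf + 8) << 32) | be32(buf + 12);
        out->header_len = 16;
        if (out->size < 16)
            return -1;
    } else if (size32 == 0) {    /* box runs to the end of the file */
        out->size = 0;
    } else {
        if (size32 < 8)
            return -1;
        out->size = size32;
    }
    return 0;
}
```

On the bytes above this gives a 12-byte `JXL ` box, a 20-byte `ftyp` box, and a `jxlc` box of 0xbc2d bytes, so the end of the file is known without touching the codestream.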
|
|
2022-06-06 07:17:19
|
and that works with FFmpeg
|
|
2022-06-06 07:17:20
|
`$ cat ./lenna-ng.jxl{,,,,} | ./ffmpeg_g -i - -c copy -f null -`
|
|
2022-06-06 07:17:27
|
with my WIP patch, that is
|
|
2022-06-06 07:17:35
|
`frame= 5`
|
|
|
tr7zw
|
2022-06-07 12:11:16
|
Currently trying to get Mango (an open-source self-hosted manga/comic server) to support JPEG XL. I did a test conversion of my library, and cjxl turned 57 GB of manga pages into 3.8 GB at a quality setting of 50 (which is enough for this use case). Mind-blowing
|
|
|
|
veluca
|
2022-06-07 12:14:35
|
50 seems quite low for manga
|
|
|
tr7zw
|
2022-06-07 12:16:53
|
It's just some loss of noise, which doesn't really matter with fully flat colors (also I really don't need maximum quality to read it on my phone on the go)
|
|
|
|
veluca
|
2022-06-07 12:30:19
|
well, that'll be a lot of data to transfer
|
|
|
w
|
2022-06-07 12:33:32
|
now if only tachiyomi merged my pr...
|
|
2022-06-07 12:33:36
|
right now it decodes twice
|
|
2022-06-07 12:34:12
|
my almost 1 year old pr...
|
|
2022-06-07 12:34:37
|
and the firefox one is also almost 1 year old...
|
|
|
tr7zw
|
2022-06-07 12:36:03
|
not sure, currently I just added jxl to the list of supported file formats, so mango just sends the file to the browser. Also meaning I need to use for example Firefox Nightly or Tachiyomi on Android to view them
|
|
2022-06-07 12:39:44
|
If you are a Tachiyomi user basically this pr is all you need, but for Web currently it sometimes sets the image size values to 0, so something is not behaving correctly
|
|
2022-06-07 12:51:33
|
It happens for the entire page, with all images being marked as having 0 width/height (but they are loaded in the background correctly). Only reloading with Ctrl+F5 fixes it. I guess the JS pulls the image sizes before they are loaded and then caches that? Will have to dig more into where the size values come from.
|
|
2022-06-07 01:12:33
|
It's a mix. They are also from many different sources, so some were highly optimized pngs/jpgs, but others were non-optimized giant pngs
|
|
2022-06-07 01:23:16
|
I still have a copy of the originals, maybe I'll play around with the settings more in the future (and when I feel like torturing my cpu at 100% for 2 more hours), but for now I just used ``-q 50`` with default effort. Depending on the source, many of them already had artifacts, so no damage done really
|
|
|
Traneptora
|
2022-06-07 02:24:06
|
Is there any way to receive 10-bit or 12-bit information from libjxl
|
|
2022-06-07 02:24:25
|
for, say, modular 12-bit lossless
|
|
|
_wb_
|
2022-06-07 04:53:10
|
Atm you have to take 16-bit and rescale yourself (`*4095/65535`)
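
A minimal sketch of that rescale in C, with rounding added (the rounding is an extra assumption on top of the `*4095/65535` formula, and `rescale_16_to_12` is a made-up name, not a libjxl function):

```c
#include <stdint.h>

/* Maps a 16-bit sample (0..65535) to 12 bits (0..4095), rounding to nearest.
 * The intermediate product fits comfortably in 32 bits. */
uint16_t rescale_16_to_12(uint16_t v16)
{
    return (uint16_t)(((uint32_t)v16 * 4095u + 32767u) / 65535u);
}
```

You'd run this per sample over the 16-bit buffer that the decoder hands back.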
|
|
2022-06-07 04:54:48
|
We should probably add arbitrary bitdepth options too, that code is there anyway for converting to ppm, might as well have it in the api too
|
|
|
Traneptora
|
2022-06-07 05:42:13
|
I'm reading section C.1, that is RFC7932 about prefix codes
|
|
2022-06-07 05:42:26
|
RFC7932 Sections 3.4 and 3.5 describe how to construct a prefix code from the stream
|
|
2022-06-07 05:42:34
|
that part I understand
|
|
2022-06-07 05:42:38
|
but then I see this sentence:
|
|
2022-06-07 05:42:48
|
> After reading the counts, the decoder reads each `D[i]` as specified in C.1, with `alphabet_size = count[i]`.
|
|
2022-06-07 05:45:59
|
How is `D[i]` read, exactly? Do I go
```c
for (i = 0; i < num_clusters; i++) {
    int alphabet_size = count[i];
    code = read_brotli_prefix_code(alphabet_size);
    for (int j = 0; j < alphabet_size; j++) {
        D[j] = read_via_prefix_code(code);
    }
}
```
|
|
2022-06-07 05:46:26
|
it doesn't explicitly say when you read the prefix code and how you populate `D[j]`
|
|
2022-06-07 05:46:29
|
that's why I ask
|
|
2022-06-07 05:47:06
|
I also notice that the spec says this:
|
|
|
yurume
|
2022-06-07 05:47:53
|
I questioned this as well, for prefix codes there is no concrete array for `D[i]` and "reading with a distribution `D[i]`" means "reading with a prefix code tree corresponding to `D[i]`"
|
|
|
Traneptora
|
2022-06-07 05:48:28
|
so you don't actually populate `D[i]` like you do with ANS, instead you just read from the prefix code directly
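
To make that concrete, here is a generic canonical prefix-code decoder sketched in C: the only state is the per-length counts and the symbol order, and each symbol is decoded bit by bit from the stream. It assumes the Deflate-style convention that code bits arrive most-significant first while the stream itself is read LSB-first. All names are hypothetical, and this is neither jxsml's nor libjxl's table-driven code; it also skips the single-symbol special case and does not validate over-subscribed codes.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_LEN 15

typedef struct {
    const uint8_t *data;
    size_t size;
    size_t bitpos;
} BitReader;

/* Reads one bit, least-significant bit of each byte first; -1 at end of data. */
int read_bit(BitReader *br)
{
    if (br->bitpos >= br->size * 8)
        return -1;
    int bit = (br->data[br->bitpos >> 3] >> (br->bitpos & 7)) & 1;
    br->bitpos++;
    return bit;
}

typedef struct {
    uint16_t count[MAX_LEN + 1]; /* number of codes of each length */
    uint16_t symbol[256];        /* symbols sorted by (length, value) */
} PrefixCode;

/* Builds the canonical code from per-symbol code lengths (0 = unused). */
int build_prefix_code(PrefixCode *pc, const uint8_t *lengths, int n)
{
    uint16_t offs[MAX_LEN + 1];
    if (n < 0 || n > 256)
        return -1;
    for (int len = 0; len <= MAX_LEN; len++)
        pc->count[len] = 0;
    for (int i = 0; i < n; i++) {
        if (lengths[i] > MAX_LEN)
            return -1;
        pc->count[lengths[i]]++;
    }
    offs[1] = 0;
    for (int len = 1; len < MAX_LEN; len++)
        offs[len + 1] = (uint16_t)(offs[len] + pc->count[len]);
    for (int i = 0; i < n; i++)
        if (lengths[i] != 0)
            pc->symbol[offs[lengths[i]]++] = (uint16_t)i;
    return 0;
}

/* Decodes one symbol; returns -1 on end of data or an invalid code. */
int decode_symbol(const PrefixCode *pc, BitReader *br)
{
    int code = 0, first = 0, index = 0;
    for (int len = 1; len <= MAX_LEN; len++) {
        int bit = read_bit(br);
        if (bit < 0)
            return -1;
        code |= bit;
        int count = pc->count[len];
        if (code - count < first) /* code belongs to this length */
            return pc->symbol[index + (code - first)];
        index += count;
        first = (first + count) << 1;
        code <<= 1;
    }
    return -1;
}
```

So "reading with `D[i]`" just means calling something like `decode_symbol` with the code built for cluster `i`.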
|
|
|
yurume
|
|
Traneptora
|
2022-06-07 05:48:58
|
I see, cause the spec makes it seem like you populate a huge array with values
|
|
2022-06-07 05:49:01
|
and index into it
|
|
2022-06-07 05:49:05
|
which is probably not a good idea
|
|
|
yurume
|
2022-06-07 05:49:19
|
yeah that was my first reading, but it sounded very strange to me too
|
|
|
Traneptora
|
2022-06-07 05:49:28
|
Another question, the spec says:
```
[0, num_clusters - 1)
```
I'm guessing it should probably be `[0, num_clusters)` or `[0, num_clusters - 1]`
|
|
2022-06-07 05:49:35
|
because as written it omits the last cluster for seemingly no reason
|
|
2022-06-07 05:49:58
|
I don't know if that was intentional and if it was it should probably be explicit that the last one is intentionally omitted
|
|
|
yurume
|
2022-06-07 05:50:08
|
indeed! that's definitely a typo. I will update an issue for that.
|
|
|
Traneptora
|
2022-06-07 05:50:26
|
it occurs twice in section C.2.1
|
|
2022-06-07 05:50:30
|
I didn't check elsewhere
|
|
|
_wb_
|
2022-06-07 07:02:37
|
Looks like I'll have to do a nice amount of typo fixing when I return to the spec repo
|
|
2022-06-07 07:02:43
|
Good!
|
|
2022-06-07 07:04:43
|
And yes, it's confusing that we describe everything as "distribution D" which in case of ANS is something quite literal while in case of prefix codes it's more of an implied thing
|
|
|
Traneptora
|
2022-06-08 06:32:47
|
```
if (l2size == 1) {
    *out_fast_len = *out_max_len = 0;
    JXSML__SHOULD(*out_table = malloc(sizeof(int32_t)), "!mem");
    (*out_table)[0] = 0;
    return 0;
}
line 554 of jxsml
|
|
2022-06-08 06:32:52
|
The spec doesn't mention this step
|
|
|
yurume
|
2022-06-08 06:34:08
|
correct, this was reported back then
|
|
2022-06-08 06:34:33
|
but I think the spec itself hasn't been updated yet
|
|
|
Traneptora
|
2022-06-08 06:43:22
|
another thing, not sure if this is mentioned yet
|
|
2022-06-08 06:43:37
|
the spec implies that you read the `count` for each prefix code, and then you read the code
|
|
2022-06-08 06:43:42
|
not that you read *all* of the counts first
|
|
|
yurume
|
2022-06-08 06:46:27
|
I read it in a correct way, but yeah that's a good thing to clarify
|
|
2022-06-08 07:10:14
|
should I publish a public list of errata somewhere instead of (or rather, in addition to) a private repo
|
|
|
_wb_
|
2022-06-08 07:14:24
|
Uh, might be slightly risky to do that
|
|
2022-06-08 07:15:02
|
Considering the CD hasn't actually been officially made publicly available yet
|
|
|
yurume
|
2022-06-08 07:15:50
|
ah, indeed, the CD at the time I was starting out was already substantially different from the IS
|
|
|
_wb_
|
2022-06-08 07:16:15
|
I just need to get the fixes done and leak another spec draft so everyone who cares has a good version
|
|
|
yurume
|
2022-06-08 07:17:27
|
I'm just feeling that the revision might be too taxing to wb since I've accumulated another 20 issues or so since the last revision
|
|
|
_wb_
|
2022-06-08 07:18:21
|
I wish there was a nicer process but leaking pdf files in a 'non-public' place like discord seems to hit a good balance between not waking up ISO copyright watchdogs and getting eyes on the spec from those who are interested in it
|
|
|
yurume
I'm just feeling that the revision might be too taxing to wb since I've accumulated another 20 issues or so since the last revision
|
|
2022-06-08 07:19:55
|
Nah it's fine, I have just been busy with other stuff like the subjective experiment we've been doing, it is quite exciting to investigate the results so I've been doing that instead of spec fixing
|
|
2022-06-08 07:21:16
|
Officially we have to wait for the CD ballot results, which will take at least until the July meeting but probably until the October meeting, and that will be the earliest possible time to submit the next stage of the spec, the DIS version
|
|
|
yurume
|
2022-06-08 07:21:24
|
to be honest I didn't fully respond to your feedback because I too found building jxsml is more interesting than fixing and reporting spec bugs 😉
|
|
2022-06-08 07:22:16
|
hopefully I can fully implement jxsml before that, I personally allocated about a month of my life (well, since I'm still not employed) but it seems that's going to be 2 or 3...
|
|
|
_wb_
|
2022-06-08 07:23:31
|
Spec work is somewhat boring but in the end it's the thing that defines the bitstream and that is supposed to be the definitive answer in case there's disagreement on if and how something should be decoded, so it is kind of important — but there are more fun things to do :)
|
|
2022-06-08 07:24:35
|
Implementing a full jxl decoder from scratch in a month is... Optimistic :)
|
|
2022-06-08 07:25:14
|
2 or 3 months would still be quite impressive, especially for a 1-person team
|
|
|
yurume
|
2022-06-08 07:25:20
|
well I didn't fully understand the complexity back then
|
|
|
_wb_
|
2022-06-08 07:26:33
|
Part of the complexity is all the spec bugs and ambiguities you are fixing too — when you'll be done, it will be easier for the next one who will try it
|
|
2022-06-08 07:27:34
|
But yes, it's quite a lot more complex than the old jpeg
|
|
2022-06-08 07:29:39
|
Still significantly less complex than a full-blown modern video codec like what you need for heic or avif...
|
|
2022-06-08 07:31:39
|
Btw at some point we might introduce a 'simple' jxl profile that doesn't support splines, patches, the largest vardct block sizes, maybe also no WP and arbitrary MA trees. Hardware people want something like that.
|
|
|
Traneptora
|
2022-06-08 08:25:52
|
yea the fact that I can debug against jxsml is very helpful with this
|
|
2022-06-08 09:59:03
|
and I think I got it working! at least I can init the decoder properly
|
|
2022-06-08 11:42:29
|
okay well that's a weird problem
|
|
2022-06-08 11:42:51
|
it decodes the first 45 varlenuints correctly from an ICC Profile stream
|
|
2022-06-08 11:42:57
|
and then it decodes the 46th one incorrectly
|
|
2022-06-08 11:43:09
|
this is the first token that's "large" so to speak tho
|
|
2022-06-08 11:43:21
|
the problem is the token is coming out of the VLC incorrectly
|
|
|
yurume
|
2022-06-08 11:57:00
|
it can be either a bad context given to the decoder, or a faulty ANS/histogram code initialization resulting in the state desync
|
|
2022-06-08 11:58:03
|
dumping both in your code and jxsml will point to the exact discrepancy
|
|
2022-06-08 11:59:51
|
if ANS is being used, `code.ans_state` is easy to dump; otherwise you need to track `st->bits` and `st->nbits` to pinpoint the desync point
|
|
|
Traneptora
|
2022-06-09 12:03:28
|
I figured it out, it's faulty handling of `code == 17` case when creating the prefix tree
|
|
2022-06-09 12:16:48
|
and it's fixed, noice
|
|
2022-06-09 03:05:27
|
anyway, my WIP is here: https://github.com/thebombzen/FFmpeg/blob/jpegxl-ans/libavcodec/jpegxl_parser.c
|
|
|
fab
|
2022-06-09 12:15:23
|
PIK's design has been inspired by guetzli and our own attempt at high speed jpeg recompression. We just added adaptive quantization, a better colorspace, some more possibilities to separate decorrelation and color quantization, a frequency lifting prediction system (dc to low freq ac, lower freq ac to higher freq...), and quite a few other smaller details.
|
|
2022-06-09 12:17:48
|
9. the final butteraugli score is the maximum local (spatiality there is complex, but often within about 4x4 to 8x8 pixels) error in the image, so an image with a much higher butteraugli error can still "look" better when looked at holistically
|
|
2022-06-09 12:17:58
|
Good news on pik: New speedups ramped multicore software decoding speed to 1 GB/s on a highend desktop CPU.
|
|
|
Traneptora
|
2022-06-09 03:12:26
|
What is the largest possible image header, not counting extensions, that can exist, and is it less than 4k?
|
|
2022-06-09 03:12:33
|
i.e. less than 4096 bytes
|
|
2022-06-09 03:13:20
|
I think it should be but I'm not entirely sure
|
|
|
_wb_
|
2022-06-09 03:41:31
|
if you include icc profiles, it's basically unbounded
|
|
2022-06-09 03:46:59
|
if you have lots of extra channels and they all have long names, it could also get quite long
|
|
2022-06-09 03:48:06
|
if you signal all the optional weights and stuff (custom xyb matrix, custom upsampling weights etc) that probably adds up to 1 kb or so
|
|
2022-06-09 03:48:35
|
I _think_ the only possibly large things are icc profiles and lots of extra channels
|
|
|
Traneptora
|
2022-06-09 03:54:13
|
I'm not including the ICC Profile in this case, only the ImageMetadata bundle and what comes before it
|
|
2022-06-09 03:54:30
|
but yea, I forgot about extra channels
|
|
2022-06-09 03:54:55
|
I suppose I can just ignore those scenarios though
|
|
|
_wb_
|
2022-06-09 04:17:15
|
ah, actually...
|
|
2022-06-09 04:17:32
|
if it's a naked codestream, it's level 5, and that level has at most 4 extra channels (so at most 7 channels total)
|
|
2022-06-09 04:17:44
|
if you want more extra channels, you have to use the container
|
|
|
Traneptora
|
|
_wb_
if you want more extra channels, you have to use the container
|
|
2022-06-09 04:35:29
|
the issue would be a container with a `jxlc` box size of 0.
|
|
2022-06-09 04:35:42
|
atm I think libjxl does not produce those, so it could be possible to forbid them altogether
|
|
2022-06-09 04:36:06
|
and neither does cjxl
|
|
2022-06-09 04:36:16
|
with the requirement that you *must* use jxlp boxes with known box sizes
|
|
2022-06-09 04:38:26
|
> `tps_numerator / tps_denominator` indicates the number of ticks per second.
What is a "tick"? Is this the timebase or the framerate?
|
|
|
_wb_
|
2022-06-09 04:48:56
|
Every frame has a duration expressed in ticks
|
|
|
Traneptora
|
2022-06-09 04:49:04
|
I see, so it's a timebase
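
In other words one tick lasts `tps_denominator / tps_numerator` seconds. A tiny C sketch of converting a frame duration, assuming the duration is given in ticks as described above (`ticks_to_seconds` is a made-up helper, and returning 0 for a zero numerator is just a guard chosen here):

```c
#include <stdint.h>

/* Converts a frame duration in ticks to seconds.
 * tps_numerator / tps_denominator is the number of ticks per second,
 * so one tick lasts tps_denominator / tps_numerator seconds. */
double ticks_to_seconds(uint32_t ticks, uint32_t tps_numerator,
                        uint32_t tps_denominator)
{
    if (tps_numerator == 0)
        return 0.0; /* guard against division by zero */
    return (double)ticks * (double)tps_denominator / (double)tps_numerator;
}
```

For example, 3 ticks at 100/1 ticks per second is 0.03 s.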
|
|
|
_wb_
|
2022-06-09 04:50:42
|
A jxlc box size of 0 could be useful for the same reason there is no framecount up front: you could use it to make a streaming encoder...
|
|
2022-06-09 04:51:05
|
But I guess jxlp is good enough for that
|
|
|
Traneptora
|
2022-06-09 04:56:04
|
Yea, I think box sizes of zero should be forbidden since `jxlp` box exists for exactly that reason
|
|
|
_wb_
|
2022-06-09 05:04:38
|
I suppose jxlp can indeed replace all needs for "until eof" jxlc boxes, with a little overhead but I don't think that matters much. Wdyt, @veluca?
|
|
|
|
veluca
|
2022-06-09 05:08:12
|
I dunno, I don't see a huge benefit in forbidding them right now
|
|
2022-06-09 05:08:25
|
I assume there's a reason why they exist in isobmff
|
|
|
_wb_
|
2022-06-09 05:11:59
|
Yes, that reason is probably allowing to make a streaming encoder that can produce a valid file without having to buffer the whole output (or seek back to write the size afterwards)
|
|
2022-06-09 05:12:43
|
But for that, jxlp is good enough: you just buffer however much you want/can, and spit out jxlp boxes as you go
|
|
|
|
veluca
|
2022-06-09 05:15:05
|
Eh, not for big frames
|
|
|
_wb_
|
2022-06-09 05:22:23
|
You can have a jxlp per group or whatever is convenient
|
|
2022-06-09 05:23:01
|
Or a jxlp per 10 kb of output codestream, whatever
|
|
2022-06-09 05:23:44
|
If you can write the codestream to disk without buffering, you can also write it to jxlp boxes
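
A rough C sketch of what "spit out jxlp boxes as you go" could look like. It assumes the jxlp payload starts with a 4-byte big-endian index whose top bit marks the last part (to the best of my understanding of the container format; worth checking against the spec). `write_jxlp_box` is a made-up helper, not a libjxl function:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Writes a 32-bit big-endian integer. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Wraps n codestream bytes in one jxlp box with a known, nonzero size.
 * Returns the number of bytes written, or 0 if the box does not fit.
 * Assumption: payload = 4-byte index (top bit set on the last part),
 * followed by the codestream bytes. */
size_t write_jxlp_box(uint8_t *out, size_t cap, const uint8_t *chunk,
                      size_t n, uint32_t index, int is_last)
{
    if (n > 0xFFFFFFFFu - 12u || index >= 0x80000000u)
        return 0;
    size_t total = 8 + 4 + n; /* box header + index + payload */
    if (total > cap)
        return 0;
    put_be32(out, (uint32_t)total);
    memcpy(out + 4, "jxlp", 4);
    put_be32(out + 8, is_last ? (index | 0x80000000u) : index);
    if (n > 0)
        memcpy(out + 12, chunk, n);
    return total;
}
```

The encoder only needs to buffer one chunk at a time, and every box it emits is self-delimiting.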
|
|
|
Traneptora
|
2022-06-09 05:31:26
|
jxlp allows for arbitrary subdivisions
|
|
2022-06-09 05:31:29
|
of the codestream
|
|
2022-06-09 05:31:38
|
as it stands, box size zero is completely unnecessary
|
|
|
|
veluca
|
|
Traneptora
|
|
veluca
I dunno, I don't see a huge benefit in forbidding them right now
|
|
2022-06-10 07:01:29
|
having them exist creates the same issue as before which is determining the end of the file
|
|
2022-06-10 07:02:05
|
but unlike with raw codestreams, a level10 codestream inside a jxlc box of size zero doesn't have a realistic size limit on the header
|
|
2022-06-11 08:29:07
|
> there is one entry for each of the following sections, in the order they are
> listed: LfGlobal, one per LfGroup in raster order, one for HfGlobal followed by HfPass data for all the passes,
> and num_groups × frame_header.passes.num_passes PassGroup.
|
|
2022-06-11 08:29:14
|
In particular:
|
|
2022-06-11 08:29:26
|
does this mean there is one entry for this whole unit "one for HfGlobal followed by HfPass data for all the passes"
|
|
2022-06-11 08:29:34
|
or does this mean `1 + num_passes` entry for this unit
|
|
|
yurume
|
2022-06-11 08:57:41
|
the former is correct, there is a single section containing HfGlobal and all HfPasses
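
As a quick C sketch of that count, assuming `num_groups`, `num_lf_groups` and `num_passes` are already known from the frame header (the function name is made up, and the single-entry shortcut for one group with one pass is how I read the spec):

```c
#include <stdint.h>

/* Number of TOC entries in a frame:
 * LfGlobal + one per LfGroup + one for (HfGlobal + all HfPass)
 * + one PassGroup per (pass, group) pair.
 * Assumption: a frame with one group and one pass uses a single entry. */
uint64_t toc_entry_count(uint32_t num_groups, uint32_t num_lf_groups,
                         uint32_t num_passes)
{
    if (num_groups == 1 && num_passes == 1)
        return 1;
    return 1 + (uint64_t)num_lf_groups + 1 +
           (uint64_t)num_groups * num_passes;
}
```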
|
|
|
Traneptora
|
2022-06-11 10:16:37
|
also
|
|
2022-06-11 10:16:39
|
> In case there is a single TOC entry, permuting is useless, but permuted_toc is still signaled.
|
|
2022-06-11 10:16:52
|
what if `permuted_toc = 1` but there's a single TOC entry?
|
|
2022-06-11 10:16:58
|
do we decode a lehmer sequence of size 1, by initializing 8 pre-clustered distributions, reading `end = 0` and then going "ah"?
|
|
2022-06-11 10:17:27
|
like is this what happens?
|
|
|
yurume
|
2022-06-11 10:27:59
|
I believe so
|
|
2022-06-11 10:28:33
|
technically speaking `end = 1` is also permitted, but the single integer should be 0
|
|
2022-06-11 10:28:53
|
and to my knowledge libjxl always strips all trailing zeros so this would never happen in reality
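
For reference, a minimal C sketch of turning a Lehmer code into a permutation, with everything past `end` treated as zero. It assumes no skipped prefix and uses made-up names, so it illustrates the idea rather than the exact procedure in the spec or libjxl:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Expands a Lehmer code into a permutation of 0..n-1.
 * lehmer[0..end-1] are the decoded values; entries from end onward are
 * implicitly 0. Each value picks the k-th element not yet used.
 * Returns 0 on success, -1 if a value is out of range. */
int decode_lehmer(const uint32_t *lehmer, size_t end, size_t n, uint32_t *perm)
{
    if (end > n)
        return -1;
    for (size_t i = 0; i < n; i++)
        perm[i] = (uint32_t)i; /* start from the identity */
    for (size_t i = 0; i < end; i++) {
        uint32_t k = lehmer[i];
        if (k >= n - i)
            return -1; /* must index into the remaining elements */
        uint32_t v = perm[i + k];
        /* shift the k skipped elements right by one to make room */
        memmove(&perm[i + 1], &perm[i], k * sizeof perm[0]);
        perm[i] = v;
    }
    return 0;
}
```

With `n = 1` the result is the identity whether `end` is 0 or 1, which matches the "useless but still signaled" case.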
|
|
|
Traneptora
|
2022-06-12 05:21:19
|
```
leo@gauss ~/Development/ffmpeg-devel/ffmpeg :) $ hexdump -C ../ants-raw.jxl | head -n1
00000000 ff 0a 7a 4c 03 91 e0 5b 04 10 00 70 a8 da 58 20 |..zL...[...p..X |
leo@gauss ~/Development/ffmpeg-devel/ffmpeg :) $ cat ../ants-raw.jxl{,,,,,} | ./ffmpeg_g -i - -f null -
ffmpeg version N-107087-g56d2ac501d Copyright (c) 2000-2022 the FFmpeg developers
built with gcc 12.1.0 (GCC)
configuration: --enable-libjxl --disable-stripping --disable-doc --assert-level=2
libavutil 57. 26.100 / 57. 26.100
libavcodec 59. 33.100 / 59. 33.100
libavformat 59. 24.100 / 59. 24.100
libavdevice 59. 6.100 / 59. 6.100
libavfilter 8. 40.100 / 8. 40.100
libswscale 6. 6.100 / 6. 6.100
libswresample 4. 6.100 / 4. 6.100
Input #0, jpegxl_pipe, from 'pipe:':
Duration: N/A, bitrate: N/A
Stream #0:0: Video: jpegxl, rgb24, 3264x2448, 25 fps, 25 tbr, 25 tbn
Stream mapping:
Stream #0:0 -> #0:0 (jpegxl (libjxl) -> wrapped_avframe (native))
Output #0, null, to 'pipe:':
Metadata:
encoder : Lavf59.24.100
Stream #0:0: Video: wrapped_avframe, rgb24(pc, gbr/bt709/iec61966-2-1, progressive), 3264x2448, q=2-31, 200 kb/s, 25 fps, 25 tbn
Metadata:
encoder : Lavc59.33.100 wrapped_avframe
frame= 6 fps=0.0 q=-0.0 Lsize=N/A time=00:00:00.24 bitrate=N/A speed=0.906x
video:3kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
leo@gauss ~/Development/ffmpeg-devel/ffmpeg :) $
```
|
|
2022-06-12 05:21:22
|
<:PogU:772706812427894824>
|
|
2022-06-12 05:21:55
|
it's decoding all 6 frames concatenated
|
|
2022-06-12 05:22:13
|
it still fails some of the conformance samples test cases so I need to debug it
|
|
2022-06-12 05:22:20
|
but it looks like the base part is working
|
|
|
_wb_
|
2022-06-12 07:20:21
|
It's not really animated jxl, it's concatenated still files
|
|
|
Traneptora
|
2022-06-12 01:55:38
|
had to work around two more spec bugs
|
|
|
yurume
|
2022-06-12 02:21:00
|
I'm reporting any issues and bugs I and others have found, but there is no public list for some reason
|
|
2022-06-12 02:22:01
|
those bugs are for the unpublished committee draft, so a public list would have to be adjusted for the current standard or equivalent
|
|
|
_wb_
|
2022-06-12 02:54:56
|
Officially you have to be an ISO member and make comments through your national body
|
|
|
fab
|
2022-06-12 03:19:53
|
Default speed is 3 of a 9 in cjxlng
|
|
2022-06-12 03:20:39
|
So basically the question wanted an update on what after cjxl
|
|
2022-06-12 03:21:02
|
Other than the bug fixes
|
|
2022-06-12 03:21:34
|
<#804324493420920833>
|
|
2022-06-12 03:21:51
|
Here i chatted
|
|
|
Traneptora
|
|
Traneptora
had to work around two more spec bugs
|
|
2022-06-12 03:36:28
|
those in question, dunno if they already came up but
|
|
2022-06-12 03:36:49
|
the Restoration Filter mentions `!all_default && epf_iters` at a variety of points
|
|
2022-06-12 03:36:56
|
but this is actually just `epf_iters`
|
|
2022-06-12 03:37:16
|
since if `all_default == true` then `epf_iters` defaults to 2
|
|
2022-06-12 03:37:30
|
and the actual libjxl code doesn't check `!all_default && epf_iters` it just checks `epf_iters > 0`
|
|
2022-06-12 03:37:49
|
so in these scenarios a few bits about things like custom weights or sigma must be signalled
|
|
2022-06-12 03:37:58
|
same with `gab`
|
|
2022-06-12 03:38:09
|
if `all_default` then `gab = 1` which means `gab_custom` is signalled
|
|
2022-06-12 03:38:28
|
even tho the spec says `!all_default && gab` this is incorrect, it's actually just `gab`
|
|
2022-06-12 03:39:25
|
the second issue is the condition for `save_before_ct` is not correct
|
|
2022-06-12 03:39:33
|
`d_sbct` is the condition to signal it, not the default value
|
|
2022-06-12 03:39:36
|
the default value is 1
|
|
2022-06-12 03:40:00
|
these two spec bugs look to me like libjxl bugs that have become de facto standards
|
|
2022-06-12 03:40:12
|
but since every jxl file produced will have these issues we'll have to modify the spec
|
|
|
_wb_
|
2022-06-12 03:55:07
|
right
|
|
2022-06-12 03:55:39
|
can you open a libjxl issue anyway? we won't be changing what libjxl does here, but just so we don't forget to fix the spec
|
|
|
Traneptora
|
2022-06-12 04:47:27
|
provided that @yurume doesn't already have these two on their list
|
|
2022-06-12 04:48:03
|
also btw speaking of which, yurume you make an incorrect assumption about jxsml__u32, which is that `the maximum value U32() actually reads is 2^30 + 4211711, so int32_t should be enough`
|
|
2022-06-12 04:48:16
|
the animation header reads `u(32)` in `num_loops`
|
|
|
_wb_
|
2022-06-12 05:34:23
|
u(32) is not the same thing as U32() though
|
|
|
Traneptora
|
2022-06-12 07:15:01
|
I know, but he aborts() if one of the `u(n)` in U32 contains `n > 30`
|
|
2022-06-12 07:15:03
|
which is legal
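
A minimal C sketch of a reader that handles the full range, assuming `u(n)` reads bits least-significant first and `U32()` is a 2-bit selector followed by one of four "offset + u(n)" distributions. The struct and function names are hypothetical, not jxsml's:

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *data;
    size_t size;
    size_t bitpos;
} Bits;

/* u(n): reads n bits (0 <= n <= 32), least-significant bit first.
 * Accumulates in 64 bits so that n = 32 needs no special casing. */
int read_u(Bits *b, unsigned n, uint32_t *out)
{
    uint64_t v = 0;
    if (n > 32 || b->bitpos + n > b->size * 8)
        return -1;
    for (unsigned i = 0; i < n; i++) {
        uint64_t bit = (b->data[b->bitpos >> 3] >> (b->bitpos & 7)) & 1u;
        v |= bit << i;
        b->bitpos++;
    }
    *out = (uint32_t)v;
    return 0;
}

/* One U32 distribution: a constant is {0, value}, plain bits are {n, 0}. */
typedef struct {
    unsigned nbits;
    uint32_t offset;
} U32Dist;

/* U32(d0, d1, d2, d3): a 2-bit selector, then offset + u(nbits). */
int read_U32(Bits *b, const U32Dist d[4], uint32_t *out)
{
    uint32_t sel, extra;
    if (read_u(b, 2, &sel) < 0)
        return -1;
    if (read_u(b, d[sel].nbits, &extra) < 0)
        return -1;
    *out = d[sel].offset + extra; /* unsigned, so no signed overflow */
    return 0;
}
```

Storing the result in `uint32_t` rather than `int32_t` is the whole point here: a 32-bit field like `num_loops` then fits without aborting.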
|
|
|
_wb_
can you open a libjxl issue anyway? we won't be changing what libjxl does here, but just so we don't forget to fix the spec
|
|
2022-06-12 07:15:34
|
no because i double checked the behavior against libjxl and the spec is indeed correct here
|
|
2022-06-12 07:15:56
|
I had two bugs that masked each other because they produced the same alignment when both came into play
|
|
2022-06-12 07:16:18
|
fixing one then broke the restoration filter and setting *its* behavior to match the spec fixed it
|
|
|
_wb_
|
2022-06-12 07:38:51
|
Ok so both spec and libjxl were actually ok then?
|
|
|
yurume
|
2022-06-12 11:20:05
|
I believe libjxl and the spec agree with each other for epf_iters, otherwise it is impossible to get through the image header anyway; the all_default condition is an early exit in libjxl
|
|
2022-06-12 11:20:40
|
I also have already noted save_before_ct issue, which was necessary for fully parsing the image header
|
|
2022-06-12 11:21:08
|
maybe I should just comment every place that needs attention in jxsml
|
|
2022-06-12 11:22:35
|
the `u(32)` thing is one thing I'm aware of but didn't fix yet because I'm yet to handle animations, I think there is a comment that reads "TODO u(32)?!" or similar already
|
|
2022-06-12 11:23:40
|
my current plan is to make a 64-bit version of `jxsml__u` when necessary
|
|
|
BlueSwordM
|
|
_wb_
Officially you have to be an ISO member and make comments through your national body
|
|
2022-06-13 12:21:57
|
Sadly, that's one of the things that some ffmpeg devs are still a bit annoyed about.
I know that kind of change takes time, but we need to change the state so that people don't say stuff like:
"Sigh, it's another closed specification open standard as always."
|
|
|
yurume
|
|
yurume
maybe I should just comment every place that needs attention in jxsml
|
|
2022-06-13 12:47:00
|
I'll do this today
|
|
|
_wb_
|
|
BlueSwordM
Sadly, that's one of the things that some ffmpeg devs are still a bit annoyed about.
I know that kind of change takes time, but we need to change the state so that people don't say stuff like:
"Sigh, it's another closed specification open standard as always."
|
|
2022-06-13 05:16:07
|
I am also annoyed about it.
|
|
|
fab
|
2022-06-13 11:40:54
|
|
|
2022-06-13 11:41:18
|
Why is JPEG XL set to replace PNG?
|
|
2022-06-13 11:41:33
|
Isn't it called JPEG XL?
|
|
2022-06-13 11:41:45
|
Argument
|
|
|
_wb_
|
2022-06-13 12:39:36
|
the only advantage PNG has over JPEG XL is that it is better supported in existing software
|
|
|
yurume
|
2022-06-13 12:54:12
|
random idea: would it be possible to exploit the side information from existing PNG files (say, filters or Huffman tree) for faster compression at the same effort level?
|
|
|
Traneptora
|
2022-06-13 12:55:43
|
the huffman tree is just zlib directly
|
|
|
yurume
|
2022-06-13 12:55:44
|
honestly I think it's unlikely, but it might be some use for lower levels where heuristics are not as strong
|
|
|
Traneptora
the huffman tree is just zlib directly
|
|
2022-06-13 12:56:14
|
maybe I should say the block split or so
|
|
|
Traneptora
|
2022-06-13 12:56:36
|
well the context for the prefix code in zlib is the post-filtered data
|
|
|
yurume
|
2022-06-13 12:56:42
|
(which is again from zlib, probably via tallying, but anyway)
|
|
|
Traneptora
|
2022-06-13 12:56:55
|
I don't believe JXL will be compressing data that looks similar
|
|
2022-06-13 12:57:15
|
one of the main disadvantages of PNG is that encoding high-effort zlib is slow
|
|
2022-06-13 12:57:55
|
but yea unlike with JPEG reconstruction, post-filtered PNG data looks nothing like the modular data that's being compressed
|
|
|
yurume
|
2022-06-13 12:57:59
|
you are entirely correct that the distribution would look very different, but I was thinking of the possibility that existing PNG can give some low-profile insight about the resulting distribution
|
|
|
Traneptora
|
2022-06-13 12:58:56
|
it seems unlikely to me, and you probably don't even want that if the PNG was compressed with something low-effort like zlib-6
|
|
|
_wb_
|
2022-06-13 01:00:02
|
for images smaller than 1024x1024 and using png8 (i.e. palette), I think jxl could basically do a "literal transcode" of a PNG stream, i.e. replicate the palette, use the png predictor per row (using a tree that just selects on `y`) and make all deflate matches in the png correspond to lz77 matches in jxl
|
|
2022-06-13 01:00:27
|
for png24/png32 it's a bit trickier due to png being interleaved and jxl being planar
|
|
2022-06-13 01:00:42
|
and for images larger than 1024x1024 the group splitting will ruin things
|
|
|
Traneptora
|
|
_wb_
for images smaller than 1024x1024 and using png8 (i.e. palette), I think jxl could basically do a "literal transcode" of a PNG stream, i.e. replicate the palette, use the png predictor per row (using a tree that just selects on `y`) and make all deflate matches in the png correspond to lz77 matches in jxl
|
|
2022-06-13 01:01:17
|
would you even want to do this though
|
|
|
_wb_
|
2022-06-13 01:01:29
|
probably not
|
|
|
Traneptora
|
2022-06-13 01:01:41
|
I feel like at that point you'd gain no file size
|
|
|
yurume
|
2022-06-13 01:01:41
|
well PNG predictors wrap around I believe
|
|
|
_wb_
|
2022-06-13 01:01:43
|
it would probably be guaranteed to be smaller even if the png was optimized very exhaustively
|
|
|
Traneptora
|
2022-06-13 01:01:55
|
PNG predictors select on `up, left, and upleft`
|
|
|
yurume
|
2022-06-13 01:02:01
|
"lossless PNG recompression" (taken literally)
|
|
|
_wb_
|
2022-06-13 01:02:19
|
yeah the predictors are not quite the same, but in any case, for png8 they tend to be rarely used anyway
|
|
|
Traneptora
|
2022-06-13 01:02:47
|
but would that be more performant than existing fjxl?
|
|
2022-06-13 01:02:49
|
I'm guessing not
|
|
2022-06-13 01:03:04
|
performant and efficient I should say
|
|
|
yurume
|
2022-06-13 01:03:12
|
that's also why I think it's unlikely
|
|
2022-06-13 01:03:23
|
basically I was just thinking out loud
|
|
|
_wb_
|
2022-06-13 01:03:43
|
well if you first need to do zopfli or some other slow png optimization, then of course this will be slow and probably not even better than just doing jxl encode from scratch
|
|
|
Traneptora
|
2022-06-13 01:03:52
|
yea, since PNG does not quantize I see no benefit of doing an equivalent of JPEG-reconstruction with PNG->JXL
|
|
2022-06-13 01:04:11
|
the primary reason you care about JPEG reconstruction is to avoid generation loss
|
|
|
yurume
|
2022-06-13 01:04:40
|
it would be still nice to guarantee some sort of hard bounds for PNG-to-JXL conversion efficiency
|
|
|
_wb_
|
|
Traneptora
the primary reason you care about JPEG reconstruction is to avoid generation loss
|
|
2022-06-13 01:04:48
|
yes, and also because there is in principle more information in the dct coeffs than when you decode it to 8-bit rgb with some implementation
|
|
|
Traneptora
|
2022-06-13 01:05:27
|
speaking of which, JPEG decoding is not fully specified, but JXL decoding is
so in some ways JPEG->JXL transcoding contains a JPEG decoder method inside it
|
|
2022-06-13 01:05:54
|
you could argue that libjxl is a JPEG decoder because JPEG->JXL->Pixels is fully specified
|
|
|
_wb_
|
2022-06-13 01:06:00
|
one of the worst-case kind of behaviors is something like an image that is duplicated a few times horizontally and each copy is too large to have any matches within jxl groups
|
|
|
spider-mario
|
|
Traneptora
speaking of which, JPEG decoding is not fully specified, but JXL decoding is
so in some ways JPEG->JXL transcoding contains a JPEG decoder method inside it
|
|
2022-06-13 01:06:06
|
yep, and a pretty good one if we may say so
|
|
|
_wb_
|
2022-06-13 01:06:41
|
png can do horizontal duplication quite cheaply since the copies are basically free and will be easily found with even cheap deflate settings
|
|
2022-06-13 01:07:09
|
jxl _could_ do the copies using patches but the current encoder will not detect this
|
|
|
Traneptora
|
2022-06-13 01:08:23
|
long-range perfect redundancy could be detected fairly cheaply though, even if it doesn't currently
|
|
|
_wb_
|
|
Traneptora
speaking of which, JPEG decoding is not fully specified, but JXL decoding is
so in some ways JPEG->JXL transcoding contains a JPEG decoder method inside it
|
|
2022-06-13 01:08:37
|
jxl decoding is way more fully specified than jpeg decoding, but still not fully specified, or at least, conformance does allow some deviation from the mathematically idealized infinite-precision spec which nobody can implement
|
|
|
Traneptora
|
|
_wb_
jxl decoding is way more fully specified than jpeg decoding, but still not fully specified, or at least, conformance does allow some deviation from the mathematically idealized infinite-precision spec which nobody can implement
|
|
2022-06-13 01:09:32
|
I see, and the rounding is not specified?
|
|
|
_wb_
|
2022-06-13 01:09:35
|
probably we should port some intra-block-copy detection algos from video codecs to libjxl
|
|
|
Traneptora
I see, and the rounding is not specified?
|
|
2022-06-13 01:09:53
|
the spec describes everything with mathematical real numbers, so there is no rounding there
|
|
|
Traneptora
|
2022-06-13 01:10:23
|
I see. a lot of people care about bitexactness in decode so just specifying how to round them isn't a bad idea
|
|
|
yurume
|
2022-06-13 01:10:32
|
thankfully I believe the spec has no component where you actually need arbitrarily large precision, triple-double would be sufficient
|
|
2022-06-13 01:11:04
|
(triple-double is essentially the same as IEEE 754 binary64 but with the mantissa size tripled)
|
|
|
_wb_
|
2022-06-13 01:11:52
|
we want applications to be able to get full conformance with a single-precision float implementation, and some level of conformance with a faster / more hw-friendly limited precision fixed point implementation
|
|
|
yurume
|
2022-06-13 01:12:16
|
I believe so because it is already known that correctly rounding binary64 trig and other transcendental functions can be implemented with at most triple-doubles
|
|
2022-06-13 01:14:00
|
such mathematical functions are the biggest source of precision requirements, as long as there is no arbitrary construct that has to operate on idealized real numbers, and JXL doesn't have anything like that
|
|
|
_wb_
|
2022-06-13 01:14:38
|
for full conformance only lossless needs to be bit-exact, for lossy we allow some nonzero peak error and some RMSE compared to the reference decodes, certainly enough to not have to implement things using double precision or worse
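a rough C sketch of what such a check against a reference decode could look like. The names (`DecodeError`, `compare_decodes`, `within_tolerance`) and the threshold parameters are hypothetical, the real tolerances live in the conformance spec and are not reproduced here. It compares the squared RMSE bound against the mean squared error so no `sqrt` is needed:

```c
#include <stddef.h>

typedef struct {
    double peak; /* largest absolute per-sample difference */
    double mse;  /* mean squared error (RMSE squared) */
} DecodeError;

/* Compare a decoded buffer against the reference decode, sample by sample. */
static DecodeError compare_decodes(const float *ref, const float *test, size_t n)
{
    DecodeError e = { 0.0, 0.0 };
    for (size_t i = 0; i < n; i++) {
        double d = (double)ref[i] - (double)test[i];
        if (d < 0)
            d = -d;
        if (d > e.peak)
            e.peak = d;
        e.mse += d * d;
    }
    if (n > 0)
        e.mse /= (double)n;
    return e;
}

/* max_peak and max_rmse are placeholders for whatever the conformance
 * level actually allows. */
static int within_tolerance(DecodeError e, double max_peak, double max_rmse)
{
    return e.peak <= max_peak && e.mse <= max_rmse * max_rmse;
}
```

for lossless you would instead require `peak == 0`, i.e. a bit-exact match.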
|
|
2022-06-13 01:16:08
|
(this also allows us to just produce reference decodes with libjxl; the difference between that and a libjxl that uses double precision is smaller than the tolerances we allow anyway)
|
|
|
Traneptora
|
2022-06-13 08:25:33
|
This is not a bug but a clarity issue, can you add it to your list <@268284145820631040>
|
|
2022-06-13 08:25:35
|
> one for HfGlobal followed by HfPass data for all the passes
|
|
2022-06-13 08:26:12
|
There should be a sentence that clarifies that even if these groups are not present because `encoding != kVarDCT`, there's still a TOC entry for them, but it must be 0.
|
|
|
yurume
|
2022-06-14 12:11:57
|
there is a note that reads "TOC entries can be zero, which corresponds to an empty section", so I (correctly) thought that was the case
|
|
2022-06-14 12:12:24
|
anyway I'll note this in the other issue (I have a bigger issue about TOC in general)
|
|
|
Traneptora
|
2022-06-14 03:47:47
|
👍
|
|
2022-06-14 04:33:03
|
I'm confused about the *Progressive* conformance sample
|
|
2022-06-14 04:34:30
|
It's a 467k file, and yet the TOC sizes are listed in the order of hundreds of millions
|
|
2022-06-14 04:34:48
|
it's possible I'm misaligned
|
|
|
yurume
|
2022-06-14 04:39:24
|
oh, jxsml fails to decode ICC for that file
|
|
2022-06-14 04:39:31
|
another thing to check
|
|
|
Traneptora
|
2022-06-14 04:39:44
|
well `icc_enc_size` is 711 for both my code and your code
|
|
2022-06-14 04:39:49
|
but jxlinfo reports it at 896
|
|
2022-06-14 04:39:57
|
so it's probably an image header issue
|
|
|
yurume
|
2022-06-14 04:40:02
|
I guess so
|
|
|
Traneptora
|
2022-06-14 04:40:13
|
since `icc_enc_size` precedes the icc in the bitstream
|
|
2022-06-14 04:42:45
|
I'll have to investigate
|
|
2022-06-14 04:42:55
|
unfortunately the libjxl code makes this rather difficult to debug
|
|
|
yurume
|
2022-06-14 04:43:23
|
I tend to throw lots of printf around suspicious lines
|
|
|
Traneptora
|
2022-06-14 04:44:47
|
the issue I'm wondering about is what do I printf
|
|
|
yurume
|
2022-06-14 04:50:37
|
uh, just `printf("foo\n");`?
|
|
2022-06-14 04:51:00
|
(I tend to use `fprintf(stderr, "...");` instead but anyway)
|
|
|
Traneptora
|
2022-06-14 04:51:02
|
oh yea I do that, but what I'm trying to print is the offset from the codestream header
|
|
2022-06-14 04:51:12
|
and I'm not sure where in the libjxl visitor code I can get that
|
|
|
yurume
|
2022-06-14 04:51:20
|
that's `BitReader::TotalBitsConsumed()` or so
|
|
|
Traneptora
|
2022-06-14 05:04:48
|
djxl reports 65 bits consumed for the image header
|
|
2022-06-14 05:05:22
|
I'm getting 66
|
|
2022-06-14 05:05:38
|
there's probably a bit behind a flag that I'm not noticing
|
|
2022-06-14 05:09:11
|
I think it's not reading the default matrix flag though, so I don't know
|
|
|
improver
|
2022-06-14 05:10:06
|
|
|
2022-06-14 05:10:22
|
idk about current draft version but snapshot i have has this
|
|
|
Traneptora
|
|
improver
idk about current draft version but snapshot i have has this
|
|
2022-06-14 05:13:00
|
in U32? yea that's been fixed, that typo
|
|
2022-06-14 05:15:25
|
I think the 65 vs 66 bit difference is because CustomTransformData is not considered part of ImageMetadata in the libjxl code
|
|
|
yurume
|
2022-06-14 05:15:57
|
wait, what?
|
|
|
Traneptora
|
2022-06-14 05:16:02
|
so the extra bit is just `default_matrix`
|
|
|
yurume
|
2022-06-14 05:16:14
|
ah `all_default` thingy
|
|
2022-06-14 05:16:17
|
ugh
|
|
|
Traneptora
|
2022-06-14 05:16:42
|
also, I think the difference between 711 and 896 is that 711 is the size of the *encoded* ICC profile
|
|
2022-06-14 05:16:45
|
which we both just skip over
|
|
2022-06-14 05:16:53
|
after it's decompressed, it's decoded
|
|
2022-06-14 05:18:39
|
<@268284145820631040> not an ICC issue
|
|
2022-06-14 05:18:58
|
libjxl, jxsml, and my code all report the frame header starting at 20936 bits in
|
|
|
_wb_
|
2022-06-14 05:37:52
|
I started looking at some spec bugreports, yurume is very thorough!
|
|
|
improver
|
2022-06-14 06:05:57
|
> Each extension is identified by an ext_id in the range [0, 63), assigned in increasing sequential order.
63) is kinda.. weird number? it would mean from 0 up to 62
|
|
|
Traneptora
|
2022-06-14 06:06:08
|
that's probably a typo
|
|
|
improver
|
2022-06-14 06:06:10
|
but `static constexpr size_t kMaxExtensions = 64; // bits in u64`
|
|
|
Traneptora
|
2022-06-14 06:06:15
|
should be `[0, 64)`
|
|
|
improver
|
|
Traneptora
|
2022-06-14 06:45:36
|
figured it out <@268284145820631040>
|
|
2022-06-14 06:45:45
|
> If lf_level != 0, the samples of the frame (before any colour transform is applied) are recorded as
> LFFrame[ lf_level-1] and may be referenced by subsequent frames. Moreover, in this case, the decoder
> considers the current frame to have dimensions ceil( width / (1 « (3× lf_level))) × ceil( height / (1 «
> (3× lf_level))).
|
|
2022-06-14 06:46:07
|
This affects the calculation of `width` and `height` in the header, which affects `group_dim` and the number of TOC entries
|
|
|
_wb_
|
2022-06-14 07:09:50
|
Ah yes, lf frames have implicit dimensions that are not just the image dimensions
|
|
|
Traneptora
|
2022-06-14 07:12:08
|
after adjusting for that, I get past that frame
|
|
2022-06-14 07:12:22
|
I'll still have other bugs elsewhere but they're something unrelated
|
|
|
_wb_
Ah yes, lf frames have implicit dimensions that are not just the image dimensions
|
|
2022-06-14 07:12:59
|
the spec could be a bit more explicit about it IMO. it does say it but it should say that when calculating the frame size for the purpose of `num_groups` it takes that into account
|
|
2022-06-14 07:14:26
|
especially considering it's mentioned *after* the part where num_groups is defined
|
|
|
_wb_
|
2022-06-14 07:21:30
|
Good point, that order should be reversed I guess
|
|
|
Sage
|
2022-06-14 11:35:21
|
how well does monochrome jxl fare against jbig2?
|
|
2022-06-14 11:35:45
|
AFAIK jbig2 is the best encoder of monochrome images so I'm interested
|
|
|
improver
|
2022-06-15 12:32:06
|
SizeHeader could have note that neither height nor width can exceed 2^30. right now seems like it's possible to exceed with certain ratio values
|
|
2022-06-15 12:32:35
|
seems that due to how it's encoded height shouldn't be able to exceed
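a small C sketch of the problem, assuming the fixed aspect ratio table as I remember it from the spec (1:1, 12:10, 4:3, 3:2, 16:9, 5:4, 2:1 for `ratio` 1..7). `width_from_ratio` is a made-up helper, not a libjxl function:

```c
#include <stdint.h>

#define MAX_DIM (UINT64_C(1) << 30)

/* Returns the width implied by height and a fixed ratio, or 0 if the
 * ratio index is not in 1..7 or the implied width would exceed 2^30. */
static uint64_t width_from_ratio(uint64_t height, int ratio)
{
    static const uint32_t num[8] = { 0, 1, 12, 4, 3, 16, 5, 2 };
    static const uint32_t den[8] = { 0, 1, 10, 3, 2, 9, 4, 1 };
    uint64_t width;

    if (ratio < 1 || ratio > 7)
        return 0;
    width = height * num[ratio] / den[ratio];
    return width <= MAX_DIM ? width : 0;
}
```

so a height of 2^30 with the 2:1 ratio implies a width of 2^31, which is the case a note in the spec would have to rule out.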
|
|
|
Traneptora
|
2022-06-15 03:14:31
|
only one more conformance sample to debug against
|
|
2022-06-15 03:14:55
|
the others all successfully parsed <:CatFlex:666111460086251533>
|
|
2022-06-15 03:15:31
|
squashed some bugs, including in the spec
|
|
2022-06-15 03:16:09
|
plus discovered some unclear things in the spec
|
|
|
_wb_
|
|
Sage
AFAIK jbig2 is the best encoder of monochrome images so I'm interested
|
|
2022-06-15 05:08:46
|
Jxl isn't specifically designed for 1-bit and it's a case that hasn't been the focus yet for encoder improvements, but I think in theory jxl should be able to match jbig2 — at least jxl can do patches and run lengths too...
|
|
|
Sage
|
2022-06-15 05:38:21
|
interesting
|
|
2022-06-15 05:39:47
|
I believe jbig2 is very patent encumbered and proprietary so a modern open image format being able to supersede it would be nice
|
|
|
_wb_
|
2022-06-15 07:04:07
|
jbig2 was published in 2000 so patents shouldn't be an issue if they were — it also has been part of PDF for quite a long time and I never heard about patent issues there...
|
|
2022-06-15 07:06:59
|
there were patents on it but they are all from the 1990s so they expired in the 2010s, and they weren't a big issue since royalty-free licenses were available afaiu
|
|
2022-06-15 07:07:47
|
jbig2 mostly has a bad reputation because some copiers implemented lossy jbig2 in a bad way, turning 6's into 8's and stuff like that
|
|
2022-06-15 07:08:31
|
but tbh that's not really a codec issue but a bad implementation issue, if you ask me
|
|
2022-06-15 07:09:16
|
(one could also make a lossy PNG encoder that too eagerly sees "good enough" deflate matches and that would also turn a 6 into an 8)
|
|
2022-06-15 07:11:10
|
anyway, I think jxl _could_ be universal enough to replace jbig2, but one practical issue with the current libjxl implementation is that it will use 12 bytes per pixel internally (for 3-channel 32-bit floats) which is of course major overkill when the actual data is 1 bit per pixel
|
|
2022-06-15 07:12:07
|
so we'd need some specialized code paths to make things more memory-friendly in that case
|
|
2022-06-15 07:13:23
|
I'm wondering if 1-bit images are still very relevant though — maybe 4-bit grayscale is a more important target now, since I think that is what most e-readers use natively...
|
|
|
Traneptora
|
2022-06-16 11:16:15
|
alright, my parser passes all conformance samples
|
|
2022-06-16 11:16:22
|
here's the main thing I noticed with regard to the spec
|
|
2022-06-16 11:17:28
|
1. when calculating `num_groups`, one must compute `ceil(width / group_dim)`, using `width` in the frame header.
|
|
2022-06-16 11:18:05
|
However, if `lf_level != 0`, then one must set `width = ceil(width / (1 << (3 * lf_level)))`
|
|
2022-06-16 11:18:19
|
*before* making this `num_groups` calculation
|
|
2022-06-16 11:18:25
|
(same for height as well)
|
|
2022-06-16 11:18:37
|
this is in the spec, but since it's mentioned after `num_groups` is defined, it's very confusing
|
|
2022-06-16 11:19:19
|
2. The same is also true if `upsampling != 1`. If `upsampling > 1` then you must divide `width = ceil(width / upsampling)` before calculating `num_groups`
|
|
2022-06-16 11:20:46
|
Do note that upsampling is present provided that `!(flags & kUseLfFrame)`, but the spec doesn't actually say that this flag *must* be enabled if `frame_type == kLFFrame`
|
|
2022-06-16 11:21:11
|
this means, in theory, you can have both downsamplings
|
|
2022-06-16 11:21:26
|
and if so, you apply the `lf_level` one first, and then apply the `upsampling` one
|
|
2022-06-16 11:21:32
|
but the spec doesn't say this
|
|
2022-06-16 11:21:38
|
since you call `ceil` first, this is not commutative
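a minimal C sketch of that order, just to make it concrete. `ceil_div`, `frame_dim` and `num_groups` are made-up helper names, not from the spec or libjxl, and `group_dim` is assumed to be known already:

```c
#include <assert.h>
#include <stdint.h>

/* Ceiling division for positive integers. */
static uint32_t ceil_div(uint32_t n, uint32_t d)
{
    assert(d > 0);
    return (n + d - 1) / d;
}

/* Effective frame dimension used for the group count: apply the lf_level
 * downsampling first, then the upsampling factor. */
static uint32_t frame_dim(uint32_t dim, int lf_level, uint32_t upsampling)
{
    if (lf_level != 0)
        dim = ceil_div(dim, 1u << (3 * lf_level));
    if (upsampling > 1)
        dim = ceil_div(dim, upsampling);
    return dim;
}

/* Number of group_dim x group_dim tiles covering the frame. */
static uint32_t num_groups(uint32_t width, uint32_t height, uint32_t group_dim)
{
    return ceil_div(width, group_dim) * ceil_div(height, group_dim);
}
```

e.g. a 4000x3000 frame with `lf_level == 1` and `group_dim == 256` ends up as 500x375, so 4 groups, whereas using the unadjusted dimensions would give 192.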
|
|
2022-06-16 11:22:29
|
3. The condition for `save_before_ct` is incorrect
|
|
2022-06-16 11:23:56
|
The spec says that it's signalled provided that `!all_default && frame_type != kLFFrame`
|
|
2022-06-16 11:25:12
|
with a default value `d_sbct` defined later in the spec
|
|
2022-06-16 11:27:16
|
the actual condition for `save_before_ct` being signaled is
```
frame_type == kFrameReferenceOnly || full_frame && (frame_type == kFrameRegular || frame_type == kFrameSkipProgressive) && (frame_duration == 0 || save_as_ref != 0) && !is_last && blend_mode == kReplace
```
|
|
2022-06-16 11:27:22
|
this is what I got from inspecting libjxl code
|
|
2022-06-16 11:28:15
|
also `!all_default` as well
|
|
2022-06-16 11:29:04
|
its default value also appears to be `true`, if `frame_type == kLFFrame`
|
|
2022-06-16 11:31:36
|
with *no* default value defined in libjxl if `frame_type == kFrameRegular || frame_type == kFrameSkipProgressive`
|
|
2022-06-16 11:33:14
|
that is, it defaults to `true` if `frame_type == kLFFrame`. Otherwise, if it is signalled, it has whatever value it is signalled as. Otherwise, it does not have a default value.
|
|
2022-06-16 11:34:07
|
It would make sense for the spec to define its default to `true`, since I'm guessing the value is unused if it has no default.
|
|
2022-06-16 11:37:04
|
4. The spec should be clear that if `num_toc_entries > 1`, then all the TOC entries listed there are present but the size may be zero for some of them, in the case of `encoding == kModular`
|
|
2022-06-16 11:37:26
|
5. The spec says this (F.3.3)
> Before decoding the TOC, the decoder invokes ZeroPadToByte() (B.2.7).
|
|
2022-06-16 11:37:35
|
It's not entirely clear what "before decoding the TOC" means here.
|
|
2022-06-16 11:37:50
|
The actual behavior is that `permuted_toc = Bool()` is read *immediately* after the frame header
|
|
2022-06-16 11:38:06
|
and then if it is nonzero, the TOC permutation is read *immediately* after `permuted_toc`.
|
|
2022-06-16 11:38:38
|
*then* ZeroPadToByte() is invoked, then the TOC lengths are each read, and *then* ZeroPadToByte() is invoked again after they have all been read.
|
|
2022-06-16 11:38:44
|
This should be more explicit.
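Roughly, in C, the order looks like the sketch below. The bit reader is a throwaway one with no bounds checks, the names are mine, and the `U32` constants for a TOC entry are as I remember them from the spec, so treat those as an assumption. The permuted case is not handled because it needs the entropy decoder:

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *buf;
    size_t pos; /* position in bits from the start of buf */
} BitReader;

/* Read n <= 32 bits, least significant bit first (no bounds checks). */
static uint32_t read_bits(BitReader *br, int n)
{
    uint32_t v = 0;
    for (int i = 0; i < n; i++, br->pos++)
        v |= (uint32_t)((br->buf[br->pos >> 3] >> (br->pos & 7)) & 1) << i;
    return v;
}

/* ZeroPadToByte(): skip to the next byte boundary; padding bits must be 0. */
static int zero_pad_to_byte(BitReader *br)
{
    int ok = 1;
    while (br->pos & 7)
        if (read_bits(br, 1))
            ok = 0;
    return ok ? 0 : -1;
}

/* One TOC entry, assumed to be
 * U32(Bits(10), 1024 + Bits(14), 17408 + Bits(22), 4211712 + Bits(30)). */
static uint32_t read_toc_entry(BitReader *br)
{
    static const int nbits[4] = { 10, 14, 22, 30 };
    static const uint32_t base[4] = { 0, 1024, 17408, 4211712 };
    uint32_t sel = read_bits(br, 2);
    return base[sel] + read_bits(br, nbits[sel]);
}

/* br must be positioned right after the frame header. */
static int read_toc(BitReader *br, int num_entries, uint32_t *sizes)
{
    if (read_bits(br, 1))          /* permuted_toc = Bool() */
        return -2;                 /* permutation decoding not shown here */
    if (zero_pad_to_byte(br) < 0)  /* first ZeroPadToByte() */
        return -1;
    for (int i = 0; i < num_entries; i++)
        sizes[i] = read_toc_entry(br);
    return zero_pad_to_byte(br);   /* second ZeroPadToByte() */
}
```

After `read_toc` returns 0, the reader sits on a byte boundary at the start of the first section.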
|
|
2022-06-16 11:39:49
|
6. Entropy coding!
|
|
2022-06-16 11:40:28
|
> `[0, num_clusters - 1)` appears twice in section C.2.1. It should be `[0, num_clusters)`
|
|
2022-06-16 11:41:31
|
7. Alias Mapping, section C.2.2, provides this line of code:
> ```cpp
> if (cutoffs[i] > bucket_size) overfull.push_back(i); else underfull.push_back(i);
> ```
|
|
2022-06-16 11:41:33
|
this is incorrect.
|
|
2022-06-16 11:42:09
|
The correct line here is
```cpp
if (cutoffs[i] > bucket_size) overfull.push_back(i);
else if (cutoffs[i] < bucket_size) underfull.push_back(i);
```
|
|
2022-06-16 11:43:25
|
8. Alias Mapping section C.2.2 is missing an extra loop
|
|
2022-06-16 11:44:20
|
```cpp
for (i = alphabet_size; i < table_size; i++) {
underfull.push_back(i);
}
while(overfull.size() > 0) { // right before this line
```
|
|
2022-06-16 11:44:34
|
this same loop should also zero out `cutoffs[i]` if we're not assuming it's initially zero
|
|
2022-06-16 11:45:31
|
9. Alias Mapping is missing a comment that says underfull never runs out
|
|
2022-06-16 11:46:06
|
```cpp
while (overfull.size() > 0) {
/* underfull.size() > 0 */
o = overfull.back(); u = underfull.back(); overfull.pop_back(); underfull.pop_back();
```
|
|
2022-06-16 11:46:39
|
10. This could probably be simplified to `o = overfull.pop_back(); u = underfull.pop_back();`
|
|
2022-06-16 11:47:07
|
Since you're not defining these functions, you're assuming a user is already familiar with `push()` and `pop()` on a stack, and convention is that `pop()` removes *and returns* the last element.
|
|
2022-06-16 11:48:37
|
11. Section C.2.2. defines AliasMapping for a specific distribution as `AliasMapping(x)`, a function of one variable
|
|
2022-06-16 11:49:12
|
and yet later on in Section C.2.3, you call this
```cpp
(symbol, offset) = AliasMapping(D, index);
state = D[symbol] × (state >> 12) + offset;
```
|
|
2022-06-16 11:49:50
|
it's probably a bit clearer if you write this as something like this
|
|
2022-06-16 11:50:44
|
```
(symbol, offset) = D.AliasMapping(index);
state = D[symbol] * (state >> 12) + offset;
```
|
|
2022-06-16 11:51:20
|
speaking of which, `\times` in the middle of pseudo-code is odd, I'd probably leave that in the prose, and use `*` in pseudo-code.
|
|
2022-06-16 11:56:37
|
12. In section C.2.6, you define `lsb = token & ((1 << config.lsb_in_token) - 1);`
|
|
2022-06-16 11:56:53
|
I'd probably rename this variable to `low` since I think `lsb` gives the wrong implication of what this is supposed to do.
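For context, here is a C sketch of the whole hybrid integer reconstruction with the variable named `low`. The formula is written from memory of the spec, the struct and function names are hypothetical, and the extra raw bits are passed in as a parameter instead of being read from a bitstream:

```c
#include <stdint.h>

typedef struct {
    int split_exponent, msb_in_token, lsb_in_token;
} HybridUintConfig;

/* Number of raw bits that follow `token` in the stream (0 below the split). */
static int hybrid_extra_bits(const HybridUintConfig *c, uint32_t token)
{
    uint32_t split = 1u << c->split_exponent;
    int in_token = c->msb_in_token + c->lsb_in_token;

    if (token < split)
        return 0;
    return c->split_exponent - in_token + (int)((token - split) >> in_token);
}

/* Rebuild the value from the token and the extra bits read after it. */
static uint32_t hybrid_value(const HybridUintConfig *c, uint32_t token,
                             uint32_t extra)
{
    uint32_t split = 1u << c->split_exponent;
    uint32_t low, high;
    int n = hybrid_extra_bits(c, token);

    if (token < split)
        return token;
    /* the lowest lsb_in_token bits of the final value live in the token */
    low = token & ((1u << c->lsb_in_token) - 1);
    /* top bits below the leading one, also stored in the token */
    high = (token >> c->lsb_in_token) & ((1u << c->msb_in_token) - 1);
    high |= 1u << c->msb_in_token; /* implicit leading one */
    return (((high << n) | extra) << c->lsb_in_token) | low;
}
```

So `low` really is "the low bits of the result that were stored in the token", which is why the name reads better than `lsb`.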
|
|
2022-06-16 12:01:08
|
13. Section 2.4 doesn't define `alphabet_size` for simple distributions, only for flat and for complex distributions. Since `alphabet_size` can be less than `table_size` this is important.
|
|
2022-06-16 12:18:28
|
I think that's all I had to say
|
|
|
|
veluca
|
|
Traneptora
this means, in theory, you can have both downsamplings
|
|
2022-06-16 12:24:24
|
I believe libjxl disallows it? I vote for the spec to disallow it 😄
|
|
|
Traneptora
|
2022-06-16 12:25:10
|
if it isn't supposed to happen I'd prefer if the spec just says that, yea
|
|
|
|
veluca
|
|
Traneptora
```cpp
while (overfull.size() > 0) {
/* underfull.size() > 0 */
o = overfull.back(); u = underfull.back(); overfull.pop_back(); underfull.pop_back();
```
|
|
2022-06-16 12:27:37
|
well, that will never happen, so there's no need to check it, I guess
|
|
|
Traneptora
Since you're not defining these functions, you're assuming a user is already familiar with `push()` and `pop()` on a stack, and convention is that `pop()` removes *and returns* the last element.
|
|
2022-06-16 12:28:14
|
I agree that's the convention... but it's not what the C++ vectors/queues/... do 😦
|
|
|
Traneptora
speaking of which, `\times` in the middle of pseudo-code is odd, I'd probably leave that in the prose, and use `*` in pseudo-code.
|
|
2022-06-16 12:29:15
|
there was some ISO reason for that \times unfortunately
|
|
|
Traneptora
13. Section 2.4 doesn't define `alphabet_size` for simple distributions, only for flat and for complex distributions. Since `alphabet_size` can be less than `table_size` this is important.
|
|
2022-06-16 12:29:57
|
huhhhhh that probably ought to be fixed, I'll need to check what actually happens today
|
|
2022-06-16 12:30:05
|
<@794205442175402004> can you make those changes?
|
|
|
Traneptora
|
2022-06-16 12:31:07
|
do note that some of these might also be on yurume's list
|
|
|
|
veluca
|
2022-06-16 12:31:29
|
yeah I remember seeing some alias mapping fixes at least
|
|
|
Traneptora
|
2022-06-16 12:32:20
|
luckily since these overfull and underfull stacks are capped at size 256
|
|
2022-06-16 12:32:48
|
I just stack allocated `int overfull[256]; int underfull[256]; int overfull_size = 0; int underfull_size = 0;`
|
|
2022-06-16 12:33:12
|
and then I did `push_back(foo)` as `overfull[overfull_size++] = foo;`
|
|
2022-06-16 12:33:41
|
and `foo = pop_back()` as `foo = overfull[--overfull_size];`
|
|
2022-06-16 12:33:57
|
so it is easy to do this in C without vectors
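putting that together with items 7 to 9 above, a sketch of the table init with fixed-size stacks might look like this. It is reconstructed from memory of C.2.2 rather than copied from the spec or libjxl, so the final fix-up pass and the lookup are assumptions, and all names are hypothetical:

```c
#include <assert.h>

#define LOG_TOTAL 12  /* probabilities sum to 1 << 12 */
#define MAX_TABLE 256 /* log_alphabet_size is at most 8 */

typedef struct {
    int log_bucket_size;
    int cutoffs[MAX_TABLE], offsets[MAX_TABLE], symbols[MAX_TABLE];
} AliasTable;

static void alias_init(AliasTable *t, const int *dist, int alphabet_size,
                       int log_alphabet_size)
{
    int overfull[MAX_TABLE], underfull[MAX_TABLE];
    int n_over = 0, n_under = 0;
    int table_size = 1 << log_alphabet_size;
    int bucket_size = 1 << (LOG_TOTAL - log_alphabet_size);
    int i;

    assert(alphabet_size <= table_size && table_size <= MAX_TABLE);
    t->log_bucket_size = LOG_TOTAL - log_alphabet_size;
    for (i = 0; i < alphabet_size; i++) {
        t->cutoffs[i] = dist[i];
        t->symbols[i] = i;
        t->offsets[i] = 0;
        if (dist[i] > bucket_size)
            overfull[n_over++] = i;
        else if (dist[i] < bucket_size) /* item 7: exactly full goes nowhere */
            underfull[n_under++] = i;
    }
    for (; i < table_size; i++) {       /* item 8: pad up to table_size */
        t->cutoffs[i] = 0;
        t->symbols[i] = i;
        t->offsets[i] = 0;
        underfull[n_under++] = i;
    }
    while (n_over > 0) {
        int o, u, by;
        assert(n_under > 0);            /* item 9: never runs out */
        o = overfull[--n_over];
        u = underfull[--n_under];
        by = bucket_size - t->cutoffs[u];
        t->cutoffs[o] -= by;            /* move `by` units from o into u */
        t->symbols[u] = o;
        t->offsets[u] = t->cutoffs[o];
        if (t->cutoffs[o] < bucket_size)
            underfull[n_under++] = o;
        else if (t->cutoffs[o] > bucket_size)
            overfull[n_over++] = o;
    }
    for (i = 0; i < table_size; i++) {  /* fix-up pass, from memory */
        if (t->cutoffs[i] == bucket_size) {
            t->symbols[i] = i;
            t->offsets[i] = 0;
            t->cutoffs[i] = 0;
        } else {
            t->offsets[i] -= t->cutoffs[i];
        }
    }
}

/* Map x in [0, 4096) to a symbol and its offset within that symbol. */
static int alias_lookup(const AliasTable *t, int x, int *offset)
{
    int i = x >> t->log_bucket_size;
    int pos = x & ((1 << t->log_bucket_size) - 1);

    if (pos >= t->cutoffs[i]) {
        *offset = t->offsets[i] + pos;
        return t->symbols[i];
    }
    *offset = pos;
    return i;
}
```

a quick sanity check is to run `alias_lookup` over all 4096 values of `x` and confirm each symbol `s` is hit exactly `dist[s]` times with offsets inside `[0, dist[s])`.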
|
|
|
|
veluca
|
2022-06-16 12:34:14
|
yeah it's pretty easy
|
|
2022-06-16 12:34:28
|
even if the upper bound is dynamic you can `alloca` I guess
|
|
|
Traneptora
|
2022-06-16 12:34:39
|
yea, in this case I don't need to malloc though
|
|
2022-06-16 12:34:48
|
stack allocation is *much* faster than malloc/free
|
|
|
|
veluca
|
2022-06-16 12:35:10
|
alloca *is* stack allocation 😉
|
|
|
Traneptora
|
2022-06-16 12:35:18
|
is that a C thing?
|
|
|
|
veluca
|
2022-06-16 12:35:20
|
yep
|
|
2022-06-16 12:36:03
|
pretty much the same as `int v[N];`
|
|
|
Traneptora
|
2022-06-16 12:36:15
|
oh, VLA is disallowed by FFmpeg
|
|
2022-06-16 12:36:20
|
we use ISO C90
|
|
2022-06-16 12:36:32
|
there's a small list of features that are permitted from C99
|
|
|
|
veluca
|
2022-06-16 12:36:49
|
that's painful
|
|
|
Traneptora
|
2022-06-16 12:36:54
|
nah, not really
|
|
|
|
veluca
|
2022-06-16 12:37:07
|
huh it's not even part of *C*
|
|
2022-06-16 12:37:11
|
it's a glibc extension
|
|
2022-06-16 12:37:15
|
well nevermind then
|
|
|
Traneptora
|
2022-06-16 12:37:30
|
in this case I know it's capped at 256
|
|
2022-06-16 12:37:35
|
so I can just do that
|
|
|
|
veluca
|
|
Traneptora
|
2022-06-16 12:40:02
|
specifically, this is what's allowed from C99:
1. `inline`
2. `//` comments
3. designated struct initializers e.g. `struct s x = { .i = 17 };`
4. compound literals e.g. `struct AVRational r = (struct AVRational){ 0, 1 };`
5. variable definitions *in* for loops `for (int i = 0; i < 8; i++)`
6. Variadic Macros with `__VA_ARGS__`
7. Implementation defined behavior for signed integers is assumed to match the expected behavior for two’s complement. Non representable values in integer casts are binary truncated. Shift right of signed values uses sign extension.
8. `<stdint.h>` and `<inttypes.h>`
|
|
2022-06-16 12:40:23
|
In particular mixing declarations and statements is forbidden
|
|
|
|
veluca
|
2022-06-16 12:40:43
|
which is pretty terrible usability wise
|
|
|
Traneptora
|
2022-06-16 12:41:11
|
Not really?
|
|
2022-06-16 12:41:17
|
I find it's not really an issue
|
|
|
|
veluca
|
2022-06-16 12:41:29
|
makes it a lot easier to leave variables uninitialized
|
|
|
Traneptora
|
2022-06-16 12:41:40
|
do note that it's not considered a mixture if it's at the start of a block
|
|
2022-06-16 12:41:59
|
```c
if (a > 0) {
int b = -a;
foo(b);
}
```
|
|
2022-06-16 12:42:01
|
this is legal
|
|
|
|
veluca
|
2022-06-16 12:42:03
|
ah, that's less bad then
|
|
2022-06-16 12:42:08
|
still not great IMHO
|
|
|
Traneptora
|
2022-06-16 12:42:30
|
I don't mind being unable to mix declarations and statements
|
|
2022-06-16 12:42:45
|
I found that once you get used to it, it's generally not a problem
|
|
2022-06-16 12:42:52
|
it's not as inconvenient as it sounds
|
|
|
|
veluca
|
2022-06-16 12:43:08
|
I'll believe you 😛
|
|
|
Traneptora
|
2022-06-16 12:43:33
|
do keep in mind I'm not a C++ programmer, I'm a C programmer so maybe things being somewhat less convenient is more of the norm for me <:kek:857018203640561677>
|
|
|
|
veluca
|
2022-06-16 12:44:34
|
heheh
|
|
2022-06-16 12:44:46
|
I recently started missing Rust when writing C++
|
|
|
Traneptora
|
2022-06-16 12:56:25
|
I should clarify
|
|
2022-06-16 12:56:34
|
I said my code passes all conformance samples, but it rejects spot
|
|
2022-06-16 12:56:38
|
because spot is still invalid
|
|
2022-06-16 12:57:01
|
spot is a level10 codestream as a raw file
|
|
2022-06-16 01:04:59
|
sent a PR for that anyway https://github.com/libjxl/conformance/pull/21
|
|
2022-06-16 01:19:26
|
That's correct <:kek:857018203640561677>
|
|
2022-06-16 01:19:58
|
it's a valid level10 codestream, it's just that level10 codestreams are not supposed to be raw files
|
|
2022-06-16 01:49:14
|
yea, probably. it violates a level limit but djxl doesn't actually warn when this happens
|
|
2022-06-17 05:33:38
|
<@604964375924834314> you mentioned that `cjxl` automatically detects if an ICC profile matches enum constants and if so, it will use those instead of the ICC profile
|
|
2022-06-17 05:33:48
|
Does it do this for JPEG reconstruction as well, if `--strip` is passed?
|
|
2022-06-17 05:34:30
|
(I figure it won't if `--strip` is not passed since it has to bit-exact reconstruct the original JPEG, but in the case of `--strip`, you only care about preserving the original pixel data and DCT coeffs.)
|
|
|
spider-mario
|
2022-06-17 05:34:53
|
good question, I am not sure
|
|
2022-06-17 05:35:03
|
I suspect not
|
|
|
Traneptora
|
2022-06-17 05:36:37
|
It's not a big deal for FFmpeg since `uses_original_profile` is going to be true in that case
|
|
2022-06-17 05:37:23
|
I'm trying to recompress my meme folder and some of my meme JPEGs have ICCs attached
|
|
2022-06-17 05:37:27
|
that's why I'm a little curious
|
|
2022-06-17 05:37:37
|
as it stands `jbrd` boxes are tiny so I don't really care about those
|
|
2022-06-17 05:37:50
|
but I'm wondering if `--strip` also can strip the ICC Profile away if possible
|
|
2022-06-17 05:38:42
|
as far as I'm aware, the purpose of `--strip` is if you have no intention of reconstructing the original JPEG file, but rather, you want to recompress it without generation loss
|
|
|
_wb_
|
2022-06-17 05:52:17
|
For lossless and jpeg reconstruction I think it never uses the enum since you cannot get the _exact_ icc profile from the enum - there are e.g. multiple versions of sRGB around, with different names, different rounding of the numbers, etc. For lossy we say "meh, close enough", but for lossless and jpeg reconstruction we don't.
|
|
|
|
veluca
|
2022-06-17 06:39:47
|
yup indeed
|
|
2022-06-17 06:40:05
|
I'm not *entirely sure* that's the best thing to do... but whatever
|
|
|
w
|
2022-06-17 09:29:28
|
<:weirdge:778473410341896222>
|
|
|
Traneptora
|
2022-06-17 09:38:25
|
spambots here too now?
|
|
|
_wb_
|
2022-06-17 10:20:42
|
sigh, Matrix integration...
|
|
|
Traneptora
|
2022-06-18 04:53:24
|
Update: unfortunately, Lynne doesn't want to merge my parser because it's a ton of code. she'd rather just repurpose it as a decoder instead
|
|
2022-06-18 04:55:00
|
a lot of code for a parser
|
|
2022-06-18 05:02:28
|
that said Lynne's gonna use my code to start writing her own decoder
|
|
2022-06-18 05:02:59
|
plus I have an idea to make it work with libjxl anyway
|
|
|
_wb_
|
2022-06-18 05:56:42
|
So we'll have two new decoders then? <@268284145820631040> 's and yours/Lynne's?
|
|
2022-06-18 05:57:10
|
The more implementations the merrier btw, if you ask me
|
|
|
Traneptora
|
2022-06-18 08:36:05
|
yea, although I don't know when Lynne's getting around to it
|
|
2022-06-18 08:36:31
|
she *did* say that having a working ANS implementation to look at is a lot easier to read than the spec as she had it (which was a public draft version from january)
|
|
2022-06-18 08:37:16
|
she pointed out that the 4 MiB lz77 window is a problem if you want to implement this in hardware
|
|
2022-06-18 08:37:21
|
I'm not sure if there's any real solution to this
|
|