JPEG XL

jxl

Anything JPEG XL related

Jyrki Alakuijala
2022-10-04 06:22:54
how many pixels are delivered is often something that the facebook/instagram/etc. engineers decide -- and that decision is partially based on what kind of quality the formats deliver at that zoom rate
2022-10-04 06:37:08
yet another interesting balance is the continuity of very flat blocks -- there we didn't do so well in guetzli initially, and Sascha K. reported on twitter that he could see the blocks (and I had to agree with him), so we improved it
2022-10-04 06:37:28
we had a bit of this tendency in JPEG XL, too, but rarely, until Zoltan improved it about a month ago
_wb_
2022-10-04 07:56:34
After looking at some images with --progressive_dc={0,1}, I have to agree with ssimulacra2 that the calibration between those two is slightly off, and doing progressive dc indeed causes a slight decrease in quality. I will fix it to make sure there is no quality difference between those two options.
2022-10-04 08:03:57
Also, interestingly, it seems like our current default to use progressive_dc=0 at d < 4.5 and progressive_dc=1 at d >= 4.5 is not quite optimal: ssimulacra2 suggests the following (after preliminary recalibration, not now, since now progressive_dc=1 just lowers the dc quality too much): actually progressive_dc=1 is slightly better at d < 2.5 or so and progressive_dc=0 is slightly better at d > 3.5 or so. If that's true, then our current default is kind of doing the opposite of what it should do.
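If ssimulacra2 is right, the crossover could be sketched like this (a hypothetical illustration of the defaults described in the message above, not actual libjxl code; the behavior between d=2.5 and d=3.5 is my assumption):

```python
# Hypothetical illustration of the progressive_dc defaults discussed
# above; the real libjxl heuristic may differ in detail.

def current_default(d):
    """Current default: progressive DC only at high distances (d >= 4.5)."""
    return 1 if d >= 4.5 else 0

def suggested_default(d):
    """What ssimulacra2 suggests after recalibration: roughly the opposite.

    Between d = 2.5 and d = 3.5 the two options are close; 0 is assumed
    here as the tiebreaker.
    """
    if d < 2.5:
        return 1
    if d > 3.5:
        return 0
    return 0
```

Note how for any distance outside the ambiguous middle band, the two functions return opposite values, which is the "doing the opposite of what it should do" point above.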
2022-10-04 08:06:30
here's an example showing that ssimulacra2 is right when it says progressive dc has too low dc quality: progressive_dc=0: https://jon-cld.s3.amazonaws.com/test_images/049/jxl-nldcpdc0-e6-q52.png progressive_dc=1: https://jon-cld.s3.amazonaws.com/test_images/049/jxl-nldcpdc1-e6-q52.png
2022-10-04 08:08:55
This is what causes that jump from ssimulacra2 60 at q52 to ssimulacra2 56 at q50 (around 0.55 bpp)
2022-10-04 08:10:37
I can make the top half of the line continue downwards, which is like a 5% improvement for d4.5+ if ssimulacra2 is to be believed
Pashi
2022-10-04 09:26:38
Moar quality
Jyrki Alakuijala
2022-10-05 08:26:27
we don't really need the progressive dc -- the usual non-progressive dc is also 8x8 progressive 🙂
2022-10-05 08:26:49
we could push that down or disable it altogether
2022-10-05 08:27:05
I agree that progressive_dc=0 looks quite a bit better
2022-10-05 08:27:34
also it should be faster to decode -- some more memory locality
_wb_
2022-10-05 09:11:13
I agree that progressive dc is not really needed -- it could in principle be used to e.g. get a 16x16 preview on the screen earlier (or even a 64x64 blurhash type thing), but for web images the main benefit comes from the 8x8 and anything else (more progressive dc or ac) is relatively speaking a detail.
2022-10-05 09:14:20
Ssimulacra2 suggests in general to allocate still some more bits to DC, especially in that d>4.5 region where we currently effectively have a DC quality drop, not so much caused by using progressive DC but mostly by selecting a too low quality for the progressive DC -- if you force progressive DC, the curve looks like a continuation of the bottom line, i.e. progressive DC is just allocating less bits to DC and paying a price in quality that is not worth the saving in bpp.
2022-10-05 09:17:18
I'll play with the DC/AC balance a bit to get optimal curves for both progressive dc and non-progressive dc, and then I'll see where the Pareto front is so we can set the default to that -- taking into account though that progressive dc comes at a small price in encode and decode speed, so if the difference is small, non-progressive dc is a better default.
Jyrki Alakuijala
2022-10-05 09:52:57
I'm ok with putting some more bits into DC at > d4.5
2022-10-05 09:53:18
or starting from d2.9 or so
2022-10-05 09:54:06
it is likely better to do it with manual viewing rather than ssimulacra2
2022-10-05 10:08:11
In the past my dc quantization rule was IIRC linear allocation below d2.9 and pow(d, 0.55) for above
2022-10-05 10:09:13
I think we relaxed this in a recent change (to fix a non-linearity bug in dc allocation) and the power changed to something around 0.65-0.7 or so, naturally allocating less dc bits at lower quality
2022-10-05 10:10:10
it is 3-4 years since I tuned it last, so my memory is not so precise on it %-)
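The rule Jyrki recalls might look something like this (my sketch, not the actual libjxl code; the continuity scaling at d = 2.9 is an assumption):

```python
# Sketch of the recalled DC allocation rule: linear below d = 2.9 and a
# power law above. Continuity at d = 2.9 is my assumption; the actual
# constants in libjxl may differ.

def dc_distance(d, power=0.55):
    """Effective distance used for DC bit allocation."""
    if d <= 2.9:
        return d
    return 2.9 * (d / 2.9) ** power
```

With power=0.55, dc_distance(5.8) is about 4.25 rather than 5.8, so DC degrades more slowly than AC as distance grows; raising the power toward 0.65-0.7 (as after the later change) naturally allocates fewer DC bits at low quality.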
_wb_
2022-10-05 11:36:08
I am doing a mix of manual viewing, ssimulacra2, butteraugli 3norm, and reviewing cases where the metrics disagree. For the lower distances it is challenging to compare things since we are close to visually lossless, and zooming in doesn't really help much to find subtle dc-related issues. I am considering comparing with artificially boosted brightness and/or contrast, but that kind of defeats the purpose of manual viewing...
Jyrki Alakuijala
2022-10-05 11:53:43
just critical manual viewing -- if it doesn't help clearly, then go with the metrics 🙂
_wb_
2022-10-05 04:22:22
Putting some more bits into DC and a bit less in AC makes things a bit worse according to Butteraugli 3-norm and a lot better according to ssimulacra2. Probably for the same reason, Butteraugli 3-norm says progressive dc (in the current version) is a good idea compared to non-progressive dc (since it 'wastes' less bits on dc), while ssimulacra2 says non-progressive dc is better (which currently puts more bits in dc).
2022-10-05 04:25:16
Basically it looks to me (after looking at a lot of images) like Butteraugli doesn't care very much about dc (until the point where it causes local artifacts, then of course it does care), while ssimulacra2 cares a lot about it, and is very sensitive to very subtle banding.
2022-10-05 04:29:32
I suspect that ssimulacra2 rewards high-precision DC a bit too much (even when the additional precision doesn't make any visible difference anymore), while butteraugli doesn't seem to notice subtle very low frequency stuff.
2022-10-05 04:30:17
The 'truth' is likely somewhere in between what those two metrics say.
2022-10-05 08:37:00
so this is what butteraugli says
2022-10-05 08:37:30
and this is what ssimulacra2 says:
2022-10-05 08:45:08
or, to show the difference in a more dramatic way:
2022-10-05 08:45:26
butteraugli:
2022-10-05 08:45:52
ssimulacra2:
2022-10-05 08:46:44
adc is adjusted ac/dc balance, adc0 is progressive_dc=0, adc1 is progressive_dc=1
2022-10-05 08:49:25
the red line is libjxl 0.7, more or less
2022-10-06 07:03:21
basically butteraugli says the adjusted balance (adc) makes things worse (especially at low quality) while ssimulacra2 says it makes things better
2022-10-06 07:06:31
e.g. orig: https://jon-cld.s3.amazonaws.com/test_images/reference/032.png current libjxl: https://jon-cld.s3.amazonaws.com/test_images/032/jxl-nldc-e6-q18.png (ssimulacra2: 55.2, BA 3-norm: 1.73) adjusted balance: https://jon-cld.s3.amazonaws.com/test_images/032/jxl-adc0-e6-q20.png (ssimulacra2: 61.3, BA 3-norm: 1.78) On this particular metric disagreement I have to agree with ssimulacra2. I'm looking at more of these disagreements now to see which metric to believe on this topic.
Pashi
2022-10-06 07:14:55
The difference is not dramatic, but it looks like the adjusted one loses slightly more texture compared to the original. In other words the adjusted version is less faithful to the original and lower fidelity. In particular, look at how the background changes: the faint horizontal stripes get fainter.
2022-10-06 07:16:12
That's the most noticeable difference but also the texture of the feathers is further lost in the adjusted version.
_wb_
2022-10-06 07:26:23
To me, the non-adjusted one has visible dc blocks in the faint background stripes and more color bleeding around the edges of the bird. The adjusted one looks slightly better for that reason imo (though of course both are very low quality).
2022-10-06 10:08:16
OK, done manually reviewing all the (bigger) disagreements between ssimulacra2 and butteraugli. Basically what it boils down to is that ssimulacra2 penalizes slight color shifts, color bleeding (caused by dc-level chroma inaccuracy), and blockiness more than butteraugli, and it prefers a slightly smoother image over one that has such artifacts. Butteraugli penalizes loss of high freq detail more, and prefers an image with some color bleeding and blockiness over one that is a bit smoother. It is a bit of a matter of viewing distance (or equivalently, of pixel density) like <@532010383041363969> said, and probably also a matter of personal preference which metric is right. I personally tend to agree with ssimulacra2 in most cases where there's a disagreement, but the disagreements are clearly very much about the balance between dc and ac and what kind of artifacts are most problematic. I think it's wise and more future-proof to penalize dc-level artifacts strongly like ssimulacra2 does, considering higher pixel densities and also HDR and in general screens with better contrast and color reproduction, which I think will tend to make it more important to get the dc right in the future.
Pashi
2022-10-06 04:14:09
I think that these things should be tuned to what humans actually notice and see, and if two algorithms disagree over something that is ultimately hard for humans to agree on, then it's probably not something that matters.
2022-10-06 04:15:21
Anything that's easy for humans to notice and agree on is the low hanging fruit
2022-10-06 04:20:57
Now that you pointed out the artifacts in the non-adjusted one I'm starting to like the adjusted one a little better. But still, there's hardly enough of a difference to warrant worrying about which one an algorithm prefers, if it's hard for a human to choose a preference either.
2022-10-06 04:21:47
Only cases where it's clear which one a human would prefer should matter when judging an algorithm
Jyrki Alakuijala
2022-10-06 06:45:39
feel free to increase dc sensitivity in butteraugli a bit
2022-10-06 06:46:59
three values from here: https://github.com/libjxl/libjxl/blob/main/lib/jxl/butteraugli/butteraugli.cc#L325
2022-10-06 06:47:13
multiply them by say 1.05 each
2022-10-06 06:49:38
https://github.com/libjxl/libjxl/blob/main/lib/jxl/butteraugli/butteraugli.cc#L1081 17.8 here is a normalization number that may need to be adjusted a bit to keep 1.0 in the same place
Traneptora
2022-10-10 06:49:00
Do JXLP boxes have to be consecutive, or can there be boxes between them
yurume
2022-10-10 06:49:38
AFAIK the latter; metadata can be there
Traneptora
2022-10-10 06:49:52
ah, hm.
2022-10-10 06:50:51
Does a JXLP box or JXLC box need to be the final box?
_wb_
2022-10-10 06:51:44
Either it's one jxlc box and that's it, or it's one or more jxlp boxes
Traneptora
2022-10-10 06:52:03
Yes but does the file have to end with it
_wb_
2022-10-10 06:52:18
Ah, no
2022-10-10 06:52:30
You can put metadata or whatever at the end
Traneptora
2022-10-10 06:52:42
hm
2022-10-10 06:53:26
I'm thinking of adding a BSF to ffmpeg that canonicalizes a jxl file
_wb_
2022-10-10 06:54:19
I am not sure what order makes most sense if you want a single canonical form
Traneptora
2022-10-10 06:54:38
wraps a codestream in a container if raw, otherwise concatenates all jxlp/jxlc boxes into a final jxlc box, with nonzero size, and forwards other metadata as-is
_wb_
2022-10-10 06:55:07
Sounds reasonable
2022-10-10 06:56:06
All metadata first is maybe not ideal for progressive, but then again, if you want to optimize for delivery you better strip all metadata (and I guess the container too)
Traneptora
2022-10-10 06:57:28
question is where do intervening boxes go, ie non-jxlc/jxlp boxes, that can't be stripped
2022-10-10 06:57:44
the ones that go after the end
2022-10-10 06:57:49
or in the middle
_wb_
2022-10-10 06:58:04
Just put them all in the beginning
Traneptora
2022-10-10 06:58:06
can all of them be safely moved before the codestream
_wb_
2022-10-10 06:58:42
Boxes are supposed to be moveable as long as you don't change the relative order of boxes with the same name
2022-10-10 06:59:23
I dunno if that's an official rule of ISOBMFF or if I just made that up, but it's what I had in mind anyway
Traneptora
2022-10-10 06:59:39
ah so if I recombine all jxlp to a final jxlc and leave other boxes untouched in that order, it will be valid
_wb_
2022-10-10 07:00:12
Yeah, it should be fine to do that
Traneptora
2022-10-10 07:00:25
the purpose of this bitstream filter btw is to make parsing and packetizing easy, not to optimize progressive delivery
2022-10-10 07:01:19
so, say, it would be possible to mux into NUT easily
_wb_
2022-10-10 07:02:11
Even changing order might be ok, except for jbrd which theoretically could be used for mjpeg-like jpeg frames but then the order needs to match the frame order (not that we have implemented multi-frame jpeg reconstruction yet, but it's a theoretical possibility)
Traneptora
2022-10-10 07:02:42
I have no need to reorder the boxes I can pass through
_wb_
2022-10-10 07:03:29
And then there could in principle also be multiple Exif boxes and they would also need to match the order in which they are needed for jpeg reconstruction
Traneptora
2022-10-10 07:04:28
but again, as long as I don't reorder boxes of the same tag relative to each other, and only move the codestream to the end inside a single jxlc, is that fine?
2022-10-10 07:04:54
cause that's the plan
_wb_
2022-10-10 07:38:35
yes, that's fine
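The agreed-upon transformation could be sketched roughly like this (a hypothetical sketch, not the actual ffmpeg BSF; boxes are modeled as (name, payload) pairs and real container parsing is omitted; the 4-byte part index at the start of each jxlp payload follows the JPEG XL container format):

```python
# Hypothetical sketch of the canonicalization discussed above: merge all
# jxlp/jxlc payloads into one final jxlc box, keeping every other box
# untouched and in its original relative order (so boxes with the same
# name are never reordered relative to each other).

def canonicalize(boxes):
    """boxes: list of (name, payload) tuples in file order."""
    codestream = b""
    others = []
    for name, payload in boxes:
        if name == "jxlp":
            codestream += payload[4:]  # strip the 4-byte part index
        elif name == "jxlc":
            codestream += payload
        else:
            others.append((name, payload))
    return others + [("jxlc", codestream)]
```

This assumes the jxlp boxes already appear in part-index order, which the container requires anyway.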
2022-10-10 07:39:11
maybe we should put a clarifying note about that in the spec and/or in the libjxl documentation somewhere
Traneptora
2022-10-11 09:12:30
ugh, this should not happen
2022-10-11 09:12:33
> Could NOT find HWY (missing: HWY_LIBRARY) (found suitable version "1.0.1", minimum required is "0.15.0")
2022-10-11 09:12:41
is this a cmake bug or is there something we can do to fix that?
Fallen
2022-10-14 08:18:53
How does JXL compare to mozjpeg at 50% quality in file size? I can't seem to encode it right on my end I always end up with similar file sizes & quality. Are there any recommended ways to encode images to JXL to test the quality and space savings?
_wb_
2022-10-14 08:29:37
Start from a png original and encode it with both cjxl and mozjpeg. If you have time, it's best to use your eyes to find a q setting for each that produces an image with the fidelity you want. Then compare filesizes.
Jyrki Alakuijala
Fallen How does JXL compare to mozjpeg at 50% quality in file size? I can't seem to encode it right on my end I always end up with similar file sizes & quality. Are there any recommended ways to encode images to JXL to test the quality and space savings?
2022-10-17 03:49:20
while in some encoders quality is controlled by a value that goes from 0 to 100, it is not a fraction or denoted by % -- also the commonly used range is 75 to 100 (and the useful range 75-97)
2022-10-17 03:51:22
mozjpeg is at a huge disadvantage against modern codecs such as JPEG XL when overly low qualities are used (like quality 50); better to test at qualities around 85
Fallen
2022-10-17 05:04:10
Thank you, I greatly appreciate the response. I'm trying to compress photos below a certain file size to see how each format compares. Trying to get 8MP images to 200KB each. I find even at 50% there isn't much quality loss for either yet the file sizes are similar.
2022-10-17 05:04:48
Below 50% is when things fall apart and avif fares better, but the encoding time for avif is insane.
_wb_
2022-10-17 05:28:09
Quality is indeed not a percentage. 8 megapixels in 200 kilobytes is 0.2 bits per pixel or 120:1 compression if the source was 8-bit RGB. That's low quality, but of course a lot depends on how good your original images are.
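The arithmetic above, spelled out (illustrative only):

```python
# 8 megapixels in 200 kilobytes: bits per pixel, and compression ratio
# relative to uncompressed 8-bit RGB.

def bits_per_pixel(file_bytes, pixels):
    return file_bytes * 8 / pixels

def compression_ratio(pixels, file_bytes, channels=3, bits_per_sample=8):
    raw_bytes = pixels * channels * bits_per_sample / 8
    return raw_bytes / file_bytes

bpp = bits_per_pixel(200_000, 8_000_000)       # 0.2 bpp
ratio = compression_ratio(8_000_000, 200_000)  # 120:1
```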
2022-10-17 05:29:05
Are you comparing at 1:1 zoom or are you comparing images zoomed out?
Moritz Firsching
Traneptora > Could NOT find HWY (missing: HWY_LIBRARY) (found suitable version "1.0.1", minimum required is "0.15.0")
2022-10-17 05:43:51
This should indeed not happen; could you describe it in more detail, and perhaps open an issue on GitHub about it?
Fallen
_wb_ Are you comparing at 1:1 zoom or are you comparing images zoomed out?
2022-10-17 06:39:24
100% zoom
Jyrki Alakuijala
2022-10-17 12:58:53
if normal photographs are to be compressed to 0.2 bpp, it may be more useful to keep them 2x subsampled, i.e., 4x fewer pixels, and then compress the resulting image to 0.8 bpp
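In numbers (illustrative only):

```python
# Downscaling 2x per dimension leaves 4x fewer pixels, so the same byte
# budget quadruples the bits available per remaining pixel.

pixels = 8_000_000
budget_bytes = 200_000
full_res_bpp = budget_bytes * 8 / pixels           # 0.2 bpp
subsampled_bpp = budget_bytes * 8 / (pixels // 4)  # 0.8 bpp
```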
2022-10-17 12:59:58
if it contains miniature text -- for example a photo of a sign carrying important semantic information such as a street name -- then it may be better to compress directly at 0.2 bpp, particularly if subsampling would eat that info
2022-10-17 01:00:17
for screen captures the situation gets more complicated
_wb_
2022-10-17 01:01:18
it depends on how much entropy there is in the image, if it's not a very high entropy image then 0.2 bpp might still be reasonable
2022-10-17 01:03:59
Read 5322x3548 image, 6635753 bytes, 301.8 MP/s Encoding [Container | VarDCT, d2.000, effort: 7 | 4343-byte Exif], Compressed to 517676 bytes including container (0.219 bpp).
2022-10-17 01:05:05
that's a random professional photo of my kid, with a studio background so basically over half the pixel area has almost zero entropy
2022-10-17 01:06:10
if the image is large enough and there are enough easy parts in it, then 0.2 bpp is not necessarily very low quality
2022-10-17 01:07:08
it's different for web images since those are typically a lot smaller (< 1 megapixel usually) so the entropy per pixel tends to be higher
2022-10-17 01:10:06
e.g. that exact same image, downscaled to a more typical dimension for a web page: Read 500x333 image, 499515 bytes, 1980.6 MP/s Encoding [VarDCT, d2.000, effort: 7], Compressed to 9835 bytes (0.473 bpp).
2022-10-17 01:10:41
same distance target, but the large image is 0.22 bpp while the small one is 0.47 bpp
2022-10-17 01:13:06
(this is part of what makes it confusing when benchmarking is done based on bpp: for a small image, 0.2 bpp is very low quality (I have to go to d7 on that image to get there, and that's an image with an easy background), while for a large enough and simple enough image, it may be OK)
fab
2022-10-17 01:45:06
Jon
2022-10-17 01:45:16
I edited all the JPEG XL Wikipedia articles
2022-10-17 01:45:33
En, es and it
2022-10-17 01:45:47
I made them all uniform
2022-10-17 01:47:09
The format has a variety of coding modes
2022-10-17 01:48:09
2022-10-17 01:48:47
That was original JPEG XL text which wasn't adopted in the English Wikipedia
2022-10-17 01:49:02
So I deleted it in the Italian and Spanish ones
2022-10-17 01:49:19
Spanish, I think, didn't have a JPEG XL article
_wb_
2022-10-17 01:50:18
I don't speak italian nor spanish so I can't check those, but I'll do a quick check of the english wikipedia article to see if there are things that can be improved
fab
2022-10-17 01:50:41
Also in the Italian one I copied the same links as features
2022-10-17 01:50:54
They are professional in English, trust me
2022-10-17 01:51:10
In the Italian one it was Chrome Unboxed
2022-10-17 01:51:32
Which is quite outdated and poor as a source
2022-10-17 01:52:41
2022-10-17 01:52:54
In reality the Italian one has a lot more
2022-10-17 01:52:59
But with no sources
2022-10-17 01:53:16
What can be improved is to do the same as the Spanish one
2022-10-17 01:54:49
2022-10-17 01:55:11
It looks so much better visually, at least to me
2022-10-17 01:56:49
The best looking is the Italian one, my native language; naturally I can write much better in my native language
2022-10-17 01:59:01
I fixed my own mistakes in English
2022-10-17 01:59:15
In the en jxl Wikipedia there were a lot of translation errors
2022-10-17 02:21:10
https://en.m.wikipedia.org/wiki/User_talk:Veikk0.ma#JPEG_XL
2022-10-17 02:21:35
He probably knows more about SVT than JPEG XL
2022-10-17 02:24:06
2022-10-17 02:24:47
I don't understand if auxiliary data and DC coefficients are the same thing or if it's a list.
2022-10-17 02:25:01
It's too generic in this aspect
2022-10-17 02:25:15
It's ok for a slideshow
2022-10-17 02:25:27
But not for a Wikipedia article
2022-10-17 02:31:19
Another dubious thing
2022-10-17 02:31:55
"More efficient recompression lossless transport options"
2022-10-17 02:32:10
The original doc doesn't phrase it like that
2022-10-17 02:32:17
I inverted words
2022-10-17 02:32:29
To make it sound better
2022-10-17 02:32:34
But it's wrong
2022-10-17 02:33:56
Then in the Italian article I state that preset 7 is used by JPEG members to tune an encoder's butteraugli
2022-10-17 02:34:14
Is this true, and is it grammatically correct?
2022-10-17 02:34:37
Is there a page I should link for the members of JPEG?
2022-10-17 02:36:35
I'm opening Chrome
2022-10-17 02:38:15
https://it.m.wikipedia.org/wiki/JPEG_XL#Caratteristiche_encoder
2022-10-17 02:38:55
Where the libjxl encoder calculates the picture
2022-10-17 02:39:10
That will probably be removed someday
2022-10-17 02:40:50
I removed the Squeeze modular parts; it was nonsense
fab https://it.m.wikipedia.org/wiki/JPEG_XL#Caratteristiche_encoder
2022-10-17 02:44:43
This link includes two sources
Traneptora
Moritz Firsching This should indeed not happen; could you describe more details, perhaps open an issue on github about it?!
2022-10-17 03:31:48
I can't reproduce this anymore. I'm guessing it was either a bug in cmake or something
fab
2022-10-17 04:48:18
What's the difference between auxiliary data and DC coefficients?
2022-10-17 04:48:34
What is so unique about DC coefficients?
2022-10-17 04:48:59
I know that they are a way of constructing the image, e.g. for progressive decoding
2022-10-17 04:49:19
But users on Wikipedia read a list and they mix things up
2022-10-17 04:49:32
Especially if they are written as lists
2022-10-17 04:55:24
Is there a page about DC coefficients on English Wikipedia?
2022-10-17 05:08:55
2022-10-17 05:09:28
Basically that's the only useful edit I did
2022-10-17 05:16:22
I also added the qita link and the Chrome Unboxed substitution on the Spanish and Italian ones
2022-10-17 05:17:07
On the Italian and Spanish ones, that it allows bigger image dimensions
2022-10-17 05:18:14
That has a link by Cloudinary with publisher Jon Sneyers, added from the English one
2022-10-17 05:18:58
I did nothing after that.
2022-10-18 06:19:11
The problems to fix are very small and there are three of them. I will make a Google Doc today
2022-10-18 06:48:00
2022-10-18 06:48:25
Stats of the interest in jxl and AOMedia Av1
2022-10-18 06:51:39
Usa 53 in avif
2022-10-18 06:51:49
It wants Apple
Moritz Firsching
Traneptora I can't reproduce this anymore. I'm guessing it was either a bug in cmake or something
2022-10-18 08:14:58
thanks for reporting nonetheless, this way when other people hit the same issue for some reason it might be easier to investigate...
fab
2022-10-18 10:50:26
https://en.m.wikipedia.org/wiki/Special:MobileDiff/1116769429
2022-10-18 10:50:42
Why remove how LZ77 works in jxl?
2022-10-18 10:50:56
It's interesting; some things should be said
2022-10-18 10:51:12
At least you could cite a book
2022-10-18 10:51:25
What books are there on JPEG XL?
2022-10-18 10:52:17
static context trees instead of adaptive for faster decoding (for speed), with Alexander Ratushnyak's gradient predictor
2022-10-18 10:52:27
Most people don't know it
2022-10-18 10:52:43
It's nice to know more details about the entropy coding
2022-10-18 10:53:07
You could say "then read a book, not Wikipedia"
2022-10-18 10:53:29
But I'm not in favour of deleting those things
2022-10-18 10:53:46
You obviously did it for your copyright
_wb_
2022-10-18 10:57:29
they can be brought back but it needs to be written in a wikipedia style, not discord style
2022-10-18 10:59:15
I think it's a bit too specific though, for a general introduction on jxl; there are more important things than this
2022-10-18 10:59:52
(lz77 is barely used in the most common encode settings)
fab
2022-10-18 11:01:17
https://it.m.wikipedia.org/wiki/JPEG_XL#Caratteristiche_encoder
2022-10-18 11:01:28
This includes an encode.su
2022-10-18 11:01:35
Jyrki Alakuijala
2022-10-18 11:01:41
post
2022-10-18 11:01:55
Can you make it available in English?
2022-10-18 11:02:09
Because in Italian it's full of translation errors
2022-10-18 11:02:28
It's literally not understandable even with machine translation
2022-10-18 11:03:12
The encode.su link is a good post
2022-10-18 11:05:35
It's note 25
2022-10-18 11:05:46
On the Italian Wikipedia
2022-10-18 11:06:02
Though I'd say to keep only the link
2022-10-18 11:06:10
And at most add the author's name
2022-10-18 11:06:27
And not exaggerate with text
fab It's note 25
2022-10-18 11:10:19
The Italian editors' opinion on that note is that it creates confusion
2022-10-18 11:10:37
And if I add your text, probably more
2022-10-18 11:11:12
But Jyrki Alakuijala is a trustworthy editor
2022-10-18 11:11:34
And it's a historic post about JPEG XL
2022-10-18 01:48:21
News
2022-10-18 01:48:36
11 commits on the English JPEG XL Wikipedia
2022-10-18 01:48:45
I fixed everything
2022-10-18 01:49:06
Read it now
2022-10-18 01:55:51
It still needs a link before talking about the speed of libjxl
2022-10-18 01:56:22
Or someone will cut that part
2022-10-18 01:56:38
I think Jon can add it
2022-10-18 01:57:06
2022-10-18 01:58:45
It needs another link
2022-10-18 01:59:27
The one that talks about libjxl speed vs HEIC; it has been on the JPEG.org site for 3 years
2022-10-19 10:40:28
Wb
2022-10-19 10:40:47
I re-added your deleted text
2022-10-19 10:41:36
https://en.m.wikipedia.org/wiki/Special:MobileDiff/1116769429
2022-10-19 10:42:12
Plus I copied this
2022-10-19 10:42:15
https://discord.com/channels/794206087879852103/803574970180829194/1032224936024088596
2022-10-19 10:43:50
It was difficult; I hope you appreciate it
2022-10-19 10:44:33
It's only missing sources
2022-10-19 10:44:47
For some claims, such as the speed comparison with HEIC
2022-10-19 10:44:57
But it's good
2022-10-19 10:45:38
31,080 bytes long
2022-10-19 10:45:55
2022-10-19 10:48:04
The format and the encoder aren't the same thing, so don't touch anything
_wb_
2022-10-19 02:50:51
This is an old comment by <@268284145820631040> (sorry, I'm still processing old remarks): > The Frame bundle in F.1 (Frame header: General) lists each group in the order, but if permuted flag from F.3.1 (TOC: General) is set this order might change. Therefore it should be indicated that this bundle is after undoing the permutation. > > There is another concern with an unconventional TOC. Namely, what happens when there are gaps between consecutive sections? If we take the specification literally, each section should be always read from the computed bitstream position and not from the end of the previous section (in the permuted order) so gaps between sections should be okay. It might be even possible that sections can overlap. But this sounds very ridiculous. > > If we are to avoid such a situation, an option is to declare that they can be only permuted. TOC would establish mutually distinct bitstream offset ranges for each section (not only the starting offset!) and a section should be fully decoded at the exact ending offset. Or we can leave it as is, only explicitly noting that sections can be possibly overlapped or have gaps between them.
2022-10-19 02:51:24
Gaps between sections (i.e. a section length being declared to be longer than it really is) are allowed, and this could theoretically be useful if you would e.g. make an encoder that uses a fixed (upper bound for the) compressed size per group or something like that (so it can write the TOC immediately). So I think we should keep allowing that.
2022-10-19 02:54:12
Overlapping sections doesn't seem to be something that is useful, except perhaps in pathological cases like a very repetitive image that happens to have the repetition aligned with the 256x256 grid. I'm not sure what exactly libjxl does, but I don't think it would allow that though. I think it is interpreting the TOC not just as (delta-coded) offsets, but as actual section lengths, i.e. reading past the length is an error. I think we need to make that behavior explicit in the spec, i.e. not allow reading past the end of a group. This is useful because it simplifies incremental/progressive decode: you know from the TOC when a group is available and can be fully decoded. If overlapping is allowed, you have to wait until end-of-stream until you can be sure that any group can be decoded.
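The sequential-reading interpretation can be sketched like this (my illustration of the TOC semantics discussed above, not libjxl code): if each TOC entry is treated as a section length, offsets are cumulative sums, so sections may have trailing gaps but can never overlap, and the next frame begins at the sum of all sizes.

```python
# Hypothetical sketch: TOC entries interpreted as section lengths.

def section_ranges(toc_sizes, base=0):
    """Return (start, end) byte ranges for each section, in TOC order."""
    ranges = []
    offset = base
    for size in toc_sizes:
        ranges.append((offset, offset + size))
        offset += size
    return ranges

def frame_end(toc_sizes, base=0):
    """Skipping a frame is just adding up all the TOC sizes."""
    return base + sum(toc_sizes)
```

A progressive decoder then knows a group is fully decodable as soon as the bytes of its range have arrived, without waiting for end-of-stream.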
2022-10-19 02:55:27
(also it would become impossible to skip frames by just adding all the sizes from the TOC, and it would in fact not really be clear where the next frame begins)
yurume
_wb_ (also it would become impossible to skip frames by just adding all the sizes from the TOC, and it would in fact not really be clear where the next frame begins)
2022-10-19 08:00:20
this was exactly my concern. gaps, as long as made explicitly allowed, are fine, but overlaps make everything too complicated.
Traneptora
2022-10-20 03:12:13
<@416586441058025472> can you stop vandalizing the wikipedia page
2022-10-20 03:12:33
you just take copy-pastes from discord without citations and put them verbatim, including in the first-person voice, on the english wikipedia page
BlueSwordM
2022-10-20 07:07:57
<@794205442175402004> Interesting question about some psycho-visual optimizations: reading some papers and doing some related research in video coding, I just realized that correlating luma level <> brightness isn't actually a very good thing to do within standard YCbCr, for an interesting reason: the normal extracted Y luma plane isn't actually pure brightness/contrast data (i.e., grayscale), but is actually green. For that kind of stuff, it is better to use the grayscale version Y'. I'm guessing this is different since libjxl internally works with XYB instead of YCbCr, correct? That means brightness calculations are done in the X plane directly, without any conversion to grayscale, correct? If I am wrong, please correct me. I still have a decent amount of trouble understanding colorspace magic.
_wb_
2022-10-20 07:17:14
The Y of XYB certainly is not the same Y as the Y of YCbCr
2022-10-20 07:17:46
But both are not just green
BlueSwordM
_wb_ But both are not just green
2022-10-20 07:26:08
I know that, but it seems like in most media, Y is shown to store luminance + green, while Y' is shown to be closer to grayscale. All of it comes from whether or not gamma correction is applied, which I didn't get at first until I started reading a bit better in the literature.
2022-10-20 07:33:13
What I've come to understand at this point is that if you want to do brightness related adjustment for final intended viewing or for encoder related psy optimizations, it is best to convert the corresponding correlated luma plane to grayscale(or an equivalent brightness plane) while keeping perceived brightness level of each pixel in the converted image to be the same as the original one.
_wb_
2022-10-20 07:42:08
both the Y of YCbCr and the Y of XYB are gamma-compressed: in case of YCbCr it's basically just the same transfer curve as whatever the corresponding RGB space is using (so typically something like ^2.2 for SDR, and for HDR something more logarithmiccy), in case of XYB it's a cubic+bias transfer curve.
2022-10-20 07:43:38
most image processing is typically best done in linear space, since that's the only space where pixel value arithmetic (like averaging two pixels) produces correct results
2022-10-20 07:44:46
but lots of viewers and tools (including e.g. browsers) do stuff in nonlinear space, since it's of course more convenient to just use the values as is than it is to first convert to linear and then convert back to gamma compressed
2022-10-20 07:46:00
this means that if you take a large image with high frequency detail and use browser downscaling, it will look darker than it should
BlueSwordM
_wb_ this means that if you take a large image with high frequency detail and use browser downscaling, it will look darker than it should
2022-10-20 07:46:14
Oh, I never realized this!
2022-10-20 07:46:25
That explains a lot of stuff actually.
_wb_
2022-10-20 07:49:18
the average of 0 and 1 is always 0.5, but 0.5 in sRGB maps to something like 0.25 in linear light since it (of course for good reasons) assigns more range to the darks than to the brights
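The arithmetic _wb_ describes can be checked with the standard piecewise sRGB transfer functions (a minimal Python sketch; the function names are mine, the formulas are the standard sRGB curves):

```python
def srgb_to_linear(v: float) -> float:
    # Standard sRGB decoding: linear segment near black, power curve elsewhere
    return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4

def linear_to_srgb(v: float) -> float:
    # Standard sRGB encoding (inverse of the above)
    return v * 12.92 if v <= 0.0031308 else 1.055 * v ** (1 / 2.4) - 0.055

naive = (0.0 + 1.0) / 2                                # averaging in sRGB: 0.5
lin = (srgb_to_linear(0.0) + srgb_to_linear(1.0)) / 2  # averaging in linear light: 0.5
correct = linear_to_srgb(lin)                          # re-encoded to sRGB: ~0.735

print(srgb_to_linear(0.5))  # ~0.214: what a naive 0.5 actually means in linear light
print(correct)              # ~0.735: what the average of black and white should display as
```

So a naive averager produces 0.5 where a correct one would produce about 0.735, which is why wrongly downscaled high-frequency detail looks too dark.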
2022-10-20 07:50:28
http://www.ericbrasseur.org/gamma.html here's some more reading stuff about this
2022-10-20 07:50:42
in particular this is a funny example: http://www.ericbrasseur.org/gamma-1.0-or-2.2.png
2022-10-20 07:52:16
just downscale it to 1:2 (in physical pixels! on a dpr2 device like a macbook that means you have to go to "25%") to see whether the downscaling is done correctly or not
2022-10-20 07:53:16
if you squint your eyes or walk away far enough from the screen, you should see it says "RULES" -- so that's what a correct downscaler should turn it into
2022-10-20 07:57:07
Left: Apple's Preview, gets it right. Right: Chrome, gets it wrong.
Traneptora
2022-10-20 02:40:02
libplacebo/mpv gets it right, however
2022-10-20 02:40:20
2022-10-20 03:17:07
the key is scaling in linear light
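A linear-light downscaler along the lines Traneptora describes, sketched for a 1-D row of grayscale sRGB values (hypothetical helper names, standard sRGB formulas):

```python
def to_linear(v: float) -> float:
    # Standard sRGB decoding
    return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4

def to_srgb(v: float) -> float:
    # Standard sRGB encoding
    return v * 12.92 if v <= 0.0031308 else 1.055 * v ** (1 / 2.4) - 0.055

def downscale_2x(row):
    # Decode to linear light, average adjacent pairs, re-encode to sRGB.
    lin = [to_linear(v) for v in row]
    return [to_srgb((lin[i] + lin[i + 1]) / 2) for i in range(0, len(lin) - 1, 2)]

stripes = [0.0, 1.0, 0.0, 1.0]  # black/white stripes, like the gamma test image
print(downscale_2x(stripes))    # two values of ~0.735, not 0.5
```

Averaging in the gamma-compressed domain instead would give 0.5 for every output pixel, which is exactly the too-dark result the test image exposes.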
JendaLinda
2022-10-20 05:31:39
Have you tried --resampling=2 in cjxl?
_wb_
2022-10-20 05:34:17
that will probably do it wrong
2022-10-20 05:35:15
not sure though
JendaLinda
2022-10-20 05:38:14
It does
2022-10-20 06:02:41
Although simple averaging is technically incorrect, I'd say most people are used to it so they don't see anything wrong about it. After all, simple resampling algorithms are used all over the place.
_wb_
2022-10-20 06:55:37
Incorrect downscaling also has the big practical advantage that for black text on a white background, it basically makes the text bolder and thus more likely to remain legible after downscaling
2022-10-20 06:56:16
(for white text on a black background though, it has the opposite effect of making the letters thinner than it should)
fab
2022-10-20 07:49:44
this just renders jxl obsolete https://gitlab.com/AOMediaCodec/avm/-/merge_requests/488/diffs
BlueSwordM
fab this just renders jxl obsolete https://gitlab.com/AOMediaCodec/avm/-/merge_requests/488/diffs
2022-10-20 07:54:30
I mean, JPEG-XL was always obsolete for pure video stuff, so I guess yes? <:kekw:808717074305122316>
_wb_
2022-10-20 07:58:18
What does that pull request do?
BlueSwordM
_wb_ What does that pull request do?
2022-10-20 08:00:09
It adds sub block warp MVs to the AVM repo. It's a new coding feature that will likely be present in AV2.
_wb_
2022-10-20 08:14:06
Sub block warp? What kind of warping does that do?
BlueSwordM
_wb_ Sub block warp? What kind of warping does that do?
2022-10-20 08:17:50
Here is a good description of warped motion compensation https://gitlab.com/AOMediaCodec/SVT-AV1/-/blob/master/Docs/Appendix-Local-Warped-Motion.md
2022-10-20 08:18:02
Sub-block likely means at the pixel level within a partition.
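For context, the warped motion compensation described in that document is built on a per-block affine model; a generic affine mapping (not the AVM code, just an illustration) looks like:

```python
def affine_map(a, b, c, d, e, f, x, y):
    # Affine motion model: maps a destination pixel (x, y) to a source
    # sample position. Translation, rotation, scaling and shearing are
    # all special cases of the six parameters.
    return (a * x + b * y + c, d * x + e * y + f)

print(affine_map(1, 0, 0, 0, 1, 0, 3, 4))    # identity: (3, 4)
print(affine_map(1, 0.5, 0, 0, 1, 0, 0, 2))  # horizontal shear: (1.0, 2)
```

In the codec the parameters are derived per block (e.g. from neighbouring motion vectors) rather than signalled as free floats; the snippet only shows the shape of the model.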
_wb_
2022-10-20 08:39:20
So it's shearing, basically? When I saw "warp" I imagined something more like a vortex
diskorduser
fab this just renders jxl obsolete https://gitlab.com/AOMediaCodec/avm/-/merge_requests/488/diffs
2022-10-21 04:53:53
what does it do?
Traneptora
diskorduser what does it do?
2022-10-21 01:44:24
https://discord.com/channels/794206087879852103/794206170445119489/1032745057511538750
diskorduser
2022-10-21 01:49:09
Just asking to know what's on fab's mind. 🤣
Fox Wizard
2022-10-21 03:01:24
Too much <:kekw:808717074305122316>
fab
2022-10-21 04:28:36
<@853026420792360980>
2022-10-21 04:28:50
Sorry if I'm not an editor
2022-10-21 04:28:59
Moderator already left 4 messages
2022-10-21 04:29:08
It didn't happen on Wikipedia Italy
2022-10-21 04:29:23
I did other two commits
2022-10-21 04:29:33
Not gonna change it anymore
2022-10-21 04:29:44
I'll wait for Jon Sneyers and you
Traneptora
2022-10-21 04:44:09
There's nothing wrong with editing wikipedia, just make sure it's up to wikipedia's quality standards
fab
Traneptora There's nothing wrong with editing wikipedia, just make sure it's up to wikipedia's quality standards
2022-10-21 04:52:05
With whose is correct in English?
2022-10-21 04:52:13
Are too a ?
2022-10-21 04:52:28
I didn't study English grammar
Traneptora
2022-10-21 04:52:29
well for one, discord conversations are not considered valid sources of information on wikipedia
2022-10-21 04:52:36
so copying and pasting them is definitely not
2022-10-21 04:52:51
also has potential copyright issues if any of the authors don't agree to have their words be creative commons
fab
2022-10-21 04:52:58
2022-10-21 04:53:12
This is written by me
2022-10-21 04:53:38
But I don't know if there is a more technical way to say it
2022-10-21 04:54:11
VMAF shouldn't be mentioned
Traneptora
2022-10-21 04:54:22
it clearly looks like it's not written by a native speaker and you haven't changed that
fab
2022-10-21 04:54:29
As it has no relation with jxl
Traneptora it clearly looks like it's not written by a native speaker and you haven't changed that
2022-10-21 04:54:41
Yes i know
2022-10-21 04:54:45
Too a
2022-10-21 04:55:12
Focused on a gradient
2022-10-21 04:55:28
First delete mention about VMAF
2022-10-21 04:55:35
Then say without a
Traneptora
2022-10-21 04:55:39
I mean if you can't edit the English Wikipedia so the sentences are English sentences then don't edit the English wikipedia, stick to your native language. Otherwise it's not helpful to readers
fab
2022-10-21 04:55:53
Like focused on gradient preservation
Traneptora
2022-10-21 04:55:59
If you can write English to a high-enough proficiency that you can write encyclopedic articles, then that's fine
2022-10-21 04:56:02
but the issue here is that you can't
fab
2022-10-21 04:56:11
With whose is not English
2022-10-21 04:56:19
VMAF shouldn't be there
2022-10-21 04:56:41
The best thing is to delete it
2022-10-21 04:56:51
And Jon, if he wants, could add it
2022-10-21 04:57:00
Probably you're right
2022-10-21 04:57:38
This isn't technical about butteraugli
2022-10-21 04:58:02
It's merely a justification to add a link to butteraugli
2022-10-21 04:58:15
I can't even explain in Italian what it is
2022-10-21 04:58:28
VMAF is useless to mention
fab Is merely a justification to add a link to butteraugli
2022-10-21 04:58:46
That as hisosyha said is wrong
2022-10-21 04:58:57
And i know this rule on Wikipedia
Traneptora If you can write English to a high-enough proficiency that you can write encyclopedic articles, then that's fine
2022-10-21 04:59:40
I don't even know what proficiency is
2022-10-21 05:02:41
I also didn't know that it should be written in the past tense
2022-10-21 05:02:49
But is in the past
2022-10-21 05:03:04
Never used that word
2022-10-21 05:03:36
I've spoken English since I was a child but have no titles
2022-10-21 05:03:51
Attributes
2022-10-21 05:03:55
I don't know
2022-10-21 05:04:17
Ah qualifications
diskorduser
2022-10-21 05:04:21
<#806898911091753051>
fab
2022-10-21 05:08:04
Though i learned fast
2022-10-21 05:08:23
Since child isn't a thing that i studied at school
2022-10-21 05:09:55
<@853026420792360980> I want to translate in it, should you first fix something in Wikipedia?
fab <@853026420792360980> I want to translate in it, should you first fix something in Wikipedia?
2022-10-21 06:01:15
For now I removed it; it's too boring
Traneptora
2022-10-21 06:14:10
I don't speak Italian
2022-10-21 06:14:18
so I can't translate
fab
2022-10-21 06:36:41
Magic code and mime type aren't the same thing?
Traneptora
2022-10-21 06:38:17
No
2022-10-21 06:38:21
MIME type is `image/jxl`
2022-10-21 06:38:33
magic code is the first several bytes of the file that indicate what type of file it is
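To make the distinction concrete: the MIME type is metadata carried alongside the file (e.g. an HTTP `Content-Type` header), while the magic bytes sit at the start of the file itself. A small sniffing sketch (signatures per the JPEG XL codestream and container definitions; the helper name is mine):

```python
JXL_CODESTREAM = b"\xff\x0a"                       # bare JPEG XL codestream
JXL_CONTAINER = b"\x00\x00\x00\x0cJXL \r\n\x87\n"  # ISOBMFF-style JXL container

def sniff_jxl(data: bytes):
    # Return the MIME type if the leading bytes match a JXL signature.
    if data.startswith(JXL_CONTAINER) or data.startswith(JXL_CODESTREAM):
        return "image/jxl"
    return None

print(sniff_jxl(b"\xff\x0a" + b"\x00" * 16))  # image/jxl
print(sniff_jxl(b"\x89PNG\r\n\x1a\n"))        # None
```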
fab
Traneptora magic code is the first several bytes of the file that indicate what type of file it is
2022-10-21 06:38:52
https://en.m.wikipedia.org/wiki/JPEG_XL
2022-10-21 06:39:04
Do you think it is good
2022-10-21 06:39:12
I deleted all my part
2022-10-21 06:39:24
2022-10-21 06:39:58
I did all on mobile
brooke
2022-10-22 12:31:51
~~just out of curiosity, could i take a .cbz / .cbr archive made of .jpg files and convert them to .jxl?~~
2022-10-22 12:32:02
~~should work as long as the viewer supports .jxl, right?~~
2022-10-22 12:32:50
never mind answered my own question using search
spider-mario
brooke never mind answered my own question using search
2022-10-22 09:19:52
out of curiosity, what answer did you arrive at?
Fraetor
spider-mario out of curiosity, what answer did you arrive at?
2022-10-22 11:43:31
Yes you can. And programs like Tachiyomi read them just fine.
Traneptora
2022-10-22 01:53:28
welp, I'm doing it. I decided I'm going to write a full standalone JXL decoder in pure java
2022-10-22 01:53:48
calling it `jxlatte`
_wb_
Traneptora calling it `jxlatte`
2022-10-22 02:46:23
for real?
Traneptora
2022-10-22 03:00:59
ye
2022-10-22 03:01:11
is that a bad name?
2022-10-22 03:02:07
the hard part might be CMS but I don't think I technically need one
2022-10-22 03:03:49
just bake in the transforms I suppose
2022-10-22 03:05:57
Not entirely sure how I'm going to handle HDR output, since I don't want to parse ICC profiles
Fraetor
Traneptora Not entirely sure how im going to handle HDR output, since I don't want to parse ICC profiles
2022-10-22 03:47:14
Just pass them through and let something else handle it?
_wb_
2022-10-22 04:17:25
Yeah, just output pixels plus icc profile
Traneptora
2022-10-22 09:01:27
this is the source I have atm
2022-10-22 09:01:28
https://github.com/thebombzen/jxlatte
2022-10-22 09:01:35
it'll take some time naturally
2022-10-22 09:01:47
a lot of this first part is implementing code I've already written and debugged in C for FFmpeg
2022-10-22 09:09:29
probably should make my own class JXLImage as BufferedImage doesn't support high bit depths
2022-10-24 04:43:40
and, got the image header parser written, time to port over the ANS decoder
2022-10-24 05:11:15
I feel like this is the third or fourth time I've written a parser for this image header
2022-10-24 05:11:18
gets easier every time <:KEKW:643601031040729099>
_wb_
2022-10-24 12:43:52
97th JPEG meeting is about to start, this is the current state of the 2nd edition of 18181-1 as it is before the meeting. Plan is to submit the committee draft at the 98th meeting in January, so we still have some time to fix spec bugs.
Jyrki Alakuijala
2022-10-24 01:57:09
it is always slightly entertaining that it is literally a meeting of experts
2022-10-24 01:58:41
also amazing that we are able to have four-hour meetings summarising what was discussed in other meetings and make official decisions in a four-hour meeting (instead of, say, reviewing the notes in a GitHub-like system or otherwise offline, like the IETF does)
2022-10-24 01:59:02
it makes the processes quite predictable (but still very very slow)
_wb_
2022-10-24 02:03:48
it's more fun when it's a physical meeting, then besides the bureaucratic official parts you can also have lunch, coffee breaks, or go to a pub with the other experts, which is what I consider to be the most valuable part of these meetings 🙂
Jyrki Alakuijala
2022-10-24 02:06:00
I remain unconvinced... I think I can have quite a bit more fun with a $4000 budget than flying to a week-long meeting. (((I don't like flying/hotels/physical meetings.)))
2022-10-24 02:12:41
we need better media acquisition/compression/representation so that the quality of online matches (and exceeds) the physical meetings 🙂
2022-10-24 02:28:49
we need it to be like a BBC rain forest nature documentary -- not more realism 🙂
Traneptora
2022-10-24 02:37:14
Oh good an updated draft
2022-10-24 02:37:19
just in time
_wb_
2022-10-24 02:41:15
all spec bugs found (and reported) before the end of the year can still be fixed in the next updated draft; in January we will submit the CD so at that point it will go in the ISO paywall system again
2022-10-24 02:42:24
(so I really want to have all bugs found before that, because bugs fixed only in the paywall version but not in the drafts that get circulated are in practice not really fixed)
fab
2022-10-24 04:46:26
I finished translating the JPEG XL page into Italian and Spanish
2022-10-24 04:48:14
190n
2022-10-24 11:57:18
<@&807636211489177661>
2022-10-24 11:57:24
<:BanHammer:805396864639565834>
w
2022-10-25 02:15:59
ban matrix
190n
2022-10-25 02:16:02
well delete it at least
yurume
2022-10-25 03:42:55
while I haven't used Matrix, it is a surprise to me that Matrix actually has *fewer* moderation tools than IRC. really?
2022-10-25 03:43:14
if that's true something is very wrong with Matrix I guess...
2022-10-25 03:52:24
*facepalm and deep exhalation*
w
2022-10-25 04:18:14
what is the point of matrix
OkyDooky
2022-10-25 08:21:48
There is definitely a way to ban a user, but it has to be done on the Matrix side (<@456226577798135808>)
2022-10-25 08:48:41
👋
fab
2022-10-25 08:50:30
I'm at note 29 on Italian Page JPEG XL
2022-10-25 08:51:01
I have not focused too much on quality
2022-10-25 08:51:11
The French did it better in a day
2022-10-25 08:51:26
And without spamming commits
The_Decryptor
2022-10-25 08:59:53
Yeah, the equivalent for a discord server on the matrix side is a "space" with child rooms
_wb_
w what is the point of matrix
2022-10-25 09:37:49
I suppose it's good to have an open alternative to the proprietary thing that discord is -- I think it helps to keep discord from going the Slack way and basically starting to charge money for basic functionality like not deleting messages that are 90+ days old.
2022-10-25 09:41:18
So far, discord only lets you pay for "fun extras" (which you don't really need) and for stuff that actually costs them money (like uploading large files). It does not have ads or "sponsored messages" and it does a reasonable job at keeping spammers away. As long as it's like that, I consider it OK enough.
w
2022-10-25 09:41:28
should just use irc and mumble
_wb_
2022-10-25 09:43:30
is there a good way to get irc with messages archived/searchable and a reasonably simple to use user interface that works on desktop and mobile?
yurume
2022-10-25 09:48:46
there were many third-party services when IRC was still in a good position
2022-10-25 09:49:07
now they are mostly gone, so IRC is pretty much out of the question
2022-10-25 09:50:00
I say this as an operator of a once-significant IRC network in my country 😉
2022-10-25 09:50:24
(peaked around ~1000 concurrent users, now around 100 users)
w
2022-10-25 10:00:24
https://xkcd.com/927/
fab
2022-10-25 04:33:59
https://fr-m-wikipedia-org.translate.goog/wiki/JPEG_XL?_x_tr_sl=fr&_x_tr_tl=en&_x_tr_hl=it&_x_tr_pto=wapp
2022-10-25 04:34:10
https://it-m-wikipedia-org.translate.goog/wiki/AOMedia_Video_1?_x_tr_sl=it&_x_tr_tl=en&_x_tr_hl=it&_x_tr_pto=wapp
2022-10-25 04:38:48
2022-10-25 04:39:22
For some things this is clearer and they already have 12 notes
diskorduser
2022-10-25 06:27:12
Can we have separate jxl Wikipedia channel? <:PepeHands:808829977608323112>
Jyrki Alakuijala
2022-10-25 06:40:13
+1 for jxl Wikipedia (also xyb jpeg discussion for wikipedia could go there)
Traneptora
2022-10-25 07:03:45
> First, if use_global_tree is false, the decoder reads a MA tree and corresponding clustered distributions as described in H.4.2; otherwise the global MA tree and its clustered distributions are used as decoded from the GlobalModular section (G.1.3).
Does this mean the *same* distribution is used to decode the MA tree as the one used to decode the Modular pixel data, including internal state?
2022-10-25 10:46:22
Also, what happens if you have a permuted TOC in a kModular frame, and LfGlobal appears at the end, but intermediary blocks use the GlobalModular MA Tree? Do you have to decode LfGlobal first anyway, buffering until then?
_wb_
2022-10-26 05:01:47
Yes, and an encoder that does that is doing something very silly
Traneptora
2022-10-26 05:03:13
I see, so you could do that but you don't really have any reason to do that
_wb_
2022-10-26 05:05:45
Exactly
Traneptora
2022-10-26 05:12:43
the only purpose I see in making those sorts of contrived pain-in-the-butt codestreams would be for conformance testing
2022-10-26 05:12:54
like to test if the decoder handles edge cases correctly
fab
2022-10-26 03:36:01
https://es.m.wikipedia.org/wiki/Especial:MobileDiff/146910739
Traneptora
2022-10-26 05:26:07
And JXLatte decoded its first image!
2022-10-26 05:26:11
an 8x8 block of white pixels
monad
2022-10-27 05:49:16
wow good job
Traneptora
2022-10-27 06:25:30
<@&807636211489177661>
w
2022-10-27 08:08:23
it was matrix
2022-10-27 08:08:40
as usual
improver
2022-10-27 08:10:41
the price of setting the channel publicly discoverable
Traneptora
2022-10-27 09:21:26
A second successfully decoded image from JXLatte, this one much more complicated!
2022-10-27 09:21:31
2022-10-27 09:22:01
the JXL art tool is useful for this, as I can be sure I read the tree correctly as I set it explicitly
Moritz Firsching
Traneptora And JXLatte decoded its first image!
2022-10-28 11:33:18
fantastic, great job!
yurume
Traneptora the JXL art tool is useful for this, as I can be sure I read the tree correctly as I set it explicitly
2022-10-28 11:43:54
are you close to having working modular prediction aside from wp? that's cool!
Traneptora
yurume are you close to having working modular prediction besides from wp? that's cool!
2022-10-28 01:01:54
well wp is by far the hardest part
2022-10-28 01:02:19
I haven't done predictors >15 yet tho
2022-10-28 01:02:21
the extra ones
2022-10-28 01:02:37
er, not predictors but tree contexts
yurume
2022-10-28 01:51:11
ah yeah, I see where you are
Traneptora
2022-10-28 02:17:50
conveniently java has stuff like Deflater built in so I could roll out an extremely primitive PNG encoder with minimal work
2022-10-28 02:17:58
since it's more widely supported than ppm
2022-10-28 02:18:20
Java does have a built in PNG encoder but it requires you to deal with their BufferedImage API which is not fun
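The deflate-based approach is small enough to sketch; here is the same idea in Python with `zlib` (8-bit grayscale only, filter type 0 on every scanline; a toy illustration, not Traneptora's encoder):

```python
import struct
import zlib

def chunk(tag: bytes, payload: bytes) -> bytes:
    # PNG chunk: big-endian length, tag, payload, CRC32 over tag + payload
    return (struct.pack(">I", len(payload)) + tag + payload
            + struct.pack(">I", zlib.crc32(tag + payload)))

def encode_png_gray8(width: int, height: int, rows) -> bytes:
    # rows: `height` byte strings of `width` bytes each (8-bit grayscale)
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)  # colour type 0
    raw = b"".join(b"\x00" + r for r in rows)  # filter type 0 per scanline
    return (b"\x89PNG\r\n\x1a\n"
            + chunk(b"IHDR", ihdr)
            + chunk(b"IDAT", zlib.compress(raw))
            + chunk(b"IEND", b""))

png = encode_png_gray8(2, 2, [b"\x00\xff", b"\xff\x00"])  # tiny checkerboard
```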
VcSaJen
Traneptora an 8x8 block of white pixels
2022-10-28 07:28:55
I still remember when I tried to implement JPG from scratch for my homework. It could only correctly decode 8x8 diagonal gradient of two colors.
Traneptora
2022-10-28 07:31:37
I picked something extremely simple just to test out the core functionality
2022-10-29 09:36:50
Can a referenced frame in Patch be *after* the frame containing the Patch?
2022-10-29 09:36:55
or does it have to be that frame or earlier?
_wb_
2022-10-29 09:40:38
It has to be earlier
2022-10-29 09:41:08
It's whatever has been stored in `Reference[idx]` at that point
Traneptora
2022-10-29 10:43:08
I'm confused about the ModularLfGroup section
2022-10-29 10:43:51
> then a channel corresponding to the LF group rectangle is added, with the same hshift and vshift as in the GlobalModular image, where the group dimensions and the x,y offsets are right-shifted by hshift (for x and width) and vshift (for y and height).
2022-10-29 10:44:11
`x` and `y` offsets as far as I'm aware aren't attached to modular channels, they're attached to frames
2022-10-29 10:44:26
at least it doesn't say anything about x and y offsets for modular channels in annex H
2022-10-29 10:52:50
Also, because the global modular stream doesn't have any inverse transforms applied yet, how are we supposed to copy pixels into the corresponding buffer?
_wb_
2022-10-29 10:52:52
It refers to the pixel positions, e.g. if the normal group offset would be 512,768 then for a hshift=1, vshift=2 channel, the actual group offset is 256,192
2022-10-29 10:53:38
And if the frame size is 1024x1024 then for that channel the channel size would be 512x256
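The offset/size rule above is plain bit-shifting (a hypothetical helper, mirroring the spec wording that x/width are shifted by hshift and y/height by vshift):

```python
def shifted_rect(x, y, w, h, hshift, vshift):
    # Map a group rectangle from frame coordinates into a channel's own
    # coordinate system: x and width shift right by hshift, y and height
    # by vshift.
    return (x >> hshift, y >> vshift, w >> hshift, h >> vshift)

# The numbers above: group offset (512, 768), frame 1024x1024, hshift=1, vshift=2
print(shifted_rect(512, 768, 1024, 1024, 1, 2))  # (256, 192, 512, 256)
```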
Traneptora
2022-10-29 10:53:58
wait so if the frame size is 1024x1024, and the groupDim is 256, then there's 16 LfGroups?
2022-10-29 10:55:05
and, when I decode the GlobalModular image, I decode channels until I get a channel that's bigger than 256x256, then I stop, right?
_wb_
2022-10-29 10:55:28
No, lfgroups have shift 3, so with groupDim 256 they correspond to 2048x2048 pixels but are only 256x256 in size of actual coefficients/channel sizes
Traneptora
2022-10-29 10:55:47
wait hold up
2022-10-29 10:56:00
when I decode the GlobalModular image, I decode channels until I get a channel that's bigger than 256x256, then I stop, right?
2022-10-29 10:56:03
and
2022-10-29 10:56:04
then
2022-10-29 10:56:11
After that, I have a list of post-forward-transform-channels in the GlobalModular channel list that aren't decoded *yet*
_wb_
2022-10-29 10:56:37
Yes
2022-10-29 10:57:06
The inverse transforms are only done when all the hfgroups have been decoded too
Traneptora
2022-10-29 10:57:11
So, when I decode an LfGroup, do I initialize an entirely new submodular bitstream, or do I use the same MA Tree and same entropy stream as in the GlobalModular?
_wb_
2022-10-29 10:57:41
Not the same entropy stream, as it is in different sections
2022-10-29 10:58:05
Same MA tree depends on if the global tree is used or a local one
Traneptora
2022-10-29 10:58:24
but the entropy stream associated with a modular sub-bitstream uses the one associated with its tree, right?
2022-10-29 10:59:18
I ask because if you look at the subsection of Annex H that describes MA trees, it says "after decoding the MA tree you read `(tree.size() + 1) / 2` clustered Distributions"
2022-10-29 10:59:38
right in that section
2022-10-29 10:59:42
which leads me to believe that if a submodular stream uses the global tree, it uses the distribution associated with that tree
2022-10-29 11:00:03
preserving *state* at least
2022-10-29 11:00:05
Is this not true?
2022-10-29 11:00:23
Or are those clustered distributions read *when* the modular tree is requested.
_wb_
2022-10-29 11:03:30
<@268284145820631040> probably knows better than me at this point, for me it is longer ago ๐Ÿ™‚
yurume
2022-10-29 11:04:15
(reading the backlog)
_wb_
2022-10-29 11:04:29
Please remember all unclarity and propose clarifications for the spec
Traneptora
2022-10-29 11:04:47
I'm gonna propose clarifications once I figure out how it works and debug, but yea
2022-10-29 11:05:00
but it looks like for each LFGroup you read an entirely new modular sub bitstream
2022-10-29 11:05:46
which means each LFGroup can have its own set of transforms as well, is this correct?
_wb_
2022-10-29 11:06:46
Yes, all sections are entirely new subbitstreams, that get their own local transforms which get undone locally, and then they combine into the full modular image which gets the global transforms undone
2022-10-29 11:07:11
So you can have local palettes, local RCTs, etc
Traneptora
2022-10-29 11:08:06
so if GroupDim is 256, then each LF Group is a 256x256 Modular stream, with one channel corresponding to each channel that has at least hShift >=3 and vshift >= 3?
yurume
2022-10-29 11:08:07
I think it's per-group, not per-lf-group (G.3.3)
2022-10-29 11:09:51
so in my understanding, whenever the spec says "...decoded as another modular image bitstream (H)", it is independent from all other modular bitstreams and has own transformations
_wb_
Traneptora so if GroupDim is 256, then each LF Group is a 256x256 Modular stream, with one channel corresponding to each channel that has at least hShift >=3 and vshift >= 3?
2022-10-29 11:10:03
Yes, anything that fits in 256x256 iirc
2022-10-29 11:11:02
The global metachannels and anything that fits as-is in 256x256 goes into globalmodular
2022-10-29 11:11:41
Then lfgroups have everything else with shift >= 3, split in groups that end up fitting in 256x256
Traneptora
2022-10-29 11:11:51
each LFGroup has a number of channels corresponding to the number of un-decoded channels in GlobalModular.
_wb_
2022-10-29 11:12:01
And then modulargroups for everything else
Traneptora each LFGroup has a number of channels corresponding to the number of un-decoded channels in GlobalModular.
2022-10-29 11:12:37
Yes except the ones that will go in the modulargroups, so with shift < 3
Traneptora
2022-10-29 11:12:47
wait, no, each LFGroup has a number of channels correponding to un-decoded channels in GlobalModular, but only those with shift >= 3
2022-10-29 11:12:50
I see
_wb_
2022-10-29 11:13:32
It is quite complicated but it is done this way basically so modular stays in sync with what is available in vardct, so you can get progressive alpha etc
2022-10-29 11:13:40
(or do progressive lossless)
Traneptora
2022-10-29 11:14:39
so it sounds like I replace the undecoded channels with a buffer when decoding the LF Groups
2022-10-29 11:15:02
or rather, the relevant undecoded channels
2022-10-29 11:15:03
then I decode each LF Group into their corresponding region of that buffer
_wb_
2022-10-29 11:15:11
Yes
Traneptora
2022-10-29 11:15:21
and the channel counts will always be the same, by definition
2022-10-29 11:16:02
How do I calculate that `x` and `y` offset for the individual LF Group?
_wb_
2022-10-29 11:17:59
The groups are in scanline order, so first one is at 0,0, second at 256,0, etc
2022-10-29 11:18:36
Or 2048,0 but shifted by 3
Traneptora
2022-10-29 11:19:25
so they're at `groupDim >> (hShift - 3)`?
2022-10-29 11:19:30
where `hShift >= 3`
2022-10-29 11:21:13
in this case if `hShift == 4` and `groupDim == 256`, then `width == 128`, and they occur every `width` pixels?
_wb_
2022-10-29 11:22:52
Yes, basically lfgroups contain the "dc" data (or in case of modular, the squeeze-equivalent of that) for a 2048x2048 region downscaled 1:8
2022-10-29 11:25:27
So every such group contains the corresponding data from each channel, taking into account their channel shift, and when all lfgroups are decoded you in principle have enough to reconstruct an 1:8 image by applying the inverse squeeze and assuming that all not-yet-decoded (shift < 3) channels are all-zeroes
2022-10-29 11:26:54
The idea is that you get both progressive decode and a way to do region of interest (cropped) decode
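The LF group geometry from this exchange can be written out directly (a hypothetical helper; shift 3 means each LF group covers `groupDim << 3` pixels per side while holding only `groupDim`-sized channel data):

```python
def lf_group_grid(frame_w, frame_h, group_dim=256):
    # Each LF group covers (group_dim << 3) x (group_dim << 3) pixels,
    # e.g. 2048x2048 for group_dim = 256.
    lf_dim = group_dim << 3
    cols = -(-frame_w // lf_dim)   # ceiling division
    rows = -(-frame_h // lf_dim)
    # Scanline-order top-left offsets in 1:8 (DC-resolution) coordinates:
    offsets = [((i % cols) * group_dim, (i // cols) * group_dim)
               for i in range(cols * rows)]
    return cols * rows, offsets

print(lf_group_grid(1024, 1024))  # (1, [(0, 0)])
print(lf_group_grid(4096, 2048))  # (2, [(0, 0), (256, 0)])
```

For an individual channel with extra shift, the per-group offsets would additionally be right-shifted by that channel's hshift/vshift, as in the earlier offset rule.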
Traneptora
2022-10-29 11:29:14
the Modular stream section in Annex H mentions that extra channels have a DimShift equal to their extraChannel's dimShift.
2022-10-29 11:29:43
But when you're decoding an LFGroup, how do you know if a corresponding undecoded channel is an Extra Channel? does it matter?
2022-10-29 11:31:15
or does it not matter because it inherits vshift and hshift from the corresponding global modular channel?
_wb_
Traneptora or does it not matter becauase it inherits vshift and hshift from the global modular corresponding channel?
2022-10-30 06:35:12
Yes, this.
2022-10-30 01:14:09
Not really, but it could be used as an interchange format without forced flattening