JPEG XL

on-topic

Whatever else

CrushedAsian255
2024-11-15 08:23:51
Lossy modular uses the Squeeze transform, but there aren’t really specific blocks
2024-11-15 08:23:55
Unless you mean group size?
salrit
2024-11-15 08:24:20
I wanted to know about the lossless mode.
CrushedAsian255
2024-11-15 08:25:18
Lossless modular is split up into groups for parallel encoding but there aren’t really “blocks” as you would find in VarDCT and other DCT based formats
salrit
2024-11-15 08:28:23
Oh, thanks, I was trying to approach it the WebP kind of way, where they divide the image into blocks. BTW, can you describe in a line or two what these groups are?
2024-11-15 08:29:47
For example this is one of the frame info when I am trying to use lossless for a gray image: {xsize = 512, ysize = 512, xsize_upsampled = 512, ysize_upsampled = 512, xsize_upsampled_padded = 512, ysize_upsampled_padded = 512, xsize_padded = 512, ysize_padded = 512, xsize_blocks = 64, ysize_blocks = 64, xsize_groups = 2, ysize_groups = 2, xsize_dc_groups = 1, ysize_dc_groups = 1, num_groups = 4, num_dc_groups = 1, group_dim = 256, dc_group_dim = 2048}
CrushedAsian255
2024-11-15 08:31:26
Not sure what is going on there
2024-11-15 08:31:38
I haven’t used the api
2024-11-15 08:31:42
Just the cli
salrit
2024-11-15 08:33:53
Okay, thnx
_wb_
2024-11-15 08:56:05
group_dim x group_dim is the size of a section that is coded independently (so can be encoded/decoded in parallel)
2024-11-15 08:57:03
xsize x ysize is the coded size of the frame, which gets split into sections of group_dim size
2024-11-15 08:57:38
blocks and DC groups are not relevant for lossless
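A minimal sketch of the layout implied above, assuming sections are laid out by ceiling division of the frame size by group_dim (illustrative names, not the libjxl API):
```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical illustration (not libjxl code): split a frame into
// independently coded sections of group_dim x group_dim pixels.
struct GroupLayout {
  size_t xsize_groups, ysize_groups, num_groups;
};

GroupLayout ComputeGroupLayout(size_t xsize, size_t ysize, size_t group_dim) {
  GroupLayout g;
  g.xsize_groups = (xsize + group_dim - 1) / group_dim;  // ceiling division
  g.ysize_groups = (ysize + group_dim - 1) / group_dim;
  g.num_groups = g.xsize_groups * g.ysize_groups;
  return g;
}

int main() {
  // The 512x512 lossless example from this chat, with group_dim = 256:
  GroupLayout g = ComputeGroupLayout(512, 512, 256);
  std::printf("%zu x %zu groups, %zu total\n", g.xsize_groups, g.ysize_groups,
              g.num_groups);  // prints: 2 x 2 groups, 4 total
}
```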
salrit
_wb_ xsize x ysize is the coded size of the frame, which gets split into sections of group_dim size
2024-11-15 11:14:59
Coded independently... as in each section has a separate tree for the prediction and is then sent for entropy coding individually?
_wb_
2024-11-15 11:17:09
The tree can be separate or shared, but the ANS stream is initialized separately per section.
lonjil
_wb_ Not sure if I should keep that bump for cameras, it's based on the 102-megapixel Fujifilm GFX100 which was released in 2019, while today in 2024 pretty much the highest resolution camera around is the Sony α7R V at 61 megapixels. My conclusion is they went ahead and broke that 100 megapixel barrier and then decided ~50 megapixels is actually enough.
2024-11-15 11:18:54
here's a fun thing: the Fujifilm GFX100 cameras, and some other cameras like the Sony a7R V, have a "pixel shift" feature that will take multiple exposures with the sensor moved slightly. 4x pixel shift moves the sensor by single pixels, giving you better color information but not higher resolution. 16x color shift moves the sensor by half-pixel distances, producing 4x higher resolution images. For this reason, some people have called the GFX100 a "400 MP camera".
salrit
_wb_ The tree can be separate or shared, but the ANS stream is initialized separately per section.
2024-11-15 11:19:20
thnx
CrushedAsian255
_wb_ blocks and DC groups are not relevant for lossless
2024-11-15 11:20:17
aren't DC groups still used for lossless?
2024-11-15 11:20:29
or am i thinking of squeeze
_wb_
2024-11-15 11:22:13
Yes, when using squeeze, Modular will follow the section structure of VarDCT, putting the very lowest frequency data in the Global section, the rest of the data for a 1:8 image in the DC groups, and the rest in the AC groups (in the corresponding passes, in case of multiple passes).
2024-11-15 11:22:38
But for usual lossless, things are not progressive and the data is encoded only in the AC groups.
CrushedAsian255
2024-11-15 11:22:40
what is stored in Global section?
2024-11-15 11:22:45
for vardct and for modular?
_wb_
2024-11-15 11:26:17
The Global section has the shared tree (if there is a shared one) and basically all the data that still fits within a single group-sized chunk. So if the image is, say, 16000x8000, and you use Squeeze, then the Global section will contain everything up to 250x250, which will be enough for a 500x250 preview. The DC groups will contain the data needed to get a 2000x1000 preview (1:8). And the AC groups will contain the rest.
2024-11-15 11:27:10
The Global section also contains the splines and patches, noise parameters, and quantization tables.
CrushedAsian255
2024-11-15 11:27:14
so if squeeze is not enabled, will the LF be empty?
2024-11-15 11:27:52
im guessing HF Global is not used for modular?
_wb_
2024-11-15 11:28:07
For VarDCT the Global section is typically quite small since it doesn't contain any actual image data.
2024-11-15 11:28:51
For Modular it will be even smaller if Squeeze is not used, since it will only have the splines/patches/noise data.
CrushedAsian255
2024-11-15 11:29:23
for a 1440x960 image there is data in both LfGlobal and LfGroup(0), for that size couldn't it all fit in just LfGlobal as 1:8 is 180x120?
2024-11-15 11:29:37
(lossy modular)
_wb_
2024-11-15 11:39:49
uhm yes, I wouldn't expect that to happen. Can you dump some debug info to output what exactly the modular data is in each section? E.g. compile with `-DJXL_DEBUG_V_LEVEL=10` and then do a single-threaded encode or decode.
CrushedAsian255
2024-11-15 11:44:30
I am having issues building it on Mac
2024-11-15 11:45:27
Is there a guide or should I use a Linux vm
Tirr
2024-11-15 11:48:02
what problem have you encountered?
2024-11-15 11:49:44
basically you'll need cmake, ninja, and graphviz (in addition to xcode command line tools, which should include clang). then run `SKIP_TEST=1 ./ci.sh release -DBUILD_TESTING=Off`
CrushedAsian255
2024-11-15 11:50:42
Hang on I’ll get back to you in a few minutes have to sort something IRL
Tirr basically you'll need cmake, ninja, and graphviz (in addition to xcode command line tools, which should include clang). then run `SKIP_TEST=1 ./ci.sh release -DBUILD_TESTING=Off`
2024-11-15 11:50:51
Can I install those through brew?
Tirr
2024-11-15 11:50:55
yep
Tirr basically you'll need cmake, ninja, and graphviz (in addition to xcode command line tools, which should include clang). then run `SKIP_TEST=1 ./ci.sh release -DBUILD_TESTING=Off`
2024-11-15 11:53:40
maybe replacing `release` with `debug` would be better since you're trying to debug
spider-mario
lonjil here's a fun thing: the Fujifilm GFX100 cameras, and some other cameras like the Sony a7R V, have a "pixel shift" feature that will take multiple exposures with the sensor moved slightly. 4x pixel shift moves the sensor by single pixels, giving you better color information but not higher resolution. 16x color shift moves the sensor by half-pixel distances, producing 4x higher resolution images. For this reason, some people have called the GFX100 a "400 MP camera".
2024-11-15 11:57:19
one could argue that 4× pixel shift gives you higher chroma resolution: https://www.strollswithmydog.com/bayer-cfa-effect-on-sharpness/
CrushedAsian255
Tirr maybe replacing `release` with `debug` would be better since you're trying to debug
2024-11-15 11:57:48
What debug?
spider-mario
2024-11-15 11:57:54
> In conclusion we have seen that the effect of a Bayer CFA on the spatial frequencies and hence the ‘sharpness’ information captured by a sensor compared to those from the corresponding monochrome version can go from (almost) nothing to halving the potentially unaliased range, based on the chrominance content of the image and the direction in which the spatial frequencies are being stressed.
lonjil
spider-mario one could argue that 4× pixel shift gives you higher chroma resolution: https://www.strollswithmydog.com/bayer-cfa-effect-on-sharpness/
2024-11-15 11:57:55
absolutely
CrushedAsian255
2024-11-15 12:19:01
i think its building
2024-11-15 12:19:10
nevermind
2024-11-15 12:19:31
`~/jxl_build/libjxl/lib/extras/dec/apng.cc:581:5: error: no matching function for call to 'png_set_keep_unknown_chunks'`
2024-11-15 12:20:36
```
libjxl/lib/extras/dec/jpg.cc:221:5: error: no matching function for call to 'jpeg_mem_src'
  221 |     jpeg_mem_src(&cinfo, reinterpret_cast<const unsigned char*>(bytes.data()),
      |     ^~~~~~~~~~~~
/Library/Frameworks/Mono.framework/Headers/jpeglib.h:959:14: note: candidate function not viable: 2nd argument ('const unsigned char *') would lose const qualifier
  959 | EXTERN(void) jpeg_mem_src JPP((j_decompress_ptr cinfo,
      |              ^
  960 |                                unsigned char * inbuffer,
      |                                ~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
```
2024-11-15 12:21:51
<@206628065147748352> do you know what this means
2024-11-15 12:25:14
I don't know how to get it to build, but here is what I did: take this file `a.jxl` and run `cjxl a.jxl b.jxl -m 1 -d 1`. `b.jxl` will show the weird behaviour. Here are `a.jxl` and `b.jxl`
2024-11-15 12:31:11
(The file I first noticed the issue was different but i didn’t want to share it so I found this other file which also displays the same symptoms)
salrit
_wb_ The tree can be separate or shared, but the ANS stream is initialized separately per section.
2024-11-15 02:20:02
So, is it possible to give an overview for lossless? Like: initially enc_modular is set up using the image features (size, number of colour planes, etc.), then the image is divided into sections to process them in parallel. Each section is then subjected to prediction (where there might be a shared tree or individual ones) and then to ANS individually per section. Overall, for general lossless using maybe -e 9, this is the flow: EncodeImageJXL -> ReadCompressedOutput -> JxlEncoderProcessOutput -> ProcessOneEnqueuedInput -> EncodeFrame -> EncodeFrameOneShot -> ComputeEncodingData -> ComputeTree -> EncodeGroups. I am particularly interested in the predictor; I found that sometimes some pre-made splits are used (if I understood correctly) and then for higher-effort cases the tree is built. How is tree_splits_ decided? This is an output of a simple grayscale image's GDB session: "std::vector of length 493, capacity 493 = {{splitval = 22, property = 1, lchild = 1, rchild = 2, predictor = jxl::Predictor::Weighted, predictor_offset = 0, multiplier = 1}, {splitval = 23, property = 1, lchild = 3, rchild = 4, predictor = jxl::Predictor::Weighted, predictor_offset = 0, multiplier = 1}, {splitval = 2, property = 10, lchild = 261, rchild = 262, predictor = jxl::Predictor::Weighted, predictor_offset = 0, multiplier = 1}, ......" What do the fields represent, particularly splitval and property?
2024-11-15 02:21:10
underscore got the italics font... pls ignore them
_wb_
2024-11-15 02:21:52
property is which of these to test against
2024-11-15 02:22:21
the decision nodes in the MA tree are of the form [property] > [splitval]
2024-11-15 02:23:36
so splitval=22, property=1 means it's a decision node "stream index > 22" (with one branch for the > case and another for the <= case)
2024-11-15 02:25:04
at low effort settings a prebaked tree is used, at higher effort a tree is constructed based on the image data, with higher efforts using more of the available properties than lower efforts
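A rough sketch of walking such a tree, with illustrative field names modeled on the GDB dump above; which child corresponds to the ">" branch is an assumption here, and this is not libjxl's actual data structure:
```cpp
#include <cstdint>
#include <vector>

// Illustrative sketch (not libjxl's implementation) of the decision nodes
// described above: each node tests "properties[property] > splitval" and
// descends into one of two children; a leaf selects the context/predictor.
struct Node {
  int32_t splitval;
  int16_t property;        // index of the property tested (e.g. 1 = stream index)
  int32_t lchild, rchild;  // children; branch assignment here is an assumption
  int predictor;           // predictor used if this node is a leaf
  bool is_leaf;            // leaf marker (the real structure encodes this differently)
};

int FindLeaf(const std::vector<Node>& tree,
             const std::vector<int32_t>& properties) {
  int n = 0;
  while (!tree[n].is_leaf) {
    const Node& node = tree[n];
    n = (properties[node.property] > node.splitval) ? node.lchild : node.rchild;
  }
  return n;  // leaf index: determines the context and predictor for this pixel
}
```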
salrit
2024-11-15 02:30:28
So the properties are basically the contexts.. based on which the pixels are segregated
_wb_ at low effort settings a prebaked tree is used, at higher effort a tree is constructed based on the image data, with higher efforts using more of the available properties than lower efforts
2024-11-15 02:41:43
The predictor might be different for each node of the MA Tree? and is it that for higher efforts all the predictors are tried to decide the best one?
_wb_
2024-11-15 03:18:58
Higher efforts will try more predictors and will also use more of the properties to construct an MA tree.
2024-11-15 03:19:39
And yes, the predictor can be different in each node, so the tree is both for context modeling and for selecting which predictor to use
salrit
2024-11-15 04:16:24
Thanx
2024-11-16 06:38:12
Got something else too: what is the ModularStreamID? Suppose for grey-scale it is ModularAC, what does that mean? And secondly, what is the "num_streams" parameter? It is derived from the ModularStreamID and is used in the "tree_splits" also...?
_wb_
2024-11-16 07:18:51
There is some numbering scheme for the sections (streams) which is also used to allow MA trees to be shared between sections while still having differences between sections since you can have decision nodes based on stream ID
2024-11-16 07:21:35
By the way, decision nodes based on stream ID, channel number, and row number are not actually causing branching in the decoder, since it will specialize the tree.
CrushedAsian255
_wb_ By the way, decision nodes based on stream ID, channel number, and row number are not actually causing branching in the decoder, since it will specialize the tree.
2024-11-16 07:45:05
So the decoder prunes the tree before running it?
_wb_ There is some numbering scheme for the sections (streams) which is also used to allow MA trees to be shared between sections while still having differences between sections since you can have decision nodes based on stream ID
2024-11-16 07:46:59
Are stream IDs the “group number”s in jxl art?
_wb_
2024-11-16 08:06:23
yes
2024-11-16 08:06:32
and yes
2024-11-16 08:06:54
streams, sections, groups are kind of used interchangeably
Tirr
2024-11-16 08:08:58
the spec uses the term "stream ID" in the context of Modular sub-streams
2024-11-16 08:12:48
every Modular substream gets its own unique ID, even for VarDCT varblock quantization matrices IIRC
_wb_
2024-11-16 08:16:22
> The stream index is defined as follows:
> for GlobalModular: 0;
> for LF coefficients: 1 + LF group index;
> for ModularLfGroup: 1 + num_lf_groups + LF group index;
> for HFMetadata: 1 + 2 * num_lf_groups + LF group index;
> for RAW dequantization tables: 1 + 3 * num_lf_groups + parameters index (see I.2.4);
> for ModularGroup: 1 + 3 * num_lf_groups + 17 + num_groups * pass index + group index.
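A small illustration of just the ModularGroup rule from that quote, using a hypothetical helper (not a libjxl function); with 1 LF group, 4 groups and a single pass it lands on streams 21 through 24, consistent with the example discussed below:
```cpp
#include <cstddef>
#include <cstdio>

// Sketch of the ModularGroup stream numbering quoted above (illustrative
// helper, not the libjxl API). The "+ 17" covers the raw dequantization
// table slots.
size_t ModularGroupStreamId(size_t num_lf_groups, size_t num_groups,
                            size_t pass_index, size_t group_index) {
  return 1 + 3 * num_lf_groups + 17 + num_groups * pass_index + group_index;
}

int main() {
  // 512x512 lossless example: 1 LF group, 4 groups, single pass.
  for (size_t g = 0; g < 4; ++g) {
    std::printf("group %zu -> stream %zu\n", g,
                ModularGroupStreamId(1, 4, 0, g));  // streams 21, 22, 23, 24
  }
}
```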
CrushedAsian255
2024-11-16 08:17:34
Is modular initialised once and is used everywhere?
2024-11-16 08:17:47
I guess that’s what Global is for?
_wb_
2024-11-16 08:18:31
every modular substream can have its own transforms and MA tree and everything
CrushedAsian255
2024-11-16 08:19:05
Then why do they all share a single namespace for streams?
_wb_
2024-11-16 08:19:36
but there's also per frame a concept of one large modular image, which can have global transforms and a global MA tree that can be used in each of the substreams
CrushedAsian255
2024-11-16 08:19:52
Can each sub bitstream signal to just use the global modular subbitstream ?
_wb_
2024-11-16 08:20:55
a specific modular substream can choose to either use the global tree or to define its own local tree
CrushedAsian255
2024-11-16 08:21:23
That makes sense
2024-11-16 08:21:55
Do the MA tree leaves define the ANS context?
2024-11-16 08:23:40
How do ANS contexts work again?
salrit
2024-11-16 08:25:16
There was this thing too: there are two different ways the heuristic properties are defined, one this way: splitting_heuristics_properties = std::vector of length 16, capacity 16 = {0, 1, 15, 9, 10, 11, 12, 13, 14, 2, 3, 4, 5, 6, 7, 8} and some other way if the Squeeze transform is used. I couldn't understand the way they are defined?
_wb_
CrushedAsian255 How do ANS contexts work again?
2024-11-16 08:28:20
Yes. The context determines the probabilities for each token, which determine how many fractional bits it will take to encode that token (high-probability ones will use fewer bits, low-probability ones more)
CrushedAsian255
2024-11-16 08:28:49
Fractional bits?
2024-11-16 08:29:07
Like in Arithmetic coding?
_wb_
salrit There was this thing too, well two different ways the heuristic properties are defined, one this way : splitting_heuristics_properties = std::vector of length 16, capacity 16 = {0, 1, 15, 9, 10, 11, 12, 13, 14, 2, 3, 4, 5, 6, 7, 8} and in some other way if the squeeze Trsfm is used, Couldn't understand the way of defining ?
2024-11-16 08:29:39
depending on the effort setting, it will use some prefix of that vector. These are indices of properties, sorted more or less from "most useful" to "least useful"
CrushedAsian255 Like in Arithmetic coding?
2024-11-16 08:30:35
Huffman coding requires an integer number of bits per encoded token, while in ANS (and arithmetic coding / range coding) you can have symbols that take less than one bit
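A tiny illustration of the "fractional bits" idea: the ideal cost of a token with probability p is -log2(p) bits, which an ANS or arithmetic coder can approach, while Huffman has to round up to whole bits:
```cpp
#include <cmath>
#include <cstdio>

// Illustrative only: the ideal cost of a token with probability p is
// -log2(p) bits, which need not be an integer. ANS / arithmetic coding can
// approach this cost; Huffman rounds up to whole bits per token.
int main() {
  double probs[] = {0.9, 0.5, 0.1, 0.01};
  for (double p : probs) {
    std::printf("p = %.2f -> ideal cost = %.3f bits\n", p, -std::log2(p));
  }
  // e.g. p = 0.90 -> 0.152 bits, i.e. well under one bit per token
}
```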
CrushedAsian255
_wb_ depending on the effort setting, it will use some prefix of that vector. These are indices of properties, sorted more or less from "most useful" to "least useful"
2024-11-16 08:30:45
Why are not useful properties included?
salrit
_wb_ depending on the effort setting, it will use some prefix of that vector. These are indices of properties, sorted more or less from "most useful" to "least useful"
2024-11-16 08:30:48
Ahh.. like the DEFLATE type two use of prefix tokens.. the more common ones are sorted first
_wb_
CrushedAsian255 Why are not useful properties included?
2024-11-16 08:31:24
They're all useful, some are just typically more useful than others. It also depends on the image content which properties are useful.
CrushedAsian255
2024-11-16 08:31:47
So it’s ordered by which ones are more commonly useful?
_wb_
2024-11-16 08:32:55
yes — though maybe there's room for improvement there, it's all just encoder heuristics that were at some point determined based on some corpus of images and the state of the encoder as it was then
CrushedAsian255
_wb_ Huffman coding requires an integer number of bits per encoded token, while in ANS (and arithmetic coding / range coding) you can have symbols that take less than one bit
2024-11-16 08:33:07
How does ANS actually work compared to range coding? Is rANS like a hybrid?
salrit
_wb_ There is some numbering scheme for the sections (streams) which is also used to allow MA trees to be shared between sections while still having differences between sections since you can have decision nodes based on stream ID
2024-11-16 08:38:35
I'm not sure I get it fully. Suppose for a grey image of 512 x 512 I use lossless encoding and this is the frame_data: {xsize = 512, ysize = 512, xsize_upsampled = 512, ysize_upsampled = 512, xsize_upsampled_padded = 512, ysize_upsampled_padded = 512, xsize_padded = 512, ysize_padded = 512, xsize_blocks = 64, ysize_blocks = 64, xsize_groups = 2, ysize_groups = 2, xsize_dc_groups = 1, ysize_dc_groups = 1, num_groups = 4, num_dc_groups = 1, group_dim = 256, dc_group_dim = 2048} For this I get num_streams as 25. As you pointed out, the group_dim (which is 256) and the number of groups (4) are what matter for lossless here, but if streams are the same as sections, which are the same as groups, why is 'num_streams = 25'? And there was this thing too, that the useful_splits in the tree gets initialized to {0, num_streams}; what is the 'useful_splits' here?
_wb_
CrushedAsian255 How does ANS actually work compared to range coding? Is rANS like a hybrid?
2024-11-16 08:40:39
the way I see it, the main difference between ANS and range coding is that range coding is more or less symmetric between encode and decode, while ANS is a bit faster to decode at the cost of complicating the encoder. Other than that they're pretty similar, and things depend more on _how_ you use them (e.g. one bit at a time vs larger alphabet size, static probabilities vs dynamically updating probabilities, etc) than on ANS vs AC/range coding.
salrit I not sure I get it fully , suppose for a grey image of 512 x 512, I use lossless encoding and this is the frame_data : {xsize = 512, ysize = 512, xsize_upsampled = 512, ysize_upsampled = 512, xsize_upsampled_padded = 512, ysize_upsampled_padded = 512, xsize_padded = 512, ysize_padded = 512, xsize_blocks = 64, ysize_blocks = 64, xsize_groups = 2, ysize_groups = 2, xsize_dc_groups = 1, ysize_dc_groups = 1, num_groups = 4, num_dc_groups = 1, group_dim = 256, dc_group_dim = 2048} For this I get the num_streams as 25 , as you pointed out the group_dim which is 256 and the number of groups matters here for lossless which is 4 are relevant for now but if the streams is same as sections and which is same as groups, the 'num_streams = 25' ? And there was this thing too that the useful_splits in the tree gets initialized to {0, num_streams}, what is the 'useful_splits' here?
2024-11-16 08:42:38
it's a high number because there are many stream IDs unused, like possibly there are 17 raw quantization tables that each get their own stream id
2024-11-16 08:43:17
for a lossless image, you'll end up with stream ids 21, 22, 23, 24 being used for the actual data
CrushedAsian255
2024-11-16 08:44:28
Are the others then just set to Set 0 and a singleton of 0 or something ?
_wb_
2024-11-16 08:44:43
streams that are not needed are not signaled at all
salrit
2024-11-16 08:46:07
okay..
CrushedAsian255
2024-11-16 08:47:37
JXL uses rANS with static probabilities, correct?
salrit
_wb_ it's a high number because there are many stream IDs unused, like possibly there are 17 raw quantization tables that each get their own stream id
2024-11-16 08:54:41
For the same image the tree_splits were: (gdb) p tree_splits_ gave $338 = std::vector of length 2, capacity 2 = {0, 25}. What does it represent? I mean the {0, 25}?
_wb_
2024-11-16 09:28:47
I think <@179701849576833024> wrote that code but I think that's just some encoder bookkeeping thing to make sure we use different subtrees for streams corresponding to DC data than for streams corresponding to block selection metadata or modular HF groups, as a bit of an optimization to not have to learn that those are different kinds of data
veluca
2024-11-16 09:30:22
I have some vague memory of this being quantization related, and {0, num_streams} being effectively a noop
2024-11-16 09:31:58
ah, no, it is indeed meant to ensure the encoder doesn't put entirely unrelated data in the same tree
salrit
veluca ah, no, it is indeed meant to ensure the encoder doesn't put entirely unrelated data in the same tree
2024-11-16 09:37:14
so the {0, num_streams} indicates that there might be a single tree, or up to 'num_streams' trees?
veluca
2024-11-16 09:37:51
no it's just saying that there's one tree covering streams [0, num_streams)
2024-11-16 09:38:24
if it were {0, 5, num_streams} it'd indicate two trees, for `[0, 5)` and `[5, num_streams)`
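A minimal sketch of that splits convention, assuming tree i covers the stream-ID range [splits[i], splits[i+1]) (illustrative helper, not libjxl code):
```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Sketch of the splits vector semantics described above (illustrative, not
// libjxl code): splits = {0, ..., num_streams}, and tree i covers the
// stream-ID range [splits[i], splits[i+1]). Assumes stream_id < splits.back().
size_t TreeForStream(const std::vector<size_t>& splits, size_t stream_id) {
  size_t tree = 0;
  while (stream_id >= splits[tree + 1]) ++tree;
  return tree;
}

int main() {
  std::vector<size_t> splits = {0, 5, 25};  // two trees: [0,5) and [5,25)
  std::printf("stream 3 -> tree %zu\n", TreeForStream(splits, 3));    // tree 0
  std::printf("stream 21 -> tree %zu\n", TreeForStream(splits, 21));  // tree 1
}
```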
salrit
2024-11-16 09:39:22
Ahh.. thanks
2024-11-16 01:24:04
What do Computing a Tree and Tokenizing a Tree differ by? There were two steps, ComputeTree and then TokenizeTree...
spider-mario
2024-11-16 02:18:27
interesting, the Preview app on macOS 15.1 (not sure since when exactly) seems to even display PNGs with an HLG ICC as HDR
CrushedAsian255
spider-mario interesting, the Preview app on macOS 15.1 (not sure since when exactly) seems to even display PNGs with an HLG ICC as HDR
2024-11-16 02:20:59
I think it was added in 15.0, I noticed this a couple of weeks ago
spider-mario
2024-11-16 02:24:46
not JPEG though
salrit
veluca if it were {0, 5, num_streams} it'd indicate two trees, for `[0, 5)` and `[5, num_streams)`
2024-11-16 04:32:46
`const size_t num_toc_entries = is_small_image ? 1 : AcGroupIndex(0, 0, num_groups, frame_dim.num_dc_groups) + num_groups * num_passes;` Can you let me know what this "num_toc_entries" is?
veluca
2024-11-16 04:36:41
It's the number of sections that the frame is split into
salrit
2024-11-16 04:37:59
But that's the num_groups right?
veluca It's the number of sections that the frame is split into
2024-11-16 04:39:00
{xsize = 512, ysize = 512, xsize_upsampled = 512, ysize_upsampled = 512, xsize_upsampled_padded = 512, ysize_upsampled_padded = 512, xsize_padded = 512, ysize_padded = 512, xsize_blocks = 64, ysize_blocks = 64, xsize_groups = 2, ysize_groups = 2, xsize_dc_groups = 1, ysize_dc_groups = 1, num_groups = 4, num_dc_groups = 1, group_dim = 256, dc_group_dim = 2048} Like here num_groups says that the frame is divided into 4 sections of 256 x 256, if I am not mistaken...
veluca
2024-11-16 04:41:56
yes
2024-11-16 04:42:01
but sections are more than that
2024-11-16 04:42:12
there's also AcGlobal, DcGlobal, and DcGroups
2024-11-16 04:42:33
so the toc should have 7 entries
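A rough sketch of that count, assuming the section order LfGlobal, DC groups, AcGlobal, then one AC group per group and per pass (hypothetical helper; the real code derives this via AcGroupIndex):
```cpp
#include <cstddef>
#include <cstdio>

// Rough sketch of the section count discussed above (illustrative only):
// LfGlobal + DC groups + AcGlobal + one AC group per group and per pass.
size_t NumTocEntries(size_t num_groups, size_t num_dc_groups,
                     size_t num_passes, bool is_small_image) {
  if (is_small_image) return 1;  // everything fits in one combined section
  return 1 /*LfGlobal*/ + num_dc_groups + 1 /*AcGlobal*/ +
         num_groups * num_passes;
}

int main() {
  // 512x512 example: 4 groups, 1 DC group, 1 pass -> 7 TOC entries.
  std::printf("%zu\n", NumTocEntries(4, 1, 1, false));
}
```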
salrit
veluca so the toc should have 7 entries
2024-11-16 04:45:06
Yes, it has 7... Well, for lossless encoding, what do the AcGlobal, DcGlobal and DcGroups mean?
veluca
2024-11-16 04:46:04
IIRC DcGlobal will contain the tree + histograms, patches, and maybe palette and such, and DcGroups and AcGlobal will likely be empty
salrit
2024-11-16 04:48:11
Thanx, so these are also sections apart from the num_groups. `JXL_RETURN_IF_ERROR(RunOnPool(pool, 0, num_groups, resize_aux_outs, process_group, "EncodeGroupCoefficients"));` And I think this says that the groups are encoded in parallel by process_group, right?
veluca
2024-11-16 04:48:45
yep
salrit
2024-11-16 04:49:02
Cool.. thnx
veluca yep
2024-11-16 04:54:17
Just one more thing to ask in this regard. Like you said, the tree for the example I gave is a single one (the example of splits = {0, 25}). Does that mean that there is a separate tree computed for each group and then merged later into a single tree? If so, then how is each group entropy coded in parallel? Or am I missing something in between...?
veluca
2024-11-16 05:12:19
computing the tree is generally faster than compressing the groups
2024-11-16 05:13:21
what jxl does is to first compute some statistics about the image (at high enough efforts), in parallel across groups, then compute a tree, then encode
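A very rough outline of that flow with hypothetical function names (this is not the libjxl API), just to show where the parallel and serial parts sit:
```cpp
#include <cstddef>
#include <vector>

// Hypothetical outline of the flow described above: gather statistics per
// group (parallelizable), learn one shared MA tree, then encode each group
// (parallelizable) using that tree. Bodies are stubs for illustration.
struct Samples {};
struct Tree {};
struct GroupBitstream {};

Samples GatherSamples(size_t /*group*/) { return {}; }
Tree LearnTree(const std::vector<Samples>& /*all*/) { return {}; }
GroupBitstream EncodeGroup(size_t /*group*/, const Tree& /*tree*/) { return {}; }

std::vector<GroupBitstream> EncodeFrame(size_t num_groups) {
  std::vector<Samples> samples(num_groups);
  for (size_t g = 0; g < num_groups; ++g) samples[g] = GatherSamples(g);  // parallelizable
  Tree tree = LearnTree(samples);  // single shared tree over all samples
  std::vector<GroupBitstream> out(num_groups);
  for (size_t g = 0; g < num_groups; ++g) out[g] = EncodeGroup(g, tree);  // parallelizable
  return out;
}
```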
Laserhosen
2024-11-18 02:51:32
Apparently skcms, and therefore libjxl in its default configuration, can't deal with this ridiculous 100KB BenQ monitor ICC profile I found in a random PNG 😅 .
```
./lib/jxl/cms/jxl_cms.cc:976: JXL_RETURN_IF_ERROR code=1: skcms_Parse(icc_data, icc_size, &profile)
./lib/jxl/cms/color_encoding_cms.h:528: JXL_RETURN_IF_ERROR code=1: cms.set_fields_from_icc(cms.set_fields_data, new_icc.data(), new_icc.size(), &external, &new_cmyk)
./lib/jxl/encode.cc:1140: ICC profile could not be set
```
I wonder if lcms2 fares any better...
_wb_
2024-11-18 03:29:44
a 100 KB icc profile for RGB? haven't seen such a monster before, I thought only CMYK profiles got that crazy
spider-mario
2024-11-18 03:30:32
for monitor profiles, LUTs are common
2024-11-18 03:32:04
for what it's worth, I'm not completely sure we should default to skcms
w
2024-11-18 03:33:40
displaycal creates ICC with lut
spider-mario
2024-11-18 03:34:27
often several LUTs, in fact
2024-11-18 03:34:53
three 1D LUTs for calibration in the GPU (vcgt, “video card gamma table”), and 3D LUT for the actual profile part
w
2024-11-18 03:36:22
lcms2 after all is still the best 🤷
spider-mario
2024-11-18 03:36:42
yeah, skcms is easier to build (just one file) and runs faster, but lcms2 is more feature-complete and possibly more accurate
2024-11-18 03:36:55
where skcms and lcms2 disagree, I would tend to trust lcms2 more
w
2024-11-18 03:37:13
and qcms is nowhere close
Laserhosen
2024-11-18 03:44:22
Yep, RGB, and it seems to work... in the sense that Gwenview displays the PNG differently with/without it.
spider-mario
2024-11-18 03:46:45
I would say that there is a slim but non-zero chance that this is a bug not in skcms but in our usage of it – could you perhaps send a copy of the ICC profile in question?
2024-11-18 03:46:53
(image optional; I can always attach it to another image myself)
Laserhosen
2024-11-18 03:48:26
spider-mario
2024-11-18 03:53:47
thanks!
2024-11-18 03:54:35
yep, LUT profile
```
tag 20: sig 'A2B0' [0x41324230] size 50740 type 'mft2' [0x6d667432]
[…]
tag 22: sig 'B2A0' [0x42324130] size 50740 type 'mft2' [0x6d667432]
```
salrit
2024-11-18 04:48:37
While computing the tree, there are two steps: CollectPixelSamples(), which samples pixels from some kind of distribution (I saw a geometric one in the lossless example I had), and then the step PreQuantizeProperties(). What do these two do?
_wb_
spider-mario for monitor profiles, LUTs are common
2024-11-18 05:56:18
Let me rephrase: I thought ICC profiles you would want to put in a PNG image file wouldn't get that large.
spider-mario
2024-11-18 05:57:14
depending on the circumstances, you might conceivably attach a monitor profile to a screenshot, at least temporarily
2024-11-18 05:57:32
I would usually then convert the image to a more common colorspace, though
_wb_
2024-11-18 05:57:40
Big profiles are useful to describe wonky stuff like the details of a display or a printer, but to describe the colorspace used to represent an image?
2024-11-18 06:01:26
I guess for a screenshot in display space it might make sense, yes. Though even then it feels like unnecessarily exposing details of the screen you happen to use, and converting it to something standard and simple would make more sense to me, even if only to avoid creating a vector for fingerprinting.
2024-11-18 06:01:59
After all such very display specific profiles, especially if the result of calibration, would be pretty unique, no?
spider-mario
2024-11-18 06:33:45
yes
fotis3i
2024-11-18 07:13:29
I have a question regarding Apple's jxl support. I doubt this is a problem of cjxl but thought I'd ask anyway in case somebody here has some insights to share. I am using cjxl to convert HDR png files into jxl files. While both the jxl and png files render properly on iOS18 (the light-sources appear super-bright) , when I click the edit button inside apple photos **some of the photos** will immediately get converted to SDR for editing. Others won't, and they will be HDR-editable. I feel that apple is measuring the number of superbright pixels and decides whether the image should be edited as HDR or as a tone-mapped SDR in a heuristic manner and not by relying on the metadata of the file. Anybody who knows anything about this? I'm looking for a way to enforce HDR editing on apple devices.
CrushedAsian255
I have a question regarding Apple's jxl support. I doubt this is a problem of cjxl but thought I'd ask anyway in case somebody here has some insights to share. I am using cjxl to convert HDR png files into jxl files. While both the jxl and png files render properly on iOS18 (the light-sources appear super-bright) , when I click the edit button inside apple photos **some of the photos** will immediately get converted to SDR for editing. Others won't, and they will be HDR-editable. I feel that apple is measuring the number of superbright pixels and decides whether the image should be edited as HDR or as a tone-mapped SDR in a heuristic manner and not by relying on the metadata of the file. Anybody who knows anything about this? I'm looking for a way to enforce HDR editing on apple devices.
2024-11-18 10:47:09
Can you run jxlinfo -v on the files?
Foxtrot
2024-11-18 11:16:55
This is what you get for not supporting JPEG XL 😄 https://www.reuters.com/technology/doj-ask-judge-force-google-sell-off-chrome-bloomberg-reports-2024-11-18/
CrushedAsian255
Foxtrot This is what you get for not supporting JPEG XL 😄 https://www.reuters.com/technology/doj-ask-judge-force-google-sell-off-chrome-bloomberg-reports-2024-11-18/
2024-11-18 11:27:32
Sell it to Jon Sneyers and the Cloudinary team
HCrikki
2024-11-18 11:38:23
decoupling from google would be the most ideal outcome but only a starting point
2024-11-18 11:39:43
android included an 'aosp' browser long before google withheld updates to what's essentially chromium in order to make it costly for oems to complete aosp on their own using their own or google's suite of apps
2024-11-18 11:40:35
architecturally, decoupling the engine from the browser code would do wonders (for mozilla too).
2024-11-18 11:40:57
engines don't need constant updates to the point where not even entities like ms can keep up and are forced to rebase on google code and ditch their own
2024-11-18 11:44:39
monoculture needs to end but part of how it's enforced is using google's seo/webmaster tools (do this or you won't be visible on google search unless you pay!). web.dev and lighthouse should be removed from google's control too
veluca
2024-11-19 07:58:05
I _probably_ should not be commenting on this, but looking at Mozilla's financials... where would a non-Google-Chrome (or Android) be getting money from? even Mozilla effectively gets most of its money from Google
2024-11-19 07:58:31
(to be clear, I am not necessarily saying that's a good thing. it's just how it is in practice though...)
afed
2024-11-19 08:57:18
google will still give money, but not own <:KekDog:805390049033191445>
_wb_
2024-11-19 12:22:09
I can see how Android could be supported by not just Google but also phone/tablet hardware vendors. In fact I think they should have more incentive to maintain and improve Android than Google does.
2024-11-19 12:30:24
Also I think Mozilla demonstrates that Google has an incentive to subsidize browser development (since basically the better the web platform and the more the web gets used in general, the more profit they make), regardless of whether that development happens in-house or is done by others.
jonnyawsom3
_wb_ I can see how Android could be supported by not just Google but also phone/tablet hardware vendors. In fact I think they should have more incentive to maintain and improve Android than Google does.
2024-11-19 12:53:36
I'm fairly sure they already do, given LTT's video on the stock Android installation and how bad it is
2024-11-19 12:53:51
Not even coming with a phone app pre-installed
TheBigBadBoy - 𝙸𝚛
2024-11-19 12:58:47
LineageOS my beloved
veluca
_wb_ Also I think Mozilla demonstrates that Google has an incentive to subsidize browser development (since basically the better the web platform and the more the web gets used in general, the more profit they make), regardless of whether that development happens in-house or is done by others.
2024-11-19 01:08:32
sure... but does it really change so much if it's Google or people that get ~all their money from Google to use Google as the default search engine?
fotis3i
CrushedAsian255 Can you run jxlinfo -v on the files?
2024-11-19 01:51:50
jxlinfo -v P8159736.jxl returns
```
box: type: "JXL " size: 12, contents size: 4
JPEG XL file format container (ISO/IEC 18181-2)
box: type: "ftyp" size: 20, contents size: 12
box: type: "jxll" size: 9, contents size: 1
box: type: "jxlp" size: 33, contents size: 25
JPEG XL image, 4640x3472, lossy, 16-bit RGB+Alpha
num_color_channels: 3
num_extra_channels: 1
extra channel 0:
  type: Alpha
  bits_per_sample: 16
  alpha_premultiplied: 0 (Non-premultiplied)
intensity_target: 10000.000000 nits
min_nits: 0.000000
relative_to_max_display: 0
linear_below: 0.000000
have_preview: 0
have_animation: 0
Intrinsic dimensions: 4640x3472
Orientation: 1 (Normal)
Color space: RGB, D65, Rec.2100 primaries, PQ transfer function, rendering intent: Relative
box: type: "brob" size: 1125, contents size: 1117
Brotli-compressed xml metadata: 1125 compressed bytes
box: type: "jxlp" size: 11055036, contents size: 11055028
```
2024-11-19 02:08:25
<@386612331288723469> you are onto something: I notice now that the files that are demoted to SDR-on-edit have an alpha channel saved in the file - the ones that edit fine don't! That's a good starting point - now I have to see where in the process this alpha channel gets added! Thank you for helping out
Quackdoc
_wb_ I can see how Android could be supported by not just Google but also phone/tablet hardware vendors. In fact I think they should have more incentive to maintain and improve Android than Google does.
2024-11-19 03:12:19
I think the most likely solution would be to spin aosp out into a separate non-profit
I'm fairly sure they already do, given LTT's video on the stock Android installation and how bad it is
2024-11-19 03:16:21
retarded video, A) that's not even "stock android" as the development GSIs have a different selection of apps B) didn't bother installing 3rd party apps which even normies do
_wb_
veluca sure... but does it really change so much if it's Google or people that get ~all their money from Google to use Google as the default search engine?
2024-11-19 03:28:39
No, if there's still a strong financial dependency then Google does effectively retain control and the unhealthy monopolistic situation more or less remains. But at least it would change things a little (from direct control to indirect control), and one could hope that other stakeholders would show up to diversify the funding of such a non-Google-Chrome. Another option imo could be that it becomes a non-profit organization funded by a fund that gets created by a one-time big forced donation from Google ordered by some judge. That way the non-profit can still have a steady income (say it's a $10b fund then it would generate ~$400m/year at current interest rates), which would not be enough to re-hire the entire current Chrome team but it would certainly be enough to establish a decent independent foundation for the project...
Oleksii Matiash
Quackdoc I think most likely solution would be to spin aosp out into a seperate non profit
2024-11-19 04:12:30
It will immediately lag behind Apple
Quackdoc
Oleksii Matiash It will immediately lag behind Apple
2024-11-19 04:15:16
not really, AOSP is already an entirely open source software stack. and with chromeOS eventually migrating to AOSP base instead of a gentoo base google has large incentive to keep funding it
Oleksii Matiash
Quackdoc not really, AOSP is already an entirely open source software stack. and with chromeOS eventually migrating to AOSP base instead of a gentoo base google has large incentive to keep funding it
2024-11-19 04:20:34
It is open source, but funding and management should be held by Google or another large company interested in the project; otherwise it will become non-competitive very soon
Quackdoc
2024-11-19 04:21:04
I don't see in what ways this would happen
HCrikki
veluca I _probably_ should not be commenting on this, but looking at Mozilla's financials... where would a non-Google-Chrome (or Android) be getting money from? even Mozilla effectively gets most of its money from Google
2024-11-19 05:07:04
revenue share and default search are 2 different things mozilla sold and have no reason to be lumped together in a spreadsheet other than to confuse folks. MS in the past was willing to pay the same amount as google to be default (around 200mo - this is supposed to always be paid upfront regardless of whether users keep your SE default) and almost match the remaining payout for performance (to be paid after cycle - even if people switch from bing mozilla would still get paid by google for google searches)
2024-11-19 05:08:13
the preservation of that exact amount shouldn't be the priority either way since it's a Faustian pact weakening mozilla further
2024-11-19 05:10:09
the solution is for mozilla to stop putting so many people on its own payroll. take the linux kernel or libreoffice: every contributor is paid by their own company/job and so few are on payroll that donations and sponsorship deals suffice to cover all expenses
2024-11-19 05:11:38
for firefox, aosp and chromium, gatekeeping only imposes costs the gatekeepers happily accepted to impose on themselves in order to gain and keep control - practical development considerations didn't
2024-11-19 05:16:31
on efficiency, did you know every single minor firefox version has over 1400 separate binaries built (including several for each language), even nightlies, and mirroring seems to require archiving even the oldest alphas (iinm over 150,000 binaries with 0 use)
2024-11-19 05:18:16
foss projects should stop crippling themselves with non-essential expenses and inefficient workflows with high upkeep costs
Quackdoc
2024-11-19 11:51:06
<@288069412857315328> grass.moe is your site right? the JXL images seem to be down, do you know of other sites with a simple "jxl" gallery? testing servo right now.
2024-11-19 11:51:25
Looking for a site to actually stress it
2024-11-20 12:45:13
nvm I had a local copy, but lmao servo + jxl-oxide performs better than waterfox + libjxl because servo isn't crushing my PC
2024-11-20 01:25:44
https://files.catbox.moe/4a8yln.mp4 https://files.catbox.moe/151xdc.mp4
2024-11-20 01:26:08
waterfox crashed
jonnyawsom3
2024-11-20 02:34:11
Waterfox uses a huge amount of memory during decode from what I can tell, and scaling the image multiplies it
Quackdoc
2024-11-20 02:50:34
servo chad
_wb_
2024-11-20 08:52:41
https://discord.com/channels/794206087879852103/809126648816336917/1308893568949161985
2024-11-20 08:52:47
For lossless or lossy?
2024-11-20 09:00:21
For lossless the worst is noise. For lossy typically the worst is thin high-contrast edges.
spider-mario
2024-11-21 11:32:54
https://chaos.social/@mattgrayyes/113520373478295532
HCrikki
2024-11-23 10:32:32
idk if mentioned lately but it seems new libjxl nightlies are publicly accessible again here after 2 months of no builds https://artifacts.lucaversari.it/libjxl/libjxl/latest/
A homosapien
2024-11-23 10:39:16
<:JXL:805850130203934781> <:Stonks:806137886726553651>
veluca
HCrikki idk if mentioned lately but it seems new libjxl nightlies are publicly accessible again here after 2 months of no builds https://artifacts.lucaversari.it/libjxl/libjxl/latest/
2024-11-23 11:23:08
yeah, debian build was broken...
DZgas Ж
2024-11-24 07:59:34
wow, not one of the new ffmpeg builds is working for me <:PepeGlasses:878298516965982308>
2024-11-24 08:02:06
ffmpeg 7.1: is AVX hardcoded now?
2024-11-24 08:06:25
ffmpeg-7.0.2 works. But none of the new 7.1 or GIT versions run for me
salrit
2024-11-25 08:33:44
Before calculating the 'Gather Tree Data' step in the JXL code, there seems to be a step where only a fraction of pixels is chosen using some kind of distribution sampling... for Modular encoding in the lossless case. What does this step mean? And while gathering the tree data, it seems that only that fraction of pixels' residuals is added to the tree_samples; why is that?
2024-11-25 08:35:52
I am using higher efforts here btw...
_wb_
2024-11-26 02:18:48
mostly to save memory: the memory needed to store those samples for tree learning is proportional to the fraction, while usually even when sampling only 10% or so, the result is not hugely different from sampling 100%...
CrushedAsian255
_wb_ mostly to save memory: the memory needed to store those samples for tree learning is proportional to the fraction, while usually even when sampling only 10% or so, the result is not hugely different from sampling 100%...
2024-11-26 02:46:02
`-I` controls that correct?
_wb_
2024-11-26 02:50:49
yes
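An illustrative sketch of sampling only a fraction of pixels for tree learning, roughly what `-I` controls; a plain Bernoulli sample is used here, whereas the encoder has its own sampling scheme:
```cpp
#include <cstdint>
#include <random>
#include <vector>

// Illustrative only (not the libjxl sampling code): keep roughly `fraction`
// of the pixels as samples for MA-tree learning, so memory use scales with
// the fraction rather than with the whole image.
std::vector<int32_t> SamplePixels(const std::vector<int32_t>& pixels,
                                  double fraction, uint64_t seed = 0) {
  std::mt19937_64 rng(seed);
  std::bernoulli_distribution keep(fraction);
  std::vector<int32_t> samples;
  samples.reserve(static_cast<size_t>(pixels.size() * fraction));
  for (int32_t p : pixels) {
    if (keep(rng)) samples.push_back(p);  // only ~fraction of pixels are kept
  }
  return samples;
}
```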
salrit
_wb_ mostly to save memory: the memory needed to store those samples for tree learning is proportional to the fraction, while usually even when sampling only 10% or so, the result is not hugely different from sampling 100%...
2024-11-26 04:54:40
Thanks! I got a bit confused—where exactly are the residuals computed? I noticed some kind of prediction errors being generated during the "Gather-Tree-Data" step, but where does the residue generation actually happen? From what I see, the steps are ComputeTree -> ComputeToken -> EncodeGroup, so it’s likely happening before EncodeGroup, but I couldn’t pinpoint the exact location in the code.
_wb_
2024-11-26 05:05:01
the residuals depend on the predictor which depends on the tree, so iirc first samples are gathered to create a tree, then the tree is used to do the actual encoding
2024-11-26 05:06:24
The residuals are computed here: https://github.com/libjxl/libjxl/blob/main/lib/jxl/modular/encoding/enc_encoding.cc#L530
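A simplified view of that step, assuming a clamped-gradient predictor as the example (the exact clamping in the spec may differ); the point is that the tree leaf picks the predictor and the residual is what gets tokenized and entropy coded:
```cpp
#include <algorithm>
#include <cstdint>

// Illustrative sketch, not the linked libjxl code: the MA-tree leaf selects a
// predictor for each pixel, and the residual (actual - predicted) is what
// gets tokenized. A clamped-gradient predictor is shown as one example.
int32_t ClampedGradient(int32_t left, int32_t top, int32_t topleft) {
  int32_t g = left + top - topleft;
  int32_t lo = std::min({left, top, topleft});
  int32_t hi = std::max({left, top, topleft});
  return std::clamp(g, lo, hi);
}

int32_t Residual(int32_t actual, int32_t left, int32_t top, int32_t topleft) {
  return actual - ClampedGradient(left, top, topleft);
}
```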
salrit
2024-11-26 05:07:19
thnx
jonnyawsom3
2024-11-26 07:06:58
Hear me out, what if we used him to test gradients
salrit
_wb_ The residuals are computed here: https://github.com/libjxl/libjxl/blob/main/lib/jxl/modular/encoding/enc_encoding.cc#L530
2024-11-27 10:14:13
While setting the properties here for predictors: https://github.com/libjxl/libjxl/blob/main/lib/jxl/enc_modular.cc#L1143 Suppose the prediction scheme is kBest (as I have higher efforts), so the chosen predictors are Weighted and Gradient (https://github.com/libjxl/libjxl/blob/main/lib/jxl/modular/encoding/enc_ma.cc#L526). When gathering data for the tree, while iterating through the pixels in a section, either PredictLearnAll or PredictLearnAllNEC is called. Inside these functions, all the predictors are iterated over. Why is it that in the earlier stage only two predictors are selected, but now all predictors are being considered?
_wb_
2024-11-27 11:20:31
I suppose there is some opportunity for further specialization to make a variant of that function that doesn't populate all predictors but just Weighted and Gradient. As it is now, we have a templated very generic `Predict` function that populates the various properties and produces either one prediction or all predictions (the latter only used for encoding), with some specialized instances mostly for faster decoding.
spider-mario
2024-12-01 12:58:39
> In what could be a wonderful holiday for the Linux desktop, it looks like the Wayland color management protocol might finally be close to merging after four years in discussion.
Quackdoc
spider-mario > In what could be a wonderful holiday for the Linux desktop, it looks like the Wayland color management protocol might finally be close to merging after four years in discussion.
2024-12-01 01:00:51
it also looks pretty good, nothing blatantly wrong as far as I can tell. and it doesn't push dumb assumptions onto compositors either
DZgas Ж
2024-12-04 09:34:56
<@853026420792360980>
Traneptora
2024-12-04 09:35:20
yes?
DZgas Ж
Traneptora yes?
2024-12-04 09:38:03
the error occurs when converting a video to png. In my case, av1 video
2024-12-04 09:38:50
video sample
2024-12-04 09:39:27
```ffmpeg.exe -i "C:\Users\a\Desktop\THE AMAZING DIGITAL CIRCUS: PILOT [HwAPLk_sQ3w].mkv" -vf scale=128:-1:flags=spline "C:\Users\a\Desktop\THE AMAZING DIGITAL CIRCUS: PILOT [HwAPLk_sQ3w].png"```
2024-12-04 09:39:45
Traneptora
2024-12-04 09:40:46
DZgas Ж
2024-12-04 09:41:14
yep
2024-12-04 09:41:24
You did it.
Traneptora
2024-12-04 09:42:16
Your original content is subsampled
2024-12-04 09:42:26
what did you expect to happen
2024-12-04 09:42:27
it's 4:2:0
DZgas Ж
Traneptora Your original content is subsampled
2024-12-04 09:42:59
and what?
Traneptora
2024-12-04 09:43:07
and your original content is subsampled
2024-12-04 09:43:12
what did you expect to happen
DZgas Ж
2024-12-04 09:43:31
Am I making a yuv420 in png? What are you saying
Traneptora
2024-12-04 09:43:43
I'm saying you're starting with content that's 4:2:0 subsampled
2024-12-04 09:43:51
you shouldn't expect anything different to happen
2024-12-04 09:43:57
PNGs aren't subsampled
2024-12-04 09:44:07
but upscaling the chroma doesn't change anything
2024-12-04 09:44:14
you're going to have less data than you started with
2024-12-04 09:44:36
the only way to convert yuv420p into RGB is to upscale it to yuv444p first, and then convert that to RGB
2024-12-04 09:44:43
literally nothing to do with the PNG encoder
DZgas Ж
Traneptora I'm saying you're starting with content that's 4:2:0 subsampled
2024-12-04 09:44:52
ffmpeg doesn't work the way you say it does. the size reduction occurs in the rgb domain and not in 420, and then this turns into the RGB output png sample
Traneptora
2024-12-04 09:44:52
by the time the PNG encoder sees anything, it's already RGB
2024-12-04 09:45:03
that's definitely not true
2024-12-04 09:45:08
you actually have no clue how it works
2024-12-04 09:45:45
what's happening here is that 4:2:0 is being upsampled to 4:4:4, then to RGB, then it's being downscaled
2024-12-04 09:45:54
literally nothing to do with PNG encoder
DZgas Ж
Traneptora literally nothing to do with PNG encoder
2024-12-04 09:46:43
I literally don't understand the meaning of your text. Ffmpeg is broken and gives the wrong file. What do you say?
Traneptora
2024-12-04 09:46:58
there's nothing broken and the file isn't wrong
2024-12-04 09:47:10
I'm using English, if you don't understand me I suggest you learn what these terms mean
2024-12-04 09:47:27
rather than just knee-jerk accuse the software of being broken because you don't understand how anything works
2024-12-04 09:47:40
You can't convert yuv420p into RGB without upscaling the chroma to 4:4:4 first
2024-12-04 09:47:44
because RGB cannot be subsampled
2024-12-04 09:48:30
since the PNG encoder doesn't declare any yuv formats as supported, FFmpeg automatically converts it from yuv444p into RGB using the negotiation method that libavfilter provides
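An illustrative sketch of the chroma step being described, assuming nearest-neighbour upsampling of one half-resolution chroma plane (not FFmpeg/swscale code):
```cpp
#include <cstdint>
#include <vector>

// Illustrative only: before any RGB conversion, the half-resolution chroma
// planes of 4:2:0 have to be brought back to full resolution. Nearest
// neighbour is used here, which is roughly what a basic swscale path does.
std::vector<uint8_t> UpsampleChromaNearest(const std::vector<uint8_t>& chroma,
                                           int cw, int ch) {
  // chroma is cw x ch (half width, half height); output is (2*cw) x (2*ch).
  std::vector<uint8_t> full(4 * cw * ch);
  for (int y = 0; y < 2 * ch; ++y) {
    for (int x = 0; x < 2 * cw; ++x) {
      full[y * 2 * cw + x] = chroma[(y / 2) * cw + (x / 2)];
    }
  }
  return full;  // full-resolution chroma; YUV->RGB conversion happens after this
}
```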
DZgas Ж
Traneptora there's nothing broken and the file isn't wrong
2024-12-04 09:49:04
> file isn't wrong wtf
Traneptora
2024-12-04 09:49:07
If you want more detail about what's going on under the hood, run ffmpeg with `-v debug`
DZgas Ж > file isn't wrong wtf
2024-12-04 09:49:27
it isn't wrong. You started with subsampled content. You ended with subsampled content. This should not surprise you.
2024-12-04 09:51:14
If you think libswscale is scaling it wrong, try using libplacebo
spider-mario
2024-12-04 09:51:41
is it using nearest-neighbour to upsample the chroma?
Traneptora
2024-12-04 09:52:43
most likely, yes. swscale isn't very sophisticated, but it's not incorrect
DZgas Ж
Traneptora If you think libswscale is scaling it wrong, try using libplacebo
2024-12-04 09:52:58
ok, write a command to scale
Traneptora
DZgas Ж ok, write a command to scale
2024-12-04 09:53:52
``` libplacebo=w=128:h=-1:format=gbrp ```
2024-12-04 09:54:00
plus everything else
2024-12-04 09:55:08
``` ffmpeg -y -i "THE AMAZING DIGITAL CIRCUS: PILOT [HwAPLk_sQ3w].mkv" -vf "libplacebo=w=128:h=-1:format=gbrp,crop=128:64" -frames:v 1 -update 1 OUT2.png ```
2024-12-04 09:55:44
upscaler defaults to spline36, downscaler defaults to mitchell
2024-12-04 09:55:52
you can, of course, change it with -upscaler and -downscaler
DZgas Ж
Traneptora it isn't wrong. You started with subsampled content. You ended with subsampled content. This should not surprise you.
2024-12-04 09:55:57
No. I still don't understand the meaning of your words. I'm doing a job and the job doesn't match what it should be. and you don't offer a solution -- on my part, are you saying that: it's Broken? Not really. That's how it should be (broken)
2024-12-04 09:56:00
and blurry and same error
2024-12-04 09:56:06
libplacebo
Traneptora
2024-12-04 09:56:20
No, I'm saying that it's not broken
2024-12-04 09:56:26
You're starting with subsampled content
2024-12-04 09:56:29
you can't create data that isn't there
2024-12-04 09:56:36
no matter what you do it's going to look subsampled
DZgas Ж
Traneptora No, I'm saying that it's not broken
2024-12-04 09:56:39
not ffmpeg is broken
Traneptora
2024-12-04 09:56:49
this is so fucking pointless
2024-12-04 09:57:00
you're too stubborn to read and too stubborn to learn what the terms mean
DZgas Ж
Traneptora this is so fucking pointless
2024-12-04 09:57:26
you are not offering a solution. your command outputs the same 422
Traneptora
2024-12-04 09:57:35
it's not 4:2:2
2024-12-04 09:57:36
and no, it's not
2024-12-04 09:57:42
it looks qualitatively different
2024-12-04 09:57:47
are you looking at `OUT.png` again?
2024-12-04 09:58:01
because this writes to `OUT2.png`
spider-mario
2024-12-04 09:58:07
the input is quite high-resolution, though; it is a bit surprising that it ends up so low-quality even downsampled to this extent
DZgas Ж
Traneptora it's not 4:2:2
2024-12-04 09:58:11
It's definitely 422
spider-mario
2024-12-04 09:58:18
almost as if it were subsampling the chroma again after downsizing
2024-12-04 09:58:45
maybe forcing a conversion to yuv444p10le before the scale filter would help?
Traneptora
spider-mario maybe forcing a conversion to yuv444p10le before the scale filter would help?
2024-12-04 09:58:57
no, using libplacebo is enough to get what he wants
2024-12-04 09:59:00
idk why he's still whining
spider-mario
2024-12-04 09:59:34
ah, indeed it does
2024-12-04 09:59:43
DZgas Ж
Traneptora idk why he's still whining
2024-12-04 09:59:57
the problem has not been solved. I'm not getting RGB. the software does not work
Traneptora
2024-12-04 10:00:04
You are saving it as a PNG
2024-12-04 10:00:06
you are getting RGB
2024-12-04 10:00:24
if you forcibly convert it back to yuv444p and do a reinterpret-cast to RGB so you can channel-decompose it, this is what you get
2024-12-04 10:00:28
U channel
2024-12-04 10:00:33
V channel
DZgas Ж
Traneptora you are getting RGB
2024-12-04 10:00:33
this is the 422 that was converted to rgb
Traneptora
2024-12-04 10:00:42
Y channel
2024-12-04 10:00:46
there's no 4:2:2 anywhere here
DZgas Ж
2024-12-04 10:00:59
where this pic
Traneptora
2024-12-04 10:01:14
I generated it with ``` ffmpeg -y -i "THE AMAZING DIGITAL CIRCUS: PILOT [HwAPLk_sQ3w].mkv" -vf "libplacebo=w=128:h=-1:format=gbrp,crop=128:64" -frames:v 1 -update 1 OUT2.png ```
2024-12-04 10:01:28
upscaled it with nearest neighbor
2024-12-04 10:02:13
I upscaled it to nearest neighbor using libplacebo, again ``` ffmpeg -y -i OUT2.png -vf libplacebo=w=1024:h=512:upscaler=nearest:format=gbrp test.png ```
2024-12-04 10:03:08
then used ``` ffmpeg -i test.png -vf libplacebo=colorspace=bt709:format=yuv444p -f rawvideo - | ffmpeg -f rawvideo -video_size 1024x512 -pixel_format gbrp -i - -y test2.png ``` to convert it back into yuv444p. The second ffmpeg command is to reinterpet it as to allow me to open it in GIMP, and decompose the channels.
DZgas Ж
Traneptora I generated it with ``` ffmpeg -y -i "THE AMAZING DIGITAL CIRCUS: PILOT [HwAPLk_sQ3w].mkv" -vf "libplacebo=w=128:h=-1:format=gbrp,crop=128:64" -frames:v 1 -update 1 OUT2.png ```
2024-12-04 10:04:33
Hm... ffmpeg crashed
Traneptora
2024-12-04 10:05:23
do you have vulkan?
2024-12-04 10:05:36
if you don't, you can always use zscale instead as well
spider-mario
2024-12-04 10:05:44
if using `-vf scale` instead of libplacebo, it does seem to help to insert a forced conversion to yuv444p12 first
2024-12-04 10:06:11
Traneptora
2024-12-04 10:06:13
``` ffmpeg -y -i "THE AMAZING DIGITAL CIRCUS: PILOT [HwAPLk_sQ3w].mkv" -vf "zscale=w=128:h=-1:f=spline36,format=gbrp,crop=128:64" -frames:v 1 -update 1 OUT2.png ```
2024-12-04 10:06:20
you can use zscale as well to make it use zimg
DZgas Ж
Traneptora do you have vulkan?
2024-12-04 10:06:43
Actually, yes. but it doesn't work
Traneptora
2024-12-04 10:07:00
if you have a broken vulkan setup then use zscale instead
2024-12-04 10:07:02
which is all cpu
DZgas Ж
2024-12-04 10:07:50
zscale works good
2024-12-04 10:08:09
standard scale does 422
Traneptora
2024-12-04 10:08:19
there's no 4:2:2 anywhere, and I keep trying to tell you this
2024-12-04 10:08:32
swscale is not sophisticated. it's upsampling 4:2:0 to 4:4:4 before it converts to RGB, but it does so using nearest-neighbor
2024-12-04 10:08:53
4:2:2 isn't anywhere in the pipeline at all at any point
DZgas Ж
Traneptora there's no 4:2:2 anywhere, and I keep trying to tell you this
2024-12-04 10:09:09
you can't convert a yuv420 video to rgb24 and say it's rgb. It's still a bullshit yuv420
Traneptora
2024-12-04 10:09:39
in terms of the data available, yes
DZgas Ж
2024-12-04 10:10:02
<:This:805404376658739230>
Traneptora
2024-12-04 10:10:12
I did say that
DZgas Ж
2024-12-04 10:10:45
I have never used alternative scaling methods. now I will know about zscale
2024-12-04 10:11:03
At least it's working.
Traneptora
2024-12-04 10:11:22
libswscale is correct, just not a high quality scaler
2024-12-04 10:11:26
which is what you were experiencing
2024-12-04 10:11:28
nothing was broken
DZgas Ж
Traneptora libswscale is correct, just not a high quality scaler
2024-12-04 10:13:14
Yes, this is complete nonsense. png to png is perfectly converted and video to JPEG as yuvj444p in my code is also perfectly converted, so just ffmpeg Standard scale lib is broken, since the scale replacement fixed it
Traneptora nothing was broken
2024-12-04 10:13:23
Actually
Traneptora
2024-12-04 10:13:33
you know what, I don't care anymore
2024-12-04 10:13:49
this is like talking to a brick wall
DZgas Ж
Traneptora this is like talking to a brick wall
2024-12-04 10:14:40
Of course, because I'm right and you're wrong <:PirateCat:992960743169347644>
2024-12-04 10:15:45
and the answer is literally: libswscale is broken so use zscale
A homosapien
2024-12-04 10:21:27
libswscale isn't broken, you're just not interpolating the chroma, you have to pass `-sws_flags +accurate_rnd+full_chroma_int`
2024-12-04 10:21:42
I *do* think it's dumb that it's not turned on by default
DZgas Ж
A homosapien libswscale isn't broken, your just not interpolating the chroma, you have to pass `-sws_flags +accurate_rnd+full_chroma_int`
2024-12-04 10:23:47
THIS SHIT
2024-12-04 10:23:55
<:This:805404376658739230>
2024-12-04 10:24:15
<@853026420792360980>this
2024-12-04 10:24:35
And there are no problems now
A homosapien libswscale isn't broken, your just not interpolating the chroma, you have to pass `-sws_flags +accurate_rnd+full_chroma_int`
2024-12-04 10:26:10
at that moment, I did not understand the point of this option; PNG to PNG scaling gave identical results, but it turns out that's what it does
spider-mario
2024-12-04 10:28:23
what does “full” chroma interpolation mean?
2024-12-04 10:28:29
what happens when it’s not full?
DZgas Ж
2024-12-04 10:29:00
flags=spline+full_chroma_inp+accurate_rnd+full_chroma_int 👍 great
spider-mario
2024-12-04 10:29:50
oh, it seems to be the equivalent of my “pre-converting to 4:4:4”?
DZgas Ж
spider-mario what happens when it’s not full?
2024-12-04 10:29:56
as can be seen from our argument above: when converting a video frame to PNG, you actually get YUV422 output written as RGB
DZgas Ж It's definitely 422
2024-12-04 10:30:41
above
A homosapien I *do* think its dumb that it's not turned on by default
2024-12-04 10:32:43
I agree
A homosapien
spider-mario oh, it seems to be the equivalent of my “pre-converting to 4:4:4”?
2024-12-04 11:32:29
It upscales the 420 or 422 up to 444. I encountered this while I was making a wiki, extracting frames from videos. I was confused why the chroma looked really blocky. I found the solution in this cool beginner's guide for converting YUV to RGB and vice versa. https://trac.ffmpeg.org/wiki/colorspace
2024-12-04 11:38:59
Nowadays I use libplacebo because I want gamma correct scaling filters.
jonnyawsom3
2024-12-04 11:50:00
So... If I'm using `-vf scale=480:-2` to downscale, I should add `:flags=full_chroma_int`?
A homosapien
2024-12-04 11:58:27
Always, I recommend `spline+full_chroma_int+accurate_rnd`
2024-12-04 11:58:49
The default settings are tuned for absolute speed rather than accuracy
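For the `-vf scale=480:-2` case above, a sketch with those flags written per-filter instead of globally (filenames are placeholders):
```
ffmpeg -i input.mp4 -vf "scale=480:-2:flags=spline+full_chroma_int+accurate_rnd" output.mp4
```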
Traneptora
2024-12-04 11:59:07
+bitexact for cross-platform reproducibility
A homosapien
2024-12-05 12:01:08
Yeah, if you want to compare file hashes between outputs it's `+bitexact -map_metadata -1 -map_chapters -1`
2024-12-05 12:01:25
I think
Traneptora
2024-12-05 12:01:32
nah for that use -c rawvideo -f hash
2024-12-05 12:01:49
or -f framecrc
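A sketch of both forms with a placeholder input; each decodes the video to rawvideo and prints checksums to stdout instead of writing a file (the first gives one hash for the whole stream, the second one line per frame):
```
ffmpeg -i input.mkv -map 0:v -c:v rawvideo -f hash -
ffmpeg -i input.mkv -map 0:v -c:v rawvideo -f framecrc -
```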
A homosapien
2024-12-05 12:02:35
oh nice, there are so many cool shortcuts with ffmpeg
2024-12-05 12:03:17
the learning curve is hard but fun
So... If I'm using `-vf scale=480:-2` to downscale, I should add `:flags=full_chroma_int`?
2024-12-05 12:04:07
Here is a cool guide https://academysoftwarefoundation.github.io/EncodingGuidelines/EncodeSwsScale.html#sws_flags-options
jonnyawsom3
A homosapien Always, I recommend `spline+full_chroma_int+accurate_rnd`
2024-12-05 12:04:52
Says there that accurate_rnd doesn't do much anymore since 5.0, and I've generally been using Lanczos
A homosapien
2024-12-05 12:08:36
Yeah the main one is `+full_chroma_int`, always good to have that one
jonnyawsom3
2024-12-05 12:09:09
Also, after spending a few hours making a script for a friend, I found out `:force_original_aspect_ratio=decrease:force_divisible_by=2` exists, which saved me from some ChatGPT nightmare fuel (Trying to scale to a max of 1280, but only by multiples of 2 to maintain pixel art)
2024-12-05 12:09:43
Still doesn't do the multiple of 2 thing, but looked good enough :P
A homosapien
2024-12-05 12:18:52
Cool
jonnyawsom3
2024-12-05 12:48:33
Just spent 20 minutes trying to figure out why my files were identical with and without it.... I had typed inp not int
Traneptora
Also, after spending a few hours making a script for a friend, I found out `:force_original_aspect_ratio=decrease:force_divisible_by=2` exists, which saved me from some ChatGPT nightmare fuel (Trying to scale to a max of 1280, but only by multiples of 2 to maintain pixel art)
2024-12-05 12:50:37
what do you mean by maintain pixel art?
jonnyawsom3
2024-12-05 12:50:59
Stay in 2x, 4x, 8x scaling so as not to stretch the pixels
Traneptora
2024-12-05 12:51:04
`-vf libplacebo=upscaler=nearest` is good for that
2024-12-05 12:51:12
just forces nearest neighbor upsampling
2024-12-05 12:51:34
works as well if it's not a power of 2
2024-12-05 12:51:42
you could do 64x64 -> 384x384 for ex
jonnyawsom3
2024-12-05 12:52:05
Yeah, I have it set to nearest, but wasn't sure if having non-multiple scaling would still skew it at all. Turned out alright though
2024-12-05 12:52:13
Can't think of the right wording....
Traneptora
2024-12-05 12:52:23
afaiu the scale factor has to be an integer but doesn't have to be a power of 2
2024-12-05 12:52:39
so like, if you upscale by 3, each pixel becomes a 3x3 rectangle
2024-12-05 12:52:49
but otherwise you get to keep hard edges
jonnyawsom3
2024-12-05 12:53:07
A multiple of the original dimensions, but without going over 1280. Which just isn't possible in a single command as far as I can tell, so `force_original_aspect` is as good as I could get
Traneptora
2024-12-05 12:53:26
you can use an expression for that iirc
2024-12-05 12:55:04
for example, uh
A homosapien
2024-12-05 12:55:04
like `libplacebo=w=iw*2:h=ih*2`?
Traneptora
2024-12-05 12:55:06
``` ffmpeg -i rose.png -vf 'libplacebo=w=floor(1280/iw)*iw:h=-1:upscaler=nearest' foo.png ```
2024-12-05 12:55:24
this resamples rose.png from 70x46 into 1260x828
2024-12-05 12:55:40
which is a scale factor of 18
2024-12-05 12:56:14
this is what you get
2024-12-05 12:56:23
(open in browser for max res)
jonnyawsom3
2024-12-05 12:56:39
Main issue I had was having it handle both portrait and landscape
Traneptora
2024-12-05 12:57:18
`force_original_aspect_ratio=decrease` works as well
2024-12-05 12:57:56
e.g. ``` ffmpeg -i rose.png -vf 'libplacebo=w=floor(1280/iw)*iw:h=floor(1280/ih)*ih:force_original_aspect_ratio=decrease:upscaler=nearest' foo.png ```
2024-12-05 12:58:11
you also have `force_divisible_by` as well
2024-12-05 12:58:13
if you need that
2024-12-05 12:58:36
run `ffmpeg -h filter=libplacebo` for a full list of supported filter options
2024-12-05 12:59:09
for example, libplacebo can crop, but by default it doesn't
jonnyawsom3
Traneptora e.g. ``` ffmpeg -i rose.png -vf 'libplacebo=w=floor(1280/iw)*iw:h=floor(1280/ih)*ih:force_original_aspect_ratio=decrease:upscaler=nearest' foo.png ```
2024-12-05 01:05:28
That's doing... Something, but it seems to just give up
```
ffmpeg -hide_banner -i Test.png -vf "libplacebo=w=floor(1280/iw)*iw:h=floor(1280/ih)*ih:force_original_aspect_ratio=decrease:force_divisible_by=2:upscaler=nearest" -y Telegram.png
Input #0, png_pipe, from 'Test.png':
  Duration: N/A, bitrate: N/A
  Stream #0:0: Video: png, pal8(pc, gbr/unknown/unknown), 190x190 [SAR 2834:2834 DAR 1:1], 25 fps, 25 tbr, 25 tbn
```
Traneptora
That's doing... Something, but it seems to just give up ```ffmpeg -hide_banner -i Test.png -vf "libplacebo=w=floor(1280/iw)*iw:h=floor(1280/ih)*ih:force_original_aspect_ratio=decrease:force_divisible_by=2:upscaler=nearest" -y Telegram.png Input #0, png_pipe, from 'Test.png': Duration: N/A, bitrate: N/A Stream #0:0: Video: png, pal8(pc, gbr/unknown/unknown), 190x190 [SAR 2834:2834 DAR 1:1], 25 fps, 25 tbr, 25 tbn ```
2024-12-05 01:06:07
wdym "give up"
jonnyawsom3
2024-12-05 01:06:15
As in that's the entire output
Traneptora
2024-12-05 01:06:21
that's not right
As in that's the entire output
2024-12-05 01:07:10
you sure? what happens if you instead run ffmpeg without `-hide_banner` and run it with `-v debug`
2024-12-05 01:07:52
you should see a bunch of crap
jonnyawsom3
2024-12-05 01:08:01
Traneptora
2024-12-05 01:08:39
looks like it's crashing upon creating the vulkan instance
2024-12-05 01:11:03
I messaged haasn about it
jonnyawsom3
2024-12-05 01:11:19
Thanks
Traneptora
2024-12-05 01:13:41
theoretically you can use: ``` ffmpeg -i rose.png -vf "scale=w=floor(1280/iw)*iw:h=floor(1280/ih)*ih:force_original_aspect_ratio=decrease:force_divisible_by=2:flags=neighbor+bitexact+accurate_rnd" foo.png ```
2024-12-05 01:14:13
at least until the libplacebo wrapper bug gets fixed
A homosapien
That's doing... Something, but it seems to just give up ```ffmpeg -hide_banner -i Test.png -vf "libplacebo=w=floor(1280/iw)*iw:h=floor(1280/ih)*ih:force_original_aspect_ratio=decrease:force_divisible_by=2:upscaler=nearest" -y Telegram.png Input #0, png_pipe, from 'Test.png': Duration: N/A, bitrate: N/A Stream #0:0: Video: png, pal8(pc, gbr/unknown/unknown), 190x190 [SAR 2834:2834 DAR 1:1], 25 fps, 25 tbr, 25 tbn ```
2024-12-05 01:16:59
Add `-hwaccel vulkan`
2024-12-05 01:17:16
I ran into the exact same issue
jonnyawsom3
2024-12-05 01:20:13
Huh... That's not right
A homosapien
2024-12-05 01:25:54
Ohh
2024-12-05 01:26:00
Turn off dithering
2024-12-05 01:26:10
Blue noise on by default
2024-12-05 01:26:50
https://gist.github.com/nico-lab/0825b680ff48cad1699edb095daf8cbd
jonnyawsom3
2024-12-05 01:27:00
Didn't know it had blue noise ~~my beloved~~, but still not right
2024-12-05 01:40:54
Works fine with mp4 output so good enough for me
Traneptora
Didn't know it had blue noise ~~my beloved~~, but still not right
2024-12-05 02:07:24
2024-12-05 02:07:52
what problem are you experiencing?
2024-12-05 02:08:07
``` ffmpeg -i image.png -vf "libplacebo=w=floor(1280/iw)*iw:h=floor(1280/ih)*ih:force_original_aspect_ratio=decrease:force_divisible_by=2:upscaler=nearest:dithering=-1:format=gbrp" image2.png ```
2024-12-05 02:08:09
this is what I ran
2024-12-05 02:08:19
it didn't make everything blue and oversaturated
jonnyawsom3
2024-12-05 02:09:56
I didn't have the format set, so I think it was re-dithering/re-paletting to PAL8
Traneptora
2024-12-05 02:24:31
dithering won't happen by default on upscale anyway
2024-12-05 02:24:51
it's a thing that happens with bit depth reduction
jonnyawsom3
Didn't know it had blue noise ~~my beloved~~, but still not right
2024-12-05 02:57:28
Huh... Wonder what happened there then `ffmpeg -hide_banner -i Test.png -vf "scale=w=floor(1280/iw)*iw:h=floor(1280/ih)*ih:force_original_aspect_ratio=decrease:force_divisible_by=2:flags=neighbor" -y Telegram.png`
A homosapien
2024-12-05 03:02:12
idk what happened, I didn't get the color shift you're getting
jonnyawsom3
2024-12-05 05:29:58
Maybe a bug since mine was a Git build? Gimme a sec...
A homosapien Add `-hwaccel vulkan`
2024-12-05 05:42:34
Didn't work by the way
2024-12-05 05:42:54
But nope, either it's dithered or it's having the colors messed up when outputting to PNG
2024-12-05 05:43:08
`ffmpeg -hide_banner -i Test.png -vf "scale=w=floor(1280/iw)*iw:h=floor(1280/ih)*ih:force_original_aspect_ratio=decrease:force_divisible_by=2:flags=neighbor:sws_dither=0" -y Telegram.png`
A homosapien
`ffmpeg -hide_banner -i Test.png -vf "scale=w=floor(1280/iw)*iw:h=floor(1280/ih)*ih:force_original_aspect_ratio=decrease:force_divisible_by=2:flags=neighbor:sws_dither=0" -y Telegram.png`
2024-12-05 07:18:52
~~I tried this exact command and this is what I got, your ffmpeg is cursed ngl~~
2024-12-05 07:20:45
wait the input png was rgb24 instead of 8 bit paletted
2024-12-05 07:20:52
With an 8-bit palette input I got your result
2024-12-05 08:24:57
adding `-pix_fmt rgb24` fixes the issue
jonnyawsom3
2024-12-05 06:28:52
Another day, another command addition needed. Turns out the input my friend has is a PNG sequence with variable sizes. Pretty sure I'd need to load all the files, find the largest, and then insert the frames into the larger canvas
2024-12-05 06:30:21
This is around 14,000 PNGs, as a timelapse from Aseprite, which is then scaled 4x from a few hundred pixels wide and with a tan background added
2024-12-05 06:33:32
```bat
@echo off
set /p "Filename=Enter Frame Name (If the frames are called Dino1395 input Dino, etc): "
ffmpeg -hide_banner -f lavfi -i anullsrc -f image2 -framerate 60 -i "%~1\%Filename%%%d.png" -vf "format=rgba,split[bg][fg];[bg]drawbox=c=tan:t=fill[bgc];[bgc][fg]overlay,scale=in_w*4:-2:flags=neighbor" -map 0:a -map 1:v -shortest -y Timelapse.mp4
pause
```
2024-12-05 06:33:58
The filename thing is because the PNGs are a sequence, but with the project name beforehand
2024-12-05 06:36:11
Easier to just copy paste the name and backspace a few than write a whole other section to the bat file
ProfPootis
2024-12-05 09:41:49
jonnyawsom3
2024-12-05 10:30:48
So... The main issue I'm having is that the scale filter doesn't run 'per frame', it's only using the first frame and then forcing the rest to fit that, rather than decreasing the resolution to maintain their aspect ratio in the chosen dimensions
2024-12-05 10:31:01
If that makes any sense... I'm going slightly insane on this
bonnibel
2024-12-11 08:32:51
2024-12-11 08:32:52
some random thoughts after a few days
i'm downscaling by an integer factor, so i can easily combine the blurring and area averaging into a single step
this gets you different weights than just using the gaussian as the resize kernel. not sure which is more correct?
as for picking sigma, what i'm currently doing is solving 1/(2\*d) = sqrt(2\*ln(c)) \* 1/(2\*pi\*sigma), where c = sqrt(2) and d = the downscale factor (e.g. 2 for a 2x downscale). if i've understood it correctly, this'd put the -3 dB cutoff frequency at the nyquist frequency of the downscaled image. given how little i know about dsp though i've probably _not_ understood it correctly
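For what it's worth, solving that relation for sigma gives sigma = d \* sqrt(2\*ln(c)) / pi, which with c = sqrt(2) works out to roughly 0.265\*d (about 0.53 for a 2x downscale), assuming the -3 dB-at-Nyquist reading above is the intended one.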
_wb_
2024-12-12 04:31:00
Combining two filters in a single step requires using a larger kernel size, so I am not sure it is actually faster...
Traneptora
So... The main issue I'm having is that the scale filter doesn't run 'per frame', it's only using the first frame and then forcing the rest to fit that, rather than decreasing the resolution to maintain their aspect ratio in the chosen dimensions
2024-12-14 09:56:28
there's a setting for that
2024-12-14 09:56:36
lemme find it, sec
So... The main issue I'm having is that the scale filter doesn't run 'per frame', it's only using the first frame and then forcing the rest to fit that, rather than decreasing the resolution to maintain their aspect ratio in the chosen dimensions
2024-12-14 10:10:13
`eval=frame`
2024-12-14 10:10:38
> eval
> Specify when to evaluate width and height expression. It accepts the following values:
>
> init
> Only evaluate expressions once during the filter initialization or when a command is processed.
>
> frame
> Evaluate expressions for each incoming frame.
>
> Default value is init.
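Not from the chat, but a sketch of how `eval=frame` might be combined with the earlier max-1280 idea for a variable-size PNG sequence (filenames, the tan canvas and the fixed 1280 box are placeholders; the `pad` is there because most video encoders want constant frame dimensions):
```
ffmpeg -framerate 60 -i "frame%d.png" -vf "scale=w=1280:h=1280:force_original_aspect_ratio=decrease:flags=neighbor:eval=frame,pad=w=1280:h=1280:x=(ow-iw)/2:y=(oh-ih)/2:color=tan:eval=frame" -y Timelapse.mp4
```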
bonnibel
_wb_ Combining two filters in a single step requires using a larger kernel size, so I am not sure it is actually faster...
2024-12-15 12:18:35
true...
2024-12-15 12:20:55
> this gets you different weights than just using the gaussian as the resize kernel. not sure which is more correct?
okay, for integer downscaling doing gauss downscale directly rather than gauss blur + area downscale seems to my eyes more like the result i get doing it in the frequency domain, but it's hard to tell
damian101
2024-12-15 12:43:28
what set of images was ssimulacra2 trained on?
bonnibel
2024-12-15 02:07:43
afaik: CID22, TID2013, KADID-10k, & KonFiG
_wb_
2024-12-15 04:01:00
Yes, I tuned it using those datasets, optimizing for Kendall rank correlation for all of them and also for Pearson and just MSE for the CID22 set, so the scores end up on a scale that is similar to the scale of CID22.
spider-mario
2024-12-21 09:44:37
https://mastodon.online/@nikitonsky/113691789641950263
Quackdoc
2024-12-21 10:21:03
xD
CrushedAsian255
2024-12-21 11:10:44
My projects: 0.3.48432
jonnyawsom3
2024-12-22 02:51:28
Also <@384009621519597581>, yesterday I was thinking about Space Engine again and creating higher bitdepth images. Exposure bracketing came to mind so I was going to have my friend render a few images, but then I realised I don't know the ideal settings for the widest range or the best software to process them into a wider single image
AccessViolation_
2024-12-22 02:58:59
I can boot it up and look at some values if you want. If you limit yourself to stuff within the solar system you can get a feel for the different brightness of the planets. If you get a couple of planets to line up, go pretty far away and use a very narrow field of view you can probably get a couple of similarly-sized planets and the sun in a single image
2024-12-22 02:59:32
Wait no if you had the sun in the picture you would only be seeing the dark sides
2024-12-22 03:00:29
Unless you like, have the sun take up half of the screen, then you could make it work
jonnyawsom3
2024-12-22 03:00:36
I mean wider as in range of values, since you wanted to make a 'real' JXL file going from the brightness of the sun to ambient space
AccessViolation_
2024-12-22 03:02:08
Yeah I know what you meant, that's probably going to require some custom software or at least something that can manually edit the layers in a JXL and then do the ICC profile correctly
2024-12-22 03:02:42
I was just thinking about how you would actually set this up in space engine since everything is realistically scaled and space objects aren't usually conveniently close
jonnyawsom3
2024-12-22 03:02:52
And now I'm wondering if we could do exposure bracketing *inside* a JXL using the blend modes and cutoff points for brightness per image as a layer. But that's probably insane
AccessViolation_
2024-12-22 03:06:23
There are several approaches, one is similar to a gain map approach with layers, another one is to store just one image with f32 channels, give it a linear profile, and encode the real values directly into floats, but then you end up with not a lot of precision on values very far away from zero. So the ideal transfer function for equal precision at every magnitude of brightness is to have a nonlinear transform function that compensates for the loss of precision in floats as the values get bigger, effectively leaving you with linear precision across the float range
2024-12-22 03:07:52
There was another idea around floats somehow getting linear precision someone had, but I forgot what it was :/
jonnyawsom3
AccessViolation_ I was just thinking about how you would actually set this up in space engine since everything is realistically scaled and space objects aren't usually conveniently close
2024-12-22 03:08:58
Planet foreground and sun background with the other stars in the space between would work as a proof of concept. This is more so about light values than the details of objects, until it's scaled to a much higher resolution once we know it works
2024-12-22 03:09:01
Given the game can only output 8bit images (Excluding BC6H with the Pro DLC), the gain map approach would likely be the least effort. Unless we have an easy way to blend such a wide range of exposures together, since I doubt normal exposure bracketing tools go above 16bit
AccessViolation_
2024-12-22 03:10:43
Yeah, that sounds like the best approach for starters
CrushedAsian255
AccessViolation_ There are several approaches, one is similar to a gain map approach with layers, another one is to store just one image with f32 channels, give it a linear profile, and encode the real values directly into floats, but then you end up with not a lot of precision on values very far away from zero. So the ideal transfer function for equal precision at every magnitude of brightness is to have a nonlinear transform function that compensates for the loss of precision in floats as the values get bigger, effectively leaving you with linear precision across the float range
2024-12-22 03:12:55
The loss of precision shouldn't matter as human vision is non-linear anyways
2024-12-22 03:13:08
Or you could use a logarithmic transfer function
AccessViolation_
2024-12-22 03:15:08
The idea was not for humans to directly look at these images as a whole (although you could), the idea was more that you had sufficient detail in every brightness of the image so that when editing it you could choose to properly expose the surface of the sun, or the dark side of the earth, or both, and have them look good to human eyes after that edit
CrushedAsian255
2024-12-22 03:15:51
If doing that then the lack of precision on higher float values should still not matter as exposure is also exponential
jonnyawsom3
2024-12-22 03:15:58
Oh
> You can export 32-bit depth floating point skyboxes
Apparently that's with the PRO DLC's DDS export, I'd guess <https://steamcommunity.com/app/314650/discussions/0/2263564102376823320/>
CrushedAsian255
2024-12-22 03:20:46
Float gives you approx +/- 126 Exposure steps
AccessViolation_
2024-12-22 03:21:04
That looks promising, though I wonder if that's still over the whole exposure range the game can produce or if it just picks a range around your target exposure and uses the float range for that. Worth exploring, but I don't have the PRO version, and also whether a given version of the game works on my steam deck is a bit of a gamble lol, it's officially "unsupported" but sometimes it works and sometimes it doesn't
CrushedAsian255
2024-12-22 03:23:47
You get maybe 10-20 more EV in the negative direction but that is for subnormals and you DO start losing precision
jonnyawsom3
AccessViolation_ That looks promising, though I wonder if that's still over the whole exposure range the game can produce or if it just picks a range around your target exposure and uses the float range for that. Worth exploring, but I don't have the PRO version, and also whether a given version of the game works on my steam deck is a bit of a gamble lol, it's officially "unsupported" but sometimes it works and sometimes it doesn't
2024-12-22 03:24:43
DDS only supports half float, so I'm not sure where he got 32-bit from...
CrushedAsian255
2024-12-22 03:25:32
If it can fit in half float it can easily fit in a full float
_wb_
2024-12-22 06:06:14
Float has the same amount of mantissa precision for every exponent so I think it's pretty good for this use case.
AccessViolation_
2024-12-22 06:29:07
That's true, but as the exponent gets larger, the amount added when 'incrementing' the mantissa increases. At a value of about 13 million, ticking up the mantissa adds 1 to the value and you can no longer really store decimals. So you are wasting a lot on the 0.00000001 and 0.00000002 you can distinguish. I guess "as the magnitude gets larger, precision will be less important" was the idea for IEEE 754 and that's probably fine in most cases but here the idea is that you can resolve equal detail at very high or low absolute brightness, so that principle should explicitly not apply
2024-12-22 06:31:55
Here's a neat tool for seeing how values are represented as floats and the error of the requested vs represented value https://www.h-schmidt.net/FloatConverter/IEEE754.html
2024-12-22 06:33:25
But I mean, given the still quite large amount of integers you can store precisely in f32, that's always an option if my concerns are valid (I know less about cameras, light and pictures than I do about floating point so maybe these concerns don't mean anything 😅 )
2024-12-22 06:34:13
And of course there's log profiles and what not that could compensate for it
spider-mario
2024-12-22 06:59:27
I’m not sure I follow – isn’t it mainly relative differences that matter at mid to high levels?
2024-12-22 07:01:01
jonnyawsom3
2024-12-22 07:03:41
Saying "No artifacts visible" is very subjective, especially when we'd be zooming in on distant stars and then sliding the exposure of the image after to make them bright
AccessViolation_ But I mean, given the still quite large amount of integers you can store precisely in f32, that's always an option if my concerns are valid (I know less about cameras, light and pictures than I do about floating point so maybe these concerns don't mean anything 😅 )
2024-12-22 07:05:54
Though, considering real camera sensor noise, and the in-game rendering not being perfectly accurate itself, it would probably be good enough
spider-mario
2024-12-22 07:23:34
but brightening them will involve multiplying anyway, not adding, right?
jonnyawsom3
2024-12-22 08:03:43
My brain hurts xD
_wb_
2024-12-22 09:19:07
If you consider brightness differences multiplicative (like when talking about stops) and store them in floats then you roughly have the same precision at every point, whether you are talking deep space nanonits (or whatever it is) or supernova petanits (or whatever it is)
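Rough numbers for that (my arithmetic, not from the chat): a float32 stores a 23-bit mantissa, so adjacent values differ by a relative step of about 2^-23 ≈ 1.2e-7, which is roughly 0.00000017 EV, and that relative resolution stays the same across the whole normal range of the format, whether the absolute value is tiny or enormous, until you hit subnormals at the bottom or infinity at the top.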
AccessViolation_
2024-12-22 11:32:27
All this reading about predictors and context modeling makes me want to apply that concept to other data. One thing that comes to mind is Minecraft voxel world map data.
Oak log blocks will almost always continue in the direction they face (as part of trees or wooden beams in buildings). Grass will almost always have air on top, as many other things on top will change the grass to dirt. Stairs will usually have more stairs in the same direction one up and ahead, to continue the staircase. A water block with air above it will have a very high chance of repeating horizontally in all directions as part of the water surface, and if the biome is ocean the water probably extends a long way down too.
I know Minecraft internally uses a palette approach where each chunk creates a palette of all the blocks in it, and also a palette for some properties like whether blocks are waterlogged, and then the actual blocks are stored as indices into those. It uses the same amount of bits for every entry: the lowest amount you can get away with given the amount of blocks you have. And then it's deflate or lz4 compressed. I wonder how easy that is to beat with predictors and context modeling
2024-12-22 11:33:45
I mean, the best 'predictor' is just the chunk generation algorithm because world gen is procedural, but that's *very* CPU intensive which is probably why chunks are stored as a whole instead of only storing the changes to them
jonnyawsom3
2024-12-22 11:57:06
Maybe a cut down version of chunk gen for the 'higher level' properties like biome, to know what the most common blocks are
CrushedAsian255
2024-12-23 02:03:28
distance 4000 is amazing <@207980494892040194>
embed
CrushedAsian255 distance 4000 is amazing <@207980494892040194>
2024-12-23 02:03:34
https://embed.moe/https://cdn.discordapp.com/attachments/794206087879852106/1320572456607547443/out.jxl?ex=676a1670&is=6768c4f0&hm=0e0379c82a3a5ea692f19dde0ec1cb03fb8431a07e25883da4c86d00938d9b69&
CrushedAsian255
2024-12-23 02:05:19
```--intensity_target 0.1 -m 1 -d 40```
embed
CrushedAsian255 ```--intensity_target 0.1 -m 1 -d 40```
2024-12-23 02:05:23
https://embed.moe/https://cdn.discordapp.com/attachments/794206087879852106/1320572919084093491/out.jxl?ex=676a16de&is=6768c55e&hm=a305f3c6b0d4889a67ceee04874e6d9d753ac0e77b03b72c138a3dd1ac9c095d&