|
HCrikki
|
2024-02-28 03:19:32
|
is everyone running against their own sample images or is there a specific source dataset commonly benched ?
|
|
|
_wb_
|
2024-02-28 03:20:27
|
It might be a good idea to change the default effort setting to 6. For lossy, e6 is performing about as well as e7 while it is twice as fast, and for lossless e6 does perform worse than e7 but it still is much better than png/webp in all ways and has a more reasonable speed than e7 (even though e7 has become a lot more reasonable than it was, already)
|
|
|
HCrikki
is everyone running against their own sample images or is there a specific source dataset commonly benched ?
|
|
2024-02-28 03:21:30
|
there are some commonly used sets, but it's always interesting to see how things are for the images you care about - things can be quite different depending on the kind of image...
|
|
|
Quackdoc
|
|
HCrikki
is everyone running against their own sample images or is there a specific source dataset commonly benched ?
|
|
2024-02-28 03:35:05
|
I don't use datasets since I find they poorly represent real-world use cases, I will just crawl the internet and download a wack load of images I can get the source for, and rip a good amount of images from gallery sites
|
|
|
|
afed
|
|
_wb_
It might be a good idea to change the default effort setting to 6. For lossy, e6 is performing about as well as e7 while it is twice as fast, and for lossless e6 does perform worse than e7 but it still is much better than png/webp in all ways and has a more reasonable speed than e7 (even though e7 has become a lot more reasonable than it was, already)
|
|
2024-02-28 03:43:36
|
for lossy e6 with streaming by default and e7 without or now it's for the same efforts as for lossless?
`-e 6` by default is good and bad, bad because a lot of people are comparing default settings without taking speed into account
and e7 is still fast enough, with streaming also for lossless
|
|
2024-02-28 03:52:43
|
or now lossless e7 is worse than e6 on most images?
because patches are disabled?
|
|
|
_wb_
|
2024-02-28 04:04:28
|
no, for lossless, e7 compresses better than e6, but it's a bit slow still (even though it's much faster now than it was).
|
|
2024-02-28 04:05:28
|
for lossy, it's not super clear to me how much better the compression of e7 is compared to e6, but I think the answer is "not much".
|
|
|
MSLP
|
|
Traneptora
-October being the gcc optimization level of "ctober"
|
|
2024-02-28 04:26:55
|
doesn't work for me, but it's expected since it's February 🤪
|
|
|
Oleksii Matiash
|
|
HCrikki
is everyone running against their own sample images or is there a specific source dataset commonly benched ?
|
|
2024-02-28 06:12:33
|
I use my own photos because of their large size and the fact that it is my main jxl usage
|
|
|
|
afed
|
|
_wb_
no, for lossless, e7 compresses better than e6, but it's a bit slow still (even though it's much faster now than it was).
|
|
2024-02-28 06:29:38
|
ah, I just read it wrong then maybe, though patches can sometimes significantly improve compression, don't know how well it works for lossy, also maybe the latest optimizations for complex images can perform worse on fast efforts
|
|
|
fab
|
2024-02-28 09:12:05
|
Jon How JPEG XL can upscale good in the Rance higher than SPMT19730M colours
|
|
2024-02-28 09:12:17
|
Is fast if i load a video
|
|
2024-02-28 09:12:43
|
Basically what is it the Changelogs?
|
|
2024-02-28 09:13:23
|
I rewrote full gemini because it was really terrible 72% of the times
|
|
2024-02-28 09:13:34
|
Now it speaks spanish
|
|
2024-02-28 09:13:52
|
I asked que es el codec AV1? ("what is the AV1 codec?")
|
|
2024-02-28 09:15:11
|
Like for example images of cats in rgb and yrcbr color space, or even videos what is the bpp gains?
|
|
|
_wb_
|
2024-02-28 09:23:45
|
I see words but they are put in a sequence that is beyond my ability to understand.
|
|
|
spider-mario
|
2024-02-28 11:12:02
|
fab, whenever you write something, could you please check which channel you are about to post it to and whether that is the appropriate one?
|
|
2024-02-28 11:12:13
|
if none matches, <#806898911091753051> would be the best fit
|
|
|
Orum
|
2024-02-28 11:43:39
|
e 7 is really the sweet spot for lossless, at least on the images I've tested it on, though I do think streaming_input and streaming_output should be on by default
|
|
2024-02-28 11:45:29
|
and it would be a little more convenient if streaming_input didn't require ppm/pgm input, though I can see the challenges associated with supporting other formats
|
|
2024-02-28 11:48:16
|
if you needed a faster lossless preset I'd go all the way to e 2
|
|
|
_wb_
|
2024-02-29 07:50:44
|
streaming_input should be possible with PNG input as long as they're not Adam7 interlaced (but most aren't). But this will only affect cjxl; other applications using libjxl have to make changes on their end to do input streaming...
|
|
|
CrushedAsian255
|
|
Traneptora
-October being the gcc optimization level of "ctober"
|
|
2024-02-29 10:47:33
|
Optimises for ghosts running the software?
|
|
|
sklwmp
|
2024-03-01 04:51:03
|
Has anyone tested if compiling with -O3 is faster than -O2 for libjxl? I understand that higher optimization levels are not always faster; sometimes even -Os works better.
|
|
|
Traneptora
|
|
sklwmp
Has anyone tested if compiling with -O3 is faster than -O2 for libjxl? I understand that higher optimization levels are not always faster; sometimes even -Os works better.
|
|
2024-03-01 05:58:41
|
"higher optimization is not always faster" is something that gentoo people have been saying for years, but generally speaking O3 will be faster than O2
|
|
2024-03-01 05:59:46
|
O3 enables -funroll-loops, which can increase compiled code size in a way that only actually hurts performance on systems that are very slow at loading dynamic libraries from disk
|
|
2024-03-01 06:00:12
|
it won't particularly matter for libjxl though as most of the performance-critical code uses highway for simd and thus optimization flags won't affect much
|
|
2024-03-01 06:03:30
|
do keep in mind that people often will argue that "-O3 breaks code" but this is not actually true, -O3 breaks *incorrect* code that relies on undefined behavior
|
|
2024-03-01 06:03:47
|
also the biggest source of breakage is `-fstrict-aliasing` which is actually enabled with -O2
|
|
2024-03-01 06:04:06
|
`-fstrict-aliasing` tells the compiler to assume that code does not break the strict aliasing rule
|
|
2024-03-01 06:04:44
|
the strict aliasing rule being "it's illegal to re-interpret-cast pointers except to/from `unsigned char *`, `char *` and `void *`"
|
|
2024-03-01 06:05:40
|
so, doing something like
```c
float f = 5.0;
uint32_t x = *(uint32_t *)&f;
```
is UB in C, it's called the "strict-aliasing" rule
|
|
2024-03-01 06:05:54
|
if you actually want to do this, you need to do it with a union
|
|
2024-03-01 06:06:26
|
```c
union { float f; uint32_t i; } z;
z.f = 5.0;
uint32_t x = z.i;
```
this is legal in C. (though not in C++)
|
|
|
CrushedAsian255
|
2024-03-01 06:34:50
|
In c++ what should you do?
|
|
2024-03-01 06:35:02
|
Bitcasts?
|
|
|
_wb_
|
2024-03-01 06:45:12
|
memcpy works
|
|
|
Traneptora
|
2024-03-01 06:47:37
|
memcpy is one way, another way IIRC is to use `reinterpret_cast<uint32_t *>` explicitly
|
|
2024-03-01 06:47:55
|
reason it's illegal in C++ is they're more strict about unions
|
|
2024-03-01 06:48:24
|
in C++, exactly one member of a union is considered "active" at any given point, which is by definition the one that was most recently assigned
|
|
2024-03-01 06:48:41
|
and it's illegal C++ to read from a different member than the one that was assigned more recently
|
|
2024-03-01 06:48:43
|
C permits this, though
|
|
|
|
veluca
|
|
Traneptora
memcpy is one way, another way IIRC is to use `reinterpret_cast<uint32_t *>` explicitly
|
|
2024-03-01 08:20:10
|
that's definitely not OK
|
|
|
yoochan
|
2024-03-01 08:38:56
|
xkcd made one just for you! https://xkcd.com/2899/
|
|
|
CrushedAsian255
|
|
Traneptora
in C++, exactly one member of a union is considered "active" at any given point, which is by definition the one that was most recently assigned
|
|
2024-03-01 08:51:27
|
What's the point of using unions then
|
|
|
yurume
|
2024-03-01 08:59:58
|
for memory optimization?
|
|
2024-03-01 09:00:02
|
like, `std::variant`
|
|
|
spider-mario
|
|
CrushedAsian255
What's the point of using unions then
|
|
2024-03-01 09:10:16
|
same as tagged unions but without the tag
|
|
|
_wb_
|
2024-03-01 09:54:49
|
Here is an example to demonstrate something most of us already know but still remains useful to point out: PSNR is not a good perceptual metric. These four images have the same PSNR of 35.0, but if you ask ssimulacra2 (or butteraugli, for that matter), the webp and avif images are worse.
|
|
2024-03-01 09:55:06
|
ugh it strips the animation
|
|
2024-03-01 09:55:32
|
but I guess the distortions are strong enough to see them without reference to the original
|
|
2024-03-01 09:57:06
|
animated version here: https://res.cloudinary.com/jon/psnr35.png
|
|
|
lonjil
|
2024-03-01 09:57:12
|
Yes, those are quite severe
|
|
|
_wb_
|
2024-03-01 09:59:49
|
this is lower quality than what I consider useful even for the web, but not outrageously low quality where all metrics break down
|
|
2024-03-01 10:00:48
|
whenever someone shows you PSNR BD-rate results showing how great something is, point them to this little reminder that PSNR doesn't mean much 🙂
|
|
|
yoochan
|
2024-03-01 10:01:54
|
avif folks only believe in MS-SSIM; is this metric better than PSNR?
|
|
|
_wb_
|
2024-03-01 10:05:52
|
slightly, but not much
|
|
2024-03-01 10:15:14
|
This shows the correlation between PSNR and human opinions. Horizontal axis is human opinion, where 30 is low quality, 90 is very high quality. Vertical axis is PSNR. The color indicates the number of images, where gray is 1 image, blue is 2, green is 20, orange is 100, red is 500 (so it's a density heatmap). As you can see, for PSNR things are all over the place: if the PSNR is 50 it will be a very good image, but if it is 35 or 40, the real quality can be pretty much anything.
|
|
2024-03-01 10:16:41
|
the solid black line shows the mean human score for a given psnr score, the dashed line shows p25 - p75 and the dotted line shows p5 - p95
|
|
2024-03-01 10:18:22
|
For MS-SSIM the plot looks like this. Better, but still kind of all over the place.
|
|
2024-03-01 10:21:27
|
For SSIMULACRA2 the plot looks like this. Of course still not perfect correlation (that would look like a perfect diagonal line), but significantly better than PSNR or MS-SSIM.
|
|
2024-03-01 10:25:15
|
The numbers at the top of the plot are the Kendall Rank Correlation Coefficient (KRCC), the Spearman Rank Correlation Coefficient (SRCC), and the Pearson Correlation Coefficient (PCC), which are ways to summarize the overall correlation.
|
|
2024-03-01 10:30:37
|
Of course you'll never get the correlation to a perfect 1 because
1) the ground truth of human opinions is not perfect (there's sampling noise in it, humans are not perfectly consistent, etc) ,and
2) humans can have non-transitive preferences occasionally (where they say things like A > B > C > A) which you can never capture with a numerical metric that constrains the concept of quality to a total order relation.
|
|
|
yoochan
|
2024-03-01 03:21:25
|
thank you for the plots, they are very interesting! (and the explanations) I hope we can convince the guy who answered me here: https://github.com/webmproject/codec-compare/issues/3 🙂
|
|
2024-03-01 03:21:53
|
The illustrations remind me of a blog post I read a long time ago... written by you I suppose
|
|
|
Traneptora
|
|
CrushedAsian255
What's the point of using unions then
|
|
2024-03-01 04:39:15
|
one reason might be, say how I use them in hydrium
|
|
2024-03-01 04:39:42
|
I store the quantized DCT coeffs in the same variable as the dequant ones
|
|
2024-03-01 04:40:03
|
quantized DCT coeffs are integers and the original are floats
|
|
2024-03-01 04:40:24
|
once I quantize them I don't need the float so I just assign it to the integer
|
|
2024-03-01 04:40:43
|
prevents me from allocating two buffers
|
|
|
monad
|
2024-03-02 11:01:51
|
all mistakes attributable to me. jon will be angry about this
|
|
|
CrushedAsian255
|
2024-03-02 12:04:59
|
What?
|
|
|
Orum
|
2024-03-02 01:03:59
|
0.9.0 <:PepeHands:808829977608323112>
|
|
|
fab
|
2024-03-02 02:14:27
|
In the benchmarks I've done I've seen 20.2% reductions on libjxl 0.10.0 with vp9 e8
|
|
2024-03-02 02:14:52
|
With the SSIMULACRA 1 metric
|
|
2024-03-02 03:53:32
|
16:50pm
|
|
2024-03-02 03:53:52
|
Toggleton sent this
|
|
|
jonnyawsom3
|
2024-03-02 03:54:42
|
That isn't a JXL
|
|
|
fab
|
|
sklwmp
|
|
Traneptora
"higher optimization is not always faster" is something that gentoo people have been saying for years, but generally speaking O3 will be faster than O2
|
|
2024-03-03 11:50:33
|
Well, apparently Arch says it too
|
|
|
Nyao-chan
|
2024-03-03 12:13:47
|
I did try PGO with Polly and BOLT and it was useless. I assume `-O 3` would be the same
|
|
2024-03-03 12:14:25
|
maybe 1.5% faster, but could be just error
|
|
|
|
veluca
|
2024-03-03 12:18:03
|
I would be extremely surprised if compiler optimizations made significant differences for vardct (beyond -O2 or perhaps -O1)
|
|
2024-03-03 12:18:09
|
for modular mode, perhaps
|
|
|
Nyao-chan
|
2024-03-03 12:18:50
|
Oh, I was only testing modular, I should specify
|
|
2024-03-03 12:19:28
|
I think Release instead of None did make it a little faster though
|
|
|
lonjil
|
2024-03-03 12:20:21
|
-O3 doesn't break things (the code is already broken) but the measured improvements from using it are usually under the margin of error.
|
|
|
|
afed
|
|
veluca
I would be extremely surprised if compiler optimizations made significant differences for vardct (beyond -O2 or perhaps -O1)
|
|
2024-03-03 12:22:38
|
btw, the git version for windows is still much slower for fast lossless modes than compiled with clang
|
|
2024-03-03 12:26:41
|
it wouldn't be that bad if people didn't make benchmarks with these much slower binaries (which already happens)
|
|
|
|
veluca
|
|
lonjil
-O3 doesn't break things (the code is already broken) but the measured improvements from using it are usually under the margin of error.
|
|
2024-03-03 12:30:01
|
It turns on autovec though, which while not so great, can be quite helpful in *some* cases... Of course, most of those programs use intrinsics anyway
|
|
|
afed
btw, the git version for windows is still much slower for fast lossless modes than compiled with clang
|
|
2024-03-03 12:30:25
|
You mean it's slower with msvc than with clang?
|
|
2024-03-03 12:30:54
|
Ah, yes, I believe I understand why... I guess we could make release binaries with clang-cl, or figure out how to do dynamic dispatching with msvc
|
|
|
|
afed
|
2024-03-03 12:31:57
|
probably; at least that's what is used for the windows binaries in git releases
|
|
|
veluca
You mean it's slower with msvc than with clang?
|
|
2024-03-03 12:46:40
|
https://canary.discord.com/channels/794206087879852103/848189884614705192/1213829871189626980
|
|
|
|
veluca
|
|
afed
https://canary.discord.com/channels/794206087879852103/848189884614705192/1213829871189626980
|
|
2024-03-03 12:47:32
|
I see, what did you compile the git version with?
|
|
|
|
afed
|
2024-03-03 12:48:26
|
-O2 x86-64-v2
|
|
|
|
veluca
|
2024-03-03 01:05:00
|
... compiler? 🙂
|
|
|
|
afed
|
2024-03-03 01:06:30
|
latest clang in msys2, can't remember exactly, but most likely the latest release version
|
|
2024-03-03 01:07:36
|
`Clang 17.0.6`
|
|
|
|
veluca
|
2024-03-03 01:23:39
|
I see
|
|
2024-03-03 01:23:46
|
I wonder what I'd get with clang-cl
|
|
2024-03-03 01:24:08
|
is that something you could try out?
|
|
|
|
afed
|
2024-03-03 01:26:49
|
it would be difficult because msvc is big and there are a lot of extra deps
|
|
|
|
veluca
|
2024-03-03 02:55:16
|
I see, I'll investigate during the week then perhaps
|
|
|
|
afed
|
2024-03-04 10:10:20
|
the new git build with clang is not much different, maybe it's an older clang version or something else, lto? <:Thonk:805904896879493180>
|
|
|
|
veluca
|
2024-03-04 10:12:22
|
can you give me the output of something at `-e 1 -d 0`?
|
|
2024-03-04 10:12:35
|
(I assume you mean the win build from my latest PR)
|
|
|
|
afed
|
2024-03-04 10:15:45
|
git msvc
`2480 x 3508, geomean: 71.379 MP/s [57.86, 74.33], 100 reps, 8 threads.`
git clang
`2480 x 3508, geomean: 71.170 MP/s [62.47, 73.80], 100 reps, 8 threads.`
|
|
|
|
veluca
|
2024-03-04 10:17:31
|
yeah no
|
|
2024-03-04 10:17:34
|
that didn't do it
|
|
|
|
afed
|
2024-03-04 10:20:05
|
are these builds for x86-64-v1?
it's strange that mostly e1 is slower, even though it should be more optimized
|
|
|
|
veluca
|
2024-03-04 10:20:23
|
oh, I can tell you why
|
|
2024-03-04 10:21:14
|
https://github.com/libjxl/libjxl/blob/main/lib/jxl/enc_fast_lossless.cc#L49
|
|
2024-03-04 10:22:01
|
I suspect clang-cl still "smells" like msvc for that check
|
|
|
|
afed
|
2024-03-05 10:06:12
|
it might be useful to have some extra build at least for windows x64 with march avx2 (like -march=haswell, for just avx I don't think it makes much sense) and avx512 support (disabled for generic builds, as far as I know) for benchmarks on modern systems
because for windows it's less likely that people will compile their own binaries
something like `jxl-x64-avx2-windows-static.zip`, if it's not too hard for extra maintenance and extra compilation time
|
|
|
fab
|
2024-03-05 10:37:38
|
Yt is becoming smart, it's learning how to build a custom encoder based on what I gave on Brave
|
|
|
jonnyawsom3
|
|
afed
it might be useful to have some extra build at least for windows x64 with march avx2 (like -march=haswell, for just avx I don't think it makes much sense) and avx512 support (disabled for generic builds, as far as I know) for benchmarks on modern systems
because for windows it's less likely that people will compile their own binaries
something like `jxl-x64-avx2-windows-static.zip`, if it's not too hard for extra maintenance and extra compilation time
|
|
2024-03-05 03:28:36
|
The package names are already slightly confusing for the average user, might just make it worse
|
|
|
|
afed
|
2024-03-05 04:05:43
|
don't think one more binary is much more confusing, maybe a slightly different name, but still, avx2 binaries (or something like x86-64-v2, x86-64-v3) are pretty common if that makes sense for speed
avx2 gives some gain, enabling avx512 also gives about 50% more and the binary size will not be as bloated as for generic builds, so it's good for everyone
though first clang compilation needs to at least work properly
|
|
|
|
veluca
|
2024-03-05 04:07:21
|
but libjxl uses dynamic dispatch anyway
|
|
|
|
afed
|
2024-03-05 04:08:32
|
yeah, but still march avx2 gives some gains over generic
|
|
|
|
veluca
|
2024-03-05 04:08:43
|
in modular?
|
|
|
|
afed
|
2024-03-05 04:15:57
|
yeah, some, but still, mostly for e1 (and for slower efforts as well, though I haven't done any comparisons recently, it was pretty consistent) and it's a smaller binary size plus the option to use avx512, which is disabled by default (windows users rarely compile binaries and avx512 support is not that uncommon)
|
|
|
Oleksii Matiash
|
2024-03-05 04:49:06
|
Just curious, is binary size really an issue? I mean does enabling avx512 increase binary size like by x2?
|
|
|
|
veluca
|
|
afed
yeah, some, but still, mostly for e1 (and for slower efforts as well, though I haven't done any comparisons recently, it was pretty consistent) and it's a smaller binary size plus the option to use avx512, which is disabled by default (windows users rarely compile binaries and avx512 support is not that uncommon)
|
|
2024-03-05 04:54:37
|
not that uncommon *now* 🙂
|
|
2024-03-05 04:54:42
|
enabling avx512 might make sense
|
|
|
|
afed
|
2024-03-05 04:54:44
|
no, not much, but also compression for e1 with avx512 is a bit worse, maybe that is also one of the reasons
|
|
2024-03-05 04:55:24
|
but for some separate version I think it's worth it
|
|
2024-03-05 04:58:05
|
<:FeelsReadingMan:808827102278451241>
|
|
|
Oleksii Matiash
|
|
veluca
not that uncommon *now* 🙂
|
|
2024-03-05 05:19:51
|
Yes 🙂
|
|
|
jonnyawsom3
|
2024-03-05 06:00:41
|
I know a while ago some extensions stopped showing in cjxl due to upstream changes https://discord.com/channels/794206087879852103/804324493420920833/1210293891664977930
|
|
|
eddie.zato
|
2024-03-06 05:22:02
|
How to compile cjxl without avx support?
I tried `-march=x86-64-v2 -mtune=x86-64-v2 -mno-avx -mno-avx2`, but it doesn't seem to work as cjxl still says `cjxl v0.10.1 5f67ebc [AVX2,SSE4,SSE2]`
|
|
|
_wb_
|
2024-03-06 06:25:47
|
There's a HWY define for it
|
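A hedged sketch of what that could look like: `HWY_DISABLED_TARGETS` is a real Highway macro for masking out SIMD targets, but verify the exact target names against `hwy/detect_targets.h` for your Highway version before relying on this.

```shell
# Disable AVX2 and AVX-512 (Highway calls the latter AVX3) at build time.
# Assumes a cmake build of libjxl; the define is honored by Highway itself.
cmake -B build -DCMAKE_CXX_FLAGS='-DHWY_DISABLED_TARGETS=(HWY_AVX2|HWY_AVX3)'
cmake --build build
```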
|
|
DZgas ะ
|
|
afed
<:FeelsReadingMan:808827102278451241>
|
|
2024-03-06 09:30:32
|
😬 I don't have AVX
|
|
|
190n
|
2024-03-06 09:32:10
|
really, what cpu?
|
|
|
DZgas ะ
|
|
190n
really, what cpu?
|
|
2024-03-06 10:01:38
|
AMD Athlon II X4 640
|
|
|
Orum
|
2024-03-06 10:03:26
|
holy...
|
|
2024-03-06 10:04:03
|
you want an Ivy Bridge PC?
|
|
|
DZgas ะ
|
2024-03-06 10:04:10
|
https://github.com/ggerganov/whisper.cpp
https://github.com/Const-me/Whisper
An excellent test for stupid developers -- both the x64 AND x86 variants are compiled AVX-only --- genius devs
|
|
|
Orum
you want an Ivy Bridge PC?
|
|
2024-03-06 10:05:18
|
What for? my computer seems to be working...
|
|
|
Orum
|
2024-03-06 10:05:32
|
for AVX, obviously...
|
|
|
DZgas ะ
|
|
Orum
for AVX, obviously...
|
|
2024-03-06 10:06:05
|
Oh yes..... and why do I need AVX? 🙂
|
|
2024-03-06 10:08:49
|
Well, fans of poke <@548366727663321098> anything for no reason, are you an AVX seller?
|
|
|
Orum
|
2024-03-06 10:09:14
|
I can't imagine how painful it must be to not have AVX these days
|
|
2024-03-06 10:09:34
|
or just using 14 year old HW
|
|
|
|
afed
|
|
DZgas ะ
😬 I don't have AVX
|
|
2024-03-06 10:10:35
|
it would still be just an extra binary, but not for avx, for avx2 and with enabled avx512, so modern systems will get some gains
|
|
2024-03-06 10:12:34
|
also <:KekDog:805390049033191445>
https://www.phoronix.com/news/LLVM-Clang-18.1-Released
|
|
|
DZgas ะ
|
|
Orum
I can't imagine how painful it must be to not have AVX these days
|
|
2024-03-06 10:12:36
|
🙂 useless thing - go to a shop and buy an HP laptop with an N3050 (no AVX) in 2024
|
|
|
HCrikki
|
2024-03-06 10:12:42
|
a private build would imo be the best compromise. upstream needs to move forward
|
|
|
190n
|
|
DZgas ะ
🙂 useless thing - go to a shop and buy an HP laptop with an N3050 (no AVX) in 2024
|
|
2024-03-06 10:14:11
|
intel moment <:YEP:808828808127971399>
|
|
|
DZgas ะ
|
2024-03-06 10:14:40
|
just think: AVX, developed in 2008 by Intel itself, is not used in their 2015 Braswell architecture -- what is the reason, hmm
|
|
|
|
afed
|
|
afed
also <:KekDog:805390049033191445>
https://www.phoronix.com/news/LLVM-Clang-18.1-Released
|
|
2024-03-06 10:15:14
|
but basically avx10.2 is what will be the mainstream version, and 10.1 is sort of transitional
|
|
|
Orum
|
2024-03-06 10:15:43
|
braswell?
|
|
|
DZgas ะ
|
|
Orum
braswell?
|
|
2024-03-06 10:16:13
|
|
|
|
Orum
|
2024-03-06 10:16:53
|
those things without AVX are all atom/low power CPUs
|
|
2024-03-06 10:17:12
|
they're not meant to be crunching stuff with SIMD 🙂
|
|
2024-03-06 10:18:42
|
anyway hopefully RISC-V will take over and their vector instructions will finally kill the SIMD treadmill
|
|
|
|
afed
|
|
HCrikki
a private build would imo be the best compromise. upstream needs to move forward
|
|
2024-03-06 10:34:01
|
there is no need for private builds, just a normal generic version for any cpus and avx2+ (with enabled avx512) for modern systems, when I tested this gives up to 5% for modular modes compared to generic build and avx512 gives about 50% more for e1 (and for other modes too, but less)
|
|
2024-03-06 10:39:21
|
but when the builds are fixed, because right now it's like this
https://canary.discord.com/channels/794206087879852103/848189884614705192/1213829871189626980
|
|
|
Traneptora
|
|
Orum
anyway hopefully RISC-V will take over and their vector instructions will finally kill the SIMD treadmill
|
|
2024-03-06 10:49:42
|
2024 year of the risc-v desktop?
|
|
|
Quackdoc
|
|
Traneptora
2024 year of the risc-v desktop?
|
|
2024-03-06 10:52:06
|
maybe, the new upcoming milk-v oasis looks insanely good for the alleged price
|
|
|
Traneptora
|
2024-03-06 10:52:31
|
just want to point out that risc-v is like ten years old and we still don't really have hardware for it
|
|
2024-03-06 10:52:34
|
it's all qemu instances
|
|
|
Quackdoc
|
|
Traneptora
just want to point out that risc-v is like ten years old and we still don't really have hardware for it
|
|
2024-03-06 10:54:56
|
the risc-v spec has only formally been stable for about 4 years now, meaning any hardware produced before that should be considered possibly incompatible. in that time we have had 3 major socs released: 2 are SBC-level socs and one is "corporate" level.
|
|
2024-03-06 10:55:42
|
the antminer x3 is a good example of corporate use, the sipeed licheepi4a is a good example of a midtier soc, and milk-v mars for lower end
|
|
2024-03-06 10:56:16
|
contrary to it being slow, risc-v adoption has been absurdly fast, probably the fastest adoption of any architecture since i386
|
|
|
Traneptora
|
2024-03-06 10:56:48
|
what exactly makes risc-v so much better than x86
|
|
|
Quackdoc
|
2024-03-06 10:58:02
|
well it's mostly just arm without a lot of the mistakes. Cheap low-power devices that perform decently.
it also helps that since it's an open spec, you can literally design and get someone to fab you chips at an actually decent price
|
|
|
Traneptora
|
2024-03-06 11:00:02
|
I see, doesn't strike me as likely to replace x86 for non-embedded computing then
|
|
|
Orum
|
|
Traneptora
just want to point out that risc-v is like ten years old and we still don't really have hardware for it
|
|
2024-03-06 11:00:08
|
just want to point out that's still very young for an arch
|
|
|
Traneptora
|
2024-03-06 11:00:37
|
sure, but I feel like people are hailing it as the next great thing but I'm not really sure what it really gives a non-embedded user over x86
|
|
|
Orum
|
2024-03-06 11:00:41
|
ARM dates back to 1985 and we still don't really have desktop ARM machines (except Apple, if you count them)
|
|
|
Quackdoc
|
|
Traneptora
I see, doesn't strike me as likely to replace x86 for non-embedded computing then
|
|
2024-03-06 11:00:45
|
people said that about laptops too, but apple was able to do it anyways
|
|
|
Traneptora
|
|
Orum
ARM dates back to 1985 and we still don't really have desktop ARM machines (except Apple, if you count them)
|
|
2024-03-06 11:00:52
|
that's because there's no reason for that
|
|
|
Quackdoc
|
|
Orum
ARM dates back to 1985 and we still don't really have desktop ARM machines (except Apple, if you count them)
|
|
2024-03-06 11:01:01
|
apples has kinda lol
|
|
|
Traneptora
|
2024-03-06 11:01:01
|
arm doesn't have any benefits over x86 outside of battery life
|
|
2024-03-06 11:01:22
|
apple has a particularly power-efficient implementation of arm in the apple silicon laptops
|
|
2024-03-06 11:01:34
|
but that's a property of that chip, not a property of arm
|
|
|
Orum
|
|
Quackdoc
apples has kinda lol
|
|
2024-03-06 11:01:35
|
yeah, "kinda" is right
|
|
|
Traneptora
sure, but I feel like people are hailing it as the next great thing but I'm not really sure what it really gives a non-embedded user over x86
|
|
2024-03-06 11:02:10
|
single biggest thing is vector instructions
|
|
|
Traneptora
|
2024-03-06 11:02:26
|
x86 has vector instructions
|
|
|
Quackdoc
people said that about laptops too, but apple was able to do it anyways
|
|
2024-03-06 11:02:43
|
you say "able" to do that. but apple has a vertically integrated ecosystem, so they can make any changes they want to exactly that ecosystem
|
|
|
Orum
|
|
Traneptora
x86 has vector instructions
|
|
2024-03-06 11:02:47
|
no, it doesn't <:CatBlobPolice:805388337862279198>
|
|
|
Quackdoc
|
|
Traneptora
but that's a property of that chip, not a property of arm
|
|
2024-03-06 11:02:53
|
no one has tried in the first place,
|
|
|
Traneptora
|
|
Orum
no, it doesn't <:CatBlobPolice:805388337862279198>
|
|
2024-03-06 11:02:57
|
what do you think simd is
|
|
|
Orum
|
2024-03-06 11:03:03
|
SIMD != vector
|
|
|
Traneptora
|
2024-03-06 11:03:03
|
stuff like sse etc.
|
|
2024-03-06 11:03:26
|
what makes SIMD not vector instructions
|
|
|
Orum
|
2024-03-06 11:03:33
|
they both process data in parallel, but how they go about it is *completely* different
|
|
|
Quackdoc
|
2024-03-06 11:03:42
|
risc-v does really unique stuff for their vector acceleration
|
|
|
Traneptora
|
|
Orum
they both process data in parallel, but how they go about it is *completely* different
|
|
2024-03-06 11:03:55
|
so it's an implementation-specific thing?
|
|
|
Orum
|
2024-03-06 11:05:10
|
this article has a good overview of the differences: https://webcache.googleusercontent.com/search?q=cache:https://medium.com/swlh/risc-v-vector-instructions-vs-arm-and-x86-simd-8c9b17963a31
|
|
|
Quackdoc
|
2024-03-06 11:05:19
|
>webcache
|
|
|
Orum
|
2024-03-06 11:05:26
|
because paywalled <:FeelsSadMan:808221433243107338>
|
|
|
Traneptora
|
2024-03-06 11:05:26
|
it's medium
|
|
|
Quackdoc
|
|
Orum
because paywalled <:FeelsSadMan:808221433243107338>
|
|
2024-03-06 11:05:43
|
did they break 12ft.io?
|
|
2024-03-06 11:06:57
|
the answer is yes
|
|
|
Orum
|
2024-03-06 11:07:15
|
> In the code examples Patterson and Waterman are using they remark that the SIMD programs require 10 to 20 times more instructions to be executed compared to the RISC-V version using vector instructions.
that alone makes vector worth it, but the benefits don't stop there
|
|
|
Traneptora
|
2024-03-06 11:07:26
|
I don't see how
|
|
2024-03-06 11:07:41
|
why is the number of instructions that exist an important metric
|
|
|
Orum
|
2024-03-06 11:08:14
|
those instructions have to be loaded from memory, and that takes time, valuable cache space, and most importantly: power
|
|
2024-03-06 11:08:29
|
> The max vector length can be queried at runtime, so one does not need to hardcode 64 element long batch sizes.
this is the *real* killer of SIMD
|
|
|
Traneptora
|
2024-03-06 11:09:03
|
as a desktop user I don't care about power
|
|
2024-03-06 11:09:14
|
why would I care about that more than all existing software continuing to work
|
|
|
Orum
|
2024-03-06 11:09:23
|
how much SIMD do we have in x86? MMX, SSE, SSE2, SSSE3, SSE4, AVX, AVX2, AVX512?
|
|
|
Traneptora
|
2024-03-06 11:09:29
|
nobody uses MMX
|
|
2024-03-06 11:09:30
|
but yes
|
|
2024-03-06 11:09:40
|
fwiw, all the SSE are all 128-bit
|
|
2024-03-06 11:09:45
|
and all the AVX are 256-bit
|
|
2024-03-06 11:09:47
|
AVX512 is 512-bit
|
|
|
Orum
|
2024-03-06 11:09:50
|
every time a new SIMD extension comes out, you need to rewrite and/or recompile code for it
|
|
|
Traneptora
|
2024-03-06 11:10:00
|
sure, but the existing binary code *still works*
|
|
|
Orum
|
2024-03-06 11:10:01
|
with RISC-V vector you don't
|
|
|
Quackdoc
|
|
Traneptora
as a desktop user I don't care about power
|
|
2024-03-06 11:10:23
|
it depends on where you live, my cousins for instance live on solar + battery / generator, so they want to shave off as much energy as possible
|
|
|
Orum
|
2024-03-06 11:10:27
|
you can use whatever new horsepower they put on your new silicon without being stuck with something written for ancient SIMD
|
|
|
Traneptora
|
2024-03-06 11:10:30
|
having to recompile performance-critical code once every few years doesn't seem worse than breaking all x86 code cause risc-v hype
|
|
|
Orum
|
2024-03-06 11:10:48
|
it's not just recompile, it's *rewriting*
|
|
2024-03-06 11:11:09
|
you don't magically put in MMX code to a compiler and instantly get AVX512
|
|
|
Traneptora
|
2024-03-06 11:11:23
|
sure, but again, why is this worth breaking all software that has come out since the 1980s
|
|
|
Quackdoc
|
2024-03-06 11:11:26
|
hand written asm is panic
|
|
|
Traneptora
|
2024-03-06 11:11:35
|
you can always use intrinsics instead of hand-written asm
|
|
2024-03-06 11:11:36
|
but yes
|
|
|
Orum
|
2024-03-06 11:11:37
|
that's the whole point--none of this *is* breaking in RISC-V
|
|
|
Quackdoc
|
|
Traneptora
sure, but again, why is this worth breaking all software that has come out since the 1980s
|
|
2024-03-06 11:11:38
|
for some it is, for some it isn't.
|
|
|
Traneptora
|
|
Orum
that's the whole point--none of this *is* breaking in RISC-V
|
|
2024-03-06 11:11:50
|
can I run x86 software on risc-v hardware?
|
|
|
Quackdoc
|
2024-03-06 11:11:53
|
but risc-v allows this forwards compatibility
|
|
|
Traneptora
|
2024-03-06 11:11:57
|
if the answer to that is "no" you are breaking all software
|
|
|
Orum
|
2024-03-06 11:12:13
|
you can, they're called translation layers or emulators
|
|
|
Traneptora
|
2024-03-06 11:12:24
|
I see, so the answer is "no"
|
|
|
Orum
|
2024-03-06 11:12:30
|
no, it's yes
|
|
|
Traneptora
|
2024-03-06 11:12:30
|
if you have to set up a qemu instance, the answer is no
|
|
|
Quackdoc
|
2024-03-06 11:12:39
|
well, box86
|
|
|
Orum
|
2024-03-06 11:12:46
|
how do you think Apple runs x86 code without having a x86 license?
|
|
|
Traneptora
|
2024-03-06 11:12:53
|
apple has an x86 license
|
|
|
Orum
|
2024-03-06 11:12:58
|
no, they don't
|
|
|
Traneptora
|
2024-03-06 11:13:03
|
mac computers have been using intel CPUs for years
|
|
2024-03-06 11:13:06
|
I have no idea what you're talking about
|
|
|
Orum
|
2024-03-06 11:13:17
|
Intel holds the license, not Apple
|
|
2024-03-06 11:13:22
|
they bought chips from Intel
|
|
|
Traneptora
|
2024-03-06 11:13:29
|
are you being pedantic for the purpose of being pedantic
|
|
|
Orum
|
2024-03-06 11:13:41
|
no, this is *extremely* important in the CPU manufacturing space
|
|
|
Traneptora
|
2024-03-06 11:13:41
|
macintosh computers have had intel CPUs in them for years
|
|
|
Quackdoc
|
2024-03-06 11:13:44
|
talking about rosetta
|
|
|
Traneptora
|
2024-03-06 11:13:54
|
you ask about "how can apple run x86 code" the answer is cause they have intel Cpus in them
|
|
|
Orum
|
2024-03-06 11:14:10
|
I'm talking about *modern* ARM Apples, not ancient ones
|
|
|
Traneptora
|
2024-03-06 11:14:21
|
x86 macintosh computers are not ancient
|
|
2024-03-06 11:14:24
|
powerpc ones are ancient
|
|
|
Quackdoc
|
2024-03-06 11:14:26
|
the m1 devices don't; their emulation on the other hand is extremely efficient, not 100% granted, but still quite good
|
|
|
Traneptora
|
2024-03-06 11:14:47
|
so it often doesn't work
|
|
2024-03-06 11:14:49
|
got it
|
|
|
Orum
|
2024-03-06 11:14:52
|
well yes PPC are even older but their x86 stuff is quite old at this point
|
|
|
Quackdoc
|
|
Traneptora
so it often doesn't work
|
|
2024-03-06 11:15:28
|
I've never had an issue with it myself, granted I haven't needed to use it often
|
|
2024-03-06 11:15:44
|
i've not seen anyone do a large scale test with it however
|
|
|
Traneptora
|
|
Orum
well yes PPC are even older but their x86 stuff is quite old at this point
|
|
2024-03-06 11:15:48
|
if 2023 is ancient I can't wait to hear what you think about 2022
|
|
|
Quackdoc
|
2024-03-06 11:16:34
|
wasn't the last intel macbook 2020? not what I would call ancient, but they aren't exactly new either
|
|
|
Orum
|
2024-03-06 11:16:35
|
that was literally their last x86 laptop 🤷‍♂️
|
|
|
Traneptora
|
2024-03-06 11:16:36
|
the apple silicon version of Mac Pro was released literally less than a year ago
|
|
2024-03-06 11:16:49
|
june 2023
|
|
|
Orum
|
2024-03-06 11:16:53
|
but x86 was dead to Apple long before that
|
|
|
Traneptora
|
2024-03-06 11:17:04
|
ah yes dead to apple despite being explicitly supported less than 12 months ago
|
|
2024-03-06 11:17:05
|
got it
|
|
2024-03-06 11:17:24
|
it wasn't even *announced* that this was going to happen until june 2020
|
|
2024-03-06 11:17:27
|
this isn't ancient history
|
|
|
Quackdoc
|
2024-03-06 11:17:39
|
ah their last macbook was apparently 2021
|
|
|
Orum
|
2024-03-06 11:17:43
|
yeah, 2020 is ancient in the computing sphere
|
|
|
Traneptora
|
2024-03-06 11:17:46
|
no, it's not
|
|
|
Quackdoc
|
2024-03-06 11:18:06
|
but well, either way, the point was their x86 emulation is good, and indeed it is
|
|
|
Traneptora
|
2024-03-06 11:18:07
|
we're talking less than four years for a transition to an entirely different architecture across their ecosystem
|
|
2024-03-06 11:18:14
|
that's definitely not ancient history
|
|
|
Traneptora
|
|
Orum
just want to point out that's still very young for an arch
|
|
2024-03-06 11:18:52
|
so risc-v being ten years old is "very young" but four years is ancient history
|
|
2024-03-06 11:18:52
|
got it
|
|
|
Quackdoc
|
2024-03-06 11:18:56
|
I haven't seen anyone submit any extensions dedicated to acceleration of things like x86 to the riscv spec, but I don't see why they couldn't be submitted
|
|
|
Orum
|
2024-03-06 11:19:10
|
I disagree; the moment they announced they were dropping x86 and moving to ARM, there was little reason to buy any x86 Apple HW
|
|
|
Orum
|
|
Traneptora
so risc-v being ten years old is "very young" but four years is ancient history
|
|
2024-03-06 11:19:18
|
different things entirely
|
|
|
Traneptora
|
2024-03-06 11:19:28
|
you said it was ancient "in the computing sphere"
|
|
2024-03-06 11:19:36
|
is instruction sets not in the computing sphere
|
|
|
Orum
|
2024-03-06 11:19:42
|
yes, in the consumer product computing sphere
|
|
|
Traneptora
|
2024-03-06 11:19:54
|
not really
|
|
2024-03-06 11:20:00
|
people don't replace their computers every 4 years
|
|
2024-03-06 11:20:03
|
people just don't do that
|
|
|
Orum
|
2024-03-06 11:20:09
|
sure, but no one is buying 4-year old computers either
|
|
|
Traneptora
|
2024-03-06 11:20:31
|
you have no reason to do that because it's not cheaper than 1-year-old hardware
|
|
2024-03-06 11:20:43
|
if it was cheaper, people would totally do that
|
|
|
Orum
|
2024-03-06 11:20:45
|
which is why all the reviewers are at AMD's throat for their recent move copying Intel, trying to sell old silicon under a new name
|
|
|
|
afed
|
|
Traneptora
AVX512 is 512-bit
|
|
2024-03-06 11:20:50
|
even 512 didn't really expand for desktops because small cores and now intel is trying to replace it by AVX10 with 256 splits
|
|
|
Traneptora
|
2024-03-06 11:21:24
|
yea, it's true that the more and more you try to extend you get diminishing returns
|
|
|
Quackdoc
|
2024-03-06 11:21:35
|
oh, riscv was ratified about 4 3/4ths years ago, my bad :D
|
|
|
Traneptora
|
2024-03-06 11:21:53
|
AVX512's gains over 256-bit AVX are lower than 256-bit AVX's gains over SSE's 128-bit
|
|
2024-03-06 11:21:54
|
etc.
|
|
|
lonjil
|
|
Orum
anyway hopefully RISC-V will take over and their vector instructions will finally kill the SIMD treadmill
|
|
2024-03-06 11:22:24
|
the SIMD treadmill was already dead
|
|
|
Orum
|
2024-03-06 11:22:42
|
in any case, from everyone's perspective *except* the manufacturer's, having a democratized ISA is a *good* thing
|
|
|
lonjil
|
|
Quackdoc
well it's mostly just arm without a lot of the mistakes. Cheap low power devices that perform decently.
it also helps that since its an open spec, you can literally design and get someone to fab you chips at an actually decent price
|
|
2024-03-06 11:22:53
|
There is already Arm without the mistakes, it's called Aarch64
|
|
|
Quackdoc
|
|
lonjil
There is already Arm without the mistakes, it's called Aarch64
|
|
2024-03-06 11:23:05
|
[av1_kekw](https://cdn.discordapp.com/emojis/758892021191934033.webp?size=48&quality=lossless&name=av1_kekw)
|
|
|
Orum
|
2024-03-06 11:23:09
|
instead of the oligopoly we've been forced to swallow for decades
|
|
|
lonjil
|
|
Orum
ARM dates back to 1985 and we still don't really have desktop ARM machines (except Apple, if you count them)
|
|
2024-03-06 11:23:13
|
why wouldn't you count apple?
|
|
|
Orum
|
2024-03-06 11:23:27
|
because their products are absurdly poor value
|
|
|
Traneptora
|
|
lonjil
why wouldn't you count apple?
|
|
2024-03-06 11:23:44
|
apple is vertically integrated which gives them the prerogative to change things and force changes on users that other hardware manufacturers don't have
|
|
2024-03-06 11:24:05
|
for example, intel can't sell a new CPU that microsoft windows doesn't work on
|
|
|
Quackdoc
|
|
Orum
instead of the oligopoly we've been forced to swallow for decades
|
|
2024-03-06 11:24:06
|
indeed, this is one of the major benefits. it's pretty nice to see how varying the risc-v development is now.
|
|
|
lonjil
|
|
Traneptora
what do you think simd is
|
|
2024-03-06 11:24:31
|
most of the industry considers "vector processing" to be a kind of SIMD, but the "vector processing" industry tries to sell itself by claiming to be something entirely different from SIMD. It is very silly.
|
|
|
Orum
|
2024-03-06 11:24:49
|
because it is totally different
|
|
|
Traneptora
|
2024-03-06 11:24:54
|
it's just variable-length simd
|
|
2024-03-06 11:24:58
|
it's not totally different
|
|
|
Orum
|
2024-03-06 11:25:03
|
"just"
|
|
|
Traneptora
|
2024-03-06 11:25:07
|
I mean
|
|
|
Quackdoc
|
|
lonjil
most of the industry considers "vector processing" to be a kind of SIMD, but the "vector processing" industry tries to sell itself by claiming to be something entirely different from SIMD. It is very silly.
|
|
2024-03-06 11:25:08
|
technically speaking, it's massively different when it comes to working with it
|
|
|
Traneptora
|
2024-03-06 11:25:08
|
it's still single-instruction-multiple-data
|
|
2024-03-06 11:25:16
|
the core concept is the same
|
|
|
Quackdoc
|
2024-03-06 11:25:17
|
yeah but so are gpus
|
|
|
Traneptora
|
2024-03-06 11:25:23
|
gpus are glorified simd, yes
|
|
|
Orum
|
2024-03-06 11:25:32
|
GPUs are SIMT, not SIMD
|
|
|
lonjil
|
|
Orum
with RISC-V vector you don't
|
|
2024-03-06 11:25:44
|
you do if you want new features. Most of those instruction sets you listed were new features, not a difference in width (which is the only thing "vectors" solves)
|
|
|
Traneptora
apple has an x86 license
|
|
2024-03-06 11:26:30
|
no they don't, buying chips doesn't get you a license
|
|
|
Quackdoc
|
|
lonjil
you do if you want new features. Most of those instruction sets you listed were new features, not a difference in width (which is the only thing "vectors" solves)
|
|
2024-03-06 11:26:39
|
that is the case for new *features* but when it comes to extending vector that work doesnt need to get put in
|
|
|
Traneptora
|
|
lonjil
no they don't, buying chips doesn't get you a license
|
|
2024-03-06 11:26:42
|
then I didn't understand the point of the question
|
|
|
Orum
|
|
lonjil
you do if you want new features. Most of those instruction sets you listed were new features, not a difference in width (which is the only thing "vectors" solves)
|
|
2024-03-06 11:27:01
|
Difference in width is *massive*, especially from a developer's perspective
|
|
2024-03-06 11:28:27
|
instead of having to write, compile, test, and verify code for 4 different widths and countless different extensions, you only need to write one
|
|
|
lonjil
|
|
Orum
in any case, from everyone's perspective *except* the manufacturer's, having a democratized ISA is a *good*thing
|
|
2024-03-06 11:28:46
|
yes. RISC-V is good at two things:
1. small microcontrollers.
2. open spec so people can do whatever they want.
not very good for "big" chips tho. Some design flaws. And a few flaws that would be super easy to fix (just need a few new instructions) that they refuse to do because it isn't "RISC" enough. Just dogma.
|
|
|
Traneptora
|
2024-03-06 11:28:48
|
you still run into issues where you need new extensions to use features like f16c or fma
|
|
|
Orum
|
2024-03-06 11:29:09
|
FMA is not 'new' by any means
|
|
|
lonjil
yes. RISC-V is good at two things:
1. small microcontrollers.
2. open spec so people can do whatever they want.
not very good for "big" chips tho. Some design flaws. And a few flaws that would be super easy to fix (just need a few new instructions) that they refuse to do because it isn't "RISC" enough. Just dogma.
|
|
2024-03-06 11:29:21
|
why is it bad for big chips?
|
|
|
Traneptora
|
2024-03-06 11:29:21
|
I mean, it's something that was added after SSE
|
|
2024-03-06 11:29:29
|
it's not simply a width change, when it was released it was a new feature
|
|
|
Quackdoc
|
|
lonjil
yes. RISC-V is good at two things:
1. small microcontrollers.
2. open spec so people can do whatever they want.
not very good for "big" chips tho. Some design flaws. And a few flaws that would be super easy to fix (just need a few new instructions) that they refuse to do because it isn't "RISC" enough. Just dogma.
|
|
2024-03-06 11:29:30
|
what are "big" chips? risc-v is being adopted in everything from SBCs to cryptominers
|
|
|
lonjil
|
|
Orum
because their products are absurdly poor value
|
|
2024-03-06 11:29:50
|
have you compared their laptops to other similar laptops? MacBooks are surprisingly great value, especially if you like low power consumption and long battery life.
|
|
|
Traneptora
|
2024-03-06 11:29:57
|
variable-width simd gets you the width upgrade automatically but anything like f16c or fma3 that wasn't in the previous version won't automatically get added
|
|
|
190n
|
|
Quackdoc
what are "big" chips? risc-v is being adopted in everything from SBCs to cryptominers
|
|
2024-03-06 11:30:12
|
and the SBCs are still slower than arm or x86 ones
|
|
|
Quackdoc
|
|
lonjil
have you compared their laptops to other similar laptops? MacBooks are surprisingly great value, especially if you like low power consumption and long battery life.
|
|
2024-03-06 11:30:13
|
for laptops yes, but the box thingies? I would disagree on
|
|
|
Traneptora
|
2024-03-06 11:30:20
|
do they even sell those?
|
|
|
Quackdoc
|
|
190n
and the SBCs are still slower than arm or x86 ones
|
|
2024-03-06 11:30:28
|
not really?
|
|
|
Traneptora
|
2024-03-06 11:30:38
|
as far as I understand the primary reason you'd purchase an apple laptop is that apple has made their apple silicon chips very power-efficient
|
|
|
Quackdoc
|
2024-03-06 11:30:39
|
the milk-v mars is faster than the rpi3b from what I can see
|
|
2024-03-06 11:30:54
|
and the lichee pi4a sits somewhere between the pi4 and pi5 in perf
|
|
|
Traneptora
|
2024-03-06 11:31:01
|
if you don't like macOS you can always run asahi linux on them too
|
|
|
Orum
|
|
lonjil
have you compared their laptops to other similar laptops? MacBooks are surprisingly great value, especially if you like low power consumption and long battery life.
|
|
2024-03-06 11:31:06
|
Me personally? No, but others have: https://www.youtube.com/watch?v=u1dxOI_kYG8
|
|
|
190n
|
|
Quackdoc
the milk-v mars is faster than the rpi3b from what I can see
|
|
2024-03-06 11:31:26
|
but rpi3b is 2 generations old
|
|
|
lonjil
|
|
Traneptora
for example, intel can't sell a new CPU that microsoft windows doesn't work on
|
|
2024-03-06 11:31:44
|
I don't really see the relevance? Pretty much all software for macOS continued to work, and Microsoft is totally willing to port Windows to new architectures (back in the day they ported Windows to Itanium, and today Windows on Arm is finally getting somewhere.)
|
|
|
Quackdoc
|
|
190n
but rpi3b is 2 generations old
|
|
2024-03-06 11:31:57
|
the rpi3b is also architecturally very different from the 4/5, and is more power efficient than them by quite the margin
|
|
|
190n
|
2024-03-06 11:31:59
|
wow it's faster than an in-order arm cpu designed in 2012
|
|
|
Quackdoc
|
2024-03-06 11:32:12
|
there is a significant reason why people still buy the rpi3b
|
|
|
Traneptora
|
|
lonjil
I don't really see the relevance? Pretty much all software for macOS continued to work, and Microsoft is totally willing to port Windows to new architectures (back in the day they ported Windows to Itanium, and today Windows on Arm is finally getting somewhere.)
|
|
2024-03-06 11:32:13
|
no, but it's easier to force a transition along when it's vertically integrated
|
|
|
190n
|
|
Quackdoc
the rpi3b is also architecturally very different from the 4/5, and is more power efficient than them by quite the margin
|
|
2024-03-06 11:32:16
|
is it really more power efficient or does it just use less power
|
|
|
Quackdoc
|
2024-03-06 11:32:25
|
> Raspberry Pi 3 Model B will remain in production until at least January 2028
|
|
|
190n
is it really more power efficient or does it just use less power
|
|
2024-03-06 11:32:37
|
power efficient
|
|
2024-03-06 11:33:11
|
it doesn't make sense to compare a product designed to compete with rpi3b to an rpi4, they are just different segments
|
|
|
lonjil
|
|
Quackdoc
that is the case for new *features* but when it comes to extending vector that work doesnt need to get put in
|
|
2024-03-06 11:33:28
|
but you chose to list out every little update and minor version even with the same width, so I think my critique of your critique is fair.
|
|
|
Traneptora
|
|
Orum
Me personally? No, but others have: https://www.youtube.com/watch?v=u1dxOI_kYG8
|
|
2024-03-06 11:33:33
|
does this video mention power consumption at all
|
|
2024-03-06 11:33:39
|
I don't really want to spend 8 minutes watching it
|
|
|
Quackdoc
|
|
lonjil
but you chose to list out every little update and minor version even with the same width, so I think my critique of your critique is fair.
|
|
2024-03-06 11:33:49
|
sure, but for a lot of people, width is what matters
|
|
2024-03-06 11:34:00
|
width *is* the key part here afterall
|
|
|
190n
|
|
Quackdoc
it doesn't make sense to compare a product designed to compete with rpi3b to an rpi4, they are just different segments
|
|
2024-03-06 11:34:24
|
sure but i would not call a pi 3b competitor a "big chip"
|
|
|
Traneptora
|
2024-03-06 11:34:26
|
if width is what you care about, width has been updated like 3 times in 20 years
|
|
2024-03-06 11:34:30
|
which is not that much
|
|
|
Orum
|
|
Traneptora
does this video mention power consumption at all
|
|
2024-03-06 11:34:34
|
IDK, but I have no doubt apple will have better efficiency as they neither have to deal with x86 hell and they are willing to pay for bleeding-edge processes
|
|
|
Quackdoc
|
|
190n
sure but i would not call a pi 3b competitor a "big chip"
|
|
2024-03-06 11:34:37
|
that's why I asked what is a "big" chip
|
|
|
lonjil
|
|
Traneptora
then I didn't understand the point of the question
|
|
2024-03-06 11:34:50
|
if you go back up the convo, the original point was that Apple ensured that x86-64 software works just fine on Arm laptops. Then you replied with stuff about license and x86 chips, which was not relevant (however, I replied before realizing that you had missed that point)
|
|
|
Orum
|
2024-03-06 11:34:56
|
though actually it has worse efficiency if you go over the 8GB limit
|
|
|
Quackdoc
|
2024-03-06 11:35:09
|
for instance the antminer x3 is a crypto bro machine running iirc 3x SG2042s
|
|
2024-03-06 11:35:20
|
I'm not sure if that would be considered "big" or not
|
|
|
Traneptora
|
|
lonjil
if you go back up the convo, the original point was that Apple ensured that x86-64 software works just fine on Arm laptops. Then you replied with stuff about license and x86 chips, which was not relevant (however, I replied before realizing that you had missed that point)
|
|
2024-03-06 11:35:25
|
whether or not apple has paid intel for a license doesn't seem to matter though if x86 code works on apple silicon via rosetta
|
|
2024-03-06 11:35:44
|
like it works or it doesn't, whether they paid intel for it seems largely irrelevant imo
|
|
|
190n
|
|
Quackdoc
that's why I asked what is a "big" chip
|
|
2024-03-06 11:35:47
|
i would classify as something you could reasonably put in a smartphone or low-end PC, or very high-end SBC
|
|
|
Quackdoc
I'm not sure if that would be considered "big" or not
|
|
2024-03-06 11:35:53
|
yeah it would
|
|
|
lonjil
|
|
Quackdoc
what are "big" chips? risc-v is being adopted in everything from SBCs to cryptominers
|
|
2024-03-06 11:35:57
|
you know, desktops and servers
|
|
|
Orum
|
|
Traneptora
like it works or it doesn't, whether they paid intel for it seems largely irrelevant imo
|
|
2024-03-06 11:36:07
|
the point is there's no reason you can't do the same with RISC-V
|
|
|
Quackdoc
|
|
190n
i would classify as something you could reasonably put in a smartphone or low-end PC, or very high-end SBC
|
|
2024-03-06 11:36:22
|
in that case you have the lichee pi4a, the lichee pi4a is a bit on the pricy side granted, but it's quite the decent chip when excusing that (early adopter fee and everything)
|
|
2024-03-06 11:36:49
|
you also have the upcoming SG2380 which should be quite the promising piece of work
|
|
|
Traneptora
|
|
Orum
the point is there's no reason you can't do the same with RISC-V
|
|
2024-03-06 11:37:23
|
except it doesn't fully work
what I foresee is someone running an x86 binary on risc-v, something doesn't work, they report a bug, the developer goes "recompile it for risc-v", the user gets angry and refuses, the developer gets angry, and nobody wins
|
|
2024-03-06 11:37:44
|
this kind of cycle happens all the time in software dev
|
|
|
Orum
|
2024-03-06 11:37:52
|
well if it's OSS there really isn't a reason *not* to compile it for RISC-V then
|
|
|
Traneptora
|
2024-03-06 11:38:02
|
sure but this happens
|
|
|
Quackdoc
|
|
Quackdoc
in that case you have the lichee pi4a, the lichee pi4a is a bit on the pricy side granted, but it's quite the decent chip when excusing that (early adopter fee and everything)
|
|
2024-03-06 11:38:39
|
considering that the lichee pi4a manages to compete with the rockchip devices in perf/watt it's pretty amazing since it was the first design released after the spec was formally ratified IIRC
|
|
|
Orum
|
2024-03-06 11:38:45
|
sure, but if it's closed source then the developer either releases RISC-V binaries, or (if it's paid closed source) loses a sale
|
|
|
Traneptora
|
2024-03-06 11:39:00
|
well for ubiquitous software they won't lose a sale
|
|
2024-03-06 11:39:22
|
if adobe doesn't release risc-v binaries for acrobat and my PDF doesn't render in acroread because of some bug in the translation layer
|
|
2024-03-06 11:39:23
|
nobody wins
|
|
|
lonjil
|
|
Orum
why is it bad for big chips?
|
|
2024-03-06 11:39:57
|
one example is the lack of complex addressing modes. needing to do those calculations with intermediate registers increases the pressure on the register rename engine, one of the most power hungry parts of the core that is always running. by having more complex instructions with fewer intermediate registers needed, you can reduce the size of the rename engine by something like 20%. Saves a lot of power and die area inside the core.
Alibaba has some cores that implement custom extensions for that and some other stuff. They claim something like 30% better perf due to this. But the RISC-V foundation is against it on ideological grounds.
|
|
|
Quackdoc
|
2024-03-06 11:39:58
|
while true, IMO it's not really that big of an issue for early adopters who go in expecting that
|
|
|
Orum
|
2024-03-06 11:39:59
|
if it's ubiquitous then you can just use something else to read PDFs
|
|
|
lonjil
|
|
lonjil
one example is the lack of complex addressing modes. needing to do those calculations with intermediate registers increases the pressure on the register rename engine, one of the most power hungry parts of the core that is always running. by having more complex instructions with fewer intermediate registers needed, you can reduce the size of the rename engine by something like 20%. Saves a lot of power and die area inside the core.
Alibaba has some cores that implement custom extensions for that and some other stuff. They claim something like 30% better perf due to this. But the RISC-V foundation is against it on ideological grounds.
|
|
2024-03-06 11:40:30
|
(on the other hand, something like a microcontroller doesn't have register renaming, so this doesn't really matter at all in that space)
|
|
|
190n
|
|
lonjil
one example is the lack of complex addressing modes. needing to do those calculations with intermediate registers increases the pressure on the register rename engine, one of the most power hungry parts of the core that is always running. by having more complex instructions with fewer intermediate registers needed, you can reduce the size of the rename engine by something like 20%. Saves a lot of power and die area inside the core.
Alibaba has some cores that implement custom extensions for that and some other stuff. They claim something like 30% better perf due to this. But the RISC-V foundation is against it on ideological grounds.
|
|
2024-03-06 11:41:26
|
Zba extension adds some instructions for address generation, or are you talking about even more complex stuff?
|
|
2024-03-06 11:41:49
|
you can shift left by (1, 2, 3) and add in one operation
|
|
|
lonjil
|
|
Quackdoc
for laptops yes, but the box thingies? I would disagree on
|
|
2024-03-06 11:41:59
|
which box thingies? Mac Mini can be decent value depending on your needs, especially if you, say, depend on solar power and need to use as little as possible, as was mentioned previously
Though if you mean like the Mac Pro then yeah lmao terrible value.
|
|
|
Traneptora
|
2024-03-06 11:42:33
|
oh they sell actual ATX-sized mac desktops?
|
|
2024-03-06 11:42:41
|
oh, that, I don't see the point in that
|
|
2024-03-06 11:42:52
|
the primary reason you'd purchase an apple computer is that it's power-efficient
|
|
|
lonjil
|
2024-03-06 11:42:52
|
they even sell rack-mount mac desktops
|
|
|
Traneptora
|
|
190n
|
2024-03-06 11:43:22
|
they even sell wheeled mac desktops
|
|
|
Orum
|
2024-03-06 11:44:38
|
they even sold boat anchors
|
|
|
lonjil
|
|
190n
Zba extension adds some instructions for address generation, or are you talking about even more complex stuff?
|
|
2024-03-06 11:45:14
|
these generate addresses, which is nice, but they still require you to stuff that address into a register before using it. I meant having complex addressing in your load and store instructions. On regular consumer and server class chips, this is very cheap to implement, basically free performance for nothing.
|
|
|
190n
|
2024-03-06 11:46:40
|
ah, like the `[reg + constant * reg]` you can do in x86?
|
|
|
lonjil
|
|
Traneptora
whether or not apple has paid intel for a license doesn't seem to matter though if x86 code works on apple silicon via rosetta
|
|
2024-03-06 11:47:16
|
yes. I am just pedantic. Though Apple probably did not pay anything.
Fun fact if you didn't know: Apple made a Linux version of Rosetta, so your x86-64 docker containers can run on apple silicon macs
|
|
|
190n
|
2024-03-06 11:47:41
|
omg there's a conditional operations extension now
|
|
|
Traneptora
|
|
lonjil
yes. I am just pedantic. Though Apple probably did not pay anything.
Fun fact if you didn't know: Apple made a Linux version of Rosetta, so your x86-64 docker containers can run on apple silicon macs
|
|
2024-03-06 11:47:56
|
>docker
plz
|
|
|
lonjil
|
|
190n
ah, like the `[reg + constant * reg]` you can do in x86?
|
|
2024-03-06 11:48:06
|
yeah. Though I don't recall exactly which addressing modes Alibaba engineers added as an extension.
|
|
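To make the addressing-mode point concrete, this is the kind of load in question (the assembly in the comments is a rough sketch of typical codegen, not verified compiler output):

```c
#include <stdint.h>

/* x86 folds the scaled index into the load itself:
 *     mov eax, [rdi + rsi*4]
 * base RV64 materializes the address first:
 *     slli t0, a1, 2 ; add t0, a0, t0 ; lw a0, 0(t0)
 * Zba's sh2add fuses the shift and add into one instruction:
 *     sh2add t0, a1, a0 ; lw a0, 0(t0)
 * but the address still flows through a named register, which is
 * the rename-pressure point being described above. */
int32_t load_scaled(const int32_t *base, uint64_t idx) {
    return base[idx];  /* address = base + (idx << 2) */
}
```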
|
190n
|
|
lonjil
yes. I am just pedantic. Though Apple probably did not pay anything.
Fun fact if you didn't know: Apple made a Linux version of Rosetta, so your x86-64 docker containers can run on apple silicon macs ๐
|
|
2024-03-06 11:48:10
|
and so you can run x86 binaries in your linux vm on apple silicon
|
|
2024-03-06 11:48:17
|
well i guess that's what docker does anyway...
|
|
|
lonjil
|
2024-03-06 11:49:18
|
finally, I've caught up to the conversation
|
|
2024-03-06 11:49:39
|
only took me 30 minutes of writing replies
|
|
|
lonjil
|
|
190n
well i guess that's what docker does anyway...
|
|
2024-03-06 11:50:13
|
yeah. Docker on Windows and macOS uses VMs
|
|
2024-03-06 11:50:45
|
On FreeBSD I believe Podman (docker but cooler) can use FreeBSD's Linux compat layer, no VM needed.
|
|
|
lonjil
|
|
Traneptora
>docker
plz
|
|
2024-03-06 11:53:03
|
how do you orchestrate stuff? Kubernetes? Ansible? Shell scripts? Just running stuff inside `tmux` by hand?
maybe you're a normal person who doesn't orchestrate anything...
|
|
|
Traneptora
|
|
lonjil
how do you orchestrate stuff? Kubernetes? Ansible? Shell scripts? Just running stuff inside `tmux` by hand?
maybe you're a normal person who doesn't orchestrate anything...
|
|
2024-03-06 11:54:19
|
I distribute software in source only <:YEP:808828808127971399>
|
|
2024-03-06 11:54:58
|
I've had bad experiences getting docker to work and actually segment off my system
|
|
|
lonjil
|
|
Traneptora
the primary reason you'd purchase an apple computer is that it's power-efficient
|
|
2024-03-06 11:55:20
|
actually funny thing, there is one area where apple's silly big computers are money efficient. They have way more video ram than any GPU that isn't 10x more expensive. So if you're doing really vram heavy stuff, they're good value.
|
|
|
Traneptora
|
2024-03-06 11:55:46
|
huh. what GPU is inside them?
|
|
2024-03-06 11:55:50
|
I assume apple doesn't make gpus
|
|
|
lonjil
|
2024-03-06 11:55:56
|
Same GPU as on the iPhone
|
|
2024-03-06 11:56:15
|
used to be PowerVR, but Apple forked their architecture and took it in-house some years ago.
|
|
|
Traneptora
|
2024-03-06 11:56:27
|
ah so it's actually in-house
|
|
|
Quackdoc
|
|
lonjil
which box thingies? Mac Mini can be decent value depending on your needs, especially if you, say, depend on solar power and need to use as little as possible, as was mentioned previously
Though if you mean like the Mac Pro then yeah lmao terrible value.
|
|
2024-03-06 11:56:46
|
I don't really know, I dont follow the names much anymore
|
|
|
lonjil
|
|
Traneptora
ah so it's actually in-house
|
|
2024-03-06 11:57:32
|
M3 Max is 92 billion transistors, most of which goes to the GPU. And M3 Ultra (whenever it comes out) will double that.
|
|
2024-03-06 11:58:04
|
for reference, Nvidia 4090 has 76 billion transistors
|
|
2024-03-06 11:59:46
|
M3 Max can have up to 128 GB of RAM (M3 Ultra should double that), while the 4090 has 24.
|
|
|
190n
|
|
lonjil
yeah. Docker on Windows and macOS uses VMs
|
|
2024-03-07 12:00:04
|
https://macoscontainers.org/ tho if it goes anywhere
|
|
|
lonjil
|
2024-03-07 12:01:32
|
neat
|
|
|
Quackdoc
I don't really know, I dont follow the names much anymore
|
|
2024-03-07 12:04:57
|
they've been using those names for like 20 years :p
|
|
|
Quackdoc
|
2024-03-07 12:06:16
|
it's the new stupid names like studio or mini or something
|
|
|
Traneptora
|
|
lonjil
M3 Max can have up to 128 GB of RAM (M3 Ultra should double that), while the 4090 has 24.
|
|
2024-03-07 12:06:40
|
... why?
|
|
|
lonjil
|
2024-03-07 12:06:41
|
mini is since 2005. studio is a new name I think yeah.
|
|
|
lonjil
|
|
Traneptora
... why?
|
|
2024-03-07 12:08:53
|
CPU and GPU are on the same chip, it's all unified RAM like any laptop chip or phone. So, same reason you'd have 128 GB in any computer, except now the GPU can use it too. (and you can share buffers between the CPU and GPU, to avoid copying)
|
|
|
Traneptora
|
2024-03-07 12:09:03
|
ah, that makes sense
|
|
|
190n
|
2024-03-07 12:09:59
|
would be cool if they'd let you install extra slower ram in the mac pro mostly for the cpu to use, as was rumored for a bit
|
|
2024-03-07 12:10:03
|
like normal ddr5
|
|
2024-03-07 12:10:45
|
nvidia is doing something kinda similar with the grace hopper chip that has stacked HBM for the gpu and a larger pool of LPDDR5 for the cpu
|
|
|
lonjil
|
2024-03-07 12:11:24
|
Eventually they'll do a 4x chip, with like 512 GB of LPDDR
|
|
|
190n
|
2024-03-07 12:12:58
|
still 3x less than the intel mac pro
|
|
|
lonjil
|
2024-03-07 12:13:24
|
psh, unified memory means you don't need as much
|
|
2024-03-07 12:14:06
|
just like how swap means you only need 8 gb of ram, not 16
|
|
|
190n
|
2024-03-07 12:14:18
|
hmm how much vram could you get on the cheese grater
|
|
2024-03-07 12:16:17
|
128?
|
|
|
lonjil
|
2024-03-07 12:16:47
|
seems like it. With 4 W6800X GPUs bridged together
|
|
|
190n
|
2024-03-07 12:18:34
|
apple explaining why it is impossible to port amd/nvidia drivers to arm (amd and nvidia have already done it)
|
|
|
lonjil
|
|
lonjil
seems like it. With 4 W6800X GPUs bridged together
|
|
2024-03-07 12:18:58
|
much cheaper than any 128GB GPU AMD sells today <:kekw:808717074305122316>
|
|
|
190n
|
2024-03-07 12:19:15
|
they realized AI startups have money?
|
|
|
Quackdoc
|
|
lonjil
psh, unified memory means you don't need as much
|
|
2024-03-07 12:20:14
|
the memory optimization stuff apple does do is pretty neat granted, but its only majorly optimized for their users average use case
|
|
|
lonjil
|
|
190n
apple explaining why it is impossible to port amd/nvidia drivers to arm (amd and nvidia have already done it)
|
|
2024-03-07 12:20:19
|
it has to do with PCIe features that many Arm systems lack. You have two options. Either add the missing features in hardware, or simply trap memory accesses and emulate them in software (super slow!)
that latter option is unfortunately common
|
|
|
Quackdoc
the memory optimization stuff apple does do is pretty neat granted, but its only majorly optimized for their users average use case
|
|
2024-03-07 12:21:47
|
you should've seen my girlfriend playing modded Kerbal Space Program on her 8GB macbook air. On the one hand, slow and laggy because it needed to swap. On the other hand, still faster than my 16 GB laptop that was only a few years older (and I was playing unmodded)
|
|
|
Quackdoc
|
2024-03-07 12:23:00
|
yeah, the M1 chips iirc have dedicated memory compression acceleration which is nice, and ofc they always can do DMA based swapping since they are guaranteed to be able to swap over nvme now, so it's like, yeah, you do have a lot of optimization stuff, but still, gibe more
|
|
|
lonjil
|
2024-03-07 12:24:06
|
If I have money in the future, I might buy a macbook for that sweet lower power use. But honestly I don't want to until they move to Armv9, because I want to play with SVE2.
|
|
|
Quackdoc
|
2024-03-07 12:25:59
|
im just waiting for the risc-v stuff, I don't really need that much perf anymore since I just really do web browsing and testing applications anyways. and the new lichee pad4a if they can price it reasonably would be just a decent price point for me anyways. and since I use linux 100% of time now aside from VMs, it works out well for me
|
|
|
Traneptora
|
|
lonjil
you should've seen my girlfriend playing modded Kerbal Space Program on her 8GB macbook air. On the one hand, slow and laggy because it needed to swap. On the other hand, still faster than my 16 GB laptop that was only a few years older (and I was playing unmodded)
|
|
2024-03-07 12:26:26
|
I find swap on desktops pointless
|
|
|
lonjil
|
|
Quackdoc
im just waiting for the risc-v stuff, I don't really need that much perf anymore since I just really do web browsing and testing applications anyways. and the new lichee pad4a if they can price it reasonably would be just a decent price point for me anyways. and since I use linux 100% of time now aside from VMs, it works out well for me
|
|
2024-03-07 12:26:34
|
unfortunately I need big and powerful computers (for reasons)
|
|
|
Quackdoc
|
2024-03-07 12:26:35
|
also want one to start porting waydroid to it if I get the time :D
|
|
|
lonjil
unfortunately I need big and powerful computers (for reasons)
|
|
2024-03-07 12:26:51
|
remote desktop I assume is off the table? lol
|
|
|
Traneptora
|
2024-03-07 12:27:14
|
the only use case where you want to have swap enabled is when you have things loaded into memory for extended periods of time that aren't used, so the OS can swap them out, and use the plentiful ram for better disk caching
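A minimal sketch of steering that behavior on Linux, assuming root access (`vm.swappiness` is the real sysctl knob; the value 10 is just an illustrative choice):

```shell
# How eagerly the kernel swaps idle anonymous pages out in favor of
# keeping more page cache (default is usually 60):
cat /proc/sys/vm/swappiness
# Lower it so swap holds mostly long-idle pages instead of being
# used under light memory pressure:
sudo sysctl vm.swappiness=10
# Check how RAM is currently split between used, cache, and swap:
free -h
```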
|
|
|
lonjil
|
2024-03-07 12:27:34
|
i have 15000 tabs open, most of them are in swap I guess
|
|
|
Traneptora
|
2024-03-07 12:27:41
|
well that's your fault
|
|
2024-03-07 12:28:04
|
people who have 15k tabs open and complain that they ran out of ram should either download more ram or not have 15k tabs open <:kek:857018203640561677>
|
|
|
Quackdoc
|
2024-03-07 12:28:06
|
I find it nice, since linux is trash and can't handle low ram properly [av1_omegalul](https://cdn.discordapp.com/emojis/885026577618980904.webp?size=48&quality=lossless&name=av1_omegalul) so even on 16gib, just compiling something when im using the desktop can hurt, though lately zram helps a good chunk
|
|
2024-03-07 12:28:21
|
zram helps quite a bit
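For anyone wanting to try it, a sketch using util-linux's `zramctl` (assumes the kernel zram module is available; the 4G size and zstd algorithm are just example choices):

```shell
# Load the zram module and allocate a compressed in-RAM swap device:
sudo modprobe zram
dev=$(sudo zramctl --find --size 4G --algorithm zstd)
sudo mkswap "$dev"
# Give it higher priority than any disk-backed swap:
sudo swapon --priority 100 "$dev"
swapon --show   # verify the device is active
```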
|
|
|
lonjil
|
2024-03-07 12:28:53
|
my window manager/compositor (sway) has 700MiB in swap for some reason
|
|
|
Traneptora
|
2024-03-07 12:29:01
|
contrary to popular belief, swap is *not* designed for if you run out of memory. what happens if you have swap enabled and run out of memory is you get into a state of thrashing that makes your system unresponsive, and inevitably you will power cycle it
|
|
|
lonjil
|
2024-03-07 12:29:01
|
only 29MiB in RAM
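Those numbers come straight from procfs; a sketch for checking one process's split between RAM and swap (sway here is just the example process name):

```shell
# VmRSS = resident in RAM, VmSwap = swapped out, both in KiB:
pid=$(pidof sway)
grep -E 'VmRSS|VmSwap' "/proc/$pid/status"
```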
|
|
|
Quackdoc
|
|
lonjil
my window manager/compositor (sway) has 700MiB in swap for some reason
|
|
2024-03-07 12:29:19
|
hello fellow sway user :D
|
|
|
Traneptora
|
2024-03-07 12:29:40
|
the primary purpose of swap is to give the OS more ram to work with to allow it to more aggressively cache the filesystem stuff, cause fsync is evil
|
|
|
lonjil
|
|
Traneptora
contrary to popular belief, swap is *not* designed for if you run out of memory. what happens if you have swap enabled and run out of memory is you get into a state of thrashing that makes your system unresponsive, and inevitably you will power cycle it
|
|
2024-03-07 12:30:00
|
I believe macos does something like keeping track of which application is in focus, and swapping it back in if it got swapped out while it wasn't in use. As opposed to Linux's very naive handling.
|
|
|
Quackdoc
|
|
Traneptora
contrary to popular belief, swap is *not* designed for if you run out of memory. what happens if you have swap enabled and run out of memory is you get into a state of thrashing that makes your system unresponsive, and inevitably you will power cycle it
|
|
2024-03-07 12:30:21
|
well regardless of what it's designed for, sadly, linux handles low memory situations abysmally, and swap can often prevent hard crashing
|
|
|
Traneptora
|
2024-03-07 12:30:43
|
well no, what happens when you run out of ram and have swap is you start thrashing and your system becomes unresponsive
|
|
|
lonjil
|
2024-03-07 12:31:01
|
earlier today linux decided to move literally all my applications to swap in favor of filling my ram with, uh, something
|
|
|
Traneptora
|
2024-03-07 12:31:04
|
I've used linux on a desktop for a long time and it has never, ever happened that when I started thrashing the system ever became responsive again
|
|
|
Quackdoc
|
2024-03-07 12:31:12
|
at least the system can still call oomkiller.
|
|
|
Traneptora
|
2024-03-07 12:31:21
|
except it won't
|
|
|
lonjil
|
2024-03-07 12:31:26
|
I'm literally using 26GiB of swap right now and my system is responsive
|
|
|
Traneptora
|
2024-03-07 12:31:34
|
yea but lon you're not out of memory
|
|
|
Quackdoc
|
2024-03-07 12:31:40
|
it does? and even if it doesn't for some reason you can manually call it using sysrq
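A sketch of that manual invocation via the magic SysRq interface (needs root, and SysRq must be enabled):

```shell
# Check whether SysRq functions are enabled (bitmask; 1 = all):
cat /proc/sys/kernel/sysrq
echo 1 | sudo tee /proc/sys/kernel/sysrq
# 'f' runs the OOM killer once, killing the highest-scoring process:
echo f | sudo tee /proc/sysrq-trigger
# On a physical console, the keyboard chord Alt+SysRq+F does the same.
```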
|
|
|
Traneptora
|
2024-03-07 12:31:52
|
it won't call the OOMkiller if you run out of ram with swap
|
|
2024-03-07 12:31:57
|
it will just start thrashing
|
|
|
lonjil
|
2024-03-07 12:32:10
|
fun fact: the oom killer always kills discord first for some reason
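There's a knob behind that ordering: the kernel ranks candidates by `oom_score`, which you can inspect and bias per process. A sketch (the -500 value is just an example; lowering a score needs privilege):

```shell
# Higher score = killed sooner:
cat /proc/self/oom_score
# Bias a process down so the OOM killer prefers other victims
# (range -1000..1000; negative values need CAP_SYS_RESOURCE):
echo -500 | sudo tee /proc/self/oom_score_adj
```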
|
|
|
Quackdoc
|
2024-03-07 12:32:30
|
that's not what I have experienced at all, swap is pretty much a necessity on my laptop and tablets, 2gib and 4gib respectively, since linux will just hard crash on them.
|
|
2024-03-07 12:32:43
|
when you enable swap it often will recover just fine
|
|
|
Traneptora
|
2024-03-07 12:32:52
|
I'm saying it has never been the case that I have run out of memory with swap enabled and had the system ever become responsive again after it started thrashing
|
|
2024-03-07 12:33:32
|
I've been using linux for quite some time as a daily driver. almost 15 years, and I've never experienced anything other than what I described
|
|
|
Quackdoc
|
2024-03-07 12:33:34
|
yeah, can't say I share the same experience, but then again, flash helps a lot in these cases anyways
|
|
|
lonjil
|
|
lonjil
earlier today linux decided to move literally all my applications to swap in favor of filling my ram with, uh, something
|
|
2024-03-07 12:33:37
|
this caused massive thrashing but my system did become usable again eventually
|
|
2024-03-07 12:34:00
|
swap usage was at 70GiB and my mouse cursor wasn't moving
|
|
2024-03-07 12:34:10
|
eventually it became responsive again
|
|
|
Traneptora
|
2024-03-07 12:34:15
|
how long did you have to wait
|
|
2024-03-07 12:34:31
|
it may be the case that it responds eventually but I find that power cycling the system is faster than waiting
|
|
|
lonjil
|
2024-03-07 12:34:40
|
a minute or two?
|
|
|
Quackdoc
|
2024-03-07 12:34:44
|
on my laptops I'll sometimes have to wait like 30s, but that's better than hard crashing and losing a bunch of data
|
|
|
Traneptora
|
2024-03-07 12:35:00
|
ah, yea, I've never had it recover after only a minute
|
|
|
lonjil
|
2024-03-07 12:35:13
|
but I have no idea what caused it, linux's swapping heuristics are jank
|
|
|
Traneptora
|
2024-03-07 12:35:28
|
my life has been so much better since I disabled swap*
|
|
2024-03-07 12:35:40
|
*I have a swap partition but I don't swap it on unless I need to hibernate
|
|
2024-03-07 12:35:52
|
since hibernation is done to swap space
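That workflow sketches out as follows (the partition path /dev/nvme0n1p3 is hypothetical; substitute your own swap partition):

```shell
# Enable the swap partition only when hibernation is needed:
sudo swapon /dev/nvme0n1p3
sudo systemctl hibernate    # the memory image is written to swap space
# After resume, turn it back off:
sudo swapoff /dev/nvme0n1p3
```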
|
|
|
lonjil
|
2024-03-07 12:35:53
|
I might disable swap after I upgrade to a computer with 256 GiB of RAM.
|
|
|
Quackdoc
|
2024-03-07 12:36:03
|
how linux handles memory in general is jank lol
|
|
|
Traneptora
|
2024-03-07 12:36:10
|
I have 32 GiB and I don't really run out
|
|
|
Quackdoc
|
2024-03-07 12:36:19
|
lately i've been using bustd as my oomkiller and it works pretty well
|
|
|
Traneptora
|
2024-03-07 12:36:26
|
I didn't really run out at 16 either unless I was doing some kind of heavy encoding stuffs
|
|
|
CrushedAsian255
|
2024-03-07 12:36:58
|
i once got mathematica to use 62GB of swap + 22GB of ram
|
|
|
Quackdoc
|
2024-03-07 12:37:00
|
fat lto on rust projects with many sub stuff hurts T.T
|
|
|
Traneptora
|
|
CrushedAsian255
i once got mathematica to use 62GB of swap + 22GB of ram
|
|
2024-03-07 12:37:10
|
oo, that's impressive, how
|
|
|
CrushedAsian255
|
|
Traneptora
oo, that's impressive, how
|
|
2024-03-07 12:37:22
|
`DeBruijnSequence[2,1000000000]`
|
|
|
Traneptora
|
2024-03-07 12:37:28
|
<:holyfuck:941472932654362676>
|
|
2024-03-07 12:37:30
|
yea that might do it
|
|