AI language models can exceed PNG and FLAC in lossless compression, says study

@FlickOfTheBean@beehaw.org · 1 year ago

@EdgeOfToday@lemm.ee · 1 year ago

With a neural network, you wouldn’t be able to mathematically prove that the signal is perfectly recovered 100% of the time for all possible inputs. With PNG and FLAC, you can. If you’re just listening to music and need a good compression ratio, then sure, it won’t be a big deal if a couple of bits are wrong. But that’s also why we have lossy compression. If the goal is to make signal degradation imperceptible to a human, then you could get a much better compression ratio using neural networks. If it’s truly critical that the signal isn’t corrupted, it would probably be better to just use the original method.

astraeus · 1 year ago

Seems like another “hey, what if we used LLMs for this” scenario. It might be more effective, but exactly how many more resources are being used to make it do the same work as current compression algorithms? Effective doesn’t mean efficient, and I think for lossless applications efficiency is truly more important.

Butterbee (She/Her) · 1 year ago

A LOT. You can barely run 13b parameter models on a 24GB gfx card, and outputs are like a page or so of text. Translate that over to audio and it would have to be broken down into discrete chunks that the model could use as “prompts” to output a section of audio that fits into the model’s available output. It might compress better, but it would be exceedingly painful and slow to extract, even on AI-focused cards. And it would use OODLES of watts to get just a little bit better than FLAC.

@abhibeckert@beehaw.org · edit-2 1 year ago

13b parameters works out to about 9GB. You need a bit more than that since it needs more than just the model in memory, but at 24GB I’d expect at least half of it to go unused. And memory doesn’t use much power at all by the way. LPDDR4 uses something like 0.3 watts while actively reading/writing to it.
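For anyone checking the arithmetic, here is a rough sketch of how weight memory scales with precision. The bit widths are illustrative assumptions, not figures from this thread: 9GB for 13b parameters corresponds to roughly 5.5 bits per parameter, which would suggest quantized weights rather than 16-bit ones.

```python
def model_size_gb(params: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return params * bits_per_param / 8 / 1e9


PARAMS_13B = 13e9

# Bit widths below are assumptions chosen for illustration.
for bits in (16, 8, 5.5, 4):
    print(f"{bits:>4} bits/param -> {model_size_gb(PARAMS_13B, bits):.1f} GB")

# Bits per parameter implied by "13b parameters in about 9GB".
implied_bits = 9e9 * 8 / PARAMS_13B
print(f"implied: {implied_bits:.2f} bits/param")
```

This only counts the weights; as noted above, activations and other runtime state need extra room on top.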

The actual computations use more, obviously, but GFX cards are not designed for this task and while they’re fast most of them are also horribly inefficient.

I run 13b parameter models on my ultra-portable laptop, which has a small battery, no active cooling (it’s fanless) and no discrete GPU. It has 16GB of RAM (not GPU memory, just RAM), and I’m running a full operating system, web browsers, etc. at the same time. Models like llama2, stable diffusion, etc. get perfectly usable performance without using much battery at all (at a guess, single-digit watts while performing the calculations).

There is efficient hardware now and there will be even more efficient hardware in the future. My laptop definitely isn’t designed to run these models and on top of that the models aren’t designed to run on a laptop either. There’s plenty more optimisation work to be done in the years to come.

Butterbee (She/Her) · 1 year ago

Ok, it’s been a while since I tried running a language model, so I might have been thinking of the 30b models that were showing up at the time. The point remains though that this thing they were running would be well beyond hardware generally available and completely impractical for realtime use. Like… why would you do all that when FLAC and PNG are good enough? It is far cheaper and uses less power to accommodate the slightly less compressed files.

@christophski@feddit.uk · 1 year ago

Ok but what if we used LLM AND blockchain for this

@ezures@lemmy.wtf · 1 year ago

I’m sure we can squeeze an NFT in there somewhere

@christophski@feddit.uk · 1 year ago

S m a r t c o n t r a c t s

astraeus · 1 year ago

Our company has been looking for a brilliant innovator like you. How would you like to apply for a new position called professional cool-sounding tech peddler, I mean, director of creative technology?

@christophski@feddit.uk · 1 year ago

I want 200k and 30% of the company

astraeus · 1 year ago

Best I can do is $125k and $300k in company stock over 4 years

@person594@feddit.de · 1 year ago

That isn’t really the case; while many neural network implementations make nondeterministic optimizations, floating point arithmetic is in principle entirely deterministic, and it isn’t too hard to get a neural network to run deterministically if needed. They are perfectly applicable for lossless compression, which is what is done in this article.
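To make the determinism point concrete, here is a minimal sketch of a lossless round trip driven by a predictor. It is not the article's method: `Order1Model` is a hypothetical toy stand-in for a neural network (it only uses integer counts), and rank coding followed by `zlib` is swapped in for a proper entropy coder to keep the example short. The decoder recovers the input exactly because it replays the same predictions the encoder made; a real network would have to produce bit-identical outputs on both sides for the same guarantee.

```python
import zlib


class Order1Model:
    """Toy adaptive predictor (hypothetical stand-in for a neural network).

    It counts which byte followed each previous byte. Only integer counts
    are used, so encoder and decoder always compute the same ranking.
    """

    def __init__(self):
        self.counts = [[0] * 256 for _ in range(256)]

    def ranking(self, prev):
        row = self.counts[prev]
        # Most frequent byte first; ties broken by byte value for a fixed order.
        return sorted(range(256), key=lambda b: (-row[b], b))

    def update(self, prev, cur):
        self.counts[prev][cur] += 1


def compress(data: bytes) -> bytes:
    model, prev, ranks = Order1Model(), 0, bytearray()
    for cur in data:
        # Store how far down the model's ranking the real byte was.
        ranks.append(model.ranking(prev).index(cur))
        model.update(prev, cur)
        prev = cur
    # Good predictions give mostly small ranks, which zlib then squeezes.
    return zlib.compress(bytes(ranks), 9)


def decompress(blob: bytes) -> bytes:
    model, prev, out = Order1Model(), 0, bytearray()
    for rank in zlib.decompress(blob):
        # Replay the same predictions and invert the rank lookup.
        cur = model.ranking(prev)[rank]
        model.update(prev, cur)
        out.append(cur)
        prev = cur
    return bytes(out)


sample = b"the quick brown fox jumps over the lazy dog. " * 20
assert decompress(compress(sample)) == sample
```

The round trip holds for any input, whatever the quality of the predictions; a bad model only costs compression ratio, never correctness.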

@CanadaPlus@lemmy.sdf.org · 1 year ago

It sounds like the actual compression isn’t taking place within the neural net here, though.
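As I understand that kind of setup, the model only supplies a probability for each next symbol, and a separate entropy coder (typically arithmetic coding) turns those probabilities into bits. A rough sketch of that split, assuming a simple adaptive byte-frequency model in place of the neural net; `ideal_bits` is a hypothetical helper that reports what an ideal coder would spend, not an actual coder.

```python
import math


def ideal_bits(data: bytes) -> float:
    """Bits an ideal entropy coder would spend, given the model's predictions.

    The "model" here is an adaptive order-0 byte counter, an assumption made
    for illustration; a language model would supply p from context instead.
    """
    counts = [1] * 256  # every byte starts with count 1, so p is never zero
    total = 256
    bits = 0.0
    for b in data:
        p = counts[b] / total   # the model's predicted probability for this byte
        bits += -math.log2(p)   # cost the entropy coder charges for that byte
        counts[b] += 1          # the model adapts; the decoder does the same
        total += 1
    return bits


text = b"abracadabra " * 50
print(f"raw: {8 * len(text)} bits, ideal coded: {ideal_bits(text):.0f} bits")
```

The better the predictions, the fewer bits the coder spends, which is why a stronger predictor can improve the ratio without the net doing the bit packing itself.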