Venture no further if you don’t want to ruin music a little bit for yourself. Venture forward with no haste if you are interested in why I care about lossless files and bit-perfect reconstruction so much.
No seriously, I’m warning you. You sure you wanna know?
…you sure?
Okay.
There are a couple of tells in very low bit rate lossy audio files that make them stand out to me in comparison to a lossless file. I’m gonna go through a few of them here. If I missed any, feel free to add.
SHANNON-NYQUIST AND YOU
As we all know, lossy files are smaller than their lossless counterparts because the encoder uses an algorithm to literally truncate the higher frequencies. If you pull up a 128 kbps MP3 in a spectrogram analyzer, you can see that characteristic cut at 16 kHz. To accurately recreate the shape and amplitude of a signal, you have to sample it at more than twice its highest frequency. In other words, if you sample an analog signal at a rate that exceeds the signal's highest frequency by more than a factor of two, the original analog signal can be perfectly recovered from the discrete values produced by sampling. If you do not do this, you get an unpleasant type of distortion known as aliasing. This is known as the Shannon-Nyquist sampling theorem.
This is much easier to see with visuals. Here is a graph of a sound wave with each dot representing a sample.
Here is that same wave, 20 samples a second:
Same wave once again, 10 samples a second. See how the wave starts to diverge from the original?
At 5 samples a second, it's no longer a true representation of the original wave, but all of the information is still preserved.
This is the absolute limit: two samples per cycle, which is what the Shannon-Nyquist theorem says you need to reconstruct a signal.
This is what it looks like when you go less than two samples per cycle. This is 1.9 samples per cycle.
And this is 1.1:
So you can see that the new wave constructed out of the samples is way different from the original wave, and the original information cannot be reconstructed.
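If you'd rather convince yourself with numbers than with pictures, here's a tiny Python sketch (NumPy only; the specific frequencies are arbitrary, just picked to make the fold-back obvious):

```python
import numpy as np

# A 1 kHz tone sampled at 1.5 kHz -- below the 2 kHz Nyquist rate --
# produces exactly the same samples as a 500 Hz tone at that rate.
f_signal = 1000.0            # Hz, the "real" tone
fs = 1500.0                  # Hz, a rate that violates fs > 2 * f_signal
f_alias = fs - f_signal      # 500 Hz, where the tone folds back to

n = np.arange(30)            # 30 sample instants
undersampled = np.cos(2 * np.pi * f_signal * n / fs)
alias_tone = np.cos(2 * np.pi * f_alias * n / fs)

# The two sets of samples are numerically identical, so no amount of
# cleverness can recover the original 1 kHz tone from them.
print(np.allclose(undersampled, alias_tone))   # True
```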
SHORTCOMINGS IN LOSSY FILES
Makes sense? Cool. I explained all of this because all the sampling required for perfect reconstruction takes up a lot of space. The lossy encoder of your choice (we'll use LAME as a stand-in) looks for ways to get rid of these extra bits without fundamentally altering the reconstruction of the signal. In fact, there are several things going on:
The first one is that most sound above 20 kHz is completely removed. This is based on the idea that humans can't hear above that frequency, and any content up there would only make the file larger, since sounds in that region are much more expensive to encode. Some people have reported being able to hear above 20 kHz, so this is controversial.
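If you want to see what "chop off the top of the spectrum" looks like in code, here's a deliberately crude sketch: a brute-force FFT brickwall, not how LAME actually implements its lowpass (real encoders work on their own filterbank/MDCT coefficients):

```python
import numpy as np

def brickwall_lowpass(samples, sample_rate, cutoff_hz=16000.0):
    """Zero out all frequency content above cutoff_hz.

    A crude stand-in for the lowpass a lossy encoder applies before
    coding; nothing above the cutoff survives.
    """
    spectrum = np.fft.rfft(samples)
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    spectrum[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(samples))

# Example: one second of white noise at 44.1 kHz loses everything above
# 16 kHz, roughly the cut you see on the spectrogram of a 128 kbps MP3.
rng = np.random.default_rng(0)
noise = rng.standard_normal(44100)
filtered = brickwall_lowpass(noise, 44100, cutoff_hz=16000.0)
```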
The second part is psychoacoustics. The engineers who develop a lossy format rely on models of human hearing, called psychoacoustic models, which show that many frequencies in a complex piece of sound become “masked” by other frequencies, so you more or less don't hear them. To give the lossy file a smaller footprint, many of these sounds are removed. The bit rate you target determines how much of this processing is applied: the lower the bit rate, the more that gets thrown out.
For example, a loud sonic event will mask a quieter one if they both occur within a small window of time, even if the louder event actually occurs after the quieter one. The closer in time the potentially masked sound is to the louder (masking) event, the louder it needs to be in order to remain perceivable.
Research has shown that the human ear responds to frequency content in what are known as critical bands: narrow divisions of the 20 Hz-20 kHz spectrum. If a loud frequency component exists in one of these critical bands, it raises a masking threshold that can render quieter frequencies in the same critical band imperceptible.
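To make “critical band” a little more concrete, here's a small sketch using Zwicker's commonly cited approximation of the Bark scale. The formula and the one-Bark rule of thumb are simplifications, so treat the numbers as illustrative:

```python
import math

def bark(f_hz):
    """Approximate critical-band (Bark) number for a frequency in Hz,
    using Zwicker's formula; one Bark is roughly one critical band."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)

def same_critical_band(f1_hz, f2_hz):
    """Rough test: two tones less than one Bark apart share a critical
    band, so the louder one is a candidate to mask the quieter one."""
    return abs(bark(f1_hz) - bark(f2_hz)) < 1.0

# A quiet 1,050 Hz tone next to a loud 1,000 Hz tone can get masked;
# a 2,000 Hz tone is several critical bands away and cannot.
print(same_critical_band(1000, 1050))   # True
print(same_critical_band(1000, 2000))   # False
```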
The last thing that's done is the lossy math itself: the steps that transform and quantize the audio produce a reconstruction that is close to, but not quite the same as, the input. These processes have various side effects, one of which is a type of pre-echo that can cause some amount of smearing.
Pre-echo is a digital audio compression artifact where an echo of a sound is heard before the sound itself occurs. It is most noticeable in impulsive sounds from percussion instruments such as castanets or cymbals. The psychoacoustic component of the effect is that you only hear the echo preceding the transient, not the one following it, because the latter is drowned out by the transient. Forward temporal masking is much stronger than backward temporal masking, so it is easy to hear a pre-echo but not a post-echo.
Smearing in audio refers to a loss of clarity and definition in sound reproduction, often characterized by a blurring of transients and a lack of precision in imaging. This can result in a muddy or indistinct sound, making it difficult to discern individual elements in a mix.
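Here's a rough sketch of where that pre-echo and smearing come from, using a plain DCT plus coarse quantization as a stand-in for a real codec's filterbank and bit allocation (the block size and step size are arbitrary):

```python
import numpy as np
from scipy.fft import dct, idct

# One transform block: silence for the first half, then a sharp
# transient (a short burst of noise) in the second half.
rng = np.random.default_rng(1)
block = np.zeros(1024)
block[512:512 + 64] = rng.standard_normal(64)

# Transform the whole block, quantize the coefficients coarsely, and
# transform back -- a crude imitation of what a lossy codec does.
step = 0.5                                   # deliberately coarse quantizer
coeffs = dct(block, norm="ortho")
decoded = idct(np.round(coeffs / step) * step, norm="ortho")

# The quantization error gets spread over the whole block in time, so
# noise now shows up *before* the transient ever happens: pre-echo.
error = decoded - block
print("RMS error before the transient:", np.sqrt(np.mean(error[:512] ** 2)))
print("RMS error after the transient: ", np.sqrt(np.mean(error[512:] ** 2)))
```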
Now let’s get into some ways that I can tell a lossy file is lossy.
CYMBALS SOUND LIKE BREAD
We just went into a lot of the reasons why above, but this is usually the dead giveaway. MP3 in particular uses a compression scheme that can introduce audible artifacts in the high-frequency range where cymbals and hi-hats reside, causing a slightly distorted or “harsh” sound. The lower the bit rate, the harsher the artifacts.
The rapid attack and decay of cymbal and hi-hat sounds can be smoothed out during MP3 encoding, resulting in a less defined and impactful sound. Something that I listen for is what I call the “wobble” of the cymbal, ride, or hi-hat, which I define as occasional peaks in an otherwise decaying signal. On the other hand, if you can make out the spin of a crash or ride as it's happening, you know you've got a high quality file.
Along that same line…
DRUM HITS HAVE NO REVERB
There’s a subtle but definitely present amount of reverb that occurs when you’re listening to a drum set in real life. Some of it’s the room, some of it’s the instruments, some of it is perception. I don’t want drum kits to sound like I’m inside an oil canister, but I know I have a lossy file when the drums sound flat, as if every hit was sampled to be exactly the same.
If the drums sound flat and lifeless and my ear does not naturally gravitate toward them because they've been de-emphasized in the mix, that's one way I hear a poor quality drum sample. Drums should have body and slight variations from one strike to the next.
BASS SOUNDS FLUBBY
We talked a lot about what these encoders do to the high end, but what about the low end? This is truthfully an implementation detail, left to the various encoders, but any serious encoder throws out information under 20 Hz, for the same reason I listed above (people are not supposed to be able to hear that low). I personally can hear as low as 16 Hz, so I consider this quite a controversial claim as well.
In general, low frequencies require much less information than high frequencies to encode. However, masked frequencies and silences are removed, amounting to a spectral ‘thinning’. This makes bass sound flubby to me when the encoder is really not up to the task, or when the file is of a sufficiently low bit rate.
HARMONIC DISTORTION
So, first of all, what is a harmonic?
If you take a note, like a C, and play it on a guitar or a piano, you don't just hear the note C; you also hear, very quietly, other notes that are mathematically related to that C. You might hear a C an octave higher, and then another octave above that, and you might hear an E and a G mixed in there as well. It's actually more complex than that, but the point is that if you play a note on virtually any instrument, you get more than the single note that defines the pitch. The other stuff is the harmonics.
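If you want to see which notes those extra pitches actually are, you can walk up the harmonic series in a few lines of code (the note naming is the usual equal-temperament approximation, and C3 is just a convenient example fundamental):

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def nearest_note(freq_hz):
    """Name of the equal-tempered note closest to freq_hz (A4 = 440 Hz)."""
    midi = round(69 + 12 * math.log2(freq_hz / 440.0))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"

f0 = 130.81  # roughly a C3
for n in range(1, 9):
    harmonic = n * f0
    print(f"harmonic {n}: {harmonic:7.1f} Hz ~ {nearest_note(harmonic)}")

# Prints C3, C4, G4, C5, E5, G5, A#5 (a flat-ish seventh), C6 -- the
# octaves, the fifth, and the major third mentioned above all show up early.
```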
The harmonics of an instrument are a huge factor in why it sounds the way it does. A guitar with steel strings has a different set of harmonics than a guitar with nylon strings. The steel string is typically brighter and more metallic, due to its harmonics!
Harmonic distortion is when harmonics are added to a signal that aren't there in the original. When it comes to the guitar, the complex body and physics of the instrument add extra harmonics to whatever notes are being played, but this type of harmonic distortion is desirable.
Electronic components (amplifiers, etc.) also add harmonics to a signal. A well-designed circuit usually adds a very, very tiny amount of harmonics. That is also harmonic distortion, but it's imperceptible. A badly designed circuit can add enough harmonic distortion that you can really hear it. Certain amounts of harmonic distortion are very noticeable, and certain patterns of harmonics are more noticeable than others. Some patterns sound good, and some sound bad.
Understanding the math of harmonics also explains why distortion seems to make something sound brighter: because what you’re adding is harmonics above the fundamental, and those harmonics stack up and increase the apparent high frequency tonality of a sound. This is why too much harmonic distortion can sound harsh and painful.
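You can watch this happen in code: push a clean sine through a nonlinearity (a tanh soft clipper here, purely as an example) and see which frequencies show up that weren't in the input:

```python
import numpy as np

fs = 48000
t = np.arange(fs) / fs                   # exactly one second of signal
pure = np.sin(2 * np.pi * 1000 * t)      # a clean 1 kHz sine
driven = np.tanh(5.0 * pure)             # overdriven through a soft clipper

def strong_frequencies(x, threshold_db=-60.0):
    """Frequencies whose level is within threshold_db of the strongest bin."""
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    level_db = 20 * np.log10(spectrum / spectrum.max() + 1e-15)
    return freqs[level_db > threshold_db]

print(strong_frequencies(pure))     # [1000.] -- just the fundamental
print(strong_frequencies(driven))   # [1000. 3000. 5000. 7000. ...] Hz

# A symmetric clipper like tanh adds odd harmonics above the fundamental:
# the extra stuff stacked on top is what makes distortion sound brighter.
```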
Lossy encoders remove a lot of harmonic information, which can make a song sound darker, hollow, and even quieter.
LESS SOUNDSTAGE
Soundstage is a subjective concept that refers to the spatial qualities of a headphone, such as the width, depth, height, and shape of the figurative space where parts of a song are placed. It’s also known as speaker image. The more realistic and nuanced the sound stage, the better the clarity and detail.
Lossy encoders shrink the soundstage. Sometimes it feels as if the “space” between instruments and layers in the mix is really small, or perhaps even absent. Ever listen to a deathcore track where the guitar and the bass are fighting for the same frequency bands, and everyone sounds “bunched” together? Lossy encoders definitely do not help in this regard, because they throw out a lot of the information involved with precise placement.
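One way to get a feel for this: MP3's joint stereo mode codes a mid (L + R) and a side (L - R) signal, and the side channel typically gets fewer bits. Crudely attenuating the side channel, as in the sketch below, collapses the width in a loosely analogous way; this is an illustration, not what any specific encoder literally does:

```python
import numpy as np

def shrink_stereo_width(left, right, side_gain=0.5):
    """Encode to mid/side, attenuate the side channel, decode back.

    side_gain=1.0 leaves the image untouched; side_gain=0.0 collapses
    the mix to mono. Starving the side channel in a joint-stereo encoder
    has a loosely similar effect on perceived width.
    """
    mid = 0.5 * (left + right)
    side = 0.5 * (left - right) * side_gain
    return mid + side, mid - side   # new (left, right)

# Toy example: a source panned hard left ends up noticeably more centered.
left = np.array([1.0, 0.8, 0.6])
right = np.zeros(3)
print(shrink_stereo_width(left, right, side_gain=0.5))
```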
Okay, I’m done spouting exposition. @rsm_rain I hope you’re proud.