What is audible?
To find out what matters, we must look to psychoacoustic research. The following phenomena influence the perceived sound quality.
Threshold of hearing: Our hearing system has a natural lower limit, which varies with frequency. At low frequencies the threshold level is rather high. In the 2-4 kHz range the threshold level is low. See figure 5.
Figure 5. The curve indicates the threshold of hearing for people with normal hearing. Humans cannot hear sound below the threshold. Note that humans do not hear low-level, low-frequency sound very well.
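The shape of this curve can be approximated numerically. The sketch below uses Terhardt's widely quoted closed-form approximation of the threshold in quiet. That formula is not taken from this article, the function name `threshold_in_quiet_db` is only illustrative, and real thresholds vary from listener to listener.

```python
import math

def threshold_in_quiet_db(freq_hz: float) -> float:
    """Approximate threshold of hearing in dB SPL (Terhardt's approximation)."""
    f = freq_hz / 1000.0  # the formula expects frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

# Print the approximate threshold at a few frequencies.
for freq in (50, 100, 1000, 3000, 10000):
    print(f"{freq:>6} Hz: {threshold_in_quiet_db(freq):6.1f} dB SPL")
```

With this approximation the threshold comes out at roughly 23 dB SPL at 100 Hz and a few dB below 0 dB SPL near 3 kHz, which matches the trend described above.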
Masking: When the ear is exposed to sound energy in a specific frequency range, the surrounding frequencies are masked. The masking extends mainly toward higher frequencies.
The illustration below shows the masking curves of a 1 kHz pure tone at various SPLs.
Figure 6. The diagram shows the masking curves of a 1 kHz pure tone at various SPLs.
This means that distortion components very often become inaudible due to masking.
Figure 7. The 3rd harmonic (3 kHz) of the 1 kHz tone is inaudible due to masking even though the distortion is 5%.
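To compare a distortion percentage with masking curves drawn in decibels, it helps to express it as a level relative to the fundamental. This is a minimal sketch, assuming the percentage is an amplitude (sound pressure) ratio, which is the usual convention for distortion figures. The function name is illustrative.

```python
import math

def distortion_percent_to_db(percent: float) -> float:
    """Level of a distortion component relative to the fundamental, in dB."""
    return 20.0 * math.log10(percent / 100.0)

print(round(distortion_percent_to_db(5.0), 1))  # -26.0
```

A 5% third harmonic therefore lies about 26 dB below the 1 kHz tone. Whether it is masked depends on where the masking curve for the given SPL sits at 3 kHz.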
With difference frequency distortion, the components that fall below the masking frequencies are the most audible. They are also perceived as more annoying because difference tones are not necessarily musically related to the signal.
The masking effect is essential for all bit-reduced audio formats. In these formats the distortion may be high and the signal-to-noise ratio low, yet listeners in general often accept the way it sounds.
Distortion of the ear: The ear itself generates distortion. This is most pronounced at higher SPLs; the ear has its best resolution at lower SPLs.
The phenomenon can be heard when listening to two tones that are almost equally loud. Depending on the frequency interval between the two tones, a third tone can be heard. For instance, if you hear two tones a fifth apart, C3 and G3 (131 Hz and 196 Hz), your ear will create a difference tone of 65 Hz (C2, one octave below C3). Pipe organ builders use this fact to create "artificial sub-voices".
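The arithmetic of the C3/G3 example can be checked with a few lines of code. This is a sketch assuming equal temperament with A4 = 440 Hz and the common MIDI convention C4 = note 60. The helper `nearest_note` is hypothetical and only serves this illustration.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def nearest_note(freq_hz: float) -> str:
    """Nearest equal-tempered note name, assuming A4 = 440 Hz and C4 = MIDI 60."""
    midi = round(69 + 12 * math.log2(freq_hz / 440.0))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"

f_low, f_high = 131.0, 196.0   # C3 and G3 from the text
difference = f_high - f_low    # difference tone: 65 Hz

print(nearest_note(f_low), nearest_note(f_high), difference, nearest_note(difference))
```

The 65 Hz difference tone lands on the note nearest to C2 (about 65.4 Hz), one octave below the lower of the two played tones.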
Below is an example of intermodulation, or difference frequency distortion. Two tones are generated, and the ear then produces difference tone distortion. However, only the distortion tones below the real tones are perceived, because the others are masked.
Figure 8. Distortion in the ear: Frequency components occurring due to two frequencies: 1 kHz and 1.6 kHz. Only the two components below 1 kHz become audible.
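The frequencies at which such components can appear follow from simple arithmetic: each product lies at m·f1 + n·f2 for small integers m and n. The sketch below lists them for 1 kHz and 1.6 kHz. Limiting the list to second- and third-order products is an assumption made here, not something stated in the figure, and the code says nothing about the level of each component. The function name is illustrative.

```python
def intermodulation_products(f1: int, f2: int, max_order: int = 3) -> dict[int, list[str]]:
    """Map each positive product frequency m*f1 + n*f2 to its (m, n) labels.

    Only orders 2..max_order are listed, so the two original tones are excluded.
    """
    products: dict[int, list[str]] = {}
    for m in range(-max_order, max_order + 1):
        for n in range(-max_order, max_order + 1):
            order = abs(m) + abs(n)
            freq = m * f1 + n * f2
            if 2 <= order <= max_order and freq > 0:
                products.setdefault(freq, []).append(f"{m}*f1 {n:+d}*f2")
    return dict(sorted(products.items()))

products = intermodulation_products(1000, 1600)
below_lower_tone = [f for f in products if f < 1000]
print(below_lower_tone)  # [400, 600]
```

Under this assumption the two components below 1 kHz are 600 Hz (f2 - f1) and 400 Hz (2·f1 - f2). All other listed products lie at 2 kHz and above, where the two real tones mask them more easily.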
Conclusion
Ideally, there should be no distortion in your microphones. In reality, there is some. Measures such as THD and difference frequency distortion do not tell the full story, but the numbers in the specifications can be regarded as an indication of how "healthy" the design is. What the ear perceives is rather complex because the ear produces distortion itself. Whatever the numbers say, the more distortion in your microphone, the muddier and less clear the sound of your recording.