I will be happy to edit any of this posting if anyone finds serious issues in it that need to be addressed. I'll treat it like a wiki article: always subject to revision and correction.
Aaaannnd away we go.
Phonetics of The English Stop Contrast
Some will tell you English stops are voiced and voiceless; others will tell you they're aspirated and unaspirated; still others almost seem to wave a wand by declaring them fortis and lenis - because nobody's entirely certain what that's supposed to mean, in phonetic terms. For the moment I will use "fortis" and "lenis" as handy labels for the two stop series, /p t k/ and /b d g/. Don't worry yet about what they mean; for right now they're just names.
Never let anyone tell you that the contrast of English stops is made by a single easily described phonetic feature. Depending on dialect, in every phonetic environment there are multiple features involved in maintaining the contrast, and to understand what's really going on, we have to dissect things more closely than just saying "it's aspiration". At least five independent articulatory issues are involved, as described below.
What is aspiration? It's actually two things that, in English, work in tandem.
1. One is called Voice Onset Time - VOT for short. This refers to where, in a phonetic sequence, voicing begins. For fortis stops in some environments, it begins well after the stop is released. For lenis stops, it begins well before this... but the issue is messier here, and we will also have to ask "what is voicing?" (see #3). A fuller treatment of VOT is given by Finlay above; no need to repeat it.
2. The second element of aspiration is a widened glottal setting. There is a continuum of possible states of the glottis during speech; normal "voiceless" sounds are at one point along the continuum, with the glottis held somewhat open. But in aspiration the glottis is held further open still - about as open as it normally ever gets - which produces a subtle breathy, h-like phonation on whatever sounds you pronounce while the glottis is in this position. (And in fact English /h/ is just this: "aspirated" phonation alone, without any other articulatory component.)
English fortis stops always exhibit both components of aspiration before a stressed vowel, across all dialects, except that the aspiration is reduced or absent when the stop falls after an /s/. In this environment the distinction between the two stop series is essentially neutralized.
3. What is voicing? It, too, turns out to be more complex than first meets the eye. Classically speaking, voicing (or "modal" voice) is defined as the vibration produced by the glottis when it is held in a certain position on its continuum, a tighter position than for voiceless sounds. However, the actual vibration is highly sensitive to disruption by certain factors. One of them is too much air pressure in the oral cavity ahead of the glottis: the vibration depends on air flowing up through the glottis, so when oral pressure approaches the pressure below the glottis, the vibration quickly dies away to nothing. This is not much of an issue when pronouncing fricatives, nasals, vowels, and so forth, because air continues to escape the mouth during their articulation, so oral pressure never gets too high to maintain the vibration. Stops are different. They completely close off the escape of air from the mouth for a short time during their articulation. During this time, oral air pressure rises as incoming air from the lungs builds up in the mouth, and this cuts off glottal vibration.
Many languages with voiced stops compensate by moving muscles in the mouth to increase the volume of the oral cavity so as to accommodate incoming air, thus preserving the glottal vibration; English does not do this. It permits the vibration to cease during the closure of a stop. So English lenis stops are "voiced" in the sense that they are always pronounced with the glottis held in the correct position for voicing, but not in the sense that the vibration actually continues during stop closure. The voicing vibration returns instantly upon the release of the stop, because the oral air pressure returns to normal at that time. This gives them a much earlier VOT than the fortis stops have.
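For anyone who finds it easier to see this as a mechanism, here is a toy sketch of the pressure story above. It is not a real aerodynamic model: the function name, the parameters, and every number in it are invented for illustration, and the units are arbitrary. All it shows is the logic: vibration needs a pressure drop across the glottis, closure lets oral pressure climb until that drop disappears, and enlarging the oral cavity (the strategy English does not use) keeps the drop alive.

```python
def voicing_cutoff(closure_ms, subglottal=8.0, threshold=2.0,
                   fill_rate=0.1, cavity_expansion=0.0):
    """Toy model: return the millisecond at which glottal vibration stops
    during a stop closure, or None if it survives the whole closure.

    All values are hypothetical and in arbitrary units.
    """
    oral = 0.0  # oral pressure at the moment the stop closes
    for t in range(closure_ms):
        # Vibration needs a sufficient pressure drop across the glottis.
        if subglottal - oral < threshold:
            return t
        # Air from the lungs raises oral pressure toward the subglottal
        # level; expanding the oral cavity offsets part of that rise.
        oral += fill_rate * (subglottal - oral) - cavity_expansion
        oral = max(oral, 0.0)
    return None


# English-style: no cavity expansion, so vibration dies during closure.
english_like = voicing_cutoff(80)
# Compensating language: expansion keeps the pressure drop above threshold.
compensating = voicing_cutoff(80, cavity_expansion=0.3)
```

With these made-up settings, `english_like` is some point partway through the closure and `compensating` is `None`, i.e. voicing never stops. Changing the numbers moves the cutoff around, which is the only point of the exercise.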
Various additional features help support the contrast in English stops. It is important to look at them because the VOT difference that arises from the voicing and aspiration systems does not operate in all phonetic environments. The VOT-based description works beautifully for stops located in the onset of a stressed syllable (except after /s/) - but before stressless syllables, and especially in codas, VOT is less important or simply absent as a factor.
4. Pre-lengthening. A vowel before a lenis stop is normally pronounced for a longer duration than it would otherwise be. This varies by dialect - in some it is a major feature distinguishing coda and post-stress lenis stops, and in others it is weakly present if at all. Another issue: in dialects with American t-flapping, the t/d distinction is mostly or fully neutralized in post-stress position. This is the ladder-latter merger. But in some varieties the two are not fully merged, despite t-flapping: they remain somewhat or fully distinguishable by the trace that /d/ leaves on vowel length. I'm uncertain exactly what regions this is true of; my impression is that the American Midwest tends to have this remaining length-based contrast more than other regions, but I have not studied the distribution.
5. Glottal reinforcement. To the best of my knowledge, across all dialects of English and in all postvocalic phonetic environments, fortis stops show some degree of "glottal reinforcement". This is not the same as "coarticulated with a glottal stop", as some people have put it, although it may amount to the same thing in some varieties. In most varieties of English it consists of an incomplete glottal stricture that occurs immediately before the stop's primary closure. (This can also be described as a creaky phonation type: fortis stops are slightly pre-creaky-voiced. It sounds nuts, but it's true.)
The net effect is that in many or most dialects, word-final plosives (which are usually unreleased) are told apart not by any of the voicing/aspiration type features, but instead by whether the stop is preceded by glottal stricture versus a longer vowel.
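To pull the environments together, here is the description so far written out as a small lookup table. It only restates what this post claims; the environment labels, cue names, and the helper function are my own shorthand, and it leaves out the dialect-dependent cases (such as post-stress flapping) entirely.

```python
# Which cues mark each stop series in which environment, per the post.
# Labels are informal shorthand, not standard terminology.
CUES = {
    "stressed onset": {
        "fortis": ["late VOT", "spread glottis"],
        "lenis": ["early VOT"],
    },
    "after /s/": {
        # Aspiration is reduced or absent; contrast essentially neutralized.
        "fortis": [],
        "lenis": [],
    },
    "word-final": {
        "fortis": ["glottal reinforcement"],
        "lenis": ["longer preceding vowel"],
    },
}


def contrast_maintained(environment):
    """True if the two series are marked by different cues in this environment."""
    entry = CUES[environment]
    return entry["fortis"] != entry["lenis"]


for env in CUES:
    print(env, "->", "contrast" if contrast_maintained(env) else "neutralized")
```

Notice that no VOT cue appears in the word-final row at all, which is the whole reason "it's aspiration" fails as a one-line description.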
- For some speakers there may be more features still that get involved; for instance see Salmoneus' post above. One of the bigger points of variation is that pre-lengthening can also involve changes to the tone or quality of the vowel. I have attempted only to describe what is true across most or all dialects (this is why I ignored British t-glottalization, for example).
- Do not rely too closely on the "puff of air" test for aspiration - it's cute and works some of the time, but is neither definitive of aspiration nor a reliable result of it.