
The Laws of Thermodynamics


Copyright © 2005 jsd

Thermodynamics is celebrated for its power, generality, and elegance. However, all too often, students are taught some sort of pseudo-thermodynamics that is infamously confusing, limited, and ugly. This document is an attempt to do better, i.e. to present the main ideas in a clean, simple, modern way.
The first law of thermodynamics is usually stated in a very unwise form.   We will see how to remedy this.

The second law is usually stated in a very unwise form.   We will see how to remedy this, too.

The so-called third law is a complete loser. It is beyond repair.   We will see that we can live without it just fine.

Many of the basic concepts and terminology (including heat, work, adiabatic, etc.) are usually given multiple mutually-inconsistent definitions.   We will see how to avoid the inconsistencies.

Many people remember the conventional “laws” of thermodynamics by reference to the following joke:
  0)   You have to play the game;
  1)   You can’t win;
  2)   You can’t break even, except on a very cold day; and
  3)   It doesn’t get that cold.
It is not optimal to formulate thermodynamics in terms of a short list of enumerated laws, but if you insist on having such a list, here it is, modernized and clarified as much as possible:
The zeroth law of thermodynamics tries to tell us that certain thermodynamical notions such as “temperature”, “equilibrium”, and “macroscopic state” make sense.   Sometimes these make sense, to a useful approximation ... but not always. See section 3.

The first law of thermodynamics states that energy obeys a local conservation law.   This is true and important. See section 1.2.

The second law of thermodynamics states that entropy obeys a local law of paraconservation.   This is true and important. See section 2.

There is no third law of thermodynamics.   The conventional so-called third law alleges that the entropy of some things goes to zero as temperature goes to zero. This is never true, except perhaps in a few extraordinary, carefully-engineered situations. It is never important. See section 4.

To summarize the situation, we have two laws (#1 and #2) that are very powerful, reliable, and important (but often misstated and/or conflated with other notions) plus a grab-bag of many lesser laws that may or may not be important and indeed are not always true (although sometimes you can make them true by suitable engineering). What’s worse, there are many essential ideas that are not even hinted at in the aforementioned list, as discussed in section 5.

We will not attempt to base our theory on some small number of axiomatic “laws”. We will carefully formulate a first law and a second law, but will leave numerous other ideas un-numbered. The rationale for this is discussed in section 7.4.

Also we do not express the first law as dE = dW + dQ or anything like that, even though it is traditional in some quarters to do so. For starters, it is mathematically manifestly impossible for there to be any W and/or Q that satisfy such an equation -- except perhaps in trivial, uninteresting cases -- as discussed in section 7.6. Secondly, even though there are non-ridiculous ways of expressing dE in terms of a thermal part and a non-thermal part, such expressions are vastly less general, less elegant, and less reliable than equation 1.

Our thermodynamics is not restricted to the study of ideal gasses. We do not pretend that all thermal energy is kinetic; we recognize that random potential energy is important also.

We do not base our thermodynamics on a notion of “heat”. Energy and entropy are always well defined, even in cases where heat is not.

We do not define entropy in terms of energy, nor vice versa. We do not define either of them in terms of temperature. Entropy and energy are well defined even in situations where the temperature is zero, unknown, or undefinable.

To be explicit: You can do thermodynamics without heat. You can even do quite a bit of thermodynamics without temperature. But you can’t do thermodynamics without energy and entropy.

This document is also available in PDF format. You may find this advantageous if your browser has trouble displaying standard HTML math symbols.


1  Energy

1.1  Definition of Energy

Energy is a fundamental concept in physics. It is so fundamental that there is no point in trying to define it in terms of anything more fundamental. It is far more sensible to understand energy in terms of what it does than in terms of what it is. By way of analogy, let’s recall how mathematicians define the natural numbers. There are five Peano axioms, of which the first two are:
  • There is a natural number 0.
  • The successor of any natural number (N) is a natural number, denoted SN.
So, the first few natural numbers are 0, S0, SS0, and SSS0. This definition is recursive. It is not circular, just recursive. This is a precise, rigorous, formal definition.

Following this recipe, we will start with a few examples of energy, then define the general notion of energy recursively. Specifically:

Some well-understood examples of energy include the following:

  • gravitational energy: m g h
  • kinetic energy: ½ m v²
  • Hookean spring energy: ½ k x²
  • capacitive energy: ½ C V²
  • inductive energy: ½ L I²
We generalize this by saying energy is anything that can be converted to some known form(s) of energy in accordance with the law of conservation of energy (section 1.2).

This definition is recursive. That is, we can pull our understanding of energy up by the bootstraps. We can identify new forms of energy as they come along, because they contribute to the conservation law in the same way as the already-known examples.

Energy is somewhat abstract. There is no getting around that. You just have to get used to it -- by accumulating experience, seeing how energy behaves in various situations. As abstractions go, energy is one of the easiest to understand, because it is so precise and well-behaved.

1.2  Conservation of Energy

The first law of thermodynamics states that energy obeys a local conservation law.

By this we mean something very specific:

Any decrease in the amount of energy in a given region of space must be exactly balanced by a simultaneous increase in the amount of energy in an adjacent region of space.
Note the adjectives “simultaneous” and “adjacent”. The laws of physics do not permit energy to disappear now and reappear later. Similarly the laws do not permit energy to disappear from here and reappear at some distant place. Energy is conserved right here, right now.

It is usually possible1 to observe and measure the physical processes whereby energy is transported from one region to the next. This allows us to express the energy-conservation law as an equation:

change(energy inside boundary) = - flow(energy, outward across boundary)              (1)

The word “flow” in this expression has the same meaning as it has in everyday life. See reference 8 for the details on this.
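As an illustration of what the local law means in practice, here is a minimal numerical sketch (in Python, with made-up cell energies and flows, not part of the original presentation): a chain of cells in which energy moves only between adjacent cells, so that equation 1 holds for each cell separately.

    # Minimal sketch of local energy conservation (equation 1).
    # A 1-D chain of cells; flow[i] is the energy flowing from cell i
    # to cell i+1 during one time step. All numbers are made up.

    energy = [10.0, 4.0, 7.0, 3.0]      # energy inside each cell
    flow = [2.0, -1.0, 0.5]             # flow across each interior boundary

    def step(energy, flow):
        """Advance one time step, moving energy only between adjacent cells."""
        new = energy[:]
        for i, f in enumerate(flow):
            new[i] -= f                 # energy leaves cell i ...
            new[i + 1] += f             # ... and appears in the adjacent cell
        return new

    new_energy = step(energy, flow)

    # Check equation 1 cell by cell: change(inside) = - flow(outward).
    for i in range(len(energy)):
        outward = (flow[i] if i < len(flow) else 0.0) \
                - (flow[i - 1] if i > 0 else 0.0)
        assert abs((new_energy[i] - energy[i]) + outward) < 1e-12

    # The global law follows: the total energy is unchanged.
    assert abs(sum(new_energy) - sum(energy)) < 1e-12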

There is also a global law of conservation of energy: The total energy in the universe cannot change. The local law implies the global law but not conversely. The global law is interesting, but not nearly as useful as the local law, for the following reason: suppose I were to observe that some energy has vanished from my laboratory. It would do me no good to have a global law that asserts that a corresponding amount of energy has appeared “somewhere” else in the universe. There is no way of checking that assertion, so I would not know and not care whether energy was being globally conserved.2 Also it would be very hard to reconcile a non-local law with the requirements of special relativity.

As discussed in reference 8, there is an important distinction between the notion of conservation and the notion of constancy. Local conservation of energy says that the energy in a region is constant except insofar as energy flows across the boundary.

1.3  Energy versus Work

Non-experts sometimes try to define energy as “the ability to do work” but this is unhelpful. At best, it would saddle us with the burden of defining “work” which is no easier than defining energy. More importantly, though, equating energy with doable work is inconsistent with thermodynamics, as we see from the following example:
#1: Consider an isolated system containing a hot potato, a cold potato, and a tiny heat engine. This system has some energy and some ability to do work.   #2: Contrast that with a system that is just the same, but instead of a hot potato and a cold potato, it has two hot potatoes.

The second system has more energy but less ability to do work.

This sheds an interesting side-light on the energy-conservation law. The law, by itself, does not tell you what will happen; it only tells you what cannot happen: you cannot have any process that fails to conserve energy. That’s because processes that would be permissible as far as the energy conservation law is concerned may be forbidden on other grounds. To say the same thing another way: if something is prohibited by the energy-conservation law, the prohibition is absolute, whereas if something is permitted by the energy-conservation law, the permission is conditional, conditioned on compliance with all the other laws of physics. In particular, you can freely convert nonthermal energy to thermal energy, but not the reverse. The reverse would be perfectly consistent with energy conservation, but is forbidden on other grounds (namely the second law of thermodynamics, as discussed in section 2).

Let’s be clear: work can be converted to any other form of energy. Not every form of energy can be used to do work. Equating energy with doable work is just not correct.

1.4  Conflict with the Vernacular

There is only one technical meaning for the term energy. For all practical purposes, there is agreement among physicists as to what energy is.

The same goes for the term conservation.

However, you should beware that the technical meanings of these terms conflict with the vernacular meanings. There is tremendous potential for negative transference.

For example, you may have seen a placard that says “Please Conserve Energy by turning off the lights when you leave” or something similar. Let’s be absolutely clear: the vernacular notion of “conserving energy” (as expressed by the placard) is grossly inconsistent with the technical notion of conservation of energy (as expressed by equation 1).

The vernacular notion of “energy” is only loosely defined. Often it seems to correspond, more-or-less, with either the Gibbs free enthalpy, G (as defined in section 13.3), or the thermodynamically available energy (as discussed in section 13.4), or some other notion of low-entropy energy.

The vernacular notion of “conservation” means saving, preserving, not wasting, not dissipating. It definitely is not equivalent to equation 1, because it is applied to G, and to wildlife, and to other things that are not, in the technical sense, conserved quantities.

Combining these two notions, we see that when the placard says “Please Conserve Energy” it is nontrivial to translate that into technical terms.

At some schools, the students have found it amusing to add appropriate “translations” or “corrections” to such placards. The possibilities include:

  1. “Please Do Not Dissipate the Gibbs Potential” or, equivalently, “Please Do Not Waste Free Enthalpy”.
  2. “Please Do Not Thermalize the Energy” or “Please Do Not Waste the Thermodynamically-Available Energy”.
  3. “Please Do Not Create Entropy Unnecessarily”.
The third version is far and away the most precise, and the most amenable to a quantitative interpretation. We see that the placard wasn’t really talking about energy at all, but about entropy instead.

2  Entropy

2.1  Paraconservation

The second law states that entropy obeys a local paraconservation law. That is, entropy is “nearly” conserved.

By that we mean something very specific:

change(entropy inside boundary) >= - flow(entropy, outward across boundary)              (2)

 

The structure and meaning of equation 2 is very similar to equation 1, except that it has an inequality instead of an equality. It tells us that the entropy in a given region can increase, but it cannot decrease except by flowing into adjacent regions.

As usual, the local law implies a corresponding global law, but not conversely; see the discussion at the end of section 1.2.

Entropy is absolutely essential to thermodynamics ... just as essential as energy.

You can’t do thermodynamics without entropy.


Entropy is defined in terms of statistics, as we will discuss in a moment. There are (in some situations) important connections between entropy, energy, and temperature, but these do not define entropy. The first law (energy) and the second law (entropy) are logically independent. Entropy is well defined even when the temperature is unknown, undefinable, irrelevant, or zero.3 This is true and important.

Entropy is related to information. Essentially it is the opposite of information, as we see from the following scenarios.

2.2  Scenario: Cup Game

As shown in figure 1, suppose I have three blocks and five cups on a table.
To illustrate the idea of entropy, let’s play the following game: In the preliminary part of the game, you hide the blocks under the cups however you like, and, optionally, you tell me something about what you have done. As suggested in the figure, the cups are transparent, so you know exactly what is going on. However, the whole array is behind a screen, so I don’t know anything except what I’m told.

During the main part of the game, I am required to ascertain the position of each of the blocks. Since in this version of the game, there are five cups and three blocks, the answer can be written as a three-symbol string, such as 122, where the first symbol identifies the cup containing the red block, the second symbol identifies the cup containing the black block, and the third symbol identifies the cup containing the blue block. Each symbol is in the range zero through four inclusive. There are 5³ = 125 such strings. (More generally, in a version where there are N cups and B blocks, there are N^B possible states.)

I cannot see what’s inside the cups, but I am allowed to ask questions of an oracle who can see what’s inside. My score in the game is determined by the number of questions I ask; each yes/no question contributes one bit to my score. My objective is to finish the game with the lowest possible score.

  1. Example: If you tell me all three blocks are under cup #4, then my score is zero; I don’t have to ask any questions of the oracle.
  2. Example: If you hide the blocks randomly and don’t tell me anything, then if I’m smart my score is at worst 7 bits (and usually exactly 7 bits). That’s because 2⁷ = 128, which is slightly larger than the number of possible states. My minimax strategy is simple: I write down all the states in order, from 000 through 444 inclusive, and I ask the oracle questions of the following form: Is the actual state in the first half of the list? Is it in the first or third quarter? Is it in an odd-numbered eighth? After at most seven questions, I know exactly where the correct answer sits in the list.
  3. Example: You can give me partial information. If you hide the blocks randomly, but tell me that cup #4 happens to be empty, then my score will be six bits, since 2⁶ = 64 = 4³.
To calculate what my score will be, I don’t need to know anything about energy; all I have to do is count states (specifically, the number of states consistent with what I know about the situation). States are states; they are not energy states.
If you wish to make this sound more thermodynamical, you can assume that the table is horizontal, and the blocks are non-interacting, so that all possible configurations have the same energy. But really, it is easier to just say that, over a wide range of energies, energy has got nothing to do with this game.
The point of all this is that we define the entropy of a given situation according to the number of questions I have to ask to finish the game, starting from the given situation. Each yes/no question contributes one bit to the entropy.
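The scores quoted in the examples above can be checked by counting states and taking base-2 logarithms. Here is a minimal sketch (Python standard library only; the helper name score_bits is made up for illustration):

    import math

    def score_bits(num_states):
        """Worst-case number of yes/no questions needed to pin down one state
        out of num_states equally likely possibilities (minimax bisection)."""
        return math.ceil(math.log2(num_states))

    cups, blocks = 5, 3

    # Example 2: blocks hidden at random, no hints: 5^3 = 125 states.
    assert cups ** blocks == 125
    assert score_bits(cups ** blocks) == 7      # since 2^7 = 128 > 125

    # Example 3: told that cup #4 is empty: only 4^3 = 64 states remain.
    assert score_bits(4 ** blocks) == 6         # since 2^6 = 64

    # Example 1: told everything: exactly one possible state, zero questions.
    assert score_bits(1) == 0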

The central, crucial idea of entropy is that it measures how much I don’t know about the situation. Entropy is not knowing.

2.3  Scenario: Card Game

Here is a card game that illustrates the same points as the cup game. The only important difference is the size of the state space: roughly eighty million million million million million million million million million million million states, rather than 125 states. That is, the state space is bigger by a factor of 10⁶⁶ or so.

Consider a deck of 52 playing cards. By re-ordering the deck, we can create a large number (52 factorial) of different configurations. (For present purposes we choose not to flip or rotate the cards, just re-order them.)

In the preliminary phase of the game, you prepare the deck in a configuration of your choosing, by shuffling and/or artful re-arrangement. You optionally tell me something about the resulting configuration. During the main phase of the game, my task is to fully describe the configuration, i.e. to determine which card is on top, which card is second, et cetera. I cannot look at the cards, but I can ask questions of the oracle. Each yes/no question contributes one bit to my score. My objective is to ask as few questions as possible.

  1. Example: You put the deck in some agreed-upon reference configuration, and announce that fact. Then I don’t need to do anything, and my score is zero.
  2. Example: You put the deck in the reverse of the reference configuration, and announce that fact. I can easily tell which card is where. I don’t need to ask any questions, so my score is again zero.
  3. Example: You shuffle the deck thoroughly. You announce that, and only that. The deck could be in any of the 52 factorial different configurations. If I follow a sensible (minimax) strategy, my score will be 226 bits, since the base-2 logarithm of 52 factorial is slightly greater than 225.
  4. Example: You start with the reference configuration, then “cut” the deck; that is, you apply one of the 52 possible full-length cyclic permutations. You announce what procedure you have followed, but you do not divulge anything about the location of the cut. My best strategy is as follows: By asking six well-chosen questions, I can find out which card is on top. I can then easily describe every detail of the configuration. My score is six bits.
  5. Example: Same as above, but in addition to announcing your procedure you also announce what card is on top. My score is zero.
  6. Example: You announce that it is equally likely that you have either shuffled the deck completely or left it in the reference configuration. Then my score on average is only 114 bits, if I use the following strategy: I start by asking whether the deck is already in the reference configuration. That costs me one question, but half of the time it’s the only question I’ll need. The other half of the time, I’ll need 226 more questions to unshuffle the shuffled deck. The average of 1 and 227 is 114.
One configuration of the card deck corresponds to one microstate.

Note that we are not depending on any special properties of the “reference” state. For simplicity, we could agree that our reference state is the factory-standard state (cards ordered according to suit and number), but any other agreed-upon state would work just as well. If I know the deck is in Moe’s favorite state, I can easily rearrange it into Joe’s favorite state. Rearranging it from one known state to another known state does not involve any entropy.
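The numbers quoted in the card-game examples can be checked the same way. This is a sketch, not part of the original text; it relies only on the Python standard library.

    import math

    num_configs = math.factorial(52)

    # Example 3: a thoroughly shuffled deck.
    bits = math.log2(num_configs)
    print(bits)                    # ~225.58, so the minimax score is 226 questions
    assert math.ceil(bits) == 226

    # Size of the state space relative to the cup game (125 states):
    print(num_configs / 125)       # ~6.5e65, i.e. bigger by a factor of 10^66 or so

    # Example 4: a single cut of the deck leaves 52 possible states.
    assert math.ceil(math.log2(52)) == 6

    # Example 6: half the time 1 question suffices, half the time 1 + 226 are needed.
    print((1 + 227) / 2)           # 114 questions on average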

2.4  Discussion

2.4.1  States and Probabilities

Our notion of entropy is completely dependent on having a notion of microstate, and on having a procedure for assigning a probability to each microstate.
In some special cases, the procedure involves little more than counting the “allowed” microstates, as discussed in section 8.6. But keep in mind that mere counting does not suffice in the general case.
For simplicity, the cup game and the card game were arranged to embody a clear notion of microstate. That is, the rules of the game specified what situations would be considered the “same” microstate and what would be considered “different” microstates. Such games are a model that is directly and precisely applicable to physical systems where the physics is naturally discrete, such as systems involving only the nonclassical spin of elementary particles (such as the demagnetization refrigerator discussed in section 10.11).

For systems involving continuous variables such as position and momentum, counting the states is somewhat trickier. The correct procedure is discussed in section 11.2.

2.4.2  Entropy is Not Knowing

The point of all this is that the “score” in these games is an example of entropy. Specifically: at each point in the game, there are two numbers worth knowing: the number of questions I have already asked, and the number of questions I must ask to finish the game. The latter is what we call the entropy of the situation at that point.

Entropy is not knowing.
Entropy measures how much is not known about the situation.


At each point during the game, the entropy depends hardly at all on the actual configuration of the cards or blocks. This makes the definition of entropy somewhat context-dependent or even subjective. Some people find this irksome or even shocking, but it is real physics. For physical examples of context-dependent entropy, and a discussion, see section 11.7.

2.4.3  Entropy versus Energy

Note that entropy has been defined without reference to temperature and without reference to heat. Room temperature is equivalent to zero temperature for purposes of the cup game and the card game; theoretically there is “some” chance that thermal agitation will cause two of the cards to spontaneously hop up and exchange places during the game, but that is really, really negligible.

Non-experts often try to define entropy in terms of energy. This is a mistake. To calculate the entropy, I don’t need to know anything about energy; all I need to know is the probability of each relevant state. See section 2.5 for details on this.

States are states;
they are not energy states.


Entropy is not defined in terms of energy, nor vice versa.

In some cases, there is a simple mapping that allows us to identify the ith microstate by means of its energy Ei. It is often convenient to exploit this mapping when it exists, but it does not always exist.

For more about how to connect entropy to thermal energy, see section 8.

2.4.4  Entropy versus Disorder

There is another group of non-experts who try to define entropy in terms of disorder. This is another mistake; it is perhaps all the more disruptive because it is in some ways “close” to the truth. We can agree that the number of disorderly states greatly exceeds the number of orderly states ... so if all you know is that the system is not in an orderly state, you know the entropy is high. In contrast, though, the important point is that if you know the system is in some particular disorderly state, the entropy is zero. If you know what state the system is in, it doesn’t matter whether that state “looks” disorderly or not.

Furthermore, there are additional reasons why the typical text-book illustration of a messy dorm room is not a good model of entropy. For starters, it provides no easy way to define and delimit the states. Even if we stipulate that the tidy state is unique, we still don’t know whether a shirt on the floor “here” is different from a shirt on the floor “there”. If we don’t know how many different disorderly states there are, we can’t quantify the entropy. (In contrast the games in section 2.2 and section 2.3 included a clear rule for defining and delimiting the states.)

2.4.5  False Dichotomy

There is a long-running holy war between those who try to define entropy in terms of energy, and those who try to define it in terms of disorder. This is based on a grotesquely false dichotomy: If entropy-as-energy is imperfect, then entropy-as-disorder must be perfect ... or vice versa. I don’t know whether to laugh or cry when I see this. Actually, both versions are highly imperfect. You might get away with using one or the other in selected situations, but not in general.

The right way to define entropy is in terms of probability, as we now discuss. (The various other notions can then be understood as special cases and/or approximations to the true entropy.)

2.5  Quantifying Entropy

If we have a system characterized by a probability distribution P, the entropy is given by

S := ∑i Pi log(1/Pi)              (3)

where Pi is the probability that the system is in the ith microstate. This is the official definition of entropy. This is the gold standard. Other expressions may be useful in special cases (as in section 8.6 for example) but you can never go wrong using equation 3.

In the games discussed above, it was convenient to measure entropy in bits, because I was asking yes/no questions. Other units are possible, as discussed in section 8.5.

Figure 2 shows the contribution to the entropy from one term in the sum in equation 3. Its maximum value is approximately 0.53 bits, attained when Pi=1/e.

Figure 2: - Pi log Pi -- One Term in the Sum


Figure 3 shows the total entropy for a two-state system such as a coin. Here P represents the probability of the “heads” state, which gives us one term in the sum. The “tails” state necessarily has probability (1-P) and that gives us the other term in the sum. The total entropy in this case is a symmetric function of P. Its maximum value is 1 bit, attained when P=½.

Figure 3: Total Entropy -- Two-State System
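The values quoted for figure 2 and figure 3 follow directly from equation 3. Here is a minimal sketch (the helper name entropy_bits is made up for illustration):

    import math

    def entropy_bits(probs):
        """Entropy of a distribution, per equation 3, measured in bits."""
        return sum(p * math.log2(1 / p) for p in probs if p > 0)

    # One term in the sum (figure 2): -P log2(P) peaks near 0.53 bits at P = 1/e.
    P = 1 / math.e
    print(P * math.log2(1 / P))            # ~0.5307 bits

    # Two-state system (figure 3): the maximum entropy is 1 bit at P = 1/2.
    print(entropy_bits([0.5, 0.5]))        # 1.0 bit
    print(entropy_bits([0.9, 0.1]))        # ~0.47 bits, less than the maximum

    # A state known with certainty contributes zero entropy.
    print(entropy_bits([1.0]))             # 0.0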


As discussed in section 8.5 the base of the logarithm in equation 3 is chosen according to what units you wish to use for measuring entropy. If you choose units of joules per kelvin (J/K), we can pull out a factor of Boltzmann’s constant and rewrite the equation as:

S := k ∑i Pi ln(1/Pi)              (4)

Entropy itself is conventionally represented by big S and is an extensive property, with rare peculiar exceptions as discussed in section 11.7. Molar entropy is conventionally represented by small s and is the corresponding intensive property.

Although it is often convenient to measure molar entropy in units of J/K/mol, other units are allowed, for the same reason that mileage is called mileage even when it is measured in metric units. In particular, sometimes additional insight is gained by measuring molar entropy in units of bits per particle. See section 8.5 for more discussion of units.

When discussing a chemical reaction using a formula such as

2 O₃ → 3 O₂ + Δ s              (5)

it is common to speak of “the entropy of the reaction” but properly it is “the molar entropy of the reaction” and should be written Δ s (not Δ S). All the other terms in the formula are intensive, so the entropy-related term must be intensive also.

Of particular interest is the standard molar entropy, s0, measured at standard temperature and pressure. The entropy of a gas is strongly dependent on pressure, as mentioned in section 11.2.
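As a worked example (a sketch using approximate tabulated standard molar entropies; these numerical values are assumptions quoted to a few significant figures, not values taken from this document), here is the molar entropy of the reaction in equation 5, together with the same molar entropies re-expressed in bits per particle:

    import math

    R = 8.314                    # gas constant, J/(K mol)

    # Approximate tabulated standard molar entropies (assumed values, J/(K mol)):
    s_O2 = 205.2
    s_O3 = 238.9

    # Molar entropy of the reaction 2 O3 -> 3 O2 (equation 5):
    delta_s = 3 * s_O2 - 2 * s_O3
    print(delta_s)                             # roughly 138 J/(K mol)

    # The same molar entropies expressed in bits per particle:
    #   s[J/(K mol)] = (bits per particle) * R * ln(2)
    to_bits = lambda s: s / (R * math.log(2))
    print(to_bits(s_O2), to_bits(s_O3))        # roughly 36 and 41 bits per molecule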

2.6  Surprise Value

If we have a system characterized by a probability distribution P, the surprise value of the ith state is given by

$i := log(1/Pi)              (6)

By comparing this with equation 3, it is easy to see that the entropy is simply the average of the surprise value.

Note the following contrast:

Surprise value is a property of the state i.   Entropy is not a property of the state i; it is a property of the distribution P.

This should make it obvious that entropy is not, by itself, the solution to all the world’s problems. Entropy measures a particular average property of the distribution. It is easy to find situations where other properties of the distribution are worth knowing.
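To make the contrast concrete, here is a tiny sketch (with a made-up distribution) showing that the entropy of equation 3 is just the probability-weighted average of the surprise values of equation 6:

    import math

    probs = [0.5, 0.25, 0.125, 0.125]                       # a made-up distribution

    surprise = [math.log2(1 / p) for p in probs]            # equation 6, in bits
    entropy = sum(p * s for p, s in zip(probs, surprise))   # weighted average

    print(surprise)   # [1.0, 2.0, 3.0, 3.0] -- a property of each state
    print(entropy)    # 1.75 bits            -- a property of the distribution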

3  Basic Concepts (Zeroth Law)

There are a bunch of basic notions that are often lumped together and called the zeroth law of thermodynamics. These notions are incomparably less fundamental than the notion of energy (the first law) and entropy (the second law), so despite its name, the zeroth law doesn’t deserve priority.

Here are some oft-cited rules, and some comments on each.

We can divide the world into some number of regions that are disjoint from each other.   If there are only two regions, some people like to call one of them “the” system and call the other “the” environment, but usually it is better to consider all regions on an equal footing. Regions are sometimes called systems and/or subsystems. Systems are sometimes called objects, especially when they are relatively simple.

There is such a thing as thermal equilibrium.   You must not assume that everything is in thermal equilibrium. Thermodynamics and indeed life itself depend on the fact that some regions are out of equilibrium with other regions.

There is such a thing as temperature.   There are innumerable important examples of systems that lack a well-defined temperature, such as the three-state laser discussed in section 10.4.

Whenever any two systems are in equilibrium with each other, they have the same temperature. See section 9.   This is true and important. (To be precise, we should say they have the same average temperature, since there will be fluctuations, which may be significant for very small systems.)

We can establish equilibrium within a system, and equilibrium between selected pairs of systems, without establishing equilibrium between all systems.   This is an entirely nontrivial statement. Sometimes it takes a good bit of engineering to keep some pairs near equilibrium and other pairs far from equilibrium. See section 10.12.

If/when we have established equilibrium within a system, a few variables suffice to entirely describe the thermodynamic state (i.e. macrostate) of the system.4 (See section 11.1 for a discussion of microstate versus macrostate.)   This is an entirely nontrivial statement, and to make it useful you have to be cagey about what variables you choose; for instance,
  • Knowing the temperature and pressure of a parcel of ice gives you more-or-less a complete description of the thermodynamic state of the ice.
  • Knowing the temperature and pressure of a parcel of liquid water gives you more-or-less a complete description of the thermodynamic state of the water.
  • Meanwhile, in contrast, knowing the temperature and pressure of an ice/water mixture does not fully determine the thermodynamic state, because you don’t know what fraction is ice and what fraction is water.

4  Low-Temperature Entropy (Alleged Third Law)

As mentioned in the introduction, one sometimes hears the assertion that the entropy of a system must go to zero as the temperature goes to zero.

There is no theoretical basis for this assertion, so far as I know -- just unsubstantiated opinion.

As for experimental evidence, I know of only one case where (if I work hard enough) I can make this statement true, while there are innumerable cases where it is not true:

  • There is such a thing as a spin glass. It is a solid, with a spin at every site. At low temperatures, these spins are not lined up; they are highly disordered. And there is a large potential barrier that prevents the spins from flipping. So for all practical purposes, the entropy of these spins is frozen in. The molar entropy involved is substantial, on the order of one J/K/mole. You can calculate the amount of entropy based on measurements of the magnetic properties.
  • A chunk of ordinary glass (e.g. window glass) has a considerable amount of frozen-in entropy, due to the disorderly spatial arrangement of the glass molecules. That is, glass is not a perfect crystal. Again, the molar entropy is quite substantial. It can be measured by X-ray scattering and neutron scattering experiments.
  • For that matter, it is proverbial that perfect crystals do not occur in nature. This is because it is energetically more favorable for a crystal to grow at a dislocation. Furthermore, the materials from which the crystal was grown will have chemical impurities, not to mention a mixture of isotopes. So any real crystal will have frozen-in nonuniformities. The molar entropy might be rather less than one J/K/mole, but it won’t be zero.
  • If I wanted to create a sample where the entropy went to zero in the limit of zero temperature, I would proceed as follows: Start with a sample of helium. Cool it to some very low temperature. The superfluid fraction is a single quantum state, so it has zero entropy. But the sample as a whole still has nonzero entropy, because 3He is quite soluble in 4He (about 6% at zero temperature), and there will always be some 3He around. To get rid of that, pump the sample through a superleak, so the 3He is left behind. (Call it reverse osmosis if you like.) Repeat this as a function of T. As T goes to zero, the superfluid fraction goes to 100% (i.e. the normal-fluid fraction goes to 0%) so the entropy, as far as I know, would go to zero asymptotically.
Note: It is hard to measure the low-temperature entropy by means of elementary thermal measurements, because typically such measurements are insensitive to “spectator entropy” as discussed in section 11.5.

5  The Rest of Physics, Chemistry, etc.

The previous sections have set forth the conventional laws of thermodynamics, cleaned up and modernized as much as possible.

At this point you may be asking, why do these laws call attention to conservation of energy, but not the other great conservation laws (momentum, electrical charge, lepton number, et cetera)? And for that matter, what about all the other physical laws, the ones that aren’t expressed as conservation laws? Well, you’re right, there are some quite silly inconsistencies here.

The fact of the matter is that in order to do thermo, you need to import a great deal of classical mechanics. You can think of this as the minus-oneth law of thermodynamics.

  • This includes Newton’s third law (which is tantamount to conservation of momentum) and Newton’s second law, with the associated ideas of force, mass, acceleration, et cetera. Note that the concept of pseudowork, which shows up in some thermodynamic discussions, is more closely related to momentum than to energy.
  • In particular, this includes the notion of conservation of energy, which is a well-established part of nonthermal classical mechanics. From this we conclude that the first law of thermodynamics is redundant and should, logically, be left unsaid (although it remains true and important).
  • If you are going to apply thermodynamics to an electrical or magnetic system, you need to import the laws of electromagnetism.
  • If you are going to apply thermodynamics to a chemical system, you need to import the fundamental notions of chemistry. This includes the notion that atoms exist and are unchanged by ordinary chemical reactions (which merely defines what we mean by a “chemical” as opposed to “nuclear” reaction). This implies dozens of additional approximate5 conservation laws, one for each type of atom.
Sometimes the process of importing a classical idea into the world of thermodynamics is trivial, and sometimes not. For example, we believe that the law of conservation of momentum would be guaranteed valid if we applied it by breaking a complex object into its elementary components, applying the law to each component separately, and summing the various contributions. That’s fine, but nobody wants to do it that way. In the spirit of thermodynamics, we would prefer a macroscopic law. That is, we would like to be able to measure the overall mass of the object (M), measure its average velocity (V), and from that compute a macroscopic momentum (MV) obeying the law of conservation of momentum. In fact this macroscopic approach works fine, and can fairly easily be proven to be consistent with the microscopic approach. In contrast, however, the corresponding process of formulating a macroscopic notion of kinetic energy is nontrivial, and indeed annoyingly ambiguous, as discussed in section 17.4 and reference 7.
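The contrast in the last paragraph can be made concrete with a small numerical sketch (made-up particle data): summing the microscopic momenta gives exactly the macroscopic momentum MV, whereas summing the microscopic kinetic energies gives more than ½MV², the difference being the internal (roughly speaking, thermal) kinetic energy.

    import random

    random.seed(0)
    m = [1.0] * 1000                                  # particle masses (made up)
    # Random "thermal" velocities superimposed on a common drift velocity:
    v = [random.gauss(0.0, 1.0) + 5.0 for _ in m]

    M = sum(m)
    V = sum(mi * vi for mi, vi in zip(m, v)) / M      # center-of-mass velocity

    p_micro = sum(mi * vi for mi, vi in zip(m, v))    # sum of microscopic momenta
    ke_micro = sum(0.5 * mi * vi**2 for mi, vi in zip(m, v))

    print(p_micro, M * V)              # identical: macroscopic momentum MV works
    print(ke_micro, 0.5 * M * V**2)    # differ: the leftover is internal
                                       # (roughly "thermal") kinetic energy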

6  Thermal and Nonthermal Energy

It will sometimes be useful to distinguish thermal energy from other forms of energy. Obviously such a distinction does not change the total amount of energy, so we can write an equation

Eplain = Enonthermal + Ethermal   ...   if the latter exists              (7)

We call this the first lemma of thermodynamics. This lemma is not fundamental and is not always applicable, since it is predicated on the premise that we can tell the difference between thermal and nonthermal energy. This premise may or may not hold in cases of interest. When in doubt, forget about equation 7 and put your trust in the first law (conservation of energy, equation 1) and the second law (paraconservation of entropy, equation 2).

Examples where you cannot separate the energy into thermal and nonthermal parts are discussed in section 10.4. See section 8 for more about thermal and nonthermal energy. See section 7.6 for warnings about how not to interpret equation 7. Also see section 13.4 for further reasons why the notion of “thermodynamically available energy” is problematic.

The LHS of equation 7 is just the plain old energy. It is sometimes referred to as the “common” or “ordinary” energy, which means the same thing. It is sometimes called the “combined” or “total” energy, referring to the combination of thermal and nonthermal contributions. It is very often called the “internal” energy, which is a slightly odd way of emphasizing that it is the energy of the system in question, not the energy of the whole universe. In most contexts you can just call it the energy, with no adjectives at all. Many thermodynamics books use the symbol U for the energy, but in other thermo books -- and everywhere else in physics -- the symbol E is used, and we will use E in this document.

Warning: It would be a monstrous mistake to think that the law of conservation of energy could be applied to thermal energy separately or to nonthermal energy separately. See section 7.6 and section 10.5.

Although there are some special cases where thermal energy happens to flow without changing, this is by no means a general law. As a counterexample, consider a parcel of warm air rising in the atmosphere. It expands and cools as it goes.

  • The air itself flows from place to place.
  • The entropy flows from place to place.
  • The energy flows from place to place.
On the other hand:
  • It is typically not true that the thermal energy flows from place to place, or that the nonthermal energy flows from place to place. In this example (and many others), you have energy that disappears from one region in thermal form and reappears in an adjacent region in nonthermal form.
  • In particular, it is typically not true that the air itself, the entropy, and the energy flow together at the same rate. That might be true in some cases, but not in general. You could have energy flowing in one direction (in the form of sound-waves in the air) while the air itself is moving in another direction, and while entropy is flowing in a third direction (via thermal conduction due to a temperature gradient).
See section 10.10 for some related discussion of cooling by expansion.

As another example, dissipative processes such as sliding friction or viscous flow will convert nonthermal energy to thermal energy. See section 10.5 and section 10.6 for examples.

As a third example, any process that involves mixing will change the proportion of thermal and nonthermal energy. See section 10.7 for more on this.

It is a very common mistake to overconcentrate on “transfer of thermal energy” or “thermal transfer of energy” as if that were the cornerstone of thermodynamics. Yes, sometimes those ideas are important. For instance, a thermally-insulating pushrod performs an entirely nonthermal transfer of energy, while a nonmoving heat exchanger performs an entirely thermal transfer of energy. But pushrods and heat exchangers are carefully engineered objects. You cannot take them for granted. They are not representative of the general case.

Remember that energy and entropy are primary and fundamental. Heat and work suffer from multiple inconsistent definitions. In some situations, heat and work may be convenient for keeping track of some contributions to the energy budget and entropy budget. In other situations, your best strategy is to forget about heat and work, and rely directly on energy and entropy.

7  The W + Q Equation

7.1  Partial Derivatives

Let’s build up a scenario, based on some universal facts plus some scenario-specific assumptions.

We know that the energy of the system is well defined. Similarly we know the entropy of the system is well defined. These aren’t assumptions. Every system has energy and entropy.

Next, as a hypothesis of this scenario, we assume that the system has a well-defined thermodynamic state, i.e. macrostate. This macrostate can be represented as a point in some abstract state-space. At each point in macrostate-space, the macroscopic quantities we are interested in (energy, entropy, pressure, volume, temperature, etc.) take on well-defined values.

We further assume that this macrostate-space has dimensionality M, and that M is not very large. (This M may be larger or smaller than the dimensionality D of the position-space we live in, namely D=3.)

Assuming a well-behaved thermodynamic state is a highly nontrivial assumption.

  • As an example where these assumptions are valid, consider the hackneyed example of the ideal gas in equilibrium in a cylinder, where the macrostate is determined by a few variables such as volume, temperature, and number of particles.
  • As a more interesting example, consider a heat-flow experiment. We have a metal bar that is kept hot at one end and cold at the other end. This is obviously a non-equilibrium situation, and the heat-flow is obviously irreversible. Yet at each point in the bar, there is a well-defined local temperature, a well-defined local energy density, et cetera. As far as I know, all the assumptions we have made so far hold just fine.
  • As a challenging but not hopeless intermediate case, consider a thermal distribution with a few exceptions, as discussed in section 10.3. In this case, our macrostate space must include additional variables to quantify the excitation of the exceptional modes. These variables will show up as additional dimensions in the vector V or as additional explicit terms in a generalization of equation 11.
  • As a more problematic example, consider turbulent flow. The motion is chaotic, and the closer you look the more chaos you see. In general, this topic is beyond the scope of this discussion. However, depending on what questions you are asking, it may be possible to average over space and/or average over time so as to establish a well-behaved notion of local temperature in the fluid.
  • As an even more problematic example, suppose you have just set off a firecracker inside a cylinder of gas. This is even farther beyond the scope of this discussion. The system will be chaotic and far from equilibrium. It is also nonuniform in space and time, so averaging is problematic (although perhaps not impossible). A great number of modes will be excited. Describing the macrostate of the system will require a tremendous number of variables, so much so that describing the macrostate might be almost as laborious as describing the microstate.
We further assume that the quantities of interest vary smoothly from place to place in macrostate-space.
We must be careful how we formalize this “smoothness” idea. By way of analogy, consider a point moving along a great-circle path on a sphere. This path is nice and smooth, by which we mean differentiable. We can get into trouble if we try to describe this path in terms of polar coordinates, because the coordinate system is singular at the poles. This is a problem with the coordinate system, not with the path itself. To repeat: a great-circle route that passes over the pole is differentiable, but its representation in polar coordinates is not.

Applying this idea to thermodynamics, consider an ice/water mixture at constant pressure. The temperature is a smooth function of the energy content, whereas the energy-content is not a smooth function of temperature. I recommend thinking in terms of an abstract point moving in macrostate-space. Both T and E are well-behaved functions, with definite values at each point in macrostate-space. We get into trouble if we try to parameterize this point using T as one of the coordinates, but this is a problem with the coordinate representation, not with the abstract space itself.

We will now choose a particular set of variables as a basis for specifying points in macrostate-space. We will temporarily focus on a certain special set, but we are not wedded to it. As one of our variables, we choose S, the entropy. The remaining variables we will collectively call V, which is a vector with M-1 dimensions. In particular, we choose the macroscopic variable V in such a way that the microscopic energy Ei of the ith microstate is determined by V. (For an ideal gas in a box, V is just the volume of the box.)

Given these assumptions, we can write:

dE = (∂E/∂V)|S dV + (∂E/∂S)|V dS              (8)
which is just the chain rule for differentiating a function of two variables. More elaborate versions of this will be discussed in section 17.1.

In some quarters it is conventional to define

P := - (∂E/∂V)|S              (9)

and

T := (∂E/∂S)|V              (10)

You might say this is just terminology, just a definition of T ... but we need to be careful because there are also other definitions of T floating around. Eventually we will confirm that the various definitions are consistent.

Given this terminology, we can rewrite equation 8 in the following widely-used form:

dE = -P dV + T dS              (11)

 

Similarly, if we choose to define

w := - P dV              (12)

and

q := T dS              (13)

that’s fine; that’s just terminology. Note that w and q are one-forms, not scalars, as discussed in section 7.6. They are functions of state, i.e. uniquely determined by the thermodynamic state.6 Using these definitions of w and q we can write

dE = w + q              (14)

which is fine so long as we don’t misinterpret it. However you should keep in mind that equation 14 and its precursors are very commonly misinterpreted. In particular, it is tempting to interpret w as “work” and q as “heat”, which is either a good idea or a bad idea, depending on which of the various mutually-inconsistent definitions of “work” and “heat” you happen to use. See section 16 and section 17.1 for details.

You should also keep in mind that these equations (equation 8, equation 11 and/or equation 14) do not represent the most general case. An important generalization is mentioned in section 7.3.
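Before moving on, here is a numerical sanity check of equation 11 (a sketch only, using the monatomic ideal gas as an assumed example, with E = (3/2)NkT and the Sackur-Tetrode entropy up to an additive constant that cancels out of dS):

    import math

    # Monatomic ideal gas with N particles (an assumed toy example; the
    # constants are chosen for convenience rather than realism).
    N, k = 1.0e22, 1.380649e-23

    def E(T):
        return 1.5 * N * k * T                    # energy of a monatomic ideal gas

    def P(T, V):
        return N * k * T / V                      # ideal-gas pressure

    def S(T, V):
        # Sackur-Tetrode entropy up to an additive constant (which cancels in dS)
        return N * k * (1.5 * math.log(T) + math.log(V))

    T1, V1 = 300.0, 1.0e-3
    T2, V2 = 300.3, 1.001e-3                      # a small step in macrostate-space

    dE_direct = E(T2) - E(T1)
    dE_eq11 = -P(T1, V1) * (V2 - V1) + T1 * (S(T2, V2) - S(T1, V1))

    print(dE_direct, dE_eq11)                     # agree to first order in the step
    assert abs(dE_direct - dE_eq11) / abs(dE_direct) < 1e-2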

Recall that we are not wedded to using (V,S) as our basis in macrostate space. As an easy but useful change of variable, consider the case where V = XYZ, in which case we can expand equation 8 as:

dE   =   (∂E/∂X)|Y,Z,S dX + (∂E/∂Y)|Z,X,S dY + (∂E/∂Z)|X,Y,S dZ + (∂E/∂S)|X,Y,Z dS
     =   - YZP dX - ZXP dY - XYP dZ + T dS
     =   - FX dX - FY dY - FZ dZ + T dS              (15)
where we define the forces FX, FY, and FZ as directional derivatives of the energy: FX := -∂ E / ∂ X |Y,Z,S and similarly for the others.

As a less-trivial change of variable, now that we have introduced the T variable, we can write

dE = (∂E/∂V)|T dV + (∂E/∂T)|V dT              (16)
assuming things are sufficiently differentiable.

The derivative in the second term on the RHS is called the heat capacity (at constant volume); that is:

CV := (∂E/∂T)|V              (17)

assuming the RHS exists. (This is a nontrivial assumption; the heat capacity is singular near a first-order phase transition such as the ice/water transition.)

The other derivative on the RHS of equation 16 doesn’t have a name so far as I know. It is identically zero for an ideal gas (but not in general).

Using the chain rule, we can find a useful expression for the heat capacity (at constant volume) in terms of entropy:

CV = T (∂S/∂T)|V              (18)

This equation is particularly useful in reverse, as a means for finding the entropy. Just measure CV, divide by temperature, and integrate with respect to temperature along a contour of constant volume. This tells you ΔS (the change in entropy) along the path of integration.
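Here is a minimal sketch of that procedure (assuming, for simplicity, a constant CV such as 3/2 R for one mole of monatomic ideal gas; the numbers are illustrative only):

    import math

    # Sketch: recover Delta-S from heat-capacity data by integrating Cv/T dT
    # along a contour of constant volume. Here Cv is assumed constant,
    # which holds for an ideal gas over a modest temperature range.
    Cv = 12.47                      # J/K, i.e. (3/2) R for one mole of monatomic gas
    T1, T2 = 200.0, 400.0

    # Numerical integration of Cv/T over temperature (midpoint rule):
    steps = 10000
    dT = (T2 - T1) / steps
    delta_S = sum(Cv / (T1 + (i + 0.5) * dT) * dT for i in range(steps))

    print(delta_S)                  # ~8.64 J/K
    print(Cv * math.log(T2 / T1))   # exact result for constant Cv: Cv ln(T2/T1)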

7.2  Integration

Let’s continue to assume that T and P are functions of state, and that S and V suffice to span the macrostate-space.

Then, in cases where equation 11 is valid, we can integrate both sides to find E. This gives us an expression for E as a function of V and S alone (plus a constant of integration that has no physical significance). Naturally, this expression is more than sufficient to guarantee that E is a function of state.

Things are much messier if we try to integrate only one of the terms on the RHS of equation 11. Without loss of generality, let’s consider the T dS term. We integrate T dS along some path Γ. Let the endpoints of the path be A and B.

It is crucial to keep in mind that the value of the integral depends on the chosen path -- not simply on the endpoints. It is OK to write things like

QΓ := ∫Γ T dS              (19)

whereas it would be quite unacceptable to write

(anything) = ∫ T dS              (20)
I recommend writing QΓ rather than Q, to keep the path-dependence completely explicit. This QΓ exists only along the low-dimensional subspace defined by the path Γ, and cannot be extended to cover the whole thermodynamic state-space. That’s because T dS is a non-exact one-form. See section 7.6 for more about this.

7.3  Advection

Equation 11 is predicated on the assumption that the energy is known as a function of V and S alone ... which is not the most general case. As an important generalization, consider the energy budget of a typical automobile. The most-common way of increasing the energy within the system is to transfer fuel (and oxidizer) across the boundary of the system. This is an example of advection of energy. This contributes to dE, but is not included in PdV or TdS. So we should write something like:

dE = -P dV + T dS + advection              (21)

 

It is possible to quantify the advection mathematically. Simple cases are easy. The general case would lead us into a discussion of fluid dynamics, which is beyond the scope of this document.

7.4  Deciding What’s True

Having derived results such as equation 11 and equation 21, we must figure out how to interpret the terms on the RHS. Please consider the following notions and decide which ones are true:
  1. Heat is defined to be TdS (subject to the usual restrictions, discussed in section 7.1).
  2. Heat is defined to be “energy that is transferred from one body to another as the result of a difference in temperature”.
  3. The laws of thermodynamics apply even when irreversible processes are occurring.
It turns out that these three notions are mutually contradictory. You have to get rid of one of them, for reasons detailed in section 16 and section 15.

As a rule, you are allowed to define your terms however you like. However, if you want a term to have a formal, well-defined meaning,

  • Each term must be defined only once, and
  • You must stick to a well-known unambiguous meaning, and/or clearly explain what definition you are using.
The problem is, many textbooks don’t play by the rules. On some pages they define heat to be TdS, on some pages they define it to be flow across a boundary, and on some pages they require thermodynamics to apply to irreversible processes.

This is an example of boundary/interior inconsistency, as discussed in section 15.

The result is a shell game: There’s a serious problem, but nobody can pin down the location of the problem.

This results in endless confusion. Indeed, sometimes it results in holy war between the Little-Endians and the Big-Endians: Each side is 100% convinced that their definition is “right”, and therefore the other side must be “wrong”. (Reference 24.) I will not take sides in this holy war. Viable alternatives include:

  1. Pick one definition of heat. Explicitly say which definition you’ve chosen, and use it consistently. Recognize that others may choose differently.
  2. Go ahead and use the term informally, with multiple inconsistent meanings, as many experts do. Just don’t pretend you’re being consistent when you’re not. Use other terms and concepts (e.g. energy and entropy) when you need to convey a precise meaning.
  3. Avoid using the term “heat” any more than necessary. Focus attention on other terms and concepts (e.g. energy and entropy).
For more on this, see the discussion near the end of section 7.5.

7.5  Deciding What’s Fundamental

It is not necessarily wise to pick out certain laws and consider them “axioms” of physics. As Feynman has eloquently argued in reference 2, real life is not like high-school geometry, where you were given a handful of axioms and expected to deduce everything from that. In the real world, every fact is linked to many other facts in a grand tapestry. If a hole develops in the tapestry, you can re-weave it starting from the top of the hole, or the bottom, or either side. That is to say, if you forget one particular fact, you can re-derive it in many different ways.

In this spirit, some folks may wish to consider equation 1 and equation 14 as being equally axiomatic, or equally non-axiomatic. One can be used to re-derive the other, with the help of other facts, subject to certain limitations.

On the other hand, some facts are more useful than others. Some are absolutely central to our understanding of the world, while others are less so. Some laws are more worth discussing and remembering, while others are less so. Saying that something is true and useful does not make it fundamental; the expression 1+2+3+4=10 is true and sometimes useful, but it isn’t very fundamental, because it lacks generality.

Deciding which laws to emphasize is to some extent a matter of taste, but one ought to consider such factors as simplicity and generality, favoring laws with a large number of predictions and a small number of exceptions.

In my book, energy conservation (equation 1) is fundamental. From that, plus a couple of restrictions, we can derive equation 14 using calculus. Along the way, the derivation gives us important information about how w and q should be interpreted. It’s pretty clear what the appropriate restrictions are.

If you try to go the other direction, i.e. from w+q to conservation of energy, you must start by divining the correct interpretation of w and q. The usual “official” interpretations are questionable to say the least, as discussed in section 10.6 and section 15. Then you have to posit suitable restrictions and do a little calculus. Finally, if it all works out, you end up with an unnecessarily restrictive version of the local energy-conservation law.

Even in the best case I have to wonder why anyone would bother with the latter approach. I would consider such a derivation as being supporting evidence for the law of local conservation of energy, but not even the best evidence.

I cannot imagine why anyone would want to use equation 14 or equation 21 as “the” first law of thermodynamics. Instead, I recommend using the local law of conservation of energy ... which is simpler, clearer, more fundamental, more powerful, and more general.

It’s not at all clear that thermodynamics should be formulated in quasi-axiomatic terms, but if you insist on having a “first law” it ought to be a simple, direct statement of local conservation of energy. If you insist on having a “second law” it ought to be a simple, direct statement of local paraconservation of entropy.

Another way to judge equation 14 is to ask to what extent it describes this-or-that practical device. Two devices of the utmost practical importance are the thermally-insulating pushrod and the ordinary nonmoving heat exchanger. The pushrod transfers energy and momentum (but no entropy) across the boundary, while the heat exchanger transfers energy and entropy (but no momentum) across the boundary.

It is traditional to describe these devices in terms of work and heat, but it is not necessary to do so, and I’m not convinced it’s wise. As you saw in the previous paragraph, it is perfectly possible to describe them in terms of energy, momentum, and entropy, which are the true coin of the realm, the truly primary and fundamental physical quantities. Heat and work are secondary at best (even after you have resolved the nasty inconsistencies discussed in section 7.4 and section 15).

Even if/when you can resolve dE into a -PdV term and a TdS term, that doesn’t mean you must do so. In many cases you are better off keeping track of E by itself, and keeping track of S by itself. Instead of saying no heat flows down the pushrod, it makes at least as much sense to say that no entropy flows down the pushrod. Keeping track of E and S is more fundamental, as you can see from the fact that energy and entropy can be exchanged between systems that don’t even have a temperature (section 10.4).

When in doubt, rely on the fundamental laws: conservation of energy, conservation of momentum, paraconservation of entropy, et cetera.

7.6  Non-Exact One-Forms

Sometimes people who are trying to write equation 8 or equation 14 instead write something like

dE = dW + dQ         (allegedly)              (22)

which is deplorable.

Using the language of differential forms, the situation can be understood as follows:

  • E is a scalar state-function.
  • V is a scalar state-function.
  • S is a scalar state-function.
  • P is a scalar state-function.
  • T is a scalar state-function.
  • dE is an exact one-form state-function.
  • dS is an exact one-form state-function.
  • dV is an exact one-form state-function.
  • w := PdV is in general a non-exact one-form state-function.
  • q := TdS is in general a non-exact one-form state-function.
  • There is in general no state-function W such that w = dW.
  • There is in general no state-function Q such that q = dQ.
where in the last four items, we have to say “in general” because exceptions can occur in peculiar cases, mainly cases that are so low-dimensional that it is not possible to construct a heat engine. Such exceptions are very unlike the general case, and not worth much discussion beyond what was said in conjunction with equation 19. When we say something is a state-function we mean it is a function of the thermodynamic state. The last two items follow immediately from the definition of exact versus non-exact.
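
To see what the exactness criterion amounts to in practice: on a simply-connected region of the (T,S) plane, a one-form M dT + N dS is exact if and only if ∂M/∂S = ∂N/∂T. Here is a small Python sketch (using sympy; my own illustration, not essential to the argument) that applies this test to dS and to TdS:

  import sympy as sp

  T, S = sp.symbols('T S', positive=True)

  # Write each one-form as  M dT + N dS  and test whether dM/dS equals dN/dT.
  def is_exact(M, N):
      return sp.simplify(sp.diff(M, S) - sp.diff(N, T)) == 0

  print(is_exact(sp.Integer(0), sp.Integer(1)))   # dS   ->  True  (exact)
  print(is_exact(sp.Integer(0), T))               # TdS  ->  False (not exact)
  print(is_exact(sp.Integer(0), sp.Integer(5)))   # 5dS  ->  True  (exact: it is d(5S))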

Figure 4 shows the difference between an exact one-form and an inexact one-form.

As you can see on the left side of the figure, the quantity dS is exact. If you integrate clockwise around the loop as shown, the net number of upward steps is zero. This is related to the fact that we can assign an unambiguous height (S) to each point in (T,S) space.   In contrast, as you can see on the right side of the diagram, the quantity TdS is not exact. If you integrate clockwise around the loop as shown, there are considerably more upward steps than downward steps. There is no hope of assigning a height “Q” to points in (T,S) space.

dS-TdS
Figure 4: dS is Exact, TdS is Not


For details on the properties of one-forms, see reference 4 and perhaps reference 5.

The difference between exact and inexact has important consequences for practical situations such as heat engines. Even if we restrict attention to reversible situations, we still cannot think of Q as a function of state, for the following reasons: You can define any number of functions Q1, Q2, ⋅⋅⋅ by integrating TdS along some paths Γ1, Γ2, ⋅⋅⋅ of your choosing. Each such Qi can be interpreted as the total heat that has flowed into the system along the specified path. As an example, let’s choose path Γ6 to be the macrostate-trajectory of a heat engine. Let Q6(N) be the value of Q6 at the end of the Nth cycle. We see that even after specifying the path, Q6 is still not a state function, because at the end of each cycle, all the state functions return to their initial values, whereas Q6(N) grows linearly with N. This proves that in any situation where you can build a heat engine, q is not equal to d(anything).
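
You can verify the arithmetic of this argument with a toy calculation. The following Python sketch (with made-up temperatures and entropies) integrates dS and TdS around one rectangular cycle in the (T,S) plane, i.e. two isotherms joined by two constant-S legs; the constant-S legs contribute nothing to either integral:

  Th, Tc = 400.0, 300.0          # made-up hot and cold temperatures, in kelvin
  S1, S2 = 1.0, 2.0              # made-up entropy values, in J/K

  legs = [(Th, S1, S2),          # isotherm at Th: S goes from S1 to S2
          (Tc, S2, S1)]          # isotherm at Tc: S goes back from S2 to S1

  loop_dS  = sum(Send - Sstart       for T, Sstart, Send in legs)
  loop_TdS = sum(T * (Send - Sstart) for T, Sstart, Send in legs)

  print(loop_dS)    # 0.0   : the integral of dS around the loop vanishes
  print(loop_TdS)   # 100.0 : the integral of TdS does not; it is (Th-Tc)*(S2-S1) per cycle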

Among other things, this means there is not in general a useful way to integrate TdS to create a “Q” that quantifies the notion of “thermal energy” suggested in equation 7. If we want to define “thermal energy” we must find another way to do it.

7.7  Why dW and dQ Are Tempting

It is remarkable that people are fond of writing things like dQ ... even in cases where it does not exist. (The remarks in this section apply equally well to dW and similar monstrosities.)

Even people who know it is wrong do it anyway. They call dQ an “inexact differential” and sometimes put a slash through the d to call attention to this. The problem is, neither dQ nor ðQ is a differential at all. Yes, TdS is an inexact one-form, but it is not properly called an inexact differential, since it is (in general) not a differential, i.e. not the derivative of anything.

One wonders how such a bizarre tangle of contradictions could arise, and how it could persist. I hypothesize part of the problem is a too-narrow interpretation of the traditional notation for integrals. Most mathematics books say that every integral should be written in the form


∫ (integrand) d(something)              (23)
where the d is alleged to be merely part of the notation -- an obligatory and purely mechanical part of the notation -- and the integrand is considered to be separate from the d(something).

However, it doesn’t have to be that way. If you think about the integral from the Lebesgue point of view (as opposed to the Riemann point of view), you realize that what is indispensible is a measure. Specifically: d(something) is a perfectly fine, normal type of measure, but not the only possible type of measure.

In an ordinary one-dimensional integral, we are integrating along a path, which in the simplest case is just an interval on the number line. Each element of the path is a little pointy vector, and the measure needs to map that pointy vector to a number. Any one-form will do, exact or otherwise. The exact one-forms can be written as d(something), while the inexact ones cannot.

For purposes of discussion, in the rest of this section we will put square brackets around the measure, to make it easy to recognize the measure even if it takes a somewhat unfamiliar form. As a simple example, a typical integral can be written as:


∫Γ (integrand) [(measure)]              (24)
where Γ is the domain to be integrated over, and the measure is typically something like dx.

As a more intricate example, in two dimensions the moment of inertia of an object Ω is:

I = ∫Ω r^2 [dm]              (25)

where the measure is dm. As usual, r denotes distance and m denotes mass. The integral runs over all elements of the object, and we can think of dm as an operator that tells us the mass of each such element. To my way of thinking, this is the definition of moment of inertia: a sum of r^2, summed over all elements of mass in the object.

The previous expression can be expanded as:

I = ∫Ω r^2 [ρ(x,y) dx dy]              (26)

where the measure is the same as before, just rewritten in terms of the density, ρ.

Things begin to get interesting if we rewrite that as:

I = ∫Ω r^2 ρ(x,y) [dx dy]              (27)

where ρ is no longer part of the measure but has become part of the integrand. We see that the distinction between the integrand and the measure is becoming a bit vague. Exploiting this vagueness in the other direction, we can write:

I = ∫Ω [r^2 ρ(x,y) dx dy]              (28)

which tells us that the distinction between integrand and measure is completely meaningless. Henceforth I will treat everything inside the integral on the same footing. The integrand and measure together will be called the argument7 of the integral.
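
To make the point concrete, here is a brute-force Python sketch (my own illustration, using a made-up object: a uniform unit square of density ρ = 2, with the axis through its center). Whether ρ is grouped with the measure (as in equation 26) or with the integrand (as in equation 27) makes no difference to the sum, apart from floating-point roundoff:

  n = 400                      # grid resolution (chosen arbitrarily for this sketch)
  dx = dy = 1.0 / n
  rho = 2.0                    # made-up uniform density

  I_a = I_b = 0.0
  for i in range(n):
      for j in range(n):
          x = (i + 0.5) * dx - 0.5
          y = (j + 0.5) * dy - 0.5
          r2 = x*x + y*y
          I_a += r2 * (rho * dx * dy)     # integrand r^2, measure [rho dx dy]
          I_b += (r2 * rho) * (dx * dy)   # integrand r^2 rho, measure [dx dy]

  print(I_a, I_b)   # both approximately rho/6 = 0.3333...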

Using an example from thermodynamics, we can write

∫Γ [q]  =  ∫Γ [T dS]              (29)

where Γ is some path through thermodynamic state-space, and where q is an inexact one-form, defined as q := TdS.

It must be emphasized that these integrals must not be written as ∫Γ [dQ] nor as ∫Γ [dq]. This is because the argument in equation 29 is an inexact one-form, and therefore cannot be equal to d(anything).

There is no problem with using TdS as the measure in an integral. The only problem comes when you try to write TdS as d(something) or ð(something):

  • Yes, TdS is a measure.
  • Yes, it is a one-form.
  • No, it is not an exact one-form.
  • No, it is not d(anything).
I realize an expression like [q] will come as a shock to some people, but I think it expresses the correct ideas. It’s a whole lot more expressive and more correct than trying to write TdS as d(something) or ð(something).

Once you understand the ideas, the square brackets used in this section no longer serve any important purpose. Feel free to omit them if you wish.

The traditional notation ∫ ⋅⋅⋅ dx is like the proverbial trusty hammer, which is exactly the right tool for pounding nails. But sometimes you encounter a task that is not a nail, and other tools are called for. Specifically: sometimes it is OK to have no explicit d inside the integral.

There are only two things that are required: the integral must have a domain to be integrated over, and it must have some sort of argument. The argument must be an operator, which operates on an element of the domain to produce a number (or perhaps a vector or the like) that can be summed by the integral.

A one-form certainly suffices to serve as an argument (when elements of the domain are pointy vectors). Indeed, some texts introduce the notion of one-forms by defining them to be operators of the sort we need. That is, the space of one-forms is defined as an operator space, consisting of the operators that map column vectors to scalars. (If you want a less-fancy name for the same thing, you can call them row vectors.) Using these operators does not require taking a dot product. (You don’t need a dot product unless you want to multiply two column vectors.) The operation of applying a row vector to a column vector to produce a scalar is called contraction, not a dot product.

It is interesting to note that an ordinary summation of the form ∑i Fi corresponds exactly to a Lebesgue integral using a measure that assigns unit weight to each integer (i) in the domain. No explicit d is needed.

People heretofore have interpreted d in several ways: as a one-form, as an infinitesimal step in some direction, and as the marker for the measure in an integral. The more I think about it, the more convinced I am that the one-form interpretation is far and away the most advantageous. The others can be seen as mere approximations of the one-form interpretation. The approximations work OK in elementary situations, but produce profound misconceptions and contradictions when applied to more general situations ... such as thermodynamics.   In contrast, note that in section 16, I do not take such a hard line about the multiple incompatible definitions of heat. I don’t label any of them as right or wrong. Rather, I recognize that each of them in isolation has some merit, and it is only when you put them together that conflicts arise.

Bottom line: There are two really simple ideas here: (1) d always means exterior derivative. The exterior derivative of anything is an exact differential, i.e. an exact one-form. (2) An integral needs to have a measure, which is usually but not necessarily of the form d(something).

8  Connecting Entropy with Thermal Energy

We start by examining the distinction between thermal energy and nonthermal energy.

8.1  An Illustration

Let box A contain a cold, rapidly-rotating flywheel. Actually, let it contain two counter-rotating flywheels, so we won’t have any net angular momentum to worry about. Also let it contain a cold, tightly-compressed spring.

Compare that with box B which is the same except that the spring has been released and the flywheels have been stopped, by dissipative processes entirely internal to the box. The nonthermal kinetic and potential energy has been converted to thermal energy. The flywheels and spring are now warm. Assume losses into other modes (sound etc.) are negligible.

The main difference between box A and box B is that the entropy is higher. In box A we have energy in a low-entropy form, and in box B we have the same energy in a higher-entropy form.

We can understand this in terms of macrostates and microstates as follows: Let T be the temperature, ω be the speed of the flywheel, and L be the extension of the spring. Then the macrostate can be described in terms of these variables. Knowing the macrostate doesn’t suffice to tell us the system is in a particular microstate; rather, there is some range, some set of microstates consistent with a given macrostate. The key idea here is that the number of microstates consistent with (TB, ωB, LB) is much larger than the number of microstates consistent with (TA, ωA, LA).

In this less-than-general case it is mostly harmless to speak of the “energy” being spread out over a large number of microstates, but remember this is not the defining property of entropy, for reasons discussed in section 2.4.3. The defining property is that the probability gets spread out over a large number of microstates.

Similarly, in this case it is mostly harmless to speak of box A as being more “ordered” than box B. That’s true and even somewhat relevant ... but it ought not be overemphasized, and it must not be thought of as a characteristic property -- let alone a defining property -- of the low-entropy macrostate. Entropy is not synonymous with disorder, for reasons discussed in section 2.4.4.

8.2  The Thermal-Energy Distribution

We shall see that in equilibrium, thermal energy is distributed among the microstates according to a very special probability distribution, namely the Boltzmann distribution. That is, the probability of finding the system in microstate i is given by:

Pi = e^(-Ei / kT)    ...   for a thermal distribution              (30)

where Ei is the energy of the ith microstate, and kT is the temperature measured in energy units. That is, plain T is the temperature measured in degrees, and k is Boltzmann’s constant, which is just the conversion factor from degrees to whatever units you are using to measure Ei.

Evidence in favor of equation 30 is discussed in section 10.2.
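
By way of illustration, here is a minimal Python sketch that evaluates equation 30 for a made-up three-level system. The energies are arbitrary round numbers; I normalize by the sum of the Boltzmann factors (the partition function) so the probabilities add up to 1:

  import math

  k  = 1.380649e-23             # Boltzmann's constant, J/K
  T  = 300.0                    # temperature, K
  kT = k * T                    # temperature measured in energy units

  E = [0.0, 1.0e-21, 2.0e-21]   # made-up microstate energies, in joules
  weights = [math.exp(-Ei / kT) for Ei in E]   # Boltzmann factors, per equation 30
  Z = sum(weights)                             # normalization
  P = [w / Z for w in weights]

  for Ei, Pi in zip(E, P):
      print(f"E = {Ei:.2e} J   P = {Pi:.3f}")   # lower-energy states are more probable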

8.3  Remarks

8.3.1  Nonthermal Energy is Freely Convertible; Thermal Energy is not

The difference between random energy and predictable energy has many consequences. The most important consequence is that the predictable energy can be freely converted to and from other forms, such as gravitational potential energy, chemical energy, electrical energy, et cetera. In many cases, these conversions can be carried out with very high efficiency. In contrast, the laws of thermodynamics place severe restrictions on the efficiency with which thermal energy can be converted to any nonthermal form.

8.3.2  Thermodynamic Laws without Temperature

Ironically, the first law of thermodynamics (equation 1) does not depend on temperature. Energy is well-defined and is conserved, no matter what. It doesn’t matter whether the system is hot or cold or whether it even has a temperature at all.

Even more ironically, the second law of thermodynamics (equation 2) doesn’t depend on temperature, either. Entropy is well-defined and is paraconserved no matter what. It doesn’t matter whether the system is hot or cold or whether it even has a temperature at all.

(This state of affairs is ironic because thermodynamics is commonly defined to be the science of heat and temperature, as you might have expected from the name: thermodynamics. Yet in our modernized and rationalized thermodynamics, the two most central, fundamental ideas -- energy and entropy -- are defined without reference to heat or temperature.)

Of course there are many important situations that do involve temperature. Most of the common, every-day applications of thermodynamics involve temperature -- but you should not think of temperature as the essence of thermodynamics. Rather, it is a secondary concept which is defined (if and when it even exists) in terms of energy and entropy.

8.3.3  Kinetic and Potential Thermal Energy

Don’t fall into the trap of thinking that thermal energy is necessarily kinetic energy. In almost all situations, thermal energy is a mixture of kinetic and potential energy.8 The two forms of energy play parallel roles:
  • To visualize thermal potential energy, imagine that the atoms in a crystal lattice are held in place by springs. Half of these springs have thermal potential energy because they are extended relative to their resting-length, while the other half have thermal potential energy because they are compressed relative to their resting-length. They’ve all got energy, but you can’t easily harness it because you don’t know which ones are compressed and which ones are extended.
  • To visualize thermal kinetic energy, imagine that half the atoms have a leftward velocity and half have a rightward velocity. They all have kinetic energy, but you can’t easily harness it because you don’t know which ones are moving leftward and which are moving rightward.
In fact, for an ordinary crystal such as quartz or sodium chloride, the thermal energy is almost exactly half kinetic and half potential. It’s easy to see why that must be: The heat capacity is well described in terms of thermal phonons in the crystal. Each phonon mode is a harmonic9 oscillator. In each cycle of any harmonic oscillator, the energy changes from kinetic to potential and back again. The kinetic energy goes like sin^2(phase) and the potential energy goes like cos^2(phase), so on average each of those is half of the total energy.
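
If you want to check the half-and-half claim numerically, a few lines of Python suffice; this is nothing more than averaging sin^2 and cos^2 over one cycle:

  import math

  N = 100000
  phases = [2 * math.pi * i / N for i in range(N)]
  avg_sin2 = sum(math.sin(p)**2 for p in phases) / N
  avg_cos2 = sum(math.cos(p)**2 for p in phases) / N
  print(avg_sin2, avg_cos2)   # both very close to 0.5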

Not all kinetic energy is thermal.
Not all thermal energy is kinetic.


8.3.4  Internal Nonthermal Energy

Over the years, lots of people have noticed that you can always split the kinetic energy of a complex object into the KE of the center-of-mass motion plus the KE of the relative motion (i.e. the motion of the components relative to the center of mass).

Also a lot of people have noticed that you can sometimes split the energy of an object into a thermal piece and a non-thermal piece.

It is an all-too-common mistake to think that the overall/relative split is the same as the nonthermal/thermal split. Beware: they’re not the same. Definitely not.

First of all, thermal energy is not restricted to being kinetic energy, as discussed in section 8.3.3. So trying to understand the thermal/non-thermal split in terms of kinetic energy is guaranteed to fail. Using the work/KE theorem (reference 7) to connect work (via KE) to the thermal/nonthermal split is guaranteed to fail for the same reason.

Secondly, a standard counterexample uses flywheels, as discussed in section 17.4. You can impart KE to the flywheels without imparting center-of-mass KE or thermal energy or potential energy to the system.

Whenever the system of interest has a clear-cut thermal/nonthermal split, fine -- you can use it for whatever it’s worth. But if you ever have a question about the thermal/nonthermal split, you shouldn’t waste time trying to answer it. It’s the wrong question, and probably unanswerable. Instead you should ask about the energy and entropy.

Center-of-mass motion is an example but not the only example of low-entropy energy. The motion of the flywheels is another perfectly good example of low-entropy energy. Several other examples are listed in section 10.3.

A macroscopic object has something like 10^23 modes. The center-of-mass motion is just one of these modes. The motion of counter-rotating flywheels is another mode. These are slightly special, but not very special. A mode to which we can apply a conservation law, such as conservation of momentum, or conservation of angular momentum, might require a little bit of special treatment, but usually not much ... and there aren’t very many such modes.

Sometimes on account of conservation laws, and sometimes for other reasons as discussed in section 10.12, it may be possible for a few modes of the system to be strongly coupled to the outside (and weakly coupled to the rest of the system), while the remaining 10^23 modes are more strongly coupled to each other than they are to the outside. It is these issues of coupling-strength that determine which modes are thermal and which (if any) are non-thermal. This is consistent with our definition of equilibrium (section 9).

Thermodynamics treats all the thermal modes on an equal footing. One manifestation of this can be seen in equation 30, where each state contributes one term to the sum ... and addition is commutative.

There will never be an axiom that says such-and-such mode is always thermal or always nonthermal; the answer is sensitive to how you engineer the couplings.

8.4  Entropy Without Constant Re-Shuffling

It is a common mistake to visualize entropy as a highly dynamic process, whereby the system is constantly flipping from one microstate to another. This may be a consequence of the fallacy discussed in section 8.3.4 (mistaking the thermal/nonthermal distinction for the kinetic/potential distinction) ... or it may have other roots; I’m not sure.

In any case, the fact is that re-shuffling is not an essential part of the entropy picture.

An understanding of this point proceeds directly from fundamental notions of probability and statistics.

By way of illustration, consider one hand in a game of draw poker.

  A)   The deck is shuffled and hands are dealt in the usual way.
  B)   In preparation for the first round of betting, you look at your hand and discover that you’ve got the infamous “inside straight”. Other players raise the stakes, and when it’s your turn to bet you drop out, saying to yourself “if this had been an outside straight the probability would have been twice as favorable”.
  C)   The other players, curiously enough, stand pat, and after the hand is over you get a chance to flip through the deck and see the card you would have drawn.
Let’s more closely examine step (B). At this point you have to make a decision based on probability. The deck, as it sits there, is not constantly re-arranging itself, yet you are somehow able to think about the probability that the card you draw will complete your inside straight.

The deck, as it sits there during step (B), is not flipping from one microstate to another. It is in some microstate, and staying in that microstate. At this stage you don’t know what microstate that happens to be. Later, at step (C), long after the hand is over, you might get a chance to find out the exact microstate, but right now at step (B) you are forced to make a decision based only on the probability.

The same ideas apply to the entropy of a roomful of air, or any other thermodynamic system. At any given instant, the air is in some microstate with 100% probability; you just don’t know what microstate that happens to be. If you did know, the entropy would be zero ... but you don’t know. You don’t need to take any sort of time-average to realize that you don’t know the microstate.

The bottom line is that the essence of entropy is the same as the essence of probability in general: The essential idea is that you don’t know the microstate. Constant re-arrangement is not essential.

This leaves us with the question of whether re-arrangement is ever important. Of course the deck needs to be shuffled at step (A). Not constantly re-shuffled, just shuffled the once.

Again, the same ideas apply to the entropy of a roomful of air. If you did somehow obtain knowledge of the microstate, you might be interested in the timescale over which the system re-arranges itself, making your erstwhile knowledge obsolete and thereby returning the system to a high-entropy condition.

The crucial point remains: the process whereby knowledge is lost and entropy is created is not part of the definition of entropy, and need not be considered when you evaluate the entropy. If you walk into a room for the first time, the re-arrangement rate is not your concern. You don’t know the microstate of this room, and that’s all there is to the story. You don’t care how quickly (if at all) one unknown microstate turns into another.

If you don’t like the poker analogy, we can use a cryptology analogy instead. Yes, physics, poker, and cryptology are all the same when it comes to this. Statistics is statistics.

If I’ve intercepted just one cryptotext from the opposition and I’m trying to crack it, on some level what matters is whether or not I know their session key. It doesn’t matter whether that session key is 10 microseconds old, or 10 minutes old, or 10 days old. If I don’t have any information about it, I don’t have any information about it, and that’s all that need be said.

On the other hand, if I’ve intercepted a stream of messages and extracted partial information from them (via a partial break of the cryptosystem), the opposition would be well advised to “re-shuffle the deck” i.e. choose new session keys on a timescale fast compared to my ability to extract information about them.

Applying these ideas to a roomful of air: Typical sorts of measurements give us only a pathetically small amount of partial information about the microstate. So it really doesn’t matter whether the air re-arranges itself super-frequently or super-infrequently. We don’t have any significant amount of information about the microstate, and that’s all there is to the story.

Reference 6 presents a simulation that demonstrates the points discussed in this subsection.

8.5  Units of Entropy

In the definition of entropy, equation 3, the base of the logarithm has intentionally been left unspecified. You get to choose a convenient base. This is the same thing as choosing what units will be used for measuring the entropy.
  • Base 2 corresponds to measuring entropy in bits.
  • Base e corresponds to measuring entropy in nats.
  • Base 3 corresponds to measuring entropy in trits.
  • The huge base exp(1/k) corresponds to measuring entropy in joules per kelvin (J/K). Note that capital K is the kelvin unit of temperature, while small k is Boltzmann’s constant.
When dealing with smallish amounts of entropy, units of bits are conventional and often convenient. When dealing with large amounts of entropy, units of J/K are conventional and often convenient. These are related as follows:
1 J/K   =   1.04×10^23 bits
1 bit   =   9.57×10^-24 J/K
             (31)
A convenient unit for molar entropy is joules per kelvin per mole:
1 J/K/mol   =   0.17 bit/particle
1 bit/particle   =   5.76 J/K/mol
             (32)
Values in this range (on the order of one bit per particle) are very commonly encountered.
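
These conversion factors are not arbitrary; they follow directly from Boltzmann’s constant. Here is a short Python sketch that reproduces the numbers in equation 31 and equation 32, using the CODATA values of k and Avogadro’s number:

  import math

  k  = 1.380649e-23             # Boltzmann's constant, J/K
  NA = 6.02214076e23            # Avogadro's number, 1/mol

  JK_per_bit = k * math.log(2)
  print(1 / JK_per_bit)         # 1.04e23  bits per J/K              (equation 31)
  print(JK_per_bit)             # 9.57e-24 J/K per bit               (equation 31)
  print(1 / (JK_per_bit * NA))  # 0.17  bit/particle per (J/K/mol)   (equation 32)
  print(JK_per_bit * NA)        # 5.76  J/K/mol per (bit/particle)   (equation 32)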

8.6  Probability versus Multiplicity

Suppose we have a case where the system has a set of states10 (called the “accessible” states) that are equiprobable, i.e. Pi = 1/W for some constant W. The remaining (“inaccessible”) states are unoccupied, i.e. they all have Pi = 0. The constant W is called the multiplicity. It necessarily equals the number of accessible states, since (in accordance with the usual definition of “probability”) we want our probabilities to be normalized: ∑Pi = 1.

In this less-than-general case, the entropy (as defined by equation 3) reduces to

S = logW              (33)

As discussed in section 8.5, you choose the base of the logarithm according to what units you prefer for measuring entropy: bits, nats, trits, J/K, or whatever. If you prefer J/K, the equation can be rewritten

S = k lnW              (34)

By way of example, equation 33 is normally applied to microcanonical systems. Microcanonical means that the system is isolated, i.e. constrained to have a definite, constant energy.

By way of counterexample, an object in contact with a constant-temperature heat bath is called canonical (not microcanonical) and cannot be described by equation 33. Similarly, any system that can exchange particles with a reservoir, as described by a chemical potential, is grand canonical (not microcanonical) and cannot be described by equation 33.

Some people are inordinately fond of equation 33 or equivalently equation 34. They are tempted to take it as the definition of entropy, and sometimes offer outrageously unscientific arguments in its support. But the fact remains that equation 3 is the proper definition, while equation 34 is a special case. The latter cannot describe the general case, for reasons we now discuss.

For a thermal distribution, the probability of a microstate is given by equation 30. So, even within the restricted realm of thermal distributions, equation 34 does not cover all the bases; it applies if and only if all the accessible microstates have the same energy. It is possible to arrange for this to be true, by constraining all accessible microstates to have the same energy. That is, it is possible to create a microcanonical system by isolating or insulating and sealing the system so that no energy can enter or leave. This can be done, but it places drastic restrictions on the sort of systems we can analyze.

  • Two of the four phases of the Carnot cycle are carried out at constant temperature, not constant energy. The system is in contact with a heat bath, not isolated or insulated. A theory of “thermodynamics” without heat engines would be pretty lame.
  • A great many chemistry-lab recipes call for the system to be held at constant temperature while the reaction proceeds. Vastly fewer call for the system to be held in a thermally-insulated flask while the reaction proceeds. A theory of “thermodynamics” incapable of describing typical laboratory procedures would be pretty lame.
  • Even if the overall system is insulated, we often arrange it so that various subsystems within the system are mutually in equilibrium. For example, if there is liquid in a flask, we expect the left half of the liquid to be in thermal equilibrium with the right half, especially if we stir things. But remember, equilibrium involves having a shared temperature. The left half is not thermally insulated from the right half; energy is exchanged between the two halves. The microstates of the left half are not equiprobable. A theory of “thermodynamics” incapable of describing thermal equilibrium would be pretty lame.
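
To see the distinction numerically, consider the following Python sketch (a made-up example of mine, with kT set to 1 in arbitrary units). For W equiprobable states the general definition (equation 3) reduces to log W, but for a Boltzmann distribution over states of unequal energy it does not:

  import math

  def entropy_bits(P):
      return sum(p * math.log2(1 / p) for p in P if p > 0)

  # Microcanonical-style case: W = 8 equiprobable states.
  W = 8
  uniform = [1.0 / W] * W
  print(entropy_bits(uniform), math.log2(W))    # both 3.0 bits

  # Canonical-style case: Boltzmann weights over 8 states of unequal energy.
  kT = 1.0                                      # arbitrary energy units (assumed)
  E = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
  w = [math.exp(-Ei / kT) for Ei in E]
  Z = sum(w)
  boltzmann = [wi / Z for wi in w]
  print(entropy_bits(boltzmann), math.log2(len(E)))   # about 2.3 bits, strictly less than 3.0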

8.7  Spreading in Probability Space

Non-experts sometimes get the idea that whenever something is more spread out in position, its entropy must be higher. This is a mistake. Yes, there are scenarios where a gas expands and does gain entropy (such as isothermal expansion, or diffusive mixing as discussed in section 10.7) ... but there are also scenarios where a gas expands but does not gain entropy (reversible thermally-isolated expansion).

Figure 5 shows two blocks under three transparent cups. In the first scenario, the blocks are “concentrated” in the 00 state. In the probability histogram below the cups, there is unit probability (shown in magenta) in the 00 slot, and zero probability in the other slots, so p log(1/p) is zero everywhere. That means the entropy is zero.

In the next scenario, the blocks are spread out in position, but since we know exactly what state they are in, all the probability is in the 02 slot. That means p log(1/p) is zero everywhere, and the entropy is still zero.

In the third scenario, the system is in some randomly chosen state, namely the 21 state, which is as disordered and as random as any state can be, yet since we know what state it is, p log(1/p) is zero everywhere, and the entropy is zero.

The fourth scenario is derived from the third scenario, except that the cups are behind a screen. We can’t see the blocks right now, but we remember where they are. The entropy remains zero.

Finally, in the fifth scenario, we simply don’t know what state the blocks are in. The blocks are behind a screen, and have been shuffled since the last time we looked. We have some vague notion that on average, there is 2/3rds of a block under each cup, but that is only an average over many states. The probability histogram shows there is a 1-out-of-9 chance for the system to be in any of the 9 possible states, so ∑ p log(1/p) = log(9) .

cups-dispersion
Figure 5: Spreading vs. Randomness vs. Uncertainty


One point to be made here is that entropy is not defined in terms of particles that are spread out in position-space, but rather in terms of probability that is spread out in state-space. This is quite an important distinction. For more details on this, including an interactive simulation, see reference 6.
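
The histogram arithmetic behind figure 5 is simple enough to spell out in a few lines of Python (my own sketch of the scenarios, using base-2 logarithms):

  import math

  def entropy_bits(P):
      return sum(p * math.log2(1 / p) for p in P if p > 0)

  known_state = [1.0] + [0.0] * 8    # all probability in one slot (scenarios 1 through 4)
  shuffled    = [1.0 / 9] * 9        # equal probability in all nine slots (scenario 5)

  print(entropy_bits(known_state))   # 0.0 bits
  print(entropy_bits(shuffled))      # log2(9), about 3.17 bits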

Entropy involves probability spread out in state-space
(not necessarily particles spread out in position-space).


As a way of reinforcing this point, consider a system of spins such as discussed in section 10.11. The spins change orientation, but they don’t change position at all. Their positions are locked to the crystal lattice. The notion of entropy doesn’t require any notion of position; as long as we have states, and a probability of occupying each state, then we have a well-defined notion of entropy. High entropy means the probability is spread out over many states in state-space.

State-space can sometimes be rather hard to visualize. As mentioned in section 2.3, a well-shuffled card deck has nearly 226 bits of entropy ... which is a stupendous number. If you consider the states of gas molecules in a liter of air, the number of states is even larger -- far, far beyond what most people can visualize. If you try to histogram these states, you have an unmanageable number of slots (in contrast to the 9 slots in figure 5) with usually a very small probability in each slot.

Another point to be made in connection with figure 5 concerns the relationship between observing and stirring (aka mixing, aka shuffling). Here’s the rule:

                       not looking                 looking
  not stirring         entropy constant            entropy decreasing (aa)
  stirring             entropy increasing (aa)     contest

where (aa) means almost always; we have to say (aa) because entropy can’t be increased by stirring if it is already at its maximum possible value, and it can’t be decreased by looking if it is already zero. Note that if you’re not looking, lack of stirring does not cause an increase in entropy. By the same token, if you’re not stirring, lack of looking does not cause a decrease in entropy. If you are stirring and looking simultaneously, there is a contest between the two processes; the entropy might decrease or might increase, depending on which process is more effective.

The simulation in reference 6 serves to underline these points.

9  Definition of Equilibrium

Feynman defined equilibrium to be “when all the fast things have happened but the slow things have not” (reference 3). That statement pokes fun at the arbitrariness of the split between “fast” and “slow” -- but at the same time it is 100% correct and insightful. There is an element of arbitrariness in our notion of equilibrium. Over an ultra-long timescale, a diamond will turn into graphite. And in the ultra-short timescale, you can have non-equilibrium distributions of phonons rattling around inside a diamond crystal, such that it doesn’t make sense to talk about the temperature thereof. But usually we are interested in the intermediate timescale, long after the phonons have become thermalized but long before the diamond turns into graphite. During this intermediate timescale it makes sense to talk about the temperature of the diamond.

One should neither assume that equilibrium exists, nor that it doesn’t.

Diamond has a vast, clear-cut separation between the slow timescale and the fast timescale. Most intro-level textbook thermodynamics deal only with systems that have a clean separation.   In the real world, one often encounters cases where the separation of timescales is not so clean, and an element of arbitrariness is involved. The laws of thermodynamics can still be applied, but more effort and more care is required. See section 10.3 for a discussion.

Tangential remark: You may have heard of “Le Chatelier’s principle”. It cannot be taken seriously, since there are (and have been, ever since Le Chatelier’s day) two versions of the “principle”, one of which is untrue, and the other of which is meaningless, i.e. trivially circular. This “principle” needs to be thrown out and replaced by two well-defined concepts, namely equilibrium and stability. (This is analogous to the way that “heat” needs to be thrown out and replaced by two well-defined concepts, namely energy and entropy, as discussed in section 16.)

10  Experimental Basis

In science, questions are not decided by taking votes, or by seeing who argues the loudest or the longest. Scientific questions are decided by a careful combination of experiments and reasoning. So here are some epochal experiments that form the starting point for the reasoning presented here, and illustrate why certain other approaches are unsatisfactory.

10.1  Basic Notions of Temperature and Equilibrium

Make a bunch of thermometers. Calibrate them, to make sure they agree with one another. Attach thermometers to each of the objects mentioned below.

Take two objects that start out at different temperatures. Put them in a box together. Observe that they end up at the same temperature. This is an example of thermal equilibrium.

Take two objects that start out at the same temperature. Put them in a box together. Observe that they never (if left alone) end up at different temperatures. You can build a machine, called a refrigerator or a heat pump, that will cool off one object while heating up the other, but all such machines require an energy input, so they are irrelevant to any discussion of equilibrium.

10.2  Exponential Dependence on Energy

Here is a collection of observed phenomena that tend to support equation 30.
  • There is a wide (but not infinitely wide) class of chemical reactions where the rate of reaction depends exponentially on inverse temperature according to the Arrhenius rate equation:

    rate = A e^(-Ea / kT)              (35)

    where Ea is called the activation energy and the prefactor A is called the attempt frequency. The idea here is that the reaction pathway has a potential barrier of height Ea and the rate depends on thermal activation over the barrier. In the independent-particle approximation, we expect that thermal agitation will randomly give an exponentially small fraction of the particles an energy greater than Ea in accordance with equation 30. (A small numerical illustration appears just after this list.)

    Of course there are many examples where equation 35 would not be expected to apply. For instance, the flow of gas through a pipe (under the influence of specified upstream and downstream pressures) is not a thermally activated process, and does not exhibit an exponential dependence on inverse temperature.

  • In a wide class of materials, the strength of the NMR signal closely follows the Curie law over a range of many orders of magnitude. That is, the strength is proportional to 1/T. This is exactly what we would expect from treating each individual nucleus as a system unto itself (while treating everything else as the “environment” aka “heat bath”) and assigning probabilities to its individual microstates in accordance with equation 30.
  • The density of saturated water vapor (i.e. the density of gaseous H2O in equilibrium with liquid H2O) is rather accurately an exponential function of inverse temperature. This is what we would expect from equation 30, if we once again make the independent-particle approximation and say that particles in the liquid are in a low-energy state while particles in the vapor are in a high-energy state.
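
Here is the promised illustration of equation 35 in Python. The activation energy and attempt frequency are made-up but representative values; the point is the steep dependence of the rate on temperature:

  import math

  k  = 1.380649e-23            # Boltzmann's constant, J/K
  Ea = 0.5 * 1.602176634e-19   # made-up activation energy: 0.5 eV expressed in joules
  A  = 1.0e13                  # made-up attempt frequency, 1/s

  def arrhenius_rate(T):
      return A * math.exp(-Ea / (k * T))

  print(arrhenius_rate(300.0))   # roughly 4e4 per second at 300 K
  print(arrhenius_rate(320.0))   # roughly 1.3e5 per second at 320 K, i.e. about 3x faster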

10.3  Metastable Systems with a Temperature

Here are some interesting examples:
  • a flywheel that may keep spinning for one second or one hour or one day.
  • a large piece of metal that rings like a bell, i.e. with a high excitation in one of its mechanical resonance modes.
  • a capacitor that may hold its charge for hours or days.
  • an electrochemical storage battery that may have a shelf life of ten days or ten months or ten years.
  • a fluid-dynamic excitation such as the wingtip vortices trailing behind an airplane.
  • a weight-driven cuckoo clock that may go a day or a week between windings.
  • a spring-driven clock that may go a day or a week or a year between windings.
  • a microwave oven that puts potato-molecules into an excited state.
  • a metastable chemical species such as H2O2 or TNT. If left to themselves, they will decompose quickly or slowly, depending on temperature, catalysis, and other details.
  • a classical Carnot-cycle heat engine. If you operate it too quickly, there will be nonidealities because the parts of the cycle that are supposed to be isothermal won’t be (i.e. the working fluid won’t be in good thermal contact with the heat bath). On the other hand, if you operate it too slowly, there will be nonidealities due to parasitic thermal conduction through structures such as the pushrod that connects the piston to the load. You cannot assume or postulate that there is a nice big separation between the too-slow timescale and the too-fast timescale; if you need a big separation you must arrange for it by careful engineering.
(Section 10.4 takes another look at metastable systems.)

There are good reasons why we might want to apply thermodynamics to systems such as these. For instance, the Clausius-Clapeyron equation can tell us interesting things about a voltaic cell.

Also, just analyzing such a system as a Gedankenexperiment helps us understand a thing or two about what we ought to mean by “equilibrium”, “temperature”, “heat”, and “work”.

In equilibrium, states are supposed to be occupied in accordance with the Boltzmann distribution law (equation 30).

An example is depicted in figure 6, which is a scatter plot of Pi versus Ei.

ei-pi
Figure 6: An Equilibrium Distribution


As mentioned in section 9, Feynman defined equilibrium to be “when all the fast things have happened but the slow things have not” (reference 3). The examples listed at the beginning of this section all share the property of having two timescales and therefore two notions of equilibrium. If you “charge up” such a system with nonthermal energy and come back some time later, you may find the energy still in nonthermal form, or you may find it degraded to thermal form.

The idea of temperature is valid even on the shorter timescale. In practice, I can measure the temperature of a battery or a flywheel without waiting for it to run down. I can measure the temperature of a bottle of H2O2 without waiting for it to decompose.

To understand this, we must broaden our notion of what an equilibrium distribution is. Specifically, we must consider the case of a Boltzmann exponential distribution with exceptions. The flywheel has ≈10^23 modes that follow the Boltzmann distribution, and one that does not. The exception involves a huge amount of energy, but involves essentially zero entropy. Also, most importantly, we can build a thermometer that couples to the thermal modes without coupling to the exceptional mode.

A particularly interesting case is shown in figure 7. In this case there are two exceptions. This situation has exactly the same entropy as the situation shown in figure 6. This can be seen directly from equation 3 since the Pi values are the same, differing only by a permutation of the dummy index i.

ei-pi-x
Figure 7: An Equilibrium Distribution with Exceptions


Meanwhile, the energy shown in figure 7 is significantly larger than the energy shown in figure 6.

This proves that in some cases of interest, we cannot write the system energy E as a function of the macroscopic thermodynamic variables V and S. Remember, V determines the spacing between energy levels (which is the same in both figures) and S tells us something about the occupation of those levels, but alas S does not tell us everything we need to know. Comparing figure 6 with figure 7 we have the same V, the same S, and different E. So we must not assume E = E(V,S).

Occasionally somebody tries to argue that the laws of thermodynamics do not apply to figure 7, on the grounds that thermodynamics requires strict adherence to the Boltzmann exponential law. This is a bogus argument for several reasons. First of all, strict adherence to the Boltzmann exponential law would imply that everything in sight was at the same temperature. That means we can’t have a heat engine, which depends on having two heat reservoirs at different temperatures. A theory of pseudo-thermodynamics that cannot handle exceptions to the Boltzmann exponential law is useless.

So we must allow some exceptions to the Boltzmann exponential law ... maybe not every imaginable exception, but some exceptions. A good criterion for deciding what sort of exceptions to allow is to ask whether it is operationally possible to measure the temperature. For example, in the case of a storage battery, it is operationally straightforward to design a thermometer that is electrically insulated from the exceptional mode, but thermally well connected to the thermal modes.

Perhaps the most important point is that equation 1 and equation 2 apply directly, without modification, to the situations listed at the beginning of this section. So from this point of view, these situations are not “exceptional” at all.

The examples listed at the beginning of this section raise some other basic questions. Suppose I stir a large tub of water. Have I done work on it (w) or have I heated it (q)? If the question is answerable at all, the answer must depend on timescales and other details. A big vortex can’t be considered thermal energy. But if you wait long enough the vortex dies out and you’re left with just thermal energy. Whether you consider the latter to be q and/or heat is yet another question. (See section 7.4 and especially section 16 for a discussion of what is meant by “heat”.)

In cases where the system’s internal “spin-down” time is short to all other timescales of interest, we get plain old dissipative systems. Additional examples include:

  • The Rumford experiment (section 10.5).
  • Shear in a viscous fluid (section 10.6).
  • A block sliding down an inclined plane, under the influence of sliding friction.
  • The brake shoes on a car.
  • et cetera.

10.4  Metastable Systems without a Temperature

An interesting example is:
  • a three-state laser, in which there is a population inversion.
In this case, it’s not clear how to measure the temperature or even define the temperature of the spin system. Remember that in equilibrium, states are supposed to be occupied with probability proportional to the Boltzmann factor, Pi = exp(-Ei/kT). However, the middle microstate is more highly occupied than the microstates on either side, as depicted in figure 8. This situation is clearly not describable by any exponential, since exponentials are monotone.

ei-pi-3
Figure 8: Three-State System without a Temperature


The problem is that this system has so few states that we can’t figure out which ones are the thermal “background” and which ones are the “exceptions”.

It is absolutely crucial that a metastable system (or even a grossly out-of-equilibrium system) must have a well-defined entropy, for reasons suggested by figure 9. Suppose the system starts out in equilibrium, with a well-defined entropy S(1). It then passes through an intermediate state that is out of equilibrium, and ends up in an equilibrium state with entropy S(3). The law of paraconservation of entropy is meaningless unless we can somehow relate S(3) to S(1). The only reasonable way that can happen is if the intermediate state has a well-defined entropy. The intermediate state typically does not have a temperature, but it does have a well-defined entropy.

s-not-t
Figure 9: Non-Equilibrium: Well-Defined Entropy


10.5  Rumford’s Experiment

Benjamin Thompson (Count Rumford) did some experiments that were published in 1798. Before that time, people had more-or-less assumed that thermal energy was somehow separate from other forms of energy, and was separately conserved. Rumford totally demolished this notion, by demonstrating that unlimited amounts of thermal energy could be produced by nonthermal mechanical means.

You would do well to read Rumford’s original paper right now. See reference 10. It is a masterpiece: easy to read, informative, and entertaining.

Analysing Rumford’s set-up in terms of heat and work is a quagmire. First of all, you would need to resolve the conflict between the various definitions (section 16 and section 17.1). Yes, work is being done in terms of energy flowing across the boundary, but no work is being done in terms of the work/KE theorem, since the cannon is not accelerating.

Secondly, you would need to decide what length-scales (λ) are of interest. Usually it is appropriate to use a mesoscopic length-scale, in the conventional “thermodynamic” spirit. Sometimes, however, it is worthwhile to drill down to the microscopic length-scale, to seek an understanding of how entropy is produced from scratch during a dissipative process. If you want to do this, it might be good to first examine the oil bearing (section 10.6) as a warm-up exercise.

A microscopic analysis of sliding friction between solid objects requires much attention to detail. The process is highly nonuniform in space and time. We must divide the system into tiny pieces; it is conventional to imagine a host of tiny asperities on the face of each object. We must account for the instantaneous force and motion of each piece. (Considering the average force and overall motion is not sufficient.)

10.6  A Dissipative System : Oil Bearing

10.6.1  A Sensible Approach

Here is a modified version of Rumford’s experiment, more suitable for quantitative analysis. Note that reference 25 carries out a similar analysis and reaches many of the same conclusions.

Suppose we have an oil bearing as shown in figure 10. It consists of an upper plate and a lower plate, with a thin layer of oil between them. Each plate is a narrow annulus of radius R. The lower plate is held stationary. The upper plate rotates under the influence of a force F, applied via a handle as shown. The upper plate is kept coaxial with the lower plate by a force of constraint, not shown. The two forces combine to create a pure torque, τ = F R. The applied torque τ is balanced in the long run by a frictional torque τ’; specifically

〈τ〉 = 〈τ’〉

where 〈 ... 〉 denotes a time-average. As another way of saying the same thing, in the long run the upper plate settles down to a more-or-less steady velocity.

We arrange that the system as a whole is thermally insulated from the environment, to a sufficient approximation. This includes arranging that the handle is thermally insulating. In practice this isn’t difficult.

We also arrange that the plates are somewhat thermally insulating, so that heat in the oil doesn’t immediately leak into the plates.

Viscous dissipation in the oil causes the oil to heat up. To a good approximation this is the only form of dissipation we must consider.

In an infinitesimal period of time, the handle moves through a distance dx or equivalently through an angle dθ = dx/R. We consider the driving force F to be a controlled variable. We consider θ to be an observable dependent variable. The relative motion of the plates sets up a steady shearing motion within the oil. We assume the oil forms a sufficiently thin layer and has sufficiently high viscosity that the flow is laminar (i.e. non-turbulent) everywhere. We say the fluid has a very low Reynolds number (but if you don’t know what that means, don’t worry about it). The point is that the velocity of the oil follows the simple pattern shown by the red arrows in figure 11.

shear
Figure 11: Shear: Velocity Field in the Oil


The local work done on the handle by the driving force is w = Fdx or equivalently w = τdθ. This tells us how much energy is flowing across the boundary of the system. From now on we stop talking about work, and instead talk about energy, confident that energy is conserved.

We can keep track of the energy-content of the system by integrating the energy inputs. Similarly, given the initial entropy and the heat capacity of the materials, we can predict the entropy at all times11 by integrating equation 18. Also given the initial temperature and heat capacity, we can predict the temperature at all times by integrating equation 17. We can then measure the temperature and compare it with the prediction.
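
As a concrete illustration of this bookkeeping, here is a small Python sketch. The torque, angular velocity, and heat capacity are numbers I made up; the update rules are the ordinary constant-heat-capacity relations dT = dE/C and dS = dE/T, applied to energy that enters via the handle and dissipates entirely inside the oil:

  tau   = 2.0       # applied torque, N*m (made up)
  omega = 50.0      # steady angular velocity, rad/s (made up)
  C     = 500.0     # heat capacity of the oil plus plates, J/K (made up)
  T     = 300.0     # initial temperature, K
  S     = 0.0       # entropy, relative to the initial state, J/K
  E     = 0.0       # energy added, relative to the initial state, J

  dt = 0.1
  for step in range(6000):          # ten minutes of operation
      dE = tau * omega * dt         # energy crossing the boundary via the handle
      E += dE
      S += dE / T                   # dS = dE / T : all the dissipation is internal
      T += dE / C                   # dT = dE / C

  print(E, T, S)   # 60 kJ in; temperature up by about 120 K; entropy up by about 170 J/K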

We can understand the situation in terms of equation 1 and equation 7. Energy τdθ comes in via the handle. This energy cannot be stored as potential energy within the system. This energy also cannot be stored as macroscopic or mesoscopic kinetic energy within the system, since at each point the velocity is essentially constant. By a process of elimination we conclude that this energy accumulates inside the system in the form of thermal energy.

This gives us a reasonably complete description of the thermodynamics of the oil bearing.

This example is simple, but helps make a very important point. If you base your thermodynamics on wrong foundations, it will get the wrong answer when applied to dissipative systems such as fluids, brakes, grindstones, et cetera. Some people try to duck this problem by narrowing their definition of “thermodynamics” so severely that it has nothing to say (right or wrong) about dissipative systems. Making no predictions is a big improvement over making wrong predictions ... but still it is a terrible price to pay. Real thermodynamics has tremendous power and generality. Real thermodynamics applies just fine to dissipative systems. See section 19 for more on this.

10.6.2  Misconceptions : Heat

There are several correct ways of analyzing the oil-bearing system, one of which was presented in section 10.6.1. In addition, there are innumerably many incorrect ways of analyzing things. We cannot list all possible misconceptions, let alone discuss them all. However, it seems worthwhile to point out some of the most prevalent pitfalls.

You may have been taught to think of heating as thermal energy transfer across a boundary. That’s definition #3 in section 16. That’s fine provided you don’t confuse it with definition #2 (TdS).

The oil bearing serves as a clear illustration of the difference between heat-flow and heat-TdS. This is an instance of boundary/interior inconsistency, as discussed in section 15.

No heat is flowing into the oil. The oil is hotter than its surroundings, so if there is any heat-flow at all, it flows outward from the oil.   The TdS/dt is strongly positive. The entropy of the oil is steadily increasing.

Another point that can be made using this example is that the laws of thermodynamics apply just fine to dissipative systems. Viscous damping has a number of pedagogical advantages relative to (say) the sliding friction in Rumford’s cannon-boring experiment. It’s clear where the dissipation is occurring, and it’s clear that the dissipation does not prevent us from assigning a well-behaved temperature to each part of the apparatus. Viscous dissipation is more-or-less ideal in the sense that it does not depend on submicroscopic nonidealities such as the asperities that are commonly used to explain solid-on-solid sliding friction.

10.6.3  Misconceptions : Work

We now discuss some common misconceptions about work.

Work is susceptible to boundary/interior inconsistencies for some of the same reasons that heat is.

You may have been taught to think of work as an energy transfer across a boundary. That’s one of the definitions of work discussed in section 17.1. It’s often useful, and is harmless provided you don’t confuse it with the other definition, namely PdV.

Work-flow is the “work” that shows up in the principle of virtual work (reference 21), e.g. when we want to calculate the force on the handle of the oil bearing.   Work-PdV is the “work” that shows up in the work/KE theorem.

10.6.4  Remarks

This discussion has shed some light on how equation 8 can and cannot be interpreted.
  • Sometimes the terms on the RHS are well-defined and can be interpreted as “work” and “heat”.
  • Sometimes the terms on the RHS are well-defined but do not correspond to conventional notions of “work” and “heat”.
  • Sometimes the terms on the RHS are not even well-defined, i.e. the derivatives do not exist.
In all cases, the equation should not be considered the first law of thermodynamics, because it is inelegant and in every way inferior to a simple, direct statement of local conservation of energy.

10.7  The Gibbs Gedankenexperiment

As shown in figure 12, suppose we have two containers connected by a valve. Initially the valve is closed. We fill one container with an ideal gas, and fill the other container with a different ideal gas, at the same temperature and pressure. When we open the valve, the gasses will begin to mix. The temperature and pressure will remain unchanged, but there will be an irreversible increase in entropy. After mixing is complete, the molar entropy will have increased by Rln2.

gibbs
Figure 12: The Gibbs Gedankenexperiment


As Gibbs observed,12 the Rln2 result is independent of the choice of gasses, “... except that the gasses which are mixed must be of different kinds. If we should bring into contact two masses of the same kind of gas, they would also mix, but there would be no increase of entropy”.
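
The Rln2 figure is easy to check using the standard ideal-gas entropy-of-mixing formula, ΔS = −R ∑ xi ln xi, evaluated at equal mole fractions. Here is a short Python verification:

  import math

  R = 8.314462618          # gas constant, J/(K*mol)
  x = [0.5, 0.5]           # mole fractions after mixing

  delta_S_molar = -R * sum(xi * math.log(xi) for xi in x)
  print(delta_S_molar)     # 5.76 J/(K*mol)
  print(R * math.log(2))   # the same number: R ln 2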

There is no way to explain this in terms of 19th-century physics. The explanation depends on quantum mechanics. It has to do with the fact that one helium atom is identical (absolutely totally identical) with another helium atom.

Also consider the following contrast:

In figure 12, the pressure on both sides of the valve is the same. There is no net driving force. The process proceeds by diffusion, not by macroscopic flow.   This contrasts with the scenario where we have gas on one side of the partition, but vacuum on the other side. This is dramatically different, because in this scenario there is a perfectly good 17th-century dynamic (not thermodynamic) explanation for why the gas expands: there is a pressure difference, which drives a flow of fluid.

Entropy drives the process. There is no hope of extracting energy from the diffusive mixing process.   Energy drives the process. We could extract some of this energy by replacing the valve by a turbine.

The timescale for free expansion is roughly L/c, where L is the size of the apparatus, and c is the speed of sound. The timescale for diffusion is slower by a huge factor, namely by a factor of L/λ, where λ is the mean free path in the gas.
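To get a feel for the numbers, here is a rough order-of-magnitude sketch; the apparatus size, sound speed, and mean free path are illustrative assumptions, not measurements:

    # Rough comparison of free-expansion vs diffusion timescales.
    # All numbers are illustrative order-of-magnitude assumptions.
    L   = 0.10      # size of apparatus, m (assumed)
    c   = 340.0     # speed of sound in the gas, m/s (assumed)
    lam = 7e-8      # mean free path at ambient conditions, m (assumed)

    t_flow = L / c               # timescale for free expansion
    t_diff = t_flow * (L / lam)  # slower by a factor of L/lambda
    print(t_flow)                # ~3e-4 s
    print(L / lam)               # ~1.4e6, the huge factor
    print(t_diff)                # ~4e2 s, i.e. minutes rather than milliseconds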
Pedagogical note: The experiment in figure 12 is not very exciting to watch. Here’s an alternative: Put a drop or two of food coloring in a beaker of still water. The color will spread throughout the container, but only rather slowly. This allows students to visualize a process driven by entropy, not energy.

Actually, it is likely that most of the color-spreading that you see is due to convection, not diffusion. To minimize convection, try putting the water in a tall, narrow glass cylinder, and putting it under a bell jar to protect it from drafts. Then the spreading will take a very long time indeed.

Beware: Diffusion experiments of this sort are tremendously valuable if explained properly ... but they are horribly vulnerable to misinterpretation if not explained properly, for reasons discussed in section 8.7.

10.8  Spin Echo Experiment

It is possible to set up an experimental situation where there are a bunch of nuclei whose spins appear to be oriented completely at random, like a well-shuffled set of cards. However, if I let you in on the secret of how the system was prepared, you can, by using a certain sequence of Nuclear Magnetic Resonance (NMR) pulses, get all the spins to line up -- evidently a very low-entropy configuration.

The trick is that there is a lot of information in the lattice surrounding the nuclei, something like 10^23 bits of information. I don’t need to communicate all this information to you explicitly; I just need to let you in on the secret of how to use this information to untangle the spins.

The ramifications and implications of this are discussed in section 11.7.

10.9  Melting

Take a pot of ice-water. Add energy to it via friction, à la Rumford, as described in section 10.5. The added energy will cause the ice to melt. The temperature of the ice-water will not increase, not until all the ice is gone.

This illustrates the fact that temperature is not the same as thermal energy. It focuses our attention on the entropy. A gram of liquid water has more entropy than a gram of ice. So at any given temperature, a gram of water has more thermal energy than a gram of ice.
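To put rough numbers on this, here is a back-of-the-envelope sketch, assuming the usual latent heat of fusion of about 334 J/g at 273.15 K:

    # Entropy of fusion of water, per gram, and per molecule in bits.
    # Latent heat and molar mass are assumed textbook values.
    import math

    L_fusion = 334.0     # latent heat of fusion of ice, J/g (assumed)
    T_melt   = 273.15    # melting temperature, K

    dS_per_gram = L_fusion / T_melt
    print(dS_per_gram)   # ~1.22 J/(g K) more entropy in the liquid than in the ice

    k_B = 1.380649e-23                    # Boltzmann constant, J/K
    N_per_gram = 6.02214076e23 / 18.015   # water molecules per gram
    print(dS_per_gram / (N_per_gram * k_B * math.log(2)))   # ~3.8 bits per molecule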

The following experiment makes an interesting contrast.

10.10  Isentropic Expansion and Compression

Take an ideal gas in a piston. Assume everything is thermally insulated, so that no energy enters or leaves the system via thermal conduction. Gently retract the piston, allowing the gas to expand. The gas cools as it expands. In the expanded state,
  • The gas has essentially the same entropy, if the expansion was done gently enough.
  • The gas has a lower temperature.
  • The gas has less thermal energy, by some amount ΔE.
Before the expansion, the energy in question (ΔE) was in thermal form, within the gas.   After the expansion, this energy is in non-thermal form, within the mechanism that moves the piston.

This scenario illustrates the difference between temperature and entropy, and the difference between thermal energy and entropy.

Remember, the second law of thermodynamics says that the entropy obeys a local law of paraconservation. Be careful not to misquote this law. It doesn’t say that the temperature can’t decrease. It doesn’t say that the thermal energy can’t decrease. It says the entropy can’t decrease in any given region of space, except by flowing into adjacent regions.

Energy is conserved. That is, it cannot increase or decrease except by flowing into adjacent regions. Thermal energy by itself is not conserved ... it can be converted to/from other forms of energy, and the energy-conservation law applies only to the total energy.

If you gently push the piston back in, compressing the gas, the temperature will go back up.

Isentropic compression is an increase in temperature at constant entropy. Melting (section 10.9) is an increase in entropy at constant temperature. These are two radically different ways of increasing the thermal energy.
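Here is a minimal numeric sketch of the expansion half of this contrast, for a monatomic ideal gas gently expanded to twice its volume. The adiabat T V^(γ−1) = const and the monatomic entropy formula are standard ideal-gas results; the starting temperature is an arbitrary assumption.

    # Gentle (isentropic) expansion of a monatomic ideal gas: T drops, S does not.
    import math

    gamma = 5.0 / 3.0          # monatomic ideal gas
    T1, V1 = 300.0, 1.0        # initial temperature (K) and volume (assumed)
    V2 = 2.0 * V1              # expand to twice the volume

    T2 = T1 * (V1 / V2) ** (gamma - 1.0)   # adiabat: T V^(gamma-1) = const
    print(T2)                              # ~189 K: the gas cools

    # Entropy change per mole: dS = R [ ln(V2/V1) + (3/2) ln(T2/T1) ]
    R = 8.314462618
    dS = R * (math.log(V2 / V1) + 1.5 * math.log(T2 / T1))
    print(dS)                              # 0: the entropy is unchanged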

10.11  Demagnetization Refrigerator

Attach a bar magnet to a wooden board so that it is free to pivot end-over-end. This is easy; get a metal bar magnet and drill a hole in the middle, then nail it loosely to the board. Observe that it is free to rotate. You can imagine that if it were smaller and more finely balanced, thermal agitation would cause it to rotate randomly back and forth forever.

Now hold another bar magnet close enough to ruin the free rotation, forcing the spinner to align with the imposed field.

This is a passable pedagogical model of part of a demagnetization refrigerator. Such refrigerators are used to produce exceedingly low temperatures (microkelvins or below). Commonly copper nuclei are used as the spinners. They have only 1 accessible state in a large magnetic field at low temperature, but have 4 equiprobable states when free. The latter corresponds to a molar entropy of R ln(4). This value can be obtained in the obvious way just by counting states, and also this value is what is observed in the thermal performance of the refrigerator. What a coincidence! This answers the question about how to connect state-counting to macroscopic thermal behavior.
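A quick numeric check of the R ln(4) figure, and its equivalent in bits per nucleus:

    # Molar spin entropy of free copper nuclei: 4 equiprobable states per nucleus.
    import math

    R = 8.314462618              # gas constant, J/(mol K)

    S_molar = R * math.log(4)
    print(S_molar)               # ~11.5 J/(mol K)

    bits_per_nucleus = math.log(4) / math.log(2)
    print(bits_per_nucleus)      # 2 bits per nucleus, as expected from counting states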

10.12  Thermal Insulation

As a practical technical matter, it is possible to have thermal insulation.

If we push on an object using a thermally-insulating stick, we can transfer energy to the object, without transferring much entropy.

In contrast, if we push on a hot object using a non-insulating stick, even though we impart energy to one or two of the object’s modes by pushing, the object could be losing energy overall, via thermal conduction through the stick.

Similarly, if you try to build a piece of thermodynamic apparatus, such as an automobile engine, it is essential that some parts reach thermal equilibrium reasonably quickly, and it is equally essential that other parts do not reach equilibrium on the same timescale.

11  More About Entropy

11.1  Microstate versus Macrostate

Beware: In thermodynamics, the word “state” is used with several inconsistent meanings.
  • In this document, unless otherwise stated, state means microstate, completely specifying the relevant microscopic variables. For example, for a deck of cards, that means specifying exactly which card is on top, exactly which card is in the second position, et cetera.
  • In other situations, people say “state” meaning macrostate, specified by macroscopic variables such as the temperature, density, and pressure. A macrostate is an equivalence class, a set containing many, many microstates.

11.2  Phase Space

As mentioned in section 2.4.1, our notion of entropy is completely dependent on having a notion of microstate, and on having a procedure for assigning probability to microstates.

For systems where the relevant variables are naturally discrete, this is no problem. See section 2.2 and section 2.3 for examples involving symbols, and section 10.11 for an example involving real thermal physics.

We now discuss the procedure for dealing with continuous variables. In particular, we focus attention on the position and momentum variables.

It turns out that we must account for position and momentum jointly, not separately. That makes a lot of sense, as you can see by considering a harmonic oscillator with period τ: If you know the oscillator’s position at time t, you know its momentum at time t+τ/4 and vice versa.

Figure 13 shows how this works, in the semi-classical approximation. There is an abstract space called phase space. For each position variable q there is a momentum variable p. (In the language of classical mechanics, we say p and q are dynamically conjugate, but if you don’t know what that means, don’t worry about it.)


Area in phase space is called action. We divide phase space into cells of size h, where h is Planck’s constant, also known as the quantum of action. A system has zero entropy if it can be described as sitting in a single cell in phase space. If we don’t know exactly where the system sits, so that it must be described as a probability distribution in phase space, it will have some correspondingly greater entropy.

If there are M independent position variables, there will be M momentum variables, and each microstate will be associated with a 2M-dimensional cell of size h^M.

Using the phase-space idea, we can already understand, qualitatively, the entropy of an ideal gas in simple situations:

  • If we keep the volume constant and increase the temperature, the entropy goes up. The spread in position stays the same, but the spread in momentum increases.
  • If we keep the temperature constant and increase the volume, the entropy goes up. The spread in momentum stays the same, but the spread in position increases.
For a non-classical variable such as spin angular momentum, we don’t need to worry about conjugate variables. The spin is already discrete i.e. quantized, so we know how to count states ... and it already has the right dimensions, since angular momentum has the same dimensions as action.

In section 2, we introduced entropy by discussing systems with only discrete states, namely re-arrangements of a deck of cards. We now consider a continuous system, such as a collection of free particles. The same ideas apply.

For each continuous variable, you can divide the phase space into cells of size h and then see which cells are occupied, as discussed in section 11.2. In classical thermodynamics, there is no way to know the value of h; it is just an arbitrary constant. Changing the value of h changes the amount of entropy by an additive constant. But really there is no such arbitrariness, because “classical thermodynamics” is a contradiction in terms. There is no fully self-consistent classical thermodynamics. In modern physics, we definitely know the value of h, Planck’s constant. Therefore we have an absolute scale for measuring entropy.
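For a monatomic ideal gas, this cell-counting program can be carried to completion; the result is the Sackur-Tetrode formula. Here is a minimal sketch for helium near room conditions (the temperature and pressure are illustrative assumptions); the answer agrees with the tabulated absolute entropy, and you can see that changing h would merely shift the result by an additive constant per atom.

    # Absolute entropy of a monatomic ideal gas (Sackur-Tetrode), i.e. counting
    # phase-space cells of size h^3 per particle.  Helium near room conditions;
    # the T and P values are illustrative assumptions.
    import math

    h   = 6.62607015e-34            # Planck's constant, J s
    k_B = 1.380649e-23              # Boltzmann constant, J/K
    N_A = 6.02214076e23             # Avogadro's number
    m   = 4.0026 * 1.66053907e-27   # mass of a helium atom, kg

    T, P = 298.15, 101325.0         # temperature (K) and pressure (Pa), assumed
    v = k_B * T / P                 # volume per atom, V/N

    S_per_atom = k_B * (math.log(v * (2*math.pi*m*k_B*T / h**2)**1.5) + 2.5)
    print(S_per_atom * N_A)                   # ~126 J/(mol K), the tabulated value
    print(S_per_atom / (k_B*math.log(2)))     # ~22 bits per atom
    # Doubling h would lower S by 3*k_B*ln(2) per atom -- an additive shift only.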

11.3  Entropy in a Crystal; Phonons, Electrons, and Spins

Imagine a crystal of pure copper, containing only the 63Cu isotope. Under ordinary desktop conditions, most of the thermal energy in the crystal takes the form of random potential and kinetic energy associated with vibrations of the atoms relative to their nominal positions in the lattice. We can find “normal modes” for these vibrations. This is the same idea as finding the normal modes for two coupled oscillators, except that this time we’ve got something like 10^23 coupled oscillators. There will be three normal modes per atom in the crystal. Each mode will be occupied by some number of phonons.

At ordinary temperatures, almost all modes will be in their ground state. Some of the low-lying modes will have a fair number of phonons in them, but this contributes only modestly to the entropy. When you add it all up, the crystal has about 6 bits per atom of entropy in the thermal phonons at room temperature. This depends strongly on the temperature, so if you cool the system, you quickly get into the regime where the thermal phonon system contains much less than one bit of entropy per atom.
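A rough cross-check of the 6-bit figure can be made with the Debye model; the sketch below assumes a Debye temperature of about 343 K for copper, and lands in the right neighborhood:

    # Rough estimate of the phonon entropy of copper at room temperature,
    # using the Debye model.  The Debye temperature is an assumed literature value.
    import math
    from scipy.integrate import quad

    k_B   = 1.380649e-23     # J/K
    theta = 343.0            # Debye temperature of copper, K (assumed)
    T     = 300.0
    x_D   = theta / T

    integrand = lambda t: t**3 / math.expm1(t) if t > 0 else 0.0
    I, _ = quad(integrand, 0.0, x_D)
    D_val = 3.0 * I / x_D**3                 # Debye function D(x_D)

    S_per_atom = 3 * k_B * ((4.0/3.0)*D_val - math.log(1.0 - math.exp(-x_D)))
    print(S_per_atom / (k_B * math.log(2)))  # ~5.3 bits per atom, in the right neighborhood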

There is, however, more to the story. The copper crystal also contains conduction electrons. They are mostly in a low-entropy state, because of the exclusion principle, but still they manage to contribute a little bit to the entropy, about 1% as much as the thermal phonons at room temperature.

A third contribution comes from the fact that each 63Cu nucleus can be in one of four different spin states: +3/2, +1/2, -1/2, or -3/2. Mathematically, it’s just like flipping a four-sided coin. The spin system contains two bits of entropy per atom under ordinary conditions.

You can easily make a model system that has four states per particle. The most elegant way might be to carve some tetrahedral dice ... but it’s easier and just as effective to use four-sided “bones”, that is, parallelepipeds that are roughly 1cm by 1cm by 3 or 4 cm long. Make them long enough and/or round off the ends so that they never settle on the ends. Color the four long sides four different colors. A collection of such bones is profoundly analogous to a collection of copper nuclei. The which-way-is-up variable contributes two bits of entropy per bone, while the nuclear spin contributes two bits of entropy per atom.

In everyday situations, you don’t care about this extra entropy in the spin system. It just goes along for the ride. This is an instance of spectator entropy, as discussed in section 11.5.

However, if you subject the crystal to a whopping big magnetic field (Teslas) and get things really cold (milliKelvins), you can get the nuclear spins to line up. Each nucleus is like a little bar magnet, so it tends to align itself with the applied field, and at low-enough temperature the thermal agitation can no longer overcome this tendency.

Let’s look at the cooling process, in a high magnetic field. We start at room temperature. The spins are completely random. If we cool things a little bit, the spins are still completely random. The spins have no effect on the observable properties such as heat capacity.

As the cooling continues, there will come a point where the spins start to line up. At this point the spin-entropy becomes important. It is no longer just going along for the ride. You will observe a contribution to the heat capacity whenever the crystal unloads some entropy.

You can also use copper nuclei to make a refrigerator for reaching very cold temperatures.

11.4  Entropy is Entropy

Some people who ought to know better try to argue that there is more than one kind of entropy.

Sometimes they try to make one or more of the following distinctions:

Shannon entropy.   Thermodynamic entropy.

Entropy of abstract symbols.   Entropy of physical systems.

Entropy defined by equation 3.   Entropy defined in terms of energy and temperature.

Small systems: 3 blocks with 5^3 states, or 52 cards with 52! states   Large systems: 10^25 copper nuclei with 4^(10^25) states.

It must be emphasized that none of these distinctions have any value.

For starters, having two types of entropy would require two different paraconservation laws, one for each type. Also, if there exist any cases where there is some possibility of converting one type of entropy to the other, we would be back to having one overall paraconservation law, and the two type-by-type laws would be seen as mere approximations.

Also note that there are plenty of systems where there are two ways of evaluating the entropy. The copper nuclei described in section 10.11 have a maximum molar entropy of R ln(4). This value can be obtained in the obvious way by counting states, just as we did for the small, symbol-based systems in section 2. This is the same value that is obtained by macroscopic measurements of energy and temperature. What a coincidence!

Let’s be clear: The demagnetization refrigerator counts both as a small, symbol-based system and as a large, thermal system. Additional examples are mentioned in section 20.

11.5  Spectator Entropy

Suppose we define a bogus pseudo-entropy S’ as

S’ := S + K              (37)

for some arbitrary constant K. It turns out that in some (but not all!) situations, you may not be sensitive to the difference between S’ and S.

For example, suppose you are measuring the heat capacity. That has the same units as entropy, and is in fact closely related to the entropy. But we can see from equation 18 that the heat capacity is not sensitive to the difference between S’ and S, because the derivative on the RHS annihilates additive constants.
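Here is a one-line symbolic check, assuming the heat capacity referred to above takes the usual form C = T ∂S/∂T; the additive constant K drops out:

    # The additive constant K drops out of the heat capacity C = T dS/dT
    # (the assumed form of the equation referenced in the text).
    import sympy as sp

    T, K = sp.symbols('T K', positive=True)
    S = sp.Function('S')(T)

    C_true  = T * sp.diff(S, T)        # from the true entropy S
    C_bogus = T * sp.diff(S + K, T)    # from the pseudo-entropy S' = S + K
    print(sp.simplify(C_true - C_bogus))   # 0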

Similarly, suppose you want to know whether a certain chemical reaction will proceed spontaneously or not. That depends on the difference between the initial state and the final state, that is, differences in energy and differences in entropy. So once again, additive constants will drop out.

There are many standard reference books that purport to tabulate the entropy of various chemical compounds ... but if you read the fine print you will discover that they are really tabulating the pseudo-entropy S’ not the true entropy S. In particular, the tabulated numbers typically do not include the contribution from the nuclear spin-entropy, nor the contribution from mixing the various isotopes that make up each element. They can more-or-less get away with this because under ordinary chem-lab conditions those contributions are just additive constants.

However, you must not let down your guard. Just because you can get away with using S’ instead of S in a few simple situations does not mean you can get away with it in general. There is a correct value for S and plenty of cases where the correct value is needed.

11.6  No Secret Entropy, No Hidden Variables

Suppose we want to find the value of the true entropy, S. We account for the thermal phonons, and the electrons, and the nuclear spins. We even account for isotopes, chemical impurities, and structural defects in the crystal. But ... how do we know when to stop? How do we know if/when we’ve found all the entropy? In section 11.5 we saw how some of the entropy could silently go along for the ride, as a spectator, under certain conditions. Is there some additional entropy lurking here or there? Could there be hitherto-unimagined quantum numbers that couple to hitherto-unimagined fields?

The answer is no. According to all indications, there is no secret entropy. At any temperature below several thousand degrees, electrons, atomic nuclei, and all other subatomic particles can be described by their motion (position and momentum) and by their spin, but that’s it, that’s a complete description. Atoms, molecules, and all larger structures can be completely described by what their constituent particles are doing.

In classical mechanics, there could have been an arbitrary amount of secret entropy, but in the real world, governed by quantum mechanics, the answer is no.

We have a firm experimental basis for this conclusion. According to the laws of quantum mechanics, the scattering of identical particles is different from the scattering of distinguishable particles. Under ordinary conditions, protons are indistinguishable if they are in the same spin state, but they are distinguishable if they are in different spin states.

Therefore, suppose that in addition to the 1 bit of well-known spin-entropy, each proton had 17 bits of “secret entropy”, in whatever form you can imagine. That would mean that there would be 2^17 different distinguishable types of proton. If you pick protons at random, they would almost certainly be distinguishable, whether or not their spins were aligned, and you would almost never observe like-spin scattering to be different from unlike-spin scattering.

Such scattering experiments have been conducted with electrons, protons, various heavier nuclei, and sometimes entire atoms. There has never been any indication of any secret entropy.

The thermodynamics of chemical reactions tells us that larger structures can be described in terms of their constituents with no surprises.

The existence of superfluidity is further evidence that we can correctly account for entropy. All the atoms in the superfluid phase are described by a single quantum wavefunction. The entropy per atom is zero; otherwise it wouldn’t be a superfluid. Superfluid 4He depends on the fact that all 4He atoms are absolutely totally exactly identical. We already knew they were identical, based on two-particle scattering experiments, but the superfluid reassures us that we haven’t overlooked anything when going from a pair of particles to 10^23 particles.

Superfluidity occurs because certain identical-particle effects are cumulative and therefore have a spectacular effect on the entire fluid. Similar macroscopic identical-particle effects have been directly observed in 3He, spin-polarized monatomic hydrogen, sodium atomic gas, and other systems.

It might also be remarked that the existence of superconductors, semiconductors, metals, molecular bonds, and the periodic table of elements is strong evidence that electrons have no secret entropy. The existence of lasers is strong evidence that photons have no secret entropy.

I can’t prove that no hitherto-secret entropy will ever be discovered. We might discover a new atom tomorrow, called Loonium, which is exactly the same as Helium except that for some reason it always obeys the distinguishable-particle scattering law when scattering against Helium. This wouldn’t be the end of the world; we would just postulate a new quantum number and use it to distinguish the two types of atom. All I can say is that Loonium must be exceedingly rare; otherwise it would have been noticed.

Reminder: The foregoing discussion applies to “secret entropy” that might exist at room temperature or below, in analogy to spin entropy. In contrast we are not talking about the plethora of quantum numbers that are known to come into play at higher energies, but are all in their ground state under ordinary room-temperature conditions.

11.7  Entropy is Context Dependent

Consider 100 decks of cards. The first one is randomly shuffled. It has an entropy of just under 226 bits. All the rest are ordered the same way as the first. If you give me any one of the decks in isolation, it will take me 226 yes/no questions to figure out how to return the deck to standard order. But after I’ve seen any one of the decks, I know the exact microstate of every other deck without asking additional questions. The other 99 decks contain zero additional entropy.
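The 226-bit figure is just log2(52!), which is easy to verify:

    # Entropy of one well-shuffled deck: log2(52!) bits.
    import math

    bits = math.log2(math.factorial(52))
    print(bits)    # ~225.58, i.e. just under 226 bits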

In a situation like this, it’s hard to consider entropy to be a state variable. In particular, the entropy density will not be an intensive property.

I know this sounds creepy, but it’s real physics. Creepy situations like this do not usually occur in physical systems, but sometimes they do. Examples include:

  • The spin-echo experiment (section 10.8) is the perfect example of this.
  • Small thermodynamic systems, including Maxwell demons and Szilard engines, are also excellent examples.
  • There are many magic tricks that involve a deck of cards that is (or appears to be) completely disordered, yet important details of the configuration are known to the magician.
  • Similarly, in cryptology, a string of symbols that is well encrypted will pass any standard test for randomness, and is therefore completely unpredictable to most parties ... yet it is highly predictable to parties who hold the key.
In an ordinary ideal gas, you can pretty much assume the entropy density is a well-behaved intensive property -- but don’t completely let down your guard, or you’ll be badly fooled by the spin-echo setup.

A related issue concerns the dependence of entropy on the choice of observer. Entropy is not simply a property of a system, but rather a property of the system and the description thereof. This was mentioned in passing near the end of section 2.

Let’s be clear: As a matter of principle, two different observers will in general assign two different values to “the” entropy.

This is easy to express in mathematical terms. The trustworthy definition of entropy is equation 3. If P is a conditional probability, as it often is, then S is a conditional entropy.
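Assuming equation 3 is the usual ∑ P log(1/P) sum, here is a tiny sketch of how two observers with different knowledge (hence different probability assignments) report different entropies for the same four-state system; the probability values are made up for illustration:

    # Two observers assign different probabilities to the same four microstates,
    # so they report different entropies.  The numbers are made up.
    import math

    def entropy_bits(p):
        # S = sum_i p_i log2(1/p_i), skipping zero-probability states
        return sum(pi * math.log2(1.0/pi) for pi in p if pi > 0)

    p_ignorant = [0.25, 0.25, 0.25, 0.25]   # knows nothing: all states equally likely
    p_informed = [0.97, 0.01, 0.01, 0.01]   # has measured something
    print(entropy_bits(p_ignorant))   # 2.0 bits
    print(entropy_bits(p_informed))   # ~0.24 bits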

Human observers are so grossly dissipative and usually “know” so little that it is academic to worry about the thermodynamics of human “knowledge”. However, the issue takes on new life when we consider highly-optimized robot measuring devices -- Maxwell demons and the like.

For microscopic systems, it is for sure possible for different observers to report different values of “the” entropy (depending on what each observer knows about the system). The discrepancy can be a large percentage of the total.

By way of analogy, you know that different observers report different values of “the” kinetic energy (depending on the velocity of the observer), and this hasn’t caused the world to end.
For macroscopic systems (10^23 particles or thereabouts) it is uncommon for one observer to know 10^23 things that the other observer doesn’t ... but even this is possible. The spin echo experiment is a celebrated example, as discussed in section 10.8.

Regardless of the size of the system, it is often illuminating to consider a complete thermodynamic cycle, such that all participants are returned to the same state at the end of the cycle. This de-emphasizes what the observers “know” and instead focuses attention on how they “learn” ... and how they forget. In more technical terms: this focusses attention on the observation/measurement process, which is crucial if you want a deep understanding of what entropy is and where it comes from. See reference 13 and reference 14.

In particular, at some point in each cycle the observer will have to forget previous information, to make room for the new information. This forgetting expels entropy, and at temperature T it dissipates energy TS.

To repeat: When evaluating “the” entropy, it is necessary to account for the information in the observer-system. In a closed cycle, this focusses attention on the observation and measurement process. If you don’t do this, you will get the wrong answer every time when analyzing spin echo systems, Maxwell demons, Szilard engines, reversible computers, et cetera.
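To put a number on the cost of forgetting: per the TS statement above, expelling one bit’s worth of entropy (k ln 2) at temperature T dissipates energy kT ln 2. At room temperature that is a few zeptojoules per bit:

    # Energy dissipated by forgetting information: one bit expels k*ln(2) of
    # entropy, hence dissipates k*T*ln(2) at temperature T.
    import math

    k_B = 1.380649e-23     # J/K
    T   = 300.0            # room temperature, K (assumed)

    E_per_bit = k_B * T * math.log(2)
    print(E_per_bit)       # ~2.9e-21 J per bit at room temperature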

12  Entropy versus “Irreversibility” in Chemistry

In chemistry, the word “irreversible” is commonly used in connection with multiple inconsistent ideas, including:
  • The reaction is spontaneous.
  • The reaction strongly goes to completion.
  • The reaction is thermodynamically irreversible.
Those ideas are not completely unrelated ... but they are not completely identical, and there is potential for serious confusion.

You cannot look at a chemical reaction (as written in standard form) and decide whether it is spontaneous, let alone whether it goes to completion. For example, if you flow steam over hot iron, you produce iron oxide plus hydrogen. It goes to completion in the sense that the iron is used up. Conversely, if you flow hydrogen over hot iron oxide, you produce iron and H2O. It goes to completion in the sense that the iron oxide is used up.

And none of that has much to do with whether the reaction was thermodynamically reversible or not.

Here is a pair of scenarios that may clarify a few things.

Scenario #1: Suppose a heavy brick slides off a high shelf and falls to the floor. Clearly this counts as a “spontaneous” process. It liberates energy and liberates free energy.

Further suppose that near the floor we catch the brick using some sort of braking mechanism. The brakes absorb the energy and get slightly warm. This braking process is grossly irreversible in the thermodynamic sense. That is, the process is very far from being isentropic.

Now we can use the heat in the brakes to run a heat engine. Let’s suppose that it is an ideal heat engine. The fact that the engine is thermodynamically reversible is interesting, but it does not mean that the overall process (brick + brake + heat engine) is reversible. There was a terrible irreversibility at an upstream point in the process, before the energy reached the heat engine. The thermodynamic efficiency of the overall process will be terrible, perhaps less than 1%.

Scenario #2: Again the brick slides off the shelf, but this time we attach it to a long lever (rather than letting it fall freely). As the brick descends to the floor, the lever does useful work (perhaps raising another weight, generating electrical power, or whatever). The overall thermodynamic efficiency of this process could be very high, easily in excess of 90%, perhaps even in excess of 99%. The process is still spontaneous and still goes to completion.

From these scenarios we see that being spontaneous and/or going to completion does not necessarily tell you anything about whether the process is irreversible in the thermodynamic sense.

In elementary chemistry classes, people tend to pick up wrong ideas about thermodynamics, because the vast preponderance of the reactions that they carry out are analogous to scenario #1 above. That is, the reactions are grossly irreversible in the thermodynamic sense. The reactions are nowhere near isentropic.

There are some examples of chemical reactions that are essentially reversible, in analogy to scenario #2. In everyday life, the commonest examples of this are electrochemical reactions, e.g. storage batteries and fuel cells. Another example is the CO2/carbonate reaction discussed below. Alas, there is a tendency for people to forget about these reversible reactions and to unwisely assume that all reactions are grossly irreversible, in analogy to scenario #1. This unwise assumption can be seen in the terminology itself: widely-used tables list the “standard heat of reaction” (rather than the standard energy of reaction), apparently under the unjustifiable assumption that the energy liberated by the reaction will always show up as heat. Similarly reactions are referred to as “exothermic” and “endothermic”, even though it would be much wiser to refer to them as exergonic and endergonic.

It is very difficult, perhaps impossible, to learn much about thermodynamics by studying bricks that fall freely and smash against the floor. Instead, thermodynamics is most understandable and most useful when applied to situations that have relatively little dissipation, i.e. that are nearly isentropic.

Lots of people get into the situation where they have studied tens or hundreds or thousands of reactions, all of which are nowhere near isentropic. That’s a trap for the unwary. It would be unwise to leap to the conclusion that all reactions are far from isentropic ... and it would be even more unwise to leap to the conclusion that “all” natural processes are far from isentropic.

Chemists are often called upon to teach thermodynamics, perhaps under the guise of a “P-Chem” course (i.e. physical chemistry). This leads some people to ask for purely chemical examples to illustrate entropy and other thermodynamic ideas. I will answer the question in a moment, but first let me register my strong objections to the question. Thermodynamics derives its great power and elegance from its wide generality. Specialists who cannot cope with examples outside their own narrow specialty ought not be teaching thermodynamics.

Here’s a list of reasons why a proper understanding of entropy is directly or indirectly useful to chemistry students.

  1. Consider electrochemical reactions. Under suitable conditions, some electrochemical reactions can be made very nearly reversible in the thermodynamic sense. (See reference 18 for some notes on how such cells work.) In these cases, the heat of reaction is very much less than the energy of reaction, and the entropy is very much less than the energy divided by T.
  2. Consider the reaction that children commonly carry out, adding vinegar to baking soda, yielding sodium acetate and carbon dioxide gas. Let’s carry out this reaction in a more grown-up apparatus, namely a sealed cylinder with a piston. By pushing on the piston with weights and springs, we can raise the pressure of the CO2 gas. If we raise the pressure high enough, we push CO2 back into solution. This in turn raises the activity of the carbonic acid, and at some point it becomes a strong enough acid to attack the sodium acetate and partially reverse the reaction, liberating acetic acid. So this is clearly and inescapably a chemistry situation.

    Much of the significance of this story revolves around the fact that if we arrange the weights and springs just right, the whole process can be made thermodynamically reversible (nearly enough for practical purposes). Adding a tiny bit of weight will make the reaction go one way, just as removing a tiny bit of weight will make the reaction go the other way.

    Now some interesting questions arise: Could we use this phenomenon to build an engine, in analogy to a steam engine, but using CO2 instead of steam, using the carbonate ↔ CO2 chemical reaction instead of the purely physical process of evaporation? How does the CO2 pressure in this system vary with temperature? How much useful work would this CO2 engine generate? How much waste heat? What is the best efficiency it could possibly have? Can we run the engine backwards so that it works as a refrigerator?

    There are more questions of this kind, but you get the idea: once we have a reaction that is more-or-less thermodynamically reversible, we can bring to bear the entire machinery of thermodynamics.

  3. Consider the colligative effects of a solute on the freezing point, boiling point, and vapor pressure of a solvent. The fact that they’re colligative -- i.e. insensitive to the chemical properties of the solute -- is strong evidence that entropy is what’s driving these effects, not enthalpy, energy, or free energy.
  4. Similarly: consider the Gibbs Gedankenexperiment (section 10.7). Starting with a sample of 4He, we get an increase in entropy if we mix it with 3He, or Ne, or Xe ... but we get no effect if we “mix” it with more of the same 4He.
  5. People who take chemistry classes often go on to careers in other fields. For example, you might need knowledge of chemistry, physics, and engineering in order to design a rocket engine, or a jet engine, or a plain old piston engine. Such things commonly involve a chemical reaction followed by a more-or-less isentropic expansion. Even though the chemical reaction is grossly irreversible, understanding the rest of the system requires understanding thermodynamics.

    To be really specific, suppose you are designing something with multiple heat engines in series. This case is considered as part of the standard “foundations of thermodynamics” argument, as illustrated in figure 14. Entropy is conserved as it flows down the totem-pole of heat engines. The crucial conserved quantity that is the same for all the engines is entropy ... not energy, free energy, or enthalpy. No entropy is lost during the process, because entropy cannot be destroyed, and no entropy (just work) flows out through the horizontal arrows. No entropy is created, because we are assuming the heat engines are 100% reversible. For more on this, see reference 1. (A small numerical sketch of such a cascade appears just after this list.)


  6. Consider “Design of Experiment”, as discussed in reference 19. In this case the entropy of interest is not the entropy of the reaction, but still it is entropy, calculated in accordance with equation 3, and it is something a chemist ought to know. Research chemists and especially chemical engineers are often in the situation where experiments are very expensive, and someone who doesn’t understand Design of Experiment will be in big trouble.
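Returning to the cascade of heat engines in item 5, here is the promised numerical sketch. The entropy flow and the temperature ladder are made-up example values; the point is that the same entropy flows through every stage, and the energy books balance.

    # A cascade of ideal (reversible) heat engines in series: the same entropy
    # flows through every stage; each stage's work output is S*(T_hot - T_cold).
    # All numbers are made-up examples.
    S_flow = 1.0                              # entropy flow per cycle, J/K (assumed)
    temps  = [1500.0, 1000.0, 600.0, 300.0]   # K, hottest to coldest (assumed)

    works = [S_flow * (th - tc) for th, tc in zip(temps, temps[1:])]
    print(works)                  # [500.0, 400.0, 300.0] J per cycle
    print(sum(works))             # 1200.0 = S_flow * (1500 - 300)
    print(S_flow * temps[-1])     # 300.0 J of heat rejected at the bottom
    # Energy balance: 1500 J in = 1200 J of work + 300 J rejected.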

13  A Few More State-Functions

13.1  Enthalpy

We hereby define the enthalpy as:

H := E + P V              (38)

where H is the near-universally conventional symbol for enthalpy, E is the energy, V is the volume of the system, and P is the pressure on the system. We will briefly explore some of the mathematical consequences of this definition, and then explain what enthalpy is good for.

We will need the fact that

d(P V) = PdV + VdP              (39)

which is just the rule for differentiating a product. This rule applies to any two variables (not just P and V), provided they were differentiable to begin with. Note that this rule is intimately related to the idea of integrating by parts, as you can see by writing it as

PdV = d(P V) - VdP              (40)

and integrating both sides.

Differentiating equation 38 and using equation 11 and equation 39, we find that

dH   =   -PdV + TdS + PdV + VdP
     =   VdP + TdS              (41)
which runs nearly parallel to equation 11; on the RHS we have transformed -PdV into VdP, and of course the LHS is enthalpy instead of energy.

This trick of transforming xdy into -ydx (with a leftover d(xy) term) is called a Legendre transformation. Again we note the idea may be somewhat familiar in the guise of integrating by parts.

In the chemistry lab, it is common to carry out reactions under conditions of constant pressure. If the reaction causes the system to expand or contract -- for instance if gas is evolved from a solid or liquid -- it will do work against atmospheric pressure. This work will change the energy ... but it will not change the enthalpy, because the enthalpy keeps track of VdP rather than PdV, and dP is zero at constant pressure.

This means that under conditions of constant pressure, it is easier to keep track of the enthalpy than to keep track of the energy.
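For a concrete number, suppose one mole of gas is evolved at roughly room temperature against one atmosphere (both values assumed for illustration); the ideal-gas estimate of the work done against the atmosphere is nRT:

    # Gas evolved at constant atmospheric pressure: the work done pushing back
    # the atmosphere is P*dV = n*R*T (ideal-gas estimate).  This term shows up
    # in the energy bookkeeping but drops out of the enthalpy bookkeeping,
    # since dH = VdP + TdS and dP = 0 at constant pressure.
    R = 8.314462618      # J/(mol K)
    T = 298.15           # K (assumed room temperature)
    n = 1.0              # moles of gas evolved (assumed)

    PdV = n * R * T
    print(PdV)           # ~2479 J: the difference between the energy and enthalpy tallies here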

It is also amusing to differentiate H with respect to P and S directly, using the chain rule. This gives us:

dH = (∂H/∂P)|S dP + (∂H/∂S)|P dS              (42)

which is interesting because we can compare it, term by term, with equation 41. When we do that, we find that the following identities must hold:

V = (∂H/∂P)|S              (43)

and

T = (∂H/∂S)|P              (44)

Equation 44 is not meant to redefine T; it is merely a corollary of our earlier definition of T (equation 10) and our definition of H (equation 38).

13.2  Free Energy

In many situations -- for instance when dealing with heat engines -- it is convenient to keep track of the free energy of a given parcel. This is also known as the Helmholtz potential, or the Helmholtz free energy. It is defined as:

F = E - T S              (45)

where F is the conventional symbol for free energy, E is (as always) the energy, S is the entropy, and T is the temperature of the parcel.

The free energy is extremely useful for analyzing the spontaneity and reversibility of transformations taking place at constant T and constant V. See reference 20 for details.

See section 13.4 for a discussion of what is (or isn’t) “free” about the free energy.

13.3  Free Enthalpy

Combining the ideas of section 13.1 and section 13.2, there are many situations where it is convenient to keep track of the free-enthalpy. This is also known as the Gibbs potential or the Gibbs free enthalpy. It is defined as:

G = E + P V - T S              (46)

where G is the conventional symbol for free enthalpy. (Beware: G is all-too-commonly called the Gibbs free “energy” but that is a bit of a misnomer. Please call it the free enthalpy, to avoid confusion between F and G.)

The free enthalpy is extremely useful for analyzing the spontaneity and reversibility of transformations taking place at constant T and constant P. See reference 20 for details.

13.4  Thermodynamically Available Energy

The notion of “free energy” is often misunderstood. Indeed the term “free energy” practically begs to be misunderstood.

It is superficially tempting to divide the energy E into two pieces, the “free” energy F and the “unfree” energy TS. This is formally possible, but not very helpful as far as I can tell. In particular, there is no connection to the ordinary meaning of “free”. You should not think that the free energy is the “thermodynamically available” part of the energy, or that TS is the “unavailable” part of the energy ... for reasons we now discuss. (See also section 6 for additional discussion of thermal versus nonthermal energy.)

Keep in mind that the free energy of a parcel is a function of state, and in particular is a function of the thermodynamic state of that parcel. That is, for parcel #1 we have F1 = E1 - T1 S1 and for parcel #2 we have F2 = E2 - T2 S2.

Suppose we hook up a heat engine that uses parcel #1 as its heat source and parcel #2 as its heat sink. Assume the heat engine is maximally efficient, so its efficiency is the Carnot efficiency, (T1 - T2)/T1. We see that the amount of “thermodynamically available” energy depends on T2, whereas the free energy of parcel #1 does not. In particular, if T2 is cold enough, the work done by the heat engine will exceed the free energy of parcel #1. Indeed, in the limit that T2 approaches absolute zero, the work done by the heat engine will converge to the entire energy E1, not the free energy F1.
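Here is a minimal numeric sketch of that point, modeling parcel #1 as an idealized finite heat source with constant heat capacity C (the values of C and T1 are arbitrary assumptions). The maximum work obtainable by cooling it reversibly down to the sink temperature T2 is C(T1 − T2) − T2 C ln(T1/T2), which plainly depends on T2 and approaches the parcel’s entire thermal energy C T1 as T2 goes to zero, in this constant-C idealization:

    # How much work an ideal engine can extract from parcel #1 depends on the
    # sink temperature T2.  Sketch only: parcel #1 is modeled as a finite heat
    # source with constant heat capacity C, cooled reversibly from T1 to T2.
    import math

    C  = 1000.0          # heat capacity of parcel #1, J/K (assumed)
    T1 = 400.0           # initial temperature of parcel #1, K (assumed)

    def max_work(T2):
        # energy given up by the parcel, minus the heat that must be dumped
        # into the sink to carry away the parcel's entropy
        return C*(T1 - T2) - T2 * C * math.log(T1 / T2)

    for T2 in (300.0, 100.0, 10.0, 1.0):
        print(T2, max_work(T2))
    # The available work grows as T2 drops, approaching C*T1 = 4.0e5 J as T2 -> 0,
    # even though the free energy F1 = E1 - T1*S1 of parcel #1 never changed.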

The notion of thermodynamically available energy (also called simply thermal energy) is often not precisely definable, as you can see by considering the case where parcel #1 is in contact with two other parcels, with two different temperatures. On the other hand, if you do happen to have a well-defined, unique “ambient” temperature then you might be able to formulate a well-behaved notion of “thermal energy”.

In any case, if you find yourself trying to quantify the “thermal energy” content of something, it is likely that you are asking the wrong question. You will probably be much better off quantifying something else instead, perhaps the energy E and the entropy S.

Similar remarks apply to the free enthalpy, G.

In general, you should never assume you can figure out the nature of a thing merely by looking at the name of a thing. A titmouse is not a kind of mouse. Milk of magnesia is not made of milk. Chocolate turtles are not made of turtles. As Voltaire remarked, the Holy Roman Empire was neither holy, nor Roman, nor an empire. By the same token, free energy is not the “free” part of the energy.

13.5  Relationships among E, F, G, and H

We have now encountered four quantities {E, F, G, H} all of which have dimensions of energy. The relationships among these quantities can be nicely summarized in two-dimensional charts, as in figure 15.

Figure 15: Energy, Enthalpy, Free Energy, and Free Enthalpy

Figure 16: Some Derivatives of E, F, G, and H


The four expressions in figure 16 constitute all of the expressions that can be generated by starting with equation 11 and applying Legendre transformations. They are emphatically not the only valid ways of differentiating E, F, G, and H. Equation 16 is a very practical example -- namely heat capacity -- that does not show up in figure 16. It involves expressing dE in terms of dV and dT (rather than dV and dS).

Beware: There is a widespread misconception that E must be expressed in terms of V and S, while H must be expressed in terms of P and S, and so on for F(V,T) and G(P,T). There are no good reasons for such restrictions on the choice of variables. (These restrictions are related to some problems caused by taking shortcuts with the notation for partial derivatives. However, the restrictions are neither necessary nor sufficient to solve the problems. See reference 23 for more on this.)

Note that H does not obey a local conservation law the way E does. You can increase the enthalpy of a region without decreasing the enthalpy of neighboring regions. However, if you know the pressure and volume, you can use the definition (equation 38) to calculate the energy in terms of enthalpy, apply the energy conservation law, and then recalculate the enthalpy if desired.

14  Adiabatic Processes

The word adiabatic is another term that suffers from multiple inconsistent meanings. The situation is summarized in figure 17.

Figure 17: Multiple Definitions of Adiabatic

  1. Some thoughtful experts use “adiabatic” to denote a process where no entropy is transferred across the boundary of the region of interest. This was probably the original meaning, according to several lines of evidence, including the etymology: α + δια + βατος = not passing across. As a corollary, we conclude the entropy of the region does not decrease.
  2. Other thoughtful experts refer to the adiabatic approximation (in contrast to the sudden approximation) to describe a perturbation carried out sufficiently gently that each initial state can be identified with a corresponding final state, and the occupation number of each state is preserved during the process. As a corollary, we conclude that the entropy of the region does not change.
  3. Dictionaries and textbooks commonly define “adiabatic” to mean no flow of entropy across the boundary and no creation of entropy.
In the dream-world where only reversible processes need be considered, definitions (1) and (2) are equivalent, but that’s not much help to us in the real world.

Also note that when discussing energy, the corresponding ambiguity cannot arise. Energy can never be created or destroyed, so if there is no transfer across the boundary, there is no change.

As an example where the first definition (no flow) applies, but the second definition (occupation numbers preserved) does not, see reference 22. It speaks of an irreversible adiabatic process, which makes sense in context, but is clearly inconsistent with the second meaning. This is represented by point (1) in the figure.

As an example where the second definition applies but the first definition does not, consider the refrigeration technique known as adiabatic demagnetization. The demagnetization is carried out gently, so that the notion of corresponding states applies to it. If the system were isolated, this would cause the temperature of the spin system to decrease. The interesting thing is that people still call it adiabatic demagnetization even when the spin system is not isolated. Specifically, consider the subcase where there is a steady flow of heat inward across the boundary of the system, balanced by a steady demagnetization, so as to maintain constant temperature. Lots of entropy is flowing across the boundary, violating the first definition, but it is still called adiabatic demagnetization in accordance with the second definition. This subcase is represented by point (2) in the diagram.

As an example where the second definition applies, and we choose not to violate the first definition, consider the NMR technique known as “adiabatic fast passage”. The word “adiabatic” tells us the process is slow enough that there will be corresponding states and occupation numbers will be preserved. Evidently in this context the notion of no entropy flow across the boundary is not implied by the word “adiabatic”, so the word “fast” is adjoined, telling us that the process is sufficiently fast that not much entropy does cross the boundary. To repeat: adiabatic fast passage involves both ideas: it must be both “fast enough” and “slow enough”. This is represented by point (3) in the diagram.

My recommendation is to avoid using the term adiabatic whenever possible. Some constructive suggestions include:

  • If you mean thermally insulated, say thermally insulated.
  • If you mean a non-sudden perturbation, say non-sudden or gentle.
  • If you mean isentropic, say isentropic.
  • Instead of the nouns “adiabat” or “adiabatic line”, say “contour of constant entropy”.

15  Boundary versus Interior

We now discuss two related notions:
  • The flow of something across the boundary of the region.
  • The change in the amount of something inside the region.
When we consider a conserved quantity such as energy, momentum, or charge, these two notions stand in a one-to-one relationship. In general, though, these two notions are not equivalent.

In particular, consider equation 21, which is restated here:

dE = -P dV + T dS + advection              (47)

 

Although officially dE represents the change in energy in the interior of the region, we are free to interpret it as the flow of energy across the boundary. This works because E is a conserved quantity.

The advection term is explicitly a boundary-flow term.

It is extremely tempting to interpret the two remaining terms as boundary-flow terms also ... but this is not correct!

Officially PdV describes a property of the interior of the region. Ditto for TdS. Neither of these can be converted to a boundary-flow notion, because neither of them represents a conserved quantity. In particular, PdV energy can turn into TdS energy entirely within the interior of the region, without any boundary being involved.

Let’s be clear: boundary-flow ideas are elegant, powerful, and widely useful. Please don’t think I am saying anything bad about boundary-flow ideas. I am just saying that the PdV and TdS terms do not represent flows across a boundary.

Misinterpreting TdS as a boundary term is a ghastly mistake. It is more-or-less tantamount to assuming that heat is a conserved quantity unto itself. It would set science back over 200 years, back to the “caloric” theory.

Once these mistakes have been pointed out, they seem obvious, easy to spot, and easy to avoid. But beware: mistakes of this type are extremely prevalent in introductory-level thermodynamics books.

16  Heat

The term “heat” is a very slippery weasel. At least three sensible but mutually-inconsistent definitions are in widespread use. It is not worth arguing about the relative merits of these definitions, except to say that each has some merit. I observe that a typical thoughtful expert will use each of these definitions, depending on context. It would be nice to have a single, universally-accepted definition, but I doubt that will happen anytime soon. Sensible definitions include:
  1. Sometimes “heat” simply means hotness, i.e. relatively high temperature. Example: if we’re having a heat wave, it means a spell of hot weather. The corresponding verb, heating, simply means making something hotter. This type of heat is measured in degrees.
  2. Sometimes the word “heat” is used to refer to the T dS term in equation 11. Keep in mind that this is a vector, in particular an inexact one-form. This type of heat is measured in joules. The corresponding verb, heating, happens if and only if there is a change in the entropy of the region.
  3. It is common to find encyclopedias, dictionaries, and textbooks that define “heat” as “energy that is transferred from one body to another as the result of a difference in temperature”. This implies a transfer of entropy across the boundary of the region.
In addition, one sometimes encounters some less-than-sensible definitions, including:
  • Chemists commonly use “heat” as an all-purpose synonym for enthalpy, for instance in the expression “heat of formation”. This includes cases where the “heat” (i.e. enthalpy) is not flowing across a boundary. Even more remarkably, it includes cases where the enthalpy is predominantly nonthermal, for instance in an electrochemical fuel cell. This usage is quite common, but I consider it a very unhelpful misnomer. I recommend crossing out terms like “heat of formation” and replacing them with terms like “enthalpy of formation” at every opportunity. Similarly the terms “exothermic” and “endothermic” in most cases should be crossed out and replaced with “exergonic” and “endergonic” respectively ... or perhaps “exenthalpic” and “endenthalpic”.
  • Some non-experts, when asked to define “heat”, describe something that is, in effect, the infrared portion of the electromagnetic spectrum. This notion is the basis of the phrase “heat rays”, and of the cliché “it gives off more heat than light”. Alas, this cliché makes no sense from a scientific point of view: It’s true that a black body that gives off lots of infrared radiation is hot ... but a black body that gives off visible light is hotter. Associating IR (rather than visible) with heat or hotness is just backwards.
As an example where definitions (1) and (2) apply, but definition (3) does not, consider the notion that a microwave oven heats the food. Clearly (1) the food gets hotter. Clearly (2) the entropy of the food changes. But (3) no entropy was transferred across the boundary of the food. Energy was transferred, but the entropy was created from scratch, within the food. According to any reasonable definition of temperature, the magnetron (the wave-generating device inside the oven) isn’t very hot, so you can’t say the energy was transferred “as the result of a difference in temperature”.

The distinction between (2) and (3) is an instance of the boundary/interior issue, as discussed in section 15.

As an example where definitions (2) and (3) apply, but definition (1) does not, consider a glass of ice-water sitting on the table. We say that heat leaks into the system and melts the ice. The temperature does not change during the process.

As an example where definition (1) applies but (2) and (3) do not, consider the reversible thermally-insulated compression of a parcel of gas. We say the gas heats up, and there is an increase in the amount of thermal energy within the region. On the other hand, clearly no heat or entropy was transferred across the boundary, and there was no change in the entropy within the region.

We now discuss the advantages and disadvantages of definition (3):

Definition (3) is probably the most widespread, perhaps in part because it is easily expressed in non-mathematical words. Many students have been forced to learn this definition by rote.   Rote learning is a poor substitute for understanding.

Definition (3) makes sense in some situations, such as a simple non-moving heat exchanger in a non-dissipative system.   Such situations are not representative of the general case.

Definition (3) focusses attention on flow across a boundary. This is good, because we believe all the laws of physics should be stated in local form, and flows across a boundary are crucial for this.   It focusses on temperature and heat. It would be better to focus on energy and entropy. Certainly energy and entropy can flow between systems that don’t even have a well-defined temperature (let alone a difference in temperature). Also remember that heat is not a conserved quantity, and it is hard to know what “flow” means when applied to non-conserved quantities. Whenever you talk about heat flow, you run the risk that non-experts will visualize heat as some sort of conserved fluid.

Heat is non-conserved twice over. First of all, even in reversible processes, heat is non-conserved because thermal energy can be converted to nonthermal energy and vice versa. As mentioned in section 10.6.3 energy is conserved, but heat (by itself) is not conserved. Secondly, in irreversible processes heat is not conserved because entropy is not conserved.

The word “heat” occurs in a great number of familiar expressions. Usually these are harmless, especially when used in a loose, qualitative sense ... but they can cause trouble if you try to quantify them, and some of them should be avoided entirely, because they are just begging to be misunderstood.

  • heat capacity
  • heat engine
  • heat pump
  • heat exchanger
  • heat bath
  • heat sink
  • heat source
  • heat leak
  • heat flow (problematic)
  • heat of reaction (very problematic)
  • et cetera.
As discussed in section 12, whenever you see the phrase “heat of reaction” you should cross it out and replace it with “enthalpy of reaction” or something similar. Also beware that Hess’s law is often taught in such a way that it seems to express conservation of heat. That’s terrible! Heat is not conserved! The symbol H conventionally stands for enthalpy; it does not stand for heat.

When we want to quantify things, it is better to forget about “heat” and just quantify energy and entropy, which are unambiguous and unproblematic. Talking about energy flow is incomparably better than talking about heat flow, because energy is a conserved quantity.

Here is a helpful analogy:

The problematic concept of phlogiston was replaced by two precise concepts (namely oxygen and energy).   The problematic concept of heat should be replaced by two precise concepts (namely energy and entropy).

As another analogy, consider the comparison between “heat” and “blue”, another common four-letter word.
Nobody in his right mind would try to quantify what “blue” means. Instead of quantifying the blueness, you should quantify something else, perhaps power versus wavelength.   Instead of quantifying heat, you should quantify the energy and entropy.

Actually “heat” is far more problematic than “blue”, because there’s something even worse than imprecision, namely holy wars between the big-endians and the little-endians, each of whom thinks they know “the one true meaning” of the term.

17  Work

17.1  Definitions

The definition of work suffers from one major problem plus several minor nuisances.

The major problem is that there are two perfectly good but inconsistent notions:

  1. Mechanical transfer of energy across a boundary. Here mechanical means non-thermal and non-advective.
  2. Force times distance.
These two notions are closely related but certainly not identical. This is an instance of the boundary/interior issue, as discussed in section 15. This is a recipe for maximal confusion. (Wildly different ideas are easily distinguished, and identical ideas need not be distinguished.)

Within the force-times-distance family, there are the following nuisance factors, which will be discussed below:

  • Done “on” versus done “by”.
  • Differential versus integral formulation.
  • Microscopic versus coarse-grained on some length-scale λ.
  • Local versus overall.
We start by considering the case where the energy is a nice differentiable function of state, and is known as a function of two variables V and S alone. Then we can write
dE   =   -P dV + T dS              (48)
which is just a repeat of equation 8 and equation 11. This gives us the differential formulation of work, as follows:
The first term on the RHS, namely -P dV, is commonly called the work done on the system. Positive work done on the system increases the energy of the system.   The negative thereof, namely P dV, is the work done by the system. Positive work done by the system decreases the energy of the system.

As an elaboration, consider the common case where V itself is known as a differentiable function of some other variables (say) A, B, and C.
  Example #1:   Suppose the system is the parallelepiped spanned by the vectors A, B, and C. Then the volume is V = ABC.
  Example #2:   Suppose the system is a spring as shown in figure 19. It has one end attached to point A and the other end attached to point B, where both A and B are points on a long one-dimensional track. Then V is just the length of the spring, V = B - A.


We can differentiate V to obtain
dV   =   (∂V/∂A)|B,C dA + (∂V/∂B)|C,A dB + (∂V/∂C)|A,B dC              (49)
and plug that into equation 48 to obtain
dE   =   -P (∂V/∂A)|B,C dA - P (∂V/∂B)|C,A dB - P (∂V/∂C)|A,B dC + T dS              (50)
We can write this more compactly as:
dE   =   -FA|B,C dA -FB|C,A dB -FC|A,B dC + T dS
             (51)
where we have defined the notion of force in a given direction according to:
FA|B,C   :=   P (∂V/∂A)|B,C   =   -(∂E/∂A)|B,C,S              (52)
and similarly for the other directions.

It is conventional but very risky to write FA (meaning force “in the A direction”) as shorthand for FA|B,C. This is risky because the notion of “the A direction” is not well defined. It is OK to speak of the direction of constant B and C, but not the direction of changing A. Specifically, in example #2, when we evaluate ∂E / ∂A, we get very different results depending on whether we evaluate it at constant B or at constant V.

There is no reliable, general way to disambiguate this by assuming that B and C are the directions “perpendicular” to A. As an aside, note that in the two examples above, if A and B are interpreted as position-vectors in real space, they are definitely not perpendicular. More to the point, when A and B are interpreted as part of the abstract thermodynamic state-space, we cannot even define a notion of perpendicular.

In the present context, FA is unambiguous because FA|B,C is by far the strongest candidate for what it might mean. But in another context, the symbol FA might be highly ambiguous.
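
To make the ambiguity concrete, here is a minimal numerical sketch of example #2. The spring constant, natural length, and coordinates are made-up values chosen purely for illustration; the point is only that ∂E / ∂A at constant B differs from ∂E / ∂A at constant V.

    # Sketch of example #2 (spring on a 1-D track with ends at A and B).
    # The numbers k, L0, A, B are arbitrary illustrative choices.
    k, L0 = 2.0, 1.0                       # stiffness and natural length (made up)

    def E_of_V(V):
        """Spring energy as a function of its length V = B - A (entropy held fixed)."""
        return 0.5 * k * (V - L0)**2

    def dE_dA_const_B(A, B, h=1e-6):
        """Partial of E with respect to A, holding B fixed, so the length V changes."""
        return (E_of_V(B - (A + h)) - E_of_V(B - (A - h))) / (2 * h)

    def dE_dA_const_V():
        """Partial of E with respect to A, holding V fixed: B slides along with A,
        the spring length never changes, so E does not change at all."""
        return 0.0

    A, B = 0.0, 1.5                        # spring stretched to length V = 1.5
    print(dE_dA_const_B(A, B))             # approx -k*(V - L0) = -1.0
    print(dE_dA_const_V())                 # exactly 0.0

The two answers differ, which is exactly why the shorthand FA is risky.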

*   Integral versus Differential

We can convert to the integral formulation of work by integrating the differential representation along some path Γ. The work done by the system is:
workby[Γ]   =   ∫Γ PdV              (53)

Consider the contrast:

The differential formulation of work (PdV) is a vector, specifically a one-form. A one-form can be considered as a mapping from pointy vectors to scalars.   The integral formulation of work (workby[]) is a functional. It is a mapping from paths to scalars.

In particular, if Γ is a path from point X to point Y, you should not imagine that the work is a function of X and/or Y; rather it is a functional of the entire path. If PdV were an exact one-form, you could express the work as a function of the endpoints alone, but it isn’t, so you can’t.
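
As a sanity check on this point, here is a small numerical sketch, assuming an ideal gas with P = nRT/V and made-up endpoint values. Two different paths Γ joining the same endpoints give different values of workby[Γ].

    # Sketch: work is a functional of the path, not a function of the endpoints.
    # Ideal gas assumed, P = nR*T/V; all numbers are arbitrary illustrations.
    import numpy as np

    nR = 8.314                             # one mole, in J/K
    V1, V2, T1 = 1.0e-3, 2.0e-3, 300.0     # endpoints X = (V1, T1) and Y = (V2, T1)

    # Path 1: isothermal expansion at T1 (trapezoid rule by hand).
    V = np.linspace(V1, V2, 100001)
    P = nR * T1 / V
    work_path1 = np.sum(0.5 * (P[1:] + P[:-1]) * np.diff(V))

    # Path 2: heat at constant volume V1 up to T2, expand at constant pressure to V2,
    # then cool at constant volume V2 back down to T1.  Only the middle leg moves
    # the boundary, so only it contributes to the integral of P dV.
    T2 = 600.0
    work_path2 = (nR * T2 / V1) * (V2 - V1)

    print(work_path1)                      # about nR*T1*ln(2), roughly 1.7e3 J
    print(work_path2)                      # about 5.0e3 J: same endpoints, different work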

*   Coarse Graining

For each length scale λ, we get a different notion of work; these include microscopic work, mesoscopic work, and holoscopic work (aka macroscopic work, aka pseudowork). These are all similar in spirit, but the differences are hugely important. To illustrate this point, consider a flywheel in a box:
  • The holoscopic KE is zero, because the CM of the box is not moving.
  • If we look inside the box, we see that the flywheel has mesoscopic KE, because it is spinning.
  • If we look more closely, we find additional KE in the thermally-excited phonon modes, because the flywheel has nonzero temperature.
  • If we look yet more closely, we find yet more KE, including the KE of electrons whizzing around inside atoms.
More generally, there are innumerable gray areas, depending on the length scale λ.

In thermodynamics, it is usually -- but not necessarily -- appropriate to assume that “work” refers to either mesoscopic or holoscopic work.
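
Here is a minimal numerical sketch of the flywheel-in-a-box idea. The masses, spin rate, and thermal jitter are all made-up values; the point is only that the kinetic energy you report depends on the length scale at which you look.

    # Sketch: KE depends on the coarse-graining scale.  Toy flywheel: N point masses
    # on a spinning ring, each with a small random "thermal" velocity added.
    # All parameter values are invented for illustration.
    import numpy as np
    rng = np.random.default_rng(0)

    N, m = 1000, 1.0e-3                 # number of particles and mass of each (kg)
    omega, R = 50.0, 0.1                # spin rate (rad/s) and ring radius (m)

    theta = np.linspace(0.0, 2*np.pi, N, endpoint=False)
    v_spin = omega * R * np.column_stack([-np.sin(theta), np.cos(theta)])
    v_thermal = 0.05 * rng.standard_normal((N, 2))
    v = v_spin + v_thermal              # lab-frame velocity of each particle

    M = N * m
    v_cm = v.mean(axis=0)               # center-of-mass velocity (essentially zero)
    KE_total = 0.5 * m * (v**2).sum()             # spin plus thermal parts
    KE_holoscopic = 0.5 * M * (v_cm**2).sum()     # CM motion only
    KE_relative = KE_total - KE_holoscopic        # mesoscopic spin + microscopic jitter

    print(KE_holoscopic)                # nearly zero: the box as a whole is not moving
    print(KE_relative, KE_total)        # dominated by the spin, around 12.5 J here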

*   Local versus Overall

Sometimes it is useful to consider the force and displacement acting locally on part of the boundary, and sometimes it is useful to consider the overall force and overall displacement.

To say the same thing in mathematical terms, let’s multiply both sides of equation 49 by P to obtain:

P dV = FA|B,C dA + FB|C,A dB + FC|A,B dC              (54)

 

In some contexts, it would make sense to speak of just one of the terms on the RHS as “the” work.

17.2  Energy Flow versus Work

Let’s consider systems that have some internal structure.

Our first example is shown in figure 20, namely a spring with a massive bob at one end. The other end is anchored. The mass of the spring itself is negligible compared to the mass of the bob. Dissipation is negligible. I am pushing on the bob, making it move at a steady speed v ≡ dA/dt. This requires adjusting the applied force F so that it always just balances the force of the spring.


When we ask how much “work” is involved, we have a bit of a dilemma.
It certainly feels to me like I am doing work on the spring+bob system. Energy is flowing across the boundary from me into the bob.   The overall work on the spring+bob system is zero. The force of my push on one end is exactly balanced by the force of constraint on the other end. Zero total force implies zero macroscopic work (aka pseudowork). Having zero macroscopic work is consistent with the work/KE theorem, since the KE of the system is not changing.

This dilemma does not go away if we break the system into sub-systems. The applied force on the bob is just balanced by the force of the spring, so there is no net force (hence no overall work) on the bob considered as a subsystem. The same goes for each small subsection of the spring: No net force, no acceleration, no work, and no change in KE.

The “local work” at the moving end is F ⋅ dx.

The “local work” at the fixed end is zero, since it is F ⋅ 0.

It is OK to think of energy pouring into the spring as a whole at the rate dE/dt = F ⋅ v. It is OK to think of energy as being like an abstract fluid flowing across the boundary.

It seems highly problematic to treat work as if it were a fluid flowing across the boundary. In particular, a naive attempt to apply the work/KE theorem is a disaster, because the energy inside the spring is virtually all potential energy; the KE inside the spring is negligible. The alleged work-fluid is flowing into the spring from the bob, and not flowing out anywhere, yet no work or KE is accumulating inside the spring.
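
Here is a minimal sketch of the spring-plus-bob bookkeeping, using made-up values for the spring constant, the speed, and the time step. Energy crosses the boundary at the rate F ⋅ v and piles up as potential energy inside the spring, while the net force, the pseudowork, and the change in KE all remain zero.

    # Sketch of the spring+bob energy bookkeeping.  All numbers are invented.
    k = 100.0                    # spring constant (N/m)
    v = 0.02                     # steady speed of the moving end (m/s)
    dt = 1.0e-3                  # time step (s)
    steps = 5000

    x = 0.0                      # extension of the spring
    E_in = 0.0                   # energy that has crossed the boundary so far
    for _ in range(steps):
        F_applied = k * x        # applied force just balances the spring at each instant
        E_in += F_applied * v * dt          # local work at the moving end: F dot dx
        x += v * dt

    PE_spring = 0.5 * k * x**2
    print(E_in, PE_spring)       # agree (up to discretization): the inflow is stored as PE
    F_net = 0.0                  # the push is balanced by the anchor's constraint force
    print(F_net * x)             # pseudowork is zero, consistent with unchanged KE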

As a second example, consider the oil bearing in section 10.6. Again we have a boundary/interior issue. Again we have a dilemma, due to conflicting definitions of work:

I am doing work in the sense of force (at a given point) times distance (moved by that point). I am doing work in the sense of pouring net energy across the boundary of the system.   There is no overall force, no overall work, no acceleration, and no change in KE.

Part of the lesson here is that you need to think carefully about the conditions for validity of the work/KE theorem. A non-exhaustive list is:
  • It suffices to have a rigid body, i.e. no motion of one part relative to another, i.e. no internal forces except forces of constraint. This implies no change in the internal potential energy.
  • It suffices to have a dismembered body, i.e. no internal forces between the parts, i.e. parts free to move independently of each other. Again this implies no change in the internal potential energy.
  • It suffices to carry out a full accounting for the internal forces, not just the external forces. This implies accounting for the changing internal potential energy.
There are some interesting parallels between the oil bearing and the spring:
  • In both cases, momentum flows into the system on one side and simultaneously flows out the other side, so there is no net accumulation of momentum within the system.
  • Meanwhile, energy flows into the system on one side and does not flow out the other side, so that energy accumulates within the system.
  • In one case the accumulated energy takes the form of thermal energy, and in the other case it takes the form of potential energy.
If you want a third parallel system, consider a force applied to a free body, such as the bob in figure 20 without the spring and without the anchor. Energy and momentum flow into the system and accumulate. The accumulated energy takes the form of macroscopic kinetic energy.

From this we see that the work/KE theorem is intimately connected to the accumulation of momentum within the system, not the accumulation of energy per se.

A related thought is that momentum is conserved and energy is conserved, while work (by itself) is not conserved. KE (by itself) is not conserved.

17.3  Remarks

Keep in mind that “work” is ambiguous. If you decide to speak in terms of work, you need to spell out exactly what you mean.

Also keep in mind that dissipative processes commonly convert mesoscopic KE into microscopic KE as well as non-kinetic forms of energy. Energy is conserved; mesoscopic KE is not (by itself) conserved.

17.4  Hidden Energy

You can’t hide momentum; if an object has momentum its center-of-mass will be moving, and this will be easy to notice. In contrast, you can easily hide energy in an object’s internal degrees of freedom, perhaps in the form of spinning flywheels, taut springs, thermal energy, or other things having nothing to do with center-of-mass motion.

Here is an example of hidden energy: Consider a cart with two flywheels on board. Initially everything is at rest. Apply a pair of forces (equal and opposite) to the front flywheel, causing it to spin up, clockwise. Apply a similar pair of forces to the back flywheel, causing it to spin up, counterclockwise. The net force on the cart is zero. The motion of the cart’s center of mass is zero. The net force dot the overall motion is zero squared. The cart’s overall angular momentum is also zero. Yet the cart has gained kinetic energy: internal, nonthermal, mesoscopic kinetic energy.

Examples like this are a dime a dozen. In some sense what we are seeing here is the difference between holoscopic and mesoscopic kinetic energy. If you don’t recognize the difference, and recklessly talk about “the” kinetic energy, you’re going to have trouble.
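
Here is a minimal numerical sketch of the two-flywheel cart, using made-up values for the moments of inertia, the torque, and the duration. The net force and the pseudowork are zero, yet internal kinetic energy accumulates.

    # Sketch of the two-flywheel cart: equal-and-opposite force pairs do zero net
    # work on the CM, yet the cart gains internal (mesoscopic) kinetic energy.
    # Moments of inertia, torque, and time are invented for illustration.
    I = 0.2            # moment of inertia of each flywheel (kg m^2)
    tau = 0.5          # torque applied to each flywheel (N m), opposite senses
    t = 10.0           # duration (s)

    omega = tau * t / I                 # each flywheel spins up to this rate (rad/s)
    KE_mesoscopic = 2 * (0.5 * I * omega**2)
    F_net, x_cm = 0.0, 0.0              # no net force, and the CM never moves
    pseudowork = F_net * x_cm

    print(KE_mesoscopic)                # 125 J of hidden, internal kinetic energy
    print(pseudowork)                   # zero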

17.5  Pseudowork

Sometimes in thermodynamics it is appropriate to focus attention on the large-λ limit of equation 53. In that case we have:
d(P² / 2M)   =   Ftot ⋅ dxcm              (55)
where P = ∑pi is the total momentum of the system, M := ∑ mi is the total mass, Ftot := ∑Fi is the total force applied to the system, and xcm is the position of the center of mass. See reference 7 for a derivation and discussion.

The RHS of equation 55 is called the pseudowork. The LHS represents the change in something we can call the pseudokinetic energy. This is just a synonym for the holoscopic kinetic energy.

There is an easy-to-prove theorem that says that for any length scale λ, an object’s total KE[λ] measured in the lab frame is equal to the KE[λ] of the relative motion of the components of the object (i.e. the KE[λ] measured in a frame comoving with the CM of the object) ... plus the holoscopic KE associated with the motion of the CM relative to the lab frame (as given by equation 55).
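
Here is a minimal numerical check of that decomposition, using an arbitrary made-up collection of point masses and velocities: the lab-frame KE equals the KE of the relative motion (measured in the CM frame) plus P²/2M.

    # Sketch: KE_lab = KE_relative_to_CM + P^2/(2M), checked on made-up data.
    import numpy as np
    rng = np.random.default_rng(1)

    m = rng.uniform(0.5, 2.0, size=20)          # masses (kg), arbitrary
    v = rng.normal(3.0, 1.0, size=(20, 3))      # lab-frame velocities (m/s), arbitrary

    M = m.sum()
    P = (m[:, None] * v).sum(axis=0)            # total momentum
    v_cm = P / M

    KE_lab = 0.5 * (m * (v**2).sum(axis=1)).sum()
    KE_rel = 0.5 * (m * ((v - v_cm)**2).sum(axis=1)).sum()    # KE in the CM frame
    KE_holo = P @ P / (2 * M)                                  # holoscopic / pseudo-KE

    print(KE_lab)
    print(KE_rel + KE_holo)                     # matches KE_lab to rounding error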

Mesoscopic work and holoscopic work (aka pseudowork) are consistent with the spirit of thermodynamics, because they don’t require knowing the microscopic forces and motions.

However, the pseudowork is not equal to the “thermodynamic” w that appears in the oft-abused equation 8. Here’s a counterexample: Suppose you apply a combination of forces to a system and its center of mass doesn’t move. Then there are at least three possibilities:

  • Maybe there is no energy transfer at all, e.g. static equilibrium;
  • Maybe there is a completely nonthermal transfer of energy, e.g. spinning up a flywheel; or
  • Maybe the energy is completely thermalized, as in boring a cannon with a completely dull tool (section 10.5).
According to the meaning of w usually associated with equation 8, w is zero in the first case, nonzero in the second case, and who-knows-what in the third case. It is a common mistake to confuse w with work or pseudowork. Don’t do it.

18  Other Ambiguous Terminology

In the literature, the term “state” is ambiguous. It can either mean microstate or macrostate.
In the context of quantum mechanics, state always means microstate.   In the context of classical thermodynamics, state always means macrostate, for instance in the expression “function of state”.

Both of these usages are well established, but they collide and conflict as soon as you start doing statistical mechanics, which sits at the interface between QM and thermo.
In this document, state is supposed to mean microstate, unless the context requires otherwise.   When we mean macrostate, we explicitly say macrostate or thermodynamic state. The idiomatic expression “function of state” necessarily refers to macrostate.

Similarly, "phase space" is ambiguous.
Phase-space means one thing in classical canonical mechanics; it corresponds to what we have been calling state-space, as discussed in section 11.2.   Phase space means something else in classical thermodynamics; it has to do with macroscopic phases such as the liquid phase and the solid phase.

(Ironically, Gibbs has his name associated with both of these notions.)

I’m not even talking about quantum mechanical phase φ, as in exp(i φ); that’s a third notion, which is not terribly troublesome because you can usually figure out the meaning based on context.

Given how messed-up our language is, it’s a miracle anybody ever communicates anything.

19  Thermodynamics, Restricted or Not

There are various ways of restricting the applicability of thermodynamics, including
  • microcanonical only (i.e. constant energy)
  • equilibrium only
  • reversible only
  • ideal gasses only
  • et cetera.
Indeed, there are some people who seem to think that thermodynamics applies only to microcanonical reversible processes in a fully-equilibrated ideal gas.

To make progress, we need to carefully distinguish two ideas:

  a) Simplifying assumptions made in the context of a particular scenario. Depending on details, these may be entirely appropriate. Sometimes the gasses involved are ideal, to an excellent approximation. Sometimes a process is reversible, to an excellent approximation. But not always.
  b) Restrictions applied to the foundations of thermodynamics. We must be very careful with this. There must not be too many restrictions ... nor too few. Some restrictions are necessary, while other restrictions are worse than useless.
Some thermodynamic concepts necessarily have limited validity.
  • As discussed in section 10.4, there are situations where it is impossible to define a temperature.
  • The Boltzmann distribution law (equation 30 and figure 6) is valid only in equilibrium.
  • The notion of equiprobable states (equation 33) applies only in microcanonical equilibrium.
In contrast, very importantly, the law of conservation of energy applies without restriction. Similarly, the law of paraconservation of entropy applies without restriction. You must not think of E and/or S as being undefined in regions where “non-ideal” processes are occurring. Otherwise, it would be possible for some energy and/or entropy to flow into the “non-ideal” region, become undefined, and never come out again, thereby undermining the entire notion of conservation.
The ideas in the previous paragraph should not be overstated, because an approximate conservation law is not necessarily useless. For example, ordinary chemistry is based on the assumption that each of the chemical elements is separately conserved. But we know that’s only approximately true; if we wait long enough uranium will decay into thorium. Still, on the timescale of ordinary chemical reactions, we can say that uranium is conserved, to an excellent approximation.
When a law has small exceptions, you shouldn’t give up on the law entirely. You shouldn’t think that just because a process is slightly non-ideal, it becomes a free-for-all, where all the important quantities are undefined and none of the laws apply.

If you want to make simplifying assumptions in the context of a specific scenario, go ahead ... but don’t confuse that with restrictions on the fundamental laws.

Also, in an elementary course, it might be necessary, for pedagogical reasons, to use simplified versions of the fundamental laws ... but you need to be careful with this, lest it create misconceptions.

  • As an example: an imperfect notion of entropy in terms of multiplicity (equation 33) is better than no notion of entropy at all. However, sooner or later (preferably sooner) you need to understand that entropy is really defined in terms of probability (equation 3), not multiplicity; see the sketch just after this list.
  • As another example: In an elementary course, it might be appropriate to start by applying thermo to ideal gasses. However, sooner or later (preferably sooner) it is very important to consider other systems; otherwise you risk horrific misconceptions, as discussed in section 8.3.3.
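
Here is a minimal numerical illustration of the first point above, using arbitrary made-up probability distributions: when the accessible states are equiprobable, the probability-based entropy agrees with the log of the multiplicity, and otherwise it does not.

    # Sketch: entropy from probabilities versus log-multiplicity.  The distributions
    # below are arbitrary illustrations; entropy is in nats.
    import numpy as np

    def entropy(p):
        p = np.asarray(p, dtype=float)
        p = p[p > 0]                     # zero-probability states contribute nothing
        return -(p * np.log(p)).sum()

    W = 4                                # multiplicity: four accessible states
    uniform = np.full(W, 1.0 / W)
    skewed = np.array([0.85, 0.05, 0.05, 0.05])

    print(entropy(uniform), np.log(W))   # equal: ln 4, about 1.386
    print(entropy(skewed), np.log(W))    # about 0.59 versus 1.386: the log-multiplicity
                                         # overstates the entropy of the skewed case
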
Finally, it must be emphasized that one should not ask whether thermodynamics “is” or “is not” applicable to a particular situation, as if it were an all-or-nothing proposition. Some concepts (such as energy and entropy) are always valid, while other concepts (such as equilibrium and temperature) might or might not be valid, depending on the situation.

20  The Relevance of Entropy

The concept of entropy is important in the following areas, among others:
  1. cryptography and cryptanalysis (secret codes)
  2. communications (error-correcting codes, as part of electronic engineering)
  3. computer science, including data-compression codes, machine learning, speech recognition, etc.
  4. librarianship
  5. the physics of computation
  6. the design of refrigerators, heat pumps, and engines (including piston, turbine, and rocket engines)
  7. nuclear engineering (reactors and weapons)
  8. fluid dynamics
  9. astrophysics and cosmology
  10. chemistry and chemical engineering
Very roughly speaking, the items higher on the list can be assigned to the “information theory” camp, while the items lower on the list can be assigned to the “thermodynamics” camp. However:
  • The physics of computation is squarely in both camps; see reference 13, reference 14, and reference 15.
  • Things like Maxwell demons and Szilard engines are squarely in both camps; see reference 16.
  • Demag refrigerators (as described in section 10.11) are in both camps, because you can quantify the entropy either by microscopic state-counting or by macroscopic thermal measurements.
  • Statistical mechanics -- which provides the theoretical basis for a modern understanding of thermodynamics -- is squarely in both camps; see reference 3.
So: at the end of the day, we discover that the two camps were essentially indistinguishable all along. Entropy is entropy. It is the same entropy, no matter whether you measure it in bits or in joules per kelvin (section 8.5).

As mentioned in section 2, you can’t do thermodynamics without entropy.

Also: entropy is one of the great elegant ideas of all time. C.P. Snow compared not knowing about the second law to never having read a work by Shakespeare.

21  Summary

  • Thermodynamics inherits many results from nonthermal mechanics. Energy, momentum, and electrical charge are always well defined. Each obeys a strict local conservation law.
  • Entropy is defined in terms of probability. It is always well defined. It obeys a strict local paraconservation law. Entropy is what sets thermodynamics apart from nonthermal mechanics.
  • Entropy is not defined in terms of energy, nor vice versa. Energy and entropy are well defined even in situations where the temperature is zero, unknown, or undefinable.
  • Entropy is not defined in terms of position. It involves probability spread out in state-space, not necessarily particles spread out in position-space.
  • Entropy is not defined in terms of multiplicity. It is equal to the log of the multiplicity in the special case where all accessible states are equiprobable ... but not in the general case.
  • Work suffers from two inconsistent definitions. Heat suffers from at least three inconsistent definitions. Adiabatic suffers from two inconsistent definitions. At the very least, we need to coin new words or phrases, so we can talk about the underlying reality with some semblance of clarity. (This is loosely analogous to the way phlogiston was replaced by two more-modern, more-precise concepts, namely energy and oxygen.)
  • Heat and work are at best merely means for keeping track of certain contributions to the energy budget and entropy budget. In some situations, your best strategy is to forget about heat and work and account for energy and entropy directly.
  • When properly stated, the first law of thermodynamics expresses conservation of energy ... nothing more, nothing less. There are several equally-correct ways to state this. There are also innumerably many ways of misstating it, some of which are appallingly widespread.
  • When properly stated, the second law of thermodynamics expresses paraconservation of entropy ... nothing more, nothing less. There are several equally-correct ways to state this. There are also innumerably many ways of misstating it, some of which are appallingly widespread.
  • Not all thermal energy is kinetic. Not all kinetic energy is thermal.
  • Some systems (not all) are in internal equilibrium. They are described by a thermal distribution. They have a temperature.
  • Even more importantly, some systems (not all) are in internal equilibrium with exceptions. They are described by a thermal distribution with exceptions. They have a temperature.
  • Two systems that are each in internal equilibrium may or may not be in equilibrium with each other. Any attempted theory of thermodynamics based on the assumption that everything is in equilibrium would be trivial and worthless.
  • The notion of thermal energy transfer is sometimes OK, provided that is the only thing going on (e.g. ordinary non-moving heat exchanger). The notion of nonthermal energy transfer is sometimes OK, provided that is the only thing going on (e.g. thermally-insulated pushrod). However, splitting thermal from nonthermal transfers isn’t always feasible, especially in dissipative systems.
  • Splitting the energy itself into a thermal part and a nonthermal part (rather than splitting the transfers) is problematic also, but for different reasons.
  • There is a simple relationship between force and momentum, for any system, macroscopic or microscopic.
  • For pointlike systems (no internal degrees of freedom), there is a simple relationship between overall force and total kinetic energy ... but for more complex systems, the relationship is much more complicated. There are multiple inequivalent work-like quantities, depending on what length scale λ you look at.

22  References

Thanks to Carl Mungan for many helpful discussions.

1.
Feynman, Leighton, and Sands, The Feynman Lectures on Physics. Volume I Chapter 4 deals with fundamental notions of conservation of energy, including the celebrated parable of Dennis and the blocks. Also, figure 44-8 on page 44-8 of Volume I illustrates heat engines in series.
2.
Feynman, The Character of Physical Law.
3.
Feynman, Statistical Mechanics.
4.
John Denker, “Thermodynamics and Differential Forms” ./thermo-forms.htm
5.
John Denker, “Visualizing Non-Conservative Fields” ./non-conservative.htm
6.
John Denker, “Entropy -- Increased by Stirring, Decreased by Observation” ./entropy-sim.htm
7.
John Denker, “Kinetic Energy, Work, Momentum, Force times Time, and Force dot Distance” ./kinetic-energy.htm
8.
John Denker, “Conservative Flow and the Continuity of World-Lines” ./conservative-flow.htm
9.
John Denker, “Reality versus Reductionism” ./reality-reductionism.htm
10.
Benjamin Thompson, “Heat is a Form of Motion: An Experiment in Boring Cannon” Philosophical Transactions 88 (1798) http://dbhs.wvusd.k12.ca.us/webdocs/Chem-History/Rumford-1798.html
11.
Kittel and Kroemer, Thermal Physics (1980). This is far and away the most sensible thermo book I’ve seen.
12.
E. T. Jaynes, “The Gibbs Paradox”, in Smith, Erickson & Neudorfer (eds), Maximum Entropy and Bayesian Methods (1992).
13.
Rolf Landauer, "Irreversibility and Heat Generation in the Computing Process" IBM J. Res. Dev. 5, 183 (1961). http://www.research.ibm.com/journal/rd/441/landauerii.pdf
14.
Charles H. Bennett "The thermodynamics of computation -- a review" International Journal of Theoretical Physics, 21 (12), 905--940 (December 1982).
15.
Seth Lloyd "Use of mutual information to decrease entropy: Implications for the second law of thermodynamics" Phys. Rev. A 39, 5378--5386 (1989).
16.
Wojciech Hubert Zurek, "Maxwell’s Demon, Szilard’s Engine and Quantum Measurements" http://arxiv.org/pdf/quant-ph/0301076
17.
John Denker, “Negative Temperatures” ./neg-temp.htm
18.
John Denker, “How a Battery Works” ./battery.htm
19.
John Denker, “The Twelve-Coins Puzzle” ./twelve-coins.htm
20.
John Denker, “Spontaneous Transformations” ./spontaneous.htm
21.
John Denker, “Energy Flow -- Principle of Virtual Work” ./isothermal-pressure.htm
22.
Carl E. Mungan, “Irreversible adiabatic compression of an ideal gas” http://usna.edu/Users/physics/mungan/Publications/TPT4.pdf
23.
John Denker, “Partial Derivatives -- Pictorial Representation” ./partial-derivative.htm
24.
Jonathan Swift, Gulliver’s Travels http://www.gutenberg.org/dirs/etext97/gltrv10h.htm
25.
B. A. Sherwood and W. H. Bernard, “Work and heat transfer in the presence of sliding friction” Am. J. Phys. 52 (11) (1984).

1
Even in cases where measuring the energy flow is not feasible in practice, we assume it is possible in principle.
2
In some special cases, such as Wheeler/Feynman absorber theory, it is possible to make sense of non-local laws, provided we have a non-local conservation law plus a lot of additional information. Such theories are unconventional and very advanced, far beyond the scope of this document.
3
An example of this is discussed in section 10.4.
4
Some thermo books recombine ideas in such a way that this assertion becomes part of the first law. That comes to the same thing; I don’t much care what you call things. Most books just gloss over this point entirely, which is unfortunate.
5
Subject to the approximation that nuclear reactions can be neglected.
6
In the expression “function of state” or in the equivalent expression “state function”, state always means macrostate. You might think that anything that is a function at all is a function of the microstate, but that’s not true. In particular, entropy is defined as a sum over microstates.
7
This corresponds to saying that θ is the argument of the cosine in the expression cos(θ).
8
Ideal gasses are a special case, where all the energy -- thermal or otherwise -- is kinetic energy. Ideal gasses are sometimes useful as an illustration of thermodynamic ideas, but alas some textbooks overuse this example so heavily as to create the misimpression that thermodynamics deals only with kinetic energy.
9
Anharmonicity can cause the average KE to be not exactly equal to the average PE, but for a crystal well below its melting point, the thermal phonon modes are not significantly anharmonic.
10
Here “states” means “microstates”.
11
If the flow pattern were turbulent, calculating the entropy would entail practical as well as conceptual difficulties.
12
Quoted in reference 12.


Copyright © 2005 jsd