Locating Peaks and Valleys with Derivatives (Explaining Calculus #9)

We have gone through a couple of discussions now on how to actually calculate the value of a derivative. This is great – after all, what good is a shiny new tool if you don’t know how to operate it? But we are done with that now. Having finished that tutorial, we can dive into the depths of what derivatives are capable of telling us. Since the initial motivation behind derivatives is all about change, you might think I’m about to write about calculations of how things change over time. That is partly right, but more accurately I’m choosing to show how we can go beyond “just” talking about changes over time by using the idea of change along the way.

What I plan to discuss here is a broad idea called optimization, and we will explore how derivatives play a key role in this theory.

What are “Peaks” and “Valleys”?

The branch of math called optimization is about finding maximum and minimum values. For example, if you are a business owner and have found a function that predicts your profits, you would want to maximize that function. Or perhaps you want to build a fence around some grass for your animals, but don’t want to use any more fencing than you have to. You want to minimize the amount of fencing material you use. These ideas of maximizing and minimizing – making a value as large or as small as possible – are what we want to talk about.

But why the terms peak and valley? Well, the maximum height you find on a mountain is the peak of the mountain. The lowest point in a landscape is a valley. So, if we translate the very numerical ideas of maximizing and minimizing into visual language, we are learning how to find the peaks and valleys in our mathematical landscape. And we can make use of numerical values of derivatives to help us.

Connecting Peaks/Valleys with Derivatives

Imagine you draw a graph in a U shape. In fact, you can just look at the letter U and pretend it is a graph. Let’s think about finding the minimum point on U – that is, the lowest point on the letter. Now, imagine drawing a tangent line at that lowest point. It should look like you just underlined the letter U. In other words, it should be a horizontal line! Now, remember what derivatives are – slopes of tangent lines. What is the slope of a horizontal line? Well, that would be zero. So, we’ve arrived at a rather interesting observation through the letter U: the minimum value of U happened at a place where the tangent line had a slope of zero.

Now, let us use the letter V instead of U. What about the minimum point on V? Well, again, we can ask questions about the tangent line. But this time, we run into a hitch. The lowest point on the letter V is a sharp corner, and sharp corners don’t have tangent lines. So there isn’t any way we could even try to define a derivative at that point. This is our second interesting observation: the minimum value of V happened at a place with no tangent line.
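If you like to tinker, here is a tiny numerical sketch of these two observations (this is my own illustrative Python, not part of the mathematics above; the functions and step size are arbitrary choices). It estimates tangent slopes for a U-shaped graph, f(x) = x^2, and a V-shaped graph, f(x) = |x|, near their lowest points.

```python
# Illustrative sketch: estimate tangent slopes near the bottom of a "U" and a "V".

def slope(f, x, h=1e-6):
    """Approximate the slope of f at x with a symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)

u = lambda x: x**2     # U-shaped graph, lowest point at x = 0
v = lambda x: abs(x)   # V-shaped graph, sharp corner at x = 0

print(slope(u, 0.0))   # prints 0.0 -- the tangent at the bottom of the U is horizontal

# For the V, compare one-sided slopes just to the right and left of the corner:
print((v(1e-6) - v(0.0)) / 1e-6)      # about +1
print((v(0.0) - v(-1e-6)) / 1e-6)     # about -1
# The two sides disagree, so there is no single tangent line (no derivative) at 0.
```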

Now, here is a big, big point. Our observations about the letters U and V are actually comprehensive statements about the max and min values of all functions! To make this statement more precise, I need to write down a couple of definitions in more “mathy” language.

We begin by defining minimum and maximum more precisely. In fact, we will give two subtly different definitions – the two types we call absolute and relative. In intuitive terms, an absolute max is at least as big as every other value of the function, and an absolute min is at least as small. On the other hand, a relative max/min is one that “looks like” an absolute max/min if you “zoom in” far enough. To see what I mean here, imagine a mountain range with a variety of peaks. Every single peak is a relative maximum value, because if you took binoculars and narrowed your focus on just that peak, it would look to you like the highest point on the mountain. On the other hand, the absolute maximum of the mountain range is whichever peak is actually higher than all the other peaks. Notice that you can have many, many relative maximum values, but only one absolute maximum value (well, there could be ties, like two different mountain peaks that are equally tall, but there is still only one largest height, which is shared by the two peaks).

Now, let’s do some defining. We begin first with absolute max/min:

Definition: Let f(x) be a function. The function f has an absolute maximum at the point x if f(x) \geq f(y) for every point y. If x is an absolute maximum point, then f(x) is the absolute maximum value of f. Similarly, f has an absolute minimum at the point x if f(x) \leq f(y) for every point y, and in that case f(x) is the absolute minimum value of f.

Notice that the definition makes its claim about every possible point y. Also notice that “points” refer to inputs and “values” refer to outputs – perhaps the visual f(\text{point}) = \text{value} is of use to show what I mean here. On the other hand, consider the following definition of relative max/min values, and note the key differences.

Definition: Let f(x) be a function. Then f(x) has a relative maximum at the point x if, for every point y close enough to x, f(x) \geq f(y). Similarly, f(x) has a relative minimum at the point x if for all y close enough to x, f(x) \leq f(y).

Notice the presence of “close enough” in this definition. That is the key difference. As the language suggests, absolute is a stronger term than relative, as I can make explicit by showing that absolute maxima/minima are also relative maxima/minima.

Fact: If f(x) has an absolute maximum at x, then f(x) also has a relative maximum at x.

Proof: Since x is an absolute maximum of f, f(x) \geq f(y) for every value y for which f makes any sense. This, of course, means that f(x) \geq f(y) for all y close to x. So, f(x) has a relative maximum at x.

The same kind of fact works for absolute and relative minima, so a separate proof is not needed – you can fill in the blanks yourself. Before we can finalize the big claim I made about “U and V,” I first need to take a detour into a related topic.

Increase and Decrease

We now move into a section more directly related to the idea of “change over time” that originally led us to the idea of a derivative in the first place. The idea here is fairly straightforward. As time progresses, a quantity could go up, it could go down, or it could stay the same. This is actually the complete idea. In mathematics, we say increasing instead of going up, decreasing instead of going down, and constant for staying the same.

How can we frame these ideas in equation-language? It is fairly easy if we use an analogy. Imagine you are walking on a mountain, that x represents how long you’ve been walking, and that f(x) represents how high you are. Intuitively, the function f(x) (your height) is increasing if after each step you take, you are higher than you were before. The progression of time can be captured pretty easily – time y is after time x if y > x. So, in order for each step to leave us higher than we started, we need f(y) > f(x) to be true whenever y > x is also true.

But, remember, we are talking about calculus too. The idea of “zooming in” will almost always be important. So, while the way we described the idea of increasing is completely right, we want to be a bit more specific. We want to be able to say where f(x) is increasing and where it is not. To do this, we allow our where to be some interval of time, say [a,b]. Now, we can be more specific in our definition.

Definition: Let f(x) be a function defined on the interval [a,b]. Then f(x) is increasing on the interval [a,b] if for any values of x,y between a and b, if y > x then f(y) > f(x).

It is pretty easy using the same intuition to come up with definitions for decreasing on an interval and constant on an interval. If we are going down a mountain, then with each step (time going forward) our height decreases (function value going down). Just as before, time going forward tells us that we want to think about y > x, and height going down whenever time goes forward tells us that instead of f(y) > f(x), we need f(y) < f(x). So, we have an easy definition.

Definition: Let f(x) be a function defined on the interval [a,b]. Then f(x) is decreasing on the interval [a,b] if for any values of x,y between a and b, if y > x then f(y) < f(x).

Notice these two definitions are word-for-word the same except for the word ‘decreasing’ and the change from > to < at the very end. Similar considerations lead us to a definition for constant on an interval.

Definition: Let f(x) be a function defined on the interval [a,b]. Then f(x) is constant on the interval [a,b] if for any values of x,y between a and b, f(x) = f(y).

Notice the similarity of all three definitions, and notice where they are different. I always find it helpful to think of the function as a hike up and down a mountain.

Connecting Increasing/Decreasing with Derivatives

By using derivatives, we can zoom in even further. So far, we’ve only defined increase/decrease/constant in terms of intervals of numbers. But what about points? Those are even smaller than intervals. Can we somehow extend these ideas to minuscule points?

Actually, yes. We can. The easiest place to begin is the idea of being constant at a point. In the mountain analogy, this is the peak of the mountain. Everything levels out when you reach such a point. If you think in terms of tangent lines, the tangent line is totally flat there. That is just what it looks like to be constant, or flat, at a single point. Imagining a rounded-off top to a mountain or hill is a good visual aid here. So, in terms of derivatives, we have a definition for being constant at a point.

Definition: A function f(x) is constant at the point a if f^\prime(a) = 0.

We can move forward now and say that increasing on an interval should mean increasing at every point inside that interval. Makes sense. Every point on a red wall is also red. It’s analogous to that. More specifically, f(x) is increasing at x if x is inside some interval [a,b] (that is, a < x < b) on which the old definition tells us that f(x) is increasing. But now, look at the definition of the derivative,

f^\prime(x) = \lim\limits_{h \to 0} \dfrac{f(x+h) - f(x)}{h}.

Let’s take this new definition of increasing at a point and play with it. Let’s pick an interval [a,b] on which f(x) is increasing. Since the variable h is part of a limiting process, we can make it as small as we want. So, we can make it small enough that x+h is always inside of [a,b]. This means that the rules of increasing apply here. Since f(y) > f(x) if, and only if, y > x, f(x+h) > f(x) if, and only if, x+h > x. Simplified, f(x+h) > f(x) is equivalent to h > 0. In exactly the same way, f(y) < f(x) is equivalent to y < x, and so f(x+h) < f(x) is equivalent to h < 0.

This may seem a bit weird, but pause and think about what we’ve done. Notice that f(x+h) - f(x) must be positive whenever h > 0, and it must be negative whenever h < 0. In both situations, \dfrac{f(x+h) - f(x)}{h} must always be positive. These values always being positive forces their limit, which is just f^\prime(x), to not be negative. So, f^\prime(x) \geq 0. But… we have already reserved f^\prime(x) = 0 for being constant at a point, so we set that case aside. Keeping the rest, we get a new definition for increasing at a point.

Definition: The function f(x) is increasing at the point a if f^\prime(a) > 0.

You might be able to guess the definition of decreasing at a point now. If not, take some time to think about it before reading on.

Definition: The function f(x) is decreasing at the point a if f^\prime(a) < 0.
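To see these point-wise definitions in action, here is a small sketch (again my own illustrative Python; the test function and the numerical tolerance are arbitrary choices, and a real treatment would compute f^\prime(a) exactly rather than numerically).

```python
# Illustrative sketch: classify a point by the sign of an estimated derivative,
# following the three definitions above (constant / increasing / decreasing at a point).

def classify(f, a, h=1e-6, tol=1e-9):
    slope = (f(a + h) - f(a - h)) / (2 * h)   # numerical estimate of f'(a)
    if abs(slope) < tol:
        return "constant at this point (f'(a) = 0)"
    return "increasing at this point (f'(a) > 0)" if slope > 0 else "decreasing at this point (f'(a) < 0)"

f = lambda x: x**2
print(classify(f, 1.0))    # increasing, since f'(1) = 2 > 0
print(classify(f, -1.0))   # decreasing, since f'(-1) = -2 < 0
print(classify(f, 0.0))    # constant, since f'(0) = 0
```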

This discussion has now linked our ideas about increase and decrease to this new calculus language. You might not think we’ve made all that much progress, at least not yet, but I think we have. Notice how much shorter our new definitions are than the old ones! I would consider that a win. They are also easier to read, assuming of course you know what a derivative is. But that isn’t exactly the point. The point is that by cleverly using the concept of a derivative, we made the idea of increasing and decreasing easier to put into a legitimate definition. And that is always a sign of mathematical progress. A simpler way of saying the same thing is almost always better.

With all of the previous discussion in mind, we can now move on to the main point of the article, which puts all these ideas together.

Finding Peaks and Valleys with Math

We can move forward now with the big idea we mentioned before with the letters U and V. I made a big, big claim that the letters U and V essentially describe every possible type of maximum or minimum value you could ever find. I will now explain this.

Theorem: Let f(x) be a function whose input can be any real number and whose outputs are real numbers (I sometimes use the shorthand f : \mathbb{R} \to \mathbb{R} for this, where \mathbb{R} is a stand-in symbol that represents all the real numbers). Suppose that f(x) has a relative maximum value at the point x = a. Then either f^\prime(a) = 0 or f^\prime(a) does not exist.

Proof: Our claim is that there are only two possibilities. Another way to say this is that, if we are not in one of the situations, we must be in the other (since no third option exists). Viewing the problem in this manner is a common approach to either-or proofs in mathematics, and this is the approach I use here. I will assume that f^\prime(a) definitely exists. Our goal, then, is to discover that f^\prime(a) = 0 has to be true.

We already know that f(x) has a relative maximum value at x = a. Let’s remember briefly the definition of the derivative of f at a.

f^\prime(a) = \lim\limits_{h \to 0} \dfrac{f(a+h) - f(a)}{h}.

Now, since we are taking a limit, we can assume that h is actually small enough that a+h is close enough to a so that f(a+h) \leq f(a). Then for every small enough value of h, we know that f(a+h) - f(a) \leq 0. Now, the number h itself can be either small and positive or small and negative. If it is small and positive, then \dfrac{f(a+h) - f(a)}{h} \leq 0. If it is small and negative, then \dfrac{f(a+h) - f(a)}{h} \geq 0.

This is a rather curious fact. Remember that we assumed the derivative value f^\prime(a) definitely exists. Since the limit must exist, the approaches from negative and positive values of h cannot contradict each other – if they did, the derivative just wouldn’t exist. This means that both of the inequalities we just figured out have to be true about f^\prime(a). That is, we have figured out that

0 \leq f^\prime(a) \leq 0.

It should strike us pretty quickly that the only number between zero and zero is zero. Therefore, f^\prime(a) = 0. This is exactly what I set out to discover, and so we are done proving what we set out to prove.

How to Find Relative Max and Min Values

We can make use of this super-important theorem to actually locate these special peaks and valleys in graphs. And, by using the increasing-decreasing ideas, we can pick out which of the located special points are peaks, which are valleys, and which are masquerading.

Here is the idea. Our big theorem tells us that the special points we are looking for – relative maximum and minimum points – always have either a zero derivative or a non-existent derivative. So, if we have a function f(x), we can locate all of these points by finding all solutions to f^\prime(x) = 0 and all points where f^\prime(x) doesn’t make any sense but f(x) does. We can then list all of these out, since for the kinds of functions we usually meet there are only a handful of these special points.

Just from the last paragraph, we now have a short list of possible places where we might find maximum or minimum values. But how do we tell which are which? This is where the increasing/decreasing ideas come into play. Let’s start with minimum values – which have the shape of either U or V. Notice that to the left of the minimum point the curve slopes downward, and to the right of the minimum point it slopes upward. In our discussion earlier, we already pointed out that down-sloping means decreasing and up-sloping means increasing. We also pointed out that decreasing means f^\prime(x) < 0 and increasing means f^\prime(x) > 0. What does this mean then? Well, if a is one of our special points, and if f^\prime(x) < 0 to the left of a and f^\prime(x) > 0 to the right of a, then the graph of f(x) must be shaped either like a U or a V at the special point a. This means that f definitely has a minimum point at a! In exactly the same way, if the graph is increasing on the left side of a special point and decreasing on the right side, the graph must be an inverted U or an inverted V there, which tells us that we have found a maximum point!

But what if neither of those is true? What if the graph is increasing on both sides of the special point? Well, then, we’ve found a point masquerading as a max or min point, even though it isn’t either. It passed the first test by having a derivative that was 0 or didn’t exist, but it didn’t pass the second test. A good example here would be the letter D. The very right-most part of the letter D has a derivative that does not exist (to be clear about this part, since infinity isn’t a number, a line that goes straight up is said to not have a derivative). So, this point on D passes the first test. But if you zoom in near that point, it is never the highest or lowest point in that zoomed-in window. So it can’t be a relative maximum or minimum point, and it fails the second test. If you graph the function y = x^3 on a calculator or on the internet, you’ll see that the point x = 0 has a zero slope (passes the first test) but isn’t a max or min (and so fails the second test). Again, this is because the graph is going up on both sides of the key point (that is, f^\prime(x) > 0 on both sides).
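For readers comfortable with a little code, here is a sketch of this two-test procedure using the sympy library (this is my own illustration; the post itself does everything by hand, and the example function x^3 - 3x + 1 is an arbitrary choice).

```python
# Illustrative sketch of the two tests: (1) find points where f'(x) = 0,
# (2) check whether f' switches sign there (valley: - to +, peak: + to -).
import sympy as sp

x = sp.symbols('x')
f = x**3 - 3*x + 1            # example function (my own choice)
fp = sp.diff(f, x)            # f'(x) = 3x^2 - 3

critical_points = sp.solve(sp.Eq(fp, 0), x)   # first test: gives [-1, 1]

for a in sorted(critical_points):
    left = float(fp.subs(x, a - 0.5))         # sign of f' just left of a
    right = float(fp.subs(x, a + 0.5))        # sign of f' just right of a
    if left < 0 and right > 0:
        print(f"x = {a}: relative minimum (decreasing, then increasing)")
    elif left > 0 and right < 0:
        print(f"x = {a}: relative maximum (increasing, then decreasing)")
    else:
        print(f"x = {a}: masquerading point (no sign change)")
```

(One caveat for this kind of sketch: the offsets of 0.5 only work because no other special point sits that close by; in general you would check the sign of f^\prime between consecutive special points.)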

Conclusion

This is one of the most immediately useful things about calculus. It enables us to maximize and minimize basically anything we want. The real-world implications here are pretty obvious. But to really lay out how you actually pull this sort of thing off, the next post will go through some examples of how to maximize and minimize things when someone hands you an equation.

Homework Problems

Problem 1: The big theorem I proved assumes that the function f(x) can take on any input we want. Think about what would happen if we constrain the acceptable input values. In particular, notice that a graph can’t have a tangent line at an endpoint (because all endpoints are ‘sharp’ and derivatives are designed to find locations that are not ‘sharp’). Can a graph have its absolute maximum value at an endpoint?

Problem 2: In the appendix, I go through the method of how to find the location of the peak or valley in a parabola. Convince yourself of a method for figuring out whether the point you find is going to be a maximum or minimum. Then, convince yourself that this is not just a relative maximum or minimum, but is an absolute maximum or minimum.

Problem 3: Find all the relative maximum and minimum points of the graph of f(x) = x^3 - x + 2 without using a graph. Then, check your answer by using a graph.

Problem 4: I claimed towards the end of the article that y = x^3 has a point that pretends to be a max/min point but actually isn’t. Use the second test (looking at increasing/decreasing behaviors of \dfrac{dy}{dx}) to understand why y = x^3 doesn’t have any max or min points. What is the special property of the graph that prevents it from having max and min points? Can you come up with other graphs that have this same special property?

Challenge Problem: For those looking for a challenge, find similarities between the two tests for max/min points from this post and the Intermediate Value Theorem that was used in Explaining Calculus #4 to prove that equations have solutions.

Appendix

What I want to do here is show how you can use this idea to find the x-value of the peak/valley point on any parabola.

The equation of a parabola is always given by a quadratic equation f(x) = ax^2 + bx + c for some constant numbers a,b,c. We’ve already mentioned that polynomials always have derivatives, and so in the “big theorem” the case where f^\prime(x) does not exist cannot rear its head here. So, to find the peak/valley of the graph, the only thing we need to do is solve the equation f^\prime(x) = 0.

To do this, let’s first find the derivative of f(x). Using the standard method for polynomials we discovered earlier, we can quickly discover that

f^\prime(x) = a \dfrac{d}{dx}[x^2] + b \dfrac{d}{dx}[x] + \dfrac{d}{dx}[c] = a * (2x) + b*(1) + 0 = 2ax + b.

Then to solve the equation f^\prime(x) = 0, we solve 2ax + b = 0. Subtracting b from both sides, 2ax = -b. Dividing both sides by 2a, we get x = \dfrac{-b}{2a}. Therefore, if we want to find the peak/valley of some parabola, we can just check that value of x.
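If you want to sanity-check the formula, here is a small numerical sketch (my own illustration; the coefficients a, b, c below are arbitrary choices).

```python
# Sanity check of the vertex formula x = -b/(2a) for one example parabola.

a, b, c = 2.0, -8.0, 3.0
f = lambda x: a*x**2 + b*x + c

vertex_x = -b / (2*a)
print(vertex_x)                                  # 2.0

# Since a > 0 the parabola opens upward, so this x-value should be a valley:
print(f(vertex_x), f(vertex_x - 0.1), f(vertex_x + 0.1))   # f(2.0) is the smallest of the three
```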

Critical Thinking Toolkit: Inference to the Best Explanation

I’ve talked about deductive reasoning and inductive reasoning before. Both of these are important mechanisms we use for arriving at conclusions. The first makes use of the rules of logic – which we can essentially view as limiting ourselves to the definitions of words like true and false and not allowing ourselves any other resources. The second we can view as showing that something is probably true by accumulating lots of examples of that truth occurring in various different situations. Both of these are important and useful, but there is yet another way of thinking that is also essential – which is often called abduction or abductive reasoning.

What is Abduction?

The deductive method emphasizes the rules of logic and doesn’t allow much else into the picture. The inductive method emphasizes accumulating many examples to point towards a specific explanation of those examples. Abduction is very different from both of these. The idea with abduction is that we want the best explanation of the information available to us. Let’s give an example before we try to get too specific.

Suppose that you walk outside in the morning and all of the ground around you is wet, but it is not currently raining. You will probably immediately think that it had finished raining not too long ago. But, for the sake of discussion, how is it that you reject the idea that someone came in the middle of the night and dumped buckets of water everywhere you can see? This idea clearly explains the wetness of the ground, and since there isn’t anything we can “repeat” in an experimental way, we can’t use induction. And yet, I doubt any of you reading this paragraph would give up your belief that it rained not too long ago based on my alternative explanation.

What you are doing subconsciously here is using a kind of abductive argument. You are thinking to yourself something like “Well, yes, your alternative does explain why the ground is wet. But the explanation I originally came up with is better, so I would need you to show me some proof before I give up my belief that it rained yesterday.” You think this way because the hypothesis of rain is better at explaining what you know than the hypothesis of the buckets of water.

Now, suppose that someone showed you a time-stamped video, taken last night, of someone dumping buckets and buckets of water all over the area where you are standing. In that situation, you may well shift what you believe. If you then walk a short distance away and find the ground entirely dry, this will probably convince you. This is because you’ve received new information that doesn’t make sense if it actually had rained earlier, but does make sense if someone had come by while you slept and dumped water everywhere near you.

This is exactly the idea that goes behind abduction. Abductive arguments collect all the information we know about a certain situation, formulate the various ways that we might explain that information, and then try to find which of those does the “best job” of explaining what you see. When all you knew was that the ground was wet, rain made the most sense. But when you found dry ground nearby and saw the video, you gained more information and that new information caused your belief to shift. This is the heart of the abductive method.

But, we must ask, how do we know which option is the best one? What do we mean by best anyways?

What Criteria Do We Use?

When using abductive arguments, we need to be clear about how we are judging which explanation is the best. For example, it wouldn’t be acceptable to use your own emotional preferences as part of what you mean by “best,” since that makes the entire discussion depend on your personal opinion, which is exactly what we are trying to avoid. But what criteria can we use, then? There are a lot, and they may apply to greater or lesser extents depending on the circumstances. The best way to approach this situation (notice I used the word best… what do I mean by best here?) is probably to make a list of various criteria that are often used in abductive arguments. These are often called explanatory virtues, because… well… it is a “good” thing for an explanation to have them. Fairly intuitive name. Here are some examples of qualities that might count as explanatory virtues.

  • Explanatory Power: How well does the theory explain what we know?
    • Example: Einstein’s theory of gravity predicts orbits more accurately than Newton’s theory of gravity does.
  • Explanatory Scope: How many different things that we know does the theory explain?
    • Example: Einstein’s theory of gravity does everything old theories of gravity do, and it does even more.
  • Simplicity: How many new things do we have to accept to adopt this theory?
    • Example: If someone hid a practical joke in your room, it was probably one or two people, not one hundred. And it was probably your brother, not aliens.
  • Coherence: This theory meshes well with things I already know about the world.
    • Example: If you know I like math and then learn that I have a lot of math and science books in my house, that fits well with what you already know about me. If you know I hate math but learn that I own dozens of math books, that does not fit well.
  • Not Ad-Hoc: Does this theory seem arbitrary or contrived? Does it feel like someone ‘pulled it out of thin air’?
    • Example: Explaining Christmas presents by appealing to Santa Claus, a person whose sole purpose of existence is to make the presents appear under the Christmas tree.
  • High Prior Probability: Is there something intuitively (or logically) plausible about this theory? Does it ‘make a lot of sense’?
    • Example: Even before you count up ballots, you pretty much know who California voted for. Same with Mississippi.
  • Multiply Evidenced: Is there evidence for this theory from a variety of different sources and different types of sources?
    • Example: If your theory has evidence from cosmology, biology, history, and archaeology, you’re looking pretty good.

Why Does Abduction Work?

It is actually very strange that abduction works at all. This is because, on the level of deductive logic, abduction is a fallacy. When we use abduction, we are considering some reality B and trying to discover the cause of B (call this cause A). In terms of logic, we are saying something like “since we know B, and A best explains B, A is probably true.” This is extremely similar to the fallacy of affirming the consequent – the fallacy of saying that if we know B and we know that A implies B, then we know A. This reasoning is invalid, and yet in the case of abduction it tends to work pretty well (although it is not perfect). How is this so?

If I’m honest, I don’t think we can ever be absolutely certain about this. Especially because, sometimes, it doesn’t work. Sometimes life is more complicated than that, and sometimes we don’t have all the evidence. And even if we do – why should the qualities that we humans happen to think of as explanatory virtues have anything to do with whether the theory that has them is actually true?

This reminds me of Einstein’s keen observation about science: “The most incomprehensible thing about the world is that it is comprehensible.” It is a daunting question, in my opinion. I think that something like theism would help, as in that case the universe may well have been created by a being who made it orderly and who made human minds in such a way as to discover that order. But, nonetheless, I believe that we all should continue believing that abduction will work, even if we don’t always fully understand why. It is a great blessing that the way we think about goodness and rationality so often lines up with the world in which we live. And using abduction well and thoughtfully is a way we can express our gratitude for that gift.

Updates and Future of Blog in 2021

We are in a new year! Praise be to God the Father and the Lord Jesus Christ for carrying each of us through such a trying year. I pray that everybody reading this might be reassured and revitalized by God’s hand through the Holy Spirit and that all may find life in His name. I pray equally that not only every individual, but also society at large, might feel His healing power.

At the end of such a painful year, this feels only appropriate. We need so much healing. One of my many hopes with this blog is to put a drop into the bucket of healing. To see why, recall that I am both a scientist (a mathematician, to be specific) and a Christian. So, I find myself in a position to help rectify divisions between Christians and non-Christians, and even more specifically between Christians and scientists. I am also an educator – I currently teach calculus and hope to teach far more over the years – and I aim to educate other Christians (and non-Christians) in topics that touch on mathematics and Christianity.

In light of this, I want to continually receive feedback and input on what I am doing. As we are beginning a new year – which is in fact approximately one year since I began writing on this blog in earnest – I want to put forward my goals for the year and give my readers an opportunity to give me feedback on my plans. My post will proceed as follows. I will first list a couple major life events over the past year. I will then briefly summarize those writing projects on which I have already begun writing. I will then move on to topics that I hope to be writing about over the next year. After the post is done, I sincerely hope that all my readers will tell me which of these projects they are most or least excited about, and I hope to receive input about other writing directions I might possibly go as well.

Life Updates

Work Life: COVID has had the impact on me that I imagine it has had on all academics over the past nine months. Namely, I have had drastically less one-on-one interaction with my professors and fellow graduate students. For me personally, this has made being productive much harder. While I know that many have been able to leverage the quarantine into focused study on singular topics and high productivity, my experience has been different. I have ADHD, and the lack of interpersonal interactions/stimulation during quarantine has had a negative impact on my general efficiency and ability to focus. Listening to and delivering online lectures has been quite difficult, as has maintaining enough focus to do productive research. I have been able to do some research on the odd values of Fourier coefficients of modular forms and on arithmetic statistics of partitions, but I feel like I could have been much more productive working in person. Sadly, that’s not how things were. But I hope that going forward I will never take for granted the psychological benefit of working in a “workplace environment.”

Personal Life: The other major update, from just a few weeks ago now, is that I am engaged to be married to the love of my life. I proposed to my fiancée Kara on December 19th of 2020, and I am so thrilled that she said yes. I have spent the past couple of weeks doing a lot of wedding planning – we have chosen a date and venue, formed a preliminary budget, chosen the groomsmen and bridesmaids, and done a lot of the work towards choosing a honeymoon and our first apartment. It has been a thrill and a large drain on my time, although I do not mean this in a bad way. Her emotional support keeps me going in my worst moments and she brings me back to Christ when I am struggling. I am so thankful for her and our upcoming wedding in July, as well as for the rest of my life with her.

Current Writing Projects

I’ve been writing on this blog now since December 2019 (13 months ago) and have written around 80 posts for the blog. Some of these have been more or less isolated, but many of these have also been very interconnected. In fact, the main body of what I’ve written thus far has been designed to not only stand on its own, but also to fit together with other posts. I focus here on those larger bodies of related posts.

Critical Thinking Toolkit: This series has been going for quite a while, and consists of points that are important for thinking carefully about anything at all in life – I think that up to this point none of them have dealt only with religion, although I’ve used examples pertaining to religion as I’ve seen fit. These posts will continue, but will likely slow down within the next few months. I will also eventually put up a database post where I provide links to every post in this series so far, organized into useful categories of posts that share similarities.

Explaining Calculus: I’m probably somewhere between halfway and a third of the way done with this project now. My goal here has been to explain as much of what mathematicians call “calculus” to an audience that is not very mathematically inclined. There have been times I’ve indulged in some more detailed bits of mathematics, but I’ve tried to explain those more slowly than would normally be done. I’ve also not really emphasized actually teaching the reader to do calculus themselves – rather, I’ve been focusing on explaining the sorts of things that calculus does. I have provided examples that the careful reader (or the reader who already knows calculus) should be able to follow, but even if these are not understood, the goal of gaining a sort of rough ‘big-picture’ of what this calculus thing is all about should still be achieved.

There are three major ‘sections’ within the Explaining Calculus series – the topics of which would be called derivatives, integrals, and infinite series. I’ve been explaining derivatives so far, and am nearly done with that section of the series. After that segment, I will move on to integrals, then infinite series, and hopefully explain those well. I expect that this series will probably be completely done by the end of 2021.

Future Writing Projects

I do this blog partly because I love writing, but also because I want to help others learn. Because of that, I think frequently about what the best topics I can write about would be. These can end up shifting over time, but for now here is what I am expecting to be my major new writing projects in 2021.

Book and Article Summaries: This is the most significant of my future projects, the one I’d like to devote the most time to. I’ve only done one of these so far – namely my summary of Eugene Wigner’s article, which is entitled The Unreasonable Effectiveness of Mathematics in the Natural Sciences. This is an academic article – that is, it is printed in a professional academic journal and is intended for experts in the relevant fields (which in this case are mathematics and physics, with implications in philosophy and theology). Ever since I started the blog, I’ve wanted to do things like this. As it turns out, these kinds of posts take a long time to prepare. The commentary part is fairly easy, but it takes a lot of time to do the best I can to correctly represent the original author of the article. In the case of my summary of Wigner’s article, Wigner is an atheistic physicist and I am a Christian mathematician – so we come from different perspectives. I did the best I knew how to keep my own perspective separate from his writing, and to nonetheless include my own thoughts in the post via a section that followed the summary portion.

I would like to do many more of these this year. This may end up meaning that I can’t do as many posts this year; we will see how this works out. But of the 40 books I’ve read in 2020, I’d like to write summaries of probably 10 or so of them – in addition to similar summaries of academic articles. These range from writings of G.H. Hardy to Thomas Nagel to William Lane Craig to Alvin Plantinga. I’d like to do a lot here – hopefully this will be possible without any massive gaps in my posting schedule.

Intro to Philosophy of Religion: I’ve done some isolated posts to this effect in the past, and I’d now like to give this topic a more organized approach. I would like to take all the time I’ve spent learning about how modern philosophers talk about religious issues and put that into an easier-to-read format than is often presented. I’d also like to defend a particular argument for God’s existence, which will form a sort of subseries of these posts. I’d also like to do deeper dives into particular subjects within philosophy of religion. The first of these will be what is known as the Kalam Cosmological Argument, which (basically) reasons from the beginning of the universe a finite time ago (aka the Big Bang) to the conclusion that very likely God exists. Lots more needs to be said here – that will have to wait until a later time.

Explaining Set Theory: This one likely won’t come for a long time – or at least won’t come full force for a long time. This is my intended follow-up for the Explaining Calculus series. Set theory is another branch of mathematics, one that is more abstract. This will be very different from the calculus series. Set theory is, as the mathematician would say, more fundamental than calculus. What the mathematician means by this is similar to what we might mean when we say that learning to read is the fundamental task of all literature. Basically, set theory is very often the ‘language’ in which the other areas of mathematics are expressed. Set theory is a sort of ‘grammar’ for mathematics – in set theory we learn about such basic things as thinking of objects as ‘part of a whole’, how to relate things to each other in an organized manner (aka the function), and how to count things – even infinite things! Set theory also comes up a lot in philosophical circles because of the many curious paradoxical situations we find ourselves in as we dive deep into that world. In this series, my hope is to boil everything down enough that, even though you might not know how to do any set theory on your own, hopefully you can become increasingly able to understand that language.

Viewership of the Blog from Around the World!

That’s all for my major planned future topics for writing. It’s been a good first year – in fact this is a good time to point out that it has been just over a year since I’ve been writing on this blog publicly. I’m incredibly thankful for all of those who have been reading – it has been really gratifying for me and I hope it has been helpful in some way for you.

As a way to show my appreciation and excitement for continuing my writing, I wanted to give an overview of some of the viewership statistics that I can see as the owner of the website. I can see things like the number of pages that have been viewed, the number of times people have opened the website (i.e. if you open my main page, then read three articles, that only counts as opening my website once), and I can even see which countries have people who read the blog and which posts are the most popular. I currently sit at about 2,300 total views, representing more than 50 different countries – which I find amazing and humbling. Thank you to everyone who has been a reader of the blog!

Conclusion

So thank you to everyone who has been a reader – I hope you all have learned something and enjoyed doing so!

Also, I want my writing to be enjoyable to everyone. So, even though I’ve listed what my plans are for 2021, I want to know what people who are actually reading what I write want to see. Let me know in the comments if you have topics that you want me to write about!

My Top 10 Books from 2020

As a child, I loved reading. My mother can tell so many stories about reading to me as an infant or about me beginning to read. But, for a long time, I lost this love. There were a variety of reasons behind this. But in 2019, that love was revived. I now make a concerted effort to read as much as I can. As I am getting my Ph.D. in mathematics, naturally I spend a substantial amount of time reading books about math. But I read about a variety of other topics this year as well, ranging from hermeneutics and theology to cosmology and philosophy. I find learning about all these things fascinating, and I want to share some of the books I’ve found most interesting and thought provoking. It’s hard to make a top 10 list for something so subjective, so I’ve done the best I could to include a variety of topics that different people might find interesting, and I’ve added some brief commentary below each book to explain why I enjoyed it.

10: Mathematics for Human Flourishing (by Francis Su)

This is one of the best books I’ve read at pulling back the wool from the public’s perception of what mathematics is. While the public tends to think of mathematics, and often mathematicians, as purely logical and data-focused, this book strips away that stereotype to reveal the deeply aesthetic, societal, and emotional aspects of mathematics. I highly recommend the book for anyone who wants to see math in a different and beautiful light.

9: There is a God (by Antony Flew)

Antony Flew was, for most of his life, one of the most famous philosophical atheists in the world. He engaged in public debates defending atheism, and wrote many books in its favor. However, towards the end of his life, this great skeptic came to believe in a kind of deism (a belief that there is a God but which neither affirms nor denies any particular religious tradition). And the primary reason for this, which might be surprising to many, is actually modern biology. This book goes through the evidence that convinced him that there must be some kind of God in an easy to read and objective way.

8: The Lion, the Witch, and the Wardrobe (by C.S. Lewis)

This is a pretty well known book, so it probably doesn’t need too much of my commentary. This is a beautiful story (also a movie) written by one of the most engaging writers of the twentieth century, C.S. Lewis. Good for children and adults, bringing both childlike wonder and deep questions to the table, this book is a good and quick read for anyone.

7: Knowledge and Christian Belief (by Alvin Plantinga)

Alvin Plantinga is an extremely influential philosopher from Notre Dame. He came into his career in a time when philosophical circles thought that there was no reason to take people who believe in any kind of God seriously. Plantinga essentially single-handedly changed their minds, and now modern philosophy has fruitful debate between atheists and theists. This book is a summary version of one of Plantinga’s most important works, Warranted Christian Belief, that addresses in a careful and thoughtful manner those people who claim that religious belief is irrational (as distinct from those who merely say it is false). The book addresses such topics as misguided secular definitions of the word faith, the witness of the Holy Spirit in Christianity, the nature of rationality, and what it means to know anything at all.

6: Where is God in a Coronavirus World? (by John Lennox)

John Lennox is a philosopher and mathematician at Oxford University, and a prolific author. In this book, Lennox takes a look at the COVID pandemic through a calmer lens, taking seriously all the suffering and yet not resorting to panic, taking seriously the physical, emotional, and spiritual dimensions of the world in which we find ourselves. Asking questions such as “why does God allow viruses” and providing scientific as well as spiritual answers, this book is a great and short read if you are struggling to grapple with the pandemic as a Christian (or even as a non-Christian).

5: Misreading Scripture with Western Eyes (by E. Randolph Richards & Brandon J. O’Brien)

Most books written by authors today are easy for us to read, because we share a common culture with the authors. But when you read a book from an ancient time or a faraway place, things get tricky. You may not be aware of the assumptions of that culture, of their way of life, or of the things they care about. And because you don’t know these things, you are apt to misunderstand a lot of what they write. Misreading Scripture with Western Eyes does a deep dive into the many ways we as 21st century Westerners misread what the 1st century-and-earlier Eastern/Middle Eastern authors of the Bible were trying to tell us. The authors go through their own process as Westerners of learning about Eastern ways of looking at life, including such topics as family, individuals, morality, laws, the honor-shame paradigm, and much more, and show how what we think the Jewish authors of the Biblical text meant isn’t quite what they actually meant. This book is a must-read if you want to better learn what the Bible really teaches.

4: The Question of God (by Dr. Armand M. Nicholi, Jr.)

Many great men and women have wrestled over the difficult questions of life – one of which, of course, is whether or not there is a God, and if so what God is like. This book is an exploration of some of the big questions – ‘God, Love, Sex, and the Meaning of Life’ per the subtitle of the book – through two very different sets of eyes – C.S. Lewis and Sigmund Freud. They were roughly contemporary, and although they never met, their ideas have an interesting and enlightening back-and-forth. For those who want to explore the lives and ideas of either of these two great men, I’d highly recommend picking up this book.

3: Seeking Allah, Finding Jesus (by Nabeel Qureshi)

This book is riveting and challenging to anyone. Nabeel Qureshi grew up Muslim, spent several years comparing the Scriptures and historical evidence behind Christianity and Islam, and eventually became a Christian through that study and through visions and dreams about Jesus. The evidence does come up in this book, but the book is more a story of his life, focusing on the spiritual and emotional realities of his Muslim upbringing, his friendship with Christians and those of other faiths, and the differences he saw between the Pakistani Muslim culture of his family and the Western culture that surrounded him. No matter your background, no matter what you think about Islam or Christianity, there is much to learn from Nabeel’s clear and eloquent writing style. By reading this book, not only will you get to see a very interesting life play out, you will learn about various cultures and how they interact in an individual’s life.

2: Tactics (by Greg Koukl)

I consider this a must-read for anyone who wants to do public dialogue. It should be required reading. The best summary I could give of what the book is about would be: how to tactfully avoid unhelpful conversations by moving conversations in productive directions. The book itself focuses on doing this in religious and/or political conversations, but the core lessons apply anywhere and, even if you disagree with the author’s opinions, you can still see clearly how the strategies discussed work. To give a taste of the book, my favorite conversational tactic is called the Road Runner Tactic, and it gives conversational strategies for pointing out self-refuting ideas, which are ideas that fail to live up to their own standard. By mastering this book, your conversations about any topic – and especially controversial ones – will become much more productive and helpful for everyone, including yourself.

1: God, Stephen Hawking, and the Multiverse (by David Hutchings & David Wilkinson)

This is an absolutely brilliant book. If you want to understand the life of the great physicist and thinker Stephen Hawking, my main recommendation would be this book, probably even more than Hawking’s own books. I say this because the authors have made Hawking’s scientific and philosophical thinking incredibly accessible using helpful stories and analogies that make the whole book engaging. You don’t have to know any special amount of math, science, or philosophy, and the book is nonetheless able to take the reader through Hawking’s biggest and most influential ideas, both through the eyes of the public and the eyes of academia.

Honorable Mentions

I’ve read many more books than just 10 this year, and I felt I had to leave off so many awesome books when I made this list, so I wanted to throw on a few at the end as honorable mentions. These are listed in no particular order.

  • Cold-Case Christianity (by J. Warner Wallace)
    • Like crime shows? J. Warner Wallace has been on some for his work as a cold-case detective. As an atheist, Wallace used his investigative skills on the Bible and came to the conclusion that Jesus really did rise from the dead and that Christianity is the truth. In this book, Wallace shows how the objective tools that a detective uses at a crime scene can reveal the truth of the Bible in an easy-to-understand way that connects very closely with concepts that we already understand well from our favorite TV crime dramas.
  • Infinity, Causation, and Paradox (by Alexander Pruss)
    • This one is really dense, and you’d need some background in set theory and probability to follow it, but it is utterly fascinating. The book delves into a huge range of paradoxes about infinity – like strange lotteries with infinitely many tickets, infinitely long sticks, and infinitely long chains of cause-and-effect – and analyzes them mathematically and philosophically with the goal of resolving as many of the paradoxes as possible in ‘one fell swoop.’ The solution given in the book, causal finitism, holds that although perhaps infinitely many things exist, only finitely many causes can contribute to any event. The book ends with a discussion that suggests that, if causal finitism is true, then there ought to be a First Uncaused Cause of everything that exists.
  • The Kalam Cosmological Argument (by William Lane Craig)
    • Easy to read for a work of academic philosophy. This book covers evidence for God’s existence in a form called the Kalam cosmological argument. Craig traces the origins of the argument in medieval Islamic theology and follows the lineage of these thoughts to modern times. Craig then reformulates the argument in a stronger form, and draws on evidence from big bang cosmology, thermodynamics, and metaphysics to argue that the universe has a transcendent cause. (Worth noting: this book was written in the 1970s, and the section on science would need to be much longer to fully discuss modern developments in cosmology. But the discussion is still close enough to modern as to be useful.)
  • Church History in Plain Language (by Bruce Shelley)
    • Overview of the entire two thousand year history of the Christian church in, as the title suggests, plain and easy to understand language. The book is broken up in a natural way, and covers topics from early church councils to the formation and function of monasteries to denominational differences and so much more. Highly recommended as an introduction to the long and complicated history of the Christian church.
  • God and the Astronomers (by Robert Jastrow)
    • If science is the study of what causes the effects we see around us, what caused the universe to spring into existence? World-renowned astrophysicist Dr. Robert Jastrow takes the reader through the evidence for the Big Bang, why at first that evidence was hated and rejected by atheist scientists, and the eventual acceptance by the scientific community of a theory that felt uncomfortably like a Biblical creation event.
  • Surprised by Hope (by N.T. Wright)
    • Renowned scholar and Anglican bishop N.T. Wright explores the Biblical message about the afterlife that we all miss… it has never been about heaven at all. Wright delves into the importance in the Bible of the real bodily resurrection, both of Christ and eventually of our own bodies, and the related deep connections between this life and the life to come that are all too often ignored.
  • Tales of Impossibility (by Dave Richeson)
    • A fascinating look into the 2000-year history of big problems that the ancient Greeks were unable to solve, the most famous of which would be “squaring the circle.” The book goes through the history and various approaches taken to the problems until they were finally resolved in the 1800s using a surprising method.

Thank you for taking the time to read through this list. I hope that you’ve found something that might interest you that you can pick up in 2021!

Computing Derivatives: Part 2 (Explaining Calculus #8)

Most recently in the series on calculus, we did an overview of some “precalculus” topics that we’d need for later calculus discussions. Having now done this, we move on to several examples of more ‘complicated’ rules that derivatives follow. We will then investigate some more difficult specific functions and their derivatives. Finally, at the end, I will leave a list of “homework problems” for any of my readers who want practice.

The Product Rule

We have already tackled how to take derivatives of functions like f(x) + g(x). We handled addition, one of the main operations of arithmetic. By handling addition, we were able to handle subtraction as well. But what about multiplication? That would be the natural next step, just as in school multiplication is the next natural step after learning about addition and subtraction. Our purpose now is to lay out the so-called product rule, which enables us to take derivatives of functions like f(x) g(x).

Fact: For any two functions f(x) and g(x) that have derivatives,

\dfrac{d}{dx}[f(x) g(x)] = f^\prime(x) g(x) + f(x) g^\prime(x).

Proof: This proof uses a slick trick of “adding zero” to an expression. First, by the definition of derivatives,

\dfrac{d}{dx}[f(x) g(x)] = \lim\limits_{h \to 0} \dfrac{f(x+h) g(x+h) - f(x) g(x)}{h}.

Now, we want to be clever. Our clever move will be the fact that f(x) g(x+h) - f(x) g(x+h) = 0. While it looks very strange to point out something so obvious, this fact means that

\lim\limits_{h \to 0} \dfrac{f(x+h) g(x+h) - f(x) g(x)}{h} = \lim\limits_{h \to 0} \dfrac{f(x+h) g(x+h) - f(x) g(x+h) + f(x) g(x+h) - f(x) g(x)}{h}.

The first two pieces of this numerator have a common factor, as do the second two. Therefore, this messy expression can be simplified a little bit as

\lim\limits_{h \to 0} \dfrac{g(x+h)(f(x+h) - f(x))}{h} + \lim\limits_{h \to 0} \dfrac{f(x)(g(x+h) - g(x))}{h}.

Since \lim\limits_{h \to 0} g(x+h) = g(x), the first of these simplifies as

\lim\limits_{h \to 0} \dfrac{g(x+h)(f(x+h) - f(x))}{h} = g(x) \lim\limits_{h \to 0} \dfrac{f(x+h) - f(x)}{h} = g(x) f^\prime(x).

The second piece, using the exact same process, is equal to f(x) g^\prime(x). When everything is put back together, f(x) g(x) has derivative f^\prime(x) g(x) + f(x) g^\prime(x).
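As a quick sanity check of this formula, here is a small numerical sketch (my own illustration; the two polynomial functions below are arbitrary choices, picked because we already know how to differentiate polynomials from the previous post).

```python
# Compare a difference quotient for f(x)g(x) against the product rule f'(x)g(x) + f(x)g'(x).

f  = lambda x: x**2 + 1
fp = lambda x: 2*x              # f'(x)
g  = lambda x: x**3 - x
gp = lambda x: 3*x**2 - 1       # g'(x)

x, h = 1.3, 1e-6
difference_quotient = (f(x + h)*g(x + h) - f(x)*g(x)) / h
product_rule        = fp(x)*g(x) + f(x)*gp(x)
print(difference_quotient, product_rule)   # the two values agree to several decimal places
```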

The Chain Rule

You’d think that the next thing we would do is division. And in fact, there is a way I could do division next. But it will actually be easier to do something else first. I want to handle functions inside of other functions next, like f(g(x)), which just means “copy-paste the expression g(x) wherever there would have been an x.” This is called the chain rule, because the functions sort of link together like a chain, inextricably connected.

Fact: For any two functions f(x), g(x) for which f(g(x)) makes sense, we have \dfrac{d}{dx}[f(g(x))] = f^\prime(g(x)) * g^\prime(x). In different notation, if y is a function of u, which itself is a function of x, then

\dfrac{dy}{du} * \dfrac{du}{dx} = \dfrac{dy}{dx}.

In an earlier post, I made a comment about a similarity between this derivative notation and genuine fractions. The chain rule is the central such similarity – when written in the dy-over-dx style, it looks as if the du’s are cancelling out, like they would if these were actual fractions.

Proof: This one is actually quite tricky compared to the others. For this reason, I won’t actually do a totally correct proof. Instead, I’m going to do a proof that “usually” works. For those who want a great challenge, try to figure out where this proof goes wrong. (If you end up wanting to know, the Wikipedia article on the chain rule will actually tell you. If you don’t care, then reading this proof will give you the right idea.)

The definition of the derivative tells us that

\dfrac{d}{dx}[f(g(x))] = \lim\limits_{h \to 0} \dfrac{f(g(x+h)) - f(g(x))}{h}.

We now make a clever step of multiplying top and bottom by g(x+h) - g(x).

\dfrac{d}{dx}[f(g(x))] = \lim\limits_{h \to 0} \dfrac{f(g(x+h)) - f(g(x))}{g(x+h) - g(x)} \cdot \dfrac{g(x+h) - g(x)}{h}.

We can even split this up into two limits multiplied together.

\dfrac{d}{dx}[f(g(x))] = \lim\limits_{h \to 0} \dfrac{f(g(x+h)) - f(g(x))}{g(x+h) - g(x)} \cdot \lim\limits_{h \to 0} \dfrac{g(x+h) - g(x)}{h} = g^\prime(x) \lim\limits_{h \to 0} \dfrac{f(g(x+h)) - f(g(x))}{g(x+h) - g(x)}.

Since g(x) will be a continuous function (it has a derivative so it must be continuous), g(x+h) \to g(x) as h \to 0. This enables us to treat g(x+h) as if it were something like g(x) + h. There is a more careful way to write that down, but since we are being informal I won’t do that (and no, this isn’t the real problem in the proof… that already happened earlier). This will mean that

\lim\limits_{h \to 0} \dfrac{f(g(x+h)) - f(g(x))}{g(x+h) - g(x)} = \lim\limits_{h \to 0} \dfrac{f(g(x) + h) - f(g(x))}{(g(x) + h) - g(x)} = \lim\limits_{h \to 0} \dfrac{f(y+h) - f(y)}{h},

where y = g(x) is used to make the formulas easier to follow. This leads directly to the value f^\prime(g(x)) for this limit. When we combine all of our work, we find that \dfrac{d}{dx}[f(g(x))] = f^\prime(g(x)) g^\prime(x).
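
As before, a short, informal numeric check can make the formula feel concrete. In this Python sketch (my own throwaway example), I take f(x) = \sin{x} and g(x) = x^2, so that f(g(x)) = \sin{(x^2)}.

  # Quick numeric sanity check of the chain rule (illustrative only).
  import math

  def numderiv(func, x, h=1e-6):
      return (func(x + h) - func(x - h)) / (2 * h)

  x = 0.7
  estimate = numderiv(lambda t: math.sin(t**2), x)   # derivative of f(g(x))
  formula = math.cos(x**2) * (2 * x)                 # f'(g(x)) * g'(x)
  print(estimate, formula)   # the two values agree to several decimal places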

The Quotient Rule

Fact: If the functions f(x), g(x) have derivatives, then the derivative of \dfrac{f(x)}{g(x)} is \dfrac{g(x) f^\prime(x) - f(x) g^\prime(x)}{g(x)^2}.

Proof: The first thing we do is to view the division as a multiplication:

\dfrac{f(x)}{g(x)} = f(x) \cdot \dfrac{1}{g(x)}.

Secondly, we want to use the chain rule for \dfrac{1}{g(x)}. To do this, we will use the function h(x) = \dfrac{1}{x}. Then \dfrac{1}{g(x)} = h(g(x)). Therefore,

\dfrac{f(x)}{g(x)} = f(x) \cdot h(g(x)).

We can then use the product rule to begin this derivative:

\dfrac{d}{dx}\bigg[ \dfrac{f(x)}{g(x)} \bigg] = f(x) \cdot \dfrac{d}{dx}[ h(g(x)) ] + h(g(x)) f^\prime(x).

From the chain rule, we can simplify the derivative of h(g(x)) to arrive at

\dfrac{d}{dx}\bigg[ \dfrac{f(x)}{g(x)} \bigg] = f(x) \cdot [h^\prime(g(x)) \cdot g^\prime(x)] + h(g(x)) f^\prime(x).

Now, since h(x) = \dfrac{1}{x} = x^{-1}, the rule for taking derivatives of powers of x proven in an earlier post tells us that h^\prime(x) = - x^{-2} = \dfrac{-1}{x^2}. Therefore,

\dfrac{d}{dx}\bigg[ \dfrac{f(x)}{g(x)} \bigg] = f(x) \cdot \bigg(\dfrac{-1}{g(x)^2} \cdot g^\prime(x) \bigg) + \dfrac{1}{g(x)} \cdot f^\prime(x),

and we can simplify this as

\dfrac{d}{dx}\bigg[ \dfrac{f(x)}{g(x)} \bigg] = \dfrac{- f(x) g^\prime(x)}{g(x)^2} + \dfrac{f^\prime(x) g(x)}{g(x)^2} = \dfrac{g(x) f^\prime(x) - f(x) g^\prime(x)}{g(x)^2}.

This is the original formula we wanted to prove. So, our proof is now done.
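
Once more, a small, informal Python sketch can check the formula numerically. The choices f(x) = \sin{x} and g(x) = x^2 + 1 are arbitrary sample functions; I picked g so that it is never zero.

  # Quick numeric sanity check of the quotient rule (illustrative only).
  import math

  def numderiv(func, x, h=1e-6):
      return (func(x + h) - func(x - h)) / (2 * h)

  x = 2.0
  f, df = math.sin, math.cos          # f and its known derivative
  g  = lambda t: t**2 + 1             # g(x) = x^2 + 1
  dg = lambda t: 2 * t                # g'(x) = 2x

  estimate = numderiv(lambda t: f(t) / g(t), x)
  formula = (g(x) * df(x) - f(x) * dg(x)) / g(x)**2  # the quotient rule
  print(estimate, formula)   # the two values agree to several decimal places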

More Special Functions

We now move on from these general principles to some more specific examples. In the previous post on computing derivatives, we built up all the tools to compute the derivative of any polynomial expression, and in fact any expression involving x raised to constant powers. Now, we move on to trigonometry, exponentials, and logarithms. These derivatives are more difficult to discover, but the importance of these functions requires that any complete study of calculus should include their derivatives. Also, a thorough study of these functions and their derivatives will provide a very complete picture of how to calculate derivatives using the limit definition.

Fact: The derivatives of \sin{x} and \cos{x} are

\dfrac{d}{dx}[\sin{x}] = \cos{x} \text{ and } \dfrac{d}{dx}[\cos{x}] = -\sin{x}.

Proof: The definition of the derivative tells us that

\dfrac{d}{dx}[ \sin{x} ] = \lim\limits_{h \to 0} \dfrac{\sin{(x+h)} - \sin{x}}{h}.

In a previous post in the series, where I defined the function \sin{x}, I also gave the following rule for computing the value of \sin{(x+h)}:

\sin{(x+h)} = \sin{x} \cos{h} + \cos{x} \sin{h}.

Therefore,

\dfrac{d}{dx}[ \sin{x} ] = \lim\limits_{h \to 0} \dfrac{\sin{(x+h)} - \sin{x}}{h} = \lim\limits_{h \to 0} \dfrac{(\sin{x} \cos{h} + \cos{x} \sin{h}) - \sin{x}}{h}.

We can now split this limit up into two pieces – one for \sin{x} and one for \cos{x}.

\lim\limits_{h \to 0} \dfrac{(\sin{x} \cos{h} + \cos{x} \sin{h}) - \sin{x}}{h} = \lim\limits_{h \to 0} \dfrac{\sin{x}(\cos{h} - 1)}{h} + \lim\limits_{h \to 0} \dfrac{\cos{x} \sin{h}}{h}.

There are some facts we have to know in order to continue: that \lim\limits_{h \to 0} \dfrac{\sin{h}}{h} = 1 and \lim\limits_{h \to 0} \dfrac{\cos{h} -1}{h} = 0. The proofs of these are a bit tricky, and so I don’t want to go off on a tangent (math pun!) talking about those here. I will add an appendix to the end of this post in which I talk about how to find these limits.

Moving on, once we know the values of these limits, we know that

\lim\limits_{h \to 0} \dfrac{\sin{x}(\cos{h} - 1)}{h} = \sin{x} \cdot \lim\limits_{h \to 0} \dfrac{\cos{h} - 1}{h} = 0

and

\lim\limits_{h \to 0} \dfrac{\cos{x} \sin{h}}{h} = \cos{x} \cdot \lim\limits_{h \to 0} \dfrac{\sin{h}}{h} = \cos{x}.

Therefore, putting together all the steps we’ve laid out,

\dfrac{d}{dx}[\sin{x}] = \cos{x}.

This completes the first half of the proof. I will leave the proof about the derivative of \cos{x} as practice for any of my curious readers, only providing a few guiding hints. The proof begins the same way. Instead of using the special rule for \sin{(x+h)}, you need to use the rule for \cos{(x+h)} that I gave in the same post in which I gave the rule for \sin{(x+h)}. After this rule is used, you should be able to finish the proof by following the same ideas I use here.

This completes our discussion of the derivatives of the sine and cosine functions.
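
If you’d like a little numerical reassurance before moving on, here is a quick, informal Python sketch comparing difference-quotient estimates for \sin{x} and \cos{x} against the formulas we just derived; the helper and the sample points are arbitrary choices of mine.

  # Quick numeric check that (sin x)' behaves like cos x and (cos x)' like -sin x.
  import math

  def numderiv(func, x, h=1e-6):
      return (func(x + h) - func(x - h)) / (2 * h)

  for x in [0.0, 0.5, 1.0, 2.0]:
      print(numderiv(math.sin, x), math.cos(x))      # these two should match
      print(numderiv(math.cos, x), -math.sin(x))     # and so should these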

Fact: The derivative of f(x) = b^x is f^\prime(x) = b^x * \log{b}.

Proof: The first thing we will do is to “compress” every value of b into the very special number e. Remember that, earlier in the series, we defined the “natural” exponential function e^x and “natural” logarithm function \log{x}. Also, remember that e^{\log{x}} = x. Using x = b, we conclude that b = e^{\log{b}} and therefore

b^x = (e^{\log{b}})^x = e^{x * \log{b}}.

Now, let’s call f(x) = e^x and g(x) = x * \log{b}. Then b^x = f(g(x)). Using the chain rule, we then know that

\dfrac{d}{dx}[b^x] = \dfrac{d}{dx}[ f(g(x)) ] = f^\prime(g(x)) * g^\prime(x) = \log{b} * f^\prime(x*\log{b}).

What we have done here is to express the derivative of b^x in terms of the derivative of e^x. This means that we now only need to know how to find the derivative of this most special function. Using the definition of the derivative, along with some basic rules of exponents, we have

\dfrac{d}{dx}[e^x] = \lim\limits_{h \to 0} \dfrac{e^{x+h} - e^x}{h} = \lim\limits_{h \to 0} \dfrac{e^x(e^h - 1)}{h} = e^x \lim\limits_{h \to 0} \dfrac{e^h-1}{h}.

The proof here is a bit detailed, but the value of \lim\limits_{h \to 0} \dfrac{e^h - 1}{h} is 1. I will delay this proof to the appendix. But, the key fact here is that e^x is so special because it is its own derivative. That is, if f(x) = e^x, then f^\prime(x) = f(x) = e^x! Using all of these facts, we conclude that

\dfrac{d}{dx}[b^x] = e^{x * \log{b}} * \log{b} = b^x * \log{b}.

This completes our discussion of the derivative of exponential functions.
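
For those who like numbers, here is a tiny, informal Python check of this formula with the arbitrary choice b = 2. Remember that \log means the natural logarithm here, which is what Python’s math.log computes.

  # Quick numeric check that the derivative of b^x is b^x * log(b), with b = 2.
  import math

  def numderiv(func, x, h=1e-6):
      return (func(x + h) - func(x - h)) / (2 * h)

  b, x = 2.0, 1.5
  print(numderiv(lambda t: b**t, x), b**x * math.log(b))   # the two values agree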

Fact: The derivative of the function f(x) = \log_b{x} is f^\prime(x) = \dfrac{1}{x * \log{b}}.

Proof: There is a way to do this similar to the previous proof about b^x. Instead of doing that, I want to be clever. Since the functions b^x and \log_b{x} are so closely related, shouldn’t their derivatives be really closely related too? We should think so. That just makes sense. Then maybe we can find a way to take advantage of our formula for the derivative of b^x to find the derivative of \log_b{x}. Can you find the way? Take a moment and think about it.

The way I’ll do this is to transform the equation f(x) = \log_b{x} into the equation b^{f(x)} = b^{\log_b{x}}. The second step is to notice that b^{\log_b{x}} = x because of the relationship between b^x and \log_b{x}. Therefore, b^{f(x)} = x. The other clever observation we have to make is that if two things are equal, their derivatives must also be equal. This means that \dfrac{d}{dx}[ b^{f(x)} ] = \dfrac{d}{dx}[x]. This is so clever because we can find these two derivatives by different methods. The right-hand side is much easier: the derivative of x is just 1. The left-hand side can be found by using the fact we already found about the derivative of exponentials along with the chain rule:

\dfrac{d}{dx}[b^{f(x)}] = (b^{f(x)}*\log{b}) * f^\prime(x) = (x*\log{b})*f^\prime(x)

since b^{f(x)} = x was already known to us. By combining these two derivative computations, we conclude that

(x*\log{b})*f^\prime(x) = 1,

and all we need to do is divide both sides of the equation by x*\log{b} to find our answer.

Notice that from this fact, we can also deduce that \dfrac{d}{dx}[\log{x}] = \dfrac{1}{x}. I will leave this to my reader. (Hint: why is \log{e} = 1?)

This basically completes the list of important derivatives. But for the sake of learning, I want to compute one more derivative. It uses a clever trick, much like the one in the previous proof, that lets us find the derivative of a function that is “more complicated” than any of those we have previously discussed. The reason I want to show this is to demonstrate how we can take advantage of the rules we have already learned to find derivatives of more complex functions.

Fact: The derivative of f(x) = x^x is f^\prime(x) = x^x(\log{x} + 1).

Proof: The function x^x isn’t actually susceptible to any derivative rule we’ve used so far. This is where we make a clever move to bring this function into the realm we know how to deal with by taking a logarithm. By the normal rules of logarithms, \log{f(x)} = \log{x^x} = x \log{x}. The derivative of the right hand side can be done using the product rule:

\dfrac{d}{dx}[ x \log{x} ] = x \dfrac{d}{dx}[\log{x}] + \log{x} \dfrac{d}{dx}[x] = \dfrac{x}{x} + \log{x} = \log{x} + 1.

On the other hand, the left hand side can be done using the chain rule:

\dfrac{d}{dx}[\log{f(x)}] = \dfrac{1}{f(x)} * f^\prime(x) = \dfrac{f^\prime(x)}{x^x}.

Since the left and right hand sides are equal, we conclude that

\dfrac{f^\prime(x)}{x^x} = \log{x}+1 \implies f^\prime(x) = x^x(\log{x} + 1).
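
As a quick, informal check of this somewhat exotic formula, here is a short Python sketch comparing a difference-quotient estimate for x^x against x^x(\log{x} + 1) at an arbitrary point of my choosing.

  # Quick numeric check of the derivative of x^x (illustrative only).
  import math

  def numderiv(func, x, h=1e-6):
      return (func(x + h) - func(x - h)) / (2 * h)

  x = 1.7
  print(numderiv(lambda t: t**t, x), x**x * (math.log(x) + 1))   # the two agree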

Conclusion

We have now done everything we need to do with the admittedly tedious work of computing special kinds of derivatives. If you’ve made it this far, well done. Things get less tedious and more conceptual from here. Think about how tedious it was to learn how to add and multiply small numbers. Despite this tediousness, the effort is clearly worthwhile because of how useful addition and multiplication are for solving real-world problems. What we’ve done is “learned our times tables” for derivatives. Now that we’ve finished our “tables,” we can move on to grander topics. We can stop looking at individual trees and see the forest. Beginning with the next post, this is what we shall do. In the meantime, if you want some more practice, I’ve left some practice problems and extra material in an appendix that my more curious readers could look at.

Homework/Practice

Problem 1: Find the derivatives of x^2 e^x and of \dfrac{x^2+2}{x^3 - x}.

Problem 2: Work out the details of the proof that the derivative of \cos{x} is - \sin{x}. Also, prove that the derivative of \tan{x} = \dfrac{\sin{x}}{\cos{x}} is \dfrac{1}{\cos^2{x}}. Go on to find the derivatives of \cot{x}, \sec{x}, and \csc{x}.

Problem 3: Find the derivatives of e^{e^x} and of \log{(\log{x})}. (Hint: Use the chain rule for both!)
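
If you want a way to check your answers, a computer algebra system will compute derivatives symbolically. Here is a brief, optional sketch using the third-party sympy library (not something you need to do the problems by hand); it reproduces the x^x result from this post, and you can swap in your own expressions the same way.

  # Optional: checking derivative computations with the sympy library.
  import sympy as sp

  x = sp.symbols('x')
  print(sp.diff(x**x, x))    # prints x**x*(log(x) + 1), matching our result
  # Replace x**x with your own expression to check a homework answer.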

Appendix

In this appendix, we evaluate the various limits we have delayed in the proofs given above.

Fact: \lim\limits_{h \to 0} \dfrac{\sin{h}}{h} = 1 and \lim\limits_{h \to 0} \dfrac{\cos{h} - 1}{h} = 0.

Proof: The proof for the second limit is similar to the first, so we won’t consider it here. We only do the first limit. To do so, we consider the following geometric construction.

We need to establish some initial values of the geometry here. Picture A at the center of a circle of radius 1, with B and C two points on that circle; the angle at A between the segments AB and AC is defined to be x. Let E be the foot of the perpendicular dropped from C onto AB, and let D be the point where the ray AC, extended, meets the line through B perpendicular to AB. The side AB has length 1, which means immediately that AE has length \cos{x} and side CE has length \sin{x}. (I’m not currently very good at uploading labelled diagrams like this… sketch it on a piece of paper with labelled sides if you need that visual).

We first notice the triangle \Delta ABC. The area of this triangle is one half its base times its height. Its height is \sin{x}, and its base is 1. Therefore,

\text{Area}(\Delta ABC) = \dfrac{1}{2} \sin{x}.

Before we carry on, why is the length of BD equal to \tan{x}? This is because of “similar triangles.” Remember from geometry that two triangles are similar if their angles have the same measures. The two triangles \Delta ACE and \Delta ABD are similar. The key fact about similar triangles is that corresponding ratios of side lengths are always the same. This means that

\dfrac{\text{Length}(AE)}{\text{Length}(CE)} = \dfrac{\text{Length}(AB)}{\text{Length}(BD)}.

The length of AE is just \cos{x}. The length of AB is 1, and the length of CE is \sin{x}. Therefore,

\dfrac{\cos{x}}{\sin{x}} = \dfrac{1}{\text{Length}(BD)}.

Solving for \text{Length}(BD) gives \text{Length}(BD) = \dfrac{\sin{x}}{\cos{x}} = \tan{x}. So, the length of BD truly is equal to \tan{x}. This quickly tells us (as with the smaller triangle) that \text{Area}(\Delta ABD) = \dfrac{1}{2} \tan{x}. There is one more thing we want to observe. Instead of the triangle \Delta ABC, look at the “pizza slice” ABC. It definitely has more area than the triangle \Delta ABC and less area than the larger triangle \Delta ABD. Therefore, by putting together our earlier calculations,

\text{Area}(\Delta ABC) \leq \text{Area}(ABC) \leq \text{Area}(\Delta ABD)

\implies \dfrac{1}{2} \sin{x} \leq \text{Area}(ABC) \leq \dfrac{1}{2} \tan{x}.

The way angles are defined, the “pizza slice” ABC has area \dfrac{x}{2\pi} of a full circle. The area of the full unit circle is \pi, so the area of the pizza slice is \dfrac{1}{2} x. Therefore,

\dfrac{1}{2} \sin{x} \leq \dfrac{1}{2} x \leq \dfrac{1}{2} \tan{x} \implies \sin{x} \leq x \leq \tan{x}.

If we divide all three parts by \sin{x}, then since \dfrac{\tan{x}}{\sin{x}} = \dfrac{1}{\cos{x}},

1 \leq \dfrac{x}{\sin{x}} \leq \dfrac{1}{\cos{x}}.

If we “flip” all these fractions and reverse the inequalities, this means that

1 \geq \dfrac{\sin{x}}{x} \geq \cos{x}.

By taking limits on all parts of this inequality,

\lim\limits_{x \to 0} 1 \geq \lim\limits_{x \to 0} \dfrac{\sin{x}}{x} \geq \lim\limits_{x \to 0} \cos{x}.

Notice now that the third of these limits is just \cos{0}, which if we look at the triangle construction earlier is just 1, the length of AB. Therefore,

1 \leq \lim\limits_{x \to 0} \dfrac{\sin{x}}{x} \leq 1.

This technique is often called the Sandwich Theorem because we stuck the expression we wanted in between two other expressions. Since, of course, the only number between 1 and 1 is 1, the limit we wanted to find must be 1.

Fact: \lim\limits_{h \to 0} \dfrac{e^h - 1}{h} = 1.

Proof: This is done by very cleverly using the way the number e is defined. Remember that

e = \lim\limits_{x \to \infty}  \bigg( 1 + \dfrac{1}{x} \bigg)^x.

The clever move we can make here is to define h = \dfrac{1}{x} and rewrite this same limit in terms of h. Notice that if x \to \infty, then h \to 0. (Technically this is only a limit “from the right”. We would first have to prove that this limit actually exists. This isn’t terribly difficult to do, but it would require a lot of additional writing. So, I won’t do this. If you look at a graph of the function \dfrac{e^x-1}{x}, you can convince yourself that this limit does exist.)

By interchanging variables in this way,

e = \lim\limits_{h \to 0} (1 + h)^{1/h}.

Using this as a substitution for e inside the limit we want to actually compute,

\lim\limits_{h \to 0} \dfrac{e^h-1}{h} = \lim\limits_{h \to 0} \dfrac{((1+h)^{1/h})^h - 1}{h} = \lim\limits_{h \to 0} \dfrac{(1+h) - 1}{h} = \lim\limits_{h \to 0} \dfrac{h}{h} = 1.
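
If you want to see these special limits with your own eyes, here is a quick, informal Python sketch that plugs smaller and smaller values of h into all three of the limits used in this post.

  # Numerically watching the three special limits as h shrinks (illustrative).
  import math

  for h in [0.1, 0.01, 0.001, 0.0001]:
      print(h,
            math.sin(h) / h,           # heads toward 1
            (math.cos(h) - 1) / h,     # heads toward 0
            (math.exp(h) - 1) / h)     # heads toward 1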

Critical Thinking Toolkit: Self-Defeating Statements

Have you ever heard someone say something you find completely unbelievable? I’m sure we all have. Often, we think this because we know things that make the opposing claim unbelievable. There are many examples I could give, but I think your thoughts have probably already filled in some examples for me, so I won’t. Absurd ideas are everywhere. The question I will ask – and answer – here is which ideas are the “most absurd” possible. How absurd could an idea be?

An Example of the Absurd

To get to the bottom of this, let’s think of what makes something “absurd” in the first place. One way we might think of this is that something absurd “flies in the face of all the evidence.” Imagine, for instance, you are talking to a person who claims that Joe Biden, the soon-to-be President of the United States, does not exist. Surely, whether or not you like Biden, you think he is a real person. Wouldn’t it be strange if millions of U.S. citizens had voted for President for a person who did not exist? Surely, Joe Biden is a real person who actually exists. To deny that Joe Biden exists is absurd.

Conspiracy theories (those that are unsubstantiated, at least) stand on the same footing. To be as uncontroversial as possible, let’s go with Holocaust denial, the belief that aliens built the Egyptian pyramids, and the view (held by nobody so far as I know) that Abraham Lincoln never lived. You could just do a Google search to figure out how some very misguided people try to deny that the Holocaust ever happened if you wanted to. I also greatly enjoy this video, which is a mock interview in which someone pretends to believe that Abraham Lincoln never lived and uses all kinds of ridiculous tactics to try to justify this “belief” of his. I’d highly recommend the video – it is very comical and also makes the point well… at least to me it does. At any rate, conspiracy theories generally are also quite absurd – although possibly one or two of them are really true.

Can any belief be more absurd than these? Believe it or not, I think yes. These may all come in a tie for second, but there is a specific type of belief that turns out to be so absurd that it has to win.

The Greatest Absurdity Possible: Self-Refutation

With the previous ideas I’ve mentioned, notice at least that none of the basic laws of logic disqualify them. There is no law of logic, say, that could tell us that the Holocaust did not happen. This is not because there is any real doubt that it did happen, but because all historical events require concrete physical evidence or human testimony to back them up, and this sort of evidence could in theory be mistaken. People could hypothetically lie or be misguided, and physical evidence could hypothetically be misleading or incomplete.

Of course, mere hypotheticals need not lead us astray. But could there be beliefs that can’t even hypothetically be true? Well certainly – an easy example would be the claim that two plus two makes five. That cannot be true, because in fact two plus two makes four, and four is not equal to five. It might seem, then, that a claim like “two plus two makes five” might be the height of absurdity.

I would say yes and no. Yes, this has the lowest possible likelihood of being true – namely it is completely impossible. Yet, I still think we can do something even worse. Let me give you an example of something I think is even more ridiculous. Consider the following sentence:

I don’t know how to write any words in English.

Notice something here. If somebody tries to tell me that two and two make five, I need to go to a different statement, namely the true statement that two and two make four, to refute them. With this odd statement, I don’t even need a new statement. They literally just wrote down words in English, which is the very thing they just claimed they could not do! In that sense, I don’t even need to correct them – they have already corrected themselves! Sentences like this one are called self-refuting claims. This is because the content of the sentence is in direct contradiction to something about the sentence itself. Here is another example:

There are no sentences in English that are longer than ten words.

That sentence is written in English and is twelve words long, but says that there cannot be any sentences in English that are more than ten words long. The sentence literally disproves itself! A claim that instantly disproves itself is at the height of absurdity – these are even more obviously incorrect than conspiracy theories.

Examples that People Actually Believe

Surely nobody actually believes any of these nonsensical ideas, right? Sadly, some people do. There are people who really do believe things that are self-refuting. Allow me to give a few examples I have heard uttered by actual people before.

All Religious Beliefs Are Equally True: The people who say this will normally reject almost everything I believe about religion in a heartbeat. Yet the sentence they just uttered would imply that my religious beliefs and theirs are on equal footing. See how they just contradicted themselves?

There is No Absolute Truth!: Ask yourself whether the claim “there is no absolute truth” is absolutely true or not. If it is not, why should I believe it? But if it is absolutely true, then there is at least one absolute truth, namely that very claim itself!

Science is the Only Way to Truth: Ask yourself whether “science is the only way to truth” can be scientifically proven to be true. Obviously not! No laboratory experiment or blackboard calculation could ever prove a thing like that. Therefore, if this claim is correct, then it is incorrect too! I think it goes without saying we cannot accept something so obviously mistaken.

Conclusion

I think these examples should be humbling to us all. You can probably think of people who believe something like this, or perhaps you do (or did) yourself until recently. Whether or not we as individuals have ever fallen prey to something so absurd on its face, we ought to be humble. We might someday fall prey to trickery and wind up believing something like this. We must have enough intellectual integrity and humility to be willing to look at what we believe and ask ourselves – do my beliefs stand up to the standards of my beliefs? The answer is surprisingly non-obvious. Do we criticize others for being selfish but retain selfishness within ourselves? Do we mock religious people – or non-religious people – for believing things without evidence when we believe things without any evidence ourselves? As flawed human beings, we often find ourselves falling into these traps. May all of us become better at avoiding these intellectually toxic pitfalls for the sake of the common good of humanity. If we can avoid these absurdities, maybe our disagreements can become more reasonable and society may profit from the fruits of more and more genuine conversations.

Critical Thinking Toolkit: Non Sequiturs

Whenever we are presenting reasons for a position we believe or are involved in dialogue with a person we disagree with, one of the most important things for us to do is to clearly lay out our reasons for what we believe. Built into this important criterion of conversation is the idea that our reasons actually do support what we say they support. This is where the idea of non sequiturs comes in.

Before we can discuss the nature of a non sequitur, we first need to do a brief review of formal logic.

What is Formal Logic?

I’ve discussed some formal logic in my discussion of mathematics, as mathematics heavily relies on formal logic. But for this article, we need not go through every single detail. The overarching idea is much more important. In philosophy, the notion of formal logic is the notion of “irrefutable lines of reasoning.” Something would be considered “irrefutable” if denying it basically requires you to completely give up your mental integrity. I’ll give two examples of formal arguments:

Argument 1

  1. If A is true, then B is true.
  2. A is true.
  3. Therefore, B is true.

Argument 2

  1. Either A is true or B is true.
  2. A is not true.
  3. Therefore, B is true.

Take a moment to think about those. Pick either one of the arguments. Suppose you already agree with (1) and (2) in that argument. Can you see that you then are forced to believe (3) as well? A person who agrees with (1) and (2) but disagrees with (3) is in denial. Perhaps rejecting either (1) or (2) is understandable, but you just can’t say that both (1) and (2) are true and that (3) is false. You just… can’t. It probably feels like I’m rambling on about nothing. In a sense, that is right. This is extremely basic. And that is the point I want to make. Formal logic is just the realm of things that are this basic.

Non-Sequiturs

We can now get to the main theme of the article. A non sequitur happens when somebody claims to be using formal logic but really is not. The name non sequitur comes from Latin, where it translates into English as “it does not follow.” For this reason, non sequiturs are also called formal fallacies, because they are mistakes in formal logic.

Examples of Non Sequiturs

There are many, many examples of non sequiturs. I am going to restrict myself to those that are, in a sense, most basic. When I say “most basic,” I mean those that use such simple language that, if you don’t pause to think about it, you might actually think they are valid. Here are some examples of non sequiturs:

Affirming the Consequent

  1. If A is true, then B is true.
  2. B is true.
  3. Therefore, A is true.

Denying the Antecedent

  1. If A is true, then B is true.
  2. A is not true.
  3. Therefore, B is not true.

Affirming a Disjunct

  1. Either A is true or B is true.
  2. A is true.
  3. Therefore, B is not true.

Denying a Conjunct

  1. At least one of A and B is false.
  2. A is false.
  3. Therefore, B is true.

These may look weird. That’s ok. Statements of formal logic, whether they are correct or not, look weird sometimes. I encourage my readers to find some ways to fill in A and B in these four examples where you agree with (1) and (2) but might disagree with (3). I’ll give you an example for “affirming the consequent.”

  1. If it is raining, the ground is wet.
  2. The ground is wet.
  3. Therefore, it is raining.

Although it is quite possible that all of (1), (2), and (3) are true, it is pretty easy to think of a situation where (1) and (2) are true and (3) is not true. Maybe it just stopped raining 5 minutes ago. Then the ground would still be wet, so (2) is right, and it is still quite reasonable to say that the ground is always wet when it is raining, so (1) is right as well. But it is not now raining, so (3) would be wrong. This is what it looks like for a series of statements to be a non sequitur – by claiming that (1) and (2) are true, it does not follow that (3) is true. (3) may or may not be true.

Things that are Not Non Sequiturs

There are such things as informal arguments, and these are not non sequiturs. For example, pretty much all of science relies on informal arguments called inductive arguments. Inductive arguments involve collecting lots of data and inferring that, were you to go collect more data, that data would agree with whatever patterns you find in the data you already have. If you try to make this into a formal argument, then it is a non sequitur. There is no undeniable, infallible rule of logic that dictates this way of thinking. But scientists do not claim to be using formal logic, and so they are not guilty of a non sequitur. This is important.

If you are going to understand non sequiturs, it is also important to understand what the rules of logic actually are. To see those basic rules, see my post here, which is an overview of the rules of logic.

Topics From “Precalculus”: Part 2 (Explaining Calculus #7)

Because of the directions I’d like to be able to go later on in this series, we need to make an aside about more topics that aren’t specifically calculus related. If you’ve made it this far, great for you! You’re now getting into the realm of calculus. This is a fun ride, and it is sometimes good to slow down and appreciate what we’ve done.

We now know how to compute slopes of curved shapes. Just as the slope of a line tells us a lot of information, so too the derivative of a graph carries important meaning. The discovery of calculus by such individuals as Newton and Leibniz, along with those precursors who came close like Pascal and Fermat, was one of the key components in the growth of science. Every single field of science you could imagine involves calculus in either simple or advanced forms as an indispensable component without which that science could not be done in its modern form. We are, to borrow the phrase Newton himself used at the end of one of his greatest works, playing on the shore, finding some nice looking shells, while the great ocean of calculus lies undiscovered beyond us.

Because of the importance of the methods of calculus, surely we want to bring calculus into everything we possibly can. This means, of course, expanding our list of functions beyond those discussed in the first “Topics from Precalculus” post in this series (see the link at the end for that article). In particular, we will lay out some basic trigonometry along with a review of exponential functions and their cousin, the logarithmic functions.

Exponential Function Review

Exponential functions were discussed in the first installment of precalculus topics, so I’ll make this one quick. An exponential function is one that looks like f(x) = b^x for a constant b. For positive whole numbers, exponentials are defined by repeated multiplication, so b^2 = b*b, b^3 = b*b*b, and so on. For negative whole numbers, exponentials are defined by division, so b^{-1} = 1/b, b^{-2} = 1/b^2, and so on. In light of the equation b^x b^y = b^{x+y} that is true for all exponentials, we can say b^0 = 1. If we want to do exponentials with fractional exponents, like b^{x/y}, we define b^{x/y} as the number that satisfies the equation (b^{x/y})^y = b^x. If you want to define exponentials for exponents that are irrational, say like b^{\sqrt{2}}, that is basically a matter of “smoothing out” the values you’ve already defined. This is too detailed to be worth talking about here.

A New and Important Number

While talking about exponentials, there is an important number that comes up a lot in calculus that we should go ahead and define. This number is often associated with the word natural because, in the world of calculus, using this number turns out to make things easier than they otherwise would be. This number is abbreviated e, and is usually defined by the rather seemingly-esoteric equation

e = \lim\limits_{n \to \infty} \bigg( 1 + \dfrac{1}{n} \bigg)^n.

You don’t really have to know all that much about e yet – we will bring up important facts about this very special number as we need them. The curious reader could do some research on e – there is a lot of very interesting material there. The key property of e for the purposes of calculus will emerge in the post after this one.
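
If you’re curious, here is a tiny, informal Python sketch that plugs larger and larger values of n into the expression above and watches it creep toward e; the particular values of n are arbitrary.

  # Watching (1 + 1/n)^n approach the number e as n grows (illustrative only).
  import math

  for n in [10, 1000, 100000, 10000000]:
      print(n, (1 + 1 / n)**n)
  print('math.e =', math.e)    # roughly 2.718281828...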

Logarithmic Functions

We have by now a pretty good intuition for what an inverse (or perhaps reverse) process is. In mathematics, the two most obvious examples are addition/subtraction and multiplication/division. If you add two, you can undo that by subtracting two. If you multiply by 2, you undo that by dividing by 2. In the same way, exponential functions also have “reverse functions”. In mathematical lingo, we normally call these “inverse functions,” but the word reverse serves the same purpose if it helps you understand better. The inverse function to the exponential function has the name of logarithm.

For every exponential function b^x, there is a logarithm function \log_b(x). To explain the key property of logarithms, it may help to use a simpler example first – addition and subtraction. I mentioned before that adding two and subtracting two are inverses. This means that the “inverse function” of f(x) = x+2 must be g(x) = x - 2. The way you can tell that these are “inverses” is that when you ‘execute’ one of these functions followed by the other, you wind up where you started. In math-terms,

f(g(x)) = f(x-2) = (x-2)+2 = x \text{ and } g(f(x)) = g(x+2) = (x+2)-2 = x.

Notice that we initially plug in x, and that is also exactly what we get out. This is what we mean by “inverses.” Then, in this way, the logarithm is the special function \log_b(x) for which b^{\log_b(x)} = x and \log_b(b^x) = x. Because of the extremely close relationship between logarithms and exponential functions, all the rules of exponents have counterpart rules for logarithms. Let me show how this works:

Fact: \log_b(xy) = \log_b(x) + \log_b(y).

Proof: First, remember that b^x = b^y only if x = y. So, to prove the identity we want, we can instead prove that b^{\log_b(xy)} = b^{\log_b(x) + \log_b(y)}. Using the foundational property of logarithms, we know immediately that b^{\log_b(xy)} = xy. The same fundamental property, along with the rule b^{x+y} = b^x b^y for exponentials, leads us to conclude that b^{\log_b(x) + \log_b(y)} = b^{\log_b(x)} b^{\log_b(y)} = xy. Combining everything we’ve done shows that the original equation must be true.

By using the same essential strategy, we can prove that \log_b(a^x) = x \log_b(a) for any numbers a,x. This is the equivalent of the rule (b^a)^x = b^{ax}. For every rule of exponents, you can find a similar rule for logarithms. That is all we need to know about logarithms in general for now.
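
Here is a small, informal Python check of these two logarithm rules, with the arbitrary choices b = 2, x = 3, and y = 5. Python’s math.log takes the base as an optional second argument.

  # Quick numeric check of the logarithm rules (illustrative only).
  import math

  b, x, y = 2.0, 3.0, 5.0
  print(math.log(x * y, b), math.log(x, b) + math.log(y, b))   # log_b(xy) rule
  print(math.log(x**y, b), y * math.log(x, b))                 # log_b(x^y) rule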

The last thing we need is to briefly comment on a special logarithm. The logarithm associated to the special number e from earlier is called the natural logarithm. We could write it as \log_e(x), but it comes up so frequently that very often mathematicians just use \ln(x) or \log(x) as shorthand. Thus, if you see a logarithm without a base, the implied base is the number e.

Trigonometric Functions

Here, we have to delve a bit into the geometry of triangles and circles. Recall the Pythagorean theorem – that for a right triangle with sides a \leq b < c the equation a^2 + b^2 = c^2 is always true. Trigonometry arises from this equation. In particular, if we divide both sides by c^2, then we conclude that

1 = \dfrac{a^2}{c^2} + \dfrac{b^2}{c^2} = \bigg(\dfrac{a}{c}\bigg)^2 + \bigg(\dfrac{b}{c}\bigg)^2.

In geometry class, the ratios of the legs a and b to the hypotenuse c are called the sine and cosine of the corresponding angles. The ratio between b and a is called the tangent. Instead of using ratios, I’ll define the sine and cosine in a slightly different way.

Set up the unit circle – that is, a circle with center at the origin point (0,0) and radius equal to 1. Pick out the point (1,0) on the circle as the “starting point.” Now, imagine rotating around the circle, counterclockwise, along x units of the perimeter. Then we define the functions \sin{x}, \cos{x} so that the point you land on has the coordinates (\cos{x}, \sin{x}). This is actually the same definition as the fractions used before, with the added twist that side lengths that go left or down are given a negative sign and those that go right or up have positive sign.

For example, rotating a length of \dfrac{\pi}{6}, which is 30 degrees, the resulting triangle has a base of \dfrac{\sqrt{3}}{2} and a height of \dfrac{1}{2}. This means that \sin{\dfrac{\pi}{6}} = \dfrac{1}{2} and \cos{\dfrac{\pi}{6}} = \dfrac{\sqrt{3}}{2}. If you had instead rotated 30 degrees the other direction, we would find that \sin{\dfrac{-\pi}{6}} = \dfrac{-1}{2} and \cos{\dfrac{-\pi}{6}} = \dfrac{\sqrt{3}}{2}.

We won’t have much occasion for actually calculating values of these functions, so we won’t do too much of that. There is really only one fact that will be helpful to point out (so we can use it to calculate a derivative later). We want to be able to calculate \sin{(x+y)} and \cos{(x+y)}. Here is how we do that:

Fact: For any values of x,y, we have

\sin{(x+y)} = \sin{(x)}\cos{(y)} + \sin{(y)} \cos{(x)}

and

\cos{(x+y)} = \cos{(x)}\cos{(y)} - \sin{(x)}\sin{(y)}.

The proof of this fact comes from geometry; it isn’t terribly complicated, but it would take us too far astray, so I won’t prove it here. The only other thing we must talk about in passing are the other trigonometric functions, all of which are built out of sine and cosine. These are the tangent, cotangent, secant, and cosecant. They are defined, in that order, below:

\tan{x} = \dfrac{\sin{x}}{\cos{x}}, \hspace{0.3in} \cot{x} = \dfrac{\cos{x}}{\sin{x}}, \hspace{0.3in} \sec{x} = \dfrac{1}{\cos{x}}, \hspace{0.3in} \csc{x} = \dfrac{1}{\sin{x}}.

Along with the identity \sin^2{x} + \cos^2{x} = 1 (which is just the Pythagorean theorem), this is all we need from the realm of “pre-calculus” topics in order to move forward.
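
For readers who would like to see these facts numerically, here is a short, informal Python sketch that checks the two addition formulas and the Pythagorean identity at a couple of arbitrary angles of my choosing.

  # Quick numeric check of the angle-addition formulas and sin^2 x + cos^2 x = 1.
  import math

  x, y = 0.8, 1.9
  print(math.sin(x + y), math.sin(x) * math.cos(y) + math.sin(y) * math.cos(x))
  print(math.cos(x + y), math.cos(x) * math.cos(y) - math.sin(x) * math.sin(y))
  print(math.sin(x)**2 + math.cos(x)**2)    # prints 1.0 (up to rounding)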

References

First “Precalculus” Post: https://mathematicalapologist.com/2020/09/20/topics-from-pre-calculus-explaining-calculus-1/

Critical Thinking Toolkit: The Ad Hominem Fallacy

This one really should go without saying… and yet we need to say it anyway. So often, public discourse gets bogged down in personal insults of one variety or another. Whether the attacks are on a person’s morality, integrity, honesty, educational background, or any other aspect of life, our culture – especially the so-called “cancel culture” – is saturated with efforts to discredit people based on perceived wrongs. On the level of intellectual argument, however, this practice is completely unacceptable, and it has a name.

Some Obvious Ad Hominem Fallacies

Arguments of this sort are called arguments ad hominem, or ad hominem fallacies. To show as clearly as possible the nature of an ad hominem fallacy, I pose the following dilemma to you.

Hitler believed that 2+2=4. Do you really want to agree with Hitler?

Our emotions might recoil here – we know Hitler was an evil man. But, once the emotional reaction clears up, this question is completely silly. Hitler’s being evil has nothing at all to do with whether or not he was right when he said that 2+2=4. That is irrelevant. We can construct a similar-but-opposite example. What if someone walked up to you and said:

Martin Luther King, Jr. believed 2+2=5. How dare you disagree with him! Are you racist or something?

Clearly, the issue here is the same. MLK was a great man, but I can completely agree with MLK’s goals for social reform without agreeing that 2+2=5. This is not to say that MLK actually believed 2+2 was 5 – surely he knew better. But this example makes a point – just because someone is virtuous doesn’t mean you should believe everything they say.

Less Obvious Examples of Ad Hominem

After looking at such obvious examples, it would be natural to think that intelligent people never stoop to this level. And yet, it happens more than people think.

Sadly, this is the majority of political discourse I see today. Those on the left generally paint all conservatives as racist, or at least as not caring about issues of racism, and on that basis (and other similar derogatory remarks) dismiss the entire conservative viewpoint. Conservatives quite often use labels of socialism and communism in similar ways. It isn’t that there aren’t racist conservatives or that there aren’t liberals who genuinely do want socialism or communism – there are. But you can’t just call someone a nasty name and then dismiss them altogether. That’s just not how things work. You’d be laughed out of any intelligent discussion if you took that approach. Sadly, this is largely the state of our world now. By and large, humanity does not know how to dialogue without resorting to insults. This is not surprising – we are never taught how to do so. But it is also extremely disappointing.

But ad hominem doesn’t just arise in moral situations. It also arises on the level of intellect. For instance, a statement like “well I don’t have to listen to you because you don’t have a Ph.D” is an ad hominem fallacy. Whether or not the person you are talking to has a Ph.D, they might be right or they might be wrong. That just isn’t relevant. Rational debate must focus not on the people, but on the subject matter. General relativity, whether stated from the mouth of Hitler or Mother Theresa, whether by Einstein or a confused third-grader reading off of a page, is equally true or equally false. None of those factors matter.

You’d think people don’t do this, but they do – I have been in situations where people responded to points I made by accusing me of lying about my credentials, which is a type of ad hominem attack. In fact, accusing someone of lying is always an ad hominem attack. You can’t prove someone is incorrect by accusing them of lying. Rather you must show in an objective way that the person is either intentionally lying or honestly mistaken.

Conclusion

Writing this post at all leaves a bad taste in my mouth. I don’t even want to speak with people who resort to those tactics. The thing to learn from all of this is that we must always be very careful to ensure that, when we are in a debate, we are focused not on the person, but on their words and whether or not the ideas involved are true or false.

Computing Derivatives: Part 1 (Explaining Calculus #6)

In the previous post in this series, we set up a definition of the derivative of a function, which is a new function that tells us how the original function changes over time. Now that we have set up this idea of derivatives, we are going to enter into a period of showing how to work with derivatives. After all, if we ever want to make use of this idea, we need to know how to actually calculate these things.

This will take up two posts, which are basically ordered by difficulty. In this post, I’ll focus on a subset of the foundational rules of derivatives, mainly the ones that are easier to grapple with. In the follow-up to this post, we will delve into some equally foundational rules that are a bit harder to discover.

Some General Rules for Derivatives

The opening of this post covers two rules about derivatives that apply to all functions (well, all functions that have derivatives… which is quite a lot of functions and almost every function you’ll ever hear of).

The “Distributive Law” for Derivatives

This first piece can be thought of as a distributive law. Another way you might think about it is that you can do derivatives “one piece at a time.” If you have a function like x + x^2, this rule tells you that you can find its derivative by using the derivatives of x and x^2 in a straightforward way.

Fact: If f(x), g(x) are functions with derivatives, then \dfrac{d}{dx}[f(x) \pm g(x)] = f^\prime(x) \pm g^\prime(x).

Proof: The definition of derivatives tells us that

\dfrac{d}{dx}[ f(x) + g(x) ] = \lim\limits_{h \to 0} \dfrac{(f(x+h) + g(x+h)) - (f(x) + g(x))}{h}.

We can split up this limit into two limits:

\lim\limits_{h \to 0} \dfrac{(f(x+h) + g(x+h)) - (f(x) + g(x))}{h} = \lim\limits_{h \to 0} \dfrac{f(x+h) - f(x)}{h} + \lim\limits_{h \to 0} \dfrac{g(x+h) - g(x)}{h}.

The two pieces at the end of this equation are just f^\prime(x) and g^\prime(x). This means that the derivative of f(x) + g(x) is f^\prime(x) + g^\prime(x).

The proof works exactly the same way for f(x) - g(x), so we are done with this proof.

The “Coefficient Law” for Derivatives

This next rule is also quite straightforward. This rule tells us that coefficients don’t really play much of a role when calculating derivatives. You can essentially just pretend they aren’t there and put them back in when you are done. As an example, to calculate the derivative of 123 x^4, you can just calculate the derivative of x^4 and, when you are done, multiply the answer by 123.

Fact: If f(x) is a function and c is a constant number, then

\dfrac{d}{dx}[c f(x)] = c f^\prime(x).

The proof for this fact is pretty similar to the one for the distributive law of derivatives. For anyone who wants some practice with these ideas, this would be a good example to work on.
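
If a numerical illustration helps in the meantime, here is a quick, informal Python sketch checking both laws at once. The helper numderiv and the sample point are arbitrary choices of mine, and the expected answers (1 + 2x and 123 \cdot 4x^3) use the power-rule results we will establish just below.

  # Quick numeric check of the distributive and coefficient laws (illustrative).
  def numderiv(func, x, h=1e-6):
      return (func(x + h) - func(x - h)) / (2 * h)

  x = 0.9
  # Distributive law: the derivative of x + x^2 matches 1 + 2x.
  print(numderiv(lambda t: t + t**2, x), 1 + 2 * x)
  # Coefficient law: the derivative of 123 x^4 matches 123 * (4 x^3).
  print(numderiv(lambda t: 123 * t**4, x), 123 * 4 * x**3)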

Some Specific Functions with Derivatives

We’ve just finished discussing some basic rules of how derivatives work. We’ll now spend a little bit of time talking about specific functions that are important and how their derivatives are calculated.

Derivative of a Constant

The easiest of all functions to take the derivative of is a constant function. Since these functions never change, you can think of them as horizontal lines. Because of the viewpoint of derivatives as slopes or as capturing changes over time, you’d expect the derivative of something that never changes to be zero (zero reflecting the amount of change the function undergoes). In fact, this is true.

Fact: If f(x) = c is a constant function, then f^\prime(x) = 0.

The proof of this fact is not hard, and would be a good exercise for those who want to practice doing such calculations.

Derivative of a Polynomial

One of the most important kinds of functions we talk about in mathematics is the polynomial, which is a sum of expressions like ax^n, for a positive whole number n and some kind of constant value a. Notice that, because of the coefficient law and distributive law mentioned earlier, we can treat polynomials one term at a time. For instance, if we want to compute the derivative of f(x) = 3x^2 + 7x - 4, the distributive law tells us that we only need to know how to take the derivatives of 3x^2, 7x, and 4. The coefficient law then tells us that we only need to know how to take the derivatives of x^2, x, and 4. We already know from earlier that the derivatives of constants are zero, so we only need to know how to find the derivatives of x^2 and x. In fact, what we are actually going to do is to find the derivative of all members of the list x, x^2, x^3, x^4, \dots all at once. But, before I do this, the example of x^2 will be helpful.

Fact: If f(x) = x^2, then f^\prime(x) = 2x.

Proof: We can actually just do this directly using the definition of derivatives. We begin with the definition:

f^\prime(x) = \lim\limits_{h \to 0} \dfrac{f(x+h) - f(x)}{h} = \lim\limits_{h \to 0} \dfrac{(x+h)^2 - x^2}{h}.

We can simplify the inside of this limit using some algebra:

\lim\limits_{h \to 0} \dfrac{(x+h)^2 - x^2}{h} = \lim\limits_{h \to 0} \dfrac{(x^2 + 2xh + h^2) - x^2}{h} = \lim\limits_{h \to 0} \dfrac{h(2x+h)}{h}.

Usually, algebra doesn’t allow you to cancel out the h on top and bottom, because the cancellation is invalid when h = 0. But, since we are inside of a limit in which h \to 0, we can assume that h is never actually zero (it just approaches zero). This means we actually can cancel. So now,

f^\prime(x) = \lim\limits_{h \to 0} (2x+h) = 2x + 0 = 2x,

which we evaluate in this way because 2x+h is continuous at h = 0, so you are allowed to plug in zero for h. So, the derivative of x^2 is 2x. This completes the proof.

The idea behind the derivative of x^n is basically the same. The difference occurs in the algebraic simplification of the term (x+h)^n that occurs in the limit. Here, we lay out exactly how to handle this difference.

Fact: If f(x) = x^n, then f^\prime(x) = n x^{n-1}.

Proof: The definition of f^\prime(x) tells us that f^\prime(x) = \lim\limits_{h \to 0} \dfrac{f(x+h) - f(x)}{h} = \lim\limits_{h \to 0} \dfrac{(x+h)^n - x^n}{h}. The key to simplifying this expression is in figuring out the term (x+h)^n. We can do this using the method of foiling. For now, let’s just assume we have done this, and that we’ve found that

(x+h)^n = x^n + a_1 x^{n-1} h^1 + a_2 x^{n-2} h^2 + \dots + a_{n-2} x^2 h^{n-2} + a_{n-1} x h^{n-1} + h^n.

Notice then that (x+h)^n - x^n = a_1 x^{n-1}h + h^2 g(x), where g(x) holds all the leftover terms that have at least two powers of h in them (strictly speaking, g also involves h, but that won’t cause any trouble as h \to 0). Knowing this, we can compute the limit from earlier.

f^\prime(x) = \lim\limits_{h \to 0} \dfrac{a_1 x^{n-1} h + h^2 g(x)}{h} = \lim\limits_{h \to 0} [a_1 x^{n-1} + h g(x)] = a_1 x^{n-1}.

This means that we actually only need to figure out a_1, since it is the only part that actually influences anything. But think now about what a_1 was. This is the number of ways that foiling the expression (x+h)^n can give you a term like x^{n-1}h. Well, think about the different places that singular h might have come from. There is one for each (x+h) in the product, of which there are n. Ergo, there are exactly n different ways we could have obtained x^{n-1}h. Therefore, a_1 = n, and if we combine everything we know now, we conclude that f^\prime(x) = n x^{n-1}. This marks the end of our proof.

Notice that in this proof, we’ve assumed that the number n was a positive whole number. The result \dfrac{d}{dx}[x^\alpha] = \alpha x^{\alpha-1} is actually true for any real number \alpha at all, but this requires a much more sophisticated understanding of what x^\alpha means, and so I will not try to prove that here. But, the reader should know that this is true.

Example: As an example, we compute the derivative of the function f(x) = 3x^2 + 7x - 4 mentioned earlier:

f^\prime(x) = 3 \dfrac{d}{dx}[x^2] + 7 \dfrac{d}{dx}[x] - \dfrac{d}{dx}[4] = 3(2x) + 7(1) - 0 = 6x + 7.
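
As a last sanity check, here is a tiny, informal Python sketch comparing a difference-quotient estimate of this polynomial’s derivative against 6x + 7 at a few arbitrary points.

  # Quick numeric check that the derivative of 3x^2 + 7x - 4 behaves like 6x + 7.
  def numderiv(func, x, h=1e-6):
      return (func(x + h) - func(x - h)) / (2 * h)

  for x in [-1.0, 0.0, 2.5]:
      print(numderiv(lambda t: 3 * t**2 + 7 * t - 4, x), 6 * x + 7)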

Conclusion

We’ve now seen how to evaluate some derivatives. Notice the key principle we needed – we needed to cancel out terms on the top and bottom of a fraction. This is the key principle in evaluating derivatives using the limit definition. Any time you do a derivative in this way, you will be looking to cancel out terms in exactly the same way I have here.

We have now covered some of the easier to see ways in which derivatives can be calculated. In my next post, we will look into some trickier and even more useful rules that derivatives follow. If you want to have an exercise to work on your calculus intuition, convince yourself that all functions that have derivatives are also continuous. Also, think of examples of graphs that have special points that don’t have a tangent line.