Higher Order Derivatives and Their Applications (Explaining Calculus #12)

Up to this point, I’ve focused my efforts on derivatives of functions and what those derivatives mean. In particular, derivatives tell us about how things change over time – with derivatives, we can measure quantities like speed and growth. But derivatives are also functions, which means they have their own derivatives. Can we learn anything from these “derivatives of derivatives”? The answer is yes – and this is the topic for discussion here.

What are Higher Derivatives?

Recall briefly the definition of a derivative – for any function f(x), its derivative f^\prime(x) is defined by

f^\prime(x) = \lim\limits_{h \to 0} \dfrac{f(x+h) - f(x)}{h}.

I won’t actually need to use this definition here. But it is important for this discussion to remember that, in the first place, a derivative is a certain type of function. We begin with a function f(x), and we apply the limiting process above to f(x) to obtain the derivative function f^\prime(x).

The second-order derivative of the function f(x), usually just called the second derivative of f(x), can be found by applying this process twice – that is, applying it to f(x) to get f^\prime(x), then applying the same process again to f^\prime(x) to get another function, which we call f^{\prime \prime}(x) (two of the \prime because you take the derivative twice). If you take even more derivatives, we generally call those higher derivatives.

How to Calculate Second Derivatives

We can use the formula for derivatives to find second derivatives. Since the second derivative f^{\prime \prime}(x) is the derivative of the derivative, we can apply the definition of the derivative to f^\prime(x) instead of to f(x). So,

f^{\prime \prime}(x) = \lim_{h \to 0} \dfrac{f^\prime(x+h) - f^\prime(x)}{h}.

We can briefly calculate a few second derivatives to show how this works. For a fairly simple one, set f(x) = x^3 + 2x. Using the “power rule” for derivatives, the first derivative of f(x) is f^\prime(x) = 3x^2 + 2. By applying the process again, we can find the second derivative, f^{\prime\prime}(x) = 6x. As another example, we could choose the more complicated g(x) = \ln{x}. In this case, the first derivative is g^\prime(x) = \dfrac{1}{x}. We can then use the power rule to find the second derivative, g^{\prime\prime}(x) = \dfrac{-1}{x^2}.
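To build a little extra confidence in these results, we can also check them numerically. Below is a minimal sketch (my own illustration, not part of the original posts) that approximates a second derivative with a central difference quotient and compares the output against the formulas above.

```python
import math

def second_derivative(f, x, h=1e-4):
    """Approximate f''(x) using a central difference quotient."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

f = lambda x: x**3 + 2 * x        # we computed f''(x) = 6x
g = lambda x: math.log(x)         # we computed g''(x) = -1/x^2

print(second_derivative(f, 2.0))  # close to 6 * 2 = 12
print(second_derivative(g, 2.0))  # close to -1 / 2^2 = -0.25
```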

The Meaning of the Second Derivative

Now, for the purposes of my writing here, I don’t care that much about calculating lots of second derivatives. I’m much more concerned with what sorts of things you can do with second derivatives. Just as the first derivative has built into it the ability to tell us how to maximize and minimize things and how to calculate speeds, the second derivative also carries very useful built-in information. I’ll list out a few of these.

Application #1: Concavity

The most direct piece of information you can learn from second derivatives is called the concavity of a graph. Concavity describes the way a graph bends. As an example, the U-shape “bends upward” as you trace along the figure, and we call that “concave up”. When you work out the details, you can convince yourself that whenever f^{\prime\prime}(x) > 0, the graph of f(x) is concave up. This is because the idea of “bending upward” can be translated as “the slope is getting steeper and steeper in the up direction”. This second way of phrasing it talks about a slope (think first derivative) changing over time (so take another derivative). In much the same way, an upside-down U-shape has the visual feature of “bending downwards,” and in math we call that “concave down”. You can follow the same lines of reasoning as with “concave up” graphs to see that if f^{\prime\prime}(x) < 0 then the graph of f(x) is concave down.
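As a quick worked example (my own, for illustration), take f(x) = x^3. Then f^{\prime\prime}(x) = 6x, which is negative when x < 0 and positive when x > 0. And indeed, the graph of x^3 bends downward to the left of the origin and upward to the right of it – the sign of the second derivative tracks the bending exactly.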

Application #2: Finding Max/Min Values

We can make use of this concavity idea to devise an entirely new method for finding max and min values of graphs. To explain this, let’s first give a brief reminder of how the first derivative enabled us to do this.

If we find the peak of a graph, that peak is definitely a so-called critical point – which just means f^\prime(x) = 0 there. Intuitively, this is like saying the top of a mountain is either flat or a sharp point, because if it were sloped then it wouldn’t be the top. So, if we want to know the max and min values of f(x), the first thing we can do is to solve the equation f^\prime(x) = 0. Once we find the solutions, we can then ask ourselves what happens to f^\prime(x) near those points. If, for example, we study the values of f^\prime(x) and find that the graph is going up right before it gets to a certain point, then becomes flat, then starts going downward after we pass by that point, then we can visualize this as a peak in the graph. If instead the graph was going down towards a point, then flattened out, then started climbing upward, then we’ve found a valley rather than a peak.

The second derivative gives us a more straightforward way to handle the “going up/going down” aspect of this problem. Take, for example, the case of finding a peak. The description “going up, flatten, going down” that I gave describes an upside-down U-shape. But we now know that second derivatives tell us whether a graph has an upside-down U-shape! If we know f^{\prime\prime}(x) < 0, then this is the same thing as the graph having an upside-down U-shape. So, instead of doing the tedious work of evaluating lots of first derivative values near the critical point we care about, we can just evaluate f^{\prime\prime}(x) at the actual critical point itself and get the same information. In the exact same way, if f^{\prime\prime}(x) > 0 at the flat point on the graph, then you get a normal U-shape and you have a valley in your graph.
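Here is a small example of this “second derivative test” in action (my own illustration). Take f(x) = x^3 - 3x. Then f^\prime(x) = 3x^2 - 3, which is zero at x = 1 and x = -1. Since f^{\prime\prime}(x) = 6x, we get f^{\prime\prime}(1) = 6 > 0, so x = 1 sits at a valley, and f^{\prime\prime}(-1) = -6 < 0, so x = -1 sits at a peak – no tedious study of nearby first derivative values required.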

So, in almost every situation, using the second derivative actually makes finding peaks and valleys easier. The only hitch is that if you get unlucky and f^{\prime\prime}(x) = 0 exactly, then you can’t tell one way or the other. If that unlucky event happens, you would need to go back to the old trick. But this is a very rare situation. There are technical reasons why the zero points of f^\prime(x) and f^{\prime\prime}(x) are nearly always different from one another. (If you want to challenge yourself, pick some random equations of the shape f(x) = x^3 + ax^2 + bx + c and see if you can find one where f^\prime(x) and f^{\prime\prime}(x) have a zero at the same place. You likely won’t find one by randomly guessing – you’ll have to be a bit clever. You’ll need to make use of an idea called “multiple zeros” or “repeated zeros”.)

Application #3: Acceleration and Gravity

When I first explained why derivatives are so important, I described how you can use a derivative to measure speed – how fast something is going. I would now like to extend this interpretation of derivatives to acceleration, and to show how this interpretation contributed to one of the most important developments in the history of science.

We recall that the derivative of something is a description of how that thing changes over time. If we take this a little too rigidly, without interpreting anything, the second derivative would tell us “how the ‘change over time’ changes over time”. That isn’t very helpful. But if we use the interpretation of the first derivative as a speed or velocity, then this sheds light on the second derivative: it is a description of how the speed of an object changes over time. But we already have a word for that – acceleration! So, if we have some sort of function describing where something is, then its second derivative tells us how that body is accelerating.

Acceleration is an incredibly important concept in physics. One of the great discoveries in the era of Galileo was that heavy objects and light objects accelerate exactly the same way when dropped from a height. Isaac Newton was able to do even better – and this is the application I’d like to talk about.

Newton’s theory of gravity can basically be reduced to two pieces. First, there is his theory of force – that F = ma. This reads as “force equals mass times acceleration”. So, acceleration plays a big role. The second component describes how to calculate the force of gravity: F = G \dfrac{m_1 m_2}{r^2}, where G is a constant value, m_1, m_2 are the masses of the two objects we are considering at the moment, and r is the distance between them (more specifically, between their ‘centers’). When you put these together, you find that near the surface of the Earth, the acceleration that gravity causes is a constant number.
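To see why (this is a sketch of the standard computation, not spelled out in the original post): if m is the mass of a falling object and M and R are the mass and radius of the Earth, then setting the two formulas for force equal to each other gives

ma = G \dfrac{mM}{R^2}, \quad \text{so} \quad a = \dfrac{GM}{R^2} \approx 9.8 \text{ m/s}^2.

The mass m of the falling object cancels out completely – which is exactly Galileo’s discovery that heavy and light objects accelerate the same way.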

This is huge, because you can translate this sentence into a calculus equation. If f(t) is a function that tells you where some object I’ve thrown into the air happens to be at any given moment, then Newton’s theory tells me that f^{\prime\prime}(t) is a constant number. In calculus, I can now ask myself the following question:

What sorts of functions have the property that f^{\prime\prime}(t) is a constant number?

Thus, we can see that calculus plays a key role in the theory of gravity. We actually haven’t developed enough calculus yet to solve this (we know how to take derivatives, but this question requires undoing derivatives). However, we will very soon be able to do exactly this. The process of reversing derivatives is one of the most important parts of calculus – and it is the topic we will be addressing next in this series.
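Although we can’t yet undo derivatives, we can at least verify a candidate answer with the tools we already have (this check is my own addition). If a is the constant acceleration and we guess

f(t) = \dfrac{a}{2} t^2 + v_0 t + x_0,

where v_0 and x_0 are the starting speed and position, then the power rule gives f^\prime(t) = at + v_0 and f^{\prime\prime}(t) = a – a constant, just as Newton’s theory demands. The harder question – showing that these are the only such functions – is exactly what reversing derivatives will let us answer.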

Conclusion

Not only is the first derivative of a function useful – so is its second derivative. We can use these second-order derivatives to make calculations regarding gravity and to solve optimization problems – both of which are extremely important for engineering and all sorts of practical applications.

Database: Critical Thinking Toolkit

Up to this point, my database posts have been about compiling data from external sources. This database post is more organizational in nature. My goal here is to systematize the posts I’ve written in my series “Critical Thinking Toolkit”. The purpose of the series was to provide summary approaches to various areas of critical thinking that are not often taught or recognized. What I have done here is to take every post I’ve made in this series and sort them into different categories of related types of tools in the toolkit. I’ve also tried to add very brief commentary that can help a reader find what they are looking for. I hope this resource will be helpful!

Introduction: Thinking Clearly and Critically

Types of Arguments

There are three types of argument here. Deductive arguments are those that carry absolute certainty via basic rules of logic. Inductive arguments reason from specific to general, as in the scientific method (experimental science specifically). Inference to the best explanation is the kind of reasoning used by detectives and forensic scientists.

Analyzing Arguments

These posts all have to do with how to respond to, approach, and analyze arguments.

Logical Fallacies

These posts deal with common flaws people often miss in their own reasoning as well as in public discourse.

Modal Logic

Modal logic is the proper way of reasoning with the notions of possibility, contingency, and necessity.

Having a Productive Conversation

This is a mix of different posts I think are particularly helpful for having productive conversations.

Epistemology

These posts all have to do with what we mean when we say we “know” something.

(Last Updated on 1/30/2020)

Summary of “Advice to Christian Philosophers” by Alvin Plantinga

Academia can be a wild place. Despite what people might think, there are often entire paradigm shifts between generations in academic circles. This has happened in philosophy – just two generations or so ago, things were wildly different, and things now are different from one generation ago. I now want to summarize famed Christian philosopher Alvin Plantinga’s article in the journal Faith and Philosophy [1] that mentions these changes and gives sound advice to philosophers living and working in the midst of that change.

Keep in mind that this article was written in 1984. Thus, when I write now, I mean the timeframe in which the author wrote the article – some parts of the article reference how things are right now, so to speak, and I’ll use now in the context of 1984, just as the author did. At the end, I’ll speak to the developments referenced by the author, and my impression of the extent to which Plantinga’s advice has or hasn’t been followed in the modern day… at least as far as I can determine from my perspective.

For the article I am summarizing, see [1]. Pretty much everything from the ‘summary’ section is my paraphrase of the main ideas I take from Plantinga’s article.

Summary of Plantinga’s Main Points

Christianity has been on the rise in academia. In the 1940s and 1950s, pretty much all of academic philosophy was profoundly anti-Christian in its views. Not in the discriminatory sense, but in that it rejected probably every single truth a Christian believes. At this time, philosophers largely worked within a framework called logical positivism. We don’t need to go all the way into what this framework was, but it will suffice to say that on this view, sentences like “God exists” were treated as literal nonsense. The claims of religions like Christianity and Islam that “there is such a being as God” were not even given the dignity of a claim – they were treated as a meaningless combination of words. But now (~30 years later) things are different. There are large philosophical societies dedicated to Christian philosophy, filled with openly Christian philosophers working both at Christian universities and secular universities. Christianity is on the rise in academia.

But just because it is on the rise doesn’t mean it has arrived at a reasonable destination yet. This new generation of Christian thinkers is, so to speak, the first generation. We have only taken a few steps, and we still have much to learn about operating in the current American/Western culture. We will look at a few ways that Christian philosophers are still being treated differently from, or at least thinking differently from, secular philosophers and how that ought to change. Also, it is worth noting that this advice could apply equally well to Muslim, Jewish, or deistic philosophers, but here we will discuss Christianity specifically.

To use Plantinga’s own words, most of academic philosophy has “next to nothing to offer the student intent on coming to see how to be a Christian in philosophy – how to assess and develop the bearing of Christianity on matters of current philosophical concern, and how to think about those philosophical matters of interest to the Christian community.” [1, pg. 254]

Plantinga has two suggestions for Christian philosophers, summarized in the following quotation:

“First, Christian philosophers and Christian intellectuals generally must display more autonomy – more independence of the rest of the philosophical world. Second, Christian philosophers must display more integrity – integrity in the sense of integral wholeness, or oneness, or unity, being all of one piece… And necessary to these two is a third: Christian courage, or boldness, or strength, or perhaps Christian self-confidence. We Christian philosophers must display more faith, more trust in the Lord; we must put on the whole armor of God.” [1, pg. 254]

We Must Be More Autonomous

Why should these even need to be discussed? Again, Plantinga provides the explanation. In his words,

“Philosophy is a social enterprise; and our standards and assumptions – the parameters within which we practice our craft – are set by our mentors and by the great contemporary centers of philosophy.” [1, pg. 255]

In other words, we tend to think in the same ways as those who taught us. This of course has its benefits – after all, those who came before us were smart people – but we should also have the ability to break through those conceptual and psychological barriers when necessary.

The inability to get over this boundary is very harmful. Just like any other group, Christians long to understand the world through the lens of what they already believe. And Christian philosophers have a crucial role in helping the rest of the church to learn to think about the world around them. This isn’t a call for Christians to retreat from important philosophical issues of the secular world – far from it, Christians should be involved in those. But this does mean that Christians should be involved in issues of special importance to Christians. Not in an isolationist sense though.

Consider politics as an analogy. If you follow a particular politician, you will find them engaged in (at least) two main lines of work. Firstly, you will find them debating and working with members of opposing political parties. Secondly, you will find them debating and working with members of their own political party. Both of these tasks are important. For instance, if you are a Democratic senator, then part of your job is to work alongside Republicans as best you can to get good things done for the country, and part of your job is to work within the Democratic party itself to bring the whole party in the direction you think it should go.

In the same way, Christian philosophers can and should work both within secular philosophy (analogous to working alongside all politicians, friend or foe) and should work on specifically Christian matters (analogous to working alongside your own political party). For example, shouldn’t we reflect more on God? Who is God, and what is He like? What does the Bible tell us, and what doesn’t it tell us? How can we think about all these issues? What of the story of salvation, with Jesus dying for our sins? Is this fair? How do we understand this story? These are all discussions where Christians can learn greatly from a Christian philosopher who has thought about these questions in light of their education in philosophy.

We Must Be More Unified

This one is fairly simple, and follows out of the first. There shouldn’t be debate within the realm of Christian philosophers regarding whether or not Christian philosophy should be done. Furthermore, there ought not to be disagreement between Christian philosophers on the absolute broadest of issues in the areas where we disagree with most secular perspectives. Christian philosophers should, in this sense, be supportive of one another. Not that all Christian philosophers must agree on everything – just like secular philosophy, there will always be disagreements. But there ought to be a fundamental level on which Christian philosophers are unified. Just as the body of Christ is unified on the fundamental level, so ought Christian philosophers be unified.

This would equally apply to any other community – say Mormon philosophers or Muslim philosophers. Both the previous points are natural consequences of what it means to be part of a family – for Christian theology clearly teaches us that all who follow Christ are brothers and sisters. I’m sure Mormonism and Islam have similar ideas of community between believers, even if they don’t use the language of family.

We Must Have More Courage

This point flows out of the first two. Both of the first two points require disagreeing with the ‘mainstream’. Just as in any other break from the mainstream, a certain kind of courage is required here. The word strength could be equally applicable. When someone takes this step, resistance on both intellectual and social levels is sure to follow. That’s just how society is. But since the Christian holds his convictions dear to his heart as the truth, he is justified in holding to those convictions and standing by them publicly, even in academic work.

An Example: Verificationism

This has to do with the “Verifiability Criterion of Meaning”, as it is often called in philosophy. The logical positivists of forty or so years ago claimed that if a sentence is not verifiable using your senses (usually just the five) or elementary logic, then the sentence has literally no meaning. According to this “criterion”, sentences like “God created the universe” aren’t true or false at all – they literally don’t mean anything. The person who says “God created the universe” might as well have picked totally random words out of a dictionary. It is like saying “sell qualification climate miss” (a randomly generated set of words) – it just doesn’t mean anything at all. Or, to use a nonsense line from Lewis Carroll: “’Twas brillig, and the slithy toves did gyre and gimble in the wabe”. According to the logical positivists, all sentences about religion are as meaningless as that.

This position certainly carried a very academic air to it – it feels sophisticated and modern on an emotional level. And yet, as a Christian philosopher, it is completely unacceptable. The Christian philosopher is, in the first place, fully entitled to think about the nature of God even if other people think they are speaking nonsense. Secondly, the Christian philosopher has the right to point out that sentences like “God created the universe” or “God exists” obviously have meaning – the Verifiability Criterion is just obviously wrong. Why should a Christian cave to the opinion of the non-Christian masses if that opinion is obviously wrong? By now, this viewpoint has been thrown into the trashcan of history – and rightly so – but why weren’t more Christians willing to speak out against it in its heyday in the 1930s and 40s? We should have had a unified voice against it, and we should have had the courage and strength to speak out against the public opinion.

Other Examples: Knowledge and Persons

Plantinga’s article also discusses examples of the intersection of theism with the theory of knowledge and the theory of persons (i.e., what is knowledge, and what is a person?). Discussing these would make the summary too lengthy – the main points have already been covered. So, if my reader wants to see what Plantinga has to say in those areas, they can go read the article. Otherwise, we move on to conclude the discussion.

Conclusion of Article

In rereading Plantinga’s paper, I find the following segments to be the best summary of what Plantinga wants to say.

“Many Christian philosophers appear to think of themselves qua philosophers as engaged with the atheist and agnostic philosopher in a common search for the correct philosophical position vis a vis the question whether there is such a person as God. Of course the Christian philosopher will have his own private conviction on the point; he will believe, of course, that indeed there is such a person as God. But he will think, or be inclined to think, or half inclined to think that as a philosopher he has no right to this position unless he is able to show that it follows from, or is probable, or justified with respect to premises accepted by all parties to the discussion – theist, agnostic, and atheist alike. Furthermore, he will be half inclined to think he has no right, as a philosopher, to positions that presuppose the existence of God, if he can’t show that belief to be justified in this way. What I want to urge is that the Christian philosophical community ought not to think of itself as engaged in this common effort to determine the probability or philosophical plausibility of belief in God. The Christian philosopher quite properly starts from the existence of God, and presupposes it in philosophical work, whether or not he can show it to be probable or plausible with respect to premises accepted by all philosophers, or most philosophers, or most philosophers at the great contemporary centers of philosophy.” [1, pg. 260-261]

“The Christian philosopher does indeed have a responsibility to the philosophical world at large; but his fundamental responsibility is to the Christian community, and finally to God.” [1, pg. 262]

“Now my point is not that Christian philosophers must follow Calvin [a theologian] here. My point is that the Christian philosopher has a right (I should say a duty) to work at his own projects – projects set by the beliefs of the Christian community of which he is a part. The Christian philosophical community must work out the answers to its questions; and both the questions and the appropriate ways of working out their answers may presuppose beliefs rejected at most of the leading centers of philosophy. But the Christian is proceeding quite properly in starting from these beliefs, even if they are so rejected. He is under no obligation to confine his research projects to those pursued at those centers, or to pursue his own projects on the basis of the assumptions that prevail there.” [1, pg. 263]

On These Developments in Christian Philosophy

The article I am discussing was written by Plantinga in 1984. I am writing this post in 2021, which of course is 37 years later. We are about as far from this article as the article is from the dark days of logical positivism it references in its introduction. How have things gone since then? Have philosophers taken his advice? I think they have – to conclude, I’ll discuss this briefly with some examples.

We Are More Autonomous

There has certainly been an increase in Christian perspectives within secular philosophy. But has there been more autonomy? I think there has been. In my experience with philosophy, I have encountered plenty of Christian philosophy dedicated to specifically Christian matters. Perhaps the best way to show what I mean is to give examples of serious Christian philosophers who have written on topics of both secular interest and specifically Christian interest.

The list of examples could be continued – I’m not trying to be comprehensive here. My point was to select a few big names from the philosophical conversations I’ve been a part of and to point out that those people have all done significant philosophical writing on expressly Christian topics. This means progress. We’ve done well.

We Are More Unified

This one is not as positive as the others. There is, in the Christian community at large, quite a bit of vitriolic disagreement that isn’t appropriate (although some of it might be). There are many minor disagreements that are treated as if they were much bigger than they are. This may be rampant among Christians in general, but the situation is a lot better in academia. There may not be agreement necessarily, but there is unity of purpose nonetheless. Each Christian philosopher I have read or listened to appears quite clearly to be pursuing God as truly as they can, even if they sometimes arrive at different conclusions. This is a wonderful development, and I hope this attitude spreads to the church at large.

We Are More Courageous

This one also isn’t fully positive. If a person has misguided motives and supreme courage, then the result can be horrifying. Take Hitler as an obvious example – for most of his public life, you couldn’t accuse Hitler of being cowardly; he was quite bold in his proclamations. But his motives and desires were obviously misguided, and the result was horrifying. Sadly, there are Christians (and people who claim to be Christian but likely aren’t) in a less extreme but analogous position – people of great courage whose goals are misguided. But there are also plenty of Christians who exhibit great courage and who have pure motives – those who do ministry in countries in the Middle East that want to execute them are amazing examples of such people. There are less obvious but equally real examples of such courage in academia as well – the willingness of Christian academics to enter into recorded debates with opposing views is a wonderful example. See debaters like William Lane Craig and James White if you want to see fearless, courageous Christian debaters.

Conclusion

I think Plantinga’s advice has been taken seriously and that we have followed through on it. I only hope we continue to do so. By reaching into deeply Christian philosophical discussions in a unified and courageous manner, we can educate the Christian community and raise up an even stronger generation of new Christian academics, both in philosophy and in other fields – like my own field of mathematics. I’d love one day to write something like Plantinga’s article for the mathematical community, and I hope such an article would one day be widely read there. But until then (or until I find that article, if it already exists), the Christian community can continue to work towards that point.

References

[1] Plantinga, Alvin. “Advice to Christian Philosophers.” Faith and Philosophy, vol. 1, no. 3, 1984, pp. 253–271., doi:10.5840/faithphil19841317.

Exponential Models (Explaining Calculus #11)

We’ve used calculus for a couple different applications at this point. I’d like to tack on another application to our list, perhaps the most important one – at least from the perspective of a society very reliant on technology, engineering, and science. The application I have in mind goes under the very broad title of mathematical modelling.

Mathematical Modelling

Mathematical modelling is an extremely broad term – so broad that it is a bit difficult to define. I would argue that large sections of physics, chemistry, engineering, and statistics (including polling in the social sciences) fall in one sense or another under this category. If I had to give a definition, I would say that mathematical modelling is the practice of framing patterns observed in the external world in mathematical language. So, mathematical modelling would involve first gathering some sort of data from the real world and subsequently attempting to explain that data by a single theory that utilizes some mathematical formulas.

Large portions of modern science were formulated this way. For example, because of the great ancient philosopher Aristotle, for more than a thousand years it was believed that heavy objects fall faster than lighter objects. This was believed not on the basis of gathering data, but on intuition. Perhaps you might think something like “well, since it is more difficult to prevent a heavy object from falling, it would make sense that a heavy object ought to fall faster”. And there is certainly intuitive plausibility here, but it doesn’t count as mathematical modelling for two main reasons. For one, no equation is given telling us how much quicker heavy objects fall. Secondly, the theory was not formulated from data.

This isn’t necessarily a bad thing for Aristotle’s theory – Einstein did not form his theory of relativity based on observational data either, but on more philosophical/physical intuitions. So Einstein was not engaged in mathematical modelling, yet his theory ended up being perfectly aligned with data once data was collected. But Aristotle’s theory turned out to conflict with data – notably Galileo did famous experiments that proved this wrong. Galileo’s work is much closer to mathematical modelling, because his theory arose directly from his data.

But even better would be a formulation of the data-based theory of Galileo in formulas. This is more or less what Isaac Newton did when he formulated his theory of gravity. Newton used the language of calculus to describe exactly what it means that objects of different weights fall at the same speed. Using the mathematical model of Newton, you can make predictions about a large range of moving objects – from footballs to planets.

Notice what has been done here. We start with Galileo dropping objects that weigh different amounts and noticing they hit the ground at the same time. Obviously Galileo didn’t drop every possible object (who has the time for that?), but he didn’t need to. From the experiments he did do, he formulated a theory – a model – that predicts that all objects always fall at the same speed towards the ground when dropped on Earth. Newton took that model, made it more precise mathematically, and expanded its scope to correctly predict much more data than Galileo’s theory did. This is one example of the way of thinking embodied by mathematical models.

In a later post, I plan to show how Newton’s model works. But before we go into that model, I want to talk about what I would consider to be a simpler model – one we can formulate together from the ground up.

Example: Growing Population

The example we will look at is the growth in population of living beings. The same basic model will work for any creature – whether bacteria, rabbits, or humans. I opt here to use human population growth. To begin, we’ll look at a graph of the estimated growth of the population I found.

The growth in human population over time (from Wikimedia Commons)

This is a rather odd graph. It is basically completely flat for a long time, then it just explodes. How do we understand a graph like this? Well, it all begins with a simple principle.

The more humans there are, the more babies they will have.

This seems rather obvious as a general rule. Since individual women are the ones that give birth to new humans, it would naturally follow that the more women there are, the more babies there can be in the future. And yet it is all we need to get a mathematical model started. How? Well, this is where mathematical modelling comes in.

Making the Model

How do we turn our observation into a model? Well, since we want a mathematical equation, let’s start by defining a mathematical symbol we can use to count the population of all humans. Let’s say

P(t) = Number of people alive t years after the first humans existed.

We are building a mathematical model based on the observation “the more humans there are, the more babies they will have”. In order to make things easier for us, we are going to pretend that only one generation of people is alive at a time. This just makes our counting job easier. Later on we will talk about how a mathematical modeler would go about making their model better able to account for multiple generations at a time. For now, we will leave that complication aside.

Ok. Now, let’s sit down and think about how to take our observation that “the more humans there are, the more babies they will have” and make it into a formula. To see how this works, let’s imagine that there are currently 1000 humans alive on earth. That would be about 500 women – let’s just say exactly 500 women. Now, we reach a decision point in our model – how many children are these women having? We don’t really have to ask about each of the 500 women – an average will be good enough. Let’s imagine that, on average, each woman has 3 children in her life. How many people will be in the next generation? Well, that would be the number of women times the average number of children per woman – which leads us to 500 * 3 = 1500 people in the next generation (750 women). If the average number of children per woman stays at 3, then the next generation will have 750 * 3 = 2250 people. If we keep up this process, we will get bigger and bigger population sizes.
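To make this concrete, here is a small sketch (my own addition, using the made-up numbers from the example above) that iterates this generation-by-generation rule:

```python
def simulate_generations(initial_population, babies_per_woman, generations):
    """Iterate the rule: next generation = (number of women) * (babies per woman)."""
    population = initial_population
    history = [population]
    for _ in range(generations):
        women = population / 2              # assume half the population is female
        population = women * babies_per_woman
        history.append(population)
    return history

print(simulate_generations(1000, 3, 4))    # [1000, 1500.0, 2250.0, 3375.0, 5062.5]
print(simulate_generations(1000, 1.8, 4))  # shrinks: each generation is 0.9x the last
```

Notice that each generation is exactly B/2 times the previous one – growth when B > 2, decline when B < 2, which is precisely the threshold we will derive below.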

Now, how might we try to put this into a formula? This is where calculus becomes so important, because what we are trying to figure out is how the total population changes over time – and change over time is exactly what calculus is meant to help us with! Notice what we’ve done in the previous paragraph. There were two factors that told us how to predict the size of the next generation of humanity:

  1. The number of people alive currently.
  2. The average number of babies each woman has.

In the spirit of making this into an equation, we should express both of these in a mathematical way. We’ve already set up the function P(t) as the number of people alive at a specific time, so (1) is given to us by P(t). For the second one, we can use the letter B (short for ‘babies’) to stand in for the average number of babies each woman has. Now, when we calculated the size of each generation earlier, we multiplied the number of women by the average number of babies per woman. If our current generation is P(t) people, that would be about \dfrac{1}{2} P(t) women, and so the number of people in the next generation would be about

\dfrac{1}{2} P(t) * B = \bigg( \dfrac{B}{2} \bigg) P(t).

This leads us to an interesting formula. What if we want to know how quickly the human population is growing? Well, let’s think it through. If we want to know how much bigger (or smaller) the new generation is compared to the old one, we just subtract one value from the other. In the example earlier, the first and second generations differed by 1500 – 1000 = 500 people, and the second and third differed by 2250 – 1500 = 750 people. If we use our big-picture formulas, with the derivative P^\prime(t) standing in for the change in population (since derivatives literally are changes over time), then we wind up with the formula “Change in Population = New Generation Size – Old Generation Size”, which in the language of calculus leads us to

P^\prime(t) = \bigg( \dfrac{B}{2} P(t) \bigg) - P(t) = \bigg( \dfrac{B}{2} - 1 \bigg) P(t).

So, if we know B, then we can use this new formula to approximate how the population of humans will change over time.
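As a quick sanity check (my own addition), we can plug in the numbers from the earlier example. With B = 3 and P(t) = 1000, the formula gives

P^\prime(t) = \bigg( \dfrac{3}{2} - 1 \bigg) \cdot 1000 = 500,

which is exactly the generation-to-generation change of 500 people we computed by hand.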

Solving the Key Equation

We now have a key equation we want to solve. We’ve found an equation for the change over time that the population exhibits:

P^\prime(t) = \bigg( \dfrac{B}{2} - 1 \bigg) P(t).

We now want to solve this equation. This equation is basically just “Derivative = Constant * Population”, so it will be a bit easier to just write a single letter C instead of the more cumbersome \dfrac{B}{2} - 1. So, we will think about the equation

P^\prime(t) = C * P(t).

Now, this is an equation in the world of calculus. It would then be natural, as a mathematical modeler, to ask whether we know of any functions P(t) that solve this equation. We don’t quite have the tools yet to solve it from scratch (we need something called antiderivatives for that), but among the examples of derivatives we’ve talked about in previous posts, we already have a function that works – the exponential functions. In the ‘calculating derivatives’ posts, I showed that if f(x) = b^x for some constant b, then f^\prime(x) = \log{b} * b^x = \log{b} * f(x). So, if we choose the value of b so that C = \log{b}, then f^\prime(x) = C * f(x). If we solve for b, we find that b = e^C, and therefore we conclude that

f(x) = b^x = (e^C)^x = e^{Cx}

is a solution to the population growth equation P^\prime(t) = C * P(t). Now, does this actually work? Well, if you graph functions like b^x for values of b larger than 1, then the shape you get is a lot like the graph I started the post with – initially very flat, but with a sudden explosion later on. So we seem to have done a pretty good job.
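As a quick numerical sanity check (my own addition), we can verify that P(t) = e^{Ct} really does satisfy P^\prime(t) = C * P(t) by comparing a difference quotient against C times the function:

```python
import math

C = 0.5                          # an arbitrary growth constant for the check
P = lambda t: math.exp(C * t)

h = 1e-6
for t in [0.0, 1.0, 2.0]:
    derivative = (P(t + h) - P(t)) / h   # approximates P'(t)
    print(derivative, C * P(t))          # the two columns agree closely
```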

Application to Population Growth

What does this mean for predicting future population growth? Well, we’ve already established that if you know the average number of babies each woman has, we can use that to form an equation predicting how quickly the population of the world will grow. So, the main thing we have to keep track of is this average.

But we can go even deeper. Recall the original formula we came up with – if B is the average number of children per woman and P(t) is the population, then we found the equation

P^\prime(t) = \bigg( \dfrac{B}{2} - 1 \bigg) P(t).

Perhaps you are wondering whether the global population is increasing or decreasing. We can use the value of B to predict this. If the value of \dfrac{B}{2} - 1 is positive, then the right-hand side of the above formula is a positive number, and this means the population is increasing. If on the other hand \dfrac{B}{2} - 1 is negative, then the right-hand side is negative, so the population will be decreasing. Since \dfrac{B}{2} - 1 > 0 exactly when B > 2, growing populations happen when B > 2 and shrinking populations happen when B < 2. This mathematical model has given us a quite interesting result, then – we can determine whether the population is growing or shrinking without actually counting the population. That is a pretty neat consequence that we likely wouldn’t have thought of before we made our model.

We could draw tons of other conclusions from this model. I won’t do so here. My point is just to show how calculus played a role in developing this theory of population growth.

How to Improve Your Model

There are, of course, lots of ways we could improve the model. We could account for the fact that some people die before having children, we can account for the fact that multiple generations are alive at the same time, we can account for medical advancements that make it easier for some people to have children, we can account for cultural factors that change people’s desires about having children either up or down over time, we can factor in random events like plagues that cause huge decreases in population over short times. If we factor in those new observations into our mathematical model, our predictions will get better and better.

Lots of people have done this. There are plenty of models out there that take many of these factors into account. But, in the end, all of them are still basically exponential – they look very much like b^x. I’ll give one example of how this works. Say I want to add to my model the observation that only people in a certain age range can have babies. Then the equation you end up with is something like

“Babies this year = Babies from 20 year-olds + Babies from 21 year-olds + … + Babies from 45 year-olds.”

What ends up happening is you get something like the famous Fibonacci sequence. This sequence begins with F_1 = 1, F_2 = 1, and in order to get the next item in the list you add together the previous two. The first several numbers in the sequence are 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, and so on. In my series on the Fibonacci numbers, I showed that these numbers are very, very close to the exponential function \bigg( \dfrac{1 + \sqrt{5}}{2} \bigg)^x. Check out that series if you want to see how you could figure out something like this.
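Here is a quick sketch (my own illustration) of how close the Fibonacci numbers stay to an exponential: the ratio of consecutive terms rapidly approaches \dfrac{1 + \sqrt{5}}{2} \approx 1.618, the base of the exponential above.

```python
phi = (1 + 5 ** 0.5) / 2   # the golden ratio, roughly 1.6180339887

a, b = 1, 1                # F_1 and F_2
for _ in range(15):
    a, b = b, a + b        # step to the next Fibonacci number
    print(b, b / a)        # the ratio F_(n+1) / F_n closes in on phi

print("phi =", phi)
```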

Modifications like these do have significant impacts on the model, but they don’t prevent it from being like b^x – more often they cause changes to the value of b, which still does radically alter how quickly the population grows. Despite this change, these modifications don’t tend to change the shape of the graph overall – it will still start very flat and then explode. The alterations tend to cause slight delays in the explosion or slightly slow down the explosion in the long-term.

There is a major exception that is worth addressing: the observation we made earlier about whether B is greater than or less than 2 isn’t actually quite right, because some children die before they reach adulthood, and so never even have the option to have children. In reality, if you look at academic sociological polling (like Pew Research), you’ll see the number 2.1 in place of 2 here. What you’ll also realize is that different cultures tend to have different values of B – countries with lots of Muslims, for example, tend to have very high values of B compared with more secularized countries. These cultural differences actually tend, over time, to skew the overall global value of B towards the higher side, since the cultures with high values of B grow faster than cultures with low values of B. These kinds of cultural considerations can change the value of B from one generation to the next, and so this has a significant impact on the graph. On the other hand, if enough cultures wind up with a value of B smaller than 2.1, then the graph will stop exploding altogether and start leveling off, as if there were an ‘imaginary ceiling’ the graph isn’t allowed to cross.
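One standard way modelers capture that kind of ‘ceiling’ is the classic logistic equation – this is not developed in the post above, so treat the sketch below as my own illustration with made-up numbers. The idea is to make the growth rate shrink as the population approaches a carrying capacity K:

```python
# Euler-step the plain exponential model P' = C*P alongside the logistic
# model P' = C*P*(1 - P/K), where K is a hypothetical carrying capacity.
C, K, dt = 0.5, 1_000_000.0, 0.1
P_exp, P_log = 1000.0, 1000.0
for _ in range(200):
    P_exp += dt * C * P_exp
    P_log += dt * C * P_log * (1 - P_log / K)
print(round(P_exp), round(P_log))  # exponential explodes; logistic levels off near K
```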

So, if you want to make predictions about population growth, probably the most important factor you need to account for is the different numbers of children being born in different geographical areas and cultures.

Conclusion

I’ve been a bit vague at points in this discussion. This is partly because I do not specialize in mathematical modelling – I would want to leave really specific details to professionals. My goal is not to develop a sophisticated mathematical model – my point is to show how calculus can be useful in formulating and studying these models, and the kinds of ways you might think about trying to build your own. As we go on in the calculus series, it is important to keep in mind how the concepts of calculus put into mathematical language the ideas about growth and change that come up so often when we try to make predictions about the world around us.

Critical Thinking Toolkit: Necessity and Contingency

In moments of deeper thought, have you ever noticed that certain features of the world seem quite arbitrary or random, while for other things it seems there isn’t really any other way they could be? Have you ever wondered why things are the way they are, instead of some other way? This is the topic of this post, since these kinds of questions are taken quite seriously in philosophy. In fact, there is even a name for many of the deep considerations of this sort – we call them modal.

What is Modality?

Modal reasoning is an extension of normal logic. In order to understand what this extension is, it is worthwhile to review what logic normally is first.

Standard Deductive Logic

Logic is essentially the study of the different ways we can learn new information from previous knowledge. Here, we are concerned with deductive logic, which for our purposes means the type of logic in which, if our starting points are known to be true, then the conclusion is in a sense forced upon us. Consider, for example, this argument:

  1. All men are mortal.
  2. Steve is a man.
  3. Therefore, Steve is mortal.

Notice that, if you accept both 1 and 2, you have to also accept 3. This may seem rather silly, and I’d say it does seem a bit silly to give a fancy name like ‘deductive logic’ to something that doesn’t look all that fancy. The reason this actually isn’t silly is how few basic principles like this there are. There are just nine rules of deductive logic, along with two even more basic starting principles. The two basic principles are the law of non-contradiction, which says that nothing can be both true and false at the same time, and the law of the excluded middle, which says that any matter-of-fact claim is either true or false (there is no third alternative ‘in the middle’ of the two). The nine rules are all as simple as the rule used in the above example, which is the general pattern

  1. If X, then Y.
  2. X.
  3. Therefore, Y.

Other rules explain more about ‘if…then’ forms of thinking, and some talk about how to treat sentences like ‘X and Y’ and ‘X or Y’. We don’t really need to go over all the rules – the main point is that the rules don’t deal with any language more complicated than and, or, not, and if…then.

The Additions in Modal Logic

Modal logic uses standard logic as its starting point and expands the language we can use. There are four key words we are adding to our logical language – necessary, possible, impossible, and contingent.

The word possible means pretty much what we normally mean when we use it. To say that something is possible means something like ‘it could potentially have happened’. To say something is impossible just means it is not possible. A square circle is a common example of something impossible – a shape can’t be both a square and a circle at the same time. That makes no sense – it is impossible. Another common example of an impossibility is a married bachelor – a married unmarried person. That is impossible – you can’t be both married and not married at the same time!

By contrasting against impossible things, we can better understand what counts as possible. For example, it is possible that there is a planet in our solar system we haven’t found yet. Notice that this doesn’t mean that there is a planet, and it also doesn’t mean there isn’t a planet. It just means the presence of such a planet isn’t utter nonsense like the idea of a square circle. It is also possible that the Baltimore Ravens win the Super Bowl next year, or that any other football team does for that matter. There are some teams that very likely won’t win, but unlikely things are still possible. We can proliferate different possibilities endlessly. Pretty much everything falls into this category.

The third category is a much stronger idea. This is the idea of a necessary thing. Something would be called necessary if it has to be the way it is – if it literally can’t be any other way. The easiest examples of necessary truths are things like “if it is raining, then there is currently water falling down to the ground from clouds”. This statement is necessarily true because the second part of the sentence just is what it means to be raining, so all I really said is “if it is raining, then it is raining”. It is literally impossible that “if it is raining, then it is raining” could be false. “All husbands are married” is another example of a necessary truth – being married is part of what it means to be a husband.

There is a fourth word that can also be important – contingent. Contingent is a label for anything that is neither necessary nor impossible. In possible world semantics (an idea explained below), X is contingent if some possible worlds have X and others do not have X. For example, your birth is contingent. There is definitely a possible world in which you are born – you really have been born! But, had your parents not met (which certainly seems at least possible), then you wouldn’t have been born.

Now, a philosopher might ask how these ideas work on the logical level – what sorts of logical rules do they follow? Studying the types of logical rules we can formulate using the terms necessary, possible, and impossible is called modal logic.

To give an example of one of the rules of modal logic, consider the following format similar to the bit of standard logic I wrote out earlier:

  1. Necessarily, if X then Y.
  2. Necessarily, X.
  3. Therefore, necessarily Y.

Now, let’s think this through. Statement 1 says that if X happens, then Y must happen. It is unavoidable. Statement 2 says that X must happen. So, Statement 3 follows by stringing these together – since X must happen, Y must happen. This is one of the main rules of modal logic. There are plenty more rules, and they will often look very similar to the rules of standard logic with extra words added in. But, as we shall see later, you can’t throw these words in anywhere you want and wind up with good logic – sometimes things don’t work.

Possible World Semantics

I want to introduce here one of the common ways philosophers express modal styles of thinking in words. This style is called possible world semantics. In possible world semantics, the key concept is a possible world. A possible world is just some way that reality might have been. So, when I said earlier that it is possible that we haven’t found one of the planets orbiting our Sun yet, in possible world semantics we would say “there is a possible world in which there is an undiscovered planet orbiting the Sun”. If you wanted to say some situation is impossible, you can say that there is no possible world in which it is true. If you want to say something is necessary, you can say it is true in all possible worlds.

This can be a bit confusing sometimes because of the use of the word ‘world’. The confusion is quite understandable, because it makes it sound like the speaker actually thinks these possible worlds exist in some sense. But this need not be so. Possible world semantics is called semantics for a reason – it is nothing more than a helpful wording (a semantic device) that can be discarded if it is too confusing to be helpful. We have phrases like this in English – like the rising and setting of the Sun – that are merely semantic devices. We don’t mean the Sun is moving around the Earth when we use those phrases, but it is a helpful way of speaking and everyone understands what we mean, so we continue using this language.

So it is with possible world semantics. If imagining a ‘fake universe’ in which so-and-so is true helps you understand what ‘so-and-so is possible’ means, then great – use this way of phrasing things. If it isn’t helpful, that’s fine too. Ignore this bit, and if you hear someone using possible world language in a confusing way, just ask for clarification in different words.

Why it Matters

This will likely seem quite abstract, and perhaps to some it may even border on meaninglessness. But in philosophy, modality is very important – there are even debates about theories of modality, about how to correctly understand these distinctions, because they really do matter quite a lot.

Understanding Fundamental Reality

When we ask the ‘Big Questions’ so to speak, often we need to be able to use modal language in our answers – sometimes we even use modal language in the question itself! Instead of trying to convince you in an abstract way of the importance of modality in the Big Questions, I’ll just list out some Big Questions and emphasize the role of modal terms.

  • Do humans have free will? In other words, are all of our actions necessary, or do we have some possibility of choosing our own path?
  • Why does anything exist at all? Did the universe have to exist this way, or could it have been some other way? Is there a fundamental, necessary reason that explains why the universe exists?
  • Is it possible that our senses are misleading us? How do we know whether the world around us exists if our senses are deceiving us?
  • Is it possible that our cybersecurity algorithms are flawed, or is our encrypted data permanently safe?
  • Does human society always progress towards better morals over time, or towards worse morals, or do we have ebbs and flows?
  • If something doesn’t have to exist, then shouldn’t there be an explanation/cause of why it does exist? If not, why not?
  • Is a perfect society possible? If so, what does that look like and how do we try to get there?
  • Are mathematical statements like 2+2=4 necessarily true, or did we just invent them to help us keep things organized?

I hope that list will convince any reader of the importance of modal language in the most important questions of life. And if these words are important in the questions themselves, of course they will also be important in any attempts to answer them. And this leads to the next main point.

A Common Fallacy in Modal Logic

Earlier, I defined deductive logic as any method of thinking where, if you take for granted your starting point, then the conclusion is forced on you. This way of thinking about deductive logic makes it easy to define logical fallacies – these are forms of thinking that, at least some of the time, violate this condition. That is, a logical fallacy is a pattern of thinking that, at least some of the time, leads you to a wrong conclusion even if your starting points are true.

Modal logic has fallacies, just as we have fallacies in regular logic. However, very often fallacies in modal logic are harder to spot, because they are less obvious. In fact, they sometimes even have so much intuitive appeal that people use them casually and don’t even consider the possibility that this method is incorrect.

Let me give you an example. Consider the following logical form.

  1. Necessarily, if X, then Y.
  2. X.
  3. Therefore, necessarily Y.

This is a fallacy in modal logic. Let’s use one of our earlier examples involving bachelors to show how this fails.

  1. Necessarily, if Will is unmarried then Will is a bachelor.
  2. Will is unmarried.
  3. Therefore, Will is necessarily a bachelor.

This argument takes the form I wrote out, but is obviously wrong – I’m getting married later this year! So of course I don’t have to be a bachelor. And yet both 1 and 2 are true about me right now. This conclusively shows that knowing 1 and 2 does not mean 3 is true as well.
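For readers who like to tinker, here is a minimal possible-worlds sketch (my own toy model, not a full treatment of possible world semantics) that exhibits the failure directly: both premises come out true, but the conclusion comes out false.

```python
worlds = ["w1", "w2"]   # w1 is the actual world; w2 is an alternative world
actual = "w1"

# Truth assignments: X = "Will is unmarried", Y = "Will is a bachelor".
# Will is unmarried (and so a bachelor) in w1, but married in w2.
X = {"w1": True, "w2": False}
Y = {"w1": True, "w2": False}

def necessarily(prop):
    """A proposition is necessary only if it holds at every possible world."""
    return all(prop[w] for w in worlds)

premise1 = all((not X[w]) or Y[w] for w in worlds)  # Necessarily, if X then Y
premise2 = X[actual]                                # X (true at the actual world)
conclusion = necessarily(Y)                         # Therefore, necessarily Y?

print(premise1, premise2, conclusion)  # True True False -> the inference is invalid
```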

This all seems rather obvious, but this logical fallacy is at the very heart of why most people think God knowing the future is incompatible with free will. This theological argument normally comes in the following form.

  1. Necessarily, if God knows you will do X tomorrow, then you will do X tomorrow.
  2. God knows you will do X tomorrow.
  3. Therefore, necessarily you will do X tomorrow.

The goal of this argument is to answer the Big Question about free will as a resounding ‘no’ by showing all of our actions are necessary if God knows the future. But, the entire argument rests on a logical fallacy! This means we don’t have to take it seriously. In order for modal logic to work here, you need to replace 2 with “Necessarily, God knows you will do X tomorrow.” But I’m thoroughly convinced that the theological doctrine of Molinism (which is held by many but not all Christians) conclusively proves that this modified version of 2 is false. Molinism takes too long to explain, so I won’t do so here, but my main point is just to show how much an understanding of modal logic really can matter.

The Ontological Argument for God

Ever since the medieval theologian and philosopher Anselm, there has been a detailed and interesting dialogue about a particular philosophical argument that God exists, called the ontological argument. As weird as it sounds, this argument basically makes the point that if God even might exist, then God does exist! I know, it sounds bizarre. But modal logic makes much clearer what we mean. Allow me to explain.

Part of what we normally mean by God is that God has always existed and cannot be destroyed – He is the greatest thing that exists. In more precise philosophical words, one of the things meant by such expressions is that God necessarily exists. This is a modal word, so we should start thinking in modal terms. By definition, then, God is a being that exists necessarily (if He exists at all). This means that contingency isn’t applicable to the question of whether God exists – the only options available to us are necessity and impossibility. All else is ruled out by this definition of God. So, if you can disprove one of the two options, then you prove the other. There are a lot of different versions of ontological arguments, but all of them focus centrally on modal logic and the idea of God necessarily existing.

Conclusion

Modal logic may not come up incredibly often, but it is nonetheless important. The weird world of modal logic makes it easy to trip up, so some care is needed to minimize our mistakes. But when modal questions do come up, modal logic is an indispensable tool that we can use to great effect to better understand the world in which we live.

Optimizing Real-World Stuff (Explaining Calculus #10)

In the previous post in this series, I explained how you can use derivatives to locate maximizing and minimizing values for functions. Now, I will put that knowledge to use in concrete examples to show exactly how this works.

Example 1: Building a Fence

Imagine the following situation. You are a farmer, and you want to build a rectangular fence for your cows. You happen to live near a river, and you know your cows would never fall into the river. So, to save money, you decide to use the river as one of the “sides” of the fence. That way, you’ll only have to build three of the four sides out of actual fencing material. You’ve already decided that your fence needs to enclose 45,000 square feet of field so that your cows will have enough grass to graze on. Wanting to save money, you ask yourself… what is the least amount of fencing I have to buy to build this pen for my cows?

Notice we just used the word least. That’s just another word for minimizing something. That means we should be able to use calculus somehow. But how? Well, Step 1 would be to clearly set up the thing you’re trying to minimize. As stated, we are trying to minimize the amount of fencing material we buy. And what is the amount of fencing material? Well, that would be the total length of the three sides of the pen we have to build. To make this easier to visualize, let’s say the river runs north-south and we are building the pen to the right of the river. Since the pen is rectangular, two of the sides are going to be the same length (the top and bottom sides), and the third side (the right side) could possibly be a different length. We will say the length of each of the top/bottom sides is x feet and the right-hand side has a length of y feet. Then the amount of fencing we need to use, which we will call P (it stands for perimeter), is

P = x + x + y = 2x + y.

Ok, so far so good. We have a formula for P. But we have a problem. We can’t do a derivative yet, because there are too many variables. We need to find a way to “get rid of” the y… how? Well, let’s try to use the information about the area of the field we already know. The total area is going to be 45000. But the area of a rectangle is also always equal to xy. So,

\text{Area} = xy = 45000.

We can do something clever now. If we divide both sides by x, then we come to see that y = \dfrac{45000}{x}. That means

P = 2x + y = 2x + \dfrac{45000}{x}.

That helped! Because now, we have just the one variable, x, on the right-hand side (we can also give P the new name P(x) to make it clearer that the variable is x). So we can take a derivative now! But before we do, we need to stop to remember why we are taking a derivative in the first place. Remember from the previous post that max and min values always occur either at a place where the derivative does not exist or where the derivative is equal to zero. So, once we take the derivative of P, our next step needs to be to catalogue these two potential options. First, the derivative of P. We can do this using the sum and power rules for derivatives (which we discussed in part one of “Computing Derivatives”). Using these rules and simplifying a bit,

P^\prime(x) = \dfrac{d}{dx}[2x] + \dfrac{d}{dx}[45000 x^{-1}] = 2 x^{1-1} + (-1)*45000 x^{-1-1} = 2 - \dfrac{45000}{x^2}.

Now, we need to catalogue the ‘special points’ where we might find a max or min value. The only place where P^\prime(x) doesn’t make sense is at 0, but a rectangle with side length zero (remember, x is a side length!) doesn’t make any sense either, so we can actually just ignore it. In other words, since x has to be a positive number anyway, x=0 is never even on our radar of possibilities. Having ruled out the ‘does not exist’ option, we need to solve the equation P^\prime(x) = 0. To simplify a bit, we can multiply both sides by x^2,

P^\prime(x) = 0 \iff 2 - \dfrac{45000}{x^2} = 0 \iff 2x^2 - 45000 = 0.

If we divide both sides of this new equation by 2, then we find that we need to solve x^2 - 22500 = 0. This equation has two solutions, x = 150 and x = -150. But remember from earlier, x is a length, so only x = 150 makes any sense (there is no such thing as a negative length). So, we actually only really have one option. But is this option really a minimum like we want it to be? After all, these special points can sometimes be maximums too. How do we know we actually found a minimum?

In the same post where we talked about how to find these special points, we also talked about how to tell which are which by using the ideas of increasing and decreasing. The idea was to pick values that are close to the special point on the left and right to see what P^\prime looks like there, and use the visual ideas of increasing/decreasing to give us our answer. Using the equation P^\prime(x) = 2 - \dfrac{45000}{x^2}, if we pick x a little to the left of 150 we get a negative number, and if we pick x a little to the right then we get a positive number. This means that, as we move from left to right, we find ourselves decreasing, leveling off, then increasing. This is the exact pattern of the U shape, which is the key example we used to show what a minimum looks like. So we actually do have a minimum!

What does all of this mean? It means that, as the farmer, you should build the top and bottom fences 150 feet long. But what about the lone third side? Well, we had an equation for that earlier, y = \dfrac{45000}{x}. If we just plug in x = 150, then we find that y = 300. So, the third side should be built at 300 feet long. And this is the best way to save your money!
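If you ever want to double-check this kind of computation, a computer algebra system can reproduce every step. Here is a minimal sketch in Python using the sympy library (my choice of tool – the solution above doesn’t depend on it):

import sympy as sp

x = sp.symbols('x', positive=True)  # x is a side length, so it must be positive
P = 2*x + 45000/x                   # fencing needed: two sides of length x, one of 45000/x

P_prime = sp.diff(P, x)             # P'(x) = 2 - 45000/x^2
print(sp.solve(P_prime, x))         # [150] -- the only positive critical point

# First-derivative test: negative slope to the left, positive to the right => minimum
print(P_prime.subs(x, 149) < 0, P_prime.subs(x, 151) > 0)  # True True

print(P.subs(x, 150))               # 600 -- total feet of fencing needed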

I like this example because, without using calculus, it isn’t exactly obvious how you should build the fence to save money. So this is a problem that became much, much easier because we learned how to use derivatives. And the even better thing is that we can solve any problem resembling this one using exactly the same concept – we won’t have to come up with complicated new ideas for problems that are similar enough to this one. In fact, in “Homework Problem 1,” I give as practice this same problem without the river to help you out, where you’ll have to build all four sides of the fence rather than just three.

Example 2: Building a Low-Cost Container

Imagine you are running a business and you need to design a very special shipping container. This container is special because its top and bottom need to be made out of a more expensive, slightly magnetic material so the boxes stay connected better when you stack them. The top and bottom material will cost you $5 per square foot, and the cheaper material for the sides costs $2 per square foot. Your boss asks you to design a box with a total storage volume of 100 cubic feet for the lowest possible cost, with the added requirement that the top and bottom of the box be squares. How do you design such a box?

Before we even start, notice that this is a more complicated problem than the fence problem from earlier. We are going to have to think a little more carefully, particularly at the very beginning. Firstly, notice where the word lowest shows up – it refers to the cost. Because our process of minimizing things requires that we take a derivative, we need to write down a function that tells us about the cost. We will eventually call this function C.

But how do we write down C? Well, we need to slow down a bit. This function counts the total cost of building a box. When we build this box, there are six sides, so C is the sum of the cost of the bottom, the cost of the top, and the cost of each of the four sides. If we could write down all those numbers, we’d have an equation for C! Here is where we need to pause and remember what our boss asked us to do. He told us that the top and bottom must be squares. So, if those squares have a side length of x feet, then the area of the top is x^2 square feet, and the area of the bottom is also x^2 square feet. We also know that the expensive material for the top and bottom costs us $5 per square foot. This means that the top will cost us 5x^2 dollars, and the bottom will also cost us 5x^2 dollars. Now, we need the costs for the four sides, which go for the cheaper rate of $2 per square foot. Each side panel shares its base with the squares, but has a height that the squares don’t determine. We can call this height y. So, the area of each of the four side panels of the box is xy square feet. Since these sides cost two dollars per square foot, each panel costs 2xy dollars to build. Since there are four of these side panels, the cost for the sides is 2xy + 2xy + 2xy + 2xy = 8xy dollars.

We now have all the information we need for an initial cost formula. By adding up the dollar total for the top, bottom, and sides, we can conclude that

C = 5x^2 + 5x^2 + 8xy = 10x^2 + 8xy.

Now what? We can’t exactly take a derivative yet, there are too many variables. How do we get out of that problem? We again turn to what our boss asked us to do. We were asked to build a box with a total volume of 100 square feet. Well, we need to remember the volume formula for a box,

Volume = Base \times Width \times Height.

The base and the width are both x, and the height is y. So, the volume formula tells us that our boss is asking that we make sure 100 = x^2 y. By dividing both sides by x^2, we can conclude that

y = \dfrac{100}{x^2}.

Because we know this new expression for y, we can use that expression in our formula for C:

C(x) = 10x^2 + 8x\dfrac{100}{x^2} = 10x^2 + \dfrac{8000}{x}.

We are in familiar territory now! We have a single function C(x) and we need to find its minimum value. This is where we use the same process from earlier – we take a derivative, find zeros, and carry on.

The derivative is a fairly routine one using the power rule; we have

C^\prime(x) = 20x - \dfrac{8000}{x^2}.

We want to solve the equation C^\prime(x) = 20x - \dfrac{8000}{x^2} = 0. By multiplying both sides by x^2, we see that 20x^3 - 8000 = 0. By adding 8000 to both sides, we arrive at 20x^3 = 8000. By dividing both sides by 20, we see that x^3 = 400. There is only one positive real number whose cube is 400, so x = \sqrt[3]{400} is our candidate answer. But how do we know it is really a minimum? As before, we need to check the slopes to the ‘left’ and ‘right’ of the key point. When you do this (take some time to do it yourself if you want practice), you find that leading up to the point x = \sqrt[3]{400} the graph of C(x) slopes downward, and after the key point it slopes upward. This is the characteristic ‘U’ pattern, which tells us we’ve found the minimum!
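The same kind of machine check from Example 1 works here too. This is a sketch under the same assumption (Python with the sympy library), not something the math needs:

import sympy as sp

x = sp.symbols('x', positive=True)
C = 10*x**2 + 8000/x                 # cost of top+bottom plus cost of the four sides

crit = sp.solve(sp.diff(C, x), x)    # solve C'(x) = 20x - 8000/x^2 = 0
x0 = crit[0]
print(x0, float(x0))                 # the cube root of 400, roughly 7.37 feet

y0 = 100 / x0**2                     # height forced by the volume requirement
print(float(y0))                     # roughly 1.84 feet
print(float(C.subs(x, x0)))          # minimum cost, roughly 1629 dollars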

So, you can tell your boss the base needs to have a side length of x = \sqrt[3]{400} feet, which is about 7.37 feet. I’ve chosen a problem with a rather strange number as the answer on purpose, to make a brief point – calculus can solve a lot of non-obvious problems with some non-obvious answers. Before we close out this post, I’ll solve one more problem – this time one with a maximum.

Example 3: The Apple Orchard

You run an apple orchard, and naturally you want to produce as many apples as you can. You’ve done some research and experiments with your field and the variety of apple tree that you grow. You’ve noticed over time that when you plant trees too close together, they crowd each other out and you get fewer apples per tree. But you also realize that more trees mean more apples. So you have, on the one hand, an increase in apples, and on the other hand, a decrease in apples. You are trying to figure out the perfect balance between planting as many trees as you can and not causing too much overcrowding. Your research has led you to conclude that if the number of trees per acre of land increases by one, then each tree in that acre gives you two fewer apples than it did before. Your field currently has 10 trees per acre, and each tree grows 35 apples. Now, how should you change your apple orchard to give you the most apples possible?

As with the previous two examples, we have to start by reminding ourselves of exactly what we are doing. We want the largest number of apples, so mathematically speaking, we want a formula A that counts the number of apples we get every harvest, and we want to maximize that formula.

Well, what is a good formula for A? If we know each tree gives us N apples every harvest and that we have T trees per acre, then A = TN. This tells us how to count the number of apples. However, we can’t take the derivative of A yet – there are still too many variables. We now must use the time we spent researching our apple trees. We know that when T = 10, N = 35. We also know that if we add one to T, then N goes down by two. We can think of this as the slope of a linear equation. To see how, note that in the equation y = -2x, if x goes up by one, then y goes down by two. This means that something like N = -2T is true… but that isn’t quite right, because the pair T = 10, N = 35 doesn’t satisfy that equation. If we were to plug in T = 10, we’d get N = -2 \cdot 10 = -20, not the 35 we need. We have to ‘shift’ the equation up to make it work, which in this case means adding a constant to the right-hand side until the -20 rises to the 35 where we need it. When we make this correction, we end up with the correct equation N = -2T + 55.
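If the ‘shifting’ step feels like guesswork, the point-slope form of a line makes it mechanical. We want a line with slope -2 passing through the known data point (T, N) = (10, 35):

N - 35 = -2(T - 10), \quad \text{so} \quad N = -2T + 20 + 35 = -2T + 55.

As a sanity check, plugging in T = 10 gives N = -20 + 55 = 35, exactly as our research said.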

We can now go back to the real problem of maximizing A. We know both that A = TN and N = -2T + 55. By using this second equation to modify the first one,

A = TN = T(-2T + 55) = -2T^2 + 55T.

Therefore, we’ve found that A = -2T^2 + 55T. This is a one-variable equation for A, so we are now at the same point we reached in the other two examples. We proceed in the same way, by finding A^\prime and solving for T in A^\prime = 0. The derivative of A is A^\prime = -4T + 55, and the solution to A^\prime = 0 is T = \dfrac{55}{4}. So, in theory, this is our answer. We should be planting about $\dfrac{55}{4} = 11.25$ trees per acre. Since we are currently only planting 10 per acre, we should plant a couple more trees.

Conclusion

This would be a really good time to pause and reflect on what calculus has enabled us to do. We’ve just solved a variety of problems that asked us to find best-case scenarios for many different sorts of things. This is one of the most important things that calculus can do. There is another incredibly important application of calculus to physics, which we shall go into later. For now, we can appreciate what these new mathematical tricks have enabled us to do.

A Practice Problem

Problem: This is a similar problem to Example 1. If you had to build all four sides of the fence instead of just three, prove that the pen that saves you the most money will be a square. In other words, show that x = y, using x and y the same way as in Example 1. (Hint: What is the new version of the equation P = 2x + y?)

Critical Thinking Toolkit: The Correlation-Causation Fallacy

Here, I’d like to discuss two interconnected tendencies we human beings have. We like looking for patterns, and we like explaining things. These are both incredibly important features of the way we think as humans. But they are not identical. It is easy to get them confused, and we often do get them confused. This is a problem, because any error in critical thinking allows opportunities for our personal biases to creep in. For example, suppose I find a patter, and I want that pattern to not be a coincidence, but it actually the pattern just is a coincidence. Then if I come to believe that the pattern I found is not a coincidence, I now believe something false that might give me a lot of problems later on.

What I am referring to is the so-called correlation-causation fallacy. The idea here is that just because you find a correlation between two things (a pattern), that does not mean there is causation between them (a cause-and-effect relationship). I find it a little difficult to explain correlation and causation in any simpler terms than that, because these are both hard to define without using even weirder language. What I’ve decided will be more helpful is a list of examples with some discussion. Perhaps not all of them will make sense – but I hope some of them will. And if I’ve missed any good ones, it isn’t too hard to find lots more with a quick Google search.

With all that said, let’s begin with some examples.

Example 1: A Silly Example

I dug around the internet for a bit to find a silly example of the correlation-causation fallacy, and found this one. There is a significant mathematical correlation between the number of movies Nicolas Cage stars in during a given year and the number of people who drown by falling into a swimming pool that year. This means that you could write down an equation that does a really good job of predicting one based on the other. Should we then infer that those people fell into those pools because of the movies, or that Cage stars in movies in honor of the victims of pools? Surely not. It is mere coincidence that the data line up so well between those two counts.

Example 2: A Mathematical Example

Perhaps you’ve seen a “trick question” like this before:

Which number comes next in the pattern 1, 2, 3, …?

Surely, you think, it must be 4. But actually, it might be 5, if the pattern is really that 1 + 2 = 3, so that 2 + 3 = 5 must be the next number. It could, in fact, be literally any number at all. Mathematically, I can write down an equation that would spit out 1, 2, 3, 123, or 1, 2, 3, -7, or anything at all you’d like it to spit out. It actually isn’t terribly difficult either – you just have to use polynomials in a clever way, as shown below (if you’re a mathematically inclined reader, try to produce such sequences on your own as a mental exercise first).
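For the curious, here is one concrete construction (a sketch – there are infinitely many others). Start with the obvious rule p(n) = n, then add a correction term that vanishes at n = 1, 2, 3:

p(n) = n + \dfrac{119}{6}(n-1)(n-2)(n-3).

The added term is zero at n = 1, 2, 3, so p(1) = 1, p(2) = 2, p(3) = 3. But at n = 4, the correction contributes \dfrac{119}{6} \cdot 3 \cdot 2 \cdot 1 = 119, so p(4) = 4 + 119 = 123. Swap \dfrac{119}{6} for a different coefficient and p(4) becomes anything you like.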

The point generally taken from this mathematical fact is that any finite collection of data can be explained in an infinite number of different ways. This makes the same basic point the correlation-causation fallacy makes: just because you think you see a pattern doesn’t mean the pattern is actually there. There could be no pattern at all, or it could be a quite different pattern than you thought.

Example 3: The President

This one is a rather famous example, and in fact has its own Wikipedia article, which I will link to a bit later. Your goal is to guess which American President I am talking about.

  • Elected to Congress in ’46 and to the presidency in ’60.
  • Played a significant role in a key step towards civil rights for African American people.
  • Lost a son while living in the White House.
  • Assassinated by a gunshot to the head while sitting next to his wife.
  • His successor was President Johnson.

Which President do you think I’m talking about? The truth is, surprisingly, that I didn’t give you enough information, because all of those facts are true of both Abraham Lincoln and John F. Kennedy. In fact, there are a great many more similarities between them, which you can check out yourself. Should we then assume, because there are a shocking number of commonalities between the two, that one of them is a myth/legend based on the other? That is, did someone use the story of Abraham Lincoln to invent a story about JFK, or vice versa? Although I’ve never met either of these people, and never could, I’m about as sure as I can be that both of them really did exist and really did have these features.

This is a sort of historicized version of the correlation-causation fallacy, or perhaps you could call it a post hoc ergo propter hoc fallacy, which is a sort of after-therefore-caused-by fallacy. Regardless of how you want to classify this fallacy, I think it illustrates the main points about correlation-causation fallacies.

  • Correlation: The number of common features shared by Abraham Lincoln and John F. Kennedy is extremely surprising.
  • Causation: None of these common features has any cause-and-effect connection to the others. The features they have in common are real, but their alignment is coincidental – there is no causal link between the two men’s lives.

Example 4: Real Causation – Based On a True Story

I’ve given a couple examples that illustrate the distinction between correlation and causation. But, of course, often there really is causation behind a correlation. To use an example, consider fictional movies that are ‘based on a true story’. Like the example of Abraham Lincoln and John F. Kennedy, the real story and the movie version will have a long list of very detailed parallels. But this time, there will be a clear reason why – because we know the movie is based upon a real person. We know which one came first, and we know how the movie came about (we could talk to the director). By comparing and contrasting based-on-a-true-story narratives with the Lincoln-Kennedy coincidence should begin to give a feel for when a correlation should make us think there is causation going on versus when we should treat the correlation as a coincidence.

Example 5: “Pagan Origins of Christianity”

I’ve seen a fair number of secular people try to claim that the way Christians view Jesus today is just an amalgamation of various stories of pagan gods and/or goddesses. Mithras is a key example that is often used, but many others are mentioned as well. The claim is that, since the supposed lives of these pagan figures share many key similarities with what the four gospels say about Jesus, that the people who wrote the four gospels must have just “copy-pasted” that information over onto Jesus’ life.

Now, I won’t go here into whether or not the lives of these figures actually are similar. From everything I’ve seen, they aren’t at all similar, or are only trivially similar, just as there are trivial similarities between you and any random stranger you don’t know. There ought to be an obvious correlation-causation fallacy going on here, just like with Abraham Lincoln and John F. Kennedy. Just because there are similarities does not mean there is a connection! To show some kind of historical copy-paste like that, you not only have to find key similarities, you also need to find hard evidence that there is a cause-and-effect connection between those. For instance, do the gospel writers have a compelling motive to steal those stories? No, they were Jewish monotheists and would have found paganism repulsive to its very core. Why then would they use it in their Jesus story? That doesn’t make sense – if they were making up a story, they would have copy-pasted from other religious stories they actually liked – say Moses, Elijah, or King David. Other potential motives don’t seem to check out. So, it seems rather unlikely that the gospels were made up out of thin air by stealing bits and pieces from pagan mythology. This is correlation without causation.

There is a similar problem with some people who think Christmas or Easter has pagan origins. I’ll stick with Christmas. Whenever someone argues for this position, they will often list a bunch of similarities between some pagan holiday (perhaps Saturnalia) and Christmas as proof. But that is a textbook example of the correlation-causation fallacy! That’s not good enough. In order to show an actual causal connection, you’d have to show some kind of cause-and-effect link demonstrating how one became the other. For instance, are there early Christian documents that state a link between pagan holidays and Christian ones?

For instance, what about the date of Christmas, December 25th? You can find a lot of people saying that this date was chosen for Jesus’ birth because of ancient pagan holidays celebrated on that day. But this just isn’t why Christians came to believe this dating for Jesus’ birth – it is a correlation, not a causation. The real reason is, to be fair, equally strange, but it isn’t pagan. Jewish believers in this time frame tended to believe that important religious figures would be conceived on the same day of the year as they died – that this sort of ‘perfect symmetry’ ought to exist for key figures in human history. So, if they could figure out what day Jesus died, then they’d (according to this view) also know the day he was conceived, and adding nine months to that gives Jesus’ birthday. Some Christians a couple of hundred years after Jesus’ death estimated that Jesus was crucified on March 25th, hence he would have been conceived on March 25th, and then born nine months later on December 25th. This reasoning is strange, and of course we no longer find it persuasive, but the origin is definitely not a pagan holiday. The idea of symmetry between conception and death was a Jewish idea, not a pagan one, so there seems to be no pagan connection.

Conclusion

The big takeaway here is that patterns and similarities can sometimes be deceiving. It is not enough to find patterns if you want to show that two things are connected – you must show why or how they are connected. But this does not discount the value of pattern-hunting. On the contrary, we should always keep our eyes out for patterns, for very often there is a truth behind the pattern. The point is that we must be humble when we find patterns and not jump to conclusions too quickly. And if these examples weren’t helpful, plenty of others are easy to find.

What Is Philosophy of Religion?

Have you ever watched a public debate or dialogue about religion? I certainly have – many, in fact. I am quite interested in such conversations, although I don’t usually like the heated ones – I prefer the philosophically informed ones. If you haven’t heard any of these types of conversations before, that might sound strange. Philosophically informed religious debate? Isn’t religion all about blind faith? Why would people be debating that? You might also be surprised that academic philosophy has an entire area called philosophy of religion. There are many peer-reviewed journals – such as the International Journal for Philosophy of Religion, the American Catholic Philosophical Quarterly, Faith and Philosophy, and many others – dedicated to just this topic. That seems odd if religion is supposed to be all about personal spirituality. What is going on here, then?

Perhaps none of this is surprising to you. Perhaps it is. Perhaps you have read or written some peer-reviewed articles in these journals. Or perhaps you think the idea of a philosophy journal dedicated to religious topics is an oxymoron. Regardless of where you are, the fact of the matter is that there is such an area of study – in both religious and secular universities, with both religious and non-religious philosophers asking deep and interesting questions about religion. One of the major goals I have for my blog over time is to make some of the conversations going on in these journals accessible to people who don’t have access to such articles. To begin that process, it is important to lay out some basic groundwork about the kinds of conversations that go on in philosophy of religion.

A Few Big Questions Worth Asking

What sorts of questions do philosophers ask in this field? Here are some examples of the kinds of jumping-off points that lead to interesting back and forth exchanges between those who think God exists and those who do not.

  • How do religious groups come to their conclusions? What is the nature of a “central belief” of a religion? How is this similar to/different from central beliefs in other areas?
  • How do religious people arrive at their beliefs? How should a person arrive at their religious beliefs? Which of these ways of arriving at beliefs are rational, and why?
  • Where do other fields (like science, history, etc.) intersect with ideas that play a role in religion? To which religious worldview do these other fields point?
  • What is the nature of religion as compared to science? To what extent do the ideas at the forefront of science actually contradict or agree with core claims of various religious systems, or more broadly, with the generic claim that there is a God?
  • If God does exist, what is God like? For instance, is God omniscient? Omnipotent? If so, what do those words mean?
  • What is the relationship between God and time? Is God beyond time, or the creator of time, or everlasting throughout all time… or some combination of these?
  • What do we make of the possible existence of God in light of evil in the world? Is this good evidence that there is no God? And is there counterevidence from other domains that would indicate that God does exist?

These are all questions I’ve spent a lot of time thinking about (some more than others, to be fair), and I think I can answer all of them to some extent, although they are obviously extremely big questions that can’t be answered easily or quickly. But I hope these examples give an idea of the topics and scope of philosophy of religion.

Having given a brief overview of what philosophy of religion is, we should mention a few things it is not.

  • Philosophy of religion is not theology, although the two have intersection – particularly in asking questions about what God is like. Theology takes as a starting point the truth of a particular religious text or teaching and takes a deeper dive into analyzing and understanding that particular text or teaching. Philosophy of religion can draw from religious texts – for instance, if a philosopher wants to argue that two aspects of Christianity appear in conflict, they should quote the Bible at various points to make their case. But philosophy of religion will not use a religious text in quite the same way as theology does. This is not better or worse, but the distinction is important to realize.
  • Philosophy of religion is open to those who do not practice any religion. In fact, many famous philosophers of religion do not believe that God exists. Graham Oppy would be one of the most famous examples. Likewise, you can be a religious believer and nonetheless study the philosophy of religion objectively – Alvin Plantinga would be a good example.

Having taken some time to understand the big picture of what the philosophy of religion is all about, let’s take some time to unpack some specifics.

Natural Theology

The Stanford Encyclopedia of Philosophy defines natural theology as “the project of using all of the cognitive faculties that are “natural” to human beings – reason, sense-perception, introspection – to investigate religious or theological matters” [1]. In less fancy words, natural theology is the task of talking about religion without relying on appeals to religious institutions or Scriptures as implicitly reliable authorities. For instance, believing the Bible, or the Qur’an, or even believing that God exists does not count as “natural” in this sense of the word. You could, however, use “natural” means in this sense to argue that a human being ought to believe that God exists, or to believe the Bible or the Qur’an.

Mainly, natural theology does not address anything as specific as a single religion, but asks broader questions – we might say questions about different types of religious systems. A few of what we might consider types of religious systems are theism, atheism, and pantheism. Theism is the position that God exists outside of the universe (usually as its Creator), and is the position traditionally held by Christianity, Judaism, and Islam. The most common form of theism is monotheism, the view that only one God exists. Atheism is the position that God does not exist – that there are no gods of any sort. Some common forms of atheism are naturalism (the view that the only things that exist are natural things, which is notoriously hard to define) and materialism (the view that the only things that exist are material things, like space, time, matter, and energy). Pantheism is the view that God exists but is not separate from the universe as the theist thinks – rather, God is the universe (in some sense). Hinduism and Buddhism, as I understand them, often have pantheistic attitudes. “New Age” religious systems are also usually pantheistic or something close to it.

The topic of natural theology involves looking at all the “natural” evidence available to us to determine which of these systems is best supported by the evidence. As a Christian, I believe that theism is best supported by the evidence. As you’d expect, I can’t go into much detail in just one article about why I think so. But I can give some samples of the sorts of conversations that would lead in that direction. I’ll do this by listing some general types of philosophical arguments that philosophers of religion offer in favor of theism.

Cosmological Arguments: Cosmological arguments for God’s existence are arguments that take facts about the universe (cosmos) as their starting point. The philosopher might argue that certain facts about astrophysics, cosmology, and the universe would appear very counterintuitive and unlikely if theism were not true.

Moral Arguments: Moral arguments for God’s existence begin with the idea that ‘right’ and ‘wrong’ are not just things we’ve made up, but real truths about the way reality is – generosity really is good and murder really is evil. Moral arguments contend that certain facts about this moral realm strongly suggest either that God is the ultimate source of the Good, or that God is the reason we as finite physical creatures are capable of truly knowing that there is a ‘moral realm’ at all.

Teleological Arguments: Teleological arguments are also sometimes called ‘design arguments’. They generally come in two flavors – biological and cosmological. In both cases, the idea is that when we look at reality, it looks like things could not have been this way by random chance, because when you calculate the relevant probabilities (say, that the universe could permit intelligent life anywhere at all), they are much too low to be reasonable. This gives evidence that this universe was not in fact a random accident, but that it was designed to produce life by a Great Engineer, so to speak.

Ontological Arguments: These can be really meta and confusing. The word ‘ontological’ refers to ‘existence’. All ontological arguments use modal logic – logical forms that deal with whether things had to be a certain way or whether they might have been some other way – and use relatively simple steps within that complicated framework to attempt to argue that there has to be something that literally must exist, and that this something is most reasonably something like God as we normally think of God.

There are also topics within natural theology that one might leverage as evidence against God’s existence. The two most famous are the Problem of Evil, which roughly asks, “Why would a good God let so much evil happen?”, and the Hiddenness of God, which argues that if God existed, we should expect to see clearer and much more frequent evidence of His existence than we in fact have. These are also quite serious points of discussion, and there are living philosophers who have made giant leaps of progress on even these questions (see Alvin Plantinga’s work on the so-called logical problem of evil).

I’m not trying to defend any of these here – I’m sure my reader will probably have opinions of these arguments. That is good. But whatever your opinion is, know that there are people who agree with you who make a lot of good points, and people who disagree with you that do too. This is no walk in the park – and why should we think it is? This is one of the biggest questions of all human existence – whether or not God exists. We would be foolish to expect these matters to be simple.

Epistemology of Religion

This is harder for me to summarize, because epistemology tends to get really abstract in certain ways. Epistemology is the fancy philosophy term for asking questions about the way we think – especially our beliefs and knowledge. So, religious epistemology asks questions about how we come to believe things about religion (and how, if true, we come to know those things).

I’ve read a book recently in religious epistemology by Notre Dame philosopher Alvin Plantinga called Knowledge and Christian Belief, and I figure that a rapid-fire overview of this book will be a good introduction to the sorts of conversations that might be had.

The book begins with an overview of various figures in pop culture – Richard Dawkins, Karl Marx, and Sigmund Freud would be some of the key examples – who claim that all religious belief is irrational, by which they mean that even if it were true, you still shouldn’t believe it because there is something ‘defective’ about it. Plantinga’s goal is to explain not only how belief in Christianity (and theism more broadly) can be rational, but how it can be rational even in the absence of the ‘evidence’ most people would want to see. Realize that by rational, Plantinga means roughly that a belief ‘makes sense given what you know’, so perhaps you could rationally be either an atheist or a theist in the same situation if you had some decent evidence on both sides.

Recall, we are not here asking whether Christianity is true or not – we are asking, if it were true, how could we come to know that it is true? Plantinga’s view is, in a very simplified form, that we come to know that Christianity is true by a kind of testimony that witnesses to us ‘in our hearts’, so to speak. Just as we come to know our parents exist because we communicate with them and recognize that they seem to have mental lives just like we do, so we come to realize that God exists through a sort of communication with Him. This communication differs from human-to-human communication, but since God is not very much like your next-door neighbor, that is hardly surprising. God, as our creator, created us with an inner sense that longs for something more, and which naturally responds to certain aspects of the world (perhaps a particularly beautiful sunset) by feeling grateful – which points towards Someone-To-Be-Grateful-To, and that would be God. In Christianity, we actually talk about a ‘relationship with God’ because of this type of connection, and we call the communication the ‘inner witness of the Holy Spirit’.

An important point to notice – we didn’t mention external evidence at all. But Plantinga’s overall point is that we don’t need to appeal to any such evidence. You don’t have to appeal to any evidence other than your relationship with your parents to prove to yourself that they exist – although you could bring such evidence, you shouldn’t need to! You know them personally. That is enough to prove to you that they exist. Perhaps someone else might not be convinced, but that isn’t your problem.

Again, I’ve way, way oversimplified Plantinga’s viewpoint. It is a lot more deep and nuanced than what I’ve presented here. But notice what I have done – I’ve given an account of what it means to believe a religion, where that belief is supposed to come from, and whether or not such a belief, if true, makes sense to believe. I also briefly mentioned an atheistic view on religious epistemology, although there are certainly other atheistic perspectives too, and Marx and Freud didn’t quite believe the same thing. This the goal of religious epistemology.

Conclusion

There is plenty I’ve left out here. This is a large field of study, and is very active area of research in modern times. There has been a resurgence in theistic philosophers as well, who have (among other things) updated and improved all the arguments for God’s existence and created a variety of new arguments as well. There is a lot of interesting material out there, and I hope to share as much of that material as I can.

Reference

[1] “Natural Theology and Natural Religion,” The Stanford Encyclopedia of Philosophy, https://plato.stanford.edu/entries/natural-theology/

Locating Peaks and Valleys with Derivatives (Explaining Calculus #9)

We have now gone through a couple of discussions on how to actually calculate the value of a derivative. This is great – after all, what good is a shiny new tool if you don’t know how to operate it? But we are done with that now. Having finished our tutorial on computing derivatives, we can dive into the depths of what derivatives are capable of telling us. Since the initial motivation behind derivatives is all about change, you might think I’m about to write about calculations of how things change over time. That is partly right, but more accurately, I’m choosing to show how we can go beyond “just” talking about changes over time by using the idea of change along the way.

What I plan to discuss here is a broad idea called optimization, and we will explore how derivatives play a key role in this theory.

What are “Peaks” and “Valleys”?

The type of math called optimization talks about finding maximum and minimum values. For example, if you are a business owner and found a function that would predict your profits, you would want to maximize that function. Or perhaps you want to build a fence around some grass for your animals, but don’t want to use any more fencing than you have to. You want to minimize the amount of fencing material you use. These ideas of maximizing and minimizing – making a value as large or as small as possible – are what we want to talk about.

But why the terms peak and valley? Well, the maximum height you find on a mountain is the peak of the mountain. The lowest point in a landscape is a valley. So, if we translate the very numerical ideas of maximizing and minimizing into visual language, we are learning how to find the peaks and valleys in our mathematical landscape. And we can make use of numerical values of derivatives to help us.

Connecting Peaks/Valleys with Derivatives

Imagine you draw a graph in a U shape. In fact, you can just look at the letter U and pretend it is a graph. Let’s think about finding the minimal point on U – that is, the lowest point on the letter. Now, imagine drawing a tangent line. It should look like you just underlined the letter U. In other words, it should be a horizontal line! Now, remember what derivatives are – slopes of tangent lines. What is the slope of a horizontal line? Well, that would be zero. So, we’ve arrived at a rather interesting observation through the letter U: the minimum value of U happened at a place where the tangent line had a slope of zero.

Now, let us instead use the letter V instead of U. What about the minimum point on V? Well, again, we can ask questions about the tangent line. But this time, we run into a hitch. The lowest point on the letter V is a sharp corner, and sharp corners don’t have tangent lines. So there isn’t any way we could even try to define a derivative at that point. This is our second interesting observation: the minimum value of V happened at a place with no tangent lines.

Now, here is a big, big point. Our observations about the letters U and V are actually comprehensive statements about the max and min values of all functions! To make this statement more precise, I need to write down a couple of definitions in more “mathy” language.

We begin by defining minimum and maximum more precisely. In fact, we will give two subtly different definitions – the two types we call absolute and relative. In intuitive terms, an absolute max value is at least as big as every other value of the function (and an absolute min is at least as small). On the other hand, a relative max/min is one that “looks like” an absolute max/min if you “zoom in” far enough. To see what I mean here, imagine a mountain range with a variety of peaks. Every single peak is a relative maximum, because if you took binoculars and narrowed your focus on just that peak, it would look to you like the highest point in the mountain range. On the other hand, the absolute maximum of the mountain range is whichever peak is actually higher than all the other peaks. Notice that you can have many, many relative maximum values, but only one absolute maximum value (well, there could be ties, like two different mountain peaks that are equally tall, but there is still only one largest height, shared by the two peaks).

Now, let’s do some defining. We begin first with absolute max/min:

Definition: Let f(x) be a function. The function f has an absolute maximum at the point x if f(x) \geq f(y) for every value y. If x is an absolute maximum point, then f(x) is the absolute maximum value of f. Similarly, f(x) has an absolute minimum at the point x if for every value y, f(x) \leq f(y), and f(x) is the absolute minimum value of f.

Notice that the definition makes its claim about every possible value of y. Also notice that “points” refer to inputs and “values” refer to outputs – perhaps the visual f(\text{point}) = \text{value} is of use to show what I mean here. On the other hand, consider the following definition of relative max/min values, and note the key differences.

Definition: Let f(x) be a function. Then f(x) has a relative maximum at the point x if, for every point y close enough to x, f(x) \geq f(y). Similarly, f(x) has a relative minimum at the point x if for all y close enough to x, f(x) \leq f(y).

Notice the presence of “close enough” in this definition. That is the key difference. As is suggested by the language, absolute is a stronger term than relative, which I can make explicit by showing that absolute maxima/minima are also relative maxima/minima.

Fact: If f(x) has an absolute maximum at x, then f(x) also has a relative maximum at x.

Proof: Since x is an absolute maximum of f, f(x) \geq f(y) for every value y for which f makes any sense. This, of course, means that f(x) \geq f(y) for all y close to x. So, f(x) has a relative maximum at x.

The same kind of fact works for absolute and relative minima; the proof is so similar that I’ll let you fill in the blanks yourself. Before we can finalize the big claim I made about “U and V,” I first need to take a detour into a related topic.

Increase and Decrease

We now move into a section more directly related to the idea of “change over time” that originally led us to the idea of a derivative in the first place. The idea here is fairly straightforward. As time progresses, a quantity could go up, it could go down, or it could stay the same. This is actually the complete idea. In mathematics, we say increasing instead of going up, decreasing instead of going down, and constant for staying the same.

How can we frame these ideas in equation-language? It is fairly easy if we use an analogy. Imagine you are walking on a mountain, that x represents how long you’ve been walking for, and f(x) represents how high you are. Intuitively, the function f(x) (your height) is increasing if after each step you take, you are higher than you were before. The progression of time can be captured pretty easily – time y is after time x if y > x. So, in order for each step to leave us higher than we started, we need f(y) > f(x) to be true whenever y > x is also true.

But, remember, we are talking about calculus too. The idea of “zooming in” will almost always be important. So, while the way we described the idea of increasing is completely right, we want to be a bit more specific. We want to be able to say where f(x) is increasing and where it is not. To do this, we allow our where to be some interval of time, say [a,b]. Now, we can be more specific in our definition.

Definition: Let f(x) be a function defined on the interval [a,b]. Then f(x) is increasing on the interval [a,b] if for any values of x,y between a and b, if y > x then f(y) > f(x).

It is pretty easy using the same intuition to come up with definitions for decreasing on an interval and constant on an interval. If we are going down a mountain, then with each step (time going forward) our height decreases (function value going down). Just as before, time going forward tells us that we want to think about y > x, and height going down whenever time goes forward tells us that instead of f(y) > f(x), we need f(y) < f(x). So, we have an easy definition.

Definition: Let f(x) be a function defined on the interval [a,b]. Then f(x) is decreasing on the interval [a,b] if for any values of x,y between a and b, if y > x then f(y) < f(x).

Notice these two definitions are verbatim the same except for the word ‘decreasing’ and the change from > to < at the very end. Similar considerations to these lead us to a definition for constant in an interval.

Definition: Let f(x) be a function defined on the interval [a,b]. Then f(x) is constant on the interval [a,b] if for any values of x,y between a and b, f(x) = f(y).

Notice the similarity of all three definitions, and notice where they differ. I find it is always helpful to think of the function as hiking up and down a mountain.
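If you like, you can even test these definitions numerically. Here is a crude sketch in Python (it samples finitely many points, so it is a sanity check of the definition rather than a proof):

def is_increasing(f, a, b, samples=1000):
    # crude numeric check of the "increasing on [a, b]" definition
    xs = [a + (b - a) * i / samples for i in range(samples + 1)]
    return all(f(xs[i]) < f(xs[i + 1]) for i in range(samples))

print(is_increasing(lambda x: x**2, 0, 2))   # True
print(is_increasing(lambda x: x**2, -2, 0))  # False (it is decreasing there)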

Connecting Increasing/Decreasing with Derivatives

By using derivatives, we can zoom in even further. So far, we’ve only defined increase/decrease/constant in terms of intervals of numbers. But what about points? Those are even smaller than intervals. Can we somehow extend these ideas to minuscule points?

Actually, yes, we can. The easiest place to begin is the idea of being constant at a point. In the mountain analogy, this is the peak of the mountain: everything levels out when you reach such a point. If you think in terms of tangent lines, the tangent line is totally flat there. That is just what it looks like to be constant, or flat, at a single point. Imagining a rounded-off top to a mountain or hill is a good visual aid here. So, in terms of derivatives, we have a definition for being constant at a point.

Definition: A function f(x) is constant at the point a if f^\prime(a) = 0.

We can move forward now and say that increasing on an interval means increasing at every point inside that interval. Makes sense – every point on a red wall is also red; it’s analogous to that. More specifically, f(x) is increasing at the point x if x is inside some interval [a,b] (that is, a < x < b) on which the old definition tells us that f(x) is increasing. But now, look at the definition of the derivative,

f^\prime(x) = \lim\limits_{h \to 0} \dfrac{f(x+h) - f(x)}{h}.

Let’s take this new definition of increasing at a point and play with it. Let’s pick an interval [a,b] on which f(x) is increasing. Since the variable h is part of a limiting process, we can make it as small as we want. So, we can make it small enough that x+h is always inside of [a,b]. This means that the rules of increasing apply here. Since f(y) > f(x) if, and only if, y > x, f(x+h) > f(x) if, and only if, x+h > x. Simplified, f(x+h) > f(x) is equivalent to h > 0. In exactly the same way, f(y) < f(x) is equivalent to y < x, and so f(x+h) < f(x) is equivalent to h < 0.

This may seem a bit weird, but pause and think about what we’ve done. Notice that f(x+h) - f(x) must be positive whenever h > 0, and it must be negative whenever h < 0. In both situations, \dfrac{f(x+h) - f(x)}{h} must always be positive. These values always being positive forces their limit, which is just f^\prime(x), to not be negative. So, f^\prime(x) \geq 0. But we have already reserved f^\prime(x) = 0 as the definition of being constant at a point, so we exclude that case and keep the strict inequality. This gives us a new definition for increasing functions.

Definition: The function f(x) is increasing at the point a if f^\prime(a) > 0.

You might be able to guess the definition of decreasing at a point now. If not, take some time to think about it before reading on.

Definition: The function f(x) is decreasing at the point a if f^\prime(a) < 0.

This discussion has now linked our ideas about increase and decrease to this new calculus language. You might not think we’ve made all that much progress, at least not yet, but I think we have. Notice how much shorter our new definitions are than the old ones! I would consider that a win. They are also easier to read, assuming of course you know what a derivative is. But that isn’t exactly the point. The point is that by cleverly using the concept of a derivative, we made the ideas of increasing and decreasing easier to put into a legitimate definition. And that is always a sign of mathematical progress – a simpler way of saying the same thing is almost always better.

With all of the previous discussion in mind, we can now move on to the main point of the article, that puts all these ideas together.

Finding Peaks and Valleys with Math

We can move forward now with the big idea we mentioned before with the letters U and V. I made a big, big claim that the letters U and V essentially describe every possible type of maximum or minimum value you could ever find. I will now explain this.

Theorem: Let f(x) be a function whose input can be any real number and whose outputs are real numbers (I sometimes use the shorthand f : \mathbb{R} \to \mathbb{R} for this, where \mathbb{R} is a stand-in symbol that represents all the real numbers). Suppose that f(x) has a relative maximum value at the point x = a. Then either f^\prime(a) = 0 or f^\prime(a) does not exist.

Proof: Our claim is that there are only two possibilities. Another way to say this is that, if we are not in one of the two situations, we must be in the other (since no third option exists). Viewing the problem in this manner is a common approach to either-or proofs in mathematics, and it is the approach I use here. I will assume that f^\prime(a) definitely exists. Our goal, then, is to discover that f^\prime(a) = 0 has to be true.

We already know that f(x) has a relative maximum value at x = a. Let’s remember briefly the definition of the derivative of f at a.

f^\prime(a) = \lim\limits_{h \to 0} \dfrac{f(a+h) - f(a)}{h}.

Now, since we are taking a limit, we can assume that h is actually small enough that a+h is close enough to a so that f(a+h) \leq f(a). Then for every small enough value of h, we know that f(a+h) - f(a) \leq 0. Now, the number h itself can be either small and positive or small and negative. If it is small and positive, then \dfrac{f(a+h) - f(a)}{h} \leq 0. If it is small and negative, then \dfrac{f(a+h) - f(a)}{h} \geq 0.

This is a rather curious fact. Remember that the derivative value f^\prime(a) does, by assumption, definitely exist. Since the limit must exist, the approaches from negative and positive values of h cannot contradict each other – if they did, the derivative just wouldn’t exist. This means that both of the inequalities we just figured out must be true of f^\prime(a). That is, we have figured out that

0 \leq f^\prime(a) \leq 0.

It should strike us pretty quickly that the only number between zero and zero is zero. Therefore, f^\prime(a) = 0. This is exactly what I set out to discover, and so we are done proving what we set out to prove.

How to Find Relative Max and Min Values

We can make use of this super-important theorem to actually locate these special peaks and valleys in graphs. And, by using the increasing/decreasing ideas, we can pick out which of the located special points are peaks, which are valleys, and which are merely masquerading as one.

Here is the idea. Our big theorem tells us that the special points we are looking for – relative maximum and minimum points – always have either a zero derivative or a non-existent derivative. So, if we have a function f(x), we can locate all of these points by finding all solutions to f^\prime(x) = 0 and all points where f^\prime(x) doesn’t make any sense but f(x) does make sense. We can then list all of these out – for the kinds of functions we usually meet, there are only a handful of these special points.
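To make the first step concrete, here is a small sketch of my own in Python, using the sympy library (this is just an illustration I’m adding, with an example function of my own choosing – it is not part of the method itself). It finds all solutions of f^\prime(x) = 0:

import sympy

x = sympy.symbols('x')
f = x**3 - 3*x                       # an example function chosen for illustration
f_prime = sympy.diff(f, x)           # the derivative, 3x^2 - 3
special_points = sympy.solve(f_prime, x)
print(special_points)                # prints [-1, 1]

Note that this only finds the f^\prime(x) = 0 solutions; points where f^\prime(x) doesn’t exist would have to be hunted down separately (this example is a polynomial, so there are none).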

Just by the last paragraph, we now have a short list of possible places where we might find maximum or minimum values. But how do we tell which are which? This is where the increasing/decreasing ideas come into play. Let’s start with minimum values – which have the shape of either U or V. Notice that to the left of the minimum point, the curve or line slopes downward, and to the right of the minimum point, it slopes upward. In our discussion earlier, we already pointed out that down-sloping means decreasing, and up-sloping means increasing. We also pointed out that decreasing means f^\prime(x) < 0 and increasing means f^\prime(x) > 0. What does this mean then? Well, if a is one of our special points, and if f^\prime(x) < 0 to the left of a and f^\prime(x) > 0 to the right of a, then the graph of f(x) must be shaped either like a U or a V at the special point a. This means that f(x) definitely has a minimum point at a! In exactly the same way, if the graph is increasing on the left side of a special point and decreasing on the right side, that must mean our graph is either an inverted-U or an inverted-V, which tells us that we have found a maximum point there!
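Here is a sketch of my own of this second test in Python, again leaning on sympy. One caveat baked into it: peeking a fixed step of 1/10 to each side assumes the slope doesn’t change sign again that close to the special point, which you’d want to check in practice.

import sympy

x = sympy.symbols('x')
f = x**3 - 3*x
f_prime = sympy.diff(f, x)           # 3x^2 - 3

step = sympy.Rational(1, 10)         # how far to the left/right we peek
for a in sympy.solve(f_prime, x):
    left = f_prime.subs(x, a - step)     # slope just left of a
    right = f_prime.subs(x, a + step)    # slope just right of a
    if left < 0 and right > 0:
        kind = 'minimum (U or V shape)'
    elif left > 0 and right < 0:
        kind = 'maximum (inverted U or V)'
    else:
        kind = 'masquerading - neither a max nor a min'
    print(a, kind)

For this example the sketch reports a maximum at x = -1 and a minimum at x = 1, which matches what the graph of x^3 - 3x looks like.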

But what if neither of those is true? What if the graph is increasing on both sides of the special point? Well, then, we’ve found a point masquerading as a max or min point, even though it isn’t either. It passed the first test by having a derivative that was 0 or didn’t exist, but it didn’t pass the second test. A good example here would be the letter D. The very right-most part of the letter D has a derivative that does not exist (to be clear about this part, since infinity isn’t a number, a line that goes straight up is said to not have a derivative). So, this point on D passes the first test. But if you zoom in near that point, it definitely is never the highest or lowest point in that zoomed-in window. So it can’t be a relative maximum or minimum point. And so it fails the second test. If you graph the function y = x^3 on a calculator or on the internet, you’ll see the point x = 0 has a zero slope (passes the first test) but isn’t a max or min (and so fails the second test). Again, this is because the graph is going up on both sides of the key point – here \dfrac{dy}{dx} = 3x^2, which is positive on both sides of x = 0.

Conclusion

This is one of the most immediately useful things about calculus. It enables us to maximize and minimize basically anything we want. The real-world implications here are pretty obvious. But to really lay out how you actually pull this sort of thing off, the next post will go through some examples of how to maximize and minimize things when someone hands you an equation.

Homework Problems

Problem 1: The big theorem I proved assumes that the function f(x) can take on any input we want. Think about what would happen if we constrain the acceptable input values. In particular, notice that a graph can’t have a tangent line at an endpoint (because all endpoints are ‘sharp’, and derivatives are designed to find locations that are not ‘sharp’). Can a graph have its absolute maximum value at an endpoint?

Problem 2: In the appendix, I go through the method for finding the location of the peak or valley of a parabola. Convince yourself of a method for figuring out whether the point you find is going to be a maximum or a minimum. Then, convince yourself that this is not just a relative maximum or minimum, but an absolute maximum or minimum.

Problem 3: Find all the relative maximum and minimum points of the graph of f(x) = x^3 - x + 2 without using a graph. Then, check your answer by using a graph.

Problem 4: I claimed towards the end of the article that y = x^3 has a point that pretends to be a max/min point but actually isn’t. Use the second test (looking at increasing/decreasing behavior, that is, the sign of \dfrac{dy}{dx} on either side of the point) to understand why y = x^3 doesn’t have any max or min points. What is the special property of the graph that prevents it from having max and min points? Can you come up with other graphs that have this same special property?

Challenge Problem: For those looking for a challenge, find similarities between the two tests for max/min points from this post and the Intermediate Value Theorem that was used in Explaining Calculus #4 to prove that equations have solutions.

Appendix

What I want to do here is show how you can use this idea to find the x-value of the peak/valley point on any parabola.

The equation of a parabola is always given by a quadratic equation f(x) = ax^2 + bx + c for some constant numbers a, b, c (with a \neq 0). We’ve already mentioned that polynomials always have derivatives, and so the case in the “big theorem” where f^\prime(x) does not exist cannot rear its head here. So, to find the peak/valley of the graph, the only thing we need to do is solve the equation f^\prime(x) = 0.

To do this, let’s first find the derivative of f(x). Using the standard method for polynomials we developed earlier, we can quickly find that

f^\prime(x) = a \dfrac{d}{dx}[x^2] + b \dfrac{d}{dx}[x] + \dfrac{d}{dx}[c] = a \cdot (2x) + b \cdot (1) + 0 = 2ax + b.

Then to solve the equation f^\prime(x) = 0, we solve 2ax + b = 0. Subtracting b from both sides gives 2ax = -b, and dividing both sides by 2a gives x = \dfrac{-b}{2a}. Therefore, if we want to find the peak/valley of some parabola, we can just check that value of x.
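As a quick worked example of my own, take the parabola f(x) = 2x^2 - 8x + 1. Here a = 2 and b = -8, so the formula gives x = \dfrac{-(-8)}{2 \cdot 2} = \dfrac{8}{4} = 2. And sure enough, f^\prime(x) = 4x - 8, which equals zero exactly when x = 2.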

Critical Thinking Toolkit: Inference to the Best Explanation

I’ve talked about deductive reasoning and inductive reasoning before. Both of these are important mechanisms we use for arriving at conclusions. The first makes use of the rules of logic – which we can essentially view as limiting ourselves to the definitions of words like true and false and not allowing ourselves any other resources. The second we can view as showing that something is probably true by accumulating lots of examples of that truth occurring in various different situations. Both of these are important and useful, but there is yet another way of thinking that is also essential – which is often called abduction or abductive reasoning.

What is Abduction?

The deductive method emphasizes the rules of logic and doesn’t allow much else into the picture. The inductive method emphasizes accumulating many examples to point towards a specific explanation of those examples. Abduction is very different from both of these. The idea with abduction is that we want the best explanation of the information available to us. Let’s give an example before we try to get too specific.

Suppose that you walk outside in the morning and all of the ground around you is wet, but it is not currently raining. You will probably immediately think that it finished raining not too long ago. But, for the sake of discussion, how is it that you reject the idea that someone came in the middle of the night and dumped buckets of water everywhere you can see? This idea clearly explains the wetness of the ground, and since there isn’t anything we can “repeat” in an experimental way, we can’t use induction. And yet, I doubt any of you reading this paragraph would give up your belief that it rained not too long ago based on my alternative explanation.

What you are doing subconsciously here is using a kind of abductive argument. You are thinking to yourself something like “Well, yes, your alternative does explain why the ground is wet. But the explanation I originally came up with is better, so I would need you to show me some proof before I give up my belief that it rained yesterday.” You think this way because the hypothesis of rain is better at explaining what you know than the hypothesis of the buckets of water.

Now, suppose that someone showed you a time-stamped video from last night of someone dumping buckets and buckets of water all over the area where you are standing. In that situation, you may well shift what you believe. And if you walk a little way away and find that the ground there is entirely dry, that will probably convince you. This is because you’ve received new information that doesn’t make sense if it actually had rained earlier, but does make sense if someone had come by while you slept and dumped water everywhere near you.

This is exactly the idea behind abduction. Abductive arguments collect all the information we know about a certain situation, formulate the various ways that we might explain that information, and then try to find which of those explanations does the “best job” of explaining what you see. When all you knew was that the ground was wet, rain made the most sense. But when you found dry ground nearby and saw the video, you gained more information, and that new information caused your belief to shift. This is the heart of the abductive method.

But, we must ask, how do we know which option is the best one? What do we mean by best anyways?

What Criteria Do We Use?

When using abductive arguments, we need to be clear about how we are judging which explanation is the best. For example, it wouldn’t be acceptable to use your own emotional preferences as part of what you mean by “best”, since that makes the entire discussion depend on your personal opinion, which is exactly what we are trying to avoid. But what criteria can we use then? There are a lot of them, and they may apply to greater or lesser extents depending on the circumstances. The best way to approach this situation (notice I used the word best… what do I mean by best here?) is probably to make a list of various criteria that are often used in abductive arguments. These criteria are often called explanatory virtues, because… well… it is a “good” thing for an explanation to have them. Fairly intuitive name. Here are some examples of qualities that might count as explanatory virtues.

  • Explanatory Power: How well does the theory explain what we know?
    • Example: Einstein’s theory of gravity predicts orbits more accurately than Newton’s theory of gravity does.
  • Explanatory Scope: How many different things that we know does the theory explain?
    • Example: Einstein’s theory of gravity does everything old theories of gravity do, and it does even more.
  • Simplicity: How many new things do we have to accept to adopt this theory?
    • Example: If someone hid a practical joke in your room, it was probably one or two people, not one hundred. And it was probably your brother, not aliens.
  • Coherence: This theory meshes well with things I already know about the world.
    • Example: If you know I like math, and you then learn that I have a lot of math and science books in my house, that fits well with what you already know about me. If you know I hate math but learn that I own dozens of math books, that does not fit well.
  • Not Ad-Hoc: Does this theory avoid seeming arbitrary or contrived? Or does it feel like someone ‘pulled it out of thin air’?
    • Example (of an ad hoc explanation): Explaining Christmas presents by appealing to Santa Claus, a person whose sole purpose of existence is to make the presents appear under the Christmas tree.
  • High Prior Probability: Is there something intuitively (or logically) plausible about this theory? Does it ‘make a lot of sense’?
    • Example: Even before you count up ballots, you pretty much know who California voted for. Same with Mississippi.
  • Multiply Evidenced: Is there evidence for this theory from a variety of different sources and different types of sources?
    • Example: If your theory has evidence from cosmology, biology, history, and archaeology, you’re looking pretty good.

Why Does Abduction Work?

It is actually very strange that abduction works at all. This is because, on the level of deductive logic, abduction is a fallacy. When we use abduction, we are considering some reality B and trying to discover the cause of B (call this cause A). In terms of logic, we are saying something like “since we know B, and A best explains B, A is probably true.” This is extremely similar to the fallacy of affirming the consequent – the fallacy of saying that if we know B, and we know that A implies B, then we know A. This reasoning is invalid, and yet in the case of abduction it tends to work pretty well (although it is not perfect). How is this so?

If I’m honest, I don’t think we can ever be absolutely certain about this. Especially because, sometimes, it doesn’t work. Sometimes life is more complicated than that, and sometimes we don’t have all the evidence. And even if we do – why should the qualities that we humans happen to count as explanatory virtues have anything to do with whether a theory possessing those virtues is really true?

This reminds me of Einstein’s keen observation about science: “The most incomprehensible thing about the world is that it is comprehensible.” It is a daunting question, in my opinion. I think that something like theism would help, as in that case the universe may well have been created by a being who made it orderly and who made human minds in such a way as to discover that order. But, nonetheless, I believe that we should all continue trusting that abduction works, even if we don’t always fully understand why. It is a great blessing that the way we think about goodness and rationality so often lines up with the world in which we live. And using abduction well and thoughtfully is a way we can express our gratitude for that gift.