Conditionals
Use if
statements to control whether or not a block of code is executed
- An
if
statement (more properly called a conditional statement) controls whether some block of code is executed or not. - Structure is similar to a
for
statement:- First line opens with
if
and ends with a colon - Body containing one or more statements is indented (usually by 4 spaces)
mass = 3.54 if mass > 3.0: print(mass, 'is large') mass = 2.07 if mass > 3.0: print(mass, 'is large')
3.54 is large
Conditionals are often used inside loops
- Not much point using a conditional when we know the value (as above).
- But useful when we have a collection to process.
masses = [3.54, 2.07, 9.22, 1.86, 1.71] for m in masses: if m > 3.0: print(m, 'is large')
3.54 is large 9.22 is large
Use else
to execute a block of code when an if
condition is not true
else
can be used following anif
.- Allows us to specify an alternative to execute when the
if
branch isn't taken.
masses = [3.54, 2.07, 9.22, 1.86, 1.71] for m in masses: if m > 3.0: print(m, 'is large') else: print(m, 'is small')
3.54 is large 2.07 is small 9.22 is large 1.86 is small 1.71 is small
Use elif
to specify additional tests
- May want to provide several alternative choices, each with its own test.
- Use
elif
(short for "else if") and a condition to specify these. - Always associated with an
if
. - Must come before the
else
(which is the "catch all").
masses = [3.54, 2.07, 9.22, 1.86, 1.71] for m in masses: if m > 9.0: print(m, 'is HUGE') elif m > 3.0: print(m, 'is large') else: print(m, 'is small')
3.54 is large 2.07 is small 9.22 is HUGE 1.86 is small 1.71 is small
Conditions are tested once, in order
- Python steps through the branches of the conditional in order, testing each in turn.
- So ordering matters.
grade = 85 if grade >= 70: print('grade is C') elif grade >= 80: print('grade is B') elif grade >= 90: print('grade is A')
grade is C
- Does not automatically go back and re-evaluate if values change.
velocity = 10.0 if velocity > 20.0: print('moving too fast') else: print('adjusting velocity') velocity = 50.0
adjusting velocity
- Often use conditionals in a loop to "evolve" the values of variables.
velocity = 10.0 for i in range(5): # execute the loop 5 times print(i, ':', velocity) if velocity > 20.0: print('moving too fast') velocity = velocity - 5.0 else: print('moving too slow') velocity = velocity + 10.0 print('final velocity:', velocity)
0 : 10.0 moving too slow 1 : 20.0 moving too slow 2 : 30.0 moving too fast 3 : 25.0 moving too fast 4 : 20.0 moving too slow final velocity: 30.0
Compound Relations Using and
, or
, and Parentheses
Often, you want some combination of things to be true. You can combine
relations within a conditional using
and
and or
. Continuing the example
above, suppose you havemass = [ 3.54, 2.07, 9.22, 1.86, 1.71] velocity = [10.00, 20.00, 30.00, 25.00, 20.00] i = 0 for i in range(5): if mass[i] > 5 and velocity[i] > 20: print("Fast heavy object. Duck!") elif mass[i] > 2 and mass[i] <= 5 and velocity[i] <= 20: print("Normal traffic") elif mass[i] <= 2 and velocity[i] <= 20: print("Slow light object. Ignore it") else: print("Whoa! Something is up with the data. Check it")
Just like with arithmetic, you can and should use parentheses whenever there
is possible ambiguity. A good general rule is to always use parentheses
when mixing
and
and or
in the same condition. That is, instead of:if mass[i] <= 2 or mass[i] >= 5 and velocity[i] > 20:
write one of these:
if (mass[i] <= 2 or mass[i] >= 5) and velocity[i] > 20:
if mass[i] <= 2 or (mass[i] >= 5 and velocity[i] > 20):
so it is perfectly clear to a reader (and to Python) what you really mean.
Tracing Execution
What does this program print?
pressure = 71.9 if pressure > 50.0: pressure = 25.0 elif pressure <= 50.0: pressure = 0.0 print(pressure)
Trimming Values
Fill in the blanks so that this program creates a new list
containing zeroes where the original list's values were negative
and ones where the original list's values were positive.
original = [-1.5, 0.2, 0.4, 0.0, -1.3, 0.4] result = ____ for value in original: if ____: result.append(0) else: ____ print(result)
[0, 1, 1, 1, 0, 1]
Processing Small Files
Modify this program so that it only processes files with fewer than 50 records.
import glob import pandas as pd for filename in glob.glob('data/*.csv'): contents = pd.read_csv(filename) ____: print(filename, len(contents))
Initializing
Modify this program so that it finds the largest and smallest values in the list
no matter what the range of values originally is.
values = [...some test data...] smallest, largest = None, None for v in values: if ____: smallest, largest = v, v ____: smallest = min(____, v) largest = max(____, v) print(smallest, largest)
What are the advantages and disadvantages of using this method
to find the range of the data?
Using Functions With Conditionals in Pandas
Functions will often contain conditionals. Here is a short example that
will indicate which quartile the argument is in based on hand-coded values
for the quartile cut points.
def calculate_life_quartile(exp): if exp < 58.41: # This observation is in the first quartile return 1 elif exp >= 58.41 and exp < 67.05: # This observation is in the second quartile return 2 elif exp >= 67.05 and exp < 71.70: # This observation is in the third quartile return 3 elif exp >= 71.70: # This observation is in the fourth quartile return 4 else: # This observation has bad data return None calculate_life_quartile(62.5)
2
That function would typically be used within a
for
loop, but Pandas has
a different, more efficient way of doing the same thing, and that is by
applying a function to a dataframe or a portion of a dataframe. Here
is an example, using the definition above.data = pd.read_csv('data/gapminder_all.csv') data['life_qrtl'] = data['lifeExp_1952'].apply(calculate_life_quartile)
There is a lot in that second line, so let's take it piece by piece.
On the right side of the
=
we start with data['lifeExp']
, which is the
column in the dataframe called data
labeled lifExp
. We use the
apply()
to do what it says, apply the calculate_life_quartile
to the
value of this column for every row in the dataframe.