NumPy Recap :
    import numpy as np

    np_height = np.array([1.73, 1.68, 1.71, 1.89, 1.79])
    np_weight = np.array([65.4, 59.2, 63.6, 88.4, 68.7])
    bmi = np_weight / np_height ** 2

    bmi             # array([ 21.852, 20.975, 21.75 , 24.747, 21.441])
    bmi > 23        # array([False, False, False, True, False], dtype=bool)
    bmi[bmi > 23]   # array([ 24.747])
Numeric comparison :
    2 < 3 -> True
    2 == 3 -> False
    2 <= 3 -> True
    3 <= 3 -> True

    x = 2
    y = 3
    x < y -> True
Other comparisons
    'carl' < 'chris' -> True
    3 < 'chris' -> TypeError: unorderable types: int() < str()
    3 < 4.1 -> True
    bmi > 23 -> array([False, False, False, True, False], dtype=bool)
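The three cases above can be checked in a short script. This is only a sketch: the variable names are illustrative, and the text of the TypeError message depends on the Python version, so it is printed rather than assumed.

```python
# Strings are compared character by character, by Unicode code point
# (which is alphabetical order for lowercase ASCII letters)
strings_ordered = "carl" < "chris"
print(strings_ordered)  # True: "a" sorts before "h"

# Integers and floats can be compared with each other
numbers_ordered = 3 < 4.1
print(numbers_ordered)  # True

# An int and a str cannot be ordered: Python 3 raises a TypeError
try:
    3 < "chris"
    mixed_comparison_failed = False
except TypeError as error:
    mixed_comparison_failed = True
    print("TypeError:", error)
```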
Boolean Operators
- and
- or
- not
    True and True -> True
    False and True -> False
    True and False -> False
    False and False -> False
    x = 12
    x > 5 and x < 15 -> True

    True or True -> True
    False or True -> True
    True or False -> True
    False or False -> False
    y = 5
    y < 7 or y > 13 -> True

    not True -> False
    not False -> True
With NumPy arrays you cannot use the boolean operators and, or and not as is: they raise a ValueError on arrays with more than one element. Use the element-wise NumPy functions instead :
- logical_and()
- logical_or()
- logical_not()
    bmi -> array([ 21.852, 20.975, 21.75 , 24.747, 21.441])
    bmi > 21 -> array([ True, False, True, True, True], dtype=bool)
    bmi > 21 and bmi < 22 -> ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
    np.logical_and(bmi > 21, bmi < 22) -> array([ True, False, True, False, True], dtype=bool)
    bmi[np.logical_and(bmi > 21, bmi < 22)] -> array([ 21.852, 21.75, 21.441])
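The other two functions in the list work the same way. Here is a minimal sketch, using the rounded bmi values printed in these notes. The thresholds and the names `low_or_high` and `not_high` are chosen only for illustration.

```python
import numpy as np

# BMI values as printed earlier in these notes (rounded to 3 decimals)
bmi = np.array([21.852, 20.975, 21.75, 24.747, 21.441])

# Element-wise OR: values below 21 or above 24
low_or_high = np.logical_or(bmi < 21, bmi > 24)

# Element-wise NOT: values that are not above 23
not_high = np.logical_not(bmi > 23)

print(bmi[low_or_high])  # keeps 20.975 and 24.747
print(bmi[not_high])     # keeps every value except 24.747
```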
Conditional Statements
- if
- else
- elif
    z = 3
    if z % 2 == 0 :
        print("z is divisible by 2")
    elif z % 3 == 0 :
        print("z is divisible by 3")
    else :
        print("z is neither divisible by 2 nor by 3")

    if condition :
        expression
    elif condition :
        expression
    else :
        expression
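To try the same if / elif / else chain on several values, it can be wrapped in a function. This is a sketch, and the function name `describe_divisibility` is hypothetical. Note that only the first matching branch runs, so 6 is reported as divisible by 2 even though it is also divisible by 3.

```python
def describe_divisibility(z):
    """Return a message saying whether z is divisible by 2 or by 3."""
    if z % 2 == 0:
        return "z is divisible by 2"
    elif z % 3 == 0:
        # Only reached when the first condition was False
        return "z is divisible by 3"
    else:
        return "z is neither divisible by 2 nor by 3"


for value in (3, 6, 7):
    print(value, "->", describe_divisibility(value))
```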
Filtering Pandas DataFrame
We reuse the brics data and select the countries with an area over 8 million km2. This is done in 3 steps :
- Select the area column
- Do comparison on area column
- Use result to select countries
    import pandas as pd

    brics = pd.read_csv("path/to/brics.csv", index_col = 0)
    #          country    capital    area  population
    # BR        Brazil   Brasilia   8.516      200.40
    # RU        Russia     Moscow  17.100      143.50
    # IN         India  New Delhi   3.286     1252.00
    # CH         China    Beijing   9.597     1357.00
    # SA  South Africa   Pretoria   1.221       52.98

    is_huge = brics["area"] > 8
    brics[is_huge]

    # or in one line :
    brics[brics["area"] > 8]
    #     country   capital    area  population
    # BR   Brazil  Brasilia   8.516       200.4
    # RU   Russia    Moscow  17.100       143.5
    # CH    China   Beijing   9.597      1357.0
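The three steps can also be run without the CSV file. The sketch below rebuilds the table inline from the values shown in these notes, so that it is self-contained. The intermediate names `area` and `huge_countries` are illustrative.

```python
import pandas as pd

# Rebuild the brics table inline (values copied from the notes above)
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
        "area": [8.516, 17.100, 3.286, 9.597, 1.221],
        "population": [200.40, 143.50, 1252.00, 1357.00, 52.98],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# Step 1: select the area column (a Series)
area = brics["area"]

# Step 2: compare it with 8 (a boolean Series, one value per row)
is_huge = area > 8

# Step 3: use the boolean Series to keep only the matching rows
huge_countries = brics[is_huge]
print(huge_countries)
```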
To combine several conditions, use the NumPy logical functions, since Python's and / or do not work on a whole column :
    import numpy as np

    # Will return a Series :
    np.logical_and(brics["area"] > 8, brics["area"] < 10)

    # Will return a DataFrame :
    brics[np.logical_and(brics["area"] > 8, brics["area"] < 10)]
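pandas also accepts the element-wise operators & (and), | (or) and ~ (not) on boolean Series, as long as each comparison is wrapped in parentheses. The sketch below compares both spellings on an inline copy of the area column. The names `with_numpy` and `with_operator` are illustrative.

```python
import numpy as np
import pandas as pd

# Inline copy of the area column from the brics table above
brics = pd.DataFrame(
    {"area": [8.516, 17.100, 3.286, 9.597, 1.221]},
    index=["BR", "RU", "IN", "CH", "SA"],
)

# Spelling used in these notes: NumPy's element-wise logical function
with_numpy = brics[np.logical_and(brics["area"] > 8, brics["area"] < 10)]

# Equivalent pandas spelling: & with parentheses around each comparison
with_operator = brics[(brics["area"] > 8) & (brics["area"] < 10)]

print(with_operator)  # keeps the BR and CH rows
```

The parentheses matter because & binds more tightly than the comparison operators.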