NumPy Recap :

import numpy as np
np_height = np.array([1.73, 1.68, 1.71, 1.89, 1.79]) 
np_weight = np.array([65.4, 59.2, 63.6, 88.4, 68.7])
bmi = np_weight / np_height ** 2
bmi
->array([ 21.852, 20.975, 21.75 , 24.747, 21.441])

bmi > 23
->array([False, False, False, True, False], dtype=bool)

bmi[bmi > 23]
->array([ 24.747])

Numeric comparison :

2 < 3
->True

2 == 3
->False

2 <= 3
->True

3 <= 3
->True

x = 2
y = 3
x < y
->True

Other comparisons

'carl' < 'chris'
->True

3 < 'chris'
->TypeError: unorderable types: int() < str()

3 < 4.1
->True

bmi > 23
->array([False, False, False, True, False], dtype=bool)
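
The comparisons above can be collected into one small script. This is only a sketch: the bmi values are the rounded ones from the recap, the variable names are made up for the illustration, and the try / except is there only so that the failing comparison does not stop the script.

```python
import numpy as np

# Strings are compared character by character (lexicographic order),
# so 'carl' < 'chris' because 'a' < 'h' at the second character.
strings_ordered = 'carl' < 'chris'

# Comparing an int with a str is not defined in Python 3 and raises TypeError.
try:
    3 < 'chris'
    mixed_error = None
except TypeError as err:
    mixed_error = err

# An int and a float can be compared directly.
int_vs_float = 3 < 4.1

# With a NumPy array the comparison is applied to every element.
bmi = np.array([21.852, 20.975, 21.75, 24.747, 21.441])
over_23 = bmi > 23

print(strings_ordered, int_vs_float)
print(type(mixed_error).__name__)
print(over_23)
```

The exact wording of the TypeError message depends on the Python version, so the sketch only checks the exception type.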

Boolean Operators

  • and
  • or
  • not

True and True
->True

False and True
->False

True and False
->False

False and False
->False

x = 12
x > 5 and x < 15
->True

True or True
->True

False or True
->True

True or False
->True

False or False
->False

y = 5
y < 7 or y > 13
->True

not True
->False

not False
->True

With NumPy arrays you cannot simply use the boolean operators as is; you have to use the element-wise logical functions, which are :

  • logical_and()
  • logical_or()
  • logical_not()

bmi
->array([ 21.852, 20.975, 21.75 , 24.747, 21.441]) 

bmi > 21
->array([ True, False, True, True, True], dtype=bool)

bmi > 21 and bmi < 22
->ValueError: The truth value of an array with more than one element
is ambiguous. Use a.any() or a.all()

np.logical_and(bmi > 21, bmi < 22)
->array([ True, False, True, False, True], dtype=bool)

bmi[np.logical_and(bmi > 21, bmi < 22)]
->array([ 21.852, 21.75, 21.441])
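
Only logical_and() is shown above, so here is a minimal sketch of the two other functions on the same bmi values. The thresholds 21 and 24 and the names outside / inside are arbitrary choices for the illustration.

```python
import numpy as np

# Rounded BMI values from the recap above.
bmi = np.array([21.852, 20.975, 21.75, 24.747, 21.441])

# Element-wise "or": True where the BMI is below 21 or above 24.
outside = np.logical_or(bmi < 21, bmi > 24)

# Element-wise "not": flips every boolean in the mask.
inside = np.logical_not(outside)

print(outside)
print(bmi[outside])
print(bmi[inside])
```

As with logical_and(), the result is a boolean array that can be used directly to subset bmi.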
 

Conditional Statements

  • if
  • else
  • elif

z = 3
if z % 2 == 0:
    print("z is divisible by 2")
elif z % 3 == 0:
    print("z is divisible by 3")
else:
    print("z is neither divisible by 2 nor by 3")

The general form is :

if condition:
    expression
elif condition:
    expression
else:
    expression
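
To try the chain on several values, it can be wrapped in a function. This is a sketch only: describe_divisibility is a made-up helper name, and it returns the message instead of printing it so the result can be reused.

```python
def describe_divisibility(z):
    """Return a message using the same if / elif / else chain as above."""
    if z % 2 == 0:
        return "z is divisible by 2"
    elif z % 3 == 0:
        return "z is divisible by 3"
    else:
        return "z is neither divisible by 2 nor by 3"


# 6 is divisible by both 2 and 3, but only the first matching branch runs.
for value in (3, 6, 7):
    print(value, "->", describe_divisibility(value))
```

Note that the conditions are checked from top to bottom and Python stops at the first one that is True, which is why 6 is reported as divisible by 2 only.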

Filtering Pandas DataFrame

We take our brics data again and select the countries with an area over 8 million km2 (the area column is expressed in millions of km2). This is done in 3 steps :

  • Select the area column
  • Do the comparison on the area column
  • Use the result to select the countries

import pandas as pd
brics = pd.read_csv("path/to/brics.csv", index_col = 0)

#   country      capital   area   population
#BR Brazil       Brasilia  8.516  200.40
#RU Russia       Moscow    17.100 143.50
#IN India        New Delhi 3.286  1252.00
#CH China        Beijing   9.597  1357.00
#SA South Africa Pretoria  1.221  52.98

is_huge = brics["area"] > 8
brics[is_huge]

#or in one line : 
brics[brics["area"] > 8] 

#   country capital  area   population
#BR Brazil  Brasilia 8.516  200.4
#RU Russia  Moscow   17.100 143.5
#CH China   Beijing  9.597  1357.0
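
The 3 steps can also be written one by one. In this sketch the DataFrame is built in memory from the values shown in the table, as a stand-in for the CSV file, so that it runs without any file on disk. The names area and huge_countries are only illustrative.

```python
import pandas as pd

# Same values as the table above, built in memory instead of read from a CSV.
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
        "area": [8.516, 17.100, 3.286, 9.597, 1.221],
        "population": [200.40, 143.50, 1252.00, 1357.00, 52.98],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# Step 1: select the area column (a pandas Series).
area = brics["area"]

# Step 2: compare it to 8, which gives a boolean Series.
is_huge = area > 8

# Step 3: use the boolean Series to keep only the matching rows.
huge_countries = brics[is_huge]

print(is_huge)
print(huge_countries)
```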

If you want to combine multiple conditions, you cannot chain them with and / or; you can use the NumPy logical functions instead :

import numpy as np 

#Will return a boolean Series : 
np.logical_and(brics["area"] > 8, brics["area"] < 10)

#Will return a DataFrame
brics[np.logical_and(brics["area"] > 8, brics["area"] < 10)] 
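
Here is a runnable sketch of the combined filter, again with an in-memory stand-in for the CSV (only the columns needed here). It also shows the & operator, which pandas accepts as an alternative to np.logical_and(); that operator is not part of the notes above and is included only for comparison.

```python
import numpy as np
import pandas as pd

# Only the columns needed here, with the values from the table above.
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "area": [8.516, 17.100, 3.286, 9.597, 1.221],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# Element-wise "and" of the two boolean Series.
between = np.logical_and(brics["area"] > 8, brics["area"] < 10)

# Alternative spelling with & ; each comparison needs its own parentheses
# because & binds more tightly than > and <.
between_amp = (brics["area"] > 8) & (brics["area"] < 10)

print(between)
print(brics[between])
```

Both masks select the same rows, so which one to use is mostly a matter of taste.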

Brax

Dude in his 30s starting his digital notepad