CSCI 5521 - Introduction to Machine Learning

Supplementary Material

While there are several programming languages and tools available to build machine learning algorithms and models, Python is one of the most popular ones. All the coding components of the homework assignments are expected to be completed using Python 3.

Installing Python

Python is easy to pick up and is widely used in both industry and research. It is developed under an open source license, so it is freely available to everyone.

Python can be downloaded from this link. Alternatively, you can install it through a package manager such as Chocolatey on Windows or Homebrew on macOS. You can follow this link for further instructions.

However, we highly recommend installing Anaconda, a popular Python distribution. Its Anaconda Navigator interface comes with the Spyder IDE and Jupyter Notebook. Conda can also be used from the command line, through the terminal on macOS/Linux and through Anaconda Prompt on Windows. You can download it here.

Basics

In [ ]:
# Assign a temperature value.
temp = 60
print("Temperature is")
print(temp)

or

In [ ]:
print(f"Temperature is {temp}.")

Check the data type of a variable:

In [ ]:
print( type( temp ) )

We can define a list in Python using square brackets []. For example,

In [ ]:
num_list=[ 50, 80, 30 ]


# check the length of the list 
len(num_list)
In [ ]:
num_list=[ [20,30] , [40,70] ]
len( num_list )
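
Beyond len(), list elements can be read by position. The following is a minimal sketch with made-up values (the names scores and nested are hypothetical, not from the course material) showing indexing, appending, and reaching into a nested list:

```python
# Hypothetical example values, chosen only for illustration.
scores = [50, 80, 30]

first = scores[0]     # indexing starts at 0
last = scores[-1]     # negative indices count from the end
scores.append(90)     # lists are mutable: append adds one element in place

nested = [[20, 30], [40, 70]]
inner_value = nested[1][0]   # second inner list, then its first element

print(first, last, len(scores), inner_value)
```

Note that len() on a nested list counts only the outer elements, which is why the nested list above has length 2.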

Control flow: for loops and if/else statements can be used as follows.

In [ ]:
num_list = [100, 70, 85]
total = 0  # named total to avoid shadowing the built-in sum()
for num_value in num_list:
    total = total + num_value
print(f"sum is {total}.")


if total >= 150:
    print(total)
else:
    print(150 - total)

Notice that the lines after the for and if statements are indented. Indentation matters in Python: it determines which lines belong to a block. Functions can be defined using the def keyword.

In [ ]:
def add_ten(x):
    return x + 10


print( add_ten( 50 )  )
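
Loops, conditionals, and functions combine naturally. As a rough sketch, the hypothetical helper sum_above below (a name invented for this example) puts a for loop and an if statement inside a function body, so three levels of indentation are visible at once:

```python
def sum_above(values, threshold):
    """Return the sum of the entries in values that exceed threshold."""
    total = 0
    for value in values:          # loop body is indented under for
        if value > threshold:     # if body is indented one level further
            total = total + value
    return total


print(sum_above([100, 70, 85], 80))
```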

Other built-in Python 3 functions can be found here. We recommend going over functions like dict(), tuple(), list(), zip(), and len().
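
As a starting point, here is a small sketch of how these built-ins fit together. The data and variable names are hypothetical and only meant for illustration:

```python
# Hypothetical sample data.
names = ["Mon", "Tue", "Wed"]
temps = [60, 65, 58]

pairs = list(zip(names, temps))   # zip pairs up elements position by position
temp_by_day = dict(pairs)         # dict() builds a mapping from (key, value) pairs
frozen = tuple(temps)             # tuple() makes an immutable copy of the list

print(pairs)
print(temp_by_day["Tue"])
print(len(frozen))
```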

Try exploring what the following code does:

In [ ]:
a=[2]
print(a*2)
print(a*5)
print(a+a)
In [ ]:
a=2
print(a*5)
In [ ]:
b="python"
print(b*3)
print(b+b)

We can also define anonymous functions using the lambda keyword:

In [ ]:
add_num = lambda a, b: a + b
print(add_num(4,5))

The map function applies the function given as its first argument to every element of the container given as its second argument. Combining map with lambda functions can be powerful.

In [ ]:
num_list = [1, 2, 3, 4]
twice_list = list(map(lambda num_value: num_value * 2, num_list))
print(twice_list)
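
For comparison, the same doubling can be written as a list comprehension, which many Python programmers find easier to read. This is an alternative sketch rather than a replacement for map; the variable names are made up for this example:

```python
num_list = [1, 2, 3, 4]

# map + lambda, as in the cell above
doubled_with_map = list(map(lambda num_value: num_value * 2, num_list))

# equivalent list comprehension
doubled_with_comprehension = [num_value * 2 for num_value in num_list]

print(doubled_with_map == doubled_with_comprehension)
```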

While base Python has several useful functions, additional functions from libraries make your Python code more versatile and powerful. Some of the popular libraries/packages that you will use in this course are:

  1. NumPy: for numerical and scientific computing; creating N-dimensional arrays, vectorization.
In [ ]:
import numpy as np  # you can give a library a shorter alias on import for easier calling
num_list=np.array(num_list)
print( type(num_list) )
print( np.average(num_list) )
print(num_list.shape)
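
To make "vectorization" concrete, the sketch below compares an element-by-element loop with the equivalent whole-array operation, and shows shape on a two-dimensional array. The array values are hypothetical:

```python
import numpy as np

values = np.array([1, 2, 3, 4])

# Loop version: process one element at a time.
squared_loop = np.array([v ** 2 for v in values])

# Vectorized version: the operation is applied to the whole array at once.
squared_vec = values ** 2

matrix = np.array([[20, 30], [40, 70]])
print(squared_vec)
print(matrix.shape)          # (rows, columns)
print(matrix.mean(axis=0))   # mean of each column
```
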
  2. Pandas: for data manipulation and analysis; creating dataframes, vectorization. Works well for tabular data.
In [ ]:
import pandas as pd
num_list=pd.DataFrame(num_list)
num_list.info()  # info() prints its summary itself and returns None, so no print() is needed
In [ ]:
print( num_list.shape )

It will be handy to explore other methods, attributes, and functions like describe, columns, head, and read_csv.
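
A brief sketch of these four in action follows. To keep it self-contained, read_csv reads from an in-memory string instead of a file; the column names and values are hypothetical:

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice read_csv is usually given a file path.
csv_text = "bedrooms,price\n1,100\n3,250\n2,180\n"
df = pd.read_csv(io.StringIO(csv_text))

print(df.head(2))          # first two rows
print(list(df.columns))    # column labels
summary = df.describe()    # count, mean, std, min, quartiles, max per numeric column
print(summary.loc["mean", "price"])
```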

  3. Matplotlib: for plotting and graphing.
In [ ]:
import matplotlib.pyplot as plt
plt.plot(num_list, num_list)        # line plot
plt.scatter(num_list, num_list)     # scatter plot
plt.plot(num_list, num_list, "ro")  # red circle markers, no connecting line
  4. Seaborn: for advanced plotting and graphing.
  5. Scikit-learn: for importing datasets, scaling, feature engineering, or machine learning methods.
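
As one illustration of the scaling mentioned for Scikit-learn, the sketch below standardizes a small feature matrix with StandardScaler so that each column has zero mean and unit variance. The data is hypothetical, and the example assumes scikit-learn is installed:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical feature matrix: 3 samples, 2 features on very different scales.
X = np.array([[100.0, 1.0], [70.0, 2.0], [85.0, 3.0]])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)   # learn column means/stds, then rescale

print(X_scaled.mean(axis=0))   # approximately 0 for each column
print(X_scaled.std(axis=0))    # approximately 1 for each column
```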

Classes in Python

Python is an object-oriented programming language: everything, including variables and functions, is treated as an object. Classes are templates for creating objects that bundle variables (attributes) and functions (methods). A simple example of a class is as follows:

In [ ]:
class MyClass:
    '''This is our first class'''
    class_variable=20
    
    def greeting(self):
        print("Hello!")
In [ ]:
class1=MyClass()
print(class1.class_variable)
In [ ]:
class2=MyClass()
class2.class_variable=10 # sets the attribute on class2 only; MyClass.class_variable is unchanged
print(class2.class_variable) # class2 is an object (instance) of MyClass, so we can access its attributes
print(class2.__doc__) # __doc__ is a special attribute that holds the docstring of the class
In [ ]:
class2.greeting() # calling the object's greeting method

# What does class2.greeting output?
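
One subtlety in the cells above is worth a closer look: assigning to an attribute through one object does not change the value stored on the class. A minimal sketch, redefining a trimmed copy of MyClass so that it runs on its own (the names first and second are hypothetical):

```python
class MyClass:
    '''This is our first class'''
    class_variable = 20


first = MyClass()
second = MyClass()
second.class_variable = 10    # creates an attribute on `second` only

print(first.class_variable)   # still reads the class-level value
print(second.class_variable)  # reads the attribute set on this object
print(MyClass.class_variable) # the class itself is unchanged
```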

Just as the def keyword is used to create new functions, the class keyword is used to create a class. You can create functions (methods) within a class using the def keyword. One special method is __init__(), which is usually used to initialize the object's variables. It is called whenever a new object of the class is instantiated.

In [ ]:
class MyHouse:
    def __init__(self,bedroom=0,bathroom=1):
        self.bedrooms=bedroom
        self.bathrooms=bathroom
        
    def get_allrooms(self):
        total=self.bedrooms+self.bathrooms
        print(f'Total number of rooms is : {total}')
In [ ]:
House=MyHouse()
Home=MyHouse(3,2)
In [ ]:
print(House.bedrooms)
print(House.bathrooms)
House.get_allrooms()
In [ ]:
Home.get_allrooms()
In [ ]:
# You can also delete attributes of an object
del Home.bathrooms
Home.get_allrooms()
# You deleted the bathrooms attribute of the Home object using the del keyword, which is why you get an AttributeError.
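
If you want to check for a missing attribute without crashing, hasattr and try/except are two common options. The sketch below redefines a trimmed MyHouse so that it runs on its own; the variable name home is hypothetical:

```python
class MyHouse:
    def __init__(self, bedroom=0, bathroom=1):
        self.bedrooms = bedroom
        self.bathrooms = bathroom


home = MyHouse(3, 2)
del home.bathrooms

# Option 1: test for the attribute before using it.
print(hasattr(home, "bathrooms"))

# Option 2: attempt the access and handle the failure.
try:
    total = home.bedrooms + home.bathrooms
except AttributeError as err:
    total = None
    print(f"Caught: {err}")
```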

Try to create a class "Person" where you can input the person's name and age at instantiation, with a method called "About_" that prints the name and age.

In [ ]:
class Person:
    pass
In [ ]:
#This should print "The person's name is Jane Doe and she is 20 years old"
#person1=Person("Jane Doe",20)
#person1.About_()

Debugging

While there are several ways to debug, it is worthwhile to check out pdb. Going over set_trace, step, next, and continue will be helpful.

For debugging within Jupyter Notebook, one can also use the %debug magic command.

In [4]:
# Let's create an example class
class Example:
    def __init__(self,a=0,b=0,c=9):
        self.a=a
        self.b=b
        self.c=c
    
    def printa(self):
        print(f"First variable is {self.a}.")
    
    def printb(self):
        print(f"Second variable is {b}.")
        
    def printc(self):
        print(f"Third variable is {self.c}.")
In [ ]:
class3=Example(0,2,10)
class3.printa()
class3.printb()
class3.printc()

Run the next cell. This will start an "ipdb" command line to test your code. Inspect your variables: print the values of "self.a", "self.b", "self.c", "b". Can you figure out the problem?

In [ ]:
%debug

Run the next cell. Press "c" to continue, "n" for next and "q" to quit the debugging.

In [ ]:
%debug class3.printa(); class3.printb(); class3.printc()

Alternatively, you can try the following code:

In [ ]:
from IPython.core.debugger import set_trace
set_trace()

Learn about debug magic command here. Check this link for more details about IPython debugger.

Most homework assignments will require you to submit a ".py" file, in which case pdb will be more useful.

import pdb; pdb.set_trace()
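
In a ".py" script, one way to keep a breakpoint around without it firing on every run is to guard it with a flag. The following is a minimal sketch under that assumption; the DEBUG switch and the mean_of helper are hypothetical names invented for this example:

```python
import pdb

DEBUG = False  # hypothetical switch; set to True to drop into pdb


def mean_of(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    total = 0
    for value in values:
        if DEBUG:
            # At the (Pdb) prompt: "p total" prints a variable, "n" runs the
            # next line, "s" steps into a call, "c" continues, "q" quits.
            pdb.set_trace()
        total += value
    return total / len(values)


print(mean_of([100, 70, 85]))
```

From Python 3.7 onward, the built-in breakpoint() can be used in place of import pdb; pdb.set_trace().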
