EECS 298: Social Consequences of Computing

Lab 2

Setup

First, check out the Setup Tutorial section on the homepage of this site. If you are new to navigating with the command line, visit the Windows setup or macOS setup, then review the command line tutorial.

If you do not have an IDE set up for Python, I recommend you set up VS Code. If you are on Windows and/or plan to develop on multiple machines during this course (such as on both a laptop and a desktop), it may be most convenient for you to set up remote development on one of the U-M Linux servers. This will save you needing to perform setup on multiple machines.

Next, view the Python tutorial for steps on how to install Python and set up a virtual environment for the course.

Virtual Environment

A virtual environment is a self-contained directory that contains a Python installation for a particular version of Python, plus a number of additional packages. This allows you to work on a specific project without affecting other projects or the system Python installation.

[!NOTE] Using virtual environments helps us by ensuring our Python package versions are the same across your computer and the autograder.

To create a virtual environment, navigate to your project directory and run the following command:

python -m venv env

To activate the virtual environment, you can use the following commands:

On Windows:

.\env\Scripts\activate

On macOS and Linux:

source env/bin/activate

Once the virtual environment is activated, your command prompt will change to show the name of the environment, and you can install packages without affecting other projects or the system Python installation.

Then, install the packages from requirements.txt into your environment by using:

pip install -r requirements.txt

This command will read the requirements.txt file and install all the packages listed in it into your virtual environment.

To deactivate the virtual environment, simply run:

deactivate
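If you are ever unsure whether a virtual environment is currently active, one way to check is from inside Python itself. The following is a minimal sketch, assuming the environment was created with venv as described above; the variable name in_venv is chosen here for illustration only.

```python
import sys

# Inside a venv, sys.prefix points at the environment's directory,
# while sys.base_prefix still points at the base Python installation.
in_venv = sys.prefix != sys.base_prefix
print("Virtual environment active:", in_venv)
```

Run it once before and once after activating the environment to see the value change.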

Task

For this lab, you will play the role of an IT intern at MyHealth, a large new hospital in Ann Arbor. The company recently hired and is currently onboarding five prestigious surgeons from out-of-state, and you have been asked to write a Python program to automate generating the new doctors’ employee information.

For each doctor, you will need to create their company email. Each company email should adhere to the following format:

first letter of first name + last name + PID + @myhealth.org
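As a rough illustration of this format, here is a small sketch using a made-up name and PID. The helper name make_email and the sample values are hypothetical, and the sketch assumes capitalization is kept exactly as given, since the format above does not mention changing case.

```python
def make_email(first_name, last_name, pid):
    # str(pid) lets the PID be passed either as an int or as a str
    return first_name[0] + last_name + str(pid) + "@myhealth.org"


# Hypothetical sample values, not part of the lab data
print(make_email("Jane", "Doe", "999"))  # JDoe999@myhealth.org
```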

doctors.csv

The doctors.csv file should contain a FirstName, LastName, Department, and PID for each of the five doctors. Before starting on the Python code, you’ll need to create this file.

Creating Your CSV File

The list of doctors will live in a .csv file, which stands for “Comma-Separated Values”. You can think of these files as a kind of watered-down Excel spreadsheet, or just a simple table with columns and rows of information.

CSV files are great to work with in Python, since they’re easy to read from and analyze in your code. You’ll create a .csv file on your own, then place it into your project’s folder so you can access it from lab2.py.

Setting Up Your File

To start, create a file named doctors.csv in the same folder as your lab2.py file. Then, open your new file using a text editor of your choice.

[!NOTE] You can also create and edit CSV files in programs like Excel, Google Sheets, and Numbers. Although this assignment will teach you how to work with CSVs by hand, these programs can export spreadsheets you create right into CSVs.

When you open the file, you should see a blank screen. Before we start adding doctor data (which will appear as rows, one for each doctor), we need to add a header row.

The header row lives at the top of your spreadsheet and denotes what each column of data means. Paste the following text into your blank CSV file to create your header row:

FirstName,LastName,Department,PID

Adding Data

Now, let’s add data! We’ll follow the same format as our header row - for example, if we wanted to add an example doctor, we would add the following line:

Meredith,Grey,Internal,123

We can see that each piece of data in the header row corresponds to a piece of data in the row below it.

FirstName,LastName,Department,PID
Meredith,Grey,Internal,123

In this same way, we can add any number of doctors to our dataset by adding new lines.

[!NOTE]
In CSV files, different pieces of data are separated by commas. If you need to use a comma in your actual data, place that value in quotes. For example, to store the single value Surgery, Admin, Internal in your dataset, write it as "Surgery, Admin, Internal" in your CSV file.
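To see how a quoted value behaves when it is read back in, here is a small sketch. It feeds an in-memory string to csv.reader through io.StringIO instead of a real file, and the column names and row are made up for illustration.

```python
import csv
import io

# Two lines of CSV text; the second field of the data row contains commas,
# so it is wrapped in double quotes.
text = 'Name,Departments\nExample,"Surgery, Admin, Internal"\n'

rows = list(csv.reader(io.StringIO(text)))
print(rows[1])  # ['Example', 'Surgery, Admin, Internal']
```

The quoted field comes back as one string rather than being split into three.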

New Hire Roster

Use this data from MyHealth to finish the doctors.csv file:

First Name	Last Name	Department	PID
Meredith	Grey	Internal	123
Alex	Karev	Surgery	456
Miranda	Bailey	Surgery	789
Derek	Shepherd	Research	101
Cristina	Yang	Admin	112

lab2.py

Create a file titled lab2.py. In this file, perform the following:

Import the csv package at the top of the file.
Define a Doctor class with attributes for first name, last name, department, PID, and email address. The first name, last name, department, and PID are passed in as arguments to the __init__() constructor and you will set the email address yourself according to the specifications above.
Override the less than operator, which has the function header __lt__(self, other), to compare the PID of two Doctor objects.
Override the greater than operator, which has the function header __gt__(self, other), to compare the PID of two Doctor objects.
In the __main__ branch, load and read doctors.csv using the open built-in Python function and the function reader() from the csv package. Create a Doctor object for each doctor in the file. Add these Doctor objects to a list.
Use the Python function sorted() to sort the list from low to high PID.
Create a dict and add the doctors to the dictionary using department as the key and a list of emails of doctors in the department in order of low to high PID as the value. Hint: use the ordering from the sorted list above!
Print out the key/value pairs from the dict as follows:

Department1:[email1, email2]
...

Turn in lab2.py on Gradescope.

Tips

Lists

Lists are among the most commonly used data structures in Python, and they function like souped-up C arrays. Note that a single list can hold items of different types.

mylist = []

mylist.append(123)
mylist.append("abc")
mylist.append(456)
mylist.append("def")

print(mylist[0]) # 123
print(mylist[:2]) # [123,"abc"]

Python dictionaries

The dict object in Python is a built-in data structure. A dictionary can be thought of as a generalization of a list that is indexable by any hashable object (such as a str, an int, or a tuple), rather than only an int. More formally, dictionaries contain key/value pairs. Indexing a dictionary by a key returns the matching value. See usage below:

my_dict = dict()

# Add a key/value pair by indexing with any hashable object and assigning a value
my_dict["Key 1"] = 10.0
my_dict["The next key"] = "Its value"
my_dict[15] = 0.0

my_dict.keys() # dict_keys(["Key 1", "The next key", 15])
my_dict["Key 1"] # 10.0
my_dict # {"Key 1": 10.0, "The next key": "Its value", 15: 0.0}
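A common pattern is a dictionary whose values are lists, so that several items can be grouped under one key. The following is a minimal sketch with made-up data; the names pairs and groups are chosen here for illustration.

```python
pairs = [("fruit", "apple"), ("vegetable", "carrot"), ("fruit", "banana")]

groups = {}
for category, item in pairs:
    # Create an empty list the first time a key is seen
    if category not in groups:
        groups[category] = []
    groups[category].append(item)

# .items() yields each key together with its value
for category, items in groups.items():
    print(category, items)
```

Items land in each list in the order they were encountered, so looping over already-sorted data produces sorted lists.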

String slicing

Python string syntax provides powerful tools to index and “slice” strings. See below:

s = "abcdef"

print(s[0]) # The first char: "a"
print(s[-1]) # The last char: "f"

print(s[1:3]) # From the 2nd char to the 4th char (non-inclusive): "bc"
print(s[2:]) # From the 3rd char to the end: "cdef"
print(s[:-2]) # From the beginning to the second-to-last char (non-inclusive), "abcd"

CSV `reader()` function

The reader() function in the csv package is used to read in information from a csv file. It returns an iterable in which each element is the next line of the csv file, represented as the list of strs you get by splitting that line at the commas. For example, suppose you have a file called file.csv that contains two lines of information:

1,2,3
a,b,c

Then, you can use the reader() function as follows. Note that all information (including numbers!) is read in as strs.

import csv

with open("file.csv","r") as file: # The 'with' keyword defines the scope where "file.csv" is open and automatically closes it out of scope.
    data = csv.reader(file)
    for row in data: # for loop in Python -- 'data' is an iterable and 'row' is a holder variable for each element in 'data'
        print(row) # first prints ["1","2","3"] then prints ["a","b","c"]
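When a file starts with a header row, one option is to pull that row off with the built-in next() before looping over the rest. The sketch below uses io.StringIO in place of a real file so that it runs on its own; the column names and values are made up, and it converts the numeric column with int() because reader() returns strs.

```python
import csv
import io

# Stand-in for an open file; the contents are hypothetical
sample = io.StringIO("Item,Count\npens,12\nnotebooks,7\n")

reader = csv.reader(sample)
header = next(reader)  # consume the first row so the loop sees only data

counts = []
for row in reader:
    counts.append((row[0], int(row[1])))  # convert the str "12" to the int 12

print(header)  # ['Item', 'Count']
print(counts)  # [('pens', 12), ('notebooks', 7)]
```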

Python `sorted()` function

The Python sorted() function takes an iterable (like a list) as input and returns a new sorted list of its elements. There are two optional keyword arguments. (1) key is a function that is run on each element of the iterable before comparisons are made between elements to sort the iterable. You can use the Python keyword lambda to write a simple function that takes as input a single argument representing the element in the iterable and returns to the sorted() function a new value to sort by. (2) reverse defaults to False and indicates whether the order of the sorted elements should be reversed. The default sorting is A-Za-z for strs and low to high for ints.

myList = [5,4,1,7,9]
print(sorted(myList)) # [1,4,5,7,9]
print(sorted(myList, reverse=True)) # [9,7,5,4,1]

myList2 = [(1,"banana"),(2,"apple")]
print(sorted(myList2, key = lambda element: element[0])) # [(1,"banana"),(2,"apple")], sorting according to the first element of each tuple
print(sorted(myList2, key = lambda element: element[1])) # [(2,"apple"),(1,"banana")], sorting according to the second element of each tuple
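One thing to watch for: numbers stored as strs sort character by character, not by numeric value. The sketch below, using made-up values, shows the difference and how passing the built-in int as the key restores numeric order.

```python
values = ["10", "9", "100"]

as_text = sorted(values)             # compares character by character
as_numbers = sorted(values, key=int) # compares the int value of each str

print(as_text)     # ['10', '100', '9']
print(as_numbers)  # ['9', '10', '100']
```

Note that key only affects the comparison; the elements in the result are still strs.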

Overriding functions

Any method within a parent class can be overridden in a child of that class. When you write a constructor, you are actually overriding the default constructor of the `object` class, from which every class inherits. You can do the same with other methods belonging to parent classes, for example the `__le__(self, other)` method, which is automatically called whenever two objects of the class are compared with `<=`.

class MyClass:

    def __init__(self, attribute_1, attribute_2):
        self.att1 = attribute_1
        self.att2 = attribute_2

    def __le__(self, other):
        return self.att1 <= other.att1

if __name__ == "__main__":
    obj1 = MyClass(1,6)
    obj2 = MyClass(5,2)

    print(obj1<=obj2) # prints True
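Overriding comparison methods also affects sorted(): when no key is given, it orders elements using the `<` operator, so a class that defines `__lt__` can be sorted directly. The following is a minimal sketch; the Box class and its attributes are hypothetical and exist only for this example.

```python
class Box:

    def __init__(self, label, weight):
        self.label = label
        self.weight = weight

    def __lt__(self, other):
        # A Box is "less than" another Box when it weighs less
        return self.weight < other.weight


if __name__ == "__main__":
    boxes = [Box("c", 30), Box("a", 10), Box("b", 20)]
    ordered = sorted(boxes)  # uses Box.__lt__ to compare elements
    print([box.label for box in ordered])  # ['a', 'b', 'c']
```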