Data Types and Conversion#
Questions:#
What kinds of data do programs store?
How can I convert one type to another?
Learning Objectives:#
Explain key differences between integers and floating point numbers.
Explain key differences between numbers and character strings.
Use built-in functions to convert between integers, floating point numbers, and strings.
Every value has a type#
Every value in a program has a specific type
Integer (int): represents positive or negative whole numbers like 3 or -512
Floating point number (float): represents real numbers like 3.14159 or -2.5
Character: a single character, for example "a", "j", "8", "("
Python has no separate character type; a single character is simply a string (str) containing one character
Characters are written in either single quotes or double quotes (as long as they match)
Numerals placed in quotes will be treated as characters, not integers or floats
Character string (usually called “string”, str): text
Written in either single quotes or double quotes (as long as they match)
The quote marks aren’t printed when the string is displayed
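As a quick check of the list above, the sketch below asks Python for the type of a few literal values using the built-in type function. The variable names are illustrative only. Note that Python reports a single character as str.

```python
# Inspect the type of a few literal values.
whole = 3
real = 3.14159
single = "a"      # one character is still a string
numeral = "8"     # a numeral in quotes is text, not a number

print(type(whole))    # <class 'int'>
print(type(real))     # <class 'float'>
print(type(single))   # <class 'str'>
print(type(numeral))  # <class 'str'>
```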
Use the built-in function type to find the type of a value#
Use the built-in function type to find out what type a value has
Works on variables as well
But remember: the value has the type; the variable is just a label
print(type(52))
<class 'int'>
fitness = 'average'
print(type(fitness))
<class 'str'>
Nested functions such as print(type()) are evaluated from the inside out, like in mathematics.
Types control what operations (or methods) can be performed on a given value#
A value’s type determines what the program can do to it. So we can perform subtraction on integers:
print(5 - 3)
2
But not on strings or characters:
print('hello' - 'h')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[4], line 1
----> 1 print('hello' - 'h')
TypeError: unsupported operand type(s) for -: 'str' and 'str'
You can use the + and * operators on strings#
“Adding” character strings concatenates them; i.e., creates one long string by combining the inputs in the order you specify
full_name = 'Ahmed' + 'Walsh'
print(full_name)
AhmedWalsh
To add spaces between strings that you concatenate, you need to explicitly include whitespace in quotes:
full_name = 'Ahmed' + ' ' + 'Walsh'
print(full_name)
Ahmed Walsh
Multiplying a character string by an integer N creates a new string that consists of that character string repeated N times (since multiplication is repeated addition).
greeting = 'hello-' * 3
print(greeting)
hello-hello-hello-
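The following sketch explores the edges of string repetition a little further. It assumes nothing beyond the + and * operators described above; the variable name word is illustrative.

```python
word = 'ab'

print(word * 3)   # ababab
print(3 * word)   # ababab (the integer may come first or second)
print(word * 0)   # zero repetitions give an empty string, so this prints a blank line

# Multiplying a string by another string is not defined.
try:
    word * 'ab'
except TypeError as err:
    print('TypeError:', err)
```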
Strings have a length (but numbers don’t)#
The built-in function len counts the number of characters in a string.
print(len(full_name))
But numbers don’t have a length (not even zero).
print(len(52))
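If what you want is the number of digits in a number, one possible approach is to convert it to a string first. This is a minimal sketch; the names number and digits are illustrative.

```python
number = 52

# len() is not defined for integers, so this raises a TypeError.
try:
    print(len(number))
except TypeError as err:
    print('TypeError:', err)

# Converting to a string first gives something that has a length.
digits = str(number)
print(len(digits))   # 2
```

Keep in mind that for a negative number the minus sign is also counted as a character.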
Use an index to get a single character from a string.#
The characters (individual letters, numbers, and so on) in a string are ordered. For example, the string 'AB' is not the same as 'BA'. Because of this ordering, we can treat the string as a list of characters.
Each position in the string (first, second, etc.) is given a number. This number is called an index.
Indices are numbered from 0.
Use the position’s index in square brackets to get the character at that position.
atom_name = 'helium'
print(atom_name[0])
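Because indices start at 0, the last valid index is one less than the string's length. The sketch below illustrates this with the same atom_name value and shows what happens when an index is out of range.

```python
atom_name = 'helium'

print(atom_name[0])                   # h
print(atom_name[5])                   # m
print(atom_name[len(atom_name) - 1])  # m (last character)

# There is no character at index 6, so Python raises an IndexError.
try:
    atom_name[6]
except IndexError as err:
    print('IndexError:', err)
```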
Use a slice to get a substring.#
A part of a string is called a substring. A substring can be as short as a single character.
An item in a list is called an element. Whenever we treat a string as if it were a list, the string’s elements are its individual characters.
A slice is a part of a string (or, more generally, any list-like thing).
We take a slice by using [start:stop], where start is replaced with the index of the first element we want and stop is replaced with the index of the element just after the last element we want.
Mathematically, you might say that a slice selects [start:stop). The square bracket [ means inclusive, and the round bracket ) after stop means non-inclusive.
The difference between stop and start is the slice’s length.
Taking a slice does not change the contents of the original string. Instead, the slice is a copy of part of the original string.
atom_name = 'sodium'
print(atom_name[0:3])
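The properties listed above can be checked directly. In this sketch the names start, stop, and piece are illustrative.

```python
atom_name = 'sodium'
start, stop = 0, 3

piece = atom_name[start:stop]
print(piece)                       # sod
print(len(piece) == stop - start)  # True: the length is stop minus start
print(atom_name)                   # sodium (the original is unchanged)
```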
Slicing numbers?#
If you assign a = 123, what happens if you try to get the second digit of a via a[1]?
a = 123
a[1]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/var/folders/x0/pvpjfdzd39sc5rqvsk5lvv4m0000gn/T/ipykernel_13282/3505054574.py in <module>
1 a = 123
----> 2 a[1]
TypeError: 'int' object is not subscriptable
Solution
Numbers are not strings or sequences, and Python will raise an error if you try to perform an index operation on a number. Converting between types is covered later in this lesson. If you want the Nth digit of a number, you can convert it into a string using the str built-in function and then perform an index operation on that string.
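The suggestion above could be sketched as follows. The name second_digit is illustrative; converting back with int is optional and only needed if you want a number rather than a one-character string.

```python
a = 123

# Convert to a string, index it, then convert the character back to an integer.
second_digit = int(str(a)[1])
print(second_digit)        # 2
print(type(second_digit))  # <class 'int'>
```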
Slicing practice#
What does the following program print?
atom_name = 'carbon'
print('atom_name[1:3] is:', atom_name[1:3])
atom_name[1:3] is: ar
Slicing concepts#
cell_name = 'neuron'
What does cell_name[1:5] do?
cell_name[1:5]
'euro'
What does cell_name[0:5] do?
cell_name[0:5]
'neuro'
What does cell_name[0:6] do?
cell_name[0:6]
'neuron'
What does cell_name[0:] (without a value after the colon) do?
cell_name[0:]
'neuron'
What does cell_name[:5] (without a value before the colon) do?
cell_name[:5]
'neuro'
What does cell_name[:] (just a colon) do?
cell_name[:]
'neuron'
What does cell_name[1:-1] do?
cell_name[1:-1]
'euro'
What happens when you choose a high value (i.e., the value after the colon) which is out of range? (i.e., try cell_name[1:99])
cell_name[1:99]
'euron'
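The last result is worth contrasting with plain indexing: a slice that runs past the end is quietly shortened, whereas a single out-of-range index raises an error. A minimal sketch:

```python
cell_name = 'neuron'

print(cell_name[1:99])   # euron (the slice stops at the end of the string)
print(cell_name[99:])    # an empty string, so this prints a blank line

# A single index that is out of range is an error.
try:
    cell_name[99]
except IndexError as err:
    print('IndexError:', err)
```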
You must convert numbers to strings or vice versa when operating on them#
Cannot add numbers and strings.
print(1 + '2')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/var/folders/x0/pvpjfdzd39sc5rqvsk5lvv4m0000gn/T/ipykernel_13282/3905401405.py in <module>
----> 1 print(1 + '2')
TypeError: unsupported operand type(s) for +: 'int' and 'str'
This is not allowed because it’s ambiguous: should 1 + '2' be 3 or '12'?
Some types can be converted to other types by using the type name as a function:
print(1 + int('2'))
3
print(str(1) + '2')
12
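A common place where this matters is building a message from text and numbers. The sketch below is one way to do it; the names count, label, message, and total are illustrative.

```python
count = 3
label = 'samples'

# str() turns the number into text so it can be concatenated.
message = 'Found ' + str(count) + ' ' + label
print(message)   # Found 3 samples

# int() and float() turn numeric text into numbers so they can be added.
total = int('2') + float('0.5')
print(total)     # 2.5
```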
You can mix integers and floats freely in operations#
Python 3 automatically converts integers to floats as needed.
print('half is', 1 / 2.0)
print('three squared is', 3.0 ** 2)
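To see the automatic conversion at work, you can ask for the type of a result. This sketch uses only the built-in type function.

```python
print(type(3 + 4))     # <class 'int'>
print(type(3 + 4.0))   # <class 'float'>
print(3 + 4.0)         # 7.0
print(type(4 / 2))     # <class 'float'>: the / operator always returns a float
```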
Exercises#
Fractions#
What type of value is 3.4? How can you find out?
Automatic Type Conversion#
What type of value is 3.25 + 4?
Choose a Type#
What type of value (integer, floating point number, or character string) would you use to represent each of the following? Try to come up with more than one good answer for each problem. For example, in (1), when would counting days with a floating point variable make more sense than using an integer?
Number of days since the start of the year.
Time elapsed from the start of the year until now in days.
Serial number of a piece of lab equipment.
A lab specimen’s age
Current population of a city.
Average population of a city over time.
Solution
The answers to the questions are:
Integer, since the number of days would lie between 1 and 365. Float would make sense if you were considering partial days (e.g., if it’s noon then today would count as 0.5)
Floating point, since fractional days are required
If serial number contains letters and numbers, then a character string. If the serial number consists only of numerals, then an integer could be used, although a character string could also be used.
This will vary! How do you define a specimen’s age? Whole days since collection (integer)? Date and time (string)?
Choose integer to represent population in units of individuals, or floating point to represent population as large aggregates (e.g., millions)
Floating point number, since an average is likely to have a fractional part.
Division Types#
In Python 3:
the // operator performs integer (whole-number) floor division
the / operator performs floating-point division
the % (or modulo) operator calculates and returns the remainder from integer division:
print('5 // 3 =', 5 // 3)
print('5 / 3 =', 5 / 3)
print('5 % 3 =', 5 % 3)
5 // 3 = 1
5 / 3 = 1.6666666666666667
5 % 3 = 2
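Floor division and modulo fit together: the quotient times the divisor, plus the remainder, gives back the original number. The sketch below checks this for 5 and 3, and also shows the built-in divmod function, which returns both results at once. The variable names are illustrative.

```python
dividend = 5
divisor = 3

quotient = dividend // divisor
remainder = dividend % divisor
print(quotient, remainder)                         # 1 2
print(quotient * divisor + remainder == dividend)  # True

# divmod returns the quotient and remainder together as a pair.
print(divmod(dividend, divisor))                   # (1, 2)
```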
Division Challenge#
Imagine we are catering an event for 100 guests, and for dessert we want to serve each person one slice of pie. Each pie yields 8 pieces. How do we calculate the number of pies we need?
We can start by simply dividing the number of guests by the number of slices per pie:
pie_eaters = 100
slice_per_pie = 8
num_pies = pie_eaters / slice_per_pie
print(pie_eaters, 'guests', 'requires', num_pies, 'pies')
100 guests requires 12.5 pies
The last line can also be printed using an f-string:
print(f"{pie_eaters} guests requires {num_pies} pies.")
100 guests requires 12.5 pies.
However, this yields a floating point number. We can’t easily bake half a pie, so we need a whole number of pies, rounded up to ensure we have enough. As a first step, floor division gives us a whole number:
num_pies = pie_eaters // slice_per_pie
print(pie_eaters, 'guests', 'requires', num_pies, 'pies')
100 guests requires 12 pies
Of course, we actually need one more pie than that, but Python doesn’t provide an operator for rounding up (“ceiling” division). So we can simply add 1 to our answer:
num_pies = pie_eaters // slice_per_pie + 1
print(pie_eaters, 'guests', 'requires', num_pies, 'pies')
100 guests requires 13 pies
Note that Python uses standard order of operations, so the division will be performed before the addition. That is, we will get:
(pie_eaters // slice_per_pie) + 1
not
pie_eaters // (slice_per_pie + 1)
When writing code, it’s good to test it and think about possible cases where it won’t work as intended. In this example, if the number of guests was evenly divisible by 8, then our calculation would erroneously tell us we need one more pie than we do:
pie_eaters = 64
num_pies = pie_eaters // slice_per_pie + 1
print(pie_eaters, 'guests', 'requires', num_pies, 'pies')
64 guests requires 9 pies
We can make our code more robust by subtracting 1 from pie_eaters within the formula:
num_pies = (pie_eaters - 1) // slice_per_pie + 1
print(pie_eaters, 'guests', 'requires', num_pies, 'pies')
64 guests requires 8 pies
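One way to gain confidence in the formula is to compare it against math.ceil from the standard library for several guest counts, including exact multiples of 8. Note that math.ceil is not used elsewhere in this lesson; it is brought in here only as an independent check, and the names results, by_formula, and by_ceil are illustrative.

```python
import math

slice_per_pie = 8
results = []

for pie_eaters in (1, 8, 9, 64, 100):
    by_formula = (pie_eaters - 1) // slice_per_pie + 1
    by_ceil = math.ceil(pie_eaters / slice_per_pie)
    results.append((pie_eaters, by_formula, by_ceil))
    print(pie_eaters, 'guests:', by_formula, 'pies (formula),', by_ceil, 'pies (math.ceil)')
```

For these inputs the two columns agree: 1, 1, 2, 8, and 13 pies respectively.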
Strings to Numbers#
Where reasonable, float() will convert a string to a floating point number, and int() will convert a floating point number to an integer:
print("string to float:", float("3.4"))
print("float to int:", int(3.4))
If the conversion doesn’t make sense, however, Python raises an error:
print("string to float:", float("Hello world!"))
Given this information, what do you expect the following program to do?
print("fractional string to int:", int("3.4"))
What does it actually do?
Why do you think it does that?
Solution
What do you expect this program to do? It would not be unreasonable to expect the Python 3 int function to convert the string “3.4” to 3.4 and then, with an additional type conversion, to 3. After all, Python 3 performs a lot of other magic - isn’t that part of its charm?
However, Python 3 raises an error (a ValueError). Why? To be consistent, possibly. If you want Python to perform two consecutive type conversions, you must write both explicitly in code:
int("3.4")
int(float("3.4"))
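The two calls above can be run side by side like this. The sketch catches the error so that both lines execute; the names text and as_int are illustrative.

```python
text = "3.4"

# int() will not parse a string that contains a decimal point.
try:
    int(text)
except ValueError as err:
    print('ValueError:', err)

# Converting to float first, then to int, works and drops the fractional part.
as_int = int(float(text))
print(as_int)   # 3
```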
Arithmetic with Different Types#
Given these variable definitions:
a = 1.0
b = "1"
c = "1.1"
Which of the following will return the floating point number 2.0?
Note: there may be more than one right answer.
a + float(b)
float(b) + float(c)
a + int(c)
a + int(float(c))
int(a) + int(float(c))
2.0 * b
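Once you have written down your own answers, one way to check them is to evaluate each expression and record either its value or the kind of error it raises. This is only a sketch: the candidates dictionary and the outcomes name are illustrative, and each expression is wrapped in a lambda so that an error in one does not stop the others.

```python
a = 1.0
b = "1"
c = "1.1"

# Each expression is wrapped in a lambda so it is only evaluated inside the try block.
candidates = {
    'a + float(b)': lambda: a + float(b),
    'float(b) + float(c)': lambda: float(b) + float(c),
    'a + int(c)': lambda: a + int(c),
    'a + int(float(c))': lambda: a + int(float(c)),
    'int(a) + int(float(c))': lambda: int(a) + int(float(c)),
    '2.0 * b': lambda: 2.0 * b,
}

outcomes = {}
for text, expression in candidates.items():
    try:
        outcomes[text] = expression()
    except (TypeError, ValueError) as err:
        outcomes[text] = type(err).__name__
    print(text, '->', repr(outcomes[text]))
```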
Summary of Key Points:#
Every value has a type
Use the built-in function type to find the type of a value
Types control what operations can be done on values
Strings can be added and multiplied
Strings have a length (but numbers don’t)
Use the built-in function len to find the length of a string
Use an index to get a single character from a string
Use a slice to get a substring
Can mix integers and floats freely in operations
Must convert numbers to strings or vice versa when operating on them
This section was adapted from Aaron J. Newman’s Data Science for Psychology and Neuroscience - in Python and Software Carpentry’s Plotting and Programming in Python workshop.