1. First exercises
1.1. Exercises: integer (int) and string (str) variables
Exercise 1.1
Consider the following code:
a = 10
b = 4
c = a + b
print(c)
Question: What is the value of c that is printed?
Modify the value that is assigned to variable b in such a way that the value of c printed to the screen is 16.
Exercise 1.2
Create two variables a and b, and assign them the values 4 and 20 respectively. Create a new variable c that is the sum of these two variables. Your code should produce the following output by printing the value of the variable c to the screen:
a + b = 24
Exercise 1.3
Run the following code:
a = 10
b = 4
c = a + b
print(c)
print(c)
Question: Which value is printed to the screen and how often?
Now assign the variable b a new value after having printed the value of c, like so:
a = 10
b = 4
c = a + b
print('a + b =', c)
b = 10
print('a + b =', c)
Question: Why does the value of the variable c not change between the first and the second print statement (in line 4 and line 6 respectively)?
Exercise 1.4
Run the following code (make sure to write it exactly as shown here):
a = '10'
b = '4'
c = a + b
print(c)
Question: what is printed to the screen?
Question: what type of variable are a, b and c? (e.g. list, integer, float, string)
Question: why does it not print 14 to the screen? You could also try this example to see what is happening:
a = '10'
b = '4'
c = a + '+' + b
print(c)
Exercise 1.5
Have a look at the following code (do not run it yet):
a = 10
b = '4'
c = a + b
print(c)
Question: what type of variable is a?
Question: what type of variable is b?
Question: what is the output of the code? (Run it!)
Question: do you understand the output?
1.2. Working with strings
As you have seen, you can combine strings using the + operator.
first_part = "this is "
second_part = "a long string"
print(first_part + second_part)
this is a long string
If you want to access a specific character of a string you can use [ and ] with a number (the index) in between.
msg = "hello there!"
msg[1]
'e'
You’ll see that [1] selects the second character of the string "hello there!". Why the second and not the first? This is because Python, like many other programming languages, starts counting at 0.
There is a more extended version, where you create a slice. In this case you specify [start:end]; the character at index start is included and the one at index end is not. This looks as follows:
msg[0:5]
'hello'
Here you see that you can take a substring of a string. In other words, you select a smaller part of a string.
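As a small sketch of how the start and end of a slice behave, the example below reuses msg from above. The variable names first_word and second_word are only chosen for illustration.

```python
# Slicing sketch: the start index is included, the end index is not.
msg = "hello there!"

first_word = msg[0:5]    # characters at indices 0, 1, 2, 3 and 4
second_word = msg[6:11]  # characters at indices 6 up to (not including) 11

print(first_word)        # hello
print(second_word)       # there
print(len(first_word))   # 5, which equals end - start
```

Because the end index is excluded, the length of a slice is simply end minus start.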
You can convert a number, such as an int or float, to a string with the str() function.
a = 0.12
print(type(a))
b = str(a)
print(type(b))
<class 'float'>
<class 'str'>
Similarly, you can convert a string that contains a number to an integer with the int() function or to a float with the float() function.
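The conversion in the other direction could look like the sketch below. The variable names are made up for this example, and the values are arbitrary.

```python
# Converting strings that contain numbers into int and float values.
count_text = "42"
fraction_text = "0.12"

count = int(count_text)          # string -> integer
fraction = float(fraction_text)  # string -> float

print(type(count))     # <class 'int'>
print(type(fraction))  # <class 'float'>
print(count + 1)       # 43

# int() only accepts text that looks like a whole number.
try:
    int(fraction_text)
except ValueError:
    print("int() cannot convert '0.12' directly")
```

Once converted, the values behave like ordinary numbers, so + adds them instead of gluing text together.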
Strings have a number of methods that allow you to perform a variety of useful operations. For instance, lower() creates a new string where all characters are in lowercase.
s1 = "ATG"
s2 = s1.lower()
print(s2)
atg
Similarly, you can use upper() to convert a string to all uppercase. These methods create a new string; they leave the string you use to call the method unchanged.
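A short sketch to check that the original string really is left unchanged (the names s1 and s2 follow the example above, with a lowercase starting value):

```python
# upper() returns a new string; the original is not modified.
s1 = "atg"
s2 = s1.upper()

print(s1)  # atg
print(s2)  # ATG
```

If you want to keep the converted version, you have to store the result in a variable, as done here with s2.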
Here is a list of useful string methods. For a complete overview you can check the Python string documentation.
upper() - Returns a string in uppercase
lower() - Returns a string in lowercase
len() - Returns the length of a string (this is a built-in function, called as len(s), not a string method)
count() - Returns the number of occurrences of a string within another string
replace() - Returns a string with specific text replaced
find() - Returns the first position where a substring is found, or -1 if it is not found
strip() - Returns a string with all leading and trailing whitespace removed
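The sketch below tries several of the functions listed above on one short string. The string raw and all variable names are invented for this illustration; they are not part of the exercises.

```python
# A made-up sequence with surrounding whitespace, for illustration only.
raw = "  ATGCGATAG\n"

clean = raw.strip()               # remove leading and trailing whitespace
length = len(clean)               # len() is called on the string, not as a method
n_at = clean.count("AT")          # how often "AT" occurs
rna = clean.replace("T", "U")     # new string with every T replaced by U
pos_tag = clean.find("TAG")       # index of the first "TAG"
pos_missing = clean.find("TTT")   # -1 because "TTT" does not occur

print(clean)        # ATGCGATAG
print(length)       # 9
print(n_at)         # 2
print(rna)          # AUGCGAUAG
print(pos_tag)      # 6
print(pos_missing)  # -1
```

Note that find() reports an index that starts counting at 0, just like the [ ] notation.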
1.3. Exercises: strings
Exercise 1.6
You are going to analyze the following sequence in more detail using Python:
GCTTGACAGGTAGACAGGACCCATAGACAGGATAGACAGGTAGACAGGGATAGACAGGGATAGCCAGATAGACGATAGCGATGATAC
To get this sequence into Python you are allowed to copy and paste! Provide answers and code for the following questions:
What is the length of this sequence?
What is the 40th base of this sequence?
Is there a C in the sequence starting at base position 44 and ending at position 53? (let Python tell you)
Exercise 1.7
Given the following input:
seq1 = "ATG"
seq2 = "GATTACA"
seq3 = "A"
Write the code that will calculate the total length of these three sequences and print the following output:
Total length: 11
Exercise 1.8
An open reading frame is a sequence of DNA that starts with the start codon ATG and ends with a stop codon (TAA, TAG or TGA). Take the following DNA sequence.
dna = "TTGCATGTCAATCGATCGGATTGGTTGATTTATCCCGA"
This sequence contains one ORF, with a start codon at the 5th position and a stop codon (TGA) at the 26th position.
Write code that will print the ORF of this sequence.
Let’s look at a different DNA sequence:
dna2 = "CCGGTATGCGGTTCTGACCA"
Does the code that you wrote for the first sequence also work for this sequence? If not, write code that prints the ORF of a DNA sequence (only with the TGA stop codon) and works with either of these sequences.
Exercise 1.9
Write code that will print the GC content of a sequence. This is the fraction of a sequence that is either C or G. When you use the string dna from Exercise 1.8 it should print the following:
GC content: 42%
Exercise 1.10
Write code that will print a sequence in lower-case with the ORF (start ATG, stop TGA) in upper-case.
For instance, the following sequence:
dna = "TTGCATGTCAATCGATCGGATTGGTTGATTTATCCCGA"
Would be printed like this:
ttgcATGTCAATCGATCGGATTGGTTGAtttatcccga