Notes on the Python 3 Language

This is mostly really basic stuff I noted while reviewing references and an online Python tutorial. I’ve done a little Python 2 programming, and a bit of Python 3 with Pandas and PyArrow. However, I never systematically reviewed Python 3. Now that I’m involved in development on a Python 3 application, it is time to learn it properly. The plan is to keep updating these notes as I go.

For the most part, though I don’t love the significant whitespace, I feel like most of the choices the designers made were good and what I’d have chosen for an interpreted language (or would have wanted to choose; I’m sure I’d have made some bad calls if I designed a language from scratch).

We’re using Miniconda to manage our Python libraries and Python versions. Setting up development environments and installing Python 3 applications with Miniconda for users is an entirely separate set of notes, which I may end up posting.

Simple Data Types

Use “type()” on any variable to get its type.

There is only one integer type, “int” (the separate “long” type from Python 2 is gone). It can be more than a 4 or 8 byte int, behaving like a “BigInteger” sort of type when values get large.

Use the standard operators on numbers, plus // for “floor division”. With two integers // returns an integer; with one or two floats it returns the floor of the division as a float. The / operator always returns a float, even with two integers, so use // when you need an exact integer result.
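Here is a small sketch of the integer and division behavior described above. The variable names are just for illustration.

```python
# Integers grow as needed; there is no overflow to a fixed width.
big = 2 ** 100
print(type(big))          # <class 'int'>

# Floor division: int // int gives an int, any float operand gives a float.
int_result = 7 // 2       # 3
float_result = 7.0 // 2   # 3.0
negative = -7 // 2        # -4, because the result is floored, not truncated

# True division always produces a float.
true_div = 6 / 3          # 2.0
print(int_result, float_result, negative, true_div)
```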

Operators

Standard for the most part. Logical operators are “or”, “and”, “not”. True and false literals are “True” and “False”.

Bit shifting with “<<” and “>>” is supported on integers. Use “|” and “&” for bitwise “or” and “and”, “^” for xor.

Use “**” for exponentiation.
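A quick sketch of the operators in this section, using made-up values so the results are easy to check by hand:

```python
flags = 0b0101               # 5
shifted_left = flags << 2    # 0b10100 == 20
shifted_right = flags >> 1   # 0b10 == 2
either = 0b0101 | 0b0011     # 0b0111 == 7
both = 0b0101 & 0b0011       # 0b0001 == 1
differ = 0b0101 ^ 0b0011     # 0b0110 == 6
power = 2 ** 10              # 1024

# Logical operators are words, not symbols.
is_ok = (power > 1000) and not (both == 0)   # True
```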

Sequences

Lists, strings, byte sequences, tuples are all “sequences”, operated on with the same operators and notation. Lists are mutable, the rest aren’t.

List: lst = [1,2,3]
Tuple: tpl = (1,2,3)
String: s = "12345"

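Strings are sequences of Unicode characters; UTF-8 is the default encoding for source files and for “encode()”. Strings index by character, not byte offset. There is no separate character type: a character is a string of length 1. Use “str(value)” to make a string out of another type.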
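To see the difference between characters and bytes, here is a short sketch. The sample text is arbitrary, and the accented letters are written as escapes so the example does not depend on the file’s encoding.

```python
s = "na\u00efve caf\u00e9"        # two accented letters, written as escapes
char_count = len(s)               # 10: counted in characters
third = s[2]                      # the accented i, a string of length 1

encoded = s.encode("utf-8")       # bytes object
byte_count = len(encoded)         # 12: each accented letter takes two bytes

as_text = str(42)                 # "42"
```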

You index sequences with [] notation and make copies of slices with [start:end], where “end” is excluded. [:] makes a complete shallow copy. Negative indexes such as -1 and -2 count from the end of the sequence.
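A minimal sketch of indexing and slicing, with illustrative names:

```python
letters = ["a", "b", "c", "d", "e"]
first = letters[0]        # "a"
last = letters[-1]        # "e"
middle = letters[1:4]     # ["b", "c", "d"]; the end index is excluded
clone = letters[:]        # a new list holding the same items

clone.append("f")         # does not affect the original list

word = "12345"
tail = word[-2:]          # "45"; the same notation works on strings
```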

Strings are character sequences, so everything above applies to them as well.

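Check for membership with the “in” operator, and find the position of a value with “index(value)”. “len(sequence)” returns the length. Lists add the mutating methods “append(value)”, “extend(sequence)”, “pop(index)” and “remove(value)”; note that “remove()” takes a value, not an index.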

You can use the “+=” operator to extend a list in place with the items of another sequence. Using “lst = lst + other” is worse: it builds a new list on the right, assigns that new list to “lst” and discards the old one, and any other name that referred to the old list does not see the change.
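The difference is easiest to see with a second name bound to the same list. This is only a sketch; “alias” is a name I made up.

```python
lst = [1, 2, 3]
alias = lst                   # second name for the same list object

lst += [4, 5]                 # extends the existing list in place
same_object = alias is lst    # True: alias sees the change
after_inplace = list(alias)   # [1, 2, 3, 4, 5]

lst = lst + [6]               # builds a new list and rebinds the name lst
rebound = alias is lst        # False: alias still points at the old list
```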

Use the “deepcopy(sequence)” function from the “copy” module to make a complete copy of all values including nested lists.
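A short sketch of shallow versus deep copies on a nested list:

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = nested[:]            # new outer list, same inner lists
deep = copy.deepcopy(nested)   # new outer list and new inner lists

nested[0].append(99)
# shallow[0] is the same object as nested[0], so it shows the change.
# deep[0] is an independent copy and does not.
print(shallow[0], deep[0])
```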

Dictionaries (map / associative arrays)

A dictionary literal:

d = {"I":1, "II":2, "III":3}

Keys must be hashable, which in practice means immutable: no lists or dictionaries allowed. Interestingly a tuple can be a key, as long as everything inside it is hashable too. Keys can be of different types in the same dictionary.

The “in” operator checks for the existence of a key. The “len()” function gives the number of key-value pairs.

The “pop(key)” method returns the value for “key” and deletes the key-value pair. “popitem()” removes one pair and returns it as a (key, value) tuple; since Python 3.7 that is the most recently inserted pair. You could use it in a “while len(d)>0” type of loop.

Indexing into a dictionary with [key] raises KeyError if “key” doesn’t exist. The “get(key)” method instead returns the special value None when the key is missing, or a default you supply with “get(key, default)”.
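Here is a sketch of the lookup and removal methods, reusing the Roman numeral dictionary from above:

```python
d = {"I": 1, "II": 2, "III": 3}

missing = d.get("IV")           # None, no exception
fallback = d.get("IV", 0)       # 0, the default passed as second argument

lookup_failed = False
try:
    d["IV"]
except KeyError:
    lookup_failed = True        # indexing a missing key raises KeyError

two = d.pop("II")               # returns 2 and removes the pair
pair = d.popitem()              # returns a (key, value) tuple
```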

The “copy()” method makes a new shallow copy of the dictionary; use “copy.deepcopy()” to copy the values as well.

Use “update()” to merge another dictionary into an existing one in place; where both have the same key, the value from the argument wins.
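A minimal sketch of merging, assuming you want to keep the original dictionary untouched. The keys and values are invented for the example.

```python
base = {"host": "localhost", "port": 8080}
overrides = {"port": 9090, "debug": True}

merged = base.copy()        # shallow copy, so base stays as it was
merged.update(overrides)    # overrides win where keys collide
```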

Iterating on dictionaries:

Iterate over all keys:

for k in d:
	print(k)

Iterate over all values:

for v in d.values():
	print(v)

Both key and value:

for k, v in d.items():
	print(k, v)

Looking up d[k] inside a “for k in d” loop also works, but it costs an extra lookup per key.

Convert to a list of key-values:

kv = list(d.items())

List of keys or list of values:

ks = list(d.keys())
vs = list(d.values())

The “dict()” function builds a dictionary from a list of (key, value) pairs, or from any iterable that yields such pairs, such as the result of zip() or of items() on a dictionary. So

roman = ["I","II","III"]
arabic = [1,2,3]

r_to_a = dict( zip(roman, arabic))

Sets

Sets are collections of hashable (in practice, immutable) values where every member appears only once. Most standard set operators are supported, some only through methods on set objects.

A set literal:

s = {'a','b','c','d'}

Use “set()” to make a set from a sequence. This will make a set of ten members, one for each digit character.

digits = set("0123456789")

Sets themselves are mutable. Use the “add(value)” method on a set object to add a member. To make an immutable set use “frozenset()”:

digits = frozenset("1234567890")

Get the members of one set that are not in another with “difference(set)”, or use the “-” operator. Use “difference_update(set)” to remove the members of ‘set’ from the set in place; it is the in-place equivalent of “set2 = set2 - set1”.

Removing set members: “remove(value)” errors if value isn’t in the set, “discard(value)” does not error.

Union is supported with the “union(set)” method or the “|” operator, intersection with “intersection(set)” or the “&” operator. Use “isdisjoint(set)” to find out whether two sets have no members in common.

The “<” and “>” operators test for proper subsets and proper supersets. “<=” is interchangeable with “issubset(set)”, and “>=” with “issuperset(set)”; both also accept equal sets.
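The set operations above can be sketched like this, with two small example sets:

```python
evens = {0, 2, 4, 6, 8}
small = {0, 1, 2, 3}

only_evens = evens - small                  # {4, 6, 8}
combined = evens | small                    # union
shared = evens & small                      # {0, 2}
no_overlap = evens.isdisjoint({1, 3, 5})    # True

proper = {0, 2} < evens       # True: a subset and not equal
not_proper = evens < evens    # False: a set is not a proper subset of itself
subset = evens <= evens       # True, same as evens.issubset(evens)

small.discard(42)             # missing value, no error
remove_failed = False
try:
    small.remove(42)          # missing value, raises KeyError
except KeyError:
    remove_failed = True
```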

Conditionals

An if-statement can use “elif” for further conditions and “else” as the last branch; ternary expressions look like:

value if EXP else value2

A while loop or for loop may have an “else” clause. It runs once when the loop ends normally: when the “while” condition becomes false, or when the “for” loop runs out of items. A “break” statement will exit the loop and skip the “else” part.

The “for” loop in Python iterates over sequences or collections:

for x in numbers:
	do_stuff(x)
	if bad_thing_happens:
		break
else:
	print("finished")	# runs only if the loop never hit break
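A runnable sketch of the loop “else” clause. The function “first_negative” is a made-up helper, not something from a library.

```python
def first_negative(numbers):
    """Return the first negative number, or None if there is none."""
    for x in numbers:
        if x < 0:
            found = x
            break               # leaving via break skips the else clause
    else:
        found = None            # runs only when the loop was not broken out of
    return found

print(first_negative([3, -1, -2]))
print(first_negative([1, 2]))
```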

Or use range(begin, end, step) to produce a sequence of numbers from “begin” up to but not including “end”:

for a in range(2,2000):
	print(a)

To get an index as well as a value in a sequence inside a for-loop, use “enumerate()”:

elevens = range(11,2500,11)
for i, n in enumerate(elevens):
	print(i, n)

Make a copy to avoid changing a list during a for-loop:

for n in data[:]:
	data.remove(n)	# safe: the loop walks the copy, not data itself

Iterators

Get an iterator variable with “iter()”. The “next()” function gets the next value in the collection from the iterator.

d = [1,2,3,4,5]
itr = iter(d)
one = next(itr)

The for-loop iterates on an iterable container by calling “iter()” on the object to the right of “in” in the for statement.
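Roughly, a for-loop can be spelled out by hand as below. This is only a sketch of the idea, not how the interpreter is literally implemented.

```python
d = [1, 2, 3]
itr = iter(d)
seen = []

# Approximately what "for value in d: seen.append(value)" does.
while True:
    try:
        value = next(itr)
    except StopIteration:      # raised when the iterator is exhausted
        break
    seen.append(value)

# With a second argument, next() returns that value instead of raising.
leftover = next(itr, "done")
```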

Functions

Functions can have parameters with default values following the positional parameters; callers can pass them by name:

def f(a,b,c=1,d=2,e=3):
	return a+b+c+d+e
	
f(1,2)
f(1,2,c=8,d=9,e=10)
f(1,2,e=100)

Without a return a function returns the special None value.

Use “*” in the parameter list to accept an arbitrary number of positional arguments, which arrive as a tuple:

def rating(rating, *movies):
	for movie in movies:
		print(rating,movie)

Use the splat (“*”) operator to convert a list into an argument list:

movie_names = ["Three Kings", "Big", "Back to the Future"]
rating("Four Star", *movie_names)

An arbitrary number of named parameters requires “**”; they arrive as a dictionary:

def f(**named_list):
	print(named_list)
	
f(x=1,y=2,z=3)

arg_values = { "x":1, "y":99}

f(**arg_values)
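Both forms can be combined in one function. The sketch below uses a hypothetical helper called “describe” to show what lands in each parameter.

```python
def describe(title, *args, **kwargs):
    """Build one string from positional and named extras (hypothetical helper)."""
    parts = [str(a) for a in args]                      # args is a tuple
    parts += [f"{k}={v}" for k, v in kwargs.items()]    # kwargs is a dict
    return title + ": " + ", ".join(parts)

values = [1, 2]
options = {"x": 10, "y": 20}

# * unpacks the list into positional arguments, ** unpacks the dict into named ones.
line = describe("demo", *values, **options)
print(line)   # demo: 1, 2, x=10, y=20
```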