Having recently started learning Python, and having chosen it as my tool for solving Advent of Code this year, I’ve seen a few ways to call a map-type function on a list of data and get a new list based on the data inside.

Most commonly I’ve seen list comprehensions. As a non-Python dev, this is the most jarring approach, but also one of the simplest looking once you get the hang of reading it. Here is a comprehension that takes a list of strings and maps them to a list of integers.

# Map a list of strings to a list of integers.
str_arr = ['44', '33', '77', '03', '32']
int_arr = [int(item) for item in str_arr]

This is equivalent to using map with a lambda function.

int_arr = list(map(lambda item: int(item), str_arr))

map, like many Python built-ins, does not return a list automatically. It returns a lazy iterator that we then have to convert into a list using list(). Comprehensions are similar: wrapped in square brackets they build a list, while wrapped in parentheses they produce a generator. This is a feature of Python and can be used efficiently. Another way would be to create a generator function that yields on each iteration.

def convert_to_int(str_array):
  # Use "item" rather than "str" so the built-in str type is not shadowed.
  for item in str_array:
    yield int(item)

int_arr = convert_to_int(str_arr)

This returns a generator rather than a list, but it can be looped over all the same. Another option is the more procedural approach: a for loop that populates a list.

int_arr = []
for item in str_arr:
  int_arr.append(int(item))

The advantage of this is that the result is always a list and not a generator.
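As a quick check of which of the approaches above are lazy, here is a small sketch that prints the type each one produces. The variable names (as_list, as_gen, as_map) are made up for illustration, and it uses map(int, ...) directly instead of a lambda, which behaves the same here.

```python
str_arr = ['44', '33', '77', '03', '32']

# Square brackets build the whole list immediately.
as_list = [int(item) for item in str_arr]

# Parentheses build a generator expression; nothing is converted yet.
as_gen = (int(item) for item in str_arr)

# map() returns a lazy map object (an iterator), not a list.
as_map = map(int, str_arr)

print(type(as_list).__name__)  # list
print(type(as_gen).__name__)   # generator
print(type(as_map).__name__)   # map

# list() consumes a lazy iterator and collects the results.
gen_result = list(as_gen)
map_result = list(as_map)
print(gen_result)  # [44, 33, 77, 3, 32]
```

Once a generator or map object has been consumed it is empty, so calling list() on it a second time gives an empty list.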

Generators vs Lists

Coming mostly from JavaScript, where generators/iterators are not used often, it seemed silly that I would want one instead of a list. When comparing to Kotlin or Rust, which have similar data structures with lazy evaluation on lists, it makes more sense. Generators can have a huge benefit over lists because of this, but they lose some of the advantages, such as a known length or the ability to index into an exact item. In most languages you can convert a generator to a list very easily:

  • list(generator) in Python
  • [...generator] in JavaScript
  • sequence.toList() in Kotlin

Because of this, any disadvantage of having a generator can be solved without much code, but it comes at the cost of computing the entire iterator, which may not be the desired result.
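To illustrate the trade-off, here is a minimal sketch showing that a generator has no length, and that you can take just the first few items without computing the rest. It reuses a convert_to_int generator function like the one earlier in the post; using itertools.islice is my own choice for the example, not something the timings in this post relied on.

```python
from itertools import islice

def convert_to_int(str_array):
    for item in str_array:
        yield int(item)

str_arr = ['44', '33', '77', '03', '32']
gen = convert_to_int(str_arr)

# A generator has no length (and cannot be indexed either).
try:
    len(gen)
except TypeError as err:
    print('len() failed:', err)

# islice pulls only the first two items; the rest are not converted yet.
first_two = list(islice(gen, 2))
print(first_two)  # [44, 33]

# Converting to a list computes everything that is left.
remaining = list(gen)
print(remaining)  # [77, 3, 32]
```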

Comparisons

Let’s compare the methods above, and also compare creating a list vs. a generator for the ones that support it easily. Times are in seconds, as reported by timeit.

comprehension 0.013299654000000001
map_function 0.016781889000000008

for_loop 0.016316492000000002

generator_function 2.184999999987891e-06

comprehension_no_list 2.225999999994066e-06
map_function_no_list 1.6949999999960053e-06

Among the versions that build a full list, the difference is rather small, even over 100k+ strings. As expected, the ones that return a generator instead of a full list finish almost instantly, since no conversion has happened yet. However, let’s see whether an operation that works on both lists and generators, such as sum, takes longer with full lists as well.

comprehension 0.014051208000000003
map_function 0.016429000000000006

for_loop 0.016547979000000004

generator_function 0.012891581999999999

comprehension_no_list 0.012013383000000002
map_function_no_list 0.013967973999999994

Results

Timed using sum over each result, there is no noticeable difference between generators and lists once the values are actually consumed. It can therefore be beneficial to use generators if you do not need a list right away. Each iteration can take time, and a generator does that work, in this case converting a string to an int, only when an item is requested instead of all in advance.

Testing

Tested by converting 100k+ numeric strings into integers, timed using timeit.
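A harness along these lines would reproduce the kind of comparison above. This is a sketch under assumptions rather than the exact script behind the numbers: the input of exactly 100,000 strings, the number=1 setting, and the function bodies are all guesses based on the labels in the results.

```python
import timeit

# Assumed input: 100,000 numeric strings (the post only says "100k+").
str_arr = [str(n) for n in range(100_000)]

def comprehension():
    return [int(item) for item in str_arr]

def map_function():
    return list(map(lambda item: int(item), str_arr))

def for_loop():
    int_arr = []
    for item in str_arr:
        int_arr.append(int(item))
    return int_arr

def generator_function():
    for item in str_arr:
        yield int(item)

def comprehension_no_list():
    return (int(item) for item in str_arr)

def map_function_no_list():
    return map(lambda item: int(item), str_arr)

candidates = [comprehension, map_function, for_loop, generator_function,
              comprehension_no_list, map_function_no_list]

for func in candidates:
    # "build" times only the call; for lazy versions no conversion runs yet.
    build = timeit.timeit(func, number=1)
    # "sum" consumes the result, so every string is converted either way.
    consume = timeit.timeit(lambda: sum(func()), number=1)
    print(f'{func.__name__:<24} build={build:.6f}s  sum={consume:.6f}s')
```

Raising number and dividing the result by it would smooth out noise between runs; a single call is used here only to keep the sketch short.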

There is a great video called The Fastest Way to Loop in Python - An Unfortunate Truth from mCoding on YouTube. It goes over a slightly different and more detailed test of while loops vs. for loops, and even uses numpy to show how fast Python can be if used properly.