---

# S02E01: Loop better - Iterators & generators
Cyril Desjouy

---

## 1. The iterable

An iterable is an object whose elements can be traversed one at a time, for example with a `for` loop.
The sequence objects we saw earlier are all iterable: `list`, `tuple`, `str`, `ndarray`, ...

Iterable objects implement the `__iter__` method, which must return an object of type **iterator**.
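
As a rough illustration, the presence of `__iter__` can be checked with `hasattr`. This is only a minimal sketch, not an exhaustive test of iterability, and the sample objects are arbitrary choices:

```python
# Check a few built-in objects for the __iter__ method.
samples = [[1, 2, 3], (1, 2, 3), "abc", 42]
has_iter = [hasattr(obj, "__iter__") for obj in samples]
print(has_iter)  # [True, True, True, False]: an int is not iterable
```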


## 2. The iterator

An iterator is a particular object representing a data flow. It implements two special methods:

* `__iter__`: returns the iterator object itself (`self`). This method is required for the object to be used in a `for` loop.
* `__next__`: returns the next element. If there are no more elements, it raises the `StopIteration` exception.

All iterators are iterable objects (they have the `__iter__` method), but the converse is not true. Objects of type `list`, for example, do not have a `__next__` method, so they are iterable but not iterators. However, an iterator can be built from any iterable using the built-in `iter` function:

```Python
>> iterable = [1, 2, 3]       # list object that is an iterable
>> iterator = iter(iterable)  # equivalent to iterator = iterable.__iter__()
>> type(iterable)
list                           
>> type(iterator)
list_iterator               
```

><div class="alert alert-block alert-info">
Use the <code>len</code> function to determine the length of the object <code>iterable</code> and the length of the object <code>iterator</code>. 
</div>


As you may have noticed, iterators do not have all the features of iterables. For example, they do not have a `__len__` method. Moreover, they cannot be indexed!
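
A minimal sketch of what this looks like in practice, assuming the same `iterable` and `iterator` objects as above (the names `len_error` and `index_error` are introduced here only for illustration):

```python
iterable = [1, 2, 3]
iterator = iter(iterable)

print(len(iterable))      # 3

try:
    len(iterator)         # list iterators do not define __len__
except TypeError as err:
    len_error = err
    print("TypeError:", err)

try:
    iterator[0]           # list iterators do not support indexing
except TypeError as err:
    index_error = err
    print("TypeError:", err)
```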

><div class="alert alert-block alert-info">
Use the <code>next</code> function with the object <code>iterator</code> as input argument. Execute this instruction several times. What do you see? 
</div>

Once an iterator has been consumed, it cannot be reused. We already observed this behavior when studying generator expressions in a previous notebook. With each call to the `__next__` method, the iterator returns a new value until it is exhausted. It then raises the `StopIteration` exception.
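
The following sketch shows one possible way to observe this exhaustion explicitly. The variable names are illustrative; the optional second argument of `next` is a default value returned instead of raising:

```python
iterator = iter([1, 2, 3])
values = [next(iterator), next(iterator), next(iterator)]
print(values)  # [1, 2, 3]

exhausted = False
try:
    next(iterator)            # nothing left to return
except StopIteration:
    exhausted = True
    print("iterator is exhausted")

# With a default value, next() returns it instead of raising StopIteration
fallback = next(iterator, None)
print(fallback)  # None
```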

This is exactly what happens in a `for` loop:

* The `__iter__` method is called on the object to iterate, which returns an iterator.
* The `__next__` method is called at each iteration to obtain the next value from the iterator.
* When the iterator is exhausted, the `StopIteration` exception is raised. This is the signal for the loop to exit!
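
The three steps above can be sketched by hand with a `while` loop. This is only an approximate equivalent of a `for` loop, written for illustration, with arbitrary sample data:

```python
data = ["a", "b", "c"]
collected = []

iterator = iter(data)          # step 1: calls data.__iter__()
while True:
    try:
        item = next(iterator)  # step 2: calls iterator.__next__()
    except StopIteration:      # step 3: the exit signal
        break
    collected.append(item)     # body of the loop

print(collected)  # ['a', 'b', 'c']
```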

Custom iterators are also relatively easy to write: implement a **class** that defines the `__iter__` and `__next__` special methods. For instance:

```Python
class Counter:
    
    def __init__(self, n):
        self.i = 0
        self.n = n

    def __iter__(self):
        return self             # Returns the instance itself

    def __next__(self):
        if self.i > self.n:
            raise StopIteration # When i > n, the StopIteration exception is raised
        else:
            self.i += 1
            return self.i - 1   # Otherwise return the value i had before the increment
```

Each instance of this class is then an iterator:
```Python
>> c = Counter(10)
>> next(c)
0
>> next(c)
1
```
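
Because such an instance is an iterator, it can also be used directly in a `for` loop or a comprehension. The sketch below repeats the class so that it is self-contained, and assumes the same `>` test as above, so `Counter(3)` produces the values 0 to 3 inclusive:

```python
class Counter:
    """Same iterator as above, repeated so the example runs on its own."""

    def __init__(self, n):
        self.i = 0
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.i > self.n:
            raise StopIteration
        self.i += 1
        return self.i - 1


c = Counter(3)
first_pass = [value for value in c]  # calls iter(c), then next(c) repeatedly
second_pass = list(c)                # c has already been consumed
print(first_pass)   # [0, 1, 2, 3]
print(second_pass)  # []
```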

**Conclusions:** In scientific computing, the amount of data to be processed is sometimes so large that it cannot be stored in memory. Because iterators produce their values one at a time, they have a very small memory footprint regardless of the size of the dataset. This allows them to work on colossal amounts of data, even infinite sequences. The disadvantage of iterators is that once consumed, they cannot be reused.

## 3. The generator

### 3.1. Generator functions

A generator is an iterator. It is usually defined by a function in which the `return` instruction is replaced by one or more `yield` instructions. Such a function is called a **generator function**. The special methods `__iter__` and `__next__` (which define an iterator) are implemented automatically on the generator object it returns. Here is an example of a generator function:

```Python
def counter():
    yield 1
    yield 2
    yield 3
```

which allows you to create a generator:
```Python
>> c = counter()
>> type(counter)
function
>> type(c)
generator
```

><div class="alert alert-block alert-info">
Use the <code>next</code> function repeatedly on the object <code>c</code>. 
</div>

Since generators are iterators, they exhibit the same behaviour. Once empty, they raise the `StopIteration` exception and cannot be reused.

To avoid repeating `yield` instructions, generator functions often use loops:
```Python
def counter(n):
    for i in range(n):
        yield i
```

This makes it possible to create a generator that can generate billions of values in an optimized way and without overloading the memory.

```Python
>> c = counter(2**24)
>> c.__sizeof__()
96                               # 96 bytes in memory
>> l = [i for i in range(2**24)] # list with the same number of elements as our generator
>> l.__sizeof__()
146916472                        # ~147 MB in memory (not counting the int objects themselves)!
```

How about an infinite iterator? 
```Python
def counter():
    i = 0
    while True:
        i += 1
        yield i
```

```Python
>> c = counter()
>> c.__sizeof__()
96
```
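
An infinite generator must never be passed to `list` directly, since the call would not terminate. One possible way to take only the first few values is `itertools.islice`, as in this minimal sketch (the name `first_five` is illustrative):

```python
from itertools import islice


def counter():
    i = 0
    while True:
        i += 1
        yield i


# islice stops after 5 values, so the infinite loop is never run to completion
first_five = list(islice(counter(), 5))
print(first_five)  # [1, 2, 3, 4, 5]
```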

### 3.2. Generator expressions

The way **generator functions** work should remind you of the **generator expressions** seen in a previous notebook.

As a reminder, generator expressions make it possible to create **generators** in a simpler and more concise way than generator functions. The syntax of generator expressions is similar to that of list comprehensions but uses parentheses instead of square brackets. Here is an example:

```Python
s = ('hello {}'.format(i) for i in ['Bob','John','Jim','Jim','World!'])
```

><div class="alert alert-block alert-info">
Use the <code>print</code> function to display this generator.
</div>

Whether you are dealing with an iterator or a generator, the values cannot be accessed directly (no `__len__` method, no indexing, ...). To access them, you must use the `next` function or iterate over the object with a `for` loop!
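
As a small sketch of both access patterns, using a shortened version of the generator expression above (the shortened name list is an illustrative choice):

```python
s = ('hello {}'.format(i) for i in ['Bob', 'John', 'Jim'])

first = next(s)   # consumes the first value only
rest = list(s)    # iterates over whatever is left

print(first)  # hello Bob
print(rest)   # ['hello John', 'hello Jim']
```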

**Note:** *Generator expressions can be passed as input arguments to any function that accepts an iterable. For instance:*

```Python
sum(i**2 for i in range(10))
```

*In this case, the generator expression does not need its own set of parentheses.*
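
This shortcut only applies when the generator expression is the sole argument. As a minimal sketch, passing a second argument (here the optional start value of `sum`, chosen arbitrarily as 100) requires the explicit parentheses; without them Python raises a `SyntaxError`:

```python
# Sole argument: the enclosing call parentheses are enough
total = sum(i**2 for i in range(10))

# With a second argument, the generator expression needs its own parentheses
shifted = sum((i**2 for i in range(10)), 100)

print(total)    # 285
print(shifted)  # 385
```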

## Summary

**Iterators** are built from an iterable using the `iter` function or by writing a custom class that implements the `__iter__` and `__next__` methods.

**Generators** are built from a function using the `yield` instruction or a generator expression.

Every generator or iterator is an iterable. The converse is not necessarily true.
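
These relationships can be checked with the abstract base classes `Iterable` and `Iterator` from the standard `collections.abc` module. The sketch below is one possible verification; the dictionary keys are just descriptive labels:

```python
from collections.abc import Iterable, Iterator

lst = [1, 2, 3]
gen = (x for x in lst)

checks = {
    "list is Iterable": isinstance(lst, Iterable),
    "list is Iterator": isinstance(lst, Iterator),
    "iter(list) is Iterator": isinstance(iter(lst), Iterator),
    "generator is Iterator": isinstance(gen, Iterator),
    "generator is Iterable": isinstance(gen, Iterable),
}

for label, result in checks.items():
    print(label, "->", result)  # only "list is Iterator" is False
```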


## Application: Prime numbers

><div class="alert alert-block alert-info">
Create a generator that yields all prime numbers from 0 to $n$, where $n$ is an input parameter. </div>


## References

* [Data-flair - Generator vs. Iterator](https://data-flair.training/blogs/python-generator-vs-iterator/)