Sept. 1, 2019 * Python Programming

Python: Random access generator for multi value sublist yield

Writing custom code every time we need a particular set of values from a list can be tedious. A reusable generator class solves this. Such a class can:

  • yield multiple values with every call
  • access the base list sequentially or randomly
  • control what happens when we reach the end of the list
  • create an infinitely yielding generator

Custom generator class

With a list of numbers, we may need to generate two or more values per iteration.

mylist = [1, 2, 3, 4, 5, 6, 7, 8, 9]

Output #1:
[1, 2], [3, 4], [5, 6]...

Output #2:
[1, 3, 4], [2, 4, 5], [3, 5, 6]...

Output #3:
[1, 3], [6, 8], [3, 5]...

The examples show sequential or random subsets generated from the main list. The step size is variable during sequential access. The number of elements returned and the skip pattern are also customizable. The following class 'Racgen' provides all of these features and makes the code reusable.
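The three outputs above can be reproduced from a base index, a step, and a list of offsets. The sketch below is a standalone illustration, not the Racgen class itself; the helper name `sublists` and its parameters are assumptions made for this example.

```python
import random

def sublists(lst, pattern, start=0, step=1, count=3, rng=None):
    """Return `count` sublists, each built from a base index plus offsets.

    If `rng` is given, base indices are drawn at random instead of stepping.
    Offsets that fall outside the list produce None.
    """
    out = []
    index = start
    for _ in range(count):
        base = rng.randrange(len(lst)) if rng is not None else index
        out.append([lst[base + off] if 0 <= base + off < len(lst) else None
                    for off in pattern])
        index += step
    return out

mylist = [1, 2, 3, 4, 5, 6, 7, 8, 9]

output1 = sublists(mylist, pattern=[0, 1], step=2)
output2 = sublists(mylist, pattern=[0, 2, 3], step=1)
output3 = sublists(mylist, pattern=[0, 2], rng=random.Random())

print(output1)  # [[1, 2], [3, 4], [5, 6]]
print(output2)  # [[1, 3, 4], [2, 4, 5], [3, 5, 6]]
print(output3)  # varies from run to run
```

Output #1 and Output #2 differ only in the step and the offsets. Output #3 uses the same idea with a randomly chosen base index.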

Racgen: Random access generator class

import random

class Racgen:
    """    
    Random access generator
    * yields sublist of items from a list
    * access is random or sequential
    * multiple values per call using custom skip pattern    
    * enumeration counter
    """
    
    def __init__(self, lst=None):
        """
        initializes generator class
        * lst: list to iterate over
        * pattern: relative item positions to sample
        * index: base position
        * step: index increase or decrease stepsize
        * control: custom control function for end logic
        * random: if True, uses random indices
        * counter: count of returned values
        """

        # avoid a mutable default argument shared between instances
        self.lst = [] if lst is None else lst
        self.pattern = [0]
        self.index = 0
        self.step = 1
        self.control = None
        self.random = False
        self.counter = 1

    
    def __next__(self):
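        """
        iteration function for generator class
        returns: (counter, [sub list])
        * -1 is returned if the base index is out of range
          (sequential mode with no control function)
        * None fills sublist positions that are out of range
        """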

        # length of list
        maxpos = len(self.lst)
        
        # random index position
        if self.random:
            indexValue = random.randint(0, maxpos-1)            
            counterValue = self.counter
            self.counter += 1

        # sequential position
        else:
            # check for custom function
            if self.control is not None:
                self.control(self)
            # default check: if index is outside range, return -1
            elif (self.index < 0 or self.index >= maxpos):
                return -1
            else:
                pass

            indexValue = self.index
            counterValue = self.counter
            self.index += self.step
            self.counter += 1
            
        # return values of subset
        values = []        
        for pos in self.pattern:
            subIndx = indexValue + pos
            if (0 <= subIndx < maxpos):
                values.append(self.lst[subIndx])
            else:
                values.append(None)
        return (counterValue, values)



# instantiate generator
mylist = [1,2,3,4,5,6,7,8,'a','b','c']
gen = Racgen(mylist)

# request values from generator
for i in range(10):
    print(next(gen))
Output from code:
(1, [1])
(2, [2])
(3, [3])
(4, [4])
(5, [5])
(6, [6])
(7, [7])
(8, [8])
(9, ['a'])
(10, ['b'])

The Racgen class was instantiated with a list of numbers and letters, and all custom parameters were left at their defaults. This creates a simple sequential generator starting from the first element. Values are returned as a tuple: the first element is a counter, and the second is a sublist derived from the list being iterated over.

Sequential access yielding pair of values

Let us configure the generator instance to return a pair of values. The specifications for the returned values are:

  • pair of adjacent values
  • start at third element
  • step-size is 3

Example: Sequential with two return values

mylist = [1,2,3,4,5,6,7,8,'a','b','c']
gen = Racgen(mylist)

# returns two values @ index, index+1
gen.pattern = [0,1]

# start at index position 2
# step size is 3
gen.index = 2
gen.step = 3

for i in range(10):
    print(next(gen))
Output of code:
(1, [3, 4])
(2, [6, 7])
(3, ['a', 'b'])
-1
-1
-1
-1
-1
-1
-1

We start at the third element, followed by the 6th and 9th elements. Each call returns the main element along with the element adjacent to it, so the sublist has two values as intended.

Since the list is short, we cannot get 10 sets of values from the generator. Once the list is exhausted, the generator returns -1 instead of a tuple. This causes errors if the caller unpacks the result, so the calling code must handle this case to avoid crashing.
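A minimal sketch of such caller-side handling follows. It uses a hard-coded list that mimics the return values shown above, so it runs without the class; the names `results` and `pairs` are illustrative only.

```python
# Values mimicking the output above: tuples first, then -1 once exhausted.
results = [(1, [3, 4]), (2, [6, 7]), (3, ['a', 'b']), -1, -1]

# Unpacking -1 directly fails, because an int is not iterable.
try:
    counter, values = results[3]
    unpack_failed = False
except TypeError:
    unpack_failed = True

# Guard: stop at the first -1 before unpacking.
pairs = []
for result in results:
    if result == -1:
        break
    counter, values = result
    pairs.append((counter, values))

print(unpack_failed)  # True
print(pairs)          # [(1, [3, 4]), (2, [6, 7]), (3, ['a', 'b'])]
```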

There is a built-in feature to control what happens when we run out of list elements to iterate over. A custom function can be provided to the generator with logic to handle the possible cases.

Control function for generator

Assigning a custom control function lets us define how the generator behaves when the index goes beyond the bounds of the list being iterated over.

Let us saturate the output to a static value in the next example.

Control function for generator

mylist = [1,2,3,4,5,6,7,8,'a','b','c']
gen = Racgen(mylist)
gen.pattern = [0,1]
gen.index = 2
gen.step = 3

# if moved past right edge
# then saturate at fixed index
def myfunc(obj):
    if obj.index > 8:
        obj.index = 6
        obj.step = 0
gen.control = myfunc

for i in range(10):
    print(next(gen))
Output of code:
(1, [3, 4])
(2, [6, 7])
(3, ['a', 'b'])
(4, [7, 8])
(5, [7, 8])
(6, [7, 8])
(7, [7, 8])
(8, [7, 8])
(9, [7, 8])
(10, [7, 8])

Function myfunc is assigned to the generator's control attribute. On every sequential call, the generator runs this function before using the index, in place of the default boundary check. All the instance attributes are available within the custom function.

In the example, the custom function is triggered when index is greater than 8. Once this happens:

  • the index is reset to 6 (which is the 7th element)
  • the step size is set to zero from initial value of 3

So for the rest of the generator calls, the 7th and 8th elements are returned over and over again. If the step size is assigned a negative value, the generator would start yielding values in reverse order. We will try this out later.
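Because a control function only reads and writes attributes on the object it receives, it can be checked in isolation before it is attached to a generator. The sketch below uses `types.SimpleNamespace` as a stand-in for a Racgen instance, which is an assumption made for illustration; `saturate` repeats the logic of myfunc.

```python
from types import SimpleNamespace

def saturate(obj):
    """Same logic as myfunc above: pin the index once it passes 8."""
    if obj.index > 8:
        obj.index = 6
        obj.step = 0

# Stand-in objects carrying only the attributes the function touches.
inside = SimpleNamespace(index=5, step=3)
outside = SimpleNamespace(index=11, step=3)

saturate(inside)
saturate(outside)

print(inside.index, inside.step)    # 5 3  (unchanged)
print(outside.index, outside.step)  # 6 0  (pinned)
```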

Random index generation

The generator class Racgen can also yield list values by choosing index values randomly. In this mode the sequential settings (index, step, and control) are not used, since each base index is drawn at random from within the list.

Randomly access subsets

mylist = [1,2,3,4,5,6,7,8,'a','b','c']
gen = Racgen(mylist)

# set generator to subset randomly
gen.random = True

for i in range(10):
    print(next(gen))
Output for code:
(1, ['c'])
(2, [4])
(3, [5])
(4, ['c'])
(5, [5])
(6, [1])
(7, [4])
(8, [1])
(9, ['b'])
(10, ['a'])

As expected, each call returns a tuple holding the counter and a single value from the list. We can also control how many elements are returned, and their positions relative to the main index.

Multiple values with random index selection

mylist = [1,2,3,4,5,6,7,8,'a','b','c']
gen = Racgen(mylist)
gen.pattern = [-1, 0, 2]
gen.random = True

for i in range(10):
    print(next(gen))
Output of code:
(1, [4, 5, 7])
(2, [4, 5, 7])
(3, ['a', 'b', None])
(4, [3, 4, 6])
(5, [4, 5, 7])
(6, [6, 7, 'a'])
(7, [1, 2, 4])
(8, ['b', 'c', None])
(9, ['a', 'b', None])
(10, [6, 7, 'a'])

The generator was set to random mode, and 3 values were requested per call: the element just before the base index, the element at the base index, and the element two positions after it.

The main index is bounded between 0 and the last position of the list. However, for index = 0 there is no value before it in the list, and similarly there is no element beyond the end of the list. In these cases None is returned in the sub-list. In random mode the key element always has a valid value from the list, but the other values in the sub-list need to be checked. The same happens with sequential access when the key index is valid but other requested elements fall outside the list.
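If None entries are unwanted in random mode, one option is to limit the base index to positions where every offset in the pattern lands inside the list. Racgen does not do this; the helper `valid_base_range` is a hypothetical addition, sketched under that assumption.

```python
import random

def valid_base_range(length, pattern):
    """Return (low, high) base indices for which every offset stays in range."""
    low = max(0, -min(pattern))
    high = length - 1 - max(0, max(pattern))
    if low > high:
        raise ValueError("pattern does not fit inside the list")
    return low, high

mylist = [1, 2, 3, 4, 5, 6, 7, 8, 'a', 'b', 'c']
pattern = [-1, 0, 2]

low, high = valid_base_range(len(mylist), pattern)
print(low, high)  # 1 8

# Any base index in [low, high] gives a complete sublist.
base = random.randint(low, high)
values = [mylist[base + off] for off in pattern]
print(values)  # three list elements, never None
```

The trade-off is that elements near the edges are then chosen less often as the key element.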

Infinite oscillating generator

Finally, let us create an infinite oscillating generator. The control function reverses the direction when the index crosses the lower or upper limit it checks for. The generator has the following characteristics:

  • sequential, starting at the 6th element
  • reverses direction when the index goes out of bounds

Oscillating generator

# instantiate generator
mylist = [1,2,3,4,5,6,7,8,'a','b','c']
gen = Racgen(mylist)

# returned sublist pattern
# [value@ index, index-1, index+2] 
gen.pattern = [0,-1,2]

# start at index pos 5 (sixth element)
gen.index = 5

# custom control function to reset sequential 
# index when it moves past boundary values
# step direction is reversed

def myfunc(obj):
    # if past right end
    if obj.index > 8:
        obj.index = 7
        obj.step = -1 

    # if past left end
    if obj.index < 1:
        obj.index = 2
        obj.step = 1

gen.control = myfunc

# request values from generator
for i in range(18):
    print(next(gen))
Output from code:
(1, [6, 5, 8])
(2, [7, 6, 'a'])
(3, [8, 7, 'b'])
(4, ['a', 8, 'c']) 
(5, [8, 7, 'b']) << reversal on high end
(6, [7, 6, 'a'])
(7, [6, 5, 8])
(8, [5, 4, 7])
(9, [4, 3, 6])
(10, [3, 2, 5])
(11, [2, 1, 4])
(12, [3, 2, 5]) << reversal on the low end
(13, [4, 3, 6])
(14, [5, 4, 7])
(15, [6, 5, 8])
(16, [7, 6, 'a'])
(17, [8, 7, 'b'])
(18, ['a', 8, 'c'])

The generator yielded 18 values, but we could have continued forever. The custom control function reverses the direction and resets the index whenever it passes the upper or lower limit. The returned values keep oscillating within the list values. This can be very useful when creating HTML tables, where properties can be cycled through.
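To see why the values bounce, it helps to trace only the base index. The sketch below replays the control logic on plain integers, without the Racgen class; `trace_indices` is a name made up for this illustration.

```python
def trace_indices(start, calls, high=8, low=1):
    """Replay the oscillating control logic and record each base index used."""
    index, step = start, 1
    seen = []
    for _ in range(calls):
        # same checks as myfunc, applied before the index is used
        if index > high:
            index, step = high - 1, -1
        if index < low:
            index, step = low + 1, 1
        seen.append(index)
        index += step
    return seen

indices = trace_indices(start=5, calls=18)
print(indices)
# [5, 6, 7, 8, 7, 6, 5, 4, 3, 2, 1, 2, 3, 4, 5, 6, 7, 8]
```

The base index only ever visits positions 1 through 8. Positions 0, 9 and 10 of the list are reached through the pattern offsets alone.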

Mechanics of the generator class

The generator is initialized with default values for sequential access. Random access has to be enabled after instantiation. Some points to note:

  • the list to iterate over needs to be assigned at the start
  • the list can be changed between calls, but care is needed since there are no protections against failures
  • counter value can be adjusted as needed
  • the sublist pattern, expressed relative to the key index (offset 0), can be altered; None is returned for positions that fall out of bounds
  • if there is no control function assigned, then a simple low and high boundary check is performed

Initialization

def __init__(self, lst=None):
        # avoid a mutable default argument shared between instances
        self.lst = [] if lst is None else lst
        self.pattern = [0]
        self.index = 0
        self.step = 1
        self.control = None
        self.random = False
        self.counter = 1
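One thing to watch with list-valued defaults in `__init__`: a default written as `lst=[]` is created once, when the function is defined, and is shared by every instance that relies on it. The sketch below shows the effect with two hypothetical classes, `SharedDefault` and `FreshDefault`, which exist only for this illustration.

```python
class SharedDefault:
    def __init__(self, lst=[]):      # one list object, created at definition time
        self.lst = lst

class FreshDefault:
    def __init__(self, lst=None):    # None as a marker for "no list given"
        self.lst = [] if lst is None else lst

a, b = SharedDefault(), SharedDefault()
a.lst.append(1)
print(b.lst)  # [1]  (b sees a's change)

c, d = FreshDefault(), FreshDefault()
c.lst.append(1)
print(d.lst)  # []
```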

The length of the list is recalculated on every call, so that any change to the list is reflected in the bounds checks. It is a key parameter for the code to function.

There are two ways of calculating the key index value. In random mode, it is selected at random from within the list length. In sequential mode, it is calculated by increasing or decreasing the current index position by the step size.

In the sequential case, either the built-in boundary check or the user-provided control function is applied to keep the key index valid.

The subset is created by offsetting the key index with values from pattern list. If these indices are outside bounds, then None is returned for the specific values.

The current counter value is packaged with the sub-list as a tuple, forming the return value of the __next__ method. The counter is incremented for the next call.

__next__ method

def __next__(self):
        # length of list
        maxpos = len(self.lst)
        
        # random index position
        if self.random:
            indexValue = random.randint(0, maxpos-1)            
            counterValue = self.counter
            self.counter += 1

        # sequential position
        else:
            # check for custom function
            if self.control is not None:
                self.control(self)
            # default check: if index is outside range, return -1
            elif (self.index < 0 or self.index >= maxpos):
                return -1
            else:
                pass

            indexValue = self.index
            counterValue = self.counter
            self.index += self.step
            self.counter += 1
            
        # return values of subset
        values = []        
        for pos in self.pattern:
            subIndx = indexValue + pos
            if (0 <= subIndx < maxpos):
                values.append(self.lst[subIndx])
            else:
                values.append(None)
        return (counterValue, values)

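__next__ is a special method that any class can define. Once it is defined, it can be called as next(instance), which lets the class behave like a generator. Note that Racgen does not define __iter__, so an instance cannot be used directly in a for loop; values have to be requested with next().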
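The difference between supporting `next()` and supporting a `for` loop can be sketched with two small classes. `NextOnly` and `FullIterator` are made up for this illustration and are not part of Racgen.

```python
class NextOnly:
    """Defines __next__ only, so next() works but for loops do not."""
    def __init__(self, items):
        self.items = list(items)
        self.index = 0

    def __next__(self):
        if self.index >= len(self.items):
            # conventional end-of-data signal (Racgen returns -1 instead)
            raise StopIteration
        value = self.items[self.index]
        self.index += 1
        return value

class FullIterator(NextOnly):
    """Adds __iter__, so for loops and list() accept it."""
    def __iter__(self):
        return self

first = next(NextOnly([1, 2, 3]))
print(first)  # 1

try:
    for _ in NextOnly([1, 2, 3]):
        pass
    loop_ok = True
except TypeError:
    loop_ok = False
print(loop_ok)  # False

print(list(FullIterator([1, 2, 3])))  # [1, 2, 3]
```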