Skip to content

Creating customised models

framinan edited this page May 30, 2022 · 18 revisions

Overview

The models provided in scheptk are canonical models that do not include many constraints and objectives that may be found in real-life (such as setups, precedence constraints, or non-regular objectives). There are several functionalities in scheptk that allow you to create customised models with minimal effort, as well as solution procedures for these models.

In a nutshell, customised models are children of the Model class that carry out certain scheduling functions (e.g. which data to store and handle, how to compute the completion times, etc.) in an specific manner. So, in order to create a customised class, we should define the class as a children of the Model class and tell it how these specific funcions are carried out in this model.

Once these methods are described for the children class, all functions of scheptk should be available for the class. The following code summarises how a class Assembly_2_machines can be defined and used.

from scheptk.scheptk import Model
from scheptk.util import *

class Assembly_2_machines(Model):

    # constructor of the class: define the data (temporary=attributes, persistent=tags in the file)
    def __init__(self, filename):

        self.jobs = read_tag(filename,'JOBS')
        self.pt = read_tag(filename,'PT')


    # computes the completion time of each job on each machine
    def ct(self, solution):

        # completion time of jobs on each machine (the two first machines are the first stage, the third is the assembly stage)
        ct = [[0 for j in range(self.jobs)] for i in range(3)]

        # earliest starting time on the machines (initially zero)
        earliest_st_machine_1 = 0
        earliest_st_machine_2 = 0
        earliest_st_assembly = 0

        for j,job in enumerate(solution):

            # compute completion times of each job in the first stage
            ct[0][j] = earliest_st_machine_1 + self.pt[0][job] 
            ct[1][j] = earliest_st_machine_2 + self.pt[1][job] 
            
            # compute completion time of each job in the assembly stage
            ct[2][j] = max(earliest_st_assembly, max(ct[0][j], ct[1][j])) + self.pt[2][job]
            
            # update earliest completion times of the machines in the stages
            earliest_st_machine_1 = ct[0][j] 
            earliest_st_machine_2 = ct[1][j] 
            earliest_st_assembly = ct[2][j]
        
        return ct, solution

Once this code has been entered, all functions of a scheduling model in scheptk are available, i.e.:

instance = Assembly_2_machines('assembly_sample.txt')

solution = [1,0,2]
instance.Cmax(solution)
instance.SumCj(solution)
instance.print_schedule(solution)

and the makespan, the sum of the completion times, or even the Gantt chart can be depicted. In the next sections, the details of this code are provided.

Introduction to creating scheduling models

Roughly speaking, a scheduling model is an abstraction of a real-life scheduling problem where the decision problem can be modelled using a set of data, constraints, and objectives (criteria). Data in a scheduling model can be stored in a persistent manner (i.e. a file or a database), or loaded in the computer memory while running the program (i.e. variables, or the attributes of a model class). Constraints in a scheduling model usually determine how the jobs are processed in the machines, therefore are implemented as methods of the model class. These constraints determine the value of certain magnitudes of interest (criteria), so the objectives are also methods of the class.

Data

In general, there are two types of data in a scheduling problem: input data (the data required to make the scheduling decision) and output data (the data that constitute the decision). For instance, the jobs, machines and processing times are input data of a scheduling model, and an schedule indicating when each job is processed on each machine are output data. For the sake of brevity, the set of input data will be denoted as a model, and output data will be denoted as schedule.

Both types can be persistent (i.e. we might want to save them in a file and store them after the program is run), or temporary (i.e. we might want to load them in the memory of the program to use them). In scheptk, persistent data are stored in files and specified by tags, and there are some functions in util to read and write tags from and to files, and to edit and visualize them:

  • read_tag(filename, tag)
  • write_tag(tag, value, filename)
  • print_tag()
  • edit_tag(tag, value, filename)

The Model class

The model class is the parent class of all scheduling models in scheptk. The Model class contains two methods that should be instantiated by each children class, i.e.

  • __init__(filename), the constructor where the data of the model are defined. Basically, in this function we will define what attributes would have the customised model, and what is the tag name that we will use to read them from a file. Typically, the __init(filename)__ will contain lines of code like the following one: self.attr= read_tag(filename,attr_tag) where attr is the name of the attribute (such as jobs, machines, setups, etc) and attr_tag is the corresponding tag in the file (such as 'JOBS', 'MACHINES', 'SETUPS', etc). In the example of the Assembly_2_machines class, the input file would read something like:
[JOBS=3]
[PT=40,4,8;21,14,25;16,5,12]

would require the following __init__(filename) method:

   def __init__(self, filename):

        self.jobs = read_tag(filename,'JOBS')
        self.pt = read_tag(filename,'PT')

Note that different tags could have been used for the model, for instance an input file like this:

[JOBS=3]
[PT_MACHINE_1=40,4,8]
[PT_MACHINE_2=21,14,25]
[PT_ASSEMBLY=16,5,12]

and the corresponding __init__(filename) method would be

   def __init__(self, filename):

        self.jobs = read_tag(filename,'JOBS')
        self.pt = [read_tag(filename,'PT_MACHINE_1'), read_tag(filename,'PT_MACHINE_2'), read_tag(filename,'PT_ASSEMBLY')]

which is certainly more visual although less simple. In any case, it is very important that the attribute processing times are stored in the standard way, i.e. self.pt[i][j] is a list containing, for each machine, the processing time of each job on this machine, or (in the case of a single machine) self.pt[j] is a list containing the processing time of each job. If this structure for the processing times is not followed, then it is likely that the functionalities related to the Gantt charts do not work properly. In this case, however, it is always possible to use the class Schedule to produce and ad-hoc chart.

  • ct(solution) a method that takes a solution (in a user-defined solution encoding) and returns the completion time of each job in the solution on each machine AND the order in which the jobs are specified in ct. Next, the code for the ct() method in the single machine case is included:
    def ct(self,sequence):
        completion_time = []
        completion_time.append(self.r[sequence[0]] + self.pt[sequence[0]])
        for i in range(1,len(sequence)):
            completion_time.append(max(completion_time[i-1],self.r[sequence[i]]) + self.pt[sequence[i]])
        return [completion_time], sequence

It is important to remark that ct() returns a matrix (a list of lists) with as many rows as the number of machines, and as many columns as the number of jobs in the solution (not the total number of jobs). This difference is important for instance when one wants to deal with partial sequences (which represent partial solutions in many problems) and not with full sequences. This can be illustrated by the following example in the SingleMachine layout using the same 4-jobs instance that has been used to illustrate SingleMachine in another section of the wiki:

instance = SingleMachine('test_single.txt')
solution = [3,1]
completion_times, order_ct = instance.ct(solution)
print( 'Completion times: , completion_times )
print( 'Order of the jobs in the partial solution:', order_ct )

The execution of this code provides the following output:


----- Reading SingleMachine instance data from file test_single.txt -------
[JOBS=4]
[PT=2,10,1,1]
[W=2.0,1,1,2]
[DD=11,13,13,14]
[R=0,0,0,0]
----- end of SingleMachine instance data from file test_single.txt -------
Completion times: [[1, 11]]
Order of the jobs in the partial solution: [3, 1]

As it can be seen, the completion_time list returned is composed of one element (one machine, a row), in turn composed of two elements (two jobs, the columns). It is not a one machine, four jobs matrix even if in the instance there are 4 jobs. Failing to return the completion times in this way might result in problems when calculating the indicators of partial measures.

A sensible question at this point is, Why the order of the jobs has to be returned by the function? Is not this information already known? The answer is, in general, no, as there are encoding schemes (such as the codification used by JobShop or OpenShop) where the solution provided to ct() is not the sequence or order of the jobs. However, this is not the case if the solution encoding is simply the sequence of the jobs. However, since the methods related to Gantt charts are defined at the parent level (the Model class), then ct() has to be written always in the same way regardless the layout.

In the following we can see another example of the method ct() for the case of UnrelatedMachines:


   # implementation of completion times for unrelated parallel machines (ECT rule)
    # ties are broken with the lowest index
   def ct(self, sequence):

        # initializing completion times in the machines to zero
        ct_machines = [0 for i in range(self.machines)]
        ct = [[0 for j in range(len(sequence))] for i in range(self.machines)]

        # assign all jobs
        for j in range(len(sequence)):
            # construct what completion times would be if the job is assigned to each machine
            next_ct = [max(ct_machines[i],self.r[sequence[j]]) + self.pt[i][sequence[j]] for i in range(self.machines)]
            # assign the job to the machine finishing first
            index_machine = find_index_min(next_ct)
            # increases the completion time of the corresponding machine (and sets the completion time of the job)
            ct_machines[index_machine] = max(ct_machines[index_machine], self.r[sequence[j]]) + self.pt[index_machine][sequence[j]]
            ct[index_machine][j] = ct_machines[index_machine]
    
        return ct, sequence

As it can be seen, the completion times returned include some zeros if the job is not processed in this machine. This is not a problem at all. Finally, please check the example of ct() for the JobShop() where it can be seen that the solution passed as argument is not the same thing that the list of the jobs:

     def ct(self, solution):
         
       # get the jobs involved in the solution in the order they are processed
       jobs_involved = list(set(solution))

       # completion times of jobs and machines
       ct_jobs = [self.r[jobs_involved[j]] for j in range(len(jobs_involved))]
       ct_machines = [0 for i in range(self.machines)]  

       # completion time of each job on each machine
       ct = [[0 for j in range(len(jobs_involved))] for i in range(self.machines)]

       # number of operations completed by each job (initially zero)
       n_ops_jobs = [0 for j in range(len(jobs_involved))]

       for job in solution:

           # determine the corresponding machine
           machine = self.rt[job][n_ops_jobs[jobs_involved.index(job)]]

           # compute completion time
           curr_completion_time = max(ct_jobs[jobs_involved.index(job)], ct_machines[machine]) + self.pt[machine][job]

           # update completion times
           ct_jobs[jobs_involved.index(job)] = curr_completion_time
           ct_machines[machine] = curr_completion_time
           ct[machine][jobs_involved.index(job)] = curr_completion_time

           # update number of operations for the job
           n_ops_jobs[jobs_involved.index(job)] += 1

       return ct, jobs_involved