# coding: utf-8
# # Solving the JobShop scheduling problem using Artelys Kalis
# Scheduling problems are concerned with determining a plan for the execution of a given set of tasks. Addressing these challenges effectively is crucial in the industry, as it allows to reduce cost without compromising quality. One classical scheduling problem is the Job-Shop, in which $n$ jobs must be completed using $m$ machines. Each job is composed of multiple tasks of varying execution time, which must be processed in a given order and on specific machines. The aim is to find a schedule that minimizes the makespan (total length of the process).
#
# Because of its highly combinatorial nature, Constraint Programming (CP) is particularly interesting for solving this problem. In the following we describe the JobShop problem and present how to state and solve it using the CP solver Artelys Kalis.
# ## Description of the problem
# ### Parameters
#
# A problem instance $P = (M,T,J)$ consists of:
#
# - A set $M$ of machines
# - A set $T$ of tasks
# - A set $J$ of jobs.
#
# Each job $i \in J$ is composed of a subset $T_i$ of $\tau_i$ tasks. Hence a job can be represented as a sequence $\{task(i,1), task(i,2),...,task(i,\tau_i)\}$.
#
# Each $task(i,j)$ is associated with a machine $m_{ij} \in M$, and has a duration $d_{ij}$.
#
#
# ### Variables
#
# $t_{ij}$ = starting time of $task(i,j)$, $\forall i \in J, \forall j \in T_i$.
#
# ### Objective
#
# We seek a schedule that minimizes the makespan, that is the total duration between the start of the first task to be processed across all machines, and the end of the last one.
#
# The makespan, often called $C_{max}$, can be defined as the maximum completion time (starting time plus duration), across all tasks:
#
# $$C_{max} = \max\limits_{i,j}\ (t_{ij}+d_{ij})$$
#
# ### Constraints
#
# Starting times are non negative and we set a time horizon $MaxTime$. For instance, this horizon could represent one working day.
#
# $t_{i,j} \geq 0, \quad \forall i \in J, \forall j \in T_i $
#
# $t_{i,j} + d_{i,j} \leq MaxTime, \quad \forall i \in J, \forall j \in T_i$
#
#
# Given two consecutive tasks in the same job, the first one must be completed before the next one can start. Those are called *conjunctive constraints*.
#
# $t_{i,j} + d_{i,j} \leq t_{i,j+1}, \quad \forall i \in J, \forall j \in T_i$
#
#
# A machine can process only one task at a time. As a result, if $task(i,j)$ and $task(k,l)$ are associated with the same machine ($m_{ij}$ = $m_{kl}$), then one of them must be completed before the machine is available again to process the other one. Those are called *disjunctive constraints*:
#
# $t_{i,j} + d_{i,j} \leq t_{k,l} \quad$ or $\quad t_{k,l} + d_{k,l} \leq t_{i,j} \quad \forall i,k \in J, \forall j \in T_i, \forall l \in T_k$ such that $m_{ij} = m_{kl}$
# ## Implementation using the Python API
# ### Model data
# Let us start by taking a look at the data.
# In[1]:
file = open('jobshop33.dat', 'r')
data = file.read()
print(data)
# The first line represents the number of jobs and the number of machines. Here 3 jobs are to be processed on 3 machines.
#
# Each following line represents a job as a sequence of tasks. Each task is represented by its corresponding pair of machine and duration $(m_{ij},d_{ij})$. Here the first task of the first job must be processed on machine 2, and its completion lasts 1 unit of time. The second task must be processed on machine 0, for 3 units of time, and so on.
# Then we parse this data:
# In[2]:
# Parse data
lines = data.split('\n')
# Remove unnecessary comments
for line in lines:
if line.startswith("#"):
lines.remove(line)
# Parse first line with machines nb and jobs nb
firstLine = lines[0].split()
nb_jobs = int(firstLine[0])
nb_machines = int(firstLine[1])
# Parse each job
# For each job, parse each pair of numbers (m, d) where m is the machine id the task must be processed on
# and d is the duration of the task
machine_used = [[]]*nb_jobs
task_duration = [[]]*nb_jobs
for j in range(nb_jobs):
line = lines[j+1]
parts = line.split()
nb_tasks = len(parts)/2
machine_used[j] = [0]*nb_tasks
task_duration[j] = [0]*nb_tasks
for t in range(nb_tasks):
machine_used[j][t] = int(parts[2*t])
task_duration[j][t] = int(parts[2*t+1])
# In each line, odd indices correspond to machines, and even indices to processing times. We store this information in nested lists `machine_used` and `task_duration`.
#
# Here are the machines to use and task durations for each job:
# In[3]:
for j in range(nb_jobs):
print("Job " + str(j) + ":")
print("\t m " + str(machine_used[j]))
print("\t d " + str(task_duration[j]))
# ### The Kalis Environment
# In[4]:
import sys, os
from kalis import *
# #### The KSession Class
#
# In Artelys Kalis, problem statement and solving are carried out inside a `KSession` object. The first thing to do is to create it.
# In[5]:
# Creation of the Kalis session
session = KSession()
# #### Building the model
#
# Our problem comprises *variables*, *constraints*, and might have *solutions* after search. In Artelys Kalis `KProblem` objects hold the modeling entities and solution objects.
# In[6]:
# Creation of the problem in this session
problem = KProblem(session, "JobShop");
# We initialize a `KProblem` object variable called `problem`. "JobShop" is the internal name of the problem.
#
# Note how the first parameter set our `KProblem` into our `KSession`. Everytime we will initialize a Kalis object, the syntax will be similar.
#
# Since scheduling problems are often addressed with Constraint Programming, Kalis implements user-friendly classes specifically designed to facilitate their statement. Tasks and resources (machines in our example), are gathered into a schedule.
# In[7]:
s = KSchedule(problem, "JobShop Schedule", 0, 10000);
# Here we define a `KSchedule` object called `s` into our `Kproblem`. The third and fourth parameters correspond to the time window involved in the first constraint. Our timeline starts at 0 and our horizon is 10000 units of time. "JobShop Schedule" is the internal name of `s`.
# We move on and define our machines and tasks into this schedule `s`. We use `KUnaryResource` objects, which represent resources that can process at most one task at a time.
# In[8]:
# Create resource list
machine = [KUnaryResource(s, "M%i" % m) for m in range(nb_machines)]
# For the tasks, we use the constructor of `KTask`.
# In[9]:
# Create task list
task = {
(j, t): KTask(s, "J%iT%i" % (j,t), 0, s.getTimeMax(), task_duration[j][t], task_duration[j][t])
for j in range(nb_jobs)
for t in range(len(task_duration[j]))
}
# Our tasks are set to last between 0 unit of time and `s.getTimeMax()`, the horizon of our schedule, which we set at 10000 above. Last parameters state the minimum and maximum durations of our tasks. In our example each task has a specific, inflexible duration. Hence both parameters are set at `task_duration[j][t]`.
#
# Then we use the `requires()` method of the class `KTask` in order to associate each task to its corresponding machine.
# In[10]:
# Post resource usage for each task
for j in range(nb_jobs):
for t in range(len(machine_used[j])):
task[j, t].requires(machine[machine_used[j][t]],1)
# This method takes two input parameters : the resource to associate, and the amount of resource capacity our task requires. For the Job-Shop problem, all machines are unary resources, *e.g.* have a capacity of 1, and each task requires this unary capacity.
#
# Finally we specify that tasks are ordered for a given job using the `startsAfter()` method.
# In[11]:
# Post precedence constraints between the tasks of a same job
for j in range(nb_jobs):
for t in range(1, len(task_duration[j])):
task[j, t].startsAfter(task[j, t-1])
# And that's it! We do not need to explictly state disjunctive constraints because we implemented resources as `KUnaryResources` and, consequently, our machines will naturally process only one task at a time (Kalis implements these constraints internally). Also note that the variables are implicitly declared in the `KSchedule`, which will assign a starting time to each `KTask`.
# ### Solving the problem
# Once the problem has been fully built, we call the `optimize()` method for Kalis to solve the problem. For a scheduling problem, this method minimizes the makespan by default.
# In[12]:
s.optimize()
# We might also want to optimize another objective than the makespan. If so, before calling `optimize()`, we would use the `setObjective()` method to define an objective function. Then, the `setSense()` method to specify the sense of optimization. If we wish to minimize : `setSense(KProblem::Minimize);`.
# As mentioned earlier, solutions are comprised in our`KProblem`, which holds the modeling entities. We combine the `getProblem()` and `getSolution()` method on our `KSchedule` object to obtain the solution.
# In[13]:
sol = s.getProblem().getSolution()
# `sol` is a `KSolution` object. It contains the value of each decision variable and information about the resolution such as computation time or number of nodes explored in the search-tree.
# The code below displays the solution:
# In[14]:
# Get schedule per machine
machine_sol = [[]]*nb_machines
for m in range(nb_machines):
machine_sol[m] = []
for j in range(nb_jobs):
for t in range(len(task_duration[j])):
if machine_used[j][t] == m:
machine_sol[m].append(('Task(' + str(j) + ',' + str(t) + ')',[sol.getValue(task[j,t].getStartDateVar()),
sol.getValue(task[j,t].getEndDateVar())]))
print "Solution:"
for m in range(nb_machines):
machine_sol[m].sort(key=lambda x: x[1])
print "Machine " + str(m) + ": " + str(machine_sol[m])
print "\nMakespan: " + str(sol.getValue(s.getMakeSpan())) + "\n"
# We get the schedule of each machine: sequence of tasks and time intervals.
#
# Finally, here is the timeline for the optimal schedule, considering that one unit of time corresponds to one hour:
# In[15]:
import datetime as dt
import plotly.offline as offline
import plotly.figure_factory as ff
start_date = dt.datetime(2018, 1, 1, 8, 0, 0)
df = []
for j in range(nb_jobs):
for t in range(len(task_duration[j])):
df.append(dict(Task='Machine '+str(machine_used[j][t]),
Start= start_date+dt.timedelta(hours=sol.getValue(task[j,t].getStartDateVar())),
Finish= start_date+dt.timedelta(hours=sol.getValue(task[j,t].getEndDateVar())),
Resource='Job '+str(j)))
# Display the Gantt chart
figure = ff.create_gantt(df, title = 'Optimal Schedule', index_col='Resource', show_colorbar=True, group_tasks=True)
offline.init_notebook_mode()
offline.iplot(figure, filename='gantt-jobshop')
# We can finally free the memory by deleting the Kalis session.
# In[16]:
del session
# ## Full example on more complex instance
#
# Below you can find the entire program described in that Notebook running on a larger instance of this problem. The resulting gantt chart is the timeline obtained for 10 jobs and 10 machines!
# In[17]:
import sys, os
from kalis import *
import datetime as dt
import plotly.offline as offline
import plotly.figure_factory as ff
#Load data
file = open('abz6.dat', 'r')
data = file.read()
print(data)
# Parse data
lines = data.split('\n')
# Remove unnecessary comments
for line in lines:
if line.startswith("#"):
lines.remove(line)
# Parse first line with machines nb and jobs nb
firstLine = lines[0].split()
nb_jobs = int(firstLine[0])
nb_machines = int(firstLine[1])
# Parse each job
# For each job, parse each pair of numbers (m, d) where m is the machine id the task must be processed on
# and d is the duration of the task
machine_used = [[]]*nb_jobs
task_duration = [[]]*nb_jobs
for j in range(nb_jobs):
line = lines[j+1]
parts = line.split()
nb_tasks = len(parts)/2
machine_used[j] = [0]*nb_tasks
task_duration[j] = [0]*nb_tasks
for t in range(nb_tasks):
machine_used[j][t] = int(parts[2*t])
task_duration[j][t] = int(parts[2*t+1])
# Display parsed data
for j in range(nb_jobs):
print("Job " + str(j) + ":")
print("\t m " + str(machine_used[j]))
print("\t d " + str(task_duration[j]))
# Creation of the Kalis session
session = KSession()
# Creation of the problem in this session
problem = KProblem(session, "JobShop");
s = KSchedule(problem, "JobShop Schedule", 0, 10000);
# Create resource list
machine = [KUnaryResource(s, "M%i" % m) for m in range(nb_machines)]
# Create task list
task = {
(j, t): KTask(s, "J%iT%i" % (j,t), 0, s.getTimeMax(), task_duration[j][t], task_duration[j][t])
for j in range(nb_jobs)
for t in range(len(task_duration[j]))
}
# Post resource usage for each task
for j in range(nb_jobs):
for t in range(len(machine_used[j])):
task[j, t].requires(machine[machine_used[j][t]],1)
# Post precedence constraints between the tasks of a same job
for j in range(nb_jobs):
for t in range(1, len(task_duration[j])):
task[j, t].startsAfter(task[j, t-1])
s.optimize()
sol = s.getProblem().getSolution()
# Get schedule per machine
machine_sol = [[]]*nb_machines
for m in range(nb_machines):
machine_sol[m] = []
for j in range(nb_jobs):
for t in range(len(task_duration[j])):
if machine_used[j][t] == m:
machine_sol[m].append(('Task(' + str(j) + ',' + str(t) + ')',[sol.getValue(task[j,t].getStartDateVar()),
sol.getValue(task[j,t].getEndDateVar())]))
print "Solution:"
for m in range(nb_machines):
machine_sol[m].sort(key=lambda x: x[1])
print "Machine " + str(m) + ": " + str(machine_sol[m])
print "\nMakespan: " + str(sol.getValue(s.getMakeSpan())) + "\n"
# In[18]:
start_date = dt.datetime(2018, 1, 1, 8, 0, 0)
# Display the Gantt chart
df = []
for j in range(nb_jobs):
for t in range(len(task_duration[j])):
df.append(dict(Task='Machine '+str(machine_used[j][t]),
Start= start_date+dt.timedelta(hours=sol.getValue(task[j,t].getStartDateVar())),
Finish= start_date+dt.timedelta(hours=sol.getValue(task[j,t].getEndDateVar())),
Resource='Job '+str(j)))
figure = ff.create_gantt(df, title = 'Optimal Schedule', index_col='Resource', show_colorbar=True, group_tasks=True)
offline.init_notebook_mode()
offline.iplot(figure, filename='gantt-jobshop')
# In[19]:
del session