Paper 5: Multiserver MonotonicGittins Scheduling
Let’s return to this series. Time for my fifth paper: Optimal Multiserver Scheduling with Unknown Job Sizes in Heavy Traffic This was the third queueing paper I contributed to, and it builds upon my first queueing paper, on Muliserver SRPT. My coauthors once again were Ziv Scully, then a fellow PhD student, now a professor at Cornell, and Mor HarcholBalter, my advisor (and Ziv’s).
Setting
My previous paper on SRPTk, which I wrote a post about, dealt with a central queue model, where the scheduler could run any k jobs of its choice, and where the scheduler knows each job’s size (its service requirement) in advance. In this paper, we focus the setting where the job size distribution is known, but the actually size of a job is unknown. With respect to a specific job, the only information we have available is the job’s age, the amount of service it has received so far.
We want to design a scheduling policy which looks at those ages, looks at the job size distribution, and decide which jobs to run.
Otherwise, the setting is once again the same as in the SRPTk paper: Poisson arrivals, i.i.d. job sizes, fixed number of servers k, variable load rho, preemption allowed. We are again interested in the mean response time behavior in the rho to 1 limit.
Building blocks
In this paper, we decided to build off of three building blocks: The Gittins policy for the singleserver queue, dating back to 1980, he SOAP analysis by my coauthors, and the multiserver taggedjob approach from the SRPTk paper.
The Gittins policy is the policy which minimizes mean response time in the singleserver setting. It works as follows:

Look at all jobs in the system, and all durations for which we could run those jobs.

Using the job size distribution and the current ages of the jobs, calculate the probability that the job will finish within the duration, and the expected time for which the job will be served – the minimum of the service duration and the job size.

Rank jobs by the ratio of expected service time to probability of completion. Serve the job with lowest ratio – the least time to complete a job, in expectation.
The Gittins scheduling policy is an index policy, which means it maps job ages to a priority and serves the job of best priority. This priority can go up and down as a job increases in age, depending on the job size distribution.
To analyze the mean response time of singleserver policies such as the Gittins policy, my coauthor Ziv developed the SOAP analysis. SOAP stands for ``Schedule Ordered by Agebased Priority’’. It gives a unified analysis of mean response time for all agebased index policies in the singleserver setting, and in particular it gave the first mean response time analysis for the Gittins policy. Despite the Gittins policy having been known to achieve optimal mean response time for about 40 years, this was the first mean response time analysis.
All of that prior work was in the singleserver setting. To prove a result in the multiserver setting, we decided to use the multiserver taggedjob analysis from the SRPTk paper, which essentially transfered SOAPtype results from the singleserver settting to the multiserver setting.
Challenge: Monotonicity
Unfortunately, as we explored the multiserver taggedjob analysis in this setting, we encountered an obstacle: The method only produces tight results for monotonic index policies.
There are two kinds of monotonic index policies: Monotonically increasing, and monotonically decreasing. Monotonically increasing means that as a job’s age increases, its priority only stays constant or gets worse, never better. On the flipside, a monotonically decreasing index policy is one like SRPT where a job’s priority only gets better, never worse. Nonmonotonic policies are ones where priority goes both up and down.
In the paper, because we focus on the unknown jobsize setting, the index policies of interest are monotonically increasing policies, and nonmonotonic policies.
In Theorem 5.1 of the paper, we use the mutliserver taggedjob approach to give a tight analysis of mean response time for monontonically increasing policies, following the SRPTk approach quite closely. In particular, we show that any monotonically increasing scheduling policy in the multiserver setting achieves mean response time not much larger than the same scheduling policy in the resourcepooled singleserver setting, combining all servers together into one giant server that runs a single job at k times the speed.
In Appendix A, we show that the same approach cannot give a tight mean response time analysis for nonmonotonic policies: The resulting bound on mean response time will not be particularly tight, as the method makes too many worstcase bounds along the way. We tried several times to replace those worstcase steps with stochastic analyses, but it never worked out.
So, in the end, we decided to restrict our attention to monotonic scheduling policies, even if that meant we wouldn’t be able to use the Gittins scheduling policy.
Policy: Monotonic Gittins
Instead, we selected the Monotonic Gittins (MGittins) scheduling policy. This policy starts with the Gittins policy, but then adds the additional caveat that a job’s priority is never allowed to improve (decrease). Formally, a job’s priority is the maximum of all priorities up to that age under the Gittins policy.
Thanks to our work with the multiserver taggedjob method, we knew that multiserver MGittins performed nearly as well as singleserver MGittins. Now, it remained to analyze singleserver MGittins in comparison to true, optimal singleserver Gittins.
This part of the paper was less my focus  I mostly focused on the multiserver taggedjob method. Our analysis of MGittins lead through the SERPT and MSERPT policies: Shortest Expected Remaining Processing Time and the monotonic variant thereof, which were essentially simpler proxies for the Gittins and MGittins policies in the singleserver setting.
We couldn’t analyze these policies for fully general job size distributions, just for certain important classes of distributions, with lots of effort needed for each class of distribution.
Results
Our main result was Theorem 3.1, where we proved that in several important classes of job size distributions, the Monotonic Gittins scheduling policy achieves asymptotically optimal mean response time in the multiserver central queue setting.
Retrospective and future directions
This work was a little underwhelming. We weren’t able to analyze the proper multiserver Gittins policy, as we would hae liked, and we weren’t able to analyze fully general job size distributions, but had to settle for classes with more structure. And finally, this paper focused only on the setting where jobs are undifferentiated except for their ages. It couldn’t handle different classes of jobs with their own size distributions, for instance.
Nonetheless, this paper represents about as far as one could get with the techniques that we’d developed at the time: multiserver taggedjob + SOAP. For futher results, we’d need to develop new techniques. And in our next paper, on multiserver Gittins, we did just that.