Technical Details | CERIT Scientific Cloud

FairShare is a score that determines what priority you have in the scheduling queue for your jobs.

The more jobs you run, the lower your score becomes, temporarily. A number of factors are used to determine this score.

Fairshare

Fairshare is a scheduling policy that distributes computing resources fairly among users and their jobs. The fairshare calculation is based on resource usage over the last 30 days. The weight of past usage decreases with time, so jobs run recently count more than jobs run several days ago.
Fairshare also takes into account publications with an acknowledgement to MetaCentrum (only the publications you have entered into the Perun system). A high number of publications results in a higher fairshare for a user and a higher priority for their jobs.
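
The time-weighted usage described above can be sketched as follows. This is only an illustration: the text states the 30-day window and that older usage counts less, but not the actual decay formula, so the exponential decay and the `HALF_LIFE_DAYS` value are assumptions, and the function name `decayed_usage` is hypothetical. The publication bonus is not modeled.

```python
from datetime import date

WINDOW_DAYS = 30        # from the text: only the last 30 days are considered
HALF_LIFE_DAYS = 7.0    # assumption: the real decay rate is not given in the text


def decayed_usage(records, today, half_life=HALF_LIFE_DAYS, window=WINDOW_DAYS):
    """Sum CPU-hours, weighting each record by 0.5 ** (age_in_days / half_life).

    `records` is an iterable of (day, cpu_hours) pairs. Records older than
    `window` days (or dated in the future) contribute nothing.
    """
    total = 0.0
    for day, cpu_hours in records:
        age = (today - day).days
        if 0 <= age <= window:
            total += cpu_hours * 0.5 ** (age / half_life)
    return total


today = date(2024, 3, 31)
recent = [(date(2024, 3, 30), 100.0)]   # 100 CPU-hours used yesterday
older = [(date(2024, 3, 10), 100.0)]    # 100 CPU-hours used 21 days ago

print(decayed_usage(recent, today))     # counts almost fully
print(decayed_usage(older, today))      # counts far less
```

Under these assumptions the same 100 CPU-hours lower a user's priority much more when they were consumed yesterday than when they were consumed three weeks ago.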

The job order shown at https://metavo.metacentrum.cz/pbsmon2/queues/jobsQueued is the order in which waiting jobs will start running.

To find the start time that the batch system scheduler has planned for a waiting job, use one of the following ways:

run the qstat command (qstat -f Job ID) and look up the items estimated.start_time and estimated.exec_vnode
go to http://metavo.metacentrum.cz/en/state/index.html, choose the options "Personal view" and "List of jobs". Then click on the job you are interested in and look up the items planned_start and planned_nodes.
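
For scripted checks, the two items from the qstat output can be extracted with a short parser. The sketch below assumes the usual "key = value" layout of `qstat -f` output with tab-indented continuation lines for wrapped values. The `SAMPLE` text, its job ID, and its node names are invented for illustration, and `parse_qstat_full` is a hypothetical helper, not part of any batch system tool.

```python
def parse_qstat_full(text):
    """Parse `qstat -f`-style 'key = value' lines into a dict.

    Lines starting with a tab are treated as the continuation of the
    previous value (long values are wrapped this way).
    """
    attrs = {}
    last_key = None
    for line in text.splitlines():
        if " = " in line and not line.startswith("\t"):
            key, value = line.strip().split(" = ", 1)
            attrs[key] = value
            last_key = key
        elif line.startswith("\t") and last_key is not None:
            attrs[last_key] += line.strip()
    return attrs


# Illustrative output only; real output contains many more attributes.
SAMPLE = """Job Id: 12345.example-server
    Job_Name = demo
    job_state = Q
    estimated.exec_vnode = (node1:ncpus=4)
    estimated.start_time = Mon Mar  4 10:15:00 2024
"""

attrs = parse_qstat_full(SAMPLE)
print(attrs.get("estimated.start_time"))
print(attrs.get("estimated.exec_vnode"))
```

In practice the text would come from running the command itself, for example through `subprocess.run(["qstat", "-f", job_id], capture_output=True, text=True).stdout`. Both items are missing for jobs the scheduler has not planned yet, which is why the sketch uses `attrs.get`.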

Waiting jobs for which the batch system scheduler has already allocated worker nodes carry a note in the comment line explaining why they are waiting, e.g. “Not Running: Insufficient amount of resource: mem”.

Specifying jobs' maximum run-time

Effective scheduling requires knowledge of the scheduled jobs' characteristics that is as accurate as possible; one key piece of information is the expected maximum duration (run-time) of each job. Although it is often impossible to estimate accurately how long a job will take to complete its task, it is crucial to provide a reasonable upper estimate of its duration. On the one hand, this estimate should (ideally) not be shorter than the job's real duration (to prevent the scheduling system from killing the job once it exhausts its reserved time frame); on the other hand, it makes no sense to set the estimate too high (since, in general, long-running jobs wait longer for their startup).

Thus, unlike in the MetaCentrum infrastructure, the maximum run-time of a job is not specified implicitly by submitting the job into one of a set of pre-defined, time-limited queues (short, normal, long, etc.). Instead, jobs are submitted into a single (default) queue and their maximum run-time is specified explicitly using the walltime option (see details at Jobs'/Nodes' property specifications). Based on this specification, the jobs are automatically moved from the default queue to the most suitable time-limited internal Torque queue, where they wait for their startup (see the picture below).

Compared to a system with a simple set of pre-defined, time-limited queues, this approach should provide many key benefits, such as:

flexible specification of the expected jobs' maximum run-time ⇒ easier/faster access to computing resources (shorter jobs are preferred by the scheduling system),
easier internal scheduling of jobs ⇒ easier/faster access to computing resources,
possibilities of fine-tuning the queues' internal logic (based on users' feedback).

Figure: queues
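
The move from the default queue into a time-limited internal queue can be sketched as a lookup by requested walltime. The queue names and limits in `INTERNAL_QUEUES` are hypothetical (the real internal queues are not listed in this text), and `route` only illustrates the idea of picking the shortest queue that still covers the request; the actual routing rules may differ.

```python
# Hypothetical internal queues: (name, maximum walltime in seconds),
# ordered from the shortest limit to the longest.
INTERNAL_QUEUES = [
    ("q_2h", 2 * 3600),
    ("q_1d", 24 * 3600),
    ("q_1w", 7 * 24 * 3600),
]


def parse_walltime(spec):
    """Convert a walltime given as 'HH:MM:SS' to seconds."""
    hours, minutes, seconds = (int(part) for part in spec.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def route(walltime_spec, queues=INTERNAL_QUEUES):
    """Return the name of the shortest queue whose limit covers the request."""
    requested = parse_walltime(walltime_spec)
    for name, limit in queues:
        if requested <= limit:
            return name
    raise ValueError(f"walltime {walltime_spec} exceeds every queue limit")


print(route("01:30:00"))   # lands in the 2-hour queue
print(route("30:00:00"))   # too long for the 1-day queue, lands in the 1-week queue
```

A job requesting 1.5 hours ends up among other short jobs, while an inflated request for the same job would place it among long-running ones, where startup generally takes longer.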

Hence, we kindly ask you to provide run-time estimates that are as accurate as possible. Do not always use a single "safe" (= long enough) estimate for all your jobs: doing so makes some effective scheduling optimizations impossible and thus generally worsens the system response time for you, the users. Many studies evaluating scheduling optimizations on real systems have shown that, with more accurate job duration estimates, system throughput and utilization can be increased by 30 % on average (using so-called backfilling, i.e., filling the schedule gaps with short jobs).

Example: The importance of the backfilling process mentioned above can be demonstrated by the following situation: when a job does not use up its run-time window (a common occurrence) and, at the same time, the next job in the plan cannot be started yet (e.g., it is waiting for other resources to be released), a suitable short job can be found to fill the gap. Such short jobs are started (and completed) much earlier than originally planned, while the guaranteed startup times of long-running jobs are not affected in any way. The key to such efficient and fair usage of computing resources is, as mentioned above, a sufficient diversity of job duration estimates.
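
The situation in the example can be sketched with a toy gap-filling routine. It is a deliberately simplified model (a single resource, jobs considered in queue order, hypothetical job names and estimates), not the scheduler's actual algorithm; it only shows why a job with an inflated "safe" estimate cannot benefit from a gap.

```python
def backfill(gap_seconds, waiting_jobs):
    """Pick waiting jobs, in queue order, whose estimates fit into the gap.

    `waiting_jobs` is a list of (name, estimated_run_time_in_seconds).
    Chosen jobs run one after another, so each one reduces the gap left.
    """
    remaining = gap_seconds
    chosen = []
    for name, estimate in waiting_jobs:
        if estimate <= remaining:
            chosen.append(name)
            remaining -= estimate
    return chosen


# A job finished early and left a 2-hour gap before the next planned job.
gap = 2 * 3600

waiting = [
    ("long_sim", 48 * 3600),     # genuinely long job
    ("safe_guess", 24 * 3600),   # short job submitted with a "safe" 24-hour estimate
    ("short_a", 30 * 60),        # accurately estimated short jobs
    ("short_b", 60 * 60),
]

print(backfill(gap, waiting))
```

Here only `short_a` and `short_b` are started in the gap. `safe_guess` stays in the queue even if it would really finish in minutes, because the scheduler can rely only on the estimate it was given.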

After a pilot deployment of the CERIT-SC infrastructure, the described logic of internal queues management may be revised (based on its continuous evaluation and user response).