Bug #11295

Test jobs sometimes get scheduled on a busy isotester while there are available ones

Added by bertagaz over 1 year ago. Updated 28 days ago.

Status: Confirmed
Priority: Normal
Assignee:
Category: Continuous Integration
Target version:
Start date: 03/31/2016
Due date:
% Done: 0%
QA Check:
Feature Branch:
Type of work: Research
Blueprint:
Starter: No
Affected tool:

Description

While investigating #10601, we discovered that sometimes, after a reboot_job has completed, Jenkins does not start the test job that triggered the reboot on that isotester; instead, it assigns the same isotester to another test job. The first test job then waits for hours until the other one is over. See #10601#note-5 for details.

We first need to see whether this still happens, and fix it if it does. Note that we now have more isotesters and a much shorter test time, which might make this bug harder to detect if it still exists.


Related issues

Related to Tails - Bug #10215: Suboptimal advance booking of Jenkins slaves for testing ISOs (Resolved, 09/17/2015)

History

#1 Updated by intrigeri over 1 year ago

  • Description updated (diff)

I suggest we first set up a very simple test case to confirm how job priority actually behaves, and whether our current configuration is based on a correct understanding of how the priority sorter plugin works (#10601#note-5 has more precise pointers about where this doubt of mine comes from).

Rationale: even if the bug isn't obvious in our current setup for some reason, I'd rather not keep a configuration that was designed on erroneous assumptions. If that's the case, it will be confusing the next time I have to debug weird race conditions.

#2 Updated by bertagaz over 1 year ago

  • Target version changed from Tails_2.4 to Tails_2.5

#3 Updated by bertagaz over 1 year ago

  • Target version changed from Tails_2.5 to Tails_2.6

Probably won't have time to work on it before that.

#4 Updated by intrigeri over 1 year ago

  • Subject changed from Test jobs sometimes get their isotester stolen by another one. to Test jobs sometimes get their isotester stolen by another one

I've just seen something similar happen again: https://jenkins.tails.boum.org/view/Tails_ISO/job/test_Tails_ISO_test-11588-usb-on-jenkins-10733/15/ is "(pending—Waiting for next available executor on isotester2) UPSTREAMJOB_BUILD_NUMBER=15" while https://jenkins.tails.boum.org/view/Tails_ISO/job/test_Tails_ISO_test-11588-usb-on-jenkins-10733/14/ is running on isotester2. Five other isotesters are available, so it's a shame that job 15 was scheduled on isotester2 as well and now has to wait for 3 hours before it's run.

Job 14 was run on Jul 31, 2016 9:18:49 PM by https://jenkins.tails.boum.org/job/wrap_test_Tails_ISO_test-11588-usb-on-jenkins-10733/14/, which also ran https://jenkins.tails.boum.org/job/reboot_job/8542/ with parameter RESTART_NODE=isotester2. The wrap job had NODE_NAME=isotester2.

Job 15 was run on Jul 31, 2016 9:19:19 PM by https://jenkins.tails.boum.org/job/wrap_test_Tails_ISO_test-11588-usb-on-jenkins-10733/15/, which also ran https://jenkins.tails.boum.org/job/reboot_job/8543/ with parameter RESTART_NODE=isotester2. The wrap job had NODE_NAME=isotester2.

As said in the ticket description, I've already investigated such a problem 8 months ago (#10601#note-5), so the next debugging steps should be easy, if done before the corresponding system logs and Jenkins artifacts expire.

I believe this clearly answers the "We first need to see if this still happens or not" part of this ticket: something is wrong with our job priority setup.
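
To make it easier to spot this situation again, the condition described in this comment (a build queued for a busy isotester while other isotesters are idle) could be checked mechanically. The following is only a rough sketch: the `QueuedBuild` type, the `find_misplaced_builds` helper and the hand-written snapshot are hypothetical, nothing here talks to Jenkins, and the names of the five idle isotesters are assumed for illustration (only isotester2 is named above).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class QueuedBuild:
    """A pending build and the node it is waiting for (hypothetical type)."""
    job: str
    number: int
    waiting_for: str


def find_misplaced_builds(
    queue: List[QueuedBuild],
    running: Dict[str, Optional[str]],
) -> List[Tuple[QueuedBuild, List[str]]]:
    """Return queued builds pinned to a busy node while other nodes are idle.

    `running` maps a node name to a description of the build it is
    running, or to None when the node is idle.
    """
    idle = sorted(node for node, build in running.items() if build is None)
    misplaced = []
    for item in queue:
        node_is_busy = running.get(item.waiting_for) is not None
        if node_is_busy and idle:
            misplaced.append((item, idle))
    return misplaced


# Snapshot modelled on the observation above. The idle node names are
# assumptions; the comment only says that five other isotesters were free.
job = "test_Tails_ISO_test-11588-usb-on-jenkins-10733"
running = {
    "isotester1": None,
    "isotester2": f"{job} #14",
    "isotester3": None,
    "isotester4": None,
    "isotester5": None,
    "isotester6": None,
}
queue = [QueuedBuild(job=job, number=15, waiting_for="isotester2")]

for build, idle_nodes in find_misplaced_builds(queue, running):
    print(
        f"{build.job} #{build.number} waits for {build.waiting_for} "
        f"although these nodes are idle: {', '.join(idle_nodes)}"
    )
```

Feeding such a check with real queue and node data would be a separate step, and it would only detect the symptom, not explain why the priority setup produces it.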

#5 Updated by intrigeri over 1 year ago

  • Subject changed from Test jobs sometimes get their isotester stolen by another one to Test jobs sometimes get scheduled on a busy isotester while there are available ones

The same thing is happening as we speak, between https://jenkins.tails.boum.org/view/Tails_ISO/job/test_Tails_ISO_feature-from-intrigeri-for-2.6/7/ and job 9 of the same project: here again, 2 isotesters are free, but job 9 is waiting for isotester1 to become available while job 7 is running there.

#6 Updated by intrigeri over 1 year ago

  • Related to Bug #10215: Suboptimal advance booking of Jenkins slaves for testing ISOs added

#7 Updated by anonym about 1 year ago

  • Target version changed from Tails_2.6 to Tails_2.7

#8 Updated by bertagaz about 1 year ago

  • Target version changed from Tails_2.7 to Tails_2.9.1

#9 Updated by anonym 12 months ago

  • Target version changed from Tails_2.9.1 to Tails 2.10

#10 Updated by intrigeri 12 months ago

  • Target version changed from Tails 2.10 to Tails_2.11

#11 Updated by bertagaz 9 months ago

  • Target version changed from Tails_2.11 to Tails_2.12

#12 Updated by bertagaz 9 months ago

  • Target version changed from Tails_2.12 to Tails_3.0

#13 Updated by bertagaz 8 months ago

  • Target version changed from Tails_3.0 to Tails_3.1

#14 Updated by bertagaz 7 months ago

  • Target version changed from Tails_3.1 to Tails_3.2

#15 Updated by bertagaz 4 months ago

  • Target version changed from Tails_3.2 to Tails_3.3

#16 Updated by bertagaz about 2 months ago

  • Target version changed from Tails_3.3 to Tails_3.5

Realistically reschedule for 3.4.

#17 Updated by bertagaz 28 days ago

  • Target version changed from Tails_3.5 to Tails_3.6
