Feature #11355

Feature #9614: Automated builds: complete phase two

Re-enable Jenkins notifications on ISO build/test failure

Added by intrigeri about 1 year ago. Updated 15 days ago.

Status:In ProgressStart date:04/15/2016
Priority:ElevatedDue date:
Assignee:bertagaz% Done:

60%

Category:Continuous Integration
Target version:Tails_3.1
QA Check:Dev Needed Blueprint:
Feature Branch:puppet-tails:feature/11355-reenable-jenkins-notifications;jenkins-jobs:feature/11355-reenable-jenkins-notifications Easy:
Type of work:Code Affected tool:

Description

Re-enable them one by one, carefully evaluating the outcome at each step, to make sure we continuously deliver a good enough UX to contributors.


Related issues

Related to Tails - Feature #9719: Configure Jenkins notifications to Redmine In Progress 07/10/2015
Related to Tails - Bug #10288: Fix newly identified issues to make our test suite more robust and faster In Progress 02/26/2015
Related to Tails - Bug #12647: Deal with June 2107 false positive CI notifications Confirmed 06/07/2017
Related to Tails - Bug #12648: Deal with July 2017 false positive CI notifications Confirmed 06/07/2017
Related to Tails - Bug #12649: Deal with August 2017 false positive CI notifications Confirmed 06/07/2017
Related to Tails - Bug #12650: Deal with September 2017 false positive CI notifications Confirmed 06/07/2017
Blocked by Tails - Bug #11354: Disable Jenkins notifications on ISO build/test failure Resolved 04/15/2016

History

#1 Updated by intrigeri about 1 year ago

Next step is probably to clearly identify the blockers.

#2 Updated by intrigeri about 1 year ago

  • Blocked by Bug #11354: Disable Jenkins notifications on ISO build/test failure added

#3 Updated by intrigeri about 1 year ago

A first change we decided to make is to only send Jenkins ISO build/test failure notifications for Ready for QA tickets. This is a pre-requisite to re-enabling such notifications (except on base branches, of course). Note that #9719 has technical info that may be relevant for the implementation.

#4 Updated by intrigeri about 1 year ago

  • Blocked by deleted (Bug #11354: Disable Jenkins notifications on ISO build/test failure)

#5 Updated by intrigeri about 1 year ago

  • Parent task set to #9614

#6 Updated by intrigeri about 1 year ago

  • Blocked by Bug #11354: Disable Jenkins notifications on ISO build/test failure added

#7 Updated by intrigeri about 1 year ago

  • Related to Feature #9719: Configure Jenkins notifications to Redmine added

#8 Updated by bertagaz about 1 year ago

  • Status changed from Confirmed to In Progress

As a first step, I'll do some stats on the last months build run to see if we can at least re-enable notifications for them.

#9 Updated by intrigeri 12 months ago

What's the status / updated timeline on this front?

#10 Updated by bertagaz 11 months ago

  • Target version changed from Tails_2.5 to Tails_2.6

#11 Updated by intrigeri 10 months ago

  • Related to Bug #10288: Fix newly identified issues to make our test suite more robust and faster added

#12 Updated by anonym 9 months ago

  • Target version changed from Tails_2.6 to Tails_2.7

#13 Updated by bertagaz 9 months ago

  • Target version changed from Tails_2.7 to Tails_2.9.1

#14 Updated by anonym 6 months ago

  • Target version changed from Tails_2.9.1 to Tails 2.10

#15 Updated by intrigeri 6 months ago

  • Target version changed from Tails 2.10 to Tails_2.11

#16 Updated by bertagaz 4 months ago

  • Target version changed from Tails_2.11 to Tails_2.12

#17 Updated by bertagaz 4 months ago

  • Priority changed from Normal to Elevated

#18 Updated by intrigeri 3 months ago

The current plan for the 1st iteration is #11355#note-3, and having a global kill switch for this feature so that any sysadmin can disable it. The kill switch must live outside of the code (e.g. somewhere in /etc) so we don't keep modifying a script merely to configure it (we've seen how reverts-of-reverts-of-reverts have lead us to problematic situations in the past).

#19 Updated by intrigeri 2 months ago

  • Target version changed from Tails_2.12 to Tails_3.0~rc1

Please ensure the new target version is more realistic than 2.12, after having looked at your 3.0~rc1 and 3.0 tickets. Also, IIRC you mentioned you already have had some WIP locally for a while, that you might want to push to a branch somewhere, so it's not stored on your hard drive only, and others can help move this forward :)

#20 Updated by intrigeri about 1 month ago

  • Target version changed from Tails_3.0~rc1 to Tails_3.0

intrigeri wrote:

Please ensure the new target version is more realistic than 2.12, after having looked at your 3.0~rc1 and 3.0 tickets. Also, IIRC you mentioned you already have had some WIP locally for a while, that you might want to push to a branch somewhere, so it's not stored on your hard drive only, and others can help move this forward :)

Ping :)

#21 Updated by bertagaz about 1 month ago

  • % Done changed from 0 to 20
  • Feature Branch set to puppet-tails:feature/11355-reenable-jenkins-notifications;jenkins-jobs:feature/11355-reenable-jenkins-notifications

intrigeri wrote:

intrigeri wrote:

Please ensure the new target version is more realistic than 2.12, after having looked at your 3.0~rc1 and 3.0 tickets. Also, IIRC you mentioned you already have had some WIP locally for a while, that you might want to push to a branch somewhere, so it's not stored on your hard drive only, and others can help move this forward :)

Ping :)

Right, I've just reworked a bit what I had and pushed it. It's for now completely untested but close to what it will be. If you want to have a first look at the design and code, you're welcome.

#22 Updated by intrigeri about 1 month ago

Right, I've just reworked a bit what I had and pushed it. It's for now completely untested but close to what it will be. If you want to have a first look at the design and code, you're welcome.

Thanks, I'll try! :)

#23 Updated by intrigeri about 1 month ago

Very quick initial look:

  • I like the design choices, so I'll only comment about the implementation here. Congrats!
  • /usr/local/bin/setup_notifications should have a less generic name
  • I'm not happy with the name of command line arguments to setup_notifications; I can certainly improve them myself later on during this process, but I would appreciate if you first sat down, looked at the use cases, thought about how confusing they can be (e.g. disable_build does not disable any "build") and improved them a bit; hint: the help string for each argument are pretty good and provide good inspiration for the corresponding option name :)
  • /etc/redmine/apikey should live in /etc/jenkins IMO: it is surely about Redmine, but what it configures is Jenkins stuff, not Redmine
  • I'd vastly prefer if setup_notifications output'ed its results (as a pure function!) instead of reporting them as a side effect by writing to $WORKSPACE/tmp/recipient
  • I'm confused when looking at the diff of files/jenkins/slaves/isobuilders/collect_build_environment: did you mean NOTIFY_BUILD_TO?
  • Typo in recipent
  • Please use named arguments for the setup_notifications function instead of positional ones (if Python lets you do that, I don't remember).
  • ticket_state variable name is confusing: it's used for something that is not Redmine ticket "Status".
  • Please extract the checking for RfQA in a predicate function.

#24 Updated by bertagaz about 1 month ago

  • Assignee changed from bertagaz to intrigeri
  • % Done changed from 20 to 50
  • QA Check set to Ready for QA

intrigeri wrote:

Very quick initial look:

  • I like the design choices, so I'll only comment about the implementation here. Congrats!

Thanks! Most difficult was to remember how it was done in the several related scripts and repos. :)

  • /usr/local/bin/setup_notifications should have a less generic name

Renamed in 4411e80e7cc40726e708a09af74a6d4427eaab06

  • I'm not happy with the name of command line arguments to setup_notifications; I can certainly improve them myself later on during this process, but I would appreciate if you first sat down, looked at the use cases, thought about how confusing they can be (e.g. disable_build does not disable any "build") and improved them a bit; hint: the help string for each argument are pretty good and provide good inspiration for the corresponding option name :)

I wasn't too happy about that too. Tentatively done in 9859ab0093a0e4906fcd452e8fb6225a073811d8.

  • /etc/redmine/apikey should live in /etc/jenkins IMO: it is surely about Redmine, but what it configures is Jenkins stuff, not Redmine

Makes sense. Done in bd7efbb5e9e6636607514807d18ea68b2b9a4ce3

  • I'd vastly prefer if setup_notifications output'ed its results (as a pure function!) instead of reporting them as a side effect by writing to $WORKSPACE/tmp/recipient

Done in 6cb019f677cc7a0924c4046b80be888eaf083a7f

  • I'm confused when looking at the diff of files/jenkins/slaves/isobuilders/collect_build_environment: did you mean NOTIFY_BUILD_TO?

Nop, this script collects the environment variable used during the build to pass and use them in the test job environment. So we don't care about the build notification recipient here, we just need to pass the test one.

  • Typo in recipent

Corrected that yesterday already,in 7e15570ca653952f17ec17ec37912f025aa5dc3c

  • Please use named arguments for the setup_notifications function instead of positional ones (if Python lets you do that, I don't remember).

They were already named ones, but in Python they can be used as positional ones too. I've still changed that accordingly in 9731bebd335e7010ce3828f81da25f3e746ed9e8

  • ticket_state variable name is confusing: it's used for something that is not Redmine ticket "Status".
  • Please extract the checking for RfQA in a predicate function.

Done both in fee1b2c675a0934036130ff42d7f8f1070e70e34.

From local tests I made it works quite well, so I think it can be reviewed now.

#25 Updated by intrigeri about 1 month ago

  • Assignee changed from intrigeri to bertagaz
  • QA Check changed from Ready for QA to Dev Needed

Note: I've pushed some (untested) improvements.

Thanks! Most difficult was to remember how it was done in the several related scripts and repos. :)

Design doc, anyone? (Sorry if I'm beating a dead horse.)

  • I'm not happy with the name of command line arguments to setup_notifications; I can certainly improve them myself later on during this process, but I would appreciate if you first sat down, looked at the use cases, thought about how confusing they can be (e.g. disable_build does not disable any "build") and improved them a bit; hint: the help string for each argument are pretty good and provide good inspiration for the corresponding option name :)

I wasn't too happy about that too. Tentatively done in 9859ab0093a0e4906fcd452e8fb6225a073811d8.

OK, great! Improved a tiny bit on top.

  • /etc/redmine/apikey should live in /etc/jenkins IMO: it is surely about Redmine, but what it configures is Jenkins stuff, not Redmine

Makes sense. Done in bd7efbb5e9e6636607514807d18ea68b2b9a4ce3

Cool. Then please adjust puppet:///modules/tails_secrets_jenkins/jenkins/slaves/isobuilders/redmine/redmine_apikey and require => File['/etc/redmine/apikey'] accordingly (that Puppet code can't possibly work; please test stuff before sending for QA!).

  • I'd vastly prefer if setup_notifications output'ed its results (as a pure function!) instead of reporting them as a side effect by writing to $WORKSPACE/tmp/recipient

Done in 6cb019f677cc7a0924c4046b80be888eaf083a7f

Good. So the name of the setup_notifications function doesn't reflect what it does anymore. Same for the script: it doesn't setup anything AFAICT.

  • I'm confused when looking at the diff of files/jenkins/slaves/isobuilders/collect_build_environment: did you mean NOTIFY_BUILD_TO?

Nop, this script collects the environment variable used during the build to pass and use them in the test job environment. So we don't care about the build notification recipient here, we just need to pass the test one.

OK. Please add a comment then :)

From local tests I made it works quite well, so I think it can be reviewed now.

Cool! How did you test this? In a clone of our Jenkins setup, or anything else? There was a syntax error in the Puppet code so I'm wondering how it can have been tested.

Some more questions/comments:

What's the requested kill switch? Is it the --disable* args one can pass in macros/builders.yaml?

What happens if TAILS_TICKET 0 and args.rfqa is true? To me it looks like print(notification_vars) will raise an exception. Same question when args.rfqa is false and TAILS_TICKET 0. And once you've added code to handle the missing cases, IMO the nested if/else thing will have grown out of control and will want a refactoring.

Why does the jenkins user need write access to /etc/jenkins?

#26 Updated by bertagaz 29 days ago

  • Assignee changed from bertagaz to intrigeri
  • % Done changed from 50 to 60
  • QA Check changed from Dev Needed to Ready for QA

intrigeri wrote:

Note: I've pushed some (untested) improvements.

Thanks! doesn't seem to break anything.

Thanks! Most difficult was to remember how it was done in the several related scripts and repos. :)

Design doc, anyone? (Sorry if I'm beating a dead horse.)

Doesn't hurt. I'll do that while doing the documentation for #12409.

OK, great! Improved a tiny bit on top.

Good one!

Cool. Then please adjust puppet:///modules/tails_secrets_jenkins/jenkins/slaves/isobuilders/redmine/redmine_apikey and require => File['/etc/redmine/apikey'] accordingly (that Puppet code can't possibly work; please test stuff before sending for QA!).

Oops, forgot to push the update of the tails_secrets_jenkins repo. Done. Fix the missing one in cbb9224aa082e912358af7557977d73e11dc9f77

Good. So the name of the setup_notifications function doesn't reflect what it does anymore. Same for the script: it doesn't setup anything AFAICT.

Fixed in 6cc5811cb4028d08ebc66ab179d3aa533005f7a1 and b6e15293a33e2df8072544da68745602e0d38431

Nop, this script collects the environment variable used during the build to pass and use them in the test job environment. So we don't care about the build notification recipient here, we just need to pass the test one.

OK. Please add a comment then :)

458c8ef5272719eb3c7c5b6f15589ed42916eb0e

From local tests I made it works quite well, so I think it can be reviewed now.

Cool! How did you test this? In a clone of our Jenkins setup, or anything else? There was a syntax error in the Puppet code so I'm wondering how it can have been tested.

I did test the script with a manual test suite. On the puppet side, considering it was just a matter of installing two files, I did not bother testing it and am willing to fix any problem that may arise at the deployment step.

What's the requested kill switch? Is it the --disable* args one can pass in macros/builders.yaml?

Yes. I'll document that in the design doc.

What happens if TAILS_TICKET 0 and args.rfqa is true? To me it looks like print(notification_vars) will raise an exception. Same question when args.rfqa is false and TAILS_TICKET 0. And once you've added code to handle the missing cases, IMO the nested if/else thing will have grown out of control and will want a refactoring.

Ooch, my manual test suite did not include testing when ticket was '0'. Indeed it would not work. Fixed in 52f0191eb42ba50c69084bce6f3f8b4d2e101f20

Why does the jenkins user need write access to /etc/jenkins?

Because I'm too permissive? Fixed in 11e204a024367155ecaa41e677829dfca354947b

#27 Updated by bertagaz 29 days ago

bertagaz wrote:

intrigeri wrote:

Design doc, anyone? (Sorry if I'm beating a dead horse.)

Doesn't hurt. I'll do that while doing the documentation for #12409.

I just had a look and it seems it's already documented in the blueprint. I just did not think about looking at it, and dig into the code.

#28 Updated by intrigeri 29 days ago

  • Assignee changed from intrigeri to bertagaz
  • QA Check changed from Ready for QA to Dev Needed

The code now looks OK to me, so please go ahead with the deployment whenever you have time to deal with any issues in the following N days. Please notify our fellow contributors before deploying, making it clear that we've learned from last year's experience, and this time we are ready to immediately disable the whole thing as soon as they notice they are ignoring such notifications. Yeepee :)

Then be careful, don't close this ticket: what's been done so far is "only" the first iteration and this ticket is about something bigger. I'll let you decide if using subtasks for the next steps would help you organize your work.

#29 Updated by intrigeri 29 days ago

I just had a look and it seems it's already documented in the blueprint. I just did not think about looking at it, and dig into the code.

The current state of our blueprints is very diverse: some reflect initial ideas that predated the actual (slightly different) implementation, some are almost up-to-date design doc, etc. So one cannot trust a blueprint to accurately describe what we're running. Ergo, a blueprint is not the right place to document something that works. Thankfully we now have a space (under contribute/working_together/roles/sysadmins) to store our design doc, so please consider moving there whatever blueprints content that is current and relevant.

#30 Updated by intrigeri 28 days ago

  • QA Check changed from Dev Needed to Info Needed

intrigeri wrote:

The code now looks OK to me, so please go ahead with the deployment whenever […]

Oops, I forgot to ask: what's happening to the notifications for the base branches, with the proposed changes? IMO they should either be kept as-is (i.e. sent to the RMs) for this first iteration, or be sent to the RM + last committer(s?).

#31 Updated by bertagaz 28 days ago

  • QA Check changed from Info Needed to Dev Needed

intrigeri wrote:

intrigeri wrote:

The code now looks OK to me, so please go ahead with the deployment whenever […]

Oops, I forgot to ask: what's happening to the notifications for the base branches, with the proposed changes? IMO they should either be kept as-is (i.e. sent to the RMs) for this first iteration, or be sent to the RM + last committer(s?).

The proposed change does not modify the current behavior: the script that generates the jobs sets the build and test recipient to tails-rm@ for base branches (as before), while for other branches it sets it to the NOTIFY_*_TO variables, which is then set by the new script at the beginning of the build to the last commit author.

#32 Updated by intrigeri 28 days ago

The proposed change does not modify the current behavior: […]

Thanks! Sounds perfect to me for this 1st iteration :)

#33 Updated by bertagaz 15 days ago

  • Related to Bug #12647: Deal with June 2107 false positive CI notifications added

#34 Updated by bertagaz 15 days ago

  • Related to Bug #12648: Deal with July 2017 false positive CI notifications added

#35 Updated by bertagaz 15 days ago

  • Related to Bug #12649: Deal with August 2017 false positive CI notifications added

#36 Updated by bertagaz 15 days ago

  • Related to Bug #12650: Deal with September 2017 false positive CI notifications added

#37 Updated by bertagaz 15 days ago

  • Assignee changed from bertagaz to intrigeri
  • QA Check changed from Dev Needed to Ready for QA

intrigeri wrote:

The proposed change does not modify the current behavior: […]

Thanks! Sounds perfect to me for this 1st iteration :)

Merged the branches in their respective repos and had to fix some minor bugs. It now works as expected, see https://jenkins.tails.boum.org/view/All/job/build_Tails_ISO_doc-12000-graphical-as-root/24/console for non-RfQA branch, https://jenkins.tails.boum.org/job/build_Tails_ISO_feature-12635-tor-browser-7.0/10/console for a RfQA branch, and also have a look at the base branches.

I've also added tickets to track the number of false positives regulary, and sent an email on tails-dev@ to warn people that the notifications are back.

#38 Updated by intrigeri 15 days ago

  • Assignee changed from intrigeri to bertagaz
  • Target version changed from Tails_3.0 to Tails_3.1
  • QA Check changed from Ready for QA to Dev Needed

I had already reviewed the code, and I've verified it seems to work fine at first glance. Some remaining work is listed on #11355#note-28 and #11355#note-29 (I suspect subtasks would be better suited to track them though). Next step: ensure there's a plan to finish this work; the end goal is not clearly defined beyond this first iteration, and I don't think we should try to define it better now. Let's wait and see how this first iteration works first.

#39 Updated by intrigeri 15 days ago

In collect_build_environment, [ $NOTIFY_TEST_TO ] || NOTIFY_TEST_TO="" doesn't do what you think: we're running under set -u.

Also available in: Atom PDF