Bug #13541

Bug #10288: Fix newly identified issues to make our test suite more robust and faster

Tor still sometimes fails to bootstrap in the test suite

Added by bertagaz 3 months ago. Updated 13 days ago.

Status: In Progress
Priority: Normal
Assignee:
Category: Test suite
Target version:
Start date: 07/31/2017
Due date:
% Done: 20%
QA Check: Ready for QA
Feature Branch:
Type of work: Code
Blueprint:
Easy:
Affected tool:

Description

While reviewing the Jenkins test suite run failures from June (#12289), we noticed that the TorBootstrapFailure exception still happens sometimes.

There are three possible reasons for it, I think: a) chutney/tor sometimes feels stupid; or b) our test suite sometimes fails to manage chutney properly; or c) some unrelated event on the system prevents them from doing their job.

Next steps to investigate this would be to:

1. look at the test suite debug log and:
   * double-check there's no explanation for the tor bootstrap failure (that's about (a) and (b))
   * record the exact times when things went wrong
2. ensure we save the tor log from the Tails system when this happens, so we can see if that tor is stupid (a)
3. make the Journal persistent on isotesters so we can try to correlate such failures with system events (c).
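
To illustrate how steps 1 and 3 fit together, here is a minimal sketch that turns a failure time recorded from the debug log into a journalctl query for the surrounding time window. The helper name, the 60-second window and the example timestamp are assumptions made for illustration only; `--since`, `--until` and `--no-pager` are standard journalctl options. The sketch only builds the command and does not run it.

```python
from datetime import datetime, timedelta


def journalctl_window(failure_time, window_seconds=60):
    """Build a journalctl command covering a window around failure_time.

    Hypothetical helper: the window size is an arbitrary default.
    """
    fmt = "%Y-%m-%d %H:%M:%S"  # timestamp format accepted by journalctl
    delta = timedelta(seconds=window_seconds)
    return [
        "journalctl", "--no-pager",
        "--since", (failure_time - delta).strftime(fmt),
        "--until", (failure_time + delta).strftime(fmt),
    ]


# Made-up failure time, as it could be noted down from the debug log.
failure = datetime(2017, 7, 31, 12, 0, 30)
cmd = journalctl_window(failure)
print(" ".join(cmd))
```

The resulting command could then be run on the isotester once its Journal is persistent, to look for system events close to the recorded failure time.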


Related issues

Related to Tails - Bug #10238: Investigate why Tor sometimes is slow at or even fails bootstrapping Rejected 09/23/2015
Related to Tails - Feature #12411: Stop restarting tor if bootstrapping stalls Resolved 03/29/2017
Related to Tails - Feature #12289: Deal with June 2017 false positive scenarios Resolved 06/05/2017 07/05/2017

Associated revisions

Revision 38307515 (diff)
Added by bertagaz about 2 months ago

Test suite: factorize Tor and Htpdate artifacts saving method.

  • Use save_failure_artifacts() coherently in the hooks exception
    catching.
  • Use software name as extension to simplify artifacts copying and
    renaming.

Refs: #13541

Revision cd27e054 (diff)
Added by bertagaz about 2 months ago

Also save systemd's journal on failure.

That can be a useful method, so let's do that for every scenario failure.

Refs: #13541.

Revision fafd7216
Added by anonym 21 days ago

Merge remote-tracking branch 'origin/feature/13541-save-more-data-on-htpdate-or-tor-failures' into testing

Refs: #13541

History

#1 Updated by bertagaz 3 months ago

  • Parent task deleted (#10288)

#2 Updated by bertagaz 3 months ago

  • Related to Bug #10238: Investigate why Tor sometimes is slow at or even fails bootstrapping added

#3 Updated by bertagaz 3 months ago

  • Related to Feature #12411: Stop restarting tor if bootstrapping stalls added

#4 Updated by bertagaz 3 months ago

  • Related to Feature #12289: Deal with June 2017 false positive scenarios added

#5 Updated by bertagaz 3 months ago

I'll paste the debug logs and the rest later, once I've finished #12290.

#6 Updated by bertagaz 2 months ago

intrigeri wrote:

2. ensure we save the tor log from the Tails system when this happens, so we can see if that tor is stupid (a)

While doing #13472, I've pushed the feature/13541-save-more-data-on-htpdate-or-tor-failures branch, which contains a rough implementation of that. It also saves htpdate.log on 'time has synced' failures, which is why I wanted it pushed and live in Jenkins.

#7 Updated by bertagaz about 2 months ago

  • Status changed from Confirmed to In Progress
  • Assignee set to anonym
  • Target version set to Tails_3.2
  • % Done changed from 0 to 10
  • QA Check set to Ready for QA

intrigeri wrote:

3. make the Journal persistent on isotesters so we can try to correlate such failures with system events (c).

I've pushed another commit to the branch mentioned in my previous note that does that. I've made it save the journal no matter what the failure is. I think that's interesting information we may want for a lot of reasons/cases. For example, I've done it because I wanted to have the systemd journal for #13461.

I've also pushed another commit to this branch that cleans up the Tor and Htpdate logs retrieval.

I think that's something that'd be useful to get into stable and devel so that we get more useful information from the Jenkins runs, so I'm putting this branch RfQA. Merging it does not mean this ticket is over though, so please set it back to Dev needed and no assignee if/when it's merged.
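
As a rough illustration of the "software name as extension" idea behind the logs retrieval cleanup, here is a minimal sketch. It is not the test suite's code: the function name, its signature, the file names and the scenario prefix are all hypothetical, and only the naming scheme (one artifact per software, named `<prefix>.<software>`) is taken from the commit description.

```python
import shutil
import tempfile
from pathlib import Path


def save_artifacts(sources, dest_dir, prefix):
    """Copy each existing log to dest_dir as '<prefix>.<software>'.

    Hypothetical helper: 'sources' maps a software name to its log path.
    """
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for software, path in sources.items():
        path = Path(path)
        if not path.is_file():
            continue  # a log may be missing; skip it rather than fail
        target = dest_dir / f"{prefix}.{software}"
        shutil.copyfile(path, target)
        saved.append(target.name)
    return sorted(saved)


# Demo with made-up files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "tor_log").write_text("hypothetical tor log\n")
    saved = save_artifacts(
        {"tor": tmp / "tor_log", "htpdate": tmp / "missing.log"},
        tmp / "artifacts",
        "scenario_x",
    )
    print(saved)
```

Using the software name as the extension means a single loop can copy and name every log, whichever program produced it.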

#8 Updated by bertagaz about 2 months ago

I'm attaching the systemd journal from https://jenkins.tails.boum.org/job/test_Tails_ISO_feature-13541-save-more-data-on-htpdate-or-tor-failures/39/ which exposed this failure. The related saved Tor log is empty; it seems Tor didn't even start. That's the only build from this branch that had this problem, so it's not certain that it really shows the exact bug we're talking about.

#9 Updated by anonym 21 days ago

  • Assignee changed from anonym to bertagaz
  • Target version changed from Tails_3.2 to Tails_3.3
  • % Done changed from 10 to 20

I merged the feature/13541-save-more-data-on-htpdate-or-tor-failures branch, but had to push some fixups straight into testing:

892150ba5e Be more careful with remote shell usage when a scenario has failed.
0f5b0ea305 Prefer do..end over {} for multi-line blocks.
a44a711aa7 Fix indentation.
dcb30cc995 Simplify.
9f74565b80 Fix typos.
b85dbcb642 Simplify.

Please review (post-merge)!

#10 Updated by bertagaz 13 days ago

  • Target version changed from Tails_3.3 to Tails_3.4

#11 Updated by intrigeri 13 days ago

  • Parent task set to #10288
