Bug #11890

Bug #10288: Fix newly identified issues to make our test suite more robust and faster

Checking credentials in Icedove autoconfig wizard sometimes fails in the test suite

Added by intrigeri 12 months ago. Updated about 1 month ago.

Status: Confirmed
Priority: Normal
Assignee:
Category: Test suite
Target version:
Start date: 10/31/2016
Due date:
% Done: 0%
QA Check: Dev Needed
Feature Branch:
Type of work: Code
Blueprint:
Easy:
Affected tool: Email Client

debug.log (1.55 MB) intrigeri, 10/31/2016 08:41 AM


Related issues

Related to Tails - Feature #6304: Automate the most important bits of the Icedove tests Resolved 09/26/2013
Blocked by Tails - Feature #12277: Run our own email (IMAP/POP3/SMTP) server for automated tests Confirmed 03/02/2017
Blocks Tails - Feature #13240: Core work 2017Q4: Test suite maintenance Confirmed 06/29/2017

Associated revisions

Revision 192a1e57 (diff)
Added by intrigeri 10 months ago

Test suite: mark tests using the Icedove autoconfig wizard as fragile (refs: #11890).

History

#1 Updated by intrigeri 12 months ago

  • Related to Feature #6304: Automate the most important bits of the Icedove tests added

#2 Updated by intrigeri 12 months ago

  • Parent task set to #10288

#4 Updated by anonym 11 months ago

  • Priority changed from Normal to Elevated

#5 Updated by anonym 11 months ago

  • Priority changed from Elevated to Normal

#6 Updated by anonym 11 months ago

  • Target version changed from Tails_2.9.1 to Tails 2.10

#7 Updated by intrigeri 11 months ago

Depending on the November false positives triaging, we'll mark as fragile or not, bump prio or not, postpone or not.

#8 Updated by intrigeri 10 months ago

This was one of the most common failures on stable and devel in Oct+Nov, so I'm going to mark the test as fragile. But it affects only icedove.feature so I'm not bumping priority.

#9 Updated by anonym 9 months ago

  • Assignee changed from anonym to intrigeri

I am afraid this failure actually happens when the Riseup mail servers (which we use) are down. As a regular Riseup mail user, I can attest that it's not that uncommon. :/

I tried to manually reproduce this by killing the network (but not Tor). Each retry then took over a minute before the error happened again, unlike in the test suite failure video above, where the error is detected almost immediately. The only way I could achieve the exact same behavior (i.e. fast error detection) was by clicking on "Manual config" and then changing "Server address" to something that doesn't listen on the target port. I used x.org, for which port 993 is closed and not "filtered" (according to nmap); when the port is filtered, it takes a long time, which makes sense. So, it seems it's not the fault of Tor, but the server.
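The closed vs. filtered distinction above can be checked with a small timing probe. This is only a minimal sketch: `probe_port` is a hypothetical helper, not part of the test suite, and it connects directly rather than through Tor, so it only tells us how the server side answers, not how Icedove behaves.

```python
import socket
import time


def probe_port(host, port, timeout=10.0):
    """Try one TCP connect and return (outcome, seconds elapsed).

    outcome is "open", "closed" (connection actively refused) or
    "filtered" (no answer before the timeout), mirroring nmap's terms.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            outcome = "open"
    except ConnectionRefusedError:
        # The host answered with a reset: the failure is detected quickly.
        outcome = "closed"
    except socket.timeout:
        # Nothing came back at all: the failure only shows after the timeout.
        outcome = "filtered"
    except OSError as exc:
        # Anything else (DNS failure, unreachable network, ...).
        outcome = "error: %s" % exc
    return outcome, time.monotonic() - start
```

A "closed" result should come back almost immediately, which matches the fast failures seen in the video, while "filtered" takes the whole timeout.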

It would seem that another plausible alternative to the Riseup servers being down is that we get a bad DNS record for *.riseup.net (perhaps because of a bad exit?), and Tor caches it for the duration of the test without fetching a new one. But since we use Chutney I don't think this should happen.

So my working theory is that the servers are down. To verify we'd either need a detailed (per minute?) and accurate log of Riseup's servers' availability so we can compare, or perhaps switch to another mail provider (we could even try switching only half of the isotesters to a new one, for a better comparison). What do you think, tails-sysadmin@?

#10 Updated by anonym 9 months ago

Note to self: I just had an issue where my system's Icedove could not connect to imap.riseup.net when fetching email, no matter how much I requested new circuits and clicked "Retry" in the error dialog (presumably the same code is run in Icedove in this different case). I verified that the host was up and listening to TCP port 993. After restarting Tor, it immediately worked again. Could this mean something? Maybe this invalidates my experience that the servers are down occasionally, and that it actually is some Tor vs Icedove bug?

#11 Updated by intrigeri 9 months ago

  • Target version changed from Tails 2.10 to Tails_2.11

#12 Updated by intrigeri 9 months ago

  • QA Check set to Info Needed

#13 Updated by anonym 9 months ago

anonym wrote:

Note to self: I just had an issue where my system's Icedove could connect to imap.riseup.net when fetching email, no matter how much I requested new circuits and clicked "Retry" in the error dialog (presumably the same code is run in Icedove in this different case). I verified that the host was up and listening to TCP port 993. After restarting Tor, it immediately worked again. Could this mean something? Maybe this invalidates my experience that the servers are down occasionally, and that it actually is some Tor vs Icedove bug?

Related: It happened again. This time I retried a few times, with the error popping up again pretty much immediately. I did signal newnym but the problem persisted. However, when I cancelled and just pressed the "Get Messages" button, it suddenly worked fine. Was that first "attempt" simply cursed to never succeed no matter how many retries? Had it gotten into this bad state thanks to Tor, somehow?

Next time I should try to "Cancel" and "Get Messages" without a signal newnym, to try to rule out or implicate Tor's role.

#14 Updated by intrigeri 9 months ago

anonym wrote:

Note to self: I just had an issue where my system's Icedove could connect to imap.riseup.net when fetching email, no matter how much I requested new circuits and clicked "Retry" in the error dialog

Did you mean "could not connect"?

Also, is this because the connection is kept open, so the retries still use the same circuit?

#15 Updated by intrigeri 9 months ago

  • Assignee changed from intrigeri to anonym

anonym wrote:

So my working theory is that the servers are down. To verify we'd either need a detailed (per minute?) and accurate log of Riseup's servers' availability so we can compare,

I've looked at their munin and its resolution is not good enough to provide such data: I never see less than 30-100 IMAP logins. I don't think we can easily get the data you want without bothering Riseup people.

What do you think, tails-sysadmin@?

What would be the requirements for a (private) IMAP server we would run ourselves? (Just like we already do for the SSH tests.)

#16 Updated by anonym 9 months ago

intrigeri wrote:

anonym wrote:

Note to self: I just had an issue where my system's Icedove could connect to imap.riseup.net when fetching email, no matter how much I requested new circuits and clicked "Retry" in the error dialog

Did you mean "could not connect"?

Yes! I'll edit the post.

Also, is this because the connection is kept open, so the retries still use the same circuit?

I guess -- if the connection is kept open, the circuit won't be closed and will in fact be reused. OTOH, since we use Chutney in the test suite, the test suite host is always the exit. That invalidates my theory about picking an exit node for which riseup.net is censored. I'm wondering if there's some worse bug in Thunderbird, where it can get into a state where it will always fail.

#17 Updated by anonym 9 months ago

  • Assignee changed from anonym to intrigeri

intrigeri wrote:

anonym wrote:

So my working theory is that the servers are down. To verify we'd either need a detailed (per minute?) and accurate log of Riseup's servers' availability so we can compare,

I've looked at their munin and its resolution is not good enough to provide such data: I never see less than 30-100 IMAP logins. I don't think we can easily get the data you want without bothering Riseup people.

ACK, I thought so.

What do you think, tails-sysadmin@?

What would be the requirements for a (private) IMAP server we would run ourselves? (Just like we already do for the SSH tests.)

It would need to do IMAP/POP3/SMTP, whose ports (which should be the standard ones!) must be exposed to the Internet. SMTP should be locked down so only mail to itself can be delivered. It'd be extra cool if it could implement a daily cleanup of the inbox, so mails older than 24 h are deleted -- that way we can drop the cleanup code from the test suite (and start doing the test for POP again).

A problem, though, is that since this account's credentials would be public, anyone can log in and try to subvert our test results. E.g. fill the inbox (DoS), mess with the emails we use as verification (forcing false positives or false negatives), and so on. A solution would be if the server would accept any credentials and create the account on the fly (so we'd use random ones), and then clean it up after 24 hours or so. Sounds like a fun sysadmin task, but perhaps too time-consuming?
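The "delete mails older than 24 h" requirement boils down to a cutoff comparison. The sketch below only shows that selection step; `expired_uids` and the `(uid, received_at)` pairs are hypothetical, and fetching the messages from the server and deleting them is deliberately left out, since the server may well be able to do all of this natively.

```python
from datetime import datetime, timedelta, timezone


def expired_uids(messages, now, max_age=timedelta(hours=24)):
    """Return the UIDs of messages received more than max_age before now.

    messages is an iterable of (uid, received_at) pairs with
    timezone-aware datetimes.
    """
    cutoff = now - max_age
    return [uid for uid, received_at in messages if received_at < cutoff]


# Example with made-up messages: only the 25-hour-old one is selected.
now = datetime(2017, 3, 2, 12, 0, tzinfo=timezone.utc)
inbox = [(1, now - timedelta(hours=25)), (2, now - timedelta(hours=23))]
to_delete = expired_uids(inbox, now)
```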

#18 Updated by intrigeri 9 months ago

  • Assignee changed from intrigeri to anonym

What would be the requirements for a (private) IMAP server we would run ourselves? (Just like we already do for the SSH tests.)

It would need to do IMAP/POP3/SMTP, whose ports (which should be the standard ones!) must be exposed to the Internet.

Actually, we can probably configure it so that isotesters can reach it, and nobody outside of lizard can. With some simple DNS + firewall tricks, this should work nicely, and we could even have an autoconfig XML file served by our webserver for Thunderbird to fetch (which is how it does it currently with Riseup, I guess).

SMTP should be locked down so only mail to itself can be delivered.

Quite easy (but not required I think, unless I missed something).

It'd be extra cool if it could implement a daily cleanup of the inbox, so mails older than 24 h are deleted -- that way we can drop the cleanup code from the test suite (and start doing the test for POP again).

I think that dovecot has facilities to do that out of the box.

A problem, though, is that since this account's credentials would be public,

Why? We support configuring arbitrary accounts, right? So the way I see it, the email account for isotesters would live in the (private) isotesters secrets repo, and nobody else would be able to access and use them. E.g. you and I would keep using whatever other account we currently use.

#19 Updated by anonym 9 months ago

  • Assignee changed from anonym to intrigeri

intrigeri wrote:

What would be the requirements for a (private) IMAP server we would run ourselves? (Just like we already do for the SSH tests.)

It would need to do IMAP/POP3/SMTP, whose ports (which should be the standard ones!) must be exposed to the Internet.

Actually, we can probably configure it so that isotesters can reach it, and nobody outside of lizard can. With some simple DNS + firewall tricks, this should work nicely, and we could even have an autoconfig XML file served by our webserver for Thunderbird to fetch (which is how it does it currently with Riseup, I guess).

Ok. I had different assumptions (see below).

[...]

A problem, though, is that since this account's credentials would be public,

Why? We support configuring arbitrary accounts, right? So the way I see it, the email account for isotesters would live in the (private) isotesters secrets repo, and nobody else would be able to access and use them. E.g. you and I would keep using whatever other account we currently use.

When you said "Just like we already do for the SSH tests" I thought of the "unsafe SSH key" thing we use for the Git tests, so we wouldn't need any secrets at all for Icedove. But ok, since doing that would be hard, let's forget about it.

If this is easy enough to do, perhaps we can try it. However, we have not ruled out that Icedove simply is buggy when retrying, and if that is the case doing this would be pointless. What do you think?

(In the meantime, I'm waiting for Icedove to misbehave for me again.)

#20 Updated by intrigeri 9 months ago

  • Assignee changed from intrigeri to anonym

Ok. I had different assumptions (see below).

… and I was mis-remembering how we're doing it :)

If this is easy enough to do, perhaps we can try it. However, we have not ruled out that Icedove simply is buggy when retrying, and if that is the case doing this would be pointless. What do you think?

Very roughly, setting up the needed infra would take 2-4 hours. I'll let you make the call about if/when it's worth going that way.

#21 Updated by anonym 8 months ago

When debugging Icedove these are some useful prefs to put in .icedove/profile.default/preferences/0000tails.js:

pref("browser.dom.window.dump.enabled", true);
pref("mailnews.database.global.logging.dump", true);
pref("mail.wizard.logging.dump", "All");

#22 Updated by anonym 8 months ago

  • Assignee changed from anonym to intrigeri

In my quest to try to understand what causes this error I have tried to simulate a "network error" (broadly speaking) like this:
  1. Boot Tails with network
  2. Start Icedove, fill in a @riseup.net address + password
  3. When it presents the IMAP + SMTP configuration, unplug the network while tor is still running. [I also tried to DROP all Tor traffic via the firewall, in case tor would behave differently towards clients when circuits just don't work, compared to when the network is just down: no difference.]
  4. Press "Done" so it will try the password
  5. After two minutes it will show the failure
  6. Pressing "Done" again (to test again) results in the same error, after exactly two minutes

The last two points differ from the video: there the first error happens after 44 seconds, and then each retry fails after 3 seconds (except the second retry, which for some reason takes 7 seconds). I guess the fact that we don't get the full two-minute timeout could mean that we actually had a connection, but the server closed it, probably by shutting down completely, which also would explain why the following connections fail more or less immediately. OTOH, we are forcing a new Tor circuit between each retry, so I find it suspicious that each retry fails after a more or less fixed time -- I'd expect much more variance with this many new circuits.

By the way, note that Tor rate-limits NEWNYM to one per 10 seconds, so some of our NEWNYMs actually do nothing since they happen too often in this retry loop, but at least three of our NEWNYMs are effective, so I do not think this is a problem. Just in case, I took the liberty of fixing it in our base branches (3bc868968f47c414bdb3e32d09df675fea5766b8) after testing it thoroughly.
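To see which signals in a retry loop would count under that limit, here is a minimal sketch. It is a simplified model that follows the description above (signals sent too early are treated as no-ops); `effective_newnyms` is a hypothetical helper and the timestamps in the example are made up, not taken from the debug log.

```python
NEWNYM_MIN_INTERVAL = 10.0  # seconds, the limit described above


def effective_newnyms(timestamps, min_interval=NEWNYM_MIN_INTERVAL):
    """Return the send times of the NEWNYM signals that take effect.

    A signal is counted only if at least min_interval seconds have
    passed since the previous counted one; the rest are ignored.
    """
    effective = []
    for t in sorted(timestamps):
        if not effective or t - effective[-1] >= min_interval:
            effective.append(t)
    return effective


# Made-up retry loop sending a signal every few seconds.
sent = [0, 3, 10, 13, 20, 23]
print(effective_newnyms(sent))  # prints [0, 10, 20]
```

With retries only a few seconds apart, roughly every other signal is lost in this model, which is why spacing them at least 10 seconds apart in the retry loop makes each one count.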

So where are we now? Well, even if nothing is certain, my experiment + Occam's razor seems to indicate that the server we are using is the problem, not Tor. Do you agree? If so I think it's worth potentially wasting sysadmin time by setting up our own locked down IMAP/POP3/SMTP server.

#23 Updated by intrigeri 8 months ago

  • Blocked by Feature #12277: Run our own email (IMAP/POP3/SMTP) server for automated tests added

#24 Updated by intrigeri 8 months ago

  • Assignee changed from intrigeri to anonym
  • QA Check changed from Info Needed to Dev Needed

So where are we now? Well, even if nothing is certain, my experiment + Occam's razor seems to indicate that the server we are using is the problem, not Tor. Do you agree?

ACK.

If so I think it's worth potentially wasting sysadmin time by setting up our own locked down IMAP/POP3/SMTP server.

So this is now #12277.

#25 Updated by anonym 8 months ago

  • Target version changed from Tails_2.11 to Tails_3.2

Bumping to same release as blocker.

#26 Updated by anonym 4 months ago

  • Blocks Feature #13239: Core work 2017Q3: Test suite maintenance added

#27 Updated by intrigeri about 1 month ago

  • Target version changed from Tails_3.2 to Tails_3.3

#28 Updated by intrigeri 17 days ago

  • Blocks Feature #13240: Core work 2017Q4: Test suite maintenance added

#29 Updated by intrigeri 17 days ago

  • Blocks deleted (Feature #13239: Core work 2017Q3: Test suite maintenance)
