Bug #11786

System often crashes during/after memory wipe since Linux 4.6

Added by mercedes508 12 months ago. Updated 5 months ago.

Status: Duplicate
Start date: 09/08/2016
Priority: Normal
Due date:
Assignee: -
% Done: 20%
Category: Hardware support
Target version: -
QA Check: Dev Needed
Blueprint:
Feature Branch:
Easy:
Type of work: Code
Affected tool:

Description

We received a few reports from users saying that memory wipe, which worked with Tails 2.5, no longer works with Tails 2.6~rc1. My guess is that this is because of Linux 4.6.

I'm not sure how this should be considered, even though it clearly is a regression.

Screenshot.png (602 KB) mercedes508, 09/24/2016 06:57 AM


Related issues

Related to Tails - Bug #10733: Run our initramfs memory erasure hook earlier Resolved 12/09/2015
Related to Tails - Bug #9707: Jessie: System sometimes does not poweroff after memory erasure Rejected 07/08/2015
Related to Tails - Bug #12061: Long delay before memory wipe starts on Stretch Resolved 12/21/2016
Duplicated by Tails - Bug #11730: Memory erasure freezes more often on Stretch Duplicate 08/26/2016
Duplicates Tails - Bug #12354: Fix shutdown and memory wipe regressions on 3.0~betaN Resolved 03/20/2017

Associated revisions

Revision bc9aa9d6
Added by anonym 11 months ago

Revert "Run our initramfs memory erasure hook earlier."

This reverts commits 185f53877ba90b621dbade4aebc4903e07e6ea82 and
71ac15e8a9d84bbb66260530bbdf177fa8addffc.

Since we introduced this in Tails 2.6~rc1 we see a lot more issues with
memory wiping both in our automated tests (locally, and on Jenkins) and
among actual users.

Refs: #10733, #11786

Revision 7f88af1d
Added by anonym 11 months ago

Revert "Run our initramfs memory erasure hook earlier."

This reverts commits 185f53877ba90b621dbade4aebc4903e07e6ea82 and
71ac15e8a9d84bbb66260530bbdf177fa8addffc.

Since we introduced this in Tails 2.6~rc1 we see a lot more issues with
memory wiping both in our automated tests (locally, and on Jenkins) and
among actual users.

Refs: #10733, #11786

Revision 154ac069
Added by intrigeri 11 months ago

Memory wipe: only apply the "one instance of sdmem per 2 GiB of RAM" tweak on 32-bit systems.

It brings useless complexity and possible lack of robustness on 64-bit
systems, that are used by the vast majority of Tails users nowadays.
Let's simplify this.

refs: #11786

Revision 40d3a62b
Added by intrigeri 8 months ago

Try shutting down for 1 minute after erasing memory (refs: #11786).

The workaround introduced in cd66f0ba49af773489db3a1bf15294255f681ecd
doesn't work anymore on feature/stretch: in the automated test suite, all I see
during memory wipe is a black screen (at least in features/usb_install.feature),
so if there's a kernel panic we don't see the busybox error message.

Let's see if the shutdown command triggers the "can't fork" message: if that's
the case, then memory wipe has succeeded, and we should eventually be able to
shutdown, if we retry long enough.

Revision be9b18da
Added by intrigeri 8 months ago

Display more information after wiping memory (refs: #11786).

The idea is to enable:

  • users to report more useful bugs;
  • developers to better understand the failure we see in the test suite.

Revision 4cb379a9
Added by intrigeri 8 months ago

Revert "Work around Tails stopping on shut down due to #11730."

This reverts commit cd66f0ba49af773489db3a1bf15294255f681ecd.

On this branch we try harder to shut down after wiping the memory,
let's see if that's enough to work around the "can't fork" problem.

refs: #11786

Revision 7b36725f
Added by intrigeri 8 months ago

Memory wipe: wait a bit after sdmem finished, before running more commands (refs: #11786).

The kernel might need some time to free the memory, before the busybox
shell is allowed to fork new stuff.

Revision 72697482
Added by anonym 7 months ago

Work around Tails freezing during memory wiping.

This should be reverted once #11786 is fixed properly.

This commit is a backport from the tails/stretch of the two
workarounds we implemented there.

Refs: #10776, #11786

Revision 41d88dcd
Added by anonym 7 months ago

Work around another instance of #11786.

I.e. when we get a kernel stack trace instead of a BusyBox shell.

Refs: #11786

Revision 2c9f39bb
Added by anonym 7 months ago

Try to identify #11786 explicitly.

Actually this will only make us fail faster.

Refs: #11786

History

#1 Updated by intrigeri 12 months ago

  • Related to Bug #10733: Run our initramfs memory erasure hook earlier added

#2 Updated by intrigeri 12 months ago

  • Assignee set to mercedes508
  • Priority changed from Normal to Elevated
  • Target version set to Tails_2.6
  • QA Check set to Info Needed

Can you please add some details to the "isn't working anymore" part? Description of what happens exactly, what's on screen, screenshots, would be useful.

Also, is there any chance that you get the bug reporters to test another, experimental ISO image, if I provided one?

Is this problem happening every time with 2.6~rc1 on the bug reporters' hardware?

This looks like the problem that's being discussed at #10733#note-21. It's unclear if that's caused by the Linux kernel upgrade, or by the changes introduced for #10733.

#3 Updated by elouann 11 months ago

intrigeri wrote:

Can you please add some details to the "isn't working anymore" part? Description of what happens exactly, what's on screen, screenshots, would be useful.

We received a more detailed bug report:

Shutdown on Tails 2.5:
1. Screen shows "[ OK ] Started Modem Manager", followed by "You can now remove...", followed by "Starting new kernel"
2. Screen fills with asterisks as memory is wiped, followed by "clean exit"
3. System turns off

Shutdown on Tails 2.6~rc1:
1. Screen shows "You can now remove...", followed by "kexec_core: Starting new kernel"
2. System turns off
Issue occurs every time.

#4 Updated by intrigeri 11 months ago

We received a more detailed bug report:

[...]

Shutdown on Tails 2.6~rc1:
1. Screen shows "You can now remove...", followed by "kexec_core: Starting new kernel"
2. System turns off
Issue occurs every time.

Hi elouann! This ticket is about memory wipe failing. In that report, the system turns off so memory wipe succeeded, so this seems off-topic here unless I've missed something ⇒ please report whatever other problem you wanted to report in a dedicated ticket. Thanks in advance! (If it's about the lack of visual feedback (asterisks) on screen: presumably that's caused by the upgrade to Linux 4.6, not much we can do about it, and having this feedback working or not has always been highly hardware-dependent; the best we can do is to ensure our doc/FAQ/known issues/whatever is good enough to avoid doc-reading-user confusion.)

#5 Updated by sonicsnail 11 months ago

In my case, 2.5 was fine, but on 2.6~rc1 the screen shows "You can now remove...", followed by "kexec_core: Starting new kernel", and it just hangs there. It never progresses to the next screen filled with asterisks and the system never turns off; I have to hold the power button until it turns off. So I assume the memory isn't getting wiped. This happens every time on 2.6~rc1.

intrigeri wrote:

Also, is there any chance that you get the bug reporters to test another, experimental ISO image, if I provided one?

I could test an experimental ISO if needed.

#6 Updated by anonym 11 months ago

  • Type of work changed from Research to Test

sonicsnail wrote:

intrigeri wrote:

Also, is there any chance that you get the bug reporters to test another, experimental ISO image, if I provided one?

I could test an experimental ISO if needed.

Can you please test this ISO image: http://nightly.tails.boum.org/build_Tails_ISO_feature-10733-reverted/builds/lastSuccessfulBuild/archive/build-artifacts/

If you could do it before Monday it might influence what we do for the final Tails 2.6 release.

#7 Updated by sonicsnail 11 months ago

anonym wrote:

Can you please test this ISO image: http://nightly.tails.boum.org/build_Tails_ISO_feature-10733-reverted/builds/lastSuccessfulBuild/archive/build-artifacts/

If you could do it before Monday it might influence what we do for the final Tails 2.6 release.

I tested the ISO, booting and shutting down several times, with the same results every time. Here are the results I got:
  1. Screen shows "You can now remove...", followed by "kexec_core: Starting new kernel".
  2. Screen fills with asterisks.
  3. Kernel panic (see output below). It hangs at the end of the output.
  4. System does not turn off automatically. Need to hold down the power button until it turns off.

So it now progresses past "Starting new kernel" to the screen filled with asterisks, which looks like memory wipe is working again, but now there's a kernel panic that prevents powering off.

Kernel panic output:

[    5.938190] usb 2-4: string descriptor 0 read error: -22
modprobe: module microcode not found in modules.dep
Starting Wiping the memory, press Control-C to abort earlier. Help: "/usr/bin/sdmem -h" 
Wipe mode is insecure (one pass with 0x00)
********************************************************************************
[Edit: ...]
********************************************************************************Killed
/scripts/init-premount/memory_wipe: line 57: can't fork
/init: /scripts/init-premount/ORDER: line 5: can't fork
[    8.033091] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000200
[    8.033091] 
[    8.033143] CPU: 1 PID: 1 Comm: init Tainted: G            E   4.6.0-0bpo.1-amd64 #1 Debian 4.6.4-1~bpo8+1
[    8.033192] Hardware name: [Edit: hardware redacted]
[    8.033224]  [Edit: hardware redacted]
[    8.033274]  [Edit: hardware redacted]
[    8.033323]  [Edit: hardware redacted]
[    8.033371] Call Trace:
[    8.033392]  [<ffffffff813124c5>] ? dump_stack+0x5c/0x77
[    8.033424]  [<ffffffff8117069f>] ? panic+0xdf/0x226
[    8.033455]  [<ffffffff8107f448>] ? do_exit+0xb68/0xba0
[    8.033486]  [<ffffffff8107f4f9>] ? do_group_exit+0x39/0xb0
[    8.033519]  [<ffffffff8107f580>] ? SyS_exit_group+0x10/0x10
[    8.033552]  [<ffffffff81003e28>] ? do_int80_syscall_32+0x58/0x160
[    8.033587]  [<ffffffff815caa66>] ? entry_INT80_compat+0x36/0x50
[    8.033626] Kernel Offset: disabled
[    8.033648] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000200
[    8.033648] 

#8 Updated by anonym 11 months ago

For the record, I've also observed this kernel panic on Tails with Linux 4.7 (KVM guest).

#9 Updated by anonym 11 months ago

  • Target version changed from Tails_2.6 to Tails_2.7

#11 Updated by emmapeel 11 months ago

With a Thinkpad x201

Tails 2.5:
Laptop shuts down OK, with asterisks and everything

Tails 2.6~rc1 and Tails 2.6:
Laptop shuts down but no asterisks.

tails-i386-feature_10733-reverted-2.6-20160918T1709Z-7f88af1+testing@f8e35e3:
Laptop shuts down OK, with asterisks and everything

#12 Updated by anonym 11 months ago

  • Assignee changed from mercedes508 to emmapeel

emmapeel wrote:

With a Thinkpad x201

Did you repeat these tests so there's some statistical significance in your results? Otherwise, can you please redo each test at least five times (clearly significant! :P) or so and report back?

Also, what results do you get from a recent devel build which has Linux 4.7?

#13 Updated by anonym 11 months ago

  • Related to Bug #9707: Jessie: System sometimes does not poweroff after memory erasure added

#14 Updated by anonym 11 months ago

  • Related to Bug #11730: Memory erasure freezes more often on Stretch added

#15 Updated by Anonymous 11 months ago

I can confirm: no asterisks, the system just powers off.
With 2.5, the asterisks appeared every time before power-off; with 2.6, they never appear and the system just powers off.

#16 Updated by intrigeri 11 months ago

  • Related to deleted (Bug #11730: Memory erasure freezes more often on Stretch)

#17 Updated by intrigeri 11 months ago

  • Related to deleted (Bug #11730: Memory erasure freezes more often on Stretch)

#18 Updated by intrigeri 11 months ago

  • Duplicated by Bug #11730: Memory erasure freezes more often on Stretch added

#19 Updated by intrigeri 11 months ago

  • Subject changed from Memory wipe fails more often since Linux 4.6 to System often crashed during/after memory wipe since Linux 4.6
  • Assignee changed from emmapeel to intrigeri
  • QA Check deleted (Info Needed)
  • Type of work changed from Test to Code

So, it seems that we have at least two problems here:

  1. the progress feedback (asterisks) has disappeared in some cases; that's fixed by reverting the changes introduced at #10733; presumably memory wipe succeeds anyway, so it's a minor UX problem only. I'm not convinced it's worth working on at this point; if someone feels differently, please file a separate ticket as requested on #11786#note-4.
  2. the system crashes somehow during/after memory wipe (with kernel panic and/or being dropped to a busybox shell) with Linux 4.6 and 4.7, and as a result fails to poweroff; it looks like memory wipe might have succeeded in all cases though; it also looks like the problem is related to the system being out of memory for some reason. Next debugging step is to look more closely at our memory wipe initramfs script, and check what can possibly cause too much memory pressure. IMO we can do this without waiting for emmapeel's results, so reassigning to myself (but emmapeel: please do answer the questions anonym asked you).

#20 Updated by intrigeri 11 months ago

  • Subject changed from System often crashed during/after memory wipe since Linux 4.6 to System often crashes during/after memory wipe since Linux 4.6

#21 Updated by intrigeri 11 months ago

anonym, emmapeel, elouann, goupille, mercedes508: did you see this bug with a system that has no more than 2GB of RAM?

#22 Updated by intrigeri 11 months ago

Bug reporters & frontdesk: did you see this bug on systems that are not "64-bit CPU + strictly more than 2GB of RAM"?

I've verified that busybox behaves sanely wrt. how we deal with background processes: in a busybox shell within our initramfs, (sleep 30) & (sleep 20) & (sleep 10) & wait returns after 30 seconds. I've also verified that killall sdmem works in that context.
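The background-job check described above can be reproduced with a minimal sketch like the following. The sleep durations are scaled down from 30/20/10 to 3/2/1 seconds for convenience, and the timing variables are hypothetical additions, not part of the original test; it also assumes a `date` that supports `+%s`.

```shell
# Start three background subshells, then wait for all of them.
# "wait" with no arguments should return only once the longest job ends.
start=$(date +%s)
(sleep 3) & (sleep 2) & (sleep 1) &
wait
end=$(date +%s)
elapsed=$((end - start))
echo "wait returned after ${elapsed}s"
```

If the shell handles background processes sanely, the reported time matches the longest sleep rather than the sum of all three.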

I suspect that our "one instance of sdmem per 2 GiB of RAM" tweak causes this bug on 64-bit systems with strictly more than 2GB of RAM, where we uselessly create a risky situation with a bunch of sdmem processes each trying to fill all memory, and rely on the OOM killer to behave properly... and for some reason we see that it sometimes tries to kill init. This does not explain the case when poweroff can't fork though: when we reach this point, if everything had gone well, all sdmem processes would have been killed, so there's no reason why there would be no memory left. I really don't understand what's happening there.

Still, I propose that we use the "one instance of sdmem per 2 GiB of RAM" tweak on 32-bit systems only (#8183 already drops it with 619c536a458177e7972ad6e0086a4174ee57a60d). I'm not sure it will fix this bug, but it makes sense to first eliminate some useless complexity, which can help us debug the problem further if it's not fixed by that change. I'll prepare a branch that does so.

#23 Updated by intrigeri 11 months ago

  • Status changed from Confirmed to In Progress
  • % Done changed from 0 to 10
  • Feature Branch set to bugfix/11786-simplify-memory-wipe-on-64-bit

#24 Updated by intrigeri 11 months ago

intrigeri wrote:

Still, I propose that we use the "one instance of sdmem per 2 GiB of RAM" tweak on 32-bit systems only (#8183 already drops it with 619c536a458177e7972ad6e0086a4174ee57a60d). I'm not sure it will fix this bug, but it makes sense to first eliminate some useless complexity, which can help us debug the problem further if it's not fixed by that change. I'll prepare a branch that does so.

Done on the topic branch, let's see what Jenkins thinks about it.

#25 Updated by intrigeri 11 months ago

Interesting: with that change in, according to Jenkins only ~2.5-3GB is wiped. The video shows that sdmem prints "done" before we display our "Happy dumping" message. I wonder if that's caused by our 32-bit userspace (sorry I'm quite clueless in such low-level stuff).

#26 Updated by anonym 11 months ago

intrigeri wrote:

Interesting: with that change in, according to Jenkins only ~2.5-3GB is wiped. The video shows that sdmem prints "done" before we display our "Happy dumping" message. I wonder if that's caused by our 32-bit userspace (sorry I'm quite clueless in such low-level stuff).

I am very certain our 32-bit userspace is the issue. After all, a 32-bit program will not "learn" how to deal with 64-bit pointers just because the kernel knows how to -- in the compiled code, pointers are still 32 bits wide, and no matter what you do you cannot fit a 64-bit pointer into them. 4 GiB of virtual memory per process is the hard limit.

The only way to save this KISS approach, I guess, is to ship a statically linked 64-bit sdmem in our initramfs... :)
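The 4 GiB figure above follows directly from the pointer width. A small sketch of the arithmetic, assuming the shell evaluates arithmetic with at least 64-bit integers (the variable names are illustrative only):

```shell
# Bytes addressable through a 32-bit pointer: 2^32.
# Assumption: shell arithmetic is at least 64 bits wide,
# otherwise the shift below would overflow.
pointer_bits=32
addressable_bytes=$((1 << pointer_bits))
gib=$((1024 * 1024 * 1024))
echo "32-bit virtual address space: $((addressable_bytes / gib)) GiB"
```

This is only the theoretical ceiling: the same address space also has to hold the program's code, stack and libraries, so a single 32-bit sdmem process can fill somewhat less than that, which may be why only ~2.5-3GB was seen as wiped.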

#27 Updated by anonym 11 months ago

intrigeri wrote:

anonym, emmapeel, elouann, goupille, mercedes508: did you see this bug with a system that has no more than 2GB of RAM?

Note that our automated test suite defaults to VMs with exactly 2 GiB of RAM, and it frequently happens when running it. FWIW I also run other Tails VMs with that exact amount of RAM, with the same problem.

#28 Updated by intrigeri 11 months ago

Note that our automated test suite defaults to VMs with exactly 2 GiB of RAM, and it frequently happens when running it. FWIW I also run other Tails VMs with that exact amount of RAM, with the same problem.

OK, too bad. So my hypothesis was wrong and it's not worth fixing the issue with my branch (that I'm going to delete).

#29 Updated by intrigeri 11 months ago

  • Feature Branch deleted (bugfix/11786-simplify-memory-wipe-on-64-bit)

#30 Updated by intrigeri 11 months ago

FWIW, since the beginning of October:

  • among 22 test suite runs on the stable branch (Linux 4.6), I see 5 such failures
  • among 6 test suite runs on the devel branch (Linux 4.7), I see no such failure

I've just started a few runs on the devel branch to confirm whether the move to 4.7 has fixed this bug.

Meanwhile, frontdesk: please have bug reporters test recent builds from the devel branch.

#31 Updated by intrigeri 10 months ago

  • Assignee changed from intrigeri to bertagaz
  • % Done changed from 10 to 20
  • QA Check set to Info Needed

Update -- since the beginning of October:

  • among 24 test suite runs on the stable branch (Linux 4.6), I see 6 such failures => 25% failure rate
  • among 13 test suite runs on the devel branch (Linux 4.7), I see no such failure => 0% failure rate

… so it might indeed be that the upgrade to 4.7 fixes this bug.

Next steps:

  • emmapeel reports about testing current devel on ThinkPad X201 (she already knows she has to do that).
  • the RM for Tails 2.7 (bertagaz) comments about the idea of upgrading to Linux 4.7 in our next point-release (cherry-picking a few commits from feature/11818-linux-4.7 should be enough, but perhaps it's more involved if it implies either bumping our debian APT snapshot or importing src:linux 4.7 into our custom APT repo): bertagaz, what do you think? Is this bug bad enough in your opinion to go through this and the risk of other regressions?

#32 Updated by emmapeel 10 months ago

So, I have tested several times each with a ThinkPad X201 and:

- Tails 2.6:
When clicking restart, the message about 'now you can take your USB out' appears, and after approx. 15 seconds of it, the laptop restarts (no stars! :S)
- tails-i386-devel-2.7-20161005T0955Z-bc6a783
No change with relation to 2.6
- Tails 2.5:
Similar to Tails 2.6, takes the same amount of time, but I get the screen full of asterisks before restarting, at some point inside the 15-second window (around 10 secs)

#33 Updated by bertagaz 10 months ago

  • QA Check changed from Info Needed to Dev Needed

emmapeel wrote:

So, I have tested several times each with a ThinkPad X201 and:

- Tails 2.6:
When clicking restart, the message about 'now you can take your USB out' appears, and after approx. 15 seconds of it, the laptop restarts (no stars! :S)
- tails-i386-devel-2.7-20161005T0955Z-bc6a783
No change with relation to 2.6
- Tails 2.5:
Similar to Tails 2.6, takes the same amount of time, but I get the screen full of asterisks before restarting, at some point inside the 15-second window (around 10 secs)

So this branch does seem to fix it in Jenkins, and does not seem to bring real trouble on bare metal. Sounds like a bugfix that could be merged for 2.7, especially since #11818 seems to fix the main bug that prevented it from being merged for 2.6. Hopefully it won't bring others.

I'll give it a try. Hope I'll find an easy way to include this kernel.

#34 Updated by intrigeri 10 months ago

bertagaz wrote:

I'll give it a try. Hope I'll find an easy way to include this kernel.

Any progress update / ETA? Do you need help?

#35 Updated by bertagaz 10 months ago

intrigeri wrote:

Any progress update / ETA? Do you need help?

Progress yes, more tests to analyze; deadline is the end of the weekend at best (especially now that merging the other branches just dropped off my list of next moves).

#36 Updated by intrigeri 10 months ago

Progress yes, more tests to analyze,

What's the branch? I see none with "11786" in its name.

#37 Updated by intrigeri 10 months ago

[Moved to #11885]

#38 Updated by intrigeri 10 months ago

I guess that this might be relevant and is worth having a closer look at:

linux-latest (75) unstable; urgency=medium

  * From Linux 4.7, the iptables connection tracking system will no longer
    automatically load helper modules.  If your firewall configuration
    depends on connection tracking helpers, you should explicitly load the
    required modules.  For more information, see
    <https://home.regit.org/netfilter-en/secure-use-of-helpers/>.

 -- Ben Hutchings <ben@decadent.org.uk>  Sat, 29 Oct 2016 01:53:18 +0100

#39 Updated by anonym 10 months ago

  • Blocks Bug #10776: Step "I shutdown and wait for Tails to finish wiping the memory" fails when memory wiping causes a freeze added

#40 Updated by intrigeri 10 months ago

Will this be fixed by the branch for #11885?

#41 Updated by bertagaz 10 months ago

intrigeri wrote:

Will this be fixed by the branch for #11885?

I'm not sure yet, I've seen failures when trying at home to run a 4.7 kernel, but that was with the 4.7.2. I think this ticket needs to wait a bit more to see how the newer 4.7.8 one works regarding this. I also want to test it a bit on bare metal.

#42 Updated by bertagaz 10 months ago

I've removed the outdated related branch that existed. Linux 4.7 is now in all main branches, so we'll track this bug through their respective Jenkins runs. We can create a more meaningful and active branch for this ticket.

#44 Updated by intrigeri 9 months ago

And a user just reported that he still sees the bug on Tails 2.7.

#45 Updated by bertagaz 9 months ago

  • Target version changed from Tails_2.7 to Tails_2.9.1

#46 Updated by sonicsnail 9 months ago

I tested again with Tails 3.0~alpha1, which is on Linux 4.8.0. Now the machine successfully powers off, whereas on 2.6~rc1 through 2.7 it would hang. I presume the memory is successfully wiping too because there's a delay between "You can now remove..." and powering off, but no asterisks are shown.

#47 Updated by anonym 8 months ago

  • Target version changed from Tails_2.9.1 to Tails 2.10

#48 Updated by intrigeri 8 months ago

  • Assignee changed from bertagaz to anonym

anonym, can you take this one? It was on bertagaz's plate since he was the RM but that's over now, and I think this is now on the Foundations Team's plate.

#49 Updated by intrigeri 8 months ago

FWIW the workaround added in our automated test suite (cd66f0b) doesn't work on current feature/stretch apparently: all I see during memory wipe is a black screen (at least in features/usb_install.feature), so if there's a kernel panic we don't see the busybox error message. This blocks updating parts of the test suite for Stretch, e.g. large parts of the USB and persistence tests. Bare metal is not affected (at least on the one MacBook Pro I tried this on).

I'm currently testing a workaround to this problem, that should apply to Tails 2.x and 3.x if it works.

#50 Updated by intrigeri 8 months ago

intrigeri wrote:

FWIW the workaround added in our automated test suite (cd66f0b) doesn't work on current feature/stretch apparently: all I see during memory wipe is a black screen (at least in features/usb_install.feature), so if there's a kernel panic we don't see the busybox error message.

Err, no: it's "just" that it takes a long while before the message about memory erasure is displayed; I'll file a separate ticket to track this, and meanwhile I'll teach the test suite to be more patient.

#51 Updated by intrigeri 8 months ago

  • Related to Bug #12061: Long delay before memory wipe starts on Stretch added

#52 Updated by intrigeri 8 months ago

  • Assignee changed from anonym to intrigeri

anonym, can you take this one? It was on bertagaz's plate since he was the RM but that's over now, and I think this is now on the Foundations Team's plate.

You seem to be busy with dogtail optimizations, and this problem makes the feature/stretch sprint more painful than it could be, so I worked on it a bit. Here's what I did:

Next steps: trigger enough builds+tests of these branches on Jenkins to get meaningful data, and test locally to see what happens with these branches merged in (I'll merge them for all my feature/stretch local dev builds in the next days).

#53 Updated by intrigeri 8 months ago

intrigeri wrote:

So it seems that memory wiping succeeds, but then attempting to run poweroff triggers the "can't fork", which kills the script. I'm trying something else on feature/stretch+11786-wait-before-poweroff and bugfix/11786-wait-before-poweroff.

#54 Updated by intrigeri 8 months ago

intrigeri wrote:

I'm trying something else on feature/stretch+11786-wait-before-poweroff and bugfix/11786-wait-before-poweroff.

Crap, even calling sleep triggers "can't fork": with busybox, even builtins are copies of the original busybox multi-call binary, so calling "sleep" is not an option; and "sleep" is not a builtin in dash nor bash anyway.

#55 Updated by intrigeri 8 months ago

  • Assignee changed from intrigeri to anonym

I've tried everything I could, and failed. Help!

#56 Updated by anonym 7 months ago

  • Assignee changed from anonym to intrigeri

Just to check your hypothesis that sleep would help us, try busy waiting with something like this instead:

X=0
until [ "${X}" -gt 1000000 ]; do X="$((${X} + 1))"; done

The number might need adjustment depending on your hardware, of course. On my hardware that results in a ~10 "sleep".

#57 Updated by intrigeri 7 months ago

  • Assignee changed from intrigeri to anonym

anonym wrote:

Just to check your hypothesis that sleep would help us, try busy waiting with something like this instead:

This won't work either, since in busybox [ requires forking.
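For illustration only, a busy-wait can also be written without `[` or `sleep`, using just arithmetic expansion, `case` and the `:` special builtin. This is a sketch under the assumption that those constructs are evaluated inside the shell process itself; it has not been tested in the Tails initramfs, and `LIMIT` is a hypothetical value that would need tuning per hardware.

```shell
# Busy-wait loop that avoids "[" and "sleep".
# Assumption: ":" , "case" and $(( )) are handled by the shell itself
# and therefore need no new process.
LIMIT=1000000   # hypothetical iteration count, hardware-dependent
X=0
while :; do
    X=$((X + 1))
    case "$X" in
        "$LIMIT") break ;;
    esac
done
```

Whether this actually sidesteps the "can't fork" failure depends on how the busybox shell in the initramfs was built, so it should be read as an experiment to try rather than a fix.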

#58 Updated by intrigeri 7 months ago

  • Assignee deleted (anonym)
  • Priority changed from Elevated to Normal
  • Target version deleted (Tails 2.10)

One option (until we can rely on PaX' memory sanitization or something) proposed by anonym would be to rewrite that section of the script in C and statically compile it. It doesn't seem to be worth the effort though: this regression doesn't affect everyone (e.g. even without this new bug, memory wiping fails on lots of hardware), and it's not a security problem (memory is wiped). So for the test suite, we'll rely on our workarounds; and we give up for now on fixing the actual bug for human users. Let's hope we'll be able to replace our current implementation with something more robust some day.

Patches welcome though: the script lives in config/chroot_local-includes/usr/share/initramfs-tools/scripts/init-top/memory_wipe.

#59 Updated by anonym 7 months ago

  • Blocks deleted (Bug #10776: Step "I shutdown and wait for Tails to finish wiping the memory" fails when memory wiping causes a freeze)

#60 Updated by intrigeri 7 months ago

intrigeri wrote:

One option (until we can rely on PaX' memory sanitization or something) proposed by anonym would be to rewrite that section of the script in C and statically compile it.

C, or anything else that can be statically compiled, e.g. Go or Rust.

#61 Updated by intrigeri 5 months ago

  • Related to Bug #12354: Fix shutdown and memory wipe regressions on 3.0~betaN added

#62 Updated by intrigeri 5 months ago

  • Status changed from In Progress to Duplicate

Let's track the next steps on #12354.

#63 Updated by intrigeri 5 months ago

  • Related to deleted (Bug #12354: Fix shutdown and memory wipe regressions on 3.0~betaN)

#64 Updated by intrigeri 5 months ago

  • Duplicates Bug #12354: Fix shutdown and memory wipe regressions on 3.0~betaN added
