Feature #11366

Document our monitoring setup

Added by bertagaz about 2 years ago. Updated almost 2 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
Infrastructure
Target version:
Start date:
04/25/2016
Due date:
% Done:

100%

QA Check:
Pass
Feature Branch:
Type of work:
Contributors documentation
Blueprint:
Starter:
No
Affected tool:

Description

Once the monitoring setup is completely deployed in production and stabilized, we should document somewhere (to be defined) how it is configured.


Related issues

Blocks Tails - Feature #5734: Monitor servers Resolved 01/09/2015 11/09/2015

Associated revisions

Revision c1367c8f (diff)
Added by bertagaz about 2 years ago

Document the Icinga2 and VPN setup for contributors.

Refs: #11366

Revision 6fb44547 (diff)
Added by bertagaz about 2 years ago

Remove wrong VPN setup documentation.

Refs: #11366

Revision 3e0a7e3b (diff)
Added by bertagaz about 2 years ago

Add documentation about Icinga2 check deployment.

Refs: #11366

Revision 9acbcaba (diff)
Added by bertagaz almost 2 years ago

Move the page about adding Icinga2 checks in the right place.

Refs: #11366

Revision 887a3f64 (diff)
Added by bertagaz almost 2 years ago

Add link to the upstream puppet module.

Refs: #11366

Revision 19e270b3 (diff)
Added by bertagaz almost 2 years ago

s/deploy/add.

Refs: #11366

Revision ce168705 (diff)
Added by bertagaz almost 2 years ago

Camel case name.

Refs: #11366

Revision 389f89f6 (diff)
Added by bertagaz almost 2 years ago

Remove useless "really".

Refs: #11366

Revision 9d0cdab3 (diff)
Added by bertagaz almost 2 years ago

Typo.

Refs: #11366

Revision 6eab4c1c (diff)
Added by bertagaz almost 2 years ago

Rephrase fuzzy words.

Refs: #11366

Revision fd54ce6b (diff)
Added by bertagaz almost 2 years ago

t::m::s::torbrowser_archive is not a class.

Refs: #11366

Revision 5375cb8f (diff)
Added by bertagaz almost 2 years ago

Add url to example service definition template.

Refs: #11366

Revision cfabbffb (diff)
Added by bertagaz almost 2 years ago

Fix fuzzy phrase.

Refs: #11366

Revision f4a7d945 (diff)
Added by bertagaz almost 2 years ago

Fix parenthesis mismatch.

Refs: #11366

Revision a0d6d4fb (diff)
Added by bertagaz almost 2 years ago

Fix wrong Camel Case.

Refs: #11366

Revision 384d228a (diff)
Added by bertagaz almost 2 years ago

Typos.

Refs: #11366

Revision 6eae4fcd (diff)
Added by bertagaz almost 2 years ago

Precision++

Refs: #11366

Revision 453232b9 (diff)
Added by bertagaz almost 2 years ago

More precise phrasing.

Refs: #11366

Revision 4b12e906 (diff)
Added by bertagaz almost 2 years ago

Fix typo.

Refs: #11366

Revision cec2084b (diff)
Added by intrigeri almost 2 years ago

Monitoring doc: various rephrasing, navigation improvements & typo fixes.

refs: #11366

History

#1 Updated by bertagaz about 2 years ago

  • Blocked by Feature #9484: Deploy the monitoring setup to production added

#3 Updated by intrigeri about 2 years ago

  • Type of work changed from Sysadmin to Contributors documentation

#4 Updated by intrigeri about 2 years ago

  • Blocked by deleted (Feature #9484: Deploy the monitoring setup to production)

#5 Updated by intrigeri about 2 years ago

#6 Updated by bertagaz about 2 years ago

  • Assignee changed from bertagaz to intrigeri
  • Target version changed from Tails_2.4 to Tails_2.5
  • % Done changed from 0 to 60
  • QA Check set to Ready for QA

Pushed c1367c8, which documents the classes for sysadmin contributors.

#7 Updated by intrigeri about 2 years ago

  • Status changed from Confirmed to In Progress
  • Assignee changed from intrigeri to bertagaz
  • % Done changed from 60 to 30
  • QA Check changed from Ready for QA to Dev Needed

Thanks, I like the bits you pushed!

What I feel is missing:

  • a high-level description of what the pieces are, and how they're connected together: you know, the bits we've discussed on some now closed ticket, about the satellite etc. design; rationale: pointing to the individual config bits is good, but it makes little sense unless one knows what these bits are about; this can probably fit in a few sentences;
  • basic documentation (not sure where it should live) about adding a check: when I tried, it took me literally hours to reverse-engineer this, and that's actually why I created this ticket in the first place. A very rough ordered list of resources to create/update would have saved me hours, and will save me hours next time.

#8 Updated by bertagaz about 2 years ago

  • Assignee changed from bertagaz to intrigeri
  • % Done changed from 30 to 70
  • QA Check changed from Dev Needed to Ready for QA

intrigeri wrote:

  • a high-level description of what the pieces are, and how they're connected together: you know, the bits we've discussed on some now closed ticket, about the satellite etc. design; rationale: pointing to the individual config bits is good, but it makes little sense unless one knows what these bits are about; this can probably fit in a few sentences;

Good idea. 5372fd8

  • basic documentation (not sure where it should live) about adding a check: when I tried, it took me literally hours to reverse-engineer this, and that's actually why I created this ticket in the first place. A very rough ordered list of resources to create/update would have saved me hours, and will save me hours next time.

Added that in 3e0a7e3

#9 Updated by intrigeri about 2 years ago

  • Assignee changed from intrigeri to bertagaz
  • QA Check changed from Ready for QA to Dev Needed

bertagaz wrote:

intrigeri wrote:

  • a high-level description of what the pieces are, and how they're connected together: you know, the bits we've discussed on some now closed ticket, about the satellite etc. design; rationale: pointing to the individual config bits is good, but it makes little sense unless one knows what these bits are about; this can probably fit in a few sentences;

Good idea. 5372fd8

I guess you mean 3f4d45d instead (it has a buggy "refs:" ID, so it took me a while to find it). It looks good to me, thanks!

  • basic documentation (not sure where it should live) about adding a check: when I tried, it took me literally hours to reverse-engineer this, and that's actually why I created this ticket in the first place. A very rough ordered list of resources to create/update would have saved me hours, and will save me hours next time.

Added that in 3e0a7e3

Please don't put that in contribute/how/sysadmin, that is about welcoming new sysadmin contributors: this info feels out of place on that page. Instead it should be under contribute/working_together/roles/sysadmins (and link from there) since that's where we document services.

Other than that, I like the structure of this piece of doc. There are quite a few parts that seem obscure or confusing to me, though mostly due to vague wording, so I'll list them here (all this is so clear in your mind that I understand it's hard to guess what will be clear or not to me; and tech documentation writing is hard):

  • s/Deploying/Adding/ (deploying means something more specific than what you mean here, I think)
  • "upstream Icinga2 Puppet module" -> URL please
  • "active record" -> I think it's called "Active Records"
  • "so we can't really use" -> looks like "really" adds no info, but makes the sentence confusing
  • softwares -> spell checking
  • In "Once plugins and check commands are checked" I don't understand what "checked" means. Rephrase?
  • "Have a look at the tails::monitoring::service:torbrowser_archive class" -> I don't think it's a class.
  • "the related service configuration template" -> URL please (finding where each of these many small files lives was part of the reverse-engineering pain)
  • "Ran from the master on a remote hosted service" > this contradicts what I understood until now - isn't the purpose of a "Remotely executed service" precisely that it's run on the master? Rephrasing, perhaps? (I guess that "on" is not the right word)
  • tails::monitoring::{master,satellite,agent) class -> parenthesis mismatch; and maybe you mean "classes"?
  • Tails::Monitoring::Service::Memory -> the caps seem wrong in this context (there's at least another instance of this typo elsewhere, so some proof-reading would be welcome)
  • In "Once all of [...] are checked", I don't know what "checked" means.
  • "Pay attention to the parameter passed at the exported resources collection." -> that is? What should I pay attention to?
  • "the related node manifest" -> what's that?
  • "serveral time" -> spell checker + grammar

I've tried hard to list only the issues that are either important for understanding the meaning of the text or trivial to fix. The goal here is not to produce a super-polished text for newbies and end-users, just something I can refer to in 6 months :)

#11 Updated by bertagaz almost 2 years ago

  • Assignee changed from bertagaz to intrigeri
  • QA Check changed from Dev Needed to Ready for QA

intrigeri wrote:

Added that in 3e0a7e3

Please don't put that in contribute/how/sysadmin, that is about welcoming new sysadmin contributors: this info feels out of place on that page. Instead it should be under contribute/working_together/roles/sysadmins (and link from there) since that's where we document services.

Ok, moved it in 9acbcab

Other than that, I like the structure of this piece of doc. There are quite a few parts that seem obscure or confusing to me, though mostly due to vague wording, so I'll list them here (all this is so clear in your mind that I understand it's hard to guess what will be clear or not to me; and tech documentation writing is hard):

Thanks! I tried to follow the chronological order one should use to add a check.

  • s/Deploying/Adding/ (deploying means something more specific than what you mean here, I think)

19e270b

  • "upstream Icinga2 Puppet module" -> URL please

887a3f6

  • "active record" -> I think it's called "Active Records"

ce16870

  • "so we can't really use" -> looks like "really" adds no info, but makes the sentence confusing

389f89f

  • softwares -> spell checking

9d0cdab

  • In "Once plugins and check commands are checked" I don't understand what "checked" means. Rephrase?

6eab4c1

  • "Have a look at the tails::monitoring::service:torbrowser_archive class" -> I don't think it's a class.

Right, fd54ce6

  • "the related service configuration template" -> URL please (finding where each of these many small files lives was part of the reverse-engineering pain)

5375cb8

  • "Ran from the master on a remote hosted service" > this contradicts what I understood until now - isn't the purpose of a "Remotely executed service" precisely that it's run on the master? Rephrasing, perhaps? (I guess that "on" is not the right word)

cfabbff

  • tails::monitoring::{master,satellite,agent) class -> parenthesis mismatch; and maybe you mean "classes"?

f4a7d94

  • Tails::Monitoring::Service::Memory -> the caps seem wrong in this context (there's at least another instance of this typo elsewhere, so some proof-reading would be welcome)

a0d6d4f

  • In "Once all of [...] are checked", I don't know what "checked" means.

6eab4c1 already mentioned above.

  • "Pay attention to the parameter passed at the exported resources collection." -> that is? What should I pay attention to?

6eae4fc

  • "the related node manifest" -> what's that?

453232b and 4b12e90

  • "serveral time" -> spell checker + grammar

384d228

#12 Updated by intrigeri almost 2 years ago

  • Status changed from In Progress to Resolved
  • Assignee deleted (intrigeri)
  • % Done changed from 70 to 100
  • QA Check changed from Ready for QA to Pass

Added cec2084ba499c0426911ba1b47adc94d358cbfb3 on top (please have a look), and I think we're good. Let's see what happens the first time I try to actually use this piece of doc :)
