16:00:18 <GeKo> #startmeeting network-health
16:00:18 <MeetBot> Meeting started Mon Jun  7 16:00:18 2021 UTC.  The chair is GeKo. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:18 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
16:00:53 <dgoulet> o/
16:00:54 <GeKo> alright, let's get started for the weekly meeting
16:00:59 <GeKo> https://pad.riseup.net/p/tor-netteam-2021.1-keep is the pad
16:01:07 <GeKo> ah, no
16:01:09 <ggus> hey!
16:01:12 <GeKo> let me get the right onw
16:01:14 <GeKo> *one
16:01:27 <irl[m]> hi
16:01:34 <GeKo> http://kfahv6wfkbezjyg4r6mlhpmieydbebr5vkok5r34ya464gqz6c44bnyd.onion/p/tor-nethealthteam-2021.1-keep
16:02:31 <gaba> hi!
16:04:29 <GeKo> okay, let's get started
16:04:45 <GeKo> if someone needs to add things to the pad, please do while we are chatting here
16:05:19 <GeKo> irl[m]: so, for the metrics.tpo outages what are the next steps here?
16:05:40 <irl[m]> as yet, no idea, i'm still mostly working on collector outage
16:05:58 <GeKo> i've filed https://gitlab.torproject.org/tpo/metrics/team/-/issues/15
16:06:01 <gaba> irl[m]: what specifically did you do to bring metrics.tpo back?
16:06:08 <irl[m]> turned it off and on again
16:06:17 <GeKo> and added an item on the status page to indicate we have issues
16:06:33 <irl[m]> i've not restarted it since the weekend, and it looks happy again now
16:06:44 <GeKo> i had issues this morning
16:07:00 <GeKo> showing a 502 proxy error on relay search
16:07:10 <irl[m]> that would imply it's a load-related problem then
16:07:20 <irl[m]> until i've got the prometheus stuff set up i have no visibility of any of this
16:07:23 <GeKo> so it seems these problems are at least only intermittently happening
16:07:34 <GeKo> what is the ticket for that?
16:07:52 <GeKo> or do we need to create one?
16:08:54 <GeKo> irl[m]: ^
16:10:04 <irl[m]> https://gitlab.torproject.org/tpo/tpa/team/-/issues/40280 is the ticket that is blocking the prometheus exporter being set up for collector, then i was going to add collector into the prometheus, and then go from there
16:10:04 <irl[m]> it's a whole new thing, to replace the old metrics nagios that seems to have been turned off while i was gone and nothing replaced it
16:10:04 <irl[m]> i think the biggest problem here is that i only knew metrics was broken because someone told me
16:10:04 <irl[m]> not a single alert was triggered anywhere
16:10:16 <irl[m]> the second problem is that the logs are very noisy, because metrics-web does a lot more than it did when the logging was initially devised, so without monitoring you just have a mountain of logs
16:10:30 <irl[m]> you don't know where to look because you have no timestamp
16:10:37 <GeKo> i see
16:10:54 <irl[m]> i'll look to see if we made a ticket for the larger thing
16:11:01 <GeKo> thanks
16:11:16 <irl[m]> https://gitlab.torproject.org/tpo/tpa/team/-/issues/40216 is related
16:11:24 <irl[m]> https://gitlab.torproject.org/tpo/tpa/team/-/issues/40274 is related
16:11:31 <GeKo> right
16:11:34 <irl[m]> there isn't a "project" ticket as such that i can see
16:11:42 <GeKo> i remember the last one
16:12:04 <irl[m]> the ticket would probably be titled "Monitor Metrics services with Prometheus"
16:12:36 <irl[m]> anarcat has set up some git stuff to make it easier for me to directly write the prometheus configs and have them deployed
16:12:45 <GeKo> can we take some shortcuts here so that the issue potentially bugging metrics.tpo is caught first?
16:13:00 <GeKo> i am not sure what logging infra needs to get set up for that
16:13:14 <GeKo> as i don't really know all the pieces involved here
16:13:40 <irl[m]> yes, i need to refresh my knowledge of blackbox exporter and then write the config for that
16:13:47 <irl[m]> instead of matrix it will just send me emails, which is better than nothing
16:14:05 <GeKo> but if collector is e.g. not involved in the metrics.tpo outage we could postpone setting prometheus alerts up for that one
16:14:23 <GeKo> and start with a different part first
16:14:38 <irl[m]> right yes, that is the plan
16:14:48 <GeKo> great
16:14:52 <gaba> so the idea is to do this in nagios?
16:14:55 <gaba> and not prometheus
16:14:59 <irl[m]> no, prometheus
16:15:02 <gaba> ook
16:15:10 <irl[m]> it all used to be in nagios but i guess people didn't like nagios as it got turned off
16:15:44 <irl[m]> the prometheus is being used by anti-censorship too, so there's redundancy of knowledge
16:15:53 <GeKo> yeah
16:16:07 <GeKo> i don't know anything about why the nagios part got turned off
16:16:35 <GeKo> but we should not start with it again if we move to prometheus i guess
16:17:03 <GeKo> okay
16:17:11 <GeKo> that's everything i had for that item
16:17:19 <GeKo> the other is the roadmap
16:17:25 <GeKo> http://kfahv6wfkbezjyg4r6mlhpmieydbebr5vkok5r34ya464gqz6c44bnyd.onion/p/IutVYvgMq9614nDk-KFm
16:17:29 <gaba> yes, not sure. We never talked about retiring nagios
16:17:46 <GeKo> i've cleaned it up and created tickets and we started triaging them
16:18:11 <GeKo> so for this week it would be useful if any of you could go over it and think about things that are missing
16:18:20 <GeKo> or even mis-categorized
16:18:26 <GeKo> arma2: mikeperry: ^
16:18:51 <GeKo> we tried to put things in Needed and Wanted etc. according to what we came up with during the meeting
16:19:05 <GeKo> and by me thinking about it afterwards
16:19:10 <GeKo> but things are not set in stone
16:19:28 <GeKo> so, if there is anything we should fix here, let gaba or me know
16:19:45 <GeKo> ggus: should we deal with the remaining community items?
16:19:49 <ggus> GeKo: yes
16:20:10 <GeKo> so, i have the meetup in Needed
16:20:20 <GeKo> anything else we should put into that?
16:20:31 <GeKo> the otf fellow and operator census work?
16:20:48 <ggus> yes, the operator census work is needed.
16:20:59 <GeKo> do we have a ticket for that work?
16:21:14 <ggus> mmmh, let me check
16:21:21 <gaba> arma2: we also assigned you a ticket.
16:22:00 <GeKo> and could easily assign more :)
16:22:31 <gaba> :)
16:23:51 <GeKo> ggus: no need to find it now, if it takes too long (yeah gitlab search is horrible)
16:23:51 <ggus> GeKo: https://gitlab.torproject.org/tpo/community/team/-/issues/39
16:23:58 <GeKo> :)
16:24:33 <GeKo> there are three items in the wanted section
16:24:43 <GeKo> which i marked with "XXX Ticket"
16:24:57 <GeKo> i guess we a) want to have them as wanted
16:25:10 <GeKo> and b) there should be tickets for them?
16:25:28 <GeKo> could you file them if they are missing and add the links to the pad?
16:25:47 <GeKo> i'll clean up things around them afterwards
16:26:43 <ggus> > Understand where relay operators try to go to get support (UX side)
16:26:48 <ggus> who created this one?
16:27:01 <GeKo> dunno
16:27:13 <GeKo> maybe arma2
16:27:16 <gaba> i think it was arma2
16:27:32 <ggus> ok, i will create a ticket for that one. this will also be part of the new fellow
16:27:41 <ggus> part of the work
16:27:45 <GeKo> nice
16:27:48 <GeKo> thanks
16:28:10 <GeKo> the final item i had to discuss is the website blocking tor one
16:28:36 <GeKo> i guess we can keep that as wanted given that we have a gsoc project running
16:28:44 <GeKo> which is providing the infra for that
16:29:18 <GeKo> ggus: at some point we should connect both worlds, the advocacy one with the tools one
16:29:28 <ggus> and the comms world too
16:29:28 <GeKo> so the former can start using the latter
16:29:31 <GeKo> yes
16:29:48 <ggus> when will this project be released?
16:30:05 <GeKo> i'll leave that for you to decide when the right time is to get started with that
16:30:11 <GeKo> let me see
16:31:02 <GeKo> i am actually not sure when gsoc ends
16:31:13 <GeKo> but i think end of july
16:31:16 <GeKo> or beginning of august
16:31:26 <GeKo> https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/wikis/GSoC-2021 is the page for the project
16:32:05 <GeKo> ggus: i'll put you as the comms liaison on the pad, too
16:32:11 <GeKo> not just the community one :)
16:32:33 <GeKo> and we can then put that item on the wishlist for comms folks
16:32:33 <ggus> ok!
16:32:41 <GeKo> i'll create a ticket after the meeting
16:32:47 <GeKo> and then we can take it from there
16:33:11 <GeKo> but the tool is a thing (or will be) and it can be useful for the advocavy part i think
16:33:18 <GeKo> *advocacy
16:33:20 <ggus> GeKo: it would be nice to have woswos and _ranchak_ presenting both projects during a Tor demo day.
16:33:28 <GeKo> right!
16:33:29 <ggus> GeKo: yeah!
16:33:31 <woswos> if you have any wishlist for the gsoc project, please let me/us know
16:33:34 <GeKo> good idea
16:33:52 <GeKo> woswos: you could think about the demo day idea, too
16:34:07 <GeKo> would be awesome to have it presented there
16:34:38 <woswos> is there a link for getting more information about it?
16:34:43 <ggus> woswos: yes, one sec
16:34:47 <GeKo> ggus: that's all i had from my side
16:36:04 <ggus> woswos: example - https://lists.torproject.org/pipermail/tor-project/2021-February/003047.html
16:36:12 <GeKo> while ggus is looking for the link let me know if there is anything else to discuss today
16:36:33 <ggus> we will announce the next demo day on torproject mailing list. but it should happen in august
16:36:58 <gaba> nice
16:36:59 <ggus> 5 - 10 minutes, open for community members, small crowd (~40 ppl)
16:37:02 <gaba> end of August
16:37:09 <woswos> thanks for the link
16:37:51 <GeKo> ggus: are we good wrt the roadmap for now?
16:38:17 <ggus> GeKo: yes, i will create the 2 tickets that are missing
16:38:26 <GeKo> thanks
16:38:47 <GeKo> okay. i heard nothing getting raised for discussion
16:38:57 <GeKo> so thanks for being here and have a nice week
16:38:59 <GeKo> #endmeeting