Monitoring (Tag)
Active · Public

Members (2)

Watchers (2)

Details

Description

Tag to identify any tasks that affect monitoring infrastructure, checks, deployments and methods.

Recent Activity

Thu, Sep 17

BlackWidowMovie0000 closed T6192: Timeless Content Bar as Resolved.
Thu, Sep 17, 19:05 · Universal Omega, Extensions, Upstream
BlackWidowMovie0000 reopened T6192: Timeless Content Bar as "Open".
Thu, Sep 17, 18:55 · Universal Omega, Extensions, Upstream

Tue, Sep 8

Paladox closed T6093: Catch dns.resolver.NoAnswer properly inside reverse DNS check as Resolved.

Fixed with https://github.com/miraheze/puppet/pull/1491

Tue, Sep 8, 23:13 · Site Reliability Engineering, Monitoring

Mon, Aug 24

Southparkfan added a project to T6093: Catch dns.resolver.NoAnswer properly inside reverse DNS check: Site Reliability Engineering.
Mon, Aug 24, 22:49 · Site Reliability Engineering, Monitoring
Southparkfan triaged T6093: Catch dns.resolver.NoAnswer properly inside reverse DNS check as Normal priority.
Mon, Aug 24, 22:49 · Site Reliability Engineering, Monitoring

Jul 18 2020

Sario528 added a watcher for Monitoring: Sario528.
Jul 18 2020, 23:41

Jul 13 2020

Southparkfan claimed T5713: Create automated Icinga check for validity of all TLS certificates on system.
Jul 13 2020, 18:17 · Monitoring, Site Reliability Engineering
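
For illustration only, a certificate-validity check of the kind T5713 asks for could look roughly like the Python sketch below. It is not the check that was deployed: the function names and the 30-day and 7-day thresholds are assumptions, and it inspects only the certificate a single host serves rather than every certificate on the system. The exit codes follow the usual Icinga/Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL).

```python
import socket
import ssl
from datetime import datetime, timezone

# Conventional Icinga/Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def fetch_not_after(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Return the notAfter field of the certificate served by host:port.

    The default context verifies the certificate, so an already expired
    certificate raises ssl.SSLCertVerificationError here; a real check
    would catch that and report CRITICAL.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` and a notAfter string such as 'Jan  5 09:34:43 2018 GMT'."""
    expires = ssl.cert_time_to_seconds(not_after)  # seconds since the epoch, UTC
    return (expires - now.timestamp()) / 86400


def classify(days_left: float, warn_days: float = 30, crit_days: float = 7):
    """Map remaining days to an (exit code, message) pair. Thresholds are assumed."""
    if days_left <= crit_days:
        return CRITICAL, f"CRITICAL - certificate expires in {days_left:.0f} days"
    if days_left <= warn_days:
        return WARNING, f"WARNING - certificate expires in {days_left:.0f} days"
    return OK, f"OK - certificate expires in {days_left:.0f} days"
```

A caller would pass `fetch_not_after(host)` and `datetime.now(timezone.utc)` to `days_until_expiry`, then exit with the code returned by `classify`.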

Jun 27 2020

RhinosF1 closed T5811: Icinga IRC Alerts not triggering as Resolved.
Jun 27 2020, 11:43 · Amanda Catherine, Site Reliability Engineering, Monitoring
RhinosF1 updated the task description for T5811: Icinga IRC Alerts not triggering.
Jun 27 2020, 08:26 · Amanda Catherine, Site Reliability Engineering, Monitoring
RhinosF1 added a comment to T5811: Icinga IRC Alerts not triggering.

The last message from Icinga was at 18:48:49 UTC.

Jun 27 2020, 08:25 · Amanda Catherine, Site Reliability Engineering, Monitoring
RhinosF1 triaged T5811: Icinga IRC Alerts not triggering as Unbreak Now! priority.
Jun 27 2020, 08:22 · Amanda Catherine, Site Reliability Engineering, Monitoring
RhinosF1 created T5811: Icinga IRC Alerts not triggering.
Jun 27 2020, 08:22 · Amanda Catherine, Site Reliability Engineering, Monitoring

Jun 25 2020

Paladox moved T4601: Track mediawiki-static storage space in Icinga from Bugs to Trivial Puppet Change on the Site Reliability Engineering board.
Jun 25 2020, 15:39 · Monitoring, Site Reliability Engineering
Paladox moved T4601: Track mediawiki-static storage space in Icinga from Trivial Puppet Change to Bugs on the Site Reliability Engineering board.
Jun 25 2020, 15:39 · Monitoring, Site Reliability Engineering

Jun 23 2020

John added a project to T5713: Create automated Icinga check for validity of all TLS certificates on system: Monitoring.
Jun 23 2020, 15:37 · Monitoring, Site Reliability Engineering

Jun 7 2020

OnKoydenKovuldum closed T5706: Spam as Resolved.

WikiLirik spokespersons: critical comparison of positive and negative thinking
PyschoLyricWiki Attack Groups
Transpersonal psychology

Jun 7 2020, 03:40 · Trash

May 6 2020

John closed T5545: Monitor package updates as Resolved.
May 6 2020, 12:08 · Monitoring, Site Reliability Engineering
John claimed T5545: Monitor package updates.
May 6 2020, 10:49 · Monitoring, Site Reliability Engineering

May 5 2020

John added a project to T5545: Monitor package updates: Monitoring.
May 5 2020, 22:34 · Monitoring, Site Reliability Engineering

Apr 21 2020

Examknow added a comment to T5453: Alert when icinga-miraheze disconnects from IRC.

Complete.

SigmaBot will generate an alert including the ping “!sre” (stalk if you want to) when icinga-miraheze quits. It will run a short status check to indicate whether icinga and meta are up.

A full status check can be run by SRE, me, or Examknow using !status, but it takes a while because it checks multiple services, so only use it if you have to.

Apr 21 2020, 18:29 · Monitoring, Site Reliability Engineering
RhinosF1 closed T5453: Alert when icinga-miraheze disconnects from IRC as Resolved.

SigmaBot will generate an alert including the ping “!sre” (stalk if you want to) when icinga-miraheze quits. It will run a short status check to indicate whether icinga and meta are up.

Apr 21 2020, 16:24 · Monitoring, Site Reliability Engineering
RhinosF1 reassigned T5453: Alert when icinga-miraheze disconnects from IRC from RhinosF1 to Examknow.

The backup monitoring solution will:
Ping !staff when icinga-miraheze disconnects
Ping meta.miraheze.org and icinga.miraheze.org
Report whether the sites are UP or DOWN

Apr 21 2020, 15:25 · Monitoring, Site Reliability Engineering
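
The three steps listed for the backup monitoring solution can be sketched in Python as follows. This is a hypothetical illustration, not SigmaBot's actual code: the function names are invented, the IRC plumbing is left out, and treating "ping" as a plain HTTPS request (rather than ICMP) is an assumption.

```python
from typing import Callable, Iterable, Optional
from urllib.error import URLError
from urllib.request import urlopen

# Sites named in the task comment; the https:// scheme is an assumption.
SITES = ("https://meta.miraheze.org", "https://icinga.miraheze.org")


def http_probe(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL answers with an HTTP status below 400."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (URLError, OSError, ValueError):
        # Covers HTTP errors, DNS failures, refused connections and timeouts.
        return False


def status_report(sites: Iterable[str],
                  probe: Callable[[str], bool] = http_probe) -> str:
    """Report each site as UP or DOWN in a single line."""
    return "; ".join(
        f"{site} is {'UP' if probe(site) else 'DOWN'}" for site in sites
    )


def on_quit(nick: str,
            watched: str = "icinga-miraheze",
            ping: str = "!staff",
            sites: Iterable[str] = SITES,
            probe: Callable[[str], bool] = http_probe) -> Optional[str]:
    """Build the alert line when the watched nick quits; None for anyone else."""
    if nick != watched:
        return None
    return f"{ping} {watched} has disconnected. {status_report(sites, probe)}"
```

Passing the probe as a parameter keeps the message logic testable without network access. A bot's quit handler would call `on_quit(nick)` and send any non-None result to the channel.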
John added a comment to T5453: Alert when icinga-miraheze disconnects from IRC.
In T5453#106269, @John wrote:

If this is a task here tagged as monitoring, it’s a puppet change. If it’s not a change you’ll be making in Miraheze, this isn’t the place to track it currently.

It’s probably best kept outside MH’s network to reduce the chance that it goes down when we do.

Apr 21 2020, 14:17 · Monitoring, Site Reliability Engineering
RhinosF1 added a comment to T5453: Alert when icinga-miraheze disconnects from IRC.
In T5453#106269, @John wrote:

If this is a task here tagged as monitoring, it’s a puppet change. If it’s not a change you’ll be making in Miraheze, this isn’t the place to track it currently.

Apr 21 2020, 14:06 · Monitoring, Site Reliability Engineering
John added a comment to T5453: Alert when icinga-miraheze disconnects from IRC.

If this is a task here tagged as monitoring, it’s a puppet change. If it’s not a change you’ll be making in Miraheze, this isn’t the place to track it currently.

Apr 21 2020, 14:05 · Monitoring, Site Reliability Engineering
RhinosF1 added a comment to T5453: Alert when icinga-miraheze disconnects from IRC.

We already have redundant monitoring.

Apr 21 2020, 14:01 · Monitoring, Site Reliability Engineering
RhinosF1 added a comment to T5453: Alert when icinga-miraheze disconnects from IRC.

SigmaBot may be able to ping staff when icinga-miraheze parts the #miraheze channel. Staff, please let me know your thoughts on this.

Apr 21 2020, 14:00 · Monitoring, Site Reliability Engineering
Paladox added a comment to T5453: Alert when icinga-miraheze disconnects from IRC.

We already have redundant monitoring.

Apr 21 2020, 13:59 · Monitoring, Site Reliability Engineering
RhinosF1 updated subscribers of T5453: Alert when icinga-miraheze disconnects from IRC.

@John: Is someone going to work on the puppet change?

Apr 21 2020, 13:59 · Monitoring, Site Reliability Engineering
Examknow added a comment to T5453: Alert when icinga-miraheze disconnects from IRC.

SigmaBot may be able to ping staff when icinga-miraheze parts the #miraheze channel. Staff, please let me know your thoughts on this.

Apr 21 2020, 13:58 · Monitoring, Site Reliability Engineering
John moved T5453: Alert when icinga-miraheze disconnects from IRC from New Features/Services to Trivial Puppet Change on the Site Reliability Engineering board.
Apr 21 2020, 13:55 · Monitoring, Site Reliability Engineering
John closed T5454: Reduce number of alerts in an outage as Declined.

I’ll answer the question by saying: “we need more monitoring than we currently have.”

Apr 21 2020, 13:53 · Site Reliability Engineering, Monitoring
RhinosF1 created T5454: Reduce number of alerts in an outage.
Apr 21 2020, 13:31 · Site Reliability Engineering, Monitoring
RhinosF1 moved T5453: Alert when icinga-miraheze disconnects from IRC from Radar to New Features/Services on the Site Reliability Engineering board.
Apr 21 2020, 13:28 · Monitoring, Site Reliability Engineering
RhinosF1 triaged T5453: Alert when icinga-miraheze disconnects from IRC as Normal priority.
Apr 21 2020, 13:28 · Monitoring, Site Reliability Engineering

Apr 13 2020

John moved T4292: Add puppetdb prometheus exporter and export metrics to grafana from Radar to Trivial Puppet Change on the Site Reliability Engineering board.
Apr 13 2020, 14:12 · Monitoring, Site Reliability Engineering
John moved T4601: Track mediawiki-static storage space in Icinga from Radar to Trivial Puppet Change on the Site Reliability Engineering board.
Apr 13 2020, 14:12 · Monitoring, Site Reliability Engineering

Apr 9 2020

John updated the image for Monitoring from F26: fa-briefcase-blue.png to F1141558: fa-tags-yellow.png.
Apr 9 2020, 00:01
John edited Description on Monitoring.
Apr 9 2020, 00:01

Apr 7 2020

Reception123 closed T4431: Experiment with influxdb (to bring metrics for LizardFS into grafana) as Declined.

We don't have LizardFS anymore.

Apr 7 2020, 16:38 · Monitoring, Site Reliability Engineering

Dec 31 2019

Reception123 added a comment to T4601: Track mediawiki-static storage space in Icinga.

Only Grafana, not Icinga.

Dec 31 2019, 15:46 · Monitoring, Site Reliability Engineering
Paladox reopened T4601: Track mediawiki-static storage space in Icinga as "Open".

I was wrong, should have read the title.

Dec 31 2019, 15:46 · Monitoring, Site Reliability Engineering
Paladox added a comment to T4601: Track mediawiki-static storage space in Icinga.

Technically already supported at https://grafana.miraheze.org/d/n_LdiE1Zz/glusterfs?orgId=1&refresh=10s, but the issue is that it doesn't seem accurate and only shows usage per server, not in total.

Dec 31 2019, 15:46 · Monitoring, Site Reliability Engineering
Reception123 closed T4601: Track mediawiki-static storage space in Icinga as Resolved.

Done, but there are issues with total space vs. per-server space.

Dec 31 2019, 15:45 · Monitoring, Site Reliability Engineering

Sep 5 2019

Paladox closed T4291: Add nginx prometheus exporter and use metrics in grafana as Resolved.

This is now done!

Sep 5 2019, 19:18 · Monitoring, Site Reliability Engineering

Aug 23 2019

John closed T4670: Bandwidth should be monitored to prevent downtime as Declined.

Declining in line with previous issue.

Aug 23 2019, 21:31 · Site Reliability Engineering, Monitoring
RhinosF1 triaged T4670: Bandwidth should be monitored to prevent downtime as High priority.
Aug 23 2019, 19:25 · Site Reliability Engineering, Monitoring

Aug 8 2019

Reception123 assigned T4601: Track mediawiki-static storage space in Icinga to Paladox.

Assigning Paladox since LizardFS is his thing.

Aug 8 2019, 16:03 · Monitoring, Site Reliability Engineering

Aug 3 2019

RhinosF1 added projects to T4601: Track mediawiki-static storage space in Icinga: Site Reliability Engineering, Monitoring.
Aug 3 2019, 08:55 · Monitoring, Site Reliability Engineering

Jul 3 2019

Paladox claimed T4291: Add nginx prometheus exporter and use metrics in grafana.
Jul 3 2019, 19:23 · Monitoring, Site Reliability Engineering