Rate limiting for error emails sent to admins #4003

Closed
opened 2025-12-29 18:32:33 +01:00 by adam · 6 comments
Owner

Originally created by @tyler-8 on GitHub (Aug 20, 2020).

Environment

  • Python version: 3.6.8
  • NetBox version: 2.8.8

Proposed Functionality

Implement rate limiting for the emails sent to admins when the same error repeats. Options include a couple of existing Django plugins or a custom-built implementation (https://stackoverflow.com/a/25889167) that uses the redis cache. redis is the ideal place to track the error rate because it is shared, so NetBox deployments of all kinds (load balanced, k8s, docker, standard install) would all be rate-limited consistently.

I don't think this could be implemented as a NetBox plugin as it would require modifying some of the 'core' Django bits of the NetBox codebase.

Use Case

  • Database or other network maintenance that may affect NetBox environment: if the database isn't reachable/resolvable during a time period, the NetBox admins can be spammed with anywhere from dozens to hundreds (or thousands) of emails within a very short (1-5 minutes) timespan
  • Security scanning systems, during their work, will attempt many malformed and known "bad" requests that will succeed through nginx (or whatever reverse-proxy) and be parsed by Django/NetBox and could create an email flood as in the above case

Having a rate-limiting mechanism that reduces the number of emails sent for the exact same error condition in a short time span would significantly cut down on error emails. The emails can be extremely useful for admins troubleshooting valid issues, as they contain full stack traces and other environment variables, so disabling the feature entirely would mean losing vital debugging information.
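
To make the idea concrete, here is a minimal sketch of a fixed-window limiter keyed by error condition. The class name `FixedWindowLimiter` and its parameters are hypothetical, and the counts live in process memory only; a shared store such as redis would be needed for the limit to hold across several workers or hosts.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Allow at most `limit` events per `window` seconds for each key.

    Hypothetical in-memory stand-in: counts are per process and old
    buckets are never cleaned up, which a real store would handle
    with key expiry.
    """

    def __init__(self, limit=5, window=60):
        self.limit = limit
        self.window = window
        self._counts = defaultdict(int)

    def allow(self, key, now=None):
        # `now` is injectable so the behaviour can be checked without sleeping.
        now = time.time() if now is None else now
        bucket = (key, int(now // self.window))
        self._counts[bucket] += 1
        return self._counts[bucket] <= self.limit


limiter = FixedWindowLimiter(limit=2, window=60)
# Four identical errors in the same minute: only the first two pass.
results = [limiter.allow("OperationalError", now=0) for _ in range(4)]
```

A different error key, or the same key in the next window, starts from a fresh count, so admins keep hearing about a prolonged outage at a reduced rate.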

Database Changes

Depending on the implementation, there could be database changes if any of the configuration were exposed through the admin panel, though I think environment variables/configuration parameters would be enough.

External Dependencies

  • redis
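  • some django plugin, though the options I could find (1: https://github.com/robert-kisteleki/django-email-throttler, 2: https://github.com/krisys/django-error-email-throttle) were either not completely up to date or used only the local filesystem to track the rate limit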
adam added the type: feature and status: under review labels 2025-12-29 18:32:34 +01:00
adam closed this issue 2025-12-29 18:32:34 +01:00

@tyler-8 commented on GitHub (Aug 20, 2020):

The more I look at this, the more I think it could actually be done as part of configuration.py with a custom handler. The one possible snag is the interaction with redis itself and keeping the key names de-conflicted. I'll tinker and report back in this issue, but I suspect this may not require an official feature after all.


@jeremystretch commented on GitHub (Sep 4, 2020):

@tyler-8 Have you made any further progress regarding your comment above?


@tyler-8 commented on GitHub (Sep 8, 2020):

I've been testing the custom handler, and it won't be able to live in configuration.py because of the way Django looks up handlers. For now, I'm tinkering with a netbox/handlers.py module (same directory as settings.py). It isn't working yet: I still receive the emails, but they aren't being throttled.

# handlers.py
import datetime
import time

from django.utils.log import AdminEmailHandler
from cacheops.redis import redis_client  # This uses the redis server configured for `caching`


class AdminEmailThrottle(AdminEmailHandler):
    def incr_counter(self):
        c = redis_client()
        key = self._redis_key()
        res = c.incr(key)
        c.expire(key, 300)
        return res

    def _redis_key(self):
        return time.strftime(
            r"error_email_limiter:%Y-%m-%d_%H:%M", datetime.datetime.now().timetuple()
        )

    def emit(self, record):
        try:
            ctr = self.incr_counter()
        except Exception:
            pass
        else:
            if ctr >= 10:
                return
        super(AdminEmailThrottle, self).emit(record)

# logging config
LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        'mail_admins': {
            'level': 'ERROR',
            'class': 'netbox.handlers.AdminEmailThrottle'
        }
    },
    'loggers': {
        'django': {
            'handlers': ['mail_admins'],
            'level': 'ERROR',
            'propagate': True,
        },
    },
}

@tyler-8 commented on GitHub (Sep 8, 2020):

As for requiring a built-in feature, I don't think it will be necessary, assuming I can get the method above working. handlers.py would be treated the same way ldap_config.py is today (as an optional module that the user has to create), and the logging dict in configuration.py just has to reference its import path.
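
As a rough illustration of how a logging dict hands options to a handler, the sketch below uses only the standard library. `CountingHandler` and `max_records` are hypothetical names. The `"()"` key is used so the example runs in a single file; a `"class"` entry with a dotted import path passes extra keys to the constructor in the same way.

```python
import logging
import logging.config


class CountingHandler(logging.Handler):
    """Hypothetical handler that takes its limit as a constructor argument."""

    def __init__(self, max_records=5):
        super().__init__()
        self.max_records = max_records
        self.emitted = []

    def emit(self, record):
        # Keep only the first `max_records` messages and drop the rest.
        if len(self.emitted) < self.max_records:
            self.emitted.append(record.getMessage())


LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "limited": {
            "level": "ERROR",
            "()": CountingHandler,  # a dotted path string also works here
            "max_records": 2,       # extra keys become constructor kwargs
        },
    },
    "loggers": {
        "demo": {"handlers": ["limited"], "level": "ERROR", "propagate": False},
    },
}

logging.config.dictConfig(LOGGING)
logger = logging.getLogger("demo")
for i in range(5):
    logger.error("failure %d", i)
handler = logger.handlers[0]
```

If an email-throttling handler accepted a keyword argument like this and forwarded the remaining arguments to its parent class, the limit could presumably be set from the logging dict itself rather than hard-coded.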


@tyler-8 commented on GitHub (Sep 8, 2020):

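Got it working. The redis_client object is wrapped in a decorator (funcy's LazyObject, https://funcy.readthedocs.io/en/stable/objects.html?highlight=LazyObject#LazyObject) that changes how you interact with it, so the line c = redis_client() was incorrect (and redundant). I tweaked a few things, but this is basically what was used in the StackOverflow answer linked in the original post.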
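
One plausible reading of the earlier symptom (emails arriving with no throttling) is that calling the lazy object raised an exception, which the handler's broad `except Exception: pass` then swallowed. The sketch below reproduces that pattern with hypothetical stand-ins: `LazyProxy` is not funcy's implementation and `FakeRedis` is an in-memory substitute for a real client.

```python
class LazyProxy:
    """Hypothetical lazy wrapper: builds the target on first attribute access."""

    def __init__(self, factory):
        self._factory = factory
        self._target = None

    def __getattr__(self, name):
        # Only reached for attributes that the proxy itself does not have.
        if self._target is None:
            self._target = self._factory()
        return getattr(self._target, name)


class FakeRedis:
    """In-memory substitute so the example runs without a redis server."""

    def __init__(self):
        self.counts = {}

    def incr(self, key):
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key]


redis_client = LazyProxy(FakeRedis)


def throttled_broken(limit=2):
    try:
        # The proxy is not callable, so this raises TypeError every time.
        count = redis_client().incr("k")
    except Exception:
        return False  # error swallowed: nothing is ever throttled
    return count > limit


def throttled_fixed(limit=2):
    try:
        count = redis_client.incr("k")  # use the proxy directly
    except Exception:
        return False
    return count > limit
```

With the broken variant every call reports "not throttled", which matches what the first attempt showed; using the proxy directly lets the counter increment and the limit take effect.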

I think this could make for a good feature to build in - however it is possible to do on your own. The below handler could be fleshed out some more to have a setting for the throttling rate in configuration.py. This is also a very basic throttler - it's purely based on rate, there's no checking/hashing for throttling repeats of the same exact event, but it definitely gets the job done.

# handlers.py
import datetime
import time

from django.utils.log import AdminEmailHandler
from cacheops.redis import redis_client


class AdminEmailThrottle(AdminEmailHandler):
    """
    Throttle the number of emails sent to admins in a minute.
    Each minute is a separate bucket in the event that you have
    prolonged error conditions persisting over several minutes.
    You will continue to get notifications of the
    error at a reduced rate.
    """

    EXPIRE_TIME = 300
    MAX_EMAILS_MINUTE = 5
    KEY_PREFIX = "email_admins_counter"

    def increment_counter(self):
        key = self._redis_key()
        # INCR creates the key with a value of 1 if it does not exist yet.
        result = redis_client.incr(key)
        # Let old minute buckets expire so they do not pile up in redis.
        redis_client.expire(key, self.EXPIRE_TIME)
        return result

    def _redis_key(self):
        # One key per wall-clock minute.
        return time.strftime(
            f"{self.KEY_PREFIX}:%Y-%m-%d_%H:%M", datetime.datetime.now().timetuple(),
        )

    def emit(self, record):
        try:
            counter = self.increment_counter()
        except Exception:
            # Fail open: if redis is unreachable, send the email anyway.
            pass
        else:
            if counter > self.MAX_EMAILS_MINUTE:
                return
        super().emit(record)
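
The handler above throttles purely by rate. As a minimal sketch of the per-event hashing it leaves out, the function below derives a short fingerprint from a log record so that repeats of the same error could share one counter. The name `record_fingerprint` and the choice of fields (exception type plus the innermost traceback frame, or logger name plus the unformatted message) are assumptions, not part of the original handler.

```python
import hashlib
import logging
import traceback


def record_fingerprint(record):
    """Return a short, stable hash for the error condition behind a record."""
    exc_info = record.exc_info
    if exc_info and exc_info[0] is not None:
        exc_type, _, tb = exc_info
        parts = [exc_type.__module__, exc_type.__qualname__]
        frames = traceback.extract_tb(tb)
        if frames:
            # The innermost frame says where the exception was raised.
            parts += [frames[-1].filename, str(frames[-1].lineno)]
    else:
        # Use the unformatted message so differing arguments still group together.
        parts = [record.name, record.levelname, str(record.msg)]
    digest = hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
    return digest[:16]
```

The fingerprint could then be folded into the redis key, for example between the prefix and the minute bucket, so each distinct error gets its own allowance.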


@jeremystretch commented on GitHub (Sep 9, 2020):

@tyler-8 thank you for sharing your solution! As this is a general-purpose function in Django and not specific to NetBox, I don't think it makes much sense for us to take on maintenance of it. It would be better suited as a feature implemented within the Django framework itself. And since it's already possible to accomplish as you've demonstrated, I'm going to close this out. Thanks again!

Reference: starred/netbox#4003