Discussion:
RFC: Verify concurrency limit
Kim B. Heino
2014-04-22 13:33:50 UTC
Hello,

I'm one of the maintainers of a multi-node spam checking service. We were
recently hit by a DDoS attack. We received hundreds of emails per second,
all targeted at ***@single.client.com. Unfortunately, client.com
had an "unknown receiver tarpit" feature enabled, and we had (and must have)
the "reject_unverified_recipient" option enabled on our side. This resulted
in hundreds of verify probes per second, but the client replied to less than
one per second. The result was a HUGE mail queue of verify probes plus a
couple of real emails. Basically, we and all of our clients were DDoS'ed,
as our Postfix installation was spending 99% of its time handling those
queued verify probes.

There are a lot of different concurrency limits in Postfix but none for
verify. I quickly came up with the attached patch, which solved this DDoS
attack. It's not complete and it's quite dirty, but I'm sending it here
for comments before I clean it up.

Basic idea in patch:

Chunk #1: Function to increase/decrease/get the current concurrency value
per receiver domain. I'm re-using verify_map for this value, stored as
key = "@@domain.com", value = "0". I know that value "0" will be purged
by verify_cache_validator() but that's not a problem.

Chunk #2: Decrease the concurrency counter when a probe finishes.

Chunk #3: Check whether the concurrency counter is over the limit and DEFER
if so. The current limit is hardcoded to 18, but
$default_destination_concurrency_limit should be a good default value.

Chunk #4: Increase the concurrency counter before sending a probe.

Is this the correct way to solve this kind of DDoS? Should I clean up
the patch and add a new verify_concurrency_limit config option? Any
comments?


While debugging my patch I noticed that Postfix doesn't strictly honor the
verify timeout. If the previous verify result has already timed out, but the
cache cleanup timeout (12h) has not yet expired, Postfix will use the previous
answer AND send a new refresh probe (without waiting for the answer).
Shouldn't it just ignore the old value? My patch doesn't take this into
account, so it might result in 18 new verifies plus an unknown number of
refresh probes. I would rather just ignore the old value.





diff -ur postfix-2.10.2.orig/src/verify/verify.c postfix-2.10.2/src/verify/verify.c
--- postfix-2.10.2.orig/src/verify/verify.c 2014-03-11 10:23:51.653142262 +0200
+++ postfix-2.10.2/src/verify/verify.c 2014-03-12 11:14:55.938779885 +0200
@@ -338,6 +338,45 @@
return (0);
}

+/* concurrency - keep track of currently running probes per domain */
+
+static signed int concurrency(VSTRING *email, signed int modify)
+{
+    VSTRING *domain;
+    const char *raw_data, *delim;
+    signed int count;
+
+    /* Convert email to "@@domain.tld" */
+    delim = vstring_memchr(email, '@');
+    if (delim == NULL)
+        return 0;
+    domain = vstring_alloc(40);
+    vstring_sprintf(domain, "@%s", delim);
+    msg_warn(">>>> domain %s, modify %d", STR(domain), modify);
+
+    /* Lookup current value */
+    raw_data = dict_cache_lookup(verify_map, STR(domain));
+    if (raw_data == NULL)
+        count = 0;
+    else
+        count = atoi(raw_data);
+
+    /* Set new value */
+    count = count + modify;
+    if (count < 0) {
+        msg_warn(">>>> negative %s = %d", STR(domain), count);
+        count = 0;
+    } else if (modify != 0) {
+        VSTRING *data = vstring_alloc(10);
+        vstring_sprintf(data, "%d", count);
+        dict_cache_update(verify_map, STR(domain), STR(data));
+        msg_warn(">>>> update %s = %s", STR(domain), STR(data));
+        vstring_free(data);
+    }
+    vstring_free(domain);
+    return count;
+}
+
+
/* verify_update_service - update address service */

static void verify_update_service(VSTREAM *client_stream)
@@ -372,6 +411,7 @@
* the address will be re-probed upon the next query. As long as
* some probes succeed the address will remain cached as OK.
*/
+ concurrency(addr, -1);
if (addr_status == DEL_RCPT_STAT_OK
|| (raw_data = dict_cache_lookup(verify_map, STR(addr))) == 0
|| STATUS_FROM_RAW_ENTRY(raw_data) != DEL_RCPT_STAT_OK) {
@@ -456,12 +496,23 @@
|| (now - probed > PROBE_TTL /* safe to probe */
&& (POSITIVE_ENTRY_EXPIRED(addr_status, updated)
|| NEGATIVE_ENTRY_EXPIRED(addr_status, updated)))) {
+
+ if (concurrency(addr, 0) >= 18) {
+ addr_status = DEL_RCPT_STAT_DEFER;
+ probed = 0;
+ updated = now;
+ text = "Concurrency limit exceeded";
+ msg_warn(">>>> %s", text);
+ } else {
+
addr_status = DEL_RCPT_STAT_TODO;
probed = 0;
updated = 0;
text = "Address verification in progress";
if (raw_data != 0 && var_verify_neg_cache == 0)
dict_cache_delete(verify_map, STR(addr));
+
+ }
}
if (msg_verbose)
msg_info("GOT %s status=%d probed=%ld updated=%ld text=%s",
@@ -495,6 +546,7 @@
if (now - probed > PROBE_TTL
&& (POSITIVE_REFRESH_NEEDED(addr_status, updated)
|| NEGATIVE_REFRESH_NEEDED(addr_status, updated))) {
+ concurrency(addr, +1);
if (msg_verbose)
msg_info("PROBE %s status=%d probed=%ld updated=%ld",
STR(addr), addr_status, now, updated);
Wietse Venema
2014-04-22 14:15:47 UTC
Post by Kim B. Heino
Hello,
I'm one of the maintainers of a multi-node spam checking service. We were
recently hit by a DDoS attack. We received hundreds of emails per second,
all targeted at ***@single.client.com. Unfortunately, client.com
had an "unknown receiver tarpit" feature enabled, and we had (and must have)
the "reject_unverified_recipient" option enabled on our side. This resulted
in hundreds of verify probes per second, but the client replied to less than
one per second. The result was a HUGE mail queue of verify probes plus a
couple of real emails. Basically, we and all of our clients were DDoS'ed,
as our Postfix installation was spending 99% of its time handling those
queued verify probes.
There are a lot of different concurrency limits in Postfix but none for
verify. I quickly came up with the attached patch, which solved this DDoS
Address verification probes are subject to concurrency and rate
limits just like regular mail.

However, you are looking for a solution BEFORE the mail queue,
that stops the verify daemon from sending probes.
Post by Kim B. Heino
attack. It's not complete and it's quite dirty, but I'm sending it here
for comments before I clean it up.
Chunk #1: Function to increase/decrease/get current concurrency value
per receiver domain. I'm re-using verify_map for this value, stored as
key = "@@domain.com", value = "0". I know that value "0" will be purged
by verify_cache_validator() but that's not a problem.
Instead of per-domain quota, would not it be sufficient to impose
a global limit on the total number of pending verify requests for
information that is not already cached? Then use something like
"random drop" to keep the number within bounds.

Wietse
Kim B. Heino
2014-04-22 14:37:01 UTC
Post by Wietse Venema
However, you are looking for a solution BEFORE the mail queue,
that stops the verify daemon from sending probes.
Yes, exactly.
Post by Wietse Venema
Instead of per-domain quota, would not it be sufficient to impose
a global limit on the total number of pending verify requests for
information that is not already cached? Then use something like
"random drop" to keep the number within bounds.
We have a lot of different clients that we forward mail to. One global
limit doesn't work: DDoS'ing one single client would affect all
clients.
Viktor Dukhovni
2014-04-22 15:08:44 UTC
Post by Kim B. Heino
Post by Wietse Venema
Instead of per-domain quota, would not it be sufficient to impose
a global limit on the total number of pending verify requests for
information that is not already cached? Then use something like
"random drop" to keep the number within bounds.
We have lot of different clients where we forward mail to. One global
limit doesn't work: DDoS'ing one single client would affect all
clients.
You probably need both a per-domain limit and a larger global limit.

RED (random early drop) would be applied per-domain once the domain's
limit is exceeded, and globally once the global limit is exceeded.
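
The random-drop idea can be sketched as follows. This is a minimal illustration, not Postfix code: the function name red_should_drop, the two thresholds, and the linear ramp between them are all assumptions chosen for the example. Below the lower threshold nothing is dropped, at or above the upper threshold everything is dropped, and in between the drop probability rises linearly. The caller supplies the random number so that the decision is easy to test.

```c
#include <stdbool.h>

/*
 * Hypothetical random-early-drop decision for verify probes.
 * pending: probes currently outstanding (per domain or global).
 * low:     below this count, never drop.
 * high:    at or above this count, always drop (must be > low).
 * u:       uniform random number in [0, 1) supplied by the caller.
 */
static bool red_should_drop(int pending, int low, int high, double u)
{
    if (pending < low)
        return false;
    if (pending >= high)
        return true;
    /* Drop probability rises linearly from 0 at low to 1 at high. */
    double p = (double) (pending - low) / (double) (high - low);
    return u < p;
}
```

The same function could be called twice per request, once with the per-domain count and thresholds and once with the global ones, dropping the probe if either call says so.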

Clients that don't process verify probes in a timely manner (tarpit
your system's probe messages) and thus contribute to DoS of your
system should be asked to provide you with a static user list, or
use another provider. You should use a separate transport for
verify probes with a generous process limit.
--
Viktor.
Wietse Venema
2014-04-22 15:10:50 UTC
Post by Kim B. Heino
Post by Wietse Venema
However, you are looking for a solution BEFORE the mail queue,
that stops the verify daemon from sending probes.
Yes, exactly.
Post by Wietse Venema
Instead of per-domain quota, would not it be sufficient to impose
a global limit on the total number of pending verify requests for
information that is not already cached? Then use something like
"random drop" to keep the number within bounds.
We have lot of different clients where we forward mail to. One global
limit doesn't work: DDoS'ing one single client would affect all
clients.
First, it is OK for you to code up something that works for your
specific use case.

However, I have to support the full range including sender probes.
So I would have to address things such as:

- The problem with external counters is that they aren't reset when
the verify daemon is restarted.

- The problem with per-domain in-memory counters is that they can
use up a lot of memory especially with sender domains.

Wietse
Viktor Dukhovni
2014-04-22 17:10:33 UTC
Post by Wietse Venema
- The problem with per-domain in-memory counters is that they can
use up a lot of memory especially with sender domains.
If the RED policy for probes were in the queue manager (I know
that's a pain) then the memory for counters would be limited by
the size of the active queue. And the limits would be recovered
automatically when the queue manager is re-started (infrequently).
--
Viktor.
Wietse Venema
2014-04-22 17:50:49 UTC
Post by Viktor Dukhovni
Post by Wietse Venema
- The problem with per-domain in-memory counters is that they can
use up a lot of memory especially with sender domains.
If the RED policy for probes were in the queue manager (I know
that's a pain) then the memory for counters would be limited by
the size of the active queue. And the limits would be recovered
automatically when the queue manager is re-started (infrequently).
The correct solution prevents excess probes from entering the mail
queue.

I think that the disadvantage of a global-only limit is overstated
because the Original Poster does not understand how Postfix works.

A global limit on the number of pending probes affects only unknown
email addresses. Postfix proactively refreshes known email addresses
well before they expire. I am not an idiot.

Wietse
Viktor Dukhovni
2014-04-22 18:05:16 UTC
Post by Wietse Venema
A global limit on the number of pending probes affects only unknown
email addresses. Postfix proactively refreshes known email addresses
well before they expire. I am not an idiot.
Whether this is sufficient depends on the cache hit rate, and
proportion of addresses that receive infrequent mail. Postfix does
not send refresh probes unless the recipient is actually sent a
message, right? The OP may benefit from a longer positive cache
lifetime, and a separate transport for probes. Customers that
tarpit probes are not doing anyone a favour, perhaps cluestick can
be applied.
--
Viktor.
Wietse Venema
2014-04-22 18:45:36 UTC
Post by Viktor Dukhovni
Post by Wietse Venema
A global limit on the number of pending probes affects only unknown
email addresses. Postfix proactively refreshes known email addresses
well before they expire. I am not an idiot.
Whether this is sufficient depends on the cache hit rate, and
proportion of addresses that receive infrequent mail. Postfix does
not send refresh probes unless the recipient is actually sent a
message, right? The OP may benefit from a longer positive cache
lifetime, and a separate transport for probes.
With the default expiration policy, inactive addresses (>31 days)
are the same as unknown recipients. Unless we're dealing with a
sustained flood of non-existent addresses, few real addresses should
be affected.
Post by Viktor Dukhovni
Customers that tarpit probes are not doing anyone a favour, perhaps
cluestick can be applied.
Yes, but they should not affect other customers too much. We're not
striving for total performance isolation.

Wietse
Mika Ilmaranta
2014-04-23 08:02:23 UTC
Hi,

Let me explain the situation further. One customer's domain is hit with
hundreds of thousands of spam messages to random nonexistent recipient
addresses from random sender addresses, in bursts that last a few hours.

Recipient verify probes clog every filtering node's mail queue with tens of
thousands of verify probes, and that effectively stops legitimate mail from
getting through to all other clients too until the verify probes are dealt
with.

The customer has Exchange. In co-operation with the customer, the tarpit was
removed from the incoming receiver and the connection concurrency was raised
from the default 20 connections to 200. We also changed the transport for the
client's verify probes. None of these configuration changes helped us much
with the problem of clogged mail queues: the period of clogging only
decreased to approximately one third, which is still more than can be
tolerated.

Obtaining and keeping valid recipient address lists up to date for a few
thousand domains is not an option due to the workload involved.

We don't use sender verify because all we can really do is make sure the
sender's domain exists. Sender verify also has problems coping with
greylisting and tarpitting, at least.

Looks like someone else has also hit this problem earlier
http://serverfault.com/questions/312962/postfix-connection-cache-address-verification-probes

BR,
Mika
Wietse Venema
2014-04-23 11:19:50 UTC
Post by Mika Ilmaranta
Hi,
Let me explain the situation further. One customers domain is hit with
hundreds of thousands of spam messages to random non existing recipient
addresses from random sender addresses in bursts that last a few hours.
Recipient verify probes clog every filtering nodes' mail queues with tens of
thousands verify probes and that effectively stops legitimate mails getting
through to all other clients too until the verify probes are dealt with.
Clogging can be prevented with a global limit on the number of
address verification probes.
Post by Mika Ilmaranta
Obtaining and keeping valid recipient address lists up to date with a few
thousand domains is not an option due to work load issues involved.
The Postfix address verification CACHE, in its default configuration,
will proactively refresh active addresses before they expire.
Therefore, your DDOS should not affect the verification of active
recipients, only those recipients that have expired or that are new.
This should be sufficient to handle a burst of bogus mail.

A global limit is what I can support in the short term. Your
per-recipient-domain limit is problematic because 1) it leaks
counters when the verify daemon is restarted, and 2) it solves only
the easy half of the problem (recipient domains).

Wietse
Mika Ilmaranta
2014-04-23 15:01:41 UTC
Post by Wietse Venema
Clogging can be prevented with a global limit on the number of
address verification probes.
I think that should be simulated somehow. However, I don't have a 300000+
host zombie spam botnet to test with. With four sender hosts
using a script like

for i in $(seq 1 100) ; do
  ( smtp-source -A -N -t ansu${i}@domain.invalid \
      -s 10 -m 50 -l 1000 some.smtp.server.domain.invalid & )
done

I was only able to hit the per domain hard limit of 18.

In theory, both sender and recipient verify probes in flight should be
limited by a per-receiving-host policy, just as everything else is limited
by default_destination_concurrency_limit, to prevent mail queue
congestion. But I think that would be even more of a resource hog than
limiting simply by domain.

Another possible way to handle this could be to use a
different queue for verify probes, so that handling them doesn't affect
real mail delivery. But which way is the most resource friendly?
Post by Wietse Venema
Post by Mika Ilmaranta
Obtaining and keeping valid recipient address lists up to date with a few
thousand domains is not an option due to work load issues involved.
The Postfix address verification CACHE, in its default configuration,
will proactively refresh active addresses before they expire.
Therefore, your DDOS should not affect the verification of active
recipients, only those recipients that have expired or that are new.
This should be sufficient to handle a burst of bogus mail.
Verification itself is not the problem, it will be done eventually, but
all legitimate deliveries get delayed for hours when the queue is
congested with verify probes.
Post by Wietse Venema
A global limit is what I can support in the short term. Your
per-recipient-domain limit is problematic because 1) it leaks
counters when the verify daemon is restarted, and 2) it solves only
the easy half of the problem (recipient domains).
1) Kim's patch is only a proof of concept, not something I would
put directly into the Postfix source tree as such.

2) I don't see how this would be any different for sender verify probes.
We face exactly the same concurrency limits on the receiver side there,
or even stricter ones.

Mika
Wietse Venema
2014-04-23 15:35:50 UTC
Post by Mika Ilmaranta
Post by Wietse Venema
Clogging can be prevented with a global limit on the number of
address verification probes.
I think that should be simulated somehow.
1) Given a global limit on the number of outstanding verification
requests that equals 1/4 of the capacity of the active queue.

2) Then 3/4 of the active queue remains available to deliver
non-verification requests, and consequently, verification requests
cannot "clog" up Postfix. When most bogus requests are for the same
domain, then that domain will suffer most of the delays. That is OK.

3) Excess verification requests tempfail immediately. Most addresses
will be unaffected because the verify cache proactively refreshes
active addresses. Only "unknown" or "inactive" addresses will be
affected. By default, inactive means no mail in 31 days, and "known
address" refresh happens after (at least) 7 days.
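
The three points above amount to a simple admission check, sketched here under stated assumptions. The value 20000 is the Postfix default for qmgr_message_active_limit; the names PROBE_LIMIT, pending_probes, probe_admit and probe_done are hypothetical and are not taken from any Postfix source.

```c
#include <stdbool.h>

/* Postfix default for qmgr_message_active_limit; adjust to match main.cf. */
#define ACTIVE_QUEUE_LIMIT 20000

/* Hypothetical global probe limit: 1/4 of the active queue capacity. */
#define PROBE_LIMIT (ACTIVE_QUEUE_LIMIT / 4)

/* Hypothetical counter of probes that are currently pending. */
static int pending_probes;

/* Returns true if a new probe may be queued, false if it must tempfail. */
static bool probe_admit(void)
{
    if (pending_probes >= PROBE_LIMIT)
        return false;
    pending_probes++;
    return true;
}

/* Call once when a probe completes, whatever the outcome. */
static void probe_done(void)
{
    if (pending_probes > 0)
        pending_probes--;
}
```

With the default active queue size this leaves room for 5000 probes and at least 15000 regular messages. The counter lives in memory only, so it returns to zero when the process restarts.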
Post by Mika Ilmaranta
Verification itself is not the problem, it will be done eventually, but
all legitimate deliveries get delayed for hours when the queue is
congested with verify probes.
The global limit eliminates this congestion. By your comment I
understand it would be OK to tempfail verify probes. In other words
the global limit is good enough to eliminate the congestion problem.

In an ideal world with unlimited budgets I could do some fine-grained
over-engineered solution that supports per-domain limits but the real
world is different.
Post by Mika Ilmaranta
2) I don't see how this would be any different for sender verify probes.
The number of sender domains is larger than your pool of recipient
domains, and therefore, tracking the sender domains would require
more memory.

Wietse
Wietse Venema
2014-04-23 19:22:40 UTC
I'm making one refinement step to eliminate queue congestion due
to address verification requests.

The refinement is to maintain a fixed-size cache with counters for
the most-common domain names in pending address verification requests.

This fixed-size cache allows Postfix to tempfail verification
requests selectively for domains that are requested frequently.
Only when this measure does not address the problem will Postfix
non-selectively tempfail verification requests for all domains.
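
A fixed-size table of per-domain counters might look like the sketch below. It is an illustration only: the names DOMAIN_SLOT, domain_probe_start and domain_probe_done are hypothetical, and the table size and name length are assumed values. The sketch tracks whichever domains arrive while a slot is free and releases a slot when its count returns to zero. A real implementation would need an eviction policy that favors the most common domains, and a faster lookup than a linear scan.

```c
#include <string.h>

#define DOMAIN_SLOTS  1000      /* fixed table size (assumed value) */
#define DOMAIN_MAXLEN 256       /* longest domain name stored (assumed) */

typedef struct {
    char name[DOMAIN_MAXLEN];   /* empty string means the slot is free */
    int  pending;               /* probes outstanding for this domain */
} DOMAIN_SLOT;

static DOMAIN_SLOT slots[DOMAIN_SLOTS];

/* Find the slot for a domain, or NULL if the domain is not tracked. */
static DOMAIN_SLOT *slot_find(const char *domain)
{
    for (int i = 0; i < DOMAIN_SLOTS; i++)
        if (slots[i].name[0] != 0 && strcmp(slots[i].name, domain) == 0)
            return &slots[i];
    return NULL;
}

/* Count a new probe; returns the domain's pending count, 0 if untracked. */
static int domain_probe_start(const char *domain)
{
    DOMAIN_SLOT *s;

    if (domain[0] == 0 || strlen(domain) >= DOMAIN_MAXLEN)
        return 0;
    s = slot_find(domain);
    if (s == NULL) {
        /* Claim the first free slot, if there is one. */
        for (int i = 0; i < DOMAIN_SLOTS && s == NULL; i++)
            if (slots[i].name[0] == 0)
                s = &slots[i];
        if (s == NULL)
            return 0;           /* table full: domain stays untracked */
        strcpy(s->name, domain);
        s->pending = 0;
    }
    return ++s->pending;
}

/* Count a finished probe and release the slot when nothing is pending. */
static void domain_probe_done(const char *domain)
{
    DOMAIN_SLOT *s = slot_find(domain);

    if (s != NULL && --s->pending <= 0) {
        s->name[0] = 0;
        s->pending = 0;
    }
}
```

Memory use is bounded by the table size no matter how many distinct sender or recipient domains appear, and because the table is in process memory the counters cannot outlive a restart.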

Wietse

First, the problem. A mail security provider handles mail for many
customer domains and relies on an address verification cache for
recipient validation. One customer domain is subjected to the
equivalent of a recipient dictionary attack. The Postfix queue
becomes congested with recipient verification requests. Our job is
to eliminate queue congestion due to address verification requests.

Second, my requirement. The solution must be scalable: it must
work not only for recipients but also for senders. There are many
more senders (domains) than recipients (domains). The solution must
also be robust: it must avoid counters that don't return to zero
after some Postfix daemon is restarted.

1) Enforce a global limit on the number of outstanding verification
requests that equals, say, 1/4 of the capacity of the active queue.

2) Then 3/4 of the active queue remains available to deliver
non-verification requests. Consequently, verification requests
cannot "clog" up the queue. When most bogus requests are for one
domain, then that domain will suffer most of the delays.

3) The verify daemon keeps a cache with counters for the 1000 or
so most common domain names in a pending address verification
request.

4) When the total number of pending verification requests approaches,
say, 80% of the global limit, the verify daemon starts tempfailing
requests for the domains from 3) that have many pending requests.
Only after the global limit is reached, the verify daemon tempfails
all excess verification requests.

5) When Postfix tempfails an address verification request as described
in 4), most legitimate addresses will be unaffected because the
verify cache proactively refreshes active addresses before they
expire. Only "unknown" or "inactive" addresses will be affected.
By default, inactive means no mail in 31 days, and "known address"
refresh happens after (at least) 7 days.
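
Steps 1) through 4) can be condensed into one decision function. The sketch below is an assumption-laden illustration rather than the eventual Postfix code: verify_policy and its parameters are hypothetical names, and domain_limit stands in for "many pending requests", which the text does not quantify.

```c
typedef enum { VRFY_ACCEPT, VRFY_TEMPFAIL } VRFY_ACTION;

/*
 * total_pending:  all pending verification requests.
 * global_limit:   e.g. 1/4 of the active queue capacity.
 * domain_pending: pending requests for this request's domain
 *                 (0 if the domain is not in the counter cache).
 * domain_limit:   hypothetical threshold for a "busy" domain.
 */
static VRFY_ACTION verify_policy(int total_pending, int global_limit,
                                 int domain_pending, int domain_limit)
{
    /* Hard limit reached: tempfail requests for all domains. */
    if (total_pending >= global_limit)
        return VRFY_TEMPFAIL;

    /* Soft limit (80% of global): tempfail only the busy domains. */
    if (total_pending >= (long) global_limit * 4 / 5
        && domain_pending >= domain_limit)
        return VRFY_TEMPFAIL;

    return VRFY_ACCEPT;
}
```

Between 80% and 100% of the global limit, a domain under attack is throttled while other domains still get their probes through. Only at the hard limit does every domain share the pain.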
Patrik Rak
2014-04-24 07:37:43 UTC
Post by Wietse Venema
1) Enforce a global limit on the number of outstanding verification
requests that equals, say, 1/4 of the capacity of the active queue.
3) The verify daemon keeps a cache with counters for the 1000 or
so most common domain names in a pending address verification
request.
Is the global limit going to be enforced by the queue manager?

If not, and given that neither of the counters above is persistent,
restarting postfix ~5 times under such attack once the domain limit is
reached will allow the queue to be clogged anyway... With every restart
allowing another batch of ( 80% * 1/4 * active_queue_size ) probes into
the active queue.
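
The arithmetic behind the "~5 times" estimate, writing N for the active queue size and assuming that each restart resets the in-memory counters to zero while the probes admitted earlier are still queued:

```latex
\[
  \underbrace{0.8 \times \tfrac{1}{4} \times N}_{\text{probes admitted per restart}} = 0.2\,N,
  \qquad
  5 \times 0.2\,N = N .
\]
```

With the Postfix default active queue size of 20000 messages, that is 4000 probes per restart.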

It won't be clogged indefinitely, sure, I am just pointing out that
tweaking the postfix config while under such attack might be problematic.

Patrik
Wietse Venema
2014-04-24 11:27:43 UTC
Post by Patrik Rak
Post by Wietse Venema
1) Enforce a global limit on the number of outstanding verification
requests that equals, say, 1/4 of the capacity of the active queue.
3) The verify daemon keeps a cache with counters for the 1000 or
so most common domain names in a pending address verification
request.
Is the global limit going to be enforced by the queue manager?
Primarily at the source of probe messages (verify). A secondary
counter (qmgr) would smooth out temporary glitches due to process
restarts, but that would only be a backup mechanism.

As you see I have thought about this, but I'm not constantly updating
the mailing list on every refinement. There may be more refinements,
but such stuff is obvious to people like you and me.

Wietse
Wietse Venema
2014-04-24 18:38:25 UTC
This is the first of several patches to limit the number of address
verification requests in the Postfix mail queue.

The second patch will introduce the primary, before-queue, enforcement
mechanism that considers the domain in an email address. This
involves more code and requires more time for quality testing.

The first patch introduces the secondary, post-queue, enforcement
mechanism that limits the number of probes in the active queue to
1/4 of the active queue capacity, without regard to the domain
in an email address.

Tempfailing requests in this manner is not as bad as one might
think. The Postfix verify cache proactively updates active addresses
well before they expire. Thus, non-selective tempfailing affects
only inactive addresses that have expired (31 days) or unknown
addresses.

This patch is as simple as possible. It involves one configuration
parameter (address_verify_pending_request_limit) for the pending
request limit, and one counter (qmgr_vrfy_pend_count) for the current
number of verification requests in the active queue. It tempfails
all verification requests that exceed the limit.

Wietse

diff --exclude=man --exclude=html --exclude=README_FILES --exclude=.indent.pro --exclude=Makefile.in -cr --exclude=util --exclude=mantools --exclude=proto --exclude=oqmgr /var/tmp/postfix-2.12-20140406/src/global/mail_params.h ./src/global/mail_params.h
*** /var/tmp/postfix-2.12-20140406/src/global/mail_params.h Sun Apr 6 19:00:17 2014
--- ./src/global/mail_params.h Thu Apr 24 09:04:32 2014
***************
*** 2733,2738 ****
--- 2733,2742 ----
#define DEF_VRFY_XPORT_MAPS "$" VAR_TRANSPORT_MAPS
extern char *var_vrfy_xport_maps;

+ #define VAR_VRFY_PEND_LIMIT "address_verify_pending_request_limit"
+ #define DEF_VRFY_PEND_LIMIT (DEF_QMGR_ACT_LIMIT / 4)
+ extern int var_vrfy_pend_limit;
+
/*
* Message delivery trace service.
*/
diff --exclude=man --exclude=html --exclude=README_FILES --exclude=.indent.pro --exclude=Makefile.in -cr --exclude=util --exclude=mantools --exclude=proto --exclude=oqmgr /var/tmp/postfix-2.12-20140406/src/qmgr/qmgr.c ./src/qmgr/qmgr.c
*** /var/tmp/postfix-2.12-20140406/src/qmgr/qmgr.c Sat Sep 28 21:03:32 2013
--- ./src/qmgr/qmgr.c Thu Apr 24 10:19:02 2014
***************
*** 310,315 ****
--- 310,320 ----
/* .IP "\fBqmgr_ipc_timeout (60s)\fR"
/* The time limit for the queue manager to send or receive information
/* over an internal communication channel.
+ /* .PP
+ /* Available in Postfix version 2.12 and later:
+ /* .IP "\fBaddress_verify_pending_request_limit (see 'postconf -d' output)\fR"
+ /* A safety limit that prevents address verification requests
+ /* from overwhelming the Postfix queue.
/* MISCELLANEOUS CONTROLS
/* .ad
/* .fi
***************
*** 445,450 ****
--- 450,456 ----
char *var_def_filter_nexthop;
int var_qmgr_daemon_timeout;
int var_qmgr_ipc_timeout;
+ int var_vrfy_pend_limit;

static QMGR_SCAN *qmgr_scans[2];

***************
*** 718,723 ****
--- 724,730 ----
VAR_LOCAL_RCPT_LIMIT, DEF_LOCAL_RCPT_LIMIT, &var_local_rcpt_lim, 0, 0,
VAR_LOCAL_CON_LIMIT, DEF_LOCAL_CON_LIMIT, &var_local_con_lim, 0, 0,
VAR_CONC_COHORT_LIM, DEF_CONC_COHORT_LIM, &var_conc_cohort_limit, 0, 0,
+ VAR_VRFY_PEND_LIMIT, DEF_VRFY_PEND_LIMIT, &var_vrfy_pend_limit, 1, 0,
0,
};
static const CONFIG_BOOL_TABLE bool_table[] = {
diff --exclude=man --exclude=html --exclude=README_FILES --exclude=.indent.pro --exclude=Makefile.in -cr --exclude=util --exclude=mantools --exclude=proto --exclude=oqmgr /var/tmp/postfix-2.12-20140406/src/qmgr/qmgr.h ./src/qmgr/qmgr.h
*** /var/tmp/postfix-2.12-20140406/src/qmgr/qmgr.h Sat Jul 24 13:28:00 2010
--- ./src/qmgr/qmgr.h Thu Apr 24 10:17:33 2014
***************
*** 377,382 ****
--- 377,383 ----

extern int qmgr_message_count;
extern int qmgr_recipient_count;
+ extern int qmgr_vrfy_pend_count;

extern void qmgr_message_free(QMGR_MESSAGE *);
extern void qmgr_message_update_warn(QMGR_MESSAGE *);
diff --exclude=man --exclude=html --exclude=README_FILES --exclude=.indent.pro --exclude=Makefile.in -cr --exclude=util --exclude=mantools --exclude=proto --exclude=oqmgr /var/tmp/postfix-2.12-20140406/src/qmgr/qmgr_message.c ./src/qmgr/qmgr_message.c
*** /var/tmp/postfix-2.12-20140406/src/qmgr/qmgr_message.c Fri Apr 5 17:27:48 2013
--- ./src/qmgr/qmgr_message.c Thu Apr 24 14:27:44 2014
***************
*** 8,13 ****
--- 8,14 ----
/*
/* int qmgr_message_count;
/* int qmgr_recipient_count;
+ /* int qmgr_vrfy_pend_count;
/*
/* QMGR_MESSAGE *qmgr_message_alloc(class, name, qflags, mode)
/* const char *class;
***************
*** 38,43 ****
--- 39,51 ----
/* of in-core recipient structures (i.e. the sum of all recipients
/* in all in-core message structures).
/*
+ /* qmgr_vrfy_pend_count is a global counter for the total
+ /* number of in-core message structures that are associated
+ /* with an address verification request. Requests that exceed
+ /* the address_verify_pending_request_limit are deferred immediately.
+ /* This is a backup mechanism for a more refined enforcement
+ /* mechanism in the verify(8) daemon.
+ /*
/* qmgr_message_alloc() creates an in-core message structure
/* with sender and recipient information taken from the named queue
/* file. A null result means the queue file could not be read or
***************
*** 149,154 ****
--- 157,163 ----

int qmgr_message_count;
int qmgr_recipient_count;
+ int qmgr_vrfy_pend_count;

/* qmgr_message_create - create in-core message structure */

***************
*** 748,758 ****
* after the logfile is deleted.
*/
else if (strcmp(name, MAIL_ATTR_TRACE_FLAGS) == 0) {
! message->tflags = DEL_REQ_TRACE_FLAGS(atoi(value));
! if (message->tflags == DEL_REQ_FLAG_RECORD)
! message->tflags_offset = curr_offset;
! else
! message->tflags_offset = 0;
}
continue;
}
--- 757,771 ----
* after the logfile is deleted.
*/
else if (strcmp(name, MAIL_ATTR_TRACE_FLAGS) == 0) {
! if (message->tflags == 0) {
! message->tflags = DEL_REQ_TRACE_FLAGS(atoi(value));
! if (message->tflags == DEL_REQ_FLAG_RECORD)
! message->tflags_offset = curr_offset;
! else
! message->tflags_offset = 0;
! if ((message->tflags & DEL_REQ_FLAG_MTA_VRFY) != 0)
! qmgr_vrfy_pend_count++;
! }
}
continue;
}
***************
*** 1159,1164 ****
--- 1172,1185 ----
}

/*
+ * Safety: defer excess address verification requests.
+ */
+ if ((message->tflags & DEL_REQ_FLAG_MTA_VRFY) != 0
+ && qmgr_vrfy_pend_count > var_vrfy_pend_limit)
+ QMGR_REDIRECT(&reply, MAIL_SERVICE_RETRY,
+ "4.3.2 Too many address verification requests");
+
+ /*
* Look up or instantiate the proper transport.
*/
if (transport == 0 || !STREQ(transport->name, STR(reply.transport))) {
***************
*** 1423,1428 ****
--- 1444,1451 ----
myfree(message->rewrite_context);
recipient_list_free(&message->rcpt_list);
qmgr_message_count--;
+ if ((message->tflags & DEL_REQ_FLAG_MTA_VRFY) != 0)
+ qmgr_vrfy_pend_count--;
myfree((char *) message);
}
Wietse Venema
2014-04-26 00:58:23 UTC
This is the second of several patches to limit the number of address
verification requests in the Postfix mail queue.

This patch implements the first part of the primary, before-queue,
enforcement mechanism that limits the number of probes in the active
queue to 1/4 of the active queue capacity. This part does not
consider the domain in an email address. Per-domain counters will
be added later.

During development of this patch it became clear that one address
verification request can result in multiple deliverability status
reports. If this is not taken into account, then the number of
pending requests will be under-estimated.

This problem does not exist with the earlier patch for the queue
manager. If you need to stop a flood of verification requests now,
use that patch.

The patch below eliminates one source of multiple deliverability
reports per address verification request. The workaround is to
report the deliverability only for the original address, and no
longer report the deliverability for the address that results from
rewriting or aliasing.

However, multiple deliverability reports per request also happen
with 1-to-many virtual alias expansion. For each virtual alias
expansion result, a deliverability status will be reported to the
verify(8) daemon.

To address this, the verify daemon will have to recognize that a
deliverability result arrives for an address before the automatic
refresh timer goes off. Such results are legitimate but they should
not affect the pending request counter.
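
One way to keep extra status reports from disturbing the counter is sketched below. It is only an illustration under assumptions: it uses a per-address flag instead of the refresh-timer comparison described above, and ADDR_STATE, probe_sent and status_reported are hypothetical names, not verify(8) code.

```c
#include <stdbool.h>

/* Hypothetical per-address state kept alongside the cache entry. */
typedef struct {
    bool probe_outstanding;     /* set when a probe is sent for this address */
} ADDR_STATE;

/* Hypothetical global count of pending verification requests. */
static int pending_requests;

/* A probe is sent: count it once per address. */
static void probe_sent(ADDR_STATE *addr)
{
    if (!addr->probe_outstanding) {
        addr->probe_outstanding = true;
        pending_requests++;
    }
}

/* A deliverability status arrives: only the first report is counted. */
static void status_reported(ADDR_STATE *addr)
{
    if (addr->probe_outstanding) {
        addr->probe_outstanding = false;
        pending_requests--;
    }
    /*
     * Otherwise this is an extra report, for example one of several
     * results from 1-to-many virtual alias expansion. The status can
     * still be cached, but the counter is left alone.
     */
}
```

With this bookkeeping, three reports for one probed address decrement the counter once, so the number of pending requests is not under-estimated.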

Wietse

diff --exclude=man --exclude=html --exclude=README_FILES --exclude=INSTALL --exclude=.indent.pro --exclude=Makefile.in -r -cr --exclude=util --exclude=mantools --exclude=proto --exclude=oqmgr --exclude=qmgr --exclude=mail_params.h /var/tmp/postfix-2.12-20140406/src/global/verify.c ./src/global/verify.c
*** /var/tmp/postfix-2.12-20140406/src/global/verify.c Tue Nov 1 17:43:13 2005
--- ./src/global/verify.c Fri Apr 25 19:12:42 2014
***************
*** 105,114 ****
--- 105,116 ----
if (var_verify_neg_cache || vrfy_stat == DEL_RCPT_STAT_OK) {
req_stat = verify_clnt_update(recipient->orig_addr, vrfy_stat,
my_dsn.reason);
+ #ifndef VAR_VRFY_PEND_LIMIT
if (req_stat == VRFY_STAT_OK && strcasecmp(recipient->address,
recipient->orig_addr) != 0)
req_stat = verify_clnt_update(recipient->address, vrfy_stat,
my_dsn.reason);
+ #endif
} else {
my_dsn.action = "undeliverable-but-not-cached";
req_stat = VRFY_STAT_OK;
diff --exclude=man --exclude=html --exclude=README_FILES --exclude=INSTALL --exclude=.indent.pro --exclude=Makefile.in -r -cr --exclude=util --exclude=mantools --exclude=proto --exclude=oqmgr --exclude=qmgr --exclude=mail_params.h /var/tmp/postfix-2.12-20140406/src/verify/verify.c ./src/verify/verify.c
*** /var/tmp/postfix-2.12-20140406/src/verify/verify.c Sat Dec 3 19:48:05 2011
--- ./src/verify/verify.c Fri Apr 25 19:48:07 2014
***************
*** 148,153 ****
--- 148,160 ----
/* .IP "\fBaddress_verify_sender_dependent_default_transport_maps ($sender_dependent_default_transport_maps)\fR"
/* Overrides the sender_dependent_default_transport_maps parameter
/* setting for address verification probes.
+ /* SAFETY CONTROLS
+ /* .ad
+ /* .fi
+ /* Available in Postfix version 2.12 and later:
+ /* .IP "\fBaddress_verify_pending_request_limit (see 'postconf -d' output)\fR"
+ /* A safety limit that prevents address verification requests
+ /* from overwhelming the Postfix queue.
/* MISCELLANEOUS CONTROLS
/* .ad
/* .fi
***************
*** 246,251 ****
--- 253,259 ----
int var_verify_neg_exp;
int var_verify_neg_try;
int var_verify_scan_cache;
+ int var_vrfy_pend_limit;

/*
* State.
***************
*** 259,264 ****
--- 267,347 ----
#define STREQ(x,y) (strcmp(x,y) == 0)

/*
+ * Preventing verification requests from overwhelming the Postfix queue.
+ *
+ * First, the verify(8) daemon enforces a global (domain-independent) limit on
+ * the number of pending verification requests. When this global limit is
+ * reached, the verify(8) daemon stops injecting new requests into the
+ * Postfix queue.
+ *
+ * By way of backup, the queue manager also enforces this global limit, to
+ * smooth out glitches due to program restarts. The queue manager enforces
+ * the global limit based on the content of the active queue, by tempfailing
+ * verification requests immediately.
+ *
+ * Later, the verify(8) daemon will maintain a fixed-size cache with per-domain
+ * counters, so that it can suppress requests more selectively.
+ *
+ * Even with non-selective limits the impact on legitimate users will be small.
+ * The verify(8) daemon proactively refreshes active addresses well before
+ * they expire. The non-selective limits will affect only addresses that are
+ * inactive (31 days by default) or addresses that are unknown.
+ */
+ static unsigned verify_pend_count;
+
+ /* verify_probe_queued - record that a probe has entered the mail queue */
+
+ static void verify_probe_queued(const char *addr)
+ {
+ const char myname[] = "verify_probe_queued";
+
+ verify_pend_count += 1;
+ if (msg_verbose)
+ msg_info("%s: address=%s pend_count=%u",
+ myname, addr, verify_pend_count);
+ }
+
+ /* verify_probe_done - chalk up another probe as completed */
+
+ static void verify_probe_done(const char *addr)
+ {
+ const char myname[] = "verify_probe_done";
+
+ /*
+ * Ignore the completion of requests that were issued before the current
+ * verify(8) process was started.
+ */
+ if (verify_pend_count == 0) {
+ if (msg_verbose)
+ msg_info("%s: ignoring address=%s without pending request count",
+ myname, addr);
+ } else {
+ verify_pend_count -= 1;
+ if (msg_verbose)
+ msg_info("%s: address=%s pend_count=%u",
+ myname, addr, verify_pend_count);
+ }
+ }
+
+ /* verify_probe_permit - permit or deny probe */
+
+ static int verify_probe_permit(const char *addr)
+ {
+ const char myname[] = "verify_probe_permit";
+
+ if (msg_verbose)
+ msg_info("%s: address=%s pend_count=%u pend_limit=%d",
+ myname, addr, verify_pend_count, var_vrfy_pend_limit);
+
+ if (verify_pend_count < var_vrfy_pend_limit) {
+ return (1);
+ } else {
+ msg_info("Dropping verification request for %s", addr);
+ return (0);
+ }
+ }
+
+ /*
* The address verification database consists of (address, data) tuples. The
* format of the data field is "status:probed:updated:text". The meaning of
* each field is:
***************
*** 387,392 ****
--- 470,476 ----
ATTR_TYPE_INT, MAIL_ATTR_STATUS, VRFY_STAT_OK,
ATTR_TYPE_END);
}
+ verify_probe_done(STR(addr));
}
vstring_free(buf);
vstring_free(addr);
***************
*** 395,409 ****

/* verify_post_mail_action - callback */

! static void verify_post_mail_action(VSTREAM *stream, void *unused_context)
{

/*
* Probe messages need no body content, because they are never delivered,
* deferred, or bounced.
*/
! if (stream != 0)
post_mail_fclose(stream);
}

/* verify_query_service - query address status */
--- 479,497 ----

/* verify_post_mail_action - callback */

! static void verify_post_mail_action(VSTREAM *stream, void *context)
{
+ char *addr = context;

/*
* Probe messages need no body content, because they are never delivered,
* deferred, or bounced.
*/
! if (stream != 0) {
! verify_probe_queued(addr);
! myfree(addr);
post_mail_fclose(stream);
+ }
}

/* verify_query_service - query address status */
***************
*** 418,423 ****
--- 506,513 ----
long probed;
long updated;
char *text;
+ int probe_permission_checked = 0;
+ int probe_permitted = 0;

if (attr_scan(client_stream, ATTR_FLAG_STRICT,
ATTR_TYPE_STR, MAIL_ATTR_ADDR, addr,
***************
*** 456,465 ****
|| (now - probed > PROBE_TTL /* safe to probe */
&& (POSITIVE_ENTRY_EXPIRED(addr_status, updated)
|| NEGATIVE_ENTRY_EXPIRED(addr_status, updated)))) {
! addr_status = DEL_RCPT_STAT_TODO;
! probed = 0;
! updated = 0;
! text = "Address verification in progress";
if (raw_data != 0 && var_verify_neg_cache == 0)
dict_cache_delete(verify_map, STR(addr));
}
--- 546,563 ----
|| (now - probed > PROBE_TTL /* safe to probe */
&& (POSITIVE_ENTRY_EXPIRED(addr_status, updated)
|| NEGATIVE_ENTRY_EXPIRED(addr_status, updated)))) {
! probe_permission_checked = 1;
! if ((probe_permitted = verify_probe_permit(STR(addr))) != 0) {
! addr_status = DEL_RCPT_STAT_TODO;
! probed = 0;
! updated = 0;
! text = "Address verification in progress";
! } else {
! addr_status = DEL_RCPT_STAT_DEFER;
! probed = 0;
! updated = 0;
! text = "Too many address verification requests";
! }
if (raw_data != 0 && var_verify_neg_cache == 0)
dict_cache_delete(verify_map, STR(addr));
}
***************
*** 494,500 ****

if (now - probed > PROBE_TTL
&& (POSITIVE_REFRESH_NEEDED(addr_status, updated)
! || NEGATIVE_REFRESH_NEEDED(addr_status, updated))) {
if (msg_verbose)
msg_info("PROBE %s status=%d probed=%ld updated=%ld",
STR(addr), addr_status, now, updated);
--- 592,600 ----

if (now - probed > PROBE_TTL
&& (POSITIVE_REFRESH_NEEDED(addr_status, updated)
! || NEGATIVE_REFRESH_NEEDED(addr_status, updated))
! && (probe_permission_checked ? probe_permitted
! : verify_probe_permit(STR(addr)))) {
if (msg_verbose)
msg_info("PROBE %s status=%d probed=%ld updated=%ld",
STR(addr), addr_status, now, updated);
***************
*** 503,509 ****
DEL_REQ_FLAG_MTA_VRFY,
(VSTRING *) 0,
verify_post_mail_action,
! (void *) 0);
if (updated != 0 || var_verify_neg_cache != 0) {
put_buf = vstring_alloc(10);
verify_make_entry(put_buf, addr_status, now, updated, text);
--- 603,609 ----
DEL_REQ_FLAG_MTA_VRFY,
(VSTRING *) 0,
verify_post_mail_action,
! mystrdup(STR(addr)));
if (updated != 0 || var_verify_neg_cache != 0) {
put_buf = vstring_alloc(10);
verify_make_entry(put_buf, addr_status, now, updated, text);
***************
*** 524,530 ****
/* verify_cache_validator - cache cleanup validator */

static int verify_cache_validator(const char *addr, const char *raw_data,
! char *context)
{
VSTRING *get_buf = (VSTRING *) context;
int addr_status;
--- 624,630 ----
/* verify_cache_validator - cache cleanup validator */

static int verify_cache_validator(const char *addr, const char *raw_data,
! char *context)
{
VSTRING *get_buf = (VSTRING *) context;
int addr_status;
***************
*** 709,714 ****
--- 809,818 ----
VAR_VERIFY_SENDER_TTL, DEF_VERIFY_SENDER_TTL, &var_verify_sender_ttl, 0, 0,
0,
};
+ static const CONFIG_INT_TABLE int_table[] = {
+ VAR_VRFY_PEND_LIMIT, DEF_VRFY_PEND_LIMIT, &var_vrfy_pend_limit, 1, 0,
+ 0,
+ };

/*
* Fingerprint executables and core dumps.
***************
*** 718,723 ****
--- 822,828 ----
multi_server_main(argc, argv, verify_service,
MAIL_SERVER_STR_TABLE, str_table,
MAIL_SERVER_TIME_TABLE, time_table,
+ MAIL_SERVER_INT_TABLE, int_table,
MAIL_SERVER_PRE_INIT, pre_jail_init,
MAIL_SERVER_POST_INIT, post_jail_init,
MAIL_SERVER_SOLITARY,
Wietse Venema
2014-04-27 00:42:15 UTC
This is the third of several patches to limit the number of address
verification requests in the Postfix mail queue.

This patch ensures that Postfix produces only one address verification
result when a virtual alias expands into multiple addresses. Multiple
results would invalidate the verify(8) server's estimate of the
number of pending address verification requests.
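
The decision this patch adds to the cleanup server can be summarized
with a small model. This is a sketch only: the flag and the type are
hypothetical stand-ins, not the identifiers used in the diff below.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_IS_PROBE 1               /* hypothetical "verify probe" flag */

typedef struct {
    size_t  forwarded;                  /* recipients written to the queue */
    size_t  reported;                   /* results reported immediately */
} expand_outcome;

/* Decide what happens to one recipient after virtual alias expansion. */
expand_outcome expand_for_queue(int flags, size_t expansion_count)
{
    expand_outcome out = {0, 0};

    assert(expansion_count >= 1);       /* expansion yields >= 1 address */
    if ((flags & SKETCH_IS_PROBE) != 0 && expansion_count > 1) {
        /* Probe for a 1-to-many alias: one result, nothing queued. */
        out.reported = 1;
    } else {
        /* Everything else is queued; results, if any, come later. */
        out.forwarded = expansion_count;
    }
    return (out);
}
```

A probe whose recipient expands to three addresses thus yields exactly
one result and no queued recipients, instead of three separate results.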

There are other reasons to avoid multiple results for one virtual
alias. Multiple results aren't useful because only the last of
many results is remembered for the virtual alias. That last result
is not representative of the content of the virtual alias.

In other words this patch addresses inefficiency and inaccuracy,
and therefore it belongs in mainstream Postfix regardless of any
problem with the number of pending verification requests.

Wietse

diff --exclude=man --exclude=html --exclude=README_FILES --exclude=INSTALL --exclude=.indent.pro --exclude=Makefile.in -r -cr /var/tmp/postfix-2.12-20140406-verify-2/src/cleanup/cleanup.h ./src/cleanup/cleanup.h
*** /var/tmp/postfix-2.12-20140406-verify-2/src/cleanup/cleanup.h Sat Nov 23 19:39:32 2013
--- ./src/cleanup/cleanup.h Sat Apr 26 17:44:56 2014
***************
*** 62,67 ****
--- 62,68 ----
char *return_receipt; /* return-receipt address */
char *errors_to; /* errors-to address */
int flags; /* processing options, status flags */
+ int tflags; /* User- or MTA-requested tracing */
int qmgr_opts; /* qmgr processing options */
int errs; /* any badness experienced */
int err_mask; /* allowed badness */
diff --exclude=man --exclude=html --exclude=README_FILES --exclude=INSTALL --exclude=.indent.pro --exclude=Makefile.in -r -cr /var/tmp/postfix-2.12-20140406-verify-2/src/cleanup/cleanup_envelope.c ./src/cleanup/cleanup_envelope.c
*** /var/tmp/postfix-2.12-20140406-verify-2/src/cleanup/cleanup_envelope.c Fri Jun 5 21:24:50 2009
--- ./src/cleanup/cleanup_envelope.c Sat Apr 26 17:46:26 2014
***************
*** 66,71 ****
--- 66,72 ----
#include <mail_proto.h>
#include <dsn_mask.h>
#include <rec_attr_map.h>
+ #include <deliver_request.h>

/* Application-specific. */

***************
*** 434,439 ****
--- 435,445 ----
cleanup_out(state, type, buf, len);
return;
}
+ if (mapped_type == REC_TYPE_TFLAGS) {
+ if (state->tflags == 0)
+ state->tflags = DEL_REQ_TRACE_FLAGS(atoi(mapped_buf));
+ return;
+ }
if (type == REC_TYPE_WARN) {
/* First instance wins. */
if ((state->flags & CLEANUP_FLAG_WARN_SEEN) == 0) {
diff --exclude=man --exclude=html --exclude=README_FILES --exclude=INSTALL --exclude=.indent.pro --exclude=Makefile.in -r -cr /var/tmp/postfix-2.12-20140406-verify-2/src/cleanup/cleanup_out_recipient.c ./src/cleanup/cleanup_out_recipient.c
*** /var/tmp/postfix-2.12-20140406-verify-2/src/cleanup/cleanup_out_recipient.c Sun May 20 12:29:53 2007
--- ./src/cleanup/cleanup_out_recipient.c Sat Apr 26 19:59:55 2014
***************
*** 76,81 ****
--- 76,82 ----
#include <recipient_list.h>
#include <dsn.h>
#include <trace.h>
+ #include <verify.h>
#include <mail_queue.h> /* cleanup_trace_path */
#include <mail_proto.h>
#include <msg_stats.h>
***************
*** 104,109 ****
--- 105,124 ----
}
}

+ /* cleanup_verify_append - update verify daemon */
+
+ static void cleanup_verify_append(CLEANUP_STATE *state, RECIPIENT *rcpt,
+ DSN *dsn, int verify_status)
+ {
+ MSG_STATS stats;
+
+ if (verify_append(state->queue_id, CLEANUP_MSG_STATS(&stats, state),
+ rcpt, "none", dsn, verify_status) != 0) {
+ msg_warn("%s: verify service update error", state->queue_id);
+ state->errs |= CLEANUP_STAT_WRITE;
+ }
+ }
+
/* cleanup_out_recipient - envelope recipient output filter */

void cleanup_out_recipient(CLEANUP_STATE *state,
***************
*** 193,198 ****
--- 208,216 ----
* recipient information, also ignore differences in DSN attributes. We
* do, however, keep the DSN attributes of the recipient that survives
* duplicate elimination.
+ *
+ * Avoid multiple address verification results per request, because that
+ * would invalidate the verify(8) server's pending request estimate.
*/
else {
RECIPIENT rcpt;
***************
*** 200,205 ****
--- 218,231 ----

argv = cleanup_map1n_internal(state, recip, cleanup_virt_alias_maps,
cleanup_ext_prop_mask & EXT_PROP_VIRTUAL);
+ if (argv->argc > 1 && (state->tflags & DEL_REQ_FLAG_MTA_VRFY)) {
+ (void) DSN_SIMPLE(&dsn, "2.0.0", "aliased to multiple recipients");
+ dsn.action = "deliverable";
+ RECIPIENT_ASSIGN(&rcpt, 0, dsn_orcpt, dsn_notify, orcpt, recip);
+ cleanup_verify_append(state, &rcpt, &dsn, DEL_RCPT_STAT_OK);
+ argv_free(argv);
+ return;
+ }
if ((dsn_notify & DSN_NOTIFY_SUCCESS)
&& (argv->argc > 1 || strcmp(recip, argv->argv[0]) != 0)) {
(void) DSN_SIMPLE(&dsn, "2.0.0", "alias expanded");
diff --exclude=man --exclude=html --exclude=README_FILES --exclude=INSTALL --exclude=.indent.pro --exclude=Makefile.in -r -cr /var/tmp/postfix-2.12-20140406-verify-2/src/cleanup/cleanup_state.c ./src/cleanup/cleanup_state.c
*** /var/tmp/postfix-2.12-20140406-verify-2/src/cleanup/cleanup_state.c Sat Nov 23 19:37:19 2013
--- ./src/cleanup/cleanup_state.c Sat Apr 26 17:45:10 2014
***************
*** 79,84 ****
--- 79,85 ----
state->return_receipt = 0;
state->errors_to = 0;
state->flags = 0;
+ state->tflags = 0;
state->qmgr_opts = 0;
state->errs = 0;
state->err_mask = 0;
diff --exclude=man --exclude=html --exclude=README_FILES --exclude=INSTALL --exclude=.indent.pro --exclude=Makefile.in -r -cr /var/tmp/postfix-2.12-20140406-verify-2/src/global/post_mail.c ./src/global/post_mail.c
*** /var/tmp/postfix-2.12-20140406-verify-2/src/global/post_mail.c Mon Feb 12 15:34:48 2007
--- ./src/global/post_mail.c Sat Apr 26 20:13:33 2014
***************
*** 51,56 ****
--- 51,61 ----
/*
/* int post_mail_fclose(stream)
/* VSTREAM *STREAM;
+ /*
+ /* void post_mail_fclose_async(stream, notify, context)
+ /* VSTREAM *stream;
+ /* void (*notify)(int status, void *context);
+ /* void *context;
/* DESCRIPTION
/* This module provides a convenient interface for the most
/* common case of sending one message to one recipient. It
***************
*** 88,93 ****
--- 93,103 ----
/*
/* post_mail_fclose() completes the posting of a message.
/*
+ /* post_mail_fclose_async() completes the posting of a message
+ /* and upon completion invokes the caller-specified notify
+ /* routine, with the cleanup status and caller-specified context
+ /* as arguments.
+ /*
/* Arguments:
/* .IP sender
/* The sender envelope address. It is up to the application
***************
*** 179,184 ****
--- 189,204 ----
VSTRING *queue_id;
} POST_MAIL_STATE;

+ /*
+ * Call-back state for asynchronous close requests.
+ */
+ typedef struct {
+ int status;
+ VSTREAM *stream;
+ POST_MAIL_FCLOSE_NOTIFY notify;
+ void *context;
+ } POST_MAIL_FCLOSE_STATE;
+
/* post_mail_init - initial negotiations */

static void post_mail_init(VSTREAM *stream, const char *sender,
***************
*** 189,201 ****
VSTRING *id = queue_id ? queue_id : vstring_alloc(100);
struct timeval now;
const char *date;
! int cleanup_flags =
! int_filt_flags(filter_class) | CLEANUP_FLAG_MASK_INTERNAL;

GETTIMEOFDAY(&now);
date = mail_date(now.tv_sec);

/*
* Negotiate with the cleanup service. Give up if we can't agree.
*/
if (attr_scan(stream, ATTR_FLAG_STRICT,
--- 209,227 ----
VSTRING *id = queue_id ? queue_id : vstring_alloc(100);
struct timeval now;
const char *date;
! int cleanup_flags =
! int_filt_flags(filter_class) | CLEANUP_FLAG_MASK_INTERNAL;

GETTIMEOFDAY(&now);
date = mail_date(now.tv_sec);

/*
+ * Don't flush buffers while sending the initial message records.
+ */
+ vstream_control(stream, VSTREAM_CTL_BUFSIZE, 2 * VSTREAM_BUFSIZE,
+ VSTREAM_CTL_END);
+
+ /*
* Negotiate with the cleanup service. Give up if we can't agree.
*/
if (attr_scan(stream, ATTR_FLAG_STRICT,
***************
*** 358,364 ****
*/
if (stream != 0) {
event_enable_read(vstream_fileno(stream), post_mail_open_event,
! (void *) state);
event_request_timer(post_mail_open_event, (void *) state,
var_daemon_timeout);
} else {
--- 384,390 ----
*/
if (stream != 0) {
event_enable_read(vstream_fileno(stream), post_mail_open_event,
! (void *) state);
event_request_timer(post_mail_open_event, (void *) state,
var_daemon_timeout);
} else {
***************
*** 420,422 ****
--- 446,512 ----
(void) vstream_fclose(cleanup);
return (status);
}
+
+ /* post_mail_fclose_event - event handler */
+
+ static void post_mail_fclose_event(int event, char *context)
+ {
+ POST_MAIL_FCLOSE_STATE *state = (POST_MAIL_FCLOSE_STATE *) context;
+ int status = state->status;
+
+ if (status == 0) {
+ if (vstream_ferror(state->stream) != 0
+ || attr_scan(state->stream, ATTR_FLAG_MISSING,
+ ATTR_TYPE_INT, MAIL_ATTR_STATUS, &status,
+ ATTR_TYPE_END) != 1)
+ status = CLEANUP_STAT_WRITE;
+ }
+ (void) vstream_fclose(state->stream);
+ state->notify(status, state->context);
+ myfree((char *) state);
+ }
+
+ /* post_mail_fclose_async - finish posting of message */
+
+ void post_mail_fclose_async(VSTREAM *stream,
+ void (*notify) (int status, void *context),
+ void *context)
+ {
+ POST_MAIL_FCLOSE_STATE *state;
+ int status = 0;
+
+
+ /*
+ * Send the message end marker only when there were no errors.
+ */
+ if (vstream_ferror(stream) != 0) {
+ status = CLEANUP_STAT_WRITE;
+ } else {
+ rec_fputs(stream, REC_TYPE_XTRA, "");
+ rec_fputs(stream, REC_TYPE_END, "");
+ if (vstream_fflush(stream))
+ status = CLEANUP_STAT_WRITE;
+ }
+
+ /*
+ * Bundle up the suspended state.
+ */
+ state = (POST_MAIL_FCLOSE_STATE *) mymalloc(sizeof(*state));
+ state->status = status;
+ state->stream = stream;
+ state->notify = notify;
+ state->context = context;
+
+ /*
+ * To keep interfaces as simple as possible we report all errors via the
+ * same interface as all successes.
+ */
+ if (status == 0) {
+ event_enable_read(vstream_fileno(stream), post_mail_fclose_event,
+ (void *) state);
+ event_request_timer(post_mail_fclose_event, (void *) state,
+ var_daemon_timeout);
+ } else {
+ event_request_timer(post_mail_fclose_event, (void *) state, 0);
+ }
+ }
diff --exclude=man --exclude=html --exclude=README_FILES --exclude=INSTALL --exclude=.indent.pro --exclude=Makefile.in -r -cr /var/tmp/postfix-2.12-20140406-verify-2/src/global/post_mail.h ./src/global/post_mail.h
*** /var/tmp/postfix-2.12-20140406-verify-2/src/global/post_mail.h Mon Jul 10 17:16:00 2006
--- ./src/global/post_mail.h Sat Apr 26 19:42:39 2014
***************
*** 34,39 ****
--- 34,41 ----
extern int post_mail_fputs(VSTREAM *, const char *);
extern int post_mail_buffer(VSTREAM *, const char *, int);
extern int post_mail_fclose(VSTREAM *);
+ typedef void (*POST_MAIL_FCLOSE_NOTIFY)(int, void *);
+ extern void post_mail_fclose_async(VSTREAM *, POST_MAIL_FCLOSE_NOTIFY, void *);

#define POST_MAIL_BUFFER(v, b) \
post_mail_buffer((v), vstring_str(b), VSTRING_LEN(b))
diff --exclude=man --exclude=html --exclude=README_FILES --exclude=INSTALL --exclude=.indent.pro --exclude=Makefile.in -r -cr /var/tmp/postfix-2.12-20140406-verify-2/src/global/rec_attr_map.c ./src/global/rec_attr_map.c
*** /var/tmp/postfix-2.12-20140406-verify-2/src/global/rec_attr_map.c Mon Mar 13 18:12:50 2006
--- ./src/global/rec_attr_map.c Sat Apr 26 17:36:16 2014
***************
*** 48,53 ****
--- 48,55 ----
return (REC_TYPE_DSN_RET);
} else if (strcmp(attr_name, MAIL_ATTR_CREATE_TIME) == 0) {
return (REC_TYPE_CTIME);
+ } else if (strcmp(attr_name, MAIL_ATTR_TRACE_FLAGS) == 0) {
+ return (REC_TYPE_TFLAGS);
} else {
return (0);
}
diff --exclude=man --exclude=html --exclude=README_FILES --exclude=INSTALL --exclude=.indent.pro --exclude=Makefile.in -r -cr /var/tmp/postfix-2.12-20140406-verify-2/src/global/rec_type.h ./src/global/rec_type.h
*** /var/tmp/postfix-2.12-20140406-verify-2/src/global/rec_type.h Fri Aug 24 15:43:51 2012
--- ./src/global/rec_type.h Sat Apr 26 17:39:05 2014
***************
*** 72,77 ****
--- 72,79 ----
#define REC_TYPE_DSN_ORCPT 'o' /* DSN orig rcpt address */
#define REC_TYPE_DSN_NOTIFY 'n' /* DSN notify flags */

+ #define REC_TYPE_TFLAGS 't' /* User- or MTA-requested tracing */
+
#define REC_TYPE_MILT_COUNT 'm'

#define REC_TYPE_END 'E' /* terminator, required */
diff --exclude=man --exclude=html --exclude=README_FILES --exclude=INSTALL --exclude=.indent.pro --exclude=Makefile.in -r -cr /var/tmp/postfix-2.12-20140406-verify-2/src/verify/verify.c ./src/verify/verify.c
*** /var/tmp/postfix-2.12-20140406-verify-2/src/verify/verify.c Fri Apr 25 19:48:07 2014
--- ./src/verify/verify.c Sat Apr 26 19:46:47 2014
***************
*** 477,482 ****
--- 477,493 ----
vstring_free(text);
}

+ /* verify_post_mail_fclose_action - callback */
+
+ static void verify_post_mail_fclose_action(int status, void *context)
+ {
+ char *addr = context;
+
+ if (status == 0)
+ verify_probe_queued(addr);
+ myfree(addr);
+ }
+
/* verify_post_mail_action - callback */

static void verify_post_mail_action(VSTREAM *stream, void *context)
***************
*** 487,497 ****
* Probe messages need no body content, because they are never delivered,
* deferred, or bounced.
*/
! if (stream != 0) {
! verify_probe_queued(addr);
myfree(addr);
- post_mail_fclose(stream);
- }
}

/* verify_query_service - query address status */
--- 498,507 ----
* Probe messages need no body content, because they are never delivered,
* deferred, or bounced.
*/
! if (stream != 0)
! post_mail_fclose_async(stream, verify_post_mail_fclose_action, addr);
! else
myfree(addr);
}

/* verify_query_service - query address status */
Wietse Venema
2014-04-28 00:30:27 UTC
This is the fourth of several patches to limit the number of address
verification requests in the Postfix mail queue.

This patch corrects mistakes in the third patch.

- Dangling call-backs in post_mail_fclose_async() event handler.

- The queue manager no longer recognized address verification
requests.

- The verify daemon now flags a request as pending before we have
the final cleanup submission status, so that a report for a
1-to-many virtual alias will arrive when the pending count is
non-zero.
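
The effect of the last item can be shown by replaying the events in
both orders. This is a sketch with hypothetical names; it assumes, as
in the patches, that a report is ignored while the counter is zero.

```c
#include <assert.h>

/* Hypothetical model of the pending-request counter. */
typedef struct {
    unsigned pending;
} probe_state;

void    probe_count(probe_state *s)
{
    s->pending += 1;
}

/* Reports that arrive while the counter is zero are ignored. */
void    probe_uncount(probe_state *s)
{
    if (s->pending > 0)
        s->pending -= 1;
}

/*
 * Replay one probe for a 1-to-many alias: submission starts, the cleanup
 * server reports a result, then the submission completes with
 * close_status (0 means success). Returns the final counter value.
 */
unsigned replay(int count_before_close, int close_status)
{
    probe_state s = {0};

    if (count_before_close)
        probe_count(&s);                /* ordering used by this patch */
    probe_uncount(&s);                  /* early report from cleanup */
    if (count_before_close) {
        if (close_status != 0)
            probe_uncount(&s);          /* roll back a failed submission */
    } else {
        if (close_status == 0)
            probe_count(&s);            /* ordering used by third patch */
    }
    return (s.pending);
}
```

Counting only after a successful close leaves a stale count of 1,
because the early report was ignored. Counting before the close ends
at 0 whether or not the submission succeeds.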

Wietse

diff -cr /var/tmp/postfix-2.12-20140406-verify-3/src/cleanup/cleanup_envelope.c ./src/cleanup/cleanup_envelope.c
*** /var/tmp/postfix-2.12-20140406-verify-3/src/cleanup/cleanup_envelope.c Sat Apr 26 17:46:26 2014
--- ./src/cleanup/cleanup_envelope.c Sun Apr 27 20:11:59 2014
***************
*** 435,445 ****
cleanup_out(state, type, buf, len);
return;
}
- if (mapped_type == REC_TYPE_TFLAGS) {
- if (state->tflags == 0)
- state->tflags = DEL_REQ_TRACE_FLAGS(atoi(mapped_buf));
- return;
- }
if (type == REC_TYPE_WARN) {
/* First instance wins. */
if ((state->flags & CLEANUP_FLAG_WARN_SEEN) == 0) {
--- 435,440 ----
***************
*** 485,490 ****
--- 480,495 ----
return;
}
}
+ if (strcmp(attr_name, MAIL_ATTR_TRACE_FLAGS) == 0) {
+ if (!alldig(attr_value)) {
+ msg_warn("%s: message rejected: bad TFLAG record <%.200s>",
+ state->queue_id, buf);
+ state->errs |= CLEANUP_STAT_BAD;
+ return;
+ }
+ if (state->tflags == 0)
+ state->tflags = DEL_REQ_TRACE_FLAGS(atoi(attr_value));
+ }
nvtable_update(state->attr, attr_name, attr_value);
cleanup_out(state, type, buf, len);
return;
diff -cr /var/tmp/postfix-2.12-20140406-verify-3/src/global/post_mail.c ./src/global/post_mail.c
*** /var/tmp/postfix-2.12-20140406-verify-3/src/global/post_mail.c Sat Apr 26 20:13:33 2014
--- ./src/global/post_mail.c Sun Apr 27 20:23:01 2014
***************
*** 454,468 ****
POST_MAIL_FCLOSE_STATE *state = (POST_MAIL_FCLOSE_STATE *) context;
int status = state->status;

! if (status == 0) {
! if (vstream_ferror(state->stream) != 0
! || attr_scan(state->stream, ATTR_FLAG_MISSING,
! ATTR_TYPE_INT, MAIL_ATTR_STATUS, &status,
! ATTR_TYPE_END) != 1)
! status = CLEANUP_STAT_WRITE;
}
! (void) vstream_fclose(state->stream);
state->notify(status, state->context);
myfree((char *) state);
}

--- 454,495 ----
POST_MAIL_FCLOSE_STATE *state = (POST_MAIL_FCLOSE_STATE *) context;
int status = state->status;

! switch (event) {
!
! /*
! * Final server reply. Pick up the completion status.
! */
! case EVENT_READ:
! if (status == 0) {
! if (vstream_ferror(state->stream) != 0
! || attr_scan(state->stream, ATTR_FLAG_MISSING,
! ATTR_TYPE_INT, MAIL_ATTR_STATUS, &status,
! ATTR_TYPE_END) != 1)
! status = CLEANUP_STAT_WRITE;
! }
! break;
!
! /*
! * No response or error.
! */
! default:
! msg_warn("error talking to service: %s", var_cleanup_service);
! status = CLEANUP_STAT_WRITE;
! break;
}
!
! /*
! * Stop the watchdog timer, and disable further read events that end up
! * calling this function.
! */
! event_cancel_timer(post_mail_fclose_event, context);
! event_disable_readwrite(vstream_fileno(state->stream));
!
! /*
! * Notify the requestor and clean up.
! */
state->notify(status, state->context);
+ (void) vstream_fclose(state->stream);
myfree((char *) state);
}

diff -cr /var/tmp/postfix-2.12-20140406-verify-3/src/global/rec_attr_map.c ./src/global/rec_attr_map.c
*** /var/tmp/postfix-2.12-20140406-verify-3/src/global/rec_attr_map.c Sat Apr 26 17:36:16 2014
--- ./src/global/rec_attr_map.c Sun Apr 27 20:12:43 2014
***************
*** 48,55 ****
return (REC_TYPE_DSN_RET);
} else if (strcmp(attr_name, MAIL_ATTR_CREATE_TIME) == 0) {
return (REC_TYPE_CTIME);
- } else if (strcmp(attr_name, MAIL_ATTR_TRACE_FLAGS) == 0) {
- return (REC_TYPE_TFLAGS);
} else {
return (0);
}
--- 48,53 ----
diff -cr /var/tmp/postfix-2.12-20140406-verify-3/src/verify/verify.c ./src/verify/verify.c
*** /var/tmp/postfix-2.12-20140406-verify-3/src/verify/verify.c Sat Apr 26 19:46:47 2014
--- ./src/verify/verify.c Sun Apr 27 20:09:50 2014
***************
*** 483,490 ****
{
char *addr = context;

! if (status == 0)
! verify_probe_queued(addr);
myfree(addr);
}

--- 483,493 ----
{
char *addr = context;

! /*
! * In case of trouble, count this request as not pending.
! */
! if (status != 0)
! verify_probe_done(addr);
myfree(addr);
}

***************
*** 497,506 ****
/*
* Probe messages need no body content, because they are never delivered,
* deferred, or bounced.
*/
! if (stream != 0)
post_mail_fclose_async(stream, verify_post_mail_fclose_action, addr);
! else
myfree(addr);
}

--- 500,513 ----
/*
* Probe messages need no body content, because they are never delivered,
* deferred, or bounced.
+ *
+ * Count this request as pending, so that we have non-zero pending count
+ * when the cleanup server reports a result for a 1-N virtual alias.
*/
! if (stream != 0) {
! verify_probe_queued(addr);
post_mail_fclose_async(stream, verify_post_mail_fclose_action, addr);
! } else
myfree(addr);
}
Mika Ilmaranta
2014-04-28 20:54:20 UTC
Post by Wietse Venema
Post by Mika Ilmaranta
2) I don't see how this would be any different for sender verify probes.
The number of sender domains is larger than your pool of recipient
domains, and therefore, tracking the sender domains would require
more memory.
This is only a difference in definitions of infinity. Sender verify is
infinity * n and recipient verify is infinity * x, where n > x, so both
are really infinite.
Wietse Venema
2014-04-29 11:26:45 UTC
Post by Mika Ilmaranta
Post by Wietse Venema
Post by Mika Ilmaranta
2) I don't see how this would be any different for sender verify probes.
The number of sender domains is larger than your pool of recipient
domains, and therefore, tracking the sender domains would require
more memory.
This is only a difference in definitions of infinity. Sender verify is
infinity * n and recipient verify is infinity * x, where n > x, so both
are really infinite.
The discussion is about tracking per-DOMAIN counters not per-ADDRESS.
The number of your customer DOMAINS (receivers) is "finite" compared
to the rest of the Internet DOMAINS (senders).
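
One way to keep per-domain counters bounded regardless of how many
domains show up is a fixed-size table with a shared overflow counter.
The sketch below is an assumption about how that could look, not code
from any of the patches; all names are hypothetical, the table is
deliberately tiny, and domain matching is case-sensitive for brevity.

```c
#include <assert.h>
#include <string.h>

#define DOM_SLOTS    4                  /* deliberately tiny for the example */
#define DOM_NAME_MAX 64

typedef struct {
    char    name[DOM_NAME_MAX];         /* "" marks a free slot */
    unsigned count;                     /* pending requests for this domain */
} dom_slot;

typedef struct {
    dom_slot slots[DOM_SLOTS];
    unsigned overflow;                  /* shared by domains that did not fit */
} dom_table;

/* Domain part of an address; the whole string if there is no '@'. */
const char *dom_of(const char *addr)
{
    const char *at = strrchr(addr, '@');

    return (at != 0 ? at + 1 : addr);
}

/* Find the slot for a domain; optionally claim a free one. */
dom_slot *dom_find(dom_table *t, const char *domain, int create)
{
    dom_slot *free_slot = 0;
    int     i;

    for (i = 0; i < DOM_SLOTS; i++) {
        if (t->slots[i].name[0] == 0) {
            if (free_slot == 0)
                free_slot = &t->slots[i];
        } else if (strcmp(t->slots[i].name, domain) == 0) {
            return (&t->slots[i]);
        }
    }
    if (create && free_slot != 0 && domain[0] != 0
        && strlen(domain) < DOM_NAME_MAX) {
        strcpy(free_slot->name, domain);
        free_slot->count = 0;
        return (free_slot);
    }
    return (0);
}

/* Count one more pending request; returns the new counter value. */
unsigned dom_add(dom_table *t, const char *addr)
{
    dom_slot *slot = dom_find(t, dom_of(addr), 1);

    return (slot != 0 ? ++slot->count : ++t->overflow);
}

/* Count one request as done; a slot that drops to zero is freed. */
unsigned dom_done(dom_table *t, const char *addr)
{
    dom_slot *slot = dom_find(t, dom_of(addr), 0);

    if (slot == 0)
        return (t->overflow > 0 ? --t->overflow : 0);
    if (--slot->count == 0)
        slot->name[0] = 0;
    return (slot->count);
}
```

Memory use is fixed by DOM_SLOTS no matter how many sender or receiver
domains appear. The cost is that domains landing in the overflow
counter are limited as a group rather than individually.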

Anyway, after four days of testing and development I recommend that
you adopt my first patch for now. Maintaining robust counters for
all pending verify requests is not simple and will take more time.

Wietse

Kim B. Heino
2014-04-23 07:22:08 UTC
Post by Viktor Dukhovni
Customers that tarpit probes are not doing anyone a favour, perhaps
cluestick can be applied.
As soon as we understood that they had the tarpit enabled, we asked them
to disable it. That helped, but the directory harvesting attack was
still able to DDoS us.
Kim B. Heino
2014-04-23 07:06:53 UTC
Post by Wietse Venema
- The problem with external counters is that they aren't reset when
the verify daemon is restarted.
One of the reasons for using the existing verify map with "invalid"
values was that those counters are purged every 12h. Sure, if you
restart a lot, the limit can still be exceeded.
Post by Wietse Venema
- The problem with per-domain in-memory counters is that they can
use up a lot of memory especially with sender domains.
Worst case memory usage is doubled: one value for recipient (as now)
and one for domain (new).

Our system shows a 0.4% increase.
Wietse Venema
2014-04-23 11:09:52 UTC
Post by Kim B. Heino
Post by Wietse Venema
- The problem with per-domain in-memory counters is that they can
use up a lot of memory especially with sender domains.
Worst case memory usage is doubled: one value for recipient (as now)
and one for domain (new).
You are only doing recipients. I have to support SENDER domains too.

Wietse