author | Andreas Baumann <mail@andreasbaumann.cc> | 2015-09-22 21:21:01 +0200 |
---|---|---|
committer | Andreas Baumann <mail@andreasbaumann.cc> | 2015-09-22 21:21:01 +0200 |
commit | 63a0b1d48af39922d524741be8eaff7bb24685d5 (patch) | |
tree | 8298ba496cb6953533554c7adeedc6624383a23d /docs | |
parent | d5295e9123ad96fbea5e1e0e3a51d4450d0fea96 (diff) | |
download | OpenBSD-firewall-63a0b1d48af39922d524741be8eaff7bb24685d5.tar.gz OpenBSD-firewall-63a0b1d48af39922d524741be8eaff7bb24685d5.tar.bz2 |
added an article on pf limits in productive environment
Diffstat (limited to 'docs')
-rw-r--r-- | docs/pf-limits-in-openbsd.html | 201 |
1 files changed, 201 insertions, 0 deletions
diff --git a/docs/pf-limits-in-openbsd.html b/docs/pf-limits-in-openbsd.html
new file mode 100644
index 0000000..07684c9
--- /dev/null
+++ b/docs/pf-limits-in-openbsd.html
@@ -0,0 +1,201 @@
+
+   [1]<- Previous [2]Home [3]Next ->
+
+PF Limits in OpenBSD
+
+   This article documents one of several insidious little gotchas I've
+   encountered using OpenBSD systems in a core-router/firewall capacity in
+   lieu of Cisco 2851 or Juniper j4350 class hardware. Specifically,
+   various hard memory limits built into PF, which, when encountered,
+   cause PF to stop accepting new connections.
+
+   Incidentally, [4]here is the story of how I wound up replacing the
+   preponderant quantity of my networking gear with OpenBSD and saved
+   metric-oodles of coinage.
+
+   Anyway, the upshot is that if you use OpenBSD with PF in a production
+   environment and you aren't aware of PF's memory limits (especially the
+   state-related memory limits), you have a ticking time-bomb on your
+   network. Just FYI.
+
+   I'd been playing with OpenBSD for fun, in low-budget side projects, and
+   non-prod environments for years before that fateful day that I ran into
+   the state-table limit like a brick wall.
+
+   It was shortly after I'd replaced the Cisco-based core routing
+   infrastructure of our headquarters building with OpenBSD. It presented
+   as a sort of network "glitch". You know, the unexplainable little
+   connectivity loss that only affects one user. Probably his cable, or
+   wall socket. But then it was two or three users, and then it was a user
+   whose connectivity was working fine, except that he couldn't create new
+   ssh connections suddenly (wha?). It was gone as quickly as it appeared,
+   and never seemed to adhere to any sort of consistent set of symptoms.
+   It was quite maddening.
+
+   At some point I noticed that if I was quick enough, I could catch a "no
+   route to host" error message from PF on the console of the core
+   routers, and that's when I really started looking at them in earnest.
+
+   It turns out, as I've already said, the kernel keeps memory set aside
+   for PF to do things like create state tables and state table entries.
+   In my case I was hitting the limit on the total number of states PF was
+   allowed to track at once. This meant that new connections would fail
+   with "no route to host" until some other state expired and made room for
+   the new one. This looked downright weird when troubleshooting from the
+   outside because protocols like HTTP (stateless, so a failed request is
+   simply retried) would still work pretty well, while others like SSH
+   (which requires a constant connection) were more likely to have problems.
+
+   You can see the default sizes of these limits using pfctl -sm:
+# pfctl -sm
+states        hard limit    10000
+src-nodes     hard limit    10000
+frags         hard limit     5000
+tables        hard limit     1000
+table-entries hard limit   200000
+
+   These are pretty sane defaults for most people who are running OpenBSD
+   routers, which is to say, nerds who have wedged it onto their
+   [5]Soekris board or the WRT54 they found at the second-hand store, or
+   the 8086 they found under the sink in their dad's house.
+
+   If you're running production routers on real hardware you're going to
+   want to raise those a bit. And by `bit' I mean like two orders of
+   magnitude. Do this with a line in your pf.conf that looks something
+   like this:
+
+   set limit { states 1000000, frags 1000000, src-nodes 100000, \
+               tables 1000000, table-entries 1000000 }
+
+   You can check to see if you've ever hit one of these limits with pfctl
+   -si, which displays the values for a whole bunch of counters tracked by
+   PF:
+[dave@a][~]--> sudo pfctl -si
+Status: Enabled for 686 days 01:20:03           Debug: err
+
+State Table                          Total             Rate
+  current entries                    39401
+  searches                    587674569722         9914.3/s
+  inserts                      23981800145          404.6/s
+  removals                     23981760744          404.6/s
+Counters
+  match                        24166482278          407.7/s
+  bad-offset                             0            0.0/s
+  fragment                               0            0.0/s
+  short                                  0            0.0/s
+  normalize                           1282            0.0/s
+  memory                                 0            0.0/s
+  bad-timestamp                          0            0.0/s
+  congestion                           204            0.0/s
+  ip-option                         433656            0.0/s
+  proto-cksum                            0            0.0/s
+  state-mismatch                    135709            0.0/s
+  state-insert                           0            0.0/s
+  state-limit                            0            0.0/s
+  src-limit                              0            0.0/s
+  synproxy                               0            0.0/s
+
+   If you have RRDTool installed, you can use this shell script to push
+   some of these values into an RRD (or repurpose it to feed collectd or
+   gmond or whatever):
+#!/usr/local/bin/bash
+
+gawk="/usr/local/bin/gawk"
+pfctl="/sbin/pfctl"
+rrdtool="/usr/local/bin/rrdtool"
+RRDHOME='/home/pcap/rrd'
+
+pfctl_info() {
+    local output=$($pfctl -si 2>&1)
+    local temp=$(echo "$output" | $gawk '
+        BEGIN {BytesIn=0; BytesOut=0; PktsInPass=0; PktsInBlock=0; \
+            PktsOutPass=0; PktsOutBlock=0; States=0; StateSearchs=0; \
+            StateInserts=0; StateRemovals=0}
+        /Bytes In/ { BytesIn = $3 }
+        /Bytes Out/ { BytesOut = $3 }
+        /Packets In/ { getline;PktsInPass = $2 }
+        /Passed/ { getline;PktsInBlock = $2 }
+        /Packets Out/ { getline;PktsOutPass = $2 }
+        /Passed/ { getline;PktsOutBlock = $2 }
+        /current entries/ { States = $3 }
+        /searches/ { StateSearchs = $2 }
+        /inserts/ { StateInserts = $2 }
+        /removals/ { StateRemovals = $2 }
+        END {print BytesIn ":" BytesOut ":" PktsInPass ":" \
+            PktsInBlock ":" PktsOutPass ":" PktsOutBlock ":" \
+            States ":" StateSearchs ":" StateInserts ":" StateRemovals}
+        ')
+    RETURN_VALUE=$temp
+}
+
+### collect the data
+pfctl_info
+
+### update the database
+$rrdtool update ${RRDHOME}/pf_stats_db.rrd \
+    --template BytesIn:BytesOut:PktsInPass:PktsInBlock:PktsOutPass:PktsOutBlock:States:StateSearchs:StateInserts:StateRemovals \
+    N:$RETURN_VALUE
+
+   And then use the following to draw graphs from it:
+#!/bin/sh
+
+RRDHOME='/home/pcap/rrd'
+cd ${RRDHOME}
+
+#####
+######## pf state rate graph
+/usr/local/bin/rrdtool graph pf_stats_states.png \
+-w 785 -h 151 -a PNG \
+--slope-mode \
+--start -86400 --end now \
+--font DEFAULT:7: \
+--title "pf state rate" \
+--watermark "`date`" \
+--vertical-label "states/sec" \
+--right-axis-label "searches/sec" \
+--right-axis 100:0 \
+--x-grid MINUTE:10:HOUR:1:MINUTE:120:0:%R \
+--alt-y-grid --rigid \
+DEF:StateInserts=pf_stats_db.rrd:StateInserts:MAX \
+DEF:StateRemovals=pf_stats_db.rrd:StateRemovals:MAX \
+DEF:StateSearchs=pf_stats_db.rrd:StateSearchs:MAX \
+CDEF:scaled_StateSearchs=StateSearchs,0.01,* \
+DEF:States=pf_stats_db.rrd:States:MAX \
+CDEF:scaled_States=States,0.01,* \
+AREA:StateInserts#33CC33:"inserts" \
+GPRINT:StateInserts:LAST:"Cur\: %5.2lf" \
+GPRINT:StateInserts:AVERAGE:"Avg\: %5.2lf" \
+GPRINT:StateInserts:MAX:"Max\: %5.2lf" \
+GPRINT:StateInserts:MIN:"Min\: %5.2lf\t\t" \
+LINE1:scaled_StateSearchs#FF0000:"searches" \
+GPRINT:StateSearchs:LAST:"Cur\: %5.2lf" \
+GPRINT:StateSearchs:AVERAGE:"Avg\: %5.2lf" \
+GPRINT:StateSearchs:MAX:"Max\: %5.2lf" \
+GPRINT:StateSearchs:MIN:"Min\: %5.2lf\n" \
+LINE1:StateRemovals#0000CC:"removal" \
+GPRINT:StateRemovals:LAST:"Cur\: %5.2lf" \
+GPRINT:StateRemovals:AVERAGE:"Avg\: %5.2lf" \
+GPRINT:StateRemovals:MAX:"Max\: %5.2lf" \
+GPRINT:StateRemovals:MIN:"Min\: %5.2lf\t\t" \
+LINE1:scaled_States#00F0F0:"States" \
+GPRINT:States:LAST:"Cur\: %5.2lf" \
+GPRINT:States:AVERAGE:"Avg\: %5.2lf" \
+GPRINT:States:MAX:"Max\: %5.2lf" \
+GPRINT:States:MIN:"Min\: %5.2lf\n"
+
+   which will yield you a pretty, two-axis graph like the one below which
+   should help you avoid limits in the future.
+
+   PF State Graph
+   [6]<- Previous [7]Home [8]Next ->
+
+References
+
+   1. http://www.skeptech.org/blog/2013/01/13/unscrewed-a-story-about-openbsd/
+   2. http://www.skeptech.org/
+   3. http://www.skeptech.org/blog/2013/03/28/unhappy-bean-factory/
+   4. http://www.skeptech.org/blog/2013/01/13/unscrewed-a-story-about-openbsd/
+   5. http://soekris.com/
+   6. http://www.skeptech.org/blog/2013/01/13/unscrewed-a-story-about-openbsd/
+   7. http://www.skeptech.org/
+   8. http://www.skeptech.org/blog/2013/03/28/unhappy-bean-factory/
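The article's advice to check `pfctl -si` for signs of having hit a limit can be automated. What follows is a minimal sketch and not part of the commit: the function name `pf_limit_hits` is hypothetical, and it assumes that the `memory`, `state-limit` and `src-limit` rows of the Counters section (as they appear in the sample output in the article) are the ones worth watching. The article itself does not say which counters matter.

```shell
#!/bin/sh
# Sketch: read `pfctl -si` output on stdin and print any watched counter whose
# total is nonzero. Exit status is 1 if at least one was nonzero, 0 otherwise.
# ASSUMPTION: memory, state-limit and src-limit are the relevant counters; the
# names come from the sample output above, the interpretation does not.
pf_limit_hits() {
    awk '
        # Counter rows look like: "  state-limit    0    0.0/s"
        $1 == "memory" || $1 == "state-limit" || $1 == "src-limit" {
            if ($2 + 0 > 0) {        # force numeric comparison on the total
                print $1 ": " $2
                hits++
            }
        }
        END { exit (hits > 0 ? 1 : 0) }
    '
}
```

From cron it could be run as something like `pfctl -si | pf_limit_hits || echo "PF dropped packets at a limit"`. These totals are cumulative since PF was enabled, so a nonzero value only says a limit was hit at some point, not that it is being hit now.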