Monthly Archives: December 2012

Lesson Learned – Enable core dumps

School of Hard Knocks.

Put this into your install procedures. Enable core dumps so that when your firewall starts rebooting magically at 1am, you don’t have to go into panic mode trying to figure out how to get a core dump for analysis. If you do this now you will save about 24 hours of your life (and sleep).

No impact on reliability or performance. Works on SPLAT and GAIA:

========== OS level core dump ===========================

#add crashkernel=64M@16M to grub.conf

#chkconfig --level 3 kdump on

sk44186 – Enable generation of kernel core dump file on Check Point Security Gateway on SecurePlatform / Gaia running R7x
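If you want to sanity-check the grub edit before the 1am reboot shows up, here is a minimal Python sketch that pulls the crashkernel= value out of a kernel line. The sample line is made up for illustration; on a real box you would feed it the kernel lines from your own grub.conf.

```python
def crashkernel_setting(kernel_line):
    """Return the crashkernel= value on a grub kernel line, or None if absent."""
    for token in kernel_line.split():
        if token.startswith("crashkernel="):
            return token.split("=", 1)[1]
    return None


# Hypothetical kernel line, for illustration only.
sample = "kernel /vmlinuz ro root=/dev/sda2 crashkernel=64M@16M"
print(crashkernel_setting(sample))
```

A return value of None means the reservation is missing and kdump will have nowhere to load its capture kernel.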

============ USER level core dump ===================

To get a USER level core dump check out this command:

# um_core enable

or

# ulimit -c unlimited

sk18307 – Enabling the generation of User Mode core dump files on SecurePlatform OS

This should be enabled by default but it is not.
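For a quick look at what the core limit actually is, here is a small Python sketch using the standard resource module. It only reports the limit inherited by the process that runs it (the same number ‘ulimit -c’ shows for a shell); what um_core changes under the hood is not something this sketch knows about.

```python
import resource


def describe_core_limit(soft_limit):
    """Translate a soft RLIMIT_CORE value into plain English."""
    if soft_limit == resource.RLIM_INFINITY:
        return "unlimited"
    if soft_limit == 0:
        return "disabled"
    return "capped at %d bytes" % soft_limit


soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
print("core dumps for this process:", describe_core_limit(soft))
```

If it prints "disabled", a crashing process started from that environment leaves no core file behind.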

dreez

I always wanted to know this….maximum memory used by SPLAT and GAIA

I’ve been asking this every time I see a Check Point shirt on a body. Actually asked T3 support yesterday and they didn’t know.

SPLAT:

YES: they use PAE, in some cases up to 16GB for firewalls and 32GB for management

I’m coming closer to an answer.

https://supportcenter.checkpoint.com/supportcenter/portal?eventSubmit_doGoviewsolutiondetails=&solutionid=sk22343&js_peid=P-114a7bc3b09-10006&partition=General&product=SecurePlatform%22

GAIA:

https://supportcenter.checkpoint.com/supportcenter/portal?eventSubmit_doGoviewsolutiondetails=&solutionid=sk71001

We took a box and jammed 144GB into it. Loaded GAIA and booted.

BOMBED!!!

We pulled memory until we were down to 64GB.

Reboot. Worked.

Switched GAIA into 64-bit mode with the ‘set edition default 64-bit’ command (remember: ‘64-bit’, not ‘64’). Jammed the memory back up to 144GB.

‘top’, ‘vmstat’, ‘free’ all recognize all the memory. Now whether GAIA really uses it……The hunt continues.
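One plausible reason a 32-bit install chokes on a box this size (my assumption, nothing Check Point told me) is the classic PAE ceiling: 36-bit physical addresses, which is 2^36 bytes or 64 GiB. The Python sketch below does that arithmetic and pulls MemTotal out of /proc/meminfo-style text so you can compare what the kernel sees against what you installed. The sample text is invented.

```python
PAE_ADDRESS_BITS = 36  # classic x86 PAE physical address width (assumed relevant here)


def pae_ceiling_gib(bits=PAE_ADDRESS_BITS):
    """Largest physical memory addressable with the given address width, in GiB."""
    return 2 ** bits // 2 ** 30


def mem_total_gib(meminfo_text):
    """Pull MemTotal (reported in kB) out of /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])
            return kib / 2 ** 20
    raise ValueError("no MemTotal line found")


# Made-up sample in /proc/meminfo format, roughly a 144GB box.
sample = "MemTotal:       150994944 kB\nMemFree:        140000000 kB\n"
print(pae_ceiling_gib())        # 64
print(mem_total_gib(sample))    # 144.0
```

This only tells you what the kernel recognizes, which is the same thing ‘free’ reports. Whether the firewall processes actually use it is a separate question.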

UPDATE: R76 Release notes: MDM supports up to 128GB

 

Living large

dreez

SmartLog Quickstart video

I made this cheesy video for learning SmartLog.

Enjoy

dreez

Booting UTM off USB

SmartLog deletes old logs now! R75.45

Just FYI, I loaded up a SmartLog with 1.3 billion logs and filled the disk and guess what? It elegantly deletes old entries to make way for the new files.

Congrats!

Oh yeah, SmartLog just rocks! RSA Envision. Dead…Dead…Dead…

Look out for some problems I am trying to figure out:

1) Counters are wrong: The counters on the left side are wrong
2) Exporting will lose seconds and keep only minutes
3) You can only export 10,000 entries at a time

That’s what I can remember for now. Just love it! Thanks Dudi.

dreez

 

SecureXL – How it doesn’t work

[FYI….this is a work in progress, just stream of consciousness. I’ll remove this label when I think I’ve got it straight.]

[UPDATE: 2/14/2014] I have spent several months putting together how SecureXL/CoreXL works, creating labs, talking to developers and people much smarter than me. I am putting together a class on the topic. I have to update this blog to reflect reality when I get time. It’s probably about 75% correct…..

 

——————————————————

Ya know, I wish I could explain how SecureXL works but every time I think I get it…I’m wrong. I’ve had several different CP people try to explain it to me, and parts are wrong. So I’m not sure what to tell you. This will be my worst blog ever, but my goal is to spend my golden years trying to figure out how SecureXL works and fix this blog as I go.

You can stop reading now….

I was sent this document that does a great job of explaining SecureXL. But whether it’s accurate, who knows.

SecureXL How it works

For example: there is a statement in the SecureXL document that says any non-TCP/UDP/GRE rules will turn off all SecureXL. Well so much for ICMP I guess. But I think it’s wrong…. Acceleration is still turned on.

The basics you should remember.

SecureXL does two things…..

1) Accept Templates: Some protocols like HTTP 1.0 create MULTIPLE concurrent connections from a client to a single port on a single server, port 80. SecureXL increases the rate of acceptance by caching these connections in a “template” connection table. After the first connection, any future ‘similar’ connections to the common port from that client are NOT forwarded to the firewall kernel, but instead instantly accepted and forwarded.

2) Connection Accelerated: SecureXL increases throughput on connections that have already been set up and inserted into the connection table between a client and server. The successive packets in a connection will not go through the kernel but will be handled by interrupt handlers (watch your SI in the ‘top’ command).

The following diagram was done by 51sec …..

12-17-2012 10-12-54 PM

Unfortunately, it sometimes feels like SecureXL only works if there is a full moon and NAT, VPN, IPS, and 100 other services are not enabled. If you do a fwaccel stat you will start to get a feel for what is enabled…but I’m still leery. Also remember that fwaccel stats (with an ‘s’) is a totally different command.

Before you read the documentation you have to understand two things:

1) Accept templates/Connection templates/Connection accelerated: the term is overloaded. They all refer to grouping traffic (mainly HTTP) into templates in order to accelerate similar connections from 1 source PC. It is a ‘template’ because it intends to accelerate ‘similar’ traffic, not an exact connection.

2) Throughput acceleration: Connection table acceleration. This accelerates the second and subsequent packets of a connection. These packets have to perfectly match the connection table.

If you don’t know the difference between these two, the documentation will turn you around…

Below you can see the two commands, stat and stats. The stat command tells you whether connection templates (#1 above) are turned on and at which rule connection templates (not acceleration) get disabled. In this case rule #1 (by the red arrow) disables the creation of connection templates. I wish I could tell you what the magic is, but I can’t; I get different versions of reality. I know ICMP has an impact, because SecureXL was mostly designed for HTTP connections grabbing 30 different pages from one client. The numbers in green show how many of all connections were created from templates and how many connections are going through the firewall kernel (F2F) for processing. You can see here that because rule #1 disables templates, 0 connections are created from templates.

The second part below, in red, shows throughput acceleration (#2 above). “accel packets” are packets that are not forwarded to the firewall kernel, and F2F packets are those that are forwarded to the firewall kernel for rule processing. These are NOT the same as Accept Templates, which are about accelerating HTTP requests from a single PC. This is only about accelerating the second and subsequent packets of a single connection.

conns

Just a little more detail here. These are the ‘fwaccel stat’ and ‘fwaccel stats -s’ commands. You can see how the accelerated % is calculated: take the current total connections (TCP and non-TCP), subtract the F2F connections (slow path, the ones that got forwarded through the kernel and not handled by interrupt processing), and divide by the total.
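The arithmetic described above, as a minimal Python sketch. The counter values are made up; plug in whatever your own box reports.

```python
def accelerated_percent(total, f2f):
    """Share of traffic that stayed off the slow path, as a percentage."""
    if total == 0:
        return 0.0
    return 100.0 * (total - f2f) / total


# Made-up counters, for illustration only.
print(accelerated_percent(total=2000, f2f=500))  # 75.0
```

The same formula works for packets or connections, as long as both numbers come from the same counter family.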

Kasper has figured out (THANKS!!!) that PXL is really PSL, Packet Streaming Library which Check Point uses for its IPS engine. It re-assembles IP packets so the  IPS engine can inspect them (I have to update this diagram).  You can read about it here.

How secureXL computes its counts


Little bit more of a deep dive here:

Why am I telling you this? Well, if for whatever reason you are wary of SecureXL (or don’t have a license) and decide to turn it off you can look at the before and after. You can also do your own ‘acceleration’ by moving the most-hit rules to the top of the policy so they get hit first. You might have to use Tufin to figure out what the connection counts are on a per rule basis. CP also has some magic software where if you do a ‘fw tab -t connections’, they can tell you what rules are getting hit the heaviest.
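Once you have per-rule hit counts from somewhere, ranking them is the easy part. A minimal Python sketch, with rule numbers and counts invented for illustration:

```python
def rank_rules_by_hits(hit_counts):
    """Return rule numbers ordered from most-hit to least-hit."""
    return sorted(hit_counts, key=lambda rule: hit_counts[rule], reverse=True)


# Hypothetical per-rule connection counts (rule number -> hits).
hits = {1: 120, 2: 98000, 3: 15, 4: 43000}
print(rank_rules_by_hits(hits))  # [2, 4, 1, 3]
```

Treat the output as a list of candidates, not a new policy order. The rulebase is matched top down, so moving a rule above another one that overlaps it can change what gets accepted or dropped.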

And of course don’t trust SmartMonitor to report the right #’s. CP official policy is SmartMonitor is broke when SecureXL is turned on.

So sad. Just sharing the love here.

dreez

This is the “Oh YEAH…Aside from the marketing rah-rah, this is what will blow it up list”

These services prevent ACCEPT TEMPLATES from being created (Throughput acceleration is not inhibited by the presence of rules with these properties.)

• Service with a port number range

• Service type “other” with a match expression

• Service type RPC, DCOM, or DCE-RPC

• Service with “enable reply from any port” checked

• Source or destination is a domain or a dynamic object

• Time object associated with the rule

• Client or session authentication involved with the rule

• SYN Defender (the entire 3-way handshake must be supervised by the FireWall-1 application, slightly reducing the effect of connection-rate acceleration – most significant performance impact on short duration connections)
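To make the list above a bit more concrete, here is a Python sketch that flags why a rule would stop accept templates. The property names are invented for this sketch (they just mirror the per-rule bullets), this is not how the gateway stores rules, and SYN Defender is left out.

```python
# Property names are invented for this sketch; they mirror the bullets above.
TEMPLATE_BLOCKERS = {
    "port_range": "service with a port number range",
    "other_with_match": 'service type "other" with a match expression',
    "rpc_service": "service type RPC, DCOM, or DCE-RPC",
    "reply_from_any_port": '"enable reply from any port" checked',
    "domain_or_dynamic_object": "source or destination is a domain or dynamic object",
    "time_object": "time object on the rule",
    "client_or_session_auth": "client or session authentication",
}


def template_blockers(rule):
    """List the reasons (if any) a rule would stop accept templates."""
    return [why for flag, why in TEMPLATE_BLOCKERS.items() if rule.get(flag)]


# Hypothetical rule, for illustration only.
rule = {"name": "web-out", "time_object": True, "port_range": True}
for reason in template_blockers(rule):
    print(reason)
```

An empty list means none of the listed properties are set on that rule. It says nothing about whether templates are already disabled by a rule higher up.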

The following rule properties present in the security policy will disable throughput and connection-rate acceleration for all traffic.

• Rules with action “encrypt” on an interface that does not support cryptography

• Rules where the source or destination of the rule is the gateway itself

• Rules where the service has an INSPECT handler (e.g. FTP control connection)

• Rules with Security Servers or services with resources

• Rules with user authentication

• Rules for non-TCP/UDP/GRE/ESP connections (NOTE ICMP disables accept templates!!!, and it won’t be throughput accelerated)

Traffic Limitations

The following traffic is not throughput or connection-rate accelerated by SecureXL.

• Multicast traffic

• Directed broadcast traffic

• Traffic across an Access Control List-enabled interface

• Traffic whose Protocol field in the IP header is not TCP or UDP (e.g. ICMP, IGRP, etc.)

• IPv6 traffic

• VPN encryption algorithms that are not supported by the hardware.

• IP compression enabled for VPN traffic.

The following traffic is not connection-rate accelerated by SecureXL.

• VPN

• Complex connections such as FTP, H.323…

• Non-TCP/UDP connections

Environment Limitations

The following environments disable SecureXL connection-rate acceleration for the traffic they apply to.

• NAT

• FTP

• VPN traffic

MORE REFERENCES:

Good Explanation from CPUG

SmartMonitor and Multi-CPU monitoring

You probably already know all this but it was a surprise to me. SmartMonitor is not accurate. Wow, I’m sure this was on CNN, but I missed that broadcast.

OK, so when SecureXL is enabled the official CP policy is don’t trust SmartMonitor.

But there is more.

Multi-CPU

SmartMonitor will report that CPUs are at 25%. Well WRONG. That is the average CPU for all processors.

You have to manually inspect the processors that are licensed and configured with CoreXL using ‘top’. Notice that only the allocated processors that have kernel instances on them are actually doing the work while the other processors are idle. So if you are doing SNMP monitoring: so sad, too bad. SOL as we say in Minnesooooota – Snow Out Of Luck.
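Here is why the average lies, as a small Python sketch. The per-core numbers are made up: 12 cores, 3 of them pegged, the other 9 idle.

```python
def average_cpu(per_core):
    """The single number an averaged view would show."""
    return sum(per_core) / len(per_core)


def saturated_cores(per_core, threshold=90.0):
    """Indexes of cores at or above the threshold."""
    return [i for i, pct in enumerate(per_core) if pct >= threshold]


# Made-up numbers: 12 cores, 3 doing all the work, the other 9 idle.
per_core = [100.0, 100.0, 100.0] + [0.0] * 9
print(average_cpu(per_core))       # 25.0
print(saturated_cores(per_core))   # [0, 1, 2]
```

A monitor that alarms on the 25% average never fires, even though every core doing firewall work is flat out. Alarm on the per-core numbers instead.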

Oh yeah, be careful applying licenses with CoreXL and adding/subtracting processors. It will dynamically alter the number of CPUs permitted to run and you might have a member go into ‘ready’ state if the number of processors is not consistent across all cluster members.

 

FYI:

si: time the kernel spends handling software interrupts (deferred interrupt work, such as processing received and transmitted packets)
hi: time the kernel spends handling hardware interrupts
wa: time the processor sits idle waiting for I/O to complete
sy: time the processor is handling mainline kernel code
id: idle
us: user space – httpd and vpnd, user /bin/bash shells

[I dummied these up to make it work so might be off a bit…]

CPU time


 

And buried in SmartMonitor is the true fact that the kernel is underwater, but overall the 12 CPUs are slacking

Kernel CPU time


 

Doing forensics with old log files in SmartLog

So you have this big outage, and management is banging at your door “ROOT CAUSE”. What is a CP geek to do?

SMARTLOG to the rescue!!!

As we know, the problem with most loggers, including SmartTracker, is you only get to see the forest through a small straw. SmartLog gives you the helicopter view and quickly.

SOOOOO.

SMARTLOG – Quick and Easy. How to import and load old log/tracker files into SmartLog. Quick/Dirty and Easy Peasy

  1. Set up SmartCenter R75.45 (NOTE: I was given several patches on top of that because the counters were off and it was core dumping. You might have to call support). Ohhhh, 4 CPUs, 500 gig of disk and 8 gig of memory should be a start. A VM is OK, but disk will be slow
  2. Use SmartDashboard to connect to SmartCenter and enable SmartLog
  3. Then go into the smartlog conf directory
  4. smartlogstop
  5. Then edit smartlog_settings.txt (basically sk73360)
  6. SOOOOO, they finally fixed sk73360. Num_days_restriction_for_fetch_all_integrated tells SmartLog how many days to go back and index log files. So 150 says go back 150 days and index those files up to the present day.
  7. Thanks to ‘Mike’ in support, we figured out that “time_restrictions_for_fetch_all” means the date that SmartLog was installed (sk77640). SmartLog will only index files that were created AFTER SmartLog was installed. We set the “time_restric….fetch_all” to an epoch from BEFORE the log files were created and voilà, it started indexing
  8. cd $SMARTLOGDIR/data
  9. rm -rf *
  10. cd $FWDIR/log
  11. (copy all your fw.log, logptr, loginitial_ptr files and the log switches into this directory)
  12. smartlogstart
  13. You can tell when the index works or not by going into the $SMARTLOGDIR/log, and “tail -f  smartlog_server.log”, and you will see the read rate counters indicating the number of records it has read.  If you monitor closely it will also show you files it ignores and why.

So I haven’t figured out all the parameters to get this to work, but I will soon. For the time being this should give you enough info to do quick and dirty forensics. Check out Dr.StrangeLog for more info, good read.
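For step 7 you need an epoch timestamp that falls before your oldest log file. A minimal Python sketch for getting one. The date is made up, and I am assuming the setting wants plain epoch seconds in UTC, so check sk77640 before trusting it.

```python
import calendar
from datetime import datetime


def epoch_seconds_utc(year, month, day):
    """Epoch seconds for midnight UTC on the given date."""
    return calendar.timegm(datetime(year, month, day).timetuple())


# Hypothetical cut-off: a date safely before the oldest imported log.
print(epoch_seconds_utc(2012, 1, 1))  # 1325376000
```

Pick a date with some margin. A few extra days back costs nothing, while a cut-off that lands after the first log file means that file gets skipped.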

NOTE: you only get IP addresses because the names, IPs, and rule numbers can’t be correlated with the objects.C database from the server you imported the log files from. You’d have to join this SmartCenter to the domain in order for them to share the objects.C file. But what do you want for 5 hours of work?

SmartLog- awesome. Quick and Dirty.

dreez
