Category Archives: Tips and Tricks

Cool tips and tricks sometimes undocumented

Redirecting NSX firewall syslogs into SmartLog

 

So we know that NSX DFW is a cool toy, but its logs are invaluable for debugging and forensics. Wouldn’t it be cool if we could see DFW logs along with SmartLogs??? So our need is single-pane-of-glass security. Enter vSec and SmartLog.

After 100 VMs we knew that using DFW would just not work. In addition, the logging was painful and syslog-based. So we decided to use the best management and logging tools on the market….and it just so happens this is one of the things CP does extremely well: management and logging. Today’s talk is on “Having Fun with Logging”.

So we needed to get the DFW syslogs into SmartLog. How to do that? Well, first thing you do is pour salt on the documentation.  It’s OK, but only gives you the basics. I have decoded this documentation and will show you how to send DFW logs to SmartLog and make it go a lot faster. By default, you can just turn it on with factory defaults… but that will overwhelm your SmartLog server, and all the data will be in the ‘info’ field and not parsed out. So it took me a couple of months to figure out how to do all the parsing efficiently so as not to bury our log server.

And now for the rest of the story…..

NSX DFW LOGS

syslog-ng-arc

NSX is the part of VMware vSphere that runs its SDN…including the firewalls, the Distributed Firewalls (DFW). These firewalls send their syslog to their ESXi hosts, which we forward to a custom syslog-ng server on a single VM.  We customized this syslog-ng server to parse DFW logs into CSV format….no easy task but doable. Why????

why-syslog-ng

Once in CSV, syslog-ng forwards the logs to ‘syslog’, currently running on a UTM 4800 with 4 cores. This is a custom syslog built by CP to accept syslog and convert it to CP native log format. ‘syslog’ forwards the converted logs to fwd, which enables SmartLog and Tracker to read the logs.

Simple enough…….ok, then read no further.

syslog-ng config

Syslog-ng allows you to parse through logs and reformat them. You can do some simple parsing with the basic configuration file, but the fun starts with more complex patterns with the pattern matching databases. Basically you need to convert this:

rawsysl

into this:

csvrules

Now all you people are tons smarter than this old balding has-been…so I am not going to explain syslog-ng internals to you all. You will have to read the documentation. But I will give you the overview of what I did with it.

syslog-ng has two parts

  • General filtering config – Most people use this just for simple filtering of logs
    and re-formatting of logs and redirecting logs to host files, DB’s, other log servers.
  • Pattern Database – More complex parsing where you can take pieces of the input line, parse out specific items with regex, and insert those regex patterns into variables/macros to be used later in the General Filtering Config.

General Filtering Config: First thing you need to do is filter on DFW logs only and nothing else.  NSX puts labels on these logs and you will have to look in the raw log file for them.

filter

Second thing you have to do is get rid of the TERM log entries. These mark terminated connections for TCP; for UDP you’ll see zillions of them going to port 5355, which is some sort of local subnet DNS resolution. Unfortunately in cloud world we have huge subnets, so all the VMs are doing this link-local DNS resolution. 75% of the DFW logs are TERM entries, so they can be ignored.
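The two filtering steps can be sketched in Python to make the logic concrete. This is only an illustration, not the syslog-ng config itself. The `dfwpktlogs` label comes from later in this post; the sample lines and the idea that TERM shows up as its own whitespace-separated token are assumptions, so check them against your own raw logs.

```python
def keep_line(line, label="dfwpktlogs"):
    """Return True for DFW packet logs that are not TERM entries."""
    if label not in line:
        return False                      # not a DFW log at all
    # Assumption: TERM appears as a standalone whitespace-separated token.
    return "TERM" not in line.split()

# Made-up sample lines, for illustration only.
sample = [
    "host1 dfwpktlogs: 1234 INET match PASS domain-c7/1001 IN 60 TCP 10.0.0.1/1234->10.0.0.2/80 S",
    "host1 dfwpktlogs: 1234 INET TERM domain-c7/1001 IN TCP RST 10.0.0.1/1234->10.0.0.2/80",
    "host1 sshd[99]: Accepted publickey for root",
]
kept = [line for line in sample if keep_line(line)]
print(len(kept))
```

Dropping TERM entries this early means the expensive pattern matching only ever runs on the lines you actually want.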

Pattern Database: Next part is a bit harder. You have to build a pattern database that will parse raw DFW log entries and match each field up to a named macro. For example below is a snippet of a DFW log entry matching pattern. You can see the ‘action’, ‘domain’, ‘protocol’ fields are 3 of many fields that would match a DFW log entry from above.

dfwpatternparser
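For readers who think in regex rather than syslog-ng pattern syntax, here is a rough Python equivalent of mapping pieces of a line to named fields. The names ‘action’, ‘domain’, ‘protocol’ and ‘size’ are the macro names mentioned in this post; `direction`, the field order, and the sample line are assumptions for illustration, not the real pattern database.

```python
import re

# Assumed layout of one DFW log line; the field order is illustrative only.
DFW_PATTERN = re.compile(
    r"match\s+(?P<action>\S+)\s+"      # e.g. PASS or DROP
    r"(?P<domain>\S+)\s+"              # e.g. domain-c7/1001
    r"(?P<direction>\S+)\s+"           # hypothetical field name
    r"(?P<size>\d+)\s+"
    r"(?P<protocol>\S+)"
)

def parse_dfw(line):
    """Return a dict of named fields, or None when the line does not match."""
    m = DFW_PATTERN.search(line)
    return m.groupdict() if m else None

fields = parse_dfw(
    "dfwpktlogs: 1234 INET match PASS domain-c7/1001 IN 60 TCP 10.0.0.1/1234->10.0.0.2/80 S"
)
print(fields)
```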

This can be a delicate task, but there is a tool to help debug … ‘pdbtool match’. You can type in a sample line and see if your pattern database will match it:

syslogng-parserdatabase-dfwparsing-macros

General Filtering Config: Next, DFW labels ICMP protocol as ‘PROTO’. So I just translate into normal geek lingo ‘ICMP’.

dfw-cmp

General Filtering Config: Next is the fun part. We finally get to output CSV to the CP syslog server. Notice how the MACRO names like ‘size’ and ‘protocol’ are used to fill in the CSV?

syslogng-parserdatabase-dfwparsing-csvformat
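The last two steps, rewriting ‘PROTO’ to ‘ICMP’ and emitting a tagged CSV line, look roughly like this in Python. It is a sketch under assumptions: the column subset and order are made up, and the tag is written as `dfwpktlogs--` with two ASCII hyphens on the guess that the dash shown in this post is a typographic rendering of that.

```python
import csv
import io

CSV_FIELDS = ["action", "domain", "protocol", "size"]   # illustrative subset and order

def to_csv_line(fields, tag="dfwpktlogs--"):
    """Format parsed fields as one tagged CSV line."""
    row = dict(fields)
    if row.get("protocol") == "PROTO":    # DFW's label for ICMP
        row["protocol"] = "ICMP"
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="")
    writer.writerow([row.get(name, "") for name in CSV_FIELDS])
    return tag + buf.getvalue()

example = to_csv_line(
    {"action": "PASS", "domain": "domain-c7/1001", "protocol": "PROTO", "size": "60"}
)
print(example)
```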

General Filtering Config: And here is the ‘main’ section that drives all the above phases:

syslogng-parserdatabase-dfwparsing-main

So restart your syslog-ng server:

/etc/init.d/syslog-ng restart

and off you go!

At this point you can ‘tcpdump -X -n -s0 port 514’ to verify that in fact the logs are formatted correctly and heading over to CP land.

SYSLOG Testing

Just in case my code sucks, here is how you test it. There are syslog generators out there that you can use. I used this one, and here you can see how I generated traffic:

sysloggenerator-2

One key thing is the “dfwpktlogs–” is what triggers the regex’s to fire in the next section. Do NOT use ‘:’, you have to use ‘–‘ or for some reason the regex won’t parse correctly. More below…
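If you don’t have a syslog generator handy, a few lines of Python can stand in for one. This is not the generator used above, just a minimal sketch. The bare `<PRI>` framing, the default priority of 134 (local0.info), the ASCII `dfwpktlogs--` form of the tag, and the address in the comment are all assumptions.

```python
import socket

def build_message(csv_payload, tag="dfwpktlogs--", pri=134):
    """Build a bare-bones syslog datagram: <PRI> followed by the tagged payload."""
    return "<{}>{}{}".format(pri, tag, csv_payload).encode("utf-8")

def send_test_log(host, csv_payload, port=514):
    """Send one test message over UDP to the log server."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(build_message(csv_payload), (host, port))

# Example with a hypothetical address:
# send_test_log("192.0.2.10", "PASS,domain-c7/1001,TCP,60")
print(build_message("PASS,domain-c7/1001,TCP,60"))
```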

 

CP SYSLOG MDS Config

So now the logs are flowing to port 514 on SmartCenter Log server. So you have to make sure you enable the syslog process (NOT syslogd) to listen.

syslogdashboard

 

This will start the syslog process on port 514 on SmartCenter.

syslog514

On an MLS, you have to restart the whole MLS for the config to kick in. On an MLS, one syslog daemon is started per domain log server IP address, for every domain that is configured to accept syslogs.

mls-syslog

and syslogng logs will flow magically into Tracker. Actually they will magically flow, but only fill in the INFO field and will not parse …. yet.

Oh yeah, when debugging you can get the syslog process to re-read the syslog config with:

  1. mdsenv <DOMAIN>  # MDS only
  2. fw kill fwd; fwd -n &

CP SYSLOG Parsing

OK, now it gets fun.

So there is this CP tool: Eventia Log Parser that looks pretty cool. You feed it your SYSLOGS and magic parsing configuration pops out the other end:

 

elp

Yeaaaaaaah…NO. Doesn’t work and the config you output is so complex the SYSLOG engine will run at 100%. Both ELP and SYSLOG were written about the time after the Civil War and are on version 1.0…Typical CP V1 code so don’t get your hopes up. Let me know if you have better luck, I spent days on this.

So I decided to generate the syslog parsing config myself. I saw what ELP attempted and then came up with my own.

Remember that SYSLOG-NG is sending CSV formatted logs:

csvrules

and CP SYSLOG is taking them in and turning them into SmartLog readable:

syslogentry

So let’s begin.

The debug process is this:

  1. mdsenv <DOMAIN> if on MDS
  2. Edit the syslog rules with your stuff
  3. ‘fw kill fwd; fwd -n &’ restarts the syslog daemon
  4. Use the syslog generator
    1. Forget all this parsing junk, just get the SYSLOG-NG to dump into the CP SYSLOG and see the results in TRACKER in the ‘info’ field
    2. Try and get the REGEX rule to fire on ‘dfwpktlogs–‘
    3. Try and get field #1 to parse and fill in some random text field for debugging
    4. Try and get field #1 to output into the official SmartTracker field
    5. Go to #2 for field #2/3/4/5….

So far I have given you enough information to do #4.1. Let’s work on #4.3.

This may look simple…but it took me days and days to fine tune this. The ELP generated config was pages and pages of REGEX expressions. I boiled it down to 1 line:

cp-regex-syslog

Even if you aren’t a regex geek, you can see that I am parsing the CSV file into 9 fields.

REGEX GEEK OUT (don’t read this if you aren’t a regex geek):

FYI: They don’t implement FULL regex matching! Example: You can only have 9 matching field patterns, I couldn’t get it to recognize ‘:’, you can’t use lazy searches ‘.+?’. It is some limited hack that you will never figure out because there is no documentation other than examples.  AND their regex performance is horrible, so I tried to avoid using ‘.*’ because it is greedy and will scan the whole line and then backtrack. Instead I used the ‘[^,]’ which searches for all characters NOT ‘,’, which is the same as ‘(.*),’…but doesn’t have the backtracking.

 

============= END GEEK OUT===============================
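The shape of the pattern from the geek-out can be tried in Python. Python’s `re` is a full regex engine, unlike the limited one in CP’s syslog, so this only illustrates the nine ‘([^,]+)’ groups and how they compare to ‘(.*)’. The tag spelling and the `f1`..`f9` sample fields are assumptions.

```python
import re

TAG = "dfwpktlogs--"    # assumed ASCII form of the tag
line = TAG + ",".join("f{}".format(i) for i in range(1, 10))    # f1,f2,...,f9

# Nine capture groups separated by commas.
negated = re.compile(re.escape(TAG) + ",".join(["([^,]+)"] * 9))
greedy = re.compile(re.escape(TAG) + ",".join(["(.*)"] * 9))

m = negated.match(line)
print(m.groups())
# Same captures on a clean line, but the greedy form gets there by backtracking.
print(greedy.match(line).groups() == m.groups())
```

The negated class stops at the first comma it meets, so each group is consumed in a single left-to-right pass.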

So what is important is the ‘dfwpktlogs–‘.  When the regex sees this pattern, it will fill in the 9 columns of information from the CSV formatted input line. Now there are 3 different types of actions that will be taken depending on the results of a REGEX match:

Rules:

  • NO MATCH on regex: Result->log entry with data in INFO field
  • MATCH, but no log entry: you matched the ‘dfwpktlogs–‘ but something after that failed, such as trying to print a number into Tracker, a data type error, a wrong field name, a syntax error in the field names, etc.
  • MATCH and () field hits: can use index_value(1) to fill out fields (coming next)

OK so next you will try and parse field #1 which is the first ‘([^,]+)’. This is the PASS|DROP field from the CSV. What I did is I first captured the field and then dumped it into a random text field. This told me that 1) I captured it correctly 2) I am able to write to SmartTracker.

csv-field

Here is the add_field that does this. add_field adds the field to Tracker. Field_name is the Tracker field name, in this case the ‘rule_name’ field. field_index is CSV field 1 (from 1-9). Field type is the type of the Tracker field. But beware, I didn’t always get this right and the log entries would just disappear, so I had to experiment. I also looked into:

$FWDIR/conf/syslog/CPdefined/*.C files for examples to see what their field types are.

You can also look into

trackerfields

to see the Tracker field names and types. It’s a bit kludgey, but between the CPdefined examples and this file you’ll get close.

Here you can see ‘rule_name’ is the name of the Tracker field defined in svt_fields.C.

trackerfieldnames

 

So now you have 1) captured field #1 and 2) written it to the ‘rule_name’ field. You can now verify that you captured the field you intended to. Once you verify this, you can then write it to the REAL field with:

syslog-tracker-real-field

Here we are writing the PASS|DROP to the ‘action’ field in tracker:

cp-action-field

HOLD ON DREEZ: How in the heck did PASS|DROP get converted to Accept|Drop???? you ask.

Grasshopper, I introduce the dictionary file…

dict-file

This is a second file in the same directory that will do transforms for you. Here you can see “PASS” being converted to ‘accept’.
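Conceptually the dictionary file is just a lookup table applied to a captured value. A tiny Python sketch of the idea, which says nothing about the real file syntax: the PASS to ‘accept’ mapping is the one shown above, the DROP to ‘drop’ entry is assumed from the Accept|Drop values seen in Tracker, and passing unknown values through unchanged is my own choice.

```python
# PASS -> accept is from the post; DROP -> drop is an assumption.
ACTION_MAP = {"PASS": "accept", "DROP": "drop"}

def translate_action(value):
    """Map a DFW action to its Tracker value; unknown values pass through unchanged."""
    return ACTION_MAP.get(value, value)

print(translate_action("PASS"))
```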

DREEZ you ask: How do I know what to convert to what?

Oh yes grasshopper. You read the CP documentation…NOT!!! HAHAHAHAHAH. Dreez makes a big funny. HAHAHAHAH. Remember grasshopper, this documentation was written in the Civil War and even now the Lord Developers rarely talk to their lowly documentation peasants slaving in the fields trying to identify nuggets of information to feed their ignorant customers. (If you want real documentation see AWS documentation, you can see where developers talk to the documentation team). That’s what phone support is for —- DUH!!!!

Oh yes, I easily diverge, forgive me.

No grasshopper you do what us old people have been doing for centuries. You look at examples in

$FWDIR/conf/syslog/CPdefined/*.C

OR:

On your log server: fw log

Will print out your logs with the field names in them.

To get a feel of what the valid field types are.

So now that you know how to do field #1, now you go through all the fields one at a time….or……

You can just use my template as a starter!!!

 

PERFORMANCE

So the first time we did this, we used CP’s syslog ELP generated config on one of our big servers and the server went to 100%++++. Unfortunately the ‘syslog’ process is single threaded so it had nowhere else to go.

So with 1100 VM’s….each with its own firewall…sending syslog to my config described above to a 4800 (slow) based management station…performance was much better but not ideal. The 4800 has a quad core Q9400 on it. Syslog process is a bit busy but at least not 100%. It will average between 11% and 60%, and burst to 90% now and then.

Now remember that on MLM, each domain/cma/clm has its own ‘syslog’, so you can manually distribute the load amongst multiple logging domains. But I would hate to do domain design based on log loads. For example, at one company 2 of 12 domains had 90% of our logging. So should we split them up further because of logging? Not sure.

Your mileage may vary….

 

SUMMARY

Everyone is excited about this because SmartLog is what our org (and all the other orgs I’ve been at) live and die by. Centralized single-pane of glass easy to use and fairly fast security monitoring. It is the gem of all of CP’s products. And it mostly sometimes works!!!

Now we can send logs from other tools like other firewalls, URL filtering, FireEye, etc into this and use SmartLog to get quick answers. I am keeping both flat syslog files as well as sending them to mongodb and SmartLog. YES: SIEM tools exist…but either they don’t work, are too expensive, or are at capacity, etc. ‘grep’ is free. ‘mongodb’ is free, etc.

 

 

LOGOUT!

dreez

 

 

Administrator Audit Made Easy – Create CSV of MDS user permissions

Darn auditors want to know who has what permissions in MDS……but want it in a spreadsheet! What’s up with that old technology?

Here it is, a matrix of users and their permissions.

adminperms
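The output side of this, turning a user-to-permissions mapping into a spreadsheet-friendly matrix, is the easy part. Here is a minimal sketch of it. It is not the Adminparser program; the input shape (user name mapped to a set of domain names), the "X" marker, and the sample names are all hypothetical.

```python
import csv
import io

def permissions_matrix(perms):
    """perms maps user -> set of domains; returns CSV text with one row per user."""
    domains = sorted({d for ds in perms.values() for d in ds})
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["user"] + domains)
    for user in sorted(perms):
        writer.writerow([user] + ["X" if d in perms[user] else "" for d in domains])
    return buf.getvalue()

# Hypothetical users and domains, for illustration only.
matrix = permissions_matrix({"alice": {"DomA"}, "bob": {"DomA", "DomB"}})
print(matrix)
```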

Python Program #2: Adminparser

NOTE: Goes hand in hand with my Cparser module.

Hopefully this will be easier with the R80 REST interface.

Audit OUT!

dreez

Convert any CheckPoint .C file into Python List

Killing two birds  with 1 pebble. Learning some python and automating our admin audits.

This is the core of it. Converts any .C file into a Python list. So you can use this to parse through your objects, rulebases, users, admin lists, etc. Once converted you can create GUIs or other parsing tools (like the one I will use for admin user deltas).

Download here: Cparser.py

cpadmins
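To give a feel for the approach, here is a minimal sketch of turning parenthesised .C-style text into nested Python lists. It is not Cparser.py. It assumes the file is made of parentheses, bare words like `:name`, and double-quoted strings, and the sample text is made up.

```python
import re

# A token is a parenthesis, a double-quoted string, or a run of other non-space characters.
TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()]+')

def parse_c(text):
    """Parse parenthesised .C-style text into nested Python lists."""
    stack = [[]]
    for tok in TOKEN.findall(text):
        if tok == "(":
            stack.append([])              # open a new nested list
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            done = stack.pop()
            stack[-1].append(done)        # attach the finished list to its parent
        else:
            stack[-1].append(tok.strip('"'))
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]

sample = '(\n :name (alice)\n :perms (\n  :type ("read write")\n )\n)'
tree = parse_c(sample)
print(tree)
```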

Identity Awareness started to fail, Captive Portal broke – Certificates changed

This weekend our captive portals just stopped working. This obvious error told me a lot (not).

2015-05-12_7-48-47

tcpdump was equally confusing..

2015-05-12_8-24-47

Took me a while, but it turns out the AD certificates changed and no one notified us. I just happened to notice that the fingerprint changed when I fetched it.

2015-05-14_10-04-09
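A check like this can be automated so you are not relying on luck next time. Below is a minimal Python sketch that compares a certificate fingerprint against a saved one. The SHA-1 hash and colon-separated format are assumptions (use whatever your tooling displays), and the commented-out fetch uses a hypothetical host and port.

```python
import hashlib

def fingerprint(der_bytes):
    """Return a colon-separated SHA-1 fingerprint of DER-encoded certificate bytes."""
    digest = hashlib.sha1(der_bytes).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))

def has_changed(der_bytes, known_fingerprint):
    """True when the certificate no longer matches the fingerprint on record."""
    return fingerprint(der_bytes) != known_fingerprint

# Fetching a live certificate (hypothetical host and port):
# import ssl
# pem = ssl.get_server_certificate(("dc.example.com", 636))
# der = ssl.PEM_cert_to_DER_cert(pem)

saved = fingerprint(b"old certificate bytes")    # stand-in bytes for illustration
print(has_changed(b"new certificate bytes", saved))
```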

One of those “Thank god it wasn’t the firewall” days.

dreez

Cluster sync cable tricks

Got this from Watcha again.

A sync cable is only used to exchange state information. If you pull it, then the state tables will be out of sync, but the cluster will remain otherwise healthy and be able to failover. So YES – protocols like FTP will hang, but most other resilient protocols like DNS/HTTP will continue to work with no blips.

In a cluster , CCP  udp port 8116 packets are exchanged on all interfaces. CCP are just keep alive statuses and do NOT contain state information. If a member notices that it is not seeing CCP packets from its peer on one interface, the current standby will go to DOWN and the active member will remain active and go to Active Attention. However, failover will still work regardless if a sync cable exists or not. The downside is the state table may be hosed, but the members will failover as always (assuming that both members are otherwise healthy).

Never fully tested the above but makes sense….TBD.

 

 

AV and 100% CPU

We turned on AV and CPU’s went to 100%. Some digging…

  1. AV only filters files on HTTP and SMTP. Curious…what happened to FTP, NTFS, SMB, CIFS, etc?
  2. AV will look for viruses in HTTP on ALL ports by default.  This search bypasses SecureXL and goes into the medium path, where all packets are unwrapped by the worker processes and not just quickly forwarded by the fast path interrupt handlers.
    avlimited
    httpport80
  3. We then tried to write exceptions and white lists to filter out some of the crap. No luck, they are broken 
  4. We were getting random reports of PDF’s getting corrupted. Not possible, we are in detect mode. Unfortunately someone came up with a tcpdump proving the firewalls in AV detect mode are corrupting traffic. Back to the drawing board……

I’m a die hard CP fan because of their management and logging, but when stuff like this hits my desk I just shake my head.

 

Performance Made Easy – Turn off State Sync

Wow, this is an obvious one I got from a CP performance guy Watcha.

DNS/HTTP/icmp (not sure what others) have redundancy built into the protocol. If a packet does not go through, the layer 7 application will resend. DNS/HTTP/icmp are probably 75% of most of your traffic, so they are filling the state tables. WHY???

If a cluster fails over and the state tables are all hosed up…DNS/HTTP/icmp may hesitate for 1 second but will retry at the layer  7 level. So WHY??? keep them in the state table. WHY make the members share HTTP information over the SYNC cable. Get rid of all that crap.

Turn this off for protocols that are redundant at the layer 7 application level. That includes SNMP, NTP, etc…but they are probably only a fraction of your traffic. FTP you want to keep enabled, and probably database connections too (NOT SURE).

dns

Fun with IA Portal Sharing and Spoofing….But not too much fun

There is this cool feature you can use to:

  1. Centralize your portals
  2. Limit the number of portals you have to configure and support
  3. Users only have to remember 1 portal instead of 100
  4. Centralize the portal, but if it breaks you can still authenticate at the local portal

So it looks something like this:

  • User authenticates at central Captive Portal
  • User can then have access to ANY of the sites whose firewalls share that user database on fw-1:

Slide1

 

How is this all done?? Well, you first have to allow the fw-1 system to share identities with the remote sites:

Slide2

And on the remote sites you have to point to the fw-1: corporate identity awareness portal:

Slide3

Waaaalaaaa done. Works. Ship it. Finished. Chapter Over. Go watch TV and a beer and pizza and chill. OK, seems simple doesn’t it?  Hahahahahaha. Pretty funny.

Let’s look what happened underneath the covers.

Slide4

PEP and PDP set up connections with each other.

 

Slide5

 

 

PEP and PDP set up connections with each other. PEP needs to share the user table that PDP is holding on the fw-1: Corporate IA Portal

These are the ports that are being used to do that. You can see there are the main ports and then some random ports it spawns off.

Slide6

Slide7

 

 

 

So now comes the fun part. Not sure how your network setup is, but I saw connections come from all interfaces to all interfaces. I didn’t debug but I think it depends on which interface Captive Portal is running. Anyways the result was a ton of spoofing drops. Took me a while to figure out what was going on.

In addition, if you have firewalls between the fw-1 and the remotes, then you have to open these ports so they can talk IA.

Slide8

So to fix this whole mess, I had to put the firewalls into spoof groups on all interfaces. I also set the spoof to DETECT for a bit until I knew it all worked and then turned it back to PREVENT.

Slide9

Install policy on both fw1 Corporate IA Portal and fw2 remote.

After I did this CP actually has been working awesome so far. Remember this is with multiple AUs in a single AD domain and a single MDS domain – which is what CP strongly warns you against.

 

 

 

YAIAF – Yet Another IA Fix for Multiple AUs in an AD Domain

DANGER WILL ROBINSON, DANGER

jpg

If you have a single AU in a single AD domain, ignore this

When your picker chooses an access role, the picker will label which AU is associated with that role and store it on the firewall.

So here we are creating a group US_HQ_Admins chosen from the Siberia AU. There is only 1 AD domain ‘abc.com’ but multiple AUs/DCs; we just so happen to be pointing to the Siberia AU/DC:

UserPicker Link to AU

When you choose US_HQ_Admins it will eventually be stored in the firewall tagged with the AU it was found in: Siberia:

fwauth.DB

au-config

Why do you care???

Because when the firewall is trying to tie the USER to this group US_HQ_Admins, the firewall will make sure the LDAP query it makes to an AU matches the AU in the group US_HQ_Admins. In this case they both have to be to SIBERIA. If for some reason the firewall uses the FLORIDA AU (multiple AUs in a domain)…..game over sucker.

How to fix this??

Wellllllllll. Get ready.

Add this line into CPprofile.sh and reboot/cpstop/start.

iafix

And the firewall will ignore the AU match and just match up the access role to the LDAP group regardless of what AU it is from.

Pretty cool HUH?

Identity Awareness Multiple Domain Controllers Captive Portal fails

Triple Super Secret debugs for Identity Awareness

In the off chance that there are ‘design inconsistencies’ with Identity Awareness, try this. I don’t believe these are published yet so “sssssssshhhhhhhhhhhh……” don’t tell anyone.

################ Debug with pdp tool #####################################

echo "=======> start debug `date` " >> $FWDIR/log/pdpd.elg
#### PDP debug on
pdp d s all all
#### (reproduce the problem here while the debug is on)
#### PDP debug off
pdp debug off
echo "=======> stop debug `date` " >> $FWDIR/log/pdpd.elg

 

################ Debug with fw ctl debug #####################################

#### turn off any debug
fw ctl debug 0
#### reserve a 32000 KB buffer for output
fw ctl debug -buf 32000
##### turn on all flags for Identity Awareness
fw ctl debug -m IDAPI +all
##### send output to screen and to file
fw ctl kdebug -T -f > /tmp/pdp_debug.txt &
##### turn off debug
fw ctl debug 0

 

############## Debug with fw debug #################

# remove debug file
rm /opt/CPsuite-R75.40/fw1/log/pdpd.elg*
# turn on debug
fw debug fwd on TDERROR_ALL_ALL=5
# kill the pdpd, it will auto restart
killall pdpd
# log should be filling up when it auto restarts
# turn off debug
fw debug fwd off TDERROR_ALL_ALL=0
# look at debug info
less /opt/CPsuite-R75.40/fw1/log/pdpd.elg

 


Life Stories from Dreez

These are stories from my travels. Generally I like to write stories about local people that I meet and also brag about living the retirement dream with my #1 wife Gaby. She is also my only wife.