What's New in R75

Check Point R75 is based on the Software Blades Architecture.

Identity Awareness in the Check Point Security Gateway

  • Identity-based Firewall and Application Control policies, including users, user groups and machines.
  • Logging of User Identities makes troubleshooting simpler and allows better trend analysis.
  • Multiple, flexible methods for obtaining user identity: seamless integration with Active Directory (no need to install an agent on the Domain Controller), a captive web portal for clientless user authentication, or a thin client for strong authentication and impersonation prevention, based on unique patent-pending technology for light signature of packet information.
  • Scalable identity sharing between multiple gateways to identify users in one or many sites and share with other gateways in the same or different sites.

Application Control Software Blade

  • Granular Application Control to identify, allow or block thousands of applications.
  • Largest application library with AppWiki - Comprehensive application control leveraging the largest application library that scans and detects more than 100,000 applications and Social Network widgets.
  • Auto-updates for applications database on the gateway (NO need to re-install policy).
  • Detect rapidly changing Social Network Widgets via online service.

Integrated DLP Software Blade

Check Point's innovative Data Loss Prevention, now available as an integrated Software Blade.

  • Prevents data loss of critical business information.
    • Network-based solution prevents breach of corporate data.
    • Compliance with data protection standards (such as PCI-DSS, HIPAA, GLBA and SOX).
  • Cutting edge technology for DLP processes enforcement.
    • Innovative MultiSpect data classification engine combines users, content and process into accurate decisions.
    • New UserCheck technology empowers users to remediate incidents.
    • Low maintenance, self-educating system - does not require IT/security personnel in incident handling while educating the users on proper data sharing policies.
  • Easy deployment for immediate data loss prevention.
    • Less than one day deployment of preventative DLP solution.
    • Over 250 pre-defined types to create your own policy.
    • Better control and auditing capabilities with centralized security management.
  • New DLP features:
    • ClusterXL HA support - quarantine database is synchronized between cluster members.
    • Incident storage at Management server.

Mobile Access blade

  • Remote Access - SSL VPN technology is used for secure encrypted communication from unmanaged mobile devices, PCs and Macs to your corporate IT infrastructure.
  • Check Point Mobile Client - for simple and secure connectivity to corporate resources from smartphones and PCs.
  • Mobile Access Portal - for connecting securely to corporate resources through a portal from a web browser.
  • SSL Network Extender (On-demand client - SNX) - Best for secure connectivity to corporate resources using non-web-based applications via an on-demand, dissolvable client.

Endpoint Security VPN R75

Endpoint Security VPN introduces the Next Generation of SecureClient, including 64-bit support. It provides mobile users seamless and secure connectivity to corporate resources by establishing an encrypted and authenticated IPSec tunnel with Check Point Security Gateways.

Enhanced IPS signature support

  • Increase scalability of the IPS engine when adding many more protections.
  • Decrease memory footprint (currently some pattern based protections require large memory footprint).
  • Provide a new framework for using non-regular keywords replacing complex regular expressions.
  • Enhance the IPS engine to support simpler and more efficient CIFS and DCE-RPC protections.

Multi-Domain Security Management (based on proven Provider-1 technology)

  • R75 supports the new licensing scheme of Multi-Domain Security Management. You can easily convert an existing Security Management deployment to a Multi-Domain Security Management deployment by adding Software Blades.

Other improvements

  • Security Management Server supports Series 80 Appliances gateways for centrally managed branch offices.
  • You can set a different authentication method per blade on the same gateway. For example, a user can log in to Mobile Access with certificate authentication and log in to DLP with username and password authentication.
    In Gateway Properties, configure the desired authentication method for IPSec VPN and Mobile Access in its respective Authentication page, and for Identity Awareness in its Authentication Settings page.
  • You can now use multiple portals over port 443 and port 80. For example, the SecurePlatform Web User interface and the Mobile Access portal can both be on port 443. In the SmartDashboard Gateway properties window, set the Portal URL for the different portals on the portal configuration pages.
  • The user search for remote access users works according to the user groups. If a user authenticates with an IPSEC VPN client and the user is in the LDAP groups of a Remote Access VPN Community, then the user will be found in the LDAP server. If a user authenticates to the Mobile Access portal, and the user is defined in the Access to Application rules as part of the Internal Database groups, the user will be found in the Internal Database.

Error: "Packet is dropped because there is no valid SA - please refer to solution sk19423 in SecureKnowledge Database for more information"


Symptoms
  • Error in SmartView Tracker: "Packet is dropped because there is no valid SA - please refer to solution sk19423 in SecureKnowledge Database for more information".
Cause
The Error message indicates a failure in the IPSec Security Association negotiations process: specifically a function timeout occurred. The two most common causes of function timeouts are:
  1. A packet needs to be encrypted but a new IPSec SA needed for its encryption could not be created.
  2. A packet needs to be decrypted but the IPSec SA matching the SPI on the packet does not exist.
During IKE Quick Mode Exchange, the VPN daemon negotiates IPSec Security Associations (SAs) with the VPN partner site. If negotiations fail and the exchange does not complete, the VPN daemon has no IPSec SAs to send to the firewall kernel. The firewall daemon expires the running VPN's state table entries or does not start a new VPN, since it did not receive the updated IPSec SAs. The expiration triggers this error message.
The message indicates that the SAs expired, but does not indicate the root cause of the problem. Other SmartView Tracker messages, before or after the "sk19423 Error", provide more information about the issue.

Solution
Most of the time, this message is displayed due to interoperability issues. In such cases the VPN-1 VPN Interoperability document should assist you in resolving your issue.

You can also review SmartView Tracker for other information/error messages before or after the "sk19423 error". Specifically, check to see if an IKE negotiation has failed or succeeded:

Procedure:
  1. Open SmartView Tracker.
  2. On the left hand pane double-click on the 'VPN-1' query menu item.
  3. View the queried logs in the right pane.
Note:
Be sure to verify the system clocks for all Security Gateways included in the VPN are synchronized. Unsynchronized system clocks can contribute to the symptom.

If the negotiation was successful:

A log entry in SmartView Tracker is displayed. The "Action" field of this entry displays the text "Key Install" and the "Information" field reads "IKE: Quick Mode completion". In case the IKE negotiation was successful, no corrective action for the "sk19423 error" is required.

If the negotiation failed:

Log entries display the "Encryption Scheme" field containing the text "IKE". The log entries vary but more accurately pinpoint the problem. Use these information/error messages to search SecureKnowledge for specific fix(es). If additional IKE error messages do not exist, and a VPN connection is not working, generate a VPN debug report and open a Service Request with Check Point Support.
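The success/failure check described above can be sketched in Python. This is a minimal illustration only, assuming the relevant SmartView Tracker entries have already been exported into a list of dictionaries. The keys action, information and encryption_scheme are hypothetical stand-ins for the "Action", "Information" and "Encryption Scheme" fields, not a documented export format.

```python
def classify_ike_outcome(entries):
    """Return 'success', 'failed', or 'unknown' for a list of log entries.

    Each entry is a dict; the keys used here are hypothetical names for the
    SmartView Tracker fields described in the text.
    """
    saw_ike_entry = False
    for entry in entries:
        action = entry.get("action", "")
        info = entry.get("information", "")
        # A successful negotiation logs "Key Install" with Quick Mode completion.
        if action == "Key Install" and "IKE: Quick Mode completion" in info:
            return "success"
        if entry.get("encryption_scheme") == "IKE":
            saw_ike_entry = True
    # IKE entries without a completed Quick Mode suggest a failed negotiation.
    return "failed" if saw_ike_entry else "unknown"


# Hypothetical sample entries, for illustration only.
sample = [
    {"action": "Drop", "information": "no valid SA", "encryption_scheme": ""},
    {"action": "Key Install", "information": "IKE: Quick Mode completion",
     "encryption_scheme": "IKE"},
]
print(classify_ike_outcome(sample))
```

A result of "failed" only says that IKE entries exist without a completed Quick Mode; the individual entries still have to be read to find the actual cause.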
Troubleshooting encryption errors that spawn the sk19423 message in various configurations:
These encryption failures occur when no IPSEC SA (Security Association) is found for a connection.
Log message: "Packet is dropped because an IPSEC SA associated with the SPI on the received IPSEC packet could not be found."

  • Configuration: Gateway is a cluster member.
    Scenario: Possibly the IKE negotiation was managed between the peer and member A, while the IPSEC packets reached member B.
    Solution: Such a log message may indicate a problem in the synchronization network of the cluster members. Make sure that the synchronization network works properly. Run the command:
    cphaprob -a if
  • Configuration: Gateway is not a cluster member.
    Solution: Contact Check Point Support.

Log message: "Packet is dropped because there is no valid SA for user peer - please refer to solution sk19423 in SecureKnowledge Database for more information."

  • Configuration: Any.
    Scenario: The gateway tried to open a connection to a user who disconnected their remote access client.
    Solution: No action needed.
  • Configuration: Gateway is a cluster member in a load balancing configuration.
    Scenario: Possibly the remote access client had an IKE negotiation with member A while packets to that client were sent through gateway B. The IPSEC SA was not synchronized between the members.
    Solution: Such a log message may indicate a problem in the synchronization network of the cluster members. Make sure that the synchronization network works properly. Run the command:
    cphaprob -a if
  • Configuration: Gateway is a cluster member.
    Scenario: The remote access client is behind a NAT device and the NAT mapping was deleted from the NAT device (e.g., because of NAT entry timeouts). IKE packets from the gateway could not reach the remote access client since the NAT device could not forward them.
    Solution: In order to work with remote access clients behind NAT devices, the client must send keep-alive packets. To configure this:
      1. From the main menu, select 'Policy' > 'Global Properties'.
      2. Click the 'Remote Access' section.
      3. Select the 'Enable back connections' option.
  • Configuration: The gateway is not a cluster member, and many connections are opened from the gateway domain to the remote access client.
  • Configuration: Any.
    Scenario: IKE negotiation failed (e.g., because of an invalid certificate).
    Solution: Search the logs for IKE negotiation messages.

Log message: "Packet is dropped because there is no valid SA for user peer - please refer to solution sk19423 in SecureKnowledge Database for more information."

  • Configuration: Any.
    Scenario: IKE negotiation failed (e.g., invalid certificate).
    Solution: Search the logs for IKE negotiation messages.

How to configure automatic backups in SecurePlatform

Automatic backups

This guide describes how to configure scheduled automatic backups with remote file transfer to an SCP/FTP server.

Backup and restore commands

SecurePlatform provides both command-line and Web GUI capabilities for backing up your system settings and product configuration.

The backup utility can store backups either locally on the SecurePlatform machine hard drive or to an FTP server, TFTP server or SCP server. You can perform backups on request, or according to a predefined schedule.

Backup files are kept in tar gzipped format (.tgz). Backup files saved locally are kept in /var/CPbackup/backups.

The restore command line utility is used for restoring SecurePlatform settings, and/or Product configuration from backup files.

Note - Only administrators with Expert permission can directly access directories of a SecurePlatform system. You will need the Expert password to execute the restore command.

The backup & restore commands are provided in SecurePlatform to provide a simple way to perform a complete backup of the Check Point configuration as well as the SecurePlatform OS settings. You can also copy backup files to a number of SCP and TFTP servers for improved robustness of backup. The backup command, run by itself, without any additional flags, will use default backup settings and will perform a local backup.

Syntax

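backup [-h] [-d] [-l] [--purge DAYS] [--sched [on hh:mm <-m DayOfMonth> | <-w DaysOfWeek>] | off]
[--tftp <ServerIP> [-path <Path>] [<Filename>]]
[--scp <ServerIP> <Username> <Password> [-path <Path>] [<Filename>]]
[--ftp <ServerIP> <Username> <Password> [-path <Path>] [<Filename>]]
[--file [-path <Path>] [<Filename>]]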

Backup parameters

Parameter - Meaning

  • -h - obtain usage
  • -d - debug flag
  • -l - enables backup of the Check Point Security Gateway log (by default, logs are not backed up)
  • -p or --purge - delete old backups from previous backup attempts
  • [--sched [on hh:mm <-m DayOfMonth> | <-w DaysOfWeek>] | off] - schedule interval at which backup is to take place
    • On - specify time and day of week, or day of month
    • Off - disable schedule
  • --tftp <ServerIP> [-path <Path>] [<Filename>] - list of IP addresses of TFTP servers to which the configuration will be backed up, and optionally the filename
  • --scp <ServerIP> <Username> <Password> [-path <Path>] [<Filename>] - list of IP addresses of SCP servers to which the configuration will be backed up, the username and password used to access the SCP server, and optionally the filename
  • --ftp <ServerIP> <Username> <Password> [-path <Path>] [<Filename>] - list of IP addresses of FTP servers to which the configuration will be backed up, the username and password used to access the FTP server, and optionally the filename
  • --file [-path <Path>] [<Filename>] - when the backup is performed locally, specify an optional filename

Note - If a Filename is not specified, a default name will be provided with the following format:
backup_hostname.domain-name_day of month_month_year_hour_minutes.tgz

For example: backup_gateway1.mydomain.com_13_11_2003_12_47.tgz
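The default naming pattern in the note above can be reproduced with a short Python sketch. The function name default_backup_name is hypothetical, and zero-padding of the date and time fields is an assumption; the text only shows an example with two-digit values.

```python
from datetime import datetime


def default_backup_name(fqdn, when):
    """Build backup_<host.domain>_<dd>_<mm>_<yyyy>_<HH>_<MM>.tgz.

    Zero-padding of the day, month, hour and minute fields is an assumption.
    """
    return "backup_{}_{:02d}_{:02d}_{}_{:02d}_{:02d}.tgz".format(
        fqdn, when.day, when.month, when.year, when.hour, when.minute
    )


# Reproduces the example name given in the text.
print(default_backup_name("gateway1.mydomain.com", datetime(2003, 11, 13, 12, 47)))
# backup_gateway1.mydomain.com_13_11_2003_12_47.tgz
```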

Examples

backup --file -path /tmp filename

Puts the backup file in (local) /tmp and names it filename

backup --tftp <ip1> -path tmp --tftp <ip2> -path var file1 --scp <ip3> username1 password1 -path /bin file2 --file file3 --scp <ip4> username2 password2 file4 --scp <ip5> username3 password3 -path mybackup

The backup file is saved on:

  1. TFTP server with ip1, in the tmp directory (under the TFTP server default directory, usually /tftproot), with the default file name backup_SystemName_TimeStamp.tgz
  2. TFTP server with ip2, in the var directory (under the TFTP server default directory, usually /tftproot), as file1
  3. SCP server with ip3, in /bin, as file2
  4. Locally, in the default directory (/var/CPbackup/backups), as file3
  5. SCP server with ip4, in the username2 home directory, as file4
  6. SCP server with ip5, in ~username3/mybackup/, with the default backup file name

Configuring automatic backups

For this tutorial we will use the following settings:

Item: Value

  • FTP Server: 10.22.2.99
  • FTP Username: mikem
  • FTP Password: vpn123
  • Backup Schedule: Every Sunday @ 01:00

To list the active backup schedules:

  1. Login to the SecurePlatform machine in Expert Mode.
  2. Run cat /var/CPbackup/conf/backup_sched.conf to verify that there are no currently configured automatic backups that you will be overwriting.
    If it returns with a "file not found" error or if it returns back to the command prompt without showing any details, then there are no automatic backups currently configured.

    Here we see that the backup configuration file has not yet been created, so we can move on to setting up the automatic backup.

To configure the automatic backup schedule:

  1. Using our example configuration, run the following command:
    backup --sched on 01:00 -w 7 --ftp 10.22.2.99 mikem vpn123
  2. Run cat /var/CPbackup/conf/backup_sched.conf to list the backup configuration file.

    The configuration file has been created.

You can also view the crontab to see that backup_util sched is in the list of scheduled jobs. The crontab is the table of scheduled jobs that the cron daemon runs.

To list the scheduled jobs in crontab:

  • Run crontab -l.

    You can see that SecurePlatform backup is configured to run every Sunday at 01:00am and transfer the file to the FTP server we defined.
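To help read the crontab listing, the following Python sketch shows how standard cron syntax expresses a weekly schedule such as "every Sunday at 01:00". It illustrates only the five time fields of standard cron. The exact line that SecurePlatform writes is not shown in this guide, and the function name cron_time_fields is hypothetical.

```python
def cron_time_fields(hhmm, day_of_week):
    """Return the five standard cron time fields for a weekly schedule.

    hhmm is "HH:MM" in 24-hour time; day_of_week follows the cron
    convention where 0 is Sunday (many implementations also accept 7).
    """
    hour_text, minute_text = hhmm.split(":")
    hour, minute = int(hour_text), int(minute_text)
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("time out of range: " + hhmm)
    if not 0 <= day_of_week <= 7:
        raise ValueError("day of week must be 0-7")
    # Field order: minute, hour, day of month, month, day of week.
    return "{} {} * * {}".format(minute, hour, day_of_week)


print(cron_time_fields("01:00", 0))  # 0 1 * * 0
```

In the crontab output, a job whose time fields read 0 1 * * 0 (or 0 1 * * 7) runs at 01:00 every Sunday.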

How to Set Up a Site-to-Site VPN with a 3rd-party Remote Gateway

Generating an Internal CA Certificate

To generate an internal CA certificate for your security gateway object:

  1. In the General Properties window of your Security Gateway, make sure the IPSec VPN checkbox is marked.

  2. Click OK, the following window is displayed.

  3. Click OK. An internal CA certificate is created, and your security gateway object should now have a key symbol at its bottom right.

Setting up the VPN

To set up the VPN:

You need to create an object to represent the peer gateway.

  1. Right click on Network Objects and select: New > Others > Interoperable Device

  2. Give the gateway a name, IP address, and (optional) description in the properties dialog window that is displayed.

  3. In the IPSec VPN tab in your SmartDashboard, right click in the open area on the top panel and select: New Community > Star.

  4. A Star Community Properties dialog pops up. In the General menu, enter a name for your VPN community:

  5. In the Center Gateways menu, click: Add, select your local Check Point gateway object, and click OK.

  6. In the Satellite Gateways menu, click: Add, select the peer gateway object, and click OK.

  7. In the VPN Properties menu, you can change the Phase 1 and Phase 2 properties.

    Note - Make a note of the values you select in order to set the peer to match them.

  8. In the Tunnel Management menu you can define how to set up the tunnel.

    Note - The recommended tunnel sharing method is: One VPN tunnel per subnet pair.
    This shares your networks on either side of the VPN, makes the Phase 2 negotiation smooth, and requires fewer tunnels to be built for the VPN.

    If you need to restrict access over the VPN, you can do that later through your security rulebase.

  9. Expand the Advanced Settings menu and select: Shared Secret. Mark the Use only Shared Secret for all External members checkbox. Select your peer gateway from the entries in the list below and click Edit to edit the shared secret.

    Note - Remember this secret, as your peer will need it to set up the VPN on the other end.

  10. Expand the Advanced Settings menu and select: Advanced VPN Properties. Here, you can modify the more advanced settings regarding Phase 1 and 2.

    Note - Keep note of the values used. It is also a good idea to select Disable NAT inside the VPN community, so that you can access resources behind your peer gateway using their real IP addresses, and vice versa.

  11. Click OK on the VPN community properties dialog to exit back to the SmartDashboard. You may see the following message:

  12. We are about to address the VPN domain setup in the next section, so click Yes to continue.
    Now you can see your VPN community defined:
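
The "One VPN tunnel per subnet pair" setting recommended in step 8 can be illustrated with a short Python sketch. It enumerates the subnet pairs between two VPN domains and compares that count with the number of address pairs that per-host tunnels could require. The networks are hypothetical examples, and the address-pair figure is an upper bound used only for comparison.

```python
from ipaddress import ip_network
from itertools import product


def subnet_pairs(local_nets, peer_nets):
    """Return every (local, peer) subnet pair; each pair shares one tunnel."""
    local = [ip_network(n) for n in local_nets]
    peer = [ip_network(n) for n in peer_nets]
    return list(product(local, peer))


# Hypothetical VPN domains, for illustration only.
local_domain = ["10.1.0.0/24", "10.2.0.0/24"]
peer_domain = ["192.168.50.0/24"]

pairs = subnet_pairs(local_domain, peer_domain)
for local, peer in pairs:
    print(local, "<->", peer)

# Upper bound on tunnels if each pair of addresses needed its own tunnel.
address_pairs = sum(l.num_addresses * p.num_addresses for l, p in pairs)
print(len(pairs), "subnet pairs versus", address_pairs, "possible address pairs")
```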

Defining VPN Domains

You now need to define your VPN domains.
If you have not already done so, create network objects to represent your local networks and the peer networks they will be sharing with you.

To define VPN domains:

  1. From the Network Objects menu, right click on Networks and select Network to define a new network. In the following image, we are creating a network to represent our peer's internal network that they will be sharing with us:

  2. If you or your peer is sharing more than one network over the tunnel, create groups to represent each side's VPN domain. From the Network Objects menu, right click on Groups, select Groups and then Simple Group...

    In this example, we are only sharing one network, so the group will only have one object included, but you can put as many networks in this group as you would like to share. It is important not to add groups within a group as this can impact performance. Make sure the group is "flat". Give your group a meaningful name such as: Local_VPN_Domain. Click OK once you have added all of your local networks and then repeat the procedure to create a group to represent your peer's shared networks.

    Now you need to set the VPN domains for each of the gateways.

  3. Open the properties for your local Check Point gateway object. Select the Topology menu. In the VPN Domain section, select Manually defined, and from the drop-down list, select your Local VPN domain group object.

  4. Click OK to save the object.
  5. Open the properties for the peer gateway and select the group/network that represents its VPN domain:

  6. Click OK to complete the peer gateway configuration.
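
The advice in step 2 to keep the VPN domain group "flat" can be sketched in Python. Given a nested group definition, the function below expands it into the flat list of networks that the group should contain directly. The object names and the dictionary layout are hypothetical; this is not an export format of SmartDashboard.

```python
def flatten_group(group, groups):
    """Expand nested group names into a flat, de-duplicated list of networks.

    `groups` maps a group name to its members; a member that is itself a key
    in `groups` is treated as a nested group.
    """
    flat, seen_groups = [], set()

    def visit(name):
        if name in seen_groups:  # guard against circular nesting
            return
        seen_groups.add(name)
        for member in groups[name]:
            if member in groups:
                visit(member)
            elif member not in flat:
                flat.append(member)

    visit(group)
    return flat


# Hypothetical object names, for illustration only.
groups = {
    "Local_VPN_Domain": ["Net_10.1.0.0", "Branch_Nets"],
    "Branch_Nets": ["Net_10.2.0.0", "Net_10.3.0.0"],
}
print(flatten_group("Local_VPN_Domain", groups))
```

The resulting list is what you would add directly to the VPN domain group instead of nesting Branch_Nets inside it.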

Creating a Rule for the Traffic

Now you have both objects set up for VPN and you have defined your community. All that is left is to create a rule for the traffic.

Here is where you should restrict access if it is required.

To create a rule for the traffic:

  1. Decide where in your rule base you need to add your VPN access rule and right click the number on the rule just above where you want it and select: Add Rule --> Below.

    In this example, we are allowing any service across the tunnel in both directions.

    You should also explicitly set the VPN community in the VPN column on your rule.

  2. In the VPN column, right click the Any Traffic icon and select: Edit Cell.... Select the: Only connections encrypted in specific VPN Communities option button and click Add. Select the VPN community created in the above steps and click OK and then OK again.

    Your rule should now show the VPN community in the VPN column.

Completing the Procedure

Install the policy to your local Check Point gateway. The VPN is set up!

Verifying the Procedure

Once the remote side has set up their VPN to match, verify that you have secure communication with their site.

How to Configure Management HA

HA Background Information

Why Management High Availability?

The Security Management server includes several databases with information on the system, such as objects, users, and policy information. Every time the administrator makes changes to the system, this data changes. It is crucial to make a backup for this data, so that information is not lost in the event of a failure. If the Security Management server fails, or is down, a backup server needs to be in place to take over operations. If the primary Security Management (SmartCenter) server is down and there is no backup in place, operations by the Security Gateway, such as retrieval of the CRL, or fetching of the Security Policy cannot take place.

Security Management Server Backup

In Management High Availability, the Active Security Management server always has one or more backup Standby Security Management servers that are ready to take over from the Active Security Management server in case of failure. These must all be of the same OS and version (including HFAs and plug-ins).


The existence of the Standby Security Management server allows for backups to be in place:

  • The database of objects and users, policy information and ICA files are stored on both the Standby Security Management server, as well as on the Active Security Management server. These are synchronized, so that data is maintained and ready to be used. If the Active Security Management server is down, a Standby Security Management server needs to become active, in order to be able to edit and install the Security Policy.
  • Operations such as fetching a Security Policy, or retrieving a CRL can be performed on the Standby Security Management server.

Data Backed Up by the Standby Security Management Servers

For Management High Availability to function properly, the following data must be synchronized and backed up:

  • Configuration and ICA data, such as:
    • Databases (such as the Objects and Users)
    • Certificate information, such as Certificate Authority data and the CRL that is available to be fetched by the Check Point Security Gateways
    • The latest installed Security Policy. The installed Security Policy is the applied Security Policy. The Security Gateways must be able to fetch the latest Security Policy from either the Active or the Standby Security Management Server.

Synchronization Modes

There are two ways to perform synchronization:

  • Manual synchronization is a process initialized by the system administrator. It can be set to synchronize Databases, or Databases and the installed Security Policy.
    The former option synchronizes quicker than the latter option. It should be the preferred mode of synchronization, provided that the system administrator has edited the objects or the Security Policy, but has not installed the newly edited Security Policy, since the previous synchronization.
  • Automatic synchronization is a process configured by the system administrator to allow the Standby Security Management server to be synchronized with the Active Security Management server, at set intervals of time. This is generally the preferred mode of synchronization, since it keeps the Standby Security Management server updated. The basis for the synchronization schedule is that when the Security Policy is installed, both the installed Security Policy and all the databases are synchronized. Additionally, it is possible to synchronize the Standby Security Management servers when:
    • The system administrator saves the Security Policy
    • At a specified scheduled time

Even when automatic synchronization has been selected as the synchronization mode, it is possible to perform a manual synchronization.

Synchronization Status

The possible synchronization statuses are:

  • Never been synchronized - immediately after the Secondary Security Management server has been installed it has not yet undergone the first manual synchronization that brings it up to date with the Primary Security Management server.
  • Synchronized - the peer is properly synchronized and has the same database information and installed Security Policy
  • Lagging - the peer Security Management Server has not been synchronized properly. For instance, because the Active Security Management Server has undergone changes since the previous synchronization (objects have been edited, or a Security Policy has been newly installed), the information on the Standby Security Management Server is lagging.
  • Advanced - the peer Security Management Server is more up-to-date.
  • Collision - the Active Security Management Server and its peer have different installed Security Policies and databases. The administrator must perform manual synchronization, and decide which of the Security Management Servers to overwrite.
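
The relationship between these statuses can be sketched as a small Python function. The three boolean inputs are a simplified, hypothetical model of "what changed since the last synchronization"; they are not how the product tracks state internally, and treating "both sides changed" as a Collision is an assumption based on the descriptions above.

```python
from enum import Enum


class SyncStatus(Enum):
    NEVER_SYNCHRONIZED = "Never been synchronized"
    SYNCHRONIZED = "Synchronized"
    LAGGING = "Lagging"
    ADVANCED = "Advanced"
    COLLISION = "Collision"


def peer_status(ever_synced, active_changed, peer_changed):
    """Derive the peer's status from what changed since the last sync.

    All three inputs are hypothetical flags used only for illustration.
    """
    if not ever_synced:
        return SyncStatus.NEVER_SYNCHRONIZED
    if active_changed and peer_changed:
        return SyncStatus.COLLISION  # both sides diverged
    if active_changed:
        return SyncStatus.LAGGING    # peer is behind the Active server
    if peer_changed:
        return SyncStatus.ADVANCED   # peer is more up to date
    return SyncStatus.SYNCHRONIZED


print(peer_status(True, True, False).value)  # Lagging
```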

Creating a Secondary Management Server

To create the Secondary Security Management Server:

  1. From the Menu, select Manage > Network Objects > Check Point > New > Host
  2. From Object Properties, under Network Policy Management, select Secondary Server.
  3. You will be prompted to specify whether it is the Primary Management or the Secondary Management. Select Secondary Management.
  4. Create a network object to represent the Secondary Security Management server:
  5. Select Manage > Network Objects > Check Point > New > Host

    • In the Software Blades section, select the Management tab, and select Secondary Server. Logging and Status will be selected automatically as well.
    • Initialize SIC between the Secondary Security Management server and the Primary Security Management server by selecting Communication.
  6. If a Security Gateway is installed on the Security Management server machine, install the policy first before doing a manual synchronization.
  7. Synchronize the Secondary Security Management server with the Active Security Management server by selecting: Policy > Management High Availability and then select Synchronize.

Putting the Active Management Server on Standby

To change the Active Management Server to Standby:

  1. On the Active Security Management server, select Policy > Management High Availability to display the Management High Availability Servers window.
  2. In the displayed window, select Change to Standby.

Promoting the Standby Management Server to Active

To change the Standby Management Server to Active:

  1. Log in to the Standby Security Management server. The Standby window is displayed.
  2. Select Change to Active.

Refreshing the Secondary Management Server Synchronization Status

The status of the Security Management server may have changed. You may decide to do a refresh.

To refresh the synchronization status of a Security Management server:

  1. Select Policy > Management High Availability to display the Management High Availability Servers window for the selected Security Management server.
  2. In the displayed window, select Refresh.

Selecting Synchronization Method

  • The Policy > Global Properties - Management High Availability window defines the way in which the Standby Security Management server synchronizes with the Active Security Management server.

  • The Secondary Security Management server can be synchronized automatically when the policy is saved or installed, or on a scheduled event. It can also be synchronized manually. If manual synchronization is the chosen method, you will need to initiate it in the Management High Availability Servers window.

Tracking Management High Availability

To check the status of the Security Management servers:

  • Select Policy > Management High Availability. This shows information about the Security Management server, which includes its peers - displayed with the name, status and type of Security Management server.
  • All management HA operations can be seen in Audit Mode in SmartView Tracker.

How To Troubleshoot Memory Leaks on IPSO

Background on UNIX Memory Management

The IPSO kernel manages several pools of memory that can be allocated to either kernel functions or processes. The memory pools can be viewed with vmstat -z, which displays all of the Universal Memory Allocator zones. Each zone is constructed with a distinctive name, and statistics about the usage:

  • Zone references by slab children
  • Memory usage of the zone
  • Number of total requests in the zone

Child memory pools, called slabs, are connected directly to the zones, either by the kernel or by running processes. Every slab is a child of a UMA memory zone; the zone sets certain size restrictions on the type of memory pages that may be requested. The list of memory slabs can be viewed with vmstat -m. Memory slabs are used simply for efficiency: it is much faster to create a slab of memory as a child of a pre-initialized UMA zone than to create a memory zone from scratch. The UMA memory zones are children of a "primitive" structure called a Keg (outside the scope of this document, see references). This hierarchy of memory allocation is relevant because it can help you track down a kernel virtual memory zone whose used pages count keeps increasing and never goes down. This is the definition of a memory leak within the kernel.
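
The leak definition above (a used count that keeps increasing and never goes down) can be turned into a simple check over periodic samples. The Python sketch below assumes the per-zone used counts have already been collected into dictionaries, for example from repeated vmstat -z runs. Parsing the command output is not attempted here, and the zone names and counts are hypothetical.

```python
def suspect_leaks(samples, min_growth=1):
    """Return zone names whose used count never drops and grows overall.

    `samples` is a time-ordered list of dicts mapping a zone name to its
    used count at that point in time.
    """
    if len(samples) < 2:
        return []
    suspects = []
    for zone in samples[0]:
        series = [s[zone] for s in samples if zone in s]
        never_drops = all(b >= a for a, b in zip(series, series[1:]))
        grew = series[-1] - series[0] >= min_growth
        if len(series) >= 2 and never_drops and grew:
            suspects.append(zone)
    return suspects


# Hypothetical zone names and counts, for illustration only.
samples = [
    {"zone_a": 100, "zone_b": 40},
    {"zone_a": 180, "zone_b": 35},
    {"zone_a": 260, "zone_b": 42},
]
print(suspect_leaks(samples))  # ['zone_a']
```

Growth over a short window is not proof of a leak; the longer the sampling period, the more meaningful a never-decreasing count becomes.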

Memory within the child slabs or UMA zones is allocated on a page basis by the virtual memory manager. The size of these memory pages is fixed within the system, usually at 4 KB. The IPSO memory allocator works by maintaining a set of lists that are ordered by increasing powers of two. Each list contains a set of memory blocks of its corresponding size. To fulfill a memory request, the size of the request is rounded up to the next power of two. A piece of memory is then removed from the list corresponding to the specified power of two and returned to the requester.

Thus, a request for a block of memory of size 53 returns a block from the 64-sized list. This sub-4 KB sizing occurs within the 4 KB slices that are allocated to a page. For example, suppose a memory operation is happening within a particular slab, and the requesting function needs three slots of memory: two of 512 bytes and one of 2048 bytes. Since all of this fits within a single virtual memory (VM) page, a single page can be used to store the data. The single VM page is allocated to a single requesting memory slab.
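
The rounding rule and the single-page example above can be checked with a few lines of Python. This is a sketch of the arithmetic only, not of the IPSO allocator itself; the function names are hypothetical, and the 4096-byte page size is the usual value mentioned in the text.

```python
PAGE_SIZE = 4096  # bytes; the text gives 4 KB as the usual page size


def round_up_pow2(size):
    """Round a positive request size up to the next power of two."""
    if size < 1:
        raise ValueError("size must be positive")
    return 1 << (size - 1).bit_length()


def fits_in_one_page(requests):
    """Check whether the rounded-up requests fit within a single page."""
    return sum(round_up_pow2(r) for r in requests) <= PAGE_SIZE


print(round_up_pow2(53))                   # 64
print(fits_in_one_page([512, 512, 2048]))  # True
```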

  • Keg - Provides structural information about the memory allocation
  • UMA Zone - Inherits information from a certain Keg structure, and further defines sizing and key attributes
  • Memory slab - Inherits information from one or more UMA zones, and is linked to directly by the requesting task

Any modern operating system uses a Virtual Memory system to allocate memory space to the kernel and applications. This virtual system maps memory requests to a contiguous address space, which in turn maps to non-contiguous physical page locations. The virtual memory handler is used to prevent resource exhaustion of physical memory. By tracking memory accesses and allocation/free operations, it is able to dynamically free up physical memory as needed by moving the contents of a physical memory page to disk; this is called swapping. Swapping of memory is normal and happens all the time as part of memory management. However, during standard operating conditions, a Check Point firewall should NOT swap.

A portion of the physical memory can be reserved as WIRED memory. Wired memory refers to memory pages that may not be swapped out of physical memory to disk. Often the main usage of Wired memory is for core kernel memory structures. Programs running on the system may request a certain percentage of their memory space to be reserved as Wired memory. The total memory allocations and usage may be viewed using the top command.

Most often this document will be used because there is a suspected memory leak on a system. The usual reason for suspecting a memory leak is that the system crashed. A system would crash because there was insufficient Free memory, no other type of memory could be freed (possibly too much was Wired, or the Virtual Memory system overflowed), and a new memory allocation for a core system function failed. In this instance it is normal for the system to crash or lock up.

After the system has been recovered with a hard power cycle, there are very few clues about what originally caused the problem. The logfiles most often could not be written to, because there was no memory available to initialize the function that would write the log. All crucial IPSO kernel counters available via ipsctl would be cleared on a system reboot. Finally, the top, vmstat, and ps output would also be cleared.

In the case where there is no available core file to analyze, an analysis must be done on the newly-booted system to determine if the problem is persistent and likely to occur again.

The Memory Leak Detection Script Accompanying this Document

The script accompanying this document is intended to help trace memory leaks and determine if there is indeed a memory leak present, so that the Check Point development team may be engaged to try and help narrow down the exact cause. The script by itself will only serve as instrumentation.

If the script is aborted or the system crashes before the script is done, the raw stats will still be present, but the script must be edited (by removing the data collection portion and all sections above it) and rerun to generate the final output.

Running the Memory Leak Detection Script

You must download and copy the mem-html.sh script to the target system. You may wish to read the script before executing it. One important consideration is that it is very CPU intensive and may conceivably cause traffic loss on production systems. For typical operations it is recommended to run the script with the following syntax:

sh mem-html.sh 3600

This will execute the script, and run for approximately one hour – 3600 seconds. In the case where you are concerned with the CPU utilization of the script, you may run it with the following syntax:

nice +20 sh mem-html.sh 3600

The script will be executed with the lowest possible CPU priority on the system; however the results may be less accurate.

The script works by writing an HTML page and several subdirectories containing raw statistics. The raw stats are compiled and written in an abbreviated format to the HTML page as graphs and tables.

Script Output, Collected Before the Intensive Data Collection

The script will collect several important data points and include them at the top of the document.

The basic information which is collected includes the duration of the script run, the kernel version, uptime, and date when the script was executed. The top output is also collected for later reference.

The vmstat outputs listed are collected before, and after, the script data collection.

Intensive Data Collection

Until the end timer is reached, the script will iterate through data collection using ps and top. This output is collected in the appropriate subdirectory.
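A timed collection loop of this kind might look like the sketch below. It is not taken from mem-html.sh: the function name collect_samples, the SAMPLE_CMD variable, the sample.N file naming, and the default of ps aux are all assumptions made for illustration, and it relies on date +%s being available.

```shell
#!/bin/sh
# Illustrative sketch only; not the contents of mem-html.sh.
# Usage: collect_samples DURATION_SECONDS INTERVAL_SECONDS OUTPUT_DIR
# SAMPLE_CMD (hypothetical) is the command to record; it defaults to "ps aux".
collect_samples() {
    duration=$1
    interval=$2
    outdir=$3
    mkdir -p "$outdir" || return 1

    # Compute the end timer once, then sample until it is reached.
    end=$(( $(date +%s) + duration ))
    n=0
    while [ "$(date +%s)" -lt "$end" ]; do
        n=$((n + 1))
        # One raw output file per iteration, kept for later analysis.
        ${SAMPLE_CMD:-ps aux} > "$outdir/sample.$n" 2>&1
        sleep "$interval"
    done

    # Report how many samples were written.
    echo "$n"
}
```

For example, collect_samples 3600 10 ./raw-ps would write one file roughly every ten seconds for an hour. Keeping each sample in its own file means the raw data survives even if the run is interrupted.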

Wrapping up the Data

Once the script's main loop is complete, the data is graphed within the HTML page using CSS. This is to assist the analyst by providing a historical plot of the memory utilization in megabytes.

Tables are also created containing the start and end values of both kernel memory stats, and process memory stats. This allows the tracking of memory leaks both within the kernel space, as well as userland.
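The start-versus-end comparison behind such tables can be sketched as follows. This is a simplified illustration, not the script's own logic: the function name grown_entries is hypothetical, and each snapshot file is assumed to hold one "name value" pair per line (for example, a zone or process name and its memory count), which is not the raw vmstat or ps format.

```shell
#!/bin/sh
# Illustrative sketch only. Assumed input: two files with "name value" lines.
# Usage: grown_entries START_SNAPSHOT END_SNAPSHOT
# Prints "name growth" for every entry whose value increased, largest first.
grown_entries() {
    awk '
        # First file: remember the starting value for each name.
        NR == FNR { start[$1] = $2 + 0; next }
        # Second file: report entries that grew since the start.
        ($1 in start) && ($2 + 0 > start[$1]) { print $1, $2 - start[$1] }
    ' "$1" "$2" | sort -k2,2nr
}
```

An entry that appears near the top of this list on every run, while the system load is steady, is a reasonable candidate for closer inspection.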

A leak in the kernel should be easily detectable: Free memory shrinks consistently while Wired memory grows consistently. There should also be a steadily growing Virtual Memory slab that corresponds to the system function with the leak. Finally, the RSS for every process would drop as the virtual memory system tries to reclaim the least frequently used memory pages from the running processes.

A leak in a userland process would show similar behavior, except that there would be a large increase in the RSS and VSZ of the leaking process. The system *should* normally panic in these circumstances, either when the process's reserved Wired memory grows beyond the available Free space, or when the Virtual Memory address space is exhausted. It has been observed that a system may simply hang rather than panic and generate a core file.

As a convenience to the analyst, the file descriptor allocations are also tracked via the script, since file descriptor leaks could also be interpreted as a memory leak.

All of the raw data that is used by the script is stored in the subdirectories so that the analyst may import the results into a spreadsheet such as Excel and plot them.

Script Output

The resulting index.html and subdirectories are put into a script-output.tgz tarball, and the original files and directories are then deleted. The script-output.tgz must be collected and provided to Check Point TAC for analysis. Check Point will extract the contents of the tarball.
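The packaging step can be sketched as below. This is an assumption about how such a step might be written, not an excerpt from the script: the function name package_output is hypothetical, and the caller must pass the actual file and directory names.

```shell
#!/bin/sh
# Illustrative sketch only; not an excerpt from mem-html.sh.
# Usage: package_output FILE_OR_DIR...
# Archives the given paths into script-output.tgz in the current directory,
# then deletes the originals only if the archive was created successfully.
package_output() {
    tar -czf script-output.tgz "$@" && rm -rf "$@"
}
```

The contents of the tarball can be listed with tar -tzf script-output.tgz and extracted with tar -xzf script-output.tgz.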

Analysis of the Memory Leak Detection Script

The analyst of the script output must consider several items:

  • How busy is the system?
    • The system may only leak memory when there is a lot of traffic
    • Running the script during a maintenance window may not yield the desired output

  • How long has it been up?
    • A system could leak memory over very long timeframes - potentially weeks or months

  • Has the memory gone up or down normally, or abnormally?
    • During a normal day, a system will request and release large amounts of memory
    • "Abnormal" growth could indicate the problem, but it must be considered with everything else that is happening

  • Where is the memory going?
    • Depending on the uptime and how busy the system is, memory can be allocated to Wired, Inactive and process memory space in a way that looks wrong
    • This depends on too many factors for any single indicator to be conclusive

  • Has the condition that triggered the memory leak occurred?
    • A memory leak is usually triggered by a specific function that requests memory and never releases it. If this function doesn't execute, memory will not leak.
    • It is important to run the script on the actual system that is leaking memory, or a proven replication

When running the script, please ensure that an appropriate duration is selected. Unless the memory leak is extreme, running the script for hours, days or weeks may be required.

Additional Tools

The following commands may prove useful when trying to track down memory leaks:

  • top
  • ps -auxwwlSHm
  • vmstat -z
  • vmstat -m
  • vmstat 1 5
  • vmstat -s
  • top -mio -S -H

IPSO includes an additional debug kernel. Since IPSO is FreeBSD-based, you can consult the FreeBSD documentation for more information about what each kernel flag does. The additional flags that the debug kernel is compiled with are:

  • Witness
  • Witness_KDB
  • Witness_Skipspin
  • Invariants
  • Invariants_Support
  • Diagnostic
  • MBUF_Stamps
  • KDB_Unattended

Choosing to run this kernel.debug_g is extremely unwise unless directed to do so by Check Point engineering. One purpose of this debug kernel is to crash/panic when an error is detected, which happens much more frequently than on a non-debug kernel. Therefore the kernel is very rarely provided to a customer and should instead be used in a lab where the error has occurred.