Stopping spam

From GWAVA4

Automatic Configuration Mode

Image:Spam1.png

By default, GWAVA starts spam scanning in Automatic Configuration mode which uses the heuristic spam engine. For more information on how heuristics work see Heuristic Scanning.

The scores cannot be set while in this mode. GWAVA uses a hardcoded score threshold of 7.5, which should block a reasonable amount of spam with few, if any, false positives. The results at this stage will not be the best; this mode serves only as an interim measure until the administrator can switch over to the newer spam engine.

GWAVA will use the heuristic scanning engine until you have fed GWAVA 1000 examples of ham and 1000 examples of spam. For information on how to gather examples of ham and spam see Training. Once you have the appropriate set of ham and spam, the antispam engine will automatically switch over to the new engine, which is based on probability.

How GWAVA calculates the probability is somewhat complicated, but essentially GWAVA looks at the word content of the message and will give you a spam probability based on what it has learned from your examples of ham and spam. After GWAVA switches over you will see a screen that resembles the following.

Image:Automatic2.png

You will notice that the spam percentage is given instead of a spam score. This means that if the probability engine is 97% sure the message is spam, it will block the message and place it in the quarantine.

The thresholds are not editable here, but you can set them yourself if you switch to simple mode and use the probability scoring method. See Probability Scanning.

Simple Configuration Mode

Heuristic Scanning

Heuristic scanning is similar to the spam scanning that was done in GWAVA 3.

With heuristic scanning, a series of pre-compiled rules (regular expressions) are executed against the components of the messages. Each rule is assigned a score and the total score is the summation of the scores of all of the rules that fired.
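The scoring model described above can be sketched in a few lines of Python. This is only an illustration of "sum the scores of the rules that fired": the three rules and their scores are invented for the example and are not taken from any GWAVA pcr file, and the pcr file format itself is not shown here.

```python
import re

# Hypothetical rules for illustration only; real GWAVA rules live in the pcr file.
RULES = [
    (re.compile(r"\bfree money\b", re.IGNORECASE), 4.0),
    (re.compile(r"\bclick here\b", re.IGNORECASE), 2.5),
    (re.compile(r"!{3,}"), 1.5),
]

def heuristic_score(text, rules=RULES):
    """Return the sum of the scores of every rule that matches the text."""
    return sum(score for pattern, score in rules if pattern.search(text))

message = "FREE MONEY!!! Click here to claim."
print(heuristic_score(message))  # 8.0
```

In this example all three rules fire, so the total of 8.0 would exceed the hardcoded 7.5 used in Automatic Configuration mode.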

A starting pcr file is provided with your GWAVA installation. Your old GWAVA 3 rule file may be used instead if you have written several rules of your own or have done extensive optimization. To import it, copy the old pcr file from groupwisedomain\gwava\config\spamcfg\compiled.pcr on your NetWare server to /opt/beginfinite/gwava/services/asengine/configs/<interface_ID>/ on your GWAVA 4 server.

In order to use heuristic scanning you must browse to the Heuristics page inside GWAVAMAN. Set the Antispam configuration mode to Simple. Then click on Show spam scanner settings and make sure that Score is selected for the Score Method.

Image:Heuristics.png

  • Quarantine at this threshold - all messages exceeding this score (but below the Delete message at this threshold score) will be blocked and quarantined.
  • Delete message at this threshold - all messages exceeding this score will be blocked and will NOT be put in the quarantine.

Probability Scanning

This is the new spam engine that will be used by GWAVA once you have provided 1000 samples of ham and 1000 samples of spam to the antispam engine. DO NOT use this mode until you have a corpus of 1000 ham and 1000 spam, because until then nothing will be blocked. The details on how to train the engine can be found in the training section.

Probability scanning returns a spam probability (from 0% to 100%), rather than a score. The probability returned closely reflects the ham and spam that you have used to train the engine.
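To give a feel for how a probability can be learned from examples of ham and spam, here is a minimal naive Bayes sketch. It is NOT GWAVA's algorithm, which this page does not document; it is a generic textbook technique, and the tiny corpora, function names, and add-one smoothing are all assumptions made for the example.

```python
import math
from collections import Counter

def train(messages):
    """Count how many messages in a corpus contain each word."""
    counts = Counter()
    for text in messages:
        counts.update(set(text.lower().split()))
    return counts

def spam_probability(text, spam_counts, ham_counts, n_spam, n_ham):
    """Naive Bayes estimate of P(spam | words), with add-one smoothing."""
    log_spam = math.log(n_spam / (n_spam + n_ham))
    log_ham = math.log(n_ham / (n_spam + n_ham))
    for word in set(text.lower().split()):
        log_spam += math.log((spam_counts[word] + 1) / (n_spam + 2))
        log_ham += math.log((ham_counts[word] + 1) / (n_ham + 2))
    # Convert the log-odds back into a probability between 0 and 1.
    return 1 / (1 + math.exp(log_ham - log_spam))

# Toy corpora; a real engine needs far more examples (GWAVA asks for 1000 of each).
spam = ["cheap pills online", "cheap watches online now"]
ham = ["meeting notes attached", "lunch meeting at noon"]
spam_counts, ham_counts = train(spam), train(ham)

p = spam_probability("cheap pills", spam_counts, ham_counts, len(spam), len(ham))
print(f"{p:.1%}")
```

Because the result depends entirely on the training counts, the sketch also shows why the returned probability tracks the ham and spam you supply.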

In order to use probability scanning you must browse to the Heuristics page inside GWAVAMAN. Set the Antispam configuration mode to Simple. Then click on Show spam scanner settings and make sure that Probability is selected for the Score Method.

Image:Probability.png

  • Quarantine at this threshold - set to 97% by default. If the antispam engine is 97% sure the message is spam, the message will be both blocked and quarantined. You may lower the threshold below 97%, which will increase the amount of spam caught, but will also increase the possibility of false positives.
  • Delete message at this threshold - set to 99.9% by default. If the antispam engine is 99.9% sure that the message is spam, the message will be blocked and NOT quarantined.
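The two thresholds above partition messages into three outcomes. The following sketch models that decision; the function name and return strings are hypothetical, and treating a probability exactly equal to a threshold as reaching it is an assumption, since this page does not state how GWAVA handles the boundary.

```python
def simple_mode_action(probability, quarantine_at=97.0, delete_at=99.9):
    """Map a spam probability (0-100) to the simple-mode outcome.

    Assumption: a value equal to a threshold counts as reaching it.
    """
    if probability >= delete_at:
        return "block"                  # blocked, NOT quarantined
    if probability >= quarantine_at:
        return "block and quarantine"
    return "deliver"

for p in (50.0, 98.2, 99.95):
    print(p, "->", simple_mode_action(p))
```

Lowering `quarantine_at` widens the middle band, which is the trade-off between catching more spam and risking more false positives.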

Combined Probability Scanning

This method uses an experimental algorithm to combine the output scores of the heuristics and probability systems. Presently it is not recommended and is subject to change in future updates.

Probability with Score override

This method combines the benefits of both the probability system and a trained heuristics system. By default, the probability system gives a resulting value to determine if a message is spam or not. The heuristics system then provides a score, and if that score exceeds the defined threshold, the message is flagged as spam, overriding the probability system if it did not detect the message.

When implementing this method, it is important to ensure the heuristic scores are well defined. The method is usually used to combat very specific spam words with heuristic scores, rather than relying on the probability system to learn them.

Score with Probability override

This method combines the benefits of both the heuristics system and a trained probability system. By default, the heuristics system gives a resulting value to determine if a message is spam or not. The probability system then provides its determination, and if it detects the message as spam, it overrides the heuristics system if that system did not detect the message.

When implementing this method, it is important that the probability system is well trained so that overrides are accurate. This method would typically be used when transitioning from a heuristic score (PCR) to the more accurate probability system.
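The two override methods differ only in which engine is consulted first and which one acts as the fallback. The sketch below models both with a single hypothetical function; the default thresholds (97% and 7.5) are borrowed from defaults mentioned elsewhere on this page and are assumptions here, as is the boundary handling.

```python
def verdict(probability, score, primary, prob_threshold=97.0, score_threshold=7.5):
    """Return (is_spam, decided_by) for either override method.

    primary="probability" models "Probability with Score override";
    primary="score" models "Score with Probability override".
    """
    hits = {
        "probability": probability >= prob_threshold,
        "score": score >= score_threshold,
    }
    secondary = "score" if primary == "probability" else "probability"
    if hits[primary]:
        return True, primary
    if hits[secondary]:
        # The primary engine missed the message; the other engine overrides it.
        return True, secondary + " override"
    return False, None

# The probability engine is unsure, but a specific heuristic rule fired strongly.
print(verdict(50.0, 9.0, primary="probability"))  # (True, 'score override')
```

Either way a message is flagged when at least one engine detects it, so the accuracy of the overriding engine determines how many false positives the override adds.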

Advanced Configuration Mode

Image:Spam1.png

The advanced configuration mode allows you to have multiple thresholds and provides options to rewrite the subject or add headers for additional filtering with the mail client.

In order to use advanced configuration mode you must browse to the Heuristics page inside GWAVAMAN. Set the Antispam configuration mode to Advanced. Then click on Show spam scanner settings and make sure that Probability or Score is selected for the Score Method. Thresholds will be provided as either scores or probability percentages.

Image:Advanced1.png

Configuring Thresholds

The advantage of having multiple thresholds is the administrator can set up different services to be activated at different score thresholds in a more granular fashion than permitted by the simple configuration mode.

GWAVA has the capability of configuring up to five threshold levels.

For example, here's a potential way to use multiple thresholds:

  1. Messages below a spam probability of 95% will pass through GWAVA unaltered.
  2. Messages returning a spam probability of 95%-97% will be quarantined for review purposes. These items are probably spam, but the administrator is being conservative regarding false positives. The original message will be delivered to the user.
  3. Messages returning a 97%-99.9% probability will be both blocked and quarantined. These items are almost certainly spam.
  4. Messages exceeding a 99.9% probability will be blocked, but not quarantined.
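The four cases above amount to a lookup from probability band to a set of services. A minimal sketch, assuming hypothetical service names and assuming that a value equal to a lower bound falls into the higher band:

```python
# (lower bound in percent, services) pairs taken from the example scenario,
# ordered from the highest threshold down.
THRESHOLDS = [
    (99.9, {"block"}),
    (97.0, {"block", "quarantine"}),
    (95.0, {"quarantine"}),
]

def services_for(probability, thresholds=THRESHOLDS):
    """Return the services of the highest threshold the probability reaches."""
    for lower_bound, services in thresholds:
        if probability >= lower_bound:
            return services
    return set()  # below every threshold: the message passes through unaltered

print(sorted(services_for(96.0)))  # ['quarantine']
```

A message at 96% is quarantined for review but still delivered, because "block" is not among the services of its band.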

Here is how this scenario could be configured:

Image:Advanced2.png

NOTE: In order to set the interval for 'spam threshold 3' we put a percentage of 95% in 'spam threshold 2'. This is necessary to define a lower limit, which gives spam threshold 3 the interval from 95%-97%.
TIP: In the Quarantine Management System (QMS), the administrator will probably want to configure Digest Services to include spam threshold 4, but exclude spam threshold 3, as digestible events.

Rewriting the Spam Message Subject

GWAVA may be configured to rewrite the subject of an email, depending on the threshold. A common reason to do so is to allow users to create client-based rules that treat messages differently depending on the threshold ("Probably spam" vs "Definitely Spam"). In organizations where individual message filtering is emphasized, it might be appropriate to use this service.

For example, in addition to wanting to place messages with spam probabilities of 95%-97% in the quarantine, it might be useful to put the spam percentage in the subject. Check Enable subject rewrite for spam threshold 3 and type in the desired message to include in the subject. The two icons on the right insert variables. The Image:subjicon.png button will insert a variable that will include the original subject. The Image:spamscoreicon.png button will insert a variable that will include the email spam score in the subject line. For our example we will use %%subject (%%spamscore), which will return the original subject followed by the spam percentage in parentheses.
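The effect of the template can be previewed with a small substitution sketch. The two variable names come from the example above; the function itself and the exact text GWAVA inserts for the spam score (shown here as "96%") are assumptions.

```python
import re

def rewrite_subject(template, subject, spam_score):
    """Expand the %%subject and %%spamscore variables in a rewrite template."""
    values = {"subject": subject, "spamscore": spam_score}
    # A single pass, so variable-like text inside the original subject
    # is not expanded a second time.
    return re.sub(r"%%(subject|spamscore)", lambda m: values[m.group(1)], template)

print(rewrite_subject("%%subject (%%spamscore)", "Cheap watches", "96%"))
# Cheap watches (96%)
```

A client-side rule can then match on the text in parentheses at the end of the subject.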

Image:AdvancedSubject.png

Junk Mail Handling

Instead of (or in addition to) rewriting the subject, GWAVA can add an X-Spam-Status: Yes or X-Spam-Status: No field to the e-mail header. The client can then filter on this header field.
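As an illustration of what filtering on this header looks like, the following sketch uses Python's standard email package to check a raw message. The sample message and the function name are made up for the example; how a particular mail client implements its filter will differ.

```python
from email import message_from_string

# A made-up message as it might look after GWAVA has added the header.
raw = (
    "From: sender@example.com\n"
    "To: user@example.com\n"
    "Subject: Cheap watches\n"
    "X-Spam-Status: Yes\n"
    "\n"
    "Buy now.\n"
)

def is_flagged_as_spam(raw_message):
    """True when the message carries an X-Spam-Status header starting with 'Yes'."""
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    return status.strip().lower().startswith("yes")

print(is_flagged_as_spam(raw))  # True
```

Header names are matched case-insensitively by the email package, so `x-spam-status` would be found as well.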

Novell GroupWise systems can use this particularly effectively. If /xspam is added to gwia.cfg and the GWIA is restarted, then all messages with X-Spam-Status: Yes in their MIME headers will automatically be submitted to the GroupWise Junk Mail Handling options, set at the client.

Simply check Flag the message as 'Junk' for client handling to activate this option.

Keeping with our example from above, we will change 'spam threshold 4' to add the X-Spam-Status header. This way any message that GWAVA thinks is spam will end up in the client junk mail folder. The block and quarantine services will be turned off.

Image:AdvancedJunk.png

Message Services

A brief explanation of what each service does is given here.

Image:Attachment_Blocking_3.jpg Block - Do not allow this message to be delivered to the user.

Image:Attachment_Blocking_6.jpg Notify Administrator - Send an e-mail to the administrator about the message. The specific contents of the e-mail are based upon a template configured in Configuring Notification Options.

Image:Attachment_Blocking_4.jpg Notify Originator - Send an e-mail to the sender of the message. The specific contents of the e-mail are based upon a template configured in Configuring Notification Options.

Image:Attachment_Blocking_5.jpg Notify Recipient - Send an e-mail to the recipient of the message. (Often this only makes sense in conjunction with a Block.) The specific contents of the e-mail are based upon a template configured in Configuring Notification Options.

Image:Attachment_Blocking_8.jpg Quarantine - Store the message in the GWAVA Quarantine Management System (QMS). Users with accounts in the QMS will be able to access these messages, and possibly release or forward them, depending on their configured rights. These messages may also be included in an HTML digest, if the QMS administrator has turned this function on (Digest Services).

SURBL

Image:Spam2.png

SURBL is a powerful antispam technique, one of the most effective available today. When SURBL is enabled, GWAVA checks one or more blacklist servers to see if the message received contains URLs or embedded image links that reference a domain used by known spammers.

Check Enable SURBL event and select the services you wish to associate with the SURBL event. To add an additional SURBL server, enter the server hostname in the field provided and click add.

Default SURBL servers installed by the "Stop Spam" wizard are:

  • multi.surbl.org
  • black.uribl.com
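URI blacklists of this kind are queried over DNS: the domain found in the message is prepended to the blacklist zone, and the domain is treated as listed if that name resolves. The sketch below shows the general mechanism only; it is not GWAVA's implementation, the function names are hypothetical, and extracting the registered domain from a URL is left out.

```python
import socket

def surbl_query_name(domain, zone="multi.surbl.org"):
    """Build the DNS name that is looked up to test a domain against a URI blacklist."""
    return f"{domain.strip('.').lower()}.{zone}"

def is_listed(domain, zone="multi.surbl.org"):
    """Return True if the blacklist zone returns an address for the domain."""
    try:
        socket.gethostbyname(surbl_query_name(domain, zone))
        return True
    except OSError:
        # Resolution failed: the domain is not listed (or the lookup was refused).
        return False

print(surbl_query_name("Example.COM"))  # example.com.multi.surbl.org
```

Note that some blacklist operators refuse queries arriving through large public DNS resolvers, so a failed lookup is not always proof that a domain is clean.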


Image:SURBL.png

RBL

Image:Spam3.png

RBL is a common antispam technique. With RBL, GWAVA checks one or more blacklist servers to see if the message received is from an IP address associated with known spammers.

Check Enable RBL event and select the services you wish to associate with the RBL event. To add an additional RBL server, enter the server hostname in the field provided and click add.

Default RBL servers installed by the "Stop Spam" wizard are:

  • sbl-xbl.spamhaus.org
  • bl.spamcop.net
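IP-based blacklists are also queried over DNS. For an IPv4 address, the octets are reversed and prepended to the blacklist zone; if the resulting name resolves, the address is listed. The following sketch builds that query name. The function name is hypothetical, and the address used is from a range reserved for documentation.

```python
import ipaddress

def rbl_query_name(ip, zone="bl.spamcop.net"):
    """Reverse the IPv4 octets and append the blacklist zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone

print(rbl_query_name("192.0.2.10"))  # 10.2.0.192.bl.spamcop.net
```

Passing anything that is not a valid IPv4 address raises `ipaddress.AddressValueError`, which keeps malformed input out of the DNS query.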

Image:RBL2.png
