Monday, April 28, 2008

Bayesian Spam Filters - How Do They Work?

Bayesian spam filters, a type of scoring content-based spam filter, analyze the contents of a mail and calculate the probability of the message being spam. They build up a list of characteristics typical of spam as well as of good emails. The advantage of Bayesian spam filters is that they build up this list of characteristics themselves and do not depend on a manually built list.
Bayesian spam filters more or less try to emulate how you personally identify your spam emails. One look at an email tells you whether it is genuine or spam, and the probability that you will characterize a good mail as spam is practically zero. Ideally, spam filters would work the same way. Bayesian spam filters, at least, are a step in this direction.
Spam Filtering
Suppose the word textile often appears in your legitimate mails but never in your spam mails. The probability of the word textile indicating spam is then close to zero. On the other hand, the words Nigeria and lottery appear quite often, and at times almost exclusively, in spam - a pattern made famous by the 419 scams out of Nigeria and elsewhere in Africa.
For a Bayesian spam filter, these two words, Nigeria and lottery, therefore have a very high probability of indicating spam - as much as 100 percent.
Whenever you receive a new message, the Bayesian spam filter analyzes it and uses its individual characteristics to calculate the probability of it being spam. If the message contains both the word textile and the word Nigeria or lottery, these words pull in opposite directions, and the filter cannot tell from them alone whether the message is genuine or spam. It then analyzes other characteristics of the message to assess the probability of it being either spam or legitimate.
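The scoring described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular filter is implemented: the per-word probabilities in `word_spam_prob` are made-up values, the function name `spam_probability` is hypothetical, and the words are assumed to be independent of each other (the usual "naive" Bayes simplification). Probabilities are kept slightly away from 0 and 1 so that no single word decides the outcome alone.

```python
import math
import re

# Hypothetical per-word spam probabilities a filter might have learned.
# They are kept away from exactly 0 and 1 so one word cannot decide alone.
word_spam_prob = {
    "textile": 0.01,
    "nigeria": 0.99,
    "lottery": 0.99,
}

def spam_probability(message, probs, unknown=0.5):
    """Combine per-word spam probabilities, assuming independent words."""
    log_spam = 0.0
    log_ham = 0.0
    # Each distinct word counts once; unseen words are neutral (0.5).
    for word in set(re.findall(r"[a-z]+", message.lower())):
        p = probs.get(word, unknown)
        log_spam += math.log(p)
        log_ham += math.log(1.0 - p)
    # P(spam) = prod(p) / (prod(p) + prod(1 - p)), computed in log space.
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))

print(spam_probability("Textile order shipped to Nigeria", word_spam_prob))
print(spam_probability("Nigeria lottery and textile", word_spam_prob))
```

With these made-up values, a message containing only textile and Nigeria scores about 0.5, so the filter is undecided. Once lottery appears as well, the score rises to about 0.99.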
Bayesian Spam Filters - Adapting Automatically
Once the message has been classified, as shown above, it can be used to further train the spam filter. This is how it works in the above scenario:
If the message is classified as spam, the probability of the word textile indicating legitimate mail is lowered. If the message is classified as legitimate mail, the probability of the words Nigeria or lottery - whichever was used - indicating spam is lowered in turn.
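This adaptation can be sketched as simple bookkeeping: the filter counts how many spam and legitimate messages each word has appeared in, and recomputes a word's spam probability from those counts. The class name `WordStats`, its methods, and the add-one smoothing are assumptions chosen for illustration, not taken from any specific filter.

```python
from collections import Counter

class WordStats:
    """Per-word message counts used to estimate spam probabilities."""

    def __init__(self):
        self.spam_counts = Counter()
        self.ham_counts = Counter()
        self.spam_messages = 0
        self.ham_messages = 0

    def train(self, message, is_spam):
        # Count each distinct word once per message.
        words = set(message.lower().split())
        if is_spam:
            self.spam_counts.update(words)
            self.spam_messages += 1
        else:
            self.ham_counts.update(words)
            self.ham_messages += 1

    def spam_prob(self, word):
        # Add-one smoothing keeps the estimate away from exactly 0 or 1.
        spam_rate = (self.spam_counts[word] + 1) / (self.spam_messages + 2)
        ham_rate = (self.ham_counts[word] + 1) / (self.ham_messages + 2)
        return spam_rate / (spam_rate + ham_rate)

stats = WordStats()
for _ in range(3):
    stats.train("textile order", is_spam=False)
    stats.train("nigeria lottery", is_spam=True)

before = stats.spam_prob("textile")
# The mixed message from the scenario is classified as spam.
stats.train("textile nigeria", is_spam=True)
after = stats.spam_prob("textile")
print(before, after)
```

After the mixed message is recorded as spam, the spam probability of textile rises, which means it is a weaker sign of legitimate mail than before. Recording the same message as legitimate would instead lower the spam probability of Nigeria.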
The advantage of Bayesian spam filters is that they adapt themselves by learning from their own decisions as well as from the user's decisions, where these are made manually. This automatic adaptability makes Bayesian spam filters an excellent fit for individual email users: most spam emails have very similar, and at times identical, characteristics, whereas the characteristics of legitimate mails are different for each individual.
--------------------------------------------------------------------------------------------------------------
The author is an admin and technical expert associated with the development of security and performance enhancing software like Registry Cleaner, Anti Spyware, Window Cleaner.


