Talk:Naive Bayes classifier/Archives/2015


Two steps?

Is the example complete? I mean, some articles speak about two steps:

1) preclassification (get statistics from training data, calculate conditional probability and classify...)

and

2) Bayesian Classification (update a priori probability and classify...).


Where is the 2nd step? Don't you update the probabilities? Don't we need to iterate anything?

regards

— Preceding unsigned comment added by 81.202.7.175 (talkcontribs)

Which articles in particular are you referring to? QVVERTYVS (hm?) 21:48, 26 November 2014 (UTC)

little p versus capital P

Can somebody explain to me what p(...) means? Honestly, what symbols am I allowed to write inside the brackets of *little* p, and how is it defined? I know P(...)... I am once more very confused about why probabilities are computed using density functions, and the difference between p and P is the reason for much of the trouble caused by this article.

(Edit: see this question of mine). — Preceding unsigned comment added by 78.51.30.89 (talk) 21:50, 29 May 2015 (UTC)

Sex classification example

Hi, the example added last August about sex classification is puzzling me; could anyone tell me how to compute P(height | man)? The author gave the value 1.5789 with a note stating that "probability distribution over one is OK. It is the area under the bell curve that is equal to one", which also puzzles me.

Anyway, the only example of naive Bayes classification for real-valued features I have found is available here, and the author uses the probability density function of a normal distribution with estimated parameters to compute the probability P(temperature=66). How correct is that? It is very surprising to me to use a PDF to compute a probability. -- Sam —Preceding unsigned comment added by 77.248.94.92 (talk) 18:55, 24 September 2010 (UTC)

Using probability densities instead of probabilities is correct and necessary because continuous random variables are involved. -- X7q (talk) 22:08, 6 December 2011 (UTC)

The example math is incorrect as stated above; this is not the correct method to compute probabilities. Note that the proposed method (sample - mean)/stdev gives nonsensical answers, including that it is least probable to measure the sample mean and infinitely probable to measure something infinitely far from the sample mean. The correct method is to use the standard normal distribution. — Preceding unsigned comment added by 134.134.139.70 (talk) 21:28, 17 November 2011 (UTC)

"proposed method (sample - mean)/stdev" - it doesn't look to me that the original author of the example proposed that. More like somebody didn't understood his derivation and inserted this dumb formula, and it somehow survived in the article. I've removed it. Yes, normal distribution's probability density formula is what is needed there. -- X7q (talk) 22:08, 6 December 2011 (UTC)
Not the standard normal distribution, though, but a normal distribution with the parameters learned during training. -- X7q (talk)
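
A minimal sketch in Python of the approach described above, assuming Gaussian class-conditional densities with the mean and variance estimated from the training sample (the variable names and numbers are only illustrative):

import math

def gaussian_pdf(x, mean, variance):
    # Density of a normal distribution with the given mean and variance.
    return math.exp(-(x - mean) ** 2 / (2 * variance)) / math.sqrt(2 * math.pi * variance)

# Parameters estimated from the "male" training samples for the feature "height".
mean_height_male = 5.855
var_height_male = 0.035033

# p(height = 6 | male): this is a density, not a probability, so it can exceed 1.
likelihood = gaussian_pdf(6.0, mean_height_male, var_height_male)
print(likelihood)  # roughly 1.58

This is only the likelihood term for one feature; in the full classifier it is multiplied with the densities of the other features and the class prior.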

Can someone post the correct way to compute this? I am still having trouble understanding. Edit: Yes, thanks! — Preceding unsigned comment added by 98.248.214.237 (talk) 21:38, 6 December 2011 (UTC)

I've made a few changes to this section just now. Looks any better to you? -- X7q (talk) 22:08, 6 December 2011 (UTC)


I propose a revamp of the notation in the Testing section. I understand that it is trying to convey a plain-English message to those who are not mathematically inclined, but it looks tacky. I'm going to start editing and hope people are OK with it. — Preceding unsigned comment added by LinuxN877 (talkcontribs) 05:38, 15 December 2012 (UTC)

The example for P(height|man) is wrong: The PDF for value 6 given mean 5.855 and sd 0.035033 is 0.002170381 (and not approx. 1.5789) 85.10.127.15 (talk) 12:15, 15 July 2015 (UTC)
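
For reference, a self-contained numeric check in Python reproduces both figures, depending on whether 0.035033 is read as the variance of the male height sample or as its standard deviation (I am assuming the article intends the former, since that is what yields 1.5789):

import math

def gaussian_pdf(x, mean, variance):
    # Density of a normal distribution with the given mean and variance.
    return math.exp(-(x - mean) ** 2 / (2 * variance)) / math.sqrt(2 * math.pi * variance)

print(gaussian_pdf(6.0, 5.855, 0.035033))       # ~1.5789, treating 0.035033 as the variance
print(gaussian_pdf(6.0, 5.855, 0.035033 ** 2))  # ~0.00217, treating 0.035033 as the standard deviation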