Fast Recap:
- Estimated Probability
Recap:
Probability that a generic pattern $x$, drawn from pdf $p(x)$, belongs to an arbitrary region $R$ of the feature space: $P = \int_R p(x')\,dx'$
If $k$ out of $n$ data are in $R$, we can estimate $P$ via the relative frequency: $P \approx k/n$
- Practical Problem: if the region $R$ isn't big enough, then even where the pdf is roughly constant over it, hardly any data fall inside and the estimated probability tends to $0$.
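The relative-frequency estimate and the small-region problem can be sketched in a few lines. This is only an illustration under assumptions that are not in the notes: the data are 1-D samples from a standard normal, the region is an interval centred on 0, and `relative_frequency` is a hypothetical helper name.

```python
import random

def relative_frequency(data, lo, hi):
    """Estimate P(x in [lo, hi)) as k/n: the fraction of data inside the region."""
    k = sum(1 for x in data if lo <= x < hi)
    return k / len(data)

random.seed(0)  # fixed seed, only so the sketch is reproducible
# Assumed toy data: n = 1000 samples from a standard normal.
data = [random.gauss(0.0, 1.0) for _ in range(1000)]

# Shrink an interval centred on x = 0 while keeping n fixed.
for half_width in (1.0, 0.1, 0.001, 0.000001):
    p_hat = relative_frequency(data, -half_width, half_width)
    print(f"half-width {half_width:g}: k/n = {p_hat:.4f}")
```

With $n$ fixed, the narrowest intervals will typically contain no samples at all, so $k/n$ drops to $0$ regardless of the true density at that point.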
As $n$ (our number of data) increases, we consider a sequence of regions $R_1, R_2, \dots, R_n$, where the subscript is the number of data used, such that $x$ belongs to each of them, where $x$ is our pattern of interest (~Ex.: if we are searching for the pdf of males in a specific university, we use the region $R_n$ to estimate $p(x)$ from the data). Let:
- $V_n$ be the volume of $R_n$.
- $k_n$ be the number of data (out of $n$) in $R_n$.
- $p_n(x)$ be the $n$-th estimate of $p(x)$, ~Ex.: $p_n(x) = \frac{k_n/n}{V_n}$.
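As a worked illustration of the $n$-th estimate (relative frequency of the region divided by its volume), take made-up numbers that are assumptions and not values from these notes: $n = 100$ data, $k_n = 10$ of them in the region, and region volume $V_n = 0.5$.

$$p_n(x) = \frac{k_n/n}{V_n} = \frac{10/100}{0.5} = 0.2$$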
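Asymptotic Necessary and Sufficient Conditions: we want to ensure $\lim_{n\to\infty} p_n(x) = p(x)$:
- $\lim_{n\to\infty} V_n = 0$, $\lim_{n\to\infty} k_n = \infty$, $\lim_{n\to\infty} k_n/n = 0$ (to guarantee convergence)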
→ To satisfy these conditions, we can do one of two things:
- Fix a proper volume, say $V_n = V_1/\sqrt{n}$, and determine $k_n$ consequently (Parzen Window).
- Fix $k_n$ (~Ex.: $k_n = \sqrt{n}$), and determine $V_n$ consequently, in such a way that exactly $k_n$ patterns fall in $R_n$ ($k_n$-nearest neighbor).
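The two strategies can be compared with a minimal 1-D sketch. It rests on assumptions that are not in the notes: the region is an interval centred on the query point (so its "volume" is its length), the window is a plain box rather than a smooth kernel, and `parzen_estimate` and `knn_estimate` are hypothetical names.

```python
import math

def parzen_estimate(x, data, v1=1.0):
    """Parzen-window style estimate: fix V_n = v1 / sqrt(n), then count k_n."""
    n = len(data)
    v_n = v1 / math.sqrt(n)            # 1-D "volume" = interval length
    half = v_n / 2.0
    k_n = sum(1 for xi in data if abs(xi - x) <= half)
    return (k_n / n) / v_n

def knn_estimate(x, data):
    """k_n-nearest-neighbour estimate: fix k_n = sqrt(n), then find V_n."""
    n = len(data)
    k_n = max(1, round(math.sqrt(n)))
    distances = sorted(abs(xi - x) for xi in data)
    radius = distances[k_n - 1]        # distance to the k_n-th nearest pattern
    v_n = 2.0 * radius                 # smallest centred interval reaching that pattern
    if v_n == 0.0:
        return math.inf                # degenerate case: k_n patterns coincide with x
    return (k_n / n) / v_n

# Assumed toy data: 100 evenly spaced points in (0, 1), true density 1 on that interval.
grid = [(i + 0.5) / 100 for i in range(100)]
print(parzen_estimate(0.5, grid), knn_estimate(0.5, grid))
```

In the Parzen variant the interval length is decided by $n$ alone and the count varies with the data; in the nearest-neighbor variant the count is decided by $n$ and the interval grows or shrinks until it reaches the $k_n$-th closest pattern (ties may put more than $k_n$ patterns inside it).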
