Fast Recap:
  • Estimated Probability

Recap:

Probability $P$ that a generic pattern $x$, drawn from pdf $p(x)$, belongs to an arbitrary region $R$ of the feature space:

$$P = \int_{R} p(x')\,dx'$$

If $k$ out of $n$ data are in $R$, we can estimate $P$ via the relative frequency:

$$P \approx \frac{k}{n}$$
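As a minimal sketch of the relative-frequency estimate, the snippet below draws data from a standard normal pdf (an assumption made only for illustration) and compares the fraction of data falling in a one-dimensional region with the exact probability. The function names and the interval are hypothetical, not part of the original notes.

```python
import math
import random


def relative_frequency(samples, lo, hi):
    """Return k/n, the fraction of samples that fall in the region [lo, hi]."""
    k = sum(1 for x in samples if lo <= x <= hi)
    return k / len(samples)


def exact_probability(lo, hi):
    """Integral of the standard normal pdf over [lo, hi], via the error function."""
    def cdf(t):
        return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    return cdf(hi) - cdf(lo)


# Assumed setup: 20000 data drawn from N(0, 1), region R = [-0.5, 0.5].
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(20000)]

p_hat = relative_frequency(samples, -0.5, 0.5)   # estimate k/n
p_true = exact_probability(-0.5, 0.5)            # exact P for this pdf
print(f"k/n = {p_hat:.4f}, exact P = {p_true:.4f}")
```

With more data the two values should get closer, which is the sense in which the relative frequency estimates the probability of the region.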

  • Practical Problem: Consider that if the region $R$ isn't really big enough, and the pdf is constant over it, the probability $P$ will tend to $0$, so with a finite number of data the region may contain none of them.
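The step from a probability estimate to a density estimate can be sketched as follows, assuming the pdf is roughly constant over $R$ and writing $V$ for the volume of $R$ (a symbol introduced here for the derivation):

```latex
\begin{align*}
P &= \int_{R} p(x')\,dx' \approx p(x)\,V
  && \text{($p$ roughly constant over $R$)} \\
P &\approx \frac{k}{n}
  && \text{(relative frequency)} \\
\Rightarrow\quad p(x) &\approx \frac{k/n}{V}
\end{align*}
```

The two approximations pull in opposite directions: the first wants a small $V$, while the second needs enough data inside $R$ for $k/n$ to be reliable.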

As $n$ (our number of data) increases, we build a sequence of regions $R_1, R_2, \dots, R_n$, where the subscript is the number of data available when that region is used, such that $x_0$ belongs to each of them, where $x_0$ is our pattern of interest (~Ex.: if we are searching for the pdf of males in a specific university, we use the region $R_n$ to estimate $p(x_0)$ from $n$ data).

  • Let $V_n$ be the volume of $R_n$.
  • Let $k_n$ be the number of data (out of $n$) in $R_n$.
  • Let $p_n(x_0)$ be the $n$-th estimate of $p(x_0)$, ~Ex.: $p_n(x_0) = \frac{k_n / n}{V_n}$.
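The estimate $(k_n / n) / V_n$ can be sketched in one dimension, where the region is taken to be an interval centred at the pattern of interest and its "volume" is the interval length. The function name and the Gaussian data are assumptions for illustration only.

```python
import random


def density_estimate(samples, x0, volume):
    """Estimate p(x0) as (k_n / n) / V_n.

    The region R_n is assumed to be the interval of length `volume`
    centred at x0; k_n is the number of samples inside it.
    """
    half = volume / 2.0
    k_n = sum(1 for x in samples if abs(x - x0) <= half)
    return (k_n / len(samples)) / volume


# Assumed setup: 5000 data from N(0, 1), estimate the density at x0 = 0.
rng = random.Random(1)
samples = [rng.gauss(0.0, 1.0) for _ in range(5000)]

for volume in (2.0, 0.5, 0.1, 0.001):
    print(f"V = {volume:<6} estimate = {density_estimate(samples, 0.0, volume):.4f}")
```

A large volume averages the pdf over a wide region, while a very small one leaves few or no data in the region, so the estimate becomes noisy or drops to zero.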

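Asymptotic Necessary and Sufficient Conditions: we want to ensure $\lim_{n \to \infty} p_n(x_0) = p(x_0)$:

  1. $\lim_{n \to \infty} V_n = 0$
  2. $\lim_{n \to \infty} k_n = \infty$
  3. $\lim_{n \to \infty} k_n / n = 0$ (to guarantee convergence)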

⇒ To satisfy these conditions, we can do one of two things:

  1. Fix a proper volume, say $V_n = 1/\sqrt{n}$, and determine $k_n$ consequently (Parzen Window).
  2. Fix $k_n$ (~Ex.: $k_n = \sqrt{n}$), and determine $V_n$ consequently, in such a way that exactly $k_n$ patterns fall in $R_n$ ($k_n$-nearest neighbor).
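The two strategies can be compared with a small one-dimensional sketch. It assumes an interval centred at the pattern of interest as the region, a volume of 1/sqrt(n) for the Parzen-style estimate, and sqrt(n) neighbours (rounded) for the nearest-neighbor estimate. The function names are hypothetical, and the hard-edged interval is a simplification of the general window function.

```python
import math
import random


def parzen_estimate(samples, x0):
    """Fix V_n = 1/sqrt(n), then count the k_n samples inside the interval."""
    n = len(samples)
    v_n = 1.0 / math.sqrt(n)
    k_n = sum(1 for x in samples if abs(x - x0) <= v_n / 2.0)
    return (k_n / n) / v_n


def knn_estimate(samples, x0):
    """Fix k_n = round(sqrt(n)), then take the smallest centred interval
    that contains exactly k_n samples and use its length as V_n.

    Assumes continuous data, so the k_n-th distance is strictly positive.
    """
    n = len(samples)
    k_n = max(1, round(math.sqrt(n)))
    distances = sorted(abs(x - x0) for x in samples)
    v_n = 2.0 * distances[k_n - 1]
    return (k_n / n) / v_n


# Assumed setup: data from N(0, 1); the true density at 0 is 1/sqrt(2*pi).
rng = random.Random(2)
for n in (100, 10000):
    samples = [rng.gauss(0.0, 1.0) for _ in range(n)]
    print(n, parzen_estimate(samples, 0.0), knn_estimate(samples, 0.0))
```

In the first function the volume is decided in advance and the count adapts to the data. In the second the count is decided in advance and the volume adapts, shrinking where data are dense and growing where they are sparse.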

Original Files: