Naive Bayes: Predicting the Probability of Storm Days 🌪️
In this post, I use the Naive Bayes classifier to predict whether a day at a North Sea wind farm will become a storm day 🌪️.
For more information on Naive Bayes, you can read this post on DataCamp.

My features are wind speed 💨, air pressure 🌡️, and cloud cover ☁️, each recorded as simple categories such as high, medium, low.
From a small historical dataset, we estimate the prior and conditional probabilities, then use Naive Bayes to compute how likely it is that a new day with given conditions (wind speed, pressure, clouds) turns into a storm.
Storm prediction dataset
| WindSpeed | Pressure | Cloud | StormDay |
|---|---|---|---|
| high | low | high | yes |
| high | low | low | yes |
| high | normal | high | yes |
| medium | low | high | yes |
| medium | normal | low | no |
| low | high | low | no |
| low | normal | low | no |
| medium | high | high | no |
Question
What is the probability of a storm day when
\[(\text{Wind Speed}, \text{Pressure}, \text{Cloud}) = (\text{low}, \text{low}, \text{high}) \, ?\]
The feature vector is
\[X = (\text{Wind Speed}, \text{Pressure}, \text{Cloud}) = (\text{low}, \text{low}, \text{high}).\]
The probability of a storm day given this feature vector is
\[P(\text{StormDay = yes} \mid \text{WindSpeed = low}, \text{Pressure = low}, \text{Cloud = high}).\]
Using the Naive Bayes assumption that the features are conditionally independent given the class, we write
\[P(X \mid \text{yes}) = P(\text{low wind} \mid \text{yes}) \cdot P(\text{low pressure} \mid \text{yes}) \cdot P(\text{high cloud} \mid \text{yes}).\]
From the dataset:
- Total days: \(8\)
- Storm “yes”: \(4 \Rightarrow P(\text{yes}) = 4/8 = 0.5\)
- Storm “no”: \(4 \Rightarrow P(\text{no}) = 4/8 = 0.5\)
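The priors above are plain relative frequencies, which can be checked with a minimal sketch. The rows are copied from the table; the names `DATA` and `class_priors` are illustrative choices, not part of the original post.

```python
from collections import Counter
from fractions import Fraction

# Rows copied from the dataset table: (WindSpeed, Pressure, Cloud, StormDay).
DATA = [
    ("high", "low", "high", "yes"),
    ("high", "low", "low", "yes"),
    ("high", "normal", "high", "yes"),
    ("medium", "low", "high", "yes"),
    ("medium", "normal", "low", "no"),
    ("low", "high", "low", "no"),
    ("low", "normal", "low", "no"),
    ("medium", "high", "high", "no"),
]


def class_priors(rows):
    """Return P(class) as exact fractions, estimated by relative frequency."""
    counts = Counter(row[-1] for row in rows)
    total = len(rows)
    return {label: Fraction(n, total) for label, n in counts.items()}


priors = class_priors(DATA)
print(priors["yes"], priors["no"])  # 1/2 1/2
```

Using `Fraction` instead of floats keeps the arithmetic exact, so the results can be compared directly with the hand calculation.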
Looking only at the 4 storm‑yes days:
- WindSpeed = low with storm = yes: \(0/4 \Rightarrow P(\text{low wind} \mid \text{yes}) = 0\)
- Pressure = low with storm = yes: \(3/4 \Rightarrow P(\text{low pressure} \mid \text{yes}) = 0.75\)
- Cloud = high with storm = yes: \(3/4 \Rightarrow P(\text{high cloud} \mid \text{yes}) = 0.75\)
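The three conditional probabilities can be reproduced by filtering the table to the storm = yes rows and counting matches. This is a sketch under the assumption that each feature is looked up by column position; `conditional` and `FEATURES` are hypothetical helper names.

```python
from fractions import Fraction

# Rows copied from the dataset table: (WindSpeed, Pressure, Cloud, StormDay).
DATA = [
    ("high", "low", "high", "yes"),
    ("high", "low", "low", "yes"),
    ("high", "normal", "high", "yes"),
    ("medium", "low", "high", "yes"),
    ("medium", "normal", "low", "no"),
    ("low", "high", "low", "no"),
    ("low", "normal", "low", "no"),
    ("medium", "high", "high", "no"),
]
FEATURES = ("WindSpeed", "Pressure", "Cloud")


def conditional(rows, feature_index, value, label):
    """Unsmoothed P(feature = value | class = label), estimated by counting."""
    in_class = [row for row in rows if row[-1] == label]
    matches = sum(1 for row in in_class if row[feature_index] == value)
    return Fraction(matches, len(in_class))


query = ("low", "low", "high")
for i, name in enumerate(FEATURES):
    # Prints 0, 3/4 and 3/4 for the three features.
    print(name, query[i], conditional(DATA, i, query[i], "yes"))
```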
Substituting into the Naive Bayes likelihood:
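\[P(X \mid \text{yes}) = 0 \times 0.75 \times 0.75 = 0.\]
So the unnormalized score for “storm = yes” is
\[P(\text{yes}) \cdot P(X \mid \text{yes}) = 0.5 \times 0 = 0.\]
A single zero conditional probability wipes out the whole product, no matter how strongly the other features point to a storm.
The same happens for “storm = no”: none of the 4 no-storm days has low pressure, so \(P(\text{low pressure} \mid \text{no}) = 0\) and
\[P(\text{no}) \cdot P(X \mid \text{no}) = 0.5 \times \left(\tfrac{2}{4} \times 0 \times \tfrac{1}{4}\right) = 0.\]
With both scores equal to zero, the normalized posterior \(P(\text{storm = yes} \mid X)\) becomes \(0/0\), so unsmoothed Naive Bayes cannot rank the two classes for this \(X\).
In a real model, we would use Laplace smoothing so a single missing combination does not force a score to 0.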
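To see how a zero count propagates, here is a minimal sketch that computes the unsmoothed score (prior times the product of conditional frequencies) for both classes. The function name `unsmoothed_score` is an illustrative choice. For this query, each class product contains a zero factor.

```python
from fractions import Fraction

# Rows copied from the dataset table: (WindSpeed, Pressure, Cloud, StormDay).
DATA = [
    ("high", "low", "high", "yes"),
    ("high", "low", "low", "yes"),
    ("high", "normal", "high", "yes"),
    ("medium", "low", "high", "yes"),
    ("medium", "normal", "low", "no"),
    ("low", "high", "low", "no"),
    ("low", "normal", "low", "no"),
    ("medium", "high", "high", "no"),
]


def unsmoothed_score(rows, query, label):
    """P(label) * prod_i P(query[i] | label), using plain relative frequencies."""
    in_class = [row for row in rows if row[-1] == label]
    score = Fraction(len(in_class), len(rows))  # class prior
    for i, value in enumerate(query):
        matches = sum(1 for row in in_class if row[i] == value)
        score *= Fraction(matches, len(in_class))
    return score


query = ("low", "low", "high")
scores = {label: unsmoothed_score(DATA, query, label) for label in ("yes", "no")}
# Both scores are 0 here: no storm day has low wind, and no
# no-storm day has low pressure, so the scores cannot be normalized.
print(scores["yes"], scores["no"])
```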
Fixing the zero‑probability with Laplace smoothing
In the unsmoothed Naive Bayes calculation, we got
\[P(\text{low wind} \mid \text{storm = yes}) = 0,\]
which made
\[P(X \mid \text{storm = yes}) = 0\]
for
\[X = (\text{low wind}, \text{low pressure}, \text{high cloud}).\]
To avoid this, we apply Laplace smoothing with \(\alpha = 1\).
For a categorical feature value \(x\) and class \(y\),
\[P_{\text{Laplace}}(x \mid y) = \frac{\text{count}(x, y) + \alpha}{\text{count}(y) + \alpha \cdot K},\]
where \(K\) is the number of possible values of that feature.
In our example:
- Wind speed categories: \(K_{\text{wind}} = 3\) (low, medium, high)
- Pressure categories: \(K_{\text{pressure}} = 3\) (low, normal, high)
- Cloud categories: \(K_{\text{cloud}} = 2\) (low, high)
- Storm = yes days: \(\text{count}(\text{yes}) = 4\)
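The smoothing formula translates into a one-line helper. This is a sketch, assuming `alpha` is an integer (or a `Fraction`) so the arithmetic stays exact; the name `laplace` is illustrative.

```python
from fractions import Fraction


def laplace(count_xy, count_y, k, alpha=1):
    """Smoothed estimate (count(x, y) + alpha) / (count(y) + alpha * K).

    alpha is assumed to be an int or Fraction so the result is exact.
    """
    return Fraction(count_xy + alpha, count_y + alpha * k)


# Counts for the 4 storm = yes days, read off the dataset table.
print(laplace(0, 4, k=3))  # low wind     -> 1/7
print(laplace(3, 4, k=3))  # low pressure -> 4/7
print(laplace(3, 4, k=2))  # high cloud   -> 2/3
```

A quick sanity check on the choice of \(K\): the smoothed estimates over all values of one feature still sum to 1. For wind speed given storm = yes, the counts are 3 (high), 1 (medium) and 0 (low), and \(4/7 + 2/7 + 1/7 = 1\).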
Now compute the smoothed likelihoods for the storm = yes class.
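- Wind speed: \(P_{\text{L}}(\text{low wind} \mid \text{yes}) = \frac{0 + 1}{4 + 1 \cdot 3} = \frac{1}{7}\)
- Pressure: \(P_{\text{L}}(\text{low pressure} \mid \text{yes}) = \frac{3 + 1}{4 + 1 \cdot 3} = \frac{4}{7}\)
- Cloud: \(P_{\text{L}}(\text{high cloud} \mid \text{yes}) = \frac{3 + 1}{4 + 1 \cdot 2} = \frac{4}{6} = \frac{2}{3}\)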
The smoothed likelihood for \(X\) under storm = yes is
\[P_{\text{L}}(X \mid \text{yes}) = \frac{1}{7} \cdot \frac{4}{7} \cdot \frac{2}{3} = \frac{8}{147}.\]
Using the prior \(P(\text{yes}) = 0.5\),
\[\text{score}(\text{yes} \mid X) = P(\text{yes}) \cdot P_{\text{L}}(X \mid \text{yes}) = 0.5 \cdot \frac{8}{147} = \frac{4}{147}.\]
Smoothed likelihood for “storm = no”
For the 4 “storm = no” days we have:
- WindSpeed values: low, low, medium, medium
- Pressure values: high, normal, normal, high
- Cloud values: low, low, low, high
With \(\alpha = 1\):
Wind speed (3 categories):
\[P_{\text{L}}(\text{low wind} \mid \text{no}) = \frac{2 + 1}{4 + 1 \cdot 3} = \frac{3}{7}.\]
Pressure (3 categories):
\[P_{\text{L}}(\text{low pressure} \mid \text{no}) = \frac{0 + 1}{4 + 1 \cdot 3} = \frac{1}{7}.\]
Cloud (2 categories):
\[P_{\text{L}}(\text{high cloud} \mid \text{no}) = \frac{1 + 1}{4 + 1 \cdot 2} = \frac{2}{6} = \frac{1}{3}.\]
Thus the smoothed likelihood for \(X\) under “no” is
\[P_{\text{L}}(X \mid \text{no}) = \frac{3}{7} \cdot \frac{1}{7} \cdot \frac{1}{3} = \frac{3}{147}.\]
With prior \(P(\text{no}) = 0.5\),
\[\text{score}(\text{no} \mid X) = 0.5 \cdot \frac{3}{147} = \frac{3}{294}.\]
From above,
\[\text{score}(\text{yes} \mid X) = \frac{4}{147} = \frac{8}{294}.\]
Final posterior probability
Now normalize the two scores:
\[P(\text{storm = yes} \mid X) = \frac{\text{score}(\text{yes} \mid X)} {\text{score}(\text{yes} \mid X) + \text{score}(\text{no} \mid X)} = \frac{\frac{8}{294}}{\frac{8}{294} + \frac{3}{294}} = \frac{8}{11} \approx 0.73.\]
So after Laplace smoothing, the model predicts a storm with probability about 73% for
\[X = (\text{low wind}, \text{low pressure}, \text{high cloud}),\]
instead of the zero score that the unsmoothed model assigns to “storm = yes”.
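The whole calculation can be checked end to end with a short sketch. It assumes the category counts stated in the post (3 for wind speed, 3 for pressure, 2 for cloud) and an integer `alpha`. The names `K`, `smoothed_score` and `posterior` are illustrative, and this is a hand-rolled version for this table, not a library implementation.

```python
from fractions import Fraction

# Rows copied from the dataset table: (WindSpeed, Pressure, Cloud, StormDay).
DATA = [
    ("high", "low", "high", "yes"),
    ("high", "low", "low", "yes"),
    ("high", "normal", "high", "yes"),
    ("medium", "low", "high", "yes"),
    ("medium", "normal", "low", "no"),
    ("low", "high", "low", "no"),
    ("low", "normal", "low", "no"),
    ("medium", "high", "high", "no"),
]
# Number of categories per feature, as listed in the post.
K = (3, 3, 2)


def smoothed_score(rows, query, label, alpha=1):
    """P(label) times the product of Laplace-smoothed conditionals."""
    in_class = [row for row in rows if row[-1] == label]
    score = Fraction(len(in_class), len(rows))  # class prior
    for i, value in enumerate(query):
        matches = sum(1 for row in in_class if row[i] == value)
        score *= Fraction(matches + alpha, len(in_class) + alpha * K[i])
    return score


def posterior(rows, query, alpha=1):
    """Normalize the per-class scores so they sum to 1."""
    labels = sorted({row[-1] for row in rows})
    scores = {label: smoothed_score(rows, query, label, alpha) for label in labels}
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}


post = posterior(DATA, ("low", "low", "high"))
print(post["yes"], round(float(post["yes"]), 2))  # 8/11 0.73
```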
