
Methods for Solving Problems in Probability and Statistics


\[\mathbb{P}((X,Y) \in C_i \mid X = x, Y = y) = \frac{\mathbb{P}((X,Y) \in C_i) \, f_i(x,y)}{\sum_{j=1}^3 \mathbb{P}((X,Y) \in C_j) \, f_j(x,y)}\]
We derive:
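A standard version of this derivation can be sketched as follows, assuming (this is not stated above) that each \(f_i\) is the density of a bivariate normal distribution with mean \(\mu_i\) and covariance matrix \(\Sigma_i\). The shorthand \(p_i = \mathbb{P}((X,Y) \in C_i)\) and \(z = (x,y)^\top\) is introduced here only for readability.

\[f_i(x,y) = \frac{1}{2\pi \, |\Sigma_i|^{1/2}} \exp\left(-\tfrac{1}{2} (z - \mu_i)^\top \Sigma_i^{-1} (z - \mu_i)\right)\]

The denominator of the posterior probability is the same for every \(i\), so the most probable class maximizes \(p_i \, f_i(x,y)\), or equivalently its logarithm. Dropping the constant \(-\log(2\pi)\) gives

\[\delta_i(z) = \log p_i - \tfrac{1}{2} \log |\Sigma_i| - \tfrac{1}{2} (z - \mu_i)^\top \Sigma_i^{-1} (z - \mu_i)\]

This is quadratic in \(z\), which corresponds to quadratic discriminant analysis. If all classes share one covariance matrix \(\Sigma\), the terms \(-\tfrac{1}{2} \log |\Sigma|\) and \(-\tfrac{1}{2} z^\top \Sigma^{-1} z\) are common to all classes and can be dropped, leaving a function that is linear in \(z\) (linear discriminant analysis):

\[\delta_i(z) = \log p_i + \mu_i^\top \Sigma^{-1} z - \tfrac{1}{2} \mu_i^\top \Sigma^{-1} \mu_i\]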
In R: the functions lda and qda from the MASS package


The car data from the previous exercise session, with classification based on fuel consumption:
data("mtcars")
# TRUE if the car's mpg is above the sample mean ("nadpriemer" = above average)
mtcars$mpg_nadpriemer <- (mtcars$mpg > mean(mtcars$mpg))
mtcars$mpg_nadpriemer <- factor(mtcars$mpg_nadpriemer)
library(caret)
set.seed(123)
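mtcars_index <- createDataPartition(mtcars$mpg_nadpriemer,
                                    p = 0.8, # proportion of the data in the training set
                                    list = FALSE)
# training and test sets used below
mtcars_train <- mtcars[mtcars_index, ]
mtcars_test <- mtcars[-mtcars_index, ]
# indices of the test set
(1:nrow(mtcars))[-mtcars_index]
[1]  4  8 13 14 31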
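createDataPartition samples within each level of a factor outcome, so the class proportions in the training set should stay close to those in the full data. A minimal sketch of that check, assuming the test indices printed above (4, 8, 13, 14, 31) so that it runs without caret; the helper names test_rows, prop_all and prop_train are hypothetical:

```r
data("mtcars")
mtcars$mpg_nadpriemer <- factor(mtcars$mpg > mean(mtcars$mpg))

# test indices copied from the output above (assumed to come from set.seed(123))
test_rows <- c(4, 8, 13, 14, 31)
mtcars_train <- mtcars[-test_rows, ]
mtcars_test <- mtcars[test_rows, ]

# class proportions in the full data and in the training set
prop_all <- prop.table(table(mtcars$mpg_nadpriemer))
prop_train <- prop.table(table(mtcars_train$mpg_nadpriemer))
round(rbind(all = prop_all, train = prop_train), 3)
```

The training proportions computed this way are the same quantities that qda later reports as prior probabilities of groups.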
Now we take the continuous variables wt and qsec and assume that the normality assumption is satisfied:
# red = FALSE (mpg at or below the mean), blue = TRUE (mpg above the mean)
plot(mtcars_train$wt, mtcars_train$qsec,
     xlab = "wt", ylab = "qsec",
     col = c("red", "blue")[as.numeric(mtcars_train$mpg_nadpriemer)], pch = 19)
Together with the data (black) that we want to predict:
plot(mtcars_train$wt, mtcars_train$qsec,
     xlab = "wt", ylab = "qsec",
     col = c("red", "blue")[as.numeric(mtcars_train$mpg_nadpriemer)], pch = 19)
# test observations, whose class we want to predict, in black
points(mtcars_test$wt, mtcars_test$qsec, col = "black", pch = 19)
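The normality assumption is taken for granted here. One informal way to look at it, sketched under the assumption that the split uses the test indices 4, 8, 13, 14, 31, is a Shapiro-Wilk test of each variable within each class. This only examines the univariate margins, which is weaker than the bivariate normality that discriminant analysis relies on, and with 12 to 15 observations per class the test has little power. The names test_rows and shapiro_p are hypothetical.

```r
data("mtcars")
mtcars$mpg_nadpriemer <- factor(mtcars$mpg > mean(mtcars$mpg))
test_rows <- c(4, 8, 13, 14, 31)  # test indices printed earlier
mtcars_train <- mtcars[-test_rows, ]

# p-value of the Shapiro-Wilk test for each variable within each class
shapiro_p <- sapply(c("wt", "qsec"), function(v) {
  tapply(mtcars_train[[v]], mtcars_train$mpg_nadpriemer,
         function(z) shapiro.test(z)$p.value)
})
shapiro_p  # rows: classes FALSE/TRUE, columns: wt, qsec
```

Small p-values would speak against normality of the corresponding variable in that class; large ones do not prove it.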
Quadratic discriminant analysis and predictions:
Call:
qda(mpg_nadpriemer ~ wt + qsec, data = mtcars_train)
Prior probabilities of groups:
    FALSE      TRUE
0.5555556 0.4444444
Group means:
            wt     qsec
FALSE 3.867933 17.16867
TRUE  2.287333 18.66583
[1] TRUE TRUE FALSE FALSE FALSE
Levels: FALSE TRUE
                   FALSE        TRUE
Hornet 4 Drive 0.4064068 0.593593185
Merc 240D      0.2125211 0.787478891
Merc 450SL     0.9935647 0.006435312
Merc 450SLC    0.9926888 0.007311209
Maserati Bora  0.9985874 0.001412613
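The output above is consistent with a call sequence like the one below. This is a sketch and not necessarily the code that was run: the object names model_qda, pred_qda, dens_normal and post_manual are hypothetical, and the split is rebuilt from the printed test indices. The second half recomputes the posterior probabilities directly from the Bayes formula, assuming bivariate normal class densities with the sample means and sample covariance matrices plugged in, and class proportions as priors.

```r
library(MASS)

data("mtcars")
mtcars$mpg_nadpriemer <- factor(mtcars$mpg > mean(mtcars$mpg))
test_rows <- c(4, 8, 13, 14, 31)  # test indices printed earlier
mtcars_train <- mtcars[-test_rows, ]
mtcars_test <- mtcars[test_rows, ]

# fit the model and predict the test observations
model_qda <- qda(mpg_nadpriemer ~ wt + qsec, data = mtcars_train)
model_qda
pred_qda <- predict(model_qda, newdata = mtcars_test)
pred_qda$class      # predicted classes
pred_qda$posterior  # posterior probabilities of both classes

# multivariate normal density evaluated at the rows of z
dens_normal <- function(z, mu, sigma) {
  d <- length(mu)
  q <- mahalanobis(z, center = mu, cov = sigma)  # (z - mu)' sigma^{-1} (z - mu)
  exp(-q / 2) / sqrt((2 * pi)^d * det(sigma))
}

# numerators of the Bayes formula: prior times density, one column per class
z_test <- as.matrix(mtcars_test[, c("wt", "qsec")])
numer <- sapply(levels(mtcars_train$mpg_nadpriemer), function(k) {
  x_k <- as.matrix(mtcars_train[mtcars_train$mpg_nadpriemer == k, c("wt", "qsec")])
  prior_k <- nrow(x_k) / nrow(mtcars_train)
  prior_k * dens_normal(z_test, colMeans(x_k), cov(x_k))
})
post_manual <- numer / rowSums(numer)  # divide by the sum over classes
round(post_manual, 4)
```

The first two test cars get a posterior probability of TRUE above one half and are classified as TRUE; the other three are assigned to FALSE with probability above 0.99.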
Reality (rows: realita) compared with the predictions (columns: predikcia):
       predikcia
realita FALSE TRUE
  FALSE     3    0
  TRUE      0    2
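A table of this shape can be built with table(), and the share of correct predictions read off its diagonal. The following is a minimal sketch, again assuming the printed test indices; conf_table and accuracy are hypothetical names, while realita and predikcia mirror the labels in the output above.

```r
library(MASS)

data("mtcars")
mtcars$mpg_nadpriemer <- factor(mtcars$mpg > mean(mtcars$mpg))
test_rows <- c(4, 8, 13, 14, 31)  # test indices printed earlier
mtcars_train <- mtcars[-test_rows, ]
mtcars_test <- mtcars[test_rows, ]

model_qda <- qda(mpg_nadpriemer ~ wt + qsec, data = mtcars_train)

# true classes against predicted classes on the test set
realita <- mtcars_test$mpg_nadpriemer
predikcia <- predict(model_qda, newdata = mtcars_test)$class
conf_table <- table(realita = realita, predikcia = predikcia)
conf_table

# proportion of test observations classified correctly
accuracy <- sum(diag(conf_table)) / sum(conf_table)
accuracy
```

All five test observations lie on the diagonal, but with a test set this small the resulting accuracy is only a very rough estimate.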