This page collects worked exercises on Bayesian networks. The value is not only the final probability, but the reasoning process: identifying dependencies, checking conditional independence, reading active paths and applying Bayes' rule without losing the graph structure.
Exercise 1
Indicate which of these relations are true and which are false. For the first and the fifth, show in detail the calculations you performed.
Ip(A,B): To determine whether this relation is true or false, we must discover whether there is a relationship between A and B; that is, whether a change in A affects the probability of B. A and B are independent only if $p(+b \mid +a) = p(+b \mid \neg a)$.

With this goal, and using the table with the data provided in the statement, I have calculated the joint probability P(A,B) for each possible case:

$$
\begin{aligned}
P(+a,+b) &= \sum_c P(+a,+b,c) = 0.03 + 0.12 = 0.15 \\
P(+a,\neg b) &= \sum_c P(+a,\neg b,c) = 0.07 + 0.28 = 0.35 \\
P(\neg a,+b) &= \sum_c P(\neg a,+b,c) = 0.03 + 0.27 = 0.30 \\
P(\neg a,\neg b) &= \sum_c P(\neg a,\neg b,c) = 0.02 + 0.18 = 0.20
\end{aligned}
$$

Obtaining:
| Probability | p(+b) | p(¬b) | Totals |
| --- | --- | --- | --- |
| p(+a) | 0.15 | 0.35 | 0.50 |
| p(¬a) | 0.30 | 0.20 | 0.50 |
| Totals | 0.45 | 0.55 | 1 |
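From this table, I will calculate the conditional probability of B given A:

$$
\begin{aligned}
P(+b \mid +a) &= \frac{P(+a,+b)}{P(+a)} = \frac{0.15}{0.50} = 0.3 \\
P(+b \mid \neg a) &= \frac{P(\neg a,+b)}{P(\neg a)} = \frac{0.30}{0.50} = 0.6 \\
P(\neg b \mid +a) &= \frac{P(+a,\neg b)}{P(+a)} = \frac{0.35}{0.50} = 0.7 \\
P(\neg b \mid \neg a) &= \frac{P(\neg a,\neg b)}{P(\neg a)} = \frac{0.20}{0.50} = 0.4
\end{aligned}
$$

From the results shown, I can state that $p(+b \mid +a) \neq p(+b \mid \neg a)$, since 0.3 ≠ 0.6; that is, the value of variable A affects variable B. Therefore, Ip(A,B) is FALSE.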
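The marginalisation and the independence test can be reproduced with a few lines of Python. This is only a sketch: the joint table `joint` is an assumption reconstructed from the partial sums quoted in this solution (the statement's own table is not shown here), and the helper names `marginal` and `independent` are made up for illustration.

```python
from itertools import product

# Assumed joint distribution P(A, B, C), rebuilt from the sums in the text.
# Keys are (a, b, c); True stands for "+", False for "¬".
joint = {
    (True, True, True): 0.03, (True, True, False): 0.12,
    (True, False, True): 0.07, (True, False, False): 0.28,
    (False, True, True): 0.03, (False, True, False): 0.27,
    (False, False, True): 0.02, (False, False, False): 0.18,
}

def marginal(joint, keep):
    """Sum out every variable whose position is not listed in `keep`."""
    out = {}
    for values, p in joint.items():
        key = tuple(values[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def independent(joint, i, j, tol=1e-9):
    """True if P(x_i, x_j) = P(x_i) * P(x_j) for every pair of values."""
    pij = marginal(joint, (i, j))
    pi = marginal(joint, (i,))
    pj = marginal(joint, (j,))
    return all(abs(pij[(x, y)] - pi[(x,)] * pj[(y,)]) <= tol
               for x, y in product((True, False), repeat=2))

p_ab = marginal(joint, (0, 1))
p_a = marginal(joint, (0,))
print("P(+b | +a) =", p_ab[(True, True)] / p_a[(True,)])
print("P(+b | ¬a) =", p_ab[(False, True)] / p_a[(False,)])
print("Ip(A,B) holds:", independent(joint, 0, 1))
```

Calling `independent(joint, 0, 2)` or `independent(joint, 1, 2)` runs the same test for the pairs (A, C) and (B, C).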
Ip(A,C): FALSE
Ip(B,C): FALSE
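Ip(A,B∣C): FALSE. For example, P(+a,+b∣+c) = 0.03/0.15 = 0.2, whereas P(+a∣+c)·P(+b∣+c) = (0.10/0.15)·(0.06/0.15) ≈ 0.267.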
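Ip(B,C∣A): TRUE. To determine whether this relation is true or false, we must discover whether B and C remain related once A is known; that is, whether $P(b,c \mid a) = P(b \mid a)\,P(c \mid a)$ for every combination of values.

To begin, I calculated all the possibilities of P(B,C):

$$
\begin{aligned}
P(+b,+c) &= \sum_a P(a,+b,+c) = 0.03 + 0.03 = 0.06 \\
P(+b,\neg c) &= \sum_a P(a,+b,\neg c) = 0.12 + 0.27 = 0.39 \\
P(\neg b,+c) &= \sum_a P(a,\neg b,+c) = 0.07 + 0.02 = 0.09 \\
P(\neg b,\neg c) &= \sum_a P(a,\neg b,\neg c) = 0.28 + 0.18 = 0.46
\end{aligned}
$$

With this information, I obtained:

| Probability | p(+c) | p(¬c) | Totals |
| --- | --- | --- | --- |
| p(+b) | 0.06 | 0.39 | 0.45 |
| p(¬b) | 0.09 | 0.46 | 0.55 |
| Totals | 0.15 | 0.85 | 1 |

This table shows that B and C are dependent when A is unknown, since P(+b,+c) = 0.06 while P(+b)·P(+c) = 0.45 × 0.15 = 0.0675. To check what happens once A is known, I used P(+a) = P(¬a) = 0.5, the values P(+b∣+a) = 0.3 and P(+b∣¬a) = 0.6 obtained for the first relation, and

$$
P(+c \mid +a) = \frac{0.03 + 0.07}{0.5} = 0.2, \qquad P(+c \mid \neg a) = \frac{0.03 + 0.02}{0.5} = 0.1
$$

Where we can observe that:

$$
\begin{aligned}
P(+b,+c \mid +a) &= 0.03/0.5 = 0.06 = 0.3 \times 0.2 \\
P(+b,\neg c \mid +a) &= 0.12/0.5 = 0.24 = 0.3 \times 0.8 \\
P(\neg b,+c \mid +a) &= 0.07/0.5 = 0.14 = 0.7 \times 0.2 \\
P(\neg b,\neg c \mid +a) &= 0.28/0.5 = 0.56 = 0.7 \times 0.8 \\
P(+b,+c \mid \neg a) &= 0.03/0.5 = 0.06 = 0.6 \times 0.1 \\
P(+b,\neg c \mid \neg a) &= 0.27/0.5 = 0.54 = 0.6 \times 0.9 \\
P(\neg b,+c \mid \neg a) &= 0.02/0.5 = 0.04 = 0.4 \times 0.1 \\
P(\neg b,\neg c \mid \neg a) &= 0.18/0.5 = 0.36 = 0.4 \times 0.9
\end{aligned}
$$

The values P(+b,+c∣+a) and P(+b,¬c∣+a) differ from each other, but that only reflects that P(+c∣+a) ≠ P(¬c∣+a); it says nothing about dependence. What matters is that every conditional joint probability equals the product of the two conditional marginals.
Therefore, the statement is true.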
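A conditional independence relation can also be checked mechanically. The sketch below assumes the same joint table reconstructed from the partial sums quoted in this solution; `prob` and `cond_independent` are hypothetical helper names, and the test is written as P(x,y,z)·P(z) = P(x,z)·P(y,z), which is equivalent to P(x,y∣z) = P(x∣z)·P(y∣z) but avoids divisions.

```python
from itertools import product

# Assumed joint distribution P(A, B, C); keys are (a, b, c), True means "+".
joint = {
    (True, True, True): 0.03, (True, True, False): 0.12,
    (True, False, True): 0.07, (True, False, False): 0.28,
    (False, True, True): 0.03, (False, True, False): 0.27,
    (False, False, True): 0.02, (False, False, False): 0.18,
}
A, B, C = 0, 1, 2

def prob(joint, assignment):
    """P(assignment), where `assignment` maps a variable position to a value."""
    return sum(p for values, p in joint.items()
               if all(values[i] == v for i, v in assignment.items()))

def cond_independent(joint, i, j, k, tol=1e-9):
    """True if variables i and j are independent given variable k."""
    for x, y, z in product((True, False), repeat=3):
        lhs = prob(joint, {i: x, j: y, k: z}) * prob(joint, {k: z})
        rhs = prob(joint, {i: x, k: z}) * prob(joint, {j: y, k: z})
        if abs(lhs - rhs) > tol:
            return False
    return True

print("Ip(A,B|C) holds:", cond_independent(joint, A, B, C))
print("Ip(B,C|A) holds:", cond_independent(joint, B, C, A))
```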
Exercise 2
Let G be an undirected graph containing five nodes and the following links: A-B, A-C, B-C, B-D and C-E.
Indicate which of the following relations are true and which are false, and which active paths exist between the variables in each relation.
Ig(A,B): False → ¬Ig(A,B). The active paths are A-B and A-C-B.
Ig(B,D): False → ¬Ig(B,D). The active path is B-D.
Ig(A,D): False → ¬Ig(A,D). The active paths are A-B-D and A-C-B-D.
Ig(D,E): False → ¬Ig(D,E). The active paths are D-B-C-E and D-B-A-C-E.
Indicate which of the following relations are true and which are false. Also indicate for each of them which active paths exist between the first two variables and whether any path between them has been blocked by the third variable.
Ig(A,B∣C): False → ¬Ig(A,B∣C). The path A-B remains active; A-C-B is blocked by C.
Ig(A,D∣B): True. Both possible paths, A-B-D and A-C-B-D, are blocked by B.
Ig(D,E∣C): True. Both possible paths, D-B-C-E and D-B-A-C-E, are blocked by C.
Ig(D,E∣A): False → ¬Ig(D,E∣A). The path D-B-A-C-E is blocked by A, but D-B-C-E remains active.
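Separation in an undirected graph amounts to a reachability test: two nodes are separated by a set when no path joins them once the nodes of that set are removed. A minimal sketch for the graph of this exercise follows; the function names `neighbours` and `separated` are illustrative choices, not part of the exercise.

```python
from collections import deque

# Links of the undirected graph G given in the statement.
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "E")]

def neighbours(edges):
    """Build an adjacency map from a list of undirected links."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def separated(edges, x, y, given=()):
    """True if every path from x to y passes through a node in `given`."""
    adj = neighbours(edges)
    blocked = set(given)
    seen, queue = {x}, deque([x])
    while queue:
        node = queue.popleft()
        if node == y:
            return False  # found an active path
        for nxt in adj.get(node, ()):
            if nxt not in seen and nxt not in blocked:
                seen.add(nxt)
                queue.append(nxt)
    return True

for x, y, given in [("A", "B", "C"), ("A", "D", "B"),
                    ("D", "E", "C"), ("D", "E", "A")]:
    print(f"Ig({x},{y}|{given}):", separated(EDGES, x, y, {given}))
```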
Exercise 3
For each of the following relations, indicate whether it is true or false; also indicate which paths between the two variables are active and which are inactive.
Ig(A,B): True, there is no active connection. The possible paths, A-C-B and A-C-F-D-B, are both inactive.
Ig(A,E): False → ¬Ig(A,E). The active path is A-C-E.
Ig(A,D): True, there is no active connection. The possible paths, A-C-B-D and A-C-F-D, are both inactive.
Ig(A,F): False → ¬Ig(A,F). The active path is A-C-F. The other path, A-C-B-D-F, is inactive.
Ig(D,E): False → ¬Ig(D,E). The active path is D-B-C-E.
Ig(E,F): False → ¬Ig(E,F). The active paths are E-C-F and E-C-B-D-F.
Indicate which of the following relations are true and which are false. Also indicate for each of them whether any path between the first two variables that was inactive has been activated by the third variable, and vice versa, that is, whether any path that was active has been blocked.
Ig(A,B∣C): False → ¬Ig(A,B∣C). Observing C activates the path A-C-B, so there is an active path between A and B.
Ig(A,B∣E): False → ¬Ig(A,B∣E). Between A and B there is the path A-C-B, which is activated because E is a descendant of C. The path A-C-F-D-B remains blocked.
Ig(A,B∣F): False → ¬Ig(A,B∣F). Between A and B there is the path A-C-B, which is activated because F is a descendant of C. The path A-C-F-D-B is also activated.
Ig(A,D∣C): False → ¬Ig(A,D∣C). Observing C activates the path A-C-B-D. The other possible path, A-C-F-D, is not active: C now blocks it, and F has not been observed.
Ig(A,D∣F): False → ¬Ig(A,D∣F). Observing F activates the path A-C-F-D. The other path, A-C-B-D, is also activated, because F is a descendant of C.
Ig(A,F∣C): False → ¬Ig(A,F∣C). The path A-C-F is blocked by C, but observing C activates the path A-C-B-D-F, which was inactive.
Ig(A,F∣E): False → ¬Ig(A,F∣E). The path A-C-F remains active.
Ig(D,E∣B): True. The path D-B-C-E is blocked by B, and D-F-C-E remains blocked.
Ig(E,F∣C): True. The path E-C-F is blocked by C, and so is E-C-B-D-F.
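These relations can be cross-checked with the standard d-separation test based on the moralised ancestral graph. The figure with the directed graph is not reproduced here, so the edge list below is an assumption inferred from the paths discussed in this exercise (A→C, B→C, B→D, C→E, C→F, D→F); `parents_of` and `d_separated` are illustrative names. Treat it as a sketch under those assumptions.

```python
from collections import deque

# Assumed (parent, child) edges; the original figure is not available here.
EDGES = [("A", "C"), ("B", "C"), ("B", "D"),
         ("C", "E"), ("C", "F"), ("D", "F")]

def parents_of(edges):
    """Map every node to the set of its parents."""
    par = {}
    for u, v in edges:
        par.setdefault(u, set())
        par.setdefault(v, set()).add(u)
    return par

def d_separated(edges, x, y, given=()):
    """True if x and y are d-separated by the set `given`."""
    par = parents_of(edges)
    given = set(given)
    # 1. Keep only x, y, the evidence nodes and all of their ancestors.
    keep, stack = set(), [x, y, *given]
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(par[node])
    # 2. Moralise: link each node to its parents and marry co-parents.
    adj = {n: set() for n in keep}
    for child in keep:
        ps = sorted(par[child])
        for p in ps:
            adj[child].add(p)
            adj[p].add(child)
        for i, p in enumerate(ps):
            for q in ps[i + 1:]:
                adj[p].add(q)
                adj[q].add(p)
    # 3. Drop the evidence nodes and test whether y is reachable from x.
    seen, queue = {x}, deque([x])
    while queue:
        node = queue.popleft()
        if node == y:
            return False
        for nxt in adj[node]:
            if nxt not in seen and nxt not in given:
                seen.add(nxt)
                queue.append(nxt)
    return True

print("Ig(A,B):", d_separated(EDGES, "A", "B"))
print("Ig(A,B|E):", d_separated(EDGES, "A", "B", {"E"}))
print("Ig(E,F|C):", d_separated(EDGES, "E", "F", {"C"}))
```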
Exercise 4
Let P be a probability distribution that satisfies the following dependence property: ¬Ip(A,B). Draw all undirected graphs and directed acyclic graphs with two variables that are independence maps (I-maps) of P.
Recall that for a graph G to be an independence map of a probability distribution P, it is sufficient that every separation relation in G corresponds to an independence relation in P; it is not necessary that every independence relation in P be reflected as a separation in G.
Directed graphs.
Undirected graphs.
Exercise 5
Let P be a probability distribution that satisfies the following independence property: Ip(A,B). Draw all undirected graphs and directed acyclic graphs with two variables that are independence maps (I-maps) of P.
Recall that for a graph G to be an independence map of a probability distribution P, it is sufficient that every separation relation in G corresponds to an independence relation in P; it is not necessary that every independence relation in P be reflected as a separation in G.
Directed graphs.
Undirected graphs.
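With only two variables, the I-map condition can be enumerated by hand or in a few lines of code: the only separation a two-node graph can assert is Ig(A,B), and it does so only when there is no link. The sketch below applies that rule to this exercise and the previous one; the graph labels and the function name `i_maps` are illustrative.

```python
# Every graph over {A, B}, with a flag saying whether it separates A from B.
GRAPHS = [
    ("undirected", "A   B (no link)", True),
    ("undirected", "A - B", False),
    ("directed", "A   B (no link)", True),
    ("directed", "A -> B", False),
    ("directed", "B -> A", False),
]

def i_maps(a_b_independent):
    """Graphs whose separations all correspond to independences in P."""
    return [(kind, name) for kind, name, separates in GRAPHS
            if not separates or a_b_independent]

print("P with ¬Ip(A,B):", i_maps(False))
print("P with Ip(A,B): ", i_maps(True))
```

A linked graph asserts no separation at all, so it is an I-map of any distribution; the graph without links is an I-map only when A and B really are independent.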
Exercise 6
Let P be a probability distribution that satisfies the following dependence and independence properties: ¬Ip(A,B), ¬Ip(A,C), ¬Ip(B,C), Ip(A,B∣C), ¬Ip(A,C∣B), Ip(B,C∣A). Draw all undirected graphs and directed acyclic graphs with three variables that are independence maps (I-maps) of P.
Recall that for a graph G to be an independence map of a probability distribution P, it is sufficient that every separation relation in G corresponds to an independence relation in P; it is not necessary that every independence relation in P be reflected as a separation in G.
Based on the previous paragraph, I found the following graphs that satisfy all the conditions in the statement:
The first graph shows the smallest option that satisfies the conditions in the statement:
Minimum directed graph.
The second block includes all the graphs that also satisfy the conditions, but they have one extra variable, D, connected to each of the nodes:
Undirected graph
If we merge all the previous graphs into one, we get:
Undirected graph
Or by adding two extremes:
Undirected graph
By adding a node between A and B and/or B and C:
Undirected graph
Regarding the directed graphs I found, they are the following:
In the first graph drawn we find the smallest graph that satisfies the model conditions. Starting from the first graph and including two more nodes in the graph, I drew the three remaining graphs.
Directed graph
The first graph is the most reduced option under this solution. From this graph we can generate the next two options by adding more nodes between the two nodes.
Directed graph
Directed graph
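As a cross-check restricted to undirected graphs over exactly the three variables A, B and C, the I-map definition recalled in the statement can be applied by brute force: enumerate the eight possible graphs and keep those whose separations all appear among the independences of P. This is a sketch; the names `separated` and `is_i_map` are illustrative, and graphs with extra nodes are outside its scope.

```python
from itertools import combinations

NODES = ("A", "B", "C")
# Independences of P listed in the statement: (X, Y, conditioning set).
INDEPENDENT = {("A", "B", ("C",)), ("B", "C", ("A",))}

def separated(edges, x, y, given):
    """True if y cannot be reached from x after removing the nodes in `given`."""
    adj = {n: set() for n in NODES}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {x}, [x]
    while stack:
        for nxt in adj[stack.pop()]:
            if nxt not in seen and nxt not in given:
                seen.add(nxt)
                stack.append(nxt)
    return y not in seen

def is_i_map(edges):
    """Every separation asserted by the graph must be an independence of P."""
    for x, y in combinations(NODES, 2):
        (z,) = set(NODES) - {x, y}
        for given in ((), (z,)):
            if separated(edges, x, y, given) and (x, y, given) not in INDEPENDENT:
                return False
    return True

all_links = list(combinations(NODES, 2))
i_maps = [edges for r in range(len(all_links) + 1)
          for edges in combinations(all_links, r) if is_i_map(edges)]
for edges in i_maps:
    print(" ".join(f"{u}-{v}" for u, v in edges))
```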
Exercise 7
In a certain country, the prevalence of typhoid fever is 0.001 and that of tuberculosis is 0.01. Typhoid fever always produces fever, and bradycardia (slow heart rate) in 40 % of cases. Tuberculosis produces fever in 60 % of cases and tachycardia (faster than normal heart rate) in 58 %. The prevalence of fever in patients who do not suffer from either of these two diseases is 1.5 %, that of bradycardia is 0.05 % and that of tachycardia is 1.3 %.
According to the naive Bayesian method, indicate which variables are involved in this problem and what values each of them can take.
The data in the statement translate into the following probabilities:
Prevalence of typhoid fever: P(typhoid fever) = 0.001
Prevalence of tuberculosis: P(tuberculosis) = 0.01
Typhoid fever always produces fever: P(+fever ∣ typhoid fever) = 1
Typhoid fever produces bradycardia (slow heart rate) in 40 % of cases: P(+bradycardia ∣ typhoid fever) = 0.4
Tuberculosis produces fever in 60 % of cases: P(+fever ∣ tuberculosis) = 0.6
Tuberculosis produces tachycardia in 58 % of cases: P(+tachycardia ∣ tuberculosis) = 0.58
Others, fever in 1.5 % of cases: P(+fever ∣ other) = 0.015
Others, bradycardia in 0.05 % of cases: P(+bradycardia ∣ other) = 0.0005
Others, tachycardia in 1.3 % of cases: P(+tachycardia ∣ other) = 0.013
With the information in the statement, we can conclude that the variables are:
Disease: A patient may have:
typhoid fever.
tuberculosis.
something else.
Fever: Absent or present.
Bradycardia: Absent or present.
Tachycardia: Absent or present.
Draw the corresponding diagram for the naive Bayesian method.
Diagram
Indicate the conditional probabilities, in table form, that define the model.
First of all, I will indicate the prior probabilities of the different diagnoses given in the statement.
Probability for the different states of patients

| | Typhoid fever | Tuberculosis | Other |
| --- | --- | --- | --- |
| P(Disease) | 0.001 | 0.01 | 0.989 |
Next comes the probability that a patient with a given disease has fever, P(Fever ∣ Disease).
Probability for the different states of patients regarding the symptom fever

| | Typhoid fever | Tuberculosis | Other |
| --- | --- | --- | --- |
| fever | 1 | 0.6 | 0.015 |
| ¬ fever | 0 | 0.4 | 0.985 |
Next comes the probability that a patient with a given disease has bradycardia, P(Bradycardia ∣ Disease).
Probability for the different states of patients regarding the symptom bradycardia

| | Typhoid fever | Tuberculosis | Other |
| --- | --- | --- | --- |
| bradycardia | 0.4 | 0 | 0.0005 |
| ¬ bradycardia | 0.6 | 1 | 0.9995 |
Next comes the probability that a patient with a given disease has tachycardia, P(Tachycardia ∣ Disease).
Probability for the different states of patients regarding the symptom tachycardia

| | Typhoid fever | Tuberculosis | Other |
| --- | --- | --- | --- |
| tachycardia | 0 | 0.58 | 0.013 |
| ¬ tachycardia | 1 | 0.42 | 0.987 |
State the hypotheses you are using to solve this problem and discuss whether they are reasonable or not, that is, whether they seem to be a good approximation.
The first hypothesis I deduced from the statement is that the diagnoses are mutually exclusive and exhaustive: a patient has exactly one of typhoid fever, tuberculosis, or another condition that is neither typhoid fever nor tuberculosis. The second, which the naive Bayesian method requires, is that the findings (fever, bradycardia and tachycardia) are conditionally independent given the disease.
The approximation is reasonable as a first step, since it gives the probability of each diagnosis for a patient with certain symptoms, making it useful to confirm or rule out possible diagnoses.
What is the diagnosis for each of the possible combinations of findings: fever, no fever, tachycardia, bradycardia, normal rhythm, fever and tachycardia, fever and normal rhythm, etc.? If you wish, you may use the OpenMarkov program to complete the table, but in that case you must work out by hand and show the detailed calculations for two of those combinations.
To calculate the probability that the patient suffers from each disease given the symptoms, we will use Bayes' theorem, where e = disease, f = fever, t = tachycardia and b = bradycardia:

$$
p(e \mid f,b,t) = \frac{p(f,b,t \mid e)\,p(e)}{\sum_{e'} p(f,b,t \mid e')\,p(e')}
$$

From the previous tables, we can generate a table that includes all possible combinations of $p(f,b,t \mid e)$, that is, the conditional probability table for each disease.
Conditional probability table for each disease

| Symptoms / Disease | Typhoid fever | Tuberculosis | Other |
| --- | --- | --- | --- |
| +f, +b, +t | 0 | 0 | 0.0000000975 |
| +f, +b, ¬t | 0.4 | 0 | 0.0000074025 |
| +f, ¬b, +t | 0 | 0.348 | 0.000195 |
| +f, ¬b, ¬t | 0.6 | 0.252 | 0.0148 |
| ¬f, +b, +t | 0 | 0 | 0.0000064025 |
| ¬f, +b, ¬t | 0 | 0 | 0.000486098 |
| ¬f, ¬b, +t | 0 | 0.232 | 0.0128 |
| ¬f, ¬b, ¬t | 0 | 0.168 | 0.9717 |
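Each entry of this table is the product of the per-finding probabilities for the corresponding disease. A short sketch that regenerates the rows follows; the dictionary `P_PRESENT` copies the three symptom tables, while the names `FINDINGS` and `likelihood` are illustrative.

```python
from itertools import product

# P(finding present | disease), copied from the three symptom tables.
P_PRESENT = {
    "typhoid fever": {"fever": 1.0, "bradycardia": 0.4, "tachycardia": 0.0},
    "tuberculosis": {"fever": 0.6, "bradycardia": 0.0, "tachycardia": 0.58},
    "other": {"fever": 0.015, "bradycardia": 0.0005, "tachycardia": 0.013},
}
FINDINGS = ("fever", "bradycardia", "tachycardia")

def likelihood(disease, present):
    """p(f, b, t | disease) as a product of per-finding probabilities."""
    result = 1.0
    for name, is_present in zip(FINDINGS, present):
        p = P_PRESENT[disease][name]
        result *= p if is_present else 1.0 - p
    return result

# One row per combination (fever, bradycardia, tachycardia).
for present in product((True, False), repeat=3):
    row = [f"{likelihood(d, present):.10f}" for d in P_PRESENT]
    print(present, row)
```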
In this way, using Bayes' theorem we can calculate the probability of a given diagnosis conditioned on each combination of symptoms. For example:

$$
\begin{aligned}
p(\text{tuberculosis} \mid +f,\neg b,+t)
&= \frac{p(+f,\neg b,+t \mid \text{tuberculosis})\,p(\text{tuberculosis})}{\sum_{e'} p(+f,\neg b,+t \mid e')\,p(e')} \\
&= \frac{0.348 \cdot 0.01}{0.348 \cdot 0.01 + 0 \cdot 0.001 + 0.000195 \cdot 0.989} \\
&= \frac{0.00348}{0.003672855} = 0.947
\end{aligned}
$$
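| Symptoms / Disease | Typhoid fever | Tuberculosis | Other |
| --- | --- | --- | --- |
| +f, +b, +t | 0 | 0 | 1 |
| +f, +b, ¬t | 0.982 | 0 | 0.018 |
| +f, ¬b, +t | 0 | 0.947 | 0.053 |
| +f, ¬b, ¬t | 0.034 | 0.142 | 0.824 |
| ¬f, +b, +t | 0 | 0 | 1 |
| ¬f, +b, ¬t | 0 | 0 | 1 |
| ¬f, ¬b, +t | 0 | 0.155 | 0.845 |
| ¬f, ¬b, ¬t | 0 | 0.002 | 0.998 |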
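The whole posterior table can be recomputed from the priors and the per-finding probabilities. The sketch below applies Bayes' theorem under the naive Bayes assumption that the findings are conditionally independent given the disease; the numbers come from the tables of this exercise, and the names `PRIOR`, `P_PRESENT` and `posterior` are illustrative.

```python
from itertools import product

PRIOR = {"typhoid fever": 0.001, "tuberculosis": 0.01, "other": 0.989}
# P(finding present | disease), in the order (fever, bradycardia, tachycardia).
P_PRESENT = {
    "typhoid fever": (1.0, 0.4, 0.0),
    "tuberculosis": (0.6, 0.0, 0.58),
    "other": (0.015, 0.0005, 0.013),
}

def posterior(present):
    """p(disease | findings) for a tuple of booleans (fever, brady, tachy)."""
    joint = {}
    for disease, prior in PRIOR.items():
        value = prior
        for p, is_present in zip(P_PRESENT[disease], present):
            value *= p if is_present else 1.0 - p
        joint[disease] = value
    total = sum(joint.values())  # probability of the evidence
    return {disease: value / total for disease, value in joint.items()}

for present in product((True, False), repeat=3):
    row = posterior(present)
    print(present, {d: round(p, 3) for d, p in row.items()})
```

For fever and tachycardia without bradycardia, `posterior((True, False, True))["tuberculosis"]` reproduces the hand calculation shown earlier in this exercise.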
Experience shows that, when there is tuberculosis, fever and tachycardia are associated in most cases; that is, tuberculosis generally produces tachycardia if and only if it produces fever. Does this observation call into question the validity of the results obtained in the previous section?
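No, at least not for the diagnosis itself: if tachycardia and fever are present and bradycardia is absent, the probability that a patient has tuberculosis is 95 %, which agrees with experience. The observation does conflict with the assumption that fever and tachycardia are conditionally independent given tuberculosis, so the exact probabilities should be read as approximations.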