I often feel bad when, after a long day of number crunching, I produce a neat-looking figure that has no apparent meaning, yet is too cool to delete. However, an image devoid of meaning has little intrinsic value outside the realm of art, and art is often not helpful to the daily grind of an engineer. This is why I often scrap such images, debug the code that created them, and let the memories of pretty math-pictures fade away.
But with this whole blog thing, I can now post such “cool” images with a description of how they were created and for what purpose, and thus preserve them forever on the Internet. Keep in mind, though, that despite its apparent technical character, the actual message of each image is little more than art - there is no physical meaning in it. Of course, there might be meaning in it to someone else, provided they arrived at it through defensible means.
So here is image #1 of many in this series of posts…
Title: Pignistic probability density functions that convey the epistemic uncertainty associated with the cumulative distribution function values for different levels of loss.
Narrative Description: Consider a risk assessment whose goal is to estimate the probability that you (the decision maker) will suffer some degree of loss, say $1,000 or more, within the next year. You have little information on which to base a statement about the probability of this event, which leads you to express your uncertainty in this probability in terms of a Dempster-Shafer structure (or more generally, a random set). Unfortunately, it is not easy to visualize random sets on a pretty graph such as the one above, so you transform the random set via some information-theoretic principle into an “equivalent” probability distribution. One such transformation is the pignistic transformation often discussed (and named) by the late Professor Philippe Smets. It is grounded in the principle of insufficient reason, which states that, in the absence of additional information to discriminate between multiple alternative hypotheses on the basis of likelihood, the probability mass afforded to the set of alternatives must be distributed evenly among them. The figure above gives a family of pignistic probability density functions for various degrees of loss of the form “$X or more,” where each curve represents a different value of X.
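To make the even-spreading idea concrete, here is a minimal Python sketch, assuming the Dempster-Shafer structure consists of interval focal elements on the probability axis. The function name and the example masses are hypothetical and chosen only for illustration; this is not the code that produced the figure.

```python
def pignistic_density(focal_elements, p):
    """Pignistic density at point p for interval focal elements.

    focal_elements: list of (lower, upper, mass) with lower < upper.
    Each interval's mass is spread uniformly over that interval, so an
    interval contributes mass / (upper - lower) wherever it covers p.
    """
    return sum(mass / (upper - lower)
               for lower, upper, mass in focal_elements
               if lower <= p <= upper)


# Hypothetical structure for the probability of a loss of "$1,000 or more".
# The masses sum to one; the intervals are allowed to overlap.
focal_elements = [(0.1, 0.3, 0.5), (0.2, 0.6, 0.3), (0.0, 1.0, 0.2)]

for p in (0.05, 0.25, 0.50, 0.90):
    print(f"p = {p:.2f}  density = {pignistic_density(focal_elements, p):.3f}")
```

Where intervals overlap, the contributions simply add, which is why the density is highest where the narrow, heavily weighted intervals sit.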
Why Did I Produce This Graph: This graph was in no way intended to be a summary of results. Rather, an algorithm I am working on suffered from some terrible flaws that essentially wasted 18 hours of computational time. Have you ever encountered a situation where your cumulative distribution function, or CDF, decreased as the value of the random variable increased? Well, a CDF wouldn’t be a CDF if it behaved that way (CDFs are ALWAYS non-decreasing as you increase the argument). So why am I seeing it? To check, I decided to plot the density functions that produce such CDFs - this is what is shown above. If the decrease in the CDF were legitimate, I should be seeing negative values on the y-axis of the pignistic probability densities above. Do we see this? No (take my word for it that I did not adjust the y-axis to hide negative values). So what is the problem?
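The reasoning here can be checked mechanically. The sketch below, in Python with hypothetical helper names, builds a CDF from a density by cumulative trapezoidal integration and reports any index where a sequence of CDF values decreases. A non-negative density can never produce such an index, so a decrease points to a bug in the CDF construction rather than in the density.

```python
def cdf_from_density(grid, density):
    """Cumulative trapezoidal integral of density over an increasing grid."""
    cdf = [0.0]
    for i in range(1, len(grid)):
        width = grid[i] - grid[i - 1]
        cdf.append(cdf[-1] + 0.5 * (density[i] + density[i - 1]) * width)
    return cdf


def decreasing_steps(values, tol=1e-12):
    """Indices i where values[i] drops below values[i - 1] by more than tol."""
    return [i for i in range(1, len(values)) if values[i] < values[i - 1] - tol]


# Illustrative density f(x) = 2x on [0, 1]; its exact CDF is x**2.
grid = [i / 100 for i in range(101)]
density = [2 * x for x in grid]
cdf = cdf_from_density(grid, density)

print(decreasing_steps(cdf))                   # no decreasing steps expected
print(decreasing_steps([0.0, 0.4, 0.3, 1.0]))  # a deliberately broken "CDF"
```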
What Did I Learn: My algorithm did not correctly generate a CDF from the pignistic probability density functions above. This is just another example of how a simple error in a person’s MATLAB code can cause serious errors downstream. I also noticed that my pignistic density functions did not integrate to one; that is, integrating the density over the range from zero to one did not give 1.0. Obviously this is wrong, since the probability must take on some value between zero and one. Tracking this down, I found a second error that was flipping the endpoints of an interval in some cases, so that mass was being assigned to an interval such as [0.8 0.6], which is obviously not a true interval since the left endpoint exceeds the right endpoint. This is where the missing mass was going: to incorrectly specified intervals such as [0.8 0.6], which caused my pignistic densities to integrate to values less than one. So, while the graph above labels the y-axis as pignistic probability density, the curves are not really densities, since they do not satisfy the requirement that they integrate to one.
Welcome to my world. I hope this post offers some insight into how I reason through the algorithm debugging process.
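A sanity check along these lines would have caught the problem early. The following Python sketch uses hypothetical names and made-up masses (it is not the original MATLAB code). It separates properly ordered intervals from flipped ones and shows how much mass the flipped ones take away from the integral of the density. Because each valid interval's uniform piece integrates exactly to its mass, the density's total integral is just the sum of the valid masses.

```python
def split_focal_elements(focal_elements):
    """Separate well-ordered intervals (lower < upper) from flipped ones.

    Degenerate intervals (lower == upper) are not handled in this sketch.
    """
    valid = [(lo, hi, m) for lo, hi, m in focal_elements if lo < hi]
    flipped = [(lo, hi, m) for lo, hi, m in focal_elements if lo > hi]
    return valid, flipped


# Hypothetical structure in which one interval has its endpoints flipped.
focal_elements = [(0.1, 0.3, 0.5), (0.8, 0.6, 0.3), (0.0, 1.0, 0.2)]
valid, flipped = split_focal_elements(focal_elements)

# Mass that actually shows up in the density versus mass that goes missing.
integrated_mass = sum(m for _, _, m in valid)
missing_mass = sum(m for _, _, m in flipped)

print(f"density integrates to {integrated_mass:.2f}")
print(f"mass lost to flipped intervals: {missing_mass:.2f}")
print("flipped intervals:", flipped)
```

In this made-up case the density integrates to 0.70, and the remaining 0.30 sits on the flipped interval, which no point on the axis can fall inside.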

4 responses so far ↓
1 Matt Maisel // May 21, 2008 at 7:00 pm
Will, I’m glad to see that you’ve taken a step into the blogosphere. You have a lot of knowledge that you can share with SRA undergrads!
P.S. Letting 1&1 host WordPress for you isn’t a good idea. I have the basic 1&1 web hosting plan, but I installed WP to my web space myself. It gives you a lot more freedom with blog design, and your readers don’t have to register and log in to 1&1 just to comment on your blog. Learn how to host the WordPress platform yourself!
2 Will McGill // May 22, 2008 at 5:56 pm
Thanks for the note - I will see what I can do about getting WordPress up and running for myself.
3 Kris Wheaton // May 23, 2008 at 9:50 am
This is interesting stuff! I really look forward to seeing how this develops.
Kris
4 Jim Peerenboom // Jun 1, 2008 at 8:20 am
It would be helpful to post full citations for the “100 books.” The list is very useful.