Exploring the .NET Framework

Perhaps the most frustrating part about learning any new development platform is learning where to find things.  With the .NET platform, for example, I have this wonderful class library that has all manner of classes to access the underlying services.  But without a roadmap, finding things is maddening.  For example, if you want to get the current date and time, you access a static property in the System.DateTime class (System.DateTime.Now).  If you want to pause your program, you call System.Threading.Thread.Sleep()—a static method.  The class library is peppered with these static methods and properties, and ferreting them out is difficult.  What I need is some kind of reference that lists all of the objects that provide similar useful static members, and a cross-reference (maybe an index) that will let me quickly find the appropriate static member for a particular purpose.  As it is, I’m spending untold hours searching the documentation or just randomly picking a topic in the library reference and scanning it for interesting tidbits.
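
For the record, here's the sort of thing I'm talking about: a minimal C# console program (my own throwaway example, nothing from the documentation) that uses two of those static members, DateTime.Now and Thread.Sleep.

    using System;
    using System.Threading;

    class StaticMemberDemo
    {
        static void Main()
        {
            // DateTime.Now is a static property: no DateTime instance required.
            Console.WriteLine("It is now {0}", DateTime.Now);

            // Thread.Sleep is a static method: pause the current thread for two seconds.
            Thread.Sleep(2000);

            Console.WriteLine("Two seconds later it is {0}", DateTime.Now);
        }
    }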

That said, I’m having a ball learning C# and .NET programming.  I’m still not as comfortable with it as I am with Delphi (hardly a surprise, considering I’ve been working with Delphi since its early beta days in 1994), but I can see C# becoming my language of choice very quickly.  C# and the .NET platform have all the ease of use of Delphi, a comprehensive class library, good documentation, and a solid component model that’s supported by the underlying framework (a luxury that Delphi didn’t have).  It’s easy to like this environment, and it’s going to be rather difficult for other compiler vendors to provide tools that programmers will prefer over C# and Visual Studio .NET.  I’m interested to see what Borland comes up with.  Whatever it is, it’ll have to be pretty compelling to get me to switch.

Movie review: In the Bedroom

I find it difficult to believe that Debra and I just sat through In the Bedroom on DVD.  My first (printable) response when the movie ended was “You’ve got to be kidding me.”  That pile of brooding, incoherent, disconnected images was nominated for five Academy Awards and countless “lesser” distinctions?  How could anybody in that film be nominated for anything other than Best Comatose Performance?

I’ve long known that film critics and I rarely agree when it comes to drama, what with our widely differing opinions of such drivel as The Last Emperor, My Dog Skip, The Thin Red Line, and Titanic.  But until recently I thought perhaps I just didn’t get it.  I’ve finally realized, though, that if you think of films like In the Bedroom as The Royal Nonesuch, and film critics as the citizens of that little Arkansas town in The Adventures of Huckleberry Finn, then things make a lot more sense.

In the Bedroom is yet another film that suffers from (among many other things) the deadly sin of taking itself seriously.  Don’t waste your time or your money.

More on redundant code

Something else interesting about the research I mentioned yesterday is that they’ve used the tool to find a large number of previously undiscovered bugs in the Linux kernel—primarily, if I’m reading the sketchy information in the Slashdot postings correctly, in kernel device drivers.  That the bugs reside primarily in device drivers isn’t terribly surprising.  Device driver code is notoriously difficult to write for many reasons, and doubly so when the programmers don’t take the time to read and understand the hardware manuals.  It’s harder still when the manuals don’t exist and the programmer is working from knowledge gained by poking random data at the hardware interface to see what comes out.

That this analysis reveals so many previously undiscovered bugs both validates and refutes the open source mantra “given enough eyeballs, all bugs are shallow.”  Validation because somebody finally looked at the code, and refutation because it points out that not all code is equally examined.  Some parts of the code get looked at by thousands of eyes, and other parts don’t even get tested by the original programmer, much less reviewed by somebody competent.  An automated auditing tool like this is useful, but it can’t replace a competent programmer reviewing the code, as it’s still quite possible for errors to lurk in modules that don’t exhibit any of the redundancies or similar indicators.  The idea behind open source is that “somebody will care.”  The reality is that lots of people care about certain parts of the project, but other parts are left wanting.  That particular problem can only get worse as the kernel continues to grow.

Redundant code as a bug indicator

Scanning Slashdot today during lunch, I came across this posting about two Stanford researchers who have written a paper (it’s a PDF) showing that seemingly harmless redundant code frequently points to not-so-harmless errors.  They used a tool that does static analysis of the source code to trace execution paths and such.  The technology behind their tool is fascinating and something I’d like to study, given the time.  But that’s beside the point.

On the surface, the paper’s primary conclusion—that redundancies flag higher-level correctness mistakes—seems obvious.  After all, it’s something that we programmers have suspected, even known, for quite some time.  But our “knowledge” was only of the problems that arose specifically from particular redundancies like “cut and paste” coding errors—repetitions.  The paper identifies other types of redundancies (unused assignments, dead code, superfluous conditionals).  Some of these redundancies actually are errors, but many are not.  The paper’s primary contribution (in my opinion) is to show that, whether or not the redundancies themselves are errors, their existence in a source file is an indicator that there are hard errors—real bugs—lurking in the vicinity.  How strong of an indicator?  In their test of the 1.6 million lines of Linux source code (2,055 files), they show that a source file that contained these redundancies was from 45% to 100% more likely to contain hard errors than a source file picked at random.  In other words, where there’s smoke (confused code, which most likely means a confused programmer), it’s likely that fire is nearby.  These methods don’t necessarily point out errors, but rather point at likely places to find errors.
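
To make the idea concrete, here's a hypothetical C# fragment of my own (the paper's examples are C code from the kernel) showing how a seemingly harmless redundancy can point at a real bug.

    using System;

    class RedundancyExample
    {
        // Hypothetical illustration (mine, not the paper's): the assignment to
        // width on the second line is dead because the copied-and-pasted line
        // below it overwrites the value.  A checker flags the unused assignment;
        // the real bug is that the width limit is never applied.
        static void Clamp(int w, int h, int limit)
        {
            int height = h > limit ? limit : h;
            int width  = w > limit ? limit : w;
            width = h > limit ? limit : h;      // cut-and-paste error

            Console.WriteLine("Clamped to {0} x {1}", width, height);
        }

        static void Main()
        {
            Clamp(800, 600, 640);   // prints "Clamped to 600 x 600"; should be 640 x 600
        }
    }

The dead assignment isn't a bug by itself, but it's exactly the kind of smoke the paper says is worth investigating.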

A production tool based on the methods presented in this research would be an invaluable auditing tool.  Rather than picking a random sample of source files for review, auditors could use this tool to identify modules that have a higher likelihood of containing errors.  Very cool stuff, and well worth the read.

The fallacy of affordable health insurance

The idea behind insurance is simple.  A group of people agree to pool their funds to protect individuals in the group from financial ruin in the case of a catastrophic loss.  The group’s premium payments are invested, ideally at a profit, and any excess funds over what is reserved for future losses are paid back to the group members (in the case of a mutual insurance company) or distributed to the company’s stockholders.  This works quite well in many situations because catastrophic losses are relatively rare.  Life insurance works slightly differently in that everybody dies at some point.  Insurance companies use well-researched statistics to project the insured’s life expectancy and then structure a payment plan so that the insured’s premiums, when invested at a reasonable rate, will return more than the policy’s face value before the insured person dies.  The reason you buy life insurance isn’t to ensure that your estate will have $100,000 (or whatever sum) when you die at age 80, but rather that if you kick the bucket in your 50s, your dependents will have something to fall back on.  If you could guarantee that you’d live to be 80, there would be no need for life insurance; you could do much better by investing the money yourself.

One other thing.  Insurers base the price of the premiums (what individuals must pay) on two things:  the computed probability of a loss, and the cost to fund the loss should it happen.  That is, a 22-year-old with a drunk driving conviction and an $80,000 Porsche will pay a higher premium than a 40-year-old mother of three with a minivan and a clean driving record.
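
As a back-of-the-envelope sketch of that arithmetic (my numbers, entirely made up), it looks something like this:

    using System;

    class PremiumSketch
    {
        static void Main()
        {
            // All numbers are made up for the illustration.
            double lossProbability = 0.01;   // one claim per hundred policies per year
            double lossCost = 20000.0;       // average cost of a claim
            double loading = 1.25;           // expenses, reserves, profit

            // The premium is roughly the expected loss times the loading factor.
            double premium = lossProbability * lossCost * loading;
            Console.WriteLine("Annual premium: {0:C}", premium);   // about $250
        }
    }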

Okay, that’s how insurance works.  So what’s my point?

My point is that “health insurance” as we’ve come to know it can’t possibly work.  It’s a huge Ponzi scheme that at some point has to collapse in on itself.  Remember, insurance works by spreading the cost of infrequent catastrophic losses over a large group of individuals.  Critical care insurance can work in this way.  But you can’t fund day-to-day health care, which is what today’s “health insurance” has become, using any kind of insurance scheme.  The theory is that younger members of the group, who are supposedly in better health and need less medical care, help subsidize the payment of care for older members of the group.  This all works fine as long as health care costs remain relatively fixed.  But several things happen:  better and more expensive medical technology (tests, treatments, drugs) becomes available, younger members insist on more coverage (lower deductibles, lower co-payments, wider coverage of services), life expectancies get longer, and increased scrutiny by an ever more litigious society requires that every possible test be run in every possible circumstance.  Oh, and insurance companies don’t have free rein to adjust their premiums based on an individual’s health history, age, or habits.  (Yes, smokers pay the same premiums as non-smokers.)  Prices increase, soar, and then skyrocket.

Until relatively recently, employers have been picking up most of the increasing costs of health care “insurance.”  But employers are starting to realize that they can’t continue to pay the ever-increasing premiums, and they’re expecting employees to pay a little more out of their own pockets.  This will work for a short while, but individual employees won’t be able to afford it for long.  At some point, government will step in and take over the whole health insurance Ponzi scheme.  But even the Federal government has finite resources, which soon will be overwhelmed by the cost of everybody insisting on the absolute best possible care right now.

Do I have an answer?  Of course I do.  Scale back expectations, insist that people take responsibility for their own health (that is, eat right, exercise, cease self-destructive behaviors), shoulder the costs of their own day-to-day medical care, and use insurance as it’s intended—to cover catastrophic losses like broken legs and serious diseases.  It’s a workable plan, in theory.  Sadly, it requires more restraint and personal responsibility than most people today can manage.

Trainable Bayesian spam filters

My friend Jeff Duntemann posted a note yesterday in his web diary about using trainable Bayesian filters to filter spam.  I still don’t agree that filtering is the best way to combat spam, but it’s probably the best we’re going to get, all things considered.  Blocking spam at the source (i.e., preventing it from entering the system in the first place) would be much more effective, but the design of the email protocols and resistance to change prevent the implementation of an effective Internet-wide spam blocking scheme.  So we’re left with filtering at the delivery end.

The nice thing about Bayesian filters, as Jeff points out, is that they are trainable.  And the one that everybody’s talking about (see Jeff’s site for the link) has a 99.5% success rate, with zero false positives.  That’s impressive, and perhaps this is the way to go.  But on the client?  Like spam blocking, filtering should be done on the server.  All it would take is some simple modifications to the email server and a few extensions to the POP and IMAP mail protocols, and everybody could have spam filtering regardless of what email client they’re using.  Filtering on the server would be much more efficient than having each individual client do the filtering.  Plus, servers could implement blacklist filtering on a per-user basis, and perhaps stop a large amount of unwanted email from ever being accepted.
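
For the curious, here's a stripped-down sketch of the Bayesian scoring step in C# (my own toy version, with made-up token probabilities; the real filters do much more, such as tokenizing the message and picking only the most interesting tokens).

    using System;
    using System.Collections;

    class BayesSketch
    {
        // spamProb maps a token to an estimate of P(spam | token), built from
        // training counts.  Combine the per-token probabilities, naively
        // assuming the tokens are independent.
        static double SpamScore(string[] tokens, Hashtable spamProb)
        {
            double spam = 1.0, ham = 1.0;
            foreach (string token in tokens)
            {
                if (!spamProb.ContainsKey(token))
                    continue;                      // never seen in training: skip it
                double p = (double) spamProb[token];
                spam *= p;
                ham  *= 1.0 - p;
            }
            return spam / (spam + ham);            // combined spam probability
        }

        static void Main()
        {
            // Made-up training estimates.
            Hashtable spamProb = new Hashtable();
            spamProb["viagra"]  = 0.99;
            spamProb["meeting"] = 0.05;
            spamProb["free"]    = 0.80;

            string[] message = { "free", "viagra" };
            Console.WriteLine("Spam score: {0:F3}", SpamScore(message, spamProb));
        }
    }

Run something like that on the server, with per-user training data and blacklists, and everybody gets the benefit no matter what email client they use.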

Do I expect this to happen?  Sadly, no.  Even as outdated and inefficient as our mail protocols are, I don’t expect them to be changed any time soon.  We’re left waiting for the established email clients to include this kind of feature, or for somebody to come up with a new email client that has a good interface, includes all of the features we’ve come to expect, and also has advanced spam blocking features.  I think it’s going to be a long wait.

Good deals on good wine

In case you haven’t heard, there’s a glut of wine on the market.  Vintners are struggling, and consumers are enjoying record low prices on some very good wines.  California growers enjoyed an excellent grape harvest this year, and now there’s way too much wine on the market.  Wine producers are victims of their own success.  California wine makers spent huge amounts of money marketing their products in the 80s and 90s, and successfully increased demand for their product.  That increased demand, combined with the boom years of the 90s, led to many newcomers entering the market and established companies planting new vineyards.  At some point along the way they crossed the line that separates meeting demand from overproduction.  Even without the current economic downturn, they’d have too much wine on their hands.  I don’t know if vintners are screaming for federal price supports or other types of relief yet, although it wouldn’t surprise me.

I’m keeping the ranting to a minimum lately, so I’ll just mention that it’s usually a good idea to see which way the wagon’s headed before you hop on it.  But if you like wine, now would be a great time to head down to your favorite retailer and get some.

Spirograph revisited

I think every kid had a Spirograph when I was growing up.  I know that I spent countless hours with my colored pens and those little plastic pieces, trying to overlap figures in different ways to make beautiful designs.  I’d pretty much forgotten about Spirograph until 1995, when my friend and co-author Jeff Duntemann wrote a program he called “Spiromania” for our book Delphi Programming Explorer.  I converted the program to C++ for The C++Builder Programming Explorer, and have played with it from time to time since, even toying with it when I was learning about writing Windows screen savers.  I’m at it again, but this time with completely new code written in C#.  It’s a project for learning .NET programming.

One of the cool things about computers is that you can simulate things that you just can’t do with a physical model.  For example, the two figures shown above were created by simulating a circle of radius 60 rolling around a circle of radius 29, with the pen on the edge of the bigger circle.  The only difference is that the figure on the left is drawn with smooth curves—which is what you’d get with a real Spirograph toy—and the figure on the right is drawn by plotting 5 points for each time the big circle goes around the little circle.  (In actuality, the figure on the left also is drawn using straight lines, but the lines are sufficiently short to give the illusion of smooth curves.)  In any case, the figure on the right would be impossible (well, okay, exceedingly difficult) to create using a physical Spirograph toy.
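
The point computation is just the standard equations for a circle rolling around the outside of another circle.  Here's a simplified C# sketch (not the code from my program; the radii match the figures described above, but everything else is illustrative).

    using System;
    using System.Drawing;

    class SpiroSketch
    {
        // Points of a circle of radius b rolling around the outside of a fixed
        // circle of radius a, with the pen a distance d from the rolling
        // circle's center.  pointsPerRev controls how coarsely the curve is
        // plotted: 5 gives the angular figure, a few hundred gives the
        // illusion of smooth curves.
        static PointF[] RollingCirclePoints(double a, double b, double d,
                                            int revolutions, int pointsPerRev)
        {
            PointF[] points = new PointF[revolutions * pointsPerRev + 1];
            for (int i = 0; i < points.Length; i++)
            {
                double t = 2.0 * Math.PI * i / pointsPerRev;
                double x = (a + b) * Math.Cos(t) - d * Math.Cos((a + b) / b * t);
                double y = (a + b) * Math.Sin(t) - d * Math.Sin((a + b) / b * t);
                points[i] = new PointF((float) x, (float) y);
            }
            return points;
        }

        static void Main()
        {
            // Fixed circle of radius 29, rolling circle of radius 60,
            // pen on the rolling circle's edge.
            PointF[] pts = RollingCirclePoints(29, 60, 60, 60, 5);
            Console.WriteLine("Computed {0} points", pts.Length);
        }
    }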

I’ve written a .NET custom control so I can drop these things on a form and fiddle with their properties.  I’m working now on some animation and a slightly better user interface, and eventually I’ll have a program that lets you create and manipulate multiple images, moving them around and overlapping them.  It’s a great way to learn a new programming environment.
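
In case you're wondering what a custom control involves, here's the rough shape of one.  This is a bare-bones sketch, not my actual control.

    using System;
    using System.Drawing;
    using System.Windows.Forms;

    // A bare-bones custom control: expose the figure's parameters as
    // properties and repaint whenever one of them changes.
    public class SpiroControl : Control
    {
        private int fixedRadius = 29;

        public int FixedRadius
        {
            get { return fixedRadius; }
            set { fixedRadius = value; Invalidate(); }   // trigger a repaint
        }

        protected override void OnPaint(PaintEventArgs e)
        {
            base.OnPaint(e);
            // Compute the curve's points (see the earlier sketch) and draw them,
            // for example with e.Graphics.DrawLines(Pens.Black, points).
        }
    }

Once the properties are public, the Visual Studio designer picks them up automatically, which is exactly what makes fiddling with them on a form so pleasant.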

More on Web services

The basic idea behind Web services isn’t really new.  Over the years I’ve seen a few IT shops that had standard protocols for their disparate systems to communicate.  They all had problems, though, because the protocols were mostly ad hoc creations that weren’t subjected to rigorous design and weren’t easy to modify or extend.  And there was no possible way that systems from company A could talk with systems from company B.

Microsoft’s innovation (and perhaps I’m stepping on a land mine using that word here) isn’t so much the individual ideas of standard protocols, standard language, automatic data description and discovery, or any such technology.  No, Microsoft’s innovation here is in tying all of those individual technologies together, filling in the holes with very solid design work, and pushing for standardization so that any program running on any modern system has the same ability to query any other system for information.

You don’t need .NET in order to write or access Web Services.

Naysayers and Microsoft bashers will complain (mostly unjustly, in my opinion) about the company shoving things down our throats, strong-arming the industry, or coercing standards organizations.  The plain fact of the matter is that our industry has needed something like this for at least 10 years, and no “consortium” has even come close to providing it.  Certainly, Microsoft hasn’t acted alone in this—they’ve had the cooperation of many other companies—but without Microsoft, it wouldn’t have happened.  Microsoft has provided a huge benefit to the industry by spearheading the creation of an open Web services architecture and making it freely available to all.  That they stand to make money from selling applications and development software based on Web Services doesn’t lessen the benefit that they have provided.  To the contrary, it should make them much more interested in ensuring that the standard is complete, consistent, and flexible.  From what I’ve seen of it, I believe it is.

Using Web services for integration

One of the things that impresses me most about Microsoft’s .NET strategy is not so much the technology behind it (although that is impressive), but rather the way they’re going about selling it to businesses.  Microsoft has identified a real concern in the business world:  disparate systems that have to share information.  For example, consider a medium-sized bank that has offices throughout southern and central Texas.  Among the systems that they support are:

  • Central data processing for posting checks, deposits, loan payments, etc.
  • Web site
  • Online banking
  • Automated clearing house for Fed transactions
  • Word processing with central document storage
  • Teller terminals
  • Automated Teller Machines
  • Customer Relationship Management system used by customer service representatives
  • Credit rating and scoring system
  • Human Resources

Those are just some of the internal systems.  They also would like to interface with their suppliers and business partners.  Some run on big iron, some on PCs, others on older systems that aren’t even supported by their manufacturers anymore.  Some of the systems are in a single location, and others are spread out over thousands of square miles.  Software is a mix of pre-packaged applications, commercial applications with custom modifications, and in-house custom applications.  Ideally, all of these systems could share data.  That turns out to be very difficult, though, due to incompatible formats (ASCII versus EBCDIC, for example), incompatible communications protocols, or other problems.

It’s certainly possible to modify each system so that it can interact with all the others.  The obvious way is to teach each individual system what it needs to know about each of the other systems.  Assuming that each of the 10 systems above needed to interact with all 9 others, you would have to write 90 different interfaces.  Even if you only had to write a quarter of those (23 interfaces), it’d be a daunting task.

Microsoft’s idea (more on that tomorrow) is simplicity itself:  write just 10 (maybe 11) interfaces.  If you can define a standard communications protocol, and a way for all applications to describe the data that they provide, then all you need to do for System A to talk with System F is to tell System A what data it needs to obtain.  It becomes almost a trivial matter to instruct System A to obtain the current balance information for a particular account.
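
To give a flavor of what such an interface looks like on the .NET side, here's a bare-bones ASP.NET web service (an illustration I made up; the class name, namespace URL, and account details are all hypothetical).  This would be the code-behind for an .asmx page.

    using System;
    using System.Web.Services;

    // Code-behind for a bare-bones .asmx web service.  Any client that can
    // speak SOAP over HTTP can call GetCurrentBalance without knowing how or
    // where the balance is actually stored.
    [WebService(Namespace="http://example.com/bank/")]
    public class AccountService : WebService
    {
        [WebMethod]
        public decimal GetCurrentBalance(string accountNumber)
        {
            // A real implementation would query the core banking system;
            // this one just returns a dummy value.
            return 1234.56m;
        }
    }

The framework generates the WSDL description of the service automatically, which is what lets other systems discover what data it provides.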

So why “maybe 11” interfaces?  Ideally, each system would be able to communicate with each of the others using the standard protocol.  If that’s not possible, though (consider an old machine that has no ability to communicate via TCP—they do exist), then an intermediary system will have to serve as a proxy.  The proxy machine accepts requests from the other systems on the standard protocol and relays those requests to the orphan systems, and vice versa.

This is the basic idea behind “Web Services.”  Although simple in concept, it still requires much thought and care in design and implementation.  More tomorrow.