Tuesday, April 19, 2016

Wash. Rinse. Repeat.

For many years, bottles of shampoo contained these instructions:
  1. Wash.
  2. Rinse.
  3. Repeat.
This is an example of what is called, in computer science, a "loop". You do something over and over again. Hopefully, at some point, you stop doing it. For example, these might be the instructions for adding up a list of numbers:
  1. If there are no more numbers, quit.
  2. Get the next number.
  3. Add it to the total.
  4. Go to step 1.
This and the shampoo instructions are examples of "procedural" programs. They tell you how to do something. There's another kind of programming, though, called "declarative" programming. With declarative programming, you say what you want done, but not how to do it.

For example, the shampoo instructions might say:
  1. Wash your hair.
This assumes that the shampoo user already knows to rinse the soap out and, possibly, to repeat the process a certain number of times.

In the case of a computer program, the "declarative" version might be:
  1. Sum these numbers.
It's not difficult to write a programming language that knows what to do when you give it a declarative instruction like this. I worked for 35 years programming in APL, a computer language invented in the 1960s that is largely declarative. The APL instruction for "sum these numbers" is:

      +/NUMBERS
If you've used a spreadsheet program like Excel, you're probably familiar with its declarative command for summing a list of numbers:

      SUM(A1:A5)

Most computers, deep down inside, still need to go through the list of numbers one at a time to add them up, but it's easy to write computer programs that accept declarative instructions like these and are smart enough to fill in all the steps necessary to produce the result.  In short, with declarative programming, you don't need to write "loops". The computer system knows how to do the loops for you. The loops are "implied" rather than "explicit".
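
To make the contrast concrete, here is a minimal sketch in Python (not a language used elsewhere in this post). The function names are made up for illustration. The first function spells out the four numbered steps from the list-summing example; the second simply asks for the sum and lets the language supply the loop.

```python
def sum_procedural(numbers):
    """Explicit loop: follows the four numbered steps one at a time."""
    total = 0
    position = 0
    while True:
        if position >= len(numbers):   # 1. If there are no more numbers, quit.
            break
        number = numbers[position]     # 2. Get the next number.
        position += 1
        total += number                # 3. Add it to the total.
        # 4. Go to step 1 (the while loop repeats).
    return total


def sum_declarative(numbers):
    """Implied loop: say what you want and let the built-in do the stepping."""
    return sum(numbers)


numbers = [3, 1, 4, 1, 5]
print(sum_procedural(numbers))   # 14
print(sum_declarative(numbers))  # 14
```

Both functions give the same answer. The only difference is who writes the loop.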

Some years ago I had an assignment to convert an actuarial accounting system from RPG to a more modern programming language. RPG is an IBM programming language invented in 1959 and still used today, though in a much evolved form. In its original form, RPG was highly declarative and the looping was implied. At its simplest, you wrote a program that operated on one input record and produced one output record. RPG knew how to run this program over and over again on a lot of different input records to produce a lot of different output records. You didn't have to write this loop yourself.
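
The idea of a program that handles one record while the system supplies the loop can be sketched in Python. This is only an illustration of the concept, not RPG and not the actuarial system mentioned above; the record fields and function names are hypothetical.

```python
def process_record(record):
    """The 'program': logic for ONE input record, producing ONE output record.

    The field names are hypothetical, chosen only for illustration.
    """
    return {
        "policy": record["policy"],
        "annual_premium": record["monthly_premium"] * 12,
    }


def run_cycle(program, input_records):
    """Stand-in for the system's implied loop: it feeds each record to the program."""
    return [program(record) for record in input_records]


input_records = [
    {"policy": "A100", "monthly_premium": 50},
    {"policy": "B200", "monthly_premium": 75},
]
output_records = run_cycle(process_record, input_records)
print(output_records)
```

Notice that `process_record` contains no loop at all. Everything about "do this for every record" lives in `run_cycle`, which is written once and reused.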

I realized as I worked on this project that the RPG model of programming involved a powerful simplification. Of necessity, each step of a complex process was broken down into small bits. Each bit took in a record from an input file, did a bit of work, and produced a record in an output file that was passed on as input to the next step in the process. RPG worked this way because it was designed to work efficiently on the very limited computers available at the time, but it also had the benefit of making it much easier to create correct programs. A very complex process, broken down in this way, became very manageable. It was something like "mass production" applied to programming. Unfortunately, as computers became more powerful, we started creating programs that weren't so simple. They ended up more artisanal than industrial, and correctness and maintainability suffered. I think we need to revisit the industrial model for maintaining business logic.
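
The "assembly line" arrangement might look something like the following Python sketch. It assumes in-memory lists standing in for the intermediate files, and the steps, field names, and the reference year are all invented for illustration.

```python
AS_OF_YEAR = 2016  # hypothetical reference year for the age calculation


def add_age(record):
    """Step 1: one record in, one record out, with an 'age' field added."""
    return {**record, "age": AS_OF_YEAR - record["birth_year"]}


def add_rate_band(record):
    """Step 2: relies only on the output of step 1."""
    band = "senior" if record["age"] >= 65 else "standard"
    return {**record, "band": band}


def run_pipeline(steps, records):
    """Run each step over every record; each step's output feeds the next step."""
    for step in steps:
        records = [step(record) for record in records]  # one "output file" per step
    return records


people = [
    {"name": "Ann", "birth_year": 1950},
    {"name": "Bob", "birth_year": 1980},
]
for row in run_pipeline([add_age, add_rate_band], people):
    print(row)
```

Because every step is a small function from one record to one record, each can be checked on its own, for example by calling `add_rate_band({"age": 70})` directly, before the whole line is assembled.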

One of the most successful declarative languages is SQL, used to retrieve and manipulate data in relational databases. Declarative languages have the same difficulty as functional programming, though, in that they often end up having long "one-line" expressions that can be difficult to work with. The following is a typical expression from SQL:

SELECT DISTINCT Pnumber FROM PROJECT WHERE Pnumber IN ( SELECT Pnumber FROM PROJECT, DEPARTMENT, EMPLOYEE WHERE Dnum=Dnumber AND Mgr_ssn=Ssn AND Lname='Smith' ) OR Pnumber IN ( SELECT Pno FROM WORKS_ON, EMPLOYEE WHERE Essn=Ssn AND Lname='Smith' );
These "one-liners" can be broken down and indented to make them more readable, but they're still difficult to work with:

SELECT DISTINCT Pnumber
FROM PROJECT
WHERE Pnumber IN (
      SELECT Pnumber
      FROM PROJECT, DEPARTMENT, EMPLOYEE
      WHERE Dnum=Dnumber AND Mgr_ssn=Ssn AND Lname='Smith' )
OR Pnumber IN (
      SELECT Pno
      FROM WORKS_ON, EMPLOYEE
      WHERE Essn=Ssn AND Lname='Smith' );

Writing and testing SQL statements like this is a huge pain. Blockly visual programming, though, is just the thing for these kinds of expressions. Sure enough, some folks at a German firm have created a Blockly SQL editor. You can see a demo at this YouTube video.
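
Short of a visual editor, one way to test a statement like the one above is to run it against a tiny throwaway database. The following is a rough sketch using Python's built-in sqlite3 module. The table definitions are assumptions: they include only the columns the query mentions, with guessed types, and the sample rows are invented.

```python
import sqlite3

# The query from the post (trailing semicolon dropped for execute()).
QUERY = """
SELECT DISTINCT Pnumber
FROM PROJECT
WHERE Pnumber IN (
      SELECT Pnumber
      FROM PROJECT, DEPARTMENT, EMPLOYEE
      WHERE Dnum=Dnumber AND Mgr_ssn=Ssn AND Lname='Smith' )
OR Pnumber IN (
      SELECT Pno
      FROM WORKS_ON, EMPLOYEE
      WHERE Essn=Ssn AND Lname='Smith' )
"""

# Assumed schema and invented sample data, just enough to exercise the query.
SETUP = """
CREATE TABLE EMPLOYEE (Ssn TEXT, Lname TEXT);
CREATE TABLE DEPARTMENT (Dnumber INTEGER, Mgr_ssn TEXT);
CREATE TABLE PROJECT (Pnumber INTEGER, Dnum INTEGER);
CREATE TABLE WORKS_ON (Essn TEXT, Pno INTEGER);

INSERT INTO EMPLOYEE VALUES ('111', 'Smith'), ('222', 'Wong'), ('333', 'Jones');
INSERT INTO DEPARTMENT VALUES (5, '111'), (4, '222');  -- Smith manages dept 5
INSERT INTO PROJECT VALUES (1, 5), (2, 4), (3, 4);
INSERT INTO WORKS_ON VALUES ('111', 2), ('333', 3);    -- Smith works on project 2
"""


def projects_for_smith():
    """Build the in-memory database, run the query, return sorted project numbers."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP)
        rows = conn.execute(QUERY).fetchall()
    finally:
        conn.close()
    return sorted(row[0] for row in rows)


print(projects_for_smith())  # [1, 2]
```

With this sample data, project 1 comes back because Smith manages its department, and project 2 because Smith works on it. Project 3 involves neither and is left out.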

The Blockly for Business Logic (BBL) idea I'm developing uses functional programming to process a single record. It relies on an external process to feed it records and handle results. The inputs would most likely be generated using SQL, and Blockly should be used to define the SQL-derived data inputs.

I've been working on developing examples of what BBL would look like when applied to various business requirements, which I'll share in future posts.
