Dienstag, 13. Januar 2015

First example: Validating birth dates in database

In the first example we have to implement a validation of a birth date, stored in table "person", column "birthdate".

Many times I have seen implementations like the following pseudocode:

validate_birthdate () {
    string today = today_iso();
    foreach (Person person = select ("select id, surname, prename,

                  birth, valid_state from person where...")
            ) {
        bool update = false;
        if (regex.match(person.date, 
                        "^\d{4,4}-[01]\d-[0-3]\d$")
           ) {
            if (person.date > today) {
                update = true;
                person.valid_state = "invalid birth";
            }
        }
        else {
            update = true;
            person.valid_state = "syntax birth";
        }
        if (update) {
             
update("update person "
                  + "set valid_state = " + person.valid_state + " "
                  + "where id = "        + person.id
                  );
        }
    }

What's wrong with this approach?


It is running and seems to be correct.
But how to test this code?

You need a test database.
You have to add rows into the person table. Possibly other rows have to be added before.
Read in rows added.
Then run method for test.
Read again rows and compare with original rows.
Delete added rows after test.

Possibly your SW has unexpected side effects. Or expected side effects, i.e. counters, ids, ...
Possibly your SW destroys something in the database.
Or not all of your testrows will/can be deleted.
Possibly you run in conflict with other tests using the same tables.
Or other tests causing problems in you tests somehow.

Best would be a private database, but creation may need hours or days. And you need space on disk, permissions, and...

And, also very important, it is a lot of work if you use mocking and testdoubles and write/use a database mock instead of a real database.

What todo?


First separate I/O from operation, extract validation into method do_validate_birth(person). Very easy with Eclipse and Visual Studio, they will do it for you. Just select body and choose "extract method" or something similary:

validate_birthdate () {
    foreach (Person person = select ("select id, surname, prename,

                   birth, valid_state from person where...")
             ) {
        bool update = this.do_validate_birth(person);
        if (update) {
             
update("update person "
                  + "set valid_state = " + person.valid_state + " "
                  + "where id = "        + person.id
                  );
        }
    }

}

bool do_validate_birth(Person person) {
        bool invalid = false;
        string today = today_iso();
        if (regex.match(person.date, "^\d{4,4}-[01]\d-[0-3]\d$")) {
            if (person.date > today) {
                invalid = true;
                person.valid_state = "invalid birth";
            }
        }
        else {
            invalid = true;
            person.valid_state = "syntax birth";
        }
        return invalid;
    }

What are the benefits?
See the test:

test_do_validate_birth() {
    object_in_test = new PersonValidator();

    Person person_to_validate = new Person;
    person.date = "2100-10-11";

    // now do the test
    bool result_is_invalid 
            = object_in_test.do_validate_birth(person)
            ;

    // validate test
    if (! result_is_invalid) {
        print_line 
            "test of "
          + "object_in_test.do_validate_birth(person) "
          + "failed!"
          ;
    }
}


All problems described above are gone! Totally! And you do no longer need mocks or testdoubles.
Using a Testframework could reduce effort for writing the tests additionally. In Perl you just code

ok (! $object_in_test->do_validate_birth($person),
   "! Person->do_validate_birth(person) for '$date'  ");

for doing the test and validation.

The Regex.match() call in do_validate_birth() should also be extracted into a rule or method like

is_iso_date_str(String date_string) {
    return regex.match(date_string, "^\d{4,4}-[01]\d-[0-3]\d$");
}

So the if line

if (regex.match(person.date, "^\d{4,4}-[01]\d-[0-3]\d$")) {

is replaced by

if (is_iso_date_str(person.date)) {

Now you need no longer knowledge about regexes to see what person.date should be.
And you can use and test is_iso_date_str() elsewhere.

But what about validate_birthdate? It still contains logic and a loop.

validate_birthdate () {
    foreach (Person person = select ("select id, surname, prename, 
             birth, valid_state from person where...")
            ) {
        bool update = this.do_validate_birth(person);
        if (update) {
            update("update person "
                 + "set valid_state = " + person.valid_state + " "
                 + "where id = "        + person.id
                 );
        }
    }
}

Let's change it to

validate_birthdate () {
    LIST person_list = select_persons();
    do_validate_list_of_births(person_list);
}

It contains no logic now. select_persons() does the select in the database, it does no longer interest, how.

do_validate_list_of_births(person_list) - the extracted method - does an update in Database:

bool do_validate_list_of_births(List person_list) {
    foreach (Person person in person_list) ) {
        bool update = this.do_validate_birth(person);
        if (update) {
            update("update person "
                 + "set valid_state = " + person.valid_state + " "
                 + "where id = "        + person.id
                 );
        }
    }
}

To handle this, we need an Action class or a closure:

bool do_validate_list_of_births(List   person_list,
                                Action update_action
                               ) {
    foreach (Person person in person_list) {
        bool update = this.do_validate_birth(person);
        update_action(person, update);
    }
}

Now only a loop is left. Trivial to test, just give action doing nothing for test.

Action may be created outside or inside of validate_birthdate:

validate_birthdate () {
    LIST person_list = this.select_persons();
    do_validate_list_of_births(person_list, \update_action());
}

\update_action() is a reference to function update_action and defined like:

update_action(Person person, bool update) {
    if (update) {
            update("update person "
                 + "set valid_state = " + person.valid_state + " "
                 + "where id = "        + person.id
                 );
        }
}

Summary


We have now 6 methods instead of 1:

--- Integration, no logic, no test needed


this.validate_birthdate()
do_validate_list_of_births()

--- Operations doing the validation


do_validate_birth(person)
is_iso_date_str(date_string)

--- Operations for I/O


update_action(person, update)
select_persons()

Benefits


But now the code is easy to understand, easy, fast to test -> less/no errors and easy to reuse.
And as result easy to adopt/change.

Only validate_birthdate( ) has to stay in original class. All other methods may be moved to other classes. And the call from outside to validate_birthdate() stays unchanged.

So we redesigned a part of code without effects to outside.

With a little bit of praxis you split it like this during development without much more development effort, but much less error solving effort afterwards.

Donnerstag, 8. Januar 2015

Data - Operation - Integration and Glue

Data - Operation - Integration and Glue

Trying to build up software by atom components like molecules


Hi all,

one year ago I started separation of my classes into data, operation, input, output, integration and glue classes. This was inspired by blog of Ralf Westphal:

http://blog.ralfw.de/2013/04/software-fraktal-funktionale.html

If you follow this approach, you don't need any mocking or testdoubles anymore. Yes really!!
And then you can easily write tests.

I think, this may be a new programming paradigm,  I call it "molecular software development".

But how to illustrate this?

Think of classes like atoms, then a SW component will be a molecule.

D:    Data          (Daten)
C:    Converter
O:    Operation
I:    Integrator
B:    Binder, Glue  (Verbinder)
F:    Factory

Image missing....

Data classes are full egocentric,  they only can be connected with other data. They have no complex functionality, also validation is done in separate operation classes. They will be passed through the system.

Operations know only themselves and data objects. They have inputs (methods) and cannot directly call other operations or integrations. To call something outside,  they have to call glue connector classes (lambdas, if existing in programming language). Here all the logic will be implemented. Only here! Here tests are needed. Many tests, but easy to implement with no need for mocking or testdoubles!

Converter are special operations that transform data between different classes or versions. They transform i.e. car.v1 into car.v2 and create, move or change attributes to do that. So all other operations use only a fix version of data, i.e. op1 use car.v1, op2 car.v2.

Input and Output are separated into own operation classes, because they are difficult to test.

Integrators and glue connectors should not contain complex logic, no logic is best case. They just convert data and call operations or other integrators. They usually need not to be tested, but they can. Glue is given to operator classes to call something in outside.

Factory or builder classes are creating instances and combine them to components or applications.

The idea of "molecular software development" is now not to allow other connections than described above. In best case it would be impossible to violate these rules by development environment.

In very small projects everything may be mixed in a single class, but methods should NOT mix behavior, so you will get operation methods, integration methods, input and output methods.

To be continued...