Tuesday 31 July 2012

Generating random numbers

One thing that is always difficult in a system is generating a truly random number.  Computers aren't random; they're very logical, which makes this inherently difficult.  Having said that, there's always a way to calculate a number which is "random enough".  There is no function for this in Uniface, so I'm going to look at a few different ways to achieve it.


1) perform - the only way we could find to do this originally was to build a random number function in C++ and then call out to it from our Uniface program.  Something like this...


  perform "GetRandomNumber" ;call 3gl function which returns 0-32767
  rand = $1 / 32767


2) $uuid - since the Uniface Unique Identifier function was added, there has been an alternative method.  The identifier is largely based on the current timestamp and includes either the processor ID or the ethernet address, depending on the operating system you are using.  The value returned is a 32 character hexadecimal string, so we need to remove the non-numeric characters.



  rand = "0.%%$replace($replace($uuid,1,'&',"",-1),1,"-","",-1)%%%" * 1



We actually found that this was not random enough on non-Windows systems, as the last part of the identifier is the same throughout each transaction, so we have used characters from 3 identifiers.
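As a rough illustration outside Uniface, the "keep the digits and put 0. in front" step can be sketched in Python.  The function name fraction_from_identifier and the sample string are made up for this example; the string is not real $uuid output.

```python
DECIMAL_DIGITS = "0123456789"


def fraction_from_identifier(identifier: str) -> float:
    """Keep only the decimal digits of an identifier and read them as 0.<digits>."""
    digits = "".join(ch for ch in identifier if ch in DECIMAL_DIGITS)
    if not digits:
        # No digits at all: return 0.0 rather than fail (an assumption of this sketch).
        return 0.0
    return float("0." + digits)


if __name__ == "__main__":
    # Hypothetical identifier, not a real $uuid value.
    print(fraction_from_identifier("1a2b-3c4d"))
```

Any part of the identifier that stays constant contributes the same digits every time, which is why mixing characters from several identifiers helped.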


3) DIY - you could also create your own random number generator; after all, these are just mathematical formulas.  The C++ "rand" function that we utilise in method (1) is a simple Linear Congruential Generator (at least in the Microsoft C runtime, whose constants are used below).  This takes an initial seed value and uses it to create the next number in the sequence, which is then used as the seed for the number after that.  The key is finding a combination of values that gives an evenly distributed spread of numbers, so that the numbers appear suitably random.


  $$rand = ((214013 * $$rand) + 2531011) % 4294967296
  rand = $$rand / 4294967296


As you can see, this relies on the seed value already being populated, which I've stored in a global register in this example.  This could be set in the application shell execute trigger, maybe using $uuid or a time based numeric.  
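To make the seed handling easier to see, here is the same formula sketched in Python rather than Uniface.  The constants are the ones used above; the class name Lcg and the method name next_fraction are just names chosen for this sketch.

```python
MULTIPLIER = 214013
INCREMENT = 2531011
MODULUS = 4294967296  # 2 ** 32, the same modulus as the Uniface example


class Lcg:
    """Minimal linear congruential generator; state plays the role of $$rand."""

    def __init__(self, seed: int) -> None:
        self.state = seed % MODULUS

    def next_fraction(self) -> float:
        # Advance the state, then scale it into the range 0 <= value < 1.
        self.state = (MULTIPLIER * self.state + INCREMENT) % MODULUS
        return self.state / MODULUS


if __name__ == "__main__":
    generator = Lcg(seed=1)
    for _ in range(3):
        print(generator.next_fraction())
```

The same seed always gives the same sequence, which is why the seed needs to be set from something that varies between sessions.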


Often these algorithms return a subset of the bits in order to improve the spread, but it is not possible to do extraction at the bit level in Uniface, as far as I'm aware.  Another algorithm that is popular (and generally considered better) is a Mersenne Twister, but this uses bit-shifting techniques that I don't think are possible in Uniface either.
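For what it's worth, taking a subset of the bits does not strictly need bit operators; integer division and modulus by powers of two give the same result.  The Python sketch below shows the arithmetic for bits 16 to 30 of the state.  The function name high_bits is made up, and whether Uniface has a convenient truncating division to do the same thing is not something I've checked here.

```python
def high_bits(state: int) -> int:
    """Return bits 16..30 of state using only integer division and modulus."""
    # Dividing by 2 ** 16 drops the low 16 bits; modulus 2 ** 15 keeps the next 15.
    return (state // 65536) % 32768


if __name__ == "__main__":
    state = 2745024  # first state of the generator above when seeded with 1
    print(high_bits(state), (state >> 16) & 0x7FFF)
```

Both expressions print the same value, so the bit-level version and the arithmetic version are interchangeable for non-negative integers.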


So let's test the performance of these different methods over 2,000,000 iterations...


1) perform = 00:10.00, 00:10.00, 00:10.01 (10 seconds)
2) $uuid = 00:32.31, 00:32.44, 00:32.39 (over 32 seconds)
3) DIY = 00:34.75, 00:34.72, 00:34.72 (under 35 seconds)


As you can see, the original perform is the quickest method (although we've found that using a 3GL function generally does not hold up very well under load, and these tests are single user only).  It can be hard to support a 3GL function across multiple platforms, but this solution is mathematically the most random method.  Of the alternatives, $uuid is quite simple but does not give as good a spread of random numbers as the DIY method.


It should be emphasised that none of these methods are truly random, and therefore should not be used for cryptographic purposes.  They should be suitable for simple things though, like simulating a dice throw.
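As an example of the dice throw, a fraction between 0 and 1 can be scaled up to a whole number from 1 to 6.  This is a Python sketch of the arithmetic only; the function name dice_from_fraction is made up for the example.

```python
def dice_from_fraction(fraction: float, sides: int = 6) -> int:
    """Map a fraction in the range 0 <= fraction < 1 onto a die face 1..sides."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be at least 0 and less than 1")
    # Scale up, drop the decimal part, then shift from 0..sides-1 to 1..sides.
    return int(fraction * sides) + 1


if __name__ == "__main__":
    for sample in (0.0, 0.25, 0.5, 0.999):
        print(sample, dice_from_fraction(sample))
```

Note that a fraction of exactly 1 would give 7, so a method that can return 1 (such as dividing a value of 32767 by 32767) needs that case handled, for example by dividing by 32768 instead.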


Hopefully one day Uniface will provide its own $rand or $random function - a native function should perform the best and would hopefully be implemented in a way that was suitably random with a decent spread.


Summary: If it's feasible to use a perform then this is the best way to go, both for speed and randomness.  However, you may wish to consider building your own random number generator, possibly using a Linear Congruential algorithm.

Wednesday 25 July 2012

User-defined functions

One of the things which I initially found a little odd, although very simple to grasp, was the way that Uniface allows you to create modules of code as an entry, not a function.  This entry has any number of parameters (well, there probably is a limit) which can be defined as "in", "out", or "inout".  The only thing that is returned from this entry is a numeric status, which sets $status accordingly.  I think it's fairly common for developers to use a status of "0" to indicate success, positive values for additional successful statuses, and negative values for errors.  Any other type of value can be returned in an "out" or "inout" parameter.


This is a very simple mechanism and it works nicely.  However, it can sometimes lead to rather verbose code, as you have to put the call to the entry on a separate line; you can't nest the call within the condition of an if or while statement, for example.  Here's a simple entry which converts kilometres into miles...


  entry kms_to_mls_entr
  params
    numeric kms : in
    numeric mls : out
  endparams
    mls = kms*5/8
    return 0
  end


To convert two values using the entry and then add them, it takes 3 lines, like this...

  call kms_to_mls_entr(kms1,mls1)
  call kms_to_mls_entr(kms2,mls2)
  total = mls1 + mls2



From version 9 onwards (sorry, I can't remember the minor version number) you have another option.  It took me a while to find any information in the manuals, but it's there; they're called "User-Defined Functions".  Here's an example...

  entry kms_to_mls_func
  returns numeric
  params
    numeric kms : in
  endparams
    return kms*5/8
  end



Notice that there is a returns statement before the params, which determines what datatype should be returned.  Here it is numeric, but it doesn't need to be.  To convert the two values using this function takes a single line...

  total = kms_to_mls_func(kms1) + kms_to_mls_func(kms2)

This can make the code a bit neater.  But of course I'm obsessed with performance, so my next step is to test these two methods over 2,000,000 iterations...


  • entry = 00:17.74, 00:17.69, 00:17.60 (under 18 seconds)
  • function = 00:13.64, 00:13.64, 00:13.67 (under 14 seconds)

So as you can see, the code is more concise and it performs better.  It also relies on fewer variables being defined, so it's a win-win all round really.  You can even do this with a global procedure.



A bit of a side note; I thought you always had to specify a datatype when defining a parameter, and that this was always a local variable.  I discovered recently that you can also use a component variable or a painted field name, in which case you don't specify the datatype.  For example...


  entry kms_to_mls_entr
  params
    numeric kms : in
    $miles$ : out
  endparams
    $miles$ = kms*5/8
    return 0
  end


Summary: It is possible to create a "user-defined function", which is an entry that returns a specific datatype, allowing you to create more concise code that also performs better.

Friday 20 July 2012

Types of for loops - part two


In part one I talked about the forlist statements and how these could be used to iterate through Uniface lists more easily.  The next thing I want to talk about is looping through entity occurrences.  Personally I've always done this using a combination of setocc and $curocc, something like this...



  setocc "ent",1
  while ( $status > 0 )
    ;do something
    setocc "ent",$curocc(ent)+1
  endwhile



This has the advantage of not needing to use any variables for the loop.  However, it may be better for performance if a similar loop was used, but using variables to control it...


  count = 0
  stat = $hits(ent)
  while ( count < stat )
    count = count+1
    setocc "ent",count
    ;do something
  endwhile



Another alternative would be to use the new statement forentity, which was also added in Uniface 9.5...


  forentity "ent"
    ;do something
  endfor



As you can see, the code is much more concise.  There is no need to initialise a counter or check $status; everything is done as part of the forentity statement, and stepping through the occurrences is done automatically.

So let's test these three blocks of code over 65,000 iterations...

  • while ($curocc) = 00:00.47, 00:00.47, 00:00.47 (about half a second)
  • while (variables) = 00:04.62, 00:04.25, 00:05.10 (about 5 seconds)
  • forentity = 00:00.37, 00:00.34, 00:00.35 (about a third of a second)


As you can see, the new forentity is more concise and also performs better, significantly so when compared with the variable-controlled loop.

I was surprised by how slow the second method was compared to the first.  The only explanation I have for this is that $hits is taking a long time to complete the hit-list before the loop starts, rather than completing the hit-list as it goes through the loop.  Turns out I’ve been doing it a pretty efficient way all along, but I like the simplicity of the new forentity statement.  Which leads me to an almost identical summary as in part one. 

Summary: Whilst I have previously always used while loops, I shall now be considering switching to forentity loops for iterating through entity occurrences.

Thursday 19 July 2012

Types of for loops - part one

I have already discussed the basic for loop in my last post, but in Uniface 9.5 there were a number of other list constructs made available, which I plan to investigate over the next few posts, having never used them before.  


I wrote a post a couple of months ago entitled Performance of list processing, which looked at different ways of looping through a Uniface list of values.  In this post I determined that one of the quickest ways was a while loop with a counter, using getitem to extract each value in turn, something like this...


  count = 0
  $status = 1
  while ( $status > 0 )
    count = count+1
    getitem temp,list,count
    ;do something
  endwhile


However, one of the new constructs is forlist, which can be used to the same effect...

  forlist temp,count in list
    ;do something
  endfor

As you can see, the code is much more concise.  There is no need to initialise the count variable or $status; everything is done as part of the forlist statement, and the incrementing and extracting are done automatically.  The "count" variable is optional, so if you don't need it then you don't need to include it.


It is also possible to do the same thing when you have an ID list, where the same construct will return both the ID and the value of each item in the list separately...


  forlist/id id,temp,count in list
    ;do something
  endfor

The "count" variable is also optional in this case.  You know what's coming next...

So let's test these three blocks of code over 2,000,000 iterations...

  • while = 01:18.47, 01:17.55, 01:18.11 (around 1 minute 18 seconds)
  • forlist = 01:08.06, 01:08.51, 01:08.42 (just over 1 minute 8 seconds)
  • forlist/id = 01:17.27, 01:18.03, 01:17.26 (just over 1 minute 17 seconds)

As you can see, forlist is not only more concise from a coding perspective, but it also performs better.  Given that it took 2,000,000 iterations to save about 10 seconds, the gain in a typical loop would probably be limited, but it is clearly the better option.
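To put a number on that gain, the timings above can be averaged and divided by the number of iterations.  This is a small Python sketch of the arithmetic; the helper names to_seconds and average_seconds are made up, and the figures are the while and forlist timings listed above.

```python
ITERATIONS = 2_000_000


def to_seconds(timing: str) -> float:
    """Convert a timing written as mm:ss.hh into seconds."""
    minutes, seconds = timing.split(":")
    return int(minutes) * 60 + float(seconds)


def average_seconds(timings: list[str]) -> float:
    """Average a list of mm:ss.hh timings, in seconds."""
    return sum(to_seconds(t) for t in timings) / len(timings)


if __name__ == "__main__":
    while_avg = average_seconds(["01:18.47", "01:17.55", "01:18.11"])
    forlist_avg = average_seconds(["01:08.06", "01:08.51", "01:08.42"])
    saving_us = (while_avg - forlist_avg) / ITERATIONS * 1_000_000
    print(f"while {while_avg:.2f}s, forlist {forlist_avg:.2f}s")
    print(f"saving per iteration: {saving_us:.2f} microseconds")
```

On these figures the saving works out at roughly 5 microseconds per iteration, so it only adds up in very large loops.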

Summary: Whilst I have previously always used while loops, I shall now be considering switching to forlist loops for iterating through a list.

Tuesday 10 July 2012

Types of simple loops

There are lots of different ways to create loops in Uniface.  Mostly I've found that developers stick to the type they prefer, the one that makes most sense to the way that they think about code.  However, avoiding work in loops is something that is worth considering.  Whenever you have a loop, the conditionality of the loop is run for every iteration, therefore big performance gains can be had by thinking about this conditionality, especially in large loops.  And by large, I mean lots of iterations, not lots of code inside the loop.


So what are the different types of loops...


The while loop is very simple in its construct.  It just has a condition, which can contain any expression that you wish to put there.  It will be checked at the beginning of each loop, continuing if the condition is true, so if it is never true then the code will never run.


The repeat loop is also very simple.  It also has a condition, which can contain any expression that you wish.  However, the condition is checked at the end of each loop.  The condition is also reversed; in this case it will loop until the condition is true.  This means that if it is true at the start then it will still run the code once.

The for loop has only recently been added to Uniface, in version 9.5.  It has a "counter" variable, a "start" value, an "end" value, and an optional "step" value (which is 1 by default).  This allows you to clearly define the number of times that you wish to loop, right up front.  The condition will be checked at the beginning of each loop, so if the "start" value is already greater than the "end" value (with a positive step) then the code will never run.
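The difference between checking the condition before the body (while) and after it (repeat) can be sketched outside Uniface.  Python has no repeat statement, so the sketch below imitates one with a break at the end of the body; the function names are made up for the example.

```python
def while_loop_runs(start: int, limit: int) -> int:
    """Count how many times the body runs when the check comes first."""
    count = start
    runs = 0
    while count < limit:
        count += 1
        runs += 1
    return runs


def repeat_loop_runs(start: int, limit: int) -> int:
    """Count how many times the body runs when the check comes last."""
    count = start
    runs = 0
    while True:
        count += 1
        runs += 1
        if count >= limit:  # the "until" condition, checked after the body
            break
    return runs


if __name__ == "__main__":
    print(while_loop_runs(0, 5), repeat_loop_runs(0, 5))
    print(while_loop_runs(5, 5), repeat_loop_runs(5, 5))
```

Starting from 0 both loops run the body 5 times, but starting from 5 the while version never runs it and the repeat version runs it once.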


Uniface 9.5 also added some more specific loop commands, but I'll talk about these in a future post.

Having discussed their differences, here is how to make each of them iterate 5 times, and the time it takes to do so 20,000,000 times...


1) while = 00:28.80, 00:28.92, 00:29.01 (about 29 seconds)

  count = 0 
  while ( count < 5 )
    count = count + 1
    ;do something
  endwhile



2) repeat = 00:29.02, 00:29.32, 00:28.93 (about 29 seconds)

  count = 0 
  repeat
    count = count + 1
    ;do something
  until (count >= 5)


3) for = 00:29.05, 00:29.08, 00:29.01 (about 29 seconds)

  for count = 1 to 5
    ;do something
  endfor
  count = count - 1 ;count is 6 when the loop ends, so subtract 1 to match the other examples



Given the timings, I should probably go back and rewrite my starting paragraph.  But just to prove that I actually write these things as I'm going, I'll do a big U-turn instead :)

Summary: There are different ways of looping in Uniface, but they all perform equally well, so pick your favourite.  Think about the conditionality though, as it will be processed for each iteration.


Monday 2 July 2012

Assignment operators

This is sort of an undocumented feature, certainly not one that I knew about until today, but as I discovered it in the documentation, I couldn't really label it as undocumented.  It's used in an example in the for loop section (which I'm investigating for a future blog post!) but not in the operators section, where I would have expected it.


In JavaScript there are three ways of incrementing a numeric value...

  • c = c + 1
  • c += 1
  • c++

The first is probably the most obvious at first glance, but the other two make sense once you know what they mean.  Both + and ++ are referred to as "arithmetic operators" whereas += is referred to as an "assignment operator".

Whilst I've known this in JavaScript for many years, I had no idea that you could also use the following assignment operators in Uniface...

  • Addition: c += 1 (which is the same as c = c + 1)
  • Subtraction: c -= 1 (which is the same as c = c - 1)
  • Multiplication: c *= 1 (which is the same as c = c * 1)
  • Division: c /= 1 (which is the same as c = c / 1)
  • Modulus: c %= 1 (which is the same as c = c % 1)

Interestingly, in JavaScript if you attempt to do += with string values then it concatenates the two strings, but Uniface will always treat this as a numeric operator and attempt to cast the strings as numerics.

I've checked the performance of these shorthand operators with their longhand equivalents and they seem to perform identically, which possibly indicates that the compiler interprets them in the same way.  This means it is really a personal choice over which style the developer prefers.

It's worth noting that the arithmetic operators ++ and -- do not work - these cause compile errors.

Summary: There are shorthand numeric assignment operators available in Uniface, if you prefer this style of syntax.