Learning R for PL/SQL Developers—Part 2
 
by Arup Nanda

Part 2 of a three-part series that presents an easier way to learn R by comparing and contrasting it to PL/SQL.

Table of Contents
Reading Input
What IF
Or, ELSE...
Null is Nothing
Let's Make a Case
FOR the Love of Loop
Give Me a Break
Looping While
Breaking from the While Loop
Repeat Until I Break
Next, Please
Let's Continue
Functions
Variable Scope
Summary
Quiz
About the Author
 

Welcome to the second installment of this series. In this installment, you will learn about the more advanced concepts of the R language such as evaluating conditions, executing loops, and creating program units such as functions.

Reading Input

But before we start, let's explore a rather trivial activity in any interactive program: accepting an input from the user at runtime. The R function for that is called readline(). It prompts the user for an input and reads the input to be stored as a value. Here is an example:

> v1 <- readline(prompt = "what's the good word> ")
what's the good word> 

You can enter a value at the blinking cursor. Suppose you enter 1. The variable v1 will be assigned the value of 1. If you type v1, you will see the value. Remember from Part 1 that you can just type the variable name and the value will be displayed. There is no need for a print() function.

> v1
[1] "1"

But note the double quotes. This is a character string. You can confirm that by using the class() function, which you learned about in Part 1:

> class(v1)
[1] "character"

If you want to use the value as a number, then you must convert it to a number by using the as.numeric() function, which you learned about in Part 1, or the as.integer() function. Here is an example:

> v1 <- as.numeric(v1)

Confirm that it's a numeric value now:

> class(v1)
[1] "numeric"

Alternatively, you can use this:

> v1 <- as.integer(v1)
> class(v1)
[1] "integer"
What IF

Like the IF statement in PL/SQL, the most basic conditional operation in R is also IF. In PL/SQL, the general structure of the IF condition is:

IF conditional expression THEN
  some statements
ELSE 
  some statements
END IF;

Here is some sample code:

-- pl1.sql

declare
  v1 number;
begin
  v1 := &inputvalue;
  if (v1 < 101) then
    dbms_output.put_line('v1 is less than 101');
  end if;
end;

Here is the output (after you input 100 at the prompt):

Enter value for inputvalue: 100
old 4: v1 := &inputvalue;
new 4: v1 := 100;
v1 is less than 101

Here is how you write the same logic in R:

#r1.txt

v1 <- is.integer(readline(prompt = "enter a number> "))
if (v1<101)
{
   print ("v1 is less than 101")
}

Here is the output:

[1] "v1 is less than 101"

There are a few things to note here before we further:

  • After the if condition, the action block is enclosed by two curly braces: { and }. This is the standard convention in R. This is similar to the convention in C language.
  • There is no line-ending character, for example, a semicolon (;). A line ends with the normal end-of-line character.
  • The indentation is purely for readability reasons. R doesn't need the indentation as syntax, just like PL/SQL. This is different from other languages such as Python, where indentation denotes the block that the code belongs to. In R, the curly braces delineate that block.
Or, ELSE ...

When there is IF, there must be ELSE. In PL/SQL, you would write like this:

--pl2.sql

declare
  v1 number;
begin
  v1 := &inputvalue;
  if (v1 < 101) then
    dbms_output.put_line('v1 is less than 101');
  else
    dbms_output.put_line('v1 is not less than 101');
  end if;
end;

The R syntax is also the same—-ELSE—but there is a catch you should be aware of. Here is the equivalent R code:

v1 <- 100
if (v1 < 101)
{
   print ('v1 is less than 101')
print ('Not Indented')
} else
{
   print ('v1 is greater than 101')
      print ('Way Too much Indented')
}

I used the indentation messages to show you how indentations are not important in R, just as in PL/SQL. But I want to show you a very important differentiator. Note the presence of a curly brace before the else in the code above. The ending curly brace of the IF condition tells R to stop evaluating and start processing. If you have an ELSE, it must come in the same line after the curly brace; otherwise, the R interpreter will not be able to evaluate the else. Note what happens when you put else in the next line:

# r2a.txt
v1 <- 100
if (v1<=100)
{
   print ('v1 is less than or equal to 100')
print ('Not Indented')
} 
else 
{
   print ('v1 is greater than 100')
      print ('Way Too much Indented')
}

Output:

[1] "v1 is less than or equal to 100"
[1] "Not Indented"
Error: unexpected 'else' in "else"
Execution halted

The else was not properly handled, because it was not in the same line as the ending curly brace of the IF. This is a very important syntax difference from PL/SQL that you should be aware of. Most developers familiar with other languages such as C, where the curly braces are used as well, make this mistake.

What if you want to put another condition? There could be another IF after the ELSE:

#r3.txt
v1 <- 100
if (v1 < 100)
{
   print ('v1 is less than 100')
} else 
  if (v1 == 100)
{
   print ('v1 is equal to 100')
} else
{
   print ('v1 is greater than 100')
}

Executing that code, we get this:

C:\>rscript r3.txt
[1] "v1 is equal to 100"
Null is Nothing

When you have to put in a line but nothing needs to be done, you usually use a NULL statement in PL/SQL.

--pl4.sql
declare
  x number := 10;
  y number := 11;
begin
  if (x<y) then
    dbms_output.put_line('Yes');
  else
    null;
  end if;
end;
/

The null statement in line 9 is required. You have to put a valid PL/SQL statement within the IF and END IF statements. Otherwise, the code will produce an error. In R, you don't need to put anything between the curly braces. The following is the equivalent code in R.

# r4.txt

x <- 10
y <- 11
if (x<y) {
  print ("Yes")
} else
{
}
Let's Make a Case

The CASE statement of PL/SQL is pretty powerful. It allows you to define several conditions in one expression. Here is an example.

--pl5.sql

declare
  n1 number;
begin
  n1 := 5;
  case
    when (n1<=25) then
      dbms_output.put_line('n1 is within 25');
    when (n1<=50) then
      dbms_output.put_line('n1 is within 50');
    when (n1<=75) then
      dbms_output.put_line('n1 is within 75');
    else
      dbms_output.put_line('n1 is greater than 75');
  end case;
end;
/

There is an equivalent of the CASE statement in R. It's called the switch() function. But unlike CASE, the switch() function works in different ways when the first argument is an integer or a character. Let's first see the effect with an integer input:

> v1 <- switch (1,'first','second','third','fourth')
> v1
[1] "first"

The switch statement returned "first" because the first argument is the integer 1. Therefore, the switch statement picked the first argument from the list of selections. Similarly, if you choose other numbers, switch will choose the corresponding values.

# r5.txt

> v1 <- switch (2,'first','second','third','fourth')
> v1
[1] "second"
> v1 <- switch (3,'first','second','third','fourth')
> v1
[1] "third"
> v1 <- switch (4,'first','second','third','fourth')
> v1
[1] "fourth"

What if you pass a number for which there is no corresponding selection, for example, 0? Let's see:

# r5a.txt

> v1 <- switch (0,'first','second','third','fourth')
> v1
NULL
> v1 <- switch (5,'first','second','third','fourth')
> v1
NULL

Note the output, which shows NULL. In Part 1, you saw a value called NA, which was roughly the equivalent of NULL in PL/SQL. The R NULL has no clear equivalent; it's sort of undefined.

The first argument does not have to be an integer. It could also be an implied integer, for example, a boolean value. Remember, boolean TRUE and FALSE evaluate to 1 and 0, respectively. If the first parameter is an expression that results in a boolean value, it is converted to an integer. If the expression is 3 < 5, the result will be TRUE, that is, 1; so switch will pick the first argument from the choices:

# r5b.txt

> v1 <- switch (3<5,'first','second','third','fourth')
> v1
[1] "first"

Unfortunately it doesn't work in all cases. What if the expression evaluates to FALSE? It will be then converted to 0; but there is no 0th choice. So it will return a NULL:

# r5c.txt

> v1 <- switch (3>5,'first','second','third','fourth')
> v1
NULL

The switch function works differently when the first input is a character. In this format, you will need to provide the return values for several input values. You can also provide the default value for when none of the input values match. For instance, here is an R expression that returns the position of vowels, that is, 1 for a, 2 for b, and so on. It should return 0 if the input is not a vowel.

# r6.txt

> v1 <- switch('a',a=1,e=2,i=3,o=4,u=5,0)
> v1
[1] 1
> v1 <- switch('z',a=1,e=2,i=3,o=4,u=5,0)
> v1
[1] 0

This format of the switch() function is closer to the CASE statement in PL/SQL when there is a default value.

FOR the Love of Loop

Consider the usual looping code in PL/SQL using the FOR ... LOOP ... END LOOP construct. The general structure is this:

FOR i in StartingNumber ... EndingNumber LOOP
   ...
END LOOP;

The R equivalent also uses for; but there is no LOOP keyword and, consequently, no END LOOP either. As in the case of the if statement, the end of the conditional expression is marked by the curly brace start. Also, like the if statement, the block of program statements to be repeated is identified by being enclosed inside curly braces.

But before we start talking about loop, we need to know how to create a start and end number range. It's the seq() function.

> seq(10,20)
[1] 10 11 12 13 14 15 16 17 18 19 20

Let's see a very simple function that generates 11 values from 10 to 20 and displays them one by one:

Here's PL/SQL code:

-- pl7.sql

begin
  for i in 1..10 loop
    dbms_output.put_line('i= '||i);
  end loop;
end;

Here's the R code:

for (i in seq(10:20)) {
  print(i)
}

Here's the output:

[1] 10
[1] 11
[1] 12
[1] 13
[1] 14
[1] 15
[1] 16
[1] 17
[1] 18
[1] 19
[1] 20

The seq() function is very useful in R. You will be using it a lot to create data, especially when trying to fit models. Let's see some more parameters of this function. If you want to skip values, you can pass the optional third parameter to specify the skip values. Let's say, we want to print 10 to 20 but skip every 2 numbers.

> seq(10,20,2)
[1] 10 12 14 16 18 20

Similarly we can use the third parameter to increment negatively. To produce numbers from 20 to 10, incrementing by 1, the third parameter should be -1:

> seq(20,10,-1)
[1] 20 19 18 17 16 15 14 13 12 11 10
Give Me a Break

If you want to break the loop iteration, just use the break statement, which is exactly same as the break statement in PL/SQL. If you want to enter a number and check if it has a multiple between 10 and 20, you could use the following. In R %% is the modulo operator, equivalent to the mod() function in PL/SQL. The expression v1 %% v2 returns 0 if v1 is a multiple of v2. You want to iterate through the loop from 10 and 20, but stop when you find a multiple. The break statement comes in there.

# r8.txt

n <- as.integer(readline("Enter a number> "))
for (i in rep(10:20)) {
  print(i)
  if (i%%n == 0)
  {
     break
  }  
}

Executing it produces the following:

> source ("r8.txt")

Enter a number> 7
[1] 10
[1] 11
[1] 12
[1] 13
[1] 14
Looping While

The second type of loop we will cover is a variant of FOR but without a start and end: the WHILE loop. It allows you to loop as long as a condition is met (the condition can be set to always be true for a forever loop). Here is an example of printing the 10 number.

PL/SQL code:

-- pl9.sql
declare
  i number := 0;
begin
  while (i<11) loop
     dbms_output.put_line('i= '||i);
     i := i+1;
  end loop;
end;
/

The output:

i= 0
i= 1
i= 2
i= 3
i= 4
i= 5
i= 6
i= 7
i= 8
i= 9
i= 10

In R, the syntax is the same: while. Like the FOR loop, the code to be inside the WHILE loop is marked by the curly braces, equivalent to the BEGIN and END markers of PL/SQL. Like PL/SQL, the indentations are merely for readability; they are not part of the syntax.

# r9.txt
i <- 0
while (i<11) {
  print(i)
  i <- i+1
}

The output:

[1] 0
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10
Breaking from the While Loop

Suppose you want to put a condition in the loop that will make the program break away from the loop when the condition is satisfied. For instance, in the previous program, you want to break form the loop when the variable i is a multiple of 5. In PL/SQL, you can do that in two different ways:

  • exit when ConditionIsSatisfied
  • if (ConditionIsSatisfied) then exit

Functionally they are the same. In R, the keyword break breaks the loop from executing and jumps out to the first line after the loop. We will examine the approaches in both these languages.

In PL/SQL using approach 1:

--pl10a.sql
declare
  i number := 1;
begin
  while (i<11) loop
    exit when mod (i,5) = 0;
    dbms_output.put_line('i= '||i);
    i := i+1;
  end loop;
end;
/

The output:

i= 1
i= 2
i= 3
i= 4

In PL/SQL using approach 2:

--pl10b.sql
declare
  i number := 1;
begin
  while (i<11) loop
     dbms_output.put_line('i= '||i);
     i := i+1;
     if mod (i,5) = 0 then
        exit;
     end if;
  end loop;
end;
/

In either approach, the output is the same. While the output is same, the approaches are different and might behave differently. In the first approach, the condition for breaking is checked immediately at the start of the loop. In the second approach, it's evaluated after the counter is incremented. So you have to be careful when coding for either approach. The change in logic might be subtle, but it is important and can introduce bugs in a program.

In R, there is no equivalent of the first version. The second version is what you would use in R. You already saw an example in the FOR loop. Let's see the same for the WHILE loop. Here is the R code. Execute it yourself on the R command line see the results.

# r10.txt
i <- 1
while (i<11) {
  print(i)
  if (i%%5 == 0) {
    break
  }
  i <- i+1
}
Repeat Until I Break

You might have seen another case where you you need to repeat the loop indefinitely until a break condition comes in. In those cases a WHILE loop with a condition that always evaluates to TRUE will help. Here is an example in PL/SQL:

WHILE (TRUE) LOOP
...
IF Condition THEN 
   BREAK;
END IF
END LOOP

In R, you can write it the same way:

while(TRUE) 
{
   if Condition 
   {
       break
   }
}

There is a simpler way in R: using the repeat clause. Here is an example:

# r11.txt
n <- as.integer(readline("Enter an integer> "))
i <- 1
repeat {
  if (i == n) 
  {
     cat("It took me ", i, " iterations to find your number\n")
     break
  }
  i <- i+1
}

Executing it, produces the following:

Enter an integer> 29
It took me 29 iterations to find your number
Next, Please

Another element of the loop is the next statement. This allows you to skip over a loop based on a condition. Let's take the same example we saw for the break statement. But let's say we don't need to count in multiples of 5. In other words, we count how many iterations we had to do to come to the number entered by the user; but we will not count iterations 5, 10, 15, and so on.

# r12.txt
n <- as.integer(readline("Enter an integer> "))
i <- 0
j <- 1
repeat {
  i <- i+1
  if (i %% 5 == 0)
  {
     next
  }
  if (i == n) 
  {
     cat("It took me ", j, " iterations to find your number\n")
     break
  }
  j <- j+1
}

Executing it produces this:

> source('r12.txt')
Enter an integer> 29
It took me 24 iterations to find your number
Let's Continue

Remember the PL/SQL continue statement? It is used inside a loop to instruct the program to jump to the end of the loop and continue with the rest of the loop iterations as usual. The syntax is equivalent to the next statement in R. Let's see a small example:

The PL/SQL code:

-- pl13.sql
declare
  mynum number := 3;
begin
  for i in 1..10 loop
    if mod (i,mynum) = 0 then
      dbms_output.put_line('multiple found as '||i);
      continue;
      dbms_output.put_line('we are continuing');
    end if;
    dbms_output.put_line ('No multiple found as '||i);
  end loop;
end;
/

Executing the code:

No multiple found as 1
No multiple found as 2
multiple found as 3
No multiple found as 4
No multiple found as 5
multiple found as 6
No multiple found as 7
No multiple found as 8
multiple found as 9
No multiple found as 10

The R code:

# r13.txt
mynum <- 3
for (i in 1:10)
{
   if (i%%mynum == 0)
   { 
      cat ("Multiple found as ", i, "\n")
      next
   }
   cat ("No multiple found as ", i, "\n")
}

Executing the R code:

> source('r13.txt')
No multiple found as 1
No multiple found as 2
Multiple found as 3
No multiple found as 4
No multiple found as 5
Multiple found as 6
No multiple found as 7
No multiple found as 8
Multiple found as 9
No multiple found as 10
Functions

As is the case in most languages, R provides repeatable code segments, similar to procedures and functions in PL/SQL. As you already know, in PL/SQL, a procedure does not return anything (although it can have an OUT parameter; but that's not a return, so it's not the same thing), and a function returns a single value. In R, the equivalents of both PL/SQL procedures and functions is called simply a function. A Python function may or may not return anything.

In this article, we will cover how to write functions and use them in your programs. As in Part 1 of this series, we will see how to do something in PL/SQL and then do the equivalent in R.

A function definition in PL/SQL has this general syntax format:

function FunctionName (
  Parameter1Name in DataType,
  Parameter2Name in DataType,
...
return ReturnDatatype
is
   localVariable1 datatype;
   localVariable2 datatype;
begin
   ... function code ...
    return ReturnVariable;
end;

A procedure definition in PL/SQL has this general syntax:

procedure ProcedureName (
 Parameter1Name in DataType,
 Parameter2Name in DataType,
...
) 
is
   localVariable1 datatype;
   localVariable2 datatype;
begin
... procedure code ...
end;

A function definition syntax is somewhat convoluted in my opinion, compared PL/SQL; but the rest of the body is pretty standard. R follows this simple syntax:

FunctionName  <- function (Parameter1Name,Parameter2Name, ...) {
        ... function code ...
  return ReturnVariable

}

Note some important properties of the R function definition compared to the PL/SQL equivalent:

  • The definition starts with the name of the function and the assignment operator.
  • The parameters, if any, come inside the parentheses.
  • Only the parameter names are listed. The data types are not mentioned, unlike with PL/SQL. Parameters are optional; a function doesn't need to have parameters.
  • There body of the function is indicated by { and }. Unlike PL/SQL, there is no BEGIN ... END construct.
  • There is no mention of what the function returns during the definition. If you notice there is no mention in the definition as to whether the function even returns anything.
  • The indentation after the { sign is merely for readability; it's not needed. This is the same style followed in R to mark IF ... THEN ... ELSE blocks or loops.
  • The function may optionally return something at the final line. The syntax is the same as in PL/SQL: return ReturnVariable.

Now that you've got the basic idea about the syntax vis-à-vis PL/SQL, let's start with a very simple procedure in PL/SQL that accepts a principal amount and interest rate, computes the interest amount and the new principal after the interest is added, and displays the new principal.

Here is how we do it in PL/SQL. Note that I deliberately chose to use the R naming convention, for example, pPrincipal, not a PL/SQL-style variable name such as p_principal.

PL/SQL code:

-- pl14.sql
declare procedure calcInt (
  pPrincipal number,
  pIntRate number
) is
  newPrincipal number;
begin
  newPrincipal := pPrincipal * (1+(pIntRate/100));
  dbms_output.put_line ('New Principal is '||newPrincipal);
end;

begin
calcInt(100,10);
end;
/

Here is the output:

New Principal is 110

R code:

# r14.txt

calcInt <- function (pPrincipal, pIntRate)
{
  newPrincipal <- pPrincipal * (1+(pIntRate/100))
  paste("New Principal is ",as.character(newPrincipal))
}

We save this as r1.txt and call it using the source() function you learned about in Part 1 of this series.

> source('r14.txt')
> calcInt(100,10)
[1] "New Principal is 110"

Default Value of Parameters

Sometimes you need to pass a default value to a parameter. This value is in effect if the user does not explicitly pass the parameter. Building on the previous procedure, suppose we want to make the parameter pIntRate optional, that is, make it a certain value (such as 5) when the user does not explicitly mention it. In PL/SQL, you mention the parameter this way:

ParameterName DataType := DefaultValue

In R, it's exactly the same, but because the assignment operator in R is the equals sign (=), not :=, that's what you need to use. Besides, remember, you don't mention the data type for parameters. Here is the general syntax:

ParameterName = DefaultValue

You can write the PL/SQL function this way (the changes are in bold):

--pl15.sql
declare
  procedure calcInt (
     pPrincipal number,
     pIntRate number := 5
) is
  newPrincipal number;
begin
  newPrincipal := pPrincipal *(1+(pIntRate/100));
  dbms_output.put_line('New Principal is '||newPrincipal);
end;

begin
-- don't mention the pIntRate parameter.
-- defaults to 5
calcInt(100);
end;
/

R code:

# r15.txt
calcInt <- function (pPrincipal, pIntRate = 5)
{
  newPrincipal <- pPrincipal * (1+(pIntRate/100))
  paste("New Principal is ",as.character(newPrincipal))
}

One important property of functions in R is that the default values can be variables as well. This is not possible in PL/SQL. For instance, in PL/SQL the following will be illegal:

-- pl16.sql
declare
  defIntRate number := 5;
procedure calcInt (
  pPrincipal number,
  pIntRate number := defIntRate;
) is
 
...

But it's perfectly valid in R. Let's see how:

# r16.txt
defIntRate <- 5
calcInt <- function (pPrincipal, pIntRate = defIntRate)
{
  newPrincipal <- pPrincipal * (1+(pIntRate/100))
  paste("New Principal is ",as.character(newPrincipal))
}

The variable defIntRate dynamically influences the operation of the function. If you change the value of this variable, the function changes as well. Consider this following example:

> calcInt(100)
[1] "New Principal is 105"

Now let's change the value of this variable to 10 and re-execute this function.

> defIntRate <- 10
> calcInt(100)
[1] "New Principal is 110"

The new value of the variable took effect in the function.

Positional Parameters

You already know that in PL/SQL, you do not have to provide parameter values in the order in which the parameters were defined in the procedure. You can pass values by specifying the parameter by name. For instance, if a procedure F1 assumes the parameters P1 and P2in that order—you can call the procedure this way with the parameter values Val1 and Val2, respectively:

F1 (Val1, Val2);

But you can also call them with explicit parameter name assignments:

F1 (P2 => Val2, P1 => Val1);

This explicit naming allows you to order the parameters any way you want when calling the procedure. It also allows you to skip some non-mandatory parameters. In R, the equivalent syntax is this:

F1 (P2=Val2, P1=Val1)

So, just the greater-than operator (=>) is changed to the equals sign (=). Let's see examples in both PL/SQL and R.

PL/SQL example:

--pl7.sql
declare
  procedure calcInt (
    pPrincipal number,
    pIntRate number := 5
) is
  newPrincipal number;
begin
  newPrincipal := pPrincipal *(1+(pIntRate/100));
  dbms_output.put_line('New Principal is '||newPrincipal);
end;

begin
  calcInt(pIntRate=>10, pPrincipal=>100);
end;
/

The output is this:

New Principal is 110

R example:

# r3.txt
calcInt <- function (pAccType = "Savings", pPrincipal, pIntRate = 5)
{
  vIntRate <- pIntRate
  if (pAccType == "Savings")
  {
    # eligible for bonus int rate
    vIntRate <- pIntRate + 5
  }
  newPrincipal <- pPrincipal * (1+(vIntRate/100))
  paste("New Principal is ",as.character(newPrincipal))
}

Executing the R code produces this:

> calcInt(pPrincipal=100)
[1] "New Principal is 110"
> calcInt(pPrincipal=100, pAccType = "Savings")
[1] "New Principal is 110"
> calcInt(pPrincipal=100, pAccType = "Checking")
[1] "New Principal is 105"

One of the useful cases in PL/SQL is to define a default value only when the value is not explicitly provided. Take for instance, when the user didn't specify anything for the interest rate, and you want the default values to be based on something else, for example, the account type. If the account type is Savings (the default), the interest rate should should be 10 percent; otherwise, it should be 5 percent. Here is how you will need to write the function:

-- pl18.sql
declare
  procedure calcInt (
    pPrincipal number,
    pIntRate number := null,
    pAccType varchar2 := 'Savings'
  ) is
    newPrincipal number;
    vIntRate number;
  begin
    if (pAccType = 'Savings') then
      if (pIntRate is null) then
         vIntRate := 10;
      else
         vIntRate := pIntRate;
      end if;
    else
      if (pIntRate is null) then
        vIntRate := 5;
      else
        vIntRate := pIntRate;
      end if;
    end if;
    newPrincipal := pPrincipal * (1+(vIntRate/100));
    dbms_output.put_line('New Principal is '|| newPrincipal);
  end;
begin
  calcInt(100);
  calcInt(100, pAccType => 'Checking');
end;
/

The equivalent of the following line:

pIntRate       number := null,

in R is this:

pIntRate = NULL

The PL/SQL equivalent of IS NULL in R is the function is.null(). Here is the complete R example (note the capitalization of "NULL"):

# r18.txt
calcInt <- function (pAccType = "Savings", pPrincipal, pIntRate = NULL)
{
  if (is.null(pIntRate)) 
  {
     vIntRate <- 5
  } else
  {
    vIntRate <- pIntRate
  }
  if (pAccType == "Savings")
  {
     # eligible for bonus int rate
     vIntRate <- pIntRate + 5
  }
  newPrincipal <- pPrincipal * (1+(vIntRate/100))
  paste("New Principal is ",as.character(newPrincipal))
}

Executing the R code produces this:

> source('r18.txt')
> calcInt(pPrincipal=100, pAccType = "Checking")
[1] "New Principal is 105"
> calcInt(pPrincipal=100, pAccType = "Checking", pIntRate = 10)
[1] "New Principal is 110"
> calcInt(pPrincipal=100, pAccType = "Checking", pIntRate = NULL)
[1] "New Principal is 105"

Returning Values

So far, we have talked about procedures in PL/SQL, which do not return anything. In contrast, functions in PL/SQL return a value. Here is a simple example of a function that returns the interest rate for the account type, which is the parameter passed to it:

--pl19
declare
  function getIntRate(
    pAccType in varchar2
  )
  return number
  is
    vRate number;
  begin
    case pAccType
      when 'Savings' then vRate := 10;
      when 'Checking' then vRate := 5;
      when 'MoneyMarket' then vRate := 15;
    end case;
    return vRate;
  end;
begin
  dbms_output.put_line('Int Rate = '||getIntRate('Savings'));
  dbms_output.put_line('Int Rate = '||getIntRate('Checking'));
  dbms_output.put_line('Int Rate = '||getIntRate('MoneyMarket'));
end;
/

Here is the output:

Int Rate = 10
Int Rate = 5
Int Rate = 15

The equivalent of the following code line:

return vRate;

in R, fortunately, is similar, but not exactly the same:

return (vRate)

Note the parentheses. Here is the R function:

# r19.txt
getIntRate <- function (pAccType)
{
  if (pAccType == "Savings")
  {
    vRate <- 10
  } else
  if (pAccType == "Checking")
  {
    vRate <- 5
  } else
  if (pAccType == "MoneyMarket")
  {
    vRate <- 15
  }
  return (vRate)
}

Executing the R code produces this:

> getIntRate("Savings")
[1] 10

You can try the other values:

> getIntRate("Checking")
> getIntRate("MoneyMarket")

Another way to write the same function logic is using the switch() function you learned about earlier in this article. It's the equivalent of the CASE statement in PL/SQL.

# r19a.txt
getIntRate <- function (pAccType)
{
  vRate <- switch (pAccType,"Savings"=10, "Checking"=5, "MoneyMarket"=15,0)
  return (vRate)
}

A very important concept of functions in R is that the return (...) value is implicit. By default, the function returns the value of the last assigned object, even if you don't actually have an explicit return statement. Let's see the example of a very simple function that takes a number and two variables are assigned inside the body. The function doesn't return anything.

# r20.txt

f1 <- function(inVal)
{
   v1 <- inVal * 2
   v2 <- v1 * 2
}

As you can see, the function returns nothing. Now let's call the function:

> f1(2)

There is no result, because the function doesn't return or print anything. Now let's capture the return value of the function in a variable v3:

> v3 <- f1(2)
> v3
[1] 8

How did the function return 8, when we didn't write a return statement? It's because the last assigned value was v2 and that was implicitly returned. By default, all functions will implicitly return the last value assigned. So does it mean that we need not write the return statement? Of course not. We need an explicit return in the function code for these two reasons:

  • To clearly state what we are returning
  • To stop the function from executing the rest of the code if I have what I need in order to return from the function.
Variable Scope

As you write multiple levels of code in R, such as subprograms calling other subprograms, you might face the prospect of encountering the same names being defined for variables inside these subprograms. In that case, which of the values assigned to the variables will be relevant? This is where you have to know how the scope of the variables, which is a very important concept to remember. Let's start with a simple example function that accepts an input value, stores it in a local variable called v1, and prints it.

#r21.txt
f1 <- function(inVal)
{
  v1 <- inVal
  cat ("v1=", v1, "\n")
}

Executing the code produces this:

> source("r21.txt")
> f1(1)
v1= 1

What will happen if you have another variable outside the function that has the same name of "v1"? Will the function use the value set inside the function or simply get the value from outside? Let's change the code and see the output. We first set the value of v1 to 10 outside the function and set it to 1 inside:

#r22.txt
v1 <- 10
f1 <- function(inVal)
{
  v1 <- inVal
  cat ("v1=", v1, "\n")
}

When we execute the code, what should we get? Let's see:

> source("r22.txt")
> f1(1)
v1= 1

The output is 1, which is the value assigned inside the function. The prior assigned value, 10, was not considered. This is the expected behavior in pretty much any language, including PL/SQL. So it's no surprise.

However, what happens if a variable called v1 is not even created inside the function, as shown below:

#r22a.txt
v1 <- 10
f1 <- function(inVal)
{
cat ("v1=", v1, "\n")
}

Note that the variable v1 is not defined inside the function, yet it is called in the cat() function. In PL/SQL, if you reference a variable not defined in the function, you will get a syntax error. Let's see what happens in R. Executing the R code results in this:

> source("r22a.txt")
> f1(1)
v1= 10

Whoa! What happened? We did not get a syntax error. Instead R pulled up the variable v1 defined outside the function. This is a very important property of R, which is very unlike PL/SQL. You should pay attention to this behavior, because it can cause many bugs if it is not understood properly. Let's recap. If a variable is referenced inside a function, R first looks to see if that variable is already defined inside the function. If so, the value is used. Otherwise, if that variable is not defined, R looks up to the immediate top level code to see if that variable is defined there. If it is found, that value is used. If is is not found there, the next upper level of code is checked, and so on.

Summary

Let's take a recap of what we explored in this article. Like the previous article, we will examine one element of PL/SQL and contrast it with the equivalent R element.

  PL/SQL R
Read user input It's actually done in SQL*Plus using PROMPT and ACCEPT. It's done using the readline() function, which returns a string. So, the input must be converted to whatever type you wish. For example, as.integer() converts the input to an integer.
Basic IF if (condition) then
    List of statements to be executed
end if;
    List of statements outside of IF block
if (condition)
{
    List of statements to be executed
}
    List of statements outside of IF block


Note the dissimilarities:
1. There is no end if.
2. The code for the IF block is inside curly braces, which is equivalent to if..end if.
3. The condition must be within parentheses.
ELSE if (condition) then
   List of statements to be executed
else
   List of statements
end if;
List of statements outside of IF block
if (condition)
   List of statements to be executed
} else
   List of the statements in else block
}


There is no end if and as with IF, the blocks of code are enclosed in curly braces. The else statement must be on the same line as the ending curly brace.
ELSIF ELSIF No equivalent. You should use else if (condition).
Do nothing where a legal, syntactically correct statement is needed NULL; Not needed. You can use {} to indicate a null.
FOR loop for i in Start...End
loop
    List of statements
end loop;
for (i in Start,End)
{
    List of statements
} Statement not inside the for loop


Note that the expression for the for loop is within parentheses, which is mandatory.
Skip counters in a loop, for example, skipping one number as, 1, 3, 5, and so on for i in Start...End
loop
    if mod(i,2) = 0 then
      Do something
    end if;
end loop;
for i in seq(Start, End, how much to skip)
     Do something
}
Count backwards in a loop, for example, 3, 2, 1, and so on for i in reverse Start...End
loop
    List of statements
end loop;
for (i in seq(Start, End, -1))
{
    List of statements
}

 
WHILE loop while (condition) loop
    List of statements;
end loop;

 
while (condition)
{
    List of statements
}
Statement not inside the while loop


Note that the condition must be within parentheses.
Break away from a loop EXIT WHEN (Some condition);

or

IF (Some condition) THEN EXIT;
Inside a loop:

if (Some condition) {
    break
}


Note the curly braces.
case statement case
    when Some condition then
      List of statements;
    when Some other condition then
      List of statements;
end case;
Use the switch() function.
Continue in a loop CONTINUE next
Repeat a loop No direct equivalent. Can be simulated using the WHILE(TRUE) LOOP ... END LOOP; construct. The following repeats the loop until an explicit break comes:

repeat {
    Statements
    if (condition) {
      break
    }
}
Procedures and functions function FunctionName (
    Parameter1Name in DataType,
    Parameter2Name in DataType,
    ...
)
return ReturnDatatype
is
    localVariable1 datatype;
    localVariable2 datatype; begin
    ... function code ...
    return ReturnVariable;
end;


A procedure definition in PL/SQL has this general syntax:

procedure ProcedureName(
    Parameter1Name in DataType,
    Parameter2Name in DataType,
    ...
)
is
    localVariable1 datatype;
    localVariable2 datatype;
begin
    ... procedure code ...
end;
A function definition in R follows this simple syntax:

FunctionName <- function (Parameter1Name, Parameter2Name, ...)
{
      ... function code ...
    return SomeValue
}


Unlike in PL/SQL, there is no difference between procedures and functions. Both types are called functions. A function may or may not return anything. If it does return something, you do not specify what data type it returns at the time of definition.

Function definitions start with the function name assignment operator followed by the keyword function compared to function or procedure in PL/SQL.

Like PL/SQL, the parameters shown above are optional; not all functions need parameters. The parameters do not have data types listed at definition. Therefore, you can pass any type of data at runtime.

Unlike PL/SQL, if you don't have any parameters, you still need to have the parentheses, for example, myFunc <- function () {

Unlike PL/SQL, there is no begin ... end block to designate the code for a function. The curly braces determine the code of the function.

Like PL/SQL, the return statement is the last statement of the function code to be executed.

Unlike PL/SQL, even if there is no return statement, the function returns the last assigned value of a variable.
Parameter default value ParameterName DataType := DefaultValue ParameterName = DefaultValue

Unlike PL/SQL, default values of parameters can be variables, for example:

defIntRate = 5
calcInt <- function (pPrincipal, pIntRate = defIntRate) {
Positional parameter specification in functions and procedures If a PL/SQL procedure named F1 has two parameters, P1 and P2 (in that order), you have to pass the parameter values at runtime in the same order, as shown below, where Val1 and Val2 are values for parameters P1 and P2, respectively:

However, you can also use positional parameters by specifying the parameter names and their values in any other order by using the => operator, as shown below:

F1 (P2 => Val2, P1 => Val1);
You can do the same thing in R, as follows:

f1 (p2 = val2, p1 = val1)
Explicitly set input values of parameters to null at design time so that in the code you can check if the parameter value was passed or not. pIntRate number := null, pIntRate = NULL

Note that "NULL" is uppercase.
Return values for a function return SomeValue; return (SomeValue)

The big difference is that you do not need to specify whether the function has to return something, and if it does, you do not need to specify the data type of the return value at design time.

Also, the return value has to be within parentheses.
Global variables The variables defined outside functions will be different from variables defined with the same name inside functions. Same behavior. However, two important caveats are particularly important for PL/SQL developers to remember:

1. Because there is no such thing as declaration of variables, variables come into existence when they are first referenced. If they are referenced inside a function first, they are local; otherwise, they are global. If you must be 100 percent clear about the scope, simply first assign a value such as "None" to the variable wherever you want the scope to be.

2. If a variable is not referenced at all in a function, but a variable of the same name exists in the program outside the function, the variable is valid inside the function as well.
 
Quiz

Let's test your understanding with these simple questions.

Questions

1. Consider the following code:

#q1.txt
f1 <- function (inVal)
{
  v1 <- inVal * 2
  cat ("Inside f1, v1=", v1, "\n")
}
f2 <- function (inVal)
{
  v1 <- inVal * 2
  cat ("Inside f2, v1=", v1, "\n")
}
f3 <- function (inVal)
{
  v1 <- inVal * 2
  cat ("Inside f3, v1=", v1, "\n")
}

f3(f2(f1(2)))

Here is the output:

> source ("q1.txt")
Inside f1, v1= 4
Inside f2, v1=
Inside f3, v1=

Why don't we see the values of v1 in the other functions?

2. Consider the following function:

# q2.txt

f1 <- function (inVal)
{
   v1 <- inVal * 2
}

Note that there is no return statement. So this code doesn't return anything. However, we still call it and assign the return value to another variable v2:

> v2 <- f1(2)
> v2
[1] 4

How come the function returned 4?

3. You are starting an R session from scratch. You gave the following command:

# q3.txt
if (x<y) {
  print('yes')
}

And the output was this:

Error: object 'x' not found

Why was the error produced? Isn't it true that R defines variables when they are referenced?

4. What will be result of the following code?

# q4.txt
v1 <- 10
f1 <- function (inVal)
{
   v1 <- 4
   2 * v1 * f2(inVal)
}
f2 <- function (inVal)
{
   inVal * v1
}
f1(2)

5. What will be the output of the following?

> v1 <- 2
> v2 <- switch(v1,100,200,300,400)
> v2

6. Along the same lines, here is modified code where you ask the user to input the value of v1 instead of hardcoding it.

# q6.txt

> v1 <- readline("Enter a number> ")
Enter a number> 2
> v2 <- switch(v1,100,200,300,400)

But it failed with the following message:

Error: duplicate 'switch' defaults: '100' and '200'

What happened? The only change you made was to accept the value; and now it's producing an error.

7. You are writing a statement to check the number input by the user is less than 100. Here is the code you wrote:

# q7.txt

> v1 <- as.integer(readline("Enter a number> "))
Enter a number> 5
> v2 <- switch((v1<100), "Yes, less than 100", "No, greater than 100")
> v2
[1] "Yes, less than 100"

It worked correctly. It reported that the number entered by the user (5) is less than 100. So, you re-executed the statement with a different input:

> v1 <- as.integer(readline("Enter a number> "))
Enter a number> 200
> v2 <- switch((v1<100), "Yes, less than 100", "No, greater than 100")
> v2
NULL

Note the output. It's NULL, not the desired output. Why?

8. What is the difference between break and next statements?

9. You have all the R commands in a file called, say, myscript.R. How can you call the script and not have to enter the commands one by one?

10. I am trying to write a simple function that merely prints the word "Hello." So the function doesn't accept any parameters. Here is how I started typing, but I got an error:

> printHello <- function
+ {
Error: unexpected '{' in:
"printHello <- function
{"

Why did I get the error? I don't have any parameter; so I can't pass it anyway.

Answers

1. Note the function definition. There are no return statements inside the function. So the v1 value was not populated. The correct syntax would have been this:

#q1a.txt
f1 <- function (inVal)
{
v1 <- inVal * 2
cat ("Inside f1, v1=", v1, "\n")
return (v1)
}
f2 <- function (inVal)
{
v1 <- inVal * 2
cat ("Inside f2, v1=", v1, "\n")
return (v1)
}
f3 <- function (inVal)
{
v1 <- inVal * 2
cat ("Inside f3, v1=", v1, "\n")
return (v1)
}

f3(f2(f1(2)))

Here is output now:

> source("q1a.txt")
Inside f1, v1= 4
Inside f2, v1= 8
Inside f3, v1= 16

2. Even though the function might not have an explicit return statement, the last assigned value is returned implicitly. Because v1 was assigned last, it was returned.

3. No; R creates variables when they are assigned, not when they are referenced. In this code, you simply referenced x and y without assigning any value to them. So they were not created. The following would have been valid code in which the values of x and y were assigned.

# q3a.txt
x <- 1
y <- 2
if (x<y) {
  print('yes')
}

4. It will be 160. Here is why. Inside the f1 code, you see a reference to f2; so R will go on to evaluate f2(2). Inside the f2 code, there is a reference to variable v1. But there is no variable v1 defined inside f2. So R will look up the variable v1 defined at the beginning of the code (that is, 10). So f2(2) will return 2 * 10 (that is, 20). Then control will pass to function f1. However, there is a variable named v1 inside it. So, that value (that is, 4) will be used. f1(2) will evaluate to 2 * 4 * f2(2). Because f2(2) returned 20, this expression will be 2 * 4 * 20 = 160.

5. It will be 200. When you pass an integer as the first argument to switch, it uses that to decide which position it will look up to get the value. In this case, you have passed 2; hence, it will look up to the second position, which has the value 200. Hence, the switch function will return 200.

6. The function readline() returns a value of character data type. So v1 is a character. The switch() function behaves differently when the first argument is a character instead of a number. So, the syntax for the rest of the arguments in the switch was wrong. The correct code is this:

# q6a.txt

> v1 <- as.integer(readline("Enter a number> "))
Enter a number> 2
> v2 <- switch(v1,100,200,300,400)
> v2
[1] 200

7. The switch function works on integers only, not on logical values. In the first case (5<100), the expression is TRUE, which becomes 1. Therefore, switch() picks up the first value on the list: "Yes, less than 100". However, in the second case, it was FALSE, which equates to 0, and there was nothing for the 0-th option. It's not possible.

So, how would you write that code to test if the input number is less than or greater than 100? One option is to use if ... else ,,, construct. But if you want to use switch(), you should use this:

# q7a.txt

v2 <- switch((v1<100)+1, "No, greater than 100", "Yes, less than 100")

Now the code will yield right results.

8. The break statement stops the loop and exits the loop completely. The code after the loop is executed. The next statement simply jumps control to the end of the loop; so control goes back to the beginning of the loop again.

9. There are two options:

  • Use Rscript. From the command line call this:
     
    C:\> rscript myscript.r
    
  • From the R command-line interpreter, call the script using this:
     
    > source("myscript.r")
    

In both options, we assume that the script file is in that directory.

10. Even if you don't need parameters (or arguments) to R functions, you still need to use parentheses:

printHello <- function()

 

About the Authors
Arup Nanda ( arup@proligence.com) has been an Oracle DBA since 1993, handling all aspects of database administration, from performance tuning to security and disaster recovery. He was Oracle Magazine's DBA of the Year in 2003 and received an Oracle Excellence Award for Technologist of the Year in 2012.
Join the Database Community Conversation