Sunday 22 February 2015

5.0 Names, Binding, Type Checking and Scopes

1.0 Names (identifiers)
  • A name is a string of characters to identify some entity in a program 
  • Names are a fundamental attribute for variables 
  • Two design issues for names are:
    •  Are they case sensitive
    • Are the special words of the language reserved words or keywords
1.1 Name Forms
  • Names can have character length restrictions in languages (31 characters in FORTRAN)
  • Can consist of letters, digits and Underscore, underscore now less popular with camel case 
1.2 Special Words 
    Reserved Keyword =  A reserved word is a special word of a programming language that cannot be used as a name

    i.e in C#    for, if , new, this


    Contextual Keyword - A keyword that is special only in certain contexts, but it is not a reserved word in C#. Some contextual keywords, such as partial and where, have special meanings in two or more contexts.

    i.e in C#     -  from, select. var , add, remove, value


    There is one potential problem with reserved words; If the language includes a large number of reserved words, the user has the difficulty making up names that are not reserved. the best example is COBOL that has  300 reserved words compared to C# which has 79.


    2.0 Variables


    A program variable is an abstraction of a computer memory cell or collection of cells.

    In some cases, the characteristics of these abstractions are very close to the characteristics of the memory cells; an example is an integer variable, which is usually represented directly in one of more bytes of memory.
    In other cases, the abstractions are far removed for the organization of hardware memory, as with a three-dimensional array, which requires a software were mapping function to support abstraction.


    A variable can be characterized as a sextuple of attributes: (name, address, value, type, lifetime, scope).

    2.1 Name
    Refer to Section 1.0


    2.2 Address

    The address of a variable is the machine memory address in which it is associated.

    In many languages, it is possible for the same variable to be associated with different addresses at different times in the program. For example if a subprogram has a local variable that is allocated from the run time stack when the subprogram is called, different calls may result in that variable having different addresses.

    Aliases - Multiple variables can point to the same memory address. When more than one variable can be used to access the same memory location they are called aliases. This hinders readability because it allows a variable to have its value changed by an assignment to a different variable.
    references variables are aliases , Aliasing can be created  in many languages through subprogram parameters.


    2.3 Type
    The type of a variable determines the range of values the variable can store and the set of operations that are defined for values of the type.  For example, the int type in Java specified a values range of -2147483648 to 2147483647 and arithmetic operations of adding, subtraction, multiplication, division and modulus

    2.4 Value
    The value of a variable is the contents of the memory cell or cells associated with the variable.

    It is convenient to think of computer memory in terms of abstract cells.
    Physical cells (only 8 bits in length)  are far to small for most program variables.
    We define an abstract cell to have the size required by the variable with which it is associated
    i.e Even though a floating point number could occupy 4 physical cells, we think of it as occupying one abstract memory cell.


    3.0 Binding

    Binding is an associationsuch as
    • between an attribute and an entity,
    • between an operation and a symbol

    Binding time - The time that the binding takes place is called the binding time.Binding can take place at
    • Language design time
      • I.e The Astrix symbol (*) is usually bound to the multiplication operation of a language at language design time
    • Language implementation time
      • I.e A data-type such as int in c, is bound to a range of possible values at
        Language implementation time
    • Compile time
      • I.e A variable in a java program may be bound to its a data-type at compile time
    • Load time
      • I.e A variable may be loaded to a storage cell when the program is loaded into memory, This binding in some cases does not happen until run-time
    • Link time
      • I.e A call to a library subprogram is bound to the subprogram code at link time.
    • Run time
      • I.e A variable may be loaded to a storage cell at run time


    Static Binding - A binding is static if it first occurs before run time and remains unchanged throughout the program execution. When you define the type in declaration  statement.

    Advantage - Compiler can pick up errors sooner


    Dynamic Binding (Javascript, PHP) -
    If the binding first occurs during run time or can change in the course of program execution.The type of a variable is not specified by a declaration statement, nor can it be determined by the spelling of its name.
    When the assignment statement is executed , the variable being assigned is bound to the type of the value of  the expression on the right hand side of the assignment.
    i.e  in JavaScript, below would dynamically change the type of the variable names 'list'
    list = [1.23, 123];
    list = 54


    4.0 Storage, Binding and Lifetime

    Storage Bindings - binding a variable to be stored in a memory cell

    Allocation -  is the process of binding a variable to a memory cell from a pool of available memory.
    De-allocation  - is the process of placing a memory cell that has been unbound from a variable back into the pool of available memory.

    Lifetime - The lifetime of a variable is the time during which the variable is bound to a specific memory locations. The lifetime of a variable begins when is bound to a specific cell and ends when it is unbound from that cell.

    Heap -  The heap is a collection of storage cells whose organization is highly disorganized because of the unpredictability of its use.

    Run time Stack - The run time stack is data structure that stores information about the active sub-programs of a computer program. the last called is first to complete, hence the word common naming 'the stack'. it de-allocates the memory of the subprogram when then sub-program has completed its job.

    - A variable will be stored on the run-time stack or the heap depending on whether the lifetime of the storage cannot be determined ahead of time.

    4.1 Static Variables
    - Bound to memory cells before program execution begins and remain  bound to those same memory cells until program execution terminates.

    - global access variables are static variables

    4.2 Stack-Dynamic Variables
     -  Stack-Dynamic Variables are those whose storage bindings are created when their declaration statements are elaborated, but whose types are statically bound.

    - for example, the variable declarations that appear at the beginning of a java  method are elaborated when the method is called and the variable defined by those declarations are de-allocated when the method completes its execution.

    4.3 Explicit Heap-Dynamic Variables
    - Explicit heap-dynamic variables are nameless (abstract) memory cells that are allocated and de-allocated by explicit run time instructions specified by the programmers.

    -These variables, which are allocated and de-allocated to the heap, can only be reference through pointer and reference variables.



    4.4 Implicit Heap-Dynamic Variables
    - Implicit Heap-Dynamic Variables are bound to heap storage only when they are assign values. in fact all their attributes are bound every time they are assign.


    i.e in this an example in java-script   when a variable dynamically goes from a string to an numeric array,

    highs= "yolo"

    highs= [34,56,98,21,79]


    5.0 Type Checking

    Type checking is the activity of ensuring that the  operands of an operator are of compatable types.

    Coercion  - A compatible type is one that either is legal for the operator or is allowed under language to the implicitly converted by compiler generated code to a legal type. This is called coecion. Ie. Int + float , the int is coerced to a float.

    Type Error - A type error is the application of an operator to an operand of an inappropriate type.

    Static type checking - run at compile time (C#)
    Dynamic type checking - run at run time (Javascript)


    6.0 Strong Typing

    • A language is strong typed if type errors are always detected.
    • This requires that the types of all operands can be determined. Either at compile time or run time. 
    • The importance of strong typing lies in its ability to detect all uses of variables that result in type errors. 
    • A strong typed language also allows the detection, at run time, of uses of the incorrect type values in variables that can store values off more then on type.

    7.0 Type Equivalence (compatibility)

    -The compatibility rules dictate the types of operands that are acceptable for each of the operators and there by specify the possible type errors of the language.

    The rules are called compatibility because in some cases the type of an operand can be implicitly converted by the compile or run-time system, to make it acceptable to the operator.

    -Compatibility rules help with coercion.

    i.e an int is type compatible  with a float amongst a addition operator.    int (operand - this will be coerced to a float)          + (operator)        float(operand)


    8.0 Scope

    - The scope of a variable is the range of statements in which the variable is visible. A variable is visible in a statement if it can referenced in that statement.

    Local - A variable is local in a program untir or block if it is delcared there.
    Non Local - A variable in non local of a program unit or block when the variable within the program unit or block, but are not declare there.


    8.1 Static scoping
    Static scoping is so named because the scope of a variable can be statically determined, that is, prior to execution. this permits a program reader to determine the type of every variable in a program.


    9 Scope and life time
    Scope and lifetime are unrelated -  even though it may seem so ..  i.e below The scope of the variable "sum" is completely contained within the compute function. it does not extend to the body of the function print header, although the print header function executes in the midst of execution of compute. Howe-ever  the life time of "sum" extends over the time during which print header method executes. What ever storage location sum is bound to before the call to printheader, that binding will continue during and after the execution of printheader.


    void compute()
    {
       int sum;
        ...
       printHeader();
    }

    10 Referencing Environments
    The referencing environment of a statement is the collection of all variables that are visible in the statement. The referencing environment of a statement in a static scope language is the variable declared in its local scope plus the collection of all variables of  its ancestor scopes that are visible.

    11. Named Constants
    - A named constant is a viarable that is bound to a value once only;
    - Name constant are usually to aid readability and program reliability,
    - i.e const pi =  3.14159,   would help readability instead of just using the number

    C# has two kinds of names constants, those defined with

    Const -  are implicitly static, static bound to value cant be changed.
    Read-only - Dynamically bound to values. value can define when the contain object is created.



    Saturday 21 February 2015

    3.0, Semantics and semantics

    Semantics is the meaning of expressions, statements, and program units. (i,e this is an if statement)


    Syntax is the form of its expressions, statements, and program units.  (ie. the if statement syntax is different in C# and Vb.net)

    If(<boolean_expre> ) { Statement ; }

    If (<boolean_expre> ) Then Statement End If

    ------------


    index = 2 * count = 12


    Lexeme - Small units that don't offer a description (A basic unit of meaning) (programming units)
    Token = A category of lexemes

    Lexemes       Token
    Index             indentfier

    =                    eqaul_sign
    2                    int_literal
    *                    mult_op
    Count            identifier+                   plus_op
    17                 int_literal
    ;                    semicolon or end_of_statement
       
    ---------

    Backus naur form (BNF) is a metalanguage (language to describe a language) for programming languages

    ie. a BNF description of a if and  a if else statement would be 

    <if_stmt> -> if <logic_expr> then <stmt>
                     |  
    if <logic_expr> then <stmt>else <stmt>


    The
    <if_stmt> symbol on the left hand side (LHS) of the  -> arrow, is the abstraction being defined.

    The text to the right of the arrow (RHS) is the definition of the LHS. it consists of of some mixture of tokens lexemes and references to other abstractions.

    Multiple definitions can be written as a single rule, with different definition separated by the    |     symbol


    When describing lists for example a variable list of identifiers
    It is common to use recursion
    A rule is recursive if the LHS appears in its RHS

    <indent_list> => identifier
                              | identifier, <indent_list>


    --------

    Operator Precedence

    A = B + C * D 

    When an expression includes two different operations for example in the above * and +,
     one obvious semantic issue is the order of evaluation of the two operators (is it add then multiply or is it visa versa in the above express. This semantic question can be answer by assignment different precedence levels to operators. If * is higher in precedence, then it will be the one used first.

    Associativity of Operators

    When an expression includes two operators that have the same precedence (as * and / usually do), a semantic rule is required to specify which should have precedence. The rule is named associativity.


    A = A / B * C



    Usually the correct order if left associative (the direction we read) so for this example in we would do the division first.


    But it can also in times be right associative, to indicate right associativity right recursion can be used. (LSH right of the RHS) ie.

    <factor> => <exp> ** <factor>



    Parse Trees

     Parsing is the problem of transforming a linear sequence of characters into a syntax tree
     
    Read from left most bottom most



    Grammar example1 IPL