Naming and Scope in Programs
This is an automatically translated post by LLM. The original post is in Chinese. If you find any translation errors, please leave a comment to help me improve the translation. Thanks!
Recently, I have been reading "The Essence of Code", which explains the concepts behind many programming languages. This note mainly refers to Chapter 7 of the book: Names and Scopes.
Naming
When a computer executes a program, it works with memory addresses and has no need for the names of variables. Early programming therefore had no concept of naming. Names for variables and functions were introduced later to help programmers remember what each one means and what type it holds. For example, compared to "Longitude: 108.654519, Latitude: 34.255314", "China Western Science and Technology Innovation Port" is easier to understand and remember. Similarly, compared to "8.7.198.46", the description "a non-existent website" is more memorable.
So how does a computer associate names with variables and functions? The answer is by using a lookup table. This is similar to finding a book in a library by searching the database, converting the book title into a classification number, and then using the classification number to locate the desired book. Taking the library of the Innovation Port as an example:
"The Essence of Code" => TP312/159,
"The First Volume of Ancient and Modern Mathematical Thoughts" => O11/510-1/C.1
With such a lookup table, when the computer processes the command "find 'The Essence of Code'", it can be translated into "find the book with the number TP312/159" for execution.
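The library analogy can be sketched with a Python dictionary. This is only an illustration: the names `catalog` and `find` are made up here, and a real interpreter or compiler keeps its tables in a more elaborate form.

```python
# Hypothetical catalog: book title -> call number (entries taken from the table above).
catalog = {
    "The Essence of Code": "TP312/159",
    "The First Volume of Ancient and Modern Mathematical Thoughts": "O11/510-1/C.1",
}

def find(title):
    """Translate a title into the call number used to locate the book."""
    return catalog[title]

call_number = find("The Essence of Code")
print(call_number)
```

The caller only ever uses the title; the table does the translation into something that can actually be located.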
Taking Python as an example:
x = 'python'
In the above program, the computer establishes the following lookup table:
graph LR
x[x]-->a[2256008235312]
y[y]-->b[2256008254512]
z[z]-->a
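In the above lookup table, the variable x and the variable z point to the same memory address. If the object stored there could be modified in place, a change made through z would also be visible through x. Strings in Python are immutable, however, so the shared object can never change: assigning a new value to z only makes z point to a different address and leaves x untouched.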
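A small sketch of this sharing follows. The assignments to `y` and `z` are assumptions made for the example (the listing above only shows the first line), and the list `a`/`b` is added to show what happens with a mutable object.

```python
x = 'python'
y = 'java'      # assumed value, not taken from the original listing
z = x           # assumed: z is bound to the same object as x

same_object = x is z                 # both names map to one object
addresses_match = id(x) == id(z)     # id() exposes the identity behind the name

z = 'ruby'                           # rebinding z does not touch x
x_after_rebind = x

# With a mutable object, a change made through one name is visible through the other.
a = [1, 2]
b = a
b.append(3)

print(same_object, addresses_match, x_after_rebind, a)
```

The actual numbers returned by `id()` differ from run to run, so they will not match the addresses in the diagram.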
Naming Conflicts
In early program design, a single lookup table was shared by the entire program. This can lead to problems. Consider C++ code in which a for loop counts with the variable i and calls the function print on each pass. This loop should end after running 10 times. However, what if the variable i is also modified in the function print:
void print(){
Because the variable i is used both inside and outside the function, the for loop does not end after the expected 10 iterations. Instead, it loops indefinitely.
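The same conflict can be reproduced in Python as a rough sketch. The function names `show` and `run` and the `max_steps` safety cap are assumptions added so that the demonstration terminates; they are not part of the original C++ code.

```python
i = 0  # one shared name, as in a program with a single lookup table

def show():
    """Stand-in for the print function in the text; it reuses the shared i."""
    global i
    for i in range(3):   # this loop also counts with the shared i
        pass             # leaves i == 2 when it finishes

def run(max_steps=50):
    """Try to count to 10 with the shared i; max_steps is a cap for the demo."""
    global i
    steps = 0
    i = 0
    while i < 10 and steps < max_steps:
        show()           # silently resets i to 2
        i += 1           # i becomes 3 and never reaches 10
        steps += 1
    return steps

steps_taken = run()
print(steps_taken)       # stops only because of the cap, not because i reached 10
```

Without the cap the outer loop would never finish, which is the behavior described above.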
How can such naming conflicts be avoided? One way is to use longer names that state what the variable is used for, such as i_in_print and i_in_main, or to agree on a variable naming policy for collaborative development.
Another way is to introduce scopes. The following text will mainly focus on scopes.
Scopes
Scope refers to the range in which a name is valid. To prevent operations on i in the print function from affecting the code outside it, the value of i can be saved before the function executes and restored after it finishes. This mechanism is called dynamic scoping.
Dynamic Scoping
Taking the Perl language as an example, the variable value is saved at the entrance of the function and written back at the end of the function.
sub shori{
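The save-and-restore idea can be imitated in Python by hand. This is a sketch of the mechanism only, not a translation of the Perl code; the value assigned inside `shori` is an assumption.

```python
x = "global"

def shori():
    """Emulate save-on-entry / restore-on-exit for the global name x."""
    global x
    saved = x            # save the caller's value at function entry
    try:
        x = "shori"      # inside the function, x can be used freely
        return x
    finally:
        x = saved        # write the old value back at function exit

inside = shori()
print(inside, x)         # the change made inside shori is not visible afterwards
```

The `try`/`finally` pair is a design choice for the sketch: it guarantees the old value is written back even if the function body raises an exception.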
However, this approach has a problem. When a function calls another function, the caller's modification of a variable affects the called function. Take the following two functions as an example:
$x = "global"
When the function yobu is called, the variable x is changed to yobu. If yobu calls the yobarebu function before it finishes and writes the old value back, the value of x that yobarebu uses is yobu. This shows that the modification of the variable in the yobu function affects the execution of the function it calls.
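The leak into the called function can be shown with the same hand-written save-and-restore emulation. The function bodies are assumptions; only the names yobu and yobarebu come from the text.

```python
x = "global"

def yobarebu():
    """The called function: it reads whatever x currently is."""
    return x

def yobu():
    """The calling function: saves x, changes it, calls yobarebu, restores x."""
    global x
    saved = x
    try:
        x = "yobu"
        return yobarebu()   # runs before x has been restored
    finally:
        x = saved

seen_by_callee = yobu()
print(seen_by_callee, x)
```

Called on its own, `yobarebu()` sees the global value; called from `yobu`, it sees the caller's temporary value.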
The reason is that in dynamic scoping there is a global lookup table, and each time a function is entered a new dynamic lookup table is created on top of it. These tables are not private to the function that created them: every function called while a table exists can see and use it. This is what produces the behavior above.
In summary, in dynamic scoping the address of a variable is found by searching the lookup tables in order from the nearest (most recently created) to the farthest. This is why functions influence each other when one calls another. To solve this problem, each function can be given its own lookup table for its variables. This is called static scoping.
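One way to picture the nearest-to-farthest search is a chain of tables. The sketch below uses `collections.ChainMap` from the Python standard library as a stand-in for that chain; passing the environment `env` explicitly is an assumption of the sketch and not how Perl implements it.

```python
from collections import ChainMap

env = ChainMap({"x": "global"})      # the global lookup table

def yobarebu(env):
    return env["x"]                  # searches the nearest table first, then outward

def yobu(env):
    inner = env.new_child({"x": "yobu"})   # table created on entering yobu
    return yobarebu(inner)                 # the callee searches the caller's table too

seen_by_callee = yobu(env)
seen_at_top_level = yobarebu(env)
print(seen_by_callee, seen_at_top_level)
```

Because the callee receives the whole chain, it finds the caller's entry before the global one.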
Static Scoping
Static scoping uses a separate variable table for each function. When the computer searches for a variable, it first searches the variable table of the current function, and if the name is not found there, it searches the global variable table. This removes the mutual influence that dynamic scoping causes when one function calls another.
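Python itself is statically scoped, so the yobu/yobarebu example behaves differently when written with ordinary local variables. The function bodies below are assumptions made for the illustration.

```python
x = "global"

def yobarebu():
    return x          # not in yobarebu's own table, so the global table is used

def yobu():
    x = "yobu"        # local to yobu; stored in yobu's own table
    return yobarebu()

result = yobu()
print(result)
```

The assignment inside `yobu` is invisible to `yobarebu`, because each function consults only its own table and then the global one.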
However, static scoping also has some problems. One is the problem of nested functions. Taking Python as an example:
x = "global"
In this nested function, at first glance, the output of x in the bar function should be foo. However, in early versions of Python (before 2.1), when the bar function searched for the variable x, it first searched the local lookup table, and if the name was not found there, it went directly to the global lookup table. Therefore, the output was global. This design caused a lot of confusion. The problem was solved when nested scopes were introduced in Python 2.1 (PEP 227) and made the default in Python 2.2.
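In current Python the enclosing function's table is searched before the global one. A minimal sketch, with the function bodies assumed for the illustration:

```python
x = "global"

def foo():
    x = "foo"
    def bar():
        return x      # not local to bar, so foo's table is searched next
    return bar()

result = foo()
print(result)
```

Here `bar` finds x in the table of the function that encloses it, which matches the intuitive reading of the code.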
The second problem is rebinding a name in an outer scope. When we assign to a name from an outer scope inside a function, a new local variable is created instead, so the value in the outer scope cannot be modified. For example:
x = "global"
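A runnable sketch of this behavior follows; the nested functions `foo` and `bar` and the values they assign are assumptions for the illustration.

```python
x = "global"

def foo():
    x = "foo"
    def bar():
        x = "bar"     # creates a new local x inside bar
        return x
    inner = bar()
    return inner, x   # foo's own x is unchanged

result = foo()
print(result)
```

The assignment in `bar` never reaches the x of `foo`, and neither assignment reaches the global x.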
In Python 3.0, the nonlocal keyword was introduced to solve this problem:
x = "global"
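A minimal sketch of `nonlocal` in use, with `foo` and `bar` again assumed for the illustration:

```python
x = "global"

def foo():
    x = "foo"
    def bar():
        nonlocal x    # assignments now target the x of the enclosing function
        x = "bar"
    bar()
    return x

result = foo()
print(result, x)
```

`nonlocal` reaches only the nearest enclosing function scope; the global x stays as it was, and modifying that one would require the `global` keyword instead.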
Summary
The development of computer technology has brought more powerful computing capabilities, and with them increasingly complex programs. The history of naming and scoping reflects the tension between what humans need and what computers need when variables and functions are named in large programs: people need names they can remember, while the machine needs only addresses. No feature of a programming language is created out of thin air; each one appears in order to solve a particular problem.