Difference between revisions of "BASH scripting"
Latest revision as of 15:19, 15 July 2020
Bourne-Again Shell (Bash)
Bash is an acronym for "Bourne-Again Shell", the name of a code interpreter and a high-level programming language, and it is a must-know tool in Computational Chemistry and Biology.
You can use Bash scripting on Unix/Linux computers through a terminal.
When you start the shell, i.e., the interpreter, as a login shell, it reads the first initialization file it finds among ~/.bash_profile, ~/.bash_login, and ~/.profile (where ~/ points to your home directory), but we do not recommend changing these files unless you really know what you are doing.
In most cases, you can change the ~/.bashrc file instead, which allows the user to customize the system according to their needs.
A bash script is a text file containing a series of instructions written in the bash language. You can create one by typing the following command in the terminal:
touch my_first_script.sh
which creates an empty file in which you can write the instructions to be executed by the shell. You can use the Vi text editor to write your code; just remember to add the following line at the beginning of the file:
#!/bin/bash
This line tells the system to run the script with the bash interpreter. You can also run your script by telling the interpreter directly:
bash my_first_script.sh
or you can change the permissions of the file to make it an executable by typing:
chmod +x my_first_script.sh
and then running:
./my_first_script.sh
Suppose your my_first_script.sh contains the following lines:
#!/bin/bash
number=6
for ((i=0;i<${number};i++))
do
    echo "Hello world ${i}"
done
If you run ./my_first_script.sh, the output will be:
Hello world 0
Hello world 1
Hello world 2
Hello world 3
Hello world 4
Hello world 5
For more on commands, see Unix.
Environment Variables
Declaring and accessing variables
Bash allows the user to assign values to variables in the command line, but it is more common to set variables inside your scripts or your ~/.bashrc file. In Bash, you define a variable using the following syntax:
my_variable=value
Do not leave spaces around the = sign between the variable name and its value. You can check the value of a variable by typing the following command in your terminal:
echo $my_variable
The shell will show the following result on your screen:
value
Remember to tell the shell that you want the value of my_variable by using the $ sign; otherwise, you will be telling the shell to print the string my_variable on the screen.
In the previous section, the variable number contained the value 6, and the variable i was an iteration counter that was used inside the for loop.
It is good practice to enclose the variable name in curly brackets, as in ${my_variable}, to avoid ambiguities inside the code.
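To illustrate the kind of ambiguity that curly brackets avoid, here is a minimal sketch. The variable name system and the file suffix are hypothetical and only serve the example.

```shell
# "system" is a hypothetical variable used only for illustration.
system=1abc

# Without braces, the shell looks for a variable called "system_min",
# which is not defined, so it expands to an empty string.
without_braces="$system_min.pdb"

# With braces, the variable name ends at the closing bracket.
with_braces="${system}_min.pdb"

echo "without braces: ${without_braces}"
echo "with braces:    ${with_braces}"
```

The underscore is a valid character in variable names, which is why the shell reads past the intended name in the first case.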
Paths and Environment variables
Every file has a path, i.e., a location within the file system. Paths can be specified in an absolute way -- with respect to the whole file system -- or in a relative way -- with respect to the working directory.
HOME and PATH are variables containing path information, i.e., the "addresses" within the directory tree that allow the user to find their files and executables.
Variables such as HOME or PATH are inherited from the environment.
They have short and easily memorable names that characterize important environment features.
HOME, for instance, is the variable that identifies your home directory (e.g., /gpfs/home/your_username on Seawulf).
You can change HOME to whichever path you find more suitable, which changes where ~ points, but it is advisable not to do it if you have scripts and programs that depend on a pre-existing value for the variable.
PATH is the variable that determines in which directories the shell looks for the programs that the user might use.
If you use the Python programming language, you might need to set a PYTHONPATH variable to be able to import your own Python modules and classes.
Similarly, if you use the Amber software, all Amber-related programs can be found in the path defined by AMBERHOME.
We, DOCK6 developers and users, define DOCKHOME as the path to the most stable release or to whichever version we see fit.
If you type the name of a program as a command, the shell searches for it in the directories listed in the PATH variable, which are separated by colons.
PATH also specifies the order in which these directories are searched: the shell runs the first match it finds.
You can add more directories to the variable by typing:
export PATH=/new/path/to/directory:${PATH}
In the command above, you prepended the path /new/path/to/directory to the PATH variable, so that directory is searched first.
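The effect of putting a directory at the beginning of PATH can be checked with a short sketch. The directory and the program name hello_dock are hypothetical and created only for the demonstration; command -v is the standard way to ask the shell which file it would run for a given name.

```shell
# Hypothetical directory and program names, created here only for the demo.
demo_dir=$(mktemp -d)

# Create a tiny executable called "hello_dock" inside the new directory.
printf '#!/bin/sh\necho "hello from demo_dir"\n' > "${demo_dir}/hello_dock"
chmod +x "${demo_dir}/hello_dock"

# Put the directory first so it is searched before everything else in PATH.
export PATH="${demo_dir}:${PATH}"

# The first entry of PATH is now the new directory.
first_entry=${PATH%%:*}
echo "first entry: ${first_entry}"

# command -v reports which file the shell would run for a given name.
found=$(command -v hello_dock)
echo "hello_dock resolves to: ${found}"
```

Because the change is made with export in the current shell, it lasts only for that session; adding the same line to ~/.bashrc would apply it to new interactive shells as well.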
Basic commands
Most basic commands can be found at Unix. Here are some tricks of the trade:
Iterations
If you have an iterative task to run, you can use a for loop. The simplest kind of for loop was already shown in the Bourne-Again Shell (Bash) section.
Suppose, however, that each line of the file systems.txt contains the name of a system that you need to work on.
The most straightforward way of looping over these names in bash is shown here; note that it splits the file on any whitespace, so it only works as intended if the names contain no spaces:
for line in $(cat systems.txt)
do
    ## Commands
    echo ${line}
done
while loops can be used in a similar way, reading the file one line at a time:
while IFS= read -r line
do
    ## Commands
    echo "${line}"
done < systems.txt
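Every character after a # is a comment and will not be read by the interpreter.
The exceptions to this rule are the #! on the first line, which defines the shell, and the directive lines read by cluster management/job scheduling systems (see SLURM).
IFS stands for internal field separator and it is used by the shell for word splitting. Setting it to an empty value for read keeps leading and trailing whitespace in each line, and the -r option keeps backslashes as they are.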
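The difference between the two loops shows up as soon as a line contains a space. The sketch below builds a small systems.txt in a temporary directory; the system names are made up for the demonstration.

```shell
# Build a small example file; the system names are made up for the demo.
workdir=$(mktemp -d)
printf '%s\n' "1abc" "2xyz chain A" "3def" > "${workdir}/systems.txt"

# for + $(cat ...) splits on any whitespace, so "2xyz chain A" becomes
# three separate items and the loop body runs five times.
for_count=0
for item in $(cat "${workdir}/systems.txt")
do
    for_count=$((for_count + 1))
done

# while + read consumes one whole line per iteration, so the body runs
# exactly three times, once per line of the file.
while_count=0
while IFS= read -r line
do
    while_count=$((while_count + 1))
    echo "line ${while_count}: ${line}"
done < "${workdir}/systems.txt"

echo "for loop iterations:   ${for_count}"
echo "while loop iterations: ${while_count}"
```

For files whose entries never contain whitespace, both loops behave the same; the while form is the safer default when you are not sure.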
Conditionals
Conditionals are written as if...then or if...then...else statements. You can nest if and else clauses as many times as necessary, but try to keep the decision-making process simple. Nested if clauses can be great sources of headaches.
if conditional_expression
then
    ##commands
elif another_conditional_expression
then
    ##other commands
else
    ##more commands
fi
Bash does not care about code indentation, but it is useful to indent your code to increase its readability.
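As a concrete instance of the template above, here is a minimal sketch that checks a file before using it. The variable names input_file and status are hypothetical; the tests [ -s FILE ] and [ -e FILE ] are standard shell file tests.

```shell
# "input_file" is a hypothetical name; point it at any file you like.
workdir=$(mktemp -d)
input_file="${workdir}/systems.txt"
printf '%s\n' "1abc" "2xyz" > "${input_file}"

# [ -s FILE ] is true when the file exists and is not empty;
# [ -e FILE ] is true when the file exists at all.
if [ -s "${input_file}" ]
then
    status="ready"
elif [ -e "${input_file}" ]
then
    status="empty"
else
    status="missing"
fi

echo "${input_file} is ${status}"
```

Each condition is an ordinary command: the branch is taken when the command exits with status 0.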
Creating files during runtime
You will frequently run bash scripts to prepare your files before a simulation. This is easily done with a here-document:
cat <<EOF > filename
text
EOF
EOF means end of file; it is only a delimiter, and any other word works as long as the opening and closing delimiters match. The sequence above means that text will be written in filename.
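A point worth knowing when generating input files this way is whether variables inside the text are expanded. The sketch below contrasts an unquoted and a quoted delimiter; the file names, the variable system, and the keyword input_name are all hypothetical and only illustrate the mechanism.

```shell
# Hypothetical file names and variables, used only for illustration.
workdir=$(mktemp -d)
system=1abc

# Unquoted delimiter: ${system} is expanded before the text is written.
cat <<EOF > "${workdir}/expanded.in"
input_name ${system}.pdb
EOF

# Quoted delimiter: the text is written exactly as typed.
cat <<'EOF' > "${workdir}/literal.in"
input_name ${system}.pdb
EOF

cat "${workdir}/expanded.in" "${workdir}/literal.in"
```

The unquoted form is usually what you want when a loop fills in a different system name on each iteration; the quoted form is useful when the generated file is itself a script that should keep its own variables.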