Saturday, January 31, 2009

Learning Unix Scripting

==Understanding Unix Environment==

{This is still work in progress..}

===Unix File System===

* Hierarchical File System

The Unix File System is a hierarchical file system.
* / is root of the system
* /var is a directory in root
* /var/www is the full path of the sub-directory in "var"
* /var/www/htdocs is the full path of the sub-sub-directory in "/var/www" and so on...
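
The hierarchy can be explored safely in a scratch directory. The sketch below is only an illustration: it assumes a POSIX shell with mktemp available, the names root and index.html are made up, and it uses the pwd, cd and ls commands described in the steps that follow. Nothing under the real /var is touched.

```shell
# Scratch copy of the example tree (hypothetical location under a temp dir).
root=$(mktemp -d)
mkdir -p "$root/var/www/htdocs"
touch "$root/var/www/htdocs/index.html"

cd "$root/var"            # a directory directly under the (scratch) root
cd www/htdocs             # relative path: descend two more levels
here=$(pwd)               # full path of where we are now
echo "now in: ${here#"$root"}"
ls                        # lists index.html
```

Each cd moves one or more levels down the tree, and pwd always reports the full path from the top.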

1. Know where you are: '''pwd'''
:It shows the directory path where you are.

2. Navigate: '''cd'''
:To move to a particular directory, give its path to cd, for example
cd /var/www/htdocs/db

3. How do you know what is in the dir? Use '''ls'''
:A related option for listing files:
: '''ls -l''': Gives you the long directory listing

4. Getting the source code: '''lynx -source'''
: Enter
lynx -source httpwebaddress
: lynx is a text-based browser. -source tells lynx NOT to display the rendered page, but to return the raw HTML code.

5. Unix command history
:Just press the ''Arrow up'' key to see your previous command
''Whatever you do, double check''

6. Save an internet page to your folder
:Example: to save the internet page in the directory you are in now, key in
lynx -source internet_address.html > your_pagename.html

7. Change mode
: ''chmod'' takes several arguments
: +x makes a file executable (-x would take that permission away)
: ''chmod +x programfile'' will make programfile executable rather than just a plain text file.
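
The effect of chmod can be checked with a small experiment. This is a minimal sketch, assuming a POSIX shell and a scratch directory that allows programs to run; hello_program and the variable names are made up for the demonstration.

```shell
# Work in a throwaway directory (hypothetical scratch location).
work=$(mktemp -d)
cd "$work"

# A two-line program stored as plain text; hello_program is a made-up name.
printf '#!/bin/sh\necho "hello from hello_program"\n' > hello_program

# Before chmod: the file exists but has no execute permission.
if [ -x hello_program ]; then before=executable; else before=plain; fi

chmod +x hello_program    # add execute permission

# After chmod: the same test now succeeds and the file can be run.
if [ -x hello_program ]; then after=executable; else after=plain; fi
echo "before: $before, after: $after"
output=$(./hello_program)
echo "$output"
```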

Questions

1. Why put /var/www/htdocs/db/ in front of myprogram?
: This is because whenever you execute a program, the Unix shell needs to find the path of that program.
: In this case, we spell out the full directory path so that the Unix shell knows where that program is.

2. Why then can we use the ''lynx'' command and don't have to put the full path?
: This is because the lynx command lives in a directory that is already in the Shell's list of known paths.
: Later on, we will show you how to see the current known paths, e.g. echo $PATH
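
Both answers can be tried out with a made-up program. The sketch below assumes a POSIX shell; myprogram here is a stand-in created in a scratch directory, not the program from the questions, and the variable names are invented for the example.

```shell
# Scratch directory holding a made-up program called myprogram.
dir=$(mktemp -d)
printf '#!/bin/sh\necho "myprogram ran"\n' > "$dir/myprogram"
chmod +x "$dir/myprogram"

# 1. The full path always works, because the shell is told where the file is.
by_path=$("$dir/myprogram")

# 2. The bare name is not found while the directory is not listed in PATH.
if command -v myprogram >/dev/null 2>&1; then found_before=yes; else found_before=no; fi

# 3. After adding the directory to PATH, the bare name is found too.
PATH="$dir:$PATH"
by_name=$(myprogram)

echo "full path: $by_path"
echo "found before PATH change: $found_before"
echo "bare name: $by_name"
```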

== Command Line Arguments==

1. Command Line command without arguments
: For example ''pwd'' doesn't need to take arguments

2. Command line command with one argument
: For example ''ls -l'' takes one argument, the flag ''-l'' to give a ''l''ong listing.

3. Command line command with two arguments
: For example ''lynx -source http://www.google.com''
: lynx the command
: -source the first argument
: http://www.google.com as the second argument
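
To see how a program receives its arguments, here is a small sketch assuming a POSIX shell. show_args is a hypothetical helper written for this example; inside it, $# holds the number of arguments and $1, $2 hold the first and second one, just as they would inside a script.

```shell
# show_args is a hypothetical helper; inside a function, $1, $2 ... and $#
# behave as they do inside a script started from the command line.
show_args() {
    echo "number of arguments: $#"
    echo "first argument:  ${1:-<none>}"
    echo "second argument: ${2:-<none>}"
}

show_args                                  # like pwd: no arguments
show_args -l                               # like ls -l: one argument
show_args -source http://www.google.com    # like lynx -source URL: two arguments

# Keep the first output line of the two-argument call for inspection.
count=$(show_args -source http://www.google.com | head -n 1)
```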

== Loop Control ==

How do we make the numbers increment automatically, without typing them in ourselves?

1. Make step 6 into a program to create multiple files
:(1)Shell code for ''my_program''
for i in $(seq 0 99)
do
lynx -source internet_address$i.html > your_pagename$i.html
done

:(2)Change the file to an executable file: chmod +x my_program
:(3)Key in [path]/my_program
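
The loop can be tried without a network connection. This is a sketch under stated assumptions: fetch_page is a hypothetical stand-in for lynx -source, the file names are placeholders, and seq is assumed to be installed (it is on most Linux and macOS systems).

```shell
# fetch_page is a hypothetical stand-in for "lynx -source URL"; it only
# echoes the address so the loop can be tried without network access.
fetch_page() {
    echo "<html>source of $1</html>"
}

out=$(mktemp -d)    # scratch directory for the generated files

# seq 1 5 prints 1 2 3 4 5, so the shell supplies the numbers for us.
for i in $(seq 1 5); do
    fetch_page "internet_address$i.html" > "$out/your_pagename$i.html"
done

ls "$out"
```

Swapping fetch_page for the real lynx -source command turns the sketch back into the download loop.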

==Learning Unix Commands==

'''Environment Control'''

Lines starting with # are comments

cd d 
# Change to directory d
mkdir d 
#Create new directory d
rmdir d 
#Remove directory d
mv f1 [f2...] d 
#Move file(s) f1 [f2...] to directory d
mv d1 d2 
#Rename directory d1 as d2
passwd 
#Change password
alias name1 name2 
#Create command alias (csh/tcsh)
alias name1="name2" 
#Create command alias (ksh/bash)
unalias name1 [name2...] 
#Remove command alias name1 [name2...]
ssh nd 
#Login securely to remote node nd
exit 
#End terminal session
setenv name v 
#Set env var to value v (csh/tcsh)
export name="v" 
#set environment variable to value v (ksh/bash)
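
The difference between an ordinary shell variable and an exported environment variable shows up as soon as another program is started. A minimal sketch in ksh/bash syntax; GREETING and LOCAL_ONLY are made-up names, and LOCAL_ONLY is assumed not to be set in your environment already.

```shell
# GREETING and LOCAL_ONLY are made-up variable names for this demonstration.
LOCAL_ONLY="shell only"        # ordinary shell variable, not exported
export GREETING="hello"        # environment variable, passed to child processes

# sh -c starts a child process; the single quotes stop the parent shell from
# expanding the variables, so the child reports what it actually received.
child_view=$(sh -c 'echo "GREETING=$GREETING LOCAL_ONLY=$LOCAL_ONLY"')
echo "$child_view"
```

Only the exported variable reaches the child process; the other one stays private to the current shell.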

==Miscellaneous commands==

1)
 ls 
# is to list the directory

2)
cp 
# copy

3)
wc 
# word count

4)
ls -al | grep root | awk '{print "file: " $9, $8}' 
# list all files, keep only the lines that contain "root", and print the 9th column (file name) and the 8th column (time or year) of each

5)
wc -l 
# count number of lines

6)
 > 
# redirects the output into a file (the file is created, or overwritten if it already exists)

7)
 ls -al | grep root | awk '{print "file: " $9, $8}' > x.file 
# send the output to a file, in this case called x.file

8)
 pico x.file 
# a text editor for reading and editing x.file; to get out of pico, type ctrl x; pico is very sophisticated

9)
 vi x.file 
# another text editor for reading and editing x.file

10)
 :q 
# exits vi, whose commands are very obscure

11)
 :q! 
# exits vi and throws away any unsaved changes

12)
 pwd 
# print the current working directory

13)
 cd 
# is to change the directory

14)
 tab 
#use it to automatically fill in the blanks; auto completion

15)
 key up and down 
# help you look at commands typed earlier

16)
 echo "ls -al /root/Desktop/ | awk '{print \$5 \"b\", \$9}'" > tinwee 
# creates a script file called tinwee; the backslashes stop the shell from interpreting $5, $9 and the inner double quotes

17)
 ./tinwee 
# tries to execute tinwee, but it will not work yet because the file has no execute permission

18)
 chmod 755 tinwee 
# gives the owner full access and everyone else permission to read and run tinwee

19)
 pico tinwee 
# to make the program more generic by modifying the code

20)
 ls -al $1 | awk '{print $5 "bytes", $9 }' 
#change the code to be more generic instead of hardwiring the directory

21)
 ./tinwee / 
# takes / as $1, which in this case is the root directory; note that this shell $1 is different from the awk $1

22)
 ln -s tinwee t 
# creates a short name or shortcut (soft link) called t for tinwee; make sure the name is unique

23)
 echo $PATH | sed 's/:/\n/g' 
# this shows each directory in PATH on its own line

24)
 echo $PATH | sed 's/:/\n/g' |more 
# show little by little

25)
 echo $PATH | sed 's/:/\n/g' |less 
# page through the output with less, which also lets you scroll back up

26)
 echo $PATH | sed 's/:/\n/g' |cat 
# to show all

27)
 echo $PATH | sed 's/:/\n/g' |tac 
# to show in reverse order

28)
 echo $PATH | sed 's/:/\n/g' |less |rev 
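# rev reverses the characters within each line (tac is the one that reverses the order of the lines); less just passes the text along when its output goes into a pipe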

29)
 q 
# to quit more or less while looking at the results

30)
 ctrl + C 
# to interrupt a running command

31) =Create a simple translation tool=

echo "ATGCTTA" | rev | tac | tr "ATCG" "TAGC"
# rev reverses the sequence, tac reverses the order of the lines (no effect on a single line) and tr translates each base into its complement

32)
alias 
# does not link to a file; it substitutes one command name for another

33)
 >> 
#will append to the bottom of a file that already exists

34)
 > 
#writes to a new file; if the file already exists, its contents are overwritten

35) =Create a database of your own=
lynx http://aps.unmc.edu/AP/database/query_output.php?ID=00286 > AP00286.html 
#saves the web page to AP00286.html but it is not saving the source code

lynx -source http://aps.unmc.edu/AP/database/query_output.php?ID=00286 > AP00286.html
#saves the web page to AP00286.html and this time it is saving the source code

echo "lynx -source http://aps.unmc.edu/AP/database/query_output.php?ID=00286 > AP00286.html" >getapd

pico getapd 
# open the getapd file for editing
change
lynx -source http://aps.unmc.edu/AP/database/query_output.php?ID=00286 > AP00286.html
to
lynx -source http://aps.unmc.edu/AP/database/query_output.php?ID=0$1 > AP0$1.html 


# this is making the code generic

./getapd 0287 
# type on the command prompt to download the record 00287 without typing the whole command; 0287 becomes $1, so the script asks for ID=00287 and saves AP00287.html

pico getapd 
# open the getapd file to make the command more generic


# seq -w 1137 prints 0001 to 1137, padded with zeros to the same width
for i in `seq -w 1137`
do
lynx -source http://aps.unmc.edu/AP/database/query_output.php?ID=0$i > AP0$i.html
sleep 3
done


./getapd 
#run getapd for automatic download
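
The zero padding done by seq -w is what makes the record numbers in the loop line up with five-digit IDs such as 00286. A small sketch, assuming seq is installed; it only prints the names that would be used and downloads nothing.

```shell
# seq -w pads every number with leading zeros to the width of the largest one.
for i in $(seq -w 8 11); do
    # The loop in getapd puts one extra 0 in front of each padded number.
    echo "ID=0$i -> AP0$i.html"
done

# With 1137 as the upper limit, every number is padded to four digits.
first=$(seq -w 1 1137 | head -n 1)
last=$(seq -w 1 1137 | tail -n 1)
echo "first: $first, last: $last"
```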

36)
" " 
# double quotes are used when you want the shell to evaluate the variable. For example, "$PATH" will evaluate the dollar sign and give you the value of PATH

37)
' ' 
# single quotes are used when one wants to print the special characters without evaluating them. For example, '$PATH' will return $PATH
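
The two kinds of quotes are easy to compare side by side. A minimal sketch for any POSIX shell; FRUIT is a made-up variable used only for this demonstration.

```shell
# FRUIT is a made-up variable for this demonstration.
FRUIT="apple"

double="I like $FRUIT"     # double quotes: $FRUIT is replaced by its value
single='I like $FRUIT'     # single quotes: the text is kept exactly as typed

echo "$double"
echo "$single"
```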
