Useful Shell Function pathmunge

When I wrote a function that added a new directory to the PATH environment variable, I named it “addtopath”, with a companion “addtopathhigh” that prepended the directory instead. Later, while tweaking the /etc/profile file in Red Hat Enterprise Linux 5.3, I came across this pathmunge function, which does the same job:

pathmunge () {
  if ! echo $PATH | /bin/egrep -q "(^|:)$1($|:)" ; then
     if [ "$2" = "after" ] ; then
        PATH=$PATH:$1
     else
        PATH=$1:$PATH
     fi
  fi
}

The difference is that a single function handles both cases. The second argument controls the operation: when it is “after”, the directory ($1) is appended to PATH; otherwise it is prepended (analogous to my addtopathhigh function, so named because a prepended directory gets HIGHer priority in the search path). In both cases nothing happens if the directory is already in PATH.

To stay consistent with RHEL I decided to adopt this function under the same name, with a few improvements. The new definition quotes its variable expansions and writes them with curly braces, and after prepending or appending it strips any leading or trailing colon, which covers the basic case where PATH was empty before the “munging”.

Here it is:

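pathmunge () {
  # Only act when $1 is not already a complete entry in PATH.
  if ! echo "${PATH}" | /bin/egrep -q "(^|:)$1($|:)" ; then
     if [ "$2" = "after" ] ; then
        PATH="${PATH}:${1}"    # append: lowest priority
     else
        PATH="${1}:${PATH}"    # prepend: highest priority
     fi
     # Strip the stray colon left behind when PATH was empty.
     PATH="${PATH#:}"
     PATH="${PATH%:}"
  fi
}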
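
One caveat with the egrep test: characters such as “.” in the directory name are treated as regex metacharacters, and the function depends on /bin/egrep being there. Below is a minimal sketch of an alternative membership test that uses a plain case pattern on the colon-wrapped PATH instead. The name pathmunge_case and the sample directories are made up for this illustration; this is not the RHEL definition.

```shell
# Hypothetical variant of pathmunge: same interface, but membership is
# tested with a literal case pattern instead of an egrep regex.
pathmunge_case () {
  case ":${PATH}:" in
    *":${1}:"*) ;;                       # already present: do nothing
    *)
      if [ "$2" = "after" ] ; then
        PATH="${PATH}:${1}"
      else
        PATH="${1}:${PATH}"
      fi
      PATH="${PATH#:}"                   # strip a leading colon
      PATH="${PATH%:}"                   # strip a trailing colon
      ;;
  esac
}

# Usage on a throwaway PATH (the directories are illustrative only).
PATH="/usr/bin:/bin"
pathmunge_case /usr/local/bin            # prepend
pathmunge_case /opt/tools/bin after      # append
pathmunge_case /usr/bin                  # no-op, already listed
echo "${PATH}"
```

Wrapping PATH in colons lets one pattern cover the first, middle and last positions, which is what the (^|:) and ($|:) alternations do in the regex version.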

 

Command Example for useradd

The useradd command lets sysadmins add new users (a.k.a. logins) to the system. Its generic form is:

useradd -c 'COMMENT' -d 'HOME' -m -p 'CRYPT_STRING' -s 'SHELL' LOGIN
  • COMMENT is generally used for user information such as full name, phone number etc.
  • HOME is the home folder e.g. /home/user.
  • CRYPT_STRING is the encrypted password string, as generated by the crypt(3) function (libcrypt, a component of glibc, the GNU C Library).
  • SHELL is the default shell for the user.
  • LOGIN is the new username.

The -m option creates the user’s home directory if it does not exist.

Example:

useradd -c 'John Doe' -d /home/johndoe -m -p 'ABiELdbxGY2fY' -s '/bin/ksh' johndoe

John Doe just got himself a new account. Its home (login) directory is /home/johndoe, created if not already present (-m). Its password is set through the CRYPT_STRING ‘ABiELdbxGY2fY’, generated by the crypt(3) function for the (simplest and, unbelievably, most commonly used) password ‘123456’ with salt ‘AB’. The default login shell for this account is ksh (-s ‘/bin/ksh’). Last comes the username, johndoe.
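
Since useradd needs root and changes the system, it can help to look at the command before running it. Here is a small dry-run sketch that only prints the useradd line in the generic form shown above. The helper name print_useradd, its argument order, the /home/LOGIN convention and the /bin/sh fallback are all my own assumptions for the illustration, not part of useradd.

```shell
# Hypothetical helper: prints the useradd command instead of running it,
# so it can be reviewed first.
# Arguments: LOGIN COMMENT CRYPT_STRING [SHELL]
print_useradd () {
  login="$1"
  comment="$2"
  crypt_string="$3"
  user_shell="${4:-/bin/sh}"     # assumed fallback shell for this sketch
  printf "useradd -c '%s' -d '/home/%s' -m -p '%s' -s '%s' %s\n" \
    "${comment}" "${login}" "${crypt_string}" "${user_shell}" "${login}"
}

print_useradd johndoe 'John Doe' 'ABiELdbxGY2fY' /bin/ksh
```

The sketch does no escaping, so values that themselves contain a single quote would need extra care before the printed line is reused.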

Go on and create some users on your grid.

Bash Pattern Matching Operator

According to the manual page of GNU Bash (4.2 as of this post), the pattern matching operator =~

value =~ pattern

uses the ERE (Extended Regular Expression) syntax handled by regex(3), which is the POSIX 1003.2 regular expression format. Also, =~ has the same precedence as the == and != operators.

This is documented in the SHELL GRAMMAR section – Compound Commands – [[ expression ]].
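
To make this concrete, here is a minimal sketch that matches a sample string against an ERE and reads the captured groups from the BASH_REMATCH array, which the same manual section describes. The sample string, the pattern and the variable names are made up for the illustration.

```shell
#!/bin/bash
# Sketch: match a version string against an ERE with [[ value =~ pattern ]].
# The pattern is kept in a variable and expanded unquoted, because quoting
# the right-hand side makes bash match it as a literal string.
version="bash-4.2"
pattern='^([a-z]+)-([0-9]+)\.([0-9]+)$'

if [[ "${version}" =~ ${pattern} ]]; then
  # BASH_REMATCH[0] is the whole match; [1], [2], ... are the parenthesised groups.
  name="${BASH_REMATCH[1]}"
  major="${BASH_REMATCH[2]}"
  minor="${BASH_REMATCH[3]}"
  echo "name=${name} major=${major} minor=${minor}"
else
  echo "no match"
fi
```

Keeping the regex in a variable also avoids fighting the shell over backslashes and parentheses inside [[ ]].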

Linux Look-and-feel on Windows with Cygwin

Want to have GNU/Linux utilities at hand on a Windows box? Suffer no more. Cygwin provides them through its many packages (GNU and others), which contain versions of these tools recompiled for Windows.

Just to name a few out-of-the-box packages:

  • Interpreters: bash 4.x, ksh, zsh etc.
  • Servers and databases: Apache web server, PostgreSQL, sqlite3
  • Scripting languages: Perl, Python, Ruby etc.
  • Utilities: cron, md5deep etc.
  • Version control: bzr (bazaar), git, subversion etc.

Remember: this is just to name a few! There are a myriad of packages for you to choose from.

Having bash 4 available is priceless. Do your usual shell scripting on Windows with peace of mind!

Also, one of the most useful things Cygwin brings us is the mintty terminal emulator (intended for use with Cygwin on Windows). It has several awesome features:

  • It is fast!
  • Customizable colors, font and cursor type.
  • Configurable behavior for the backspace key (^H etc.).
  • Command line options to hold the window open (-h option) after a command finishes (-c command, as stated in mintty tips on google code), and to control the initial state of the window when you create a shortcut for it (option -w max makes it start maximized).
  • Copying by left mouse button selection works well and avoids bad situations with line wrapping. ctrl + insert and shift + insert for copy and paste (respectively) are accepted as well.
  • Scrolling with shift + {arrow|pgdn|pgup}.
  • Toggling many of the program’s functions on and off.
  • Window transparency.

Comparing mintty to the default terminal emulator in Cygwin, one can conclude that the former is a very significant improvement. For more, check this great article on mintty and also this one at howtogeek.com (more complete).

Try Cygwin if you have not yet, and let us know of your explorations! Happy scripting!

Ultra Fast AWK Syntax Tutorial

An AWK program reads its input and divides it logically into records (by default, one record per line). Each record is in turn divided into fields. By default, fields are separated by white-space characters, not including the newline character, which is the default RS (record separator). $0 represents the entire record and $x represents the x-th field.
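
A quick way to see records and fields in action is to feed awk two sample lines from the shell. This is only a sketch; the input text and the variable name fields_out are made up.

```shell
# Each input line is one record; each white-space separated word is a field.
fields_out=$(printf 'alpha beta gamma\none two\n' |
  awk '{ print "record: " $0; print "second field: " $2 }')
echo "${fields_out}"
```

For the first line this prints the whole record and then “beta”; for the second line, the whole record and then “two”.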

An AWK program is defined by a set of “pattern action” pairs. A pattern is a condition to be tested against each record; when it is satisfied, the corresponding action is executed for that record. See the various types of patterns in the section “Code Blocks” further down.

Special thanks go to my colleague Cesar A. Murakami for this tutorial’s main content:

Some Special variables (Input/Output):

  • FS: Input field separator (char or regex)
  • RS: Input record separator
  • NF: Number of fields in the current record
  • NR: Current input record number
  • OFS: Output field separator
  • ORS: Output record separator
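
As a small sketch of how these variables fit together, the command below reads colon-separated input and writes comma-separated output. The sample data and the variable name io_out are invented for the illustration.

```shell
# FS splits the input on ":"; OFS is placed between the items of print.
io_out=$(printf 'root:x:0\ndaemon:x:1\n' |
  awk 'BEGIN { FS = ":"; OFS = "," } { print NR, NF, $1, $3 }')
echo "${io_out}"
```

Each output line holds the record number, the field count, the first field and the third field, joined by OFS. Note that OFS only applies to items separated by commas in the print statement.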

Some more special variables (files and match):

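  • ARGC: number of elements in ARGV (the command name in ARGV[0] plus the arguments)
  • ARGV: array of command-line arguments (typically the input files), not including options or the program text
  • FILENAME: the current file being processed
  • FNR: record number within the current file (FILENAME)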
  • RSTART: starting position of string matched by match() function
  • RLENGTH: length of string matched by match() function
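
The following sketch exercises these variables on two throwaway files. The file names, their contents and the variable names tmpdir and special_out are assumptions made for the example.

```shell
# Two throwaway input files (names are arbitrary).
tmpdir=$(mktemp -d)
printf 'id=42\nnothing here\n' > "${tmpdir}/a.txt"
printf 'code id=7 end\n'       > "${tmpdir}/b.txt"

special_out=$(cd "${tmpdir}" && awk '
  # match() sets RSTART and RLENGTH; it returns 0 when nothing matches.
  match($0, /[0-9]+/) {
    print FILENAME, "FNR=" FNR, "NR=" NR, "RSTART=" RSTART, "RLENGTH=" RLENGTH, substr($0, RSTART, RLENGTH)
  }
  END { print "ARGC=" ARGC, "ARGV[1]=" ARGV[1] }
' a.txt b.txt)
rm -rf "${tmpdir}"
echo "${special_out}"
```

FNR restarts at 1 for b.txt while NR keeps counting across both files, and ARGC is 3 because ARGV[0] holds the command name.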

Conditionals:

if (var1 > 2) {
  print "greater than 2";
  var1 = var1 + 1;
}
if ( var1 in arr )
  printf "%d is an array index\n",var1;
else {
  arr[var1] = var2;
  printf "%d was not an array index\n",var1;
}
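
Here is a runnable sketch of the “in” test, reporting whether each input word was seen before. The array name seen, the sample words and the variable name cond_out are made up for the illustration.

```shell
# "$1 in seen" is true only when $1 is already an index of the array.
cond_out=$(printf 'red\nblue\nred\n' |
  awk '{
    if ($1 in seen) {
      printf "%s was already seen\n", $1
    } else {
      seen[$1] = NR
      printf "%s is new\n", $1
    }
  }')
echo "${cond_out}"
```

Unlike a test such as seen[$1] != "", the “in” operator does not create the array element as a side effect.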

Loops:

for (i=1; i<=10; i++) {
  print i;
}
for (i in arr) {
  print arr[i];
}
while ( x != 0 ) {
  print x;
  x = x - 1;  # change x inside the loop so that it can terminate
}
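
A short sketch combining the C-style loop and the for-in loop: it counts the words of one line. The input, the array name count and the variable name loop_out are invented for the example.

```shell
# The output is piped through sort because for-in order is unspecified.
loop_out=$(printf 'a b a c b a\n' | awk '{
  for (i = 1; i <= NF; i++)      # C-style loop over the fields
    count[$i]++
  for (word in count)            # for-in: index order is unspecified
    print word, count[word]
}' | sort)
echo "${loop_out}"
```

After sorting, the lines read “a 3”, “b 2” and “c 1”.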

Functions:

function mysum(param1, param2) {
  return param1 + param2;
}
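
To try the function from the shell, define it next to an action that calls it. This is a sketch with made-up input; func_out is just a name chosen for the example.

```shell
# Functions are defined at the top level of the program, outside any action.
func_out=$(printf '2 3\n10 -4\n' |
  awk 'function mysum(param1, param2) { return param1 + param2 }
       { print mysum($1, $2) }')
echo "${func_out}"
```

When calling a user-defined function, keep the opening parenthesis right after the name, with no space in between.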

Code Blocks ('PATTERN {ACTION}'):

BEGIN {
  # Code to be executed before record processing.
}
/regex/ {
  # Code to be executed when $0 contains a substring that is matched by regex.
}
$1 == "XYZ" {
  # Code to be executed when the first field is equal to "XYZ".
}
$2 ~ /regex/ {
  # Code to be executed when the second field contains a substring that is matched by regex.
}
! PATTERN {
  # Code to be executed when PATTERN is not matched / satisfied.
}
PAT1, PAT2 {
  # Code to be executed for record range (PAT1 is matched on start record, PAT2 on final, both lines included in the range).
}
{
  # Code to be executed for every record.
}
END {
  # Code to be executed after record processing.
}
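
Putting several of these pattern types together, here is a small sketch run from the shell. The input lines, the START and STOP markers and the variable names in_range, sum and blocks_out are all made up for the illustration.

```shell
blocks_out=$(printf 'header\nSTART\nfoo 1\nbar 2\nSTOP\nfooter\n' |
  awk '
    BEGIN            { print "begin" }
    /START/, /STOP/  { in_range++ }
    $1 == "foo"      { print "foo line:", $2 }
    $2 ~ /^[0-9]+$/  { sum += $2 }
    END              { print "in range:", in_range, "sum:", sum, "records:", NR }
  ')
echo "${blocks_out}"
```

Every record is tested against every pattern in order, so the “foo 1” line triggers three actions: the range, the field comparison and the regex match on the second field.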