Padrino Routes Accept Regex URL Map

Like Sinatra, Padrino accepts a regex in route definitions, in this case as the value of the :map option of a named alias.

A standard named alias (‘thisroute’) mapped to a URL with a code parameter embedded in it:

# For URLs like /someslug/anytext
get :thisroute, :map => '/someslug/:code' do
  # code param available as params[:code]
end

Accessing it as a block parameter, like in Sinatra, is also possible:

# For URLs like /someslug/anytext
get :thisroute, :map => '/someslug/:code' do |code|
  # code param available as the block-local variable code
end

And the killer one: to enforce a pattern on the parameter, capture it in a regex. Suppose we only want to allow one or more digits for that parameter:

# Only for URLs whose param. is made of digits, e.g. /someslug/123
get :thisroute, :map => %r|/someslug/([\d]+)| do |code|
  # code made of digits captured by regex available as code var.
end

Behold! And yes! The first capture (the text matched by the part of the regex inside the parentheses, which delimit a capturing group) is available as the first block parameter, very conveniently. Now go enjoy it and give your Padrino application's routing a boost!

Bash Pattern Matching Operator

According to the manual page of GNU Bash (4.2 as of this post), the pattern matching operator =~, as in

value =~ pattern

uses the ERE (Extended Regular Expression) syntax as implemented by regex(3), i.e. the POSIX 1003.2 regular expression format. Also, =~ has the same precedence as the == and != operators.

This is documented in the SHELL GRAMMAR section – Compound Commands – [[ expression ]].
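
As a rough illustration of the operator, here is a minimal sketch. The variable names and the sample value are made up for the example; BASH_REMATCH is the array Bash fills with the overall match (index 0) and the capturing groups (index 1 and up) after a successful =~ test.

```shell
#!/usr/bin/env bash
# Hypothetical input value, chosen only for illustration.
value="someslug/123"

# Keep the ERE in a variable and leave it unquoted on the right of =~ ;
# quoting the right-hand side makes Bash match it as a literal string.
pattern='^someslug/([0-9]+)$'

if [[ $value =~ $pattern ]]; then
    # BASH_REMATCH[1] holds the text matched by the first capturing group.
    code="${BASH_REMATCH[1]}"
    echo "matched, code=$code"
else
    echo "no match"
fi
```

Storing the pattern in a variable also avoids having to escape characters such as parentheses for the shell itself.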

awk – regexp constants vs. string constants

The guide section linked below (“Using Dynamic Regexps”) is well worth reading to understand the differences between regexp constants and string constants used for matching, and the three main advantages of regexp constants over string constants in any matching operation:

http://www.chemie.fu-berlin.de/chemnet/use/info/gawk/gawk_5.html#SEC32
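
One of those differences is easy to see from the shell. The sketch below assumes any POSIX awk is on the PATH; the sample input and variable names are invented for the example. A string constant is processed once as a string and again as a regexp, so a backslash has to be doubled to survive, while a regexp constant needs it only once.

```shell
# Two lines of hypothetical sample input.
input='a.b
axb'

# Regexp constant: a single backslash is enough to match a literal dot.
with_regexp=$(printf '%s\n' "$input" | awk '$0 ~ /a\.b/')

# String constant used as a dynamic regexp: the backslash must be doubled,
# because string escape processing consumes one level first.
with_string=$(printf '%s\n' "$input" | awk '$0 ~ "a\\.b"')

echo "$with_regexp"
echo "$with_string"
```

Both commands select only the line containing a literal dot; the string version simply takes more typing to get there.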

Gawk’s gensub Regexp Replacement Function

This is a general substitution function provided as an extension by the GNU Gawk utility.

Syntax:

gensub(regexp, replacement, how[, target])

Unlike sub and gsub, gensub does not modify the target, which keeps its original text; the result of the substitution is the function's return value instead. The regular expression regexp is searched for in target (by default $0, i.e. the entire record), and the matches (which are greedy) are replaced with the replacement text according to the how argument: “g” or “G” replaces all matches, while a number indicates which single occurrence to replace, counting from one (“1”).

A very interesting feature of this extension is group capturing, done the way it usually is in regexps, i.e. with enclosing parentheses. You specify which group to use in the replacement with the \n notation, where n is a digit from one to nine. (Reminder: to put a backslash in the string you must write it as ‘\\’.) An example is as follows:

$ gawk '
> BEGIN {
>     a = "dog beautiful"
>     # \\1 and \\2 refer to the first and second captured words
>     b = gensub(/([^ ]+) ([^ ]+)/, "\\2 \\1", "g", a)
>     print b
> }'
-| beautiful dog

Now an example specifying which match to replace (as opposed to “g” or “G”, denoting global substitution, as a value for the 3rd argument):

$ gawk '
> BEGIN {
>     a = "such a dog beautiful"
>     # "2" replaces only the second match ("dog beautiful")
>     b = gensub(/([^ ]+) ([^ ]+)/, "\\2 \\1", "2", a)
>     print b
> }'
-| such a beautiful dog