I have a Python script that reads a configuration file which is itself a Python script. The configuration file stores variables for customization, but the most important part is its lists of regular expressions. It looks like:

# stuff...
EXCLUDE_KEYWORDS = [
    '1 blah',                             # 2012-01-03T07:04:21Z
    'a foobar',                           # 2012-01-13T09:12:51Z
    # more...
    ]
# more...
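
For illustration, here is a minimal sketch of how a main script could load such a file and turn the strings into regular expression objects. The function name load_exclude_patterns and the use of runpy are assumptions made for this example, not the actual script:

```python
import re
import runpy


def load_exclude_patterns(config_path):
    """Run the configuration file and compile each EXCLUDE_KEYWORDS entry."""
    # runpy.run_path executes the file and returns its globals as a dict.
    settings = runpy.run_path(config_path)
    # Assumption: a missing list simply means there is nothing to exclude.
    keywords = settings.get("EXCLUDE_KEYWORDS", [])
    return [re.compile(keyword) for keyword in keywords]
```

A caller could then test some text with any(p.search(text) for p in patterns). Because the configuration file is executed as code, this only makes sense for a file you control.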

The main Python script reads those strings and creates regular expression objects from them at runtime. As you can tell from the timestamp at the end of each line, I added each RE by hand, and I was getting tired of it because I had to scroll down hundreds of lines to put each new entry in alphabetical order. Therefore, the following script was born today:

#!/bin/bash

TMP_FILE=/tmp/blah.tmp
TARGET="${XDG_CONFIG_HOME:-$HOME/.config}/blah/blah-cfg.py"

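awk -v s="$1" -v d="$(date --utc +%Y-%m-%dT%H:%M:%SZ)" '
BEGIN {
  # ck is 1 while inside EXCLUDE_KEYWORDS and the new entry is not yet inserted
  ck = 0
  # new line: quoted entry padded to 41 columns, then the timestamp comment
  s = sprintf("%-41s # %s", "    '\''" s "'\'',", d)
  slower = tolower(s)
}
# start of the list: begin looking for the insertion point
/EXCLUDE_KEYWORDS/ {ck = 1; print $0; next}
{
  # insert before the first line that sorts after the new entry, ignoring case
  if (ck && slower < tolower($0)) {
    print s
    ck = 0
  }
  print $0
}
# closing bracket of a list: stop looking
/^[[:blank:]]+\]/ {ck = 0}
' < "$TARGET" > "$TMP_FILE"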

diff -u "$TARGET" "$TMP_FILE"

read -p "Update file? " ans
[[ -z "$ans" ]] && exit 1

mv "$TARGET" "$TARGET.bak"
mv "$TMP_FILE" "$TARGET"

This Bash script uses awk to add the new entry in the format I use, including the indentation and the timestamp. The only thing I need to take care of is escaping the argument properly when I call the script. Even so, the output is diffed, so I can see whether the result is correct before anything is written:

$ ./blah-addkw.sh Wsdf
--- /home/livibetter/.config/blah/blah-cfg.py      2012-03-19 13:05:35.000000000 +0800
+++ /tmp/blah.tmp      2012-03-19 13:07:31.829574520 +0800
@@ -441,6 +441,7 @@
     'Wabcdefggu',                         # 2012-01-06T07:13:14Z
     'Wddsdfklsdjfie sdjfljkjf',           # 2012-01-24T01:10:07Z
     'wkjsdfjkskdflsdjflkj kr',            # 2012-01-20T22:17:11Z
+    'Wsdf',                               # 2012-03-19T05:07:31Z
     'Wpa',                                # 2011-12-20T00:57:31Z
     'Wzz',                                # 2012-01-15T01:17:20Z
     'X-kljsdlfjdsklfjlsdjfkj',            # 2012-01-25T01:14:25Z
Update file? k

I masked the entries, but you get the idea. Also note that the sorting is case-insensitive.
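
The same insertion logic can be sketched in Python, which is handy for trying it on a small list without touching the real file. This is a rough port of the awk program, not a replacement for it. The name insert_keyword and its parameters are made up for this example, and like the awk version it does not escape quotes inside the keyword:

```python
from datetime import datetime, timezone


def insert_keyword(lines, keyword, timestamp=None):
    """Return a copy of lines with the new entry inserted in sorted order."""
    if timestamp is None:
        timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Same layout as the awk sprintf: entry padded to 41 columns, then the comment.
    new_line = "%-41s # %s" % ("    '%s'," % keyword, timestamp)
    new_lower = new_line.lower()

    result = []
    in_list = False  # plays the role of the awk variable ck
    for line in lines:
        if "EXCLUDE_KEYWORDS" in line:
            in_list = True
            result.append(line)
            continue
        # Insert before the first line that sorts after the new entry.
        if in_list and new_lower < line.lower():
            result.append(new_line)
            in_list = False
        result.append(line)
        # An indented closing bracket ends the list, as in the awk pattern.
        if line[:1] in (" ", "\t") and line.lstrip(" \t").startswith("]"):
            in_list = False
    return result
```

Comparing whole lowercased lines means the quote and the indentation take part in the comparison too. That is why an entry that sorts after everything else still lands before the closing bracket: the quote character sorts before the bracket.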

For a plain text file, you can simply run echo new stuff >> FILE; sort FILE > TMP_FILE and double-check the result before you write it back. But in my case the file is code, so some extra care is needed because of the syntax.
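
Both cases can be sketched in Python. The helpers below are hypothetical names for illustration: add_sorted covers the plain text case, preview gives a diff to review before writing back, and is_valid_python is one possible check for the code case, assuming that "still parses as Python" is a good enough test. The case-insensitive sort key is also an assumption, chosen to match the behaviour of my script:

```python
import ast
import difflib


def add_sorted(lines, new_line):
    """Plain text case: append the new line and sort, ignoring case."""
    return sorted(lines + [new_line], key=str.lower)


def preview(old_lines, new_lines):
    """Return a unified diff to review before anything is written back."""
    diff = difflib.unified_diff(
        old_lines, new_lines, fromfile="before", tofile="after", lineterm=""
    )
    return "\n".join(diff)


def is_valid_python(source):
    """Code case: check that the edited configuration still parses."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True
```

ast.parse only checks the syntax and never runs the file, so an unescaped quote in a new entry is caught without any side effects.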

Just before writing this script, I noticed that the configuration file had about 500 lines. That seems like a lot, and it is, considering the first timestamp is 2011-12-07T20:39:43Z. I wasted a lot of time scrolling and sorting in my head. xD