November 12, 2013

Blogging with your shell

Greetings. My pseudonym is Ypnose.

Some people already know me (at least I hope so), since I try to stay involved in FOSS projects. I also wrote articles on my French blog, starting in 2011 (it was discontinued for many reasons). This is my second weblog. I'll talk about the shell and related hacks. I plan to share some of my tools and configs as well.

Actually, I'm in love with the *ksh shells (and not necessarily because I started serious programming on mksh). Compared to the other shells I tried, in my experience they do everything faster and more easily. Many people use a shell every day, but only in interactive mode. They want a fancy prompt with colors, dozens of Unicode symbols, PWD, date(1) or a clock. Personally, I choose my shell according to its "power". For me, script usage is much more important. Before *ksh, I played a lot with the POSIX shell.

I decided to write my own static website generator script, because I was unable to find something that fit my special needs. Similar software already exists, like werc, swerc or Hakyll. However, I wanted a tool in pure shell (here mksh(1)) plus other UNIX utilities, like find(1) or awk(1). I have no limits (except my own), and I can do everything I want with my lines. That's why wswsh ([w]eb [s]ucks [w]ithout [sh]ell) was born. It's designed the way I want. Sometimes the shell is underestimated, because it seems archaic, simplistic, dirty or designed only for easy tasks. I think that's stupid: it's a powerful tool. I do difficult/advanced tasks with it and I'm happy so far.

I'll try to explain how wswsh works (reading my comments is useful too).

We are in a directory (which will be your brand new website) named "ITOOGUD", with the following files. They make up the minimal dependencies.

.
├── includes/
│   └── layout
├── wswsh
└── wswsh.conf.default

includes contains the website parts (layouts, header or footer). wswsh is the main script and wswsh.conf.default is the example config file. You have to rename it to wswsh.conf. It contains the website options. Read the comments inside for more information.

Now, the show begins...

Make a directory called src. In this example, we have a basic structure with 5 folders and 4 files inside it. As you can see, some files have an unfamiliar extension, .wshtml. This is a generic extension (it lets me distinguish my posts without getting a headache) that marks articles written as HTML fragments (.wtfhtml is OK too). Moreover, try to avoid .html for your articles (you will understand why later).

.
├── includes/
│   └── layout
├── src/
│   ├── css/
│   │   └── style.css
│   ├── blog/
│   │   └── index.wshtml
│   ├── me/
│   │   └── john_doe.wshtml
│   └── foo/
│       └── baz/
│           └── trololol.wshtml
├── wswsh
└── wswsh.conf
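
If you want to play along, the src part of that tree can be recreated with a few commands. This is only a sketch in portable sh: the scratch directory, the variable name "site" and the file contents are made up for the example, and wswsh, wswsh.conf and includes/layout are not created here because they come from the project itself.

```shell
#!/bin/sh
# Sketch only: recreate the example sources in a scratch directory.
# "site" is a hypothetical variable name; wswsh does not define it.
site=$(mktemp -d) || exit 1
cd "$site" || exit 1

# The 5 folders that live under src/.
mkdir -p src/css src/blog src/me src/foo/baz

# The 4 files. Their contents are placeholders invented for this example.
: > src/css/style.css
printf '<h1>Blog</h1>\n'     > src/blog/index.wshtml
printf '<h1>John Doe</h1>\n' > src/me/john_doe.wshtml
printf '<p>trololol</p>\n'   > src/foo/baz/trololol.wshtml

# List the articles, i.e. the files with the chosen extension.
find src -type f -name '*.wshtml' | sort
```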

Just run ./wswsh "$PWD". First of all, the script reproduces the exact directory structure in a new folder, dest. wswsh recursively analyzes the sources located in src.

dest/
├── css/
├── blog/
├── me/
└── foo/
    └── baz/
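
To give an idea of what "reproducing the directory structure" means, here is a minimal sketch in portable sh. It is not the code from wswsh: the helper name mirror_dirs and the scratch directory are assumptions made for the example, and directory names containing newlines are not handled.

```shell
#!/bin/sh
# Sketch only: mirror the directory tree of src/ into dest/, without any files.
work=$(mktemp -d) || exit 1
cd "$work" || exit 1
mkdir -p src/css src/blog src/me src/foo/baz
: > src/css/style.css

# mirror_dirs is a hypothetical helper name, not a wswsh function.
# $1 = source root, $2 = destination root
mirror_dirs() {
    # find(1) runs inside the source root so the paths it prints are relative.
    ( cd "$1" && find . -type d ) | while IFS= read -r dir; do
        mkdir -p "$2/$dir"
    done
}

mirror_dirs src dest
find dest -type d | sort
```

Only directories are created at this stage; the files arrive in the following steps.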

Then, find(1) "scans" those sources and "records" the files with the chosen extension (here .wshtml). If WSH_INTERP is set, for example WSH_INTERP="ahrf", the interpreter "reads" each file and its standard output is redirected to a new file, as in $ ahrf my_article > my_new_article. This interpreter must be in your PATH! Otherwise, the script concatenates the file(s) (plus the page_header and page_footer functions) into a new one in dest. It does something like $ cat file1 file2 file3 > new_file. Trivial, isn't it?

dest/
├── css/
│   └── style.css
├── blog/
│   └── index.html
├── me/
│   └── john_doe.html
└── foo/
    └── baz/
        └── trololol.html
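
The conversion step can be sketched like this, in portable sh. The names WSH_INTERP, page_header and page_footer come from the text above, but the bodies of the two functions, the sample article and the loop itself are assumptions written for this example, not a copy of wswsh.

```shell
#!/bin/sh
# Sketch only: turn every *.wshtml under src/ into an *.html under dest/.
work=$(mktemp -d) || exit 1
cd "$work" || exit 1
mkdir -p src/blog dest/blog
printf '<p>First post</p>\n' > src/blog/index.wshtml

# Empty means "no interpreter"; set it to e.g. "ahrf" to filter the articles.
WSH_INTERP=""

# Stand-ins for the page_header / page_footer functions named in the text.
# What the real ones print is defined by wswsh, not reproduced here.
page_header() { printf '<html><body>\n'; }
page_footer() { printf '</body></html>\n'; }

( cd src && find . -type f -name '*.wshtml' ) | while IFS= read -r file; do
    file=${file#./}                    # "./blog/index.wshtml" -> "blog/index.wshtml"
    out="dest/${file%.wshtml}.html"    # swap the extension
    if [ -n "$WSH_INTERP" ]; then
        # Interpreter case: its standard output becomes the page.
        "$WSH_INTERP" "src/$file" > "$out"
    else
        # Default case: plain concatenation, like cat file1 file2 file3.
        { page_header; cat "src/$file"; page_footer; } > "$out"
    fi
done

cat dest/blog/index.html
```

The sketch assumes the dest folders already exist, which is what the earlier directory step is for.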

Huhh?... How can you copy this style.css file? It's not handled by find(1), right?

Yes, that's true, boy. There is a variable, WSH_CSSF, in wswsh.conf to copy the stylesheet into dest/. The same variable can be used to apply tags in layout as well.

.
├── dest/
│   └── css/
│       └── style.css
├── src/
│   └── css/
│       └── style.css
├── wswsh
└── wswsh.conf
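
A rough sketch of the idea, in portable sh. The exact format of WSH_CSSF is not described in this article, so the example assumes it holds the stylesheet path relative to src/. The link tag printed at the end is only an illustration of how a layout could reuse the value.

```shell
#!/bin/sh
# Sketch only: copy the stylesheet named by WSH_CSSF from src/ to dest/.
work=$(mktemp -d) || exit 1
cd "$work" || exit 1
mkdir -p src/css dest/css
printf 'body { margin: 0; }\n' > src/css/style.css

# Assumption: WSH_CSSF holds the stylesheet path relative to src/.
WSH_CSSF="css/style.css"

if [ -n "$WSH_CSSF" ] && [ -f "src/$WSH_CSSF" ]; then
    cp "src/$WSH_CSSF" "dest/$WSH_CSSF"
fi

# Illustration only: the same value could feed a tag in the layout.
printf '<link rel="stylesheet" href="/%s">\n' "$WSH_CSSF"
```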

Sometimes our pages are not dynamically generated, so we do not need to process them. That's true for static start pages or menus. In that case, we just want to copy them to the destination. The program has a section (cf. copy_html) that does the trick if it finds *.html files. As mentioned above, you shouldn't use .html for all posts / pages.

.
├── src/
│   ├── start/
│   │   └── index.html
│   └── menu/
│       └── 2013/
│           └── index.html
├── wswsh
└── wswsh.conf
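
Something along these lines would do the copy. Only the name copy_html comes from the text; the function body below is a guess in portable sh, with made-up sample pages. It also hints at why articles should not end in .html: a generated page and a copied page with the same name would land on the same path in dest.

```shell
#!/bin/sh
# Sketch only: copy ready-made *.html pages from src/ to dest/ untouched.
work=$(mktemp -d) || exit 1
cd "$work" || exit 1
mkdir -p src/start src/menu/2013
printf '<h1>Start</h1>\n' > src/start/index.html
printf '<h1>2013</h1>\n'  > src/menu/2013/index.html

# The name copy_html is taken from the text; this body is an assumption.
copy_html() {
    ( cd src && find . -type f -name '*.html' ) | while IFS= read -r file; do
        file=${file#./}
        mkdir -p "dest/$(dirname "$file")"   # make sure the target folder exists
        cp "src/$file" "dest/$file"
    done
}

copy_html
find dest -type f | sort
```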

Despite all this care, some files still escape us. It used to be possible to export those files, listed in a variable called FIL, but I decided to remove that possibility. Copying external file(s) shouldn't be done by the main script. You can do it with a trivial script or via a Makefile.

.
├── src/
│   ├── blog/
│   ├── flux.xml
│   └── robots.txt
├── wswsh
└── wswsh.conf
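
Such a "trivial script" could look like the sketch below, in portable sh. The file list is hard-coded from the tree above, and the loop variable name and the scratch directory are assumptions made for the example.

```shell
#!/bin/sh
# Sketch only: copy the stray files that the generator does not handle.
work=$(mktemp -d) || exit 1
cd "$work" || exit 1
mkdir -p src/blog dest
printf '<?xml version="1.0"?>\n' > src/flux.xml
printf 'User-agent: *\n'         > src/robots.txt

# "extra" is a hypothetical variable name; the list is taken from the tree.
for extra in flux.xml robots.txt; do
    if [ -f "src/$extra" ]; then
        cp "src/$extra" "dest/$extra"
    else
        printf 'skipping missing file: %s\n' "$extra" >&2
    fi
done

ls dest
```

Run it after wswsh, so that dest already exists in a real site.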

Testing the website was a bit problematic. I wanted to test the web pages with a minimal server, and I was too lazy to install Nginx / lighttpd (and using them is total overkill). What a dilemma! I finally decided to grab and compile a tiny C web server called darkhttpd (NOO, it's not PURE shell). Indeed, but it doesn't "violate" my specifications: wswsh is a standalone script and this server isn't part of it. Since I'm a very detail-oriented person, I like to verify the final result, but you don't have to.

I won't describe all the possibilities in this article (config parsing, various interpreters, hooks), because it would be very boring. If you read my code closely, I guess it won't be very difficult to understand what it does. I'll add some features later (perhaps). Atom feeds were also supported (this was removed). However, wswsh won't be "ported" to sh.

The script is released under the "Modified BSD License".

I hope you found this first article interesting. See ya soon!