Author Archives: alex

Engrish in Mozambique

I live in Mozambique (though I was born in Europe), and I’d like to share a little bit of this country with you in the form of our local Engrish, for lack of a better term. Sometimes businesses try to communicate with people whose first language isn’t Portuguese, which is the official language here. The result might be called Menglish, maybe, but I’ll just go with Engrish here. What follows is a list of fun stuff.

Fun at the Water Park

The local “Adil” water park shares some goodness:

“We will not tolerate any offense against public morals, ethics and lack of bad behavior on the grounds of the park.”

That’s right. Ethics are good, but bad behavior is better.

Resumé gems

Excerpt from a great cover letter (sent to 100+ organizations – To, not BCC):

Waiting for Your Excellency the greatest consideration and care to nurture more professional experience, hereby express their willingness to join in your workgroup.

Naturally closely followed by this, in case there were any lingering doubts:

English: Spoken and written fluently.

I’ll be adding more as I dig it up…

 

 

Writing your own bootloader for a toy operating system (2)

Now that we know the structure of the boot parameter block (BPB) and extended boot parameter block (EBPB), we can start writing our first code. (If you need a refresher, have a look at part 1 of this article).

First code in GNU assembler

We’ll be using the GNU assembler, since it’s free, comes with a boatload of options, supports AT&T and Intel assembly syntax and plays nice with gcc and ld later on. Some of the preprocessor directives used may need some explanation, but all code will be in straightforward Intel syntax.

Here’s some boilerplate code to get started:

.code16
.intel_syntax noprefix
.text
.org 0x0
 
LOAD_SEGMENT = 0x1000
 
.global main
main:
  jmp short start
  nop
 
// BPB and EBPB here
 
start:
  // rest of code

The pile of assembler directives at the top tells the assembler to generate code for real mode. Since all (Intel-based) computers start up in real mode with 16-bit instructions, we won’t be able to write 32-bit code here. We also tell GNU as that we’ll be using Intel syntax (e.g. mov ax, 1 instead of movw $1, %ax – some prefer the latter, but most readers of this text will be familiar with Intel). The origin of our code will be 0x0, i.e. all absolute addresses start at 0x0, which will be convenient.

Then there’s the main entry point of our code, which corresponds to the first byte of actual output when assembled. The code under “main” simply jumps over the BPB and EBPB located at offset 0x3, resuming execution at the label start.

We’ve also defined a constant LOAD_SEGMENT, which is the segment where we’ll be loading our second stage bootloader (more about that later).

The Boot Parameter Block

The structure of the boot parameter block can be coded like this:

bootsector:
 iOEM:        .ascii "DevOS   "  # OEM String
 iSectSize:   .word  0x200       # Bytes per sector
 iClustSize:  .byte  1           # Sectors per cluster
 iResSect:    .word  1           # #of reserved sectors
 iFatCnt:     .byte  2           # #of fat copies
 iRootSize:   .word  224         # size of root directory
 iTotalSect:  .word  2880        # total # of sectors if < 32 MB
 iMedia:      .byte  0xF0        # Media Descriptor
 iFatSize:    .word  9           # Size of each FAT
 iTrackSect:  .word  9           # Sectors per track
 iHeadCnt:    .word  2           # number of read-write heads
 iHiddenSect: .int   0           # number of hidden sectors
 iSect32:     .int   0           # # sectors for > 32 MB
 iBootDrive:  .byte  0           # holds drive that the boot sector came from
 iReserved:   .byte  0           # reserved, empty
 iBootSign:   .byte  0x29        # extended boot sector signature
 iVolID:      .ascii "seri"      # disk serial
 acVolumeLabel:                  # just placeholder. We don't yet use volume labels.
 root_strt:   .byte 0,0          # hold offset of root directory on disk
 root_scts:   .byte 0,0          # holds # sectors in root directory
 file_strt:   .byte 0,0          # holds offset of bootloader on disk
 file_scts:   .byte 0,0          # holds # sectors in boot loader
              .byte 0,0
 rs_fail:     .byte 0            # hold # tries done when attempting to read a sector
 acFSType:    .ascii "FAT16   "  # file system type

The fields in this structure correspond to the specification in part 1 of this text, and since they’re nicely labelled, we’ll be able to refer to them later on. Note that since we don’t use volume labels here, we’re able to take the 11 bytes reserved for the volume label and store other things there: fields to be used later. You are not required to do this. I thought it would be a great way to save space, but I later found that my bootloader did everything I wanted and I still had about 20 bytes to spare, so I could move these fields out of the EBPB after all. But for now, we’ll keep them here.
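If you want to double-check the size of this structure without running the assembler, here is a minimal Ruby sketch. It packs the same values in the same order and counts the bytes. The helper name pack_bpb is my own invention for illustration, and this is only a sanity check, not part of the bootloader.

```ruby
# Sketch only: pack the BPB/EBPB fields in the order of the assembler listing.
# pack_bpb is a hypothetical helper name, not part of the bootloader.
def pack_bpb
  [
    "DevOS   ",   # OEM string (8 bytes)
    0x200,        # bytes per sector (word)
    1,            # sectors per cluster (byte)
    1,            # reserved sectors (word)
    2,            # FAT copies (byte)
    224,          # root directory entries (word)
    2880,         # total sectors (word)
    0xF0,         # media descriptor (byte)
    9,            # sectors per FAT (word)
    9,            # sectors per track (word)
    2,            # heads (word)
    0,            # hidden sectors (dword)
    0,            # total sectors for large volumes (dword)
    0,            # boot drive (byte)
    0,            # reserved (byte)
    0x29,         # extended boot signature (byte)
    "seri",       # volume ID (4 bytes)
    "\0" * 11,    # volume label area, reused for our scratch fields
    "FAT16   "    # file system type (8 bytes)
  ].pack("a8 v C v C v v C v v v V V C C C a4 a11 a8")
end

bpb = pack_bpb
puts "BPB + EBPB size: #{bpb.bytesize} bytes"
puts "Boot code can start at offset #{3 + bpb.bytesize}"
puts "Bytes left over (jump + code): #{512 - bpb.bytesize - 2}"
```

The v, V and C directives are little-endian 16-bit, little-endian 32-bit and 8-bit values, which matches what .word, .int and .byte emit on x86.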

Real-mode Segments

After the start label, we can write some actual code. Let’s start by defining our real mode data segments:

  cli
  mov  ax, cs          # CS is set to 0x0, because that is where boot sector is loaded (0:0x7C00)
  mov  ds, ax          # DS = CS = 0x0
  mov  es, ax          # ES = CS = 0x0
  mov  ss, ax          # SS = CS = 0x0
  mov  sp, 0x7C00      # Stack grows down from offset 0x7C00 toward 0x0000.
  mov  iBootDrive, dl  # save what drive we booted from (should be 0x0), now that DS is valid
  sti

Here, we mask interrupts so that an interrupt can’t fire while the segment registers and stack pointer are only half set up. We set ES = DS = SS = CS = 0x0, and make the stack grow down from 0x7C00 (our boot loader was loaded at 0x7C00). When done, we turn the interrupts back on. It’s important to note that the BIOS places the number of the boot drive in the DL register. We store it in our BPB for later use.
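To make the segment arithmetic concrete, here is a small Ruby sketch of how a real-mode segment:offset pair maps to a physical address. The function name linear_address is hypothetical, and the 20-bit wrap-around assumes an original 8086-style address bus (A20 disabled).

```ruby
# Real-mode address translation: physical = segment * 16 + offset.
# The mask models the 20-bit address bus (an assumption: A20 line disabled).
def linear_address(segment, offset)
  ((segment << 4) + offset) & 0xFFFFF
end

printf("0000:7C00 -> 0x%05X\n", linear_address(0x0000, 0x7C00))  # where the BIOS loads us
printf("07C0:0000 -> 0x%05X\n", linear_address(0x07C0, 0x0000))  # same byte, different pair
printf("1000:0000 -> 0x%05X\n", linear_address(0x1000, 0x0000))  # LOAD_SEGMENT
```

The first two lines print the same physical address, which is why it matters that we set the segment registers ourselves instead of trusting whatever the BIOS left in them.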

Resetting the disk system

Next, we need to prepare the floppy drive for use. This is done through BIOS interrupt 0x13, subfunction 0. We call it with the boot drive in DL:

  mov  dl, iBootDrive   # drive to reset
  xor  ax, ax           # subfunction 0
  int  0x13             # call interrupt 13h
  jc   bootFailure      # display error message if carry set (error)

If the reset fails, the carry flag will be set and we jump to a label where we handle a boot failure by showing a message, waiting for a keypress and rebooting. Come to think of it, we’ll need a way to print a string to the screen.

Printing a string

We’ll add a short function that uses BIOS interrupt 0x10, subfunction 0x0E (teletype output) to print characters to the screen. The calling code must point DS:SI to the null-terminated string to be printed.

.func WriteString
 WriteString:
  lodsb                   # load byte at ds:si into al (advancing si)
  or     al, al           # test if character is 0 (end)
  jz     WriteString_done # jump to end if 0.

  mov    ah, 0xe          # Subfunction 0xe of int 10h (video teletype output).
  mov    bx, 9            # Set bh (page number) to 0, and bl (foreground color, graphics modes only) to 9 (light blue).
  int    0x10             # call BIOS interrupt.

  jmp    WriteString      # Repeat for next character.

 WriteString_done:
  retw
.endfunc

We can now define the “bootFailure” label:

diskerror: .asciz "Disk error. "
bootFailure:
  lea si, diskerror
  call WriteString
  call Reboot

Great. We’ve got code to reset the floppy drive, and if it fails, there’s code that prints failure strings and reboots. Although, we still have to write a Reboot function.

Rebooting

Here is some code that prints a “Press any key to reboot” message, waits for a keystroke, and reboots the machine.

rebootmsg: .asciz "Press any key to reboot\r\n"
.func Reboot
 Reboot:
  lea    si, rebootmsg    # Load address of reboot message into si
  call   WriteString      # print the string
  xor    ax, ax           # subfunction 0
  int    0x16             # call bios to wait for key

  .byte  0xEA             # machine language to jump to FFFF:0000 (reboot)
  .word  0x0000
  .word  0xFFFF
.endfunc

Here, we use BIOS interrupt 0x16, subfunction 0 to read a key (any key). We then add a far jump to 0xffff:0000, which causes the machine to reboot.
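The hand-assembled far jump is easier to trust once you decode it. The following Ruby sketch (decode_far_jump is a made-up helper name) takes the five bytes emitted by the .byte and .word directives and shows where they point. Opcode 0xEA is a far JMP whose operand is a 16-bit offset followed by a 16-bit segment, both little-endian.

```ruby
# Decode a 5-byte real-mode far jump: opcode, 16-bit offset, 16-bit segment.
# decode_far_jump is a hypothetical helper for illustration only.
def decode_far_jump(bytes)
  opcode, offset, segment = bytes.unpack("C v v")
  {
    opcode: opcode,
    segment: segment,
    offset: offset,
    linear: (segment << 4) + offset   # physical address in real mode
  }
end

jump = decode_far_jump([0xEA, 0x00, 0x00, 0xFF, 0xFF].pack("C*"))
printf("opcode 0x%02X -> %04X:%04X (physical 0x%05X)\n",
       jump[:opcode], jump[:segment], jump[:offset], jump[:linear])
```

Physical address 0xFFFF0 is where an x86 processor starts executing after a reset, so jumping there restarts the BIOS.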

Summary

We’ve written assembler code that prepares data and stack segments and resets the floppy drive. We’ve also added functions for writing text to the screen, waiting for a keypress, and rebooting, which wraps up most of the framework we’ll need for the rest of the bootloader. In the next section, we’ll write code that actually accesses the floppy drive to load our kernel (or second stage bootloader, actually, but we’ll sort that out).

Writing your own boot loader for a toy operating system (1)

If you’re writing your own toy operating system, the first thing you’ll need is a boot sector. It’s a piece of code that lives in the first sector of a (floppy) disk. This code gets called by the BIOS as soon as the computer starts up.

Do note that you can actually start developing other components of your toy operating system before writing boot code, since you can use GRUB (GNU Grand Unified Boot Loader) or LILO to start your kernel. Using one of these tools brings advantages, since they’ll switch the processor to protected mode for you, and allow you to load kernels that are placed beyond cylinder 1024 of a hard disk.

However, writing your own boot code can be a very interesting exercise in assembly programming, and you’ll have full control over what your bootloader actually does. Plus, you get to try and do it better than the people who wrote the DOS/Win95 bootloaders (which isn’t saying a lot as you’ll see below).

Boot loader requirements

The boot code lives in the first sector of a floppy disk, which typically has a size of 512 bytes. However, 61 of those bytes are occupied by data, placed on the disk when it is formatted. This data includes the size of a disk sector, number of FAT tables, number of sectors per track, volume ID, and more. This yields 451 bytes available for code, which is not a whole lot. That’s one reason we’ll use assembler to write our code.

The DOS/Windows bootloader and its limitations

Let’s consider the bootloader that most of us have used many times: the bootloader that comes with DOS or Windows (up to Windows 95). What does it do?

  • Reset the floppy disk system
  • Read the first sector of the root directory from the disk
  • Verify that the first file found there is IO.SYS (the kernel)
  • Load IO.SYS  into memory
  • Transfer control to IO.SYS

Since the space available for actual code in the boot sector is limited, the author of the DOS bootloader introduced an important requirement: the file IO.SYS must be the first file in the root directory. The DOS code does not scan the entire root directory looking for the required file. If IO.SYS is not the first file found, then the boot code fails.

This is why DOS/Windows come with the SYS.COM program, which is used to make a disk bootable. This program actually cleans the root directory of a floppy disk and copies IO.SYS into it as the first entry, effectively removing all the other files. It would have been much nicer if it had been possible to copy IO.SYS to the root directory of a disk, in any position. Then any disk could be made bootable without sacrificing the files on it. This can actually be done, but it requires more assembly code, something the DOS developers apparently did not find any space for. We can do better.

At any rate, modern operating systems will switch the processor to protected mode, which allows us to address up to 4 GB of memory in a flat model, and switch on paging to protect processes from one another. This wasn’t part of DOS/Windows 95, but we’ll need to do it.

How a boot loader gets called

When the computer starts up, it executes a power-on self test (POST). It then performs the following actions:

  • Determine which device (drive) to use for booting, using preferences stored in the CMOS.
  • Try to load the first sector (and only the first sector) from the boot drive into memory at address 0:0x7C00.
  • Verify that the first sector is in fact bootable by checking for the presence of a magic number (see below).
  • Store the number of the drive used in register DL.
  • Point the CPU’s instruction pointer to 0:0x7C00, and start execution from there.
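The bootability check in the list above is simple enough to imitate. Here is a minimal Ruby sketch of it, working on a 512-byte string instead of a real disk. The name bootable? is hypothetical, and real BIOSes differ in how strictly they test.

```ruby
# Hypothetical helper: mimic the BIOS test for the 0xAA55 magic number,
# stored little-endian in the last two bytes of a 512-byte sector.
def bootable?(sector)
  sector.bytesize == 512 && sector.byteslice(510, 2).unpack1("v") == 0xAA55
end

blank = ("\0" * 512).b          # an all-zero sector
signed = blank.dup
signed.setbyte(510, 0x55)       # low byte of 0xAA55
signed.setbyte(511, 0xAA)       # high byte of 0xAA55

puts "blank sector bootable?  #{bootable?(blank)}"
puts "signed sector bootable? #{bootable?(signed)}"
```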

What a boot loader should do

Here’s a list of things that a bootloader should do in order to load your operating system’s kernel (we’ll cover concepts like the A20-line, IDT and GDT tables later):

  • Reset the floppy disk system
  • Write a “loading” message to the screen
  • Find the kernel in the root directory of the disk (at any position)
  • Read the kernel from disk into memory
  • Enable the A20-line
  • Setup the IDT and GDT tables
  • Switch to protected mode
  • Clear the processor prefetch queue
  • Run the kernel

Boot Sector Layout

The boot sector of a floppy disk has a very specific layout, because the BIOS requires access to certain data which it needs to find in the place it expects it to be. Also, an operating system will need to access this data to determine how large the disk is, what file system it uses, what its volume label is and so on. For this article, we’ll assume a floppy disk formatted with a FAT16 file system. The layout of the boot sector is then:

Offset  Size  Block     Contents                                        Typical value
0000    3     Code      Jump to rest of code
0003    8     BPB       OEM name                                        Great-OS
0011    2               Bytes per sector                                512
0013    1               Number of sectors per cluster                   1
0014    2               Number of reserved sectors                      1
0016    1               Number of FAT tables                            2
0017    2               Number of root directory entries (usually 224)  224
0019    2               Total number of sectors                         2880
0021    1               Media descriptor                                0xf0
0022    2               Number of sectors per FAT                       9
0024    2               Number of sectors/track                         9
0026    2               Number of heads                                 2
0028    2               Number of hidden sectors                        0
0030    2     EBPB      Number of hidden sectors (high word)            0
0032    4               Total number of sectors in filesystem
0036    1               Logical drive number                            0
0037    1               Reserved
0038    1               Extended signature                              0x29
0039    4               Serial number
0043    11              Volume label                                    MYVOLUME
0054    8               Filesystem type                                 FAT16
0062    448   Code      Boot code
0510    2     Required  Boot signature                                  0xaa55

A required element of the boot sector is the boot parameter block (BPB) and the extended boot parameter block (EBPB, for FAT16). This block must be placed at offset 3, size 59 bytes. Also, the boot sector must end with the magic number 0xaa55: some BIOSes will check whether this value is present at offset 510. If not, the BIOS will refuse to boot from the disk. All other bytes are available for us to fill in. We can calculate that this in fact adds up to 451 bytes. Also, the first three bytes are separated from the rest and should only be used to jump to the rest of the code, so that’s 3 bytes less for interesting code…

Here is a typical hex dump of a boot sector without any code. The BPB and EBPB as described above start at offset 0x0003, and the magic number occupies the last two bytes. Everything else is available for code:

0x0000 00 00 00 47 72 65 61 74 2d 4f 53 00 02 01 01 00 ...Great-OS.....
0x0010 02 e0 00 40 0b f0 09 00 09 00 02 00 00 00 00 00 .à.@............
0x0020 00 00 00 00 00 00 29 73 65 72 69 00 00 00 00 00 ......)seri.....
0x0030 00 00 00 00 00 00 46 41 54 31 36 20 20 20 fa 88 ......FAT16   ú^
0x0040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x00a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x00b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x00c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x00d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x00e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x00f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0100 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0120 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0140 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0150 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0160 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0180 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0190 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x01a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x01b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x01c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x01d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x01e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x01f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 aa ..............
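The layout table can be checked against the hex dump with a few lines of Ruby. This is only a sketch: the field names and the parse_bpb helper are my own labels, and only the first 64 bytes of the dump are needed to cover the BPB and EBPB.

```ruby
# First four rows (64 bytes) of the hex dump above.
HEAD_HEX = <<~HEX
  00 00 00 47 72 65 61 74 2d 4f 53 00 02 01 01 00
  02 e0 00 40 0b f0 09 00 09 00 02 00 00 00 00 00
  00 00 00 00 00 00 29 73 65 72 69 00 00 00 00 00
  00 00 00 00 00 00 46 41 54 31 36 20 20 20 fa 88
HEX

# Illustrative field names, in on-disk order (not an official naming).
BPB_FIELDS = %i[
  oem bytes_per_sector sectors_per_cluster reserved_sectors fat_count
  root_entries total_sectors media sectors_per_fat sectors_per_track
  heads hidden_sectors total_sectors_32 drive reserved ext_signature
  serial volume_label fs_type
].freeze

# Hypothetical helper: slice the 59 bytes at offset 3 and decode them.
def parse_bpb(sector)
  values = sector.byteslice(3, 59)
                 .unpack("a8 v C v C v v C v v v V V C C C a4 a11 a8")
  BPB_FIELDS.zip(values).to_h
end

head = [HEAD_HEX.split.join].pack("H*")   # hex text -> raw bytes
bpb = parse_bpb(head)

puts "OEM name:         #{bpb[:oem]}"
puts "Bytes per sector: #{bpb[:bytes_per_sector]}"
puts "Root entries:     #{bpb[:root_entries]}"
puts "Total sectors:    #{bpb[:total_sectors]}"
puts "Filesystem type:  #{bpb[:fs_type].strip}"
```

Multi-byte values are stored little-endian, which is why 40 0b in the dump reads back as 0x0b40.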

This article continues in part 2 of this series.

Jiskefet Lullo’s Gezinsverpakking: all the videos!

Here they are:

Linking a flat binary from C with MinGW

If you’re trying to compile a kernel written in C for your own toy operating system, you may run into trouble compiling/linking your code. Assuming you’re using GRUB to load your kernel, or you’ve rolled your own boot sector, you’ll now want to compile your kernel code (written in C) to a flat binary. The toolchain provided by MinGW (gcc and ld) is well suited for this, as long as you know a few tricks.

Let’s start with a very simple kernel.c program just to see if we can get things working:

int main(void)
{
mylabel:
  goto mylabel;
}

We’ll compile this with gcc, switching on all warnings (the compiler is our friend):

gcc -Wall -pedantic-errors kernel.c -o kernel.exe

This will yield a working program that we can actually execute at the command prompt. It’ll pause indefinitely, as desired. However, there are a number of problems with the resulting binary:

First, the binary includes a PE header, which specifies how Windows must load and execute the program. We’re writing a kernel, so we don’t want any of this header data. We must find a way to remove it.

Second, the program is relocatable. The operating system (i.e. Windows) will load the code into memory where it wants, then use the information contained in the PE header to make sure that all references are correct. The references are provided relatively, that is, they can be relocated. For our kernel, this is not what we want: we want to load our kernel at a specific address (say 0x20000) and make all references work precisely (statically) there.

This can be illustrated by running objdump:

$ objdump -f kernel.exe
kernel.exe: file format pei-i386
architecture: i386, flags 0x00000132:
EXEC_P, HAS_SYMS, HAS_LOCALS, D_PAGED
start address 0x00401160

Objdump’s output shows that a PE header is present (pei-i386 file format) and that a default start address of 0x00401160 has been defined. Let’s see what we can do about the start address. Since we want our kernel to always run at 0x20000, we can instruct the linker to use that address to place the code. Linker options can be passed to gcc:

Hint: do not compile with gcc (without linking) and then run ld separately to do the linking. Strange error messages will ensue. It’s easier to simply pass the linking options to gcc and let gcc call ld for you.

$ gcc -Wall -pedantic-errors kernel.c -o kernel.exe -Wl,-Ttext=0x20000
$ objdump -f kernel.exe
kernel.exe: file format pei-i386
architecture: i386, flags 0x00000132:
EXEC_P, HAS_SYMS, HAS_LOCALS, D_PAGED
start address 0x00020160

Oh look: our start address is now 0x00020160. The excess 0x160 bytes are the space occupied by the header, which we don’t want. We can try to pass the option --oformat binary to the linker, which will make it link a flat binary for us. Unfortunately (under MinGW), we get this:

c:/mingw/bin/../lib/gcc/mingw32/4.5.2/../../../../mingw32/bin/ld.exe:
cannot perform PE operations on non PE output file 'kernel.exe'.
collect2: ld returned 1 exit status

This can be resolved though: let the linker create the kernel.exe executable, then pass it through objcopy to create the flat binary:

objcopy -O binary -j .text kernel.exe kernel.bin

This will yield, finally, an executable. Unfortunately, it’s 3376 bytes in size! About 10 bytes would be closer to the mark. Obviously, code is being included that we didn’t write: references to standard libraries. Since we don’t have any standard libraries in our fledgling operating system, we’ll need to remove this. This can be done by passing the -nostdlib argument to gcc:

$ gcc -Wall -pedantic-errors kernel.c -o kernel.exe -nostdlib -Wl,-Ttext=0x20000
C:\Users\Alex\AppData\Local\Temp\cc5nshHf.o:kernel.c:(.text+0x7):
  undefined reference to `__main'
collect2: ld returned 1 exit status

Foiled again! Now that we have no standard libraries, ld is looking for startup code that doesn’t exist. We did write a main function, but it’s actually looking for a wrapper to that main function normally supplied by the standard libraries. Let’s try a different approach: we’ll rename our main function.

int start(void)
{
mylabel:
  goto mylabel;
}

Now our code compiles, and we’re down to a flat binary of 2011 bytes. It turns out that we must also pass -nostdlib to the linker:

$ gcc -Wall -pedantic-errors kernel.c -o kernel.exe -nostdlib \
  -Wl,-Ttext=0x20000,-nostdlib

Now we get an executable of 24 bytes. In fact, on my system I get:

00000000h: 55 89 e5 eb fe 90 90 90 ff ff ff ff 00 00 00 00
00000010h: ff ff ff ff 00 00 00 00

When disassembled, this yields:

push ebp
mov ebp, esp
jmp .-2

This corresponds exactly to the code we wrote: a stack frame is created for the start function (even though we are not interested in it – a C program must always start with a function), then an infinite loop is entered (which we wrote using a label and a goto statement).

Wait… this code only occupies 5 bytes. So why are there 24 bytes in the flat binary image? We can see that the first three unneeded bytes have a value of 0x90, which corresponds to NOP instructions. These are probably added as padding up to an 8-byte boundary. However, why an additional 16 bytes are added, I actually don’t know. If anyone can explain, I’d be grateful.

Nevertheless, we have now produced a flat binary that can be launched by our boot sector or second stage boot loader. It can be placed at 0x20000 and includes no undesired headers. Just the code, please, ma’am.
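For the skeptical, here is a small Ruby sketch that checks both claims against the 24 bytes listed above: that the image no longer starts with the “MZ” stub that precedes a PE header, and that the short jump really targets itself. Both helper names are hypothetical.

```ruby
# The 24 bytes of kernel.bin as listed above.
KERNEL_BIN = [
  0x55, 0x89, 0xe5, 0xeb, 0xfe, 0x90, 0x90, 0x90,
  0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00,
  0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00
].pack("C*")

# Hypothetical helper: true if the image begins with the "MZ" DOS stub
# that every PE executable starts with.
def dos_header?(image)
  image.byteslice(0, 2) == "MZ".b
end

# Hypothetical helper: target offset of a short jump (opcode 0xEB) at `at`.
# The displacement is a signed byte, relative to the next instruction.
def short_jump_target(image, at)
  raise ArgumentError, "not a short jmp" unless image.getbyte(at) == 0xEB
  displacement = image.byteslice(at + 1, 1).unpack1("c")
  at + 2 + displacement
end

puts "Starts with MZ stub? #{dos_header?(KERNEL_BIN)}"
puts "jmp at offset 3 targets offset #{short_jump_target(KERNEL_BIN, 3)}"
```

The displacement byte 0xfe is -2 as a signed value, so the jump lands back on its own first byte: the infinite loop we wrote.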

Installing Ruby’s mysql2 gem for MySQL 64-bits

I was trying to install Ruby’s mysql2 gem on my Windows 7 computer, running MySQL 64-bits. It turns out that the mysql2 gem is not compatible with MySQL 64-bits libmysql.dll file.

When installing, you get something like:

c:\>gem install mysql2 -- with-mysql-include=x:\include
Temporarily enhancing PATH to include DevKit...
Building native extensions.  This could take a while...
ERROR:  Error installing mysql2:
ERROR: Failed to build gem native extension.
 
C:/Ruby192/bin/ruby.exe extconf.rb with-mysql-include=x:\include
checking for rb_thread_blocking_region()... yes
checking for main() in -llibmysql... yes
checking for mysql.h... yes
checking for errmsg.h... yes
checking for mysqld_error.h... yes
creating Makefile
 
make
C:/Ruby192/bin/ruby -e "puts 'EXPORTS', 'Init_mysql2'"  > mysql2-i386-mingw32.def
gcc -I. -IC:/Ruby192/include/ruby-1.9.1/i386-mingw32 -I/C/Ruby192/include/ruby-1.9.1/ruby/backward -I/C/Ruby192/include/ruby-1.9.1 -I. -DHAVE_RB_THREAD_BLOCKING_REGION -DHAVE_MYSQL_H -DHAVE_ERRMSG_H -DHAVE_MYSQLD_ERROR_H -Ix:\include -O3 -g -Wextra -Wno-unused-parameter -Wno-parentheses -Wpointer-arith -Wwrite-strings -Wno-missing-field-initializers -Wno-long-long -Wall -funroll-loops  -o client.o -c client.c
gcc -I. -IC:/Ruby192/include/ruby-1.9.1/i386-mingw32 -I/C/Ruby192/include/ruby-1.9.1/ruby/backward -I/C/Ruby192/include/ruby-1.9.1 -I. -DHAVE_RB_THREAD_BLOCKING_REGION -DHAVE_MYSQL_H -DHAVE_ERRMSG_H -DHAVE_MYSQLD_ERROR_H -Ix:\include    -O3 -g -Wextra -Wno-unused-parameter -Wno-parentheses -Wpointer-arith -Wwrite-strings -Wno-missing-field-initializers -Wno-long-long -Wall -funroll-loops -o mysql2_ext.o -c mysql2_ext.c
gcc -I. -IC:/Ruby192/include/ruby-1.9.1/i386-mingw32 -I/C/Ruby192/include/ruby-1.9.1/ruby/backward -I/C/Ruby192/include/ruby-1.9.1 -I. -DHAVE_RB_THREAD_BLOCKING_REGION -DHAVE_MYSQL_H -DHAVE_ERRMSG_H -DHAVE_MYSQLD_ERROR_H -Ix:\include    -O3 -g -Wextra -Wno-unused-parameter -Wno-parentheses -Wpointer-arith -Wwrite-strings -Wno-missing-field-initializers -Wno-long-long -Wall -funroll-loops  -o result.o -c result.c
result.c: In function 'rb_mysql_result_fetch_fields':
result.c:376:35: warning: comparison between signed and unsigned integer expressions
gcc -shared -s -o mysql2.so client.o mysql2_ext.o result.o -L. -LC:/Ruby192/lib -L. -LC:\\Users\\Luis\\Projects\\oss\\oneclick\\rubyinstaller\\sandbox\\tk\\lib -LC:\\Users\\Luis\\Projects\\oss\\oneclick\\rubyinstaller\\sandbox\\tcl\\lib -LC :\\Users\\Luis\\Projects\\oss\\oneclick\\rubyinstaller\\sandbox\\libyaml\\lib -L
C:\\Users\\Luis\\Projects\\oss\\oneclick\\rubyinstaller\\sandbox\\pdcurses\\lib -LC:\\Users\\Luis\\Projects\\oss\\oneclick\\rubyinstaller\\sandbox\\openssl\\lib  -LC:\\Users\\Luis\\Projects\\oss\\oneclick\\rubyinstaller\\sandbox\\zlib\\lib -LC:\\Users\\Luis\\Projects\\oss\\oneclick\\rubyinstaller\\sandbox\\iconv\\lib -L
C:\\Users\\Luis\\Projects\\oss\\oneclick\\rubyinstaller\\sandbox\\gdbm\\lib -LC:\\Users\\Luis\\Projects\\oss\\oneclick\\rubyinstaller\\sandbox\\libffi\\lib  -Wl,--enable-auto-image-base,--enable-auto-import mysql2-i386-mingw32.def  -lmsvcrt -ruby191 -llibmysql  -lshell32 -lws2_32
client.o: In function `nogvl_connect':
C:\Ruby192\lib\ruby\gems\1.9.1\gems\mysql2-0.3.6\ext\mysql2/client.c:114: undefined reference to `mysql_real_connect@32'
client.o: In function `nogvl_init':
C:\Ruby192\lib\ruby\gems\1.9.1\gems\mysql2-0.3.6\ext\mysql2/client.c:105: undefined reference to `mysql_init@4'
client.o: In function `set_ssl_options':
C:\Ruby192\lib\ruby\gems\1.9.1\gems\mysql2-0.3.6\ext\mysql2/client.c:700: undefined reference to `mysql_ssl_set@24'
collect2: ld returned 1 exit status
make: *** [mysql2.so] Error 1
Gem files will remain installed in C:/Ruby192/lib/ruby/gems/1.9.1/gems/mysql2-0.3.6 for inspection.
Results logged to C:/Ruby192/lib/ruby/gems/1.9.1/gems/mysql2-0.3.6/ext/mysql2/gem_make.out

In other words, you get a bunch of undefined references to functions that should exist in libmysql.dll. Except they don’t, because you’re running MySQL 64-bits and its functions have a different signature.

The solution is this: get the installation files for MySQL 32-bits (same version as yours), and copy the libmysql.dll file to your Ruby installation’s library directory  (C:\Ruby192\lib, for instance). Now linking will succeed.

Automating website & MySQL backups

I have a web server with a number of clients’ websites on it. It’s necessary to back up these websites every day, since clients use a content management system to make changes regularly. These changes can be updates to a website’s MySQL database, or they can be changes to the files stored within these websites. What I’d like is to back up the MySQL database and the filesystem for each website, every day, at a specific time. The backups must rotate: when there are, say, five backups, I want the oldest one to be removed as the newest one is written. Also, I’d like the backup solution to send me an email every day after it has completed the backups, with a summary of the procedure.

So, in summary, my needs are these:

  • Define a list of websites to back up
  • For each site, back up (dump) the MySQL database
  • For each site, back up the website’s file structure
  • Send an email to one or more people with a summary of the backup process.

It’s possible to do this with a shell script (like AutoMySQLBackup does). However, AutoMySQLBackup does not back up file systems or send email. Also, shell scripting tends to be messy code, so I decided to use Ruby.

Configuration file

First off, I’d like to store the list of websites to backup in a separate configuration file so that I can edit this list easily. Also, for reusability, I’ll store database access credentials and email addresses there too. The simplest way of making a configuration file to be read by Ruby is to actually write the configuration file in Ruby, like so:

BACKUPDIR = "/backup/webserver"
ROTATE = 5
DBUSER = "root"
DBPASSWORD = "myrootpassword"
EMAILS = [ "alex@email.com", "john@email.com" ]
WEBSITES = {
 "sample.com" => {
   "path" => "/usr/local/www/apache22/data/sample.com",
   "database" => "sampledb"
 },
 "example.net" => {
   "path" => "/usr/local/www/apache22/data/example.net",
   "database" => "exampledb"
 }
}

This file stores a variable ROTATE which indicates the number of backups to keep before throwing away the oldest one. For each website, I specify the path to the files to be backed up, and the name of the MySQL database. The configuration file will be included and parsed automatically by the backup script, since it is plain Ruby code.
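To show what “plain Ruby code” buys us, here is a minimal, self-contained sketch. It writes a throwaway config file and loads it. The constant names SAMPLE_ROTATE and SAMPLE_WEBSITES are invented for this example so they don’t clash with the real configuration.

```ruby
require "tempfile"

# Write a throwaway config file (hypothetical contents) to a temp location.
config = Tempfile.new(["backup-config", ".rb"])
config.write(<<~RUBY)
  SAMPLE_ROTATE = 5
  SAMPLE_WEBSITES = { "sample.com" => { "database" => "sampledb" } }
RUBY
config.close

# Loading the file executes it, so its constants become visible here.
load config.path
config.unlink

puts "Keep #{SAMPLE_ROTATE} backups"
SAMPLE_WEBSITES.each do |name, site|
  puts "#{name}: database #{site["database"]}"
end
```

The trade-off is that the config file can run arbitrary code, so it should only be writable by whoever owns the backup script.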

Backup script

The backup script begins by requiring SMTP support, so that we can send emails later. It also starts an output buffer (“output”) where we will store all messages generated by the script to be included in the email. Before starting the backup procedure, we start a begin…rescue block so that we may catch any exceptions thrown by Ruby, in order to include these in the email as well.

require "net/smtp"
output = "Webserver backup script"
begin
  # Load config file:
  require "/usr/home/alex/backup-script/config.rb"
  # Does the backup directory exist?
  if not File.exist?(BACKUPDIR)
    # Raising jumps straight to the rescue clause at the end of the script.
    raise "Backup directory #{BACKUPDIR} does not exist."
  end

The script now loops through the list of websites defined in the configuration file, creating a backup directory with the name of the website for each if it doesn’t already exist:

  WEBSITES.each do |name, website|
 
    output << "\r\n\r\nBacking up #{name}:"
 
    # Establish backup dir
    path = BACKUPDIR + "/" + name
 
    # If website dir does not exist, create it.
    if not File.exist?(path)
      Dir.mkdir path
      output << "\r\n  Directory #{path} created."
    end

Next, the script enumerates the subdirectories that already exist in the website’s backup directory. This is because we will create a subdirectory named after the backup’s date and time for each backup (e.g. 20110810-105535, for 10 August 2011, 10:55:35). These directories are then sorted by modification time, so that the least recent backup of the website is first in the list.

    # Get entries inside dir with modification times (sorted first to last)
    entries = []
    Dir.entries(path).each do |entry|
      next if entry == "." or entry == ".."
      mtime = File.mtime(path + "/" + entry).to_f
      entries << [ mtime, entry ]
    end
    entries.sort! { |x,y| x[0] <=> y[0] }
    output << "\r\n  #{entries.length} backups found (max #{ROTATE-1})."

The total number of backups found is compared to the value of ROTATE. If there are too many backups, the oldest one(s) (first in the list) are removed, leaving room for the backup we are about to create.

    # Remove least recent entries if more than ROTATE available:
    while entries.length > ROTATE - 1
      entry = entries.shift
      cmd = "rm -R -f #{path}/#{entry[1]}"
      `#{cmd}`
      output << "\r\n  Removed #{path}/#{entry[1]}"
    end
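The rotation rule is easy to get off by one, so here is the same logic as a small pure function that can be tried without touching the disk. The name entries_to_remove is hypothetical, and the sample modification times are made up.

```ruby
# Hypothetical helper: given [mtime, name] pairs, return the names to delete
# so that at most keep - 1 old backups remain before a new one is written.
def entries_to_remove(entries, keep)
  sorted = entries.sort_by { |mtime, _name| mtime }   # oldest first
  excess = sorted.length - (keep - 1)
  excess > 0 ? sorted.first(excess).map { |_mtime, name| name } : []
end

# Six existing backups with made-up modification times, ROTATE = 5.
existing = [
  [30.0, "c"], [10.0, "a"], [20.0, "b"],
  [50.0, "e"], [40.0, "d"], [60.0, "f"]
]
puts entries_to_remove(existing, 5).inspect
```

With six backups present and five to keep in total, the two oldest go, so four remain and the new backup makes five.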

Having cleaned up excess backups, the script now creates a fresh folder, naming it with the current date and time:

    # Create new folder for backup:
    subdir = Time.now.strftime("%Y%m%d-%H%M%S")
    Dir.mkdir path + "/" + subdir
    output << "\r\n  Created directory #{path}/#{subdir}"
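A nice property of this naming scheme is that the directory names sort alphabetically in the same order as chronologically. The sketch below demonstrates that with three made-up timestamps. The helper backup_dir_name is hypothetical and simply wraps the strftime call used above.

```ruby
# Hypothetical helper wrapping the strftime format used by the script.
def backup_dir_name(time)
  time.strftime("%Y%m%d-%H%M%S")
end

times = [
  Time.utc(2011, 8, 10, 10, 55, 35),
  Time.utc(2011, 1, 2, 3, 4, 5),
  Time.utc(2010, 12, 31, 23, 59, 59)
]

by_name = times.map { |t| backup_dir_name(t) }.sort
by_time = times.sort.map { |t| backup_dir_name(t) }

puts by_name
puts "Same order as chronological: #{by_name == by_time}"
```

This works because every field is zero-padded and ordered from most to least significant.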

If a website has a database defined in the configuration file, the script now calls mysqldump to create a backup of the database inside the newly created backup subdirectory. The backup is gzipped as well. Note that the full path to mysqldump must be provided: cron, which we will use later to run our script at specific times, runs its jobs with a minimal PATH that may not include the directory where mysqldump lives.

    # Dump database (if required)
    if website.has_key? "database"
      # Dump db
      cmd = "/usr/local/bin/mysqldump -u#{DBUSER} -p#{DBPASSWORD} #{website["database"]} | gzip > #{path}/#{subdir}/#{name}.sql.gz"
      `#{cmd}`
      output << "\r\n  Dumped database #{website["database"]} to #{path}/#{subdir}/#{name}.sql.gz"
    end
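
Since the command is handed to a shell, a password or database name containing a space or a quote would break it. One way around that, sketched here under the assumption that the values come from the configuration file, is to escape each value with Shellwords from the standard library. The helper name dump_command and all the values below are made up.

```ruby
require "shellwords"

# Hypothetical helper: builds the dump pipeline as one shell string,
# escaping every value that gets interpolated into it.
def dump_command(mysqldump, user, password, database, target)
  [
    Shellwords.escape(mysqldump),
    "-u#{Shellwords.escape(user)}",
    "-p#{Shellwords.escape(password)}",
    Shellwords.escape(database),
    "| gzip >",
    Shellwords.escape(target)
  ].join(" ")
end

# Invented values; note the space in the password.
puts dump_command("/usr/local/bin/mysqldump", "backup", "s3cr3t pw", "blog",
                  "/backups/blog/20110810-105535/blog.sql.gz")
```

The resulting string can be run with backticks just like the original command.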

If a website has a path to files defined in the configuration file, the script now uses tar/gzip to create a tarball of the entire website file structure, recursing into subdirectories.

    # Dump code (if required)
    if website.has_key? "path"
      # Copy code
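      # "&&" makes sure tar only runs if the cd succeeded.
      # Note that "*" skips top-level dotfiles (e.g. .htaccess).
      `cd #{website["path"]} && tar -czf #{path}/#{subdir}/#{name}.tar.gz *`
      output << "\r\n  Created zipped tarball of code in #{path}/#{subdir}/#{name}.tar.gz"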
    end
  end
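
As an aside, tar can change directory by itself through its -C option, which makes it possible to skip the shell altogether. The sketch below assumes a tar binary on the PATH; tar_arguments is a made-up helper and the site contents are invented.

```ruby
require "tmpdir"

# Hypothetical helper: argument list for archiving everything inside
# source_dir (dotfiles included) into the given archive file.
def tar_arguments(source_dir, archive)
  ["tar", "-czf", archive, "-C", source_dir, "."]
end

Dir.mktmpdir do |dir|
  # A made-up website with one regular file and one dotfile.
  site = File.join(dir, "site")
  Dir.mkdir(site)
  File.write(File.join(site, "index.html"), "<p>hello</p>")
  File.write(File.join(site, ".htaccess"), "Options -Indexes")

  archive = File.join(dir, "site.tar.gz")
  # Passing separate arguments to system bypasses the shell entirely.
  ok = system(*tar_arguments(site, archive))
  puts ok ? "archive created" : "tar failed or is not installed"
end
```

Archiving "." instead of "*" picks up files such as .htaccess, at the cost of every entry in the archive being prefixed with "./".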

This completes the loop that backs up all the websites. We now close the begin block with a rescue clause in order to catch any exception raised during this process. The error message is appended to the running log (output) as well as written to standard output.

rescue StandardError => error
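  # String#+ needs a String, so use the exception's message rather than the exception itself.
  output << "\r\nError occurred: " + error.message
  puts "Error occurred: " + error.message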
end
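
One thing to watch out for when logging exceptions: String#+ only accepts another string, so the exception object has to be converted first, either with message or through interpolation. A small sketch (describe_error is a made-up name):

```ruby
# Hypothetical helper: turns an exception into a line for the log.
def describe_error(error)
  "Error occurred: #{error.message}"
end

log = +""
begin
  raise "Backup directory /nonexistent does not exist."
rescue StandardError => error
  log << "\r\n" << describe_error(error)
end
puts log

# Concatenating the exception object itself does not work:
begin
  "Error occurred: " + StandardError.new("boom")
rescue TypeError => type_error
  puts "#{type_error.class} raised instead"
end
```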

All that is left to do is to send the output off through email. This is easy to do (and one reason we’re using Ruby):

# Mail output:
Net::SMTP.start('127.0.0.1') do |smtp|
  # A blank line must separate the headers from the message body.
  message = "Subject: Webserver backup procedure\r\n\r\n" + output
  EMAILS.each do |email|
    smtp.send_message message, "alex@email.com", email
  end
end
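
Net::SMTP sends whatever string it is given, so the headers have to be part of that string, separated from the body by an empty line. Here is a sketch of a helper that assembles such a message; build_message and the example.com addresses are made up, and nothing is actually sent.

```ruby
# Hypothetical helper: assembles a minimal mail message as one string.
# Headers come first, then an empty line, then the body.
def build_message(from, to, subject, body)
  headers = ["From: #{from}", "To: #{to}", "Subject: #{subject}"]
  headers.join("\r\n") + "\r\n\r\n" + body
end

message = build_message("backup@example.com", "admin@example.com",
                        "Webserver backup procedure",
                        "Backing up blog:\r\n  2 backups found (max 2).")
puts message
```

A string built this way can be passed to smtp.send_message as its first argument.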

Adding the script to cron

We can now add the script to the system’s crontab in order to run at regular times. We’ll write a small shell script that launches the script using the bash shell, to make sure that cron has access to a powerful shell to run in:

#!/usr/local/bin/bash
/usr/local/bin/ruby /usr/home/alex/backup-script/backup.rb

The following entry is added to the system crontab (/etc/crontab). This will make sure that the script runs every day at 22:00.

# Run webserver backup script
00      22      *       *       *       root    /usr/home/alex/backup-script/backup.sh

SASS and CSScaffold

I think the concept that SASS brings to the table (or CSScaffold, for that matter) is one we’ve all had when we play with CSS and think, “Gee, it would be nice if you could use variables and constants here, and if you could duplicate less code.” And then we would think of splitting our CSS up into many little files, since they’re easier to organize by function, only to find that that wasn’t such a hot idea because a browser will have to make a new HTTP connection for each one to download it.

So here’s SASS/CSScaffold adding just those features that CSS was missing. But is it all good news? I’d say on the whole, yes, but here are a few points:

SASS requires that you compile your stylesheets every time you update them. My typical development cycle is to make a little change to the CSS (on one monitor) and hit refresh in my browser (other monitor) to see if the change did what I wanted it to do. That would have to change: now I would need to compile my CSS before I hit refresh. Not insurmountable, but it’s one more thing I can and will forget, and then I’ll think, “Hey, now why didn’t that change do anything?” only to find out after some head-scratching that I forgot to compile.

CSScaffold doesn’t seem to have this problem: since it’s written in PHP, it’ll run every time the CSS is requested from the server. I’m sure the authors have built in some sort of caching, so it should be quick enough. That actually sounds handier to me than SASS does, merely because I don’t need to compile. So the question is then, is CSScaffold just as good, better, or worse? If it’s just as good, I’ll go with it instead of SASS!

But is what SASS/CSScaffold do really that new? Like I said at the start of this post, it’s an idea all of us have thought of… and implemented! It’s always been possible to produce CSS through PHP. You can put a link to a PHP file in your page’s header, have it output a text/css header and you’re good to go. That’ll allow you to use variables, like SASS, constants, like SASS, functions and mixins, like SASS, all at zero cost since you had PHP anyway. You’ll basically only need to write the fancy gradient functions that SASS adds.

In order to add caching, you could pull your CSS through Smarty, thus prettifying the syntax a bit (it never feels quite right having PHP produce actual HTML or CSS what with the separation of code and presentation, so using Smarty gives a fuzzy warm feeling of righteousness). You could even write some spiffy new functions for Smarty, thus creating your own Sassy Smarty. So why all the hullabaloo?

Well, for one thing… SASS does more than I ever implemented with a CSS/PHP/Smarty approach, so hats off for that. But I still don’t like the compilation requirement.

Dynamic CSS through PHP

When writing CSS, you will find yourself repeating information a lot, which is always a bad thing in programming. CSS 2 lacks constants, which would allow us to define a value once and refer to it many times. Instead, we are forced to repeat the actual value many times, making updating CSS a process that is prone to errors.

Also, in order to reduce the number of connections a client must make to the server, it’s necessary to place all CSS in a single file. But this may mean that you end up with a lot of possibly unrelated CSS in a single file, making it difficult to navigate while you’re developing. There are times when it’s simply handier to have lots of small files instead of one big file, but it’s just not practical for download by your visitors.

These two problems can be resolved by loading your CSS through PHP. Instead of serving the CSS file directly, i.e.

<link rel="stylesheet" href="css/style.css" type="text/css" media="screen"/>

you can have the server load a PHP script that produces CSS like so:

<link rel="stylesheet" href="css/style.css.php" type="text/css" media="screen"/>

Note that this will only work if the script emits a text/css header:

<?php
  header("Content-type: text/css");
  ...
?>

Now your PHP script can define some constants that you simply insert into your CSS:

<?php
   header("Content-type: text/css");
   $mycolor = "#aaa";
?>
 
p {
  line-height: 1.1em;
  color: <?php echo $mycolor; ?>;
}

Your script could also load various CSS files for processing and output the result in one go, solving the second problem we found. But we can do better still. You can have your PHP script use Smarty to produce the CSS, making the use of constants easier (and prettier):

<?php 
  header("Content-type: text/css"); 
  require_once "../smarty/Smarty.class.php";
  $smarty = new Smarty();
  $smarty->template_dir = "../smarty/templates";
  $smarty->compile_dir = "../smarty/templates_c";
  $smarty->cache_dir = "../smarty/cache";
  $smarty->config_dir = "../smarty/configs";
  $smarty->compile_check = true;
  $smarty->caching = 0;
  $smarty->display("file:style.css");
?>

The file style.css would be the main stylesheet: it defines the shared values and pulls in the other (sub-)stylesheets. For instance:

{assign var="defaultfont" value="normal 11px/1.2em Arial, sans-serif"}
{assign var="thinborder" value="solid 1px #aaa"}
{assign var="inputcolor" value="#666"}
{include file="sys/global-reset.css"}
{include file="sys/base.css"}
{include file="sys/loader.css"}
{include file="control/accordion.css"}
{include file="control/ajaxtable.css"}
{include file="control/button.css"}

The values that were assigned to defaultfont, thinborder and inputcolor can be used in the sub-stylesheets like so:

input
{
  border: {$thinborder};
  color: {$inputcolor};
}

Firefox 4 does not like script.aculo.us Builder

After upgrading to FF4 I noticed that some of my JavaScript, which had been working perfectly fine, stopped working. I was able to isolate the problem to the use of the script.aculo.us Builder class to create a script element, like so:

var head = $$("head")[0];
js = Builder.node("script", { type: "text/javascript", src: path });
js.onreadystatechange = function() { if (js.readyState == 'loaded'
  || js.readyState == 'complete') js.onload(); };
js.onload = function() { console.log("loaded!"); };
head.insert(js);

However, the onload event would never be triggered. In fact, Firebug indicates that the JavaScript file I’m trying to load is never actually loaded from the server. So it’s back to basics without using script.aculo.us’s Builder:

var head = $$("head")[0];
var js = document.createElement('script');
js.type = 'text/javascript';
js.onreadystatechange = function() { if (js.readyState == 'loaded'
  || js.readyState == 'complete') js.onload(); };
js.onload = function() { console.log("loaded"); };
js.src = path;
head.appendChild(js);

And guess what: this works. The file is loaded. Now why does this happen? The new script element is in fact added to the DOM; I can see that in Firebug. But it never loads the JavaScript from the server.

Playing around with script.aculo.us’s builder.js shows that the script tag cannot be created through innerHTML but must be created through document.createElement instead. I don’t know why script.aculo.us tries the innerHTML approach first, but it does, and the element is created without error. It just never loads the JavaScript file. If I deliberately make the innerHTML approach fail, it falls back to document.createElement, which works.

This is not the whole story, though. When adding attributes to the newly created element, builder.js again tries to use innerHTML before using document.createElement. And again, skipping innerHTML to make it fall back to document.createElement works.

According to a comment in the builder.js source, the reason innerHTML is used is explained at a URL given there, but I could not access this URL at the time of this writing.