OS for Dummies Loader

Table of contents
1. Introduction
2. Supported file systems
3. Back to boot sectors
4. Some assumptions
5. What exactly should an OS loader do?
6. What is the OS for Dummies Loader?
7. Features
8. What does it look like?
9. The command interpreter
10. How does it work?
11. Well, let's talk about COFF images...
12. What's after a COFF image is loaded?
13. Installation guide
14. How to make correct COFF images?
15. Compiling the OS loader and example programs
16. Conclusion

Introduction

Well, this manual is about the very first part of any OS: the OS loader.

Why an OS loader? That is very easy to explain. Just keep on reading...

A computer can't load an OS from scratch, because OSes are different and the computer doesn't know how to load an arbitrary one. So it loads only one sector, the boot sector. The boot sector is the very first sector on the disk (both floppy and HDD). This boot sector then continues loading the OS it has been made for.

But a boot sector alone is not enough. Since sectors are very small (usually 512 bytes long), a boot sector cannot both load an OS and initialize the hardware/CPU (in our case I mean Protected Mode setup) before transferring control to it.

What to do? I think the best solution is to write a program that actually loads the OS and does everything else needed before transferring control to it. This program should be loaded by the boot sector. Such a program is usually called an OS loader.

That's basically the whole idea. Let's get to the details.

Supported file systems

For simplicity, I've chosen MSDOS FAT12/16. It may well be present on your computer too, although it's a somewhat outdated FS. I still have FAT16 on my HDD, and MSDOS 6.22 with Windows95 lives on it quite fine. This doesn't mean that only FAT12/16 will be supported in this project. It is just a starting point. Btw, since our project is brand new and very small, we can put the whole OS on a single floppy disk (1.44 MB), and that will be enough for the beginning.
Don't forget that FAT12 floppy disks are supported by MSDOS, Windows 9x and Windows NT. I know nothing about other OSes, but I believe they support them too. So putting all our files on a disk is very easy. Just run your favorite shell (Norton Commander, Explorer or whatever else) and copy the files.

Back to boot sectors

I need to say something about boot sectors... I've made two boot sectors (for FAT12 and FAT16) capable of loading and running standard COM and EXE programs from either a floppy disk or an HDD. So writing an OS loader with standard assemblers and compilers is also a very simple task. That's all for now about boot sectors.

Some assumptions

Since the OS loader I offer is one of the very first things in our OS project, I should make some assumptions about the executable formats it should support.

I hope all the other developers (and I) will use the free NASM assembler and the free DJGPP C/C++ compiler for our OS project.

I think using this great, powerful and free assembler and compiler is a very good choice. Everyone will be able to build and/or change our OS.

Btw, NASM also works on Linux. DJGPP itself does not, but it is the same GNU C/C++ compiler that is available for Unix/Linux. So such a multi-platform setup is also very good to use. IMHO Linux is a very good OS.

Well, to conclude this chapter, I'd like to list the executable formats which the OS loader should support:

Using standard formats simplifies our life. Btw, COFF and ELF may also include debugging information, which is very handy.

What exactly should an OS loader do?

...something like this, I think:

What is the OS for Dummies Loader?

OS for Dummies Loader is an OS loader for our OS project.
This is what I've made (I=Alexei A. Frounze ;). It's a very simple program, although its source is about 100KB long. :))

Features
I think this is almost the full set of features that an OS loader should have.

What does it look like?

When it boots up, it prints something like this to the screen:
 
-= OS FOR DUMMIES  LOADER v1.1 =-
  by Alexei A. Frounze (c) 2000

MSDOS logical disks total: 4
Conventional RAM installed: 640 KB
Extended     RAM installed: 64512 KB

-= Command interpreter =-

A:\>_

"MSDOS logical disks total" stands for the number of MSDOS FAT12/16 disks installed in the system. This number also includes floppy disks. For example, if the system has 1 floppy disk (named "A:" in MSDOS) and 3 logical disks on an HDD (named "C:", "D:" and "E:" respectively), this number is 4.
Note: the OS loader supports either one or two HDDs.
Note #2: the OS loader doesn't put all primary MSDOS partitions together at the top of the disk list. For example, suppose you have 2 HDDs with the following layouts:
 
HDD1
Logical disk name   Partition type
C:                  primary
D:                  extended
E:                  extended

HDD2
Logical disk name   Partition type
C:                  primary
D:                  extended

If you attach both disks to the computer, MSDOS will assign names for logical disks as follows:
 
Logical name   Old logical name   HDD number   Partition type
C:             C:                 1            primary
D:             C:                 2            primary
E:             D:                 1            extended
F:             E:                 1            extended
G:             D:                 2            extended

Such a stupid scheme was invented for compatibility reasons when large HDDs (bigger than 32MB) appeared. This is history ;-)

But how does the OS loader assign names to logical disks? It assigns them in a clear and convenient way:
 
Logical name   Old logical name   HDD number   Partition type
C:             C:                 1            primary
D:             D:                 1            extended
E:             E:                 1            extended
F:             C:                 2            primary
G:             D:                 2            extended

So the first 3 disks correspond to HDD1 and the last 2 disks correspond to HDD2.
Do you like it?
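
To make the difference between the two naming schemes concrete, here is a minimal C sketch. It is not taken from the loader's source: the struct part record and the function names are made up for illustration, and the input list is assumed to be already scanned disk by disk (HDD1 first, then HDD2).

```c
#include <stddef.h>

/* One FAT partition found while scanning the HDDs (hypothetical record). */
struct part {
    int  hdd;         /* 1 or 2 */
    int  primary;     /* 1 = primary, 0 = logical disk in an extended partition */
    char old_letter;  /* letter the disk had when its HDD was attached alone */
};

/* MSDOS-style order: all primary partitions first, then everything else. */
void msdos_order(const struct part *in, size_t n, struct part *out)
{
    size_t i, k = 0;
    for (i = 0; i < n; i++)
        if (in[i].primary)
            out[k++] = in[i];
    for (i = 0; i < n; i++)
        if (!in[i].primary)
            out[k++] = in[i];
}

/* Loader-style order: keep the scan order, so each HDD's disks stay together. */
void loader_order(const struct part *in, size_t n, struct part *out)
{
    size_t i;
    for (i = 0; i < n; i++)
        out[i] = in[i];
}

/* Letter for the disk at position i in either list; C: is the first hard disk. */
char letter_for(size_t i)
{
    return (char)('C' + i);
}
```

With the five partitions from the tables above as input, msdos_order() puts the primary partition of HDD2 in second place (D:), while loader_order() leaves it in fourth place (F:).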

"Conventional RAM installed" stands for the size of RAM below the 1MB mark. Nowadays it may vary from 639KB to 640KB because of the Extended BIOS Data Area. Usually this area is 1KB long and is located just below the video adapter buffer (which starts at address 0A0000h).

"Extended RAM installed" stands for the size of RAM above the 1MB mark. If your system has 64MB of RAM in total, this field should be (64-1)MB = 63*1024KB = 64512KB.
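
The arithmetic behind these two numbers can be written down in a few lines of C. This is only a sketch of the calculation described above, not the way the loader actually queries the BIOS; the function names are hypothetical.

```c
/* The video adapter buffer starts at 0A0000h, i.e. at the 640KB mark. */
#define VIDEO_BUFFER_START 0xA0000UL

/* Conventional RAM in KB, given the size of the Extended BIOS Data Area. */
unsigned long conventional_kb(unsigned long ebda_kb)
{
    return VIDEO_BUFFER_START / 1024UL - ebda_kb;
}

/* Extended RAM in KB: everything above the 1MB mark. Assumes total_mb >= 1. */
unsigned long extended_kb(unsigned long total_mb)
{
    return (total_mb - 1UL) * 1024UL;
}
```

For example, conventional_kb(1) gives 639 and extended_kb(64) gives 64512, matching the screen shown above.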

The command interpreter

As I mentioned above, the OS loader has a built-in command interpreter. Now it is time to list the supported commands. The command interpreter lists the valid commands in response to either the ? or the HELP command:
 
A:\>help
Valid commands are:

CLS             Clears the screen
DIR [path]      Displays a list of files and subdirectories
CD <path>       Changes current directory
CD\             Returns to the root directory
CD..            Returns to the parent directory
TYPE <filename> Displays the contents of a text file
DATE            Displays current date
TIME            Displays current time
HELP            Shows the list of valid commands
?               Shows the list of valid commands
EXIT            Exits the command interpreter with the following rebooting

Note #1:        You may change current disks typing 'DiskLetter:'
Note #2:        You may run standard COM/EXE programs not using DOS services

Contact info:   Alexei A. Frounze <alexfru@chat.ru>

A:\>_

Note: the DIR and CD commands work either with short path names (the name of a subdirectory) or with absolute path names like C:\TASM\BIN. The same goes for the TYPE command. It expects either a short file name (the name of a file in the current directory) or an absolute file name like C:\TASM\DOC\TLINK.TXT.

Note #2: when you want to run a program under the OS loader (its type doesn't matter; it may be .COM, .EXE or .O), you should also type either a short file name or an absolute file name as shown above. But you can omit the file name extension, i.e. you can simply type KERNEL instead of KERNEL.O.
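
How the loader resolves a name typed without an extension is not described here, so the following C sketch only illustrates one possible approach. The helper names are hypothetical, and which extensions are tried, and in which order, is left to the caller.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if the last path component contains a dot (an extension). */
int has_extension(const char *path)
{
    const char *name = strrchr(path, '\\');
    name = name ? name + 1 : path;
    return strchr(name, '.') != NULL;
}

/* Writes path plus ext (e.g. ".O") into buf; returns 0 if it does not fit. */
int with_extension(char *buf, size_t size, const char *path, const char *ext)
{
    int n = snprintf(buf, size, "%s%s", path, ext);
    return n >= 0 && (size_t)n < size;
}
```

A command interpreter could call has_extension() on the typed name and, if it returns 0, probe the disk for each candidate built by with_extension().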

How does it work?

As I said before, the OS loader is capable of loading and running .COM and .EXE programs. Yeah, that's true. You really can run them under the loader.

But how do you quit from such a program back to the loader, since MSDOS is not present and services such as MSDOS Int 20h and function 4Ch of Int 21h are unavailable? Just keep on using these two ways of quitting. The OS loader sets up all the interrupt vectors in the range 20h through 33h so that they return control to the OS loader. So it doesn't matter which one you use. Your program will come back to the loader anyway. Just don't forget that an interrupt in the range 20h through 33h does not do what it would do in MSDOS; it just terminates the program.
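
The idea of pointing a whole range of vectors at one handler can be sketched in C as follows. This is an illustration only, not the loader's code. A real-mode interrupt vector is a 16-bit offset followed by a 16-bit segment; here the table is passed in as an ordinary array rather than addressed at 0000:0000, and the handler address is whatever the caller supplies.

```c
#include <stdint.h>

/* A real-mode interrupt vector: 16-bit offset followed by 16-bit segment. */
struct ivt_entry {
    uint16_t offset;
    uint16_t segment;
};

/* Point vectors first..last (inclusive) at the same handler seg:off. */
void hook_range(struct ivt_entry *ivt, int first, int last,
                uint16_t seg, uint16_t off)
{
    int v;
    for (v = first; v <= last; v++) {
        ivt[v].offset  = off;
        ivt[v].segment = seg;
    }
}
```

Calling hook_range(ivt, 0x20, 0x33, seg, off) on a 256-entry table covers exactly the range described above and leaves vectors 1Fh and 34h untouched.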

The OS loader can also load a COFF image (filename extension .O), which is the standard output format of the DJGPP C/C++ compiler for DOS. The loader then sets up Protected Mode and transfers control to the program (I hope I don't need to explain here what Protected Mode is). Since DJGPP is a 32-bit protected mode oriented C/C++ compiler for i386+ CPUs, making an OS kernel with it and NASM is very simple (I've included some example programs in the OS for Dummies Loader package).

Well, let's talk about COFF images and related stuff in more detail...

First of all, I want to point out that not every COFF image can be loaded by the OS loader. They must:

The 1st rule means that all the variables and functions used by the program must lie in a single .O file. It also means that all the static pointers in the image must be initialized and relocated. This is just the same as an .OBJ file produced by TASM/MASM being linked into a .COM/.EXE program.

The 2nd rule means that the base address the program is relocated to must not be arbitrary. It must be either 0 or something above the 1MB+64KB mark.
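
As a sketch, the 2nd rule can be expressed as a small C check. The names below are made up for illustration, and the check treats a base of exactly 1MB+64KB (0x00110000, the value used in the second linker script in this manual) as acceptable, which is an assumption about how the loader reads "above".

```c
#include <stdint.h>

#define LOAD_MARK 0x00110000UL  /* 1MB + 64KB */

enum image_kind {
    IMAGE_RELOCATABLE,  /* .text base is 0: may be loaded anywhere */
    IMAGE_FIXED,        /* .text base is at or above the mark */
    IMAGE_REJECTED      /* any other base address breaks the rule */
};

/* Classify a COFF image by the base address of its .text section. */
enum image_kind classify_text_base(uint32_t text_base)
{
    if (text_base == 0)
        return IMAGE_RELOCATABLE;
    if (text_base >= LOAD_MARK)
        return IMAGE_FIXED;
    return IMAGE_REJECTED;
}
```

So a base of 0 or 0x00110000 passes, while a base such as 0x00100000 (exactly 1MB) does not.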

Let me explain this a bit better...

If the base address of the .text section is 0 (according to the file header), such a COFF image can be loaded at any physical address in RAM and work there in its own code and data segments. You just allocate two descriptors for the code and data and set their base addresses to the physical address where the image is loaded. But this assumes that the program lives in its own isolated/locked address space. I mean, its addresses are not physical anymore. Btw, the limits of the code and data segments may be the same as the sizes of the corresponding sections (.text and .data+.bss+stack).
Btw, with page translation enabled things may be improved: some RAM areas may be addressed using the original address space, and protection could be done a lot better.
So, such images have their pros and cons.
Good: may be loaded at any physical address and does not interfere with other programs thanks to segment descriptors.
Bad: addressing is not physical anymore.
By default these images are loaded at the 1MB+64KB mark. I'll tell you why a bit later.

If the base address of the .text section is 1MB+64KB or greater, such a COFF image is fixed. It cannot be loaded at just any RAM address where some free space is available. But this is an advantage too. You may set up 2 descriptors for the code and data segments so that they start at physical address 0 (base=0) and have limits of 4GB. Sounds like a true flat memory model? Yeah, it is a true flat memory model. When your program is designed this way, it can address any byte of RAM using physical addresses only and nothing else.
Again, such images have their pros and cons.
Good: true flat memory model with a physical addressing scheme.
Bad: cannot be loaded into an arbitrary region of RAM.

So you may choose any one of these designs. It's up to you, OS for Dummies developer!
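
To show what the two designs mean at the descriptor level, here is a minimal C sketch that packs an i386 segment descriptor into its 8-byte form. The byte layout is the standard one for the CPU, but the function and constant names are made up here, and the access/flag values are common choices, not necessarily the ones the loader uses.

```c
#include <stdint.h>

#define ACCESS_CODE      0x9A  /* present, ring 0, code, readable */
#define ACCESS_DATA      0x92  /* present, ring 0, data, writable */
#define FLAGS_32BIT_4K   0x0C  /* 32-bit segment, limit counted in 4KB pages */
#define FLAGS_32BIT_BYTE 0x04  /* 32-bit segment, limit counted in bytes */

/* Pack base, 20-bit limit, access byte and flag nibble into a descriptor. */
void make_descriptor(uint8_t d[8], uint32_t base, uint32_t limit,
                     uint8_t access, uint8_t flags)
{
    d[0] = (uint8_t)(limit & 0xFF);          /* limit bits 0..7   */
    d[1] = (uint8_t)((limit >> 8) & 0xFF);   /* limit bits 8..15  */
    d[2] = (uint8_t)(base & 0xFF);           /* base bits 0..7    */
    d[3] = (uint8_t)((base >> 8) & 0xFF);    /* base bits 8..15   */
    d[4] = (uint8_t)((base >> 16) & 0xFF);   /* base bits 16..23  */
    d[5] = access;
    d[6] = (uint8_t)(((flags & 0x0F) << 4) | ((limit >> 16) & 0x0F));
    d[7] = (uint8_t)((base >> 24) & 0xFF);   /* base bits 24..31  */
}
```

For the flat design you would call make_descriptor(d, 0, 0xFFFFF, ACCESS_CODE, FLAGS_32BIT_4K), which gives base 0 and a 4GB limit. For the relocatable design you would pass the physical load address as the base (for example 0x00110000) and the section size minus one as a byte-granular limit.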

Well, I just need to cover some details and I'm done.

Why am I talking about this strange 1MB+64KB mark?
The very first megabyte of RAM is a bit special:

What does the OS loader do after a COFF image is loaded?

It simply sets up GDT entries. 2 descriptors are set up for the OS loader so that the program can return to the loader. 1 descriptor is set up as a 4GB segment, and 2 descriptors are set up for the code and data of the program from the COFF image.

It also sets up the IDT so that by default any exception terminates the program and returns control to the OS loader. This is also very good: the computer does not triple fault. I'll tell you a secret. The OS loader can be loaded under MSDOS if no memory managers (QEMM, EMM386.EXE, HIMEM.SYS, etc.) are installed. It will load and run programs from COFF images perfectly under MSDOS, since they are loaded above the 1st megabyte of RAM. So you may simply fix the bug that caused the exception and run the OS loader again with the fixed version of the COFF image.
Notice: you can't load and run .COM and .EXE programs from the OS loader if it is loaded under MSDOS.

When the setup of the GDT, IDT and PMode is done, the OS loader pushes some 32-bit parameters onto the stack as follows:

These parameters may be easily read from the stack and used by an OS kernel. This is also shown in the example programs accompanying the OS for Dummies Loader.

Btw, by default the program's stack is limited to 64KB.

Installation guide

In order to install the OS for Dummies Loader, you should prepare a FAT12/16 disk (floppy or HDD) with the BootProg boot sector installed (an installation guide for BootProg is available in the BootProg package). Then simply copy the STARTUP.BIN file from the OS for Dummies Loader package to the disk. That's all. You are ready to reboot your computer.

How to make correct COFF images supported by the OS loader?

Although all the needed files and info are included in the package, I'll explain this here.

The program entry point must be either a main() function in a C source for DJGPP or a _main subroutine in an ASM source for NASM. This is needed because the linker scripts used with the ld linker here declare _main as the entry point (DJGPP puts an underscore in front of C names).

ASM sources are assembled with the following command:

NASM myfile.asm -f coff

By default NASM names the output file myfile.o, but this may be overridden with the -o command line option.

C sources are compiled as follows:

GCC -c myfile.c

By default DJGPP names the output file myfile.o, but this may be overridden with the -o command line option.

Note: don't write command line options in a different case than shown. File names given in upper case can also cause problems. That's because GCC is case-sensitive. For example, if you type MYFILE.C, it will think that your program is C++ rather than C.

Okay, if you compiled the sources successfully, you should then relocate and link them. Relocation / linkage is done by the ld linker (included in DJGPP).

If you have only one source file (let it be an abc.asm file) and hence only one object file (abc.o), you relocate the abc.o file as follows:

ld -o abcl.o -Tlink.scr abc.o

abcl.o is an output (relocated / linked) file name
abc.o is an input object file name
link.scr is a script file for the ld linker

You may run the abcl.o program under OS for Dummies Loader.

But what is the script file there? Well, if you're an attentive reader, you may remember that the base address of the .text section may vary. This script file contains information about base addresses of the sections and some other information as well. I'll show examples of such a script file a bit later.

Okay, so what if you have two source files in your program? That is also very simple to get working. Let's imagine that the 1st source is an ASM file named abca.asm and the 2nd one is a C source named abcc.c. Assuming that these sources are already compiled to the abca.o and abcc.o object files respectively, the link command will now be the following:

ld -o abcl.o -Tlink.scr abca.o abcc.o

abcl.o is an output (relocated / linked) file name
abca.o is a 1st input object file name
abcc.o is a 2nd input object file name
link.scr is a script file for the ld linker

And again you may run the abcl.o program under OS for Dummies Loader.

As I promised, I'm showing you examples of script files needed for ld.
 
 
Script file for setting .text section base address to 0:
/* Adapted from /djgpp/lib/djgpp.djl */
OUTPUT_FORMAT("coff-go32")
ENTRY(_main)
SECTIONS
{
    .text 0x00000000 :
    {
        *(.text)
        . = ALIGN(4096);
        etext = .; _etext = .;
    }
    .data :
    {
        *(.data)
        . = ALIGN(4096);
        edata = .; _edata = .;
    }
    .bss :
    {
        *(.bss)
        *(COMMON)
        . = ALIGN(4096);
        end = .; _end = .;
    }
}

 
Script file for setting .text section base address to 1MB+64KB:
/* Adapted from /djgpp/lib/djgpp.djl */
OUTPUT_FORMAT("coff-go32")
ENTRY(_main)
SECTIONS
{
    .text 0x00110000 :
    {
        *(.text)
        . = ALIGN(4096);
        etext = .; _etext = .;
    }
    .data :
    {
        *(.data)
        . = ALIGN(4096);
        edata = .; _edata = .;
    }
    .bss :
    {
        *(.bss)
        *(COMMON)
        . = ALIGN(4096);
        end = .; _end = .;
    }
}

As you can see, only one value is different and that's enough for us.

What is needed to compile the OS loader and example programs

Where do I get these compilers and assemblers from?
All the needed makefiles and short instructions are included in the package. I think you won't have any problems with them.

Conclusion

It's now your time to develop an OS, developer! Have fun!

I hope I did my part of the work well enough, and I also hope I provided enough information about all this stuff.

I would like to thank everyone who made BP, BC, TASM, MASM, NASM and DJGPP, all the people who helped me, and you for such a cool idea:

OS for Dummies RULEZZ!!!

I wish you luck with it.


 
 
Contact Information
Author Alexei A. Frounze
E-mail alexfru@chat.ru
Homepage http://alexfru.chat.ru
Mirror http://members.xoom.com/alexfru
 

by Alexei A. Frounze (c) 2000