LFB Emulation and LFBemu

Introduction

Tired of not being able to play some games because they require a so-called Linear Frame Buffer (LFB)? Or want to forget about bank switching in your program that uses VESA graphics modes? This document explains how to get such an LFB working, and the LFBemu program comes with full source code that may be very helpful.

Basically, this information is for programmers. The provided program is not a magic one that enables an LFB that was disabled for some stupid reason or by mistake at the manufacturer's factory. Don't think that it's enough to just run this program and then play your games that require LFB. It doesn't work this way, although it's really possible to make such a program out of all this stuff, and that would be a desirable thing.

Download version 2.2 now!

Assumptions...

The author assumes that the reader (i.e. you) knows what VESA/VBE stands for, what a Linear Frame Buffer (LFB) is, what "bank switching" is, and that (s)he is advanced in Protected Mode programming for i386+ CPUs.

The Idea...

By means of page translation we can emulate such things as a Linear Frame Buffer on video cards which don't have such a feature. Suppose we have set up the VESA graphics mode 640x480x8bpp (#101h) and our program then runs in 32-bit Protected Mode. Let's reserve as much linear space as required to fit the entire screen (of resolution 640x480x8bpp, let's say) starting at the linear address of (say) 2MB. Let's map each page belonging to this linear space region onto the standard VGA buffer (located in the region of physical addresses 0A0000h through 0BFFFFh) so that

  1st page points to the beginning of the buffer (i.e. addr 0A0000h)
  2nd page points to the beginning + 4KB
  3rd page points to the beginning + 8KB
  4th ... + 12KB
  ...
  16th ... + 60KB
  17th page points to the beginning of the buffer (i.e. addr 0A0000h)
  18th page points to the beginning + 4KB
  19th page points to the beginning + 8KB
  20th ... + 12KB
  ...
  32nd ... + 60KB
  33rd page points to the beginning of the buffer (i.e. addr 0A0000h)
  ... and so on

Let's mark pages 1...16 (the first 64KB, i.e. sixteen 4KB pages) as present and read/write and the rest of the pages as not present. Let's see what happens if we try to access a byte in the region of 2MB...2MB+64KB. Nothing interesting really happens - we either read data from the VGA buffer or write it to the buffer and see the changes on the screen. But what if we try to access a byte at the address of 2MB+64KB or 2MB+65KB or anywhere else outside the linear space made out of present pages? Well, we will cause a Page Fault Exception. "What a pity!" you would say... "It's not gonna help us, a ridiculous exception. What does it have to do with all this stuff, huh?"... But guess what? It's exactly what we want!

If we write a Page Fault Handler routine, we can get the linear address of the byte we tried to access but failed to because of a not present page. Remember the CR2 register? Yep, we get that address from this register upon the exception. What's next? It's simple... Having this address, we can find the 64KB-size block (block 1 = pages 1...16, block 2 = pages 17...32, etc) within which the byte we've tried to access resides. We then mark all pages of this block as present, mark all pages in other blocks as not present, switch the bank (YEAH, here we can switch the bank from #1 to #2 because the byte we want to access is in bank #2) and perform the interrupt return (IRETD) from the page fault handler. The program now restarts the instruction which caused the exception and this time completes the read or write of that byte normally because the page is now present. (Basically, this is how virtual memory and swapping work - one can extend physical RAM size by means of free space on the HDD). Since we switch banks upon the page fault, we may go deeper and deeper into the video card RAM...

The nice thing about this design is that a user application doesn't know anything about all those banks, pages and exceptions.
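
The basic scheme described so far can be sketched as a small Python model. This is only an illustration of the bookkeeping, not LFBemu's code: there are no real page tables, no CR2 and no VESA calls here, and the names `LfbModel`, `physical_target` and `bank_of` are made up for this sketch. Pages and banks are counted from 0 in the code, while the text above counts from 1.

```python
PAGE_SIZE = 4096                         # i386 page size
BANK_SIZE = 64 * 1024                    # one bank as seen through the VGA window
PAGES_PER_BANK = BANK_SIZE // PAGE_SIZE  # 16 pages per 64KB block
VGA_WINDOW = 0xA0000                     # physical address of the banked window
LFB_BASE = 2 * 1024 * 1024               # linear address picked for the LFB (2MB)


def physical_target(page_index):
    """Physical address that LFB page `page_index` (0-based) is mapped onto."""
    return VGA_WINDOW + (page_index % PAGES_PER_BANK) * PAGE_SIZE


def bank_of(linear_addr):
    """0-based 64KB block that a linear address inside the LFB falls into."""
    return (linear_addr - LFB_BASE) // BANK_SIZE


class LfbModel:
    """Toy stand-in for the page tables plus the card's bank register."""

    def __init__(self, total_bytes):
        self.page_count = -(-total_bytes // PAGE_SIZE)  # ceiling division
        self.present = [False] * self.page_count
        self.current_bank = 0
        self.bank_switches = 0
        self._show_bank(0)

    def _show_bank(self, bank):
        # Mark the 16 pages of `bank` present and every other page not present.
        first = bank * PAGES_PER_BANK
        for page in range(self.page_count):
            self.present[page] = first <= page < first + PAGES_PER_BANK
        self.current_bank = bank

    def on_page_fault(self, faulting_addr):
        """What the handler does with the address it would read from CR2."""
        self._show_bank(bank_of(faulting_addr))
        self.bank_switches += 1

    def access_byte(self, linear_addr):
        """Return the physical address a byte access finally reaches."""
        page = (linear_addr - LFB_BASE) // PAGE_SIZE
        if not self.present[page]:
            # The CPU would raise a page fault here and retry after IRETD.
            self.on_page_fault(linear_addr)
        return physical_target(page) + (linear_addr - LFB_BASE) % PAGE_SIZE
```

For a 640x480x8bpp screen, `LfbModel(640 * 480)` covers 75 pages. An access at `LFB_BASE + 0x10005` then triggers one simulated fault, selects bank 1 and lands at physical address 0A0005h, the same window address that `LFB_BASE + 5` reaches while bank 0 is selected.
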
Everything is processed inside the Page Fault Handler. The user application just works with the linear space from 2MB up to whatever the limit is (depending on the screen resolution) and that's it. The user application itself doesn't have to worry about anything.

It's a Piece of Cake, isn't it? :)

Unfortunately, not. :) It turns out that everything works just fine unless we want to read/write a word or a double word whose parts reside in different blocks (i.e. one byte of the word/double word is in one block but the rest of the bytes are in the other one). So if we keep our design as is, we end up with an infinite loop. Let's see what happens...

If we try to access a word/dword crossing the block boundary, we may have 2 (or more) exceptions - one for the byte(s) in the first block and then, immediately after that exception has been handled, a second one for the byte(s) in the next block. This is because in order to complete the read/write instruction we have to have access to all of the bytes involved. So, if upon the 1st exception we mark the pages of the 1st block as present and all others as not present, we get a second exception for the byte(s) in the 2nd block. We do the same thing here, i.e. we mark the pages of the 2nd block as present and all others as not present. After we finish with the second exception, the CPU restarts the instruction which caused all these exceptions and we get a 3rd exception (because now the pages of the 1st block are not present) and then a 4th and so on, sitting in a tight loop around a single instruction.

It's all Over. :( Is it? :)

A possible approach is to temporarily use one extra page instead of that 2nd block inside the page fault handler... We don't map this page onto the VGA buffer, it's just a piece of RAM. Basically, the problem with the infinite loop arises from the fact that we can not physically have 2 or more banks selected at the same instant. A video card has only one current bank.
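
To make the loop concrete, here is a tiny Python model of the naive handler (one present block at a time). It is only an illustration: the function names are invented for this sketch, and the `max_faults` cap exists solely so the demonstration terminates, since the real CPU would simply keep faulting forever.

```python
BANK_SIZE = 64 * 1024


def banks_touched(offset, size):
    """Set of 64KB blocks covered by an access of `size` bytes at LFB `offset`."""
    return {(offset + i) // BANK_SIZE for i in range(size)}


def naive_access(offset, size, present_bank, max_faults=8):
    """Retry one instruction under the 'only one block is present' rule.

    Returns (completed, faults). The instruction completes only when every
    byte it touches lies in the single present block.
    """
    faults = 0
    while faults < max_faults:
        missing = sorted(b for b in banks_touched(offset, size)
                         if b != present_bank)
        if not missing:
            return True, faults
        # Handler: make the faulting block present, all others not present.
        present_bank = missing[0]
        faults += 1
    return False, faults
```

With block 0 present, `naive_access(0x10000, 1, 0)` completes after a single fault, while a dword at offset 0FFFEh, `naive_access(0xFFFE, 4, 0)`, flips between blocks 0 and 1 until the cap is hit and never completes.
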
This is why we always had only one 64KB-block made of present pages at any given moment. So we just map this extra page in place of that 2nd block, but we don't mark the pages of the previous block as not present. Thus, after issuing IRETD, the instruction which accesses the word/dword residing in separate blocks can complete. Of course, before mapping in that extra page we should copy the first dword of the next bank into its first dword, because the interrupted instruction has to read the correct information. Then, after this instruction completes, we have to write the changes made to the extra page back to the screen, because we want to see this information correctly as well. Also, after we finish with this damn instruction which causes a bunch of exceptions, bank switches and data transfers, we have to restore our primary state - i.e. the pages of only one of the blocks are present and the extra page is not mapped. Now this is a real problem... We have to stop right after that instruction and put everything back. But how???

Scared? :)

It's easy enough. You don't really have to disassemble that instruction on the fly to find out its length and then put something into the code segment of the program right after this instruction. That's unrealistic. What you can do is just modify the EFLAGS.TF flag of the interrupted program (EFLAGS is pushed onto the stack before control passes to the code of the page fault handler) so that it becomes 1 after IRETD. This bit (the Trap Flag) makes it possible to debug a program step-by-step, i.e. at the end of each instruction an Int 1 is generated automatically so that you can examine the state of the registers. This mode of CPU operation is called Single-Step mode. You may trace a program this way and I bet you've done this hundreds of times before. Haven't you? :) This way, after the instruction finally completes, we get the Int 1, which is passed to its handler.
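
The extra-page and Trap Flag scheme can be modeled in Python as follows. This is a rough sketch under simplifying assumptions, not LFBemu's implementation: video memory is a plain `bytearray`, only the first dword of the extra page is modeled, accesses are at most 4 bytes and may only straddle upward into the next bank, and the class and method names (`Emu`, `page_fault`, `single_step`) are invented for the illustration.

```python
BANK_SIZE = 64 * 1024
TF = 1 << 8   # the Trap Flag is bit 8 of EFLAGS


class Emu:
    def __init__(self, banks):
        self.vram = bytearray(banks * BANK_SIZE)  # what the card really holds
        self.bank = 0                  # bank whose pages are present
        self.extra = bytearray(4)      # first dword of the extra RAM page
        self.extra_mapped = False
        self.eflags = 0                # saved EFLAGS image of the program

    def page_fault(self):
        """Fault on the first bytes of the block that follows self.bank."""
        nxt = (self.bank + 1) * BANK_SIZE
        # Switch to the next bank, fetch its first dword, switch back.
        self.extra[:] = self.vram[nxt:nxt + 4]
        self.extra_mapped = True       # extra page replaces the missing page
        self.eflags |= TF              # single-step after IRETD

    def single_step(self):
        """Int 1 handler: flush the extra page and restore the normal state."""
        nxt = (self.bank + 1) * BANK_SIZE
        self.vram[nxt:nxt + 4] = self.extra   # write the dword back to video RAM
        self.extra_mapped = False
        self.bank += 1                 # next bank becomes the present block
        self.eflags &= ~TF

    def _cell(self, offset):
        """(buffer, index) that backs one byte of the emulated LFB."""
        bank, within = divmod(offset, BANK_SIZE)
        if bank == self.bank:
            return self.vram, offset   # seen through the banked window
        return self.extra, within      # only reached while the extra page is mapped

    def _prepare(self, offset, size):
        assert 1 <= size <= 4
        lo, hi = offset // BANK_SIZE, (offset + size - 1) // BANK_SIZE
        if self.bank != lo:
            self.bank = lo             # ordinary fault: plain bank switch
        if hi != lo:
            self.page_fault()          # the tail lies in the next bank

    def _finish(self):
        if self.eflags & TF:
            self.single_step()         # Int 1 fires once the instruction is done

    def write(self, offset, data):
        """One store instruction of len(data) bytes at LFB `offset`."""
        self._prepare(offset, len(data))
        for i, value in enumerate(data):
            buf, idx = self._cell(offset + i)
            buf[idx] = value
        self._finish()

    def read(self, offset, size):
        """One load instruction of `size` bytes at LFB `offset`."""
        self._prepare(offset, size)
        cells = [self._cell(offset + i) for i in range(size)]
        out = bytes(buf[idx] for buf, idx in cells)
        self._finish()
        return out
```

Writing a dword at offset 0FFFEh with `Emu(banks=2).write(0xFFFE, bytes([1, 2, 3, 4]))` puts two bytes straight into bank 0 and two into the extra page. The single-step handler then copies the whole dword back, so the bytes of bank 1 that the store did not touch keep their old values, which is why the extra page has to be pre-filled before the instruction is restarted.
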
Inside this single-step (Int 1) handler we can unmap the extra page, mark the pages of the next block as present and all others as not present, clear the EFLAGS.TF bit on the stack and issue IRETD once again, now from the single-step exception handler :). Easy? Not? Why not? The concept is very simple. Well, it's kind of a lot of work, but it really works... I've managed to complete this program in a day from scratch. You don't have to. :) Learn from my code, understand what's going on and then use this idea on your own if you want.

Page Fault Handler Actions (if the index/offset within the current bank is < 3, i.e. a dword is probably being accessed):

  1. switch to the next 64KB bank (w/o remapping)
  2. read a dword from the video buffer (at addr 0A0000h)
  3. write this dword to an extra page (at offs 0 within the page)
  4. map the extra page in place of the corresponding non-present page of the LFB (i.e. the 1st page within this next 64KB bank)
  5. switch the 64KB bank back (w/o remapping)
  6. set the single-step flag
  7. perform IRETD

What happens in between the two exceptions (page fault and single-step)

  8. the instruction reads/writes data from/to the previous 64KB bank and the extra page, thus it reads/writes everything correctly. Upon instruction completion the single-step exception occurs.

Single-Step Handler Actions

  9. read a dword from the extra page (at offs 0 within the page)
  10. write the dword to the video buffer (at addr 0A0000h)
  11. switch to the next 64KB bank (with remapping)
  12. clear the single-step flag
  13. perform IRETD

What's missing?

This is actually a good question... We can now either read or write words/dwords that straddle two blocks... But is that all we need? Funny somehow, but it's not the end yet. :) What about instructions such as MOVSB/W/D and XCHG, which have to read and write at the same time? For example, if we want to copy a sprite from one location on the screen to another on the same screen, then this becomes a problem.
We can not get in between the reading and the writing inside a single instruction. It's atomic - we can not do much about this. We could probably allow a lot more exceptions to happen and take care of this with those "extra" pages, but this becomes a lot of extra work. But don't worry. What we've done is not too bad at all. We probably can not move data around the screen w/o any extra buffers, but we now don't have any problems with separate reading and writing from/to the screen. It's a very good achievement.

An addition...

LFBemu now emulates things like MOVSx and XCHG correctly. It now uses up to 3 extra pages. 3 extra pages are enough to cover simultaneous access to 2 memory cells (words / dwords / etc) crossing bank boundaries.

About the Program

The program requires a few things...

  0. A computer. An x86 computer. :)
  1. An i386+ CPU.
  2. Real mode upon execution (i.e. no crappy Windows, EMM386.EXE and stuff like that). In other words, real-mode DOS is needed.
  3. Of course a VESA compatible video card (I guess VESA 1.2 or better).
  4. The VESA BIOS should have the VESA Protected Mode Interface - a set of 32-bit PMode functions that allows us to select a bank while still sitting in PMode. Since version 2.0 of LFBemu, it's not a requirement anymore. It would just be better to have this thing, because otherwise a v86 task will be used in order to switch banks via the real-mode VESA BIOS. And this is much slower.
  5. Probably a user. Better both a programmer and a user. :) I'd say it should be a single human being who is both :) because this program doesn't just enable LFB and then let you play existing games requiring LFB. The program is actually intended to show how to emulate LFB.

Besides that, there is no reason to run this program on a computer equipped with a VESA card that does support LFB. :) It will work, but it's ridiculous. Isn't it? :)

To run the program just type lfbemu or run it from your shell program (like Norton Commander).
The program contains everything it needs inside itself, i.e. it doesn't need to load any files or run other programs. Just a nice single program. :)

The Second Idea...

The idea is simple: put this code into an existing DOS extender (DPMI host). This would make it possible to run programs requiring LFB, because both the DPMI service functions and the emulation would be done in a single PMode program.

What's New in this Release?

  1. LFBemu 2.2 now emulates things like MOVSx and XCHG correctly. It now uses up to 3 extra pages. 3 extra pages are enough to cover simultaneous access to 2 memory cells (words/dwords/etc) crossing bank boundaries.
  2. Some minor changes made in version 2.1... The program is a bit reorganized so that the .COM file is again very short. The v86 emulation doesn't call the real-mode BIOS int 10h by means of the "int" instruction. Instead it uses a far call, which saves time in the bank switching procedure. It must be quite fast now.
  3. Nothing much... In version 2.0 IRQs are disabled while the program is in PMode so that if the BIOS enables interrupts, the program doesn't crash.
  4. Since version 2.0, a v86 task is used to pass bank switching to the real-mode BIOS. This makes it possible to emulate LFB even on cards that lack the VESA Protected Mode Interface.

Bugs and Stuff alike

  1. The program doesn't hang at the point where you see a nice picture. It just waits 10 seconds and then terminates.
  2. If the program starts making a sound and then freezes up the system, then there is probably a bug. More likely in the BIOS but who knows -- v86 handling is tough. :)

Guarantees, Warranties, Responsibilities and other crap...

You use this information and program at your own risk. The author can not guarantee anything and can not be responsible for any misuse of the information and program as well as for any harm it can do. The program and information are provided as is to the public domain. If you haven't read this -- it's your problem, not mine.
:)

Copyrights

Yep, this is about this weird symbol "(c)" :). It means that this software and information is intellectual property and it has its owner. You may use and distribute this package for educational and non-commercial purposes w/o any problems for free (well, shipping cost is up to you :). Omitting the information about the primary author and copyrights, earning money from this or anything else similar to such actions is strictly prohibited. Should you ever have questions or concerns, contact the author.

Contact Information

Author  : Alexei A. Frounze
e-mail  : alexfru@chat.ru
homepage: http://alexfru.chat.ru
mirror  : http://members.nbci.com/alexfru/
pmode   : http://welcome.to/pmode/