December 25

The basic idea of RAID (Redundant Array of Independent Disks) is to combine multiple inexpensive disk drives into an array of disk drives to obtain performance, capacity and reliability that exceed those of a single large drive. The array of drives appears to the host computer as a single logical drive.

The Mean Time Between Failure (MTBF) of the array is equal to the MTBF of an individual drive, divided by the number of drives in the array. Because of this, the MTBF of a non-redundant array (RAID 0) is too low for mission-critical systems. However, disk arrays can be made fault tolerant by redundantly storing information in various ways.

Five types of array architectures, RAID 1 through RAID 5, were originally defined; each provides disk fault tolerance with different compromises in features and performance. In addition to these five redundant array architectures, it has become popular to refer to a non-redundant array of disk drives as a RAID 0 array.

Disk Striping

Fundamental to RAID technology is striping. This is a method of combining multiple drives into one logical storage unit. Striping partitions the storage space of each drive into stripes, which can be as small as one sector (512 bytes) or as large as several megabytes. These stripes are then interleaved in a rotating sequence, so that the combined space is composed alternately of stripes from each drive. The specific type of operating environment determines whether large or small stripes should be used.

Most operating systems today support concurrent disk I/O operations across multiple drives. However, in order to maximize throughput for the disk subsystem, the I/O load must be balanced across all the drives so that each drive can be kept busy as much as possible. In a multiple drive system without striping, the disk I/O load is never perfectly balanced. Some drives will contain data files that are frequently accessed and some drives will rarely be accessed.
By striping the drives in the array with stripes large enough that each record falls entirely within one stripe, most records can be evenly distributed across all drives. This keeps all drives in the array busy during heavy load situations. It allows all drives to work concurrently on different I/O operations, and thus maximizes the number of simultaneous I/O operations that can be performed by the array.

Definition of RAID Levels

RAID 0 is typically defined as a group of striped disk drives without parity or data redundancy. RAID 0 arrays can be configured with large stripes for multi-user environments or small stripes for single-user systems that access long sequential records. RAID 0 arrays deliver the best data storage efficiency and performance of any array type. The disadvantage is that if one drive in a RAID 0 array fails, the entire array fails.

RAID 1, also known as disk mirroring, is simply a pair of disk drives that store duplicate data but appear to the computer as a single drive. Although striping is not used within a single mirrored drive pair, multiple RAID 1 arrays can be striped together to create a single large array consisting of pairs of mirrored drives. All writes must go to both drives of a mirrored pair so that the information on the drives is kept identical. However, each individual drive can perform simultaneous, independent read operations. Mirroring thus doubles the read performance of a single non-mirrored drive, while the write performance is unchanged. RAID 1 delivers the best performance of any redundant array type. In addition, there is less performance degradation during drive failure than in RAID 5 arrays.

RAID 2 arrays sector-stripe data across groups of drives, with some drives assigned to store ECC information. Because all disk drives today embed ECC information within each sector, RAID 2 offers no significant advantages over other RAID architectures and is not supported by Adaptec RAID controllers.

RAID 3, as with RAID 2, sector-stripes data across groups of drives, but one drive in the group is dedicated to storing parity information. RAID 3 relies on the embedded ECC in each sector for error detection. In the case of drive failure, data recovery is accomplished by calculating the exclusive OR (XOR) of the information recorded on the remaining drives. Records typically span all drives, which optimizes the disk transfer rate. Because each I/O request accesses every drive in the array, RAID 3 arrays can satisfy only one I/O request at a time. RAID 3 delivers the best performance for single-user, single-tasking environments with long records. Synchronized-spindle drives are required for RAID 3 arrays in order to avoid performance degradation with short records. Because RAID 5 arrays with small stripes can yield similar performance to RAID 3 arrays, RAID 3 is not supported by Adaptec RAID controllers.

RAID 4 is identical to RAID 3 except that large stripes are used, so that records can be read from any individual drive in the array (except the parity drive). This allows read operations to be overlapped. However, since all write operations must update the parity drive, they cannot be overlapped. This architecture offers no significant advantages over other RAID levels and is not supported by Adaptec RAID controllers.

RAID 5, sometimes called a Rotating Parity Array, avoids the write bottleneck caused by the single dedicated parity drive of RAID 4. Under RAID 5, parity information is distributed across all the drives. Since there is no dedicated parity drive, all drives contain data and read operations can be overlapped on every drive in the array. Write operations will typically access one data drive and one parity drive. However, because different records store their parity on different drives, write operations can usually be overlapped.

In summary:

• RAID 0 is the fastest and most efficient array type but offers no fault tolerance. RAID 0 requires a minimum of two drives.
• RAID 1 is the best choice for performance-critical, fault-tolerant environments. RAID 1 is the only choice for fault tolerance if no more than two drives are used.
• RAID 2 is seldom used today since ECC is embedded in all hard drives. RAID 2 is not supported by Adaptec RAID controllers.
• RAID 3 can be used to speed up data transfer and provide fault tolerance in single-user environments that access long sequential records. However, RAID 3 does not allow overlapping of multiple I/O operations and requires synchronized-spindle drives to avoid performance degradation with short records. Because RAID 5 with a small stripe size offers similar performance, RAID 3 is not supported by Adaptec RAID controllers.
• RAID 4 offers no advantages over RAID 5 and does not support multiple simultaneous write operations. RAID 4 is not supported by Adaptec RAID controllers.
• RAID 5 combines efficient, fault-tolerant data storage with good performance characteristics. However, write performance and performance during drive failure are slower than with RAID 1. Rebuild operations also require more time than with RAID 1 because parity information is also reconstructed. At least three drives are required for RAID 5 arrays.

Dual-Level RAID

In addition to the standard RAID levels, Adaptec RAID controllers can combine multiple hardware RAID arrays into a single array group or parity group. In a dual-level RAID configuration, the controller firmware stripes two or more hardware arrays into a single array.

NOTE: The arrays being combined must all use the same RAID level.

Dual-level RAID achieves a balance between the increased data availability inherent in RAID 1 and RAID 5 and the increased read performance inherent in disk striping (RAID 0). These arrays are sometimes referred to as RAID 0+1 or RAID 10, and RAID 0+5 or RAID 50.

Creating Data Redundancy

RAID 5 offers improved storage efficiency over RAID 1 because only the parity information is stored, rather than a complete redundant copy of all data. The result is that three or more drives can be combined into a RAID 5 array, with the storage capacity of only one drive dedicated to storing the parity information. Therefore, RAID 5 arrays provide greater storage efficiency than RAID 1 arrays. However, this efficiency must be balanced against a corresponding loss in performance.

The parity data for each stripe of a RAID 5 array is the XOR of all the data in that stripe, across all the drives in the array. When the data in a stripe is changed, the parity information is also updated. There are two ways to accomplish this.

The first method is based on accessing all of the data in the modified stripe and regenerating parity from that data. For a write that changes all the data in a stripe, parity can be generated without having to read from the disk, because the data for the entire stripe will be in the cache. This is known as a full-stripe write. If only some of the data in a stripe is to change, the missing data (the data the host does not write) must be read from the disks to create the new parity. This is known as a partial-stripe write. The efficiency of this method for a particular write operation depends on the number of drives in the RAID 5 array and what portion of the complete stripe is written.

The second method of updating parity is to determine which data bits were changed by the write operation and then change only the corresponding parity bits. This is done by first reading the old data which is to be overwritten. This data is then XORed with the new data that is to be written. The result is a bit mask which has a 1 in the position of every bit which has changed. This bit mask is then XORed with the old parity information from the array. This results in the corresponding bits being changed in the parity information.
The updated parity is then written back to the array. In total this takes two reads, two writes and two XOR operations. This is known as read-modify-write.
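The read-modify-write update described above can be checked against a full parity regeneration. The following is a minimal Python sketch, not controller firmware: the helper names and the two-byte example blocks are made up for illustration.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_stripe_parity(blocks):
    """First method: regenerate parity from every data block in the stripe."""
    return reduce(xor_blocks, blocks)


def read_modify_write(old_data, new_data, old_parity):
    """Second method: update parity from the old data and old parity only."""
    change_mask = xor_blocks(old_data, new_data)  # 1 bits mark changed bits
    return xor_blocks(change_mask, old_parity)


# Illustrative stripe of three data blocks (values are arbitrary).
stripe = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]
parity = full_stripe_parity(stripe)

# Overwrite the second block using read-modify-write.
new_block = b"\xff\x00"
rmw_parity = read_modify_write(stripe[1], new_block, parity)
stripe[1] = new_block
```

Both methods end up with the same parity block; they differ only in which blocks have to be read from disk first.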
The cost of storing parity, rather than redundant data as in RAID 1, is the extra time required for the write operations to regenerate the parity information. This additional time results in slower write performance for RAID 5 arrays compared with RAID 1. Because Adaptec RAID controllers generate XOR in hardware, the negative effect of parity generation is primarily from the additional disk I/O required to read the missing information and write the new parity. Adaptec RAID controllers can generate parity using either the full- or partial-stripe write algorithm or the read-modify-write algorithm. The parity update method chosen for any given write operation is determined by calculating the number of I/O operations needed for each type and choosing the one with the smallest result. To increase the number of full-stripe writes, the cache is used to combine small write operations into larger blocks of data.
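As a rough illustration of that selection, the sketch below counts disk operations for one stripe under simple assumptions: one I/O per block, no cache hits, and ties going to the stripe write. The function names and the tie rule are assumptions for this example; the text does not spell out the controller's actual accounting.

```python
def parity_update_io(data_drives: int, blocks_written: int):
    """Return (stripe_write_io, read_modify_write_io) for one stripe.

    data_drives    -- data blocks per stripe (parity block not counted)
    blocks_written -- data blocks the host is overwriting in this stripe
    """
    n, k = data_drives, blocks_written
    # Full/partial-stripe write: read the unwritten data blocks,
    # then write the new data blocks plus the parity block.
    stripe_write = (n - k) + (k + 1)
    # Read-modify-write: read old data and old parity,
    # then write new data and new parity.
    rmw = (k + 1) + (k + 1)
    return stripe_write, rmw


def choose_method(data_drives: int, blocks_written: int):
    """Pick the method with fewer I/Os (assumed tie-break: stripe write)."""
    stripe_write, rmw = parity_update_io(data_drives, blocks_written)
    if rmw < stripe_write:
        return "read-modify-write", rmw
    return "stripe write", stripe_write
```

With four data drives, a single-block write costs four I/Os by read-modify-write (two reads, two writes) against five for a partial-stripe write, while a write covering the whole stripe needs no reads at all.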
Handling I/O Errors

Adaptec RAID controllers maintain two lists for each RAID 5 array: a Bad Parity List and a Bad Data List. These lists contain the physical block number of any parity or data block that could not be successfully written during normal write, rebuild or dynamic array expansion operations. These lists alert the controller that the data or parity in these blocks is not valid. If the controller subsequently needs data from a listed block and cannot recreate the data from existing redundant data, it returns an error condition to the host. Blocks are removed from the Bad Parity List or the Bad Data List if the controller successfully writes to them on a subsequent attempt.

Degraded Mode

A RAID array operating with a failed drive is said to be in degraded mode. When a drive fails in a RAID 0 array, the entire array fails. In a RAID 1 array, a failed drive reduces read performance by 50%, as data can only be read from the remaining drive. Write performance is increased slightly because only one drive is accessed.

RAID 5 arrays synthesize the requested data by reading and XORing the corresponding data stripes from the remaining drives in the array. For RAID 5, the magnitude of the performance impact in degraded mode depends on the number of drives in the array. An array with a large number of drives will experience more performance degradation than an array with a small number of drives.
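The degraded-mode read can be sketched in a few lines of Python. This is an illustration only: the block contents are arbitrary, and `reconstruct` is a made-up helper name.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def reconstruct(surviving_blocks):
    """Rebuild the missing block of a stripe from all surviving blocks.

    The surviving blocks are the remaining data blocks plus the parity
    block; XORing them together yields the block that was lost.
    """
    return reduce(xor_blocks, surviving_blocks)


# A stripe with three data blocks and its parity block.
d0, d1, d2 = b"\x12\x34", b"\xab\xcd", b"\x0f\xf0"
parity = reduce(xor_blocks, [d0, d1, d2])

# The drive holding d1 has failed: read everything else and XOR it.
recovered = reconstruct([d0, d2, parity])
```

Every surviving drive has to be read to serve one block from the failed drive, which is why the degradation grows with the number of drives in the array.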
Rebuilding a Failed Hard Drive

A failed drive in a RAID 1 or RAID 5 array can be replaced either by physically removing the drive and installing a new one, or by a designated Hot Spare. Adaptec RAID controllers will rebuild the data for the failed drive onto the new drive or Hot Spare. This rebuild operation occurs online while normal host reads and writes are being processed by the array.

RAID 1 arrays are rebuilt relatively quickly, because the data is simply copied from the duplicate (mirrored) drive to the replacement drive. For RAID 5 arrays, the data for the replacement drive must be synthesized by reading and XORing the corresponding stripes from the remaining drives in the array. RAID 5 arrays that contain a large number of drives will require more time for a rebuild than a small array.

December 24

Threads in Visual Basic are not commonly used because of their complexity. In a 32-bit Windows environment (Windows 95/98/NT) it has become a necessity to use more than one thread for each process. But is it true that creating threads in Visual Basic is tough? Not really! Visual Basic has its own way of simplifying 32-bit Windows API calls.

Think about a long backup procedure which must be interruptible in case the user wants to suspend or stop it. If you have not implemented a separate thread for the copy, you will not be able to stop the process unless you kill it forcefully or restart your machine (Windows 95 non-OEM versions might show endless blue screens). This is not a good idea for a professional-grade application.

Therefore let us talk about creating threads. We will look at free threading (and not the apartment model of threading) in Visual Basic using the 'CreateThread' Win32 API call. Those who are familiar with creating Win32 applications using Visual C++ will recognize this little function, which creates a new thread within a process very easily.
In Visual Basic, it is also pretty easy to create a thread. I will take a look at the example code ExCopy (which is supplied with this article). This code is used to split a long file across several 3½-inch high-density floppy disks (similar to what you see in the WinZip utility). An independent thread handles the copy process. Therefore you can easily move the ExCopy window during the copy process.

How the thread is created:

Let us have a look at the following function written in the example code…

    'StartCopy
    'Thread creation helper function
    Public Function StartCopy(clsObject As clsExCopy) As Long
        Dim NewThreadID As Long
        Dim Threadhandle As Long
        Dim Param As Long

        ' Free threaded approach
        Param = ObjPtr(clsObject)

        'Create a thread with no security attribute but
        'with default stack size and default creation flag
        Threadhandle = CreateThread(0, _
                                    0, _
                                    AddressOf ThreadFunction, _
                                    Param, _
                                    0, _
                                    NewThreadID)

        If Threadhandle = 0 Then
            ' Return with zero (error)
            Debug.Print "Unable to create the free thread"
            Exit Function
        End If

        ' We don't need the thread handle
        CloseHandle Threadhandle

        'Return the created ID
        StartCopy = NewThreadID
    End Function

This public function is used to create the thread. StartCopy takes only one parameter, of clsExCopy type. The class clsExCopy is defined in the ExCopyFile_IO.cls module. This class is used to copy data to the disks. You should call the StartCopy function with a valid clsExCopy class object. For example, you might want to call this function when the user clicks on the 'Start' button…

    Private Sub StartBtn_Click()
        'Create a new Thread object
        If ExCopyObj Is Nothing Then
            Set ExCopyObj = New clsExCopy
        Else
            Set ExCopyObj = Nothing
            Set ExCopyObj = New clsExCopy
        End If
        ExCopyObj.bStopNow = False

        'Create a new thread
        StartCopy ExCopyObj

        'Can not click on Start now
        StartBtn.Enabled = False
    End Sub

Therefore you are passing an object value to the StartCopy function.
In this function the Win32 API CreateThread function is used to create a new thread. Now take a closer look at the call to this function.

• The first parameter is actually a pointer to the SECURITY_ATTRIBUTES structure. We are passing zero to get the default security settings.
• The second parameter is the stack size of the newly created thread. We are passing zero to let the Windows kernel determine the stack size.
• The CreateThread function uses the third parameter as a function pointer, which it calls when the thread is created. The third parameter uses the 'AddressOf' unary operator, which returns a pointer to a function (acceptable to the API). Therefore AddressOf ThreadFunction returns the actual pointer to the function 'ThreadFunction', which the Win32 API accepts.
• The fourth parameter is a pointer to the argument passed to the function named in the third parameter. Visual Basic has a function ObjPtr which returns the pointer to an object in memory; I have used it to pass the object as a parameter to ThreadFunction.
• The fifth parameter holds the creation flags, which control how the thread starts. For example, passing CREATE_SUSPENDED creates the thread in a suspended state until ResumeThread is called. This is a bit field where several options can be OR-ed. (Thread priority, such as THREAD_PRIORITY_ABOVE_NORMAL, is not set here but with a separate SetThreadPriority call on the thread handle.)
• The CreateThread function returns the thread identification value in the sixth and final parameter.

CreateThread returns the handle of the thread, which StartCopy releases using the CloseHandle Win32 API call after the thread is successfully created; StartCopy itself returns the thread ID. Closing this handle does not destroy the thread. It merely frees the handle associated with that thread. The CreateThread function returns immediately after it creates the thread. The newly created thread runs as an independent thread.
As you have noticed, we passed a pointer to the function ThreadFunction as the third parameter of CreateThread. Let us see what it looks like…

    'ThreadFunction
    'The thread entry point
    Public Function ThreadFunction(ByVal Param As IUnknown) As Long
        Dim clsObject As clsExCopy

        ' Free threaded approach
        Set clsObject = Param
        clsObject.ExCopy
    End Function

Pretty small, isn't it? ThreadFunction is used to call a method (ExCopy in this case) of the clsExCopy class. This function accepts its parameter as IUnknown, which represents the basic interface supported by the Visual Basic COM layer. The value of this parameter is assigned to a clsExCopy class object. Remember that once you leave this function, the thread is destroyed.

This is all there is to creating multiple threads in Visual Basic. Remember to create the project as 'ActiveX EXE'. The rest of the example code deals with form controls and the copy technique (in clsExCopy).

This way of creating a thread is called 'Free Threading', which is not supported by the Visual Basic COM/DCOM layer. Visual Basic COM/DCOM uses either the 'Single Threading' model or 'Apartment model' threading. But free threading is very fast compared to Apartment model threading, which uses marshalling to transport data between two objects. Anyway, we will discuss this approach later.

Samit Ray
Consultant
Price Waterhouse Associates (P) Ltd.
Samit_Ray@india.notes.pw.com
SamitRay@cal.vsnl.net.in

Revision 3 by Vladan Bato (bat22@geocities.com)

In this document I'll try to describe the AUD file format used in Command & Conquer and Red Alert.

Command & Conquer is a trademark of Westwood Studios, Inc.
Command & Conquer is Copyright (C)1995 Westwood Studios, Inc.
Command & Conquer: Red Alert is a trademark of Westwood Studios, Inc.
Command & Conquer: Red Alert is Copyright (C)1995,1996 Westwood Studios, Inc.

The information provided here is for anyone who would like to make an AUD player program or AUD to WAV or WAV to AUD converters. Most information about AUD files and IMA-ADPCM compression has been provided by Douglas McFadyen.

I won't explain here the format of the WAV files. You'll have to find this info yourself. I'll just tell you how to obtain 16-bit PCM data and how to encode it.

I will use Pascal-like notation throughout this document.

================================
 0. IMPORTANT NOTE - WHAT'S NEW
================================

This revision contains an important difference in the IMA-ADPCM compression routine. Instead of computing the difference between the current and previous sample, it computes the difference between the current sample and the value that the decoding routine will predict for the previous sample. This is the way the algorithm is implemented in C&C. If you implement it the way it was in previous revisions of this document, the sound will be the same but there will be a "pop" sound at the end.

==============
 1. AUD FILES
==============

The AUD files have the following header :

  Header : record
             SamplesPerSec : word;    {Frequency}
             Size          : longint; {Size of file (without header)}
             OutSize       : longint; {Size of output data}
             Flags         : byte;    {bit 0=stereo, bit 1=16bit}
             Typ           : byte;    {1=WW compressed, 99=IMA ADPCM}
           end;

There are two types of compression. The first is the IMA-ADPCM compression used for 16-bit sound. It's used in most AUD files. The other one is Westwood's proprietary compression for 8-bit sound and is used only for death screams.
I won't describe it in this document because I don't know how it works.

The rest of the AUD file is divided into chunks. These are usually 512 bytes long, except for the last one.

Each chunk has the following header :

  ChunkHd : record
              Size    : word;    {Size of compressed data}
              OutSize : word;    {Size of output data}
              ID      : longint; {Always $0000DEAF}
            end;

The IMA-ADPCM compression compresses 16-bit samples to 4 bits. This means that OutSize will be approximately 4*Size.

The IMA-ADPCM compression and decompression are described in the following sections. Note that the current sample value and index into the Step Table should be initialized to 0 at the start and are maintained across the chunks (see below).

==========================
 2. IMA-ADPCM COMPRESSION
==========================

I won't describe the theory behind the IMA-ADPCM compression. I will just give some pseudo code to compress and decompress data.

The compression algorithm takes a stream of signed 16-bit samples in input and produces a stream of 4-bit codes in output. The 4-bit codes are stored in pairs (two codes in one byte). The first one is stored in the lower four bits.

Two variables must be maintained while compressing : the previous sample value and the current index into the step table. You can find the Step Table in Appendix B. The Index adjustment table is in Appendix A.

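The chunk layout and the nibble order described above can be sketched in Python as follows. Little-endian byte order is an assumption here (the text does not state it), and `read_chunk` and `unpack_codes` are hypothetical helper names.

```python
import struct

# ChunkHd: Size (word), OutSize (word), ID (longint).
# "<" = little-endian, which is an assumption not stated in the text.
CHUNK_HEADER = struct.Struct("<HHI")
CHUNK_ID = 0x0000DEAF


def read_chunk(buf: bytes, offset: int = 0):
    """Return (compressed payload, OutSize, offset of the next chunk)."""
    size, out_size, chunk_id = CHUNK_HEADER.unpack_from(buf, offset)
    if chunk_id != CHUNK_ID:
        raise ValueError("bad chunk ID: %#x" % chunk_id)
    start = offset + CHUNK_HEADER.size
    return buf[start:start + size], out_size, start + size


def unpack_codes(payload: bytes):
    """Split each byte into two 4-bit codes, lower four bits first."""
    codes = []
    for byte in payload:
        codes.append(byte & 0x0F)  # first code: lower four bits
        codes.append(byte >> 4)    # second code: upper four bits
    return codes


# A hand-built chunk: 2 compressed bytes that expand to 8 output bytes.
example = struct.pack("<HHI", 2, 8, CHUNK_ID) + bytes([0x21, 0xF7])
payload, out_size, next_offset = read_chunk(example)
codes = unpack_codes(payload)
```

The example chunk follows the 4:1 ratio from the text: two payload bytes carry four codes, which decode to four 16-bit samples, or eight output bytes.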
Here's the pseudo-code that will do the compression :

  Index:=0;
  Prev_Sample:=0;

  while there_is_more_data do
  begin
    Cur_Sample:=Get_Next_Sample;
    Delta:=Cur_Sample-Prev_Sample;
    if Delta<0 then
    begin
      Delta:=-Delta;
      Sb:=1;
    end else Sb:=0;
    {Sb is the sign bit of the output Code (value 8)}

    Code := 4*Delta div Step_Table[Index];
    if Code>7 then Code:=7;
    {These are the 3 low-order bits of output code}

    Predicted_Delta:=(Step_Table[Index]*Code) div 4 + Step_Table[Index] div 8;
    if Sb=1 then Predicted_Delta:=-Predicted_Delta;
    {This is the Delta that decoding routine will compute}

    Prev_Sample:=Prev_Sample+Predicted_Delta;
    if Prev_Sample>32767 then Prev_Sample:=32767
    else if Prev_Sample<-32768 then Prev_Sample:=-32768;
    {Prev_Sample is the sample value that the decoding routine will compute}

    Index:=Index+Index_Adjust[Code];
    if Index<0 then Index:=0;
    if Index>88 then Index:=88;
    {Index is adjusted last, as in the decoding routine}

    Output_Code(Code+Sb*8);
  end;

Note that this code is usually implemented in a more efficient manner (no need to divide).

The Get_Next_Sample function should return the next sample from the input buffer. The Output_Code function should store the 4-bit code to the output buffer. One byte contains two 4-bit codes, and this function should take care of this.

============================
 3. IMA-ADPCM DECOMPRESSION
============================

It is the exact opposite of the above. It receives 4-bit codes in input and produces 16-bit samples in output.

Again you have to maintain an Index into the Step Table and the current sample value. The tables used are the same as for compression.

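Both directions can also be sketched together in Python, which makes it easy to check that the compressor tracks exactly what the decompressor will produce. This is an illustrative sketch, not Westwood's code: the encoder reuses the decoder's step function so the two cannot drift apart, and codes are kept in plain lists instead of being packed two per byte.

```python
INDEX_ADJUST = (-1, -1, -1, -1, 2, 4, 6, 8)
STEP_TABLE = (
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 19, 21, 23, 25, 28, 31,
    34, 37, 41, 45, 50, 55, 60, 66, 73, 80, 88, 97, 107, 118, 130, 143,
    157, 173, 190, 209, 230, 253, 279, 307, 337, 371, 408, 449, 494, 544,
    598, 658, 724, 796, 876, 963, 1060, 1166, 1282, 1411, 1552, 1707,
    1878, 2066, 2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871,
    5358, 5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767,
)


def clamp(value, low, high):
    return max(low, min(high, value))


def decode_step(code, sample, index):
    """Apply one 4-bit code; return the new (sample, index) pair."""
    step = STEP_TABLE[index]
    delta = (step * (code & 7)) // 4 + step // 8
    if code & 8:  # sign bit
        delta = -delta
    sample = clamp(sample + delta, -32768, 32767)
    index = clamp(index + INDEX_ADJUST[code & 7], 0, 88)
    return sample, index


def decode(codes):
    """Turn a list of 4-bit codes into signed 16-bit samples."""
    sample, index, out = 0, 0, []
    for code in codes:
        sample, index = decode_step(code, sample, index)
        out.append(sample)
    return out


def encode(samples):
    """Return (codes, predicted): the codes and the decoder's view of them."""
    predicted, index = 0, 0
    codes, tracked = [], []
    for cur in samples:
        delta = cur - predicted
        sign = 8 if delta < 0 else 0
        magnitude = min(7, 4 * abs(delta) // STEP_TABLE[index])
        code = magnitude | sign
        # Advance the state exactly as the decoder will.
        predicted, index = decode_step(code, predicted, index)
        codes.append(code)
        tracked.append(predicted)
    return codes, tracked


codes, predicted = encode([0, 100, 1000, -2000, 32767, -32768, 5])
```

Decoding `codes` gives back exactly the `predicted` list, which is the property the note in section 0 is about; the decoded samples only approximate the original input.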
Here's the pseudo-code :

  Index:=0;
  Cur_Sample:=0;

  while there_is_more_data do
  begin
    Code:=Get_Next_Code;

    if (Code and $8) <> 0 then Sb:=1 else Sb:=0;
    Code:=Code and $7;
    {Separate the sign bit from the rest}

    Delta:=(Step_Table[Index]*Code) div 4 + Step_Table[Index] div 8;
    {The last one is to minimize errors}

    if Sb=1 then Delta:=-Delta;

    Cur_Sample:=Cur_Sample+Delta;
    if Cur_Sample>32767 then Cur_Sample:=32767
    else if Cur_Sample<-32768 then Cur_Sample:=-32768;

    Output_Sample(Cur_Sample);

    Index:=Index+Index_Adjust[Code];
    if Index<0 then Index:=0;
    if Index>88 then Index:=88;
  end;

Again, this can be done more efficiently (no need for multiplication).

The Get_Next_Code function should return the next 4-bit code. It must extract it from the input buffer (note that two 4-bit codes are stored in the same byte, the first one in the lower bits).

The Output_Sample function should write the signed 16-bit sample to the output buffer.

=========================================
 Appendix A : THE INDEX ADJUSTMENT TABLE
=========================================

  Index_Adjust : array [0..7] of integer = (-1,-1,-1,-1,2,4,6,8);

=============================
 Appendix B : THE STEP TABLE
=============================

  Step_Table : array [0..88] of integer =(
        7,     8,     9,    10,    11,    12,    13,    14,
       16,    17,    19,    21,    23,    25,    28,    31,
       34,    37,    41,    45,    50,    55,    60,    66,
       73,    80,    88,    97,   107,   118,   130,   143,
      157,   173,   190,   209,   230,   253,   279,   307,
      337,   371,   408,   449,   494,   544,   598,   658,
      724,   796,   876,   963,  1060,  1166,  1282,  1411,
     1552,  1707,  1878,  2066,  2272,  2499,  2749,  3024,
     3327,  3660,  4026,  4428,  4871,  5358,  5894,  6484,
     7132,  7845,  8630,  9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794,
    32767 );

---
Vladan Bato (bat22@geocities.com)
http://www.geocities.com/SiliconValley/8682

This article explains how BSP (binary space partitioning) trees can be used in a game such as DOOM as part of the rendering pipeline to perform back-face culling, partial Z-ordering and hidden surface removal.

To explain the use of BSP trees, it is best to start with an example. Consider a very simple DOOM level.

A---------------------------------a----------------------------------B
|                                 y                                  |
| d1                                                              b1 |
|                               f'                                   |
|          C--------------------f-----------------------D            |
|          |                    f"                      |            |
| d        |                                            |          b |
|       e" e e'                                      g' g g"         |
| d2       |                                            |         b2 |
|          E                                            F            |
|                                 x                                  |
G---------------------------------c----------------------------------H
 ----c1---- ----------------------c2-------------------- -----c3-----

The level consists of a room within a room. The player cannot go outside of the area within the square ABHG.

First some definitions (sorry :-)

The _vertices_ are marked A-H, the _faces_ are marked a-g. We define a _line_ by using an ordered pair of vertices, so that

    a = (A,B)
    e = (E,C)
    f = (C,D)
    g = (F,D)

We say a point is to the _left_ of a line if it is to the left of the vector between its two vertices, taken in order. So, in the above example, nothing is to the left of line a; everything is to the right of it. Note that this depends upon our definition of line a, and if we had defined a = (B,A) then everything would be to the left of line a.

A _face_ is a side of a line which is visible to the player. Wall e above, for example, has two faces (marked e' and e"). Not all walls have two faces - if the player can never see one side of a wall it only has one.

A wall is fully defined by an ordered pair of vertices and an ordered pair of faces - a left face and a right face.

The BSP tree for the example above might look like this:

          f
         / \
        /   \
       /     \
  a,d1,b1     e
             / \
            /   \
           /     \
       d2,c1      g
                 / \
                /   \
               /     \
             c2       c3,b2

Each node contains a line. Everything to the left of that line is in the left subtree, and everything to the right of that line is in the right subtree. Note that face d is neither completely to the right of nor to the left of face f.
To accommodate this, we split it up into two halves, and put one half into the left subtree and one half into the right subtree. Thus, we have to generate new faces in order to build the BSP tree.

I will explain how the BSP tree is created later. Firstly, I will give the algorithm used to render a picture using the tree.

Suppose the player is standing at position 'x', and looking North. We start at the top of the tree at line f. We are standing to the right of line f, so we go down the LEFT of the tree. This is because we want the furthest polygons first. We come to the left-hand-most terminating node. We write down the faces here in our notepad: "a,d1,b1".

Since we've come to a terminator, we back up a level. Back to the top, but we have to go down the right subtree yet. Firstly, though, we look at face f - the deciding face for this node. We've got everything behind it in our list, we've yet to look at anything in front of it, but we must put it into our list. Note that face f has two sides - f' and f". Since we already know we're on the right of line f, we know that we can only see its right side - so we write f" in our notepad. It now says a,d1,b1,f".

Note, though, that if we were looking south (i.e. our line-of-sight vector points away from face f) then we could not see either face f or anything on the other side of face f - in this case, we just don't bother going any further down the tree.

Now we go down the subtree and come to node e. We are on the right of e, so we go down the left subtree and get a terminal node - we just write d2,c1 in our notepad. Back up, decide on which side of e to put in. We decide e'. The notepad now says a,d1,b1,f",d2,c1,e'.

Down the right subtree to node g. We're on the left, so down the right subtree to c3,b2, up, check g (we're on the left = g'), back down to the final node, get c2, up, up, up, and we're done.
The notepad ends up saying:

    a d1 b1 f" d2 c1 e' c3 b2 g' c2

If we draw these walls, in this order, then we will get the correct scene. I would recommend using a one-dimensional Z-buffer to get finer granularity than the painter's algorithm provides, before plotting the walls. Note also that some walls are behind you - however, since you need to calculate their z coordinates for the perspective transform, you can merely discard faces with negative z values.

Creating the BSP tree
---------------------

The BSP tree almost creates itself. The only difficulty is knowing when to stop recursing. Notice that the terminal nodes are just put into the list - so a sufficient condition for a group of faces to form a terminal node is that they can be drawn in a set order without any mistakes occurring in the drawing. That is, wherever the player can stand, the group of walls will never obscure each other.

So let us begin:

Choose face f (the choice is fairly arbitrary - it is best to choose faces which don't split many other faces up. However, in this case it is unavoidable). Split up faces d and b, because they straddle the line f. (The line you are splitting along is known as the _nodeline_ in DOOM-speak). Then put everything to the left of f in the left subtree, and vice-versa:

          f
         / \
        /   \
       /     \
  a,d1,b1     b2,c,d2,e,g

We can terminate the left node - because walls a, d1 and b1 form a convex shape, they can never overlap each other from any point of view. However, on the other side, face e can obscure face d2 from certain viewpoints (our example viewpoint above, for one) so we divide along side e. This causes side c to be split, but side a is not split because it's not in our current list of sides. The next level is:

          f
         / \
        /   \
       /     \
  a,d1,b1     e
             / \
            /   \
           /     \
       d2,c1      b2,c2,g

Now, c1 and d2 never overlap, so we have another terminal node. We next divide along line g, splitting c2 into c2 and c3, and the last nodes are terminals (a node with one face in is always terminal :-).

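The notepad walk from position x can be reproduced with a small Python sketch of the finished example tree. It is deliberately simplified: which side of each nodeline the player is on is supplied directly (taken from the walk-through), and view-direction and bounding-box culling are left out. The `Node` class and `back_to_front` are names made up for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    line: str = ""
    left_face: str = ""
    right_face: str = ""
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    faces: Optional[List[str]] = None  # set only on terminal nodes


def back_to_front(node, player_side, out):
    """Append faces to `out`, furthest first (painter's order)."""
    if node.faces is not None:  # terminal node: just list its faces
        out.extend(node.faces)
        return
    if player_side[node.line] == "right":
        far, face, near = node.left, node.right_face, node.right
    else:
        far, face, near = node.right, node.left_face, node.left
    back_to_front(far, player_side, out)   # everything behind the nodeline
    out.append(face)                       # the side of the wall facing us
    back_to_front(near, player_side, out)  # everything on our side


# The example tree: f at the root, then e, then g.
g = Node("g", "g'", 'g"', Node(faces=["c2"]), Node(faces=["c3", "b2"]))
e = Node("e", 'e"', "e'", Node(faces=["d2", "c1"]), g)
f = Node("f", "f'", 'f"', Node(faces=["a", "d1", "b1"]), e)

# Player at x: right of f, right of e, left of g (as in the walk-through).
sides_at_x = {"f": "right", "e": "right", "g": "left"}
order = []
back_to_front(f, sides_at_x, order)
```

Running it fills `order` with the same sequence as the notepad.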
This is the basic idea behind a BSP tree. To give an example of how effective it is, consider standing at point y and looking North. Because you're looking away from face f, you don't bother recursing down the entire right subtree. This then very quickly gives you the ordered list of faces: a,d1,b1.

Refinements
-----------

If at each node we define a bounding box for each subtree, such that every line in a subtree is contained by its corresponding bounding box, then we can cut out some invisible polygons (ones which lie to the left or right of the screen) by comparing each bounding box with the cone of vision - if they don't intersect, then you can skip the whole subtree. DOOM does this, allowing it to store an *entire* level in one huge BSP tree.

Here's some pseudo-code to traverse the tree. The function left() returns TRUE if the second input vector is to the left of the first input vector. This is a simple dot product (of the second vector with the perpendicular of the first), and with the slope of the nodeline pre-calculated it can be done with one multiply and one subtract.
    vector player            ; player's map position
    vector left_sightline    ; vector representing a ray cast through
                             ; the left-most pixel of the screen
    vector right_sightline   ; the right-most pixel of the screen

    structure node
    {
        vector vertex1
        vector vertex2
        node   left_subtree
        node   right_subtree
        face   left_face
        face   right_face
        box    bounding_box
        bool   terminal_node
        face   terminal_node_faces[lots]
    }

    recurse(node input)
        if (cone defined by left and right sightlines does not
            intersect input.bounding_box)
            return
        fi
        if input.terminal_node
            ; terminal node - add faces to list
            add(input.terminal_node_faces)
            return
        fi
        if left(input.vertex2-input.vertex1,player-input.vertex1)
            ; player is to the left of the nodeline
            if not left(input.vertex2-input.vertex1,right_sightline)
                ; sight points right - we are looking at the face
                recurse(input.right_subtree)
                add(input.left_face)
            fi
            ; now go down the left subtree
            recurse(input.left_subtree)
        else
            ; player is to the right of the nodeline
            if left(input.vertex2-input.vertex1,left_sightline)
                ; sight points left - we are looking at the face
                recurse(input.left_subtree)
                add(input.right_face)
            fi
            ; now go down the right subtree
            recurse(input.right_subtree)
        fi
        return
    end recurse

This isn't anywhere near a decent implementation - the data structures, for example, leave a *lot* to be desired :-) It should be possible to encode all the functions inline; in fact, it would be feasible to take a BSP tree and hard-code it into some run-time generated code which you just call to recurse the tree ... but I'm just a hacker at heart ;-)

Anyway, I hope this helps answer some people's questions on this subject. If you have any more questions, please don't hesitate to email me.

Catch you later,
Eddie xxx

ee@datcon.co.uk
===========================================================================
Official Archimedes convertor of : Hear and remember, see and understand,
Wolfenstein 3D and proud of it!! : do and forget.
=================================: Something like that, anyway.
==========================================

By Asatur V. Nazarian (samael@avn.mccme.ru)

In this document I'll try to describe how sound effects are stored in the .PIG/.S11/.S22 resource files of the Interplay/Parallax games Descent 1 and Descent 2.

The files this document deals with have extensions: .PIG, .S11, .S22.

Throughout this document I use C-like notation.

All numbers in all structures described in this document are stored in files using little-endian (Intel) byte order.

======================
1. Descent 1 Sound FX
======================

In Descent 1 all sound files are stored in the DESCENT.PIG file except for one file, DIGITEST.RAW, which is in DESCENT.HOG.

The Descent 1 (v1.0) .PIG file has the following header:

struct PIGHeader
{
  DWORD nFiles;
  DWORD nSoundFiles;
};

nFiles -- the number of non-sound files in the PIG.

nSoundFiles -- the number of RAW sounds in the PIG.

Following the header go (nFiles) records describing non-sound files. All we need to know about these records is that each of them is 17 bytes long. After those records go (nSoundFiles) records describing sounds. Each record has the following format:

struct DSNDEntry
{
  char  szFileName[8];
  DWORD nSamples;
  DWORD dwFileSize;
  DWORD dwFileStart;
};

szFileName -- the name of the sound, padded with zeroes. Note that some filenames are exactly 8 bytes long and are therefore not zero-terminated.

nSamples -- the number of samples in the RAW file.

dwFileSize -- the size of the RAW file (in bytes).

dwFileStart -- the starting position of the RAW file relative to the beginning of the data files in the PIG. That is, to get the starting position of the RAW file relative to the beginning of the PIG file you need to add (dwFileStart) to (sizeof(PIGHeader)+nFiles*17+nSoundFiles*sizeof(DSNDEntry)).

Starting at that position is RAW 8-bit unsigned mono data. All sound files in the Descent 1 .PIG should be played at 11025 Hz.

Note that Descent v1.4 has a slightly different .PIG file.
Namely, it starts with a DWORD value which is the position (relative to the beginning of the .PIG file) of the PIGHeader. Starting at that position is a PIGHeader which is identical to the one described above.

======================
2. Descent 2 Sound FX
======================

In Descent 2 all sound files are stored in DESCENT2.S11 and DESCENT2.S22 except for DIGITEST.RAW in DESCENT2.HOG.

.S11 and .S22 files have the following header:

struct DSNDHeader
{
  char  szID[4];
  DWORD dwDummy;
  DWORD nFiles;
};

szID -- string ID, always "DSND".

dwDummy -- ??? seems to have no meaningful value.

nFiles -- the number of files in the .S11/.S22 file.

After the header go (nFiles) DSNDEntry records (the same as for Descent 1). Just like in Descent 1, the (dwFileStart) field of each entry is the starting position of the RAW file relative to the beginning of the RAW files. That is, to get the starting position of the RAW file relative to the beginning of the .S11/.S22 file you need to add (dwFileStart) to (sizeof(DSNDHeader)+nFiles*sizeof(DSNDEntry)).

Each file in .S11/.S22 is 8-bit unsigned RAW. RAWs from .S11 should be played at 11025 Hz and those from .S22 at 22050 Hz.

----------------------------------------
Asatur V. Nazarian (samael@avn.mccme.ru)

http://anx.da.ru
http://www.fortunecity.com/campus/electrical/81/samael.html
http://www.music.ag.ru/

On all these sites you can find my GAP program which can deal with PIG/S11/S22 resource files, extract RAWs from them and play back those RAWs. There's also complete source code of GAP and all its plug-ins there, including the DSND plug-in, which could be used for further details on how you can deal with this format.

By Valery V. Anisimovsky (samael@avn.mccme.ru)

In this document I'll try to describe audio file formats used in many (older) Electronic Arts games. Described are formats for music, sound effects, speech and movie soundtracks.

The games using these formats include: NBA Live'96, NHL'96, FIFA'96, The Need For Speed, NHL'97.
Maybe many more, e.g.: NHL'95.

The files this document deals with have extensions: .ASF, .AS4, .KSF, .EAS, .SPH, .BNK, .CRD, .TGV. Note that the files described here may have other extensions (and the same structure!): Electronic Arts tends to change extensions from game to game.

Throughout this document I use C-like notation.

All numbers in all structures described in this document are stored in files using little-endian (Intel) byte order.

=======================
1. ASF/AS4 Music Files
=======================

The music in many Electronic Arts games is in stand-alone .ASF/.AS4 files. These files have a block structure analogous to RIFF. Namely, these files are divided into blocks (without any global file header like RIFFs have). Each block has the following header:

struct ASFBlockHeader
{
  char  szBlockID[4];
  DWORD dwSize;
};

szBlockID -- string ID for the block.

dwSize -- size of the block (in bytes) INCLUDING this header.

Below I describe the contents of all block types in ASF/AS4 files. When I say "block begins with..." that means "the contents of that block (which begin just after ASFBlockHeader) begin with...". Quoted strings are block IDs.

"1SNh": header block. This is the first block in ASF/AS4. This block begins with the structure describing the audio stream:

struct EACSHeader
{
  char  szID[4];
  DWORD dwSampleRate;
  BYTE  bBits;
  BYTE  bChannels;
  BYTE  bCompression;
  BYTE  bType;
  DWORD dwNumSamples;
  DWORD dwLoopStart;
  DWORD dwLoopLength;
  DWORD dwDataStart;
  DWORD dwUnknown;
};

szID -- ID string, always "EACS".

dwSampleRate -- sample rate for the file.

bBits -- multiplied by 8, this gives the resolution of the (decompressed) sound data, that is, 1 means 8-bit and 2 means 16-bit.

bChannels -- number of channels: 1 for mono, 2 for stereo.

bCompression -- if 0x00, the data in the file is not compressed: signed 8-bit PCM or signed 16-bit PCM. If this byte is 0x02, the audio data is compressed with IMA ADPCM. Note that non-compressed 8-bit files use SIGNED format!
Signed 16-bit data may be sent to the wave output without any additional conversions, while signed 8-bit data should be converted to unsigned format. For example: unsigned8Bit=signed8Bit+0x80 or, equivalently, unsigned8Bit=signed8Bit^0x80 (this is a bit faster).

bType -- type of file: always 0x00 for ASF/AS4 (multi-block) files.

dwNumSamples -- number of samples in the file. May be used for song length (in seconds) calculation.

dwLoopStart -- beginning of the repeat loop (in samples). 0xFFFFFFFF means no loop.

dwLoopLength -- length of the repeat loop (in samples). Zero for no loop.

dwDataStart -- in ASF/AS4 files this is not used (equal to 0).

After the EACSHeader comes the first chunk of sound data. If the data isn't compressed, it's just signed 8/16-bit PCM. If the data is compressed, it starts with a small chunk header:

struct ASFChunkHeader
{
  DWORD dwOutSize;
  LONG  lIndexLeft;
  LONG  lIndexRight;
  LONG  lCurSampleLeft;
  LONG  lCurSampleRight;
};

dwOutSize -- size of uncompressed audio data in this chunk (in samples).

lIndexLeft, lIndexRight, lCurSampleLeft, lCurSampleRight are initial values for the IMA ADPCM decompression routine for this chunk (for left and right channels respectively). I'll describe their usage when I get to the IMA ADPCM decompression scheme.

Note that the structure above is ONLY for stereo files. For mono there are no lIndexRight and lCurSampleRight fields.

After this chunk header comes the compressed data. You may find the IMA ADPCM decompression scheme description further in this document.

Hereafter by "chunk" I mean the audio data in the "1SNd" data block, that is, compressed data which starts after ASFChunkHeader.

"1SNd": data block. If no compression is used these blocks contain just signed 8/16-bit PCM audio data. Otherwise the data in each of these blocks begins with the same ASFChunkHeader described above and after that comes compressed data.
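As a rough illustration of the block structure described so far, here is a minimal C sketch that walks the blocks of an ASF/AS4 image already loaded into memory. It relies only on what is stated above (an 8-byte ASFBlockHeader, little-endian dwSize that INCLUDES the header, "1SNe" as the last block); the helper names ReadLE32 and CountBlocks and the error handling are my own assumptions, not part of any EA code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t ReadLE32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Count the blocks carrying the given 4-character ID. Walking stops at
   the "1SNe" end block, at a truncated block, or at a block whose size
   is smaller than its own 8-byte header. */
static size_t CountBlocks(const uint8_t *data, size_t size, const char *id)
{
    size_t pos = 0, count = 0;

    while (size - pos >= 8) {
        uint32_t blockSize = ReadLE32(data + pos + 4);

        if (blockSize < 8 || blockSize > size - pos)
            break;                      /* malformed or truncated block */
        if (memcmp(data + pos, id, 4) == 0)
            count++;
        if (memcmp(data + pos, "1SNe", 4) == 0)
            break;                      /* end of the audio stream */
        pos += blockSize;               /* dwSize includes the header */
    }
    return count;
}
```

A real player would dispatch on the block ID inside the loop instead of counting, but the stepping logic (advance by dwSize, not by dwSize plus 8) stays the same.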
Note that the first chunk of audio data is in the "1SNh" block, along with the global EACS header!

"1SNl": loop block. This block defines the looping point for the song. It contains only a DWORD value, which is the looping jump position (in samples) relative to the start of the song. Note that you should make the jump NOT when you encounter this block but when you come across the "1SNe" block, which may appear some "1SNd" data blocks after this block!

"1SNe": end block. This block indicates the end of the audio stream. Make the looping jump when you encounter it. It contains no data and its size is 8 bytes, that is, the size of ASFBlockHeader. Interestingly, some AS4 files contain audio data beyond this block. This should be considered a non-standard feature that is not worth supporting.

===================
2. KSF Music Files
===================

Some EA games use another format for music/speech files: .KSF. These files begin with the "KWK`" ID string. Following this ID comes the PATl header. It begins with the "PATl" ID string and its size is 56 bytes (always?) including its ID string. After the PATl header comes the TMpl header:

struct TMplHeader
{
  char  szID[4];
  BYTE  bUnknown1;
  BYTE  bBits;
  BYTE  bChannels;
  BYTE  bCompression;
  WORD  wUnknown2;
  WORD  wSampleRate;
  DWORD dwNumSamples; // ???
  BYTE  bUnknown3[20];
};

szID -- string ID, always "TMpl".

bBits -- resolution of sound data (0x10 for 16-bit, 0x8 for 8-bit).

bChannels -- number of channels: 1 for mono, 2 for stereo.

bCompression -- if 0x00, the data in the file is not compressed: signed 8-bit PCM or signed 16-bit PCM. If this byte is 0x02, the audio data is compressed with IMA ADPCM. See the note for the EACS header above.

wSampleRate -- sample rate for the file.

dwNumSamples -- number of samples in the file. May be used for song length (in seconds) calculation. Should be divided by 2 for mono sound.

After the TMpl header comes sound data. For compressed files, IMA ADPCM compression is used (see below).

===================================== 3.
IMA ADPCM Decompression Algorithm =====================================

During the decompression four LONG variables must be maintained for a stereo stream: lIndexLeft, lIndexRight, lCurSampleLeft, lCurSampleRight, and two for a mono stream: lIndex, lCurSample. At the beginning of each "1SNd" data block and at the beginning of the file -- when processing the "1SNh" block -- you must initialize these variables using the values in ASFChunkHeader. Note that LONG here is signed.

Here's the code which decompresses one byte of an IMA ADPCM compressed stereo stream. Other bytes are processed in the same way.

BYTE Input; // current byte of compressed data
BYTE Code;
LONG Delta;

Code=HINIBBLE(Input); // get HIGHER 4-bit nibble

Delta=StepTable[lIndexLeft]>>3;
if (Code & 4)
   Delta+=StepTable[lIndexLeft];
if (Code & 2)
   Delta+=StepTable[lIndexLeft]>>1;
if (Code & 1)
   Delta+=StepTable[lIndexLeft]>>2;
if (Code & 8) // sign bit
   lCurSampleLeft-=Delta;
else
   lCurSampleLeft+=Delta;

// clip sample
if (lCurSampleLeft>32767)
   lCurSampleLeft=32767;
else if (lCurSampleLeft<-32768)
   lCurSampleLeft=-32768;

lIndexLeft+=IndexAdjust[Code]; // adjust index

// clip index
if (lIndexLeft<0)
   lIndexLeft=0;
else if (lIndexLeft>88)
   lIndexLeft=88;

Code=LONIBBLE(Input); // get LOWER 4-bit nibble

Delta=StepTable[lIndexRight]>>3;
if (Code & 4)
   Delta+=StepTable[lIndexRight];
if (Code & 2)
   Delta+=StepTable[lIndexRight]>>1;
if (Code & 1)
   Delta+=StepTable[lIndexRight]>>2;
if (Code & 8) // sign bit
   lCurSampleRight-=Delta;
else
   lCurSampleRight+=Delta;

// clip sample
if (lCurSampleRight>32767)
   lCurSampleRight=32767;
else if (lCurSampleRight<-32768)
   lCurSampleRight=-32768;

lIndexRight+=IndexAdjust[Code]; // adjust index

// clip index
if (lIndexRight<0)
   lIndexRight=0;
else if (lIndexRight>88)
   lIndexRight=88;

// Now we've got lCurSampleLeft and lCurSampleRight which form one stereo
// sample and all is set for the next input byte...
Output((SHORT)lCurSampleLeft,(SHORT)lCurSampleRight); // send the sample to output

HINIBBLE and LONIBBLE are higher and lower 4-bit nibbles:

#define HINIBBLE(byte) ((byte) >> 4)
#define LONIBBLE(byte) ((byte) & 0x0F)

Note that depending on your compiler you may need to use additional nibble separation in these defines, e.g. (((byte) >> 4) & 0x0F).

StepTable and IndexAdjust are the tables given in the next section of this document.

Output() is just a placeholder for any action you would like to perform on the decompressed sample value.

Of course, this decompression routine may be greatly optimized.

For mono sound it is analogous:

Code=HINIBBLE(Input); // get HIGHER 4-bit nibble

Delta=StepTable[lIndex]>>3;
if (Code & 4)
   Delta+=StepTable[lIndex];
if (Code & 2)
   Delta+=StepTable[lIndex]>>1;
if (Code & 1)
   Delta+=StepTable[lIndex]>>2;
if (Code & 8) // sign bit
   lCurSample-=Delta;
else
   lCurSample+=Delta;

// clip sample
if (lCurSample>32767)
   lCurSample=32767;
else if (lCurSample<-32768)
   lCurSample=-32768;

lIndex+=IndexAdjust[Code]; // adjust index

// clip index
if (lIndex<0)
   lIndex=0;
else if (lIndex>88)
   lIndex=88;

Output((SHORT)lCurSample); // send the sample to output

Code=LONIBBLE(Input); // get LOWER 4-bit nibble
// ...just the same as above for lower nibble

Note that the HIGHER nibble is processed first for mono sound and corresponds to the LEFT channel for stereo.

==================== 4.
IMA ADPCM Tables ====================

LONG IndexAdjust[]=
{
    -1, -1, -1, -1, 2, 4, 6, 8,
    -1, -1, -1, -1, 2, 4, 6, 8
};

LONG StepTable[]=
{
    7,     8,     9,     10,    11,    12,    13,    14,
    16,    17,    19,    21,    23,    25,    28,    31,
    34,    37,    41,    45,    50,    55,    60,    66,
    73,    80,    88,    97,    107,   118,   130,   143,
    157,   173,   190,   209,   230,   253,   279,   307,
    337,   371,   408,   449,   494,   544,   598,   658,
    724,   796,   876,   963,   1060,  1166,  1282,  1411,
    1552,  1707,  1878,  2066,  2272,  2499,  2749,  3024,
    3327,  3660,  4026,  4428,  4871,  5358,  5894,  6484,
    7132,  7845,  8630,  9493,  10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794,
    32767
};

=========================
5. TGV Movie Soundtracks
=========================

.TGV movies have a block structure analogous to that of ASF/AS4. Video-related data is in "kVGT" and "fVGT" (or "TGVk" and "TGVf") blocks and sound-related data is in the same blocks as in ASF/AS4: "1SNh", "1SNd", "1SNl", "1SNe". So, to play a TGV movie soundtrack, just walk the block chain, skip video blocks and process sound blocks.

==================================
6. Sound/Speech Files: .EAS, .SPH
==================================

Some sounds and all speech are usually in .EAS and .SPH files. These files have a header which is the same as the EACSHeader structure described above, with two differences: (bType) is always 0xFF for sound/speech files, and (dwDataStart) is the starting position of audio data relative to the beginning of the file.

After the header, starting at (dwDataStart), comes audio data, up to the end of the file. The data is either non-compressed or IMA ADPCM compressed depending on the (bCompression) byte in the header. If it's IMA ADPCM compressed, there are no initial values for samples and indices at the beginning of the audio data. Just initialize them all to zero and start decompression at (dwDataStart).

==================================== 7.
Sound Effects in .BNK/.CRD Files ====================================

Most sound effects are stored in .BNK and .CRD resource files. Those .BNKs and .CRDs may contain several sounds. They begin with some seemingly meaningless data, but after that junk (typically at position 0x228, but not necessarily) come several EACS headers describing all sounds in the .BNK/.CRD. Each EACS header has almost the same format as described above with some minor changes (some fields are placed differently):

struct EACSHeader
{
  char  szID[4];
  DWORD dwSampleRate;
  BYTE  bBits;
  BYTE  bChannels;
  BYTE  bCompression;
  BYTE  bType;
  DWORD dwLoopStart;
  DWORD dwLoopLength;
  DWORD dwNumSamples;
  DWORD dwDataStart;
  DWORD dwUnknown;
};

and with the same two differences as for .EAS/.SPH speech/sound: (bType) is always 0xFF, and (dwDataStart) is the starting position of sound data relative to the beginning of the .BNK/.CRD file containing that sound.

So, all you need to do is search the .BNK/.CRD for the "EACS" ID string and read an EACSHeader from the position where you found "EACS". Repeat this for all sounds contained within the .BNK/.CRD.

The sound data itself (for each EACS header describing it) starts at (dwDataStart) and its size may be computed using (for example) the (dwNumSamples) EACSHeader field with the following formula:

Size=dwNumSamples*SampleSize/CompressionRatio,

where:

CompressionRatio=1 for non-compressed sounds, 2 for 8-bit IMA ADPCM compressed sounds, 4 for 16-bit IMA ADPCM compressed sounds,

SampleSize=bChannels*bBits (1 for mono 8-bit, 2 for mono 16-bit, etc.).

So, starting at (dwDataStart) comes either PCM audio data (as described above for .EAS/.SPH files) or IMA ADPCM compressed data (without initial sample/index values, just as in .EAS/.SPH). Set CurSample(Left/Right) and Index(Left/Right) to zero and start the decompression.

=========== 8.
Credits =========== Vladan Bato (bat22@geocities.com) http://www.geocities.com/SiliconValley/8682 The IMA ADPCM decompression scheme above is based on that described in his AUD3.TXT document. Peter Pawlowski (peterpw666@hotmail.com, piotrpw@polbox.com) http://www.geocities.com/pp_666/ Pointed out corrections in IMA ADPCM decoder. Toni Wilen (nhlinfo@nhl-online.com) http://www.nhl-online.com/nhlinfo/ Provided me with info on KSF files, PATl and TMpl headers. Toni Wilen is the author of SNDVIEW utility (available on his pages) which decompresses Electronic Arts audio files and compresses WAVs back into EA formats. Denis AUROUX (MXK) (auroux@clipper.ens.fr) http://www.eleves.ens.fr:8080/home/auroux/nfsspecs.txt Additional info on EACS headers. The author of The Unofficial NFS File Format Specifications. ------------------------------------------- Valery V. Anisimovsky (samael@avn.mccme.ru) http://www.anxsoft.newmail.ru http://anx.da.ru On these sites you can find my GAP program which can search for audio files in .BNK/.CRD resources, and play back .ASF/.AS4 songs, .BNK/.CRD sounds, .EAS/.SPH speech and soundtracks of .TGV movies. There's also complete source code of GAP and all its plug-ins there, including ASF plug-in, which could be used for further details on how you can deal with these formats. By Valery V. Anisimovsky (samael@avn.mccme.ru) In this document I'll try to describe audio file formats used in many (new) Electronic Arts games. Described are formats for music, movie soundtracks and partly sound effects/speech. The games using these formats include: Need For Speed 2, NFS3, NFS4, NFS5, NBA Live'98, NBA'99, NBA'2000, NHL Online'98, NHL'99, NHL'2000, FIFA'98, FIFA'99, FIFA'2000, Bundesliga Stars 2000, Madden NFL'98, Madden NFL'2000, EURO'2000, Fighter Pilot, Warhammer II: Dark Omen, Dungeon Keeper 2, Populous 3, Wing Commander: Prophecy. Maybe many more, e.g.: NBA'97, FIFA'97. 
The files this document deals with have extensions: .ASF, .STR, .MUS, .LIN, .MAP, .WVE, .TGQ, .DCT, .MAD, .UV, .UV2, .BNK, .VIV. Note that the files described here may have other extensions (and the same structure!): Electronic Arts tends to change extensions from game to game.

Throughout this document I use C-like notation.

All numbers in all structures described in this document are stored in files using little-endian (Intel) byte order, unless otherwise stated.

=========================
1. .ASF/.STR Music Files
=========================

The music in many new Electronic Arts games is in stand-alone .ASF files (sometimes ASF files have the extension .STR). These files have a block structure analogous to RIFF. Namely, these files are divided into blocks (without any global file header like RIFFs have). Each block has the following header:

struct ASFBlockHeader
{
  char  szBlockID[4];
  DWORD dwSize;
};

szBlockID -- string ID for the block.

dwSize -- size of the block (in bytes) INCLUDING this header.

Below I describe the contents of all block types in an .ASF file. When I say "block begins with..." that means "the contents of that block (which begin just after ASFBlockHeader) begin with...". Quoted strings are block IDs.

"SCHl": header block. This is the first block in ASF. In most files this block begins with the ID string "PT\0\0" (or number 0x50540000). Then comes the PT header data which describes the audio data in the file. This PT header should be parsed rather than just read as a simple structure. Here I give the parsing code. These functions use the fread() and fseek() stdio functions.

// first of all, we need a function which reads a small (variable) number
// of bytes and composes a DWORD of them. Note that such a DWORD will be
// stored in a kind of big-endian (Motorola) order, e.g. 3 consecutive
// bytes 0x12 0x34 0x56 will give a DWORD 0x00123456.
DWORD ReadBytes(FILE* file, BYTE count)
{
  BYTE  i, byte;
  DWORD result;

  result=0L;
  for (i=0;i<count;i++)
  {
    fread(&byte,sizeof(BYTE),1,file);
    result<<=8;
    result+=byte;
  }

  return result;
}

// these will be set by ParsePTHeader
DWORD dwSampleRate;
DWORD dwChannels;
DWORD dwCompression;
DWORD dwNumSamples;
DWORD dwDataStart;
DWORD dwLoopOffset;
DWORD dwLoopLength;
DWORD dwBytesPerSample;
BYTE  bSplit;
BYTE  bSplitCompression;

// Here goes the parser itself
// This function assumes that the current file pointer is set to the
// start of PT header data, that is, just after PT string ID "PT\0\0"
void ParsePTHeader(FILE* file)
{
  BYTE byte;
  BOOL bInHeader, bInSubHeader;

  bInHeader=TRUE;
  while (bInHeader)
  {
    fread(&byte,sizeof(BYTE),1,file);
    switch (byte) // parse header code
    {
      case 0xFF: // end of header
           bInHeader=FALSE;
           // fall through
      case 0xFE: // skip
      case 0xFC: // skip
           break;
      case 0xFD: // subheader starts...
           bInSubHeader=TRUE;
           while (bInSubHeader)
           {
             fread(&byte,sizeof(BYTE),1,file);
             switch (byte) // parse subheader code
             {
               case 0x82:
                    fread(&byte,sizeof(BYTE),1,file);
                    dwChannels=ReadBytes(file,byte);
                    break;
               case 0x83:
                    fread(&byte,sizeof(BYTE),1,file);
                    dwCompression=ReadBytes(file,byte);
                    break;
               case 0x84:
                    fread(&byte,sizeof(BYTE),1,file);
                    dwSampleRate=ReadBytes(file,byte);
                    break;
               case 0x85:
                    fread(&byte,sizeof(BYTE),1,file);
                    dwNumSamples=ReadBytes(file,byte);
                    break;
               case 0x86:
                    fread(&byte,sizeof(BYTE),1,file);
                    dwLoopOffset=ReadBytes(file,byte);
                    break;
               case 0x87:
                    fread(&byte,sizeof(BYTE),1,file);
                    dwLoopLength=ReadBytes(file,byte);
                    break;
               case 0x88:
                    fread(&byte,sizeof(BYTE),1,file);
                    dwDataStart=ReadBytes(file,byte);
                    break;
               case 0x92:
                    fread(&byte,sizeof(BYTE),1,file);
                    dwBytesPerSample=ReadBytes(file,byte);
                    break;
               case 0x80: // ???
                    fread(&byte,sizeof(BYTE),1,file);
                    bSplit=ReadBytes(file,byte);
                    break;
               case 0xA0: // ???
                    fread(&byte,sizeof(BYTE),1,file);
                    bSplitCompression=ReadBytes(file,byte);
                    break;
               case 0xFF: // end of header: leave both loops
                    bInSubHeader=FALSE;
                    bInHeader=FALSE;
                    break;
               case 0x8A: // end of subheader
                    bInSubHeader=FALSE;
                    // no break here: the section's data is skipped by the default case
               default: // ???
                    fread(&byte,sizeof(BYTE),1,file);
                    fseek(file,byte,SEEK_CUR);
             }
           }
           break;
      default:
           fread(&byte,sizeof(BYTE),1,file);
           if (byte==0xFF)
              fseek(file,4,SEEK_CUR);
           fseek(file,byte,SEEK_CUR);
    }
  }
}

dwSampleRate -- sample rate for the file. Note that the headers of most ASFs/MUSes I've seen DO NOT contain a sample rate subheader section. Currently I just set the sample rate for such files to the default: 22050 Hz. It seems to work okay.

dwChannels -- number of channels for the file: 1 for mono, 2 for stereo. If this is NOT set by ParsePTHeader, then you may use the default: mono.

dwCompression -- compression tag. If this is 0x00, then no compression is used and the audio data is signed 16-bit PCM. If this is 0x07, the audio data is compressed with the EA ADPCM algorithm. Please read the next section for the description of the EA ADPCM decompression scheme. In some files this tag is omitted -- I use 0x00 (no compression) for them.

dwNumSamples -- number of samples in the file.

dwDataStart -- in ASF files this is not used.

dwLoopOffset -- offset when looping (from start of sound part).

dwLoopLength -- length when looping.

dwBytesPerSample -- bytes per sample (default is 2). Divide this by dwChannels to get the resolution of the sound data.

bSplit -- this appears to be 0x01 for files using "split" SCDl blocks (see below). If this subheader field is absent, the file uses "normal" (interleaved) SCDl blocks.

bSplitCompression -- this appears to be 0x08 for files using non-compressed "split" SCDl blocks. If this subheader field is absent in a file using "split" SCDls, the file uses EA ADPCM compression. This subheader field should not appear in a file using "normal" (interleaved) SCDls.

The structure and the meaning of some parts of the PT header are very uncertain. Please mail me if you find out more!
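To make the variable-length, most-significant-byte-first encoding used inside the PT header more concrete, here is a small C sketch that works on a memory buffer instead of a FILE*. It is only an illustration: ReadBytesMem and FindField are hypothetical names, and the sketch assumes the fragment it is given consists purely of (code, length, value) sections, which is how the parser above treats every section except 0xFF.

```c
#include <stddef.h>
#include <stdint.h>

/* Compose `count` bytes, most significant first, into one value,
   e.g. 0x12 0x34 0x56 gives 0x00123456. */
static uint32_t ReadBytesMem(const uint8_t *p, uint8_t count)
{
    uint32_t result = 0;
    for (uint8_t i = 0; i < count; i++)
        result = (result << 8) | p[i];
    return result;
}

/* Scan (code, length, value) sections for `code`. On success the value
   is stored in *value and 1 is returned; otherwise 0 is returned. */
static int FindField(const uint8_t *p, size_t size, uint8_t code,
                     uint32_t *value)
{
    size_t pos = 0;

    while (pos + 2 <= size) {
        uint8_t c = p[pos], len = p[pos + 1];

        if (len > size - pos - 2)
            break;                      /* truncated section */
        if (c == code) {
            *value = ReadBytesMem(p + pos + 2, len);
            return 1;
        }
        pos += 2u + len;                /* skip code, length and value */
    }
    return 0;
}
```

For example, the three bytes 0x84 0x02 0x56 0x22 would decode as a sample rate section with the value 0x5622, that is, 22050.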
Note that some music/video files have a somewhat different format of SCHl header. Namely, first comes the PATl header: it begins with the "PATl" ID string and its size is 56 bytes (always?) including its ID string. After the PATl header comes the TMpl header:

struct TMplHeader
{
  char  szID[4];
  BYTE  bUnknown1;
  BYTE  bBits;
  BYTE  bChannels;
  BYTE  bCompression;
  WORD  wUnknown2;
  WORD  wSampleRate;
  DWORD dwNumSamples; // ???
  BYTE  bUnknown3[20];
};

szID -- string ID, always "TMpl".

bBits -- resolution of sound data (0x10 for 16-bit, 0x8 for 8-bit).

bChannels -- number of channels: 1 for mono, 2 for stereo.

bCompression -- if 0x00, the data in the file is not compressed: signed 8-bit PCM or signed 16-bit PCM. If this byte is 0x02, the audio data is compressed with IMA ADPCM. See my EA-ASF.TXT specs for a description of the IMA ADPCM decompression scheme.

wSampleRate -- sample rate for the file.

dwNumSamples -- number of samples in the file. May be used for song length (in seconds) calculation. Should be divided by 2 for mono sound. Note that the meaning of this field may be different when the TMpl header is used inside the SCHl header.

"SCCl": count block. This block goes after "SCHl" and contains one DWORD value which is the number of "SCDl" data blocks in the ASF file.

"SCDl": data block. These blocks contain audio data. Depending on the parameters set in the header (see above) an SCDl block may contain compressed (by EA ADPCM or IMA ADPCM) or non-compressed audio data and the data itself may be interleaved or split (see below).

If no compression is used and the file does not use "split" SCDl blocks, the SCDl block begins with a DWORD value which is the number of samples in this block, and after that comes signed 16-bit PCM data in interleaved form: LRLR...LR (L and R are 16-bit sample values for left and right channels).

Hereafter by "chunk" I mean the audio data in the "SCDl" data block, that is, compressed/non-compressed data which starts after the chunk header.
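The non-compressed interleaved layout just described could be read along the following lines. This is a sketch under assumptions of mine: the function and helper names (ReadInterleavedPCM, Le32At, Le16At) are hypothetical, the chunk pointer is taken to point just past the ASFBlockHeader, the stream is stereo, and the leading DWORD is assumed to count samples per channel (the text above does not say which way it counts).

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t Le32At(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Read a signed little-endian 16-bit sample without relying on
   implementation-defined conversions. */
static int16_t Le16At(const uint8_t *p)
{
    int32_t u = p[0] | (p[1] << 8);
    return (int16_t)(u >= 0x8000 ? u - 0x10000 : u);
}

/* Split the LRLR... payload into separate channel arrays of capacity
   `cap`. Returns the number of samples written per channel, or 0 if the
   chunk is too short or the arrays are too small. */
static size_t ReadInterleavedPCM(const uint8_t *chunk, size_t size,
                                 int16_t *left, int16_t *right, size_t cap)
{
    if (size < 4)
        return 0;

    uint32_t n = Le32At(chunk);         /* sample count from the chunk */
    if (n > cap || n > (size - 4) / 4)
        return 0;

    for (uint32_t i = 0; i < n; i++) {
        left[i]  = Le16At(chunk + 4 + 4 * (size_t)i);
        right[i] = Le16At(chunk + 4 + 4 * (size_t)i + 2);
    }
    return n;
}
```

Since interleaved signed 16-bit PCM is what most audio APIs expect anyway, a player could also pass the payload through unchanged; splitting it here just makes the layout explicit.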
In the newer EA games (NHL'2000/NBA'2000/FIFA'99'2000/NFS5) non-compressed "split" SCDl blocks are used. These blocks begin with a chunk header:

struct ASFSplitPCMChunkHeader
{
  DWORD dwOutSize;
  DWORD dwLeftChannelOffset;
  DWORD dwRightChannelOffset;
};

dwOutSize -- size of audio data in this chunk (in samples).

dwLeftChannelOffset, dwRightChannelOffset -- offsets to PCM data for left and right channels, relative to the byte which immediately follows the ASFSplitPCMChunkHeader structure. E.g. for the left channel this offset is zero -- the data starts immediately after this structure.

After this structure comes PCM data for the stereo wavestream. It is not interleaved (LRLRLR...) but split: first come the sample values for the left channel, then those for the right channel, that is, the layout is LL...LRR...R.

If EA ADPCM (or IMA ADPCM) compression is used, but the file does not use "split" SCDls, each SCDl block begins with a chunk header:

struct ASFChunkHeader
{
  DWORD dwOutSize;
  LONG  lCurSampleLeft;
  LONG  lPrevSampleLeft;
  LONG  lCurSampleRight;
  LONG  lPrevSampleRight;
};

dwOutSize -- size of decompressed audio data in this chunk (in samples).

lCurSampleLeft, lCurSampleRight, lPrevSampleLeft, lPrevSampleRight are initial values for the EA ADPCM decompression routine for this data block (for left and right channels respectively). I'll describe their usage when I get to the EA ADPCM decompression scheme.

Note that the structure above is ONLY for stereo files. For mono there are no lCurSampleRight, lPrevSampleRight fields.

If IMA ADPCM compression is used, the meanings of some chunk header fields are different -- see my EA-ASF.TXT specs for details.

After this chunk header comes the compressed data. See the next section for the EA ADPCM decompression scheme description.
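For the non-compressed "split" layout (LL...LRR...R) described above, most playback APIs will want the usual LRLR... order back. A minimal sketch of that re-interleaving step in C, with a hypothetical function name and with the two channel runs assumed to be already converted to native 16-bit samples:

```c
#include <stddef.h>
#include <stdint.h>

/* Weave two per-channel runs of n samples each into LRLR... order.
   `left` would start at dwLeftChannelOffset and `right` at
   dwRightChannelOffset past the chunk header; `out` must have room
   for 2*n samples. */
static void InterleaveStereo(const int16_t *left, const int16_t *right,
                             size_t n, int16_t *out)
{
    for (size_t i = 0; i < n; i++) {
        out[2 * i]     = left[i];
        out[2 * i + 1] = right[i];
    }
}
```

Here n would be the dwOutSize value from the chunk header, assuming it counts samples per channel.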
If EA ADPCM (or IMA ADPCM) compression is used and the file uses "split" SCDls, each SCDl block begins with a different chunk header:

struct ASFSplitChunkHeader
{
  DWORD dwOutSize;
  DWORD dwLeftChannelOffset;
  DWORD dwRightChannelOffset;
};
SHORT lCurSampleLeft;
SHORT lPrevSampleLeft;
BYTE  bLeftChannelData[];  // compressed data for left channel goes here...
SHORT lCurSampleRight;
SHORT lPrevSampleRight;
BYTE  bRightChannelData[]; // compressed data for right channel goes here...

dwOutSize -- size of decompressed audio data in this chunk (in samples).

dwLeftChannelOffset, dwRightChannelOffset -- offsets to compressed data for left and right channels, relative to the byte which immediately follows the ASFSplitChunkHeader structure. E.g. for the left channel this offset is zero -- the data starts immediately after this structure.

lCurSampleLeft, lCurSampleRight, lPrevSampleLeft, lPrevSampleRight have the same meaning as above, but note that these values are SHORTs.

So, use the mono decoder for each channel's data and then create a normal LRLR... stereo waveform before outputting.

Such (newer) files may be distinguished from the others by the presence of a 0x80 section in the PT header (the value stored in the section is 0x01 for such files). Some of these files also do not contain the compression type (0x83) section in their PT header.

"SCLl": loop block. This block defines the looping point for the song. It contains only a DWORD value, which is the looping jump position (in samples) relative to the start of the song. You should make the jump as soon as you encounter this block.

"SCEl": end block. This block indicates the end of the audio stream.

Note that in some games audio files are contained within game resources. As a rule, such resources are not compressed/encrypted, so you may just search for the ASF file signature (e.g. "SCHl") and this will mark the beginning of the audio stream, while the "SCEl" block marks the end of that stream.

==================================== 2.
EA ADPCM Decompression Algorithm
====================================

During the decompression four LONG variables must be maintained for stereo
stream: lCurSampleLeft, lCurSampleRight, lPrevSampleLeft, lPrevSampleRight
and two -- for mono stream: lCurSample, lPrevSample. At the beginning of each
"SCDl" data block you must initialize these variables using the values in
ASFChunkHeader. Note that LONG here is signed.

Here's the code which decompresses one "SCDl" block of EA ADPCM compressed
stereo stream.

BYTE  InputBuffer[InputBufferSize]; // buffer containing audio data of "SCDl" block
BYTE  bInput;
DWORD dwOutSize;                    // outsize value from the ASFChunkHeader
DWORD i, bCount, sCount;
LONG  c1left,c2left,c1right,c2right,left,right;
BYTE  dleft,dright;

DWORD dwSubOutSize=0x1c;

i=0;

// process integral number of (dwSubOutSize) samples
for (bCount=0;bCount<(dwOutSize/dwSubOutSize);bCount++)
{
  bInput=InputBuffer[i++];
  c1left=EATable[HINIBBLE(bInput)];   // predictor coeffs for left channel
  c2left=EATable[HINIBBLE(bInput)+4];
  c1right=EATable[LONIBBLE(bInput)];  // predictor coeffs for right channel
  c2right=EATable[LONIBBLE(bInput)+4];
  bInput=InputBuffer[i++];
  dleft=HINIBBLE(bInput)+8;   // shift value for left channel
  dright=LONIBBLE(bInput)+8;  // shift value for right channel
  for (sCount=0;sCount<dwSubOutSize;sCount++)
  {
    bInput=InputBuffer[i++];
    left=HINIBBLE(bInput);  // HIGHER nibble for left channel
    right=LONIBBLE(bInput); // LOWER nibble for right channel
    left=(left<<0x1c)>>dleft;
    right=(right<<0x1c)>>dright;
    left=(left+lCurSampleLeft*c1left+lPrevSampleLeft*c2left+0x80)>>8;
    right=(right+lCurSampleRight*c1right+lPrevSampleRight*c2right+0x80)>>8;
    left=Clip16BitSample(left);
    right=Clip16BitSample(right);
    lPrevSampleLeft=lCurSampleLeft;
    lCurSampleLeft=left;
    lPrevSampleRight=lCurSampleRight;
    lCurSampleRight=right;

    // Now we've got lCurSampleLeft and lCurSampleRight which form one stereo
    // sample and all is set for the next input byte...
    Output((SHORT)lCurSampleLeft,(SHORT)lCurSampleRight); // send the sample to output
  }
}

// process the rest (if any)
if ((dwOutSize % dwSubOutSize) != 0)
{
  bInput=InputBuffer[i++];
  c1left=EATable[HINIBBLE(bInput)];   // predictor coeffs for left channel
  c2left=EATable[HINIBBLE(bInput)+4];
  c1right=EATable[LONIBBLE(bInput)];  // predictor coeffs for right channel
  c2right=EATable[LONIBBLE(bInput)+4];
  bInput=InputBuffer[i++];
  dleft=HINIBBLE(bInput)+8;   // shift value for left channel
  dright=LONIBBLE(bInput)+8;  // shift value for right channel
  for (sCount=0;sCount<(dwOutSize % dwSubOutSize);sCount++)
  {
    bInput=InputBuffer[i++];
    left=HINIBBLE(bInput);  // HIGHER nibble for left channel
    right=LONIBBLE(bInput); // LOWER nibble for right channel
    left=(left<<0x1c)>>dleft;
    right=(right<<0x1c)>>dright;
    left=(left+lCurSampleLeft*c1left+lPrevSampleLeft*c2left+0x80)>>8;
    right=(right+lCurSampleRight*c1right+lPrevSampleRight*c2right+0x80)>>8;
    left=Clip16BitSample(left);
    right=Clip16BitSample(right);
    lPrevSampleLeft=lCurSampleLeft;
    lCurSampleLeft=left;
    lPrevSampleRight=lCurSampleRight;
    lCurSampleRight=right;

    // Now we've got lCurSampleLeft and lCurSampleRight which form one stereo
    // sample and all is set for the next input byte...
    Output((SHORT)lCurSampleLeft,(SHORT)lCurSampleRight); // send the sample to output
  }
}

HINIBBLE and LONIBBLE are higher and lower 4-bit nibbles:

#define HINIBBLE(byte) ((byte) >> 4)
#define LONIBBLE(byte) ((byte) & 0x0F)

Note that depending on your compiler you may need to use additional nibble
separation in these defines, e.g. (((byte) >> 4) & 0x0F).

EATable is the table given in the next section of this document.

Output() is just a placeholder for any action you would like to perform for
decompressed sample value.
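The expression (left<<0x1c)>>dleft in the listing above works only when LONG is exactly 32 bits and a right shift of a negative value is arithmetic. As a hedged sketch, the per-sample step can be written with fixed-width types so that it does not depend on either property. The helper names (ea_sext4, ea_floor_shr8, ea_clip16, ea_decode_sample) are hypothetical and not from the original code. The arithmetic is meant to match the listing: shifting the nibble to the top of a 32-bit word and back down by d is the same as multiplying the sign-extended nibble by 2^(28-d).

```c
#include <stdint.h>

/* Sign-extend a 4-bit value (0..15) to the range -8..7. */
static int32_t ea_sext4(uint32_t nibble)
{
    return (int32_t)(nibble & 0x0F) - (int32_t)((nibble & 0x08) << 1);
}

/* floor(x / 256) without relying on arithmetic right shift of negatives. */
static int32_t ea_floor_shr8(int32_t x)
{
    return x >= 0 ? x >> 8 : -((-x + 255) >> 8);
}

static int32_t ea_clip16(int32_t s)
{
    if (s > 32767)  return 32767;
    if (s < -32768) return -32768;
    return s;
}

/* Decodes one sample.
   nibble : the 4-bit code from the input byte
   shift  : the value called dleft/dright in the listing (raw nibble + 8,
            so it must be in the range 8..23)
   c1, c2 : predictor coefficients from EATable
   cur, prev : the two previous output samples for this channel */
static int32_t ea_decode_sample(uint32_t nibble, int32_t shift,
                                int32_t c1, int32_t c2,
                                int32_t cur, int32_t prev)
{
    /* Same value as ((LONG)nibble << 28) >> shift on a 32-bit LONG. */
    int32_t delta = ea_sext4(nibble) * (int32_t)(1L << (28 - shift));
    return ea_clip16(ea_floor_shr8(delta + cur * c1 + prev * c2 + 0x80));
}
```

After each call the caller still has to move cur into prev and store the result as the new cur, exactly as the listing does.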
Clip16BitSample is quite evident:

LONG Clip16BitSample(LONG sample)
{
  if (sample>32767)
     return 32767;
  else if (sample<-32768)
     return (-32768);
  else
     return sample;
}

As to mono sound, it's just analogous: dwSubOutSize=0x0E for mono and you
should get predictor coeffs and shift from one byte:

bInput=InputBuffer[i++];
c1=EATable[HINIBBLE(bInput)];   // predictor coeffs
c2=EATable[HINIBBLE(bInput)+4];
d=LONIBBLE(bInput)+8;           // shift value

And also you should process HIGHER nibble of the input byte first and then
LOWER nibble for mono sound.

Of course, this decompression routine may be greatly optimized.

==================
3. EA ADPCM Table
==================

LONG EATable[]=
{
  0x00000000, 0x000000F0, 0x000001CC, 0x00000188,
  0x00000000, 0x00000000, 0xFFFFFF30, 0xFFFFFF24,
  0x00000000, 0x00000001, 0x00000003, 0x00000004,
  0x00000007, 0x00000008, 0x0000000A, 0x0000000B,
  0x00000000, 0xFFFFFFFF, 0xFFFFFFFD, 0xFFFFFFFC
};

==================================================
4. .WVE/.DCT/.MAD/.TGQ/.UV/.UV2 Movie Soundtracks
==================================================

.WVE/.DCT/.MAD/.TGQ/.UV/.UV2 movies have the block structure analogous to
that of .ASF. Video-related data is in "pIQT", "mTCD", "MADk", "MADm", "MADe",
"pQGT", etc. blocks and sound-related data is just in the same blocks as in
.ASF: "SCHl", "SCCl", "SCDl", "SCLl", "SCEl". So, to play
.WVE/.DCT/.MAD/.TGQ/.UV/.UV2 movie soundtrack, just walk blocks chain, skip
video blocks and process sound blocks.

Note that in some games video files (as well as audio files) are contained
within game resources. As a rule, such resources are not compressed/encrypted,
so you may just search for ASF file signature (e.g. "SCHl") and this will mark
the beginning of audio stream, while "SCEl" block marks the end of that stream.

===================
5. MUS Music Files
===================

Interactive music is in .MUS files.
These have the same block structure as .ASFs with two important differences: 1) MUS file may contain several "SCHl" header blocks. 2) Each "SCHl" header block starts at the position which is a multiple of 4. That is, if you've read the "SCEl" end block and your current file position is, say, dwCurPos, do the following: if ((dwCurPos % 4) == 0) just read the next block, otherwise skip (4 - (dwCurPos % 4)) bytes and then read the next block. If you walk the block chain of a .MUS file, you'll get the block sequence like this: SCHl, SCCl, SCDl, ..., SCEl, SCHl, SCCl, SCDl, ..., SCEl, .... That is, a MUS file is a kind of collection of ASF files, each ASF file beginning being aligned on DWORD boundary. Each ASF file starts with "SCHl" block and ends with "SCEl" block. Further I'll refer to such ASFs in .MUS as "MUS sections". Each MUS section contains a part of song. If you try to play these parts consecutively as they appear in .MUS you will not get right song playback for most .MUS files. To play .MUS in the right sequence you'll need either .LIN or .MAP file (with the same name) which should be found in the same directory as the .MUS on Electronic Arts game CD. While in NFS 2 almost all .MUSes have the correspondent .ASFs which are used for non-interactive playback, in NFS 3 all songs are .MUSes and to play them you'll need to use correspondent .LIN file (for some songs -- .MAP file). ============================================= 6. .LIN/.MAP Files and Correct .MUS Playback ============================================= .LIN/.MAP files which should be found in the same directory as .MUSes define the interactive and non-interactive ("normal") playback sequences. Typically, .LINs define normal (non-interactive) and .MAPs define interactive sequences. Some .MAPs define normal sequence. Both .LINs and .MAPs have the same structure, which I'll describe here. Each .LIN or .MAP corresponds to the .MUS with the same name: e.g. 
CREDITS.MAP corresponds to CREDITS.MUS and EMPRROCK.LIN -- to EMPRROCK.MUS. .LIN/.MAP file has the following header: struct MAPHeader { char szID[4]; BYTE bUnknown1; BYTE bFirstSection; BYTE bNumSections; BYTE bRecordSize; // ??? BYTE Unknown2[3]; BYTE bNumRecords; }; szID -- string ID, always "PFDx". bFirstSection -- index (zero-based) of the first MUS section to be played. Hereafter by "index of .MUS section" I mean the number which identifies the section in .MUS file: index 0 corresponds to the first section, 1 -- to the second, etc. That is, the section index is zero-based. bNumSections -- number of sections in the correspondent MUS file. bRecordSize -- size of record, array of which follows the table of section definitions in .LIN/.MAP file. More about this later. bNumRecords -- number of records in the array mentioned above. Following the header, comes the table of (bNumSections) definitions for each section of .MUS. Each definition describes the correspondent .MUS section: the first describe first .MUS section, the second describes second .MUS section, etc. Each definition has the following format: struct MAPSectionDef { BYTE bIndex; BYTE bNumRecords; BYTE szID[2]; struct MAPSectionDefRecord msdRecords[8]; }; bIndex -- ??? not necessary for non-interactive playback. bNumRecords -- number of MAPSectionDefRecords used (of 8) in msdRecords[]. Used are msdRecords[0], ..., msdRecords[bNumRecords-1], others are zeroed. For .LINs/.MAPs, defining non-interactive playback sequence, it seems that (bNumRecords) is always 1, that is, only the first MAPSectionDefRecord is used and should be used for playback sequence. If (bNumRecords) is zero, this means that the section described by the MAPSectionDef is the final in playback sequence and there's no next section for it. szID -- ID, seems to be always "\xFF\xFF". Not necessary for non-interactive playback. 
msdRecords -- array of 8 records (used are only first (bNumRecords)), each record having the following format: struct MAPSectionDefRecord { BYTE bUnknown; BYTE bMagic; BYTE bNextSection; }; bMagic -- seems to be 0x64 for the records defining non-interactive playback. But, maybe, not necessarily. Just ignore that. bNextSection -- index (zero-based) of the next section in the .MUS playback sequence. The section with the index (bNextSection) should be played after the section which is described by this MAPSectionDef. More about the .MUS playback later. After the table of .MUS section definitions comes the array of (MAPHeader.bNumRecords) seemingly useless records each record having the size (MAPHeader.bRecordSize). I've got some doubts about my treatment of (MAPHeader.bRecordSize) field, so it seems to be safer to use 0x10 as the record size. Just skip this array. It's of no use for non-interactive playback. After that array comes the final part of .LIN/.MAP -- the array of DWORDs which are just the starting positions of .MUS sections (that is, positions for "SCHl" blocks describing the correspondent sections). Important note: these DWORDs are stored using big-endian byte order! That means that the four bytes in the file, e.g., 0x12 0x34 0x56 0x78 constitute the DWORD value 0x12345678 and NOT 0x78563412 (as it's treated by Intel processors). These starting positions are relative to the .MUS file beginning. Now, when we know the structure of .LIN/.MAP files, I'll describe how they should be used for non-interactive .MUS playback. First, read the .LIN/.MAP header. This gives you the index of first section in playback sequence (MAPHeader.bFirstSection). 
Then get the starting position of this section from the positions table: fseek(mapfile,sizeof(MAPHeader)+MAPHeader.bNumSections*sizeof(MAPSectionDef)+ MAPHeader.bNumRecords*MAPHeader.bRecordSize+index*sizeof(DWORD),SEEK_SET); fread(&dwStart,sizeof(DWORD),1,mapfile); Invert byte order in dwStart: dwStart=SWAPDWORD(dwStart), where #define SWAPDWORD(x) ((((x)&0xFF)<<24)+(((x)>>24)&0xFF)+(((x)>>8)&0xFF00)+(((x)<<8)&0xFF0000)) Now you've got correct dwStart and just set the file pointer in .MUS file to that to get to the section start. Read the section's "SCHl" header and further blocks and play the section. Then get to this section's definition structure, for example, using the code like this: fseek(mapfile,sizeof(MAPHeader)+index*sizeof(MAPSectionDef),SEEK_SET); Read the section definition: fread(&secdef,sizeof(MAPSectionDef),1,mapfile); Now (secdef.msdRecords[secdef.bNumRecords-1].bNextSection) is the next section to play back. Get its starting position from the table, etc. Repeat this procedure until you come across either a section you've already played or the section definition with zero (bNumRecords). In the former case you may loop the song or just stop playback. In the latter case you should just stop playback. Some final words about .MUS/.ASF/.LIN/.MAP files... When to play .MUS file using .LIN or .MAP and what to use: .LIN or .MAP ? If along with the .MUS file there's an .ASF file with same name, play the .ASF file -- it should be used for non-interactive playback. If there's no .ASF file with the same name as .MUS, but along with the .MUS there's a .LIN file with the same name as .MUS, play .MUS file using that .LIN file. If there's no .LIN or .ASF file correspondent to .MUS file, but there's a .MAP file with the same name, play the .MUS file using that .MAP. And finally, if there's none of .ASF, .LIN or .MAP file for .MUS, it's an error. You may try to play that .MUS section-by-section or use playback sequence of your choice. 
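The playback procedure above can be sketched on a .LIN/.MAP file that has already been read into memory. This is only an illustration under stated assumptions. The structures are taken to be byte-packed (12 bytes for MAPHeader, 28 for MAPSectionDef), the record size comes from MAPHeader.bRecordSize (substitute 0x10 if you share the doubts mentioned above), and the next section is taken from the last used record, as in the fseek/fread description. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum { MAP_HEADER_SIZE = 12, MAP_SECTION_DEF_SIZE = 28 };

/* The positions table is big-endian. */
static uint32_t rd_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Start position of .MUS section `index`; returns 0 on success, -1 on error. */
static int map_section_start(const uint8_t *map, size_t len,
                             unsigned index, uint32_t *start)
{
    size_t pos;

    if (len < MAP_HEADER_SIZE || index >= map[6])   /* map[6] = bNumSections */
        return -1;
    pos = MAP_HEADER_SIZE
        + (size_t)map[6] * MAP_SECTION_DEF_SIZE
        + (size_t)map[11] * map[7]        /* bNumRecords * bRecordSize */
        + (size_t)index * 4;
    if (pos + 4 > len)
        return -1;
    *start = rd_be32(map + pos);
    return 0;
}

/* Index of the section that follows `index`, or -1 if it is final/invalid. */
static int map_next_section(const uint8_t *map, size_t len, unsigned index)
{
    size_t def;
    unsigned used;

    if (len < MAP_HEADER_SIZE || index >= map[6])
        return -1;
    def = MAP_HEADER_SIZE + (size_t)index * MAP_SECTION_DEF_SIZE;
    if (def + MAP_SECTION_DEF_SIZE > len)
        return -1;
    used = map[def + 1];                  /* MAPSectionDef.bNumRecords */
    if (used == 0 || used > 8)
        return -1;
    /* msdRecords start at def+4; each is 3 bytes, bNextSection is the third. */
    return map[def + 4 + (used - 1) * 3 + 2];
}

/* Fills order[] with the playback sequence, stopping at the final section
   or at the first section that was already played. Returns the count. */
static size_t map_playback_order(const uint8_t *map, size_t len,
                                 uint8_t *order, size_t max)
{
    uint8_t seen[256] = {0};
    size_t n = 0;
    int cur;

    if (len < MAP_HEADER_SIZE)
        return 0;
    cur = map[5];                         /* MAPHeader.bFirstSection */
    while (cur >= 0 && cur < map[6] && !seen[cur] && n < max) {
        seen[cur] = 1;
        order[n++] = (uint8_t)cur;
        cur = map_next_section(map, len, (unsigned)cur);
    }
    return n;
}
```

For each index in order[], map_section_start() gives the position to seek to in the .MUS file before reading that section's "SCHl" block.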
==================================== 7. Sound Effects in .BNK/.VIV Files ==================================== Most of sound effects and speech files (and sometimes ASF music files) are stored in .BNK and .VIV resource files. The .BNK file may contain several sounds. BNKs of older version have the following header: struct OldBNKHeader { char szID[4]; WORD wVersion; WORD wNumberOfSounds; DWORD dwFirstSoundStart; DWORD dwSoundsArray[wNumberOfSounds]; }; For the newer BNK files the header is: struct NewBNKHeader { char szID[4]; WORD wVersion; WORD wNumberOfSounds; DWORD dwFirstSoundStart; DWORD dwSoundSize; // = total filesize - dwFirstSoundStart DWORD dwUnknown; // seems to contain small number <20 or -1 DWORD dwSoundsArray[wNumberOfSounds]; }; szID -- string ID, always "BNKl". wVersion -- for old version this is 0x0002, for new version -- 0x0004. wNumberOfSounds -- number of sounds stored in .BNK file. dwFirstSoundStart -- the starting position of the first sound audio data relative the BNK file beginning. There's no real use of this... dwSoundsArray -- the array of (wNumberOfSounds) DWORDs. Each of these is the shift to the PT header describing the separate sound in .BNK relative to the starting position of this DWORD. That is, if such DWORD (dwShift) starts at the position (dwShiftPos) (relative to the start of .BNK), the correspondent PT header starts at the position: dwPTHeaderPos=dwShiftPos+dwShift. Note that some DWORDs in this array are zeroes that means they correspond to no sound. Remember that PT header starts with the "PT\0\0" signature. So, (dwSoundsArray) points to a number of PT headers in .BNK, which follow the BNK header. Each of these PT headers describe a separate sound in .BNK. Refer to the .ASF file description for details on dealing with PT headers. Note that some PT headers do not contain (dwChannels), (dwSampleRate), (dwCompression) data. I use the default value if it's omitted in the header: mono, 22050 Hz, unknown compression. 
In any case, PT header for .BNK sound should contain values for (dwNumSamples) and (dwDataStart). (dwDataStart) is the starting position of sound data relative to the start of .BNK file. Sound data itself has no additional headers and in case of EA ADPCM compression (dwCompression==0x07) should be decoded just like "SCDl" block data (following ASFChunkHeader). As to the size of the sound data, just use (dwNumSamples) and stop playback of the sound when it's exhausted. As to .VIV files these seem to be multi-data resources. In particular, they can contain .BNK/.ASF files. So, if you want to play sounds from a .BNK file contained within .VIV, just search .VIV for "BNKl" string ID and that will be just the .BNK file described above. Note that all (dwDataShifts) given in PT headers in .BNK are always positions relative to the start of .BNK file, that is, if .BNK is in .VIV, they will be relative to the start of "BNKl" signature you found in .VIV. To play .ASF file from .VIV you may just search for "SCHl" string ID and that'll mark the beginning of .ASF file, while the end will be marked by "SCEl" block. =========== 8. Credits =========== Dmitry Kirnocenskij (ejt@mail.ru) Worked out EA ADPCM decompression algorithm. Toni Wilen (nhlinfo@nhl-online.com) http://www.nhl-online.com/nhlinfo/ Provided me with info on new SCDl structure, new BNK version header, PATl and TMpl headers. Toni Wilen is the author of SNDVIEW utility (available on his pages) which decompresses Electronic Arts audio files and compresses WAVs back into EA formats. Jesper Juul-Mortensen (jjm@danbbs.dk, ICQ#43452941) http://www.danbbs.dk/~jjm http://nfstoolbox.homepage.dk http://nfscheats.com/nfstoolbox Additional info on PT header block types. The author of utilities for NFS'x. ------------------------------------------- Valery V. 
Anisimovsky (samael@avn.mccme.ru)
http://www.anxsoft.newmail.ru
http://anx.da.ru
On these sites you can find my GAP program which can search for audio files in
.BNK/.VIV resources, and play back .ASF/.MUS/.STR songs, some .BNK/.VIV sounds
and soundtracks of .WVE/.DCT/.MAD/.TGQ/.UV/.UV2 movies. There's also complete
source code of GAP and all its plug-ins there, including MUS/ASF plug-in, which
could be used for further details on how you can deal with these formats.

Autodesk Animator files explanation (.FLI only excerpted). I believe that the
original programmer wrote up this doc. It's correct, as I've used the info to
realtime playback stock .FLIs on a 680x0 machine. All numbers in a .FLI file
are in Intel format, so you may have to compensate for that, of course.
- kevin

8.1 Flic Files (.FLI)

The details of a FLI file are moderately complex, but the idea behind it is
simple: don't bother storing the parts of a frame that are the same as the
last frame. Not only does this save space, but it's very quick. It's faster
to leave a pixel alone than to set it.

A FLI file has a 128-byte header followed by a sequence of frames. The first
frame is compressed using a bytewise run-length compression scheme.
Subsequent frames are stored as the difference from the previous frame.
(Occasionally the first frame and/or subsequent frames are uncompressed.)
There is one extra frame at the end of a FLI which contains the difference
between the last frame and the first frame.

The FLI header:

byte
offset  size  name    meaning
   0      4   size    Length of file, for programs that want to read the
                      FLI all at once if possible.
   4      2   magic   Set to hex AF11. Please use another value here if you
                      change format (even to a different resolution) so
                      Autodesk Animator won't crash trying to read it.
   6      2   frames  Number of frames in FLI. FLI files have a maximum
                      length of 4000 frames.
   8      2   width   Screen width (320).
  10      2   height  Screen height (200).
  12      2   depth   Bits per pixel (always 8).
  14      2   flags   Must be 0.
  16      2   speed   Number of video ticks between frames.
  18      4   next    Set to 0.
  22      4   frit    Set to 0.
  26    102   expand  All zeroes -- for future enhancement.

Next are the frames, each of which has a header:

byte
offset  size  name    meaning
   0      4   size    Bytes in this frame. Autodesk Animator demands that
                      this be less than 64K.
   4      2   magic   Always hexadecimal F1FA
   6      2   chunks  Number of 'chunks' in frame.
   8      8   expand  Space for future enhancements. All zeros.

After the frame header come the chunks that make up the frame. First comes a
color chunk if the color map has changed from the last frame. Then comes a
pixel chunk if the pixels have changed. If the frame is absolutely identical
to the last frame there will be no chunks at all.

A chunk itself has a header, followed by the data. The chunk header is:

byte
offset  size  name    meaning
   0      4   size    Bytes in this chunk.
   4      2   type    Type of chunk (see below).

There are currently five types of chunks you'll see in a FLI file:

number  name       meaning
  11    FLI_COLOR  Compressed color map
  12    FLI_LC     Line compressed -- the most common type of compression
                   for any but the first frame. Describes the pixel
                   difference from the previous frame.
  13    FLI_BLACK  Set whole screen to color 0 (only occurs on the first
                   frame).
  15    FLI_BRUN   Bytewise run-length compression -- first frame only
  16    FLI_COPY   Indicates uncompressed 64000 bytes soon to follow. For
                   those times when compression just doesn't work!

The compression schemes are all byte-oriented. If the compressed data ends up
being an odd length a single pad byte is inserted so that the FLI_COPY's
always start at an even address for faster DMA.

FLI_COLOR Chunks

The first word is the number of packets in this chunk. This is followed
directly by the packets. The first byte of a packet says how many colors to
skip. The next byte says how many colors to change. If this byte is zero it
is interpreted to mean 256. Next follows 3 bytes for each color to change
(one each for red, green and blue).

FLI_LC Chunks

This is the most common, and alas, most complex chunk.
The first word (16 bits) is the number of lines starting from the top of the
screen that are the same as the previous frame. (For example, if there is
motion only on the bottom line of screen you'd have a 199 here.) The next
word is the number of lines that do change. Next there is the data for the
changing lines themselves. Each line is compressed individually; among other
things this makes it much easier to play back the FLI at a reduced size.

The first byte of a compressed line is the number of packets in this line. If
the line is unchanged from the last frame this is zero. The format of an
individual packet is:

skip_count size_count data

The skip count is a single byte. If more than 255 pixels are to be skipped it
must be broken into 2 packets. The size count is also a byte. If it is
positive, that many bytes of data follow and are to be copied to the screen.
If it's negative a single byte follows, and is repeated -size_count times.

In the worst case a FLI_LC frame can be about 70K. If it comes out to be
60000 bytes or more Autodesk Animator decides compression isn't worthwhile
and saves the frame as FLI_COPY.

FLI_BLACK Chunks

These are very simple. There is no data associated with them at all. In fact
they are only generated for the first frame in Autodesk Animator after the
user selects NEW under the FLIC menu.

FLI_BRUN Chunks

These are much like FLI_LC chunks without the skips. They start immediately
with the data for the first line, and go line-by-line from there. The first
byte contains the number of packets in that line. The format for a packet is:

size_count data

If size_count is positive the data consists of a single byte which is
repeated size_count times. If size_count is negative there are -size_count
bytes of data which are copied to the screen. In Autodesk Animator if the
"compressed" data shows signs of exceeding 60000 bytes the frame is stored as
FLI_COPY instead.

FLI_COPY Chunks

These are 64000 bytes of data for direct reading onto the screen.
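Since FLI_LC is the chunk most players get wrong, here is a minimal sketch of applying one FLI_LC chunk to a 320x200 byte-per-pixel screen buffer. It takes the chunk payload (the bytes after the 6-byte chunk header), reads the two leading words as little-endian, and treats a negative size count as a run of -size_count copies of the next byte. The function name fli_apply_lc and the bounds-checking policy are my own assumptions, not part of the Autodesk description.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { FLI_WIDTH = 320, FLI_HEIGHT = 200 };

/* Applies a FLI_LC payload to `screen` (FLI_WIDTH * FLI_HEIGHT bytes holding
   the previous frame). Returns 0 on success, -1 if the data is malformed. */
static int fli_apply_lc(const uint8_t *data, size_t len, uint8_t *screen)
{
    size_t pos = 4;
    unsigned first, count, y;

    if (len < 4)
        return -1;
    first = data[0] | ((unsigned)data[1] << 8);   /* unchanged lines at top */
    count = data[2] | ((unsigned)data[3] << 8);   /* lines that do change   */
    if (first + count > FLI_HEIGHT)
        return -1;

    for (y = first; y < first + count; y++) {
        uint8_t *row = screen + (size_t)y * FLI_WIDTH;
        unsigned x = 0;
        unsigned packets;

        if (pos >= len)
            return -1;
        packets = data[pos++];
        while (packets--) {
            int size;

            if (pos + 2 > len)
                return -1;
            x += data[pos++];                 /* skip_count (unsigned byte) */
            size = data[pos++];               /* size_count (signed byte)   */
            if (size >= 128)
                size -= 256;

            if (size >= 0) {                  /* literal bytes */
                if (x + (unsigned)size > FLI_WIDTH || pos + (size_t)size > len)
                    return -1;
                memcpy(row + x, data + pos, (size_t)size);
                pos += (size_t)size;
                x += (unsigned)size;
            } else {                          /* run of one repeated byte */
                unsigned run = (unsigned)(-size);
                if (x + run > FLI_WIDTH || pos >= len)
                    return -1;
                memset(row + x, data[pos++], run);
                x += run;
            }
        }
    }
    return 0;
}
```

A FLI_BRUN decoder has the same shape without the skip byte and with the meaning of the sign reversed (positive is a run, negative is literal data).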
-eof- Notes: Since these are animations, the last frame will delta into a copy of the first one (which was usually a large BRUN chunk). Therefore, looping should go back to the _second_ frame chunk (usually a LC or COLOR chunk) instead of all the way back to the file beginning, to avoid a "stutter" caused by unnecessarily redecoding the original frame. Also, a very few files may have palette animation, so write your code so that COLOR chunks can be found at any time. - kevin Credits: Lars Hamre, Norman Lin, Mark Cox, Peter Hanning, Steinar Midtskogen, Marc Espie, and Thomas Meyer (All numbers below are given in decimal) 3rd Revision Module Format: # Bytes Description ------- ----------- 20 The module's title, padded with null (\0) bytes. Original Protracker wrote letters only in uppercase. (Data repeated for each sample 1-15 or 1-31) 22 Sample's name, padded with null bytes. If a name begins with a '#', it is assumed not to be an instrument name, and is probably a message. 2 Sample length in words (1 word = 2 bytes). The first word of the sample is overwritten by the tracker, so a length of 1 still means an empty sample. See below for sample format. 1 Lowest four bits represent a signed nibble (-8..7) which is the finetune value for the sample. Each finetune step changes the note 1/8th of a semitone. Implemented by switching to a different table of period-values for each finetune value. 1 Volume of sample. Legal values are 0..64. Volume is the linear difference between sound intensities. 64 is full volume, and the change in decibels can be calculated with 20*log10(Vol/64) 2 Start of sample repeat offset in words. Once the sample has been played all of the way through, it will loop if the repeat length is greater than one. It repeats by jumping to this position in the sample and playing for the repeat length, then jumping back to this position, and playing for the repeat length, etc. 2 Length of sample repeat in words. Only loop if greater than 1. 
(End of this sample's data.. each sample uses the same format and they are stored sequentially) N.B. All 2 byte lengths are stored with the Hi-byte first, as is usual on the Amiga (big-endian format). 1 Number of song positions (ie. number of patterns played throughout the song). Legal values are 1..128. 1 Historically set to 127, but can be safely ignored. Noisetracker uses this byte to indicate restart position - this has been made redundant by the 'Position Jump' effect. 128 Pattern table: patterns to play in each song position (0..127) Each byte has a legal value of 0..63 (note the Protracker exception below). The highest value in this table is the highest pattern stored, no patterns above this value are stored. (4) The four letters "M.K." These are the initials of Unknown/D.O.C. who changed the format so it could handle 31 samples (sorry.. they were not inserted by Mahoney & Kaktus). Startrekker puts "FLT4" or "FLT8" here to indicate the # of channels. If there are more than 64 patterns, Protracker will put "M!K!" here. You might also find: "6CHN" or "8CHN" which indicate 6 or 8 channels respectively. If no letters are here, then this is the start of the pattern data, and only 15 samples were present. (Data repeated for each pattern:) 1024 Pattern data for each pattern (starting at 0). (Each pattern has same format and is stored in numerical order. See below for pattern format) (Data repeated for each sample:) xxxxxx The maximum size of a sample is 65535 words. Each sample is stored as a collection of bytes (length of a sample was given previously in the module). Each byte is a signed value (-128 ..127) which is the channel data. When a sample is played at a pitch of C2 (see below for pitches), about 8287 bytes of sample data are sent to the channel per second. Multiply the rate by the twelfth root of 2 (=1.0595) for each semitone increase in pitch eg. moving the pitch up 1 octave doubles the rate. The data is stored in the order it is played (eg. 
first byte is first byte played). The first word of the sample data is used to hold repeat information, and will overwrite any sample data that is there (but it is probably safer to set it to 0). The rate given above (8287) conveys an inaccurate picture of the module-format - in reality it is different for different Amigas. As the routines for playing were written to run off certain interrupts, for different Amiga computers the rate to send data to the channel will be different. For PAL machines the clock rate is 7093789.2 Hz and for NTSC machines it is 7159090.5 Hz. When the clock rate is divided by twice the period number for the pitch it will give the rate to send the data to the channel, eg. for a PAL machine sending a note at C2 (period 428), the rate is 7093789.2/856 ~= 8287.1369 (Each sample is stored sequentially) Pattern Format: Each pattern is divided into 64 divisions. By allocating different tempos for each pattern and spacing the notes across different amounts of divisions, different bar sizes can be accommodated. Each division contains the data for each channel (1..4) stored after each other. Channels 1 and 4 are on the left, and channels 2 and 3 are on the right. In the case of more channels: channels 5 and 8 are on the left, and channels 6 and 7 are on the right, etc. Each channel's data in the division has an identical format which consists of 2 words (4 bytes). Divisions are numbered 0..63. Each division may be divided into a number of ticks (see 'set speed' effect below). 
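The period-to-rate rule above (clock rate divided by twice the period) is small enough to show directly. This is just a sketch: the function and macro names are made up for illustration, and the two clock constants are the PAL and NTSC values quoted in the text.

```c
/* Clock rates quoted in the text, in Hz. */
#define MOD_CLOCK_PAL  7093789.2
#define MOD_CLOCK_NTSC 7159090.5

/* Bytes of sample data sent to the channel per second for a given period.
   Returns 0.0 for period 0, which has no defined rate. */
static double mod_playback_rate(double clock_hz, unsigned period)
{
    if (period == 0)
        return 0.0;
    return clock_hz / (2.0 * (double)period);
}
```

For example, mod_playback_rate(MOD_CLOCK_PAL, 428) gives the C2 rate of about 8287.14 mentioned above, and halving the period (214, one octave up) doubles it.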
Channel Data: (the four bytes of channel data in a pattern division)

 7654-3210 7654-3210 7654-3210 7654-3210
 wwww xxxxxxxxxxxxxx yyyy zzzzzzzzzzzzzz

     wwwwyyyy (8 bits) is the sample for this channel/division
 xxxxxxxxxxxx (12 bits) is the sample's period (or effect parameter)
 zzzzzzzzzzzz (12 bits) is the effect for this channel/division

If there is to be no new sample to be played at this division on this
channel, then the old sample on this channel will continue, or at least be
"remembered" for any effects. If the sample is 0, then the previous sample on
that channel is used. Only one sample may play on a channel at a time, so
playing a new sample will cancel an old one - even if there has been no data
supplied for the new sample. Though, if you are using a "silence" sample (ie.
no data, only used to turn off other samples) it is polite to set its default
volume to 0.

To determine what pitch the sample is to be played on, look up the period in
a table, such as the one below (for finetune 0). If the period is 0, then the
previous period on that channel is used. Unfortunately, some modules do not
use these exact values. It is best to do a binary-search (unless you use the
period as the offset of an array of notes.. expensive), especially if you
plan to use periods outside the "standard" range. Periods are the internal
representation of the pitch, so effects that alter pitch (eg. sliding) alter
the period value (see "effects" below).
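The bit layout of the four channel-data bytes can be unpacked as in the sketch below. The ModNote type and the function name are hypothetical. The field split follows the wwww/xxxx/yyyy/zzzz description: the sample number is built from the high nibbles of the first and third bytes, and period and effect are 12 bits each.

```c
#include <stdint.h>

typedef struct {
    unsigned sample;   /* wwwwyyyy: 0 means "use previous sample" */
    unsigned period;   /* 12 bits:  0 means "use previous period" */
    unsigned effect;   /* 12 bits:  three nibbles [e][x][y]       */
} ModNote;

/* Unpacks the 4 bytes of one channel in one pattern division. */
static ModNote mod_unpack_note(const uint8_t b[4])
{
    ModNote n;
    n.sample = (unsigned)(b[0] & 0xF0) | (unsigned)(b[2] >> 4);
    n.period = ((unsigned)(b[0] & 0x0F) << 8) | b[1];
    n.effect = ((unsigned)(b[2] & 0x0F) << 8) | b[3];
    return n;
}
```

The three effect nibbles are then effect >> 8, (effect >> 4) & 15 and effect & 15.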
          C    C#   D    D#   E    F    F#   G    G#   A    A#   B
Octave 1: 856, 808, 762, 720, 678, 640, 604, 570, 538, 508, 480, 453
Octave 2: 428, 404, 381, 360, 339, 320, 302, 285, 269, 254, 240, 226
Octave 3: 214, 202, 190, 180, 170, 160, 151, 143, 135, 127, 120, 113

Octave 0:1712,1616,1525,1440,1357,1281,1209,1141,1077,1017, 961, 907
Octave 4: 107, 101,  95,  90,  85,  80,  76,  71,  67,  64,  60,  57

Octaves 0 and 4 are NOT standard, so don't rely on every tracker being able
to play them, or even not crashing if being given them - it's just nice that
if you can code it, to allow them to be read.

Effects:

Effects are written as groups of 4 bits, eg. 1871 = 7 * 256 + 4 * 16 + 15 =
[7][4][15]. The high nibble (4 bits) usually determines the effect, but if it
is [14], then the second nibble is used as well.

[0]: Arpeggio
Where [0][x][y] means "play note, note+x semitones, note+y semitones, then
return to original note". The fluctuations are carried out evenly spaced in
one pattern division. They are usually used to simulate chords, but this
doesn't work too well. They are also used to produce heavy vibrato. A major
chord is when x=4, y=7. A minor chord is when x=3, y=7.

[1]: Slide up
Where [1][x][y] means "smoothly decrease the period of current sample by
x*16+y after each tick in the division". The ticks/division are set with the
'set speed' effect (see below). If the period of the note being played is z,
then the final period will be z - (x*16 + y)*(ticks - 1). As the slide rate
depends on the speed, changing the speed will change the slide. You cannot
slide beyond the note B3 (period 113).

[2]: Slide down
Where [2][x][y] means "smoothly increase the period of current sample by
x*16+y after each tick in the division". Similar to [1], but lowers the
pitch. You cannot slide beyond the note C1 (period 856).

[3]: Slide to note
Where [3][x][y] means "smoothly change the period of current sample by x*16+y
after each tick in the division, never sliding beyond current period".
The period-length in this channel's division is a parameter to this effect, and hence is not played. Sliding to a note is similar to effects [1] and [2], but the slide will not go beyond the given period, and the direction is implied by that period. If x and y are both 0, then the old slide will continue. [4]: Vibrato Where [4][x][y] means "oscillate the sample pitch using a particular waveform with amplitude y/16 semitones, such that (x * ticks)/64 cycles occur in the division". The waveform is set using effect [14][4]. By placing vibrato effects on consecutive divisions, the vibrato effect can be maintained. If either x or y are 0, then the old vibrato values will be used. [5]: Continue 'Slide to note', but also do Volume slide Where [5][x][y] means "either slide the volume up x*(ticks - 1) or slide the volume down y*(ticks - 1), at the same time as continuing the last 'Slide to note'". It is illegal for both x and y to be non-zero. You cannot slide outside the volume range 0..64. The period-length in this channel's division is a parameter to this effect, and hence is not played. [6]: Continue 'Vibrato', but also do Volume slide Where [6][x][y] means "either slide the volume up x*(ticks - 1) or slide the volume down y*(ticks - 1), at the same time as continuing the last 'Vibrato'". It is illegal for both x and y to be non-zero. You cannot slide outside the volume range 0..64. [7]: Tremolo Where [7][x][y] means "oscillate the sample volume using a particular waveform with amplitude y*(ticks - 1), such that (x * ticks)/64 cycles occur in the division". The waveform is set using effect [14][7]. Similar to [4]. [8]: -- Unused -- [9]: Set sample offset Where [9][x][y] means "play the sample from offset x*4096 + y*256". The offset is measured in words. If no sample is given, yet one is still playing on this channel, it should be retriggered to the new offset using the current volume. 
[10]: Volume slide
     Where [10][x][y] means "either slide the volume up x*(ticks - 1) or slide the volume down y*(ticks - 1)". If both x and y are non-zero, then the y value is ignored (assumed to be 0). You cannot slide outside the volume range 0..64.

[11]: Position Jump
     Where [11][x][y] means "stop the pattern after this division, and continue the song at song-position x*16+y". This shifts the 'pattern-cursor' in the pattern table (see above). Legal values for x*16+y are from 0 to 127.

[12]: Set volume
     Where [12][x][y] means "set current sample's volume to x*16+y". Legal volumes are 0..64.

[13]: Pattern Break
     Where [13][x][y] means "stop the pattern after this division, and continue the song at the next pattern at division x*10+y" (the 10 is not a typo). Legal divisions are from 0 to 63 (note Protracker exception above).

[14][0]: Set filter on/off
     Where [14][0][x] means "set sound filter ON if x is 0, and OFF if x is 1". This is a hardware command for some Amigas, so if you don't understand it, it is better not to use it.

[14][1]: Fineslide up
     Where [14][1][x] means "decrement the period of the current sample by x". The decrementing takes place at the beginning of the division, and hence there is no actual sliding. You cannot slide beyond the note B3 (period 113).

[14][2]: Fineslide down
     Where [14][2][x] means "increment the period of the current sample by x". Similar to [14][1] but shifts the pitch down. You cannot slide beyond the note C1 (period 856).

[14][3]: Set glissando on/off
     Where [14][3][x] means "set glissando ON if x is 1, OFF if x is 0". Used in conjunction with [3] ('Slide to note'). If glissando is on, then 'Slide to note' will slide in semitones, otherwise will perform the default smooth slide.

[14][4]: Set vibrato waveform
     Where [14][4][x] means "set the waveform of succeeding 'vibrato' effects to wave #x". [4] is the 'vibrato' effect.
     Possible values for x are:
          0 - sine (default)           /\    /\      (2 cycles shown)
          4  (without retrigger)         \/    \/

          1 - ramp down                | \   | \
          5  (without retrigger)          \ |   \ |

          2 - square                   ,--,  ,--,
          6  (without retrigger)          '--'  '--'

          3 - random: a random choice of one of the above.
          7  (without retrigger)
     If the waveform is selected "without retrigger", then it will not be retriggered from the beginning at the start of each new note.

[14][5]: Set finetune value
     Where [14][5][x] means "sets the finetune value of the current sample to the signed nibble x". x has legal values of 0..15, corresponding to signed nibbles 0..7,-8..-1 (see start of text for more info on finetune values).

[14][6]: Loop pattern
     Where [14][6][x] means "set the start of a loop to this division if x is 0, otherwise after this division, jump back to the start of a loop and play it another x times before continuing". If the start of the loop was not set, it will default to the start of the current pattern. Hence 'loop pattern' cannot be performed across multiple patterns. Note that loops do not support nesting, and you may generate an infinite loop if you try to nest 'loop pattern's.

[14][7]: Set tremolo waveform
     Where [14][7][x] means "set the waveform of succeeding 'tremolo' effects to wave #x". Similar to [14][4], but alters effect [7] - the 'tremolo' effect.

[14][8]: -- Unused --

[14][9]: Retrigger sample
     Where [14][9][x] means "trigger current sample every x ticks in this division". If x is 0, then no retriggering is done (acts as if no effect was chosen), otherwise the retriggering begins on the first tick and then x ticks after that, etc.

[14][10]: Fine volume slide up
     Where [14][10][x] means "increment the volume of the current sample by x". The incrementing takes place at the beginning of the division, and hence there is no sliding. You cannot slide beyond volume 64.

[14][11]: Fine volume slide down
     Where [14][11][x] means "decrement the volume of the current sample by x". Similar to [14][10] but lowers volume.
You cannot slide beyond volume 0. [14][12]: Cut sample Where [14][12][x] means "after the current sample has been played for x ticks in this division, its volume will be set to 0". This implies that if x is 0, then you will not hear any of the sample. If you wish to insert "silence" in a pattern, it is better to use a "silence"-sample (see above) due to the lack of proper support for this effect. [14][13]: Delay sample Where [14][13][x] means "do not start this division's sample for the first x ticks in this division, play the sample after this". This implies that if x is 0, then you will hear no delay, but actually there will be a VERY small delay. Note that this effect only influences a sample if it was started in this division. [14][14]: Delay pattern Where [14][14][x] means "after this division there will be a delay equivalent to the time taken to play x divisions after which the pattern will be resumed". The delay only relates to the interpreting of new divisions, and all effects and previous notes continue during delay. [14][15]: Invert loop Where [14][15][x] means "if x is greater than 0, then play the current sample's loop upside down at speed x". Each byte in the sample's loop will have its sign changed (negated). It will only work if the sample's loop (defined previously) is not too big. The speed is based on an internal table. [15]: Set speed Where [15][x][y] means "set speed to x*16+y". Though it is nowhere near that simple. Let z = x*16+y. Depending on what values z takes, different units of speed are set, there being two: ticks/division and beats/minute (though this one is only a label and not strictly true). If z=0, then what should technically happen is that the module stops, but in practice it is treated as if z=1, because there is already a method for stopping the module (running out of patterns). If z<=32, then it means "set ticks/division to z" otherwise it means "set beats/minute to z" (convention says that this should read "If z<32.." 
but there are some composers out there that defy conventions). Default values are 6 ticks/division, and 125 beats/minute (4 divisions = 1 beat). The beats/minute tag is only meaningful for 6 ticks/division. To get a more accurate view of how things work, use the following formula:

                             24 * beats/minute
          divisions/minute = -----------------
                              ticks/division

Hence divisions/minute range from 24.75 to 6120, eg. to get a value of 2000 divisions/minute use 3 ticks/division and 250 beats/minute.

If multiple "set speed" effects are performed in a single division, the ones on higher-numbered channels take precedence over the ones on lower-numbered channels. This effect has a large number of different implementations, but the one described here has the widest usage.

N.B. This document should be fairly accurate now, but as the module format is more of an observation than a standard, a couple of effects cannot be relied upon to act exactly the same from tracker to tracker (especially if the tracker is not for the Amiga). It is probably better to use this document as a guide rather than as a hard-and-fast definition of the module format.

Oh.. and yes, I would normally give bytes as hex values, but it is easier to understand a consistent notation.

Andrew Scott (Adrenalin Software), INTERNET:ascott@tartarus.uwa.edu.au
Author of MIDIMOD (MOD to MIDI converter), PTMID (MIDI to MOD converter)

Information from File Format List 2.0 by Max Maischein.

--------!-CONTACT_INFO----------------------
If you notice any mistakes or omissions, please let me know! It is only with YOUR help that the list can continue to grow. Please send all changes to me rather than distributing a modified version of the list.

This file has been authored in the style of the INTERxxy.* file list by Ralf Brown, and uses almost the same format.

Please read the file FILEFMTS.1ST before asking me any questions. You may find that they have already been addressed.
Max Maischein

Max Maischein, 2:244/1106.17
Max_Maischein@spam.fido.de
corion@informatik.uni-frankfurt.de
Corion on #coders@IRC

--------!-DISCLAIMER------------------------
DISCLAIMER: THIS MATERIAL IS PROVIDED "AS IS". I verify the information contained in this list to the best of my ability, but I cannot be held responsible for any problems caused by use or misuse of the information, especially for those file formats foreign to the PC, like AMIGA or SUN file formats. If a piece of information is marked "guesswork" or undocumented, you should check it carefully to make sure your program will not break with an unexpected value (and please let me know whether or not it works the same way).

Information marked with "???" is known to be incomplete or guesswork.

Some file formats were not released by their creators, others are regarded as proprietary, which means that if your programs deal with them, you might be looking for trouble. I don't care about this.
--------------------------------------------

The GF1 Patch files are multipart sound files for the Gravis Ultrasound sound card to emulate MIDI sounds in high quality. Each Patch can consist of many samples (for example, a string ensemble consists of Violin, Viola, Cello, Bass) which are played depending on the note to play. A patch can also contain a part to be played before the loop and a part to be played after the tone has been released.

OFFSET  Count  TYPE   Description
0000h      12  char   ID='GF1PATCH110'
000Ch      10  char   Manufacturer ID
0018h      60  char   Description of the contained Instruments
                      or copyright of manufacturer.
0054h       1  byte   Number of instruments in this patch
0055h       1  byte   Number of voices for sample
0056h       1  byte   Number of output channels (1=mono,2=stereo)
0057h       1  word   Number of waveforms
0059h       1  word   Master volume for all samples
005Bh       1  dword  Size of the following data
0060h      36  byte   reserved

Following this header, the instruments with their headers follow.
An instrument header contains the name and other data about one instrument contained within the patch.

OFFSET  Count  TYPE   Description
0000h       1  word   Instrument number. ?Maybe the MIDI instrument number?.
                      In the Gravis patches, this is 0, in other patches,
                      I found random values.
0002h      16  char   ASCII name of the instrument.
0012h       1  dword  Size of the whole instrument in bytes.
0016h       1  byte   Layers. Needed for whatever.
0017h      40  byte   reserved

About the patch, I don't know anything. Maybe somebody could enlighten me.
Each patch record has the following format:

OFFSET  Count  TYPE   Description
0000h       7  char   Wave file name
0007h       1  byte   Fractions
0008h       1  dword  Wave size. Size of the wave digital data
000Ch       1  dword  Start of wave loop
0010h       1  dword  End of wave loop
0012h       1  word   Sample rate of the wave
0014h       1  word   Minimum frequency to play the wave
0016h       1  word   Maximum frequency to play the wave
0018h       1  dword  Original sample rate of the wave data
001Ch       1  int    Fine tune value for the wave
001Eh       1  byte   Stereo balance, values unknown**
001Fh       6  byte   Filter envelope rate
0025h       6  byte   Filter envelope offset
002Bh       1  byte   Tremolo sweep
002Ch       1  byte   Tremolo rate
002Dh       1  byte   Tremolo depth
002Fh       1  byte   Vibrato sweep
0030h       1  byte   Vibrato rate
0031h       1  byte   Vibrato depth
0032h       1  byte   Wave data, bitmapped
                        0 - 8/16 bit wave data
                        1 - signed/unsigned data
                        2 - de/enable looping
                        3 - no/has bidirectional looping
                        4 - loop forward/backward
                        5 - Turn envelope sustaining off/on
                        6 - Dis/Enable filter envelope
                        7 - reserved
0033h       1  int    Frequency scale, whatever that means
0035h       1  word   Frequency scale factor
0037h      36  byte   Reserved

EXTENSION:PAT
OCCURENCES:PC
PROGRAMS:Patch Maker
SEE ALSO:VOC,WAVe

By Asatur V. Nazarian (samael@avn.mccme.ru)

In this document I'll try to describe the audio file format used in many Sierra On-Line games. In most games these files are contained within .SFX and .AUD resource files (usually named RESOURCE.SFX and RESOURCE.AUD).
When encountered as stand-alone files, they usually have extension .AUD (but it has nothing to do with Westwood's AUD audio file format!).

The games using this format include: King's Quest 6, King's Quest 7, King's Quest 8, Leisure Suit Larry 6, Leisure Suit Larry 7, Pepper's Adventures in Time, Quest For Glory 3, Space Quest 5, Torin's Passage, Phantasmagoria, Phantasmagoria II: The Puzzle Of Flesh, Gabriel Knight, Gabriel Knight II, Shivers 2: Harvest of Souls. Maybe many more, e.g.: other games of SQx, GQx, LSLx, KQx series.

The files this document deals with have extensions: .SFX, .AUD. Note that the extension of resource files containing AUD audio files may be different from these.

Throughout this document I use C-like notation.

All numbers in all structures described in this document are stored in files using little-endian (Intel) byte order.

===================
1. AUD File Header
===================

The AUD file has the following header:

struct AUDHeader
{
  BYTE  bID;
  BYTE  bShift;
  char  szID[4];
  WORD  wSampleRate;
  BYTE  bFlags;
  DWORD dwDataSize;
};

bID -- is equal to 0x8D in all Sierra On-Line games I've seen, except for King's Quest 8, where it is equal to 0x0D.

bShift -- defines where audio data starts: (bShift+2) is the starting position of the audio data relative to the file start (NOT to the start of RESOURCE.SFX/RESOURCE.AUD containing this file).

szID -- always "SOL\0". Note that there are four bytes including the terminating zero!

wSampleRate -- sample rate for the file.

bFlags -- bit-mapped flags:
  bit 0 -- if set, audio data is compressed (otherwise it's PCM),
  bit 1 -- ??? (I've never seen it set),
  bit 2 -- if set, audio data is 16-bit (8-bit otherwise),
  bit 3 -- if set, audio data is in signed format (unsigned otherwise): 16-bit sound is signed and 8-bit is unsigned,
  bit 4 -- if set, sound is stereo (mono otherwise).

dwDataSize -- size of the audio data (in bytes).

=================
2. AUD File Data
=================

Starting at (bShift+2) from the file start comes the AUD audio data. If bit 0 of bFlags is not set, it's just PCM: 8-bit or 16-bit, signed or unsigned. Otherwise it's compressed with the algorithm which I refer to as SOL ADPCM. SOL ADPCM has two types: 8-bit (for 8-bit sound) and 16-bit (for 16-bit sound).

===========================================
3. 8-bit SOL ADPCM Decompression Algorithm
===========================================

Let (CurSample) be the current sample value and (InputBuffer) contain SOL ADPCM compressed data:

SHORT CurSample;
BYTE  InputBuffer[InputBufferSize];
BYTE  code;
DWORD i; // index into InputBuffer

CurSample=0x80; // unsigned 8-bit
for (i=0;i<InputBufferSize;i++)
{
  code=HINIBBLE(InputBuffer[i]); // get HIGHER 4-bit nibble
  if (code & 8) // sign bit
     CurSample-=SOLTable3bit[INDEX4(code)];
  else
     CurSample+=SOLTable3bit[code];
  CurSample=Clip8BitSample(CurSample); // clip to 8-bit unsigned value range
  Output((BYTE)CurSample); // send to the output stream

  code=LONIBBLE(InputBuffer[i]); // get LOWER 4-bit nibble
  // ...the same for lower nibble
}

HINIBBLE and LONIBBLE are higher and lower 4-bit nibbles:

#define HINIBBLE(byte) ((byte) >> 4)
#define LONIBBLE(byte) ((byte) & 0x0F)

Note that depending on your compiler you may need to use additional nibble separation in these defines, e.g. (((byte) >> 4) & 0x0F).

Output() is just a placeholder for any action you would like to perform for decompressed sample value.

SOLTable3bit is the delta table given near the end of this document.

INDEX4(code) is really a tricky thing. In some games (mostly older ones) it should be the following:

#define INDEX4(code) (0xF-(code))

While in some other games it's the following:

#define INDEX4(code) ((code) & 7)

"Old" INDEX4 is used, for example, in King's Quest 6, Quest For Glory 3, Gabriel Knight. "New" INDEX4 is used in Torin's Passage, maybe in other games. I do not know a reliable way to figure out which of those you should use for a particular file, but currently I use the simplest technique: I just decode the first, say, 1Kb of data using both approaches and check whether one of them results in an output stream which is far from reasonable 8-bit unsigned sound (that is, its mean sample value is far from 0x80).

Clip8BitSample is quite evident:

SHORT Clip8BitSample(SHORT sample)
{
  if (sample>255)
     return 255;
  else if (sample<0)
     return 0;
  else
     return sample;
}

Note that the HIGHER nibble is processed first.

============================================
4. 16-bit SOL ADPCM Decompression Algorithm
============================================

It's analogous to the 8-bit decompression scheme:

LONG  CurSample;
BYTE  InputBuffer[InputBufferSize];
BYTE  code;
DWORD i;

CurSample=0x0000; // signed 16-bit
for (i=0;i<InputBufferSize;i++)
{
  code=InputBuffer[i];
  if (code & 0x80) // sign bit
     CurSample-=SOLTable7bit[INDEX8(code)];
  else
     CurSample+=SOLTable7bit[code];
  CurSample=Clip16BitSample(CurSample); // clip to 16-bit signed value range
  Output((SHORT)CurSample); // send to the output stream
}

SOLTable7bit is the delta table given near the end of this document.

INDEX8(code) might be as tricky as for 8-bit sound. But in all games I've seen where compressed 16-bit sound is used it's just the following:

#define INDEX8(code) ((code) & 0x7F)

At least it's true for Torin's Passage, King's Quest 8, Gabriel Knight, etc.

Clip16BitSample is quite evident, too:

LONG Clip16BitSample(LONG sample)
{
  if (sample>32767)
     return 32767;
  else if (sample<-32768)
     return (-32768);
  else
     return sample;
}

Note that the decompression schemes are given ONLY for unsigned 8-bit sound and signed 16-bit sound. I've never seen signed 8-bit or unsigned 16-bit sound in AUD format, but to support these you only need to use the corresponding clipping (-128..127 for signed 8-bit and 0..65535 for unsigned 16-bit) and make an additional conversion before outputting the sample value: signed->unsigned for 8-bit sound or unsigned->signed for 16-bit sound, provided that you've initialized (CurSample) to the corresponding value: 0x00 for signed 8-bit and 0x8000 for unsigned 16-bit.

Also, those algorithms are ONLY for mono sound, but extending them to stereo is simple: for 8-bit sound the left channel is in the HIGHER nibble and the right is in the LOWER one, while for 16-bit sound the left channel is the first byte and the right channel is the second one. Note that you should maintain two different (CurSample) variables for left and right channels: (CurSampleLeft) and (CurSampleRight).

Of course, both decompression routines described above may be greatly optimized.

====================
5. SOL ADPCM Tables
====================

BYTE SOLTable3bit[]=
{
  0, 1, 2, 3, 6, 0xA, 0xF, 0x15
};

WORD SOLTable7bit[]=
{
  0x0,   0x8,    0x10,   0x20,   0x30,   0x40,   0x50,   0x60,
  0x70,  0x80,   0x90,   0xA0,   0xB0,   0xC0,   0xD0,   0xE0,
  0xF0,  0x100,  0x110,  0x120,  0x130,  0x140,  0x150,  0x160,
  0x170, 0x180,  0x190,  0x1A0,  0x1B0,  0x1C0,  0x1D0,  0x1E0,
  0x1F0, 0x200,  0x208,  0x210,  0x218,  0x220,  0x228,  0x230,
  0x238, 0x240,  0x248,  0x250,  0x258,  0x260,  0x268,  0x270,
  0x278, 0x280,  0x288,  0x290,  0x298,  0x2A0,  0x2A8,  0x2B0,
  0x2B8, 0x2C0,  0x2C8,  0x2D0,  0x2D8,  0x2E0,  0x2E8,  0x2F0,
  0x2F8, 0x300,  0x308,  0x310,  0x318,  0x320,  0x328,  0x330,
  0x338, 0x340,  0x348,  0x350,  0x358,  0x360,  0x368,  0x370,
  0x378, 0x380,  0x388,  0x390,  0x398,  0x3A0,  0x3A8,  0x3B0,
  0x3B8, 0x3C0,  0x3C8,  0x3D0,  0x3D8,  0x3E0,  0x3E8,  0x3F0,
  0x3F8, 0x400,  0x440,  0x480,  0x4C0,  0x500,  0x540,  0x580,
  0x5C0, 0x600,  0x640,  0x680,  0x6C0,  0x700,  0x740,  0x780,
  0x7C0, 0x800,  0x900,  0xA00,  0xB00,  0xC00,  0xD00,  0xE00,
  0xF00, 0x1000, 0x1400, 0x1800, 0x1C00, 0x2000, 0x3000, 0x4000
};
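
As a compilable illustration of the 8-bit scheme, the sketch below decodes a mono buffer with either INDEX4 variant. It is a minimal sketch rather than a reference decoder: the names sol_step, sol_decode8 and the old_index flag are hypothetical, the caller is assumed to supply an output buffer of at least 2*in_len bytes, and the table is the 3-bit delta table listed above.

```c
#include <assert.h>
#include <stddef.h>

/* 3-bit delta table, copied from the SOL ADPCM tables section. */
static const unsigned char sol_table_3bit[8] = { 0, 1, 2, 3, 6, 0xA, 0xF, 0x15 };

/* Apply one 4-bit code to the running sample and clip to 0..255.
   old_index != 0 selects the "old" INDEX4 (0xF - code),
   otherwise the "new" INDEX4 (code & 7) is used. */
static int sol_step(int sample, unsigned code, int old_index)
{
    assert(code <= 0x0Fu);
    if (code & 8u) {                       /* sign bit set: subtract */
        unsigned idx = old_index ? (0xFu - code) : (code & 7u);
        sample -= sol_table_3bit[idx];
    } else {                               /* sign bit clear: add */
        sample += sol_table_3bit[code];
    }
    if (sample > 255) return 255;
    if (sample < 0) return 0;
    return sample;
}

/* Decode mono unsigned 8-bit SOL ADPCM. Each input byte yields two
   output samples, higher nibble first. Returns the number of samples
   written; out must have room for 2*in_len bytes. */
static size_t sol_decode8(const unsigned char *in, size_t in_len,
                          unsigned char *out, int old_index)
{
    int sample = 0x80;                     /* unsigned 8-bit midpoint */
    size_t n = 0;
    for (size_t i = 0; i < in_len; i++) {
        sample = sol_step(sample, (unsigned)in[i] >> 4, old_index);
        out[n++] = (unsigned char)sample;
        sample = sol_step(sample, in[i] & 0x0Fu, old_index);
        out[n++] = (unsigned char)sample;
    }
    return n;
}
```

Decoding the same leading block twice, once with old_index set and once without, and comparing how far each result's mean lies from 0x80 is one way to apply the detection heuristic described in section 3.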

================================================
6. AUD Resources: RESOURCE.AUD and RESOURCE.SFX
================================================

When stored in .SFX/.AUD resources, the audio files are stored "as is", without compression (unlike other Sierra On-Line resource files) or encryption. That means if you want to play/extract an AUD file from the RESOURCE.SFX/.AUD resource you just need to search for the szID id-string ("SOL\0") and read AUDHeader starting at the position two bytes before the found id-string. This will give you the starting point of the file, and the size of the file will be (dwDataSize+bShift+2).

===========
7. Credits
===========

Anthony Larme (larme@bit.net.au)
http://www.bit.net.au/~larme/ [Phantasmagoria Memorial Websites]
He inspired me to explore this format more deeply and helped me a great deal with the AUDs from Sierra games I had no access to. He also tested my Game Audio Player on many Sierra games and reported the results to me.

----------------------------------------

Asatur V. Nazarian (samael@avn.mccme.ru)
http://anx.da.ru
http://www.fortunecity.com/campus/electrical/81/samael.html
http://www.music.ag.ru/

On all these sites you can find my GAP program which can search for SOL audio files in .SFX/.AUD resources, extract them, convert them to WAV and play them back. There's also complete source code of GAP and all its plug-ins there, including the SOL plug-in, which could be used for further details on how you can deal with this format.

From: galt@dsd.es.com

(byte numbers are hex!)

    HEADER (bytes 00-19)
    Series of DATA BLOCKS (bytes 1A+) [Must end w/ Terminator Block]

---------------------------------------------------------------

HEADER:
=======
     byte #     Description
     ------     ------------------------------------------
     00-12      "Creative Voice File"
     13         1A (eof to abort printing of file)
     14-15      Offset of first datablock in .voc file (std 1A 00
                in Intel Notation)
     16-17      Version number (minor,major) (VOC-HDR puts 0A 01)
     18-19      1's Comp of Ver. # + 1234h (VOC-HDR puts 29 11)

---------------------------------------------------------------

DATA BLOCK:
===========

   Data Block:  TYPE(1-byte), SIZE(3-bytes), INFO(0+ bytes)
   NOTE: Terminator Block is an exception -- it has only the TYPE byte.

      TYPE   Description     Size (3-byte int)   Info
      ----   -----------     -----------------   -----------------------
      00     Terminator      (NONE)              (NONE)
      01     Sound data      2+length of data    *
      02     Sound continue  length of data      Voice Data
      03     Silence         3                   **
      04     Marker          2                   Marker# (2 bytes)
      05     ASCII           length of string    null terminated string
      06     Repeat          2                   Count# (2 bytes)
      07     End repeat      0                   (NONE)
      08     Extended        4                   ***

      *Sound Info Format:
       ---------------------
       00     Sample Rate
       01     Compression Type
       02+    Voice Data

      **Silence Info Format:
       ----------------------------
       00-01  Length of silence - 1
       02     Sample Rate

      ***Extended Info Format:
       ---------------------
       00-01  Time Constant: Mono:   65536 - (256000000/sample_rate)
                             Stereo: 65536 - (256000000/(2*sample_rate))
       02     Pack
       03     Mode: 0 = mono
                    1 = stereo

  Marker#           -- Driver keeps the most recent marker in a status byte
  Count#            -- Number of repetitions + 1
                         Count# may be 1 to FFFE for 0 - FFFD repetitions
                         or FFFF for endless repetitions
  Sample Rate       -- SR byte = 256-(1000000/sample_rate)
  Length of silence -- in units of sampling cycle
  Compression Type  -- of voice data
                         8-bits    = 0
                         4-bits    = 1
                         2.6-bits  = 2
                         2-bits    = 3
                         Multi DAC = 3+(# of channels) [interesting--
                                     this isn't in the developer's manual]

---------------------------------------------------------------------------------
Addendum submitted by Votis Kokavessis:

After some experimenting with .VOC files I found out that there is a Data Block Type 9, which is not covered in the VOC.TXT file. Here is what I was able to discover about this block type:

TYPE: 09
SIZE: 12 + length of data
INFO: 12 (twelve) bytes

INFO STRUCTURE:

Bytes 0-1:  (Word) Sample Rate (e.g. 44100)
Bytes 2-3:  zero (could be that bytes 0-3 are a DWord for Sample Rate)
Byte 4:     Sample Size in bits (e.g. 16)
Byte 5:     Number of channels (e.g. 1 for mono, 2 for stereo)
Byte 6:     Unknown (equal to 4 in all files I examined)
Bytes 7-11: zero

By Asatur V. Nazarian (samael@avn.mccme.ru)

In this document I'll try to extend Vladan Bato's description of the .AUD audio file format used in some Westwood Studios games. Namely, Bato's AUD3.TXT describes IMA ADPCM compressed AUDs used in C&C and Red Alert, and what I'll try to describe here is Westwood ADPCM compressed AUDs used in Legend Of Kyrandia III: Malcolm's Revenge (and also some files in C&C): Malcolm's music, sound FX, speech and video soundtracks. Also described is the format of soundtracks in C&C, Red Alert and C&C: Tiberian Sun. Probably, the formats described here are used in some other Westwood games, e.g.: Lands Of Lore 2,3, Blade Runner, Dune 2000.

AUD3.TXT and VQA_FRMT.TXT, to which I refer in this document, may be found on Wotsit (www.wotsit.org) or on Vladan Bato's pages, a link to which is given at the end of this document.

The files this document deals with have extensions: .AUD, .TLK, .PAK, .VQA.

Throughout this document I use C-like notation.

All numbers in all structures described in this document are stored in files using little-endian (Intel) byte order.

=============
1. AUD Files
=============

Malcolm's AUD files have the same format as C&C's AUDs (which is described in AUD3.TXT) with only one exception: there's no OutSize field in their header. So it looks like the following:

struct AUDHeaderOld
{
  WORD  wSampleRate;
  DWORD dwSize;
  BYTE  bFlags;
  BYTE  bType;
};

bType is equal to 0x01 for WS ADPCM compressed AUDs. All WS ADPCM compressed sounds I've ever encountered are 8-bit. The meanings of the other fields in the AUD header are the same as for C&C AUDs.

These AUDs are divided into chunks with the chunk header being the same as for C&C, but those chunks have variable size (may be NOT 512 bytes) unlike C&C AUDs!
Note that WS ADPCM compressed AUDs in C&C (death screams) have just the same format as other AUDs in this game, i.e. with OutSize field.

====================================
2. WS ADPCM Decompression Algorithm
====================================

Each AUD chunk may be decompressed independently of others. This lets you implement seeking for WS ADPCM AUDs (unlike IMA ADPCM ones). But during the decompression of the given chunk a variable (CurSample) should be maintained for this whole chunk:

SHORT CurSample;
BYTE  InputBuffer[InputBufferSize]; // input buffer containing the whole chunk
WORD  wSize, wOutSize; // Size and OutSize values from this chunk's header
BYTE  code;
CHAR  count; // this is a signed char!
WORD  i; // index into InputBuffer
WORD  input; // shifted input

if (wSize==wOutSize) // such chunks are NOT compressed
{
  for (i=0;i<wOutSize;i++)
      Output(InputBuffer[i]); // send to output stream
  return; // chunk is done!
}

// otherwise we need to decompress chunk

CurSample=0x80; // unsigned 8-bit
i=0;

// note that wOutSize value is crucial for decompression!

while (wOutSize>0) // until wOutSize is exhausted!
{
  input=InputBuffer[i++];
  input<<=2;
  code=HIBYTE(input);
  count=LOBYTE(input)>>2;
  switch (code) // parse code
  {
    case 2: // no compression...
         if (count & 0x20)
         {
           count<<=3; // here it's significant that (count) is signed:
           CurSample+=count>>3; // the sign bit will be copied by these shifts!
           Output((BYTE)CurSample);
           wOutSize--; // one byte added to output
         }
         else // copy (count+1) bytes from input to output
         {
           for (count++;count>0;count--,wOutSize--,i++)
               Output(InputBuffer[i]);
           CurSample=InputBuffer[i-1]; // set (CurSample) to the last byte sent to output
         }
         break;
    case 1: // ADPCM 8-bit -> 4-bit
         for (count++;count>0;count--) // decode (count+1) bytes
         {
           code=InputBuffer[i++];

           CurSample+=WSTable4bit[(code & 0x0F)]; // lower nibble
           CurSample=Clip8BitSample(CurSample);
           Output((BYTE)CurSample);

           CurSample+=WSTable4bit[(code >> 4)]; // higher nibble
           CurSample=Clip8BitSample(CurSample);
           Output((BYTE)CurSample);

           wOutSize-=2; // two bytes added to output
         }
         break;
    case 0: // ADPCM 8-bit -> 2-bit
         for (count++;count>0;count--) // decode (count+1) bytes
         {
           code=InputBuffer[i++];

           CurSample+=WSTable2bit[(code & 0x03)]; // lower 2 bits
           CurSample=Clip8BitSample(CurSample);
           Output((BYTE)CurSample);

           CurSample+=WSTable2bit[((code>>2) & 0x03)]; // lower middle 2 bits
           CurSample=Clip8BitSample(CurSample);
           Output((BYTE)CurSample);

           CurSample+=WSTable2bit[((code>>4) & 0x03)]; // higher middle 2 bits
           CurSample=Clip8BitSample(CurSample);
           Output((BYTE)CurSample);

           CurSample+=WSTable2bit[((code>>6) & 0x03)]; // higher 2 bits
           CurSample=Clip8BitSample(CurSample);
           Output((BYTE)CurSample);

           wOutSize-=4; // 4 bytes sent to output
         }
         break;
    default: // just copy (CurSample) (count+1) times to output
         for (count++;count>0;count--,wOutSize--)
             Output((BYTE)CurSample);
  }
}

HIBYTE and LOBYTE are just higher and lower bytes of WORD:

#define HIBYTE(word) ((word) >> 8)
#define LOBYTE(word) ((word) & 0xFF)

Note that depending on your compiler you may need to use additional byte separation in these defines, e.g. (((word) >> 8) & 0xFF). The same holds for 4-bit and 2-bit nibble separation in the code above.

WSTable4bit and WSTable2bit are the delta tables given in the next section.

Output() is just a placeholder for any action you would like to perform for decompressed sample value.
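
The shift-and-split of the command byte at the top of the loop can be hard to read. The sketch below expresses the same fields with plain masks: the top two bits are the code and the low six bits are the count. The helper names are hypothetical, and ws_delta5 is an equivalent formulation of the signed-shift trick used in case 2, written so that it does not depend on how the compiler shifts signed values.

```c
#include <assert.h>

/* Top two bits of the command byte select the mode (0..3);
   equivalent to HIBYTE(cmd << 2) in the routine above. */
static unsigned ws_code(unsigned char cmd)
{
    return (unsigned)cmd >> 6;
}

/* Low six bits are the count field (0..63);
   equivalent to LOBYTE(cmd << 2) >> 2 in the routine above. */
static unsigned ws_count(unsigned char cmd)
{
    return cmd & 0x3Fu;
}

/* Code 2 with bit 5 of count set: the low five bits are a signed
   delta in the range -16..15, sign-extended here by hand. */
static int ws_delta5(unsigned count)
{
    assert(count & 0x20u);
    int low5 = (int)(count & 0x1Fu);
    return (low5 & 0x10) ? low5 - 32 : low5;
}
```

For instance, the command byte 0xBF splits into code 2 and count 0x3F, which ws_delta5 turns into a delta of -1, while 0x43 means "decode 4 bytes as 4-bit ADPCM" (code 1, count 3, so count+1 input bytes).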

Clip8BitSample is quite evident:

SHORT Clip8BitSample(SHORT sample)
{
  if (sample>255)
     return 255;
  else if (sample<0)
     return 0;
  else
     return sample;
}

This algorithm is ONLY for mono 8-bit unsigned sound, as I've never seen any other sound format used with WS ADPCM compression.

Of course, the decompression routine described above may be greatly optimized.

===================
3. WS ADPCM Tables
===================

CHAR WSTable2bit[]=
{
  -2, -1, 0, 1
};

CHAR WSTable4bit[]=
{
  -9, -8, -6, -5, -4, -3, -2, -1,
   0,  1,  2,  3,  4,  5,  6,  8
};

=====================================================
4. AUDs in Legend Of Kyrandia III: Malcolm's Revenge
=====================================================

The WS ADPCM compression described above is used for all audio in this game:

  Music is stand-alone .AUD files.
  Speech is AUDs in .TLK resource files.
  Sounds are AUDs in .PAK resource files.

These .TLKs and .PAKs do not use any compression or encryption for AUDs, so AUDs are stored "as is" in them. If you want to extract/play an AUD from PAK or TLK you just need to search the PAK or TLK for the AUD id, that is, DWORD value equal to 0x0000DEAF (or, in other words, string "\xAF\xDE\0\0"). Refer to Vladan Bato's AUD3.TXT for more details on AUD file structure.

=========================
5. VQA Movie Soundtracks
=========================

Soundtrack of VQA movie in Malcolm, C&C, Red Alert and C&C: Tiberian Sun is stored in SND0, SND1 or SND2 blocks. Refer to VQA_FRMT.TXT by Aaron Glover for details on the structure of VQA files. Here I only describe the contents of VQA sound blocks and VQHD (header) block.

VQHD block contains header for VQA.
To the best of my knowledge, it has the following format:

struct VQAHeader
{
  WORD  wVersion;
  WORD  unknown1;
  WORD  wNumFrames;
  WORD  wWidth;
  WORD  wHeight;
  WORD  unknown2;
  WORD  unknown3;
  WORD  unknown4;
  WORD  unknown5;
  DWORD unknown6;
  WORD  unknown7;
  WORD  wSampleRate;
  BYTE  bChannels;
  BYTE  bResolution;
  char  unknown8[14];
};

wVersion -- version of VQA:
  1 -- oldest Malcolm's VQAs,
  2 -- C&C, Red Alert,
  3 -- C&C: Tiberian Sun.

wNumFrames -- number of frames in VQA. Note that the number of sound blocks is (wNumFrames+1) for VQAs of version 2 (C&C, Red Alert), and (wNumFrames) for versions 1 and 3.

wSampleRate -- sample rate for soundtrack. Note that version 1 (Malcolm's) VQAs may have this value set to 0x0000! Use 22050 Hz in such cases.

bChannels -- number of channels (1 -- mono, 2 -- stereo). Note that version 1 VQAs may have this set to 0x00, so use 1 (mono) for such files.

bResolution -- resolution of soundtrack (0x10 -- 16-bit, 0x8 -- 8-bit). Note that version 1 VQAs may have this set to 0x00, so use 0x8 for such files.

All VQAs in Malcolm have their sound in either SND0 or SND1 blocks. SND0 blocks contain non-compressed PCM data. SND1 blocks contain a small header and WS ADPCM compressed sound data. The header is the following:

struct SND1Header
{
  WORD wOutSize;
  WORD wSize;
};

Following the header comes WS ADPCM compressed sound data. Each SND1 sound block may be decompressed, just like a chunk of an AUD file, independently of the others, and the routine described above may be used for its decompression without any changes, provided you use wOutSize from the SND1Header.

As to VQAs in C&C and Red Alert, their sound is in the SND2 blocks and is compressed with the IMA ADPCM algorithm described in Vladan Bato's AUD3.TXT. The contents of an SND2 block are just compressed data, without any headers, and those blocks should be decompressed in turn just like chunks of an IMA ADPCM compressed AUD file, as described in AUD3.TXT. This holds only for mono soundtracks.
But there're also stereo soundtracks in C&C and C&C: Tiberian Sun. They have different left/right channel nibbles layout. For C&C (version 2) VQAs the layout is the following:

LL RR LL RR ...

That is, first byte contains two nibbles for two left channel values, next byte contains nibbles for right channel, etc. Note that lower nibble should be processed first and then higher one (see AUD3.TXT).

For C&C: Tiberian Sun (version 3) VQAs the layout is different: in SND2 block first go all nibbles for left channel, then all nibbles for right channel:

LL LL LL ... LL RR RR RR ... RR

Note that nibbles should be processed in the same turn: lower nibble first. So, when decoding SND2 block, just decompress first half of the block data for left channel, then second half -- for right channel.

===========
6. Credits
===========

Vladan Bato (bat22@geocities.com)
http://www.geocities.com/SiliconValley/8682
Sent me docs on IMA ADPCM AUDs and VQAs.

Alexey Schepetilnikov (a.shepetilnikov@globalone.ru)
http://www.fortunecity.com/campus/electrical/81/
Inspired me to work out WS ADPCM decompression scheme.

----------------------------------------

Asatur V. Nazarian (samael@avn.mccme.ru)
http://anx.da.ru
http://www.fortunecity.com/campus/electrical/81/samael.html
http://www.music.ag.ru/

On all these sites you can find my GAP program which can search for AUD audio files in .MIX/.TLK/.PAK resources and .VQA soundtracks, extract them, convert them to WAV and play them back. There's also complete source code of GAP and all its plug-ins there, including AUD plug-in, which could be used for further details on how you can deal with this format.

Let’s animate plasma in Windows by cycling its colors and fading it away on exit

Porting graphics applications from DOS to Windows is an onerous task because Windows takes you as far away from the system internals as it possibly can.
The rationale is that there might be other windows open displaying their own graphics, and if you came along and wrote to the screen directly or modified the palette, then they would get trashed. To get around this problem, Windows introduced the concept of the system and logical palettes. In the article “The color of Windows” in the last issue, I showed you how to create your own logical palette containing the specific colors you want displayed, and how to map it onto the system palette. In this article, I’ll introduce you to palette animation by drawing a plasma, a kind of fractal, cycling its colors a la FRACTINT (the popular fractals program), and fading it away when you exit.

Animating a plasma in Delphi

Now that we are familiar with how Windows handles palettes, let’s put that knowledge to use by drawing plasma and then animating it. I’ve written a small program in Delphi to do this, and here is its interface section containing some definitions and constants that we’ll require throughout.

Suppose, for a moment, that we are working in 256-color mode. The total number of colors available to us is then 236, because 20 colors are taken by the system palette. As we are basically drawing the plasma using the three primary colors with varying intensities, if we reserve 78 slots for each primary color and one for black, the total number of colors required is 3 x 78 + 1 = 235. Hence I’ve set the constant NumColors to 235 and MaxOneColor to 78. MaxX and MaxY define the boundaries of our canvas, while IntensityShift is a constant I’ve used to bias the original plasma palette towards the high intensity range so as not to let the colors look dull and muted. Plasma is a pointer to a two-dimensional array which we will use to simulate our canvas as far as drawing the plasma goes.
Since at every step while drawing the plasma we’ll need to find out the index (into the logical palette) of a point on the canvas, I thought it better to simulate the canvas rather than waste time in repeated API calls. The LogicalPalette is a record encapsulating a TLogPalette which we’ll use to inform the system palette about the colors we require, plus an Entries array containing the details of the 234 colors and their usage.

unit Main;

interface

uses
  Windows, Messages, SysUtils, Classes, Graphics, Controls, Forms, Dialogs,
  ExtCtrls, StdCtrls, Buttons;

const
  NumColors=235;
  MaxOneColor=(NumColors-1) div 3;
  IntensityShift=150;
  MaxX=319;
  MaxY=199;

type
  Grid = Array [0..MaxX,0..MaxY] of Byte;
  PtrGrid = ^Grid;
  LogicalPalette = Record
    LogPal : TLogPalette;
    Entries : Array [1..NumColors-1] of TPaletteEntry;
  end;

  TfrmMain = class(TForm)
    Panel: TPanel;
    imgPlasma: TImage;
    btnDraw: TBitBtn;
    btnExit: TBitBtn;
    btnAnimate: TBitBtn;
    procedure btnExitClick(Sender: TObject);
    procedure FormCreate(Sender: TObject);
    procedure btnAnimateClick(Sender: TObject);
    procedure btnDrawClick(Sender: TObject);
    procedure FormClose(Sender: TObject; var Action: TCloseAction);
  private
    Plasma: PtrGrid;
    AnimationIsOn: Boolean;
    NewPalette: LogicalPalette;
    procedure PalFade;
    procedure CreatePlasma;
    procedure RotateColors;
    procedure MakePlasmaPal;
    procedure Subdivide(x1,y1,x2,y2:integer);
    procedure SetColor(xa,ya,x,y,xb,yb:integer);
    procedure Idle(Sender:TObject;var Done:Boolean);
  end;

var
  frmMain: TfrmMain;

The first step towards drawing the plasma is to set the colors in our logical palette to 78 shades of red, green, and blue each. We also have to ensure that each entry in the palette has its PC_RESERVED flag set so that it can be used for animation. The MakePlasmaPal procedure does this. It also sets the palette version to 0x300, and marks the first entry in the logical palette as black.
implementation

{$R *.DFM}

procedure TfrmMain.MakePlasmaPal;
var
  i:integer;
begin
  with NewPalette do
  begin
    LogPal.palVersion:=$300;
    LogPal.palNumEntries:=NumColors;
    LogPal.palPalEntry[0].peRed:=0;
    LogPal.palPalEntry[0].peBlue:=0;
    LogPal.palPalEntry[0].peGreen:=0;
    LogPal.palPalEntry[0].peFlags:=PC_RESERVED;
    for i:=1 to MaxOneColor do
    begin
      Entries[i].peRed:=IntensityShift+i;
      Entries[i].peBlue:=0;
      Entries[i].peGreen:=0;
      Entries[i].peFlags:=PC_RESERVED;
      Entries[MaxOneColor+i].peRed:=0;
      Entries[MaxOneColor+i].peBlue:=IntensityShift+i;
      Entries[MaxOneColor+i].peGreen:=0;
      Entries[MaxOneColor+i].peFlags:=PC_RESERVED;
      Entries[2*MaxOneColor+i].peRed:=0;
      Entries[2*MaxOneColor+i].peBlue:=0;
      Entries[2*MaxOneColor+i].peGreen:=IntensityShift+i;
      Entries[2*MaxOneColor+i].peFlags:=PC_RESERVED;
    end;
  end;
end;

Now let’s get our fingers dirty and look at the actual code which draws the plasma. The two subroutines responsible for this are Subdivide() and SetColor(). Subdivide is a recursive subroutine which is called after setting the color of the canvas corners randomly. The first time it is called we pass to it the coordinates of the top-left and bottom-right corners of the canvas respectively. Subdivide checks whether these two points are adjacent to each other, and quits if they are. Otherwise it calculates the center of the rectangle formed by these two points. Then SetColor is called four times to set the midpoints of the four lines which form the boundary of the rectangle to a particular color. The colors are chosen in such a manner that when the line in question is long, the color of its midpoint can fluctuate about the average value of the colors at the endpoints by a big margin. However, as the line’s endpoints come closer and closer to each other, the color of its midpoint is constrained to lie near the average of the endpoints’ colors.
This ensures that a point’s neighbors will have colors that differ only marginally from its own, and yet when viewed on a big scale, the colors at one end of the canvas will be markedly different from those at the other end. Subdivide then assigns the rectangle’s midpoint a color-value equal to the average color of the points that form the corners of the rectangle. By plotting these five points we have now broken or subdivided the original rectangle into four smaller rectangles (each formed by taking one vertex and the center of the original rectangle) which are then passed to Subdivide again. Thus, recursion takes place until the entire canvas has been painted.

procedure TfrmMain.Subdivide(x1,y1,x2,y2:integer);
var
  x,y:integer;
begin
  if not(((x2-x1)<2) and ((y2-y1)<2)) then
  begin
    x:=(x1+x2) div 2;
    y:=(y1+y2) div 2;
    SetColor(x1,y1,x,y1,x2,y1);
    SetColor(x2,y1,x2,y,x2,y2);
    SetColor(x1,y2,x,y2,x2,y2);
    SetColor(x1,y1,x1,y,x1,y2);
    Plasma^[x,y]:=(Plasma^[x1,y1]+Plasma^[x2,y1]+Plasma^[x2,y2]+Plasma^[x1,y2]) div 4;
    SubDivide(x1,y1,x,y);
    SubDivide(x,y1,x2,y);
    SubDivide(x,y,x2,y2);
    SubDivide(x1,y,x,y2);
  end;
end;

procedure TfrmMain.SetColor(xa,ya,x,y,xb,yb:integer);
var
  color:integer;
begin
  color:=abs(xa-xb)+abs(ya-yb);
  color:=random(color*2)-color;
  inc(color,(Plasma^[xa,ya]+Plasma^[xb,yb]) div 2);
  if (color<1) then color:=1;
  if (color>NumColors-1) then color:=NumColors-1;
  if (Plasma^[x,y]=0) then Plasma^[x,y]:=color;
end;

Having drawn the plasma on our pseudo canvas, let’s see how we can display it on the screen. The CreatePlasma function does this. It starts off by allocating memory for the plasma array. It then creates a new bitmap and sets its boundaries. I chose to copy the plasma onto a bitmap, and then assign that to the imgPlasma.Picture.BitMap property rather than copy the plasma directly onto imgPlasma.Picture.BitMap, because it significantly speeds up processing. MakePlasmaPal is then called to set up the palette for drawing the plasma.
The palette is created and selected into the active window’s DC and then realized. All the elements in the Plasma array are then set to zero via FillChar(), while the corners are set to different random colors. Subdivide() is then called to actually create the plasma. We then transfer the Plasma array to the bitmap’s Pixels property, and then assign the bitmap to imgPlasma.Picture.BitMap. To display a color from the logical palette one calls the PALETTEINDEX macro, which accepts as input an index into the currently selected logical palette, and returns a COLORREF value. Finally, the memory and resources taken up by the bitmap and the Plasma array are freed.

procedure TfrmMain.CreatePlasma;
var
  BitMap:TBitMap;
  x,y:integer;
begin
  try
    New(Plasma);
  except
    on EOutOfMemory do
    begin
      ShowMessage('Not enough memory.');
      Halt(1);
    end;
  end;
  Screen.Cursor:=crHourGlass;
  Randomize;
  BitMap:=TBitMap.Create;
  BitMap.Width:=MaxX+1;
  BitMap.Height:=MaxY+1;
  MakePlasmaPal;
  BitMap.Palette:=CreatePalette(NewPalette.LogPal);
  SelectPalette(Canvas.Handle,BitMap.Palette,False);
  RealizePalette(Canvas.Handle);
  FillChar(Plasma^,Sizeof(Plasma^),0);
  Plasma^[0,0]:=1+Random(NumColors-2);
  Plasma^[MaxX,0]:=1+Random(NumColors-2);
  Plasma^[MaxX,MaxY]:=1+Random(NumColors-2);
  Plasma^[0,MaxY]:=1+Random(NumColors-2);
  SubDivide(0,0,MaxX,MaxY);
  for y:=0 to MaxY do
    for x:=0 to MaxX do
      BitMap.Canvas.Pixels[x,y]:=PALETTEINDEX(Plasma^[x,y]);
  imgPlasma.Picture.BitMap:=BitMap;
  Screen.Cursor:=crDefault;
  Dispose(Plasma);
  BitMap.Free;
end;

procedure TfrmMain.btnDrawClick(Sender: TObject);
begin
  CreatePlasma;
  btnDraw.Enabled:=False;
end;

Animating the palette

Now comes the tricky part, animating the palette. Why tricky? Well, that’s because a whole lot about the AnimatePalette() API call is undocumented. But before we get into all that, here’s the definition of AnimatePalette:

BOOL AnimatePalette(HPALETTE hpal, UINT iStartIndex, UINT cEntries, CONST PALETTEENTRY *ppe);

AnimatePalette accepts as input four parameters.
The first parameter is the handle of the logical palette being used for animation. The second and third are the first logical entry to be replaced and the total number of entries to be replaced respectively. The fourth parameter is a pointer to the first member of an array of PALETTEENTRY structures used to replace the palette entries. This array must contain at least the number of entries specified by the third parameter. AnimatePalette returns TRUE or FALSE depending on whether the call is successful or not.

Theoretically, to animate the palette, all one has to do is to call AnimatePalette with the proper parameters, and things should work out all right. However, there’s more to it than meets the eye. First, AnimatePalette works only in 256-color mode and yet, in every other mode, it still returns TRUE as if the call has been successful. So now you know why we’ve been working in 256-color mode all along. Second, AnimatePalette refuses to work even in 256-color mode when Microsoft Plus! has been installed (at least it didn’t work on any of the Plus! machines I checked it on). And finally, AnimatePalette will not work unless you associate your logical palette with the active window. You can check this by commenting out the SelectPalette() and RealizePalette() calls in the CreatePlasma procedure. You’ll note that the code for drawing the plasma still works fine, but the code which handles the animation does not. Tough, but true.

And the really frustrating part about it all is that, to the best of my knowledge, the first two points are undocumented. Are these bugs or features? I don’t know, but you can take them up with Mr Gates when you meet him next.

But I digress. Now that we have been well advised against the quirks of AnimatePalette, the actual palette animation can be done in a trice using the RotateColors() subroutine.
Every time RotateColors is called we animate the palette 32 times to ensure that there is no sudden jump in colors when we introduce a new color into the palette. For every round of animation, each palette entry is copied over the preceding entry, except for the first entry which is discarded. The last entry in the logical palette is varied, in successive increments, from the color that occupied the slot originally to the new color we chose at random at the beginning of the subroutine. Finally, a delay of 25 milliseconds is added so as to slow down the animation a little bit.

To enable continuous palette animation, we need to write a subroutine of type TIdleEvent which will call RotateColors, and then point the application’s OnIdle event to that subroutine. To stop the animation one then just needs to set the OnIdle event back to Nil. The subroutines btnAnimateClick() and Idle() handle these tasks.

procedure TfrmMain.RotateColors;
const
  NumRotations=32;
var
  OldEnt,NewEnt:TPaletteEntry;
  i,j:integer;
begin
  OldEnt:=NewPalette.Entries[NumColors-1];
  NewEnt.peRed:=Random(256);
  NewEnt.peBlue:=Random(256);
  NewEnt.peGreen:=Random(256);
  for j:=1 to NumRotations do
  begin
    for i:=1 to NumColors-2 do
      NewPalette.Entries[i]:=NewPalette.Entries[i+1];
    with NewPalette.Entries[NumColors-1] do
    begin
      peRed:=OldEnt.peRed+Round((NewEnt.peRed-OldEnt.peRed)*j/NumRotations);
      peBlue:=OldEnt.peBlue+Round((NewEnt.peBlue-OldEnt.peBlue)*j/NumRotations);
      peGreen:=OldEnt.peGreen+Round((NewEnt.peGreen-OldEnt.peGreen)*j/NumRotations);
    end;
    AnimatePalette(imgPlasma.Picture.BitMap.Palette,1,NumColors-1,@NewPalette.Entries[1]);
    Sleep(25);
  end;
end;

procedure TfrmMain.btnAnimateClick(Sender: TObject);
begin
  if not(AnimationIsOn) then
  begin
    AnimationIsOn:=True;
    btnAnimate.Caption:='&Stop';
    Application.OnIdle:=Idle;
  end
  else
  begin
    AnimationIsOn:=False;
    btnAnimate.Caption:='&Animate';
    Application.OnIdle:=Nil;
  end;
end;

procedure TfrmMain.Idle(Sender:TObject;var Done:Boolean);
begin
  RotateColors;
  Done:=False;
end;

So, now all that’s left is to tie the loose ends together. Listed below are the routines to make the plasma fade away when you close the application. The PalFade() procedure, which is called from the form’s OnClose event, keeps decreasing the intensity of each of the red, green and blue components of every color by unity, until all the colors in the palette have become black. This makes the plasma appear to fade into black. Finally, the boolean variable AnimationIsOn is set to false in the form’s OnCreate event.

procedure DecreaseTillZero(var n:byte;var b:boolean);
begin
  if (n<>0) then
  begin
    b:=False;
    Dec(n);
  end;
end;

procedure TfrmMain.PalFade;
var
  AllBlack:boolean;
  i:integer;
begin
  repeat
    AllBlack:=True;
    for i:=1 to NumColors-1 do
    begin
      with NewPalette.Entries[i] do
      begin
        DecreaseTillZero(peRed,AllBlack);
        DecreaseTillZero(peGreen,AllBlack);
        DecreaseTillZero(peBlue,AllBlack);
      end;
    end;
    AnimatePalette(imgPlasma.Picture.BitMap.Palette,1,NumColors-1,@NewPalette.Entries[1]);
    Sleep(5);
  until AllBlack;
end;

procedure TfrmMain.FormClose(Sender: TObject; var Action: TCloseAction);
begin
  Palfade;
end;

procedure TfrmMain.btnExitClick(Sender: TObject);
begin
  Close;
end;

procedure TfrmMain.FormCreate(Sender: TObject);
begin
  AnimationIsOn:=False;
end;

end.

To get the program up and running, all you have to do is to type out the code in Delphi and compile it. You can discard the TPanel and other non-essential components which I used to brighten up the form. Alternatively, if the idea of typing in so much code doesn’t appeal to you, you can send me an e-mail and I’ll send over the source code and Delphi form and project files to you. Before running the program, however, make sure you’re in 256-color mode, otherwise the palette animation code will not work. Also remove any fancy wallpapering that you might have, or its colors will start jumping.

That just about wraps everything up for this article. However, that’s not all there is to Windows and palettes. Far from it!
There is image color matching or gamut matching, and then there are color spaces and color profiles to consider. But one can only cover so much in an article without losing depth and cohesion. So maybe next time…

May 12

Hi All,
I have recently updated my music playlist which automatically plays when you open my MSN space. If you have any suggestion regarding the music, the order etc. please leave a comment and I'll try to fix it.
Till then EnJoY the music. Cheers! :)
April 12

Introduction

Windows Media metafiles, commonly known as playlists, are text files that link Web pages to Windows Media–based content on a Windows Media server or Web server. The purpose of a metafile is to redirect streaming media content away from browsers, which in most cases are not capable of rendering the content, to Microsoft® Windows Media Player. Windows Media metafiles have a .wvx, .wax, or .asx extension. When a browser downloads a file with one of these extensions from a Web site, the browser opens Windows Media Player. Windows Media Player then locates and plays the content specified in the file.

A Windows Media metafile contains a type of Extensible Markup Language (XML) scripting that can be interpreted only by Windows Media Player. A metafile script can be as simple or complex as you need it to be. The most basic metafile contains simply the Uniform Resource Locator (URL) of some digital media content on a server. A complex metafile can contain references to multiple files or streams arranged in a playlist, instructions on how to play the files or streams, text and graphic elements, and hyperlinks associated with elements on the Windows Media Player user interface.

Creating a Simple Metafile

To get started creating a simple metafile, open your favorite text editor, such as Microsoft Notepad, and type the following items:

<ASX version="3.0">
  <Entry>
    <ref HREF="Path"/>
  </Entry>
</ASX>

In the third line, replace Path with the path or URL of your Windows Media–based content using syntax from the following table.

| Source of content | Syntax |
| --- | --- |
| File on a Windows Media server | rtsp://ServerName/Path/FileName.wmv |
| Multicast stream that is accessed from a Windows Media server | http://WebServerName/Stations/kxyz.nsc |
| Unicast stream that is accessed from a publishing point on a Windows Media server | mms://ServerName/PublishingPointAlias |
| File on a Web server | http://WebServerName/Path/Filename.wmv |
| File on a network share | file://\\ServerName\Path\Filename.wmv |
| File on a local hard disk | file://c:\Path\Filename.wmv |

After you type this into Notepad, save the file as Filename.wvx if it is used to redirect video files that have a .wmv extension. Save the file with a .wax extension if it redirects audio-only files that have a .wma extension. Typically, Filename is the name of the Windows Media file or stream, but it can be any name you choose.

Check to be sure that the metafile is working by double-clicking its file name in Windows® Explorer. Windows Media Player should open and start streaming the content. After you've confirmed that the metafile works, save it to your Web server along with your Web pages, and link to it by means of an <a href> tag, or embed it in a Web page using the <OBJECT> tag.
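
If you would rather generate the simple metafile from a program than type it in Notepad, something along these lines might do. It is only a sketch: asx_write_simple is a made-up helper name, it produces the same one-entry layout shown above, and it assumes the URL contains no characters that would need XML escaping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write a one-entry ASX 3.0 metafile for `url` into `buf`.
   Returns the number of characters written, or -1 if `buf` is too small.
   Assumption: `url` has no characters that need XML escaping
   (such as & or "); this sketch does not escape them. */
int asx_write_simple(char *buf, size_t size, const char *url)
{
    int n;

    assert(buf != NULL && url != NULL);
    n = snprintf(buf, size,
                 "<ASX version=\"3.0\">\n"
                 "  <Entry>\n"
                 "    <ref HREF=\"%s\"/>\n"
                 "  </Entry>\n"
                 "</ASX>\n",
                 url);
    if (n < 0 || (size_t)n >= size)
        return -1;  /* output error or truncated result */
    return n;
}
```

The resulting text can then be saved with a .wvx, .wax or .asx extension exactly as described above.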

A sample playlist:

<ASX version="3.0">
  <Entry>
    <ref HREF="http://gomes.samuel.googlepages.com/Shaan-JhankarBeats-SunoNa.wma"/>
  </Entry>
  <Entry>
    <ref HREF="http://gomes.samuel.googlepages.com/RahatFatehAliKhan-Paap-MaanKiLagan.wma"/>
  </Entry>
  <Entry>
    <ref HREF="http://gomes.samuel.googlepages.com/Karunesh-GlobalSpirit-Punjab.wma"/>
  </Entry>
  <Entry>
    <ref HREF="http://gomes.samuel.googlepages.com/RabbiShergill-Rabbi-BullaKiJana.wma"/>
  </Entry>
  <Entry>
    <ref HREF="http://gomes.samuel.googlepages.com/Karunesh-GlobalVillage-PrayerOfJoy.wma"/>
  </Entry>
  <Entry>
    <ref HREF="http://gomes.samuel.googlepages.com/AlanisMorissette-JaggedLittlePill-HandinMyPocket.wma"/>
  </Entry>
</ASX>

Adding up everything

To make this playlist play using the Windows® Media Player addin, you need to save the playlist with an ASX extension and upload it to a webspace. For example, say we upload it to http://gomes.samuel.googlepages.com/MyMusic.asx. You also need to upload the media files that the ASX file points to. Alternatively, you can point the entries at media files that already exist on the Internet. Finally, put the URL of the uploaded ASX file in the Windows® Media Player addin and save it. You are done. Enjoy!

March 21

struct Indian_female_professionals
{
    double styles;
    short skirts;
    long time_to_understand_problems;
    float mind;
    void knowledge;
    char non_co-operative;
}

struct married_females
{
    double weight;
    short tempered;
    long gossip;
    float hopes;
    void word;
    char unstable;
}

struct engaged_females
{
    double time_on_phone;
    short attention_on_work;
    long boast;
    float on_cloud_nine;
    void understanding;
    char edgy;
}

struct newly_married_females
{
    double dinner_invitation;
    short time_at_work;
    long lunch_break;
    void bank_balance;
    char hen_pecked;
}

struct Indian_husband_wife_professionals
{
    double income;
    short tempered;
    long time_no_see_each_other;
    void love_life;
    char money_making;
}

March 19

My old C++ gFrame class library:
(watch out for the new one!)
The same class library listed above in a contest:
Expression Evaluator done in VB.NET:
A Connect Four like game done in VB.NET:
A MOD/XM/S3M/IT player using Visual C++, MFC and ModPlugin DLL:
A Quake Engine modification in Visual C++ (Win32 OpenGL only):
Sound Blaster AWE32/64 Driver For MikMod 2 (Watcom C/DJGPP, 32-bit DPMI):
Win32 mmio ACM sound streaming (Visual C++):
ModPlugin ActiveX control using the ModPlugin DLL (VB6):
Nokia RTTTL to Sony Ericsson IMY ringtone converter (VB6):
Have fun and watch out for more. :)

March 12
I need feedback and support from all you guys out there.
P.S.: It is still a work in progress.

February 15

I think MacOSX running on regular Intel hardware is a great idea. Just think about the kind of interoperability that would offer. I guess I am biased because I think from the developer's point of view (being a developer myself). If MacOSX is able to run on regular Intel compatible hardware, then it can become as popular as Windows is today. And that would give developers the motivation to write the same software they write for Windows. Just think about it. Today the best games come for the Windows platform first and then they are PORTED to the Mac. Now, since the Mac is using regular Intel compatible hardware, developers can target both platforms at the same time. Besides, if MacOSX runs on regular Intel chips it also means it would run on those sexy AMD chips (which I am a great fan of).

Compared to the many OSes out there, MacOSX really stands out. Take a look. UNIX like core (Darwin), stability, features, performance, looks, it has it all. Not many people can afford the costly (albeit great) Mac hardware. So this would be a real big thing.

If I were to rate all those Desktop OSes out there, my rating would go like this (earlier the better): 1. Windows XP (not those 9Xs please), 2. MacOSX, 3. SkyOS (check this OS out), 4. ReactOS (free WinXP clone, although incomplete; check this out). Sorry I am not a Linux fan due to the bitter experience I had with it and lack of standardization.