CHIPCE-8: CHIP-8 interpreter for the PCE

Started by trapexit, 12/09/2014, 12:40 AM

trapexit

https://github.com/trapexit/chipce8

Something I was tinkering with a while back and finally got around to finishing. This version is currently done in HuC but I've got an assembly version in the works as well for comparison.

I also took the opportunity to document different CHIP-8 derivatives which were scattered among several sources for those interested.

Once I finish the assembly version I'll probably try to get SCHIP-8 fully implemented. Main issue is graphics.

NecroPhile

Nice one.  Now rover won't have to finish his.  :D

NightWolve


trapexit

Quote from: NightWolve on 12/09/2014, 10:10 AM: Benefits of this are?
The ability to play shitty CHIP-8 games on the PCE.

NightWolve


trapexit

I really had no expectation that people would actually use this. CHIP-8 wasn't very good when it was new. Fun learning about it and the COSMAC systems though.

I was thinking of attempting an RCA Studio II emulator. I've had one for years and it's similar to the COSMAC systems. That too would be just for fun, given it's awful as well. Years ago I made a simple PC parallel port to PCE BRAM transfer app and cable. I was considering doing that again but with an Arduino. The bigger project I have in mind is recreating Bloodlust Software's pong game Nogginknockers. I'm friends with the artist so I could probably get him to modify the graphics for me.

esteban

I was not expecting this development. Hmmmm....

:)

TurboXray

Trap, this is really cool. I've looked over Chip-8 and it looks like a fun little system to emulate. At least for the sake of making an emulator/simulator/whatever.

The problem I'm seeing is that some websites show the 64x32 resolution with square pixels (so a widescreen format) and some show it with rectangular pixels (0.5 pixel aspect ratio, full screen). Which one is correct?

trapexit

Quote from: guest on 12/09/2014, 05:11 PM: This is your new top priority, only with USB instead of Parallel.

Go ahead. Get started right away!!!

For real. This has been the unicorn I've been hunting for since the early 90's. You would be my hero, and that alone  would make the endeavor worth the effort, right? :D
Hah. I need to investigate Arduino more on how I could get the data to the PC. With my original I just had a simple cable connected to the joy port and parallel on the PC side with some simple software on each side (Linux for the PC). I actually was going to look into that next. I originally wanted to write something for the Turbo Everdrive but writing a FAT driver is more work than what I've in mind. Though raw writing to an SD card with the TE wouldn't be hard but not as nice.

Anyway... I'll look into it next.

trapexit

Quote from: TurboXray on 12/09/2014, 09:40 PM: The problem I'm seeing, is that some websites show the 64x32 res are square pixels (so a wide screen format) and some show this as rectangle pixels (0.5 pixel aspect ratio, full screen). Which one is correct?
"fairly square"

http://www.old-computers.com/museum/doc.asp?c=543

Quote: 128 is the maximum vertical resolution. Since video data was passed to the 1861 with an interrupt routine this vertical resolution could be lowered by passing the same row several times. Usually this resulted in resolutions like 64 x 16 (popular on single board Elf computers with only 256 bytes (!) RAM since this uses only 128 bytes for video memory). Typically 64 x 32 (256 bytes video RAM) or 64 x 64 (512 bytes video RAM) were used.
Only 64 x 32 produces huge, but fairly square pixels. All other modes are somewhat limited in use by the odd aspect ratios of the pixels. That and the video memory of one whole kilobyte made the maximum resolution of 64 x 128 fairly useless. The pixels were extremely wide and flat.

trapexit

Quote from: guest on 12/09/2014, 10:39 PM: User megatron-uk has a great start on a FAT driver. Semi functional if I recall correctly. It would be amazeballs if you could benefit from his work. I'd personally be fine with bram to sd. :D
Yup. I accidentally started an argument on the TEd forum about all that. Had just gotten my TE recently and asked for the firmware code to enhance the experience. In part to provide builtin bram transfer support. Got attacked for asking.

Anyway... I saw his work and forked his github repo. When I last checked, write support was lacking. Just needs some love I'm sure. I did some quick research into working with SD cards a while back but got distracted with chipce8.

TurboXray

Hmm.. so
Quote: If the PC Engine supported a resolution of 512x256 then we'd be able to map CHIP-8 pixels to all white or all black tiles. The closest we're able to get to that is 512x224 leaving us with a CHIP-8 pixel being 8x7 PC Engine pixels. Slightly lopsided but not distractingly so. Given the layout of pixel data in VRAM this works out reasonably well.
If they are square, then you should be able to do a 64x32 nicely with an hsync interrupt. Run a 512 pixel res mode with a 64x32 tilemap. Two tiles (one all black pixels, one all white pixels) is all you need. 512/8 = 64 horizontal pixels. The 32 tall map is 256 pixels tall; too much for the PCE display. But since the pixels are square, just use an hsync line to cut them in half (only show 4 rows of pixels per map line). That should give the nice square pixel with wide screen look easily. Or is that what you're already doing?

 There's also a way to do this without hsync interrupts: by setting up the VDC line length to be shorter than the real length, halving the display by skipping every other line - works in Mednafen too. I mean, if HuC is the problem here (h-int routine).

 Anyway, cool stuffs. Is there a library of games to try out for the SCHIP emulators?

trapexit

I created an 8x7 pixel at 512x224. Then just write directly to the tile's vram. It was easy and sufficiently fast.
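The 8x7 figure falls out of dividing the PC Engine screen by the CHIP-8 screen. A minimal sketch of that arithmetic, assuming the whole 512x224 display is used; `cell_size_for` and `struct cell_size` are names made up for this illustration and are not part of chipce8:

```c
/* Size of one emulated pixel, measured in PC Engine pixels, when the
 * CHIP-8 screen is stretched over the whole PCE display.
 * Integer division drops any remainder, so the caller should check
 * that the screen divides evenly. */
struct cell_size {
    int width;
    int height;
};

static struct cell_size cell_size_for(int screen_w, int screen_h,
                                      int chip_w, int chip_h)
{
    struct cell_size c;
    c.width = screen_w / chip_w;   /* 512 / 64 = 8 */
    c.height = screen_h / chip_h;  /* 224 / 32 = 7 */
    return c;
}
```

Calling `cell_size_for(512, 224, 64, 32)` gives a width of 8 and a height of 7, which is the slightly lopsided pixel described above.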

Hadn't thought about those ways. I've read quite a bit about the VDC but don't have much practical experience. Perhaps I should fool with those ideas.

chipce-8 doesn't yet support SCHIP-8. Once I figure out a good way to do 128x64...

I believe chip8.com has a big collection. I will be adding all the documentation I've found and roms to github soon. Seems chip8.com is down right now...

https://drive.google.com/file/d/0BxNcO7ALeiejeFVuU2VJWkk2Mm8/view?usp=sharing

TurboXray

The reason why I bring it up (writing to the tilemap as pixels, rather than writing to the tiles themselves) is that you can do a 128x128 @ 241 color bitmap mode on the SGX. On the PCE, it's 64x128 @ 241 colors, or 128x64 @ 16 colors (these two modes are set up differently from each other). Coupled with VDMA, you can have an alt map in VRAM to write to and use VDMA to copy it over during vblank - giving a free double-buffered bitmap function. With XOR patterned tiles (which is what the tilemap write references), you could double those colors or more (XOR dithering in 512 pixel mode looks decently convincing as a solid color). The best thing is that you're only writing to the tilemap (or tilemap buffer in VRAM) with a fast single or double byte write for 1 or 2 pixels (depends on the setup you use). It's pretty fast.

trapexit

Quote from: TurboXray on 12/10/2014, 02:33 PM: The reason why I bring it up (writing to the tilemap as pixels, rather than writing to tiles themselves), is that you can do 128x128 @241 color bitmap mode on the SGX. On the PCE, it's 64x128 @ 241 colors, or 128x64 @ 16colors (these two modes as setup different from each other). Coupled with VDMA, you can have an alt map in vram to write to and use VDMA to copy it over during vblank - giving a free double buffer bitmap function. With XOR patterned tiles (which is what the tilemap write references), you could double those colors or more (XOR dithering for 512 pixel mode looks decently convincing to a solid color). The best thing, is that you're only writing to the tilemap (or tilemap buffer in vram) with a fast single or double byte write for 1 or 2 pixels (depends on the setup you use). It's pretty fast.
I understand the concept. I originally wanted to do something like that but I'm not following how I can display the whole 64x32 or 128x64 tile map.

TurboXray

Quote: I understand the concept. I originally wanted to do something like that but I'm not following how I can display the whole 64x32 or 128x64 tile map.
Ok, so for 64x32 @ 16 colors (or 2, doesn't matter) - you set the VDC to the 10.74MHz dot clock: 512 pixel mode. If each tilemap entry is a single/solid color, then you have an effective resolution of 64 pixels horizontally. You can make this byte-per-pixel access, which, while it sounds wasteful, is extremely fast for pixel access (no shifts, etc). The byte value would be the LSB of the map data entry (which is a WORD). This byte references/points to a preset set of single-color tiles. For the sake of simplicity, you could limit this tile set to 16 tiles: each tile is a single/solid color ranging from 0 to 15. So a byte is excessive in this scenario, but it still has the advantage of speed.

 So for a 64x32 screen bitmap, set the tilemap to 64 entries wide and 32 entries tall. Now, 32 'tiles' tall is 256 pixels - too tall for the display, not to mention the wrong pixel aspect ratio. Instead, you use an hsync routine that sets the vertical line placement of the map. You basically display each 'row' of the map for <N> scanlines: 32*8 = 256, but 32*4 = 128, etc. You'll have to figure out what the pixel aspect ratio is and use hsync to skip lines of a tilemap row to accomplish this. You don't want to show 32*1, because the pixels would be too short. As far as everything outside this, you just clip via hsync manually or, even better, set the VDC regs to clip and center the display for you via the vertical placement regs.
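The "show each map row for N scanlines" idea can be sketched in portable C as a mapping from visible scanline to tilemap pixel line. This is only a model of the arithmetic, not a working h-int handler: the function names are hypothetical, and a real handler would write the result to the VDC vertical scroll register, where the hardware may need a small offset that is not modelled here.

```c
#define TILE_HEIGHT 8 /* BAT tiles are 8 pixels tall */

/* For a visible scanline, return the tilemap pixel line to display so
 * that every map row occupies rows_per_tile scanlines.
 * rows_per_tile is expected to be in the range 1..TILE_HEIGHT. */
static int map_line_for_scanline(int scanline, int rows_per_tile)
{
    int map_row = scanline / rows_per_tile;      /* which row of tiles */
    int line_in_tile = scanline % rows_per_tile; /* line inside that tile */
    return map_row * TILE_HEIGHT + line_in_tile;
}

/* Number of scanlines needed to show the whole map. */
static int displayed_height(int map_rows, int rows_per_tile)
{
    return map_rows * rows_per_tile;
}
```

With 4 rows per tile, scanline 4 jumps to map line 8 (the start of the second tile row), and a 32-row map fills 128 scanlines. With 7 rows per tile the same map fills 224.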

 For a 128x64 screen, it's a little bit different. Instead of a set of 16 tiles, you need a set of 256 tiles. Again, the max is still 16 colors - but each pixel 'nibble' references a tile. Since two nibbles fit nicely into a byte, you still use the byte as the pointer. But this time, that byte represents two pixels. The reason for 256 tiles is the 16*16 combinations. The MAP (BAT) is still 64 entries wide, but now you have to set the height to 64 entries tall. Again, you'll have to use a specific line-skipping h-int routine to get the display division for the proper pixel aspect ratio.
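A rough sketch of the two-pixels-per-byte packing, using a plain array in place of the BAT low bytes. Putting the left pixel in the high nibble is an assumption made for this example (the post doesn't say which nibble is which), and `tile_index`, `put_pixel` and `get_pixel` are illustrative names only:

```c
#include <stdint.h>

#define SCREEN_W 128
#define SCREEN_H 64
#define MAP_W (SCREEN_W / 2) /* two pixels share one map entry */

/* One byte per map entry; it selects one of 256 pre-built tiles.
 * Assumed layout: high nibble = left pixel, low nibble = right pixel. */
static uint8_t tile_index[MAP_W * SCREEN_H];

static void put_pixel(int x, int y, uint8_t color)
{
    uint8_t *entry = &tile_index[y * MAP_W + x / 2];
    color &= 0x0F;
    if (x & 1)
        *entry = (uint8_t)((*entry & 0xF0) | color);        /* right pixel */
    else
        *entry = (uint8_t)((*entry & 0x0F) | (color << 4)); /* left pixel */
}

static uint8_t get_pixel(int x, int y)
{
    uint8_t entry = tile_index[y * MAP_W + x / 2];
    return (uint8_t)((x & 1) ? (entry & 0x0F) : (entry >> 4));
}
```

Setting pixel (0,0) to color 5 and pixel (1,0) to color 9 leaves the first entry holding 0x59, i.e. tile number 0x59 out of the 256.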

 For 64x128 @ 241 colors (which doesn't fit the chip-8 specs, but..), it's more complicated. It requires having a 128x64 map, but using hsync to do both X and Y repositioning. The tile set in VRAM is also similar to the 16 color mode (16 tiles). But instead of using a BYTE, you now need to write a WORD. On the processor side, you take what is a 'byte' pixel (8bit color) and use that as an index into a LUT to get a word value. This word value is: LSB = one of 16 color tiles, and MSB = one of 16 subpalettes. Because colors in the subpalettes that are a multiple of 16 always show as color #0, this is what the look-up table is for, since it's not a linear ramp. You get (15*16)+1, or (15 tile colors * 16 subpalettes) + the BG color #0. Thus, 241 colors.
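One way the look-up table could be built, as a sketch only. The ordering of the 241 logical colors is an assumption (the post only gives the count), the solid-color tiles are assumed to be tile numbers 0 to 15 (a real setup would add whatever base tile number they were loaded at), and `lut` and `build_lut` are made-up names:

```c
#include <stdint.h>

#define NUM_COLORS 241

/* lut[c] is the 16-bit map entry for logical color c:
 * bits 12..15 = subpalette, low byte = solid-color tile number. */
static uint16_t lut[NUM_COLORS];

static void build_lut(void)
{
    int c;
    lut[0] = 0; /* background: color #0 looks the same in every subpalette */
    for (c = 1; c < NUM_COLORS; c++) {
        int subpalette = (c - 1) / 15; /* 0..15 */
        int tile = (c - 1) % 15 + 1;   /* 1..15, never the shared color 0 */
        lut[c] = (uint16_t)((subpalette << 12) | tile);
    }
}
```

Skipping tile 0 in every subpalette is what makes the ramp non-linear: logical color 15 is the last color of subpalette 0, and logical color 16 is already the first usable color of subpalette 1.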

 So to recap, you never write to tiles (you only set up, once, the set that you need to reference). The tilemap itself becomes the bitmap display. If you clip the display via the VDC regs, you can use the VRAM to VRAM DMA to copy a buffered bitmap (alt BAT) to the BAT during vblank (which would be huge compared to normal vblank time), giving the full map resolution without 'tearing' or such, if you need more than one frame to update it.

 I hope that makes sense.

trapexit

To reiterate:

For 64x32 at 1bit color (regular CHIP-8):
* Create 1 black tile and 1 white tile
* Set the VDC to 512 pixel mode
* Set the tilemap to 64x32
* When setting a pixel calculate the offset into the alternate BAT tilemap and set the tile appropriately.
* On vblank do a VRAM to VRAM DMA transfer from the alternate to main BAT
* On hblank modify BYR so that it displays (at 512x224) 7 rows of each tile (or... 512/224 = 8/x; x = 3.5; the best we could do to keep the ratio is to have 4 rows per tile. I was trying to use the whole screen, which is why I used 8x7)

For 128x64:
* Use one 8x8 tile as 2 SCHIP-8 pixels (a 2x1 pair)?
* Each nibble would be 0 or 1 and combined would point to one of 4 tiles. (not even necessary really, just use the first 2 bits)
* 64x64 tilemap
* Screen would be 512x224 so each SCHIP-8 pixel would be 4x2 (512/224 = 4/x; x = 1.75, so ~2)
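For the plain 64x32 case, the "calculate the offset and set the tile" step could look roughly like the sketch below. It draws a CHIP-8 sprite into a shadow copy of the tilemap held in ordinary memory, using XOR as CHIP-8 does. The names are hypothetical, tile 0 is assumed to be the black tile and tile 1 the white tile, and wrapping at the screen edges is an assumption (some interpreters clip instead). Copying the shadow map into VRAM is left out.

```c
#include <stdint.h>

#define CHIP8_W 64
#define CHIP8_H 32
#define TILE_BLACK 0 /* assumed tile numbers for the two solid tiles */
#define TILE_WHITE 1

static uint8_t shadow_map[CHIP8_W * CHIP8_H];

/* XOR an 8-pixel-wide, rows-tall sprite into the map at (x, y).
 * Returns 1 if any lit pixel was switched off, which is the value
 * CHIP-8 stores in VF after a draw. */
static int draw_sprite(int x, int y, const uint8_t *sprite, int rows)
{
    int collision = 0;
    int r, b;
    for (r = 0; r < rows; r++) {
        for (b = 0; b < 8; b++) {
            uint8_t *cell;
            if (!(sprite[r] & (0x80 >> b)))
                continue; /* clear sprite bits leave the screen alone */
            cell = &shadow_map[((y + r) % CHIP8_H) * CHIP8_W
                               + (x + b) % CHIP8_W];
            if (*cell == TILE_WHITE)
                collision = 1;
            *cell ^= 1; /* toggles TILE_BLACK <-> TILE_WHITE */
        }
    }
    return collision;
}
```

Drawing the same sprite twice at the same position erases it and reports a collision on the second call, which is how most CHIP-8 games move objects around.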


Now... is there a way to work around the hblank interrupt issue in HuC?

TurboXray

Quote from: trapexit on 12/12/2014, 02:19 PM:
* When setting a pixel calculate the offset into the alternate BAT tilemap and set the tile appropriately.
* On vblank do a VRAM to VRAM DMA transfer from the alternate to main BAT
Well, only if you want a double buffer bitmap. I'm pretty sure the chip-8 system didn't have one. So that might be overkill.

Quote: Now... is there a way to work around the hblank interrupt issue in HuC?
HuC uses an IRQ mode setup: each IRQ has lib or user mode. For user mode, you specify the address to your interrupt routine (needs to be local). For the VDC, this can be set individually for h-int and v-int. So if you don't want to break v-int lib stuff of HuC, just reserve h-int user mode. Just make sure to put your h-int code in a fixed bank location (I usually put it in the library bank #1 which is fixed mapped to $e000). I have one written for HuC, if you're interested.

 There is another way that divides the screen by 2 without h-int involvement, but it will offset the screen a little bit. The VDC has a horizontal sync register, defined in periods of 8 VDC pixels. The VDC basically waits for the VCE to assert the hsync signal. As soon as it receives it, it snaps to the next phase in the line. If the VDC doesn't receive this signal from the VCE, it will time out and start drawing the next horizontal line. The catch here is that whenever the VCE asserts the hsync signal, the VDC automatically jumps to the phase after hsync. It doesn't matter where in the scanline drawing or which phase it's currently in. So you can get the VDC to start prepping the internal regs for the next scanline, but have the VCE force it to start the next line. This causes the VDC to skip every other line. On a side note, BG and SPR line calculation don't happen at the same time; SPR line calculation is done first. So it's possible to have the VDC ~only~ skip the sprite line (shrinking them by 2) while the BG draws normally. But anyway, yeah - that's one method, with the artifact of a slight rightward offset of the screen (once you set up the regs, you'll see why).

 Hsync method is the cleaner option.

trapexit

I'm interested. Yeah... CHIP-8 wasn't double buffered but it'd be interesting to try.

I'm currently using vsync handler to deal with the CHIP-8 timers[0].

void chip8_vsync_hook(void) __mapcall __irq
{
}

irq_add_vsync_handler(chip8_vsync_hook);
irq_enable_user(IRQ_VSYNC);

I'm now noticing there is IRQ_HSYNC but no irq_add_hsync_handler.
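For context on what a vsync hook like the one above would do: CHIP-8 has a delay timer and a sound timer that both count down at 60 Hz until they reach zero, which lines up with one tick per vsync on NTSC hardware. A minimal sketch in plain C; the struct and function names are made up here and are not taken from chipce8:

```c
#include <stdint.h>

struct chip8_timers {
    uint8_t delay; /* read back by the program to time events */
    uint8_t sound; /* the buzzer sounds while this is nonzero */
};

/* Call once per vsync. Returns nonzero while the sound timer is
 * still running after this tick. */
static int chip8_timers_tick(struct chip8_timers *t)
{
    if (t->delay > 0)
        t->delay--;
    if (t->sound > 0)
        t->sound--;
    return t->sound > 0;
}
```

Doing only this much work inside the interrupt keeps the handler short; the interpreter loop can read the two counters whenever an instruction needs them.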


TurboXray

Here: http://pcedev.net/HuC/Chsync_ver_1_1/chsync_ver1_1.zip

HuC mimics the SCD environment for IRQ management, so it follows the same rules. So I manually did this:

/*.......................................................................................*/
EnableCustomHsync()
{
#asm                         
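  php                         ; save flags, including the interrupt-disable bit
  sei                         ; no IRQs while the hooks are being swapped
  lda irq_m
  sta _irq_m_backup           ; back up the current IRQ mode
  ora #$d0                    ; disable internal hsync handler, enable custom hsync handler, attach custom vsync handler
  sta irq_m
  lda #low(cust_hsync.1)      ; point hsync_hook at the custom hsync routine
  sta hsync_hook
  lda #high(cust_hsync.1)
  sta hsync_hook+1
  lda #low(cust_vsync.1)      ; point vsync_hook at the custom vsync routine
  sta vsync_hook
  lda #high(cust_vsync.1)
  sta vsync_hook+1
  lda <vdc_crl
  ora #$04                    ; set bit 2 in the VDC control register shadow (raster compare interrupt enable)
  sta <vdc_crl
  st0 #$06                    ; select VDC register $06 (raster compare register)
  st1 #$00                    ; and write $0000 to it
  st2 #$00
  plp                         ; restore the saved flags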
#endasm
}

 The actual handler is in the library.asm file. Just modify that as needed, or replace it completely. If you need HuC's vsync functionality, you could always call yours first and then jump to the original (it'll RTI from inside there). Pretty sure you'll need to do this for HuC lib gamepad routines to work (IIRC), unless you call that manually as well or such.