PC-FX homebrew development.

Started by elmer, 01/26/2015, 03:40 PM



elmer

Quote from: OldRover on 03/13/2016, 07:46 PMI can do replacement graphics for AGE's help file once it's translated, if that helps.
That would be nice!

OldRover

Well, here's the MpConv2 GUI pics in English at least...

frozenutopia.com/pcfx/pic1.png - pic7.png
Turbo Badass Rank: Janne (6 of 12 clears)
Conquered so far: Sinistron, Violent Soldier, Tatsujin, Super Raiden, Shape Shifter, Rayxanber II

OldRover

#152
For what it's worth, I am almost done translating the AGE application. The menus are done, but the thing has a large string table, primarily for the status bar, which I am handling now.

frozenutopia.com/pcfx/age.png

OldRover

I accidentally broke one of the dialogs while editing... might take some time to fix, even though it's just the New Image one that I doubt anyone will use... still, can't leave it broken.

NightWolve

What is your resource editor of preference here, BTW? Too bad nobody found a stash of old tools like this for PC Engine... Kinda neat for anybody that would get serious with the PC-FX. I honestly don't care much for this platform, but have at it if you do!

OldRover

I am using Borland Resource Workshop 4.5 to do the resource editing... it's the only one I could find that actually works with 16-bit programs... pretty much everything else is for Win32.

I've completed all of the strings I can actually see... it is not completely translated, and there is some text in the EXE itself that will also need translation at some point (xvi32 can handle this), but it's done enough to be useful. Of course, me being me, I decided to demo it with fine art... aka painted boobs. :P

frozenutopia.com/pcfx/age_e.png

Configured as a 256 color 7up tilemap. One thing that's rather cool about this program is that you can clone a window and then zoom the clone and apply a grid to it... any work you do on either the original or the clone (or many clones, for that matter) is applied to every window of that image. That's actually really friggin cool. Too bad this thing is seriously suckage for actual editing though.

So I put together the edited binary and uploaded it with a simple readme file. I's nearly 1AM and I'm tired but wanted to get this out tonight. Have at it. :)

frozenutopia.com/pcfx/age_e.7z

elmer

Quote from: NightWolve on 03/14/2016, 12:23 AMToo bad nobody found a stash of old tools like this for PC Engine.
I take it that you're overlooking that the PC-FX contains 2 of the PC Engine's VDC chips, and so has an identical sprite and tile format for the sprites and those background layers.

So, the AGE editor applies directly to the PC Engine, it just also handles the PC-FX's extra formats, too.


Quote from: OldRover on 03/14/2016, 12:57 AMSo I put together the edited binary and uploaded it with a simple readme file. I's nearly 1AM and I'm tired but wanted to get this out tonight. Have at it. :)
Thanks! It's really nice having someone else interested in trying to open-up the PC-FX for development!

Gredler

Quote from: elmer on 03/14/2016, 11:55 AM
Quote from: NightWolve on 03/14/2016, 12:23 AMToo bad nobody found a stash of old tools like this for PC Engine.
I take it that you're overlooking that the PC-FX contains 2 of the PC Engine's VDC chips, and so has an identical sprite and tile format for the sprites and those background layers.

So, the AGE editor applies directly to the PC Engine, it just also handles the PC-FX's extra formats, too.


Quote from: OldRover on 03/14/2016, 12:57 AMSo I put together the edited binary and uploaded it with a simple readme file. I's nearly 1AM and I'm tired but wanted to get this out tonight. Have at it. :)
Thanks! It's really nice having someone else interested in trying to open-up the PC-FX for development!
Thanks for explaining this - I was completely unaware! If you guys want an artist guinea pig I'd be down to look into getting a machine set up to try this out.  :)

elmer

Quote from: Gredler on 03/14/2016, 01:10 PMThanks for explaining this - I was completely unaware! If you guys want an artist guinea pig I'd be down to look into getting a machine set up to try this out.  :)
Excellent, thanks!

Really, you can do everything in Mednafen for the most part, there's little need to buy a real PC-FX until you're actually "hooked".  :wink:

OldRover

Holy crap, I really was tired that night... I totally missed the "I's"... soooooooooo uncharacteristic of me to miss a typo so glaring.

OldRover

OK... from what I can tell... vsync should be "easy"... it looks like this is something on the 6261. Port 300h is the status port, and bit 15 indicates whether or not the screen is drawing. So, from this, I guess a "wait for vblank" routine would simply be a polling loop that buggers 300h until bit 15 is 0.

OldRover

So it appears liberis already does this, with the function eris_tetsu_is_displaying(). However, it seems rather useless. I check its value... if it's 1, I wait until it's 0, and if it's 0, I wait until it becomes 1 and then wait for it to become 0 again. Naturally, this doesn't work for shit, even though this is how this should actually work.

elmer

Quote from: OldRover on 03/16/2016, 11:57 PMSo it appears liberis already does this, with the function eris_tetsu_is_displaying(). However, it seems rather useless. I check its value... if it's 1, I wait until it's 0, and if it's 0, I wait until it becomes 1 and then wait for it to become 0 again. Naturally, this doesn't work for shit, even though this is how this should actually work.
It's working perfectly ... just as long as you understand that "disp" also goes to "0" during hblank as well as during vblank!  :wink:

If you want to do a wait loop like that, then you should probably be looking at the "raster count" instead.

The "proper" way to do it would be to look at a frame counter that gets incremented by the vblank-interrupt ... but I have no idea if liberis even hooks the interrupts.
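For what it's worth, the frame-counter approach could look roughly like the C sketch below. It assumes that some vblank interrupt handler already exists and calls on_vblank() once per frame; the names g_frame_count, on_vblank, frames_since and wait_vblank are made up for illustration and are not part of liberis, and the interrupt hooking itself is not shown.

```c
#include <stdint.h>

/* Incremented once per frame; volatile because an interrupt changes it. */
static volatile uint32_t g_frame_count = 0;

/* Hypothetical hook: call this from the vblank interrupt handler. */
void on_vblank(void)
{
    g_frame_count++;
}

/* Number of vblanks seen since `start` (unsigned wrap-around is harmless). */
uint32_t frames_since(uint32_t start)
{
    return g_frame_count - start;
}

/* Block until the next vblank interrupt has been counted. */
void wait_vblank(void)
{
    uint32_t start = g_frame_count;
    while (frames_since(start) == 0) {
        /* spin; only the interrupt can end this loop */
    }
}
```

The main loop would then call wait_vblank() once per frame, and the same counter doubles as a cheap timer.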

****************

I've found out why newlib isn't working ... it's actually something to do with the code that's generated by the compiler's optimize-for-space settings.

For some reason it's generating an extra bogus stack-fixup instruction that's throwing off the stack alignment, and causing the code to get the wrong return address.

This isn't going to be a lot of fun to track down!  #-o

NightWolve

Quote from: elmer on 03/14/2016, 11:55 AM
Quote from: NightWolve on 03/14/2016, 12:23 AMToo bad nobody found a stash of old tools like this for PC Engine.
I take it that you're overlooking that the PC-FX contains 2 of the PC Engine's VDC chips, and so has an identical sprite and tile format for the sprites and those background layers.

So, the AGE editor applies directly to the PC Engine, it just also handles the PC-FX's extra formats, too.
Cool, that's great to know, it should be noted in the ReadMe as it could be useful for PCE hacking.

OldRover

OK so what I have done then is created two functions... one pauses the program until the raster line is below 1, and another that pauses the program until the raster line is above 0. Is there enough time in between these two points to do drawing and such, or am I better off just checking for 0 and skipping the second one?

elmer

Quote from: OldRover on 03/17/2016, 07:11 PMOK so what I have done then is created two functions... one pauses the program until the raster line is below 1, and another that pauses the program until the raster line is above 0. Is there enough time in between these two points to do drawing and such, or am I better off just checking for 0 and skipping the second one?
Well ... I'd read the docs, which Google Translate butchers as ...

b. RASTER COUNT (bit5 ~ 13)
I shows the raster number currently being displayed on the CRT. Display period is up to 22-261.
It should be noted, does not match the scanning line number that is defined in the NTSC signal.
During external synchronization, I will be 1FFH if the external sync signal is disturbed.

Note) of raster number value, you need to be sure you read more than once since become undefined by the timing to be read is constantly updated.


So you've got a range of values for the 239 lines of the actual display, with no extra values during the vblank period.
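As a rough illustration of how that status word could be picked apart in C: the bit positions below are taken from this thread (bit 15 for "disp", bits 5 to 13 for the raster count, display period 22 to 261) and came through a machine translation, so treat them as assumptions to check against the manual. The macro and function names are invented for this sketch.

```c
#include <stdint.h>

/* Field positions as quoted in the thread (assumed, not verified):
   bit 15 = "disp", bits 5..13 = raster count. */
#define STATUS_DISP_BIT     15u
#define STATUS_RASTER_SHIFT 5u
#define STATUS_RASTER_MASK  0x1FFu   /* 9 bits wide */

/* 1 while the chip reports that it is displaying, 0 otherwise. */
static inline int status_is_displaying(uint16_t status)
{
    return (int)((status >> STATUS_DISP_BIT) & 1u);
}

/* Extract the 9-bit raster count from the status word. */
static inline unsigned status_raster(uint16_t status)
{
    return (status >> STATUS_RASTER_SHIFT) & STATUS_RASTER_MASK;
}

/* The translated docs give the display period as rasters 22..261. */
static inline int raster_in_display_period(unsigned raster)
{
    return raster >= 22u && raster <= 261u;
}
```

A 9-bit field tops out at 1FFH, which at least agrees with the "1FFH if the external sync signal is disturbed" remark in the translated text.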

It depends upon what you're trying to do.

If you know for sure that your code is running at 60Hz, you could always wait for raster count 261, and then wait again for "disp" to go to zero (which is probably the vertical-blank).

That's horrible, but it might work.

If it were me doing this, then I'd just figure out how to hook the vblank-interrupt with a small assembly-language function, and implement a counter.

OldRover

I hate counters, hooking interrupts, and RISC assembly. :lol:

I did a test where I measured the high and low with eris_tetsu_get_raster()... the range was 0 to 261 (makes sense, since I'm telling tetsu to use 262 lines). Also, I tested getting rid of the second pause, and things worked the same as using it, so it looks like there's no point in using that second pause.

...and yeah... that's some severely butchered language there... :lol:

Mednafen

eris_tetsu_get_raster() may occasionally return a garbage value on real hardware as it doesn't appear to follow the hardware programming manual's advice.  You can call the function multiple times to work around it, but that's not as efficient as having the workaround for the hardware bug in the function itself...

elmer

Quote from: OldRover on 03/17/2016, 07:51 PMI hate counters, hooking interrupts, and RISC assembly. :lol:
Well, I guess that's pretty-much-why I'm spending my time banging my head against the brick-wall of GCC!  ](*,)

To be honest ... I personally believe that the V810 is the best designed and most-programmer-friendly of all the 5th-Generation CPUs (including the 3DO's ARM!).

You don't have to actually think about any of the "nasty" gotchas from those 1st-generation RISC architectures ... it has hardware-interlocks so that you can just code as you see fit (you just lose a few percentage-points of performance).

IMHO, the PlayStation's and N64's MIPS CPUs are ugly-but-bearable. The Saturn's SH-2 is an abomination, and just shows Sega's continued inability to design elegant hardware. The 3DO's ARM chip is really nice ... but a classic CISC architecture.

But the V810 is actually the first well-designed, programmer-friendly RISC chip that I've ever seen.

I guess that we've all got our own comfort-zones.


Quote from: Mednafen on 03/17/2016, 09:53 PMeris_tetsu_get_raster() may occasionally return a garbage value on real hardware as it doesn't appear to follow the hardware programming manual's advice.  You can call the function multiple times to work around it, but that's not as efficient as having the workaround for the hardware bug in the function itself...
Thanks! I wasn't sure if that was a "real" warning in the docs, or just a bad artefact of Google Translate.

I'll fix the liberis function to actually read the value a couple of times and make sure that it's stable.
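A minimal sketch of that "read it until it's stable" idea, in C. This is not the actual liberis fix: read_raster_stable and the raster_read_fn callback are hypothetical names, and the scripted fake reader only exists so the logic can be tried off-target. On hardware the callback would be whatever does the raw register read.

```c
#include <stdint.h>

/* Hypothetical raw-read callback; on hardware this reads the raster count. */
typedef uint16_t (*raster_read_fn)(void);

/* Re-read until two consecutive reads agree, giving up after max_tries
   extra reads and returning the last value seen. */
static uint16_t read_raster_stable(raster_read_fn read_raw, int max_tries)
{
    uint16_t prev = read_raw();
    for (int i = 0; i < max_tries; i++) {
        uint16_t cur = read_raw();
        if (cur == prev)
            return cur;
        prev = cur;
    }
    return prev;
}

/* Off-target stand-in: a scripted sequence with one garbage sample in it. */
static const uint16_t fake_samples[] = { 100, 0x1FF, 101, 101 };
static unsigned fake_pos = 0;

static uint16_t fake_raster_read(void)
{
    uint16_t v = fake_samples[fake_pos];
    if (fake_pos + 1 < sizeof fake_samples / sizeof fake_samples[0])
        fake_pos++;
    return v;
}
```

With the scripted samples above, read_raster_stable(fake_raster_read, 8) skips past the mismatched reads and settles on 101.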

I just "fixed" Alex's startup code to do a faster job of clearing "C"'s BSS segments on startup ... partially for the performance improvement, but mostly to avoid having to dig much deeper into whatever-it-is that I broke in GCC's stack code (I think that I know, I'm just not enjoying the thought of fixing it).

While you're "on-the-line" ... can I ask if you've seen anything in the PC-FX BIOS or libraries that uses the R2 and R5 registers?

AFAIK, they're not used except temporarily-during-startup, and I'd like to use them for the Frame Pointer and the Thread Local variables pointer.

SamIAm

#169
QuoteTo be honest ... I personally believe that the V810 is the best designed and most-programmer-friendly of all the 5th-Generation CPUs (including the 3DO's ARM!).

You don't have to actually think about any of the "nasty" gotchas from those 1st-generation RISC architectures ... it has hardware-interlocks so that you can just code as you see fit (you just lose a few percentage-points of performance).

IMHO, the PlayStation's and N64's MIPS CPUs are ugly-but-bearable. The Saturn's SH-2 is an abomination, and just shows Sega's continued inability to design elegant hardware. The 3DO's ARM chip is really nice ... but a classic CISC architecture.
Interesting. It makes me wish I understood more about processors and programming.

When it comes to Sega, in addition to whatever poor choices the engineers were making, it seems to me that the executive side may have been steering them into bad ideas as well. Sega had a relationship with Hitachi - I think they were manufacturing the 68000s and other components in the Mega Drive, and there are even rumors that Hitachi execs and Sega execs were golfing buddies. Not to mention, I wouldn't be surprised if there was also a reluctance to use non-Japanese parts.

Don't let me derail the thread, though.  :wink:

EDIT: Here's a Japanese article I just found that tells at least some of the story of how Sega chose the SH-2, from the perspective of a Hitachi engineer, published in an industry magazine in September 1997.

wayback://renesas.com/mpumcu/superh/related_sh/theme/story/06.jsp

Some interesting points:
- The SH-1 was completed by 1992, but nobody was buying it.

- Hitachi approached Sega about using it in summer 1992. Interestingly, their first meeting wound up taking place in the Sega cafeteria.

- Sega rejected the SH-1 as it was. They told them to come back with 1) improved multiplier performance, 2) synchronous DRAM interface circuits installed, 3) the ability to meet certain game-oriented benchmarks.

- Hitachi started mass producing the SH-1 before they had any buyers for it. The lead engineer was frustrated with this and with the general work environment, and was about to take a job at another company. He submitted his intention to resign and everything.

- The vice president of Hitachi's semiconductor subsidiary called him, buttered him up, and asked him to stick around a while longer.

- Rather suddenly, Sega accepted the SH chip for its next game system in autumn 1992. Hitachi started working on an improved SH-2 soon after.

- In summer 1993, Sega complained to Hitachi that the SH-2 wasn't powerful enough. In September, there was a big meeting of top people from both companies, and Hitachi apparently revealed a "secret strategy" - put two of them together.

- By March 1997, Hitachi had sold 15 million SH-2s through Sega. However, the profits weren't good because Sega was so desperate to keep costs down.

TailChao

Quote from: elmer on 03/17/2016, 07:39 PMIf you know for sure that your code is running at 60Hz, you could always wait for raster count 261, and then wait again for "disp" to go to zero (which is probably the vertical-blank).

That's horrible, but it might work.

If it were me doing this, then I'd just figure out how to hook the vblank-interrupt with a small assembly-language function, and implement a counter.
I don't think we're going to find a PAL PC-FX in some NEC dumpster any time soon.

In the long run the hook is required if you start doing complex raster effects. May as well get it out of the way and then not think about it.

Quote from: elmer on 03/17/2016, 11:16 PMTo be honest ... I personally believe that the V810 is the best designed and most-programmer-friendly of all the 5th-Generation CPUs (including the 3DO's ARM!).

You don't have to actually think about any of the "nasty" gotchas from those 1st-generation RISC architectures ... it has hardware-interlocks so that you can just code as you see fit (you just lose a few percentage-points of performance).

IMHO, the PlayStation's and N64's MIPS CPUs are ugly-but-bearable. The Saturn's SH-2 is an abomination, and just shows Sega's continued inability to design elegant hardware. The 3DO's ARM chip is really nice ... but a classic CISC architecture.

But the V810 is actually the first well-designed, programmer-friendly RISC chip that I've ever seen.
Which is such a shame, since the rest of the PC-FX doesn't really live up to the CPU.

MIPS is fantastic if you're a compiler.

I have to wonder if most of Sega's odd console hardware choices are from taking their arcade designs (which are very reasonable) and trying to rip features out until it reaches a consumer price point (which is never reasonable). The company as a whole is such a great case study.

Quote from: elmer on 03/17/2016, 11:16 PMI guess that we've all got our own comfort-zones.
This is definitely a factor.

The nice thing about console hardware (especially old hardware) is that every single platform has something "wrong" with it. We get to pick the "least bad" option.

elmer

Quote from: SamIAm on 03/18/2016, 12:42 AMDon't let me derail the thread, though.  :wink:
Hahaha ... I like tangents!  :wink:


QuoteEDIT: Here's a Japanese article I just found that tells at least some of the story of how Sega chose the SH-2, from the perspective of a Hitachi engineer, published in an industry magazine in September 1997.
Thanks, that's very interesting.  :-k

Here's a Hitachi marketing-brief about the SH-2 from back around that time-period ...

wayback://hotchips.org/hc_archives/HC6.4.2.pdf

QuoteWhen it comes to Sega, in addition to whatever poor choices the engineers were making, it seems to me that the executive side may have been steering them into bad ideas as well. Sega had a relationship with Hitachi - I think they were manufacturing the 68000s and other components in the Mega Drive, and there are even rumors that Hitachi execs and Sega execs were golfing buddies. Not to mention, I wouldn't be surprised if there was also a reluctance to use non-Japanese parts.
That sounds like your typical "business" scenario.

Hitachi's early CPU successes were in being a licensed second-source for American CPUs, and often improving them ... such as their HD64180 (improved Zilog Z80) and their HD6309 (improved Motorola 6809). Hitachi's relationship with Motorola gave them first the 6809, and then the 68000 (which they sold to Sega for the MegaDrive, and then also the Saturn).

AFAIK, the SuperH series (SH1, SH2, SH3) was one of their first home-grown designs ... and they made a bit of a mess of it.

It's still the only CPU that I know of that has 2 different instructions, with different names, and different binary codes, that do exactly the same thing.

It also misses out a bunch of useful things, which means that you end up needing to use multiple instructions to accomplish what other RISC architectures can do with a single instruction.

Their "secret strategy" of throwing 2 SH-2 CPUs together in the Saturn and the 32X was an absolute joke, and just a marketing gimmick to claim double the processing power. In reality, each CPU could pretty-much completely use the entire memory bandwidth by itself, and so pairing the 2 of them on a single bus just led to massive bus-contention and starvation for each processor.

They tried to mitigate this by re-purposing the 4KB of write-through-cache into internal memory in order to reduce the bus-contention, but that was a joke because the 5th-generation machines were all supposed to be programmed in "C", and neither the language, nor any of Sega's libraries could actually take advantage of that little 4KB of uncontested memory.

Which is why nearly-all early Saturn games just disabled the 2nd CPU and ran entirely on the 1st CPU ... throwing away 50% of Sega's marketing-hype performance.


The PC-FX's V810 (and the PlayStation's MIPS R3000) don't have the raw fake-benchmark performance numbers of the SH-2, but they easily make up for it in practice by providing "usable" performance.

IMHO, the PlayStation is the 5th-generation equivalent of the PC Engine. It's an elegantly-designed machine that does exactly what it's designed to do, and does it very, very well. But since it's basically a super-Amiga, with a fast blitter for 2D games, and good-for-its-time-but-poor-in-hindsight 3D performance, it's just a little bit "boring" to program.

The PC-FX is like a C-programmable SuperGrafx, with some extra SNES-like background layers and video too. It's just a more-interesting 2D machine, especially if you contemplate using the HuC6273 chip that's on the PC-FXGA to give you Saturn-like 3D capabilities (or just lots of sprites).

elmer

Quote from: TailChao on 03/18/2016, 11:07 AMIn the long run the hook is required if you start doing complex raster effects. May as well get it out of the way and then not think about it.
I agree ... but as-for-myself, I'm too busy at the moment working on other things (like the Xanadus and trying to get GCC "fixed").

Well ... that and pontificating in interesting threads! :roll:


QuoteMIPS is fantastic if you're a compiler.
Which is OK in 2016 ... but the quality of the compilers back in 1995 was pretty bad. You pretty-much-had-to drop down into assembly for any speed-critical routines.

But "yeah", MIPS's delay-slots aren't that nasty to program-around. IIRC, my distaste at the time was more to do with my lack of a decent assembler.

By the time that the N64 came around, that had been fixed by the guys at Cross-Products and PsyQ ... but the N64 had its own "mysteries" with the RSP.


QuoteWhich is such a shame, since the rest of the PC-FX doesn't really live up to the CPU.
The choice to ship with 2 PCE VDC chips is definitely bizarre.

The choice to ship with the PCE sound-chip is a bit regrettable, but just follows-on from NEC's years of promoting CD-audio and premixed-ADPCM in games.

The PC-FX's own custom chips, the HuC6271 for video, the HuC6272 for backgrounds and SCSI, and the HuC6273 for "3D" sprites, seem to be pretty-nice, and very much remind me of a Saturn-done-right-but-a-bit-less-powerful-and-a-lot-more-usable.


QuoteI have to wonder if most of Sega's odd console hardware choices are from taking their arcade designs (which are very reasonable) and trying to rip features out until it reaches a consumer price point (which is never reasonable). The company as a whole is such a great case study.
There's definitely a huge difference in practical design-philosophy between the throw-everything-at-it-and-damn-the-cost arcade boards where you're charging thousands of dollars each to a select-few customers in order for them to suck the quarters out of people's pockets, and the let's-sell-millions-cheaply console market.

Sony's PlayStation is a perfect example of that. It's a very "minimal" hardware design, with the costly chips put exactly where they're needed.

The Saturn, OTOH, is an absolute joke of a design, with expensive bits-and-bobs thrown everywhere and lots of stuff just sitting there idle because it's got nothing to do most-of-the-time.


QuoteThe nice thing about console hardware (especially old hardware) is that every single platform has something "wrong" with it. We get to pick the "least bad" option.
Hahaha ... "yep"!  :wink:

Mednafen

Quote from: elmer on 03/17/2016, 11:16 PMWhile you're "on-the-line" ... can I ask if you've seen anything in the PC-FX BIOS or libraries that uses the R2 and R5 registers?

AFAIK, they're not used except temporarily-during-startup, and I'd like to use them for the Frame Pointer and the Thread Local variables pointer.
No, but I haven't really dug into the BIOS and official dev libraries that much.

OldRover

OK so on another subject... I am now starting to delve into the tools we have at our disposal, now that AGE is translated. I took a 32x32 BMP image and converted it to a 6270 sprite. AGE expanded it to 128x128. Dafuq?

ccovell

Quote from: elmer on 03/18/2016, 11:59 AMIt's still the only CPU that I know of that has 2 different instructions, with different names, and different binary codes, that do exactly the same thing.
CLA when you have LDA #0  when you also have XOR A,A
SHL A when you have ADD A,A...

(Well, I guess with completely orthogonal instruction sets, such things are inevitable.)

Quote from: elmer on 03/18/2016, 11:59 AMThe Saturn, OTOH, is an absolute joke of a design, with expensive bits-and-bobs thrown everywhere and lots of stuff just sitting there idle because it's got nothing to do most-of-the-time.
Your description there completely reminded me of how local municipalities (and their workers) operate over here in Japan...  :-|

OldRover

#176
I modified trap15's sprite example to fill VRAM with an array rather than noise.

frozenutopia.com/pcfx/spr7up_cd-0000.png

But this is as far as I got. If I try changing the size of the sprite (using either eris_sup_spr_create or eris_sup_spr_ctrl), it vanishes. There is also no update code in the example program so it makes me wonder how the sprite's position is ever updated at all... unless liberis is automatically updating the SATB any time a change is made...

EDIT: Wait a minute... am I right in reading that the FX's 6270s can only make 16x16 sprites, completely fucking over one of the best features of the same chip on the PCE?

elmer

Quote from: ccovell on 03/19/2016, 09:40 AMCLA when you have LDA #0  when you also have XOR A,A
Ooooh ... I'm definitely curious which CPU has both "CLA" and "XOR A,A". I wouldn't count "LDA #0" because that would probably have an extra byte in the encoding, or an extra cycle in execution, and possibly even different flag settings.

QuoteSHL A when you have ADD A,A...

(Well, I guess with completely orthogonal instruction sets, such things are inevitable.)
Yep, with those instructions it would depend upon the allowed addressing modes for each instruction.

*************

But the SH-2 has ...

Format:      SHAL Rn
Abstract:    T ← Rn ← 0
OpCode:      0100nnnn00100000
Description:
Arithmetically shifts the contents of general register Rn to the left by one bit, and
stores the result in Rn. The bit that is shifted out of the operand is transferred to the T bit.

Format:      SHLL Rn
Abstract:    T ← Rn ← 0
OpCode:      0100nnnn00000000
Description:
Logically shifts the contents of general register Rn to the left by one bit, and stores
the result in Rn. The bit that is shifted out of the operand is transferred to the T bit.

2 instructions that do exactly the same thing.


They're complemented by ...

Format:      SHLL2  Rn
Abstract:    Rn << 2 → Rn
Format:      SHLL8  Rn
Abstract:    Rn << 8 → Rn
Format:      SHLL16 Rn
Abstract:    Rn << 16 → Rn

Format:      SHAR Rn
Abstract:    MSB → Rn → T
Format:      SHLR Rn
Abstract:    0 → Rn → T

Format:      SHLR2 Rn
Abstract:    Rn >> 2 → Rn
Format:      SHLR8 Rn
Abstract:    Rn >> 8 → Rn
Format:      SHLR16  Rn
Abstract:    Rn >> 16 → Rn


Errrmmmm ... WTF happened to "SHAR2", "SHAR8" and "SHAR16"??? They'd actually be useful, but they're nowhere to be found.

Definitely a CPU designed by a hardware engineer that never had to actually program a line of code in his life.

Of course, any self-respecting CPU designer from the time would just have put in a barrel-shifter, and provided "SHLL/SHLR/SHAR Rn,Rn" instead of that rat's-nest of stupid instructions.
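To make the complaint concrete, here is a small C model of the right-shift cases. It only models the register result (the T bit is left out), the function names are invented for this sketch, and barrel_shar is a stand-in for what a hypothetical "SHAR Rn,Rn" with a barrel shifter would compute, not a real SH-2 instruction.

```c
#include <stdint.h>

/* SHLR Rn: logical shift right by one, 0 goes into the MSB. */
static uint32_t sh2_shlr(uint32_t rn)
{
    return rn >> 1;
}

/* SHAR Rn: arithmetic shift right by one, the MSB is kept. */
static uint32_t sh2_shar(uint32_t rn)
{
    return (rn >> 1) | (rn & 0x80000000u);
}

/* What the missing "SHAR2" would do, spelled as the two SHARs it costs. */
static uint32_t shar_by_2(uint32_t rn)
{
    return sh2_shar(sh2_shar(rn));
}

/* Barrel-shifter style: arithmetic shift right by any count in 0..31. */
static uint32_t barrel_shar(uint32_t rn, unsigned count)
{
    uint32_t sign_fill = (rn & 0x80000000u) ? ~(0xFFFFFFFFu >> count) : 0u;
    return (rn >> count) | sign_fill;
}
```

A signed divide-by-256 style shift would therefore take eight SHARs on this instruction set, where the barrel-shifter form is one operation.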


QuoteYour description there completely reminded me of how local municipalities (and their workers) operate over here in Japan...  :-|
Hahaha, so true everywhere!  :wink:

OldRover

frozenutopia.com/pcfx/spr7up_cd-0003.png

Getting somewhere... but this is rather convoluted and quite different than working with the same chip on the PCE.

elmer

Quote from: OldRover on 03/19/2016, 12:42 PMWait a minute... am I right in reading that the FX's 6270s can only make 16x16 sprites, completely fucking over one of the best features of the same chip on the PCE?
Nope, you're missing something in the docs, or Google Translate is messing things up for you.

It's exactly-the-same VDC chip that's in the PCE, and it has the same 32x64 maximum sprites with the size-bits in the same place.

I think that you may be expecting a little-too-much from the current-state of liberis.

Alex has provided all the critical background stuff in there ... but the higher-level "engine" stuff (like SATB updates) is completely-missing AFAIK.

That's where you get to read the SDK docs, and perhaps get-your-feet-wet with some V810 assembly language ... or not.


Quote from: OldRover on 03/19/2016, 01:18 PMGetting somewhere... but this is rather convoluted and quite different than working with the same chip on the PCE.
I'm glad that you're making progress!

It's exactly-the-same chip, with exactly-the-same data formats, how is it different to work with?

OldRover

In order to get this sprite to display on the PCE, I would normally give it pattern values in offsets of 0x40. Here, it's a pattern value offset of 2. That does not make a whole lot of sense to me.

eris_sup_spr_set(0);
eris_sup_spr_create(0, 0, 0, 0);
eris_sup_spr_set(1);
eris_sup_spr_create(16, 0, 2, 0);
eris_sup_spr_set(2);
eris_sup_spr_create(0, 16, 4, 0);
eris_sup_spr_set(3);
eris_sup_spr_create(16, 16, 6, 0);

but anyway...

_eris_sup_spr_ctrl:
mov lp, r17
setup_addr r20, r19, r18, 3
mov r6, r16
mov r7, r15

mov r19, r6
mov r18, r7
jal _eris_low_sup_set_vram_write

mov r19, r6
mov r18, r7
jal _eris_low_sup_set_vram_read

mov r19, r6
jal _eris_low_sup_vram_read
mov r10, r7

and r16, r7
or r15, r7
mov r19, r6
jal _eris_low_sup_vram_write

mov r17, lp
jmp [lp]

I am not sure how similar this is to the original C code I wrote, as this is nineteen flavors of Greek to me, but in theory, the only purpose of the function is to modify the high byte of the fourth word of the SATB entry.

OldRover

#181
OK so I did some more testing... it seems that the fourth argument of eris_sup_spr_create() affects the entire fourth word. So, I set it to 0x1180 to set a size of 32x32 with a priority of 1 and a palette of 0, then got rid of the code for the other three sprites. It worked. So, I changed it to 0x1080 to reduce it to 16x32. It worked. So, it looks like it's affecting the entire word. I can work with that for now. :) I can only surmise from this that eris_sup_spr_ctrl() is doing the same thing.
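Those two values can be rebuilt from the HuC6270 attribute-word layout as it is usually documented for the PCE: palette in bits 0-3, priority in bit 7, width in bit 8, height in bits 12-13. The helper below is only a sketch with made-up names (satb_attr, SPR_W32 and so on), not something liberis provides, and it leaves the flip bits at zero.

```c
#include <stdint.h>

/* Assumed layout of the 4th SATB word:
   bits 0-3   palette
   bit  7     priority (sprite in front of the background)
   bit  8     width  (0 = 16, 1 = 32)
   bits 12-13 height (0 = 16, 1 = 32, 3 = 64)
   The X/Y flip bits (11 and 15) are left clear in this sketch. */
enum { SPR_W16 = 0, SPR_W32 = 1 };
enum { SPR_H16 = 0, SPR_H32 = 1, SPR_H64 = 3 };

static uint16_t satb_attr(unsigned width, unsigned height,
                          unsigned priority, unsigned palette)
{
    return (uint16_t)(((height & 3u) << 12) |
                      ((width & 1u) << 8) |
                      ((priority & 1u) << 7) |
                      (palette & 0xFu));
}
```

With that layout, satb_attr(SPR_W32, SPR_H32, 1, 0) gives 0x1180 and satb_attr(SPR_W16, SPR_H32, 1, 0) gives 0x1080, matching the values tried above.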

elmer

#182
Quote from: OldRover on 03/19/2016, 01:55 PMIn order to get this sprite to display on the PCE, I would normally give it pattern values in offsets of 0x40. Here, it's a pattern value offset of 2. That does not make a whole lot of sense to me.
I've not looked at the documentation ... is he perhaps using an "index" instead of a "byte-offset"?

_eris_sup_spr_ctrl:
mov lp, r17
setup_addr r20, r19, r18, 3
mov r6, r16
mov r7, r15

mov r19, r6
mov r18, r7
jal _eris_low_sup_set_vram_write

mov r19, r6
mov r18, r7
jal _eris_low_sup_set_vram_read

mov r19, r6
jal _eris_low_sup_vram_read
mov r10, r7

and r16, r7
or r15, r7
mov r19, r6
jal _eris_low_sup_vram_write

mov r17, lp
jmp [lp]


It makes sense when you know that the 1st 4 parameters to a C function are passed in registers r6,r7,r8,r9, and that the return value is passed back in r10.

"lp" is just the function's return-address (unlike the 6502, it doesn't go on the stack). So he's saving it temporarily in r17 so that it doesn't get trashed by the subroutine calls (the "jal" jump-and-link instructions).

He's using r15-r20 as temporary registers (there are 32 registers).

So, that code ...
  • calculates a VRAM address
  • sets that address in the VDC's MARR (READ) and MAWR (WRITE) registers
  • reads the 16-bit word at that VRAM address
  • ANDs it with the contents of r6 (the original 1st parameter)
  • ORs it with the contents of r7 (the original 2nd parameter)
  • writes the 16-bit word back into the VRAM address

Basically, simple stuff, but done in a really slow-and-horrible way.
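The steps listed above can be sketched in C roughly like this. It is only a model that runs on a PC: the fake VRAM array and the four small access functions are stand-ins invented for the example, not the liberis low-level calls, and the real VDC's address auto-increment is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the 6270: a small fake VRAM plus the two address
 * registers. None of these names come from liberis. */
#define FAKE_VRAM_WORDS 1024u
static uint16_t fake_vram[FAKE_VRAM_WORDS];
static uint16_t marr;   /* read address  */
static uint16_t mawr;   /* write address */

static void     set_vram_read(uint16_t addr)  { marr = addr; }
static void     set_vram_write(uint16_t addr) { mawr = addr; }
static uint16_t vram_read(void)               { return fake_vram[marr]; }
static void     vram_write(uint16_t value)    { fake_vram[mawr] = value; }

/* Same order of operations as the listing: set both addresses, read
 * the word, AND it with the first parameter, OR it with the second,
 * then write it back. */
static void modify_vram_word(uint16_t addr, uint16_t and_bits, uint16_t or_bits)
{
    assert(addr < FAKE_VRAM_WORDS);
    set_vram_write(addr);
    set_vram_read(addr);
    uint16_t word = vram_read();
    word &= and_bits;
    word |= or_bits;
    vram_write(word);
}
```

For example, with 0x1180 stored at address 3, modify_vram_word(3, 0x00FF, 0x1000) keeps the low byte and leaves 0x1080 behind.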

BTW ... Alex's function has a bug in it. He shouldn't be using r20 without saving it on the stack.

The ABI defines it as a "callee-saved" function-preserved register, and not a "caller-saved" function-workspace register.

P.S. I'm going to "guess" that this is the kind of technical stuff that makes most people's head ache!  :wink:

OldRover

Aye... the way spr_ctrl normally works is you set a mask and then set new parameters. For example:

spr_ctrl(SIZE_MAS|FLIP_MAS, SZ_32x32|NO_FLIP);
or

spr_ctrl(FLIP_MAS, FLIP_X);
so only the bits you want to change are affected.
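A minimal sketch of that mask-then-value behaviour, applied to the high byte of a 16-bit attribute word. The EX_ constants are illustrative values modelled on the HuC names above and should be treated as assumptions, as should the function name. Note that the assembly listing earlier ANDs the word with its first argument exactly as passed, so a caller of that version would presumably have to pass the inverted mask; that is a guess from reading the listing, not something tested.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only, modelled on the HuC names used above. */
#define EX_SIZE_MAS 0x31
#define EX_FLIP_MAS 0x88
#define EX_SZ_32x32 0x11
#define EX_NO_FLIP  0x00
#define EX_FLIP_X   0x08

/* Clears the masked bits in the high byte of attr_word, then sets the
 * new bits. Bits outside the mask are left untouched. */
static uint16_t apply_spr_ctrl(uint16_t attr_word, uint8_t mask, uint8_t value)
{
    assert((value & ~mask) == 0);   /* new bits must fall inside the mask */
    uint16_t wide_mask = (uint16_t)(mask << 8);
    return (uint16_t)((attr_word & ~wide_mask) | ((uint16_t)value << 8));
}
```

Starting from 0x1180, apply_spr_ctrl(word, EX_FLIP_MAS, EX_FLIP_X) only touches the flip bits and returns 0x1980; the size and priority bits survive.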

OldRover

Okay so this is all starting to come together... all the differences in paradigms are starting to lose relevance. Practice always makes perfect, I guess. :)

frozenutopia.com/pcfx/asteroid-0000.png

Two 32x64 sprites glued together on the first 6270. What I did was write a test program in assembly that does nothing more than import a sprite file, then wrote a utility that will extract the converted data from it and put it into a neat little text file. I can then copy the text out of the text file and paste it into my program, where I have a large array that holds all the sprite data. It's not the most elegant solution but for now it works. :)

elmer

That's great to see on the screen!  :)

OldRover

#186
frozenutopia.com/pcfx/asteroid-0003.png

I could not find a random number generator in liberis and since stdlib is disabled, I figured I would just add one manually. I implemented the basic xorshift RNG, and gave it static seeds (until I implement a "Push Run Button" part to seed it). The stars fly by from right to left and respawn at random X and Y values when they reach the left side of the screen. The right half of the asteroid on the screen is white because I was testing the eris_sup_spr_pal() function... it works. :)
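For reference, a basic 32-bit xorshift generator looks something like the sketch below. This is the common 13/17/5 variant and not necessarily the exact code used in the demo; the function names and the default seed are placeholders.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t rng_state = 2463534242u;   /* any nonzero static seed works */

/* Reseed, e.g. from a frame counter once a "Push Run Button" screen exists. */
static void rng_seed(uint32_t seed)
{
    rng_state = seed ? seed : 1u;          /* xorshift gets stuck at zero */
}

static uint32_t rng_next(void)
{
    uint32_t x = rng_state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    rng_state = x;
    return x;
}

/* Value in [0, limit), e.g. a respawn coordinate for a star.
 * The modulo introduces a slight bias, which hardly matters here. */
static uint32_t rng_below(uint32_t limit)
{
    assert(limit > 0);
    return rng_next() % limit;
}
```

A star leaving the left edge could then respawn with something like y = rng_below(240), where 240 is just an example screen height.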

What I am doing here is porting my old Asteroid Challenge demo to the PC-FX. A great deal of the code is directly portable, though any code that references graphics needs to be updated. The core logic and collision routines are 100% portable without changes from its PCE original. This is why I like C so much. ;)

EDIT: Moved further along... have added the asteroid generator.

/asteroid-0007.png

esteban

IMGIMG IMG  |  IMG  |  IMG IMG

OldRover

As with the PCE version, the sprites of the high score and the player shots create massive amounts of flicker due to the very large asteroids... but now I can move both of those to the second 6270 to eliminate the flicker. Woohoo :lol:

OldRover

The game works.

frozenutopia.com/pcfx/asteroid-0010.png

It currently lacks asteroid-to-player collision because it runs too fast, and I have not yet added any sound aside from the CD audio music... but it's working and playable.

OldRover

#190
The pad code does not work on real hardware. I don't know why. It will register a value and then immediately go back to 0... and will toggle just like that as you hold down *anything*.

EDIT: I came up with a hack that works... I read the pad more than once and use the highest value as the result. That gets the job done. I have no idea what's wrong but hey, at least I figured out a workaround.
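A sketch of that workaround, written so it can be tried on a PC. The pad-reading function is passed in as a pointer because the real liberis call is not shown here; fake_flaky_pad is an invented stand-in that mimics the value-then-zero toggling described above.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t (*pad_read_fn)(void);

/* Reads the pad several times and keeps the highest value seen. */
static uint32_t pad_read_highest(pad_read_fn read_pad, int tries)
{
    assert(read_pad != 0 && tries > 0);
    uint32_t best = 0;
    for (int i = 0; i < tries; i++) {
        uint32_t v = read_pad();
        if (v > best)
            best = v;
    }
    return best;
}

/* Fake pad for testing: alternates between 0 and a button value. */
static uint32_t fake_pad_calls;
static uint32_t fake_flaky_pad(void)
{
    return (fake_pad_calls++ & 1u) ? 0x0010u : 0u;
}
```

Taking the highest value works as long as the bad reads come back as 0. If different reads could each carry different valid bits, ORing them together might be the safer variant, but that is speculation without knowing the actual cause.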

elmer

Quote from: OldRover on 03/19/2016, 11:00 PMThe game works.
Excellent!  :)

But OTOH, that means that you're going to beat me to having a working PC-FX game.  :cry:


Quote from: OldRover on 03/20/2016, 12:25 AMThe pad code does not work on real hardware. I don't know why. It will register a value and then immediately go back to 0... and will toggle just like that as you hold down *anything*.
That's another one of those things to look for in the SDK docs.

The PC-FX's joypads are hardware-serial-read rather than parallel-read, and IIRC the hardware allows for a multi-tap.

So the liberis function that you're calling may not be resetting the "read" back to the "initial" state for the pad ... or something ... or I may be totally wrong.


Quote from: Mednafen on 03/18/2016, 02:31 PM
Quote from: elmer on 03/17/2016, 11:16 PMWhile you're "on-the-line" ... can I ask if you've seen anything in the PC-FX BIOS or libraries that uses the R2 and R5 registers?

AFAIK, they're not used except temporarily-during-startup, and I'd like to use them for the Frame Pointer and the Thread Local variables pointer.
No, but I haven't really dug into the BIOS and official dev libraries that much.
It doesn't look like anything in Hudson's compiler or BIOS is using R2 or R5, much to my surprise and happiness.

It does look like the BIOS "fsys" library is using $54 (or maybe $58) bytes of GP-relative storage from -32768[GP] to -32684[GP], but I think that I can take care of that in GCC's linker script so that we can still use those library functions.

My "stack-handling" problem is about 90% fixed ... it's something that I broke when doing the incredibly-ugly update from GCC 4.5.4 to GCC 4.7.4.  #-o

OldRover

Quote from: elmer on 03/20/2016, 01:51 PMExcellent!  :)

But OTOH, that means that you're going to beat me to having a working PC-FX game.  :cry:
Sowwie... :cry: but on the plus side... that means that we're really super close to having a completely useful toolchain, right? :D

Quote from: elmer on 03/20/2016, 01:51 PMThat's another one of those things to look for in the SDK docs.

The PC-FX's joypads are hardware-serial-read rather than parallel-read, and IIRC the hardware allows for a multi-tap.

So the liberis function that you're calling may not be resetting the "read" back to the "initial" state for the pad ... or something ... or I may be totally wrong.
I am not sure what the problem is myself and I have not tried doing it through just regular port reads (I assume liberis already does this though), but I worked around it by reading the port multiple times and using the highest value.

In other fun stuff, I tried implementing a track loop function... but it froze up the system. The SCSI-2 docs are HORRIBLE to comprehend... I was barely able to get CD audio playing at all, and even then, I can only send a single play command because successive commands lock the whole machine up. I am assuming it has something to do with the flag that tells the device to return a result immediately when a command is sent, but I have NO idea how to even get return values from the drive, nor am I able to figure out how to set ANY of the code pages to configure ANYTHING. I realize I'm not a coder but ffs, this shit's written for robots hocked up on Chesto berries.

OldRover

More fun trial-and-error... I used AGE to convert a 256 color BMP to a 256 color KING bitmap, then wrote a custom utility to strip off the 256 byte header, and used yet another tool I have that converts any file into a word array, plus my palette conversion utility that converts a JASC-PAL to a YUV palette array... bottom line, through a lot of experimentation and a lack of proper documentation, I got a nifty 256 color image to display on KING BG0.
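The header-stripping and word-array steps could be folded into one small routine along these lines. It is a sketch under assumptions: the function name is invented, the 256-byte header size comes from the description above, and packing the low byte first is a guess that would need checking against what KING actually expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_HEADER_BYTES 256u   /* header size as described above */

/* Skips the header and packs the remaining bytes into 16-bit words.
 * Returns the number of words written (0 if there is no payload). */
static size_t pack_words_after_header(const uint8_t *file, size_t file_len,
                                      uint16_t *out, size_t out_cap)
{
    assert(file != NULL && out != NULL);
    if (file_len <= EX_HEADER_BYTES)
        return 0;

    size_t words = (file_len - EX_HEADER_BYTES) / 2;  /* odd tail byte dropped */
    if (words > out_cap)
        words = out_cap;

    for (size_t i = 0; i < words; i++) {
        const uint8_t *p = file + EX_HEADER_BYTES + 2 * i;
        out[i] = (uint16_t)(p[0] | (p[1] << 8));      /* assumed: low byte first */
    }
    return words;
}
```

The resulting words can then be printed out as a C array initializer, which is what the separate file-to-array tool does.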

frozenutopia.com/pcfx/bgking_cd-0004.png

Here's the code for it... this requires no external resources, as it's entirely array-based. I used bg7up.c as the initial template for this program, and made changes where needed.

frozenutopia.com/pcfx/bgking.c

I tried to paste the code into this msg but it exceeded the character limit for a post, so I just uploaded the source instead.

elmer

Quote from: OldRover on 03/20/2016, 09:01 PMSowwie... :cry: but on the plus side... that means that we're really super close to having a completely useful toolchain, right? :D
QuoteI am not sure what the problem is myself and I have not tried doing it through just regular port reads (I assume liberis already does this though), but I worked around it by reading the port multiple times and using the highest value.
I think that your 2nd comment answers your 1st question.

We're definitely on the "path" to having a completely useful toolchain, and it's nice that you're getting some pretty stuff on the screen, but ...

From my POV, we've got a "broken" compiler, and a "broken" library, with minimal functionality.

It's a good, maybe-even-a-great, start ... but it's just a "start". There's one heck of a long way to go to have a "completely useful toolchain".

Now, I've got the patience to keep-on-plugging-away at it, but progress with GCC is slow-and-very-very-painful.

OldRover

#195
Quote from: elmer on 03/20/2016, 11:20 PMNow, I've got the patience to keep-on-plugging-away at it, but progress with GCC is slow-and-very-very-painful.
And this is why you're so awesome. :D

EDIT: I erased the rest of this post because I'm retarded.

elmer

Quote from: OldRover on 03/21/2016, 02:35 AM
Quote from: elmer on 03/20/2016, 11:20 PMNow, I've got the patience to keep-on-plugging-away at it, but progress with GCC is slow-and-very-very-painful.
And this is why you're so awesome. :D
Hahaha ... perhaps "stupid" is a better word!  ](*,)

Anyway, I think that I got the "stack-problem" in GCC 4.7.4 fixed late last-night. Now it needs some more testing to make sure that I wasn't hallucinating.

Then I still want to "break" it again and implement a different ABI, if I can.  :-s

The current ABI probably seemed-like-a-good-idea-at-the-time, but it breaks the possibility of in-game stack backtraces which are incredibly-helpful to decent debugging.

Luckily, NEC's idea of reserving the R2 and R5 registers for things that nobody-ever-used-in-practice gives us 2 free registers to use in a new ABI (if I can wrestle GCC into compliance).

Again ... very low-level technical stuff, but it's part of making the toolchain "usable" for any of the poor-suckers that decide to give-it-a-try.  :wink:

OldRover

I'll leave the low-end stuff to you, heh. :lol: I'm more of the end-user type... and I can assure you that us poor suckers are going to be numerous once this is more refined. ;)

I did finally get ADPCM playback working, and how. It's not trivial to explain how it works, but I will probably write up a quick tutorial on it at some point, as well as other things I've managed to get working on the end-user side of the struggle. Not having access to a whole lot of proper, readable documentation has proven to be the largest hurdle, and trying to find any kind of example source code with google usually returns snippets that use existing libraries rather than going straight to the metal. My experiences with SCSI have been the worst offender on both counts.

elmer

Quote from: elmer on 03/21/2016, 11:45 AMAnyway, I think that I got the "stack-problem" in GCC 4.7.4 fixed late last-night. Now it needs some more testing to make sure that I wasn't hallucinating.
The good news ... "yes", that "stack problem" seems to be fixed!  :D

And with-a-few-more-changes, newlib is now actually linking into a test-app, and newlib's "strlen" is working.

With yet-more-changes, "sprintf" is actually compiled and linked into the test-app.

The bad news ... that's shown-up the next bug in the compiler, "varargs" are not working.  #-o

Oh well ... onto the next piece of nasty debugging.  ](*,)


Quote from: OldRover on 03/21/2016, 01:59 PMI did finally get ADPCM playback working, and how. It's not trivial to explain how it works, but I will probably write up a quick tutorial on it at some point, as well as other things I've managed to get working on the end-user side of the struggle.
Great! Any documentation that you can do would be extremely helpful.

OldRover

OK sooooo.... ADPCM. This turned out easier than I thought... I was just doing a few things wrong, which can happen when you code for so long without resting.

eris_king_set_kram_pages(0, 0, 0, 0);
The last value is the ADPCM KRAM page. I leave it at 0. You can put it at 1, which means you have to set bit 31 when you load a sample into KRAM.

out8(0x120,1);
Bits 0 and 1 set the sample rate. I set this to 1 for 16kHz. 0 is 32kHz, 2 is 8kHz and 3 is 4kHz. Bits 2 and 3 set linear interpolation for channels 0 and 1, respectively. Bits 4 and 5 reset channels 0 and 1, respectively. I left them all alone for now. Experiment at your own leisure.

out8(0x122,0x3F);
out8(0x124,0x3F);
out8(0x126,0x3F);
out8(0x128,0x3F);
Set the left/right volumes of channel 0 and 1 to max (range is 0x0 - 0x3F). liberis has a function for this but as you've noticed, I'm avoiding liberis functions for now and just using the ports directly... I know the ports work, not so much the functions. :)

out16(0x600,0x51);
out16(0x604,0);
out16(0x600,0x52);
out16(0x604,0);
Port 0x600, when written, is Register Select on the KING. For this, I'm first telling KING to select register 0x51, which is the channel control for ADPCM channel 0. Bit 0 is the buffer select (0 for sequential and 1 for ring), bit 1 enables end interrupt and bit 2 enables intermediate interrupt. I'm not 100% sure exactly how the interrupts work just yet so I leave them alone for now. Anyway, writing to 0x604 after selecting a register configures that register with the data you send it. Register 0x52 will configure ADPCM channel 1.
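The select-then-write pattern can be wrapped in a small helper. In this sketch out16() is replaced by a PC-side stand-in that just logs each write so the sequence can be checked without hardware; king_write16 and adpcm_setup_channels are made-up names, not liberis functions.

```c
#include <assert.h>
#include <stdint.h>

#define KING_PORT_SELECT 0x600u   /* register select, as described above */
#define KING_PORT_DATA   0x604u   /* data for the selected register      */

/* PC-side stand-in for out16(): logs writes instead of touching hardware. */
#define LOG_MAX 16
static struct { uint16_t port, value; } port_log[LOG_MAX];
static int port_log_len;

static void out16(uint16_t port, uint16_t value)
{
    assert(port_log_len < LOG_MAX);
    port_log[port_log_len].port  = port;
    port_log[port_log_len].value = value;
    port_log_len++;
}

/* Hypothetical helper: select a KING register, then write its data. */
static void king_write16(uint16_t reg, uint16_t value)
{
    out16(KING_PORT_SELECT, reg);
    out16(KING_PORT_DATA, value);
}

/* Channel control for both ADPCM channels: sequential buffer,
 * both interrupts left disabled. */
static void adpcm_setup_channels(void)
{
    king_write16(0x51, 0);
    king_write16(0x52, 0);
}
```

On the real machine the logging body would be dropped in favour of the actual port write; the helper itself stays the same.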

That's pretty much it for getting ADPCM set up. To actually play a sample, you have to do a few things.

out16(0x600,0x58);
0x58 is the ADPCM channel 0 start address. This is going to tell KING where to get the ADPCM data from. ADPCM data is kept in normal KRAM. Using 0x5C instead of 0x58 does the same for channel 1.

out16(0x604,(0x2000/256));
The KRAM address has to be divided by 256. I wrote this out longhand to demonstrate.

out16(0x600,0x59);
Now I'm telling KING to set the end address. This is 0x59 for channel 0 and 0x5D for channel 1.

out16(0x604,(0x2000+10417));
The end address is an absolute address, not a divided one. My ADPCM sample is 20834 bytes long, but since addressing is in words, I cut this value in half. I do believe that if your samples' addresses are outside of the range of 16 bits that you will also need to send the upper bits of the address to 0x606... I've never tested this but this seems logical.
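The two address calculations can be written out as small helpers, shown here as a sketch. The names are invented, and the alignment check is an assumption that follows from the start register only holding the address divided by 256.

```c
#include <assert.h>
#include <stdint.h>

/* Value for the start-address register: KRAM word address / 256.
 * Assumption: the sample must start on a 256-word boundary, otherwise
 * the division would silently drop the low bits. */
static uint16_t adpcm_start_reg(uint32_t kram_word_addr)
{
    assert((kram_word_addr % 256u) == 0);
    assert(kram_word_addr / 256u <= 0xFFFFu);
    return (uint16_t)(kram_word_addr / 256u);
}

/* Absolute end address: start plus the sample length in 16-bit words. */
static uint32_t adpcm_end_addr(uint32_t kram_word_addr, uint32_t sample_bytes)
{
    return kram_word_addr + sample_bytes / 2u;   /* bytes -> words */
}
```

For the sample above, adpcm_start_reg(0x2000) is 0x20 and adpcm_end_addr(0x2000, 20834) is 0x2000 + 10417, the same numbers passed to out16().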

Now that you've told KING the range of the sample, it's time to actually play it.

out16(0x600,0x50);
0x50 is the register that controls playback for both channels. That also means that you will need to be mindful of samples already playing.

out16(0x604,5);
Finally, play the sample. Bit 0 plays the sample configured for channel 0 and bit 1 plays the sample configured for channel 1. Bits 2 and 3 control the sampling rate and use the same scheme as earlier. Since I want 16kHz, I set bit 2 on and bit 3 off. So... bit 0 on, bit 1 off, bit 2 on, bit 3 off... value is 5. This plays the ADPCM sample in channel 0 at 16kHz.

If you need to stop a sample while it's playing, select register 0x50 and send a 0 to the bit of the channel you wish to stop. Writing just 0 to 0x604 will stop both channels. You can read 0x604 to see which channels are currently in use, and then just stop either of them if you want to. If you set the buffer type to Ring, the sample will playback infinitely until you tell it to stop in this manner... or so the docs state.
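Building the value for register 0x50, and clearing a single channel's bit to stop it, might look like the sketch below. It assumes the bit layout described above (bits 0-1 play channels 0 and 1, bits 2-3 select the rate); the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { RATE_32K = 0, RATE_16K = 1, RATE_8K = 2, RATE_4K = 3 };

/* Play bits for channels 0 and 1 plus the rate field in bits 2-3. */
static uint16_t adpcm_play_bits(int ch0, int ch1, int rate)
{
    assert(rate >= RATE_32K && rate <= RATE_4K);
    return (uint16_t)((ch0 ? 1 : 0) | ((ch1 ? 1 : 0) << 1) | (rate << 2));
}

/* Clears one channel's play bit and leaves everything else alone.
 * 'current' would be the value read back from 0x604 with register
 * 0x50 selected. */
static uint16_t adpcm_stop_channel(uint16_t current, int channel)
{
    assert(channel == 0 || channel == 1);
    return (uint16_t)(current & ~(1u << channel));
}
```

adpcm_play_bits(1, 0, RATE_16K) evaluates to 5, the value written above. Stopping channel 0 while channel 1 keeps playing would mean writing back adpcm_stop_channel(current, 0) rather than a plain 0.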

So that pretty much sums up how to use ADPCM on the PC-FX.