VHDL CACHE ?

General chat about boosters.
exxos
Site Admin
Posts: 1618
Joined: Wed Aug 16, 2017 11:19 pm
Location: UK

VHDL CACHE ?

Post by exxos » Thu Dec 28, 2017 3:40 pm

One thing which is nice about the 030 CPU is the instruction and data caches.

I was just looking to see if any example code was about..

https://github.com/joewing/memsim/blob/ ... cache.vhdl

https://github.com/Tabrizian/Cache

Does not mean much to me :( but I think this would be possible for someone who knew about VHDL and how caches work.

The ST-RAM bottleneck is always the main issue with my current booster, so adding a cache would help overcome this a little and gain some extra speed without having to move to the 030 CPU.
4MB STFM 1.44 FD- VELOCE+ 020 STE - 4MB STE 32MHz - STFM 16MHz - STM - MEGA ST - Falcon 030 CT60 - Atari 2600 - Atari 7800 - Gigafile - SD Floppy Emulator - PeST - HxC - CosmosEx - Ultrasatan - various clutter

https://www.exxoshost.co.uk/atari/ All my hardware guides - mods - games - STOS
https://www.exxoshost.co.uk/atari/last/storenew/ - All my hardware mods for sale - Please help support by making a purchase.

Zarchos
Posts: 26
Joined: Mon Dec 11, 2017 4:01 pm

Re: VHDL CACHE ?

Post by Zarchos » Thu Dec 28, 2017 9:09 pm

Wouldn't adding a cache mean self-modifying code wouldn't work anymore?

Re: VHDL CACHE ?

Post by exxos » Thu Dec 28, 2017 10:22 pm

Zarchos wrote:
Thu Dec 28, 2017 9:09 pm
Wouldn't adding a cache mean self-modifying code wouldn't work anymore?
Probably, yes. But every hardware mod is going to break something; it's the price of speed, unfortunately.

keli
Posts: 54
Joined: Tue Aug 22, 2017 1:34 pm

Re: VHDL CACHE ?

Post by keli » Thu Dec 28, 2017 11:20 pm

exxos wrote:
Thu Dec 28, 2017 10:22 pm
Zarchos wrote:
Thu Dec 28, 2017 9:09 pm
Wouldn't adding a cache mean self-modifying code wouldn't work anymore?
Probably, yes. But every hardware mod is going to break something; it's the price of speed, unfortunately.
In this case, where the cache is external to the CPU, it should be fully transparent and self-modifying code should work fine. It's when the CPU has internal, separate instruction and data caches that you run into problems.

Re: VHDL CACHE ?

Post by exxos » Thu Dec 28, 2017 11:36 pm

keli wrote:
Thu Dec 28, 2017 11:20 pm
In this case, where the cache is external to the CPU, it should be fully transparent and self-modifying code should work fine. It's when the CPU has internal, separate instruction and data caches that you run into problems.
I was reading up on caches and read something similar: something about instructions being treated as data, which breaks self-modifying code.

rpineau
Posts: 196
Joined: Thu Aug 17, 2017 6:08 pm
Location: USA

Re: VHDL CACHE ?

Post by rpineau » Fri Dec 29, 2017 2:03 am

External caches usually don't have issues with self-modifying or generated code, as it's all "data" to them.
On a CPU with separate internal instruction and data caches, modifying instructions that are already in the instruction cache will not work. (That means code less than 256 bytes away from the PC, assuming every cache line was filled with the instructions following the current one, which is not even guaranteed.) The CPU performs the modification through the data cache, probably takes a cache miss and reloads the data cache, but the copy in the instruction cache is still not modified.
So an external cache can be beneficial and will definitely speed things up (like the cache on the MSTE).
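
To make the difference concrete, here is a rough Python toy model (not VHDL, and not cycle-accurate). The class names, the one-word "cache lines" and the fake opcodes are all assumptions made up for illustration. It only shows why a patched instruction goes stale in a split internal cache but not in a single external one.

```python
# Toy model only: hypothetical names, one-word cache entries, fake opcodes.

class SplitCacheCPU:
    """CPU with an internal instruction cache that data writes never update."""
    def __init__(self, memory):
        self.memory = memory      # main RAM: address -> value
        self.icache = {}          # instruction cache: address -> cached opcode

    def fetch(self, addr):
        """Instruction fetch: served from the instruction cache when possible."""
        if addr not in self.icache:
            self.icache[addr] = self.memory[addr]   # miss: fill from RAM
        return self.icache[addr]

    def write(self, addr, value):
        """Data write: reaches RAM, but the instruction cache is not touched."""
        self.memory[addr] = value


class ExternalCacheCPU:
    """One unified cache outside the CPU: every access goes through it."""
    def __init__(self, memory):
        self.memory = memory
        self.cache = {}

    def fetch(self, addr):
        if addr not in self.cache:
            self.cache[addr] = self.memory[addr]    # miss: fill from RAM
        return self.cache[addr]

    def write(self, addr, value):
        self.memory[addr] = value
        self.cache[addr] = value  # write-through keeps the cached copy current


def run(cpu_class):
    cpu = cpu_class({0x100: "NOP"})
    cpu.fetch(0x100)              # first execution caches the instruction
    cpu.write(0x100, "RTS")       # the program patches its own code
    return cpu.fetch(0x100)       # what actually executes the second time
```

With this model, `run(SplitCacheCPU)` still sees the old instruction, while `run(ExternalCacheCPU)` sees the patched one.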
Rodolphe

Petari
Posts: 151
Joined: Tue Nov 28, 2017 1:32 pm

Re: VHDL CACHE ?

Post by Petari » Fri Dec 29, 2017 6:44 am

I think there is some confusion here about self-modifying code. On the 68030 the problem is with modifying code that is only some 2-20 bytes ahead. That code is already in the CPU's pipeline, so it will not execute properly. I call it the pipeline bug, and I have found it in at least 10 games, such as Infestation, Parasol Stars (in the depacker) and Maupiti Island. Such code needs to be corrected to work properly on the Falcon and TT. And of course, it will not execute correctly even with the caches off.
I have not researched concrete cases of software being incompatible with the 68030 caches, but in principle self-modifying code should work with them, except for the close-ahead writes mentioned above. The 68030 caches are also too small to be really efficient, so I would not waste time emulating them. Turning the cache on or off in a 3D game does not help much on the 68030, because the code is complex and outgrows the cache very quickly.
Something like the Mega STE's cache, a single 16KB cache, is indeed a better and more efficient solution. I'm sure we can even find some cache chips on the market, or do it in VHDL on an FPGA, just not the 68030's split data/instruction type.

Re: VHDL CACHE ?

Post by keli » Fri Dec 29, 2017 8:55 am

exxos wrote:
Thu Dec 28, 2017 11:36 pm
I was reading up on caches and read something similar: something about instructions being treated as data, which breaks self-modifying code.
You're describing the situation with internal caches where the instruction and data caches are separate. With an external cache the CPU only sees faster RAM (on cache hits). Edit: Rodolphe and Peter explain it better above. I didn't think of the instruction pipeline on later CPUs throwing another spanner in the works for SMC.

You'd need to worry more about modifications to RAM that don't come from the CPU, such as from DMA or the Blitter (unless you manage to put the Blitter on the same side of the cache as the CPU). Maybe the cache logic could monitor the /BGA pin and invalidate the cache the first time it sees a write cycle while it is active?

Re: VHDL CACHE ?

Post by exxos » Fri Dec 29, 2017 10:53 am

Okay, let me think out loud here..

Cache the whole 4MB ST-RAM area.

When the CPU does a write, it also writes into the cache. When the CPU does a read, it can read from the cache RAM at much higher speeds.

The first problem is that we would need an extra bit to record whether each cache location has been updated or not. We will just call this the flag bit. When the CPU does a write, it sets the flag. If the flag is not set when the CPU does a read, we read from ST-RAM, the cache gets updated, and the flag is set.

So on this basis the cache is always updated on write cycles, and if the flag is not set, data is loaded from ST-RAM as normal and the cache is also updated.

For example, if the CPU did a write to some address and then immediately read from the same address, it would be able to load the data directly from the cache at high speed. Similarly, after the first read of a particular address, the second read can be loaded directly from the cache.
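
As a rough sketch of the flag-bit idea, here is a small behavioural model in Python rather than VHDL. The class name `FlagCache`, the word-addressed list standing in for ST-RAM and the `slow_reads` counter are all assumptions for illustration. Real hardware would need bus timing that this ignores.

```python
class FlagCache:
    """Behavioural model: cache RAM as big as ST-RAM, one flag bit per word."""

    def __init__(self, st_ram):
        self.st_ram = st_ram                  # slow ST-RAM (list of words)
        self.data = [0] * len(st_ram)         # fast cache RAM, same size
        self.flag = [False] * len(st_ram)     # flag bit: is the cache word valid?
        self.slow_reads = 0                   # counts reads that had to hit ST-RAM

    def write(self, addr, value):
        """CPU write: goes to ST-RAM and the cache, and sets the flag."""
        self.st_ram[addr] = value
        self.data[addr] = value
        self.flag[addr] = True

    def read(self, addr):
        """CPU read: fast if the flag is set, otherwise fetch and fill."""
        if self.flag[addr]:
            return self.data[addr]            # cache hit, no ST-RAM access
        self.slow_reads += 1                  # flag clear: read ST-RAM as normal
        value = self.st_ram[addr]
        self.data[addr] = value               # update the cache...
        self.flag[addr] = True                # ...and mark it valid
        return value
```

In this model the first read of an address goes to ST-RAM, and any later read of that address, or a read straight after a write, is served from the cache.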

Overall this would seem pretty easy to implement.. However, we do have things like the blitter and DMA to contend with.

In terms of the blitter, if it did a write then the cache would also need to be updated. If the blitter did a read, it could read from the cache. However, I don't think that would speed up the blitter, as it is still running at 8 MHz. Technically, if the blitter was clocked at 16 MHz during a cache hit, it could improve performance. But at this moment in time I will just assume there is no performance benefit in the blitter reading from or writing to the cache. The only thing we are concerned about is the blitter updating memory data.

We also have DMA to contend with. As there is no way that I know of to easily keep a record of which areas of RAM were updated, the only thing we could do is add to our cache control logic that if something happens on the DMA bus, we must invalidate the whole cache RAM area. We could actually do this with the blitter as well, and it would work, but obviously invalidating the cache slows things down. At this moment in time I think we can treat the CPU and blitter the same, both just reading and writing to the cache, as we should know what address the blitter is accessing.

Overall, we could just play safe and say that if anything happens on the CPU bus or the DMA bus which isn't controlled by the CPU directly, we just invalidate the cache.
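
The two policies above (update the cache in place when the address is known, flush everything when it is not) can be sketched in Python as a behavioural model. The class and method names are hypothetical, and the model assumes the cache logic can see blitter addresses but not DMA addresses, as described above.

```python
class SnoopingCache:
    """Behavioural model: flag-bit cache plus blitter and DMA handling."""

    def __init__(self, st_ram):
        self.st_ram = st_ram
        self.data = [0] * len(st_ram)
        self.flag = [False] * len(st_ram)

    def cpu_read(self, addr):
        """Return (value, hit). A miss fills the cache from ST-RAM."""
        hit = self.flag[addr]
        if not hit:
            self.data[addr] = self.st_ram[addr]
            self.flag[addr] = True
        return self.data[addr], hit

    def cpu_write(self, addr, value):
        self.st_ram[addr] = value
        self.data[addr] = value
        self.flag[addr] = True

    def blitter_write(self, addr, value):
        """Address is visible on the bus, so treat it like a CPU write."""
        self.cpu_write(addr, value)

    def dma_write(self, addr, value):
        """ST-RAM changes behind the cache's back: play safe and flush it all."""
        self.st_ram[addr] = value
        self.invalidate_all()

    def invalidate_all(self):
        """Clear every flag bit; the next read of any address goes to ST-RAM."""
        self.flag = [False] * len(self.flag)
```

After a `dma_write`, every address misses once and is refilled from ST-RAM, which is the slowdown mentioned above. A `blitter_write` leaves the rest of the cache intact.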

I suppose that DMA activity in games would basically be loading the next level or something, so we would need to invalidate a lot of RAM anyway. Once the data is loaded, things like sprites, which only need to be read from RAM continuously, could all be cache hits at 32 MHz.

Re: VHDL CACHE ?

Post by Petari » Fri Dec 29, 2017 12:08 pm

It is called tag RAM: it stores which locations are cached. That is why you see 2 SRAM chips in the Mega STE.
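
For anyone unfamiliar with tag RAM, here is a minimal Python sketch of a direct-mapped cache with separate tag and data arrays. It is a generic textbook arrangement with made-up sizes and names, and is not a description of how the Mega STE cache is actually wired.

```python
class DirectMappedCache:
    """Small cache in front of a larger memory, using a tag RAM and a data RAM."""

    def __init__(self, memory, num_lines=8):
        self.memory = memory
        self.num_lines = num_lines
        self.tag_ram = [None] * num_lines   # which address each line currently holds
        self.data_ram = [0] * num_lines     # the cached word for each line
        self.hits = 0
        self.misses = 0

    def _split(self, addr):
        """Low address bits pick the line (index); the rest is the tag."""
        return addr % self.num_lines, addr // self.num_lines

    def read(self, addr):
        index, tag = self._split(addr)
        if self.tag_ram[index] == tag:
            self.hits += 1                  # tag matches: data RAM holds this address
        else:
            self.misses += 1                # different address lives here: replace it
            self.tag_ram[index] = tag
            self.data_ram[index] = self.memory[addr]
        return self.data_ram[index]
```

Two addresses that share the same index evict each other, which is why a cache much smaller than RAM needs the tag to know what it is really holding.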

Considering the blitter (and DMA too): there is simply no sense in caching them. The goal is not to have faster writes into the cache, because that data will probably not be accessed again immediately, so there is a big chance it will be pushed out of the cache when new data overwrites it. We need writes to main RAM to be as fast as possible.
And don't forget that there is a third DMA in the ST: the video stage. It accesses RAM via the same chip as DMA, the MMU. If we want, for instance, to draw something on screen as fast as possible, so a direct write with the blitter or from the hard drive, the cache cannot speed it up at all. It simply must go into video RAM, and its speed is limited.
All in all, cache only for the CPU. For some serious speed-up of the ST, the first thing needed is faster RAM. With all new chips. It is often called a Falcon or TT :lol:
