Realizing much of my problems with discrete boards on original kazzos was
that I completely ignored the fact that most discretes are subject to bus
conflicts. This makes writes to the mapper bank selecting uber flakey...
Created routine in bnrom.lua script to start flash operation by writing
the bank table to where Lizard puts it. Need to write routine to find the
bank table in a provided rom for flashing. And dumping needs to find the
bank table prior to making mapper writes and then use it for bank
switching!
Tested and working on AVR, stm adapter, and inl6
Lizard 512KB flash/dump:
AVR decrepit old firmware & app: 7.9KBps flash
AVR new firmware and app (this build) F:12.1KBps D:14.6KBps
STM adapter: F:31KBps D:114KBps
INL6: F:40KBps D:120KBps
Modified flash/dump scripts to be more generic accepting mapper/memory
args and file names. Then they only handle the buffer and file
operations.
Created scripts/nes folder for holding all mapper scripts. Currently only
nrom.lua is working and verified with inlretro6. Found issue where the
very first byte read from PRG-ROM was garbage. Narrowed it down to lower
address byte not settling in time, added NOP and resolved issue.
Still need to test on original kazzo and stm adapter, planning to do that
after this commit.
Next task is to get BNROM working so I can start getting to work on lizard
builds while taking advantage of speed boost!
Most progress was on jtag lua statemachine code. From what I recall I
tested and verified most state change possibilities with logic analyzer.
So they should be fairly good. Possible I didn't test all later ones,
or things are partly unfinished, but my best guess is they're good.
Appears was able to erase MachXO CPLD. Added time delay for run test.
Did some basic testing for gameboy power switching circuit.
Also just got STM8S001 CIC programming working for discrete boards via
A0. Pretty sure I broke EXP0 in the process for SNES boards.. So need
to go back and fix that I think due to new means of changing swim pin.
Various changes to STM8 SWIM code to make more versatile allowing SWIM
pin to be located effectively on any STM32 GPIO pin. Still haven't
touched an AVR implementation, but made place holders so it can compile
for AVR at least. These SWIM changes aren't heavily tested, mostly just
made sure could flash SOIC-8 STM8 CIC via CIC CLK.
Beginings of JTAG code to configure CPLDs. Currently only tested state
change and scan out reading MachXO-256, 4032/64v, & XC9572/36XL CPLDs
Tested and working on inlretro6 v1.0p, stm adapter, & avr kazzos.
Older devices with flipflops will apply 5v signals to JTAG pins but time
is mostly minimized by keeping signals defaulted low unless actively
changing states or scanning data.
Still need to verify scan in working, probably move TDI/TDO long strings
to buffers instead of 32byte PBJE data array. Also need smarter PBJE
host code to keep track of current state and come up with PBJE register
values without hard coding them..
But things are working fairly well so far with SWIM & JTAG
implementations. Had some issues where I thought jtag pin toggling was
getting optimized out, but I must have simply had the logic analyzer
speed set too low and was missing pin changes that can be as quick as
40nsec with space optimized code. Current inl6 code is ~4400Bytes,
without optimization it's nearly 50% larger at ~6550Bytes..!
Optimizations seem fine in testing and with logic analyzer running at
50Mhz which is good because the GPIO registers are set as volatile so
they better not be getting optimized away!
boards for SF2 builds. Not necessarily the most clean, but it was
stable and worked well.
Need to get swim comms working on other board designs.
Need to come up with better swim activation with more exact timing.
Still need to implement swim comms on avr, hopefully that doesn't prove
to be too much of a PITA... Not looking forward to that. Can probably
only handle low speed, and faking pullup may not work as well without
time on it's side @ 16Mhz...
-Updated STM devices to always run @ 48MHz
Doesn't seem to cause any problems with SNES flashing couple thousand SF2
boards have been flashed with this build without issues
-Added note to usb_operations.c as manf/prod ID can't be read if drivers
aren't installed. Caused issues for Todd as he hadn't installed drivers
for new hardware.
-STM swim operations are working pretty well for SNES v2 and v3 boards
Haven't even touched SWIM on AVR core yet...
SWIM is pretty pin independent but only implemented on EXP0 so far
Reads "ROTF" aren't bullet proof but they're pretty good. Biggest
room for improvement aside from adding a legit pullup would be to have
an interrupt trigger the device header bit falling edge instead of the
current polling method which has decent amount of jitter.
Implementing interrupts would also probably allow for more easily
supporing reads longer than a single byte...
Only reads one byte, but good enough.. to get things done.
Code should actually work for low and high speed, but have only tested
high speed on writes so far.
Having issue where reads can fail at times. Esp with long strings of
'0'.. Perhaps operating at high speed would improve matters..
Although I'm also realizing maybe I'm not waiting for the device to reset
and reload HSI trim factory value, need to check that..
The new assembly file/function does everything needed so can start cutting
out inline assembly from swim_out function.
Swim code needs to run at 48Mhz. Realizing this is pretty vital to having
enough time to handle high speed. And timing of artificial pull-up
requires high trimmability..
Able to enter active mode and Write on the fly.
Simple test to toggle LED on STM8 GPIO works!
Still quite far from ideal setup. Some things needed:
-defines for ACK/NAK/NO_RESP in dictionary to report inteligbly to lua
-move test SWIM code into separate lua script
-define STM8-CIC registers for easier calling from lua
-entering active mode is too board dependent, need to use swim_base
-Need to make better use of device timers for entering active mode
-arm assembly is quite a mess, unaware of calling convention when writting
-stopping more than just r0-4, r5+ need to be restored if used
-thinking I'd like a full out assmebly file that gets compiled separately
-nothing is done to support SWIM with AVR
-hacking lack of powerful enough pullup on SWIM pin
not much that can be done to get around this...
don't want to add resistors to programmer for every pin I 'might' use
don't want to add resistors to each board that's made
-seems to work well enough, but reads will prob prove challenging
-currently only running at slow speed with ton of NOPs
AVR not yet working, performing low level SWIM operations will require
decent amount of core specific code due to differences in pin driver
styles, timers, cycles per instruction, etc. The fact that SWIM pin
changes based on the board ADDR0, DATA0, EXP0, etc multiplies this low
level code... Thinking about executing SWIM low level drivers from SRAM.
Initialization could include loading these routines to SRAM.
For now just focusing on supporting SWIM on STM cores for SNES boards.
set as other things may trigger off of this.
Also forgot firmware size report:
AVR KAZZO:
avr-size build_avr/avr_kazzo.elf
text data bss dec hex filename
6952 6 674 7632 1dd0 build_avr/avr_kazzo.elf
INL6:
arm-none-eabi-size -t build_stm/inlretro_stm.elf
text data bss dec hex filename
8408 0 684 9092 2384 build_stm/inlretro_stm.elf
STM adapter:
arm-none-eabi-size -t build_stm/inlretro_stm.elf
text data bss dec hex filename
9212 0 684 9896 26a8 build_stm/inlretro_stm.elf
Also the more I get into these flashing routines, maybe the original kazzo
wasn't so bad.. It just takes awhile to flash these chips... I'm not
sure it makes much sense to offer/support stm32 adapter. While it is less
work to cut it out, it's brought out bugs in my code just due to having
different platforms with their quirks manifest bugs.
prototype which has STM8 CIC driving flash /OE with inversion of SYS /RST.
STM8 CIC is running at 16Mhz, and doesn't actually function as CIC. Still
need to come up with special way to signal to CIC that it's plugged into a
programmer and not a console.
Things aren't as fast as they could be, but they're good for now and
proved working on all kazzo versions. Expecting decent speedup could be
aquired by optimizing the flash routine, not changing address unless
needed, or only changing low byte of address, etc. Could also let the
host put the flash chip in unlock bypass mode and keep it there until done
with flashing.
Current speeds:
INL6: 42.2 KBps flashing, 92KBps dumping
stm adapter: 25.3 KBps flashing, 96KBps dumping
AVR kazzo: 18.0KBps flashing, 14.3KBps dumping
Was able to get the inl6 up to 59KBps flashing. Which was 35sec total
flash time for 16mbit chip which has typical flash time of 22s plus
overhead. This got slowed down when supporting stm adapter as checks for
buffer status were required from what I recall. Also fixing flash polling
routine AVR found slowed things down.
Was able to get 140KBps dump time on inl6 with 16mbit SNES flash. This
was slowed when supporting stm adapter which brought out issue with stm32
usb driver. Locks up the device if the buffer isn't fully dumped prior to
calling. Need to get driver to support sending NAKs until data is dumped.
Current fix for checking buffer status slows things down for all devices.
AVR brought out issue with SNES v3 design where we can't rely on flash
poll data to toggle between reads as /OE and /CE are stuck low. Have to
toggle /RESET slowly to toggle /OE and ensure we don't move on to next
byte until previous is fully flashed.
STM32 found initial issue where /WR should be set low first to set
direction of data level shifter, then set /ROMSEL low to enable level
shifter output. Not doing this caused bus conflicts between the two
causing flakey writes where not all bits were getting cleared.
lua scripts currently force SNES, need to add smart check that identifies
SNES flash board if vector data is 0xFFFF. Also funky order where it
always erases after flashing as this was more convient for testing.
While this commit is far from ideal, it's stable and I've done my best to
not commit junk that will cause problems later. Just make sure to always
verify dumping algo before assuming something is wrong with writes!
works on both inl6 and original kazzo just fine. Dumping v3 prototype has
a few byte corruptions on inl6, but is fine on original kazzo. The same
bytes often fail, but not consistently. Tinkered with adding delay, but
that didn't help. Also have issue with adapter not dumping properly.
Prob bug with HIGH ADDR on that board need to sort out still. Going to
focus on erasing and dumping next then come back to some of these issues.
Not 100% sure what all happened with this update.. :/
Tested and have all 3 recent kazzos flashing and dumping PRG-ROM and
CHR-ROM on NROM NES board. Pretty sure I tested purple and green kazzos
too as I had left those on in pinport and seem to recall having them all
working when I tweeted 2 weeks ago..
Created new status_wait for buffers so can wait for them to finish
dumping/flashing before starting/ending operations. That cleaned up
dump/flash code a fair amount.
On first tests today I had issues where setting flash operation would hang
and fail with both stm kazzos. As I started to debug the issue it
disappeared, so IDK what that was all about.. I think there might be an
issue with my stm32 usb drivers.. Those were updated in this commit to
properly allow write "OUT" packets to be supported.
Planning to start tinkering with SNES in prep for the no save boards
arriving tomorrow!
Need to get stm32 up and working, currently the usbFunctionWrite causes
device descriptor request to fail on stm32 devices. So need to do some
debugging there which I was expecting..
Learned lesson yet again to stop putting logic inside ADDR/DATA port
macros. The expansion of putting logic inside those is hard to predict,
it ends up varing based on mcu hardware.. Just don't do it!!
Have some cleaning up on buffer.c that's needed. Currently have to give
the device some time to dump buffer prior to calling payload. The buffer
manager should be able to handle this itself! Also don't think I should
have to reset raw buffers and reallocate from scratch between PRG/CHR
dumping! But that's currently the case. It works for now on AVR kazzo
and STM adapter & inlretro6.
tested and verified on purple, green, and yellow/orange avr kazzos and
stm32 inlretro6 proto, and stm32 adapter with yellow kazzo board
AVR takes ~17.5sec to dump 256KB -> 1:10 for 1MByte = 14.6KBps
STM takes ~8.5sec to dump 1MByte = 120KBps
STM32 usb driver is far from optimal as it's setup to be minimal with only
8byte endpoint0 to make an effort to align avr and stm. Larger endpoints
and bulk transfers should greatly speed up stm usb transfers
refactored firmware buffer.c and implemented most of the required opcodes
added check that should cover if device isn't ready for a IN/OUT
transfer. Does this by usbFunctionSetup returning zero which causes the
device to ignore the host. Don't think I've got the stm32 usb driver
setup properly to handle this not sure I fully understand Vusb driver
either. Anyway, hopefully it works well enough for now and keep this in
mind if issues crop up in future.
Still haven't implemented usbFunctionWrite, not sure stm usb driver is
setup properly yet either..
build sizes:
avr yellow/orange: avr-size build_avr/avr_kazzo.elf
text data bss dec hex filename
5602 6 674 6282 188a build_avr/avr_kazzo.elf
previous builds of avr code size was ~6.4KB when flashing and dumping was working.
AVR bootloader is 1.7KB taking up majority of 2KB boot sector.
So AVR has 16KB - 2KB boot = 14KB available, using ~44% of non-boot sector
available flash Have 4 buffers defined, and 512B of raw buffer defined so using
~65% SRAM Making pretty good use of the chip just for basic framework.
Not a ton of room for board/mapper specific routines, so will have to keep this
in mind. Creating more generic routines to save flash will come with a speed
hit, but perhaps we shouldn't worry too much about that as devices below
really boost speed without even trying. There is some sizable amount of
SRAM available could perhaps load temporary routines into SRAM and execute
Also have ability to decrease buffer sizes/allocation. Perhaps routines
could actually be store *IN* the raw buffers.. ;)
stm adapter: arm-none-eabi-size -t build_stm/inlretro_stm.elf
text data bss dec hex filename
7324 0 680 8004 1f44 build_stm/inlretro_stm.elf
Currently targetting STM32F070C6 which has 32KB flash, 6KB SRAM
Could upgrade to STM32F070CB in same LQFP-48 package w/ 128KB/16KB
Don't think that'll be of much value though especially with limitation
on connectors for adapter.
So currently don't have user bootloader, only built in ones.
8KB of 32KB avaiable flash = 25% utilization
680B of 6KB available sram = 11% utilization
32KB device doubles amount of available flash compared to AVR, although
stm32 code isn't quite a condensed compared to AVR.
stm inlretro6: arm-none-eabi-size -t build_stm/inlretro_stm.elf
text data bss dec hex filename
6932 0 680 7612 1dbc build_stm/inlretro_stm.elf
Mostly limited to STM32F070RB as choosing device requiring XTAL, and
desire large number of i/o. This device provides 128KB flash, 16KB SRAM
Currently using 7.6KB/128KB flash = 6% utilization
Currently using 680B/16KB SRAM = 4.1% utilization
LOTS of room for growth in this device!! Part of why I choose it over
crystalless 072 version, as it came with more flash for less cost.
Also hardly making use of 1KB of USB dedicated SRAM:
32B buffer table entries
16B endpoint0 IN/OUT
48B of 1024B available = 4.6% utilization
Have separate lua modules now in scripts/app folder
Dictionary calls are now their own lua module
firmware now capable of calling multiple different dictionaries
have firmware & lua io and nes dictionaries, able to detect
NES and famicom carts. Created expansion port abstraction so most kazzo
versions behave identically.
Created separate make file for stm adapter and inl6
added PURPLE_KAZZO and GREEN_KAZZO defines back in. They work well enough
for sensing NES vs famicom carts so far. GREEN_KAZZO requires
PURPLE_KAZZO to also be defined. GREEN_KAZZO is also only compatible with
AVR_CORE due to software_AHL/AXL functions specifically written for AVR.
I think things will work if a STM_ADAPTER is placed on a PURPLE_KAZZO and
both those defines are made as only real difference is software tying of
AXL and X_OE. But haven't tested this aside from ensuring it compiles.
Have correction to pinport_al.h that will commit immediately after this.
Effectively deleted old dictionary call function/files.
Created lua_usb_vend_xfr function so lua can directly send and receive
vendor setup transfers.
Dictionary calls are more like function calls now, and all args aren't
required so the LED can be turned on for example in lua like so:
dict_pinport("LED_ON")
general format is:
dict_name( opcode, operand, misc, datastring )
Also added ability to store opcode's return length in shared dict library
files with RL=number in the comments following the opcode.
Negative numbers designate OUT transfers, positive for IN.
Default value can be determined by each dictionary's calling function.
Decided pinport is 1 for SUCCESS/ERROR CODE.
Also have default return data means with second byte giving length of
return data in bytes that follows.
dictionary call function reports any errors reported by the device and
returns any return data from the device excluding error code / data len
Now time to start implementing some of these dictionaries on the device.
packet arrives. Had issue with return data on STM32 not being properly
aligned when the rv array was only 8bit. So defining it as a 16bit array
and then pointing a 8bit pointer to it seems to be an easy fix for now.
Ready to start working on pinport dictionary. Need to get lua code
working on a lower level handling the dictionary calls. Need it do do
things like fill out the wLength and everything for me so one doesn't have
to remember every detail about an opcode/dictionary before calling it.
Realizing code was heavily segmented based on how big/many operands there
were and how big the return data was. This is hard to maintain, need lua
to resolve this issue, and make everything easier to script. Thinking
opcode/dictionary calls need to be more like a function call. Passing in
necessary args only, and returning data instead of succeed/fail.
underlying hardware/mcu. Created avr_gpio.h to define AVR pin registers
in a struct fashion similar to what's common with ARM code. Doing that
makes things much easier to abstract in pin macro 'functions'.
Added define to Makefiles that flags pinport_al.h which board is targetted
for build.
Tested and able to turn on/off and pull-up LED on all 3 builds.
Two different Makefiles, specify which with -f file flag:
make -f Make_avr clean program
make -f Make_stm clean program
made release dir to put released .hex firmware files
Need to make separate avr build folder
Need to make one master Makefile that calls one of the other makefiles as
instructed.
Currently device is recognized by PC but does nothing else other than
being recognized by app during connection process:
arm-none-eabi-size -t build_stm/inlretro_stm.elf
text data bss dec hex filename
1332 0 20 1352 548 build_stm/inlretro_stm.elf
1332 0 20 1352 548 (TOTALS)
avr-size avr_kazzo.elf
text data bss dec hex filename
1496 2 43 1541 605 avr_kazzo.elf
enumeration with host, no vendor/class requests handled.
move avr builds into avr_release dir
move original source files into source/old for future reference.
avr-size avr_kazzo.elf
text data bss dec hex filename
1496 2 43 1541 605 avr_kazzo.elf
usb_Func_write updates buffer status if bytes remaining is zero.
Not the best solution as a buffer could be over/under run, define
MAKECHECKS to have buffer mark itself if full.
This method is faster and we always have transfer sizes match buffer sizes
anyway.
Had to add check to get cur_buff status and wait to send payload until
it's empty. Still need to add timeout check as it'll spin forever if
there is a problem and it's never empty...
device should be able to handle buffer sizes smaller than usb transfer
but this probably isn't true if the first two bytes are stuffed into setup
packet. Currently relies on end of (upto) 8 byte transfer to fill buffer.
MAKECHECKS would verify we don't overflow buffer.. Still kind of a half
thought out idea unfortunately.
Not sure how I thought flash operations were previously working as there
were many bugs I had to correct to support flash operations properly.
Operations module appears to be working so far, still need to pass
functions to operation module.
Flash operations verify PRG-ROM 32KB writes working with file comparison.
Currently dependent on extra buffer status reads to delay next buffer.
I think the write operation is taking longer than the usb load operation.
Potentially due to slow code of operation module, but also possible I
had only been testing with slow eeepc linux machine previously. Perhaps
combination of both.
Still need to correct issue so added buff status delays aren't needed.
buffer manager should be able to key off of status==USB_FULL but that
doesn't seem to work. When trying I don't always get the same number of
buffers to get flashed so appear to have a race condition or something
not properly intialized..?
Need sort out sending of USB STALL if buffer isn't ready to be loaded yet.
This commit is mainly for documentation/reference purposes as things are
kind of working, but buggy/unstable.
AVR Memory Usage
----------------
Device: atmega164a
Program: 6486 bytes (39.6% Full)
(.text + .data + .bootloader)
Data: 679 bytes (66.3% Full)
(.data + .bss + .noinit)
Things appear to be working with some early testing. Assumption that oper_info elements
are aligned in SRAM linearly appears to hold true. Researching this I found it probably
was true, but can't be certain esp if gets changed in the future to not be purely 8byte
sized elements.
Still need to provide means to decode function numbers info function pointers.
Need to verify page programmed successfully as it currently just continues even if unable to
flash proper data. Need to make write page utilize variables for bank address based on mapper
and/or memory as currently doesn't flash CHR-ROM due to $5555 $2AAA being above address space
of CHR-ROM
Found bug with setting map_n_part due to >/< instead of >=/<= for setting called_buff...
Was also setting mem_type and part backwards in dump.c
The had issues with usb timing out for more than 1 buffer read back
Problem was due to lack of usbPoll while dumping during double buffering
Adding usbPoll to page read to correct issue
Appears to be issue with dumping first byte of this choplifter cart I'm testing with.
Not so certain it's my bug though.. No matter what I do the first byte reads
back 0x78 and copy I downloaded has 0x00. Setting my first byte to 0x00 also
creates proper CRC32 according to bootgod's database. So need to look into this more
to figure out what's going on.
Detecting mirroring code working and tested
Started working on buffer operations from host
Current code compiles but not yet at point where can start testing
Adding cpu page read to nes.c to have faster dumping operations.
moving enums to shared as gets used quite a bit communicating between device and host.
Processing input args to create rom file when dumping
Adding create_rom function in file.c working but need to add check if file already exists
Listing out number of mappers which planning to support
Using CHR-RAM sensing, and flash manf/prod ID based on PRG-ROM banking
fixing bug in firmware for ppu writes was ANDing in /A13 instead of ORing..
adding datasheets to hardware folder for 5v PLCC and 3v TSOP flash used on all flash boards
Prepended DICT_ to dictionary names to prevent using those defines for something else accidentally
"NES/SNES" etc could be used in a lot of places, don't want to use wrong enum/define in wrong place.
created enums.h to list out all enums/defines for cartridge and memory elements in one location.
separate file.c/h file for getting data in/out of a files, and opening/closing them.
adding test roms to roms folder so they can be used for various testing.
buffer opcode updates to transfer payloads
including stuffing two bytes of write transfers in setup packet.
Calling specific buffers with miscdata or opcode.
new dump and flash modules for firmware.
new buffer function update_buffers called during main to monitor and
manage buffer objects when not being loaded/unloaded from USB.
Trying to prevent transfer from exceeding buffer size.
Also verifying buffer's status is properly set to enforce upholding of the status.
Giving usbFunctionWrite a means to communicate it's error/success back to host with USB 'dictionary'.
Had a good lesson on what static means... :/
everything working now as previously designed
speed testing on windows10 PC yeilded ~21KBps when transferring 128-512KB
payloads and 128Byte transfer size. Going to bump to 256 and see how that
does after 128KB speed tests on linux machine.
created host test.c/.h file for general testing of new features.
that way I can start working on erase/write.h files and just use test.c as
scratch code space for tinkering and still call with -t flag on command
line.
modified dictionary calls to include pointers to data and lengths.
moved all buffer operations out of usb.c with new bridge function between
the two files. Lots of pointing going on and lessons learned..
Thankfully everything seems to be working if you actually call the
functions as I designed them.. Gotta love trouble shooting bugs that
don't exist.. Helped updating allocate output to get returned as error
back to the host.
Moved typedef structs to firmware type.h file as seemed to cause
compilation issues being contained in the files .h file when other .c
files needed those types.
Fixed casting warnings with usbMsgPtr ended up looking at usbdrv.c figured
out how close I got, just shouldn't have been putting the * in there..
complete. should be able to allocate buffers from host, but haven't got
to testing it yet. Compiling on firmware though..
Currently have 256 bytes of raw_buffer, and 8 buffer objects/structs
each with ~16 bytes per object. So could trim things down, but still have
decent amount of SRAM left. Could have another 256 byte buffer at this
rate.. but might not leave enough SRAM for temporary routines.
Possible that raw buffer space could be dynamically allocated
as either buffer space or temporary routine space...
AVR Memory Usage
----------------
Device: atmega164a
Program: 4094 bytes (25.0% Full)
(.text + .data + .bootloader)
Data: 573 bytes (56.0% Full)
(.data + .bss + .noinit)