Have basic testing complete of erasing application/main code, flashing
data, and reading it back for verification.
This ended up being a pretty big task to get working. Some previous
efforts helped out quite a bit though. The first thing needed was a
path out of the main application and this was done in bootload.c by
calling PREP_FWUPDATE. That jumps to the fwupdate area (first 2KByte)
of flash.
There the 'fwupdate main' takes over. It updates the usbFunction
Setup/Write RAM function pointers to point to fwupdate's own setup function.
Then it must hijack the processor's execution so that once the PREP_FWUPDATE
exception is complete the processor returns to the fwupdater instead of
the main. This is done by snooping back through the stack and finding
the stack frame, keying off of xPSR and a valid PC address. It then stomps
the PC & LR in the stack frame to steal execution from the main thread.
After that, all usb transfers are handled by the fwupdater.
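The stack snooping described above can be sketched roughly like this. It's a host-side illustration only, not the actual fwupdate code: the function names, the flash window constants, and the exact LR value written are all assumptions. It relies on the standard Cortex-M exception frame order (R0, R1, R2, R3, R12, LR, PC, xPSR) and on the Thumb bit (bit 24) always being set in a stacked xPSR.

```c
#include <stddef.h>
#include <stdint.h>

/* Cortex-M exception frame, lowest address first:
 * R0, R1, R2, R3, R12, LR, PC, xPSR */
enum { FRAME_LR = 5, FRAME_PC = 6, FRAME_XPSR = 7, FRAME_WORDS = 8 };

#define XPSR_THUMB_BIT (1u << 24)  /* T bit, always set on Cortex-M */

/* Assumed flash window used to judge whether a stacked PC is plausible. */
#define FLASH_START 0x08000000u
#define FLASH_END   0x08008000u    /* 32KByte part, hypothetical choice */

/* Walk upward from sp looking for 8 words that look like an exception
 * frame: xPSR has the Thumb bit set and PC points into flash.
 * Returns a pointer to the frame's R0 slot, or NULL if nothing matches. */
uint32_t *find_exception_frame(uint32_t *sp, size_t words)
{
    for (size_t i = 0; i + FRAME_WORDS <= words; i++) {
        uint32_t pc   = sp[i + FRAME_PC];
        uint32_t xpsr = sp[i + FRAME_XPSR];
        if ((xpsr & XPSR_THUMB_BIT) && pc >= FLASH_START && pc < FLASH_END)
            return &sp[i];
    }
    return NULL;
}

/* Stomp the stacked PC and LR so the exception return lands at new_entry.
 * Returns 0 on success, -1 if no frame was found. */
int hijack_return(uint32_t *sp, size_t words, uint32_t new_entry)
{
    uint32_t *frame = find_exception_frame(sp, words);
    if (frame == NULL)
        return -1;
    frame[FRAME_PC] = new_entry & ~1u;  /* stacked return address */
    frame[FRAME_LR] = new_entry | 1u;   /* assumed: LR as a Thumb address */
    return 0;
}
```

A heuristic like this can false-positive on data that happens to look like a frame, so on real hardware the scan would presumably be kept short and started from the current stack pointer.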
Able to get by without usbFunctionWrite so far since setup packets provide
data but are also IN transfers, which gives a path for sending data back to
the host. So to keep things small and simple this is all that's handled so
far. Once I get tired of it being so slow I can implement
usbFunctionWrite and speed things up quite a bit. Haven't actually
timed it yet, but for only 20KByte of data, and with updates not being very
frequent, it shouldn't be a big deal. The more I say this the more I'm
thinking I'll add that next because I'll be using it myself so much for
development. Less time in that state also makes it less likely for people to
'semi-brick' their device.
There is of course always the stmicro dfuse demo that can unbrick
the device. I tried really hard to jump to their bootloader but no
matter what I did I couldn't get it working. It was never recognized over
USB. I halfway wonder now if I needed to disable the boot pin, which I
never would want to do anyway..
Created separate build_stm folders for INL6 & INL_NES, the latter being what all
the NESmaker kits use. Also updated the make files to be more accurate
about what chip they're using since fwupdate tries to prevent a hardfault
from flash access beyond what's available.
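A rough sketch of the kind of bounds check this implies. The function name and the idea of passing flash size in KByte (say from a per-chip make file define) are assumptions for illustration, not the actual fwupdate code.

```c
#include <stdint.h>

#define FLASH_BASE_ADDR 0x08000000u

/* Returns 1 when [addr, addr + len) lies entirely inside on-chip flash
 * of the given size, 0 otherwise. flash_kb would be 32 for the 070C6
 * and 128 for the 070RB. */
int flash_range_ok(uint32_t addr, uint32_t len, uint32_t flash_kb)
{
    uint32_t size = flash_kb * 1024u;

    if (addr < FLASH_BASE_ADDR)
        return 0;

    uint32_t offset = addr - FLASH_BASE_ADDR;

    /* Written this way so offset + len can never overflow. */
    return offset <= size && len <= size - offset;
}
```

With this shape the same request can be rejected on the C6 build and accepted on the RB build just by changing the size that gets passed in.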
This update doesn't include a means of updating the first 2KByte of
firmware updater space itself. But the application code should be able
to take care of that for us in a future update. It's only 2KByte so
just temporarily storing the fresh build in SRAM will probably work.
Although will have to be careful about any calls from application code
to fwupdater. Plus there's always dfuse..
The other problem I ran into was erasing the application code. It worked
fine early on for all 30KByte. But as my fwupdater function grew it
crashed when page 18 was erased. Realized my bigger switch/case
statement was calling a gcc library function that resided in the
application code. It was only 50Bytes, so moved it to the fwupdate section.
Brought 2 similar library functions over as well, but one of them
disappeared with the update to the latest version of arm-none-eabi-gcc.
Not a commit really, but this is the release where I updated gcc. Was
previously:
gcc version 6.2.1 20161205 (release)
[ARM/embedded-6-branch revision 243739]
is now:
gcc version 7.3.1 20180622 (release) [ARM/embedded-7-branch revision
261907] (GNU Tools for Arm Embedded Processors 7-2018-q2-update)
Updating gcc shrank the build by ~250 Bytes from the tail
end. It also freed up ~50Bytes in fwupdate space.
with large number of updates to the linker script (nokeep.ld)
The first 2KByte is dedicated for vector table, usb driver, usb desc
tables, hardfault, dummy handler, and firmware update routines. There
is currently ~700Bytes of free space in that first 2KB. Should be
plenty of space for firmware update routines and other advanced future
features.
The 070RB has 2KByte pages and the 070C6 has 1KByte pages; a page is the
smallest erase unit. So we can't really reserve anything
smaller than 2KByte on the RB. This leaves 30KByte for the
main/application code on the C6, which should be more than enough.
That 30KByte starts with the reset handler fixed to 0x0800_0800 because
we don't want to have to update the vector table.
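The layout arithmetic above, as a small sketch. The helper names are made up for illustration; only the 2KByte reserved area, the 0x0800_0800 application start, and the 1KByte/2KByte page sizes come from the notes.

```c
#include <stdint.h>

#define FLASH_BASE_ADDR 0x08000000u
#define FWUPDATE_SIZE   0x800u                              /* first 2KByte */
#define APP_START       (FLASH_BASE_ADDR + FWUPDATE_SIZE)   /* 0x0800_0800 */

/* Start address of a flash page for a given page size in bytes. */
uint32_t page_addr(uint32_t page, uint32_t page_size)
{
    return FLASH_BASE_ADDR + page * page_size;
}

/* Index of the first page that belongs to the application. */
uint32_t first_app_page(uint32_t page_size)
{
    return FWUPDATE_SIZE / page_size;
}

/* Number of pages needed to cover app_bytes, rounded up. */
uint32_t app_page_count(uint32_t app_bytes, uint32_t page_size)
{
    return (app_bytes + page_size - 1u) / page_size;
}
```

So the application starts at page 2 on the C6 (1KByte pages) and page 1 on the RB (2KByte pages), both landing on 0x0800_0800.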
After the reset handler is the usbFunctionWrite, then Setup routines
which the usb driver calls for incoming/outgoing data. These need to be
in first 64KByte of flash as a 16bit pointer is kept in usb_buff RAM.
Write was put first as it's less likely to change, with Setup following
which is quite large due to all the inlining that's happening inside it
thanks to the compiler.
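To illustrate why a 16bit pointer limits these routines to the first 64KByte of flash, here is one way such a pointer could be packed and rebuilt. This is an assumption about the scheme (offset from the 0x0800_0000 flash base); the real driver may rebuild the address differently, and the function names are hypothetical.

```c
#include <stdint.h>

#define FLASH_BASE_ADDR 0x08000000u

/* Pack a flash address into 16 bits as an offset from the flash base.
 * Returns 1 on success, 0 if the address is outside the first 64KByte. */
int pack_flash_ptr(uint32_t addr, uint16_t *out)
{
    if (addr < FLASH_BASE_ADDR || addr - FLASH_BASE_ADDR > 0xFFFFu)
        return 0;
    *out = (uint16_t)(addr - FLASH_BASE_ADDR);
    return 1;
}

/* Rebuild the full 32bit address from the stored 16bit value. */
uint32_t unpack_flash_ptr(uint16_t stored)
{
    return FLASH_BASE_ADDR + stored;
}
```

Anything linked at or beyond 0x0801_0000 simply can't be represented, which is the constraint on where Setup/Write can live.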
Perhaps these functions could be kept at fixed locations. Or
we could make a 'vector table' of our own just before the reset handler.
This may speed things up a bit, but for now it works. Also like the
ability to change these pointers, which may be useful for the next update
as the firmware update code will effectively need its own Setup/Write
functions. So the current pointers can just be updated to call them
instead, and the originals/new ones restored through a reset..?
This leaves 96KByte of unused flash on the 070RB, don't have any plans
for this yet. Perhaps future updates for all the connectors and
features will require it.
Also added definition for fast ram functions to .data section. Got that
working but not sure when it may be needed..
Need to physically separate them now. Then can focus on erasing &
flashing ourselves.
Added some speed checks to the bnrom.lua script that I was testing usb code
with. Was able to verify read/write speeds were not affected by changes
in this commit. Did some testing against older firmware v2.2 though, and
there does seem to have been a slight slowdown on write speeds.
Although, perhaps that's because of the nrom flash verifications that
are also included in this build (but not committed)..?
Deleted shared_usb.h because it was a copy of shared_dict_usb.h
This build_stm .hex does include some NROM flash updates to allow
checking if the last byte programmed successfully because was having
weird problems with that. But not ready to commit all those changes and
they're highly unrelated to this commit.
Now that usb code doesn't use any .data nor .bss need to fully separate
the USB firmware code from the application. Main way to do this will be
to have usb code be effectively entirely interrupt driven.
Thinking the best way to initialize usb will be to have the application
code jump to the USB ISR and maybe use some messaging system with the 2
unused usb_buff indexes (4Bytes).
The USB code will include the vector table, so it will point to the
reset handler, but that will point to the application code's reset
handler, just need to make sure that's at a fixed location.
The USB code is just over 1KByte last I checked, so dedicating 2KByte
should be good. Erase granularity is 1 page (1KB on C6, 2KB on RB). So
that will work well. Write protection granularity is 4KByte, but really
we shouldn't need to use write protection as there will always be the
built in bootloader to save a bricked device.
In effort to remove USB firmware driver's dependence on .data/.bss
Started by fixing a bug that wasn't allowing USB_BTABLE to be relocatable.
Was neglecting byte addressing vs usb_buff[] array indexing of 16bit
half words.
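The byte offset vs half word index mixup can be shown with a small sketch. It assumes usb_buff[] is an array of 16bit half words mapped 1:1 onto the USB packet memory, and that each endpoint's buffer table entry is four half words (ADDR_TX, COUNT_TX, ADDR_RX, COUNT_RX), which is my understanding of the STM32F0 layout. The function names are hypothetical.

```c
#include <stdint.h>

/* USB_BTABLE and the buffer addresses stored in the table are byte
 * offsets into packet memory, while usb_buff[] is indexed in 16bit
 * half words, so every byte offset must be halved before indexing. */
uint16_t usb_buff_index(uint16_t byte_offset)
{
    return (uint16_t)(byte_offset / 2u);
}

/* Index of one field of an endpoint's buffer table entry.
 * field: 0 = ADDR_TX, 1 = COUNT_TX, 2 = ADDR_RX, 3 = COUNT_RX.
 * Each entry is assumed to be 8 bytes (4 half words). */
uint16_t btable_entry_index(uint16_t btable_bytes, unsigned ep, unsigned field)
{
    unsigned byte_offset = btable_bytes + ep * 8u + field * 2u;
    return usb_buff_index((uint16_t)byte_offset);
}
```

With the table at offset 0 the two views happen to agree for the first entry, which is probably why this kind of bug only shows up once the table is moved.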
Still have 4 bytes of .bss for usbMsgPtr. Need to modify the
communication protocol between application code and usb code to
move/remove this pointer out of .bss. There are 4 bytes of usb_buff
RAM available for it to be moved into, but need to ensure only 16bit
access is made.
Once that's done can separate usb code from application code and have
usb code only interrupt driven, with application code polling.
Then the usb code can sniff out firmware update packets and update
application code behind its back.
Removed logging of transfer count since it wasn't being used
num_bytes_expecting isn't used but breaks device descriptors if cut for
some reason... so I just moved it and kept it...
Another weird issue is that after reflashing the mcu via stlink the first
inlretro.exe execution fails due to some usb error. Not sure if it's
related to the usb code changes I just made, or possibly some other
recent updates to inlretro executable..? I think this issue has existed
forever, but was hard to pin down and always went away after a reset.
tested and verified on purple, green, and yellow/orange avr kazzos and
stm32 inlretro6 proto, and stm32 adapter with yellow kazzo board
AVR takes ~17.5sec to dump 256KB -> 1:10 for 1MByte = 14.6KBps
STM takes ~8.5sec to dump 1MByte = 120KBps
STM32 usb driver is far from optimal as it's set up to be minimal, with only an
8byte endpoint0 in an effort to align avr and stm. Larger endpoints
and bulk transfers should greatly speed up stm usb transfers
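The dump rates quoted above follow from simple division; a tiny sketch for checking them (helper names are made up, KB here means 1024 bytes):

```c
/* Transfer rate in KBytes per second. */
double kbytes_per_sec(double kbytes, double seconds)
{
    return kbytes / seconds;
}

/* Time in seconds to move a given amount at a given rate. */
double seconds_for(double kbytes, double rate_kbps)
{
    return kbytes / rate_kbps;
}
```

256KB in 17.5sec works out to about 14.6KBps, which scales to about 70sec (1:10) for 1MByte, and 1MByte in 8.5sec is about 120KBps.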
refactored firmware buffer.c and implemented most of the required opcodes
added check that should cover the case where the device isn't ready for an IN/OUT
transfer. Does this by having usbFunctionSetup return zero, which causes the
device to ignore the host. Don't think I've got the stm32 usb driver
set up properly to handle this, and not sure I fully understand the Vusb driver
either. Anyway, hopefully it works well enough for now; keep this in
mind if issues crop up in future.
Still haven't implemented usbFunctionWrite, not sure stm usb driver is
setup properly yet either..
build sizes:
avr yellow/orange: avr-size build_avr/avr_kazzo.elf
   text    data     bss     dec     hex filename
   5602       6     674    6282    188a build_avr/avr_kazzo.elf
previous builds of the avr code were ~6.4KB when flashing and dumping was working.
AVR bootloader is 1.7KB, taking up the majority of the 2KB boot sector.
So AVR has 16KB - 2KB boot = 14KB available, using ~44% of the available
non-boot sector flash. Have 4 buffers defined, and 512B of raw buffer defined,
so using ~65% SRAM. Making pretty good use of the chip just for the basic framework.
Not a ton of room for board/mapper specific routines, so will have to keep this
in mind. Creating more generic routines to save flash will come with a speed
hit, but perhaps we shouldn't worry too much about that as the devices below
really boost speed without even trying. There is a sizable amount of
SRAM available, so could perhaps load temporary routines into SRAM and execute.
Also have the ability to decrease buffer sizes/allocation. Perhaps routines
could actually be stored *IN* the raw buffers.. ;)
stm adapter: arm-none-eabi-size -t build_stm/inlretro_stm.elf
   text    data     bss     dec     hex filename
   7324       0     680    8004    1f44 build_stm/inlretro_stm.elf
Currently targeting STM32F070C6 which has 32KB flash, 6KB SRAM
Could upgrade to STM32F070CB in same LQFP-48 package w/ 128KB/16KB
Don't think that'll be of much value though especially with limitation
on connectors for adapter.
So currently don't have user bootloader, only built in ones.
8KB of 32KB available flash = 25% utilization
680B of 6KB available sram = 11% utilization
32KB device doubles amount of available flash compared to AVR, although
stm32 code isn't quite as condensed compared to AVR.
stm inlretro6: arm-none-eabi-size -t build_stm/inlretro_stm.elf
   text    data     bss     dec     hex filename
   6932       0     680    7612    1dbc build_stm/inlretro_stm.elf
Mostly limited to STM32F070RB as I'm choosing a device requiring XTAL, and
desire a large number of i/o. This device provides 128KB flash, 16KB SRAM
Currently using 7.6KB/128KB flash = 6% utilization
Currently using 680B/16KB SRAM = 4.1% utilization
LOTS of room for growth in this device!! Part of why I chose it over the
crystalless 072 version, as it came with more flash for less cost.
Also hardly making use of 1KB of USB dedicated SRAM:
32B buffer table entries
16B endpoint0 IN/OUT
48B of 1024B available = 4.6% utilization
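For checking the utilization figures quoted in these notes, a small integer-only sketch (the helper name is made up; it truncates rather than rounds, which matches how the figures above were written):

```c
#include <stdint.h>

/* Utilization in tenths of a percent, truncated.
 * e.g. a return value of 46 means 4.6%. */
uint32_t util_tenths_percent(uint32_t used, uint32_t total)
{
    return (used * 1000u) / total;
}
```

For the USB SRAM that's 32B of table entries plus 16B of endpoint0 buffers, so util_tenths_percent(32 + 16, 1024) gives 46, i.e. 4.6%.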
packet arrives. Had issue with return data on STM32 not being properly
aligned when the rv array was only 8bit. So defining it as a 16bit array
and then pointing an 8bit pointer to it seems to be an easy fix for now.
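The alignment fix can be sketched like this. The array size and names are assumptions for illustration, not the actual firmware definitions.

```c
#include <stdint.h>

#define RV_BYTES 8u  /* hypothetical size of the return data buffer */

/* Backing store declared as 16bit so the compiler guarantees at least
 * 2-byte alignment, which a plain uint8_t array does not. */
static uint16_t rv16[RV_BYTES / 2u];

/* Byte-wise view of the same storage for code that fills in
 * return data one byte at a time. */
uint8_t *rv_bytes(void)
{
    return (uint8_t *)rv16;
}
```

Byte access through the uint8_t pointer is fine, and anything that later copies the buffer out as half words sees an aligned address.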
Ready to start working on pinport dictionary. Need to get lua code
working on a lower level handling the dictionary calls. Need it to do
things like fill out the wLength and everything for me so one doesn't have
to remember every detail about an opcode/dictionary before calling it.
Realizing code was heavily segmented based on how big/many operands there
were and how big the return data was. This is hard to maintain, need lua
to resolve this issue, and make everything easier to script. Thinking
opcode/dictionary calls need to be more like a function call. Passing in
necessary args only, and returning data instead of succeed/fail.
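The idea of a dictionary call that fills in wLength by itself could look something like the sketch below. It's written in C rather than lua just to keep it next to the firmware side; the table contents and the mapping of dictionary/opcode/operand onto bRequest/wValue/wIndex are made up for illustration and are not the actual protocol. Only the setup packet field names and the 0xC0 request type (device-to-host, vendor, device recipient) are standard USB.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard 8-byte USB setup packet fields. */
struct setup_pkt {
    uint8_t  bmRequestType;
    uint8_t  bRequest;
    uint16_t wValue;
    uint16_t wIndex;
    uint16_t wLength;
};

/* Hypothetical per-opcode record: how many bytes the device returns. */
struct opcode_info {
    uint8_t  dict;
    uint8_t  opcode;
    uint16_t return_len;
};

/* Made-up entries purely for illustration. */
static const struct opcode_info opcode_table[] = {
    { 1, 0x00, 1 },
    { 1, 0x10, 3 },
};

/* Fill a setup packet for (dict, opcode, operand). The caller passes
 * only the necessary args; wLength is looked up from the table.
 * Returns 0 on success, -1 if the opcode is unknown. */
int build_call(uint8_t dict, uint8_t opcode, uint16_t operand,
               struct setup_pkt *out)
{
    size_t n = sizeof opcode_table / sizeof opcode_table[0];

    for (size_t i = 0; i < n; i++) {
        if (opcode_table[i].dict == dict && opcode_table[i].opcode == opcode) {
            out->bmRequestType = 0xC0;  /* device-to-host, vendor, device */
            out->bRequest      = dict;
            out->wValue        = opcode;
            out->wIndex        = operand;
            out->wLength       = opcode_table[i].return_len;
            return 0;
        }
    }
    return -1;
}
```

Keeping the lengths in one table means the segmentation by operand/return size lives in data instead of being spread across the calling code.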
Two different Makefiles, specify which with -f file flag:
make -f Make_avr clean program
make -f Make_stm clean program
made release dir to put released .hex firmware files
Need to make separate avr build folder
Need to make one master Makefile that calls one of the other makefiles as
instructed.
Currently device is recognized by PC but does nothing else other than
being recognized by app during connection process:
arm-none-eabi-size -t build_stm/inlretro_stm.elf
   text    data     bss     dec     hex filename
   1332       0      20    1352     548 build_stm/inlretro_stm.elf
   1332       0      20    1352     548 (TOTALS)
avr-size avr_kazzo.elf
   text    data     bss     dec     hex filename
   1496       2      43    1541     605 avr_kazzo.elf