This is the mail archive of the
ecos-devel@sources.redhat.com
mailing list for the eCos project.
Re: AW: contributing a failsafe update meachanism for FIS from within ecos applications
- From: Andrew Lunn <andrew at lunn dot ch>
- To: Alexander Neundorf <alex at neundorf dot net>
- Cc: ecos-devel at ecos dot sourceware dot org, andrew at lunn dot ch
- Date: Thu, 28 Oct 2004 21:29:45 +0200
- Subject: Re: AW: contributing a failsafe update meachanism for FIS from within ecos applications
- References: <200410281923.58620.neundorf@kde.org> <200410282010.04689.alex@neundorf.net>
> createImage() does: create a new entry, writes the data, marks the new entry
> as valid.
>
> It consists of the following steps:
> startUpdate (redboot) - modify the fis table contents in RAM and flash, mark
> them in progress
> writeData (app) - either all at once, or in flash block sized chunks
> finishUpdate (redboot) - mark the new fis table as valid in flash
So the first step maps to open(),
the second step to write(),
and the third step to close().
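
A minimal sketch of that mapping in C, for illustration only. The names
fis_open(), fis_write() and fis_close(), the struct and the RAM array
standing in for flash are all hypothetical; none of this is existing
RedBoot or eCos API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: one FIS entry backed by a RAM array standing in for flash. */
enum fis_state { FIS_VALID, FIS_IN_PROGRESS };

struct fis_entry {
    enum fis_state state;
    unsigned char  data[64];   /* simulated flash region */
    size_t         length;     /* bytes written so far */
};

/* open(): corresponds to startUpdate, marks the entry as in progress. */
static void fis_open(struct fis_entry *e)
{
    e->state  = FIS_IN_PROGRESS;
    e->length = 0;
}

/* write(): corresponds to writeData; may be called once or many times.
 * Returns the number of bytes accepted. */
static size_t fis_write(struct fis_entry *e, const void *buf, size_t len)
{
    assert(e->state == FIS_IN_PROGRESS);
    size_t room = sizeof e->data - e->length;
    if (len > room)
        len = room;
    memcpy(e->data + e->length, buf, len);
    e->length += len;
    return len;
}

/* close(): corresponds to finishUpdate, marks the entry valid again. */
static void fis_close(struct fis_entry *e)
{
    e->state = FIS_VALID;
}
```

Whether the data arrives all at once or in block-sized chunks only changes
how often write() is called; open() and close() stay the same.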
> > I also want to make sure that the design you propose is flexible
> > enough to support other people's needs. So it seems you have enough
> > memory to hold a complete image, but I want to ensure the same design
> > can do multiple writes in a clean way using the same API. I would also
> > like it to work without actually needing the redundant FIS block. Not
> > everybody is so paranoid about power failure, but would like to be
> > able to upgrade their application from within the application.
>
> Well, paranoid...
> If it fails the device doesn't work anymore...
>
> Without redundant fis:
> startUpdate doesn't change the flash contents, the new fis table contents are
> written in finishUpdate, so it will work too (except that power failure....
> well you know).
The point is the user gets the choice. They can have a totally safe
system, or a system that works 99.9% of the time but needs one less
flash block.
> > You are again breaking the abstract. You are doing the CRC creation in
> > the application where as it should be redboot doing it.
> My main reason for this: I'd like to have the new fis table already
> completely correct on the flash except the valid_flag before the
> actual writing process starts, so that the final step really only
> has to set the valid_flag to valid.
I cannot think of a reason why this is actually needed, but maybe I'm
missing something.
> Apart from that, is it possible for redboot to calculate the crc if
> it doesn't have enough ram to hold the complete image while updating
> and if the application is responsible for the actual writing ?
> Which ram is actually available in a VV function ? (sorry for stupid
> questions)
On the redboot side of the VV: only the stack and any variables in
redboot's BSS. But it does not need any RAM. The image is in flash, so
it just runs the CRC over that.
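
To illustrate why no image-sized buffer is needed, here is a sketch of a
bitwise CRC-32 that walks a memory-mapped region in place. It uses the
common Ethernet-style convention (reflected polynomial 0xEDB88320, initial
value and final XOR of 0xFFFFFFFF); redboot's own routine may use different
conventions, and the name crc32_region() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 over a memory-mapped region. Only a few words of stack
 * are used; the data is read straight from where it already lives. */
static uint32_t crc32_region(const volatile unsigned char *start, size_t len)
{
    assert(start != NULL || len == 0);
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= start[i];
        for (int bit = 0; bit < 8; bit++) {
            /* If the low bit is set, shift and XOR in the polynomial. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}
```

On a real target the start pointer would be the image's flash address as
recorded in the FIS entry.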
> [OT] why is crc32 used instead of the posix crc ?
Redboot came before posix crc. It also makes little difference. crc32
is OK. It's the same one used on ethernet frames.
> ...
> > Assumption 1. All the needed FIS entries exist.
> > Assumption 2. Your boot script is:
> > fis load app
> > go
> > fis load app.bak
> > go
>
> This second step is cool :-)
>
> > open(/foo) does two VV calls to get the start and length of the image
> > in flash and allocates the block cache.
> >
> > write() would copy the data into the block cache. If this fills the
> > block cache it simply erases and then writes. As soon as the erase
> > starts, the CRC is wrong. So in terms of redboot, this image is now
> > corrupt.
> >
> > close() flushes the block cache.
>
> Is this all done in the application ?
Well this is the API between the application and the fisfs. I've
described the actions that fisfs does for each API call. open needs to
call a VV, but write can do all the work without calling redboot.
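
A rough sketch of the write() and close() side, assuming a one-block cache
and a RAM array standing in for flash. The block size, struct layout and
function names are hypothetical and chosen only to keep the example small.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 16   /* hypothetical flash block size, tiny for illustration */
#define NUM_BLOCKS 4

struct fis_file {
    unsigned char flash[NUM_BLOCKS * BLOCK_SIZE]; /* simulated flash image area */
    unsigned char cache[BLOCK_SIZE];              /* one-block cache */
    size_t        cached;      /* bytes currently held in the cache */
    size_t        next_block;  /* next flash block to program */
    int           crc_stale;   /* set once the first erase has happened */
};

/* Erase one block and program the cached bytes into it. */
static void flush_block(struct fis_file *f)
{
    assert(f->next_block < NUM_BLOCKS);
    unsigned char *dst = f->flash + f->next_block * BLOCK_SIZE;
    memset(dst, 0xFF, BLOCK_SIZE);      /* erase: erased flash reads back 0xFF */
    memcpy(dst, f->cache, f->cached);   /* program the cached bytes */
    f->crc_stale = 1;                   /* stored CRC no longer matches */
    f->next_block++;
    f->cached = 0;
}

/* write(): fill the cache and flush whenever it becomes full.
 * Returns the number of bytes accepted. */
static size_t file_write(struct fis_file *f, const unsigned char *buf, size_t len)
{
    size_t done = 0;
    while (done < len && f->next_block < NUM_BLOCKS) {
        size_t n = BLOCK_SIZE - f->cached;
        if (n > len - done)
            n = len - done;
        memcpy(f->cache + f->cached, buf + done, n);
        f->cached += n;
        done += n;
        if (f->cached == BLOCK_SIZE)
            flush_block(f);
    }
    return done;
}

/* close(): flush a partially filled cache. */
static void file_close(struct fis_file *f)
{
    if (f->cached > 0 && f->next_block < NUM_BLOCKS)
        flush_block(f);
}
```

Nothing here needs redboot; only the CRC recalculation and the FIS
directory commit after close() would go through VV calls.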
> > It then does VV calls to ask redboot
> > to recalculate the CRC and put it into the in memory copy of the FIS
> > directory. It then calls a VV function to commit the FIS directory.
> > Redboot does an atomic write, with respect to power failure, of the
> > FIS directory using the valid fields in the redundant FIS blocks etc.
> >
> > So how do you do a safe upgrade of the application:
> >
> > open("app");
> > write(); CRC is now wrong, so app.bak would be booted.
> > write(); CRC is now wrong, so app.bak would be booted.
> > write(); CRC is now wrong, so app.bak would be booted.
> > close(); CRC is now valid, so the new image would be booted.
> > open("app.bak");
> > write(); CRC is now wrong, but it does not matter, app is valid
> > write(); CRC is now wrong, but it does not matter, app is valid
> > write(); CRC is now wrong, but it does not matter, app is valid
> > close(); CRC is now valid and we have two identical apps.
>
> I would prefer an obviously different API for the updating process
> since it is "dangerous" for the whole system. With my createImage(),
> which writes a complete image at once, it is also ensured that
> there can be at most one corrupt image at a time. When splitting
> open, write and close there can be more than one corrupt
> image. open() for writing should check that there is no other file
> open.
That is not a problem. The filesystem can easily enforce this and
return EAGAIN if open() is called while another file is already open
for writing.
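
Something along these lines, as a sketch only. The function names and the
single global flag are hypothetical, and a real filesystem would track the
state per mount rather than globally.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

static int writer_active;   /* nonzero while some image is open for writing */

/* Returns 0 on success, or -1 with errno set to EAGAIN if another
 * image is already open for writing. */
static int fisfs_open_for_write(const char *name)
{
    assert(name != NULL);
    if (writer_active) {
        errno = EAGAIN;
        return -1;
    }
    writer_active = 1;
    return 0;
}

/* Releases the single writer slot. */
static void fisfs_close(void)
{
    writer_active = 0;
}
```

With this check, opening "app.bak" for writing fails until "app" has been
closed, so at most one image can be corrupt at any time.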
Andrew