This is the mail archive of the ecos-devel@sourceware.org mailing list for the eCos project.



contributing filesystem and a failsafe update mechanism for FIS from within eCos applications


Hi,

it hasn't even been a year, and I already have the next version of my patch available ;-)

The attached patch implements a read-only filesystem for FIS, and three extra utility functions for manipulating it safely from eCos applications.

We need to be able to perform safe updates of the firmware, where "safe" means robust against power loss at any point in time. Since redboot comes with FIS, we'd like to use FIS.
To update the firmware, a new firmware image has to be placed on the flash and the fis directory has to be updated. When the fis directory is updated, it is first erased and then written with the new contents.
If the power goes down directly after the directory has been erased, redboot can no longer start the firmware image, since it can't read the directory.

To enable failsafe operation of redboot and fis under such circumstances, a backup of the fis directory has to be kept until the new directory has been written successfully.
Here is my proposed strategy:
Currently the fis directory occupies one block of the flash. For safe operation it needs a second, redundant block. Both blocks contain a fis directory, but only one of them is valid (and current).
Redboot needs a way to determine which block contains the valid information.
For this purpose, and to stay compatible with existing flash contents, I suggest using the first entry of the fis directory table as a validity marker, which can be used to decide which of the two blocks is valid.
It looks like this:

#ifdef CYGOPT_REDBOOT_REDUNDANT_FIS
#define CYG_REDBOOT_RFIS_VALID_MAGIC_LENGTH 10
#define CYG_REDBOOT_RFIS_VALID_MAGIC ".FisValid"  // exactly 10 bytes, including the terminating NUL

#define CYG_REDBOOT_RFIS_VALID       (0xa5)
#define CYG_REDBOOT_RFIS_IN_PROGRESS (0xfd)
#define CYG_REDBOOT_RFIS_EMPTY       (0xff)

struct fis_valid_info
{
   char magic_name[CYG_REDBOOT_RFIS_VALID_MAGIC_LENGTH];
   unsigned char valid_flag[2]; //this should be safe for all alignment issues
   unsigned long version_count;
};
#endif // CYGOPT_REDBOOT_REDUNDANT_FIS


The name of the first entry is the special name ".FisValid", followed by the actual valid_flag, which signals the validity of this FIS table. This way the FIS table stays compatible with the other algorithms in redboot.
To find the valid FIS table, the name of the first entry is checked against ".FisValid". If it matches, valid_flag is checked. The table is only valid if valid_flag == 0xa5a5. If this is true for both FIS tables, the current and the redundant one, their version_count values are compared, and the FIS table with the higher version_count becomes the valid FIS table.
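
The selection rule described above could be sketched roughly as follows. This is only an illustration and not the code from the patch: the helper names rfis_is_valid() and rfis_select() are made up for this example, the struct and macros are repeated from above so the snippet stands alone, and it assumes that the first table wins when both version counts are equal and that version_count never wraps around.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CYG_REDBOOT_RFIS_VALID_MAGIC_LENGTH 10
#define CYG_REDBOOT_RFIS_VALID_MAGIC ".FisValid"  /* 9 characters plus NUL */
#define CYG_REDBOOT_RFIS_VALID       (0xa5)

struct fis_valid_info
{
   char magic_name[CYG_REDBOOT_RFIS_VALID_MAGIC_LENGTH];
   unsigned char valid_flag[2];
   unsigned long version_count;
};

/* Hypothetical helper: nonzero if the first entry marks its table as valid,
   i.e. the magic name matches and both flag bytes are 0xa5. */
static int rfis_is_valid(const struct fis_valid_info *info)
{
   assert(info != NULL);
   return memcmp(info->magic_name, CYG_REDBOOT_RFIS_VALID_MAGIC,
                 CYG_REDBOOT_RFIS_VALID_MAGIC_LENGTH) == 0
       && info->valid_flag[0] == CYG_REDBOOT_RFIS_VALID
       && info->valid_flag[1] == CYG_REDBOOT_RFIS_VALID;
}

/* Hypothetical helper: returns 0 or 1 for the table to use,
   or -1 if neither table is valid. */
static int rfis_select(const struct fis_valid_info *t0,
                       const struct fis_valid_info *t1)
{
   int v0 = rfis_is_valid(t0);
   int v1 = rfis_is_valid(t1);

   if (v0 && v1)   /* both valid: the higher version_count wins */
      return (t1->version_count > t0->version_count) ? 1 : 0;
   if (v0)
      return 0;
   if (v1)
      return 1;
   return -1;
}
```

A table whose flag is still 0xfdfd or 0xffff is simply treated as not valid here, so the other table is chosen.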

When performing a safe update, the algorithm must do the following:
(the text after each * describes what happens if the power goes down at that point in time)

1. modify the fis directory (in RAM) so that it reflects the desired changes, set the valid_flag to RFIS_IN_PROGRESS and set version_count = version_count + 1
*nothing has changed on the flash yet, so redboot will work as before

2. erase the flash where the currently invalid fis directory is located
*the valid_flag of the fis directory which will become the new valid directory is 0xffff, the valid_flag of the currently active directory is still 0xa5a5, and the images haven't been touched yet, so everything is still ok for redboot

3. write the modified fis directory into this erased flash block. This is done in redboot/flash.c: fis_start_update_directory()
*as above, but the valid_flag of the directory which is intended to become valid is now 0xfdfd. The images still haven't been touched, so everything is ok.

4. modify the flash image (erase, program)
*now the image has been modified. If you erase the only runnable firmware image on the flash you are of course lost, so just avoid this. In all other cases, there is still a working fis directory and a working firmware image on the flash. The old fis directory is still valid, and the currently running firmware image hasn't been touched. By checking the CRCs of the images later you can detect which images are broken.

5. after the image is written, set the valid_flag of the fis directory which will become active to 0xa5a5. The flash block doesn't have to be erased for this, since the transition from 0xfdfd to 0xa5a5 only sets some bits to 0. When this is done, the image has been written correctly and the new fis directory has the right magic_name, the right valid_flag, and a version_count higher than that of the old fis directory. This is done in redboot/flash.c: fis_update_directory()
*if the power goes down while writing the two bytes of the valid_flag, there are two cases: either the valid_flag has already reached 0xa5a5, and then everything is ok, or it has a value != 0xa5a5 and the directory is not considered valid, so the old directory stays in use.
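
The reason step 5 needs no erase can be checked with a small model. This is just a sketch, assuming NOR-style flash where programming can only clear bits (1 -> 0) and only an erase sets them back to 1. The helper names flash_can_program(), flash_program_byte() and rfis_program_flag() are invented for this example and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

#define CYG_REDBOOT_RFIS_VALID       (0xa5)
#define CYG_REDBOOT_RFIS_IN_PROGRESS (0xfd)
#define CYG_REDBOOT_RFIS_EMPTY       (0xff)

/* Nonzero if 'to' can be programmed over 'from' without an erase,
   i.e. the transition only clears bits and never sets one. */
static int flash_can_program(unsigned char from, unsigned char to)
{
   return (from & to) == to;
}

/* Model of programming one flash byte: bits can only go from 1 to 0. */
static unsigned char flash_program_byte(unsigned char cell, unsigned char value)
{
   return (unsigned char)(cell & value);
}

/* Program both bytes of a valid_flag in place (model only). */
static void rfis_program_flag(unsigned char flag[2], unsigned char value)
{
   assert(flag != NULL);
   flag[0] = flash_program_byte(flag[0], value);
   flag[1] = flash_program_byte(flag[1], value);
}
```

In this model 0xff -> 0xfd and 0xfd -> 0xa5 are both allowed without an erase, while going back from 0xa5 to 0xfd is not, which matches the order EMPTY, IN_PROGRESS, VALID used in the steps above.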

The attached patch implements support for this strategy in redboot. It basically reads the first entry of both fis blocks, checks them and sets one to be the valid one. The fis manipulation functions in redboot have been modified to support this style of operation. This "safe" FIS can be enabled via the option CYGOPT_REDBOOT_REDUNDANT_FIS.

To make the update functionality available to eCos applications, a new virtual vector call had to be added, since flash_fis_op() can't list the existing images; it can only return information for an image if you already know its name. The new VV call has the following subfunctions:

* CYGNUM_CALL_IF_FLASH_FIS_GET_VERSION: for checking the compatibility between redboot VV interface and the application

* CYGNUM_CALL_IF_FLASH_FIS_INIT: read the FIS table and find the valid one

* CYGNUM_CALL_IF_FLASH_FIS_GET_ENTRY_COUNT: get the maximum number of entries the FIS table can have

* CYGNUM_CALL_IF_FLASH_FIS_GET_ENTRY: return the information for one FIS table entry by its index. This uses a binary struct, which isn't identical to struct fis_image_desc, but contains most of its information.

* CYGNUM_CALL_IF_FLASH_FIS_MODIFY_ENTRY: puts the parameters given for an image in the specified entry of the FIS table (in RAM). If you have done this for the image you want to modify, call FIS_START_UPDATE, then update the image and finally call FIS_FINISH_UPDATE

* CYGNUM_CALL_IF_FLASH_FIS_START_UPDATE: start updating the FIS table. Has to be called before writing the image on the flash. Without redundant FIS this does nothing. With redundant FIS it does what is described in step 3) above.

* CYGNUM_CALL_IF_FLASH_FIS_FINISH_UPDATE: finish updating the FIS table. Has to be called after writing the image to the flash successfully. Without redundant FIS it simply writes the new FIS table, with redundant FIS it just marks the already written new table as valid.
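
The intended call order for an update can be illustrated with the following sketch. The demo_* functions are purely hypothetical stand-ins that only record the order in which they run; they are not the VV interface from the patch, whose signatures are not shown in this mail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Records the order of the steps as a short string, e.g. "MSWF". */
static char demo_trace[8];
static size_t demo_trace_len;

static void demo_record(char step)
{
   assert(demo_trace_len < sizeof demo_trace - 1);
   demo_trace[demo_trace_len++] = step;
   demo_trace[demo_trace_len] = '\0';
}

/* Hypothetical stand-ins for MODIFY_ENTRY, START_UPDATE, the image
   write done by the application, and FINISH_UPDATE. */
static void demo_fis_modify_entry(void)  { demo_record('M'); }
static void demo_fis_start_update(void)  { demo_record('S'); }
static int  demo_write_image(void)       { demo_record('W'); return 0; }
static void demo_fis_finish_update(void) { demo_record('F'); }

/* Update one image: the new table is only marked valid after the
   image has been written successfully. */
static int demo_update_image(void)
{
   demo_trace_len = 0;
   demo_trace[0] = '\0';

   demo_fis_modify_entry();     /* change the entry in the RAM copy */
   demo_fis_start_update();     /* new table on flash, still in progress */
   if (demo_write_image() != 0)
      return -1;                /* old table stays the valid one */
   demo_fis_finish_update();    /* mark the new table as valid */
   return 0;
}
```

The point of the ordering is that FINISH_UPDATE is the very last step, so a failure anywhere before it leaves the old FIS table as the valid one.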

For the user there are three functions available, fis_get_entry(), fis_remove_image() and fis_create_image(), which call these VVs appropriately. fis_create_image() currently takes a pointer to the whole data buffer and writes it as an image to the flash. This might not work for devices which don't have that much RAM. But since this is implemented in the application, it should not be too hard for somebody who needs it to extend the function accordingly.

We have been using this update mechanism for about a year now and it has never failed. So I think it would be a good contribution to eCos.

Additionally a read-only file system for FIS is implemented in the attached patch.

So what do you think?

Bye
Alex

Attachment: ecos.fisfs.patch.gz
Description: ecos.fisfs.patch.gz

