This is the mail archive of the mailing list for the eCos project.

Re: Re: NAND technical review

Just some explanatory remarks below, hardware related.

Ross Younger wrote:

1. NAND 101 -------------------------------------------------------------

(Those familiar with NAND chips can skip this section, but I appreciate
that not everybody on-list is in the business of writing NAND device
drivers :-) )

(i) Conceptual

Now, I mentioned ECC data. NAND technology has a number of underlying
limitations, importantly that it has reliability issues. I don't have a full
picture - the manufacturers seem to be understandably coy - but my
understanding is that on each page, a driver ought to be able to cope with a
single bit having flipped either on programming or on reading. The
Such a "broken bit" arises because the transistor that holds the bit is physically damaged and is stuck at 1 or at 0 (I don't know whether both can occur). Such a bit can no longer be erased (flipped back to 1) or programmed (flipped to 0).

I thought only programming or erasing could break a bit, not reading.
Is anybody sure about this?
recommended way to achieve this is by storing an ECC in the spare area: the
algorithm published by Samsung is popular, requiring 22 bits of ECC per 256
bytes of data and able to correct a 1 bit error and detect a 2 bit error.
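
As an illustration of how 22 bits can correct one bit error and detect two in 256 bytes, here is a minimal sketch of a Hamming-style code. It is not Samsung's published algorithm and does not use its bit layout; the names ecc_calc and ecc_correct are made up for this example. It keeps 11 parity bits over the bit addresses and 11 over their complements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ECC_DATA_BYTES 256u
#define ECC_ADDR_MASK  0x7FFu  /* 256 bytes * 8 = 2048 bit positions = 11 address bits */

/* Compute a 22-bit code over 256 bytes (illustrative layout, not Samsung's).
 * Upper 11 bits: XOR of the addresses of all set data bits.
 * Lower 11 bits: XOR of the complemented addresses of all set data bits. */
uint32_t ecc_calc(const uint8_t *data)
{
    uint32_t odd = 0, even = 0;

    assert(data != NULL);
    for (uint32_t byte = 0; byte < ECC_DATA_BYTES; byte++) {
        for (uint32_t bit = 0; bit < 8; bit++) {
            if (data[byte] & (1u << bit)) {
                uint32_t addr = (byte << 3) | bit;
                odd  ^= addr;
                even ^= ~addr & ECC_ADDR_MASK;
            }
        }
    }
    return (odd << 11) | even;
}

/* Compare the stored code with one recomputed from the data as read.
 * Returns 0: no error; 1: single data bit corrected in place;
 *         2: single bit error in the stored code itself (data is fine);
 *        -1: uncorrectable (two or more bits wrong). */
int ecc_correct(uint8_t *data, uint32_t stored)
{
    uint32_t syn  = (stored ^ ecc_calc(data)) & 0x3FFFFFu;
    uint32_t odd  = syn >> 11;
    uint32_t even = syn & ECC_ADDR_MASK;

    if (syn == 0)
        return 0;
    if ((odd ^ even) == ECC_ADDR_MASK) {
        /* Both halves agree on one bit address: flip that bit back. */
        data[odd >> 3] ^= (uint8_t)(1u << (odd & 7u));
        return 1;
    }
    if ((syn & (syn - 1u)) == 0)
        return 2;
    return -1;
}
```

A driver would store the result of ecc_calc in the spare area when programming a page and call ecc_correct after reading it back.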

There is also the question of bad blocks. Again, full details are sketchy. A
chip may be shipped with a number of "factory-bad" blocks (e.g. up to 20 on
this Samsung chip); they are marked as such in their spare area. (What
constitutes a "bad" block is not published; one imagines that the factory
have access to more test information than users do and that there may be
statistical techniques involved in judging the likely reliability of the
block.) Blocks may also fail during the life of the device, usually by the
NAND flash chips are very dense (many bits in a small area), and manufacturing involves a trade-off between reliability and density. To make them dense (and hence cheap), faults have to be tolerated.
The manufacturer simply tries to program every bit once to check for manufacturing errors. When a broken bit is discovered, the entire block is marked bad.
chip reporting a failure during a program or erase operation. Because of
this, the manufacturers recommend that chip drivers scan the device for
factory-bad markers then create and maintain a Bad Block Table throughout
the life of the device. How this is done is not prescribed, but the
behaviour of the Linux MTD layer is something approximating a de facto
standard.
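
The scan-then-maintain approach might look roughly like the sketch below. Everything here is hypothetical: the device size, the assumption that a marker byte of 0xFF means "good", and the sim_read_marker function, which stands in for a real spare-area read. It does not reproduce the Linux MTD table format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_BLOCKS  1024u  /* hypothetical device size, in blocks */
#define GOOD_MARKER 0xFFu  /* assumption: any other marker value means "bad" */

/* In-memory Bad Block Table: one bit per block, 1 = bad. */
typedef struct {
    uint8_t bad[NUM_BLOCKS / 8];
} bbt_t;

/* Hook that returns the marker byte from a block's spare area. */
typedef uint8_t (*read_marker_fn)(uint32_t block);

/* Stand-in for the hardware: one simulated marker byte per block. */
uint8_t sim_markers[NUM_BLOCKS];
uint8_t sim_read_marker(uint32_t block) { return sim_markers[block]; }

/* Record a block as bad, e.g. after a failed program or erase. */
void bbt_mark_bad(bbt_t *t, uint32_t block)
{
    assert(block < NUM_BLOCKS);
    t->bad[block >> 3] |= (uint8_t)(1u << (block & 7u));
}

int bbt_is_bad(const bbt_t *t, uint32_t block)
{
    assert(block < NUM_BLOCKS);
    return (t->bad[block >> 3] >> (block & 7u)) & 1;
}

/* Initial scan for factory-bad markers; returns how many were found. */
uint32_t bbt_scan(bbt_t *t, read_marker_fn read_marker)
{
    uint32_t count = 0;

    memset(t, 0, sizeof *t);
    for (uint32_t b = 0; b < NUM_BLOCKS; b++) {
        if (read_marker(b) != GOOD_MARKER) {
            bbt_mark_bad(t, b);
            count++;
        }
    }
    return count;
}
```

A real driver would also have to persist the table, since blocks that fail in the field are only recorded by the driver itself.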

(iii) Electrical

Most, if not all, NAND chips have the same broad electrical interface.

There is a master Chip Enable line; nothing happens if this is not active.
(A hardware designer's note below :-)
Be careful here: a standard chip enable is only active during the actual read or write, but an access to a NAND flash is a complete cycle during which the NAND flash's embedded control logic needs to keep its state!
Therefore the Chip Enable (or Chip Select) of the NAND flash is (on my ARM9, anyhow) connected to a GPIO (general-purpose input/output) pin, and the software has to assert this pin at the start of an access and de-assert it at the end.
The real hardware Chip Select pin is not connected.
(In R's SW, in the io/flash_nand/../controller: cyg_nand_ctl_chip_select, which calls chip_select implemented in the board-specific driver in /devs/flash/[uC brand].)
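
To make the point about holding Chip Enable across the whole access concrete, here is a small simulation. The board_t structure and the functions board_chip_select, bus_cycle and nand_access are invented for this sketch and are not the signatures used in either driver; on real hardware board_chip_select would write a GPIO register.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated board state standing in for GPIO and bus hardware. */
typedef struct {
    int      ce_asserted;  /* 1 while the GPIO holds Chip Enable active */
    unsigned cycles_done;  /* bus cycles carried out while CE was active */
} board_t;

/* Hypothetical board hook: drive the GPIO wired to the NAND's CE pin. */
void board_chip_select(board_t *b, int active)
{
    b->ce_asserted = active;
}

/* One command, address or data cycle; only valid while CE is held. */
void bus_cycle(board_t *b)
{
    assert(b->ce_asserted);
    b->cycles_done++;
}

/* A complete access: CE stays asserted across all cycles, so the
 * chip's control logic keeps its state between them. */
void nand_access(board_t *b, unsigned ncycles)
{
    board_chip_select(b, 1);
    for (unsigned i = 0; i < ncycles; i++)
        bus_cycle(b);
    board_chip_select(b, 0);
}
```
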
Data flows into and out of the chip via its data bus, which is 8 or 16 bits
wide, mediated by Read Enable and Write Enable lines.

Commands and addresses are sent on the data bus, but routed to the
appropriate latches by asserting the Address Latch Enable or Command Latch
Enable lines at the same time.
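
The latch routing can be sketched with a simulated chip that records where each byte written on the data bus ends up. All names here are hypothetical, and the number and order of address cycles vary between chips, so treat the low-byte-first order as an assumption to check against the datasheet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Which latch enable line is asserted during a write cycle. */
typedef enum { LATCH_DATA, LATCH_CMD, LATCH_ADDR } latch_t;

/* Simulated chip: records the bytes routed to each latch. */
typedef struct {
    uint8_t cmd[4];
    size_t  ncmd;
    uint8_t addr[8];
    size_t  naddr;
} sim_latches_t;

/* One write cycle: the same data bus, routed by CLE or ALE. */
void bus_write(sim_latches_t *c, latch_t latch, uint8_t value)
{
    if (latch == LATCH_CMD) {
        assert(c->ncmd < sizeof c->cmd);
        c->cmd[c->ncmd++] = value;
    } else if (latch == LATCH_ADDR) {
        assert(c->naddr < sizeof c->addr);
        c->addr[c->naddr++] = value;
    }
    /* LATCH_DATA (neither line asserted) is ignored in this sketch. */
}

/* Send a command byte, then the address one byte per cycle,
 * least significant byte first (assumed order). */
void nand_send_cmd_addr(sim_latches_t *c, uint8_t cmd,
                        uint32_t addr, unsigned addr_cycles)
{
    assert(addr_cycles <= 4);
    bus_write(c, LATCH_CMD, cmd);
    for (unsigned i = 0; i < addr_cycles; i++)
        bus_write(c, LATCH_ADDR, (uint8_t)(addr >> (8u * i)));
}
```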

There is also a ready/busy line which the driver can use to tell when an
operation is in progress. Typical operation times from the Samsung spec
sheet I have to hand are 25us for a page read, 300us for a page program, and
2ms for a block erase.

(iv) Board hook-up

Sometimes the ready/busy line isn't wired in or requires a jumper to be set
to route it. This can be worked around: for a read operation, one can just
insert a delay loop for the prescribed maximum time, while for programs and
erases, most (all?) chips have a "Read Status" command which can be used to
query whether the operation has completed.
We started our driver this way.
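
A rough sketch of waiting via Read Status rather than the ready/busy line follows. The chip is simulated, and the status bit positions (bit 6 for ready, bit 0 for failure) are assumptions to verify against the datasheet of the part in use; all names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define STATUS_FAIL  0x01u  /* assumption: set if the program/erase failed */
#define STATUS_READY 0x40u  /* assumption: set once the chip is ready */

/* Simulated chip: stays busy for a number of polls, then reports a result. */
typedef struct {
    unsigned busy_polls;  /* status reads remaining before the chip is ready */
    int      will_fail;   /* nonzero if the operation should report failure */
} sim_status_chip_t;

/* Stand-in for issuing Read Status and reading one byte back. */
uint8_t read_status(sim_status_chip_t *c)
{
    if (c->busy_polls > 0) {
        c->busy_polls--;
        return 0;
    }
    return (uint8_t)(STATUS_READY | (c->will_fail ? STATUS_FAIL : 0u));
}

/* Poll until ready or until max_polls reads have been made.
 * Returns 0: completed OK; -1: chip reported failure; -2: timed out. */
int nand_wait_status(sim_status_chip_t *c, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        uint8_t st = read_status(c);
        if (st & STATUS_READY)
            return (st & STATUS_FAIL) ? -1 : 0;
    }
    return -2;
}
```

A real driver would bound the loop by elapsed time rather than by a poll count, and would treat a reported failure as a reason to mark the block bad.
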
It can be beneficial to be able to set up the ready/busy line as an
interrupt source, as opposed to having to poll it. Whilst there is an
overhead involved in context-switching, if other application threads have
much to do it may be advantageous overall for the thread waiting for the
NAND to sleep until woken by interrupt.
To speed things up, we now poll the ready/busy line. Using it as an interrupt source is still to do.
Of course, it is possible to put multiple chips on a board. In that case
there needs to be a way to route between them; I would expect this to be
done with the Chip Select line, addressed either by different MMIO addresses
or a separate GPIO or CPLD step. Theoretically, multiple chips could be
hooked up in parallel to give something that looks like a 16 or 32-bit
"wide" chip, but I have never encountered this in the NAND world, and it
would impose a certain extra level of complexity on the driver.
Indeed, this would be difficult: a NAND chip is not a simple memory-mapped device like a NOR flash or SRAM, which are easy to put in parallel.
Bad block management alone makes this difficult: the chips cannot be put in parallel in hardware, because they need to be addressed separately. They must then be made parallel virtually, in software.
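
One way to picture "parallel in software" is to interleave logical blocks across separately addressed chips, as in the sketch below. This is only one possible scheme, not something taken from either driver; the chip count, block count and names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CHIPS       2u     /* hypothetical board with two NAND chips */
#define BLOCKS_PER_CHIP 1024u  /* hypothetical size of each chip, in blocks */

/* Physical location of a block: which chip to select, and which block on it. */
typedef struct {
    uint32_t chip;
    uint32_t block;
} chip_block_t;

/* Interleave logical blocks across the chips. Each chip is still selected
 * and addressed on its own, and can keep its own bad block information. */
chip_block_t map_block(uint32_t logical)
{
    chip_block_t loc;

    assert(logical < NUM_CHIPS * BLOCKS_PER_CHIP);
    loc.chip  = logical % NUM_CHIPS;
    loc.block = logical / NUM_CHIPS;
    return loc;
}
```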

