This is the mail archive of the ecos-devel@sourceware.org mailing list for the eCos project.

Re: Re: NAND technical review

From: Jürgen Lambrecht <J dot Lambrecht at televic dot com>
To: eCos developers <ecos-devel at ecos dot sourceware dot org>
Date: Thu, 8 Oct 2009 10:16:03 +0200
Subject: Re: Re: NAND technical review
References: <4ACB4B58.2040804@ecoscentric.com>

Just some explanatory remarks below, hardware related.

Ross Younger wrote:

<snip>

1. NAND 101 -------------------------------------------------------------
(Those familiar with NAND chips can skip this section, but I appreciate
that not everybody on-list is in the business of writing NAND device
drivers :-) )
(i) Conceptual

<snip>

Now, I mentioned ECC data. NAND technology has a number of underlying limitations, importantly that it has reliability issues. I don't have a full picture - the manufacturers seem to be understandably coy - but my understanding is that on each page, a driver ought to be able to cope with a single bit having flipped either on programming or on reading. The

Such a "broken bit" arises because the transistor that holds the bit is physically broken and is stuck at 1 or at 0 (I don't know whether it can be either). You can then no longer erase it (flip it back to 1) or program it (flip it to 0).

I thought only programming or erasing could break it, not reading?
Is somebody sure about this?

recommended way to achieve this is by storing an ECC in the spare area: the
algorithm published by Samsung is popular, requiring 22 bits of ECC per 256
bytes of data and able to correct a 1 bit error and detect a 2 bit error.
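As a rough illustration of the "22 bits per 256 bytes, correct 1, detect 2" idea, here is a minimal C sketch of a Hamming-style code. It is an assumption-laden stand-in, not the algorithm published by Samsung: the bit layout is different, and the names (ecc_calc, ecc_correct, ecc_result) are made up for this example. The code stores the XOR of the 11-bit positions of all set data bits, plus the XOR of the complements of those positions.

```c
#include <stdint.h>
#include <stddef.h>

#define ECC_DATA_BYTES 256u
#define ECC_POS_MASK   0x7FFu      /* 11 bits address 256 * 8 = 2048 data bits */
#define ECC_CODE_MASK  0x3FFFFFu   /* 22 bits of ECC in total */

typedef enum {
    ECC_OK,             /* data and ECC agree */
    ECC_CORRECTED,      /* one data bit was wrong and has been flipped back */
    ECC_ECC_ERROR,      /* one bit of the stored ECC itself was wrong */
    ECC_UNCORRECTABLE   /* two or more bit errors: detected, not fixable */
} ecc_result;

/* Low 11 bits: XOR of the positions of all set data bits.
 * High 11 bits: XOR of the complemented positions of those bits. */
static uint32_t ecc_calc(const uint8_t *data)
{
    uint32_t p = 0, pn = 0;

    for (uint32_t pos = 0; pos < ECC_DATA_BYTES * 8u; pos++) {
        if (data[pos >> 3] & (1u << (pos & 7u))) {
            p  ^= pos;
            pn ^= ~pos & ECC_POS_MASK;
        }
    }
    return (pn << 11) | p;
}

/* Compare freshly computed ECC with the stored one; fix data if possible. */
static ecc_result ecc_correct(uint8_t *data, uint32_t stored)
{
    uint32_t diff = (ecc_calc(data) ^ stored) & ECC_CODE_MASK;
    uint32_t dp   = diff & ECC_POS_MASK;
    uint32_t dpn  = (diff >> 11) & ECC_POS_MASK;

    if (diff == 0)
        return ECC_OK;
    if ((dp ^ dpn) == ECC_POS_MASK) {
        /* A single flipped data bit at position dp changes both halves
         * in complementary ways, so dp is the position to flip back. */
        data[dp >> 3] ^= (uint8_t)(1u << (dp & 7u));
        return ECC_CORRECTED;
    }
    if ((diff & (diff - 1u)) == 0)
        return ECC_ECC_ERROR;       /* exactly one ECC bit differs */
    return ECC_UNCORRECTABLE;
}
```

With two flipped data bits, both halves of the difference come out equal and non-zero, so the single-bit test fails and the error is reported rather than miscorrected.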
There is also the question of bad blocks. Again, full details are sketchy. A chip may be shipped with a number of "factory-bad" blocks (e.g. up to 20 on this Samsung chip); they are marked as such in their spare area. (What constitutes a "bad" block is not published; one imagines that the factory have access to more test information than users do and that there may be statistical techniques involved in judging the likely reliability of the block.) Blocks may also fail during the life of the device, usually by the

NAND flash chips are very dense chips (many bits in a small area) and there is a trade-off in manufacturing between reliability and density. To make them dense (hence cheap), faults have to be tolerated. The manufacturer just tries to program all bits a first time to check for manufacturing errors. When a broken bit is discovered, the entire block is marked bad.

chip reporting a failure during a program or erase operation. Because of this, the manufacturers recommend that chip drivers scan the device for factory-bad markers then create and maintain a Bad Block Table throughout the life of the device. How this is done is not prescribed, but the behaviour of the Linux MTD layer is something approximating a de facto standard.
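The scan-then-maintain approach could be sketched in C roughly as follows. This is only an illustration under stated assumptions: the marker is taken to be one spare-area byte per block that reads 0xFF when the block is good (check the chip's datasheet for the real location and meaning), the table lives in RAM only, and every name here (bbt_t, bbt_scan, read_marker_fn, demo_read_marker) is hypothetical rather than taken from eCos or Linux MTD.

```c
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

#define MAX_BLOCKS 4096u

/* Hypothetical hook: return the bad-block marker byte of a block's spare area. */
typedef uint8_t (*read_marker_fn)(uint32_t block, void *ctx);

typedef struct {
    uint32_t num_blocks;
    uint8_t  bad[MAX_BLOCKS / 8u];   /* one bit per block, 1 = bad */
} bbt_t;

static bool bbt_is_bad(const bbt_t *t, uint32_t block)
{
    if (block >= t->num_blocks)
        return true;                 /* out of range: never use it */
    return (t->bad[block / 8u] >> (block % 8u)) & 1u;
}

/* Called at run time when a program or erase operation reports failure. */
static void bbt_mark_bad(bbt_t *t, uint32_t block)
{
    if (block < t->num_blocks)
        t->bad[block / 8u] |= (uint8_t)(1u << (block % 8u));
}

/* Build the table from the factory markers; returns the number of bad blocks. */
static uint32_t bbt_scan(bbt_t *t, uint32_t num_blocks,
                         read_marker_fn rd, void *ctx)
{
    uint32_t count = 0;

    memset(t, 0, sizeof *t);
    t->num_blocks = (num_blocks <= MAX_BLOCKS) ? num_blocks : MAX_BLOCKS;
    for (uint32_t b = 0; b < t->num_blocks; b++) {
        if (rd(b, ctx) != 0xFF) {    /* assumption: 0xFF means "good" */
            bbt_mark_bad(t, b);
            count++;
        }
    }
    return count;
}

/* Demo stand-in for real hardware: markers come from an array in RAM. */
static uint8_t demo_read_marker(uint32_t block, void *ctx)
{
    return ((const uint8_t *)ctx)[block];
}
```

A real driver would also persist the table (or rescan at boot), and must read the factory markers before the first erase, since an erase can wipe them.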

<snip>

(iii) Electrical

Most, if not all, NAND chips have the same broad electrical interface.

There is a master Chip Enable line; nothing happens if this is not active.

(below, a hardware designer's note :-) Be careful here: a standard chip enable is only active during the actual read or write. But an access to a NAND flash is a complete cycle during which the NAND flash's embedded control logic needs to keep its state! Therefore the Chip Enable (or Chip Select) of the NAND flash is (on my ARM9 anyhow) connected to a GPIO (general-purpose input/output) pin, and the SW has to assert this pin at the start of an access and de-assert it at the end. The real hardware Chip Select pin is not connected. (In R's SW this is cyg_nand_ctl_chip_select in io/flash_nand/../controller, which calls chip_select implemented in the board-specific driver in /devs/flash/[uC brand].)
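To make the point about holding Chip Enable across a whole access concrete, here is a minimal C sketch. It does not reproduce the eCos functions named above; the GPIO register is simulated by a plain variable, the pin number is invented, and nand_chip_select, nand_with_chip_selected and demo_op are hypothetical names. The only hardware fact assumed is that CE# is active low, as on the usual NAND parts.

```c
#include <stdint.h>
#include <stdbool.h>

#define NAND_CE_PIN (1u << 3)   /* hypothetical GPIO bit wired to CE# */

/* Stand-in for a GPIO output register; CE# starts high (deselected). */
static volatile uint32_t fake_gpio_out = NAND_CE_PIN;

static void nand_chip_select(bool select)
{
    if (select)
        fake_gpio_out &= ~NAND_CE_PIN;   /* drive CE# low: chip active */
    else
        fake_gpio_out |= NAND_CE_PIN;    /* drive CE# high: chip idle */
}

static bool nand_is_selected(void)
{
    return (fake_gpio_out & NAND_CE_PIN) == 0;
}

/* One complete access: command, address, wait and data phases. */
typedef int (*nand_op_fn)(void *arg);

/* Keep CE# asserted for the entire operation, not per bus cycle. */
static int nand_with_chip_selected(nand_op_fn op, void *arg)
{
    int rc;

    nand_chip_select(true);
    rc = op(arg);
    nand_chip_select(false);
    return rc;
}

/* Demo operation: just records whether the chip was selected while it ran. */
static int demo_op(void *arg)
{
    *(bool *)arg = nand_is_selected();
    return 0;
}
```

Wrapping the whole operation in one helper makes it hard to forget the de-assert on an error path, which is the main reason for this shape.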

Data flows into and out of the chip via its data bus, which is 8 or 16 bits
wide, mediated by Read Enable and Write Enable lines.

Commands and addresses are sent on the data bus, but routed to the
appropriate latches by asserting the Address Latch Enable or Command Latch
Enable lines at the same time.
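A small C sketch of the command/address latch idea, using the common Read ID sequence (command 0x90, then one address byte 0x00, then data reads). The bus and the chip are simulated here so the example is self-contained: bus_write, bus_read, the latch enum and the two ID bytes are all invented for illustration, and on a real board the latch selection would be done by GPIOs or address lines as the board design dictates.

```c
#include <stdint.h>
#include <stddef.h>

#define NAND_CMD_READ_ID 0x90u

/* Which latch enable line is asserted while a byte is on the data bus. */
enum latch { LATCH_DATA, LATCH_CMD, LATCH_ADDR };

/* --- Simulated chip: arbitrary ID bytes, not those of any real part --- */
static const uint8_t sim_id[2] = { 0xAA, 0x55 };
static uint8_t sim_cmd  = 0xFF;
static uint8_t sim_addr = 0xFF;
static size_t  sim_rd   = 0;

/* Write one byte; CLE or ALE routes it to the command or address latch. */
static void bus_write(enum latch l, uint8_t v)
{
    if (l == LATCH_CMD) {
        sim_cmd  = v;
        sim_addr = 0xFF;    /* a new command needs a fresh address cycle */
        sim_rd   = 0;
    } else if (l == LATCH_ADDR) {
        sim_addr = v;
    }
}

/* Read one data byte (Read Enable strobe, neither latch asserted). */
static uint8_t bus_read(void)
{
    if (sim_cmd == NAND_CMD_READ_ID && sim_addr == 0x00 &&
        sim_rd < sizeof sim_id)
        return sim_id[sim_rd++];
    return 0xFF;
}

/* --- Driver side: command, then address, then data --- */
static void nand_read_id(uint8_t *id, size_t n)
{
    bus_write(LATCH_CMD, NAND_CMD_READ_ID);   /* CLE asserted */
    bus_write(LATCH_ADDR, 0x00);              /* ALE asserted */
    for (size_t i = 0; i < n; i++)
        id[i] = bus_read();                   /* plain data cycles */
}
```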

There is also a ready/busy line which the driver can use to tell when an
operation is in progress. Typical operation times from the Samsung spec
sheet I have to hand are 25us for a page read, 300us for a page program, and
2ms for a block erase.

(iv) Board hook-up

<snip>

Sometimes the ready/busy line isn't wired in or requires a jumper to be set to route it. This can be worked around: for a read operation, one can just insert a delay loop for the prescribed maximum time, while for programs and erases, most (all?) chips have a "Read Status" command which can be used to query whether the operation has completed.
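The Read Status workaround might look like the following C sketch. The status bit positions used (bit 6 = ready, bit 0 = operation failed) are the common convention as far as I know, but should be checked against the chip's datasheet. The callback type, the poll limit and the demo status functions are hypothetical, and stand in for issuing the Read Status command and reading one byte back.

```c
#include <stdint.h>

#define NAND_STATUS_FAIL  0x01u   /* assumed: set if program/erase failed */
#define NAND_STATUS_READY 0x40u   /* assumed: set when the chip is ready */

/* Hypothetical hook: issue Read Status and return the status byte. */
typedef uint8_t (*read_status_fn)(void *ctx);

typedef enum {
    NAND_WAIT_OK,
    NAND_WAIT_FAILED,    /* chip finished but reported a failure */
    NAND_WAIT_TIMEOUT    /* chip never became ready within max_polls */
} nand_wait_result;

/* Poll the status register until ready, giving up after max_polls reads. */
static nand_wait_result nand_wait_status(read_status_fn rd, void *ctx,
                                         uint32_t max_polls)
{
    for (uint32_t i = 0; i < max_polls; i++) {
        uint8_t st = rd(ctx);

        if (st & NAND_STATUS_READY)
            return (st & NAND_STATUS_FAIL) ? NAND_WAIT_FAILED
                                           : NAND_WAIT_OK;
    }
    return NAND_WAIT_TIMEOUT;
}

/* Demo chip: busy for *ctx polls, then ready with no error. */
static uint8_t demo_status(void *ctx)
{
    uint32_t *busy_polls = ctx;

    if (*busy_polls > 0) {
        (*busy_polls)--;
        return 0x00;
    }
    return NAND_STATUS_READY;
}

/* Demo chip: immediately ready, but reporting a failed operation. */
static uint8_t demo_status_fail(void *ctx)
{
    (void)ctx;
    return NAND_STATUS_READY | NAND_STATUS_FAIL;
}
```

A bounded poll count keeps a dead or missing chip from hanging the caller; a real driver would size it from the datasheet's maximum program and erase times.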

We started our driver this way.

It can be beneficial to be able to set up the ready/busy line as an interrupt source, as opposed to having to poll it. Whilst there is an overhead involved in context-switching, if other application threads have much to do it may be advantageous overall for the thread waiting for the NAND to sleep until woken by interrupt.

To speed things up, we now poll the ready/busy line. Using it as an interrupt is still to do.

Of course, it is possible to put multiple chips on a board. In that case there needs to be a way to route between them; I would expect this to be done with the Chip Select line, addressed either by different MMIO addresses or a separate GPIO or CPLD step. Theoretically, multiple chips could be hooked up in parallel to give something that looks like a 16 or 32-bit "wide" chip, but I have never encountered this in the NAND world, and it would impose a certain extra level of complexity on the driver.

Indeed, this would be difficult: a NAND is not a simple memory-mapped device like a NOR flash or SRAM, which are easy to put in parallel. Bad block management alone makes it difficult: the chips cannot be put in parallel in hardware, so they need to be addressed separately and then made parallel virtually in software.

Regards,
Jürgen

Follow-Ups:
- Re: NAND technical review
  - From: Jonathan Larmour

References:
- Re: NAND technical review
  - From: Ross Younger
