I have a board in production which uses the same ISSI SDRAM with an identical schematic to the XA-SK-SDRAM slice card. About 85% of these boards work absolutely perfectly and reliably. However, I'm getting some boards which have write/read errors that I'm having real trouble getting to the bottom of. Continuity between the processor and SDRAM appears to be fine.
I've written a test function, based on the sc_sdram_burst testbench code, which shows several curious aspects of the problem:
#define TIMER_TICKS_PER_US PLATFORM_REFERENCE_MHZ
#define READ_DELAY 100 /* delay between write and read-back, in microseconds */

/* Pack bank, row and word index into one pattern so every location holds a distinct value. */
static unsigned makeTestWord(unsigned bank, unsigned row, unsigned word)
{
    return (bank + (row << (SDRAM_BANK_ADDRESS_BITS)) + (word << (SDRAM_BANK_ADDRESS_BITS+SDRAM_ROW_ADDRESS_BITS)));
}

static void address_test3(chanend c_server)
{
    unsigned buffer[SDRAM_ROW_WORDS];
    unsigned num_rows = SDRAM_ROW_COUNT;
    timer T;
    int time;
    for (unsigned bank = 0; bank < SDRAM_BANK_COUNT; bank++)
    {
        for (unsigned row = 0; row < num_rows; row++)
        {
            /* Fill one full row with the test pattern and write it out. */
            for (unsigned word = 0; word < SDRAM_ROW_WORDS; word++)
            {
                buffer[word] = makeTestWord(bank, row, word);
            }
            sdram_buffer_write(c_server, bank, row, 0, SDRAM_ROW_WORDS, buffer);
            sdram_wait_until_idle(c_server, buffer);

            /* Wait READ_DELAY microseconds before reading the row back. */
            T :> time;
            T when timerafter(time + (READ_DELAY * TIMER_TICKS_PER_US)) :> time;
            sdram_buffer_read(c_server, bank, row, 0, SDRAM_ROW_WORDS, buffer);
            sdram_wait_until_idle(c_server, buffer);

            /* Compare against the expected pattern; report only the first bad word in each row. */
            for (unsigned word = 0; word < SDRAM_ROW_WORDS; word++)
            {
                if(makeTestWord(bank, row, word) != buffer[word])
                {
                    printstr("Failed address_test3 at bank 0x");
                    printhex(bank);
                    printstr(" row 0x");
                    printhex(row);
                    printstr(" word 0x");
                    printhex(word);
                    printstr(" - should be 0x");
                    printhex(makeTestWord(bank, row, word));
                    printstr(" , read 0x");
                    printhexln(buffer[word]);
                    break;
                }
            }
        }
    }
}
On most boards, this code runs with no errors. On a faulty board I get output similar to:
Failed address_test3 at bank 0x0 row 0x7E0 word 0x24 - should be 0x91F80 , read 0x91780
Failed address_test3 at bank 0x0 row 0xE27 word 0x56 - should be 0x15B89C , read 0x15B09C
Failed address_test3 at bank 0x2 row 0xFA4 word 0x0 - should be 0x3E92 , read 0x3692
Failed address_test3 at bank 0x3 row 0xFB0 word 0x2A - should be 0xABEC3 , read 0xAB6C3
Things I've discovered through experimentation with this function:
1. The longer I make READ_DELAY, the more errors I get. With READ_DELAY at 0 I get no errors, even on a faulty board.
2. The errors don't happen at the same addresses every time I run the application.
3. The errors always happen at a row with bit 0x0200 set, and the error in the data is always the 0x0800 bit reading back as 0 when it should be 1.
But now I'm not quite sure where to go next. Point (1) above suggests to me that this is a refresh problem, but if that's the case then I don't understand why it would vary between boards.
Any suggestions of what to try next would be gratefully received as this one is baffling me.
SDRAM errors
- XCore Addict
- Posts: 131
- Joined: Wed Aug 03, 2011 9:13 am
The part I have used is the IS42S16400J-7TL. This is the updated version of the IS42S16400F-7TL used on the XA-SK-SDRAM, which is now EOL. As far as I can see from the datasheets, there are no performance or timing differences between the two parts.
I don't have a slicekit to test the code with, but my test code is just a slightly modified version of the standard testbench code. It's not doing anything complicated or performance testing - just writing to a row, then a short delay, then reading the data back to check that it's the same. On about 85% of my boards it's absolutely rock solid and I've got dozens of these boards out in the field working absolutely fine, but on the remaining 15% it isn't. I don't really understand why as there's no obvious hardware fault and I'm not quite sure what to try next given the above test results.