AN00170 "Using the SDRAM library"
-
- Member
- Posts: 12
- Joined: Thu Apr 14, 2016 8:14 pm
Looking forward to the new documentation. One question in the meantime - are there any plans to increase the total amount of memory that lib_SDRAM can address beyond 256M bits?
-
Verified
- XCore Legend
- Posts: 1156
- Joined: Thu May 27, 2010 10:08 am
There is no intrinsic limit at 256Mb in the code now - it's just that we haven't implemented or tested beyond that. The previous version (coded for XS1) had a hard limit at 64Mb because of the way loop unrolling was done; it used a near addressing mode that only allowed a 6b offset (2^6 used twice = 128 long words = 256 16b words), limiting the column address to 8b, which tops out at 64Mb for a 16b wide memory.
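As a rough check on the figures above, the capacity of an SDRAM follows from its address geometry. The sketch below is illustrative only: `sdram_capacity_bits` is a hypothetical helper, not part of lib_sdram, and the 12b row / 2b bank geometry assumed for the 64Mb case is typical of a 1M x 16 x 4-bank part rather than something stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of lib_sdram): total capacity in bits,
 * derived from the address geometry and data width of the device. */
static uint64_t sdram_capacity_bits(unsigned row_address_bits,
                                    unsigned col_address_bits,
                                    unsigned bank_address_bits,
                                    unsigned data_width_bits)
{
    unsigned address_bits = row_address_bits + col_address_bits + bank_address_bits;
    assert(address_bits < 58); /* keep the shift and multiply well inside 64 bits */
    return ((uint64_t)1 << address_bits) * data_width_bits;
}
```

With an 8b column address, sdram_capacity_bits(12, 8, 2, 16) gives 64Mb, while the 13b row / 9b column geometry passed to sdram_server later in this thread gives 256Mb.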
The write inner loop for XS2 is able to use a loop counter as it has the right addressing modes, whereas the read still needs to use the unrolled loop with branch calculation, although it is no longer limited to 128 loops due to using dual-issue. It subtracts the number of words to read from the number of row words, multiplies by 2 and then branches to the right point in the unrolled loop (see below).
Code: Select all
ldw w_temp, sp[1+WRITE_STACK_WORDS] //row_words
sub w_temp, w_temp, r3
add w_temp, w_temp, w_temp
ldw w_ras, r1[2] //ras
ldw w_we, r1[3] //we
ldw w_dq_ah, r1[0] //dq_ah
.align 4
ldc w_two, 2
bru w_temp
//(blocking)
#include "sdram_block_write_body.inc"
This is not tested, but I don't foresee any issues here.
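The branch calculation described above can be modelled in C as a minimal sketch. `unrolled_loop_entry` is a hypothetical name; the factor of 2 simply mirrors the doubling done by "add w_temp, w_temp, w_temp" in the assembly, and what one branch unit corresponds to in the real unrolled loop is an assumption here.

```c
#include <assert.h>

/* Hypothetical model of the branch calculation: enter an unrolled loop of
 * row_words iterations part-way through, so that only words_to_transfer
 * iterations remain to execute. The result is the offset handed to "bru". */
static unsigned unrolled_loop_entry(unsigned row_words, unsigned words_to_transfer)
{
    assert(words_to_transfer <= row_words);
    return (row_words - words_to_transfer) * 2;
}
```

A full-row transfer enters at offset 0 and runs the whole unrolled body; a shorter transfer skips the leading iterations.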
-
- Member
- Posts: 12
- Joined: Thu Apr 14, 2016 8:14 pm
Hi InfiniteImprobability,
We have started testing with lib_sdram running on our hardware without success so far. As previously posted, our hardware uses port 16B on tile 1 for the ADQ bus and ports 1A-1D for the control bus. The SDRAM part number is a Micron MT48LC16M16A2P-6A:G (167MHz, CL = 3, tAC = 5.4ns max, tOH = 3ns min).
During testing, several clock dividers (4, 5, & 6), CAS values (2, 3, & 4), #define N values (0, 1, & 2) in io.S, and set_pad_delay(dq_ah, n) values (0, 1, & 2) in server.xc were tried, with no observable differences in behavior.
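For reference, the divider values map onto SD_CLK frequencies roughly as sketched below. The formula is an assumption (a 500MHz reference divided by 2 x divider), chosen because it reproduces the pairs quoted in this thread (25 gives 10MHz; 4, 5 and 6 give 62.5, 50 and 41.7MHz); it is not taken from the lib_sdram source, and `sdram_clk_mhz` is a hypothetical helper.

```c
#include <assert.h>

/* Assumption: SD_CLK = 500 MHz / (2 * divider). This fits the divider and
 * frequency pairs quoted in the thread but is not confirmed by lib_sdram. */
static double sdram_clk_mhz(unsigned divider)
{
    assert(divider > 0);
    return 500.0 / (2.0 * divider);
}
```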
Scoping the SDRAM chip at the package with a spring ground instead of a ground clip wire, the power supply rails are clean, as well as SD_CLK, WE#, CAS#, RAS# and DQM signals. The ADQ bus signals look clean as well, but only a few can be scoped at a time without a logic analyzer; that also makes it difficult to know if signals are in the right place (I suspect it's quite a task even with an analyzer!).
Our test code was adapted from the github example, and set_pad_delay(dq_ah, 0) was used along with #define N (1). Based on the timing spreadsheet, these values provided plenty of margin with a 50MHz clock:
Code: Select all
#include <xs1.h>
#include <platform.h>
#include <stdio.h>
#include <stdlib.h>
#include "sdram.h"
#define SERVER_TILE 1
on tile[SERVER_TILE] : out buffered port:32 sdram_dq_ah = XS1_PORT_16B;
on tile[SERVER_TILE] : out buffered port:32 sdram_cas = XS1_PORT_1D;
on tile[SERVER_TILE] : out buffered port:32 sdram_ras = XS1_PORT_1C;
on tile[SERVER_TILE] : out buffered port:8 sdram_we = XS1_PORT_1B;
on tile[SERVER_TILE] : out port sdram_clk = XS1_PORT_1A;
on tile[SERVER_TILE] : clock sdram_cb = XS1_CLKBLK_1;
#define ADDR_START 0x0
void application(streaming chanend c_server) {
#define BUF_WORDS (8)
  unsigned read_buffer[BUF_WORDS];
  unsigned write_buffer[BUF_WORDS];
  unsigned * movable read_buffer_pointer = read_buffer;
  unsigned * movable write_buffer_pointer = write_buffer;
  s_sdram_state sdram_state;
  sdram_init_state(c_server, sdram_state);
  printf("Start SDRAM Demo\n");
  //Fill the memory initially with known pattern
  for(unsigned i=0;i<BUF_WORDS;i++){
    write_buffer_pointer[i] = 0xdeadbeef;
  }
  sdram_write(c_server, sdram_state, ADDR_START, BUF_WORDS, move(write_buffer_pointer));
  sdram_complete(c_server, sdram_state, write_buffer_pointer);
  // Read back starting from the same address
  sdram_read (c_server, sdram_state, ADDR_START, BUF_WORDS, move(read_buffer_pointer));
  sdram_complete(c_server, sdram_state, read_buffer_pointer);
  for(unsigned i=0;i<BUF_WORDS;i++){
    if(read_buffer_pointer[i] != write_buffer_pointer[i])
      printf(" Failure at word %d: Value written = %08x Value read = %08x \n", i, write_buffer_pointer[i], read_buffer_pointer[i]);
    else
      printf(" Success at word %d: Value written = %08x Value read = %08x \n", i, write_buffer_pointer[i], read_buffer_pointer[i]);
  }
  printf("SDRAM demo complete.\n");
}
int main() {
  streaming chan c_sdram[1];
  par {
    on tile[SERVER_TILE]:sdram_server(c_sdram, 1,
      sdram_dq_ah,
      sdram_cas,
      sdram_ras,
      sdram_we,
      sdram_clk,
      sdram_cb,
      3, 256, 16, 9, 13, 2, 64, 8192, 5); //IS45S16160D 256Mb option (Note: we're using Micron MT48LC16M16A2P-6A:G)
    on tile[SERVER_TILE]: application(c_sdram[0]);
  }
  return 0;
}
If ADDR_START is set to 0x0, the following output is produced:
Start SDRAM Demo
Failure at word 0: Value written = deadbeef Value read = 00000000
Failure at word 1: Value written = deadbeef Value read = 00000000
Failure at word 2: Value written = deadbeef Value read = 00000000
Failure at word 3: Value written = deadbeef Value read = 00000000
Failure at word 4: Value written = deadbeef Value read = 00000000
Failure at word 5: Value written = deadbeef Value read = 00000000
Failure at word 6: Value written = deadbeef Value read = 00000000
Failure at word 7: Value written = deadbeef Value read = 00000000
SDRAM demo complete.
Setting ADDR_START to 0xe produces garbage with every word the same value:
Start SDRAM Demo
Failure at word 0: Value written = deadbeef Value read = 001c001c
Failure at word 1: Value written = deadbeef Value read = 001c001c
Failure at word 2: Value written = deadbeef Value read = 001c001c
Failure at word 3: Value written = deadbeef Value read = 001c001c
Failure at word 4: Value written = deadbeef Value read = 001c001c
Failure at word 5: Value written = deadbeef Value read = 001c001c
Failure at word 6: Value written = deadbeef Value read = 001c001c
Failure at word 7: Value written = deadbeef Value read = 001c001c
SDRAM demo complete.
When it's set to 0xf:
Start SDRAM Demo
Failure at word 0: Value written = deadbeef Value read = 001e001e
Failure at word 1: Value written = deadbeef Value read = 001e001e
Failure at word 2: Value written = deadbeef Value read = 001e001e
Failure at word 3: Value written = deadbeef Value read = 001e001e
Failure at word 4: Value written = deadbeef Value read = 001e001e
Failure at word 5: Value written = deadbeef Value read = 001e001e
Failure at word 6: Value written = deadbeef Value read = 001e001e
Failure at word 7: Value written = deadbeef Value read = 001e001e
SDRAM demo complete.
Any idea what might be going on?
- Tony
-
Verified
- XCore Legend
- Posts: 1156
- Joined: Thu May 27, 2010 10:08 am
Hi Tony,
Sorry to hear it’s not working out of the box. I felt pretty confident about this one.
OK, well, let’s fault-find systematically.
Sounds like you have done some reasonable sanity checking for the HW. Signal-wise we should be able to slow everything down to take normal timing out of the equation, as long as either N=1 or set_port_sample_delay() is set. For example:
Code: Select all
divider = 25 (10MHz)
#define N (0)
set_port_sample_delay(1) not commented out.
set_pad_delay(dq_ah, 0);
This will give bags of margin (22ns setup, 63ns hold). I tried this on my board here and all worked as expected.
At this speed, the only signal that can mess things up is the clock (reflections or induced noise from other signals causing false edges); every other signal will have lots of time to settle before being sampled. So if you have probed the SDRAM end of this signal with a high frequency probe and it looks OK, then we can likely rule out hardware.
So assuming for now that HW is all good, it comes down to configuring/driving the SDRAM state machine. This is good because we have complete control over this in software.
From your tests, it looks like the xCORE is sampling back the second part of the address (the column, which is provided during the read command) left shifted by one:
0x0 << 0x1 = 0x0
0xe << 0x1 = 0x1c
0xf << 0x1 = 0x1e
If you look at the function addr_to_col() in server.xc, you can see the left shift by one, so that fits. It looks like the SDRAM isn’t responding to the read at all and the bus has held the last value.
It is reading the same value twice because the port for the SDRAM is set up as a 16b port, which is read twice in 2 consecutive cycles with a single 32b read.
So I think it’s worth looking at the setup to make sure that the xCORE controller has been configured properly. Let me check the arguments you are passing into the sdram server.
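The diagnosis above can be checked with a small model. This is a sketch under the stated explanation only: `stale_bus_read` is a hypothetical function, and the real addr_to_col() in server.xc may do more than the left shift mentioned here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the failure: the SDRAM never drives the bus, so the
 * 16b port samples the last value the controller drove (the column address,
 * shifted left by one) on two consecutive cycles, and both samples are
 * packed into a single 32b word. */
static uint32_t stale_bus_read(uint32_t start_address)
{
    assert(start_address <= 0x7fff); /* shifted value must fit the 16b bus */
    uint16_t held = (uint16_t)(start_address << 1);
    return ((uint32_t)held << 16) | held;
}
```

For ADDR_START values of 0x0, 0xe and 0xf this yields 00000000, 001c001c and 001e001e, matching the three failing runs reported earlier.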
-
Verified
- XCore Legend
- Posts: 1156
- Joined: Thu May 27, 2010 10:08 am
Arguments passed to server all look fine.
Hmm - I guess the next step will be to look at SDRAM initialisation for that part. We have tested ISSI devices at this end but not Micron.
-
- Member
- Posts: 12
- Joined: Thu Apr 14, 2016 8:14 pm
Hi Infiniteimprobability,
The 10MHz parameters are producing the same results seen here at 50MHz. I also re-checked SD_CLK rise and fall with an analog scope that has more bandwidth than my DSO, and it still looks good. It's still low-pass filtered by the analog scope's BW, but no reflections are observable in the edges, and ringing is not seen either.
Thanks for demystifying the results. At least the values we're getting back are explainable!
A quick review of the datasheets for the ISSI (p. 21) and Micron (p. 42) parts reveals differences in their initialization requirements. Micron wants the CKE pin held low when the clock is first applied, a command issued, and then the CKE pin can be pulled high. The process looks similar after that. Similar to the XMOS reference design, our board simply has the CKE pin pulled high through a 10k resistor. It's tempting to think that a suitably sized capacitor on CKE could be used in conjunction with the 10k pullup to meet Micron's requirements, but a 1-bit port may be needed.
Unfortunately, the IS45S16160D part is not readily available in the U.S. The more recent IS45S16160G has some availability in the 143MHz speed (a few hundred on the shelf). It's not nearly as available as the Micron part (tens of thousands currently), but if it can get us through the proof-of-concept and prototype phases, it might get us out of the SDRAM rabbit hole for long enough to address the Micron issues. Luckily, the TSOPII pinout is the same for both parts.
We will eventually need to move to an SDRAM that can be sustained in production whether it's Micron or another vendor. In the meantime, do you think the available IS45S16160G-7 is close enough to the IS45S16160D-7 you are using to be worth a try?
Thanks!
- Tony
-
Verified
- XCore Legend
- Posts: 1156
- Joined: Thu May 27, 2010 10:08 am
Hi Tony - yes, I saw the issue with CKE in the Micron datasheet, which cannot be addressed without a HW change. However, I also found a few potential issues with the init code. I have made some updates here:
https://github.com/ed-xmos/lib_sdram/tr ... ion_update
Can you try the version of server.xc on that branch and see if it helps?
-
- Member
- Posts: 12
- Joined: Thu Apr 14, 2016 8:14 pm
Hi Infiniteimprobability - Now we're getting somewhere!! Initial results with ADDR_START set to 0x0:
Start SDRAM Demo
Failure at word 0: Value written = deadbeef Value read = beefdead
Failure at word 1: Value written = deadbeef Value read = beefdead
Failure at word 2: Value written = deadbeef Value read = beefdead
Failure at word 3: Value written = deadbeef Value read = beefdead
Failure at word 4: Value written = deadbeef Value read = beefdead
Failure at word 5: Value written = deadbeef Value read = beefdead
Failure at word 6: Value written = deadbeef Value read = beefdead
Failure at word 7: Value written = deadbeef Value read = a214dead
SDRAM demo complete.
Changing the code that stuffs the write buffer from:
Code: Select all
write_buffer_pointer[i] = 0xdeadbeef;
To:
Code: Select all
write_buffer_pointer[i] = 0xdeadbeef + i;
...but keeping ADDR_START at 0x0 produced the result:
Start SDRAM Demo
Failure at word 0: Value written = deadbeef Value read = bef0dead
Failure at word 1: Value written = deadbef0 Value read = bef1dead
Failure at word 2: Value written = deadbef1 Value read = bef2dead
Failure at word 3: Value written = deadbef2 Value read = bef3dead
Failure at word 4: Value written = deadbef3 Value read = bef4dead
Failure at word 5: Value written = deadbef4 Value read = bef5dead
Failure at word 6: Value written = deadbef5 Value read = bef6dead
Failure at word 7: Value written = deadbef6 Value read = a214dead
SDRAM demo complete.
Now changing ADDR_START to 0xe produces a slightly different result at word 7:
Failure at word 7: Value written = deadbef6 Value read = beefdead
...and changing ADDR_START to 0xf also produces a slightly different result at word 7:
Failure at word 7: Value written = deadbef6 Value read = 8042dead
The results above were produced with SD_CLK at 10MHz, #define N (1), set_pad_delay(dq_ah, 0), and set_port_sample_delay(dq_ah) not commented out. It appears to be reading 24 bits ahead. As a test, cas_latency was changed from 3 to 2 with no luck, so it was set back to 3. Trying different SD_CLK frequencies, 10, 41.7, or 62.5MHz produced similar failures, but 50MHz produces:
Start SDRAM Demo
Success at word 0: Value written = deadbeef Value read = deadbeef
Success at word 1: Value written = deadbef0 Value read = deadbef0
Success at word 2: Value written = deadbef1 Value read = deadbef1
Success at word 3: Value written = deadbef2 Value read = deadbef2
Success at word 4: Value written = deadbef3 Value read = deadbef3
Success at word 5: Value written = deadbef4 Value read = deadbef4
Success at word 6: Value written = deadbef5 Value read = deadbef5
Success at word 7: Value written = deadbef6 Value read = deadbef6
SDRAM demo complete.
It looks like the Micron part is now initialized properly (without a fly wire to a 1-bit port to toggle CKE) and we are on the edge of getting things to work! Changing ADDR_START from 0xf to 0xe however produces:
Start SDRAM Demo
Failure at word 0: Value written = deadbeef Value read = debdbeef
Failure at word 1: Value written = deadbef0 Value read = deadbeef
Failure at word 2: Value written = deadbef1 Value read = deadbef8
Failure at word 3: Value written = deadbef2 Value read = deadbef1
Failure at word 4: Value written = deadbef3 Value read = defdbef6
Failure at word 5: Value written = deadbef4 Value read = deadbef3
Failure at word 6: Value written = deadbef5 Value read = febdbef4
Failure at word 7: Value written = deadbef6 Value read = feafbef5
SDRAM demo complete.
Changing ADDR_START from 0xe to 0x0 gives us success again (maybe the starting address has to be a 16 bit multiple?). Let me know what to try next and I'll report back.
- Tony
-
Verified
- XCore Legend
- Posts: 1156
- Joined: Thu May 27, 2010 10:08 am
This is good progress. I am not sure what the Micron chip does with CKE internally, but perhaps it really is just a clock enable rather than some reset logic. Either way though, you should consider connecting up CKE to a port for future revs to be within spec.
I think things are pretty much working. 50MHz looks good. Also, the following:
"The results above were produced with SD_CLK at 10MHz, #define N (1), set_pad_delay(dq_ah, 0), and set_port_sample_delay(dq_ah) not commented out. It appears to be reading 24 bits ahead."
...is also as expected. At 10MHz N needs to be zero, otherwise you are reading 16 bits too late. Change it back to 0 for 10MHz and I would expect everything to be fine.
I am not sure what the effect of ADDR_START is yet; it may be that this rather simple example is not doing what you'd expect with non-aligned addresses (I need to have a quick look). I have been using /lib_sdram/tests/sdram_testbench to check operation (which does a more thorough test including refresh etc.). This test is only visible on the github repo rather than the downloaded zip.
Thanks for your patience by the way. It's been a bit of a rocky road but hopefully we're nearly there now.
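The "16 bits too late" behaviour can be reproduced with a small model, offered as a sketch rather than a description of the library internals. It assumes each 32b word crosses the 16b bus low halfword first, which fits the 10MHz output earlier in the thread; `read_one_halfword_late` and its `next_low` parameter (standing in for whatever follows the last word on the bus) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: sampling starts one 16b halfword late, so each word
 * read pairs the high half of written word i (as its low half) with the low
 * half of written word i+1 (as its high half). */
static void read_one_halfword_late(const uint32_t *written, uint32_t *read,
                                   size_t words, uint16_t next_low)
{
    assert(written != NULL && read != NULL);
    for (size_t i = 0; i < words; i++) {
        uint16_t low = (uint16_t)(written[i] >> 16);
        uint16_t high = (i + 1 < words) ? (uint16_t)written[i + 1] : next_low;
        read[i] = ((uint32_t)high << 16) | low;
    }
}
```

Feeding in deadbeef, deadbef0, deadbef1 gives bef0dead, bef1dead and a final word whose upper half is whatever follows on the bus, the same shape as the failing 10MHz run with N set to 1.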
-
- Member
- Posts: 12
- Joined: Thu Apr 14, 2016 8:14 pm
Hi Infiniteimprobability,
Yes - very good progress. Thanks again for your fast and expert help! It looks like we have enough to get things rolling with the Micron chips for now.
We might put an IS45S16160G-7 (the closest we can get to the IS45S16160D-7) on a proof-of-concept board for testing since there is some ambiguity about Micron's use of the CKE pin. We can revisit using the more available Micron part when we get closer to production, wire up the CKE pin to a port on the next PCB rev if needed, and maybe select a different 16-bit port. We'll also look into the sdram_testbench code to do stress testing.
When testing with the ISSI part, I assume we need to use the released version of server.xc and not the one modified for Micron - correct? Also, can BUF_WORDS be made larger?
- Tony