AN00170 "Using the SDRAM library"

PlayingWithWires
Member
Posts: 12
Joined: Thu Apr 14, 2016 8:14 pm

Post by PlayingWithWires »

Looking forward to the new documentation. One question in the meantime - are there any plans to increase the total amount of memory that lib_SDRAM can address beyond 256M bits?


infiniteimprobability
XCore Legend
Posts: 1126
Joined: Thu May 27, 2010 10:08 am

Post by infiniteimprobability »

are there any plans to increase the total amount of memory that lib_SDRAM can address beyond 256M bits?
There is no intrinsic 256Mb limit in the code now; we just haven't implemented or tested beyond that. The previous version (coded for XS1) had a hard limit of 64Mb because of the way loop unrolling was done: it used a near addressing mode that only allowed a 6b offset (2^6 used twice = 128 long words = 256 16b words), limiting the column address to 8b, which tops out at 64Mb for a 16b wide memory.
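
A rough way to check these capacity figures is to multiply out the address widths. The following is a minimal sketch in C; the function and parameter names are made up for illustration and are not part of lib_sdram, and the 12 row bit, 4 bank geometry for the 64Mb case is an assumption based on typical 4M x 16 parts.

```c
#include <stdint.h>

/* Total capacity in bits of an SDRAM with the given geometry.
 * All names here are illustrative, not lib_sdram identifiers. */
uint64_t sdram_capacity_bits(unsigned row_bits, unsigned col_bits,
                             unsigned bank_bits, unsigned data_width)
{
    /* Number of addressable locations times the width of each location. */
    return ((uint64_t)1 << (row_bits + col_bits + bank_bits)) * data_width;
}
```

Under those assumptions, 12 row bits, 8 column bits and 2 bank bits at 16b wide gives 64Mb; 13, 9 and 2 gives 256Mb; and going to 10 column bits gives 512Mb.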

The write inner loop for XS2 is able to use a loop counter as it has the right addressing modes, whereas the read still needs to use the unrolled loop with branch calculation, although it is no longer limited to 128 loops due to using dual-issue. It subtracts the number of words to read from the number of row words, multiplies by 2 and then branches to the right point in the unrolled loop (see below).

Code:

	ldw w_temp, 	sp[1+WRITE_STACK_WORDS]	//row_words
	sub w_temp, w_temp, r3
	add w_temp, w_temp, w_temp

	ldw w_ras, 		r1[2]	//ras
	ldw w_we, 		r1[3]	//we
	ldw w_dq_ah, 	r1[0]	//dq_ah

.align 4
	ldc w_two, 2
	bru w_temp

	//(blocking)
	#include "sdram_block_write_body.inc"
The unrolled loop for XS2 is currently 256 iterations (32b each -> 9b column address for 16b memory) so I think you can just make the unrolled loop contained in sdram_block_read_body_xs2.inc twice as long to support 10b column addresses. It will use another 2KB of program memory though.

This is not tested, but I don't foresee any issues here.
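
The arithmetic behind the branch calculation and the loop length can be sketched in plain C. This is only a model of the sub/add pair in the assembly above and of the "256 iterations for a 9b column address" figure; the function names are hypothetical and do not exist in lib_sdram.

```c
/* Number of 32-bit words in one row, given the column address width
 * and the memory data width in bits. */
unsigned row_words_for(unsigned col_addr_bits, unsigned data_width_bits)
{
    return ((1u << col_addr_bits) * data_width_bits) / 32u;
}

/* Mirrors the assembly: subtract the requested word count from the
 * row length, then double it, to get the value handed to bru. */
unsigned unrolled_branch_offset(unsigned row_words, unsigned words)
{
    return (row_words - words) * 2u;
}
```

For a 16b memory, 9 column bits gives 256 row words and 10 column bits gives 512, which is why the unrolled loop would need to be twice as long. A full-row transfer yields an offset of 0 (enter at the top of the unrolled loop), while an 8-word transfer on a 256-word row yields 496.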

Post by PlayingWithWires »

Hi InfiniteImprobability,

We have started testing with lib_sdram running on our hardware without success so far. As previously posted, our hardware uses port 16B on tile 1 for the ADQ bus and ports 1A-1D for the control bus. The SDRAM part number is a Micron MT48LC16M16A2P-6A:G (167MHz, CL = 3, tAC = 5.4ns max, tOH = 3ns min).

During testing, we tried several clock dividers (4, 5, & 6), CAS values (2, 3, & 4), #define N values (0, 1, & 2) in io.S, and set_pad_delay(dq_ah, n) values (0, 1, & 2) in server.xc, with no observable difference in behavior.

Scoping the SDRAM chip at the package with a spring ground instead of a ground clip wire, the power supply rails are clean, as well as SD_CLK, WE#, CAS#, RAS# and DQM signals. The ADQ bus signals look clean as well, but only a few can be scoped at a time without a logic analyzer; that also makes it difficult to know if signals are in the right place (I suspect it's quite a task even with an analyzer!).

Our test code was adapted from the github example, and set_pad_delay(dq_ah, 0) was used along with #define N (1). Based on the timing spreadsheet, these values provided plenty of margin with a 50MHz clock:

Code:

#include <xs1.h>
#include <platform.h>
#include <stdio.h>
#include <stdlib.h>
#include "sdram.h"

#define SERVER_TILE 1
on tile[SERVER_TILE] : out buffered port:32   sdram_dq_ah = XS1_PORT_16B;
on tile[SERVER_TILE] : out buffered port:32   sdram_cas   = XS1_PORT_1D;
on tile[SERVER_TILE] : out buffered port:32   sdram_ras   = XS1_PORT_1C;
on tile[SERVER_TILE] : out buffered port:8    sdram_we    = XS1_PORT_1B;
on tile[SERVER_TILE] : out port               sdram_clk   = XS1_PORT_1A;
on tile[SERVER_TILE] : clock                  sdram_cb    = XS1_CLKBLK_1;

#define ADDR_START 0x0

void application(streaming chanend c_server) {
#define BUF_WORDS (8)
  unsigned read_buffer[BUF_WORDS];
  unsigned write_buffer[BUF_WORDS];
  unsigned * movable read_buffer_pointer = read_buffer;
  unsigned * movable write_buffer_pointer = write_buffer;

  s_sdram_state sdram_state;
  sdram_init_state(c_server, sdram_state);

  printf("Start SDRAM Demo\n");

  //Fill the memory initially with known pattern
  for(unsigned i=0;i<BUF_WORDS;i++){
    write_buffer_pointer[i] = 0xdeadbeef;
  }
  sdram_write(c_server, sdram_state, ADDR_START, BUF_WORDS, move(write_buffer_pointer));
  sdram_complete(c_server, sdram_state, write_buffer_pointer);

  // Read back starting from the same address
  sdram_read (c_server, sdram_state, ADDR_START, BUF_WORDS, move(read_buffer_pointer));
  sdram_complete(c_server, sdram_state, read_buffer_pointer);

  for(unsigned i=0;i<BUF_WORDS;i++){
    if(read_buffer_pointer[i] != write_buffer_pointer[i])
      printf("    Failure at word %u:   Value written = %08x    Value read = %08x \n", i, write_buffer_pointer[i], read_buffer_pointer[i]);
    else
      printf("    Success at word %u:   Value written = %08x    Value read = %08x \n", i, write_buffer_pointer[i], read_buffer_pointer[i]);
  }
  printf("SDRAM demo complete.\n");

}

int main() {
  streaming chan c_sdram[1];
  par {
      on tile[SERVER_TILE]:sdram_server(c_sdram, 1,
              sdram_dq_ah,
              sdram_cas,
              sdram_ras,
              sdram_we,
              sdram_clk,
              sdram_cb,
              3, 256, 16, 9, 13, 2, 64, 8192, 5); //IS45S16160D 256Mb option (Note: we're using Micron MT48LC16M16A2P-6A:G)

    on tile[SERVER_TILE]: application(c_sdram[0]);

  }
  return 0;
}
If ADDR_START is set to 0x0, the following output is produced:
Start SDRAM Demo
Failure at word 0: Value written = deadbeef Value read = 00000000
Failure at word 1: Value written = deadbeef Value read = 00000000
Failure at word 2: Value written = deadbeef Value read = 00000000
Failure at word 3: Value written = deadbeef Value read = 00000000
Failure at word 4: Value written = deadbeef Value read = 00000000
Failure at word 5: Value written = deadbeef Value read = 00000000
Failure at word 6: Value written = deadbeef Value read = 00000000
Failure at word 7: Value written = deadbeef Value read = 00000000
SDRAM demo complete.
Setting ADDR_START to 0xe produces garbage with every word the same value:
Start SDRAM Demo
Failure at word 0: Value written = deadbeef Value read = 001c001c
Failure at word 1: Value written = deadbeef Value read = 001c001c
Failure at word 2: Value written = deadbeef Value read = 001c001c
Failure at word 3: Value written = deadbeef Value read = 001c001c
Failure at word 4: Value written = deadbeef Value read = 001c001c
Failure at word 5: Value written = deadbeef Value read = 001c001c
Failure at word 6: Value written = deadbeef Value read = 001c001c
Failure at word 7: Value written = deadbeef Value read = 001c001c
SDRAM demo complete.
When it's set to 0xf:
Start SDRAM Demo
Failure at word 0: Value written = deadbeef Value read = 001e001e
Failure at word 1: Value written = deadbeef Value read = 001e001e
Failure at word 2: Value written = deadbeef Value read = 001e001e
Failure at word 3: Value written = deadbeef Value read = 001e001e
Failure at word 4: Value written = deadbeef Value read = 001e001e
Failure at word 5: Value written = deadbeef Value read = 001e001e
Failure at word 6: Value written = deadbeef Value read = 001e001e
Failure at word 7: Value written = deadbeef Value read = 001e001e
SDRAM demo complete.
Any idea what might be going on?

- Tony

Post by infiniteimprobability »

Hi Tony,
Sorry to hear it’s not working out of the box. I felt pretty confident about this one..

OK well let’s fault find systematically..

Sounds like you have done some reasonable sanity checking for the HW. Signal-wise we should be able to slow everything down to take normal timing out of the equation, as long as either N=1 or set_port_sample_delay() is set. For example:

Code:

divider = 25 (10MHz)
#define N (0)
set_port_sample_delay(1) not commented out.
set_pad_delay(dq_ah, 0);
This will give bags of margin (22ns setup, 63ns hold). I tried this on my board here and all worked as expected.
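
For reference, the divider values used in this thread line up with SD_CLK = 500MHz / (2 * divider). A small C sketch of that relationship follows; the 500MHz reference and the factor of 2 are assumptions inferred from the figures quoted here (divider 25 giving 10MHz), and the function names are illustrative only.

```c
/* SD_CLK frequency in MHz for a given clock divider, ASSUMING a
 * 500 MHz reference and a divide ratio of 2 * divider. */
double sd_clk_mhz(unsigned divider)
{
    const double ref_mhz = 500.0; /* assumption, inferred from the thread */
    return ref_mhz / (2.0 * divider);
}

/* Corresponding clock period in nanoseconds. */
double sd_clk_period_ns(unsigned divider)
{
    return 1000.0 / sd_clk_mhz(divider);
}
```

With these assumptions a divider of 25 gives 10MHz (100ns period), and dividers of 4, 5 and 6 give 62.5MHz, 50MHz and roughly 41.7MHz respectively.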

At this speed, the only signal that can mess things up is the clock (reflections or induced noise from other signal causing false edges); every other signal will have lots of time to settle before being sampled. So if you have probed the SDRAM end of this signal with a high frequency probe and it looks OK, then we can likely rule out hardware.

So assuming for now that HW is all good, it comes down to configuring/driving the SDRAM state machine. This is good because we have complete control over this in software.

From your tests, it looks like the xCORE is sampling back the second part of the address (the column, which is provided with the read command) left shifted by one:

0x0 << 0x1 = 0x0
0xe << 0x1 = 0x1c
0xf << 0x1 = 0x1e

If you look at the function addr_to_col() in server.xc, you can see the left shift by one.. so that fits. It looks like the SDRAM isn’t responding to the read at all and the bus has held the last value.

It is reading the same value twice because the port for SDRAM is set up as a 16b port, which is read twice in 2 consecutive cycles with a single 32b read.
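
The two effects described above (column address shifted left by one, then the same 16b bus value captured twice in one 32b read) can be modelled in a few lines of C. This is a sketch of the symptom only, not code from server.xc; the function name is hypothetical, and the shift is modelled simply as `addr << 1`.

```c
#include <stdint.h>

/* Model of what a 32-bit port read returns if the SDRAM never drives
 * the bus and the 16-bit bus simply holds the last column address. */
uint32_t stale_bus_readback(uint32_t addr)
{
    /* Column value left on the bus: the address shifted left by one. */
    uint16_t col = (uint16_t)(addr << 1);
    /* The 16-bit port is sampled on two consecutive cycles, so the
     * same value lands in both halves of the 32-bit word. */
    return ((uint32_t)col << 16) | col;
}
```

Feeding in 0x0, 0xe and 0xf gives 0x00000000, 0x001c001c and 0x001e001e, matching the three readback values reported earlier.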

So I think it’s worth looking at the setup to make sure that the xcore controller has been configured properly. Let me check the arguments you are passing into the sdram server..

Post by infiniteimprobability »

Arguments passed to server all look fine.

Hmm - I guess the next step will be to look at SDRAM initialisation for that part. We have tested ISSI devices at this end but not Micron.

Post by PlayingWithWires »

Hi Infiniteimprobability,

The 10MHz parameters are producing the same results seen here at 50MHz. I also re-checked SD_CLK rise and fall with an analog scope that has more bandwidth than my DSO, and it still looks good. It's still low-pass filtered by the analog scope's BW, but no reflections are observable in the edges, and ringing is not seen either.

Thanks for demystifying the results. At least the values we're getting back are explainable!

A quick review of the datasheets for the ISSI (p. 21) and Micron (p.42) parts reveals differences in their initialization requirements. Micron wants the CKE pin held low when the clock is first applied, a command issued, and then the CKE pin can be pulled high. The process looks similar after that. Similar to the XMOS reference design, our board simply has the CKE pin pulled high through a 10k resistor. It's tempting to think that the right value cap on the CKE could be used in conjunction with the 10k pullup to meet Micron's requirements, but a 1-bit port may be needed.

Unfortunately, the IS45S16160D part is not readily available in the U.S. The more recent IS45S16160G has some availability in the 143MHz speed (a few hundred on the shelf). It's not nearly as available as the Micron part (tens of thousands currently), but if it can get us through the proof-of-concept and prototype phases, it might get us out of the SDRAM rabbit hole for long enough to address the Micron issues. Luckily, the TSOPII pinout is the same for both parts.

We will eventually need to move to an SDRAM that can be sustained in production whether it's Micron or another vendor. In the meantime, do you think the available IS45S16160G-7 is close enough to the IS45S16160D-7 you are using to be worth a try?

Thanks!

- Tony

Post by infiniteimprobability »

Hi Tony - yes saw the issue with CKE in the Micron datasheet, which is not addressable without a HW change. However, I also found a few potential issues with the init code. I have made some updates here:

https://github.com/ed-xmos/lib_sdram/tr ... ion_update

Can you try the version of server.xc on that branch and see if it helps?

Post by PlayingWithWires »

Hi Infiniteimprobability - Now we're getting somewhere!! Initial results with ADDR_START set to 0x0:
Start SDRAM Demo
Failure at word 0: Value written = deadbeef Value read = beefdead
Failure at word 1: Value written = deadbeef Value read = beefdead
Failure at word 2: Value written = deadbeef Value read = beefdead
Failure at word 3: Value written = deadbeef Value read = beefdead
Failure at word 4: Value written = deadbeef Value read = beefdead
Failure at word 5: Value written = deadbeef Value read = beefdead
Failure at word 6: Value written = deadbeef Value read = beefdead
Failure at word 7: Value written = deadbeef Value read = a214dead
SDRAM demo complete.
Changing the code that stuffs the write buffer from:

Code:

    write_buffer_pointer[i] = 0xdeadbeef;
To:

Code:

    write_buffer_pointer[i] = 0xdeadbeef + i;
...but keeping ADDR_START at 0x0 produced the result:
Start SDRAM Demo
Failure at word 0: Value written = deadbeef Value read = bef0dead
Failure at word 1: Value written = deadbef0 Value read = bef1dead
Failure at word 2: Value written = deadbef1 Value read = bef2dead
Failure at word 3: Value written = deadbef2 Value read = bef3dead
Failure at word 4: Value written = deadbef3 Value read = bef4dead
Failure at word 5: Value written = deadbef4 Value read = bef5dead
Failure at word 6: Value written = deadbef5 Value read = bef6dead
Failure at word 7: Value written = deadbef6 Value read = a214dead
SDRAM demo complete.
Now changing ADDR_START to 0xe produces a slightly different result at word 7:
Failure at word 7: Value written = deadbef6 Value read = beefdead
...and changing ADDR_START to 0xf also produces a slightly different result at word 7:
Failure at word 7: Value written = deadbef6 Value read = 8042dead
The results above were produced with SD_CLK at 10MHz, #define N (1), set_pad_delay(dq_ah, 0), and set_port_sample_delay(dq_ah) not commented out. It appears to be reading 24 bits ahead. As a test, cas_latency was changed from 3 to 2 with no luck, so it was set back to 3. Trying different SD_CLK frequencies, 10, 41.7, or 62.5MHz produced similar failures, but 50MHz produces:
Start SDRAM Demo
Success at word 0: Value written = deadbeef Value read = deadbeef
Success at word 1: Value written = deadbef0 Value read = deadbef0
Success at word 2: Value written = deadbef1 Value read = deadbef1
Success at word 3: Value written = deadbef2 Value read = deadbef2
Success at word 4: Value written = deadbef3 Value read = deadbef3
Success at word 5: Value written = deadbef4 Value read = deadbef4
Success at word 6: Value written = deadbef5 Value read = deadbef5
Success at word 7: Value written = deadbef6 Value read = deadbef6
SDRAM demo complete.
It looks like the Micron part is now initialized properly (without a fly wire to a 1-bit port to toggle CKE) and we are on the edge of getting things to work! Changing ADDR_START from 0xf to 0xe however produces:
Start SDRAM Demo
Failure at word 0: Value written = deadbeef Value read = debdbeef
Failure at word 1: Value written = deadbef0 Value read = deadbeef
Failure at word 2: Value written = deadbef1 Value read = deadbef8
Failure at word 3: Value written = deadbef2 Value read = deadbef1
Failure at word 4: Value written = deadbef3 Value read = defdbef6
Failure at word 5: Value written = deadbef4 Value read = deadbef3
Failure at word 6: Value written = deadbef5 Value read = febdbef4
Failure at word 7: Value written = deadbef6 Value read = feafbef5
SDRAM demo complete.
Changing ADDR_START from 0xe to 0x0 gives us success again (maybe the starting address has to be a 16 bit multiple?). Let me know what to try next and I'll report back.

- Tony

Post by infiniteimprobability »

This is good progress. I am not sure what the Micron chip does with CKE internally, but perhaps it really is just clock enable rather than some reset logic. Either way though, you should consider connecting up CKE to a port for future revs to be within spec.

I think things are pretty much working. 50MHz looks good. Also, the following:
The results above were produced with SD_CLK at 10MHz, #define N (1), set_pad_delay(dq_ah, 0), and set_port_sample_delay(dq_ah) not commented out. It appears to be reading 24 bits ahead.
...is also as expected. At 10MHz N needs to be zero, otherwise you are reading 16bits too late. Change it back to 0 for 10MHz and I would expect everything to be fine.
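
The swapped halfwords in the 10MHz results are consistent with sampling one 16b transfer late. A small C model of that follows, assuming the low halfword of each 32b word goes out on the bus first; the function name and the `trailing` parameter (whatever happens to be on the bus after the last written halfword) are illustrative and not part of lib_sdram.

```c
#include <stddef.h>
#include <stdint.h>

/* Simulate reading a burst of 32-bit words over a 16-bit bus when the
 * sampling starts one 16-bit transfer too late. */
void read_one_halfword_late(const uint32_t *written, uint32_t *read,
                            size_t n, uint16_t trailing)
{
    for (size_t i = 0; i < n; i++) {
        /* First sample: the high half of word i (its low half was missed). */
        uint32_t low = written[i] >> 16;
        /* Second sample: the low half of word i+1, or bus leftovers. */
        uint32_t high = (i + 1 < n) ? (written[i + 1] & 0xffffu) : trailing;
        read[i] = (high << 16) | low;
    }
}
```

With the written pattern 0xdeadbeef + i, this model gives 0xbef0dead for word 0 and 0xbef6dead for word 6, as in the reported output; word 7 picks up whatever follows the burst, which is why its upper half varied between runs.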

I am not sure what the effect of ADDR_START is yet; it may be that this rather simple example is not doing what you'd expect with non-aligned addresses (I need to have a quick look). I have been using /lib_sdram/tests/sdram_testbench to check operation (which does a more thorough test including refresh etc.). This test is only visible on the github repo rather than the downloaded zip.

Thanks for your patience by the way. It's been a bit of a rocky road but hopefully we're nearly there now.

Post by PlayingWithWires »

Hi Infiniteimprobability,

Yes - very good progress. Thanks again for your fast and expert help! It looks like we have enough to get things rolling with the Micron chips for now.

We might put an IS45S16160G-7 (the closest we can get to the IS45S16160D-7) on a proof-of-concept board for testing since there is some ambiguity about Micron's use of the CKE pin. We can revisit using the more available Micron part when we get closer to production, wire up the CKE pin to a port on the next PCB rev if needed, and maybe select a different 16-bit port. We'll also look into the sdram_testbench code to do stress testing.

When testing with the ISSI part, I assume we need to use the released version of server.xc and not the one modified for Micron - correct? Also, can BUF_WORDS be made larger?

- Tony