SPI delay

mon2
XCore Legend
Posts: 1913
Joined: Thu Jun 10, 2010 11:43 am

Post by mon2 »

Please review the following thread, specifically the solution by xsamc (XMOS), to see if it is suitable for your project, and post your results:

https://www.xcore.com/viewtopic.php?t=4765


RedDave
Experienced Member
Posts: 77
Joined: Fri Oct 05, 2018 4:26 pm

Post by RedDave »

I'll take a good look at xsamc's code. Being undocumented and largely uncommented, it will take a little pulling apart.
Since it uses 32-bit buffers on the ports, I can only assume that it transfers multiples of 4 bytes for writes and reads.

I have a single 12 bit value to read, so that will require some modifications.
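As a rough illustration of the single-sample case, here is a minimal sketch in plain C of pulling a 12-bit value out of a 16-bit SPI frame. The helper names are hypothetical, and the right-aligned, MSB-first layout is an assumption; the ADC's datasheet decides the real bit positions.

```c
#include <stdint.h>

/* Hypothetical helper: extract a 12-bit sample from a 16-bit SPI frame.
 * Assumes the ADC right-aligns its result in the frame. */
static inline uint16_t adc12_from_frame(uint16_t frame)
{
    return (uint16_t)(frame & 0x0FFFu);
}

/* Hypothetical variant for a frame received as two 8-bit transfers,
 * most significant byte first. */
static inline uint16_t adc12_from_bytes(uint8_t hi, uint8_t lo)
{
    return adc12_from_frame((uint16_t)(((uint16_t)hi << 8) | lo));
}
```

With this layout, two 8-bit transfers are enough for one reading, so the 4-byte granularity of a 32-bit buffered transfer is not needed.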

It has arbitrary delays of 10us and 5us. Given that I need to perform my retrieval in less than 10us, that will need changing.

It would appear that both the XMOS library and this are designed for 'bulk' transfers at a 'low' rate. Where 'bulk' is multiple words and 'low' is every ms or so.

I guess I may have to reimplement for my application.

Unfortunately, the same lines are used for a more traditional SPI device at start-up before they are switched across to the fast ADC.

Post by mon2 »

Call me old-fashioned, but I would consider bit-banging my own implementation of this SPI interface. Years ago, during a review of the QSPI flash interface on XMOS, we managed to squeeze out higher throughput than the QSPI library reported at the time. We also found some bugs in the official library back then.

Post by RedDave »

One of the reasons for choosing XMOS was the support for various standard interfaces. It is massively suboptimal to have to reinvent this. My added value should be writing the pieces of code that are unique to my application.

I can write an SPI interface, but that really isn't a good use of my time!

I would not call it "old fashioned". I would call it inefficient.

I am not doing any of this for fun. I am designing a product.

[Edit: Apologies for the grumpiness. I have found a lot of the XMOS stuff has been almost fit for purpose. I have done a lot of stuff which should have been easy that hasn't been.]
CousinItt
Respected Member
Posts: 364
Joined: Wed May 31, 2017 6:55 pm

Post by CousinItt »

It's something to do with the lack of setup for the clock block you're using for the SPI master. Attached is a simulation snapshot from your code, with the SPI rate set at 2500 kbps, and the last parameter of spi_master() set to null (which forces use of the default clock). It's working as it should. I know it's not the real thing, but I've found the simulator to be pretty reliable.
[Attachment: SPI_capture.PNG]
The lib_spi document doesn't provide much guidance on configuring clock blocks, but I think they run more slowly than block 0 by default. See sections 50.3 and 50.5 in the xTIMEcomposer user guide for information on clock control functions.

As an aside, note that the spi_master and the test code are running in the same thread.

The asynchronous SPI master has to be the way to go for a high rate. See the AN00160 example code.

HTH

Post by CousinItt »

Apologies if I implied this was an error on your part - it seems to be a bug in the SPI master code for the case that a clock block is supplied.

Post by RedDave »

No. All useful info. I'll check tomorrow. Thanks.

Post by RedDave »

This morning's findings...

I have added diagnostic lines to the SPI library.

This has shown me that the delay is in syncing to the p_ss line.

Snippet from spi_sync.xc below.

This is where the delay is:

sync(p_ss[selected_device]);
This seems to be telling me that in end_transaction the chip select line is driven high (deasserted) with a timed output, and begin_transaction then waits on sync() for that output to complete. For some reason this takes 600us. Is this a badly configured clock on the port, or incorrect buffering?
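One possible reading of the ~600us figure, offered as a guess rather than a confirmed diagnosis: xCORE port counters are 16 bits wide, so a timed output whose target time has already gone by has to wait for the counter to wrap. The plain C sketch below works out the wrap time; the function name is hypothetical and the 100 MHz port clock is an assumption (it matches the usual value of XS1_TIMER_KHZ).

```c
#include <stdint.h>

/* Assumed reference clock in kHz (100 MHz). */
#define REF_CLOCK_KHZ 100000u
/* Width of an xCORE port counter in bits. */
#define PORT_COUNTER_BITS 16u

/* Hypothetical helper: time in whole microseconds for a port counter
 * to wrap once when the port is clocked at port_clock_khz. */
static inline uint32_t port_counter_wrap_us(uint32_t port_clock_khz)
{
    uint64_t ticks = 1ull << PORT_COUNTER_BITS;   /* 65536 ticks */
    return (uint32_t)((ticks * 1000u) / port_clock_khz);
}
```

At 100 MHz this gives about 655us, which is in the same range as the observed delay.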


            case accepting_new_transactions => i[int x].begin_transaction(unsigned device_index,
                    unsigned speed_in_khz, spi_mode_t mode):{
                p_dbg <: 0x1; //Diagnostic: entered begin_transaction
                //Get the mode bits from the spi_mode
                get_mode_bits(mode, cpol, cpha);

                //xassert(device_index < num_slaves);

                sync(sclk);
                p_dbg <: 0x2; //Diagnostic: sclk sync done
                //Wait for the chip deassert time if need be
                if(device_index == selected_device)
                    sync(p_ss[selected_device]);
                p_dbg <: 0x4; //Diagnostic: slave-select sync done
                //Set the expected clock idle state on the clock port
                partout(sclk, 1, cpol);
                sync(sclk);

                if(isnull(cb)){
                    //Calculate the clock period from the speed_in_khz
                    period = (XS1_TIMER_KHZ + speed_in_khz - 1)/speed_in_khz;//round up
                } else {
                    //Set the clock divider
                    stop_clock(cb);
                    unsigned d = (XS1_TIMER_KHZ + 4*speed_in_khz - 1)/(4*speed_in_khz);//FIXME this has to round up too
                    configure_clock_ref(cb, d);
                    start_clock(cb);
                }
                //Lock the begin transaction
                accepting_new_transactions = 0;

                //Do a slave select
                selected_device = device_index;
                p_ss[selected_device] <: 0;
                p_dbg <: 0x0; //Diagnostic: begin_transaction finished
                break;
            }
            case i[int x].end_transaction(unsigned ss_deassert_time):{
                //Unlock the transaction
                accepting_new_transactions = 1;

                unsigned time;
                partout(sclk, 1, cpol);
                sync(sclk);
                p_ss[selected_device] <: 1 @ time;

                //TODO should this be allowed? (0.6ms max without it)
                if(ss_deassert_time > 0xffff)
                   delay_ticks(ss_deassert_time&0xffff0000);

                time += ss_deassert_time;

                p_ss[selected_device] @ time <: 1;
                break;
            }
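The two rounding expressions in begin_transaction are ceiling divisions. Here is a small plain C sketch of the same arithmetic, so the values can be checked off-target. The function names are hypothetical, and XS1_TIMER_KHZ is assumed to be 100000 (a 100 MHz reference clock).

```c
#include <stdint.h>

#ifndef XS1_TIMER_KHZ
#define XS1_TIMER_KHZ 100000u   /* assumed: 100 MHz reference clock */
#endif

/* Integer division rounded up. */
static inline uint32_t ceil_div(uint32_t n, uint32_t d)
{
    return (n + d - 1u) / d;
}

/* Mirrors the "period" calculation used when no clock block is passed. */
static inline uint32_t spi_period_ticks(uint32_t speed_in_khz)
{
    return ceil_div(XS1_TIMER_KHZ, speed_in_khz);
}

/* Mirrors the divider "d" calculation used when a clock block is passed. */
static inline uint32_t spi_clock_divider(uint32_t speed_in_khz)
{
    return ceil_div(XS1_TIMER_KHZ, 4u * speed_in_khz);
}
```

For a requested 2500 kHz this gives a period of 40 ticks and a divider of 10. Rounding up means a rate that does not divide evenly, such as 3000 kHz, yields a slightly slower clock than requested rather than a faster one.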

Post by RedDave »

Changing the opening lines of spi_master to:

    for(unsigned i=0;i<num_slaves;i++)
    {
        configure_out_port(p_ss[i], cb, 0);
        p_ss[i] <: 1;
    }
i.e. configuring the chip select lines, makes the entire read take <7us. :)

CousinItt was correct in pointing out that the clocks/ports were to blame.

Thanks all for your advice and support.
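As a rough sanity check on the <7us figure, the time spent purely clocking bits can be estimated in plain C. The function name is hypothetical, and both inputs used in the note below are assumptions: a 16-bit frame (the thread does not state the frame length) and the 2500 kbps rate from the simulation mentioned earlier.

```c
#include <stdint.h>

/* Hypothetical helper: nanoseconds needed to clock `bits` bits
 * at an SPI rate of speed_in_khz (one bit per clock cycle). */
static inline uint32_t spi_transfer_ns(uint32_t bits, uint32_t speed_in_khz)
{
    return (uint32_t)(((uint64_t)bits * 1000000u) / speed_in_khz);
}
```

Under those assumptions, 16 bits at 2500 kHz take 6400 ns, so a total read time just under 7us would leave little overhead beyond the transfer itself.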

Post by RedDave »

By using this code, I can make it work while leaving the library code unchanged (always preferable!):

        on tile[0]:
        {
            configure_out_port(p_spi_ssn[0], clk0_spi, 0);
            spi_master(i_spi, 1, p_spi_sclk, p_spi_mosi, p_spi_miso, p_spi_ssn, 1, clk0_spi, p_dbg);
        }