Most Efficient Way to Generate Varying Duty Cycle and PWM

lilltroll
XCore Expert
Posts: 956
Joined: Fri Dec 11, 2009 3:53 am
Location: Sweden, Eskilstuna

Post by lilltroll »

PS. It's very interesting, kster!
But with a feed-forward solution without feedback, I'm not sure that the benefit of running at the maximum frequency outweighs the probably growing problem of the error generated on each edge by the transistors, since they are not perfect and do not have infinite bandwidth.

JAES fellow Malcolm has done a lot of work during his life:
Check out http://www.essex.ac.uk/csee/research/au ... tions.html if anyone has missed it

Hmm, I should read this one myself.
http://www.essex.ac.uk/csee/research/au ... lifier.pdf

DS.


Probably not the most confused programmer anymore on the XCORE forum.

Post by lilltroll »

Woody wrote:You may find that you get better results by using timestamping. Each port has a timer which is incremented every time it receives a clock pulse. Timestamping allows you to specify the exact time (clock number) that you want a signal transition to occur on.

Code: Select all

int portTime;

pwmPort <: 0 @ portTime;  // Find out the current port time
portTime += 20;
pwmPort @ portTime <: 1;
portTime += 60;
pwmPort @ portTime <: 0;
portTime += 47;
pwmPort @ portTime <: 1;
portTime += 33;
pwmPort @ portTime <: 0;
As you can see from this example, you can first issue an output and read the time that it occurred at, then set a time in the future when you want the next output to occur and schedule that.

See section 4.3 "Performing I/O on Specific Clock Edges" of Programming XC on XMOS Devices for more details: http://www.xmos.com/support/documentation
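The timestamping scheme quoted above can be modelled off-target. The following is only a rough Python sketch, not XC: the function name `schedule_edges` and the start time 1000 are made up for illustration, and the 16-bit wrap of the port counter is an assumption about the device rather than something stated in the post.

```python
PORT_TIMER_BITS = 16  # assumption: the port counter is 16 bits wide


def schedule_edges(start_time, steps):
    """Return (port_time, level) pairs for a list of (delay_ticks, level) steps.

    start_time stands for the timestamp read back by
    `pwmPort <: 0 @ portTime` in the quoted XC code.
    """
    mask = (1 << PORT_TIMER_BITS) - 1
    port_time = start_time
    schedule = []
    for delay_ticks, level in steps:
        # Each edge is scheduled relative to the previous one;
        # the counter wraps around when it overflows.
        port_time = (port_time + delay_ticks) & mask
        schedule.append((port_time, level))
    return schedule


# The four transitions from the quoted example: +20 -> 1, +60 -> 0, +47 -> 1, +33 -> 0
steps = [(20, 1), (60, 0), (47, 1), (33, 0)]
edges = schedule_edges(1000, steps)
```

The point of the sketch is that only the differences between timestamps matter: the high time of the first pulse is the 60 ticks between the first and second scheduled edge, whatever the initial timestamp was.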
What is the frequency of the clock when it's clocked internally? Does it use the 400 MHz clock?
If so, what about a 500 MHz L device?

As an example, how does (t + 8) correspond to 12 ns in this XMOS module?

Code: Select all

   // read with address.
   p_sram_addr <: Adrs @ t;
   // read data with 12 ns access time.
   p_sram_data @ (t + 8) :> Result;
Woody
XCore Addict
Posts: 165
Joined: Wed Feb 10, 2010 2:32 pm

Post by Woody »

lilltroll wrote:What is the frequency of the clock when it's clocked internally? Does it use the 400 MHz clock? If so, what about a 500 MHz L device?
If ports are clocked internally they use the reference clock. This is 100 MHz*. Note that the reference clock is also used for the timers.

*There are occasionally times when you may want the ref. clock to differ from 100 MHz. This can be achieved via the .xn file (see the 'XS1-? Clock Frequency Control' documents at http://www.xmos.com/support/documentation for details).

Note that changing the ref. clock may have knock-on effects, because code blocks may assume a 100 MHz timer.
lilltroll wrote:As an example, how does (t + 8) correspond to 12 ns in this XMOS module?

Code: Select all

   // read with address.
   p_sram_addr <: Adrs @ t;
   // read data with 12 ns access time.
   p_sram_data @ (t + 8) :> Result;
For +8 to correspond to a 12ns delay a ref clock with a period of 1.5ns would be required (666MHz). This is out of range for an XS1 device, so I suspect that the comment is either invalid or out of context. Where did you get the code?

Post by lilltroll »

Woody wrote:
Code: Select all

   p_sram_addr <: Adrs @ t;
   // read data with 12 ns access time.
   p_sram_data @ (t + 8) :> Result;
For +8 to correspond to a 12ns delay a ref clock with a period of 1.5ns would be required (666MHz). This is out of range for an XS1 device, so I suspect that the comment is either invalid or out of context. Where did you get the code?

It's from the SRAM module: http://www.xmos.com/applications/memory/sram-controller
I wanted to see if I could get closer to 50 Mreads/s, i.e. 50 Mbytes/s.

Also check this thread: http://www.xcore.com/forum/viewtopic.php?f=15&t=512

Post by Woody »

That comment is wrong. It is really an 80 ns access (8 cycles of the 100 MHz clock). Thanks for pointing it out; I'll raise a bug against the comments in that code.
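The arithmetic behind the two numbers in this exchange (80 ns, and the clock of roughly 667 MHz that a 12 ns delay would need) can be checked with a few lines of Python. This is just a sketch; the helper names are invented here, and the 100 MHz default is the reference clock figure given earlier in the thread.

```python
def ticks_to_ns(ticks, clock_mhz=100.0):
    """Duration of `ticks` port-clock cycles in nanoseconds."""
    return ticks * 1000.0 / clock_mhz


def clock_mhz_for(ticks, delay_ns):
    """Clock frequency (MHz) at which `ticks` cycles last `delay_ns`."""
    return ticks * 1000.0 / delay_ns


access_ns = ticks_to_ns(8)             # (t + 8) at the 100 MHz reference clock
needed_mhz = clock_mhz_for(8, 12.0)    # clock that would make 8 ticks last 12 ns
```

Here `access_ns` comes out as 80 ns and `needed_mhz` as about 667 MHz, matching the figures quoted in the posts above.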
Folknology
XCore Legend
Posts: 1274
Joined: Thu Dec 10, 2009 10:20 pm

Post by Folknology »

@infiniteimprobability

I am curious why, in your chart, using the buffered serialised port is more expensive thread-wise (3 times more, to be specific). Can you explain that?


Also, I'm interested in seeing what happens when you increase the PWM frequency up to 100s of kHz: how does that change the balance of the table? I assume some of these methods drop out when you hit certain frequencies and multiple instances?

regards
Al
infiniteimprobability
XCore Legend
Posts: 1126
Joined: Thu May 27, 2010 10:08 am

Post by infiniteimprobability »

I am curious why, in your chart, using the buffered serialised port is more expensive thread-wise (3 times more, to be specific). Can you explain that?
I'll have a go. The buffered serialised method basically requires you to shovel in the next 32b of data before the buffer empties. If you are running at 12b (2^12 = 4096 steps) and 20 kHz, then this period will be:

(1/20E3) / 4096 * 32 = 390 ns. Running the thread at 50 MHz (20 ns instruction time) means you've got 19 cycles to work out whether to transmit all zeros (0x00000000), all ones (0xFFFFFFFF) or something in between from a lookup table of the 31 transition patterns.
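The cycle budget worked out above is easy to recompute for other settings. A small Python sketch follows; the function names are made up for illustration and the 32-bit word size is taken from the post.

```python
def word_period_ns(pwm_hz, resolution_bits, word_bits=32):
    """Time available to supply the next port word, in nanoseconds."""
    steps_per_period = 1 << resolution_bits
    bit_time_ns = 1e9 / pwm_hz / steps_per_period
    return bit_time_ns * word_bits


def cycles_per_word(pwm_hz, resolution_bits, thread_mhz, word_bits=32):
    """Whole thread instruction cycles that fit into one word period."""
    instruction_ns = 1000.0 / thread_mhz
    return int(word_period_ns(pwm_hz, resolution_bits, word_bits) // instruction_ns)


period = word_period_ns(20e3, 12)        # 12-bit PWM at 20 kHz
budget = cycles_per_word(20e3, 12, 50)   # thread running at 50 MHz
```

With the numbers from the post this gives a word period of about 390.6 ns and a budget of 19 cycles.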

You've also got to take care of updating the duty register although shared memory will do you favours here.

So as a rough guess (haven't done the calcs), you can probably only get away with 2 outputs per thread running at full pelt. Relaxing the PWM frequency or resolution would change things..
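To make the per-word work concrete, here is a hedged Python model of the words one output needs for a single PWM period. It assumes that bit 0 of each word is shifted out first and that the pulse starts at the beginning of the period; `pwm_words` is a hypothetical helper, not code from any XMOS library.

```python
WORD_BITS = 32  # width of one buffered port word, as in the post


def pwm_words(duty, resolution_bits=12):
    """Split one PWM period with `duty` high steps into 32-bit port words.

    Assumption: bit 0 of each word is output first and the high part of
    the pulse sits at the start of the period.
    """
    steps = 1 << resolution_bits
    if not 0 <= duty <= steps:
        raise ValueError("duty must be between 0 and 2**resolution_bits")
    words = []
    for start in range(0, steps, WORD_BITS):
        # Number of high bits that fall inside this word (0..32).
        high = min(max(duty - start, 0), WORD_BITS)
        words.append((1 << high) - 1)
    return words


words = pwm_words(40)  # 40 high steps out of 4096
```

In this model all words except at most one are all ones or all zeros; only the word containing the falling edge needs a lookup, which is why the decision per word is cheap but still has to be made for every output inside the cycle budget.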

I've seen code that manages 6 outputs per thread using this method, but it does it differently by having a server thread which just shovels the data and the client thread (which updates the PWM duty) does the hard work. Nice idea....
Also, I'm interested in seeing what happens when you increase the PWM frequency up to 100s of kHz: how does that change the balance of the table? I assume some of these methods drop out when you hit certain frequencies and multiple instances?
Well, it all comes down to how many cycles you have. Doubling the frequency would require a drop in resolution of one bit, so 160 kHz should be doable at 9b, 320 kHz at 8b and so forth...
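The trade-off described above amounts to keeping the output bit rate (PWM frequency times 2^resolution) no higher than in the 12b, 20 kHz case. A minimal Python sketch under that assumption, with a made-up function name:

```python
def max_resolution_bits(pwm_hz, ref_hz=20e3, ref_bits=12):
    """Largest resolution whose bit rate does not exceed the reference case.

    The reference case (12 bits at 20 kHz) is the one discussed in the
    post; the bit rate is pwm_hz * 2**bits.
    """
    ref_rate = ref_hz * (1 << ref_bits)
    bits = ref_bits
    while bits > 0 and pwm_hz * (1 << bits) > ref_rate:
        bits -= 1  # each halving of the step count halves the bit rate
    return bits


table = {f: max_resolution_bits(f) for f in (20e3, 40e3, 80e3, 160e3, 320e3)}
```

For 160 kHz and 320 kHz this reproduces the 9b and 8b figures given in the post.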

What requires such a high PWM frequency?

Post by Folknology »

Interesting stuff and definitely worth more investigation, particularly where to put which pieces in the control and drive parts of the closed loop and still get good thread value.
I've seen code that manages 6 outputs per thread using this method, but it does it differently by having a server thread which just shovels the data and the client thread (which updates the PWM duty) does the hard work. Nice idea....
I think this idea makes a lot of sense, with the server being the motor driver and the controller handling more complex things like the feedback transforms, which it needs the threads for, etc.

What requires such a high PWM frequency?
It's more of a mathematical and performance curiosity than a practical application from a motor POV, as most high-speed units top out at around 50 kHz. Obviously getting above the audio range is beneficial, but going higher will likely degrade rather than benefit performance. I suppose it could be useful for other applications such as piezoelectric motors/driving, etc.

regards
Al
Last edited by Folknology on Wed Jun 23, 2010 5:44 pm, edited 1 time in total.
kster59
XCore Addict
Posts: 162
Joined: Thu Dec 31, 2009 8:51 am

Post by kster59 »

Those 32-bit buffered ports are supposed to be double buffered.

At 100 MHz output, you only need to write once every 32 operations.

Supposing I'm at 400 MHz, I should be able to do 32*4 operations between writes.

So I can do:

while (1) {
    porta <: mynumber;
    portb <: mynumber;
    portc <: mynumber;
    portd <: mynumber;
}

and have time to spare for a bunch of calculations, since it should only block when the buffer is full (which happens only once in 32 operations).

I currently have PWM code running in 1 thread that can update at least 8 motors with 8-bit PWM at 100 MHz while computing the next value to write in the same thread on the fly.

Or am I missing something?
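One way to sanity-check the estimate above is to make the thread's instruction rate an explicit input. The Python sketch below is only a back-of-the-envelope model with invented function names: `thread_mips=400` reproduces the 32*4 figure in the post, while 50 is the thread rate quoted earlier in the thread. Which value applies on a given device depends on the core clock and the number of active threads, which this model does not capture.

```python
def instructions_per_word(port_clock_mhz, thread_mips, word_bits=32):
    """Instructions a thread can issue while one port word is shifted out."""
    return word_bits * thread_mips // port_clock_mhz


def instructions_per_port(port_clock_mhz, thread_mips, n_ports, word_bits=32):
    """Budget per output when one thread services n_ports ports in turn."""
    return instructions_per_word(port_clock_mhz, thread_mips, word_bits) // n_ports


optimistic = instructions_per_word(100, 400)     # the 32*4 case from the post
shared = instructions_per_port(100, 400, 8)      # eight outputs in one thread
slow_thread = instructions_per_word(100, 50)     # 50 MHz thread rate
```

Under the optimistic assumption there are 128 instructions per word, or 16 per output with eight outputs; at a 50 MHz thread rate the whole budget per word shrinks to 16 instructions.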

Post by lilltroll »

Has no one applied dither to PDM?

I took a very quick look at the "Class D Audio Power Amplifier".

Shouldn't a first-order sigma-delta look something like this:

Code: Select all

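void speaker(streaming chanend c_in, out buffered port:1 p, clock clk) {
    const unsigned short delay = 35;  // port clocks between output bits
    unsigned short time = 0;          // port time of the next output
    int x, y;
    unsigned dither = 1;              // non-zero seed for the CRC-based dither generator
    int qe = 0;                       // accumulated quantisation error

    set_clock_ref(clk);
    configure_port_clock_output(p, clk);
    configure_out_port_no_ready(p, clk, 0);

    start_clock(clk);
    while (1) {
        c_in :> x;
        for (int i = 0; i < 64; i++) {  // fs=44.6 kHz Use maximum oversampling
            // 1-bit quantiser: compare the input with the accumulated error
            if (x >= qe) {
                y = 65536;
                time += delay;
                p @ time <: 1;
            } else {
                y = -65536;
                time += delay;
                p @ time <: 0;
            }
            crc32(dither, x, 0xEB31D82E);  // Magic poly
            qe = qe + y - x + (dither >> 25);
        }
    }
}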
The for loop is just a very ugly form of oversampling for testing, but there is a reason why I use ±2^16 and not ±2^15, due to the nature of PDM. The shift value 25 controls the amount of applied dither. Even the XC-1 speaker will sound nicer with dither. (The use of x in the CRC32 is overkill.)

PS. I am testing a 5th-order noise shaper, since the demo code uses 8 times oversampling. DS
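The first-order loop in the post above can be tried out off-target. The following Python model is only a sketch of the same update rule (`qe = qe + y - x + dither`), not the XC code itself: `random.Random` stands in for the crc32-based dither, `dither_bits=7` corresponds to the `>> 25` shift of a 32-bit value, and the function names are invented for illustration.

```python
import random

FULL_SCALE = 65536  # the +/-2^16 feedback level used in the post


def sigma_delta_bits(samples, oversample=64, dither_bits=0, seed=1):
    """First-order sigma-delta modulator; returns the 1-bit output stream."""
    rng = random.Random(seed)  # stand-in for the crc32 dither source
    qe = 0                     # accumulated quantisation error
    bits = []
    for x in samples:
        for _ in range(oversample):
            if x >= qe:
                y = FULL_SCALE
                bits.append(1)
            else:
                y = -FULL_SCALE
                bits.append(0)
            d = rng.getrandbits(dither_bits) if dither_bits else 0
            qe = qe + y - x + d
    return bits


def mean_level(bits):
    """Average output level in the range -1..+1."""
    return sum(2 * b - 1 for b in bits) / len(bits)


plain = sigma_delta_bits([16384] * 64)                    # constant input at 1/4 full scale
dithered = sigma_delta_bits([16384] * 64, dither_bits=7)  # same input with 7-bit dither
```

For a constant input at a quarter of full scale, the average of the undithered bit stream settles at 0.25. In this model the dither term is never negative, so the dithered stream carries a small DC offset of at most about 0.002 of full scale; whether that matters in practice is a separate question.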