1 Description

1.1 Setup

- CYUSB3012-BZXC (see appendix §2.1 for code extract):
  o Firmware based on the ‘slavefifosync’ project, adapted for the 256k RAM device
  o 32-bit GPIF
  o FLAGA is dedicated to thread 0 (EPIN), DMA ready
  o FLAGB is dedicated to thread 0 (EPIN), DMA watermark = 7

- ARIA5 FPGA (see appendix §2.2 for code extract):
  o Writing at PCLK = 100 MHz
  o FIFO mechanisms to buffer the data sent to the CYUSB3012

- PC:
  o Win7 64-bit
  o Intel Core i3-2100 CPU 3.10 GHz, 4 GB RAM
  o Cypress C++ streamer host application
  o LECROY Advisor T3 USB analyzer + USB Protocol suite

1.2 FPGA architecture

[Figure: FPGA block diagram. Data producer (can write at 100 MHz or at lower speed) -> FIFO 64x32 bits (write side: wr, 32-bit data, full; read side: rd, 32-bit data, empty) -> USB state machine -> CYUSB3012 (wr, FLAGA/B, 32-bit data).]

The FPGA is the data producer and can fill a FIFO at 100 MHz or at a lower speed, depending on the hardware DIP switch on the board (WR speed = 100 MHz / (DIP switch value + 1)). The USB state machine writes as soon as data is available in the FIFO (empty status) and the microcontroller is ready to receive data (FLAGA/B).
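As a quick check of the WR speed formula, the sketch below computes the FIFO write frequency for a given DIP switch value. The function name is hypothetical and is not part of the FPGA or firmware code; integer division is assumed, so the results are truncated to whole Hz.

```c
#include <stdint.h>

/* Base FIFO write clock from the setup section: PCLK = 100 MHz. */
#define PCLK_HZ 100000000u

/* WR speed = 100 MHz / (DIP switch value + 1), in Hz.
 * Hypothetical helper for illustration only (integer division). */
uint32_t fifo_wr_speed_hz(uint8_t dip_switch)
{
    return PCLK_HZ / ((uint32_t)dip_switch + 1u);
}
```

With this formula, DIP switch values 0, 2, 4 and 255 give about 100 MHz, 33 MHz, 20 MHz and 390 kHz, which are the four cases examined in this report.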

1.3 Correct behavior

When the FPGA is configured for 100 MHz FIFO write (DIP SWITCH = 0), no problem is encountered: 357000 KB/s is achieved. This is lower than the value measured with the CYUSB3KIT and its CYUSB3014 (512k) because of the smaller DMA buffers in the CYUSB3012 (256k). On the scope we see a series of 32 bursts followed by a short dead time, which corresponds to the 32 packets per Xfer.

The ‘x_mTestPoint[9..0]’ outputs in the VHDL code correspond exactly to the D[9..0] scope signals, which are also named on the left of the capture.

1.4 Wrong behavior

1.4.1 33MHz Write

When the FPGA is configured at reduced speed, e.g. 33 MHz FIFO write with DIP SWITCH = 2, the transfer gets stuck after a few Xfers (36 in the example below):

[Scope capture, 33 MHz FIFO write: STUCK. Annotations: FLAGA & B; 40 ns between last WRN and FLAGA; 6 more WRN; FIFO WR period.]

Yellow traces: zoom on the end of an Xfer, when FLAGB is asserted low. Depending on the number of words already written before FLAGB is asserted, we write the remaining words so as to have 7+1 words in total (due to DMA watermark = 7). For example, if 2 words were written in the pipe before FLAGB is asserted, we have to write 6 more words before FLAGA is asserted low (the WR counter in the pipe is shown on the cnt[2..0] traces).

=> The state machine is working fine and its signals are coherent with the expected behavior.
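The end-of-Xfer rule described for the yellow traces (7+1 writes in total once FLAGB goes low) can be modeled in a few lines. This is only a sketch of the rule as stated in the text, not an extract of the VHDL state machine; the function name and the clamping to zero are assumptions made for illustration.

```c
#include <stdint.h>

/* FLAGB watermark for thread 0, from the setup section. */
#define DMA_WATERMARK 7u

/* Number of writes still to issue after FLAGB is asserted low, given the
 * number of writes already counted in the pipe (cnt[2..0] on the scope).
 * Hypothetical helper: total writes = watermark + 1, clamped at zero. */
uint32_t writes_after_flagb(uint32_t already_written)
{
    uint32_t total = DMA_WATERMARK + 1u;
    return (already_written >= total) ? 0u : total - already_written;
}
```

With 2 writes already in the pipe, the function returns 6, matching the "6 more WRN" annotation on the capture.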

1.4.2 20MHz Write

Another example with DIP SWITCH = 4, i.e. 20 MHz FIFO write.

[Scope capture, 20 MHz FIFO write: STUCK after 19 Xfer. Annotations: FLAGA & B; 40 ns between last WRN and FLAGA; 6 more WRN; FIFO WR period.]

1.4.3 390kHz write

Another example with DIP SWITCH = 255, i.e. 390 kHz FIFO write.

[Scope capture, 390 kHz FIFO write. Annotations: 40 ns between last WRN and FLAGA; FIFO Write; suspended when; FIFO not empty; Last write (cnt=7); FIFO WR period.]

The data is written at the correct speed, and the last write after FLAGB is asserted low is done at wr_cnt = 7.

1.4.4 USB traces with 390kHz FIFO write

The LECROY USB ADVISOR T3 is used, with the trigger set on the NRDY condition: 11 NRDY are found during the C++ streamer transfer, which finally gets stuck.

1st NRDY found: packet 1006220. Behavior OK thanks to the host retry.

[USB trace annotations: HOST REQUEST RETRY 7.6 ms after NRDY; DEVICE ANSWER after 338 ns.]

The other 10 NRDY found are similar.

Last NRDY: packet 1023905. Behavior KO: the host does not retry immediately, only after the 1500 ms timeout set in the C++ streamer:

[USB trace annotations: HOST REQUEST RETRY ~1500 ms after NRDY due to timeout; no direct retry for SEQN=4.]

Same trace with LINK commands displayed. Device and host are communicating correctly:

[USB trace annotations: LINK GOOD for host and LINK CREDIT available; LINK GOOD for device.]

LUP and LDN are also active, signifying that both host and device are in the U0 state. The LINK commands are sent and received continuously up to the next host ACK, which occurs 1500 ms later.

=> CONCLUSION: The host does not immediately retry the EPIN Xfer after the device NRDY. The problem is on the host driver side.

2 Appendix

2.1 USB µC SlaveFifoSync code

2.1.1 Extract of fx3_256k.ld linker file

The linker file is the one required for the 256k device. It has been slightly modified in order to optimize the DMA buffer sizes (larger for EPIN: 1024*12 = 12288 bytes; smaller for EPOUT: 64 bytes).

/*
   Descriptor area    Base: 0x40000000 Size: 12KB
   Code area          Base: 0x40003000 Size: 136KB
   Data area          Base: 0x40025000 Size: 16KB
   Driver heap        Base: 0x40029000 Size: 28KB
   Buffer area        Base: 0x40030000 Size: 64KB
   2-stage boot area  Base: 0x40038000 REMOVED
   TOP                Base: 0x40040000 = 256K device memory
*/
MEMORY
{
    I-TCM   : ORIGIN = 0x100      LENGTH = 0x3F00
    SYS_MEM : ORIGIN = 0x40003000 LENGTH = 0x22000
    DATA    : ORIGIN = 0x40025000 LENGTH = 0x4000
}

2.1.2 Extract of cyfxtx.c

#define CY_U3P_MEM_HEAP_BASE (0x40029000)
#define CY_U3P_MEM_HEAP_SIZE (0x7000)
#define CY_U3P_SYS_MEM_TOP   (0x40040000)
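The region sizes quoted in the linker file comment can be cross-checked against the base addresses. The sketch below is an illustration only: the macro and function names other than the values copied from the extracts are hypothetical, and it assumes each region runs up to the base of the next one, with the 2-stage boot area removed.

```c
#include <stdint.h>

/* Base addresses copied from the fx3_256k.ld comment and cyfxtx.c. */
#define DESCRIPTOR_BASE 0x40000000u
#define CODE_BASE       0x40003000u
#define DATA_BASE       0x40025000u
#define HEAP_BASE       0x40029000u  /* CY_U3P_MEM_HEAP_BASE */
#define BUFFER_BASE     0x40030000u
#define SYS_MEM_TOP     0x40040000u  /* CY_U3P_SYS_MEM_TOP */

/* Size in KB of the region [base, next_base).
 * Hypothetical helper for illustration only. */
uint32_t region_kb(uint32_t base, uint32_t next_base)
{
    return (next_base - base) / 1024u;
}
```

Under these assumptions the regions come out at 12, 136, 16, 28 and 64 KB, which sum to the 256 KB of the device, and the 28 KB driver heap matches CY_U3P_MEM_HEAP_SIZE = 0x7000.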

2.1.3 Extract of cyfxslfifosync.h

#define CY_FX_SLFIFO_GPIF_16_32BIT_CONF_SELECT (1)
#define CY_FX_USB_FS_PACKET_SIZE_CONS (64)   /* full speed */
#define CY_FX_USB_HS_PACKET_SIZE_CONS (512)  /* high speed */
#define CY_FX_USB_HS_BURST_LEN_CONS   (1)    /* high speed */

/* Slave FIFO P_2_U channel buffer count FROM GPIF to USB*/
#define CY_FX_USB_BURST_LEN_CONS         (12)
#define CY_FX_USB_PACKET_SIZE_CONS       (1024)
#define CY_FX_DMA_BUF_SIZE_P_2_U         (12)
#define CY_FX_SLFIFO_DMA_BUF_COUNT_P_2_U (4)
/* Slave FIFO U_2_P channel buffer count FROM USB to GPIF*/
#define CY_FX_USB_BURST_LEN_PROD         (1)
#define CY_FX_USB_PACKET_SIZE_PROD       (64)
#define CY_FX_DMA_BUF_SIZE_U_2_P         (1)
#define CY_FX_SLFIFO_DMA_BUF_COUNT_U_2_P (2)

#define CY_FX_SLFIFO_DMA_TX_SIZE     (0)      /* DMA transfer size is set to infinite */
#define CY_FX_SLFIFO_DMA_RX_SIZE     (0)      /* DMA transfer size is set to infinite */
#define CY_FX_SLFIFO_THREAD_STACK    (0x0400) /* Slave FIFO app. thread stack size */
#define CY_FX_SLFIFO_THREAD_PRIORITY (8)      /* Slave FIFO application thread priority */

#define CY_FX_EP_PRODUCER 0x01 /* EP 1 OUT, USB producer */
#define CY_FX_EP_CONSUMER 0x81 /* EP 1 IN, USB consumer */
/* USB Socket 1 is producer */
#define CY_FX_PRODUCER_USB_SOCKET CY_U3P_UIB_SOCKET_PROD_1
/* USB Socket 1 is consumer */
#define CY_FX_CONSUMER_USB_SOCKET CY_U3P_UIB_SOCKET_CONS_1

/* Used with FX3 Silicon. */
/* P-port Socket 0 is producer to USB (GPIF addr=0) & read from GPIF*/
#define CY_FX_PRODUCER_PPORT_SOCKET CY_U3P_PIB_SOCKET_0
/* P-port Socket 3 is consumer of USB (GPIF addr=3) & write to GPIF*/
#define CY_FX_CONSUMER_PPORT_SOCKET CY_U3P_PIB_SOCKET_3
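To see how these constants translate into DMA buffer memory, the sketch below computes the bytes allocated per channel as (packets per buffer) x (packet size) x (buffer count), which is how dmaCfg.size and dmaCfg.count are combined in the firmware. The helper name is hypothetical, the constants are local copies of the header values, and the comparison with the 64 KB buffer area assumes that no other allocation competes for that area, which may not hold in the real firmware.

```c
#include <stdint.h>

/* Local copies of the values from the cyfxslfifosync.h extract. */
#define CY_FX_USB_PACKET_SIZE_CONS        1024u
#define CY_FX_DMA_BUF_SIZE_P_2_U          12u
#define CY_FX_SLFIFO_DMA_BUF_COUNT_P_2_U  4u
#define CY_FX_USB_PACKET_SIZE_PROD        64u
#define CY_FX_DMA_BUF_SIZE_U_2_P          1u
#define CY_FX_SLFIFO_DMA_BUF_COUNT_U_2_P  2u

/* Buffer area size from the linker file comment (64 KB). */
#define BUFFER_AREA_BYTES (64u * 1024u)

/* Total bytes used by one DMA channel.
 * Hypothetical helper for illustration only. */
uint32_t dma_channel_bytes(uint32_t packets_per_buf,
                           uint32_t packet_size,
                           uint32_t buf_count)
{
    return packets_per_buf * packet_size * buf_count;
}
```

With these values the P2U channel (EPIN) uses 4 buffers of 12288 bytes, i.e. 49152 bytes, and the U2P channel (EPOUT) uses 2 buffers of 64 bytes, i.e. 128 bytes, for a total of 49280 bytes, below the 65536 bytes of the buffer area.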

2.1.4 Extract of cyfxslfifosync.c

Only DMA automatic transfer is configured.

CyFxSlFifoApplnStart (void)
{
    ...
    /* in superspeed mode */
    size = CY_FX_USB_PACKET_SIZE_CONS;
    burstLength = CY_FX_USB_BURST_LEN_CONS;

    /* Producer endpoint configuration (USB to GPIF) */
    epCfg.pcktSize = CY_FX_USB_PACKET_SIZE_PROD;
    epCfg.burstLen = CY_FX_USB_BURST_LEN_PROD;
    apiRetStatus = CyU3PSetEpConfig(CY_FX_EP_PRODUCER, &epCfg);

    /* Consumer endpoint configuration (GPIF to USB) */
    epCfg.burstLen = burstLength;
    epCfg.pcktSize = size;
    apiRetStatus = CyU3PSetEpConfig(CY_FX_EP_CONSUMER, &epCfg);

    /* Create a DMA AUTO channel for U2P transfer (USB to GPIF).
       DMA size is set based on the USB speed. */
    dmaCfg.size = CY_FX_DMA_BUF_SIZE_U_2_P * CY_FX_USB_PACKET_SIZE_PROD;
    dmaCfg.count = CY_FX_SLFIFO_DMA_BUF_COUNT_U_2_P;
    dmaCfg.prodSckId = CY_FX_PRODUCER_USB_SOCKET;
    dmaCfg.consSckId = CY_FX_CONSUMER_PPORT_SOCKET;
    dmaCfg.dmaMode = CY_U3P_DMA_MODE_BYTE;
    dmaCfg.notification = 0;
    dmaCfg.cb = NULL;
    dmaCfg.prodHeader = 0;
    dmaCfg.prodFooter = 0;
    dmaCfg.consHeader = 0;
    dmaCfg.prodAvailCount = 0;
    apiRetStatus = CyU3PDmaChannelCreate (&glChHandleSlFifoUtoP, CY_U3P_DMA_TYPE_AUTO, &dmaCfg);

    /* Create a DMA AUTO channel for P2U transfer. (GPIF to USB)*/
    dmaCfg.size = CY_FX_DMA_BUF_SIZE_P_2_U * size;
    dmaCfg.count = CY_FX_SLFIFO_DMA_BUF_COUNT_P_2_U;
    dmaCfg.prodSckId = CY_FX_PRODUCER_PPORT_SOCKET;
    dmaCfg.consSckId = CY_FX_CONSUMER_USB_SOCKET;
    dmaCfg.cb = NULL;
    apiRetStatus = CyU3PDmaChannelCreate (&glChHandleSlFifoPtoU, CY_U3P_DMA_TYPE_AUTO, &dmaCfg);
    ...
}

CyFxSlFifoApplnInit
{
    ...
    // GPIF designer has defined the GPIF configuration :
    // - SLAVE SYNC, EXT positive clock, Little Endian, 32-bits, 2 ADDR lines, OE enabled
    // - Flag A is declared on GPIO21 associated to thread 0 (uc FIFO ADDR = 0 i.e. GPIF to USB) as DMA ready, active low
    // - Flag B is declared on GPIO22 associated to thread 0 (uc FIFO ADDR = 0 i.e. GPIF to USB) as DMA watermark, active low
    // - Flag C is declared on GPIO23 associated to thread 3 (uc FIFO ADDR = 3 i.e. USB to GPIF) as DMA ready, active low
    // - Flag D is declared on GPIO25 associated to thread 3 (uc FIFO ADDR = 3 i.e. USB to GPIF) as DMA watermark, active low

    // set the watermark for thread 0 (uc FIFO ADDR = 0 i.e. GPIF to USB) to value = 7 :
    // Number of data words written after the clock edge at which the partial FLAG is sampled low
    //   = watermark x (32/bus width) - 4 => Word Nb = 7x32/32-4 = 3
    CyU3PGpifSocketConfigure (0, CY_U3P_PIB_SOCKET_0, 7, CyFalse, 1);

    // set the watermark for thread 3 (uc FIFO ADDR = 3 i.e. USB to GPIF) to value = 6 :
    // Number of data words available for reading (while keeping SLOE# asserted) after the clock edge
    // at which the partial FLAG is sampled asserted
    //   = watermark x (32/bus width) - 1 => Word Nb = 6x32/32-1 = 5
    // Number of cycles for which SLRD# may be kept asserted after the clock edge
    // at which the partial FLAG is sampled asserted
    //   = watermark x (32/bus width) - 3 => SLRD# Nb = 3
    CyU3PGpifSocketConfigure (3, CY_U3P_PIB_SOCKET_3, 6, CyFalse, 1);
    ...
}
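The three watermark formulas quoted in the firmware comments can be written out as small helpers to check the values 3, 5 and 3 given there. This is a sketch of the arithmetic only: the function names are hypothetical, the formulas are taken as stated in the comments, and the bus width is assumed to be 8, 16 or 32 bits.

```c
#include <stdint.h>

/* watermark x (32 / bus width), the common term of the three formulas.
 * bus_width_bits is assumed to be 8, 16 or 32. */
static int32_t watermark_term(uint32_t watermark, uint32_t bus_width_bits)
{
    return (int32_t)(watermark * (32u / bus_width_bits));
}

/* Write side: data words that may still be written after the clock edge
 * at which the partial flag is sampled low. */
int32_t wr_words_after_partial_flag(uint32_t watermark, uint32_t bus_width_bits)
{
    return watermark_term(watermark, bus_width_bits) - 4;
}

/* Read side: data words available for reading (SLOE# kept asserted) after
 * the clock edge at which the partial flag is sampled asserted. */
int32_t rd_words_after_partial_flag(uint32_t watermark, uint32_t bus_width_bits)
{
    return watermark_term(watermark, bus_width_bits) - 1;
}

/* Read side: cycles for which SLRD# may be kept asserted after the clock
 * edge at which the partial flag is sampled asserted. */
int32_t slrd_cycles_after_partial_flag(uint32_t watermark, uint32_t bus_width_bits)
{
    return watermark_term(watermark, bus_width_bits) - 3;
}
```

For the configuration used here (32-bit bus, watermark 7 on thread 0 and watermark 6 on thread 3), these helpers return 3 words for the write side, and 5 words and 3 SLRD# cycles for the read side.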

2.2 ALTERA ARIA5 FPGA code

2.2.1 Write state machine

p_wr_state_machine : process(sl_int_state, sl_int_start_state, sl_int_fifo_TX_empty,
                             sl_int_flag, sl_int_wr_cnt, sl_int_fifo_TX_rdusedw)
begin
    sl_next_state