4 Gbit/s correlation using the SFXC software correlator

M. M. Kettenis

August 29, 2012

Introduction

The European VLBI Network (EVN) is in the process of upgrading its maximum recording rate from 1 Gbit/s to 4 Gbit/s and has the ambition to also support e-VLBI at these data rates. To show the considerable progress made towards this goal, a demonstration of 4 Gbit/s recording and real-time e-VLBI correlation of three EVN stations was organized on June 20, 2012. Many aspects of that demonstration are described in the report titled "Control Systems and Scheduling Mechanisms for 4Gbps recording equipment" by Harro Verkouter, which is part of NEXPReS deliverable D5.6. The document you are reading now gives a more detailed description of the correlation aspect of this demonstration.

Software Correlation

The DBBC equipment that is capable of doing VLBI at data rates beyond 1 Gbit/s can only do so using bandwidths of 32 MHz per subband. Since the Mark4 hardware correlator is limited to bandwidths of 16 MHz and lower, the only viable option for correlating 4 Gbit/s at JIVE is the SFXC software correlator. For reasons explained later, we cannot simply run the SFXC software correlator in real time, feeding it 4 Gbit/s data streams.

At JIVE we use a Linux cluster built out of commodity hardware for running the SFXC software correlator. This cluster is dedicated to running the software correlator. It consists of 32 compute nodes with 2 quad-core CPUs each (256 cores in total), and a head node with a single quad-core CPU. The cluster nodes are interconnected using a 40 Gbit/s QDR InfiniBand network and a 1 Gbit/s Ethernet network (the latter also connects the head node). The first 16 compute nodes use 2.27 GHz Intel Xeon E5520 CPUs. The other 16 nodes use the newer, and somewhat faster, 2.4 GHz Intel Xeon E5620 CPUs. All cluster nodes have 24 GB of RAM.
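
As a quick sanity check on the target data rate: with 32 subbands of 32 MHz each (the setup used in the demo, split into four groups of eight as described below), Nyquist sampling and 2-bit quantization give an aggregate rate of 4096 Mbit/s. The short Python sketch below spells out this arithmetic; it is only an illustration of where the 4 Gbit/s figure comes from, not correlator code.

    # Back-of-the-envelope check of the 4 Gbit/s recording rate.
    # Numbers taken from the observing setup described in this report:
    # 32 subbands of 32 MHz each, sampled at 2 bits per sample.

    SUBBANDS = 32            # total number of subbands (4 groups of 8)
    BANDWIDTH_HZ = 32e6      # bandwidth per subband
    BITS_PER_SAMPLE = 2      # quantization used for the recording
    NYQUIST = 2              # real sampling at twice the bandwidth

    rate_per_subband = BANDWIDTH_HZ * NYQUIST * BITS_PER_SAMPLE   # 128 Mbit/s
    total_rate = SUBBANDS * rate_per_subband                      # 4096 Mbit/s
    per_stream = 8 * rate_per_subband                              # 1024 Mbit/s

    print(f"per subband         : {rate_per_subband/1e6:.0f} Mbit/s")
    print(f"total               : {total_rate/1e9:.3f} Gbit/s")
    print(f"per 8-subband stream: {per_stream/1e6:.0f} Mbit/s")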

Corner Turning

The VLBI data formats that are in wide use today (Mark4, VLBA, Mark5B) store bits from multiple subbands in a single computer word. In order to correlate these subbands, the bits need to be disentangled, an operation commonly referred to as "corner turning".

Since the DBBC/Fila10G was developed to be compatible with existing VLBI systems, its data needs to be corner turned as well (although firmware is being developed for the Fila10G board that produces a separate data stream for each subband, which would not need to be corner turned).

In the SFXC software correlator, the corner turning step is part of the "input node" MPI process, which is responsible for decoding the data format, extracting the subbands and doing the coarse-grained delay tracking. In the current software correlator setup at JIVE the input node processes run on Mark5 units. The Mark5A and Mark5B units at JIVE are equipped with two 3.2 GHz Intel "Netburst" Xeon processors. By today's standards these processors are fairly slow; with the current SFXC codebase this results in a maximum throughput of around 800 Mbit/s per unit. To do real-time processing in this setup we would therefore need to split the data from each telescope into several streams of under 800 Mbit/s. For an efficient implementation of the splitter this would imply splitting the 4 Gbit/s data stream into 8 chunks of 512 Mbit/s. Processing data from three telescopes would then require 24 Mark5 units to receive all data streams. Since we only had 22 operational units available for the demo, this was not an option.

Another option was to move the input nodes, and therefore the corner turning step, from the Mark5 units to cluster nodes. Our newest cluster nodes are equipped with two quad-core 2.4 GHz Intel "Westmere" Xeon processors. Initial tests showed that these nodes are powerful enough to handle two incoming data streams of 1 Gbit/s each. To get that much data into the cluster, 8 nodes were equipped with 10 Gbit/s Ethernet interfaces (recycling CX4 equipment that had become redundant in a previous network upgrade). An obvious drawback of this approach is that it reduces the number of cluster nodes available to do the actual correlation. However, tests showed that 6 cluster nodes are sufficient to correlate 3 telescopes at 1 Gbit/s, so a total of 8 nodes should be enough to do data decoding, extraction and correlation for three telescopes at that data rate. Therefore the 32 nodes available in the cluster at JIVE should be enough to correlate 3 stations at 4 Gbit/s.
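
To make the corner turning operation more concrete, the Python sketch below de-interleaves 2-bit samples that have been packed subband by subband into 32-bit words. It illustrates the general idea only; the actual bit layouts of Mark4, VLBA, Mark5B and VDIF differ from this toy packing, and this is not the SFXC implementation.

    import numpy as np

    # Illustrative corner turning: 16 subbands x 2-bit samples packed into
    # one 32-bit word per time sample.

    N_SUBBANDS = 16
    BITS = 2
    MASK = (1 << BITS) - 1

    def corner_turn(words: np.ndarray) -> np.ndarray:
        """Turn packed 32-bit words (one per time sample) into an array of
        shape (N_SUBBANDS, n_samples) holding the individual 2-bit samples."""
        out = np.empty((N_SUBBANDS, words.size), dtype=np.uint8)
        for sb in range(N_SUBBANDS):
            out[sb] = (words >> (sb * BITS)) & MASK
        return out

    # Round-trip test: pack random samples, then unpack them again.
    rng = np.random.default_rng(0)
    samples = rng.integers(0, 4, size=(N_SUBBANDS, 1000), dtype=np.uint32)
    packed = np.zeros(1000, dtype=np.uint32)
    for sb in range(N_SUBBANDS):
        packed |= samples[sb] << (sb * BITS)

    assert np.array_equal(corner_turn(packed), samples.astype(np.uint8))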

Correlation Setup

Splitting the 4 Gbit/s data stream into 1 Gbit/s chunks is done by the Jive5AB corner turning platform described in NEXPReS deliverable D5.6. This corner turning platform runs on the so-called "harrobox" and produces VDIF data that is partly corner turned, such that we get four 1 Gbit/s data streams containing subbands 1-8, 9-16, 17-24 and 25-32. These chunks can be processed fully independently of each other.

The easiest way to achieve this is by processing each chunk using a separate software correlator instance. Therefore we divided the cluster into 4 sub-clusters of 8 nodes each. Each sub-cluster runs its own instance of the SFXC software correlator, processing a 1 Gbit/s stream from each telescope. This setup is illustrated in Figure 1.

In order to support this setup, a script was written that configures and starts the software correlator and the Jive5AB software on a sub-cluster using a fixed network setup. Separate VEX files were created for each sub-cluster (derived from the original VEX file), with a description of the data format that matches the 1 Gbit/s data stream seen by that sub-cluster.
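
The startup script itself is not reproduced here, but the Python sketch below gives a rough idea of how four independent correlator instances can be driven from one place. The host names, file names and the exact mpirun/SFXC command line are illustrative assumptions, not the actual JIVE configuration or startup script.

    import subprocess

    # Hypothetical description of the four sub-clusters: which subband group
    # they correlate and which cluster nodes they consist of.  All names and
    # file layouts here are made up for illustration.
    SUBCLUSTERS = {
        "a": {"subbands": "01-08", "nodes": [f"node-a{i}" for i in range(1, 9)]},
        "b": {"subbands": "09-16", "nodes": [f"node-b{i}" for i in range(1, 9)]},
        "c": {"subbands": "17-24", "nodes": [f"node-c{i}" for i in range(1, 9)]},
        "d": {"subbands": "25-32", "nodes": [f"node-d{i}" for i in range(1, 9)]},
    }

    def start_subcluster(name: str, cfg: dict) -> subprocess.Popen:
        """Launch one correlator instance for one 1 Gbit/s subband group."""
        machinefile = f"machines-{name}.txt"
        with open(machinefile, "w") as f:
            f.write("\n".join(cfg["nodes"]) + "\n")
        cmd = [
            "mpirun", "-machinefile", machinefile,
            "sfxc",                            # assumed invocation, for illustration
            f"demo-{name}.ctrl",               # control file for this sub-cluster
            f"demo-sb{cfg['subbands']}.vex",   # VEX file describing these 8 subbands
        ]
        return subprocess.Popen(cmd)

    processes = [start_subcluster(name, cfg) for name, cfg in SUBCLUSTERS.items()]
    for p in processes:
        p.wait()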

[Figure 1 diagram: harrobox Ef, harrobox On and harrobox Ys each send 4 × 1 Gbit/s data streams (SB 1-8, 9-16, 17-24, 25-32) over the Internet to the sub-clusters sfxc-a/e, sfxc-b/f, sfxc-c/g and sfxc-d/h; each sub-cluster receives 3 × 1 Gbit/s data streams (one subband group, e.g. SB 1-8 or SB 25-32, for Ef, On and Ys).]
Figure 1: Correlation setup used for the NEXPReS 4 Gbit/s demo.

The relevant $THREADS section (as proposed in the upcoming VEX 2 standard, see https://safe.nrao.edu/wiki/bin/view/VLBA/Vex2doc) for the first sub-cluster (correlating subbands 1-8) looks as follows:

$THREADS;
  def VDIF-1024-8-2;
    format = VDIF : : 1024;
    thread = 0 : 1 : 1 : 1024 : 8 : 2 : : : 5000;
    channel = &CH01 : 0 : 0;
    channel = &CH02 : 0 : 1;
    channel = &CH03 : 0 : 2;
    channel = &CH04 : 0 : 3;
    channel = &CH05 : 0 : 4;
    channel = &CH06 : 0 : 5;
    channel = &CH07 : 0 : 6;
    channel = &CH08 : 0 : 7;
  enddef;

After starting the script, the software correlator will start correlating at the start time of the schedule, discarding any data sent before the schedule starts. If the start time of the schedule is already in the past, correlation will start immediately. Correlation will happen regardless of whether any data is received from the telescopes. This means that the data transfer can be controlled independently from running the correlator. It also means the correlation process is resilient against data loss from one or more stations.
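
The VDIF frames arriving from the harroboxes are partly self-describing: every frame header carries the thread ID, the number of channels, the bits per sample and the frame length. The Python sketch below decodes those fields following the published VDIF header layout; it is an illustration of the format only, not the decoder used inside SFXC, and the 5032-byte frame length in the example (5000-byte payload plus 32-byte header) is an assumption based on the 5000 in the thread line above.

    import struct

    # Minimal VDIF frame-header decoder (non-legacy, 32-byte header),
    # following the published VDIF specification.

    def parse_vdif_header(frame: bytes) -> dict:
        """Return the header fields relevant for this setup."""
        w0, w1, w2, w3 = struct.unpack("<4I", frame[:16])
        return {
            "invalid":         bool(w0 >> 31),
            "seconds":         w0 & 0x3FFFFFFF,           # seconds from ref epoch
            "frame_number":    w1 & 0x00FFFFFF,           # frame within the second
            "ref_epoch":       (w1 >> 24) & 0x3F,
            "frame_length":    (w2 & 0x00FFFFFF) * 8,     # bytes, incl. header
            "num_channels":    1 << ((w2 >> 24) & 0x1F),  # stored as log2
            "bits_per_sample": ((w3 >> 26) & 0x1F) + 1,   # stored as value - 1
            "thread_id":       (w3 >> 16) & 0x3FF,
            "station_id":      w3 & 0xFFFF,
        }

    # Example: a synthetic header for an 8-channel, 2-bit thread.  The frame
    # length (5032 bytes = assumed 5000-byte payload + 32-byte header) is
    # illustrative, not taken from the demo configuration.
    w2 = (3 << 24) | (5032 // 8)        # log2(8 channels), length in 8-byte units
    w3 = ((2 - 1) << 26) | (0 << 16)    # 2 bits/sample, thread 0
    header = struct.pack("<4I", 0, 0, w2, w3) + bytes(16)
    print(parse_vdif_header(header))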

The initial tests were done using data in Mark5B format, with 16 x 16 MHz subbands. It turned out that correlating VDIF data with 8 x 32 MHz subbands needed slightly more computing power. Fortunately, using the idle cores on the cluster node that only handles input from a single station provides enough additional computing power to meet the real-time constraint. Various buffers in the correlator and the Jive5AB software needed to be adjusted (upwards) to handle the high data rate and the fluctuations in processing speed inherent to running the cluster at near-maximum capacity.

The correlation setup was successfully used during the NEXPReS 4 Gbit/s demonstration on June 20, 2012. Although no fringes were detected (almost certainly due to DBBC configuration issues), the correlator performed as expected, keeping up with the incoming data in real time and producing believable autocorrelations.

Conclusions

The NEXPReS 4 Gbit/s demonstration showed that distributed correlation is a viable strategy for correlating high-bandwidth real-time e-VLBI experiments with a software correlator. However, it also showed that a substantial amount of computing resources (essentially the full 32-node cluster at JIVE) is needed to correlate just three stations at 4 Gbit/s. To correlate the full core of the e-EVN at 4 Gbit/s, significantly more resources will be needed (although a couple of years of Moore's law will do the job as well).

Glossary

DBBC  Digital BaseBand Converter

e-EVN  Subset of the European VLBI Network capable of doing real-time e-VLBI

e-VLBI  Electronic VLBI; Very Long Baseline Interferometry where the baseband data is transferred electronically over computer networks

CX4  Copper Exchange 4; first-generation copper-based 10 Gbit/s network link, using four 3.125 Gbit/s lanes

harrobox  Corner turning platform based on COTS hardware and Jive5AB software; see NEXPReS deliverable D5.6

MPI  Message Passing Interface; middleware for developing portable and scalable large-scale parallel applications

QDR  Quad Data Rate; 40 Gbit/s InfiniBand

VLBI  Very Long Baseline Interferometry; radio astronomical observational technique using many radio telescopes to simulate a larger one by combining the individual signals

VLBA  Very Long Baseline Array; a dedicated VLBI array in North America

VDIF  VLBI Data Interchange Format; next-generation data format for digitally sampled voltage signals from VLBI antennas
