Wednesday, November 17, 2010

Flash Programming Speed


Calculating the theoretical Flash programming speed using boundary-scan can
provide a good estimate for the time it will take and allows us to evaluate how
specific factors will affect programming speed. As a follow-up to Tips to Reduce
Flash Programming Time, we’ll look at how to calculate the theoretical
programming speed, and then use the formula to get a better idea of how
different factors may affect programming speed.

Calculating Theoretical Programming Time

For this discussion, let's assume a 16-bit wide S29GL512P Spansion device using
single-word program mode. While this device supports buffered programming with a
32-word buffer, due to the way boundary-scan accesses the Flash, the difference
between single-word and buffered programming times is often minimal. The time it
takes to scan the chain for each data write—which takes up the bulk of the
programming time—remains the same.

The following equation, pulled from our DFT Guidelines, is commonly used for
calculating the theoretical time that it takes to program Flash memory
using the boundary-scan interface. This equation assumes ideal conditions and
will show the best programming time that can be achieved.

Programming time = (#bits in chain) * (#scans/write) * (#writes/location) * (#locations) / (TCK frequency)

Where the parameters are defined as:

Table 1: Flash Programming Parameter Descriptions
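
The equation above can be expressed as a small function. The following is a minimal Python sketch, not a tool from this article: the function name and every input value are hypothetical, chosen only to show how the terms combine, and they are not the values from the tables in this post.

```python
def flash_program_time_s(bits_in_chain, scans_per_write, writes_per_location,
                         locations, tck_hz):
    """Ideal-case boundary-scan Flash programming time in seconds."""
    total_tck_cycles = (bits_in_chain * scans_per_write
                        * writes_per_location * locations)
    return total_tck_cycles / tck_hz


# Hypothetical inputs, for illustration only.
example_time_s = flash_program_time_s(
    bits_in_chain=500,        # total scan length of the chain, in bits
    scans_per_write=3,        # scans needed to perform one bus write
    writes_per_location=4,    # bus writes needed to program one Flash word
    locations=512 * 1024,     # 1 MB of data on a 16-bit wide device
    tck_hz=10_000_000,        # 10 MHz TCK
)
print(f"{example_time_s:.1f} s per MB")
```

Because every term is a simple multiplier or divisor, halving the chain length or doubling the TCK rate halves the estimate.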

Example Calculation

We’ll perform an example calculation using the following conditions:

Table 2: Example Flash Program Speed Calculation Data

The absolute minimum programming time that can be achieved when programming the
entire device is:

370 seconds for each megabyte of data isn’t great—that’s over 6 minutes!
Hopefully there’s something we can do to improve this speed. Let’s see how the
chain length and TCK rate will affect programming speed.

How TCK Rate and Chain Length Affect Programming Speed

Using our equation for calculating program time, we can explore how different
factors affect programming speed. First, vary the TCK rate between 1 MHz and 25
MHz. The data is presented in the graph below.

Figure 1: Flash Program Rate vs. TCK Frequency

Note that utilizing the External Write signal cuts the programming time
approximately in half—utilizing this feature can often provide dramatically
improved performance.
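
A TCK sweep like the one in Figure 1 can be approximated with a few lines of Python. This is a rough sketch under assumed inputs: the chain length, scans per write, and writes per location are hypothetical defaults, and it models only the ideal-case equation, not the External Write option.

```python
def flash_program_time_s(bits_in_chain, scans_per_write, writes_per_location,
                         locations, tck_hz):
    """Ideal-case programming time in seconds (programming-time equation)."""
    return (bits_in_chain * scans_per_write * writes_per_location
            * locations) / tck_hz


WORDS_PER_MB = 512 * 1024  # 16-bit locations in 1 MB of data


def sweep_tck(tck_mhz_values, bits_in_chain=500, scans_per_write=3,
              writes_per_location=4):
    """Return (TCK in MHz, seconds per MB, KB per second) for each TCK rate."""
    rows = []
    for tck_mhz in tck_mhz_values:
        seconds = flash_program_time_s(bits_in_chain, scans_per_write,
                                       writes_per_location, WORDS_PER_MB,
                                       tck_mhz * 1_000_000)
        rows.append((tck_mhz, seconds, 1024 / seconds))
    return rows


# Hypothetical sweep points between 1 MHz and 25 MHz.
tck_sweep = sweep_tck([1, 5, 10, 25])
for tck_mhz, seconds, kb_per_s in tck_sweep:
    print(f"{tck_mhz:>2} MHz: {seconds:8.1f} s/MB, {kb_per_s:6.2f} KB/s")
```

In this ideal model the programming rate scales linearly with TCK, so going from 1 MHz to 25 MHz is a 25x improvement until the device's own write time becomes the limit.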

Figure 2: Program vs. Scan Chain Length

Notice that as the scan chain length approaches 650 bits with external write and
300 bits without external write, the boundary-scan programming rate crosses the
programming rate based on typical write time. At this point, the boundary-scan
Flash programming performance will be similar to the programming rate described
by the device data sheet, and boundary-scan will be able to scan faster than the
device can program! To prevent data errors, the poll-for-done option will need
to be utilized to ensure that the previous program operation has completed
before the next scan begins.
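
The crossover point can be estimated by setting the scan time for one location equal to the device's typical write time. The sketch below is illustrative only: the 60 microsecond write time, 25 MHz TCK, and scan and write counts are assumed values, not figures from the data sheet or from Figure 2, and the helper names are hypothetical.

```python
def crossover_chain_bits(device_write_time_s, tck_hz, scans_per_write,
                         writes_per_location):
    """Chain length at which scanning one location takes as long as the
    device needs to program it."""
    return device_write_time_s * tck_hz / (scans_per_write * writes_per_location)


def needs_poll_for_done(bits_in_chain, device_write_time_s, tck_hz,
                        scans_per_write, writes_per_location):
    """True when the chain is short enough that scanning outruns the device."""
    limit = crossover_chain_bits(device_write_time_s, tck_hz,
                                 scans_per_write, writes_per_location)
    return bits_in_chain < limit


# Assumed values, for illustration only.
ASSUMED_WRITE_TIME_S = 60e-6   # typical word program time (assumption)
ASSUMED_TCK_HZ = 25_000_000    # 25 MHz TCK (assumption)

crossover = crossover_chain_bits(ASSUMED_WRITE_TIME_S, ASSUMED_TCK_HZ,
                                 scans_per_write=1, writes_per_location=4)
print(f"crossover near {crossover:.0f} bits")
```

Below the crossover length the device, not the scan chain, sets the pace, which is where polling for completion becomes necessary.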


Now that we have an intuitive understanding of how chain length and TCK rate
affect programming rate, we can put our knowledge into practice. Programming
rate too slow? See if the TCK rate can be adjusted for Flash programming, and
make sure that the chain is being optimized as much as possible.

Source: Calculate Flash Programming Speed

Tuesday, November 16, 2010

Adaptive FPGA Programming for SVF and STAPL/JAM


There are two common file standards for programming FPGAs: SVF and STAPL/JAM.
Most vendors can generate either type of file, but which should you choose?
First we should look at a significant difference between the two: STAPL allows
the use of conditional expressions, while SVF does not. In terms of FPGA & CPLD
programming, this means STAPL can provide adaptive programming, while SVF is
limited to delays.

How It Affects Programming Speed

In general, an adaptive programming algorithm will run faster than a
non-adaptive programming algorithm, since it can poll the device status and
determine exactly when programming has been completed and execution may resume.
Non-adaptive programming algorithms must wait a pre-defined time—usually the
device’s worst case program time—before proceeding.

The flow charts below show simplified examples of programming algorithms:

Figure 1: Programming flow charts

If the worst case delay far exceeds the typical and minimum delay, then the
adaptive programming will finish first. In some cases, increasing the clock rate
and shortening the delay on the non-adaptive file may allow it to surpass the
adaptive programming speed.
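
The difference between the two approaches can be illustrated with a simple timing model. This Python sketch does not execute SVF or STAPL; it only compares a fixed worst-case delay against polling at a fixed interval. All durations are hypothetical and expressed in integer microseconds to keep the arithmetic exact.

```python
def non_adaptive_time_us(num_operations, fixed_delay_us):
    """Every operation waits the same pre-defined delay."""
    return num_operations * fixed_delay_us


def adaptive_time_us(actual_durations_us, poll_interval_us):
    """Each operation is polled at a fixed interval and ends at the first
    poll taken at or after the moment the device actually finishes."""
    total = 0
    for duration in actual_durations_us:
        polls = max(1, -(-duration // poll_interval_us))  # integer ceiling
        total += polls * poll_interval_us
    return total


# Hypothetical device: 1000 operations, 2 ms typical, 10 ms worst case.
typical_durations_us = [2000] * 1000
fixed_total_us = non_adaptive_time_us(1000, fixed_delay_us=10_000)
polled_total_us = adaptive_time_us(typical_durations_us, poll_interval_us=500)
print(fixed_total_us / 1e6, "s fixed delay vs", polled_total_us / 1e6, "s polled")
```

With these assumed numbers the polled version finishes in a fifth of the time. Shrinking the fixed delay toward the typical duration closes the gap, which matches the observation that a tuned non-adaptive file can sometimes win.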


STAPL files can often provide better programming performance than SVF files.
Despite the lack of adaptive programming features in SVF, ScanExpress Runner and
ScanExpress Programmer JTAG implement some techniques in SVF execution to speed
up programming, such as re-scanning on failure and adjusting delay time when a
particular suffix (_xilinx.svf, etc.) is used. Additionally, Lattice has
expanded their SVF files to include non-standard LOOP statements to facilitate
adaptive programming.

What is your experience with CPLD and FPGA programming? We’re always seeing new
and unusual cases—a new FPGA programs slower than its previous version, STAPL
executes much faster than SVF and in the odd case, SVF executes much faster than
STAPL, etc.—and look for input to help improve our software.

Source: SVF and STAPL/JAM: Adaptive FPGA Programming

Monday, November 15, 2010

HSWAP pin for FPGA


The HSWAP pin (also known as HSWAP_EN or PUDC) is commonly found on Xilinx FPGAs.
This pin controls whether the FPGA’s user IO pins will have a pull-up resistor
or float—when HSWAP is LOW, each IO pin will have an internal pull-up resistor.
For our example we’ll look at a particular Spartan-3 case, but this may apply to
other parts as well.


In most cases, the JTAG and configuration control pins will keep their
pull-ups regardless of the state of the HSWAP—but in our experience we’ve seen
evidence of exceptions where the internal pull-up on some FPGAs has an effect on
compliance pins, such as INIT_B or PROG_B. This is an important distinction—in
certain cases INIT_B & PROG_B will have a dependence on HSWAP, so it’s often a
good practice to use external pull-up or pull-down resistors rather than relying
on the internal pull-ups to control these lines.


Consider the four cases below:

Figure 1: Four cases of HSWAP & INIT_B/PROG_B configuration

Note: These cases make the assumption that the HSWAP_EN pin will have an effect
on PROG_B and INIT_B, but this is not always the case. Consult the device
documentation and errata for details.

In cases 1 and 2, important compliance and input pins are connected to
strong pull-up/pull-down resistors and the state of HSWAP should have no effect
on the state of these pins. In case 3, INIT_B and PROG_B are floating and may
cause test failures. In case 4, the pull-down on HSWAP ensures that input pins
are pulled up, but does not cover the case where an input or compliance pin may
need to be pulled down.


When designing for boundary-scan test, it pays off to consider the
pre-configuration behavior of FPGAs. To cover all scenarios—though it may not
always be necessary for boundary-scan test—it’s a good idea to include a strong
pull-down on HSWAP during boundary-scan test, but consider the consequences of
pull-ups on IOs before relying on it for pre-configuration cases. Whenever
possible, include pull-ups/pull-downs on configuration and mode pins such as
INIT_B & PROG_B. As always, when in doubt check the device documentation!

Source: FPGA: HSWAP pin

Sunday, November 14, 2010

Bypass Boundary-Scan Devices


On occasion, due to incompatibilities, non-compliance, debugging, or various
other factors related to the boundary-scan chain, it may be necessary to
physically bypass a boundary-scan device and remove it from testing. The most
common approach is to add a bypass resistor, but there are important
consequences with this approach. We’ll discuss some considerations that should
be noted when bypassing an installed device to remove it from the scan chain.

Active TDO

When physically bypassing a boundary-scan device, note that the output of the
bypassed device may still be driving unless explicitly disabled. This may result
in two devices driving the input to the next device at the same time. In the
following diagram, if U2 is bypassed with U2_BYPASS_RESISTOR and U2 remains
installed, both U1.TDO and U2.TDO drive U3.TDI.

Figure 1: Common device bypass configuration

Hardware solutions to the active TDO problem

The simplest method of removing the contention between TDO pins is to physically
remove the connection between the conflicting TDO pins. Some options include:
  • Remove the series termination resistor. If there is a series termination
    resistor as shown below, it may be removed to eliminate the contention on TDO

Figure 2: Series termination resistor on TDO

  • Lift the U2.TDO pin

  • Remove U2

  • Cut the trace connected to U2.TDO, before it connects to the bypass
    resistor trace

  • Lift the U3.TDI pin, and wire from U1.TDO to the lifted pin

Software solutions to the active TDO problem
In cases where modification to the hardware is undesirable, there are some
additional solutions that may be used. These solutions rely on specific behavior
with respect to the JTAG control signals and, in our experience, not all devices
will perform the same way.

Keep TMS to the device high, and provide at least 5 TCK cycles during testing
Per the IEEE 1149.1-2001 standard section 6.2, the TDO pin should be inactive
(tri-stated) when it is not driving data. This can be guaranteed by forcing U2
into JTAG reset mode by giving it five or more clocks with TMS high. Note that
some devices may not fully conform to the standard. Additionally, in the typical
design, the TMS signal connected to U2 is connected to other devices as well.
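
The five-clock rule follows from the structure of the standard TAP controller state machine. The Python sketch below models the 16 states of that state machine and checks how many TCK cycles with TMS high are needed to reach Test-Logic-Reset from each one. It models a compliant TAP only and says nothing about how a particular non-compliant device behaves; the function names are illustrative.

```python
# Next state for each TAP controller state: (TMS = 0, TMS = 1).
TAP_NEXT = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle":    ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan":   ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR":       ("Shift-DR", "Exit1-DR"),
    "Shift-DR":         ("Shift-DR", "Exit1-DR"),
    "Exit1-DR":         ("Pause-DR", "Update-DR"),
    "Pause-DR":         ("Pause-DR", "Exit2-DR"),
    "Exit2-DR":         ("Shift-DR", "Update-DR"),
    "Update-DR":        ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan":   ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR":       ("Shift-IR", "Exit1-IR"),
    "Shift-IR":         ("Shift-IR", "Exit1-IR"),
    "Exit1-IR":         ("Pause-IR", "Update-IR"),
    "Pause-IR":         ("Pause-IR", "Exit2-IR"),
    "Exit2-IR":         ("Shift-IR", "Update-IR"),
    "Update-IR":        ("Run-Test/Idle", "Select-DR-Scan"),
}


def clock_tap(state, tms_bits):
    """Apply one TCK rising edge per TMS value and return the final state."""
    for tms in tms_bits:
        state = TAP_NEXT[state][tms]
    return state


def clocks_to_reset(state):
    """Count TCK cycles with TMS high needed to reach Test-Logic-Reset."""
    count = 0
    while state != "Test-Logic-Reset":
        state = TAP_NEXT[state][1]
        count += 1
    return count


worst_case_clocks = max(clocks_to_reset(state) for state in TAP_NEXT)
print("worst case:", worst_case_clocks, "TCK cycles with TMS high")
```

Once in Test-Logic-Reset, further clocks with TMS high keep the controller there, which is why holding TMS high for the whole test keeps a compliant device out of the way.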

If the device uses TRST, hold it active
Note that in a typical design, the TRST connected to U2 is connected to other
devices, too. This option is only viable if the TRST line connected to the device
to be bypassed does not affect other boundary-scan devices.

Additional Considerations

Even after the TDO connection has been removed, the bypassed device still
receives TCK and TMS signals. This causes the device to receive all JTAG state
machine commands (on the still connected TMS) and data (on the still connected
TDI). This may place the device in an unknown and possibly undesirable state.
This rarely has an effect on testing or board safety, but it is a possibility.
If TMS on this device is not shared with additional boundary-scan devices, the
solution is to hold TMS high.

Boundary-scan device operation in non-boundary-scan mode
This is not a factor for boundary-scan chain operation, but is a factor in
testing. Now that the device is not in the scan chain, it will operate
“normally”, which may involve driving pins to unknown states. Just like all
other non-boundary-scan devices, for maximum test coverage it is advantageous to
control the device so that the outputs are disabled. If the outputs drive, those
nets cannot be tested due to contention.


While it’s great when things go well, it’s important to be prepared for some
debugging. Whenever possible, it’s a good idea to design the board to facilitate
test and debug: not only do series termination resistors improve signal quality,
but they can also ease the process of wiring around a problem device. For more
tips on designing for testability, visit the Design Tips and Guidelines section
of our website.
Source: Bypassing Boundary-Scan Devices

Saturday, November 13, 2010

JTAG Program CPLDs & FPGAs


In-system programming (ISP) of CPLDs & FPGAs is a key application of JTAG. Most
modern CPLDs & FPGAs include a JTAG port for programming and boundary-scan
tests, and each vendor provides the software to generate an SVF or STAPL/JAM
file for execution in ScanExpress Runner or Programmer.

In this topic we'll discuss the basic topologies—with respect to JTAG and ISP—of
CPLDs, FPGAs, and Configuration Devices and how each one affects our approach to
ISP. For the purposes of this discussion, we’ll keep the topic vendor
agnostic—the methods for each vendor are remarkably similar.


CPLDs present the simplest case: the lack of external configuration devices
means that you’ll be directly programming the logic device through JTAG.
Creating a JTAG programming file should be a straightforward process when using
the appropriate vendor’s software, usually a matter of specifying the part number,
the configuration data, then generating the SVF or STAPL/JAM file.

It should be noted that some CPLDs internally “configure” on power up, loading
data from an internal Flash to an internal SRAM. In this respect, they resemble
FPGAs. Additionally, some modern FPGAs include non-volatile memory as well,
further blurring the lines between FPGAs and CPLDs. Fear not—with respect to
ISP, these CPLDs may be treated the same as traditional CPLDs.


FPGAs present some complications due to their volatile nature. Rarely will it be
necessary to program an FPGA through JTAG—instead, we want to program the
configuration device such that the next time (and any subsequent times) the
board boots, it will load the new configuration data.

FPGA and configuration device connections usually come in one of two flavors:
  1. The FPGA and configuration device are both connected to the scan chain. The
    configuration device may be programmed directly through JTAG.

  2. FPGA is on the scan chain, but the configuration device does not have a
    JTAG port. The configuration device must be programmed indirectly through
    the FPGA.

JTAG Programmable Configuration Device

We’ll first examine the case of a JTAG programmable configuration device, as
shown below. Since we have direct JTAG access to the configuration device, it is
simply a matter of scanning out the correct instructions and data. The vendor’s
generated SVF or STAPL/JAM file will be ideal.

Figure 1: JTAG programmable configuration device

Note that since the FPGA’s connection (other than TDO to TDI) is not necessary
for programming, the FPGA’s boundary-scan register does not need to be scanned
each time. Configuration devices generally have few pins. Taking these two
factors together, we observe that programming through JTAG is very efficient in
this case, and can result in significantly better programming times than the
cases we’ll explore next.
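
The efficiency gain comes from the fact that a device placed in BYPASS contributes only a single bit to each data scan. The Python sketch below compares scan lengths for a two-device chain. The device names and register lengths are hypothetical, picked only to show the effect, and do not describe any particular FPGA or configuration device.

```python
def scan_length_bits(chain, target):
    """Data scan length when only `target` has a multi-bit register selected
    and every other device in the chain is in BYPASS (1 bit each)."""
    return sum(bits if name == target else 1 for name, bits in chain)


# Hypothetical chain: (device name, length of the register used, in bits).
example_chain = [
    ("FPGA", 800),   # boundary-scan register of the FPGA (assumed length)
    ("PROM", 16),    # programming data register of the config device (assumed)
]

direct_bits = scan_length_bits(example_chain, target="PROM")
via_fpga_bits = scan_length_bits(example_chain, target="FPGA")
print(direct_bits, "bits per scan when programming the configuration device directly")
print(via_fpga_bits, "bits per scan when driving pins through the FPGA boundary register")
```

With these assumed lengths, each scan is roughly 47 times shorter when the configuration device is addressed directly.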

When a JTAG connection is available on the configuration device or CPLD,
programming is about as simple as it can get. In the next post, we’ll discuss
how to deal with FPGAs that utilize non-JTAG configuration devices.

Source: JTAG Programming of CPLDs & FPGAs

Friday, November 12, 2010

Phase-locked Loops (PLLs) in Clock Buffers - JTAG Boundary-scan Tip

PLLs contained in clock distribution ICs generally will not function correctly with a clock input that neither maintains a constant frequency nor operates in the correct frequency range. This applies to both the JTAG clock (TCK) and to synchronous device clock pins, such as those found on SDRAM.

However, all hope is not lost! Many buffers have a method of disabling or bypassing the PLL. For boundary-scan testing this mode should be used whenever possible, and the clock distribution device should use a transparent model in ScanExpress TPG. Some common methods for dealing with PLLs include:
  • PLL disable pin, such as a test pin.

  • Mode pins, which include a bypass mode. Sometimes this is stated by
    saying the “reference” is applied to the outputs.

  • Applying a different voltage (sometimes no power) to a power pin,
    usually the PLL power pin.

  • Please refer to the device data sheet to determine if and how the
    internal PLL can be disabled.

For example, compare the popular Cypress CY2305 & CY2309 clock buffers (data sheet available from the Cypress website). See table 2 of the referenced data sheet: CY2309 includes select input pins not available on the CY2305, adding a PLL shutdown mode in which the output source follows the reference clock, allowing the clock buffer to be treated as a transparent device during boundary-scan tests.

Thursday, November 11, 2010

Strong Pull-ups on FPGAs - JTAG Boundary-Scan Test Tip


Many FPGAs in their preconfigured state include relatively strong internal
pull-up/pull-downs, often in the 4.7k-ohm range or lower. If a weak
pull-up/pull-down resistor is attached to such a pin, there is risk that the
pull-up/pull-down test may fail.


Consider the simplified diagram below:

Figure 1: 10k Pull-down attached to a pre-configuration BIDIR FPGA pin

The pre-configuration boundary-scan pin has an effective internal pull-up
resistance of 4.7k-ohms. It is externally strapped with a weak 10k-ohm pull-down
resistor. Driving the net will not be a problem—when the output buffer is
enabled, current will flow through either resistor, allowing the output node to
be driven and sensed both HIGH and LOW.

However, the pull-up/pull-down test will tri-state the output of this pin and
then expect the 10k pull-down to take the value (as sensed by the input cell)
down to “0”. This is not the case—let’s determine why.

Voltage Dividers

When the output buffer tri-states, we end up with a simple voltage divider
between the internal effective resistance and the external pull-down. We can
calculate the value here:

Vpd = Vcc * Rext/(Rext + Rint) = Vcc * 10k/(4.7k + 10k) ~= 0.7Vcc

This is a very high value and will likely not meet the VIL requirements, causing
the resistor test to fail, possibly intermittently.
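
The divider calculation above is easy to repeat for other resistor values. The Python sketch below reproduces the 4.7k/10k case and also estimates the largest external pull-down that would still reach a logic LOW. The VIL threshold of 0.3 x Vcc is an assumption for illustration; the real limit comes from the device data sheet.

```python
def divider_fraction(r_internal_pullup_ohms, r_external_pulldown_ohms):
    """Pin voltage as a fraction of Vcc with the output buffer tri-stated."""
    return r_external_pulldown_ohms / (
        r_external_pulldown_ohms + r_internal_pullup_ohms)


def max_pulldown_for_vil(r_internal_pullup_ohms, vil_fraction):
    """Largest external pull-down that keeps the pin at or below VIL.

    From Rext / (Rext + Rint) <= vil, so Rext <= Rint * vil / (1 - vil).
    """
    return r_internal_pullup_ohms * vil_fraction / (1.0 - vil_fraction)


ASSUMED_VIL_FRACTION = 0.3  # assumption: VIL = 0.3 x Vcc

pin_fraction = divider_fraction(4700, 10_000)
largest_pulldown = max_pulldown_for_vil(4700, ASSUMED_VIL_FRACTION)
print(f"pin sits at {pin_fraction:.2f} x Vcc")
print(f"pull-down would need to be below about {largest_pulldown:.0f} ohms")
```

Under that assumed threshold, a pull-down of roughly 2k ohms or less would be needed to overcome a 4.7k internal pull-up.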


When this situation occurs, the best solution may be to remove the net from
testing during the resistor test. While—depending on resistor values—it may be
on the border of meeting the threshold requirements and often sense LOW, it is
probable that it will not be reliable and cause false test failures. In reality
the pull-down should be considered un-testable by boundary-scan
pull-up/pull-down test methods.

Wednesday, November 10, 2010

JTAG Cables


TAP adapter cables are often necessary to convert from the standard
pinout to the TAP connector pinout of a particular target. The pinout may be
Altera or Xilinx programming headers, CPU emulation headers, or other
proprietary pinout. In this discussion, we’ll cover design considerations for
creation of custom TAP adapter cables.

Creating TAP Adapter Cables

Most TAP connectors are two rows of 0.025 inch square posts on 0.1 by 0.1 inch
centers, making them suitable for mass terminated ribbon cable. In some cases,
the TAP connector may be single row, or part of a much larger connector, such as
a DIN connector.

When designing and constructing an adapter cable, there are a few design factors
to consider.

  • Which Boundary-scan controller is being used? If only
    controllers with 20-pin TAPs will be used, a 20-pin ribbon cable connector
    such as a 3M 3421-6620 will plug directly into the controller. If an older
    controller will be used or a variety of controllers will be used, we
    recommend using a 10-pin cable connector such as a 3M 4610-6351 for maximum
    compatibility. This will accept the 10-pin cable connector from all

  • Ensure that the mating connectors are obtained first. The acquisition
    process may take days, so get it started as soon as possible.

  • Maintain good signal integrity by using as short a cable as possible.
    This will help reduce EMI, crosstalk, cable capacitance, etc.

  • Maintain good signal integrity with good signal return paths. The ground
    wires affect signal integrity because they are the return path for the
    signals. To enable high TCK rates, our boundary-scan controllers have signal
    slew rates in the 2-5 ns range. This requires a good signal return path,
    commonly called ground, to ensure signal quality. On the standard
    pinout, there is a signal return path for every signal. Many TAP connectors
    on boards do not have a ground for every signal. We recommend connecting all
    the grounds of the boundary-scan controller cable to the target ground pin
    or pins. If there is one ground pin, it should be fanned out to all the
    cable grounds. If there are two ground pins, we recommend connecting the
    board ground pin closest to the board TCK pin to the cable ground wire
    closest to the cable TCK signal. All other ground wires in the cable should
    be connected to the second ground. For example, for the Altera programming
    header, the wirelist should be as follows:

    Table 1: Example Pinout for Altera Programming Header

  • Maintain good signal integrity with signal termination. Series
    and pullup/pulldown termination is best done on the board. However, if the
    board lacks the appropriate termination, it can occasionally be solved by
    placing the termination on the cable.

  • Verify the pinout. It is very easy to swap the TDO/TDI pin
    assignments of the target versus the boundary-scan controller cable. Do not
    rely on the signal names. Check that the direction of the data flow matches.

  • Test the cable. Once the infrastructure test is working,
    determine the maximum TCK rate. We recommend then looping infrastructure so
    it runs at least two minutes. This will test the signal integrity of the
    scan chain, including the adapter cable.

  • If an adapter PCB is used instead of an adapter cable, the same
    concepts apply. Verify the pinouts. Use a ground plane to ensure good
    signal return paths. Connect as many ground pins as possible to the ground
    plane.

  • If the UUT does not have the recommended termination, it may be
    helpful to implement the termination in the cable or adapter PCB.

  • A cable with a ground plane is usually not needed. If the signal
    return paths are limited, it may help by replacing the signal return paths.
    If testing in a high EMI environment, it may help provide some shielding when
    oriented so that the ground plane is between the EMI source and the signal
    wires.

  • If in a high EMI environment, use twisted pair wires, preferably
    twisted, shielded pairs. This can be awkward to implement, so this
    is recommended as a last resort when EMI is a strong suspect as a problem.

  • When using wire wrap wires to make connections, the same concepts
    apply. At least twist the TCK with a ground wire. Preferably, twist all
    signal wires with a signal return wire. Connect the signal return wire
    at both ends. On the controller end, connect to the “paired” return wire
    (1&2, 3&4 etc). At the target end, connect to grounds as close as possible
    to the signal connection.
Source: JTAG TAP Adapter Cables

Thursday, April 15, 2010

Tutorial: The Role of JTAG in system debug & test throughout the embedded system development lifecycle

JTAG Debug Advantages
The primary advantages of using a debugger with JTAG access are:
* The JTAG connection provides direct access to the otherwise hidden CPU core
* The JTAG interface consumes no system I/O ports (serial, Ethernet)
* The JTAG debug method uses little or no system memory allocation (as in monitors)
* There is no monitor to crash along with a system crash (not useful at board bring-up)
* The JTAG connection does not require target system power (except some USB-only probes)
* A JTAG debugger can "steal cycles" to read registers/memory without stopping CPU (assuming that the debug logic built into the CPU provides this capability)
* A JTAG debug session can reset and/or initialize the system (Note: System reset is not part of JTAG. Rather, it is an adjunct to using JTAG for remote debugging, enabling a remote reset of a JTAG probe and target over a network.)
* A JTAG debugger can connect to the debug logic without perturbing the system
* Provides the only reasonable means to connect to targets that do not yet have working bootcode or I/O drivers

JTAG Debug Limitations
The JTAG debug connection does not solve all the world's debug problems because of some serious limitations:

1) Code download over JTAG is not the fastest way to download large programs (>20MB), especially for target systems that rely on 10/100BaseT Ethernet access.

2) In multicore system debug, multiple CPU cores are daisy-chained on the same scan chain and can be individually accessed, but implementing a synchronous debug operation requires additional on-chip hardware to circumvent the skid associated with JTAG operations.

Without such hardware, hundreds of CPU cycles may go by after an asynchronous JTAG stop command is issued. Examples of these capabilities are now beginning to appear, e.g., the global inter-processor control logic in the Cavium Networks Octeon family, with up to 16 64-bit cnMIPS cores.

3) "Printf" still provides an easy complement for extracting a variety of debug status reports.

JTAG to the Rescue - Boundary Scan Testing

The Joint Test Action Group (JTAG) began solving board-level test problems in the 1990s by standardizing a serial scan chain method (JTAG; IEEE 1149.1) for accessing on-chip resources and additional shift registers built into the I/O paths of every IC for boundary scan testing.

Before the emergence of boundary scan testing, debugging of potential solder bump issues underneath a chip assembly was difficult. Prior to board assembly, every IC is tested to assure its flawless operation. Thus, if the assembled printed circuit board (PCB) does not work properly, the malfunction must be caused by a solder bridge, a gap, or a flaw in the printed circuit board itself. But what if the flaw is underneath the chip assembly, where it can't be seen or repaired easily?

The boundary scan testing methodology addresses this issue. As illustrated in Figure 1, a serial scan path through I/O registers was added and exercised by a sophisticated test program unique to each board to help identify a faulty chip or other device, so that these can be reworked or replaced. In the diagram in Figure 1, each grey box represents a category of device function, e.g., flash, peripherals, I/O ports, etc.

Figure 1. JTAG connection used for boundary scan testing

The JTAG approach provides a method to test very complex systems, while keeping the pin count low. Specifically, the IEEE1149.1 specification requires only 5 pins for the JTAG connection, no matter how long the scan chain register path is. The standard pin functions for the JTAG Test Access Port include:

TRST Test Reset (output from JTAG probe to chip to reset JTAG test logic)
TCK Test Clock (output from JTAG probe to chip to set JTAG scan rate)
TDI Test Data Input (serial test data input to chip)
TDO Test Data Output (serial test data output from chip)
TMS Test Mode Select (steers the JTAG TAP controller state machine; sampled at the TCK rising edge)

Several companies focus almost exclusively on boundary scan testing, specializing in both the JTAG hardware connection devices and host-based test software tools to adapt the test program to each board design.

The 2nd Role of JTAG - CPU Core Access for Software/Hardware Debug
Given that the CPU processor core is now hidden from observation or control by integrated caches in the core, by local on chip busses, by an MMU that dynamically allocates memory, and by other SOC peripherals and I/O blocks, the JTAG path provides a direct connection into the debug logic inside the CPU. Thus, we now have a means of observing and controlling program execution. Since caches and peripherals have moved on chip, so must the debug logic (Figure 2 below).

Figure 2. JTAG connection used for software debug/development

With this direct core access, host-based debugger software can now assert a "debug exception", redirecting the processor to get the next instruction from the debug logic registers instead of the program counter, thus effectively taking control of the processor to perform software debug operations:

* Run-control: Start, Stop, Single-Step, Step Into/Over (source or instruction)
* Set hardware and software breakpoints
* Specify conditions to be met or scripts to be executed at breakpoints
* Control reset and initialization of the target system
* Download code to be debugged or code to be programmed into flash
* Execute flash programming and other semi-hosting utilities

Note that in both of the above applications, boundary scan and software debug, the role of JTAG is only to provide the physical layer communications interface, analogous to the PHY layer in the ISO Open Systems Interconnection (OSI) model.

The protocol for what debug functions are supported is embodied in the debug logic, designed into the CPU core and the debugger software capabilities running on the host computer.


Source: Tutorial: The Role of JTAG in system debug & test throughout the embedded system development lifecycle by Lyle Pittroff

Monday, March 1, 2010

embedded world 2010

The embedded world Exhibition & Conference is the world's biggest exhibition of its kind and the meeting place of the international embedded community. Embedded technologies are in action everywhere, whether in the car, data and telecommunication systems, industrial and consumer electronics, military systems or aerospace. In 2009, 704 exhibitors showed about 16,000 visitors the full range of products for embedded technologies: hardware, software, tools, services and lots more.
