1 Introduction

Measurement campaigns for certifying the electrical properties of wind turbines are strongly influenced by external conditions when conducted in the field. Hence, these campaigns are time-consuming and costly. Certifying wind turbines on multi-MW system test benches shortens the measurement campaigns, makes timelines reliable and thereby reduces costs.

Whether certification measurement results obtained at system test benches are consistent with those obtained in the field was investigated for the first time within the joint research project CertBench [14], and the results are shared in this paper. In CertBench, an ENERCON E‑115 E2 wind turbine underwent the complete certification measurement campaign at the system test benches at the Center for Wind Power Drives (CWD) of the RWTH Aachen University and at the Dynamic Nacelle Laboratory (DyNaLab) of the Fraunhofer Institute for Wind Energy Systems (IWES) as well as in the field. It is the first wind turbine world-wide to receive a type certificate based solely on measurements taken on a system test bench. The measurement results were discussed and evaluated by the project consortium, consisting of all stakeholders relevant for certification, i.e. OEM, certification body, accredited measurement institute and test bench operator. The conclusions drawn from the evaluation of the different test items are summarized in this paper. More detailed analyses of single test items and aspects are documented in dedicated publications.

In order to exploit the full potential of certifying wind turbines on system test benches on a broad scale, standardization is needed to define requirements for system test benches, i.e. their grid emulators and HiL systems, and to outline suitable measurement procedures. With this in mind, the authors share their recommendations on these requirements, based on the results and findings from CertBench, in this paper. To do so, we walk the reader through the typical test sections active power, reactive power, power quality, and grid faults as defined in [3]. For each test we share comparisons of test bench and field measurements and state our recommendation on grid emulator and HiL-System requirements. Where applicable, we also share experiences on test and measurement execution. Thereby, this paper provides for the first time a condensed overview of the most relevant aspects to be taken into account for certification on test benches, based on full-scale experimental data from test benches and field measurements.

The paper is organized as follows: In Sect. 2, the test setups at CWD and DyNaLab and some general definitions regarding HiL-System and grid emulator are given. This is followed by an introduction to the measurement campaign in Sect. 3. Sects. 4, 5, 6 and 7 present results and conclusions for the certification measurements Active Power, Reactive Power, Power Quality Aspects and Grid Faults, respectively. A summary and general conclusions are given in Sect. 8.

2 Description of baseline test bench setups

The following sections will give an overview of system test benches, their HiL-Systems and grid emulators, as well as the tested wind turbine. Due to limited space in this paper, not too many details on the specific test benches at CWD and DyNaLab are given. Instead we refer the reader to relevant literature.

2.1 General test bench description

System test benches as they are considered in the joint research project CertBench and this paper, are test benches on which a complete wind turbine nacelle is mounted. As illustrated in Fig. 1, the wind turbine comes with its complete control system relevant for power conversion (main control, pitch control, converter control) and—except for the rotor—with the complete mechanical and electrical power train from low speed shaft to transformer.

Fig. 1

Principal setup of system test benches as they are considered in CertBench

The test bench features a mechanical-level HiL-System (from here on called HiL-System) that emulates the rotor with its aerodynamic and mechanical properties and missing sensors and actuators. It also features a grid emulator that replicates the electrical grid at the point of common coupling (PCC) of the wind turbine. This grid emulator can also be equipped with a power-level HiL-System (PHiL-System).

Both test benches, CWD and DyNaLab, are equipped with direct drive motors and converter-based grid emulators. At CWD the motor has a maximum torque of 2.7 MNm (up to 3.4 MNm at overload) and a maximum torque change rate of 1 kNm/ms [12]. The drive at the DyNaLab consists of two motors in tandem arrangement with a total torque of 8.6 MNm (up to 13 MNm at overload) and a maximum torque change rate of 86 kNm/ms [5]. The grid emulators have a total power of 21.5 MVA (CWD) [10] and 44 MVA (DyNaLab) [4].

2.2 Description of tested wind turbine

The basic electrical design of any type of ENERCON wind turbine is identical. The hub of an ENERCON wind turbine is connected directly, that is, without an intermediate gear box, to the rotor of a high-pole field-excited ring generator. The variable frequency alternating current (ac) output at the ring generator’s stator terminals is connected to the grid through a full-scale power converter. The latter consists of a back-to-back inverter system, the number of which depends on the nominal active power output and the required reactive power capability for the corresponding wind turbine. This means that the ring generator is decoupled from the power system allowing a wide operating speed range. The electrical performance of an ENERCON wind turbine on the grid is hence defined by its control systems and the limitations of the power electronics and other electrical components. The input parameters for the control system are the voltage and the frequency. Both parameters are measured at the low voltage side of the unit transformer. Table 1 provides a summary of the relevant technical details of the ENERCON E‑115 E2 wind turbine used for the tests.

Table 1 Short Technical characteristics of the WTs studied

2.3 Grid emulator properties

The general requirements a grid emulator has to fulfill can be found in TR3 Chap. 3.4 [2]. For the sake of readability of this paper, the major properties are summarized in Table 2.

Table 2 Standard requirements for the grid simulator, based on FGW TR3 Rev. 25 Chap. 3.4

The standard configuration of the grid emulators, which was commonly used throughout the tests, was a PCC voltage of 20 kV and a grid impedance of 1.75 Ω with an X/R ratio of 3.3 at both test benches, CWD and DyNaLab. If different settings are used for a test, this is stated explicitly in the results description in the following chapters. The given grid impedance is emulated by the grid emulators, both of which feature an impedance control.
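To make the standard grid setting more tangible, the following minimal sketch splits the stated impedance into its resistive and reactive parts and estimates the corresponding short-circuit power. It assumes that 1.75 Ω denotes the impedance magnitude, that the nominal frequency is 50 Hz, and that the short-circuit power can be approximated as U²/|Z| with the 20 kV line-to-line voltage. None of these assumptions is stated explicitly in the text, and the function names are illustrative only.

```python
import math

def grid_impedance_components(z_mag_ohm, x_over_r, f_hz=50.0):
    """Split an impedance magnitude and X/R ratio into R, X, L and phase angle.

    Assumption: z_mag_ohm is the magnitude |Z| = sqrt(R^2 + X^2).
    """
    r = z_mag_ohm / math.sqrt(1.0 + x_over_r ** 2)   # resistance in ohm
    x = x_over_r * r                                  # reactance in ohm
    l = x / (2.0 * math.pi * f_hz)                    # inductance in henry
    angle_deg = math.degrees(math.atan(x_over_r))     # impedance phase angle
    return r, x, l, angle_deg

def short_circuit_power_mva(u_ll_kv, z_mag_ohm):
    """Approximate short-circuit power as U^2 / |Z| (kV^2 / ohm = MVA)."""
    return u_ll_kv ** 2 / z_mag_ohm

# Standard configuration from the text: 20 kV, 1.75 ohm, X/R = 3.3
r, x, l, angle_deg = grid_impedance_components(1.75, 3.3)
s_k = short_circuit_power_mva(20.0, 1.75)
print(f"R = {r:.3f} ohm, X = {x:.3f} ohm, L = {l * 1e3:.2f} mH, angle = {angle_deg:.1f} deg")
print(f"Short-circuit power = {s_k:.1f} MVA")
```

Under these assumptions the setting corresponds to roughly 0.51 Ω of resistance, 1.67 Ω of reactance and a short-circuit power of about 229 MVA.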

2.4 HiL-System models

The HiL-Systems are equipped with rotor models, which reproduce the aerodynamic and structural behavior of the rotor; wind field models, which provide either laminar or turbulent wind conditions; and actuator models, such as a pitch system model. For most tests a common set of models for each element is used and not altered. For a few tests, for example maximum power peaks (Sect. 4.1), flicker (Sect. 6.1) or harmonics (Sect. 6.2), the impact of the rotor and wind models is analyzed and different models are used.

Each test bench has its own HiL-System and related controls. Due to the focus of this paper, no further details on the HiL-Systems and internal controls are given here and the reader is kindly referred to the relevant publications from CWD [11,12,13] and DyNaLab [5,6,7].

Model depth

The major difference between the defined model depths MT0, MT2 and MT10 (cf. Table 3) is the number of modes reproduced by the rotor model. The simplest model (MT0) does not emulate any dynamics and considers the rotor as a rigid body. The model type MT2 reproduces two modes of the rotor and hence considers it as flexible. For the turbine at hand this includes the first coupled eigenfrequency located below 2 Hz. Note that we recommend considering the number of coupled eigenmodes as the relevant criterion, not the frequency value (here 2 Hz), as the latter differs from turbine to turbine. The most accurate model is MT10, which considers six modes of the rotor blade. This model corresponds to a typical load calculation model.

Table 3 Short overview of the different modelling details used within the HiL-Systems

Wind models

Wind models in the context of this work are distinguished only as laminar or turbulent wind fields. In this paper a turbulent wind field is considered to be a spatial, 3D wind field. For the purpose of certification, the wind field has to be created according to IEC 61400‑1 [17] with common tools (e.g. TurbSim [18]). The chosen turbulence intensity has to correspond to the site classification valid for the particular wind turbine. The length of the wind field must be at least 600 s in order to capture the spectral properties properly. For practical reasons, we recommend using wind fields of at least 630 s length to exclude transient effects which occur e.g. when the wind field is changed or restarted.

HiL-System standard configuration

The standard model, which was commonly used throughout the tests, was MT2 at CWD and MT10 at DyNaLab. At both test facilities wind is normally considered as turbulent. If different models or wind conditions were used for a test, it is stated explicitly in the results description in the forthcoming chapters.

3 General information about measurement campaign

The certification measurements were carried out in the field and at the test benches at CWD and DyNaLab. At the test benches, a total of twelve test items from FGW TR3 [2] were carried out. Only optional tests and tests already performed at (small-scale) laboratories prior to the test campaigns at CWD and DyNaLab were not considered. Of the twelve test items, some were not or only partially executed at one or the other test bench for different reasons. A comprehensive list of which tests were carried out at which test bench can be found in the appendix in Table 8. Of the executed tests, only “Unbalances of the current” and “Voltage change during switching operations” are not evaluated in this paper.

It is also important to note that the field measurement campaign and the campaign at the DyNaLab were carried out and evaluated according to FGW TR3 Rev. 24 [1]. The campaign at the CWD was carried out according to Rev. 25 [2]. Although this difference is not optimal for comparison to the field tests, it was necessary so that the certificate is issued according to the current standard and is thereby relevant for ENERCON and for the industry in general.

The measurement campaigns at the test benches were executed by the test bench operators together with an ENERCON team and all measurements were taken by UL. The complete campaign was closely supervised by an FGH team as certification body. The measurement campaigns at the test benches took in both cases only four weeks. This also includes the measurements taken using different variants of model depth and wind fields. The installation and commissioning took four to five weeks and the disassembly two weeks. Apart from this, the planning of the overall integration of the DUT into the test bench was kicked off approximately one year in advance. The main reason is long manufacturing time for the mechanical adapters.

4 Active power

4.1 Active power peaks

Comparison test bench and field

According to TR3 Rev. 25 [2], measurements in the power bins 80%, 90% and 100% are used for evaluation. The HiL-System models and the grid emulators at the test benches are parametrized in standard configuration (cf. Sects. 2.3 and 2.4).

The results of field and test bench tests are compared in Fig. 2a. It shows that the active power peaks determined on the test benches are comparable to the active power peaks determined in the field. The observed deviation is always less than 1.7%, with respect to field measurement. The tendency that active power peaks decrease with increasing averaging time is also reproduced.

Fig. 2

a Comparison of measured active power peaks averaged over 0.2, 60 and 600 s, obtained in the field, at the CWD and at DyNaLab. b Comparison of active power peak measurement results at IWES, when different models are used within the HiL-System or when laminar instead of turbulent wind is used
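The dependence of the active power peak on the averaging time can be illustrated with a short sketch. It assumes that the peak for a given averaging time is the maximum over non-overlapping block averages of the sampled active power. The exact evaluation rule is defined in the applicable guideline and may differ, and the synthetic signal below is purely illustrative and not measurement data.

```python
import math

def block_means(samples, dt_s, window_s):
    """Average the signal over consecutive, non-overlapping blocks of window_s."""
    n = max(1, round(window_s / dt_s))  # samples per block
    return [sum(samples[i:i + n]) / n
            for i in range(0, len(samples) - n + 1, n)]

def active_power_peaks(samples, dt_s, windows_s=(0.2, 60.0, 600.0)):
    """Return the maximum block average for each averaging time."""
    return {w: max(block_means(samples, dt_s, w)) for w in windows_s}

# Synthetic 600 s record sampled at 10 Hz, in per unit of rated power
# (hypothetical signal: constant level plus slow and fast fluctuations).
dt = 0.1
power = [0.90 + 0.05 * math.sin(2 * math.pi * k * dt / 40.0)
              + 0.03 * math.sin(2 * math.pi * k * dt / 3.0)
         for k in range(6000)]

peaks = active_power_peaks(power, dt)
for window, value in peaks.items():
    print(f"averaging time {window:6.1f} s -> peak {value:.4f} p.u.")
```

Because each longer block is an average of the shorter blocks it contains, the peak can only stay equal or decrease as the averaging time grows, which is the tendency described above.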

Analysis model depth

For the maximum active power peaks test, we also investigated the impact of the model depth of the HiL-System rotor model on the measurement results. As the active power peak tests are based on repeated 600 s long measurements, we only use a limited subset of the normative measurements given in Fig. 2a. This subset includes measurements at partial load, rated power and full load. Furthermore, measurements in laminar instead of turbulent wind conditions are compared.

The results of the comparison are given in Fig. 2b. The comparison between the models MT10, MT2 and MT0 clearly shows that the model depth has an impact on the result. The presented dataset is from a DyNaLab measurement and exhibits a difference of approximately 1.5% between MT0 and field measurements. Measurements taken at the CWD exhibit differences of 3% between MT0 and field measurements. This indicates that the model depth introduces some uncertainty, which we recommend minimizing by using higher-order models.

A laminar wind inflow results in almost constant active power peaks (cf. Fig. 2b), independent of the averaging time. As this behavior does not reflect the trend observed in the field, laminar wind conditions are not acceptable for active power peak measurements.

4.1.1 Recommendations

Rotor model and wind field in HiL system

When determining the active power peaks, we recommend using rotor models that correspond at least to the model depth MT2. The wind field must be turbulent.

This is because we observe an influence of the model depth in the evaluation, so a higher model depth is recommended in terms of accuracy. Turbulent wind is recommended because a laminar inflow leads to implausible results (cf. Fig. 2).

Grid emulator

Standard requirements from Sect. 2.3 are applicable.

Test execution

When performing the measurement, make sure that each point in time stored in the wind field (e.g. 0–600 s) is only run through once during the ten-minute measurement and that implausible sections in the wind field (e.g. transition from end to start of wind field) are discarded from measurement.

We also recommend implementing a common trigger between the HiL-System and the measurement system to ensure that wind field and measurement are synchronized, for the reason mentioned above.
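The rule that the ten-minute measurement must not run across the end of the wind field can be checked before a test run. The following minimal sketch assumes that the measurement is triggered a fixed settling time after the wind field starts. The 30 s default simply reflects the difference between the recommended 630 s field length and the 600 s measurement, and the function name is hypothetical.

```python
def measurement_window(field_length_s, measurement_s=600.0, settle_s=30.0):
    """Return (start, end) of the measurement in wind-field time.

    Assumption: the first settle_s seconds after (re)starting the wind field
    are discarded as transient. Raises ValueError if the measurement would run
    past the end of the wind field and thus wrap around to its start.
    """
    start = settle_s
    end = start + measurement_s
    if end > field_length_s:
        raise ValueError(
            f"wind field of {field_length_s} s is too short for a "
            f"{measurement_s} s measurement after {settle_s} s settling"
        )
    return start, end

# A 630 s wind field leaves exactly one clean 600 s measurement window.
print(measurement_window(630.0))
```

A wind field of exactly 600 s would fail this check with the default settling time, which matches the practical recommendation to generate slightly longer wind fields.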

4.2 Active power control

The test active power control consists of two sub-tests, which are “Setting Accuracy” and “Gradient and Settling Time”. The results of both tests compared to field measurements are given in the following.

Setting accuracy

In order to determine the setting accuracy of the active power control, the set point is reduced in 10% steps down to 10% Pn. The accuracy is evaluated on the basis of the difference between the one-minute average value and the required set point of the active power. For this test item, measurements are available from field and CWD tests.
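The evaluation described above can be sketched in a few lines. The rated power of 3000 kW and the sample values are placeholders chosen for illustration only, and expressing the deviation relative to Pn is an assumption about how the result is normalized.

```python
def setting_accuracy(samples_kw, setpoint_kw, p_n_kw):
    """Deviation of the averaged active power from its set point.

    samples_kw: active power samples covering the one-minute evaluation window.
    Returns (deviation in kW, deviation in percent of rated power Pn).
    """
    mean_kw = sum(samples_kw) / len(samples_kw)
    deviation_kw = mean_kw - setpoint_kw
    return deviation_kw, 100.0 * deviation_kw / p_n_kw

# Set point levels in percent of Pn: 100, 90, ..., 10
setpoint_levels = list(range(100, 0, -10))

# Hypothetical example: rated power 3000 kW, set point 50 % Pn
p_n = 3000.0
dev_kw, dev_pct = setting_accuracy([1510.0, 1490.0, 1530.0], 0.5 * p_n, p_n)
print(f"deviation: {dev_kw:.1f} kW ({dev_pct:.2f} % of Pn)")
```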

Fig. 3 shows that test bench and field measurements follow an identical trend. Even the power jump at 30% can be observed in both measurements. In absolute values, however, the test bench and field measurements differ. Though the absolute error is only approximately 35 kW, in relative numbers the largest deviation observed at the test bench is 30% of the maximum deviation observed in the field.

Fig. 3

Power deviation against reference power in the field and at CWD

Settling time and gradient

The settling time and the gradient are determined by applying step changes to the power reference. As an exemplary comparison we consider a step change from 100% to 10%, as this test was executed in the field and at CWD, and a comparable test starting from 90% Pn was executed at DyNaLab.

The results in Table 4 show that the settling time is comparable between field and test bench measurements. The difference between the gradients measured at the test bench and in the field is due to a different parameter setting in the wind turbine, which allowed higher gradients at the test bench, and is hence plausible.

Table 4 Comparison of settling Time and Gradient of Active Power Control
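As an illustration of how a settling time can be extracted from a recorded step response, the sketch below uses a common tolerance-band definition: the settling time is the time from the set point step until the signal enters the band around the target and no longer leaves it. The tolerance value and the sample data are assumptions for illustration, and the normative definition in the applicable guideline may differ in detail.

```python
def settling_time(t, y, t_step, target, tol):
    """Time from t_step until y stays within target +/- tol.

    Returns None if the signal has not settled by the end of the record.
    """
    last_outside = None
    for ti, yi in zip(t, y):
        if ti >= t_step and abs(yi - target) > tol:
            last_outside = ti          # remember the latest band violation
    if last_outside is None:
        return 0.0                     # signal never left the band
    for ti in t:
        if ti > last_outside:
            return ti - t_step         # first sample after the last violation
    return None

# Hypothetical step from 100 % to 10 % Pn, sampled once per second
time_s = [0, 1, 2, 3, 4, 5]
power_pct = [100, 70, 40, 12, 10, 10]
print(settling_time(time_s, power_pct, t_step=0, target=10, tol=1))
```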

4.2.1 Recommendations

Rotor model and wind field in HiL system

The rotor model should represent the structural properties of the rotor in such a way that each blade is represented individually. However, the model depth MT0 from Table 3 is sufficient, as an influence of the model depth is not to be expected. Only a model which can be coupled with a turbulent 3D wind field should be used. The requirements on the wind field are identical to recommendations made in Sect. 4.1.

Since the core of this test is to check the function of the control system in the overall system, it is nevertheless recommended to simulate the “controlled system” as accurately as possible, of which the rotor model is a part. This is especially true for controllers that use, for example, the blade root moments as input.

Grid emulator

Standard requirements from Sect. 2.3 are sufficient.

Test execution

These tests can be performed at the test bench as in the field, without further restrictions.

4.3 Active power as function of frequency

The test verifies the capability of generating units to ride through rapid frequency changes without being disconnected from the grid [2]. It demonstrates that, in the respective frequency range, the generating unit reduces its momentary active power with a normatively defined gradient per Hz in the case of over-frequency, or increases it in the case of under-frequency.

The comparison between field and test bench measurement results shown in Fig. 4 shows very good agreement for both partial and full load tests. Slight differences do occur, but can be explained by the variance of the grid frequency, which is subject to fluctuations in the physical conditions of the locally available grid. Deviations between the curves in the range of the grid frequency between 50.00 Hz and 50.25 Hz result from the fact that in some test runs an additional grid frequency set point of 50.20 Hz was used, which is therefore to be regarded as an additional supporting point.

Fig. 4

Comparison of active power as function of frequency observed in the field, at the CWD and at DyNaLab; a full load P ≥ 80% Pn; b partial load P = 40 … 60% Pn

Table 5 shows the arithmetic mean values of the determined active power gradients as well as the maximum active power gradients dP(f) during frequency increase measured in the field and at test benches. In Table 6 repetitive measurements obtained from both test benches are compared. In all cases, the active power output is reduced with a gradient dP(f) of approximately 40% of PM/Hz after the grid frequency exceeds 50.2 Hz and the active power returns to the available active power after the grid frequency falls below 50.05 Hz with any gradient.

Table 5 Minimal and maximal active power gradient dP(f) during frequency increase (field and test bench)
Table 6 Minimal and maximal active power gradient dP(f) during frequency increase (comparison of test facilities)
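The over-frequency behavior described above (reduction with about 40% of PM per Hz above 50.2 Hz, release below 50.05 Hz) can be sketched as a small state machine. This is an illustrative model, not the turbine's actual controller. It assumes that PM is the active power at the moment the frequency exceeds 50.2 Hz and that the output remains capped at PM until the release threshold is reached. The class and parameter names are hypothetical.

```python
class OverFrequencyCurtailment:
    """Illustrative P(f) characteristic with release hysteresis."""

    def __init__(self, gradient_per_hz=0.40, f_start_hz=50.2, f_release_hz=50.05):
        self.gradient_per_hz = gradient_per_hz
        self.f_start_hz = f_start_hz
        self.f_release_hz = f_release_hz
        self.p_m = None  # frozen reference power while curtailment is active

    def step(self, f_hz, p_available):
        """Return the active power output for the current frequency sample."""
        if self.p_m is None and f_hz > self.f_start_hz:
            self.p_m = p_available            # freeze PM when crossing 50.2 Hz
        elif self.p_m is not None and f_hz < self.f_release_hz:
            self.p_m = None                   # release below 50.05 Hz
        if self.p_m is None:
            return p_available
        reduction = self.gradient_per_hz * self.p_m * max(0.0, f_hz - self.f_start_hz)
        return max(0.0, min(p_available, self.p_m - reduction))

ctrl = OverFrequencyCurtailment()
for f, p_avail in [(50.0, 0.8), (50.7, 0.8), (50.1, 1.0), (50.0, 1.0)]:
    print(f"f = {f:.2f} Hz -> P = {ctrl.step(f, p_avail):.2f} p.u.")
```

In this sketch the return to the available power is instantaneous, which corresponds to the statement that the gradient after release is not prescribed.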

4.3.1 Recommendations

Rotor model and wind field in HiL system

The model depth MT0 from Table 3 is sufficient, as an influence of the model depth is not to be expected. A laminar wind field is sufficient.

Only if the power control of the generating unit is implemented on independently operating control loops P(f) and P(set point), the necessity of a turbulent wind field has to be discussed.

Grid emulator

Beyond standard requirements from Sect. 2.3, we recommend that the grid emulator is able to provide a rate of change of the grid frequency of at least 5 Hz/s.

In general, we recommend using the grid emulator’s ability to provide a “real” change of the grid frequency, which is unique to test bench tests.

Test execution

This test can be performed as in the field, without further restrictions.

In addition, the consortium recommends carrying out this test with a disturbed grid, e.g. with a THD of 8% according to [15]. This ensures that the control is robust against perturbations that prevent the frequency from being measured perfectly.
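For orientation, the total harmonic distortion of a voltage can be computed from its harmonic magnitudes as the root sum of squares of the harmonics divided by the fundamental. The sketch below uses this standard definition. The harmonic spectrum in the example is purely illustrative and is not the disturbance profile specified in [15].

```python
import math

def total_harmonic_distortion(harmonics, fundamental):
    """THD = sqrt(sum of squared harmonic magnitudes) / fundamental magnitude.

    harmonics: mapping of harmonic order (>= 2) to magnitude, in the same unit
    as the fundamental.
    """
    return math.sqrt(sum(u ** 2 for u in harmonics.values())) / fundamental

# Illustrative spectrum in per unit of the fundamental (assumed values)
spectrum = {5: 0.06, 7: 0.05, 11: 0.018}
thd = total_harmonic_distortion(spectrum, 1.0)
print(f"THD = {100 * thd:.1f} %")
```

With these assumed magnitudes the result is close to the 8% level mentioned above, which gives an impression of how strongly distorted such a test grid is.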

4.4 Power gradient and reconnection time after voltage loss

This test pursues the goal of determining the active power gradient and the reconnection time of a generating unit after a loss of voltage has occurred. It must be proven that the wind turbine’s control complies with the normative requirements regarding functionality, prioritization, accuracy and dynamics.

For the test on the test bench, the parameter for the active power gradient after loss of voltage “Gradient after power failure” was set to 16 kW/s (50% Pn/min) in the wind turbine controller. Table 7 shows the results of the maximum and mean active power gradients as well as the parameterization used for test bench tests and in the field.

Table 7 Maximum and mean gradient of the active power after loss of power (33% to 66% Pn/min)

4.4.1 Recommendations

Rotor and wind models in HiL system

The model depth MT0 from Table 3 is sufficient, as no influence of the model depth is expected. The wind field must be turbulent.

Grid emulator

Standard requirements from Sect. 2.3 are sufficient.

Test execution

This test can be performed as in the field, without further restrictions.

To perform this measurement, it is necessary that the state machines of the wind turbine control and the test bench allow the wind turbine to switch back on automatically after voltage loss. In case of unfavorable tuning, an interlocking can occur which prevents the execution of such a test.

5 Reactive power

According to TR3 Rev. 25 [2], measurements in the complete power range were carried out. The HiL-Systems and grid emulators at CWD and at DyNaLab were operated in standard configuration (cf. Sects. 2.3 and 2.4). Unlike in the previous chapter, all tests regarding reactive power discussed in the following sections lead to the same recommendations for the test benches. They are therefore stated once at the end of this chapter in Sect. 5.5.

5.1 Reactive power characteristic (Q = 0)

This test aims at determining the wind turbine’s reactive power behavior in normal operation, with its reactive power reference set to zero [2]. In Fig. 5 the results derived in the field, at CWD and at DyNaLab are plotted. At the CWD the test was executed at the grid emulator (named “CWD-Emulator”) and at the public grid (“CWD-Grid”).

Fig. 5

Comparison of reactive power behavior observed in the field, at the CWD (using grid emulator and public grid) and at DyNaLab

The overall comparison of the results shows only slight differences between field tests and test bench tests. These differences, which were found in the higher power range at the CWD, are plausible as they are due to the fact that the reactive power controller of the wind turbine uses a different calculation method to derive the phase shift than the certification measurement. This is why we obtained almost identical results at the grid emulator and at the public grid at the CWD.

5.2 Reactive power capability

This measurement aims at determining the maximum capacitive and inductive reactive power supply of the wind turbine [2]. For this, the wind turbine is operated using different reactive power ranges. Since the comparisons of these configurations lead to the same conclusions, Fig. 6 shows an exemplary comparison of tests using ENERCON’s so-called FT configuration.

Fig. 6

PQ—Diagram of FT configuration comparing field, CWD and DyNaLab measurements

The comparison shows that the final values were not exactly achieved. The difference from the manufacturer’s declaration can again be explained by the differing phase shift calculation methods of the wind turbine’s reactive power control and the certification measurement, as already mentioned in Sect. 5.1.

5.3 Voltage dependency of PQ diagram

With this test, the manufacturer’s declaration is verified by measuring single operating points of the voltage-dependent P‑Q diagram [2].

The solid lines in Fig. 7 represent the minimum requirements for testing U < Un stated in the manufacturer’s declaration. In this example we consider voltages of 85%, 90% and 95%. The comparison of these operating points shows very comparable behaviour at the test benches. Evaluating the tests with respect to TR3 requirements, the test for positive reactive power supply is passed. The test for negative reactive power supply is not passed. Although the measurements are nearly identical, they are located inside the limiting curve regarding reactive power capability, due to a very small safety margin. As already stated in the previous sections, this is due to the different phase shift calculation methods of the wind turbine’s reactive power control and the certification measurement. For certification purposes, the data from the manufacturer’s declaration can be reduced by the amount of the maximum undercutting in consultation with the certification body. Therefore, this is not an obstacle in principle to carrying out such a test at a test bench.

Fig. 7

PQ—Diagram for U < Un and FTQ-configuration comparing field, CWD and DyNaLab measurements

5.4 Reactive power control

In this test, the reaction of the wind turbine to a reactive power set point change is determined with respect to setting accuracy and settling time [2]. For this, the wind turbine is initially operated in the partial load range at reactive power set to 0 kvar before different reference steps are applied.

Setting accuracy

The setting accuracy is determined by changing the reactive power reference stepwise from 0 kvar to 50% of its maximum capacitive reactive power (step 1), to 50% of maximum inductive reactive power (step 2) and back to 0 kvar (step 3). Here, each step must be held for a duration of 120 s.

Fig. 8 compares the measured set point accuracy for all three reference steps for field and test bench measurements. The dashed red lines represent the requirements according to TR3 Rev. 24 and 25. These requirements are met in the field and at the test benches. However, the measured reactive power at the test benches tends to be lower than in the field. This is due to the fact that the reactive current injection influences the grid emulator’s voltage.

Fig. 8

Set point accuracy for reactive power control

Settling time

The settling time is determined by changing the reactive power reference in three steps from 0 kvar to maximum capacitive reactive power, to maximum inductive reactive power and back to 0 kvar. Each step is held until the reactive power has completely settled plus a duration of 10 s. The results are given in Fig. 9 and show only minor deviations.

Fig. 9

Settling time after reference signal step for reactive power controller

5.5 Overall recommendations for reactive power testing (steady-state operation and control performance)

These recommendations hold true for the reactive power test items discussed before. Reactive power test items such as voltage- and power-dependent reactive power control (Q(U), Q(P)) were not experimentally tested in the project. Still, the authors expect that the following recommendations also hold true for these test items.

Rotor model and wind field in HiL system

The model depth MT0 from Table 3 is sufficient, as an influence of the model depth is not to be expected. The wind can be laminar.

Grid emulator

In addition to the standard requirements from Sect. 2.3, the grid emulator has to cover the DUT’s maximum reactive power supply. Furthermore, we recommend that the grid emulator controls the PCC voltage in such a way that it is not influenced by the wind turbine changing its reactive power supply. The only exception is the test item “Reactive Power Control”, for which we instead recommend emulating a realistic grid impedance.

Test execution

Tests covering Reactive Power Capability can be performed as in the field, without further restrictions.

For the test “Voltage dependency of PQ diagram” (Sect. 5.3) we propose measuring individual operating points of the PQ diagram by changing the grid emulator voltage. We suggest agreeing with the certification body on six to eight operating points, chosen in accordance with the tested grid code and the manufacturer’s declaration.

6 Power quality aspects

6.1 Flicker during normal operation

This test aims at determining the flicker coefficient as a function of the phase angle of the grid impedance and the active power of the wind turbine in normal operation [2]. As this test results in many different values, due to various different phase angles and active power, we focus on an exemplary comparison at a phase angle of 30° in the following. The results are based on the same measurements which were used e.g. for maximum power peaks (Sect. 4.1).

The comparison of field and test bench measurements given in Fig. 10a shows decent agreement. As the graphs show, the flicker values are higher at the test bench, while the general characteristic is reproduced well. The variation of the CWD measurements is smaller than that of the DyNaLab measurements. This is because the CWD measurements were recorded for identical wind fields, in order to demonstrate general reproducibility of test bench results. The increase of the flicker values around 6 m/s is reflected better in the IWES data, while the CWD data exhibit an increase at a slightly lower wind speed. For all measurements the flicker value is well within acceptable limits.

Fig. 10

a Flicker values for a phase angle of 30° during normal operation for field and test bench measurements. b Flicker values for turbulent and laminar inflow

The graphs in Fig. 10b show the influence of the wind field. The results at CWD and at DyNaLab consistently indicate that a laminar wind field leads to an underestimation of the flicker coefficients and is hence not sufficient.

In contrast, the model depth, which is compared in Fig. 11, does not have a significant impact on the measurement results for this specific wind turbine.

Fig. 11

Comparison of the flicker coefficient when different model depths are considered

6.1.1 Recommendations

Rotor and wind models in HiL system

We recommend using rotor models that correspond to the model depth MT10. The wind field must be turbulent.

We are aware that the results seem to allow the conclusion that model depth MT0 is sufficient. As a general recommendation this is misleading, as in this specific case the dynamic behavior (coupled eigenmodes) that the tested wind turbine exhibits in the frequency range where flicker is most sensitive is not critical for the grid connection. If load simulations of the tested wind turbine indicate that it does not show relevant eigenmodes in the flicker-sensitive range, the use of models with a lower model depth such as MT2 may be acceptable.

We point out that the question of whether critical flicker behavior is emulated correctly by the HiL-System and the test bench cannot be answered conclusively on this basis. Further experimental analyses at the test benches suggest that this is the case, but there has been no proof yet.

Grid emulator

Standard requirements from Sect. 2.3 are sufficient. Because the feedback of voltage changes on the current can influence the characteristic values, we recommend that a typical grid impedance, e.g. according to FGW TR3 [2], is emulated or otherwise provided.

Test execution

A flicker measurement conducted on a test bench allows certain statements to be made about the flicker behavior of a wind turbine. However, these results can be subject to considerable uncertainty because of the complex influences that determine the flicker behavior. Therefore, a field measurement is recommended for evaluating flicker within the framework of grid conformity.

6.2 Harmonics

This test is used to determine the harmonic distortion during continuous operation of the wind turbine [2]. The results presented in this section are based on the same measurements as used for Active Power Peaks (Sect. 4.1) and Flicker (Sect. 6.1). Due to the sheer amount of available data, we again limit ourselves to a few selected evaluations: the comparison of field and test bench results, and the impact of model depth and wind field.
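To illustrate the core of such an evaluation, the following minimal sketch extracts harmonic RMS magnitudes from a sampled current with a single-bin discrete Fourier transform over an integer number of fundamental periods. It is a simplification under stated assumptions: the 10-cycle window, the sampling rate, the function name `harmonic_rms` and the synthetic signal are illustrative choices, and the grouping, averaging and power binning required by the applicable guideline are omitted.

```python
import cmath
import math


def harmonic_rms(samples, fs, f1=50.0, max_order=40):
    """Return {order: RMS magnitude} for harmonics 1..max_order.

    samples : sequence of instantaneous values
    fs      : sampling rate in Hz
    f1      : fundamental frequency in Hz (assumed constant over the window)
    """
    n = len(samples)
    cycles = n * f1 / fs
    if round(cycles) < 1 or abs(cycles - round(cycles)) > 1e-9:
        raise ValueError("window must span an integer number of fundamental periods")
    if max_order * f1 >= fs / 2:
        raise ValueError("sampling rate too low for the requested harmonic order")
    rms = {}
    for h in range(1, max_order + 1):
        w = 2.0 * math.pi * h * f1 / fs
        # Single-bin DFT at the harmonic frequency h * f1
        acc = sum(x * cmath.exp(-1j * w * k) for k, x in enumerate(samples))
        # |acc| * 2 / n is the peak amplitude; divide by sqrt(2) for RMS
        rms[h] = abs(acc) * math.sqrt(2.0) / n
    return rms


# Synthetic test current (hypothetical): 100 A fundamental, 3 % 5th, 1.5 % 7th harmonic
FS = 10_000.0           # sampling rate in Hz (assumption)
N = 2000                # 10 fundamental periods at 50 Hz
signal = [
    math.sqrt(2.0) * (
        100.0 * math.sin(2 * math.pi * 50 * k / FS)
        + 3.0 * math.sin(2 * math.pi * 250 * k / FS + 0.3)
        + 1.5 * math.sin(2 * math.pi * 350 * k / FS)
    )
    for k in range(N)
]
rms = harmonic_rms(signal, FS)
percent = {h: 100.0 * rms[h] / rms[1] for h in rms}
print({h: round(percent[h], 3) for h in (3, 5, 7)})
```

In a real measurement the fundamental frequency drifts, so the window has to be synchronised to the actual grid or emulator frequency; otherwise spectral leakage distorts the individual harmonic values.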

Comparison test bench and field

The comparison of field and test bench results is given in Fig. 12. Before going into detail, it is worth stating that all measurements meet the limits relevant for the type certification. Still, differences exist, as the graphs show.

Fig. 12 Comparison of harmonic distortion measured at CWD (grid emulator and public grid), at DyNaLab and in the field

Differences occur at the 5th, 7th, 25th, 31st and 35th harmonics. Here, field measurements and test bench measurements differ significantly and the test bench measurements also differ from one another. This can also be observed in the measurement results of the wind turbine at the CWD when the DUT is connected to the public grid.

The differences in the measurements can be partially explained by different conditions at the different test benches and in the field. For example, the harmonic distortion originating from the grid emulators or from the public grid differed between the locations. The effective impedances of the grid emulators and the public grid were also different. These influences were outside the scope of this project and could not be investigated in detail.

Influence of model depth and wind field

The impact of the model depth shown in Fig. 13 is not significant. The observed changes are within the range of the measurement variation observed between repeated tests. The impact of the wind field is shown in Fig. 14 and is also not significant.

Fig. 13 Impact of model depth on the measured harmonic distortion at test benches

Fig. 14 Impact of turbulent and laminar wind for the highest model depth at the CWD grid emulator

6.2.1 Recommendations

Rotor and wind models in HiL system

We recommend using rotor models that correspond at least to the model depth MT0. The wind field can be laminar.

This measurement can be taken together with the measurement of the maximum power peaks. For practical reasons, and deviating from the minimum requirement stated above, we therefore recommend using model depth MT2 and a turbulent wind field. In addition, because the evaluation results are inconclusive, we recommend using models of higher model depth in any case.

Grid emulator

Due to the inconclusive evaluation results and the limits of the work in CertBench, no final recommendations can be stated here. Nonetheless, some thoughts on this topic are shared.

It is recommended that typical grid conditions prevail for the connected wind turbine. This means that if the wind turbine is connected directly to the grid emulator without a plant transformer, the typical inductive impedance of such a transformer should be provided by the grid emulator.

Furthermore, the harmonic emission of the grid emulator should not contain dominant components at the switching frequencies of an inverter. Likewise, the grid emulator should not generate atypical harmonics that do not usually occur in the public grid, for example higher even harmonics.

Test execution

It is recommended to perform the harmonic measurements on the public grid, because typical grid conditions are most likely to be achieved there.

7 Grid faults

The Under Voltage Ride Through (UVRT) and Over Voltage Ride Through (OVRT) tests comprise a variety of individual tests. Different voltage levels with corresponding fault durations are tested. In addition, there are two-phase and three-phase tests for different power feed-in levels of the wind turbine (no load, partial load, full load). Due to the sheer number of possible measurements and evaluations, it is not possible to cover all aspects in this contribution. Instead, we focus on a few results that are significant for deriving recommendations for test bench tests. More detailed analyses will be shared by the authors in other contributions.

7.1 Under voltage ride through

Two important characteristics of the UVRT test are the voltage curve and the voltage support provided by the reactive current fed in. This paper shows an exemplary comparison of the voltage curve in the positive and negative sequence for a dip to 25% under full load for two- and three-phase faults.

The tests were executed at the test benches at CWD and DyNaLab and in the field. At the test benches, the grid faults were created by means of the converter-based grid emulators. In the field measurement, the grid faults were created by means of a typical voltage-divider-based test unit.
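The principle of a voltage-divider-based test unit can be sketched as follows: a series impedance is placed between grid and wind turbine, and a short-circuit impedance is switched in at the turbine side, so that at no load the residual voltage follows the divider ratio. The sketch below is a simplified single-phase equivalent with an ideal grid; the function names and the impedance values are hypothetical and do not describe the test unit used in the field campaign.

```python
def residual_voltage(z_series: complex, z_short: complex) -> float:
    """No-load residual voltage in pu for an ideal source behind z_series."""
    return abs(z_short / (z_series + z_short))


def short_impedance_for(target_pu: float, z_series: complex) -> complex:
    """Short-circuit impedance with the same X/R ratio as z_series.

    With equal impedance angles the divider ratio is real, so
    u = Zs / (Z1 + Zs) gives Zs = Z1 * u / (1 - u).
    """
    if not 0.0 <= target_pu < 1.0:
        raise ValueError("target residual voltage must be in [0, 1) pu")
    return z_series * target_pu / (1.0 - target_pu)


# Hypothetical per-unit series impedance of the test unit (assumption)
Z_SERIES = complex(0.02, 0.20)

z_short_25 = short_impedance_for(0.25, Z_SERIES)
u_res = residual_voltage(Z_SERIES, z_short_25)
print(f"Z_short = {z_short_25:.4f} pu, residual voltage = {u_res:.3f} pu")
```

Under load, the current fed in by the wind turbine flows through the same impedances and shifts the voltage at the turbine terminals away from this no-load value, which is why a converter-based grid emulator has to reproduce the impedance in order to obtain comparable voltage curves.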

Three-phase dip to 25% residual voltage

Fig. 15 compares the positive-sequence voltage at the three test facilities for a three-phase dip to 25% residual voltage. During the voltage dip, the voltages agree closely: the maximum deviation in the resulting voltage depth between the test facilities is less than 1%. At all test facilities, the voltage first drops to 25% before the reactive current feed-in causes the required voltage increase to 32%. During the voltage recovery, the two test bench measurements deviate from the field measurement. At both test benches the voltage reaches the pre-fault value directly, while in the field it converges slowly. This is due to the saturation of the wind turbine’s transformer. As the grid emulators of both test benches also use large transformers, they have dedicated flux controllers, which counteract the saturation of their transformers. Unfortunately, this flux control also influences the behavior of the wind turbine’s transformer, so that the saturation of that transformer is lower on both test benches than in the field measurement.

Fig. 15 Positive sequence voltage for a three-phase dip to 25% UN under full load operation of the wind turbine

Two-phase dip to 25% residual voltage

Figs. 16 and 17 show the positive and negative sequence voltage curves, respectively, for a two-phase test to 25% residual voltage. The positive and negative sequence voltages reach the expected higher values, since the depth of the dip is calculated from the difference between the positive and the negative sequence voltage. The measured sequence voltages of the three test facilities behave consistently. The maximum deviation of the resulting positive sequence voltage is between 2% and 3%. Because of the higher resulting voltage in the positive sequence, the wind turbine transformer does not saturate when the fault is cleared in the field. For this reason, the three test facilities show a consistent behavior even during fault clearance. The only noticeable difference is that the pre- and post-fault voltage at the CWD is lower than at the other two test facilities. This is due to the series impedance of the LVRT container, which is not emulated at the CWD. In contrast to the CWD, the test bench at IWES emulated the complete behavior of the LVRT container, including the connection and disconnection of the series impedance. The three negative-sequence voltage curves show a high degree of agreement in all areas; the maximum deviation between the test facilities is below 2%.

Fig. 16 Positive sequence voltage for a two-phase dip to 25% UN under full load operation of the wind turbine

Fig. 17 Negative sequence voltage for a two-phase dip to 25% UN under full load operation of the wind turbine
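The relation between dip depth and sequence voltages used above can be checked with a small symmetrical-components calculation. The sketch below decomposes three phase-to-neutral phasors into zero, positive and negative sequence. The example phasors describe an idealized two-phase dip at no load with phase a unaffected; they are illustrative assumptions and not measured values from the test campaign.

```python
import cmath
import math

# Rotation operator a = 1 at an angle of 120 degrees
A = cmath.exp(2j * math.pi / 3)


def symmetrical_components(va: complex, vb: complex, vc: complex):
    """Return (zero, positive, negative) sequence phasors of a three-phase set."""
    v0 = (va + vb + vc) / 3
    v1 = (va + A * vb + A**2 * vc) / 3
    v2 = (va + A**2 * vb + A * vc) / 3
    return v0, v1, v2


# Idealized two-phase dip with a characteristic voltage of 0.25 pu (assumption)
V_CHAR = 0.25
va = complex(1.0, 0.0)
vb = complex(-0.5, -math.sqrt(3) / 2 * V_CHAR)
vc = complex(-0.5, +math.sqrt(3) / 2 * V_CHAR)

v0, v1, v2 = symmetrical_components(va, vb, vc)
print(f"|V1| = {abs(v1):.3f} pu, |V2| = {abs(v2):.3f} pu, "
      f"|V1| - |V2| = {abs(v1) - abs(v2):.3f} pu")
```

For this idealized case the positive sequence voltage is well above the 25% dip depth, and the difference between positive and negative sequence magnitude returns the dip depth, in line with the statement in the text. Measured curves deviate from these values because of the impedances involved and the current fed in by the wind turbine.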

Comparison of reactive current injection

Fig. 18 shows the reactive current feed-in and the corresponding tolerances according to FGW TR3 for all symmetrical dips under full load. The tolerance is met at all test facilities. Overall, the reactive current fed into the system shows a high degree of agreement. The maximum deviation for dip depths to a residual voltage of 73% is 3.5%.

Fig. 18 Comparison of reactive current injection for three-phase full load tests

In summary, provided that a virtual impedance is emulated, both test benches achieve results comparable to the field measurement for the LVRT tests.

7.1.1 Recommendations

Rotor and wind models in HiL system

The model depth MT0 from Table 3 is sufficient. The wind field can be laminar.

Grid emulator

Beyond the standard requirements from Sect. 2.3, the grid emulators have to meet the requirements stated in FGW TR3 Rev. 25 [2] in the chapters on grid faults and test facilities. Naturally, the grid emulators need to be capable of producing the faults as specified in terms of fault duration and voltage dip. In addition, the grid emulator needs to meet the following requirements.

An impedance emulation is necessary to achieve behavior comparable to field measurements [8, 9]. We recommend using a strong grid for the tests with 0 pu residual voltage, i.e. the lowest possible value should be set for the grid impedance. For all other tests, a weak grid with a high grid impedance should be set to reflect the series impedance of an FRT container.

For two-phase faults, the grid emulator must be capable of processing voltage references at the PCC according to fault type C of Bollen [16]. On grid emulators, arbitrary phase jumps can be specified for two-phase faults; to achieve realistic fault behavior, fault type C according to Bollen must be specified.

In order to reproduce real switching behavior on the test bench, the grid emulator must be capable of clearing a three-phase fault via a two-phase fault.

For two-phase tests, the test equipment must be capable of causing a voltage dip either between phases “a” and “b”, “b” and “c” or “c” and “a”.

Test execution

For two-phase faults, the voltage references, i.e. the expected change in the magnitude and angle of the voltage for fault type C of Bollen [16], must be calculated in advance.
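Such a pre-calculation can be sketched with the textbook type C dip, in which the unaffected phase keeps its pre-fault voltage and the two faulted phases move towards each other: Va = E, Vb = -E/2 - j(√3/2)V, Vc = -E/2 + j(√3/2)V, with E the pre-fault voltage and V the characteristic voltage. The sketch below returns magnitude and phase jump per phase for each of the three possible phase pairs. Function names are hypothetical, and grid impedance and turbine current are neglected, so the result is a no-load reference and not the voltage that will actually be measured.

```python
import cmath
import math

A = cmath.exp(2j * math.pi / 3)                       # rotation by +120 degrees
NOMINAL_DEG = {"a": 0.0, "b": -120.0, "c": 120.0}     # pre-fault phase angles


def type_c_references(v_char: float, faulted_pair: str = "bc", pre_fault: float = 1.0):
    """Phase-to-neutral phasors (pu) of a type C dip between the given phase pair."""
    if not 0.0 <= v_char <= pre_fault:
        raise ValueError("characteristic voltage must be within [0, pre_fault]")
    h = math.sqrt(3.0) / 2.0
    healthy = complex(pre_fault, 0.0)
    lagging = complex(-0.5 * pre_fault, -h * v_char)
    leading = complex(-0.5 * pre_fault, +h * v_char)
    if faulted_pair == "bc":
        return {"a": healthy, "b": lagging, "c": leading}
    if faulted_pair == "ca":   # healthy phase b: rotate the pattern by -120 degrees
        return {"a": A**2 * leading, "b": A**2 * healthy, "c": A**2 * lagging}
    if faulted_pair == "ab":   # healthy phase c: rotate the pattern by +120 degrees
        return {"a": A * lagging, "b": A * leading, "c": A * healthy}
    raise ValueError("faulted_pair must be 'ab', 'bc' or 'ca'")


def magnitude_and_phase_jump(refs):
    """Return {phase: (magnitude in pu, phase jump in degrees)}."""
    table = {}
    for phase, v in refs.items():
        jump = math.degrees(cmath.phase(v)) - NOMINAL_DEG[phase]
        jump = (jump + 180.0) % 360.0 - 180.0          # wrap to [-180, 180)
        table[phase] = (abs(v), jump)
    return table


refs_bc = type_c_references(0.25, "bc")
table_bc = magnitude_and_phase_jump(refs_bc)
for phase, (mag, jump) in table_bc.items():
    print(f"phase {phase}: {mag:.3f} pu, phase jump {jump:+.2f} deg")
```

The phase-to-phase voltage between the two faulted phases equals √3 times the characteristic voltage, which offers a simple plausibility check for the references before they are loaded into the grid emulator.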

On the test benches, the saturation effects of the wind turbine transformer may be less pronounced than in the field. This is because the grid emulators themselves have large transformers that saturate when the voltage changes rapidly. This saturation is counteracted by special flux controllers inside the converter. These flux controllers also influence the behavior of the wind turbine transformer, so that LVRT tests on test benches may lead to a lower saturation of that transformer.

When conducting FRT tests on a test bench, there is a concern that the FRT capability of the entire turbine may be compromised by auxiliary equipment that has not been measured. For completely newly developed wind turbines, an additional measurement in the field is therefore recommended.

7.2 Over voltage ride through

For the OVRT test, the grid emulators have to meet the same requirements as for a UVRT test, except that, instead of a voltage dip to less than 5%, an overvoltage of 120% of the nominal voltage is required.

The OVRT measurement was carried out in the field and at the IWES according to FGW TR3 Rev. 24 and at the CWD according to Rev. 25. Revision 24 requires OVRT tests only up to 105% of the nominal voltage. To provide a field comparison, the evaluation is only carried out for this voltage level. Fig. 19 shows the comparison of the positive sequence voltages for a three-phase OVRT event at full load operation of the wind turbine. The no-load measurements show that all three test facilities can provide the required overvoltage of 105%. Under full load, the two test benches reach an overvoltage of more than 104%, while in the field the voltage drops to 103.5%. Fig. 20 shows the associated reactive current in the positive sequence system. The reactive current supply is almost identical on both test benches, while it is up to 2% lower in the field measurements. One cause of the deviation may be the active power of the wind turbine: it is exactly 1 pu on both test benches, whereas the weather conditions in the field only allowed a generation of 0.94 pu.

Fig. 19 Positive sequence voltage for a two-phase overvoltage to 105% UN under full load operation of the wind turbine

Fig. 20 Positive sequence reactive current for a two-phase overvoltage to 105% UN under full load operation of the wind turbine

The OVRT tests also show a high level of agreement, with deviations of less than 2% between the test benches and the field measurement. Since the OVRT tests in the field were only measured up to 105% of the nominal voltage, a final statement on comparability is not possible. Tests with significantly higher overvoltages are of particular interest.

7.2.1 Recommendations

Rotor and wind models in HiL system

The model depth MT0 from Table 3 is sufficient. The wind field can be laminar.

Grid emulator

The requirements are identical to those of the UVRT tests in Sect. 7.1.

Test execution

For two-phase faults, the voltage references, i.e. the expected change in the magnitude and angle of the voltage for fault type C of Bollen [16], must be calculated in advance.

8 Summary and conclusion

This paper gave a condensed and comprehensive overview of the certification measurement campaign of an ENERCON E‑115 E2 carried out at the system test benches at CWD and DyNaLab in compliance with FGW TR3 Rev. 25 [2]. As the first wind turbine in the world, the ENERCON E‑115 E2 received its type certificate solely on the basis of measurements carried out on a test bench with HiL-System and grid emulator.

In general, the comparison of test bench measurements with field measurements showed very good agreement, so that we conclude that certification measurements can in principle be carried out at system test benches for most test items without any restrictions. Nonetheless, some test items require more attention from the certification body and the manufacturer, or even accompanying field measurements. This is especially true for the measurement of harmonic distortion and flicker. Both test items point directly at the need for further investigations. For harmonic distortion, we need to develop new methods for analysis with the help of grid emulators. For flicker, we need to gain a more comprehensive understanding of the root causes and the interaction with the structural dynamics of the wind turbine, which are emulated at the test bench.

The tests in which the model depth and the wind field properties in the HiL system were varied indicate that for many tests a relatively simple rotor model seems to be sufficient for this kind of turbine. We also argued why this may not hold true for other wind turbine types. Furthermore, we found that turbulent wind conditions often play a significant role for the test results. For practical reasons and to minimize the uncertainty in the overall certification process, we recommend using a higher model depth where possible.

General remark on transferring results

Before transferring these results and statements to guidelines or certification campaigns of other wind turbines, some aspects need to be considered and may need individual discussion. It is important to realize that the tested wind turbine is a type 4 wind turbine. Some stated requirements, especially regarding model depth, may not hold true for other wind turbine types with significantly different dynamics. Furthermore, this paper did not discuss the impact of load-reducing controls, which are based for instance on blade root bending moments or other quantities. If such controls are part of the wind turbine control, the requirements on model depth may also differ from the statements given here.