Before analyzing phasor measurement unit (PMU) data, the first step is to check its consistency. PMU data should be checked for missing samples and for the consistency of the measurements themselves. Missing samples are timestamps for which no measurement value was recorded. Data consistency is harder to define because there are many possible inconsistencies, for example non-numeric values or outliers. However, a simple but effective approach is to check whether the samples of each measured variable lie within a certain confidence interval. The confidence interval is defined in terms of standard deviations and can be set by the user.
For the analysis, I used PMU data from the Distributed Electrical Systems Laboratory at EPFL. They provide open records of their PMU measurements with a time resolution of 20 ms for download (thanks to them). More info about the PMU activities at EPFL can be found here.
PMU data consistency check
In the following, I describe how to check PMU data consistency. The approach is simple but effective.
Evaluate missing timestamps
- Generate a time array (time_ed) that contains all supposed timestamps from the minimum to the maximum of the measurement time with the time resolution of the PMU data.
- Check which timestamps are not in the measurement time. I recommend rounding the time vectors to the decimal of the time resolution (e.g. Ts = 0.02 means 2 decimals) to avoid precision errors.
- time_ed … generated time array with all equidistant timestamps (seconds)
- Ts … time resolution (seconds)
- time … time of the PMU measurements (seconds)
- dec … decimals of the time resolution
- t_miss … missing timestamps (seconds)
#evaluate missing timestamps; time and Ts must be in seconds
import numpy as np
time_ed = np.arange(min(time), max(time) + Ts/2, Ts) # equidistant/supposed timestamps; the Ts/2 offset ensures the maximum timestamp is included
t_miss = np.setdiff1d(np.round(time_ed, dec), np.round(time, dec)) # find the missing timestamps, rounding to avoid precision errors
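As a quick sanity check, the two lines above can be exercised on a small synthetic time vector (the sample values below are made up for illustration and are not the EPFL data):

```python
import numpy as np

Ts = 0.02   # 20 ms time resolution
dec = 2     # decimals of the time resolution
# synthetic measurement time with two samples (0.04 s and 0.10 s) missing
time = np.array([0.00, 0.02, 0.06, 0.08, 0.12])

time_ed = np.arange(min(time), max(time) + Ts/2, Ts)  # supposed timestamps 0.00 ... 0.12
t_miss = np.setdiff1d(np.round(time_ed, dec), np.round(time, dec))
print(t_miss)  # the two missing timestamps
```

Rounding before the set difference is what makes this robust: without it, floating-point accumulation in np.arange would make otherwise identical timestamps compare as unequal.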
Consistency check of measurements
Typically, PMU measurements include the magnitude and phase of the voltages and currents, and the frequency. Usually, voltages are recorded as phase voltages. That gives seven measured variables to check: three voltage phasors, three current phasors, and the frequency. The good news is that all of them can be checked with the same logic.
So I defined a function that evaluates the distribution of the measured variable in terms of mean and standard deviation. Then I check which measurement samples fall outside the defined confidence interval. The width of the confidence interval is defined as a multiple of the standard deviation. Samples outside the confidence interval are considered inconsistent (I also call them flagged samples) and must be checked in more detail. The number of flagged samples also depends on the width of the confidence interval: the narrower the interval, the more samples are flagged, and vice versa.
#check which samples are outside the defined confidence interval (mean +/- standard deviation x n_std)
def check_consistency(Var, n_std):
    Var_mean = np.mean(Var)
    Var_std = np.std(Var)
    Var_mean_mnstd = Var_mean - Var_std * n_std  # lower bound of the confidence interval
    Var_mean_pnstd = Var_mean + Var_std * n_std  # upper bound of the confidence interval
    idx_Var_check = (Var < Var_mean_mnstd) | (Var > Var_mean_pnstd)  # boolean mask of flagged samples
    return idx_Var_check
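A minimal, self-contained sketch of how such a check behaves in practice (the variable name, nominal value, and injected dip below are assumptions for illustration, not the EPFL data):

```python
import numpy as np

def check_consistency(Var, n_std):
    """Boolean mask of samples outside mean +/- n_std standard deviations."""
    Var_mean, Var_std = np.mean(Var), np.std(Var)
    return (Var < Var_mean - n_std * Var_std) | (Var > Var_mean + n_std * Var_std)

rng = np.random.default_rng(0)
n_std = 4  # width of the confidence interval in standard deviations

# synthetic voltage magnitude: nominal 230 V phase voltage with measurement noise
Va_mag = 230.0 + rng.normal(0.0, 0.5, 1000)
Va_mag[100] = 200.0  # inject an obvious outlier (a voltage dip)

flagged = check_consistency(Va_mag, n_std)
print(f"flagged samples: {np.sum(flagged)}")  # only the injected dip is flagged
```

The same function can simply be called once per measured variable (voltage magnitudes and phases, current magnitudes and phases, frequency).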
As mentioned before, I used PMU data from EPFL to showcase the consistency check approach. In the following, you can find the results of the analysis.
The screenshot below shows the compiled results of the consistency check. First, it prints the name of the data file and the defined confidence interval in terms of standard deviations. Then it shows the number of missing samples (timestamps) and flagged samples (which are outside the confidence interval) for each measured variable.
The analysis was carried out with a confidence interval of +/- 4 standard deviations. The results also refer to plots one to three (find them below) that show the missing samples and the flagged ones for each measurement. The missing samples are the same for all measurements.
Here, I show the plots of the measurements including the missing and flagged samples.
Plot 1 shows the magnitude and phase of the voltages. The grey vertical lines mark the missing samples and the light brown lines the flagged samples. Looking at the voltage magnitudes, we can see an event where the voltage dips, and this dip was flagged by the analysis. However, the flagged values are not actual inconsistencies; the voltage dip was probably caused by a transformer tap change. In such a case, the confidence interval could be widened so that these “normal” events are not flagged.
Now you will probably ask why the phase of Va is always zero. It is because I defined it as the reference for all the other phases. If there were an inconsistent value in the phase of Va, the other two phases would reflect it as well, since they are referenced to the phase of Va.
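Referencing all phases to Va can be sketched as follows (the angle arrays are hypothetical values in radians; wrapping the difference to [-pi, pi) keeps the results readable):

```python
import numpy as np

# hypothetical measured phase angles in radians (not the EPFL data)
phase_a = np.array([0.10, 0.12, 0.11])
phase_b = phase_a - 2 * np.pi / 3  # ideally 120 degrees behind Va
phase_c = phase_a + 2 * np.pi / 3  # ideally 120 degrees ahead of Va

def ref_to_a(phase, phase_a):
    """Phase relative to Va, wrapped to [-pi, pi)."""
    d = phase - phase_a
    return (d + np.pi) % (2 * np.pi) - np.pi

print(ref_to_a(phase_a, phase_a))               # always zero by construction
print(np.degrees(ref_to_a(phase_b, phase_a)))   # approx. -120 degrees
```

This also makes the propagation effect explicit: any error in phase_a shows up in the referenced phase_b and phase_c, while the referenced phase_a stays at zero.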
The second plot shows the measured currents in magnitude and phase, again with their missing and flagged samples. Two current peaks are flagged. Interestingly, the current peaks do not coincide with the voltage dip, so there must have been some other switching event or similar.
The third plot shows the frequency and its missing and flagged samples. In this case, the frequency does not have any flagged samples as it doesn’t show any extreme deviations from the nominal frequency.
Plots four and five are a bonus. They show the polar plots of the voltages and currents. I added them because they give a good visual impression of the measurements’ plausibility. Look at the left plot showing the voltages: the magnitudes are clearly around their nominal value, and the phases are 120° apart at all times. If any of the magnitudes or phases were far off its supposed value, it would be clearly visible here. Similar logic applies to the currents; even though they have a larger bandwidth, large deviations would still be prominently visible.
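Such a polar plot can be sketched with matplotlib as follows (a minimal sketch with hypothetical, balanced three-phase phasors rather than the EPFL data):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for this sketch
import matplotlib.pyplot as plt

# hypothetical balanced voltage phasors: magnitudes in volts, phases in radians
mags = np.array([230.0, 231.0, 229.5])
phases = np.array([0.0, -2 * np.pi / 3, 2 * np.pi / 3])

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
for m, p in zip(mags, phases):
    ax.plot([0, p], [0, m])  # draw each phasor as a line from the origin
ax.set_title("Voltage phasors (plausibility check)")
fig.savefig("phasors.png")
```

With real PMU data, one line per sample (or a scatter of the phasor tips) would reveal at a glance whether any magnitude or phase drifts away from its nominal position.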
If you find this post insightful, you might also like the following: Inter-area oscillation analysis with PMU data.
More insights within power systems are to follow in the next article.