Getting an overview of the available data.

Data availability

After the data has been imported into the psysmon database, its a good point to get an overview of the imported data. This helps to check if the imported amount of data fits the expected data time range and to detect potentially missing data or possible fundamental problems with e.g. the data format.

On this page

Create the availability collection

Create a collection named availability and add the two collection nodes data inventory statistics and data availability to the collection. Next disable the collection node data availability. You can disable the collection node by first selecting the node data availability in the collection listbox and than clicking the menu entry disable node from the context menu in the collection listbox.

Disable the data availability using the context menu, which is accessible by clicking the right mouse button.
Disable the data availability using the context menu, which is accessible by clicking the right mouse button.

The name of the disabled collection node will appear in a light gray color and the icon of the node will be changed to a cross. A disabled collection node will be ignored when executing the collection.

The disabled data availability is marked with a gray text and a cross icon.
The disabled data availability is marked with a gray text and a cross icon.

Display the inventory statistics

Execute the collection with the data availability node disabled to open the inventory statistics dialog. This is a simple dialog showing some statistics of the database data inventory. Currently only the text view is available which shows a dictionary with the content available in the database. This overview can be used to evaluate, if the imported data has the expected content (e.g. all required recorder serials are available, the number of data files look reasonable, …).

The inventory statistics of the database.
The inventory statistics of the database.

Keep the window open, we will need the information for the next step.

Create data availability plots

Create an output directory

The data availability node creates images with diagrams showing the available data of selected station components and time spans. The images are saved in a directory that has to be specified in the collection node preferences. I created an output directory named psysmon_output/availability in the tutorial directory. This directory will be used to save the images created by the execution of the data availability node.

stefan@hausmeister:~/tutorial$ ls
dataset  psysmon  psysmon_projects  venv
stefan@hausmeister:~/tutorial$ mkdir psysmon_output
stefan@hausmeister:~/tutorial$ cd psysmon_output
stefan@hausmeister:~/tutorial/psysmon_output$ mkdir availability
stefan@hausmeister:~/tutorial/psysmon_output$ cd ..
stefan@hausmeister:~/tutorial$ tree -L 1
.
├── dataset
├── psysmon
├── psysmon_output
├── psysmon_projects
└── venv

5 directories, 0 files
stefan@hausmeister:~/tutorial$ tree psysmon_output
psysmon_output
└── availability

1 directory, 0 files
stefan@hausmeister:~/tutorial$ 

Enable the data availability collection node

Enable the data availability node and disable the data inventory statistics node using the context menus.

The disabled data inventory statistics node is marked with a gray text and a cross icon.
The disabled data inventory statistics node is marked with a gray text and a cross icon.

Edit the collection node preferences

Before executing the collection, the preferences of the data availability collection node have to be set. Open the collection node preferences by selecting the data availability node and clicking the menu entry edit node in the context menu. The preferences dialog of the selected collection node opens. Select all stations and channels to process and set the process time span according to the limits given in the data inventory statistics dialog. The start time is 2018-10-25T00:00:00 and the end time is 2019-11-02T00:00:00. Set the window mode to whole. After that, close the data inventory statistics dialog.

The window mode preference can be used to specify the time span into which the whole time span should be split. For each time window (e.g. daily, weekly,…) an individual availability image will be created. The range of the tutorial data set is not long, so we will plot the data availability of the complete time span.

The output directory /home/stefan/tutorial/psysmon_output can be set in the output page of the preferences dialog (change the file path to fit the output path on your file system). The list of available pages is given in the listbox at the left side of the preferences dialog.

The settings in the preferences manager should look like in the following two screenshots. Confirm the settings and close the preferences dialog by clicking the OK button.

The data availability preferences components page.
The data availability preferences components page.
The data availability preferences output page.
The data availability preferences output page.

Execute the collection

After the execution of the collection was started, no dialog window will appear. The data availability node only creates images in the specified output directory. You can check the status of the process in the processes tab of the log area in the psysmon main window.

Don’t forget to save the project.

Checking the availability plots

The results of a collection node will always be placed in a unique folder named using the uniform resource identifier of the collection and the timestamp of the execution. This prevents that results are overwritten when executing the same collection repeatedly (e.g. with changed preferences).

The following listing shows the data structure created when executing the above collection.

stefan@hausmeister:~/tutorial/psysmon_output$ tree -L 3
.
└── availability
    └── smi-mr.sm-psysmon-tutorial-availability_20220803_175105_997754-data_availability
        └── data_availability_20181025_000000_20181102_000000.png

2 directories, 1 file
stefan@hausmeister:~/tutorial/psysmon_output$ 

The resulting image data_availability_20181025_000000_20181102_000000.png shows the available data of the tutorial data set. Black lines mark the time when data is available in the database, broad orange-red lines mark time ranges when no data is available. It can be seen, that station PIL:DPZ:XX:A has no data for the complete time span and array MOR has some short data gaps from 2018-10-30 on.

The resulting image of the data availability collection node showing the available data of the tutorial data set.
The resulting image of the data availability collection node showing the available data of the tutorial data set.

This article is licensed under a Creative Commons Attribution-ShareAlike 4.0 International license.

You are allowed to share the material, that means to copy and redistribute the material in any medium or format as long as you give appropriate credit to the creator and add the link to the license. You are allowed to adapt, that means to remix, transform, and build upon the material. If you adapt the material, you must distribute your contributions under the same license as the original.

If possible, please cite this article using the following form:
Psysmon Documentation, Sonnblick Events, "Data availability", Stefan Mertl, 2022-08-27, www.mertl-research.at, licensed under CC BY-SA 4.0