Factry Historian
# Parquet Export
[Apache Parquet](https://parquet.apache.org/) is an open-source, column-oriented data file format designed for efficient data storage and retrieval. It provides efficient data compression and encoding schemes with enhanced performance to handle complex data in bulk.

Creating Parquet exports is currently an experimental feature, which means the feature is still subject to change. To create a Parquet export, make sure 'Experimental features' are enabled, then navigate to **Configuration > Parquet export**. The screen shows an overview of the created export schedules, and allows for the creation of a one-off export or the creation of an export schedule.

Currently, the Parquet export function only exports data that has a good quality status.

Exporting Parquet files can take a significant amount of resources; plan the scheduled exports accordingly. Exporting (many) measurements over large time ranges can take up a lot of storage; choose the export destination accordingly.

## One-off export

To create a one-off export task that only runs once, click the **Parquet export** button in the upper right corner. A modal will open where the export task can be configured:

- Select a time range for which to export data.
- Choose an interval to split up the range into multiple files.
- Choose a directory where the exported Parquet file(s) will be saved.
- Choose whether or not to allow overwriting existing files.
- Choose whether or not to include the full asset tree. When enabling this option, the relational structure of the asset tree is exported as well. This relational structure consists of a file containing the assets' data, a file containing the asset properties, and a file containing measurements data.
- Select the measurement(s) that need to be exported, either by choosing a set of labels or by selecting from the asset tree. When selecting from the asset tree, the selected asset structure is automatically exported as well.

## Export on a schedule

To create a recurring export, press the **Create task scheduler** button in the top right corner.

Fill in the task details:

- Choose a name and description for the scheduler.
- Write an rrule to define when and how often the task should be scheduled.

Fill in the export details:

- Fill in the start offset. The start offset determines the start of the interval for the exported data, relative to the rrule's trigger event.
- Fill in the period. The period defines how long the time period included in the export should be, starting from the start time.
- Fill in the end time offset. This value defines until what time the export should include data, relative to the rrule's trigger event.
- Choose an interval to split up the range into multiple files.
- Choose a directory where the exported Parquet file(s) will be saved.
- Choose whether or not to allow overwriting existing files.
- Choose whether or not to include the full asset tree. When enabling this option, the relational structure of the asset tree is exported as well. This relational structure consists of a file containing the assets' data, a file containing the asset properties, and a file containing measurements data.
- Select the measurement(s) that need to be exported, either by choosing a set of labels or by selecting from the asset tree. When selecting from the asset tree, the selected asset structure is automatically exported as well.
- Click **Save**.

### rrule

An rrule is a way to define a recurrence set: the rule defines a pattern to generate a series of timestamps for events. The syntax is defined by [RFC 5545, section 3.8.5.3](https://datatracker.ietf.org/doc/html/rfc5545#section-3.8.5.3). Use the [rrule tool](https://icalendar.org/rrule-tool.html) for help in creating rrules.

### Example

The following is an example configuration of an exporter. The rrule used is `RRULE:FREQ=HOURLY;INTERVAL=1;BYMINUTE=0;BYSECOND=0`, which triggers every hour, on the hour. By setting the start offset to `1h10m` and the period to `1h`, the scheduler will export an hour's worth of data, starting from 10 minutes before the previous hour until 10 minutes before the current hour. Setting the stop offset to `10m` would be equivalent to setting the period to `1h` in this example.

| Trigger at | Export start time | Export stop time |
| ---------- | ----------------- | ---------------- |
| 12:00      | 10:50             | 11:50            |
| 13:00      | 11:50             | 12:50            |
| 14:00      | 12:50             | 13:50            |

## Advanced settings

Advanced settings can be configured for both the one-off export and the scheduled export. These settings contain sane defaults, but they can be overwritten if the use case asks for it.

**Batch size**

- Default: 100,000
- Description: The number of records to process in each batch during export. A larger batch size can improve performance and compression ratio, but may use more memory. A Parquet file can contain multiple batches.

**Max rows per file**

- Default: 0
- Description: The maximum number of rows allowed in each Parquet file. If set to 0, there is no limit and all data of a measurement will be written to a single file.

**Parquet v2 data pages supported**

- Default: true
- Description: Enable support for Parquet v2 data pages, which can improve performance and reduce file size for certain workloads. Disable it when not supported by downstream pipelines.

**Compression codec**

- Default: `zstd`
- Options: `none`, `snappy`, `gzip`, `brotli`, `zstd`, `lz4_raw`
- Description: The compression codec to use for the Parquet files. Different codecs offer various trade-offs between compression ratio and speed.

**Compression level**

- Default: `default`
- Options: `best speed`, `default`, `better compression`, `best compression`
- Description: The level of compression to apply when using the selected codec. Higher levels typically result in better compression, but may take longer to process and require more system resources.

## Export progress

Creating a Parquet export can take some time, depending on the amount of data that needs to be exported. Task progress can be viewed in the task progress screen.
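The offset arithmetic in the example above can be sketched in plain Python. This is only an illustration of how the start offset and period relate to the rrule's trigger time, not code from the Historian itself; the helper name `export_window` is hypothetical.

```python
from datetime import datetime, timedelta

def export_window(trigger, start_offset, period):
    """Compute the exported data window for one rrule trigger:
    the window starts `start_offset` before the trigger and
    covers `period` of data from that start time."""
    start = trigger - start_offset
    return start, start + period

# Settings from the example: start offset 1h10m, period 1h.
start_offset = timedelta(hours=1, minutes=10)
period = timedelta(hours=1)

for hour in (12, 13, 14):
    trigger = datetime(2024, 1, 1, hour, 0)
    start, stop = export_window(trigger, start_offset, period)
    print(f"{trigger:%H:%M} -> {start:%H:%M} .. {stop:%H:%M}")
# 12:00 -> 10:50 .. 11:50
# 13:00 -> 11:50 .. 12:50
# 14:00 -> 12:50 .. 13:50
```

The printed windows reproduce the rows of the example table above.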

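For intuition about the example rrule `FREQ=HOURLY;INTERVAL=1;BYMINUTE=0;BYSECOND=0`, its trigger times can be emulated with plain datetime arithmetic. This is a sketch of that one specific rule only; real rrule handling should use an RFC 5545 implementation, and the function name `hourly_triggers` is hypothetical.

```python
from datetime import datetime, timedelta

def hourly_triggers(after, count):
    """Emulate RRULE:FREQ=HOURLY;INTERVAL=1;BYMINUTE=0;BYSECOND=0:
    return the next `count` trigger times, each on the hour,
    strictly after `after`."""
    # Truncate to the current hour, then step forward hour by hour.
    trigger = after.replace(minute=0, second=0, microsecond=0)
    triggers = []
    while len(triggers) < count:
        trigger += timedelta(hours=1)
        triggers.append(trigger)
    return triggers

print(hourly_triggers(datetime(2024, 1, 1, 11, 30), 3))
# triggers at 12:00, 13:00 and 14:00, as in the example table
```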

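Both export types split the selected time range into one Parquet file per interval. The effect of that interval setting can be sketched as follows; this is illustrative only, and the actual file naming and layout are determined by the exporter.

```python
from datetime import datetime, timedelta

def split_range(start, end, interval):
    """Split [start, end) into consecutive chunks of at most
    `interval` each; the exporter writes one file per chunk."""
    chunks = []
    t = start
    while t < end:
        chunks.append((t, min(t + interval, end)))
        t += interval
    return chunks

# A one-day range split into 6-hour intervals yields 4 files.
chunks = split_range(datetime(2024, 1, 1), datetime(2024, 1, 2),
                     timedelta(hours=6))
print(len(chunks))  # 4
```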