|
| 1 | +[PnetCDF](https://parallel-netcdf.github.io) Version 1.11.0 Release Notes (December 19, 2018) |
| 2 | +------------------------------------------------------------------------------ |
| 3 | + |
| 4 | +* New features |
| 5 | + + NetCDF-4 driver -- Accessing HDF5-based NetCDF-4 files is now supported. |
| 6 | + PnetCDF can be built on top of the NetCDF-4 library so that users can use
| 7 | + PnetCDF APIs to read and write NetCDF-4 files. Users can now add the
| 8 | + NC_NETCDF4 flag when calling ncmpi_create() to create NetCDF-4 files. For
| 9 | + opening NetCDF-4 files, no additional flag is needed, as PnetCDF detects the
| 10 | + file format automatically and uses the HDF5 I/O driver underneath. This
| 11 | + feature is provided for convenience. The parallel I/O performance to NetCDF-4
| 12 | + files is expected to be no different from using the NetCDF-4 library directly.
| 13 | + + Per-file thread-safe capability is added. This feature can be enabled at |
| 14 | + configure time by adding the command-line option `--enable-thread-safe`. In
| 15 | + addition, the option `--with-pthread` can be used to specify the installation
| 16 | + path of the pthreads library. This feature currently supports only
| 17 | + one-thread-per-file I/O operations and the classic CDF-1, 2, and 5 files. |
| 18 | + |
| 19 | +* New optimization |
| 20 | + + On some systems, e.g. Cori @NERSC, collective MPI-IO may perform poorly |
| 21 | + when the I/O buffer is noncontiguous, compared to a contiguous one. To |
| 22 | + avoid this, `ncmpi_wait()` and `ncmpi_wait_all()` now check whether the |
| 23 | + buffer is noncontiguous and its size is less than 16 MiB. If both are true, a
| 24 | + temporary contiguous buffer is allocated to copy the data over and used in |
| 25 | + the MPI read or write calls. The size of the buffer can be adjusted through |
| 26 | + a new hint `nc_ibuf_size`. See `New PnetCDF hint` below and |
| 27 | + [PR #26](https://github.com/Parallel-NetCDF/PnetCDF/pull/26). Programs
| 28 | + developed to test this issue are available at
| 29 | + https://github.com/Parallel-NetCDF/E3SM-IO/tree/master/mpi_io_test |
| 30 | + + The burst buffer driver is updated to run varn APIs more efficiently. The
| 31 | + previous implementation broke a single varn request into multiple vara
| 32 | + requests, which can be slow and require a large amount of metadata. The
| 33 | + driver now treats each varn request as a single entity. See
| 34 | + [PR #30](https://github.com/Parallel-NetCDF/PnetCDF/pull/30) and |
| 35 | + [PR #31](https://github.com/Parallel-NetCDF/PnetCDF/pull/31). |
| 36 | + |
| 37 | +* New limitations
| 38 | + + For creating new files, the NetCDF-4 driver in PnetCDF supports only the |
| 39 | + classic model I/O operations. Advanced NetCDF-4 features, such as chunking, |
| 40 | + compression, etc. are not supported in PnetCDF. This is due to the |
| 41 | + unavailability of PnetCDF APIs for those operations. |
| 42 | + + The burst buffering driver does not support NetCDF-4 file formats. |
| 43 | + + Due to a bug in HDF5 1.10.2 that causes zero-length write requests to record
| 44 | + variables to fail in collective mode, PnetCDF is not able to support such
| 45 | + requests when the NetCDF-4 feature is enabled. See the discussion in
| 46 | + https://github.com/NCAR/ParallelIO/pull/1304
| 47 | + The fix is included in the HDF5 1.10.4 release.
| 48 | + |
| 49 | +* Updated configure options
| 50 | + + Enable NetCDF-4 support. |
| 51 | + - `--enable-netcdf4`: enable NetCDF4 format classic mode support |
| 52 | + - `--with-netcdf4=/path/to/netcdf-4`: path to NetCDF-4 library installation |
| 53 | + + Enable multi-threading support. |
| 54 | + - `--enable-thread-safe`: enable per-file thread-safe support |
| 55 | + - `--with-pthread`: path to the pthread library installation |
| 56 | + |
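The options listed above might be used along the following lines. This is only a sketch: every path is a placeholder, `--prefix` is the standard autoconf install-location option rather than something specific to this release, and passing a path as the argument of `--with-pthread` is assumed from the option's description.

```shell
# Build with the NetCDF-4 driver enabled (library path is a placeholder).
./configure --prefix=/path/to/pnetcdf-install \
            --enable-netcdf4 \
            --with-netcdf4=/path/to/netcdf-4

# Build with per-file thread safety enabled (pthread path is a placeholder).
./configure --prefix=/path/to/pnetcdf-install \
            --enable-thread-safe \
            --with-pthread=/path/to/pthread
```

The two invocations are shown separately because the thread-safe feature is described as supporting only the classic CDF-1, 2, and 5 files.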
| 57 | +* New constants |
| 58 | + + none |
| 59 | + |
| 60 | +* New APIs |
| 61 | + + C++ API `NcmpiFile::set_fill()` is added for setting and inquiring the |
| 62 | + fill mode of an opened NetCDF file. |
| 63 | + |
| 64 | +* API syntax changes |
| 65 | + + none |
| 66 | + |
| 67 | +* API semantics updates |
| 68 | + + none |
| 69 | + |
| 70 | +* New error code precedence |
| 71 | + + none |
| 72 | + |
| 73 | +* Updated error strings |
| 74 | + + none |
| 75 | + |
| 76 | +* New error code |
| 77 | + + none |
| 78 | + |
| 79 | +* New PnetCDF hint |
| 80 | + + `nc_ibuf_size` -- sets the size of a temporary buffer that PnetCDF allocates
| 81 | + internally to pack noncontiguous user write buffers supplied to the
| 82 | + nonblocking requests into a contiguous space. Similarly, for reads, the
| 83 | + temporary buffer is unpacked into the user read buffers if they are
| 84 | + noncontiguous. This affects both blocking and nonblocking APIs. On some
| 85 | + systems, using noncontiguous user buffers in MPI collective read/write
| 86 | + functions performs significantly worse than using contiguous buffers. Note
| 87 | + that if the aggregated user buffers are larger than `nc_ibuf_size`,
| 88 | + packing/unpacking is disabled to save memory. The default value is 16 MiB.
| 89 | + |
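As a sketch of how this hint might be changed at run time without recompiling, PnetCDF hints can be supplied through the `PNETCDF_HINTS` environment variable. The 32 MiB figure is an arbitrary illustration, and the value is assumed to be a byte count.

```shell
# Raise the packing buffer limit from the default 16 MiB to 32 MiB.
# Assumption: the hint value is given in bytes.
IBUF_SIZE=$((32 * 1024 * 1024))
export PNETCDF_HINTS="nc_ibuf_size=${IBUF_SIZE}"
echo "$PNETCDF_HINTS"
```

The same key/value pair can alternatively be set on the MPI info object passed to ncmpi_create() or ncmpi_open().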
| 90 | +* New run-time environment variables |
| 91 | + + none |
| 92 | + |
| 93 | +* Build recipes |
| 94 | + + doc/README.NetCDF4.md is added to describe the usage of the new feature of |
| 95 | + NetCDF-4 support. |
| 96 | + |
| 97 | +* New/updated utility program |
| 98 | + + none |
| 99 | + |
| 100 | +* Other updates: |
| 101 | + + The automatic file layout alignment for fixed-size variables is disabled. |
| 102 | + This is because modern MPI-IO implementations already align file access
| 103 | + with the file lock boundaries, and the automatic alignment can create a
| 104 | + file view with "holes" in between variables, which can degrade I/O
| 105 | + performance. Users can still set hints `nc_header_align_size`,
| 106 | + `nc_var_align_size`, and `nc_record_align_size` to use customized alignment |
| 107 | + sizes. |
| 108 | + + The internal data buffering mechanism used in the burst buffer driver is |
| 109 | + removed. This mechanism cached the request data in memory until the
| 110 | + accumulated size exceeded 8 MiB, so that write requests to burst buffers
| 111 | + could be aligned with 8 MiB boundaries. However, experiments on Cray DataWarp
| 112 | + show a negligible performance improvement unless the I/O requests are small
| 113 | + and fragmented. On the other hand, it can degrade performance for mid- and
| 114 | + large-sized requests. The burst buffer driver now writes directly to the |
| 115 | + burst buffers for each user write request. |
| 116 | + |
| 117 | +* Bug fixes |
| 118 | + + Fix a bug in checking interleaved requests for scalar variables. See
| 119 | + [PR #27](https://github.com/Parallel-NetCDF/PnetCDF/pull/27). |
| 120 | + + When building PnetCDF using the IBM xlc compiler with -O optimization |
| 121 | + option on Little Endian platforms, users may encounter errors related to |
| 122 | + strict ANSI C aliasing rules. Thanks to Jim Edwards for reporting and Rafik |
| 123 | + Zurob for providing the fix. See |
| 124 | + [Issue #23](https://github.com/Parallel-NetCDF/PnetCDF/issues/23) and |
| 125 | + [Pull Request #24](https://github.com/Parallel-NetCDF/PnetCDF/issues/24). |
| 126 | + + The ksh shell redirects stdout and stderr differently from bash, and
| 127 | + PnetCDF configure.ac and acinclude.m4 have been developed mainly on bash.
| 128 | + This difference can cause the configure command to fail under ksh. Thanks to
| 129 | + @poohRui for reporting the bug. See
| 130 | + [Issue #21](https://github.com/Parallel-NetCDF/PnetCDF/issues/21) and
| 131 | + [PR #22](https://github.com/Parallel-NetCDF/PnetCDF/pull/22).
| 132 | + However, running configure under ksh is still buggy. A GNU automake bug
| 133 | + report of a hanging problem can be found at
| 134 | + https://lists.gnu.org/archive/html/bug-automake/2015-04/msg00000.html
| 135 | + PnetCDF users are advised to run configure under other shells.
| 136 | + + For put and get APIs when buftype is MPI_DATATYPE_NULL, bufcount is |
| 137 | + ignored. This was not implemented correctly in blocking put and get APIs.
| 138 | + See bug fix committed on Aug. 25, 2018.
| 139 | + + ncmpidiff -- fix the comparison of two files that contain record variables
| 140 | + but no written records. See bug fix committed on Aug. 25, 2018.
| 141 | + + ncmpidiff -- when comparing two scalar variables, error NC_EBADDIM may
| 142 | + be mistakenly reported. See bug fix committed on Aug. 12, 2018.
| 143 | + + When the MPI communicator used in ncmpi_create or ncmpi_open is freed by |
| 144 | + the user after the call and before the file is closed, programs would crash
| 145 | + at ncmpi_close with the MPI error "Invalid communicator". The fix moves the
| 146 | + duplication of MPI communicator to the place before calling driver create |
| 147 | + and open subroutines. See bug fix committed on Jul 21, 2018. |
| 148 | + |
| 149 | +* New example programs |
| 150 | + + examples/C/time_var.c and examples/F77/time_var.f - show how to define, |
| 151 | + write, and read record variables. |
| 152 | + + examples/C/pthread.c - demonstrates one-file-per-thread I/O.
| 153 | + When running on some parallel machines, users may need to set certain
| 154 | + environment variables to enable MPI multi-threading support, for example on
| 155 | + Cori @NERSC with the command
| 156 | + ``` |
| 157 | + export MPICH_MAX_THREAD_SAFETY=multiple |
| 158 | + ``` |
| 159 | + + examples/C/transpose2D.c - a 2D version of examples/C/transpose.c |
| 160 | +
|
| 161 | +* New programs for I/O benchmarks |
| 162 | + + none |
| 163 | +
|
| 164 | +* New test program |
| 165 | + + test/F90/test_fill.f90 - another test for bug fix r3730. |
| 166 | + + test/testcases/error_precedence.m4 - tests the error code reporting |
| 167 | + precedence |
| 168 | + + test/nc4/tst_zero_req.c - tests an HDF5 1.10.2 bug that causes the test
| 169 | + program to hang when writing to and reading back a 2D record variable in
| 170 | + collective mode with some of the processes making zero-length requests.
| 171 | + + test/nc4/put_get_all_kinds.m4 - tests all supported variable read/write
| 172 | + APIs to make sure they are properly wired up
| 173 | + + test/nc4/interoperability_rd.m4 - tests whether a NetCDF-4 file written
| 174 | + using NetCDF can be read by PnetCDF
| 175 | + + test/nc4/interoperability_wr.m4 - tests whether a NetCDF-4 file written
| 176 | + using PnetCDF can be read by NetCDF
| 177 | + + test/nc4/simple_xy.c - tests reading NetCDF-4 files, using the test
| 178 | + program simple_xy.c borrowed from NetCDF
| 179 | + + test/testcases/tst_pthread.c - tests thread-safe capability for the
| 180 | + scenario of each thread operating on a unique file.
| 181 | + + test/testcases/tst_free_comm.c - frees the MPI communicator right after
| 182 | + calling ncmpi_create to check whether PnetCDF duplicates the communicator correctly.
| 183 | +
|
| 184 | +* Conformity with NetCDF library |
| 185 | + + none |
| 186 | +
|
| 187 | +* Discrepancy from NetCDF library |
| 188 | + + In contrast to NetCDF-4, which allows reading and writing variables in
| 189 | + define mode when the file is in NetCDF-4 format, PnetCDF still requires
| 190 | + reading and writing variables in data mode.
| 191 | + + In contrast to the semantics of nc_set_fill() defined in NetCDF-4,
| 192 | + ncmpi_set_fill() changes the fill mode of all variables newly defined in
| 193 | + the current scope of define mode. Variables affected include the ones
| 194 | + defined before and after the call to ncmpi_set_fill(). Note this API has no
| 195 | + effect on the already existing variables created in the previous define
| 196 | + mode. This behavior follows the convention adopted by NetCDF-3. To change
| 197 | + the fill mode for individual variables after the call to ncmpi_set_fill(),
| 198 | + use the API ncmpi_def_var_fill(). Refer to the NetCDF 4.1.3 user
| 199 | + guide for the semantics of
| 200 | + [nc_set_fill()](https://www.unidata.ucar.edu/software/netcdf/documentation/historic/netcdf-c/nc_005fset_005ffill.html). |
| 201 | + A discussion with NetCDF developers regarding this issue can be found in |
| 202 | + [1114](https://github.com/Unidata/netcdf-c/pull/1114). |
| 203 | + + The error code return precedence can be different between NetCDF and |
| 204 | + PnetCDF in some cases. A test program for error code return precedence is |
| 205 | + available in test/testcases/error_precedence.m4. This program can be used |
| 206 | + to test both PnetCDF and NetCDF libraries. Note when testing NetCDF |
| 207 | + programs, because NetCDF does not follow the same precedence, failures are |
| 208 | + expected. A discussion with NetCDF developers regarding this issue can be |
| 209 | + found in [334](https://github.com/Unidata/netcdf-c/issues/334). |
| 210 | +
|
| 211 | +* Issues related to MPI library vendors: |
| 212 | + + none |
| 213 | +
|
| 214 | +* Issues related to Darshan library: |
| 215 | + + none |
| 216 | +
|
| 217 | +* Clarifications |
| 218 | + + PnetCDF currently does not support the Fortran default integer type set to
| 219 | + 8 bytes (for the GNU Fortran compiler, this default is changed with the
| 220 | + compile option -fdefault-integer-8). A check for this has been added, and
| 221 | + the configure command fails when a default 8-byte integer is detected.
| 222 | +
|