Corpus of texts in 12 languages. For each language, we provide one training, one development and one testing set acquired from Wikipedia articles. Moreover, each language dataset contains a (substantially larger) training set collected from (general) Web texts. All sets are disjoint, except for the Wikipedia and Web training sets, which can contain similar sentences. The data are segmented into sentences, which are further tokenized into words.
All data in the corpus contain diacritics. To strip the diacritics, use the Python script diacritization_stripping.py contained in the attached stripping_diacritics.zip. The script has two modes. We generally recommend the mode called uninames, which behaves better for some languages.
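As an illustration of why two stripping modes can differ, here is a minimal sketch of two common approaches. It is not the code of diacritization_stripping.py: the function names are hypothetical, and the assumption that a mode named uninames works from Unicode character names is ours. The first function removes combining marks after canonical decomposition; the second looks up the base letter named before " WITH " in each character's Unicode name, which also handles letters such as "ł" that have no canonical decomposition.

```python
import unicodedata


def strip_by_decomposition(text: str) -> str:
    """Decompose to NFD and drop combining marks (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def strip_by_uninames(text: str) -> str:
    """Replace each letter by the base letter in its Unicode name (hypothetical helper).

    Example: 'LATIN SMALL LETTER L WITH STROKE' -> 'LATIN SMALL LETTER L'.
    """
    out = []
    # Compose first so that base letter + combining mark becomes one character.
    for ch in unicodedata.normalize("NFC", text):
        name = unicodedata.name(ch, "")
        # Only touch letters, so symbols whose names contain ' WITH ' are kept.
        if unicodedata.category(ch).startswith("L") and " WITH " in name:
            base_name = name.split(" WITH ", 1)[0]
            try:
                ch = unicodedata.lookup(base_name)
            except KeyError:
                pass  # no character with the base name exists; keep the original
        out.append(ch)
    return "".join(out)
```

On the Polish word "łódź", the decomposition approach leaves "ł" untouched, whereas the name-based approach maps it to "l". Differences of this kind are one plausible reason for a name-based mode to behave better for some languages.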
The code for training a recurrent neural network based model for diacritics restoration is located at https://github.com/arahusky/diacritics_restoration.
The occurrence of river floods is strongly related to specific climatic conditions that favor extreme precipitation events leading to catchment saturation. Although the impact of precipitation and temperature patterns on river flows is a well-discussed topic in hydrology, few studies have focused on the relationship between peak discharges and the standard Climate Change Indices (ETCCDI) of precipitation and temperature, which are widely used in climate research. It is of interest to evaluate whether these indices are relevant for characterizing and predicting floods in the Alpine area. In this study, a correlation analysis of the annual time series of the ETCCDI indices and annual maximum flows is presented for the Piedmont Region, in North-Western Italy. Spearman’s rank correlation is used to determine which ETCCDI indices are temporally correlated with maximum discharges, which makes it possible to hypothesize which climate drivers better explain the interannual variability of floods. Moreover, the influence of climate (decadal) variability on the tendency of annual maximum discharges is examined by spatially correlating temporal trends of the climate indices with temporal trends of the discharge series over the last twenty years, calculated using the Theil-Sen slope estimator. Results highlight that, while extreme precipitation indices are highly correlated with extreme discharges at the annual timescale, with different indices that are consistent with catchment size, the decadal tendencies of extreme discharges may be better explained by the decadal tendencies of the total annual precipitation over the study area. This suggests that future projections of annual precipitation available from climate model simulations, whose reliability is higher than that of precipitation extremes, may be used as covariates for non-stationary flood frequency analysis.
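The two statistics named in this abstract can be sketched in a few lines of plain Python. This is an illustrative sketch, not the study's code: the function names and the sample numbers are hypothetical. Spearman's correlation is computed here as the Pearson correlation of average ranks, and the Theil-Sen slope as the median of all pairwise slopes.

```python
from itertools import combinations
from statistics import mean, median


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rank correlation as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def theil_sen_slope(t, y):
    """Median of the slopes between all pairs of points with distinct t."""
    slopes = [
        (y[j] - y[i]) / (t[j] - t[i])
        for i, j in combinations(range(len(t)), 2)
        if t[j] != t[i]
    ]
    return median(slopes)


# Made-up numbers for one station: an annual precipitation index and annual peak flow.
years = [2001, 2002, 2003, 2004, 2005, 2006]
precip_index = [62.0, 48.5, 71.2, 55.0, 80.3, 59.9]
peak_flow = [310.0, 240.0, 405.0, 260.0, 390.0, 335.0]
print("temporal correlation:", spearman(precip_index, peak_flow))
print("discharge trend per year:", theil_sen_slope(years, peak_flow))
```

For the spatial step, one would compute a Theil-Sen slope per station for both the climate index and the discharge series, and then pass the two lists of slopes to `spearman`. Because the slope is a median, a single anomalous year has little effect on the estimated trend.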
The paper discusses Tarski’s approach to quotation. It starts by showing that this approach is vulnerable to semantic inconsistencies connected with what is known as Reach’s puzzle, formulated in 1938 by the Czech logician Karel Reach. This fact gives rise to serious problems concerning the relation between the metalanguage and an object language. Moreover, the paper touches upon a historical aspect, pointing out that the problem at hand is discussed in the only paper signed as Al. Tajtelbaum, i.e. under Alfred Tarski’s original name. It argues that the puzzle reveals the importance of reopening the discussion on the understanding and limitations of deriving the metalanguage from an object language.
Accurate estimation of the soil water balance of the soil-plant-atmosphere system is key to determining the availability of water resources and their optimal management. Evapotranspiration and leaching are the main sinks of water from the system, affecting soil water status and hence crop yield. The accuracy of soil water content and evapotranspiration simulations therefore affects crop yield simulations as well. DSSAT is a suite of field-scale, process-based crop models that simulate crop growth and development. A “tipping bucket” water balance approach is currently used in DSSAT for soil hydrologic and water redistribution processes. By comparison, HYDRUS-1D is a hydrological model that simulates water flow in soils using numerical solutions of the Richards equation, but its approach to crop-related process modeling is rather limited. Both DSSAT and HYDRUS-1D have been widely used and tested in their separate areas of use. The objectives of our study were: (1) to couple HYDRUS-1D with DSSAT to simulate soil water dynamics, crop growth and yield, (2) to evaluate the coupled model using field experimental datasets distributed with DSSAT for different environments, and (3) to compare HYDRUS-1D simulations with those of the tipping bucket approach using the same datasets. Modularity in the software design of both DSSAT and HYDRUS-1D made it easy to couple the two models. The pairing gave the DSSAT interface the ability to use both the tipping bucket and HYDRUS-1D simulation approaches. The two approaches were evaluated in terms of their ability to estimate the soil water balance, especially soil water contents and evapotranspiration rates. Values of the d index for volumetric water contents were 0.9 and 0.8 for the original and coupled models, respectively. Comparisons of simulations for the pod mass for four soybean and four peanut treatments showed relatively high d index values for both models (0.94–0.99).
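The abstract does not define the d index. The sketch below assumes it refers to Willmott's index of agreement, a statistic commonly reported under that name in crop-model evaluation; this reading is an assumption, and the function name is hypothetical. Under that assumption, d = 1 - Σ(Sᵢ - Oᵢ)² / Σ(|Sᵢ - Ō| + |Oᵢ - Ō|)², where Oᵢ are observations, Sᵢ are simulated values and Ō is the observed mean.

```python
from statistics import mean


def willmott_d(observed, simulated):
    """Index of agreement (assumed meaning of 'd index'); 1.0 is a perfect match."""
    o_bar = mean(observed)
    # Sum of squared errors between simulated and observed values.
    squared_error = sum((s - o) ** 2 for o, s in zip(observed, simulated))
    # "Potential error": deviations of both series from the observed mean.
    potential_error = sum(
        (abs(s - o_bar) + abs(o - o_bar)) ** 2
        for o, s in zip(observed, simulated)
    )
    return 1.0 - squared_error / potential_error
```

The value is undefined when every observed and simulated value equals the observed mean, because the denominator is then zero. A simulation that reproduces the observations exactly gives d = 1, and larger departures lower the value.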