The utilization of wood samples from xylarium in historical wooden statues: improving the separation accuracy non-destructive measurement for using several algorithms

There are numerous wooden historical artifacts in Kyoto and other parts of Japan, including Buddhist statues and Shinto deities. Identifying the wood species of these historical artifacts is desirable for both repair and maintenance purposes. The most common method of identifying wood species involves examining samples taken from the artifacts. However, intentional sampling from old cultural artifacts is prohibited in Japan. We therefore attempted to determine the wood species of old statues non-destructively using near-infrared spectroscopy (NIRS). In this article, we developed a softwood and hardwood separation model using NIRS and compared the prediction accuracy of several algorithms. The model was built from wood samples stored in the xylarium of the Forestry and Forest Products Research Institute (TWTw). We then applied this model to old Buddhist statues in order to classify them as either softwood or hardwood. These Buddhist statues are housed in Nazenji temple and are believed to have been carved during the Heian period (8th–12th century). For the near-infrared (NIR) measurements, we collected diffuse reflectance spectra from the TWTw samples and the Buddhist statues using the same spectrometer. First, we used soft independent modeling of class analogy (SIMCA), partial least squares discriminant analysis (PLS-DA), and support vector machine (SVM) to analyze the NIR spectra obtained from the TWTw wood samples. Subsequently, we applied the NIR spectra obtained from several Buddhist statues in Nazenji temple to these separation models and determined whether the spectra were classified as softwood or hardwood. Finally, wood specimens that had detached naturally from the Buddhist statues over time were examined under a microscope to identify the wood species. Comparing the prediction accuracy of the algorithms, SIMCA gave a poor result, whereas PLS-DA gave a good result. PLS-DA is considered to have discriminated better because it performs its calculations to improve the regression using both explanatory variables and objective variables.


Introduction
There are numerous wooden historical artifacts in Japan, such as Buddhist statues and Shinto deities [1]. The wooden sculptures were finished with coloring, lacquer leaf, and base material. Some wooden sculptures with painted surfaces have deteriorated over time, with the paint chipping off or fading away. In Japan, it is common practice to preserve such wooden sculptures in their original state without repainting them, as this aligns with religious and cultural beliefs. As a result, the wooden surfaces of the sculptures often remain exposed. Meanwhile, demand for identifying the wood species of wooden sculptures is growing in many scientific fields, such as sociology, archaeology, history, and anthropology. Identifying wood species offers valuable insights into the origin of materials during that period, the interplay between forests and society at the time, and the possibility of trade with other cultural regions. Additionally, identifying the wood species of a historical artifact is important for both repair and maintenance purposes. The most common method for identifying wood species involves examining samples taken from the artifact. However, intentional sampling from old cultural artifacts is prohibited in Japan. We have therefore been developing non-destructive methods to identify wood species in old statues, employing near-infrared spectroscopy (NIRS) [2].
In recent years, near-infrared (NIR) spectroscopy has been widely used in many fields of non-destructive measurement, including not only wood [3][4][5][6] but also food and soil [7,8]. NIRS has gained popularity as a method for wood species identification due to its availability and convenience [9]. Brunner et al. [10] and Schimleck et al. [11] both employed principal component analysis (PCA) and the resulting score plots to demonstrate the potential of NIRS for differentiating wood species. The combination of NIRS with multivariate analysis has been proposed as a promising approach for wood species identification [9].
However, long-term exposure of wooden artifacts (or those containing wood) to varying light intensity, temperature, and humidity can lead to considerable fluctuations in wood properties. Identifying the wood species of aged and deteriorated wood used in cultural heritage objects is therefore a complex problem. Abe et al. [12] conducted wood species identification of historical wooden Buddhist statues using NIRS and PCA, based on wood samples stored in the xylarium of the Forestry and Forest Products Research Institute (TWTw). It is important to be able to distinguish non-destructively between softwood and hardwood. For example, for Noh masks, which are cultural assets made of wood, either softwood or hardwood was used depending on the year in which they were made.
In this study, NIR spectra were obtained from existing wood samples (TWTw) and from Buddhist statues. We created a model for distinguishing between softwood and hardwood from the TWTw wood samples, and used multiple algorithms to compare and examine whether softwood and hardwood could be distinguished when the Buddhist statue data were applied to the model. In the previous article [12], we conducted discriminant analysis using soft independent modeling of class analogy (SIMCA). However, in the field of near-infrared spectroscopy, there are other powerful discrimination methods besides SIMCA. Therefore, we investigated how much the measurement accuracy could be improved by introducing partial least squares discriminant analysis (PLS-DA) and support vector machine (SVM) in addition to the previously used SIMCA [13]. Furthermore, we increased the number of wood species among the TWTw wood samples, because recent surveys have shown that the tree species used for Buddhist statues might be more diverse than expected. We investigated the extent to which the tree species of historical wooden statues can be estimated using TWTw wood samples.
We conducted on-site surveys of wooden Buddhist statues and obtained NIR spectra from the statues. At the same time, we collected wood fragments that had naturally detached from the wooden Buddhist statues over time. Their anatomical structures were observed under a microscope to determine whether they originated from softwood or hardwood species. We then applied the NIR spectra obtained from the wooden statues to the classification model to determine whether they corresponded to softwood or hardwood species. Subsequently, we compared the softwood or hardwood predictions of the model with the microscopic observations and calculated the percentage correctly classified.

Wood sample preparation in the xylarium and NIR spectra measurement
Based on the results of previous surveys and microscopic observations, we selected 9 softwood species and 14 hardwood species that were likely to have been used for wooden statues. Wood samples of each species had been collected from various sites in Japan and stored in the xylarium (refer to Table 1). The wood samples are stored in a collection room conditioned to a temperature of 20-30 °C and a relative humidity of approximately 50-80%. The wood therefore had a stable moisture content. The tangential and radial faces of the wood samples were sanded with 180-mesh sandpaper to smooth the wood surface before spectroscopic measurement.
A MATRIX-F spectrometer (Bruker Optics Japan) was used for the NIR spectroscopic measurements. Diffuse reflectance spectra were obtained from 12,000 to 4000 cm⁻¹ (830-2500 nm). The resolution of the spectrometer was 4 cm⁻¹. Wood samples containing both heartwood and sapwood in one piece were selected for NIR measurements. In cases where a sample did not contain both heartwood and sapwood, spectra were obtained from either the sapwood or the heartwood.
Furthermore, because wood is an anisotropic material, spectra were measured at five locations on each of the three sections for samples on which all three sections could be measured. For example, 15 spectra would be acquired for one sapwood sample. However, if the radial and tangential sections could not be clearly distinguished, 5 spectra were measured on the cross section and 5 on the radial or tangential section, giving a total of 10 spectra from one sample. In total, 1305 spectra were gathered for softwood and 1620 for hardwood. The wavelength range used in this research was limited to 1600-1800 nm, because previous research has shown that this range differs characteristically between softwood and hardwood [12].
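The relation between the measured wavenumber range and the analysis window can be checked with a short calculation. The following is a minimal sketch in Python, not the procedure used in the study: the function names are hypothetical, and the 4 cm⁻¹ spacing of the illustrative axis is an assumption (the text reports 4 cm⁻¹ as the resolution, not the data-point spacing).

```python
def wavenumber_to_nm(wavenumber_cm1: float) -> float:
    """Convert a wavenumber in cm^-1 to a wavelength in nm."""
    return 1.0e7 / wavenumber_cm1


def nm_to_wavenumber(wavelength_nm: float) -> float:
    """Convert a wavelength in nm to a wavenumber in cm^-1."""
    return 1.0e7 / wavelength_nm


def select_window(wavenumbers, values, low_nm=1600.0, high_nm=1800.0):
    """Keep only the points whose wavelength lies within [low_nm, high_nm]."""
    kept = [(w, v) for w, v in zip(wavenumbers, values)
            if low_nm <= wavenumber_to_nm(w) <= high_nm]
    return [w for w, _ in kept], [v for _, v in kept]


# Measured range reported in the text: 12,000 to 4000 cm^-1.
print(round(wavenumber_to_nm(12000.0), 1))  # 833.3
print(round(wavenumber_to_nm(4000.0), 1))   # 2500.0

# Analysis window of 1600-1800 nm expressed in wavenumbers.
print(round(nm_to_wavenumber(1600.0), 1))   # 6250.0
print(round(nm_to_wavenumber(1800.0), 1))   # 5555.6

# Hypothetical axis sampled every 4 cm^-1 (spacing is an assumption).
axis = [12000.0 - 4.0 * i for i in range(2001)]
placeholder = [0.0] * len(axis)  # stand-in values, not measured data
win_axis, win_values = select_window(axis, placeholder)
print(len(win_axis), win_axis[0], win_axis[-1])
```

Under the assumed spacing, the window keeps only a small fraction of the full axis, which is consistent with the remark in the Conclusions that using all measured data was computationally demanding.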

Spectra acquisition from ancient artifacts at Nazenji temple
The investigated statues at Nazenji temple in Kawazucho, Shizuoka Prefecture, Japan, comprised 22 wooden statues (listed in Table 2) that were most likely created in the Heian period (8th-12th century). Each statue was carved from a single piece of timber. NIR spectra were collected with the same MATRIX-F spectrometer from surfaces where the paint and lacquer had been removed, including radial, tangential, and transitional surfaces. The 370 spectra obtained from these ancient statues were analyzed in this study. After NIR measurement, naturally

Softwood and hardwood classification model using multivariate analysis
The NIR spectra acquired from the xylarium samples were used to build the model for classifying hardwood and softwood. Several factors can influence the NIR spectrum, including instrument stability, temperature, humidity, and sample surface conditions (Candolfi et al. [21]). It was therefore necessary to reduce their influence by mathematical processing. The processing considered optimal for each algorithm was applied, and the results are summarized in Table 3. Moving average smoothing was employed to eliminate random noise from the NIR spectra. Additionally, the second derivative was used to minimize the influence of multiplicative and additive effects while enhancing hidden peaks within the spectra. NIR spectra that have been subjected to such mathematical processing are called pre-treatment data [13]. The calculations performed with each algorithm are described below. SIMCA is a classification method based on PCA models. PCA was performed on the softwood and hardwood spectra, and the spectral data obtained from the wooden statues were applied to this PCA model. The optimal number of principal components shown in Table 3 was 4 for softwood and 6 for hardwood. This result differs from that of the previous article [12]. The difference is thought to arise because the number of measured tree species and the number of spectra increased.
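The smoothing and second-derivative pre-treatment described above can be illustrated with a minimal Python sketch. This is not the Unscrambler implementation used in the study: the window width of 5 points and the central finite-difference derivative are assumptions chosen for illustration, and the test signal is synthetic. The example shows why the second derivative suppresses additive effects: a constant offset and a linear baseline vanish, leaving only the curvature of the signal.

```python
def moving_average(values, window=5):
    """Centered moving average; edge points are dropped, so the output
    is shorter than the input by window - 1 points."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def second_derivative(values, step=1.0):
    """Central finite-difference second derivative on an evenly spaced
    axis; the first and last points are dropped."""
    return [(values[i - 1] - 2.0 * values[i] + values[i + 1]) / step ** 2
            for i in range(1, len(values) - 1)]


# Synthetic check (not measured data): a constant offset plus a linear
# baseline on top of a curved signal x**2, whose second derivative is 2.
x = list(range(12))
raw = [3.0 + 0.5 * xi + xi ** 2 for xi in x]
smoothed = moving_average(raw, window=5)
deriv = second_derivative(smoothed)
print(len(raw), len(smoothed), len(deriv))  # 12 8 6
print([round(d, 6) for d in deriv])         # six values, all 2.0
```

A Savitzky-Golay filter is a common alternative that combines smoothing and differentiation in one step; whether it was used here is not stated in the text.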
For the PLS-DA and SVM analyses [13,18], a response variable of 1 or −1 was defined for hardwood or softwood. The response variable was calculated from the spectra of the xylarium wood samples, and a hardwood and softwood classification model was created. We applied the Buddhist statue data to this model and calculated whether the data were classified as hardwood or softwood. Finally, we compared the results with the microscopy results and calculated the percentage of correct discrimination. Unscrambler software (Version X; CAMO, Oslo, Norway) was used for spectral pre-treatment and quantitative analysis.
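The idea of coding the two classes as a numeric response and regressing on it can be sketched as follows. This is an illustrative PLS1 regression fitted with the NIPALS algorithm in NumPy, not the Unscrambler routine used in the study. Several points are assumptions: hardwood is coded as +1 and softwood as −1, a predicted value above 0 is read as hardwood, two components are used, and the spectra are synthetic stand-ins whose band positions (1680 and 1720 nm) are arbitrary and do not represent real absorption bands.

```python
import numpy as np


def fit_pls1(X, y, n_components):
    """Fit a single-response PLS regression with NIPALS.
    Returns (x_mean, y_mean, coef) for later prediction."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, Q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                  # weight: covariance of X with y
        w /= np.linalg.norm(w)
        t = Xr @ w                     # scores
        tt = t @ t
        p = Xr.T @ t / tt              # X loadings
        q = yr @ t / tt                # y loading
        Xr = Xr - np.outer(t, p)       # deflate X
        yr = yr - q * t                # deflate y
        W.append(w)
        P.append(p)
        Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q)
    coef = W @ np.linalg.solve(P.T @ W, Q)
    return x_mean, y_mean, coef


def predict_class(model, X):
    """Return +1 (hardwood, assumed coding) or -1 (softwood)."""
    x_mean, y_mean, coef = model
    y_hat = (np.asarray(X, dtype=float) - x_mean) @ coef + y_mean
    return np.where(y_hat > 0.0, 1, -1)


# Synthetic stand-in spectra (NOT measured data).
rng = np.random.default_rng(0)
axis = np.linspace(1600.0, 1800.0, 40)


def fake_spectra(center_nm, n):
    band = np.exp(-((axis - center_nm) / 20.0) ** 2)
    return band + rng.normal(0.0, 0.05, size=(n, axis.size))


X_train = np.vstack([fake_spectra(1680.0, 30), fake_spectra(1720.0, 30)])
y_train = np.array([1] * 30 + [-1] * 30)
X_test = np.vstack([fake_spectra(1680.0, 10), fake_spectra(1720.0, 10)])
y_test = np.array([1] * 10 + [-1] * 10)

model = fit_pls1(X_train, y_train, n_components=2)
accuracy = float(np.mean(predict_class(model, X_test) == y_test))
print(f"correctly classified: {100.0 * accuracy:.1f}%")
```

Because the weight vector is computed from the covariance between the spectra and the class response, the components are steered toward directions that separate the classes, which is the property the Conclusions contrast with PCA-based SIMCA.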

Comparison of spectra shapes in the near-infrared region
Figure 1 shows the averaged raw spectra (a) and pretreated spectra (b). The solid blue line represents softwood, and the dashed red line represents hardwood. Because a large number of spectra were measured, only the calculated averages are shown. In Fig. 1b, the spectral fluctuations differ between softwood and hardwood, especially in the 1600 to 1800 nm region. The peak fluctuations are more pronounced after pre-treatment than before. These pre-treated data were used for the hardwood or softwood classification analysis.

Cross-validation of the classification model created using wood from the xylarium
We first investigated how accurately a model for classifying hardwood and softwood could be built from the xylarium spectral data. The data presented in Table 1 were divided into a training set (1950 spectra) and a test set (975 spectra). All wood species were evenly distributed across the two sets. The test set consisted of 540 hardwood spectra and 435 softwood spectra. Table 4 shows the accuracy of classifying softwood and hardwood species in the test set.
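The split sizes above can be reproduced with a simple stratified rule. The sketch below is an assumption about how such a split might be made, not the procedure reported by the authors: it sends every third spectrum of each group to the test set. Per-species spectrum counts are not given in this text, so the demonstration stratifies on the softwood and hardwood totals (1305 and 1620); the same hypothetical function would accept species labels instead.

```python
from collections import defaultdict


def stratified_split(labels, test_every=3):
    """Send every `test_every`-th item of each label to the test set,
    so each label keeps the same train/test proportion."""
    seen = defaultdict(int)
    train_idx, test_idx = [], []
    for i, label in enumerate(labels):
        seen[label] += 1
        if seen[label] % test_every == 0:
            test_idx.append(i)
        else:
            train_idx.append(i)
    return train_idx, test_idx


def percent_correct(true_classes, predicted_classes):
    """Percentage of items whose predicted class matches the true class."""
    hits = sum(t == p for t, p in zip(true_classes, predicted_classes))
    return 100.0 * hits / len(true_classes)


# Class totals reported in the text.
labels = ["softwood"] * 1305 + ["hardwood"] * 1620
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))     # 1950 975

n_soft = sum(labels[i] == "softwood" for i in test_idx)
print(n_soft, len(test_idx) - n_soft)    # 435 540

# Toy example of the accuracy calculation (made-up predictions).
print(percent_correct(["hardwood", "softwood", "softwood", "hardwood"],
                      ["hardwood", "softwood", "hardwood", "hardwood"]))  # 75.0
```

A 2:1 split of the 2925 spectra gives exactly the training and test sizes stated in the text, with one third of each class in the test set.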
In the SIMCA softwood model, a notable number of samples were classified into multiple categories (both softwood and hardwood), resulting in a lower accuracy of 82.3%. In PLS-DA, the prediction accuracy was lower for softwood than for hardwood. The accuracy of softwood discrimination was reduced by the poor identification results for Akamatsu. We therefore examined the score plots obtained from PLS-DA to find the cause of the poor result. Score plots are produced by bilinear modeling methods and show where each sample is located along the component axes of the model. Any clustering can be helpful for the development of classification models [22]. Figure 2 shows the PLS-DA score plot for this model [23]. Akamatsu samples were located on the border between softwood and hardwood when projected onto the score plot axes, and were partially classified as hardwood. They remained on the border between softwood and hardwood even at higher factor levels (Fig. 2b-d). It has been reported that the chemical composition of Akamatsu is 46.5% cellulose, 24.6% hemicellulose, and 26.0% lignin [24]. The chemical composition of Akamatsu could be closer to that of hardwood. Such compositional characteristics might have been a determining factor in its being identified as closer to hardwood. Consequently, the prediction accuracy for softwood in PLS-DA is thought to have decreased. The support vector machine gave better results for both softwood and hardwood. These results show that every model except SIMCA was constructed with a classification accuracy above approximately 90%.

The classification results of the spectra obtained from wooden artifacts (hardwood)
Table 5 summarizes the results; a Buddhist statue was counted as correctly classified when it was confirmed to be made of hardwood by microscopic observation and was also determined to be hardwood from the spectral data. The wooden artifact data were fitted to the SIMCA, PLS-DA, and SVM models developed using the above-mentioned reference wood samples. The wood species of the wooden artifacts had been identified through microscopic observation as hardwood (Cinnamomum camphora). In terms of average accuracy, the models ranked from highest to lowest as follows: PLS-DA (100%), SVM (12%), and SIMCA (7%). There were variations in accuracy among the averaged wooden artifacts (33-61%).

The classification results from wooden artifacts (softwood)

The comparison of classification results among softwood and hardwood
Using PLS-DA, we were able to distinguish softwood and hardwood with high accuracy, so the PLS-DA model can be regarded as a good result. However, the lower accuracy in classifying hardwood compared to softwood across all algorithms might be attributed to differences in the surface conditions of the Buddhist statues. A previous report found that the surface of Buddhist statues made of hardwood had deteriorated and was rougher than that of softwood statues [12]. The surface condition of the wooden statues may have affected the spectroscopic measurements, but surface roughness could not be measured by visual observation of the wood surface alone in this field survey. A smoother wood surface would allow more stable spectral information to be measured. A difficulty of surveying Buddhist statues is that the items that can be measured are limited, because the investigation has to be conducted non-destructively.

Conclusions
These results revealed differences in accuracy depending on the algorithm. SIMCA uses principal component analysis (PCA) for its classification model. In the field of machine learning, PCA is used for dimensionality reduction. The spectral data used here were likewise reduced in dimension, and discrimination was attempted based on the similarity of the spectra. Therefore, if there were a clear difference in the spectra from the measured wooden sculptures, it would be possible to distinguish between softwood and hardwood. There was little difference in the PCA spectra between softwood and hardwood, and SIMCA assigned samples to multiple classification categories, which might be the reason for the lower classification accuracy. PLS-DA evaluated softwood and hardwood with the best accuracy. SVM accurately identified only softwood. PLS-DA is considered to have better discrimination because it performs calculations to improve regression from both explanatory variables and objective variables. Whether or not the objective variable was taken into account might have determined whether the hardwood could be separated.
By comparing the wood specimens in the xylarium with the Buddhist statues, it became possible to distinguish between softwood and hardwood using PLS-DA. In this study, we used the part of the measured spectra between 1600 and 1800 nm to separate softwood and hardwood, because this wavelength region differs characteristically between the two. Using all the measured data required a large amount of calculation and was difficult in this study. Deepa et al. anticipated that future research in the field of wood would concentrate on an increasing number of species and of samples within each species, enabling the creation of discriminating models that are more broadly applicable [5]. In the future, we would like to consider whether it is possible to distinguish the tree species using the whole measured spectra.

Table 1
Wood specimens used in the experiment, collected from various sites in Japan, and the total number of spectra measured

Detached wood fragments from the statues were identified through light microscopy as Torreya spp. for 14 statues and Cinnamomum spp. for 8 statues. Therefore, we decided to treat 14 Buddhist statues as softwood and 8 Buddhist statues as hardwood in the subsequent statistical analysis.

Table 2
The types and wood species (genus) of the investigated statues at Nazenji temple

Table
Differences in the validity of classification models in training set among each algorithm

Table 5
Results of classification accuracy for predicting hardwood (Cinnamomum camphora)

Table 6
Results of classification accuracy for predicting softwood (Torreya nucifera)