Selection indices for agronomic traits in segregating populations of soybean

In genetic improvement of the soybean crop, the selection process is complex and greatly influenced by the environment. One of the alternatives for facilitating this process is the use of selection indices, which make it possible to select desirable genotypes in the early generations of breeding programs. The aim of this study was to compare different selection indices in segregating populations of soybean, indicating methods which are superior in various situations, and proposing economic weightings in order to obtain higher gains. Direct and indirect selection criteria were used, together with the classic Smith-Hazel index, an index based on the sum of ranks of Mulamba and Mock, the Williams base index, an index based on the desired gains of Pesek and Baker, and a genotype-ideotype distance index. The genetic material consisted of seven F5 generation soybean populations, giving a total of 386 progeny, evaluated in a Federer augmented-block design, with the following characters being evaluated: number of days to maturity, plant height at maturity, insertion height of the first pod, lodging, agronomic value, number of pods per plant, oil content and grain yield. According to the results, the classic index and base index showed the smallest variations in gains across the different situations and economic weightings under study. The index based on the sum of ranks, using agronomic value and grain yield as the main characters with an economic weighting of one, gave the most favourable gains under the conditions of this study.


INTRODUCTION
The soybean [Glycine max (L.) Merrill] is the most important oilseed crop in Brazil, and is the main agricultural product exported by the country. In the 2014/2015 agricultural year, domestic production reached 96.24 million tonnes (CONAB, 2015). This success, both in production and agribusiness, is in part due to genetic improvement and the release of cultivars which are adapted to almost all regions of the country. In this continuing process, knowledge of the genetic diversity of the materials is extremely important, as this will result in better targeting of future crosses (BIZARI et al., 2014).
The selection of superior progeny for breeding programs is not an easy task, as the important characters, which are in the main quantitative, display complex behaviour, are highly influenced by the environment, and may be correlated in such a way that selecting one character produces change in another (CRUZ, 2006).
To reduce this problem, one strategy employed by breeders is the use of selection indices, which seek to combine all the characters into a single index value for each selection unit. Selection is then based on the values of this index, the expected indirect responses in the original characters are evaluated, and the time necessary to achieve the desired genotypes is reduced (CRUZ; REGAZZI; CARNEIRO, 2012).
When different selection criteria are considered, predicting the gain for each criterion is important for guiding the breeder in the use of available genetic material, with a view to maximizing gains for the characters of interest (PAULA et al., 2002).
The aim of this study was to compare different selection indices in segregating populations of soybean, indicating methods which are superior in various situations, and to propose economic weights in order to obtain greater gains.

MATERIAL AND METHODS
The experiment was carried out during the 2012/2013 agricultural year on the Experimental Teaching and Research Farm (FEPE) of the Júlio de Mesquita Filho State University in Jaboticabal, in the State of São Paulo (SP). The experiment comprised seven F5 soybean populations derived from bi-parental crosses (Conquista x Matrinxã, Renascença x Sambaíba, Renascença x Matrinxã, Sambaíba x IAC 17, Confiança x Sambaíba, Conquista x Kinoshita, BRS 231 x Matrinxã). A Federer (1955) augmented block design was used, consisting of 25 blocks with two checks per block (Conquista and Coodetec 216). The plots consisted of rows 5 m in length, spaced 0.5 m apart, with a density of 20 plants per metre, giving a total of 386 progeny. Six plants per plot were evaluated in the analyses. Cultivation followed the recommendations for soybean crops (EMBRAPA, 2010).
The statistical model for the analysis of augmented blocks is given by Equation 1:

$$Y_{ij} = \mu + \tau_i + B_j + \varepsilon_{ij} \tag{1}$$

where: $Y_{ij}$ is the value of the character for the ith treatment in the jth block; $\mu$ is the general mean; $\tau_i$ is the effect of the ith treatment, which can be broken down into $T_i$, the effect of the ith control, with i = 1, 2, ..., t, and $G_{ji}$, the effect of the ith genotype, with i = 1, 2, ..., $g_j$; $B_j$ is the effect of the jth block, with j = 1, 2, ..., b; and $\varepsilon_{ij}$ is the random error.
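As an illustration of how the checks in an augmented design let unreplicated progeny be corrected for block effects, the sketch below estimates each block effect as the block's check mean minus the overall check mean and subtracts it from the progeny value. This is a simplified reading of the model above, not the procedure implemented in the Genes software; the function name and the yield values are hypothetical, and it assumes the same checks occur in every block.

```python
from statistics import mean

def adjust_for_blocks(checks, progeny):
    """checks: {block: [check values]}; progeny: {name: (block, value)}."""
    grand_check_mean = mean(v for vals in checks.values() for v in vals)
    # Block effect estimated from the checks only:
    # block check mean minus overall check mean.
    block_effect = {b: mean(vals) - grand_check_mean for b, vals in checks.items()}
    # Remove the estimated block effect from each unreplicated progeny value.
    return {name: value - block_effect[block]
            for name, (block, value) in progeny.items()}

# Illustrative yields (kg/ha), not data from the study.
checks = {1: [3000.0, 3200.0], 2: [2600.0, 2800.0]}
progeny = {"P1": (1, 3500.0), "P2": (2, 3300.0)}
adjusted = adjust_for_blocks(checks, progeny)
```

In this toy case block 1 is 200 kg/ha above the check average and block 2 is 200 kg/ha below it, so the ranking of the two progeny is reversed after adjustment.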
The characters evaluated were: number of days to maturity (NDM), in days; plant height at maturity (PHM), in cm; insertion height of the first pod (IHP), in cm; lodging (Lg), a graded scale ranging from 1.0 (upright) to 5.0 (flat); agronomic value (AV), a graded scale ranging from 1.0 (no agronomic value) to 5.0 (excellent); number of pods per plant (NP); oil content (OC), as a percentage; and grain yield (GY), in kg ha⁻¹.
Analysis of the oil content was carried out by near infrared spectrometry (NIR), using the Tango spectrometer from Bruker. The equipment measures the wavelength and absorption intensity of near infrared light for the sample. This method is non-destructive, which is important in the early stages of soybean breeding programs, where a reduced number of seeds are used for generation advancement.
Statistical analysis was done using the Genes software (CRUZ, 2007). For the characters Lg and AV, the data were transformed by X for a better fit to the normal distribution curve. Heritability coefficients were estimated on a progeny-mean basis as the ratio of genotypic to phenotypic variance, obtained from the analysis of variance of the augmented block trial.
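The heritability ratio described above, together with the genetic and experimental coefficients of variation that are used later as economic weights, can be sketched as follows. The function name and the variance components are hypothetical, and the phenotypic variance is assumed here to be the sum of the genetic and error variances, which may differ from the estimator used in Genes.

```python
from math import sqrt

def genetic_parameters(var_g, var_e, trait_mean):
    """Heritability and coefficients of variation (in %) from variance components."""
    var_p = var_g + var_e                  # assumed phenotypic variance
    h2 = var_g / var_p                     # ratio of genotypic to phenotypic variance
    cvg = 100 * sqrt(var_g) / trait_mean   # genetic coefficient of variation
    cve = 100 * sqrt(var_e) / trait_mean   # experimental coefficient of variation
    return h2, cvg, cve, cvg / cve

# Illustrative variance components for grain yield, not values from the study.
h2, cvg, cve, ratio = genetic_parameters(var_g=90000.0, var_e=10000.0, trait_mean=3000.0)
```

With these made-up components the heritability is 0.9 and the CVg/CVe ratio is 3, a situation that would be read as favourable for selection.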
For analysis of the selection indices, the economic weights and desired gains were established from experimental data obtained by the authors, as recommended by Cruz (1990). Also as per Cruz (2006), the following criteria were used in the analysis: direct and indirect selection; the classic index (HAZEL, 1943; SMITH, 1936); an index based on the sum of ranks (MULAMBA; MOCK, 1978); a base index (WILLIAMS, 1962); an index based on desired gains (PESEK; BAKER, 1969); and an index of the genotype-ideotype distance (CRUZ, 2006). To date, no studies have been found that use this last index in analysing segregating populations of soybean.
With the use of direct and indirect selection, gains are expected in the single character for which selection is made and, depending on the association of this character with the remainder, there may be favourable or unfavourable responses in characters of secondary importance which are not being considered in the selection process. For direct selection, the expected gain in the ith character ($GS_i$) can be estimated with Equation 2, based on the selection differential (CRUZ, 2006):

$$GS_i = (\bar{X}_{si} - \bar{X}_{oi})\, h_i^2 = DS_i\, h_i^2 \tag{2}$$

where: $\bar{X}_{si}$ = mean value of individuals selected for character i; $\bar{X}_{oi}$ = original population mean value; $DS_i$ = selection differential practiced in the population; $h_i^2$ = heritability of character i. The indirect gain in character j, by selecting for character i, is given by $GS_{j(i)} = DS_{j(i)}\, h_j^2$, where $DS_{j(i)}$ is the selection differential for indirect selection, obtained from the mean value of character j for those individuals whose superiority was verified based on character i, for which the direct selection was made.
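A minimal sketch of the direct and indirect gain calculations follows: genotypes are selected on one character, and the gain in any character is the selection differential of the selected group multiplied by that character's heritability. The helper names, the four-genotype data set and the heritabilities are hypothetical, and expressing the gain as a percentage of the original mean is an assumption about how GS% is reported.

```python
from statistics import mean

def select_top(values, n, higher_is_better=True):
    """Return the names of the n best genotypes for one character."""
    return sorted(values, key=values.get, reverse=higher_is_better)[:n]

def selection_gain(values, selected, h2):
    """Selection differential (selected mean - population mean) times heritability."""
    ds = mean(values[g] for g in selected) - mean(values.values())
    return ds * h2

# Illustrative data, not from the study.
gy = {"A": 3000.0, "B": 3400.0, "C": 2600.0, "D": 3800.0}   # grain yield
ph = {"A": 80.0, "B": 90.0, "C": 70.0, "D": 85.0}           # plant height

selected = select_top(gy, 2)                          # direct selection on GY
direct_gain = selection_gain(gy, selected, h2=0.8)    # gain in GY itself
indirect_gain = selection_gain(ph, selected, h2=0.5)  # correlated response in PH
direct_gain_pct = 100 * direct_gain / mean(gy.values())
```

For characters where lower values are wanted, such as NDM and Lg, `select_top` would be called with `higher_is_better=False`.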
The classic index (HAZEL, 1943; SMITH, 1936) comprises a linear combination of various characters of economic importance, with the weighting coefficients estimated so as to maximize the correlation between the index and the aggregate genotype. The aggregate genotype is established by another linear combination, involving genetic values weighted by their respective economic values. The selection index (I) and aggregate genotype (M) are then described as below (Equation 3):

$$I = b_1 y_1 + b_2 y_2 + \dots + b_n y_n = \sum_{i=1}^{n} b_i y_i = b\,y, \qquad M = a_1 g_1 + a_2 g_2 + \dots + a_n g_n = \sum_{i=1}^{n} a_i g_i = a\,g' \tag{3}$$

where: n is the number of characters evaluated; b is the vector, of dimension 1 x n, of the weighting coefficients for the selection index to be estimated; y is the matrix, of dimension n x p (plants), of phenotypic values for the characters; a is the vector, of dimension 1 x n, of previously established economic weights; g' is the matrix, of dimension n x p, of unknown genetic values for the n characters being considered. Thus, vector $b = P^{-1} G a$, where $P^{-1}$ is the inverse of the n x n matrix of phenotypic variances and covariances between the characters, and G is the n x n matrix of genetic variances and covariances between the characters.
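The estimation of the index coefficients, b = P⁻¹Ga, can be sketched by solving the linear system Pb = Ga rather than inverting P explicitly. The small solver, the function names and the two-character covariance matrices below are hypothetical and serve only to show the calculation.

```python
def solve(a, rhs):
    """Solve a @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def smith_hazel_weights(P, G, a):
    """Index coefficients b satisfying P b = G a."""
    Ga = [sum(g * w for g, w in zip(row, a)) for row in G]
    return solve(P, Ga)

# Illustrative phenotypic (P) and genetic (G) covariance matrices for two characters.
P = [[4.0, 1.0], [1.0, 2.0]]
G = [[2.0, 0.6], [0.6, 0.8]]
a = [1.0, 1.0]                      # economic weights
b = smith_hazel_weights(P, G, a)    # approximately [0.543, 0.429]
```

Setting an economic weight to zero, as done for the secondary characters in this study, simply removes that character's column of G from the right-hand side.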
The expected gain for character j when selection is made on the index is expressed by Equation 4:

$$\Delta g_{j(I)} = DS_{j(I)}\, h_j^2 \tag{4}$$

where: $\Delta g_{j(I)}$ is the expected gain for character j, with selection based on index I; $DS_{j(I)}$ is the selection differential for character j, with selection based on index I; and $h_j^2$ is the heritability of character j.
The index based on the sum of ranks (MULAMBA; MOCK, 1978) consists in classifying the genotypes for each character in an order which is favourable for breeding. The ranks of each genotype are then summed, resulting in the selection index $I = r_1 + r_2 + \dots + r_n$, where I is the value of the index for a particular individual or family; $r_j$ is the rank of the individual for the jth character; and n is the number of characters considered in the index. In addition, the breeder may want the ranks of the different characters to carry specified weights, in which case $I = p_1 r_1 + p_2 r_2 + \dots + p_n r_n$, where $p_j$ is the economic weight attributed by the user to the jth character.
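The weighted sum of ranks can be sketched as below. The sketch assumes that rank 1 is given to the most favourable genotype for each character, so that the lowest index value is the best, and it does not handle ties; the function name and data are hypothetical.

```python
def rank_sum_index(traits, weights, lower_is_better=()):
    """Weighted sum of per-character ranks for each genotype (rank 1 = best)."""
    n = len(next(iter(traits.values())))
    index = [0.0] * n
    for trait, values in traits.items():
        # Order genotypes from most to least favourable for this character.
        best_first = sorted(range(n), key=lambda g: values[g],
                            reverse=trait not in lower_is_better)
        for rank, g in enumerate(best_first, start=1):
            index[g] += weights[trait] * rank
    return index

# Illustrative data for four genotypes, not from the study.
traits = {"GY": [3000.0, 3400.0, 2600.0, 3800.0],
          "NDM": [120.0, 115.0, 125.0, 130.0]}
weights = {"GY": 1.0, "NDM": 1.0}
index = rank_sum_index(traits, weights, lower_is_better=("NDM",))
best = index.index(min(index))   # genotype with the smallest rank sum
```

Here the second genotype wins: it is only second for yield but earliest to mature, which shows how the index trades off characters without needing covariance estimates.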
The base index (WILLIAMS, 1962) proposes the establishment of indices by the linear combination of the mean phenotypic values of the characters, weighted directly by their respective economic weights. The following index is used as the selection criterion (Equation 5):

$$I = a_1 y_1 + a_2 y_2 + \dots + a_n y_n = \sum_{i=1}^{n} a_i y_i \tag{5}$$

where: $y_i$ are the mean values and $a_i$ are the economic weights of the characters under study.
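A minimal sketch of the base index follows, with a negative economic weight used for a character that should decrease. The function name and the values are hypothetical.

```python
def base_index(means, weights):
    """Phenotypic means weighted directly by economic weights, per genotype."""
    traits = list(weights)
    n = len(means[traits[0]])
    return [sum(weights[t] * means[t][g] for t in traits) for g in range(n)]

# Illustrative means for three genotypes, not from the study.
means = {"GY": [3000.0, 3400.0, 2600.0], "NDM": [120.0, 115.0, 125.0]}
weights = {"GY": 1.0, "NDM": -1.0}   # negative weight favours earlier maturity
index = base_index(means, weights)
```

Because the characters enter on their original scales, a character measured in large units (here GY) dominates the index unless the weights compensate, which is one reason weights such as CVg are of interest.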
The index based on desired gains (PESEK; BAKER, 1969) proposes replacement of the economic weights by the desired gains for each character. Construction of the index involves knowledge of the expression for the expected gain of each character, as defined by Equation 6:

$$\Delta g = \frac{G\, b\, i}{\sigma_I} \tag{6}$$

where: $\Delta g$ is the gain estimated by the index; G is the n x n matrix of genetic variances and covariances between the characters; b is the vector, of dimension 1 x n, of the weighting coefficients of the selection index to be estimated; i is the selection differential in standard deviation units for index I; and $\sigma_I$ is the standard deviation of index I. Substituting $\Delta g$ by $\Delta g_d$, the vector of desired gains, and eliminating $i/\sigma_I$, which does not affect the proportionality of the coefficients b, b is estimated from the expression $b = G^{-1} \Delta g_d$. The coefficients will give the maximum gains for each character, based on the specification of the desired gains (CRUZ; REGAZZI; CARNEIRO, 2012).
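For two characters, solving for the coefficients from the genetic covariance matrix and the vector of desired gains can be sketched with a closed-form 2 x 2 inverse. The function name, the matrix G and the desired gains are hypothetical.

```python
def desired_gain_weights(G, gd):
    """Solve G b = gd for two characters using the closed-form 2 x 2 inverse."""
    (g11, g12), (g21, g22) = G
    det = g11 * g22 - g12 * g21
    if det == 0:
        raise ValueError("G is singular; the desired gains cannot be solved for")
    b1 = (g22 * gd[0] - g12 * gd[1]) / det
    b2 = (g11 * gd[1] - g21 * gd[0]) / det
    return [b1, b2]

# Illustrative genetic covariance matrix and desired gains, not from the study.
G = [[2.0, 0.5], [0.5, 1.0]]
gd = [1.0, 0.5]
b = desired_gain_weights(G, gd)   # approximately [0.429, 0.286]
```

The coefficients depend only on G and the desired gains, so no economic weights or phenotypic covariances are needed to build this index.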
The genotype-ideotype distance index (CRUZ, 2006) allows the optimal values for each variable to be set, as well as the range of values considered favourable for breeding. For each variable, the mean, maximum and minimum values are calculated. $X_{ij}$ is considered the mean phenotypic value of the ith genotype in relation to the jth character, $Y_{ij}$ the transformed mean phenotypic value, and $C_j$ a constant relative to depreciation of the mean value for the genotype where this does not fall within the standards required by the breeder. Therefore: $LI_j$ = the lower limit to be presented by the genotype for character j, according to the standard desired by the breeder; $LS_j$ = the upper limit to be presented by the genotype; and $VO_j$ = the optimal value to be presented by the genotype under selection.
The procedure considers $C_j = LS_j - LI_j$. The value of $C_j$ guarantees that any value for $X_{ij}$ within the range of variation around the optimum will result in a value for $Y_{ij}$ with a magnitude close to the optimal value ($VO_j$), unlike the values for $X_{ij}$ outside of this range. Transformation of $X_{ij}$ is therefore carried out to guarantee the depreciation of those phenotypic values outside of the range. The values for $Y_{ij}$ obtained by transformation are then standardised and weighted by the weights assigned to each character, giving values for $y_{ij}$, as specified below in Equation 7:

$$y_{ij} = \sqrt{a_j}\,\frac{Y_{ij}}{S(Y_j)} \tag{7}$$

where: $S(Y_j)$ is the standard deviation of the mean phenotypic values obtained with the transformation; and $a_j$ is the weight or economic value of the character. For the calculation, the standardisation and weighting of $VO_j$ are also required, as specified by $vo_j = \sqrt{a_j}\, VO_j / S(Y_j)$.
The values for the index (GID) are then calculated, expressed by the distances between the genotypes and the ideotype, as shown:

$$GID_i = \sqrt{\frac{1}{n}\sum_{j=1}^{n}\left(y_{ij} - vo_j\right)^2}$$

where n is the number of characters. Based on this index, the best genotypes can be identified and the selection gains calculated.
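The distance calculation can be sketched as follows for the case used in this study, where every character has an economic weight of 1. The sketch assumes that the values passed in have already been transformed, that standardisation uses the sample standard deviation, and that the distance is the root mean square of the standardised deviations from the optimum; these are assumptions about the details of the method, and the function name and data are hypothetical.

```python
from math import sqrt
from statistics import stdev

def gid_index(Y, optimum):
    """Distance of each genotype to the ideotype (smaller is better).

    Y: {character: [transformed means per genotype]}
    optimum: {character: optimal value VO}
    All economic weights are taken as 1.
    """
    traits = list(Y)
    n_gen = len(Y[traits[0]])
    sd = {t: stdev(Y[t]) for t in traits}   # sample standard deviation (assumption)
    dist = []
    for g in range(n_gen):
        sq = [((Y[t][g] - optimum[t]) / sd[t]) ** 2 for t in traits]
        dist.append(sqrt(sum(sq) / len(traits)))
    return dist

# Illustrative data for three genotypes, not from the study.
Y = {"GY": [3000.0, 3400.0, 3800.0], "NDM": [120.0, 115.0, 125.0]}
optimum = {"GY": 3800.0, "NDM": 115.0}   # maximum yield, minimum days to maturity
dist = gid_index(Y, optimum)
closest = dist.index(min(dist))
```

In this toy case the highest-yielding genotype is not the closest to the ideotype, because it is also the latest to mature.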
In the analysis of the gains resulting from the direct and indirect selections made individually for each character, only one character was considered to be the principal, with an economic weight of one; the remainder were considered to be secondary characters, with a weight of zero.
For the remaining indices, three situations were considered in determining the principal characters: I - grain yield (GY) and agronomic value (AV) as the principal characters; II - grain yield (GY), agronomic value (AV), number of pods (NP) and oil content (OC) as the principal characters; III - all eight characters as principal.
For the classic index, the index based on the sum of ranks, and the base index, three alternative economic weights were tested for the principal characters: a value of 1 (one), the genetic coefficient of variation (CVg) of the character, and the ratio of the genetic coefficient of variation to the experimental coefficient of variation (CVg/CVe). The secondary characters were given a weight of zero. For the desired gains index (DGI), the weights employed were CVg and the genetic standard deviation. For the index based on the genotype-ideotype distance (GID), only the economic weight of 1 was used, with the maximum value taken as the optimal value and the mean taken as the lower limit. However, for the characters number of days to maturity and lodging, the minimum value was taken as the optimal value and the mean taken as the upper limit.
In the calculations for predicting gains, a selection of 23.5% of the progeny was adopted for all indices, giving a total of 90 genotypes.For the characters NDM and Lg, a decrease in values was required, since in general, the aim of breeding is for early and erect genotypes, the latter so as to favour mechanical harvesting.For these characters therefore, with direct selection and the index based on the sum of ranks, the direction of selection adopted was towards the lower end, whereas for the remaining indices, the economic weights assumed negative values so as to rank genotypes with a lower NDM and Lg.

RESULTS AND DISCUSSION
The analysis of variance for the augmented blocks indicated significant differences between the genotypes for the characters NDM, IHP, Lg, AV, OC and GY, whereas for PHM and NP no significant differences were seen (Table 1).
The CVg to CVe ratios displayed values which were greater than one for all characters, showing this condition to be satisfactory for selection (CRUZ; REGAZZI; CARNEIRO, 2012).
As expected, the gains from direct selection were superior to the indirect gains for all the characters in all situations. Direct selection also returned the largest individual gains for each character. The greatest gains from direct selection were for the characters GY (34.58%), IHP (27.55%), NP (26.34%) and PHM (12.68%), whereas direct selection for NDM returned the lowest individual gain (3.29%), which was expected, since this character had the lowest genetic coefficient of variation (Table 2). Barbaro et al. (2007) and Costa et al. (2004) found similar gains for direct selection in the soybean, with the largest gains seen for the characters GY and IHP. These gains are due to the greater genetic variation of these characters.
The gains obtained from direct selection were lower than those found by Costa et al. (2004), but this result was expected, since those authors worked with F2 populations of soybean, where there is greater variability and range of data, resulting in greater gains. However, the present study showed similar results to those obtained by Barbaro et al. (2007), who also used F5 soybean populations.
In relation to the classic index, for situation I no differences were seen for any of the economic weights used; in addition, this method returned negative gains for the characters NDM, IHP, AV and NP. In situation II, the economic weights CVg and CVg/CVe also did not differ; however, economic weight 1 provided a small advantage to the characters GY, OC, NP and AV, increasing total gain. For situation III there was an increase in gain for the characters IHP and Lg, but values decreased for AV, NP, OC and especially GY, characters which are fundamental to the selection process in the soybean (Table 3).
According to Table 3, the classic index in general did not show great variation between the situations or economic weights under study. For the character GY, gain varied from 33.51% to 34.07%, while for total gain the values were between 33.02% and 34.07%. Costa et al. (2004) found similar values using GY and AV as the principal characters, with all characters being used as principal in the third situation. Paula et al. (2002), using the classic index in eucalyptus, also did not find a great range of values for gain with different economic weights.
The index based on the sum of ranks (SRI), with the exception of situation I and the economic weight CVg, gave lower values than the classic index for the character GY. Situation III, with the economic weight CVg, gave the highest total gain for all the indices at 45.46%, but the gain for GY (25.65%) was far lower than the best results for this character. Further, with the index based on the sum of ranks, the economic weight CVg gave the greatest values for total gain and for the character GY in all the situations being analysed, demonstrating the efficiency of this parameter as an economic weight with this index. The index displayed a wider range of values, but overall the values for GY were lower than with the classic or base index (Table 3).
The SRI in situation II with the economic weight CVg/CVe gave the largest gain for OC (2.90%). The character OC, despite having a negative correlation with protein content, is very important in soybean breeding, as soybean is the main oilseed crop used for biodiesel production in Brazil (MARQUES; ROCHA; HAMAWAKI, 2008).
Further in relation to the SRI, situation I using the economic weight CVg gave a favourable result in the selection process, with a total gain of 39.17% and a gain for GY with a value close to that for direct selection (34.45%), as well as returning positive results for NDM, IHP, PHM, Lg, AV and OC (Table 3). Costa et al. (2004) and Barbaro et al. (2007) concluded that, when compared to the other indices, the SRI was the most appropriate.
The base index (BI) in situation I gave the same values for gain with all the economic weights under study. These values were the same as those found by direct selection for the character GY, indicating that in situation I the BI offers no advantage over direct selection. Situations II and III with the economic weights 1 and CVg also showed no difference. These conditions produced good values for GY (34.56%), but showed decreases for NP (-1.28%) and NDM (-0.3%) (Table 3).
The index based on desired gains was generally inefficient in the situations under study. The use of the economic weight SD (standard deviation) produced greater gains than did CVg. With this index, the least satisfactory results were found in situation III, which gave reduced gains for GY (5.92% and 0.5%), as well as unfavourable values for NP (-8.14% and -9.63%) and Lg (-5.68% and -3.91%) (Table 3). Also for the DGI, the most favourable values were found in situation I with the economic weight SD, where gains of 29.9% were seen for GY, with good values for Lg (2.57%), AV (2.92%) and NP (2.69%). Still using SD as the economic weight, situation II produced the second largest gain for NP (18.39%), bettered only by direct selection (26.34%). In general, this index did not give satisfactory gains in relation to the CI, SRI or BI. Costa et al. (2004) also did not find good results using this index; use of the DGI is therefore not recommended in segregating populations of soybean.
The index based on genotype-ideotype distance did not give good values for gain for the character GY (17.97% to 27.71%), being below the values found with CI and BI. Moreover, decreases were found for NDM, IHP and PHM in all situations, demonstrating the inefficiency of the index in this study (Table 3). Although the index did not produce good results in this study, Silva and Viana (2012) and Vasconcelos et al. (2010) found greater and well-distributed gains for the main characters of the passion fruit and alfalfa respectively.
In relation to the indices studied in the present work, situation III produced the lowest values for the character GY, but displayed a better distribution of gains for the other characters.Situation I gave the greatest gains for GY, the principal character seen in the soybean selection process.Characters related to production are of great importance in the selection of superior materials.Viana et al. (2013) found gains for GY using the SRI index, and Dallastra et al. (2014) used exploratory multivariate analysis in the selection of superior progeny for grain yield.
Overall, the BI displayed the greatest gains for GY (34.56% to 34.58%), with values close or equal to those found with direct selection, but produced small gains for the other characters in addition to displaying negative results for NP and NDM.
In general, the DGI and GID indices were not favourable to the situations being analysed. The CI and BI indices, in relation to total gains and gains for the character GY, did not show a great range for the situations or economic weights; they did however return good results against the other indices, with the base index showing a slight advantage.

Table 1 - Summary of the analysis of variance for the characters number of days to maturity (NDM), insertion height of the first pod (IHP), plant height at maturity (PHM), lodging (Lg), agronomic value (AV), number of pods (NP), oil content (OC) and grain yield (GY), in 386 progeny from segregating populations of soybean during the 2012/2013 agricultural year, Jaboticabal, SP
Treat (Adj): adjusted treatments; CV: general coefficient of variation; CVg: genetic coefficient of variation; CVe: experimental coefficient of variation; h²: heritability; ns: not significant by F-test; * and **: significant at 5% and 1% probability by F-test, respectively

Table 2 - Estimates of selection gains (GS%) found by direct selection for the eight characters under test, considering each character as the principal character, in 386 progeny from segregating populations of soybean during the 2012/2013 agricultural year, Jaboticabal, SP

Table 3 - Estimates of selection gains (GS%) found for eight characters, with the classic index (CI) proposed by Smith (1936) and Hazel (1943), the sum of ranks index (SRI) of Mulamba & Mock (1978), the Williams (1962) base index (BI), the desired gains index (DGI) of Pesek & Baker (1969) and the genotype-ideotype distance index (GID), with economic weights (EW) and different situations, in 386 progeny from segregating populations of soybean during the 2012/2013 agricultural year, Jaboticabal, SP
(1) Grain yield (GY) and agronomic value (AV) as principal characters. (2) Grain yield (GY), agronomic value (AV), number of pods (NP) and oil content (OC) as principal characters. (3) All as principal characters