
Background
Reversible phosphorylation catalysed by kinases is probably the most important regulatory mechanism in eukaryotes.
Methodology/Principal Findings
We studied the in vitro phosphorylation of peptide arrays exhibiting the majority of PhosphoBase-deposited protein sequences by factors in cell lysates from representatives of various branches of the eukaryotes. We derived a set of substrates from PhosphoBase whose phosphorylation by cellular extracts is common to divergent members of different kingdoms and thus may be considered a minimal eukaryotic phosphoproteome. The protein kinases (or kinome) responsible for phosphorylation of these substrates are involved in a variety of processes such as transcription, translation, and cytoskeletal reorganisation.
Conclusions/Significance
These results indicate that the divergence in eukaryotic kinases is not reflected at the level of substrate phosphorylation, revealing the presence of a limited common substrate space for kinases in eukaryotes. They also suggest the presence of a set of kinase substrates and regulatory mechanisms in an ancestral eukaryote that has since remained constant in eukaryotic life.


INTRODUCTION
Kinases are enzymes that transfer a phosphate to an acceptor, which can be carbohydrates, lipids or proteins. The superfamily of eukaryotic protein kinases responsible for phosphorylation of specific tyrosine, serine, and threonine residues is generally recognised as the major regulator of virtually all metabolic activities in eukaryotic cells including proliferation, gene expression, motility, vesicular transport, and programmed cell death [1]. Dysregulation of protein phosphorylation plays a major role in many diseases such as cancer and neurodegenerative disorders, and characterisation of the human kinome space revealed that 244 of 518 putative protein kinase genes are currently mapped to disease loci or cancer amplicons [2,3]. Accordingly, drugs targeting protein kinases are promising avenues for the therapeutic treatment of a plethora of different diseases [4]. In addition, elucidating kinase cascades has proved pivotal for understanding and manipulating cellular behaviour in a variety of divergent eukaryotes.
Most members of the protein kinase superfamily of enzymes can be recognised from their primary sequences by the presence of a catalytic eukaryotic protein kinase (ePK) domain of approximately 250 amino acids, whereas a small number of protein kinases do not share this catalytic domain and are often collectively called atypical kinases [5,6]. A comparison of kinase domains both within and between species displays substantial diversity, which is further increased by the non-catalytic functional domains of kinases that are involved in regulation, interactions with other protein partners, or subcellular localisation. This diversity in catalytic and non-catalytic domains explains the functional diversification of kinases within the eukaryotic kingdom. Eukaryotic protein kinases are now generally classified into several major groups [7,8]: the cyclic nucleotide- and Ca2+-/phospholipid-dependent kinases (AGC); a group consisting of the cyclin-dependent and cyclin-dependent-like kinases, mitogen-activated kinases, and glycogen synthase kinases (CMGC); the tyrosine kinases (TK); the tyrosine kinase-like group (which are in fact serine/threonine protein kinases) (TKL); the calmodulin-dependent kinases (CAMK); the casein kinase 1 group (CK); and the STE group (first identified in analyses of sterile yeast mutants), which includes the enzymes acting upstream of the mitogen-activated kinases (STE). These groups are summarised in Table 1, which is an extension of the table published by . Plants were considered not to have a TK group but instead to have a large receptor-like kinase (RLK) group. However, Miranda-Saavedra et al. recently showed, using a new library, that this is not the case. This library outperforms BLASTP and general Pfam hidden Markov models in the classification of kinase domains. They show that plants do contain tyrosine kinases and that diverse classes of organisms have a large overlap in kinase families [8].
It should be noted, however, that many eukaryotes also have kinase sequences that are not easily assigned to one of these groups and are referred to as "other protein kinases." Thus far, a pan-eukaryotic classification of kinase substrate sequences has not been attempted; it would give better insight into the evolution and variability of substrates and their kinases.
Comparative analyses of genomes have already demonstrated substantial differences in the kinomes of different eukaryotes. These differences are partly reflected in the highly variable number of protein kinase genes present in the genomes of different eukaryotes (e.g., the A. thaliana genome contains 973 apparent protein kinases [9], the H. sapiens genome contains 518 [2], S. purpuratus is predicted to have 353 protein kinases [10], D. melanogaster appears to have 240 [7], S. cerevisiae has 115 protein kinase genes [11], and P. falciparum exhibits only 65 putative protein kinases [12]), as well as in highly divergent kinase structures. For instance, plant and unicellular eukaryotic genomes do not contain any apparent kinases from the tyrosine kinase group, despite the detection of phosphorylated tyrosine residues in plants, suggesting that tyrosine phosphorylation in these organisms is nevertheless possible and may be mediated via other types of kinases [13][14][15][16]. Strikingly, of the 106 putative protein kinases identified in S. pombe on the basis of primary sequence, only 67 have orthologues in S. cerevisiae but 47 have an orthologue in H. sapiens [17], indicating a great deal of conservation in kinases between different organisms. This high degree of overlap might indicate conservation in kinase substrates too. In the P. falciparum kinome, 30% of protein kinases belong to the FIKK family of protein kinases, which is apicomplexa-specific and not found in other groups of eukaryotes [12]. As mentioned previously, plants contain a large group of serine/threonine kinases (receptor-like kinases) not found in other eukaryotes. These RLKs most likely share a common evolutionary origin with the receptor tyrosine kinases present in animals and are thus sometimes collectively referred to as receptor kinases; this shared origin may explain why tyrosine-containing motifs on the PepChip can be phosphorylated by these lysates [9].
Interestingly, a recent in silico report on the kinome of the sea urchin has provided new evidence on the evolution of different kinase subfamilies as being an intermediate eukaryote between animals and plants [10]. Fungi such as yeast and Neurospora do not appear to have representatives of the receptor kinase group, whereas the slime mould D. discoideum does have receptor kinases, which fits with the role of receptor kinases in multicellular organisms [18]. Thus, the eukaryotic family of protein kinases displays substantial diversity at the genetic level between different eukaryotic families.
Whether a kinase is able to phosphorylate its substrate depends on multiple factors, such as the physical localisation of both molecules and the availability of the substrate to the kinase. For a protein kinase, however, a very important factor is the amino acid context surrounding the phospho-acceptor. The amino acids surrounding the acceptor residue determine which kinase can bind correctly to the substrate and transfer a phosphate group to the acceptor. The fact that different kinases have different target substrates is being exploited for phosphoproteome profiling using peptide arrays. In this approach, kinase substrates described in the PhosphoBase phosphorylation site database [19] are spotted on a glass slide and incubated with cell lysates and 33P-labelled γ-ATP. Phosphorylation of target peptides in arrays has provided substrate phosphorylation profiles for LPS-stimulated monocytes and was instrumental for the discovery of Lck and Fyn kinases as early targets of glucocorticoids [20,21]. Importantly, the extent to which the diversity of kinases at the genetic level is reflected in differences in substrate specificity has not been investigated on a large scale.
In the present study, we investigated substrate requirements of phosphoproteomes of several divergent eukaryotes by employing peptide arrays on resting, unstimulated cellular lysates. Our results show that the divergence of eukaryotic protein kinases observed at the level of primary sequence is not completely reflected at the level of substrate phosphorylation, revealing a large overlap in the phosphorylation profiles from lysates of different eukaryotic origins. Furthermore, the identified minimal eukaryotic phosphoproteome suggests the presence of a set of kinase substrates in an ancestral eukaryote that has since remained invariant in eukaryotic life. The phosphoproteome seems to be involved in the maintenance of cell homeostasis as judged from the source of the peptides involved and thus may be a requisite for eukaryotic life [22].

RESULTS AND DISCUSSION
Phosphorylation of peptide arrays exhibiting mammalian-biased kinase substrates by divergent eukaryote sources
A peptide array (PepChip) was employed to determine the preference of cell lysates for kinase substrates. We used the PhosphoBase resource (version 2.0) (now called Phospho.Elm: http://phospho.elm.eu.org) as a source of diverse peptide substrates for kinases [19]. This database contains kinase substrate peptides from diverse organisms, including yeast and plant peptides, but is strongly biased towards mammalian peptide sequences (Figure 1A and Table S1). It must be noted that this set of substrates is only a small subset of the known protein kinase substrates and of the complete phosphoproteome, which is considered to be much larger. Arrays were constructed by covalently coupling chemically synthesised, soluble peptides to glass substrates as described previously [21]. Arrays contained 1152 different oligopeptides, covering the majority of substrate peptides available through PhosphoBase (version 2.0). On each carrier, the array was spotted twice to allow assessment of variability in substrate phosphorylation. The final physical dimensions of the array were 25×75 mm. Each peptide spot had a diameter of approximately 250 µm, and each spot was 620 µm from adjacent spots. When the arrays were incubated with [33P-γ] ATP and cell lysates from diverse eukaryotic sources, radioactivity was efficiently incorporated. In contrast, no radioactivity was incorporated when arrays were incubated with [33P-α] ATP and lysates, demonstrating that spot phosphorylation was mediated by specific attachment of the γ-phosphate of ATP to the oligopeptides in the array (Figure 1B). Both the technical replicates (same peptide on the same chip) and the biological replicates were generally of good quality (see supplementary data).
Remarkably, the efficiencies by which cell lysates derived from divergent eukaryotic sources phosphorylated specific peptides in the array overlapped substantially, with mammalian lysates showing 33P incorporation in a large number of spots (Figure 1C). This overlap in phosphorylation of a strongly mammalian-biased set of kinase substrates indicates that a subgroup of kinases present in divergent eukaryotes has similar peptide sequence requirements for catalysing phosphorylation reactions.

Serine (S), threonine (T), and tyrosine (Y) phosphorylation is similar in divergent eukaryotes
Eukaryotic organisms from the plant and fungal kingdoms were not thought to express archetypical tyrosine kinases, as judged from the primary sequences of kinases present in their genomes. However, such organisms have been reported to be capable of phosphorylating tyrosine residues via dual-specificity kinases [11,[14][15][16]23,24]. Another explanation for tyrosine phosphorylation by these lysates is that serine, threonine, and tyrosine are not the only phosphate acceptors in eukaryotes. Several lines of research have already shown that histidine and aspartate are also phosphorylated in eukaryotic cells (reviewed in [25][26][27]). Histidine and/or aspartate kinases could therefore be a confounder in our minimal phosphoproteome set (Table 2). This possibility is supported by the observation that, of the 353 monophospho-substrates, only 35% of the serine/threonine motifs contained a histidine (H) or aspartate (D), compared with 60% of the tyrosine motifs. The difference in the distribution of the H and D amino acids between S/T- and Y-containing motifs could imply that phosphorylation of histidine (H), aspartate (D) and tyrosine (Y) has a common ancestry and a coupled evolutionary background, which is not unlikely because remarkable similarities exist between these two classes of kinases (reviewed by Wolanin et al. and references therein) [28]. However, most of the tyrosine substrates in our minimal phosphoproteome panel do not contain a histidine or aspartate, and a common evolutionary background for histidine, aspartate and tyrosine phosphorylation therefore seems less likely. Thus, the absence of obvious tyrosine kinases in the plant and fungal kingdoms does not result in an inability to phosphorylate tyrosine-containing substrates in these organisms. We therefore compared the relative capacities of animal-derived cell lysates to phosphorylate tyrosine-containing peptide substrates with those of lysates obtained from the other two eukaryotic kingdoms.
To this end, we compared the contribution of serine, threonine, or tyrosine amino acid-containing substrates to the total phosphorylation of all peptide substrates, correcting for the relative abundance of the amino acid in the entire set of substrates. Peptides that can be phosphorylated at more than one residue would bias the results towards a particular amino acid. For example, a peptide that is phosphorylated at two adjacent serines could result in higher signal intensity than a peptide phosphorylated on one threonine. Thus, only those peptides with a single serine, threonine, or tyrosine phosphorylation site were considered (see Table S2). When array phosphorylation was studied in this manner, it appeared that the relative capacities of cell lysates to phosphorylate serine, threonine, or tyrosine substrates were remarkably similar, independent of the kingdom (Figure 2A and B).
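The abundance correction described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: it assumes each single-site peptide is recorded as an (acceptor residue, spot intensity) pair, and the function name `relative_contribution` and the example intensities are hypothetical.

```python
from collections import defaultdict


def relative_contribution(spots):
    """Abundance-corrected share of signal per acceptor residue.

    spots: iterable of (residue, intensity) pairs for single-site peptides.
    Returns {residue: (share of total signal) / (share of peptides)}.
    A value of 1.0 means the residue's signal is proportional to its abundance.
    """
    signal = defaultdict(float)
    count = defaultdict(int)
    for residue, intensity in spots:
        signal[residue] += intensity
        count[residue] += 1

    total_signal = sum(signal.values())
    total_count = sum(count.values())
    if total_signal == 0 or total_count == 0:
        raise ValueError("need at least one spot with non-zero signal")

    return {
        residue: (signal[residue] / total_signal) / (count[residue] / total_count)
        for residue in count
    }


# Hypothetical intensities, for illustration only.
example = [("S", 10.0), ("S", 30.0), ("T", 40.0), ("Y", 20.0)]
ratios = relative_contribution(example)
```

Computing this ratio separately for each lysate would allow the serine, threonine and tyrosine values to be compared across kingdoms.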

Clustering of array phosphorylation patterns along phylogenetic lines
We wished to determine whether the patterns of array phosphorylation reflect phylogenetic relations among the various sources of the cell lysates. To this end, we calculated the Spearman correlation coefficient among the array results using all datasets separately (Table S3), combining datasets of similar origin (Table S4), or combining datasets by organism (Table S5), and then clustered the results according to Johnson (Figure 2C) [29]. Histograms of the distributions of positive spots in these three analyses show a normal distribution that is shifted to the right (Histograms S1, S2 and S3). Cell lysates from plant and animal sources clustered intra regna, with plants showing less intraregnal variation than animals. This finding could arise from the fact that plant cell lysates were produced from entire organisms, whereas animal lysates were from specialised tissues. Strikingly, the variation in array phosphorylation was comparable between different human or different mouse lysates and between mammalian lysates and a Drosophila lysate. Substrate preferences for kinases do seem to have undergone some diversification after the separation of the animal and plant branches of the eukaryotes. For example, intraregnal variation in phosphorylation between monocotyledons and dicotyledons is smaller than the variation between M. musculus B-cells and H. sapiens macrophages. However, diversity in substrate preferences apparently has not increased after the separation of the Arthropoda and Chordata phyla, and the animal phosphoproteome was established early in animal evolution. This observation corresponds well with analyses of the animal phosphoproteome employing the primary sequences of kinases from divergent animals, as well as with very recent data showing that all major signalling pathways are present in the Porifera phylum, which separated from other animals very early in animal evolution [30,31].
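As an illustration of the correlation step, the sketch below computes Spearman's coefficient as the Pearson correlation of average ranks, using only the Python standard library. The lysate names and intensity values are hypothetical, and the software actually used for the study is not specified in the text.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation between two spot-intensity profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must cover the same spots")
    return pearson(ranks(x), ranks(y))


# Hypothetical spot intensities for three lysates (same five peptides each).
profiles = {
    "lysate_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "lysate_b": [2.0, 4.0, 6.0, 8.0, 11.0],
    "lysate_c": [5.0, 4.0, 3.0, 2.0, 1.0],
}
rho_ab = spearman(profiles["lysate_a"], profiles["lysate_b"])
rho_ac = spearman(profiles["lysate_a"], profiles["lysate_c"])
```

A rank-based coefficient depends only on the ordering of spot intensities, so it is insensitive to differences in overall signal strength between arrays.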
Lysates obtained from the fungal kingdom show much more diversity in array phosphorylation than animal lysates, with a P. pastoris lysate actually clustering with plants rather than with other members of the fungal kingdom. A possible explanation is that fungi consist of a diverse group of organisms closely related to plants [32,33]. It must be noted however, that the other two fungi in the set are also not clustered together, again indicating a large diversity. The diversity in the phosphoproteomes can of course also be caused by the changes in evolutionary pressure on the different samples. It is possible that the evolutionary pressure on metabolic processes in organisms like fungi is of a different level when compared to plants or animals. When the average phosphorylation patterns of the plant, fungal, and animal kingdoms were compared ( Figure 2C), the phosphorylation pattern of plants was found to more closely resemble the animal phosphorylation pattern than the fungal pattern.

Extraction of a minimal phosphoproteome
The clustering analysis indicated that a significant subset of peptide substrates has remained evolutionarily stable in terms of phosphorylation, irrespective of the eukaryotic source of the cell lysate. Hence, we decided to investigate the set of substrates whose phosphorylation is shared by all organisms tested in the present study. It appeared that phosphorylation of a set of 128 substrates was common to all organisms tested. If phosphorylation were random, one would expect only 0.6 substrates to be common to the different tissues (binomial distribution: 13 positive, 1152 total, cumulative chance 0.02; p<0.01; supplementary information in Table S6). Table 3 lists the set of substrates that are phosphorylated by the divergent eukaryote cell lysates tested. Some of the substrates in the set are highly similar, e.g., 12 slightly different peptides containing Ser15 of glycogen phosphorylase that were apparently deposited in PhosphoBase as separate substrates. When the list of pan-eukaryotic kinase targets was corrected for essentially identical peptide substrates, 71 different peptide substrates remained. These peptides are, in our set, the substrates for what may be termed a minimal eukaryotic phosphoproteome.
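The extraction step can be pictured as a set intersection followed by removal of near-duplicate peptides. The sketch below is illustrative only: the peptide identifiers, the lysate names, and the choice to treat peptides that share a source protein and phosphorylation site as "essentially identical" are assumptions, not details given in the text.

```python
def shared_substrates(positives_by_lysate):
    """Peptides scored positive in every lysate.

    positives_by_lysate: {lysate name: collection of positive peptide ids}.
    """
    sets = [set(ids) for ids in positives_by_lysate.values()]
    if not sets:
        return set()
    return set.intersection(*sets)


def collapse_by_site(peptide_ids, site_of):
    """Keep one representative peptide per (protein, site) key.

    site_of: {peptide id: (protein, site)}; the grouping key is an assumption
    about what counts as an essentially identical substrate.
    """
    representatives = {}
    for peptide_id in sorted(peptide_ids):
        representatives.setdefault(site_of[peptide_id], peptide_id)
    return sorted(representatives.values())


# Hypothetical data, for illustration only.
positives = {
    "plant": {"p1", "p2", "p3"},
    "fungus": {"p1", "p2", "p4"},
    "animal": {"p1", "p2", "p3", "p4"},
}
site_of = {
    "p1": ("glycogen phosphorylase", "Ser15"),
    "p2": ("glycogen phosphorylase", "Ser15"),
    "p3": ("protein A", "Thr10"),
    "p4": ("protein B", "Tyr7"),
}
common = shared_substrates(positives)
unique_sites = collapse_by_site(common, site_of)
```

In this toy example two peptides are common to all three lysates, and both map to the same site, so a single representative remains.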
Remarkably, all substrates in Table 3 contain one or more lysines (K), suggesting a bias in sequence composition or kinase. However, this seems unlikely, as studies by Brinkworth et al. showed that, when using prediction models for substrates of kinases, the basic amino acids lysine (K) and arginine (R) are often required for optimal recognition of substrates [34]. The presence of lysine and arginine in the substrates in Table 3 is therefore not completely unexpected. Furthermore, it must be noted that the annotation of the substrates is based on the data available at present and is therefore incomplete. Profiling fungal lysates on a primarily mammalian set of substrates can cause the phosphorylation of irrelevant motifs. However, the fact that these motifs are still phosphorylated clearly indicates the possible presence of kinase-substrate interactions in other organisms, even though no direct in vivo relevance is apparent. Table 4 shows the distribution of peptide substrates with regard to the molecular functions of their source proteins (according to Gene Ontology, based on human homologues in the Swiss-Prot database whenever possible). These data suggest that the phosphorylation events of this minimal phosphoproteome are associated with cell homeostasis; DNA replication, organisation, and stability; RNA translation; cytoskeletal organisation; motility; transmembrane ion transport; and signal transduction. Indeed, these are functions associated with every eukaryotic cell. When all of the peptides on the chip were subjected to a Blastp search (results are listed on http://www.koskov.nl), not all of the peptides included in the minimal kinome scored higher (p<0.01) for multiregnal homology hits than peptides not present in the pan-eukaryotic kinase substrate set. A possible explanation for this observation is that knowledge of non-mammalian regulation of phosphorylation is not as elaborate as that in mammals.
For most substrates in this minimal phosphoproteome set, a kinase capable of phosphorylating the peptide has been described (Table 3). Although most of the kinases in this list are common to all eukaryotes (e.g., phosphorylase kinase and S6 kinase), some are unique to animals. This is especially true for the tyrosine kinases Src, Ros, and c-Fms, which do not have orthologues in plants or fungi. Hence, phosphorylation of tyrosine in the substrates by plant or fungal cell lysates proceeds through other kinases that have substrate specificities similar to those of the members of the tyrosine kinase family in animals. Possible candidates for such phosphorylation are the kinases belonging to the dual-specificity DYRK, STE7, and Wee families of kinases, which are thought to be capable of tyrosine phosphorylation [35][36][37][38]. However, unique groups of kinases in these species could also be candidates. Interestingly, a recent analysis of the D. discoideum kinome identified a number of kinases that, based on their primary sequences, may act as tyrosine kinases [18]. In A. thaliana, APK1 is capable of tyrosine phosphorylation [13]. It would be interesting to investigate whether any of these kinases are responsible for the tyrosine phosphorylation events in the minimal phosphoproteome observed in the present study. Interestingly, inhibitors of animal tyrosine kinases also function in plants, suggesting substantial structural homology between the kinases responsible for tyrosine phosphorylation in both kingdoms [39,40]. Further insights into kinase evolution and specificity in different species are needed.

Peptides in this minimal phosphoproteome are not general kinase substrates
An important question concerns the necessity of this minimal eukaryotic phosphoproteome for cell function. The finding that a set of peptide substrates is phosphorylated by cell lysates from highly divergent eukaryotes may indicate that such kinase activity is essential for eukaryotic life and that strong evolutionary pressure exists to prevent its loss. An alternative explanation would be that these substrates act as so-called über-substrates that are relatively non-specifically phosphorylated by multiple kinases. To investigate this question, we incubated chips with relatively high concentrations of purified kinases, e.g., human Tpl2 (MAP3K8). We observed that the substrates phosphorylated by these purified kinases did not overlap with the set of substrates comprising this minimal eukaryotic phosphoproteome (R² = 0.11). Thus, phosphorylation of the substrates in the minimal phosphoproteome likely reflects the specific activities of multiple kinases in the eukaryotic cell lysates. However, this can only be validated when the phosphorylation profiles of all kinases are analysed separately. Apparently, strong evolutionary pressure on a minimal phosphoproteome exists, counteracting changes in substrate specificity for the kinases responsible for these phosphorylation events. By inference, this set of substrate motifs was probably present in an ancestral eukaryotic progenitor cell. This notion is in agreement with a recent study by Scheeff and Bourne that provides convincing evidence for the evolution of the various kinase families from a common ancestor [41]. It is tempting to speculate that this ancestral protein kinase, or other kinases that appeared relatively early in the history of eukaryotic life, delivered the foundation of essential kinase substrate motifs (the minimal eukaryotic phosphoproteome) that has remained stable ever since.
In conclusion, in this paper we described the presence of a set of kinase substrates that is recognised and phosphorylated by a diverse panel of eukaryotic cell lysates. This is remarkable because the set is biased towards mammalian motifs but can still be a target of non-mammalian lysates. That this occurs indicates that some level of conservation exists in the eukaryotic lineage. Analysis of the preferred substrates revealed that lysine and arginine have an important role in the primary sequence of kinase substrates. The possibility that the minimal kinome is produced by a few kinases seems unlikely, since single-kinase experiments reproduce only a very limited part of this panel. However, a limited set of kinases may well be able to reproduce this set. This seems not unlikely, since the major function of this set is to maintain cell homeostasis, whereas other, more specialised functions require specialised kinases.

Organisms
Whole extracts of C. albicans, P. pastoris, F. solani, D. melanogaster, T. aestivum and A. thaliana were used, together with cell types of M. musculus and H. sapiens as mentioned in the text.

Analysis of Peptide Array
The chips were exposed to a phosphorimager plate for 72 hours, and the density of the spots was measured and analyzed with array software.

Analysis
For the cluster analysis, the Spearman correlation coefficient was calculated for each combination of datasets, and clustering was performed using Johnson hierarchical clustering schemes. Inclusion parameters for each of the kinome profiles are described in the supplemental data, Table S4.
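A hierarchical clustering of lysates from pairwise correlations might look like the sketch below. Johnson's hierarchical clustering schemes include a minimum (single-linkage) and a maximum (complete-linkage) method; the text does not state which was used, nor how correlations were turned into distances, so the `1 - rho` distance, the default complete linkage, and the example correlation values are all assumptions made for illustration.

```python
def agglomerate(labels, distance, linkage=max):
    """Agglomerative clustering over a pairwise distance table.

    labels:   list of item names.
    distance: {frozenset({a, b}): distance} for every pair of names.
    linkage:  max for complete linkage, min for single linkage.
    Returns the merges in order as (members_a, members_b, merge_height).
    """
    clusters = [frozenset([name]) for name in labels]
    merges = []

    def between(c1, c2):
        # Cluster-to-cluster distance under the chosen linkage rule.
        return linkage(distance[frozenset((a, b))] for a in c1 for b in c2)

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = between(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical Spearman coefficients between three lysates.
rho = {
    ("wheat", "arabidopsis"): 0.9,
    ("wheat", "mouse"): 0.5,
    ("arabidopsis", "mouse"): 0.4,
}
# Assumed conversion: highly correlated profiles are close together.
distance = {frozenset(pair): 1.0 - r for pair, r in rho.items()}
merges = agglomerate(["wheat", "arabidopsis", "mouse"], distance)
single = agglomerate(["wheat", "arabidopsis", "mouse"], distance, linkage=min)
```

In this toy example the two plant lysates merge first under either linkage rule; only the height at which the mouse lysate joins them differs.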

Table S6
Calculation of the probability that 116 trials (substrates) are positive (in at least 90% of the samples, corrected for origin bias) out of a total of 1152 trials (= whole PepChip), using a binomial distribution calculation (http://www.stat.sc.edu/~west/applets/binomialdemo.html). The probability of success (p) in the binomial distribution is calculated from the cumulative relative amount of positive spots for every organism. The result of this test gives the chance that a spot is phosphorylated in every set.
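The kind of calculation described in this legend can be sketched as below. It is a hedged illustration rather than a reproduction of the applet: it assumes that the per-spot success probability is the product of each organism's fraction of positive spots (one reading of "cumulative relative amount"), and the fractions in the example are hypothetical. Log-space arithmetic is used so that the large binomial coefficients for 1152 trials do not overflow.

```python
import math


def chance_positive_everywhere(fractions):
    """Probability that a random spot is positive in every set.

    Assumes independence between sets, so the fractions are multiplied.
    """
    return math.prod(fractions)


def log_binom_pmf(k, n, p):
    """Natural log of P(X = k) for X ~ Binomial(n, p), with 0 < p < 1."""
    return (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log1p(-p)
    )


def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), with 0 < p < 1."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    logs = [log_binom_pmf(i, n, p) for i in range(k, n + 1)]
    top = max(logs)
    # Log-sum-exp: factor out the largest term to limit underflow.
    return min(1.0, math.exp(top) * sum(math.exp(v - top) for v in logs))


# Hypothetical positive-spot fractions for three organisms.
fractions = [0.5, 0.4, 0.3]
p_all = chance_positive_everywhere(fractions)
expected_common = 1152 * p_all            # expected number of shared spots
tail = binom_upper_tail(116, 1152, p_all)  # chance of 116 or more shared spots
```

With the real per-organism fractions substituted for the hypothetical ones, `expected_common` corresponds to the number of shared substrates expected by chance and `tail` to the probability of observing at least the reported number.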