Nature Communications, 2023, Full Text, Open Access

Elucidating the molecular programming of a nonlinear non-ribosomal peptide synthetase responsible for fungal siderophore biosynthesis

Matthew Jenner, Yang Hai, Hong Hanh Nguyen et al.

Research Article — Peer-Reviewed Source

Original research published by Jenner et al. in Nature Communications. Redistributed under Open Access — see publisher for license terms. MedTech Research Group provides these references for informational purposes. We do not conduct original research. All studies are the work of their respective authors and institutions.

Abstract

Siderophores belonging to the ferrichrome family are essential for the viability of fungal species and play a key role in the virulence of numerous pathogenic fungi. Despite their biological significance, our understanding of how these iron-chelating cyclic hexapeptides are assembled by non-ribosomal peptide synthetase (NRPS) enzymes remains limited, primarily due to the nonlinearity exhibited by the domain architecture. Herein, we report the biochemical characterization of the SidC NRPS, responsible for construction of the intracellular siderophore ferricrocin. In vitro reconstitution of purified SidC reveals its ability to produce ferricrocin and its structural variant, ferrichrome. Application of intact protein mass spectrometry uncovers several non-canonical events during peptidyl siderophore biosynthesis, including inter-modular loading of amino acid substrates and an adenylation domain capable of poly-amide bond formation. This work expands the scope of NRPS programming, allows biosynthetic assignment of ferrichrome NRPSs, and sets the stage for reprogramming towards novel hydroxamate scaffolds.

Full Text

02

Introduction

Iron is an indispensable cofactor for all microbial life. The ability to coordinate and activate molecular oxygen, in addition to optimal redox properties for electron transport, places it central to numerous cellular processes 1, 2. Equally, high intra-cellular iron concentrations give rise to Fenton and Haber–Weiss reactions, producing reactive oxygen species capable of cell damage 3. It is therefore vital that iron homoeostasis is carefully managed. Although iron has a high natural abundance, it exists predominantly as Fe3+ in aerobic environments and tends to form insoluble ferric hydroxides, rendering it inaccessible to microorganisms 4. As a result, organisms have evolved complex strategies for iron acquisition and storage. Whilst several mechanisms are known, a common approach employed by bacteria and fungi is the production of low-molecular-weight compounds known as siderophores, which serve as high-affinity iron chelators 5, 6. In fungi, the majority of siderophore compounds produced belong to the hydroxamate class. This functionality originates from l-ornithine, which is Nδ-hydroxylated and subsequently Nδ-acylated to yield either Nδ-acetyl-Nδ-hydroxy-l-ornithine (l-AHO) or Nδ-anhydromevalonyl-Nδ-hydroxy-l-ornithine (AMHO) 7. Typically, siderophores possess three hydroxamate units, producing a hexadentate ligand which promotes formation of a polyhedral Fe3+ complex with binding constants in the 10^22–10^32 range 8. The hydroxamate-containing units, l-AHO and cis-/trans-AMHO, are enzymatically incorporated into chemical scaffolds and define two separate families of hydroxamate siderophores. These include the depsipeptides, typified by fusarinine C (FSC) (1) 9 and coprogen 10, 11, which utilise either cis- or trans-AMHO as monomeric units and are excreted primarily to capture ferric iron (Fig. 1a) 12. In contrast, members of the ferrichrome family, such as ferricrocin (2) and ferrichrome (3), are generally considered to be intracellular and can incorporate l-AHO or cis-/trans-AMHO, in combination with other amino acids, and are principally used for iron storage, although not exclusively (Fig. 1b, Supplementary Fig. 1) 13, 14. Both extra- and intra-cellular siderophores are essential for the survival and virulence of many problematic fungal species, including the opportunistic pathogen Aspergillus fumigatus and the rice blast fungus Magnaporthe oryzae 15, 16.

Fig. 1 Hydroxamate-containing siderophores produced by fungi and the non-linear NRPSs responsible for their biosynthesis. (a) Domain organisation of the SidD NRPS responsible for the biosynthesis of fusarinine C (1). The A1 domain loads cis-AMHO units onto the T1 and T2 domains (highlighted by blue dashed arrows), a requirement due to an inactive A domain (dA) present in module 2. The NRPS acts in an iterative manner to condense three cis-AMHO units (highlighted in purple) as a depsipeptide, yielding (1) as the final product. (b) Domain organisation of the SidC NRPS responsible for the biosynthesis of ferricrocin (2). The structural variant, ferrichrome (3), is also highlighted. It is hypothesised that the A3 domain loads l-AHO units (highlighted in purple) onto the T4 and T5 domains in a similar manner to SidD (highlighted by blue dashed arrows), as their respective modules lack dedicated A domains. The domains encompassing modules 1 and 2 must incorporate three amino acids, [Gly-Ser-Gly] for (2) or [Gly-Gly-Gly] for (3) (highlighted in red). However, only two A domains are present, indicating unusual nonlinear behaviour of the NRPS. In each case, siderophores are shown in their ferric-bound state, and the hydroxamate-containing monomer unit is highlighted in purple. Domain abbreviations are as follows: C, condensation domain (dark blue); A, adenylation domain (purple); CT, terminating condensation domain (green); T, thiolation domain (black).

Whilst the physiological function of hydroxamate siderophores in fungi is well established, in some cases, the molecular details underpinning their biosynthesis remain poorly understood. Genes encoding large non-ribosomal peptide synthetase (NRPS) enzymes are known to be responsible for the assembly of peptidyl siderophores 10, 17, 18. These modular multi-domain enzymes are typically comprised of three domain types: condensation (C), adenylation (A) and thiolation (T). During the biosynthetic process, the peptidyl intermediates are covalently tethered to the T domains via a thioester linkage, afforded by a 4′-phosphopantetheine (Ppant) moiety post-translationally appended to each T domain 19. Within a module, the A domain specifically selects and loads an amino acid starter unit (module 1 only) or extender units onto the Ppant thiol of the T domains. This allows the C domain to catalyse amide bond formation between the growing peptidyl intermediate appended to the T domain of the upstream module, and the amino acid extender

03

Results

Reconstitution of the SidC NRPS and determination of adenylation domain specificity

In the first instance, we elected to examine SidC activity in vivo using Saccharomyces cerevisiae as a heterologous host. This was conducted to allow production of the associated siderophore(s) and to ascertain whether S. cerevisiae would be an appropriate host for recombinant overproduction of SidC for subsequent purification and in vitro analysis. To achieve this, the sidC NRPS gene from A. nidulans FSGC A1145 (Supplementary Data 1), in addition to sidA and sidL (required for production of the l-AHO precursor), were cloned into vectors with distinct selection markers, and transformed into S. cerevisiae BJ5464-npgA (a strain with the fungal Ppant transferase, NpgA, integrated into its chromosome to ensure phosphopantetheinylation of the resulting proteins) for siderophore production 32. Analysis of the small molecule extract from a 3-day culture indicated that ferricrocin (2) was produced (Fig. 2a, trace i), and large scale cultures allowed purification and isolation of ferricrocin (2) for structural elucidation, which was in agreement with previous reports (see Supplementary Figs. 2–4). Having established that an active form of SidC can be produced in S. cerevisiae, recombinant SidC protein was overproduced in S. cerevisiae JHY686 as a polyhistidine-tagged fusion protein, and was purified to near-homogeneity using immobilised metal-ion affinity chromatography (IMAC) (Supplementary Fig. 5), thereby allowing controlled exposure to substrates/cofactors 33. To ensure protein samples were completely in the holo-form prior to assays, purified SidC was enzymatically phosphopantetheinylated using the fungal phosphopantetheinyl transferase, NpgA (A. nidulans), as described previously 34. Following addition of ATP and Mg2+ cofactors to recombinant SidC, incubation with l-AHO alone (synthesised according to literature protocols 35, 36, Supplementary Figs. 6–8), or a combination of l-AHO + l-Ser, yielded no detectable products (Fig. 2a, traces ii and iii). However, the combination of l-AHO + Gly + l-Ser resulted in the production of ferricrocin (2) (Fig. 2a, trace v). Interestingly, incubation with l-AHO + Gly resulted in formation of a species consistent with ferrichrome (3) (Fig. 2a, trace iv), which was confirmed by comparison to a chemical standard (Fig. 2a, trace vii). These observations suggest that the SidC NRPS is capable of producing both ferricrocin (2) and ferrichrome (3) depending upon the availability of amino acid substrates, yet appears to produce exclusively ferricrocin (2) in the native host, probably due to the abundance of l-Ser.

Fig. 2 Reconstitution of SidC NRPS and inter-modular loading of l-AHO residues by the A3 domain. (a) HPLC traces monitored at 420 nm for the following: (i) production of 2 via heterologous expression of sidC, sidA and sidL in S. cerevisiae JHY686; (ii)–(v) in vitro enzymatic reactions of SidC in the presence of l-AHO, +/− l-Ser and +/− Gly; (vi)–(vii) authentic standards of 2 and 3. Presented with either a cellular pool of amino acids (i.e. in vivo experiment), or l-AHO + l-Ser + Gly in vitro, SidC produces 2 exclusively. However, when provided with l-AHO + Gly in vitro, SidC produces solely 3. Experiments were performed in triplicate and representative spectra are shown. (b) Deconvoluted intact protein mass spectra of holo-SidC C3A3T3C4T4 (top) following incubation with l-AHO, ATP and Mg2+, showing loading of either two l-AHO units onto the T3 and T4 domains, or a condensed di-l-AHO species on the T4 domain, and of holo-SidC C3A3T3^0C4T4 (bottom) following incubation with l-AHO, ATP and Mg2+, showing loading of a single l-AHO unit onto the T4 domain. The S3151A mutation in the T3 domain means it is unable to be modified with a Ppant moiety, thus preventing loading of l-AHO. (c) Deconvoluted intact protein mass spectra of holo-SidC C5T5CT (top) and holo-SidC T5CT (bottom) following incubation with holo-SidC C3A3T3, l-AHO, ATP and Mg2+. Loading of l-AHO is only observed when the N-terminal C domain of each construct is present. Mass shifts corresponding to biosynthetic steps are highlighted with red arrows, and proposed intermediates are displayed. Markers for low abundance species are based on calculated masses or previously measured spectra. Exact measured and observed masses are detailed in Supplementary Table 2. Experiments were performed in duplicate and representative spectra are shown.

Our initial biosynthetic model hypothesised that each of the three A domains within SidC are responsible for activation and loading of l-Ser, Gly and l-AHO (Fig. 1b). In bacteria, bioinformatic analysis allows accurate prediction of the substrate specificity for A domains, primarily based on highly conserved amino acid motifs within the enzyme active site. However, this approach is not poss

04

The SidC A3 domain catalyses intra- and inter-modular loading of l-AHO

The biosynthesis of both ferricrocin (2) and ferrichrome (3) requires installation of three l-AHO residues, yet the SidC NRPS possesses only a single A domain capable of activating l-AHO: the A3 domain situated in module 3. Furthermore, modules 4 and 5 lack integrated A domains for loading amino acid units to their cognate T domains (Fig. 1b). Based on our previous observations in the SidD NRPS, we postulated that the SidC A3 domain may be capable of loading l-AHO units onto the T4 and T5 domains, in addition to its cognate T3 domain. This would result in the T3, T4 and T5 domains being charged with l-AHO units, which can then be condensed together by the sequential activity of the C4 and C5 domains to yield the tri-l-AHO motif. An alternative model would involve iterative activity of module 3 to generate tri-l-AHO appended to the T3 domain; however, this would render the C4T4C5T5 region redundant for the biosynthesis and seemed less likely. To investigate this aspect, a SidC C3A3T3 tri-domain construct was cloned, overproduced and purified to examine the covalently tethered intermediates loaded onto the T3 domain. Following conversion to its holo form and subsequent incubation with ATP, Mg2+ and l-AHO (15 min), intact protein MS analysis of C3A3T3 revealed loading of a single l-AHO unit, indicated by a +172 Da mass shift relative to the mass of holo-C3A3T3 (Supplementary Fig. 11). This highlighted that the standalone SidC C3A3T3 tri-domain is only capable of loading a single l-AHO unit onto its cognate T3 domain and ruled out the possibility of iterative loading. We next generated a SidC C3A3T3C4T4 penta-domain construct to examine the ability of the A3 domain to catalyse loading of l-AHO onto the T4 domain. Under the same conditions, two sequential +172 Da mass shifts were observed in the mass spectrum, congruent with loading of two l-AHO units (Fig. 2b, top). The measured masses were consistent with either two l-AHO units loaded in an uncondensed form onto the T3 and T4 domains, or in a condensed di-l-AHO species on the T4 domain. To validate these observations, a mutant of the SidC C3A3T3C4T4 construct was produced, where the Ser residue that serves as the Ppant group attachment site was mutated to Ala (S3140A, designated as T3^0), allowing only the T4 domain to be converted to its holo form. When subjected to the loading assay, the SidC C3A3T3^0C4T4 protein was able to activate and transfer an l-AHO unit onto the T4 domain, indicated by a single +172 Da mass shift in the intact protein mass spectrum (Fig. 2b, bottom). Inter-modular loading of T4 by the A3 domain in the SidC NRPS is reminiscent of behaviour observed for the SidD NRPS, where the A1 domain is able to prime T2, situated in the downstream module. In both of these instances, the A domain is interacting with a T domain situated one module downstream. However, in the SidC NRPS, we hypothesised that the A3 domain is capable of loading the T5 domain, situated two modules downstream. To probe this, we conducted a bimolecular assay between SidC C3A3T3 and SidC C5T5CT in the presence of ATP, Mg2+ and l-AHO. This resulted in ~35% of l-AHO loading onto the T5 domain after a 60 min incubation (Fig. 2c, top), and indicated that the A3 domain is capable of loading the T5 domain, situated two modules downstream. An equivalent experiment using a C4T4 construct yielded comparable levels of l-AHO loading, serving as a control measurement, and also highlighting the reduced efficiency when domains are not covalently tethered in megasynth(et)ases (Supplementary Fig. 12). Interestingly, spectra obtained from this experiment also gave rise to a small peak congruent with a di-l-AHO species. The relatively small amount of this condensed species relative to the mono-l-AHO suggests that the C3 domain does not preferentially condense l-AHO units together, indicating that the species in Fig. 2b (top) is likely two uncondensed l-AHO units. Analogous assays using the SidC T5CT didomain and SidC T4 domain (i.e. without the N-terminal C domain) resulted in no detectable l-AHO loading (Fig. 2c, bottom and Supplementary Fig. 12), implying that the N-terminal C domains facilitate the loading reaction. These results suggest two architectural models for non-linear l-AHO loading by the A3 domain. One possibility is that intra-chain loading of l-AHO is promoted by a 3-dimensional arrangement of the SidC NRPS that enables proximity of the A3 domain to the T4 and T5 domains (Supplementary Fig. 13a). Here, the presence of the C4 and C5 domains may be essential to provide an interaction ‘platform’ for the T4 and T5 domains to access the A3 domain. A second possibility involves inter-chain communication between two SidC proteins, allowing the A3 domain to load l-AHO onto the T4 and T5 domain in trans, whilst loading th

05

SidC C2A2T2 tri-domain catalyses non-canonical loading of Gly residues

We next turned our attention to the biosynthetic steps required for the formation of the tripeptide backbone. This region is the sole structural difference between ferricrocin (2) and ferrichrome (3), possessing [Gly]-[l-Ser]-[Gly] and [Gly]-[Gly]-[Gly] motifs, respectively. The SidC A domain specificity assays determined that the A1 domain preferentially activates l-Ser, and that the A2 domain only activates Gly (Supplementary Figs. 9, 10). Based on these observations, linear assembly of the amino acid units would yield a T2-[Gly]-[l-Ser]-NH2 species, requiring a second Gly to be non-canonically condensed onto the amine of l-Ser to yield the T2-[Gly]-[l-Ser]-[Gly]-NH2 tripeptide intermediate necessary for ferricrocin (2) production. However, for ferrichrome (3), two scenarios seemed plausible: (i) the A1 domain would instead load Gly onto the T1 domain (note that some activity towards Gly was observed in specificity assays; Supplementary Fig. 9), allowing a T2-[Gly]-[Gly]-NH2 species to be formed, followed by non-canonical condensation of a third Gly to yield the T2-[Gly]-[Gly]-[Gly]-NH2 intermediate; or (ii) the A1 and T1 domains are not utilised, leaving the A2 domain to generate a T2-[Gly]-NH2 species, which must undergo two sequential non-canonical condensation events to yield the T2-[Gly]-[Gly]-[Gly]-NH2 intermediate. In order to unpick these biosynthetic steps, we first produced a SidC(ΔA1T1) construct to examine whether the domains of module 1 are essential for siderophore production. Using S. cerevisiae as a heterologous host, SidC(ΔA1T1) was observed to produce ferrichrome (3) exclusively (Fig. 3a, trace i); no ferricrocin (2), the product of full-length SidC under the same culture conditions (Fig. 2a), was detected. Purification of the recombinant SidC(ΔA1T1) protein allowed controlled exposure to amino acid substrates. Here, SidC(ΔA1T1) produced ferrichrome (3) exclusively, provided that both Gly and l-AHO were present (Fig. 3a, traces ii–iii). The inclusion of l-Ser did not promote ferricrocin (2) production (Fig. 3a, trace ii), and omission of l-AHO resulted in no detectable products (Fig. 3a, trace iv). These data indicated that the A1T1 domains are essential for ferricrocin (2) production, but are not required for ferrichrome (3) formation. Furthermore, these data indicated that during ferrichrome (3) biosynthesis, neither the A1 nor the T1 domain participates in the recruitment of the additional Gly residue. This left the intriguing possibility that module 2 alone (i.e. C2A2T2) could be responsible for constructing the T2-[Gly]-[Gly]-[Gly]-NH2 intermediate required for ferrichrome (3) biosynthesis.

Fig. 3 Iterative loading of Gly residues by the A2 domain and chain-length control by the C3 domain. (a) HPLC traces monitored at 420 nm for the following: (i) production of 3 via heterologous expression of sidC(ΔA1T1), sidA and sidL in S. cerevisiae JHY686; (ii)–(iv) in vitro enzymatic reaction of SidC(ΔA1T1) in the presence of Gly, +/− l-Ser and +/− l-AHO; (v)–(vi) authentic standards of 2 and 3. Experiments were performed in triplicate and representative spectra are shown. (b) Stacked deconvoluted intact protein mass spectra of holo-SidC C2A2T2 in the presence of Gly, ATP and Mg2+. Assays conducted with excess Gly/ATP are shown in spectra (ii) and (iii) at 10 min and 60 min time intervals, and with limited concentrations of Gly/ATP in spectra (iv) and (v) after a 10 min incubation. Mass shifts corresponding to mono-/poly-Gly species are highlighted with red arrows, and proposed intermediates are displayed. (c) Stacked deconvoluted intact protein mass spectra of l-AHO-SidC C3A3T3 following incubation with holo-SidC C2A2T2, Gly, ATP and Mg2+ for 10 min and 60 min incubation periods. Spectrum (i) shows l-AHO-SidC C3A3T3 alone; spectra (ii) and (iii) show increasing production of the condensed product, l-AHO-Gly3-SidC C3A3T3, over time. Only the Gly3 condensed product is observed, not Gly1 or Gly2, suggesting that this is not a stepwise process. Instead, Gly3 must be formed on SidC C2A2T2 before the SidC C3 domain will catalyse the condensation reaction. Spectrum (iv) shows an experiment where a 60 min pre-incubation of holo-SidC C2A2T2 with Gly, ATP and Mg2+ was conducted to allow formation of Gly3/Gly5-SidC C2A2T2 (see Fig. 3b, spectrum (iii)), before addition of l-AHO-SidC C3A3T3. Only the Gly3 condensed product is observed, not Gly5, indicating that the SidC C3 domain selectively condenses the Gly3-SidC C2A2T2 species with l-AHO only. Mass shifts corresponding to biosynthetic species are highlighted with red arrows, and proposed intermediates are displayed. Exact measured and observed masses are detailed in Supplementary Tables 2 and 3. Experiments were performed in duplicate and representative spectra ar

06

The SidC C3 domain is a chain-length gatekeeper

Our biochemical investigations of SidC C2A2T2 demonstrated its ability to generate poly-Gly chains of up to five residues in length (Fig. 3b). However, the biosynthetic products of SidC, ferricrocin (2) and ferrichrome (3), both require the T2-tethered intermediate to be three residues in length: T2-[Gly]-[l-Ser]-[Gly]-NH2 and T2-[Gly]-[Gly]-[Gly]-NH2, respectively. Therefore, in order to maintain biosynthetic fidelity, we postulated that the SidC C3 domain imposes a selective requirement for three-residue chains appended to the T2 domain in order to catalyse condensation with the first l-AHO unit, effectively acting as a gatekeeper. To explore this hypothesis, we incubated holo-SidC C2A2T2 with holo-SidC C3A3T3 in the presence of Gly, l-AHO, ATP and Mg2+, and monitored the SidC C3A3T3 protein using intact protein MS at several time points. After 10 min, a new peak at +171 Da from the l-AHO-C3A3T3 species had emerged, indicating condensation of a tri-Gly unit onto the l-AHO (Fig. 3c, spectra (i) and (ii)), with the intensity of this species increasing over a 60 min period (Fig. 3c, spectrum (iii)). The absence of signals corresponding to mono- (+57 Da) or di-Gly (+114 Da) species condensed with l-AHO-C3A3T3 during the time-course strongly indicated that the entire T2-[Gly]-[Gly]-[Gly] intermediate is condensed with l-AHO, rather than sequential addition of Gly residues. To examine whether the SidC C3 domain can discriminate between tri-, tetra- and penta-Gly intermediates, we pre-incubated holo-SidC C2A2T2 with Gly, ATP and Mg2+ for 60 min to generate a mixture of poly-Gly chain lengths, represented by Fig. 3b, spectrum (iii). The remaining Gly in the reaction was then removed by multiple cycles of ultrafiltration, before addition to holo-SidC C3A3T3 in the presence of l-AHO, ATP and Mg2+ for 60 min. Subsequent intact protein MS analysis revealed only the tri-Gly species condensed with T3-tethered l-AHO (Fig. 3c, spectrum (iv)), suggesting that the SidC C3 domain possesses strict selectivity for poly-amino acid chain lengths where n = 3, thereby acting as a critical checkpoint during the biosynthesis.
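The mass shifts quoted in the loading and condensation assays (+172 Da per l-AHO, +57, +114 and +171 Da for one to three Gly residues) follow from simple arithmetic: each unit attached through a thioester or amide bond adds the mass of the free amino acid minus one water. The sketch below is illustrative only and is not the authors' analysis pipeline. It assumes average atomic masses and assumes the molecular formula C7H14N2O4 for l-AHO, which is not stated in the text.

```python
# Average atomic masses in g/mol (rounded standard values).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}
WATER = {"H": 2, "O": 1}

# Elemental compositions of the free amino acids.
GLY = {"C": 2, "H": 5, "N": 1, "O": 2}
L_AHO = {"C": 7, "H": 14, "N": 2, "O": 4}  # assumed formula, not given in the text


def formula_mass(formula):
    """Sum the average atomic masses for an elemental composition."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def residue_mass(formula):
    """Mass added to the protein when one unit is attached with loss of water."""
    return formula_mass(formula) - formula_mass(WATER)


def poly_gly_shift(n):
    """Expected mass shift for a chain of n Gly residues."""
    return n * residue_mass(GLY)


if __name__ == "__main__":
    print(f"l-AHO loading: +{residue_mass(L_AHO):.1f} Da")
    for n in range(1, 6):
        print(f"Gly{n}: +{poly_gly_shift(n):.1f} Da")
```

Rounded to the nearest dalton, these values match the shifts reported for single l-AHO loading and for the mono-, di- and tri-Gly species.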

07

A biosynthetic model for siderophore production by the SidC NRPS

Our data allow proposal of a rational biosynthetic model for the construction of ferricrocin (2) and ferrichrome (3) by the SidC NRPS (Fig. 4). In ferricrocin (2) biosynthesis, the process is initiated by the A1 domain loading an l-Ser residue onto the T1 domain. The C2 domain then condenses this residue with a Gly residue loaded onto the downstream T2 domain by the A2 domain, yielding the T2-[Gly]-[l-Ser]-NH2 intermediate (Fig. 4a). This initial step is not required for ferrichrome (3) biosynthesis, which commences with A2 domain-catalysed loading of a single Gly residue onto the T2 domain. This residue is then condensed with a second Gly residue through the amide bond-forming capability of the A2 domain, yielding a T2-[Gly]-[Gly]-NH2 intermediate (Fig. 4b). Both T2-tethered dipeptide intermediates during ferricrocin (2) and ferrichrome (3) biosynthesis then undergo addition of a Gly residue to the free NH2 group, catalysed by the A2 domain, producing T2-[Gly]-[l-Ser]-[Gly]-NH2 and T2-[Gly]-[Gly]-[Gly]-NH2 intermediates. Whilst the A2 domain is capable of adding further Gly residues to extend the peptidyl chain over time (Fig. 3b), the nascent tripeptide intermediates are rapidly and selectively condensed with the T3-tethered l-AHO species by the C3 domain.

Fig. 4 Proposed biosynthetic models for SidC-catalysed formation of ferricrocin and ferrichrome. (a) Ferricrocin (2) biosynthesis commences with condensation between l-Ser and Gly, which is catalysed by the SidC C2 domain, forming an (l-Ser)-Gly dipeptide (n = 2) tethered to the SidC T2 domain. Non-canonical ligation of a Gly unit onto the amine of l-Ser produces a Gly-(l-Ser)-Gly tripeptide (n = 3), which can undergo condensation with l-AHO catalysed by the chain-length selective SidC C3 domain. The SidC A3 domain loads l-AHO onto SidC T4 and T5 domains, allowing a succession of condensation events to generate a Gly-(l-Ser)-Gly-(l-AHO)3 hexapeptide intermediate bound to the SidC T5 domain. Chain release is catalysed by the C-terminal CT domain to yield the biosynthetic product 2. (b) Ferrichrome (3) biosynthesis can occur in the absence of l-Ser, where canonical loading of Gly onto the T2 domain is followed by two successive rounds of non-canonical Gly ligation to yield a Gly3 species (n = 3) tethered to the T2 domain. The remaining steps are identical to the biosynthesis of 2, to yield the biosynthetic product 3.
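The proposed model can be restated as a small decision procedure, which makes it easy to check against the reconstitution outcomes reported earlier (no product without Gly or l-AHO, ferrichrome with Gly + l-AHO, ferricrocin when l-Ser is also available and module 1 is present). The following is a toy sketch under those assumptions. The function names are hypothetical, and the code encodes only substrate logic, not kinetics or the relative rates of Gly extension and C3 condensation.

```python
def t2_chain(substrates, has_module1=True):
    """Build the T2-tethered tripeptide (N- to C-terminus), or return None."""
    if "Gly" not in substrates:
        return None  # A2 only activates Gly, so nothing is loaded onto T2
    chain = ["Gly"]  # canonical loading of Gly onto T2 by A2
    if has_module1 and "l-Ser" in substrates:
        chain.insert(0, "l-Ser")  # C2 condenses T1-bound l-Ser with T2-bound Gly
    while len(chain) < 3:
        chain.insert(0, "Gly")  # non-canonical Gly ligation by A2
    return chain


def c3_accepts(chain):
    """Chain-length gate: C3 condenses only three-residue chains with l-AHO."""
    return len(chain) == 3


def sidc_product(substrates, has_module1=True):
    """Return (name, hexapeptide) predicted by the toy model, or None."""
    substrates = set(substrates)
    chain = t2_chain(substrates, has_module1)
    if chain is None or "l-AHO" not in substrates or not c3_accepts(chain):
        return None
    # A3 loads l-AHO onto T3, T4 and T5; C3, C4 and C5 condense sequentially.
    hexapeptide = chain + ["l-AHO"] * 3
    name = "ferricrocin" if "l-Ser" in chain else "ferrichrome"
    return name, hexapeptide


if __name__ == "__main__":
    print(sidc_product({"l-AHO", "Gly", "l-Ser"}))
    print(sidc_product({"l-AHO", "Gly"}))
    print(sidc_product({"l-AHO", "Gly", "l-Ser"}, has_module1=False))
```

Setting `has_module1=False` mimics the SidC(ΔA1T1) construct, for which the model predicts ferrichrome regardless of l-Ser availability.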

08

Discussion

Ferrichrome NRPSs are found in the vast majority of Ascomycetes, providing the biosynthetic machinery for siderophore production. Despite producing near-identical products, ferricrocin (2) and ferrichrome (3), substantial differences in the NRPS domain architecture exist within the NRPS family (Types I–VI, Fig. 5). Phylogenetic work has suggested that ferrichrome NRPSs originate from an ancestral colinear hexamodule NRPS, created by adjacent duplication of complete NRPS modules resulting in two lineages: NPS2 and NPS1/SidC. The recently reported Sid1 NRPS responsible for AS2488059 biosynthesis, a related ferrichrome siderophore, might be considered as a contender for this ancestral gene (Supplementary Fig. 18) 44–46. Here, dedicated A domains are employed to load each of the three l-AHO residues, in addition to the three backbone residues (Asn, Val and Phe), totalling six A domains in the NRPS. However, despite the similarities, phylogenetic analyses suggest that the Sid1 NRPS is of a different evolutionary origin to siderophores of the ferrichrome family 46. All combinations of the ferrichrome family of NRPSs give rise to unusual non-linear domain organisations, which cannot be reconciled with standard biosynthetic logic of NRPSs. Plausible biosynthetic proposals linking the domain organisation to the peptidyl product require inter-modular loading of amino acid substrates by A domains up to n + 2 modules downstream (blue arrows), and/or A domains capable of creating (poly)amide chains on the same T domain (red arrows). Our study of the SidC NRPS highlights that both activities are possible in NRPSs and allows evidence-based biosynthetic proposals for all variations of the ferrichrome NRPS (Fig. 5).

Fig. 5 Biosynthetic schemes for the six modular architectures of ferrichrome class of NRPSs. The new programming rules allow the biosynthetic assignment of other architectures of ferrichrome-family synthetase NRPSs. Examples include: Sid2 (ferrichrome), U. maydis 53; SidC (ferricrocin), A. nidulans 27; Sib1 (ferrichrome), S. pombe 17, 54; NPS2 (ferricrocin), F. graminearum 55; NPS2 (ferricrocin), C. heterostrophus 56; CsNPS2 (basidioferrin), C. subvermispora 47; and Cpf1 (coprinoferrin A), C. cinerea 48. The inter-module loading events, either (n + 1) or (n + 2), are highlighted in blue, and non-canonical loading of Gly residues is highlighted in red. In each case, siderophores are shown in their desferric state, and the hydroxamate-containing monomer unit is highlighted in purple. The lineage classification and Type I–VI groupings are based on previous phylogenetic analyses of ferrichrome synthetase NRPSs conducted by Bushley et al. 18. All biosynthetic proposals are based on observations from the SidC NRPS in this study, but other possibilities may exist.

Members of the NPS2 lineage (Types III, IV and V) all possess the correct number of C domains required for the number of amide bonds formed in the peptidyl product. However, loss or degeneration of A domains requires inter-module loading of T domains with amino acid substrates. This is observed in the tri-Orn region, as characterised for SidC, and in the tripeptide region for Type III and IV. In contrast, the Type I and II members of the NPS1/SidC lineage both require an amide bond-forming A domain to compensate for the lack of C domains in the NRPS, in addition to inter-module loading capabilities in the tri-Orn region. The CsNPS2 and Cpf1 NRPSs (Type VI) are the most truncated variation and appear to have lost much of the N-terminus, leaving just the tri-Orn region that requires inter-module loading capabilities. Recent assignments of the CsNPS2 and Cpf1 products as basidioferrin (4) and coprinoferrin (5) revealed a structure comprised of three condensed l-AHO units for basidioferrin (4) 47, and three l-hydroxyhexanoyl ornithine (l-hhOrn) units for coprinoferrin (5) 48. In the latter, this suggests that the Cpf1 A1 domain (equivalent to A3 in SidC and Sid2) has evolved specificity towards a larger hydroxamate substrate, whilst retaining the ability to conduct inter-module loading (Fig. 5). Taken together, our observations highlight the impressive evolutionary changes employed by fungal NRPSs to improve atom economy and increase structural diversity in their biosynthetic assembly-lines. Our improved understanding of the biosynthetic rules has set the stage for manipulating and recombining these pathways towards novel hydroxamate-containing scaffolds.

09

Methods

Molecular cloning and site directed mutagenesis

Yeast Expression Constructs: the SidC (Genbank: XM_653119 [ https://www.ncbi.nlm.nih.gov/nuccore/XM_653119.2 ]), SidA (Genbank: XM_658335 [ https://www.ncbi.nlm.nih.gov/nuccore/XM_658335.1 ]), and SidL (Genbank: XM_652967 [ https://www.ncbi.nlm.nih.gov/nuccore/XM_652967 ]) gene exon fragments were cloned from the cDNA library prepared from the mRNA extract of A. nidulans FSGC A1145 strain 49 cultured on Czapek-Dox (CD) agar. The corresponding yeast expression plasmids were assembled through yeast homologous recombination using a Frozen-EZ Yeast Transformation II Kit (Zymo Research). Gene fragments were integrated into a 2µ-based yeast expression vector with auxotrophic markers and ADH2 promoter and terminator regions. All proteins were cloned in-frame with an N-terminal pHis 8 tag to facilitate purification.

E. coli Expression Constructs: target regions of SidC were subcloned into either pHis 6 -MBP-pET28a or pHis 6 -pET28a vectors. All proteins were cloned in-frame with an N-terminal TEV-cleavable tag (either MBP or pHis 6 ), allowing removal post-purification. Primers used for the cloning of SidC constructs and mutagenic primers to generate point-mutations/truncations are detailed in Supplementary Table 1 .

10

Protein overproduction and purification

Yeast expression constructs

The full-length proteins were expressed in S. cerevisiae JHY686 33 strain cultured in YPD medium. Briefly, single colonies of yeast cells harbouring expression plasmids were inoculated into SDCt uracil drop-out culture and left growing at 28 °C for 2 days. The seed culture was then inoculated into YPD culture (20 mL to 1000 mL) and left growing at 28 °C for another 2 days. Cells were harvested by centrifugation and washed once with cell lysis buffer (50 mM K 2 HPO 4 (pH 7.5), 10 mM imidazole, 300 mM NaCl, 5% glycerol). Cells were flash frozen in liquid nitrogen and lysed using a stainless-steel Waring blender. The cell lysate was cleared by centrifugation at 26,000 g for 60 min at 4 °C and the supernatant was filtered through a 0.22 µm filter (Millipore). The filtrate was incubated with Ni 2+ -NTA resin for 30 min at 4 °C and then the slurry was loaded onto a gravity column. The resin was washed and eluted with increasing concentrations of imidazole in cell lysis buffer. The fractions were examined by SDS-PAGE. Pure fractions were concentrated to ~20 mg/mL using Amicon concentrators (Millipore), supplemented with 10% glycerol and stored at −80 °C. Protein concentrations were determined by Bradford assay. Typically, 2 L of cell culture yielded 1–10 mg of protein depending on the nature of the protein construct.

11

E. coli expression constructs

A single colony of E. coli BL21 (DE3) that had been transformed with the appropriate expression vector was picked and used to inoculate LB medium (5 or 10 mL) containing kanamycin (50 µg/mL). The resulting culture was incubated overnight at 37 °C and 180 rpm then used to inoculate LB medium (0.5 or 1 L) containing kanamycin (50 µg/mL). The resulting culture was incubated at 37 °C and 180 rpm until the optical density of the culture at 595 nm reached 0.6, then IPTG (1 mM) was added and growth was continued overnight at 15 °C and 180 rpm. The cells were harvested by centrifugation (4000 g , 15 min, 4 °C) and re-suspended in buffer (20 mM Tris-HCl, 100 mM NaCl, 20 mM imidazole, pH 7.4) at 10 mL/L of growth medium then lysed using a Constant Systems cell disruptor. The lysate was centrifuged (37,000 g , 30 min, 4 °C) and the resulting supernatant was loaded onto a HiTrap FF Chelating Column (GE Healthcare), which had been pre-loaded with 100 mM NiSO 4 and equilibrated in re-suspension buffer (20 mM Tris-HCl, 100 mM NaCl, 20 mM imidazole, pH 7.4). Proteins were eluted in a stepwise manner using re-suspension buffer containing increasing concentrations of imidazole: 50 mM (5 mL), 100 mM (3 mL), 200 mM (3 mL) and 300 mM (3 mL). The presence of the protein of interest in fractions was confirmed by SDS-PAGE, and an additional gel filtration step (Superdex 75/200, GE Healthcare) was used to further purify proteins where necessary. Fractions containing the protein of interest were pooled and concentrated to 250–400 µM using a Viva-Spin centrifugal concentrator (Sartorius) at an appropriate MWCO. Samples were snap-frozen in liquid N 2 and stored at −80 °C.

12

Siderophore isolation/preparation

Desferriferricrocin and ferricrocin were obtained through coexpression of sidC , sidA , and sidL genes in S. cerevisiae BJ5464-npgA strain 50 . Briefly, competent yeast cells were transformed with plasmids XW55-SidC, XW06-SidL and XW02-SidA and the colonies harbouring these three plasmids were selected using minimal medium lacking uracil, tryptophan, and leucine. A colony was inoculated into the corresponding liquid minimal medium, and the cell culture was grown at 28 °C for 2 days. To induce production, the starting culture was inoculated into YPD medium and left growing at 28 °C for 3 days. The cell pellet was harvested by centrifugation and the siderophore product was extracted using acetone. The organic extract was dried by rotary evaporation and the residue was dissolved in methanol and subjected to LC-MS analysis on a Shimadzu 2020 LC-MS (Phenomenex Kinetex, 1.7 µm, 2.0 × 100 mm, C18 column) using positive and negative mode electrospray ionisation with a linear gradient of 5–95% MeCN–H 2 O supplemented with 0.1% (v/v) formic acid in 15 min followed by 95% MeCN for 3 min with a flow rate of 0.3 mL/min. To convert desferriferricrocin into ferricrocin, FeCl 3 (final concentration of 1 mM) was added to the organic extract. To purify the fermentation product for structural analysis, a similar extraction procedure was performed on the cell pellet from a 4 L culture. The organic extract was dried, dissolved in H 2 O and fractionated with Amberlite XAD-16 (Sigma-Aldrich) resin. Desferriferricrocin and ferricrocin were eluted with a gradient from 20% MeOH to 70% MeOH. The eluent was combined and purified by semipreparative HPLC using a reverse-phase column (Phenomenex Kinetex, C18, 5 µm, 100 Å, 250 × 4.6 mm). The identity of desferriferricrocin was confirmed by HR-MS and NMR analysis. The NMR spectral data are consistent with the literature data 51 . 
1 H-NMR (500 MHz, CD 3 OD): δ 8.36 (s,), 4.46–4.30 (m, overlap, 3H), 4.24 (d, J = 17.1 Hz, 1H, Gly C α H 2 ), 4.08 (m, overlap, 1H, Ser C α H), 4.07 (d, J = 16.0 Hz, 1H, Gly C α H 2 ), 3.86 (ddd, J = 56.5, 11.1, 5.4 Hz, 2H, Ser C β H 2 ), 3.69 (d, J = 15.7 Hz, 1H, Gly C α H 2 ), 3.62 (d, J = 17.0 Hz, 1H, Gly C α H 2 ), 3.76–3.53 (m, overlap, 6H), 2.11 (s, 3H, hydroxamic CH 3 ), 2.105 (s, 3H, hydroxamic CH 3 ), 2.102 (s, 3H, hydroxamic CH 3 ), 2.00–1.90 (m, 1H, Orn C β H 2 ), 1.90–1.85 (m, overlap, 1H, Orn C β H 2 ), 1.85–1.80 (m, overlap, 1H, Orn C β H 2 ), 1.80–1.75 (m, overlap, 1H, Orn C β H 2 ), 1.78–1.73 (m, overlap, 1H, Orn C β H 2 ), 1.75–1.70 (m, overlap, 1H, Orn C β H 2 ), 1.74–1.60 (m, overlap, 6H, Orn C β H 2 ). 13 C-NMR (125 MHz, CD 3 OD): δ 174.6 C, 174.4, 174.2, 173.8 (overlap), 173.7, 172.5, 171.90, 171.88, 62.3 (Ser C β ), 56.7 (Orn C α ), 56.2 (Ser C α ), 54.9 (Orn C α ), 54.5 (Orn C α ), 48.41 (Orn C δ ), 48.39 (Orn C δ ), 48.3 (Orn C δ ), 44.4 (Gly C α ), 43.7 (Gly C α ), 30.3 (Orn C β ), 30.2 (Orn C β ), 28.3 (Orn C β ), 24.4 (Orn C γ ), 24.3 (Orn C γ ), 24.0 (Orn C γ ), 20.31 (hydroxamic CH 3 ), 20.29 (hydroxamic CH 3 ), 20.22 (hydroxamic CH 3 ). HRMS: calc. for [M + H] + C 28 H 48 N 9 O 13 + , 718.3367; found 718.3365.
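The calculated HRMS value quoted above can be cross-checked from the ion formula. The following is a minimal sketch, not the authors' workflow: the function names are hypothetical, and the isotope masses are standard reference values rather than numbers taken from the article. It sums monoisotopic masses for the protonated formula, subtracts one electron mass per charge, and reports the difference from the measured value in ppm.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes (standard reference values).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
ELECTRON_MASS = 0.00054858


def cation_mz(formula: str, charge: int = 1) -> float:
    """Return m/z for a cation whose formula already includes the added proton(s)."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    # A positive ion has lost one electron per unit charge.
    return (mass - charge * ELECTRON_MASS) / charge


def ppm_error(found: float, calculated: float) -> float:
    """Relative deviation of a measured m/z from the calculated m/z, in ppm."""
    return (found - calculated) / calculated * 1e6


# Desferriferricrocin [M + H]+ ion formula and measured value as reported in the text.
calculated = cation_mz("C28H48N9O13")
print(f"calculated m/z: {calculated:.4f}")
print(f"deviation of found value: {ppm_error(718.3365, calculated):.2f} ppm")
```

The result should agree with the reported calculated value to within rounding of the last digit; small differences can arise from the precision of the isotope masses used.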

13

Synthesis of l -AHO amino acid substrate

The amino acid N 5 -acetyl- N 5 -hydroxy- l -ornithine ( l -AHO) was synthesised from N 2 -Cbz- N 2 -Boc- l -ornithine according to the literature 35 , 36 . 1 H-NMR (500 MHz, CD 3 OD): δ 3.65 (t, J = 6.0 Hz, 1H, C α H), 3.56 (t, J = 6.7 Hz, 2H, C δ H 2 , cis-trans isomers not resolved), 2.04 (s, 3H, acetyl CH 3 , with a small shoulder at 2.01 due to cis-trans isomerization), 1.83–1.56 (m, overlap, 4H, C β H 2 , C γ H 2 ). 13 C-NMR (125 MHz, CD 3 OD): δ 174.3 (carboxylate), 173.7 (amide carbonyl), 169.3 (amide carbonyl, minor isomer), 54.3 (C α ), 50.9 (C δ , minor isomer), 47.2 (C δ , major isomer), 27.5 (C β ), 27.3 (C β , minor isomer) 22.1 (C γ ), minor isomer), 19.7 (acetyl CH 3 , minor isomer), 19.4 (acetyl CH 3 ). HRMS: calc. for [M + H] + C 7 H 15 N 2 O 4 + , 191.1027; found 191.1097.

14

Biochemical characterisation of SidC in vitro

Purified SidC and associated variants/mutants were converted to their holo - form by incubation in 20 mM Tris-HCl, 100 mM NaCl, 2 µM NpgA, 0.1 mM CoA and 10 mM MgCl 2 in a total volume of 50 µL for 1 h at 25 °C. Reactions were initiated by addition of ATP (5 mM) and either all or various combinations of the following: l -AHO (1 mM)/ l -Ser (1 mM)/Gly (1 mM) in a final volume of 50 µL, and the reaction was allowed to proceed at 25 °C. At different time points, the reaction was quenched by mixing with an equal volume of methanol. The reaction products were analysed by UHPLC-MS on a Shimadzu 2020 EVLC–MS controlled using Shimadzu LabSolutions (Phenomenex Kinetex, 1.7 µm, 2.0 × 100 mm, C 18 column) using positive and negative mode electrospray ionisation with a linear gradient of 5−95% MeCN−H 2 O supplemented with 0.1% (v/v) formic acid in 15 min followed by 95% MeCN for 5 min with a flow rate of 0.3 mL/min. All data were analysed using Shimadzu LabSolutions.
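As an illustration of how the final concentrations above translate into pipetting volumes, the sketch below applies C1·V1 = C2·V2 to a 50 µL reaction. The final concentrations are those stated in the text; the stock concentrations and all names in the code are hypothetical assumptions, since the article does not report them.

```python
FINAL_VOLUME_UL = 50.0

# Final concentrations (mM) as stated in the text.
FINAL_MM = {"ATP": 5.0, "l-AHO": 1.0, "l-Ser": 1.0, "Gly": 1.0}

# Hypothetical stock concentrations (mM); not reported in the article.
STOCK_MM = {"ATP": 100.0, "l-AHO": 20.0, "l-Ser": 20.0, "Gly": 20.0}


def stock_volume_ul(final_mm: float, stock_mm: float, total_ul: float) -> float:
    """Volume of stock needed so that the final mixture reaches final_mm (C1*V1 = C2*V2)."""
    if stock_mm < final_mm:
        raise ValueError("stock must be at least as concentrated as the final mixture")
    return final_mm * total_ul / stock_mm


volumes = {
    name: stock_volume_ul(FINAL_MM[name], STOCK_MM[name], FINAL_VOLUME_UL)
    for name in FINAL_MM
}
# Whatever volume is left is available for the holo-protein mixture and buffer.
remaining_ul = FINAL_VOLUME_UL - sum(volumes.values())

for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} uL")
print(f"protein/buffer: {remaining_ul:.2f} uL")
```

With different stocks only the `STOCK_MM` values need to change; the check in `stock_volume_ul` guards against a stock that is too dilute to reach the target concentration.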

15

Biochemical assays to determine adenylation domain specificity

ATP-PPi exchange assays

The amino acid substrate specificity profiles for the SidC A 1 and SidC C 3 A 3 constructs were determined using ATP-PPi exchange assays. Assays were performed in 100 μL of reaction buffer (50 mM Tris-HCl pH 8, 2 mM MgCl 2 ) containing 1 mM TCEP, 5 mM ATP, 1 mM tetrasodium pyrophosphate (Na 4 PPi), 5 mM substrate, and 5 μM enzyme. Before the addition of enzyme, Na 4 [ 32 P]-PPi was added to a final intensity of ∼2.5 × 10^6 cpm/mL. Reactions were allowed to proceed for 2 h at 25 °C and then quenched by the addition of 500 μL of charcoal suspension (3.6% w/v activated charcoal, 150 mM Na 4 PPi, 5% HClO 4 ). Samples were centrifuged, and the supernatant was discarded. To remove residual free [ 32 P]PPi, the pellet was washed twice with wash solution (0.1 M Na 4 PPi, 5% HClO 4 ). The pellet was resuspended in 500 μL of water and added to scintillation fluid at a final volume of 5 mL. Radioactivity was measured using a Beckman LS 6500 scintillation counter.

16

Hydroxylamine Release Assays

The hydroxylamine-trapping assay for detecting adenylation activity was conducted for the SidC C 2 A 2 construct, and performed according to a reported protocol 38 . Briefly, the reaction was initiated by mixing 150 µL of substrate mixture [50 mM Tris (pH 8.0), 30 mM MgCl 2 , 300 mM hydroxylamine (pH 8.0), 10 mM carboxylic acid substrate] with an equal volume of enzyme mixture [100 mM Tris (pH 8.0), 20 mM ATP, 20 µM enzyme]. For some hydrophobic substrates, 2–5% (v/v) DMSO was included to help dissolve the substrate. The reaction mixture was then incubated at 30 °C for 16 h. The reaction was stopped by mixing with 300 µL of stopping solution [10% (w/v) FeCl 3 •6H 2 O and 3.3% TCA dissolved in 0.7 M HCl]. The precipitated enzymes were removed by centrifugation at 17,000 g for 5 min, and 200 µL of the supernatant was transferred to a 96-well plate and the absorbance of the ferric-hydroxamate complex at 540 nm was measured using a Tecan M200 plate reader.
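Because the substrate and enzyme mixtures are combined in equal volumes, every component is diluted in the reaction itself. The short sketch below works out the resulting concentrations from the values stated in the text; the function and variable names are illustrative only.

```python
def mix(solution_a: dict, volume_a: float, solution_b: dict, volume_b: float) -> dict:
    """Concentration of each component after combining two solutions (same units in, same units out)."""
    total = volume_a + volume_b
    names = set(solution_a) | set(solution_b)
    return {
        name: (solution_a.get(name, 0.0) * volume_a + solution_b.get(name, 0.0) * volume_b) / total
        for name in names
    }


# Concentrations in mM, as given in the text (20 uM enzyme = 0.02 mM).
substrate_mixture = {"Tris": 50.0, "MgCl2": 30.0, "hydroxylamine": 300.0, "substrate": 10.0}
enzyme_mixture = {"Tris": 100.0, "ATP": 20.0, "enzyme": 0.02}

reaction = mix(substrate_mixture, 150.0, enzyme_mixture, 150.0)
for name in sorted(reaction):
    print(f"{name}: {reaction[name]:g} mM")
```

Components present in only one mixture are halved, while Tris, present in both, ends up at the average of the two stock values.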

17

Biochemical characterisation of intra -molecular l -AHO loading by SidC A 3 domain

Purified SidC C 3 A 3 T 3 /C 3 A 3 T 3 C 4 T 4 and C 3 A 3 T 3 0 C 4 T 4 proteins were converted to their holo - form by incubation in 20 mM Tris, 100 mM NaCl, 2 µM Sfp PPtase, 1 mM CoA and 10 mM MgCl 2 in a total volume of 50 µL for 1 h at 25 °C. Loading of l -AHO was initiated by addition of ATP (5 mM) and l -AHO (1 mM) in a final volume of 50 µL, and the loading reaction was allowed to proceed for 1 h at 25 °C before intact protein analysis by UHPLC-ESI-Q-TOF-MS.

18

Biochemical characterisation of inter -molecular l -AHO loading by SidC A 3 domain

Purified SidC C 3 A 3 T 3 /C 4 T 4 /T 4 /C 5 T 5 C T and T 5 C T proteins were converted to their holo - form by incubation in 20 mM Tris, 100 mM NaCl, 2 µM Sfp PPtase, 1 mM CoA and 10 mM MgCl 2 in a total volume of 50 µL for 1 h at 25 °C. Loading of l -AHO was initiated by the addition of ATP (5 mM) and l -AHO (1 mM) to a solution of holo -C 3 A 3 T 3 (100 μM) and one of holo -C 4 T 4 /T 4 /C 5 T 5 C T and T 5 C T (100 μM). The loading reaction was allowed to proceed for 1 h at 25 °C before intact protein analysis by UHPLC-ESI-Q-TOF-MS.

19

Biochemical characterisation of intra -molecular Gly loading by SidC C 2 A 2 T 2

Purified SidC C 2 A 2 T 2 was converted to its holo - form by incubation in 20 mM Tris, 100 mM NaCl, 2 µM Sfp PPtase, 1 mM CoA and 10 mM MgCl 2 in a total volume of 50 µL for 1 h at 25 °C. To a solution of SidC C 2 A 2 T 2 (100 μM), loading of Gly was initiated by the addition of ATP (5 mM, or limited to a 2:1 or 4:1 ratio with protein) and Gly (1 mM, or limited to a 2:1 or 4:1 ratio with protein) in a total volume of 50 µL at 25 °C. Loading reactions were allowed to proceed for various time intervals before intact protein analysis by UHPLC-ESI-Q-TOF-MS.

20

Biochemical characterisation of inter -molecular condensation reaction between SidC C 2 A 2 T 2 -Gly 3 and SidC C 3 A 3 T 3 - l -AHO

Purified SidC C 2 A 2 T 2 and SidC C 3 A 3 T 3 proteins were converted to their holo - form by incubation in 20 mM Tris, 100 mM NaCl, 2 µM Sfp PPtase, 1 mM CoA and 10 mM MgCl 2 in a total volume of 50 µL for 1 h at 25 °C. Reactions were initiated by addition of ATP (5 mM), l -AHO (1 mM), Gly (1 mM) to a solution containing holo -C 2 A 2 T 2 (50 μM) and holo -C 3 A 3 T 3 (100 μM) in a total volume of 50 µL at 25 °C. Reactions were allowed to proceed for either 10 min or 60 min before intact protein analysis by UHPLC-ESI-Q-TOF-MS. A variation of this reaction was conducted which allowed Gly 3 and Gly 5 to be formed in situ on SidC C 2 A 2 T 2 (following procedure for intra -molecular Gly loading by SidC C 2 A 2 T 2 ) before its addition (at a final concentration of 50 μM) to a solution containing holo -C 3 A 3 T 3 (100 μM), ATP (5 mM), l -AHO (1 mM).

21

UHPLC-ESI-Q-TOF-MS analysis of intact proteins

Biochemical assays were analysed on a Bruker MaXis II ESI-Q-TOF-MS connected to a Dionex 3000 RS UHPLC fitted with an ACE C 4 −300 RP column (100 × 2.1 mm, 5 μm, 30 °C), controlled using Bruker Otof control 4.0. The column was eluted with a linear gradient of 5–100% MeCN containing 0.1% formic acid over 30 min. The mass spectrometer was operated in positive ion mode with a scan range of 200–3000 m/z . Source conditions were: end plate offset at −500 V; capillary at −4500 V; nebuliser gas (N 2 ) at 1.8 bar; dry gas (N 2 ) at 9.0 L min −1 ; dry temperature at 200 °C. Ion transfer conditions were: ion funnel RF at 400 Vpp; multipole RF at 200 Vpp; quadrupole low mass at 200 m/z ; collision RF at 2000 Vpp; transfer time at 110.0 µs; pre-pulse storage time at 10.0 µs. All spectra were analysed using Bruker DataAnalysis 4.4. Measured masses for all species are displayed in Tables S2 and S3 .
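When interpreting intact protein spectra of this kind, it helps to know the mass shift expected for each covalent modification of a T domain. The sketch below computes average-mass shifts from elemental formulae; these are general chemical values and an assumption of this example, not figures taken from the article's supplementary tables, and all function names are hypothetical. Thioester loading of an amino acid adds its mass minus one water, as does each further amide bond.

```python
import re

# Average atomic masses (u), standard values.
AVERAGE = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "P": 30.974, "S": 32.06}
PROTON = 1.00728


def average_mass(formula: str) -> float:
    """Average mass of a simple elemental formula such as 'C2H3NO'."""
    return sum(
        AVERAGE[element] * (int(count) if count else 1)
        for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    )


PPANT = average_mass("C11H21N2O6PS")  # phosphopantetheine arm added on apo-to-holo conversion
GLY_RESIDUE = average_mass("C2H3NO")  # glycine minus water
AHO_RESIDUE = average_mass("C7H12N2O3")  # l-AHO (C7H14N2O4) minus water


def expected_shift(n_gly: int = 0, n_aho: int = 0, holo: bool = True) -> float:
    """Mass added to an apo T-domain construct for a given loading state."""
    return (PPANT if holo else 0.0) + n_gly * GLY_RESIDUE + n_aho * AHO_RESIDUE


def mz(neutral_mass: float, charge: int) -> float:
    """m/z of the [M + zH]z+ ion for a neutral mass and charge state."""
    return (neutral_mass + charge * PROTON) / charge


print(f"holo: +{expected_shift():.2f} Da")
print(f"holo + Gly3: +{expected_shift(n_gly=3):.2f} Da")
print(f"holo + l-AHO: +{expected_shift(n_aho=1):.2f} Da")
```

Note that a Gly 3 chain and a single l -AHO unit differ by only about 1 Da in this calculation, so distinguishing them on a large construct depends on the mass accuracy of the deconvoluted spectrum.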

22

AlphaFold modelling of SidC fragments

Structural models of the A 3 T 3 C 4 T 4 and C 4 T 4 C 5 T 5 regions of SidC were constructed using AlphaFold 39 . The full amino acid sequences of the excised tri-/tetra-domain regions were submitted to the AlphaFold Colab notebook (v1.5.2) 52 , which uses a slightly simplified version of AlphaFold v2.3.1, with the run_relax parameter enabled. Structures were assessed for their reliability via inspection of PAE and pLDDT plots. The resulting structures were then aligned to each other via the C 4 domain present in both structures using PyMOL v1.3, yielding the final model of SidC A 3 T 3 C 4 T 4 C 5 T 5 . Structure co-ordinate files for the SidC A 3 T 3 C 4 T 4 C 5 T 5 region and associated fragments used to assemble the model are available for download from Mendeley Data 10.17632/c3ymyp3yx4.1.
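For readers wishing to inspect model reliability themselves, AlphaFold writes the per-residue pLDDT score into the B-factor column of its PDB output. The following is a minimal sketch of extracting and summarising those scores; it is not the authors' procedure, the helper names are hypothetical, the example records are synthetic rather than taken from the deposited models, and it assumes a single-chain file.

```python
from statistics import mean


def make_ca_record(serial: int, res_name: str, res_seq: int, plddt: float) -> str:
    """Build a fixed-width PDB ATOM record for a CA atom (synthetic example data only)."""
    return (
        "ATOM  {:5d}  CA  {:3s} A{:4d}    {:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}          {:>2s}"
        .format(serial, res_name, res_seq, 0.0, 0.0, 0.0, 1.0, plddt, "C")
    )


def per_residue_plddt(pdb_text: str) -> dict:
    """Map residue number to pLDDT, read from the B-factor field (columns 61-66) of CA atoms."""
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def summarise(scores: dict, confident: float = 70.0) -> dict:
    """Mean pLDDT and the fraction of residues at or above a confidence cutoff."""
    values = list(scores.values())
    return {
        "mean": mean(values),
        "fraction_confident": sum(v >= confident for v in values) / len(values),
    }


# Synthetic four-residue example; replace with the text of a real AlphaFold PDB file.
example_pdb = "\n".join(
    make_ca_record(i, "GLY", i, score)
    for i, score in enumerate([92.5, 88.0, 65.0, 45.5], start=1)
)
scores = per_residue_plddt(example_pdb)
summary = summarise(scores)
print(summary)
```

The default cutoff of 70 follows the commonly used convention that pLDDT values of 70 or above indicate a confidently modelled backbone; linker regions between domains often fall below it.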

23

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

24

Supplementary information

The online version contains supplementary material available at 10.1038/s41467-023-38484-8.

26

Peer review information

Nature Communications thanks Marc Lensink, and the other, anonymous, reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.

Article Details
DOI: 10.1038/s41467-023-38484-8
PubMed ID: 37198174
PMC ID: PMC10192304
Journal: Nature Communications
Year: 2023
Authors: Matthew Jenner, Yang Hai, Hong Hanh Nguyen, Munro Passmore, Will Skyrud, Junyong Kim, Neil K. Garg, Wenjun Zhang, Rachel R. Ogorzalek Loo, Yi Tang
License: Open Access (see publisher for license terms)
Citations: 19