Abstract
Protein dynamics analysis is central to elucidating biological functions. Owing to its unique combination of temporal resolution and structural sensitivity, infrared spectroscopy has emerged as a pivotal tool for investigating protein dynamical processes. This review systematically summarizes the fundamental principles, cutting-edge technologies, and practical applications of infrared spectroscopy in the analysis of protein dynamic structures. The temporal resolution capability of infrared spectroscopy has achieved full temporal scale coverage, ranging from the femtosecond to the millisecond level. This capability enables the capture of complete dynamic processes of proteins, spanning from ultrafast relaxation to conformational rearrangement. Two-dimensional infrared spectroscopy further improves spectral resolution and enhances the analytical capacity for complex protein systems. Infrared spectroscopy has been successfully applied to multiple research fields, including protein folding, ligand binding, and membrane protein dynamics. Notably, it has yielded significant progress in the investigation of critical biological processes such as the folding mechanism of amyloid fibrils, ligand binding of heme proteins, and proton transfer of membrane proteins. In the future, with the development of novel infrared probes and the integration of artificial intelligence technologies, infrared spectroscopy will exhibit greater application potential in the field of protein dynamics analysis.
0 Introduction
The biophysical picture of proteins has progressed from a single static structure to an ensemble of interconverting conformations described by a hierarchical energy landscape [1]. Protein motions range from local chemical bond rotations to global subunit movements, with timescales spanning from femtoseconds to seconds and beyond. They include hydrogen-bond fluctuations and solvent rearrangements at the femtosecond-picosecond scale, amino acid side-chain motions on the scale of tens of picoseconds, secondary structure transitions and diffusion at the nanosecond-millisecond scale, intermolecular interactions, large-scale conformational changes, and domain rearrangements at the millisecond-second scale, and intermolecular aggregation and cascade reactions at the minute-hour scale [2]. Different structural regions of a protein contribute distinctively to its function. Large-scale backbone motions play a central role in protein misfolding and allosteric regulation, whereas side chains, as primary sites of evolutionary pressure, finely tune the biophysical properties of proteins through subtle chemical differences [3].
Infrared spectroscopy and nuclear magnetic resonance (NMR) spectroscopy are two common methods for molecular structure analysis. Infrared spectroscopy probes molecular structure through the vibrational frequencies of chemical bonds, whereas NMR spectroscopy detects the resonance frequencies of nuclear spins. The time resolution of infrared spectroscopy is about six orders of magnitude finer than that of NMR spectroscopy, which is an advantage in the study of rapid dynamic changes [4]. In a study of channelrhodopsin, time-resolved infrared spectroscopy captured the transient signal of the C-C stretching mode during the photoisomerization of retinal on the picosecond timescale, revealing its initial reaction kinetics. This chemical process is far faster than the second-level time resolution limit of NMR [5]. In addition, although NMR spectroscopy can resolve protein dynamics at atomic resolution, its sensitivity and resolution are low for proteins of large molecular weight [6]. X-ray diffraction requires high-quality protein crystals and cannot characterize proteins that do not crystallize [7]. Infrared spectroscopy can provide secondary structure information for proteins in solution and places no particular restriction on molecular weight, which makes it especially suitable for dynamic studies of amorphous and high-molecular-weight proteins. For example, the aggregation of β-amyloid protein is closely related to Alzheimer's disease, but this system does not readily form single crystals, which limits the use of X-ray diffraction. Attenuated total reflectance infrared spectroscopy can follow the evolution of the characteristic β-sheet peaks and the formation of oligomeric intermediates in protein solutions, enabling dynamic analysis of such systems [8].
Although infrared spectroscopy has many advantages, it also has clear limitations. The strong absorption of water (H2O) in the mid-infrared region is a major challenge for infrared studies of protein structure. The H-O-H bending vibration of H2O absorbs strongly at 1700-1600 cm-1 and overlaps severely with the amide I band of the protein, interfering with the resolution of secondary structure. Because the corresponding bending vibration of heavy water (D2O) shifts to about 1200 cm-1, replacing H2O with D2O avoids the band overlap, but the potential effect of hydrogen-deuterium exchange on protein conformation must be fully considered [9]. In addition, water absorption saturates once the optical path exceeds about 10 μm, and sometimes at even shorter paths. Attenuated total reflection and microfluidic chip technologies shorten the optical path of infrared light in water and thereby avoid saturation [10-11].
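The path-length limit can be illustrated with the Beer-Lambert law, A = εcl. The following is a minimal sketch that assumes a molar absorptivity of roughly 22 M-1 cm-1 for the H2O bending band and a concentration of about 55 mol/L for pure water; both numbers are approximate, illustrative assumptions rather than values taken from the cited works, and the function names are hypothetical.

```python
# Beer-Lambert estimate of the absorbance of liquid water in the amide I region.
# EPSILON_H2O and C_H2O are approximate, illustrative assumptions.
EPSILON_H2O = 22.0   # M^-1 cm^-1, assumed absorptivity of the H2O bending band
C_H2O = 55.0         # mol/L, approximate concentration of pure water


def water_absorbance(path_um: float) -> float:
    """Return A = epsilon * c * l for an optical path given in micrometres."""
    path_cm = path_um * 1e-4
    return EPSILON_H2O * C_H2O * path_cm


def transmitted_fraction(path_um: float) -> float:
    """Fraction of light transmitted through the water layer, T = 10**(-A)."""
    return 10.0 ** (-water_absorbance(path_um))


if __name__ == "__main__":
    for path in (1, 5, 10, 25, 50):
        a = water_absorbance(path)
        print(f"{path:>3} um  A = {a:.2f}  T = {transmitted_fraction(path):.3%}")
```

Under these assumptions the water absorbance already exceeds 1 at a 10 μm path, which is why transmission cells for aqueous protein samples are kept to a few micrometres.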
Infrared spectroscopy offers time resolution from femtoseconds to milliseconds, covering the timescales of most protein structural motions. Time-resolved infrared spectroscopy has developed rapidly and is widely used in studies of protein folding, ligand binding, enzyme catalysis, and membrane protein dynamics. This article reviews the basic principles of infrared spectroscopy, time-resolved infrared spectroscopy techniques, and their applications in protein dynamics research.
1 Protein structure detection principle based on infrared spectroscopy
Infrared absorption is a physical process in which molecules undergo transitions between vibrational or rotational energy levels under infrared excitation. When infrared light irradiates a molecule, its energy can be selectively absorbed by specific chemical bonds or functional groups, triggering vibrational transitions. Molecular vibrations are closely related to chemical structure, which makes infrared spectroscopy highly sensitive to changes in the protonation state, redox state, bond order, and bond conformation of molecules. Characteristic parameters such as vibrational frequency, absorption intensity, and band width are affected by intramolecular and intermolecular non-covalent interactions, such as hydrogen bonding, dipole-dipole interactions, and local electric field effects [12-13]. Protein molecules have multiple vibrational absorption bands, which arise mainly from the backbone peptide bond, the side chains, and exogenous unnatural amino acid probes (see Table 1).
Note: ν represents stretching vibration, νas represents antisymmetric stretching vibration, and δ represents in-plane bending vibration.
The amide I band (1700-1600 cm-1) of the backbone peptide bond arises mainly from the C=O stretching vibration, with minor contributions from C-N stretching and N-H bending. It is extremely sensitive to small changes in molecular geometry and hydrogen-bonding pattern and is often used to analyze the secondary structure of proteins [14]. The secondary structure of a protein is the local regular (or irregular) spatial conformation formed by the backbone atoms of the peptide chain through intramolecular hydrogen bonds. Common secondary structures include α-helix, β-sheet, and random coil. Differences in hydrogen-bonding pattern and spatial conformation shift the amide I band to different extents: α-helix usually absorbs at 1655-1650 cm-1, β-sheet at 1640-1620 cm-1 and 1690-1670 cm-1, and random coil at 1650-1640 cm-1. The amide II band (1580-1520 cm-1) originates mainly from N-H bending and C-N stretching vibrations and is sensitive to hydrogen bonding and changes in backbone structure, but its signal is weaker than that of the amide I band. On deuteration, the amide II band of a protein red-shifts to 1480-1450 cm-1, a property that can be used to study the stability of protein secondary structure and folding dynamics [15]. The amide A band (3350-3200 cm-1) corresponds to the N-H stretching vibration, whose frequency is highly sensitive to hydrogen bonding.
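As a small illustration of how the amide I ranges quoted above are used, the sketch below maps a peak position to the candidate secondary structures. The ranges are the ones given in the text; the function and constant names are hypothetical.

```python
# Lookup of candidate secondary structures for an amide I peak position,
# using the band ranges quoted in the text (all values in cm^-1).
AMIDE_I_RANGES = [
    ("alpha-helix", 1650.0, 1655.0),
    ("beta-sheet (low-frequency band)", 1620.0, 1640.0),
    ("beta-sheet (high-frequency band)", 1670.0, 1690.0),
    ("random coil", 1640.0, 1650.0),
]


def assign_amide_i(peak_cm1: float) -> list[str]:
    """Return every structure whose quoted range contains the peak position."""
    return [name for name, low, high in AMIDE_I_RANGES if low <= peak_cm1 <= high]


if __name__ == "__main__":
    for peak in (1630.0, 1652.0, 1680.0, 1700.0):
        print(peak, assign_amide_i(peak))
```

The function returns a list because the quoted ranges touch at their boundaries; a position-only lookup of this kind is a first guess, not a substitute for band fitting.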
Side chain groups can also provide abundant structural information. Tyrosine exhibits characteristic peaks at 1270-1235 cm-1, and tryptophan absorbs at 1435-1412 cm-1. The S-H stretching vibration of cysteine occurs at 2600-2500 cm-1; it is sensitive to the hydrogen-bonding environment and can serve as a spectroscopic probe for conformational changes [16-18]. The vibrational modes of carboxylate (COO-) and amino groups directly reflect their protonation states, which is crucial for understanding proton transfer mechanisms in enzyme catalysis, signal transduction, and other processes [19]. The characteristic absorption peaks of natural amino acid side chains are usually located in the amide region or the fingerprint region, which may lead to severe spectral overlap and complicate analysis. To overcome this limitation, groups such as cyano (-C≡N, 2280-2210 cm-1), azido (-N=N=N, 2140-2080 cm-1), and thiocyanato (-S-C≡N, 2160-2040 cm-1) can be introduced via genetic code expansion or chemical modification. The vibrational frequencies of these unnatural amino acids typically lie in the range of 2800-1800 cm-1, where they avoid interference from other absorption peaks, and they can act as sensitive probes for monitoring protein folding, local electrostatic fields, and solvation dynamics [20]. Binding of ligands such as inhibitors and cofactors induces local conformational rearrangements in proteins. Vibrational coupling between ligand signals and protein functional groups can therefore also be used as a spectroscopic probe of structural changes inside proteins.
2 Time-resolved infrared spectroscopy
Time-resolved infrared spectroscopy is a core tool for studying protein dynamics. Its key premise is that the reaction system is synchronized by an external trigger such as a light pulse, a pH change, or a temperature jump. Time-resolved infrared spectroscopy now covers the full range of timescales from femtoseconds to milliseconds, enabling researchers to capture the complete dynamic processes of protein vibrational relaxation (femtosecond-picosecond) and conformational rearrangement (microsecond-millisecond). The rapid-scan and step-scan techniques of Fourier transform infrared (FTIR) spectrometers capture protein structural changes on the nanosecond to millisecond timescale. These techniques are often combined with difference infrared spectroscopy to monitor changes in specific vibrational frequencies of proteins. Difference spectroscopy measures the absorption difference between two states of a protein and thereby extracts the changes in specific bands from complex, overlapping infrared absorption spectra [19,21-22]. Two-dimensional infrared spectroscopy based on pump-probe technology extends the study of protein dynamics to the femtosecond scale.
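The difference spectrum between two states can be computed directly from the single-beam intensities as ΔA = -log10(I_perturbed / I_reference). A minimal sketch follows; the function name and the example intensities are hypothetical.

```python
import math


def difference_absorbance(i_reference, i_perturbed):
    """Return dA = -log10(I_perturbed / I_reference) for each spectral point.

    i_reference: single-beam intensities of the resting state.
    i_perturbed: single-beam intensities after the trigger, on the same grid.
    """
    if len(i_reference) != len(i_perturbed):
        raise ValueError("spectra must have the same number of points")
    return [-math.log10(p / r) for r, p in zip(i_reference, i_perturbed)]


if __name__ == "__main__":
    resting = [1.00, 0.80, 0.50]
    triggered = [1.00, 0.72, 0.55]
    # Positive values mark bands that grow after the trigger, negative values
    # mark bands of the resting state that disappear.
    print(difference_absorbance(resting, triggered))
```

Bands that are identical in the two states cancel, so only the groups that change during the reaction remain in the difference spectrum.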
3 Triggering methods and time-resolved infrared techniques
3.1 Triggering Method
Synchronous triggering is a key prerequisite for time-resolved measurements. Common triggering methods include light pulse excitation, rapid mixing, and temperature jump. Light pulse excitation is suited to light-driven proton pump proteins such as bacteriorhodopsin (BR) [23-24] and to photosynthetic reaction centers containing chromophores [25-26]. In these biological systems, pulsed lasers of specific wavelengths directly activate the chromophores, triggering isomerization or redox reactions of cofactors and thereby initiating a series of subsequent reactions. Pulsed lasers can also photolyze caged compounds that contain bioactive molecules, rapidly releasing the active molecules and triggering subsequent biochemical processes [27-29].
Stopped-flow and continuous-flow technologies rapidly mix protein solutions with ligand solutions, denaturants, and other reagents to trigger changes in protein structure. A stopped-flow device injects two or more liquids into the observation chamber simultaneously; once the chamber is full, the stop valve closes instantly to halt the flow, and the infrared spectrometer immediately begins recording the spectral changes. Stopped-flow technology is limited by the dead time of the device and is usually suitable only for millisecond-level protein dynamics [30-31]. Continuous-flow technology overcomes this limitation with microfluidic chips and can achieve microsecond-level mixing. In a continuous-flow microfluidic chip, the two solutions to be mixed form a laminar flow. Because the layers are thin, reactant molecules diffuse quickly from one solution into the other, which gives microsecond-level mixing [32].
Temperature jump is a technique that drives a molecular system away from equilibrium by instantaneously heating the solvent with a laser pulse (usually by exciting an overtone vibration of water). Although traditional nanosecond pulsed-laser heating is very fast, thermal diffusion limits its observation window to less than milliseconds [33]. Intensity-tunable continuous-wave laser heating offsets heat dissipation through precise waveform control and allows long-term isothermal observation, so it can be used to capture biological processes such as large-scale protein rearrangements in the millisecond to second range [34].
3.2 Rapid scanning infrared spectroscopy technique
Rapid-scan infrared spectroscopy is suitable for monitoring dynamic processes on the timescale of milliseconds to seconds, and is particularly suitable for irreversible processes or changes that occur only once. In rapid-scan mode, the moving mirror is driven at high speed. Each time the moving mirror completes a stroke, the detector records a complete interferogram. Interferograms are recorded rapidly and continuously, and Fourier transformation of each one yields a series of infrared spectra as a function of time. The time resolution of rapid-scan mode depends on the scanning speed of the moving mirror and the spectral resolution. Reducing the spectral resolution shortens the acquisition time of a single spectrum. Time-resolved rapid-scan FTIR was first used in protein research to obtain the difference infrared spectrum of a photocycle intermediate of bacteriorhodopsin at room temperature [35].
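The trade-off between spectral resolution and acquisition time can be made concrete with a rough estimate. The sketch below assumes an idealized Michelson interferometer in which the nominal resolution is the reciprocal of the maximum optical path difference, the optical path difference is twice the mirror displacement, a single-sided interferogram is recorded, and mirror turnaround time is ignored. The mirror velocity is a hypothetical value, not an instrument specification.

```python
def scan_time_s(resolution_cm1: float, mirror_velocity_cm_s: float) -> float:
    """Estimate the time for one single-sided interferogram scan.

    Assumes resolution ~ 1 / (maximum optical path difference) and that the
    optical path difference is twice the mirror displacement.
    """
    opd_max_cm = 1.0 / resolution_cm1
    mirror_travel_cm = opd_max_cm / 2.0
    return mirror_travel_cm / mirror_velocity_cm_s


if __name__ == "__main__":
    velocity = 5.0  # cm/s, hypothetical mirror velocity
    for resolution in (2.0, 4.0, 8.0, 16.0):
        ms = scan_time_s(resolution, velocity) * 1e3
        print(f"{resolution:>4.0f} cm^-1  ->  {ms:.2f} ms per scan")
```

Within this simplified model, halving the spectral resolution (for example from 4 cm-1 to 8 cm-1) halves the mirror travel and therefore the time per spectrum.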
3.3 Step-scan infrared spectroscopy technique
Step-scan infrared spectroscopy is suitable for studying protein dynamics on timescales from nanoseconds to milliseconds. Rapid-scan mode is limited by the speed of the moving mirror and typically resolves only millisecond-level changes. Changing the operating mode of the interferometer improves the temporal resolution to the nanosecond scale. In step-scan mode, the moving mirror does not move continuously but stops at a specific optical path difference. At this position, an external perturbation triggers the protein response, and the detector records the change in light intensity over time. After one recording is complete, the moving mirror steps to the next position, and the process is repeated until all positions required for the full interferogram have been covered.
The acquired raw data consists of signal attenuation curves over time at each specific optical path difference position. To obtain a complete spectrum, data reconstruction is required. All intensity data points recorded at the same time are extracted from the obtained three-dimensional data matrix, and then the data points at different positions of the moving mirror at that time are connected to form a complete interferogram for that moment. Performing a Fourier transform on the reconstructed interferogram yields the infrared spectrum for that moment; repeating this operation for all time points produces a series of infrared spectra that vary over time.
The temporal resolution of step-scan measurements is therefore determined by the response speed of the detector and the bandwidth of the data acquisition card. With a liquid-nitrogen-cooled photovoltaic mercury cadmium telluride detector, the temporal resolution can reach the nanosecond level. However, because the trigger must be repeated at every mirror position, the reaction under study must be highly reproducible and reversible.
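The reconstruction of step-scan data described above amounts to reading the data matrix column by column. The following sketch uses NumPy and a synthetic data set (a single band with a hypothetical 2 μs decay); it omits apodization and phase correction, which real instruments apply, and all names and numbers are illustrative assumptions.

```python
import numpy as np


def reconstruct_spectra(data, opd_step_cm):
    """Turn step-scan data into time-resolved spectra.

    data[i, k] is the detector signal at mirror position i and time point k.
    Each column is the interferogram for one time point; its Fourier
    transform is the spectrum at that time.
    """
    data = np.asarray(data, dtype=float)
    n_positions = data.shape[0]
    wavenumbers = np.fft.rfftfreq(n_positions, d=opd_step_cm)  # cm^-1
    spectra = np.abs(np.fft.rfft(data, axis=0))  # shape: (wavenumber, time)
    return wavenumbers, spectra


# Synthetic demonstration: one band whose intensity decays after the trigger.
n_positions, opd_step_cm = 512, 1.0 / 4000.0
band_cm1 = 212 * 4000.0 / 512            # 1656.25 cm^-1, chosen to sit on an FFT bin
opd = np.arange(n_positions) * opd_step_cm
times_us = np.array([0.0, 1.0, 2.0, 5.0])
decay = np.exp(-times_us / 2.0)          # hypothetical 2 us relaxation
data = np.cos(2 * np.pi * band_cm1 * opd)[:, None] * decay[None, :]

wavenumbers, spectra = reconstruct_spectra(data, opd_step_cm)
peak_positions = wavenumbers[np.argmax(spectra, axis=0)]
print(peak_positions)                    # band position recovered at every time point
print(spectra[212] / spectra[212, 0])    # band intensity relative to t = 0
```

Each row of `data` corresponds to one mirror stop and each column to one time point after the trigger, so the time axis is never touched by the Fourier transform.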
3.4 Pump-probe based two-dimensional infrared spectroscopy
Two-dimensional infrared spectroscopy combines the advantages of time resolution and frequency resolution. It uses multiple ultrafast infrared laser pulses to simultaneously detect molecular responses in two frequency dimensions, thereby revealing the coupling and energy transfer between different vibrational modes [36]. With the development of ultrafast infrared laser technology, pump-probe-based two-dimensional infrared spectroscopy has become a key means of analyzing the dynamic structure of proteins. This technology can effectively capture protein conformational changes and vibrational coupling information spanning from picoseconds to milliseconds.
The technique is based on third-order nonlinear optics. The pump light selectively excites specific molecular vibrations in the sample, promoting them from the ground state to the excited state; the probe light records the response of the excited molecules after different delay times, which reports on the coupling of vibrational modes, energy transfer, and the interaction between the molecule and its surroundings. As shown in Figure 1, two independent but coherent pump pulses act on the sample in sequence: the first pump pulse excites specific vibrations of the molecule, and the second pump pulse arrives after a delay time t1 and further modulates the vibration and coupling of the molecule. After a delay time t2 (the interval between the second pump pulse and the probe pulse), the probe pulse arrives at the sample and records the response of the excited molecules. t2 can be precisely controlled by a mechanical delay line. Fourier transformation then yields a two-dimensional infrared spectrum with the excitation frequency and the probe frequency as its two axes [4]. Over the past two decades, the instruments and methodologies of two-dimensional infrared spectroscopy have made significant progress, providing strong technical support for the study of different scientific problems [37-38].
Fig. 1 Basic principle diagram of two-dimensional infrared spectroscopy technology
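How the excitation axis arises from the t1 scan can be sketched with a toy signal. The example below uses NumPy and assumes two uncoupled modes at 1620 and 1680 cm-1, a 1 ps dephasing time, and 4 fs steps in t1; all of these are illustrative assumptions, and the probe axis is simply taken as already frequency-resolved.

```python
import numpy as np

C_CM_PER_PS = 2.99792458e-2  # speed of light in cm/ps


def excitation_axis_and_spectrum(signal_t1, dt_ps):
    """Fourier transform a (t1, probe) signal along t1.

    Returns the excitation axis in cm^-1 and the magnitude 2D spectrum with
    shape (excitation frequency, probe frequency).
    """
    n_t1 = signal_t1.shape[0]
    excitation_cm1 = np.fft.rfftfreq(n_t1, d=dt_ps) / C_CM_PER_PS
    spectrum = np.abs(np.fft.rfft(signal_t1, axis=0))
    return excitation_cm1, spectrum


# Toy signal: two uncoupled modes, so only diagonal peaks appear.
modes_cm1 = np.array([1620.0, 1680.0])
t1_ps = np.arange(1024) * 0.004                 # 4 fs steps
probe_cm1 = np.linspace(1580.0, 1720.0, 141)    # 1 cm^-1 grid
signal = np.zeros((t1_ps.size, probe_cm1.size))
for mode in modes_cm1:
    oscillation = np.cos(2 * np.pi * C_CM_PER_PS * mode * t1_ps) * np.exp(-t1_ps / 1.0)
    lineshape = 1.0 / (1.0 + ((probe_cm1 - mode) / 10.0) ** 2)
    signal += oscillation[:, None] * lineshape[None, :]

excitation_cm1, spectrum_2d = excitation_axis_and_spectrum(signal, 0.004)
i_exc, i_probe = np.unravel_index(np.argmax(spectrum_2d), spectrum_2d.shape)
print(excitation_cm1[i_exc], probe_cm1[i_probe])  # strongest peak lies near the diagonal
```

Cross peaks, excited-state absorption, and phase-sensitive detection are deliberately left out; the sketch only shows that scanning t1 and Fourier transforming along it produces the excitation-frequency axis.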
The diagonal peaks in a two-dimensional infrared spectrum arise from the individual vibrational modes of the molecule and correspond to the absorption peaks in the one-dimensional infrared spectrum. The cross peaks are unique to the two-dimensional spectrum, and their positions and intensities quantitatively reflect the coupling strength between different vibrational modes. Protein vibrational modes with similar frequencies usually overlap strongly in the one-dimensional spectrum and are difficult to distinguish. Two-dimensional infrared spectroscopy separates overlapping absorption bands and improves the spectral resolution [3]. Combined with the delay-time-dependent decay of peak intensity, it can distinguish homogeneous from inhomogeneous broadening, which can be used to analyze the microenvironment of the protein and the dynamics of different conformational transitions. The femtosecond to picosecond time resolution of the pump-probe technique allows two-dimensional infrared spectroscopy to track the dynamic evolution of molecules in real time, providing an intuitive way to characterize the dynamics underlying biomolecular function.
4 Applications in protein dynamics research
4.1 Dynamic Study of Protein Folding and Aggregation
Two-dimensional infrared spectroscopy offers femtosecond to millisecond-level temporal resolution. Utilizing the conformational dependence of amide I band vibrational signals and isotope-specific labeling techniques, it can effectively separate vibrational signals of target amino acid residues, enabling residue-level structural dynamic tracking during protein folding and aggregation. This technology has become a core tool for elucidating the misfolding mechanisms of amyloidosis-related proteins.
In studies of amyloid aggregation, the aggregation mechanism of human islet amyloid polypeptide (hIAPP), which is related to the pathogenesis of type 2 diabetes, has long been a topic of intense interest. The structural features of transient intermediates in the aggregation process are difficult to capture with static techniques. Zanni's group labeled a single amino acid residue with 13C=18O and combined the label with two-dimensional infrared spectroscopy to separate the signal of the target residue from the overall backbone signal, captured the structure of transient intermediates in the aggregation process, and analyzed the structural differences between the lag phase and the mature fibril stage [39]. As shown in Figure 2, in the lag phase the label peak of F23 appears at 1587 cm-1, corresponding to the characteristic signal of parallel β-sheet, whereas in the mature fibril stage this peak weakens and broadens, suggesting that the region containing the F23 residue has become disordered. This finding directly confirms that the FGAIL region adopts different structures in the intermediate and final states, providing key information on the molecular mechanism of protein misfolding and aggregation.
Polyglutamine (polyQ) segments occur in a variety of proteins, and abnormal expansion of their repeat number is associated with several neurodegenerative diseases (such as Huntington's disease). Because the sequence is uniform and tends to aggregate into amyloid fibrils, studying the structure and dynamics of polyQ peptides is very difficult. When polyQ peptides form amyloid fibrils, there are two main models for the arrangement of their monomers: the β-arc model and the β-turn model. Since the vibrational modes are highly delocalized within the fibrils, conventional infrared spectroscopy cannot distinguish between these two structures. Researchers mixed 12C and 13C-labeled polyQ peptides (sequence K2Q24K2W) and co-assembled them into fibrils. The 13C labeling shifted the amide I frequency and localized the vibrational modes on individual monomers [40]. The spectroscopic results, combined with molecular dynamics simulation, confirmed that the dominant structure of polyQ amyloid fibrils is the β-turn model and revealed a highly cooperative aggregation mechanism.
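The direction and rough size of the isotope shifts used in these labeling experiments can be estimated from the reduced mass of the carbonyl group. The sketch below treats the amide I mode as an isolated C=O harmonic oscillator with an unchanged force constant, which is a deliberate simplification: the real amide I mode also contains C-N and N-H character, so measured shifts are smaller than this estimate. The unlabeled frequency of 1650 cm-1 is an illustrative assumption.

```python
import math

# Isotopic masses in atomic mass units (rounded).
MASS = {"12C": 12.000, "13C": 13.003, "16O": 15.995, "18O": 17.999}


def reduced_mass(m1: float, m2: float) -> float:
    """Reduced mass of a two-body oscillator."""
    return m1 * m2 / (m1 + m2)


def shifted_frequency(freq_cm1: float, carbon: str, oxygen: str) -> float:
    """Scale a 12C=16O frequency to another isotopologue.

    Uses nu proportional to 1/sqrt(reduced mass), assuming the force
    constant is unchanged by isotopic substitution.
    """
    mu_ref = reduced_mass(MASS["12C"], MASS["16O"])
    mu_new = reduced_mass(MASS[carbon], MASS[oxygen])
    return freq_cm1 * math.sqrt(mu_ref / mu_new)


if __name__ == "__main__":
    unlabeled = 1650.0  # cm^-1, illustrative amide I frequency
    for carbon, oxygen in (("13C", "16O"), ("13C", "18O")):
        nu = shifted_frequency(unlabeled, carbon, oxygen)
        print(f"{carbon}={oxygen}: {nu:.0f} cm^-1 (shift {unlabeled - nu:.0f} cm^-1)")
```

In this diatomic approximation the 13C=16O label red-shifts the band by roughly 35-40 cm-1 and the 13C=18O label by roughly 75-80 cm-1, enough to move the labeled residue clear of the unlabeled amide I envelope.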
Fig. 2 Structure and two-dimensional infrared spectral data of isotopically labeled hIAPP [39]: (a) Sequence structure of hIAPP fibers; (b) NMR model of hIAPP, where β-sheet structures are marked with black arrows and isotopic labeling positions are indicated by small dots; (c)-(f) Two-dimensional infrared spectra and diagonal intensity slices of V17 and F23 components in the lag and equilibrium phases
4.2 Study on ligand binding and enzyme catalytic mechanism
Ligand binding and enzyme catalysis are crucial components of core protein biological functions. Their dynamic processes span timescales from femtoseconds to seconds, encompassing a series of molecular events such as ligand recognition, binding, and dissociation, as well as conformational rearrangement of the enzyme's active site and substrate transformation. By tracking the spectral changes of these dynamic processes in real time, we can elucidate ligand-protein interaction patterns and the microscopic regulatory mechanisms of enzyme catalysis.
The binding dynamics of heme proteins with diatomic molecules such as NO, CO, and O2 determine their physiological function. CO binds very strongly to free heme, but this binding is significantly inhibited in myoglobin, which is considered to be of important physiological significance. Traditional spectroscopic techniques have difficulty capturing both the orientation of the bound ligand and its dynamic behavior after dissociation, a problem that time-resolved polarized infrared spectroscopy has effectively solved. Using this technique, researchers found that the angle between CO bound to the heme iron and the normal of the heme plane is at most 7°, forming an almost linear Fe-CO structure; when CO dissociates from the iron, it is trapped at a docking site in the heme pocket, with its orientation constrained roughly parallel to the heme plane. Because the orientations of bound and dissociated CO are almost orthogonal, CO rebinding from the docking site is inhibited. This solution-phase result provides key kinetic evidence for understanding the ligand selectivity mechanism of heme proteins [41].
In mixed solvent systems, the effect of the solvent environment on ligand binding kinetics can also be analyzed by time-resolved infrared spectroscopy. Researchers used microperoxidase-8 (Mp-8) as a model to study the rebinding kinetics of NO to Mp-8 after photolysis in glycerol/water mixtures [42]. Tracking the transient absorption signal of the NO stretching vibration yields kinetic parameters such as the time constant of NO rebinding. The rebinding of NO to Mp-8 in viscous solution is extremely efficient and rapid, and its rate is almost unaffected by the viscosity of the solution. This study provides direct spectroscopic evidence on how the solvent environment regulates protein-ligand interactions.
HemAT is a heme-containing oxygen sensor protein. Its ligand binding is accompanied by a conformational change from the open A0 conformation (1967 cm-1) to the closed A1 conformation (1925 cm-1). Researchers used step-scan infrared spectroscopy to reveal the biphasic dynamics of CO rebinding to the heme iron after photolysis. They also modified key residues on the distal side (Y70, L92, T95) and the proximal side (Y133) by site-directed mutagenesis to systematically explore how these residues regulate ligand binding dynamics. The study confirmed that L92 and T95 act as "gating switches" for the ligand entry and exit channels and mainly regulate the rate of the slow binding phase, whereas the hydrogen bond network between the proximal Y133 and H123 plays a decisive role in determining the number of phases (monophasic or biphasic) and the pathway of the binding process [43].
Enzyme dynamics span timescales from femtoseconds to seconds. While slow processes on the millisecond to second timescale (such as substrate binding) are relatively well understood, the mechanisms that regulate catalytic activity on the femtosecond to picosecond timescale remain unclear. Two-dimensional infrared spectroscopy, with its ultrahigh temporal resolution, has become a key technique for monitoring the hydration layer and ion dynamics surrounding proteins. L et al. used genetic code expansion to introduce the infrared probe 3-azido-L-tyrosine (N3Y) into the active site of the iron-dependent metalloenzyme DddK and observed fluctuations of the active-site microenvironment on the femtosecond-picosecond timescale [44]. Analysis of the frequency-frequency correlation function showed that the catalytic efficiency of the enzyme is directly controlled by the degree of water confinement in the active site, and that the conformational flexibility of the active site and the confined water environment play an important role in maintaining catalytic function.
4.3 Membrane protein dynamics studies
Bacteriorhodopsin acts as a light-driven proton pump in the cell membrane of halophilic bacteria, capturing light through a retinal chromophore embedded in its seven transmembrane α-helices. After light excitation, the retinal chromophore isomerizes from all-trans to 13-cis (see Figure 3(a)), triggering a series of intermediate states (J, K, L, M, N, O). In this classic model system, the conformational changes of retinal and key amino acid residues and the kinetics of proton transfer are studied by detecting changes in the infrared spectra on different timescales [45].
The path of protons in bacteriorhodopsin from the cytoplasmic side through the membrane to the extracellular side is shown in Figure 3(b) [45]. The figure also shows the protonatable groups and the order of proton transfer: (1) from the Schiff base to Asp85; (2) proton release; (3) proton transfer from Asp96 to the Schiff base; (4) proton uptake; (5) proton transfer from Asp85 to the proton release group.
Fig. 3 Photoreaction processes and proton transport pathways in bacteriorhodopsin [45]: (a) The main photoisomerization reaction in bacteriorhodopsin, from the all-trans structure to the 13-cis structure; (b) The process of proton pumping from the cytoplasm to the extracellular region
The molecular pathway and timing of proton transfer can be revealed by step-scan and difference spectroscopy (see Figure 4). During the transition from the L state to the M state, the intensity change at 1188 cm-1 reflects the deprotonation of the Schiff base; the band at 1762 cm-1 indicates that Asp85 has been protonated; the bands at 1739 cm-1 and 1742 cm-1 reflect the change in the environment of Asp96; and the band shift from 1542 cm-1 to 1562 cm-1 reflects the change in the C=C stretching mode [46]. In the BR mutant D96N, both the proton transfer from the Schiff base to Asp85 and the subsequent transfer from Asp96 to the deprotonated Schiff base can be clearly observed [47]. The D96N mutant has a unique M-N intermediate state in its photocycle, which has the characteristics of both the M state (deprotonated Schiff base) and the N state (protein conformation). This suggests that in the wild type, protein conformational rearrangement (the transition to the N state) may begin before the Schiff base is reprotonated [47].
Fig. 4 Changes in signal intensity over time at 1800-1000 cm-1 during the photocycle of bacteriorhodopsin [46]
Lorenz-Fonfria et al. used MES buffer and its deuterated derivatives as pH-sensitive vibrational probes to track the dynamics of proton release and uptake at a time resolution of 6 µs, and found evidence inconsistent with the previously accepted proton uptake mechanism [48]. A proton uptake complex in the cytoplasmic domain provides an intermediate carrier for proton transfer. The researchers revealed the dynamic regulation of proton transfer through pH-dependent spectral differences; in addition, by comparing the difference spectra of the wild type with those of the E194D and E204D mutants, they verified that Glu194 and Glu204 play a key role in maintaining the stability of the water cluster and the proton transfer pathway [49].
Time-resolved infrared spectroscopy is also applicable to photocycle studies of other rhodopsins. Channelrhodopsin (ChR) is the only known light-gated ion channel in nature, and its channel activity depends on photoisomerization of the retinal chromophore and the subsequent photocycle. Infrared spectroscopy revealed a nanosecond-scale "pre-gating" conformational change in the channelrhodopsin variant ChETA [50]. Heliorhodopsin (HeR) is a rhodopsin recently discovered through functional metagenomics [51]. Its photocycle is relatively slow and passes through the K, M, and O intermediates in sequence. The O intermediate has a longer lifetime and is considered a potential functional state for signal transmission or regulation of enzyme activity. Time-resolved infrared spectroscopy has been used to detect secondary-structure changes during the HeR photocycle [52]. Comparison of the spectral differences between HeR variants (48C12 and TaHeR) during the decay of the M and O states showed that the hydrogen bond between Ser112 in transmembrane helix 3 (TM3) and Asn138 in transmembrane helix 4 (TM4) regulates the protein's structural dynamics and photocycle rate.
Time-resolved infrared spectroscopy has also been used to study the proton co-transfer mechanism in the photocycle of thermophilic rhodopsin (TR) at elevated temperatures (30-70 °C) [53]. Protonation of the proton acceptor D95 gives rise to a positive band at 1753 cm-1, and deprotonation of the proton donor E106 gives rise to a positive band at 1398 cm-1. Changes in the amide I band at 1697 cm-1 and 1628 cm-1 revealed that a turn-to-β-sheet transition occurs simultaneously with proton transfer. The reaction kinetics at different temperatures were quantified by combining singular value decomposition (SVD) with global fitting analysis.
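The combination of SVD and global fitting mentioned above can be illustrated with a short Python sketch on synthetic data. It assumes a simple model of parallel exponential decays, D(t, ν) = Σ exp(-k_i t) · A_i(ν), in which every wavenumber shares the same rate constants; the kinetic model actually used in the cited study is not specified here. All function names, rate constants, band shapes, and the grid search are illustrative assumptions, and the two band positions are borrowed from the text only to make the toy data recognizable.

```python
import numpy as np

def gaussian_band(wavenumbers, center, width=8.0):
    """Unit-height Gaussian band, a stand-in for a real difference band."""
    return np.exp(-0.5 * ((wavenumbers - center) / width) ** 2)

def simulate(times, rates, spectra, noise=0.0, seed=0):
    """Build D[t, v] = sum_i exp(-k_i * t) * spectra[i, v] plus Gaussian noise."""
    rng = np.random.default_rng(seed)
    decays = np.exp(-np.outer(times, rates))   # shape (n_times, n_components)
    data = decays @ spectra                    # shape (n_times, n_wavenumbers)
    return data + noise * rng.standard_normal(data.shape)

def significant_components(data, threshold=0.05):
    """Count singular values larger than threshold times the largest one."""
    s = np.linalg.svd(data, compute_uv=False)
    return int(np.sum(s > threshold * s[0]))

def global_fit(times, data, rate_grid):
    """Grid-search two shared rate constants; amplitudes by linear least squares.

    Returns ((k_slow, k_fast), amplitudes), where each row of amplitudes is the
    decay-associated spectrum of one rate constant.
    """
    best_residual, best_rates, best_amplitudes = np.inf, None, None
    for i, k_slow in enumerate(rate_grid):
        for k_fast in rate_grid[i + 1:]:
            basis = np.exp(-np.outer(times, [k_slow, k_fast]))
            amplitudes, *_ = np.linalg.lstsq(basis, data, rcond=None)
            residual = np.sum((data - basis @ amplitudes) ** 2)
            if residual < best_residual:
                best_residual, best_rates, best_amplitudes = (
                    residual, (k_slow, k_fast), amplitudes)
    return best_rates, best_amplitudes

# Synthetic example: two bands decaying with hypothetical rates 1e3 and 1e5 s^-1.
times = np.logspace(-6, -2, 60)                 # seconds
wavenumbers = np.arange(1300.0, 1800.0, 2.0)    # cm-1
spectra = np.vstack([gaussian_band(wavenumbers, 1753.0),
                     gaussian_band(wavenumbers, 1398.0)])
data = simulate(times, [1e3, 1e5], spectra, noise=1e-4)

n_components = significant_components(data)
rates, amplitudes = global_fit(times, data, np.logspace(2, 6, 41))
print("significant SVD components:", n_components)
print("fitted rate constants (s^-1):", rates)
```

In this sketch the SVD step only estimates how many kinetic components the data support, and the global fit then forces all wavenumbers to share the same rate constants. Repeating the fit on datasets recorded at several temperatures would give the temperature dependence of each rate constant.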
Fig. 5 Workflow for predicting the dynamic secondary structure of proteins [54]: (a) The overall workflow proceeds from top to bottom: constructing a two-dimensional infrared spectroscopy dataset, pre-training the model, and applying the pre-trained model to predict protein folding trajectories. The "helical part" of the Trp-cage structure and the "chain-like structure part" of the WW domain denote the proportions of these conformational components in the protein secondary structure. (b) For the pre-training dataset, the protein samples are first subjected to molecular dynamics simulations; for each snapshot, a two-dimensional infrared spectrum is generated and the secondary structure composition is determined. (c) Detailed architecture of the machine learning model
5 Outlook
Despite significant advances in the analysis of protein dynamics by infrared spectroscopy, challenges remain, including spectral overlap, interference from water absorption, and limited accuracy in quantitative analysis. These challenges are gradually being addressed through the development of novel infrared probes, advances in artificial intelligence, and innovations in instrumentation and methodology.
Introducing site-specific infrared probes into protein structures through techniques such as isotope labeling and unnatural amino acid incorporation is key to improving the sensitivity and specificity of infrared spectroscopy. These probes are already widely used in the analysis of protein folding mechanisms, the monitoring of ligand binding and catalysis, and the tracking of membrane protein conformational dynamics, where they enable real-time tracking of microenvironmental changes and conformational dynamics at specific protein sites. Future development of infrared probes is expected to further improve site-labeling efficiency, signal discrimination, and spatial resolution.
Artificial intelligence is playing a crucial role in infrared spectral data analysis, pattern recognition, and structure prediction. Researchers have established a correlation between simulated spectra and structures using pre-trained models based on the Transformer architecture. This allows dynamic changes in protein secondary structure (helices, sheets, random coils) to be predicted automatically and rapidly from two-dimensional infrared spectra, with an accuracy of up to 90%. The method is applicable to the analysis of folding trajectories from microseconds to milliseconds (see Figure 5) [54]. Using the DeepLabV3 model and a large amount of protein structure data, the three-dimensional backbone structure of proteins can be identified and predicted from two-dimensional infrared spectral data, and transfer learning allows the model to adapt to different protein sizes and temperature conditions [55]. Future work is expected to establish larger protein infrared spectral databases that provide richer training data, and to develop more advanced machine learning models that better identify complex spectral patterns.
Innovations in instrumentation and methodology will provide stronger technical support for studying the dynamic structure of proteins by infrared spectroscopy. Surface-enhanced infrared absorption spectroscopy (SEIRAS) strongly enhances signals from molecules at the surface, which helps overcome the weak signals and strong water interference encountered in studies of interfacial and membrane proteins, and it has broad prospects for the detection of trace proteins. Nanoscale infrared spectroscopy combines the high spatial resolution of atomic force microscopy with the chemical specificity of infrared spectroscopy, breaking through the diffraction limit of infrared light. It achieves infrared spectral characterization at nanoscale spatial resolution and provides a new technical means for protein structure research. In addition, combining microfluidic devices with infrared spectroscopy makes it possible to monitor the dynamic response of trace amounts of protein under different environmental conditions (such as pH, temperature, and concentration gradients).
6 Conclusion
Infrared spectroscopy plays a crucial role in protein dynamics research. From fundamental principles to advanced techniques, and from traditional one-dimensional to two-dimensional spectroscopy, infrared spectroscopy is continuously overcoming its limitations and supporting the understanding of the dynamic structure and function of proteins. The technique covers timescales from femtoseconds to milliseconds and captures the motion of structural units at multiple scales, from local chemical bond rotation to global subunit movement. Compared with other characterization techniques, infrared spectroscopy offers high time resolution and the ability to obtain secondary structure information on proteins in solution. The backbone peptide bonds, amino acid side chains, and exogenous infrared probes of proteins exhibit specific absorption bands that report on conformational changes, protonation states, local electric fields, and other changes in the microenvironment.
Modern time-resolved infrared spectroscopy techniques include millisecond-scale rapid-scan modes based on Fourier transform spectroscopy, nanosecond-to-microsecond step-scan modes, and femtosecond two-dimensional infrared spectroscopy based on pump-probe technology. Reactions are initiated by methods such as light pulse excitation, rapid mixing, and temperature jumps, and time-resolved infrared spectra are acquired synchronously. Infrared spectroscopy has been successfully applied to the study of protein folding and aggregation dynamics, ligand binding and enzyme catalysis mechanisms, and membrane protein dynamics. Significant progress has been made in the study of key biological processes such as the folding mechanism of amyloid fibrils, ligand binding of heme proteins, and proton transport in membrane proteins. However, infrared spectroscopy still faces challenges such as spectral overlap, interference from water absorption, and limited accuracy in quantitative analysis. In the future, with the development of novel infrared probes, the integration of artificial intelligence into spectral data analysis, and innovations in instrumentation and methodology, infrared spectroscopy will provide stronger support for the study of protein dynamic structure-function relationships.