Conclusion
The ultimate goal of our design is to increase the total yield of the Hepatitis E Vaccine protein. Our design uses an in silico model first to predict, and then to exploit, the relationship between synonymous codon variation and an increase in the expression of a given protein. The algorithm the model uses to predict the protein expression of a single variation was derived by linear regression of 11 codon-related variables (e.g. number of rare codons, TAI, CAI) of wild-type and synthetic proteins against positive fold-changes in their protein expression. The in silico synthetic HEV primary sequence will be compared to the wild-type HEV primary sequence, with net protein yield as the standard of comparison. Our design produces a synthetic HEV protein primary sequence with a predicted 7-fold increase in protein expression. However, validation of our model showed, notably, that it overestimates the increase in protein expression: the average ratio of observed to predicted fold-increase is 0.757.
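To make the size of this overestimate concrete, the short sketch below scales the predicted fold-increase by the average observed-to-predicted ratio. Treating that average ratio as a simple multiplicative correction is our illustrative assumption, not a step of the model itself, and the function and constant names are hypothetical.

```python
# Values quoted in the conclusion above.
PREDICTED_FOLD_INCREASE = 7.0        # model prediction for the synthetic HEV sequence
OBSERVED_TO_PREDICTED_RATIO = 0.757  # average ratio found during model validation


def calibrated_fold_increase(predicted, ratio=OBSERVED_TO_PREDICTED_RATIO):
    """Scale a predicted fold-increase by the mean observed/predicted ratio.

    Assumption (not part of the original model): the average validation
    ratio applies uniformly as a multiplicative correction factor.
    """
    if predicted <= 0:
        raise ValueError("predicted fold-increase must be positive")
    return predicted * ratio


if __name__ == "__main__":
    adjusted = calibrated_fold_increase(PREDICTED_FOLD_INCREASE)
    print(f"Predicted: {PREDICTED_FOLD_INCREASE:.1f}-fold, adjusted: {adjusted:.2f}-fold")
```

Under this assumption, the predicted 7-fold increase corresponds to an expected increase of roughly 5.3-fold (7 × 0.757), which remains a substantial gain over the wild-type sequence.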