Researchers can use the following example instructions to interact with NatureLM.
Molecule generation from text-specified compound properties (Section 3.2)
Instruction: Generate a molecule with four hydrogen bond donors.
Response: C(C[C@@H](C(=O)O)N)CN=C(N)N
One can replace `four hydrogen bond donors` with other properties, such as `QED 0.5`, `TPSA 90`, or `three hydrogen bond donors`.
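The property swap described above can be sketched as a small template helper. This is only an illustration: the function name `property_prompt` is hypothetical, and the template wording is copied from the example instruction rather than taken from any official NatureLM interface.

```python
def property_prompt(prop: str) -> str:
    """Fill the property slot of the Section 3.2-style instruction.

    Hypothetical helper: the wording mirrors the example instruction above.
    """
    prop = prop.strip()
    if not prop:
        raise ValueError("property description must not be empty")
    return f"Generate a molecule with {prop}."


# Build one instruction per property mentioned in the text.
prompts = [
    property_prompt(p)
    for p in ("four hydrogen bond donors", "QED 0.5", "TPSA 90")
]
for prompt in prompts:
    print(prompt)
```

Each resulting string would then be passed to the model as the instruction text.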
Protein sequence to full molecule design (Section 3.4)
Instruction: Produce a compound guided by the target. <protein>LALSLTADQMVSALL...SYDLLLEMLDAH</protein>
Response: <mol>CC1=C(c2cccc(O)c2)C(c2ccc(I)cc2)Oc2ccc(O)cc21</mol>
Protein sequence to molecular fragment generation (Section 3.4)
Instruction: Design a compound with reference to the target <protein>DTKEQRILR…EKAIYQGP</protein> and the fragment <fragA> O=c1[nH]cnc2c(O)cc([*:1])c([*:2])c12 </fragA>.
Response: <fragB>Fc1ccc([*:1])cc1.Fc1ccc([*:2])cc1</fragB>
Here, `[*:digit]` marks a connection point of the molecular fragment; matching digits in the two fragments indicate where they are joined.
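As a rough sanity check on a generated fragment, one can compare the connection-point labels of the input and output fragments. The sketch below uses only a regular expression over the `[*:digit]` notation; the helper name `attachment_labels` is hypothetical, and no chemical validity is checked.

```python
import re

# Matches connection points such as [*:1] and captures the digit(s).
ATTACH = re.compile(r"\[\*:(\d+)\]")


def attachment_labels(fragment: str) -> list[int]:
    """Return the sorted connection-point labels found in a fragment string."""
    return sorted(int(n) for n in ATTACH.findall(fragment))


frag_a = "O=c1[nH]cnc2c(O)cc([*:1])c([*:2])c12"
frag_b = "Fc1ccc([*:1])cc1.Fc1ccc([*:2])cc1"

# The generated fragment should expose the same labels as the given one.
labels_match = attachment_labels(frag_a) == attachment_labels(frag_b)
print(attachment_labels(frag_a), attachment_labels(frag_b), labels_match)
```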
Binding affinity improvement (Section 3.5)
Instruction: Improve the binding affinity on Uridine-cytidine kinase 2 of <mol>Cc1ccc(-c2nc3c(c(SCC(=O)Nc4ccccc4)n2)Cc2cccc(C)c2O3)cc1</mol>.
Response: <mol>Cc1ccc(-c2nc3c(c(SCC(=O)Nc4cccc(C(=O)O)c4)n2)Cc2cccc(C)c2O3)cc1</mol>
Metabolism optimization (Section 3.6)
Instruction: Transform <mol>COCCn1c(=O)c(-c2ccccc2)nc2cnc(N3CCOCC3)nc21</mol> to yield lower CYP1A2 inhibition and present a better metabolic safety record.
Response: <mol>COc1ccc(-c2nc3cnc(N4CCOCC4)nc3n(Cc3cccc(OC)c3)c2=O)cc1</mol>
You can replace CYP1A2 with another enzyme of interest.
Reaction prediction (Section 3.7)
Instruction: Please suggest possible reactants for the given product <product>CC(=O)c1ccc2c(ccn2C(=O)OC(C)(C)C)c1</product>.
Response: <reactant>CC(=O)c1ccc2[nH]ccc2c1.CC(C)(C)OC(=O)OC(=O)OC(C)(C)C</reactant>
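In SMILES, a `.` separates disconnected components, so the response above encodes two reactants in one string. A minimal sketch for pulling them apart follows; the helper name `tagged` is hypothetical, and the tag format is assumed to match the examples in this section.

```python
import re


def tagged(text: str, tag: str) -> list[str]:
    """Return the contents of every <tag>...</tag> span in text."""
    name = re.escape(tag)
    pattern = rf"<{name}>(.*?)</{name}>"
    return [m.strip() for m in re.findall(pattern, text, flags=re.S)]


response = (
    "<reactant>CC(=O)c1ccc2[nH]ccc2c1."
    "CC(C)(C)OC(=O)OC(=O)OC(C)(C)C</reactant>"
)

# Split the single tagged string into individual reactant SMILES.
reactants = tagged(response, "reactant")[0].split(".")
for smiles in reactants:
    print(smiles)
```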
Protein generation (Section 4.2)
Instruction: I require a stable protein sequence, kindly generate one.
Response: <protein>GSVEVKIGTDRTTASSQDEAEERARRAGARRVRIDGNEVEVEM</protein>
Instruction: Generate a soluble protein sequence.
Response: <protein>MSLPNIFVEELVHTPIEKQVIEIVERKGIGHPDSLADGMAEAMSRELSREYIRRFGAVLHHNTDETQIVAGRSNPQFGGGEVIEPIYVLLVGRATKFFNGEYIPTDKIALKAARDYIRQHMQNLDPELDVVFNVRLGEGSTDLQDVFRRKSGNVALANDTSFGIGFAPLSETERLVFNVERRIYEEFRKKNPAIGEDVKVMGLREKDRISLTIAAAFVDRYVANIKEYDAIKEELENFVKEISSEYTEREVEVFVNTADDYETGCVYLTVTGTSAENGDDGSVGRGNRCNGLITPGRPMSMEASSGKNPINHVGKIYNLLANQIAARIAEEVEGVEEVYVRILSQIGKPINEPKALSVQVIPKSGYDISKLERPARDIAEEMIANVGKITDMVIEGKVRTFEGHHHHHH</protein>
Instruction: Produce a protein sequence that is not soluble.
Response: <protein>MAFLSQLQGGFAQLKQKAEAAAPSIADSLNQLKEAPKFMESAFKTGLNKVKLDSLGKVLNVPSLGRTQSPIDIVPVITAFGEHLQNAHFEVTYESTGEFKAVNDGNSIWLMREGNSSELAISFLPEEQYHLDAVNFHWATEPMNGSEHTIGGVGYAGEMHLIHRNTRFATMADALKQPNGVIAIAVFLNESHDDNAVFSPLINLLPQVIYKGSECKLCSFDFQTFFPVAEKTKEFWMYEGSETTDPFRETVNWIVIRAALPISSHQLDKLREVRAGRYDEEFSDKVPMKPLRPIQNPSSRTIQSSFRSVAGAPDLGFRQ</protein>
Antibody design and optimization (Section 4.3)
Instruction: Using antigen <protein>TQVCTGTDMKLR…GESSEDCQS</protein> and antibody frameworks <antibody> IVLTQTPS…LAVYYC</antibody> and <antibody>FGGGTRLEIEVQ</antibody>, create the CDR3 regions.
Response: <antibody>QQYSNYPWT</antibody>
Composition to material generation (Section 5.2)
Instruction: Build a material that has Li, Ti, Mn, Fe, O.
Response: <material> Li Li Li Li Ti Ti Ti Mn Mn Fe Fe Fe O O O O O O O O O O O O O O O O <sg8> </material>
Material generation with given bulk modulus (Section 5.3)
Instruction: Construct the composition for a material with a specified bulk modulus of 86.39 GPa.
Response: <material> Se Se Pd Sc <sg164> </material>
Guide RNA design (Section 6.2)
Instruction: Generate a guide RNA for targeting the DNA sequence.
<dna>GACTGGCACCAG…CCCTCGC</dna>
Response: <rna>AGACACAGCGGGTGCTCTGC</rna>
Protein binding RNA design (Section 6.3)
Instruction: Given <protein>MSEY…SSGWGM</protein>, create an RNA molecule that binds to it.
Response: <rna>AAACAGG…CGTACGACA</rna>